
Python Tutorial: CSV Module - How to Read, Parse, and Write CSV Files


How's it going, everybody? In this article, we're going to be looking at how to read, parse, and write CSV files. If you don't know what CSV files are, CSV stands for comma-separated values. Basically, CSV files allow us to put some data into a plain text file and use some type of delimiter, usually a comma, to separate the different fields.

I have a sample CSV file here that we can work with, and if we look at it, we can see how these files are usually structured. It can look like a mess, but it's not really meant to be read directly. This is just how the data is stored, and we can use our programs to parse out the information that we want. The top line has our fields, which in this file are first name, last name, and email. That tells us the information we should expect to see on every line. On the next line, John is the first name, then a comma, Doe is the last name, then a comma, and then this long email is the email.

That's why these are called comma-separated values, and what separates two values is called a delimiter. The comma is a common delimiter, but you can use just about anything, so sometimes you'll see files with tab-delimited values, or dashes, or things like that. They're all still called CSV files.
So now let's see what it's like to read, parse, and write CSV files. I have a file here called parse_csv.py, and within this file we're just going to import csv.

You may have looked at that data and wondered why we're not just using the string split method on each line of the file to parse out the data. You could do that, but the csv module makes parsing these files so much easier. For example, if someone puts a comma in their name for some reason, we wouldn't want to split on that. The csv module will also handle newlines and all those things, so it's a lot easier to parse out all the information we want without writing something complicated from scratch.

Okay, so to read the CSV file, we're just going to open this file.
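To make the point about split concrete, here is a minimal sketch comparing the two approaches. The sample rows and field names are made up for illustration; they are not the contents of the file used in this article.

```python
import csv
import io

# Hypothetical sample data: the second person's last name contains a comma,
# so that field is wrapped in double quotes.
sample = (
    "first_name,last_name,email\n"
    "John,Doe,john@example.com\n"
    'Mary,"Smith, Jr.",mary@example.com\n'
)

# Naive approach: split every line on commas.
naive_rows = [line.split(",") for line in sample.splitlines()]

# csv module: understands that a quoted field is a single value.
parsed_rows = list(csv.reader(io.StringIO(sample)))

print(naive_rows[2])   # the quoted name is broken into two pieces
print(parsed_rows[2])  # the quoted name stays in one piece
```

The naive split produces four pieces for a three-field row, while csv.reader keeps the quoted name together.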

We open it just like any other file, so we'll use a context manager here. We'll say with open, and the name of the file that I was just looking at is names.csv, which is in the same directory as the script I'm currently writing. We want to read this file, so we'll put 'r' as the second argument, and we'll name the file object csv_file.

To read this file, we can create a variable called csv_reader (that can be any variable name you like, but that's what I like) and set it to csv.reader, passing csv_file in. In the background, csv.reader uses something called a dialect, which has some preset parameters for what it expects the format of our CSV file to be. By default, it expects values to be separated by a comma, plus a few other things that we'll look at in just a bit. Since our CSV file is pretty simple, we don't need to pass any additional arguments right now.

The csv_reader variable that we just created is something that we need to iterate over. For example, if we just print it out as is with print(csv_reader) and run that, we can see what it is right now.
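Here is a rough sketch of that step. So that it runs anywhere, it first writes a small stand-in for names.csv into a temporary directory; the row in it is invented. The newline="" argument is not part of the walkthrough above, but the csv documentation recommends it when opening files for the csv module.

```python
import csv
import os
import tempfile

# Create a small stand-in for names.csv (the contents are made up).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "names.csv")
with open(path, "w", newline="") as f:
    f.write("first_name,last_name,email\nJohn,Doe,john-doe@example.com\n")

with open(path, "r", newline="") as csv_file:
    csv_reader = csv.reader(csv_file)
    # Printing the reader shows only the object itself, not the rows.
    reader_repr = repr(csv_reader)
    print(reader_repr)
```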
It's just an object in memory, so instead we need to loop over all the lines in the reader and see what we get. We can say for line in csv_reader and then print out each line. If we run that, it looks better: each line that we're printing out is a list of all the values. The first value in the list is the first name, the second value is the last name, and the email is the third value. If I scroll all the way up to the top, you can see that our first line is the field names, which tells us the same thing: first name, then last name, then email.

Going by index, the first name would be index 0, the last name index 1, and the email index 2. If we only wanted to print out all of the emails, then on this line we could print index 2 of each line. If we run that, we get all of the emails printed out.

If you don't want this first line of field names and only want the values, then we can just skip that first line. If anyone has seen my article on generators, you'll know that we can step over a value in an iterable by calling next. Calling next returns the next value, which we can capture in a variable if we want it. But if we just want to step over the value, then before our loop we can say next(csv_reader), and that will step over that first line. Then when we iterate, it should start at the second line, which is the first person in the list. So now let's rerun this and scroll up to the top.
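The loop, the index lookup, and the next call can be sketched together like this. An in-memory io.StringIO stands in for the opened file, and the rows and field names are assumptions made up for the example.

```python
import csv
import io

# In-memory stand-in for the opened names.csv (hypothetical contents).
csv_file = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary@example.com\n"
)

csv_reader = csv.reader(csv_file)

# Step over the field names; capturing the return value is optional.
header = next(csv_reader)

emails = []
for line in csv_reader:
    # Each line is a list: index 0 is first name, 1 is last name, 2 is email.
    emails.append(line[2])
    print(line[2])
```

Because next consumed the header row, the loop starts with the first person.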

Now we can see that John Doe is the first value. Okay, so now let's see how we can write to a CSV file. We can do this with any lists of values, but since we already have lists of values here from our original CSV file, let's go ahead and just use those. Let's say that we wanted to save these same values into a new CSV file, but use dashes instead of commas for the delimiter. A dash probably isn't a great delimiter, but I want to show you something that happens when we do this.

First, we're going to want to write the field name headers into the new file, so let's take out the next statement where we were skipping over them. Now, above our loop, we're going to open a new file for writing. We'll say with open, and we'll call this file new_names.csv. We want to open this for writing, so the second argument is 'w', and then we'll say as new_file. To write to this file we're going to use a CSV writer, so we can say csv_writer (that can be any variable name, but that makes sense to me) equals csv.writer, and pass new_file in.

If we left it like this, it would just write the same comma-separated file that we currently have. If we want to use dashes as our delimiter, then we need to pass that in as the second argument to csv.writer, so we can say delimiter equals '-'.

We want to write each line of our original CSV file into this new file, so let's indent our for loop so that we're within the context manager of the new file. For each line in csv_reader, which is reading our original file, we want to write that line to the new file. We can do that by saying csv_writer.writerow, and the row that we want to write is that line from the original reader.

So real quick, before I run this: we're opening the original file to be read, and then we're creating this csv_reader variable using csv.reader to read that original CSV file. Then we're opening a new file for writing called new_names.csv, and we're creating a csv_writer variable using csv.writer with that new file and a delimiter of a dash. Then, for each line in the original CSV data, we're writing that line out to the new file.

If we run this, we don't get any output here at the bottom, but it should have created this new file called new_names.csv, and I'll go ahead and open that up now so we can see what's in it.
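Put together, the script described above might look roughly like the following. This is a sketch rather than the exact file from the walkthrough: it creates its own names.csv with invented rows in a temporary directory, and it adds newline="" to the open calls as the csv documentation suggests.

```python
import csv
import os
import tempfile

# Set up a throwaway names.csv so the sketch is self-contained (made-up rows).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "names.csv")
dst = os.path.join(workdir, "new_names.csv")
with open(src, "w", newline="") as f:
    f.write(
        "first_name,last_name,email\n"
        "John,Doe,john-doe@example.com\n"
        "Mary,Smith-Robinson,mary@example.com\n"
    )

with open(src, "r", newline="") as csv_file:
    csv_reader = csv.reader(csv_file)

    # Open the new file inside the first context manager so both are available.
    with open(dst, "w", newline="") as new_file:
        csv_writer = csv.writer(new_file, delimiter="-")

        # Copy every row, header included, using the new delimiter.
        for line in csv_reader:
            csv_writer.writerow(line)

# Read the result back as plain text to inspect it.
with open(dst, newline="") as f:
    result = f.read()
print(result)
```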


It's using dashes instead of commas for the delimiter. This makes it pretty hard to read, but I wanted to show you what it did with two of our values. In our first row, the email actually contained a dash, so our CSV writer knew to put quotes around the email, since it contains the delimiter. That way, when the CSV is read back in, the reader knows that the email is one whole value and shouldn't be split on the dash within the email itself. Likewise, our second person has a hyphenated last name of Smith-Robinson, so again the CSV writer knew to put quotes around the last name so that it can tell the difference between the delimiters and the values that just happen to contain dashes.

Now that we've seen how that works, let's change the delimiter for the new file to something that's a bit more common. Aside from commas, tabs are very common delimiters, so let's use a tab instead. In Python, a tab can be represented with '\t'. If we rerun that and then open up the new_names file again, we can see that now all the values are separated by tabs instead.
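A small sketch of the quoting behavior, using in-memory buffers instead of files. The rows are invented, and dump is a hypothetical helper written just for this example.

```python
import csv
import io

# Made-up rows: one email and one last name contain a dash.
rows = [
    ["first_name", "last_name", "email"],
    ["John", "Doe", "john-doe@example.com"],
    ["Mary", "Smith-Robinson", "mary@example.com"],
]

def dump(rows, delimiter):
    """Write rows to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerows(rows)
    return buffer.getvalue()

dashed = dump(rows, "-")   # values containing "-" get quoted
tabbed = dump(rows, "\t")  # no value contains a tab, so nothing is quoted

# Reading the dashed text back with the same delimiter recovers the values.
round_trip = list(csv.reader(io.StringIO(dashed), delimiter="-"))

print(dashed)
print(tabbed)
```

The quotes only appear where a value would otherwise be ambiguous, which is why the tab-delimited output has none.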
That's a lot easier to read. Just like we passed the delimiter into our writer, if we wanted to read in that tab-delimited file, we could pass the delimiter argument into the reader as well. Real quick, let me show you what it looks like if we try to read a CSV file with the wrong delimiter.

Let me copy the part where we're reading in the file, and comment out everything else for now. Instead of reading the original file, names.csv, we're going to read the new tab-delimited file that we just created, which is new_names.csv. Let's pretend that we forgot to specify the tab delimiter and just try to read this as is. We'll say for line in csv_reader and print out each line. We can see that each line only has one value: it didn't split the values on the tab because it was expecting commas. So instead, you have to explicitly pass in that we want the delimiter to be a tab. I'll pass that into csv.reader here with delimiter equals '\t', then rerun it, and now you can see that we get the correct parsing.
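The difference between the two reads can be sketched like this, with an in-memory string standing in for new_names.csv and made-up values:

```python
import csv
import io

# Hypothetical tab-delimited contents standing in for new_names.csv.
tab_text = (
    "first_name\tlast_name\temail\n"
    "John\tDoe\tjohn-doe@example.com\n"
)

# Forgetting the delimiter: the reader looks for commas and finds none.
wrong = list(csv.reader(io.StringIO(tab_text)))

# Passing delimiter="\t" splits each line into its three values.
right = list(csv.reader(io.StringIO(tab_text), delimiter="\t"))

print(wrong[1])
print(right[1])
```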
Now I'm going to delete these lines and uncomment what we had before. The way that we've been working with CSV files, using the reader and writer, is probably the more common way to work with CSV data, since they're the first things that come up in the Python documentation. But my preferred method is working with CSV data using the dictionary reader and the dictionary writer, so let's take a look at those and I'll explain why I prefer them over the regular reader and writer.

First, let's take a look at the dictionary reader. To use this, we're just going to replace the regular csv.reader here with csv.DictReader. Now let's print out the lines that we get with this, so I'll say for line in csv_reader and print out each line. At first glance this may look a little more complicated: each of the values is now an ordered dictionary. If we scroll up to the top, we can see that the first line no longer contains the field names. It starts off immediately with the first person, because the field names are now the keys of each of these dictionaries.

The reason I like this is that it makes it a lot easier to parse out the information that we want. Remember, when we used the regular reader and wanted to print out the email address, we printed out index 2 of our line. For anyone reading your code, it isn't obvious what index 2 is, so they'd have to go into the CSV file to find that out. But now that we have those fields as our dictionary keys, we can get the email by accessing the email key of that line. If we rerun that, we can see that we have all of the email information.

Okay, and now let's look at how to use the dictionary writer. I'm going to remove this loop and then uncomment the rest of the code here.
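As a quick sketch of the dictionary reader, assuming the header row uses the names first_name, last_name, and email (the exact spelling in the original file isn't shown, and the rows below are invented). Note that on Python 3.8 and later each row prints as a plain dict rather than an ordered dictionary.

```python
import csv
import io

# In-memory stand-in for names.csv (hypothetical contents and field names).
csv_file = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary@example.com\n"
)

csv_reader = csv.DictReader(csv_file)

emails = []
for line in csv_reader:
    # The header row became the keys, so the lookup is self-describing.
    emails.append(line["email"])
    print(line["email"])

# The field names are still available on the reader itself.
fieldnames = csv_reader.fieldnames
```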
With the dictionary reader, we really didn't need to change anything, but with the dictionary writer we actually have to provide the field names of our file. So one line above our writer, I'm going to create a list of the field names. And now we're no longer going to use the regular writer.

Instead, we're going to use csv.DictWriter. One thing that we need to change here is that, after the file that we're going to be writing to, we need to pass in those field names, so I'll say fieldnames equals fieldnames.

Now we're ready to write the data. With the dictionary writer, you have the option of whether or not you want to write out the headers, which are the field names, in the first row. If we want those headers, which most of the time I do, then we can say csv_writer.writeheader(). That's going to write out the field names as the first line. Once the header is written, we can loop through the lines of the original file just like we did before and say csv_writer.writerow, passing in that line, so all of that stays the same. If we run this and then look at our new_names.csv file, we can see that it still worked.

Like I said before, the reason I like working with the dictionary reader and writer is that it's more obvious what you're doing. Let's say, for example, that in our new CSV file we only wanted the first and last names and wanted to leave off the email. With the regular reader and writer, we'd be modifying the indexes of those lists, and like I mentioned before, it's not obvious by looking at an index what value it's supposed to hold.
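The dictionary writer steps described so far (field names, writeheader, then writerow for each line) can be sketched as follows. In-memory buffers stand in for names.csv and new_names.csv, and the field names and rows are assumptions made up for the example.

```python
import csv
import io

# Stand-ins for the original file and the new file (hypothetical contents).
csv_file = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary@example.com\n"
)
new_file = io.StringIO()

csv_reader = csv.DictReader(csv_file)

# DictWriter needs to be told the field names and their order.
fieldnames = ["first_name", "last_name", "email"]
csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter="\t")

# Writing the header row is optional and explicit.
csv_writer.writeheader()

for line in csv_reader:
    csv_writer.writerow(line)

text = new_file.getvalue()
print(text)
```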
But with our dictionary writer, we can just remove the email from the field names up here, and before we write each line within our loop, we can remove the email key and value. One way to do that is to just delete it, so we can use del on the email key of that line. Now when it writes that row, it's only going to write the first name and the last name, because the email no longer exists. If we save that and run it, and then I open up the new_names.csv file, you can see that we now have a tab-delimited file of first names and last names, and the email is no longer there.

There are several ways that we could have written this row. We could have deleted the email key from line, just like we did here, or we could have created a new dictionary.
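Both ways of leaving out the email can be sketched side by side. The helper names here (copy_without_email, delete_email, build_new_dict) are hypothetical and exist only for this example, as are the rows and field names.

```python
import csv
import io

SAMPLE = (
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary@example.com\n"
)

# The email is simply left out of the field names.
fieldnames = ["first_name", "last_name"]

def copy_without_email(make_row):
    """Copy SAMPLE to tab-delimited text, transforming each row with make_row."""
    new_file = io.StringIO()
    csv_reader = csv.DictReader(io.StringIO(SAMPLE))
    csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter="\t")
    csv_writer.writeheader()
    for line in csv_reader:
        csv_writer.writerow(make_row(line))
    return new_file.getvalue()

def delete_email(line):
    # Option 1: remove the key in place. By default, DictWriter raises
    # ValueError if a row still has a key that is not in fieldnames.
    del line["email"]
    return line

def build_new_dict(line):
    # Option 2: build a new dictionary with only the wanted keys.
    return {key: line[key] for key in fieldnames}

deleted = copy_without_email(delete_email)
rebuilt = copy_without_email(build_new_dict)
print(deleted)
```

Both options produce the same output, so the choice mostly comes down to which reads more clearly in your own code.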

That new dictionary would contain only the first name and last name keys, and we'd pass it into the writerow method. So use whichever way works for you. In this case, I think it was easier just to remove the email key.

Okay, so I think that is going to do it for this article. I hope that you now have a pretty good idea of how you can read, parse, and write CSV files. If anyone does have any questions about what we covered, feel free to ask in the comment section below and I'll do my best to answer them. If you enjoy these tutorials and would like to support them, there are several ways you can do that. The easiest way is to simply like the article and give it a thumbs up. It's also a huge help to share these articles with anyone who you think would find them useful. And if you have the means, you can contribute through Patreon; there's a link to that page in the description section below. Be sure to subscribe for future articles, and thank you all for watching.
