Avoid accessing wrong column in csv
I was parsing few csv files. All were from same provider. Format isDate, Name, Email, Company, City
etc in one file. I made an assumption all the downloaded files are in the same format. For my surprise few files had same format and others din’t.
with open(filename, 'rb') as f:
reader = csv.reader(f)
reader.next() # first row contains column names
for row in reader:
name, email, company = row[1], row[2], row[3]
#save to db
In the above code the fundamental mistake is fixing the position of the columns in csv file. What happens if email
and company
position is interchanged? In simple screwed
.
How can I avoid these situations ?
with open(filename, 'rb') as f:
reader = csv.reader(f)
header = reader.next() #reader is not list
name_pos, email_pos, company_pos = header.index("Name"),header.index("Email"),header.index("Company")
with open(filename, 'rb') as f:
reader = csv.reader(f)
reader.next()
for row in reader:
name, email, company = row[name_pos], row[email_pos], row[company_pos]
#save to db
The code is bit longer here, but robost. The above snippet doesn’t rely on column position but only relies on column name in the first line. Ofcourse header.index
will raise ValueError
you can wrap the exception. First all the required column positions are retrieved, file is reopened, first row is omitted and other details are accessed.
See also
- Python Typing Koans
- Model Field - Django ORM Working - Part 2
- Structure - Django ORM Working - Part 1
- jut - render jupyter notebook in the terminal
- Five reasons to use Py.test
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.