
How To Address: Python Import Of File With .csv Dictreader Fails On Undefined Character

First of all, I found the following question, which is basically the same as mine, but it is closed, and I'm not sure how the reason given for closing it relates to the content of the post.

Solution 1:

I posted the solution I went with in the comments above; it was to set the errors argument of open() to 'ignore':

with open(file, newline='', errors='ignore') as f:
    reader = csv.DictReader(f)

This is exactly what I was looking for in my first question in the original post above (i.e. whether there is a way to tell the csv.DictReader to ignore undefined characters).
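As a rough illustration of what that setting does, here is a minimal, self-contained sketch. The sample data, the helper name read_rows, and the explicit encoding="cp1252" are assumptions made for the example (the original call relied on the platform default encoding). It also shows errors='replace', which keeps a visible placeholder instead of silently dropping the byte:

```python
import csv
import os
import tempfile


def read_rows(path, encoding="cp1252", errors="ignore"):
    """Read a CSV file into a list of dicts using the given decode error policy.

    Hypothetical helper for this example; not part of the csv module.
    """
    with open(path, newline="", encoding=encoding, errors=errors) as f:
        return list(csv.DictReader(f))


# Build a small sample file containing byte 0x9d, which cp1252 leaves undefined.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"name,city\r\nAnna,Z\x9durich\r\n")
    sample_path = tmp.name

try:
    # The default policy ("strict") raises on the undefined byte.
    strict_failed = False
    try:
        read_rows(sample_path, errors="strict")
    except UnicodeDecodeError:
        strict_failed = True

    # "ignore" drops the offending byte; "replace" substitutes U+FFFD.
    rows_ignore = read_rows(sample_path, errors="ignore")
    rows_replace = read_rows(sample_path, errors="replace")
finally:
    os.remove(sample_path)

print(strict_failed)
print(rows_ignore)
print(rows_replace)
```

Note that 'ignore' discards data without any warning, so it is only appropriate when the dropped characters really do not matter.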

Update: Later I did need to work with some of the Unicode characters and couldn't ignore them. For an Excel-produced Unicode .csv file, the correct fix was to open the file with the 'utf_8_sig' codec. That codec decodes the file as UTF-8 and strips the byte order mark (the UTF-8 BOM, bytes EF BB BF) that Excel on Windows writes at the start of the file to signal that the content is UTF-8.
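To see the difference the codec makes, here is a small sketch under stated assumptions: the bytes in raw are made-up sample data standing in for an Excel "CSV UTF-8" export, and parse_csv_bytes is a hypothetical helper. It wraps an in-memory buffer rather than calling open(), but the encoding argument behaves the same way:

```python
import csv
import io

# Sample data (assumed): UTF-8 BOM, a header row, and one row with a non-ASCII character.
raw = b"\xef\xbb\xbfname,city\r\nAnna,Z\xc3\xbcrich\r\n"


def parse_csv_bytes(data, encoding):
    """Decode CSV bytes with the given codec and return (fieldnames, rows).

    Hypothetical helper for this example; not part of the csv module.
    """
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    reader = csv.DictReader(text)
    rows = list(reader)
    return reader.fieldnames, rows


# Plain "utf-8" keeps the BOM as U+FEFF glued to the first column name.
plain_fields, plain_rows = parse_csv_bytes(raw, "utf-8")

# "utf-8-sig" (an alias of utf_8_sig) strips the BOM before parsing.
sig_fields, sig_rows = parse_csv_bytes(raw, "utf-8-sig")

print(plain_fields)
print(sig_fields)
print(sig_rows)
```

With plain UTF-8 the first key becomes '\ufeffname', so row['name'] raises a KeyError; with 'utf-8-sig' the key is simply 'name'. The codec also reads files that have no BOM, so it is a safe choice when the source of the file is uncertain.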
