Error reading an .xls file

0

I have some files to read in Python, and I'm using the following call:

  

df = pd.read_csv(path, sep='\t')

And this generated the following error:

  

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 9: unexpected end of data

After searching the internet, I added engine='python' (df = pd.read_csv(path, engine='python', sep='\t')) and the file was read normally. Using encoding='ISO-8859-1' also solved the problem for that first file. I thought my problems were over, but when I went to read the other files, the following error occurred:

  

pandas.errors.ParserError: NULL byte detected. This byte cannot be processed in Python's native csv library at the moment, so please pass in engine='c' instead

Visually, both files look the same: same data type, practically the same size, same extension. Does anyone know what causes this incompatibility?

    
asked by anonymous 29.06.2018 / 17:10

1 answer

2

If, as the question title says, you have an "XLS" file, it cannot be read with functions meant for "CSV" files. They are fundamentally different formats. (CSV files do not normally contain "\x00" bytes, which is why I do think it is an Excel file.)
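One way to check what the files really are, rather than trusting the extension, is to look at their first bytes. The sketch below is only an illustration: the helper name sniff_spreadsheet_format is made up for this answer, and it relies on two well-known signatures (the OLE2 container used by legacy binary .xls files, and the ZIP header used by .xlsx). Other formats share these containers, so treat the result as a hint and not as proof.

```python
# Well-known file signatures ("magic bytes").
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 container (legacy .xls, also .doc etc.)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP archive (.xlsx, also .docx etc.)


def sniff_spreadsheet_format(path):
    """Guess a file's real format from its first bytes (hypothetical helper)."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "xls"      # binary Excel workbook: not readable as CSV
    if head.startswith(ZIP_MAGIC):
        return "xlsx"     # zipped XML workbook: not readable as CSV
    return "unknown"      # possibly plain text (CSV/TSV) or something else
```

Running this on the file that loads and on one that fails should show whether they are actually the same kind of file despite sharing an extension.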

If it is a .xlsx file, pandas has the read_excel function that you can use. If it is a legacy ".xls" spreadsheet, your options are to install another Python library to access the data and then convert it to a DataFrame, or to open it in Excel and save it in another format.
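As a rough sketch of the read_excel route: pandas delegates the actual parsing to a separate package selected through the engine parameter, with "openpyxl" for .xlsx and "xlrd" for legacy .xls, so installing xlrd may be all the "other library" step requires. The helper names excel_engine_for and load_excel below are invented for illustration, and the sketch assumes the extension matches the real file contents.

```python
from pathlib import Path

import pandas as pd

# Engine names accepted by pandas.read_excel; each is a separate package to install.
_ENGINES = {".xlsx": "openpyxl", ".xls": "xlrd"}


def excel_engine_for(path):
    """Pick a read_excel engine from the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in _ENGINES:
        raise ValueError(f"not an Excel extension: {suffix!r}")
    return _ENGINES[suffix]


def load_excel(path, sheet_name=0):
    """Read one sheet of an Excel workbook into a DataFrame."""
    return pd.read_excel(path, sheet_name=sheet_name, engine=excel_engine_for(path))
```

For example, df = load_excel(path) would replace the read_csv call from the question, provided the matching engine package is installed.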

    
29.06.2018 / 18:59