Reading a CSV file with accented characters in Python?

3

I'm learning Python 3 and got stuck trying to read a simple CSV file that contains the character 'à'. I've tried the decode and encode approaches I found on the internet, but nothing seems to work: it is always printed as '\xc3\xa0'. Note that I use Sublime Text to edit and run the code.

import csv

with open('teste.csv', 'r') as ficheiro:
    reader = csv.reader(ficheiro, delimiter=';')
    for row in reader:
        print(row)

The teste.csv file:

batata;14;True
pàtato;19;False
papa;10;False

The error:

Traceback (most recent call last):
  File "/Users/Mine/Desktop/testando csv.py", line 5, in <module>
    for row in reader:
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 16: ordinal not in range(128)
[Finished in 0.1s with exit code 1]

I look forward to your help.

    
asked by anonymous 26.12.2016 / 05:38

1 answer

6

It depends on which encoding the .csv file was saved in.

  

Note: in Python 2, the csv module does not support Unicode input directly (its documentation recommends UTF-8 or printable ASCII input).

UTF-8

If the .csv file is saved as UTF-8, you can do the following, according to the Python 3 documentation:

import csv

with open('teste.csv', encoding='utf-8') as f:
    reader = csv.reader(f, delimiter=';')
    for row in reader:
        print(row)

If the .csv file is not in UTF-8, an error similar to this will occur:

C:\Users\guilherme\Desktop>python testcsv.py
Traceback (most recent call last):
  File "testcsv.py", line 5, in <module>
    for row in reader:
  File "C:\Python\Python36-32\lib\codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 0: invalid continuation byte
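
If you do not know in advance how the file was saved, one option is to catch that error and retry with another encoding. This is only a sketch: read_rows is a hypothetical helper name, and the fallback order (UTF-8 first, then latin-1) is an assumption. Since latin-1 accepts every byte, it has to come last.

```python
import csv


def read_rows(path, encodings=("utf-8", "latin-1"), delimiter=";"):
    """Return all rows of a CSV file, trying each encoding in order.

    Hypothetical helper: the name and the default fallback order are
    assumptions made for this example.
    """
    last_error = None
    for enc in encodings:
        try:
            # newline="" is what the csv documentation recommends for open().
            with open(path, encoding=enc, newline="") as f:
                return list(csv.reader(f, delimiter=delimiter))
        except UnicodeDecodeError as exc:
            # Wrong guess: remember the error and try the next encoding.
            last_error = exc
    raise last_error or ValueError("no encodings given")
```

With the file from the question this would be called as read_rows('teste.csv'). Keep in mind that a successful latin-1 decode does not prove the file really is latin-1, only that no byte was rejected.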

If everything is correct, the output will look like this:

['batata', '14', 'True']
['pàtato', '19', 'False']
['papa', '10', 'False']

If it is a problem on a terminal in a Unix-like environment (e.g. Mac and Linux), apply this (I believe the document must also be saved in UTF-8 without BOM):

# -*- coding: utf-8 -*-
import csv

with open('teste.csv', encoding='utf-8') as f:
    reader = csv.reader(f, delimiter=';')
    for row in reader:
        print(row)
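
To see what your environment is actually using, a quick check like the one below may help. It is a small sketch using only the standard library. The 'ascii' codec in the question's traceback suggests that the editor ran Python without a UTF-8 locale, but that is an assumption about that particular setup.

```python
import locale
import sys

# Encoding that open() falls back to when no encoding= argument is given
# (in the Python versions discussed here).
default_encoding = locale.getpreferredencoding(False)
print("default for open():", default_encoding)

# Encoding that print() uses when writing to the terminal or editor console.
print("stdout:", sys.stdout.encoding)
```

If the first line reports something like ANSI_X3.4-1968 or US-ASCII, passing encoding='utf-8' to open() explicitly avoids depending on that default.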

Latin1

If the file is saved in ANSI, i.e. latin1, windows-1252, or iso-8859-1 (they are largely "compatible", differing only in bytes 0x80-0x9F), you can use encoding='latin-1' (although in Python 3 this was not needed); it should look like this:

import csv

with open('teste.csv', encoding='latin-1') as f:
    reader = csv.reader(f, delimiter=';')
    for row in reader:
        print(row)
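
To illustrate why the encoding has to match the file, here is a small sketch that works on the bytes directly, with no file involved. It uses the word 'pàtato' from the question; the variable names are just for the example.

```python
text = "pàtato"

utf8_bytes = text.encode("utf-8")      # b'p\xc3\xa0tato'
latin1_bytes = text.encode("latin-1")  # b'p\xe0tato'

# Decoding with the matching codec restores the original text.
assert utf8_bytes.decode("utf-8") == text
assert latin1_bytes.decode("latin-1") == text

# latin-1 accepts any byte, so a wrong guess gives garbled text, not an error.
mojibake = utf8_bytes.decode("latin-1")
print(repr(mojibake))  # 'pÃ\xa0tato'

# UTF-8 is stricter: the lone 0xe0 byte raises UnicodeDecodeError.
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 failed:", exc.reason)
```

The bytes \xc3\xa0 are the same ones shown in the question, so seeing 'Ã' followed by a strange space usually means a UTF-8 file was read as latin-1.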
    
26.12.2016 / 07:07