I'm getting a UnicodeDecodeError ('utf-8' codec) in a Python script and I'm not able to solve it. This is the error:
Traceback (most recent call last):
  File "file.py", line 448, in <module>
    fileOriginal.sliceFile(url) #Separa os arquivos para evitar MemoryError
  File "file.py", line 188, in sliceFile
    line = fileOriginal.readline()
  File "C:\Python34\lib\codecs.py", line 313, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte
It occurs when reading a .txt file. The file is encoded as UTF-8 without a BOM, so I do not understand why this error is raised. The error occurs at the line "line = fileOriginal.readline()" in the following code:
Code:
for (path, dirs, files) in os.walk(url):
    contDec = 0  # Counts the declarations
    contTempFiles = 0  # Counts the temporary files
    for file in files:
        fileOriginal = open(os.path.join(url, file), encoding="utf8")
        endFile = False
        contLines = 0
        contDec = 0
        cont = 0
        line = ''
        while 'ZZZZZ|' not in line:
            if cont == 0:
                contTempFiles += 1
                tempFile = open(os.path.join('separados', str(contTempFiles) + '_' + str(self.getFileName(file)) + '.txt'), 'w', encoding='utf-8')
            line = fileOriginal.readline()  # Error on this line
            if line[0:5] == '99999':
                tempFile.write(line)
                contDec += 1
            if contDec <= 200000:
                tempFile.write(line)
                cont += 1
            else:
                contDec = 0
                cont = 0
                tempFile.close()
        fileOriginal.close()
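The byte 0xd0 followed by something that is not a continuation byte suggests that at least one line in the file may not actually be valid UTF-8, whatever the editor reports. A minimal sketch for checking this is below. It assumes the file is line-oriented, as the readline() loop above implies, and it reads the file in binary mode one line at a time so it should not need the whole file in memory. The helper name find_invalid_utf8_lines and its limit parameter are hypothetical and not part of the original script.

```python
def find_invalid_utf8_lines(path, limit=10):
    """Return up to `limit` tuples (line_number, byte_offset, raw_line)
    for lines of the file at `path` that are not valid UTF-8."""
    bad = []
    # Binary mode: no decoding happens while reading, so reading cannot fail.
    with open(path, 'rb') as f:
        for number, raw in enumerate(f, start=1):
            try:
                raw.decode('utf-8')
            except UnicodeDecodeError as exc:
                # exc.start is the offset of the offending byte within this line.
                bad.append((number, exc.start, raw))
                if len(bad) >= limit:
                    break
    return bad

# Hypothetical usage: print the raw bytes of each offending line.
# for number, offset, raw in find_invalid_utf8_lines('some_file.txt'):
#     print(number, offset, raw)
```

If this returns an empty list for every input file, the files are valid UTF-8 and the cause lies elsewhere. If it reports lines, the printed raw bytes show what is really stored there, which can help decide whether a different encoding argument to open() is appropriate.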
Python version: 3.4.0. Can anyone help me with this? Thanks!