pandas and unidecode: how to avoid the SettingWithCopyWarning (copy of a slice from a DataFrame)?

In Python 3 with pandas, I read CSV files into DataFrames. In some columns I need to remove accents (the text is in Portuguese), which I do with unidecode.

With some files, however, a warning message appears:

import pandas as pd
import unidecode

def f(text):
    # Transliterate accented characters to plain ASCII
    return unidecode.unidecode(text)

candidatos_2014 = pd.read_csv("candidatos_2014.csv", sep=',', encoding='utf-8', converters={'cpf': str, 'sequencial': str})

candidatos_2014.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26245 entries, 0 to 26244
Data columns (total 9 columns):
Unnamed: 0         26245 non-null int64
uf                 26245 non-null object
cargo              26245 non-null object
nome_completo      26245 non-null object
sequencial         26245 non-null object
cpf                26245 non-null object
nome_urna          26245 non-null object
partido_eleicao    26245 non-null object
situacao           26245 non-null object
dtypes: int64(1), object(8)
memory usage: 1.8+ MB

eleitos = candidatos_2014[(candidatos_2014['situacao'] == 'ELEITO POR QP') | (candidatos_2014['situacao'] == 'ELEITO POR MÉDIA') | (candidatos_2014['situacao'] == 'ELEITO')]

eleitos_d_2014 = eleitos[(eleitos['cargo'] == 'DEPUTADO FEDERAL')]

eleitos_d_2014["nome_completo"] = eleitos_d_2014["nome_completo"].apply(f)

/home/reinaldo/Documentos/Code/seguranca/lib/python3.6/site-packages/ipykernel_launcher.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """

eleitos_d_2014.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 513 entries, 144 to 26209
Data columns (total 9 columns):
Unnamed: 0         513 non-null int64
uf                 513 non-null object
cargo              513 non-null object
nome_completo      513 non-null object
sequencial         513 non-null object
cpf                513 non-null object
nome_urna          513 non-null object
partido_eleicao    513 non-null object
situacao           513 non-null object
dtypes: int64(1), object(8)
memory usage: 40.1+ KB

The accents appear to have been removed. But is there any risk that some rows were not updated? How can I avoid this warning message, and how should I use .loc here?

python character-encoding pandas

asked by anonymous 02.03.2018 / 15:03

1 answer

score 1 · Answer 1

I solved the same problem (with the same dataset, in fact) without using unidecode.

import pandas as pd

candidatosal2014 = pd.read_csv("candidatos_alagoas_2014.csv", encoding="latin1", delimiter=";", header=None, usecols=[9, 10, 14, 43, 44])

# Decompose accented characters (NFKD), then drop the non-ASCII combining marks
candidatosal2014[10] = candidatosal2014[10].str.normalize('NFKD').str.encode('ascii', errors='ignore').str.decode('utf-8')

# 1 = elected, 2 = elected by parliamentary quotient, 3 = elected by average
display(candidatosal2014.loc[candidatosal2014[43].isin([1, 2, 3])])
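
As for the warning itself: in pandas versions that emit SettingWithCopyWarning, it is raised because eleitos_d_2014 was produced by filtering another filtered DataFrame, so pandas cannot tell whether the assignment is meant to reach the parent frame. Boolean filtering returns a copy, so in the question the accents should have been removed from every row of eleitos_d_2014; only the original frame is left untouched. The sketch below shows two common ways to make the intent explicit. It is a minimal illustration, not the only fix: the small DataFrame and the helper strip_accents are invented for the example, and the accent removal reuses the NFKD approach from this answer so that it runs without unidecode.

```python
import pandas as pd

# Hypothetical miniature of the candidates table (invented rows, for illustration only)
candidatos = pd.DataFrame({
    "nome_completo": ["JOSÉ DA CONCEIÇÃO", "MARIA ANTÔNIA", "JOÃO SILVA", "ANA LÚCIA"],
    "cargo": ["DEPUTADO FEDERAL", "DEPUTADO FEDERAL", "SENADOR", "DEPUTADO FEDERAL"],
    "situacao": ["ELEITO POR QP", "ELEITO POR MÉDIA", "ELEITO", "NÃO ELEITO"],
})


def strip_accents(series):
    """Remove accents from a string Series (NFKD decomposition, then ASCII only)."""
    return (
        series.str.normalize("NFKD")
        .str.encode("ascii", errors="ignore")
        .str.decode("utf-8")
    )


# A single boolean mask replaces the two chained filters from the question
mask = (
    candidatos["situacao"].isin(["ELEITO POR QP", "ELEITO POR MÉDIA", "ELEITO"])
    & (candidatos["cargo"] == "DEPUTADO FEDERAL")
)

# Option 1: take an explicit copy of the subset, then modify the copy freely
eleitos_d = candidatos.loc[mask].copy()
eleitos_d["nome_completo"] = strip_accents(eleitos_d["nome_completo"])

# Option 2: update the original frame in place through a single .loc call
candidatos.loc[mask, "nome_completo"] = strip_accents(
    candidatos.loc[mask, "nome_completo"]
)
```

Option 1 fits when the subset is a separate working table, as in the question; Option 2 fits when the original frame should be updated. Keeping unidecode is also possible: replace the body of strip_accents with series.apply(unidecode.unidecode). One caveat of the NFKD approach is that characters with no ASCII decomposition (for example ß or ø) are dropped rather than transliterated, which does not affect the usual Portuguese accents.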