How to remove special characters and periods from a string column in a DataFrame?

import pandas as pd

raw_data = {'NAME': ['José L. da Silva',
                     'Ricardo Proença',
                     'Antônio de Morais']}

df = pd.DataFrame(raw_data, columns=['NAME'])

How can I turn the names in the NAME column into:

Jose L da Silva (no period or accent),
Ricardo Proenca (without the cedilla) and
Antonio de Morais (without the accent)?

python pandas

asked by anonymous 04.07.2017 / 15:32

1 answer

score 1 · Answer 1

You can use the apply() method of a Series object. It calls a function of your choice on each element and collects the return values into a new Series, so you can define a correction function and apply it to the column. For example:

def corrigir_nomes(nome):
    # Drop periods and swap each accented character for its plain equivalent
    nome = nome.replace('.', '').replace('ç', 'c').replace('ô', 'o').replace('é', 'e')
    return nome

Then apply it to the column you want:

df['NAME'] = df['NAME'].apply(corrigir_nomes)

Displaying df['NAME'] afterwards will show something like:

0      Jose L da Silva
1      Ricardo Proenca
2    Antonio de Morais
Name: NAME, dtype: object
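
The function above only handles the three accented characters in this sample. If other accents may appear in the data, a more general option is to decompose each character with the standard library's unicodedata module and drop the combining marks. The following is a minimal sketch, not part of the original answer; the helper name strip_accents is an assumption chosen for illustration:

```python
import unicodedata

import pandas as pd


def strip_accents(text):
    # NFKD splits an accented character into its base letter plus a combining mark
    decomposed = unicodedata.normalize('NFKD', text)
    # Keep only the characters that are not combining marks
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


df = pd.DataFrame({'NAME': ['José L. da Silva',
                            'Ricardo Proença',
                            'Antônio de Morais']})

# Remove accents element by element, then drop literal periods
df['NAME'] = df['NAME'].apply(strip_accents).str.replace('.', '', regex=False)
```

Passing regex=False makes the period a literal character instead of the regex wildcard. Note that this approach only affects characters that decompose into a base letter plus a mark; letters such as 'ø' or 'ß' have no such decomposition and would be left unchanged.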