I'm trying to replace every first name that appears in a list with <name> in one text column of a large dataframe. My attempt, shown below, fails with a TypeError.
List of names (the full list is too large to show):
Jack
Liam
John
Ethan
George
...
Small dataframe example:
A        B          C
French   house      Phone <phone_numbers>
English  house      email <adresse_mail>
French   apartment  my name is Liam
French   house      Hello George
English  apartment  Ethan, my phone is <phone_numbers>
My script:
import re
import pandas as pd
from pandas import Series

df = pd.read_excel('data_frame.xlsx')
data = Series.to_string(df['Descricao'])
first_names = open('names_list.txt', 'r')
names_read = first_names.readlines()

def names_teste(no_names):
    list_to_string = ''.join(names_read)
    for l in list_to_string.split('\n'):
        replaces = no_names.replace([l, '<name>'], l)
    return replaces

result = names_teste(no_names)
print(result)
Running it raises this error:
runfile('C:/Users/marin/Desktop/Python/replaces.py', wdir='C:/Users/marin/Desktop/Python')
Traceback (most recent call last):
  File "<ipython-input-30-d10d01d4e428>", line 1, in <module>
    runfile('C:/Users/marin/Desktop/Python/replaces.py', wdir='C:/Users/marin/Desktop/Python')
  File "C:\Programmes\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "C:\Programmes\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/marin/Desktop/Python", line 121, in <module>
    result = names_teste(no_names)
  File "C:/Users/marin/Desktop/Python", line 103, in names_teste
    replaces = no_names.replace([l, '<name>'], l)
TypeError: replace() argument 1 must be str, not list
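For reference, the error can be reproduced in isolation. As far as I understand it, str.replace expects two strings (the old substring, then the new one), so passing a list as the first argument fails. A minimal sketch, where the sample string is made up for illustration:

```python
text = "my name is Liam"  # hypothetical sample string, not from the real data

# Passing a list as the first argument reproduces the TypeError
try:
    text.replace(["Liam", "<name>"], "Liam")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# str.replace takes the old substring first, then the replacement
fixed = text.replace("Liam", "<name>")
print(fixed)
```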
Expected output:
C
Phone <phone_numbers>
email <adresse_mail>
my name is <name>
Hello <name>
<name>, my phone is <phone_numbers>
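One possible way to get this output is to build a single regular expression from the name list and substitute every whole-word match. This is only a sketch under assumptions: the names and rows below are copied from the small example rather than read from names_list.txt or the Excel file, and build_name_pattern is a hypothetical helper, not part of any library.

```python
import re

# Assumed inputs: a few names from the list and the sample rows of column C
names = ["Jack", "Liam", "John", "Ethan", "George"]
rows = [
    "Phone <phone_numbers>",
    "email <adresse_mail>",
    "my name is Liam",
    "Hello George",
    "Ethan, my phone is <phone_numbers>",
]

def build_name_pattern(names):
    """Compile one regex that matches any listed name as a whole word."""
    # Strip whitespace/newlines and skip empty entries (e.g. from readlines())
    cleaned = [n.strip() for n in names if n.strip()]
    # Try longer names first so a compound name is matched before its prefix
    cleaned.sort(key=len, reverse=True)
    alternatives = "|".join(re.escape(n) for n in cleaned)
    return re.compile(r"\b(?:" + alternatives + r")\b")

pattern = build_name_pattern(names)

# Replace every matched name in each row with the <name> placeholder
masked = [pattern.sub("<name>", row) for row in rows]
for line in masked:
    print(line)
```

On the dataframe itself, the same compiled pattern could presumably be passed to df['C'].str.replace(pattern, '<name>', regex=True), where C is the column name from the small example above. The word boundaries (\b) are there so that a short name is not replaced inside a longer word.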