How to remove unwanted characters from a list of strings?

I'm new to Python and could not find an answer to my question. I get a list of texts from the database and transform it into a list of strings, as shown below:

textosPuros = df['texto']
# print(textosPuros)

textoMinusculo = textosPuros.str.lower().str.split(' ')

# print(textoMinusculo)

textoLimpo = [item for item in textoMinusculo if item not in ['\n', '\t', '/', '.', '-', '(', ')']]

In an attempt to clean the strings so that I can work with them, I have implemented the last line, but I still have bad characters:

[['\testá', 'tossindo', 'noite.mãe', 'fez', 'inalação', 'com', 'berotec', 'essa', 'noite,com', 'melhora', '.está', 'usando', 'o', 'piemonte', 'há', '2', 'cuidou', '', ',', '----', 'com', '', 'febre.é', 'muito', 'ansioso', 'e', 'agitado.\nex.f:beg', 'corado,com', 'taquipnéia', 'leve', 'afebril', 's/sinais', 'meníngeos', 'otosc:nl', 'cavum:hiperemia', 'pulmões:esc+sibilos', 'abdome', 'nl\t'],['mais strings','episódio\n\nap\n-','\t0000000000\t'],['outra lista','menopausea.\n\nexames']]

How do I remove these unwanted characters, such as \t, \n, :, . and ( ?

    
asked by anonymous 23.08.2018 / 16:24

1 answer

This should work, assuming lista is a flat list of strings:

novo = []
for x in lista:
    item = x
    for y in ['\n', '\t', '/', '.', '-', '(', ')']:
        item = item.replace(y, "")
    novo.append(item)

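or, in Python 3, with str.translate and a deletion table built by str.maketrans (the two-argument form x.translate(None, ...) only works on Python 2 byte strings):

# Build the table once; the third argument lists the characters to delete
tabela = str.maketrans("", "", "\n\t/.-()")
novo = []
for x in lista:
    novo.append(x.translate(tabela))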
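
Both loops above expect a flat list of strings, while the data in the question is a list of lists. Deleting characters also glues neighbouring words together ('noite.mãe' becomes 'noitemãe'). As a rough sketch of one alternative, not the only way to do it, the unwanted characters can be treated as separators and the empty pieces dropped afterwards. The names UNWANTED, limpar_tokens and limpar_listas are made up for this example, and adding ',' and ':' to the character set is an assumption based on the sample data.

```python
import re

# Characters treated as separators: the ones from the answer above,
# plus ',' and ':' (an assumption based on the sample in the question).
UNWANTED = "\n\t/.-(),:"

# Matches a run of one or more unwanted characters or whitespace.
_SEPARATORS = re.compile("[" + re.escape(UNWANTED) + r"\s]+")


def limpar_tokens(tokens):
    """Split each token on unwanted characters and drop empty pieces."""
    limpos = []
    for token in tokens:
        limpos.extend(parte for parte in _SEPARATORS.split(token) if parte)
    return limpos


def limpar_listas(listas):
    """Apply limpar_tokens to every inner list of a list of lists."""
    return [limpar_tokens(tokens) for tokens in listas]


exemplo = [["\testá", "tossindo", "noite.mãe", "", ",", "----", "nl\t"],
           ["episódio\n\nap\n-", "\t0000000000\t"]]
print(limpar_listas(exemplo))
```

If the data is still a pandas Series of lists, something like textoMinusculo.apply(limpar_tokens) should apply the same cleaning row by row. Note that splitting also breaks abbreviations such as 's/sinais' into two tokens, so deletion may be the better choice if those need to stay intact.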
    
23.08.2018 / 18:35