How do I find specific words in a text file in Python?

1

I need to write a program that searches a text file for a word and saves the line that follows it in a string. I could not find anything explaining how to do this; I only found code that counts the number of words in a file, like this:

num_lines = 0
num_words = 0
with open('File_of_data.txt', 'r', encoding='utf8') as j:
    for line in j:
        words = line.split()  # split on whitespace to get the words of the line
        num_lines += 1
        num_words += len(words)
print(num_lines)
print(num_words)

When the word is found, for example a person's name such as "Carlos", the line below it contains that person's information, and I need to save that line in order to do calculations with it.

    
asked by anonymous 14.10.2017 / 04:39

1 answer

2

You can build a simple state machine that reads the input file line by line, looking for the person's first name or one of their last names.

When it finds a person who matches the search, the state machine assumes that the next line read contains that person's data.

Once the line containing the matched person's data has been read, a dictionary can be assembled and returned for general use:

def pesquisar_registro(arq, txt):
    nome = ""  # stays empty while we are still looking for a matching name
    with open(arq, 'r', encoding='utf8') as a:
        for linha in a:
            linha = linha.strip('\n')
            if nome == "":
                # State 1: check whether the search term is one of the words in the line
                if txt in linha.split():
                    nome = linha
            else:
                # State 2: the line right after the matched name holds the data
                registro = linha.split(',')
                return {
                    "nome": nome,
                    "cod": registro[0],
                    "pais_nasc": registro[1],
                    "ano_nasc": registro[2],
                    "pais_morte": registro[3],
                    "ano_morte": registro[4],
                }
    return None  # no match, or the matched name was the last line of the file


print(pesquisar_registro('fisicos.txt', 'Einstein'))
print(pesquisar_registro('fisicos.txt', 'Max'))
print(pesquisar_registro('fisicos.txt', 'Michael'))
print(pesquisar_registro('fisicos.txt', 'Curie'))
print(pesquisar_registro('fisicos.txt', 'Obama'))

fisicos.txt

Albert Einstein
100,Alemanha,1879,EUA,1955
Isaac Newton
200,Reino Unido,1643,Reino Unido,1727
Galileo Galilei
300,Italia,1564,Italia,1642
Marie Curie
400,Polonia,1867,Polonia,1934
Erwin Schrodinger
500,Austria,1887,Austria,1961
Michael Faraday
600,Reino Unido,1791,Reino Unido,1867
Max Planck
700,Alemanha,1858,Alemanha,1947

Output (the order of the keys may differ depending on the Python version):

{'ano_nasc': '1879', 'pais_nasc': 'Alemanha', 'nome': 'Albert Einstein', 'ano_morte': '1955', 'cod': '100', 'pais_morte': 'EUA'}
{'ano_nasc': '1858', 'pais_nasc': 'Alemanha', 'nome': 'Max Planck', 'ano_morte': '1947', 'cod': '700', 'pais_morte': 'Alemanha'}
{'ano_nasc': '1791', 'pais_nasc': 'Reino Unido', 'nome': 'Michael Faraday', 'ano_morte': '1867', 'cod': '600', 'pais_morte': 'Reino Unido'}
{'ano_nasc': '1867', 'pais_nasc': 'Polonia', 'nome': 'Marie Curie', 'ano_morte': '1934', 'cod': '400', 'pais_morte': 'Polonia'}
None
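
Since the question mentions doing calculations with the saved line, here is a minimal sketch of a variation on the idea above. It is not the code from this answer: the name find_record and the in-memory sample (a cut-down copy of fisicos.txt) are assumptions made only for illustration. Instead of tracking the state in a variable, it asks the iterator for the following line with next(), then converts the year fields to int before subtracting them.

```python
import io


def find_record(lines, word):
    """Return (name, fields) for the first line containing `word`, else None.

    `lines` can be an open text file or any iterable of strings.
    """
    it = iter(lines)
    for line in it:
        name = line.strip()
        if word in name.split():
            data = next(it, None)  # the line right after the match
            if data is None:       # the name was the last line: nothing to read
                return None
            return name, data.strip().split(',')
    return None


# Hypothetical sample standing in for fisicos.txt
sample = io.StringIO(
    "Albert Einstein\n"
    "100,Alemanha,1879,EUA,1955\n"
    "Marie Curie\n"
    "400,Polonia,1867,Polonia,1934\n"
)

name, fields = find_record(sample, "Curie")
# Fields are strings; convert them before doing arithmetic
age = int(fields[4]) - int(fields[2])
print(name, age)  # Marie Curie 67
```

With a real file, the same function could be called as find_record(open('fisicos.txt', encoding='utf8'), 'Curie'), preferably inside a with block so the file is closed afterwards.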
    
15.10.2017 / 00:51