Replace words between two files

2

I'm trying to make substitutions between two files: if a word from file 1 appears in the second column of file 2, that word should be replaced with the word from the first column of the same row of file 2.

File 1:

  

bought large carpet expected large package consequence surprise size small wanted to check content consistent delivery delivery driver not allowed still recommends order

     

personal transport work shop full pass everywhere stores without person occupies genent movement overview take example shop forum

     

buy time product damaged address table time wanted order copy buy product damaged send

File 2:

  

buy, bought

     

place, places

     

store, shops

Script:

import csv


with open ("arquivo1.txt", "r") as f, open("arquivo2.csv", "r") as f1:
    text = f.read().split('\n')
    text_csv = csv.reader(f1, delimiter = ',')

    for item in text: # iterate over the list of strings
        for novo_item in item.split(): # try to split each sentence into words without losing the fact that it is a sentence

            for elements in text_csv: # iterate over the rows of file 2
                lexema = elements[1] # columns
                lema = elements[0]

                if novo_item == lexema: # if a word from file 1 is in the second column of file 2
                    novo_item = novo_item.replace(novo_item, lema) # replace that word with the first column of file 2

                print (novo_item)

Expected output:

  

buy big carpet expected big package consequence surprise size small wanted to check content consistent order delivery driver not allowed still recommends order

     

store without person occupies genent movement overview take example shop forum

     

buy time product damaged address table time wanted order copy buy product damaged send

My output:

  

bought

     

bought

     

bought

     

bought

     

...

     

buy

     

buy

     

buy

     

buy

     

...

    
asked by anonymous 29.06.2018 / 13:50

2 answers

3

Another way, which keeps the "\n" line breaks in the text, is the code below; it is even simpler than yours.

import csv

with open("arquivo1.txt", "r") as f, open("arquivo2.csv", "r") as f1:
    text = f.read()
    # skipinitialspace drops the blank after each comma ("buy, bought")
    text_csv = csv.reader(f1, delimiter=",", skipinitialspace=True)
    for elements in text_csv:
        # replace every occurrence of column 2 with column 1
        text = text.replace(elements[1], elements[0])
print(text)
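One caveat with the replace-based approach: `str.replace` also matches inside longer words, so a pair such as "store, shops" would turn "workshops" into "workstore". A minimal sketch of a whole-word variant follows, assuming the mapping fits in a dictionary. The helper names `load_mapping` and `replace_words` are made up for this illustration, and `io.StringIO` stands in for the two files.

```python
import csv
import io


def load_mapping(csv_file):
    """Build {inflected_form: lemma} from rows shaped like 'buy, bought'."""
    reader = csv.reader(csv_file, delimiter=",", skipinitialspace=True)
    return {row[1].strip(): row[0].strip() for row in reader if len(row) >= 2}


def replace_words(text, mapping):
    """Replace whole words only, keeping the original line breaks."""
    return "\n".join(
        " ".join(mapping.get(word, word) for word in line.split())
        for line in text.split("\n")
    )


# In-memory stand-ins for arquivo2.csv and arquivo1.txt (hypothetical sample data)
mapping = load_mapping(io.StringIO("buy, bought\nplace, places\nstore, shops\n"))
sample = "bought large carpet\nwork shop full shops workshops"
result = replace_words(sample, mapping)
print(result)
```

Because each word is looked up as a whole, "workshops" is left alone while "shops" becomes "store". Note that `line.split()` collapses runs of spaces inside a line into a single space.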
    
29.06.2018 / 18:54
1

I think it would be something like this:

# coding=utf-8

import csv
aa = ""

with open("arquivo1.txt", "r") as f, open("arquivo2.csv", "r") as f1:
    text = f.read().split('\n')
    # read the rows into a list so they can be iterated again for every word
    text_csv = list(csv.reader(f1, delimiter=',', skipinitialspace=True))
    for item in text:  # iterate over the list of lines
        for novo_item in item.split():  # split each sentence into words
            for elements in text_csv:  # iterate over the rows of file 2
                lexema = elements[1]  # columns
                lema = elements[0]

                if novo_item == lexema:  # if a word from file 1 is in the second column of file 2
                    novo_item = lema  # replace that word with the first column of file 2
            aa += novo_item + ' '
        aa += '\n'  # keep each sentence on its own line

print(aa)
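A detail worth checking in any version that loops over the CSV rows once per word: `csv.reader` returns a one-shot iterator, so after the first full pass there are no rows left, and the inner loop does nothing for later words. The sketch below illustrates this with hypothetical in-memory data instead of arquivo2.csv.

```python
import csv
import io

# Hypothetical sample rows standing in for arquivo2.csv
data = "buy, bought\nstore, shops\n"

rows = csv.reader(io.StringIO(data), skipinitialspace=True)
first_pass = [row for row in rows]   # consumes the reader
second_pass = [row for row in rows]  # nothing left to read

print(first_pass)
print(second_pass)

# Materialising the rows once makes them reusable in nested loops
table = list(csv.reader(io.StringIO(data), skipinitialspace=True))
passes = [len(table) for _ in range(3)]
print(passes)
```

Wrapping the reader in `list(...)` before the loops is enough to let every word be compared against every row.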
    
29.06.2018 / 14:35