Python - Breaking down a text file


Good evening,

I need to pick lines out of a file and, depending on the content of each line, append them to another file. That is, given a set of 6 words, each line should go to the file that corresponds to the word it contains.

The number of words can grow from 6 to 8, 10, and so on, in which case 8, 10, or more files have to be created.

I first tried to create an array in which each row would hold the lines that contain one of the words.

That did not work: I could not append to a row, because the array would not accept input unless I specified both row and column.

For example, I want all lines containing orange (laranja in the code) to go to the file orange.txt and all lines containing plum (ameixa) to go to the file plum.txt.

The idea is to write the code without a separate "if plum", "if orange" for each word.

I tried putting the words in a list, but I could not write to the files without those ifs. For example:

frutas = ['laranja', 'ameixa']
laranja = []  # lines whose second column is 'laranja'

with open('frutas.txt', 'r') as arq_fruta:
    for line in arq_fruta:
        coluna = line.split()
        # hard-coded check for the first fruit; each new fruit needs another "if"
        if coluna[1] == frutas[0]:
            laranja.append(coluna[0] + ' ' + coluna[3] + '\n')

What I could not do was turn that last line into something indexed by the list, for example:

fruta[i].append(coluna[0] + ' ' + coluna[3] + '\n')  # just an example, does not work

where fruta[0] would be the list of all lines containing only orange and fruta[1] the list of all lines containing plum.

I tried to create an array, but it did not work: the array asks for a row and a column on input, and I do not have that information, since I am reading the lines and want to send them straight to the file.

As for the files, I also tried something more "straightforward", but it did not work either.

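for i in range(len(frutas)):  # start at 0 so the first fruit is not skipped
    # fruta[i] is the hypothetical per-fruit list described above
    with open(frutas[i] + '.txt', 'w') as arq:
        arq.writelines(fruta[i])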

Is there a more "correct" way to do this? I only got it working with the "if" version, which would need many changes whenever I had to add another fruit.

    
asked by anonymous 16.11.2018 / 00:43

1 answer


A more direct way is to create a list of open files, one for each fruit. That way a single piece of code writes to all the files directly, without first splitting the lines into lists in memory. It can handle input files of any size, because each line is written straight to its destination.

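frutas = ['laranja', 'ameixa']
arquivos = [open(fruta + '.txt', 'w') for fruta in frutas]

try:
    with open('frutas.txt', 'r') as arq:
        for linha in arq:
            for fruta, arquivo in zip(frutas, arquivos):
                if fruta in linha:
                    arquivo.write(linha)
finally:
    # close every output file so buffered data is flushed to disk
    for arquivo in arquivos:
        arquivo.close()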
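
When the number of output files is not known in advance, contextlib.ExitStack from the standard library is one way to guarantee that every file opened this way is closed, even if an error occurs halfway through. The following is only a sketch of that idea: the function name split_by_word and its parameters source_path and words are made up for illustration and are not part of the answer above.

```python
from contextlib import ExitStack


def split_by_word(source_path, words):
    """Copy each line of source_path to '<word>.txt' for every word it contains.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    with ExitStack() as stack:
        # One output file per word; ExitStack closes all of them on exit.
        outputs = {
            word: stack.enter_context(open(word + '.txt', 'w'))
            for word in words
        }
        with open(source_path, 'r') as source:
            for line in source:
                for word, out in outputs.items():
                    # Same substring test as in the answer above.
                    if word in line:
                        out.write(line)
```

Calling split_by_word('frutas.txt', ['laranja', 'ameixa']) would then produce laranja.txt and ameixa.txt, and supporting another fruit only means adding it to the list.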

If you really want to separate the lines into variables in memory, one solution is to combine a dictionary with lists, which collections.defaultdict makes easier:

import collections

frutas = ['laranja', 'ameixa']
# missing keys start out as an empty list, so no "if" per fruit is needed
por_fruta = collections.defaultdict(list)

with open('frutas.txt', 'r') as arq:
    for linha in arq:
        for fruta in frutas:
            if fruta in linha:
                por_fruta[fruta].append(linha)

This leaves all the lists in the por_fruta dictionary, ready to be saved to files afterwards:

for fruta, linhas in por_fruta.items():
    with open(fruta + '.txt', 'w') as f:
        f.writelines(linhas)
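
The code in the question matches the fruit against the second column and keeps only the first and fourth columns, rather than testing the whole line. As a rough sketch of how the defaultdict approach could be adapted to that, assuming every relevant line has at least four whitespace-separated columns with the fruit name in the second one (a layout inferred from the question's snippet, not confirmed), something like this might work. The function names group_by_column and save_groups are hypothetical.

```python
import collections


def group_by_column(source_path, frutas):
    """Group 'col0 col3' strings by the fruit named in column 1.

    Hypothetical helper; the column layout is assumed from the question.
    """
    por_fruta = collections.defaultdict(list)
    wanted = set(frutas)
    with open(source_path, 'r') as arq:
        for linha in arq:
            coluna = linha.split()
            # Skip lines too short to have columns 0, 1 and 3.
            if len(coluna) < 4:
                continue
            if coluna[1] in wanted:
                por_fruta[coluna[1]].append(coluna[0] + ' ' + coluna[3] + '\n')
    return por_fruta


def save_groups(por_fruta):
    """Write each fruit's collected lines to '<fruit>.txt'."""
    for fruta, linhas in por_fruta.items():
        with open(fruta + '.txt', 'w') as f:
            f.writelines(linhas)
```

With this shape, save_groups(group_by_column('frutas.txt', frutas)) covers both steps, and a fruit that never appears in the input simply gets no file.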
    
16.11.2018 / 02:09