Find and Replace in bash

2

I need to find outdated rows in a csv file and replace them with new rows.

These commands find the lines that will be replaced (old) and the lines that will replace them (new):

linhas_antigas=$(diff -y arquivo_com_linhas_antigas.csv arquivo_com_linhas_novas.csv | grep -e "|" | awk -F"|" '{ print $1 }')
linhas_novas=$(diff -y arquivo_com_linhas_antigas.csv arquivo_com_linhas_novas.csv | grep -e "|" | awk -F"|" '{ print $2 }' | sed 's/\t *//')

Then I run the following snippet to do the replacement:

while read -r arquivo_antigo 
do
    echo ${arquivo_antigo//"$linhas_antigas"/"$linhas_novas"} 
done < arquivo_com_linhas_antigas.csv

Now the problem: when diff reports only one differing line between the two files, the replacement works without issue. But if there are two or more rows to update, none of them is replaced.

I imagine it would be easier if the variables $linhas_antigas and $linhas_novas were arrays.

But how do I do that? Is there another solution?

    
asked by anonymous 21.06.2018 / 20:12

2 answers

2

As I understand it, the goal is to update a backup file. Contents of file 1 (l1.csv):

A1 B1;10;52;3
A2 B2;12;52;3
A3 B3;10;52;3
A4 B4;10;34;3
A5 B5;10;52;3
A6 B6;10;33;3
A7 B7;08;52;4

Contents of file 2 (l2.csv):

A1 B1;10;52;1
A2 B2;12;52;2
A3 B3;10;52;3
A4 B4;10;34;3
A5 B5;10;52;5
A6 B6;10;33;6
A7 B7;08;52;4

Script:

#!/bin/bash
# Number of differing lines, which determines how many times the loop runs
# (a "while the two files still differ" loop would also work)
qt=$(diff -y --suppress-common-lines l1.csv l2.csv | wc -l)
for (( i = 0; i < qt; i++ )); do
    # Always take the first remaining differing line
    linha=$(diff -y --suppress-common-lines l1.csv l2.csv | head -n1)
    # Old line: fields 1 and 2 of the side-by-side output
    la=$(awk '{print $1,$2}' <<< "$linha")
    # New line: fields 4 and 5 (field 3 is the "|" separator)
    ln=$(awk '{print $4,$5}' <<< "$linha")
    # Load the file contents into a variable
    arq=$(cat l1.csv)
    # Replace the old line with the new one (quoted so it matches literally)
    arq=${arq//"$la"/"$ln"}
    # Write the change back to the original file
    echo "$arq" > l1.csv
done

Output:

A1 B1;10;52;1
A2 B2;12;52;2
A3 B3;10;52;3
A4 B4;10;34;3
A5 B5;10;52;5
A6 B6;10;33;6
A7 B7;08;52;4

The method above makes the change line by line. If I understand correctly and you simply want to update one file to match another, you could do the following instead:

#!/bin/bash
if [[ -n $(diff -q l1.csv l2.csv) ]]; then
    cat l2.csv > l1.csv
fi

Your script did not work because you put all the differing lines into a single variable; the replacement has to be done line by line. That is why it worked when only one line differed.
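To illustrate the line-by-line idea with arrays, as the question suggests, here is a minimal sketch. It does not use diff: it loads both files into arrays and compares them position by position. It assumes bash 4 or later (for mapfile) and that both files list their rows in the same order; the function name update_lines is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print old_file, taking each line that differs from the
# line at the same position in new_file from new_file instead.
# Assumption: rows appear in the same order in both files.
update_lines() {
    local old_file=$1 new_file=$2
    local -a linhas_antigas linhas_novas
    local i

    # One array element per line (mapfile needs bash 4+)
    mapfile -t linhas_antigas < "$old_file"
    mapfile -t linhas_novas < "$new_file"

    for i in "${!linhas_antigas[@]}"; do
        if (( i < ${#linhas_novas[@]} )) &&
           [[ ${linhas_antigas[i]} != "${linhas_novas[i]}" ]]; then
            # The line changed: emit the new version
            printf '%s\n' "${linhas_novas[i]}"
        else
            # Unchanged, or no counterpart in new_file: keep the old line
            printf '%s\n' "${linhas_antigas[i]}"
        fi
    done
}
```

For example, running update_lines l1.csv l2.csv > l1.tmp && mv l1.tmp l1.csv would update l1.csv through a temporary file, which avoids truncating the file while it is still being read.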

    
25.06.2018 / 15:34
1

You can implement a script in gawk to process your files, for example:

BEGIN{
}

{
    if( FNR == NR )
    {
        a[FNR] = $0;
        next;
    }

    print ((a[FNR] == $0) ? a[FNR] : $0);
}

END{
}

Or, as a one-liner:

$ awk 'FNR==NR{a[FNR]=$0;next}{print a[FNR]==$0?a[FNR]:$0}' antigas.csv novas.csv

Assuming the input .csv files look something like this:

antigas.csv (old):

JESUS DE NAZARE;15;21;1
MARIA MAGDALENA;12;52;3
JOAO DE DEUS;33;52;5
MATUZALEM DA COSTA;10;34;7
MICHAEL JACKSON;10;28;2
DINO DA SILVA SAURO;16;32;4
FULANO DE TAL;84;25;6

novas.csv (new):

JESUS DE NAZARE;15;21;8
MARIA MAGDALENA;12;52;3
JOAO DE DEUS;33;52;5
MATUZALEM DA COSTA;15;34;7
MICHAEL JACKSON;10;28;2
DINO DA SILVA SAURO;14;32;9
FULANO DE TAL;84;25;6

Output:

JESUS DE NAZARE;15;21;8
MARIA MAGDALENA;12;52;3
JOAO DE DEUS;33;52;5
MATUZALEM DA COSTA;15;34;7
MICHAEL JACKSON;10;28;2
DINO DA SILVA SAURO;14;32;9
FULANO DE TAL;84;25;6
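
The script above matches rows by line position. If rows can instead be identified by their first field (an assumption; the question does not say so), a keyed variant would also cope with rows that appear in a different order. A minimal sketch, where the function name update_by_key is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print every row of old_file, replaced by the row of
# new_file that has the same first ';'-separated field, when one exists.
# Assumptions: the first field is a unique key and new_file is not empty.
update_by_key() {
    local old_file=$1 new_file=$2
    awk -F';' '
        # First file (new rows): index each row by its key
        FNR == NR { novas[$1] = $0; next }
        # Second file (old rows): print the new row if the key is known
        { print (($1 in novas) ? novas[$1] : $0) }
    ' "$new_file" "$old_file"
}
```

Called as update_by_key antigas.csv novas.csv, it prints the updated rows in the order of the old file; rows whose key is missing from the new file are kept unchanged.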
    
01.07.2018 / 15:21