Counting the number of replacements made with replace in Python

0

I need help with this code: I need to count how many changes were made in each sentence. Any ideas? The code removes repeated substrings at the end of each word in a sentence; now I need to count how many changes the replace call made for each sentence in the list.

def corrigePalavra(str):
    palavra = [str[-1:], str[-2:], str[-3:], str[-4:]]
    result = str
    palavra_modificada = False
    for w in palavra:
        if result.count(w) > 1:
            result = result.replace(w * result.count(w), w, 1)
            palavra_modificada = True

    return palavra_modificada, result

lista1 = ['programaramar ee legalal','python ee showow','linguagemem de programacaocao']
aux2 = []
cont_palavras_modificadas = -1
for i in lista1:
    aux1 = i.split()
    for j in aux1:
        palavra_modificada, x = corrigePalavra(j)
        aux2.append(x)
        if palavra_modificada:
            cont_palavras_modificadas += 1
    b = " ".join(aux2)
    print(cont_palavras_modificadas, b)

Output of my code:

   2 programar e legal
   4 programar e legal python e show
   6 programar e legal python e show linguagem de programacao

Expected output:

  3  programar e legal
  2  python e show
  2  linguagem de programacao

That is, 3 changes in the first sentence, 2 in the second and 2 in the third.

    
asked by anonymous 13.04.2018 / 21:27

2 answers

1

You can also use regular expressions, as I mentioned in my answer to your other question, Search for Python 3.xx sub-strings.

To count, do the same as you already do, but instead of incrementing, just set the value to 1, so that each word is counted at most once:

import re
def corrigePalavra(str):
  count = 0
  # (\w+)\1+ matches a group immediately followed by one or more copies of itself
  for m in re.finditer(r"(\w+)\1+", str):
    str = str.replace(m.group(1) * str.count(m.group(1)), m.group(1), 1)
    count = 1
  return count, str

linha = 'eu estavava indodo para aaaaaa aulaula'
total = 0
resultado = []
for palavra in linha.split():
  count, retorno = corrigePalavra(palavra)
  total += count
  resultado.append(retorno)

print(linha)
print(' '.join(resultado))
print('{} palavra(s) corrigida(s)'.format(total))

See it working on repl.it.
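If the goal is simply to know how many substitutions were made, re.subn may be worth a look: it returns the new string together with the number of substitutions performed. The following is only a sketch under my own assumptions: the pattern is anchored with `$` so that only a repetition at the end of the word is collapsed, and the names `REPETICAO_FINAL`, `corrige_com_subn` and `conta_trocas` are hypothetical, not part of the code above.

```python
import re

# Assumed pattern: a group (as short as possible) followed by one or more
# copies of itself, right at the end of the word.
REPETICAO_FINAL = re.compile(r"(\w+?)\1+$")

def corrige_com_subn(palavra):
    # subn returns a tuple: (new string, number of substitutions made).
    corrigida, n_trocas = REPETICAO_FINAL.subn(r"\1", palavra)
    return n_trocas, corrigida

def conta_trocas(frase):
    # Correct each word and add up the substitutions for this sentence only.
    resultados = [corrige_com_subn(p) for p in frase.split()]
    total = sum(n for n, _ in resultados)
    return total, " ".join(texto for _, texto in resultados)

frases = ['programaramar ee legalal', 'python ee showow',
          'linguagemem de programacaocao']
for frase in frases:
    print(*conta_trocas(frase))
```

Because the counter is computed per sentence, nothing accumulates from one sentence to the next. Note that this pattern would also shorten words that legitimately end in a doubled letter, so it is only suitable if that is acceptable for your data.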

    
15.04.2018 / 05:52
1

You increment the counter but do nothing with it. One way is to return it too:

def corrigePalavra(str):
    palavra = [str[-1:], str[-2:], str[-3:], str[-4:]]
    result = str
    cont = 0
    for w in palavra:
        if result.count(w) > 1:
            result = result.replace(w * result.count(w), w, 1)
            cont += 1

    return cont, result

lista1 = 'estou indodo para a aulaula'
aux1 = lista1.split()
aux2 = []
cont_total = 0
for i in aux1:
    cont, x = corrigePalavra(i)
    cont_total += cont
    aux2.append(x)
print(aux1)
b = " ".join(aux2)
print(cont_total, b)  # 6 estou indo para a aula

However, this counts every suffix that triggered a replace call, not the number of words affected. We can modify the program a bit to solve this:

def corrigePalavra(str):
    palavra = [str[-1:], str[-2:], str[-3:], str[-4:]]
    result = str
    palavra_modificada = False
    for w in palavra:
        if result.count(w) > 1:
            result = result.replace(w * result.count(w), w, 1)
            palavra_modificada = True  # If we make a substitution, mark palavra_modificada as True

    return palavra_modificada, result

lista1 = 'estou indodo para a aulaula'
aux1 = lista1.split()
aux2 = []
cont_palavras_modificadas = 0

for i in aux1:

    palavra_modificada, x = corrigePalavra(i)
    if palavra_modificada:
        cont_palavras_modificadas += 1

    aux2.append(x)
print(aux1)
b = " ".join(aux2)
print(cont_palavras_modificadas, b)  # 3 estou indo para a aula

Now the count is 3. This happens because the algorithm is a bit flawed: it thinks it made a substitution in "para" because it looks at the final "a" and finds more than one "a" in the word, even though replace changed nothing. One way to fix this is to consider only repeats of more than one letter:

def corrigePalavra(str):
    palavra = [str[-2:], str[-3:], str[-4:]]
...

Now the result comes out as expected: the count is 2.
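Dropping the one-letter suffix also means that a word like "ee" is no longer shortened to "e". If that case matters, one alternative is to keep all four suffixes but report a change only when the text actually differs, and to restart the count for each sentence. This is a sketch of my own variation, not a definitive version; the names `corrige_palavra` and `corrige_frases` are hypothetical.

```python
def corrige_palavra(palavra):
    # Same candidate suffixes as the original code (1 to 4 characters).
    sufixos = [palavra[-1:], palavra[-2:], palavra[-3:], palavra[-4:]]
    resultado = palavra
    for sufixo in sufixos:
        ocorrencias = resultado.count(sufixo)
        if ocorrencias > 1:
            resultado = resultado.replace(sufixo * ocorrencias, sufixo, 1)
    # Report a change only if the text is actually different.
    return resultado != palavra, resultado

def corrige_frases(frases):
    saida = []
    for frase in frases:
        # The list and the counter are rebuilt for every sentence.
        corrigidas = [corrige_palavra(p) for p in frase.split()]
        total = sum(1 for modificada, _ in corrigidas if modificada)
        saida.append((total, " ".join(texto for _, texto in corrigidas)))
    return saida

frases = ['programaramar ee legalal', 'python ee showow',
          'linguagemem de programacaocao']
for total, frase in corrige_frases(frases):
    print(total, frase)
```

With the three sentences from the question this prints 3, 2 and 2 next to the corrected sentences, and "para" is no longer counted because replace leaves it unchanged.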

As a broader tip, try giving your variables more descriptive names to make your code clearer, especially when sharing it with others. aux1 says nothing about what the list represents, lista1 is not even a list, and palavra is not a string but a list of string fragments that are not words. Better names for these variables would be, for example, palavras_isoladas, frase_original, and lista_substrings.

    
13.04.2018 / 22:04