Removing unnecessary question marks from a string

-3

I have strings that may end with repeated question marks or with none at all. I would like to remove the unnecessary question marks and add one if it is missing. This should apply only to the end of the string. Below are some examples and the expected results:

Examples of incorrect strings:

  • How do you go to the bathroom ????
  • Can I invest today? How much is the minimum value
  • Can't I do this? Why ????

Expected / correct result:

  • How do you go to the bathroom?
  • Can I invest today? How much is the minimum value?
  • Can't I do this? Why?

I have started on the code: it already checks whether there is a question mark at the end of the string and adds one if there is none. I check the last 3 characters to allow for cases like: Am I alive?!

if "?" not in title[-3:]:
    title += "?"
    
asked by anonymous 27.08.2018 / 23:50

2 answers

3

You can use

28.08.2018 / 00:11
0

To also add a question mark to the end of a string that does not have one, you can use the regular expression:

(\?+|(?<!\?)$)

It matches any sequence of one or more question marks, or the end of the line when it is not already preceded by a question mark, so every match can be replaced with a single question mark. The lookbehind (?<!\?) is needed because, since Python 3.7, re.sub also replaces an empty match that immediately follows a non-empty one; with a plain $, a string ending in ???? would become ?? instead of ?.

import re

tests = [
    ('Como faz para ir ao banheiro????', 'Como faz para ir ao banheiro?'),
    ('Posso investir hoje? Quanto é o valor mínimo', 'Posso investir hoje? Quanto é o valor mínimo?'),
    ('Não posso fazer isso? Por que????', 'Não posso fazer isso? Por que?')
]

for text, expected in tests:
    # Collapse each run of question marks; add one at the end if it is missing.
    result = re.sub(r'(\?+|(?<!\?)$)', '?', text)
    assert result == expected

If your string ends with a punctuation mark other than a question mark, the result will look somewhat strange:

print(re.sub(r'(\?+|(?<!\?)$)', '?', 'teste!'))  # teste!?

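If you would rather replace that trailing punctuation, add a character class built from string.punctuation (from the string module) next to the $ in the regular expression:

([{punctuation}]+$|\?+|(?<![{punctuation}])$)

Here {punctuation} stands for re.escape(string.punctuation); escaping keeps characters such as ], \ and ^ literal inside the character class. The first alternative replaces a run of trailing punctuation, the second collapses question marks elsewhere in the string, and the third adds the question mark when the string does not end in punctuation. With this, the input teste!!!,..;! becomes teste?.

import re
import string

# Escape the punctuation so that ], \ and ^ stay literal inside the character class.
punctuation = re.escape(string.punctuation)
# Raw f-string: \? reaches the regex engine unchanged.
pattern = rf'([{punctuation}]+$|\?+|(?<![{punctuation}])$)'

tests = [
    ('Como faz para ir ao banheiro????', 'Como faz para ir ao banheiro?'),
    ('Posso investir hoje? Quanto é o valor mínimo!..,,', 'Posso investir hoje? Quanto é o valor mínimo?'),
    ('Não posso fazer isso? Por que????', 'Não posso fazer isso? Por que?'),
    ('teste!!!,..;!', 'teste?')
]

for text, expected in tests:
    result = re.sub(pattern, '?', text)
    assert result == expected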
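
The sample strings in the question have a space before the trailing question marks (for example, "How do you go to the bathroom ????"), a case the tests here do not cover. The following is a minimal sketch, assuming the goal is to touch only the end of the string. The helper name fix_question_mark is hypothetical, and treating every character in string.punctuation as removable is an assumption that may be too aggressive for titles ending in a closing parenthesis or quote.

```python
import re
import string

# Hypothetical helper, not part of the original answer.
_PUNCT = re.escape(string.punctuation)
# Trailing run of whitespace and punctuation, anchored at the very end (\Z).
_TRAILING = re.compile(rf'[\s{_PUNCT}]*\Z')


def fix_question_mark(title):
    """Replace trailing whitespace/punctuation with a single question mark."""
    # count=1 stops after the first match, so the empty match at the very
    # end of the string is not replaced a second time (Python 3.7+ behavior).
    return _TRAILING.sub('?', title, count=1)


print(fix_question_mark('How do you go to the bathroom ????'))
# How do you go to the bathroom?
print(fix_question_mark('Can I invest today? How much is the minimum value'))
# Can I invest today? How much is the minimum value?
```

Because the pattern is anchored at the end, question marks in the middle of the string are left untouched.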
    
28.08.2018 / 00:38