Replace numbers that are within a word

1

I want to remove digits that appear in the middle of a word, without removing every number in the text. For example:

Vini12cius

I want to turn this into Vinicius without changing a CPF that may come after it. I wrote the following regex:

r = re.sub(r'([0-9])','',"")

However, it deletes every digit in the text, including those that are not between letters. I also tried:

r = re.sub(r'([a-z]*[0-9]*[a-z])','',"") 

But I did not succeed either.

    
asked by anonymous 31.08.2018 / 20:49

2 answers

4

I'd rather do this with a replacement function:

>>> import re
>>> def f(m):
...     text = m.group()
...     if text.isdigit():  # the whole token is digits: keep it
...         return text
...     else:  # mixed token: drop only the digits
...         return re.sub(r'\d', '', text)
... 
>>> re.sub(r'\w+', f, '4lfr3do rodr1g0 marc05 12345')
'lfrdo rodrg marc 12345'
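
To check this against the case in the question, here is the same idea as a small script. It is only a sketch: the name strip_inner_digits is made up for illustration, and it assumes the CPF shows up as separate digit-only tokens (punctuation such as "." and "-" splits a formatted CPF into all-digit pieces, which are then left alone).

```python
import re


def strip_inner_digits(text):
    """Remove digits from tokens that mix letters and digits; keep all-digit tokens."""
    def replace(match):
        token = match.group()
        # An all-digit token (for example a chunk of a CPF) is returned unchanged.
        if token.isdigit():
            return token
        # Otherwise the token mixes letters and digits: drop only the digits.
        return re.sub(r'\d', '', token)

    # \w+ visits every run of letters, digits and underscores in the text.
    return re.sub(r'\w+', replace, text)


print(strip_inner_digits("Vini12cius 000.000.000-00"))  # Vinicius 000.000.000-00
print(strip_inner_digits("Vini12cius 00000000000"))     # Vinicius 00000000000
```

Note that a token made of letters and digits together, such as "abc123", always loses its digits, so this only works if the numbers you want to keep are never glued to letters.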
    
31.08.2018 / 21:29
1

You can use this Regular Expression: \d+(?=[a-zA-Z]+)|(?<=[a-zA-Z])\d+

A demo can be seen on Regex101.
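
To make the two alternatives of the pattern easier to read, here is the same expression written with re.VERBOSE and one comment per part. The name PADRAO is only for this illustration. The lookahead and lookbehind check for a neighbouring letter without consuming it, so only the digits are replaced.

```python
import re

# Same pattern as above, split into its two alternatives.
PADRAO = re.compile(r"""
    \d+ (?=[a-zA-Z]+)     # digits immediately followed by a letter (lookahead)
  |                       # or
    (?<=[a-zA-Z]) \d+     # digits immediately preceded by a letter (lookbehind)
""", re.VERBOSE)

print(PADRAO.sub("", "Vini12cius 000.000.000-00"))  # Vinicius 000.000.000-00
print(PADRAO.sub("", "12Vinicius 00000000000"))     # Vinicius 00000000000
```

The digits of the CPF are neither followed nor preceded by a letter, so neither alternative matches them.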

Code

import re


# Test inputs: digits before, inside, or after the name,
# next to a CPF with and without punctuation.
testes = (
    "Vini12cius 000.000.000-00",
    "Vini12cius 00000000000",
    "Vinicius12 00000000000",
    "12Vinicius 00000000000",
    "000.000.000-00 Vini12cius",
    "00000000000 Vini12cius",
    "00000000000 Vinicius12",
    "00000000000 12Vinicius",
)

# Digits followed by a letter, or digits preceded by a letter.
padrao_regex = re.compile(r"\d+(?=[a-zA-Z]+)|(?<=[a-zA-Z])\d+")
substituicoes = [padrao_regex.sub("", elemento_teste) for elemento_teste in testes]
for substituicao in substituicoes:
    print(substituicao)

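
One caveat: [a-zA-Z] only covers unaccented letters, so digits attached to an accented letter (for example "José12") are not removed. If that matters for your names, a possible variant, offered as a sketch and not tested beyond the cases shown, is to use [^\W\d_], which in Python 3 str patterns matches any word character that is neither a digit nor an underscore. The names LETRA and padrao_unicode are made up for this example.

```python
import re

# Assumption: names may contain accented letters, which [a-zA-Z] does not cover.
# [^\W\d_] = a word character that is not a digit and not an underscore,
# which for str patterns in Python 3 includes accented letters.
LETRA = r"[^\W\d_]"
padrao_unicode = re.compile(rf"\d+(?={LETRA})|(?<={LETRA})\d+")

print(padrao_unicode.sub("", "José12 000.000.000-00"))  # José 000.000.000-00
print(padrao_unicode.sub("", "12Ângela 00000000000"))   # Ângela 00000000000
```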

    
31.08.2018 / 21:40