Replace words in a text that necessarily contain both letters and numbers

0

I am trying to replace the words in a text that necessarily contain both letters and numbers.

I've tried this:

def passwords():
    df['C'] = df['C'].str.replace(r'[a-zA-Z0-9]', '<password>')
    return df

My data:

       A       B                                                  C
   Joana      MG                              minha senha é aaabb123
  Marcos      AM        eu tentei colocar a minha senha varias vezes
   Paulo      RS       eu tenho duas senhas: 321cccppp e r1t2r3t4r5t

My result is horrible:

       A       B                                                            C
   Joana      MG                     <password><password><password><password>
  Marcos      AM <password><password><password><password><password><password>
   Paulo      RS <password><password><password><password><password><password>

Expected output:

       A       B                                                  C
   Joana      MG                            minha senha é <password>
  Marcos      AM        eu tentei colocar a minha senha varias vezes
   Paulo      RS       eu tenho duas senhas: <password> e <password>

asked by anonymous 06.12.2018 / 11:54

1 answer

3

This regex here works fine:

([0-9]()|[A-Za-z]())+
Basically, this regex creates two capture groups, one for digits ([0-9]) and one for letters, upper or lower case ([A-Za-z]), and then checks that both groups took part in the match (group 2 for [0-9], group 3 for [A-Za-z]).

  

See it working on Repl.it.
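
The requirement described above (a word must contain at least one letter and at least one digit) can also be sketched with lookaheads. This is a different technique from the empty capture groups in the answer, not a reconstruction of it. The names PASSWORD_LIKE and mask_passwords are hypothetical, and the sketch assumes passwords are made only of ASCII letters and digits.

```python
import re

# Assumed pattern (not taken from the answer): a whole token of ASCII
# letters/digits that contains at least one letter AND at least one digit.
PASSWORD_LIKE = re.compile(
    r"\b"
    r"(?=[A-Za-z0-9]*[A-Za-z])"  # lookahead: a letter somewhere in the token
    r"(?=[A-Za-z0-9]*[0-9])"     # lookahead: a digit somewhere in the token
    r"[A-Za-z0-9]+"              # consume the token itself
    r"\b"
)


def mask_passwords(text, placeholder="<password>"):
    """Replace every token that mixes letters and digits with the placeholder."""
    return PASSWORD_LIKE.sub(placeholder, text)


# Sample sentences taken from column C in the question.
rows = [
    "minha senha é aaabb123",
    "eu tentei colocar a minha senha varias vezes",
    "eu tenho duas senhas: 321cccppp e r1t2r3t4r5t",
]

for row in rows:
    print(mask_passwords(row))
```

Assuming every cell of the column is a string, the same function should apply to the DataFrame from the question with df['C'] = df['C'].apply(mask_passwords). The original attempt replaced each character separately because the character class [a-zA-Z0-9] has no quantifier and so matches one character at a time.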

    
06.12.2018 / 14:02