Find substring with REGEX

1

I'm trying to convert every N_(...) part to uppercase. I thought a regex would be the most appropriate tool, but I'm having trouble: I can't even capture the N_(...) part so that I can convert it to uppercase afterwards.

My file:

abafada,abafado.A+H_PRE+pol=no+N_abafado:fs abafadas,abafado.A+H_PRE+pol=no+N_abafado:fp abafado,.A+H_PRE+pol=no+N_abafado:ms abafados,abafado.A+H_PRE+pol=no+N_abafado:mp abafante,.A+H_PRE+pol=no:+N_abafante:ms

Script:

import re

with open("word_upper.txt", "r") as f:
    text = f.read()

    pattern = re.findall(r'N_(\w+)', text)
    upper_word = pattern.group(1)

    print(upper_word)

Output:

  

Traceback (most recent call last):
  File "test_lemme.py", line 14, in <module>
    upper_word = pattern.group(1)
AttributeError: 'list' object has no attribute 'group'

Desired output:

abafado   abafado   abafado   abafado   abafante

Then I thought of simply converting that list to uppercase with the upper() method and substituting the results back into the text with the replace() method, so that I would end up with:

abafada,abafado.A+H_PRE+pol=no+N_ABAFADO:fs

What do you guys think?

    
asked by anonymous 30.05.2018 / 16:24

1 answer

2

You can use the re.sub function to perform a replacement based on a regular expression. If you pass a callable as the replacement argument, it is called with each match object, and the matched text is replaced by the callable's return value.

Something like:

import re

with open('words_upper.txt') as stream:
    text = stream.read()
    # The callable receives each match object; its return value replaces the match.
    edited = re.sub(r'(N_\w+)', lambda match: match.group(0).upper(), text)

So edited would have the value:

abafada,abafado.A+H_PRE+pol=no+N_ABAFADO:fs abafadas,abafado.A+H_PRE+pol=no+N_ABAFADO:fp abafado,.A+H_PRE+pol=no+N_ABAFADO:ms abafados,abafado.A+H_PRE+pol=no+N_ABAFADO:mp abafante,.A+H_PRE+pol=no:+N_ABAFANTE:ms
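As a side note on the error in the question: re.findall returns a plain list of strings (the captured groups), not match objects, which is why calling .group() on its result fails. The sketch below is only an illustration that runs on an in-memory string instead of a file. The sample text is a shortened copy of the data above, and the variable names words and edited are chosen just for this example.

```python
import re

# Shortened sample of the file contents, so no file on disk is needed.
text = ("abafada,abafado.A+H_PRE+pol=no+N_abafado:fs "
        "abafante,.A+H_PRE+pol=no:+N_abafante:ms")

# With one capturing group, re.findall returns a list of the captured
# strings, so there is no .group() method to call on the result.
words = re.findall(r'N_(\w+)', text)

# A callable replacement receives each match object; whatever it returns
# is substituted for the matched text.
edited = re.sub(r'N_\w+', lambda match: match.group(0).upper(), text)

print(words)
print(edited)
```

Here words holds the lowercase words that follow N_, which corresponds to the first desired output in the question, while edited is the full text with every N_ tag in uppercase.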
    
30.05.2018 / 16:54