Python: The ^ is not turning my regex character class into a negative one

5

I'm learning regexes from Automate the Boring Stuff with Python. One of the sections in Chapter 7 teaches character classes. Up to that point, it was easy: I created character classes for vowels ( re.compile(r'[aeiouAEIOU]') ), for ranges of letters and digits ( re.compile(r'[a-zA-Z0-9]') ) ... All good.

Then I started learning about negative character classes, that is, defining a character class and matching every character that is NOT in it, and I ran into difficulties.

A negative character class is declared as re.compile(r'[ˆaeiouAEIOU]') , with the little hat at the front. But this does not make the character class negative: it still matches vowels, and it also matches the little hat itself if the sentence contains one.

See:

# Trying (and managing) to detect only VOWELS...
>>> consonantRegex = re.compile(r'[aeiouAEIOU]')
>>> consonantRegex.findall('Robocop eats baby food. BABY FOOD')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']

# Trying to detect only CONSONANTS... (note the little hat)
>>> consonantRegex = re.compile(r'[ˆaeiouAEIOU]')
>>> consonantRegex.findall('Robocop eats baby food. BABY FOOD')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']

# Putting a little hat in the sentence -> little hat detected
>>> consonantRegex.findall('Robocop eats baby food. ˆBABY FOOD.')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'ˆ', 'A', 'O', 'O']

Data:

  • I'm using Thonny's interactive shell, but I also tested it in IDLE and got the same result.
  • I'm on a Mac. The Mac has that annoying feature that gets in the way of programming: the accents issue. When I type a quotation mark, for example, it waits for a vowel to see whether it can accent it, so typing a quotation mark followed by a gives ä . To get a plain quotation mark before a vowel, I have to type the quotation mark, then a consonant such as s (which makes it appear normally), and then delete the s and type the vowel I want. (If anyone knows how to disable this while still being able to type accents, I would be VERY grateful.)
asked by anonymous 23.09.2017 / 22:53

    1 answer

    6

    Your second data point answers your own question.

    With that keyboard layout, macOS works with modifiers, which use different characters for accents than the ones a regex expects. In your pattern the character is:
    (macOS) ˆ (U+02C6) MODIFIER LETTER CIRCUMFLEX ACCENT , whereas the character that negates a character class is:
    (Other) ^ (U+005E) CIRCUMFLEX ACCENT .
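
One way to check which of the two characters actually ended up in a pattern is to ask Python for its code point and Unicode name. This is only a minimal sketch using the standard library; the helper name describe is made up for illustration.

```python
import unicodedata


def describe(ch):
    """Return the code point and Unicode name of a single character.

    Hypothetical helper, not part of the re module.
    """
    return 'U+{:04X} {}'.format(ord(ch), unicodedata.name(ch))


# The character the macOS layout produced vs. the one the regex engine expects.
print(describe('\u02c6'))  # U+02C6 MODIFIER LETTER CIRCUMFLEX ACCENT
print(describe('^'))       # U+005E CIRCUMFLEX ACCENT
```

Pasting the first character inside the brackets of your own pattern into describe() should show which of the two you really typed.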

    To fix this problem on your keyboard, follow these [see edit] instructions.
    If you want to test the regex before making those changes, copy the following code, which contains the ASCII caret:

    consonantRegex = re.compile(r'[^aeiouAEIOU]')
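
As a rough comparison of the two patterns on the sentence from the question (the name lookalikeRegex is just for illustration, and the U+02C6 character is written as an escape so it cannot be confused with the caret):

```python
import re

text = 'Robocop eats baby food. BABY FOOD'

# ASCII caret (U+005E) right after '[' negates the class.
consonantRegex = re.compile(r'[^aeiouAEIOU]')
matches = consonantRegex.findall(text)
print(matches[:7])  # ['R', 'b', 'c', 'p', ' ', 't', 's']

# With U+02C6 instead, the class stays positive: it matches the vowels
# plus the modifier character itself, which is what the question shows.
lookalikeRegex = re.compile('[\u02c6aeiouAEIOU]')
print(lookalikeRegex.findall(text))
```

Note that the negative class matches everything that is not a vowel, so spaces and punctuation show up in the result along with the consonants.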
    

    Edit (09/28/2017):

    To help other Stack users searching for this question, here is a translation of the instructions above:

    (en): You should go to Keyboard Preferences and add a new keyboard.

    Instead of using the USA International keyboard [1], you should use the USA Keyboard [2].

    Then from

    Use as an alternative

  • When you are adding a new keyboard, select English (or English if you want to continue with the US keyboard on your Mac).
  • At the end of the list you will see the two keyboards. Use the one that is not International.

        
    25.09.2017 / 01:58