Regex for custom time formats in Python 3

I need to create a regex that accepts the following entries:

8:00  
8 horas    
8h   
8h30 (8h 30)  
8h30min (8h 30 min)  
8h30minutos (8h 30 minutos)

And I came up with the following:

((\d{1,2}:\d{1,2}) | (\d{1,2}\s+\bhoras\b) | (\d{1,2}\bh\b(\s+\d{1,2}(\bminutos\b|\bmin\b)?)?))?

The first two parts work separately, but not when they are joined with |. The third part does not work at all.

    
asked by anonymous 18.08.2017 / 15:24

1 answer

Why your regex does not work

You made several mistakes in it, which is normal for someone who is just starting out. Rather than only showing where it went wrong, I will explain a better approach that I hope you will keep using, so that your regexes stop failing.

  • Spaces around the OR operator ( | ): I know they make the regex more readable and easier to understand, but they are why the alternatives did not work together. A space in a pattern is a literal character, so each alternative after a | only matched if the sequence started with a space.
  • Unnecessary use of the word boundary ( \b ): \b matches only at a position between a word character and a non-word character. In \d{1,2}\bh the h comes right after a digit, and both are word characters, so \b can never match there. Likewise, h\b fails whenever the h is followed by a digit or another letter, as in 8h30.
  • "?" after the capture group ( (...)? ): this makes the whole group optional, matching it zero or one time, so the regex can also succeed without matching anything at all.
  • A capture group for each case: the regex you presented has one capture group per case. It is plausible to assume that a number after "h", "hora" or ":" is the minutes, so there is no need to also match the word for minutes after the sequences with h or hora. This saves processing time and keeps the regex shorter.
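The first three points can be reproduced in Python 3 with a minimal sketch. The shortened patterns and the variable names below are illustrative only, they are not the original regex, and the sketch assumes the re.VERBOSE flag is not in use:

```python
import re

# Pitfall 1: a space inside a pattern is a literal character.
spaced = re.compile(r"(\d{1,2}:\d{1,2}) | (\d{1,2}\s+horas)")
tight = re.compile(r"(\d{1,2}:\d{1,2})|(\d{1,2}\s+horas)")
spaced_result = spaced.search("8 horas")  # None: a leading space is required
tight_result = tight.search("8 horas")    # matches "8 horas"

# Pitfall 2: \b cannot match between a digit and a letter,
# because both are word characters.
boundary_result = re.search(r"\d{1,2}\bh", "8h")  # None
plain_result = re.search(r"\d{1,2}h", "8h")       # matches "8h"

# Pitfall 3: an optional outer group lets the pattern match an empty string.
empty_result = re.fullmatch(r"(\d{1,2}h)?", "")   # matches ""

for name, result in [
    ("spaced", spaced_result),
    ("tight", tight_result),
    ("with \\b", boundary_result),
    ("without \\b", plain_result),
    ("optional group on empty input", empty_result),
]:
    print(name, "->", result.group(0) if result else None)
```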


Regex that works for your cases

(\d{1,2}:\d{1,2})|(\d{1,2}\s+horas{0,1})|(\d{1,2}h ?\d{0,2})

You can test how this regex works
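A quick way to check it locally in Python 3, as a minimal sketch. The names TIME_RE, find_time and samples are assumptions made for this example, and the pattern is the one from the answer, unchanged:

```python
import re

# The regex from the answer, copied as-is.
TIME_RE = re.compile(r"(\d{1,2}:\d{1,2})|(\d{1,2}\s+horas{0,1})|(\d{1,2}h ?\d{0,2})")

# The inputs listed in the question.
samples = [
    "8:00", "8 horas", "8h",
    "8h30", "8h 30",
    "8h30min", "8h 30 min",
    "8h30minutos", "8h 30 minutos",
]

def find_time(text):
    """Return the first substring matched by TIME_RE, or None."""
    match = TIME_RE.search(text)
    return match.group(0) if match else None

for sample in samples:
    print(repr(sample), "->", repr(find_time(sample)))
```

With search, the words "min" and "minutos" are not part of the match; only the hour and the minute digits are captured, which is consistent with the last point above. If the whole input must be validated, re.fullmatch would reject the entries that end in "min" or "minutos", so the pattern would need to be extended for that use.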

    
18.08.2017 / 16:46