How to write this regular expression in Python 3.6

Question

I need to write a regular expression that extracts the links from this string:

links = ('href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=70>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO A</a></li><li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=71>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO B</a></li>')

The full string is much longer. I have included only part of it because the rest follows the same pattern. Here's what I've tried:

import re

campus1 = re.findall("href", links)
campus2 = re.findall("http", links)
campus3 = re.findall("href=http", links)
campus4 = re.findall("hre", links)
campus5 = re.findall("a", links)
campus6 = re.findall("<a> <\a>", links)

When I print the results, I get either isolated letters or the links together with the course names (later I will also need an expression that extracts only those names). Does anyone have any ideas? For example, running campus1 = re.findall("href", links) gives:

'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href',

That is, it returns every href in the string. I would like to extract only the links, that is, every URL exactly as it appears in this string.

python python-3.x python-2.7

asked by anonymous 26.02.2017 / 15:32

1 answer

score 0 · Accepted Answer

You can capture the URLs with a single group:

import re
# Sample input: two list items, each wrapping an anchor with an unquoted href.
s = "<li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=70>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO A</a></li><li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=71>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO B</a></li>"
# findall returns only the capture group, so the result is a list of URLs.
print(re.findall(r'href=[\'"]?([^\'" >]+)', s))
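The question also mentions needing the course names later. A minimal sketch of one way to get both at once is below. It assumes that href is the last attribute of each anchor and that the link text contains no nested tags, which holds for the sample string but may not hold for the full page. The names LINK_AND_NAME and extract_links_and_names are hypothetical, chosen only for this illustration.

```python
import re

# Hypothetical pattern: the answer's href pattern, extended to also capture
# the text between the opening tag's ">" and the closing "</a>".
LINK_AND_NAME = re.compile(r'href=[\'"]?([^\'" >]+)[\'"]?>([^<]+)</a>')


def extract_links_and_names(html):
    """Return a list of (url, name) tuples, one per anchor matched."""
    return [(url, name.strip()) for url, name in LINK_AND_NAME.findall(html)]


s = (
    "<li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/"
    "lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=70>"
    "ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO A</a></li>"
    "<li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/"
    "lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=71>"
    "ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO B</a></li>"
)

for url, name in extract_links_and_names(s):
    print(name, "->", url)
```

With two capture groups, re.findall returns tuples instead of plain strings, so each URL stays paired with its name. For markup less regular than this sample, an HTML parser is likely to be more robust than a regular expression.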

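Explanation of the regex

href= matches the attribute name literally; [\'"]? allows an optional opening quote; ([^\'" >]+) captures every character up to the next quote, space, or >. Because findall returns the contents of the capture group, only the URLs come back.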
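As an illustration, the pattern from the answer can be written with re.VERBOSE so that each piece carries its own comment. This is only a sketch: the name HREF_PATTERN and the example.com URLs are made up for the demonstration, and the sample mixes a quoted and an unquoted href to show that both forms are handled.

```python
import re

# Same pattern as in the answer, spread over several lines.
# In verbose mode, whitespace outside character classes is ignored
# and "#" starts a comment.
HREF_PATTERN = re.compile(
    r"""
    href=         # the attribute name, matched literally
    ['"]?         # an optional opening quote, single or double
    ([^'" >]+)    # group 1: the URL, ending at a quote, a space, or ">"
    """,
    re.VERBOSE,
)

# Hypothetical sample with one quoted and one unquoted attribute value.
sample = '<a href="https://example.com/a">A</a> <a href=https://example.com/b>B</a>'

print(HREF_PATTERN.findall(sample))
# → ['https://example.com/a', 'https://example.com/b']
```

The space inside the character class is kept even in verbose mode, so the class still excludes spaces.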