PHP regex to get 2 groups from an href link

1

Hello, how can I write a regex that captures two groups from each href link?

<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>

Where group 1 would be the entire link

/page/page/categoria/page?page=2&amp;publica=1

And the second group would be the page number (page=?)

2

My regex so far looks like this:

href=["][^"]+page=(\d+).+["]
// GROUP 1: href="/page/page/categoria/page?page=2&amp;publica=1" rel="next"
// GROUP 2: 2
    
asked by anonymous 17.09.2015 / 03:00

2 answers

1

Instead of a regex, you can use DOMDocument, a PHP API that works with XML and HTML. An example would look like this:

$conteudoDoHtml = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

$dom = new DOMDocument;
$dom->loadHTML($conteudoDoHtml);
$ancoras = $dom->getElementsByTagName("a");
foreach($ancoras as $elementos) {
   echo $elementos->getAttribute('href'), '<hr>';
}

That way you only need one regex, to extract page:

$conteudoDoHtml = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

$dom = new DOMDocument;
$dom->loadHTML($conteudoDoHtml);
$ancoras = $dom->getElementsByTagName("a");
foreach($ancoras as $elementos) {
   $data = $elementos->getAttribute('href');

   echo 'href content: ', $data, '<br>';

   preg_match('#(&amp;|&|\?)page=(\d+)#', $data, $match);

   echo 'page=', $match[2], '<br>';

   var_dump($match); // To better inspect the preg_match result
   echo '<hr>';
}
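
If only the page value is needed, the query string can also be read without any regex. This is a minimal sketch, assuming the built-in parse_url and parse_str functions are acceptable here; the $href, $query, $params and $pages names are illustrative and not part of the original answer:

```php
<?php
$conteudoDoHtml = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

$dom = new DOMDocument;
$dom->loadHTML($conteudoDoHtml);

$pages = [];
foreach ($dom->getElementsByTagName('a') as $ancora) {
    // getAttribute returns the decoded value, so &amp; is already &.
    $href = $ancora->getAttribute('href');

    // Keep only the query string; the cast turns null/false into ''.
    $query = (string) parse_url($href, PHP_URL_QUERY);

    // Split the query string into an associative array of parameters.
    parse_str($query, $params);

    if (isset($params['page'])) {
        $pages[$href] = $params['page'];
    }
}

print_r($pages);
```

Links without a page parameter are simply skipped, so the regex step can be dropped entirely if that behaviour suits your case.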
    
17.09.2015 / 03:36
1
href="([^"]+\?(page=([^&]*))[^"]+)"

See it working on Regex101.

Basically, it captures any href that contains page and splits it into the groups you asked for.

match[1] = the whole URL
match[2] = page=value
match[3] = value
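
A minimal sketch of how this pattern might be applied from PHP, assuming the HTML is already in a string. The $html, $pattern and $matches names and the choice of '#' as delimiter are illustrative, not part of the answer. Because the regex runs on the raw markup, the captured URL keeps &amp; undecoded:

```php
<?php
// Sample input taken from the question.
$html = '<a href="/page/page/categoria/page?page=2&amp;publica=1" rel="next">2</a>';

// Pattern from the answer; '#' as delimiter avoids escaping '/'.
$pattern = '#href="([^"]+\?(page=([^&]*))[^"]+)"#';

// PREG_SET_ORDER returns one array of captures per matched link.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    echo $match[1], "\n"; // whole URL
    echo $match[2], "\n"; // page=value
    echo $match[3], "\n"; // value
}
```

Note that the pattern only matches when page is the first parameter after the ? and at least one more character follows its value.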
    
18.09.2015 / 13:36