I'm learning how to work with DOMXPath in PHP. I was using regex before, but I was discouraged from using it for HTML capture here on Stack Overflow. I confess that DOMXPath is not so simple for me, and DOM has its limits (when there are spaces in tag names, and also in error handling). If someone can help me with the PHP command to preview the captured elements and check that everything is right, I would appreciate it. Suggestions for improving the code are also welcome.
<?php
$doc = new DOMDocument;
libxml_use_internal_errors(true);
// Remove whitespace (if any)
$doc->preserveWhiteSpace = false;
@$doc->loadHTML(file_get_contents ('http://www.imdb.com/search/title?certificates=us:pg_13&genres=comedy&groups=top_250'));
$xpath = new DOMXPath($doc);
// Starting from the root element
$grupos = $xpath->query(".//*[@class='lister-item mode-advanced']");
// Create an array, then loop over the elements to capture (image, title, and link)
$resultados = array();
foreach ($grupos as $grupo) {
    $i = $xpath->query(".//*[@class='loadlate']//@src", $grupo);
    $t = $xpath->query(".//*[@class='lister-item-header']//a/text()", $grupo);
    $l = $xpath->query(".//*[@class='lister-item-header']//a/@href", $grupo);
    // Keep the first match of each query (null when nothing matched)
    $resultado = array(
        'imagem' => $i->length ? $i->item(0)->nodeValue : null,
        'titulo' => $t->length ? $t->item(0)->nodeValue : null,
        'link'   => $l->length ? $l->item(0)->nodeValue : null,
    );
    $resultados[] = $resultado;
}
// Which command should I use to preview the results and check that everything is OK?
print_r($resultados);
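
For the preview itself, here is a minimal sketch of one possible approach, not a definitive one. It assumes a small inline HTML sample (made up here so the script runs without a network request) in place of the IMDb page. The helper name `firstValue` and the array keys `imagem`, `titulo`, and `link` are hypothetical names chosen for illustration. It previews three things: the number of matched groups with the extracted values, the raw markup of the first matched node, and any parser warnings collected by libxml.

```php
<?php
// Hypothetical inline sample standing in for the downloaded page, so the sketch runs offline.
$html = <<<'HTML'
<div class="lister-item mode-advanced">
  <img class="loadlate" src="poster1.jpg">
  <h3 class="lister-item-header"><a href="/title/tt0000001/">First Title</a></h3>
</div>
<div class="lister-item mode-advanced">
  <img class="loadlate" src="poster2.jpg">
  <h3 class="lister-item-header"><a href="/title/tt0000002/">Second Title</a></h3>
</div>
HTML;

libxml_use_internal_errors(true);    // collect parser warnings instead of printing them
$doc = new DOMDocument;
$doc->loadHTML($html);
$parseErrors = libxml_get_errors();  // array of LibXMLError objects
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$grupos = $xpath->query("//*[@class='lister-item mode-advanced']");

// Hypothetical helper: value of the first node matched by $expr under $context, or null.
function firstValue(DOMXPath $xpath, string $expr, DOMNode $context): ?string
{
    $nodes = $xpath->query($expr, $context);
    return ($nodes !== false && $nodes->length > 0)
        ? trim($nodes->item(0)->nodeValue)
        : null;
}

$resultados = [];
foreach ($grupos as $grupo) {
    $resultados[] = [
        'imagem' => firstValue($xpath, ".//*[@class='loadlate']/@src", $grupo),
        'titulo' => firstValue($xpath, ".//*[@class='lister-item-header']//a/text()", $grupo),
        'link'   => firstValue($xpath, ".//*[@class='lister-item-header']//a/@href", $grupo),
    ];
}

// Preview 1: how many groups matched, and the extracted values.
echo "Matched groups: ", $grupos->length, PHP_EOL;
print_r($resultados);

// Preview 2: raw markup of the first matched node, to confirm the right element was selected.
if ($grupos->length > 0) {
    echo $doc->saveHTML($grupos->item(0)), PHP_EOL;
}

// Preview 3: any warnings libxml recorded while parsing.
foreach ($parseErrors as $error) {
    echo "libxml: ", trim($error->message), " (line {$error->line})", PHP_EOL;
}
```

Note that `@class='lister-item mode-advanced'` only matches when the attribute is exactly that string. If the class list may vary, a token-wise test such as `contains(concat(' ', normalize-space(@class), ' '), ' lister-item ')` is a common alternative.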