Capture div by class

2

I'm trying to capture a div by its class, but without success. I want the div with class='m-definicao-conteudo' from the page I fetch with cURL, but I get this warning:

  

Warning: DOMDocument::loadHTML(): Unexpected end tag: a in Entity, line: 102 in /Applications/XAMPP/xamppfiles/htdocs/teste.php on line 13

$ch = curl_init ("");
curl_setopt($ch, CURLOPT_URL, 'http://dicionarioinformal.com.br/aham/');
curl_setopt($ch, CURLOPT_USERAGENT, "Opera/9.80 (J2ME/MIDP; Opera Mini/4.2.14912/870; U; id) Presto/2.4.15"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
$html = curl_exec($ch);

$dom = new DOMDocument;
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

$results = $xpath->query("//*[@class='m-definicao-conteudo']");

if ($results->length > 0) {
    echo $review = $results->item(0)->nodeValue;
}
    
asked by anonymous 21.12.2014 / 05:03

2 answers

3

Change the user agent.

Change:

curl_setopt($ch, CURLOPT_USERAGENT, "Opera/9.80 (J2ME/MIDP; Opera Mini/4.2.14912/870; U; id) Presto/2.4.15");

To:

curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 5.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"); 

You can use any other user agent, as long as it belongs to a desktop browser.

With the current one (Opera Mini), the site redirects to its mobile version, which does not contain the div. ;)
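For context, here is a minimal sketch of the whole flow with the desktop user agent applied. The helper names `fetchHtml` and `extractByClass` are hypothetical, not from the question. It also calls `libxml_use_internal_errors(true)`, which is an addition to this answer: it should keep parser warnings such as "Unexpected end tag" from being printed when the page's HTML is malformed. The demonstration at the end runs on an inline sample string rather than the real site.

```php
<?php
// Hypothetical helper: download a page while identifying as a desktop browser.
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $html = curl_exec($ch);

    // curl_exec() returns false on failure; normalize that to an empty string.
    return $html === false ? '' : $html;
}

// Hypothetical helper: text of the first element whose class attribute equals $class.
function extractByClass(string $html, string $class): ?string
{
    if ($html === '') {
        return null; // nothing to parse
    }

    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $results = $xpath->query("//*[@class='$class']");

    return $results->length > 0 ? $results->item(0)->nodeValue : null;
}

// Offline demonstration: made-up markup with a stray </a>, like the one the warning mentions.
$sample = '<div class="m-definicao-conteudo">texto</div></a>';
$text = extractByClass($sample, 'm-definicao-conteudo');
echo $text, "\n";
```

Against the real page, the call would look like `extractByClass(fetchHtml('http://dicionarioinformal.com.br/aham/'), 'm-definicao-conteudo')`, assuming the desktop version of the site still contains that div.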

    
21.12.2014 / 05:26
2

It seems to me the problem is in how you pass the name of the class to look for:

$html = '<div>Ora que raio!</div>
<p>Ola meu nome é pseudomatica (sou normal), etc. Meu nome é assim pq sim</p>
<div class="minhaClasse">Encontrei</div>
<p></p>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$class = 'minhaClasse'; // store the class name in a variable

$procura = new DomXPath($dom); // instantiate DOMXPath

$div = $procura->query("//*[contains(@class, '$class')]"); // query using the variable

See the example on Ideone:

var_dump($div->item(0)->nodeValue); // string(9) "Encontrei"
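One caveat with `contains(@class, ...)`: it is a substring test, so it also matches longer class names that merely contain the one you want. A common XPath 1.0 idiom pads the attribute with spaces and looks for the whole token. The sketch below compares the two on made-up sample markup; the class names `minhaClasseOutra` and `destaque` are only for illustration.

```php
<?php
// Made-up sample: both class attributes contain the substring "minhaClasse".
$html = '<div class="minhaClasseOutra">Other</div>
<div class="destaque minhaClasse">Encontrei</div>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$class = 'minhaClasse';

// Substring match: also picks up "minhaClasseOutra".
$loose = $xpath->query("//*[contains(@class, '$class')]");

// Token match: wrap the normalized attribute in spaces so only the whole class name matches.
$exact = $xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]"
);

echo $loose->length, "\n";              // number of elements matched by the substring test
echo $exact->length, "\n";              // number of elements matched by the token test
echo $exact->item(0)->nodeValue, "\n";  // text of the element carrying the exact class
```

The token form also matches elements with several classes, which the `@class='...'` equality in the question does not.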
    
21.12.2014 / 05:20