Improve file_get_contents performance in loops

Is there a way to make file_get_contents run faster when it is called inside a loop?

Here is the code:

<?php foreach ($links->result() as $value) : ?>

<?php 
          $url = $value->lnkUrl;
          $domain = parse_url($url, PHP_URL_HOST);
          $web_link = "http://".$domain;

          $str = file_get_contents($web_link);

          if(strlen($str)>0){
            preg_match("/\<title\>(.*)\<\/title\>/",$str,$title);
             if ( isset( $title[1] ) ) {
               echo "<span class='directi_web' title='".$title[1]."'>". $title[1] ."</span>";
             }else{
                echo "<span class='directi_web'>...</span>";
             }
           }
?>


<?php endforeach; ?>
    
asked by anonymous 16.10.2014 / 00:33

1 answer

Using cURL to fetch the page and DOMDocument to extract the title simplifies the work considerably:

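Function to fetch the HTML

function file_get_contents_curl($url) {

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER, 0);          // do not include response headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow redirects
    // Timeouts keep one slow host from stalling the whole loop.
    // The 5-second values are an assumption; tune them to your needs.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);

    $data = curl_exec($ch);  // false on failure
    curl_close($ch);

    return $data;
}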

Your code, using DOMDocument

foreach ($links->result() as $value) {

    // fetch the page
    $domain = parse_url($value->lnkUrl, PHP_URL_HOST);
    $html = file_get_contents_curl("http://".$domain);

    // parse the HTML and extract the title
    $title = '';
    if (!empty($html)) {
        $doc = new DOMDocument();
        @$doc->loadHTML($html); // @ silences warnings about malformed HTML
        $nodes = $doc->getElementsByTagName('title');
        if ($nodes->length > 0) {
            $title = trim($nodes->item(0)->nodeValue);
        }
    }

    // output (escape the title, since it comes from an external site)
    if ($title !== '') {
      $safe = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
      echo '<span class="directi_web" title="'.$safe.'">'.$safe.'</span>';
    }
    else {
      echo '<span class="directi_web">...</span>';
    }
}
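
Each request in the loop above waits for the previous one to finish, so the total time is roughly the sum of all response times. If the requests must happen in the same script, one option is to run them concurrently with curl_multi. This is only a minimal sketch, assuming the cURL extension is available; fetch_all() and the 5-second timeout are illustrative choices, not part of the original code:

```php
<?php
// Sketch: fetch several URLs concurrently with curl_multi.
// fetch_all() is a hypothetical helper; the timeout value is an assumption.
function fetch_all(array $urls, int $timeout = 5): array
{
    $multi = curl_multi_init();
    $handles = [];

    // Register one easy handle per URL, keeping the caller's array keys.
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running) {
            curl_multi_select($multi, 1.0); // wait for activity on any handle
        }
    } while ($running && $status === CURLM_OK);

    // Collect the bodies; a failed transfer yields an empty string.
    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = (string) curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

The returned array keeps the keys of $urls, so each body can be passed to DOMDocument in the same way as in the loop above. With this approach the total wait is closer to the slowest single response than to the sum of all of them.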

Note: This kind of work should be done in the background, with the results stored in a file or database. By the time you serve a page to a visitor, the data should already be available. If you fetch and process everything while the page is being generated, the visitor has to wait for every remote request to finish, and the page takes far longer to load than it should.
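
As a rough illustration of that idea, the titles could be collected by a background job (for example, one run from cron) and saved to a JSON file that the page simply reads. The names refresh_titles() and load_titles(), the JSON format, and the cache file path are all hypothetical; $fetchTitle stands for any callable that returns the title for a domain, such as one built from the cURL and DOMDocument code above:

```php
<?php
// Sketch: precompute titles in the background, read them at page time.
// Function names, file format and cache path are assumptions for illustration.

// Run from a background job: fetch every title and store the result.
function refresh_titles(array $domains, callable $fetchTitle, string $cacheFile): void
{
    $titles = [];
    foreach ($domains as $domain) {
        $titles[$domain] = (string) $fetchTitle($domain);
    }
    // LOCK_EX avoids a half-written file being read by a page request.
    file_put_contents($cacheFile, json_encode($titles), LOCK_EX);
}

// Run while rendering the page: no network access, just a file read.
function load_titles(string $cacheFile): array
{
    if (!is_file($cacheFile)) {
        return [];
    }
    $titles = json_decode((string) file_get_contents($cacheFile), true);
    return is_array($titles) ? $titles : [];
}
```

In the page loop, the lookup then becomes something like $titles[$domain] ?? '...', with $titles loaded once before the loop, so no visitor waits on a remote site.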

    
16.10.2014 / 01:52