How to get a limited number of occurrences with DOM?

I'm writing a parser for a website and want to extract some data from it. The data is structured as follows:

<div class="interesses">
<span class="tipo" >Tipo 1</span>
<span class="tipo" >Tipo 1</span>
<span class="tipo" >Tipo 2</span>
<span class="tipo" >Tipo 2</span>
<span class="tipo" >Tipo 3</span>
<span class="tipo" >Tipo 3</span>
</div>

I want to get the text of each span with the class tipo, so I used the DOM:

$html = file_get_contents("http://exemplo.com");
$DOM = new DOMDocument();
$DOM->loadHTML('<meta charset="utf-8">'.$html);
$xpath = new DOMXPath($DOM);
// " tipo " (with the surrounding spaces) matches the class token exactly
$tipo = $xpath->query('//*[contains(concat(" ", normalize-space(@class), " "), " tipo ")]');
$arrValues = iterator_to_array($tipo);
foreach ($arrValues as $value) {
    echo $value->nodeValue."<br />";
}

It works!

But the problem is that the source page, as you can see, has two "Tipo 1" entries, two "Tipo 2" entries, and so on. The site always generates duplicated information, but I want to show only one of each: a single "Tipo 1", a single "Tipo 2", and so on. Right now everything comes through, and I have no idea how to prevent the duplication.

Update:

The array_unique approach that @Miguel Angelo suggested worked! But now imagine the following scenario: an online bakery sells several kinds of sweet bread, with coconut and without coconut. A buyer chooses two loaves, one with coconut and one without. The HTML structure would look something like this:

<div class="interesses">
<span class="tipo" >Pão Doce</span>
<span class="tipo" >Com coco</span>
<span class="tipo" >Pão Doce</span>
<span class="tipo" >Com coco</span>
<span class="tipo" >Pão Doce</span>
<span class="tipo" >Sem coco</span>
<span class="tipo" >Pão Doce</span>
<span class="tipo" >Sem coco</span>
</div>

I now want to show the user only the two kinds of bread they ordered ("Pão Doce" is sweet bread; "Com coco" and "Sem coco" mean with and without coconut):

Item 1: Sweet bread with coconut, Item 2: Sweet bread without coconut.

Without array_unique, the result would be something like:

Item 1: Sweet bread with coconut, Item 2: Sweet bread with coconut, Item 3: Sweet bread without coconut, Item 4: Sweet bread without coconut.

With the array_unique from @Miguel Angelo's tip, each value is kept only once, that is:

Item 1: Sweet bread with coconut, Item 2: Without coconut.

In other words, when two loaves share a value, I get either every occurrence or only one. What I want is to display each group once: keep "Sweet bread with coconut" and drop its repetition, and likewise keep "Sweet bread without coconut" and drop its duplicate.

Is there any way to do this?

    
asked by anonymous 27.03.2015 / 04:01

1 answer

Can't you use array_unique?

Example:

$html = file_get_contents("http://exemplo.com");
$DOM = new DOMDocument();
$DOM->loadHTML('<meta charset="utf-8">'.$html);
$xpath = new DOMXPath($DOM);
// " tipo " (with the surrounding spaces) matches the class token exactly
$tipo = $xpath->query('//*[contains(concat(" ", normalize-space(@class), " "), " tipo ")]');
// extract the text of every node, then drop the repeated values
$arrValues = array_unique(array_map(
    function ($el) { return $el->nodeValue; },
    iterator_to_array($tipo)));
foreach ($arrValues as $value) {
    echo $value."<br />";
}
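
To try the idea without fetching a remote page, here is a minimal self-contained sketch. It assumes the markup from the question is available as a string, and the names $texts and $unique are only illustrative. Because array_unique preserves the original keys, the sketch also passes the result through array_values to renumber them; that step is optional if the array is only iterated.

```php
<?php
// Sample markup copied from the question; a real run would fetch it instead.
$html = '<div class="interesses">'
      . '<span class="tipo">Tipo 1</span><span class="tipo">Tipo 1</span>'
      . '<span class="tipo">Tipo 2</span><span class="tipo">Tipo 2</span>'
      . '<span class="tipo">Tipo 3</span><span class="tipo">Tipo 3</span>'
      . '</div>';

$dom = new DOMDocument();
$dom->loadHTML('<meta charset="utf-8">' . $html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//span[contains(concat(" ", normalize-space(@class), " "), " tipo ")]');

// Collect the text of every matched span, in document order.
$texts = [];
foreach ($nodes as $node) {
    $texts[] = trim($node->nodeValue);
}

// array_unique keeps the first occurrence of each value and preserves its key
// (here 0, 2, 4), so array_values is applied to renumber the result.
$unique = array_values(array_unique($texts));

foreach ($unique as $text) {
    echo $text, "<br />\n";
}
```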

Edit to address the additional problem:

The second problem seems to come down to concatenating the elements of an array two at a time. That is, an array like this:

[ "a", "b", "c", "d" ]

would have to become:

[ "a b", "c d" ]

before being passed to array_unique.

For this I wrote the following function:

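// Joins the elements of $array in groups of $num, separated by a space.
// Elements left over in an incomplete final group are discarded.
function func_concat_N_a_N($num, $array) {
  $length = count($array);
  $item = "";
  $result = array();
  for ($i = 0; $i < $length; $i++) {
    // start a new group at every $num-th element, otherwise append to the current one
    $item = ($i % $num) == 0 ? $array[$i] : $item." ".$array[$i];
    // the group is complete once it holds $num elements
    if ((($i + 1) % $num) == 0) {
      $result[] = $item;
    }
  }
  return $result;
}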
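
As an alternative sketch, the same grouping can be expressed with PHP's built-in array_chunk and implode. The helper name joinInGroups and its $separator parameter are hypothetical, introduced only for this illustration. One difference in behavior to keep in mind: array_chunk keeps a trailing incomplete group, whereas the loop-based function above discards it.

```php
<?php
// Hypothetical helper: splits $values into chunks of $size elements
// and joins each chunk into a single string.
function joinInGroups(array $values, int $size, string $separator = ' '): array
{
    $joined = [];
    foreach (array_chunk($values, $size) as $chunk) {
        $joined[] = implode($separator, $chunk);
    }
    return $joined;
}

$pairs = joinInGroups(['a', 'b', 'c', 'd'], 2);
print_r($pairs);

// With an odd number of elements the last chunk holds a single value.
$uneven = joinInGroups(['a', 'b', 'c'], 2);
print_r($uneven);
```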

It is used in the original code like this:

$arrValues = array_unique(
               // the function is used here
               func_concat_N_a_N(
                   // join the elements 2 at a time
                   2,
                   // the array whose elements we want to join 2 at a time
                   array_map(
                       function ($el) { return $el->nodeValue; },
                       iterator_to_array($tipo))));
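
To see the whole pipeline in one place, here is a rough end-to-end sketch that runs on inline markup instead of a remote page. The markup is a stand-in modelled on the bakery example, with the labels translated to English as an assumption of this sketch. To stay self-contained it does the pairing step with the built-in array_chunk and implode rather than func_concat_N_a_N, and the names $texts and $items are illustrative.

```php
<?php
// Stand-in markup modelled on the bakery example, with the labels translated to English.
$html = '<div class="interesses">'
      . '<span class="tipo">Sweet bread</span><span class="tipo">With coconut</span>'
      . '<span class="tipo">Sweet bread</span><span class="tipo">With coconut</span>'
      . '<span class="tipo">Sweet bread</span><span class="tipo">Without coconut</span>'
      . '<span class="tipo">Sweet bread</span><span class="tipo">Without coconut</span>'
      . '</div>';

$dom = new DOMDocument();
$dom->loadHTML('<meta charset="utf-8">' . $html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//span[contains(concat(" ", normalize-space(@class), " "), " tipo ")]');

// Text of every matched span, in document order.
$texts = array_map(
    function ($node) { return trim($node->nodeValue); },
    iterator_to_array($nodes)
);

// Pair the values first, then remove the duplicated pairs.
$items = array_values(array_unique(array_map(
    function ($pair) { return implode(' ', $pair); },
    array_chunk($texts, 2)
)));

foreach ($items as $i => $item) {
    echo 'Item ', $i + 1, ': ', $item, "<br />\n";
}
```

The order of operations is the point here: deduplicating before pairing would collapse the repeated "Sweet bread" label, while pairing first lets array_unique compare whole items.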

Online demo of the code above

    
27.03.2015 / 04:17