Capture the last characters in PHP from an XML file

1

I have a sitemap file from my site.

...
<loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
<loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
<loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
...

I need to capture the last 14 digits of every URL.

I started as follows, walking through the file contents:

<?php

function esquerda($str, $length) {
return substr($str, 0, $length);
}

$url = file_get_contents('https://site.com.br/sitemap.xml');

while (strpos($url,'/aluno/') > 0) {
    $url = substr($url,strpos($url,'/aluno/')+14);
    $numero = esquerda($url, 14);
    echo $numero;
    echo "<br>";
}

?>

But I can't get it right. Can anyone suggest a solution?

    
asked by anonymous 01.08.2018 / 16:32

2 answers

2

You have to take the length of the string and subtract $length from it to get the starting position:

function esquerda($str, $length) 
{
    return substr($str, (strlen($str) - $length), $length);
}

Online example on Ideone:

<?php

    function esquerda($str, $length) 
    {
        return substr($str, (strlen($str) - $length), $length);
    }

    $n = '1400001';

    echo esquerda($n, 5); // output: 00001
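
As a side note, substr also accepts a negative offset, which counts from the end of the string, so the same result can be sketched without calling strlen. The function name ultimos is hypothetical and is not part of the code above:

```php
<?php

// Hypothetical helper: returns the last $length characters of $str.
// A negative offset makes substr() count from the end of the string.
function ultimos(string $str, int $length): string
{
    return substr($str, -$length);
}

echo ultimos('1400001', 5); // 00001
```

This sketch assumes $length is greater than zero; with a length of 0 the offset becomes 0 and the whole string is returned.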

That is the idea. Your XML excerpt is missing the enclosing tags, so I'll generate an example. Since each URL ends with a slash, the function is adjusted to skip that last character:

function esquerda($str, $length) 
{
    $length++;
    return substr($str, (strlen($str) - $length), $length - 1);
}

$xml = '<a>
    <loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
    <loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
    <loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
    </a>';

$simpleXml = simplexml_load_string($xml);

foreach($simpleXml->loc as $loc) 
{
    echo esquerda($loc, 14);
    echo '<br />';
}

If the XML is in a file or at a URL, you can use simplexml_load_file($url); instead, which works the same way.

01.08.2018 / 16:34
0

First you have to parse the XML, for which you can use DOMDocument or SimpleXML. Your case looks simple, so the second one is enough:

For example:

<?php

$xmlstring = '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
   </url>
   <url>
      <loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
   </url>
   <url>
      <loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
   </url>
</urlset>';

$xml = simplexml_load_string($xmlstring);

foreach ($xml as $tag) {
    var_dump($tag->loc);
}
  

I wrote the example assuming it's a sitemap.
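
Note that var_dump prints SimpleXMLElement objects here rather than plain strings. A minimal sketch, assuming the same sitemap structure as above, that casts each loc to a string and collects the URLs in an array (the variable names $sitemap and $urls are illustrative):

```php
<?php

$sitemap = '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
   </url>
   <url>
      <loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
   </url>
</urlset>';

$xml = simplexml_load_string($sitemap);

$urls = [];
foreach ($xml->url as $url) {
    // The cast turns the SimpleXMLElement into its text content.
    $urls[] = (string) $url->loc;
}

print_r($urls);
```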

Now if the XML is something like:

<foo>
<loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
<loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
<loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
</foo>

That would be enough:

<?php

$xmlstring = '<?xml version="1.0" encoding="UTF-8"?>
<foo>
<loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
<loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
<loc>https://www.site.com.br/aluno/jose/33333333333333/</loc>
</foo>';

$xml = simplexml_load_string($xmlstring);

foreach ($xml as $tagFoo) {
    var_dump($tagFoo);
}

If it comes from a URL, you can use $xml = simplexml_load_file('http://site/sitemap.xml'); instead of $xml = simplexml_load_string($xmlstring);

Whatever the format is, just adjust the reading code accordingly. Now let's get to the last part of the string.
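
Loading from a URL can fail (network error, invalid XML), and in that case both simplexml_load_file and simplexml_load_string return false. A hedged sketch of checking for that, using a deliberately malformed string so that it runs without network access:

```php
<?php

// Collect libxml parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Deliberately malformed XML, used only to trigger the failure path.
$xml = simplexml_load_string('<foo><loc>unclosed</foo>');

if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    libxml_clear_errors();
}
```

The same check applies unchanged when the document is loaded with simplexml_load_file.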

You could do a function like one of these two:

  • With substr

    function final_str($str, $delimitador = '/') {
        $str = trim($str, $delimitador); // Strip the delimiter from the ends (here, the trailing /)

        $posicao = strrpos($str, $delimitador); // Position of the last delimiter (in your case, /)

        // Keep only what comes after the last delimiter; if there is none, return the whole string
        return $posicao === false ? $str : substr($str, $posicao + 1);
    }
    
  • With explode

    function final_str($str, $delimitador = '/') {
        $str = trim($str, $delimitador); // Strip the delimiter from the ends (here, the trailing /)

        $partes = explode($delimitador, $str); // Split the string into parts

        return $partes[count($partes) - 1]; // Take the last part of the string
    }
    

And you would use it like this:

$xml = simplexml_load_string($xmlstring);

foreach ($xml as $tag) {
    var_dump( final_str($tag->loc) );
}

Or, if it comes from a URL:

$xml = simplexml_load_file('http://site.com/sitemap.xml');

foreach ($xml as $tag) {
    var_dump( final_str($tag->loc) );
}
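
Putting the pieces together, here is a minimal end-to-end sketch. It assumes the flat loc layout shown earlier and that a valid number is exactly 14 digits, as stated in the question. The /contato/ URL and the $numeros array are hypothetical, added only to show a non-matching entry being skipped:

```php
<?php

// Returns the last path segment of $str (same idea as the explode version above).
function final_str($str, $delimitador = '/') {
    $partes = explode($delimitador, trim($str, $delimitador));
    return $partes[count($partes) - 1];
}

$xmlstring = '<foo>
<loc>https://www.site.com.br/aluno/jose/11111111111111/</loc>
<loc>https://www.site.com.br/aluno/jose/22222222222222/</loc>
<loc>https://www.site.com.br/contato/</loc>
</foo>';

$xml = simplexml_load_string($xmlstring);

$numeros = [];
foreach ($xml->loc as $loc) {
    $numero = final_str((string) $loc);
    // Assumption: valid identifiers are exactly 14 digits.
    if (strlen($numero) === 14 && ctype_digit($numero)) {
        $numeros[] = $numero;
    }
}

print_r($numeros);
```

Filtering on length and digits keeps other sitemap entries from ending up in the result.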
    
01.08.2018 / 16:58