json_encode returning "Malformed UTF-8 characters, possibly incorrectly encoded"

1

I'm using Laravel 3 on a particular system.

I'm having a problem with json_encode, which returns false in some cases.

In this code, I load an external page and, using DOMDocument, loop over the meta tags with a foreach, collecting each content value into an array.

I then pass this array to Laravel's Response::json, which uses json_encode internally.

More or less like this:

$url = Input::get('url');
$html = file_get_contents($url);

$dom = new DOMDocument();

@$dom->loadHtml('<?xml encoding="UTF-8" version="1.0"?>' . $html);

$dados = array();

foreach ($dom->getElementsByTagName('meta') as $element) {

    $name = trim($element->getAttribute('property'));

    if (! $name || strpos($name, 'og:') === false) continue;

    $dados[$name] = $element->getAttribute('content');
}

return Response::json($dados);

In some cases, Response::json returns an empty response.

Then I made the following check to know what was wrong:

$json = json_encode($dados);

if ($json === false) {
    echo json_last_error_msg();
}

It returned:

Malformed UTF-8 characters, possibly incorrectly encoded

I checked the contents of the $dados variable, and it looked like this:

Array
(
    [og:title] => **Removido**
    [og:description] => Os Dez Mandamentos: chuva de granizo e fogo � a sétima praga a castigar o Egito
    [og:image] => **Removido**
)   

The problem seems to be caused by the � character in og:description.

Does anyone know how I can work around this problem?

Update

I tried printing the HTML with DOMDocument, using $dom->saveHTML(), and got this error:

output conversion failed due to conv error, bytes 0xE9 0x20 0x61 0x20
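
The bytes in that message are a useful clue: 0x20 0x61 0x20 is " a ", and 0xE9 is "é" in ISO-8859-1, which matches the "é a" that should appear in the description. A minimal sketch to confirm this, assuming (not verified) that the stray byte really is ISO-8859-1:

```php
<?php
// Bytes reported in the saveHTML() error message.
$bytes = "\xE9\x20\x61\x20";

// 0xE9 opens a three-byte UTF-8 sequence, but 0x20 is not a continuation
// byte, so the string is not valid UTF-8.
$isUtf8 = mb_check_encoding($bytes, 'UTF-8');

// Assumption: the stray byte is ISO-8859-1, where 0xE9 is "é".
$converted = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');

var_dump($isUtf8);                        // bool(false)
var_dump($converted === "\xC3\xA9 a ");   // bool(true)
echo json_encode($converted), "\n";       // "\u00e9 a "
```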

    
asked by anonymous 09.10.2015 / 14:36

2 answers

0

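Since $dados is an array, you can apply mb_convert_encoding to every value recursively with array_walk_recursive. Converting from UTF-8 to UTF-8 replaces invalid byte sequences with the substitute character (? by default):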

array_walk_recursive($dados, function (&$val) {
    if (is_string($val)) {
        $val = mb_convert_encoding($val, 'UTF-8', 'UTF-8');
    }
});

To be clear, though, this is a very ugly hack (a gambiarra). It is better to identify where these invalid characters come from and fix them before they reach PHP.
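
The output in the question suggests the page mixes encodings: "sétima" is intact while a lone 0xE9 is not. In that case, one alternative to discarding the bad bytes is to reinterpret only those bytes. This is a rough sketch, assuming the stray bytes are ISO-8859-1. The function name repairUtf8 is hypothetical, and the pattern only approximates valid UTF-8 (it does not reject overlong forms or surrogates):

```php
<?php
// Hypothetical helper: keep valid UTF-8 sequences untouched and reinterpret
// every other byte as ISO-8859-1 (an assumption about the source page).
function repairUtf8($value)
{
    // Group 1: a run of (approximately) valid UTF-8. Group 2: one stray byte.
    $pattern = '/((?:[\x00-\x7F]|[\xC2-\xDF][\x80-\xBF]'
             . '|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF4][\x80-\xBF]{3})++)|(.)/s';

    return preg_replace_callback($pattern, function ($m) {
        // $m[2] is only present when the stray-byte branch matched.
        return isset($m[2])
            ? mb_convert_encoding($m[2], 'UTF-8', 'ISO-8859-1')
            : $m[1];
    }, $value);
}

// Sample value shaped like the one in the question: a lone 0xE9 next to
// a correctly encoded "é" (0xC3 0xA9) in "sétima".
$dados = array('og:description' => "fogo \xE9 a s\xC3\xA9tima praga");

array_walk_recursive($dados, function (&$val) {
    if (is_string($val)) {
        $val = repairUtf8($val);
    }
});

$json = json_encode($dados);
echo $json, "\n";
```

This is meant for short values such as meta tag contents. Unlike converting the whole page from ISO-8859-1, it should not double-encode the parts that are already valid UTF-8.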

    
27.04.2018 / 14:04
-1

Use mb_convert_encoding() on the returned data, as in the following example:

$data = mb_convert_encoding($data, "UTF-8", "auto");

$json = json_encode($data);
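
The conversion above changes the data before it is encoded. If the only goal is to stop json_encode from returning false, PHP 7.2 and later also offer the JSON_INVALID_UTF8_SUBSTITUTE flag, which emits U+FFFD in place of invalid bytes. This is a sketch under that version assumption, with a sample value made up to resemble the one in the question:

```php
<?php
// Sample value with a stray 0xE9 byte, as in the question.
$dados = array('og:description' => "fogo \xE9 a s\xC3\xA9tima praga");

// Without flags, json_encode() fails on the invalid byte.
var_dump(json_encode($dados));   // bool(false)

// JSON_INVALID_UTF8_SUBSTITUTE exists from PHP 7.2 on; on older versions
// this falls back to 0 and the conversion step is still needed.
$flags = defined('JSON_INVALID_UTF8_SUBSTITUTE') ? JSON_INVALID_UTF8_SUBSTITUTE : 0;

$json = json_encode($dados, $flags);
var_dump($json);
```

This hides the symptom rather than fixing the encoding, so the replacement character still ends up in the output.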

    
09.10.2015 / 14:40