Special characters (ï»¿) appear in front of the XML tag [duplicate]


I'm reading two XML files that were created on different computers. This one was created on my computer:

<?xml version="1.0" encoding="UTF-8"?> ...

The one below has the same content, but a sequence of special characters (ï»¿) appears before the xml tag. See:

ï»¿<?xml version="1.0" encoding="UTF-8"?> ...

Even after copying the contents of one file and pasting them into the other, the problem remains the same when I call new SimpleXMLElement($contentFile).

Because of these strange characters, an error occurs:

Warning: SimpleXMLElement::__construct(): Entity: line 1: parser error : Start tag expected, '<' not found in C:\wamp64\www ...

At first I thought of removing these characters with a regular expression, but maybe there is already a ready-made solution for this that I don't know about yet.

How can I fix this?

    
asked by anonymous 03.07.2017 / 21:54

1 answer


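The ï»¿ characters indicate that the document was saved as "UTF-8 with BOM" when it should have been saved as "UTF-8 without BOM". They are the three BOM bytes (EF BB BF) displayed as if they were Latin-1 text.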
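To confirm that a file really starts with a BOM before trying any fix, you can inspect its first three bytes. This is only a minimal sketch: hasUtf8Bom is a hypothetical helper name, and the sample strings stand in for the contents of your files.

```php
<?php
// Hypothetical helper: true when $data begins with the 3-byte UTF-8 BOM (EF BB BF).
function hasUtf8Bom(string $data): bool
{
    return strncmp($data, "\xEF\xBB\xBF", 3) === 0;
}

// Sample data standing in for file_get_contents() results.
$withBom    = "\xEF\xBB\xBF" . '<?xml version="1.0" encoding="UTF-8"?><root/>';
$withoutBom = '<?xml version="1.0" encoding="UTF-8"?><root/>';

var_dump(hasUtf8Bom($withBom));               // bool(true)
var_dump(hasUtf8Bom($withoutBom));            // bool(false)
var_dump(bin2hex(substr($withBom, 0, 3)));    // string(6) "efbbbf"
```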

How to try to solve it in the application

  

Note: in the examples I used simplexml_load_string, but both it and simplexml_load_file return a SimpleXMLElement:

     

SimpleXMLElement simplexml_load_string ( string $data [, string $class_name = "SimpleXMLElement" [, int $options = 0 [, string $ns = "" [, bool $is_prefix = false ]]]] )
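As a rough illustration of that note, the sketch below parses the same content once from a string and once from a file. The sample XML and the temporary file are assumptions made up for the example, not part of the original files.

```php
<?php
// Sample XML, invented for this example.
$xmlText = '<?xml version="1.0" encoding="UTF-8"?><root><item>hello</item></root>';

// Parse straight from a string.
$fromString = simplexml_load_string($xmlText);

// Parse the same content from a file (a temporary file, only for illustration).
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, $xmlText);
$fromFile = simplexml_load_file($path);
unlink($path);

// Both functions return a SimpleXMLElement on success (and false on failure).
var_dump($fromString instanceof SimpleXMLElement);
var_dump($fromFile instanceof SimpleXMLElement);
```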

I can't tell how the document was encoded, so you can try decoding the XML content before parsing (note that utf8_decode and utf8_encode are deprecated as of PHP 8.2):

$data = file_get_contents($url);
$data = utf8_decode($data);

$xml = simplexml_load_string($data);

...

$xml->asXML();

Or decode and re-encode (this may cause problems because the XML header declares UTF-8):

$data = file_get_contents($url);
$data = utf8_decode($data);
$data = utf8_encode($data);

$xml = simplexml_load_string($data);

...

$xml->asXML();

You can also try trim (this, by the way, was the only one that worked for me):

$data = file_get_contents('A.xml');

$data = trim($data, "\xEF\xBB\xBF \t\n\r\0\x0B"); // Strips surrounding whitespace and the BOM bytes (plain trim($data) removes whitespace only)

$xml = simplexml_load_string($data);

...

$xml->asXML();

If none of these work, you can try substr with strpos, like this:

$data = file_get_contents($url);

$data = substr($data, strpos($data, '<'));

$xml = simplexml_load_string($data);

...

$xml->asXML();

If that also fails, you can combine it with utf8_decode and utf8_encode:

$data = file_get_contents($url);

$data = substr($data, strpos($data, '<'));

$data = utf8_decode($data);
$data = utf8_encode($data);

$xml = simplexml_load_string($data);

...

$xml->asXML();
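Since the question mentions a regular expression, here is one possible sketch that removes only the BOM and leaves everything else untouched. stripUtf8Bom is a hypothetical helper, and handling the second byte sequence (the BOM after it has been wrongly re-encoded to UTF-8, which shows up as a literal ï»¿) is my assumption about how such files can end up; drop that alternative if it does not apply to you.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 BOM from $data.
// Assumption: it also removes the "double-encoded" form of the BOM
// (bytes C3 AF C2 BB C2 BF), i.e. ï»¿ stored as three real characters.
function stripUtf8Bom(string $data): string
{
    $clean = preg_replace('/^(?:\xEF\xBB\xBF|\xC3\xAF\xC2\xBB\xC2\xBF)/', '', $data);

    // preg_replace returns null on error; fall back to the input in that case.
    return $clean ?? $data;
}

// Sample data standing in for file_get_contents($url).
$data = "\xEF\xBB\xBF" . '<?xml version="1.0" encoding="UTF-8"?><root><item>hello</item></root>';

$xml = simplexml_load_string(stripUtf8Bom($data));

echo $xml->item, PHP_EOL;
```

Unlike substr with strpos, this does not discard other leading content, so a file that starts with something unexpected still fails loudly instead of being silently cut.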

How to solve it with text editors

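If you have access to these .xml files, you can re-save them without the BOM using Notepad++ (in the Encoding menu, pick the UTF-8 option without BOM).

Or Sublime Text (File > Save with Encoding > UTF-8).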

    
03.07.2017 / 22:10