XmlDocument Exception: hexadecimal value 0x1a, is an invalid character

1

I have a project that reads an XML file generated by another program whose source code I do not have access to. The problem is that the generated XML comes with a special character at the end, and when I try to read the XML an exception is thrown.

Here is the code:

XmlDocument xmlDoc = new XmlDocument();
try
{
   xmlDoc.Load(localMontaXML);
}
catch (Exception ex)
{
   erros.Add(ex.Message);
}
ErrorChecker.Check(erros);

Is there any way to ignore this character?

I use Windows Forms with .NET 3.5.

    
asked by anonymous 11.02.2014 / 11:26

3 answers

2

There is a way, but first you should consider whether the file is arriving corrupted. If you are sure that the file is intact and that there is only one invalid character at the end, for some unknown reason, you can load the file into memory without its last character and obtain valid XML.

There are two ways I know how to do this. Note the following code:

METHOD 1

// requires the System.IO and System.Xml.Linq namespaces
using (MemoryStream ms = new MemoryStream()) // creates a memory stream
using (var fs = new FileStream(@"C:\sample.xml", FileMode.Open, FileAccess.Read))
// opens the XML file, in this case C:\sample.xml
{
    byte[] bytes = new byte[fs.Length]; // holds the contents of the file
    fs.Read(bytes, 0, (int)fs.Length); // reads the file
    ms.Write(bytes, 0, (int)fs.Length-1); // writes everything except the last byte (length - 1)

    ms.Seek(0, SeekOrigin.Begin); // goes back to the start of the stream

    XDocument doc = XDocument.Load(ms); // loads the XML from memory
    Console.WriteLine(doc.Root.Value); // read test
}

METHOD 2

// requires the System.IO.MemoryMappedFiles namespace
using (var fs = new FileStream(@"C:\sample.xml", FileMode.Open, FileAccess.Read))
// creates a memory-mapped file from the FileStream
using (var mmf = MemoryMappedFile.CreateFromFile(fs,"xml",fs.Length,MemoryMappedFileAccess.Read,null,System.IO.HandleInheritability.None,true)) 
{
    // creates a stream that sees everything up to the end of the file - 1
    using (var str = mmf.CreateViewStream(0, fs.Length - 1, MemoryMappedFileAccess.Read)) 
    {
        XDocument doc = XDocument.Load(str); // reads from the stream
        Console.WriteLine(doc.Root.Value);
    }
}

The second uses Memory-Mapped Files, which are only available from .NET 4.0 onward, so it will not work on .NET 3.5.

Note that these methods bring the contents of the file into memory, which is not the most efficient approach, so be careful when doing this with a large XML file. And never forget the using statement to dispose of your Stream after use.
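If you would rather not assume that exactly one bad byte is present, a variation on Method 1 is to drop trailing bytes only when they really are 0x1A. The following is a minimal sketch that should also work on .NET 3.5. The class and method names are hypothetical, and it assumes a single-byte or UTF-8 encoded file (in UTF-16 the character would occupy two bytes).

```csharp
using System;
using System.IO;
using System.Xml;

static class XmlTrailingByteSketch
{
    // The character named in the exception message (SUB, Ctrl+Z).
    const byte Substitute = 0x1A;

    // Hypothetical helper: number of bytes left once trailing 0x1A bytes are ignored.
    public static int LengthWithoutTrailingSub(byte[] bytes)
    {
        int length = bytes.Length;
        while (length > 0 && bytes[length - 1] == Substitute)
        {
            length--;
        }
        return length;
    }

    // Hypothetical helper: loads the file into an XmlDocument, skipping trailing 0x1A bytes.
    // Assumption: the file is small enough to be read into memory in one go.
    public static XmlDocument Load(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        int length = LengthWithoutTrailingSub(bytes);

        XmlDocument doc = new XmlDocument();
        using (MemoryStream ms = new MemoryStream(bytes, 0, length))
        {
            doc.Load(ms);
        }
        return doc;
    }
}
```

With this in place, the call in the question would become something like `xmlDoc = XmlTrailingByteSketch.Load(localMontaXML);`, assuming localMontaXML holds the file path.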

Good luck, and I hope this helps.

    
11.02.2014 / 12:43
1

Good answer @Conrad Clark

I have a third suggestion, for when you do not know how many extra characters the file has at the end.

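// assumes localMontaXML is the file path, as in the question, so the file's text is read first
xmlDoc.LoadXml(System.Text.RegularExpressions.Regex.Match(System.IO.File.ReadAllText(localMontaXML), @"<[\w\W]+>").ToString());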
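The regular expression is greedy, so it keeps everything from the first `<` to the last `>` and discards whatever comes before or after. The same idea can be sketched with plain string searches, which avoids running a regex over a large document. The class and method names below are hypothetical.

```csharp
using System;

static class XmlTrimSketch
{
    // Hypothetical helper: keeps only the text from the first '<' to the last '>'.
    public static string TrimToMarkup(string text)
    {
        int start = text.IndexOf('<');
        int end = text.LastIndexOf('>');
        if (start < 0 || end < start)
        {
            throw new FormatException("No XML markup found.");
        }
        return text.Substring(start, end - start + 1);
    }
}
```

Usage would be along the lines of `xmlDoc.LoadXml(XmlTrimSketch.TrimToMarkup(System.IO.File.ReadAllText(localMontaXML)));`, again assuming localMontaXML is a file path.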

    
11.02.2014 / 12:48
1

In my case the error was similar: "hexadecimal value 0x00, is an invalid character". The solution may be much the same for you, changing only the character to be replaced.

The code that raised the error looked like this:

xmlObj = new XmlDocument();
xmlObj.Load(arquivo.fullPath);

The fix was:

StreamReader sr = arquivo.OpenText(); // opens the file's text
string xmlText = sr.ReadToEnd(); // reads the file's text into xmlText
xmlText = xmlText.Replace("\0", string.Empty); // removes the null characters
xmlObj = new XmlDocument(); // creates the XML document
xmlObj.LoadXml(xmlText); // loads the string directly to be read as XML
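Both errors in this thread (0x1a and 0x00) involve characters that XML 1.0 does not allow. Instead of replacing one character at a time, a more general filter can drop every such character. This is a sketch with hypothetical names, written without `XmlConvert.IsXmlChar` (added in .NET 4.0) so that it should also compile on .NET 3.5. Note that it removes invalid characters anywhere in the text, not only at the end.

```csharp
using System;
using System.Text;

static class XmlCharFilterSketch
{
    // True for single UTF-16 code units allowed by the XML 1.0 Char production.
    static bool IsAllowed(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Hypothetical helper: returns the text without characters that XML 1.0 forbids.
    public static string RemoveInvalidXmlChars(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (IsAllowed(c))
            {
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c) && i + 1 < text.Length
                     && char.IsLowSurrogate(text[i + 1]))
            {
                // A valid surrogate pair encodes a character above U+FFFF; keep both halves.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else (control characters, lone surrogates) is dropped.
        }
        return sb.ToString();
    }
}
```

In the fix above, the `Replace` call could then be swapped for `xmlText = XmlCharFilterSketch.RemoveInvalidXmlChars(xmlText);`.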

I hope this helps.

    
18.07.2018 / 22:34