Catch custom tags with Html Agility Pack

3

I'm using the Html Agility Pack library to process HTML,

and I'd like it to capture my custom elements.

I tried the following:

 HtmlDocument html = new HtmlDocument();
 html.Load(new StringReader(Document.Content)); // This is my HTML; it has no <body> and is loaded from the database

 var teste = html.DocumentNode.SelectNodes("//Tag-Teste"); // the HTML contains <tag-teste>conteudo</tag-teste>

but the variable teste comes back with no matching nodes.

    
asked by anonymous 11.03.2015 / 20:09

2 answers

3

Solution found by the author: the problem was the uppercase initials in the tag name. Instead of Tag-Teste, use tag-teste.

The Html Agility Pack handles HTML case-insensitively ¹; XHTML, however, is case-sensitive.

Knowing this, when you query with XPath you should write tag names in lowercase.
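A minimal sketch of the fix, assuming the HtmlAgilityPack NuGet package is referenced. The class and method names (CustomTagDemo, InnerTexts) are hypothetical and only serve the illustration; the markup is the one from the question.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class CustomTagDemo
{
    // Returns the inner text of every node matched by the XPath query,
    // or an empty array when nothing matches.
    public static string[] InnerTexts(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parses a fragment directly; no <body> is required

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null) return new string[0];

        var result = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++) result[i] = nodes[i].InnerText;
        return result;
    }

    public static void Main()
    {
        const string html = "<tag-teste>conteudo</tag-teste>";

        // Lowercase name: the query that worked in this thread.
        Console.WriteLine(InnerTexts(html, "//tag-teste").Length);

        // Mixed-case name: the query from the question, which found nothing.
        Console.WriteLine(InnerTexts(html, "//Tag-Teste").Length);
    }
}
```

Returning an empty array instead of null is a design choice of this sketch; it lets callers iterate without a separate null check.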

Section 4.2 of this page states the XHTML requirement:

  

XHTML documents should use lowercase for all HTML element and attribute names. This difference is necessary because XML is case-sensitive.

For example, <li> and <LI> are different tags.
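The case sensitivity of XML names can be seen with the standard System.Xml classes alone. This is a sketch for illustration, not part of the original answer; the class and method names (XmlCaseDemo, CountMatches) and the wrapping <root> element are hypothetical.

```csharp
using System;
using System.Xml;

public static class XmlCaseDemo
{
    // Counts the elements matched by an XPath expression in an XML string.
    public static int CountMatches(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        // XmlNode.SelectNodes returns an empty list, not null, when nothing matches.
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        const string xml = "<root><tag-teste>conteudo</tag-teste></root>";

        Console.WriteLine(CountMatches(xml, "//tag-teste")); // prints 1
        Console.WriteLine(CountMatches(xml, "//Tag-Teste")); // prints 0: names differ by case
    }
}
```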

¹ The author of the Html Agility Pack mentioned this on Stack Overflow.

    
11.03.2015 / 22:03
2

I was able to extract the content with the code below:

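// the HTML contains <tag-teste>conteudo</tag-teste>
var teste = html.DocumentNode.SelectNodes("//tag-teste");

// SelectNodes returns null when nothing matches, so guard before iterating
if (teste != null)
{
    foreach (var conteudo in teste)
    {
        MessageBox.Show(conteudo.InnerText);
    }
}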
    
11.03.2015 / 20:35