Getting table elements with HtmlAgilityPack

I have this structure repeated several times on the page.

1st Table

<table>
<tbody>
<tr>
<th>titulo</th>
</tr>
</tbody>
</table>

2nd Table

<table>
<tbody>
<tr>
<th>Texto</th>
<th>Texto</th>
<th>Texto</th>
<th>Texto</th>
</tr>
</tbody>
</table>

There are several more tables following this pattern.

How do I load them into an array or a list so I can read the values of each one?

    
asked by anonymous 16.03.2015 / 12:25

1 answer

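1

You can use Fizzler, a companion library for HtmlAgilityPack. It does not replace HtmlAgilityPack; it adds CSS-selector queries on top of the HtmlAgilityPack document, so you can select the elements you want with a selector string. You can install it from NuGet, the same place you get HtmlAgilityPack (the package is Fizzler.Systems.HtmlAgilityPack).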

It would work like this:

using System.Linq;
using System.Net;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack; // QuerySelectorAll and LoadHtml2 extensions

WebClient wc = new WebClient();
wc.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36");
// Download the raw HTML of the page
string pag = wc.DownloadString("URL of the page you want to get the information from");
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml2(pag);
var tabela = doc.DocumentNode.QuerySelectorAll("your CSS selector").ToList();

You can read more about CSS selector syntax, but the basics are:

"#" = element ID (for example "#main"),

"." = element class (for example ".item"),

and plain HTML tag names, such as "table" or "th", select elements by tag.
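Applied to the tables in the question, the tag selectors above could be combined roughly as follows. This is only a sketch, assuming the tables are not nested inside each other and that only the th cells matter; the class name TableReader and the method name ReadHeaderCells are made up for illustration and are not part of either library.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack; // provides the QuerySelectorAll extension

// Hypothetical helper, named only for this example.
public static class TableReader
{
    // Returns one inner list per <table>, holding the trimmed text of each <th> cell.
    public static List<List<string>> ReadHeaderCells(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .QuerySelectorAll("table")              // every table in the document
            .Select(table => table
                .QuerySelectorAll("th")             // every header cell below that table
                .Select(th => HtmlEntity.DeEntitize(th.InnerText).Trim())
                .ToList())
            .ToList();
    }
}
```

Calling TableReader.ReadHeaderCells(pag) on the two tables from the question should give a list with two entries, the first holding "titulo" and the second holding four "Texto" values. If you need arrays instead of lists, replace the ToList() calls with ToArray() and adjust the return type.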

    
16.03.2015 / 14:22