HttpUtility.HtmlDecode(): converting HTML to plain text (at least). Can anyone help?


I'm retrieving a record whose content is HTML, as shown below:

 
 
<p style="text-align: justify;"><span style="font-family: times new     roman,times;"><span style="font-size: medium;"><strong>EDECPJE N&ordm;    </strong> <strong>0800141-19.2014.4.05.0000 - AGTR</strong></span></span></p>
(...)

But I need to save this as plain text (TXT) at the very least; ideally, I would convert the formatting and save everything.

Optimistically, I tried HttpUtility.HtmlDecode(), but to my disappointment it only decoded &nbsp; and a few other insignificant entities, leaving the tags in place.

Any ideas how I can do this correctly?

Thanks in advance.

    
asked by anonymous 05.06.2015 / 17:12

1 answer


Okay, friends, thanks for the effort. I found a satisfactory solution, shown below:

text = System.Text.RegularExpressions.Regex.Replace(text, "<(.|\n)*?>", string.Empty);

This call does not convert any formatting; it simply removes all tags (HTML/XML) quickly and simply. If I find a more elaborate solution (for example, one that removes the tags but preserves the formatting), I will post it here.
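One possible direction for such a solution is sketched below. It is only a minimal sketch, not a definitive implementation: the class and method names (HtmlToText, ToPlainText) are hypothetical, it uses System.Net.WebUtility.HtmlDecode instead of HttpUtility.HtmlDecode so that no reference to System.Web is needed, and it assumes that turning <br> and </p> into line breaks is enough "formatting" for a TXT file. Regex-based stripping also assumes reasonably well-formed markup (for instance, no ">" inside attribute values); a real HTML parser would be more robust.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class HtmlToText
{
    public static string ToPlainText(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        // Turn <br> and closing </p> tags into line breaks so paragraphs survive.
        string text = Regex.Replace(html, @"<br\s*/?>|</p\s*>", "\n", RegexOptions.IgnoreCase);

        // Remove every remaining tag; [^>] also matches newlines inside a tag.
        text = Regex.Replace(text, @"<[^>]*>", string.Empty);

        // Decode entities such as &ordm; and &nbsp; only after the tags are gone,
        // so that escaped text like &lt;b&gt; is kept as literal characters.
        text = WebUtility.HtmlDecode(text);

        // Replace non-breaking spaces, then collapse runs of spaces and tabs.
        text = text.Replace('\u00A0', ' ');
        text = Regex.Replace(text, @"[ \t]+", " ");

        return text.Trim();
    }
}
```

Applied to the record from the question, HtmlToText.ToPlainText(html) should return "EDECPJE Nº 0800141-19.2014.4.05.0000 - AGTR": the tags are gone, &ordm; is decoded, and the extra spaces are collapsed.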

Thanks again.

    
05.06.2015 / 17:40