Extract text between HTML tags with Indy IdHTTP in Delphi


I have an HTML page that contains:

<html>
<head>
<title>Teste</title>
</head>
<body>
<h1>Teste 1</h1>
<h2>Teste 2</h2>
</body>
</html>

I'm downloading the page's content and showing it in a Memo with:

IdHTTP1 := TIdHTTP.Create(nil);
IdHTTP1.Request.Accept := 'text/html, */*';
IdHTTP1.Request.UserAgent := 'Mozilla/3.0 (compatible; IndyLibrary)';
IdHTTP1.Request.ContentType := 'application/x-www-form-urlencoded';
IdHTTP1.HandleRedirects := True;
HTML := IdHTTP1.Get('http://www.site.com/link.html');
Memo1.Text := (HTML);

The problem is that I can't extract the content between the <h1> ... </h1> tags (i.e. "Teste 1") and show it in a label.

    
asked by anonymous 07.09.2014 / 17:57

1 answer


function ExtractText(const aText, OpenTag, CloseTag : String) : String;
{ Returns the text between the first OpenTag and the first CloseTag that
  follows it. The match is case-sensitive.
  Requires StrUtils in the uses clause (for PosEx). }
var
  iAux, kAux : Integer;
begin
  Result := '';

  iAux := Pos(OpenTag, aText);
  if iAux = 0 then
    Exit;
  iAux := iAux + Length(OpenTag);

  // Look for the closing tag only after the opening tag
  kAux := PosEx(CloseTag, aText, iAux);
  if kAux <> 0 then
    Result := Copy(aText, iAux, kAux - iAux);
end;

Parameters:

  • aText : the XML or HTML content to search;
  • OpenTag : the opening tag (in your case, for example, <h1>);
  • CloseTag : the closing tag (in your case, for example, </h1>).

Then call the function like this, for example:

variavelString := ExtractText(Memo1.Text, '<h1>', '</h1>');
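One limitation worth noting: the call above only matches when the page contains the exact string <h1>, so a heading written as <H1> or <h1 class="..."> would not be found. The console program below is a minimal sketch of one way to loosen that, not a full HTML parser. The names ExtractTagText and IsTagNameEnd are hypothetical helpers invented for this example, and the sketch assumes that attribute values contain no '>' character and that tags of the same name are not nested.

```pascal
program ExtractTagDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

{ Hypothetical helper: True when position Index is past the end of S or
  holds a character that can legally follow a tag name. }
function IsTagNameEnd(const S: string; Index: Integer): Boolean;
begin
  Result := (Index > Length(S)) or (Pos(S[Index], '> /'#9#10#13) > 0);
end;

{ Hypothetical helper: returns the text inside the first <TagName ...>
  element. Matching is case-insensitive and the opening tag may carry
  attributes. Returns '' when the element is not found. }
function ExtractTagText(const aText, TagName: string): string;
var
  Lower, Needle: string;
  OpenPos, ContentPos, ClosePos: Integer;
begin
  Result := '';
  // Search a lowercased copy so <H1> and <h1> are treated alike.
  // LowerCase keeps the length, so positions are valid in aText too.
  Lower := LowerCase(aText);
  Needle := '<' + LowerCase(TagName);

  OpenPos := Pos(Needle, Lower);
  // Skip longer tag names that merely start with TagName (e.g. <body> for 'b').
  while (OpenPos <> 0) and not IsTagNameEnd(Lower, OpenPos + Length(Needle)) do
    OpenPos := PosEx(Needle, Lower, OpenPos + 1);
  if OpenPos = 0 then
    Exit;

  // Move past the '>' that ends the opening tag and any attributes.
  ContentPos := PosEx('>', Lower, OpenPos);
  if ContentPos = 0 then
    Exit;
  Inc(ContentPos);

  // Look for the closing tag only after the opening tag.
  ClosePos := PosEx('</' + LowerCase(TagName) + '>', Lower, ContentPos);
  if ClosePos <> 0 then
    Result := Copy(aText, ContentPos, ClosePos - ContentPos);
end;

const
  // The page from the question, with the <h1> altered to show both cases.
  Html = '<html><head><title>Teste</title></head>' +
         '<body><H1 class="big">Teste 1</H1><h2>Teste 2</h2></body></html>';
begin
  WriteLn(ExtractTagText(Html, 'h1'));    // Teste 1
  WriteLn(ExtractTagText(Html, 'h2'));    // Teste 2
  WriteLn(ExtractTagText(Html, 'title')); // Teste
  WriteLn(ExtractTagText(Html, 'h'));     // empty line: <html> and <head> are not <h>
end.
```

In a form, the result can go straight to a label, for example Label1.Caption := ExtractTagText(HTML, 'h1'); where HTML is the string returned by IdHTTP1.Get and Label1 is an assumed component name. For anything beyond simple, well-formed pages, a real HTML parser is likely the safer choice.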

I hope this helps. Cheers!

    
12.09.2014 / 21:25