Open the Chrome Console, and type the following:
$x("//tr[position() > 1]/td[2]/p/span/text()")
This will call the (Ferramentas > Console Javascript)
Javascript function (set for the Chrome console) and return the result of the XPath provided as a parameter.
Explanation of the XPath expression
-
$x
: selects all elements //tr[position() > 1]
of the page except the first
-
tr
: selects only the second element td[2]
(i.e., the second column); you can change the column number or use td
to select the first two columns, for example.
-
td[position()=1 or position()=2]
: selects the element p/span
within the element span
(this is true for the page of your example, for other pages you should check the elements inside the p
)
-
td
selects the contents of the tag.
Use in programming languages
The most practical solution in any programming language involves XPath with some XML library. Ruby example:
require 'nokogiri'
require 'open-uri'
require 'openssl'
OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE
doc = Nokogiri::HTML(open(ARGV[0]).read)
doc.xpath("//tr[position() > 1]/td[2]/p/span/text()").each { |x| puts x}
Usage example
ruby script.rb 'https://www.iomat.mt.gov.br/do/navegadorhtml/mostrar.htm?id=630237&edi_id=3580'