Web scraping with pure JavaScript

4

I want to write a scraper that reads an XML page and extracts a specific value stored under "name", but I'm not sure whether that's possible. So far I have only found out how to do it with Node.js. Can it be done in pure JavaScript, with no external libraries or frameworks?

    
asked by anonymous 05.11.2015 / 18:29

2 answers

3

Nothing stops you from downloading an XML document and parsing its contents. The only problem with doing this in a browser is the same-origin policy, which prevents JavaScript from reading arbitrary addresses on other origins.
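To make the restriction concrete: two URLs share an origin only when their scheme, host, and port all match. The sketch below is just an illustration of that rule; `isSameOrigin` is a hypothetical helper (not a browser API) and the example.com addresses are placeholders. A cross-origin request can still succeed if the other server opts in by sending CORS headers.

```javascript
// Sketch: compare the origin (scheme + host + port) of two absolute URLs.
// `isSameOrigin` is a hypothetical helper name, not part of any browser API.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

var page = "https://example.com/page.html";

// Same scheme, host and port: the page may read this XML.
console.log(isSameOrigin(page, "https://example.com/data.xml"));     // true

// Different host (a subdomain counts as a different origin).
console.log(isSameOrigin(page, "https://api.example.com/data.xml")); // false

// Different scheme.
console.log(isSameOrigin(page, "http://example.com/data.xml"));      // false
```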

    
05.11.2015 / 19:09
3

Yes, it is possible. For example:

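// Build an in-memory XML "file" from the <template> element, so the
// example runs without requesting anything from another server.
var tmplXML = document.getElementById("tmplXML");
var blobXML = new Blob([tmplXML.innerHTML], { type: 'text/xml' });
var urlXML = URL.createObjectURL(blobXML);

var httpRequest = new XMLHttpRequest();

httpRequest.open("GET", urlXML, true);
httpRequest.onreadystatechange = function () {
  // readyState 4 means the request has finished
  if (httpRequest.readyState === 4 && httpRequest.status === 200) {
    // responseXML is the response already parsed into an XML document
    var xml = httpRequest.responseXML;
    // Print the text of the first <p> element
    console.log(xml.getElementsByTagName("p")[0].textContent);
  }
};
httpRequest.send();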

<!-- HTML: the XML that the script reads -->
<template id="tmplXML">
  <?xml version="1.0" encoding="UTF-8"?>
  <text>
    <p>Lorem ipsum dolor sit amet</p>
    <p>Nihil cumque vero</p>
    <p>Impedit quibusdam fuga</p>
    <p>Magnam ad maiores omnis</p>
    <p>Aliqua omnis laborum</p>
  </text>
</template>
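For the case in the question, where the XML is already available as a string, a shorter route is to parse it directly with `DOMParser`. This is only a sketch: it assumes "name" is an element (if it is an attribute, `getAttribute("name")` would be needed instead), and `extractNames` and the sample XML are hypothetical names made up for illustration.

```javascript
// Sketch: collect the text of every <name> element in an XML string.
// `extractNames` is a hypothetical helper; the parser is passed in so the
// function itself does not depend on a global.
function extractNames(xmlText, parser) {
  var doc = parser.parseFromString(xmlText, "application/xml");

  // On malformed XML, browsers return a document containing <parsererror>.
  if (doc.getElementsByTagName("parsererror").length > 0) {
    throw new Error("Could not parse XML");
  }

  var nodes = doc.getElementsByTagName("name");
  var names = [];
  for (var i = 0; i < nodes.length; i++) {
    names.push(nodes[i].textContent);
  }
  return names;
}

// Browser usage; DOMParser is built into browsers but not into plain Node.js.
if (typeof DOMParser !== "undefined") {
  var sample = "<people><person><name>Ana</name></person></people>";
  console.log(extractNames(sample, new DOMParser()));
}
```

Passing the parser as an argument keeps the function easy to exercise outside a browser, for instance with a stand-in object that exposes `parseFromString`.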

But as Pablo has already said, the same-origin policy may get in your way when the XML lives on another site.

Source: Ajax reading XML

    
05.11.2015 / 19:28