(JavaScript) Get a value from a very complex HTML structure

0

I'll try to be direct:

How do I get the value inside a tag that is nested among 200 million other children?

(Assume I cannot change the markup, because it belongs to a site that I need to filter to collect specific data.)

Example:

    <div>
      <div>
        <div></div>
        <div>
          <div></div>
          <div></div>
          <h1></h1>
          <div>
            <div>
              <div>
              <h1></h1>
                <div>
                  <div></div>
                  <div></div>
                  <table>
                    <tbody>
                      <tr></tr>
                      <tr></tr>
                    </tbody>
                  </table>
                  <div>
                    <div></div>
                    <div></div>
                    <div>
                      <div>
                        <div></div>
                        <div>
                          <div>
                            <div></div>
                            <table>
                              <tbody>
                                <tr></tr>
                                <tr></tr> 
                              </tbody>
                            </table>   
                            <div></div>
...........

For example, I need the information from the second table (inside the tr tags). Assuming there is no id or class, how do I get it?

Thank you.

    
asked by anonymous 11.02.2018 / 19:12

2 answers

3

You can do this with jQuery selectors, or by capturing all the matching elements and picking one of them by its index.

document.getElementsByTagName

document.getElementsByTagName('table') returns an HTMLCollection : a live, array-like object (not a true array) containing every table in the document, in document order.

This interface has a length property, which holds the number of elements in the collection.

Besides that property, it provides two methods:

  • HTMLCollection.item() : Returns the element at the given index (from 0 to n-1 ). It is equivalent to document.getElementsByTagName('table')[0] or document.getElementsByTagName('table')[1] .

    If no element exists at that index, item() returns null (bracket access returns undefined ).

  • HTMLCollection.namedItem() : Returns the element whose id attribute matches the given string or, failing that, the element whose name attribute matches it.

    Returns null if no element matches.

const segundaTabela = document.getElementsByTagName('table')[1];
const dadosDaTabela = segundaTabela.getElementsByTagName("td");

// Iterate over every cell of the second table
for (let i = 0; i < dadosDaTabela.length; i++) {
  console.log( dadosDaTabela[i] );
}
<div>
  <div>
    <div></div>
    <div>
      <div></div>
      <div></div>
      <table>
        <tbody>
          <tr></tr>
          <tr></tr>
        </tbody>
      </table>
      <div>
        <div></div>
        <div></div>
        <div>
          <div>
            <table>
              <tbody>
                <tr>
                  <td>Valor 1.1</td>
                  <td>Valor 1.2</td>
                </tr>
                <tr>
                  <td>Valor 2.1</td>
                  <td>Valor 2.2</td>
                </tr>
              </tbody>
            </table>
            <div></div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>
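
As a minimal sketch of the approach above, the lookup can be wrapped in a small function that also copes with a missing table. The name getCellTexts and the root parameter are assumptions made for illustration, not part of the original answer; in a browser you would pass document as root.

```javascript
// Sketch: collect the text of every <td> in the table at a given index.
// `root` is any object exposing getElementsByTagName (e.g. `document`).
// `getCellTexts` is a hypothetical helper name.
function getCellTexts(root, tableIndex) {
  const table = root.getElementsByTagName("table")[tableIndex];
  // Bracket access yields undefined when the index is out of range.
  if (!table) {
    return [];
  }
  const cells = table.getElementsByTagName("td");
  // An HTMLCollection is array-like, so Array.from can map over it.
  return Array.from(cells, (cell) => cell.textContent.trim());
}

// In a browser: getCellTexts(document, 1) reads the second table.
```

Returning an empty array for a missing table is a design choice of this sketch; it keeps the caller from hitting a TypeError when the page has fewer tables than expected.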

document.querySelectorAll

Alternatively, you can use document.querySelectorAll("table") . This method does not return an array either; it returns a static NodeList , an array-like object.

This interface also has a length property with the number of elements found.

It offers more methods than HTMLCollection , though:

  • NodeList.item() : Returns the element at the given index (from 0 to n-1 ). It is equivalent to document.querySelectorAll("table")[0] or document.querySelectorAll("table")[1] .

    If no element exists at that index, item() returns null (bracket access returns undefined ).

  • NodeList.entries() : Returns an iterator over all the key/value (index/node) pairs in the list.

  • NodeList.forEach() : Takes a callback function and calls it once for each item in the list, passing three arguments: the current element, the current index, and the list itself.

  • NodeList.keys() : Like entries , but yields only the keys (indexes).

  • NodeList.values() : Like entries , but yields only the values (nodes).

const segundaTabela = document.querySelectorAll("table");
const dadosDaTabela = segundaTabela[1].querySelectorAll("td");

for (let resultado of dadosDaTabela) {
  console.log( resultado.innerText );
}
<div>
  <div>
    <div></div>
    <div>
      <div></div>
      <div></div>
      <table>
        <tbody>
          <tr></tr>
          <tr></tr>
        </tbody>
      </table>
      <div>
        <div></div>
        <div></div>
        <div>
          <div>
            <table>
              <tbody>
                <tr>
                  <td>Valor 1.1</td>
                  <td>Valor 1.2</td>
                </tr>
                <tr>
                  <td>Valor 2.1</td>
                  <td>Valor 2.2</td>
                </tr>
              </tbody>
            </table>
            <div></div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>
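
To keep the row structure instead of a flat list of cells, the NodeList methods described above can be combined. The following is only a sketch; tableToRows and the root parameter are hypothetical names, and in a browser root would be document.

```javascript
// Sketch: turn the table at `tableIndex` into an array of rows,
// where each row is an array with the text of its <td> cells.
// `root` is any object exposing querySelectorAll (e.g. `document`).
function tableToRows(root, tableIndex) {
  const table = root.querySelectorAll("table")[tableIndex];
  if (!table) {
    return []; // no table at that index
  }
  const rows = [];
  // NodeList.forEach passes (element, index, list); only the element is used here.
  table.querySelectorAll("tr").forEach((row) => {
    const cells = Array.from(row.querySelectorAll("td"), (td) =>
      td.textContent.trim()
    );
    rows.push(cells);
  });
  return rows;
}

// In a browser: tableToRows(document, 1) reads the second table.
```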

jQuery

jQuery makes all of this easier, but it brings a lot of unnecessary code along behind the scenes. If you are already using it in your project (by preference or because a plugin requires it), it is worth using. Otherwise, use plain (vanilla) JavaScript.

With jQuery you can put the index directly in the selector, using :eq . For example:

$("table:eq(1) tr")

Done. This gives you every tr of the second table.

$("table:eq(1) td").each( (index, el) => {
  console.log( el );
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div>
  <div>
    <div></div>
    <div>
      <div></div>
      <div></div>
      <table>
        <tbody>
          <tr></tr>
          <tr></tr>
        </tbody>
      </table>
      <div>
        <div></div>
        <div></div>
        <div>
          <div>
            <table>
              <tbody>
                <tr>
                  <td>Valor 1.1</td>
                  <td>Valor 1.2</td>
                </tr>
                <tr>
                  <td>Valor 2.1</td>
                  <td>Valor 2.2</td>
                </tr>
              </tbody>
            </table>
            <div></div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>
    
11.02.2018 / 20:49
0

Given this structure, you can use document.getElementsByTagName('table') and pick a table by index with square brackets (e.g. [0] , [1] ). To find the tr elements inside it, repeat the process:

document.getElementsByTagName('table')[0].getElementsByTagName('tr')

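Once you have the element you want, use innerHTML to get its content. Note that innerHTML belongs to a single element, not to the collection, so pick a row by index first:

document.getElementsByTagName('table')[0].getElementsByTagName('tr')[0].innerHTML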
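
If you need the content of every row rather than just one, a short sketch along these lines reads innerHTML from each tr in turn. The helper name getRowHtml and the root parameter are assumptions for illustration; in a browser you would pass document.

```javascript
// Sketch: return the innerHTML of each <tr> in the table at `tableIndex`.
// A collection has no innerHTML of its own, so each row is read individually.
// `getRowHtml` is a hypothetical helper name.
function getRowHtml(root, tableIndex) {
  const table = root.getElementsByTagName("table")[tableIndex];
  if (!table) {
    return []; // no table at that index
  }
  const rows = table.getElementsByTagName("tr");
  return Array.from(rows, (row) => row.innerHTML);
}

// In a browser: getRowHtml(document, 1) reads the second table.
```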

Another option, as @Valdeir Psr mentioned, is querySelector . For example, to select only the first table :

document.querySelector('table')

To select all of them as a NodeList (an array-like object):

document.querySelectorAll('table')

Then the process is the same as the first example.

If you prefer jQuery, the selection logic is the same, but you write less:

$("table")

To select only the odd or even rows, you can use:

$('tr:even') // zero-based even indexes: the 1st, 3rd, 5th... rows
$('tr:odd') // zero-based odd indexes: the 2nd, 4th, 6th... rows
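
Because :even and :odd count from zero, it is easy to mix them up. The plain JavaScript sketch below reproduces the same split so the indexing is explicit; splitByParity is a hypothetical helper name, and rows can be any array-like list such as the result of document.querySelectorAll('tr') .

```javascript
// Sketch: split rows the way zero-based :even / :odd filters would.
// `rows` can be any iterable or array-like object (array, NodeList, ...).
function splitByParity(rows) {
  const list = Array.from(rows);
  return {
    // indexes 0, 2, 4... (the 1st, 3rd, 5th rows on screen)
    even: list.filter((_, index) => index % 2 === 0),
    // indexes 1, 3, 5... (the 2nd, 4th, 6th rows on screen)
    odd: list.filter((_, index) => index % 2 === 1),
  };
}

// In a browser: splitByParity(document.querySelectorAll("tr"))
```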

There are many more possibilities, including attribute selectors. The best ready-made list of all selector types that I have found is this one: jQuery Selectors

    
11.02.2018 / 20:33