Filtering td with PHP

4

I have several <td> elements:

<td>Conteúdo</td>
<td>Conteúdo</td>
<td>Conteúdo</td>
<td>Conteúdo</td>
<td>Conteúdo</td>
<td>Conteúdo</td>

I want to filter these <td> elements. For example, I would like to display only the 2nd, 3rd, and 6th. I need this because the table I am pulling comes from another site. How can I do this with PHP?

    
asked by anonymous 18.12.2014 / 20:17

4 answers

4

You can use DOMDocument::loadHTML to parse the HTML content. Example:

$html = <<<EOF
<table>
<thead>
<tr>
<th>Col 1</th>
<th>Col 2</th>
<th>Col 3</th>
<th>Col 4</th>
<th>Col 5</th>
<th>Col 6</th>
</tr>
</thead>
<tbody>
<tr>
<td>Conteudo 1</td>
<td>Conteudo 2</td>
<td>Conteudo 3</td>
<td>Conteudo 4</td>
<td>Conteudo 5</td>
<td>Conteudo 6</td>
</tr>
<tr>
<td>Conteudo 1</td>
<td>Conteudo 2</td>
<td>Conteudo 3</td>
<td>Conteudo 4</td>
<td>Conteudo 5</td>
<td>Conteudo 6</td>
</tr>
</tbody>
</table>
EOF;

// create a new document
$document = new DOMDocument();
// load the HTML
$document->loadHTML($html);
// create an XPath selector
$selector = new DOMXPath($document);
// select every td element
$results = $selector->query('//td');
// print the text content of each td
foreach($results as $node) {
    echo $node->nodeValue . PHP_EOL;
}

Example: Ideone
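
The query above returns every td. To keep only the 2nd, 3rd, and 6th cell of each row, the XPath expression can filter by position. This is a minimal sketch, assuming the cells sit directly inside tr elements; the sample markup and the $cells variable are made up for illustration.

```php
<?php
// Hypothetical sample markup; in practice this would be the HTML pulled from the other site.
$html = '<table><tbody>'
      . '<tr><td>A1</td><td>A2</td><td>A3</td><td>A4</td><td>A5</td><td>A6</td></tr>'
      . '<tr><td>B1</td><td>B2</td><td>B3</td><td>B4</td><td>B5</td><td>B6</td></tr>'
      . '</tbody></table>';

$document = new DOMDocument();
$document->loadHTML($html);
$selector = new DOMXPath($document);

// position() is 1-based and is evaluated among the td children of each tr
$wanted = $selector->query('//tr/td[position() = 2 or position() = 3 or position() = 6]');

// Collect the text content of the selected cells, in document order
$cells = array();
foreach ($wanted as $node) {
    $cells[] = $node->nodeValue;
}

echo implode(PHP_EOL, $cells) . PHP_EOL;
```

Because the predicate is applied per row, the same three columns are picked from every tr in the table.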

    
18.12.2014 / 20:27
4

If I understand what you want to do, you can use a regular expression to capture the td tags: /<td>(.*?)<\/td>/. Note that if the string contains more than one table, the expression would have to be changed.

$str =  "<td>Linha 1</td>" .
        "<td>Linha 2</td>" .
        "<td>Linha 3</td>" .
        "<td>Linha 4</td>" .
        "<td>Linha 5</td>" .
        "<td>Linha 6</td>";

preg_match_all("/<td>(.*?)<\/td>/", $str, $result);

// $result[0] holds every td match found, including the tags themselves
// $result[1] holds only the content of each td found
$linhas = $result[1];

echo $linhas[1] . "<br />"; // prints Linha 2
echo $linhas[2] . "<br />"; // prints Linha 3
echo $linhas[5] . "<br />"; // prints Linha 6
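
Markup pulled from another site often carries attributes on the td tags, which the pattern above would not match. The following is a hedged variation, not part of the original answer: it accepts optional attributes and skips indexes that do not exist. The sample string and the $wantedIndexes name are assumptions.

```php
<?php
// Hypothetical input: some cells carry attributes
$str = '<td class="a">Linha 1</td><td>Linha 2</td><td align="right">Linha 3</td>'
     . '<td>Linha 4</td><td>Linha 5</td><td id="x">Linha 6</td>';

// \b[^>]* accepts optional attributes; i ignores tag case; s lets . match newlines inside a cell
preg_match_all('/<td\b[^>]*>(.*?)<\/td>/is', $str, $result);

// Zero-based indexes of the 2nd, 3rd and 6th cells
$wantedIndexes = array(1, 2, 5);

$linhas = array();
foreach ($wantedIndexes as $index) {
    // Skip indexes the markup does not contain instead of raising a notice
    if (isset($result[1][$index])) {
        $linhas[] = $result[1][$index];
    }
}

echo implode("<br />", $linhas);
```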
    
18.12.2014 / 21:18
2

I think that with a regular expression, if it is possible at all, it would be more complicated or less readable.

But it is easy. Assuming this HTML is in a string, and as an alternative to @abfurlan's solution for the case where the tags should be kept, just split the string on newlines (not on tags), iterate over the pieces, and compare each index:

$str = '<td>Conteúdo #1</td>
<td>Conteúdo #2</td>
<td>Conteúdo #3</td>
<td>Conteúdo #4</td>
<td>Conteúdo #5</td>
<td>Conteúdo #6</td>';

$tds = explode( "\n", $str );

$slice = array();

array_walk(

    $tds,

    function( $entry, $offset ) use( &$slice ) {
        if( in_array( $offset, array( 1, 2, 5 ) ) ) $slice[] = $entry;
    }
);

I preferred array_walk(), but a simple foreach works too:

foreach( $tds as $offset => $entry ) {
    if( in_array( $offset, array( 1, 2, 5 ) ) ) $slice[] = $entry;
}
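
The same selection can also be written without an explicit loop. This is only a sketch of an alternative, assuming the wanted offsets are known up front; it combines array_flip() and array_intersect_key(), two built-in functions the answer itself does not use.

```php
<?php
$str = "<td>Conteúdo #1</td>\n<td>Conteúdo #2</td>\n<td>Conteúdo #3</td>\n"
     . "<td>Conteúdo #4</td>\n<td>Conteúdo #5</td>\n<td>Conteúdo #6</td>";

$tds = explode("\n", $str);

// array_flip() turns the wanted offsets into keys;
// array_intersect_key() keeps only the entries of $tds that have those keys;
// array_values() reindexes the result from zero
$slice = array_values(array_intersect_key($tds, array_flip(array(1, 2, 5))));

print_r($slice);
```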

If there are other elements before and after, things change a bit, because you first have to get to the string you originally posted; you did not say that it was in the middle of more HTML.

For this you have two options:

  • Parse the HTML with DOM or SimpleXML (which is easier):

    $xml = simplexml_load_string( $str );
    
    $tds = $xml -> xpath( '//td' );
    
    $slice = array();
    
    foreach( $tds as $offset => $entry ) {
        if( in_array( $offset, array( 1, 2, 5 ) ) ) $slice[] = (string) $entry;
    }
    

The downside is that you lose the tags.

  • Use a regular expression to find the <td> elements:

    $tds = preg_split( '/.*(<td>.*?<\/td>).*/', $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE );
    

The problem with this approach is that you get a little bit of junk that you need to clean up for offsets to match:

    $tds = array_values( array_filter( array_map( 'trim', $tds ) ) );

array_map() applies trim() to every entry, array_filter() removes the empty entries, and array_values() reindexes the array.

And the iteration remains the same.
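
On the SimpleXML option above, the tags do not have to be lost: calling asXML() with no argument on an element returns its markup as a string. A minimal sketch, assuming the input is well-formed XML with a single root element; the sample string is made up for illustration.

```php
<?php
// Hypothetical well-formed fragment; SimpleXML needs a single root element
$str = '<table><tr>'
     . '<td>Conteudo #1</td><td>Conteudo #2</td><td>Conteudo #3</td>'
     . '<td>Conteudo #4</td><td>Conteudo #5</td><td>Conteudo #6</td>'
     . '</tr></table>';

$xml = simplexml_load_string($str);
$tds = $xml->xpath('//td');

$slice = array();
foreach ($tds as $offset => $entry) {
    if (in_array($offset, array(1, 2, 5))) {
        // asXML() without a filename returns the element, tags included, as a string
        $slice[] = $entry->asXML();
    }
}

print_r($slice);
```

Real-world HTML is frequently not well-formed XML, so in that case DOMDocument::loadHTML is likely the safer parser.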

    
18.12.2014 / 20:41
0

Since it comes from another site, try it with jQuery:

// remove columns 5, 4 and 1 from every row, leaving columns 2, 3 and 6
$("tr td:nth-child(5)").remove();
$("tr td:nth-child(4)").remove();
$("tr td:first-child").remove();
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<table>
  <tr>
    <td>coluna 1</td>
    <td>coluna 2</td>
    <td>coluna 3</td>
    <td>coluna 4</td>
    <td>coluna 5</td>
    <td>coluna 6</td>
  </tr>
</table>
    
18.12.2014 / 21:02