Remove all lines beginning with a specific character


Hello,

I have a project that calls file_get_contents() on an external page, and I would like to know how to remove, for example, all lines that start with //.

The external page contains comments in its JavaScript, and I have not found anything that removes only the comments inside the <script> tag.

Sample of the external page's code:

<!DOCTYPE html>
<html>
<body>
<script>
[...]
// Alerta
alert('Olá!');
[...]
</script>
</body>
</html>

Code I tried to use to remove the comments:

$html = explode(PHP_EOL, file_get_contents('https://example.com/'));
foreach($html as $linha) {
    if(substr($linha, 0, 2) !== '//' && mb_substr($linha, 0, 2) !== '//') {
        $linhas[] = trim($linha);
    }
}
$html = implode(PHP_EOL, $linhas);
echo $html;

When I ran the script in the browser to check whether the comment had been removed, it was unfortunately still there.

I'm using XAMPP version 5.6.35.

    
asked by anonymous 19.04.2018 / 11:46

1 answer


You can use a regular expression to solve this problem. My pattern matches lines that start with a "/" followed by another "/", and I then discard every line that matches it.

Pattern: "/^\/\//"
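
To make the anchoring concrete, here is a minimal sketch of how the pattern "/^\/\//" behaves on individual lines. The helper name isCommentLine and its $allowIndent switch are hypothetical and not part of the answer. The indentation-tolerant variant "/^\s*\/\//" is an assumption about what an external page with indented scripts may need.

```php
<?php
// Hypothetical helper (not from the answer): reports whether a line
// begins with "//", optionally ignoring leading whitespace.
function isCommentLine($line, $allowIndent = false)
{
    // Strict pattern: "//" must be the very first two characters.
    // Lenient pattern (assumed variant): spaces or tabs may come first.
    $pattern = $allowIndent ? '/^\s*\/\//' : '/^\/\//';
    return preg_match($pattern, $line) === 1;
}

var_dump(isCommentLine("// Alerta"));             // bool(true)
var_dump(isCommentLine("    // Alerta"));         // bool(false)
var_dump(isCommentLine("    // Alerta", true));   // bool(true)
var_dump(isCommentLine("alert('Olá!');", true));  // bool(false)
```

Because ^ anchors the match at the start of the line, a "//" that appears later in a line (for example inside a URL) is not treated as a comment by either pattern.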

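<?php

// Read the file and split it into lines.
$html = explode(PHP_EOL, file_get_contents('./teste.html'));
$linhas = []; // lines to keep
foreach ($html as $linha) {
    // Keep only the lines that do not start with "//".
    if (!preg_match("/^\/\//", $linha)) {
        $linhas[] = trim($linha);
    }
}
$html = implode(PHP_EOL, $linhas);
echo $html;
?>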
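
Two things may explain why the code in the question left the comment in place, though neither is confirmed by the thread. First, PHP_EOL is "\r\n" on Windows, so if the fetched page uses "\n" line endings, explode() returns the whole page as a single line. Second, a comment indented inside <script> does not start with "//" at column 0. The sketch below works around both under those assumptions. The function name removeCommentLines and the sample string are hypothetical.

```php
<?php
// Hypothetical helper: drops every line whose first non-blank
// characters are "//", regardless of the line-ending style.
function removeCommentLines($text)
{
    // \R matches \r\n, \n, or \r, so the split does not depend on PHP_EOL.
    $lines = preg_split('/\R/', $text);
    $kept = [];
    foreach ($lines as $line) {
        // Skip lines that are only a "//" comment, even when indented.
        if (!preg_match('/^\s*\/\//', $line)) {
            $kept[] = $line;
        }
    }
    return implode("\n", $kept);
}

// Illustrative input modelled on the sample page in the question.
$sample = "<script>\n    // Alerta\n    alert('Olá!');\n</script>";
echo removeCommentLines($sample);
```

This filters the whole page line by line, not only the contents of <script>, and it does not touch comments that follow code on the same line.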

Example HTML file:

LINHA 1 <br>
LINHA 2 <br>
LINHA 3 <br>
//NAO LEIA <br>
LINHA 4 <br>

Output:

    
19.04.2018 / 13:51