How to create a script to automate replacing links in HTML?

0

I'm performing maintenance on a system that has hundreds of links on one page as follows:

<li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>

Note that there is text outside the a tag. What I want is to move this text inside the a tag, keeping its href, so that the example above ends up like this:

<li> <a href="http://exemplo.com"> Revista alvo </a> </li>

I'm making this change by hand, but there are hundreds and hundreds of links, which makes the task tedious. Does anyone have an idea how I could script this? It can be in any language: PHP, JS, etc. I tried to be as clear as possible; if anything is unclear I will try to explain again. I need help because this is urgent, please!

    
asked by anonymous 14.05.2018 / 23:22

2 answers

4

Update: If you do not want to open the files one by one, you can write a PHP script that scans a directory for HTML files (or whatever other files you need). See:

This script is run from the Terminal / PowerShell and receives one parameter: the directory to be scanned. It uses the glob function to scan the directory, passing it the pattern "{$Dir}/*.html". glob returns an array of matching files, an empty array if nothing matches, and false on error.

  

Before using the script below, make a backup!

<?php
// Count how many arguments were passed.
// The first argument is always the script name.
$CountArgs = count($argv);

// Check whether fewer than 2 arguments were given
if ($CountArgs < 2) {
  echo "Please provide a directory!\n\n";
  exit(1);
}
// Check whether the argument is a directory.
else if ( !is_dir($argv[1]) ) {
  echo "The given parameter is not a directory!\n\n";
  exit(1);
}
// Store the argument in a variable.
$Dir = $argv[1];

// Scan the directory for HTML files, then loop over
// the array and run the function on each file.
// glob() returns false on error, so fall back to an empty array.
foreach (glob("{$Dir}/*.html") ?: [] as $arquivo) {
  alterar_links($arquivo);
}

function alterar_links($Arquivo) {
  // Read the file and store its contents in a variable
  $Conteudo = file_get_contents($Arquivo);
  if ($Conteudo === false) {
    return; // could not read the file; leave it untouched
  }
  // Search with the regular expression
  // and rewrite each match using a callback
  $Alteracoes = preg_replace_callback("|<li>([\w\s]+)<a(.*?)>(.*?)<\/li>|",
    function($retorno) {
      return "<li><a{$retorno[2]}>{$retorno[1]}</a></li>";
    },
    $Conteudo);
  // preg_replace_callback() returns null on error; skip the file
  // rather than overwrite it with empty content
  if ($Alteracoes === null) {
    return;
  }
  // Write the changes back to the file
  file_put_contents($Arquivo, $Alteracoes);
}
  

Important: Note that the replacement leaves no space between the li tag and the a tag: <li><a{$retorno[2]}>{$retorno[1]}</a></li>. Because the pattern requires at least one character between the two tags, running the script on the same file again makes no further changes.
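
The point about re-running can be checked with a small sketch. It is written in JavaScript rather than PHP purely for illustration, and rewriteItem is a hypothetical helper name, not part of the script above; it applies the same pattern and replacement to a single line, twice.

```javascript
// Hypothetical helper mirroring the PHP callback: moves the text that
// precedes the <a> tag inside it and drops the old link text.
function rewriteItem(html) {
  return html.replace(/<li>([\w\s]+)<a(.*?)>(.*?)<\/li>/g,
    (match, before, attrs) => `<li><a${attrs}>${before}</a></li>`);
}

const original = '<li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>';
const once = rewriteItem(original);
const twice = rewriteItem(once);

console.log(once);
// The second pass finds no text between <li> and <a>, so nothing changes.
console.log(once === twice);
```

Since the rewritten item starts with <li><a, the [\w\s]+ group has nothing to match on the second pass, which is why a repeated run should be harmless.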

Sublime Text

You can use Regular Expressions to streamline the process, see:

<li>([\w\s]+)<a(.*?)>(.*?)<\/li>$

Explanation:

  • (.*?) : Captures the original link text inside the a tag, including the closing </a> tag
  • <a(.*?)> : Captures the attributes of the a tag
  • ([\w\s]+) : Captures the text before the a tag
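
To see what each group actually captures, here is a minimal sketch that runs the same expression against the sample line from the question. JavaScript is used only for illustration; the variable names line and m are mine.

```javascript
const line = '<li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>';
const m = line.match(/<li>([\w\s]+)<a(.*?)>(.*?)<\/li>$/);

// JSON.stringify makes the surrounding spaces visible.
console.log(JSON.stringify(m[1])); // " Revista alvo "
console.log(JSON.stringify(m[2])); // " href=\"http://exemplo.com\""
console.log(JSON.stringify(m[3])); // " http://exemplo.com </a> "
```

Group 3 is matched but never used in the replacement, which is how the old link text gets discarded.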

To use it in Sublime Text, press CTRL+H and then ALT+R to enable regular expression search. Put the expression above in the Find field, and in the Replace field put:

<li><a$2>$1</a></li>

Explanation:

  • $1 : Inserts the text captured before the a tag
  • $2 : Inserts the captured attributes of the a tag

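Note that I used [\w\s]+ instead of [a-zA-Z0-9 ]+ because it is broader: \w matches letters, digits and the underscore, and \s matches any whitespace, including tabs and line breaks, whereas [a-zA-Z0-9 ]+ only matches ASCII letters, digits and the space character. Whether \w also matches accented letters depends on the regex engine and its Unicode settings.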
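
A quick way to compare the two character classes is to test them against a few labels. This is only a sketch in JavaScript, where \w is limited to ASCII letters, digits and the underscore; other engines may treat accented letters differently. The sample strings are made up for the comparison.

```javascript
// Anchored versions of the two character classes, so the whole
// string has to match.
const broad = /^[\w\s]+$/;
const narrow = /^[a-zA-Z0-9 ]+$/;

const samples = ['Revista alvo', 'Revista_alvo', 'Revista\talvo', 'Revista-alvo', 'Edição'];
const results = samples.map((s) => ({
  text: s,
  broad: broad.test(s),
  narrow: narrow.test(s),
}));

for (const r of results) {
  console.log(JSON.stringify(r.text), r.broad, r.narrow);
}
```

In this sketch only the underscore and tab cases separate the two classes; the hyphen and the accented label are rejected by both, so items with punctuation or accents before the link may be skipped and are worth checking by hand.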

    
15.05.2018 / 00:40
0

Thanks to everyone who helped me. I was able to solve it with the script below, which I took from an answer on the international Stack Overflow.

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width">
  <title>JS Bin</title>
<script src="https://code.jquery.com/jquery-1.12.4.js"></script>
</head>
<body>
<ul id="linksList">
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo.com"> http://exemplo.com </a> </li>
  <li> Revista alvo <a href="http://exemplo10.com"> http://exemplo.com </a> </li>

  </ul>
  <a href="#" id="changeIt">Change</a>
  <script>
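    $(document).ready(function(){
      $("#changeIt").click(function(){
        $("#linksList li").each(function(){
          // Text of the item up to the URL, i.e. the label before the link
          var txt = $(this).text().split(' http://')[0].trim();
          // Use that label as the link text
          var lnk = $(this).children('a').text(txt);
          // Replace the item's contents with just the link
          $(this).html(lnk);
        });
      });
    });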
  </script>
</body>
</html>
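
The label extraction in that script can be tried without a browser. The sketch below isolates the split/trim step in plain JavaScript; labelFromItemText is a hypothetical name, and the input strings are my assumption of what jQuery's .text() would return for one of the list items.

```javascript
// Hypothetical helper: same expression the click handler applies to
// the text content of each <li>.
function labelFromItemText(text) {
  return text.split(' http://')[0].trim();
}

// Assumed text content of an item: label, then the link text.
const httpItem = ' Revista alvo  http://exemplo.com  ';
const httpsItem = ' Revista alvo  https://exemplo.com  ';

console.log(JSON.stringify(labelFromItemText(httpItem)));
console.log(JSON.stringify(labelFromItemText(httpsItem)));
```

Because the separator is the literal string ' http://', an item whose link text starts with https:// is not split, and the URL stays in the label. If the page mixes both schemes, the separator would need to be adjusted.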
    
15.05.2018 / 14:49