Divide text into parts


I need to extract the pieces of a .txt file that sit between <tag> and </tag>, and I need to do this for every line where such a pair appears.

Example:

<tag1>titulo</tag1>
<tag2>subtitulo</tag2> 

texto... 

<tag2>subtitulo</tag2> 
texto... 

<tag1>titulo</tag1> 
<tag2>subtitulo</tag2> 
texto...

I want to get the text between these tags and save it.

    
asked by anonymous 22.11.2017 / 19:38

1 answer


Try this code:

var pattern = /<tag.*?>(.*?)<\/tag.*?>/g; // pattern the regex will match
var reader = new FileReader(); // reads the txt file
var output = ""; // holds the content of the txt file once it is loaded

reader.onload = function (e) { // runs only after the file has finished loading
    output = e.target.result;
    var m;
    // The matching must happen here: readAsText is asynchronous, so output
    // is still empty on the line right after the readAsText call.
    while ((m = pattern.exec(output)) !== null) { // one match per iteration
        console.log(m[1]); // shows capture group 1 in the browser console
    }
};

// filePath is assumed to be an <input type="file"> element on your page
reader.readAsText(filePath.files[0]);

Explanation of the code:

  • Loads the txt file through a FileReader.

  • Stores the file's content in the output variable.

  • Prints every piece of content that matches the regex to the browser console.
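
The same three steps can also be sketched with Blob.text(), a Promise-based alternative to FileReader. This is a swapped-in technique, not the method used in the answer, and the function name extractFromFile is hypothetical. It returns the captured texts in an array instead of logging them, which may be more convenient if you want to save them afterwards.

```javascript
// Hypothetical helper: reads a File (or any Blob) and resolves with the text
// found inside each <tag...>...</tag...> pair.
async function extractFromFile(file) {
  // Blob.text() resolves with the whole content decoded as UTF-8
  const text = await file.text();
  // Same regex as in the answer; created per call so lastIndex starts at 0
  const pattern = /<tag.*?>(.*?)<\/tag.*?>/g;
  // matchAll yields one match object per occurrence; keep capture group 1
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Usage in the browser, assuming filePath is an <input type="file"> element:
// extractFromFile(filePath.files[0]).then((parts) => console.log(parts));
```

Because the function only resolves after the file has been read, the matching cannot run on an empty string by accident.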

Explanation of regex:

/<tag.*?>(.*?)<\/tag.*?>/g;
  • <tag Matches the literal sequence <tag.
  • .*?> Matches (without capturing) as few characters as possible up to the first occurrence of >, so tag names such as tag1 and tag2 are both accepted.
  • (.*?)<\/tag Captures in group 1 all content up to the first occurrence of </tag.
  • .*?> Matches (without capturing) as few characters as possible up to the first occurrence of > in the closing tag.
  • /g The global flag: the regex keeps looking for matches until the end of the text instead of stopping at the first one.
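
To see the regex at work without a file, here is a minimal sketch that runs it against the example text from the question. The helper name extractTagContents and the sample variable are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helper: returns the text captured between each tag pair.
function extractTagContents(text) {
  const pattern = /<tag.*?>(.*?)<\/tag.*?>/g; // regex from the answer
  const results = [];
  let m;
  // With the g flag, each exec() call resumes after the previous match
  while ((m = pattern.exec(text)) !== null) {
    results.push(m[1]); // capture group 1: the content between the tags
  }
  return results;
}

// Example text from the question
const sample = [
  "<tag1>titulo</tag1>",
  "<tag2>subtitulo</tag2>",
  "",
  "texto...",
  "",
  "<tag2>subtitulo</tag2>",
  "texto...",
  "",
  "<tag1>titulo</tag1>",
  "<tag2>subtitulo</tag2>",
  "texto...",
].join("\n");

console.log(extractTagContents(sample));
// prints [ 'titulo', 'subtitulo', 'subtitulo', 'titulo', 'subtitulo' ]
```

Note that . does not match line breaks by default, so an opening and closing tag must be on the same line to match. If your tags can span several lines, adding the s (dotAll) flag is one option to consider.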
24.11.2017 / 01:13