Capturing an indented block with a regular expression

3

I would like help writing a regex that matches a single marker line together with all the lines indented beneath it.

Example:

aaa
abab aca
marcacao
   aaa
   abab aca
   cc
cc
bb

Given the input above, the regex should return:

marcacao
   aaa
   abab aca
   cc

My code is written in JavaScript, so I use .match() with my regexes.

[edited]

I described my problem better in the comments below.

This is my real code:

DOCTYPE html
html
    head
        title gulp-gotohead
        style.
            article {border:1px solid}
        style(data-above-the-fold="true").
            body {font-size:100%}
            body{font-size:100%}
            body.main{font-size:100%}
            body, h1{font-size:100%}
            body>h1{font-size:100%}
            header {color:#333}
        script(data-above-the-fold="true").
            var head = Head();
            head();
    body (data-d="true")
        h1 gulp-gotohead
        p
            span regex

Here is my marker:

style(data-above-the-fold="true")

And this is the result I want:

style(data-above-the-fold="true").
     body {font-size:100%}
     body{font-size:100%}
     body.main{font-size:100%}
     body, h1{font-size:100%}
     body>h1{font-size:100%}
     header {color:#333}

The best I have managed so far is to capture everything from my marker downward (link).

    
asked by anonymous 30.08.2014 / 02:33

3 answers

2

If you really want to use a regular expression, I think the following regex does what you want:

^(\s*)style\(data-above-the-fold="true"\)\.\n(\1\s+.*\n)*

The ^(\s*) captures the indentation before the marker. Since this is the first capture group in the regex, we can refer back to it with \1. The ^ ensures that we start at the beginning of a line (in JavaScript this requires the m flag) and avoids unnecessary backtracking.

The \s+ requires each following line to be indented further than the marker line. Be careful not to mix tabs with spaces.

Finally, I added \n here and there, since . does not match a line break.

A variation worth considering is to replace every \s with a literal space, \t or [ \t], depending on your opinion about mixing tabs with spaces. This prevents line breaks from being treated as indentation.
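
As a rough sketch of how this could be used from JavaScript, the snippet below applies the [ \t] variation with the m flag to a shortened copy of the template from the question. The names source, blockRegex and extractMarkedBlock are made up for this illustration and are not part of the original answer.

```javascript
// Sample input: a shortened copy of the template from the question.
const source = [
  "html",
  "    head",
  "        style.",
  "            article {border:1px solid}",
  '        style(data-above-the-fold="true").',
  "            body {font-size:100%}",
  "            header {color:#333}",
  '        script(data-above-the-fold="true").',
  "            var head = Head();",
  "    body",
].join("\n");

// ^([ \t]*)  captures the marker's own indentation as group 1 (^ needs the m flag)
// \1[ \t]+   each following line must repeat that indentation plus at least
//            one more space or tab
const blockRegex =
  /^([ \t]*)style\(data-above-the-fold="true"\)\.\n(?:\1[ \t]+.*\n)*/m;

// Hypothetical helper: returns the matched block, or null when the marker is absent.
function extractMarkedBlock(text) {
  const found = text.match(blockRegex);
  return found ? found[0] : null;
}

console.log(extractMarkedBlock(source));
```

Because every block line must end with \n, a block that sits at the very end of the input without a trailing line break would lose its last line; appending a line break to the input before matching is one way around that.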

    
01.09.2014 / 01:46
1

This works.

var resultado = seuTexto.match(/^marcacao(\s\s+.+$)+/m )[0];

Edit: This answer was written before the OP explained the whole problem, hence the simplicity of the RegExp. The problem as now described is considerably more complex, so this answer is for illustration only.
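
For illustration, here is the same expression run against the first example from the question. The null check is an addition of this sketch (.match() returns null when nothing matches, so indexing [0] directly would throw), and encontrado is a name chosen here.

```javascript
// The simple example from the question.
const seuTexto = [
  "aaa",
  "abab aca",
  "marcacao",
  "   aaa",
  "   abab aca",
  "   cc",
  "cc",
  "bb",
].join("\n");

// Each repetition consumes a line break (\s), at least one more whitespace
// character (\s+), and then the rest of the line (.+$ with the m flag).
const encontrado = seuTexto.match(/^marcacao(\s\s+.+$)+/m);
const resultado = encontrado ? encontrado[0] : null;

console.log(resultado);
```

Note that \s\s+ only asks for some leading whitespace on each following line. It does not compare that whitespace with the marker's own indentation, which is why it does not carry over to a marker that is itself indented.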

    
30.08.2014 / 21:40
0

Maybe you can solve your problem with a regex, but I believe the requirement that the subsequent lines be indented further than the first would make the regex very complicated. I would solve this problem by writing the code by hand.

In pseudocode:

marcação = 'style(data-above-the-fold="true")';

function get_line(){
   takes the next line of input and returns two values:
   - the number of spaces at the start of the line and
   - a string with the rest of the line
}

repeat {
    root_indent, root_data = get_line();
} until(root_data == marcação)

styles = [];
loop {
    indent, style = get_line();
    if (indent <= root_indent) { break }
    styles.push(style);
}

One advantage of solving it this way rather than with a regex is that the output of the algorithm is a structured list rather than one big string, and hand-written code is more flexible than a regex if you need to change the logic later.
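
A minimal JavaScript sketch of the pseudocode above might look like the following. The function names splitIndent and collectIndentedBlock are invented for this example, and two assumptions differ slightly from the pseudocode: the marker is compared with startsWith (the real line ends in a "." that the marker string lacks), and a tab counts as one character of indentation.

```javascript
// Split a line into its indentation width and the remaining text.
function splitIndent(line) {
  const indent = line.match(/^[ \t]*/)[0].length;
  return { indent, text: line.slice(indent) };
}

// Return the lines nested under the first line whose text starts with
// `marker`, or null when no such line exists.
function collectIndentedBlock(source, marker) {
  const lines = source.split("\n").map(splitIndent);
  const rootIndex = lines.findIndex((line) => line.text.startsWith(marker));
  if (rootIndex === -1) return null;

  const rootIndent = lines[rootIndex].indent;
  const children = [];
  for (const line of lines.slice(rootIndex + 1)) {
    // Stop at the first line that is not indented deeper than the marker.
    if (line.indent <= rootIndent) break;
    children.push(line.text);
  }
  return children;
}

// Sample input: a shortened copy of the template from the question.
const source = [
  "    head",
  "        style.",
  "            article {border:1px solid}",
  '        style(data-above-the-fold="true").',
  "            body {font-size:100%}",
  "            header {color:#333}",
  '        script(data-above-the-fold="true").',
  "            var head = Head();",
].join("\n");

const styles = collectIndentedBlock(source, 'style(data-above-the-fold="true")');
console.log(styles);
```

As written, an empty line has an indentation of zero and therefore ends the block; skipping blank lines instead would be a one-line change in the loop.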

    
31.08.2014 / 20:16