How to capture repetitions of a certain group with regex?

1

I'm trying to parse some structured data, and I need to capture a certain group as many times as it occurs.

The format of the data is as follows:

  

foo01 @ key1 | value1 # keyN | valueN

The first group is an alphanumeric value, separated from the other groups by the @ character; from the first group I only want foo01.

The second group is the one that repeats: each attribute and its value are separated by |, and attributes are separated from one another by #. After that there is no more information, only attributes in the format "key1 | value1 # keyN | valueN".

Below you can see what I have so far, but I was not able to capture each attribute separately.

const regex = /(^[a-zA-Z0-9_]*@)([a-zA-Z0-9_]*\|[a-zA-Z0-9_]*)/g;
const str = 'foo01@chave1|valor1#chaveN|valorN';
let m;

while ((m = regex.exec(str)) !== null) {
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }

  m.forEach((match, groupIndex) => {
    console.log(`Found, group ${groupIndex}: ${match}`);
  });
}
Note: I used JS only because it made adding the snippet easy.

I would like to know how to capture all occurrences of "key1 | value1 # keyN | valueN" within a string, regardless of how many are present.

    
asked by anonymous 13.09.2017 / 22:03

3 answers

1
  

How to capture repetitions of a given group with regex?

There are 2 alternatives:

  • Call a previously named capture group (a subroutine call, written \g'name' in PCRE): you give a capture group a name and then repeat its pattern as many times as needed using greedy, lazy, or possessive quantifiers.
  • Create 2 capture groups: an inner one that captures the sequence you want and an outer one that contains only the inner group and a quantifier. In your case you would use a greedy quantifier.

Answer 1

(?'foo'\w*)@(?'Todos_Atrib_Val'(?'Atrib_Val'\w*\|\w*#{0,1})(?'Atrib_Val_Recursivo'\g'Atrib_Val')*)

I understand that such a long regex can look daunting, but it becomes much easier to read once you paste it into the regex101 site or isolate the named capture groups and analyze them one by one. Naming the groups also makes the code easier for other programmers to maintain and read.
Here you can see this regex in action. I recommend that you look at the "Match Information" panel and note how the line of thought is organized.
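JavaScript's regex engine has no \g'name' subroutine call, so the pattern above will not run there as written. A rough sketch of the same idea, assuming an engine with named groups (ES2018+): define the repeated sub-pattern once as a string and interpolate it. The names attribVal, recordPattern, foo, and all are illustrative, not part of the original answer.

```javascript
// Sub-pattern for one "key|value" pair with an optional trailing '#'.
// Defined once and reused, as a stand-in for the PCRE subroutine call.
const attribVal = String.raw`\w*\|\w*#?`;

// Named groups: 'foo' is the part before '@', 'all' is every pair together.
const recordPattern = new RegExp(
  String.raw`^(?<foo>\w*)@(?<all>(?:${attribVal})*)$`
);

const named = 'foo01@chave1|valor1#chaveN|valorN'.match(recordPattern);

console.log(named.groups.foo); // foo01
console.log(named.groups.all); // chave1|valor1#chaveN|valorN
```

The individual pairs still have to be pulled out of the "all" group in a second step, because a repeated group only exposes its last iteration.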

Answer 2

(\w*)@((\w*\|\w*#{0,1})*)

This follows the same line of thinking, but without naming the groups and without calling a previous group's pattern.

  • Group 1 captures the sequence before @.
  • Group 2 captures everything that Group 3 matches, greedily collecting all the repetitions.
  • Group 3 matches the sequence key1 | value1 # N times, but only stores the last repetition.

You can check that it works the same as the first example here.
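To make the "only stores the last" behavior concrete, here is a small sketch in JavaScript (the language of the question). It uses the Answer 2 pattern with added ^ and $ anchors and #? in place of #{0,1}; those changes and the variable names are my own assumptions, not part of the original answer.

```javascript
// Answer 2 pattern, anchored so the whole string must match.
const repeated = /^(\w*)@((\w*\|\w*#?)*)$/;
const sample = 'foo01@chave1|valor1#chaveN|valorN';

const found = sample.match(repeated);

console.log(found[1]); // foo01
console.log(found[2]); // chave1|valor1#chaveN|valorN
console.log(found[3]); // chaveN|valorN  (last repetition only)

// Group 2 holds every pair, so split it to recover them individually.
const pairList = found[2]
  .split('#')
  .filter(Boolean) // drop an empty piece left by a trailing '#'
  .map((pair) => pair.split('|'));

console.log(pairList); // [ [ 'chave1', 'valor1' ], [ 'chaveN', 'valorN' ] ]
```

The regex validates the overall shape, while the split on group 2 does the actual per-attribute extraction.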

    
14.09.2017 / 16:03
2

Do you need to capture everything in a single regular expression?

I believe I got close to what you need with the expression below. I left the @ and # outside the capture groups:

const regex = /(^\w*)?[@#](\w*\|\w*)/g

But I believe that in this case split() would make everything much simpler to understand.

var campos = str.split('@');
var inicio = campos[0];
var lista = campos[1].split('#');

console.log(inicio, lista);
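Taking the split idea one step further, the following sketch turns a line into an id plus a key/value object. parseRecord is a hypothetical helper introduced only for illustration, and it assumes the line contains a single @ and that keys do not repeat.

```javascript
// Hypothetical helper: parses "id@key1|value1#keyN|valueN" with split only.
function parseRecord(line) {
  // Everything before '@' is the id; everything after is the attribute list.
  const [id, rest = ''] = line.split('@');
  const attributes = {};

  for (const pair of rest.split('#')) {
    if (pair.trim() === '') continue; // skip empty pieces
    const [key, value] = pair.split('|');
    attributes[key.trim()] = (value || '').trim();
  }

  return { id: id.trim(), attributes };
}

const record = parseRecord('foo01@chave1|valor1#chaveN|valorN');
console.log(record.id);         // foo01
console.log(record.attributes); // { chave1: 'valor1', chaveN: 'valorN' }
```

The trim() calls let the same function accept the spaced form from the question, such as 'foo01 @ key1 | value1 # keyN | valueN'.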
    
14.09.2017 / 00:18
2

The way I know how to do it is with split()...

var arrayStr = str.split(/[@#|]/)

This creates an array whose elements are the pieces of the string that were separated by @, # or |.

And to access the elements:

arrayStr[0]
arrayStr[1]
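As a quick illustration of what such a split returns, here is a sketch with illustrative variable names. It also shows that String.prototype.split keeps the separators in the result when the pattern contains a capturing group, and drops them when a character class is used.

```javascript
const line = 'foo01@chave1|valor1#chaveN|valorN';

// Character class: separators are dropped from the result.
const parts = line.split(/[@#|]/);
console.log(parts); // [ 'foo01', 'chave1', 'valor1', 'chaveN', 'valorN' ]

// Capturing group: separators are kept as their own elements.
const withSeparators = line.split(/(@|#|\|)/);
console.log(withSeparators);
// [ 'foo01', '@', 'chave1', '|', 'valor1', '#', 'chaveN', '|', 'valorN' ]

// parts[0] is the id; the rest alternate key, value, key, value...
const id = parts[0];
const keyValues = [];
for (let i = 1; i + 1 < parts.length; i += 2) {
  keyValues.push([parts[i], parts[i + 1]]);
}
console.log(id, keyValues);
```

Walking the array two elements at a time relies on every key having a value; a malformed line would shift the pairing.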
    
14.09.2017 / 00:25