I am trying to write a robots.txt parser using a regular expression, but I cannot get the expression right. I ran several tests on Regex101 and still have not achieved the expected result.
My regular expression:
/user-agent: (bot|\*)\n*((disallow:\s*(?<disallow>.*)|allow:\s*(?<allow>.*)|sitemap:\s*(?<sitemap>.*))\n*)+/gi
My test input:
User-agent: *
Disallow: /exemplo/
Allow: /dolor/
Disallow: /sit/
Allow: /amet/
Sitemap: http://www.loremipsum.com/sitemap.xml
In the image you can see the result that Regex101 returns and the one I wanted it to return.
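To reproduce this outside Regex101, here is a minimal sketch in Python. Python is an assumption on my part, so the named groups are rewritten as `(?P<name>...)`, which is the syntax Python's `re` module requires. The names `last_captures`, `all_directives`, `BLOCK`, and `LINE` are hypothetical helpers made up for this illustration. The first function runs the expression from above and shows that a group inside a repeated `(...)+` holds only one value after the match, not one value per repetition. The second is one possible workaround, assuming the goal is to collect every Disallow, Allow, and Sitemap value: match one directive per line and gather the matches.

```python
import re

# The sample robots.txt from above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /exemplo/
Allow: /dolor/
Disallow: /sit/
Allow: /amet/
Sitemap: http://www.loremipsum.com/sitemap.xml
"""

# The expression from the question, using Python's (?P<name>...) group syntax.
BLOCK = re.compile(
    r"user-agent: (bot|\*)\n*"
    r"((disallow:\s*(?P<disallow>.*)"
    r"|allow:\s*(?P<allow>.*)"
    r"|sitemap:\s*(?P<sitemap>.*))\n*)+",
    re.IGNORECASE,
)

# Hypothetical alternative: one directive per match instead of one big match.
# [ \t]* is used instead of \s* so an empty value cannot swallow the next line.
LINE = re.compile(
    r"^(?P<key>disallow|allow|sitemap):[ \t]*(?P<value>.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def last_captures(text):
    """Run the original expression; each named group holds a single value."""
    match = BLOCK.search(text)
    return match.groupdict() if match else {}


def all_directives(text):
    """Collect every value per directive by matching line by line."""
    found = {"disallow": [], "allow": [], "sitemap": []}
    for match in LINE.finditer(text):
        found[match.group("key").lower()].append(match.group("value"))
    return found


if __name__ == "__main__":
    print(last_captures(ROBOTS_TXT))
    print(all_directives(ROBOTS_TXT))
```

This sketch ignores the User-agent grouping, so with several User-agent blocks in one file the line matches would still need to be grouped per block.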