I am writing automated tests for a legacy project built on the MVC template. One of the tests is required to capture every fixed (hard-coded) string in the HTML and JS code, because the company is internationalizing its content and moving its fixed strings into resource files.
I made this regex: ([\n]|^)(?<Value>(?!.*?\/\/|.*?@\*|.*?@.*?@|.*?\/\*|.*?<!--|.*?\\*)([^\n]*?)[áâãàéêèíîìóôõòúûù].*)
It partially solves my problem: it identifies lines of code that contain accented characters, capturing them only if they are not inside comments.
Since no HTML or JS functions use accents, I can assume that these matches are fixed strings.
After doing this, I was able to identify some pages that have fixed strings that should be transformed into resource files, but this regex does not cover all cases.
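To make the limitation concrete, here is a rough Python sketch of the same accent heuristic. It is a simplified, line-based version of the idea and not a translation of the regex above; the function name `lines_with_accents`, the list of comment markers, and the sample input are assumptions made for illustration.

```python
import re

# Comment markers for JS, Razor and HTML (an assumed, simplified list).
COMMENT_MARKERS = re.compile(r"//|/\*|@\*|<!--")
# The same accented characters used in the regex above.
ACCENTED = re.compile(r"[áâãàéêèíîìóôõòúûù]")


def lines_with_accents(source: str) -> list[str]:
    """Return lines that contain an accented character and no comment marker."""
    hits = []
    for line in source.splitlines():
        if COMMENT_MARKERS.search(line):
            continue  # the line looks like a comment, skip it
        if ACCENTED.search(line):
            hits.append(line.strip())
    return hits


sample = """<h1>Relatório de vendas</h1>
// comentário de código
<p>Total</p>
alert("Operação concluída");"""

print(lines_with_accents(sample))
```

In this sample the heading and the `alert` call are reported, the commented line is skipped, and `<p>Total</p>` is missed because it has no accent. That last case is the one I still need to cover.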
I would like a regex that:
- Captures fixed strings in HTML and JS code even when they contain no accented characters.
- Ignores strings that appear inside comments.
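For illustration, the behaviour I am after could be sketched in Python as several small steps instead of one regex: strip the comments first, then look for quoted literals inside `<script>` blocks and for text between HTML tags. This is only a sketch under assumptions. The comment syntaxes handled, the helper name `fixed_strings`, and the sample page are hypothetical, and it does not handle comment markers that appear inside string literals, protocol-relative URLs such as `//cdn...`, or attribute values.

```python
import re

# Assumed comment syntaxes: HTML <!-- -->, Razor @* *@, JS /* */ and // to end of line.
# The (?<!:) lookbehind keeps URLs such as http://... from being treated as comments.
COMMENTS = re.compile(r"<!--.*?-->|@\*.*?\*@|/\*.*?\*/|(?<!:)//[^\n]*", re.DOTALL)
# Body of each <script> element.
SCRIPT = re.compile(r"<script\b[^>]*>(.*?)</script>", re.DOTALL | re.IGNORECASE)
# A single- or double-quoted JS literal; group 2 is the text between the quotes.
JS_STRING = re.compile(r"""(["'])((?:\\.|(?!\1).)*?)\1""")
# Text sitting between the end of one tag and the start of the next.
HTML_TEXT = re.compile(r">([^<>]+)<")


def fixed_strings(source: str) -> list[str]:
    """Return candidate fixed strings found outside comments."""
    code = COMMENTS.sub("", source)
    found = []
    # JS literals are only searched inside <script> blocks.
    for body in SCRIPT.findall(code):
        found += [m.group(2) for m in JS_STRING.finditer(body)]
    # HTML text nodes are searched in the markup that remains.
    markup = SCRIPT.sub("", code)
    found += [m.group(1).strip() for m in HTML_TEXT.finditer(markup)]
    # Keep only candidates that contain at least one letter.
    return [s for s in found if any(ch.isalpha() for ch in s)]


page = """<!-- <p>Old title</p> -->
<h1>Sales report</h1>
<script>
  // alert("debug");
  alert("Done"); /* "ignored" */
</script>
<p>Total</p>"""

print(fixed_strings(page))
```

Here the strings inside the three comments are dropped, while `Done`, `Sales report` and `Total` are reported even though none of them has an accent.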
Is there some particularity of the syntax that could help me delimit where the regex should capture, so that it identifies these strings?