Replace with a regular expression (regex) ignoring accents


I recently switched to friendly URLs, and as a result the query string used for searches now arrives with the accents stripped. The database queries still find the word whether or not it has an accent. However, I used to highlight the searched words with a Replace:

    Replace(texto, palavra, "<b>" & palavra & "</b>")

In short, how can I use ereg_replace() so that it ignores accents? Example:

    Texto = "Este é um filme de Ação e tem muita ação. "
    Palavra = "acao"
    texto = Ereg_replace(RegEx_condition_that_ignores_accents_and_case, texto, palavra, "<b>" & palavra & "</b>")
    ' Result I need:
    "Este é um filme de <b>Ação</b> e tem muita <b>ação</b>."

Thanks in advance for your attention

EDIT, to explain better:

I have an unaccented word coming from the query string as a search term, for example: toquio. I need a substitution that highlights this word in the text whether or not it is accented there. So in a text containing Toquio, Tóquio, tóquio, or toquio, found by the search term toquio, each occurrence needs to be replaced by itself wrapped in the <b>...</b> tag, like this:

Before: Fui pra Toquio.
It needs to become: Fui pra <b>Toquio</b>.

But if it is: Fui pra Tóquio.
It needs to become: Fui pra <b>Tóquio</b>.

Or: Fui pra tóquio.
It needs to become: Fui pra <b>tóquio</b>.

Or: Fui pra toquio.
It needs to become: Fui pra <b>toquio</b>.

This should work regardless of whether the user typed toquio, Toquio, Tóquio, or tóquio.

@david, is it clearer now?

    
asked by anonymous 24.12.2015 / 12:42

1 answer

  

Note: I do not know ASP very well, so I will answer with the logic I propose; the code may not be 100% correct.

I suggest building a regex from the search term by replacing each letter of the word with a character class containing all variations of that letter (uppercase, lowercase, accented), and then using that regex in Ereg_replace.

    Function CriarRegex(palavra)
        palavra = eregi_replace("a", "[aAáÁâÂàÀäÄãÃ]", palavra)
        palavra = eregi_replace("e", "[eEéÉêÊèÈëË]", palavra)
        palavra = eregi_replace("i", "[iIíÍîÎìÌïÏ]", palavra)
        palavra = eregi_replace("o", "[oOóÓôÔòÒöÖõÕ]", palavra)
        palavra = eregi_replace("u", "[uUúÚûÛùÙüÜ]", palavra)
        palavra = eregi_replace("c", "[cCçÇ]", palavra)
        CriarRegex = "(" & palavra & ")"
    End Function

(eregi_replace should be case-insensitive; however, since I do not know whether that holds for accented letters, I listed every combination in the code above.)
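As a minimal sketch of this logic in Python rather than ASP (the names `VARIANTS` and `criar_regex` are my own, for illustration only), the pattern can be built one character at a time, so that character classes already inserted are never scanned again. It assumes Python 3, where `re.IGNORECASE` also applies to accented letters in `str` patterns, so only the lowercase variants are listed:

```python
import re

# Hypothetical table: each base letter maps to its lowercase accented variants.
# Uppercase forms are left to re.IGNORECASE at match time.
VARIANTS = {
    "a": "aáâàäã",
    "e": "eéêèë",
    "i": "iíîìï",
    "o": "oóôòöõ",
    "u": "uúûùü",
    "c": "cç",
}


def criar_regex(palavra: str) -> str:
    """Build a pattern in which each base letter also matches its accented forms."""
    partes = []
    for letra in palavra.lower():
        if letra in VARIANTS:
            partes.append("[" + VARIANTS[letra] + "]")
        else:
            # Escape anything else so regex metacharacters are taken literally.
            partes.append(re.escape(letra))
    # One capture group around the whole word, as in CriarRegex.
    return "(" + "".join(partes) + ")"


padrao = criar_regex("acao")
print(padrao)  # ([aáâàäã][cç][aáâàäã][oóôòöõ])
print(bool(re.search(padrao, "Ação", re.IGNORECASE)))  # True
```

Escaping the remaining characters is an addition of mine; it keeps a search term such as "a.c" from being read as a regex.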

When using it, replace each match with the contents of the first capture group, so that the word inserted is the one actually found in the text, not the search term:

    Texto = "Este é um filme de Ação e tem muita ação. "
    Palavra = "acao"
    Regex = CriarRegex(Palavra)
    texto = eregi_replace(Regex, "<b>$1</b>", texto)
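The same usage can be sketched in Python rather than ASP, purely as an illustration: `destacar` and `VARIANTS` are hypothetical names, and `\1` plays the role of `$1`. It assumes Python 3, where `re.IGNORECASE` also covers accented letters.

```python
import re

# Hypothetical table of lowercase accented variants per base letter.
VARIANTS = {"a": "aáâàäã", "e": "eéêèë", "i": "iíîìï",
            "o": "oóôòöõ", "u": "uúûùü", "c": "cç"}


def destacar(texto: str, palavra: str) -> str:
    """Wrap every accent- and case-insensitive match of palavra in <b> tags."""
    padrao = "".join(
        "[" + VARIANTS[ch] + "]" if ch in VARIANTS else re.escape(ch)
        for ch in palavra.lower()
    )
    # \1 re-inserts the text that was actually matched, not the search term,
    # so the original accents and capitalization are preserved.
    return re.sub("(" + padrao + ")", r"<b>\1</b>", texto, flags=re.IGNORECASE)


print(destacar("Este é um filme de Ação e tem muita ação. ", "acao"))
# Este é um filme de <b>Ação</b> e tem muita <b>ação</b>.
```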

By the way, the code above will also highlight matches inside longer words, such as "cri<b>ação</b>", which may or may not be what you want. If you only want to highlight whole words, use a word boundary in your regex (if supported by
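A sketch of the whole-word variant, again in Python and with hypothetical names (`destacar_palavra_inteira`, `VARIANTS`). It assumes Python 3, where `\b` treats accented letters as word characters; other regex engines may handle accented letters at a boundary differently, so this is worth testing in the engine you actually use.

```python
import re

# Hypothetical table of lowercase accented variants per base letter.
VARIANTS = {"a": "aáâàäã", "e": "eéêèë", "i": "iíîìï",
            "o": "oóôòöõ", "u": "uúûùü", "c": "cç"}


def destacar_palavra_inteira(texto: str, palavra: str) -> str:
    """Highlight palavra only where it appears as a whole word."""
    padrao = "".join(
        "[" + VARIANTS[ch] + "]" if ch in VARIANTS else re.escape(ch)
        for ch in palavra.lower()
    )
    # \b on both sides rejects matches embedded in a longer word.
    return re.sub(r"\b(" + padrao + r")\b", r"<b>\1</b>", texto,
                  flags=re.IGNORECASE)


print(destacar_palavra_inteira("A criação de um filme de ação.", "acao"))
# A criação de um filme de <b>ação</b>.
```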

24.12.2015 / 15:52