You can put these treatments in a helper class: define them as extension methods on the string type and chain the ones you need, removing whichever characters you consider irrelevant. For example:
    using System.Globalization;
    using System.Text;
    using System.Text.RegularExpressions;

    public static class StringHelper
    {
        // Removes diacritics: decompose to FormD, then drop the combining marks.
        public static string RemoverAcentos(this string texto)
        {
            StringBuilder retorno = new StringBuilder();
            foreach (char letra in texto.Normalize(NormalizationForm.FormD))
            {
                if (CharUnicodeInfo.GetUnicodeCategory(letra) != UnicodeCategory.NonSpacingMark)
                    retorno.Append(letra);
            }
            return retorno.ToString();
        }

        // Removes tabs and spaces.
        public static string RemoverEspacamentos(this string texto)
        {
            return texto.Replace("\t", "").Replace(" ", "");
        }

        // Lowercases the text and keeps only unaccented letters, digits, and whitespace.
        public static string RemoverCaracteresEspeciais(this string texto)
        {
            string retorno = texto.RemoverAcentos();
            return Regex.Replace(retorno.ToLowerInvariant(), @"[^a-z0-9\s]", "");
        }
    }
Then use it like this:
    // RemoverCaracteresEspeciais already lowercases, so no extra ToLower() is needed.
    string entrada = "São Paulo SP";
    string entradaNormalizada = entrada.RemoverCaracteresEspeciais()
                                       .RemoverEspacamentos();

    string cadastro = "Cidade de São Paulo - SP";
    string cadastroNormalizado = cadastro.RemoverCaracteresEspeciais()
                                         .RemoverEspacamentos();

    bool comparacao = cadastroNormalizado.Contains(entradaNormalizada); // true
Still, this is only the first part of your journey. Even after these basic treatments, the comparison only succeeds when the input is contained in the stored value as a whole, with its terms in the same order. If the input is, for example, "I live in the city of são paulo" or "SP - São Paulo", the comparison will be false.
From this point on, you should enrich your engine with a hit score: count how many terms of A appear in B, and use that score to decide whether the comparison is valid.
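A hit score along these lines could be sketched as follows. This is only a minimal illustration, not a finished engine: the class `HitScore` and its methods `Termos` and `Pontuar` are hypothetical names, and the choice to score as "fraction of input terms found in the stored value" is an assumption you may well want to change.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class HitScore
{
    // Hypothetical helper: strips accents, lowercases, and returns the distinct terms.
    public static HashSet<string> Termos(string texto)
    {
        var semAcentos = new StringBuilder();
        foreach (char letra in texto.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(letra) != UnicodeCategory.NonSpacingMark)
                semAcentos.Append(letra);
        }

        // Punctuation becomes a space so that "SP-Sao" still splits into two terms.
        string limpo = Regex.Replace(
            semAcentos.ToString().ToLowerInvariant(), @"[^a-z0-9\s]", " ");

        return new HashSet<string>(limpo.Split(
            new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries));
    }

    // Fraction (0.0 to 1.0) of the terms in `entrada` that also occur in `cadastro`.
    public static double Pontuar(string entrada, string cadastro)
    {
        HashSet<string> termosEntrada = Termos(entrada);
        if (termosEntrada.Count == 0)
            return 0.0;

        HashSet<string> termosCadastro = Termos(cadastro);
        int acertos = termosEntrada.Count(t => termosCadastro.Contains(t));
        return (double)acertos / termosEntrada.Count;
    }
}
```

With the stored value "Cidade de São Paulo - SP", `HitScore.Pontuar("SP - São Paulo", ...)` gives 1.0, because term order no longer matters, while "I live in the city of são paulo" gives 0.25 (2 of its 8 terms match). Where to draw the acceptance threshold, and whether to ignore filler words before scoring, depends on your data.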
If you need something more sophisticated, you will have to adopt a search API that meets your needs, such as Lucene or RedDog.Search (https://github.com/reddog-io/RedDog.Search).