I wasn't going to answer this question, haha, but ...
Come on: do you want something fun to play with, or something professional?
Professional - There is no way around it: you will have to work with neural networks, and that demands a well-crafted artificial intelligence. The text is segmented word by word, and each word receives a grammatical classification (verb, adjective, noun, pronoun, etc.). This splits your whole sentence into grammatical categories, which are later used to build up the context as a whole. It's really quite complex ...
Child's play - In the early days of the internet, the search engines of the late 90s had no AI to interpret a phrase typed into the search box. So how did they do it? Levenshtein is the answer: a widely used algorithm for finding close / similar words. For example, if a person types "abacachi", how would we know they meant "abacaxi" (pineapple)? Levenshtein handles this: it calculates the minimum edit distance between two words. Take your example:
"Vai chover amanhã?"
"Vou precisar de uma guarda chuva amanhã?"
Imagine that your system has only a small knowledge base containing the words chuva (rain), temperatura (temperature) and clima (climate), that is, just three words to trigger a weather-forecast action. How would those three words in your small bank make sense of your sentences?
A: Segment the sentence into words and apply the Levenshtein algorithm to each one. In the first sentence, the word chover (to rain) gets a good score when compared with chuva from your bank, i.e. chover stands out with the best score among the words of the phrase. That already hints that the phrase may be about the weather, and you could take an action depending on the scores of the other words in the sentence. The same goes for the second sentence, where the algorithm returns a distance of 0 for the word chuva, since the two words are identical ...
Here are the expected Levenshtein results for your first sentence:
chuva => vai = distance of 4
chuva => chover = distance of 3
chuva => amanhã = distance of 6
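If you want to check those numbers yourself, here is a minimal Python sketch of the classic dynamic-programming Levenshtein distance, applied word by word to both sentences. The helper names (`levenshtein`, `tokenize`, `distances`) are my own, and the tokenizer is a naive assumption (lowercase, split on spaces, strip punctuation), not something a real system should rely on as-is.

```python
import unicodedata


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions
    or substitutions needed to turn a into b."""
    # previous[j] = distance between the prefix of a seen so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def tokenize(sentence):
    """Naive segmentation (an assumption): NFC-normalize, lowercase,
    split on whitespace and strip surrounding punctuation."""
    text = unicodedata.normalize("NFC", sentence).lower()
    words = (w.strip("?!.,;:") for w in text.split())
    return [w for w in words if w]


def distances(keyword, sentence):
    """Map each word of the sentence to its distance from the keyword."""
    return {word: levenshtein(keyword, word) for word in tokenize(sentence)}


first = distances("chuva", "Vai chover amanhã?")
second = distances("chuva", "Vou precisar de uma guarda chuva amanhã?")
print(first)   # {'vai': 4, 'chover': 3, 'amanhã': 6}
print(second["chuva"])  # 0
```

Note that accents count as differences here ("ã" is not "a"); whether to strip them before comparing is a design choice that depends on your data.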
The shortest distance belongs to the word chover: 3 is an excellent score for a word with 6 characters. Now look at the word vai: it has only 3 characters but a distance of 4, so we can say it is completely different from chuva and can be discarded right away. That should give you an idea of how it all works; this algorithm is still widely used ...
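One way to turn the "good score for its length" idea into code is to divide the distance by the length of the longer word and discard anything above a cutoff. This is only a sketch under my own assumptions: the `THRESHOLD` of 0.5, the `normalized`, `best_match` and `is_weather_question` helpers, and the simple tokenization are all hypothetical choices, not a standard recipe.

```python
def levenshtein(a, b):
    """Edit distance, repeated here so the sketch runs on its own."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                  # deletion
                current[j - 1] + 1,               # insertion
                previous[j - 1] + (ch_a != ch_b)  # substitution or match
            ))
        previous = current
    return previous[-1]


KEYWORDS = ["chuva", "temperatura", "clima"]  # the three-word bank from the text
THRESHOLD = 0.5  # assumed cutoff; tune it on real sentences


def normalized(a, b):
    """Distance divided by the longer length:
    0.0 means identical, 1.0 means nothing in common."""
    return levenshtein(a, b) / max(len(a), len(b))


def best_match(sentence):
    """Return (word, keyword, score) for the closest word/keyword pair."""
    words = [w.strip("?!.,;:").lower() for w in sentence.split()]
    pairs = [(w, k, normalized(w, k)) for w in words if w for k in KEYWORDS]
    return min(pairs, key=lambda pair: pair[2])


def is_weather_question(sentence):
    """Treat the sentence as weather-related if any word is close enough."""
    return best_match(sentence)[2] <= THRESHOLD


print(best_match("Vai chover amanhã?"))  # ('chover', 'chuva', 0.5)
print(best_match("Vou precisar de uma guarda chuva amanhã?"))  # ('chuva', 'chuva', 0.0)
```

With this scoring, vai lands at 4 / 5 = 0.8 against chuva and gets discarded, while chover at 3 / 6 = 0.5 just passes the assumed cutoff.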