First you break the sentence into words:
words = texto.lower().split()
With this list of words, just iterate over it, pairing each word with the one that follows it. To save yourself some work, you can use collections.defaultdict, which creates a dictionary of lists for you. The code looks like this:
import collections
adjacente = collections.defaultdict(list)
for i, word in enumerate(words[:-1]):
    next_word = words[i + 1]
    adjacente[word].append(next_word)
Note that we slice with [:-1] so the loop only visits the first n - 1 words, since the last word has no word following it.
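And the result (both 'We' and 'we' appear as keys, so this output was produced without the .lower() call; with it, those entries would be merged):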
adjacente
defaultdict(list,
{'But': ['at'],
'We': ['are', 'are'],
'are': ['not', 'not', 'not'],
'at': ['least'],
'be': ['We', 'But'],
'least': ['we'],
'need': ['to'],
'not': ['what', 'what', 'what'],
'should': ['be'],
'to': ['be', 'be'],
'used': ['to'],
'we': ['should', 'need', 'are', 'used'],
'what': ['we', 'we', 'we']})
If you want the adjacent words to be unique, change the defaultdict factory from list to set and, instead of append, use add(next_word) (update also works if you pass it a list containing next_word).
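A minimal sketch of the set-based variant described above. The sample sentence in texto is made up for illustration and is not the text from the question; everything else follows the same loop as before.

```python
import collections

# Hypothetical sample text, used only for illustration
texto = "we are what we are not what we were"

words = texto.lower().split()

# defaultdict(set) creates an empty set the first time a key is accessed
adjacente = collections.defaultdict(set)
for i, word in enumerate(words[:-1]):
    next_word = words[i + 1]
    # add() inserts a single element; update([next_word]) has the same effect
    adjacente[word].add(next_word)

print(dict(adjacente))
```

Since sets are unordered, the order in which the adjacent words are printed may vary between runs; if you need a stable order, sort each set before displaying it.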