TypeError: object of type 'NoneType' has no len()

I'm trying to apply the NMF algorithm to a CSV file and then extract the sentences associated with each topic:

import pandas
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

def display_topics(model, feature_names, no_top_words):
    for topic_idx, topic in enumerate(model.components_):
        print "Topic %d:" % (topic_idx)
        print " ".join([feature_names[i]
                    for i in topic.argsort()[:-no_top_words - 1:-1]])

textos = pandas.read_csv('teste_nmf.csv', encoding = 'utf-8')
textos_limpos = textos['frase_limpa']
textos_bruts = textos['frase_brut']
textos_bruts_list = textos_bruts.values.tolist()
textos_limpos_list = textos_limpos.values.tolist()

tfidf_vectorizer = TfidfVectorizer()
tfidf = tfidf_vectorizer.fit_transform(textos_limpos_list)
tfidf_feature_names = tfidf_vectorizer.get_feature_names()


# n_components: number of topics
nmf = NMF(n_components = 2, random_state = 1, alpha = .1, l1_ratio = .5, init = 'nndsvd').fit(tfidf)


# Number of words per topic
no_top_words = 2

# Display the topics with their words
print 'NMF'
topics = display_topics(nmf, tfidf_feature_names, no_top_words)
print topics

# Extract the sentences linked to each topic
for topic in range(len(topics)): #TypeError: object of type 'NoneType' has no len()
    print "Topic {}:".format(topic)
    docs = np.argsort(document_topics[:, topic])[::-1]
    for text in docs[:3]:
        text_brut = " ".join(textos_bruts_list[text].split(",")[:2])
        print " ".join(textos_limpos_list[text].split(",")[:2]) + ',' + text_brut

A sample (rough) dataset:

frase_limpa,frase_brut
manga fruta gostosa,a manga é uma fruta gostosa  
computador objeto importante,o computador é um objeto importante
banana fruto popular,a banana é um fruto popular
lapis coisa importante,o lapis é uma coisa importante
uva roxa,a uva é roxa
telefone objeto mundial,o telefone é um objeto mundial

My result:

NMF
Topic 0:
important object
Topic 1:
purple grape
None
Traceback (most recent call last):
  File "test_NMF.py", line 55, in
TypeError: object of type 'NoneType' has no len()

What I expected, more or less:

Topic 0:
important object
Topic 1:
purple grape
Topic 0:
Important computer object, the computer is an important object
world object phone, the phone is a worldwide object
Lapis important thing, lapis is an important thing
Topic 1:
Purple grape, the grape is purple

    
asked by anonymous 14.09.2018 / 16:00

1 answer

The problem is that the display_topics function only prints; it has no return or yield statement, so every call returns None.

topics = display_topics(nmf, tfidf_feature_names, no_top_words)

This means the variable topics is None, because that is what display_topics() returns.

for topic in range(len(topics)): 

This line tries to compute len() of topics, which is None, and that raises the TypeError.
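
One possible fix is to make the function return the topics instead of only printing them. The sketch below is a minimal illustration, not the asker's final code: get_topics and top_documents are hypothetical names, and the small lists stand in for nmf.components_ and for the document-topic matrix. The question never defines document_topics or imports np; I am assuming document_topics was meant to be nmf.transform(tfidf). The sketch uses plain Python lists so it runs without scikit-learn, and the print() calls work on both Python 2 and 3.

```python
def get_topics(components, feature_names, no_top_words):
    """Return the top words of each topic instead of only printing them."""
    topics = []
    for weights in components:
        # Indices of the largest weights, highest first
        top = sorted(range(len(weights)),
                     key=lambda i: weights[i], reverse=True)[:no_top_words]
        topics.append([feature_names[i] for i in top])
    return topics


def top_documents(document_topics, topic, n_docs):
    """Return the indices of the n_docs documents that weigh most on a topic."""
    order = sorted(range(len(document_topics)),
                   key=lambda d: document_topics[d][topic], reverse=True)
    return order[:n_docs]


# Toy stand-ins (made-up numbers). With the real model, the assumption is
# that you would pass nmf.components_ and nmf.transform(tfidf) instead.
feature_names = ["fruta", "objeto", "importante", "uva"]
components = [[0.0, 0.9, 0.8, 0.0],
              [0.7, 0.0, 0.0, 0.6]]
document_topics = [[0.1, 0.8],
                   [0.9, 0.0],
                   [0.7, 0.1]]

topics = get_topics(components, feature_names, 2)
for topic_idx, words in enumerate(topics):
    print("Topic %d: %s" % (topic_idx, " ".join(words)))
    print("  top documents: %s" % top_documents(document_topics, topic_idx, 2))
```

Because get_topics returns a list, len(topics) is now valid, and the indices returned by top_documents can be used to look up the matching entries in textos_bruts_list and textos_limpos_list.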

    
14.09.2018 / 18:49