How to store a result in a dataframe?

I compute the difference between two columns of a DataFrame and want to store the result in a third column. The calculation itself runs, but the result is not stored in the DataFrame.

def predictions(train):

    print("cosine_sim")
    train["cosine_sim"] = train.apply(cosine_sim, axis = 1)
    print("diff")

    i = 0
    for index, row in train.iterrows():
        i += 1
        row["diff"] = row["quest_emb"] - row["sent_emb"]
        if i % 10000 == 0:
            print("row ",i)
            print("row[\"diff\"] ",row["diff"])        
    print("euclidean_dis")
    print(train)

The print("row[\"diff\"] ", row["diff"]) call inside the loop gives me, at row i:

row  10000
row["diff"]  [[-0.00541345 -0.00239381  0.00431296 ... -0.01337912 -0.0073709
   0.        ]]
row  20000
row["diff"]  [[-0.03855522 -0.00136002 -0.02514186 ... -0.06655771 -0.02910786
  -0.02423212]
 [-0.03762216 -0.031567   -0.01083523 ... -0.01431298 -0.03401132
  -0.01916602]]

But the resulting column is filled with NaN:

                                                sent_emb  \
0      [[0.030376578, 0.044331014, 0.081356354, 0.062...   
1      [[0.030376578, 0.044331014, 0.081356354, 0.062...   
2      [[0.030376578, 0.044331014, 0.081356354, 0.062...   
3      [[0.030376578, 0.044331014, 0.081356354, 0.062...   
...  
16289  [[0.035860058, 0.049851194, 0.0662197, 0.02581...   

                                               quest_emb  \
0      [[0.01491953, 0.021973763, 0.021364095, 0.0393...   
1      [[0.04444952, 0.028005758, 0.030357722, 0.0375...   
2      [[0.03949683, 0.04509903, 0.018089347, 0.07667...   
3      [[0.03284301, 0.01849968, 0.020346267, 0.03835...   
...  
16289  [[0.03924892, 0.04188699, 0.025356837, 0.04136...   

                                              cosine_sim  diff  
0      [0.1401391625404358, 0.11776834726333618, 0.09...   NaN  
1      [0.12254136800765991, 0.08665323257446289, 0.0...   NaN  
2      [0.09432470798492432, 0.06841456890106201, 0.0...   NaN  
3      [0.1274968981742859, 0.09279131889343262, 0.08...   NaN  
...
16289  [0.060139477252960205, 0.07225644588470459, 0....   NaN  

I also tried a function, but it did not even create the column.
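
One likely cause, going by the pandas documentation for iterrows: the row yielded by the loop can be a copy rather than a view, so writing to row["diff"] may have no effect on train. A minimal sketch of an alternative is below. It builds all the differences first and assigns them as a whole column. The two-row frame is made-up stand-in data, not the real embeddings, and it assumes each cell holds a NumPy array whose shapes broadcast against each other, as the printed output suggests.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: every cell holds a 2-D NumPy array.
train = pd.DataFrame({
    "quest_emb": [
        np.array([[1.0, 2.0, 3.0]]),
        np.array([[0.5, 0.5, 0.5]]),
    ],
    "sent_emb": [
        np.array([[0.5, 1.0, 1.5]]),
        np.array([[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]]),
    ],
})

# Compute one difference per row in a plain Python list, then assign the
# list as a new column. Nothing is written to the temporary row objects
# that iterrows() would hand out.
train["diff"] = [
    quest - sent
    for quest, sent in zip(train["quest_emb"], train["sent_emb"])
]

print(train["diff"].iloc[0])
```

The same pattern should carry over to the function-based attempt: have the function return the difference for one row, collect the return values in a list, and assign that list to train["diff"] in a single step.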

    
asked by anonymous 17.08.2018 / 14:39