When I run sklearn.metrics.pairwise.cosine_similarity on a dataframe, the result comes back with a default index 0, 1, 2... and default column names 0, 1, 2...
How can I get the result labelled with the original index and column names instead?
Dataframe for calculation:
user_id | age | education | income | length_residence
-----------------------------------------------------------------------
NIODB6S3 | 43.769912 | 1.537634 | 58.754647 | 7.232344
BOAWG65L | 43.769912 | 1.537634 | 58.754647 | 7.232344
3667B8P0 | 20.000000 | 1.000000 | 40.000000 | 4.000000
VS53SKY5 | 35.000000 | 1.537634 | 75.000000 | 14.000000
Code I ran:
pd.DataFrame(cosine_similarity(df))
Expected:
user_id | NIODB6S3 | BOAWG65L | 3667B8P0
user_id |
----------------------------------------------
NIODB6S3 | 1.000000 | 0.000084 | 0.996848
BOAWG65L | 0.000084 | 1.000000 | 0.000342
3667B8P0 | 0.996848 | 0.000342 | 1.000000
Got:
| 0 | 1 | 2
--------------------------------------
0 | 1.000000 | 0.000084 | 0.996848
1 | 0.000084 | 1.000000 | 0.000342
2 | 0.996848 | 0.000342 | 1.000000
I'm not sure whether the default numeric index preserves the original order of 'user_id' in df.
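For reference, here is a minimal sketch of one way the labels might be carried over. It assumes user_id is the index of df (if it is an ordinary column, df.set_index("user_id") would be needed first), and the sample frame is rebuilt by hand from the table above. cosine_similarity returns a plain NumPy array whose row i and column i correspond to row i of the input, so the default 0, 1, 2... labels follow the original row order and df.index can be reused for both axes.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Sample data copied from the table in the question.
# Assumption: user_id is the index of df, not a regular column.
df = pd.DataFrame(
    {
        "age": [43.769912, 43.769912, 20.0, 35.0],
        "education": [1.537634, 1.537634, 1.0, 1.537634],
        "income": [58.754647, 58.754647, 40.0, 75.0],
        "length_residence": [7.232344, 7.232344, 4.0, 14.0],
    },
    index=pd.Index(
        ["NIODB6S3", "BOAWG65L", "3667B8P0", "VS53SKY5"], name="user_id"
    ),
)

# The returned array has shape (n_rows, n_rows) and keeps the input row order,
# so the original index can label both the rows and the columns.
sim = pd.DataFrame(
    cosine_similarity(df.to_numpy()),
    index=df.index,
    columns=df.index,
)
print(sim)
```

With the labels attached, a single pair can be looked up by name, for example sim.loc["NIODB6S3", "3667B8P0"].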