Merging two pandas data frames with `pandas.merge` does not give the result I expect:
**dfmatrix**:
… young label filename
0 … 1 neg cv005_29357
1 … 0 neg cv006_17022
2 … 0 neg cv007_4992
3 … 1 neg cv008_29326
4 … 1 neg cv009_29417
**dfscores**:
filename score
0 cv005_29357 -10
1 cv006_17022 5
dfnew = pandas.merge(dfmatrix, dfscores, on='filename', how='outer', left_index=False, right_index=False)
**dfnew**:
… young label filename score_y
0 … 0 neg cv005_29357 NaN
1 … 1 neg cv006_17022 NaN
2 … 0 neg cv007_4992 NaN
3 … 0 neg cv008_29326 NaN
4 … 1 neg cv009_29417 NaN
Expected output:
**dfnew**:
… young label filename score_y
0 … 0 neg cv005_29357 -10
1 … 1 neg cv006_17022 5
2 … 0 neg cv007_4992 NaN
3 … 0 neg cv008_29326 NaN
4 … 1 neg cv009_29417 NaN
What am I doing wrong?
Update: this post suggests that `merge` is the right way to join two data frames.
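For anyone trying to reproduce this, here is a minimal sketch of the merge along with a check for one common cause of an all-NaN result: key values that look identical when printed but do not compare equal (stray whitespace, or a dtype mismatch). This is an assumption, not a confirmed diagnosis. The frames are rebuilt by hand from the columns shown above, the elided columns are left out, and the trailing space in `dfscores['filename']` is added deliberately to illustrate the failure. Note also that the `score_y` suffix in the output suggests `dfmatrix` already contains a `score` column; the sketch omits it, so the merged column is simply called `score`.

```python
import pandas as pd

# Hand-built stand-ins for the two frames; only the columns shown above.
dfmatrix = pd.DataFrame({
    "young": [1, 0, 0, 1, 1],
    "label": ["neg"] * 5,
    "filename": ["cv005_29357", "cv006_17022", "cv007_4992",
                 "cv008_29326", "cv009_29417"],
})
dfscores = pd.DataFrame({
    # Assumed for illustration: keys carry a trailing space.
    "filename": ["cv005_29357 ", "cv006_17022 "],
    "score": [-10, 5],
})

# indicator=True adds a _merge column saying where each row came from
# ("left_only", "right_only", or "both"). No "both" rows means no key matched.
check = pd.merge(dfmatrix, dfscores, on="filename", how="outer", indicator=True)
print(check["_merge"].value_counts())

# Normalize the key column in both frames: same dtype, no surrounding whitespace.
for df in (dfmatrix, dfscores):
    df["filename"] = df["filename"].astype(str).str.strip()

# A left merge keeps every row of dfmatrix and fills unmatched scores with NaN.
dfnew = pd.merge(dfmatrix, dfscores, on="filename", how="left")
print(dfnew)
```

If the `_merge` column on the real data shows no `both` rows, comparing `repr()` of a few `filename` values from each frame should reveal how the keys differ.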