I want to combine two dataframes:

df1:

     money      house       points      day         hour        min         sec
0   -0.099322   -0.023973   -0.830284   -0.078535   -0.479580   -0.590838   -1.519931
1   -0.100334   -0.023973   -0.391713   -0.078535   0.742059    0.680058    -1.230736
2   -0.085211   1.138251    -0.830284   0.352633    1.047469    0.853362    0.909305
3   -0.062503   1.525660    -0.830284   1.322761    1.352878    1.488810    -1.635608
4   -0.100325   0.750843    0.736043    1.538345    -0.937695   -1.399590   -1.288575

(323925 rows × 7 columns)

df2:

    A   B   C   D   E   F   G   H   I   J
0   1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
1   0.0 1.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
2   1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
3   1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
4   1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0

(323925 rows × 10 columns)

I tried to use pd.concat(), but I end up with many NaN values. I don't have a column to join the two dataframes on, because I simply want to combine them horizontally. Could anyone please tell me why there are so many NaN values even though the two dataframes have the same number of rows?

pd.concat([df1, df2], axis = 1)

    money   house   points  day hour    min sec A   B   C   D   E   F   G   H   I   J
0   -0.099322   -0.023973   -0.830284   -0.078535   -0.479580   -0.590838   -1.519931   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1   -0.100334   -0.023973   -0.391713   -0.078535   0.742059    0.680058    -1.230736   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2   -0.085211   1.138251    -0.830284   0.352633    1.047469    0.853362    0.909305    NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3   -0.062503   1.525660    -0.830284   1.322761    1.352878    1.488810    -1.635608   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4   -0.100325   0.750843    0.736043    1.538345    -0.937695   -1.399590   -1.288575   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
324172  NaN NaN NaN NaN NaN NaN NaN 1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
324173  NaN NaN NaN NaN NaN NaN NaN 1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
324174  NaN NaN NaN NaN NaN NaN NaN 1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
324175  NaN NaN NaN NaN NaN NaN NaN 0.0 1.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0
324176  NaN NaN NaN NaN NaN NaN NaN 1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0

(324177 rows × 17 columns)

Wu Kris

1 Answer

The two dataframes may not share the same index values, so you can try

pd.concat([df1, df2], axis = 1, ignore_index=True)
Henry Ecker
tugrultosun
  • I tried that, but it still has many NaN values, and the number of rows becomes 324177 rows instead of 323925 rows – Wu Kris May 26 '21 at 21:01
  • Can you check your dataframes' indexes with df.tail()? This post might also be helpful: https://stackoverflow.com/questions/40339886/pandas-concat-generates-nan-values – tugrultosun May 26 '21 at 21:03
  • I see, it works now after fixing the index values. Thank you – Wu Kris May 26 '21 at 21:07
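
For later readers, here is a minimal sketch of what "fixing the index values" might look like. The thread does not show the exact fix that was used, so this is an assumption: the small dataframes below are hypothetical stand-ins for df1 and df2, and resetting both indexes with reset_index(drop=True) is one common way to do it. It pairs rows purely by position, so it only makes sense if the rows of both dataframes are already in the same order. Note also that with axis=1, ignore_index=True relabels the columns rather than the rows, which would explain why it did not remove the NaN values.

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2: same number of rows,
# but the row labels (index values) do not all match.
df1 = pd.DataFrame(
    {"money": [-0.10, -0.09, -0.06], "house": [0.0, 1.1, 1.5]},
    index=[0, 1, 2],
)
df2 = pd.DataFrame(
    {"A": [1.0, 0.0, 1.0], "B": [0.0, 1.0, 0.0]},
    index=[0, 1, 5],  # label 5 has no partner in df1
)

# A quick check: do the two dataframes carry identical row labels?
same_index = df1.index.equals(df2.index)

# concat with axis=1 aligns rows on their labels, so labels that appear
# in only one dataframe produce extra rows filled with NaN.
misaligned = pd.concat([df1, df2], axis=1)

# Dropping the old labels gives both dataframes a fresh 0..n-1 index,
# so rows are paired by position instead.
aligned = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)],
    axis=1,
)

print(same_index)
print(misaligned.shape, aligned.shape)
```

If the index values carry meaning (for example, they survived a filtering step), it may be better to find out why they differ before discarding them.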