
I'm concatenating two pandas DataFrames that have exactly the same columns but different numbers of rows. I'd like to stack the first DataFrame on top of the second.

When I do the following, I get many NaN values in some of the columns. I've tried the fix from this post, using .reset_index, but I'm still getting NaN values. My DataFrames have the following columns:

[Image: screenshot listing the DataFrame columns]

The first one, rem_dup_pre, and the second one, rem_dup_po, have shapes (54178, 11) and (83502, 11), respectively.

I've tried this:

concat_mil = pd.concat([rem_dup_pre.reset_index(drop=True), rem_dup_po.reset_index(drop=True)], axis=0)

and I get NaN values, for example in 'Station Type', where previously there were no NaN values in either rem_dup_pre or rem_dup_po:

[Image: screenshot of the concatenated result showing NaN values]

How can I simply concat them without NaN values?

halfer
Katie Melosto
  • Can you share a few sample rows from both dataframes? I am unable to reproduce the problem. – Joe Ferndz Mar 11 '21 at 22:35
  • Have a look at [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and [edit] your question to include a [mcve] with sample input and expected output as text in the body of your question, not as a picture or external link – G. Anderson Mar 11 '21 at 22:37
  • Try `df = pd.concat([df1, df2]).reset_index(drop=True)` instead of what you have. – Joe Ferndz Mar 11 '21 at 22:41

1 Answer

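Here's how I did it, and I don't get any additional NaNs.

import pandas as pd
import numpy as np

# df1 already contains some NaN values in columns 'b' and 'c'
df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                    'b': ['a', 'b', 'c', 'd', np.nan, np.nan],
                    'c': ['x', np.nan, np.nan, np.nan, 'y', 'z']})
# df2 has the same column labels, filled with random integers from 0 to 9
df2 = pd.DataFrame(np.random.randint(0, 10, (3, 3)), columns=list('abc'))
print(df1)
print(df2)

# Stack df1 on top of df2, then renumber the rows from 0
df = pd.concat([df1, df2]).reset_index(drop=True)
print(df)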

The output of this is:

DF1:

   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z

DF2:

   a  b  c
0  4  8  4
1  8  4  4
2  2  8  1

DF: after concat

   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z
6  4    8    4
7  8    4    4
8  2    8    1
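
Since the example above stacks cleanly, one possible explanation for the NaNs in the question (an assumption, as the question does not include the actual data) is that the column labels of the two frames are not exactly identical, for instance because of a trailing space. When stacking rows, pd.concat aligns on column labels, so a label that exists in only one frame becomes its own column filled with NaN for the other frame's rows. The frames and labels below are hypothetical and only sketch how to check for this:

```python
import pandas as pd

# Hypothetical data: 'post' has a trailing space in one column label.
pre = pd.DataFrame({"Station Type": ["A", "B"], "Count": [1, 2]})
post = pd.DataFrame({"Station Type ": ["C"], "Count": [3]})

# Stacking aligns on column labels, so the two spellings become
# two separate columns, each padded with NaN.
stacked = pd.concat([pre, post], ignore_index=True)
print(stacked.isna().sum())

# Labels that appear in only one of the two frames.
mismatched = pre.columns.symmetric_difference(post.columns)
print(list(mismatched))

# Normalise the labels (here: strip whitespace) and stack again.
post_fixed = post.rename(columns=lambda c: c.strip())
fixed = pd.concat([pre, post_fixed], ignore_index=True)
print(fixed)
```

If the list of mismatched labels is empty for the real data, the cause lies elsewhere, and sample rows would be needed to narrow it down.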
Joe Ferndz