0

I have 2 dataframes and i want to concat each other as follows:

df1:

index    394               min     FIC-2000      398           min       FFC
0       Recycle Gas        min       20K20       Compressor    min       20k
1       TT                 date       kg/h       AT            date       ..
2       nan              2011-03-02   -20.7                    2011-03-02
                         08:00:00                              08:00:00
3       nan              2011-03-02   -27.5                      ...
                         08:00:10


df2:

index    Unnamed:0    0       1  ..     394         395   .....
 0        Service     Prop   Prop1      Recycle Gas  RecG

the output df3 should be like this:

df3

index    Unnamed:0    0        ..     394                            395..   
0        Service     Prop       Recycle Gas                          RecG
1                               Recycle Gas       min     FIC-2000
2                                                 min       20K20
3                                       TT        date       kg/h
4                                      nan       2011-03-02   -20.7
                                                 08:00:00    
5                                      nan       2011-03-02   -27.5 
                                                 08:00:10

i've tried to use this code:

df3=pd.concat([df1,df2), axis=1)

but this just concat index 394 and the rest of df1 is appended to the end of the dataframe of df2. Any idea how to?

jayjay62
  • 17
  • 7

1 Answers1

0

Just change to axis=0. Consider this:

Input:

>>> df
   col1  col2  col3
0     1     4     2
1     2     1     5
2     3     6   319
>>> df_1
   col4  col5  col6
0     1     4    12
1    32    12     3
2     3     2   319
>>> df_2
   col1  col3  col6
0    12    14     2
1     4   132     3
2    23    22     9

Concat mismatched (per column name)

>>> pd.concat([df, df_1], axis=0)
   col1  col2   col3  col4  col5   col6
0   1.0   4.0    2.0   NaN   NaN    NaN
1   2.0   1.0    5.0   NaN   NaN    NaN
2   3.0   6.0  319.0   NaN   NaN    NaN
0   NaN   NaN    NaN   1.0   4.0   12.0
1   NaN   NaN    NaN  32.0  12.0    3.0
2   NaN   NaN    NaN   3.0   2.0  319.0

Concat matching:

>>> pd.concat([df, df_1, df_2], axis=0)
   col1  col2   col3  col4  col5   col6
0   1.0   4.0    2.0   NaN   NaN    NaN
1   2.0   1.0    5.0   NaN   NaN    NaN
2   3.0   6.0  319.0   NaN   NaN    NaN
0   NaN   NaN    NaN   1.0   4.0   12.0
1   NaN   NaN    NaN  32.0  12.0    3.0
2   NaN   NaN    NaN   3.0   2.0  319.0
0  12.0   NaN   14.0   NaN   NaN    2.0
1   4.0   NaN  132.0   NaN   NaN    3.0
2  23.0   NaN   22.0   NaN   NaN    9.0

Concat matched, fill NaN-s (analogically you can fill None-s)

>>> pd.concat([df, df_1, df_2], axis=0).fillna(0) #in case you wish to prettify it, maybe in case of strings do .fillna('')
   col1  col2   col3  col4  col5   col6
0   1.0   4.0    2.0   0.0   0.0    0.0
1   2.0   1.0    5.0   0.0   0.0    0.0
2   3.0   6.0  319.0   0.0   0.0    0.0
0   0.0   0.0    0.0   1.0   4.0   12.0
1   0.0   0.0    0.0  32.0  12.0    3.0
2   0.0   0.0    0.0   3.0   2.0  319.0
0  12.0   0.0   14.0   0.0   0.0    2.0
1   4.0   0.0  132.0   0.0   0.0    3.0
2  23.0   0.0   22.0   0.0   0.0    9.0

EDIT Triggered by the conversation with OP in the comment section below.

So you do:

(1) To concat dataframes

df3=pd.concat([df1,df2], axis=0)

(2) To join another dataframe on them:

df5=pd.merge(df3, df4[["FIC", "min"]], on="FIC", how="outer")

(you may want to consider suffixes field if you think it's relevant) REF: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html

Grzegorz Skibinski
  • 12,624
  • 2
  • 11
  • 34
  • Thank you for your answer. I get the following Error: ValueError: Plan shapes are not aligned – jayjay62 Sep 06 '19 at 13:51
  • Maybe this one will help: https://stackoverflow.com/a/27412913/11610186 – Grzegorz Skibinski Sep 06 '19 at 13:53
  • not really. i have tried ``df2=df2[df1.columns].apply( lambda x: ','.join(x.dropna().astype(str)),axis=1)`` but i geht the error KeyError: "['min', 'FIX.P2-FIC-200607.F_CV'] not in index" – jayjay62 Sep 09 '19 at 09:27
  • @jayjay62 sorry, but don't see how this is related here - I thought you wanted to concatenate 2 pandas dataframes – Grzegorz Skibinski Sep 09 '19 at 10:23
  • yes you're right, but I am out of ideas, this was just a try. – jayjay62 Sep 09 '19 at 11:01
  • ok, and what are you exactly trying to do there? (you can elaborate in your question, or post another one) – Grzegorz Skibinski Sep 09 '19 at 11:15
  • I want to concat the column name of df1 found in df2 (394) and also join the other columns of df 1 (min,FIC) to the concatenated column. not changing the column order of df2 ..it all will be written in a loop – jayjay62 Sep 09 '19 at 11:21
  • ah, so you have set of columns with the same name in both df-s, and you want to keep them underneath each other, but keep also the ones, which are unique to df2, just making them "None" for df1, right ? Can you share ```df1.dtypes``` and ```df2.dtypes``` ? – Grzegorz Skibinski Sep 09 '19 at 11:27
  • yes it's right, how you explained it, but i also want to join the columns 'min' and 'FIC..' and this should be done to every column names found in both dataframes. consider thathe columns 'FIC' vary with the name. ``df1.dtypes Out[54]: 332 object min object FIX.P2-FIC-200607.F_CV object dtype: object df2.dtypes Out[55]: Unnamed: 0 object 0 object 1 object 2 object`` – jayjay62 Sep 09 '19 at 11:36
  • Added something in EDIT - please have a look, and let me know if this is going into right direction, otherwise I would need you to be a bit more specific about this "join" – Grzegorz Skibinski Sep 09 '19 at 11:47
  • It' heading the right way. But i think i couldn't explain it properly. I apologize for that. comment on EDIT: df3 is right but only the other columns should be added next to the concat columns and not ant the end of dataframe. but the columns 'min'(which is always same) and 'FIC'(which has different names, because its a loop) can not be merged, because the columns name FIC vary. so at the end if matches of first column in df1 is found in df2, the two other columns right next to the found match should be added aswell.. – jayjay62 Sep 09 '19 at 12:43