
I am fairly new to pandas and trying to join two dataframes:

First one:

Date   Id    Provider
2019   1     Google
2019   2     Google

Second one:

Date-second   Id-second    Provider-second    Test
2019-11       5            Bing               True
2019-11       6            Bing               True

My desired output would look like this:

Date   Id    Provider   Date-second   Id-second    Provider-second    Test
2019   1     Google     2019-11       5            Bing               True
2019   2     Google     2019-11       6            Bing               True 

If I use `pd.concat([df1, df2])`, the rows are stacked on top of each other and I get NaN values in the columns that belong to the other dataframe:

              Date                   Date-DB                                              Gclid                              Gclid_DB Provider Provider-DB click_type
0      2019-11-240                       NaT  EAIaIQobChMI2t6D4MqB5gIVA9bACh0BvwK-EAAYAyAAEg...                                   NaN  test2         NaN        NaN
1      2019-11-240                       NaT  CjwKCAiAzuPuBRAIEiwAkkmOSJ7WSwoG9veQ-jKXYi5Fyx...                                   NaN  test2         NaN        NaN
2      2019-11-240                       NaT  EAIaIQobChMIkdObncWB5gIVFZzVCh245Aq0EAAYASAAEg...                                   NaN  test2         NaN        NaN
3      2019-11-240                       NaT  CjwKCAiAzuPuBRAIEiwAkkmOSHDEAo0jtVHXRWOr3Rh1Yj...                                   NaN  test2         NaN        NaN
4      2019-11-240                       NaT  EAIaIQobChMI-ZenkNCB5gIVAx6tBh0gOg9GEAAYASAAEg...                                   NaN  test2         NaN        NaN
...            ...                       ...                                                ...                                   ...      ...         ...        ...
12741          NaN 2019-11-25 23:59:40+00:00                                                NaN  7d904da7-cd77-428c-a0d3-1fbe3c3c992d      NaN     test2      gclid
12742          NaN 2019-11-25 23:59:44+00:00                                                NaN  690aa2e3-de06-4f96-82bc-aed9c7ed16dc      NaN     test2      gclid
12743          NaN 2019-11-25 23:59:45+00:00                                                NaN  7a3ebeee-bfad-4f9d-931c-234d30ad8b52      NaN   test3      gclid
12744          NaN 2019-11-25 23:59:50+00:00                                                NaN  0e907d6f-0bf5-4fbc-8b03-8f0e0d73487b      NaN   test1      gclid
12745          NaN 2019-11-25 23:59:59+00:00                                                NaN  463bec78-b7c1-4a15-9f81-c163ece05a45      NaN     test2      gclid

If I use `axis=1`, I get the following error:

pandas.core.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Thank you for your suggestions.

Jonas Palačionis
  • Use `pd.concat([df1, df2], axis=1)` – rafaelc Nov 26 '19 at 15:45
  • Hi, I have updated my question; that does not work for me. – Jonas Palačionis Nov 26 '19 at 15:50
  • Read the linked question and the `pd.concat` docs. Basically, you have data frames with different indexes (see how the first one has 0, 1, 2, 3, ... and the second one has 12741, 12742, ...). To concatenate them, they should have the same index. If you have different indexes but have a column that is mutual to both and serves as pivot, use `df.merge(df2, on='column')`. The linked duplicate will give more details on different types of merges – rafaelc Nov 26 '19 at 15:53
  • Hi again, I could not find an answer for the case where I have different indexes and no shared column. I simply need to place the two dataframes next to each other. – Jonas Palačionis Nov 27 '19 at 07:33
  • Then just reset the index on both frames: `df1 = df1.reset_index(drop=True)`, and the same for `df2`. Then use `concat` as I mentioned above – rafaelc Nov 27 '19 at 13:51
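
The suggestion in the last comment can be sketched roughly as follows. This is a minimal example, not the asker's actual data: the two frames are rebuilt from the small sample tables in the question, and the index values on `df2` are made up to mimic frames whose indexes do not line up. It also assumes both frames have the same number of rows and that rows should be paired purely by position.

```python
import pandas as pd

# Toy frames modelled on the sample tables in the question.
# The index labels are illustrative assumptions, chosen so that
# the two frames do NOT share an index.
df1 = pd.DataFrame(
    {"Date": [2019, 2019], "Id": [1, 2], "Provider": ["Google", "Google"]},
    index=[0, 1],
)
df2 = pd.DataFrame(
    {
        "Date-second": ["2019-11", "2019-11"],
        "Id-second": [5, 6],
        "Provider-second": ["Bing", "Bing"],
        "Test": [True, True],
    },
    index=[12741, 12742],
)

# Discard the old index labels so both frames are labelled 0..n-1.
df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)

# With matching indexes, axis=1 places the frames side by side.
result = pd.concat([df1, df2], axis=1)
print(result)
```

Because `concat` with `axis=1` aligns rows on index labels, resetting both indexes first is what makes the pairing positional. If the frames had different lengths, the shorter one would be padded with NaN in the unmatched rows.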
