0

I have multiple dataframes that have an ID and a value and I am trying to merge them such that each ID has all the values in it's row.

ID Value
1 10
3 21
4 12
5 43
7 11

And then I have another dataframe:

ID Value2
1 12
2 14
4 55
6 23
7 90

I want to merge these two in a way where it considers the ID's that are already in the first dataframe and if an ID that is the second dataframe is not in the first one, it adds it to the ID row with value2 leaving value empty. This is what my result would look like:

ID Value Value2
1 10 12
3 21 -
4 12 55
5 43 -
7 11 90
2 - 14
6 - 23

Hope this makes sense. I don't really care for the order of the ID numbers, they can be sorted or not. My goal is to be able to create dictionaries for each ID with "Value", "Value2", "Value3,... as keys and the corresponding actual value numbers as the keys values. Please let me know if any clarification needed.

markovv.sim
  • 161
  • 1
  • 8
  • 1
    Check the section on FULL OUTER JOIN in the [accepted answer](https://stackoverflow.com/a/53645883/15497888). `merged_df = pd.merge(df1, df2, how='outer', on='ID')`. Add a `fillna` if you want actual dashes instead of `NaN`. `merged_df = pd.merge(df1, df2, how='outer', on='ID').fillna('-')` – Henry Ecker Oct 13 '21 at 21:40

1 Answers1

2

You can use pandas' merge method (see here for the help page):

import pandas as pd
df1.merge(df2, how='outer', on='ID')

Specifying 'outer' will use union keys from both dataframes.

Sheldon
  • 4,084
  • 3
  • 20
  • 41
  • 1
    I think you're looking for an outer join here, since they want to include the values from the right dataframe that are not in the let. – whege Oct 13 '21 at 21:37
  • 1
    Good catch. I will modify my answer accordingly. – Sheldon Oct 13 '21 at 21:38