0

I have two data frames that I imported from spreadsheets into Pandas and cleaned up. They share a key column called 'PurchaseOrders' that I am using to match product numbers to a shipment number. When I attempt to merge them, I end up with a df of only 34 rows, but I have over 400 pairs of matching product and shipment numbers.

This is the closest I've gotten, but I have also tried using join():

ShipSheet = pd.merge(new_df, orders, how='inner')
ShipSheet.shape

Here is my orders df: [image: orders df]

Here is my new_df, which I want to add to my orders df using the 'PurchaseOrders' key: [image: new_df]

In the end, I want them to look like this: [image: end goal df]

I am not sure whether I'm using the merge function improperly, but my end product should have around 300+ rows. I will note that the 'PurchaseOrders' values in new_df had to be split out of a single delimited column into separate rows, so I guess this could have something to do with it.

bowen17
  • 2
    Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – RichieV Aug 21 '20 at 05:21
  • 1
    I read through it, and I think that my problem may not be the merge call, but rather something wrong with the data in my new_df dataframe. I had to split it on a delimiter, so I am checking for any extra spaces in the values. Does the merge function require the key data types to be the same? It's possible my key data types in Excel were different (text and custom) – bowen17 Aug 21 '20 at 05:48
  • @bowen17 It would raise a ```ValueError``` if the data types didn't match – Aagam Sheth Aug 21 '20 at 06:05

2 Answers

0

Use the merge method and specify the key explicitly:

merged_inner = pd.merge(left=df_left, right=df_right, left_on='PurchaseOrders', right_on='PurchaseOrders')

learn more here
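If an explicit key still gives too few rows, the key values themselves may differ in ways that are hard to see, such as stray spaces left over from splitting a delimited column. The following is a minimal sketch of one way to check this, not a confirmed diagnosis: the two small frames and the `clean_key` helper are hypothetical stand-ins, because the real data was not posted. It uses `indicator=True` on an outer merge to show which keys fail to match, then strips whitespace and merges again.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's frames (the real data was not posted).
orders = pd.DataFrame({
    "PurchaseOrders": ["PO1", "PO2", "PO3"],
    "Product": ["A", "B", "C"],
})
new_df = pd.DataFrame({
    # Assumed problem: stray spaces left over from splitting a delimited column.
    "PurchaseOrders": ["PO1", " PO2", "PO3 "],
    "Shipment": ["S1", "S1", "S2"],
})

# An outer merge with indicator=True adds a "_merge" column that labels each
# row as "both", "left_only" or "right_only", which exposes unmatched keys.
diagnostic = pd.merge(new_df, orders, on="PurchaseOrders",
                      how="outer", indicator=True)
print(diagnostic[["PurchaseOrders", "_merge"]])


def clean_key(series):
    """Hypothetical helper: cast keys to str and trim surrounding whitespace."""
    return series.astype(str).str.strip()


# Normalise the key on both sides, then repeat the inner merge.
orders["PurchaseOrders"] = clean_key(orders["PurchaseOrders"])
new_df["PurchaseOrders"] = clean_key(new_df["PurchaseOrders"])
fixed = pd.merge(new_df, orders, on="PurchaseOrders", how="inner")
print(fixed)
```

Rows marked `left_only` or `right_only` in the diagnostic frame are the ones an inner merge silently drops, so inspecting their key values (for example with `repr`) is a quick way to spot whitespace or type differences.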

Aagam Sheth
  • 1
    The merge call produced the same results. I'm starting to think that the problem is with my values in new_df and not the merging function. – bowen17 Aug 21 '20 at 05:49
  • @bowen17 A sample of your data would help in figuring out the answer – Aagam Sheth Aug 21 '20 at 06:07
0

Use the concat method on pandas and specify the axis.

final_df = pd.concat([new_df, orders], axis=1)

Be careful when you specify the axis: with axis=0 the second data frame is placed under the first one, and with axis=1 the second data frame is placed to the right of the first one.
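To make the axis behaviour concrete, here is a small sketch on two hypothetical frames (`left` and `right` are illustrative names, not the asker's data). It also shows that `pd.concat` with axis=1 lines rows up by index rather than by the 'PurchaseOrders' values, so rows can end up paired with the wrong key unless both frames are already in the same order; a key-based `pd.merge` is included for comparison.

```python
import pandas as pd

# Hypothetical frames whose keys are deliberately in a different order.
left = pd.DataFrame({"PurchaseOrders": ["PO1", "PO2"],
                     "Shipment": ["S1", "S2"]})
right = pd.DataFrame({"PurchaseOrders": ["PO2", "PO1"],
                      "Product": ["B", "A"]})

# axis=0 stacks the second frame under the first; columns missing from one
# frame are filled with NaN.
stacked = pd.concat([left, right], axis=0)
print(stacked.shape)

# axis=1 places the second frame to the right, aligning rows on the index,
# not on the key column, so row 0 pairs PO1 with PO2's product here.
side_by_side = pd.concat([left, right], axis=1)
print(side_by_side)

# A merge on the key pairs the rows by their 'PurchaseOrders' value instead.
merged = pd.merge(left, right, on="PurchaseOrders")
print(merged)
```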

Javed Ali
  • ```concat``` won't match rows on the dataframe columns. It will just add the columns and place the data according to the index. – Aagam Sheth Aug 21 '20 at 06:08