0

I am facing a problem that I am uncapable of finding a way around it.

I find very difficult too to explain what I am trying to do so hopefully a small example would help

I have df1 as such:

Id   product_1   product_2  
Date
1    0.1855672   0.8855672
2    0.1356667   0.0356667 
3    1.1336686   1.7336686  
4    0.9566671   0.6566671  

and I have df2 as such:

                            product_1    Month
Date                                          
2018-03-30                         11.0      3
2018-04-30                         18.0      4
2019-01-29                         14.0      1
2019-02-28                         22.0      2

and what I am trying to achieve is this in df2:

                            product_1    Month   seasonal_index  
Date                                          
2018-03-30                         11.0       3        1.1336686 
2018-04-30                         18.0       4        0.9566671
2019-01-29                         14.0       1        0.1855672
2019-02-28                         22.0       2        0.1356667

So what I try is to match the product name in df2 with the corresponding column in d1 and then get the value of for each index value that matches the month number in df2 I have tried doing things like:

for i in df1:
    df2['seasonal_index'] = df1.loc[df1.iloc[:,i] == df2['Month']]

but with no success. Hopefully someone could have a clue on how to unblock the situation

Murcielago
  • 905
  • 1
  • 8
  • 30
  • 2
    This is basic merging, see [this](https://stackoverflow.com/questions/53645882/pandas-merging-101) question and answer. – Erfan Dec 19 '20 at 16:38
  • it might @Erfan you are right but I dont find a way to do it, because wouldn't be mergin on ```month```, but there is no month column in ```df1```. I'll attempt anyways, thanks. – Murcielago Dec 19 '20 at 16:44
  • Try something like this `df2['season_index'] = df2['Month'].map(df1.set_index('Date')['product_1'])` – Scott Boston Dec 19 '20 at 16:48

1 Answers1

1

Here you are my friend, this produces exactly the output you specified.

import pandas as pd

# replicate df1
data1 = [[0.1855672, 0.8855672],
         [0.1356667, 0.0356667],
         [1.1336686, 1.7336686],
         [0.9566671, 0.6566671]]
index1 = [1, 2, 3, 4]
df = pd.DataFrame(data=data1,
                  index= index1,
                  columns=['product_1', 'product_2'])
df.columns.name = 'Id'
df.index.name = 'Date'

# replicate df2
data2 = [[11.0, 3],
         [18.0, 4],
         [14.0, 1],
         [22.0, 2]]
index2 = [pd.Timestamp('2018-03-30'),
          pd.Timestamp('2018-04-30'),
          pd.Timestamp('2019-01-29'),
          pd.Timestamp('2019-02-28')]
df2 = pd.DataFrame(data=data2, index=index2,
                   columns=['product_1', 'Month'])
df2.index.name = 'Date'

# Merge your data
df3 = pd.merge(left=df2, right=df[['product_1']],
               left_on='Month',
               right_index=True,
               how='outer',
               suffixes=('', '_df2'))
df3 = df3.rename(columns={'product_1_df2': 'seasonal_index'})
print(df3)

If you are interested in learning why this works, take a look at this link explaining the pandas.merge function. Notice specifically that for your dataframes, the key for df2 is one of its columns (so we use the left_on parameter in pd.merge) and the key for df is its index (so we use the right_index parameter in pd.merge).

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html

mgd
  • 306
  • 1
  • 6