0

Table 1:

Item Type Order Date Ship Date Purchase Cost

0 Example 2014-08-10 2014-08-10 850.7544

1 Snacks 2014-08-10 2014-08-10 NaN

2 Cosmetics 2/22/2015 2/22/2015 NaN

3 Fruits 2015-09-12 2015-09-12 NaN

4 Personal Care 9/17/2014 9/17/2014 NaN

5 Household 2010-04-02 2010-04-02 NaN

6 Clothes 2/20/2013 2/20/2013 NaN

Table 2:

Item Type   Purchase Start Date     Purchase End Date   Cost Per Unit 

0 Baby Food 2010-01-01 2010-05-01 158.2736

1 Beverages 2010-01-01 2010-05-01 36.0620

2 Cereal 2010-01-01 2010-05-01 160.4460

3 Clothes 2010-01-01 2010-05-01 66.6608

4 Cosmetics 2010-01-01 2010-05-01 266.6920

5 Fruits 2010-01-01 2010-05-01 5.5980

6 Household 2010-01-01 2010-05-01 467.7890

7 Meat 2010-01-01 2010-05-01 274.2285

Here I need to fill the Purchase Cost Column In Table 1 Based in Table 2 v Date & Cost Per Unit Column

For example In table 1 Household Date Values between (2010-04-02 ,2010-04-02) so in Table 2 Household Values between (2010-01-01 ,2010-05-01),so from Table 1 Order Date & Ship Date are in the Date range of 'Purchase start Date'& 'Purchase End Date' ,so we can fill the value of Purchase cost as '467.789' ,So how to fill the Purchase cost ?

Henry Ecker
  • 34,399
  • 18
  • 41
  • 57
  • Please beautify the question to give a clear idea – Sanjeev S May 19 '20 at 05:31
  • Welcome on SO! Please read and edit your question based on the [how-to-make-good-reproducible-pandas-examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Brown Bear May 19 '20 at 06:27

1 Answers1

0

I assume that all "date" columns have been converted to datetime type. Otherwise start from converting them.

Generate an auxiliary Series:

wrk = pricing.assign(year=pricing['Start Date'].dt.year)\
    .drop_duplicates(subset=['Item', 'year'])\
    .set_index(['Item', 'year'])['(USD)dollar'].rename('price'); wrk

It contains first price in each year and product. Product name (Item) and year are levels of MultiIndex and the price is the value.

For your sample data, completed with one row for Cosmetics for year 2014, the result is:

Item       year
Snacks     2010    68
           2011    72
Cosmetics  2014    50
Name: price, dtype: int64

Then, to fill price column, run:

product.price = wrk[product.set_index(['Product',
    product['Date (USD)'].dt.year]).index].tolist()

The result is:

     Product Date (USD)  price
0     Snacks 2010-02-03   68.0
1     Snacks 2010-02-06   68.0
2     Snacks 2014-02-03    NaN
3     Snacks 2012-02-03    NaN
4  Cosmetics 2012-02-03    NaN
5  Cosmetics 2013-02-03    NaN
6  Cosmetics 2013-02-08    NaN
7  Cosmetics 2014-02-06   50.0
8  Cosmetics 2014-02-09   50.0
Valdi_Bo
  • 30,023
  • 4
  • 23
  • 41