1

I have a df called "df_new" shown below...

enter image description here

I have another df called "df_old" shown below...

enter image description here

(1) I want to check if stock listed in df_new is in df_old.

If yes, print stock ticker (SELL) and show subtract df_new['Quantity'] - df_old['Quantity']. If no, print stock ticker (BUY) and show df_new['Quantity'].

(2) I want to check if stock listed in df_old is in df_new. If yes, do nothing If no, print stock ticker (SELL) and show df_old['Quantity'].

Sorry for not able to provide the df, working in colab and clipboard does not work.

bobby666
  • 67
  • 6

1 Answers1

1

I may not have understood the exact intent of your question. As @RichieV commented, I use df.merge() to join and np.where() to determine and add the columns.

import pandas as pd
import numpy as np
import io

data = '''
Stock Quantity "Buy Price"
AWK 31.0 147.220001
BDX 16.0 281.339996
CDNS 42.0 109.250000
DHR 23.0 203.949997
EXPD 55.0 84.510002
LIN 19.0 245.309998
MAS 81.0 57.180000 
PCAR 54.0 85.000002
RMD 23.0 202.509995
TMO 11.0 413.959991
TSCO 32.0 142.740005
VRSK 24.0 188.710007
WST 17.0 268.809998
'''

df_old = pd.read_csv(io.StringIO(data), sep='\s+')

data2 = '''
Stock Quantity "Buy Price" "Sell Price"
WST 22.0 246.232273 268.809998
CHD 64.0 91.534375 96.330002
RMD 29.0 201.286897 202.509995
TMO 15.0 419.992000 413.959991
LIN 24.0 247.027083 245.309998
BDX 22.0 274.558636 281.339996
CDNS 56.0 106.053929 109.250000
DHR 30.0 199.803000 203.949997
TSCO 42.0 143.4083000 142.740005
AWK 41.0 146.890976 147.220001
'''

df_new = pd.read_csv(io.StringIO(data2), sep='\s+')

# df_new and df_old combine
df = df_new.merge(df_old, on='Stock', suffixes=('_new','_old'), how='outer')
df['Price'] = np.where(~(df['Quantity_new'].isnull())&~(df['Quantity_old'].isnull()), df['Sell Price'], df['Buy Price_new'])
df['diff'] = np.where(~(df['Quantity_new'].isnull())&~(df['Quantity_old'].isnull()), df['Quantity_new'] - df['Quantity_old'], df['Quantity_new'])

# update

df['diff'].loc[df['diff'].isnull()] = df['Quantity_old'].loc[df['Quantity_new'].isnull()]
df = df[['Stock','Quantity_new','Quantity_old', 'diff']]

|    | Stock   |   Quantity_new |   Quantity_old |   diff |
|---:|:--------|---------------:|---------------:|-------:|
|  0 | WST     |             22 |             17 |      5 |
|  1 | CHD     |             64 |            nan |     64 |
|  2 | RMD     |             29 |             23 |      6 |
|  3 | TMO     |             15 |             11 |      4 |
|  4 | LIN     |             24 |             19 |      5 |
|  5 | BDX     |             22 |             16 |      6 |
|  6 | CDNS    |             56 |             42 |     14 |
|  7 | DHR     |             30 |             23 |      7 |
|  8 | TSCO    |             42 |             32 |     10 |
|  9 | AWK     |             41 |             31 |     10 |
| 10 | EXPD    |            nan |             55 |     55 |
| 11 | MAS     |            nan |             81 |     81 |
| 12 | PCAR    |            nan |             54 |     54 |
| 13 | VRSK    |            nan |             24 |     24 |
r-beginners
  • 31,170
  • 3
  • 14
  • 32
  • Looks good, a change is requested. How do I show row 10 to 13 "Quantity_old" in "diff" column? Also, I do not need the rest of the columns to show, except stock, quantity_new, quantity_old and diff. The price should be pulled from y-finance. – bobby666 Aug 02 '20 at 20:57