-1

I have two separated DataFrames:

df1:

Col1  Col2  Col3  Col4  Col5
ID1   2     3      5     0
ID2   7     6     11     5
ID3   9    16     20    12

df2:

Col1  ColB 
ID1   2   
ID2   7   
ID3   9    

Is there a way to count how many values in the first row of df1 are greater than the first value in the column ColB in df2? I need this counting for each row and to add it at the end of df1. So, df1 can be looked like this:

df1:

Col1  Col2  Col3  Col4  Col5 COUNT
ID1   2      3      5    0    2
ID2   7      6     11    5    1
ID3   9     16     20   12    3

Thank you for any suggestion!

Avah
  • 227
  • 3
  • 13
  • 2
    You can [merge](https://stackoverflow.com/questions/53645882/pandas-merging-101) and use [df.gt](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.gt.html#pandas-dataframe-gt) – pault Aug 29 '21 at 15:44

2 Answers2

0

The prior assumption is that 'Col1' is the index. If not, add .set_index('Col1') after df1/df2 in the right part of the commands:

You can use the underlying numpy array:

df1['COUNT'] = (df1.values>df2.values).sum(axis=1)

# if "Col1" is not index
df1['COUNT'] = (df1.set_index('Col1').values>df2.set_index('Col1').values).sum(axis=1)

or:

df1['COUNT'] = df1.gt(df2['ColB'].values[:, None]).sum(axis=1)

# if "Col1" is not index
df1['COUNT'] = df1.set_index('Col1').gt(df2['ColB'].values[:, None]).sum(axis=1)

output:

      Col2  Col3  Col4  Col5  COUNT
Col1                               
ID1      2     3     5     0      2
ID2      7     6    11     5      1
ID3      9    16    20    12      3
mozway
  • 194,879
  • 13
  • 39
  • 75
0

Try this:

df1 = df1.set_index('Col1')
df1.assign(COUNT = df1.gt(df2.set_index('Col1').squeeze(),axis=0).sum(axis=1))
rhug123
  • 7,893
  • 1
  • 9
  • 24