-1

I have this excel formula that needs to be applied on a pandas dataframe

=IF(OFFSET(list!$B$1,MATCH($X1,list!$A:$A,0)-1,0)=0,"",OFFSET(list!$B$1,MATCH($X1,list!$A:$A,0)-1,0))

dataframe: list

A       B           C    
24309   Pepsi       US
45768   McDonalds   US
45638   Apple       Ireland
59374   Google      Ireland

dataframe 2: new_list

X         Y
24309     ?
45768     ?
45638
59374

The goal is to populate the Y column of the second new_list dataframe if the value in X matches a value in range A1:A4

I'm struggling with the best approach to do this. Can someone point me to the right direction? I was thinking of using panda groupby with np.where to compare and take the value of B column when there is a match.

Edit

new_list['Y'] = np.where(list['A'] == new_list['X'], list['B'], np.nan)

The solution above gives the same results but is this the right way to define such an Excel formula in Pandas?

Georgy
  • 12,464
  • 7
  • 65
  • 73
rescot
  • 325
  • 2
  • 18

2 Answers2

0

the easiest and most efficient way is to apply a mask over the values. Here you could find a sample dataframe:

df1 = pd.DataFrame({
"Date" : ['2013-11-22', '2013-10-24', '2013-09-24', '2013-02-15'],
"Fruit" : ['Banana', 'Orange', 'Apple', 'Celery'], 
"Num" : [22.1, 8.6, 7.6, 10.2],
"Color" : ['Yellow', 'Orange', 'Green', 'Green']
})

where after you could apply this mask, print out the mask to see result, or you could use the mask in a way that you like.

mask = (df1["Fruit"] == df1["Color"])
print(mask)

I hope that helps.

default_settings
  • 440
  • 1
  • 5
  • 10
0
new_list['Y'] = np.where(list['A'] == new_list['X'], list['B'], np.nan)

The code above was implemented and provides the same solution.

rescot
  • 325
  • 2
  • 18