1

Sample DF:

I have a df with columns like Zone , New_Zone, Country, New_Region & Currency

Currency column has values like EUR, AUD, BLR,RUB

Zone columns has values like EU, Africa, NAZ

New_Zone has values like EU, Europe, EUROPE, Africa, NAZ, AFRICA

Country has values like Australia, Brazil, Russia, Ukraine, USA

New_Region has values like Australia, Brazil, USA

So, columns Zone and New_Zone contains almost similar values and New_Region and Country almost similar values

Ask:

I want to make a new column called Currency_Match with below conditions:

  1. If Zone is EU OR(logical) New_Zone is EU, Europe or EUROPE AND(logical)Currency is EUR then Yes else No

  2. If Country is Australia OR(logical) New_Region is Austraila AND(logical) Currency is AUD then Yes else No

  3. If Country is BrazilOR(logical) New_Region is Brazil AND(logical) Currency is BLR then Yes else No

  4. If Country is KoreaOR(logical) New_Region is koreaAND(logical) Currency is KRW then Yes else No

  5. If Zone is Africa OR(logical) New_Zone is Africa or AFRICA, AND(logical)Currency is NGN then Yes else No

6 If Zone is NAZ OR(logical) New_Zone is NAZ, AND(logical)Currency is USD then Yes else No

7 If Country is Russia , Ukraine OR(logical) New_Region is RussiaAND(logical) Currency is RUB then Yes else No

I have overall 7 conditions like these above ones

Code:

        df['Currency_Match']=df.apply(lambda row: "Yes" if (((row['Zone'] == "EU")|(row['New_Zone'] =='Europe')|(row['New_Zone'] =='EU')|(row['New_Zone'] =='EUROPE')) 
& (row["Contract - Original Currency Code"] == "EUR"))
     else 'No',axis=1)

Problem:

Above code works well but when I write the other conditions like this it updates the column again and again.

Is their any other approach I can do this and also less cumbersome than this

Community
  • 1
  • 1
Rahul Agarwal
  • 4,034
  • 7
  • 27
  • 51

1 Answers1

2

Create each conditions for separate rows for readable code, chain by | or & for bitwise OR or AND to final mask and pass to numpy.where:

m1 = (df['Zone'] == "EU")|(df['New_Zone'].isin(['Europe','EUROPE','EU'])
m2 = (df["Contract - Original Currency Code"] == "EUR")

m3 = (df['Region'] == "Australia")|(df['New_Region'] =='Australia')
m4 = (df["Contract - Original Currency Code"] == "AUD")

m5 = (df['Region'] == "Brazil")|(df['New_Region'] =='Brazil')
m6 = (df["Contract - Original Currency Code"] == "BLR")

mask = (m1 & m2) | (m3 & m4) | (m5 & m6)

df['Currency_Match'] = np.where(mask, "Yes", 'No')

EDIT: for dynamic solution create dictionary by currency with list of zones and regions and pass to logical_or.reduce in dict comprehension:

#zone dict
d1 = {'EUR':['Europe','EUROPE','EU'],'NGN':['Africa','AFRICA'],'USD':['NAZ']}

#region dict
d2 = {'AUD':['Australia'], 'BLR':['Brazil'],'KRW':['Korea'],'RUB':['Russia','Ukraine']}

m1 = [(df['Zone'].isin(v) | df['New_Zone'].isin(v)) & 
      (df["Contract - Original Currency Code"] == k) 
       for k, v in d1.items()]
print (m1)

m2 = [(df['Region'].isin(v) | df['New_Region'].isin(v)) & 
      (df["Contract - Original Currency Code"] == k) 
       for k, v in d2.items()]
print (m2)

mask = np.logical_or.reduce(m1) | np.logical_or.reduce(m2)

df['Currency_Match'] = np.where(mask, "Yes", 'No')
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252