I have a DataFrame in which some rows of the Sold-To Country Name column contain a value of the form NOT: XX XX XX, which means that all Sold-To Country Codes except XX XX XX report to the mapped Reporting Country.
Another requirement is that if Sold-To Country Code is null (NaN), the row captures revenue from all country codes in that SalesOrg.
import numpy as np
import pandas as pd

df_mapping = pd.DataFrame({'SalesOrg Code': ['0001', '0002', '0002', '0002', '0002'],
                           'Reporting Country': ['Spain', 'UK', 'UK', 'UK', 'Netherlands'],
                           'Sold-To Country Code': [np.nan, 'IE', 'FR', 'IT', 'Ex:'],
                           'Sold-To Country Name': [np.nan, 'Ireland', 'France', 'Italy', 'NOT: FR IE IT']})
SalesOrg Code  Reporting Country  Sold-To Country Code  Sold-To Country Name
0001           Spain              null                  null
0002           UK                 IE                    Ireland
0002           UK                 FR                    France
0002           UK                 IT                    Italy
0002           Netherlands        Ex:                   NOT: FR IE IT
.......
There is another DataFrame with the full list of global country codes, from which the remaining country codes can be looked up.
Example of that DataFrame:
df_countrylist = pd.DataFrame(["AF", "AX", "AL", "DZ", "AS", "AD", "AO", "AI", "AQ", "AG", "AR",
"AM", "AW", "AU", "AT", "AZ", "BS", "BH", "BD", "BB", "BY", "BE",
"BZ", "BJ", "BM", "BT", "BO", "BQ", "BA", "BW", "BV", "BR", "IO",
"BN", "BG", "BF", "BI", "CV", "KH", "CM", "CA", "KY", "CF", "TD",
"CL", "CN", "CX", "CC", "CO", "KM", "CG", "CD", "CK", "CR", "CI",
"HR", "CU", "CW", "CY", "CZ", "DK", "DJ", "DM", "DO", "EC", "EG",
"SV", "GQ", "ER", "EE", "ET", "FK", "FO", "FJ", "FI", "FR", "GF",
"PF", "TF", "GA", "GM", "GE", "DE", "GH", "GI", "GR", "GL", "GD",
"GP", "GU", "GT", "GG", "GN", "GW", "GY", "HT", "HM", "VA", "HN",
"HK", "HU", "IS", "IN", "ID", "IR", "IQ", "IE", "IM", "IL", "IT",
"JM", "JP", "JE", "JO", "KZ", "KE", "KI", "KP", "KR", "KW", "KG",
"LA", "LV", "LB", "LS", "LR", "LY", "LI", "LT", "LU", "MO", "MK",
"MG", "MW", "MY", "MV", "ML", "MT", "MH", "MQ", "MR", "MU", "YT",
"MX", "FM", "MD", "MC", "MN", "ME", "MS", "MA", "MZ", "MM", "NA",
"NR", "NP", "NL", "NC", "NZ", "NI", "NE", "NG", "NU", "NF", "MP",
"NO", "OM", "PK", "PW", "PS", "PA", "PG", "PY", "PE", "PH", "PN",
"PL", "PT", "PR", "QA", "RE", "RO", "RU", "RW", "BL", "SH", "KN",
"LC", "MF", "PM", "VC", "WS", "SM", "ST", "SA", "SN", "RS", "SC",
"SL", "SG", "SX", "SK", "SI", "SB", "SO", "ZA", "GS", "SS", "ES",
"LK", "SD", "SR", "SJ", "SZ", "SE", "CH", "SY", "TW", "TJ", "TZ",
"TH", "TL", "TG", "TK", "TO", "TT", "TN", "TR", "TM", "TC", "TV",
"UG", "UA", "AE", "GB", "US", "UM", "UY", "UZ", "VU", "VE", "VN",
"VG", "VI", "WF", "EH", "YE", "ZM", "ZW"])
Ultimately, I want a result like this:
SalesOrg Code  Reporting Country  Sold-To Country Code  Sold-To Country Name
0001           Spain              null (all)            null
0002           UK                 IE                    Ireland
0002           UK                 FR                    France
0002           UK                 IT                    Italy
0002           Netherlands        AT                    Austria
0002           Netherlands        DK                    Denmark
0002           Netherlands        NL                    Netherlands
0002           Netherlands        BE                    Belgium
0002           Netherlands        LT                    Lithuania
0002           Netherlands        LV                    Latvia
.......
For SalesOrg 0002, any Sold-To Country Code other than FR, IE, or IT reports to Netherlands, so I want to create one row for each of the remaining country codes.
Is there a better way to create these rows and add them to the existing DataFrame?
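To make the intended expansion concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes the exclusion rows can be recognised by a Sold-To Country Name starting with "NOT:", and that the country list has a single default column named 0. The shortened country list, the helper remaining_codes, and the code_to_name lookup are hypothetical and exist only for illustration, since no code-to-name table is given above.

```python
import numpy as np
import pandas as pd

# Mapping table as described above.
df_mapping = pd.DataFrame({
    'SalesOrg Code': ['0001', '0002', '0002', '0002', '0002'],
    'Reporting Country': ['Spain', 'UK', 'UK', 'UK', 'Netherlands'],
    'Sold-To Country Code': [np.nan, 'IE', 'FR', 'IT', 'Ex:'],
    'Sold-To Country Name': [np.nan, 'Ireland', 'France', 'Italy', 'NOT: FR IE IT'],
})

# Assumption: shortened stand-in for the full country list (single column named 0).
df_countrylist = pd.DataFrame(['AT', 'BE', 'DK', 'FR', 'IE', 'IT', 'LT', 'LV', 'NL'])
all_codes = df_countrylist[0].tolist()

# Hypothetical code-to-name lookup; a real one would cover every code.
code_to_name = {'AT': 'Austria', 'BE': 'Belgium', 'DK': 'Denmark',
                'LT': 'Lithuania', 'LV': 'Latvia', 'NL': 'Netherlands'}

# Exclusion rows are those whose name starts with "NOT:"; NaN names count as False.
is_not = df_mapping['Sold-To Country Name'].str.startswith('NOT:', na=False)


def remaining_codes(name):
    """Return every known code that is not listed after 'NOT:'."""
    excluded = set(name.replace('NOT:', '', 1).split())
    return [code for code in all_codes if code not in excluded]


# Replace each exclusion row with one row per remaining country code.
expanded = df_mapping[is_not].copy()
expanded['Sold-To Country Code'] = expanded['Sold-To Country Name'].map(remaining_codes)
expanded = expanded.explode('Sold-To Country Code')
expanded['Sold-To Country Name'] = expanded['Sold-To Country Code'].map(code_to_name)

# Keep the untouched rows (including the NaN "all countries" row) and append the new ones.
result = pd.concat([df_mapping[~is_not], expanded], ignore_index=True)
print(result)
```

The row with a null Sold-To Country Code is passed through unchanged, so it still stands for "all countries" in its SalesOrg. Codes missing from the lookup would simply get a NaN name.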