0

From my df showing employees with multiple levels of managers (see prior question here), I want to map rows to a department ID, based on a manager ID that may appear across multiple columns:

eid,    mid,    l2mid   l3mid
111,    112,    116,    115
113,    114,    115,    0
112,    117,    114,    0   
110,    115,    0,      0    
116,    118,    0,      0 

[edit: corrected data set to reflect row for eid=110 will be dropped edit #2: modified row for eid=112 to reflect that i need to search multiple columns to get a match in dictionary.]

The dictionary is

  country = {112: 'US', 114: 'Ireland', 118: 'Mexico'}

I'd like write Python that searches the manager columns 'mid':'l3mid' and then writes the country string value into the new column. I'd like to drop rows if they do not have a manager from one of the country codes in the dictionary. So the output I'm looking for is:

eid,    mid,    l2mid   l3mid   country
111,    112,    116,    115,    US
113,    114,    115,    0,      Ireland
112,    117,    114,    0       Ireland
116,    118,    0,      0       Mexico

I have tried building a function to do this but can't quite figure out the syntax. I appreciate your help as I'm new to this work.

QuillPy
  • 25
  • 1
  • 5

1 Answers1

2

I added a solution, if manager columns (mid,l2mid,l3mid) value match the dictionary keys, then the values are joined splitted by ,:

s = df.drop('eid',1).applymap(country.get)
      .dropna(how='all', axis=0)
      .apply(lambda x: ', '.join(x.dropna()), 1)

df = df.loc[s.index].assign(country=s)
print (df)
   eid  mid  l2mid  l3mid          country
0  111  112    114    115          US, Ireland
1  113  114    115      0          Ireland
2  112  114    118      0          Ireland
4  116  118      0      0          Mexico
Bharath M Shetty
  • 30,075
  • 6
  • 57
  • 108
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 2
    Why was this answer downvoted too? It's perfectly right. (+1) – cs95 Aug 30 '17 at 15:14
  • 1
    As per my understading of question now this answer is perfect. Sir you can add the multiple match cases too na. So everyone will know the significance of apply. – Bharath M Shetty Aug 30 '17 at 15:19
  • @jezrael - thank you for the answer. I was able to try this and it worked great. i'm just learning python and i would not have been able to come up with this on my own. your answer gives me the opportunity to study how each of these operations work together. I did get initially get an "unexpected indentation" error but i just removed line breaks on the s = df.drop.... line and it executed perfectly. I really appreciate your help! – QuillPy Aug 31 '17 at 15:37
  • Thank you very much! I am really happy it help you. ;) Good luck! – jezrael Aug 31 '17 at 15:38