I have a dataset that looks at all employees' history. The goal I am trying to get to is see an employees current manager and previous manager only if that previous manager has left without being replaced. To identify if a manager has left you look at the ManagerPositionNum this column is unique to the Manager and if another Manager has those numbers then they are filling in for a vacant role. I am running pandas and numpy with this file.
Here is a sample of what I have:
EmpID Date Job_Title ManagerName ManagerPositionNum
101 May 2021 Sales Rep John Doe 1111
101 June 2021 Sales Rep John Doe 1111
102 February 2022 Tech Support Mary Sue 2111
102 March 2022 Tech Support Mary Sue 2111
102 April 2022 Tech Support John Doe 2111
103 October 2022 HR Advisor Sarah Long 3111
103 November 2022 HR Advisor Michael Scott 4111
103 December 2022 HR Advisor John Doe 4111
103 December 2022 HR Advisor John Doe 4111
Desired Output:
EmpID Date Job_Title ManagerName ManagerPositionNum Vacated Manager
101 May 2021 Sales Rep John Doe 1111
101 June 2021 Sales Rep John Doe 1111
102 February 2022 Tech Support Mary Sue 2111 Mary Sue
102 March 2022 Tech Support Mary Sue 2111 Mary Sue
102 April 2022 Tech Support John Doe 2111 Mary Sue
103 October 2022 HR Advisor Sarah Long 3111
103 November 2022 HR Advisor Michael Scott 4111
103 December 2022 HR Advisor John Doe 4111 Michael Scott
103 January 2023 HR Advisor John Doe 4111 Michael Scott
Just for clarification:
1111 is unique to John Doe
2111 is unique to Mary Sue
3111 is unique to Sarah Long
4111 is unique to Michael Scott
Code I have tried:
reportid = df.groupby('ManagerName')['ManagerPositionNum'].transform('first')m =
~df['ManagerPositionNum'].eq(reportid) df.loc[m,'ManagerName']