1

Can someone please help me understand this?

Suppose we have this DataFrame:

import pandas as pd

df = pd.DataFrame({
    "id": ['a', 'b', 'c', 'd', 'e'],
    "parent_id": [None, None, 'a', 'b', 'a'],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

Now I want to show each parent's name next to the child's name, like this:

+----+-----------+-------+-------------+
| id | parent_id | name  | parent_name |
+----+-----------+-------+-------------+
| a  | None      | Bob   | NaN         |
+----+-----------+-------+-------------+
| b  | None      | Jane  | NaN         |
+----+-----------+-------+-------------+
| c  | a         | John  | Bob         |
+----+-----------+-------+-------------+
| d  | b         | Patty | Jane        |
+----+-----------+-------+-------------+
| e  | a         | Sam   | Bob         |
+----+-----------+-------+-------------+

So I tried this:

df['parent_name'] = None
df['parent_name'] = df['parent_id'].apply(lambda x: df['name'][df['id']==x])

But here's what I get:

+----+-----------+-------+-------------+
| id | parent_id | name  | parent_name |
+----+-----------+-------+-------------+
| a  | None      | Bob   | NaN         |
+----+-----------+-------+-------------+
| b  | None      | Jane  | NaN         |
+----+-----------+-------+-------------+
| c  | a         | John  | Bob         |
+----+-----------+-------+-------------+
| d  | b         | Patty | NaN         |
+----+-----------+-------+-------------+
| e  | a         | Sam   | Bob         |
+----+-----------+-------+-------------+

So, it appears to only process the first item in the name column.

In the words of Plato quoting Socrates: "WTF???"

mrgou
@jezrael Are you sure the marked dupe is the correct one? This question has only one DataFrame, not two. If you want to close it, maybe find a better dupe. – Shubham Sharma May 12 '21 at 09:55

3 Answers

3

We can map each parent_id to the corresponding name, using id as the lookup key:

df['parent_name'] = df['parent_id'].map(df.set_index('id')['name'])

  id parent_id   name parent_name
0  a      None    Bob         NaN
1  b      None   Jane         NaN
2  c         a   John         Bob
3  d         b  Patty        Jane
4  e         a    Sam         Bob
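
For anyone who wants to see the intermediate step, here is a minimal self-contained sketch of the same idea. The variable name `name_by_id` is mine, not part of the original answer, and the sketch assumes the values in `id` are unique (the lookup would not be well defined otherwise).

```python
import pandas as pd

df = pd.DataFrame({
    "id": ['a', 'b', 'c', 'd', 'e'],
    "parent_id": [None, None, 'a', 'b', 'a'],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

# Lookup Series: the index holds the ids, the values hold the names.
# (name_by_id is an illustrative name; ids are assumed to be unique.)
name_by_id = df.set_index("id")["name"]

# map() looks up each parent_id in that index; a parent_id with no
# matching id (here the None entries) ends up as a missing value.
df["parent_name"] = df["parent_id"].map(name_by_id)

print(df)
```

The point of building the lookup Series first is that map does one scalar lookup per row, so each row gets exactly one value back.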
Shubham Sharma
1

Try a merge

final = df.merge(df[["id", "name"]].rename(
    columns={"name": "parent_name"}),
    left_on="parent_id",
    right_on="id",
    how="left"
)
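
One thing to be aware of with the merge above: both sides have a column called id, so the result carries suffixed id_x and id_y columns. A possible variation, offered as a sketch rather than the one right way, is to rename the right-hand id to parent_id before merging. The name `parents` is just an illustrative helper.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ['a', 'b', 'c', 'd', 'e'],
    "parent_id": [None, None, 'a', 'b', 'a'],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

# Right-hand side of the join: one row per potential parent.
# Renaming id -> parent_id lets us join on a single shared column,
# so no id_x / id_y columns appear in the result.
parents = df[["id", "name"]].rename(
    columns={"id": "parent_id", "name": "parent_name"}
)

# Left join keeps every row of df, in its original order.
final = df.merge(parents, on="parent_id", how="left")

print(final)
```

Rows without a parent keep a missing value in parent_name, as in the desired output in the question.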
Kedar
0

I don't think that's how apply is meant to be used. You can use merge instead:

df['parent_name'] = df[['parent_id']].merge(df[['id', 'name']], left_on=['parent_id'], right_on=['id'], how='left')['name']

#   id parent_id   name parent_name
# 0  a      None    Bob         NaN
# 1  b      None   Jane         NaN
# 2  c         a   John         Bob
# 3  d         b  Patty        Jane
# 4  e         a    Sam         Bob
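
As for why the apply attempt misbehaves, here is a small sketch for inspecting what it actually returns. The lambda in the question returns a Series for every row (the filtered slice of name), not a single string, so as far as I can tell apply stacks those into a DataFrame instead of a Series. The variable name `result` is mine.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ['a', 'b', 'c', 'd', 'e'],
    "parent_id": [None, None, 'a', 'b', 'a'],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

# Same expression as in the question. Each call returns a Series
# (empty when parent_id is None), indexed by the parent's row label.
result = df["parent_id"].apply(lambda x: df["name"][df["id"] == x])

# Inspect the type and shape rather than assigning straight away.
print(type(result))
print(result)
```

The columns of that frame are the row labels of the matched parents, so Bob and Jane land in different columns. My assumption is that assigning this frame to the single column parent_name kept only its first column in the pandas version used in the question, which would explain why Patty's parent is missing; newer versions may raise an error on that assignment instead.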
Andreas