Can someone please help me understand this?
Suppose we have this DataFrame:
import pandas as pd

df = pd.DataFrame({
    "id": ['a', 'b', 'c', 'd', 'e'],
    "parent_id": [None, None, 'a', 'b', 'a'],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})
Now, I want to retrieve the parent name next to each child name like this:
+----+-----------+-------+-------------+
| id | parent_id | name  | parent_name |
+----+-----------+-------+-------------+
| a  | None      | Bob   | NaN         |
+----+-----------+-------+-------------+
| b  | None      | Jane  | NaN         |
+----+-----------+-------+-------------+
| c  | a         | John  | Bob         |
+----+-----------+-------+-------------+
| d  | b         | Patty | Jane        |
+----+-----------+-------+-------------+
| e  | a         | Sam   | Bob         |
+----+-----------+-------+-------------+
So I tried this:
df['parent_name'] = None
df['parent_name'] = df['parent_id'].apply(lambda x: df['name'][df['id']==x])
But here's what I get:
+----+-----------+-------+-------------+
| id | parent_id | name  | parent_name |
+----+-----------+-------+-------------+
| a  | None      | Bob   | NaN         |
+----+-----------+-------+-------------+
| b  | None      | Jane  | NaN         |
+----+-----------+-------+-------------+
| c  | a         | John  | Bob         |
+----+-----------+-------+-------------+
| d  | b         | Patty | NaN         |
+----+-----------+-------+-------------+
| e  | a         | Sam   | Bob         |
+----+-----------+-------+-------------+
So it appears to match only the first item in the name column (Bob): rows whose parent is Bob are filled in, but row d, whose parent is Jane, gets NaN.
In the words of Plato quoting Socrates: "WTF???"
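
For reference, here is a minimal sketch that may help narrow this down. It first shows that the expression inside the lambda returns a one-element Series carrying the original row label, not a plain string, which is a likely reason the per-row results do not line up as a single column. It then shows one possible way to get the expected table with Series.map. The names lookup and id_to_name are made up for illustration, and the approach assumes the values in id are unique.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "b", "c", "d", "e"],
    "parent_id": [None, None, "a", "b", "a"],
    "name": ["Bob", "Jane", "John", "Patty", "Sam"],
})

# Boolean indexing yields a one-element Series that keeps the original
# row label (1 for "Jane"), not a scalar string.
lookup = df["name"][df["id"] == "b"]
print(type(lookup).__name__, lookup.index.tolist(), lookup.tolist())

# Hypothetical helper: a Series indexed by id whose values are names.
# Assumes every id occurs only once.
id_to_name = df.set_index("id")["name"]

# map() looks each parent_id up in that index; a parent_id with no
# matching id (here None) comes back as NaN.
df["parent_name"] = df["parent_id"].map(id_to_name)
print(df)
```

A self-merge on parent_id and id would be another option; map is used here only because a single column is being looked up.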