How to extract a value after colon in all the rows from a pandas dataframe column?

Question

Edit: the dummy dataframe has been updated.

I have a pandas DataFrame with 200 rows and a column like the one below.

Let's say the DataFrame is named `data`.

-----------------------------------|
B
-----------------------------------|
{'animal':'cat', 'bird':'peacock'...}

I want to extract the value of animal to a separate column C for all the rows.

I tried the code below, but it doesn't work.

data['C'] = data["B"].apply(lambda x: x.split(':')[-2] if ':' in x else x)

Please help.

Please provide the output of `data['B'].head().to_dict()`; the question is currently ambiguous. – mozway Nov 13 '22 at 16:28
Since it is sensitive data, I am unable to share the output. However, I can see that it is getting converted into a dictionary. – Ann09 Nov 13 '22 at 16:38
Can you transform the output you have into dummy data, keeping exactly the same format? And please provide the matching (dummy) expected output. – mozway Nov 13 '22 at 16:39
It's in this form: `{0: {'animal': 'cat', 'bird': 'peacock'}` – Ann09 Nov 13 '22 at 16:47
Then the example you provided is not well crafted. You **must** provide a reproducible example, else you're wasting everyone's time and the question should rather be closed. – mozway Nov 13 '22 at 16:55
@mozway You seem to have trouble explaining the idea and concept of a [mre]. You should not try, and link instead. ;-) – Yunnosch Nov 13 '22 at 17:08
@Yunnosch trouble? I'd rather say weariness… this [minimal reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) link is more appropriate for pandas ;) – mozway Nov 13 '22 at 17:19
Perfect. I did not know there are tailored explanations, but if you know one, use it. In case you are interested, I made one for Sqlite, found in the tag wiki for sqlite. (So yes, I kind of did know ... that there is one for Sqlite ;-) ) – Yunnosch Nov 13 '22 at 17:20
Well, I often answer from my phone, and it's not always practical to go fetch the links. Anyway, enough off-topic. I'll just vote to close as OP doesn't provide reproducible data. – mozway Nov 13 '22 at 17:28
Yunnosch, thank you. @mozway, thank you for your help and patience. It seems I was dealing with a nested dictionary; I am getting the correct output now. – Ann09 Nov 13 '22 at 17:29

score 0 · Answer 1 · answered Nov 13 '22 at 16:37

I'm not totally sure of the structure of your data. Does this look right?

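import re

import pandas as pd

# Dummy data: each cell is a plain string of the form "'key':'value'"
df = pd.DataFrame({
    "B": ["'animal':'cat'", "'bird':'peacock'"]
})

# Keep everything after the first colon; the quotes stay, e.g. "'cat'"
df["C"] = df["B"].apply(lambda x: re.sub(r".*?:(.*$)", r"\1", x))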
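If the cells turn out to be strings that each hold a whole dict literal (an assumption, since the question never shows the raw values), a regex gets fragile. Here is a minimal sketch that parses each string with `ast.literal_eval` and then looks up the key. The second row and its `dog`/`crow` values are made up for illustration.

```python
import ast

import pandas as pd

# Hypothetical data: each cell is the *text* of a dict, not a dict object.
df = pd.DataFrame({
    "B": [
        "{'animal': 'cat', 'bird': 'peacock'}",
        "{'animal': 'dog', 'bird': 'crow'}",
    ]
})

# Parse each string into a real dict, then look up the key.
# dict.get returns None when "animal" is missing instead of raising.
df["C"] = df["B"].apply(lambda s: ast.literal_eval(s).get("animal"))
```

This only helps when the values really are strings. If they are already dict objects, `ast.literal_eval` will raise an error and a direct dict lookup is the better fit.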

AndS.

Hi, it currently throws an error: `expected string or bytes-like object`. The dataset type is pandas.core.series.Series. – Ann09 Nov 13 '22 at 16:49
yeah it looks like the data is formatted differently. Can you update with exactly how the data is formatted? Maybe make a dummy dataset like I did here. – AndS. Nov 13 '22 at 16:51

score 0 · Accepted Answer · answered Nov 13 '22 at 17:38

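You can unpack the nested dictionary with `pd.json_normalize`. It flattens nested keys into column names joined by a dot, so the value ends up in the column `'0.animal'`.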

import pandas as pd

data = pd.DataFrame({'B': [{0: {'animal': 'cat', 'bird': 'peacock'}}]})

data['C'] = pd.json_normalize(data['B'])['0.animal']
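For comparison, here is a sketch of two ways to do the same lookup without `json_normalize`, assuming every cell is a dict shaped like the one above. The second row is invented for illustration, and `C_str` is a hypothetical column name.

```python
import pandas as pd

# First row matches the example above; the second row is made up.
data = pd.DataFrame({
    'B': [
        {0: {'animal': 'cat', 'bird': 'peacock'}},
        {0: {'animal': 'dog', 'bird': 'crow'}},
    ]
})

# Option 1: explicit per-row lookup. The chained .get calls return None
# instead of raising KeyError when the 0 or 'animal' key is missing.
data['C'] = data['B'].apply(lambda d: d.get(0, {}).get('animal'))

# Option 2: the .str accessor can also index into dict elements,
# first by the outer key 0, then by the inner key 'animal'.
data['C_str'] = data['B'].str[0].str['animal']
```

The `apply` version is the more explicit of the two and makes the handling of missing keys visible. The `.str` chain is shorter but relies on the accessor working on non-string elements.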

Сергей Кох

It's a waste when you can use `df['B'].str['animal']` directly ;) – mozway Nov 13 '22 at 17:42
Clean. Else have to fetch the right output in two lines of code. – Ann09 Nov 13 '22 at 17:47
@mozway df['B'].str['animal'] returns None – Сергей Кох Nov 13 '22 at 17:48
@Сергей my bad, `df['B'].str[0].str['animal']` – mozway Nov 13 '22 at 17:49
Yes, because "animal" was in a nested dictionary. – Ann09 Nov 13 '22 at 17:50
@Ann09 please update your question to provide the reproducible example – mozway Nov 13 '22 at 17:51
I have edited the question. Hope it's all OK now. – Ann09 Nov 13 '22 at 18:07
