Edit: I have updated the dummy DataFrame.

I have a pandas DataFrame with 200 rows and a column like the one below.

Let's say the DataFrame is named `data`.

| B |
|---|
| {'animal':'cat', 'bird':'peacock'...} |

I want to extract the value of `animal` into a separate column `C` for every row.

I tried the code below, but it doesn't work.

data['C'] = data["B"].apply(lambda x: x.split(':')[-2] if ':' in x else x)

Please help.

Ann09
  • please provide the output of `data['B'].head().to_dict()`, it is currently ambiguous – mozway Nov 13 '22 at 16:28
  • Since the data is sensitive, I am unable to share the output. However, I can see that it is converted into a dictionary – Ann09 Nov 13 '22 at 16:38
  • Can you transform the output you have in dummy data keeping exactly the same format? And please provide the matching (dummy) expected output – mozway Nov 13 '22 at 16:39
  • It's in this form: `{0: {'animal': 'cat', 'bird': 'peacock'} ` – Ann09 Nov 13 '22 at 16:47
  • Then does `df['B'].str['animal']` give you what you want? – mozway Nov 13 '22 at 16:49
  • It's returning None: `0 None 1 None 2 None` – Ann09 Nov 13 '22 at 16:52
  • Then, the example you provided is not well crafted. You **must** provide a reproducible example, else you're wasting everyone's time and the question should rather be closed. – mozway Nov 13 '22 at 16:55
  • @mozway You seem to have trouble with explaining the idea and concept of a [mre]. You should not try; and link instead. ;-) – Yunnosch Nov 13 '22 at 17:08
  • @Yunnosch trouble? I'd rather say weariness… this [minimal reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) link is more appropriate for pandas ;) – mozway Nov 13 '22 at 17:19
  • Perfect. I did not know there are tailored explanations, but if you know one, use it. Just in case you are interested, I made one for Sqlite, found at the tag wiki for sqlite. (So yes, I kind of did know ... that there is one for Sqlite ;-) ) – Yunnosch Nov 13 '22 at 17:20
  • Well, I often answer from my phone, and it's not always practical to go fetch the links. Anyway, enough off-topic. I'll just vote to close as OP doesn't provide reproducible data. – mozway Nov 13 '22 at 17:28
  • Yunnosch, thank you. @mozway, thank you for your help and patience. It seems I was dealing with a nested dictionary; I am getting the correct output now. – Ann09 Nov 13 '22 at 17:29
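
For later readers, here is a minimal sketch of what the comments above point to. It assumes each cell of `B` is a dict whose key `0` holds the inner dict, as reported in the comments; the second row and the helper `get_animal` are made up for illustration. A lookup of `'animal'` on the outer dict finds nothing, which would explain the `None` values, while going through the outer key first reaches the value.

```python
import pandas as pd

# Dummy data in the shape reported in the comments (an assumption about the
# real, undisclosed data): each cell of B is a dict keyed by 0.
data = pd.DataFrame({
    "B": [
        {0: {"animal": "cat", "bird": "peacock"}},
        {0: {"animal": "dog", "bird": "parrot"}},
    ]
})


def get_animal(cell, outer_key=0, inner_key="animal"):
    """Return cell[outer_key][inner_key], or None if either level is missing."""
    inner = cell.get(outer_key) if isinstance(cell, dict) else None
    return inner.get(inner_key) if isinstance(inner, dict) else None


# Looking up 'animal' on the outer dict alone finds no such key.
outer_only = data["B"].apply(lambda cell: cell.get("animal"))

# Going through the outer key first reaches the nested value.
data["C"] = data["B"].apply(get_animal)
```

The helper returns `None` instead of raising when a row has a different shape, which may be convenient with 200 rows of uncertain structure.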

2 Answers


I'm not totally sure of the structure of your data. Does this look right?

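import re

import pandas as pd

# Dummy data: assumes each cell of B is a string of the form "'key':'value'"
df = pd.DataFrame({
   "B": ["'animal':'cat'", "'bird':'peacock'"]
})

# Keep everything after the first colon (the quotes around the value are kept)
df["C"] = df.B.apply(lambda x: re.sub(r".*?:(.*$)", r"\1", x))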
AndS.
  • Hi, It currently throws an error. `expected string or bytes-like object` The dataset type is pandas.core.series.Series. – Ann09 Nov 13 '22 at 16:49
  • yeah it looks like the data is formatted differently. Can you update with exactly how the data is formatted? Maybe make a dummy dataset like I did here. – AndS. Nov 13 '22 at 16:51
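
A small sketch of how the mismatch could be diagnosed, assuming the cells are dicts of the shape reported in the question's comments rather than strings. It lists the Python types stored in the column and reproduces the `TypeError` that `re.sub` raises when it is handed a dict.

```python
import re

import pandas as pd

# Dummy data in the nested-dict shape reported in the question's comments
# (an assumption about the real data).
data = pd.DataFrame({"B": [{0: {"animal": "cat", "bird": "peacock"}}]})

# Which Python types are actually stored in the column?
cell_types = data["B"].apply(type).unique().tolist()

# re.sub only accepts str or bytes, so a dict cell raises a TypeError.
try:
    re.sub(r".*?:(.*$)", r"\1", data["B"].iloc[0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(cell_types)
print(error_message)
```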

The nested dictionary can be unpacked with `pd.json_normalize`:

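import pandas as pd

# Dummy data: each cell of B is a dict whose key 0 holds the inner dict
data = pd.DataFrame({'B': [{0: {'animal': 'cat', 'bird': 'peacock'}}]})

# json_normalize flattens nested keys into dotted column names such as '0.animal'
data['C'] = pd.json_normalize(data['B'])['0.animal']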
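
To see where the column name `'0.animal'` comes from, the flattened frame can be inspected before assigning. This is only a sketch on the same dummy data; the intermediate name `flat` and the `.tolist()` call (which hands `json_normalize` a plain list of dicts) are choices made here, not part of the answer above.

```python
import pandas as pd

# Same dummy data as above: each cell of B is a dict keyed by 0.
data = pd.DataFrame({"B": [{0: {"animal": "cat", "bird": "peacock"}}]})

# Flatten the nested dicts; nested keys are joined with '.' by default.
flat = pd.json_normalize(data["B"].tolist())
print(flat.columns.tolist())

# Both frames use the default 0..n-1 index here, so the rows line up.
data["C"] = flat["0.animal"]
```

If `data` has a non-default index, the assignment aligns on index labels, so the rows may need to be matched by position instead.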
Сергей Кох