
I have a DataFrame in which the values of the `features` column are dicts, as shown here:

http://screencast.com/t/0Ko0NIBLwo

   features                    name             price  rating  read reviews
9  {'Cooking...': '- S...', }  Master Chef...  $279.99   None  None      {}  

Example of one dict:

{u'Cooking Type': u'- Specialty Cooking', u'Cooking Area': u'- Backyard', u'Brand Name': u'- Pizzacraft', u'Fuel Type': u'- Propane', u'Product Type': u'- BBQ', u'Size': u'- Medium Size'}

Is it possible to turn these values into new columns, like this?

   features                    Cooking Type       Cooking Area       ... name             price  rating  read reviews
9  {'Cooking...': '- S...', }  Specialty Cooking   Backyard          ... Master Chef...    $279.99   None  None      {}  
Fabio Lamanna
SpanishBoy

1 Answer


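I think you can expand the dicts into columns, clean the values with `str.replace` and `str.strip`, and then `concat` the result with the other columns: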

import pandas as pd

print(df)
                                            features          name    price  \
0  {u'Cooking Type': u'- Specialty Cooking', u'Co...  Master Chef1  $279.99   
1  {u'Cooking Type': u'- Specialty Cooking', u'Co...  Master Chef3  $279.99   

  rating  read reviews  
0   None  None      {}  
1   None  None      {}  

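# Expand each dict into columns; reusing df.index keeps the rows aligned.
df1 = pd.DataFrame(df['features'].tolist(), index=df.index)

# Drop only the leading "-" marker, then trim the surrounding whitespace.
for col in df1.columns:
    df1[col] = df1[col].str.replace(r'^-', '', regex=True).str.strip()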

print(df1)
   Brand Name Cooking Area       Cooking Type Fuel Type Product Type  \
0  Pizzacraft     Backyard  Specialty Cooking   Propane          BBQ   
1  Pizzacraft     Backyard  Specialty Cooking   Propane          BBQ   

          Size  
0  Medium Size  
1  Medium Size  

# Put the new columns next to the remaining original ones.
df = pd.concat([df1, df[['name','price','rating','read','reviews']]], axis=1)
print(df)
   Brand Name Cooking Area       Cooking Type Fuel Type Product Type  \
0  Pizzacraft     Backyard  Specialty Cooking   Propane          BBQ   
1  Pizzacraft     Backyard  Specialty Cooking   Propane          BBQ   

          Size          name    price rating  read reviews  
0  Medium Size  Master Chef1  $279.99   None  None      {}  
1  Medium Size  Master Chef3  $279.99   None  None      {}  
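
The desired output in the question keeps the original `features` column, whereas the `concat` above drops it. A minimal self-contained sketch of that variant follows, assuming a shortened two-key sample modelled on the question's data (the sample frame and the names `expanded` and `result` are illustrative, not from the original post). It uses `DataFrame.join`, which aligns on the index:

```python
import pandas as pd

# Illustrative two-row sample modelled on the question's data (abbreviated).
df = pd.DataFrame({
    'features': [
        {'Cooking Type': '- Specialty Cooking', 'Cooking Area': '- Backyard'},
        {'Cooking Type': '- Specialty Cooking', 'Cooking Area': '- Backyard'},
    ],
    'name': ['Master Chef1', 'Master Chef3'],
    'price': ['$279.99', '$279.99'],
})

# Expand each dict into columns, reusing the original index so rows line up.
expanded = pd.DataFrame(df['features'].tolist(), index=df.index)

# Remove only the leading "- " marker; hyphens inside a value are kept.
expanded = expanded.apply(lambda s: s.str.replace(r'^\s*-\s*', '', regex=True))

# join aligns on the index and leaves the original 'features' column in place.
result = df.join(expanded)
print(result)
```

If the `features` column is not needed afterwards, `result.drop(columns='features')` should give the same layout as the `concat` version, apart from column order.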
jezrael
  • How can I deal with `NaN` and `{}` values? `df1 = pd.DataFrame([x for x in df['FEATURES'] if isinstance(x, dict) and x], index=df.index)` raised an error: *ValueError: Shape of passed values is (17, 66), indices imply (17, 105)* – SpanishBoy Feb 25 '16 at 14:02
  • Yes, that is a problem. One solution is to replace `NaN` with `{}` - [see this answer](http://stackoverflow.com/a/34991815/2901002) – jezrael Feb 25 '16 at 14:20
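
The `ValueError` in the comment appears to come from the `if` filter: it drops rows from the list, so the list ends up shorter than `df.index`. A minimal sketch of the suggested workaround, assuming a small made-up sample frame (the data and the name `records` are illustrative), substitutes an empty dict for every non-dict value instead of filtering rows out:

```python
import numpy as np
import pandas as pd

# Illustrative sample: one real dict, one NaN, one empty dict.
df = pd.DataFrame({
    'FEATURES': [{'Size': '- Medium Size'}, np.nan, {}],
    'name': ['a', 'b', 'c'],
})

# Swap anything that is not a dict (e.g. NaN) for an empty dict, so every row
# still contributes one record and the list length matches the index.
records = [x if isinstance(x, dict) else {} for x in df['FEATURES']]

# Rows that had NaN or {} simply get NaN in every expanded column.
df1 = pd.DataFrame(records, index=df.index)
print(df1)
```

The cleaning loop from the answer should still work on `df1` afterwards, since the `.str` methods pass `NaN` through unchanged as long as the column holds at least some strings.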