Let me preface this question by noting that the combined column is not a dictionary. The resulting dataframe has square brackets within the 'combined' column, so it looks like a list within the dataframe, in the format [key1:value1, key2:value2, etc].

I'm trying to convert my dataframe from this:

import pandas as pd
test = pd.DataFrame({'apples': ['red', 'green', 'yellow'],
                     'quantity': [1, 2, 3],
                     'tasteFactor': ['yum', 'yum', 'yuck']})

   apples  quantity tasteFactor
0     red         1         yum
1   green         2         yum
2  yellow         3        yuck

To this format, which combines the keys and values of each row into a new column:

   apples  quantity tasteFactor  combined
0     red         1         yum  ['apples':'red','quantity':'1','tastefactor':'yum']
1   green         2         yum  ['apples':'green','quantity':'2','tastefactor':'yum']
2  yellow         3        yuck  ['apples':'yellow','quantity':'3','tastefactor':'yuck']

Tried to turn the dataframe into a dictionary per row, but got stuck converting that into a list.

test['combined'] = test.to_dict(orient='records')

The resulting new column doesn't need to be an actual list type. It could be a string.

I previously asked this in "How to Create a List from a Dictionary within a DataFrame in Python", but wanted to make the title clearer in this question.

I found several closely related questions and tried variations of them, which got me halfway there, but I can't seem to get exactly the right format.

sweetnlow

2 Answers

You can do this by mapping a function over the rows of the dataframe, pairing each value with its column name:

import pandas as pd
df = pd.DataFrame({'apples':['red','green','yellow'], 'quantity':
[1,2,3],'tasteFactor':['yum','yum','yuck']})

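col_names = df.columns.tolist()

def func(row):
    # pair each value in the row with its column name as 'name:value'
    # (col_names is only read here, so no global statement is needed)
    return [str(name)+':'+str(value) for value,name in zip(row,col_names)]

# df.values.tolist() yields one plain list of values per row
x = list(map(func, df.values.tolist()))
df['combined'] = pd.Series(x)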
# df
#    apples  quantity tasteFactor                                       combined
# 0     red         1         yum      [apples:red, quantity:1, tasteFactor:yum]
# 1   green         2         yum    [apples:green, quantity:2, tasteFactor:yum]
# 2  yellow         3        yuck  [apples:yellow, quantity:3, tasteFactor:yuck]
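
If the goal is the quoted string format from the question rather than a list, the same idea can be sketched with DataFrame.apply along axis=1. This is only a sketch: combine_row is a hypothetical helper name, and the quoting mirrors the format the asker wrote out, with every value rendered as a quoted string.

```python
import pandas as pd

df = pd.DataFrame({'apples': ['red', 'green', 'yellow'],
                   'quantity': [1, 2, 3],
                   'tasteFactor': ['yum', 'yum', 'yuck']})

def combine_row(row):
    # row is a Series whose index holds the column names
    pairs = ["'{}':'{}'".format(key, value) for key, value in row.items()]
    # join the pairs and wrap them in square brackets
    return '[' + ','.join(pairs) + ']'

# axis=1 passes each row (not each column) to combine_row
df['combined'] = df.apply(combine_row, axis=1)
print(df['combined'][0])
```

The column names keep their original case here (tasteFactor), and the result is a plain string per row, not a list.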
Clock Slave
  • I ran this code... and got back apples quantity tasteFactor combined 0 red 1 yum (a, p, p, l, e, s) 1 green 2 yum (q, u, a, n, t, i, t, y) 2 yellow 3 yuck (t, a, s, t, e, F, a, c, t, o, r) – sweetnlow Aug 14 '17 at 21:03
  • Edited. Please check – Clock Slave Aug 14 '17 at 21:23
  • Thanks! Added single quotes to make it list_ = ['\''+str(b)+'\': \''+str(a)+'\'' for a,b in zip(row,col_names.values.tolist())] – sweetnlow Aug 14 '17 at 23:36

As you mentioned, the resulting new column doesn't need to be an actual list type, so you can map each row's index to a dictionary of that row:

di=test.T.to_dict()
test['Mapper']=test.index
test.assign(combined=test.Mapper.map(di)).drop(columns='Mapper')


Out[493]: 
   apples  quantity tasteFactor                                           combined
0     red         1         yum  {'apples': 'red', 'quantity': 1, 'tasteFactor'...
1   green         2         yum  {'apples': 'green', 'quantity': 2, 'tasteFacto...
2  yellow         3        yuck  {'apples': 'yellow', 'quantity': 3, 'tasteFact...

EDIT:

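di=test.T.to_dict()
test['Mapper']=test.index
test=test.assign(combined=test.Mapper.map(di).astype(str)).drop(columns='Mapper')
# assign back to the column; rebinding test itself would replace the dataframe with a Series
test['combined']=test.combined.str.replace('{','[',regex=False).str.replace('}',']',regex=False)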


test.combined[0]
Out[511]: "['apples': 'red', 'quantity': 1, 'tasteFactor': 'yum']"
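
A shorter variant, sketched under the assumption that the string form is all that is needed, starts from the to_dict(orient='records') call the question already tried and skips the helper Mapper column. The name records is just illustrative.

```python
import pandas as pd

test = pd.DataFrame({'apples': ['red', 'green', 'yellow'],
                     'quantity': [1, 2, 3],
                     'tasteFactor': ['yum', 'yum', 'yuck']})

# one dict per row, in row order
records = test.to_dict(orient='records')

# str(dict) is wrapped in braces; swap the outer pair for square brackets
test['combined'] = ['[' + str(rec)[1:-1] + ']' for rec in records]

print(test['combined'][0])
```

Slicing off only the first and last character leaves any braces inside the values untouched, which a blanket str.replace would not.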
BENY