0

I'm trying to create a dictionary from a DataFrame, but I'm not getting the desired output:

DataFrame:

A      B   C    D  
0.0   0.0 NaN  NaN 
0.0   0.0 NaN  NaN 
0.0   0.0 NaN  NaN 
0.0   0.0 NaN  NaN 
0.0   0.0 NaN  NaN 

data_dict1 = adsl.to_dict('list')

Current output: {'A': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}

Desired output: {'A': {0.0, 0.0, 0.0, 0.0, 0.0, 0.0}}

The only difference is square brackets instead of curly braces.

Mann
  • You realize that `{0.0, 0.0, 0.0, 0.0, 0.0, 0.0}` is not a valid Python representation, right? More precisely, it's equivalent to just `{0.0}`. – Quang Hoang Jul 24 '20 at 20:47
  • Yes, sure. I'm trying to replicate the output of the code below, but my input is a CSV that I'm converting to a DataFrame: `dataset_dict2 = {name: set(choice(1000, 700, replace=False)) for name in islice(letters, 6)}` – Mann Jul 24 '20 at 20:53
  • @QuangHoang It's just a representation of Yes/No (1/0) values. – Mann Jul 24 '20 at 20:57

3 Answers

2

If you have an example df, created from a dict:

import pandas as pd

data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df = pd.DataFrame.from_dict(data)

You can do:

data_dict = df.to_dict('dict')

data_dict will be:

{'col_1': {0: 3, 1: 2, 2: 1, 3: 0}, 'col_2': {0: 'a', 1: 'b', 2: 'c', 3: 'd'}}

If you want to keep only col_1, you can delete col_2 from data_dict with:

data_dict.pop('col_2', None)

Your new data_dict will be:

{'col_1': {0: 3, 1: 2, 2: 1, 3: 0}}
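As a possible alternative, here is a minimal sketch that assumes the same example df as above: selecting the column before converting avoids building the extra key and then popping it. The name only_col_1 is just illustrative.

```python
import pandas as pd

data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df = pd.DataFrame.from_dict(data)

# Double brackets keep a DataFrame (not a Series), so the result is
# still keyed by column name.
only_col_1 = df[['col_1']].to_dict('dict')
print(only_col_1)
```

Both routes should end with the same dictionary; this one simply never creates the col_2 entry.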
zabop
1

Your current output is already a dictionary, mapping 'A' to [0.0,0.0,....].

Note that a set cannot hold duplicates, so the following would collapse to {'A': {0.0}} rather than keep six zeros:

{'A':{0.0,0.0,....}}

But

data_dict = df.to_dict()

should give you what you are looking for: a nested dictionary of the form {column: {index: value}}.
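To make the difference concrete, here is a small sketch. The frame is a hypothetical stand-in for the asker's adsl (the real data is not shown in full), and the names collapsed and data_dict are only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame.
df = pd.DataFrame({'A': [0.0, 0.0, 0.0], 'B': [0.0, 1.0, 0.0]})

# A set literal silently drops duplicate values.
collapsed = {0.0, 0.0, 0.0}
print(collapsed)  # {0.0}

# The default orientation nests one dict per column, keyed by index label,
# so every row survives even when the values repeat.
data_dict = df.to_dict()
print(data_dict)
```

The curly braces in the result come from the inner dicts, not from sets, which is why repeated values are preserved.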

nav610
0

Based on your comment reply, it seems you ARE looking for a unique set of values for each column. Try:

data_dict1 = adsl.to_dict('list') # which you already have, then...
data_dict1 = {key: set(vals) for key, vals in data_dict1.items()}

This will give you what you're asking for, but because sets are unordered and drop duplicates, you will lose the row order of the DataFrame.
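If first-seen order matters, one possible workaround is to deduplicate with dict.fromkeys instead of set. This is only a sketch on hypothetical data (the column values below are made up, and as_ordered is an illustrative name), not something the original code does.

```python
import pandas as pd

# Hypothetical data; the values are illustrative only.
df = pd.DataFrame({'A': [2.0, 0.0, 2.0, 1.0], 'B': [0.0, 0.0, 1.0, 1.0]})

as_lists = df.to_dict('list')

# Unique values per column as sets (iteration order is not guaranteed).
as_sets = {key: set(vals) for key, vals in as_lists.items()}

# Unique values per column as lists, keeping first-seen order.
as_ordered = {key: list(dict.fromkeys(vals)) for key, vals in as_lists.items()}

print(as_sets)
print(as_ordered)
```

The trade-off is that the result holds lists rather than sets, so set operations such as intersection would need an explicit conversion.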

RichieV