Example dataframe

patient_id  value_id  value
1           10        20
1           30        5
2           40        8

I'd like to transform this dataframe into a dictionary like this, mapping each patient_id to a list of (value_id, value) tuples:

{ 1: [(10, 20), (30, 5)], 2: [(40, 8)] }

I know I can use to_dict but what am I missing here?

adrian

1 Answer

I do not see any way that to_dict() can create what you want here. The following solution is not the most Pythonic (or Pandanic), but it is a way to get what you want:

d = {}
for pid, vid, v in df.itertuples(index=False):
    d.setdefault(pid, [])
    d[pid].append((vid, v))

The first line of the loop does nothing if a given patient_id is already in the output dict, and adds an empty list if it is not. Then the second line appends the values you want to the empty list, or the existing list if it's already there.
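For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the example dataframe from the question (the construction of `df` is my assumption about how the data is stored, with integer columns in the order shown) and applies the same two-step loop.

```python
import pandas as pd

# Rebuild the example dataframe from the question (assumed column order and dtypes).
df = pd.DataFrame(
    {
        "patient_id": [1, 1, 2],
        "value_id": [10, 30, 40],
        "value": [20, 5, 8],
    }
)

d = {}
# index=False drops the row index, so each row unpacks into exactly three fields.
for pid, vid, v in df.itertuples(index=False):
    d.setdefault(pid, [])    # add an empty list the first time a patient_id appears
    d[pid].append((vid, v))  # then append the (value_id, value) pair

# d now compares equal to {1: [(10, 20), (30, 5)], 2: [(40, 8)]}
```

Note that the unpacking relies on the dataframe having exactly these three columns in this order; with extra columns, select the three you need first.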

EDIT: This answer also uses iteration and also speculates that pandas has no native way to do this. I've updated my answer to use itertuples(), which is less memory-intensive than my original as_matrix().

William Welsh
  • I'll look at it. I was leaning towards a solution like this, but I have always been told that loops are bad, which is why I was wondering if there was a more concise way of doing it. – adrian Jan 25 '16 at 02:37
  • Because you have duplicated `patient_id`s, I am almost 100% sure you cannot do this with native pandas functions without reshaping your dataset in a way that would itself involve loops/iteration. If that's true, I would just use a loop here. – William Welsh Jan 25 '16 at 03:28