-2

I have a list of tuples, each tuple has patient and visit, patient can have several visits

I want to get list of patient and for every patient the list of their visits

for example

[(patient1, visit), (patient2, visit), (patient1, visit)]

To

[(patient1, [visit, visit]), (patient2, [visit])]

I tried javascript's reduce function approach, but I can't really understand how I can do it in python

rv.kvetch
  • 9,940
  • 3
  • 24
  • 53
Israel kusayev
  • 833
  • 8
  • 20

3 Answers3

1

The defaultdict approach is the standard way and has linear complexity. You can also just use a common dict and dict.setdefault

d = {}
for patient, visit in data:
    d.setdefault(patient, []).append(visit)
[*d.items()]
# [('patient1', ['visit', 'visit']), ('patient2', ['visit'])]

For a one-line approach (excluding imports) - albeit only log-linear, you can use itertools.groupby:

from itertools import groupby
from operator import itemgetter as ig

[(k, [*map(ig(1), g)]) for k, g in groupby(sorted(data), key=ig(0))]
# [('patient1', ['visit', 'visit']), ('patient2', ['visit'])]

Some useful docs:

user2390182
  • 72,016
  • 6
  • 67
  • 89
0

You can use a collections.defaultdict in the following way:

from collections import defaultdict

d = defaultdict(list)
for patient, visit in data:
    d[patient].append(visit)
a_guest
  • 34,165
  • 12
  • 64
  • 118
  • I want the `d` to contain the patient itself – Israel kusayev Oct 26 '21 at 13:14
  • 1
    @Israelkusayev You can then use `list(d.items())` to get the format you describe in your question. – a_guest Oct 26 '21 at 13:19
  • @Israelkusayev `d` does contain the patient. It's the key of each entry. @Matthias 's comment points out that a dict matches your data better, because it guarantees that the patient ids will be unique. The list of tuples that you say you want would not prevent you from having the same patient twice, and so you would have to code your own check to guarantee that when you want to update the data. – BoarGules Oct 26 '21 at 13:19
  • This is the right approach. In fact, in the latter result from above, that is actually *directly convertible* to a dict. That is, you can pass that list of key-value pairs to `dict()` and it will happily accept it, because the end data is the same, more or less. – rv.kvetch Oct 26 '21 at 13:23
0

Example Using itertools.groupby

from itertools import groupby

# Example data
records = [('bill', '1/1/2021'), ('mary', '1/2/2021'), ('janet', '1/3/2021'), ('bill', '3/5/2021'), ('mary', '4/25/2021')]

# Group visits by patient names
g = groupby(sorted(records), lambda kv: kv[0])  # Group based upon first element of tuples (i.e. name)
                                                # Sort so names are adjacent for groupby
    
# Using list comprehension on groupings to provided desired tuples
result =  [(name, [d[1] for d in visit]) for name, visit in g]

Above code as a one-liner

result = [(name, [d[1] for d in visit]) for name, visit in  groupby(sorted(records), lambda kv: kv[0])]
DarrylG
  • 16,732
  • 2
  • 17
  • 23
  • Note that this requires the `partient`'s class to implement a total ordering. For strings, no problem, but perhaps these are more complex objects such as a frozen dataclass. The dict approach only requires these objects to be hashable which seems much more natural for patient data than to define a total ordering. – a_guest Oct 26 '21 at 13:57