
I have a nested list with a structure similar to this, except that it's much longer:

mylist = [ ["Bob", "12-01 2:30"], ["Sal", "12-01 5:23"], ["Jill", "12-02 1:28"] ]

My goal is to create another nested list that groups together all elements that have the same date. So, the following output is desired:

newlist = [  [["Bob", "12-01 2:30"], ["Sal", "12-01 5:23"]], [["Jill", "12-02 1:28"]]  ]

Above, all items with the date 12-01, regardless of time, are combined, and all elements of 12-02 are combined.

I've sincerely been researching how to do this for the past hour and can't find anything. I'm also a beginner at programming, so I'm not skilled enough to create my own solution. Please don't think that I haven't done any research or put any effort into this problem myself. I'll add a few links as examples of my research below:

Collect every pair of elements from a list into tuples in Python

Create a list of tuples with adjacent list elements if a condition is true

How do I concatenate two lists in Python?

Concatenating two lists of Strings element wise in Python without Nested for loops

Zip two lists together based on matching date in string

How to merge lists into a list of tuples?

martineau
F16Falcon

4 Answers


Use a dict, or an OrderedDict if the order matters, to group the data by date.

from collections import defaultdict  # works like {}.setdefault(), but more convenient

mylist = [["Bob", "12-01 2:30"], ["Sal", "12-01 5:23"], ["Jill", "12-02 1:28"]]
record_dict = defaultdict(list)

# Iterate over the list and group every item by its date.
for data in mylist:
    _, timestamp = data
    date, _ = timestamp.split(" ")  # "12-01 2:30" -> "12-01"
    record_dict[date].append(data)

res_list = list(record_dict.values())
print(res_list)

output:
[[['Bob', '12-01 2:30'], ['Sal', '12-01 5:23']], [['Jill', '12-02 1:28']]]
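Since the comment above compares defaultdict to setdefault(), here is a minimal sketch of the same grouping with a plain dict, in case that comparison is unclear. It assumes Python 3.7+, where a regular dict keeps insertion order, and it assumes every timestamp has the date before the first space. The name `groups` is only illustrative.

```python
mylist = [["Bob", "12-01 2:30"], ["Sal", "12-01 5:23"], ["Jill", "12-02 1:28"]]

groups = {}
for item in mylist:
    # Assumption: the date is the text before the first space, e.g. "12-01".
    date = item[1].split(" ")[0]
    # setdefault returns the existing list for this date, or stores a new one.
    groups.setdefault(date, []).append(item)

res_list = list(groups.values())
print(res_list)
```

The groups come out in the order in which each date first appears in the input.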

DustyPosa

A pure list-based solution as an alternative to the accepted dictionary-based solution. It offers the additional feature of easily sorting the whole list, first by date, then by time, then by name.

from itertools import groupby

mylist = [["Bob", "12-01 2:30"], ["Sal", "12-01 5:23"], ["Jill", "12-02 1:28"]]

newlist = [dt.split() + [name] for (name, dt) in mylist]
newlist.sort() # can be removed if the initial data is already sorted by date
newlist = [list(group) for (date, group) in groupby(newlist, lambda item:item[0])]

# result:
# [[['12-01','2:30','Bob'], ['12-01','5:23','Sal']], [['12-02','1:28','Jill']]]

If you really want the same item format as the initial list, replace the last line above (the groupby line) with a double iteration:

newlist = [[[name, date + ' ' + time] for (date, time, name) in group]
           for (date, group) in groupby(newlist, lambda item:item[0])]

# result:
# [[['Bob', '12-01 2:30'], ['Sal', '12-01 5:23']], [['Jill', '12-02 1:28']]]
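One caveat: the sort above compares strings, so a time such as "10:05" would sort before "2:30". If numeric order matters, a sort key along the following lines might help. This is only a sketch; `sort_key`, `date_of` and `sample` are hypothetical names, and it assumes every timestamp has the form "MM-DD H:MM" with zero-padded dates.

```python
from itertools import groupby

def sort_key(item):
    """Return (month, day, hour, minute, name) so times compare numerically."""
    name, stamp = item
    date, time = stamp.split()
    month, day = map(int, date.split("-"))
    hour, minute = map(int, time.split(":"))
    return (month, day, hour, minute, name)

def date_of(item):
    """Return the date part of the timestamp, e.g. "12-01"."""
    return item[1].split()[0]

# Hypothetical sample data with a two-digit hour to show the difference.
sample = [["Ann", "12-01 10:05"], ["Bob", "12-01 2:30"], ["Jill", "12-02 1:28"]]
sample.sort(key=sort_key)

# groupby only merges adjacent items, which the sort guarantees here.
grouped = [list(group) for _, group in groupby(sample, key=date_of)]
print(grouped)
```

Here the original item format is kept, so no second pass is needed to rebuild the items.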
sciroccorics

If you don't mind going heavy on your memory usage, you can try using a dictionary. You can use the date as the key and make a list of values.

all_items = {}
for line in mylist:
    name, timestamp = line
    date, time = timestamp.split()
    try:
        all_items[date].append(line)
    except KeyError:
        # First item seen for this date: start a new list.
        all_items[date] = [line]

Then you can build the new list by iterating over the sorted dates, which are the dictionary keys.
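That last step could look roughly like this. It is a sketch that assumes the dates are zero-padded "MM-DD" strings within a single year, so that sorting them as text also sorts them chronologically. The input order is shuffled here on purpose to show the effect of sorting.

```python
mylist = [["Jill", "12-02 1:28"], ["Bob", "12-01 2:30"], ["Sal", "12-01 5:23"]]

all_items = {}
for line in mylist:
    date = line[1].split()[0]
    all_items.setdefault(date, []).append(line)

# sorted() on a dict iterates over its keys, i.e. the date strings.
newlist = [all_items[date] for date in sorted(all_items)]
print(newlist)
```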

Seraph Wedd

If all of the elements with the same date are consecutive, you can use itertools.groupby:

from itertools import groupby
[list(group) for _, group in groupby(data, lambda value: ...)]
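Note that groupby yields (key, group) pairs, so each group has to be turned into a list separately. A runnable sketch for the data in the question might be the following, where the key function `date_part` is an assumption (the date is taken to be the text before the space) and is not part of the answer:

```python
from itertools import groupby

data = [["Bob", "12-01 2:30"], ["Sal", "12-01 5:23"], ["Jill", "12-02 1:28"]]

def date_part(value):
    """Assumed key: the date before the space, e.g. "12-01"."""
    return value[1].split()[0]

# Only consecutive items with the same key end up in the same group.
newlist = [list(group) for _, group in groupby(data, key=date_part)]
print(newlist)
```

If equal dates are not adjacent in the input, sort the data by date first, otherwise the same date shows up in several groups.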
Solomon Ucko