import datetime
dic1 = [datetime.datetime(2014, 2, 4, 17, 48, 4), datetime.datetime(2014, 2, 4, 17, 48, 4), datetime.datetime(2014, 2, 4, 17, 58, 18), datetime.datetime(2014, 2, 4, 17, 58, 18), datetime.datetime(2014, 2, 5,
 1, 8, 13), datetime.datetime(2014, 2, 5, 1, 8, 13), datetime.datetime(2014, 2, 5, 1, 8, 45), datetime.datetime(2014, 2, 5, 1, 8, 45), datetime.datetime(2014, 2, 5, 15, 40, 54), datetime.datetime(2014
, 2, 5, 15, 40, 54), datetime.datetime(2014, 2, 5, 15, 49, 41)]

dic2 = [datetime.datetime(2014, 2, 5, 15, 49, 41), datetime.datetime(2014, 2, 5, 17, 43, 26), datetime.datetime(2014, 2, 5, 17, 43, 26), datetime.datetime(2014, 2, 5, 22, 36), datetime.datetime(2014, 2, 5, 22, 36), datetime.datetime(2014, 2, 6, 15, 26, 54), datetime.datetime(2014, 2, 6, 15, 26, 54), datetime.datetime(2014, 2, 6, 21, 19, 42),
datetime.datetime(2014, 2, 6, 21, 19, 42), datetime.datetime(2014, 2, 7, 0, 9, 3), datetime.datetime(2014, 2, 7, 0, 9, 3), datetime.datetime(2014, 2, 7, 16, 15, 11), datetime.datetime(2014, 2, 7, 16,
15, 11), datetime.datetime(2014, 2, 7, 16, 33, 33)]

for i in dic1:
    print i, " source is dic1"

print "--"
for i in dic2:
    print i, " source is dic2"

This outputs data like this:

2014-02-04 17:48:04  source is dic1
2014-02-04 17:48:04  source is dic1
2014-02-04 17:58:18  source is dic1
2014-02-04 17:58:18  source is dic1
2014-02-05 01:08:13  source is dic1
2014-02-05 01:08:13  source is dic1
2014-02-05 01:08:45  source is dic1
2014-02-05 01:08:45  source is dic1
2014-02-05 15:40:54  source is dic1
2014-02-05 15:40:54  source is dic1
2014-02-05 15:49:41  source is dic1
--
2014-02-05 15:49:41  source is dic2
2014-02-05 17:43:26  source is dic2
2014-02-05 17:43:26  source is dic2
2014-02-05 22:36:00  source is dic2
2014-02-05 22:36:00  source is dic2
2014-02-06 15:26:54  source is dic2
2014-02-06 15:26:54  source is dic2
2014-02-06 21:19:42  source is dic2
2014-02-06 21:19:42  source is dic2
2014-02-07 00:09:03  source is dic2
2014-02-07 00:09:03  source is dic2
2014-02-07 16:15:11  source is dic2
2014-02-07 16:15:11  source is dic2
2014-02-07 16:33:33  source is dic2

What I am trying to do is combine the two lists in chronological order while preserving the source of each entry, as in the example below. Is there a way to do this?

2014-02-07 16:15:11  source is dic1 
2014-02-07 16:33:33  source is dic2
2014-02-07 18:09:03  source is dic1
2014-02-07 20:15:11  source is dic1

2 Answers

You'd produce (datetime, source) tuples and merge the two lists; there is no need to sort if you merge intelligently, because each input is already in chronological order. Here is an iterable merger I wrote for a different answer; it can be made more efficient still using heapq, but this version is more readable:

import operator

def mergeiter(*iterables, **kwargs):
    """Given a set of sorted iterables, yield the next value in merged order

    Takes an optional `key` callable to compare values by.
    """
    iterables = [iter(it) for it in iterables]
    iterables = {i: [next(it), i, it] for i, it in enumerate(iterables)}
    if 'key' not in kwargs:
        key = operator.itemgetter(0)
    else:
        key = lambda item, key=kwargs['key']: key(item[0])

    while True:
        value, i, it = min(iterables.values(), key=key)
        yield value
        try:
            iterables[i][0] = next(it)
        except StopIteration:
            del iterables[i]
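            if not iterables:
                # All inputs are exhausted: end the generator cleanly.
                # (A bare raise here becomes a RuntimeError on Python 3.7+.)
                return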

You'd use it like this:

source1 = ((dt, 'source is dic1') for dt in dic1)
source2 = ((dt, 'source is dic2') for dt in dic2)

for dt, source in mergeiter(source1, source2):
    print dt, source

The source1 and source2 inputs are generator expressions; they only produce values as they are iterated over. Looping over mergeiter() draws values from both generators in chronological order, each with its source attached.

This is also very memory efficient: no copies are made of the input lists, and only enough data is kept in memory to determine the next value to output.

For your sample data, this produces:

>>> source1 = ((dt, 'source is dic1') for dt in dic1)
>>> source2 = ((dt, 'source is dic2') for dt in dic2)
>>> for dt, source in mergeiter(source1, source2):
...     print dt, source
... 
2014-02-04 17:48:04 source is dic1
2014-02-04 17:48:04 source is dic1
2014-02-04 17:58:18 source is dic1
2014-02-04 17:58:18 source is dic1
2014-02-05 01:08:13 source is dic1
2014-02-05 01:08:13 source is dic1
2014-02-05 01:08:45 source is dic1
2014-02-05 01:08:45 source is dic1
2014-02-05 15:40:54 source is dic1
2014-02-05 15:40:54 source is dic1
2014-02-05 15:49:41 source is dic1
2014-02-05 15:49:41 source is dic2
2014-02-05 17:43:26 source is dic2
2014-02-05 17:43:26 source is dic2
2014-02-05 22:36:00 source is dic2
2014-02-05 22:36:00 source is dic2
2014-02-06 15:26:54 source is dic2
2014-02-06 15:26:54 source is dic2
2014-02-06 21:19:42 source is dic2
2014-02-06 21:19:42 source is dic2
2014-02-07 00:09:03 source is dic2
2014-02-07 00:09:03 source is dic2
2014-02-07 16:15:11 source is dic2
2014-02-07 16:15:11 source is dic2
2014-02-07 16:33:33 source is dic2

Unfortunately, your sample input data uses two sources whose timestamp ranges do not overlap, so the merged output is simply dic1 followed by dic2.

Your sample output, on the other hand, does use timestamps that interleave. Using those as input would look like this:

>>> dic1 = [datetime.datetime(2014, 2, 7, 16, 15, 11), datetime.datetime(2014, 2, 7, 18, 9, 3), datetime.datetime(2014, 2, 7, 20, 15, 11)]
>>> dic2 = [datetime.datetime(2014, 2, 7, 16, 33, 33)]
>>> source1 = ((dt, 'source is dic1') for dt in dic1)
>>> source2 = ((dt, 'source is dic2') for dt in dic2)
>>> for dt, source in mergeiter(source1, source2):
...     print dt, source
... 
2014-02-07 16:15:11 source is dic1
2014-02-07 16:33:33 source is dic2
2014-02-07 18:09:03 source is dic1
2014-02-07 20:15:11 source is dic1
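
The heapq route mentioned at the top can be sketched with the standard library's heapq.merge(), which also expects inputs that are already sorted. This is only a minimal sketch: the tag() helper is a name made up for this example, and the two lists are the small interleaving ones from the session above. Using a helper function rather than a nested generator expression avoids the late-binding pitfall with the label.

```python
import datetime
import heapq

# The small, already sorted lists from the interleaving example above.
dic1 = [datetime.datetime(2014, 2, 7, 16, 15, 11),
        datetime.datetime(2014, 2, 7, 18, 9, 3),
        datetime.datetime(2014, 2, 7, 20, 15, 11)]
dic2 = [datetime.datetime(2014, 2, 7, 16, 33, 33)]


def tag(items, label):
    """Lazily pair each timestamp with a label (hypothetical helper)."""
    return ((dt, label) for dt in items)


# heapq.merge() is lazy as well; list() is used here only to show the result.
# Equal timestamps fall back to comparing the label strings.
merged = list(heapq.merge(tag(dic1, 'source is dic1'),
                          tag(dic2, 'source is dic2')))

for dt, source in merged:
    print("{} {}".format(dt, source))
```

The print call with a single formatted string behaves the same on Python 2 and Python 3.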

Sure, there is: tag each timestamp with its source, concatenate the two lists, and sort the result.

for dt, src in sorted(
        [(t, 'dic1') for t in dic1] + [(t, 'dic2') for t in dic2]):
    print '{} from source {}'.format(dt, src)

-- output --

2014-02-04 17:48:04 from source dic1
2014-02-04 17:48:04 from source dic1
2014-02-04 17:58:18 from source dic1
2014-02-04 17:58:18 from source dic1
2014-02-05 01:08:13 from source dic1
2014-02-05 01:08:13 from source dic1
2014-02-05 01:08:45 from source dic1
2014-02-05 01:08:45 from source dic1
2014-02-05 15:40:54 from source dic1
2014-02-05 15:40:54 from source dic1
2014-02-05 15:49:41 from source dic1
2014-02-05 15:49:41 from source dic2
2014-02-05 17:43:26 from source dic2
2014-02-05 17:43:26 from source dic2
2014-02-05 22:36:00 from source dic2
2014-02-05 22:36:00 from source dic2
2014-02-06 15:26:54 from source dic2
2014-02-06 15:26:54 from source dic2
2014-02-06 21:19:42 from source dic2
2014-02-06 21:19:42 from source dic2
2014-02-07 00:09:03 from source dic2
2014-02-07 00:09:03 from source dic2
2014-02-07 16:15:11 from source dic2
2014-02-07 16:15:11 from source dic2
2014-02-07 16:33:33 from source dic2
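
One detail worth knowing about the sorted() approach: plain tuple comparison breaks ties on the source label, so equal timestamps always come out as dic1 before dic2. If you would rather keep the order in which you concatenated the lists, you can sort on the timestamp alone and rely on sorted() being stable. The following is a sketch under that assumption, using a shortened pair of lists taken from the question's data; the names pairs and by_time are made up for the example.

```python
import datetime
from operator import itemgetter

# Shortened lists from the question; both contain 15:49:41.
dic1 = [datetime.datetime(2014, 2, 5, 15, 40, 54),
        datetime.datetime(2014, 2, 5, 15, 49, 41)]
dic2 = [datetime.datetime(2014, 2, 5, 15, 49, 41),
        datetime.datetime(2014, 2, 5, 17, 43, 26)]

# dic2 is deliberately concatenated first to make the tie handling visible.
pairs = [(t, 'dic2') for t in dic2] + [(t, 'dic1') for t in dic1]

# Sort on the timestamp only; because the sort is stable, entries with
# equal timestamps keep their concatenation order (dic2 before dic1 here).
by_time = sorted(pairs, key=itemgetter(0))

for dt, src in by_time:
    print('{} from source {}'.format(dt, src))
```

With a plain sorted(pairs) the shared 15:49:41 timestamp would list dic1 first, because 'dic1' sorts before 'dic2' as a string.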