Python Subtract Arrays Based on Same Time

Question

Is there a way to subtract two arrays while making sure I only subtract elements that have the same year, day, hour, and/or minute values?

list1 = [[10, '2013-06-18'],[20, '2013-06-19'], [50, '2013-06-23'], [15, '2013-06-30']]
list2 = [[5, '2013-06-18'], [5, '2013-06-23'], [20, '2013-06-25'], [20, '2013-06-30']]

Looking for:

 list1-list2 = [[5, '2013-06-18'], [45, '2013-06-23'], [10, '2013-06-30']]

And what is your problem with that? Where are you facing an issue? — Anand S Kumar, Aug 25 '15 at 02:51
@AnandSKumar I am pulling from a database. I will get an array, and I need to make sure that I take into account gaps in the data, where one array doesn't have the same time values as the other array — Outshined, Aug 25 '15 at 02:56
`list1` has `'2013-06-18'` twice, so why isn't it `[25, '2013-06-18']...` — John La Rooy, Aug 25 '15 at 02:58
Let's get technical: are these lists, that is lists of lists, or `numpy` arrays? If arrays, what dtype and shape? — hpaulj, Aug 25 '15 at 02:59
@hpaulj List of lists, I haven't worked too much with numpy, I don't know how to use it very well — Outshined, Aug 25 '15 at 03:01
And you only want to have dates in the result that are in both lists? — Cyphase, Aug 25 '15 at 03:02
By the way, your example could be a bit better; the result could just as well be items in `list2` whose dates are also in `list1`. — Cyphase, Aug 25 '15 at 03:04
@Cyphase Yes, in terms of implementation of how data is stored and what sql command I am using to get the data — Outshined, Aug 25 '15 at 03:05
Just iterate through one array, and compare the 'date' strings. If they match, subtract and add the value to an output list. No special tricks, just straightforward list iteration. — hpaulj, Aug 25 '15 at 03:11
@hpaulj The problem is that there is an assumption that the third day may be at the third element in list for example, when in fact, it may be the fourth, or fifth depending on how much data there is between the first and third day — Outshined, Aug 25 '15 at 03:13
@Cyphase I meant whether I should input data for x amount of time when there is no data for that time, or how my SQL commands are structured in order to get the list of dates and corresponding values — Outshined, Aug 25 '15 at 03:15
And it looks like the lists will always be sorted by date, and there won't be two of the same date in a list, right? — Cyphase, Aug 25 '15 at 03:25
@Cyphase This is a worst case scenario, where there are gaps in data, but they must be accounted for — Outshined, Aug 25 '15 at 03:27
Another idea - collect the values in a dictionary, using the date as the key. — hpaulj, Aug 25 '15 at 03:39

mhawke · Answer 1 · 2015-08-25T04:12:31.167

How about using a defaultdict of lists?

import itertools
from operator import sub
from collections import defaultdict
from functools import reduce  # reduce() is not a builtin in Python 3

def subtract_lists(l1, l2):
    data = defaultdict(list)
    for sublist in itertools.chain(l1, l2):
        value, date = sublist
        data[date].append(value)
    return [(reduce(sub, v), k) for k, v in data.items() if len(v) > 1]

list1 = [[10, '2013-06-18'],[20, '2013-06-19'], [50, '2013-06-23'], [15, '2013-06-30']]
list2 = [[5, '2013-06-18'], [5, '2013-06-23'], [20, '2013-06-25'], [20, '2013-06-30']]

>>> subtract_lists(list1, list2)
[(-5, '2013-06-30'), (45, '2013-06-23'), (5, '2013-06-18')]
>>> # if you want them sorted by date...
>>> sorted(subtract_lists(list1, list2), key=lambda t: t[1])
[(5, '2013-06-18'), (45, '2013-06-23'), (-5, '2013-06-30')]

Note that the difference for date 2013-06-30 is -5 (15 - 20), not a positive value.

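This works by using the date as a dictionary key that maps to a list of all values seen for that date. Because l1 is chained before l2, each list holds the value from l1 first, then the value from l2. Only dates with more than one value (that is, dates present in both lists) are kept, and their values are reduced by subtraction, which gives the l1 value minus the l2 value. This relies on each date appearing at most once per list; a date repeated within a single list would also pass the len(v) > 1 check. If you want the resulting list sorted, you can do so using sorted() with the date item as the key. You could move that operation into the subtract_lists() function if you always want that behavior.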
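
As a rough sketch of that last suggestion, the sort can be folded into the function itself. The name subtract_lists_sorted is hypothetical, and the sketch assumes each date appears at most once per list and that dates are ISO-formatted strings, so that sorting the strings sorts the dates.

```python
from collections import defaultdict
from functools import reduce
from operator import sub


def subtract_lists_sorted(l1, l2):
    """Return (l1_value - l2_value, date) for dates in both lists, sorted by date.

    Assumes each date appears at most once per list and that dates are
    ISO-formatted strings (YYYY-MM-DD), so string order matches date order.
    """
    data = defaultdict(list)
    # l1 is processed first, so each list of values is [l1_value, l2_value].
    for value, date in list(l1) + list(l2):
        data[date].append(value)
    # Keep only dates seen in both lists, then sort by the date string.
    pairs = [(reduce(sub, values), date)
             for date, values in data.items() if len(values) > 1]
    return sorted(pairs, key=lambda pair: pair[1])


list1 = [[10, '2013-06-18'], [20, '2013-06-19'], [50, '2013-06-23'], [15, '2013-06-30']]
list2 = [[5, '2013-06-18'], [5, '2013-06-23'], [20, '2013-06-25'], [20, '2013-06-30']]

result = subtract_lists_sorted(list1, list2)
print(result)
# [(5, '2013-06-18'), (45, '2013-06-23'), (-5, '2013-06-30')]
```

Sorting inside the function makes the output order independent of dictionary ordering, which differs between Python versions.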

score 0 · Answer 2 · edited May 23 '17 at 11:43

I think this code does what you want:

list1 = [[10, '2013-06-18'],[20, '2013-06-19'], [50, '2013-06-23'], [15, '2013-06-30']]
list2 = [[5, '2013-06-18'], [5, '2013-06-23'], [20, '2013-06-25'], [20, '2013-06-30']]
list1 = dict([[i[1], i[0]] for i in list1])  # map date -> value
list2 = dict([[i[1], i[0]] for i in list2])

def minus(a, b):
    # subtract only for dates present in both dictionaries
    return {k: a.get(k, 0) - b.get(k, 0) for k in set(a) & set(b)}

minus(list1, list2)

# returns the below, which is now a dictionary (key order may vary)
{'2013-06-18': 5, '2013-06-23': 45, '2013-06-30': -5}
# you can convert it back into your format like this
data = [[value, key] for key, value in minus(list1, list2).items()]

But you seem to have an error in your output data. If you want to include data when it's in either list, define minus like this instead:

def minus(a,b):
    return { k: a.get(k, 0) - b.get(k, 0) for k in set(a) | set(b) }

See this answer, on Merge and sum of two dictionaries, for more info.
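
For illustration, here is a minimal sketch of how the union version behaves on the question's data. The names minus_union, d1, d2, and rows are hypothetical, and treating a missing date as 0 is the assumption built into a.get(k, 0).

```python
def minus_union(a, b):
    # Same logic as the union version of minus(): a date missing from
    # one dictionary is treated as having the value 0.
    return {k: a.get(k, 0) - b.get(k, 0) for k in set(a) | set(b)}


list1 = [[10, '2013-06-18'], [20, '2013-06-19'], [50, '2013-06-23'], [15, '2013-06-30']]
list2 = [[5, '2013-06-18'], [5, '2013-06-23'], [20, '2013-06-25'], [20, '2013-06-30']]

# Build date -> value dictionaries without rebinding the original lists.
d1 = {date: value for value, date in list1}
d2 = {date: value for value, date in list2}

diff = minus_union(d1, d2)

# Convert back to [value, date] rows, sorted by the ISO date string.
rows = sorted([[value, date] for date, value in diff.items()],
              key=lambda row: row[1])
print(rows)
# [[5, '2013-06-18'], [20, '2013-06-19'], [45, '2013-06-23'], [-20, '2013-06-25'], [-5, '2013-06-30']]
```

Dates found in only one list show up as the plain value (for list1) or its negation (for list2), which may or may not be what you want when the gaps mean "no data" rather than zero.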
