I have a list of tuples:

[(0.0, 287999.70000000007),
(1.0, 161123.23000000001),
(2.0, 93724.140000000014),
(3.0, 60347.309999999983),
(4.0, 55687.239999999998),
(5.0, 29501.349999999999),
(6.0, 14993.920000000002),
(7.0, 14941.970000000001),
(8.0, 13066.229999999998),
(9.0, 10101.040000000001),
(10.0, 4151.6900000000005),
(11.0, 2998.8899999999999),
(12.0, 1548.9300000000001),
(15.0, 1595.54),
(16.0, 1435.98),
(17.0, 1383.01)]

As can be seen, there are missing indexes (13 and 14). I want to fill the missing indexes with zeros:

[(0.0, 287999.70000000007),
(1.0, 161123.23000000001),
(2.0, 93724.140000000014),
(3.0, 60347.309999999983),
(4.0, 55687.239999999998),
(5.0, 29501.349999999999),
(6.0, 14993.920000000002),
(7.0, 14941.970000000001),
(8.0, 13066.229999999998),
(9.0, 10101.040000000001),
(10.0, 4151.6900000000005),
(11.0, 2998.8899999999999),
(12.0, 1548.9300000000001),
(13.0, 0),
(14.0, 0),
(15.0, 1595.54),
(16.0, 1435.98),
(17.0, 1383.01)]

I solved it with an ugly for loop (I left it out because I don't think it adds anything), but I was wondering whether there is a more elegant way to do this, perhaps 3-4 lines with a list comprehension.

Duncan
Binyamin Even

3 Answers

Just a straight for loop is probably easier than a list comprehension:

data = [(0.0, 287999.70000000007),
(1.0, 161123.23000000001),
(2.0, 93724.140000000014),
(3.0, 60347.309999999983),
(4.0, 55687.239999999998),
(5.0, 29501.349999999999),
(6.0, 14993.920000000002),
(7.0, 14941.970000000001),
(8.0, 13066.229999999998),
(9.0, 10101.040000000001),
(10.0, 4151.6900000000005),
(11.0, 2998.8899999999999),
(12.0, 1548.9300000000001),
(15.0, 1595.54),
(16.0, 1435.98),
(17.0, 1383.01)]

result = []
last = 0.0
for d in data:
    while last < d[0]:
        result.append((last, 0))
        last += 1
    result.append(d)
    last = d[0]+1

Slightly shorter (and including a list comprehension):

result, last = [], 0.0
for d in data:
    result.extend((r,0) for r in range(int(last), int(d[0])))
    result.append(d)
    last = d[0]+1
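
As a quick check of the loop above, here is a minimal sketch that wraps it in a function and runs it on a shortened sample. The name fill_gaps, the fill parameter, and the sample values are made up for illustration; it assumes the input is sorted by key and that keys are whole numbers starting at 0 or later.

```python
def fill_gaps(data, fill=0):
    """Return a new list with a (key, fill) pair inserted for every missing key."""
    result, expected = [], 0.0
    for key, value in data:
        # Emit a filler pair for every key that was skipped before this one.
        while expected < key:
            result.append((expected, fill))
            expected += 1
        result.append((key, value))
        expected = key + 1
    return result


# Hypothetical shortened sample: keys 2.0 and 3.0 are missing.
sample = [(0.0, 5.5), (1.0, 4.5), (4.0, 1.5)]
filled = fill_gaps(sample)
print(filled)  # [(0.0, 5.5), (1.0, 4.5), (2.0, 0), (3.0, 0), (4.0, 1.5)]
```

Unlike the dictionary-based approaches, this makes a single pass and never looks at the maximum key up front, but it relies on the input order.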
Duncan

You can do it by converting your data to a dictionary, and then retrieving each index with dict.get() so that you can default the value to 0.

Compact version:

def fill_missing_with_zero(collection, fill=0.0):
    d = dict(collection)
    return [(key, d.get(key, fill)) for key in [float(i) for i in range(int(max(d))+1)]]

In more detail:

def fill_missing_with_zero(collection, fill=0.0):
    d = dict(collection)
    highest_index = int(max(d.keys()))
    result = []
    for i in range(highest_index+1):
        key = float(i)  # because your "keys" are floats
        result.append((key, d.get(key, fill)))
    return result

Example:

>>> fill_missing_with_zero(collection)
[(0.0, 287999.70000000007),
 (1.0, 161123.23),
 (2.0, 93724.14000000001),
 (3.0, 60347.30999999998),
 (4.0, 55687.24),
 (5.0, 29501.35),
 (6.0, 14993.920000000002),
 (7.0, 14941.970000000001),
 (8.0, 13066.229999999998),
 (9.0, 10101.04),
 (10.0, 4151.6900000000005),
 (11.0, 2998.89),
 (12.0, 1548.93),
 (13.0, 0.0),
 (14.0, 0.0),
 (15.0, 1595.54),
 (16.0, 1435.98),
 (17.0, 1383.01)]
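
A side note on the float keys, as a small sketch with made-up sample values: in Python an int and a float that compare equal also hash equally, so a dictionary built from float keys can be queried with ints. The float(i) conversion therefore mainly determines the type of the keys in the output, not whether the lookup succeeds.

```python
# Hypothetical sample with float keys; key 1.0 is missing.
d = dict([(0.0, 1.5), (2.0, 3.5)])

# 2 == 2.0 and hash(2) == hash(2.0), so the int finds the float key.
assert d.get(2) == 3.5
assert d.get(1, 0.0) == 0.0  # missing key falls back to the default

# Look up with ints, but emit float keys to match the original data.
with_float_keys = [(float(i), d.get(i, 0.0)) for i in range(int(max(d)) + 1)]
print(with_float_keys)  # [(0.0, 1.5), (1.0, 0.0), (2.0, 3.5)]
```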
Inbar Rose

I've modified your input slightly to use integer values.

Presuming the input is in order, I first work out the highest key in the list: top = in_list[-1][0]

Then convert the input to a dict.

This means I can use get(key[, default]) to return a zero if the key doesn't exist.

Then use a list comprehension with range to go through every possible integer key. The argument needs to be top+1 because range(n) stops before n: it yields 0 through n-1, so range(top+1) is what includes top itself.

in_list = [(0, 287999.70000000007),
(1, 161123.23000000001),
(2, 93724.140000000014),
(3, 60347.309999999983),
(4, 55687.239999999998),
(5, 29501.349999999999),
(6, 14993.920000000002),
(7, 14941.970000000001),
(8, 13066.229999999998),
(9, 10101.040000000001),
(10, 4151.6900000000005),
(11, 2998.8899999999999),
(12, 1548.9300000000001),
(15, 1595.54),
(16, 1435.98),
(17, 1383.01)]

top = in_list[-1][0]      # highest key (input is assumed sorted)
in_dict = dict(in_list)   # key -> value lookup
out_list = [(i, in_dict.get(i, 0)) for i in range(top + 1)]
print(out_list)
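
If the input might not be sorted, taking the last element no longer gives the highest key. The following is a sketch of one way around that, using max() on the dictionary instead; the sample values are made up, and it also shows why the stop value has to be top + 1.

```python
# Hypothetical sample: deliberately unsorted, key 2 is missing.
pairs = [(3, 30.0), (0, 10.0), (1, 20.0)]

lookup = dict(pairs)
top = max(lookup)  # highest key, regardless of input order

# range excludes its stop value, hence top + 1 to include top itself.
assert list(range(top)) == [0, 1, 2]
assert list(range(top + 1)) == [0, 1, 2, 3]

filled = [(i, lookup.get(i, 0)) for i in range(top + 1)]
print(filled)  # [(0, 10.0), (1, 20.0), (2, 0), (3, 30.0)]
```

The result comes out sorted by key as a side effect, since range walks the keys in ascending order.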
Tim Bray