compare elements of list to elements of list of lists and conditionally create new lists

Question

It is about financial data. I have a list of the 70% percentiles of return data at 72 dates:

list = [0.11,0.12,...,0.125]

Furthermore I have a list of lists which contains the 72 returns at the different dates for 500 companies (= 500 lists and 72 entries per list):

list_of_lists = [[0.09,0.08,...,0.15],...,[0.1,0.34,...,0.01]]

What I want to do now is compare the first entry of my list (0.11) to all the entries in my first list in the list of lists. If the entry in the first list exceeds the 0.11 threshold (so in this case the 0.15 above) I want to add this number to a new list. Then I want to do the same with the second entry in list (0.12) and the second list in list_of_lists. In the end I basically want to obtain 72 lists (or a new list of lists) which contain the returns that are above the respective 70% percentile.

You say *500 lists and 72 entries per list*, shouldn't this be *72 lists and 500 entries per list*? — Willem Van Onsem, Jan 06 '16 at 23:17
Not an answer, but I would definitely suggest looking into using a library like e.g. [`numpy`](http://www.numpy.org/). — Nelewout, Jan 06 '16 at 23:24

score 3 · Answer 1 · edited May 23 '17 at 12:31

If I understand your question correctly, you have 500 lists of 72 values and 72 threshold values. You want to compare the n^th value of each list with the n^th value of your list of thresholds. In other words, you want to proceed column-wise. It's easiest to first transpose list_of_lists using this one cool trick, so that each column in list_of_lists becomes a row:

transposed = zip(*list_of_lists)
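
To make the transposition concrete, here is a small sketch on a made-up 2-by-3 example. The names `rows` and `columns` are illustrative and not from the question. In Python 3, `zip` returns a lazy iterator of tuples, so it is wrapped in `list(...)` here in order to inspect it.

```python
# Hypothetical sample: 2 companies (rows) with 3 dates each (columns).
rows = [[0.09, 0.08, 0.15],
        [0.10, 0.34, 0.01]]

# zip(*rows) unpacks the rows as separate arguments, so zip pairs up
# the first entries, then the second entries, and so on.
columns = list(zip(*rows))

print(columns)  # [(0.09, 0.1), (0.08, 0.34), (0.15, 0.01)]
```

Each tuple in `columns` now holds the returns of all companies at one date.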

Now we can work with rows. Pair each number in your list of thresholds with its corresponding row in transposed.

lists_with_thresholds = zip(list, transposed)

Each item in lists_with_thresholds is a pair containing a cutoff point and the values to which we want to compare it. The ducks are lined up in a row; we just have to find the values in the second part of the pair which exceed the corresponding cutoff point.

result = []
for threshold, values in lists_with_thresholds:
    values_over_threshold = []
    for x in values:
        if x > threshold:
            values_over_threshold.append(x)
    result.append(values_over_threshold)

Or, squishing the nested for loops up into a nested list comprehension:

result = [[x for x in values if x > threshold]
          for threshold, values in zip(list, zip(*list_of_lists))]

These two versions produce the same result, but I like the list comprehension better because it's shorter and it has a more functional feel.
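
For reference, here is a self-contained sketch of both versions on invented data, assuming the layout described in the question (one list per company, one entry per date), shrunk to 3 companies and 3 dates. The names `percentiles` and `returns_by_company` are placeholders, chosen so that the built-in `list` is not shadowed.

```python
# Hypothetical data: one threshold per date, one row per company.
percentiles = [0.11, 0.12, 0.125]
returns_by_company = [
    [0.09, 0.08, 0.15],
    [0.20, 0.13, 0.10],
    [0.10, 0.34, 0.01],
]

# Explicit loops: transpose so each tuple holds all returns for one date,
# then keep the returns above that date's threshold.
result_loop = []
for threshold, values in zip(percentiles, zip(*returns_by_company)):
    values_over_threshold = []
    for x in values:
        if x > threshold:
            values_over_threshold.append(x)
    result_loop.append(values_over_threshold)

# Same logic as a nested list comprehension.
result_comp = [[x for x in values if x > threshold]
               for threshold, values in zip(percentiles, zip(*returns_by_company))]

print(result_loop)  # [[0.2], [0.13, 0.34], [0.15]]
```

Both `result_loop` and `result_comp` hold one list per date, containing only the returns above that date's percentile.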

This is exactly what I wanted. Thank you! — Alexander Eser, Jan 07 '16 at 07:09

Willem Van Onsem · Accepted Answer · 2016-01-07T11:22:36.130

I think you could do this with a list comprehension:

thresholds = [0.11,0.12,0.125]
quotes = [[0.09,0.08,0.15],[0.09,0.08,0.15],[0.1,0.34,0.01]]
[filter(lambda x: x > thresholds[idx],qts) for idx,qts in enumerate(quotes)]

I made some real lists out of the given ones (omitting the ...) so that the example actually runs.

The list comprehension works as follows: we iterate over each qts in quotes (and also obtain its index idx, which is used to look up the threshold). Next we perform a filter operation on qts and keep only the elements that are larger than thresholds[idx] (the threshold for that timestamp).

Running this with python gives:

$ python
Python 2.7.9 (default, Apr  2 2015, 15:33:21) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> thresholds = [0.11,0.12,0.125]
>>> quotes = [[0.09,0.08,0.15],[0.09,0.08,0.15],[0.1,0.34,0.01]]
>>> [filter(lambda x: x > thresholds[idx],qts) for idx,qts in enumerate(quotes)]
[[0.15], [0.15], [0.34]]

which seems to be what you want.

EDIT: In Python 3.x this works as well, although filter is lazy there: it returns an iterator instead of a list, so the values are only produced when you consume it:

$ python3
Python 3.4.3 (default, Mar 26 2015, 22:03:40) 
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> thresholds = [0.11,0.12,0.125]
>>> quotes = [[0.09,0.08,0.15],[0.09,0.08,0.15],[0.1,0.34,0.01]]
>>> res=[filter(lambda x: x > thresholds[idx],qts) for idx,qts in enumerate(quotes)]
>>> res[0]
<filter object at 0x7f0d3fbc2be0>
>>> list(res[0])
[0.15]

If you want to materialize the lists straight away, you can slightly alter the list comprehension to:

[list(filter(lambda x: x > thresholds[idx],qts)) for idx,qts in enumerate(quotes)]

Which results in:

>>> [list(filter(lambda x: x > thresholds[idx],qts)) for idx,qts in enumerate(quotes)]
[[0.15], [0.15], [0.34]]
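
As a possible alternative to indexing with `enumerate`, the thresholds can be paired with the sublists directly via `zip`, and the `filter`/`lambda` can be replaced by an inner comprehension. This is a sketch rather than part of the original answer, and the name `above` is illustrative. It returns real lists on both Python 2 and Python 3 without the extra `list(...)` call.

```python
thresholds = [0.11, 0.12, 0.125]
quotes = [[0.09, 0.08, 0.15], [0.09, 0.08, 0.15], [0.1, 0.34, 0.01]]

# Pair each threshold with the sublist at the same position, then keep
# only the values strictly above that threshold.
above = [[x for x in qts if x > threshold]
         for threshold, qts in zip(thresholds, quotes)]

print(above)  # [[0.15], [0.15], [0.34]]
```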

How would I apply this in python 3? I get a TypeError: 'list' object is not callable. — Alexander Eser, Jan 07 '16 at 11:11
@AlexanderEser: I do not get this error, see updated answer. — Willem Van Onsem, Jan 07 '16 at 11:22
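
One plausible cause of the TypeError reported above, assuming the percentile list from the question was still bound to the name `list`: that assignment shadows the built-in `list`, so a later call such as `list(filter(...))` tries to call the list object itself. The sketch below reproduces the message inside a function so that the shadowing stays local; `shadowing_demo` is a hypothetical helper, not code from the thread.

```python
def shadowing_demo():
    # Assumption: the variable is named `list`, as in the question.
    # Inside this function the name now refers to the data, not the built-in.
    list = [0.11, 0.12, 0.125]
    try:
        # This calls the list object above, which is not callable.
        list(filter(lambda x: x > 0.1, [0.09, 0.15]))
    except TypeError as exc:
        return str(exc)
    return None

print(shadowing_demo())  # 'list' object is not callable
```

Renaming the variable (for example to `thresholds`, as in the answer) or running `del list` in the interactive session restores the built-in.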

score 1 · Answer 3 · answered Jan 06 '16 at 23:24

I think this is what you want:

new_list = []
for i in list_of_lists:
    for j in i:
        if j > list[0]:
            new_list.append(j)

answered Jan 06 '16 at 23:24 by laserpython

Although I think this works, it's a bit *un-pythonic* to call list modifiers. That's why they invented list comprehension. +1 nevertheless ;). – Willem Van Onsem Jan 06 '16 at 23:25
OP wants the elements of the sublists which are greater than the _corresponding_ element of `list`, not `list[0]`. Also, this will produce a flat list whereas OP wants a list of lists. – Benjamin Hodgson Jan 06 '16 at 23:29
Thanks! I did it like this since I knew there would be plenty of list comprehension answers, and for beginners list comp is harder to read (at least for me it was). And I guess I misunderstood the OP... – laserpython Jan 06 '16 at 23:30

score 0 · karlson · Answer 4 · 2016-01-06T23:27:09.510

You could use a list comprehension:

list = [4, 3, 2, 3, 4, 5]
list_of_lists = [[6, 1, 3, 7, 2, 5], [1, 2, 6, 3, 8, 1], [1, 2, 3, 2, 7, 6]]

above = [[ret for i, ret in enumerate(lst) if ret > list[i]] for lst in list_of_lists]

[[6, 3, 7], [6, 8], [3, 7, 6]]

This drops every entry of the lists in list_of_lists that is less than or equal to the corresponding element of list. The original lists are not modified; above is a new list of lists.

edited Jan 06 '16 at 23:27

answered Jan 06 '16 at 23:20


OP wants the elements of the sublists which are greater than the _corresponding_ element of `list`. – Benjamin Hodgson Jan 06 '16 at 23:24
Alright, fixed that. I think that wasn't totally clear from the question. – karlson Jan 06 '16 at 23:27
