from itertools import groupby
input_data = [
    [[30.0, 'P'], [45.0, 'R'], [50.0, 'D']],
    [[10.0, 'R'], [20.0, 'D'], [60.0, 'R']],
    [[42.4, 'R'], [76.0, 'R'], [52.0, 'D']]]
print(sum([[list(j) for i, j in
            groupby([item[0] if item[1] == 'R' else None for item in sublist],
                    lambda x: x is not None) if i]
           for sublist in input_data], []))
Result:
[[45.0], [10.0], [60.0], [42.4, 76.0]]
Derivation
If you think of grouping something, you should take a look at what groupby can do for you. To keep it simple, let's first use only part of your longer list to work it out:
i = input_data[2]
print ([(key,*lst) for key,lst in groupby(i, lambda x: x[1]=='R')])
This shows how groupby works for your input:
[(True, [42.4, 'R'], [76.0, 'R']), (False, [52.0, 'D'])]
because the two R values end up in one group (key True) and the remaining value in another (key False). You are not interested in the False groups, so don't include them:
print ([list(lst) for key,lst in groupby(i, lambda x: x[1]=='R') if key])
and this will get you
[[[42.4, 'R'], [76.0, 'R']]]
Please, do check the results for the other sub-lists in your sample data as well!
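A quick way to run that check is to apply the same expression to every sub-list. This is only a sketch; the names per_sublist, sub and runs are made up for the illustration:

```python
from itertools import groupby

input_data = [
    [[30.0, 'P'], [45.0, 'R'], [50.0, 'D']],
    [[10.0, 'R'], [20.0, 'D'], [60.0, 'R']],
    [[42.4, 'R'], [76.0, 'R'], [52.0, 'D']]]

# Keep only the groups whose key is True, i.e. the runs of 'R' items.
per_sublist = [
    [list(lst) for key, lst in groupby(sub, lambda x: x[1] == 'R') if key]
    for sub in input_data]

for sub, runs in zip(input_data, per_sublist):
    print(sub, '->', runs)
```

The second sub-list is the interesting one: its two R items are separated by a D, so they come back as two separate groups.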
It is easy to not include the group key values True and False, but you still have the 'R' strings as well (which, incidentally, add yet another level of brackets). Now groupby ultimately only decides which group an item goes into; its key function cannot change the items themselves. So you cannot re-write it to 'return' just the number for R items. (I'll be happily corrected on this, by the way.)
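For what it's worth, one possible way around this (a sketch of an alternative, not the route the rest of this derivation takes) is to leave groupby alone and strip the tag from each item while turning a group into a list. The name numbers_only is made up for the illustration:

```python
from itertools import groupby

i = [[42.4, 'R'], [76.0, 'R'], [52.0, 'D']]

# groupby still sees the full [number, tag] items; the inner comprehension
# keeps only the number of each item in a group whose key is True.
numbers_only = [[item[0] for item in lst]
                for key, lst in groupby(i, lambda x: x[1] == 'R') if key]
print(numbers_only)  # [[42.4, 76.0]]
```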
But you are not interested in the values that aren't tagged R anyway; you only need to know that something is there, because it marks where a run of R values ends. You can safely replace them with None, while keeping the R values:
>>> print ([item[0] if item[1] == 'R' else None for item in i])
[42.4, 76.0, None]
which means that the earlier groupby should no longer test for the presence of R but for not None:
>>> j = [item[0] if item[1] == 'R' else None for item in i]
>>> print ([list(lst) for key,lst in groupby(j, lambda x: x is not None) if key])
[[42.4, 76.0]]
This is, as requested, a list containing lists of consecutive items (only one list here, but each of your other input lines will show a different variation). Hold on, we're nearly there.
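To see those variations side by side, the two steps can be wrapped in a small helper. This is a sketch only; the function name runs_of_r is hypothetical and not part of itertools:

```python
from itertools import groupby

def runs_of_r(sublist):
    """Return the runs of consecutive 'R' values in one sub-list."""
    # Replace every non-'R' item with None so it only acts as a separator.
    values = [item[0] if item[1] == 'R' else None for item in sublist]
    return [list(lst)
            for key, lst in groupby(values, lambda x: x is not None) if key]

print(runs_of_r([[30.0, 'P'], [45.0, 'R'], [50.0, 'D']]))  # [[45.0]]
print(runs_of_r([[10.0, 'R'], [20.0, 'D'], [60.0, 'R']]))  # [[10.0], [60.0]]
print(runs_of_r([[42.4, 'R'], [76.0, 'R'], [52.0, 'D']]))  # [[42.4, 76.0]]
```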
This testing was done on a single item in your longer list, and so it's easy to loop over the original as well:
for i in input_data:
    ...
Printing out, for example, can be done with this loop. However, you want a list back again. You can use append, of course, but let's have some fun and add a list comprehension around the current groupby:
print([
    [list(lst) for key, lst
     in groupby([item[0] if item[1] == 'R' else None for item in i],
                lambda x: x is not None) if key]
    for i in input_data])
Don't be alarmed by its length! It's our earlier groupby, but instead of the variable j, it contains the list comprehension itself as its first argument. The outermost layer is new; it's only this standard wrapper
[ original list comprehension for i in input_data]
and it shows
[[[45.0]], [[10.0], [60.0]], [[42.4, 76.0]]]
Where do those extra brackets come from? We started out with single items (we changed the list [45.0, 'R'] into a single item 45.0), grouped them into runs, collected those runs per sub-list, and the total is a list of those lists. You want a single list of runs, not one list of runs per sub-list, so let's add them together by flattening away one level. (Flattening lists is a well-researched question and you're free to pick any method, but I like sum best because it keeps things in a single line...)
Only using the above result as input:
print (sum([[[45.0]], [[10.0], [60.0]], [[42.4, 76.0]]],[]))
neatly shows that the outer layer of extra brackets has disappeared:
[[45.0], [10.0], [60.0], [42.4, 76.0]]
which is precisely what you were after.
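If the one-liner feels too dense, the same result can be built with a plain loop. This is a sketch, not a replacement for the answer above; the names result and values are made up for the illustration:

```python
from itertools import groupby

input_data = [
    [[30.0, 'P'], [45.0, 'R'], [50.0, 'D']],
    [[10.0, 'R'], [20.0, 'D'], [60.0, 'R']],
    [[42.4, 'R'], [76.0, 'R'], [52.0, 'D']]]

result = []
for sublist in input_data:
    # Non-'R' items become None and only serve to separate the runs.
    values = [item[0] if item[1] == 'R' else None for item in sublist]
    # extend (not append) adds each run directly, so no flattening is needed.
    result.extend(list(lst) for key, lst
                  in groupby(values, lambda x: x is not None) if key)

print(result)  # [[45.0], [10.0], [60.0], [42.4, 76.0]]
```

Because extend adds the runs straight to result, the sum(..., []) step is not needed. That can matter for long inputs, since sum builds a new list for every sub-list it adds.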