
I have a script that converts data from one type to another. The source file can have one, two or all of: position, rotation and scale data.

My script zips the 3 together after conversions have taken place for the output file.

In this case, my source file only contains position data. So the lists returned at the end are:

pData = [['-300.2', '600.5'],['150.12', '280.7'],['19.19', '286.56']]
rData = []
sData = []

translationData = list(zip(pData, rData, sData))

If I try this, it will return [] because the shortest list is []. If I try:

translationData = list(zip_longest(pData, rData, sData))

I get:

`[(['-300.2', '600.5'], None, None), (['150.12', '280.7'], None, None), (['19.19', '286.56'], None, None)]`

Is there any way to zip only the lists that contain data, or to remove the None values from the tuples in the resulting list?

Thanks in advance!

Mike Müller
  • What output are you trying to get? Can you edit your question to include this. – Martin Evans Jan 27 '16 at 16:02
  • This is a good question, but I feel you are also missing something. It is generally preferred in Python to have uniform objects in a list, i.e., all items in the list should be of the same kind. So having a list of tuples of varying lengths is a bit counter-intuitive. It seems to me that perhaps you should check that you aren't falling into the [XY problem](http://meta.stackexchange.com/q/66377/192545). Or perhaps you should handle these missing values elsewhere (for example, in the code that deals with them) – Inbar Rose Jan 27 '16 at 16:04
  • What do you do later with the data? What happens when you have different lengths in the lists? What's the problem with the `None`s? Suppose you have to iterate through translationData: `None` is a nice way of knowing that you don't have any data, and it makes iteration easier: `for p, r, s in translationData: if p is None: ...` – Ale Jan 27 '16 at 16:04

6 Answers


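You can use the built-in filter inside a list comprehension.

Note: In Python 3, filter returns an iterator, so you need to call tuple() on it (in Python 2 it returns a list). Be aware that filter(None, ...) drops every falsy item, not only None, so an empty inner list, 0 or '' would be removed as well.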

pData = [['-300.2', '600.5'],['150.12', '280.7'],['19.19', '286.56']]
rData = []
sData = []

from itertools import zip_longest  # izip_longest for python 2
[tuple(filter(None, col)) for col in zip_longest(pData, rData, sData)]

Result:

[(['-300.2', '600.5'],), (['150.12', '280.7'],), (['19.19', '286.56'],)]
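If falsy items must survive, one option is to test for None by identity instead of relying on filter(None, ...). The following is only a sketch: the rData rows are hypothetical (rotation rows that exist but are empty) and were chosen to make the difference visible.

```python
from itertools import zip_longest

pData = [['-300.2', '600.5'], ['150.12', '280.7'], ['19.19', '286.56']]
rData = [[], [], []]  # hypothetical: rotation rows that are present but empty
sData = []

rows = list(zip_longest(pData, rData, sData))

# filter(None, ...) removes every falsy item, so the empty rotation rows vanish too.
loose = [tuple(filter(None, row)) for row in rows]

# An identity test removes only the None padding added by zip_longest.
strict = [tuple(item for item in row if item is not None) for row in rows]

print(loose[0])   # (['-300.2', '600.5'],)
print(strict[0])  # (['-300.2', '600.5'], [])
```

Which behaviour is right depends on whether an empty row counts as data in your output format.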
Inbar Rose
  • This isn't right in the general case, and somewhat worse in Python 3 specifically; `filter(None` will drop all "falsy" values, not just `None`, and in Python 3, `filter` returns a lazy iterator; you'd need to wrap it in the `tuple` constructor to exhaust the iterator and get the expected result. – ShadowRanger Jan 27 '16 at 16:31
  • @ShadowRanger You are right, sorry. I edited my answer. Good catch! – Inbar Rose Jan 28 '16 at 12:17

Here are a few more variations, using bool, None and a lambda as the filter function.

from itertools import zip_longest
pData = [['-300.2', '600.5'],['150.12', '280.7'],['19.19', '286.56']]
rData = []
sData = []


print([list(filter(bool, col)) for col in zip_longest(pData, rData, sData)])
print([list(filter(None, col)) for col in zip_longest(pData, rData, sData)])
print([list(filter(lambda x: x, col)) for col in zip_longest(pData, rData, sData)])

Output-

[[['-300.2', '600.5']], [['150.12', '280.7']], [['19.19', '286.56']]]
[[['-300.2', '600.5']], [['150.12', '280.7']], [['19.19', '286.56']]]
[[['-300.2', '600.5']], [['150.12', '280.7']], [['19.19', '286.56']]]
Learner
  • Answer should be for Python3. – Inbar Rose Jan 27 '16 at 16:14
  • Why even offer `filter(bool,` and `filter(lambda x: x,`, both of which behave identically to `filter(None,` (with slightly lower performance)? – ShadowRanger Jan 27 '16 at 16:33
  • `bool` is faster than `None`. – Learner Jan 27 '16 at 16:55
  • @Sislam: That seems highly unlikely; I suspect you're seeing timing jitter. At least in Python 3.5, [both `None` and `bool` use the _exact_ same code path](https://hg.python.org/cpython/file/dec734dfe2fe/Python/bltinmodule.c#l479) (excluding the actual load of the argument, where `LOAD_CONST` for `None` would usually beat `LOAD_GLOBAL` for `bool`, but have no effect on algorithmic overhead as iterables get larger); `None` would run trivially faster (the test for `None` comes first, short-circuiting the test for `bool`), but the difference is so meaningless that jitter would outweigh it. – ShadowRanger Jan 28 '16 at 15:56
  • @ShadowRanger Sorry, I assumed it held for 2.7, as http://stackoverflow.com/questions/3845423/remove-empty-strings-from-a-list-of-strings shows – Learner Jan 28 '16 at 16:39
  • @SIslam: Yeah, and even there, the discrepancy in timing is so small that it's almost certainly jitter. [Looks like in 2.7, the test occurs in reverse order (for `bool`, then for `None`), but it's still the same code path and as in 3.5, the timing would be indistinguishable after jitter is added.](https://hg.python.org/cpython/file/34ca24fa1b4a/Python/bltinmodule.c#l303) – ShadowRanger Jan 28 '16 at 16:55
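For anyone who wants to check the timing claim on their own interpreter, a rough timeit sketch follows. The helper names with_none and with_bool are made up for this example, and no expected timings are given because they vary by machine and Python version.

```python
import timeit
from itertools import zip_longest

pData = [['-300.2', '600.5'], ['150.12', '280.7'], ['19.19', '286.56']]
rows = list(zip_longest(pData, [], []))

def with_none():
    # filter(None, ...) keeps the truthy items
    return [tuple(filter(None, row)) for row in rows]

def with_bool():
    # filter(bool, ...) keeps exactly the same items
    return [tuple(filter(bool, row)) for row in rows]

# Both variants must agree before timing them is meaningful.
assert with_none() == with_bool()

for fn in (with_none, with_bool):
    # Take the best of several repeats to reduce the effect of jitter.
    best = min(timeit.repeat(fn, number=10_000, repeat=5))
    print(f"{fn.__name__}: {best:.4f} s")
```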

You can modify the pure Python version of zip_longest given in the documentation to create a version that does what you want:

from itertools import chain, repeat

class ZipExhausted(Exception):
    pass

def zip_longest(*args, **kwds):
    # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C D  (fill values are dropped)
    fillvalue = kwds.get('fillvalue')
    counter = len(args) - 1
    def sentinel():
        nonlocal counter
        if not counter:
            raise ZipExhausted
        counter -= 1
        yield fillvalue
    fillers = repeat(fillvalue)
    iterators = [chain(it, sentinel(), fillers) for it in args]
    try:
        while iterators:
            res = []
            for it in iterators:
                value = next(it)
                if value is not fillvalue:  # identity test, so equal-but-distinct data is kept
                    res.append(value)
            yield tuple(res)
    except ZipExhausted:
        pass

pData = [['-300.2', '600.5'],['150.12', '280.7'],['19.19', '286.56']]
rData = []
sData = []

translationData = list(zip_longest(pData, rData, sData))
print(translationData)

Output:

[(['-300.2', '600.5'],), (['150.12', '280.7'],), (['19.19', '286.56'],)]
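A shorter route to a similar result, sketched here under the assumption that importing itertools.zip_longest is acceptable, is to pad with a private sentinel object and strip it afterwards. The names _MISSING and zip_skip_missing are invented for this illustration, and the sample data is altered so that a real None appears in it.

```python
from itertools import zip_longest

# A unique object that can never appear in the caller's data.
_MISSING = object()

def zip_skip_missing(*iterables):
    """Like zip_longest, but omit the padding instead of filling it in."""
    for row in zip_longest(*iterables, fillvalue=_MISSING):
        yield tuple(item for item in row if item is not _MISSING)

pData = [['-300.2', '600.5'], None, ['19.19', '286.56']]  # None is real data here
rData = []
sData = [['1', '1']]

print(list(zip_skip_missing(pData, rData, sData)))
# [(['-300.2', '600.5'], ['1', '1']), (None,), (['19.19', '286.56'],)]
```

Because the sentinel is private, a None or any other value that happens to be in the input is never mistaken for padding.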
martineau

If, for some reason, you don't want to import anything or use list comprehensions:

  1. Make a grouping of the lists you want to zip (allLists)
  2. Loop through the grouping and check whether each list contains anything
  3. Collect the lists that do contain data into a new grouping (zippable)
  4. Finally, unpack that filtered grouping into zip with the * operator (zip(*zippable))

    alist = ['hoop','joop','goop','loop']
    blist = ['homp','jomp','gomp','lomp']
    clist = []
    dlist = []
    
    allLists = [alist,blist,clist,dlist]
    
    zippable = []
    
    for fullList in allLists:
        if fullList:
            zippable.append(fullList)
    
    finalList = list(zip(*zippable))
    
    print(finalList)
    

Just another possible solution

0

If every list is either completely empty or the same length as the others, this works:

>>> list(zip(*(x for x in (pData, rData, sData) if x)))
[(['-300.2', '600.5'],), (['150.12', '280.7'],), (['19.19', '286.56'],)]
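The restriction matters because zip stops at the shortest list that survives the filter. A small sketch, with a hypothetical helper name zip_nonempty and a made-up partially filled rData:

```python
def zip_nonempty(*lists):
    # Drop the empty lists, then zip whatever is left.
    return list(zip(*(lst for lst in lists if lst)))

pData = [['-300.2', '600.5'], ['150.12', '280.7'], ['19.19', '286.56']]
rData = [['0', '90']]  # hypothetical: only one rotation row
sData = []

# Empty lists are skipped, so all three position rows survive.
print(len(zip_nonempty(pData, [], sData)))  # 3

# A partially filled list truncates the result to its own length.
print(zip_nonempty(pData, rData, sData))
# [(['-300.2', '600.5'], ['0', '90'])]
```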
Mike Müller

If you don't want the None, you can use the fillvalue keyword argument of zip_longest to substitute anything you like, which keeps the result uniform, as pointed out by @Ale:

>>> translationData = list(zip_longest(pData, rData, sData, fillvalue=tuple()))
>>> translationData
[(['-300.2', '600.5'], (), ()), (['150.12', '280.7'], (), ()), (['19.19', '286.56'], (), ())]
>>> 

Beware of using a mutable object as the fill value: every filled slot refers to the same object, so changing one changes all of them:

>>> translationData = list(zip_longest(pData, rData, sData,fillvalue=list()))
>>> translationData
[(['-300.2', '600.5'], [], []), (['150.12', '280.7'], [], []), (['19.19', '286.56'], [], [])]
>>> translationData[0][1].append(23)
>>> translationData
[(['-300.2', '600.5'], [23], [23]), (['150.12', '280.7'], [23], [23]), (['19.19', '286.56'], [23], [23])]
>>> 
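If independent mutable placeholders are needed, one possible workaround (a sketch, not the only way) is to pad with None and then swap each None for a freshly created list:

```python
from itertools import zip_longest

pData = [['-300.2', '600.5'], ['150.12', '280.7'], ['19.19', '286.56']]
rData = []
sData = []

# The [] literal is evaluated once per padded slot, so every slot gets its own list.
translationData = [
    tuple([] if item is None else item for item in row)
    for row in zip_longest(pData, rData, sData)
]

translationData[0][1].append(23)
print(translationData[0])  # (['-300.2', '600.5'], [23], [])
print(translationData[1])  # (['150.12', '280.7'], [], [])
```

This assumes None never occurs as genuine data in the input lists; if it can, a private sentinel object would be a safer padding value.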
Copperfield