13

I'm reading a file and unpacking each line like this:

for line in filter(fh):
  a, b, c, d = line.split()

However, a line may have more or fewer columns than the variables I want to unpack into. When there are fewer, I'd like to assign None to the leftover variables; when there are more, I'd like to ignore the extras. What's the idiomatic way to do this? I'm using Python 2.7.

alvas
pythonic metaphor
  • Good question. I hope that somebody proves me wrong, but I think that you would need more than one statement for that... It's do-able in Lua as far as I remember... – dsign Dec 10 '13 at 17:40
  • see https://stackoverflow.com/questions/5333680/extended-tuple-unpacking-in-python-2 – n611x007 Dec 12 '14 at 15:52

5 Answers

15

Fix the length of the list, padding with None.

def fixLength(lst, length):
    return (lst + [None] * length)[:length]
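
A minimal usage sketch, assuming the `fixLength` helper above; the `sample_lines` list is made up for illustration and stands in for lines read from the file:

```python
def fixLength(lst, length):
    # Pad with None, then trim to exactly `length` items.
    return (lst + [None] * length)[:length]

# Hypothetical sample lines: one too short, one exact, one too long.
sample_lines = ["1 2", "1 2 3 4", "1 2 3 4 5 6"]

rows = []
for line in sample_lines:
    a, b, c, d = fixLength(line.split(), 4)
    rows.append((a, b, c, d))
```

The unpacking always sees exactly four items, so it never raises ValueError regardless of how many columns the line has.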
Hyperboreus
7

In Python 3 you can use extended unpacking:

a, b, c, d, *_unused_ = line.split() + [None]*4

Edit

For long lines I suggest passing the maxsplit argument to split (it also works in Python 2.7):

a, b, c, d, *_unused_ = line.split(None, 4) + [None]*4

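Why 4? maxsplit counts splits, not elements. After 4 splits you get at most 5 pieces; the 5th is the unsplit rest of the line, and it ends up in _unused_.

Edit2: An earlier version of this answer said 5. It is 4, because split stops after 4 splits, not after 4 elements.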
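
To see the splits-versus-elements point concretely, here is a small sketch. The sample strings are invented, and it uses pad-and-slice instead of star-unpacking so the same lines also run on Python 2.7:

```python
line = "a b c d e f g"

# maxsplit=4 performs at most four splits, giving at most five pieces;
# the fifth piece is the unsplit remainder of the line.
parts = line.split(None, 4)

# Pad with None, then keep only the first four items.
a, b, c, d = (parts + [None] * 4)[:4]

# A short line is unaffected by maxsplit and gets padded as usual.
short_parts = "x y".split(None, 4)
w, x, y, z = (short_parts + [None] * 4)[:4]
```

Here `parts` has five items and the last one is `'e f g'`, which is why the fourth variable still receives a clean single column.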

koffein
  • @Marcin And I didn't get it in your case either. – koffein Dec 10 '13 at 18:04
  • That's a nice feature of python 3 which I wish I could use! Alas, python 2 – pythonic metaphor Dec 10 '13 at 18:21
  • At least as of python 2.7.5, `str.split` takes maxsplit as a positional, not keyword argument. So it is `line.split(None, 5)`, but still a good addition. – pythonic metaphor Dec 10 '13 at 18:34
  • @pythonicmetaphor You are right. Changed it, so the split will work in both versions. And my explanation for the maxsplit-number was wrong, changed it, too. Sorry for all this confusion... – koffein Dec 10 '13 at 18:45
5

First of all, think about why you want to do this.

However, given that you want to (1) pad with None and (2) ignore extra variables, the code is easy:

a,b,c,d = (line.split() + [None]*4)[:4]

Obviously, the magic number (4 here) has to match the number of variables. The expression pads the list with that many None values, then trims it back down to that length.

For an arbitrary iterable you can do:

import itertools

def padslice(seq,n):
    return itertools.islice(itertools.chain(seq,itertools.repeat(None)), n)

This is the same pad-and-slice with itertools.
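
A short sketch of `padslice` applied to something that is not a list. The `columns` generator is a hypothetical stand-in for any iterator and is not part of the answer:

```python
import itertools

def padslice(seq, n):
    # Chain the input with an endless stream of None, then take the first n items.
    return itertools.islice(itertools.chain(seq, itertools.repeat(None)), n)

def columns(line):
    # Hypothetical generator standing in for any non-list iterable.
    for token in line.split():
        yield token

# Too few columns: the missing ones become None.
a, b, c, d = padslice(columns("1 2"), 4)

# Too many columns: the extras are never consumed.
e, f, g, h = padslice(columns("1 2 3 4 5"), 4)
```

Because `islice` yields exactly `n` items, the result can be unpacked directly without converting it to a list first.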

Marcin
  • +1 for the python-2.7-one-liner. (Not compensatory, seriously). But have a look at your padslice-function. It does not return a sequence, but I think in order to unpack you will need a list or something like this. – koffein Dec 10 '13 at 18:16
  • @koffein You can unpack from a generator, so there is no issue. – Marcin Dec 10 '13 at 18:26
  • Your first answer is similar to what I was doing, which works, but doesn't feel quite idiomatic, because it doesn't work with iterators. But I like your second answer, which is getting much closer to feeling pythonic to me. – pythonic metaphor Dec 10 '13 at 18:30
  • @Marcin Right. Tested it and it worked… Nice feature – koffein Dec 10 '13 at 18:31
  • @pythonicmetaphor Idiomatic python doesn't have to work with iterators, especially where the type is known not to be an iterator. But, thanks. – Marcin Dec 10 '13 at 18:32
  • Well, in this case, unpacking tuples works with iterators, so I expected the 'right' solution to this problem to also work with iterators, though maybe that's not true. – pythonic metaphor Dec 10 '13 at 19:08
0

In Python 3, you can use itertools.zip_longest, like this:

from itertools import zip_longest

max_params = 4

lst = [1, 2, 3, 4]
a, b, c, d = next(zip(*zip_longest(lst, range(max_params))))
print(f'{a}, {b}, {c}, {d}') # 1, 2, 3, 4

lst = [1, 2, 3]
a, b, c, d = next(zip(*zip_longest(lst, range(max_params))))
print(f'{a}, {b}, {c}, {d}') # 1, 2, 3, None

For Python 2.x you can follow this answer.
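
One caveat: with a list longer than max_params, the expression above yields one value per element, so unpacking into four names raises ValueError. A possible variant that trims the input first is sketched below. The `pad_and_trim` name and the `islice` step are additions for illustration, not part of the answer, and the code is Python 3 only:

```python
from itertools import islice, zip_longest

max_params = 4

def pad_and_trim(lst, n=max_params):
    # islice drops any extras; zip_longest then pads the short side with None.
    trimmed = islice(lst, n)
    # Transpose back and take the first row, which holds the padded values.
    return next(zip(*zip_longest(trimmed, range(n))))

a, b, c, d = pad_and_trim([1, 2, 3, 4, 5])
e, f, g, h = pad_and_trim([1])
```

This sketch assumes `n` is at least 1.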

gsamaras
-1

Something like this works for any iterable or iterator. If you're always going to pass a list, you can remove the islice part.

>>> from itertools import islice
>>> def solve(seq, n):
...     lis = list(islice(seq, n))
...     return lis + [None]*(n - len(lis))
...
>>> a, b, c, d = solve(range(2), 4)
>>> a, b, c, d
(0, 1, None, None)
>>> a, b, c, d = solve('qwe', 4)
>>> a, b, c, d
('q', 'w', 'e', None)
>>> a, b, c, d = solve(iter([1, 2, 3, 4, 5]), 4)
>>> a, b, c, d
(1, 2, 3, 4)
Ashwini Chaudhary