5

Given the following list of lists:

a = [[2,3],[1,2,3],[1]]

I need each list within a to have the same number of elements. First, I need to get the longest length of any list in a. Then, I need to ensure all lists are at least that long. If not, I want to add a zero (0) to the end until that is true. The desired result is:

b = [[2,3,0],[1,2,3],[1,0,0]]

Thanks in advance!

P.S. I also need to apply this to a pandas DataFrame like this one:

import pandas as pd
b = [[2,3,0],[1,2,3],[1,0,0]]
f=pd.DataFrame({'column':b})
piRSquared
Dance Party2

3 Answers

5

First, compute the maximum length of your elements:

maxlen=len(max(a,key=len))  # max element using sublist len criterion

Or, as Patrick suggested, use a generator expression over the sublist lengths, which is probably a tad faster:

maxlen=max(len(sublist) for sublist in a)  # max of all sublist lengths

then create a new list with 0 padding:

b = [sl+[0]*(maxlen-len(sl)) for sl in a]  # list comp for padding

result with a = [[2,3],[1,2,3],[1]]:

[[2, 3, 0], [1, 2, 3], [1, 0, 0]]
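The two steps above can be wrapped in a small helper. This is only a sketch: the name `pad_lists` and the `fill` parameter are hypothetical, not part of the original answer, and `default=0` is added so that an empty outer list does not raise.

```python
def pad_lists(lists, fill=0):
    """Return a new list of lists, each right-padded with `fill` to the longest length."""
    # default=0 keeps max() from raising ValueError on an empty outer list
    maxlen = max((len(sl) for sl in lists), default=0)
    # list(sl) copies each sublist, so the input is left unmodified
    return [list(sl) + [fill] * (maxlen - len(sl)) for sl in lists]

a = [[2, 3], [1, 2, 3], [1]]
b = pad_lists(a)
print(b)  # [[2, 3, 0], [1, 2, 3], [1, 0, 0]]
```

Because the helper builds new lists, `a` keeps its original ragged shape after the call.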

Note: this could be done in one line, but it would not be very performant because maxlen is recomputed for every sublist. One-liners are not always the best solution.

b = [sl+[0]*(len(max(a,key=len))-len(sl)) for sl in a]  # not very performant
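To see the cost of the recomputation, the two variants can be timed side by side. This is a rough sketch: the function names and the enlarged input are assumptions for illustration, and the actual numbers will vary by machine.

```python
import timeit

# Repeat the sample data so the difference becomes measurable (600 sublists)
a = [[2, 3], [1, 2, 3], [1]] * 200

def pad_precomputed(lists):
    maxlen = max(len(sl) for sl in lists)  # computed once
    return [sl + [0] * (maxlen - len(sl)) for sl in lists]

def pad_one_liner(lists):
    # max(...) is re-evaluated for every sublist, so the work grows quadratically
    return [sl + [0] * (len(max(lists, key=len)) - len(sl)) for sl in lists]

# Both variants produce the same padded result
assert pad_precomputed(a) == pad_one_liner(a)

t_pre = timeit.timeit(lambda: pad_precomputed(a), number=5)
t_one = timeit.timeit(lambda: pad_one_liner(a), number=5)
print(f"precomputed: {t_pre:.4f}s, one-liner: {t_one:.4f}s")
```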
Jean-François Fabre
  • `maxlen=len(max(a,key=len))` looks a little odd. Any reason to do it this way over `maxlen=max(len(sublist) for sublist in a)`? – Patrick Haugh Nov 16 '16 at 20:51
  • no, it was just the most complicated one-liner I could think of :). I _know_ I have to avoid `filter`, `map` and a lot of complex constructs which are best replaced by listcomps/gencomps most of the time, but I continue doing so... edited with your alternative. – Jean-François Fabre Nov 16 '16 at 21:21
  • I know what you mean. I did nothing but functional programming for a few years and sometimes I feel like I've come back and am trying to tell everyone that it's all just shadows on a wall. – Patrick Haugh Nov 16 '16 at 21:26
  • very philosophical :) I lost a good answer the other day over those functional artifacts. And in python 3 when you need a list from those you have to explicitly convert it, making it even more complex. That's decided: I'm stopping :) thanks for getting me out of the cave. – Jean-François Fabre Nov 16 '16 at 21:30
  • With `itertools.zip_longest` you don't need to figure out the max length yourself. – hpaulj Nov 16 '16 at 22:34
  • Yes but you have to transpose the data afterwards. – Jean-François Fabre Nov 16 '16 at 22:52
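The `itertools.zip_longest` approach mentioned in the last two comments might look like the following sketch. `zip_longest` pads the shorter sublists, but it yields columns rather than rows, so a second `zip` is used here to transpose back.

```python
from itertools import zip_longest

a = [[2, 3], [1, 2, 3], [1]]

# zip_longest walks the sublists column by column, filling gaps with 0:
# (2, 1, 1), (3, 2, 0), (0, 3, 0)
columns = zip_longest(*a, fillvalue=0)

# A second zip transposes the columns back into rows
b = [list(row) for row in zip(*columns)]
print(b)  # [[2, 3, 0], [1, 2, 3], [1, 0, 0]]
```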
5

How about

pd.DataFrame(a).fillna(0)

To get exactly what you asked for:

pd.Series(pd.DataFrame(a).fillna(0).astype(int).values.tolist()).to_frame('column')
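If the ragged lists already live in a DataFrame column, as in the question's P.S., one possible sketch is to pad them in place with `Series.map`. This swaps in the list-based padding rather than the `fillna` route above, and it assumes a column named `'column'` that holds Python lists.

```python
import pandas as pd

a = [[2, 3], [1, 2, 3], [1]]
f = pd.DataFrame({'column': a})  # one column holding ragged lists

# Longest list in the column, converted to a plain Python int
maxlen = int(f['column'].map(len).max())

# Right-pad every list with zeros up to maxlen
f['column'] = f['column'].map(lambda sl: sl + [0] * (maxlen - len(sl)))
print(f['column'].tolist())  # [[2, 3, 0], [1, 2, 3], [1, 0, 0]]
```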

This is also related to this question, where you can get much better performance with:

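import numpy as np
import pandas as pd

def box(v):
    # length of each sublist
    lens = np.array([len(item) for item in v])
    # True where a position holds a real value, False where padding goes
    mask = lens[:,None] > np.arange(lens.max())
    out = np.full(mask.shape, 0, dtype=int)
    out[mask] = np.concatenate(v)  # fill the real values row by row
    return out

pd.DataFrame(dict(columns=box(a).tolist()))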
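To make the broadcasting trick in `box` easier to follow, here is a step-by-step sketch of the same idea on the sample data, printing the intermediate mask. It uses `np.zeros` in place of `np.full(..., 0)`, which is equivalent here.

```python
import numpy as np

a = [[2, 3], [1, 2, 3], [1]]
lens = np.array([len(item) for item in a])  # array([2, 3, 1])

# Broadcasting compares each length (as a column) with positions 0..maxlen-1
mask = lens[:, None] > np.arange(lens.max())
print(mask.tolist())
# [[True, True, False], [True, True, True], [True, False, False]]

out = np.zeros(mask.shape, dtype=int)
out[mask] = np.concatenate(a)  # fills the True slots row by row
print(out.tolist())  # [[2, 3, 0], [1, 2, 3], [1, 0, 0]]
```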

Community
piRSquared
0
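maxlen = max(len(i) for i in a)  # longest sublist length (3 for the example)
for i in a:
    while len(i) < maxlen:  # pads each sublist in place
        i.append(0)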
Eric