Padding a 2D NumPy array with rows of varying length to the same size

Question

I have read a post on this topic asking exactly the same thing, but the solution there was to loop over the rows and pad each one inside the loop. The person answering explained that there is currently no way to handle "jagged arrays" like this one directly, so looping is the only way.

I think there must be a nicer and more efficient solution, because numpy.pad is a powerful function and looping just doesn't seem right, but I have no idea how to accomplish it.

To make this clearer, this is what the data looks like:

output_data_set =[[0, 0, 0], 
                  [0, 0, 0, 0, 2, 0, 126], 
                  [0, 2, 0, 126, 0, 20, 207],
                  [0, 0, 0], 
                  [0, 0, 0]]

And I want this:

final = [[0, 0, 0, 0, 0, 0, 0], 
         [0, 0, 0, 0, 2, 0, 126], 
         [0, 2, 0, 126, 0, 20, 207],
         [0, 0, 0, 0, 0, 0, 0], 
         [0, 0, 0, 0, 0, 0, 0]]

My code looks like this (I didn't get the chance to test it on a bigger file because it takes ages!):

        for i in range(len(output_data_set)):
            # max_entries is 3 in this case, and the -2 is because I omit
            # the last 2 data values (I don't believe it matters either way)
            pad_left = (max_entries * 3) - (len(output_data_set[i]) - 2)
            temp = np.pad(output_data_set[i], (pad_left, 0), 'constant', constant_values=0)
            print('I am at element', i, 'of', len(output_data_set), 'elements')
            if i == 0:
                final = temp
            else:
                final = np.vstack((final, temp))
        return final

If anyone can give me any better solution, I would love you for it!

Thank you

Especially if you are starting with a list of lists, `zip_longest` is convenient and easy to understand. - hpaulj, Aug 24 '17 at 16:56
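
The `zip_longest` suggestion in the comment above could look roughly like the following. This is a minimal sketch, assuming the data starts as a plain list of lists exactly as shown in the question and that zero is the fill value; it is not taken from the comment itself. Note that `zip_longest` fills the missing entries at the end of each short row, whereas the loop in the question pads on the left, so the two only agree when the short rows are all zeros, as they are in the sample data.

```python
from itertools import zip_longest

import numpy as np

# Sample data copied from the question: rows of unequal length.
output_data_set = [[0, 0, 0],
                   [0, 0, 0, 0, 2, 0, 126],
                   [0, 2, 0, 126, 0, 20, 207],
                   [0, 0, 0],
                   [0, 0, 0]]

# zip_longest(*rows) walks the rows column by column and fills the
# missing entries of the shorter rows with fillvalue. The result is the
# padded table in transposed form, so .T restores one row per input row.
final = np.array(list(zip_longest(*output_data_set, fillvalue=0))).T

print(final.shape)  # (5, 7)
print(final)
```

This builds the whole array in one pass instead of calling `np.vstack` once per row, which copies the growing array each time and is a likely reason the loop gets slow on larger files.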
