Convert jagged lists into numpy array

Question

I have a list whose elements have different lengths (see below):

[(880),
 (880, 1080),
 (880, 1080, 1080),
 (470, 470, 470, 1250)]

I want to convert it to a numpy.array with the same layout, even if I have to fill the blank spaces with zeros.

For example it should look like this:

[(880, 0, 0, 0),
 (880, 1080, 0, 0),
 (880, 1080, 1080, 0),
 (470, 470, 470, 1250)]

How can I do that?

score 1 · Answer 1 · answered May 11 '21 at 17:33

You might think your input is a list of tuples. However, it is a list of integers and tuples: (880) is interpreted as an integer, not as a tuple, because parentheses alone do not create a tuple. A one-element tuple needs a trailing comma, as in (880,). So you have to deal with both data types.
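A quick check makes the difference visible. This is only a minimal sketch, and the variable names are illustrative, not taken from the question:

```python
not_a_tuple = (880)   # parentheses only group the expression
one_tuple = (880,)    # the trailing comma is what creates a tuple

print(type(not_a_tuple).__name__)  # int
print(type(one_tuple).__name__)    # tuple
print(len(one_tuple))              # 1
```

If you control how the input is written, using (880,) for the first element would let you treat every element as a tuple.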

First of all, I suggest converting your input data to a list of lists. Each inner list should have the same length, because a regular numeric array has to be rectangular: every row needs the same number of elements. Therefore, I would convert each element into a list and pad it with zeros until all elements are equal in length.

If we do this for all of the elements given in the input list, we create a new list containing lists of equal length which can be converted into an array.

A very basic (and error-prone) approach would look like this:

import numpy as np


original_list = [
    (880),
    (880, 1080),
    (880, 1080, 1080),
    (470, 470, 470, 1250),
]


def get_len(item):
    try:
        return len(item)
    except TypeError:
        # `(880)` is an int, not a tuple. Integers do not support len(),
        # so a TypeError is raised; count the value as a single element.
        return 1


def to_list(item):
    try:
        return list(item)
    except TypeError:
        # `(880)` is an int, not a tuple. Integers are not iterable,
        # so list() raises a TypeError; wrap the value in a list instead.
        return [item]


def fill_zeros(item, max_len):
    item_len = get_len(item)
    to_fill = [0] * (max_len - item_len)
    as_list = to_list(item) + to_fill
    return as_list


max_len = max([get_len(item) for item in original_list])
filled = [fill_zeros(item, max_len) for item in original_list]

arr = np.array(filled)
print(arr)

This prints:

[[ 880    0    0    0]
 [ 880 1080    0    0]
 [ 880 1080 1080    0]
 [ 470  470  470 1250]]
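A more compact variant of the same idea is sketched below. It is an alternative, not the approach shown above: it assumes all values are integers, uses np.atleast_1d to treat a bare integer and a tuple alike, and fills a preallocated zero array instead of padding Python lists. The names rows and padded are illustrative.

```python
import numpy as np

original_list = [
    (880),
    (880, 1080),
    (880, 1080, 1080),
    (470, 470, 470, 1250),
]

# np.atleast_1d turns both a bare int and a tuple into a 1-D array
rows = [np.atleast_1d(item) for item in original_list]
max_len = max(len(row) for row in rows)

# start from an all-zero array and overwrite the leading part of each row
padded = np.zeros((len(rows), max_len), dtype=int)
for i, row in enumerate(rows):
    padded[i, :len(row)] = row

print(padded)
```

Because the zeros are already in place, only the real values have to be written, and the try/except helpers are no longer needed.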
