How to transform a 1D list of values into a 2D grid of 0's and 1's in Python

Question

I would like to take a list of values and transform it into a table (2D list) of 0's and 1's, with one column for each unique number in the source list (in the sample these are 1 through 6) and the same number of rows as the original list. Each row has a single 1, placed in the column whose index equals the original value minus 1.

I have code that accomplishes this task, but I'm wondering if there is a better/faster way to do it. (The actual dataset has millions of entries vs. the simplified set below)

Sample Input:

value_list = [1, 2, 1, 3, 6, 5, 4, 3]

Desired output:

output_table = [[1, 0, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [1, 0, 0, 0, 0, 0],
                [0, 0, 1, 0, 0, 0],
                [0, 0, 0, 0, 0, 1],
                [0, 0, 0, 0, 1, 0],
                [0, 0, 0, 1, 0, 0],
                [0, 0, 1, 0, 0, 0]]

Current Solution:

value_list = [1, 2, 1, 3, 6, 5, 4, 3]
max_val = max(value_list)

# initialize to table of 0's
a = [([0] * max_val) for i in range(len(value_list))]

# overwrite with 1's where required
for i in range(len(value_list)):
    j = value_list[i] - 1
    a[i][j] = 1

print('a =')
for row in a:
    print(row)

If you're dealing with large amounts of data, it might be worth using NumPy. What kind of data is it? — AMC, Jan 23 '20 at 18:37
It comes from a text file (I extract that into the 1D list of values in another step). All of the numbers in the source data are integers. — KKB, Jan 23 '20 at 18:39
This is basically one-hot encoding. If you can use NumPy your life will be easier. I've marked a duplicate for you to have a look at — rayryeng, Jan 23 '20 at 18:47
If you're looking for reducing processing (and potentially even memory), it may be possible to subclass `numpy.ndarray` such that the underlying data is just the contents of `value_list`, but returns views that look like `output_table`. — Aaron, Jan 23 '20 at 18:52
`import numpy as np; a = np.array([1, 2, 1, 3, 6, 5, 4, 3]); b = np.arange(1,a.max()+1); c = 1 * (a[:,None] == b[None,:])` — wwii, Jan 23 '20 at 19:20
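wwii's one-liner above relies on broadcasting: comparing the values as a column against the candidate column labels as a row yields a boolean grid, which is then converted to integers. Below is a minimal sketch of the same idea, spelled out step by step. The names `values`, `columns`, and `table` are illustrative, and the `uint8` dtype is an assumption made here to keep memory down for millions of rows; it is not part of the comment.

```python
import numpy as np

value_list = [1, 2, 1, 3, 6, 5, 4, 3]
values = np.asarray(value_list)

# candidate column labels 1..max(value_list)
columns = np.arange(1, values.max() + 1)

# shape (n, 1) compared with shape (1, m) broadcasts to an (n, m) boolean grid;
# astype turns True/False into 1/0 (uint8 is an assumption to save memory)
table = (values[:, None] == columns[None, :]).astype(np.uint8)

print(table)
```

Note that the comparison builds a full boolean grid before the conversion, so for very wide tables an indexed assignment into a preallocated array may be lighter on memory.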

score 1 · Accepted Answer · answered Jan 23 '20 at 18:44


You can do:

import numpy as np

value_list = [1, 2, 1, 3, 6, 5, 4, 3]

# create matrix of zeros
x = np.zeros(shape=(len(value_list), max(value_list)), dtype='int')

for i,v in enumerate(value_list):
    x[i,v-1] = 1

print(x)

Output:

[[1 0 0 0 0 0]
 [0 1 0 0 0 0]
 [1 0 0 0 0 0]
 [0 0 1 0 0 0]
 [0 0 0 0 0 1]
 [0 0 0 0 1 0]
 [0 0 0 1 0 0]
 [0 0 1 0 0 0]]

answered Jan 23 '20 at 18:44

Sociopath



You can make the assignment of all rows simultaneously without a loop by `x[np.arange(len(value_list)), np.array(value_list) - 1] = 1` after you create the initial array `x`. – rayryeng Jan 24 '20 at 00:21
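Spelled out as a complete script, the loop-free assignment from the comment above might look like the sketch below. It assumes, as in the question, that every value is an integer of at least 1; `values` and `rows` are names introduced here for illustration.

```python
import numpy as np

value_list = [1, 2, 1, 3, 6, 5, 4, 3]
values = np.asarray(value_list)

# create matrix of zeros: one row per value, one column per possible value
x = np.zeros((len(values), values.max()), dtype=int)

# row i gets a 1 in column values[i] - 1, for all rows in one indexed assignment
rows = np.arange(len(values))
x[rows, values - 1] = 1

print(x)
```

Pairing each row index with its column index `value - 1` lets NumPy set all the 1's in a single step instead of a Python-level loop.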

score 0 · Answer 2 · answered Jan 23 '20 at 18:41

You can try this:

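# width is the largest value rather than a hard-coded 6; value_list as in the question
dummy_list = [0] * max(value_list)
# for each value i: zeros before index i-1, a 1, then zeros after
output_table = [dummy_list[:i-1] + [1] + dummy_list[i:] for i in value_list]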

Output:

output_table = [[1, 0, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [1, 0, 0, 0, 0, 0],
                [0, 0, 1, 0, 0, 0],
                [0, 0, 0, 0, 0, 1],
                [0, 0, 0, 0, 1, 0],
                [0, 0, 0, 1, 0, 0],
                [0, 0, 1, 0, 0, 0]]
