I am trying to create dummy columns from arrays.
My original dataset looks something like the following
|---------------------|------------------|
| UserID | Categories |
|---------------------|------------------|
| 1 | 1 44 56 423 |
|---------------------|------------------|
| 2 | 44 54 624 |
|---------------------|------------------|
What I am trying to achieve is a table that dummifies all the categories (e.g. creating a new column of Categories_1, Categories_44, Categories_56 etc), where array length is not fixed.
The output should look something like this - with all possible columns, depending on what is present in the categories column
|-----------|-----------------|-----------------|-----------------|
| UserID | Categories_1 | Categories_44 | Categories_54 |
|-----------|-----------------|-----------------|-----------------|
| 1 | 1 | 1 | 0 |
|-----------|-----------------|-----------------|-----------------|
| 2 | 0 | 1 | 1 |
|-----------|-----------------|-----------------|-----------------|
From the initial table - I've managed to split the categories down with the SPLIT function
SELECT SPLIT(Categories, " ") FROM `dataset.table`
It's given me an array, and i'm not too sure how to proceed from here.
Hard coding this with a CASE WHEN would be nearly impossible this because the number of users extend up to tens of thousands, and categories in hundreds