How do I split a string of numbers into numbered columns in a Pandas dataframe?

Question

I have a pandas dataframe that looks like this:

ImageID	labels	caption_text
0.JPG	1	Woman in swim suit holding parasol
1.JPEG	1 19	a black and silver clock tower
2.JPEG	8 3 13	This photo shows people skiing in the mountains.

The labels in this data set range from 1 to 19, and I am trying to give each label its own column. The final dataframe should have 19 additional columns, each holding a 1 or a 0.

For example, "8 3 13" should have a 1 in columns 8, 3 and 13 and 0s everywhere else.

So far I have managed to put the labels into arrays and into columns, but neither gives me what I need.

Any ideas on how I can achieve this?

Thanks!

Welcome to SO! Please share a [reproducible code snippet of your dataframe](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) along with your code attempt as a [mcve]. Thanks. — ggorlen, May 23 '21 at 06:53

score 5 · Accepted Answer · answered May 23 '21 at 06:57

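Since you already know the labels range from 1 to 19, you can use str.get_dummies and then reindex:

n = 19
# One 0/1 column per label that occurs, then force columns '1'..'19' (as strings);
# labels that never occur are added as all-zero columns.
arr = df['labels'].str.get_dummies(' ').reindex(map(str, range(1, n + 1)), axis=1, fill_value=0)
print(arr)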

   1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19
0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
1  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1
2  0  0  1  0  0  0  0  1  0   0   0   0   1   0   0   0   0   0   0

Finally you can concat this with the original dataframe:

out = pd.concat((df,arr),axis=1)
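
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the dataframe from the question is rebuilt by hand with `labels` stored as space-separated strings (the question does not show how `df` was created), and the name `cols` is introduced only for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
# Assumption: 'labels' holds space-separated strings.
df = pd.DataFrame({
    "ImageID": ["0.JPG", "1.JPEG", "2.JPEG"],
    "labels": ["1", "1 19", "8 3 13"],
    "caption_text": [
        "Woman in swim suit holding parasol",
        "a black and silver clock tower",
        "This photo shows people skiing in the mountains.",
    ],
})

n = 19
# Column names are strings because get_dummies produces string labels.
cols = [str(i) for i in range(1, n + 1)]

# Split on spaces into 0/1 indicator columns, then add the missing labels as zeros
# and put the columns in numeric order.
arr = df["labels"].str.get_dummies(" ").reindex(columns=cols, fill_value=0)

# Attach the 19 indicator columns to the original frame.
out = pd.concat([df, arr], axis=1)
print(out.loc[2, ["3", "8", "13"]].tolist())
```

Note that the new columns are named with strings ('1' to '19'), so `out['8']` works while `out[8]` does not.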

score 1 · Answer 2 · answered May 23 '21 at 07:33


As an alternative, you can iterate over the possible labels and check whether each row's labels string contains that label:

n = 19
for i in range(1, n+1):
    df[i] = df['labels'].str.contains(rf'\b{i}\b').astype(int)
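
The `\b` word boundaries are what keep label 1 from matching inside "13" or "19". A small sketch on the question's sample labels, assuming a frame with only the `labels` column (the `naive` list is added here purely for comparison and is not part of the answer):

```python
import pandas as pd

# Assumption: only the 'labels' column from the question, as strings.
df = pd.DataFrame({"labels": ["1", "1 19", "8 3 13"]})

n = 19
for i in range(1, n + 1):
    # \b anchors the match to a whole number, so 1 does not match "13".
    df[i] = df["labels"].str.contains(rf"\b{i}\b").astype(int)

# For comparison: a plain substring search wrongly flags the third row.
naive = df["labels"].str.contains("1").astype(int).tolist()

print(df[1].tolist())  # [1, 1, 0]
print(naive)           # [1, 1, 1]
```

Here the new columns are named with integers (1 to 19), unlike the string column names produced by get_dummies.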


Nick

