How to split the items of a space-separated list into columns in pandas

Question

I have a dataframe with multiple columns and the content of one of the columns looks like a list:

import pandas as pd
df = pd.DataFrame({'Emojis':['[1 2 3 4]', '[4 5 6]']})

What I want to do is split the contents of these "lists" into columns. Since the lists are not all the same size, the number of columns should equal the maximum number of items (5 items is the max), and wherever a row has fewer items than that, the remaining columns should be null.

So the output will be something like this:

      Emojis  it1  it2  it3   it4   it5
0  [1 2 3 4]    1    2    3     4  null
1    [4 5 6]    4    5    6  null  null

I tried this:

splitlist = df['Emojis'].apply(pd.Series)
df2 = pd.concat([df, splitlist], axis=1)

but it is not close to what I want, since the column does not really hold lists: the values are stored in the DataFrame as strings (object dtype) with no commas between the items.

Does this answer your question? [Pandas: split column of lists of unequal length into multiple columns](https://stackoverflow.com/questions/44663903/pandas-split-column-of-lists-of-unequal-length-into-multiple-columns) — jglad, Feb 10 '23 at 20:55
@jglad Not really, since they have a comma-separated list in their dataframe. Mine technically is not a list; that's why I said it looks like a list — sariii, Feb 10 '23 at 20:57
Did you try using code to make the thing that looks like a list into an actual list, and then applying the solution for actual lists? Is the question actually "how do I make the list-looking thing into an actual list", perhaps? — Karl Knechtel, Feb 10 '23 at 21:50
I tried; I just did not include all my efforts. Regardless, I think this answer is more the pandas way than turning the list-like string into a real list and then applying that solution — sariii, Feb 10 '23 at 22:07
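
For reference, the approach suggested in the comment above (first turn each list-looking string into an actual list, then expand the lists into columns) could look roughly like the sketch below. It is not from any of the posted answers, and it assumes every value is wrapped in one pair of square brackets with items separated by whitespace; the names `as_lists` and `df2` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Emojis': ['[1 2 3 4]', '[4 5 6]']})

# Strip the surrounding brackets, then split on whitespace to get real Python lists.
as_lists = df['Emojis'].str.strip('[]').str.split()

# Expand the lists into columns; shorter rows are padded with missing values.
splitlist = pd.DataFrame(as_lists.tolist(), index=df.index)
splitlist.columns = [f'it{i + 1}' for i in splitlist.columns]

df2 = pd.concat([df, splitlist], axis=1)
print(df2)
```

Note that this only creates as many columns as the longest row has items (four for the sample data), and the items stay strings.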

mozway · Accepted Answer · 2023-02-10T21:02:23.790

2

You can use:

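# raw string so that \d is not treated as an invalid escape sequence
out = df.join(pd.DataFrame(df['Emojis'].str.findall(r'\d+').to_list(),
                           index=df.index)
              .reindex(columns=range(5))             # always produce 5 item columns
              .rename(columns=lambda x: f'it{x+1}')  # 0..4 -> it1..it5
              )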

Output:

      Emojis it1 it2 it3   it4  it5
0  [1 2 3 4]   1   2   3     4  NaN
1    [4 5 6]   4   5   6  None  NaN

edited Feb 10 '23 at 21:02

answered Feb 10 '23 at 21:00

mozway


My answer: `pd.DataFrame.from_records(df['Emojis'].str.findall('\d+')).add_prefix('it')` :-) – Corralien Feb 10 '23 at 21:02
@Corralien why not post it separately? – Karl Knechtel Feb 10 '23 at 21:51
@KarlKnechtel. There is too little difference with my answer. Mozway was the fastest. There is no reason to have 2 answers so close for the OP. – Corralien Feb 10 '23 at 21:57
@Corralien It's very frustrating when someone posts the same answer as you just 1 or 2 minutes earlier. I've run into this many times with mozway, beny and jezrael. – Ynjxsjmh Feb 11 '23 at 16:22
@Ynjxsjmh sorry :( – mozway Feb 11 '23 at 18:03
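
As a variation on the accepted answer (a sketch only, not something posted in this thread), the same layout can probably be reached with `str.split(expand=True)` instead of `findall`, with an optional conversion to pandas' nullable integer dtype so the items become numbers rather than strings. It assumes the items are integers separated by whitespace inside a single pair of brackets; `n_cols` and `parts` are illustrative names, and the value 5 is taken from the question.

```python
import pandas as pd

df = pd.DataFrame({'Emojis': ['[1 2 3 4]', '[4 5 6]']})

n_cols = 5  # assumed maximum number of items, as stated in the question

parts = (df['Emojis']
         .str.strip('[]')                  # drop the surrounding brackets
         .str.split(expand=True)           # one column per whitespace-separated item
         .reindex(columns=range(n_cols))   # pad to a fixed number of columns
         .apply(pd.to_numeric)             # strings -> numbers, missing -> NaN
         .astype('Int64')                  # nullable integers, missing -> <NA>
         .rename(columns=lambda i: f'it{i + 1}'))

out = df.join(parts)
print(out)
```

The `reindex` step is what guarantees the `it5` column even when no row has five items; dropping the `to_numeric` and `astype` steps leaves the items as strings, as in the accepted answer.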

score 1 · Answer 2 · answered Feb 10 '23 at 21:13

1

You can also use:

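df = pd.DataFrame({'Emojis':['[1 2 3 4]', '[4 5 6]']})
for i in range(5):
    column_name = 'it' + str(i + 1)  # it1..it5, matching the requested output
    # take the character at position 1 + 2*i; only valid for single-character items
    df[column_name] = df['Emojis'].astype(str).str[1 + 2 * i]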

answered Feb 10 '23 at 21:13

Mitchell Faulk
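
One caveat about the positional indexing in the answer above: it relies on every item being exactly one character wide. The small check below (a sketch with a made-up value, `'[10 2 3]'`, that does not appear in the question) compares it with the regex-based extraction from the accepted answer.

```python
import pandas as pd

# Hypothetical value containing a two-digit item.
s = pd.Series(['[10 2 3]'])

# Fixed character positions 1, 3, 5, as in the loop-based answer.
positional = [s.str[1 + 2 * i].iloc[0] for i in range(3)]

# Regex-based tokens, as in the accepted answer.
tokens = s.str.findall(r'\d+').iloc[0]

print(positional)
print(tokens)
```

With a multi-digit item the fixed positions no longer line up with the items, whereas the regex still returns whole numbers.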

