I have a dataframe similar to this:

import pandas as pd

data = {"col_1": [0, 1, 2],
        "col_2": ["abc", "defg", "hi"]}
df = pd.DataFrame(data)

Visually:

   col_1 col_2
0      0   abc
1      1  defg
2      2    hi

What I'd like to do is split col_2 into its individual characters and append each character as a new column of the dataframe.

Example iterative method:

def get_chars(string):
    chars = []
    for char in string:
        chars.append(char)
    return chars

char_df = pd.DataFrame()
for i in range(len(df)):
    char_arr = get_chars(df.loc[i, "col_2"])
    temp_df = pd.DataFrame(char_arr).T
    char_df = pd.concat([char_df, temp_df], ignore_index=True, axis=0)

df = pd.concat([df, char_df], ignore_index=True, axis=1)

Which results in the correct form:

   0     1  2  3    4    5
0  0   abc  a  b    c  NaN
1  1  defg  d  e    f    g
2  2    hi  h  i  NaN  NaN

But I believe iterating through the dataframe like this is very inefficient, so I want to find a faster (ideally vectorised) solution.

In reality, I'm not really splitting up strings, but the point of this question is to find a way to efficiently process one column, and return many.

Rory LM
  • Does this answer your question? [Split / Explode a column of dictionaries into separate columns with pandas](https://stackoverflow.com/questions/38231591/split-explode-a-column-of-dictionaries-into-separate-columns-with-pandas) – KaRaOkEy Feb 26 '21 at 08:33

1 Answer

If you need performance, convert the values to lists and pass them to the DataFrame constructor:

df = df.join(pd.DataFrame([list(x) for x in df['col_2']], index=df.index))

Or:

df = df.join(pd.DataFrame(df['col_2'].apply(list).tolist(), index=df.index))

print (df)
   col_1 col_2  0  1     2     3
0      0   abc  a  b     c  None
1      1  defg  d  e     f     g
2      2    hi  h  i  None  None
jezrael
  • Great, thank you. In reality col_2 was holding a list as a string: `[1.0023, 3.0421, ...]`, so I just had to change `list` to `eval` in your first solution. – Rory LM Feb 26 '21 at 08:44
  • @RoryLM - Or use `ast.literal_eval(x)` instead of `eval` - [link](https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice) – jezrael Feb 26 '21 at 08:47
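
For the case raised in the comments, where col_2 holds lists written out as strings, the first solution might be adapted roughly as follows. This is only a sketch: the sample values are made up for illustration, and the names `parsed`, `expanded` and `result` are hypothetical.

```python
import ast

import pandas as pd

# Made-up sample data: each cell of col_2 is a list written out as a string.
df = pd.DataFrame({
    "col_1": [0, 1, 2],
    "col_2": ["[1.0, 2.5, 3.0]", "[4.0, 5.5, 6.0, 7.5]", "[8.0, 9.5]"],
})

# ast.literal_eval parses Python literals only, so unlike eval it will not
# execute arbitrary code found in the strings.
parsed = [ast.literal_eval(x) for x in df["col_2"]]

# Build the new columns in one constructor call; rows with fewer items
# get missing values in the trailing columns.
expanded = pd.DataFrame(parsed, index=df.index)

result = df.join(expanded)
print(result)
```

As in the answer, the new columns get integer labels (0, 1, 2, ...), and passing `index=df.index` keeps the rows aligned when `df` does not have a default index.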