Separate column values into new columns with no delimiters?

Question

In a file I have column values that look like

I am currently trying to separate each individual numerical value into its own column.

I have tried doing it via indexing:

But I get this error when doing so:

KeyError: 0L

I know this would be much easier if there were a delimiter and I could use `split`, but since that is not the case I am running into some issues.

[How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples?r=Saves_AllUserSaves) — Mark Tolonen, May 03 '23 at 16:48
[Please don't post pictures of text](https://meta.stackoverflow.com/q/285551/4518341). Copy the text itself and [edit] it into your post. — wjandrea, May 03 '23 at 17:05
`0L`? Are you using Python 2? You should switch to Python 3 ASAP, [Python 2 is EOL](https://www.python.org/doc/sunset-python-2/). — wjandrea, May 03 '23 at 17:08
@Timeless That's not an MRE. An MRE needs to include desired output and all the code. — wjandrea, May 03 '23 at 17:08
Oops, but it's still a reproducible example, no? I can roll it back, though, if you want. — Timeless, May 03 '23 at 17:10
@Timeless It's useful to have the input data as text, but it's not a reproducible example, like I said. — wjandrea, May 03 '23 at 17:12

score 3 · Accepted Answer · edited May 03 '23 at 17:29

IIUC, here is one option:

splits = (
    df["numbers"]
    .astype(str).str.findall(r"\d")
    .apply(pd.Series)
    .rename(columns=lambda x: f"TEST{x+1}")
)

out = df.join(splits)
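
For anyone who wants to run this end to end, here is a minimal sketch of the same chain. The sample frame is an assumption (three values taken from the output in this answer), and the final `astype(int)` is my addition, not part of the original chain: `findall` returns strings, so the new columns hold `"3"` rather than `3` unless they are converted.

```python
import pandas as pd

# Assumed sample data: the first three values from the output in this answer.
df = pd.DataFrame({"numbers": [3336, 1020, 5060]})

splits = (
    df["numbers"]
    .astype(str).str.findall(r"\d")          # each row becomes a list of digit strings
    .apply(pd.Series)                        # expand each list into columns 0, 1, 2, ...
    .rename(columns=lambda x: f"TEST{x+1}")  # 0 -> TEST1, 1 -> TEST2, ...
    .astype(int)                             # optional (my addition): digits as integers
)

out = df.join(splits)
print(out)
```

Drop the `.astype(int)` line if the digits should stay as strings.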

Another variant:

splits = (
    pd.DataFrame(
        df["numbers"]
        .astype(str).str.findall(r"\d")
        .to_list())
    .rename(columns=lambda x: f"TEST{x+1}")
)

Or, as suggested by @Quang Hoang, use one of these statements as the first step of the `splits` chain:

df["numbers"].astype(str).str.extractall(r"(\d)")[0].unstack()

df["numbers"].astype(str).str.split("", expand=True).iloc[:, 1:-1]
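
To make the `extractall` route less opaque, here is a rough sketch that breaks it into steps. The sample frame and the intermediate names `matches` and `wide` are mine, for illustration only.

```python
import pandas as pd

# Assumed sample data: the first three values from the output in this answer.
df = pd.DataFrame({"numbers": [3336, 1020, 5060]})

# extractall returns one row per match, indexed by (original row, match number).
# [0] selects the single capture group as a Series.
matches = df["numbers"].astype(str).str.extractall(r"(\d)")[0]

# unstack pivots the match number into columns: one row per original row again.
wide = matches.unstack()

splits = wide.rename(columns=lambda x: f"TEST{x+1}")
out = df.join(splits)
print(out)
```

Here the column labels coming out of `unstack()` start at 0, so the `x+1` rename gives `TEST1` to `TEST4`. With the `split` statement, the labels left after `.iloc[:, 1:-1]` should start at 1 instead, so the rename would probably need `f"TEST{x}"` there.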

Or simply, as pointed out by @wjandrea, use this:

df["numbers"].apply(lambda n: pd.Series(list(str(n))))

Output:

print(out)

    numbers TEST1 TEST2 TEST3 TEST4
0      3336     3     3     3     6
1      1020     1     0     2     0
2      5060     5     0     6     0
3      6060     6     0     6     0
4      5141     5     1     4     1
5      3121     3     1     2     1
6      1010     1     0     1     0
7      5060     5     0     6     0
8      2020     2     0     2     0
9      1030     1     0     3     0
10     1010     1     0     1     0
11     1010     1     0     1     0
12     1010     1     0     1     0
13     1121     1     1     2     1

`.str.extractall('(\d)')[0].unstack()` or `.str.split("", expand=True).iloc[:, 1:-1]`. — Quang Hoang, May 03 '23 at 17:21
I liked the `split` approach, answer updated. Thanks @QuangHoang ;) — Timeless, May 03 '23 at 17:25
Many roads lead to Rome. I added your approach as well, @wjandrea ;) — Timeless, May 03 '23 at 17:28

score 0 · Answer 2 · answered May 03 '23 at 17:02

This is so silly that I am flabbergasted it took me so long to figure it out. I had to use `.str` before indexing:

df['numbers'].str[0]
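
A minimal sketch of how that could be extended to every position. It assumes each value has exactly four digits, and the sample frame and the `digits` name are mine. Plain `df['numbers'][0]` looks up a row by label, which is presumably where the `KeyError` came from, whereas `.str[i]` takes character `i` of every value. The `.str` accessor only works on strings, so integer columns need `astype(str)` first.

```python
import pandas as pd

# Assumed sample data: the first three values from the accepted answer's output.
df = pd.DataFrame({"numbers": [3336, 1020, 5060]})

# .str requires string values, so convert first if the column holds integers.
digits = df["numbers"].astype(str)

# Assumption: every value is exactly four characters long.
for i in range(4):
    df[f"TEST{i+1}"] = digits.str[i]

print(df)
```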

Nia Jackson

