
In a file, I have a column whose values look like this:

screenshot of data

I am currently trying to separate each individual digit into its own column.

I have tried doing it via indexing:

screenshot of code

But I get this error when doing so:

KeyError: 0L

I know this would be so much easier if there was a delimiter and I could use split, but since that is not the case I am running into some issues.

Timeless
  • 3
    [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples?r=Saves_AllUserSaves) – Mark Tolonen May 03 '23 at 16:48
  • [Please don't post pictures of text](https://meta.stackoverflow.com/q/285551/4518341). Copy the text itself and [edit] it into your post. – wjandrea May 03 '23 at 17:05
  • `0L`? Are you using Python 2? You should switch to Python 3 ASAP, [Python 2 is EOL](https://www.python.org/doc/sunset-python-2/). – wjandrea May 03 '23 at 17:08
  • 1
    @Timeless That's not an MRE. An MRE needs to include desired output and all the code. – wjandrea May 03 '23 at 17:08
  • Oops, but it's still a reproducible example, no? I can roll it back if you want. – Timeless May 03 '23 at 17:10
  • 1
    @Timeless It's useful to have the input data as text, but it's not a reproducible example, like I said. – wjandrea May 03 '23 at 17:12
  • 1
    Is this column a numeric datatype or string datatype? – Scott Boston May 03 '23 at 17:37

2 Answers


IIUC, here is one option:

splits = (
    df["numbers"]
    .astype(str).str.findall(r"\d")
    .apply(pd.Series)
    .rename(columns=lambda x: f"TEST{x+1}")
)

out = df.join(splits)

Another variant:

splits = (
    pd.DataFrame(
        df["numbers"]
        .astype(str).str.findall(r"\d")
        .to_list())
    .rename(columns=lambda x: f"TEST{x+1}")
)

Or, as suggested by @Quang Hoang, use one of these statements as the first step of the `splits` chain:

df["numbers"].astype(str).str.extractall(r"(\d)")[0].unstack()

df["numbers"].astype(str).str.split("", expand=True).iloc[:, 1:-1]

Or simply, as pointed out by @wjandrea, use this:

df["numbers"].apply(lambda n: pd.Series(list(str(n))))

Output:

print(out)

    numbers TEST1 TEST2 TEST3 TEST4
0      3336     3     3     3     6
1      1020     1     0     2     0
2      5060     5     0     6     0
3      6060     6     0     6     0
4      5141     5     1     4     1
5      3121     3     1     2     1
6      1010     1     0     1     0
7      5060     5     0     6     0
8      2020     2     0     2     0
9      1030     1     0     3     0
10     1010     1     0     1     0
11     1010     1     0     1     0
12     1010     1     0     1     0
13     1121     1     1     2     1
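
For a self-contained check, here is a minimal sketch of the second variant. The small `df` is an assumption, built from the first three rows of the output above. Passing `index=df.index` is an addition not in the snippets above; it keeps `join` aligned if the frame does not have the default `0..n-1` index.

```python
import pandas as pd

# Sample data copied from the first three rows of the output above
df = pd.DataFrame({"numbers": [3336, 1020, 5060]})

# One list of single-character digit strings per row
digits = df["numbers"].astype(str).str.findall(r"\d")

# Expand the lists into columns, reusing df's index so join() lines up
splits = pd.DataFrame(digits.to_list(), index=df.index).rename(
    columns=lambda x: f"TEST{x + 1}"
)

out = df.join(splits)
print(out)
```

Note that the new columns hold strings, not integers; append `.astype(int)` to `splits` if numeric values are needed and every row has the same number of digits.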
wjandrea
Timeless

This is so silly; I am flabbergasted it took me so long to figure this out. I had to use `.str` before indexing:

df['numbers'].str[0]
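
A likely explanation for the `KeyError`, sketched under assumptions: `df['numbers'][0]` looks up the row labelled `0` in the index rather than the first character of each value, so it fails when no such label exists. The frame below is hypothetical, with string values and an index deliberately chosen to lack the label `0`. If the column is numeric, it would need `.astype(str)` before `.str`.

```python
import pandas as pd

# Hypothetical frame: string values, and an index that has no label 0
df = pd.DataFrame({"numbers": ["3336", "1020", "5060"]}, index=[10, 11, 12])

try:
    df["numbers"][0]  # label lookup on the index, not character indexing
except KeyError as err:
    print("KeyError:", err)

# .str[i] indexes into each string element-wise, one new column per position
for i in range(4):
    df[f"TEST{i + 1}"] = df["numbers"].str[i]

print(df)
```

The loop assumes every value has exactly four characters; shorter values would produce `NaN` in the missing positions.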