1

I have a column consisting of rows with different strings (Python), e.g.:

  1. 5456656352
  2. 435365
  3. 46765432 ...

I want to separate each string every 2 digits with a comma, so that I get the following result:

  1. 54,56,65,63,52
  2. 43,53,65
  3. 46,76,54,32 ...

Can someone help me, please?

MC Emperor
  • Does this answer your question? [Split string every nth character?](https://stackoverflow.com/questions/9475241/split-string-every-nth-character) – mkrieger1 Apr 06 '22 at 20:40

2 Answers

1

I'm not sure about the structure of the desired output (pandas DataFrames, plain strings, etc.), but you can always use a regex pattern like:

import re
re.findall(r"\d{2}", "5456656352")

Output

['54', '56', '65', '63', '52']

You can have this output as a string too:

",".join(re.findall(r"\d{2}", "5456656352"))

Output

54,56,65,63,52

Explanation

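\d{2} is a regex pattern that matches exactly two consecutive digits. re.findall returns every non-overlapping match from left to right, so the string is split into a list of two-digit pieces. If the string has an odd number of digits, the final single digit does not match and is left out.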
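As a small sketch of how the pattern treats odd-length input: `\d{2}` only matches complete pairs, so the last digit of an odd-length string is lost. The alternative pattern `\d{1,2}` shown here is an assumption for illustration and is not part of the original question; it keeps the leftover digit.

```python
import re

# \d{2} matches exactly two digits, so the trailing "5" is not matched.
pairs_only = re.findall(r"\d{2}", "12345")

# \d{1,2} matches two digits when possible, otherwise one (assumed alternative).
with_remainder = re.findall(r"\d{1,2}", "12345")

print(pairs_only)      # ['12', '34']
print(with_remainder)  # ['12', '34', '5']
```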

Edit

Based on your comment, you want to apply this to a column. In that case, wrap the expression above in a function (called split_it here) that returns the joined string, and pass it to apply:

df["my_column"] = df["my_column"].apply(split_it)
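The line above needs a `split_it` function to exist. A minimal sketch of one possibility follows; the name `split_it` comes from the comment thread, while the `str()` conversion is an assumption in case the column holds integers rather than strings. Note the `return`: without it the function returns `None` for every row.

```python
import re

def split_it(value):
    """Return the digits of `value` in comma-separated groups of two."""
    # str() is an assumption: it lets the function accept ints as well as strings.
    return ",".join(re.findall(r"\d{2}", str(value)))

print(split_it("5456656352"))  # 54,56,65,63,52
```

`apply` calls the function once per cell of the column, so `value` is a single entry, not the whole Series.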
TheFaultInOurStars
  • I need to apply this function to one column of a dataframe. This is what I tried, but I get an error message ('Series' object has no attribute 'split_it'). Probably something goes wrong with the selection of the rows of my column: `import re`, `def split_it(Age): ",".join(re.findall("\d{2}", Age))`, `df['DMAge']= df['DMAge'].split_it(df[:,'DMAge'])` – Wouter van Epperzeel Apr 08 '22 at 12:42
  • @WoutervanEpperzeel Thanks for the comment. I have edited the answer as per your comment. – TheFaultInOurStars Apr 08 '22 at 12:54
  • How should I build my function? I guess the input of my function is each row of the column, because findall only works on a string (= one row of the column) and not on a whole column. I now have this function: `def split_it(Age): ",".join(re.findall("\d{2}", Age))`, where Age represents one row of the column – Wouter van Epperzeel Apr 08 '22 at 13:16
1

Try:

text = "5456656352"
print(",".join(text[i:i + 2] for i in range(0, len(text), 2)))

output:

54,56,65,63,52

You can wrap it in a function if you want to apply it to a DataFrame or ...

Note: this splits from the left, so if the length is odd, a single digit is left over at the end.
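Wrapped into a function, the slicing approach could look like the sketch below. The name `split_pairs` and the `size` parameter are hypothetical, chosen only for illustration; the second call shows the odd-length behaviour described in the note.

```python
def split_pairs(text, size=2):
    """Join consecutive chunks of `size` characters with commas, left to right."""
    text = str(text)  # assumption: tolerate non-string input such as ints
    return ",".join(text[i:i + size] for i in range(0, len(text), size))

print(split_pairs("5456656352"))  # 54,56,65,63,52
print(split_pairs("12345"))       # 12,34,5
```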

S.B
  • Hi SorousH, can you explain how I need to build my function? I am a bit confused about the input parameters of the function. I think it needs to be a row of a column of my df, but I'm not sure about it – Wouter van Epperzeel Apr 10 '22 at 09:12
  • @WoutervanEpperzeel You can pass this lambda to the `.apply()` method of your DataFrame column: `lambda text: ",".join(text[i:i + 2] for i in range(0, len(text), 2))` – S.B Apr 10 '22 at 10:12