split-string-every-nth-character with nth+1 separator '0'

Question

I have a series of digits stored as a string, for example: IN: '01110100001101001001110100'

I want to split it as follows: OUT: ['01110100', '0', '01101001', '0', '01110100']

Notice that every ninth character is a '0' that separates the groups of binary digits.

In other words: the first eight digits, a separator '0', the next eight, a separator '0', the next eight, a separator '0', etc.

I know how to split on every nth character (Split string every nth character?), but the issue here is a bit more complicated: there is a separator '0'.

Thanks a lot for your help.

Kindest regards

You can easily do this with `re.findall()`. Use a regexp with two capture groups: the first one matches 8 digits, the second one matches the `0` separator. — Barmar, Dec 14 '21 at 22:16
Hi and thanks for your help. Could you show exactly how to do this? A concrete illustration would help, as is normally the convention on Stack Overflow. Thanks in advance and kindest regards — Laurent Salé, Dec 14 '21 at 22:27
You can use a loop: take `[:8]` and then set `IN = IN[8:]`, next take `[:1]` and then set `IN = IN[1:]` (end of loop) — furas, Dec 15 '21 at 01:36
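
The loop described in the last comment could be sketched roughly as follows. This is only an illustration, not code posted in the thread: the function name `split_with_separator` and its parameters `chunk_len` and `sep_len` are hypothetical.

```python
def split_with_separator(text, chunk_len=8, sep_len=1):
    """Alternate between chunk_len-character and sep_len-character slices."""
    parts = []
    while text:
        # Take the next group of digits and drop it from the front.
        parts.append(text[:chunk_len])
        text = text[chunk_len:]
        # Take the separator only if something is left after the group.
        if text:
            parts.append(text[:sep_len])
            text = text[sep_len:]
    return parts


print(split_with_separator('01110100001101001001110100'))
# ['01110100', '0', '01101001', '0', '01110100']
```

Each slice creates a new string, so for very long inputs an index-based or iterator-based approach may be preferable.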

Paul M. · Answer 1 · 2021-12-14T23:26:22.763

1

You just want to alternate between grabbing a slice of length 8, and then a slice of length 1, right?

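from itertools import cycle, islice

def get_slices(string):
    # Consume the string through a single shared iterator so that each
    # islice() call continues where the previous one stopped.
    string_iter = iter(string)
    # Alternate between a slice of length 8 and a slice of length 1.
    slice_lens = cycle([8, 1])

    # The walrus operator (:=) requires Python 3.8+. The loop ends when
    # the iterator is exhausted and the joined slice is an empty string.
    while slc := "".join(islice(string_iter, next(slice_lens))):
        yield slc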

print(list(get_slices("abcdefghijklmnopqr")))

Output:

['abcdefgh', 'i', 'jklmnopq', 'r']
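
The reason this works is that every `islice()` call draws from the same iterator, so each slice starts where the previous one ended. A minimal sketch of that behaviour on its own (the variable names here are illustrative only):

```python
from itertools import islice

it = iter("abcdefghijk")

# Each islice() call advances the shared iterator.
first = "".join(islice(it, 8))   # next 8 characters
second = "".join(islice(it, 1))  # the single character after them
rest = "".join(it)               # whatever is left

print(first, second, rest)
# abcdefgh i jk
```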

edited Dec 14 '21 at 23:26

answered Dec 14 '21 at 22:38

Paul M.

It works perfectly ! Thanks a lot and kindest regards – Laurent Salé Dec 14 '21 at 23:01
@LaurentSalé Glad it's working! I've edited my answer with a slightly more compact version (requires Python 3.8+). – Paul M. Dec 14 '21 at 23:26

score 1 · Accepted Answer · answered Dec 15 '21 at 04:52

You can use a regular expression with capture groups.

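import re

instr = '01110100001101001001110100'
# (\d?) makes the separator optional, so a final 8-digit group with
# no trailing separator is still matched (its second group is '').
pairs = re.findall(r'(\d{8})(\d?)', instr)
# Flatten the (group, separator) tuples and drop the empty separator.
outlist = [s for s in sum(pairs, ()) if s]

print(outlist)

re.findall() returns a list of tuples; sum(pairs, ()) concatenates them into one flat tuple, and the list comprehension drops the empty string left when the last group has no separator.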
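
To make the intermediate list of tuples visible, here is a small sketch that prints it before flattening. It uses `itertools.chain.from_iterable` as an alternative way to flatten, and it assumes an optional separator group `(\d?)` so that a final 8-digit group with no separator after it is kept. Both choices are assumptions made for this illustration.

```python
import re
from itertools import chain

text = '01110100001101001001110100'

# One (group, separator) tuple per match; the separator is '' when
# the string ends right after an 8-digit group.
pairs = re.findall(r'(\d{8})(\d?)', text)
print(pairs)
# [('01110100', '0'), ('01101001', '0'), ('01110100', '')]

# Flatten the tuples and skip the empty separator.
flat = [part for part in chain.from_iterable(pairs) if part]
print(flat)
# ['01110100', '0', '01101001', '0', '01110100']
```

`chain.from_iterable` walks the tuples lazily, whereas `sum(..., ())` builds a new tuple at each step, which only matters for long inputs.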

answered Dec 15 '21 at 04:52

Barmar

Thanks a lot and kindest regards – Laurent Salé Dec 16 '21 at 11:37
