Using Python (mostly regex), I would like to produce the following output:
string = 'leelee'
result = [('l',1),('e',2),('l',1),('e',2)]
You can do it with the help of regex, but not with regex alone.
First use a regex to group consecutive identical characters, then use a list comprehension to count the characters in each group.
import re
s = 'leelee'
x = re.findall(r'(.)(\1*)',s)
print([(e[0], 1 + len(e[1])) for e in x])  # [('l', 1), ('e', 2), ('l', 1), ('e', 2)]
The regex above captures a character (.), then matches that character any number of times if it immediately follows it (\1*).
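As a variation on the same idea, here is a minimal sketch that uses re.finditer and takes the length of each whole match instead of adding 1 to the second group. The function name run_lengths is made up for illustration, and passing re.DOTALL is my assumption that newlines should be counted like any other character:

```python
import re

def run_lengths(text):
    """Return (character, run length) tuples for each run of repeated characters."""
    # (.) captures one character; \1* extends the match over its immediate repeats.
    # re.DOTALL (an assumption here) lets '.' match newlines, so none are skipped.
    pattern = re.compile(r'(.)\1*', re.DOTALL)
    # group(0) is the whole run, so its length is the repeat count.
    return [(m.group(1), len(m.group(0))) for m in pattern.finditer(text)]

print(run_lengths('leelee'))  # [('l', 1), ('e', 2), ('l', 1), ('e', 2)]
```

The counting still happens in Python, not in the regex itself; the pattern only finds the runs.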
Why would you need regex? Python's * is string multiplication, and + is string concatenation. For example:
print("h" * 5) # hhhhh
print("h" + "t") # ht
Here's a version with nested for loops:
# Each item in result is a (char, times) tuple, so unpack it directly.
for char, times in result:
    for _ in range(times):
        print(char, end='')
Or here's one with a comprehension and join:
print(''.join([x * y for x, y in result]))
Or the most direct solution:
print(string)
I don't think you'll find one that just uses regexes though...
You can do this with regex plus other tools, but it's not ideal. Using itertools.groupby is much easier.
from itertools import groupby
result = [(k, sum(1 for _ in g)) for k, g in groupby(string)]
Here sum(1 for _ in g) counts the items in each group: the groups that groupby yields are iterators, so len() cannot be called on them directly.
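For a self-contained version, the groupby approach can be wrapped in a function and paired with the join-based reconstruction from the earlier answer. This is only a sketch; the names encode and decode are hypothetical and do not come from the thread:

```python
from itertools import groupby

def encode(text):
    """Collapse each run of identical characters into a (char, count) tuple."""
    # groupby yields (key, group) for each run of consecutive equal items;
    # the group is an iterator, so count it by summing 1 per item.
    return [(char, sum(1 for _ in group)) for char, group in groupby(text)]

def decode(pairs):
    """Rebuild the original string by repeating each character count times."""
    return ''.join(char * count for char, count in pairs)

pairs = encode('leelee')
print(pairs)          # [('l', 1), ('e', 2), ('l', 1), ('e', 2)]
print(decode(pairs))  # leelee
```

Round-tripping through decode is a quick way to check that the encoding lost nothing.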