I needed to join single-letter words that are separated only by whitespace, which was straightforward. I used the following regex pattern:
res = re.sub(r'(?<=\b[A-Za-z]\b)\s+(?=[a-zA-Z]\b)|\s+$',
             '',
             text,
             flags=re.IGNORECASE)
This joins the single letters in strings such as:
A B SCHOOL DISTRICT
J B UNIVERISTY
X Z SCHOOL LAB
which become:
AB SCHOOL DISTRICT
JB UNIVERISTY
XZ SCHOOL LAB
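For reference, here is a minimal self-contained check of the first pattern against the sample strings above. The helper name `join_single_letters` is only for illustration and is not part of my actual code.

```python
import re

# Fixed-width lookbehind: exactly one letter standing alone as a word.
SINGLE_LETTER_GAP = re.compile(r'(?<=\b[A-Za-z]\b)\s+(?=[A-Za-z]\b)|\s+$')

def join_single_letters(text):
    """Remove whitespace between adjacent one-letter words, plus trailing whitespace."""
    return SINGLE_LETTER_GAP.sub('', text)

samples = ["A B SCHOOL DISTRICT", "J B UNIVERISTY", "X Z SCHOOL LAB"]
for s in samples:
    print(join_single_letters(s))
```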
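However, when I extend this to words of one or two letters, the lookbehind no longer works: Python's re module requires lookbehind patterns to be fixed-width, so a quantifier such as {1,2} is not allowed there. I therefore replaced the lookbehind with a capturing group and put the captured text back in the replacement: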
res = re.sub(r'(\b[A-Za-z]{1,2}\b)\s+(?=[a-zA-Z]{1,2}\b)|\s+$',
             r'\1',
             text,
             flags=re.IGNORECASE)
For example, the following strings:
AB XY SCHOOL DISTRICT
JB ZC UNIVERISTY
XZ AB SCHOOL LAB
become:
ABXY SCHOOL DISTRICT
JBZC UNIVERISTY
XZAB SCHOOL LAB
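For reference, here is a minimal script that reproduces both points: the compile error when the lookbehind contains `{1,2}`, and the capturing-group version applied to the sample strings. The names `lookbehind_error` and `join_short_words` are only for illustration.

```python
import re

# A {1,2} quantifier makes the lookbehind variable-width, which re rejects
# when the pattern is compiled.
try:
    re.compile(r'(?<=\b[A-Za-z]{1,2}\b)\s+(?=[A-Za-z]{1,2}\b)')
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = str(exc)
print(lookbehind_error)

# Workaround: capture the preceding word instead of looking behind for it.
SHORT_WORD_GAP = re.compile(r'(\b[A-Za-z]{1,2}\b)\s+(?=[A-Za-z]{1,2}\b)|\s+$')

def join_short_words(text):
    # \1 re-inserts the captured word. In the trailing-whitespace branch the
    # group is unmatched and expands to an empty string (Python 3.5+).
    return SHORT_WORD_GAP.sub(r'\1', text)

samples = ["AB XY SCHOOL DISTRICT", "JB ZC UNIVERISTY", "XZ AB SCHOOL LAB"]
for s in samples:
    print(join_short_words(s))
```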
The second pattern already does the job, but I wonder whether it is the best way to do it. Is there a better approach for handling a quantifier inside a lookbehind?
Thanks