Suppose I have a string of the following form:

ABCDEF_(0-100;1)(A|B)_GHIJ_(A-F)

I want to be able to expand this to:

ABCDEF_0A_GHIJ_A
ABCDEF_1A_GHIJ_A
ABCDEF_2A_GHIJ_A
...
ABCDEF_100A_GHIJ_A

ABCDEF_0B_GHIJ_A
ABCDEF_1B_GHIJ_A
ABCDEF_2B_GHIJ_A
...
ABCDEF_100B_GHIJ_A

ABCDEF_0A_GHIJ_B
ABCDEF_1A_GHIJ_B
ABCDEF_2A_GHIJ_B
...
ABCDEF_100A_GHIJ_B

ABCDEF_0B_GHIJ_B
ABCDEF_1B_GHIJ_B
ABCDEF_2B_GHIJ_B
...
ABCDEF_100B_GHIJ_B

ABCDEF_0A_GHIJ_C
ABCDEF_1A_GHIJ_C
ABCDEF_2A_GHIJ_C
...
ABCDEF_100A_GHIJ_C

..and so on

The string above is shorthand for:

STRING_(START-END;INC)(A OR B)_STRING_(A THRU F)

However, these regex-like groups can appear anywhere in the string. For example, the string could also be:

ABCDEF_(A|B)_(0-100;1)_(A-F)_GHIJ

Here's what I tried so far:

trend = 'ABCDEF_(0-100;1)(A|B)_GHIJ_(A-F)'

def expandDash(trend):
    dashCount = trend.count("-")
    for dC in range(0, dashCount):
        dashIndex = trend.index("-")-1
        trendRange = trend[dashIndex:]
        bareTrend = trend[0:trend.index("(")]
        beginRange = trendRange[0:trendRange.index("-")]
        endRange = trendRange[trendRange.index("-"):trendRange.index(";")]
        trendIncrement = trendRange[-1]
        expandedTrendList = []


def regexExpand(trend):

    for regexTrend in trend.split(')'):
        if "-" in regexTrend:
            print trend
            expandDash(regexTrend)

I'm obviously stuck here...

Is there any easy way to do the string expansion using REGEX?

Mark Kennedy
(Nothing personal, re: close vote. It's not an easy duplicate to find; I only know about it because I've seen it before, so it's understandable if you weren't able to find it. If you disagree with the close vote, explain in a comment why that link isn't relevant.) – Andrew Cheong Nov 18 '13 at 00:26

1 Answer

You can parse your mini expression language fairly easily with a regex, but a regex alone can't do the expansion. That takes ordinary code, for example a recursive generator:

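import re

# Groups: 1 = literal prefix before the first "(...)";
# 2-3 = single-character range such as A-F;
# 4-6 = numeric range START-END;INC;
# 7 = alternatives such as A|B;
# 8 = remainder of the string, expanded recursively.
TREND_REGEX = re.compile(r'(^.*?)(?:\((?:([^-)])-([^)])|(\d+)-(\d+);(\d+)|([^)|]+(?:\|[^)|]+)*))\)(.*))?$')

def expand(trend):
    m = TREND_REGEX.match(trend)
    if m.group(8):
        # Expand everything after the first group first.
        suffixes = expand(m.group(8))
    else:
        suffixes = ['']
    if m.group(2):
        # Character range: both endpoints are inclusive.
        for z in suffixes:
            for i in range(ord(m.group(2)), ord(m.group(3))+1):
                yield m.group(1) + chr(i) + z
    elif m.group(4):
        # Numeric range: END is inclusive, INC is the step.
        for z in suffixes:
            for i in range(int(m.group(4)), int(m.group(5))+1, int(m.group(6))):
                yield m.group(1) + str(i) + z
    elif m.group(7):
        # Alternatives separated by "|".
        for z in suffixes:
            for s in m.group(7).split('|'):
                yield m.group(1) + s + z
    else:
        # No group left: the string is a plain literal.
        yield trend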
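
For comparison, here is a rough non-recursive sketch of the same idea. It is not the code above, just one possible alternative: it splits the string on the parenthesised groups with re.split and combines the pieces with itertools.product. The names GROUP, NUM_RANGE, CHAR_RANGE, choices and expand_product are made up for illustration, and it assumes groups are never nested and that a numeric range always carries its ";INC" part.

```python
import itertools
import re

# Hypothetical helper patterns; the group syntax is taken from the question.
GROUP = re.compile(r'\(([^()]*)\)')            # one non-nested "(...)" group
NUM_RANGE = re.compile(r'(\d+)-(\d+);(\d+)$')  # START-END;INC
CHAR_RANGE = re.compile(r'(.)-(.)$')           # single characters, e.g. A-F


def choices(body):
    """Return the list of strings that one group body stands for."""
    m = NUM_RANGE.match(body)
    if m:
        start, end, step = map(int, m.groups())
        return [str(i) for i in range(start, end + 1, step)]
    m = CHAR_RANGE.match(body)
    if m:
        first, last = ord(m.group(1)), ord(m.group(2))
        return [chr(c) for c in range(first, last + 1)]
    return body.split('|')  # alternatives such as A|B


def expand_product(trend):
    """Yield every expansion, with the leftmost group varying fastest."""
    # With one capturing group, re.split alternates literal text (even
    # indices) and group bodies (odd indices).
    parts = GROUP.split(trend)
    pools = [choices(p) if i % 2 else [p] for i, p in enumerate(parts)]
    # itertools.product varies its last argument fastest, so the pools are
    # reversed going in and each combination is reversed coming out.
    for combo in itertools.product(*reversed(pools)):
        yield ''.join(reversed(combo))


if __name__ == '__main__':
    results = list(expand_product('ABCDEF_(0-100;1)(A|B)_GHIJ_(A-F)'))
    print(len(results))   # 1212 (101 numbers * 2 alternatives * 6 letters)
    print(results[:2])    # ['ABCDEF_0A_GHIJ_A', 'ABCDEF_1A_GHIJ_A']
```

Because the groups are located with a regex rather than by position, this should work wherever they appear in the string, including the second form from the question, 'ABCDEF_(A|B)_(0-100;1)_(A-F)_GHIJ'.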
pobrelkey