I have a question about how to properly parse a string like the following,

"(test.function, arr(3,12), "combine,into one")"

into the following list,

['test.function', 'arr(3,12)', '"combine,into one"']

Note: the 'list' items in the original string are not necessarily separated by a comma and a space; two items can also be separated by a comma alone, e.g. test.function,arr(3,12).

Basically, I want to:

  1. Parse the contents of the outer parentheses without descending into the inner ones (hence nestedExpr() can't be used as-is).
  2. Split the items on commas, even though the items themselves may contain commas.

Moreover, I can only use scanString() and not parseString().
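To illustrate the two requirements above, here is a minimal sketch (not a solution) of what a naive comma split does to the sample input. It cuts inside both arr(3,12) and the quoted string, which is why plain splitting is not enough. The variable names are only for illustration.

```python
# Sample input from the question.
sample = '(test.function, arr(3,12), "combine,into one")'

# Strip the outer parentheses, then split on every comma.
naive = [part.strip() for part in sample[1:-1].split(",")]

# The nested call and the quoted string are both broken apart.
print(naive)
```

The split produces five fragments instead of the three items wanted.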

I've searched SO and found this and this, but I couldn't adapt them to my problem.

Thanks!

Sam Tatasurya

1 Answer

This should address your nesting and quoting issues:

sample = """(test.function, arr(3,12),"combine,into one")"""

from pyparsing import (Suppress, removeQuotes, quotedString, originalTextFor, 
    OneOrMore, Word, printables, nestedExpr, delimitedList)

# punctuation and basic elements
LPAR,RPAR = map(Suppress, "()")
quotedString.addParseAction(removeQuotes)

# what are the possible values inside the ()'s?
# - quoted string - anything is allowed inside quotes, match these first
# - any printable, not containing ',', '(', or ')', with optional nested ()'s
#   (use originalTextFor helper to extract the original text from the input
#   string)
value = (quotedString 
         | originalTextFor(OneOrMore(Word(printables, excludeChars="(),") 
                                     | nestedExpr())))

# define an overall expression, with surrounding ()'s
expr = LPAR + delimitedList(value) + RPAR

# test against the sample
print(expr.parseString(sample).asList())

prints:

['test.function', 'arr(3,12)', 'combine,into one']
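Since the question mentions being limited to scanString(), here is a hedged sketch of the same grammar used that way. It is a variation on the code above, not a different technique: the removeQuotes parse action is left out so the quoted item keeps its quotes, as in the output the question asks for. The input text is a made-up example in which the list is embedded in other text. Run it in a fresh interpreter, because addParseAction in the code above modifies the shared quotedString object in place.

```python
from pyparsing import (Suppress, quotedString, originalTextFor, OneOrMore,
                       Word, printables, nestedExpr, delimitedList)

LPAR, RPAR = map(Suppress, "()")

# No removeQuotes here, so quoted items keep their surrounding quotes.
value = (quotedString
         | originalTextFor(OneOrMore(Word(printables, excludeChars="(),")
                                     | nestedExpr())))
expr = LPAR + delimitedList(value) + RPAR

# Hypothetical input: the parenthesized list sits inside other text.
text = 'call (test.function,arr(3,12), "combine,into one") and move on'

# scanString yields a (tokens, start, end) tuple for each match it finds.
matches = [(tokens.asList(), start, end)
           for tokens, start, end in expr.scanString(text)]

for items, start, end in matches:
    print(items, text[start:end])
```

With this input there should be a single match holding the three items, and text[start:end] gives the matched span including the outer parentheses.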
PaulMcG
  • Hi Paul, thanks for sharing this solution. This one addresses my question. I'm aware of originalTextFor() and nestedExpr(), but never thought to implement both of them in such a way. – Sam Tatasurya Nov 21 '16 at 08:14