I tried searching for this and only found PHP answers. I'm using Python on Google App Engine and am trying to remove nested quotes.

For example:

[quote user2]
[quote user1]Hello[/quote]
World
[/quote]

I would like to run something that keeps only the outermost quote:

[quote user2]World[/quote]
TylerW

2 Answers

Not sure if you just want the quotes, or the whole input with nested quotes removed. This pyparsing sample does both:

stuff = """
Other stuff
[quote user2] 
[quote user1]Hello[/quote] 
World 
[/quote] 
Other stuff after the stuff
"""

from pyparsing import (Word, printables, originalTextFor, Literal, OneOrMore, 
    ZeroOrMore, Forward, Suppress)

# prototype username
username = Word(printables, excludeChars=']')

# BBCODE quote tags
openQuote = originalTextFor(Literal("[") + "quote" + username + "]")
closeQuote = Literal("[/quote]")

# use negative lookahead to not include BBCODE quote tags in the body of the quote
contentWord = ~(openQuote | closeQuote) + (Word(printables,excludeChars='[') | '[')
content = originalTextFor(OneOrMore(contentWord))

# define recursive definition of quote, suppressing any nested quotes
quotes = Forward()
quotes << ( openQuote + ZeroOrMore( Suppress(quotes) | content ) + closeQuote )

# put separate tokens back together
quotes.setParseAction(lambda t : '\n'.join(t))

# quote extractor
for q in quotes.searchString(stuff):
    print(q[0])

# nested quote stripper
print(quotes.transformString(stuff))

Prints:

[quote user2]
World
[/quote]

Other stuff
[quote user2]
World
[/quote] 
Other stuff after the stuff
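
For comparison, here is a rough standard-library sketch of the same idea that does not use pyparsing. It is not part of the sample above: the function name strip_nested_quotes and the tag regex are made up for illustration, and it assumes the tags look like [quote], [quote username] and [/quote]. It walks the tags with a depth counter and keeps only text at depth 0 or 1:

```python
import re

# Hypothetical tag pattern: "[quote]", "[quote username]" or "[/quote]".
QUOTE_TAG = re.compile(r"\[quote(?:\s[^\]]*)?\]|\[/quote\]")

def strip_nested_quotes(text):
    """Return text with every quote nested inside another quote removed."""
    out = []
    depth = 0   # how many quotes are currently open
    pos = 0     # end of the previous tag
    for m in QUOTE_TAG.finditer(text):
        # Text before this tag is kept only outside quotes (depth 0)
        # or directly inside an outermost quote (depth 1).
        if depth <= 1:
            out.append(text[pos:m.start()])
        if m.group() == "[/quote]":
            # Keep the closing tag only if it closes an outermost quote.
            # An unmatched closing tag at depth 0 is silently dropped.
            if depth == 1:
                out.append(m.group())
            depth = max(depth - 1, 0)
        else:
            depth += 1
            # Keep the opening tag only if it opens an outermost quote.
            if depth == 1:
                out.append(m.group())
        pos = m.end()
    out.append(text[pos:])  # trailing text after the last tag
    return "".join(out)

sample = "[quote user2]\n[quote user1]Hello[/quote]\nWorld\n[/quote]"
print(strip_nested_quotes(sample))
```

Unlike the pyparsing grammar, this does no validation: an unclosed quote simply swallows any nested quotes that follow it, so treat it as a starting point rather than a replacement for a real parser.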
PaulMcG

You should find and use a real BBCode parser in Python. Googling brings up some hits - for example this one, and this one.

Eli Bendersky
  • Oh, I'm actually using the first one! But when I tested it, the quotes kept going and going. I didn't think that maybe it was able to address this issue, I'll check it out. – TylerW Dec 10 '11 at 04:05