2

I'm trying to get a regular expression to match the text between curly braces

The following SO question gave me a start, but it's not working for me, and I'm not sure what I'm doing wrong. Consider the following:

The {quick} brown fox {jumped over the} lazy old {dog}. While {the [0ld] man} spoke {to the} gardener.

What I am trying to do is to match all the text between the curly braces, so I can highlight them. The expression

\{(.*?)\}

did not work for me. I'm not sure why. I'm using python 2.10/pyqt, and the QRegExp class of pyqt on Windows.

Can anyone point out what I'm doing wrong?

Just to add more detail, this time with some code. Consider the following:

import sys
from PyQt4.QtGui import *
from PyQt4.QtCore import *

class MyHighlighter( QSyntaxHighlighter ):

    def __init__( self, parent, theme ):

        QSyntaxHighlighter.__init__( self, parent )
        self.parent = parent
        keyword = QTextCharFormat()


        self.highlightingRules = []

        # keyword
        brush = QBrush( Qt.darkBlue, Qt.SolidPattern )
        keyword.setForeground( brush )
        keyword.setFontWeight( QFont.Bold )
        keywords = QStringList( [ "break", "else", "for", "if", "in", 
                                  "next", "repeat", "return", "switch", 
                                  "try", "while" ] )
        for word in keywords:

            pattern = QRegExp("\\b" + word + "\\b")
            rule = HighlightingRule( pattern, keyword )
            self.highlightingRules.append( rule )


        # braces
        singlebraces = QTextCharFormat()
        pattern = QRegExp( "\{(.*?)\}" )
        pattern.setMinimal( False )
        brush = QBrush( Qt.darkRed, Qt.SolidPattern )
        singlebraces.setForeground( brush )
        rule = HighlightingRule( pattern, singlebraces )
        self.highlightingRules.append( rule )

    def highlightBlock( self, text ):

        for rule in self.highlightingRules:

            expression = QRegExp( rule.pattern )
            index = expression.indexIn( text )
            while index >= 0:
                length = expression.matchedLength()
                self.setFormat( index, length, rule.format )
                index = text.indexOf( expression, index + length )
            self.setCurrentBlockState( 0 )

class HighlightingRule():

    def __init__( self, pattern, format ):

        self.pattern = pattern
        self.format = format

class TestApp( QMainWindow ):

    def __init__(self):

        QMainWindow.__init__(self)
        font = QFont()
        font.setFamily( "Courier" )
        font.setFixedPitch( True )
        font.setPointSize( 10 )
        editor = QTextEdit()
        editor.setFont( font )
        highlighter = MyHighlighter( editor, "Classic" )
        self.setCentralWidget( editor )
        self.setWindowTitle( "Syntax Highlighter" )


if __name__ == "__main__":
    app = QApplication( sys.argv )
    window = TestApp()
    window.show()
    sys.exit( app.exec_() )

When I run this and type anything between curly braces, it does not get highlighted in red. I have left the keywords in to show that the code does do syntax highlighting.

Note: I did try the expression \{(.*?)\} on the website, and it did work there, but it's not clear why the expression isn't working in the code.

user595985
  • How does it not work for you? What's wrong with `>>> re.findall(r'\{(.+?)\}',s) ['quick', 'jumped over the', 'dog', 'the [0ld] man', 'to the'] `? – Mazdak Sep 17 '15 at 12:55
  • Your regex works... You can check here https://regex101.com/r/kV5dR1/1 If you still have problem post your code here. – Asunez Sep 17 '15 at 13:05

3 Answers

1

You need to use setMinimal(true):

QRegExp.setMinimal (self, bool minimal)

Enables or disables minimal matching. If minimal is false, matching is greedy (maximal) which is the default.

Thus, the code will look like:

QRegExp rx("\\{(.*)}"); 
rx.setMinimal(true);
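
To illustrate what greedy versus minimal matching changes, here is a small sketch using Python's standard `re` module instead of QRegExp. That substitution is an assumption made so the example runs without Qt: in `re` the minimal form is spelled `.*?`, whereas with QRegExp it is `setMinimal(true)` that makes a plain `.*` behave this way.

```python
import re

text = ("The {quick} brown fox {jumped over the} lazy old {dog}. "
        "While {the [0ld] man} spoke {to the} gardener.")

# Greedy: .* runs on to the LAST closing brace, giving one oversized match.
greedy = re.findall(r"\{(.*)\}", text)

# Minimal (lazy): .*? stops at the FIRST closing brace after each opening one.
minimal = re.findall(r"\{(.*?)\}", text)

print(len(greedy))
print(minimal)
```

With greedy matching the whole span from the first `{` to the last `}` is highlighted as one block, which is why minimal matching is needed here.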
Wiktor Stribiżew
0

The following SO question seems to have the answer, at least one that works for me. The expression

"\\{(.*)\\}"

seems to do the trick, but I would like to know WHY. My knowledge of regex could be written in its entirety, double-spaced, on the back of a napkin. Any additional clarification would help.
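
A likely explanation, offered as a sketch rather than a definitive diagnosis: the doubled backslashes are probably not what made the difference. In a normal (non-raw) Python string literal, `"\\{"` is a backslash followed by a brace, which is exactly the text that the raw string `r"\{"` holds. What did change is that the `?` after `*` was dropped. As the Qt documentation describes it, QRegExp does not support Perl-style lazy quantifiers such as `*?` on individual quantifiers; minimal matching is switched on for the whole pattern with `setMinimal(True)` instead. The string-escaping part can be checked without Qt:

```python
# The pattern written with doubled backslashes in a normal string literal.
escaped = "\\{(.*)\\}"

# The same pattern written as a raw string.
raw = r"\{(.*)\}"

# Both literals produce the identical sequence of characters.
print(escaped == raw)
print(escaped)
```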

user595985
-1

So, the following is working for me:

string = 'The {quick} brown fox {jumped over the} lazy old {dog}. While {the [0ld] man} spoke {to the} gardener.'

import re

ans = re.findall(r'{.*?}', string)

As pointed out by @Alan Moore, I was wrong about re matching the unescaped parentheses. Still, you don't need the escape sequences if you use raw string notation r'string'.
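
How escaping and grouping behave in this pattern can be checked with a short sketch using Python's `re` module (the variable names are illustrative, and this says nothing about QRegExp). Escaping the braces is optional in this particular pattern, while parentheses form a capture group rather than matching literally, which changes what `findall` returns.

```python
import re

text = "The {quick} brown fox {jumped over the} lazy old {dog}."

# Unescaped and escaped braces match the same literal characters here.
unescaped = re.findall(r"{.*?}", text)
escaped = re.findall(r"\{.*?\}", text)

# Parentheses create a capture group, so findall returns only the group text.
grouped = re.findall(r"\{(.*?)\}", text)

print(unescaped == escaped)
print(unescaped)
print(grouped)
```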

  • *I'm using python 2.10/pyqt, and the QRegExp class of pyqt on Windows.* Your answer does not solve the issue in the original question. – Wiktor Stribiżew Sep 17 '15 at 13:17
  • I don't know about `pyqt`, but this answer is definitely wrong for Python's `re` module. It may not be *necessary* to escape the braces in this regex, but it's not incorrect to do so. And unescaped parentheses are metacharacters; they will *not* be matched literally. – Alan Moore Sep 17 '15 at 15:03