19

I want to know how to change the style of certain words and expressions based on certain patterns.

I am using the Tkinter.Text widget and I am not sure how to do such a thing (the same idea as syntax highlighting in text editors). I am not even sure if this is the right widget to use for this purpose.

nbro
Lattice
  • It is the right widget. See what [`idle`](http://hg.python.org/cpython/file/63a00d019bb2/Lib/idlelib) does. – tzot Nov 12 '11 at 18:46
  • @tzot You could at least give a better indication of what files one should see. `idlelib` contains many files and modules, etc, and it is a little bit difficult to find something, in my opinion, without a real documentation, and mostly if one has not much experience. I will lead the users of this website first to this article: https://docs.python.org/3.5/library/idle.html – nbro Jul 12 '15 at 16:18

2 Answers

49

It's the right widget to use for these purposes. The basic concept is that you assign properties to tags, and you apply tags to ranges of text in the widget. You can use the text widget's search command to find strings that match your pattern, which will return enough information to apply a tag to the range that matched.
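As a rough illustration of that workflow, here is a minimal sketch for fixed strings only. The helper name `highlight_literal`, the sample text, and the tag name are assumptions made for this example; it relies only on the `search`, `tag_add`, and `tag_configure` methods of the Text widget.

```python
def highlight_literal(widget, needle, tag):
    """Apply `tag` to every occurrence of the fixed string `needle`.

    `widget` is expected to be a tkinter Text widget. Returns the number
    of ranges that were tagged.
    """
    if not needle:
        return 0  # an empty string would never advance the search
    tagged = 0
    start = "1.0"
    while True:
        # search() returns "" when nothing is found before stopindex
        index = widget.search(needle, start, stopindex="end")
        if not index:
            break
        end = f"{index}+{len(needle)}c"  # "+Nc" means N characters further
        widget.tag_add(tag, index, end)
        start = end
        tagged += 1
    return tagged


def demo():
    """Hypothetical demo; needs a display, so it is not called here."""
    import tkinter as tk

    root = tk.Tk()
    text = tk.Text(root)
    text.pack()
    text.insert("1.0", "red fish, blue fish, red fish")
    text.tag_configure("red", foreground="#ff0000")  # properties live on the tag
    highlight_literal(text, "red", "red")            # the tag is applied to ranges
    root.mainloop()
```

Calling `demo()` from a script should open a window with both occurrences of "red" drawn in red.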

For an example of how to apply tags to text, see my answer to the question Advanced Tkinter text box?. It's not exactly what you want to do but it shows the basic concept.

Following is an example of how you can extend the Text class to include a method for highlighting text that matches a pattern.

In this code the pattern must be a string; it cannot be a compiled regular expression. Also, when regexp is True, the pattern must adhere to Tcl's syntax rules for regular expressions.

import tkinter as tk


class CustomText(tk.Text):
    '''A text widget with a new method, highlight_pattern()

    example:

    text = CustomText()
    text.tag_configure("red", foreground="#ff0000")
    text.highlight_pattern("this should be red", "red")

    The highlight_pattern method is a simplified python
    version of the tcl code at http://wiki.tcl.tk/3246
    '''
    def __init__(self, *args, **kwargs):
        tk.Text.__init__(self, *args, **kwargs)

    def highlight_pattern(self, pattern, tag, start="1.0", end="end",
                          regexp=False):
        '''Apply the given tag to all text that matches the given pattern

        If 'regexp' is set to True, pattern will be treated as a regular
        expression according to Tcl's regular expression syntax.
        '''

        start = self.index(start)
        end = self.index(end)
        self.mark_set("matchStart", start)
        self.mark_set("matchEnd", start)
        self.mark_set("searchLimit", end)

        count = tk.IntVar()
        while True:
            index = self.search(pattern, "matchEnd","searchLimit",
                                count=count, regexp=regexp)
            if index == "": break
            if count.get() == 0: break # degenerate pattern which matches zero-length strings
            self.mark_set("matchStart", index)
            self.mark_set("matchEnd", "%s+%sc" % (index, count.get()))
            self.tag_add(tag, "matchStart", "matchEnd")
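Since the pattern has to be a plain string, one way to avoid the `TypeError` that a compiled pattern triggers is to unwrap it before calling `highlight_pattern`. The helper below is a small sketch; the name `as_search_pattern` is made up for this example. Note that it only hands the source text of the pattern to Tk, so the pattern still has to be valid under Tcl's regular expression syntax.

```python
import re


def as_search_pattern(pattern):
    """Return a plain string that can be passed to Text.search().

    Compiled patterns are unwrapped to their source string. Any flags
    given to re.compile() are ignored, because Tk never sees them.
    """
    if isinstance(pattern, re.Pattern):
        return pattern.pattern
    return pattern


# Possible use with the CustomText class above (needs a running Tk app):
# text.tag_configure("blue", foreground="#0000ff")
# text.highlight_pattern(as_search_pattern(re.compile(r"R\d+")), "blue",
#                        regexp=True)
```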
Bryan Oakley
  • Thanks, this has helped me immensely! Can you tell me how to change this so it accepts regular expressions as patterns, though? (When I try, I get `TypeError: '_sre.SRE_Pattern' object has no attribute '__getitem__'`) – CodingCat Oct 08 '12 at 12:02
  • @Lastalda: the text widget `search` method accepts a keyword argument named `regexp`. If you set this to `True` the pattern will be treated as a regular expression. I've updated my answer to include this functionality. Unfortunately tkinter-specific documentation on the `search` method is a bit sparse. If you read the official tk documentation it's explained a little better, though you have to do a small mental translation from tcl to python. See http://tcl.tk/man/tcl8.5/TkCmd/text.htm#M120 – Bryan Oakley Oct 08 '12 at 14:03
  • Thanks for looking into it. But I still get the same error. :( Am I doing something wrong with the regexp? I use `w.HighlightPattern(re.compile("R\d+"),"blue")` and I get the error traceback `File "C:\Python27\lib\lib-tk\Tkinter.py", line 3030, in search` `if pattern and pattern[0] == '-': args.append('--')` `TypeError: '_sre.SRE_Pattern' object has no attribute '__getitem__'` – CodingCat Oct 09 '12 at 07:26
  • @Lastalda: The `pattern` argument must be a string, not a compiled regular expression. – Bryan Oakley Oct 09 '12 at 11:02
  • Huh. I tried that first, then tried the compiled version when it didn't work. But now it does work. Anyway, it's working now. Thank you so much! – CodingCat Oct 10 '12 at 07:27
  • @tom_mai78101: I think you are incorrect. Did you actually try running the code? It highlights fine whether the pattern is a fixed string or a regular expression.`count` represents the number of characters that matched, and that's how many characters we want to highlight. If we match 10 characters, we want to highlight 10 characters. Why subtract 1? – Bryan Oakley Jun 11 '16 at 11:00
  • @tom_mai78101: I think it's simply the case that your pattern is matching more than you realize. It's impossible for me to say since I don't know what your pattern actually resolves to. – Bryan Oakley Jun 11 '16 at 19:05
  • @BryanOakley The pattern I am matching is `WedrBot`. No spaces, nor anything added. This is what happens before adding `-1`: http://i.imgur.com/HjXsVKb.png and after adding `-1`: http://i.imgur.com/wYHVApY.png I am guessing it may be because I specified my regex to not include the ] character, so it can differentiate from tags for the IRC, sender, and names within the text messages. But that is my educated guess. What do you think? – tom_mai78101 Jun 11 '16 at 19:13
  • @tom_mai78101: Take a close look at your pattern. If the pattern looks like `WedrBot([^\>])|...)` then that means it will match the string `WedrBot]` because the literal character `]` is matched by the sub-pattern `[^\>]` which means "any character other than >". You are telling it to match the literal string `WedrBot` followed by any character other than that one, so it matches `WedrBot]`. Again, it's not the highlighting code, it's the fact that your pattern is matching more than you think it does. Of that I am 100% certain. – Bryan Oakley Jun 11 '16 at 23:59
  • @BryanOakley I took a look, and indeed it is like that. I'm currently looking up alternatives for this. Thanks, and sorry for saying you're incorrect. – tom_mai78101 Jun 12 '16 at 00:24
  • Hmm. I tried the CustomText technique but it doesn't work for me, but I think I have some other problem. I've used the following technique successfully in a Text widget in one app but it doesn't work in another (`results` is a Text widget): `results.tag_config('search',foreground='orange',background='black'')` `where=results.search(findThis,INSERT,None,exact=True)` `#find end of selection` `pastSel=where+'+%dc'% len(findThis)` `results.tag_add('Search',where,pastSel)` – dday52 Jun 19 '23 at 13:30
  • @dday52: `'search'` and `'Search'` are two different tags. Case matters. – Bryan Oakley Jun 19 '23 at 17:21
  • @BryanOakley- you are right. A typo on my part. In the app having the problem, the tag is defined with 'search' as the name, and there is no tag defined as 'Search'. results.tag_add() uses the 'search' tag... I'm wondering if there is some Text widget property that can adversely affect highlighting. In my two apps, though, the Text widget is defined the same way and has the same options. – dday52 Jun 20 '23 at 18:45
  • @dday52: in my experience, highlighting is rock solid and deterministic. It works exactly as documented. – Bryan Oakley Jun 20 '23 at 19:09
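The point made in the comments about `[^\>]` consuming an extra character can be checked outside the widget. The sketch below uses Python's `re` module rather than Tk's search, and the sample line is invented for illustration. A lookahead is one possible alternative, because it tests the next character without making it part of the match; Tcl's advanced regular expressions also document `(?=...)`, but this example was only reasoned about for Python.

```python
import re

line = "[WedrBot] hello"  # hypothetical chat line

# A negated character class still consumes one character,
# so the "]" becomes part of the match.
consuming = re.search(r"WedrBot[^>]", line)

# A lookahead checks the next character without consuming it.
lookahead = re.search(r"WedrBot(?=[^>])", line)

print(repr(consuming.group()), len(consuming.group()))
print(repr(lookahead.group()), len(lookahead.group()))
```

The first match is eight characters long and the second is seven, which is why subtracting 1 from `count` appeared to fix the highlighting.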
0

Bryan Oakley's answer has helped me a lot on configuring highlights on many text widgets. Thanks to them, I'm able to understand how the highlighting works now.

Drawbacks

The only drawback I found is the difference between the regex syntax used by Tcl/Tk and Python's regex syntax. The Tcl/Tk syntax is close to Python's regular expression syntax, but it is not the same. Because of this, many of the available regex testing applications cannot be used to write patterns for the tkinter search method.

Solution

NOTE: This won't work as expected if the text widget has embedded images or widgets since the indexes in the widget won't be the same as the indexes in just the text portion.

I tried to incorporate python's regular expression standard library with the tkinter Text widget.

import re
import tkinter as tk
...

def search_re(self, pattern):
    """
    Uses the python re library to match patterns.

    pattern - the pattern to match.
    """
    matches = []
    text = self.get("1.0", tk.END).splitlines()
    for i, line in enumerate(text):
        for match in re.finditer(pattern, line):
            matches.append((f"{i + 1}.{match.start()}", f"{i + 1}.{match.end()}"))
    
    return matches

The return value is a list of tuples containing the start and end indices of the matches. Example:

[('1.1', '1.5'), ('1.6', '1.10'), ('3.1', '3.5')]

Now these values can be used to highlight the pattern in the text widget.
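For instance, the index pairs can be fed straight to `tag_add`. This is only a sketch; the helper name `apply_matches` and the tag name in the usage comment are assumptions for this example.

```python
def apply_matches(widget, matches, tag):
    """Add `tag` to every (start, end) index pair in `matches`.

    `widget` is expected to be a tkinter Text widget and `matches` a list
    such as the one returned by search_re(). Returns the number of ranges
    that were tagged.
    """
    applied = 0
    for start, end in matches:
        widget.tag_add(tag, start, end)
        applied += 1
    return applied


# Possible use inside a running Tk app, with a widget that has search_re():
# text.tag_configure("match", foreground="red")
# apply_matches(text, text.search_re(r"\bhello\b"), "match")
```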

Reference

CustomText widget

This is a wrapper for tkinter's Text widget with additional methods for highlighting and for searching with Python's regular expression library. It's based on Bryan Oakley's code, thanks to them.

import re
import tkinter as tk


class CustomText(tk.Text):
    """
    Wrapper for the tkinter.Text widget with additional methods for
    highlighting and matching regular expressions.

    highlight_all(pattern, tag) - Highlights all matches of the pattern.
    highlight_pattern(pattern, tag) - Cleans all highlights and highlights all matches of the pattern.
    clean_highlights(tag) - Removes all highlights of the given tag.
    search_re(pattern) - Uses the python re library to match patterns.
    """
    def __init__(self, master, *args, **kwargs):
        super().__init__(master, *args, **kwargs)
        self.master = master
        
        # sample tag
        self.tag_config("match", foreground="red")

    def highlight(self, tag, start, end):
        self.tag_add(tag, start, end)
    
    def highlight_all(self, pattern, tag):
        for match in self.search_re(pattern):
            self.highlight(tag, match[0], match[1])

    def clean_highlights(self, tag):
        self.tag_remove(tag, "1.0", tk.END)

    def search_re(self, pattern):
        """
        Uses the python re library to match patterns.

        Arguments:
            pattern - The pattern to match.
        Return value:
            A list of tuples containing the start and end indices of the matches.
            e.g. [("1.4", "1.9")]
        """
        matches = []
        text = self.get("1.0", tk.END).splitlines()
        for i, line in enumerate(text):
            for match in re.finditer(pattern, line):
                matches.append((f"{i + 1}.{match.start()}", f"{i + 1}.{match.end()}"))
        
        return matches

    def highlight_pattern(self, pattern, tag="match"):
        """
        Cleans all highlights and highlights all matches of the pattern.

        Arguments:
            pattern - The pattern to match.
            tag - The tag to use for the highlights.
        """
        self.clean_highlights(tag)
        self.highlight_all(pattern, tag)
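Because `search_re` scans the text one line at a time, a pattern that spans a line break will never match. One possible variation, shown here as a sketch and not as part of the class above, is to run `re.finditer` over the whole text and express each match as a character offset from "1.0", which the Text widget accepts as an index. The function name `search_re_multiline` is made up for this example, and the same caveat about embedded images and widgets applies.

```python
import re


def search_re_multiline(text, pattern, flags=0):
    """Return (start, end) Tk index pairs for matches that may span lines.

    `text` is the widget content, e.g. widget.get("1.0", "end-1c").
    Each index has the form "1.0+Nc", meaning N characters after the
    start of the text; a newline counts as one character.
    """
    return [
        (f"1.0+{m.start()}c", f"1.0+{m.end()}c")
        for m in re.finditer(pattern, text, flags)
    ]


# Possible use inside a running Tk app:
# for start, end in search_re_multiline(text.get("1.0", "end-1c"), r"o\nb"):
#     text.tag_add("match", start, end)
```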

Example usage

The following code uses the above class and shows an example of how to use it:

import tkinter as tk


root = tk.Tk()

# Example usage
def highlight_text(event=None):
    text.highlight_pattern(r"\bhello\b")
    text.highlight_pattern(r"\bworld\b", "match2")

text = CustomText(root)
text.pack()

text.tag_config("match2", foreground="green")

# This is not the best way, but it works.
# instead, see: https://stackoverflow.com/a/40618152/14507110
text.bind("<KeyRelease>", highlight_text)

root.mainloop()
Billy
  • This won't work as expected if the text widget has embedded images or widgets since the indexes in the widget won't be the same as the indexes in just the text portion. You should probably mention that. – Bryan Oakley Jul 26 '22 at 22:49
  • That's indeed a drawback of this. Thanks for pointing it out, Bryan, I have mentioned it! – Billy Jul 27 '22 at 07:05