efficient way to color text in JTextPane

Question

I have a problem coloring some keywords in a JTextPane. I want to make something like a mini IDE: I write some code, and keywords such as "public", "private", etc. should get a color (say blue). The problem is that it is very slow: each time I hit the "space" or "backspace" key, the function scans the whole text to give a color to the keywords, so once there is a lot of code in the text pane it gets very slow. Here is my keyword-matching function:

public void matchWord() throws BadLocationException {
        String tokens[] = ArabicParser.tokenNames;
        int index = 0;
        String textStr[] = textPane.getText().split("\\r?\\n");
        for(int i=0 ; i<textStr.length ; i++) {
            String t = textStr[i];
            StringTokenizer ts2 = new StringTokenizer(t, " ");
            while(ts2.hasMoreTokens()) {
                String token = ts2.nextToken();

                // The iterations are reduced by removing 16 symbols from the search space
                for(int j = 3 ; j<tokens.length-5 ; j++) {
                    if(!(token.equals("؛")) && (tokens[j].equals("'"+token+"'"))) {
                        changeColor(textPane,token,Color.BLUE,index,token.length());
                        break;
                    } else {
                        changeColor(textPane,token,Color.BLACK,index,token.length());
                    }
                }
                index += token.length() + 1;
            }
            //index -= 1;
        }
    }

and here is my function of coloring the matched words:

private void changeColor(JTextPane tp, String msg, Color c, int beginIndex, int length) throws BadLocationException {
        SimpleAttributeSet sas = new SimpleAttributeSet(); 
        StyleConstants.setForeground(sas, c);
        StyledDocument doc = (StyledDocument)tp.getDocument();
        doc.setCharacterAttributes(beginIndex, length, sas, false);
        sas = new SimpleAttributeSet(); 
        StyleConstants.setForeground(sas, Color.BLACK);
        tp.setCharacterAttributes(sas, false);
    }

and thanks in advance =)

No need to check the whole text, only the changed line(s). — StanislavL, Jun 20 '15 at 11:48

score 1 · Answer 1 · edited May 23 '17 at 11:43

Consider replacing StringTokenizer, since its use in new code is discouraged: https://stackoverflow.com/a/6983908/1493294

Consider refactoring String tokens[] into HashSet<String> tokens. Hash lookup will be faster than looping, especially as tokens[] gets large.

If you'd like to use more than two colors try HashMap<String, Color> tokens.
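
As a minimal sketch of the HashSet idea (not the asker's actual code): the class name KeywordLookup and the helper fromQuotedNames are hypothetical. It assumes, based on the comparison tokens[j].equals("'"+token+"'") in the question, that the keyword entries in the names array are wrapped in single quotes and that other entries can be ignored.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class KeywordLookup {
    private final Set<String> keywords;

    KeywordLookup(String[] quotedNames) {
        this.keywords = fromQuotedNames(quotedNames);
    }

    // Assumption: keyword entries look like "'public'"; anything else is skipped.
    static Set<String> fromQuotedNames(String[] names) {
        Set<String> set = new HashSet<>();
        for (String name : names) {
            if (name != null && name.length() > 2
                    && name.startsWith("'") && name.endsWith("'")) {
                set.add(name.substring(1, name.length() - 1));
            }
        }
        return set;
    }

    // One hash lookup per token instead of a loop over the whole array.
    boolean isKeyword(String token) {
        return keywords.contains(token);
    }
}
```

The set would be built once (for example from ArabicParser.tokenNames) and reused for every token, so the inner for loop in matchWord() reduces to a single isKeyword(token) call.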

Also, having two very different things called token and tokens running around in here is confusing. Consider renaming tokens[] to coloredNames[] so it's clearly different than the token from the textPane tokens.

Consider using a profiler to see where the bulk of the time is being spent. You might find repetitive work being done in changeColor() would be worth caching.

If so write a class called ColorChanger. ColorChanger will have one constructor and one method changeColor(). The constructor will take (and thus cache) the parameters that don't change as you loop. ColorChanger.changeColor() will take the parameters that do change as you loop.
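
One possible shape for such a class, as a sketch only: it assumes the parameters that stay constant across the loop are the document and the color, and that only the position varies. Whether this caching actually pays off is something the profiler would have to confirm.

```java
import java.awt.Color;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

// Sketch of the ColorChanger described above; the field choices are assumptions.
class ColorChanger {
    private final StyledDocument doc;
    private final SimpleAttributeSet attributes = new SimpleAttributeSet();

    // Cache what does not change while looping: the document and the attribute set.
    ColorChanger(StyledDocument doc, Color color) {
        this.doc = doc;
        StyleConstants.setForeground(attributes, color);
    }

    // Only the position changes from token to token.
    void changeColor(int beginIndex, int length) {
        doc.setCharacterAttributes(beginIndex, length, attributes, false);
    }
}
```

It could be created once before the loop, e.g. new ColorChanger((StyledDocument) textPane.getDocument(), Color.BLUE), and then called with changeColor(index, token.length()) for each keyword.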

I don't see any reason here for replacing StringTokenizer. The selected answer's argument doesn't apply here, and the top-rated answer shows that its use doesn't seem to be discouraged. I also think that your answer doesn't address the core of the problem, which is that far more text is processed than necessary. — Sharcoux, Jun 21 '15 at 01:12

Sharcoux · Answer 2 · 2015-06-21T09:28:03.473

You could use a DocumentListener to analyse only the text that is inserted inside your TextPane. This way, you wouldn't need to analyse the whole text multiple times, you would check only what is added.

To do so, you would use the getWordStart and getWordEnd methods of the javax.swing.text.Utilities class. This way you can get the surrounding context of the insert location.

Edit : Removing can change the state of the keywords. When you remove, you need to get the text between the removal start position and getWordStart, and the text between the removal end position and getWordEnd. For instance, if you remove "continental sur" from "intercontinental surface", you would get "interface" which might be a keyword.

You could use this class for instance :

import java.awt.Color;
import java.util.StringTokenizer;
import javax.swing.SwingUtilities;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.Utilities;

//myJTextPane, isKeyword() and changeColor() are assumed to be defined elsewhere
//(changeColor() being the method from the question).
public class Highlighter implements DocumentListener {

    public void insertUpdate(final DocumentEvent e) {
        highlight(e.getDocument(), e.getOffset(), e.getLength());
    }

    public void removeUpdate(DocumentEvent e) {
        highlight(e.getDocument(), e.getOffset(), 0);
    }

    public void changedUpdate(DocumentEvent e) {}

    private void highlight(final Document doc, final int offset, final int length) {
        //Edit the color only when the EDT is ready
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                try {
                    //The impacted text is the edit plus the surrounding partial words.
                    int start = Utilities.getWordStart(myJTextPane, offset);
                    int end = Utilities.getWordEnd(myJTextPane, offset + length);
                    String impactedText = doc.getText(start, end - start);
                    applyHighlighting(doc, impactedText, start);
                } catch (BadLocationException ex) {
                    //The offsets are no longer valid (the document changed again); skip this pass.
                }
            }
        });
    }

    //textStart is the document offset at which text begins.
    private void applyHighlighting(Document doc, String text, int textStart) throws BadLocationException {
        //we review each word and color them if needed.
        StringTokenizer tokenizer = new StringTokenizer(text, " \t\n\r\f,.:;?![]'()");
        int pos = 0;
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken();
            //Search from the end of the previous word, so a word at index 0 is found too.
            pos = text.indexOf(word, pos);
            //Non-keywords are put back in BLACK, since an edit can turn a keyword into a normal word.
            Color color = isKeyword(word) ? Color.BLUE : Color.BLACK;
            //you can use the method you proposed for instance as a start.
            changeColor(myJTextPane, word, color, textStart + pos, word.length());
            pos += word.length();
        }
    }
}

(1+) for the DocumentListener, but don't forget removing text can affect highlighting. Also, you can paste multiple lines of text. So you need to consider these situations for a general solution. — camickr, Jun 20 '15 at 16:26
You're completely right. I forgot about removing part. My edit should fix this. About multi-line, actually, you still need only the previous wordStart and next wordEnd. The impacted text would be wordStart+insertedText+wordEnd. — Sharcoux, Jun 21 '15 at 00:54
Yes, the wordStart/wordEnd concept will work for handling any amount of pasted text. When I wrote my highlighting code I was also handling String literals, so I needed more text to make sure I didn't highlight a token found in a literal string. The tokenizing also gets far more complex when you need to handle comments. Note the parameters of the getText(...) method are `Document.getText(start, length)`, not (start, end), so you will be tokenizing extra text in your current example. — camickr, Jun 21 '15 at 03:05
That's true about getText. Sorry, I edited my answer. About String literals and comments, Actually your method is not sufficient as the string or comment can start a few paragraphs before. But IMO, to handle this, you need to do it after highlighting keywords. Then you parse the whole text and look for open/closing quotes or comments. And then you highlight the whole portion, which is efficient enough. — Sharcoux, Jun 21 '15 at 09:34

camickr · Answer 3 · 2015-06-20T16:33:31.933

The problem is that it is very slow: each time I hit the "space" or "backspace" key, the function scans the whole text

You can make this more efficient by processing only the line that changed.

A DocumentListener can be used to notify you when the Document has changed. You can then parse only the lines that have been affected by the change. Remember multiple lines of text could be pasted into the text pane, so you need to handle this situation.

Here is some (untested) code for a simple structure for the DocumentListener that you might use to only process the changed lines:

import javax.swing.SwingUtilities;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.Element;

public class KeywordDocumentListener implements DocumentListener
{
    public void insertUpdate(final DocumentEvent e)
    {
        SwingUtilities.invokeLater(new Runnable()
        {
            public void run()
            {
                processChangedLines(e.getDocument(), e.getOffset(), e.getLength());
            }
        });
    }

    public void removeUpdate(final DocumentEvent e)
    {
        SwingUtilities.invokeLater(new Runnable()
        {
            public void run()
            {
                processChangedLines(e.getDocument(), e.getOffset(), 0);
            }
        });
    }

    public void changedUpdate(DocumentEvent e) {}

    private void processChangedLines(Document doc, int offset, int length)
    {
        //  The lines affected by the latest document update

        Element rootElement = doc.getDefaultRootElement();
        int startLine = rootElement.getElementIndex(offset);
        int endLine = rootElement.getElementIndex(offset + length);

        //  Do the highlighting one line at a time

        try
        {
            for (int i = startLine; i <= endLine; i++)
            {
                int lineStart = rootElement.getElement( i ).getStartOffset();
                int lineEnd = rootElement.getElement( i ).getEndOffset() - 1;
                String lineText = doc.getText(lineStart, lineEnd - lineStart);
                applyHighlighting(doc, lineText, lineStart);
            }
        }
        catch (BadLocationException ex)
        {
            //  getText() rejected the offsets (the document changed again); skip this pass
        }
    }

    private void applyHighlighting(Document doc, String text, int lineStart)
    {
        // Now you can search a line of text for your keywords
        // As you find a keyword to highlight you add the lineStart to the search
        // location so the highlight is the proper offset in the Document
    }
}

The invokeLater() is needed because you can't update a Document from within a DocumentListener, so this places the code at the end of the event queue, where it is executed on the EDT after the listener has finished executing.

For simple parsing I don't see a problem using the StringTokenizer. It will be more efficient than using a regex.

to give a color to the keywords,

Actually you are doing more than coloring the keywords: you are also coloring every normal word, which is not very efficient. I recommend you set the entire line of text to the BLACK foreground color. Then, as you parse, you only highlight the tokens that you find with the BLUE color. This will significantly reduce the number of attribute changes that are done to the Document.

Don't create a new AttributeSet for every token. Create the AttributeSet once and then reuse it for each token.
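
A rough sketch of how these two suggestions might be combined, as one way the applyHighlighting() stub could be filled in. The class name LineHighlighter and the keyword list are hypothetical, and it takes a StyledDocument directly (with a JTextPane, the Document from the listener would need a cast).

```java
import java.awt.Color;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

// Hypothetical helper class, for illustration only.
class LineHighlighter {
    // Hypothetical keyword list.
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("public", "private"));

    // Created once and reused for every line and token.
    private static final SimpleAttributeSet NORMAL = new SimpleAttributeSet();
    private static final SimpleAttributeSet KEYWORD = new SimpleAttributeSet();
    static {
        StyleConstants.setForeground(NORMAL, Color.BLACK);
        StyleConstants.setForeground(KEYWORD, Color.BLUE);
    }

    // text is one line of the document; lineStart is its offset in the document.
    static void applyHighlighting(StyledDocument doc, String text, int lineStart) {
        // A single attribute change resets the whole line to black.
        doc.setCharacterAttributes(lineStart, text.length(), NORMAL, false);

        // After that, only the keywords are touched.
        StringTokenizer tokenizer = new StringTokenizer(text, " \t");
        int pos = 0;
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken();
            pos = text.indexOf(word, pos);
            if (KEYWORDS.contains(word)) {
                doc.setCharacterAttributes(lineStart + pos, word.length(), KEYWORD, false);
            }
            pos += word.length();
        }
    }
}
```

With this shape, a line containing one keyword costs two attribute changes, regardless of how many normal words it holds.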
