RegexBuddy's "Debug" tab shows how a regular expression is executed step by step. But what exactly do those steps mean? What operations are behind each step?
-
This question is off-topic because it is about *general computing hardware and software*. – Wiktor Stribiżew Jan 03 '17 at 16:51
-
That's a tool for discovering how the corresponding engine actually works compared to how you think it works. – revo Jan 03 '17 at 17:15
-
Are you aware that within RegexBuddy there is a private forum, and that if you ask this question there it is likely that Jan, the author, will reply? Also, [this is an awesome regexbuddy tutorial](http://www.rexegg.com/regexbuddy-tutorial.html). Scroll down a bit for the direct link to the Debug section. – zx81 Jan 03 '17 at 18:07
-
@WiktorStribiżew actually I am asking about this tool here because (1) it is not "general" software but very specialized software for developers and (2) I suppose the answer can be interesting for others. – Konstantin Smolyanin Jan 03 '17 at 18:35
-
@zx81 ok ... I can say that 90% of all questions here can be answered by reading official documentation. So what? Let's close this site because all questions can be answered somewhere else, yes? – Konstantin Smolyanin Jan 03 '17 at 18:55
-
The private forum within RegexBuddy, being private, is only available to people who have already purchased RegexBuddy. If you want to ask questions about RegexBuddy on a public forum, then stackoverflow is a good fit. RegexBuddy is a technical tool and programmers are a key group of users. – Jan Goyvaerts Jan 04 '17 at 01:17
-
You can think of steps as individual checks, similar to basic programming if-checks, performed each time the position changes. The more checks, the less efficient. That doesn't always correlate to good or bad, as the context of your regex and its goals are unknown, but typically you want to use as few steps as _needed_. Note that I didn't say as few as _possible_; that level of detail and optimization is rarely needed. Regex optimization matters most when a _badly_ written regex causes a massive performance hit. Sometimes such regexes are exploited intentionally for denial-of-service attacks. – kayleeFrye_onDeck Jan 04 '17 at 04:28
2 Answers
The step count is basically how many times the current position in the input was changed, which is a very good indicator of performance.
The "current position" may be at any character or between characters (including before and after the entire input).
Simplifying it, regex engines process the input by moving the current position along the input and evaluating whether the regex matches at that position. They also keep track of the position in the regex the match is up to.
I don't want to turn this answer into a regex tutorial, but... greedy quantifiers such as `*` consume as much of the input as possible while still allowing the overall regex to match. To give a simple example, given the input `"12345"` and the regex `.*1.*`, the regex engine will first apply `.*`, consuming all input and leaving the position at the end of the input, fail to match a `1`, then backtrack by "unconsuming" one character at a time until it finds a `1`, then continue. You can see that this would take 9 steps just to process the initial `.*`.
By contrast, if the regex were `[^1]*1.*`, the engine would match the `"1"` in just one step.
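To make the difference between the two regexes above concrete, here is a minimal sketch of a toy backtracking matcher that counts its own steps. Everything in it is an assumption for illustration: the names `parse` and `count_steps` are hypothetical, it only supports literals, `.`, negated classes like `[^1]`, and a greedy `*`, it only tries to match at the start of the input, and its way of counting (one step per character consumed, per failed attempt, and per character given back) is not how RegexBuddy or any real engine counts. The absolute numbers therefore differ from the ones quoted above; only the relative difference is the point.

```python
def parse(pattern):
    """Parse a tiny regex subset into (test, starred) tokens.

    Supported: literal characters, '.', negated classes such as '[^1]',
    each optionally followed by a greedy '*'. Nothing else.
    """
    tokens = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            end = pattern.index("]", i)
            body = pattern[i + 1:end]
            if not body.startswith("^"):
                raise ValueError("this toy only supports negated classes")
            excluded = body[1:]
            test = lambda c, ex=excluded: c not in ex
            i = end + 1
        elif pattern[i] == ".":
            test = lambda c: True
            i += 1
        else:
            test = lambda c, lit=pattern[i]: c == lit
            i += 1
        starred = i < len(pattern) and pattern[i] == "*"
        if starred:
            i += 1
        tokens.append((test, starred))
    return tokens


def count_steps(pattern, text):
    """Match `pattern` at the start of `text`; return (matched, steps)."""
    tokens = parse(pattern)
    steps = 0

    def match(ti, pos):
        nonlocal steps
        if ti == len(tokens):
            return True
        test, starred = tokens[ti]
        if not starred:
            steps += 1  # one attempt, whether it matches or fails
            if pos < len(text) and test(text[pos]):
                return match(ti + 1, pos + 1)
            return False
        # Greedy star: consume as much as possible first.
        end = pos
        while end < len(text) and test(text[end]):
            end += 1
            steps += 1
        # Then give characters back one at a time until the rest matches.
        while not match(ti + 1, end):
            if end == pos:
                return False
            end -= 1
            steps += 1
        return True

    return match(0, 0), steps


print(count_steps(".*1.*", "12345"))     # (True, 20)
print(count_steps("[^1]*1.*", "12345"))  # (True, 5)
```

With `.*1.*` the toy spends most of its steps consuming the whole input and then handing it back character by character; with `[^1]*1.*` the first token consumes nothing, so the `1` is matched on the very first attempt.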

-
Thank you! But could you provide some more details for this "how many times the current position in the input was changed"? – Konstantin Smolyanin Jan 03 '17 at 19:06
-
@KonstantinSmolyanin I updated the answer, but a visit to http://www.regular-expressions.info/ is probably in order. – Bohemian Jan 03 '17 at 19:22
-
@SeinopSys If you have regexbuddy, just try "Debug Everywhere" and it should show you when it has to backtrack etc so you can see how many steps it takes to match or fail to match, depending on your goal. – kayleeFrye_onDeck Jan 04 '17 at 04:21
-
@kayleeFrye_onDeck I was pointing out a minor grammar issue but with the removal of the backtick by the author the last sentence's last half got even more confusing to me. – SeinopSys Jan 04 '17 at 15:30
In RegexBuddy's debugger, a step is when the regex engine matches something, or fails to match something. Steps that match a character are indicated by all the characters matched by the regex so far, which will usually be one character more than the previous step. Steps that match a position, like a word boundary, are indicated by the characters matched so far plus "ok". Steps that failed to match something are indicated by the characters matched so far plus "backtrack".
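As a rough illustration of that display, the sketch below prints one line per step for a toy matcher: the characters matched so far, with "backtrack" appended when a token fails. This is only an imitation under stated assumptions. The function name `trace` is hypothetical, the matcher supports just literals and `.` with an optional greedy `*`, it only tries the start of the input, it has no position-only tokens (so it never prints "ok"), and the real debugger's output will differ in detail.

```python
def trace(pattern, text):
    """Return step descriptions for a toy backtracking matcher.

    Supports literal characters and '.', each optionally followed by a
    greedy '*'. Matching is attempted only at the start of `text`.
    """
    # Split the pattern into (symbol, starred) tokens.
    tokens = []
    i = 0
    while i < len(pattern):
        starred = i + 1 < len(pattern) and pattern[i + 1] == "*"
        tokens.append((pattern[i], starred))
        i += 2 if starred else 1

    lines = []

    def fits(sym, pos):
        return pos < len(text) and (sym == "." or text[pos] == sym)

    def match(ti, pos):
        if ti == len(tokens):
            return True
        sym, starred = tokens[ti]
        if not starred:
            if fits(sym, pos):
                lines.append(text[:pos + 1])  # matched one more character
                return match(ti + 1, pos + 1)
            lines.append((text[:pos] + " backtrack").lstrip())  # token failed
            return False
        # Greedy star: take as many characters as possible, one step each.
        end = pos
        while fits(sym, end):
            end += 1
            lines.append(text[:end])
        # Give characters back until the remaining tokens match.
        while not match(ti + 1, end):
            if end == pos:
                return False
            end -= 1
        return True

    match(0, 0)
    return lines


for line in trace("a.*c", "abcd"):
    print(line)
# a
# ab
# abc
# abcd
# abcd backtrack
# abc backtrack
# abc
```

Reading the output top to bottom shows `.*` grabbing the rest of the input, the `c` token failing twice, and the match finally succeeding once enough characters have been given back.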
If you click on any of the matched characters in the debugger, RegexBuddy selects the token in the regular expression that matched those characters and highlights all the characters in the debugger matched by that token. If you click on an "ok" or "backtrack" indicator, RegexBuddy selects the token in the regex that matched or failed to match.
Moving the cursor with the keyboard has the same effect as clicking. Pressing the End key on the keyboard moves the cursor to the end of a step. Then pressing Arrow Up or Down moves the cursor to the previous or next step while keeping the cursor at the end of that step. By moving the cursor this way, you can easily follow how the regex engine steps through your regular expression and which characters it matches and backtracks along the way.
For more details, see these two pages in RegexBuddy's help file: https://www.regexbuddy.com/manual.html#debug and https://www.regexbuddy.com/manual.html#benchmark
