I have the following regex, \b(\w+)$, which finds the last word in a string. However, the longer the string, the more steps it takes.

How can I make it start the search from the end of the line?

  • What do you mean about more steps it takes? Depending on language you could reverse the string. – danronmoon Sep 14 '17 at 17:51
  • then how do I make it more efficient? –  Sep 14 '17 at 17:53
  • How inefficient is it right now? Do you have benchmarks or are you micro-optimizing? – danronmoon Sep 14 '17 at 17:54
  • Assuming you need this for a specific use-case: Why don't you make it work from the start of the string but reverse your entire string? That way a string like `This is my string` becomes `gnirts ym si sihT`. Then use regex to parse from the beginning of the string and then re-reverse your string parts: `gnirts` becomes `string`. – ctwheels Sep 14 '17 at 18:14
  • maybe I'm being too picky - I wanted the number of steps to remain the same regardless of the length of the string. –  Sep 14 '17 at 18:20
  • Split on `\W+` and take the last element. – Toto Sep 14 '17 at 18:24
  • @mtissington No, but if you reverse the string like I mentioned and use `^` instead of `$` it will only do 7 steps instead of X steps – ctwheels Sep 14 '17 at 18:24

4 Answers

Answer

Brief

Using the regex you specified, \b(\w+)$, the number of steps grows with the string's length. At every word boundary the engine matches \b, consumes \w+, and then tests $; when $ fails, it backtracks through \w+ and moves on to the next position. It has to repeat that check for each word in the string until it reaches one that actually ends at $.

What you can do to get the last item of a string using regex explicitly is to flip the string and then use the ^ anchor. This will cause regex to immediately be satisfied upon the first match and stop processing.

How you flip a string depends on your programming language, so look up the idiom for the one you are using.

Code

You can see the regex in use here

Your programming language

// flip string code goes here 

Regex

^(\w+)

Your programming language

// flip regex capture code goes here

Input

This is my string

Output

Converted to the following by flipping the string in your language

gnirts ym si sihT

Regex returns the following result

gnirts

Flip the string back in your language

string

Explanation

Since the anchor ^ is used, it will check from the beginning of the string (as per usual regex behaviour). If this is satisfied it will return the match, otherwise, it will return no matches. Testing in regex101 (provided through the link in the Code section) shows that it takes exactly 6 steps to ensure that a match is made. It also takes exactly 3 steps to ensure no match is made. These values do not change with string length.
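
As an illustration of the flip, match, flip-back sequence described in this answer, here is a minimal Python sketch. The function name `last_word_via_reverse` and the constant `FIRST_WORD` are made up for this example, and slicing with `[::-1]` is just one way to reverse a string in Python.

```python
import re

# Anchored pattern from this answer: a run of word characters at the very start.
FIRST_WORD = re.compile(r"^(\w+)")


def last_word_via_reverse(text):
    """Return the last word of `text`, or None if `text` does not end in a word character."""
    flipped = text[::-1]               # "This is my string" -> "gnirts ym si sihT"
    match = FIRST_WORD.match(flipped)  # only position 0 is ever tried
    if match is None:
        return None
    return match.group(1)[::-1]        # "gnirts" -> "string"


print(last_word_via_reverse("This is my string"))
```

Keep in mind that reversing the string is itself work proportional to its length, so this approach moves the cost out of the regex engine rather than removing it altogether.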

ctwheels

In most regex engines, you can't.

Regex engines work by consuming input from the start of the input.


You can programmatically do it with a simple decrementing loop over the characters starting from the last character. If you need more performance, using code over regex is the only way.
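
A rough Python sketch of such a decrementing loop is shown below. The names `last_word_by_scanning` and `is_word_char` are invented for this example, and the character test only approximates what `\w` matches in a regex engine.

```python
def last_word_by_scanning(text):
    """Return the trailing run of word characters in `text`, or "" if there is none."""

    def is_word_char(ch):
        # Approximation of regex \w: letters, digits, and underscore.
        return ch.isalnum() or ch == "_"

    start = len(text)
    # Walk backwards from the last character while still inside a word.
    while start > 0 and is_word_char(text[start - 1]):
        start -= 1
    return text[start:]


print(last_word_by_scanning("This is my string"))
```

The loop only touches the characters of the last word, so the work does not grow with the rest of the string.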

Bohemian

It only works in .NET:

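using System.Text.RegularExpressions;
// RightToLeft makes the engine scan from the end of the input toward the start.
Regex rx = new Regex(Pattern, RegexOptions.RightToLeft);
Match match = rx.Match(Source);

Jan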

This can be faster.

^.*\b(\w+)

• add ^.* before and capture \w+
• drop the $ if possible
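
For what it is worth, the pattern above could be tried in Python roughly as follows. This is only a sketch: the names `GREEDY_LAST_WORD` and `last_word_greedy` are invented here, single-line input is assumed (`.` does not match a newline by default), and whether it is actually faster depends on the engine and the input.

```python
import re

# Greedy .* runs to the end of the line first, then backs off until
# \b(\w+) can match, which leaves the last word in group 1.
GREEDY_LAST_WORD = re.compile(r"^.*\b(\w+)")


def last_word_greedy(text):
    """Return the last word on a single-line `text`, or None if it has no word."""
    match = GREEDY_LAST_WORD.match(text)
    return match.group(1) if match else None


print(last_word_greedy("This is my string"))
```

Because the `$` is dropped, this variant also finds the last word when the string ends in punctuation or whitespace, which the original pattern does not.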

Good luck!

codeonly