I'm looking for a way to specify the number of repetitions of a character in a Lua pattern, because I want to check how many characters a string contains. As I read in the manual: "Even with character classes this is still very limiting, because we can only match strings with a fixed length."

To solve this, patterns support these four repetition operators:

  • '*' Match the previous character (or class) zero or more times, as many times as possible.
  • '+' Match the previous character (or class) one or more times, as many times as possible.
  • '-' Match the previous character (or class) zero or more times, as few times as possible.
  • '?' Make the previous character (or class) optional.

So there is no mention of braces {}. For example, quantifiers such as

{1,10}; {1,}; {10};

do not work.

local np = '1'
local a =  np:match('^[a-zA-Z0-9_]{1}$' )

returns a = nil.

local np = '1{1}'
local a =  np:match('^[a-zA-Z0-9_]{1}$' )

returns a = '1{1}' :)

This URL says there are no such magic characters:

Some characters, called magic characters, have special meanings when used in a pattern. The magic characters are

( ) . % + - * ? [ ^ $

Curly brackets are treated only as literal text and nothing more. Am I right? What is the best way to work around this 'bug'?

The usual usage of braces is described, for instance, here.

Vyacheslav
  • Lua does not provide it. You can write out the repetition yourself, e.g. `\d{2,}` becomes `%d%d+`. You can also use the Lua rex_pcre library. – moteus Oct 01 '15 at 09:38
  • @moteus, that is very monstrous and ugly to use, but thanks for the idea. – Vyacheslav Oct 01 '15 at 09:40
  • Lua patterns don't support the full set of Perl regex features. Braces are not supported. Use an explicit count: `np:match('^'..('[%w_]'):rep(k)..'$')` – Egor Skriptunoff Oct 01 '15 at 09:40
  • @EgorSkriptunoff, doesn't that dramatically increase the amount of computation? – Vyacheslav Oct 01 '15 at 09:41
  • @trololo - Regex has never been a fast thing. It's always CPU-intensive. Look for another approach that computes faster: `#np==k and not np:find'[^%w_]'` – Egor Skriptunoff Oct 01 '15 at 09:41
  • Another limitation of the supported quantifiers: *Unlike some other systems, in Lua a modifier can only be applied to a character class; there is no way to group patterns under a modifier.* Try the PCRE library: `> require "rex_pcre" > return rex_pcre.new("^[a-zA-Z0-9_]{2}$"):exec("12")`. – Wiktor Stribiżew Oct 01 '15 at 09:45
  • @EgorSkriptunoff, I think you are right: checking the string length beforehand is the most practical solution in my simple case, without using external libraries. – Vyacheslav Oct 01 '15 at 09:47
  • @stribizhev, thanks, I will try. – Vyacheslav Oct 01 '15 at 09:49
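
The two suggestions from the comments above can be put side by side in a small sketch. The function names `match_exact` and `is_exact` are hypothetical, chosen only for illustration, and `k` stands for the required number of characters:

```lua
-- Hypothetical helper: build the pattern by repeating the class k times,
-- which plays the role of [%w_]{k} in a PCRE-style regex.
local function match_exact(s, k)
  return s:match('^' .. ('[%w_]'):rep(k) .. '$')
end

-- Hypothetical helper: check the length first, then reject any
-- character that falls outside the class. No pattern is built per call.
local function is_exact(s, k)
  return #s == k and not s:find('[^%w_]')
end

print(match_exact('1', 1))   -- the matched string
print(match_exact('12', 1))  -- nil, the string is too long
print(is_exact('abc', 3))    -- true
print(is_exact('ab-', 3))    -- false, '-' is outside the class
```

The second form avoids building a long pattern, which is presumably why it was suggested as the faster option; that has not been measured here.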

1 Answer

We can't help but admit that Lua pattern quantifiers are very limited in functionality.

  1. There are only the four you mentioned (+, -, * and ?).
  2. Limiting quantifiers (the ones you need) are not supported.
  3. Unlike some other systems, in Lua a modifier can only be applied to a character class; there is no way to group patterns under a modifier (see source). Constructs such as '(foo)+' or '(foo|bar)' are therefore not supported: only single characters or classes can be repeated or chosen between, not sub-patterns or strings.

As a "work-around", in order to use limiting quantifiers and all the other PCRE regex perks, you can use the rex_pcre library.

Or, as @moteus suggests, there is a partial workaround that "emulates" limiting quantifiers with only a lower bound: repeat the pattern as many times as needed and apply an available Lua quantifier to the last copy. E.g. to match 3 or more occurrences of a pattern:

local np = 'abc_123'
local a = np:match('^[a-zA-Z0-9_][a-zA-Z0-9_][a-zA-Z0-9_]+$' )

See IDEONE demo
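
The same idea can be stretched to cover an upper bound as well, by appending optional copies of the class. The sketch below is only an illustration: `bounded` is a hypothetical helper name, and it assumes the repeated item is a single character or character class, since Lua cannot apply a quantifier to a group.

```lua
-- Hypothetical helper: emulate class{min,max} for a single character class.
-- Pass max = nil for an open upper bound, i.e. class{min,}.
local function bounded(class, min, max)
  if max == nil then
    -- min mandatory copies, then any number of further ones
    return class:rep(min) .. class .. '*'
  end
  -- min mandatory copies, then (max - min) optional ones
  return class:rep(min) .. (class .. '?'):rep(max - min)
end

local word = '[a-zA-Z0-9_]'
local p = '^' .. bounded(word, 1, 10) .. '$'  -- stands in for ^[a-zA-Z0-9_]{1,10}$

print(('abc_123'):match(p))       -- 7 characters, within the bounds
print(('abc_123_4567'):match(p))  -- 12 characters, so nil
```

The generated pattern grows with the bounds and the optional copies cause backtracking, so for large ranges a plain length check combined with a single class test is likely the better choice.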

Another library to consider instead of PCRE is LPeg.

Wiktor Stribiżew