0

For a regex that support only +,?,*,.,|,[..],[^..],^,$,(..), a matcher that lets you match recursion: {<some-regex-name>} with length m, and a string of length n.

(the regex isn't supporting positive/negative look-ahead/behind)

What is the best complexity of the matcher?

Examples:

Bracket Matching:

brackets = (\({brackets}\))*
// brackets can match:
// ((()())())()

Some random "double" recursion:

a = \(({a}|{b})?\)
b = \[{a}(,{b})*\]
// a can match:
// (([(),[(),[()]][(())]]))

Json without whitespaces:

string = "([^\\"]|\\.)"
object = \{ ( {string}:{value}(,{string}:{value})* )? \}
number = [0-9]+(\.[0-9]*)?
list = \[({value}(,{value})*)?]
value = object|string|number|list|true|false
// value can match:
// {"key":false,"ab\"c":[false,12.3,[{}]]}

i have an idea how to implement this with complexity O(m^2*n^3) where n is the size of the string, and m is the size of the regex. i haven't implement this yet, so maybe i have a mistake

ntg
  • 12,950
  • 7
  • 74
  • 95
Ofek
  • 1,065
  • 6
  • 19
  • Depends totally on what is inside the regex. You need to add details about the regex itself. – Wais Kamal Oct 19 '21 at 10:41
  • @WaisKamal any standard regex patterns. like `X+`,`X*`,`[...]` ect – Ofek Oct 19 '21 at 10:43
  • *"the regex can contains any standard matcher"* There is nothing "standard" about what regexps can or cannot do. Regexps debuted as a theoretical way to describe regular languages, which is very restricted. Then, people figured out regexps were cool, and started adding more and more things to them to make them more powerful. Today, every regexp language has its own particularities and there is no standard. There is no longer anything regular about regexps. If you make up your own regexp language, you need to explain precisely and explicitly what is allowed in your regexps. – Stef Oct 19 '21 at 11:24
  • @Stef i changed it to basic – Ofek Oct 19 '21 at 11:26
  • 1
    What does "basic" mean? What does "etc" mean in your list of possible patterns? You won't get any meaningful help until you can describe your problem very precisely. This is all too vague. Note that describing your problem precisely will not only allow you to get answers, but it will help you see more clearly and help yourself solve your problem. – Stef Oct 19 '21 at 11:35
  • i already removed the "ect", and wrote every matcher, should i rewrite the question? – Ofek Oct 19 '21 at 11:39
  • It's not clear what your question is exactly. So, you're trying to implement this regexp matcher. OK. Then you said that you have "made an algorithm". OK. Did you implement your algorithm in a programming language? Did you test it? Does it work? What is your question? – Stef Oct 19 '21 at 11:44
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/238318/discussion-between-ofek-and-stef). – Ofek Oct 19 '21 at 12:11
  • 1
    The "match recursion" raises the expressiveness to be coextensive with context-free languages, and I'm guessing your algorithm looks like CYK. If memory serves, there are asymptotically faster algorithms that leverage faster matrix multiplication, but nothing particularly practical. – David Eisenstat Oct 19 '21 at 12:29
  • This seems to be similar to https://stackoverflow.com/questions/37385964/time-complexity-for-combination-of-parentheses – ntg Oct 20 '21 at 10:21
  • 1
    @ntg there is no connection, he asked for generating every combination of parentheses, i ask for a way to match a recursive regex. the parentheses was only an example to what is my idea. – Ofek Oct 21 '21 at 15:00
  • Apologies, read the other one too fast – ntg Oct 22 '21 at 06:00

0 Answers0