I have functions placed inside HTML code. These functions follow these syntax rules:

  1. A '#' symbol acts as the opening tag.
  2. The function name follows the opening '#' tag. The name can contain digits (1,2,3), letters (a,b,c), and underscores (_).
  3. After the function name, a pair of parentheses encloses the parameter. The parameter can contain anything, including alphanumeric characters, comparison operators (<,>,=,!), and these: @,#,$,%,^,&,(,),?,*,/,[,]
  4. After the parameter, there is HTML code enclosed in curly brackets.
  5. Finally, the function is closed with a '#' tag.

These are not my real functions, but they illustrate the rules above:

<html>
#v123w(r(!@3o=?w){
<div></div>
}#
#131ie_w(13gf$>&*()(*&){
<div></div>
}#
</html>

Until now, I have been using this regex to capture all of the function names, parameters, and the HTML strings inside the functions:

#(\w+)\(*([\w\d\s\=\>\<\[\]\"\'\)\(\&\|\*\+\-\%\@\^\?\/\$\.\!]*)\)\)*{((?:(?R)|.)*?)}#

You can see the result in the regex tester: https://regex101.com/r/HdCeeV/1

I recently found that PHP's preg_match_all function does not work on long strings, so I cannot use this regex when the HTML code inside a function is too long. I need to capture the function name, the function parameter, and the HTML string inside the function. Is there an alternative to this regex, perhaps using PHP string functions such as substr and strpos?
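For illustration, here is a minimal sketch of what a strpos/substr-based scanner for the rules above could look like. The function name `scan_functions` and the shape of the returned array are hypothetical. The sketch assumes that the parameter ends at the first `){`, that the body ends at the first `}#`, and that functions are not nested; unlike the regex, it does not restrict which characters may appear in the parameter.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: collect every #name(params){body}# block in $src
 * using plain string functions instead of a regex.
 */
function scan_functions(string $src): array
{
    // Characters allowed in a function name (rule 2).
    $nameChars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_';
    $found = [];
    $pos = 0;
    $len = strlen($src);

    while (($hash = strpos($src, '#', $pos)) !== false) {
        // Measure the run of name characters right after the '#'.
        $nameLen = strspn($src, $nameChars, $hash + 1);
        $open = $hash + 1 + $nameLen;

        // Not a function start unless a non-empty name is followed by '('.
        if ($nameLen === 0 || $open >= $len || $src[$open] !== '(') {
            $pos = $hash + 1;
            continue;
        }

        // Assumption: the parameter runs up to the first "){".
        $paramEnd = strpos($src, '){', $open + 1);
        if ($paramEnd === false) {
            break; // no complete function can follow
        }

        // Assumption: the body runs up to the first "}#" (no nesting).
        $bodyStart = $paramEnd + 2;
        $bodyEnd = strpos($src, '}#', $bodyStart);
        if ($bodyEnd === false) {
            break;
        }

        $found[] = [
            'name'   => substr($src, $hash + 1, $nameLen),
            'params' => substr($src, $open + 1, $paramEnd - $open - 1),
            'body'   => substr($src, $bodyStart, $bodyEnd - $bodyStart),
        ];

        // Continue scanning after the closing "}#".
        $pos = $bodyEnd + 2;
    }

    return $found;
}
```

Each `strpos` call resumes from the current offset, so the input is walked once from left to right and only the captured pieces are copied with `substr`.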

Kevin
  • _preg_match_all function in php does not work for a long string_ Why do you think this? – AbraCadaver Oct 15 '19 at 11:33
  • Why not write a parser? – Toto Oct 15 '19 at 11:34
  • @AbraCadaver, because regex works within the limited process stack available; you can read more here: https://stackoverflow.com/questions/3021316/preg-match-and-long-strings – Kevin Oct 15 '19 at 11:39
  • @Toto, my function is not XML-based, so it's not possible to use a PHP XML parser – Kevin Oct 15 '19 at 11:40
  • I didn't say use a php xml-parser, I said write your own, this is too complex for regex. – Toto Oct 15 '19 at 11:43
  • @Toto That's what I'm asking about. I've written my own code as a substitute for the current regex, using a combination of strpos, substr, and str_replace to capture the function name, the parameter, and the HTML inside. However, the process is very slow, even slower than using regex. That's why I'm asking here whether there is another solution that is much faster than my current code. – Kevin Oct 15 '19 at 11:49
  • Show us the code, maybe we could find some improvements – Toto Oct 15 '19 at 11:52
  • *"This is not my real function but it give the whole ideas of rules above"*: bad idea, show us real data as well. – Casimir et Hippolyte Oct 15 '19 at 11:58
  • Have a look at https://regex101.com/r/HdCeeV/2 for a more efficient regex. – Toto Oct 15 '19 at 12:10
  • @Toto I've tried your regex and it works on my long string now, thanks. – Kevin Oct 15 '19 at 12:32

1 Answer

Here is an improved version of your regex that is a little more efficient:

#(\w+)\(([\w\s=><[\]"')(&|*+%@^?\/$.!-]*)\){(.+?)}#

Demo & Explanation
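As a rough sketch, the pattern above could be called from PHP as follows. The `~` delimiter and the `s` flag (so that `.` also matches the newlines inside the HTML) are assumptions, since the answer does not state which delimiter or flags to use; the names `FUNCTION_PATTERN` and `match_functions` are hypothetical. `preg_match_all` returns `false` on a PCRE failure such as an exceeded backtrack limit, so the sketch reports the error instead of silently returning nothing (`preg_last_error_msg` requires PHP 8.0 or later).

```php
<?php
declare(strict_types=1);

// The regex from the answer, wrapped in "~" delimiters because the pattern
// itself contains "#" and "/". The "s" flag is an assumption (see above).
const FUNCTION_PATTERN = <<<'RE'
~#(\w+)\(([\w\s=><[\]"')(&|*+%@^?\/$.!-]*)\){(.+?)}#~s
RE;

/**
 * Hypothetical wrapper: returns one entry per function, where
 * [1] is the name, [2] the parameter, and [3] the HTML body.
 */
function match_functions(string $src): array
{
    $count = preg_match_all(FUNCTION_PATTERN, $src, $matches, PREG_SET_ORDER);

    if ($count === false) {
        // For example, the backtrack or JIT stack limit was hit on a long input.
        throw new RuntimeException('PCRE error: ' . preg_last_error_msg());
    }

    return $matches;
}
```

Checking the return value this way makes it possible to tell "no functions found" apart from "the regex engine gave up", which is the failure mode described in the question.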

Toto