Regexp is not completely evil here. Of course, it may be slow on big strings or with really complex patterns, but that is not your case. It is still easier and quicker to implement (and arguably more readable) than an HTML or XML parser, and a parser is not going to be faster than a simple regexp match.
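$str = '<li data-tpl-classname="class" data-tpl-title="innerHTML"></li>';
// PREG_SET_ORDER groups the captures per match: array(full match, name, value)
preg_match_all('/data-tpl-([^"]*)="([^"]*)"/i', $str, $matches, PREG_SET_ORDER);
$array = array();
foreach ($matches as $match) {
    // $match[1] is the part after "data-tpl-", $match[2] is the attribute value
    $array[$match[1]] = $match[2];
}
// $array is now array('classname' => 'class', 'title' => 'innerHTML')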
I used [^"]* instead of .*? since it is a bit quicker.
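To make the comparison concrete, here is a minimal sketch that runs both patterns on the sample string and checks that they capture the same things. The variable names ($negated, $lazy, $sameResult) are mine, purely for illustration, and this says nothing about timing on your real input.

```php
<?php
// Sample input with the same shape as the string used in the answer.
$str = '<li data-tpl-classname="class" data-tpl-title="innerHTML"></li>';

// Negated character class: consumes everything up to the next double quote.
$negated = '/data-tpl-([^"]*)="([^"]*)"/i';
// Lazy dot: grows one character at a time until the rest of the pattern matches.
$lazy = '/data-tpl-(.*?)="(.*?)"/i';

preg_match_all($negated, $str, $a, PREG_SET_ORDER);
preg_match_all($lazy, $str, $b, PREG_SET_ORDER);

// On this input both patterns produce identical matches and captures.
$sameResult = ($a === $b);
var_dump($sameResult);
```

One difference to keep in mind: without the s modifier, . does not match a newline, while [^"] does, so the two patterns can diverge on attribute values that span several lines.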
Note: I just ran a benchmark. Compared to the first answer using DOMDocument, this regexp code is 4 times faster, but less clean, since parsing the DOM with a regexp may lead to misinterpretations of the markup. It is also slightly slower than the answer using str functions (but easier to read and to maintain).
Note 2: Of course, use this solution only if you are sure of the input format and there will never be any ambiguity in it. Otherwise, the solution with DOMDocument is cleaner.
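For comparison, a DOMDocument-based version could look roughly like the sketch below. This is my own illustration, not the code from the other answer. It assumes the dom extension is available and that the attributes sit on li elements; the $prefix variable is a name I introduced.

```php
<?php
$str = '<li data-tpl-classname="class" data-tpl-title="innerHTML"></li>';

$doc = new DOMDocument();
// loadHTML tolerates fragments; it wraps them in html/body internally.
$doc->loadHTML($str);

$array = array();
$prefix = 'data-tpl-'; // hypothetical helper variable, for illustration only
foreach ($doc->getElementsByTagName('li') as $li) {
    foreach ($li->attributes as $attr) {
        // Keep only attributes whose name starts with the prefix.
        if (strpos($attr->name, $prefix) === 0) {
            $array[substr($attr->name, strlen($prefix))] = $attr->value;
        }
    }
}
```

It is longer than the regexp version, but the parser decides what counts as an attribute, so quoting styles or extra whitespace in the markup do not change the result.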
Why regular expressions should be used wisely or avoided when parsing HTML:
http://blog.codinghorror.com/parsing-html-the-cthulhu-way
Use them with that in mind:
- It's generally a bad idea.
- Unless you have discipline and put very strict conditions on what you're doing, matching HTML with regular expressions rapidly devolves into madness, just how Cthulhu likes it.
- I had what I thought to be good, rational, (semi) defensible reasons for choosing regular expressions in this specific scenario.