I was reading the question "Regular expressions in C: examples?".

It seems that the regex has to be "compiled" before it can be used. Why does this need to be done explicitly? Why can't `pcre_exec` do the job itself?

user13107

1 Answer

It's a design decision.

It could, but if `pcre_exec` did the compilation and the execution in one step, then reusing the same regex would be quite inefficient, because the pattern would be recompiled on every call. Compiling a regex is a computationally expensive operation (just as compiling source code written in a programming language is expensive), so if you want to use the regex more than once, then doing

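/* pseudocode: compile once, outside the loop */
expensive_compilation(regex_object, "/the/regular\\.expression$");

/* then reuse the compiled regex for every line */
for (i = 0; i < 1000000; i++)
    regex_match(regex_object, next_line_to_be_processed);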

will be significantly faster than if you moved the (redundant) compilation inside the loop.
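To make the pattern above concrete, here is a minimal sketch using the POSIX `<regex.h>` functions (`regcomp`, `regexec`, `regfree`) instead of PCRE, simply because they ship with most C libraries. The helper name `count_matching_lines` and its parameters are made up for illustration; the PCRE equivalent would follow the same shape with `pcre_compile` before the loop and `pcre_exec` inside it.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of any library): counts how many of the
 * n strings in `lines` match `pattern`, treated as a POSIX extended regex.
 * Returns -1 if the pattern does not compile. */
int count_matching_lines(const char *pattern,
                         const char *const *lines, size_t n)
{
    regex_t re;

    /* Expensive step, done once: parse the pattern and build the matcher.
     * REG_NOSUB says we only need a yes/no answer, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Cheap step, done per line: run the already compiled matcher. */
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            count++;
    }

    regfree(&re); /* release the memory held by the compiled regex */
    return count;
}
```

Calling `count_matching_lines("/the/regular\\.expression$", lines, n)` compiles the pattern a single time no matter how large `n` is. Moving the `regcomp`/`regfree` pair inside the loop would give the same count, but would repeat the costly compilation for every line.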

  • +1 - Yup, and compiled regular expressions aren't unique to C; you can (or must, or implicitly) do the same in, AFAIK, Java, C#, Python, even JavaScript, although many of these allow you to (inefficiently, in cases as described by @H2CO3) run a match without compiling, too. – Andrew Cheong Jul 31 '13 at 07:55
  • @acheong87 Thanks! Yes, although from a control-freak point of view (i.e. "in the first place, I completely disregard efficiency and focus on ease of use instead"), I don't find it too good that e.g. the POSIX `<regex.h>` API doesn't provide a convenience one-step interface. It's really useful if you know you will use the regex only once. -- Also, we can notice this compilation-and-execution pattern in SQL engines too (`sqlite3_prepare()`, for example, serves the same purpose.) –  Jul 31 '13 at 07:57
  • Thanks. Coming from Perl I found it surprising that a regex compilation is expensive. – user13107 Jul 31 '13 at 07:58
  • @user13107 You're welcome. In "nice and productive" scripting languages, such as Perl, speed is generally not the greatest concern :) (Not that it should be unless benchmarked - premature optimization is evil.) -- If you create a tight loop with a regex match inside and another one with a simple string comparison, I bet the first one will run slower. –  Jul 31 '13 at 07:59