I have compiled 64-bit Ruby on an AIX box. Everything seems to work, except when I use certain regular expressions in my code. Here is an example:

/([0-9]*){1000}/.match("2")

results in:

RegexpError: too big quantifier in {,}: /([0-9]*){1000}/

When I try reducing the number of repetitions, it seems to work.
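
The observation that smaller counts work can be checked directly. Below is a minimal Ruby sketch that binary-searches for the largest repetition count the running interpreter accepts. The helper names (`repeat_count_accepted?`, `largest_accepted_repeat`) and the ceiling of 1,000,000 are assumptions made for illustration, and it uses the simpler pattern `[0-9]{n}` rather than the one above. It also assumes that if a count compiles, every smaller count compiles too.

```ruby
# Returns true if a regexp repeating a digit class `count` times compiles.
def repeat_count_accepted?(count)
  Regexp.new("[0-9]{#{count}}")
  true
rescue RegexpError
  false
end

# Binary search for the largest accepted count in 1..ceiling.
# Assumption: acceptance is monotonic (if n compiles, so does every smaller n).
# The default ceiling is an arbitrary illustrative bound, not an engine constant.
def largest_accepted_repeat(ceiling = 1_000_000)
  return nil unless repeat_count_accepted?(1)

  low = 1
  high = ceiling
  while low < high
    mid = (low + high + 1) / 2
    if repeat_count_accepted?(mid)
      low = mid        # mid compiles, so the limit is at least mid
    else
      high = mid - 1   # mid is rejected, so the limit is below mid
    end
  end
  low
end

limit = largest_accepted_repeat
puts "largest accepted repeat count: #{limit}"
```

On a build showing the error above, the reported limit should be below 1000. The exact number depends on the regex engine the interpreter was built with.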

I tried digging into the Ruby source code but could not work out the reason. Is this some dependency or restriction specific to AIX or to 64-bit Ruby?

Thanks in advance :)

Ricketyship

1 Answer

I almost immediately found the answer.

The first thing I did was search the Ruby source code for the error being thrown. I found that regex.h was responsible for it.

In regex.h, the relevant code looks something like this:

/* Maximum number of duplicates an interval can allow.  */
#ifndef RE_DUP_MAX
#define RE_DUP_MAX  ((1 << 15) - 1)
#endif

The problem here is RE_DUP_MAX. On the AIX box, the same constant is also defined somewhere under /usr/include. I searched for it and found it in:

/usr/include/NLregexp.h
/usr/include/sys/limits.h
/usr/include/unistd.h

I am not sure which of the three is being used (most probably NLregexp.h). In these headers, RE_DUP_MAX is set to 255, so the number of repetitions in a regex is capped at 255.
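
The header search described above can be reproduced with a short script. This is only a sketch: the helper names (`macro_definitions`, `headers_defining`) are hypothetical, `/usr/include` is assumed to be the header root, and only single-line `#define` directives are recognised. Any trailing comment on the line is reported as part of the value.

```ruby
# Collect the value of every single-line "#define MACRO value" in header text.
def macro_definitions(text, macro)
  pattern = /^[ \t]*#[ \t]*define[ \t]+#{Regexp.escape(macro)}\b[ \t]*(.*)$/
  text.scan(pattern).map { |captures| captures.first.strip }
end

# Walk a directory tree and report which .h files define the macro.
# Returns a hash of path => list of values found in that file.
def headers_defining(root, macro)
  results = {}
  Dir.glob(File.join(root, "**", "*.h")).sort.each do |path|
    next unless File.file?(path)

    begin
      # Read raw bytes and scrub them so stray non-UTF-8 bytes cannot raise.
      text = File.binread(path).force_encoding(Encoding::UTF_8).scrub("?")
    rescue SystemCallError
      next # unreadable file: skip it
    end

    values = macro_definitions(text, macro)
    results[path] = values unless values.empty?
  end
  results
end

# Assumed header root; adjust for the system being inspected.
headers_defining("/usr/include", "RE_DUP_MAX").each do |path, values|
  puts "#{path}: #{values.join(', ')}"
end
```

The script only lists candidate definitions. Which one the compiler actually sees still depends on the include order during the build.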

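In short, the build picks up the system-defined value rather than the one defined in regex.h. Because the definition in regex.h is wrapped in #ifndef, it is skipped whenever a system header has already defined RE_DUP_MAX.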

Hence I solved the issue by forcing the value of RE_DUP_MAX in regex.h, i.e.:

# ifdef RE_DUP_MAX
# undef RE_DUP_MAX
# endif

# define RE_DUP_MAX ((1 << 15) - 1)

Cheers!

Ricketyship
  • Well, then you should certainly submit a patch to the Ruby folks! Unless of course they intend to use the system's RE_DUP_MAX ... (And you should probably mark the question answered, as well, using the check mark.) – Michael Ratanapintha Jan 20 '12 at 04:00
  • I had a similar issue some time ago and found the same cause as you have found. In my case, I added "-DRE_DUP_MAX=32767" to CFLAGS. It's not always good to do that though - better to let the configure script derive its own value for CFLAGS... – graza Mar 30 '12 at 15:35
  • @graza The problem arises when the constant is defined in multiple header files: whichever comes first is taken. This may not be controllable through CFLAGS, as the header might be included from some other header. Anyway, thanks for the info! :) – Ricketyship Apr 06 '12 at 10:03