I am trying to study the source code of grep, which I built by following README-hacking. However, when debugging, I cannot step into the re_compile_pattern function: gdb reports that the function resolves to libc.so.6. Is there a way to fix this, for example by editing the Makefile? I find grep's Makefile complex and am unsure how to go about it.

I am also considering building gnulib and writing my own code that calls it. However, I ran into the error "possibly undefined macro: gl_CHECK_NEXT_HEADERS." I will try to resolve this later using WSL.

Environment: Termux (Android)
Distribution: Archlinux
Device: Honor 10

ks1322
wuch
  • `libc` is the system's C runtime library. It's not code in grep. That's why you can't debug it. Usually there are packages containing debug libraries for the C runtime library that can be installed if you want to debug them, but I've never heard of "Termux" so I have no idea how you'd go about finding those. – MadScientist Jun 16 '23 at 13:51

1 Answer

The error "possibly undefined macro: gl_CHECK_NEXT_HEADERS" indicates that the configure file has not been correctly generated. There are two ways to get a working tarball with a correct configure file:

Then, use ./configure --help to see the package-specific configuration options. Among other options, this help displays:

  --without-included-regex
                          don't compile regex; this is the default on systems
                          with recent-enough versions of the GNU C Library
                          (use with caution on other systems).

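So what you need here is --with-included-regex, which makes grep compile and link its bundled copy of the regex code instead of using the one in libc.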

Then, use the general flags for getting debuggable output. For mixed C/C++ programs, use CFLAGS="-ggdb" CXXFLAGS="-ggdb". The default for both variables is -g -O2, which enables optimizations that make for a suboptimal debugging experience. Since grep uses C only, CFLAGS="-ggdb" is sufficient.

In summary, use

CFLAGS="-ggdb" ./configure --with-included-regex
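To make the steps above concrete, here is a minimal sketch of the build and a first gdb session. It assumes the current directory is a grep source tree whose configure script was generated correctly, that the built binary ends up in src/grep, and that README is used only as a convenient input file. The two function names are hypothetical helpers, not part of grep. The sketch uses rbreak with a pattern rather than a plain break because the bundled regex code may export the function under a prefixed name.

```shell
#!/bin/sh
# Sketch only: assumes a grep source tree with a working ./configure.

# Configure with the bundled regex and debug info, then build.
# CFLAGS is passed as a VAR=VALUE argument, which autoconf-generated
# configure scripts accept as an alternative to setting it in the environment.
build_debug_grep() {
    ./configure CFLAGS="-ggdb" --with-included-regex && make
}

# Start gdb on the freshly built binary.
# "info functions" lists the matching symbols in the executable;
# "rbreak" sets a breakpoint on every function whose name matches the
# pattern, so it still works if the symbol carries a prefix.
debug_regex_compile() {
    gdb -ex 'info functions re_compile_pattern' \
        -ex 'rbreak re_compile_pattern$' \
        -ex 'run' \
        --args ./src/grep 'a.*b' README
}
```

After sourcing the script, run build_debug_grep once and then debug_regex_compile. When the breakpoint is hit, step should enter the regex code compiled from grep's own tree rather than stopping at libc.so.6.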

Finally, note that the source code of the GNU regular expression implementation is not well suited for learning. It's a production-quality implementation with many optimizations. Probably only two people in the world understand this implementation's code: Paul Eggert and the original author, Isamu Hasegawa. For learning, simpler regular expression implementations are better suited; see Wikipedia.

Bruno Haible
  • It works. Thank you for your answer and advice. – wuch Jun 18 '23 at 12:06
  • I read the book Compilers: Principles, Techniques, & Tools, which introduces how to create an NFA from a regular expression. There is also an article explaining how to generate an NFA or DFA from a regular expression using Thompson's algorithm: https://swtch.com/~rsc/regexp/regexp1.html. I want to write a toy compiler, which I think would be very cool. I also want to write a template engine to generate my blog. Both tools need a regular expression parser, so I want to know how grep converts a regular expression to an NFA and DFA. Finally, thank you again. – wuch Jun 18 '23 at 13:27