
Is it possible to parse compressed files in flex?

`yyin` is a pointer of type `FILE*`, so I would like to do something like this: create a pipe from the compressed file and set `yyin` to it. Is that possible?

Vardan Hovhannisyan
  • No it's not possible, at least not directly. You need to uncompress it first. – Some programmer dude Apr 23 '14 at 13:38
  • Would this help? http://stackoverflow.com/questions/1907847/how-to-use-yy-scan-string-in-lex – André Puel Apr 23 '14 at 13:51
  • What would be your purpose in parsing a compressed file? It can be done, but as any compressed file is going to appear to be primarily random bytes (except for brief areas where metadata are stored), it's not going to follow any predictable rules that a parser would expect... – twalberg Apr 23 '14 at 15:48
  • You will need to read the compressed data, uncompress it, then feed it into flex. This can be done block by block. You will need something like zlib to read compressed blocks of data, uncompress them, then feed each block to `yylex()`. – Jeffery Thomas Apr 23 '14 at 15:48

1 Answer


With flex, you can define the macro `YY_INPUT(buf,result,maxlen)` to change how flex obtains input. The macro must read at most `maxlen` bytes into `buf` and store the number of bytes actually read in `result`, or set `result` to `YY_NULL` to indicate EOF.
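To make that contract concrete, here is a minimal sketch in plain C that can be compiled outside of flex. The in-memory source (`src_data`, `src_pos`) and the `drain` helper are hypothetical names used only for illustration, and `YY_NULL` is defined here as a stand-in because flex normally supplies it.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for illustration only: in a real scanner flex defines YY_NULL. */
#ifndef YY_NULL
#define YY_NULL 0
#endif

/* Hypothetical in-memory input source (not part of flex). */
static const char *src_data = "hello, flex";
static size_t src_pos = 0;

/* Same shape as flex's YY_INPUT: copy at most maxlen bytes into buf and
 * store the count in result, or YY_NULL once the source is exhausted. */
#define YY_INPUT(buf, result, maxlen) do {                           \
    size_t left_ = strlen(src_data) - src_pos;                       \
    size_t n_ = left_ < (size_t)(maxlen) ? left_ : (size_t)(maxlen); \
    memcpy((buf), src_data + src_pos, n_);                           \
    src_pos += n_;                                                   \
    (result) = n_ > 0 ? (int)n_ : YY_NULL;                           \
} while (0)

/* Hypothetical driver: calls YY_INPUT repeatedly, roughly the way a
 * scanner refills its buffer, until EOF is reported or out is full. */
static size_t drain(char *out, size_t cap, size_t chunk) {
    size_t total = 0;
    for (;;) {
        int got;
        size_t want = cap - total < chunk ? cap - total : chunk;
        if (want == 0) break;            /* output buffer is full */
        YY_INPUT(out + total, got, want);
        if (got == YY_NULL) break;       /* source reported EOF */
        total += (size_t)got;
    }
    return total;
}
```

Calling `drain(out, sizeof out, 4)` pulls the source in chunks of at most four bytes, which shows that the macro may return fewer bytes than requested and only signals EOF when nothing is left.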

For example, using the convenience interface of zlib, you could insert something like the following into your flex file:

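 %{

 #include <zlib.h>

 /* Compressed input stream; takes the place of yyin as the input source. */
 gzFile gz_yyin;

 /* Read up to maxlen decompressed bytes into buf. result receives the
  * byte count, or YY_NULL at end of input. */
 #define YY_INPUT(buf,result,maxlen) do {  \
    int n = gzread(gz_yyin, buf, maxlen);  \
    if (n < 0) { /* handle the error */ }  \
    result = n > 0 ? n : YY_NULL;     \
 } while (0)

 %}

 // lots of stuff skipped

 int main(int argc, char** argv) {
   if (argc < 2) { /* handle the error */ }
   gz_yyin = gzopen(argv[1], "rb");
   if (gz_yyin == NULL) { /* handle the error */ }
   /* Start parsing */
   // ...
 }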

(You can use `gzdopen` to create a `gzFile` from an already-open file descriptor, such as a pipe.)

rici