I have a C library that I'd like to interface to from C++ code without modifying the library. It has a yacc-generated front end that reads from `yyin`, which is a `FILE *`. I'd like to set `yyin` to some kind of emulation of a `FILE *` which reads from memory. Is there any hope of doing this in a portable (Linux, Mac, Windows) manner -- or is there another trick for making such a parser read from memory rather than a `FILE *`?

-
Does your yacc front end also have a lex/flex-generated tokenizer? – Bryan Olivier Jun 10 '13 at 04:15
-
Are you planning on reading in data from an external file as needed, or do you have the whole string to scan sitting in memory? – templatetypedef Jun 10 '13 at 04:15
-
@BryanOlivier, yes, there's a lex tokenizer. – Ernest Friedman-Hill Jun 10 '13 at 04:16
-
@templatetypedef I want to be able to send a string to it on demand. – Ernest Friedman-Hill Jun 10 '13 at 04:17
-
Typically the input routines of lex/flex are changed to read from a string. You can either use `yy_scan_buffer`, as Dietrich mentioned (though I think it is `flex`-only), or redefine the `input` macro (old skool). – Bryan Olivier Jun 10 '13 at 04:19
2 Answers
You can use `fmemopen()` on Linux. Unfortunately, there is no portable way to do this, but then again, `fopen()` isn't even really portable (it's been broken for a long time on Windows).
However, if your tokenizer is Flex, you can use `yy_scan_buffer()`. See "String input to flex lexer".
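For the `fmemopen()` route, here is a minimal sketch, assuming a system that provides the POSIX.1-2008 `fmemopen()` (glibc on Linux does). The helper name `open_string_as_file` is made up for illustration and is not part of any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: wrap a NUL-terminated string in a read-only FILE *.
   The caller must keep `text` alive until fclose() is called on the result.
   Returns NULL on failure; some implementations may reject an empty string. */
FILE *open_string_as_file(const char *text)
{
    assert(text != NULL);
    /* fmemopen() takes a non-const void *, but mode "r" only reads from it. */
    return fmemopen((void *)text, strlen(text), "r");
}
```

Assuming the library exposes the default `yyin` and `yyparse` names, the result could be assigned to `yyin` before calling `yyparse()`, and closed with `fclose()` afterwards.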

-
That is excellent, thank you -- I was thinking about this all wrong. Been eating too long at the Java trough, I'm afraid. – Ernest Friedman-Hill Jun 10 '13 at 04:22
-
Got this working with `yy_scan_string()` today -- thanks again! – Ernest Friedman-Hill Jun 10 '13 at 23:12
A yacc-generated parser normally gets its tokens from a lexer by calling a function named `yylex`.
The lexer is what normally reads characters from an input file (or buffer, in your case). Assuming you're using Flex to generate the lexer, the usual "hook" for modifying how input is read is to redefine the `YY_INPUT` macro.
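A rough sketch of what such a redefinition might look like, assuming the flex input file can be edited (the macro has to be defined there). The names `set_memory_input` and `read_memory_input` are hypothetical; only the `YY_INPUT(buf, result, max_size)` shape comes from Flex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory source that the lexer drains. */
static const char *mem_src = "";
static size_t mem_len = 0;
static size_t mem_pos = 0;

/* Point the reader at a new string; `text` must outlive the scan. */
void set_memory_input(const char *text)
{
    assert(text != NULL);
    mem_src = text;
    mem_len = strlen(text);
    mem_pos = 0;
}

/* Copy up to max_size bytes into buf; returns 0 once the input is used up. */
size_t read_memory_input(char *buf, size_t max_size)
{
    size_t n = mem_len - mem_pos;
    if (n > max_size)
        n = max_size;
    memcpy(buf, mem_src + mem_pos, n);
    mem_pos += n;
    return n;
}

/* Placed in the definitions section of the .l file, this would replace the
   default stdio-based reader. Flex treats a result of 0 as end of file. */
#define YY_INPUT(buf, result, max_size) \
    ((result) = read_memory_input((buf), (size_t)(max_size)))
```

With this in place, calling `set_memory_input(text)` before `yyparse()` would feed the string to the lexer in chunks of whatever size Flex requests.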
As @Dietrich Epp mentioned, however, there are also `yy_scan_string`, `yy_scan_buffer`, and `yy_scan_bytes`. Whether these suit your purposes better than defining your own `YY_INPUT` is open to question. I can't remember the details, but I recall avoiding them at times because of a (perceived, at least) lack of efficiency, or perhaps just because defining `YY_INPUT` seemed easier.

-
The problem with using `YY_INPUT` seems to be that the `flex` input file already defines it (as it must) and my goal was to use the library without modifying any of the source. I'm looking at things now, trying to figure out if I can use the `yy_scan_X()` functions to work with the unmodified code. – Ernest Friedman-Hill Jun 10 '13 at 04:36