1

I was reading about compiler design principles and specifically the relationship between Buffering and Lookahead terms that i came to this question:

Question 1: Limitations of Buffering

If the main point of buffering is to increase the number of lookaheads in a compiler, then why is the usage of buffers is limited while constructing a compiler?

To clarify the question, the Buffering technique is used when a compiler is reading an input(Source Code). If it is capable of seeing more upcoming letters(Lexems professionally), Then the decisions can be made more precisely.

and my next question is about the number of lookaheads we can get by putting buffers in a compiler.

Question 2: The increase in lookaheads by adding each buffer

If we add a simple 50-Char sized buffer into our compiler, then how much is the lookahead increased?

Thanks for your great answers.

hexpheus
  • 741
  • 10
  • 22
  • Unless the terminology has changed a lot since the 1980s, "lookahead" is either being used in the sense in regular expressions (see https://stackoverflow.com/q/2973436/1256452) or in parser techniques (eg https://stackoverflow.com/q/26958854/1256452). If you are turning lexemes into tokens, that would limit you to the first case, but we don't normally count a specific number of characters, but rather look for the complexity of the lookahead. The number of characters is usually too trivial to care about, but it's important for it to be bounded, to avoid exponential time in the lexer. – torek Jun 18 '17 at 20:48
  • If you are looking at the second case (e.g. the constant k in LALR(k)) then buffer sizes generally become irrelevant: by this point the parser is dealing in tokens, and lexer details are unimportant, except for memory lifetime issues for token memory. – torek Jun 18 '17 at 20:49
  • Your first question doesn't really make sense. Lookahead is limited because buffers are finite. In any case, any LR(N) grammar can be rewritten as an LR(1) grammar, and similarly for the smaller grammar classes, so it is formally redundant. – user207421 Jun 18 '17 at 22:05
  • @EJP I understand that the number of buffers are finite. I was questioning about why we dont use like 1000 buffers instead of (for example) 10 buffers. Does it affect the performance? What are the side effects? – hexpheus Jun 18 '17 at 22:11
  • @aligholamee: what difference would you posit for 10 buffers of 10 characters each vs 1 buffer of 100 characters? In any case, buffering in general (ignoring compiler construction entirely) is just a method of amortizing costs: there is some fixed cost to do a read or write, plus some variable cost per-unit (typically per-byte). Suppose those are 200 and 2 respectively. Reading 1 byte at a time = 202 cost-units per byte; reading 100 bytes = 400 cost-units = 4 cost-units per byte. – torek Jun 19 '17 at 00:21
  • @torek Can you give me a hint on how the number of lookaheads are increased? Simply, imagine that there are 2 buffers of 50 and they are connected to each other serially. While reading a line of the source program, How many lookaheads can i have in my buffers? And finally let's say we double the number of buffers(4 buffers in this case), how much more space can i have? I guess there is a particular logic goes on here that i can't understand. Thanks. – hexpheus Jun 19 '17 at 05:15
  • @torek According to what i've learned, the choice of a buffer with size 100 is wiser in terms of buffering space. Since a final place in each buffer is dedicated to EOF, We'll have less space in 10 buffers of 10 versus 1 buffer of 100. But, Then we have to write multiple if else statements to check whether we have reached the EOF in 1 buffer of 100, which makes the performance worst than 10 buffers of 10. – hexpheus Jun 19 '17 at 05:16
  • EOF is not a character, so "storing EOF in a buffer" makes no sense. (Some languages use terminated strings and others use counted strings, and if you use the built-in string support, you may be stuck with the inability to represent the terminator as an input byte, but that's an implementation detail.) – torek Jun 19 '17 at 05:17

0 Answers0