I'm having trouble building the collection of item sets for an LR(1) parser when the grammar contains epsilon productions. For example, take the following grammar (where eps stands for epsilon):
S -> a S U
U -> b
| eps
State0 would be
S' -> .S, $
S -> .a S U, $
Moving with 'a' from State0 gives the following state (let's call it State2):
S -> a .S U, $
S -> .a S U, $/???
To get the lookahead for the second item of State2 I need to compute FIRST(U$). I know that FIRST(U) = {'b', eps}. My first question is: are the lookaheads of the second item of State2 both $ and 'b'? Since U can derive eps, my intuition says that $ can be a lookahead as well, not just 'b'. It would be just 'b' if FIRST(U) were just {'b'}. Is that correct?
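To make the first question concrete, here is a minimal Python sketch of the FIRST-of-a-sequence rule as I understand it from the usual textbook definition. The names (GRAMMAR, first_sets, first_of_sequence) are made up for illustration, and writing an eps production as an empty tuple is my own assumption, not something taken from a particular tool.

```python
EPS = "eps"  # marker for the empty string, as in the grammar above

# Grammar from the question; an eps production is written as an empty tuple
# (an assumed encoding, chosen only for this sketch).
GRAMMAR = {
    "S": [("a", "S", "U")],
    "U": [("b",), ()],
}

def first_of_sequence(symbols, first, grammar):
    """FIRST of a string of symbols; contains EPS only if every symbol can vanish."""
    result = set()
    for sym in symbols:
        if sym not in grammar:          # terminal, including the end marker '$'
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:       # this nonterminal cannot vanish, stop here
            return result
    result.add(EPS)                     # every symbol was nullable, or the string was empty
    return result

def first_sets(grammar):
    """Compute FIRST for every nonterminal by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of_sequence(rhs, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

FIRST = first_sets(GRAMMAR)
# Lookaheads for the item S -> .a S U in State2: FIRST(U $)
lookaheads = first_of_sequence(("U", "$"), FIRST, GRAMMAR)
```

Under this rule, lookaheads ends up containing both 'b' and '$', because U is skipped when it derives eps and the '$' behind it becomes visible.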
Second question: at some point I will have a state like the following one:
S -> a S .U, $
U -> .b, $
U -> .eps, $
What do I do here? Do I need to move with eps and have a set with the item U -> eps., $? What if I have another terminal as lookahead, i.e. X -> .eps, a/$? And if I move, ending up with a set of the form X -> eps., $, do I reduce?
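For what it's worth, one convention I have seen, and which I am only assuming applies here, is to treat U -> eps as a production with an empty right-hand side. The item is then U -> ., $, which is already complete, so there would be no move on eps at all. A rough Python sketch of the closure under that assumption follows; Item, NULLABLE, FIRST and closure are hypothetical names, and the FIRST and NULLABLE tables are hard-coded for this grammar rather than computed.

```python
from typing import NamedTuple

class Item(NamedTuple):
    lhs: str
    rhs: tuple       # an eps production is the empty tuple (assumed encoding)
    dot: int
    lookahead: str

GRAMMAR = {"S": [("a", "S", "U")], "U": [("b",), ()]}
NULLABLE = {"U"}                      # assumed precomputed: U can derive eps
FIRST = {"S": {"a"}, "U": {"b"}}      # FIRST sets without eps, assumed precomputed

def first_of(symbols):
    """FIRST of a symbol string that ends in a terminal lookahead."""
    out = set()
    for sym in symbols:
        if sym not in GRAMMAR:        # terminal or '$'
            out.add(sym)
            return out
        out |= FIRST[sym]
        if sym not in NULLABLE:
            return out
    return out

def closure(items):
    """LR(1) closure of a set of items, using a simple worklist."""
    result = set(items)
    work = list(items)
    while work:
        item = work.pop()
        if item.dot >= len(item.rhs):
            continue                  # complete item, nothing after the dot
        nxt = item.rhs[item.dot]
        if nxt not in GRAMMAR:
            continue                  # terminal after the dot
        rest = item.rhs[item.dot + 1:] + (item.lookahead,)
        for la in first_of(rest):
            for rhs in GRAMMAR[nxt]:
                new = Item(nxt, rhs, 0, la)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return result

# The state from the second question, built from its kernel item S -> a S .U, $
state = closure({Item("S", ("a", "S", "U"), 2, "$")})
# Complete items (dot at the end) are the candidates for a reduce action
reduces = {(i.lhs, i.rhs, i.lookahead) for i in state if i.dot == len(i.rhs)}
```

With this encoding the state has three items, and the only complete one is U -> ., $, which would mean a reduce by U -> eps on lookahead $ in that same state. I'm not sure whether this is the standard treatment, hence the question.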
One more thing: do I need to insert eps in the parse table as a symbol?
Thanks