2023-02-17

LALR(1) Parser DFA Lookahead Core Question

I am having trouble understanding what the rules are for adding a lookahead to a core production during the construction of the DFA. To illustrate my confusion, I will be using an online parser generator that exposes all the internal calculations; this_tool. (<- open in a new tab)

(The formating is: NONTERMINAL -> RULE, LOOKAHEADS, where the lookaheads are forward slash sperated)

Using this grammar as an example:

S -> E
E -> ( E )
E -> N O E
E -> N
N -> 1
N -> 2
N -> 3
O -> +
O -> -

Copy and pasting the above grammar into the lalr parser generator will produce a dfa with 12 states (click the >>). My question is finally, why are the goto(0, N) kernel productions ( {[E -> N.O E, $/)]; [E -> N., $/)]} ) initiated with the ) terminal? Where does the ) come from? I would expect the goto(0, N) to be {[E -> N.O E, $]; [E -> N., $]}. Equally the kernel production in the goto(0, ( ) has an 'extra' ).

As the dfa is being constructed, equal cores are merged (the core is the set of productions that introduce a new state by performing closure on that set). State 2 has production [E -> .N, )];, which when merged with [E -> N., $] produces the correct output, but there's no way for state 0 to have known about lookahead of )

Thanks in advance, sorry if this was a confusing and specific question and about using an external website to demonstrate my issue.✌️



No comments:

Post a Comment