Mihai Ciocoiu, Štefan Bruda * GULiveR: Generalized Unification Based LR Parser for Natural Languages




We will describe shortly the classical algorithm and then we will summarize the necessary changes of it. Here are the functions used by the table builder [10,8] (T is the terminal set and N denotes the nonterminal set):
algorithm closure (I :set_of_productions ; J) is
	J <-- I;
	repeat
		for each 
	            A --> α.Bβ in J and B --> γ in grammar 	(1)
		do
		    if B --> .γ ¬c J then J <-- J U {B --> .γ }
	until there are no new elements added to J
end algorithm

goto (I,x) ::= closure (J), where J = {A --> αx.β | A --> α.xβ c I }

algorithm states (G' :grammar ; C) is
	C <-- closure ({S' --> .S });
	repeat
		for each I c C and x c N U T U {$}do
			 if goto(I,x) ¬= FØ and goto (I,x) ¬c C then 
			    C <-- C U {goto (I,x) }
	until there are no new elements added to C
end algorithm

The table builder is specified by the following algorithm:

algorithm builder (G' :grammar ; (ACTION,GOTO) :analysis_table) is
	C <-- states (G');
	for each Ii c C do
		for each a c T do
			if exists goto (Ii,a) = Ij then 
			   ACTION [i,a] <-- {"shift j"} U ACTION [i,a];
		if A --> α. c Ii then 
		   for each β c Follow (A) do
		            ACTION [i,β] <-- 
                            {"reduce A --> α"} U ACTION [i,β];
		if S' --> S. c Ii then 
                   ACTION [i,$] <-- {"accept"} U ACTION [i,$];
	for each A c N do 
	         if exists goto (Ii,A) = Ij then 
                    GOTO [i,A] = {j} U GOTO [i,A];
	Any empty entry in ACTION or GOTO table is marked as error 
        entry;
end algorithm

Due to the introduction of structured categories we need to perform some changes into this algorithm. First, both left and right hand sides of a production must be consulted. In the "closure" function the expression (1) must be replaced by:

A --> .B'β in J and B --> γ in grammar and B, B' are able to unify each other.

The "goto" function becomes:

goto (I,x) ::= closure (J), where J = {A --> αx'.β | A --> α.xβ c I and x, x' are able to unify each other}.

Finally, in the "states" and "builder" algorithms similar changes had to be made in order to discern between dags which are located in left or right hand side of a production and to determine whether two dags are "equal" (i.e., whether they are able to unify each other). Note that, in a monadic grammar, such mechanisms are not necessary because two symbols with the same name are always identical.


102

Previous Index Next