Removing useless productions of a context free grammar through Petri Net

Mansoor Al-A’ali

INTRODUCTION

Considerable interest has been shown in Petri Nets (PNs) as suitable models for representing and studying concurrent systems because of its potential advantages.The major use of Petri Net has been the modeling of systems in which it is possible for some events to occur concurrently and or sequentially. They have been found to be useful for describing and analyzing many systems such as production, automatic control, communication protocol, circuit analysis, physics and systems involving human activity such as management and office information systems, legal systems, teaching and knowledge representation (2-6). In order to allow easy verification and hierarchical decomposition of the system, the reduction method of PNs without changing the properties has also been studied (7). However, there are still some areas where either this versatile model has not been used at all or used to a very limited extent.Formal languages and automata theory is one of such areas. Algorithms to minimize the context free grammars using the Petri net concept have been published (1),(8),(9) in which a petri net representation of the CFG has been given and techniques to eliminate [lambda] and unit-productions have been provided. The object of this paper is to exploit this PN model to remove Uselessproductions of a Context-Free Grammar (CFG). We assume that [lambda]- and unit-productions have already been removed. First the CFG is represented by a PN. Then algorithm is developed to eliminate Uselessproductions.The algorithm is analyzed and implemented in Pascal. using examples of a CFG . The proposed technique is novel in the sense that it provides a better representation and understanding of the problem. It is simple, requires less computation and is easily implemented on computers. Another technique proposed in (3) in which a given finite or infinite labeled transition graphs defined by a graph grammar, in which an algorithm decides whether this graph is isomorphic to the reachable state graph of some finite unlabeled Petri net. The algorithm aims to produce a minimal net realizing the graph. In (10) the Petri net models are established for right linear grammar, expression grammar and property tree formal grammar. The structure algorithms of these models are given and the properties of models are discussed. It is aimed at generating a language process based on Petri net models. These models are intuitional and dynamic

Petri net: A Petri net is an abstract, formal model of information flow (6). PNs are composed of two basic components: a set of places, P and a set of transitions, T. The relationship between the places and the transitions are defined by the input and the output functions. The input function I defines, for each transition [t.sub.j], the set of input places for the transition I([t.sub.j]). The output function O defines, for each transition [t.sub.j], the set of output places for the transition O([t.sub.j]).

Formally, a Petri net C is defined as the four-tuple

C = (P, T, I, O), where

P = {[p.sub.1], [p.sub.2],……, [p.sub.n]} is a finite set of places

T = {[t.sub.1], [t.sub.2], ……., [t.sub.m]} is a finite set of transitions, P[intersection] T = [empty set],

I = P [right arrow] T,

O = T [right arrow]P.

A Petri net graph is commonly used for better illustration. It consists of two types of nodes: a circle which represents a place and a bar which represents a transition. The input and output functions are represented by directed arcs from the places to the transitions and from the transitions to the places. An arc is directed from a place [p.sub.j] to a transition [t.sub.i] if the place is an input of the transition. Similarly, an arc is directed from a transition [t.sub.j] to a place [p.sub.i] if the place is an output of the transition.

To give a dynamic structure to PN, a marking [mu] is an assignment of tokens (represented by small solid dots inside the circles) to places. A Petri net executes by firing transitions. A transition is enabled to fire if each of its input places has at least one token in it. A transition fires by removing one token from each of its input places and depositing one token into each of its output places. Firing a transition will in general change the marking [mu] of the Petri net to a new marking [mu]’.

Context free grammar: Definition (11): Let G = ([V.sub.N], [V.sub.T], S, F) be a grammar where [V.sub.N] is a finite set of variables, [V.sub.T] is a finite set of objects called terminal symbols, S [member of][V.sub.N] is a special symbol called the start variable and F is a finite set of productions. Then G is said to be context-free if productions in P have the form A [right arrow] x, where A[member of][V.sub.N] and x [member of]([V.sub.N][union][V.sub.T])*.

A context free grammar may have [lamda] production, or unit productions. These productions make the grammar odd and difficult to parse. The useless productions have unnecessary variables and productions. Therefore eliminating these productions will make the grammar easier.

Theorem (11): Let L be a context-free language that does not contain [lamda]. Then there exists a context-free grammar that generates L and that does not have any useless productions.

Removing useless productions (8): An equivalent grammar that does not contain any useless variables or productions can be obtained by first finding those variables which lead to terminal strings. This is done by defining the following sets of variables

[A.sub.o] = {X | X[right arrow]P [member of] F and P[member of][V.sub.T]*}

[A.sub.i] = [A.sub.i-1] [union] {X| X[right arrow]W [member of] F and W[member of]([V.sub.T] [union][A.sub.i-1])*}

For some k for which [Ak.sub.-1] = [A.sub.k], the set of active variables [A.sub.k] is obtained.

Then only those variables which can be reached from starting symbol are obtained by defining and finding

Ro = {S}

[R.sub.i] = [R.sub.i-1] [union] {Y | X[right arrow]UYW [member of] F for some X[member of][R.sub.i-1] and

U, W [member of] ([V.sub.N] [union] [V.sub.T])*}

For some m, R[m.sub.-1]= [R.sub.m], the set of reachable variables [R.sub.m] is obtained.

After that we eliminate all variables which do not belong in [A.sub.k] [intersection] [R.sub.m] together with all rules in which they occur. The same process is repeated with the resulting grammar until we get a non-redundant grammar.

PN Representation of a CFG (1): A Context Free Grammar can be thought of as represented by interconnections of transitions and places of a PN. The productions are represented by transitions and the variables and terminal symbols by places. To include the order of appearance of variables and symbols on the right side of a production, the Petri net is modified and called an Ordered Petri Net (OPN).

Definition [10]: Let G = ([V.sub.N], [V.sub.T], S, F) be a Context-Free Grammar. An Ordered Petri Net is a PN defined to represent G if and only if it has the following properties:

* The input place of PN is labeled as S.

* The output places are labeled from [V.sub.T] [union] {[lamda]}.

* The intermediate places have labels from [V.sub.N].

* Production rules are represented by transitions. The input of the transition [t.sub.j] has label A [member of][V.sub.N] and outputs of [t.sub.j] are (from left to right) labeled as [a.sub.1], [a.sub.2], …., [a.sub.n] if and only if F has a production of the form A [right arrow] [a.sub.1], [a.sub.2],…., [a.sub.n].

The string of output places with tokens by reading from left to right omitting any [lamda]’s encountered is called the yield of OPN.

It must be noted here that there will be exactly one input place for any transition. Two or more productions may have the same variable on the left side. Only one of these productions can be used in one derivation.Therefore, the corresponding transitions can not fire simultaneously. This is guaranteed by having the same place input to such transitions. To derive a sentence, it is sufficient to find the sequence of the transitions which must fire such that the token of the starting symbol place reaches the required output places of leaf nodes.

PROPOSED TECHNIQUE FOR REDUCTION OF USELESS PRODUCTIONS

Before developing the algorithm, some notations that require explanation and clear exposition are given below:

P is the set of places corresponding to the variables and terminals. [absolute value of P] = [absolute value of([V.sub.N][union][V.sub.T])]

[P.sub.N] is the set of places corresponding to the variables.

[P.sub.T] is the set of places corresponding to the terminals.

T is the set of transitions corresponding to the productions. [absolute value of T] is the number of productions.

[t.sub.i.sup.o] is the set of output places of the transition [t.sub.i]. It must be noted that the order of the places in [t.sub.j.sup.o] is important and must be in the same sequence in which they appear in the corresponding production rule. Further it must be maintained in subsequent operations. [.sup.o.t.sub.i] is the set of input places of the transition [t.sub.i] Obviously this will contain only one element.

[p.sub.i.sup.o] is the set of output transitions of the place [p.sub.i]

[.sup.o p.sub.i] is the set of input transitions of the place [p.sub.i]

[P.sup.o] = {[p1.sup.o] , [p2.sup.o] ,…, [p.sub n.sup.o]}; [.sup.o P]={ [.sup.o p.sub.1] ,[.sup.o p.sub.2], …,[.sup.o p.sub.n]] }

[T.sup.o] = { [t.sub.1.sup.o] ,[t.sub.2.sup.o] ,…, [t.sub.n.sup.o]}; [.sup.o T]={[.sup.o t.sub.1]] , [.sup.o.t.sub.2]], …,[.sup.o t.sub.n] }

There are two types of useless productions; one is the production having their variables on the left side which cannot be reached from the starting symbol variables which do not lead to a terminal string. In the second type of productions the set of variables U which lead to terminal strings can be obtained by using reachability in PN. A place [p.sub.i] is in U iff a marking with a token in pi only can reach a marking with tokens only in leaf places by firing transitions. The second type of productions can be eliminated by finding the set of variables R which can be reached from starting symbol. A place pi is in R iff a token in S can reach to pi on firing transitions. Only those transitions, places given by a U [intersection]R along with their interconnections are included into resulting PN.

Algorithm

Input: OPN representing CFG with useless productions

Output: A reduced OPN for CFG without useless productions

Steps:

{Steps 1-4 finds the set of variables U which lead to terminal strings}

* [U.sub.0] = [empty set];[T.sub.0] = [empty set] ; i = 0

* [for all] [t.sub.j[member of] T , if [t.sub.j.sup.o][??] ([U.sun.i] [union] [P.sub.T]) then

[U.sub.i+1] = [U.sub.i] [union] [.sup.o t.sub.j]

[T.sub.i]+1 = [T.sub.i][union]{[t.sub.j]}

* If [U.sub.i+1] [not equal to] [U.sub.i] then go to step 2

* U = [U.sub.i] ; [T’.sub.1] = [T.sub.i+1]

{Steps 5-8 find the set of variables R which can be reached from starting symbol}

* [R.sub.o] = {S}; [T.sub.o] = [empty set] ; i=o

* [for all] [t.sub.j][member of] [T’.sub.1] if [.sup.o t.sub.j][??] [R.sub.i] then

[R.sub.i+1] = [R.sub.i] [union] (t.sub.o.sup.j] [intersection] [p.sub.N)

[T.sub.i+1] = [T.sub.i] [union] {t.sub.j}

* If[R.sub.i+1] [not equal to] [R.sub.i]i then go to step 6

* [P’.sub.N] = U [intersection] [R.sub.i];[T’.sub.N] = [T.sub.i+1] * {This finds the reduced OPN} [t’.sub.i] [member of] T’ [.sup.o t’.sub.i] = [t.sub.i.sup.o]; [.sup.o t’.sub.i] = [t.sub.i.sup.o] iff

[for all] [t.sub.i][member of] [T’.sub.N] [.sup.o t.sub.i] = [.sup.o t.sub i][intersection] [P’.sub.N] [union] [P.sub.T]) The order in [t.sub.i.sup.o] is to be maintained as before.

The algorithm can easily be shown to be O ([[[absolute value of T].sub.2]) where [absolute value of T] is the number of productions.

The algorithm is illustrated with the help of the following example, example 1.

Example 1: Consider the Grammar G = ({A, B, C, S}, {a, b, c}, S, F), with the following production rules:

S [right arrow] aB | bC; A [right arrow] BAc [absolute value ofbSC] a; B [right arrow] aSB | bBC; C [right arrow] SBC [absolute value ofSBc ac

The Petri net presentation is shown in Fig. 1

[FIGURE 1 OMITTED]

For this example,

[P.sub.N] = {S, A, B, C} ; [P.sub.T]={ a, b, c}; T = {[t.sub.1], [t.sub.2], [t.sub.3], t4, [t.sub.5],[t.sub.6], [t.sub.7], [t.sub.8], [t.sub.9], [t.sub.10]}

[.sup.o.t.sub.1] = {S}, [t.sub.1.sup.o] = {a, B}

[.sup.o.t.sub.2] = {S}, [t.sub.2.sup.o] = {b, C}

[.sup.o.t.sub.3] = {B}, [t.sub.3.sup.o] = {a, S, B}

[.sup.o.t.sub.4] = {B}, [t.sub.4.sup.o] = {b, B, C}

[.sup.o.t.sub.5] = {C}, [t.sub.5.sup.o] = {a, c}

[.sup.o.t.sub.6] = {C}, [t.sub.6.sup.o] = {S,B,C}

[.sup.o.t.sub.7] = {C}, [t.sub.7.sup.o] = {S,B,c}

[.sup.o.t.sub.8] = {A}, [t.sub.8.sup.o] = {B, A, c}

[.sup.o.t.sub.9] = {A}, [t.sub.9.sup.o] = {b, S, C}

[.sup.o.t.sub.10] = {A}, [t.sub.10.sup.o] = {a}

Steps 1-4 gives the following:

[U.sub.O] = [empty set];[T.sub.O] = [empty set] ; i = 0

[t.sub.1.sup.0], [t.sub.2.sup.0], [t.sub.3.sup.0] and [t.sub.4.sup.0][??] ([U.sub.O] [union] [P.sub.T]) hence [U.sub.O] = [empty set];[T.sub.O] = [empty set]

[t.sub.5.sup.0] = {a, c} [??] ([U.sub.O] [union] [P.sub.T]) hence [U.sub.1] = {[.sup.o.t.sub.5]}; [U.sub.1] = {C};[T.sub.1]={[t.sub.5]}

[t.sub.6.sup.0], [t.sub.7.sup.0] , [t.sub.8.sup.0] and [t.sub.9.sup.0] [??] ([U.sub.1] [union] [P.sub.T]) hence [U.sub.1] = {C};[T.sub.1]={[t.sub.5]}

[t.sub.10.sup.0] = {a}[??] ([U.sub.1] [union] [P.sub.T]) hence [U.sub.2] = {C,A};[T.sub.2]={[t.sub.5],[t.sub.10]}

[t.sub.1.sup.0], [t.sub.2.sup.0], [t.sub.3.sup.0] and [t.sub.4.sup.0][??] ([U.sub.O] [union] [P.sub.T])

[t.sub.1.sup.0][??] ([U.sub.2] [union] [P.sub.T]) hence [U.sub.2] = {C,A};[T.sub.2]={[t.sub.5],[t.sub.10]}

[t.sub.2.sup.0][??] ([U.sub.2] [union] [P.sub.T]) hence [U.sub.3] = {C,A};[T.sub.3]={[T.sub.2],[t.sub.5],[t.sub.10]}

[t.sub.3.sup.0] and [t.sub.4.sup.0][??] (U3 [union] [P.sub.T]) hence U3 = {C,A};[T.sub.3]={[T.sub.2],[t.sub.5],[t.sub.10]}

[t.sub.5.sup.0][??] (U3 [union] [P.sub.T]) hence [U.sub.4] = {C,A};T4={[T.sub.2],[t.sub.5],[t.sub.10]}

[t.sub.6.sup.0], [t.sub.7.sup.0] and [t.sub.8.sup.0] [??] ([U.sub.4] [union] [P.sub.T]) hence [U.sub.4] = {C,A};T4={[t.sub.2],[t.sub.5],[t.sub.10]}

[t.sub.9.sup.0] [??] ([U.sub.4] [union] [P.sub.T] ) hence [U.sub.4] = {C,A,S};T4={[T.sub.2],[t.sub.5], [t.sub.9], [t.sub.10]}

[t.sub.10.sup.0] [??] ([U.sub.4] [union] [P.sub.T]) hence [U.sub.5]= {C,A,S};[t.sub.5]={[t.sub.2],[t.sub.5], [t.sub.9], [t.sub.10]}

Since [U.sub.5]= [U.sub.4] = {C,A,S} hence U= {C,A,S};

T’1=[T.sub.5]={[t.sub.2],[t.sub.5], [t.sub.9], [t.sub.10]}

Steps 5-8 gives the following:

[R.sub.0] ={S}; [T.sub.O] = [empty set] ; i = o ; [T’.sub.1] = {t.sub.2], [t.sub.5], [t.sub.5], [t.sub.9],.[t.sub.10]}[P.sub.N] = {S, A, B, C}

[.sup.o.t.sub.2]={S}[??][R.sub.0] hence [R.sub.1] = [R.sub.0][union]([t.sub.2.sup.0][intersection][P.sub.N]) = {S}[union] ({b, C}[intersection] {S, A, B, C})={S,C} [T.sub.1] ={ [t.sub.2]}

[.sup.o.t.sub.5][??][R.sub.1] hence [R.sub.2] = [R.sub.1][union]([t.sub.5.sup.0][intersection][P.sub.N]) = {S,C}[union] ({a, c}[intersection] {S, A, B, C}) = {S,C} [T.sub.2] ={[t.sub.2], [t.sub.5]}

[.sup.o.t.sub.9] and [.sup.o t.sub.10][??][R.sub.2] ; [R.sub.2] = {S,C}; [T.sub.2] = {[t.sub.2], [t.sub.5]}

Since [R.sub.2] = [R.sub.1] hence [P’.sub.N] = U [intersection][R.sub.i] = {S,C} and [T’.sub.N] = [T.sub.i+1]={[t.sub.2], [t.sub.5]}

Step 9 gives:

[for all][t.sub.i] [member of] [T’.sub.N] = {[t.sub.2], [t.sub.5]}

[.sup.o.t.sub.2] = [.sup.o.t.sub.2] [intersection] [P’.sub.N] = {S} [intersection]{S,C} = {S};

[t.sub.2.sup.0] = [t.sub.2.sup.0] [intersection] ( [P’.sub.N] [union] [P.sub.T] ) = {b, C} [intersection] ( {S,C} [union]{a,b, c}) = {b,C}

[.sup.o.t.sub.5] = [.sup.o.t.sub.5] [intersection] [P’.sub.N] = {C} [intersection]{S,C} = {C}

[t.sub.5.sup.0] = [t.sub.5.sup.0] [intersection] ( [P’.sub.N] [union] [P.sub.T] ) = {a,c} [intersection] ( {S,C} [union]{ a, b, c}) = {a,c}

Reduced Petri net is shown in Fig. 2 and the corresponding reduced grammar is

[FIGURE 2 OMITTED]

S [right arrow] bC; C [right arrow] ac

CONCLUSION

In this paper an algorithm to eliminate the useless productions of CFG was given. It is based on the PN representation of a CFG and reachability concept of PN. The technique has been analyzed and implemented. The proposed algorithm is better than the existing techniques in the sense that PN model is easy to understand, requires less computations and easily amenable on computers. The algorithm accepts the required PN form of a CFG without [lamda] and unit productions, removes the useless productions and gives the result in PN form. The automated translation of a CFG into PN has not been provided, only a technique has been given.

REFERENCES

(1.) Khan, A.A., M. Al-A’ali and N. Al-Shamlan, 1996. Simplification of Context-Free Grammar Through Petri Net. Computers & Structures, 58: 1055-1058.

(2.) Peterson, J.L., 1981. Petri Net Theory and the Modeling of Systems. Prentice Hall Inc.

(3.) Rennes, D.P., 2001. On the Petri net realization of context-free graphs. Theoretical Computer Sci., 258: 573-98.

(4.) Tilak, A., 1979. Putting Petri Nets to Work. IEEE Computer, pp: 85-94.

(5.) Yan, C., 1999. Petri Net models of grammars and its structure algorithm. J. Appl. Sci., 17: 58-63.

(6.) Esparza, J., 1997. Petri Nets, Commutative Context-Free Grammars and basic Parallel Processes. Fundamental Informaticae, 31: 13-25.

(7.) Lee K.-H. and F. Joel, 1985. Hierarchical Reduction: Method for Analysis and Decomposition of Petri Nets. IEEE Trans. on Systems, Man and Cybernetics, SMC-15: 272-280.

(8.) Darondeau, P., 2001. On the Petri net realization of context-free graphs. Theor. Computer Sci., 258: 573-598.

(9.) Erqing, X., 2004. A Pr/T-Net model for contextfree language parsing, Fifth World Congress on Intelligent Control and Automation (IEEE Cat. No.04EX788), [P.sub.T]. 3, 1919-22, Vol.3.

(10.) de Lara, J., 2004. Defining visual notations and their manipulation through meta-modeling and graph transformation. J. Visual Languages and Computing, 15: 309-30.

(11.) Linz, P., 1990. An Introduction To Formal Languages And Automata, D.C. Heath and Company.

Corresponding Author: Mansoor Al-A’ali, Department of Computer Science, College of Information Technology, University of Bahrain, P.O. Box 32038, Sakheer Campus, Kingdom of Bahrain

Mansoor Al-A’ali and Ali A Khan Department of Computer Science, College of Information Technology, University of Bahrain P.O. Box 32038, Sakheer Campus, Kingdom of Bahrain

COPYRIGHT 2007 Science Publications

COPYRIGHT 2008 Gale, Cengage Learning