CFG-R (reversed rightmost) File Format (.cfg-r) CS 241
A .cfg-r file is a text file representing a context-free grammar followed by several
(zero or more) abbreviated reversed right canonical derivations. The format
differs from the CFG file format only in that it uses
unindented reversed rightmost derivation format.
Context-free Grammar Representation
The context-free grammar representation has four components, in order:
- t, a positive integer giving the number of terminal symbols in the grammar, followed
by t lines, each containing the name of a distinct terminal symbol. Each terminal symbol may
be any string of one or more printable ascii characters, excluding white space.
- n, a positive integer giving the number of nonterminal symbols in the grammar, followed
by n lines, each containing the name of a distinct nonterminal symbol. Each nonterminal
symbol may be any string of one or more printable ascii characters, and must not be the same as any terminal symbol.
- A single line giving the name of the start symbol, which must be one of the nonterminal symbols
- r, a positive integer giving the number of production rules in the grammar, followed by r lines, each denoting a rule consisting of:
- The left-hand-side (LHS) of the rule; a nonterminal symbol
- The right-hand-side (RHS) of the rule; zero or more terminal
or nonterminal symbols separated from each other, and from the LHS, by a
single space
Abbreviated Reversed Right Canonical Derivation(s)
Zero or more derivations immediately follow the context-free grammar.
Each derivation is a representation of the parse tree, with
each node in the tree represented by a line in the file containing a production rule.
(In contrast to the derivations in CFG files, these lines do not begin with
any spaces.)
The order and indentation of the lines in the file are defined by the following
recursive rules:
- The last line of the file is a production rule whose LHS is the start symbol, indented 0
spaces. This line represents the root of the parse tree.
- Preceding each production rule
that has n nonterminal symbols in its RHS, are n sets of
lines, each set representing the subtree corresponding to one of the nonterminal symbols,
in the order they appear in the rule.
-
The subtree for each nonterminal ends with a rule whose LHS is the
corresponding nonterminal. The preceding lines in the set are determined by applying
this rule recursively.
6
)
(
EOF
BOF
id
-
3
term
S
expr
S
5
term id
term ( expr )
S BOF expr EOF
expr expr - term
expr term
term id
expr term
term id
expr term
term ( expr )
expr expr - term
term id
expr expr - term
S BOF expr EOF