Building a Recursive Descent Parser

This is a very basic overview and will not suffice to fully implement a recursive descent parser.

Productions

All production heads are non-terminals, so a function must exist for each non-terminal.

Production Body

For all terminals in the production body, call match(t), for all non-terminals call the associated function.

Example 1

AaBB  cAcBA \rightarrow aBB ~|~cAcB
Bbca  cB \rightarrow bca~|~c

Non-Terminals = {A, B}

Terminals = {a,c,b}

Example 2

expri cond t expr {print"true"} e expr {print"false"}expr \rightarrow i ~cond ~t ~expr ~\{print "true"\} ~e ~expr ~\{print "false"\}
exprϵexpr \rightarrow \epsilon

Derivations

A syntax-tree can be leftmost or rightmost derived; left and right refer to the direction in which non-terminals are expanded.

Notation

Given Aυ, ifwe have αAβ, then aAbaυBGiven ~A \rightarrow \upsilon, ~if we ~have ~\alpha A \beta , ~then ~aAb \Rightarrow a \upsilon \Beta

  • \Rightarrow reads "derives"

  • \Rightarrow^* reads "derives in zero or more steps."

  • +\Rightarrow^+ reads "derives in one or more steps."

  • IfaB and Bυ, thenaυIf a \Rightarrow^*B ~ and ~B \Rightarrow \upsilon, ~then a \Rightarrow^* \upsilon

if Sa, where S is the start symbol, a, is a sentential form of Gif ~S \Rightarrow^* a, ~where ~S ~is ~the ~start ~symbol, ~a, ~is ~a ~sentential ~form ~of ~G

Can contain terminals or non-terminals

If the sentential form contains no non-terminals, it is a Sentence.

This is because no non-terminals means all non-terminals, that, in zero or more steps, were derived to reach the symbol, have been expanded.

All that remains is the yield of the syntax-tree, or a string of terminals

The language of some Grammar G is the set of Sentences.

A language generated by a grammar is a context-free Grammar.

LL(1) Grammar

The first 'L' denotes the order in which input is scanned; in this case, from left to right. The second 'L' denotes the derivation direction and '1' represents the number of input characters viewed at a time.

Thus, LL(1) means we scan from left to right, building a leftmost derivation tree, while observing one character at a time.

LL Grammar is suitable for top-down parsers.

Last updated

Was this helpful?