# CS46 — Homework 5

Due Monday 3 March by 3pm
Starting files are available from my public cs46 folder
cd ~/cs46/
cp -r examples/homework/wk5 homework/
cp -r examples/labs/wk5 labs/
git status
git commit -m "week5 start"
git push

You may work with a partner and submit a common solution on this homework.
In lab practice
1. Construct a grammar from the PDA accepting $a^nb^n$ shown in lab.
2. Finish the proof that languages recognized by PDAs can be generated by a context free grammar. The last part of the proof is to show if a string $y$ drives PDA $M$ from state $p$ with an empty stack to a state $q$ with an empty stack, then there is a rule $A_{pq}$ which generates $y$. The proof is by induction on the number of computation steps performed in $M$.
1. Consider the base case of 0 steps. Which substring $y$ can $M$ read from input $w$ in 0 steps? Is there a rule in the grammar that mimics this?
2. Assume that for computations of $\leq k$ steps have rules in the grammar that generate the proper strings, and consider a computation taking $k+1$ steps. In the first step, the transition from $p$ must push a symbol $u$. Consider the two following cases on when the stack becomes empty again. In the first case, the stack is only empty when arriving at $q$. Show that there are transitions in $M$ and rules in $G$ that mimic this behavior. In the second case, assume that the stack becomes empty when arriving at state $r$ along the path from $p$ to $q$. Describe the rules in $G$ that generate $y$ in this condition.
3. Convert the following Grammar to Chomsky Normal Form: $S \rightarrow SS | \varepsilon | () | (S)$
4. Show how the algorithm below determines that (()(())) is in the language described by the grammar above.
Parsing Chomsky Normal Form
Given a grammar $G$ in Chomsky Normal Form, and an input string $w=w_1 w_2 \cdots w_n$, we can design a polynomial time algorithm to determine if $w \in L(G)$. The basic idea is to compute for all substrings $x=w_i \cdots w_j, j \geq i$ of $w$ if there is some rule in $G$ that generates $x$. Define $V_{i,i+s}$ to be the set of all variables in $V \in G$ that that can derive $x$. $G$ can derive $x$ if $V_{1,n}$ contains the start symbol $S \in V$

Our algorithm for parsing strings computes the sets $V_{i,j}$ for $1 \leq i \leq n, j \geq i$. The algorithm proceeds in a series of rounds, where in each round, it considers substrings of length $s, 1 \leq s \leq n$. In the first round, the algorithm sets $V_{i,i}$ to $\{ A \in V, A \rightarrow w_i$ for some rule $A$ in the grammar $\}$.

For substrings longer than $1$, the algorithm checks if the substring $x=x_i \cdots x_{i+s-1}$ can be broken into two smaller pieces at some index $k$, $y=x_i \cdots x_k$ and $z=x_{k+1} \cdots x_{i+s-1}$ such that there is a rule $A \rightarrow BC$ in the grammar where $B \in V_{i,k}$ and $C \in V_{k+1,i+s-1}$. If such a rule $A$ exists, we add $A$ to $V_{i,i+s-1}$.

After finishing all rounds, we just need to check if $S \in V_{i,n}$. A pseudocode summary is below

for i in 1 to n:
Add A to V[i,i] if A yields w[i] is a rule
for s in 1 to n-1:
for i in 1 to n-s:
for k in i to i+s-1:
if A yields BC is a rule
where B is in V[i,k] and C is in V[k+1,i+s]
return True if S is in V[1,n]

Some example output for a CNF grammar for $a^nb^n$.
$cat grammar1.txt a b S: E S: A1 B S: A B A1: A S A: a B: b$ python cnf.py grammar1.txt string1.txt
aabb: True
abab: False
a: False
aaaabbbb: True
ab: True

For the balanced parens
$cat grammar2.txt ( ) S: E S: A1 B S: A B S: S S A1: A S A: ( B: )$ python cnf.py grammar2.txt string2.txt

(): True
E: True
()(): True
(()()): True
)(: False
(((()): False



The grammar format starts with a line containing the symbols of the alphabet separated by spaces, a blank line, and then the grammar rules. The first rule lists the start symbol on the left hand side. The rule format is V: A B to indicate the rule $V \rightarrow AB$. Note A1 is a single variable not two variables. Variables on the right hand side are separated by spaces.

Remember to run git add, git commit, and git push to submit your assignments. If you are working with a partner, both you and your partner need to hand in the lab, but I will grade the most recent push. Include both names at the top of your homework.