Spring 2018

Welcome to CS46. This is an upper level Computer Science Course. There a number of administrative things to get out of the way before we dive into course content. As a student, it is your responsibility to review and complete the following checklist.

- Complete the sign-in sheet with your preferred name, preferred pronouns, and confirm that your lab section is correct. If you are trying to add the course, list your preferred lab section (include both if you are flexible).
- Read over the course Syllabus, including all sections linked on the navigation bar.
- Create or sign into your Piazza Account. Note, I have created piazza accounts for students on the waitlist. This does not mean you have been officially added to the course.
- Purchase the textbook and read chapter 0 by Thursday.
- If you do not have a clicker, purchase one.
- Register your clicker with this course.
- Confirm that you can log into your Swarthmore ITS account and your Swarthmore CS account. (There may be a delay in creating/reactivating these accounts for TriCo students)
- Confirm that you remember your ssh-key password for github, or generate and install a new key pair.

This course has a lab section. Attendance is mandatory. Sometimes we will have small programming exercises. Sometimes we will review or explore lecture material more in depth. Sometimes we will practice problems similar to those that will be assigned as written homework to be submitted and graded.

The course will have two midterm exams and a final exam. Midterms will be during the lab section and I will try to have them scheduled soon. The registrar's office schedules the final exam time sometime between May 10th and May 17th. The schedule is not released until March 1st. Do not arrange for travel during finals week until you know the final exam date.

What is Theory of Computation (TOC)?

Two fundamental questions of Computer Science:
- What can be computed?
- How efficiently can something be computed?

Algorithms, e.g., CS41 looks more closely at the second question and refines the broad categories of primarily the complexity class ${\tt P}$ into finer classes e.g., $O(1), O(\log n), O(n), O(n^2)$

Both courses answer these questions in the context of a mathematical *model of computation*. The goal of the model is to abstract away the core concepts of computation without needing to consider a specific hardware setup.

- What are some key components of a computing device?
- Are there problems an android phone cannot solve that a cloud computing platform could solve?
- What are the course set of tasks involved in
*computing*? - Is the von Neumann architecture the right model?
- Can it be simplified and have the same computing power?
- Should it be expanded? Is it limiting what we could compute?

- Define model
- Explore what model can compute
- Prove that some things cannot be computed in model
- Expand model (go back to step 1)

Sets, ordered tuples

A *set* is a collection of unordered, distinct objects, e.g., $S=\{ a, b, c, d \}$.

- set membership: $a \in S, x \not \in S$
- special sets:
- Empty set: $\emptyset$
- Natural numbers: $\mathbb{N}=\{ 0,1,2,3,... \}$
- Integers: $\mathbb{Z}$
- Reals: $\mathbb{R}$
- Rationals: $\mathbb{Q}=\{ \frac{a}{b} : a,b \in \mathbb{Z}, b \neq 0 \}$

- Size of set: $|A|$
- Subset: $A \subseteq B$, proper subset: $A \subset B$
- Superset: $A \supseteq B$, proper super set: $A \supset B$
- Set $A = B$ iff $\forall a \in A, a \in B$ and $\forall b \in B, b \in A$
- Union: $A \cup B = \{ x: x \in A \text{ or } x \in B \}$
- Intersection: $A \cap B = \{ x: x \in A \text{ and } x \in B \}$
- Set minus: $A \setminus B = \{ x: x \in A \text{ and } x \not \in B \}$

True or False?

- $ \emptyset \subseteq \emptyset$
- $ \emptyset \in \emptyset$
- $ \emptyset \in \{ \emptyset \}$
- $ \emptyset \subseteq \{ \emptyset \}$
- $ \{ \{ a, b \} \} \in 2^{ \{ a, b , \{ a, b \} \} }$

- $2^{ \{ 7, 8, 9 \} } \setminus 2^{ \{ 7, 9 \} } = $?
- $2^\emptyset = $?

Ordered pairs are often formed by the Cartesian product of two sets. Let $A$ and $B$ be two sets. Their Cartesian product $A \times B$ is the set of all ordered pairs $(a, b)$ with $a \in A$ and $b \in B$, e.g., $\{ a_1, a_2 \} \times \{ b_1 \} = \{ (a_1, b_1), (a_2, b_1) \}$

Relations and functions

A *Binary relation* on two sets $A$ and $B$ is simply a subset of $A \times B$

- $\{ (1, b), (1, c) \} $ is a relation on $ \{ 1, 3, 9 \}$ and $\{ a, b, c \}$
- $\{ (i,j): i,j \in \mathbb{N} \text{ and } i < j \} $ is the
*less than*relation

A *Function* from a set $A$ to a set $B$ is a binary relation $R$ with the property that for every $a \in A$ there is **exactly one** $b \in B$ with $(a,b) \in R$. We say the function $f$ maps the *domain* of $f, A$ to the *co-domain* of $f, B$, or $f: A \mapsto B$. We will often use the notation $f(a)=b$ to mean $(a, b) \in f \subseteq A \times B$.

Are the following functions, relations, or neither?

- $\{ (a, b) : b = a^2 \}$
- $\{ (a, b) : a \geq b \}$
- $\{ (a, b) : b^2 = a \}$

We will occasionally classify functions as *one-to-one*, *onto* or a *bijection*. A function $f: A \mapsto B$ is one-to-one if for two distinct elements $a, a' \in A, f(a) \neq f(a')$. Note this implies $|A| \leq |B|$. A function $f$ is onto if for every $b \in B$ there is some $a \in A$ such that $f(a) = b$. $f$ is a bijection if $f$ is one-to-one and onto.

The inverse of a binary relation $R \subseteq A \times B$, denoted $R^{-1}$ is a subset of $B \times A$ such that $(b,a) \in R^{-1}$ iff $(a, b) \in R$. The inverse of a relation is a relation. Is the inverse of a function a relation? Is it a function? What if $f$ is one-to-one? What if it is onto? A bijection?

Alphabets and Languages

An A *string* $w$ is a finite sequence of symbols over an alphabet, e.g, $w_1=baa$ and $w_2=a$ are strings over the alphabet $\Sigma = \{ a, b \}$. The length of the string $w$ is written as $|w|$. The empty string $\varepsilon$ has length 0.

A *language* $L$ is a set of strings from an alphabet $\Sigma$, The set of all strings over an alphabet $\Sigma$ is denoted $\Sigma^*$. Note that while $\Sigma^*$ consists of finite length strings, the size of $\Sigma^*$ is infinite. $L=\Sigma^*$ is an example of an infinite language.

Even though a language is a set and sets are unordered, sometimes we will list the strings in a modified lexicographical order which the book calls *shortlex* order. In this order, shorter strings precede longer strings. When two strings are the same length, shortlex order uses the standard lexicographical/dictionary order to compare the strings. To use some earlier defined terminology, we can say the shortlex relation is a relation $R$ on strings $w \in L$ where $(w_1, w_2) \in R$ if $w_1 < w_2$ in shortlex order.

Enough terminology! Why?

To explore the limits of computation models, we will focus on the following types of computational problems. Suppose we are given a language $L \subseteq \Sigma^*$. Can we design a computer program or machine $M(L)$ that identifies/recognizes/accepts only those inputs which are strings in $L$ and rejects all other inputs? Think of this machine $M(L)$ as function that takes as input a single string $w \in \Sigma^*$ over a known alphabet and, after some computation, outputs either a While this might seem like a gross oversimplification of what computers can do, it gives us a starting point to compare models and analyze their limits. Plus there are some applications in which recognizing patterns is important.

- Is http://www.google.com a valid URL?
- Is ((()())()))) a set of balanced parentheses?
- Does this document contain only valid words from a given dictionary?

Infinite sets

Given a finite alphabet $\Sigma$, the set of all strings $\Sigma^*$ is countably infinite. A computer program must have some finite description over some alphabet. So we can think of a computer program as nothing more than a single string over some alphabet. So how many programs can we have? That is certainly no more than the total number of strings over the program alphabet, so that must be *countably infinite*. Each program recognizes one language. But how many languages are there? Each language is a subset of $\Sigma^*$ and the set of all possible subsets of $\Sigma^*$ is $2^{\Sigma^*}$. A short diagonalization argument should convince you that the number of languages is *uncountably infinite*. This means there are way more languages than programs, so there must be some language $L$ for which there is no program $M$ such that $M$ accepts $L$. In essence, there are some things that are *uncomputable* using a finite description of a computer program/algorithm. Hmm. This raises a number of questions.

- What does a computational model look like?
- What does a program in a computation model look like?
- What does an uncomputable language in a given model look like?
- What does Marsellus Wallace look like?
- What does the Marcellus shale look like?
- What does Mark Wallace look like?
- Are these the droids you are looking for?