# knerr cs21 notes...

back to schedule
```
WEEK09: sorting (and more analysis of algorithms)
---------------------------------------------------------------
W: selection sort analysis, start recursion

REVIEW QUIZ 4:

- see /home/jk/inclass/q4.py and make sure you understand it!
- see /home/jk/inclass/q4extra.py and make sure you understand it!

(note how you can change *elements* of a list, and those changes
are seen back in main)

FROM LAST TIME:

- we fixed up anagrams.py to use a binary search
- how do anagrams.py and binarysearchanagrams.py compare
when given a 7-letter word (like parsley)? how about
and 8-letter word (like predator)?

(see /home/jk/inclass/binarysearchanagrams.py for details)

SORTING:

- how does selection sort compare to python's built-in sort?

list size (n): 20000
L.sort time: 0.0246
-----------------------------------
selectionSort(L) time: 50.9708

*** clearly L.sort is faster...why??

- how does selection sort's time depend on the size of the list (N)?
- try our selection sort for N=5000,10000,20000...what happens to
the run time?

ANALYSIS:

- how many "steps" does selection sort require to sort a list?

for i in range(len(L)):    <--- for loop depends on N
ismall = i
# find index of smallest element of rest of list
for j in range(i+1,len(L),1):   <--- another (nested) for loop
if L[j] < L[ismall]:
ismall = j
# swap smallest element with L[i]
L[i],L[ismall] = L[ismall],L[i]

Here's how selection sort works on a sample list of length 7:

[ 9,    50,    -2,     6,    9,    17,    20]
<-- find smallest num, swap to position 0 -->
[ -2,    50,    9,     6,    9,    17,    20]
<-- find smallest, swap to pos  1 -->
[ -2,     6,    9,    50,    9,    17,    20]
<-- find small,  swap to 2 -->
[ -2,     6,    9,    50,    9,    17,    20]
<-- find/swap  to 3 -->
[ -2,     6,    9,    9,    50,    17,    20]
<-- swap to 4 -->
[ -2,     6,    9,    9,    17,    50,    20]
<--  5 -->
[ -2,     6,    9,    9,    17,    20,    50]
<->
[ -2,     6,    9,    9,    17,    20,    50]

See how that looks like an NxN matrix? Actually, it
looks like 0.5*(NxN), but that's still proportional to N-squared!!

Another way to look at it is like this:

- in the first line, you have to scan N elements to find the smallest
- in the second line, you have to scan N-1 elements
- in the third line, you have to scan N-2 elements
- and so on until you scan 3, 2, and 1 element

Summing all of those up can be written
S = N + N-1 + N-2 + ... + 3 + 2 + 1

Now for some "fancy" math fun -- add two of those Sums together,
but write the second one in reverse order:

S = N + N-1 + N-2 + ... + 3  + 2   + 1
+  S = 1 +  2  +  3 + ... + N-2 + N-1 + N
---------------------------------------
2S = N+1 + N+1 + N+1 + ... + N+1 + N+1 + N+1

and note there are N N+1's on the right side of that equation,
so 2S = N*(N+1), or S = 0.5*(N-squared + N)

As N gets very large, the N-squared term is much bigger than
the N term, so this is proportional to N-squared

- something proportional to N-squared is called quadratic,
because if you double N, the "something" quadruples!

- run the /home/jk/inclass/lineplots.py program again to
see the comparison of things proportional to
N (linear) vs NlogN vs N-squared (quad)

RECURSION:

- the python built-in sort uses a divide-and-conquer algorithm,
but before we can learn that algorithm we need to understand
recursive functions: a function that calls itself is recursive

- recursion is most useful when, to solve a problem, you break
the original problem into smaller pieces, but can apply the
same algorithm to the smaller pieces

- a great example is binary search and the number guessing game:

* i'm thinking of a number between 1 and 1000

* if you guess 500, and are told that's not it, but the
answer is less than 500, then you can apply the same
strategy to a smaller problem (i'm thinking of a number
between 1 and 500...you guess 250)

- two things you need to use recursion:

1. a base case, or some way to stop the recursion
2. all chains of recursion need to lead to the base case

for example, in the binary search guessing game, the base case
is when you guess the correct number, and all guesses lead to
smaller and smaller ranges of numbers that contain the base
case (the correct number)

FACTORIAL EXAMPLE:

- the factorial function is defined as follows:

n! = 1 if n=0, and n*((n-1)!) if n > 0

so 0! = 1
1! = 1 * 0! = 1 * 1 = 1
2! = 2 * 1! = 2
3! = 3 * 2! = 6
and so on

You can try this out in python:

>>> from math import *
>>> factorial(3)
6
>>> factorial(4)
24
>>> factorial(5)
120

And the above definition is easily expressed using recursion.

What is the base case??    n = 0
How do we get to a smaller-sized problem??  n! = n * (n-1)!

Will this work???

def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n-1)

- try running /home/jk/inclass/factorialstack.py to see the stack
for the recursive factorial function:

\$ python factorialstack.py

n: 4
-------------------------------
in factorial...stacknum = 1, n = 4
calling 4 * factorial(3)
in factorial...stacknum = 2, n = 3
calling 3 * factorial(2)
in factorial...stacknum = 3, n = 2
calling 2 * factorial(1)
in factorial...stacknum = 4, n = 1
calling 1 * factorial(0)
in factorial...stacknum = 5, n = 0
returning 1
returning 1
returning 2
returning 6
returning 24
factorial(n): 24

- NOTE: you can easily write a factorial function that does not
use recursion. There are, however, some algorithms that are not
easy to write without using recursion.

```