knerr cs21 notes...

back to schedule
```
WEEK09: sorting (and more analysis of algorithms)
---------------------------------------------------------------
M: selection sort vs mylist.sort()

FROM LAST TIME:

- we fixed up anagrams.py to use a binary search
- how do anagrams.py and binarysearchanagrams.py compare
when given a 7-letter word (like parsley)? how about
and 8-letter word (like predator)?

(see /home/jk/inclass/binarysearchanagrams.py for details)

SORTING:

- python has a built-in sort function for lists:

>>> L = range(10)
>>> print L
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> from random import *
>>> shuffle(L)
>>> print L
[3, 1, 7, 4, 5, 9, 0, 6, 2, 8]
>>> L.sort()
>>> print L
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

How does L.sort() work, and is it a *good* sort to use???

- how would *you* sort numbers in a list? if you had to
write a function to sort a given list, how would you do it?

(try it out using some playing cards -- see if you can come
up with a sorting algorithm)

- here's one idea, called selection sort:

* assume we have a list called L
* find the smallest number in the list, swap it with L[0]
* find the smallest number in the list, except for L[0],
and swap it with L[1]
* find the smallest number in the list, except for L[0]
and L[1], and swap it with L[2]
* and so on....

it's called selection sort because each time you are selecting
the smallest number from the remaining unsorted elements

- here's one way to code selection sort:

def selectionSort(L):
"""
given a list, sort from lowest to highest
"""
for i in range(len(L)):
ismall = i
# find index of smallest element of rest of list
for j in range(i+1,len(L),1):
if L[j] < L[ismall]:
ismall = j
# swap smallest element with L[i]
L[i],L[ismall] = L[ismall],L[i]

PYTHON:

- this is python's own shorthand notation for swapping
two elements in a list:

L[i],L[j] = L[j],L[i]

what it is really doing is something like this:

tmp = L[i]
L[i] = L[j]
L[j] = tmp

- copy sorts.py and add the selection sort
- try selection sort vs L.sort() for N=5000, 1000, 2000, 4000:

* which is faster?
* how do the times change as N doubles?
* how does selection sort's time depend on N?

list size (n): 10
[219, 908, 21, 837, 417, 852, 409, 739, 385, 96]
L.sort time: 0.0000
[21, 96, 219, 385, 409, 417, 739, 837, 852, 908]
-----------------------------------
[385, 908, 837, 417, 219, 96, 409, 852, 21, 739]
selectionSort(L) time: 0.0001
[21, 96, 219, 385, 409, 417, 739, 837, 852, 908]

list size (n): 5000
L.sort time: 0.0053
-----------------------------------
selectionSort(L) time: 3.1940

list size (n): 10000
L.sort time: 0.0114
-----------------------------------
selectionSort(L) time: 12.2937

list size (n): 20000
L.sort time: 0.0246
-----------------------------------
selectionSort(L) time: 50.9708

```