WEEK09: sorting (and more analysis of algorithms) --------------------------------------------------------------- M: selection sort vs mylist.sort() FROM LAST TIME: - we fixed up anagrams.py to use a binary search - how do anagrams.py and binarysearchanagrams.py compare when given a 7-letter word (like parsley)? how about and 8-letter word (like predator)? (see /home/jk/inclass/binarysearchanagrams.py for details) SORTING: - python has a built-in sort function for lists: >>> L = range(10) >>> print L [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> from random import * >>> shuffle(L) >>> print L [3, 1, 7, 4, 5, 9, 0, 6, 2, 8] >>> L.sort() >>> print L [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] How does L.sort() work, and is it a *good* sort to use??? - how would *you* sort numbers in a list? if you had to write a function to sort a given list, how would you do it? (try it out using some playing cards -- see if you can come up with a sorting algorithm) - here's one idea, called selection sort: * assume we have a list called L * find the smallest number in the list, swap it with L[0] * find the smallest number in the list, except for L[0], and swap it with L[1] * find the smallest number in the list, except for L[0] and L[1], and swap it with L[2] * and so on.... it's called selection sort because each time you are selecting the smallest number from the remaining unsorted elements - here's one way to code selection sort: def selectionSort(L): """ given a list, sort from lowest to highest """ for i in range(len(L)): ismall = i # find index of smallest element of rest of list for j in range(i+1,len(L),1): if L[j] < L[ismall]: ismall = j # swap smallest element with L[i] L[i],L[ismall] = L[ismall],L[i] PYTHON: - this is python's own shorthand notation for swapping two elements in a list: L[i],L[j] = L[j],L[i] what it is really doing is something like this: tmp = L[i] L[i] = L[j] L[j] = tmp YOUR TURN: - copy sorts.py and add the selection sort - try selection sort vs L.sort() for N=5000, 1000, 2000, 4000: * which is faster? * how do the times change as N doubles? * how does selection sort's time depend on N? list size (n): 10 [219, 908, 21, 837, 417, 852, 409, 739, 385, 96] L.sort time: 0.0000 [21, 96, 219, 385, 409, 417, 739, 837, 852, 908] ----------------------------------- [385, 908, 837, 417, 219, 96, 409, 852, 21, 739] selectionSort(L) time: 0.0001 [21, 96, 219, 385, 409, 417, 739, 837, 852, 908] list size (n): 5000 L.sort time: 0.0053 ----------------------------------- selectionSort(L) time: 3.1940 list size (n): 10000 L.sort time: 0.0114 ----------------------------------- selectionSort(L) time: 12.2937 list size (n): 20000 L.sort time: 0.0246 ----------------------------------- selectionSort(L) time: 50.9708