WEEK09: sorting (and more analysis of algorithms) --------------------------------------------------------------- W: selection sort analysis, start recursion REVIEW QUIZ 4: - see /home/jk/inclass/q4.py and make sure you understand it! - see /home/jk/inclass/q4extra.py and make sure you understand it! (note how you can change *elements* of a list, and those changes are seen back in main) FROM LAST TIME: - we fixed up anagrams.py to use a binary search - how do anagrams.py and binarysearchanagrams.py compare when given a 7-letter word (like parsley)? how about and 8-letter word (like predator)? (see /home/jk/inclass/binarysearchanagrams.py for details) SORTING: - how does selection sort compare to python's built-in sort? list size (n): 20000 L.sort time: 0.0246 ----------------------------------- selectionSort(L) time: 50.9708 *** clearly L.sort is faster...why?? - how does selection sort's time depend on the size of the list (N)? - try our selection sort for N=5000,10000,20000...what happens to the run time? ANALYSIS: - how many "steps" does selection sort require to sort a list? for i in range(len(L)): <--- for loop depends on N ismall = i # find index of smallest element of rest of list for j in range(i+1,len(L),1): <--- another (nested) for loop if L[j] < L[ismall]: ismall = j # swap smallest element with L[i] L[i],L[ismall] = L[ismall],L[i] Here's how selection sort works on a sample list of length 7: [ 9, 50, -2, 6, 9, 17, 20] <-- find smallest num, swap to position 0 --> [ -2, 50, 9, 6, 9, 17, 20] <-- find smallest, swap to pos 1 --> [ -2, 6, 9, 50, 9, 17, 20] <-- find small, swap to 2 --> [ -2, 6, 9, 50, 9, 17, 20] <-- find/swap to 3 --> [ -2, 6, 9, 9, 50, 17, 20] <-- swap to 4 --> [ -2, 6, 9, 9, 17, 50, 20] <-- 5 --> [ -2, 6, 9, 9, 17, 20, 50] <-> [ -2, 6, 9, 9, 17, 20, 50] See how that looks like an NxN matrix? Actually, it looks like 0.5*(NxN), but that's still proportional to N-squared!! Another way to look at it is like this: - in the first line, you have to scan N elements to find the smallest - in the second line, you have to scan N-1 elements - in the third line, you have to scan N-2 elements - and so on until you scan 3, 2, and 1 element Summing all of those up can be written S = N + N-1 + N-2 + ... + 3 + 2 + 1 Now for some "fancy" math fun -- add two of those Sums together, but write the second one in reverse order: S = N + N-1 + N-2 + ... + 3 + 2 + 1 + S = 1 + 2 + 3 + ... + N-2 + N-1 + N --------------------------------------- 2S = N+1 + N+1 + N+1 + ... + N+1 + N+1 + N+1 and note there are N N+1's on the right side of that equation, so 2S = N*(N+1), or S = 0.5*(N-squared + N) As N gets very large, the N-squared term is much bigger than the N term, so this is proportional to N-squared YOUR TURN: - something proportional to N-squared is called quadratic, because if you double N, the "something" quadruples! - run the /home/jk/inclass/lineplots.py program again to see the comparison of things proportional to N (linear) vs NlogN vs N-squared (quad) RECURSION: - the python built-in sort uses a divide-and-conquer algorithm, but before we can learn that algorithm we need to understand recursive functions: a function that calls itself is recursive - recursion is most useful when, to solve a problem, you break the original problem into smaller pieces, but can apply the same algorithm to the smaller pieces - a great example is binary search and the number guessing game: * i'm thinking of a number between 1 and 1000 * if you guess 500, and are told that's not it, but the answer is less than 500, then you can apply the same strategy to a smaller problem (i'm thinking of a number between 1 and 500...you guess 250) - two things you need to use recursion: 1. a base case, or some way to stop the recursion 2. all chains of recursion need to lead to the base case for example, in the binary search guessing game, the base case is when you guess the correct number, and all guesses lead to smaller and smaller ranges of numbers that contain the base case (the correct number) FACTORIAL EXAMPLE: - the factorial function is defined as follows: n! = 1 if n=0, and n*((n-1)!) if n > 0 so 0! = 1 1! = 1 * 0! = 1 * 1 = 1 2! = 2 * 1! = 2 3! = 3 * 2! = 6 and so on You can try this out in python: >>> from math import * >>> factorial(3) 6 >>> factorial(4) 24 >>> factorial(5) 120 And the above definition is easily expressed using recursion. What is the base case?? n = 0 How do we get to a smaller-sized problem?? n! = n * (n-1)! Will this work??? def factorial(n): if n == 0: return 1 else: return n * factorial(n-1) - try running /home/jk/inclass/factorialstack.py to see the stack for the recursive factorial function: $ python factorialstack.py please enter a positive integer... n: 4 ------------------------------- in factorial...stacknum = 1, n = 4 calling 4 * factorial(3) in factorial...stacknum = 2, n = 3 calling 3 * factorial(2) in factorial...stacknum = 3, n = 2 calling 2 * factorial(1) in factorial...stacknum = 4, n = 1 calling 1 * factorial(0) in factorial...stacknum = 5, n = 0 returning 1 returning 1 returning 2 returning 6 returning 24 factorial(n): 24 - NOTE: you can easily write a factorial function that does not use recursion. There are, however, some algorithms that are not easy to write without using recursion.