# Week 8, Monday: Searching

• review Quiz 3 (see `cs21/inclass/w08-searching/q3.py`)
• lab 7 design due this week

### searching

Imagine we are searching for some `x` in a list `L`. For example, we could have `x=20` and `L=[5,60,34,89,16,33]`. Different search questions we can ask:

• is `x` in the list `L`? True/False
• where is it in the list?
• how many times is it in the list?
• if it is not in the list, what is the closest/best match?

For now, let's just focus on the first question. Python gives us a simple way to answer that:

``````>>> x=20
>>> L=[5,60,34,89,16,33]
>>> print(x in L)
False
``````

If we didn't have the `in` operator, we could easily write this as a function:

``````def search(x, L):
"""return True if x found in list L"""
for item in L:
if item == x:
return True

return False
``````

### Analysis of Algorithms

Part of computer science is analyzing algorithms to see how good they are. Is the above `search(x,L)` function better, worse, or the same as just using `x in L`? Is there another way to search for `x` in `L`? How can we compare algorithms?

One simple way to compare algorithms is to just time how long they take on the same data. Another way is to examine the algorithm and figure out (roughly) how many steps it will take to execute. Furthermore, it is often useful to know how the number of steps needed depends on the size of the input data.

For example, in the above `search(x,L)` function, if `len(L)` is 10, then there are 10 comparisons done (`if item == x`). There are a few more steps in that function (e.g., returning True or False), but the `for` loop is the main part. If we double the size of the list `L`, we will double the number of steps needed. This is called a linear search, because there is a linear dependence between the size of the list and the number of steps needed (or the time it takes to run). If we were to plot the size of the list vs the number of steps needed, it would be a straight (linear) line.

### binary search

If the data were sorted from low to high, is there a better way to search for `x` in `L`?

``````L = [5, 16, 33, 34, 60, 89]
``````

If we are still searching for `x=20`, there is no point searching past the `33`, since we know the list is sorted from low to high, and we are already past what we are looking for.

Here is another example: if I say "I am thinking of a number from 1 to 100", and that when you guess, I will tell you if you are too high or too low, what is the first number you would guess?

A linear search would guess one number at a time, from 1 to 100.

A binary search would pick the middle and first guess 50. If that is too low, we can then concentrate on the upper half (51-100), and we have already eliminated half of the list! In a binary search, each guess should split the remaining list in half, and eliminate one half.

### importing our own modules

I want us to write two search functions: `linearSearch(x,L)` and `binarySearch(x,L)`, which we can use in our other programs. For example, if we put these two functions in a file called `searches.py`, I want to be able to say `from searches import *` in my other programs and be able to use those two search functions!

Here is the start of `searches.py`, with the `linearSearch(x,L)` function in it. Note the test code at the bottom of the file:

``````def linearSearch(x,L):
"""return True if x is in L, False if not"""

for item in L:
if item == x:
return True

return False

if __name__ == "__main__":
# put test code here
L = range(10)
x = 5
print("%d %s %s" % (x, str(L), linearSearch(x,L)))
x = 500
print("%d %s %s" % (x, str(L), linearSearch(x,L)))
``````

The `if __name__ == "__main__":` part just decides if this `searches.py` file is being imported or run. If it is being run, like `python searches.py`, then we want to run the test code. If it is being imported, to be used in another program, then we do not want to run the test code (if we are importing it, we assume it already works!).

On Wednesday we will add the `binarySearch(x,L)` function to the `searches.py` file.