- review Quiz 3 (see
`cs21/inclass/w08-searching/q3.py`

) - lab 7 design due this week

Imagine we are searching for some `x`

in a list `L`

.
For example, we could have `x=20`

and `L=[5,60,34,89,16,33]`

.
Different search questions we can ask:

- is
`x`

in the list`L`

? True/False - where is it in the list?
- how many times is it in the list?
- if it is not in the list, what is the closest/best match?

For now, let's just focus on the first question. Python gives us a simple way to answer that:

```
>>> x=20
>>> L=[5,60,34,89,16,33]
>>> print(x in L)
False
```

If we didn't have the `in`

operator, we could easily write this
as a function:

```
def search(x, L):
"""return True if x found in list L"""
for item in L:
if item == x:
return True
# only gets here if not found
return False
```

Part of computer science is analyzing algorithms to see how
good they are. Is the above `search(x,L)`

function better,
worse, or the same as just using `x in L`

? Is there another
way to search for `x`

in `L`

? How can we compare algorithms?

One simple way to compare algorithms is to just time how long
they take on the same data. Another way is to examine the algorithm
and figure out (roughly) how many *steps* it will take to execute. Furthermore,
it is often useful to know *how the number of steps needed depends on
the size of the input data*.

For example, in the above `search(x,L)`

function, if `len(L)`

is
10, then there are 10 comparisons done (`if item == x`

). There are a
few more steps in that function (e.g., returning True or False),
but the `for`

loop is the main part. If we double the size of the
list `L`

, we will double the number of steps needed. This is called
a *linear search*, because there is a linear dependence between the
size of the list and the number of steps needed (or the time it takes
to run). If we were to plot the size of the list vs the number of
steps needed, it would be a straight (linear) line.

*If* the data were sorted from low to high, is there a better way to
search for `x`

in `L`

?

```
L = [5, 16, 33, 34, 60, 89]
```

If we are still searching for `x=20`

, there is no point searching
past the `33`

, since we know the list is sorted from low to high, and
we are already past what we are looking for.

Here is another example: if I say "I am thinking of a number from 1 to 100", and that when you guess, I will tell you if you are too high or too low, what is the first number you would guess?

A linear search would guess one number at a time, from 1 to 100.

A binary search would pick the middle and first guess 50. If that is too low, we can then concentrate on the upper half (51-100), and we have already eliminated half of the list! In a binary search, each guess should split the remaining list in half, and eliminate one half.

I want us to write two search functions: `linearSearch(x,L)`

and
`binarySearch(x,L)`

, which we can use in our other programs. For
example, if we put these two functions in a file called `searches.py`

,
I want to be able to say `from searches import *`

in my other programs
and be able to use those two search functions!

Here is the start of `searches.py`

, with the `linearSearch(x,L)`

function
in it. Note the test code at the bottom of the file:

```
def linearSearch(x,L):
"""return True if x is in L, False if not"""
for item in L:
if item == x:
return True
# only gets here if not found!
return False
if __name__ == "__main__":
# put test code here
L = range(10)
x = 5
print("%d %s %s" % (x, str(L), linearSearch(x,L)))
x = 500
print("%d %s %s" % (x, str(L), linearSearch(x,L)))
```

The `if __name__ == "__main__":`

part just decides if this `searches.py`

file
is being *imported* or *run*. If it is being run, like `python searches.py`

, then
we want to run the test code. If it is being imported, to be used in
another program, then we do not want to run the test code (if we are importing
it, we assume it already works!).

On Wednesday we will add the `binarySearch(x,L)`

function to the `searches.py`

file.