This course focuses on computer perception: using computers to analyze images, sounds, and videos. We will specifically focus on object recognition and multimedia retrieval, but will also look at segmentation, localization, clustering, tracking and other perception tasks.
The first third of this class will follow a lecture-style format that introduces fundamental topics and tools. The remaining two thirds will follow a seminar-style format in which students present academic papers and conduct research.
Professor: Douglas Turnbull
Office: Science Center 255
Phone: (610) 597-6071
Office hours: TBA or by appointment
Room: Science Center Conference Room
Time: Tuesday, Thursday 11:20am–12:35pm
Text: None, but lots of suggested references and weekly readings...
| WEEK | DAY | TOPIC & READING | ANNOUNCEMENTS |
|---|---|---|---|
| 1 | Sep 02 | Motivation & Organization. Reading: Duda, Hart, & Stork (DHS) Ch 1 (Handout) | |
| | Sep 04 | Probability Crash Course. Reading: DHS App. A1-A4 (Handout) | |
| 2 | Sep 09 | Machine Learning Crash Course, Part 1. Reading: Russell & Norvig (RN) Ch 20.1, 20.2 | |
| | Sep 11 | Machine Learning Crash Course, Part 2. Reading: Russell & Norvig (RN) Ch 20.4, 20.6, 20.7, 20.8 | Probability Prob. Set Due |
| 3 | Sep 16 | Matlab Tutorial. Reading: The Science of Scientific Writing by Gopen & Swan | |
| | Sep 18 | Semantic Annotation and Retrieval of Music and Sound Effects by Turnbull, Barrington, Torres, Lanckriet (Audio - Doug) | |
| 4 | Sep 23 | Color Indexing by Swain & Ballard (1991) (Image - Phyo) | |
| | Sep 25 | Content-Based Classification, Search and Retrieval of Audio by Wold, Blum, Keislar, Wheaton (Audio - Anne-Marie - notes) | Machine Learning Prob. Set Due |
| 5 | Sep 30 | Distinctive Image Features from Scale-Invariant Keypoints by Lowe (2004/1999) (Image - Joon) | |
| | Oct 02 | Shape matching and object recognition using shape contexts by Belongie, Malik, Puzicha (2002) (Image - Meggie) | |
| 6 | Oct 07 | Musical Genre Classification of Audio Signals by Tzanetakis and Cook (2002) (Music - Amber) | |
| | Oct 09 | Normalized Cuts and Image Segmentation by Shi, Malik (2000/1997) (Image - Brian) | Proposal Due |
| | Oct 14 | October Holiday | |
| | Oct 16 | | |
| 7 | Oct 21 | Video Google: A text retrieval approach to object matching in videos by Sivic, Zisserman (2003) (Video - Malcolm) | |
| | Oct 23 | A robust mid-level representation for harmonic content in music signals by Bello, Pickens (2005) (Music - Garth) | |
| 8 | Oct 28 | Speaker Verification Using Adapted Gaussian Mixture Models by Reynolds, Quatieri, Dunn (2000) (Speech - Adam) | |
| | Oct 30 | A Large-Scale Evaluation of Acoustic and Subjective Music-Similarity Measures by Berenzweig, Logan, Ellis, Whitman (2004) (Music - Trilok) | Proposal Update Due |
| 9 | Nov 04 | Robust real-time object detection by Viola, Jones (2002/2001) (Image - Matt) | |
| | Nov 06 | Automatic Species Identification of Live Moths by Mayo and Watson (2007) (Image - Malcolm) -or- Combining Cepstral and Prosodic Features in Language Identification by Yin, Ambikairajah, Chen (2006) (Audio - Anne-Marie) | |
| 10 | Nov 11 | Towards Detecting Emotions in Spoken Dialogs by Lee & Narayanan (2005) -or- Recognizing Emotion in Speech by Dellaert (1996) (Speech - Meggie & Adam) | |
| | Nov 13 | 100% Accuracy in Automatic Face Recognition by Jenkins and Burton (2008) (Image - Matt) -or- Hit Song Science is NOT yet a Science by Pachet & Roy (2008) (Music - Garth) | |
| 11 | Nov 18 | Talk by Prof. Youngmoo Kim about research in the Media. Entertainment. Technology Lab at Drexel University (Swarthmore '93). Meet in Hicks 312 (Mural Room) | Manuscripts Due |
| | Nov 20 | Music Similarity Measures: What's the Use? by Aucouturier, Pachet (2002) (Music - Amber) -or- Identifying Words that are Musically Meaningful by Torres, Turnbull, Barrington, Lanckriet (Music - Doug) | |
| 12 | Nov 25 | Scalable Recognition with a Vocabulary Tree by Nister, Stewenius (2006) -or- Sampling Strategies for Bag-of-Feature Image Classification by Nowak, Jurie, Triggs (2006) (Image - Brian & Phyo) | Reviews Due |
| | Nov 27 | Thanksgiving | |
| 13 | Dec 02 | Towards Personalized Image Retrieval by Bissol, Mulhem, Chiaramella (2004) (Image - Joon & Trilok) | |
| | Dec 04 | Swarthmore Computer Perception Conference, Audition Session: Adam/Meggie, Anne-Marie, Amber, Garth | |
| 14 | Dec 09 | Swarthmore Computer Perception Conference, Vision Session: Brian/Phyo, Joon/Trilok, Malcolm, Matt | Final Paper Due |
Academic honesty is required in all work you submit to be graded. With the exception of your lab partner on lab assignments, you may not submit work done with (or by) someone else, or examine or use work done by others to complete your own work. You may discuss assignment specifications and requirements with others in the class to be sure you understand the problem. In addition, you are allowed to work with others to help learn the course material. However, with the exception of your lab partner, you may not work with others on your assignments in any capacity.
All code you submit must be your own, with the following permissible exceptions: code distributed in class, code found in the course textbook, and code worked on with an assigned partner. In these cases, you should always include detailed comments that indicate which parts of the assignment you received help on and what your sources were.
"It is the opinion of the faculty that for an intentional first offense, failure in the course is normally appropriate. Suspension for a semester or deprivation of the degree in that year may also be appropriate when warranted by the seriousness of the offense." - Swarthmore College Bulletin (2007-2008, Section 7.1.2)
Please see me if there are any questions about what is permissible.
Machine Learning and Pattern Recognition
Image and Audio Processing
Matlab - General Info
Matlab - Computer Perception
Other Software (Weka, Matlab, etc.)