Ideas from Elman's Finding Structure in Time (1990)
Simple Recurrent Networks
- An extension of simpler feedforward networks to incorporate time
- Hidden layer activations are copied to a context layer and fed
back into the network on the next time step (see the sketch after this list)
- Using this recurrence, the network can develop a memory of the recent past
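
A minimal sketch of one forward step, assuming a numpy implementation with
tanh hidden units and a linear output; the weight names (W_xh, W_ch, W_hy)
and layer sizes are illustrative, not Elman's exact parameters.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hidden, n_out = 5, 8, 5
    W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
    W_ch = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # context -> hidden
    W_hy = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output

    def step(x, context):
        # The hidden layer sees the current input plus the previous hidden
        # state, which was copied verbatim into the context layer.
        hidden = np.tanh(W_xh @ x + W_ch @ context)
        output = W_hy @ hidden
        # The new hidden activations are handed back to the caller; feeding
        # them in as the next step's context is what gives the net a memory.
        return output, hidden.copy()

    context = np.zeros(n_hidden)
    x = np.zeros(n_in); x[0] = 1.0     # one-hot encoding of the first symbol
    y, context = step(x, context)      # context now reflects what was just seen
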
Finger tapping experiments
- 1 2 3 4 5 1 2 3 4 5 ...
- 1 2 3 4 5 1 3 5 2 4 ...
- 1 2 3 1 2 4 1 2 5 5 ...
- Human subjects have varying degrees of difficulty repeating these
sequences
- Simple recurrent networks show the same ordering of difficulty (exposures
needed to learn each sequence: 250, 1500, and 30000, respectively); a
training sketch follows this list
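
As a sketch only: the loop below trains such a network by next-step
prediction on the first sequence, using Elman's scheme of backpropagating
through the current step while treating the context as a constant input (no
unrolling through time). The learning rate, layer sizes, and squared-error
loss are assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    seq = [0, 1, 2, 3, 4] * 200   # "1 2 3 4 5 1 2 3 4 5 ..." as 0-based indices
    n, n_h, lr = 5, 10, 0.1
    W_xh = rng.normal(scale=0.1, size=(n_h, n))
    W_ch = rng.normal(scale=0.1, size=(n_h, n_h))
    W_hy = rng.normal(scale=0.1, size=(n, n_h))

    def one_hot(i):
        v = np.zeros(n)
        v[i] = 1.0
        return v

    context = np.zeros(n_h)
    for t in range(len(seq) - 1):
        x, target = one_hot(seq[t]), one_hot(seq[t + 1])
        h = np.tanh(W_xh @ x + W_ch @ context)
        y = W_hy @ h
        err = y - target               # gradient of squared error at the output
        dh = (W_hy.T @ err) * (1 - h**2)
        W_hy -= lr * np.outer(err, h)
        W_xh -= lr * np.outer(dh, x)
        W_ch -= lr * np.outer(dh, context)
        context = h.copy()             # hidden activations become next context

The harder sequences differ only in the definition of seq; the exposure
counts above reflect how much longer training takes as the structure becomes
less regular.
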
Discovering the structure in letter sequences
- Streams are built from the syllables ba, dii, guuu: consonants occur at
random, but each consonant determines its vowels
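
A short generation sketch of such a stream (the sequence length is
arbitrary):

    import random

    expand = {"b": "ba", "d": "dii", "g": "guuu"}
    stream = "".join(expand[random.choice("bdg")] for _ in range(1000))
    # Which consonant comes next is random, but once it appears the identity
    # and number of the following vowels are fully determined, so a trained
    # SRN's prediction error falls on vowels and spikes on consonants.
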
Discovering the notion of "word"
Discovering lexical classes from word order
Types, tokens, and structured representations
Summary of Elman's results
- SRNs are trained to predict symbolic sequences
- SRNs create a tight coupling between input and output
- Self-organized internal representations emerge
- Differential error rates indicate that SRNs "recognize" the
sub-structure of sequences (a measurement sketch follows this list)
- Errors are not random but reflect structured representations
- This is a first step towards developing concepts
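
The differential-error point can be illustrated without a trained network.
As a stand-in for the SRN (an assumption, for illustration only), a
count-based model over 3-letter contexts of the letter stream above assigns
near-zero error to the determined vowels and high error to the random
consonants:

    import random
    from collections import Counter, defaultdict

    expand = {"b": "ba", "d": "dii", "g": "guuu"}
    stream = "".join(expand[random.choice("bdg")] for _ in range(5000))

    counts = defaultdict(Counter)
    for t in range(3, len(stream)):
        counts[stream[t - 3:t]][stream[t]] += 1

    # "Error" is 1 minus the probability assigned to the actual next letter;
    # it peaks exactly where the sequence is genuinely unpredictable.
    for t in range(3, 15):
        ctx, nxt = stream[t - 3:t], stream[t]
        p = counts[ctx][nxt] / sum(counts[ctx].values())
        print(f"{ctx} -> {nxt}   error = {1 - p:.2f}")
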
Applying simple recurrent networks to robots
- The real world exists in time
- Robots experience the world through their sensors
- Robots must learn to recognize key events in this stream of
experience
- Unlike the controlled experiments done by Elman, a robot may
experience long periods of essentially static input
- In this case, the tight coupling of inputs and outputs in an SRN
can lead to catastrophic forgetting
- Our proposed solution could be described as a bipartite system: one part
stays coupled to current experience, while the other attempts to
categorize with respect to all experiences (a hypothetical sketch follows
this list)
- Results so far suggest that each part individually has a serious
weakness, but that when trained together they can learn patterns
(and therefore concepts) that an SRN alone cannot
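
The details of the proposed system are not spelled out here, so the
following is a loudly hypothetical sketch of one way the two parts could be
organized: an online learner coupled to the current sensor reading, plus a
second learner rehearsed on samples from a buffer of all stored experience,
so that long static stretches cannot overwrite earlier patterns. The
TinyNet class, the replay buffer, and all names below are assumptions, not
the actual design.

    import random
    from collections import deque

    import numpy as np

    class TinyNet:
        # Stand-in learner: a linear next-input predictor updated by one
        # SGD step at a time (purely illustrative).
        def __init__(self, n, lr=0.01, seed=0):
            self.W = np.random.default_rng(seed).normal(scale=0.1, size=(n, n))
            self.lr = lr

        def train_step(self, x, target):
            err = self.W @ x - target
            self.W -= self.lr * np.outer(err, x)

    replay = deque(maxlen=100_000)     # (input, next input) pairs, all history
    online_net = TinyNet(8, seed=1)    # part 1: coupled to current experience
    category_net = TinyNet(8, seed=2)  # part 2: categorizes across all experience

    def on_sensor_step(x, x_next):
        # Part 1 learns from the present moment only, like a plain SRN.
        online_net.train_step(x, x_next)
        # Part 2 rehearses a mix of present and past, so a long run of
        # static input cannot dominate its weight updates.
        replay.append((x, x_next))
        for old_x, old_next in random.sample(replay, min(32, len(replay))):
            category_net.train_step(old_x, old_next)
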