Q-learning with hidden-unit restarting
| Author(s) : | Charles W. Anderson |
| Publisher : | N/A |
| Publication Date : | 1993 |
| ISSN : | N/A |
| Abstract : | Platt's resource-allocation network (RAN) (Platt, 1991a, 1991b) is modified for a reinforcement-learning paradigm and to "restart" existing hidden units rather than adding new units. After restarting, units continue to learn via back-propagation. The resulting restart algorithm is tested in a Q-learning network that learns to solve an inverted pendulum problem. Solutions are found faster on average with the restart algorithm than without it. |
