Robot Learning

Kybernetes

ISSN: 0368-492X

Article publication date: 1 July 1999

Citation

Andrew, A.M. (1999), "Robot Learning", Kybernetes, Vol. 28 No. 5, pp. 75-77. https://doi.org/10.1108/k.1999.28.5.75.3

Publisher: Emerald Group Publishing Limited


This is a collection of eight papers, with a brief preface. In the preface, it is pointed out that the topic of robot learning has recently experienced a resurgence of interest, with sessions devoted to it at conferences on AI and on Neural Information Processing. The present volume is not attributed to a specific meeting, and appears to have been assembled by invitation of the editors.

The first paper, by the two editors, gives a useful introduction to the topic, a brief survey of the other contributions, and some background. Three main motivations for studying robot learning are listed. One is that the environment may be too complex for the necessary information to be hard‐wired or programmed into the robot initially. Another is that the information may be unknown in advance, as when the aim is to explore a new planet, and a third, related to this, arises where the environment is changeable. Robot learning has characteristics that distinguish it from other computer learning: it must operate with noisy sensory input, in a stochastic environment, and must react to inputs without undue delay. Learning has to be incremental, in that it enables the robot to improve its performance while actually performing, and in general training time is limited.

The feedback governing learning may be a simple scalar “success” indication, or may be a control feedback as in the “learning with a teacher” situation, or may be more complex still. The case of scalar feedback immediately suggests recent developments in Reinforcement Learning, and several of the papers in the volume make reference to this.
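
As a concrete illustration of the scalar-feedback case, the following sketch shows an incremental tabular update of the kind used in reinforcement learning; the parameter values and function names are illustrative and are not drawn from any of the papers in the volume.

    import random

    # Illustrative tabular reinforcement learner: the only feedback after
    # each action is a scalar reward, as in the "success indication" case.
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # assumed learning parameters

    def choose_action(q, state, actions):
        """Epsilon-greedy choice: mostly exploit, occasionally explore."""
        if random.random() < EPSILON:
            return random.choice(actions)
        return max(actions, key=lambda a: q.get((state, a), 0.0))

    def update(q, state, action, reward, next_state, actions):
        """One incremental Q-learning step driven only by the scalar reward."""
        best_next = max(q.get((next_state, a), 0.0) for a in actions)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)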

The second paper, however, is innovative in other directions. It is about autonomous vehicle guidance and in particular the ALVINN project at Carnegie Mellon University, where the acronym stands for Autonomous Land Vehicle In a Neural Network. The chapter can in fact be regarded as a synopsis of the book by Pomerleau (1993), which was presumably published later than the present volume since no reference is made to it. The project involves the use of neural nets in this practical context with various important new features prompted by its requirements.
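
For readers unfamiliar with the system, published accounts describe ALVINN as a small feedforward network mapping a coarse road image to a bank of output units whose peak indicates the steering direction. The skeleton below reflects only that general shape; the layer sizes follow published descriptions, while the weights shown are placeholders and the training procedure is omitted.

    import numpy as np

    # Skeletal forward pass in the spirit of ALVINN: a 30x32 road image is
    # mapped through a small hidden layer to 30 steering units.  The weights
    # here are random placeholders, not a trained network.
    rng = np.random.default_rng(0)
    W_hidden = rng.normal(scale=0.1, size=(30 * 32, 4))  # input -> hidden
    W_out = rng.normal(scale=0.1, size=(4, 30))          # hidden -> steering

    def steering_unit(image):
        """image: 30x32 array of intensities scaled to [0, 1]."""
        h = np.tanh(image.reshape(-1) @ W_hidden)
        return int(np.argmax(np.tanh(h @ W_out)))  # most active steering unit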

The other papers are of more general applicability. The third is on reinforcement learning with multiple goals. In principle, reinforcement learning methods can accommodate multiple goals by forming a composite reward criterion, but the learning process is then slow and cumbersome. A method is described in which modules are assigned to the separate goals, with suitable coordination by an “arbiter”. Reference is made to the multiple needs of an animal, which must satisfy the goals of getting food, getting water, procreating, caring for young, and so on. Although the connection is not made in the paper, there may be relevance to the operation of the reticular formation of the brainstem as described by Kilmer et al. (1969).
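
The paper’s own coordination algorithm is not reproduced in this review. Purely to illustrate the modular idea, one simple arbitration rule is to let each goal module value actions against its own reward and have the arbiter commit to the action with the greatest combined value; the names and the summation rule below are assumptions, not the authors’ method.

    # Illustrative modular arbitration: each module scores actions against
    # its own goal; the arbiter combines the scores and picks one action.

    class GoalModule:
        def __init__(self, q_table):
            self.q = q_table  # per-goal action values, learned separately

        def value(self, state, action):
            return self.q.get((state, action), 0.0)

    def arbiter(modules, state, actions):
        """Choose the action with the greatest summed value across goals."""
        return max(actions, key=lambda a: sum(m.value(state, a) for m in modules))

    hunger = GoalModule({("s0", "eat"): 1.0, ("s0", "drink"): 0.1})
    thirst = GoalModule({("s0", "eat"): 0.0, ("s0", "drink"): 0.8})
    print(arbiter([hunger, thirst], "s0", ["eat", "drink"]))  # -> eat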

The fourth paper begins with the observation that the new incremental methods of reinforcement learning have fast real‐time performance, whereas classical methods are slow but more accurate because they make full use of the observations. A technique called “prioritised sweeping” is introduced with the aim of getting the best of both worlds. This concentrates computational effort on the most “interesting” parts of the system. Results from simulations are given.
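
In outline, the technique keeps a priority queue of states whose values may be stale and spends a fixed budget of updates per time step on the highest-priority entries; when a state’s value changes appreciably, its predecessors are queued in turn. The sketch below simplifies to state values and a deterministic learned model, rather than reproducing the full algorithm.

    import heapq

    GAMMA, THETA, BUDGET = 0.9, 1e-3, 10  # assumed discount, threshold, budget

    def prioritised_sweep(V, model, predecessors, queue):
        """V: state -> value; model: state -> (reward, next_state);
        predecessors: state -> set of states that can lead to it;
        queue: heap of (-priority, state) entries."""
        for _ in range(BUDGET):
            if not queue:
                break
            _, s = heapq.heappop(queue)
            reward, nxt = model[s]
            new_v = reward + GAMMA * V.get(nxt, 0.0)
            change = abs(new_v - V.get(s, 0.0))
            V[s] = new_v
            if change > THETA:
                # A value moved, so states leading here become "interesting".
                for p in predecessors.get(s, ()):
                    heapq.heappush(queue, (-change, p))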

The remaining four papers all emphasise special features of the robot learning situation, as compared to machine learning in general. The fifth paper is another by the two editors and discusses a number of ways in which learning can be speeded up. The sixth treats the semantic hierarchy in robot learning, in which an abstract “cognitive map” sits at the highest level and must be related by stages to physical robot control. An important point receiving attention here is the transition from low‐speed, friction‐dominated control to higher‐speed, inertia‐dominated control. It is likely that a robot’s initial exploration will be at low speed but its subsequent operation will be faster, placing new demands on sensory information processing and particularly on prediction.
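
The contrast between the two regimes can be made concrete with the limiting models: at low speed the velocity is roughly proportional to the applied force, while at high speed the force sets the acceleration, so the controller must anticipate. The toy integrators below use made-up constants purely for illustration.

    B, M, DT = 2.0, 1.0, 0.01  # assumed friction coefficient, mass, time step

    def step_friction_dominated(x, force):
        """Low speed: velocity tracks the force almost immediately (v = F/B),
        so control can be nearly reactive."""
        return x + (force / B) * DT

    def step_inertia_dominated(x, v, force):
        """High speed: force sets acceleration, position lags the command,
        and stopping in time requires prediction."""
        v = v + (force / M) * DT
        return x + v * DT, v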

The seventh paper is on the learning and updating of map information using uncertain data. The map need not be a topological representation as ordinarily understood but can be a graph of states and transitions in any environment. A conventional map used for spatial navigation is a convenient specific example. A total of 12 theorems is given, relating to convergence rates under each of the four combinations of deterministic/stochastic transitions with deterministic/stochastic observations.
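
The theorems themselves are beyond the scope of this review, but the stochastic-transition, stochastic-observation case can be illustrated by the simplest kind of learner: one that accumulates counts and estimates the transition and observation probabilities empirically. All names below are assumptions for illustration.

    from collections import Counter, defaultdict

    # Minimal empirical model of a graph of states and transitions with
    # noisy observations: probabilities are estimated from counts.
    transitions = defaultdict(Counter)   # (state, action) -> Counter(next_state)
    observations = defaultdict(Counter)  # state -> Counter(observation)

    def record(state, action, next_state, observation):
        transitions[(state, action)][next_state] += 1
        observations[next_state][observation] += 1

    def p_transition(state, action, next_state):
        seen = transitions[(state, action)]
        total = sum(seen.values())
        return seen[next_state] / total if total else 0.0

    def p_observation(state, observation):
        seen = observations[state]
        total = sum(seen.values())
        return seen[observation] / total if total else 0.0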

The final paper is a perceptive review again emphasising the differences between the physical robot situation and the relatively tidy environment of theory and computer simulations. The conclusion is reached that: “The weaknesses of the existing learning techniques, and the versatility of the knowledge necessary to make a robot perform efficiently in the real world, suggest that many concurrent, complementary, and redundant learning methods may be necessary”.

All the papers confirm the complexity of the problem and make significant contributions towards its solution. This is a useful and important book.

References

Kilmer, W.L., McCulloch, W.S. and Blum, J. (1969), “A model of the vertebrate central command system”, Int. J. Man‐Machine Studies, Vol. 1, pp. 279-309.

Pomerleau, D.A. (1993), Neural Network Perception for Mobile Robot Guidance, Kluwer, Boston (reviewed in Kybernetes, Vol. 25 No. 3, 1996, pp. 0‐2).
