Active vision for robotic manipulation

Industrial Robot

ISSN: 0143-991x

Article publication date: 2 March 2012


Citation

(2012), "Active vision for robotic manipulation", Industrial Robot, Vol. 39 No. 2. https://doi.org/10.1108/ir.2012.04939baa.001

Publisher: Emerald Group Publishing Limited

Copyright © 2012, Emerald Group Publishing Limited


Active vision for robotic manipulation

Article Type: Viewpoint. From: Industrial Robot: An International Journal, Volume 39, Issue 2.

Autonomous robotic manipulation is attractive in practice for complex tasks with industrial or service robots. The manipulation strategy is expected to be content based and task driven. To deal with the uncertainty of the environment and the complexity of the task, active vision can play an important role in observing the status of the scene and perceiving rich information. Knowledge of the three-dimensional (3D) scene and objects is clearly useful for analyzing environmental structure, target shape and motion in engineering applications. Some service robots also need to acquire knowledge or experience automatically from human manipulation, enabling automated robot programming via operator demonstration. In a wide range of robotic systems, grasping is one of the basic skills crucial to dexterous manipulation tasks and interaction with the environment. In most industrial applications, the problem of grasping is solved via teaching-by-doing or static programs. However, in light of recent research developments, on-site solutions and dynamic programming will play a very important role. New techniques are required for robots to operate in uncharted and unknown environments.

Active vision embodies the idea of moving or reconfiguring sensors to constrain interpretation of the scene. Since multiple 3D images are taken from different viewpoints and integrated so that all areas of interest can be scanned, a sensing strategy that determines the viewing positions and sensing parameters becomes critically important for achieving full automation and high efficiency. Sensor planning can now be found in most autonomous robotic systems. The goal is to gain knowledge about the unseen portions of the object while satisfying the sensor-environment constraints. For 3D sensing, the robot is equipped with 3D vision sensors that have a number of degrees of freedom for moving to a specific viewpoint in a defined finite space. A practical vision sensor usually has a number of constraints on its data-acquisition range; it therefore has to take many different views to obtain all the necessary information about the target. The sensor can be controlled to look at certain areas of the target, and its structure can be reconfigured to gaze at features of interest. The difficulty here is that each view must be decided dynamically, because a complete geometrical description of the target is unavailable. Partially predictable information may be used to hypothesize the unseen target shape and to determine the next view.
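
This view-determination step can be made concrete as a greedy next-best-view rule. The Python sketch below is illustrative only and is not the article's method: the functions `visible_unseen` and `next_best_view`, the yaw-only field-of-view test, and all thresholds are assumptions for demonstration.

```python
import math

# Hypothetical next-best-view sketch: candidate viewpoints are scored by
# how many still-unseen surface patches they would cover, penalized by the
# motion cost of reaching them. A viewpoint is (x, y, z, yaw_deg).

def visible_unseen(viewpoint, unseen_patches, max_range=1.5, fov_deg=60.0):
    """Count unseen patches inside the sensor's range and (yaw-only) field of view."""
    px, py, pz, yaw = viewpoint
    count = 0
    for (x, y, z) in unseen_patches:
        dx, dy, dz = x - px, y - py, z - pz
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        if dist > max_range or dist < 1e-6:
            continue
        bearing = math.degrees(math.atan2(dy, dx)) - yaw
        if abs((bearing + 180) % 360 - 180) <= fov_deg / 2:  # wrap to [-180, 180)
            count += 1
    return count

def next_best_view(candidates, unseen_patches, current, motion_weight=0.5):
    """Pick the candidate maximizing information gain minus motion cost."""
    def cost(v):
        return math.dist(current[:3], v[:3])
    return max(candidates,
               key=lambda v: visible_unseen(v, unseen_patches)
                             - motion_weight * cost(v))

if __name__ == "__main__":
    unseen = [(1.0, 0.0, 0.5), (1.1, 0.2, 0.5), (0.0, 1.0, 0.5)]
    cands = [(0.0, 0.0, 1.0, 0.0), (0.5, 0.5, 1.0, 45.0), (-1.0, 0.0, 1.0, 180.0)]
    print(next_best_view(cands, unseen, current=(0.0, 0.0, 1.0, 0.0)))
```

Because the target's full geometry is unknown, such a rule is re-evaluated after each scan: newly observed patches leave the unseen set and the scores change accordingly.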

For identification and manipulation of possible objects, the robot must first interpret the scene. Once interpretation has been performed and the objects, as well as their positions in 3D space, have been detected, the robot can move to a position in the scenario from where it can observe the target well. Environmental interpretation, as well as manipulation of an object in a complex scene, inevitably requires access to geometrical 3D information. Since no prior information about the targets is available in an unfamiliar environment before the vision task is carried out, the robot should be capable of purposively deciding how to finish the task without human interference. Relying solely on spatial data, visual understanding and robotic manipulation cannot be very intelligent. For semantic scene interpretation, labeling can be applied to mark meaningful structures. Converting from source image data to geometrical shapes makes the scene understandable, and converting from geometrical shapes to a semantic representation makes it much more understandable to the robot. By constructing a geometrical map and a semantic map, knowledge of spatial relationships can be used for reasoning.
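
As a toy illustration of pairing a geometrical map with a semantic map, the sketch below attaches labels to recovered shapes and reasons about one spatial relation. The entities, labels, and the `on_top_of` predicate are hypothetical, not drawn from the article.

```python
from dataclasses import dataclass

@dataclass
class GeometricEntity:
    name: str          # identifier in the geometric map
    centroid: tuple    # (x, y, z) in metres
    extent: tuple      # axis-aligned bounding-box size (dx, dy, dz)

# Geometric map: raw shapes recovered from 3D data (values illustrative).
geometric_map = [
    GeometricEntity("plane_0", (0.0, 0.0, 0.75), (1.2, 0.8, 0.02)),
    GeometricEntity("blob_1", (0.1, 0.0, 0.82), (0.08, 0.08, 0.12)),
]

# Semantic map: labels assigned to geometric entities, here hard-coded
# where a real system would use a classifier.
semantic_map = {"plane_0": "table", "blob_1": "cup"}

def on_top_of(a: GeometricEntity, b: GeometricEntity, tol=0.05) -> bool:
    """Spatial relation: a rests on b if a's base meets b's top surface
    and a's centroid lies within b's horizontal footprint."""
    a_base = a.centroid[2] - a.extent[2] / 2
    b_top = b.centroid[2] + b.extent[2] / 2
    within_x = abs(a.centroid[0] - b.centroid[0]) < b.extent[0] / 2
    within_y = abs(a.centroid[1] - b.centroid[1]) < b.extent[1] / 2
    return within_x and within_y and abs(a_base - b_top) < tol

# Reasoning over both maps yields statements like "the cup is on the table".
for a in geometric_map:
    for b in geometric_map:
        if a is not b and on_top_of(a, b):
            print(f"{semantic_map[a.name]} is on {semantic_map[b.name]}")
```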

Active vision offers the advantage of resolving occlusion and uncertainty on site. In planning for arbitrary manipulation tasks, the robot often has to work in a dynamic environment, and the sensing system may be subject to noise and uncertainty. Research on this issue has long been active in the field, but no complete solution seems likely in the near future. On the other hand, data fusion is powerful for making reliable decisions. Multiple data sources are often available in a robotic system. When more than one video camera, range sensor, sonar, infrared sensor, ultrasound sensor, global positioning system, compass, inertial measurement unit, odometer, etc. are used together, robotic manipulation can be made more reliable by data fusion.
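
The article does not prescribe a fusion method; one common and simple choice is inverse-variance weighting of independent estimates, the static scalar special case of a Kalman update, sketched below with illustrative sensor values.

```python
# Minimal data-fusion sketch: combine two independent noisy measurements
# of the same quantity, weighting each by the inverse of its variance.

def fuse(estimate_a, var_a, estimate_b, var_b):
    """Inverse-variance weighted fusion of two scalar estimates.

    The fused variance 1/(1/var_a + 1/var_b) is never larger than the
    better sensor's variance, which is why fusion improves reliability.
    """
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused = (w_a * estimate_a + w_b * estimate_b) / (w_a + w_b)
    fused_var = 1.0 / (w_a + w_b)
    return fused, fused_var

# Example: a camera-based depth estimate fused with a range-sensor reading.
depth_camera, var_camera = 1.02, 0.04   # metres, variance in m^2
depth_ranger, var_ranger = 0.98, 0.01
print(fuse(depth_camera, var_camera, depth_ranger, var_ranger))
# -> (0.988, 0.008): pulled toward the more certain range sensor.
```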

When autonomous robots work in complex environments, fixed component structures are not capable of dealing with all situations. Flexible design makes the system reconfigurable during task execution, and a self-calibration procedure is performed whenever reconfiguration happens. Researchers are clearly aware of this issue, but progress in implementing such devices has been slow due to high cost. Besides the hardware mechanism, software for control and recalibration has to be developed concurrently.
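
A minimal sketch of this recalibrate-on-reconfiguration pattern follows, assuming a hypothetical `ActiveSensor` wrapper; a real self-calibration routine would re-estimate the sensor's intrinsic and extrinsic parameters rather than merely caching the configuration.

```python
class ActiveSensor:
    """Hypothetical reconfigurable sensor whose calibration is tied to
    its current mechanical configuration."""

    def __init__(self):
        self.config = {"baseline_m": 0.10, "tilt_deg": 0.0}
        self.calibration = self._self_calibrate()

    def _self_calibrate(self):
        # Placeholder: a real routine would estimate calibration
        # parameters, e.g. from a known pattern or scene constraints.
        return {"valid_for": dict(self.config)}

    def reconfigure(self, **changes):
        self.config.update(changes)
        # Reconfiguration invalidates the old calibration, so recalibrate
        # before the next measurement is trusted.
        self.calibration = self._self_calibrate()

    def measure(self):
        assert self.calibration["valid_for"] == self.config
        return "3D data acquired with current calibration"

sensor = ActiveSensor()
sensor.reconfigure(tilt_deg=15.0)  # triggers self-calibration
print(sensor.measure())
```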

Another factor considered in active vision is constraint satisfaction. Vision sensors and robotic manipulators are subject to many constraints. For example, when planning to observe the target, the sensor position, orientation, and other settings need to be determined. The robot should be moved to an optimal placement that is feasible in the practical environment. This involves many sensor-environment constraints, e.g. visibility, resolution, viewing distance or focus, field of view, overlap, viewing angle, occlusion, reachability, collision, operation cost, etc. The objective is to generate a plan that satisfies all the constraints at the lowest operation cost. The conditions that constrain the manipulator and sensors to be placed in the acceptable space must be formulated. In the literature, one method employs a function that combines several components representing the sensing constraints, which is especially useful in model-based manipulation tasks. In unknown environments, a tessellated-space approach that produces a viewpoint space or look-up array is effective for small and simple objects, but no convincing method is yet available for modeling a large and complex scene. All constraints have to be solved at run time.
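
The "combined function" idea from the literature can be sketched as a weighted score over constraint terms. The split into hard and soft constraints, the weights, and the cost normalization below are assumptions for illustration, not a specific published formulation.

```python
# Hypothetical placement scoring: each soft constraint contributes a term
# in [0, 1]; hard constraints (visibility, reachability, collision) gate
# feasibility outright.

def placement_score(view, weights):
    # Hard constraints: any violation rules the placement out.
    if not (view["visible"] and view["reachable"] and not view["collides"]):
        return 0.0
    # Soft constraints: a weighted product, so the score collapses to zero
    # if any single constraint is completely violated.
    score = 1.0
    for name, weight in weights.items():
        score *= view[name] ** weight
    return score

weights = {"resolution": 1.0, "viewing_angle": 0.5, "overlap": 0.5}

candidates = [
    {"visible": True, "reachable": True, "collides": False,
     "resolution": 0.9, "viewing_angle": 0.8, "overlap": 0.7, "cost": 2.0},
    {"visible": True, "reachable": True, "collides": False,
     "resolution": 0.6, "viewing_angle": 0.9, "overlap": 0.9, "cost": 1.0},
]

# Among feasible placements, prefer high constraint satisfaction per unit
# of operation cost.
best = max(candidates, key=lambda v: placement_score(v, weights) / v["cost"])
print(best)
```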

Looking at the trend, although active vision itself is proposed to reduce uncertainty during the sensing procedure and to improve the reliability of robotic manipulation, difficulties and related problems arise along with the complexity of the system. Critical issues for realizing this idea include efficient 3D data acquisition, sensor-environment constraint satisfaction, sensing planning, model construction, motion and force planning, task evaluation, etc. New strategies have to be developed to actively decide the sensor configuration for observing specific aspects of the target and for estimating the state or completeness of the manipulation task. Successful work on this topic would lead to full automation of the manipulation process. It would also yield a mechanism for autonomously moving and changing sensing parameters for complete 3D data recovery, so that robotic manipulation could be carried out with knowledge of the geometrical target shape.

The author

Shengyong Chen is Full Professor and Director of the Institute of Intelligent Systems at Zhejiang University of Technology, Zhejiang, People’s Republic of China. Web site: http://sychen.com

Further Reading

Chen, S., Li, Y.F. and Kwok, N.M. (2011), “Active vision in robotic systems: a survey of recent developments”, International Journal of Robotics Research, Vol. 30 No. 36 (online first)

Chen, S., Zhang, J., Zhang, H., Kwok, N.M. and Li, Y.F. (2011), “Intelligent lighting control for vision-based robotic manipulation”, IEEE Transactions on Industrial Electronics, Vol. 58 (early access)
