Attributed Object Maps: Descriptive Object Models as High-level Semantic Features for Mobile Robotics
Delmerico, Jeffrey A.
This dissertation presents the concept of a mid-level representation of a mobile robot's environment: localized class-level object models marked up with each object's properties and anchored to a low-level map such as an occupancy grid. An attributed object map allows for high-level reasoning and contextualization of robot tasks in terms of semantically meaningful elements of the environment. This approach is compatible with, and complementary to, existing methods of semantic mapping and robotic knowledge representation, but provides an internal model of the world that is both human-intelligible and permits inference about place and task for the robotic agent. This representation provides natural semantic context for many environments: an inventory of objects and structural components, along with their locations and descriptions, that would sit at an intermediate level of abstraction between low-level features and semantic maps of the whole environment. High-level robotic tasks such as place categorization, topological mapping, object search, and natural language direction following could be enabled or improved with this internal model. The proposed system takes a bottom-up approach to object modeling that leverages existing work in object detection and description, both well-developed research areas in computer vision. Our approach integrates many image-based object detections into a coherent, localized model for each object instance, using 3D data from a registered range sensor. By observing an object repeatedly at the frame level, we can construct such models online and in real time, making them robust to the noise of false-positive detections while still extracting useful information from partial observations of the object. The detection and modeling steps we present do not rely on prior knowledge of specific object instances, enabling the modeling of objects in unknown environments.
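The structure described above, object models with locations and properties anchored to a low-level occupancy grid, can be sketched as a simple data structure. This is a hypothetical illustration, not the dissertation's actual implementation; the class names, fields, and query method are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectModel:
    """A localized, class-level object model with semantic attributes.
    (Illustrative sketch; field names are assumptions.)"""
    object_class: str                  # e.g. "chair", "stairway"
    centroid: tuple                    # (x, y) position in the map frame, metres
    attributes: dict = field(default_factory=dict)  # attribute name -> confidence

@dataclass
class AttributedObjectMap:
    """Object models anchored to a low-level occupancy grid."""
    occupancy_grid: list               # 2D grid of cells: 0 = free, 1 = occupied
    resolution: float                  # metres per grid cell
    objects: list = field(default_factory=list)

    def add_object(self, model: ObjectModel) -> None:
        self.objects.append(model)

    def objects_near(self, x: float, y: float, radius: float) -> list:
        """Query objects within `radius` metres of (x, y) -- the kind of
        spatial lookup that supports tasks such as object search."""
        return [o for o in self.objects
                if (o.centroid[0] - x) ** 2 + (o.centroid[1] - y) ** 2 <= radius ** 2]
```

A high-level task could then query the map, e.g. `m.objects_near(1.2, 2.1, 0.5)`, to retrieve nearby objects together with their semantic descriptions.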
The construction of an attributed object map during exploration is possible with minimal assumptions, relying only on knowledge of the object classes that might be contained therein. We present techniques for modeling objects that can be described with parameterized models and whose quantitative attributes can be inferred from those models. In addition, we develop methods for generating non-parametric point cloud models for objects of classes that are better described qualitatively with semantic attributes. In particular, we propose an approach for automatic foreground object segmentation that permits extraction of the object within a bounding box detection, using only a class-level model of that object's scale. We employ semantic attribute classifiers from the computer vision literature, using the visual features of each detection to describe the object's properties, including shape, material, and the presence or absence of parts. We integrate per-frame attribute values into an aggregated representation that we call an object attribute descriptor. This method averages the confidence in each attribute classification over time, smoothing the noise in individual observations and reinforcing those attributes that are repeatedly observed. This descriptor provides a compact representation of the model's properties, and offers a way to mark up objects in the environment with descriptions that could be used as semantic features for high-level robot tasks. We propose and develop a system for detecting, modeling, and describing objects in an unknown environment that uses minimal assumptions and prior knowledge. We demonstrate results in parametric object modeling of stairways, and in semantic attribute description of several non-parametric object classes. This system is deployable on a mobile robot equipped with an RGB-D camera, and runs in real time on commodity hardware.
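The time-averaging of per-frame attribute confidences can be sketched as an incremental running mean. This is a minimal illustrative sketch under the assumption that each frame yields a confidence in [0, 1] per attribute; the class and method names are hypothetical, not the dissertation's API.

```python
class ObjectAttributeDescriptor:
    """Aggregates per-frame attribute classifier confidences into a running
    average, smoothing noisy single-frame observations and reinforcing
    attributes that are observed repeatedly. (Illustrative sketch.)"""

    def __init__(self, attribute_names):
        self.attribute_names = list(attribute_names)
        self.mean_confidence = {a: 0.0 for a in self.attribute_names}
        self.num_observations = 0

    def update(self, frame_confidences):
        """Incorporate one frame's classifier outputs
        (mapping attribute name -> confidence in [0, 1])."""
        self.num_observations += 1
        n = self.num_observations
        for a in self.attribute_names:
            c = frame_confidences.get(a, 0.0)
            # Incremental mean: m_n = m_{n-1} + (c - m_{n-1}) / n
            self.mean_confidence[a] += (c - self.mean_confidence[a]) / n

    def describe(self, threshold=0.5):
        """Report attributes whose averaged confidence exceeds a threshold,
        yielding a compact semantic description of the object."""
        return [a for a, m in self.mean_confidence.items() if m >= threshold]
```

For example, two frames reporting confidences of 0.8 and 0.6 for "wooden" average to 0.7, so a spuriously low (or high) single frame is damped rather than taken at face value.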
The object models contained in an attributed object map provide context for other robotic tasks, and offer a representation of the world, intelligible to both humans and robots, that can be created online during robotic exploration of an unknown environment.