Video-based Action Understanding
This dissertation addresses video-based action understanding. The overarching goal is to provide the motion analysis needed to distinguish action from general motion and to detect the actions of interest in a video effectively. Although previous research has widely approached action recognition and detection through motion information, three fundamental questions remain open problems in computer vision: What is action? How is action different from motion? Where, in a given video, is the action? They remain open because of, among other factors, the intrinsic complexity of action and the large intra-class and small inter-class variation of actions. Motivated by this, we propose the notion of actionness, which seeks to distinguish intentional motion from general motion. Action is a specific subclass of general motion that requires an agent who has a particular goal or intention and is moving to achieve that goal. The lattice conditional ordinal random fields model is proposed to achieve actionness ranking, so that any two regions in a video can be compared according to the degree to which they contain an action. To characterize the difference between action and motion, a quantitative analysis of trajectory-based intentional movement is conducted to investigate its spatiotemporal properties. Intentional movement is characterized by a high density of trajectories, the majority of which remain part of the action throughout their duration. Inspired by this, a space-time trajectory graph is proposed to study the relationships between trajectories in a video. It motivates a new approach to implicit intentional movement extraction based on spatiotemporal trajectory clustering on the graph, leveraging the properties of intentional movement. The intentional movement clusters are then used as action proposals for detection.
The action detection method is called action detection-by-recognition because the whole training process is bounding-box free and weakly supervised. In sum, this dissertation proposes answers to these fundamental questions in computer vision. The notion of actionness provides an explicit answer to the question of what action is. The relationship between trajectories and intentional movement is exposed to show the difference between action and general motion. The action detection-by-recognition method is proposed to achieve actionness-based action detection.