| Human action recognition in video is an important research topic in computer vision and pattern recognition, motivated by applications such as video surveillance and human-machine interaction. The problem is far from straightforward, however. The results of current human action recognition systems are often affected by background clutter, variations in the subjects' appearance, occlusion, illumination changes and camera movement, so accurately recognizing human actions in real-world scenarios remains a major challenge. One of the key components of human action recognition is obtaining an accurate representation of the human action in a video. A discriminative and robust behavior representation can not only help reduce the influence of noise, but also improve recognition performance. Many approaches have been proposed and have shown promising performance on challenging databases. In spite of this progress, it is still difficult to learn, from the extracted low-level local features, a compact and discriminative behavior representation that can then be used for classification. To obtain compact and discriminative coding coefficients, we studied the existing feature learning methods and incorporated the structural information of local features into the encoding process. The main contributions are outlined as follows: 1. We propose a novel algorithm named neighbor-constrained low-rank representation. Low-rank representation has recently been applied in many areas with impressive performance. It seeks an appropriate representation of low-level features under a low-rank constraint and attempts to capture the global structural information of these features. Inspired by this progress, we incorporate a local regularization term into the objective function of low-rank representation. 
By employing a local regularization term during the encoding process, the coding coefficients of each low-level feature can be approximately represented by the coefficients of its nearest neighbors. As a consequence, the neighborhood relationships among the coding coefficients, which reflect intensity consistency and smoothness, are preserved. 2. We propose a compact and discriminative coding scheme called structural incoherence constrained low-rank representation. Most of the available approaches for action recognition encode features independently, and therefore ignore the global structural information and the correlation between different attributes. By incorporating a weighted structural incoherence regularization term into the low-rank representation of actions, our method reduces the correlation between the representations of different actions and hence yields a more discriminative representation of actions. 3. We propose a novel human action recognition method based on the trace lasso norm, which regularizes the coding coefficients and the dictionary atoms simultaneously. As a result, the dictionary atoms associated with the nonzero coding coefficients are more strongly correlated. Sparse representation tends to choose among correlated dictionary atoms arbitrarily. In contrast, trace lasso takes the correlation among the dictionary atoms into account, which can improve the discriminative power of the coding coefficients. This work was supported by the National Natural Science Foundation of China (No. 61272282) and the Program for New Century Excellent Talents in University (NCET-13-0948). |