| Robot motor skill acquisition means that a robot is able to perform an unfamiliar task automatically. In recent years, skill learning that combines learning from demonstration with reinforcement learning (LfDRL) has been proposed and applied in various fields. However, many existing methods improve only a single stage of LfDRL, which inevitably creates a conflict between the broad performance and the specific performance of the learned skills. To overcome this conflict, this paper proposes a novel motor skill learning method with automatic basis function reconstruction, named Gaussian Mixture Regression combined with Policy Improvement with Path Integrals (GMR-PI²). The method consists of two parts, imitation and optimization. In the imitation part, the broad features of a skill are learned, and the corresponding weights and basis functions are obtained by imitation learning based on Dynamic Movement Primitives (DMPs) and Gaussian Mixture Regression (GMR). On this basis, the optimization part learns the specific features of the skill using GMR and the path integral method. In addition, for complex or special skills, exploration in weight space easily falls into a local optimum; in that case the basis functions are reconstructed and the weight search is restarted, and this is repeated until the special skill is learned accurately. The main work of this paper is as follows. First, we survey domestic and international research on robot skill learning and, together with related work on imitation learning, take endowing the robot with the ability to learn general skills as the research goal. Second, we build the kinematic model of a complex multi-DOF manipulator using screw theory instead of the traditional Denavit-Hartenberg (D-H) method and analyze the forward and inverse kinematics problems, which leads to the skill learning method based on imitation learning, path integrals, and stochastic optimal control. Third, in the later stage of the study, we
found that imitation learning based on GMR, whose basis functions automatically concentrate in the regions of the sample that fluctuate more frequently, is conducive to learning special skills more accurately. Drawing on ideas from deep learning, when exploration falls into a local optimum with the current basis, the method triggers a self-reconstruction of the basis functions. Optimization then continues on the new basis functions, and this process repeats until the target optimization criterion no longer changes. Finally, the correctness of the proposed method is verified in simulation experiments with MATLAB and the robot simulation platform V-REP, in which a robot arm progresses from a general demonstrated task to a special task. The experimental results also show that the proposed GMR-PI² algorithm gives the robot a more effective ability to learn both broad and specific skills than the traditional LWR-PI² algorithm. |
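
As a rough illustration of the weight-space search in the optimization stage, the following Python sketch applies a simplified, episode-level form of the path integral update: sampled weight perturbations are averaged with probabilities that decrease exponentially in rollout cost. The function names, the temperature `h`, the noise level `sigma`, and the quadratic toy cost are all assumptions made for illustration. The basis function reconstruction step is not shown, and this is not the implementation used in the paper.

```python
import numpy as np


def pi2_weight_update(weights, perturbations, costs, h=10.0):
    """Cost-weighted averaging of perturbations (simplified PI^2-style step).

    weights:       (D,) current policy weights
    perturbations: (K, D) exploration noise added to the weights per rollout
    costs:         (K,) total cost of each rollout
    h:             assumed temperature; larger values favour the best rollouts
    """
    costs = np.asarray(costs, dtype=float)
    spread = costs.max() - costs.min()
    if spread < 1e-12:
        # All rollouts are equally good, so there is nothing to learn from.
        return weights.copy()
    # Normalise costs to [0, 1], then map low cost to high probability.
    probs = np.exp(-h * (costs - costs.min()) / spread)
    probs /= probs.sum()
    # Probability-weighted average of the perturbations is the weight change.
    return weights + probs @ perturbations


def optimize(cost_fn, weights, n_iters=100, n_rollouts=20, sigma=0.1, seed=0):
    """Repeat sampling, rollout evaluation, and the weighted update."""
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        eps = rng.normal(0.0, sigma, size=(n_rollouts, weights.size))
        costs = np.array([cost_fn(weights + e) for e in eps])
        weights = pi2_weight_update(weights, eps, costs)
    return weights


# Toy stand-in for a rollout cost: squared distance to a hypothetical target.
target = np.array([1.0, -2.0, 0.5])


def toy_cost(w):
    return float(np.sum((w - target) ** 2))


initial_weights = np.zeros(3)
final_weights = optimize(toy_cost, initial_weights)
```

In GMR-PI² as described above, a search of this kind that stalls in a local optimum would trigger reconstruction of the basis functions before the weight search restarts.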