| With advances in artificial intelligence technology and growth in domestic consumption, the logistics industry has entered a period of rapid development. Logistics transportation is closely tied to people's production and daily life. During the transportation of goods, packing and handling are the key steps. With the help of artificial intelligence, intelligent logistics can greatly reduce labor costs and improve the transportation efficiency of enterprises and factories. This thesis therefore addresses multi-preview and single-preview online three-dimensional (3D) bin packing and studies packing methods that achieve higher space utilization, so as to reduce labor costs and improve the economic benefits of enterprises. In multi-preview online 3D bin packing, information about several upcoming objects is known in advance through sensors during packing. Searching all possible packing positions for every previewed object is computationally inefficient and unsuitable for practical application. The thesis solves the problem by combining a deep reinforcement learning algorithm with Monte Carlo tree search. First, a basic single-preview packing model is trained with the deep reinforcement learning algorithm. Then, the basic model is combined with the Monte Carlo tree search method. Finally, Monte Carlo tree search is used to virtually reorder the previewed objects according to the maximum path value. An improved evaluation function guides node expansion in the most promising direction, yielding the best loading position for the next object. Experiments show that the space utilization increases as the number of previews increases. When the number of previews reaches 5, the space utilization is 1.2% higher than that of the common heuristic algorithm, which 
indicates the effectiveness of the multi-preview bin packing algorithm in this thesis. Deep reinforcement learning is an effective solution for single-preview online 3D bin packing, but its convergence speed and packing efficiency need further improvement. In this thesis, an optimized actor-critic using Kronecker-factored trust region (ACKTR) algorithm is used to alleviate the problem. First, node features are generated by tree-structure expansion to express the state of the packing space. Second, to accommodate the growth of the packing configuration tree over time, a feature extraction network based on a graph attention mechanism is designed, and the extracted features are fed into a decision network based on a pointer mechanism. Finally, because solving the online 3D bin packing problem requires inverting many large matrices, an eigenvalue-corrected Kronecker-factored approximate curvature (KFAC) optimization method is designed to update the parameters of the policy network. The experimental results show that the space utilization of the proposed method is 16% higher than that of the general heuristic algorithm, 1.2% higher than that of the similar ACKTR algorithm, and 12.4% higher than that of natural gradient optimization. Moreover, it performs well on data sets of different sizes, with an overall space utilization of at least 65%. Finally, a bin packing experiment platform is built to verify the practical effect of the bin packing algorithm in this thesis. First, hardware including a Huichuan robot arm, a Gocator2150 camera and a Kinect v1 camera is assembled into the bin packing experiment platform. Then, because the Huichuan robot arm only provides a Windows development interface while the bin packing algorithm is developed on Ubuntu, a communication interface for the robot arm under Ubuntu is designed and 
implemented based on ROS and the Modbus TCP protocol. Finally, the bin packing algorithm of this thesis is used to perform bin packing in a real environment. The experimental results show that the optimized ACKTR algorithm in this thesis performs almost as well as manual bin packing in a real environment. This thesis contains 65 figures, 18 tables and 63 references. |
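
All results above are reported as the utilization rate of packing space. The abstract does not spell out the formula, so the following is only a minimal sketch, assuming the usual definition: total volume of the packed objects divided by the volume of the bin. The names `Box` and `space_utilization` and the example dimensions are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned cuboid; used for both the bin and the packed objects."""
    length: float
    width: float
    height: float

    @property
    def volume(self) -> float:
        return self.length * self.width * self.height


def space_utilization(bin_dims: Box, packed: Sequence[Box]) -> float:
    """Assumed metric: packed volume / bin volume, as a fraction in [0, 1].

    Only volumes are compared; this sketch does not check that the
    boxes actually fit in the bin without overlapping.
    """
    if bin_dims.volume <= 0:
        raise ValueError("bin volume must be positive")
    return sum(b.volume for b in packed) / bin_dims.volume


# Illustrative values only (not data from the thesis).
container = Box(10, 10, 10)
packed = [Box(5, 5, 5), Box(10, 10, 4), Box(5, 5, 2)]
print(f"utilization: {space_utilization(container, packed):.1%}")
```

Under this assumed definition, a reported figure such as "at least 65%" would mean that the packed objects fill at least 65% of the bin volume when packing ends.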