| The Conveyor-Serviced Production Station (CSPS) system is widely used as a typical intelligent decision system in flexible production and processing stations. At present, optimization control of CSPS systems usually models the arrival of parts and of demand as Poisson processes. Once the system is formulated as a Semi-Markov Decision Process (SMDP), an optimal or suboptimal control policy is obtained through policy iteration or Q-learning. However, when parts arrive according to a non-Poisson process, the arrival process does not satisfy the Markov property. How well Q-learning performs when the CSPS system cannot be formulated as an SMDP is therefore a question worth studying, and this thesis studies the applicability of the Q-learning algorithm under non-Poisson part arrivals. First, the Markov Modulated Poisson Process (MMPP) and the Semi-Markov Modulated Poisson Process (SMMPP) are used to represent non-Poisson arrivals. Under the same average arrival rate, the Q-learning results for standard Poisson and non-Poisson part flows are analyzed and compared. The statistical average arrival rate of the non-Poisson flow is taken as the arrival rate of a standard Poisson flow, and the learning results obtained with the theoretical average arrival rate are also examined to test the algorithm's performance. Second, the applicability of Q-learning is discussed for the case in which two types of parts arrive in a mixed flow of MMPP and SMMPP. In addition, on top of non-Poisson part arrivals, this thesis studies the applicability of the algorithm when customer demand does not follow a Poisson distribution. Simulation results show that the Q-learning algorithm can still learn a good control policy when a CSPS system with a non-Poisson part flow cannot be formulated as an SMDP. The system cost evaluated under the learned policy is very close to the Q-learning result obtained when the theoretical average arrival rate of the non-Poisson part flow is used as the standard Poisson arrival rate. When two types of parts arrive in a mixed flow of MMPP and SMMPP, Q-learning can also learn a good control policy. Finally, with non-Poisson part arrivals, Q-learning can still learn a good control policy when customers also arrive in non-Poisson flows. |
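
The comparison "under the same average arrival rate" can be illustrated with a minimal sketch of a two-state MMPP. This is not the thesis's simulation code: the function names, the restriction to two modulating states, and all numeric rates are assumptions chosen for illustration. The sketch computes the theoretical (stationary) mean arrival rate of the MMPP, which is the rate a matching standard Poisson flow would use, and checks it against the statistical average rate from a simulated arrival stream.

```python
import random


def mmpp_mean_rate(lams, switch):
    """Theoretical mean arrival rate of a two-state MMPP.

    lams   = (lam0, lam1): Poisson arrival rate in each modulating state.
    switch = (q01, q10): rate of leaving state 0 and state 1, respectively.
    """
    q01, q10 = switch
    pi0 = q10 / (q01 + q10)  # stationary probability of state 0
    pi1 = q01 / (q01 + q10)  # stationary probability of state 1
    return pi0 * lams[0] + pi1 * lams[1]


def simulate_mmpp(lams, switch, horizon, rng):
    """Return the sorted arrival times of a two-state MMPP on [0, horizon)."""
    t, state, arrivals = 0.0, 0, []
    while t < horizon:
        # The sojourn time in the current modulating state is exponential.
        end = min(t + rng.expovariate(switch[state]), horizon)
        # Within a sojourn, arrivals form a Poisson process with rate
        # lams[state]; restarting the clock at each switch is valid
        # because exponential inter-arrival times are memoryless.
        s = t + rng.expovariate(lams[state])
        while s < end:
            arrivals.append(s)
            s += rng.expovariate(lams[state])
        t, state = end, 1 - state
    return arrivals


if __name__ == "__main__":
    lams, switch = (0.5, 2.0), (0.1, 0.3)  # illustrative values only
    horizon = 100_000.0
    arrivals = simulate_mmpp(lams, switch, horizon, random.Random(0))
    print("theoretical mean rate:", mmpp_mean_rate(lams, switch))
    print("statistical mean rate:", len(arrivals) / horizon)
```

A standard Poisson flow with rate `mmpp_mean_rate(lams, switch)` has the same long-run average as the MMPP but not its burstiness, which is why the two flows can lead to different learning behavior even at equal average rates.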