| Reducing warehousing costs and improving efficiency has long been a key objective of pharmaceutical logistics management research. Accurate prediction of pharmaceutical inventory improves warehouse stocking efficiency, which reduces medication loss and better meets market demand; an optimized replenishment strategy saves warehouse management costs and reduces inventory waste. Together, an inventory forecasting model and a replenishment strategy can deliver accurate prediction and reasonable optimization of pharmaceutical inventory, which benefits the development of pharmaceutical logistics. Existing time series forecasting methods have limitations when applied to pharmaceutical inventory forecasting. Demand for pharmaceutical products is easily affected by external factors such as policies and seasons, which are difficult to quantify, and the data show large fluctuations and no conspicuous cyclical trends. Traditional replenishment strategies struggle to cope with the complex and changing demand of real pharmaceutical warehousing, and because of the temporal and spatial heterogeneity of demand trends and inventory costs across medicines, training prediction models for many types of medicines one by one greatly increases the time cost of modeling. The thesis takes real pharmaceutical demand data as its research object and mainly studies pharmaceutical logistics inventory forecasting and replenishment strategy optimization. The thesis first constructs a pharmaceutical logistics inventory forecasting model, MAGRU (Multi Layer Gated Recurrent Unit with Attention), based on a multi-layer attention mechanism and gated recurrent units. The thesis then proposes a reinforcement learning replenishment model, IACPPO (Integration of A2C and PPO), that integrates the A2C (Advantage Actor Critic) and PPO (Proximal Policy Optimization) algorithms to optimize replenishment strategies, taking into account the spatial and 
temporal heterogeneity of multi-medicine demand and inventory costs. The thesis makes the following contributions: (1) It proposes a pharmaceutical logistics warehouse inventory forecasting model that combines a multi-layer attention mechanism with gated recurrent units. First, an embedding operation is applied to the temporal features of the original data to capture the mapping between medicine demand and time, so that the impact on medicine demand of hidden factors that are difficult to quantify, such as season and month, can be explored. The attention mechanism and gated recurrent units are then stacked to improve the model's prediction performance on long series, and the model also performs well on small-scale datasets. (2) It proposes a reinforcement learning model, IACPPO, for optimizing replenishment strategies. The model formulates the replenishment process in pharmaceutical logistics warehouses and integrates the A2C and PPO algorithms: it trains agents with both algorithms simultaneously, compares the rewards of the strategies output by the different agents within a phase, and selects the better algorithm, thereby selecting a low-cost strategy. (3) It improves the network structure of A2C and PPO. Both algorithms have Actor and Critic neural networks, and the thesis replaces the original fully connected networks with networks that combine gated recurrent units and an attention mechanism. This extracts state information more effectively and strengthens the networks' expressive power, which in turn enables the agents to learn better replenishment strategies. Experimental results on a real-world pharmaceutical inventory dataset show that the replenishment strategy obtained by the IACPPO model achieves lower inventory cost than seven prediction models and seven reinforcement learning algorithms. The hyperparameters of the IACPPO 
model were also investigated to determine optimal ranges for the learning rate, reward scale, and other hyperparameters. |
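
The phase-wise selection step described in contribution (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the cost model (`holding_cost`, `shortage_cost`), the function names, and the two fixed order-up-to policies standing in for trained A2C and PPO agents are all assumptions made for illustration. The sketch only shows the idea of scoring each agent's strategy on a phase and keeping the one with the higher reward (lower cost).

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical type: a policy maps current inventory to an order quantity.
Policy = Callable[[float], float]


def phase_reward(
    policy: Policy,
    demands: Sequence[float],
    holding_cost: float = 1.0,   # assumed cost per unit left in stock
    shortage_cost: float = 5.0,  # assumed cost per unit of unmet demand
) -> float:
    """Return the total reward (negative cost) of a policy over one phase."""
    inventory = 0.0
    total = 0.0
    for demand in demands:
        inventory += policy(inventory)            # replenish first
        unmet = max(demand - inventory, 0.0)      # demand that cannot be served
        inventory = max(inventory - demand, 0.0)  # stock left after serving
        total -= holding_cost * inventory + shortage_cost * unmet
    return total


def select_per_phase(
    policies: Dict[str, Policy], phases: List[Sequence[float]]
) -> List[str]:
    """For each phase, pick the policy whose strategy earns the highest reward."""
    chosen = []
    for demands in phases:
        rewards = {name: phase_reward(p, demands) for name, p in policies.items()}
        chosen.append(max(rewards, key=rewards.get))
    return chosen


# Stand-ins for trained agents (assumption): simple order-up-to rules.
policies = {
    "a2c_stand_in": lambda inv: max(10.0 - inv, 0.0),
    "ppo_stand_in": lambda inv: max(20.0 - inv, 0.0),
}
phases = [[10.0, 10.0, 10.0], [20.0, 20.0, 20.0]]
chosen = select_per_phase(policies, phases)
print(chosen)
```

In a fuller setup the stand-in rules would be replaced by the trained agents' policies and the reward would come from the replenishment environment; the comparison-and-select logic would stay the same.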