| With the continued development and spread of smartphones and wireless positioning devices such as smart wearables, together with rising living standards and more convenient transportation, more and more people use their mobile phones to record daily life and to "check in" and share their footprints with friends. The resulting volume of check-in data contains a wealth of information to mine. By mining this location data, application software can serve users better and help them find the places they want to go more quickly. This task is known as point-of-interest (POI) recommendation. POI recommendation is widely used in location-based services, has attracted considerable research attention in recent years, and has made great progress. However, existing methods still have shortcomings. Few of them make good use of users’ long-term historical information or effectively model the periodicity of user behavior. Some methods use an attention mechanism to capture long-term preferences, but this modeling is not explicit enough and periodic patterns are not fully exploited, so there is still room to improve the final recommendation results. To address these issues, this dissertation proposes the RLMoveNet model. Built on deep learning, it uses reinforcement learning layers as a regularization component that drives the model to focus on periodic behavior, making the overall algorithm more efficient. The model treats the whole problem as a Markov decision process. To model users’ periodic characteristics more effectively, we draw on the periodic regularities of crowd movement to design a reward function that feeds back periodicity, driving the model to make decisions biased toward periodicity so that the recommendation results better match users’ periodic characteristics. In addition, RLMoveNet mines cross-feature information more fully and uses a recurrent neural network to capture dependencies in a user’s check-in sequence, modeling the user’s behavioral preferences and thereby learning the user’s behavioral regularities. In the reinforcement learning component, to avoid overestimation of the Q value, this dissertation uses the Double Q-learning algorithm, which makes the model more stable. Finally, to verify the effectiveness of RLMoveNet, we evaluate it on three real-world mobility datasets and compare it with several state-of-the-art methods. RLMoveNet achieves higher accuracy than the other methods, which demonstrates the effectiveness of the proposed method. |
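
As a rough illustration of why Double Q-learning reduces over-estimation of the Q value, the sketch below compares a standard Q-learning target with a Double Q-learning target on toy numbers. It is not RLMoveNet's implementation: the function names, the two toy value lists, and the discount factor are all hypothetical, chosen only to show that selecting the next action with one estimator and scoring it with a second one avoids taking the maximum over a single noisy estimate.

```python
def single_q_target(reward, q_next, gamma=0.5, done=False):
    """Standard Q-learning target: the same estimates both select and
    evaluate the next action, which tends to over-estimate."""
    if done:
        return reward
    return reward + gamma * max(q_next)


def double_q_target(reward, q_select_next, q_eval_next, gamma=0.5, done=False):
    """Double Q-learning target: one estimator selects the next action,
    a second, independent estimator evaluates it."""
    if done:
        return reward
    # Index of the best next action according to the selecting estimator.
    best_action = max(range(len(q_select_next)), key=q_select_next.__getitem__)
    return reward + gamma * q_eval_next[best_action]


# Hypothetical next-state value estimates for three candidate actions.
q_a = [1.0, 5.0, 2.0]  # selecting estimator (action 1 looks inflated)
q_b = [1.5, 2.0, 2.5]  # evaluating estimator

print(single_q_target(1.0, q_a))       # 3.5
print(double_q_target(1.0, q_a, q_b))  # 2.0
```

In this toy case the second estimator tempers the inflated value of action 1, so the Double Q-learning target (2.0) is lower than the standard target (3.5).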