| The rapid growth of the urban population puts heavy passenger-flow pressure on bus systems. As important records of conditions inside the bus, bus-mounted surveillance videos can be used directly to count passengers and analyze the degree of bus congestion, which makes them very important for constructing and optimizing bus resource scheduling systems. However, these videos suffer from heavy occlusion, perspective distortion, and illumination variation, so most existing methods cannot achieve satisfactory results on the bus crowd counting and congestion analysis tasks. This paper aims to provide effective solutions to both tasks using deep learning methods. The main work and contributions are as follows: (1) This paper proposes an effective perspective correction mechanism to solve the problem of perspective distortion in bus-mounted surveillance video. Perspective distortion causes the imaging areas of passengers to differ too much in size. By applying a perspective transformation to the surveillance video image, the mechanism corrects the distortion caused by the location of the surveillance camera. Experimental results show that the error caused by differences in the imaging areas of passengers is effectively reduced. (2) Perspective Correction Fully Convolutional Networks (PC-FCN). This paper takes advantage of the excellent performance of Fully Convolutional Networks (FCN) in crowd density estimation and integrates the perspective correction mechanism into FCN. Experimental results show that PC-FCN effectively improves the accuracy of passenger crowd density estimation in high-occlusion environments. (3) Multi-task deep spatiotemporal neural networks. To exploit the strength of Long Short-Term Memory (LSTM) in learning the complex temporal correlations of passenger counts in bus-mounted surveillance video, this paper proposes a novel multi-task deep
spatiotemporal neural network, PCFCN-LSTM (Perspective Correction FCN-LSTM), which combines PC-FCN and LSTM to perform passenger counting and crowd density estimation. The proposed multi-task network pursues different but related objectives in order to reach a better local optimum, provide more supervisory information (both crowd density and passenger count) during training, learn better feature representations, and obtain more accurate results on both tasks. Experimental results demonstrate that PCFCN-LSTM outperforms the baseline methods on both the passenger counting task and the crowd density estimation task, and effectively reduces the negative impact of illumination variation on both. |