| Cloud computing is a model for enabling on-demand network access to configurable computing resources. Under this model, users can share computing, storage, and network resources and allocate them according to their needs. Cloud computing is characterized by large scale, virtualization, versatility, scalability, and on-demand service. It provides users with powerful services by integrating the computing, storage, and networking capabilities of a large number of distributed servers or computers. With the rapid development of cloud computing technology, more and more applications are being migrated to the cloud, which prompts cloud service providers to continually expand their infrastructure to meet growing demand. As the infrastructure expands, more time must be spent managing an ever larger number of servers and services, and traditional monitoring methods can no longer meet the needs of large-scale, multi-cluster operation and maintenance. Existing monitoring systems offer relatively simple monitoring levels and fixed monitoring categories, so they lack flexibility and adapt poorly to the needs of a cloud platform. To address these problems, this paper presents a comprehensive monitoring and alarm solution for a cloud platform that covers host clusters, virtualized resources, systems, and application service resources. The project builds a distributed cloud monitoring system that simplifies monitoring deployment, reduces the difficulty of operation and maintenance, lowers the technical barrier between business personnel and professional operation and maintenance staff, focuses on the objects that actually need to be monitored, and provides a variety of visual charts to further improve the monitoring experience. The alarm system pinpoints problems and promptly informs the person in charge so that they can be resolved quickly, ensuring that the IT infrastructure and
services remain in good condition in the cloud environment. At the same time, the cloud monitoring system exposes team-based interfaces for the cloud platform and provides monitoring data that can be queried through them. The main components of the distributed cloud monitoring system are as follows: (1) Team management. The team is the basic monitoring unit, and team members have different permissions for performing monitoring operations. Different teams handle different operation and maintenance projects, which clarifies responsibility. (2) Platform access. Users can add objects to the monitoring system at low operating cost, which reduces the deployment problems of traditional monitoring operation and maintenance. (3) Automated monitoring. Taking a company's educational cloud platform as the monitoring object, the cloud monitoring system automatically connects virtualized resources in batches and performs the corresponding monitoring statistics and analysis across all resources. (4) Monitoring visualization. The system displays dynamic charts based on various monitoring metrics. Users can customize charts by selecting monitoring metrics and data sources, combining monitoring information into charts that are easier to compare. (5) Automatic alarm. When a monitored platform or service exceeds a preset threshold, the monitoring system promptly notifies team members of the alarm by email or other means. There are several innovations in this paper: (1) It simplifies access to monitoring data for hosts and host services. Automated batch access to virtual resources improves monitoring efficiency, and the barrier to obtaining monitoring information is lowered for staff who are not operation and maintenance professionals. (2) It provides multi-granularity monitoring services covering servers, virtual machines, system services, and application services. The system offers a range of monitoring metrics that can be aggregated to reflect how resources change over time. (3) The
cloud management platform, monitoring system, and alarm system follow a high-cohesion, low-coupling design, using a unified API and database for message communication and data sharing, which makes the usage pattern flexible. The system has been designed and preliminarily implemented in an experimental environment, where it has performed satisfactorily. |
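
The automatic alarm behavior summarized above, in which a monitored metric that exceeds a preset threshold triggers a notification to the members of the responsible team, can be illustrated with a minimal sketch. This is not the paper's implementation: the names `AlarmRule`, `Team`, and `check_alarms`, the metric names, and the `notify` callback (standing in for mail or another channel) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AlarmRule:
    """Hypothetical rule: fire when `metric` is strictly above `threshold`."""
    metric: str
    threshold: float


@dataclass
class Team:
    """Hypothetical monitoring unit: a team, its members, and its alarm rules."""
    name: str
    members: List[str]
    rules: List[AlarmRule] = field(default_factory=list)


def check_alarms(
    team: Team,
    sample: Dict[str, float],
    notify: Callable[[str, str], None],
) -> List[str]:
    """Compare one metric sample against the team's rules.

    For every rule whose metric is present in `sample` and strictly above
    the threshold, build a message and pass it to `notify` once per team
    member. Returns the list of messages that were generated.
    """
    fired: List[str] = []
    for rule in team.rules:
        value = sample.get(rule.metric)
        if value is None:
            continue  # metric not reported in this sample; nothing to check
        if value > rule.threshold:
            message = (
                f"[{team.name}] {rule.metric}={value} "
                f"exceeds threshold {rule.threshold}"
            )
            fired.append(message)
            for member in team.members:
                notify(member, message)
    return fired


if __name__ == "__main__":
    # Illustrative data only; a real deployment would send mail in `notify`.
    ops = Team(
        name="ops",
        members=["alice@example.com", "bob@example.com"],
        rules=[AlarmRule("cpu_percent", 90.0), AlarmRule("mem_percent", 80.0)],
    )
    check_alarms(
        ops,
        {"cpu_percent": 95.5, "mem_percent": 40.0},
        notify=lambda to, msg: print(f"to {to}: {msg}"),
    )
```

Passing the notification channel as a callback keeps the rule check independent of the delivery mechanism, which is one way to reflect the low-coupling design between the monitoring and alarm components described above.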