| Rough set theory is in essence a mathematical tool for describing incompleteness and uncertainty. It can effectively analyze and handle imprecise, inconsistent, incomplete or otherwise imperfect information in order to discover implicit knowledge and rules. It is therefore widely applied to information processing under uncertainty, in fields such as machine learning, decision analysis, process control, pattern recognition and data mining. The core of the theory lies in measuring attribute importance and in attribute reduction. Measuring attribute importance plays a crucial role in analyzing how much each attribute matters in a given data set. Based on the results, attribute reduction then helps extract the important information by removing redundant attributes and produces decision rules, so as to support scientific management, prediction and policy making. This thesis applies rough set theory, proposed by the Polish mathematician Z. Pawlak, to the tourism industry, because the theory not only has great advantages in analyzing and classifying imprecise, uncertain and incomplete knowledge, but also offers a simple model and a simple process that requires no prior information about the data. While improving the technical environment of the electronic commerce platform developed by Tint Company to provide a better comprehensive service, the author uses rough sets to reduce attributes and extract potential rules from data resources collected over a long period, which proves to be efficient. 
As a result, tour companies can run more effective campaigns and give their tourists more help in choosing travel agencies and tour routes. This paper covers the following aspects. Firstly, based on an analysis of the different algorithms for the attribute core and attribute reduction proposed by Professor Skowron, a mathematician at the University of Warsaw in Poland, the paper points out that inconsistency interferes with the extraction of the attribute core. Even if, for the objects in each equivalence class of the partition of the universe induced by the condition attributes, the set of corresponding decision attribute values is forced to have cardinality one, this interference still cannot be corrected. In addition, ignoring attributes whose approximate precision of classification is 0 does not interfere much with attribute reduction; on the contrary, it efficiently reduces the time and space complexity of the algorithm. Secondly, although attribute reduction algorithms based on heuristic measures give good results, analysis shows that the heuristic information they rely on has evident weaknesses: the dependency degree is too coarse a measure; information entropy is too fine; and averaging the two corrects these weaknesses but is too complex to compute. To achieve both simplicity and efficiency, the paper recommends the approximate precision of classification as the measure, although it is still comparatively coarse. In addition, a genetic algorithm is used to deal with combinatorial explosion because, as an optimizing search algorithm, it is simple, general, stable, efficient and practical. 
Thus an improved attribute reduction algorithm is obtained. Thirdly, to make rule extraction easier, we improve the attribute value reduction algorithm proposed in reference [35]. Deleting duplicate rules that share the same attribute values, apart from undetermined attribute values, and that still determine the decision speeds up the algorithm considerably. Lastly, after cleaning, transformation and encoding, the data accumulated in the database are copied into an Excel worksheet, and the Excel Link macro is used to connect Excel with MATLAB. The new attribute reduction algorithm proposed in this paper has been implemented in MATLAB to analyze the data in the Excel worksheet, and it yields good results. |
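
The measure recommended above, the approximate precision (approximation accuracy) of classification, can be illustrated with a minimal Python sketch. It assumes the standard rough set definition: the sum of the sizes of the lower approximations of the decision classes divided by the sum of the sizes of their upper approximations. The decision table, its attribute names (`price`, `season`, `guide`, `choice`) and the function names are hypothetical and are not taken from the thesis data or from its MATLAB implementation; the genetic search is not reproduced here.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def approximation_accuracy(rows, cond_attrs, dec_attr):
    """Sum of lower-approximation sizes over sum of upper-approximation sizes,
    taken over all decision classes (assumed standard definition)."""
    cond_blocks = partition(rows, cond_attrs)
    dec_blocks = partition(rows, [dec_attr])
    lower = upper = 0
    for d in dec_blocks:
        for c in cond_blocks:
            if c <= d:   # block lies entirely inside the decision class
                lower += len(c)
            if c & d:    # block overlaps the decision class
                upper += len(c)
    return lower / upper


# Hypothetical toy decision table; the last two rows are inconsistent
# (same condition values, different decisions).
TABLE = [
    {"price": "low", "season": "peak", "guide": "yes", "choice": "A"},
    {"price": "low", "season": "peak", "guide": "no", "choice": "A"},
    {"price": "high", "season": "peak", "guide": "yes", "choice": "B"},
    {"price": "high", "season": "off", "guide": "yes", "choice": "B"},
    {"price": "low", "season": "off", "guide": "no", "choice": "B"},
    {"price": "low", "season": "off", "guide": "no", "choice": "A"},
]

if __name__ == "__main__":
    for attrs in (["price", "season", "guide"], ["price", "season"],
                  ["price"], ["guide"]):
        print(attrs, approximation_accuracy(TABLE, attrs, "choice"))
```

In this toy table, `guide` on its own has accuracy 0, and dropping it leaves the accuracy of the remaining attributes unchanged, which is the kind of attribute the first point suggests can be ignored to save time and space.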