| Patent legal status information is temporal in nature: it reflects a patent's life cycle, its ownership at a given moment, and related facts, which gives it unique value for patent analysis. Cleaning this information requires ensuring both that each individual legal state is correct and that the temporal relations between legal states are reasonable. However, existing data cleaning research has mostly focused on individual data items and has rarely addressed temporally related data. It is therefore imperative to design and implement a cleaning framework for patent legal status information that can handle such data. The multidimensional robust data quality analysis method (MRDQA) can clean temporally related data conveniently and effectively. MRDQA first models the temporal data, abstracting the set of states and the state transition relations and giving them a symbolic representation. The cleaning task is then recast as a model checking task, which identifies the temporal logic problems hidden in the data. In this way MRDQA discovers data problems automatically, improves the efficiency of cleaning temporally related data, and reduces cleaning cost. To clean both the legal states and the temporal relations between them, this paper uses MRDQA to model the legal status information database and performs model checking on the resulting model, thereby cleaning the legal state sequence relations. This process is the core of legal status cleaning. On this basis, the paper designs and implements a legal status cleaning framework, and verifies its soundness and validity by cleaning real legal status information. Finally, the paper presents the common error patterns in patent legal status information and their common causes. In summary, this paper applies MRDQA to the cleaning of patent legal status information, which provides a new approach to cleaning temporally related data, and the framework constructed here offers an empirical reference for cleaning data of the same type. |
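
The idea of abstracting a state set and state transition relations, then checking recorded sequences against them, can be illustrated with a minimal sketch. This is not the paper's implementation and not a full model checker: it is a simple sequence check. The state names, the `ALLOWED` transition table, and the function `find_violations` are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical legal states and permitted transitions (illustrative only;
# a real model would be derived from the patent office's status codes).
ALLOWED = {
    "filed": {"published", "withdrawn"},
    "published": {"granted", "rejected", "withdrawn"},
    "granted": {"lapsed", "invalidated", "expired"},
    "lapsed": {"restored"},
    "restored": {"lapsed", "invalidated", "expired"},
}
INITIAL = "filed"  # assumed starting state of every sequence


def find_violations(events):
    """Check one patent's recorded events against the transition model.

    events: list of (date, state) tuples in the order they were recorded.
    Returns a list of (index, message) tuples, one per detected problem.
    """
    violations = []
    if not events:
        return violations
    if events[0][1] != INITIAL:
        violations.append(
            (0, f"sequence starts with {events[0][1]!r}, expected {INITIAL!r}")
        )
    for i in range(1, len(events)):
        prev_date, prev_state = events[i - 1]
        cur_date, cur_state = events[i]
        # Temporal check: events must not go backwards in time.
        if cur_date < prev_date:
            violations.append((i, "date precedes previous event"))
        # Transition check: the step must be permitted by the model.
        if cur_state not in ALLOWED.get(prev_state, set()):
            violations.append(
                (i, f"illegal transition {prev_state!r} -> {cur_state!r}")
            )
    return violations


if __name__ == "__main__":
    record = [
        (date(2010, 1, 5), "filed"),
        (date(2013, 3, 2), "granted"),    # skips publication
        (date(2011, 7, 1), "published"),  # out of order and after grant
    ]
    for index, message in find_violations(record):
        print(index, message)
```

A record that passes this check is only consistent with the assumed transition table. A model checker would additionally verify temporal logic properties over the whole model, which this sketch does not attempt.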