| With the rapid proliferation of computer-based assessments, a large amount of valuable data known as process data has been collected. These data provide a solid foundation for assessing individuals’ problem-solving abilities and proficiency. Consequently, process data have become a focal point in measurement research. In practice, however, subjects’ interactions with the test interface may involve misconceptions that interfere with their judgments and choices at each step of the process, leading to incorrect answers that can have a persistent effect on problem-solving behavior. Exploring possible misconceptions when analyzing problem-solving process data is therefore imperative. Because process data are inherently unstructured (i.e., subjects’ action sequences are not of equal length), the main difficulty in applying diagnostic classification analysis directly to process data lies in their format. Following ideas from previous research, this difficulty can be addressed through "item expansion", which usually involves identifying the key actions (operations) necessary to solve the problem, determining whether each subject’s process data contain each key action, and coding the result (e.g., 1 for "contains" and 0 for "does not contain"). This paper proposes a new coding method that extracts more continuous sequential information from the process data during item expansion, with the aim of improving the assessment of subjects’ problem-solving abilities and the identification of problem-solving strategies, including misconceptions. To illustrate the application and advantages of the proposed method, three studies were conducted. Study 1 proposed a new coding method for behavioral sequences that introduces misconceptions into the coding. The results showed that (1) the new coding method helps to quickly define five problem-solving skills and five misconceptions; (2) the new coding method extracts more continuous behavioral sequence information, including misconceptions, thus realizing "item expansion"; and (3) based on the new coding method, the Q-matrix required for diagnostic classification can be constructed from the process data, and response accuracy data and response time data can be extracted.
Study 2 analyzed the response accuracy data extracted in Study 1. The results showed that (1) the DINA model fit the process data of this problem-solving test well; (2) the fitted model indicated good quality and high discrimination for most items; (3) there were significant positive correlations among the five problem-solving skills, positive correlations among the five misconceptions, and more significant negative correlations between the five problem-solving skills and the five misconceptions; and (4) matching the final diagnostic classification results with subjects’ final scores revealed that the classification of subjects expanded from the original two outcome-score categories (0 and 1) to 91 latent attribute patterns (problem-solving strategies).
Study 3 jointly analyzed the response accuracy data and response time data extracted in Study 1 under two joint modeling frameworks: the joint hierarchical modeling framework and the joint-cross-loading cognitive diagnostic modeling framework. First, the JRT-DINA model was fitted to the data under the joint hierarchical modeling framework. The results showed that (1) the JRT-DINA model fit the data adequately; (2) the slope parameter γ was greater than zero for the first five attributes (the problem-solving skills) and less than zero for the last five attributes (the misconceptions); and (3) subjects’ latent ability and latent processing speed were negatively correlated on this problem-solving test. Second, the S-C-MCDM-θ model was fitted to the data under the joint-cross-loading cognitive diagnostic modeling framework. The results showed that (1) the S-C-MCDM-θ model fit the response accuracy data and response time data well; (2) the median of the item slope parameters for latent processing speed was less than zero, which to some extent reflects the relatively high cognitive load that most items in this test place on subjects; and (3) the S-C-MCDM-θ model not only further improved the accuracy of the estimates of subjects’ latent ability but also refined the diagnostic classification space for subjects. |
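
The binary "item expansion" coding described in the abstract can be illustrated with a minimal sketch. It is not the paper's proposed coding method, whose details are not given here. It assumes that a key action is a short contiguous run of actions and that a subject is coded 1 when the run appears anywhere in their action sequence. The action names, the key-action labels, and the helper names `contains_run` and `expand_items` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def contains_run(actions: Sequence[str], key: Sequence[str]) -> bool:
    """Return True if `key` occurs as a contiguous run inside `actions`."""
    n, m = len(actions), len(key)
    if m == 0:
        return True
    # Slide a window of length m over the action sequence.
    return any(list(actions[i:i + m]) == list(key) for i in range(n - m + 1))


def expand_items(
    sequences: List[Sequence[str]],
    key_actions: Dict[str, Sequence[str]],
) -> List[List[int]]:
    """Code each variable-length sequence as a fixed-length 0/1 vector.

    Each column corresponds to one key action (an "expanded item"):
    1 for "contains", 0 for "does not contain".
    """
    return [
        [int(contains_run(seq, key)) for key in key_actions.values()]
        for seq in sequences
    ]


# Hypothetical process data: three subjects, sequences of unequal length.
sequences = [
    ["start", "open_map", "select_route", "submit"],
    ["start", "reset", "reset", "submit"],
    ["start", "submit"],
]

# Hypothetical key actions: two skills and one misconception.
key_actions = {
    "skill_plan_route": ["open_map", "select_route"],
    "skill_submit": ["submit"],
    "misconception_repeated_reset": ["reset", "reset"],
}

coded = expand_items(sequences, key_actions)
for row in coded:
    print(row)
```

Each row of `coded` has one entry per key action, so sequences of unequal length become equal-length binary response vectors. Vectors in this form are what a diagnostic classification model such as DINA takes as input, together with a Q-matrix that links each expanded item to the skills or misconceptions it measures.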