| In recent years, advanced network intrusion techniques such as the Advanced Persistent Threat (APT) have become increasingly prevalent, making malware analysis an active research topic in network security. Malware remains one of the most serious threats to the Internet. Through the long-running arms race between malware creators and security analysts, modern malware has developed new characteristics in both code structure and behavior. Sophisticated evasion tricks, complicated binary code, and complex network behaviors make modern malware challenging to analyze. With the goal of mining the complete execution behavior of malware, and from the perspective of the arms race between analysis techniques and analysis-evasion techniques, this thesis studies countermeasures against runtime packing and execution-environment-sensitive behavior in malware, the execution behavior model and its mining methods, and the protocol state machines used by malware together with network behavior mining methods. The main contributions of this thesis are as follows: 1. Previously proposed packed-malware detection approaches based on machine learning classify samples in a feature vector space derived from code structure. Their main limitation is that they cannot handle features that are missing because the malware creator has deliberately modified the sample. This thesis proposes A-ELM, an absent extreme learning machine algorithm, to address this issue. A-ELM treats the modified feature vector as a missing-data learning problem and applies a missing-data learning algorithm to detect packed malware samples with high performance and accuracy. As far as we know, A-ELM is the first algorithm that can handle missing data in packed malware detection. 2. This thesis proposes two countermeasures against execution-environment-sensitive behavior in malware. The first is DSlicing ESBD.
Instruction trace deviation detection is the basis for detecting the tricks malware uses to sense an emulation environment. However, such approaches require a reference execution platform, and it is very difficult to build one that has the same properties as a real execution environment. This thesis therefore proposes DSlicing ESBD, an environment-sensitive behavior detection method based on dynamic program slicing, which targets sensitive tricks that rely on environment fingerprint APIs. First, fingerprint APIs are defined according to the implementation mechanism of environment-sensitive behavior. Second, the instruction trace of the target binary is extracted on a dynamic analysis platform and inspected for specific execution patterns in order to locate the last conditional branch. Finally, a dynamic slicing algorithm is run on the instruction trace with respect to the condition variable of that last conditional branch, which captures the root cause of the instruction trace deviation. DSlicing ESBD requires no reference execution platform. The second is CT-ESBD. There is currently no automated detection method for eliminating anti-debugging code. CT-ESBD, a code-template-based method for detecting debugging-environment-sensitive behavior, is proposed for this purpose. A code template is a syntactic signature of anti-debugging code and can be created automatically or manually. The approach operates on an instruction trace extracted from a dynamic analysis platform: an instruction-based matching algorithm is run over the trace and the list of code templates to detect anti-debugging code, and the detected anti-debugging tricks are then eliminated. 3. Analysis results from classic dynamic analysis tools are not sufficient to understand and model the malicious behavior of target samples accurately, because such tools follow only a single execution path and therefore produce an incomplete analysis report.
In this thesis, a malware dissection method is proposed that combines the code structure and the execution behavior of target malware samples. First, the dynamic execution of a binary is modeled as a Labeled Transition System. Second, the behavior flow graph (BFG) is proposed to describe execution behavior. The BFG is a fine-grained behavior model tied to code structure; it can be used to correlate behaviors on different paths, and it gives a complete view of the execution behavior generated by target samples. Finally, BFG-E, a BFG extraction method based on multiple-path exploration, is proposed for building BFGs. BFG-E runs on a dynamic analysis platform and uses simple forced execution to explore program paths. The BFG is built from the control flow graph and the behavior dependence graph, and a BFG compression algorithm is proposed to simplify it. This complete behavior view of target samples is the defining characteristic of the BFG and makes it suitable for modern malware analysis. 4. Previously proposed malware network behavior analysis methods rely mainly on passive monitoring to capture network packets, which is a lengthy process and cannot capture all types of packets. This thesis proposes a binary-analysis-based method for capturing specific C&C network packets. First, a protocol state machine for binary analysis, b PSM, is proposed. Its properties are defined, and several theorems are proved showing that the WHILE-SWITCH code structure used for C&C network activities can be approximated by b PSM. Second, BANTIon Bot SC, a binary-analysis-based approach to inducing C&C network traffic from bot-like malware quickly and completely, is proposed. The idea of BANTIon Bot SC is to actively craft packets that trigger the execution of particular code paths and thereby expose the corresponding malicious network behaviors. Third, the approach is implemented on a dynamic analysis platform.
With the assistance of packet-functionality rules, the functionality of several packets can be labeled. BANTIon Bot SC improves packet capture efficiency and can also be used for other kinds of network behavior mining in malware. 5. With the ongoing arms race between malware analysis and analysis evasion, malware creation techniques have steadily improved. Modern malware has developed new structural and functional characteristics that require a malware analysis system equipped with countermeasures. In this thesis, a comprehensive malware analysis system named i Panda is designed and a prototype is implemented. First, the intrinsic properties of a modern malware analysis platform are analyzed and explained. Second, the architecture of i Panda is presented, based on the approaches proposed in this thesis, and its modules and their implementation are described in detail. Finally, analysis results for several malware samples are shown to demonstrate the effectiveness of i Panda. |
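
As an illustration of the code-template matching idea in contribution 2 (CT-ESBD), the following is a minimal sketch and not the thesis's implementation. It assumes a simplified trace record of `(mnemonic, operand text)` pairs and a hypothetical template format in which each step names a mnemonic and an optional operand substring; the names `IS_DEBUGGER_PRESENT`, `step_matches`, `find_template`, and `scan` are introduced here for illustration only.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Simplified trace record (an assumption): (mnemonic, operand text).
Instr = Tuple[str, str]
# Hypothetical template step: (mnemonic, operand substring or None for "any operand").
Step = Tuple[str, Optional[str]]

# Illustrative template for a common anti-debugging check:
# call IsDebuggerPresent, test the result, branch on it.
IS_DEBUGGER_PRESENT: List[Step] = [
    ("call", "IsDebuggerPresent"),
    ("test", "eax"),
    ("jne", None),
]


def step_matches(instr: Instr, step: Step) -> bool:
    """True if the instruction has the step's mnemonic and contains its operand substring."""
    mnemonic, operand = step
    return instr[0] == mnemonic and (operand is None or operand in instr[1])


def find_template(trace: Sequence[Instr], template: Sequence[Step]) -> List[int]:
    """Return the start index of every contiguous trace window that matches the template."""
    n = len(template)
    return [
        start
        for start in range(len(trace) - n + 1)
        if all(step_matches(trace[start + k], template[k]) for k in range(n))
    ]


def scan(trace: Sequence[Instr], templates: Dict[str, Sequence[Step]]) -> Dict[str, List[int]]:
    """Match every template in the list against the trace."""
    return {name: find_template(trace, tpl) for name, tpl in templates.items()}


# Made-up trace fragment for demonstration.
trace: List[Instr] = [
    ("push", "ebp"),
    ("call", "dword ptr [kernel32.IsDebuggerPresent]"),
    ("test", "eax, eax"),
    ("jne", "0x401050"),
    ("mov", "eax, 1"),
]

hits = scan(trace, {"IsDebuggerPresent": IS_DEBUGGER_PRESENT})
print(hits)  # {'IsDebuggerPresent': [1]}
```

Each reported index marks where a matched anti-debugging sequence begins in the trace. A real system would match on decoded instructions rather than text and would then neutralize the matched check; both steps are outside the scope of this sketch.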