A Dynamic Analysis Framework for Classifying Malicious Webpages

Date of Award

12-1-2023

Degree Name

M.C.S. (Master of Computer Science)

Department

Department of Computer Science

Advisor/Chair

Advisor: Phu Phung

Abstract

In today’s interconnected digital landscape, the surge in malicious websites has led to a significant number of cyber-attacks and data breaches. These sites largely rely on JavaScript to execute attacks in web browsers. Attackers can increasingly evade traditional mechanisms, such as lexical analysis, content examination, and blacklists, through code obfuscation, which disguises the true intent of the code, and through polymorphic or metamorphic code, which alters itself with each execution. These techniques make it difficult for traditional static analysis tools to detect the dynamically generated or altered code characteristic of sophisticated, evolving threats. In response to these challenges, notable research has proposed dynamic approaches that monitor JavaScript behavior in real time. These dynamic methods can identify malicious patterns and activities, offering a significant advantage over static analysis by detecting and mitigating threats as they occur. This thesis introduces a runtime analysis method for JavaScript that covers all JavaScript executions, including code generated on the fly and code that uses advanced evasion techniques. Our approach centers on the security reference monitor technique, which mediates JavaScript’s security-sensitive operations during execution. This includes closely monitoring function calls and property accesses, so that runtime behaviors are captured thoroughly and the risk of attack is mitigated regardless of the code’s structure or the obfuscation techniques employed. We have implemented this method as an extension for the Chromium browser that intercepts and logs about 59 security-sensitive JavaScript operations, demonstrating its applicability in a real-world web browsing environment.
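The mediation idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the names opLog, mediateCall, mediateProperty, and fakeDocument are hypothetical, and a plain object stands in for a browser object so that the example runs outside a page.

```javascript
// Hypothetical sketch of reference-monitor style mediation.
// None of these names come from the thesis.
const opLog = [];

// Replace obj[name] with a wrapper that logs each call, then delegates.
function mediateCall(obj, name, label) {
  const original = obj[name];
  obj[name] = function (...args) {
    opLog.push({ op: label, kind: "call" });
    return original.apply(this, args);
  };
}

// Replace a data property with an accessor that logs reads and writes.
function mediateProperty(obj, prop, label) {
  let value = obj[prop];
  Object.defineProperty(obj, prop, {
    configurable: true,
    enumerable: true,
    get() {
      opLog.push({ op: label, kind: "get" });
      return value;
    },
    set(v) {
      opLog.push({ op: label, kind: "set" });
      value = v;
    },
  });
}

// Stand-in for a browser object (assumption made for runnability).
const fakeDocument = {
  cookie: "",
  write(markup) {
    return markup.length;
  },
};

mediateCall(fakeDocument, "write", "document.write");
mediateProperty(fakeDocument, "cookie", "document.cookie");

// Code assembled at run time still passes through the wrappers, because
// mediation is attached to the object rather than to the source text.
const generated = new Function(
  "doc",
  "doc.write('<p>hi</p>'); doc.cookie = 'a=1'; return doc.cookie;"
);
const result = generated(fakeDocument);
```

Because the wrappers sit on the object itself, the log records the call, the write, and the read no matter how the calling code was produced or obfuscated.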
To evaluate the effectiveness of our framework, we have developed a toolset that automates running the Chromium browser with our extension on a large-scale raw dataset of approximately 13,900 malicious and 13,500 benign websites. We used the number of executions of each operation on each website as features and collected approximately 27,000 labeled records to train machine learning models. Preliminary results indicate that our approach is effective, identifying malicious JavaScript content with promising accuracy.
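The feature construction described above, one count per monitored operation per website, might look roughly like the following sketch. The operation list, the record layout, and the function names are assumptions made for illustration; the abstract does not enumerate the roughly 59 operations.

```javascript
// Hypothetical operation list (the real set is not given in the abstract).
const OPERATIONS = ["document.write", "eval", "document.cookie"];

// Turn the list of operation names logged for one site into a count vector
// whose positions follow the order of `operations`.
function toFeatureVector(loggedOps, operations = OPERATIONS) {
  const counts = new Map(operations.map((op) => [op, 0]));
  for (const op of loggedOps) {
    // Operations outside the monitored set are ignored.
    if (counts.has(op)) counts.set(op, counts.get(op) + 1);
  }
  return operations.map((op) => counts.get(op));
}

// Pair the vector with the site's label to form one training record.
function toRecord(url, loggedOps, label) {
  return { url, features: toFeatureVector(loggedOps), label };
}

const record = toRecord(
  "http://example.test/",
  ["eval", "document.write", "eval", "setTimeout"],
  "malicious"
);
// record.features is [1, 2, 0] for the operation order above.
```

A fixed operation order keeps every record the same length, which is what most tabular classifiers expect as input.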

Keywords

Malware analysis, browser APIs, JavaScript APIs, dynamic analysis, security, malware detection, behavioral analysis, evasion

Rights Statement

Copyright © 2023, author.
