Techniques for cyber-attack comprehension through analysis of application level data
Malicious activity represents a credible and growing threat to the confidentiality, integrity and availability of information assets in modern computing environments. Intrusion detection, which studies the detection and mitigation of cyber-attacks, is a mature area of research that has led to the development of widely used applications called Intrusion Detection Systems (IDSs). These IDSs typically focus on analyzing low-level system and network data (e.g., system calls, network packets) using rule-based and anomaly-based techniques to detect obvious malicious activity such as probes (e.g., port scanning) and denial-of-service (DoS) attacks. However, with the evolution of computer systems and networks, and the accompanying growth of the Internet and its user base, cyber-attacks have become more sophisticated. Multi-stage, goal-oriented attacks are increasingly prevalent: such attacks are not designed simply to take down a system and affect its availability, but may involve intrusion followed by actions that affect the confidentiality and integrity of the system or network in question (e.g., accessing unauthorized data). Several techniques for the detection of such attacks have been proposed in the literature, but mainly as aids to forensic analysis (i.e., they are not online). There has also been a lack of in-depth study into recognizing the semantics of attack scenario progression. As a consequence, prior approaches have not been able to provide analysts with adequate awareness of evolving attacks, awareness that might enable timely mitigation. The thrust of this dissertation is the development of cyber-attack detection and comprehension techniques that focus on high-level application data (IDS events, logfile entries, user queries, etc.) as opposed to network packets and system calls. Restricting analysis to high-level data allows attack semantics to be better captured and represented; this benefit is leveraged to provide improved awareness of attacks.
Online detection techniques using rule-based and learning-based approaches are developed that aim to provide security analysts with the means for attack recognition (when is an attack happening?) and comprehension (attack semantics). In the first part of this dissertation, attack scenario detection is approached from a Situation Awareness (SA) perspective. Events from IDS sensors are considered as atomic elements that define a situation (Level 1 SA), and a semantics-based attack modeling framework is used to understand the overall meaning conveyed by situation elements (Level 2 SA). A rule-based approach to event correlation and suitable visualization tools enable effective comprehension that provides analysts with a predictive and mitigative capability (Level 3 SA). A learning-based approach to attack scenario comprehension in a distributed network is the focus of the second part of the dissertation. Macro-level activity in a computer network is analyzed with a view to detecting abnormal behavior that may indicate malicious activity. Events generated by multiple heterogeneous sensors such as IDSs and system logs are used to define a high-dimensional state vector representing overall activity; Principal Component Analysis is used to learn characteristic patterns of activity and aid in anomaly detection. A suitable modeling framework and visualization techniques are also presented for this approach. In the final part of this dissertation, a specific attack model in a specific application environment is analyzed: insider attacks against relational databases. The thrust of this work is a data-centric approach that models queries based on the data returned by their execution, as opposed to their SQL-expression syntax (syntax-centric). Various types of query anomalies are analyzed from the data-centric viewpoint, and efficient techniques for detecting potential attacks are developed.
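The data-centric idea can be illustrated with a minimal sketch. It is not the dissertation's actual method: the choice of summary statistics, the z-score comparison, the function names (result_signature, build_profile, anomaly_score), the min_std floor and the sample data are all assumptions made for illustration. Each query is represented by a fixed-length vector summarizing the rows it returned, and a new query is scored against a profile built from earlier result sets.

```python
from statistics import mean, pstdev


def result_signature(rows, n_cols):
    """Summarize a query's result set as a fixed-length numeric vector.

    Layout: [row_count, then (min, max, mean) for each numeric column].
    An empty result set maps to an all-zero vector.
    """
    if not rows:
        return [0] + [0.0] * (3 * n_cols)
    signature = [len(rows)]
    for col in zip(*rows):  # iterate column-wise over the returned rows
        signature += [min(col), max(col), mean(col)]
    return signature


def build_profile(signatures):
    """Per-dimension (mean, standard deviation) over past signatures."""
    return [(mean(dim), pstdev(dim)) for dim in zip(*signatures)]


def anomaly_score(signature, profile, min_std=1.0):
    """Largest absolute z-score across dimensions.

    min_std is a hypothetical floor that avoids dividing by a near-zero
    deviation when the history is small.
    """
    return max(
        abs(x - mu) / max(sd, min_std)
        for x, (mu, sd) in zip(signature, profile)
    )


# Hypothetical result sets: (salary in thousands, years of service).
history = [
    [(50.0, 3), (55.0, 5)],
    [(60.0, 8), (52.0, 4), (58.0, 6)],
    [(51.0, 2), (57.0, 7)],
]
profile = build_profile([result_signature(rows, 2) for rows in history])

typical = [(52.0, 3), (56.0, 6)]
bulk = [(40.0 + i, i % 30) for i in range(200)]  # e.g. a whole-table dump

typical_score = anomaly_score(result_signature(typical, 2), profile)
bulk_score = anomaly_score(result_signature(bulk, 2), profile)
```

In this sketch, two queries with identical SQL text could still receive very different scores if one returns a handful of rows and the other returns most of a table, which is the kind of distinction a purely syntax-centric model would miss.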
The techniques presented in this dissertation are tested and validated with test and attack datasets generated in realistic environments. Attack detection through application data analysis is found to offer significant benefits to the practice of cyber-security: ease of data handling and an improved ability to capture the semantics of malicious activity are among the important contributions.