Lexical Link Analysis of Insider Threats Using Digital Forensics

Lexical link analysis is a powerful and relatively simple technique for finding associations between digital artifacts. We will apply it to three kinds of digital-forensic data (file paths, pages downloaded, and email addresses) in our corpus of drive images from 4000 computers and digital devices. Drive analysis is better suited to detecting insider threats than analysis of public information such as social-networking postings, because much of what a drive holds, including long-deleted data and networking details, is information its creator is unaware of. As we have done before in modeling overt criminal activity, we will supplement our corpus with our own drive images that simulate insider threats of various sorts, each involving exfiltration of a set of designated target files. We will stage these scenarios on laboratory computers and then image their disk drives to obtain our data. Once the data is extracted from the images, we will identify links in the three aforementioned kinds of artifacts, perform lexical analysis, and then visualize the resulting networks using both the Gephi open-source graphing tool and our own metric-based graph visualization tools. The goal is to demonstrate that we can recognize the suspicious behavior on our created drives against a background of normal behavior in our corpus.
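
As a rough illustration of the idea, the following sketch links drives that share uncommon word tokens in their artifacts. It is a minimal example under stated assumptions, not the method used in this work: the function names, the tokenization rule, the cutoff parameter, and the log inverse-frequency weighting are all hypothetical choices made for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations
from math import log


def tokenize(artifact):
    """Split a file path, URL, or email address into lowercase word tokens.

    Assumption: tokens of one or two characters are dropped as noise.
    """
    return {t for t in re.split(r"[^a-z0-9]+", artifact.lower()) if len(t) > 2}


def lexical_links(drives, max_drive_fraction=0.5):
    """Return {(drive_a, drive_b): weight} for drives sharing uncommon tokens.

    drives maps a drive name to a list of artifact strings.
    Assumption: a token seen on more than max_drive_fraction of the drives
    is treated as background vocabulary and ignored.
    """
    # Record which drives each token appears on.
    token_drives = defaultdict(set)
    for drive, artifacts in drives.items():
        for artifact in artifacts:
            for token in tokenize(artifact):
                token_drives[token].add(drive)

    n = len(drives)
    links = defaultdict(float)
    for owners in token_drives.values():
        # Skip tokens unique to one drive and tokens that are too common.
        if len(owners) < 2 or len(owners) > max_drive_fraction * n:
            continue
        # Hypothetical weighting: rarer shared tokens count for more.
        weight = log(n / len(owners))
        for a, b in combinations(sorted(owners), 2):
            links[(a, b)] += weight
    return dict(links)
```

The resulting weighted pairs form an edge list. Written out as CSV with Source, Target, and Weight columns, such a list is one of the formats a graphing tool like Gephi can import.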
