Paper Title
Cyber Threat Hunting With Bag of Terms

Abstract
The goal of this paper is to address these two challenges by leveraging recent advancements in machine learning and, specifically, natural language processing. We propose a new framework called continuous bag of terms and time (CBoTT) that enables cybersecurity analysts to process large volumes of logs containing text-based process audits and determine whether any processes pose security risks. Our framework extends the popular continuous bag-of-words approach and identifies the processes that should be investigated with respect to not just what they do, but also when they are executed. The results of our analyses for three different injection schemes show that the CBoTT framework can identify anomalies at an average percentile range of 1.82 to 6.46. This is an improvement over the benchmark models, which detect anomalies at an average percentile range of 3.25 to 80.92. Overall, the CBoTT framework demonstrates superior performance compared to the benchmark models.

Keywords - Threat Hunting, Log Analysis, Anomaly Detection
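To make the percentile figures concrete, the following minimal sketch shows one plausible way such a rank could be computed. It assumes that every logged process receives an anomaly score, that higher scores mean more anomalous, and that a lower percentile means the injected process sits closer to the top of the analyst's review list. The function name, the scores, and the tie handling are hypothetical and are not taken from the paper.

```python
def anomaly_percentile(scores, target_index):
    """Percentile rank of one process when processes are ordered
    from most to least anomalous (assumed convention: lower is better)."""
    target = scores[target_index]
    # Count processes scored at least as anomalous as the target, itself included.
    at_least_as_anomalous = sum(1 for s in scores if s >= target)
    return 100.0 * at_least_as_anomalous / len(scores)


# Hypothetical anomaly scores for ten logged processes; index 3 is the injected one.
scores = [0.10, 0.20, 0.15, 0.90, 0.05, 0.30, 0.25, 0.12, 0.18, 0.22]
print(anomaly_percentile(scores, 3))  # 10.0
```

Under this reading, an average percentile of 1.82 would mean that an analyst who reviews processes in descending score order reaches the injected process after inspecting roughly the top 2% of the log.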