Helix: DGA domain embeddings for tracking and exploring botnets

Lior Sidi, Yisroel Mirsky, Asaf Nadler, Yuval Elovici, Asaf Shabtai

Proceedings of the 29th ACM International Conference on Information …, 2020

Botnets have been using domain generation algorithms (DGA) for over a decade to covertly and robustly identify the domain name of their command and control servers (C&C). Recent advancements in DGA detection has motivated botnet owners to rapidly alter the C&C domain and use adversarial techniques to evade detection. As a result, it has become increasingly difficult to track botnets in DNS traffic. In this paper, we present Helix, a method for tracking and exploring botnets. Helix uses a spatio-temporal deep neural network autoencoder to convert domains into numerical vectors (embeddings) which capture the DGA and seed used to create the domain. This is made possible by leveraging both convolutional (spatial) and recurrent (temporal) layers, and by using techniques such as attention mechanisms and highways. Furthermore, by using an autoencoder architecture, the network can be trained in an …