MaskDGA: A black-box evasion technique against DGA classifiers and adversarial defenses

Lior Sidi, Asaf Nadler, Asaf Shabtai

arXiv preprint arXiv:1902.08909, 2019

Domain generation algorithms (DGAs) are commonly used by botnets to generate domain names through which bots can establish a resilient communication channel with their command and control servers. Recent publications presented deep learning, character-level classifiers that are able to detect algorithmically generated domain (AGD) names with high accuracy, and correspondingly, significantly reduce the effectiveness of DGAs for botnet communication. In this paper we present MaskDGA, a practical adversarial learning technique that adds perturbation to the character-level representation of algorithmically generated domain names in order to evade DGA classifiers, without the attacker having any knowledge about the DGA classifier’s architecture and parameters. MaskDGA was evaluated using the DMD-2018 dataset of AGD names and four recently published DGA classifiers, in which the average F1-score of the classifiers degrades from 0.977 to 0.495 when applying the evasion technique. An additional evaluation was conducted using the same classifiers but with adversarial defenses implemented: adversarial re-training and distillation. The results of this evaluation show that MaskDGA can be used for improving the robustness of the character-level DGA classifiers against adversarial attacks, but that ideally DGA classifiers should incorporate additional features alongside character-level features that are demonstrated in this study to be vulnerable to adversarial attacks.