Quantifying the resilience of machine learning classifiers used for cyber security

February 5, 2018

Z Katzir, Y Elovici

Expert Systems with Applications 92, 419-429, 2018

The use of machine learning algorithms for cyber security purposes gives rise to questions of adversarial resilience, namely: Can we quantify the effort required of an adversary to manipulate a system that is based on machine learning techniques? Can the adversarial resilience of such systems be formally modeled and evaluated? Can we quantify this resilience such that different systems can be compared using empiric metrics?

Past works have demonstrated how an adversary can manipulate a system based on machine learning techniques by changing some of its inputs. However, comparatively little work has emphasized the creation of a formal method for measuring and comparing the adversarial resilience of different machine learning models to these changes.

In this work we study the adversarial resilience of detection systems based on supervised machine learning models. We provide a formal definition for adversarial resilience while focusing on multisensory fusion systems. We define the model robustness (MRB) score, a metric for evaluating the relative resilience of different models, and suggest two novel feature selection algorithms for constructing adversary aware classifiers. The first algorithm selects only features that cannot realistically be modified by the adversary, while the second algorithm allows control over the resilience versus accuracy tradeoff. Finally, we evaluate our approach with a real-life use case of dynamic malware classification using an extensive, up-to-date corpus of benign and malware executables. We demonstrate the potential of using adversary aware feature selection for building more resilient classifiers and provide empirical evidence supporting the inherent resilience of ensemble algorithms compared to single model algorithms.
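The abstract names the two adversary-aware feature selection strategies without giving their details. The Python sketch below is one possible reading of them, not the authors' algorithms. The `Feature` fields, the `relevance` and `manipulation_cost` scores, the `min_cost` and `k` parameters, and the feature names are all hypothetical, introduced only to show how a single threshold could move a selector between favouring accuracy and favouring resilience.

```python
import math
from typing import List, NamedTuple


class Feature(NamedTuple):
    name: str
    relevance: float          # hypothetical usefulness score for classification
    manipulation_cost: float  # hypothetical adversary effort; math.inf = cannot be modified


def select_unmodifiable(features: List[Feature]) -> List[str]:
    """First strategy (as read from the abstract): keep only features
    the adversary cannot realistically modify."""
    return [f.name for f in features if math.isinf(f.manipulation_cost)]


def select_with_tradeoff(features: List[Feature], min_cost: float, k: int) -> List[str]:
    """Second strategy (assumed form): admit features whose manipulation cost
    is at least min_cost, then keep the k most relevant of them.
    A low min_cost favours accuracy; a high min_cost favours resilience."""
    eligible = [f for f in features if f.manipulation_cost >= min_cost]
    eligible.sort(key=lambda f: f.relevance, reverse=True)
    return [f.name for f in eligible[:k]]


# Hypothetical features of a dynamic malware classifier, for illustration only.
FEATURES = [
    Feature("api_call_count", 0.9, 1.0),
    Feature("network_beacon", 0.7, 5.0),
    Feature("kernel_event", 0.6, math.inf),
    Feature("file_name_entropy", 0.4, 0.5),
]

if __name__ == "__main__":
    print(select_unmodifiable(FEATURES))
    print(select_with_tradeoff(FEATURES, min_cost=0.0, k=2))
    print(select_with_tradeoff(FEATURES, min_cost=5.0, k=2))
```

In this sketch, setting `min_cost` to `math.inf` reduces the second selector to the first, which is one simple way to think about the resilience versus accuracy tradeoff the abstract describes.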