Not all datasets are born equal: On heterogeneous tabular data and adversarial examples

Yael Mathov, Eden Levy, Ziv Katzir, Asaf Shabtai, Yuval Elovici

Knowledge-Based Systems 242, 108377, 2022

Recent work on adversarial learning has mainly focused on neural networks and domains in which those networks excel, such as computer vision and audio processing. Typically, the data in those domains is homogeneous, whereas domains with heterogeneous tabular datasets remain underexplored, despite their prevalence. When searching for adversarial patterns within heterogeneous input spaces, an attacker must simultaneously preserve the complex domain-specific validity rules of the data and the adversarial nature of the identified samples. As such, applying adversarial manipulations to heterogeneous datasets has proven challenging, and a generic attack method has not yet been proposed. However, this study argues that machine learning models trained on heterogeneous tabular data are as susceptible to adversarial manipulations as those trained on continuous or homogeneous data, such as images …