Approximating Aggregated SQL Queries with LSTM Networks

Nir Regev, Lior Rokach, Asaf Shabtai

2021 International Joint Conference on Neural Networks (IJCNN), 1-8, 2021

Despite continuous investments in data technologies, the latency of querying data still poses a significant challenge. Modern analytic solutions require near real-time responsiveness both to make them interactive and to support automated processing. Current technologies (Hadoop, Spark, Dataflow) scan the dataset to execute queries and focus on providing scalable data storage and in-memory concurrent data processing to maximize task execution speed. We argue that these solutions fail to offer an adequate level of interactivity, since they depend on continual access to data. In this paper, we present a method for query approximation, also known as approximate query processing (AQP), that reduces the need to scan data during inference (query calculation), thus enabling a rapid query processing tool. We use an LSTM network to learn the relationship between queries and their results, and to provide a rapid …
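The idea of answering a query from a learned model rather than from a scan can be illustrated with a minimal sketch. It is not the authors' implementation: the excerpt above does not describe their query encoding, network architecture, or training procedure, so the tokenizer, the hashed vocabulary, the layer sizes, and every name below (`tokenize`, `TinyLSTMRegressor`, `predict`) are assumptions made for illustration. The weights are random and untrained, so the returned number is meaningless as an estimate. The sketch only shows that inference reads the query text and the model parameters, never the table rows.

```python
import math
import random
import re
import zlib


def tokenize(query):
    """Split a SQL string into lowercase word, number, and symbol tokens."""
    return re.findall(r"[a-z_]+|\d+(?:\.\d+)?|[^\sa-z_\d]", query.lower())


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class TinyLSTMRegressor:
    """Hypothetical single-layer LSTM with a linear head (untrained)."""

    def __init__(self, vocab_size=64, embed_dim=4, hidden_dim=8, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.vocab_size = vocab_size
        self.hidden_dim = hidden_dim
        self.embedding = matrix(vocab_size, embed_dim)
        # One (weights, bias) pair per gate:
        # i = input, f = forget, g = cell candidate, o = output.
        self.gates = {
            name: (matrix(hidden_dim, embed_dim + hidden_dim),
                   [0.0] * hidden_dim)
            for name in "ifgo"
        }
        self.head = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        self.head_bias = 0.0

    def _affine(self, name, x):
        weights, bias = self.gates[name]
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weights, bias)]

    def predict(self, query):
        """Map a query string to one scalar without touching any data."""
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        for token in tokenize(query):
            # Assumption: tokens are hashed into a fixed-size vocabulary.
            token_id = zlib.crc32(token.encode()) % self.vocab_size
            x = self.embedding[token_id] + h  # concatenate [embedding, h]
            i = [sigmoid(v) for v in self._affine("i", x)]
            f = [sigmoid(v) for v in self._affine("f", x)]
            g = [math.tanh(v) for v in self._affine("g", x)]
            o = [sigmoid(v) for v in self._affine("o", x)]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return sum(w * v for w, v in zip(self.head, h)) + self.head_bias


model = TinyLSTMRegressor()
estimate = model.predict("SELECT AVG(price) FROM sales WHERE region = 3")
```

In a real system the parameters would be fitted on pairs of queries and their true aggregated results, for example with a framework such as PyTorch or TensorFlow. That training step is omitted here.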