
Non-Parametric Error Estimation for σ-AQP using Optimized Bootstrap Sampling

  • Youngstown State University
  • Texas Tech University
  • University of Akron
  • Cleveland State University

Research output: Contribution to journal › Article › peer-review

1 Scopus citations

Abstract

Approximate query processing (AQP) aims to quickly provide approximate answers to time-consuming search queries on large datasets. It brings substantial benefits in data science when query execution efficiency matters more than accuracy. However, assessing the accuracy of an approximate answer produced by AQP remains understudied. Existing work usually relies on strict assumptions about the data that real-world datasets often do not satisfy. In this work, we employ a non-parametric statistical method, bootstrap sampling, to assess the errors of an AQP system for selection queries (σ-AQP). We implement a prototype AQP system integrated with a bootstrap sampling engine that estimates the standard deviation and produces confidence intervals for selection query estimates. Extensive experiments with the prototype system demonstrate that the generated confidence intervals cover the ground-truth query results with high accuracy and at low computing cost. In addition, we introduce optimization strategies for bootstrap sampling that improve the overall computing efficiency of the prototype AQP system.
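As a rough illustration of the idea described in the abstract, the sketch below bootstraps an error estimate for a selection COUNT query answered from a sample. It is not the authors' prototype or their optimized engine: the function name `bootstrap_selection_estimate`, its parameters, the assumption of a uniform random sample, and the choice of a percentile confidence interval are all hypothetical choices made for this example.

```python
import random
import statistics


def bootstrap_selection_estimate(sample, predicate, population_size,
                                 n_boot=1000, alpha=0.05, seed=0):
    """Estimate how many population rows satisfy `predicate`, with bootstrap error bars.

    Assumes `sample` is a uniform random sample of the population
    (an assumption of this sketch, not a claim about the paper's system).
    """
    rng = random.Random(seed)
    # Reduce each sampled row to a 0/1 indicator of the selection predicate.
    hits = [1 if predicate(row) else 0 for row in sample]
    n = len(hits)
    # Point estimate: scale the sample selectivity up to the population size.
    point = population_size * sum(hits) / n

    # Bootstrap: resample the indicators with replacement and re-estimate.
    replicates = []
    for _ in range(n_boot):
        resample = rng.choices(hits, k=n)
        replicates.append(population_size * sum(resample) / n)
    replicates.sort()

    # Standard deviation of the replicates approximates the estimator's error.
    std_dev = statistics.stdev(replicates)
    # Percentile confidence interval at level 1 - alpha.
    lo = replicates[int((alpha / 2) * n_boot)]
    hi = replicates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, std_dev, (lo, hi)


if __name__ == "__main__":
    # Toy data: a sample of 1,000 rows standing in for a 10,000-row table.
    sample_rows = list(range(1000))
    estimate, sd, ci = bootstrap_selection_estimate(
        sample_rows, lambda x: x < 300, population_size=10_000)
    print(estimate, sd, ci)
```

Because only the 0/1 indicators are resampled, each bootstrap replicate avoids re-evaluating the predicate. This is a simple shortcut chosen for the sketch and should not be read as one of the optimization strategies the paper introduces.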
Original language: English
Pages (from-to): 38-47
Number of pages: 10
Journal: International Journal of Computers and their Applications
Volume: 29
Issue number: 1
State: Published - Mar 1 2022

Keywords

  • Approximate query processing
  • bootstrap sampling
  • error estimation
  • non-parametric method
