Resource and performance distribution prediction for large scale analytics queries

Alireza Khoshkbarforoushha, Rajiv Ranjan

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    6 Citations (Scopus)

    Abstract

    Efficient estimation of the resource consumption and performance of data-intensive workloads is central to the design and development of workload management techniques. Recent work has explored the efficacy of distribution-based estimation of workload performance, as opposed to single-point prediction, for a number of workload management problems such as query scheduling and admission control. However, the proposed approaches lack an efficient method for predicting the workload performance distribution: they simply assume that the probability density function (pdf) of the target value is already available. This paper addresses this problem for Hive queries, an integral part of big data analytics workloads. To this end, we combine knowledge of Hive query execution with a novel use of mixture density networks to predict the whole spectrum of resource usage and performance as probability density functions. We evaluate our technique using the TPC-H benchmark, showing that it not only produces accurate pdf predictions but also outperforms state-of-the-art single-point techniques in half of the experiments.
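
    The abstract's central idea, predicting a full pdf rather than a single value, can be illustrated with a minimal sketch of how the outputs of a mixture density network are typically turned into a Gaussian-mixture density. This is not the authors' implementation: the function names, the choice of Gaussian components, and the example numbers are all assumptions made for illustration, and the raw outputs would normally come from a trained network that takes query features as input.

```python
import math


def softmax(logits):
    """Turn unconstrained scores into mixture weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_pdf(y, weight_logits, means, log_sigmas):
    """Density of target value y under a Gaussian mixture.

    weight_logits, means, log_sigmas stand in for the raw outputs of a
    (hypothetical) mixture density network for one query.
    """
    weights = softmax(weight_logits)
    density = 0.0
    for w, mu, log_s in zip(weights, means, log_sigmas):
        sigma = math.exp(log_s)  # exponentiate so sigma is always positive
        z = (y - mu) / sigma
        density += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return density


def mixture_mean(weight_logits, means):
    """Single-point summary (the mixture mean) derived from the same outputs."""
    weights = softmax(weight_logits)
    return sum(w * mu for w, mu in zip(weights, means))


if __name__ == "__main__":
    # Made-up network outputs for one query: two modes of execution time (seconds).
    logits, mus, log_sigmas = [0.2, -0.5], [12.0, 30.0], [math.log(2.0), math.log(5.0)]
    for seconds in (10.0, 12.0, 20.0, 30.0):
        print(seconds, mixture_pdf(seconds, logits, mus, log_sigmas))
    print("mean:", mixture_mean(logits, mus))
```

    Under these assumptions, a scheduler or admission controller could query the density at any candidate value, or integrate it over an interval, instead of relying on one predicted number; the mixture mean remains available when a point estimate is needed.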

    Original language: English
    Title of host publication: ICPE 2016 - Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering
    Publisher: Association for Computing Machinery, Inc
    Pages: 49-54
    Number of pages: 6
    ISBN (Electronic): 9781450340809
    Publication status: Published - 12 Mar 2016
    Event: 7th ACM/SPEC International Conference on Performance Engineering, ICPE 2016 - Delft, Netherlands
    Duration: 12 Mar 2016 to 16 Mar 2016

    Publication series

    Name: ICPE 2016 - Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering

    Conference

    Conference: 7th ACM/SPEC International Conference on Performance Engineering, ICPE 2016
    Country/Territory: Netherlands
    City: Delft
    Period: 12/03/16 to 16/03/16
