TY - JOUR
T1 - Distribution Based Workload Modelling of Continuous Queries in Clouds
AU - Khoshkbarforoushha, Alireza
AU - Ranjan, Rajiv
AU - Gaire, Raj
AU - Abbasnejad, Ehsan
AU - Wang, Lizhe
AU - Zomaya, Albert Y.
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2017/1/1
Y1 - 2017/1/1
N2 - Resource usage estimation for managing streaming workload in emerging application domains such as enterprise computing, smart cities, remote healthcare, and astronomy has emerged as a challenging research problem. Such resource estimation for processing continuous queries over streaming data is challenging due to: (i) uncertain stream arrival patterns, (ii) the need to process different mixes of queries, and (iii) varying resource consumption. Existing techniques approximate resource usage for a query as a single point value, which may not be sufficient because it is neither expressive enough nor does it capture the aforementioned nature of streaming workload. In this paper, we present a novel approach of using mixture density networks to estimate the whole spectrum of resource usage as probability density functions. We have evaluated our technique using the Linear Road Benchmark and TPC-H in both private and public clouds. The efficiency and applicability of the proposed approach are demonstrated via two novel applications: (i) predictable auto-scaling policy setting, which highlights the potential of distribution prediction in the consistent definition of cloud elasticity rules; and (ii) a distribution-based admission controller, which is able to efficiently admit or reject incoming queries based on probabilistic service level agreement compliance goals.
AB - Resource usage estimation for managing streaming workload in emerging application domains such as enterprise computing, smart cities, remote healthcare, and astronomy has emerged as a challenging research problem. Such resource estimation for processing continuous queries over streaming data is challenging due to: (i) uncertain stream arrival patterns, (ii) the need to process different mixes of queries, and (iii) varying resource consumption. Existing techniques approximate resource usage for a query as a single point value, which may not be sufficient because it is neither expressive enough nor does it capture the aforementioned nature of streaming workload. In this paper, we present a novel approach of using mixture density networks to estimate the whole spectrum of resource usage as probability density functions. We have evaluated our technique using the Linear Road Benchmark and TPC-H in both private and public clouds. The efficiency and applicability of the proposed approach are demonstrated via two novel applications: (i) predictable auto-scaling policy setting, which highlights the potential of distribution prediction in the consistent definition of cloud elasticity rules; and (ii) a distribution-based admission controller, which is able to efficiently admit or reject incoming queries based on probabilistic service level agreement compliance goals.
KW - Data stream processing workload
KW - continuous query
KW - distribution-based admission controller
KW - predictable auto-scaling policy
KW - resource usage estimation
UR - http://www.scopus.com/inward/record.url?scp=85015359525&partnerID=8YFLogxK
U2 - 10.1109/TETC.2016.2597546
DO - 10.1109/TETC.2016.2597546
M3 - Article
SN - 2168-6750
VL - 5
SP - 120
EP - 133
JO - IEEE Transactions on Emerging Topics in Computing
JF - IEEE Transactions on Emerging Topics in Computing
IS - 1
M1 - 7529058
ER -