TY - GEN
T1 - Resource distribution estimation for Data-Intensive workloads
T2 - Workshops on CLIoT, WAS4FI, SeaClouds, CloudWay, IDEA, FedCloudNet 2015 held in conjunction with European Conference on Service-Oriented and Cloud Computing, ESOCC 2015
AU - Khoshkbarforoushha, Alireza
AU - Ranjan, Rajiv
AU - Strazdins, Peter
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2016.
PY - 2016
Y1 - 2016
AB - Robust resource share estimation of data-intensive workloads is integral to efficient workload management in a (virtualized) cluster where multiple systems co-exist and share the same infrastructure. However, developing a reliable resource estimator is challenging due to (i) heterogeneity of workloads (e.g. stream processing, batch processing, transactional) in a multi-system shared cluster, (ii) limited (in batch processing) or complete (in stream processing) uncertainty about input data size or arrival rates, and (iii) configurations that change from run to run. To address these challenges, we propose an inclusive framework and related techniques for workload profiling, similar-job identification, and resource distribution prediction in a cluster. Our analysis shows that the framework can successfully estimate the whole spectrum of resource usage as probability distribution functions for a wide range of data-intensive workloads.
KW - Big data workload
KW - Data-intensive systems
KW - Distribution prediction
KW - Multi-cluster workload management
KW - Resource estimation
UR - http://www.scopus.com/inward/record.url?scp=84966501427&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-33313-7_17
DO - 10.1007/978-3-319-33313-7_17
M3 - Conference contribution
SN - 9783319333120
T3 - Communications in Computer and Information Science
SP - 228
EP - 237
BT - Advances in Service-Oriented and Cloud Computing - Workshops of ESOCC 2015, Revised Selected Papers
A2 - Celesti, Antonio
A2 - Leitner, Philipp
PB - Springer Verlag
Y2 - 15 September 2015 through 17 September 2015
ER -