TY - GEN
T1 - Adaptive resource remapping through live migration of virtual machines
AU - Atif, Muhammad
AU - Strazdins, Peter
PY - 2011
Y1 - 2011
N2 - In this paper we present ARRIVE-F, a novel open-source framework that addresses the issue of heterogeneity in compute farms. Unlike previous attempts, our framework is not based on linear frequency models and does not require source code modifications or off-line profiling. The heterogeneous compute farm is first divided into a number of virtualized homogeneous sub-clusters. The framework then carries out lightweight 'online' profiling of the CPU, communication and memory subsystems of all the active jobs in the compute farm. From this, it constructs a performance model to predict the execution time of each job on each distinct sub-cluster in the compute farm. Based upon the predicted execution times, the framework is able to relocate the compute jobs to the best-suited hardware platforms so that the overall throughput of the compute farm is increased. We utilize the live migration feature of virtual machine monitors to migrate a job from one sub-cluster to another. The prediction accuracy of our performance estimation model is over 80%. The implementation of ARRIVE-F is lightweight, with an overhead of 3%. Experiments on a synthetic workload of scientific benchmarks show that we are able to improve the throughput of a moderately heterogeneous compute farm by up to 25%, with a time saving of up to 33%.
AB - In this paper we present ARRIVE-F, a novel open-source framework that addresses the issue of heterogeneity in compute farms. Unlike previous attempts, our framework is not based on linear frequency models and does not require source code modifications or off-line profiling. The heterogeneous compute farm is first divided into a number of virtualized homogeneous sub-clusters. The framework then carries out lightweight 'online' profiling of the CPU, communication and memory subsystems of all the active jobs in the compute farm. From this, it constructs a performance model to predict the execution time of each job on each distinct sub-cluster in the compute farm. Based upon the predicted execution times, the framework is able to relocate the compute jobs to the best-suited hardware platforms so that the overall throughput of the compute farm is increased. We utilize the live migration feature of virtual machine monitors to migrate a job from one sub-cluster to another. The prediction accuracy of our performance estimation model is over 80%. The implementation of ARRIVE-F is lightweight, with an overhead of 3%. Experiments on a synthetic workload of scientific benchmarks show that we are able to improve the throughput of a moderately heterogeneous compute farm by up to 25%, with a time saving of up to 33%.
UR - http://www.scopus.com/inward/record.url?scp=80455140430&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-24650-0_12
DO - 10.1007/978-3-642-24650-0_12
M3 - Conference contribution
SN - 9783642246494
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 129
EP - 143
BT - Algorithms and Architectures for Parallel Processing - 11th International Conference, ICA3PP 2011, Proceedings
T2 - 11th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2011
Y2 - 24 October 2011 through 26 October 2011
ER -