Elasticity in a Task-based Dataflow Runtime Through Inter-node GPU Work Stealing

Joseph John*, Josh Milthorpe

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Most contemporary HPC programming models assume an inelastic runtime in which the resources allocated to an application remain fixed throughout its execution. Conversely, elastic runtimes can expand and shrink resources based on availability and/or dynamic application requirements. In this paper, we implement elasticity for PaRSEC, a task-based dataflow runtime, using inter-node GPU work stealing. In addition to supporting elasticity, we demonstrate that inter-node GPU work stealing can enhance the performance of imbalanced applications by up to 45%.

Original language: English
Title of host publication: Proceedings of the 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024
Place of Publication: Philadelphia
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 97-105
Number of pages: 9
ISBN (Electronic): 979-8-3503-9566-2
ISBN (Print): 979-8-3503-9567-9
DOIs
Publication status: Published - 8 Oct 2024
Event: 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 - Philadelphia, United States
Duration: 6 May 2024 to 9 May 2024

Publication series

Name: Proceedings - IEEE International Symposium on Cluster, Cloud and Internet Computing CCGrid
Publisher: IEEE
Number: 24
ISSN (Print): 2376-4414
ISSN (Electronic): 2993-2114

Conference

Conference: 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024
Country/Territory: United States
City: Philadelphia
Period: 6/05/24 to 9/05/24
