Model-driven optimisation of memory hierarchy and multithreading on GPUs

Andrew A. Haigh, Eric C. McCreath

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    Abstract

    Due to their potentially high peak performance and energy efficiency, GPUs are increasingly popular for scientific computations. However, the complexity of the architecture makes it difficult to write code that achieves high performance. Two of the most important factors in achieving high performance are the usage of the GPU memory hierarchy and the way in which work is mapped to threads and blocks. The dominant frameworks for GPU computing, CUDA and OpenCL, leave these decisions largely to the programmer. In this work, we address this in part by proposing a technique that simultaneously manages use of the GPU low-latency shared memory and chooses the granularity with which to divide the work (block size). We show that a relatively simple heuristic based on an abstraction of the GPU architecture is able to make these decisions and achieve average performance within 17% of an optimal configuration on an NVIDIA Tesla K20.
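
The abstract does not spell out the heuristic, so the following is only a rough sketch of the general idea of picking a block size from an abstract machine model. It is not the authors' method. It assumes a simple occupancy-style model in which each thread needs a fixed number of bytes of shared memory, and it picks the block size that keeps the most threads resident on one streaming multiprocessor (SM). The names `GpuModel`, `resident_threads`, and `choose_block_size` are hypothetical, and the default limits are assumed values in the range commonly quoted for Kepler-class devices such as the Tesla K20.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuModel:
    """Abstract per-SM resource limits (assumed values, not taken from the paper)."""
    warp_size: int = 32
    max_threads_per_block: int = 1024
    max_threads_per_sm: int = 2048
    max_blocks_per_sm: int = 16
    shared_mem_per_sm: int = 48 * 1024  # bytes


def resident_threads(gpu: GpuModel, block_size: int, shared_bytes_per_thread: int) -> int:
    """Threads that can be resident on one SM for a given block size."""
    shared_per_block = block_size * shared_bytes_per_thread
    if shared_per_block > gpu.shared_mem_per_sm:
        return 0  # a single block does not fit in shared memory
    # Each resource independently caps the number of resident blocks.
    limit_by_threads = gpu.max_threads_per_sm // block_size
    if shared_per_block > 0:
        limit_by_shared = gpu.shared_mem_per_sm // shared_per_block
    else:
        limit_by_shared = gpu.max_blocks_per_sm  # shared memory is not a constraint
    blocks = min(gpu.max_blocks_per_sm, limit_by_threads, limit_by_shared)
    return blocks * block_size


def choose_block_size(gpu: GpuModel, shared_bytes_per_thread: int) -> int:
    """Pick the warp-multiple block size with the most resident threads.

    Ties are broken in favour of the smaller block size; this tie-break
    is an arbitrary choice made for the sketch.
    """
    candidates = range(gpu.warp_size, gpu.max_threads_per_block + 1, gpu.warp_size)
    return max(
        candidates,
        key=lambda b: (resident_threads(gpu, b, shared_bytes_per_thread), -b),
    )
```

Under these assumed limits, `choose_block_size(GpuModel(), 0)` returns 128, and a kernel needing 32 bytes of shared memory per thread gets a block size of 96, since shared memory then caps residency at 1536 threads per SM. A real model would also need to account for registers, memory access patterns, and the cost of staging data into shared memory, none of which this sketch covers.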

    Original language: English
    Title of host publication: Proceedings of the 13th Australasian Symposium on Parallel and Distributed Computing, AusPDC 2015
    Editors: Bahman Javadi, Saurabh Kumar Garg
    Publisher: Australian Computer Society
    Pages: 71-74
    Number of pages: 4
    ISBN (Print): 9781921770456
    Publication status: Published - 2015
    Event: Proceedings of the 13th Australasian Symposium on Parallel and Distributed Computing, AusPDC 2015 - Sydney, Australia
    Duration: 27 Jan 2015 to 30 Jan 2015

    Publication series

    Name: Conferences in Research and Practice in Information Technology Series
    Volume: 163
    ISSN (Print): 1445-1336

    Conference

    Conference: Proceedings of the 13th Australasian Symposium on Parallel and Distributed Computing, AusPDC 2015
    Country/Territory: Australia
    City: Sydney
    Period: 27/01/15 to 30/01/15
