Profiling directed NUMA optimization on Linux systems: A case study of the Gaussian computational chemistry code

Rui Yang*, Joseph Antony, Alistair Rendell, Danny Robson, Peter Strazdins

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

    15 Citations (Scopus)

    Abstract

    The parallel performance of applications running on Non-Uniform Memory Access (NUMA) platforms is strongly influenced by the placement of memory pages relative to the threads that access them. As a consequence, there are Linux application programmer interfaces (APIs) to control this. For large parallel codes it can, however, be difficult to determine how and when to use these APIs. In this paper we introduce the NUMAgrind profiling tool, which can be used to simplify this process. It extends the Valgrind binary translation framework to include a model that incorporates cache coherency, memory locality domains and interconnect traffic for arbitrary NUMA topologies. Using NUMAgrind, cache misses can be mapped to memory locality domains, page access modes determined, and pages that are referenced by multiple threads quickly identified. We show how the NUMAgrind tool can be used to guide the use of Linux memory and thread placement APIs in the Gaussian computational chemistry code. The performance of the code before and after use of these APIs is also presented for three different commodity NUMA platforms.

    Original language: English
    Title of host publication: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
    Pages: 1046-1057
    Number of pages: 12
    Publication status: Published - 2011
    Event: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011 - Anchorage, AK, United States
    Duration: 16 May 2011 to 20 May 2011

    Publication series

    Name: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011

    Conference

    Conference: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
    Country/Territory: United States
    City: Anchorage, AK
    Period: 16/05/11 to 20/05/11
