Use of multiple GPUs on shared memory multiprocessors for ultrasound propagation simulations

Jiri Jaros*, Bradley E. Treeby, Alistair P. Rendell

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    8 Citations (Scopus)

    Abstract

    This paper outlines our effort to migrate a compute-intensive ultrasound propagation application, currently being developed in Matlab, to a cluster computer in which each node has seven GPUs. Our goal is to perform realistic simulations in hours and minutes instead of weeks and days. To reach this goal, we investigate the architectural characteristics of the target system, focusing on the PCI-Express subsystem and on new features in CUDA version 4.0, especially simultaneous host-to-device, device-to-host and peer-to-peer transfers, from which the application is expected to benefit greatly. We also present results from a CPU-based implementation and discuss future directions for exploiting multiple GPUs.

    Original language: English
    Title of host publication: Parallel and Distributed Computing 2012 - Proceedings of the Tenth Australasian Symposium on Parallel and Distributed Computing, AusPDC 2012
    Pages: 43-52
    Number of pages: 10
    Publication status: Published - 2012
    Event: 10th Australasian Symposium on Parallel and Distributed Computing, AusPDC 2012 - Melbourne, VIC, Australia
    Duration: 31 Jan 2012 to 3 Feb 2012

    Publication series

    Name: Conferences in Research and Practice in Information Technology Series
    Volume: 127
    ISSN (Print): 1445-1336

    Conference

    Conference: 10th Australasian Symposium on Parallel and Distributed Computing, AusPDC 2012
    Country/Territory: Australia
    City: Melbourne, VIC
    Period: 31/01/12 to 3/02/12
