Handling silent data corruption with the sparse grid combination technique

Alfredo Parra Hinojosa*, Brendan Harding, Markus Hegland, Hans Joachim Bungartz

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

    4 Citations (Scopus)

    Abstract

    We describe two algorithms to detect and filter silent data corruption (SDC) when solving time-dependent PDEs with the Sparse Grid Combination Technique (SGCT). The SGCT solves a PDE on many regular full grids of different resolutions, and the resulting solutions are then combined to obtain a high-quality solution. The algorithm can be parallelized and run on large HPC systems. We investigate silent data corruption and show that, with minor modifications, the SGCT can filter corrupted data and still obtain good results. Before combining the solution fields, we apply sanity checks to make sure that the data is not corrupted. These sanity checks are derived from well-known error bounds of the classical theory of the SGCT and do not rely on checksums or data replication. We apply our algorithms to a 2D advection equation and discuss the main advantages and drawbacks.
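
The combination step mentioned in the abstract can be illustrated with the classical two-dimensional scheme. The sketch below is not taken from the paper. It assumes the common convention in which a component grid is identified by a level pair (i, j) with i, j >= 1, and the combined solution of level n adds the grids with i + j = n + 1 and subtracts those with i + j = n. The function name `combination_scheme_2d` is hypothetical, and the sketch only enumerates grids and coefficients; it does not solve a PDE or perform any corruption check.

```python
def combination_scheme_2d(n):
    """Return [((i, j), coeff), ...] for the classical 2D combination technique.

    Assumed convention (not from the paper): levels start at 1 and the
    combined solution is
        u_n^c = sum_{i+j=n+1} u_{i,j} - sum_{i+j=n} u_{i,j}.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    scheme = []
    # Grids on the diagonal i + j = n + 1 enter with coefficient +1.
    for i in range(1, n + 1):
        scheme.append(((i, n + 1 - i), 1))
    # Grids on the diagonal i + j = n enter with coefficient -1.
    for i in range(1, n):
        scheme.append(((i, n - i), -1))
    return scheme


scheme = combination_scheme_2d(4)
for level, coeff in scheme:
    print(level, coeff)
```

For any n >= 1 this yields 2n - 1 component grids whose coefficients sum to 1, so a function that every component grid represents exactly is reproduced by the combination. Each component solution can be computed independently, which is what makes the approach attractive for parallel execution.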

    Original language: English
    Title of host publication: Software for Exascale Computing - SPPEXA 2013-2015
    Editors: Wolfgang E. Nagel, Hans-Joachim Bungartz, Philipp Neumann
    Publisher: Springer Verlag
    Pages: 165-186
    Number of pages: 22
    ISBN (Print): 9783319405261
    Publication status: Published - 2016
    Event: International Conference on Software for Exascale Computing, SPPEXA 2015 - Munich, Germany
    Duration: 25 Jan 2016 - 27 Jan 2016

    Publication series

    Name: Lecture Notes in Computational Science and Engineering
    Volume: 113
    ISSN (Print): 1439-7358

    Conference

    Conference: International Conference on Software for Exascale Computing, SPPEXA 2015
    Country/Territory: Germany
    City: Munich
    Period: 25/01/16 - 27/01/16
