Deferred gratification: Engineering for high performance garbage collection from the get go

Ivan Jibaja*, Stephen M. Blackburn, Mohammad R. Haghighat, Kathryn S. McKinley

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

    12 Citations (Scopus)

    Abstract

    Implementing a new programming language system is a daunting task. A common trap is to punt on the design and engineering of exact garbage collection and instead opt for reference counting or conservative garbage collection (GC). For example, AppleScript#8482;, Perl, Python, and PHP implementers chose reference counting (RC) and Ruby chose conservative GC. Although easier to get working, reference counting has terrible performance and conservative GC is inflexible and performs poorly when allocation rates are high. However, high performance GC is central to performance for managed languages and only becoming more critical due to relatively lower memory bandwidth and higher memory latency of modern architectures. Unfortunately, retrofitting support for high performance collectors later is a formidable software engineering task due to their exact nature. Whether they realize it or not, implementers have three routes: (1) forge ahead with reference counting or conservative GC, and worry about the consequences later; (2) build the language on top of an existing managed runtime with exact GC, and tune the GC to scripting language workloads; or (3) engineer exact GC from the ground up and enjoy the correctness and performance benefits sooner rather than later. We explore this conundrum using PHP, the most popular server side scripting language. PHP implements reference counting, mirroring scripting languages before it. Because reference counting is incomplete, the implementors must (a) also implement tracing to detect cyclic garbage, or (b) prohibit cyclic data structures, or (c) never reclaim cyclic garbage. PHP chose (a), AppleScript chose (b), and Perl chose (c). We characterize the memory behavior of five typical PHP programs to determine whether their implementation choice was a good one in light of the growing demand for high performance PHP. The memory behavior of these PHP programs is similar to other managed languages, such as Java#8482; - -they allocate many short lived objects, a large variety of object sizes, and the average allocated object size is small. These characteristics suggest copying generational GC will attain high performance. Language implementers who are serious about correctness and performance need to understand deferred gratification: paying the software engineering cost of exact GC up front will deliver correctness and memory system performance later.

    Original languageEnglish
    Title of host publicationProceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, MSPC 2011
    Pages58-65
    Number of pages8
    DOIs
    Publication statusPublished - 2011
    Event2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, MSPC 2011 - San Jose, CA, United States
    Duration: 5 Jun 20115 Jun 2011

    Publication series

    NameProceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, MSPC 2011

    Conference

    Conference2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, MSPC 2011
    Country/TerritoryUnited States
    CitySan Jose, CA
    Period5/06/115/06/11

    Fingerprint

    Dive into the research topics of 'Deferred gratification: Engineering for high performance garbage collection from the get go'. Together they form a unique fingerprint.

    Cite this