Elfen scheduling: Fine-grain principled borrowing from latency-critical workloads using simultaneous multithreading

Xi Yang, Stephen M. Blackburn, Kathryn S. McKinley

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    43 Citations (Scopus)

    Abstract

    Web services from search to games to stock trading impose strict Service Level Objectives (SLOs) on tail latency. Meeting these objectives is challenging because the computational demand of each request is highly variable and load is bursty. Consequently, many servers run at low utilization (10 to 45%); turn off simultaneous multithreading (SMT); and execute only a single service, wasting hardware, energy, and money. Although co-running batch jobs with latency-critical requests to utilize multiple SMT hardware contexts (lanes) is appealing, unmitigated sharing of core resources induces non-linear effects on tail latency and SLO violations. We introduce principled borrowing to control SMT hardware execution in which batch threads borrow core resources. A batch thread executes in a reserved batch SMT lane when no latency-critical thread is executing in the partner request lane. We instrument batch threads to quickly detect execution in the request lane, step out of the way, and promptly return the borrowed resources. We introduce the nanonap system call to stop the batch thread's execution without yielding its lane to the OS scheduler, ensuring that requests have exclusive use of the core's resources. We evaluate our approach for colocating batch workloads with latency-critical requests using the Apache Lucene search engine. A conservative policy that executes batch threads only when the request lane is idle improves utilization between 90% and 25% on one core depending on load, without compromising request SLOs. Our approach is straightforward, robust, and unobtrusive, opening the way to substantially improved resource utilization in datacenters running latency-critical workloads.
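The conservative policy described in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: it treats time as discrete slices, stands in for the nanonap system call with a simple counter, and uses hypothetical names (`run_conservative_policy`, `core_utilization`). It only shows the rule that the batch lane runs when, and only when, the request lane is idle.

```python
from typing import List, Tuple


def run_conservative_policy(request_busy: List[bool]) -> Tuple[List[bool], int]:
    """Simulate one batch thread sharing a core with a request lane.

    request_busy[i] is True when a latency-critical thread occupies the
    request lane during slice i. Returns (batch_ran, naps): batch_ran[i]
    is True when the batch thread executed in slice i, and naps counts
    the slices in which it stepped aside (a stand-in for nanonap).
    """
    batch_ran: List[bool] = []
    naps = 0
    for busy in request_busy:
        if busy:
            # Request lane is occupied: the batch thread stops executing
            # but keeps its lane (modeled here as a counted no-op).
            naps += 1
            batch_ran.append(False)
        else:
            # Request lane is idle: the batch thread borrows the core.
            batch_ran.append(True)
    return batch_ran, naps


def core_utilization(request_busy: List[bool], batch_ran: List[bool]) -> float:
    """Fraction of slices in which at least one lane did useful work."""
    active = sum(1 for r, b in zip(request_busy, batch_ran) if r or b)
    return active / len(request_busy)


if __name__ == "__main__":
    # Hypothetical request-lane timeline, chosen only for illustration.
    timeline = [False, True, True, False, False]
    batch_ran, naps = run_conservative_policy(timeline)
    print("batch ran:", batch_ran, "naps:", naps)
    print("utilization without batch:", core_utilization(timeline, [False] * len(timeline)))
    print("utilization with batch:", core_utilization(timeline, batch_ran))
```

In this idealized model the two lanes never execute in the same slice, which is the property the abstract relies on to protect request SLOs. Real hardware adds detection delay and wake-up cost, which the model ignores.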

    Original language: English
    Title of host publication: Proceedings of the 2016 USENIX Annual Technical Conference, USENIX ATC 2016
    Publisher: USENIX Association
    Pages: 309-322
    Number of pages: 14
    ISBN (Electronic): 9781931971300
    Publication status: Published - 2016
    Event: 2016 USENIX Annual Technical Conference, USENIX ATC 2016 - Denver, United States
    Duration: 22 Jun 2016 to 24 Jun 2016

    Publication series

    Name: Proceedings of the 2016 USENIX Annual Technical Conference, USENIX ATC 2016

    Conference

    Conference: 2016 USENIX Annual Technical Conference, USENIX ATC 2016
    Country/Territory: United States
    City: Denver
    Period: 22/06/16 to 24/06/16
