Use of SIMD vector operations to accelerate application code performance on low-powered ARM and Intel platforms

Gaurav Mitra, Beau Johnston, Alistair P. Rendell, Eric McCreath, Jun Zhou

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    72 Citations (Scopus)

    Abstract

    Augmenting a processor with special hardware that can apply a Single Instruction to Multiple Data (SIMD) at the same time is a cost-effective way of improving processor performance. It also offers a means of improving the ratio of processor performance to power usage, owing to reduced and more effective data movement and intrinsically lower instruction counts. This paper considers and compares the NEON SIMD instruction set used on the ARM Cortex-A series of RISC processors with the SSE2 SIMD instruction set found on Intel platforms, within the context of the Open Computer Vision (OpenCV) library. The performance obtained using compiler auto-vectorization is compared with that achieved using hand-tuning across a range of five different benchmarks and ten different hardware platforms. On the ARM platforms the hand-tuned NEON benchmarks were between 1.05x and 13.88x faster than the auto-vectorized code, while on the Intel platforms the hand-tuned SSE benchmarks were between 1.34x and 5.54x faster.
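To make the contrast between auto-vectorized and hand-tuned code concrete, the following is a minimal sketch and is not taken from the paper or from OpenCV. It assumes a saturating 8-bit pixel addition as the kernel; the function names `add_sat_scalar` and `add_sat_simd` are hypothetical. The scalar loop is the kind of code a compiler may auto-vectorize, while the second function uses SSE2 or NEON intrinsics directly, processing 16 pixels per instruction, and falls back to scalar code when neither instruction set is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics (Intel) */
#elif defined(__ARM_NEON)
#include <arm_neon.h>    /* NEON intrinsics (ARM) */
#endif

/* Scalar reference: one pixel per iteration. A compiler may or may not
 * auto-vectorize this loop, depending on flags and target. */
void add_sat_scalar(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(dst && a && b);
    for (size_t i = 0; i < n; ++i) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);   /* clamp to 255 */
    }
}

/* Hand-written variant (illustrative): 16 pixels per SIMD instruction. */
void add_sat_simd(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(dst && a && b);
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* Unsigned saturating add on sixteen 8-bit lanes. */
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
    }
#elif defined(__ARM_NEON)
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        /* Unsigned saturating add on sixteen 8-bit lanes. */
        vst1q_u8(dst + i, vqaddq_u8(va, vb));
    }
#endif
    /* Scalar tail; also the complete fallback when no SIMD path is compiled. */
    for (; i < n; ++i) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

Both functions should produce identical output for any input length. How much faster the intrinsic version runs depends on the compiler, flags, and hardware, so this sketch makes no claim about reproducing the speedups reported in the abstract.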

    Original language: English
    Title of host publication: Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013
    Publisher: IEEE Computer Society
    Pages: 1107-1116
    Number of pages: 10
    ISBN (Print): 9780769549798
    Publication status: Published - 2013
    Event: 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013 - Boston, MA, Japan
    Duration: 22 Jul 2013 to 26 Jul 2013

    Publication series

    Name: Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013

    Conference

    Conference: 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013
    Country/Territory: Japan
    City: Boston, MA
    Period: 22/07/13 to 26/07/13
