TY - JOUR
T1 - Cache oblivious matrix transposition
T2 - Simulation and experiment
AU - Tsifakis, Dimitrios
AU - Rendell, Alistair P.
AU - Strazdins, Peter E.
PY - 2004
Y1 - 2004
N2 - A cache oblivious matrix transposition algorithm is implemented and analyzed using simulation and hardware performance counters. Contrary to its name, the cache oblivious matrix transposition algorithm is found to exhibit a complex cache behavior with a cache miss ratio that is strongly dependent on the associativity of the cache. In some circumstances the cache behavior is found to be worse than that of a naïve transposition algorithm. While the total size is an important factor in determining cache usage efficiency, the sub-block size, associativity, and cache line replacement policy are also shown to be very important.
AB - A cache oblivious matrix transposition algorithm is implemented and analyzed using simulation and hardware performance counters. Contrary to its name, the cache oblivious matrix transposition algorithm is found to exhibit a complex cache behavior with a cache miss ratio that is strongly dependent on the associativity of the cache. In some circumstances the cache behavior is found to be worse than that of a naïve transposition algorithm. While the total size is an important factor in determining cache usage efficiency, the sub-block size, associativity, and cache line replacement policy are also shown to be very important.
UR - http://www.scopus.com/inward/record.url?scp=35048820009&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-24687-9_3
DO - 10.1007/978-3-540-24687-9_3
M3 - Article
AN - SCOPUS:35048820009
SN - 0302-9743
VL - 3037
SP - 17
EP - 25
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -