TY - GEN
T1 - Experiences in teaching a specialty multicore computing course
AU - Strazdins, Peter E.
PY - 2012
Y1 - 2012
N2 - We detail the design and experiences in delivering a specialty multicore computing course whose materials are openly available. The course ambitiously covers three multicore programming paradigms: shared memory (OpenMP), device (CUDA) and message passing (RCCE), and involves significant practical work on their respective platforms: an UltraSPARC T2, Fermi GPU and the Intel Single-Chip Cloud Computer. Specialized multicore architecture topics include chip multiprocessing, virtualization support, on-chip accelerators and networks, transactional memory and speculative execution. The mode of delivery emphasizes the relationship between programming performance and the underlying computer architecture, necessitating the need to provide suitable infrastructure in the form of instrumented test programs and the use of performance evaluation tools. Further infrastructure had to be created to facilitate the safe, convenient and efficient use by students on the GPU and Single-Chip Cloud Computer. The programming assignments, based on the theme of the LINPACK benchmark, also required significant infrastructure for reliably determining correctness and assisting debugging. While the course assumed as background knowledge an introductory computer systems and concurrency course, we found that students could learn device programming in a short time, by building on their knowledge of shared memory programming. However, we found that more time is needed for learning message passing. We also found that, provided students had a suitably strong computer systems background, they could successfully meet the course's learning objectives, although the skill of correctly interpreting performance data remains difficult to learn when suitable performance analysis tools are not available.
AB - We detail the design and experiences in delivering a specialty multicore computing course whose materials are openly available. The course ambitiously covers three multicore programming paradigms: shared memory (OpenMP), device (CUDA) and message passing (RCCE), and involves significant practical work on their respective platforms: an UltraSPARC T2, Fermi GPU and the Intel Single-Chip Cloud Computer. Specialized multicore architecture topics include chip multiprocessing, virtualization support, on-chip accelerators and networks, transactional memory and speculative execution. The mode of delivery emphasizes the relationship between programming performance and the underlying computer architecture, necessitating the need to provide suitable infrastructure in the form of instrumented test programs and the use of performance evaluation tools. Further infrastructure had to be created to facilitate the safe, convenient and efficient use by students on the GPU and Single-Chip Cloud Computer. The programming assignments, based on the theme of the LINPACK benchmark, also required significant infrastructure for reliably determining correctness and assisting debugging. While the course assumed as background knowledge an introductory computer systems and concurrency course, we found that students could learn device programming in a short time, by building on their knowledge of shared memory programming. However, we found that more time is needed for learning message passing. We also found that, provided students had a suitably strong computer systems background, they could successfully meet the course's learning objectives, although the skill of correctly interpreting performance data remains difficult to learn when suitable performance analysis tools are not available.
KW - computing education
KW - multicore computing
KW - parallel computing
UR - http://www.scopus.com/inward/record.url?scp=84867434756&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2012.168
DO - 10.1109/IPDPSW.2012.168
M3 - Conference contribution
SN - 9780769546766
T3 - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
SP - 1283
EP - 1288
BT - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
T2 - 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
Y2 - 21 May 2012 through 25 May 2012
ER -