TY - GEN
T1 - Performance models for cluster-enabled OpenMP implementations
AU - Cai, Jie
AU - Rendell, Alistair P.
AU - Strazdins, Peter E.
AU - Wong, H'sien Jin
PY - 2008
Y1 - 2008
N2 - A key issue for cluster-enabled OpenMP implementations based on software Distributed Shared Memory (sDSM) systems is maintaining the consistency of the shared memory space. This forms the major source of overhead for these systems, and is driven by the detection and servicing of page faults. This paper investigates how application performance can be modelled based on the number of page faults. Two simple models are proposed, one based on the number of page faults along the critical path of the computation, and one based on the aggregate number of page faults. Two different sDSM systems are considered. The models are evaluated using the OpenMP NAS Parallel Benchmarks on an 8-node AMD-based Gigabit Ethernet cluster. Both models gave estimates accurate to within 10% in most cases, with the critical path model showing slightly better accuracy; accuracy is lost if the underlying page faults cannot be overlapped, or if the application makes extensive use of the OpenMP flush directive.
AB - A key issue for cluster-enabled OpenMP implementations based on software Distributed Shared Memory (sDSM) systems is maintaining the consistency of the shared memory space. This forms the major source of overhead for these systems, and is driven by the detection and servicing of page faults. This paper investigates how application performance can be modelled based on the number of page faults. Two simple models are proposed, one based on the number of page faults along the critical path of the computation, and one based on the aggregate number of page faults. Two different sDSM systems are considered. The models are evaluated using the OpenMP NAS Parallel Benchmarks on an 8-node AMD-based Gigabit Ethernet cluster. Both models gave estimates accurate to within 10% in most cases, with the critical path model showing slightly better accuracy; accuracy is lost if the underlying page faults cannot be overlapped, or if the application makes extensive use of the OpenMP flush directive.
UR - http://www.scopus.com/inward/record.url?scp=55849111882&partnerID=8YFLogxK
U2 - 10.1109/APCSAC.2008.4625433
DO - 10.1109/APCSAC.2008.4625433
M3 - Conference contribution
SN - 9781424426836
T3 - 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
BT - 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
T2 - 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
Y2 - 4 August 2008 through 6 August 2008
ER -