TY - GEN
T1 - BITLUME: Precision-Flexible Photonic Computing for Ultra-Fast and Energy-Efficient DNN Acceleration
AU - Xia, Chengpeng
AU - Zhang, Haibo
AU - Zhang, Hao
AU - Chen, Yawen
AU - Barnard, Amanda Susan
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - As deep learning expands across emerging domains, computational demands are pushing traditional electronic accelerators to their limits. Silicon photonics has emerged as a promising technology for accelerating deep learning workloads, but precision remains a challenge due to noise and non-idealities. In this paper, we present BITLUME, a novel photonic computing unit that enables multiplications beyond 8-bit precision through a precision-flexible scheme. We further propose an optimized round-truncation algorithm and data mapping strategy for BITLUME to reduce optoelectronic conversions, enhance data reuse, and maintain computational accuracy. A hybrid optoelectronic architecture integrating BITLUME is developed and validated using a prototype built with FPGA, RF, and photonic components, achieving 3.7× lower end-to-end latency than the A100 GPU in dot product. Simulations of training seven DNN models at FP32 show that BITLUME achieves up to 3.35× and 10.78× speedup, and 1.53× and 4.12× energy savings, compared to the state-of-the-art photonic accelerator and A100 GPU, respectively.
KW - DNN accelerator
KW - Photonic computing
UR - https://www.scopus.com/pages/publications/105029376211
DO - 10.1109/ICCAD66269.2025.11240825
M3 - Conference Paper
AN - SCOPUS:105029376211
SN - 979-8-3315-1561-4
T3 - IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
BT - 2025 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2025 - Conference Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2025
Y2 - 26 October 2025 through 30 October 2025
ER -