Residue Number Systems Quantization for Deep Learning Inference

Sergey Sivkov

doi:10.37394/23205.2023.22.33

WSEAS Transactions on Computers

Print ISSN: 1109-2750, E-ISSN: 2224-2872

Volume 22, 2023

Abstract: Quantizing learned CNN weights to a Residue Number System (RNS) can improve inference latency by taking advantage of fast and precise low-bit integer arithmetic. In this paper we review the mathematical aspects of RNS operations on signed integer values and evaluate implementation choices for converting conventional floating-point PyTorch weights of CNN models to an RNS representation. We also present a workflow that converts the weights of PyTorch neural network layers specific to the computer vision domain to 4-bit RNS moduli sets while maintaining classification accuracy within 5% of an 8-bit quantization baseline.
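To make the idea of signed-integer RNS arithmetic concrete, the following is a minimal sketch in plain Python, not the paper's implementation. The moduli set (11, 13, 15, 16), the helper names `to_rns`, `from_rns`, and `rns_dot`, and the example integer weights are all assumptions chosen for illustration; the abstract does not state which moduli sets the author uses. Each residue fits in 4 bits, and negative values are represented by mapping the upper half of the dynamic range back to negative integers after Chinese Remainder Theorem reconstruction.

```python
from math import prod

# Assumed moduli set: pairwise coprime, every residue fits in 4 bits (0..15).
MODULI = (11, 13, 15, 16)
M = prod(MODULI)  # dynamic range: 34320 distinct values


def to_rns(x, moduli=MODULI):
    """Encode a signed integer as residues; Python's % is non-negative here."""
    return tuple(x % m for m in moduli)


def from_rns(residues, moduli=MODULI):
    """Decode residues with the Chinese Remainder Theorem into a signed integer."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        # pow(m_i, -1, m) is the modular inverse of m_i modulo m (Python 3.8+).
        total += r * m_i * pow(m_i, -1, m)
    x = total % big_m
    # Values in the upper half of the range represent negative numbers.
    return x - big_m if x >= big_m // 2 else x


def rns_dot(weights, activations, moduli=MODULI):
    """Dot product computed independently in each small residue channel."""
    acc = [0] * len(moduli)
    for w, a in zip(weights, activations):
        rw, ra = to_rns(w, moduli), to_rns(a, moduli)
        acc = [(s + x * y) % m for s, x, y, m in zip(acc, rw, ra, moduli)]
    return tuple(acc)


# Hypothetical quantized integer weights and activations.
weights = [-3, 5, 7]
activations = [4, -2, 6]
result = from_rns(rns_dot(weights, activations))
print(result)  # 20, the same as the ordinary integer dot product
```

The result is only correct while the true value stays inside the signed dynamic range, here from -17160 to 17159, so a practical moduli set has to be sized for the largest accumulator value a layer can produce.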

Keywords: FPGA, inference, RNS, Residue Number System, Verilog, PyTorch, Quantization

Pages: 296-301

WSEAS Transactions on Computers, ISSN / E-ISSN: 1109-2750 / 2224-2872, Volume 22, 2023, Art. #33
