BitNet: Scaling 1-bit Transformers for Large Language Models

Hongyu Wang†‡ Shuming Ma† Li Dong† Shaohan Huang† Huaijie Wang§ Lingxiao Ma† Fan Yang† Ruiping Wang‡ Yi Wu§ Furu Wei†⋄

† Microsoft Research ‡ University of Chinese Academy of Sciences § Tsinghua University

arxiv.org

Abstract

Objective: Large-scale lang…