Techtouch Developers Blog

id:sok14

Understanding the Mathematical Background of DeepSeek-R1

An explanation of GRPO, the reinforcement learning algorithm used in DeepSeek-R1.

2025-04-23 09:00