Gemma4: PLE (Per-Layer Embeddings) implementation is underdocumented and config is misleading · Issue #45206 · huggingface/transformers
Description
I was implementing Gemma4 inference from scratch (in Rust) and the Per-Layer Embeddings (PLE) system was by far the hardest part to get right. The config fields are misleading, the embe...