Mynote
20 hours ago
📌 ZAYA1-8B Technical Report

arXiv:2605.05365v1 Announce Type: new
Abstract: We present ZAYA1-8B, a reasoning-focused mixture-of-experts (MoE) model with 700M active and 8B total parameters, built on Zyphra's MoE++ architecture. ZAYA1-8B's core pretraining, midtraining,...

💡 GPT / Large Model Updates | via arXiv AI
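
As a rough sketch of why an MoE model can report far fewer active than total parameters, below is a minimal top-k routed expert layer in PyTorch. The class name, layer sizes, and expert count are illustrative assumptions for this note, not the MoE++ architecture described in the report:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Minimal top-k mixture-of-experts feed-forward layer (illustrative only)."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=16, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                                   # x: (tokens, d_model)
        scores = self.router(x)                             # (tokens, n_experts)
        weights, idx = torch.topk(scores, self.k, dim=-1)   # pick k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e                    # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

layer = TopKMoELayer()
y = layer(torch.randn(4, 512))                              # run a few toy tokens through

total = sum(p.numel() for p in layer.parameters())
# Per token, only the router plus k of the n_experts expert MLPs are used.
active = sum(p.numel() for p in layer.router.parameters()) + \
         layer.k * sum(p.numel() for p in layer.experts[0].parameters())
print(f"total params: {total:,}  active per token: {active:,}")
```

For these toy sizes the printed active count is a small fraction of the total; the same mechanism, at far larger scale, is what gives a configuration like 700M active out of 8B total parameters.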