DeepSeek-V4: Towards Highly Efficient Million-Token Context IntelligenceDeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence DeepSeek-AI research@deepseek.com Abstract We present a preview version of DeepSeek-V4 series, including two strong Mixture-of- DeepSeek-V4-Flash with 284B parameters (13B activated) both supporting a context length of one million tokens. DeepSeek-V4 series incorporate several key upgrades in architecture and optimization: (1) a hybrid attention the state-of-the-art for open models, outperforming its predecessors in core tasks. Meanwhile, DeepSeek-V4 series are highly efficient in long-context scenarios. In the one-million-token context setting0 码力 | 58 页 | 4.27 MB | 1 月前3
共 1 条
- 1













