sparsity - IT文库 (IT Library)


Facebook -- TVM AWS Meetup Talk

Structured and Unstructured Sparsity
- Lots of 'free' wins from exploring sparsity in modern ML models
- Can often prune models to 80%+ sparsity (with retraining)
- Massive speedups combined

0 credits | 11 pages | 3.08 MB | 6 months ago
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

N. Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. CoRR, abs/2101.03961, 2021. URL https://arxiv.org/abs/2101.03961. L. Gao, S. Biderman, S. Black

0 credits | 52 pages | 1.23 MB | 1 year ago

