DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
[Figure: "Needle In A Haystack" pressure test of DeepSeek-V2 Base at 128K context; axes: Context Length (#Tokens, up to 128K) vs. Document Depth Percent (%).]
…Lin, K. Zhu, Z. Ye, L. Chen, S. Zheng, L. Ceze, A. Krishnamurthy, T. Chen, and B. Kasikci. Atom: Low-bit quantization for efficient and accurate LLM serving. CoRR, abs/2310.19102, 2023. URL https://doi.org/10…
52 pages | 1.23 MB | 1 year ago

Trends Artificial Intelligence
[Chart: "AI Technology Compounding = Numbers Behind The Momentum". Performance (16-bit FLOP/s) grows at roughly +150% per year, enabled by 1.6x annual growth in chips per cluster and 1.6x annual growth … Source: Epoch AI (4/25).]
…platforms will push breadth, stitching together knowledge across functions; specialists will push depth, delivering AI that speaks the language of compliance, contracts, and customer intent. The question…
340 pages | 12.14 MB | 5 months ago

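The +150%/year headline follows from compounding the two growth factors. A minimal back-of-envelope check in Python (a sketch only; it assumes the truncated second 1.6x factor refers to per-chip performance, which the snippet cuts off):

# Back-of-envelope check of the compounded growth rate quoted above.
# Assumption: the truncated second 1.6x factor is per-chip performance growth.
chips_per_cluster_growth = 1.6        # annual multiplier (from the snippet)
per_chip_performance_growth = 1.6     # annual multiplier (assumed)
total_growth = chips_per_cluster_growth * per_chip_performance_growth
print(f"+{(total_growth - 1) * 100:.0f}% per year")  # ~ +156%, i.e. roughly +150%/year
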
清华大学 DeepSeek+DeepResearch 让科研像聊天一样简单 (Tsinghua University: DeepSeek + DeepResearch, making research as easy as chatting)
…stable cyclic performance, while providing high capacity and high voltage curves, has sparked in-depth research and discussion. As a promising candidate for anode materials, alloy-based anodes such as…
…Among various options, alloy-based anodes, especially silicon (Si, 4200 mA h g⁻¹), have sparked in-depth research and discussion. This is primarily due to their extremely high theoretical capacity, which…
85 pages | 8.31 MB | 8 months ago

TVM@Alibaba AI Labs
[Chart: comparison of TF Lite 8-bit, NCNN 8-bit, QNNPACK 8-bit, MNN 8-bit, TVM Overflow-aware, and Overflow-aware (Assembly). Alibaba AI Labs.]
12 pages | 1.94 MB | 6 months ago

Google 《Prompt Engineering v7》
…years between my partner and me and add those up (20 + (9 - 3)). Let's help the model to think a little bit more like me. … Table 12 is an example of 'zero-shot' Chain of…
…paths by branching out from different nodes in the tree. There's a great notebook, which goes into a bit more detail, showing the Tree of Thought (ToT), which is based on the paper 'Large Language Model Guided…
Please refer to the notebook hosted in the GoogleCloudPlatform GitHub repository, which goes into a bit more detail, showing the actual LLM inputs and outputs with a more elaborate example.
68 pages | 6.50 MB | 7 months ago

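The (20 + (9 - 3)) in this snippet is the kind of intermediate step a chain-of-thought prompt asks the model to spell out. A minimal zero-shot sketch (illustrative, not copied from the whitepaper; the exact question wording and the trailing instruction are assumptions):

# Hypothetical zero-shot chain-of-thought prompt; wording is illustrative.
question = (
    "When I was 3 years old, my partner was 3 times my age. "
    "Now I am 20 years old. How old is my partner?"
)
# Zero-shot CoT: no worked examples, just an instruction to reason step by step.
prompt = question + "\nLet's think step by step."
# Expected reasoning: the age gap is 9 - 3 = 6 years, so the answer is 20 + (9 - 3) = 26.
print(prompt)
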
亿联TVM部署 (Yealink TVM deployment)
…step 1 on Windows to generate the .dll for deployment. 3. For applications on 32-bit, there is no support for 32-bit TensorFlow; a workaround from FrozenGene: a. python/tvm/contrib/ndk.py: options = options if options…
6 pages | 1.96 MB | 6 months ago

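For context on the .dll step mentioned here, a minimal sketch of compiling a trivial operator with TVM and exporting it as a Windows shared library (not the document's actual script; it assumes an older TVM release where the classic te.create_schedule / tvm.build(schedule, ...) API is available, plus a local C++ toolchain for export_library):

# Minimal sketch: build a one-op module and export it as a DLL for deployment.
import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A", dtype="float32")
B = te.compute(A.shape, lambda i: A[i] + 1.0, name="B")

s = te.create_schedule(B.op)                      # classic TE API (older TVM releases)
mod = tvm.build(s, [A, B], target="llvm", name="add_one")

# On Windows this invokes the local toolchain and writes a .dll that the
# application can load through the TVM runtime.
mod.export_library("add_one.dll")
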
Deepseek R1 本地部署完全手册 (DeepSeek R1 Complete Local Deployment Manual)
Hardware-requirements table fragment (translated; column layout reconstructed from the fragment): model size | PC (RAM / GPU / storage) | Mac (RAM / storage) | typical use
… | storage: 5 GB | 8 GB RAM (M1/M2/M3), 5 GB storage | simple text generation, basic code completion
7B | RAM: 8-10 GB, GPU: GTX 1680 (4-bit quantized), 8 GB storage | 16 GB RAM (M2 Pro/M3), 8 GB storage | moderately complex Q&A, code debugging
14B | RAM: 24 GB, GPU: RTX 3090 (24 GB…
7 pages | 932.77 KB | 8 months ago

TVM Meetup: Quantization
…represented with a scale and a zero point (http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf): real_value = scale * (quantized_value - zero_point)
19 pages | 489.50 KB | 6 months ago

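The mapping above, real_value = scale * (quantized_value - zero_point), is standard affine quantization. A small illustrative sketch (not from the slides) that quantizes a float array to uint8 and dequantizes it back:

# Affine (asymmetric) quantization with a scale and a zero point, uint8 range.
import numpy as np

def quantize(real, scale, zero_point, qmin=0, qmax=255):
    q = np.round(real / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # real_value = scale * (quantized_value - zero_point)
    return scale * (q.astype(np.int32) - zero_point)

real = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
scale = (2.0 - (-1.0)) / 255.0   # map the range [-1.0, 2.0] onto [0, 255]
zero_point = 85                  # the quantized value that represents real 0.0
q = quantize(real, scale, zero_point)
print(q, dequantize(q, scale, zero_point))  # dequantized values approximate `real`
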
8 results in total













