DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
… meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting … [Figure: KV Cache for Generation (KB/Token) and Maximum Generation Throughput (Tokens/Sec), DeepSeek-V2 vs. DeepSeek 67B; 576% of maximum throughput] … E. Discussion About Pre-Training Data Debiasing; F. Additional Evaluations on Math and Code; G. Evaluation Formats … 1. Introduction: In the past few years, Large Language Models (LLMs) …
52 pages | 1.23 MB | 1 year ago
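The headline numbers in this abstract are simple ratios. A minimal sketch of the arithmetic, where the 93.3% reduction and 5.76x speedup are the figures quoted in the snippet but the baseline values (400 KB/token, 1000 tokens/s) are hypothetical, chosen only for illustration:

```python
# The reduction and speedup factors come from the quoted abstract;
# the baseline magnitudes below are hypothetical placeholders.
def reduced_kv_cache(baseline_kb_per_token: float, reduction: float = 0.933) -> float:
    """Per-token KV-cache size after the claimed 93.3% reduction."""
    return baseline_kb_per_token * (1.0 - reduction)

def boosted_throughput(baseline_tps: float, speedup: float = 5.76) -> float:
    """Maximum generation throughput after the claimed 5.76x boost."""
    return baseline_tps * speedup

print(f"{reduced_kv_cache(400.0):.1f} KB/token")    # only 6.7% of the baseline remains
print(f"{boosted_throughput(1000.0):.0f} tokens/s")
```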
Trends – Artificial Intelligence
… (5 days to secure 1MM users). Generative AI = AI that can create content – text, images, audio, or code – based on learned patterns. Source: OpenAI. Generative AI – Public Launch of ChatGPT, 2022* … Knowledge … [Chart: Computing-Related* Patents Granted Annually, 1960–2024, per USPTO. *Uses Cooperative Patent Classification (CPC) code G06, which corresponds to computing, calculating or counting patents; Google patents data.] … changes above show average accuracy of top-performing AI models in each calendar year. Source: Papers With Code via Nestor Maslej et al., ‘The AI Index 2025 Annual Report,’ AI Index Steering Committee, Stanford …
340 pages | 12.14 MB | 5 months ago
Google, “Prompt Engineering v7”
Contents: Prompt Engineering 40; Code prompting 42; Prompts for writing code 42; Prompts for explaining code 44; Prompts for translating code 46; Prompts for debugging and reviewing code 48; What about multimodal … understanding and generation tasks such as text summarization, information extraction, question answering, text classification, language or code translation, code generation, and code documentation … ‘providing an additional task to the system’. For example, you could use a system prompt to generate a code snippet that is compatible with a specific programming language, or you could use a system prompt …
68 pages | 6.50 MB | 6 months ago
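As a concrete illustration of the system-prompt idea this snippet describes, here is a hypothetical chat-style message list. The role/content structure follows the common chat-completion convention; the exact model or API call is not specified by the snippet and is deliberately left out:

```python
# Hypothetical illustration: a system prompt constrains every reply,
# here forcing the model to answer only with Python 3 code.
messages = [
    {
        "role": "system",
        "content": "You are a coding assistant. Reply only with valid Python 3 "
                   "code, no explanations.",
    },
    {
        "role": "user",
        "content": "Write a function that reverses a string.",
    },
]

# The system message rides along with every request, steering all replies.
print(messages[0]["role"])  # -> system
```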
OctoML OSS 2019 11 8other languages QQ octoML HTVM Overview *。 Plug directly into TVYM as a backend *,Target C to emit code for microcontrollers that is device- agnostic AuroTYM QQ octoML AutoTVM on HTVM DTYM Runtime send VM runtime VM serialization Dynamic Shape Support Dynamic Shape Allocation o Dynamic Shape Code generation ee Looking for more contributions in this part of the systeml e Haichen and | will discuss0 码力 | 16 页 | 1.77 MB | 6 月前3
Facebook -- TVM AWS Meetup Talk
- Prune models to 80%+ sparsity (with retraining)
- Massive speedups combined with specialized code-generation techniques (TVM, Xbyak, etc.)
- Interesting new tradeoffs: how const are parameters? …
11 pages | 3.08 MB | 6 months ago
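The 80%+ sparsity this snippet mentions is typically reached by magnitude pruning. A minimal, self-contained sketch of one pruning step (illustrative only; the talk's actual pipeline and retraining schedule are not described here):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights until `sparsity` of them are zero.

    Note: ties at the threshold are all pruned, so the result can be
    slightly sparser than requested.
    """
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7]
print(magnitude_prune(w, 0.8))  # -> [0.9, 0.0, 0.0, 0.0, 0.0]
```

In practice this step is interleaved with retraining so the surviving weights can compensate, which is why the snippet qualifies the sparsity figure "with retraining".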
XDNN TVM - Nov 2019
… [Slide diagram: Subgraph 1 and parallel subgraphs, with pre-processing and post-processing stages partitioned across CPU and FPGA] © Copyright 2018 Xilinx. TVM Code Generation …
16 pages | 3.35 MB | 6 months ago
TVM Meetup: Quantization
… framework parsers → graph-level optimizations → tensor-level optimizations → machine code generation (CUDA, C, …). © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Quantization Approaches …
19 pages | 489.50 KB | 6 months ago
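The quantization approaches such a deck surveys generally reduce to a scale (and optionally zero-point) mapping between real values and small integers. A minimal sketch of symmetric int8 quantization, independent of TVM's actual APIs (the scale value is an illustrative assumption):

```python
def quantize_int8(x: float, scale: float) -> int:
    """Symmetric int8 quantization: q = clamp(round(x / scale), -127, 127)."""
    q = round(x / scale)
    return max(-127, min(127, q))

def dequantize_int8(q: int, scale: float) -> float:
    """Approximate recovery of the real value from its int8 code."""
    return q * scale

scale = 0.02                    # illustrative: covers |x| <= 127 * 0.02 = 2.54
q = quantize_int8(0.5, scale)
print(q, dequantize_int8(q, scale))  # -> 25 0.5
```

The quantization error is bounded by half a step (scale / 2) inside the representable range, which is the basic trade-off every approach in the deck tunes.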
OpenAI, “A practical guide to building agents”
… whether that's resolving a customer service issue, booking a restaurant reservation, committing a code change, or generating a report. Applications that integrate LLMs but don't use them to control … Explicit guidelines and guardrails defining how the agent behaves. Here's what this looks like in code when using OpenAI's Agents SDK. You can also implement the same concepts using your preferred library … learning of specialized domain-specific languages. In contrast, the Agents SDK adopts a more flexible, code-first approach. Developers can directly express workflow logic using familiar programming constructs.
34 pages | 7.00 MB | 6 months ago
普通人学AI指南 (An AI Guide for Ordinary People)
… system, so that you can use it to create and run Docker containers. Then one more command is all it takes: docker run -d --name lobe-chat -p 10084:3210 -e ACCESS_CODE=lobe66 lobehub/lobe-chat:latest. To explain: this command runs a Docker container named lobe-chat in detached (background) mode and sets a few specific parameters: … the container's port 3210, so requests to the host's port 10084 are forwarded to the container's port 3210. -e ACCESS_CODE=lobe66: sets the environment variable ACCESS_CODE to lobe66, typically a parameter used to configure the application inside the container. lobehub/lobe-chat:latest: …
42 pages | 8.39 MB | 8 months ago
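The docker run invocation quoted in this snippet, reformatted with one flag per line for readability. The image name, port mapping, and access code are taken verbatim from the snippet; whether host port 10084 suits your machine is your own choice:

```shell
# Run LobeChat detached (-d), name the container lobe-chat,
# map host port 10084 to the container's port 3210,
# and set the in-app access code via the ACCESS_CODE env var.
docker run -d \
  --name lobe-chat \
  -p 10084:3210 \
  -e ACCESS_CODE=lobe66 \
  lobehub/lobe-chat:latest
```

After it starts, the app is reachable at http://<host>:10084 and will prompt for the access code.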
TVM: Where Are We Going
… Tensorization … VTA: Open & Flexible Deep Learning Accelerator
- Runtime JIT compiles accelerator microcode
- Supports heterogeneous devices, 10x better than CPU on the same board
- Moves hardware complexity … Incubated as Apache TVM recently. Independent governance, allowing competitors to collaborate. Open Code, Open Development, Open Governance. Acknowledgement: the Apache (incubating) TVM community. Our awesome …
31 pages | 22.64 MB | 6 months ago
15 items in total













