TVM: Where Are We Going
Cloud / FPGA / ASIC optimization; AutoTVM; device fleet … Existing deep learning frameworks: high-level data flow graph; hardware; primitive tensor operators such as Conv2D, e.g. cuDNN; offload to heavily optimized … intensive … Machine-learning-based program optimizer. TVM: learning-based learning system; high-level data flow graph and optimizations; directly generate optimized programs for new operator workloads and … SaveToBinary/LoadFromBinary; runtime Module interface subclasses. Unified runtime benefit: mod.export_library("mylib.so"); unified library packaging; free API (Py/Java/Go); lib = tvm.module.load("mylib…
0 credits | 31 pages | 22.64 MB | 5 months ago
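For context on the two API calls quoted in this excerpt, here is a minimal sketch of TVM's unified packaging/loading flow, assuming the pre-0.7 tvm.module namespace used in the talk (newer releases expose the same calls under tvm.runtime); the operator being built is just a placeholder:

```python
import numpy as np
import tvm

# Build one trivial operator with the era-appropriate (pre-0.7) TVM API.
n = 16
A = tvm.placeholder((n,), name="A")
B = tvm.compute((n,), lambda i: A[i] + 1.0, name="B")
sched = tvm.create_schedule(B.op)
mod = tvm.build(sched, [A, B], target="llvm", name="add_one")

# Unified library packaging: all generated code lands in one shared library.
mod.export_library("mylib.so")

# Unified runtime: the same load/lookup API is exposed to Python, Java, and Go.
lib = tvm.module.load("mylib.so")
a = tvm.nd.array(np.zeros(n, dtype="float32"))
b = tvm.nd.array(np.zeros(n, dtype="float32"))
lib["add_one"](a, b)
print(b.asnumpy())  # expect an array of 1.0
```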
shapes ○ Dynamic inputs: batch size, image size, sequence length, etc. ○ Output shape of some ops are data dependent: arange, nms, etc. ○ Control flow: concatenate within a while loop Limitation of TVM/graph modes (op_attrs, input_tensors, out_ndims) -> out_shape_tensors ○ Data dependent (op_attrs, input_data, out_ndims) -> out_shape_tensors ○ Data independent (op_attrs, input_shapes, out_ndims) -> out_shape_tensors© out_shape_tensors ○ Data dependent (op_attrs, input_data, out_ndims) -> out_shape_tensors ○ Data independent (op_attrs, input_shapes, out_ndims) -> out_shape_tensors ● Why? ○ Fuse data independent shape0 码力 | 24 页 | 417.46 KB | 5 月前3Trends Artificial Intelligence
datapoints turned into this beast. As soon as we updated one chart, we often had to update another – a data game of whack-a-mole… a pattern that shows no sign of stopping…and will grow more complex as competition related to the artificial intelligence technology evolution is indeed unprecedented, as supported by the data. This document is filled with user, usage and revenue charts that go up-and-to-the-right… often supported Threats = Rising Competition + Open-Source Momentum + China’s Rise • AI & Physical World Ramps = Fast + Data-Driven • Global Internet User Ramps Powered by AI from Get-Go = Growth We Have Not Seen Likes of0 码力 | 340 页 | 12.14 MB | 4 月前3Google 《Prompt Engineering v7》
as image prompts) is the input the model uses to predict a specific output. You don’t need to be a data scientist or a machine learning engineer – everyone can write a prompt. However, crafting the most complicated. Many aspects of your prompt affect its efficacy: the model you use, the model’s training data, the model configurations, your word-choice, style and tone, structure, and context all matter. Therefore responses, and can hinder the model’s ability to provide meaningful output. You don’t need to be a data scientist or a machine learning engineer – everyone can write a prompt. Prompt Engineering February0 码力 | 68 页 | 6.50 MB | 6 月前3OpenAI 《A practical guide to building agents》
error-prone, for example performing vendor security reviews. 03 Heavy reliance on unstructured data: Scenarios that involve interpreting natural language, extracting meaning from documents, or interacting redundant definitions. Broadly speaking, agents need three types of tools: Type Description Examples Data Enable agents to retrieve context and information necessary for executing the workflow. Query transaction with a series of tools when using the Agents SDK: Python 1 2 3 4 5 6 7 8 8 10 11 12 from import def agents Agent, WebSearchTool, function_tool @function_tool save_results(output): db0 码力 | 34 页 | 7.00 MB | 5 月前3Bring Your Own Codegen to TVM
Affiliates. All rights reserved. Example showcase: Intel MKL-DNN (DNNL) library 1. Import packages import numpy as np from tvm import relay 2. Load a pretrained network mod, params = relay.testing.mobilenet. = relay.create_executor(“vm”, mod=mod, ctx=tvm.cpu(0)) data = np.random.uniform(size=(1, 3, 224, 224)).astype(“float32”) out = exe.evaluate()(data, **params) How Would That Look Like?© 2019, Amazon Web (inputs) can be checked as well Return True/False for this op After Annotation op op op op data weight1 weight3 weight2 output Subgraph begin Subgraph end© 2019, Amazon Web Services, Inc. or0 码力 | 19 页 | 504.69 KB | 5 月前3DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Experimental Setups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.1.1 Data Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.1.2 Hyper-Parameters MLA and MHA . . . . . . . . . . . . . . . . . . . . . . . . . 31 E Discussion About Pre-Training Data Debiasing 32 F Additional Evaluations on Math and Code 33 G Evaluation Formats 34 3 1. Introduction previous release) (DeepSeek-AI, 2024), this corpus features an extended amount of data, especially Chinese data, and higher data quality. We first pretrain DeepSeek-V2 on the full pre-training corpus. Then0 码力 | 52 页 | 1.23 MB | 1 年前3Gluon Deployment
Trademark Deploy GluonCV Models GluonCV Models MXNet Computational Graph Json Acyclic Graph Export As-is Optimize with TVM© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.0 码力 | 8 页 | 16.18 MB | 5 月前3OpenAI - AI in the Enterprise
employees can focus on the things only people can do. And because AI can process huge amounts of data from many sources, it can create customer experiences that feel more human because they’re more relevant need to explain to the candidate why this specific job was recommended to them. Indeed uses the data analysis and natural language capabilities of GPT-4o mini to shape these ‘why’ statements in their function. With thousands of suppliers, Lowe’s often has to work with incomplete or inconsistent product data. 13 AI in the EnterpriseThe key is in accurate product descriptions and tagging. But it also requires0 码力 | 25 页 | 9.48 MB | 5 月前3XDNN TVM - Nov 2019
VGG16 ResNet-50 GoogleNet-V3 Aristotle on 7020 FPGA Iphone8plus Kirin 970 CPU MEM CONTROLLER BUS Data Mover IMG WR SCHEDULER WEIGHTS WR SCHEDULER SMART MEM FABRIC IMG RD SCHEDULER WEIGHTS RD node in TVM graph { "nodes": [ { "op": "null", "name": "data", "inputs": [] }, { "op": "tvm_op", "name": "xdnn0", "attrs": { "flatten_data": "0", "func_name": “accel_fused", "num_inputs": "1", "num_outputs": "num_outputs": "1" }, "inputs": [[0, 0, 0]] }, { "op": "tvm_op", "name": "flatten0", "attrs": { "flatten_data": "0", "func_name": "fuse_flatten", "num_inputs": "1", "num_outputs": "1" }, "inputs": [[1, 0, 0]]0 码力 | 16 页 | 3.35 MB | 5 月前3
19 results in total