TVM Tools Team


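Now hiring

## T-Head (平头哥)

## TVM Caffe Frontend, 2019-11-16

## TVM at T-Head
• Toolchain product: in the companion software released for the T-Head chip platform, TVM is a key part of the toolchain. It converts pretrained Caffe or TensorFlow models to LLVM IR and finally generates binaries that run on the Wujian (无剑) SoC platform.

[Figure labels: T-Head integrated development environment; unified application development framework; one-click application deployment; Caffe; TensorFlow; TVM; graphical compute analysis; T-Head NN; Wujian SoC platform; LLVM; custom AI accelerator; heterogeneous joint debugging]

## Why add a Caffe frontend?
- Customer demand: during the evaluation stage, Caffe models make up a large share of the networks customers use to evaluate chips.
- Competitors already support a Caffe frontend: most deployment tools from the major chip vendors support it, so supporting a Caffe frontend helps competitiveness.
- Open-source community: there is a large stock of open-source Caffe network models, and direct Caffe support in TVM makes it easier for everyone to try these resources.

## Current progress
- No Caffe dependency: from_caffe imports Caffe model files directly, with no need to install Caffe first.
• net (tested networks): alexnet / densenet121 / inception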

TVM@AliOS

## Presentation Agenda
• TVM @ AliOS Overview
• TVM @ AliOS ARM CPU
• TVM @ AliOS Hexagon DSP
• TVM @ AliOS Intel GPU
• Misc

## Part One: TVM @ AliOS Overview

## AliOS Overview
• AliOS (www.alios
AliOS connected cars: co-creating intelligent connected vehicles and jointly building the future mobility ecosystem

## TVM Timeline @ AliOS
AliOS | Driving intelligence in everything

## AliOS TVM Arch

## Part Two: AliOS TVM @ ARM CPU

## AliOS TVM @ ARM CPU
• Support TFLite (Open Source and Upstream Master)
• Optimize on INT8 & FP32

## AliOS TVM @ ARM CPU INT8 Convolution
• NHWC

TVM Meetup: Quantization

## Compilation of Quantized Models in TVM
Animesh Jain, Amazon SageMaker Neo, AWS AI

## Quantization Overview
• Represent FP32 numbers with lower-precision INT8 numbers
• Integer number stands as
com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf

## Quantization in TVM
• Quantization within TVM: Automatic Quantization
  • TVM stack ingests an FP32 graph and a small dataset
  • Finds a suitable quantization scale
  • Produces a quantized graph
• Compiling pre-quantized models: QNN Dialect
  • TVM ingests a pre-quantized graph in TFLite or MXNet
  • Uses high-level wrapper ops of the QNN dialect
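As an illustration of the FP32-to-INT8 idea in the overview, here is a minimal pure-Python sketch. It assumes a symmetric, per-tensor scheme in which the scale maps the largest absolute value to 127; the slides do not say which scheme is used, and the function names `choose_scale`, `quantize`, and `dequantize` are hypothetical, not TVM APIs.

```python
def choose_scale(values, num_bits=8):
    """Pick a symmetric scale so the largest magnitude maps to the integer limit.

    Assumption: symmetric per-tensor quantization (not stated in the slides).
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to integers: round(value / scale), clamped to the signed range."""
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def dequantize(qvalues, scale):
    """Recover approximate floats from the integer representation."""
    return [q * scale for q in qvalues]


if __name__ == "__main__":
    data = [-1.0, 0.0, 0.3, 1.0]
    scale = choose_scale(data)
    q = quantize(data, scale)
    print(q)
    print(dequantize(q, scale))
```

For values inside the calibrated range, the round-trip error is at most half a quantization step (`scale / 2`), which is why the choice of scale matters.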

Dynamic Model in TVM

## Dynamic Model in TVM
AWS AI
Presenters: Haichen Shen, Yao Wang (Amazon SageMaker Neo, Deep Engine Science)

## Models with dynamism
• Control flow (if, loop, etc.)
• Dynamic shapes
  ☐ Dynamic inputs:
Control flow: concatenate within a while loop

Limitation of TVM/graph runtime
• Cannot compile and run dynamic models

## Support dynamic model in TVM
• Support Any-dim in typing
• Use shape function to compute

|Instruction|Description|
|---|---|
|Invoke|Invokes a function at an index.|
|InvokeClosure|Invokes a Relay closure.|
|InvokePacked|Invokes a TVM compiled kernel.|
|AllocStorage|Allocates a storage block.|
|AllocTensor|Allocates a tensor value of
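The two ideas named above (an "Any" dimension at typing time, and a shape function at run time) can be sketched in plain Python for the concatenate case. This is only an illustration under assumptions: `ANY`, `concat_type`, and `concat_shape_func` are hypothetical names, and the code does not use or reproduce TVM's actual implementation.

```python
ANY = None  # hypothetical stand-in for a dimension unknown at compile time


def concat_type(shapes, axis):
    """Compile-time typing: infer the output shape, keeping unknown dims as ANY."""
    out = list(shapes[0])
    for shape in shapes[1:]:
        if len(shape) != len(out):
            raise ValueError("rank mismatch")
        for i in range(len(out)):
            a, b = out[i], shape[i]
            if i == axis:
                # The concatenated axis is unknown if any input is unknown there.
                out[i] = ANY if a is ANY or b is ANY else a + b
            elif a is ANY:
                out[i] = b  # take whatever the other input knows
            elif b is not ANY and a != b:
                raise ValueError("non-concatenated dims must match")
    return tuple(out)


def concat_shape_func(shapes, axis):
    """Run-time shape function: every input shape is concrete by now."""
    out = list(shapes[0])
    for shape in shapes[1:]:
        for i in range(len(out)):
            if i == axis:
                out[i] += shape[i]
            elif out[i] != shape[i]:
                raise ValueError("non-concatenated dims must match")
    return tuple(out)


if __name__ == "__main__":
    print(concat_type([(ANY, 4), (2, 4)], axis=0))
    print(concat_shape_func([(3, 4), (2, 4)], axis=0))
```

The split mirrors the bullets: typing tolerates unknown dimensions so the model can be compiled, while the shape function computes the concrete output shape once the inputs are known, for example on each iteration of a while loop that concatenates.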

Yealink TVM Deployment

## Yealink Network Technology Co., Ltd. (Xiamen), Hangzhou Branch
TVM for deployment
dolphintear
1. Voice Communication
2. Video Conferencing

### 1. Face detection/recognition, background blur, human pose estimation

## Why choose TVM for our deployment?
1. OpenVINO is a black box and cannot deploy our network (with depthwise conv2d,)
2. TVM can not only deploy our network but also get a good performance gain from autotuning
3. TVM supports many kinds of hardware platforms: Intel/ARM CPU, NVIDIA/ARM GPU, VTA...

## TVM on Windows:
1. Get a .log file from autotvm on Ubuntu
2.

TVM: Where Are We Going

## TVM: Where are we going
Tianqi Chen

## Current Deep Learning Landscape
Frameworks and Inference engines

Open source, automated end-to-end optimization framework for deep learning.

## TVM Stack

## TVM: Learning-based Learning System
Frameworks

XDNN TVM - Nov 2019

## FPGA CNN Accelerator and TVM
Elliott Delaye, Xilinx

## TVM Target devices and models
HW Platforms

## Inference Flow
https://github.com/xilinx

## TVM as Unified ML Front End
Caffe
@relay.transform.module_pass(opt_level=4)
class AccelModule:
XIR Compiler, Quantizer, Partitioner

## TVM Partitioning
- More than supported/not supported, pattern matching graph colorization
- Choices how

TVM@Alibaba AI Labs

Alibaba AI Labs & TVM
PART 1: ARM32 CPU
PART 2: HIFI4 DSP
PART 3: PowerVR GPU

ARM 32 CPU

## Resolution
[Figure labels: Overflow-aware Quantization; Tensorize; Kernel + ALIOS TVM; ARM32]

PowerVR GPU

## PowerVR support by TVM
[Figure labels: DL Model; Caffe2; K; mxnet; CUDA TOPI; Mali TOPI; ROCM TOPI; PVR TOPI; NNVM Frontends; Tuning tasks; Auto TVM; Machine Learning Automated Optimizer; Schedule; Computation Graph Optimizations; Tensor Operators & Property Registry; Compiler Toolchain; TVM Runtime; TOPI Operators]

PAI & TVM Meetup - Shanghai 20191116

Training/Inference
PAI (Platform of AI), Alibaba Cloud Intelligence

## Outline
• TensorCore AutoCodeGen in TVM
• FP16 Mixed-Precision Training on PAI
• INT8 Inference on PAI-Blade

## TensorCore AutoCodeGen
stride, nvcuda::wmma::mem\_col\_major)|

## Background
• TVM TensorCore Intrinsics
  • Authored by @Hzfengsy
  • Intrinsics: tvm_load_matrix_sync, tvm_mma_sync ...
  • New Memory Scopes: wmma.matrix_a/b, accumulator
Virtual threads for data reuse (ongoing)

## Performance on V100 (FP16)
|M, N, K|cuBLAS TensorCore|TVM TensorCore|speedup|
|---|---|---|---|
|512, 16, 512|7.7470us|5.2570us|1.47X|
|512, 32, 512|8.0140us|6
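The speedup column appears to be the cuBLAS latency divided by the TVM latency; that reading is an assumption, though it is consistent with the one complete row of the table. A minimal check in Python (the `speedup` helper is hypothetical):

```python
def speedup(baseline_us, candidate_us):
    """Ratio of baseline latency to candidate latency; above 1 means the candidate is faster."""
    return baseline_us / candidate_us


if __name__ == "__main__":
    # Row (M, N, K) = (512, 16, 512): cuBLAS 7.7470 us vs. TVM 5.2570 us.
    ratio = speedup(7.7470, 5.2570)
    print(f"{ratio:.2f}X")  # prints 1.47X
```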

Facebook -- TVM AWS Meetup Talk

## TVM at Facebook
Lots of contributors at FB and elsewhere

## Why TVM?
- Performance matters a lot
- Heterogeneous computing environment
- High variety of workloads
- Ever-increasing set of primitives (over 500 aten kernels)
- Interpreter methods not delivering generalized performance

## TVM for Speech Synthesis
- WaveRNN-style model architecture
- Autoregressive sampling net running at faster
- Uh oh

## Enter, TVM and model co-design
- PyTorch operator overhead makes interpreter infeasible
- Reduce FLOPs with

