TVM: Where Are We Going
optimization framework for deep learning. TVM Stack: High-Level Differentiable IR, Tensor Expression and Optimization Search Space, LLVM, CUDA, Metal, VTA, Edge FPGA, Cloud FPGA, ASIC, Optimization, AutoTVM, Device Runtime, TSIM Driver, TSIM Binary, New Hardware Design in Verilog, Verilator. Toward Unified IR Infra. Overview of New IR Infra: single unified module/pass, type system, with function variants support. Compilation: print(mod["te_add_one"].args). Use hybrid script as an alternative text format. Directly write pass, manipulate IR structures. Accelerate innovation, e.g. use (GA/RL/BayesOpt/your favorite ML method) for AutoSchedule
0 credits | 31 pages | 22.64 MB | 5 months ago
Facebook -- TVM AWS Meetup Talk
block-sparse matrix multiplication (~50 lines of TVM IR) - Add relay.reinterpret to implement rational approximations in user space (~10 lines of Relay IR) - A few days of work - TVM sampling model running. Kernel synthesis - Dynamic shapes, stride specialization - Impedance mismatch with PyTorch JIT IR and Relay IR - Watch this space :) Big thanks to the community
0 credits | 11 pages | 3.08 MB | 5 months ago
PAI & TVM Meetup - Shanghai 20191116
IR Passes. Need to satisfy warp tile requirements (16x16x16 …) | TensorCore Intrinsics. Kind of Auto Tensorization. CUDA CodeGen. IR passes
0 credits | 26 pages | 5.82 MB | 5 months ago
TVM Meetup Nov. 16th - Linaro
835), mate10/mate10pro (kirin 970), p20/p20pro (kirin 970) -target=arm64-linux-android -mattr=+neon llvm firefly rk3399, rock960, ultra96 -target=aarch64-linux-gnu -mattr=+neon rasp3b (bcm2837) -targ (mali g71) N/A FPGA vta pynq, ultra96 N/A sdaccel. Out-of-tree support or WIP: Hexagon DSP (via llvm), Ascend NPU, and more. Green: Linaro 96Boards. Linaro for TVM ● Linaro AI/ML group can be a good fit
0 credits | 7 pages | 1.23 MB | 5 months ago
Yealink TVM Deployment
"-shared", "-fPIC", "-m32"] b. Run python tensorflow_blur.py to get the .log c. Use the .log with target="llvm -mcpu=i686 -mtriple=i686-linux-gnu", then run TVM_NDK_CC="clang -m32" python tf_blur.py
0 credits | 6 pages | 1.96 MB | 5 months ago
Dynamic Model in TVM
VMCompiler() with tvm.autotvm.apply_graph_best("resnet50_v1_graph_opt.log"): vm = vmc.compile(mod, "llvm") vm.init(ctx) vm.load_params(params) data = np.random.uniform(size=(1,
0 credits | 24 pages | 417.46 KB | 5 months ago
6 results in total