Intel GPU - IT文库 (search results)


FFmpeg Hardware Acceleration and Optimization on Intel GPUs

FFmpeg Hardware Acceleration and Optimization on Intel GPUs. 赵军, DCG/NPG @ Intel. Introducing FFmpeg VAAPI: media pipeline review; what FFmpeg VAAPI is; why we need FFmpeg VAAPI; current status; further plans; appendix. A typical media pipeline: source, libavformat … Depends on a backend driver and can provide video hardware acceleration: decoding, encoding, image post-processing. Available backend drivers: Intel VA (i965) driver for Intel chipsets; Intel hybrid driver; Intel HD driver; Mesa's state trackers for Gallium drivers (radeon, nouveau (?), freedreno, …); deprecated API bridges (vdpau-va bridge, powervr-va bridge). Intel GPU overview, by Gfx label: Gen3: Pinetrail (Pineview); Gen4: G965; Gen5: G4X, Ironlake (Piketon, Calpella).

0 码力 (credits) | 26 pages | 964.83 KB | 2 years ago
Go on GPU

Go on GPU. Changkun Ou, changkun.de/s/gogpu. GopherChina 2023, Session “Foundational Toolchains”, 2023 June 10. Agenda: basic knowledge for interacting with GPUs (motivation; GPU driver and standards; render and compute pipeline; Vulkan/Metal/DX12/OpenGL); accelerating Go programs using GPUs; challenges in Go when using GPUs; conclusion and outlook. Motivation of GPU acceleration: improve system computation performance; increase the amount of concurrency; processing large …

0 码力 (credits) | 57 pages | 4.62 MB | 2 years ago
Deploy VTA on Intel FPGA

Deploy VTA on Intel FPGA. Liangfu Chen, HARMAN (a Samsung company), 11/16/2019. Moore's Law is slowing down: 42 years of microprocessor trend data. Software, driver: … get physical memory of CMA memory block (should be used for DMA) … Cyclone V & Arria V SoC HPS physical memory map. Hardware: datapath of Chisel VTA …

0 码力 (credits) | 12 pages | 1.35 MB | 1 year ago
1.2.4 Go on GPU

Go on GPU. 欧长坤, changkun.de/s/gogpu. GopherChina 2023, Session “Foundational Toolchains”, 2023 June 10. About me: PhD, University of Munich (@mimuc/staff), research area: human-in-the-loop optimization; senior engineer at Sixt (@sixt/pricing-yield) … Outline: basic knowledge for interacting with GPUs (motivation for using GPUs; GPU drivers and standards; render and compute pipelines; Vulkan/Metal/DX12/OpenGL); supporting GPU acceleration in Go programs; challenges of GPU computing in Go; summary. Motivation for GPU-accelerated computing: higher system compute performance; greater concurrency; large-scale data processing; machine learning, deep learning, graphics rendering, and so on. [Figures: CPU architecture; GPU architecture (SIMD Exec Unit, Cache)]

0 码力 (credits) | 55 pages | 4.79 MB | 1 month ago
GPU Resource Management On JDOS

GPU Resource Management on JDOS. 梁永清, liangyongqing1@jd.com. Services provided (Experiment, Training, Serving): 1. GPU containers for experiments; 2. a Kubeflow-based machine learning training service; 3. model management and model serving. All are container-based; physical GPU machines are not offered directly to business teams. GPU experiments: this is JDOS's regular container service; use a gpu zone and set the appropriate image yourself; full supporting services are available. Example image: public/tensorflow:1.7.0-devel-gpu-py3-v1 …

0 码力 (credits) | 11 pages | 13.40 MB | 1 year ago
Intel's Hadoop in the Big Data Era

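Intel's Hadoop in the Big Data Era. System solutions architect: 朱海峰, Intel® China Cloud Computing Innovation Center, April 2013, Beijing. Legal notice: The information in this document is provided in connection with Intel® products. This document does not represent a grant by Intel Corporation or any other party, to anyone, of any intellectual property rights, express or implied. Except for the warranty conditions set out in Intel's terms and conditions of sale for the relevant products, Intel Corporation makes no other express or implied warranty regarding the sale and/or use of Intel products … Before ordering a product, contact your local Intel sales office or distributor for the latest specifications. To obtain copies of documents that have an order number and are referenced in this or other Intel literature, call 1-800-548-4725 or visit http://www.intel.com/design/literature.htm. Performance tests and ratings are measured using specific computer systems and/or components and approximately reflect the performance of Intel® products. System hardware, software design, or configuration … Intel Active Management Technology may be unavailable, or some of its capabilities limited, over a host-OS-based virtual private network (VPN), or when connecting wirelessly, on battery power, sleeping, hibernating, or powered off. For more information, visit http://www.intel.com/technology/iamt. 64-bit computing on Intel® architecture requires a computer system with a processor, chipset, and basic input/output system (BI…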

0 码力 (credits) | 36 pages | 2.50 MB | 2 years ago
Bridging the Gap: Writing Portable Programs for CPU and GPU

Bridging the Gap: Writing Portable Programs for CPU and GPU using CUDA. Thomas Mejstrik, Dimetor, FWF. ROCm, Vulkan, ...: you can tell me about them afterwards. Why write programs for CPU and GPU: difference CPU/GPU (algorithms are designed differently; latency/throughput; memory bandwidth; number …); why it makes sense (library/framework developers; embarrassingly parallel algorithms …); scope of the talk.

0 码力 (credits) | 124 pages | 4.10 MB | 1 year ago
Activation Functions and GPU Acceleration

PyTorch: Activation Functions and GPU Acceleration. Lecturer: 龙良曲. … \beta*x)) … GPU accelerated: device = torch.device('cuda:0'); net = MLP().to(device); optimizer = …

0 码力 (credits) | 11 pages | 452.22 KB | 2 years ago
TVM@AliOS

Presentation agenda: TVM @ AliOS overview; TVM @ AliOS ARM CPU; TVM @ AliOS Hexagon DSP; TVM @ AliOS Intel GPU; misc. Part one, TVM @ AliOS overview: AliOS (www.alios.cn) is a newly designed … r0 = #0; jumpr r31 } … Part four, AliOS TVM @ Intel GPU: implement the schedule from scratch; leverage the Intel Subgroup Extension. Subgroups … GEMM hardware efficiency @ Intel Apollo Lake GPU …

0 码力 (credits) | 27 pages | 4.86 MB | 1 year ago
C++ High-Performance Parallel Programming and Optimization - Slides - 08 GPU Programming with CUDA

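GPU Programming with CUDA, by 彭于斌 (@archibate). Past recordings: https://www.bilibili.com/video/BV1fa411r7zp. Course slides and code: https://github.com/parallel101/course. Prerequisites: you have learned C/C++ programming; you understand concepts such as malloc/free; you are familiar with the STL … Writing code that runs on the GPU: define a function kernel and prefix it with the __global__ qualifier so that it executes on the GPU. When calling kernel, however, you cannot write kernel() directly; you must use the triple-angle-bracket syntax, kernel<<<1, 1>>>(). Why? What are the two 1s for? This is explained later. Once it runs, printf executes on the GPU … A function that executes on the GPU is called a kernel function; any function marked __global__ is a kernel function. No output? Synchronize! If you compile and run that code as is, it will not print Hello, world!. This is because communication between the GPU and the CPU is asynchronous, for efficiency. That is, …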

0 码力 (credits) | 142 pages | 13.52 MB | 2 years ago

1000 results in total
