Intel GPU - IT文库

FFmpeg Hardware Acceleration and Optimization on Intel GPUs (FFmpeg在Intel GPU上的硬件加速与优化)

FFmpeg Hardware Acceleration and Optimization on Intel GPUs. 赵军 (Zhao Jun), DCG/NPG @ Intel. Introducing FFmpeg VAAPI • Media pipeline review • What is FFmpeg VAAPI • Why we need FFmpeg VAAPI • Current status • Further plans • Appendix. Typical media pipeline: File Device Network Stream com/01org/libva • Depends on back-end drivers; can provide video hardware acceleration • Decoding • Encoding • Image post-processing. Available back-end drivers • Intel VA(i965) driver for Intel chip-sets • Intel hybrid driver • Intel HD driver • Mesa's state-trackers for gallium drivers: • radeon, nouveau (?), freedreno, … • Deprecated API bridges • vdpau-va bridge • powervr-va bridge • … Introduction to Intel GPUs • Gfx Label • Gen3: Pinetrail (Pineview) • Gen4: G965 • Gen5: G4X, Ironlake (Piketon, Calpella)

0 points | 26 pages | 964.83 KB | 1 year ago
Go on GPU

Changkun Ou. 2023. Go on GPU. GopherChina 2023. Session "Foundational Toolchains" Go on GPU Changkun Ou changkun.de/s/gogpu GopherChina 2023 Session “Foundational Toolchains” 2023 June 10 Agenda ● Basic knowledge for interacting with GPUs ● Accelerate Go programs using GPUs ● Challenges in Go when using outlooks Agenda ● Basic knowledge for interacting with GPUs ○ Motivation ○ GPU Driver and Standards ○ Render and

0 points | 57 pages | 4.62 MB | 1 year ago
Bridging the Gap: Writing Portable Programs for CPU and GPU

Bridging the Gap: Writing Portable Programs for CPU and GPU using CUDA Thomas Mejstrik Sebastian Woblistin Content 1 Motivation Audience etc.. Cuda crash course Quiz time 2 Patterns Oldschool Motivation Patterns The dark path Cuda proposal Thank you Why write programs for CPU and GPU Difference CPU/GPU Algorithms are designed differently Latency/Throughput Memory bandwidth Number of cores Why write programs for CPU and GPU Difference CPU/GPU Why it makes sense? Library/Framework developers Embarrassingly parallel algorithms User

0 points | 124 pages | 4.10 MB | 6 months ago
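High-Performance Parallel Programming and Optimization in C++ - Courseware - 08 GPU Programming with CUDA (C++高性能并行编程与优化 - 课件 - 08 CUDA 开启的 GPU 编程)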

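GPU Programming with CUDA, by 彭于斌 (@archibate). Past recordings: https://www.bilibili.com/video/BV1fa411r7zp Course slides and code: https://github.com/parallel101/course Prerequisites • Have studied C/C++ programming. • Understand concepts such as malloc/free. • Be familiar with STL containers, function templates, and so on. cannot be done. Writing code that runs on the GPU • Define a function kernel and prefix it with the __global__ qualifier so that it executes on the GPU. • However, kernel cannot be called directly as kernel(); it must be called with the triple-angle-bracket syntax kernel<<<1, 1>>>(). Why? What are the two 1s for? This is explained later. • Once run, printf executes on the GPU. A kernel function executes on the GPU; any function qualified with __global__ is a kernel function. No output? Synchronize! • However, if you compile and run that code as is, it will not print Hello, world!. • This is because communication between the GPU and the CPU is asynchronous, for efficiency. That is, after the CPU calls kernel<<<1, 1>>>(), the call does not wait for execution on the GPU to finish before returning. In fact, it merely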

0 points | 142 pages | 13.52 MB | 1 year ago
Heterogeneous Modern C++ with SYCL 2020

http://wongmichael.com/about ● C++11 book in Chinese: https://www.amazon.cn/dp/B00ETOV2OQ We build GPU compilers for some of the most powerful supercomputers in the world Nevin “:-)” Liber nliber@anl Attribution 4.0 International License SYCL Single Source C++ Parallel Programming GPU FPGA DSP Custom Hardware GPU CPU CPU CPU Standard C++ Application Code C++ Libraries ML Frameworks give better performance on complex apps and libs than hand-coding AI/Tensor HW GPU FPGA DSP Custom Hardware GPU CPU CPU CPU AI/Tensor HW Other Backends SYCL 2020 is here! Open Standard for

0 points | 114 pages | 7.94 MB | 6 months ago
Distributed Ranges: A Model for Building Distributed Data Structures, Algorithms, and Views

performance claims, visit www.intel.com/PerformanceIndex or scan the QR code: © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries about future Intel products. - I work in Intel’s research labs. Work described here will involve experimental prototypes and early research. Problem: writing parallel programs is hard - Multi-GPU, multi-CPU / execution necessary. CPU NIC GPU GPU GPU GPU Xe Link Multi-GPU Systems - NUMA regions: - 4+ GPUs - 2+ CPUs

0 points | 127 pages | 2.06 MB | 6 months ago
Back to Basics: Concurrency

"The number of transistors incorporated in a chip will approximately double every 24 months." --Gordon Moore, Intel co-founder Moore’s Law (2/2) ● Around 1965 Gordon Moore predicted the number of transistors Dennard Scaling (1/3) http://www-cs-faculty.stanford.edu/~eroberts/cs181/projects/2010-11/TechnologicalSing

0 points | 141 pages | 6.02 MB | 6 months ago
Tracy: A Profiler You Don't Want to Miss

iOS, Android, WASM*) Hybrid profiling capabilities (sampling and/or instrumentation) (CPU and GPU instrumentation) Tracing capabilities (values, messages, plots, allocations, …) Hassle-free integration spall https://handmade.network/p/333/spall/ geiger https://github.com/david-grs/geiger Intel IACA https://www.intel.com/content/www/us/en/developer/articles/tool/architecture-code-analyzer.html “There is experience! Tracy can do it all! Tracy Profiler GUI Frame Info Menu bar GPU Timeline (per “device”) CPU Timeline (per-thread) Custom Plots & Allocation Trackers Tracy Client

0 points | 84 pages | 8.70 MB | 6 months ago
Tracy: A Profiler You Don't Want to Miss

macOS, iOS, Android, WASM*) Hybrid profiling capabilities (sampling and/or instrumentation) (CPU and GPU instrumentation) Tracing capabilities (values, messages, plots, allocations, …) Hassle-free integration https://github.com/david-grs/geiger Xpedite https://github.com/morganstanley/Xpedite Intel IACA https://www.intel.com/content/www/us/en/developer/articles/tool/architecture-code-analyzer.html “There is transforms the profiling experience! Tracy Profiler GUI Frame Info Menu bar GPU Timeline (per “device”) CPU Timeline (per-thread) Custom Plots & Allocation Trackers Tracy Client

0 points | 85 pages | 6.51 MB | 6 months ago
cppcon 2021 safety guidelines for C parallel and concurrency

platform at Woven Planet Ilya Burylov Principal Engineer at Intel An architect of C++ software solutions for autonomous driving market in Intel Contribution into functional safety MISRA standard Contribution http://wongmichael.com/about ● C++11 book in Chinese: https://www.amazon.cn/dp/B00ETOV2OQ We build GPU compilers for some of the most powerful supercomputers in the world © The Khronos® Group Inc Generation Safety Critical APIs for Graphics, Compute and Display Industry Need for CPU/GPU Acceleration APIs designed to ease system safety certification Rendering Compute Display • Khronos

0 points | 52 pages | 3.14 MB | 6 months ago
