## Enabling eBPF Super Powers on ARM64 with Cilium
Jianlin Lv
eBPF Summit
finleap
## Preface


Trip.com
Arm ecosystem of CNI
## flannel PROJECT CALICO cilium multi Contiv weave net
## Support
Arm64
• Cilium enabled on aarch64
https://github.com/
• Fix compiling and runtime issues on Arm64;
• Multi-arch support for cilium-related images
· CI/CD
• Travis
• Unit test
## Travis CI
• Arm64 Full VM
• arm64-graviton2
## How and When You Should Measure CPU Overhead of eBPF Programs
eBPF Summit
## Why should I profile eBPF programs?
## CI variance tracking
name
TCPLatency/eBPF/kprobe/sys_bind
TCPLatency/eB
Go toolchain internals and implementation based on arm64
Wei Xiao (肖玮)
Arm Staff Software Engineer
Wei.Xiao@arm.com
Go toolchain overview
A toolchain is a package composed of the compiler infrastructure.
Go toolchain example
$ go build -x helloworld.go
/golang/pkg/tool/linux_arm64/compile -o $WORK/b001/pkg.a -trimpath $WORK/b001 -p main -complete -buildid Lz0Z4IaaV-BMteKblcuy $WORK/b001/importcfg -pack -c=4 ./helloworld.go
/golang/pkg/tool/linux_arm64/buildid -w $WORK/b001/pkg.a # internal
/golang/pkg/tool/linux_arm64/link -o $WORK/b001/exe/a.out -importcfg $WORK/b001/importcfg.link
Manual for Arm-based Computers
Version 1.0, January 2023
www.moxa.com/products
© 2023 Moxa Inc. All rights reserved.
MOXA®
# Moxa Industrial Linux 3.0 (Debian 11) Manual for Arm-based Computers
Eligible Computing Platforms
2. Getting Started
Connecting to the Arm-based Computer
Connecting through the Serial Console
Connecting via the ...
Login Policy
Clearing the TPM Module
Localizing Your Arm-based Computer
Adjusting the Time
NTP Time Synchronization
TVM@AliOS
## PRESENTATION AGENDA
☑ TVM @ AliOS Overview
TVM @ AliOS ARM CPU
TVM @ AliOS Hexagon DSP
TVM @ AliOS Intel GPU
☑ Misc
## PART ONE TVM @ AliOS Overview
## AliOS Overview
• AliOS (www.alios | Driving Intelligence in Everything
## PART TWO AliOS TVM @ ARM CPU
## AliOS TVM@ARM CPU
• Support TFLite (Open Source and Upstream Master)
• Optimize on INT8 & FP32
## AliOS TVM @ ARM CPU INT8
Convolution
• NHWC layout
## AliOS TVM @ ARM CPU INT8
TVM / QNNPACK Speed Up @ Mobilenet V2 @ rasp 3b+ AARCH64

## AliOS TVM @ ARM CPU INT8
Depthwise
## Python for Good >>> PyCon China 2022
Python + AI Compute Optimization on ARM Chips
Speaker: Zhu Honglin (朱宏林), Alibaba Cloud Programming Languages and Compilers Team
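- ... AI programs written in the Python language. In the past, these programs always ran on GPUs or x86 CPUs. However, weighing power consumption, cost, and performance together, cloud vendors have begun building ARM-based service platforms, and engineers are now focused on how to integrate the Python + AI software stack and get the highest performance out of it on these platforms.
- Matrix multiplication is an important part of deep learning computation. We use the matrix extension newly provided by the ARM architecture to optimize bf16 matrix multiplication; this optimization raises the speed of pure matrix multiplication ... in OpenBLAS and PyTorch.
- In this talk, we present our Python + AI optimization work on the Yitian 710 ARM chip, along with best practices for deploying Python + AI workloads on ARM cloud platforms.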
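To make the bf16 matrix multiplication mentioned above concrete, here is a plain scalar reference sketch in Go. It is not the ARM matrix-extension code, nor the OpenBLAS or PyTorch implementation described in the talk; it only illustrates the arithmetic being accelerated, assuming bf16 inputs with float32 accumulation. The names `toBF16`, `fromBF16`, and `matmulBF16` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// toBF16 converts a float32 to its bfloat16 bit pattern using
// round-to-nearest-even. bfloat16 keeps the sign, the 8 exponent bits,
// and the top 7 mantissa bits of the float32 encoding.
func toBF16(f float32) uint16 {
	bits := math.Float32bits(f)
	if f != f { // NaN: force a quiet NaN so rounding cannot turn it into Inf
		return uint16(bits>>16) | 0x0040
	}
	bits += 0x7FFF + ((bits >> 16) & 1) // rounding bias, ties go to even
	return uint16(bits >> 16)
}

// fromBF16 widens a bfloat16 bit pattern back to float32 (exact).
func fromBF16(b uint16) float32 {
	return math.Float32frombits(uint32(b) << 16)
}

// matmulBF16 computes C = A * B for row-major A (m x k) and B (k x n)
// stored as bfloat16, accumulating each dot product in float32.
func matmulBF16(a, b []uint16, m, k, n int) []float32 {
	c := make([]float32, m*n)
	for i := 0; i < m; i++ {
		for j := 0; j < n; j++ {
			var acc float32
			for p := 0; p < k; p++ {
				acc += fromBF16(a[i*k+p]) * fromBF16(b[p*n+j])
			}
			c[i*n+j] = acc
		}
	}
	return c
}

// encode converts a float32 slice to bfloat16 bit patterns.
func encode(xs []float32) []uint16 {
	out := make([]uint16, len(xs))
	for i, x := range xs {
		out[i] = toBF16(x)
	}
	return out
}

func main() {
	a := encode([]float32{1, 2, 3, 4}) // 2 x 2
	b := encode([]float32{5, 6, 7, 8}) // 2 x 2
	fmt.Println(matmulBF16(a, b, 2, 2, 2)) // prints [19 22 43 50]
}
```

Small integers are exactly representable in bf16, so this example loses no precision; with general inputs, the conversion to bf16 drops 16 mantissa bits while the float32 accumulator limits further error growth.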
## Deep Learning
• Widely used deep learning frameworks
• TensorFlow, PyTorch
• Combined with hardware (ARM server chips)
• Yitian 710
• AWS Graviton
• Matrix multiplication
2.7 (Old) Startup CPU Usage

2.8 (New) Startup CPU Usage
## Startup Breakdown
Enumerate asset
## High CPU Time
Single threaded code
Inefficient algorithms
Branch misprediction, cache misses
Spin locks
Grouping: Function / Call Stack
| Function / Call Stack | CPU Time | Wait Time by Utilization ▼ | Wait Count | Module |
## Bringing together the Arm ecosystem

## Linaro best-in-class Deep Learning performance by leveraging Neural Network acceleration in IP and SoCs from the Arm ecosystem, through collaborative seamless integration with the ecosystem of AI/ML software frameworks and libraries

## Arm NN open source project
• Linaro-hosted https://www.mlplatform.org/
• Git and review servers
• Forums