Just-in-Time Compilation - J F Bastien - CppCon 2020
… other applications are available only under the x86 architecture. ☹ We designed Digital FX!32 to make the complete set of applications, both native and x86, available to Alpha. The goal for the software is transparent execution of x86 Win32 applications on Alpha systems. FX!32 achieves its goal by transparently running those applications at speeds comparable to high-performance x86 platforms. Digital FX!32 is a software utility that enables x86 Win32 applications to be run on Windows NT/Alpha platforms. Once FX!32 has been installed, almost all x86 applications can be run on Alpha without special commands …
0 credits | 111 pages | 3.98 MB | 6 months ago
Hidden Overhead of a Function API
… below (127B) … Understanding how machine code is generated from C++: System V gABI; psABI: ARM, x86, …; C++ Itanium ABI; Microsoft Windows ABIs; C++ Standard … armv8-a System V ABI - iPhone - M1; armv7-a System V ABI - ancient iPhone - low-end Android smartphone; x86 (IA-32) System V ABI - ancient Linux server; x86 (IA-32) Microsoft ABI - ancient Windows device … armv8-a System V ABI 0.1 -O2 -std=c++20; x86 (IA-32) System V ABI Intel386 Architecture Processor Supplement x86-64 gcc 14.2 -O2 -std=c++20 -m32; x86 (IA-32) Microsoft ABI calling conventions x86 msvc v19.40 VS17.10 -O2
0 credits | 158 pages | 2.46 MB | 6 months ago
Performance Engineering: Being Friendly to Your Hardware
uint64_t v = 0x123456789abcdef0; x86: movabs r10, 0x123456789abcdef0; 49 ba f0 de bc 9a 78 56 34 12 … Code density: uint64_t v = 0x123456789abcdef0; x86: movabs r10, 0x123456789abcdef0; 49 ba f0 78 14 02 00 af 26 42 64 b8 14 02 00 f0 de 42 34 … Code density: uint64_t v = 0x123456789abcdef0; x86: movabs r10, 0x123456789abcdef0; 49 ba f0 de bc 9a 78 56 34 12; RISC-V: li a5, 305418240 addi a5, 78 14 02 00 af 26 42 64 b8 14 02 00 f0 de 42 34 … Code density: uint64_t v = 0x123456789abcdef0; x86: movabs r10, 0x123456789abcdef0; 49 ba f0 de bc 9a 78 56 34 12; RISC-V: li a5, 305418240 addi a5, …
0 credits | 111 pages | 2.23 MB | 6 months ago
TVM Meetup: Quantization
… Graph; Target-independent Relay passes; Target-optimized graph; Target-dependent Relay passes; Intel x86, ARM CPU, Nvidia GPU, ARM GPU; Schedule templates written in TVM Tensor IR; .. More targets; AutoTVM … QNN Dialect; QNN passes; Target-independent Relay passes; Target-optimized Int8 Relay Graph; Intel x86 schedule, ARM CPU schedule, Nvidia GPU schedule, ARM GPU schedule; Relay Int8 Graph; Target-dependent Relay operators • QNN Optimization passes • Some optimizations are easier at QNN level • Intel x86 VNNI requires conv input dtypes to be uint8 x int8 … © 2019, Amazon Web Services, Inc. or its Affiliates
0 credits | 19 pages | 489.50 KB | 5 months ago
Conan 2.10 Documentation
Chapter 4. Tutorial … Conan Documentation, Release 2.10.1 … engine/1.0 matrix/1.0 sound32/1.0 if arch==x86 … We will start by creating the first matrix/1.0 version: $ conan create matrix --version=1.0 Now … = "arch" def requirements(self): self.requires("matrix/[>=1.0 <2.0]") if self.settings.arch == "x86": self.requires("sound32/[>=1.0 <2.0]") … Let's move to the engine folder and install its dependencies: … revision. But sound32/1.0 is not in the lockfile, because for the default configuration profile (not x86), this sound32 is not a dependency. Now, a new matrix/1.1 version is created: $ cd .. $ conan create …
0 credits | 803 pages | 5.02 MB | 10 months ago
Conan 2.9 Documentation
Chapter 4. Tutorial … Conan Documentation, Release 2.9.3 … engine/1.0 matrix/1.0 sound32/1.0 if arch==x86 … We will start by creating the first matrix/1.0 version: $ conan create matrix --version=1.0 Now … = "arch" def requirements(self): self.requires("matrix/[>=1.0 <2.0]") if self.settings.arch == "x86": self.requires("sound32/[>=1.0 <2.0]") … Let's move to the engine folder and install its dependencies: … revision. But sound32/1.0 is not in the lockfile, because for the default configuration profile (not x86), this sound32 is not a dependency. Now, a new matrix/1.1 version is created: $ cd .. $ conan create …
0 credits | 795 pages | 4.99 MB | 10 months ago
Conan 2.7 Documentation
… but they are good to learn the concepts of lockfiles. engine/1.0 matrix/1.0 sound32/1.0 if arch==x86 … We will start by creating the first matrix/1.0 version: $ conan create matrix --version=1.0 Now … = "arch" def requirements(self): self.requires("matrix/[>=1.0 <2.0]") if self.settings.arch == "x86": self.requires("sound32/[>=1.0 <2.0]") … Let's move to the engine folder and install its dependencies: … revision. But sound32/1.0 is not in the lockfile, because for the default configuration profile (not x86), this sound32 is not a dependency. Now, a new matrix/1.1 version is created: $ cd .. $ conan create …
0 credits | 779 pages | 4.93 MB | 10 months ago
Conan 2.8 Documentation
… but they are good to learn the concepts of lockfiles. engine/1.0 matrix/1.0 sound32/1.0 if arch==x86 … We will start by creating the first matrix/1.0 version: $ conan create matrix --version=1.0 Now … = "arch" def requirements(self): self.requires("matrix/[>=1.0 <2.0]") if self.settings.arch == "x86": self.requires("sound32/[>=1.0 <2.0]") … Let's move to the engine folder and install its dependencies: … revision. But sound32/1.0 is not in the lockfile, because for the default configuration profile (not x86), this sound32 is not a dependency. Now, a new matrix/1.1 version is created: $ cd .. $ conan create …
0 credits | 785 pages | 4.95 MB | 10 months ago
Conan 2.6 Documentation
… but they are good to learn the concepts of lockfiles. engine/1.0 matrix/1.0 sound32/1.0 if arch==x86 … We will start by creating the first matrix/1.0 version: $ conan create matrix --version=1.0 Now … = "arch" def requirements(self): self.requires("matrix/[>=1.0 <2.0]") if self.settings.arch == "x86": self.requires("sound32/[>=1.0 <2.0]") … Let's move to the engine folder and install its dependencies: … revision. But sound32/1.0 is not in the lockfile, because for the default configuration profile (not x86), this sound32 is not a dependency. Now, a new matrix/1.1 version is created: $ cd .. $ conan create …
0 credits | 777 pages | 4.91 MB | 10 months ago
Exceptions Under the Spotlight
… size and speed on ARM (single core).
        Runtime (ms)               Size (B)
        Throw3  Return    Ratio    Throw3  Return    Ratio
  X86   2199    5         435      15536   15316     1.01
  ARM   59104   128       477      14152   10356     1.4
• Two main implementations for exceptions …
        …       Return    Ratio    Throw1  Return    Ratio
  X86   2362    1.2       1861     14936   14056     1.1
  ARM   53747   40        1343     14540   10276     1.4
        Runtime (ms)               Size (B)
        Throw3  Throw1    Ratio    Throw3  Throw1    Ratio
  X86   2199    2262      0.97     15536   14936     1.04
        …       thinnest  Ratio    Throw1  thinnest  Ratio
  X86   2362    4         590      14936   9440      1.58
• The “thinnest” throw:
        Runtime (ms)               Size (B)
        Throw3  Return    Ratio    Throw3  Return    Ratio
  X86   2199    5         435      15536   15316     1.01
• We got …
0 credits | 53 pages | 2.82 MB | 6 months ago
109 results in total













