《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
… efficient models capable of running on mobile and edge devices. We have also set up a couple of programming projects for a hands-on model optimization experience using these efficient layers and architectures. … Chen, Ting, et al. "Big self-supervised models are strong semi-supervised learners." Advances in Neural Information Processing Systems 33 (2020): 22243-22255. … [17] A head is a trainable sub-network that takes in the output of the base model. … [22] Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017). … Mathematically, we are given a pair of sequences with shapes (n, d) and (m, d), where …
53 pages | 3.92 MB | 1 year ago

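The (n, d) and (m, d) shapes describe queries attending over keys and values in the scaled dot-product attention of Vaswani et al. [22]. A minimal NumPy sketch of that computation (the names Q, K, V are placeholders; the excerpt's own sequence names were lost to truncation):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d) queries; K and V: (m, d) keys/values (Vaswani et al., 2017)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, m) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # (n, d) attended output

rng = np.random.default_rng(0)
Q, KV = rng.normal(size=(2, 4)), rng.normal(size=(3, 4))  # n=2, m=3, d=4
print(scaled_dot_product_attention(Q, KV, KV).shape)      # (2, 4)
```
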
动手学深度学习 v2.0 (Dive into Deep Learning)
… In short, we did not write a wake-word recognizer; instead, we wrote a program that "learns". Given a huge labeled dataset, it can most likely "learn" to recognize the wake word. This approach of determining a program's behavior with a dataset can be viewed as programming with data. For example, we can design a "cat-image detector" by providing a machine learning system with many pictures of cats and dogs. The detector eventually learns to output a very large positive number if the input is a picture of a cat, and if the input is a picture of a dog … … summing over all possible combinations of choices. If each $h_i$ can take $k$ distinct values (a finite number of states), this means we need to sum over $k^T$ terms, a plainly hopeless task. Fortunately, there is an elegant solution: dynamic programming. To see how dynamic programming works, we consider summing over the hidden variables $h_1, \ldots, h_T$ one at a time. By (9.4.1), this gives $P(x_1, \ldots, x_T)$ … … The computational performance of the models implemented in the earlier chapters can be improved further; for example, we can greatly reduce training time without affecting accuracy. 12.1 Compilers and Interpreters: so far, this book has mainly focused on imperative programming, which uses statements such as print, "+", and if to change a program's state. Consider the following simple imperative program:

```python
def add(a, b):
    return a + b

def fancy_func(a, b, c, d):
    e = add(a, b)
    f = add(c, d)
    g = add(e, f)
    return g
```

797 pages | 29.45 MB | 1 year ago

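The $k^T$-term sum above collapses to $O(Tk^2)$ work when the hidden variables are summed out one at a time — the forward recursion of dynamic programming. A minimal sketch on a toy hidden Markov model (the matrices, values, and function name are illustrative assumptions, not from the book):

```python
import numpy as np

def forward_prob(pi, A, B, obs):
    """P(x_1, ..., x_T) for a toy HMM via the dynamic-programming
    forward recursion: O(T k^2) instead of summing k^T terms.
    pi: (k,) prior; A: (k, k) transitions, A[i, j] = P(h_t=j | h_{t-1}=i);
    B: (k, v) emissions, B[h, x] = P(x | h); obs: length-T symbol list."""
    alpha = pi * B[:, obs[0]]          # alpha_1(h) = P(h_1=h) P(x_1 | h)
    for x in obs[1:]:
        alpha = (alpha @ A) * B[:, x]  # marginalize h_{t-1}, then emit x_t
    return alpha.sum()                 # finally marginalize h_T

pi = np.array([0.6, 0.4])                  # k = 2 hidden states
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])     # v = 2 observation symbols
print(forward_prob(pi, A, B, [0, 1, 0]))   # P of observing the sequence 0, 1, 0
```
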
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques
… [5] Hubara, Itay, et al. "Binarized neural networks." Advances in Neural Information Processing Systems 29 (2016). [4] Rastegari, Mohammad, et al. "XNOR-Net: ImageNet classification using binary convolutional neural networks." European Conference on Computer Vision (2016). … The evaluation is boilerplate code; there is not much we can do to make it interesting. We are programming in Python. Naturally, it is possible to use other languages (like Java for Android …
33 pages | 1.96 MB | 1 year ago

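Both binarization papers cited above approximate a weight tensor by a single scale times its signs; a minimal sketch of that idea (per-tensor scaling here, a simplification of XNOR-Net's per-channel scaling; the names are mine, not from the chapter):

```python
import numpy as np

def binarize(W):
    """Approximate W by alpha * sign(W): 1-bit weights plus one float scale.
    alpha = mean(|W|) minimizes ||W - alpha * sign(W)||^2 (Rastegari et al.)."""
    alpha = np.abs(W).mean()
    W_bin = np.where(W >= 0, 1.0, -1.0).astype(W.dtype)
    return alpha, W_bin

W = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
alpha, W_bin = binarize(W)
print(float(np.abs(W - alpha * W_bin).mean()))  # reconstruction error of 1-bit weights
```
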
PyTorch Release Notes
… The method implemented in your system depends on the DGX OS version that you installed (for DGX systems), the NGC Cloud Image provided by a Cloud Service Provider, or the software that you installed … 15 ‣ OpenMPI 4.1.4+ ‣ GDRCopy 2.3 ‣ TensorBoard 2.9.0 ‣ Nsight Compute 2023.1.1.4 ‣ Nsight Systems 2023.2.3.1001 ‣ NVIDIA TensorRT™ 8.6.1.6 ‣ Torch-TensorRT 1.5.0.dev0 ‣ NVIDIA DALI® 1.27.0 ‣ … ‣ NVIDIA DALI® 1.26.0 ‣ …
365 pages | 2.94 MB | 1 year ago

《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
… follows right after. Following the lead of the previous chapters, the theory is complemented with programming projects to assist readers in implementing these techniques from scratch. Our journey of learning … Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems 27 (2014). [16] Chawla, Nitesh V., et al. "SMOTE: Synthetic minority over-sampling technique." Journal of Artificial Intelligence Research 16 (2002): 321-357.
56 pages | 18.93 MB | 1 year ago

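SMOTE, cited above, synthesizes minority-class samples by interpolating between a sample and one of its k nearest minority neighbors; a minimal sketch of that idea (the function name, parameters, and toy data are mine, not from the chapter):

```python
import numpy as np

def smote_sample(X_min, k=3, n_new=5, seed=0):
    """Synthesize n_new minority samples a la Chawla et al.: pick a sample,
    pick one of its k nearest minority neighbors, interpolate between them."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        nn = np.argsort(dists)[1:k + 1]   # k nearest, skipping the sample itself
        j = rng.choice(nn)
        lam = rng.random()                # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.random.default_rng(1).normal(size=(10, 2))  # toy minority class
print(smote_sample(X_min).shape)                       # (5, 2)
```
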
Lecture 6: Support Vector Machine
… equivalent to minimizing $\|\omega\|^2 = \omega^T\omega$:
$$\min_{\omega,\, b} \ \omega^T\omega \quad \text{s.t.} \quad y^{(i)}(\omega^T x^{(i)} + b) \ge 1, \ \forall i$$
This is a quadratic programming (QP) problem! Interior point method (https://en.wikipedia.org/wiki/Interior-point_method), active … … problem, so strong duality ($p^* = d^*$) holds and the KKT conditions are respected. This yields a quadratic programming problem in $\alpha$; several off-the-shelf solvers exist to solve such QPs. Some examples: quadprog (MATLAB) …
82 pages | 773.97 KB | 1 year ago

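The primal QP above can be handed to any generic constrained solver; a minimal sketch using SciPy's SLSQP on a toy linearly separable set (the data, solver choice, and variable names are mine, not from the lecture):

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data, labels in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# theta = (w1, w2, b); minimize w^T w subject to y_i (w^T x_i + b) >= 1.
objective = lambda t: t[:2] @ t[:2]
cons = [{"type": "ineq",
         "fun": lambda t, i=i: y[i] * (t[:2] @ X[i] + t[2]) - 1.0}
        for i in range(len(y))]

res = minimize(objective, x0=np.zeros(3), constraints=cons)  # SLSQP by default
w, b = res.x[:2], res.x[2]
print(w, b, y * (X @ w + b))  # every functional margin should be >= 1
```
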
《TensorFlow 快速入门与实战》1 - TensorFlow 初印象 (TensorFlow Quick Start and Practice, Part 1: First Impressions of TensorFlow)
… 1990s … Jeff Dean, Google Brain Team, "Building Intelligent Systems with Large Scale Deep Learning" (recurring slide attribution) … Google … TensorFlow …
34 pages | 35.16 MB | 1 year ago

《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques
… LeCun, Yann, John Denker, and Sara Solla. "Optimal brain damage." Advances in Neural Information Processing Systems 2 (1989). … As you can deduce, the parameter changes the influence of the previous value of momentum … Zhou, Hattie, et al. "Deconstructing lottery tickets: Zeros, signs, and the supermask." Advances in Neural Information Processing Systems 32 (2019). [10] Liu, Zhuang, et al. "Rethinking the value of network pruning." arXiv preprint arXiv:1810… … Han, Song, et al. "Learning both weights and connections for efficient neural network." Advances in Neural Information Processing Systems 28 (2015). [7] Dettmers, Tim, and Luke Zettlemoyer. "Sparse networks from scratch: Faster training without losing performance." …
34 pages | 3.18 MB | 1 year ago

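The Han et al. and lottery-ticket papers cited here center on magnitude pruning: drop the smallest-magnitude weights and keep a binary mask of survivors. A minimal sketch of that step (the function name and the 50% sparsity target are illustrative assumptions, not from the chapter):

```python
import numpy as np

def magnitude_prune(W, sparsity=0.5):
    """Zero out the smallest-|w| fraction of W (Han et al., 2015) and
    return the pruned tensor plus the binary mask of surviving weights."""
    threshold = np.quantile(np.abs(W), sparsity)   # cutoff magnitude
    mask = (np.abs(W) > threshold).astype(W.dtype)
    return W * mask, mask

W = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
W_pruned, mask = magnitude_prune(W, sparsity=0.5)
print(float(mask.mean()))  # fraction of weights kept, about 0.5
```
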
Lecture 1: Overview
… Fellow, National University of Singapore, Singapore. Research interests: distributed algorithms and systems, wireless networks, mobile computing, Internet of Things. … Why Do We Need Machine Learning? To develop systems that are too difficult or expensive to construct manually because they require specific, detailed skills or knowledge tuned to one task (the knowledge-engineering bottleneck), and to develop systems that can automatically adapt and customize themselves to individual users, e.g., a personalized news or mail filter …
57 pages | 2.41 MB | 1 year ago

Lecture Notes on Support Vector Machine
… for each training sample $(x^{(i)}, y^{(i)})$: $\omega^T x^{(i)} + b \ge 1$ if $y^{(i)} = 1$, and $\omega^T x^{(i)} + b \le -1$ if $y^{(i)} = -1$. This is a quadratic programming (QP) problem, and can be solved by existing generic QP solvers, e.g., the interior point method, active …
18 pages | 509.37 KB | 1 year ago

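For reference, the two one-sided constraints fold into the single constraint of the hard-margin QP; a short derivation consistent with the excerpt's notation:

```latex
% Margin constraints for a sample (x^{(i)}, y^{(i)}), with y^{(i)} \in \{-1, +1\}:
%   y^{(i)} = +1  =>  \omega^T x^{(i)} + b \ge  1
%   y^{(i)} = -1  =>  \omega^T x^{(i)} + b \le -1
% Multiplying each inequality by y^{(i)} merges the two cases:
\[
  y^{(i)}\bigl(\omega^T x^{(i)} + b\bigr) \ge 1, \qquad \forall i,
\]
% which is the constraint set of the hard-margin quadratic program
\[
  \min_{\omega,\, b} \ \omega^T \omega
  \quad \text{s.t.} \quad y^{(i)}\bigl(\omega^T x^{(i)} + b\bigr) \ge 1, \ \forall i.
\]
```
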
18 results in total
- 1
- 2













