《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
… world, we must automate the embedding table generation because of the high costs associated with manual embeddings. One example of an automated embedding generation technique is the word2vec family of models. …
Kharameh … Mahmudabad (Persian: محمودآباد, also Romanized as Maḩmūdābād; also known as Maḩbūdābād-e Pā’īn, Mahmood Abad Hoomeh, Maḩmūdābād-e Ḩūmeh, and Maḩmūdābād-e Pā’īn) is a village in Korbal Rural District in …
… E., & Johnson, M. (2021). Distilling Large Language Models into Tiny and Effective Students using pQRNN. arXiv preprint arXiv:2101.08890.
15. Chung, H. W., Fevry, T., Tsai, H., Johnson, M., & Ruder, S. …
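As an aside to the snippet above, a minimal sketch of automated embedding generation with a word2vec-style model (assumes gensim 4.x is installed; the toy corpus and hyper-parameters are illustrative, not from the book):

```python
from gensim.models import Word2Vec

# Toy corpus: each "document" is a list of tokens.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

# Train skip-gram word2vec embeddings (sg=1); vector_size is the embedding width.
model = Word2Vec(sentences=corpus, vector_size=32, window=2,
                 min_count=1, sg=1, epochs=50)

cat_vector = model.wv["cat"]                  # learned 32-d embedding for "cat"
print(model.wv.most_similar("cat", topn=3))   # nearest neighbours in embedding space
```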
机器学习课程-温州大学-03深度学习-PyTorch入门
NumPy-to-PyTorch equivalents (excerpt):
    … / np.greater → x.le / x.gt
    np.greater_equal / np.equal / np.not_equal → x.ge / x.eq / x.ne
    Random seed: np.random.seed → torch.manual_seed
1. Tensors: the concept of tensors
… Comparison of Python, PyTorch 1.x and TensorFlow 2.x (table columns: Category | Python | PyTorch 1+ | TensorFlow) …
References:
1. Ian Goodfellow et al., Deep Learning (《深度学习》), Posts & Telecom Press, 2017
2. Andrew Ng, http://www.deeplearning.ai
3. Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer-Verlag, 2006
4. 李宏毅 (Hung-yi Lee), 《一天搞懂深度学习》 (Understand Deep Learning in One Day)
5. 吴茂 …
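A minimal sketch of the NumPy-to-PyTorch correspondence listed above (assumes both libraries are installed; the tensors are illustrative):

```python
import numpy as np
import torch

np.random.seed(0)      # NumPy global seed
torch.manual_seed(0)   # PyTorch counterpart

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([3.0, 2.0, 1.0])

print(a.gt(b))   # elementwise >,  like np.greater(a.numpy(), b.numpy())
print(a.ge(b))   # elementwise >=, like np.greater_equal
print(a.eq(b))   # elementwise ==, like np.equal
print(a.ne(b))   # elementwise !=, like np.not_equal
print(a.le(b))   # elementwise <=, like np.less_equal
```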
PyTorch Release Notes
… performance drop for GNMT training
‣ On Volta:
  ‣ Up to 20% performance drop for Tacotron training.
‣ Manual synchronization is required in CUDA graphs workloads between graph replays.
‣ The PyTorch container …
… 21.04:
‣ On NVIDIA Ampere Architecture GPUs:
  ‣ Up to 17% performance drop for VGG16 training
‣ Manual synchronization is required in CUDA graphs workloads between graph replays.
‣ The DLProf TensorBoard …
… ‣ Up to 20% performance drop in MaskRCNN training
‣ Up to 15% performance drop in VGG16 training
‣ Manual synchronization is required in CUDA graphs workloads between graph replays.
‣ The DLProf TensorBoard …
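A minimal sketch of the workaround named in the recurring known issue above, i.e. synchronizing manually between CUDA graph replays (assumes PyTorch 1.10+ on a CUDA device; the captured ReLU workload is illustrative, not taken from the release notes):

```python
import torch

device = torch.device("cuda")
static_input = torch.randn(64, 64, device=device)
static_output = torch.empty_like(static_input)

# Warm up on a side stream before capture (recommended practice).
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    static_output.copy_(static_input.relu())
torch.cuda.current_stream().wait_stream(s)

# Capture the workload into a CUDA graph.
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_output.copy_(static_input.relu())

for step in range(3):
    static_input.copy_(torch.randn(64, 64, device=device))  # refresh static inputs
    g.replay()
    torch.cuda.synchronize()  # manual synchronization between replays (the workaround)
```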
Keras: 基于 Python 的深度学习库
clear_session
keras.backend.clear_session()
Destroys the current TF graph and creates a new one. Useful for avoiding clutter from old models/layers.
manual_variable_initialization
keras.backend.manual_variable_initialization(value)
Sets the flag for manual variable initialization. This boolean flag determines whether variables should be initialized when they are instantiated (the default), or whether the user should handle the initialization themselves. …
… Use the PEP8 linter:
• Install the PEP8 packages: pip install pep8 pytest-pep8 autopep8
• Run a standalone PEP8 check: py.test --pep8 -m pep8
• You can automatically fix some PEP8 errors by running: autopep8 -i --select … for example: autopep8 -i --select …
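A minimal sketch of clear_session() in practice (assumes TensorFlow 2.x's bundled Keras, which exposes the same keras.backend.clear_session function; the toy model is illustrative):

```python
from tensorflow import keras

def build_model():
    # Start from a clean state so layers/graphs from earlier models do not
    # accumulate across repeated builds (e.g., inside a tuning loop).
    keras.backend.clear_session()
    return keras.Sequential([
        keras.Input(shape=(8,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1),
    ])

for _ in range(3):
    model = build_model()
    model.compile(optimizer="adam", loss="mse")
```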
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction
… practitioners have to do. Apart from saving humans time, it also helps by reducing the bias that manual decisions might introduce when designing efficient networks. Automation techniques can help improve … Automated Hyper-Param Optimization (HPO) is one such technique that can be used to replace / supplement manual tweaking of hyper-parameters like learning rate, regularization, dropout, etc. This relies on search …
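A minimal sketch of what automated HPO can look like: a random search over learning rate, dropout, and weight decay. The train_and_evaluate function is a hypothetical stand-in, not code from the book.

```python
import random

def train_and_evaluate(learning_rate, dropout, weight_decay):
    # Hypothetical stand-in: a real implementation would train a model with
    # these settings and return a validation metric.
    return 1.0 - abs(learning_rate - 1e-3) - 0.1 * dropout - weight_decay

search_space = {
    "learning_rate": lambda: 10 ** random.uniform(-5, -1),
    "dropout": lambda: random.uniform(0.0, 0.5),
    "weight_decay": lambda: 10 ** random.uniform(-6, -2),
}

best_score, best_params = float("-inf"), None
for trial in range(20):
    params = {name: sample() for name, sample in search_space.items()}
    score = train_and_evaluate(**params)
    if score > best_score:
        best_score, best_params = score, params

print("best score:", best_score, "with params:", best_params)
```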
【PyTorch深度学习-龙龙老师】-测试版20211223
1.8 References
[1] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, …

    # Gradient-descent step for linear regression y = wx + b (the function name,
    # the y = points[i, 1] line and the closing update are reconstructed to
    # complete the truncated snippet).
    def step_gradient(b_current, w_current, points, lr):
        # Compute the derivatives of the error function over all points and update w, b.
        b_gradient = 0
        w_gradient = 0
        M = float(len(points))  # total number of samples
        for i in range(0, len(points)):
            x = points[i, 0]
            y = points[i, 1]
            # derivative of the error w.r.t. b: grad_b = 2(wx+b-y), see Eq. (2.3)
            b_gradient += (2/M) * ((w_current * x + b_current) - y)
            # derivative of the error w.r.t. w: grad_w = 2(wx+b-y)*x, see Eq. (2.2)
            w_gradient += (2/M) * x * ((w_current * x + b_current) - y)
        new_b = b_current - lr * b_gradient
        new_w = w_current - lr * w_gradient
        return new_b, new_w
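A minimal usage sketch for the routine above (the synthetic data and driver loop are illustrative assumptions, not the book's training script):

```python
import numpy as np

# Synthetic samples from y = 2x + 1, so the recovered parameters are easy to check.
xs = np.linspace(0, 10, 100)
points = np.column_stack([xs, 2.0 * xs + 1.0])

b, w = 0.0, 0.0
for _ in range(5000):
    b, w = step_gradient(b, w, points, lr=0.01)
print(f"w ~ {w:.2f}, b ~ {b:.2f}")  # approaches w = 2, b = 1
```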
PyTorch Tutorial
• Zero out gradients after each update
• t.grad.zero_()  (*Assume 't' is a tensor)
Autograd (continued)
• Manual Weight Update - example
Optimizer
• Optimizers (optim package)
• Adam, Adagrad, Adadelta, SGD etc.
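A minimal sketch of the two approaches named in the snippet: a manual weight update with explicit gradient zeroing, then the same loop driven by torch.optim (the toy regression problem is illustrative):

```python
import torch

torch.manual_seed(0)
x = torch.randn(32, 3)
y = torch.randn(32, 1)
lr = 0.1

# Manual weight update with autograd.
w = torch.randn(3, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
for _ in range(100):
    loss = ((x @ w + b - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        w -= lr * w.grad
        b -= lr * b.grad
    w.grad.zero_()  # zero out gradients after each update
    b.grad.zero_()

# Equivalent loop using an optimizer from the optim package.
w2 = torch.randn(3, 1, requires_grad=True)
b2 = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([w2, b2], lr=lr)
for _ in range(100):
    opt.zero_grad()
    loss = ((x @ w2 + b2 - y) ** 2).mean()
    loss.backward()
    opt.step()
```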
Lecture 5: Gaussian Discriminant Analysis, Naive Bayes
Warm Up (Contd.)
Given a set of training data $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1,\cdots,m}$, where the training data are sampled in an i.i.d. manner. The probability of the $i$-th training data $(x^{(i)}, y^{(i)})$ is $p_{X|Y}(x^{(i)} \mid y^{(i)})\, p_Y(y^{(i)})$, so that
$$P(\mathcal{D}) = \prod_{i=1}^{m} p_{X|Y}(x^{(i)} \mid y^{(i)})\, p_Y(y^{(i)})$$
Log-likelihood function:
$$\ell(\theta) = \log \prod_{i=1}^{m} p_{X,Y}(x^{(i)}, y^{(i)}) = \log \prod_{i=1}^{m} p_{X|Y}(x^{(i)} \mid y^{(i)})\, p_Y(y^{(i)}) = \sum_{i=1}^{m} \left[ \log p_{X|Y}(x^{(i)} \mid y^{(i)}) + \log p_Y(y^{(i)}) \right]$$
where $\theta = \{p_{X|Y}(x \mid y), p_Y(y)\}_{x,y}$.
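A worked consequence of this decomposition (a standard result stated for completeness; the derivation is not in the snippet): maximizing the second sum alone, subject to the prior summing to one, gives the empirical class frequency:
$$\hat{p}_Y(y) = \arg\max_{p_Y} \sum_{i=1}^{m} \log p_Y(y^{(i)}) \;\; \text{s.t.} \; \sum_{y'} p_Y(y') = 1 \qquad\Longrightarrow\qquad \hat{p}_Y(y) = \frac{1}{m} \sum_{i=1}^{m} \mathbb{1}\{y^{(i)} = y\}$$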
Lecture Notes on Gaussian Discriminant Analysis, Naive Bayes …
$$\cdots \exp\left(-\frac{1}{2}(x-\mu_1)^T \Sigma^{-1}(x-\mu_1)\right) \tag{7}$$
Given $m$ sample data $\{(x^{(i)}, y^{(i)})\}_{i=1,\cdots,m}$, the log-likelihood is defined as
$$\ell(\psi, \mu_0, \mu_1, \Sigma) = \log \prod_{i=1}^{m} p_{X,Y}(x^{(i)}, y^{(i)}; \psi, \mu_0, \mu_1, \Sigma) = \log \prod_{i=1}^{m} p_{X|Y}(x^{(i)} \mid y^{(i)}; \mu_0, \mu_1, \Sigma)\, p_Y(y^{(i)}; \psi) = \sum_{i=1}^{m} \log p_{X|Y}(x^{(i)} \mid y^{(i)}; \mu_0, \mu_1, \Sigma) + \sum_{i=1}^{m} \log p_Y(y^{(i)}; \psi) \tag{8}$$
where $\psi$, $\mu_0$, $\mu_1$, and $\Sigma$ are parameters. Substituting Eq. (5)∼(7) into Eq. (8) gives us a full expression of $\ell(\psi, \mu_0, \mu_1, \Sigma)$:
$$\ell(\psi, \mu_0, \mu_1, \Sigma) = \sum_{i=1}^{m} \log p_{X|Y}(x^{(i)} \mid y^{(i)}; \mu_0, \mu_1, \Sigma) + \sum_{i=1}^{m} \log p_Y(y^{(i)}; \psi) = \sum_{i: y^{(i)}=0} \log \left( \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left( -\frac{1}{2}(x^{(i)}-\mu_0)^T \Sigma^{-1}(x^{(i)}-\mu_0) \right) \right) + \sum_{i: y^{(i)}=1} \cdots$$
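Maximizing this log-likelihood yields the familiar closed-form estimates (a standard result quoted here for context; the full derivation continues in the source notes):
$$\psi = \frac{1}{m}\sum_{i=1}^{m} \mathbb{1}\{y^{(i)} = 1\}, \qquad \mu_k = \frac{\sum_{i=1}^{m} \mathbb{1}\{y^{(i)} = k\}\, x^{(i)}}{\sum_{i=1}^{m} \mathbb{1}\{y^{(i)} = k\}}, \; k \in \{0, 1\}, \qquad \Sigma = \frac{1}{m}\sum_{i=1}^{m} \big(x^{(i)} - \mu_{y^{(i)}}\big)\big(x^{(i)} - \mu_{y^{(i)}}\big)^T$$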
Lecture Notes on Support Vector Machine
… margin of $x_0$ (with respect to the hyperplane $\omega^T x + b = 0$). Now, given a set of $m$ training data $\{(x^{(i)}, y^{(i)})\}_{i=1,\cdots,m}$, we first assume that they are linearly separable. Specifically, there exists a … with $y^{(i)} = 1$, while $\omega^T x^{(i)} + b \leq 0$ for $\forall i$ with $y^{(i)} = -1$. As shown in Fig. 1, for $\forall i = 1, \cdots, m$, we can calculate its margin as
$$\gamma^{(i)} = y^{(i)} \left( \left( \frac{\omega}{\|\omega\|} \right)^T x^{(i)} + \frac{b}{\|\omega\|} \right) \tag{5}$$
With respect to the …
… $\cdots, k$, $A\omega - b = 0$ … if it is strictly feasible, i.e., $\exists\, \omega \in \operatorname{relint}\mathcal{D}: g_i(\omega) < 0,\ i = 1, \cdots, m,\ A\omega = b$. A detailed proof of the above theorem can be found in Prof. Boyd and Prof. Vandenberghe's Convex Optimization …
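For context, the margin in Eq. (5) feeds the standard max-margin formulation (a well-known equivalence, not part of the snippet): maximize the worst-case geometric margin, which after fixing the functional margin to 1 becomes the usual quadratic program:
$$\max_{\gamma,\,\omega,\,b} \; \gamma \quad \text{s.t.} \; y^{(i)}\,\frac{\omega^T x^{(i)} + b}{\|\omega\|} \geq \gamma, \; i = 1, \cdots, m \qquad\Longleftrightarrow\qquad \min_{\omega,\,b} \; \frac{1}{2}\|\omega\|^2 \quad \text{s.t.} \; y^{(i)}\big(\omega^T x^{(i)} + b\big) \geq 1, \; i = 1, \cdots, m$$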