动手学深度学习 v2.0 (Dive into Deep Learning v2.0) (797 pages, 29.45 MB, 1 year ago)

… and provide a forum for discussion [2]. Although our system is not yet perfect, these choices strike a good compromise among competing concerns. We believe this may be the first book published with such an integrated workflow. [1] http://distill.pub [2] http://discuss.d2l.ai

Learning by doing. Many textbooks teach a series of topics, each in great detail. For example, Chris Bishop's excellent textbook (Bishop, 2006) …

import re  # needed by read_time_machine; d2l is the book's utility package

def read_time_machine():
    """Load the time machine dataset into a list of text lines."""
    with open(d2l.download('time_machine'), 'r') as f:
        lines = f.readlines()
    return [re.sub('[^A-Za-z]+', ' ', line).strip().lower() for line in lines]

lines = read_time_machine()
print(f'# …

… the loss is averaged over the minibatch samples, so the gradient in the optimization algorithm does not need to be divided by the batch size.

def sgd(params, states, hyperparams):
    for p in params:
        p.data.sub_(hyperparams['lr'] * p.grad)
        p.grad.data.zero_()

Below we implement a generic training function for use by the other optimization algorithms introduced later in this chapter. It initializes a linear regression model …
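As a quick check that the `sgd` routine above actually optimizes, here is a self-contained toy run; the one-weight regression target y = 2x and the learning rate are illustrative assumptions, not from the book:

```python
import torch

def sgd(params, states, hyperparams):
    """Minibatch SGD as defined in the book: the loss is already averaged
    over the batch, so the gradient is not divided by the batch size."""
    for p in params:
        p.data.sub_(hyperparams['lr'] * p.grad)
        p.grad.data.zero_()

# Toy problem: fit y = 2x with a single weight.
w = torch.zeros(1, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])
y = 2.0 * x

for _ in range(200):
    loss = ((w * x - y) ** 2).mean()
    loss.backward()
    sgd([w], None, {'lr': 0.05})

print(round(w.item(), 2))  # → 2.0
```

Because the loss uses `.mean()`, the gradient is already batch-averaged, which is exactly why `sgd` needs no division by the batch size.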
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures (53 pages, 3.92 MB, 1 year ago)

… strong semi-supervised learners. Advances in Neural Information Processing Systems, 33, 22243-22255.
[17] A head is a trainable sub-network that takes in the output of the base model (in this case the frozen …
[22] Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).
Mathematically, we are given a pair of sequences … with shapes (n, d) and (m, d), where …
"Character-level convolutional networks for text classification." Advances in Neural Information Processing Systems 28 (2015): 649-657.
We vectorize the input samples using a TextVectorization layer. This layer collects …
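To make the shape bookkeeping concrete (n queries of dimension d attending over m keys/values), here is a hedged numpy sketch of scaled dot-product attention; it illustrates the Vaswani et al. formulation, not the book's own code:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d) queries; K: (m, d) keys; V: (m, v) values.
    Returns an (n, v) matrix of attention-weighted values."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, m) similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # (n, v)

rng = np.random.default_rng(0)
n, m, d, v = 2, 5, 4, 3
out = scaled_dot_product_attention(rng.normal(size=(n, d)),
                                   rng.normal(size=(m, d)),
                                   rng.normal(size=(m, v)))
print(out.shape)  # → (2, 3)
```

Note that the query count n and the key/value count m may differ; only the feature dimension d must match between Q and K.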
深度学习与PyTorch入门实战 - 34. 动量与lr衰减 (Hands-On Deep Learning with PyTorch, Lesson 34: Momentum and Learning Rate Decay) (14 pages, 816.20 KB, 1 year ago)

Momentum and learning-rate decay. Presenter: 龙良曲 (Long Liangqu).
Tricks:
▪ momentum
▪ learning rate decay
Momentum: https://distill.pub/2017/momentum/ (no momentum vs. with appropriate momentum)
Learning rate tuning; learning rate decay scheme.
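The two tricks from this lecture combine naturally in PyTorch. A minimal sketch follows; the dummy model, lr=0.1, momentum=0.9, and the StepLR schedule are illustrative choices, not the lecture's exact settings:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Decay the learning rate by 10x every 30 scheduler steps.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(60):
    x = torch.randn(8, 10)
    loss = model(x).pow(2).mean()  # dummy objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()               # momentum-accelerated SGD update
    scheduler.step()               # lr decays after steps 30 and 60

print(scheduler.get_last_lr())     # lr decayed from 0.1 toward 0.001
```

Momentum smooths the update direction across steps, while the scheduler shrinks the step size as training progresses; the two are orthogonal and commonly used together.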
【PyTorch深度学习-龙龙老师】-测试版202112 (PyTorch Deep Learning, by Teacher Longlong, preview edition 2021-12) (439 pages, 29.91 MB, 1 year ago)

… d'Alché-Buc, F., Fox, E., & Garnett, R. (Eds.), Advances in Neural Information Processing Systems 32 (pp. 8024-8035). Curran Associates, Inc. Retrieved from http://papers.neurips.cc/paper/9015-pytorch-an

Basic math functions: this section systematically introduces the common math functions in PyTorch.

4.9.1 Addition, subtraction, multiplication, and division. These are the most basic math operations, implemented by the torch.add, torch.sub, torch.mul, and torch.div functions respectively. PyTorch also overloads the +, -, *, and / operators, and using the operators directly is usually recommended because it is simple and clear.

References:
… USA, 2011.
[3] J. Mizera-Pietraszko and P. Pichappan, Lecture Notes in Real-Time Intelligent Systems, Springer International Publishing, 2017.
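A tiny sketch verifying the claim above that the overloaded operators and the named functions compute the same thing:

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])

# The +, -, *, / operators are overloaded to the corresponding functions.
assert torch.equal(a + b, torch.add(a, b))
assert torch.equal(a - b, torch.sub(a, b))
assert torch.equal(a * b, torch.mul(a, b))
assert torch.equal(a / b, torch.div(a, b))

print((a + b).tolist())  # → [5.0, 7.0, 9.0]
```

All four operations are elementwise, so the two tensors must have the same shape or be broadcastable.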
PyTorch Release Notes (365 pages, 2.94 MB, 1 year ago)

The method implemented in your system depends on the DGX OS version that you installed (for DGX systems), the NGC Cloud Image that was provided by a Cloud Service Provider, or the software that you installed …

‣ OpenMPI 4.1.4+
‣ GDRCopy 2.3
‣ TensorBoard 2.9.0
‣ Nsight Compute 2023.1.1.4
‣ Nsight Systems 2023.2.3.1001
‣ NVIDIA TensorRT™ 8.6.1.6
‣ Torch-TensorRT 1.5.0.dev0
‣ NVIDIA DALI® 1.27.0

‣ OpenMPI 4.1.4+
‣ GDRCopy 2.3
‣ TensorBoard 2.9.0
‣ Nsight Compute 2023.1.1.4
‣ Nsight Systems 2023.2.3.1001
‣ NVIDIA TensorRT™ 8.6.1.6
‣ Torch-TensorRT 1.5.0.dev0
‣ NVIDIA DALI® 1.26.0
《TensorFlow 快速入门与实战》1 - TensorFlow初印象 (TensorFlow Quick Start and Practice, Lesson 1: First Impressions of TensorFlow) (34 pages, 35.16 MB, 1 year ago)

Jeff Dean, Google Brain Team, "Building Intelligent Systems with Large Scale Deep Learning"
1990s … [slide captions garbled in extraction] … Google … TensorFlow …
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques (34 pages, 3.18 MB, 1 year ago)

… John Denker, and Sara Solla. "Optimal brain damage." Advances in Neural Information Processing Systems 2 (1989). As you can deduce, the parameter changes the influence of the previous value of momentum …
"Deconstructing lottery tickets: Zeros, signs, and the supermask." Advances in Neural Information Processing Systems 32 (2019).
[10] Liu, Zhuang, et al. "Rethinking the value of network pruning." arXiv preprint arXiv:1810…
… weights and connections for efficient neural network." Advances in Neural Information Processing Systems 28 (2015).
[7] Dettmers, Tim, and Luke Zettlemoyer. "Sparse networks from scratch: Faster training …
Lecture 1: Overview (57 pages, 2.41 MB, 1 year ago)

Fellow, National University of Singapore, Singapore. Research interests: distributed algorithms and systems, wireless networks, mobile computing, Internet of Things.
Feng Li (SDU), Overview, September 6, 2023
Why Do We Need Machine Learning? Develop systems that are too difficult or expensive to construct manually because they require specific, detailed skills or knowledge tuned to a particular task (the knowledge-engineering bottleneck). Develop systems that can automatically adapt and customize themselves to individual users, e.g., a personalized news or mail filter.
keras tutorial (98 pages, 1.57 MB, 1 year ago)

… function. The Sequential model also exposes the Model class, so customized models can be created as well; we can use the sub-classing concept to build our own complex models.
Functional API: the Functional API is basically used to create …
… create our own customized layers. A customized layer can be created by sub-classing the Keras Layer class, much like sub-classing Keras models.
Core Modules: Keras also provides a lot of built-in …
… learn how to create a new layer in this chapter. Keras provides a base layer class, Layer, which can be sub-classed to create our own customized layer. Let us create a simple layer which will find the weight based …
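A hedged sketch of the sub-classing pattern the tutorial describes; the layer name `SimpleDense` and its matmul-by-a-trainable-weight behavior are illustrative assumptions, not the tutorial's exact example:

```python
import tensorflow as tf
from tensorflow import keras

class SimpleDense(keras.layers.Layer):
    """A customized layer created by sub-classing the Keras Layer class."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # The trainable weight is created lazily, once the input shape is known.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='glorot_uniform',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w)

layer = SimpleDense(4)
out = layer(tf.zeros((2, 8)))   # build() runs on the first call
print(out.shape)  # → (2, 4)
```

Overriding `build` (weight creation) and `call` (forward computation) is the core of the pattern; Keras wires the rest (training flags, serialization) through the base class.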
从推荐模型的基础特点看大规模推荐类深度学习系统的设计 袁镱 (Designing Large-Scale Deep Learning Systems for Recommendation from the Basic Characteristics of Recommendation Models, by Yuan Yi) (22 pages, 6.76 MB, 1 year ago)

… Partitions for Memory-Efficient Recommendation Systems
Twitter [RecSys21]: "Model Size Reduction Using Frequency Based Double Hashing for Recommender Systems"
A key space of tens of millions is indexed via hash1(key) and hash2(key); industry approach: double hashing …
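The double-hashing idea from the slide (index two small embedding tables via hash1(key) and hash2(key) instead of one huge table) can be sketched in plain Python; the bucket count, embedding dimension, and summing of the two sub-embeddings are assumptions for illustration:

```python
import numpy as np

NUM_BUCKETS = 1_000   # each table is far smaller than the raw key space
DIM = 8

rng = np.random.default_rng(0)
table1 = rng.normal(size=(NUM_BUCKETS, DIM))
table2 = rng.normal(size=(NUM_BUCKETS, DIM))

def embed(key: str) -> np.ndarray:
    """Two independent hashes index two tables; a collision in one table
    is usually disambiguated by the other, shrinking total memory."""
    h1 = hash(('salt-1', key)) % NUM_BUCKETS
    h2 = hash(('salt-2', key)) % NUM_BUCKETS
    return table1[h1] + table2[h2]

v = embed('user:42')
print(v.shape)  # → (8,)
```

Two tables of size B can represent up to B² distinct (h1, h2) pairs, which is why this reduces model size so sharply compared with one table per raw key.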
共 22 条
- 1
- 2
- 3













