《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
… similar to the baseline, but does so in fewer epochs. We could ideally save an epoch's worth of training time by terminating the training early, if we adopt this hypothetical sample-efficient model training. … effective utilization of the training data. Labeling data is often an expensive process, both in terms of time consumption and fiscal expenditure, because it involves human labelers looking at each example and … the four classes, three of which are the keywords that the device will accept: hello, weather, and time. The fourth class (none) indicates the absence of an acceptable keyword in the input signal. …
56 pages | 18.93 MB | 1 year ago
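The snippet above mentions saving training time by terminating training early once the model matches the baseline. Below is a minimal, hypothetical sketch of that idea using Keras' EarlyStopping callback; the model, the toy 4-class data, and the patience value are illustrative assumptions, not the book's code.

import numpy as np
import tensorflow as tf

# Toy stand-in data for a 4-class task (hello / weather / time / none), purely illustrative.
x_train, y_train = np.random.rand(512, 20), np.random.randint(0, 4, 512)
x_val, y_val = np.random.rand(128, 20), np.random.randint(0, 4, 128)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Stop training as soon as validation accuracy stops improving, keeping the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=2,
                                              restore_best_weights=True)
model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=50,
          callbacks=[early_stop], verbose=0)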
PyTorch Release Notes
… tested against each NGC monthly container release to ensure consistent accuracy and performance over time. ‣ ResNeXt101-32x4d model: This model was introduced in the Aggregated Residual Transformations for … ‣ … leverages mixed precision arithmetic by using Tensor Cores on NVIDIA V100 GPUs for 1.3x faster training time while maintaining target accuracy. This model script is available on GitHub and NGC. ‣ Tacotron …
365 pages | 2.94 MB | 1 year ago
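The release note above credits Tensor Cores and mixed precision arithmetic for roughly 1.3x faster training. A minimal sketch of one mixed-precision training step with PyTorch's automatic mixed precision (AMP) API follows; the linear model, toy batch, and hyperparameters are placeholder assumptions, and the snippet falls back to ordinary float32 when no GPU is present.

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(128, 10).to(device)               # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

inputs = torch.randn(32, 128, device=device)               # toy batch
targets = torch.randint(0, 10, (32,), device=device)

optimizer.zero_grad()
# Run the forward pass in float16 where it is numerically safe; Tensor Cores
# pick up the half-precision matrix multiplies on supported GPUs.
with torch.cuda.amp.autocast(enabled=(device == "cuda")):
    loss = loss_fn(model(inputs), targets)
# Scale the loss to avoid float16 gradient underflow, then unscale and step.
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()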
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation
… model, input data and the hyperparameter trial set is ready. Let's go ahead and train the model, each time choosing one item from the trial set. Each model is trained for 2000 iterations. At the end of a trial … on the hyperparameters for the final training. For large models, this is very expensive in terms of time and resources. Alternatively, we can base the search approach on the budget allocation to cap the … [… 24s] val_accuracy: 0.6313725709915161 | Best val_accuracy So Far: 0.7284313440322876 | Total elapsed time: 00h 17m 23s | Results summary | Results in hpo/hyperband | Showing 3 best trials | Trial summary | Hyperparameters: …
33 pages | 2.48 MB | 1 year ago
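The trial log above ("Results in hpo/hyperband", "Showing 3 best trials") matches the output format of a KerasTuner Hyperband search, which caps the training budget per trial and promotes only promising configurations. The sketch below shows such a search end to end; the search space, dataset, and budget values are illustrative assumptions rather than the book's exact setup.

import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # Hypothetical search space: hidden-layer width and learning rate.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(hp.Int("units", 32, 256, step=32), activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(hp.Choice("lr", [1e-2, 1e-3, 1e-4])),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

# Hyperband gives many trials a small epoch budget and keeps only the best performers.
tuner = kt.Hyperband(build_model, objective="val_accuracy", max_epochs=10, factor=3,
                     directory="hpo", project_name="hyperband")

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
tuner.search(x_train / 255.0, y_train, validation_split=0.2, verbose=0)
tuner.results_summary(num_trials=3)   # prints a "Showing 3 best trials" summary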
机器学习课程-温州大学-时间序列总结 (Machine Learning Course, Wenzhou University: Time Series Summary)
… A list of datetime objects can also be passed to the index parameter, which likewise creates a Series with a timestamp index: date_list = [datetime(2018, 1, 1), datetime(2018, 1, 15), …]; time_se = pd.Series(np.arange(6), index=date_list) … Creating a time series: a DataFrame can be given a timestamp index in the same way: time_df = pd.DataFrame(data_demo, index=date_list) … Selecting a subset with a timestamp index: the simplest way is to use a positional index directly, e.g. time_se[3] … A date built with datetime can also be used to fetch the corresponding value: date_time = datetime(2015, 6, 1); date_se[date_time] … A date string in any parseable format works directly as an index: date_se['20150530'], date_se['2018/01/23'] … To get the data for a given year or month, index with that year or month directly: date_se['2015']
67 pages | 1.30 MB | 1 year ago
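As a consolidated, runnable version of the slide fragments above (a sketch; the dates, values, and variable names are illustrative and only loosely follow the slides):

from datetime import datetime
import numpy as np
import pandas as pd

# A Series with a timestamp index built from a list of datetime objects.
date_list = [datetime(2018, 1, 1), datetime(2018, 1, 15), datetime(2018, 2, 1),
             datetime(2018, 2, 15), datetime(2018, 3, 1), datetime(2018, 3, 15)]
time_se = pd.Series(np.arange(6), index=date_list)

# A DataFrame can use the same timestamp index.
time_df = pd.DataFrame({"value": np.arange(6)}, index=date_list)

print(time_se.iloc[3])                    # by position (the slides write time_se[3])
print(time_se[datetime(2018, 2, 1)])      # by a datetime object
print(time_se["2018/01/15"])              # by a parseable date string
print(time_se["2018"])                    # all entries from a given year
print(time_df.loc["2018-02"])             # all rows from a given month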
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques
… "I have made this longer than usual because I have not had time to make it shorter." (Blaise Pascal) In the last chapter, we discussed a few ideas to improve the deep … of the simplest approaches towards efficiency is compression to reduce data size. For the longest time in the history of computing, scientists have worked tirelessly towards storing and transmitting information … Table 2-1: Footprint Metrics (Model Size, Inference Latency on Target Device, Training Time for Convergence, Peak RAM Consumption) vs. Quality Metrics (Accuracy, Precision, Recall, F1, AUC).
33 pages | 1.96 MB | 1 year ago
动手学深度学习 v2.0 (Dive into Deep Learning, v2.0)
… import math import os import random import re import shutil import sys import tarfile import time import zipfile from collections import defaultdict import pandas as pd import requests from IPython … To achieve this, we need to vectorize the computation, leveraging a linear algebra library instead of writing costly for-loops in Python. %matplotlib inline import math import time import numpy as np import torch from d2l import torch as d2l … To illustrate why vectorization matters so much, we consider two ways of adding vectors; we instantiate two 10000-dimensional vectors of all ones. … [] self.start() def start(self): """Start the timer.""" self.tik = time.time() def stop(self): """Stop the timer and record the elapsed time in a list.""" self.times.append(time.time() - self.tik) return self.times[-1] def avg(self): """Return the average time.""" …
797 pages | 29.45 MB | 1 year ago
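The last fragment above is part of the book's Timer helper, used to compare a Python for-loop against a vectorized addition. A self-contained version of that comparison is sketched below (docstrings translated; minor details such as the printout are assumptions rather than the book's exact code):

import time
import numpy as np

class Timer:
    """Record multiple running times."""
    def __init__(self):
        self.times = []
        self.start()

    def start(self):
        """Start the timer."""
        self.tik = time.time()

    def stop(self):
        """Stop the timer and record the elapsed time in a list."""
        self.times.append(time.time() - self.tik)
        return self.times[-1]

    def avg(self):
        """Return the average time."""
        return sum(self.times) / len(self.times)

# Add two 10000-dimensional all-ones vectors: once element by element in a
# Python loop, once with a single vectorized NumPy call.
n = 10000
a, b = np.ones(n), np.ones(n)

timer = Timer()
c = np.zeros(n)
for i in range(n):
    c[i] = a[i] + b[i]
loop_time = timer.stop()

timer.start()
d = a + b
vector_time = timer.stop()
print(f"loop: {loop_time:.5f} s, vectorized: {vector_time:.5f} s")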
深度学习下的图像视频处理技术-沈小勇 (Deep Learning for Image and Video Processing, Xiaoyong Shen)
… ×4 comparisons; per-frame running times at scale factor 4×: BayesSR [Liu et al., 2011]: 2 hours/frame (31 frames); MFSR [Ma et al., 2015]: 10 min/frame; […, 2015]: 8 min/frame (31 frames); VSRNet [Kappeler et al., 2016]: 40 s/frame (5 frames); Ours (5 frames): 0.19 s/frame; Ours (3 frames): 0.14 s/frame. … More Results … Summary: end-to-end & fully scalable; new SPMC layer …
121 pages | 37.75 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review
… on any of the footprint metrics. These techniques might get superseded by other, better methods over time, but again our goal is to give you a gentle introduction to this area for you to be able to research … compute efficiency, since we only have to train the model on a small number of examples, saving training-time compute too. A Typical Self-Supervised Learning Recipe: we can break down common self-supervised … is also used in BERT (Devlin et al.), and other related models like GPT, RoBERTa, T5, etc. At the time of its publishing, BERT beat the state-of-the-art on eleven NLU tasks. A critical point to note is …
31 pages | 4.03 MB | 1 year ago
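The snippet refers to BERT-style self-supervised pretraining, whose core pretext task is masked language modeling: randomly hide tokens and train the model to predict them. Below is a minimal sketch of that masking step; the 15% masking rate and the 80/10/10 replacement split follow the BERT paper, while MASK_ID, VOCAB_SIZE, and the sample token ids are hypothetical placeholders.

import random

MASK_ID = 103          # hypothetical id of the [MASK] token
VOCAB_SIZE = 30522     # hypothetical vocabulary size

def mask_tokens(token_ids, mask_prob=0.15):
    """Return (masked inputs, labels) for one masked-language-modeling example.

    Labels are -100 (ignored by the loss) everywhere except masked positions,
    which hold the original token id that the model must predict.
    """
    inputs, labels = list(token_ids), [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if random.random() < mask_prob:
            labels[i] = tok
            r = random.random()
            if r < 0.8:                          # 80%: replace with [MASK]
                inputs[i] = MASK_ID
            elif r < 0.9:                        # 10%: replace with a random token
                inputs[i] = random.randrange(VOCAB_SIZE)
            # remaining 10%: keep the original token unchanged
    return inputs, labels

masked_inputs, labels = mask_tokens([7592, 2088, 2003, 1037, 3231])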
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction
… recommendations that show up are based on your past interests, what is popular with other users at that time, and so on. If you have seen ‘The Office’ many times over like me, there are chances you might like … Conference Proceedings, 2011. Figure 1-2: Growth of parameters in Computer Vision and NLP models over time. (Data Source) We have seen a similar effect in the world of Natural Language Processing (NLP) (see … or automatically. These models also often have billions (or trillions) of parameters. At the same time, the incredible performance of these models also drives the demand for applying them on new tasks …
21 pages | 3.17 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
… downstream tasks, which gives them a boost in quality and drastically reduces the training data size and time required. The quality of the embeddings primarily depends on the following two factors: the number of dimensions … 560000 dbpedia_csv/train.csv, 70000 dbpedia_csv/test.csv. It all looks good! Now, it’s time to put our theory into practice. Even though we are going to use pre-trained embeddings, we will roughly … audio, and video domains that are ready to deploy. For instance, you should not spend resources and time training your own ResNet model. Instead, you can directly get the model architecture and weights from …
53 pages | 3.92 MB | 1 year ago
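The last fragment above notes that, instead of training a ResNet from scratch, the architecture and weights can be fetched ready-made. A minimal sketch of doing so with torchvision follows (one reasonable source among several; the book may point to a different model hub, and the API shown assumes torchvision 0.13 or newer):

import torch
import torchvision

# Download a ResNet-50 together with ImageNet-pretrained weights.
weights = torchvision.models.ResNet50_Weights.IMAGENET1K_V2
model = torchvision.models.resnet50(weights=weights)
model.eval()

# Reuse the preprocessing bundled with the weights so inputs match the model.
preprocess = weights.transforms()
dummy_image = torch.rand(3, 256, 256)             # stand-in for a real image tensor
with torch.no_grad():
    logits = model(preprocess(dummy_image).unsqueeze(0))
print(logits.shape)                                # torch.Size([1, 1000])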













