QCon Beijing 2018 - "Future Cities: Smart Cities and Deep-Learning-Based Machine Vision" - Chen Yuheng (陈宇恒)
… support for scheduling keeps improving, but the surrounding tooling lags behind (e.g. Spark on K8s, GitLab CI) • The container system call stack is deep, so compatibility of the operating system, kernel, and heterogeneous device drivers must be validated carefully • Kubernetes' scheduling support for NUMA, heterogeneous compute, and storage devices still needs strengthening • Kubernetes milestones: 1.6 nvidia/gpu custom scheduler, 1.8 local-volume, 1.10 CPU manager, Device …
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review
… The next subsection, on sharpness-aware minimization, is an interesting peek into how tweaking the objective function can help with generalization. Sharpness-Aware Minimization: Neural networks are universal … help through techniques like momentum to help the optimizer escape local minima. Sharpness-Aware Minimization (SAM)22 is one such technique. It suggests that steep valleys in the objective function …
23 https://en.wikipedia.org/wiki/Occam%27s_razor
22 Foret, Pierre, et al. "Sharpness-Aware Minimization for Efficiently Improving Generalization." arXiv, 3 Oct. 2020, doi:10.48550/arXiv.2010…
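The snippet breaks off before describing the SAM update itself. As a hedged illustration only (not the book's code), the sketch below shows one SAM step in PyTorch, following Foret et al.: ascend to the worst-case weights inside a small ρ-ball, take the gradient there, then apply it at the original weights. The names `model`, `loss_fn`, `rho`, and `lr` are assumptions for the example.

```python
# Minimal single-step SAM sketch (illustrative; after Foret et al., 2020).
import torch

def sam_step(model, loss_fn, inputs, targets, rho=0.05, lr=0.01):
    # First forward/backward pass: gradient at the current weights w.
    loss_fn(model(inputs), targets).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))

    # Ascend to the (approximate) worst-case point w + eps in the rho-ball.
    eps = [rho * g / (grad_norm + 1e-12) for g in grads]
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.add_(e)
    model.zero_grad()

    # Second pass: the "sharpness-aware" gradient, taken at w + eps.
    loss_fn(model(inputs), targets).backward()

    # Undo the perturbation, then descend using the second gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)
            p.sub_(lr * p.grad)
    model.zero_grad()
```

Note the cost this implies: SAM needs two forward/backward passes per update, the usual price quoted for its generalization gains.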
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction
… when a model is trained to predict whether a given tweet contains offensive text, the user should be aware of how many GPUs/TPUs are needed, and for how long, to converge to a good accuracy. Figure 1-3 shows … during the deployment phase, the user will be concerned about inference efficiency and should be aware of the inference latency per tweet, the peak RAM consumption, and other requirements that are …
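The truncated sentence concerns deployment-time metrics (latency per tweet, peak RAM). As a hedged sketch of how such numbers could be measured at the Python level, assuming `model` is any callable and `tweet_batch` a preprocessed input (both names are illustrative, not from the book):

```python
# Rough per-call latency and Python-level peak memory. Caveat: tracemalloc
# does NOT see native tensor allocations; those need a framework profiler.
import time
import tracemalloc

def profile_inference(model, tweet_batch, n_runs: int = 100):
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(n_runs):
        model(tweet_batch)
    latency_ms = (time.perf_counter() - start) / n_runs * 1e3
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return latency_ms, peak_bytes / 2**20  # (ms per call, peak MiB)
```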
Lecture Notes on Support Vector Machine
… problem. We can use Eq. (36) to calculate the optimal value of ω, i.e., ω∗. The question is: knowing ω∗, how do we calculate the optimal value of b, i.e., b∗? Due to the complementary slackness, α∗ …
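The snippet stops mid-derivation. The standard completion via complementary slackness (hard-margin SVM notation assumed; the equation numbering is not the notes' own) is:

```latex
% KKT complementary slackness for the hard-margin SVM:
\alpha_i^*\bigl(y_i(\omega^{*\top} x_i + b^*) - 1\bigr) = 0, \quad i = 1,\dots,m.
% For any support vector k with \alpha_k^* > 0 the constraint is active:
y_k(\omega^{*\top} x_k + b^*) = 1
\;\Longrightarrow\;
b^* = y_k - \omega^{*\top} x_k
\qquad (\text{since } y_k \in \{-1,+1\} \text{ implies } 1/y_k = y_k).
```

In practice b∗ is often averaged over all support vectors for numerical stability.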
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation
… Pareto-optimal child networks by computing their latencies. 7 Tan, Mingxing, et al. "MnasNet: Platform-aware neural architecture search for mobile." Proceedings of the IEEE/CVF Conference on Computer Vision …
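For context on how latency enters the search, the MnasNet paper cited above scores each sampled child network with a latency-penalized reward rather than accuracy alone. A hedged sketch of that soft-constraint reward (w = -0.07 is the exponent reported in the paper; the function and argument names are my own):

```python
# MnasNet-style multi-objective reward: accuracy traded off against latency.
def mnasnet_reward(accuracy: float, latency_ms: float, target_ms: float,
                   w: float = -0.07) -> float:
    # Models slower than the target are penalized smoothly, not rejected.
    return accuracy * (latency_ms / target_ms) ** w

print(mnasnet_reward(0.75, 80.0, 80.0))   # on-target latency: reward = 0.75
print(mnasnet_reward(0.75, 120.0, 80.0))  # 1.5x over target: reward ~ 0.729
```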
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques
… dequantization error would be large for ranges where the data is densely distributed. Quantization-aware training can mitigate some of the losses by making the network resilient to the errors, but if we …
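The quantization-aware training mentioned here is commonly implemented by inserting "fake quantization" into the forward pass, so the network trains against the rounding error it will see after deployment. A minimal sketch, assuming 8-bit affine quantization over the tensor's observed range (not the book's code):

```python
# Quantize-dequantize round trip: values are snapped to the int8 grid but
# returned as floats, so downstream layers see the quantization error.
import numpy as np

def fake_quantize(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin) + 1e-12  # epsilon: constant input
    zero_point = np.round(-x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return ((q - zero_point) * scale).astype(x.dtype)

x = np.random.randn(5).astype(np.float32)
print(x, fake_quantize(x), sep="\n")  # same shape; values land on the grid
```

(In a training framework the backward pass treats this op as identity, i.e., a straight-through estimator.)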