动手学深度学习 v2.0 (Dive into Deep Learning v2.0)
…the mathematical name. First, we can use arange to create a row vector x. This row vector contains the first 12 integers starting from 0; they are created as integers by default, though a floating-point type can also be specified. Each value in a tensor is called an element of the tensor; for example, the tensor x has 12 elements. Unless otherwise specified, a new tensor is stored in main memory and computed on the CPU.

x = torch.arange(12)
x
tensor([ 0, 1, 2, …

…denotes a binary scalar operator, meaning the function takes two inputs and produces one output. Given any two vectors u and v of the same shape and a binary operator f, we can obtain a vector c = F(u, v) by computing c_i ← f(u_i, v_i), where c_i, u_i, and v_i are the elements of c, u, and v. Here we lift the scalar function into an elementwise vector operation, producing the vector-valued function F: R^d, R^d → R^d. For any tensors of the same shape, the common standard arithmetic operators (…

[Preliminaries] (tensor(5.), tensor(6.), tensor(1.5000), tensor(9.))

2.3.2 Vectors. A vector can be viewed as a list of scalar values. These scalars are called the elements (or components) of the vector. When a vector represents a sample in a dataset, its values carry real-world meaning. For example, if we were training a model to predict loan-default risk, we might associate each applicant with a vector whose components correspond to their income, years of employment, number of past defaults, and other factors…

797 pages | 29.45 MB | 1 year ago
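The excerpt above describes lifting a binary scalar operator f into an elementwise vector operation c_i ← f(u_i, v_i). A minimal pure-Python sketch of that idea (plain lists stand in for PyTorch tensors; the helper names are my own, not from the book):

```python
# Lift a binary scalar operator f into an elementwise vector operator F,
# mirroring c_i <- f(u_i, v_i) from the excerpt.
def lift(f):
    def F(u, v):
        assert len(u) == len(v), "operands must have the same shape"
        return [f(ui, vi) for ui, vi in zip(u, v)]
    return F

add = lift(lambda a, b: a + b)
mul = lift(lambda a, b: a * b)

u = list(range(12))   # analogue of torch.arange(12)
v = [2] * 12
print(add(u, v))      # [2, 3, 4, ..., 13]
print(mul(u, v))      # [0, 2, 4, ..., 22]
```

In PyTorch itself, the standard operators +, -, *, / apply elementwise to same-shaped tensors, so no explicit lifting is needed.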
AI大模型千问 qwen 中文文档 (Qwen large-model Chinese documentation)
…qwen7b -f Modelfile. Once this completes, you can run your ollama model:

ollama run qwen7b

1.6 Text Generation Web UI. Text Generation Web UI (TGW, commonly known as "oobabooga") is a popular web interface for text generation, similar to AUTOMATIC1111/stable-diffusion-webui …

Qwen1.5-7B-Chat
├── config.json
├── generation_config.json
(continued on the next page)
├── model-00001-of-00004.safetensor
├── model-00002-of-00004.safetensor
…

Then run python server.py to start the web service. Open `http://localhost:7860/?__theme=dark` and enjoy the Qwen web UI! 1.6.2 Next steps. TGW has many more uses: you can even enjoy role-playing in it and use different kinds of quantized models. You can also train algorithms such as LoRA and combine them with Stable Diffusion…

56 pages | 835.78 KB | 1 year ago
Lecture 2: Linear Regression
…the directional derivative of f in the direction of u can be represented as

∇_u f(x) = Σ_{i=1}^n f'_i(x) · u_i

Proof. Letting g(h) … by the chain rule,

g'(h) = Σ_{i=1}^n f'_i(x) · d/dh (x_i + h·u_i) = Σ_{i=1}^n f'_i(x) · u_i    (2)

Let h = 0; then g'(0) = Σ_{i=1}^n f'_i(x) · u_i, and substituting this into (1) completes the proof. (Feng Li (SDU), Linear Regression, September 13, 2023, slide 11/31)

31 pages | 608.38 KB | 1 year ago
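The identity in the excerpt, ∇_u f(x) = Σ_i f'_i(x)·u_i, can be checked numerically by comparing the analytic sum against a finite-difference estimate of g'(0) where g(h) = f(x + h·u). The test function f(x) = Σ x_i² (with partials f'_i(x) = 2x_i) is my own choice, not from the lecture:

```python
# Verify the directional-derivative formula for f(x) = sum(x_i^2),
# whose partial derivatives are f'_i(x) = 2 * x_i.
def f(x):
    return sum(xi * xi for xi in x)

def directional_derivative(f, x, u, h=1e-6):
    # Central difference of g(h) = f(x + h*u) at h = 0.
    fp = f([xi + h * ui for xi, ui in zip(x, u)])
    fm = f([xi - h * ui for xi, ui in zip(x, u)])
    return (fp - fm) / (2 * h)

x = [1.0, 2.0, 3.0]
u = [0.5, -1.0, 2.0]
analytic = sum(2 * xi * ui for xi, ui in zip(x, u))  # sum of f'_i(x) * u_i
numeric = directional_derivative(f, x, u)
print(analytic, numeric)  # both close to 9.0
```

The two values agree to the accuracy of the finite-difference step, as the proof via the chain rule predicts.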
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
Recurrent Neural Network. The image on the left shows a recurrent cell processing the input-sequence element at time step t. The image on the right explains the processing of the entire input sequence across … indicates that the first position in the English sequence has a strong relationship with the first element in the Spanish sequence. That makes sense because typically the sentences in both languages … sequences. It takes into account, for instance, the relationship between the first element in the first sequence and the last element in the second sequence. Hence, it addresses the limitations of RNNs with long…

53 pages | 3.92 MB | 1 year ago
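The excerpt contrasts an RNN, which consumes elements strictly in order, with a mechanism that scores the relationship between every (i, j) pair across two sequences. A toy dot-product attention-score sketch of that all-pairs idea (the two 2-token "sequences" are my own made-up example):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_scores(q_seq, k_seq):
    # Dot-product score between every query position i and key position j,
    # normalized per query row; every pair is compared, regardless of order.
    raw = [[sum(qd * kd for qd, kd in zip(q, k)) for k in k_seq] for q in q_seq]
    return [softmax(row) for row in raw]

english = [[1.0, 0.0], [0.0, 1.0]]   # toy 2-token query sequence
spanish = [[1.0, 0.0], [0.0, 1.0]]   # toy 2-token key sequence
scores = attention_scores(english, spanish)
print(scores)  # position 0 attends most strongly to position 0
```

Each row of the score matrix sums to 1, and aligned positions receive the largest weight, mirroring the English/Spanish alignment described in the excerpt.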
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation
…consists of all the combinations of valid hyperparameter values. Each trial is configured with an element from the trial set. After all the trials are complete, we pick the one with the best results. The … convolution are two different choices of primitive operations. The combination operation has two choices: elementwise addition or concatenation of the outputs of the primitive operations. The concatenation operation … count=1), ] The STATE_SPACE has three components to mimic the NASNet search space. The hidden_state element can take two values, representing the two input hidden states of a cell. The primitives and the combination…

33 pages | 2.48 MB | 1 year ago
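The first part of the excerpt describes exhaustive grid search: each trial takes one element of the Cartesian product of valid hyperparameter values, and the best trial wins. A hedged sketch (the search space and the stand-in objective are hypothetical; a real trial would train and evaluate a model):

```python
from itertools import product

search_space = {
    "learning_rate": [0.1, 0.01],
    "batch_size": [32, 64],
}

def run_trial(config):
    # Hypothetical objective standing in for a real training run:
    # prefers learning_rate 0.01 and the smaller batch size.
    return -abs(config["learning_rate"] - 0.01) - config["batch_size"] / 1000

# One trial per element of the Cartesian product of the value lists.
keys = list(search_space)
trials = [dict(zip(keys, values)) for values in product(*search_space.values())]
best = max(trials, key=run_trial)
print(len(trials), best)  # 4 trials; best is lr=0.01, batch_size=32
```

With 2 values for each of 2 hyperparameters, the trial set has 2 × 2 = 4 elements; the count grows multiplicatively with every added hyperparameter, which is why grid search is usually reserved for small spaces.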
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques
…0, 3)
print(x_q)

This returns the following result: [0 1 2 3 4 5 6 7 7]. Table 2-2 shows the elementwise comparison of x and x_q:

x:   -10.0  -7.5  -5.0  -2.5  0.0  2.5  5.0  7.5  10.0
x_q:     0     1     2     3    4    5    6    7     7

Table … learning conventionally. We receive this dequantized array upon running the code. Note that the last element was supposed to be 10.0, so the error is 2.5: array([-10., -7.5, -5., -2.5, 0., 2.5, … The result of the operation (XW + b) is [batch size, D2]. σ is a nonlinear function that is applied elementwise to the result of (XW + b). Some examples of nonlinear functions are ReLU (ReLU(x) = x if…

33 pages | 1.96 MB | 1 year ago
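The quantization round-trip in the excerpt can be reproduced with a small affine quantizer. The scale (2.5), zero point (4), and 3-bit clipping range [0, 7] are inferred from the printed arrays, not stated explicitly in the excerpt:

```python
# Affine quantization round-trip: q = clip(round(x / scale) + zero_point),
# x_hat = (q - zero_point) * scale. Parameters inferred from the example.
scale, zero_point, qmin, qmax = 2.5, 4, 0, 7

def quantize(xs):
    return [min(max(round(x / scale) + zero_point, qmin), qmax) for x in xs]

def dequantize(qs):
    return [(q - zero_point) * scale for q in qs]

x = [-10.0, -7.5, -5.0, -2.5, 0.0, 2.5, 5.0, 7.5, 10.0]
x_q = quantize(x)        # [0, 1, 2, 3, 4, 5, 6, 7, 7]
x_dq = dequantize(x_q)   # last value clips to 7.5 instead of 10.0
print(x_q)
print(x_dq)
```

Only the last element suffers: 10.0 rounds to level 8, which is clipped to 7, so dequantization returns 7.5 and the error is exactly 2.5, matching the excerpt.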
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques
…list all the centroids in a codebook and replace each element in our tensor with the index of the centroid in the codebook closest to that element. The decoding process simply requires replacing the … centroid (typically for floating-point values), the codebook will cost us … bytes to store. For each tensor element we will now store only the index of its centroid in the codebook, which will take up only … bits. … the within-cluster sum of squares (WCSS) metric. Here, we are trying to find a set … which has … centroids, such that for each element that is closest to a centroid in …, the sum of the squared distances between every such weight…

34 pages | 3.18 MB | 1 year ago
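The encode/decode scheme in the excerpt can be sketched in a few lines: encoding replaces each weight with the index of its nearest centroid, and decoding looks the centroid back up. The three-centroid codebook and the example weights below are my own toy values:

```python
# Codebook quantization: store centroid indices instead of the weights.
codebook = [-1.0, 0.0, 1.0]   # toy centroids (e.g. from k-means on the weights)

def encode(weights):
    # Index of the closest centroid for each weight.
    return [min(range(len(codebook)), key=lambda i: abs(w - codebook[i]))
            for w in weights]

def decode(indices):
    # Decoding is a plain table lookup.
    return [codebook[i] for i in indices]

weights = [-0.9, 0.1, 1.2, -0.2]
idx = encode(weights)      # [0, 1, 2, 1]
approx = decode(idx)       # [-1.0, 0.0, 1.0, 0.0]
print(idx, approx)
```

With k centroids, each index needs only ceil(log2(k)) bits (2 bits here) instead of a full float, at the cost of the small reconstruction error visible in approx.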
PyTorch Brand Guidelines
When printing, please use CMYK or the listed Pantone code. For UI button elements, please reference "Color Variations for UI Buttons" to apply the color properly.

12 pages | 34.16 MB | 1 year ago
机器学习课程-温州大学-03深度学习-PyTorch入门 (Machine Learning Course, Wenzhou University, 03 Deep Learning: Introduction to PyTorch)
Elementwise matrix multiplication: torch.mul(). The call torch.mul(mat1, other, out=None) accepts as other either a scalar or a matrix of any dimensions, as long as the final multiplication is broadcastable. … 1. Tensors: tensor multiplication. 5. Two operators, @ and *: @ is matrix multiplication, automatically dispatching to the appropriate matrix-multiplication function; * is elementwise multiplication…

40 pages | 1.64 MB | 1 year ago
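The @-versus-* distinction in the excerpt can be illustrated without PyTorch by implementing both operations on plain nested lists; in PyTorch, the same contrast is torch.matmul (or @) versus torch.mul (or *):

```python
# Elementwise product: c[i][j] = a[i][j] * b[i][j]  (PyTorch: a * b)
def elementwise(a, b):
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Matrix product: c[i][j] = sum_k a[i][k] * b[k][j]  (PyTorch: a @ b)
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(elementwise(A, B))  # [[5, 12], [21, 32]]
print(matmul(A, B))       # [[19, 22], [43, 50]]
```

The shapes also differ in general: elementwise multiplication requires (broadcast-compatible) equal shapes, while @ requires the inner dimensions to match.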
Lecture Notes on Support Vector Machine
…the infimum of a subset S of a partially ordered set T is the greatest element in T that is less than or equal to all elements of S, if such an element exists. More details about the infimum and its counterpart, the supremum…

18 pages | 509.37 KB | 1 year ago
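A standard illustration of the definition above (my example, not from the notes): the infimum need not belong to the set itself.

```latex
% S = { 1/n : n in N } as a subset of T = R. Every element of S is
% positive, yet for any eps > 0 some 1/n < eps, so 0 is the greatest
% lower bound even though 0 is not an element of S:
S = \left\{ \tfrac{1}{n} : n \in \mathbb{N} \right\} \subset \mathbb{R},
\qquad \inf S = 0, \qquad 0 \notin S.
```

This is why optimization statements (such as dual objectives in SVM derivations) are phrased with inf rather than min: the greatest lower bound always exists for a nonempty set of reals bounded below, even when no minimizer does.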
16 results in total