Lecture 4: Regularization and Bayesian Statistics
Feng Li (SDU), September 20, 2023

Outline
1. Overfitting Problem
2. Regularized Linear Regression
3. Regularized Logistic Regression
4. MLE and MAP

Overfitting Problem

Maximum-a-Posteriori Estimation (MAP)

Maximum-a-Posteriori Estimation (MAP): maximize the posterior probability of \theta, i.e., the probability of \theta in light of the observed data D:

\theta_{\text{MAP}} = \arg\max_{\theta} p(\theta \mid D) = \arg\max_{\theta} \frac{p(D \mid \theta)\, p(\theta)}{\int p(D \mid \theta')\, p(\theta')\, d\theta'}

The Bayes rule lets us update our belief about \theta in light of the observed data.

While doing MAP, we usually maximize the log of the posterior probability; since the evidence in the denominator does not depend on \theta, this is equivalent to

\theta_{\text{MAP}} = \arg\max_{\theta} \big[ \log p(D \mid \theta) + \log p(\theta) \big]
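As a worked illustration of why MAP and regularization sit in the same lecture (a standard result; the concrete Gaussian likelihood and prior below are assumptions for the sketch, not recovered from the slides): with observations y_i \sim \mathcal{N}(\theta^\top x_i, \sigma^2) and a zero-mean Gaussian prior \theta \sim \mathcal{N}(0, \tau^2 I), maximizing the log posterior reduces exactly to L2-regularized (ridge) least squares:

% Log posterior under the assumed Gaussian likelihood and Gaussian prior
% (additive constants independent of \theta are dropped):
\theta_{\text{MAP}}
  = \arg\max_{\theta} \Big[ \log p(D \mid \theta) + \log p(\theta) \Big]
  = \arg\max_{\theta} \Big[ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} \big( y_i - \theta^\top x_i \big)^2
      \;-\; \frac{1}{2\tau^2} \|\theta\|_2^2 \Big]
% Multiplying by -2\sigma^2 and turning the max into a min:
  = \arg\min_{\theta} \Big[ \sum_{i=1}^{n} \big( y_i - \theta^\top x_i \big)^2
      + \lambda \|\theta\|_2^2 \Big], \qquad \lambda = \frac{\sigma^2}{\tau^2}.

A tighter prior (smaller \tau^2) yields a larger \lambda and hence stronger regularization, which is the Bayesian reading of the regularized linear regression in the outline.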
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression TechniquesKeeping all that in mind, it is easy to see that floating-point xmin should map to 0, and xmax should map to 2b-1. How do we map the rest of the floating point values in between xmin and xmax to integer continuous values are also clamped to be in the range [xmin, xmax]. Solution: Note that we have to map all the values from [xmin, xmax] to 2b possible values (let’s call them bins). Figure 2-5 shows a visual is referred to as scale). Hence, [xmin, xmin + s) will map to the bin 0, [xmin + s, xmin + 2s) will map to bin 1, [xmin + 2s, xmin + 3s) will map to bin 2, and so on. Thus, to find which bin the given0 码力 | 33 页 | 1.96 MB | 1 年前3
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automationcast(image, tf.uint8) return image, label train_ds = train_ds.map(resize_image) val_ds = val_ds.map(resize_image) test_ds = test_ds.map(resize_image) Note that the create_model() function here has two a Reduction cell. A normal cell's output feature map is identical to the input feature map. In contrast, a reduction cell reduces the output feature map to half. Figure 7-7 shows two child networks that primitive operations. The concatenation operation happens along the filter dimension to keep the feature map intact. Figure 7-9 shows the Normal and Reduction cells predicted by NASNet with the cifar10 dataset0 码力 | 33 页 | 2.48 MB | 1 年前3
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniquescast(image, tf.uint8) return image, label train_ds = train_ds.map(dsitem_to_tuple).map(resize_image).cache() val_ds = val_ds.map(dsitem_to_tuple).map(resize_image).cache() print(train_ds.as_numpy_iterator() layer composite of random flip and rotation. Then, we map each image through this layer to apply the transform. The mapping is done using a handy map() method on the tf.data.Dataset object. augs = tf.keras image = tf.squeeze(image, axis=0) # Squeeze the batch return image, label train_aug_ds = train_ds.map(augfn) That’s it! Simple, isn’t it? Let’s run the training function again with the augmented dataset0 码力 | 56 页 | 18.93 MB | 1 年前3
AI大模型千问 qwen 中文文档model = AutoModelForCausalLM.from_pretrained( "Qwen/Qwen1.5-7B-Chat", torch_dtype="auto", device_map="auto" ) tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat") # Instead of using model model = AutoModelForCausalLM.from_pretrained( "Qwen/Qwen1.5-7B-Chat", torch_dtype="auto", device_map="auto", attn_implementation="flash_attention_2", ) 为了解决下载问题,我们建议您尝试从 ModelScope 进行下载,只需将上述代码的第一行更改为以下内容: model = AutoModelForCausalLM.from_pretrained( "Qwen/Qwen1.5-7B-Chat", torch_dtype="auto", device_map="auto" ) tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat") # Instead of using model0 码力 | 56 页 | 835.78 KB | 1 年前3
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architecturesthe actual words by their indices in our vocabulary. If a word doesn’t exist in the vocabulary, we map it to the index of the OOV token. Similarly, we replace the label word with the index of that word end_to_end_model = tf.keras.Model(string_input, predictions) We can now classify a new piece of text, and map it to a class name. probabilities = end_to_end_model.predict( [['Usain Bolt is a very well known ‘hello’ map to the same slot in the embedding table and will learn the same embedding. Figure 4-13: Hashing trick as an option for reducing the vocabulary size. In the figure, ‘bar’ and ‘hello’ map to the0 码力 | 53 页 | 3.92 MB | 1 年前3
机器学习课程-温州大学-09深度学习-目标检测2.目标检测算法 mAP(Mean Average Precision) 多个类别的目标检测中,每一个类别都 可以绘制一条P-R曲线,各类别AP的均 值(即所有类别的AP和/类别数目)即 是mAP,mAP衡量的是训练出来的模型 在所有类别上的检测能力的好坏。 AP衡量的是学出来的模型在每个类别上的好坏,mAP衡量的是学出的模型在所有 类别上的好坏,得到AP后mAP的计算就变得很简单了,就是取所有AP的平均值。0 码力 | 43 页 | 4.12 MB | 1 年前3
全连接神经网络实战. pytorch 版测 试 的 数 据 download=True , #如 果 根 目 录 没 有 就 下 载 transform=ToTensor () ) #把 数 据 显 示 一 下 labels_map = { 0: ”T−Shirt ” , 1: ” Trouser ” , 2: ” Pullover ” , 3: ” Dress ” , 4: ”Coat” , 5: ” Sandal ” # 抽 取 索 引 为 100 的 数 据 来 显 示 img , l a b e l = training_data [ 1 0 0 ] plt . t i t l e ( labels_map [ l a b e l ] ) #squeeze 函 数 把 为1 的 维 度 去 掉 plt . imshow ( img . squeeze () , cmap=” gray ” ) plt 为numpy, 然 后 再 取 索 引 23 的 标 签 l a b e l = train_labels . numpy() [ 2 3 ] plt . t i t l e ( labels_map [ l a b e l ] ) plt . imshow (img , cmap=” gray ” ) plt . show () 程序得到显示结果: 数据有时候并不适合直接丢进网络进行训练,因此我们需要把数据进行转换。由于0 码力 | 29 页 | 1.40 MB | 1 年前3
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Reviewmodel in this fine-tuning stage is not being used for learning rudimentary features, but rather how to map the high-level representations it learned in the pretraining stage to solving our new task. Thus, the model will work on. train_ds = batched_train.map( lambda x: (preprocessor(x['description']), tf.expand_dims(x['label'], axis=-1))) test_ds = batched_test.map( lambda x: (preprocessor(x['description'])0 码力 | 31 页 | 4.03 MB | 1 年前3
机器学习课程-温州大学-14深度学习-Vision Transformer (ViT) 作为原始图像块的替代方法,输入序列可以由CNN的特征图形成。 在该混合模型中,将patch嵌入投影E应用于从CNN feature map中提取的patch。 作为一种特殊情况,patches的空间大小可以是1x1,这意味着输入序列是通过简单地打平 feature map的空间维度并投射到Transformer维度来获得的。如前所述,增加分类输入嵌入和 位置嵌入。 4.模型缺点与改进 300 码力 | 34 页 | 2.78 MB | 1 年前3
共 25 条
- 1
- 2
- 3













