Lecture Notes on Linear Regression
... + ··· + θ_1 x_1 + θ_0. Geometrically, when n = 1, h_θ(x) is actually a line in a 2D plane, while h_θ(x) represents a plane in a 3D space when n = 2. Generally, when n ≥ 3, h_θ(x) defines a so-called "hyperplane" by θ. Since our goal is to make predictions according to the hypothesis function given new test data, we need to find the optimal value of θ such that the resulting prediction is as accurate as possible based on a given set of m training data {x^(i), y^(i)}_{i=1,...,m}. In particular, we are supposed to find a hypothesis function (parameterized by θ) which fits the training data as closely as possible. To measure ...
0 points | 6 pages | 455.98 KB | 1 year ago
Lecture 1: Overview
... People's Posts and Telecommunications Press, 2016; Trevor Hastie, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd Ed.), World Publishing Corporation, 2015; Christopher M. Bishop ...
... Personalized news or mail filter; Personalized tutoring; Discover new knowledge from large databases (data mining); Market basket analysis (e.g. diapers and beer); Medical information mining (e.g. migraines ...
... given only indirect feedback? [Feng Li (SDU), Overview, September 6, 2023, 14 / 57] Source of Training Data: Provided random examples outside of the learner's control. Negative examples available or only positive ...
0 points | 57 pages | 2.41 MB | 1 year ago
Lecture 3: Logistic Regression
... malignant [Feng Li (SDU), Logistic Regression, September 20, 2023, 9 / 29] Logistic Regression (Contd.): Data samples are drawn randomly. X: random variable representing feature vector. Y: random variable representing ...
... X = x; θ) = 1/(1 + exp(−θ^T x)). The "score" θ^T x is also a measure of distance of x from the hyperplane (the score is positive for pos. examples, and negative for neg. examples). High positive score: High ...
... positive samples and all other samples as negative ones. Inputs: A learning algorithm L, training data {(x^(i), y^(i))}_{i=1,...,m} where y^(i) ∈ {1, ..., K} is the label for the sample x^(i). Output: A list ...
0 points | 29 pages | 660.51 KB | 1 year ago
Lecture 2: Linear Regression
... the 2D plane is a straight line. Hypothesis: h_θ(x) = θ_0 + θ_1 x (where θ_0 and θ_1 are parameters) [Feng Li (SDU), Linear Regression, September 13, 2023, 7 / 31] Linear Regression (Contd.): Given data x ∈ R^n ...
... Linear Regression
1: Given a starting point θ ∈ dom J
2: repeat
3:   Randomly shuffle the training data;
4:   for i = 1, 2, ..., m do
5:     θ ← θ − α∇J(θ; x^(i), y^(i))
6:   end for
7: until convergence criterion ...
... Probabilistic Interpretation (Contd.): The training data {x^(i), y^(i)}_{i=1,...,m} are sampled identically and independently. p(y = y^(i) | x = x^(i); θ) = 1 ...
0 points | 31 pages | 608.38 KB | 1 year ago
Lecture Notes on Support Vector Machine
... fli@sdu.edu.cn, Shandong University, China. 1 Hyperplane and Margin: In an n-dimensional space, a hyperplane is defined by ω^T x + b = 0 (1), where ω ∈ R^n is the outward pointing normal vector, and b is the ...
... so-called margin of x_0 (with respect to the hyperplane ω^T x + b = 0). Now, given a set of m training data {(x^(i), y^(i))}_{i=1,...,m}, we first assume that they are linearly separable. Specifically, there exists ...
... hyperplane actually serves as a decision boundary for differentiating positive data samples from negative data samples. Given a test data sample, we will make a more confident decision if its margin (with respect ...
0 points | 18 pages | 509.37 KB | 1 year ago
Lecture 5: Gaussian Discriminant Analysis, Naive Bayes
... p_X(x), ∀y. We calculate p_{X|Y}(x | y) for ∀x, y and p_Y(y) for ∀y according to the given training data. Fortunately, we do not have to calculate p_X(x), because arg max_y p_{Y|X}(y | x) = arg max_y p_{X|Y} ...
... learning from training data, but how? [Feng Li (SDU), GDA, NB and EM, September 27, 2023, 33 / 122] Warm Up (Contd.): Given a set of training data D = {x^(i), y^(i)}_{i=1,...,m}. The training data are sampled in an i.i.d. manner. The probability of the i-th training data (x^(i), y^(i)): P(X = x^(i), Y = y^(i)) = P(X = x^(i) | Y = y^(i)) P(Y = y^(i)) = p_{X|Y}(x^(i) | y^(i)) p_Y(y^(i)). The ...
0 points | 122 pages | 1.35 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
... features in the input. Recurrent Neural Nets (RNNs) facilitated learning from the sequences and temporal data. These breakthroughs contributed to bigger and bigger models. Although they improved the quality of ...
... you see, books you read, food you enjoy and so on), without the need of knowing all the encyclopedic data about them. When working with deep learning models and inputs such as text, which are not in numerical ...
... high-dimensional data into low-dimension, while retaining the properties from the high-dimensional representation. It is useful because it is often computationally infeasible to work with data that has a large ...
0 points | 53 pages | 3.92 MB | 1 year ago
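动手学深度学习 v2.0 (Dive into Deep Learning v2.0)
... import Image
    from torch import nn
    from torch.nn import functional as F
    from torch.utils import data
    from torchvision import transforms
Target audience: This book is for students (undergraduate or graduate), engineers, and researchers who want a solid grasp of the practical techniques of deep learning. Because we start from scratch ...
... wrote a "learning" program. If we use a huge labeled dataset, it can very likely "learn" to recognize the wake word. This approach of "determining a program's behavior with a dataset" can be viewed as programming with data. For example, we can design a "cat detector" by providing a machine learning system with many pictures of cats and dogs. The detector can eventually learn to output a very large positive number if the input is a picture of a cat, and a large-magnitude negative number if the input is a picture of a dog ...
... a major branch of ... learning, which later parts of this section analyze in more detail. 1.2 Key Components in Machine Learning: First, we introduce some core components. Whatever the type of machine learning problem, you will encounter these components: 1. the data that can be learned from; 2. a model of how to transform the data; 3. an objective function that quantifies how effective the model is; 4. an algorithm that adjusts the model's parameters to optimize the objective function.
0 points | 797 pages | 29.45 MB | 1 year ago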
keras tutorial
... of algorithms, inspired from the model of human brain. Deep learning is becoming more popular in data science fields like robotics, artificial intelligence (AI), audio & video recognition and image recognition ...
... -U scikit-learn. Seaborn: Seaborn is an amazing library that allows you to easily visualize your data. Use the below command to install: pip install seaborn. You could see the message similar as specified ...
... json { "image_data_format": "channels_last", "epsilon": 1e-07, "floatx": "float32", "backend": "tensorflow" } Here, image_data_format represents the data format. epsilon ...
0 points | 98 pages | 1.57 MB | 1 year ago
Keras: 基于 Python 的深度学习库 (Keras: The Python Deep Learning Library)
... metrics=['accuracy'])
    # Generate dummy data
    import numpy as np
    data = np.random.random((1000, 100))
    labels = np.random.randint(2, size=(1000, 1))
    # Train the model, iterating in batches of 32 samples
    model.fit(data, labels, epochs=10, batch_size=32)
... compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    # Generate dummy data
    import numpy as np
    data = np.random.random((1000, 100))
    labels = np.random.randint(10, size=(1000, 1))
    # Convert the labels to categorical one-hot encoding
    one_hot_labels = keras.utils.to_categorical(labels, num_classes=10)
    # Train the model, iterating in batches of 32 samples
    model.fit(data, one_hot_labels, epochs=10, batch_size=32)
3.1.5 Examples: Here are a few examples to help you get started! In the examples directory, you can find example models for real datasets: ...
0 points | 257 pages | 1.19 MB | 1 year ago
74 results in total
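The stochastic gradient descent steps quoted in the "Lecture 2: Linear Regression" excerpt above (shuffle the training data, then update θ one sample at a time) can be sketched roughly as follows. This is a minimal illustration, not the lecture's own code. The excerpt does not define J, so the per-sample squared-error loss, the one-feature hypothesis h_θ(x) = θ_0 + θ_1 x, the function name `sgd_linear_regression`, the step size `alpha`, and the fixed epoch budget standing in for the unspecified convergence criterion are all assumptions.

```python
import random


def sgd_linear_regression(xs, ys, alpha=0.05, epochs=500, seed=0):
    """Fit h(x) = theta0 + theta1 * x by stochastic gradient descent.

    Assumed per-sample loss (not given in the excerpt):
        J(theta; x, y) = 0.5 * (h(x) - y) ** 2
    """
    rng = random.Random(seed)      # fixed seed so the shuffling is reproducible
    theta0, theta1 = 0.0, 0.0      # step 1: starting point
    indices = list(range(len(xs)))
    for _ in range(epochs):        # step 2: "repeat" (fixed budget, an assumption)
        rng.shuffle(indices)       # step 3: randomly shuffle the training data
        for i in indices:          # step 4: visit every sample once
            error = theta0 + theta1 * xs[i] - ys[i]
            # step 5: theta <- theta - alpha * gradient of the per-sample loss,
            # where the gradient is (error, error * x)
            theta0 -= alpha * error
            theta1 -= alpha * error * xs[i]
    return theta0, theta1


# Noise-free toy data generated from y = 1 + 2x (hypothetical example data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
theta0, theta1 = sgd_linear_regression(xs, ys)
print(theta0, theta1)
```

On this toy data the estimates should approach the generating parameters (θ_0 near 1, θ_1 near 2). With noisy data, a decaying step size or an explicit stopping test would usually replace the fixed epoch count.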













