Lecture 5: Gaussian Discriminant Analysis, Naive Bayes
... distributions; Joint probability distribution; Independence; Conditional probability distribution; Bayes' Theorem ... Feng Li (SDU) GDA, NB and EM September 27, 2023 3 / 122 Sample Space, Events and Probability ... =y(x)dx; P(Y = y) = p_Y(y) ... Bayes' Theorem: Bayes' theorem (or Bayes' rule) describes the probability of an event, based on prior knowledge of conditions ... A)P(A) / P(B) ... In the Bayesian interpretation, probability measures a "degree of belief", and Bayes' theorem links the degree of belief in a proposition before and after accounting for evidence. For proposition ...
0 points | 122 pages | 1.35 MB | 1 year ago
Lecture Notes on Support Vector Machine
... thus is a concave function regardless of the original problem; iii) G can be −∞ for some α and β. Theorem 1. Lower Bounds Property: If α ⪰ 0, then G(α, β) ≤ p* where p* is the optimal value of the (original) ... We now choose the minimizer of f(ω̃) over all feasible ω̃'s to get p* ≥ G(α, β). It is shown by Theorem 1 that the Lagrange dual function provides a non-trivial lower bound to the primal optimization ... Complementary Slackness: Let ω* be a primal optimal point and (α*, β*) be a dual optimal point. Theorem 2. Complementary Slackness: If strong duality holds, then α*_i g_i(ω*) = 0 (Eq. 16) for ∀i = 1, 2, ...
0 points | 18 pages | 509.37 KB | 1 year ago
Lecture Notes on Gaussian Discriminant Analysis, Naive Bayes and EM Algorithm
Feng Li, fli@sdu.edu.cn, Shandong University, China. 1 Bayes' Theorem and Inference: Bayes' theorem is stated mathematically as the following equation: P(A | B) = P(B | A)P(A) / P(B) ... h_θ(x) to model the relationship between feature vector x and label y, while we now rely on Bayes' theorem to characterize the relationship through parameters θ = {P(X = x | Y = y), P(Y = y)}_{x,y}. 2 Gaussian ... and p_{X|Y}(x | 1) according to our assumptions (5)∼(7), and make predictions according to Bayes' theorem (see Eq. (2)). Specifically, given a test data featured by x̃, we compare P(Y = ỹ | X = x̃) = p_Y ...
0 points | 19 pages | 238.80 KB | 1 year ago
Lecture 2: Linear Regression
... f(x) w.r.t. x_i. Feng Li (SDU) Linear Regression September 13, 2023 10 / 31 Gradient (Contd.) Theorem: For any n-dimensional vector u, the directional derivative of f in the direction of u can be represented ... BATC ... Revisiting Least Square (Contd.) Theorem: The matrix A^T A is invertible if and only if the columns of A are linearly independent. In this case, there exists only one least-squares solution θ = (X^T X)^{-1} X^T Y. Prove the above theorem in Problem Set 1. ... Probabilistic Interpretation ...
0 points | 31 pages | 608.38 KB | 1 year ago
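Dive into Deep Learning v2.0 (动手学深度学习 v2.0)
... probability), and denote it by P(B = b | A = a): it is the probability of B = b, given that A = a has occurred. Bayes' theorem: Using the definition of conditional probability, we can derive one of the most useful equations in statistics: Bayes' theorem. By the multiplication rule, P(A, B) = P(B | A)P(A). By symmetry, P(A, B) = P(A | B)P(B). Assuming P(B) ... 2 will become less likely, and our training error will match the generalization error. ... Statistical learning theory: Since generalization is the fundamental problem in machine learning, many mathematicians and theorists have devoted their careers to developing formal theories that describe this phenomenon. In their eponymous theorem [62], Glivenko and Cantelli derived the rate at which the training error converges to the generalization error. In a series of seminal papers, Vapnik and Chervonenkis [63] extended this theory to more general classes of functions. This work laid the foundations of statistical learning theory ... rule of thumb is quite useful: statisticians consider a model that can readily explain arbitrary facts to be complex, whereas one that has limited expressive power but still manages to explain the data well ... [62] https://en.wikipedia.org/wiki/Glivenko%E2%80%93Cantelli_theorem [63] https://en.wikipedia.org/wiki/Vapnik%E2%80%93Chervonenkis_theory
0 points | 797 pages | 29.45 MB | 1 year ago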
PyTorch Tutorial
... time: • Google Colab provides a free Tesla K80 GPU with about 12 GB of memory. You can run the session in an interactive Colab Notebook for 12 hours. • https://colab.research.google.com/ Misc • Dynamic VS Static Computation ...
0 points | 38 pages | 4.09 MB | 1 year ago
PyTorch Release Notes
... before you proceed to step 3. 3. To run the container image, select one of the following modes: ‣ Interactive ‣ If you have Docker 19.03 or later, a typical command to launch the container is: docker run ... nvidia-docker run -it --rm -v local_dir:container_dir nvcr.io/nvidia/pytorch:-py3 ‣ Non-interactive ‣ If you have Docker 19.03 or later, a typical command to launch the container is: docker run ...
0 points | 365 pages | 2.94 MB | 1 year ago