Synopses & Reviews
Synopsis
This book presents Statistical Learning Theory in a detailed and easy-to-understand way, using practical examples, algorithms, and source code. It can be used as a textbook in graduate or undergraduate courses, by self-learners, or as a reference for the main theoretical concepts of Machine Learning. Fundamental concepts of Linear Algebra and Optimization applied to Machine Learning are provided, along with source code in R, making the book as self-contained as possible.
It starts with an introduction to Machine Learning concepts and algorithms such as the Perceptron, the Multilayer Perceptron, and Distance-Weighted Nearest Neighbors, with examples, to provide the foundation the reader needs to understand the Bias-Variance Dilemma, which is the central point of Statistical Learning Theory.
Afterwards, we introduce all assumptions and formalize Statistical Learning Theory, allowing the practical study of different classification algorithms. Then, we proceed through concentration inequalities until arriving at the Generalization and Large-Margin bounds, which provide the main motivations for Support Vector Machines.
From there, we introduce all the optimization concepts needed to implement Support Vector Machines. As a next stage of development, the book finishes with a discussion of SVM kernels as a way, and a motivation, to study data spaces and improve classification results.
Synopsis
Chapter 1 - A Brief Review on Machine Learning
1.1 Machine Learning definition
1.2 Main types of learning
1.3 Supervised learning
1.4 How a supervised algorithm learns?
1.5 Illustrating the Supervised Learning
1.5.1 The Perceptron
1.5.2 Multilayer Perceptron
1.6 Concluding Remarks
Chapter 2 - Statistical Learning Theory
2.1 Motivation
2.2 Basic concepts
2.2.1 Probability densities and joint probabilities
2.2.2 Identically and independently distributed data
2.2.3 Assumptions considered by the Statistical Learning Theory
2.2.4 Expected risk and generalization
2.2.5 Bounds for generalization with a practical example
2.2.6 Bayes risk and universal consistency
2.2.7 Consistency, overfitting and underfitting
2.2.8 Bias of classification algorithms
2.3 Empirical Risk Minimization Principle
2.3.1 Consistency and the ERM Principle