# Classification and regularization in learning theory

In this dissertation, we research classification algorithms created by regularization schemes. The design of these algorithms and their error analysis are completely explained. These algorithms rely on convex risk minimization with Tikhonov regularization. They require an admissible convex loss function, a hypothesis space and a regularizer. We first talk about the admissibility of convex loss functions along with their properties. Comparison theorems are mentioned and new ones are established. Then we check out hypothesis space and regularizer. It is shown that support vector machines (SVMs) and regularized boosting are standard cases of these algorithms. We also introduce the regularization scheme in multi-kernel spaces, a new classification algorithm. As for the error analysis, we take into account 2 classes of models depending on whether the hypothesis space and the regularizer are sample independent or dependent. For the regularization scheme with sample-independent hypothesis space and regularizer, we present a regularization technique. The error is decomposed into the sum of the sample error and the regularization error…..

Contents

1 Introduction
1.1 Binary classiﬁcation problems
1.2 Classiﬁcation algorithms and consistency
1.3 Empirical risk minimization principle
1.4 Convex risk minimization
1.5 Regularization
1.6 Outline of this thesis
2 Convex Loss Functions and Comparison Theorems
2.1 Convex loss functions
2.2 Comparison theorems
3 Regularized Classiﬁers in Reproducing Kernel Hilbert Spaces
3.1 Reproducing kernel Hilbert spaces
3.2 Regularized classiﬁer in the RKHS
3.3 Regularized classiﬁer in a multi-kernel space
4 Probability Inequalities
4.1 Basic inequalities
4.2 Covering numbers in learning theory
4.2.1 Covering numbers concerning RKHS
4.3 Uniform convergence
4.4 Reﬁned concentration inequalities
5 Error Analysis
5.1 Basic models
5.2 Overview of existing methods
5.3 Error decomposition and projection
5.4 Error bounds and learning rates
5.4.1 Basic assumptions
5.4.2 A general theorem
5.4.3 Apply to the CRM-based algorithms
5.4.4 Proofs
5.5 Remarks on consistency
6 Support Vector Machines
6.1 The 1-norm soft margin SVM
6.2 The offset
6.2.1 Uniform stability may fail
6.2.2 Bounding the offset
6.2.3 Compute the capacity of the hypothesis space
6.2.4 Inﬂuence on the regularization error
6.3 Error analysis
6.3.1 Error decomposition
6.3.2 Bayes-risk consistency
6.3.3 Learning rates
6.3.4 Regularization error
6.4 Consistency with hypothesis space may fail
6.5 q-norm SVM with q > 1
6.5.1 Tight risk comparison
6.5.2 Bounding the offset
6.5.3 Learning rates
6.5.4 Estimate the regularization error
7 Linear Programming SVM
7.1 LP-SVM with Mercer kernels
7.1.1 Sample-dependent hypothesis space and regularizer
7.1.2 Differences from sparse approximation and neural network
7.2 Error decomposition
7.2.1 A stepping stone
7.2.2 Error decomposition for LP-SVM
7.3 Consistency…………..

Source: City University of Hong Kong