Support Vector Machine
Support vector machines (SVMs) are a supervised learning method, with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Introduced by Vladimir Vapnik and his colleagues, SVMs are a relatively recent learning method most often used for binary classification. The basic idea is to find a hyperplane that separates the d-dimensional data perfectly into its two classes.
Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.
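The maximum-margin idea above can be sketched numerically. The following is a minimal illustration, not the full SVM quadratic-programming formulation: it trains a linear classifier by sub-gradient descent on the regularized hinge loss (a simplified Pegasos-style update). The toy dataset, learning rate, and epoch count are illustrative assumptions.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=500, lr=0.1):
    """Minimize lam/2 * ||w||^2 + mean hinge loss for labels y in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                     # points violating the margin
        grad_w = lam * w - (y[mask] @ X[mask]) / n
        grad_b = -y[mask].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy linearly separable data: one class near (0, 0), the other near (3, 3).
X = np.array([[0.0, 0.5], [0.5, 0.0], [0.2, 0.2],
              [3.0, 3.5], [3.5, 3.0], [3.2, 3.2]])
y = np.array([-1, -1, -1, 1, 1, 1])

w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)      # which side of the learned hyperplane
```

Only the points whose margin is below 1 (the "support" points of the current solution) contribute to the gradient, which mirrors the fact that the final SVM hyperplane depends only on the support vectors.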
SVM can be considered a refinement of logistic regression. Logistic regression is also used to predict a binary response: it predicts the outcome of a categorical dependent variable (i.e., a class label) from one or more predictor variables (features). While logistic regression attends more to the overall distributions of the classes, SVM assumes that the boundaries between the classes are more informative than the overall distributions.
However, since example data is often not linearly separable, SVMs introduce the notion of a "kernel-induced feature space," which casts the data into a higher-dimensional space where it is separable. Typically, casting into such a space would cause problems both computationally and with overfitting. The key insight of SVMs is that the higher-dimensional space never needs to be dealt with directly; as it turns out, only the formula for the dot product in that space is needed, which eliminates both concerns.
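The claim that only the dot-product formula is needed can be checked numerically. The sketch below uses the standard degree-2 polynomial kernel as an example (the particular kernel and sample vectors are my choice, not from the text): evaluating the kernel directly in the original 2-D space gives exactly the dot product of explicit 6-dimensional feature vectors, without ever constructing that space during prediction.

```python
import numpy as np

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, evaluated directly in input space."""
    return (x @ z + 1.0) ** 2

def phi(x):
    """Explicit feature map for this kernel in 2-D:
    (x . z + 1)^2 == phi(x) . phi(z) with the six features below."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([x1**2, x2**2, s * x1 * x2, s * x1, s * x2, 1.0])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

direct = poly_kernel(x, z)    # computed in the original 2-D space
explicit = phi(x) @ phi(z)    # computed in the 6-D feature space
# The two agree, illustrating why the feature space never has to be built.
```

For kernels like the Gaussian (RBF) kernel the corresponding feature space is infinite-dimensional, so this shortcut is not just a convenience but a necessity.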
Furthermore, the VC dimension (a measure of a system's likelihood to perform well on unseen data) of SVMs can be calculated explicitly, unlike for many other learning methods, such as neural networks, for which no comparable measure is readily available.
II. Description and Analysis
We first introduce logistic regression. An explanation of logistic regression begins with an explanation of the logistic function, which always takes on values between zero and one:

F(t) = 1 / (1 + e^{-t})
Viewing t as a linear function of an explanatory variable x (or of a linear combination of explanatory variables), t = β0 + β1·x, the logistic function can be written as:

F(x) = 1 / (1 + e^{-(β0 + β1·x)})
This will be interpreted as the probability of the dependent variable equaling a "success" or "case" rather than a failure or non-case. We also define the inverse of the logistic function, the logit:

g(x) = ln( F(x) / (1 − F(x)) ) = β0 + β1·x
Fig. 1 is a graph of the logistic function F(x). The formula for F(x) shows that the probability of the dependent variable equaling a case is equal to the value of the logistic function of the linear regression expression. This is important in that it shows that the value of the linear regression expression can vary from negative to positive infinity and yet, after transformation, the resulting probability F(x) ranges between 0 and 1. The equation for g(x) shows that the logit (i.e., the log-odds, or natural logarithm of the odds) is equivalent to the linear regression expression. Likewise, exponentiating g(x) gives F(x) / (1 − F(x)) = e^{β0 + β1·x}, showing that the odds of the dependent variable equaling a case are equivalent to the exponential function of the linear regression expression. This illustrates how the logistic function serves as a link function between the probability and the linear regression expression. Given that the logit ranges between negative and positive infinity, it provides an adequate criterion upon which to conduct linear regression.
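The link-function relationship described above is easy to verify numerically: the logistic function F maps any real number into (0, 1), and the logit g maps the probability back to the original real value, so g(F(x)) = x. The sample values below are arbitrary.

```python
import math

def F(x):
    """Logistic function: maps any real x to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def g(p):
    """Logit (inverse of F): log-odds, ranging over the whole real line."""
    return math.log(p / (1.0 - p))

for x in (-5.0, 0.0, 2.5):
    p = F(x)
    assert 0.0 < p < 1.0            # probabilities stay strictly in (0, 1)
    assert abs(g(p) - x) < 1e-9     # the logit undoes the logistic function
```

In logistic regression, g plays the role of the link between the probability scale, where predictions live, and the unbounded linear-predictor scale, where the regression is actually fit.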