Volume 10, Issue 1
Download This Issue
Staff and Contributors
A Flexible Implementation for Support Vector Machines
Solving the Optimization Problem
The Primal Problem
It turns out that the optimal separating hyperplane solving (1) can be found as the solution to the equivalent optimization problem
referred to as the primal problem. Typically, only a small subset of the data points will attain equality in the constraint; these are termed support vectors since they are "supporting" (constraining) the hyperplane (Figure 1). In fact, the solution depends only on these specific points. Therefore, the method also is a scheme for data compression, in the sense that the support vectors contain all the information necessary to derive the decision rule.
The Dual Problem
For reasons that will become clear later, often has very high dimension, which makes the primal problem (2) intractable. Therefore, we attack (2) indirectly, by solving the dual problem
where . This is a quadratic programming (QP) problem, which has a unique solution whenever is positive semidefinite (as is the case here). It is solved numerically in MathSVM by the function QPSolve. (At this point we need to load the MathSVM package; see Additional Material.)
The variable has the number of data points, so the matrix has elements, which may be quite large for large problems. Therefore, QPSolve employs a divide-and-conquer approach  that allows for solving (3) efficiently without storing the full matrix in memory.
Having solved the dual problem for using QPSolve, we obtain the optimal weight vector and bias term , that is, the solution to the primal problem (2), using the identities
where in (5) are any two support vectors belonging to class and , respectively (there always exist at least two such support vectors) .
A Simple SVM Example
Enough theory--let us generate some data and solve a simple SVM problem using MathSVM.
For this elementary problem, we use the simple SVM formulation (2) provided in MathSVM by the SeparableSVM function.
The returned vector is the solution found by QPSolve for the dual formulation (3) of the SVM problem we just constructed. For this specific problem, the dual formulation used is exactly that described by (3), that is
The support vectors are immediately identifiable as the nonzero . (4) and (5) are implemented as
A plot similar to Figure 1 is produced by the SVMPlot function. As in Figure 1, the solid line marks the optimal hyperplane, and dotted lines mark the width of the corridor that joins support vectors (highlighted in blue).
Often the assumption of separable training data is not reasonable (it may fail even for the preceding simple example). In such cases, NonseparableSVM should be used. This SVM variant takes a parameter that determines how hard points violating the constraint in (2) should be penalized. This parameter appears in the objective function of the primal problem, which now is formulated as 
Large means high penalty, and in the limit we obtain the separable case.
Getting QP Formulations
Sometimes it is interesting to examine what the QP problem looks like for a given SVM formulation. Using the option FormulationOnly, we can inspect the various parameters instead of actually solving the QP. This can, for example, be used to study expressions analytically.
The parameters are given in the order corresponding to the arguments of QPSolve.
About Mathematica | Download Mathematica Player
© 2006 Wolfram Media, Inc. All rights reserved.