The Mathematica Journal, Volume 10, Issue 1


A Flexible Implementation for Support Vector Machines
Roland Nilsson
Johan Björkegren
Jesper Tegnér

Solving the Optimization Problem

The Primal Problem

It turns out that the optimal separating hyperplane solving (1) can be found as the solution to the equivalent optimization problem

$$\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to} \quad y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1, \quad i = 1,\dots,\ell, \qquad (2)$$

referred to as the primal problem. Typically, only a small subset of the data points will attain equality in the constraint; these are termed support vectors, since they are "supporting" (constraining) the hyperplane (Figure 1). In fact, the solution depends only on these specific points. Therefore, the method is also a scheme for data compression, in the sense that the support vectors contain all the information necessary to derive the decision rule.

The Dual Problem

For reasons that will become clear later, the weight vector $\mathbf{w}$ often has very high dimension, which makes the primal problem (2) intractable. Therefore, we attack (2) indirectly, by solving the dual problem

$$\max_{\alpha}\ \sum_{i=1}^{\ell} \alpha_i \,-\, \tfrac{1}{2}\,\alpha^{\top} Q\,\alpha \quad \text{subject to} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{\ell} \alpha_i y_i = 0, \qquad (3)$$

where $Q_{ij} = y_i y_j\,\mathbf{x}_i \cdot \mathbf{x}_j$. This is a quadratic programming (QP) problem, which has a unique solution whenever $Q$ is positive semidefinite (as is the case here). It is solved numerically in MathSVM by the function QPSolve. (At this point we need to load the MathSVM package; see Additional Material.)
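In a session, loading might look like the following; the context name "MathSVM`" is an assumption based on the package name, so adjust it to match the Additional Material.

    (* load the MathSVM package; context name assumed *)
    Needs["MathSVM`"]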

The vector $\alpha$ has one component per data point, so the matrix $Q$ has $\ell^2$ elements, which may be quite large for large problems. Therefore, QPSolve employs a divide-and-conquer approach [7] that allows for solving (3) efficiently without storing the full matrix $Q$ in memory.

Having solved the dual problem for $\alpha$ using QPSolve, we obtain the optimal weight vector $\mathbf{w}$ and bias term $b$, that is, the solution to the primal problem (2), using the identities

$$\mathbf{w} = \sum_{i=1}^{\ell} \alpha_i y_i \mathbf{x}_i, \qquad (4)$$

$$b = -\tfrac{1}{2}\,\mathbf{w} \cdot (\mathbf{x}_+ + \mathbf{x}_-), \qquad (5)$$

where $\mathbf{x}_+$ and $\mathbf{x}_-$ in (5) are any two support vectors belonging to class $+1$ and $-1$, respectively (there always exist at least two such support vectors) [5].

A Simple SVM Example

Enough theory--let us generate some data and solve a simple SVM problem using MathSVM.
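As an illustration, data of this kind could be generated as two Gaussian clouds in the plane, labeled $+1$ and $-1$. This is only a sketch; it is not the data set used in the article.

    (* sketch: two Gaussian clouds in 2D with labels +1 and -1 *)
    len = 20;
    X = Join[
        RandomReal[NormalDistribution[2, 1], {len/2, 2}],
        RandomReal[NormalDistribution[-2, 1], {len/2, 2}]];
    y = Join[ConstantArray[1, len/2], ConstantArray[-1, len/2]];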

For this elementary problem, we use the simple SVM formulation (2) provided in MathSVM by the SeparableSVM function.
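Assuming SeparableSVM takes the data matrix and the label vector in that order (the signature is a guess), the call is simply:

    (* solve the hard-margin problem; returns the dual solution *)
    alpha = SeparableSVM[X, y]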

The returned vector $\alpha$ is the solution found by QPSolve for the dual formulation (3) of the SVM problem we just constructed. For this specific problem, the dual formulation used is exactly that described by (3).

The support vectors are immediately identifiable as the data points with nonzero $\alpha_i$. Equations (4) and (5) are implemented as follows.
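MathSVM provides this computation; as a hedged sketch of what (4) and (5) amount to, given alpha, X, and y from above (the variable names here are ours, not the package's):

    (* equation (4): w = Sum_i alpha_i y_i x_i *)
    w = (alpha y) . X;
    (* support vector indices: components of alpha above a small tolerance *)
    sv = Flatten[Position[alpha, a_ /; a > 10.^-6]];
    xPlus = X[[First[Select[sv, y[[#]] == 1 &]]]];
    xMinus = X[[First[Select[sv, y[[#]] == -1 &]]]];
    (* equation (5): bias from one support vector of each class *)
    b = -(1/2) w . (xPlus + xMinus);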

A plot similar to Figure 1 is produced by the SVMPlot function. As in Figure 1, the solid line marks the optimal hyperplane, and dotted lines mark the margin corridor whose edges pass through the support vectors (highlighted in blue).
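The call presumably looks like this; the argument order (dual solution, then data and labels) is an assumption.

    (* plot data, separating hyperplane, and margins *)
    SVMPlot[alpha, X, y]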

Nonseparable Data

Often the assumption of separable training data is not reasonable (it may fail even for the preceding simple example). In such cases, NonseparableSVM should be used. This SVM variant takes a parameter $C$ that determines how hard points violating the constraint in (2) should be penalized. This parameter appears in the objective function of the primal problem, which now is formulated as [5]

$$\min_{\mathbf{w},\,b,\,\xi}\ \tfrac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{\ell} \xi_i \quad \text{subject to} \quad y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0.$$

Large $C$ means a high penalty, and in the limit $C \to \infty$ we obtain the separable case.
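A sketch of the call, with the penalty parameter passed as a third argument (an assumption about the signature). Lowercase c is used because C is a protected built-in symbol in Mathematica.

    (* soft-margin SVM with penalty parameter c *)
    c = 10.;
    alpha = NonseparableSVM[X, y, c]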

Getting QP Formulations

Sometimes it is interesting to examine what the QP problem looks like for a given SVM formulation. Using the option FormulationOnly, we can inspect the various parameters instead of actually solving the QP. This can, for example, be used to study expressions analytically.
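For example, the QP data for the separable formulation might be obtained as follows; passing FormulationOnly -> True is an assumption about how the option is used.

    (* return the QP parameters instead of solving the problem *)
    SeparableSVM[X, y, FormulationOnly -> True]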

The parameters are given in the order corresponding to the arguments of QPSolve.



     