After giving an SVM model sets of labeled training data for each category, it's able to categorize new text. Compared to newer algorithms like neural networks, SVMs have two main advantages: higher speed and better performance with a limited number of samples (in the thousands). This makes the algorithm very suitable for text classification problems, where it's common to have access to a dataset of at most a couple of thousand tagged samples.
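To make that workflow concrete, here is a minimal sketch assuming scikit-learn with a TF-IDF plus linear SVM pipeline; the tags and sample texts are made up for illustration:

```python
# A minimal sketch of an SVM text classifier, assuming scikit-learn.
# The tags ("sports", "finance") and the example documents are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# A tiny labeled dataset: each text is tagged with a category.
texts = [
    "the match ended in a dramatic penalty shootout",
    "the striker scored twice in the second half",
    "the central bank raised interest rates again",
    "stock markets rallied after the earnings report",
]
labels = ["sports", "sports", "finance", "finance"]

# Vectorize the text (TF-IDF) and fit a linear SVM on the labeled samples.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)

# The trained model can now categorize new, unseen text.
print(model.predict(["another striker scored in the match"]))  # e.g. ['sports']
```

With more realistic data you would swap in your own tagged samples and hold some out for evaluation, but the shape of the code stays the same: fit on labeled texts, then predict on new ones.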
The basics of how Support Vector Machines work are best understood with a simple example. Let's imagine we have two tags, red and blue, and our data has two features, x and y. We want a classifier that, given a pair of (x, y) coordinates, outputs whether it's red or blue. We plot our already labeled training data on a plane.
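A minimal sketch of this toy setup, assuming scikit-learn (the coordinates below are made-up training points), looks like this:

```python
# Two tags (red/blue) and two features (x, y), assuming scikit-learn's SVC.
import numpy as np
from sklearn.svm import SVC

# Made-up labeled training data: (x, y) coordinates and their tags.
X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]], dtype=float)
y = np.array(["red", "red", "red", "blue", "blue", "blue"])

# Fit a linear SVM that separates the two tags.
clf = SVC(kernel="linear")
clf.fit(X, y)

# Given a new (x, y) pair, the classifier outputs red or blue.
print(clf.predict([[3, 2], [7, 6]]))  # e.g. ['red' 'blue']
```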
In our example we found a way to classify nonlinear data by cleverly mapping our space to a higher dimension. However, calculating this transformation can get pretty computationally expensive: there can be a lot of new dimensions, each one possibly involving a complicated calculation. Doing this for every vector in the dataset would be a lot of work, so it would be great to find a cheaper solution.

And we're in luck! Here's the trick: SVM doesn't need the actual vectors to work its magic; it can get by with only the dot products between them. This means we can sidestep the expensive calculation of the new coordinates entirely: instead of mapping every vector, we just figure out what the dot product in the new space looks like and compute it directly from the original vectors.
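As a sketch of what this trick buys us, consider a degree-2 polynomial kernel on 2-D vectors (the specific mapping and numbers are illustrative assumptions, not from the example above): the kernel, computed on the original vectors, gives exactly the dot product we would have gotten after the explicit, expensive mapping.

```python
# Kernel trick sketch with NumPy (illustrative feature map and vectors).
# For 2-D vectors, the degree-2 polynomial kernel K(a, b) = (a . b)^2 equals
# the ordinary dot product in a 3-D feature space -- without ever mapping to it.
import numpy as np

def phi(v):
    """Explicit (expensive) mapping to the higher-dimensional space."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def kernel(a, b):
    """Dot product in the new space, computed directly from the original vectors."""
    return np.dot(a, b) ** 2

a = np.array([3.0, 6.0])
b = np.array([10.0, 10.0])

print(np.dot(phi(a), phi(b)))  # dot product after the explicit mapping: 8100.0
print(kernel(a, b))            # same value, no mapping needed: 8100.0
```

Both prints agree, which is the whole point: the SVM only ever needs these kernel values, so it never has to build the high-dimensional vectors at all.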