Google Vertex vs Masterful AI (Part 2): Using unlabeled data
In a previous post we compared Masterful AI’s computer vision training platform to Google’s Vertex AI AutoML platform. We observed significant improvements across the board from Masterful. But that comparison only considered the use of labeled data, whereas Masterful—unlike Vertex—can also leverage your unlabeled data to improve performance, using semi-supervised learning (SSL).
So how effectively can Masterful leverage your unlabeled data? In this post, we’ll find out by comparing Masterful to Vertex on a benchmark with both labeled and unlabeled data.
We’ll be using the FGVC-Aircraft dataset, a collection of 10,200 images comprising 70 aircraft families, as it’s commonly used in few-shot learning scenarios. Here, we keep the labels for 50% of the data and remove the labels from the other 50%, which then serves as our unlabeled set. Though Vertex doesn’t have a use for this unlabeled data, we can provide it to Masterful to improve our model’s performance. We resize all images to 336px square (zero-padded) and train a ResNet-50 classifier.
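To make the setup concrete, here is a minimal sketch of the two preparation steps just described: splitting the data into labeled and unlabeled halves, and working out the geometry of a zero-padded 336px square resize. This is not the code used for the benchmark. The file names, the random (non-stratified) split, the fixed seed, and the choice to preserve aspect ratio and center the image inside the padding are all assumptions made for illustration.

```python
import random


def split_labeled_unlabeled(samples, labeled_fraction=0.5, seed=0):
    """Shuffle (path, label) pairs, keep labels for a fraction, drop them for the rest.

    Assumption: a plain random split; the post does not say whether the
    split was stratified by class.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * labeled_fraction)
    labeled = shuffled[:cut]                           # (path, label) pairs
    unlabeled = [path for path, _ in shuffled[cut:]]   # labels discarded
    return labeled, unlabeled


def letterbox_geometry(width, height, target=336):
    """Return the resized (w, h) and the (left, top) offset inside a target x target canvas.

    Assumption: the longer side is scaled to `target`, aspect ratio is kept,
    and the image is centered; the remaining canvas would be filled with zeros.
    """
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    left = (target - new_w) // 2
    top = (target - new_h) // 2
    return (new_w, new_h), (left, top)


# Hypothetical stand-in for the dataset index: 10,200 images, 70 families.
samples = [(f"img_{i:05d}.jpg", i % 70) for i in range(10200)]
labeled, unlabeled = split_labeled_unlabeled(samples)
print(len(labeled), len(unlabeled))    # 5100 5100
print(letterbox_geometry(1024, 683))   # ((336, 224), (0, 56))
```

In a real pipeline the geometry would be handed to an image library to do the actual resize and padding; only the bookkeeping is shown here.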
Examples from the FGVC-Aircraft dataset.
Performing semi-supervised learning with Masterful takes a single extra step: we learn SSL parameters using our unlabeled dataset.
This is all we need to start leveraging unlabeled data with Masterful. The effect is significant: Masterful scores 81% accuracy against Vertex’s 76%. In terms of test error, that is a drop from 24% to 19%, a roughly 20% decrease relative to Vertex’s error.
Top-1 accuracies on FGVC-Aircraft test set for Google Vertex and Masterful. Masterful achieves a significantly higher accuracy, in part by leveraging unlabeled data through SSL.
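The 20% figure is a relative reduction in test error, not a difference in accuracy points, which is easy to misread. A small sketch of the arithmetic, using only the two accuracies reported above (the function name is ours):

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Fraction by which test error shrinks when accuracy goes from baseline_acc to new_acc."""
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    return (baseline_err - new_err) / baseline_err


# Vertex: 76% accuracy (24% error); Masterful: 81% accuracy (19% error).
reduction = relative_error_reduction(0.76, 0.81)
print(f"{reduction:.1%}")  # 20.8%
```

The absolute gain is 5 accuracy points; measured against Vertex's 24% error, that is about a fifth of the remaining mistakes removed.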
The Masterful Visualize tool even lets us see the effect SSL has on our model’s validation loss. The waterfall plot below shows that SSL using unlabeled data was one of the most significant factors in reducing validation loss from an untuned baseline:
Waterfall plot demonstrating how different training techniques affect validation loss. From left to right, we see the baseline validation loss, followed by the improvements delivered by horizontal mirror, rot90, HSV / contrast / blur, spatial augmentations, MixUp, and semi-supervised learning (SSL). Masterful learns to compose these techniques to achieve the significantly reduced validation loss on the right, in orange.
This benchmark is just a simple example of how Masterful AI makes semi-supervised learning straightforward and effective, enabling improvement over purely supervised AutoML solutions. In the next post in our Vertex comparison series, we’ll extend this idea further, making use of uncurated unlabeled data “in the wild”.
Jack is an MLE at Masterful, where he works on automating himself out of a job.