It's time to use Semi-Supervised Learning for your CV models
Intro
Previously, we showed that throwing more training data at a deep learning model has rapidly...
Version 0.4 of Masterful
We’re excited to announce the v0.4 release of Masterful, the training platform for computer vision models. This latest release introduces a number of new product capabilities and enhancements, including a streamlined API, fewer steps to get from pip install to training models, and a new way to take advantage of the power of semi-supervised learning (SSL) to learn from your unlabeled data.
Get started without a sign-up
You can now run pip install masterful and immediately use all of the functionality of the Masterful platform. No sign up required to get going. After 30 days, just visit www.masterfulai.com/get-it-now to continue usage.
As always, Masterful is free to use for personal projects, including academic projects.
For commercial customers, Masterful is free to use for evaluation purposes, but once you want to deploy a model trained by Masterful into production, you’ll need to contact us at learn@masterfulai.com to get a commercial license. For more details, see www.masterfulai.com/personal-and-evaluation-agreement.
Streamlined API
We worked closely with users to develop a series of refinements to our API. The result is an API that provides more control, is easier to understand, and still very easy to use.
We have organized our API around five basic ingredients of a trained model:
There are five param objects corresponding to each of the ingredients above. For example, OptimizationParams describes the optimizer, learning rate, and batch size parameters to control training.
Placing all the optimization hyperparameters in a single object like OptimizationParams allows you to look at all your optimization decisions holistically. This is conceptually clearer than the Keras approach of spreading these parameters around. For example, in Keras, you might find some of these hyperparameters as a callback, some as an attribute of the optimizer, and some in a map call on the tf.data.Dataset. Our custom training process is built on top of Keras, so those Keras APIs are still accessed by Masterful, but you don’t need to work with them directly - you can control all your optimization parameters from the OptimizationParams object.
The real magic is how you discover these params. Each set of params has an associated learner that takes care of learning the params. For example, the masterful.regularization.learn_regularization_params() learner will figure out the optimal data augmentation, label regularization, and weight decay technique for your model.
Finally, although our autofit function was compact, it was also becoming monolithic. In its place, we have introduced multiple discrete functions to provide developers with more control of the platform. The basic steps for training with Masterful are now:
Check out the quickstart tutorial for more details on how to use these APIs.
A simple SSL recipe
We’ve introduced a new way to get started with semi-supervised learning (SSL). There are no changes to your existing regularization and training loop. Just add one step to integrate your unlabeled data using a pair of utilities: masterful.ssl.analyze_data_then_save_to() and masterful.ssl.load_from().
These utilities handles some of the trickier parts of implementing SSL:
There are tradeoffs of using the simple SSL recipe instead of the SSL capabilities of the full Masterful platform of course. In general, the full Masterful platform will run faster and deliver a more accurate model than the simple SSL recipe. But the simple SSL recipe will still outperform training without any unlabeled data at all.
For example, we trained a convnet with 5,000 labeled examples from the CIFAR10 dataset as the baseline. After using the simple SSL recipe to also learn from an additional 25,000 unlabeled examples, we were able to reduce the error rate by 8%! Once you’ve collected your unlabeled data, get started with the guide to the Simple SSL Recipe.
Figure 1: Error rate with 5,000 labeled examples of CIFAR10 and 25,000 unlabeled examples over 10 runs. Average error rate dropped from 0.53 to 0.49, or around 8%. Not shown in graph: every single run showed an improvement.
Conclusion
We'd love to hear your thoughts about our 0.4 release! Join our community at https://www.masterfulai.com/community and let us know what you think.
Previously, we showed that throwing more training data at a deep learning model has rapidly...
Deep learning has opened the door to tackling many real-world computer vision problems. But...