
Masterful 0.5.2: The easiest way to train high performance object detection models

Yaoshiang Ho

We've released Masterful 0.5.2, the platform that automates model development for computer vision. This release adds support for object detection, allowing developers to build detection models faster than ever before.

You can download it right now with a `pip install -U masterful`. 

With 0.5.2, you can now train high-performance, production-ready object detection and semantic segmentation models with just your data, a dozen lines of YAML, and a CLI call to `masterful-train`.

Get started using Masterful on an Object Detection problem with a fast, small toy dataset called YYMNIST. It introduces you to some of the issues you'll run into in trying to analyze images of paper documents. You'll learn about data prep, training, evaluating the model, and running the model in inference mode. 

For a more advanced problem that requires larger, more powerful models, check out the example application on detecting pedestrians in street level imagery (SLI). 

What Is Object Detection

The first computer vision task we learn is Classification: what is in an image. The next question we usually get asked is, "where is it?" That's what Object Detection does. Put another way, object detection means, "tell me what is in an image and put a bounding box around it".

Object detection also tracks instances of classes rather than the mere presence of a category. For example, if there's an image of two cars and three pedestrians, object detection will put bounding boxes around both cars and all three pedestrians. Multi-label classification, by contrast, would simply indicate the presence of pedestrians and cars, without describing how many there are or where they are located within the image.
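To make the contrast concrete, here is a minimal sketch of what the two kinds of labels might look like for that image. The dictionary layout, the pixel values, and the `(x_min, y_min, x_max, y_max)` box format are assumptions for illustration only, not Masterful's label format.

```python
from collections import Counter

# Hypothetical annotations for one image. The box format is assumed to be
# (x_min, y_min, x_max, y_max) in pixels; the numbers are made up.
detection_labels = [
    {"class": "car", "box": (12, 40, 110, 95)},
    {"class": "car", "box": (150, 42, 260, 100)},
    {"class": "pedestrian", "box": (30, 20, 50, 80)},
    {"class": "pedestrian", "box": (60, 22, 78, 79)},
    {"class": "pedestrian", "box": (200, 18, 222, 85)},
]

# Object detection keeps one entry per instance, so counts and locations survive.
instance_counts = Counter(label["class"] for label in detection_labels)

# Multi-label classification reduces the same image to presence flags.
class_names = ["car", "pedestrian", "bicycle"]
multi_label = {name: name in instance_counts for name in class_names}

print(dict(instance_counts))
print(multi_label)
```

The detection labels can always be collapsed into the multi-label form, but not the other way around: the presence flags have discarded how many objects there are and where they sit.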

What You'll Learn

If you're new to object detection, you'll find that our examples are a gentle introduction to many of its nuances. With the help of our docs and CLI, you can deliver production-ready object detection models.

You'll learn: 

  • What additional information the labels carry: one entry per instance of a class, plus the four coordinates that define a bounding box around that instance.
  • The basic metrics, AP and mAP, and how they incorporate both classification accuracy and the correctness of the predicted bounding boxes.
  • How to read confusion matrices for object detection, including the "background" class.
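As a rough illustration of how a detection metric can combine classification accuracy with box correctness, the sketch below computes intersection over union (IoU) and uses it to decide whether a prediction matches a ground-truth box. The function names, the box format, and the 0.5 threshold are assumptions for this example, not Masterful's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, iou_threshold=0.5):
    """A prediction counts only if the class matches AND the boxes overlap enough."""
    same_class = pred["class"] == truth["class"]
    return same_class and iou(pred["box"], truth["box"]) >= iou_threshold


truth = {"class": "car", "box": (0, 0, 10, 10)}
good = {"class": "car", "box": (0, 0, 10, 5)}          # right class, enough overlap
shifted = {"class": "car", "box": (5, 5, 15, 15)}      # right class, poor overlap
wrong = {"class": "pedestrian", "box": (0, 0, 10, 10)}  # perfect box, wrong class

print(is_true_positive(good, truth))
print(is_true_positive(shifted, truth))
print(is_true_positive(wrong, truth))
```

AP for a class is then the area under the precision-recall curve built from matches like these, and mAP averages AP across classes. In detection confusion matrices, a prediction with no matching ground-truth box, or a ground-truth box with no matching prediction, is commonly tallied against the "background" class.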

Masterful compared to TensorFlow Object Detection API and Detectron2

Most implementations of object detection do not start from scratch, but instead start with the TensorFlow Object Detection API (TFODAPI) or Detectron2. Both support many modern model architectures.

While those tools excel at providing model architectures, the training and hyperparameter configuration is difficult, to say the least. The config files are very long, and while you can control anything you want, you basically need to understand all the underlying code. The common practice is to just take the defaults and give up on trying to configure anything.

By comparison, Masterful brings its data-centric and AutoML thinking to the problem of detection. The config YAML file has only 12 lines to fill out, mostly pointers to file locations. Under the hood, the Masterful CLI takes care of figuring out all of the hyperparameters that TFODAPI and Detectron2 would force you to handle. You can even improve the accuracy of your models with unlabeled data, thanks to our novel extensions of semi-supervised learning techniques to the task of object detection.

BTW, Masterful's Model Zoo includes all the object detection model architectures from the TensorFlow Object Detection API. Here's a complete list of model architectures.

Semantic Segmentation Too

0.5.2 also includes support for semantic segmentation. 

Like object detection, semantic segmentation also answers the question, "where is the object", but it answers the question at a finer level of detail. Rather than simply drawing a bounding box, semantic segmentation figures out the class for every single pixel. You can think of it as drawing the outline of an object and filling it in.
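The difference in detail can be sketched with a tiny hand-made mask. The grid, the class IDs, and the helper names below are hypothetical and chosen only for illustration; the box derived here uses inclusive pixel indices.

```python
# Hypothetical 5x6 segmentation mask: one class ID per pixel.
# 0 = background, 1 = car (class IDs are assumptions for this sketch).
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]


def bounding_box(mask, class_id):
    """Tightest inclusive (x_min, y_min, x_max, y_max) box around a class's pixels."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value == class_id
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def pixel_count(mask, class_id):
    """Number of pixels labeled with the given class."""
    return sum(value == class_id for row in mask for value in row)


box = bounding_box(mask, 1)
box_area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)

print(box)
print(pixel_count(mask, 1), "of", box_area, "box pixels belong to the car")
```

The bounding box can be recovered from the mask, but the mask cannot be recovered from the box: the box also covers the corner pixels that the mask correctly labels as background.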

Semantic segmentation does not distinguish between instances of a class. That task is called instance segmentation; if you are interested in it, please join our Slack channel and let us know!

Here's a handy diagram explaining the difference, taken from Kaiming He's tutorial on Mask R-CNN, presented at ICCV in 2017.

Get Started

Masterful is available for free for personal, academic, and trial use. Just run `pip install -U masterful` in your terminal to get the latest release, then head over to the example application to start training your first object detection model. Also, join our Slack community: we're there to answer your questions and would love to hear your feedback and ideas on this latest release!



