Since we launched Custom Training, we've seen a ton of interesting use cases and custom visual recognition models being built on our platform, from flower recognition models, to shoe recognition models, and even muppet recognition models! But the number one question we've gotten from users is, "How do I make my custom model more accurate?" Well, we've been hard at work coming up with a solution, and we're proud to introduce the Model Evaluation tool for Custom Training. Here's how it works!

How it works

Model Evaluation runs a 5-split cross validation on the data used to train your custom model. If you’re wondering what that means, here is a graphic that illustrates it.

We take all the training data that you’ve given us for your custom trained model and split it into 5 parts. Next, we set aside 1 part (20% of the data) as a test set and use the remaining 80% to train a new model.

Once that model is created, we use it to make predictions against the Test set. We then compare these predictions against the actual labels given for the inputs. After that, we repeat this process for each of the other parts, so that every part serves as the Test set once.
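If you like to see ideas in code, here is a minimal sketch of the same split, train, and test loop in plain Python. It is only an illustration of the general technique, not Clarifai's actual implementation: the single-number "feature" per input, the toy cut-off "model", and the helper names (split_indices, train_threshold_model, cross_validate) are all assumptions made up for this example.

```python
import random
from typing import Callable, List, Sequence, Tuple

# (feature, label): a hypothetical stand-in for an image and its concept.
Example = Tuple[float, str]


def split_indices(n: int, k: int = 5, seed: int = 0) -> List[List[int]]:
    """Shuffle the indices 0..n-1 and deal them into k roughly equal parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_threshold_model(train: Sequence[Example]) -> Callable[[float], str]:
    """Toy 'training': place a cut-off halfway between the mean feature of each label."""
    open_vals = [x for x, y in train if y == "open"]
    closed_vals = [x for x, y in train if y == "closed"]
    cutoff = (sum(open_vals) / len(open_vals) + sum(closed_vals) / len(closed_vals)) / 2
    return lambda x: "open" if x > cutoff else "closed"


def cross_validate(data: Sequence[Example], k: int = 5) -> List[float]:
    """Return the accuracy measured on each of the k held-out test sets."""
    accuracies = []
    for test_idx in split_indices(len(data), k):
        held_out = set(test_idx)
        # Train on the other k-1 parts (80% of the data when k is 5).
        train = [data[i] for i in range(len(data)) if i not in held_out]
        test = [data[i] for i in test_idx]
        model = train_threshold_model(train)
        # Compare predictions against the actual labels of the held-out part.
        correct = sum(1 for x, y in test if model(x) == y)
        accuracies.append(correct / len(test))
    return accuracies


# Made-up, cleanly separated data: 10 "closed" and 10 "open" examples.
data = [(0.1 * i, "closed") for i in range(10)] + [(2.0 + 0.1 * i, "open") for i in range(10)]
scores = cross_validate(data)
print(scores)
```

Each entry in scores comes from a model that never saw the inputs it was tested on, which is what makes the numbers a fair estimate of how the model handles new data.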

Let’s try it out!

What better way to introduce this feature than by creating a simple model and using Model Evaluation to make it better?! Let’s begin by creating a model using our Portal UI, a visual way to build and train models. If you’re not familiar with it, you can get more info here or here.

The first couple of steps below will just quickly walk us through creating a model. If you already have one, feel free to skip to Step Four.

Step One: Sign up with Clarifai and create an application

If you haven’t already done so, you can quickly sign up here:

After the signup process, go to your dashboard and create an application.

I’ll be calling my application “modelEvaluation”. This application will house our model.

Step Two: Go to our Explorer UI

Next, click on the "View In Explorer" option in the options menu at the bottom right of the application:

Step Three: Build and train a custom model

For this scenario, we are going to quickly build a custom model to run our evaluation tool against. The model that I’ll be building will look at a photo of a garage door and tell us whether it’s open or closed.

1. Add Inputs (Images)

Once you complete Step Two, you'll be taken to this page, which will let us add some of our own images. You can either use local files on your machine or URLs:

2. Create Model & Concepts

After that, make sure you create a model and concepts with the menu on the left side. In this case, I created the model “garage_eval” with the concepts “open” and “closed”.

3. Label each image

Next, let’s go through each photo and label it with one of the concepts. In the video below, we select all of the "Open" photos and add the concept to them via the "Add Concepts" button. You can then repeat this for all of the "Closed" pictures.

4. Train the Model

All we have to do now is click on the model name in the top left corner and then click on the "Train Model" button at the top of the ensuing screen. Once the model is trained, we can run evaluations on it!

Step Four: Run Model Evaluation

On the same page that you trained your model on, you'll see two tabs, "Concepts" and "Versions":

1. Click on the Versions Tab

2. And then click on the Evaluate button

This screen lists all the versions of your model. You can evaluate any trained version by clicking on the Evaluate button. Note that versions that haven't been trained yet do not have this button.

Step Five: Interpreting Results

Once the evaluation completes, the “Evaluate” button will turn into a “View” button. Click to view the evaluation results, which should look similar to this:

The results are shown in 3 main parts: Evaluation Summary Table, Concept by Concept Matrix, and Selection Details. The Evaluation Summary Table shows how the model performed when it predicted against the test set in 1 of the 5 splits. This is why the total number of labeled inputs in this table is around 20% of the size of the original training set you used. Feel free to adjust the threshold bar to find the right balance between recall and precision.

In this example, you can quickly see from the results above that our model needs more data to give stronger predictions for open garage doors in certain instances. The model gave a picture of a slightly open garage door a low probability score for the “open” concept. You can see this particular input in the Selection Details section. To improve the model, we would want to add more images of partially open doors labeled as “open” so the model can start to recognize those images as the “open” concept.
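To get a feel for what moving the threshold bar does, here is a small sketch that computes precision and recall for a single concept at a few thresholds. The scores are invented for illustration (they are not taken from the garage door model), the precision_recall helper is a hypothetical name, and the formulas are the standard definitions rather than a description of how the tool computes its table.

```python
from typing import List, Tuple


def precision_recall(scored: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """Each item is (predicted probability for a concept, whether the input is truly labeled with it)."""
    tp = sum(1 for p, actual in scored if p >= threshold and actual)      # correctly predicted
    fp = sum(1 for p, actual in scored if p >= threshold and not actual)  # predicted but wrong
    fn = sum(1 for p, actual in scored if p < threshold and actual)       # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up scores for the "open" concept on a tiny test set.
# The 0.35 entry plays the role of a slightly open door that scores low.
open_scores = [(0.95, True), (0.80, True), (0.35, True),
               (0.60, False), (0.10, False), (0.05, False)]

for threshold in (0.3, 0.5, 0.9):
    p, r = precision_recall(open_scores, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

With these numbers, raising the threshold from 0.3 to 0.9 lifts precision from 0.75 to 1.00 but drops recall from 1.00 to 0.33, which is the trade-off the threshold bar lets you explore.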

For a detailed breakdown on how to interpret the results and best practices around building your custom model, check out our docs.

We want to hear from you!

If you have any other questions or thoughts on this blog post, the Model Evaluation tool, or Clarifai in general, feel free to reach out! We look forward to hearing from you.
