Custom Training is about teaching computers to see the world in a way that is specific to your own content and context. 

"Specific to your own content" can mean a variety of things:

  • Specificity: 🍎  A custom apple model if you need to predict items like 'Pink Lady' or 'Red Delicious'. (Fine-grained classification of objects)
  • Taxonomy: 🚆  A custom Transportation Model if you have a long-running taxonomy containing 'locomotive' or 'bus'. (Ensuring there is no unnecessary mapping or confusion)
  • Subjective: 🍂  A custom Foliage Model if you need a model to classify images that fit your 2019 Fall brand style guide. (Content recommendations, preferences and filtering, and identification applied to style, brand guidelines, and user behavior)

With your own training data, taxonomy and API endpoint, you will have a precise understanding and organization of your visual content. 

This visual content may be user-generated photos, existing content within an internal DAM, untagged partner imagery, or data scraped from the web. With Clarifai Custom Training APIs and web products, that visual content becomes actionable in whichever way best serves your business, app or workflow.

You can think of Custom Training as a series of inputs where you ultimately teach a neural network what concept_1 is and what it is not.


Since Clarifai's founding in early 2013, we have been focused on enabling developers to understand any image or video in the world. To do this, we develop and expose advanced prediction APIs that abstract all the complexity away from neural network training, data aggregation, hosting, evaluation and re-training.

These prediction-focused APIs (General, Food, Travel, NSFW, Apparel, etc.) are currently available and live here. They are pre-trained by our data and research teams over the course of months with millions of images. We define the training data, taxonomy (the model's 'concepts') and other performance characteristics.

With Custom Training, some of those important training and building decisions belong to you, while you can still leverage our infrastructure, expertise and complementary products.

We are particularly excited to democratize access to this package of technologies with the same renewed focus on speed, ease of use and straightforward commercial terms. Over the last few years we have seen an increasingly complex and niche set of recognition requests. While we would love to service all of our inbound requests, it would be too daunting a task to execute all of them in-house. Thanks to Custom Training, every app developer and product leader has powerful recognition solutions in their own hands.

All Custom Training users end up with a private API endpoint, so it's really up to the imagination as to what to do with it. Several real-world examples are included below to get you thinking.

Whether it's best-in-class pre-trained off-the-shelf APIs or your own Custom Training model, we are here to extend the product, tech and data resources needed to support you.


User Interface Links

Web Interface
Walkthrough Guide

API Links

The V2 API
5 Official API Clients (JavaScript, Python, Java, C#, PHP)
Quick Start Code Examples

What you should know

If your content is visually distinct and easy to identify, 25-50 positive examples per concept will provide robust and accurate predictions. A 'concept' is synonymous with 'tag', 'category' or 'keyword'. Concepts are your business' world view as to what an object, visual pattern, or style may represent. 

For reference, our General model contains 11,000 concepts. Custom models are typically much smaller, resembling taxonomies like moderation guides, apparel sorting and classifieds categorization.

See Glossary

The most important step in Custom Training is having a focus on building robust and accurate concepts, as the performance of your custom model is only as good as your underlying concepts are. Training isn't a magical step; it should be seen as a necessary technical process that combines raw ingredients (concepts) in a powerful way.

What makes for a well built concept?

  • Accurate labels. Mislabeled images introduce noise into your model and can lead to weak or confusing predictions.
  • Balanced training data. Skewed training sets where several concepts have 5-20x as many positive training images as others may affect model performance. 
  • Matching training and prediction context. It's crucial that your training images for your concepts resemble the conditions and context of imagery you'll be making predictions on.

As an example, a flower identification model trained solely on stock photography is likely to perform poorly when asked to predict on user-generated smartphone photos.
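The "balanced training data" point can be checked before training with a few lines of code. This is only a minimal sketch: the function name, the sample counts and the default threshold of 5x (taken from the low end of the 5-20x range mentioned above) are illustrative assumptions, not part of any Clarifai API.

```python
from typing import Dict, List


def find_imbalanced_concepts(
    positive_counts: Dict[str, int], max_ratio: float = 5.0
) -> List[str]:
    """Return concepts with at least max_ratio times as many positives as the smallest concept."""
    if not positive_counts:
        return []
    smallest = min(positive_counts.values())
    if smallest == 0:
        # A concept with no images makes every non-empty concept over-represented.
        return sorted(name for name, n in positive_counts.items() if n > 0)
    return sorted(
        name for name, n in positive_counts.items() if n / smallest >= max_ratio
    )


# Hypothetical counts of positive training images per concept.
counts = {"rose": 40, "tulip": 35, "daisy": 400}
print(find_imbalanced_concepts(counts))  # ['daisy']
```

A flagged concept does not have to be fixed by deleting images; adding positives to the smaller concepts works just as well.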

Custom Training works on a variety of content types.

Successful implementations span a wide range of use cases: document categorization, plant and flower identification, apparel classification, user-generated content filtering, industrial abnormality detection, ad listing moderation, etc.

A training image can have multiple concepts. So a picture of a yacht in the harbor could be trained as


for simple categorization


'yacht' 'boat' 'ocean' 'water' 'shoreline' 'sunset' 'fishing' 'recreation'

for more general search purposes.
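One way to picture a multi-concept training image is as a mapping from an image to its list of concepts. The sketch below uses the broader yacht concept list from above; the file name and the helper function are hypothetical and only illustrate the data structure, not how Clarifai stores labels.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical file name; the concepts are the ones listed above.
training_labels: Dict[str, List[str]] = {
    "yacht_harbor.jpg": [
        "yacht", "boat", "ocean", "water",
        "shoreline", "sunset", "fishing", "recreation",
    ],
}


def images_per_concept(labels: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert image -> concepts into concept -> images."""
    index = defaultdict(list)
    for image, concepts in labels.items():
        for concept in concepts:
            index[concept].append(image)
    return dict(index)


print(images_per_concept(training_labels)["sunset"])  # ['yacht_harbor.jpg']
```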

An API prediction response from your custom model will return all concepts and their respective confidence scores, ranging from 0 to 100%.
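As a rough illustration of working with those scores, the sketch below filters and sorts concepts by confidence. The response shape (an "outputs" list whose "data" holds "concepts" with a "name" and a fractional "value") is an assumption made for this example, as are the sample values and the function name; check the API documentation for the exact format your client returns.

```python
from typing import Any, Dict, List, Tuple


def top_concepts(
    response: Dict[str, Any], min_percent: float = 50.0
) -> List[Tuple[str, float]]:
    """Return (name, confidence in percent) pairs at or above min_percent, highest first."""
    concepts = response["outputs"][0]["data"]["concepts"]
    scored = [(c["name"], round(c["value"] * 100, 1)) for c in concepts]
    kept = [pair for pair in scored if pair[1] >= min_percent]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hand-written sample response; not real API output.
sample = {
    "outputs": [
        {
            "data": {
                "concepts": [
                    {"name": "boat", "value": 0.85},
                    {"name": "sunset", "value": 0.12},
                    {"name": "yacht", "value": 0.97},
                ]
            }
        }
    ]
}
print(top_concepts(sample))  # [('yacht', 97.0), ('boat', 85.0)]
```

The 50% cutoff is arbitrary; the right threshold depends on how costly false positives are in your workflow.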

Custom models currently support several hundred concepts.

It is important to help distinguish two concepts from each other by labeling a training image as a negative example of what it is not.
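A labeled image can be thought of as carrying two sets of concepts: the ones it is a positive example of and the ones it is explicitly not. The class below is a hypothetical bookkeeping sketch (the names LabeledImage, positives and negatives are invented here) that also catches the mistake of marking the same concept both ways.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class LabeledImage:
    path: str
    positives: Set[str] = field(default_factory=set)  # concepts the image shows
    negatives: Set[str] = field(default_factory=set)  # concepts the image is NOT

    def conflicts(self) -> Set[str]:
        """Concepts mistakenly labeled as both positive and negative."""
        return self.positives & self.negatives


# Hypothetical image: a positive for concept_1 and a negative for concept_2.
img = LabeledImage("example_01.jpg", positives={"concept_1"}, negatives={"concept_2"})
print(img.conflicts())  # set()
```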

Re-training time varies with your total number of labeled images, but it typically takes just a few seconds, even with tens of thousands of labeled images in your application.

There is no monthly cost associated with hosting 10,000 images in an application, predicting up to 5,000 times with your API and building one Model that contains up to 10 concepts. Additional pricing details can be found here.
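To see whether a planned project stays within those no-cost limits, the figures can be written down and compared against expected usage. This is a sketch only: the dictionary keys and function name are made up for illustration, and the numbers simply restate the limits quoted above, which may change.

```python
from typing import Dict, List

# Limits as quoted in the paragraph above; key names are illustrative.
FREE_TIER_LIMITS: Dict[str, int] = {
    "images": 10_000,
    "predictions": 5_000,
    "models": 1,
    "concepts_per_model": 10,
}


def exceeded_limits(
    usage: Dict[str, int], limits: Dict[str, int] = FREE_TIER_LIMITS
) -> List[str]:
    """Return the names of limits that the given usage goes over."""
    return sorted(key for key, limit in limits.items() if usage.get(key, 0) > limit)


planned = {"images": 12_000, "predictions": 3_000, "models": 1, "concepts_per_model": 10}
print(exceeded_limits(planned))  # ['images']
```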

Many successful implementations use both the General domain model (11,000 concepts) and a client's own Custom Model. The General model provides a broad first layer of intelligence about what is in the media. Relevant concepts returned in a General model response can then trigger a Custom model prediction in your internal workflow.

General: "dog", "canine", "pet"

Custom dog breed model: "German Shepherd"
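The two-stage flow above can be sketched as a small routing function. Everything here is a placeholder: route_to_custom_model is a hypothetical helper, and the lambda stands in for whatever call your API client uses to query the custom model.

```python
from typing import Callable, Iterable, List, Optional, Set


def route_to_custom_model(
    general_concepts: Iterable[str],
    trigger_concepts: Set[str],
    custom_predict: Callable[[], List[str]],
) -> Optional[List[str]]:
    """Run the custom model only if the General model returned a trigger concept."""
    if trigger_concepts.intersection(general_concepts):
        return custom_predict()
    return None  # no trigger concept present, so skip the custom prediction


general = ["dog", "canine", "pet"]
# Stand-in for a real custom-model prediction call.
breeds = route_to_custom_model(general, {"dog"}, lambda: ["German Shepherd"])
print(breeds)  # ['German Shepherd']
```

Gating the custom model this way avoids spending predictions on images the General model already shows to be irrelevant.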

The training images and the resulting model API you build are private to you; no other Clarifai users or customers can access your training images, concept names, model spec, etc.

Real world solutions

Below is a sampling of real-world problems and features that clients are currently using Custom Training for.

Use Case: Categorization

An analytics platform dedicated to serving brands and social media influencers wanted to identify 'flat lay' photos within the Instagram accounts of consumer brands. 

Use Case: Identification

A mobile consumer app focused on being able to recognize common plants and flowers.

Use Case: Categorization

A large real estate listings platform focused on blocking imagery that is prohibited within their terms of service.

Use Case: Categorization

An insurance company removing the need for human moderation of user-submitted content within their fitness rewards app by doing real-time verification. 

Use Case: Recommendations

An online apparel marketplace needing to predict how closely newly uploaded products matched their brand guidelines for the featured homepage section.  

Use Case: Categorization

A national home improvement retailer needing to accurately categorize incoming product imagery from ad agency partners that contained no metadata, description or labels.

Use Case: Identification

An aerial data collection firm needing to identify key physical landmarks, which significantly reduced its reliance on human-annotated drone videos.

Use Case: Categorization

An international mobile classifieds app using a 40-concept model for real-time understanding and categorization of user-uploaded images.

Use Case: Categorization

A digital market research firm using real-time filtering and categorization of user-submitted content to provide instant reports to its international CPG clients.

Use Case: Categorization

A home insurance firm ensuring that their mobile app claims submission can immediately identify certain damage conditions. 

Train your first model

Check out our Walkthrough to learn how to do this with our User Interface!

What's next?

Advanced features such as counting, custom brand logo training and custom face recognition are in the research phases. Let us know if that interests you. You can count on even more publicly available domain models to be exposed in the coming months.

Additional resources

Community Page
Learn from fellow users, get access to alpha products, discover other use cases

Commercial questions and discussions

Technical questions, suggestions and clarification 

Happy Training!
