One of the transformative powers of multimodal AI is the ability to translate text in images to machine-encoded text. Clarifai makes it easy to do this with the Visual Text Recognition (VTR) workflow.
With the VTR workflow, the text in an image is detected and then converted into standard machine-readable text that any computer can process.
Getting Started
First things first. You will need to set up a Clarifai account and create an application. This how-to article shows you how to recognize text in images with the Clarifai API and through Portal. If you would like to recognize text via the API, you will also need to generate an API key.
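The API snippets later in this article authenticate by sending the key as `authorization` metadata in the form `Key <your key>`. As a minimal sketch, the helper below builds that tuple from an environment variable instead of hard-coding the key. The function name `build_auth_metadata` and the variable name `CLARIFAI_API_KEY` are assumptions made for this illustration, not names defined by Clarifai.

```python
import os


def build_auth_metadata(api_key=None):
    """Return the gRPC metadata tuple used to authenticate a request.

    The key is taken from the argument, or from the CLARIFAI_API_KEY
    environment variable (a hypothetical name chosen for this sketch).
    """
    key = api_key or os.environ.get("CLARIFAI_API_KEY", "")
    if not key:
        raise ValueError("No API key supplied and CLARIFAI_API_KEY is not set")
    # Same shape as the metadata used in the snippets in this article.
    return (("authorization", f"Key {key}"),)
```

The returned value can be passed as `metadata=` wherever a snippet builds the tuple by hand, which keeps the key out of source control.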
Recognize image-text in Portal
You can do almost anything that Clarifai can do with Clarifai Portal, and we work hard to make Portal the world's easiest interface for using AI. Recognizing text in images with Portal is as simple as uploading your data and choosing the right base workflow.
Create your application and choose your base workflow
Simply log in to Clarifai Portal and create a new application. To use the Visual Text Recognition workflow, select "VTR" as your base workflow.
View your image in Explorer view
Upload an image to your application and view the predictions in the right-hand sidebar, under the tab labeled "App Workflow".
Recognize image-text in a local image via API
Use the following Python snippet as an example of how to run a prediction on an image hosted on your local computer. For more details and information on working with predictions in our other client languages, please refer to our API documentation.
from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import service_pb2, service_pb2_grpc, resources_pb2
from clarifai_grpc.grpc.api.status import status_code_pb2

# Open a channel to the Clarifai API and create the client stub.
stub = service_pb2_grpc.V2Stub(ClarifaiChannel.get_grpc_channel())

# This is how you authenticate.
metadata = (('authorization', 'Key {{YOUR_CLARIFAI_API_KEY}}'),)

# Read the local image as raw bytes.
with open("{{YOUR_IMAGE_FILE_LOCATION}}", "rb") as f:
    file_bytes = f.read()

request = service_pb2.PostModelOutputsRequest(
    model_id='9fe78b4150a52794f86f237770141b33',
    inputs=[
        resources_pb2.Input(
            data=resources_pb2.Data(
                image=resources_pb2.Image(base64=file_bytes)
            )
        )
    ])
response = stub.PostModelOutputs(request, metadata=metadata)

if response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Request failed, status code: " + str(response.status.code))

# Print the recognized text. This assumes the model returns detected text as
# regions; a classification model would populate data.concepts instead.
for region in response.outputs[0].data.regions:
    print(region.data.text.raw)
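Once a response comes back, the recognized text still has to be pulled out of it. The sketch below assumes that text-recognition output arrives as a list of regions on each output, each carrying its string in `data.text.raw`; that shape is an assumption about the VTR response and should be checked against the API documentation. The helper `extract_text` and the stand-in objects are hypothetical and exist only so the logic can be tried without a network call.

```python
from types import SimpleNamespace


def extract_text(response):
    """Collect the raw text of every region in every output, in order.

    Assumes each output exposes data.regions, and each region exposes
    data.text.raw (an assumed response shape, not confirmed by this article).
    """
    lines = []
    for output in response.outputs:
        for region in output.data.regions:
            raw = region.data.text.raw
            if raw:  # skip regions with no recognized text
                lines.append(raw)
    return lines


def _region(raw):
    # Stand-in for a region object, mimicking the assumed attribute layout.
    return SimpleNamespace(data=SimpleNamespace(text=SimpleNamespace(raw=raw)))


# Offline stand-in for an API response with three regions, one of them empty.
fake_response = SimpleNamespace(
    outputs=[
        SimpleNamespace(
            data=SimpleNamespace(
                regions=[_region("STOP"), _region(""), _region("Main St")]
            )
        )
    ]
)

print(extract_text(fake_response))  # ['STOP', 'Main St']
```

With a real response, `extract_text(response)` would be called right after the status check.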
Recognize image-text in images hosted on the web
Here is an example of how to run a prediction on an image that is hosted at a URL. This snippet is in Python, but we offer support for many other client languages. Please refer to our API documentation for additional information.
from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import service_pb2, service_pb2_grpc, resources_pb2
from clarifai_grpc.grpc.api.status import status_code_pb2

# Open a channel to the Clarifai API and create the client stub.
stub = service_pb2_grpc.V2Stub(ClarifaiChannel.get_grpc_channel())

# This is how you authenticate.
metadata = (('authorization', 'Key {{YOUR_CLARIFAI_API_KEY}}'),)

request = service_pb2.PostModelOutputsRequest(
    model_id='9fe78b4150a52794f86f237770141b33',
    inputs=[
        resources_pb2.Input(
            data=resources_pb2.Data(
                image=resources_pb2.Image(url='{{YOUR_IMAGE_URL}}')
            )
        )
    ])
response = stub.PostModelOutputs(request, metadata=metadata)

if response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Request failed, status code: " + str(response.status.code))

# Print the recognized text. This assumes the model returns detected text as
# regions; a classification model would populate data.concepts instead.
for region in response.outputs[0].data.regions:
    print(region.data.text.raw)
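The local-file and URL snippets differ only in how the image is handed to `resources_pb2.Image`: raw bytes in `base64` for a file, or a string in `url` for a hosted image. A small helper can pick the right field from a single location string. This is a sketch only; the name `image_kwargs` is hypothetical, and it treats anything that is not an `http` or `https` URL as a local path.

```python
from pathlib import Path
from urllib.parse import urlparse


def image_kwargs(location):
    """Return the keyword argument for resources_pb2.Image for `location`.

    http(s) locations are passed through as `url`; anything else is read
    from disk and passed as raw bytes in `base64`.
    """
    scheme = urlparse(location).scheme.lower()
    if scheme in ("http", "https"):
        return {"url": location}
    # Not a web URL: treat it as a path on the local file system.
    return {"base64": Path(location).read_bytes()}
```

In either snippet, the image could then be built as `resources_pb2.Image(**image_kwargs(location))`, so the same request code serves both cases.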