AI In Action

Last updated: Jul-21-2025

Cloudinary Image (Programmable Media) offers an array of AI-powered features that enable you to effortlessly transform, manage, and moderate your images and videos.

Whether you want to enhance visual appeal, streamline content analysis, or automate moderation, these AI capabilities can help. This page covers generative AI transformations, AI content analysis, and AI-driven video playback features that you can use to create, refine, and moderate your media.

Notes

Some of the features require you to register for an add-on. Once registered, you can try them out for free.
There are special transformation counts for some of the features.

Contents	Description
Generative AI transformations	Creatively transform your images, using AI to automatically generate pixels that integrate seamlessly into the picture. Use these transformations to extend your images to new dimensions, replace backgrounds, remove, replace or recolor items, or restore degraded images.
AI content analysis for transformations	Transform your images and videos based on their content. Ensure that you keep the content that matters to you when cropping your media and removing backgrounds. Leverage Cloudinary's understanding of your content to create video previews, apply drop shadows, apply different artistic styles and more.
AI content analysis for management	Save hours of manual image analysis by using AI for tagging and moderation. Auto-tag your assets to categorize and organize them, and to make them easier to find within your product environment. Automatically moderate your assets based on their content to check for inappropriate images and videos.
AI video playback features	Use AI to enhance your users' video playback experience. Transcribe and translate videos with ease, then add captions and subtitles to them. Discover the most interesting parts of your videos and display a visual representation in the Video Player seek bar.

Generative AI transformations

Generative background replace

Generative background replace uses AI to generate new backgrounds for images. Customize the background with a prompt or let AI generate it based on the image content. Place your products in different environments to appeal to more potential buyers, or simply enable content creativity programmatically.
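As a rough illustration, the following Python sketch builds a delivery URL that requests a generated background, with or without a prompt. It assumes the `e_gen_background_replace` effect name and its `prompt_` option from Cloudinary's transformation URL syntax; the helper name, the `demo` cloud name, and the `shoes.jpg` public ID are hypothetical placeholders, so check the transformation reference before relying on it.

```python
from urllib.parse import quote


def background_replace_url(cloud_name, public_id, prompt=None):
    """Build a delivery URL that asks for a generated background.

    Assumption: the effect is named e_gen_background_replace and takes an
    optional prompt_<text> option, as in Cloudinary's URL syntax.
    """
    effect = "e_gen_background_replace"
    if prompt:
        # Percent-encode the prompt so spaces are safe inside a URL path.
        effect += ":prompt_" + quote(prompt, safe="")
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{effect}/{public_id}"


# Hypothetical cloud name and public ID, for illustration only.
print(background_replace_url("demo", "shoes.jpg"))
print(background_replace_url("demo", "shoes.jpg", prompt="a sunny beach"))
```

Omitting the prompt leaves the choice of background to the AI, as described above.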

Try it on your own images

Note

You can also try the FinalTouch: AI-powered background generator product.

Generative fill

Generative fill, used with various cropping methods, uses AI to extend an image beyond its original borders, which helps when changing its orientation. The AI-generated areas blend seamlessly with the existing content. Because the transformation is controlled programmatically, it can reduce manual editing time and let you produce creative variations at scale.
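A minimal sketch of how such a request might look as a delivery URL is shown below. It assumes that generative fill is requested with the `b_gen_fill` background value alongside a padding crop (`c_pad`) and a target aspect ratio; the helper name, cloud name, and public ID are hypothetical.

```python
def generative_fill_url(cloud_name, public_id, aspect_ratio, width):
    """Build a URL that pads an image to a new aspect ratio and fills
    the added area with AI-generated pixels.

    Assumption: b_gen_fill combined with c_pad is the relevant syntax.
    """
    params = ",".join([
        f"ar_{aspect_ratio}",  # target aspect ratio, e.g. "16:9"
        "b_gen_fill",          # fill the padding with generated content
        "c_pad",               # pad rather than crop to reach the ratio
        f"w_{width}",          # output width in pixels
    ])
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{params}/{public_id}"


# Turn a hypothetical portrait image into a 16:9 landscape one.
print(generative_fill_url("demo", "portrait.jpg", "16:9", 1500))
```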

Try it on your own images

Watch a video tutorial

How are our customers using generative fill?

Generative recolor

Generative recolor enables color alterations in images using natural language, through AI and NLP. This feature simplifies creating color variants, especially beneficial for e-commerce products, by allowing color changes at scale via API.
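To illustrate creating color variants at scale, this sketch generates one URL per target color. It assumes the `e_gen_recolor` effect with `prompt_` and `to-color_` options separated by semicolons; the helper name, cloud name, and public ID are made up for the example.

```python
from urllib.parse import quote


def recolor_url(cloud_name, public_id, item, color):
    """Build a URL that recolors the described item.

    Assumption: the effect is e_gen_recolor:prompt_<item>;to-color_<color>.
    """
    effect = f"e_gen_recolor:prompt_{quote(item, safe='')};to-color_{color}"
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{effect}/{public_id}"


# One URL per color variant of a hypothetical product image.
for color in ["green", "navy", "maroon"]:
    print(recolor_url("demo", "sweater.jpg", "sweater", color))
```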

Try it on your own images

Generative remove

Generative remove effortlessly eliminates unwanted objects, text, or user-defined regions from images, providing a valuable capability across various industries. The feature is accessible via Cloudinary's APIs, enabling scalable object removal tasks which traditionally would require significant time and effort.

Try it on your own images

How are our customers using generative remove?

Generative replace

Generative replace uses AI to replace objects within images with alternative objects or images, while maintaining a natural look. This feature allows for creative or functional alterations in images, enhancing the versatility and usage of your media assets.
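The sketch below shows one way to express a replacement as a delivery URL. It assumes the `e_gen_replace` effect with `from_` and `to_` options; the helper name and the sample values are hypothetical.

```python
from urllib.parse import quote


def replace_url(cloud_name, public_id, from_item, to_item):
    """Build a URL that replaces one described object with another.

    Assumption: the effect is e_gen_replace:from_<item>;to_<item>.
    """
    effect = (
        "e_gen_replace:"
        f"from_{quote(from_item, safe='')};"
        f"to_{quote(to_item, safe='')}"
    )
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{effect}/{public_id}"


# Hypothetical example: swap a sweater for a leather jacket.
print(replace_url("demo", "model.jpg", "sweater", "leather jacket"))
```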

Try it on your own images

Generative restore

Generative restore uses AI to mend image imperfections like compression artifacts, noise, and blurriness. Through a two-step restoration process, it recovers lost details and refines the image, enhancing the clarity and quality of old or damaged photos and user-generated content.

Try it on your own images

Watch a video tutorial

How are our customers using generative restore?

AI content analysis for transformations

Content analysis for resizing and cropping

Smart cropping uses AI to focus on the most significant regions of images and videos, so that viewers see the important content whatever device or browser they use. Automating the crop in this way keeps critical content in frame while adapting the media to different display requirements.
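As a sketch, smart cropping can be requested by combining a fill crop with automatic gravity. The example assumes the `c_fill` and `g_auto` parameters and the standard `image` and `video` delivery paths; the helper name, cloud name, and public ID are hypothetical.

```python
def smart_crop_url(cloud_name, public_id, width, aspect_ratio, resource_type="image"):
    """Build a URL that crops to the given aspect ratio while letting
    the service choose the region to keep.

    Assumption: c_fill with g_auto selects content-aware cropping.
    """
    params = f"ar_{aspect_ratio},c_fill,g_auto,w_{width}"
    return f"https://res.cloudinary.com/{cloud_name}/{resource_type}/upload/{params}/{public_id}"


# Square crop of a hypothetical video for a mobile feed.
print(smart_crop_url("demo", "skater.mp4", 400, "1:1", resource_type="video"))
```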

Try it on your own videos

Watch a video tutorial

If you know what you expect to see in an image, you can use more specific content aware cropping, such as object-detection based cropping, text-detection based cropping or face-detection based cropping, even to the level of facial attributes.

Try it on your own images

Watch a video tutorial

The upscale transformation utilizes super resolution to enhance the quality of images when upscaling them, making low-resolution images appear clearer and sharper. This is particularly useful when high-resolution images are required but only lower-resolution images are available. The transformation improves image details, making them suitable for various uses without compromising on visual quality.

Try it on your own images

Watch a video tutorial

Content analysis for enhancing images

AI image enhancement harnesses AI to automatically analyze and improve image quality. Key features include correcting overexposure, enhancing underexposed areas, intensifying colors, and adjusting color temperature for a balanced, vibrant, and true-to-life visual experience. This effect seamlessly enhances image appeal while maintaining natural quality, ideal for refining visual content across diverse applications.

Try it on your own images

Watch a video tutorial

Content analysis for displaying product images

The background removal transformation dynamically extracts the foreground subject in images while removing the background on the fly. This is useful for creating uniform product images, or isolating subjects from distracting backgrounds.

Try it on your own images

Watch a video tutorial

The drop shadow effect employs AI to apply realistic shadows to objects within an image, which is useful especially for product images where background removal has been used. By specifying the light source position and spread, you can control the appearance of the shadow, creating a more natural or dramatic effect as needed. This effect enhances the visual depth and distinction of images.
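The following sketch chains background removal with a drop shadow, which is the product-image case described above. It assumes the `e_background_removal` effect and an `e_dropshadow` effect with `azimuth`, `elevation`, and `spread` options, joined as separate components with `/`; the helper name and sample values are hypothetical.

```python
def product_shot_url(cloud_name, public_id, azimuth, elevation, spread):
    """Build a URL that removes the background, then adds a shadow.

    Assumption: chained components are separated by "/" and the shadow
    options use the form azimuth_<n>;elevation_<n>;spread_<n>.
    """
    shadow = f"e_dropshadow:azimuth_{azimuth};elevation_{elevation};spread_{spread}"
    chain = "/".join(["e_background_removal", shadow])
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{chain}/{public_id}"


# Light source position and spread are illustrative values only.
print(product_shot_url("demo", "chair.png", azimuth=220, elevation=40, spread=20))
```

The order of the chain matters here: the shadow is applied to the subject that remains after the background has been removed.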

Try it on your own images

Watch a video tutorial

Content analysis for extracting components of an image

Powered by AI, the extract effect makes it easy to isolate specific parts of an image using simple natural language prompts. Whether you want to highlight a product by removing the background or get creative by focusing on specific elements, this transformation does the work for you. Just state what you want to keep (or remove), and let the magic happen!

Try it on your own images

Content analysis for video previews

The AI-based video preview transformation effect generates video previews automatically by activating deep learning algorithms that identify the most interesting video segments. You can optionally control the length of the generated preview, and the number and duration of the video segments. Video previews can be used to engage your audience and help them select the video content that interests them.
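A minimal sketch of the optional controls mentioned above follows. It assumes the `e_preview` effect with colon-separated `duration_`, `max_seg_`, and `min_seg_dur_` options; the helper name, cloud name, and public ID are hypothetical.

```python
def video_preview_url(cloud_name, public_id, duration=None,
                      max_segments=None, min_segment_duration=None):
    """Build a URL that requests an AI-generated video preview.

    Assumption: options are appended to e_preview with ":" separators.
    Any option left as None is omitted so the service default applies.
    """
    parts = ["e_preview"]
    if duration is not None:
        parts.append(f"duration_{duration}")
    if max_segments is not None:
        parts.append(f"max_seg_{max_segments}")
    if min_segment_duration is not None:
        parts.append(f"min_seg_dur_{min_segment_duration}")
    effect = ":".join(parts)
    return f"https://res.cloudinary.com/{cloud_name}/video/upload/{effect}/{public_id}"


# A 12-second preview made of at most 3 segments of at least 3 seconds each.
print(video_preview_url("demo", "surfing.mp4", duration=12,
                        max_segments=3, min_segment_duration=3))
```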

Try it on your own videos

AI content analysis for management

Content analysis for auto-tagging

Tagging your assets makes them easier to organize and find, but manually tagging your assets can be a tedious and time-consuming task.

There are various auto-tagging add-ons available that automatically add tags to your assets on or after upload to your product environment. Some of the add-ons have broad tagging capabilities, such as the Amazon Rekognition, Google Image, and Imagga auto-tagging add-ons for images, and the Google Video and Microsoft Azure Video Indexer auto-tagging add-ons for videos. You can use these add-ons in conjunction with the Google Translation add-on to translate your tags to different languages.
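As a sketch of requesting tags on upload, the helper below assembles upload options for one add-on. It assumes the upload API accepts a `categorization` value naming the add-on (for example `aws_rek_tagging`) and an `auto_tagging` confidence threshold between 0 and 1; the helper itself and the file name are hypothetical, and the relevant add-on must be registered for your account.

```python
def auto_tagging_options(addon, min_confidence):
    """Return upload options that request automatic tagging.

    Assumption: "categorization" selects the add-on and "auto_tagging"
    is the minimum confidence (0.0 to 1.0) for a tag to be applied.
    """
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0.0 and 1.0")
    return {"categorization": addon, "auto_tagging": min_confidence}


options = auto_tagging_options("aws_rek_tagging", 0.7)
print(options)

# With the Cloudinary Python SDK installed and configured, the options
# could then be passed to an upload call, for example:
#   cloudinary.uploader.upload("local_photo.jpg", **options)
```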

Others are more specific in terms of what they detect, for example you can use the Amazon Rekognition Celebrity Detection add-on to detect celebrities, or the Cloudinary AI Content Analysis add-on to detect objects in a specific object model.

You can also try out the Cloudinary AI Vision add-on to interpret and respond to visual content queries. This is particularly useful for determining if user-generated content is suitable for your site as you can be very specific in what you allow or reject based on components of the image.

Watch a video tutorial

Content analysis for Visual Search

Tags are useful for finding assets, but Visual Search is another powerful AI capability for searching. Describe what you're looking for in text, or provide an image similar to what you want to find. Visual Search looks at the visual content of images, rather than their public ID or metadata.

Content analysis for image captioning

The Cloudinary AI Content Analysis add-on can also be used for AI-based image captioning, whereby an image is analyzed and a caption is suggested based on the image's content. You can use this caption as image metadata or as the alt text for an image, improving your website's accessibility.

Watch a video tutorial

Content analysis for moderating assets

Cloudinary offers various add-ons that provide advanced content moderation, enabling businesses to maintain a safe and compliant online environment for their users.

The Amazon Rekognition AI Moderation add-on leverages Amazon Rekognition's AI to automatically identify and moderate potentially unsafe content in images, suitable for social media platforms and e-commerce websites.
The Amazon Rekognition Video Moderation add-on specializes in video content moderation for video-sharing platforms, ensuring live-streamed and pre-recorded videos comply with guidelines.
The Google AI Video Moderation add-on employs Google's AI technology to assess and moderate user-generated videos, ideal for video-hosting services.
The WebPurify Image Moderation add-on automatically filters out inappropriate images in real-time across various platforms, from social media to e-commerce websites, ensuring adherence to content guidelines and legal standards.

These add-ons collectively empower businesses to automate content moderation, saving time and resources, while also ensuring that their online spaces remain compliant and user-friendly by preventing the dissemination of harmful or inappropriate content.

AI video playback features

Transcription services

Save time and resources transcribing videos in almost any language with the Google AI Video Transcription or the Microsoft Azure Video Indexer add-ons. These add-ons automatically transcribe spoken words in video content, making them an excellent choice for media companies, e-learning platforms, and businesses needing accurate video transcriptions for accessibility and SEO.

AI-based highlights

The Video Player AI-based highlights graph shows a visual representation of the highlights of the video based on how our AI preview algorithm determines the level of interest for each part of the video. Hover over the timeline in the video to see it.
