Accessible media
Last updated: Jun-30-2025
Digital accessibility ensures that everyone, regardless of ability, can access, understand, and engage with online content. Disabilities can be permanent, like colorblindness; temporary, like light sensitivity from a concussion; or situational, such as trying to watch a video in a noisy café. Prioritizing accessible media isn't just about legal compliance; it's about creating inclusive, user-friendly experiences for all.
Different types of media present unique accessibility challenges and opportunities:
- Images and graphics require text alternatives for users with visual impairments, color adjustments for users with color vision deficiencies, and consideration of motion sensitivity for animated content. Complex images like charts and infographics may need extended descriptions to convey their full meaning.
- Videos need captions for users with hearing impairments, audio descriptions for users with visual impairments, and accessible player controls that work with keyboard navigation and screen readers. Motion in videos can also trigger vestibular disorders or seizures in some users.
- Audio content requires transcripts for users with hearing impairments, volume controls for users with different hearing abilities, and the ability to distinguish foreground speech from background sounds. Live audio needs real-time captioning solutions.
- Interactive media like product galleries and media players must be operable through keyboard navigation, provide clear focus indicators, and include appropriate ARIA (Accessible Rich Internet Applications) labels for screen readers. All interactive elements need to be accessible regardless of input method.
- Document and file formats need to be screen reader compatible, have proper heading structures, and include alternative formats when necessary.
Around the world, standards like the Web Content Accessibility Guidelines (WCAG), the European Accessibility Act (EAA), the Americans with Disabilities Act (ADA), and others set expectations for digital accessibility. This guide explores how you can use Cloudinary's tools and best practices to make each of these media types more accessible, empowering your team to build inclusive content from the start.
Media accessibility considerations
When designing media experiences, it's helpful to think about the diverse ways people consume content. The following tables present key accessibility considerations for images, videos, and audio, along with Cloudinary tools and techniques that can help address these needs. Each consideration also includes links to relevant Web Content Accessibility Guidelines (WCAG) for those who want to dive deeper into the technical standards.
Perceivability considerations
When creating accessible media, consider how users with different abilities will perceive your content. Visual content needs text alternatives for screen readers, videos require captions and audio descriptions, and color-based information needs additional visual cues to be accessible to users with color vision differences.
Image accessibility considerations
Consideration | Cloudinary Image Techniques | WCAG Reference |
---|---|---|
Consider how users with visual impairments will understand your images - they may rely on screen readers that need descriptive text alternatives to convey the same information. | 🔧 Managing text alternatives 🔧 AI-based image captioning 🔧 Cloudinary AI Vision | 1.1.1 Non-text content |
Video and audio accessibility considerations
Consideration | Cloudinary Video Techniques | WCAG Reference |
---|---|---|
Think about users who can't hear audio-only content - they'll need text transcripts. For video-only content, consider whether users who can't see the visuals would understand what's happening through text descriptions or audio narration. | 🔧 Alternatives for video only content 🔧 Audio and video transcriptions | 1.2.1 Audio-only and video-only (prerecorded) |
Consider users who can't hear your video content - they'll need captions that show not just dialogue, but also sound effects and other important audio cues that contribute to understanding. | 🔧 Video captions | 1.2.2 Captions (prerecorded) |
Think about users who can't see your video content - they may need audio descriptions that explain what's happening visually, including actions, scene changes, and other important visual information that isn't conveyed through dialogue alone. | 🔧 Audio descriptions | 1.2.3 Audio description or media alternative (prerecorded) 1.2.5 Audio description (prerecorded) 1.2.7 Extended audio description (prerecorded) |
For live content, consider how users who are deaf or hard of hearing will follow along with real-time events - they'll need live captions or transcripts that keep pace with the broadcast. | 🔧 Live streaming closed captions | 1.2.4 Captions (live) 1.2.9 Audio-only (Live) |
Consider users who communicate primarily through sign language - they may prefer sign language interpretation over captions for understanding spoken content. | 🔧 Sign language in video overlays | 1.2.6 Sign language (prerecorded) |
Think about providing comprehensive synchronized text alternatives that give users the same information as your video or audio content, so they can choose the format that works best for them. | 🔧 Alternatives for video only content 🔧 Audio and video transcriptions | 1.2.8 Media alternative (prerecorded) |
Visual and audio clarity considerations
Consideration | Cloudinary Image Techniques | Cloudinary Video Techniques | WCAG Reference |
---|---|---|---|
Consider that some users can't distinguish colors - if you're using color to convey important information, think about adding patterns, shapes, or text labels so everyone can understand the message. | 🔧 Assist people with color blind conditions | | 1.4.1 Use of color |
Think about users who may be startled or distracted by unexpected audio - if your content plays sound automatically, consider giving users controls to pause, stop, or adjust the volume. | | 🔧 Adjust audio volume 🔧 Cloudinary Video Player | 1.4.2 Audio control |
Consider users with visual impairments who may have difficulty reading text with poor contrast - they'll need sufficient color contrast between text and backgrounds to read your content comfortably. | 🔧 Text overlays on images and videos 🔧 Adjust contrast on images and videos 🔧 Replacing colors for light/dark themes | 🔧 Customizable caption styling 🔧 Text overlays on images and videos 🔧 Adjust contrast on images and videos | 1.4.3 Contrast (minimum) 1.4.6 Contrast (enhanced) |
Consider whether users can resize, customize, or access your text content - actual text is generally more flexible and accessible than text embedded in images. | 🔧 Customize text overlays in images 🔧 OCR text detection and extraction | | 1.4.5 Images of text |
Think about users who have difficulty separating speech from background noise - they may need clear audio where the main content stands out from any background sounds. | | 🔧 Mixing audio tracks | 1.4.7 Low or no background audio |
Operability considerations
Consider how users will interact with and control your media content. Some users rely on keyboards instead of mice, others may be sensitive to motion and flashing content, and many need the ability to pause or adjust audio that plays automatically. Design your media experiences to accommodate different interaction methods and user preferences.
Consideration | Cloudinary Image Techniques | WCAG Reference |
---|---|---|
Consider users who may be sensitive to motion or prefer reduced animations - they may want the option to disable or reduce motion effects that aren't essential to understanding your content. | 🔧 Convert animations to still images | 2.3.3 Animation from interactions |
Cloudinary's UI widgets have been built with operability considerations in mind, including keyboard navigation, focus management, and user control features. For detailed information about the accessibility features available in these components, see the Cloudinary Product Gallery widget and Cloudinary Video Player sections.
Image accessibility
Text alternatives are crucial for making visual content accessible to everyone, particularly users with visual impairments, cognitive disabilities, or those using assistive technologies like screen readers. These alternatives provide equivalent information about images, charts, diagrams, and other visual content in a format that can be understood through text-to-speech software, braille displays, or simply read by users who prefer textual descriptions.
Cloudinary provides tools and approaches for managing text alternatives at scale, including:
- Centralized metadata management: Store alt text and descriptions with your assets as the single source of truth
- AI-powered generation: Automatically generate descriptive alt text using machine learning
Managing text alternatives
Cloudinary Assets, Cloudinary's Digital Asset Management (DAM) product, serves as the single source of truth for managing text alternatives across your entire Media Library. Rather than maintaining alt text in multiple locations throughout your application, you can centralize all accessibility metadata within Cloudinary Assets, enabling consistent management, review, and approval workflows.
Using contextual metadata for text alternatives
The simplest approach is to use Cloudinary's built-in contextual metadata field, called `alt`, to store text alternatives. You can manage this field through the Media Library interface or programmatically via the APIs.
Here's an example of setting the `alt` contextual metadata field for an image during upload programmatically:
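A minimal sketch (Node.js), assuming the cloudinary npm package; the public ID and alt text are hypothetical examples:

```javascript
// With the SDK you would call:
//   cloudinary.v2.uploader.upload("hike.jpg", options)
// Shown here as a plain options object so the shape is clear.
const options = {
  public_id: "docs/mountain-hike", // hypothetical asset ID
  context: { alt: "Two hikers embracing beside an SUV at dusk" },
};

// The API serializes contextual metadata as pipe-separated key=value pairs:
const contextParam = Object.entries(options.context)
  .map(([key, value]) => `${key}=${value}`)
  .join("|");

console.log(contextParam);
```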
Alternatively, you can use any contextual metadata field name to store the text.
Using structured metadata for text alternatives
For better standardization across an organization, you can use structured metadata to create custom fields that support validation, approval workflows, and advanced search capabilities.
You can manage structured metadata fields through the Media Library interface or programmatically via the APIs.
Here's an example of setting a structured metadata field, with external ID `asset_description`, on upload:
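A sketch (Node.js, cloudinary npm package) of the upload options; the `asset_description` external ID comes from the text above, while the value and public ID are hypothetical:

```javascript
// These options would be passed to cloudinary.v2.uploader.upload(file, options).
const options = {
  public_id: "docs/mountain-hike", // hypothetical asset ID
  metadata: { asset_description: "Two hikers embracing beside an SUV at dusk" },
};

// Structured metadata is serialized as pipe-separated external_id=value pairs:
const metadataParam = Object.entries(options.metadata)
  .map(([id, value]) => `${id}=${value}`)
  .join("|");

console.log(metadataParam);
```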
Centralized asset management benefits
By storing accessibility metadata directly with your assets in Cloudinary, you gain several key advantages, including:
- Single source of truth:
- All text alternatives are stored with the asset, ensuring consistency across all implementations
- No need to maintain separate databases or files for accessibility content
- Changes to descriptions automatically propagate to all applications using the asset
- Searchable and discoverable:
- Use the Search API method to find assets by their accessibility descriptions
- Identify assets missing alt text for remediation
- Analyze description quality across your Media Library
- Review and approval workflows:
- Content teams can review and approve accessibility descriptions in the Media Library
- Implement approval workflows before publishing content
- Bulk operations:
- Programmatically update multiple assets simultaneously
- Import accessibility descriptions from external sources
- Export descriptions for review by accessibility experts
Integrating with delivery
Once text alternatives are stored as asset metadata, they're automatically available in your delivery implementations:
JavaScript integration:
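A sketch of wiring stored alt text into delivery markup. The asset object below mirrors the shape returned by the Admin API resource method, where contextual metadata appears under `context.custom` - verify this shape against your own API responses; the public ID is hypothetical.

```javascript
// Build an img tag whose alt attribute comes from the asset's metadata.
function imgTag(asset, cloudName) {
  const src = `https://res.cloudinary.com/${cloudName}/image/upload/${asset.public_id}.${asset.format}`;
  const alt = (asset.context && asset.context.custom && asset.context.custom.alt) || "";
  return `<img src="${src}" alt="${alt}">`;
}

const tag = imgTag(
  {
    public_id: "docs/mountain-hike", // hypothetical asset
    format: "jpg",
    context: { custom: { alt: "Two hikers at dusk" } },
  },
  "demo"
);
console.log(tag);
```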
Product Gallery Widget integration:
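A sketch of a Product Gallery widget configuration. In the browser this object would be passed to `cloudinary.galleryWidget(config).render()`; the container selector and public IDs are hypothetical.

```javascript
// Minimal widget configuration: the widget reads asset metadata from
// Cloudinary, so accessibility text managed centrally applies here too.
const config = {
  container: "#product-gallery", // hypothetical DOM selector
  cloudName: "demo",
  mediaAssets: [
    { publicId: "docs/sneaker-front" }, // hypothetical public IDs
    { publicId: "docs/sneaker-side" },
  ],
};

console.log(config.mediaAssets.length);
```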
This centralized approach ensures that accessibility improvements benefit all your applications simultaneously, while providing the tools needed for professional content management and review processes.
AI-based image captioning
While storing text alternatives as metadata provides centralized management, creating meaningful descriptions for large image libraries can be time-consuming. Using AI-based image captioning, you can programmatically provide captions for images, saving time and resources.
1. Subscribe to the Cloudinary AI Content Analysis add-on.
2. Upload an image to your Media Library, invoking AI-based image captioning.
3. Use the AI-generated caption from the response for the alt text.
Example code:
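A sketch of the upload options and response handling, assuming the cloudinary npm package (`cloudinary.v2.uploader.upload(file, options)`). The response path (`info.detection.captioning.data.caption`) follows the add-on documentation but should be verified against your own responses; the sample values are hypothetical.

```javascript
// Request AI-based captioning as part of the upload:
const options = { detection: "captioning" };

// A trimmed sample response for illustration:
const response = {
  public_id: "docs/mountain-hike", // hypothetical
  info: {
    detection: {
      captioning: { data: { caption: "Two hikers embrace beside an SUV at dusk" } },
    },
  },
};

// Use the generated caption as the alt text:
const altText = response.info.detection.captioning.data.caption;
console.log(altText);
```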
Alternative ways to invoke AI-based image captioning:
- Define an upload preset in the Cloudinary Console settings, which you can use either programmatically, or in your Media Library, when uploading images.
- Use the Cloudinary Image Captioning block in MediaFlows to generate a caption as part of a workflow, for example, to update the `alt` contextual metadata field on upload.
- For images already in your Media Library, use the update method of the Admin API, instead of the upload method of the Upload API.
Cloudinary AI Vision
An alternative to AI-based image captioning is to use the Cloudinary AI Vision add-on. This has the benefit of analyzing images that are external to Cloudinary, or stored in Cloudinary product environments that you don't own. You just need a valid URL to the image.
1. Subscribe to the Cloudinary AI Vision add-on.
2. Send a request to the Analyze API asking for a brief description of the image.
3. Use the AI-generated response for the alt text.
Example code:
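A sketch of an Analyze API request payload for AI Vision. The `analysis_type` value and field names are based on the add-on documentation and should be verified against the Analyze API reference; the image URL and prompt are hypothetical.

```javascript
// The payload would be POSTed to the Analyze API endpoint for your
// product environment.
const payload = {
  analysis_type: "ai_vision_general",
  source: { uri: "https://example.com/images/hikers.jpg" }, // any valid image URL
  prompts: ["Describe this image briefly for use as alt text"],
};

console.log(JSON.stringify(payload.source));
```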
Video and audio accessibility
Audio and video content presents unique accessibility challenges because the information is presented over time and may not be perceivable by all users. People with hearing impairments need text alternatives like captions and transcripts, while those with visual impairments benefit from audio descriptions that explain visual content. Additionally, users may need control over audio levels or the ability to pause content that plays automatically.
The most accessible approach is often to let users control when media plays. Autoplaying content can interfere with screen readers, startle users, cause motion sensitivity issues, and consume bandwidth without consent.
Best practice: Provide clear controls and let users initiate playback. If autoplay is necessary, mute videos by default and include prominent play/pause controls.
This section covers Cloudinary's tools and techniques for making time-based media accessible, including generating transcriptions, adding captions, creating audio descriptions, and providing sign language overlays.
Alternatives for video-only content
Video-only content (videos without audio) can be made accessible in two main ways: by providing a written text description that conveys the visual information, or by adding an audio track that narrates what's happening on screen.
Video with written description
Videos containing no audio aren't accessible to people with visual impairments. You can provide a text alternative in the form of a description presented alongside a video, which a screen reader can read:
Video description:
The video takes place in a picturesque, hilly landscape during dusk, featuring rocky formations and a clear sky transitioning in colors from blue to orange hues. Initially, a person is seen standing next to a parked SUV. As the video progresses, another individual joins, and they are seen together near the vehicle. The couple, equipped with backpacks, appears to be enjoying the serene environment. Midway through, they share a hug near the SUV, emphasizing a moment of closeness in the tranquil setting. Towards the end of the video, the couple is observed walking away from the SUV, exploring the scenic, rugged terrain around them.
Video with audio description
As an alternative to providing a video with a written description, you can provide the description as an audio track, using the `audio` layer parameter (`l_audio`):
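A sketch of building such a delivery URL. The public IDs are hypothetical; the layer component is followed by `fl_layer_apply` per the overlay syntax.

```javascript
// Add a narration track to a silent video via an audio layer.
const base = "https://res.cloudinary.com/demo/video/upload";
const narrationId = "docs:hiking-description"; // colons replace slashes in layer IDs
const url = `${base}/l_audio:${narrationId}/fl_layer_apply/docs/hiking-no-audio.mp4`;
console.log(url);
```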
Use the button below to toggle the audio description on and off. Notice how the transformation URL changes when you toggle the audio description:
🎧 Audio Description Toggle Demo
https://res.cloudinary.com/demo/video/upload/docs/grocery-store-no-audio.mp4
Audio and video transcriptions
For people with hearing impairments, you can generate transcripts of audio and video files that contain speech using Cloudinary's transcription functionality.
For audio-only files, you can present the text alongside the audio. For example, for this podcast, you can generate the transcript and display the wording below the file:
Transcript:
Tonight on Quest, nanotechnology. It's a term that's become synonymous with California's high-tech future. But what are these mysterious nanomaterials and machines? And why are they so special? Come along as we take an incredible journey into the land of the unimaginably small.
Types of transcription services
Cloudinary offers multiple transcription options to suit different needs and workflows:
Service | Key Features | Formats | Learn More |
---|---|---|---|
Cloudinary Video Transcription (Built-in service) | • Automatic language detection and transcription • Supports translation to multiple languages • Integrates seamlessly with the Cloudinary Video Player • Includes confidence scores for quality assessment | .transcript | 🔧 Learn more |
Google AI Video Transcription add-on | • Leverages Google's Speech-to-Text API • Supports over 125 languages and variants • Provides speaker diarization (identifying different speakers) | .transcript, VTT, SRT | 🔧 Learn more |
Microsoft Azure Video Indexer add-on | • Advanced video analysis including speech transcription • Supports multiple languages with auto-detection • Provides additional insights like emotions and topics • Generates comprehensive metadata about video content | .transcript, VTT, SRT | 🔧 Learn more |
Transcript formats and usage
Different services generate transcripts in different formats, each suited for specific use cases:
Format | Description | Best For | Example |
---|---|---|---|
Cloudinary .transcript | JSON format with word-level timing and confidence scores. Each excerpt includes full transcript text, confidence value, and individual word breakdowns with precise start/end times. | Advanced Video Player features like word highlighting, paced subtitles, and confidence-based filtering | Example file |
Google .transcript | JSON format with word-level timing and confidence scores. Similar structure to Cloudinary but generated via Google's Speech-to-Text API with varying excerpt lengths. | Google-specific integrations and applications requiring Google's speech recognition accuracy | Example file |
Azure .transcript | JSON format with confidence scores and start/end times. Provides transcript excerpts with timing but different structure than Cloudinary/Google formats. | Microsoft Azure integrations and applications requiring Azure's video analysis capabilities | Example file |
VTT files (WebVTT) | Industry standard web-based caption format with timing cues. Supported by most modern video players and browsers. | Web video players, HTML5 video elements, and broad compatibility needs | Example file |
SRT files (SubRip) | Simple text format with numbered sequences and timing codes. Widely supported across video editing software and players. | Video editing workflows, legacy player support, and simple subtitle implementations | Example file |
Generating transcripts using the Cloudinary Video Transcription service
You can generate transcripts using several methods:
During upload:
For existing videos using the explicit method:
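A sketch (Node.js, cloudinary npm package) of both calls. Transcription is requested via the `raw_convert` parameter; `google_speech` is the value for the Google AI Video Transcription add-on, so substitute the value documented for your chosen service. Public IDs are hypothetical.

```javascript
// During upload:
//   cloudinary.v2.uploader.upload("talk.mp4", uploadOptions)
const uploadOptions = { resource_type: "video", raw_convert: "google_speech" };

// For an existing video:
//   cloudinary.v2.uploader.explicit("docs/talk", explicitOptions)
const explicitOptions = {
  type: "upload",
  resource_type: "video",
  raw_convert: "google_speech",
};

console.log(uploadOptions.raw_convert, explicitOptions.type);
```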
From the Video Player Studio:
Navigate to the Video Player Studio, add your video's public ID, and select the Transcript Editor to generate and edit transcripts directly in the interface.
The transcript editor provides a user-friendly interface for:
- Generating transcripts for existing videos
- Editing transcript content to ensure accuracy
- Adjusting individual word timings
- Adding or removing transcript lines
- Reviewing confidence scores for quality assessment
- Docs: Video transcription
Video captions
For prerecorded audio content in synchronized media, you can supply captions. Using the Cloudinary Video Player, you can display your generated transcriptions in sync with the video.
Having generated your transcription, to add it as captions to your video, set the `textTracks` parameter at the `source` level.
If you use the Cloudinary video transcription service to generate your transcription, as in this example, you don't need to specify the transcript file in the Video Player configuration - it's added automatically.
Here's the configuration for the video above:
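A sketch of that configuration. In the browser it would be applied as `cloudinary.videoPlayer("player", { cloudName: "demo" })` followed by `player.source(publicId, sourceOptions)`; the public ID is hypothetical. With the Cloudinary transcription service the transcript file is picked up automatically, so no file URL is needed.

```javascript
// Source-level options enabling captions from the generated transcript:
const sourceOptions = {
  textTracks: {
    captions: {
      label: "English captions",
      language: "en",
      default: true,
    },
  },
};
console.log(sourceOptions.textTracks.captions.language);
```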
Adding VTT and SRT files as captions
To set a VTT or SRT file as the captions (if you use a different transcription service), specify the file in the `url` field of the `captions` object within `textTracks`.
VTT:
SRT:
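A sketch covering both formats; the caption file URLs are hypothetical.

```javascript
// VTT: point the captions object at the caption file via the url field.
const vttSource = {
  textTracks: {
    captions: {
      label: "English captions",
      language: "en",
      default: true,
      url: "https://res.cloudinary.com/demo/raw/upload/docs/talk.vtt",
    },
  },
};

// SRT: only the file extension differs.
const srtUrl = vttSource.textTracks.captions.url.replace(/\.vtt$/, ".srt");
console.log(srtUrl);
```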
- Docs: Subtitles and captions
Audio descriptions
You can use the Cloudinary Video Player to display audio descriptions as captions on a video, which a screen reader can read. Additionally, you can set up alternative audio tracks that can provide audio descriptions of a video instead of any audio that's built into the video.
Audio descriptions as captions
When a video has no audio, or no dialogue, it can be helpful to provide a description of the video. You can present this alongside the video, or as captions in the video. Both can be read by screen readers.
In this video, the description appears as captions. Notice the audio descriptions menu at the bottom right of the player, when you play the video:
To set audio descriptions as captions, which can be read by screen readers, use the `descriptions` kind of text track:
Audio descriptions as alternative audio tracks
A different way of providing a description of a video is to provide an alternative audio track. In this video the description is available as an audio track, which you can select from the audio selection menu at the bottom right of the player when you play the video:
To define additional audio tracks, use the audio layer transformation (`l_audio`) with the alternate flag (`fl_alternate`). This functionality is supported only for videos using adaptive bitrate streaming with automatic streaming profile selection (`sp_auto`).
Here's the transformation URL that's supplied to the Video Player:
Live streaming closed captions
Add closed captions to your live streams using your preferred streaming software. Cloudinary's live streaming capabilities include support for captions embedded in the H.264 video stream using the CEA-608 standard.
1. To begin streaming, navigate to the Live Streams page in the Console and create a new live stream by providing a name. Once created, you'll receive the necessary streaming details:
   - Input: RTMP URL and Stream Key – Use these credentials to configure your streaming software and initiate the live stream.
   - Output: HLS URL – This is your stream's output URL, which can be used with your video player. The Cloudinary Video Player natively supports live streams for seamless playback.

   Live streaming in the Cloudinary Console

2. Add closed captions to the stream using your streaming software.
- Docs: Live streaming
Sign language in video overlays
To help deaf people understand the dialogue in a video, you could apply a sign-language overlay.
1. Upload the video(s) of the sign language and the main video to your product environment.
2. Use a video overlay (`l_video` in URLs) with placement (e.g. `g_south_east`) and start and end offsets (e.g. `so_2.0` and `eo_4.0`).
3. Optionally speed up or slow down the overlay to fit with the dialogue (e.g. `e_accelerate:100`).
In this example, there are two overlays applied, one between 2 and 4 seconds, and the other at 6 seconds (this one is present until the end of the video so no end offset is required):
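A sketch of building the two-overlay URL described above. The public IDs are hypothetical, and the placement/timing parameters are shown on the `fl_layer_apply` component; verify that grouping against the video overlay documentation.

```javascript
const base = "https://res.cloudinary.com/demo/video/upload";
// First overlay: visible between 2 and 4 seconds.
const overlay1 = "l_video:docs:sign-intro/fl_layer_apply,g_south_east,so_2.0,eo_4.0";
// Second overlay: starts at 6 seconds and runs to the end (no eo_).
const overlay2 = "l_video:docs:sign-main/fl_layer_apply,g_south_east,so_6.0";
const url = `${base}/${overlay1}/${overlay2}/docs/interview.mp4`;
console.log(url);
```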
Other options to try:
- Consider making the background of the signer transparent to see more of the main video (e.g. `co_rgb:aca79d,e_make_transparent:0` as a transformation in the video layer). This works best if the background is a different color from anything else in the video.
- Fade the overlay videos in and out for a smoother effect (e.g. `e_fade:500/e_fade:-500`).
- Docs: Video overlays
- Docs: Video transparency
Visual and audio clarity
Making content distinguishable means ensuring that users can perceive and differentiate important information regardless of their visual or auditory abilities. This includes providing sufficient color contrast, not relying solely on color to convey information, controlling audio levels, and allowing customization of visual and audio elements.
Users with color blindness, low vision, hearing impairments, or various sensitivities need content that can be perceived clearly in different ways. This section covers Cloudinary's tools for creating high-contrast visuals, assisting color blind users, managing audio levels, customizing text presentation, and adapting content for different viewing modes and environments.
Assist people with color blind conditions
People with color blindness may have difficulty distinguishing between certain colors, particularly red and green. Cloudinary provides tools to help make your media more accessible by using both color and pattern to convey information.
Simulate color blind conditions
You can experience how your images look to people with different color blind conditions. Apply the `e_simulate_colorblind` effect with parameters like `deuteranopia`, `protanopia`, `tritanopia`, or `cone_monochromacy` to preview your content (see all the options).
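For example, the simulation URLs for the pie chart used in the demo below can be built like this (the `docs/piechart` public ID matches the demo asset):

```javascript
// Preview the same image under each color blind condition:
const conditions = ["deuteranopia", "protanopia", "tritanopia", "cone_monochromacy"];
const urls = conditions.map(
  (c) => `https://res.cloudinary.com/demo/image/upload/e_simulate_colorblind:${c}/docs/piechart.png`
);
console.log(urls[0]);
```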
Analyze color accessibility
For a more objective approach to assessing the accessibility of your images, use Accessibility analysis (currently available to paid accounts only).
1. Upload your images with the `accessibility_analysis` parameter set to `true`:
2. See the accessibility results in the response:
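A sketch of the request option and of reading the score from the response. The response field names follow the accessibility analysis documentation and should be verified against your own responses; the score value here is illustrative.

```javascript
// Request the analysis as part of the upload:
const uploadOptions = { accessibility_analysis: true };

// A trimmed sample response for illustration:
const response = {
  accessibility_analysis: {
    colorblind_accessibility_score: 0.75, // illustrative value, range 0-1
  },
};

const score = response.accessibility_analysis.colorblind_accessibility_score;
console.log(score);
```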
For more information see Accessibility analysis, and for an example of using the results, watch this video tutorial.
Apply stripes
Consider a chart that uses red and green colors to convey information. For someone with red-green color blindness, this information would be inaccessible.
By adding patterns or symbols alongside colors, you can ensure the information is conveyed regardless of color perception.
To add the stripes, apply the `assist_colorblind` effect with a stripe strength from 1 to 100, e.g. `e_assist_colorblind:20`:
Apply color shifts
For an image where the problematic colors aren't isolated, it can be even harder to distinguish the content of the image.
By shifting the colors, you can ensure the image is clear regardless of color perception.
To shift the colors, apply the `e_assist_colorblind` effect with the xray option, e.g. `e_assist_colorblind:xray`:
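Both variants can be sketched as delivery URLs against the pie chart asset used in the interactive demo below:

```javascript
const base = "https://res.cloudinary.com/demo/image/upload";
const striped = `${base}/e_assist_colorblind:20/docs/piechart.png`;   // stripes, strength 20
const shifted = `${base}/e_assist_colorblind:xray/docs/piechart.png`; // color shift
console.log(striped);
```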
Obviously, you wouldn't want everyone to experience your images with the assist colorblind effects applied, but you could consider implementing a toggle that adds these effects to your images on demand.
Interactive color blind accessibility demo
Use the controls below to test different color blind assistance techniques and simulate various color blind conditions. This helps you understand which techniques work best for different types of color vision deficiency.
Current Transformation URL:
https://res.cloudinary.com/demo/image/upload/w_400/bo_1px_solid_black/docs/piechart.png
Tips for Testing:
- Pie Chart: Notice how stripes help distinguish sections that may look similar with color blindness
- Red Flower: X-ray mode shifts colors to make the flower more visible against the green background
- Compare: Try different combinations to see which techniques work best for each condition
- Docs: Assist people with color blind conditions
- Docs: Accessibility analysis
- Blog: Open your Eyes to Color Accessibility
- Video tutorial: Color accessibility in JavaScript
Adjust audio volume
For people with hearing impairments or those in different listening environments, providing volume control options ensures your audio and video content is accessible. The WCAG guidelines specify that if audio plays automatically for more than 3 seconds, users must have a mechanism to pause, stop, or control the volume independently.
With Cloudinary, you can implement this mechanism both programmatically and using the Cloudinary Video Player.
Programmatic volume adjustment
Programmatically adjust the volume directly in your media transformations using the `volume` effect (`e_volume`). This allows you to give control to your users via external controls (as shown in the demo).

For example, to reduce the volume to 50% (`e_volume:50`):

To increase the volume by 150% (`e_volume:150`):

You can also mute audio completely by setting the volume to `mute`:
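The three variants above can be sketched as delivery URLs; the public ID matches the demo video referenced in this section:

```javascript
const base = "https://res.cloudinary.com/demo/video/upload";
const half = `${base}/e_volume:50/docs/grocery-store.mp4`;     // reduce to 50%
const boosted = `${base}/e_volume:150/docs/grocery-store.mp4`; // increase by 150%
const muted = `${base}/e_volume:mute/docs/grocery-store.mp4`;  // silence the audio
console.log(muted);
```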
Demo: External volume controls using transformations
For users with restricted movement or motor disabilities, you can create larger, more accessible volume controls outside the video player. These external controls use Cloudinary's volume transformations to deliver videos at different volume levels, making them easier to interact with than the built-in player controls. You can see the delivery URL change when you choose a different volume.
https://res.cloudinary.com/demo/video/upload/docs/grocery-store.mp4
Video Player volume controls
The Cloudinary Video Player provides built-in volume controls that users can adjust according to their needs. The player includes both a volume button and a volume slider for precise control.
You can customize the volume controls and set default volume levels in your JavaScript:
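A sketch of player-level volume control. The `videoPlayer` constructor and the `volume`/`mute` methods come from the Video Player API; here the calls are shown as comments and the configuration as a plain object so the shape is clear.

```javascript
// In the browser:
//   const player = cloudinary.videoPlayer("player", config);
//   player.volume(0.5);  // start at 50% volume
//   player.mute();       // or start muted, e.g. for autoplay scenarios
const config = {
  cloudName: "demo",
  controls: true, // keep the volume button and slider visible
  muted: false,
};
console.log(config.controls);
```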
- Docs: Adjust the audio volume
- Docs: Cloudinary Video Player
- Docs: Audio normalization
Customizable caption styling
Captions and subtitles must meet specific contrast requirements to be accessible to people with visual impairments. The WCAG guidelines specify minimum contrast ratios between text and background colors to ensure readability.
Understanding contrast ratios
A contrast ratio measures the difference in brightness between text and its background, expressed as a ratio like 4.5:1 or 7:1. The higher the number, the more contrast there is.
WCAG Requirements:
- Level AA (minimum): 4.5:1 contrast ratio for normal text
- Level AAA (enhanced): 7:1 contrast ratio for normal text
- Large text (18pt+ or 14pt+ bold): Lower ratios of 3:1 (AA) or 4.5:1 (AAA)
How to measure contrast ratios
You can measure contrast ratios using online tools such as WebAIM Contrast Checker.
How it works:
- Pick your colors: Select the text color and background color
- Get the ratio: The tool calculates the mathematical contrast ratio
- Check compliance: See if it meets WCAG AA (4.5:1) or AAA (7:1) standards
Example measurements:
- Black text (#000000) on white background (#FFFFFF) = 21:1 (excellent)
- White text (#FFFFFF) on blue background (#0066CC) = 5.74:1 (passes AA)
- Light gray text (#CCCCCC) on white background (#FFFFFF) = 1.61:1 (fails - too low)
Implementing accessible caption styling
The Cloudinary Video Player allows you to customize caption appearance to meet contrast requirements. The recommended approach is to use the built-in `theme` options, which provide predefined backgrounds and styling.
The built-in themes are described in this table:
Theme | Description | Best for |
---|---|---|
default | None | High contrast videos only |
videojs-default | High contrast theme with a dark background and white text | General accessibility |
yellow-outlined | Yellow text with a dark outline for visibility | Videos with varied backgrounds |
player-colors | Uses the video player's custom color scheme for the text and background | Brand consistency + accessibility |
3d | Text with a 3D shadow effect | Stylistic preference |
The example at the top of this section uses the `videojs-default` theme. Note that you can also override elements of the theme, for example, by setting the font size. Here's the Video Player configuration:
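A sketch of the source options behind that example, combining the theme with a font-size override. The `textTracks.options` field names follow the Video Player caption styling documentation and should be verified there; the caption details are illustrative.

```javascript
// Applied via player.source(publicId, sourceOptions) in the browser:
const sourceOptions = {
  textTracks: {
    options: {
      theme: "videojs-default",
      fontSize: "150%", // override one element of the theme
    },
    captions: { label: "English captions", language: "en", default: true },
  },
};
console.log(sourceOptions.textTracks.options.theme);
```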
To set custom colors for the font and background you can use the `player-colors` theme. This theme uses the colors that you configure when customizing your Video Player.
Text overlays on images and videos
Before creating text overlays embedded in images or videos, consider whether the text could instead be placed in your HTML and visually positioned over the media using CSS. HTML text is inherently more accessible because it can be announced by screen readers, restyled by users, translated automatically, and scales with user preferences—all without requiring additional accessibility techniques.
When you do need embedded text overlays in images and videos, it's crucial to ensure sufficient contrast between the text and background for readability. People with visual impairments or those viewing content in bright environments need clear, high-contrast text. Adding background colors or effects to text overlays helps meet WCAG contrast requirements and improves accessibility for everyone.
Text overlays on images with background
Without proper contrast, text overlays can be difficult or impossible to read. Here's how to add accessible text overlays with background colors:
The accessible version uses a semi-transparent black background (`b_rgb:00000080`) behind white text (`co_white`) to achieve maximum contrast:
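A sketch of such a delivery URL. The overlay text, placement, and base asset are illustrative; the `co_white` and `b_rgb:00000080` parameters are the ones described above:

```
https://res.cloudinary.com/demo/image/upload/l_text:Arial_60_bold:Sale%20Today,co_white,b_rgb:00000080/fl_layer_apply,g_south,y_40/sample.jpg
```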
Text overlays on videos with background
Video text overlays face additional challenges as backgrounds change throughout the video. Consistent background colors ensure text remains readable regardless of the video content. This video uses white text (`co_white`) on a semi-transparent blue background (`b_rgb:0000cc90`) to create an overlay that remains visible throughout the video.
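A sketch of the equivalent video URL (overlay text and base video are illustrative; the color parameters match those described above):

```
https://res.cloudinary.com/demo/video/upload/l_text:Arial_60_bold:Caption%20text,co_white,b_rgb:0000cc90/fl_layer_apply,g_south,y_40/dog.mp4
```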
- Docs: Text overlays on images
- Docs: Text overlays on videos
Adjust contrast on images and videos
Proper contrast, brightness, and saturation adjustments are essential for making images and videos accessible to people with visual impairments, low vision, or those viewing content in challenging lighting conditions. These adjustments can help ensure content remains visible and legible across different viewing environments and for users with varying visual needs.
Contrast adjustments for images
Contrast adjustments can dramatically improve the readability and accessibility of images. Here are examples showing how different contrast levels affect image visibility:
Use the contrast effect (`e_contrast`) with a value between -100 and 100:
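For example, a sketch of a URL applying a moderate contrast boost to the grocery image used in the demo below (the value 50 is illustrative):

```
https://res.cloudinary.com/demo/image/upload/e_contrast:50/docs/groceryshop.jpg
```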
Interactive contrast, brightness, and saturation demo
In addition to contrast, you can also alter brightness and saturation to help improve image visibility.
Use the controls below to see how contrast, brightness, and saturation adjustments affect image accessibility in real-time. Notice how the transformation URL changes as you adjust the settings:

https://res.cloudinary.com/demo/image/upload/c_scale,w_500/f_auto/q_auto/docs/groceryshop.jpg
Video visual adjustments
Video content can also benefit from contrast, brightness, and saturation adjustments. These are especially important for users with visual impairments who may struggle with low-contrast video content.
This video uses enhanced contrast (`e_contrast:50`), increased brightness (`e_brightness:10`), and saturation (`e_saturation:20`) to improve visibility and accessibility.
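A sketch of how those three adjustments chain together in a video URL (the base asset is illustrative):

```
https://res.cloudinary.com/demo/video/upload/e_contrast:50/e_brightness:10/e_saturation:20/dog.mp4
```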
- Docs: The contrast effect
- Docs: The brightness effect
- Docs: The saturation effect
Replacing colors for light/dark themes
For users who navigate websites with light and dark themes, consistency in visual presentation is crucial for both usability and accessibility. Light and dark themes can significantly impact users with visual sensitivities, light sensitivity conditions, or those who simply prefer one theme over another for better readability. Cloudinary provides powerful tools to automatically adapt image colors to match your application's theme, ensuring a cohesive visual experience.
Understanding the accessibility need
Different users have varying preferences and needs when it comes to visual themes:
- Light sensitivity: Users with photophobia, migraines, or certain medical conditions may find dark themes more comfortable
- Visual impairments: Some users with low vision find better contrast in a specific theme
- Environmental factors: Dark themes can be easier on the eyes in low-light environments
- Battery conservation: On OLED displays, dark themes can help conserve battery life
- Personal preference: Users may simply prefer one theme for better readability
Dynamic color replacement with replace_color
The replace_color effect allows you to dynamically swap colors in images based on the user's theme preference. This is particularly useful for logos, icons, and graphics that need to maintain brand consistency while adapting to different backgrounds. Try changing the theme at the top right of this page, and you'll see how the different icons look with light and dark themes.
This example replaces the predominant color with light gray (`e_replace_color:e6e6e6:50`), using a tolerance of 50 to ensure similar shades are also replaced:
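A sketch of the URL, using the icon asset from the demo below:

```
https://res.cloudinary.com/demo/image/upload/e_replace_color:e6e6e6:50/cloudinary_icon.png
```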
Using the theme effect for comprehensive adaptation
For more sophisticated theme adaptation, use the theme effect which applies comprehensive color adjustments to the image based on a specific background color.
For example, change the screen capture to a dark theme with increased sensitivity to photographic elements (`e_theme:color_black:photosensitivity_110`):
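A sketch of the URL (the public ID `website_screenshot` is a hypothetical placeholder for your screen capture asset):

```
https://res.cloudinary.com/demo/image/upload/e_theme:color_black:photosensitivity_110/website_screenshot.png
```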
The effect applies an algorithm that intelligently adjusts the color of illustrations, such as backgrounds, designs, texts, and logos, while keeping photographic elements in their original colors.
Interactive theme adaptation demo
Experience how Cloudinary can automatically adapt images for different themes. This demo shows how the same image can be dynamically modified to suit both light and dark themes using the `replace_color` transformation, in addition to smart color replacement using the `theme` effect:

https://res.cloudinary.com/demo/image/upload/c_scale,w_400/f_auto/q_auto/cloudinary_icon.png
- Video tutorial: Light and dark mode images in React
- Docs: Replace color effect
- Docs: Theme effect
Customize text overlays in images
Customizable text overlays are essential for accessibility because they allow you to adapt text presentation to meet diverse user needs. Users with visual impairments, dyslexia, or reading difficulties often benefit from specific font styles, sizes, and spacing adjustments. By providing flexibility in text overlay styling, you ensure your content remains accessible across different abilities and preferences.
The WCAG guidelines emphasize that text should be customizable to support users who need larger fonts, different font families, or modified spacing for better readability. Cloudinary's text overlay system provides extensive customization options that help you meet these accessibility requirements while maintaining visual appeal.



Understanding text overlay parameters
Cloudinary's text overlay transformation (`l_text`) supports numerous styling parameters that can be combined to create accessible and visually appealing text:
Core Parameters (Required):
- Font: Any universally available font or custom font (e.g., `Arial`, `Helvetica`, `Times`)
- Size: Text size in pixels (e.g., `50`, `100`)

Styling Parameters (Optional):
- Weight: Font thickness (`normal`, `bold`, `thin`, `light`)
- Style: Font appearance (`normal`, `italic`)
- Decoration: Text decoration (`normal`, `underline`, `strikethrough`)
- Alignment: Text positioning (`left`, `center`, `right`, `justify`)
- Stroke: Text outline (`none`, `stroke`)
- Letter spacing: Space between characters (`letter_spacing_<value>`)
- Line spacing: Space between lines (`line_spacing_<value>`)

Visual Enhancement Parameters:
- Color: Text color (`co_<color>`)
- Background: Background color (`b_<color>`)
- Border: Outline styling (`bo_<border>`)
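As a sketch of how several of these parameters combine in one URL (text, sizes, and spacing values are illustrative; the base image matches the demo below):

```
https://res.cloudinary.com/demo/image/upload/l_text:Arial_60_bold_center_letter_spacing_4:Accessible%20Text,co_black,b_white/fl_layer_apply,g_center/docs/white-texture.jpg
```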
Interactive text overlay customization demo
Use the controls below to experiment with different text styling parameters and see how they affect accessibility and readability. Notice how the transformation URL updates as you adjust the settings:

https://res.cloudinary.com/demo/image/upload/c_fit,l_text:Arial_50:Sample%20Text,co_black,w_1800/fl_layer_apply,g_center/c_scale,w_600/f_auto/q_auto/docs/white-texture.jpg
Accessibility considerations for text overlays
When creating text overlays, consider these accessibility best practices:
Font Size: Use sizes of at least 16px for body text, larger for headers. Users with low vision may need even larger text.
Font Choice: Sans-serif fonts like Arial and Helvetica are often easier to read, especially for users with dyslexia.
Letter Spacing: Additional spacing between letters can improve readability for users with dyslexia or visual processing difficulties.
Color Contrast: Ensure sufficient contrast between text and background colors (minimum 4.5:1 ratio for normal text).
Background: Use solid background colors behind text when overlaying on complex images to ensure readability.
Font Weight: Bold text can improve readability, but avoid fonts that are too thin (like `light` or `thin` weights) for important content.
Video text overlays
The same customization principles apply to video text overlays. Here's an example of accessible text styling on video content:
This example uses large, bold white text (`Arial_60_bold`) with a semi-transparent black background (`b_rgb:000000cc`) to ensure high contrast and readability across the entire video.
- Docs: Text overlays on images
- Docs: Text overlays on videos
- Docs: Create images from text
OCR text detection and extraction
For images containing text content, Optical Character Recognition (OCR) technology can extract that text and make it accessible to screen readers and other assistive technologies. This is particularly important for images of documents, signs, menus, handwritten notes, or any visual content where text is embedded within the image rather than provided as separate HTML text.
Cloudinary's OCR Text Detection and Extraction add-on can automatically extract text from images during upload, making the content available for accessibility purposes.
Here's an example showing an Italian restaurant menu and the text that Cloudinary's OCR add-on automatically extracted from it:

Extracted Text Content (Available to Screen Readers):
INSALATA VERDE
PIZZA CAPRESE
18.50
MENU 2
BRUSCHETTA DELLA CASA
INSALATA DI POLLO
19.50
MENU 3
BRUSCHETTA DELLA CASA
CANNELLONI DI CARNE
AL FORNO 21.50
This text content was automatically extracted using OCR and can be read by screen readers, making the Italian menu accessible to users with visual impairments. Note that the OCR detected the language as Italian (locale: "it") and extracted all menu items with their prices.
To extract text from an image:
1. Subscribe to the OCR add-on: Enable the OCR Text Detection and Extraction add-on in your Cloudinary account.
2. Extract text during upload: When uploading images that contain text, use the `ocr` parameter to extract the text content.
3. Use extracted text for accessibility: The OCR results are returned in the upload response and can be used to provide accessible alternatives.
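A sketch of the upload step using the Node.js SDK. The `ocr: "adv_ocr"` value is the add-on's engine name; the file path and response paths are illustrative and should be verified against your own responses:

```javascript
// Upload an image and request OCR extraction in the same call.
const cloudinary = require("cloudinary").v2;

cloudinary.uploader
  .upload("menu.jpg", { ocr: "adv_ocr" })
  .then((result) => {
    // The full extracted text lives in the OCR section of the response.
    const text =
      result.info.ocr.adv_ocr.data[0].textAnnotations[0].description;
    console.log(text);
  });
```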
Here's an example in React using the Italian restaurant menu response:
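As a sketch of the core logic such a React component would use, here's a plain function that pulls the extracted text out of the upload response. The response paths are assumptions based on the add-on's Google Vision-style output:

```javascript
// Return the full extracted text from a Cloudinary upload response,
// or an empty string if OCR data isn't present or complete.
function getOcrText(uploadResponse) {
  const ocr = uploadResponse?.info?.ocr?.adv_ocr;
  if (!ocr || ocr.status !== "complete") return "";
  const annotations = ocr.data?.[0]?.textAnnotations;
  // The first annotation holds the full text; the rest are per-word boxes.
  return annotations?.[0]?.description ?? "";
}
```

The returned string can then be rendered as visually hidden text, used in an `aria-label`, or stored as the image's alt text.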
- You can invoke the OCR Text Detection and Extraction add-on for images already in your product environment using the Admin API update method.
- You can retrieve the response at a later date using the Admin API resource method.
- Consider using contextual or structured metadata to store the text.
Mixing audio tracks
For users with hearing difficulties or auditory processing disorders, the ability to control the balance between foreground speech and background audio is crucial for accessibility. The WCAG guidelines specify that background sounds should be at least 20 decibels lower than foreground speech content, or users should have the ability to turn off background sounds entirely.
Cloudinary's audio mixing capabilities allow you to layer multiple audio tracks and control their relative volumes, ensuring your content meets accessibility requirements while maintaining audio richness.
To control the volume of different audio tracks, use the volume effect in each of the audio layers. In this example, the narration is set to a volume 3dB higher than the original asset (`e_volume:3dB`), and the background wind noise is set to a volume 18dB lower than the original asset (`e_volume:-18dB`):
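The full transformation looks like this (the same URL that drives the interactive demo later in this section):

```
https://res.cloudinary.com/demo/video/upload/e_volume:3dB/l_audio:docs:wind_norm/e_volume:-18dB/fl_layer_apply/docs/nanotech_norm.mp3
```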
Audio normalization for consistent levels
Before mixing audio tracks, it helps to normalize them to consistent baseline levels. Different audio recordings often have varying baseline volumes, which can make it difficult to achieve predictable dB differences for accessibility compliance.
To normalize your audio files before uploading them to Cloudinary, you can use audio processing tools, such as FFmpeg.
For example, normalize the audio file nanotech.mp3 to -16 LUFS:
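A sketch of the FFmpeg invocation using the `loudnorm` filter. The `TP` (true peak) and `LRA` (loudness range) values shown are common defaults rather than requirements, and the file names follow the demo assets used in this section:

```shell
# Normalize to -16 LUFS integrated loudness with the loudnorm filter.
ffmpeg -i nanotech.mp3 -af loudnorm=I=-16:TP=-1.5:LRA=11 nanotech_norm.mp3
```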
This ensures that when you apply `-20dB` or `-25dB` adjustments in Cloudinary, you get the exact dB separation needed for WCAG compliance.
Interactive audio mixing demo
This demo shows how Cloudinary can mix a primary audio track (nanotechnology narration) with a background audio layer (wind sounds). Use the controls to adjust the volume levels and observe how the dB difference affects accessibility:
🎙️ Narration (Foreground)
🌬️ Wind (Background)
https://res.cloudinary.com/demo/video/upload/e_volume:3dB/l_audio:docs:wind_norm/e_volume:-18dB/fl_layer_apply/docs/nanotech_norm.mp3
User-controlled audio track levels
Similar to the above demo, you could provide controls in your application to let the user decide on the levels of each track to meet their needs. Here's some example React code that you could use:
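As a sketch of the core logic such a component would use, here's a function that builds the transformation URL from user-chosen levels; you could wire it to two sliders in React. Asset names match the demo above:

```javascript
// Build a Cloudinary URL that mixes narration with a background
// audio layer at user-controlled volumes (in dB relative to the
// original recordings).
function buildMixUrl(narrationDb, backgroundDb) {
  const base = "https://res.cloudinary.com/demo/video/upload";
  return [
    base,
    `e_volume:${narrationDb}dB`,
    `l_audio:docs:wind_norm/e_volume:${backgroundDb}dB/fl_layer_apply`,
    "docs/nanotech_norm.mp3",
  ].join("/");
}
```

For WCAG compliance, keep `narrationDb - backgroundDb` at 20 or more, or offer a version with the background layer omitted entirely.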
- Always provide a no-background option: Some users need complete silence behind speech
- Maintain 20+ dB separation: When background audio is present, ensure it's at least 20 dB lower
- Test with real users: Audio perception varies greatly between individuals
- Consider frequency content: Low-frequency background sounds are less distracting than mid-range frequencies
- Provide visual indicators: Show users the current dB levels and compliance status
- Use consistent levels: Maintain the same audio balance throughout your content
- Docs: Mixing audio tracks
- Docs: Audio transformations
Interactive content and controls
User interface components and navigation must be operable by all users, regardless of their physical abilities or the input methods they use. This means ensuring that users can interact with and navigate your content using various methods including keyboards, screen readers, voice commands, or other assistive technologies.
For media content, operability encompasses several key areas: providing alternatives to motion-based content for users with vestibular disorders, ensuring all interactive elements are keyboard accessible, and designing interfaces that work seamlessly with assistive technologies. Users with motor impairments, visual disabilities, or other conditions need content that responds predictably to their preferred interaction methods.
This section covers Cloudinary's tools and widgets that support operable interfaces, including techniques for managing animated content, implementing keyboard-accessible galleries, and creating video players that work with assistive technologies.
Convert animations to still images
Many users have vestibular disorders, seizure conditions, or other sensitivities that make animated content problematic or even dangerous. WCAG Success Criterion 2.3.3 requires that motion animation triggered by interaction can be disabled unless the animation is essential to the functionality. Additionally, some users simply prefer reduced motion for better focus and less distraction.
Cloudinary provides several approaches to make animated content accessible by converting animations to still images or providing user control over motion.
Understanding motion sensitivity
Users may need reduced motion for various reasons:
- Vestibular disorders: Inner ear conditions that cause dizziness and nausea from motion
- Seizure disorders: Flashing or rapid motion can trigger seizures
- Attention disorders: Animation can be distracting and make it difficult to focus
- Migraine triggers: Motion can trigger or worsen migraines
- Battery conservation: Reducing animation saves device battery life
- Bandwidth limitations: Still images use less data than animated content
Extracting still frames from animations
You can extract a single frame from an animated GIF or video to create a still image alternative. This is useful for providing a static version of animated content.
Use the page parameter (`pg_`) to extract a specific frame from an animated GIF:
You can also extract the first frame by converting the format to a static image format like JPG or PNG:
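Two URL sketches, using the demo GIF from the interactive example below (the frame number is illustrative):

```
# Frame 3 of the animation as a still JPEG (pg_ selects the frame):
https://res.cloudinary.com/demo/image/upload/pg_3/f_jpg/kitten_fighting.gif

# First frame via format conversion alone:
https://res.cloudinary.com/demo/image/upload/f_jpg/kitten_fighting.gif
```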
Implementing user-controlled motion preferences
The most accessible approach is to respect user preferences for reduced motion. Modern browsers support the prefers-reduced-motion CSS media query, which you can combine with Cloudinary transformations to serve appropriate content.
Here's an interactive demo showing how to implement motion preferences:

https://res.cloudinary.com/demo/image/upload/c_scale,w_400/kitten_fighting.gif
Implementation examples
Here are some examples of respecting the `prefers-reduced-motion` setting:
React:
HTML:
CSS:
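As one sketch of the approach in JavaScript (the transformation parameters and demo asset match those used above; in React the same check would live in a hook or effect):

```javascript
// Choose a still frame for users who prefer reduced motion,
// otherwise serve the original animation.
function pickSource(prefersReducedMotion) {
  const base = "https://res.cloudinary.com/demo/image/upload";
  return prefersReducedMotion
    ? `${base}/f_jpg,pg_1/kitten_fighting.gif` // first frame as a JPEG
    : `${base}/kitten_fighting.gif`;
}

// In the browser, read the user's system preference:
// const reduced = window.matchMedia("(prefers-reduced-motion: reduce)").matches;
// imgElement.src = pickSource(reduced);
```

In CSS, the equivalent is a `@media (prefers-reduced-motion: reduce)` block that swaps a `background-image` or pauses animations.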
Video posters
For video content, you can extract a poster frame to show when motion is reduced:
The poster frame uses the start offset (`so_`) parameter to extract a frame from 10.2 seconds into the video. You can also use `so_auto` to let Cloudinary automatically choose the best frame to use as the poster.
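Two URL sketches for extracting poster frames (the base video is illustrative):

```
# Poster frame from 10.2 seconds in, delivered as a JPEG:
https://res.cloudinary.com/demo/video/upload/so_10.2/f_jpg/dog.mp4

# Let Cloudinary pick the most representative frame:
https://res.cloudinary.com/demo/video/upload/so_auto/f_jpg/dog.mp4
```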
- Respect system preferences: Always check for `prefers-reduced-motion: reduce`
- Provide user controls: Allow users to toggle motion on/off regardless of system settings
- Choose meaningful still frames: Select frames that best represent the animated content
- Maintain functionality: Ensure that stopping animation doesn't break essential features
- Test with users: Verify that reduced motion versions are still informative and useful
- Consider alternatives: Sometimes a different approach (like a slideshow) works better than a single still frame
- Docs: Deliver a single frame of an animated image
- Docs: Video thumbnails
- Docs: Page parameter for animated images
- Video tutorial: Reduce motion of images in React
Cloudinary Product Gallery widget
The Cloudinary Product Gallery widget provides comprehensive accessibility features that ensure users with disabilities can effectively navigate and interact with product galleries. The widget includes keyboard navigation, screen reader support, and customizable display options that meet WCAG operability requirements.
🎯 Accessible Product Gallery Demo
Keyboard Navigation: Use Tab to navigate, Enter to view items, Escape to close zoom, Arrow keys to browse between media assets.
Keyboard accessibility
The Product Gallery enables full keyboard accessibility for users who cannot use a mouse or rely on assistive technologies. All interactive elements are accessible using standard keyboard navigation:
Keyboard Navigation Controls
Accessible configuration options
For the most accessible experience, Cloudinary recommends these configuration settings:
Screen reader support
The Product Gallery provides semantic markup for screen readers and uses alt text from your configured metadata sources. You can specify where the gallery should look for alt text using the `accessibilityProps` parameter:
Using structured metadata for alt text:
Using contextual metadata for alt text:
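A configuration sketch covering both variants. The `accessibilityProps` property names below are assumptions based on the Product Gallery documentation; confirm them against the current widget reference before use:

```javascript
// Gallery that reads alt text from structured metadata.
const gallery = cloudinary.galleryWidget({
  container: "#gallery",
  cloudName: "demo",
  mediaAssets: [{ tag: "electronics" }],
  accessibilityProps: {
    mediaAltSource: "metadata", // or "contextual" for contextual metadata
    mediaAltId: "alt_text",     // field holding the alt text
  },
});
gallery.render();
```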
If no alt text source is configured or the specified metadata field is empty, the gallery defaults to descriptive text in the format "Gallery asset n of m".
Focus management and visual indicators
The Product Gallery provides clear visual focus indicators that help users understand their current position within the gallery:
- High contrast focus rings: Clearly visible borders around focused elements
- Logical tab order: Sequential navigation through thumbnails, main viewer, and controls
- Focus trapping: When zoom is activated, focus remains within the zoom interface
- Expanded mode benefits: In expanded mode, focus areas are more visually prominent
Video accessibility features
When displaying videos in the Product Gallery, accessibility features include:
- Keyboard controls: Spacebar to play/pause, Enter to activate full controls
- Screen reader announcements: Video state changes are announced to assistive technology
- Simplified controls: The `controls: "play"` option reduces interface complexity
- Caption support: When using the Cloudinary Video Player, captions and subtitles are fully supported
Responsive accessibility
The Product Gallery maintains accessibility across different viewport sizes:
- Mobile optimization: Touch-friendly controls and appropriate sizing
- Viewport breakpoints: Maintains usability as layout adapts to screen size
- Consistent navigation: Keyboard accessibility preserved across all breakpoints
- Provide meaningful alt text: Use structured or contextual metadata to supply descriptive alt text for all images
- Use expanded mode for better focus visibility: The expanded display mode provides more prominent focus indicators
- Simplify video controls: Use `controls: "play"` to reduce cognitive load
- Test with keyboard only: Ensure all functionality is accessible without a mouse
- Provide captions for videos: When using videos, include captions or transcripts
- Consider loading states: Ensure loading indicators are announced to screen readers
- Test with screen readers: Verify that the gallery provides a logical and informative experience for screen reader users
- Docs: Product Gallery accessibility
- Video tutorial: Product Gallery accessibility features
- Code example: Accessible Product Gallery sandbox
Cloudinary Video Player
The Cloudinary Video Player is designed to provide an inclusive video experience that meets WCAG 2.1 AA compliance standards. The player includes comprehensive accessibility features that ensure users with visual, auditory, motor, and cognitive impairments can fully engage with video content through assistive technologies, keyboard navigation, and other accessibility-friendly enhancements.
Accessibility features overview
The Cloudinary Video Player provides extensive accessibility support including:
- Full keyboard navigation: All controls accessible via Tab key with clear focus indicators
- Screen reader compatibility: ARIA attributes and semantic markup for assistive technologies
- Closed captions and subtitles: Multi-language support with customizable styling (refer to Video captions)
- Audio descriptions: Support for descriptive audio tracks and caption-based descriptions (refer to Audio descriptions)
- Video chapters: Easy navigation to key sections for improved usability
- Adjustable playback: Variable speed controls for better comprehension
- High-contrast UI: Customizable themes for improved visibility (refer to Customizable caption styling)
Live accessibility demo
Here's a working Cloudinary Video Player demonstrating accessibility features. Try navigating the controls using only your keyboard (Tab, Space, Arrow keys) and notice the clear focus indicators:
🎬 Accessible Video Player Demo
Keyboard Controls: Tab to navigate controls, Space to play/pause, Arrow keys to seek. Use Tab to reach mute, fullscreen, and caption buttons.
- Keyboard navigation with visible focus indicators
- Closed captions with high-contrast styling
- Screen reader compatible controls
- Customizable playback speed
- ARIA labels and semantic markup
Keyboard navigation controls
Video Player Keyboard Controls
Implementation example
Here's how to configure the Cloudinary Video Player with optimal accessibility settings:
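A configuration sketch combining the features listed above. Option names such as `playbackRates` and `textTracks` follow the Video Player API, but treat the exact option set and the public ID as assumptions to verify against the reference:

```javascript
// Accessibility-oriented player setup: keyboard-operable controls,
// adjustable playback speed, and high-contrast default captions.
const player = cloudinary.videoPlayer("player", {
  cloudName: "demo",
  controls: true,
  playbackRates: [0.5, 1, 1.5, 2],
  textTracks: {
    captions: { label: "English", language: "en", default: true },
    options: { theme: "videojs-default" },
  },
});
player.source("docs/accessibility-demo"); // illustrative public ID
```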
- Docs: Video Player accessibility
- Docs: Cloudinary Video Player
- Docs: Video Player customization
- Docs: Subtitles and captions
- Video tutorials: Video Player tutorials