Audio description is a separate audio track that narrates important visual information in a video for people who are blind or have low vision. Good descriptions cover actions, scene changes, on-screen text, and visual details that carry meaning the dialogue does not convey. The narration fits into pauses between dialogue, stays neutral in tone, and never competes with the original audio. Under WCAG 2.1 AA, prerecorded video with meaningful visual content requires audio description (Success Criterion 1.2.5). The standard does not apply to videos that already convey all visual information through the existing soundtrack.
| Area | Best Practice |
|---|---|
| When required | Prerecorded video with visual content not already conveyed through audio (WCAG 1.2.5, Level AA) |
| What to describe | Actions, characters, scene changes, on-screen text, graphics, and visual cues tied to meaning |
| Timing | Place narration in natural pauses; use extended audio description if pauses are too short |
| Voice and tone | Neutral, clear, paced to match the scene; avoid interpretation or emotional editorializing |
| Delivery | Offer a described track as an alternate audio option or a separate described version |

When Audio Description Is Required
WCAG 2.1 AA requires audio description for prerecorded synchronized media at Success Criterion 1.2.5. If the video already conveys all meaningful visual information through dialogue or natural sounds, audio description is not required. A talking-head interview where every important point is spoken aloud generally does not need it. A training video that shows steps on a screen without narrating them does.
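Live video is outside the scope of 1.2.5, and WCAG has no audio description criterion for live content. Success Criterion 1.2.7 (Extended Audio Description) also applies only to prerecorded video; it is Level AAA and rarely scoped into typical conformance work. Section 508 and EN 301 549 both incorporate the WCAG Level A and AA criteria, so 1.2.5 applies under them as well.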
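For teams triaging a video library, the rule in this section can be sketched as a short check. The snippet below is a minimal Python illustration, not an official WCAG tool. The `Video` fields and the sample inventory are hypothetical, and each flag still depends on a human reviewer's judgment.

```python
from dataclasses import dataclass


@dataclass
class Video:
    """One row of a hypothetical video inventory."""
    title: str
    prerecorded: bool
    has_meaningful_visuals: bool     # visuals carry information, not decoration
    visuals_conveyed_by_audio: bool  # dialogue or sound already covers them


def needs_audio_description(video: Video) -> bool:
    """Return True when the video needs a described track under SC 1.2.5."""
    if not video.prerecorded:
        return False  # SC 1.2.5 is scoped to prerecorded media
    if not video.has_meaningful_visuals:
        return False
    return not video.visuals_conveyed_by_audio


# Hypothetical inventory: a talking-head interview and an unnarrated walkthrough.
inventory = [
    Video("Founder interview", True, True, True),
    Video("Software setup walkthrough", True, True, False),
]
work_list = [v.title for v in inventory if needs_audio_description(v)]
print(work_list)  # ['Software setup walkthrough']
```

The function only encodes the decision. Whether the soundtrack really conveys the visuals is the part that needs a reviewer.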
What Should an Audio Description Actually Describe?
Describe what a sighted viewer learns from the screen that a blind viewer would otherwise miss. That includes who is on screen, what they are doing, where the scene takes place, and any text or graphics that carry meaning.
Skip anything already obvious from the soundtrack. If a character says “I’m leaving,” you do not need to describe them walking out. If on-screen text reads “Chapter 2: Setup” and no one says it aloud, describe it.
Prioritize meaning over completeness. A two-second pause cannot hold a full visual inventory, and trying to cram one in makes the description harder to follow than the video itself.
Writing Descriptions That Work
Write in the present tense. Use plain, specific language. “A woman in a red jacket opens the door” works. “A figure clad in crimson outerwear ingresses through the portal” does not.
Stay neutral. The description should report what is visible, not interpret it. If a character looks angry, describe the visible cue (“she clenches her jaw”) rather than the conclusion. Let the viewer interpret the scene the same way a sighted viewer would.
Match the energy of the scene. A quiet moment calls for a measured pace. An action sequence may need shorter sentences fit into tighter pauses. The description should feel like part of the video, not a layer on top of it.
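A quick script check can catch some interpretive wording before a description goes to recording. The sketch below is a rough Python heuristic under stated assumptions: the word list is hypothetical and deliberately short, and a match is only a prompt for the writer to look again, not proof of a problem.

```python
import re

# Hypothetical starter list; a real style guide would maintain its own.
INTERPRETIVE_WORDS = {
    "angry", "angrily", "sad", "sadly", "happy", "happily",
    "nervous", "nervously", "beautiful", "obviously",
}


def flag_interpretive_words(description: str) -> list[str]:
    """Return interpretive words found in a description line, in order."""
    words = re.findall(r"[a-z']+", description.lower())
    return [w for w in words if w in INTERPRETIVE_WORDS]


print(flag_interpretive_words("She angrily slams the door."))  # ['angrily']
print(flag_interpretive_words("She clenches her jaw."))        # []
```

A word list cannot judge context, so treat the result as a reminder to describe the visible cue instead of the conclusion.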
Timing and Placement
Fit the narration into natural pauses in the dialogue and sound. Never talk over important spoken content. If the pauses are too short to carry the description, you have two options: trim the description to the essentials, or produce an extended audio description version where the video pauses to make room for narration.
Extended audio description is more involved to produce because it changes the video’s runtime. It is the right approach for content where visual detail is critical, such as instructional videos or documentary footage where information is dense.
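The choice between trimming and extended description comes down to simple arithmetic: how long the narration takes versus how long the pause lasts. The sketch below illustrates that check in Python. The 160 words-per-minute rate, the one-second minimum gap, and the sample timings are all assumptions for illustration; measure the actual voice talent and use real dialogue timings from a caption file or edit list.

```python
from dataclasses import dataclass

WORDS_PER_MINUTE = 160  # assumed narration rate; adjust to the actual voice


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds


def find_gaps(dialogue: list[Segment], min_gap: float = 1.0) -> list[Segment]:
    """Return silences of at least min_gap seconds between dialogue segments."""
    gaps = []
    last_end = None
    for seg in sorted(dialogue, key=lambda s: s.start):
        if last_end is not None and seg.start - last_end >= min_gap:
            gaps.append(Segment(last_end, seg.start))
        # Track the latest end seen so overlapping speech is not counted as a gap.
        last_end = seg.end if last_end is None else max(last_end, seg.end)
    return gaps


def speaking_time(text: str) -> float:
    """Estimate narration length in seconds at the assumed rate."""
    return len(text.split()) * 60 / WORDS_PER_MINUTE


def placement(text: str, gap: Segment) -> str:
    """Say whether a description fits in a gap or needs another approach."""
    if speaking_time(text) <= gap.end - gap.start:
        return "fits"
    return "trim or use extended description"


# Hypothetical timings: dialogue from 0-4 s and 7-12 s leaves a 3-second pause.
dialogue = [Segment(0.0, 4.0), Segment(7.0, 12.0)]
gap = find_gaps(dialogue)[0]
print(placement("A woman in a red jacket opens the door", gap))
# trim or use extended description
print(placement("She opens the door", gap))
# fits
```

The estimate ignores breaths, emphasis, and sound effects that also need room, so leave a margin rather than filling every pause to the last fraction of a second.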
Voicing and Production
Use a clear, professional voice that does not blend with the video’s existing audio. A different voice from the main narrator helps viewers distinguish the description from the original content. Record at a consistent volume that stays fully intelligible without overpowering the original audio.
Many production teams use professional voice talent for public-facing content. For internal training videos, a clear in-house voice can work as long as the recording quality is high and the delivery is even.
Delivery Options for Audio Description
There are two common ways to deliver audio description. The first is an alternate audio track that viewers can switch on through the video player. Players such as YouTube and Vimeo, along with many enterprise video platforms, can support multiple audio tracks, though availability varies by platform and plan, so confirm it before you commit to this approach. The second is a separate described version of the video posted alongside the original.
The alternate-track approach is cleaner for the viewer. The separate-version approach is easier to implement when your player does not support multiple audio tracks. Either method meets WCAG 1.2.5 as long as the described version is available and the player controls are accessible.
Common Issues to Avoid
Describing things the dialogue already covers crowds the audio without adding meaning. Skipping on-screen text that contains information not spoken aloud leaves gaps for blind viewers. Editorializing or interpreting emotions instead of describing visible cues takes control away from the viewer. Talking over dialogue or important sound effects defeats the purpose of the description. Producing a described version but burying it behind a player control viewers cannot find makes the effort pointless.
How Audio Description Fits Into a Broader Accessibility Effort
Audio description is one of several media requirements under WCAG. Captions cover the audio side for deaf and hard-of-hearing viewers. Transcripts give a text version that supports both groups and improves search indexing. A complete accessibility evaluation of your website assesses whether video content meets all relevant criteria, not only the description track.
Accessible.org audit reports identify which videos require description, which already meet the criterion through existing audio, and what production work is needed for the rest. That mapping is what turns a vague “we need audio description” task into a concrete work list.
Frequently Asked Questions
Do I need audio description on every video?
No. Audio description is required only when the video contains meaningful visual information that the existing soundtrack does not already convey. A podcast recorded with a camera and no visual content beyond the speakers usually does not need it. A product demo that shows steps on screen without narrating them does.
Does YouTube auto-generate audio description?
No. YouTube auto-generates captions, but audio description has to be produced and uploaded as a separate audio track or as a described version of the video. Some video platforms are researching AI-assisted description drafting, and Accessible.org Labs is actively studying how AI can support media accessibility workflows, but human review remains necessary for quality and accuracy.
What is the difference between standard and extended audio description?
Standard audio description fits narration into natural pauses in the existing audio. Extended audio description pauses the video itself to make room for longer descriptions when the original pauses are too short. Extended description is Level AAA under WCAG and is most useful for content with dense visual information.
Can captions substitute for audio description?
No. Captions serve deaf and hard-of-hearing viewers by converting audio into text. Audio description serves blind and low-vision viewers by converting visual information into audio. They address different needs and are both required when applicable.
How do I know which of my videos need audio description?
An accessibility evaluation reviews each video against WCAG 1.2.5 and identifies which ones have visual content that is not covered by the existing soundtrack. The output is a list of videos that need described tracks and a description of what visual information must be covered.
Audio description done well becomes part of the viewing experience. Done poorly, it pulls attention away from the content. The difference is preparation, clear writing, and production that respects the original audio.
For an evaluation of how your video content measures against WCAG 2.1 AA, contact Accessible.org.