Text Tracks

Text tracks are a feature of HTML5 for displaying time-triggered text to the end-user. Video.js offers a cross-browser implementation of text tracks.

A Note on “Remote” Text Tracks

Video.js refers to so-called “remote” text tracks. This is a convenient term for tracks that have an associated <track> element, as opposed to tracks that do not.

Either can be created programmatically, but only remote text tracks can be removed from a player. For that reason, we recommend only using remote text tracks.

Creating the Text File

Timed text requires a text file in WebVTT format. This format defines a list of “cues” that have a start time, an end time, and text to display. Microsoft has a builder that can help you get started on the file.
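
To make the cue structure concrete, here is a minimal sketch that serializes a list of cues into WebVTT text. The helper names formatTimestamp and toWebVTT are hypothetical and not part of Video.js, and cue times are assumed to be given in seconds.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(seconds) {
  var totalMs = Math.round(seconds * 1000);
  var ms = totalMs % 1000;
  var totalSeconds = Math.floor(totalMs / 1000);
  var s = totalSeconds % 60;
  var m = Math.floor(totalSeconds / 60) % 60;
  var h = Math.floor(totalSeconds / 3600);

  function pad(n, width) {
    return String(n).padStart(width, '0');
  }

  return pad(h, 2) + ':' + pad(m, 2) + ':' + pad(s, 2) + '.' + pad(ms, 3);
}

// Serialize cues ({start, end, text}) into the contents of a .vtt file.
// Cue text is assumed not to contain blank lines or the "-->" sequence.
function toWebVTT(cues) {
  var blocks = cues.map(function(cue) {
    return formatTimestamp(cue.start) + ' --> ' + formatTimestamp(cue.end) + '\n' + cue.text;
  });

  // The file starts with the "WEBVTT" header; blocks are separated by blank lines.
  return ['WEBVTT'].concat(blocks).join('\n\n') + '\n';
}

var vtt = toWebVTT([
  {start: 0, end: 2.5, text: '[ birds chirping ]'},
  {start: 2.5, end: 5, text: 'Welcome to the ocean.'}
]);

console.log(vtt);
```

The resulting string can be saved as a .vtt file and referenced from a track element.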

Note: When creating captions, there are additional caption formatting techniques to make captions more meaningful, like brackets around sound effects (e.g. [ birds chirping ]).

For a more in-depth style guide for captioning, see the Captioning Key, but keep in mind that not all features are supported by WebVTT or (more likely) by the Video.js WebVTT implementation.

Adding Text Tracks to Video.js

Once you have your WebVTT files created, you can add them to your video element using the track tag. Similar to source elements, track elements should be added as children of the video element:

  <video
    class="video-js"
    controls
    preload="auto"
    width="640"
    height="264"
    data-setup='{}'>
    <source src="//vjs.zencdn.net/v/oceans.mp4" type="video/mp4">
    <source src="//vjs.zencdn.net/v/oceans.webm" type="video/webm">
    <track kind="captions" src="//example.com/path/to/captions.vtt" srclang="en" label="English" default>
  </video>

Video.js will automatically read track elements from the video element. Tracks (remote and non-remote) can also be added programmatically.
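
As a minimal sketch of the programmatic route, the helper below adds a remote captions track to an existing player using Player#addRemoteTextTrack. The helper name addCaptions and the URL in the usage comment are assumptions for illustration only.

```javascript
// Add a remote captions track to a player and return the created <track> element.
// `player` is assumed to be a Video.js player; the helper name is hypothetical.
function addCaptions(player, src, srclang, label) {
  // The second argument (manualCleanup) is false, matching the example in the
  // API section of this guide.
  return player.addRemoteTextTrack({
    kind: 'captions',
    src: src,
    srclang: srclang,
    label: label
  }, false);
}

// Example usage with a real player (not executed here):
// var trackEl = addCaptions(player, '//example.com/path/to/captions.vtt', 'en', 'English');
```

The options mirror the attributes you would otherwise write on a track element.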

track Attributes

kind

standard definition

One of the track types supported by Video.js:

  • "subtitles" (default): Translations of the dialogue in the video for when audio is available but not understood. Subtitles are shown over the video.
  • "captions": Transcription of the dialogue, sound effects, musical cues, and other audio information for viewers who are deaf or hard of hearing, or for when the video is muted. Captions are also shown over the video.
  • "chapters": Chapter titles that are used to create navigation within the video. Typically, these are in the form of a list of chapters that the viewer can use to navigate the video.
  • "descriptions": Text descriptions of the action in the content for when the video portion isn’t available or because the viewer is blind or not using a screen. Descriptions are read by a screen reader or turned into a separate audio track.
  • "metadata": Tracks that have data meant for JavaScript to parse and do something with. These aren’t shown to the user.

label

standard definition

Short descriptive text for the track that will be used in the user interface, for example in a menu for selecting a captions language.

default

standard definition

The boolean default attribute can be used to indicate that a track’s mode should start as "showing". Otherwise, the viewer would need to select their language from a captions or subtitles menu.

Note: For chapters, default is required if you want the chapters menu to show.

srclang

standard definition

The valid BCP 47 code for the language of the text track, e.g. "en" for English or "es" for Spanish.

For supported language translations, please see the languages folder (/lang) located in the Video.js root and refer to the languages guide for more information on languages in Video.js.

Text Tracks from Another Domain

Because Video.js loads the text track file via JavaScript, the same-origin policy applies. If you’d like to have a player served from one domain, but the text track served from another, you’ll need to enable CORS on the server that is serving your text tracks.

In addition to enabling CORS, you will need to add the crossorigin attribute to the video element itself. This attribute has two possible values: "anonymous" and "use-credentials". Most users will want to use "anonymous" with cross-origin tracks:

  <video class="video-js" crossorigin="anonymous">
    <source src="//vjs.zencdn.net/v/oceans.mp4" type="video/mp4">
    <track src="//example.com/oceans.vtt" kind="captions" srclang="en" label="English">
  </video>

One thing to be aware of is that the video files themselves will also need CORS headers. This is because some browsers apply the crossorigin attribute to the video source itself and not just the tracks. This is considered a security concern by the spec.
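
What "enabling CORS" looks like depends on your server. As a rough sketch, assuming a Node.js server and a hypothetical player origin of https://player.example.com, a request handler could attach the relevant headers as shown below. The helper name allowCrossOriginTracks is an assumption and not part of Video.js.

```javascript
// Attach CORS headers so that a player served from `allowedOrigin` can fetch
// the text track (or video) file. `res` is assumed to be a Node.js
// http.ServerResponse; any object with a setHeader method works.
function allowCrossOriginTracks(res, allowedOrigin, withCredentials) {
  res.setHeader('Access-Control-Allow-Origin', allowedOrigin);

  // Only needed when the video element uses crossorigin="use-credentials".
  // In that case the allowed origin must be explicit, not the "*" wildcard.
  if (withCredentials) {
    res.setHeader('Access-Control-Allow-Credentials', 'true');
  }
}

// Example wiring (not executed here):
// require('http').createServer(function(req, res) {
//   allowCrossOriginTracks(res, 'https://player.example.com', false);
//   res.setHeader('Content-Type', 'text/vtt');
//   res.end('WEBVTT\n');
// }).listen(8080);
```

Most web servers and CDNs offer an equivalent configuration setting, which is usually preferable to hand-written handlers.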

Working with Text Tracks

Showing Tracks Programmatically

Certain use cases call for turning captions on and off programmatically rather than forcing the user to do so themselves. This can be easily achieved by modifying the mode property of the text tracks.

The mode can be one of three values: "disabled", "hidden", and "showing". When a text track’s mode is "disabled", the track does not show on screen as the video is playing.

When the mode is set to "showing", the track is visible to the viewer and updates while the video is playing.

  // Get all text tracks for the current player.
  var tracks = player.textTracks();

  for (var i = 0; i < tracks.length; i++) {
    var track = tracks[i];

    // Find the English captions track and mark it as "showing".
    if (track.kind === 'captions' && track.language === 'en') {
      track.mode = 'showing';
    }
  }
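
The same loop can be wrapped in a small helper that also turns other captions and subtitles tracks off, which is handy for a custom on/off toggle. This is a sketch only; the helper name showOnly is hypothetical and `player` is assumed to be a Video.js player.

```javascript
// Show the first track matching `kind` and `language` and disable every other
// captions/subtitles track. Returns the track that was shown, or null.
function showOnly(player, kind, language) {
  var tracks = player.textTracks();
  var shown = null;

  for (var i = 0; i < tracks.length; i++) {
    var track = tracks[i];

    // Leave chapters, descriptions, and metadata tracks alone.
    if (track.kind !== 'captions' && track.kind !== 'subtitles') {
      continue;
    }

    if (!shown && track.kind === kind && track.language === language) {
      track.mode = 'showing';
      shown = track;
    } else {
      track.mode = 'disabled';
    }
  }

  return shown;
}

// Example usage with a real player (not executed here):
// showOnly(player, 'captions', 'en'); // turn English captions on
// showOnly(player, null, null);       // turn all captions and subtitles off
```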

Listen for a Cue Becoming Active

One of the supported values for mode is "hidden". This mode means that the track will update as the video is playing, but it won’t be visible to the viewer. This is most useful for tracks where kind="metadata".

A common use case for metadata text tracks is to use them to trigger behaviors when their cues become active. For this purpose, tracks emit a "cuechange" event.

  // Get all text tracks for the current player.
  var tracks = player.textTracks();
  var metadataTrack;

  for (var i = 0; i < tracks.length; i++) {
    var track = tracks[i];

    // Find the metadata track that's labeled "ads".
    if (track.kind === 'metadata' && track.label === 'ads') {
      track.mode = 'hidden';

      // Store it for usage outside of the loop.
      metadataTrack = track;
    }
  }

  // Add a listener for the "cuechange" event and start ad playback.
  // Guard against the case where no matching track was found.
  if (metadataTrack) {
    metadataTrack.addEventListener('cuechange', function() {
      // player.ads is provided by an ads plugin such as videojs-contrib-ads.
      player.ads.startLinearAdMode();
    });
  }
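
A "cuechange" listener often needs to know which cues are currently active. The sketch below reads the track's activeCues list inside the listener and passes each cue's payload to a handler. It assumes the cue text is JSON, which is a choice about your metadata file and not a Video.js requirement; the helper name onMetadataCue is hypothetical.

```javascript
// Call `handler(payload, cue)` for every active cue each time the set of
// active cues changes. Cues whose text is not valid JSON are skipped.
function onMetadataCue(track, handler) {
  // "hidden" keeps the track updating without displaying it.
  track.mode = 'hidden';

  track.addEventListener('cuechange', function() {
    var cues = track.activeCues || [];

    for (var i = 0; i < cues.length; i++) {
      var payload;

      try {
        payload = JSON.parse(cues[i].text);
      } catch (e) {
        continue; // not JSON, ignore this cue
      }

      handler(payload, cues[i]);
    }
  });
}

// Example usage with a real metadata track (not executed here):
// onMetadataCue(metadataTrack, function(payload) {
//   console.log('cue became active', payload);
// });
```

Note that "cuechange" also fires when cues end, in which case activeCues may be empty and the handler is not called.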

Emulated Text Tracks

By default, Video.js will use native text tracks and fall back to emulated text tracks if the native functionality is broken, incomplete, or non-existent.

The Video.js API and TextTrack objects were modeled after the W3C specification. Video.js uses Mozilla’s vtt.js library to parse and display emulated text tracks.

To disable native text track functionality and force Video.js to use emulated text tracks always, the nativeTextTracks option can be passed to a tech:

  // Create a player, passing `nativeTextTracks: false` to the HTML5 tech.
  var player = videojs('myvideo', {
    html5: {
      nativeTextTracks: false
    }
  });

Text Track Settings

When using emulated text tracks, captions will have an additional item in the menu called “Caption Settings”. This allows the user to alter how captions are styled on screen.

This feature can be disabled by turning off the TextTrackSettings component and hiding the menu item.

  var player = videojs('myvideo', {
    // Make the text track settings dialog not initialize.
    textTrackSettings: false
  });

  /* Hide the captions settings item from the captions menu. */
  .vjs-texttrack-settings {
    display: none;
  }

Text Track Precedence

In general, "descriptions" tracks are of lower precedence than "captions" and "subtitles". What does this mean for developers using Video.js?

  • If you are using the default attribute, Video.js will choose the first track that is marked as default and turn it on. If there are multiple tracks marked default, it will turn on the first "captions" or "subtitles" track before any "descriptions" tracks.
    • This only applies to emulated text track support; native text track behavior varies by browser.
  • If a track is selected from the menu, Video.js will turn off all the other tracks of the same kind. While this suggests Video.js supports both "subtitles" and "captions" being turned on simultaneously, this is currently not the case; Video.js only supports one track being displayed at a time.
    • This means that for emulated text tracks, Video.js will display the first enabled "subtitles" or "captions" track.
    • When native text tracks are supported, other tracks of the same kind will still be disabled, but it is possible that multiple text tracks are shown.
    • If a "descriptions" track is selected and subsequently a "subtitles" or "captions" track is selected, the "descriptions" track is disabled and its menu button is also disabled.
  • When enabling a track programmatically, Video.js performs minimal enforcement.
    • For emulated text tracks, Video.js chooses the first track that’s "showing" - again choosing "subtitles" or "captions" over "descriptions".
    • For native text tracks, this behavior depends on the browser. Some browsers will allow multiple text tracks, but others will disable all other tracks when a new one is selected.
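
The rule for emulated text tracks can be illustrated with a small selection function. This is a sketch of the behavior described above, not the Video.js source; the function name pickTrackToDisplay is hypothetical and the input is assumed to be an array-like list of objects with kind and mode properties.

```javascript
// Pick the single track an emulated display would render:
// the first "showing" captions or subtitles track wins, and a "showing"
// descriptions track is used only when no such track exists.
function pickTrackToDisplay(tracks) {
  var firstDescriptions = null;

  for (var i = 0; i < tracks.length; i++) {
    var track = tracks[i];

    if (track.mode !== 'showing') {
      continue;
    }

    if (track.kind === 'captions' || track.kind === 'subtitles') {
      return track;
    }

    if (track.kind === 'descriptions' && !firstDescriptions) {
      firstDescriptions = track;
    }
  }

  return firstDescriptions;
}
```

For example, given a "showing" descriptions track followed by a "showing" captions track, the function returns the captions track even though it appears later in the list.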

API

For more complete information, refer to the Video.js API docs.

Remote Text Tracks

As mentioned above, remote text tracks represent the recommended API offered by Video.js as they can be removed.

  • Player#remoteTextTracks()

  • Player#remoteTextTrackEls()

  • Player#addRemoteTextTrack(Object options)

    Available options are the same as the available track attributes. language is also supported as an alias for the srclang attribute; either works here.

    Note: If you need a callback for when the track has loaded, you can listen for the load event on the returned track element instead:

    const trackEl = player.addRemoteTextTrack({src: 'en.vtt'}, false);

    trackEl.addEventListener('load', function() {
      // your callback goes here
    });
  • Player#removeRemoteTextTrack(HTMLTrackElement|TextTrack)
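
As a brief sketch of how these methods combine, the helper below removes every remote text track from a player, for example before loading a new video with its own tracks. The helper name removeAllRemoteTextTracks is hypothetical; `player` is assumed to be a Video.js player.

```javascript
// Remove every remote text track from the player.
function removeAllRemoteTextTracks(player) {
  var tracks = player.remoteTextTracks();

  // Iterate backwards because the list shrinks as tracks are removed.
  for (var i = tracks.length - 1; i >= 0; i--) {
    player.removeRemoteTextTrack(tracks[i]);
  }
}

// Example usage with a real player (not executed here):
// removeAllRemoteTextTracks(player);
```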

Text Tracks

It is generally recommended that you use remote text tracks rather than these purely programmatic text tracks for the majority of use-cases.

  • Player#textTracks()

  • Player#addTextTrack(String kind, [String label [, String language]])

    Note: Non-remote text tracks are intended for purely programmatic usage of tracks and have the important limitation that they cannot be removed once created.

    The standard addTextTrack does not have a corresponding removeTextTrack method; so, we actually discourage the use of this method!

  • TextTrackList()

  • TextTrack()