9.1 Introduction

The MediaStreamTrack interface, as defined in the [GETUSERMEDIA] specification, typically represents a stream of audio or video data. MediaStreamTrack objects can be collected in a MediaStream (strictly speaking, a MediaStream as defined in [GETUSERMEDIA] may contain zero or more MediaStreamTrack objects).

A MediaStreamTrack may be extended to represent a media flow that either comes from or is sent to a remote peer (and not just the local camera, for instance). The extensions required to enable this capability on the MediaStreamTrack object are described in this section. How the media is transmitted to the peer is described in [RFC8834], [RFC7874], and [RFC8835].

A MediaStreamTrack sent to another peer will appear as one and only one MediaStreamTrack to the recipient. A peer is defined as a user agent that supports this specification. In addition, the sending side application can indicate which MediaStream object(s) the MediaStreamTrack is a member of. The corresponding MediaStream object(s) on the receiver side will be created (if not already present) and populated accordingly.
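
As an illustration only, the stream association described above might be sketched in TypeScript as follows. The addTrack(track, ...streams) method and the track and streams attributes of the track event belong to the RTCPeerConnection API. The names TrackLike, StreamLike, SendingPeer, TrackEventLike, sendTracks, and recordRemoteTrack are hypothetical and are introduced here as structural stand-ins so that the sketch does not depend on a browser environment.

```typescript
// Hypothetical structural stand-ins for MediaStreamTrack and MediaStream.
interface TrackLike { readonly id: string; readonly kind: string; }
interface StreamLike { readonly id: string; }

// The part of RTCPeerConnection that the sending side uses in this sketch.
interface SendingPeer<T extends TrackLike, S extends StreamLike> {
  addTrack(track: T, ...streams: S[]): unknown;
}

// The part of the track event that the receiving side uses in this sketch.
interface TrackEventLike<T extends TrackLike, S extends StreamLike> {
  readonly track: T;
  readonly streams: ReadonlyArray<S>;
}

// Sending side: send each track and name the stream(s) it is a member of.
function sendTracks<T extends TrackLike, S extends StreamLike>(
  pc: SendingPeer<T, S>,
  tracks: T[],
  streams: S[]
): void {
  for (const track of tracks) {
    pc.addTrack(track, ...streams);
  }
}

// Receiving side: index each incoming track under every stream id it arrived with.
function recordRemoteTrack<T extends TrackLike, S extends StreamLike>(
  event: TrackEventLike<T, S>,
  byStreamId: Record<string, T[]>
): void {
  for (const stream of event.streams) {
    const tracks = byStreamId[stream.id] || [];
    tracks.push(event.track);
    byStreamId[stream.id] = tracks;
  }
}
```

In a browser, an application could presumably pass an RTCPeerConnection and the result of stream.getTracks() to sendTracks, and call recordRemoteTrack from an ontrack handler. The grouping helper is only bookkeeping for the application; the user agent itself creates and populates the receiver-side MediaStream objects.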

As described earlier in this document, the RTCRtpSender and RTCRtpReceiver objects can be used by the application to get more fine-grained control over the transmission and reception of MediaStreamTracks.
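
As an illustration only, one form of such control is replacing the track that an RTCRtpSender transmits. The sketch below relies on getSenders() of RTCPeerConnection and on the track attribute and replaceTrack() method of RTCRtpSender. The names SenderLike, PeerWithSenders, senderFor, and swapOutgoingTrack are hypothetical stand-ins and helpers, chosen so that the sketch does not depend on a browser environment.

```typescript
// Hypothetical stand-in for RTCRtpSender. R is the return type of
// replaceTrack(), which is a promise in browsers.
interface SenderLike<T, R> {
  readonly track: T | null;
  replaceTrack(track: T | null): R;
}

// The part of RTCPeerConnection used in this sketch.
interface PeerWithSenders<T, R> {
  getSenders(): SenderLike<T, R>[];
}

// Find the sender that currently transmits the given track, if any.
function senderFor<T, R>(
  pc: PeerWithSenders<T, R>,
  track: T
): SenderLike<T, R> | undefined {
  for (const sender of pc.getSenders()) {
    if (sender.track === track) {
      return sender;
    }
  }
  return undefined;
}

// Replace the outgoing track on its sender. Returns whatever replaceTrack()
// returns, or undefined when no sender is transmitting `current`.
function swapOutgoingTrack<T, R>(
  pc: PeerWithSenders<T, R>,
  current: T,
  next: T | null
): R | undefined {
  const sender = senderFor(pc, current);
  return sender ? sender.replaceTrack(next) : undefined;
}
```

In a browser, the value returned by swapOutgoingTrack would be the promise from replaceTrack(), which the application is expected to await. Passing null as the next track leaves the sender in place but without a track to send.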

Channels are the smallest unit considered in the Media Capture and Streams specification. Channels are intended to be encoded together for transmission as, for instance, an RTP payload type. All of the channels that a codec needs to encode jointly MUST be in the same MediaStreamTrack and the codecs SHOULD be able to encode, or discard, all the channels in the track.

The concepts of an input and output to a given MediaStreamTrack apply in the case of MediaStreamTrack objects transmitted over the network as well. A MediaStreamTrack created by an RTCPeerConnection object (as described previously in this document) will take as input the data received from a remote peer. Similarly, a MediaStreamTrack from a local source, for instance a camera via [GETUSERMEDIA], will have an output that represents what is transmitted to a remote peer if the object is used with an RTCPeerConnection object.

The concept of duplicating MediaStream and MediaStreamTrack objects as described in [GETUSERMEDIA] is also applicable here. This feature can be used, for instance, in a video-conferencing scenario to display the local video from the user’s camera in a local monitor while transmitting only the audio to the remote peer (e.g., in response to the user using a “video mute” feature). Combining different MediaStreamTrack objects into new MediaStream objects is useful in certain situations.
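
As an illustration only, the “video mute” scenario might be sketched as follows. The getAudioTracks(), getVideoTracks(), and clone() methods are defined in [GETUSERMEDIA]. The names CloneableTrack, CapturedStream, VideoMutePlan, and planVideoMute are hypothetical and are used so that the sketch does not depend on a browser environment.

```typescript
// Hypothetical stand-in for MediaStreamTrack: only clone() is needed here.
interface CloneableTrack<T> {
  clone(): T;
}

// Hypothetical stand-in for MediaStream: only the track getters are needed.
interface CapturedStream<T> {
  getAudioTracks(): T[];
  getVideoTracks(): T[];
}

// Which tracks go to the local monitor and which go to the remote peer.
interface VideoMutePlan<T> {
  preview: T[];
  outgoing: T[];
}

// Clone the video tracks for the local monitor and keep only the audio
// tracks for transmission.
function planVideoMute<T extends CloneableTrack<T>>(
  stream: CapturedStream<T>
): VideoMutePlan<T> {
  const preview: T[] = [];
  for (const track of stream.getVideoTracks()) {
    preview.push(track.clone());
  }
  return { preview, outgoing: stream.getAudioTracks() };
}
```

In a browser, the preview tracks could presumably be combined with new MediaStream(plan.preview) and rendered in a video element, while each outgoing track is passed to addTrack() on the RTCPeerConnection. Cloning is a design choice of this sketch: it lets the preview tracks be stopped or disabled without affecting the originals.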

Note

In this document, we only specify aspects of the following objects that are relevant when used along with an RTCPeerConnection. Please refer to the original definitions of the objects in the [GETUSERMEDIA] document for general information on using MediaStream and MediaStreamTrack.