-
trackIdentifier of type DOMString -
The value of the MediaStreamTrack's id attribute. -
mid of type DOMString -
If the RTCRtpTransceiver owning this stream has a mid value that is not null, this is that value; otherwise this member MUST NOT be present. -
remoteId of type DOMString -
The remoteId is used for looking up the remote RTCRemoteOutboundRtpStreamStats object for the same SSRC. -
framesDecoded -
MUST NOT exist for audio. It represents the total number of frames correctly decoded for this RTP stream, i.e., frames that would be displayed if no frames are dropped.
-
keyFramesDecoded of type unsigned long -
MUST NOT exist for audio. It represents the total number of key frames, such as key frames in VP8 [RFC6386] or IDR-frames in H.264 [RFC6184], successfully decoded for this RTP media stream. This is a subset of framesDecoded. framesDecoded - keyFramesDecoded gives you the number of delta frames decoded. -
framesRendered -
MUST NOT exist for audio. It represents the total number of frames that have been rendered. It is incremented just after a frame has been rendered.
-
framesDropped of type unsigned long -
MUST NOT exist for audio. The total number of frames dropped prior to decode or dropped because the frame missed its display deadline for this receiver's track. The measurement begins when the receiver is created and is a cumulative metric as defined in Appendix A (g) of [RFC7004].
-
frameWidth of type unsigned long -
MUST NOT exist for audio. Represents the width of the last decoded frame. Before the first frame is decoded this member MUST NOT exist.
-
frameHeight of type unsigned long -
MUST NOT exist for audio. Represents the height of the last decoded frame. Before the first frame is decoded this member MUST NOT exist.
-
framesPerSecond of type double -
MUST NOT exist for audio. The number of decoded frames in the last second.
-
qpSum of type unsigned long long -
MUST NOT exist for audio. The sum of the QP values of frames decoded by this receiver. The count of frames is in framesDecoded.
The definition of QP value depends on the codec; for VP8, the QP value is the value carried in the frame header as the syntax element y_ac_qi, and defined in [RFC6386] section 19.2. Its range is 0..127.
Note that the QP value is only an indication of quantizer values used; many formats have ways to vary the quantizer value within the frame.
-
totalDecodeTime of type double -
MUST NOT exist for audio. Total number of seconds that have been spent decoding the framesDecoded frames of this stream. The average decode time can be calculated by dividing this value by framesDecoded. The time it takes to decode one frame is the time passed between feeding the decoder a frame and the decoder returning decoded data for that frame. -
totalInterFrameDelay of type double -
MUST NOT exist for audio. Sum of the interframe delays in seconds between consecutively rendered frames, recorded just after a frame has been rendered. The interframe delay variance can be calculated from totalInterFrameDelay, totalSquaredInterFrameDelay, and framesRendered according to the formula: (totalSquaredInterFrameDelay - totalInterFrameDelay^2 / framesRendered) / framesRendered. -
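The variance formula above can be sketched as a small helper. This is only an illustration: the function name interFrameDelayVariance is hypothetical, the input is assumed to be a plain object carrying the three counters from one inbound-rtp stats report, and the clamp to zero is an added guard against floating-point rounding rather than part of the formula.

```javascript
// Hypothetical helper: inter-frame delay variance (in seconds squared)
// from the cumulative counters of one video inbound-rtp stats object.
function interFrameDelayVariance(stats) {
  const { totalInterFrameDelay, totalSquaredInterFrameDelay, framesRendered } = stats;
  // Members are absent for audio or before any frame has been rendered.
  if (!framesRendered) return undefined;
  // (sumSq - sum^2 / n) / n, the formula given in the text.
  const variance =
    (totalSquaredInterFrameDelay -
      (totalInterFrameDelay * totalInterFrameDelay) / framesRendered) /
    framesRendered;
  // Assumption: clamp tiny negative results caused by rounding.
  return Math.max(0, variance);
}

// Delays of 30 ms, 30 ms and 40 ms: sum = 0.10, sum of squares = 0.0034.
const example = interFrameDelayVariance({
  totalInterFrameDelay: 0.1,
  totalSquaredInterFrameDelay: 0.0034,
  framesRendered: 3,
});
console.log(Math.sqrt(example)); // standard deviation in seconds
```

Taking the square root gives a standard deviation in seconds, which is often easier to read next to the average delay.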
totalSquaredInterFrameDelay of type double -
MUST NOT exist for audio. Sum of the squared interframe delays in seconds between consecutively rendered frames, recorded just after a frame has been rendered. See totalInterFrameDelay for details on how to calculate the interframe delay variance. -
pauseCount of type unsigned long -
MUST NOT exist for audio. Count the total number of video pauses experienced by this receiver. Video is considered to be paused if the time passed since the last rendered frame exceeds 5 seconds. pauseCount is incremented when a frame is rendered after such a pause. -
totalPausesDuration of type double -
MUST NOT exist for audio. Total duration of pauses (for definition of pause see pauseCount), in seconds. This value is updated when a frame is rendered. -
freezeCount of type unsigned long -
MUST NOT exist for audio. Count the total number of video freezes experienced by this receiver. It is a freeze if the frame duration, which is the time interval between two consecutively rendered frames, equals or exceeds Max(3 * avg_frame_duration_ms, avg_frame_duration_ms + 150), where avg_frame_duration_ms is the linear average of the durations of the last 30 rendered frames.
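The freeze threshold can be illustrated with a minimal sketch. The function names freezeThresholdMs and isFreeze are hypothetical, and two details are assumptions because the text does not specify them: the average is taken over the frames rendered before the one being tested, and windows shorter than 30 frames are averaged over whatever is available.

```javascript
// Hypothetical helpers illustrating the freeze rule described above.
// recentDurationsMs: durations (ms) of previously rendered frames, oldest first.
function freezeThresholdMs(recentDurationsMs) {
  // Linear average over the last 30 rendered frames (fewer if not yet available).
  const window = recentDurationsMs.slice(-30);
  const avg = window.reduce((sum, d) => sum + d, 0) / window.length;
  return Math.max(3 * avg, avg + 150);
}

function isFreeze(frameDurationMs, recentDurationsMs) {
  // Assumption: with no history there is no average, so nothing is a freeze.
  if (recentDurationsMs.length === 0) return false;
  return frameDurationMs >= freezeThresholdMs(recentDurationsMs);
}

// At a steady 33 ms per frame the "+ 150" term dominates; at 100 ms per frame
// the "3 *" term dominates.
const steady30fps = Array(30).fill(33);
console.log(freezeThresholdMs(steady30fps), isFreeze(200, steady30fps));
```

The two terms of Max() mean that high frame rates use an absolute margin of 150 ms, while low frame rates use a relative margin of three frame durations.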
-
totalFreezesDuration of type double -
MUST NOT exist for audio. Total duration of rendered frames which are considered as frozen (for definition of freeze see freezeCount), in seconds. This value is updated when a frame is rendered. -
lastPacketReceivedTimestamp of type DOMHighResTimeStamp -
Represents the timestamp at which the last packet was received for this SSRC. This differs from timestamp, which represents the time at which the statistics were generated by the local endpoint. - headerBytesReceived of type unsigned long long
-
Total number of RTP header and padding bytes received for this SSRC. This includes retransmissions. This does not include the size of transport layer headers such as IP or UDP. headerBytesReceived + bytesReceived equals the number of bytes received as payload over the transport. -
packetsDiscarded of type unsigned long long -
The cumulative number of RTP packets discarded by the jitter buffer due to late or early arrival, i.e., these packets are not played out. RTP packets discarded due to packet duplication are not reported in this metric [XRBLOCK-STATS]. Calculated as defined in [RFC7002] section 3.2 and Appendix A.a.
-
fecBytesReceived of type unsigned long long -
Total number of RTP FEC bytes received for this SSRC, only including payload bytes. This is a subset of bytesReceived. If a FEC mechanism that uses a different ssrc was negotiated, FEC packets are sent over a separate SSRC but are still accounted for here. -
fecPacketsReceived of type unsigned long long -
Total number of RTP FEC packets received for this SSRC. If a FEC mechanism that uses a different ssrc was negotiated, FEC packets are sent over a separate SSRC but are still accounted for here. This counter can also be incremented when receiving FEC packets in-band with media packets (e.g., with Opus). -
fecPacketsDiscarded of type unsigned long long -
Total number of RTP FEC packets received for this SSRC where the error correction payload was discarded by the application. This may happen (1) if all the source packets protected by the FEC packet were received or already recovered by a separate FEC packet, or (2) if the FEC packet arrived late, i.e., outside the recovery window, and the lost RTP packets have already been skipped during playout. This is a subset of fecPacketsReceived. -
bytesReceived of type unsigned long long -
Total number of bytes received for this SSRC. This includes retransmissions. Calculated as defined in [RFC3550] section 6.4.1.
-
firCount of type unsigned long -
MUST NOT exist for audio. Count the total number of Full Intra Request (FIR) packets, as defined in [RFC5104] section 4.3.1, sent by this receiver. Does not count the RTCP FIR indicated in [RFC2032] which was deprecated by [RFC4587].
-
pliCount of type unsigned long -
MUST NOT exist for audio. Count the total number of Picture Loss Indication (PLI) packets, as defined in [RFC4585] section 6.3.1, sent by this receiver.
-
totalProcessingDelay of type double -
It is the sum of the time, in seconds, each audio sample or video frame takes from the time the first RTP packet is received (reception timestamp) to the time the corresponding sample or frame is decoded (decoded timestamp). At this point the audio sample or video frame is ready for playout by the MediaStreamTrack. Typically, ready for playout here means after the audio sample or video frame is fully decoded by the decoder.
Given the complexities involved, the time of arrival or the reception timestamp is measured as close to the network layer as possible and the decoded timestamp is measured as soon as the complete sample or frame is decoded.
In the case of audio, several samples are received in the same RTP packet, so all samples will share the same reception timestamp but have different decoded timestamps. In the case of video, the frame is received over several RTP packets; in this case the reception timestamp of the earliest packet containing the frame is counted as the reception timestamp, and the decoded timestamp corresponds to when the complete frame is decoded.
This metric is not incremented for frames that are not decoded, i.e., framesDropped. The average processing delay can be calculated by dividing totalProcessingDelay by framesDecoded for video (or by totalSamplesDecoded from the provisional stats spec for audio). -
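The division described above can be sketched for video as follows. The helper name averageProcessingDelay is hypothetical, and taking the difference between two snapshots (to get an average over a recent interval rather than over the whole stream lifetime) is an assumption borrowed from how other cumulative counters in this section are used; passing a zeroed object as the first argument yields the lifetime average.

```javascript
// Hypothetical helper: average processing delay per decoded video frame, in
// seconds, between two snapshots of the same inbound-rtp stats object.
function averageProcessingDelay(prev, curr) {
  const frames = curr.framesDecoded - prev.framesDecoded;
  // No frame was decoded in the interval, or the members are absent.
  if (!(frames > 0)) return undefined;
  return (curr.totalProcessingDelay - prev.totalProcessingDelay) / frames;
}

// 50 frames decoded in the interval, 1.5 s of accumulated processing delay.
const delay = averageProcessingDelay(
  { totalProcessingDelay: 1.0, framesDecoded: 100 },
  { totalProcessingDelay: 2.5, framesDecoded: 150 }
);
console.log(delay); // seconds per frame
```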
nackCount of type unsigned long -
Count the total number of Negative ACKnowledgement (NACK) packets, as defined in [RFC4585] section 6.2.1, sent by this receiver.
-
estimatedPlayoutTimestamp of type DOMHighResTimeStamp -
This is the estimated playout time of this receiver's track. The playout time is the NTP timestamp of the last playable audio sample or video frame that has a known timestamp (from an RTCP SR packet mapping RTP timestamps to NTP timestamps), extrapolated with the time elapsed since it was ready to be played out. This is the "current time" of the track in NTP clock time of the sender and can be present even if there is no audio currently playing.
This can be useful for estimating how much audio and video is out of sync for two tracks from the same source: audioInboundRtpStats.estimatedPlayoutTimestamp - videoInboundRtpStats.estimatedPlayoutTimestamp. -
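The subtraction above can be wrapped in a small helper. The name estimateAvOffsetMs is hypothetical, and the sketch assumes both stats objects come from the same getStats() result and belong to tracks from the same source. DOMHighResTimeStamp values are expressed in milliseconds, so the result is in milliseconds.

```javascript
// Hypothetical helper: estimated audio/video offset in milliseconds.
// A positive value means the audio playout position is ahead of the video
// playout position on the sender's clock.
function estimateAvOffsetMs(audioInboundRtpStats, videoInboundRtpStats) {
  const audio = audioInboundRtpStats.estimatedPlayoutTimestamp;
  const video = videoInboundRtpStats.estimatedPlayoutTimestamp;
  // The member is absent until a usable RTCP SR mapping is known.
  if (typeof audio !== "number" || typeof video !== "number") return undefined;
  return audio - video;
}

console.log(
  estimateAvOffsetMs(
    { estimatedPlayoutTimestamp: 1000.5 },
    { estimatedPlayoutTimestamp: 980.5 }
  )
);
```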
jitterBufferDelay of type double -
The purpose of the jitter buffer is to recombine RTP packets into frames (in the case of video) and have smooth playout. The model described here assumes that the samples or frames are still compressed and have not yet been decoded. It is the sum of the time, in seconds, each audio sample or video frame takes from the time the first packet is received by the jitter buffer (ingest timestamp) to the time it exits the jitter buffer (emit timestamp). In the case of audio, several samples belong to the same RTP packet, hence they will have the same ingest timestamp but different jitter buffer emit timestamps. In the case of video, the frame may be received over several RTP packets, hence the ingest timestamp is that of the earliest packet of the frame that entered the jitter buffer and the emit timestamp is when the whole frame exits the jitter buffer. This metric increases upon samples or frames exiting, having completed their time in the buffer (and incrementing jitterBufferEmittedCount). The average jitter buffer delay can be calculated by dividing jitterBufferDelay by jitterBufferEmittedCount. -
jitterBufferTargetDelay of type double -
This value is increased by the target jitter buffer delay every time a sample is emitted by the jitter buffer. The added target is the target delay, in seconds, at the time that the sample was emitted from the jitter buffer. To get the average target delay, divide by jitterBufferEmittedCount. -
jitterBufferEmittedCount of type unsigned long long -
The total number of audio samples or video frames that have come out of the jitter buffer (increasing jitterBufferDelay). -
jitterBufferMinimumDelay of type double -
There are various reasons why the jitter buffer delay might be increased to a higher value, such as to achieve AV synchronization or because a jitterBufferTarget was set on an RTCRtpReceiver. When using one of these mechanisms, it can be useful to keep track of the minimal jitter buffer delay that could have been achieved, so WebRTC clients can track the amount of additional delay that is being added.
This metric works the same way as jitterBufferTargetDelay, except that it is not affected by external mechanisms that increase the jitter buffer target delay, such as jitterBufferTarget, AV sync, or any other mechanisms. This metric is purely based on the network characteristics such as jitter and packet loss, and can be seen as the minimum obtainable jitter buffer delay if no external factors would affect it. The metric is updated every time jitterBufferEmittedCount is updated. -
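The three jitter buffer sums share jitterBufferEmittedCount as their divisor, so their averages can be computed together. The sketch below is illustrative only: the helper name jitterBufferAverages and the returned field names are hypothetical, and reading "additional delay" as the average target delay minus the average minimum delay is an interpretation of the text, not a definition taken from it.

```javascript
// Hypothetical helper: per-sample (or per-frame) jitter buffer averages, in
// seconds, between two snapshots of the same inbound-rtp stats object.
function jitterBufferAverages(prev, curr) {
  const emitted = curr.jitterBufferEmittedCount - prev.jitterBufferEmittedCount;
  // Nothing left the jitter buffer in the interval.
  if (!(emitted > 0)) return undefined;
  const average = (key) => (curr[key] - prev[key]) / emitted;
  const delay = average("jitterBufferDelay");
  const target = average("jitterBufferTargetDelay");
  const minimum = average("jitterBufferMinimumDelay");
  // Assumption: delay added by AV sync, jitterBufferTarget, and similar.
  return { delay, target, minimum, additional: target - minimum };
}

const zero = {
  jitterBufferDelay: 0,
  jitterBufferTargetDelay: 0,
  jitterBufferMinimumDelay: 0,
  jitterBufferEmittedCount: 0,
};
console.log(
  jitterBufferAverages(zero, {
    jitterBufferDelay: 50,
    jitterBufferTargetDelay: 60,
    jitterBufferMinimumDelay: 40,
    jitterBufferEmittedCount: 1000,
  })
);
```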
totalSamplesReceived of type unsigned long long -
MUST NOT exist for video. The total number of samples that have been received on this RTP stream. This includes concealedSamples. -
concealedSamples of type unsigned long long -
MUST NOT exist for video. The total number of samples that are concealed samples. A concealed sample is a sample that was replaced with synthesized samples generated locally before being played out. Examples of samples that have to be concealed are samples from lost packets (reported in packetsLost) or samples from packets that arrive too late to be played out (reported in packetsDiscarded). -
silentConcealedSamples of type unsigned long long -
MUST NOT exist for video. The total number of concealed samples inserted that are "silent". Playing out silent samples results in silence or comfort noise. This is a subset of concealedSamples. -
concealmentEvents of type unsigned long long -
MUST NOT exist for video. The number of concealment events. This counter increases every time a concealed sample is synthesized after a non-concealed sample. That is, multiple consecutive concealed samples will increase the concealedSamples count multiple times but count as a single concealment event. -
insertedSamplesForDeceleration of type unsigned long long -
MUST NOT exist for video. When playout is slowed down, this counter is increased by the difference between the number of samples received and the number of samples played out. If playout is slowed down by inserting samples, this will be the number of inserted samples.
-
removedSamplesForAcceleration of type unsigned long long -
MUST NOT exist for video. When playout is sped up, this counter is increased by the difference between the number of samples received and the number of samples played out. If speedup is achieved by removing samples, this will be the count of samples removed.
-
audioLevel of type double -
MUST NOT exist for video. Represents the audio level of the receiving track. For audio levels of tracks attached locally, see RTCAudioSourceStats instead.
The value is between 0..1 (linear), where 1.0 represents 0 dBov, 0 represents silence, and 0.5 represents approximately a 6 dBSPL change in the sound pressure level from 0 dBov.
The audioLevel is averaged over some small interval, using the algorithm described under totalAudioEnergy. The interval used is implementation-defined. -
totalAudioEnergy of type double -
MUST NOT exist for video. Represents the audio energy of the receiving track. For audio energy of tracks attached locally, see RTCAudioSourceStats instead.
This value MUST be computed as follows: for each audio sample that is received (and thus counted by totalSamplesReceived), add the sample's value divided by the highest-intensity encodable value, squared and then multiplied by the duration of the sample in seconds. In other words, duration * Math.pow(energy/maxEnergy, 2).
This can be used to obtain a root mean square (RMS) value that uses the same units as audioLevel, as defined in [RFC6464]. It can be converted to these units using the formula Math.sqrt(totalAudioEnergy/totalSamplesDuration). This calculation can also be performed using the differences between the values of two different getStats() calls, in order to compute the average audio level over any desired time interval. In other words, do Math.sqrt((energy2 - energy1)/(duration2 - duration1)).
For example, if a 10ms packet of audio is produced with an RMS of 0.5 (out of 1.0), this should add 0.5 * 0.5 * 0.01 = 0.0025 to totalAudioEnergy. If another 10ms packet with an RMS of 0.1 is received, this should similarly add 0.0001 to totalAudioEnergy. Then, Math.sqrt(totalAudioEnergy/totalSamplesDuration) becomes Math.sqrt(0.0026/0.02) = 0.36, which is the same value that would be obtained by doing an RMS calculation over the contiguous 20ms segment of audio.
If multiple audio channels are used, the audio energy of a sample refers to the highest energy of any channel.
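The interval calculation above can be sketched as a helper that takes two snapshots of the same stats object. The function name averageAudioLevel is hypothetical, and the inputs are assumed to be plain objects carrying totalAudioEnergy and totalSamplesDuration from two getStats() calls.

```javascript
// Hypothetical helper: RMS audio level (0..1) over the interval between two
// snapshots, following Math.sqrt((energy2 - energy1)/(duration2 - duration1)).
function averageAudioLevel(prev, curr) {
  const duration = curr.totalSamplesDuration - prev.totalSamplesDuration;
  // No audio was received in the interval.
  if (!(duration > 0)) return undefined;
  const energy = curr.totalAudioEnergy - prev.totalAudioEnergy;
  // Assumption: guard against a slightly negative difference from rounding.
  return Math.sqrt(Math.max(0, energy) / duration);
}

// The worked example from the text: two 10 ms packets with RMS 0.5 and 0.1.
const level = averageAudioLevel(
  { totalAudioEnergy: 0, totalSamplesDuration: 0 },
  { totalAudioEnergy: 0.0025 + 0.0001, totalSamplesDuration: 0.02 }
);
console.log(level.toFixed(2)); // "0.36"
```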
-
totalSamplesDuration of type double -
MUST NOT exist for video. Represents the audio duration of the receiving track. For audio durations of tracks attached locally, see RTCAudioSourceStats instead.
Represents the total duration in seconds of all samples that have been received (and thus counted by totalSamplesReceived). Can be used with totalAudioEnergy to compute an average audio level over different intervals. -
framesReceived of type unsigned long -
MUST NOT exist for audio. Represents the total number of complete frames received on this RTP stream. This metric is incremented when the complete frame is received.
-
decoderImplementation of type DOMString -
MUST NOT exist unless exposing hardware is allowed.

MUST NOT exist for audio. Identifies the decoder implementation used. This is useful for diagnosing interoperability issues.
-
playoutId of type DOMString -
MUST NOT exist for video. If audio playout is happening, this is used to look up the corresponding RTCAudioPlayoutStats. -
powerEfficientDecoder of type boolean -
MUST NOT exist unless exposing hardware is allowed.

MUST NOT exist for audio. Whether the decoder currently used is considered power efficient by the user agent. This SHOULD reflect if the configuration results in hardware acceleration, but the user agent MAY take other information into account when deciding if the configuration is considered power efficient.
-
framesAssembledFromMultiplePackets of type unsigned long -
MUST NOT exist for audio. It represents the total number of frames correctly decoded for this RTP stream that consist of more than one RTP packet. For such frames the totalAssemblyTime is incremented. The average frame assembly time can be calculated by dividing totalAssemblyTime by framesAssembledFromMultiplePackets. -
totalAssemblyTime of type double -
MUST NOT exist for audio. The sum of the time, in seconds, each video frame takes from the time the first RTP packet is received (reception timestamp) to the time the last RTP packet of a frame is received. Only incremented for frames consisting of more than one RTP packet.
Given the complexities involved, the time of arrival or the reception timestamp is measured as close to the network layer as possible. This metric is not incremented for frames that are not decoded, i.e., framesDropped or frames that fail decoding for other reasons (if any). -
retransmittedPacketsReceived of type unsigned long long -
The total number of retransmitted packets that were received for this SSRC. This is a subset of packetsReceived. If RTX is not negotiated, retransmitted packets cannot be identified and this member MUST NOT exist. -
retransmittedBytesReceived of type unsigned long long -
The total number of retransmitted bytes that were received for this SSRC, only including payload bytes. This is a subset of bytesReceived. If RTX is not negotiated, retransmitted packets cannot be identified and this member MUST NOT exist. -
rtxSsrc of type unsigned long -
If RTX is negotiated for retransmissions on a separate RTP stream, this is the SSRC of the RTX stream that is associated with this stream's ssrc. If RTX is not negotiated, this value MUST NOT be present. -
fecSsrc of type unsigned long -
If a FEC mechanism that uses a separate RTP stream is negotiated, this is the SSRC of the FEC stream that is associated with this stream's ssrc. If FEC is not negotiated or uses the same RTP stream, this value MUST NOT be present. -
totalCorruptionProbability of type double -
MUST NOT exist for audio. Represents the cumulative sum of all corruption probability measurements that have been made for this SSRC; see corruptionMeasurements regarding when this attribute SHOULD be present.
Each measurement added to totalCorruptionProbability MUST be in the range [0.0, 1.0], where a value of 0.0 indicates the system has estimated there is no or negligible corruption present in the processed frame. Similarly, a value of 1.0 indicates there is almost certainly a corruption visible in the processed frame. A value in between those two indicates there is likely some corruption visible, but it could for instance have a low magnitude or be present only in a small portion of the frame.
Note
The corruption likelihood values are estimates, not guarantees. Even if the estimate is 0.0, there could be corruptions present (i.e., it's a false negative), for instance if only a very small area of the frame is affected. Similarly, even if the estimate is 1.0 there might not be a corruption present (i.e., it's a false positive), for instance if there are macroblocks with a QP far higher than the frame average. Just like there are edge cases for e.g. PSNR measurements, these metrics should primarily be used as a basis for statistical analysis rather than be used as an absolute truth on a per-frame basis.
-
totalSquaredCorruptionProbability of type double -
MUST NOT exist for audio. Represents the cumulative sum of all corruption probability measurements squared that have been made for this SSRC; see corruptionMeasurements regarding when this attribute SHOULD be present. -
corruptionMeasurements of type unsigned long long -
MUST NOT exist for audio. When the user agent is able to make a corruption probability measurement, this counter is incremented for each such measurement and totalCorruptionProbability and totalSquaredCorruptionProbability are aggregated with this measurement and measurement squared respectively. If the corruption-detection header extension is present in the RTP packets, corruption probability measurements MUST be present.
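Since the three corruption members are a sum, a sum of squares, and a count, one plausible use is to derive a mean and a standard deviation for statistical analysis. The text does not prescribe this calculation; the sketch below is an assumption modelled on the interframe delay variance formula, and the names corruptionSummary, mean, and stdDev are hypothetical.

```javascript
// Hypothetical helper: mean and standard deviation of the corruption
// probability measurements from one video inbound-rtp stats object.
function corruptionSummary(stats) {
  const n = stats.corruptionMeasurements;
  // No measurements were made, or the members are absent.
  if (!n) return undefined;
  const mean = stats.totalCorruptionProbability / n;
  // Population variance: E[x^2] - (E[x])^2, clamped against rounding error.
  const variance = Math.max(
    0,
    stats.totalSquaredCorruptionProbability / n - mean * mean
  );
  return { mean, stdDev: Math.sqrt(variance) };
}

// Three measurements: 0.0, 0.5 and 1.0.
console.log(
  corruptionSummary({
    totalCorruptionProbability: 1.5,
    totalSquaredCorruptionProbability: 1.25,
    corruptionMeasurements: 3,
  })
);
```

As with the other cumulative counters, the same calculation could be applied to the differences between two snapshots to look at a recent interval only.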