OpusFAQ - XiphWiki

If you are looking for info not covered in this FAQ, try the main Opus website or the pages included in the Opus category of this wiki.
General Questions
What is Opus? Who created it?
Opus is a totally open, royalty-free, highly versatile audio codec.
It is primarily designed for interactive speech and music transmission over the Internet, but is also applicable to storage and streaming applications. It incorporates technology from Skype's SILK codec and Xiph.Org's CELT codec. It has been standardized by the Internet Engineering Task Force (IETF) as RFC 6716.
Opus has been in development since early 2007. Programmers associated with Xiph.Org, Skype, and several other organizations have contributed to its development and to the standardization process as part of the IETF's Codec Working Group.
How does Opus compare to other codecs?
Opus is distinguished from most high quality formats (e.g. Vorbis, AAC, MP3) by having low delay (5 to 66.5 ms) and distinguished from most low delay formats (e.g. Speex, G.711, GSM) by supporting high audio quality (supports narrow-band all the way to full-band audio).
It meets or exceeds existing codecs' quality across a wide range of bitrates, and it operates at lower delay than virtually any existing compressed format.
Most importantly, the Opus format and its reference implementation are both available under liberal, royalty-free licenses.
This makes it:
easy to adopt
compatible with free software
suitable for use as part of the basic infrastructure of the Internet
Does Opus make all those other lossy codecs obsolete?
Yes.
From a technical point of view (loss, delay, bitrates, ...) Opus renders Speex obsolete and should also replace Vorbis and the common proprietary codecs too (e.g. AAC, MP3, ...).
Will Opus replace Vorbis in video files?
For Ogg video files (which use the Theora video codec), you can use Opus instead of Vorbis, but the overall size reduction will be minimal and it will break compatibility with existing players.
For WebM video files, the convention is to use the VP9 video codec when using Opus as an audio codec.
How do I use Opus?
For now, the best way to encode audio into Opus files is to use the opusenc command-line tool from the opus-tools package.
If you want to encode many files at once (e.g. your music library), try the applications listed in the Opus Support page.
For rough guidelines on encoding settings, see the Opus Recommended Settings page.
What programs support Opus?
Opus decoding support is now included in some Internet browsers and many applications, including Firefox, foobar2000, and VLC, as well as in frameworks such as GStreamer and FFmpeg.
For real-time applications, Opus support is available in Google's WebRTC codebase.
Opus is a relatively new codec (standardized in September 2012), but many more applications will support it in the near future.
Does Opus support higher sampling rates, such as 96 kHz or 192 kHz?
Yes and no.
Opus encoding tools like opusenc will happily encode input files that are sampled at 96 or 192 kHz.
However, files at these rates are internally converted to 48 kHz and then only frequencies up to 20 kHz are encoded.
The reason is simple: lossy codecs are designed to preserve audible details while discarding irrelevant information. Since the human ear can only hear up to 20 kHz at best (usually lower than that), frequency content above 20 kHz is the first thing to go.
See Monty's article for more details.
If you want a codec to handle high sampling rates losslessly, use FLAC.
What are the licensing requirements?
The reference Opus source code is released under a three-clause BSD license, which is a very permissive Open Source license. Commercial use and distribution (including in proprietary software) is permitted, provided that some basic conditions specified in the license are met.
Opus is also covered by some patents, for which royalty-free usage rights are granted, under conditions that the authors believe are compatible with (hopefully) all open source licenses, including the GPL (v2 and v3).
See the Opus Licensing page for details.
Why make Opus free?
On the Internet, protocol and codec standards are part of the common infrastructure everyone builds upon.
Most of the value of a high-quality standard is the innovation and inter-operation provided by the systems built on top of it. When a few parties have monopoly rights to monetize a standard, that infrastructure stops being so common and everyone else has more reason to use their own solution instead, increasing cost and reducing efficiency.
Imagine a road system where each type of car could only drive on its own manufacturer's pavement. We all benefit from living in a world where all the roads are connected.
This is why Opus, unlike many codecs, is free.
Is the SILK part of Opus compatible with the SILK implementation shipped in Skype?
No.
The SILK codec, as submitted by Skype to the IETF, was heavily modified as part of its integration within Opus. The modifications are significant enough that it is not possible to just write a "translator". Even sharing code between Opus and the "old SILK" would be highly complex.
Why not keep the SILK and CELT codecs separate?
Opus is more than just two independent codecs with a switch.
In addition to a Linear Prediction SILK mode and an MDCT CELT mode, it has a hybrid mode, where speech frequencies up to 8 kHz are encoded with LP while those between 8 and 20 kHz are encoded with MDCT. This is what allows Opus to have such high speech quality around 32 kbps.
Another advantage of the integration is the ability to switch between these 3 modes seamlessly, without any audible "glitches" and without any out-of-band signalling.
Now that Opus is standardized, will its development stop or can it be further improved?
Yes, Opus can and should be improved, because unlike most ITU-T codecs, Opus is only defined in terms of its decoder.
The encoder can keep evolving as long as the bitstream it produces can be decoded by the reference decoder. This is what made it possible for modern MP3 encoders (e.g. LAME) to improve far beyond the original L3enc and dist10 reference implementations.
Although it is unlikely that Opus encoders will see such a spectacular evolution, we certainly hope that future encoders will become much better than the reference encoder.
In fact, the 1.1 libopus release significantly improves on the reference encoder's quality. See Monty's demo for more details.
Will all future Opus releases comply with the Opus specification?
Yes.
In what ways is Opus optimized for the Internet?
Opus has good packet loss robustness and concealment, but its optimisations go further.
One of the first things we were asked when designing Opus was to make the rate really adaptable, because we never know what kind of rates will be available. This not only meant having a wide range of bitrates, but also being able to vary in small increments.
This is why Opus scales from about 6 kb/s to 512 kb/s, in increments of 0.4 kb/s (one byte with 20 ms frames). Opus can have more than 1200 possible bitrates while spending zero bits signalling the bitrate, because UDP already encodes the packet size.
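The arithmetic behind those figures can be checked with a short sketch. It assumes 20 ms frames and takes 6 kb/s as the lower bound and 512 kb/s as the upper bound; the function names are made up for this illustration and are not part of any Opus API.

```python
BITS_PER_BYTE = 8

def bitrate_step_bps(frame_ms=20):
    """Bitrate change (bits/s) caused by adding one byte to every packet."""
    return BITS_PER_BYTE * 1000 / frame_ms

def packet_bytes(bitrate_bps, frame_ms=20):
    """Whole bytes available per packet at a given bitrate and frame duration."""
    return bitrate_bps * frame_ms // (BITS_PER_BYTE * 1000)

def count_bitrates(min_bps=6000, max_bps=512000, frame_ms=20):
    """Number of distinct packet sizes (hence bitrates) between the two bounds."""
    return packet_bytes(max_bps, frame_ms) - packet_bytes(min_bps, frame_ms) + 1

print(bitrate_step_bps())   # 400.0 bits/s, i.e. 0.4 kb/s per byte
print(count_bitrates())     # 1266 possible packet sizes with 20 ms frames
```

Because each packet size maps to one bitrate, the receiver learns the rate from the packet length alone, which the transport already carries.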
One last aspect is that Opus is simple to transport over RTP, as can be seen from the Opus RTP payload format. For example, it's possible to decode RTP packets without having even seen the SDP or any out-of-band signalling.
What applications for Android can play Opus?
Right now, there are just a few, but that list is growing fast. Please refer to this question on android.stackexchange.com. Feel free to suggest other applications.
When will the next version be released?
When it's done. Seriously, we do not know.
Opus is not a large project with a fixed release schedule.
That being said, our pre-releases and even the git repositories (Xiph, GitHub) are pretty stable and, given proper testing (which you should always do anyway), are safe to distribute.
Just be aware that the API of new features (that have never been included in a stable release) could potentially still change.
Software Developers' Questions
On what platforms does Opus run?
The Opus code base is written in C89 and should run on the vast majority of recent (and not so recent) CPUs.
Some of the platforms on which Opus has been tested include x86, x86-64, ARM, Itanium, Blackfin, and SPARC.
Is there a fixed-point implementation?
Yes.
The fixed-point and floating-point decoder and encoder implementations are part of the same code base.
The code defaults to float, so you need to configure with --enable-fixed-point (or define FIXED_POINT if not using the configure script) to build the code for fixed-point.
Which implementation should I use?
While the implementation in RFC 6716 is what defines the standard, it is likely not the best and most up-to-date implementation.
The Opus website was set up for the purpose of continually improving the implementation (in terms of speed, encoding quality, device compatibility, etc.) while still conforming to the standard.
All Opus implementations are compatible by definition.
How is supporting Opus different from supporting Speex/G.711/MP3?
Opus has variable frame durations which can change on the fly, so an Opus decoder needs to be ready to accept packets with durations that are any multiple of 2.5 ms up to a maximum of 120 ms.
The Opus encoder and decoder do not need to have matched sampling rates or channel counts. It is recommended to always just decode at the highest rate the hardware supports (e.g. 48 kHz stereo) so the user gets the full quality of whatever the far end is sending.
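As a rough illustration of what this means for buffer sizing, the sketch below converts the duration limits above into sample counts. The constant and function names are hypothetical helpers for this example, not libopus API.

```python
FRAME_UNIT_48K = 120    # 2.5 ms expressed in samples at 48 kHz
MAX_PACKET_48K = 5760   # 120 ms expressed in samples at 48 kHz

def is_valid_packet_duration(samples_48k):
    """True if a packet duration (in 48 kHz samples) is a multiple of 2.5 ms, up to 120 ms."""
    return 0 < samples_48k <= MAX_PACKET_48K and samples_48k % FRAME_UNIT_48K == 0

def max_decode_buffer(rate_hz, channels):
    """Interleaved samples needed to hold the longest possible packet (120 ms)."""
    return rate_hz * 120 // 1000 * channels

print(is_valid_packet_duration(960))   # True: 20 ms
print(max_decode_buffer(48000, 2))     # 11520 samples for 48 kHz stereo
```

Sizing the output buffer for the 120 ms worst case means the application never has to know in advance which frame size the far end picked.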
My application doesn't work. Can anyone help me?
It's possible to get help, but before doing so, there are a few basic things to try:
Implement your application with uncompressed audio instead of Opus. If it still doesn't work, then the problem isn't related to Opus.
Read the Opus documentation.
Read the opus_demo.c source code to see how to use the encoder and decoder.
If you still can't solve the problem, the best option is to ask for help on the mailing list or on the #opus IRC channel on irc.freenode.net.
How do I report a bug?
If you think you have found a bug in Opus (and not in your application), please file an issue.
Please include a way for us to reproduce the problem. The best way to do this is to provide an input file, along with the opusenc/opusdec/opus_demo command line that causes the bug to occur.
If the bug cannot be triggered by the command line tools, please provide a simple patch or C file that can help reproduce it. Please also provide any other relevant information, such as OS, CPU, build options, etc.
Don't hesitate to also contact us on the mailing list or on IRC.
What is Opus Custom?
Opus Custom is an optional part of the Opus standard that allows for sampling rates other than 8, 12, 16, 24, or 48 kHz and frame sizes other than multiples of 2.5 ms.
Opus Custom requires additional out-of-band signalling that Opus does not normally require and disables many of Opus' coding modes. Also, because it is an optional part of the specification, using Opus Custom may lead to compatibility problems.
For these reasons, its use is discouraged outside of very specific applications.
You may want to use Opus Custom for:
ultra-low-delay applications, where synchronization with the soundcard buffer is important.
low-power embedded applications, where compatibility with others is not important.
For almost all other types of applications, Opus Custom should not be used.
How do I use 44.1 kHz or some other sampling rate not directly supported by Opus?
Tools which read or write Opus should inter-operate with other sampling rates by transparently performing sample rate conversion behind the scenes whenever necessary. In particular, software developers should not use Opus Custom for 44.1 kHz support, except in the very specific circumstances outlined above.
Note that it's generally preferable for a decoder to output at 48kHz, even when you know the original input was 44.1kHz. This is not only because you can skip resampling, but also because many cheaper audio interfaces have poor quality output for 44.1kHz.
The opus-tools package source code contains a small, high quality, high performance, BSD licensed resampler which can be used where resampling is required.
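To make the 44.1 kHz to 48 kHz conversion concrete, here is a minimal sketch of the rate ratio and the resulting sample counts. It only does the bookkeeping; it is not a resampler, and a real one also adds filter delay that this ignores. The function names are illustrative.

```python
import math
from fractions import Fraction

OPUS_RATE_HZ = 48000

def resample_ratio(input_rate_hz, output_rate_hz=OPUS_RATE_HZ):
    """Exact output/input sample ratio as a reduced fraction."""
    return Fraction(output_rate_hz, input_rate_hz)

def resampled_length(n_input_samples, input_rate_hz, output_rate_hz=OPUS_RATE_HZ):
    """Output samples needed to cover n_input_samples, rounded up."""
    return math.ceil(n_input_samples * resample_ratio(input_rate_hz, output_rate_hz))

print(resample_ratio(44100))            # 160/147
print(resampled_length(44100, 44100))   # 48000: one second in, one second out
```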
But won't the resampler hurt the quality? Isn't it better to use 44.1 kHz directly?
Not really. The quality degradation caused by any reasonable resampler (SoX, libspeexdsp, libsamplerate, ...) is far less than the distortion caused by the best lossy codec at its highest bitrate. If you can't tolerate the quality degradation caused by a good 44.1 ↔ 48 kHz resampler, then you shouldn't be using a lossy codec in the first place. Similarly, the extra CPU spent in the resampler is small compared to the rest of the codec. Not only that, but many soundcards only support 48 kHz on playback, so players can directly play the output rather than resample it to 48 kHz (e.g. for a 44.1 kHz MP3). So effectively, Opus is only shifting the burden of resampling from the decoder side to the encoder side.
One advantage of supporting only one internal rate is that it makes it possible for Opus to support many features, including efficient speech compression (through SILK) and real-time applications. It also means all the quality tuning effort can be spent on a single configuration, which helps bring even better quality.
How is the bitrate setting used in VBR mode?
Variable bitrate (VBR) mode allows the bitrate to automatically vary over time based on the audio being encoded, in order to achieve a consistent quality.
The bitrate setting controls the desired quality, on a scale that is calibrated to closely approximate the average bitrate that would be obtained over a large and diverse collection of audio. The actual bitrate of any particular audio stream may be higher or lower than this average.
What frame size should I use?
A 20 ms frame size works well for most applications. Smaller frame sizes may be used to achieve lower latency, but have lower quality at a given bitrate.
Sizes greater than 20 ms increase latency and are generally beneficial only at fairly low bitrates, or when used to reduce external overhead (e.g. by reducing the number of packets that are sent). For file encoding, using a frame size larger than 20 ms will usually result in worse quality for the same bitrate because it constrains the encoder in the decisions it can make.
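The "external overhead" point can be illustrated with a small calculation. This sketch assumes one Opus frame per packet and a 40-byte IPv4 + UDP + RTP header with no extensions; both are assumptions of this example, and real deployments will differ.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no options or extensions (assumed)

def packets_per_second(frame_ms):
    """Packets sent each second when every packet carries one frame."""
    return 1000 / frame_ms

def header_overhead_bps(frame_ms, header_bytes=IP_UDP_RTP_HEADER_BYTES):
    """Bits per second spent on packet headers rather than audio."""
    return header_bytes * 8 * packets_per_second(frame_ms)

for ms in (10, 20, 40):
    print(ms, "ms frames:", header_overhead_bps(ms), "bits/s of header overhead")
```

Under these assumptions, 20 ms frames cost 16 kb/s in headers and 40 ms frames cost 8 kb/s, which is why longer frames mainly pay off when the audio bitrate itself is low.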
Forward Error correction (FEC) doesn't appear to do anything! HELP!
The in-band FEC feature of Opus helps reduce the harm of packet loss by encoding some information about the prior packet.
In order to make use of in-band FEC, the receiver must delay its output by at least one frame so that it can call the decoder with the decode_fec argument on the next frame in order to reconstruct the missed frame. This works best if it's integrated with a jitter buffer.
FEC is only used by the encoder under certain conditions:
the feature must be enabled via the OPUS_SET_INBAND_FEC CTL
the encoder must be told to expect loss via the OPUS_SET_PACKET_LOSS_PERC CTL
the codec must be operated in any of the Linear Prediction or Hybrid modes
Frame durations shorter than 10ms and very high bitrates will use the MDCT modes, where FEC is not available.
Even when FEC is not used, telling the encoder about the expected level of loss will help it make more intelligent decisions. By default, the implementation assumes there is no loss.
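The receiver-side logic described above can be sketched as a decision per frame slot. This is a simplified model, assuming a one-frame look-ahead and one frame per packet; plan_decodes is a hypothetical helper, and the action names stand for the corresponding decoder calls rather than being real API.

```python
def plan_decodes(received):
    """Choose a decode action for each frame slot.

    received[i] is True if packet i arrived. The receiver is assumed to
    hold output back by one frame so it can peek at packet i + 1.
    """
    actions = []
    for i, arrived in enumerate(received):
        if arrived:
            actions.append("decode")   # normal decode of packet i
        elif i + 1 < len(received) and received[i + 1]:
            actions.append("fec")      # decode packet i + 1 with decode_fec set
        else:
            actions.append("plc")      # no data: call the decoder with a NULL packet
    return actions

print(plan_decodes([True, False, True, False, False, True]))
```

A single lost packet is recovered from the FEC data in its successor, while the first packet of a longer burst still falls back to concealment, since each packet only carries information about the one before it.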
I can't use malloc or much stack on my embedded platform. How do I make Opus work?
A normal build of libopus only uses malloc/free in the _create() and _destroy() calls, making it safe for realtime use as long as the codec state is pre-created.
To build Opus without the references to malloc/free, you must:
use _init() calls rather than _create() calls in your application
compile with CFLAGS="-DOVERRIDE_OPUS_ALLOC -DOVERRIDE_OPUS_FREE -D'opus_alloc(x)=NULL' -D'opus_free(x)=NULL' "
If libopus is built with -DNONTHREADSAFE_PSEUDOSTACK (instead of VAR_ARRAYS or USE_ALLOCA), it will use a user-provided block of heap instead of stack for many things, resulting in much lower stack usage.
This makes the resulting library non-threadsafe and is not recommended on anything except limited embedded platforms.
How can I ensure that my software interoperates with other software implementing Opus?
For applications using Ogg files, there are some Ogg Opus testvectors to test decoders and you can test encoders with opusdec. For RTP applications, the opusrtp tool can be useful.
In general, here's a list of specific issues to check:
Can your application handle all frame sizes, including changing the frame size from frame to frame?
Does your application react properly to lost packets, by calling the decoder with a NULL packet?
What is the complexity of Opus?
The complexity of Opus varies by a large amount based on the settings used.
It depends on the mode, audio bandwidth, number of channels, and even a "complexity knob" that can trade complexity for quality. It will run easily on any recent PC or smartphone.
For slower embedded CPUs/DSPs, the amount of CPU required will vary depending on the configuration and the exact CPU, so you will need to experiment. Do not expect Opus to run quickly on really slow devices like 8-bit micro-controllers.
Opus is using too much CPU for my application. What can I do?
First, don't panic, and don't start writing assembly just yet.
It's possible that you're just not using the right set of options.
If you're targeting an embedded/mobile platform, chances are the fixed-point build will be faster, so make sure you're using --enable-fixed-point or defining FIXED_POINT in the build system.
Opus also has a complexity option that can trade quality for complexity. The default is highest quality and highest complexity. You can control this using OPUS_SET_COMPLEXITY() (see the Documentation for details).
If all else fails and you need to optimize the Opus code, see the next question.
I would like to optimize/improve/help with Opus. Where should I start?
Please contact us before you start, or at least before you get too far.
This will help coordinate the efforts made on Opus and reduce the probability of wasting your time on duplicated effort or going down the wrong path. More details are in the contributing page.
Does Opus have an echo canceller like Speex does?
Echo cancellation is completely independent from codecs.
You can use any echo canceller (including the one from libspeexdsp) along with Opus.
That being said, among the free acoustic echo cancellers (AEC) we're aware of, the best is probably the Google AEC from the WebRTC codebase.
How do I get the duration of a .opus file?
Use op_pcm_total() from libopusfile.
If you want to implement this yourself, you need to:
Read the BOS (Beginning Of Stream) pages to enumerate the serial numbers of all concurrently multiplexed streams, identify the Opus stream you want, and get its preskip value.
Read up through the first complete audio data page to compute the starting granule position (since the timestamps might not start at 0, e.g., if the file was captured from a live stream that was joined after the start).
Seek near the end of a file and look for a page with the same serial number as found in the headers (just under 64 kB from the end should be sufficient to ensure you find a page, assuming the Opus data is not multiplexed with another stream and there is no trailing garbage in the file).
If you find a page whose serial number was not included in the original set of BOS pages, you have a chained stream. You need to bisect the file to identify the end of the first chain and the start of the next, and repeat this process for each link in the chain.
If you don't find any pages at all, or find a page whose serial number was included in the original set of BOS pages, but was not the serial number of the Opus stream you want, back up and try again (being careful to avoid rescanning the same data, which can produce quadratic worst-case complexity).
If you find a page whose serial number matches the Opus stream you want, look at its final granule position, and compute the total duration (in seconds) as (final_granule_position - initial_granule_position - preskip)/48000.0.
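The final step of the list above reduces to one line of arithmetic. A minimal sketch, assuming the three values have already been extracted from the Ogg pages and headers (the parsing itself is not shown, and the function name is made up for this example):

```python
OPUS_GRANULE_RATE_HZ = 48000  # granule positions count samples at 48 kHz

def opus_duration_seconds(final_granule_position, initial_granule_position, preskip):
    """Playable duration of one (unchained) Opus stream, in seconds."""
    samples = final_granule_position - initial_granule_position - preskip
    return samples / OPUS_GRANULE_RATE_HZ

# Example with made-up values: 10 s of audio plus a 312-sample pre-skip.
print(opus_duration_seconds(480312, 0, 312))  # 10.0
```

For a chained file, the same computation would be repeated for each link and the results summed.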
Why don't you store the duration in the header? Isn't all of that slow and complicated?
Computing the duration directly from the file contents allows files to be written in a single pass, without any seeking, which is necessary for live streaming. Chaining also simplifies live streaming, as you can just pipe multiple files into the same network connection, with all associated metadata updates, etc., and the results are still valid .opus files (contrast with the hacks used to add metadata to MP3 streams).
Opening a typical .opus file, which is not multiplexed and not chained, and computing the duration over the network requires just one extra HTTP request, which can proceed in parallel with the buffering in the main request. This is the behavior you will get from libopusfile's HTTP backend by default.
Enumeration of chain boundaries can be expensive in files with many links, but in our testing libopusfile used nearly an order of magnitude fewer seeks to do this than some other media frameworks (at the time). Storing a duration in a header wouldn't solve this, since every link in a chain has its own, independent headers. If the cost of chain enumeration is a problem, the best way to avoid it is to store the links in separate files (i.e., don't use chaining).
How do I seek in a .opus file?
Use op_pcm_seek() or op_raw_seek() from libopusfile.
If you want to implement seeking yourself, you need to:
Identify the link that contains the target (if you have a chained file).
Adjust the target by 80 ms to get enough pre-roll data (to ensure the decoder will have converged by the time you reach the target), as recommended by RFC 7845.
Estimate the location of the last audio data page with a completed packet prior to the adjusted target, using the duration and size (in bytes) of the link.
Seek to that location and scan forward until you find an audio data page with a completed packet (that contains a valid granule position).
If you think you are sufficiently close to the adjusted target, scan forward until you find the next audio data page with a completed packet.
If the adjusted target lies between the first audio data page with a completed packet you found and the next one, stop. You can decode forward from here and start playing when you reach your (original, unadjusted) target.
Otherwise, go back and re-estimate the seek location using the granule positions and file offsets of the page(s) you just found.
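The pre-roll adjustment and the first position estimate from the steps above might look like the following. This is a simplified sketch using plain linear interpolation over one link; it is not libopusfile's actual algorithm, and the function and parameter names are assumptions made for illustration.

```python
PREROLL_SAMPLES = 3840  # 80 ms at 48 kHz

def estimate_seek_offset(target_gp, link_start_gp, link_end_gp,
                         link_start_byte, link_end_byte):
    """First guess at the byte offset to seek to within one link.

    Moves the target back by the pre-roll, then interpolates linearly
    between the link's granule positions and byte offsets.
    """
    adjusted = max(link_start_gp, target_gp - PREROLL_SAMPLES)
    span = link_end_gp - link_start_gp
    if span <= 0:
        return link_start_byte
    fraction = (adjusted - link_start_gp) / span
    return link_start_byte + int(fraction * (link_end_byte - link_start_byte))

# Made-up link: 20 s of audio stored in bytes 1000..101000.
print(estimate_seek_offset(483840, 0, 960000, 1000, 101000))  # 51000
```

After seeking to the estimate, the granule positions of the pages actually found replace the link bounds, and the estimate is repeated on the narrower interval until the target is bracketed.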
libopusfile includes fallbacks to prevent pathological worst-case behavior when its guesses are repeatedly wrong. Weighted bisection can degrade to a linear scan, but libopusfile's worst case is within a constant factor of naive bisection (i.e., logarithmic). We have only ever observed such pathological behavior in files we manually constructed to trigger it.
libopusfile also takes shortcuts when the target location is near the current position, to make small seeks cheaper. In the best case it can loop forever over very short files whose data is contained in a single page (e.g., less than 1 second long with default encoder settings) without any seeking at all.
You can find more information on seeking in files that contain Opus multiplexed with other streams (e.g., video) on this page.
Wouldn't it be better to build an index?
As with file durations, an index at the beginning of the file is incompatible with live streaming. It also means more data has to be fetched before a file can start playing over the network, because you must read past the index even when you don't intend to seek. The index could be stored at the end (which even still allows encoding the file in a single pass), but this requires one (or more) extra seeks to read the index (especially if its exact location at the end is not known), either on file open or on first seek. Unlike the final timestamp, which is small and fixed in size, an index grows with the file duration, and can have unbounded size. It is also easy for an index to become out of sync with a file that has been edited or damaged, in which case seeking will simply fail. By contrast, you can seek in a truncated .opus download without issues.
In practice, bisection seeking on VBR audio achieves performance that is very nearly as good as seeking with an index, without any of the drawbacks of an index. libopusfile provides a test program called seeking_example which can be used to benchmark the performance on your files.
On a 96 kbps VBR file nearly one hour long (the second movement of Mahler's Symphony No. 8 "Symphony of a Thousand"):
Testing exact PCM seeking to random places in 169680000 samples (58m55.000s)...
Total seek operations: 1020 (1.020 per exact seek, 2 maximum).
On a chained file formed by concatenating the eight test vectors for the currently supported channel layouts in mapping family 1:
Opened file containing 8 links with 18 seeks (2.250 per link).
Testing exact PCM seeking to random places in 2759064 samples (57.481s)...
Total seek operations: 946 (0.946 per exact seek, 2 maximum).
That is, the number of physical seeks required is almost always 1, every once in a while 2, and in short files, sometimes even 0.
References
Automated test results used to be available here (last saved copy). The new git repository doesn't have any links to test results, but you can always run the tests yourself!