Why there is no stb_opus

Sean Barrett, October 2013

As the author of stb_vorbis, a noteworthy^* software implementation of Ogg Vorbis which is available under the freest-possible license (public domain, where available) and used by some non-insignificant number of game developers and other software developers, I am occasionally asked when stb_opus will be released.

The answer is "never", as things currently stand.

Background

Ogg Vorbis is an audio codec created by Xiph.Org and released in 2000.

Opus is an audio codec specification released by the IETF in 2012, based on combining two independent codecs, CELT (invented by Xiph.Org) and SILK (invented by Skype Limited).

Opus appears to be superior to Ogg Vorbis in many important ways. One of significance to game developers is that Vorbis has a substantially-sized "header" which contains decode tables. This storage cost is insignificant on long files, but substantial on short files. This means that using Ogg Vorbis for sound effects, which are normally stored one sound effect per file, is significantly space-inefficient due to the many redundant tables.
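
To make the size argument concrete, here is a minimal sketch in Python. The header size and the audio payload sizes are hypothetical round numbers chosen purely for illustration, not measurements of real Vorbis files, and header_overhead is a made-up helper name.

```python
def header_overhead(header_bytes: int, audio_bytes: int) -> float:
    """Return the fraction of a file's total size taken up by its header."""
    total = header_bytes + audio_bytes
    if total <= 0:
        raise ValueError("file must contain at least one byte")
    return header_bytes / total


# Hypothetical sizes, for illustration only (not measured from real files).
ASSUMED_HEADER_BYTES = 4_000

examples = [
    ("short sound effect", 6_000),     # assumed audio payload in bytes
    ("long music track", 3_996_000),   # assumed audio payload in bytes
]

for label, audio_bytes in examples:
    share = header_overhead(ASSUMED_HEADER_BYTES, audio_bytes)
    print(f"{label}: {share:.1%} of the file is header")
```

Under these assumed numbers the same fixed header is a large share of the short file and a negligible share of the long one, which is the whole problem when a game ships hundreds of small sound effects, each carrying its own copy of the tables.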

For these reasons, creating an stb_opus library is a compelling proposition. I would like to create an stb_opus. I think it would be useful.

Why this is not going to happen (as things currently stand) owes to the nature of the Opus specification, and for this to make sense you'll need a little educational background about copyright.

Copyright, Derivative Works, Specifications

If I create a work (a program, a novel, a piece of music) which incorporates elements of some copyrighted work, the work I create is a "derivative work"—it derives from the original work. The legal status of a derivative work (in the USA) is that it is essentially also covered by the copyright of the original work. (That is, it's really covered by the copyright of the original and a second new copyright for the derivative work. Copies can only be made with permission of the owners of both copyrights; I can't sell my Harry Potter fanfiction without J.K. Rowling's permission, but she can't sell my Potter fanfiction without my permission either!)

The software libraries I've created, like stb_vorbis and stb_jpeg, are not covered by any other copyright than my own. This frees me to disclaim my copyright and to place the libraries in the public domain (where that dedication is legally allowed). However, I can't take somebody else's software and make some modifications to it and place the final software in the public domain! I can make my modifications public domain, but they're of course useless without the original software, which is still covered by the original copyright.

Now, I did implement stb_vorbis and stb_jpeg from specifications, documents that describe in plain English what the behavior of the software is supposed to be. Even though that plain English description may explain very mechanically what the software is supposed to do, legal doctrine (at least as far as I understand the legal doctrine of the USA) does not recognize an implementation-from-specification as copying from that specification. The implementation of software from a plain-English specification is not a derivative work of the specification; the software is not "contaminated" by the copyright on the specification. As a result, when I disclaim copyright on that software, I know the only potential copyright that would have applied is my own, and thus my renunciation of the copyright is meaningful (i.e. valid where such renunciations are valid).

Can I always create software that's uncontaminated by any other copyright, so I can make it public domain? Yes.

Reverse Engineering

The United States has a legally recognized method for reproducing the function of another system without violating the copyrights (or trade secrets) of that system. This method is known as Clean Room Reverse-Engineering.

In this process, one team of experts closely analyzes the copyrighted system to be re-implemented, and constructs a plain-English description of the behavior of the system, i.e. an independently-constructed specification. Then a second team of experts, with no other communication with the first team than reading the specification, implements the specification. Hopefully, at the end, the second team will have constructed a functioning "clone" of the original system, and this process guarantees that the resulting clone is not a derivative work in the copyright sense.

Thus, if I wanted to create a public domain clone of a system for which I did not have a plain-English specification, I could use the above process (hiring a third party to write a plain-English specification) and then implement that.

This process, however, is expensive, and is not something I am actually likely to do with my own time and money!

Opus Specification and Copyright

Unfortunately, the Opus specification is not plain English! Whereas many other specifications have language like "if the reference code and the specification deviate, the specification is correct", Opus has chosen the opposite. Only the reference implementation is normative:

RFC 6716, Section 1:

   The primary normative part of this specification is provided by the
   source code in Appendix A.
[...]
   Additionally, any conflict between the symbolic representation and
   the included reference implementation must be resolved.  For the
   practical reasons of compatibility and testability, it would be
   advantageous to give the reference implementation priority in any
   disagreement.  The C language is also one of the most widely
   understood, human-readable symbolic representations for machine
   behavior.  For these reasons, this RFC uses the reference
   implementation as the sole symbolic representation of the codec.

Moreover, the plain English explanation for it is (apparently) incomplete!

   While the symbolic representation is unambiguous and complete, it is
   not always the easiest way to understand the codec's operation.  For
   this reason, this document also describes significant parts of the
   codec in prose

No implementation of Opus from the specification can proceed from a plain-English description of the algorithm (there is no complete, correct one), and so any such implementation must derive from code elements in the RFC.

What is the copyright status of a new implementation derived from those code elements? Well, those code elements are themselves copyrighted. Here is what the Copyright Notice of RFC 6716 says:

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   [...]
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

The IETF Trust's Legal Provisions document has little new to say, also establishing the applicability of the Simplified BSD License to the code in the specification.

The Simplified BSD License is a kind of open source license which allows derivative works to be copied without explicit permission from the original copyright holder; the license grants the creator of the derivative work the freedom to copy given certain limitations. (For example, it requires acknowledgement of the original authorship of the code elements, both in the source code and in materials accompanying a binary.)

Making a Non-Derivative Work

This license is clearly what applies if the code elements from the specification are copied directly, since they are clearly covered by the copyright on that code. If the new code is a paraphrase of or "inspired by" the other code, the copyright status is unclear to me. (I do not claim to be an expert in this subject area. I just need enough knowledge to know whether I can legitimately place something in the public domain, and the existence of the clean-room reverse-engineering method certainly implies that if you don't follow the method, if you work directly from the copyrighted code, the resulting work will be contaminated by copyright—otherwise you'd never need clean-room practices.)

If we need the copyright status to be clearly "not affected by the copyright in the original code" (which I need in order to place my implementation in the public domain, and which anyone would need in order to produce an independent implementation not covered by the Simplified BSD License), then there is one mechanism that I am aware of to create such code from this specification: clean room reverse-engineering.

Of course, the process of clean room reverse-engineering is basically just the process of creating a plain-English specification which can be implemented without reference to any source code. Normally we would just call this a specification, but in this particular case the IETF chose to accept an argument in favor of "compatibility and testability" to both make the code primary and to not even create a complete plain-English specification.

As a result, I cannot create a public domain stb_opus unless one of the following occurs:

IETF revises RFC 6716 to renounce copyright on the Code Components
IETF revises RFC 6716 to grant a no-restrictions copyright license on the Code Components (this won't allow me to make a public domain stb_opus, but it allows me to get close enough to make no difference)
IETF revises RFC 6716 to renounce copyright claims on code derived from (but not copied from) the Code Components
IETF revises RFC 6716 to include a complete plain-English specification that is primary, i.e. normative
Some third party creates a plain-English specification and there is good reason to believe it is accurate

If you want to take steps to see an stb_opus, take steps to make one of the above things happen.

And if you want to see future stb_ libraries for future IETF specifications, it would probably be wise to convince the IETF that it's pernicious to create specifications in which copyrighted code is normative.

^* In terms of deployed software, it may not be that large a player, but it is part of a suite of easy-to-use, public domain libraries by the author that sees growing use within the game development community.