Vorbis Input Streaming
======================

Ogg Parsing

The most important design decision for how stb_vorbis loads
data is that it does not have a separate Ogg parser. This was
a snap design decision I made early in development, primarily
for efficiency and clarity, based on my experience with jpeg
and png: the clearest, best code just decodes everything as it
arrives in a linear stream. In practice I doubt that my choice
leads to an efficiency win _or_ a clarity win; it does make the
decoder less effective at recovering from certain unlikely kinds
of corruption; but it does absolutely minimize the amount of
read-ahead necessary to decode a frame. The last reason probably
makes it the right choice, but it's debatable.

If we parsed Ogg separately, we would have the ability to go
ahead and read a whole Ogg page, and to pass the individual
packets to the Vorbis code. 99% of the time, we would not
look at the Vorbis bytes, and would not need to reassemble
them (they would be intact in the Ogg page), and the code
would run at the same speed. In the infrequent case that a
packet crossed an Ogg page boundary, we could copy and reassemble
the packet and keep that complexity out of the vorbis decoder.

Of course, in the current implementation, that complexity is
buried even _further_ down, in get_bits and get8_packet, which
track segments and process pages behind the decoder's back.
The question is where that complexity better belongs. A
plausible Ogg-parsing implementation would read _all_ of each
page a packet spans, whereas the current implementation reads
no further than the end of each packet. Because in practice
pages and packets are both small, this makes little difference,
but in terms of what the _spec_ allows this is fairly huge,
since two pages combined could be as big as 128KB. The current
implementation using pushdata might require that much memory
(or more should a single packet cross even more pages--allowed
by the spec as implausible as it might be), but the streaming
readers can use an arbitrarily small buffer. Overall this feels
like the more flexible design (more likely to be useful in a
variety of platforms), at some sacrifice of clarity and
corruption-detection. (See bullet-point list below for the
case of bad corruption detection.)
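
The 128KB figure follows from the Ogg framing format: a page has
a 27-byte fixed header, a segment table of up to 255 one-byte
lacing values, and up to 255 bytes of data per segment. A minimal
sketch of that arithmetic (the helper names are hypothetical and
are not part of stb_vorbis):

```c
#include <assert.h>
#include <stddef.h>

/* Sizes fixed by the Ogg framing format. */
enum {
    OGG_FIXED_HEADER_BYTES = 27,   /* capture pattern through segment count */
    OGG_MAX_SEGMENTS       = 255,  /* the segment count is a single byte */
    OGG_MAX_SEGMENT_BYTES  = 255   /* each lacing value is a single byte */
};

/* Total size in bytes of one page, given its segment table.
   Hypothetical helper, for illustration only. */
size_t ogg_page_bytes(const unsigned char *lacing, int segment_count)
{
    assert(lacing != NULL && segment_count >= 0 && segment_count <= OGG_MAX_SEGMENTS);
    size_t total = OGG_FIXED_HEADER_BYTES + (size_t)segment_count;
    for (int i = 0; i < segment_count; ++i)
        total += lacing[i];
    return total;
}

/* Largest page the format allows: 27 + 255 + 255*255 bytes. */
size_t ogg_max_page_bytes(void)
{
    return OGG_FIXED_HEADER_BYTES + OGG_MAX_SEGMENTS
         + (size_t)OGG_MAX_SEGMENTS * OGG_MAX_SEGMENT_BYTES;
}
```

The maximum comes to 65307 bytes, just under 64KB, so a packet
that spans two full pages touches roughly 128KB of Ogg data.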


Streaming Input

Most of the stb_vorbis codebase is written as if we are
_pulling_ data in from some source, not having the data pushed
at it. Even so, unless you seek, stb_vorbis does not do any
read-ahead or rewinding. As noted, parsing of Ogg data is
interleaved with parsing of Vorbis data automatically.
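
To make the interleaving concrete, here is a minimal sketch of a
pull-style byte reader that follows Ogg segments and page headers
behind the caller's back. It is not the stb_vorbis code: the
struct, the function names, and the return codes are all
hypothetical, an in-memory array stands in for the real data
source, and it assumes each packet is read to its end before the
next one is started.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define READ_EOP (-1)  /* end of the current packet */
#define READ_ERR (-2)  /* input ran out, or no "OggS" where a page should start */

typedef struct {
    const unsigned char *data;  /* stand-in for any pull source, e.g. a FILE* */
    size_t len, pos;
    unsigned char lacing[255];  /* segment table of the current page */
    int seg_count;              /* valid entries in lacing[] */
    int next_seg;               /* index of the next segment to start */
    int bytes_left;             /* unread bytes in the current segment */
    int in_last_seg;            /* current segment is the packet's final one */
} PacketReader;

void reader_init(PacketReader *r, const unsigned char *data, size_t len)
{
    assert(r != NULL && data != NULL);
    memset(r, 0, sizeof *r);
    r->data = data;
    r->len = len;
}

/* Call once before reading each packet. */
void reader_start_packet(PacketReader *r) { r->in_last_seg = 0; }

/* Consume a 27-byte page header plus its segment table. No CRC check. */
static int load_page_header(PacketReader *r)
{
    if (r->len - r->pos < 27) return 0;
    if (memcmp(r->data + r->pos, "OggS", 4) != 0) return 0;
    r->seg_count = r->data[r->pos + 26];
    r->pos += 27;
    if (r->len - r->pos < (size_t)r->seg_count) return 0;
    memcpy(r->lacing, r->data + r->pos, (size_t)r->seg_count);
    r->pos += (size_t)r->seg_count;
    r->next_seg = 0;
    return 1;
}

/* Return the next byte of the current packet, READ_EOP, or READ_ERR. */
int reader_get8(PacketReader *r)
{
    while (r->bytes_left == 0) {
        if (r->in_last_seg) return READ_EOP;
        if (r->next_seg == r->seg_count && !load_page_header(r)) return READ_ERR;
        if (r->next_seg == r->seg_count) continue;  /* page with zero segments */
        r->bytes_left = r->lacing[r->next_seg++];
        r->in_last_seg = (r->bytes_left < 255);     /* lacing < 255 ends a packet */
    }
    if (r->pos >= r->len) return READ_ERR;
    --r->bytes_left;
    return r->data[r->pos++];
}
```

The reader never looks past the byte it is about to return, apart
from the page header and segment table it needs in order to know
where the packet continues, so the only fixed storage is the
255-byte segment table.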

For this reason the 'delete samples at the beginning' feature is not
supported, since it's implicitly encoded in the ogg granule
position describing the _last_ frame of a page, and we'd have
to scan backwards and work out the size of all the frames in
the page. IMO, the whole thing with Ogg having all the timing
data is an unfortunate design that makes decoding way more painful,
compared to, say, 2 bytes in the header. (It's not like you can
losslessly edit an ogg vorbis file by ONLY parsing Ogg, since
you still need to know how long each packet is in samples,
which you can't tell without decoding the Vorbis header and a
little of each packet.) The specification makes a note that
encoders should only output two packets in the first frame if
using a non-0 offset, but this is not guaranteed, and the
consequences for this decoder would be far greater than the spec
imagines (I imagine the rationale is that they expected you'd
load a whole page at a time, but since a page can be 64KB and
that's 1/4 of the memory on some otherwise plausible platforms,
that seems unwise).


General Overview

As a result, you can see a few things going on:
   -- pages and packets are decoded in a linear stream
      without prescanning / rewinding
   -- CRCs are not checked except on seek/stream recovery,
      because by the time we'd notice they were wrong we'd
      have already output all the bad data anyway (and we're
      about to miss an ogg capture pattern and catch the
      problem anyway)
   -- page numbers are not checked because if they're wrong,
      we'd like to check the CRC and keep going if the CRC
      is ok, but we have no way to do that. better to hope
      it's a legit page and a wrong page number and keep
      going (if it's a bad page, we'll recover soon enough;
      50% chance of a framing error on each packet) than it
      is to throw the page away!
   -- the 'pushdata' interface creates a buffer and streams
      from it like normal. some pre-scanning code checks to
      make sure we have enough data (assuming no corruption).
      A corrupted input stream might not have enough data, in
      which case the end of the input buffer will be treated
      as eof and cause an error. (in fact, I use this to simplify
      determining that we have enough pushdata for parsing the
      header).
      -- for this reason we _have_ to check that we don't go off
         the end of the pushdata buffer, even if it passes the
         ogg 'enough data for this frame' rule, so there's no
         real savings to be had by putting the pushdata through
         a different interface. Therefore, if you're using the
         pushdata path, its use of the get8_packet() interface
         might seem inefficient but it's really not so bad.
   -- the pushdata buffer must have the ENTIRE header in it for
      us to open the file; and it must have an entire packet
      (and any immediately preceding Ogg page header) to decode
      a frame
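
The pushdata pre-scan described above can be sketched as a walk
over the segment table that adds up lacing values until the packet
ends. This is a simplified illustration, not the stb_vorbis code:
the function name and return codes are hypothetical, and it
assumes the buffer begins exactly at an Ogg page header, whereas a
real pushdata buffer may begin in the middle of a page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns  1 if the first packet in buf is entirely present,
            0 if more data is needed,
           -1 if buf does not start with the "OggS" capture pattern.
   Hypothetical helper; assumes buf starts at a page header. */
int first_packet_present(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    size_t pos = 0;
    for (;;) {
        if (len - pos < 27) return 0;                  /* fixed header incomplete */
        if (memcmp(buf + pos, "OggS", 4) != 0) return -1;
        size_t nseg = buf[pos + 26];
        if (len - pos < 27 + nseg) return 0;           /* segment table incomplete */
        const unsigned char *lacing = buf + pos + 27;
        pos += 27 + nseg;
        for (size_t i = 0; i < nseg; ++i) {
            size_t seg = lacing[i];
            if (len - pos < seg) return 0;             /* segment data incomplete */
            pos += seg;
            if (seg < 255) return 1;                   /* lacing < 255 ends the packet */
        }
        /* every segment was 255 bytes: the packet continues on the next page */
    }
}
```

A result of 1 only says the bytes are present if the lacing
values are to be believed; with corrupted input the actual reads
still have to be bounds-checked against the end of the buffer, as
noted in the list above.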