Sony Japan's 2003 A-V Catalog (p. 15) included a brochure-level (i.e. non-technical) description of general ATRAC compression along with this illustration of the Bit Reallocation technique employed in ATRAC Type-R:
In a re-analysis of the audio data, bits are re-allocated to frequency bands critical to audibility.
The illustration clarifies the two-dimensional nature of the bit allocation problem: bits must be matched both to what the ear can hear and to the signal that is actually present. ATRAC and other perceptual coders preferentially allocate bits to those frequency components in each analysis frame* deemed most audible to humans. The idea is to encode each segment of the spectrum with an accuracy proportional to its audibility. Once the available bits have been apportioned to each of ATRAC's 52 spectral bands, ATRAC Type-R apparently takes a second look at the actual quantization error that would occur if each component were encoded with its apportioned bits. At this point a decision can reasonably be made to reallocate bits from one band to another if doing so would decrease the perceptually weighted overall quantization error of the frame.
The point is that a closed-form solution that considers both optimization criteria (audibility and actual quantization error) simultaneously may be difficult or impossible to implement. ATRAC Type-R instead employs a two-pass approach: the first pass allocates bits according to the human auditory system's response to a frame, and the second pass adjusts that allocation to minimize the quantization error of the actual spectral components the frame must encode.
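The two-pass idea can be illustrated with a toy sketch in Python. This is not Sony's algorithm, whose details are not public in the brochure; every name here (`quantize`, `first_pass`, `second_pass`, the weights, the four equal-width bands, the 16-bit cap) is an assumption made for illustration. The first pass splits a bit budget in proportion to hypothetical audibility weights; the second pass greedily moves one bit at a time between bands whenever that lowers the perceptually weighted squared error of the actual coefficients.

```python
def quantize(coeffs, bits):
    """Idealized uniform quantizer: the step size halves with each extra bit."""
    scale = max(abs(c) for c in coeffs)
    if scale == 0.0:
        return [0.0] * len(coeffs)
    step = 2.0 * scale / (2 ** bits)
    return [round(c / step) * step for c in coeffs]


def weighted_error(bands, weights, alloc):
    """Perceptually weighted sum of squared quantization errors over all bands."""
    total = 0.0
    for coeffs, w, bits in zip(bands, weights, alloc):
        total += w * sum((c - q) ** 2 for c, q in zip(coeffs, quantize(coeffs, bits)))
    return total


def first_pass(weights, total_bits, max_bits=16):
    """Pass 1: split the budget in proportion to audibility weights only."""
    total_w = sum(weights)
    alloc = [min(max_bits, int(total_bits * w / total_w)) for w in weights]
    # Hand any leftover bits to the most audible bands first.
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    leftover = total_bits - sum(alloc)
    for i in order:
        if leftover > 0 and alloc[i] < max_bits:
            alloc[i] += 1
            leftover -= 1
    return alloc


def second_pass(bands, weights, alloc, max_bits=16):
    """Pass 2: move single bits between bands while the weighted error drops."""
    alloc = list(alloc)
    best_err = weighted_error(bands, weights, alloc)
    while True:
        best_move = None
        for src in range(len(alloc)):
            if alloc[src] == 0:
                continue
            for dst in range(len(alloc)):
                if dst == src or alloc[dst] >= max_bits:
                    continue
                trial = list(alloc)
                trial[src] -= 1
                trial[dst] += 1
                err = weighted_error(bands, weights, trial)
                if err < best_err:
                    best_err, best_move = err, trial
        if best_move is None:
            return alloc  # no single-bit move improves the frame any further
        alloc = best_move


# Made-up frame: band 0 is highly audible but nearly silent, band 3 is loud.
bands = [
    [0.01, -0.02, 0.015, 0.005],
    [0.2, -0.1, 0.15, 0.05],
    [0.4, -0.3, 0.2, 0.1],
    [0.9, -0.7, 0.5, -0.3],
]
weights = [4, 3, 2, 1]  # hypothetical audibility weights
first = first_pass(weights, total_bits=20)
second = second_pass(bands, weights, first)
print("pass 1:", first, weighted_error(bands, weights, first))
print("pass 2:", second, weighted_error(bands, weights, second))
```

The total number of bits is unchanged by the second pass; only their distribution shifts. The greedy single-bit search is chosen for readability and finds a local improvement, not necessarily the best possible allocation.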
*Analysis frame (a.k.a. "window") size is 512 samples (11.6 ms) for SP mode audio and 1024 samples (23 ms) for the LP modes.
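The frame durations follow directly from the frame sizes once a sampling rate is fixed. The short Python check below assumes MiniDisc's 44.1 kHz sampling rate, which the footnote does not state; `frame_ms` is a helper name introduced here for illustration.

```python
SAMPLE_RATE_HZ = 44_100  # assumption: 44.1 kHz MiniDisc sampling rate


def frame_ms(samples, rate_hz=SAMPLE_RATE_HZ):
    """Duration in milliseconds of a frame holding the given number of samples."""
    return 1000.0 * samples / rate_hz


sp_ms = frame_ms(512)   # SP mode frame, roughly 11.6 ms
lp_ms = frame_ms(1024)  # LP mode frame, roughly 23.2 ms
print(f"SP: {sp_ms:.1f} ms, LP: {lp_ms:.1f} ms")
```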