Entropy Coding
In these lab notes I derive an arithmetic coder that is optimally efficient using 64 bit registers.
Range multiplication
Given two sub-intervals of :
and we want to compute the product of into :
64 bit intervals
We want to represent non-empty subintervals of with 64 bit integers, .
Take with the following map:
Some examples of this map:
\begin{aligned} \p{0,0} &↦ \delim[{0, 2⁻^{64}}) \\ \p{0,2^{64}-1} &↦ \delim[{0, 1}) \\ \p{2^{64}-1, 0} &↦ \delim[{1 - 2⁻^{64}, 1}) \\ \p{2^{64}-1, 2^{64}-1} &↦ \delim[{1 - 2⁻^{64}, 2 - 2⁻^{64}}) \\ \end{aligned}
Where the last example is invalid. For validity we require
A reverse map can be created as:
In order for to be in we require . We already have . To satisfy the upper bound we additionally require:
l ≤ 1 - 2⁻^{64}
In order for to be in we require . The upper bound is already satisfied. To satisfy the lower bound we require:
Since this also implies the stronger bound on . For the reverse map to work we need .
Going full circle results in:
It can be easily shown that the later interval is always a subinterval of the former. It is also easily seen that this operation is idempotent.
64 bit interval multiply
The first bound in gives:
The second bound in gives:
Mapping this back to using the reverse map:
Since we have and we can implement this using a 64 bit multiply with 128 bit result, and an 128 bit and 64 bit add as , or in code:
uint64 h, l;
std::tie(h, l) = mul128(b_S, r_B);
std::tie(h, l) = add128(h, l, b_S);
const uint64 t = h + (l > 0 ? 1 : 0);
b_B += t;
Where the final term implements the ceiling operator.
To obtain a valid result we need . The lower bound goes like:
Where in the last step I have used , which will be enforced later, in normalization. The upper bound goes like:
The second term we already have as t
. For the first term we need another tricky multiply. In this case we have and , so the intermediate result can actually be . The will make sure the final result will be in . We must be careful, but modular arithmetic will handle the overflow fine. Let's rewrite the multiplication in values :
Now we can calculate the final interval:
std::tie(h, l) = mul128(b_S + r_S, r_B);
std::tie(h, l) = add128(h, l, b_S + r_S);
std::tie(h, l) = add128(h, l, r_B);
std::tie(h, l) = add128(h, l, 1);
r_B = h - t - 1;
Normalization
Given an interval we want to extract the prefix bits that won't change anymore. To determine which bits won't change it is useful to look at :
The settled bits can be written directly to the output. The outstanding bits can still change because of a carry, but are otherwise settled. To normalize this interval we output the first 12 bits, note that there are 7 bits outstanding, and rescale the interval by . There is one edge case we can consider:
Writing the number in two different but equal ways can result in a different number of leading zeros. We take the one that results in the largest number of leading zeros. So in general we want to rescale by with given by:
The interval is normalized if and thus iff .
Here we can be in one of two situations, either the integral parts are the same, or 's is one larger that 's. In this case 's is in fact one larger. We can subtract the integral part of :
We have now scaled our interval to a subinterval of , but we also have
In summary, an interval is normalized when it is a subinterval of and the above inequalities hold.
After normalization we end up in one of two cases:
Our situation is as above, there are no bits outstanding. This is when we have a carry outstanding and . After further narrowing of the interval we will eventually end up in the first case and flush the carry buffer, or we will end up in a third case:
In this third case we should add the carry and subtract from both and . We are then effectively back in the first case.
64 bit normalization
For this to be normalized in we must have
And the normalization condition reduces to . This essentially means that and must have the most significant bit set.
The ceiling and the cancel, except when . What remains is essentially a count leading zeroes operation:
const uint n = r == 0 ? 63 : count_leading_zeros(r);
\begin{aligned} l'' &= l·2^n - \floor{l·2^n} = \mod{\frac{\b}{2^{64}⁻^n}}_1 \\ h'' &= h·2^n - \floor{l·2^n} = \frac{\b + \r + 1}{2^{64}⁻^n} - \floor{\frac{\b}{2^{64}⁻^n}}\\ \end{aligned}
Where is the fractional part operator.
\begin{aligned} \b'' &= \ceil{l'' · 2^{64}} \\ &= \ceil{\mod{\frac{\b}{2^{64}⁻^n}}_1 · 2^{64}} \\ &= \ceil{\mod{\frac{\b}{2^{64}⁻^n}_1 · 2^{64}}_{2^{64}}} \\ &= \mod{\b · 2^n}_{2^{64}} \end{aligned}
Where is the modular operator. Since is strictly integer, the ceiling has no effect and the operation reduces to a simple right shift:
b <<= n;
\begin{aligned} \r'' &= \floor{h'' · 2^{64}} - \ceil{l'' · 2^{64}} - 1 \\ &= \floor{h'' · 2^{64}} - \b'' - 1 \\ &= \floor{\p{\frac{\b + \r + 1}{2^{64}⁻^n} - \floor{\frac{\b}{2^{64}⁻^n}}} · 2^{64}} - \b'' - 1 \\ &= \floor{\p{\b + \r + 1}·2^n - \floor{\frac{\b}{2^{64}⁻^n}}· 2^{64} } - \b'' - 1 \\ \end{aligned}
This is essentially shifting to the right places while shifting in ones. We don't need to worry about overflow here because is strictly less than .
r <<= n;
r += (1UL << n) - 1;
Appendix: 128 Bit arithmetic
pair<uint64_t, uint64_t> mul128_emu(uint64_t a, uint64_t b)
{
using uint64_t;
const uint64_t u1 = (a & 0xffffffff);
const uint64_t v1 = (b & 0xffffffff);
uint64_t t = (u1 * v1);
const uint64_t w3 = (t & 0xffffffff);
uint64_t k = (t >> 32);
a >>= 32;
t = (a * v1) + k;
k = (t & 0xffffffff);
uint64_t w1 = (t >> 32);
b >>= 32;
t = (u1 * b) + k;
k = (t >> 32);
const uint64_t h = (a * b) + w1 + k;
const uint64_t l = (t << 32) + w3;
return make_pair(h, l);
}
pair<uint64_t, uint64_t> add128(uint64_t h, uint64_t l, uint64_t n)
{
l += n;
h += (l < n) ? 1 : 0;
return make_pair(h, l);
}