Entropy Coding

Remco Bloemen

In these lab notes I derive an arithmetic coder that is optimally efficient using 64 bit registers.

Range multiplication

Given two sub-intervals of [0,1)\left[{0, 1}\right):

B=[b0,b1)S=[s0,s1) \begin{aligned} B &= \left[{b_0 , b_1 }\right) \\ S &= \left[{s_0 , s_1 }\right) \end{aligned}

and we want to compute the product of SS into BB:

SB=[b+s0(b1b0),b+s1(b1b0)) S \cdot B = \left[{bₒ + s_0 (b_1 - b_0 ), bₒ + s_1 (b_1 - b_0 )}\right)

64 bit intervals

We want to represent non-empty subintervals of [0,1)\left[{0,1}\right) with 64 bit integers, 𝔹64\mathbb{B} ^{64}.

Take (𝚋,𝚛)(𝔹64)2\left({\mathtt{b},\mathtt{r}}\right) \in \left({\mathbb{B} ^{64}}\right)^2 with the following map:

(𝚋,𝚛)[𝚋264,𝚋+𝚛+1264) \left({\mathtt{b},\mathtt{r}}\right) \mapsto \left[{\frac{\mathtt{b}}{2^{64}}, \frac{\mathtt{b}+ \mathtt{r}+ 1}{2^{64}}}\right)

Some examples of this map:

(0,0)[0,264)(0,2641)[0,1)(2641,0)[1264,1)(2641,2641)[1264,2264) \begin{aligned} \left({0,0}\right) &\mapsto \left[{0, 2^- ^{64}}\right) \\ \left({0,2^{64}-1}\right) &\mapsto \left[{0, 1}\right) \\ \left({2^{64}-1, 0}\right) &\mapsto \left[{1 - 2^- ^{64}, 1}\right) \\ \left({2^{64}-1, 2^{64}-1}\right) &\mapsto \left[{1 - 2^- ^{64}, 2 - 2^- ^{64}}\right) \\ \end{aligned}

Where the last example is invalid. For validity we require

𝚋+𝚛+1264 \mathtt{b}+ \mathtt{r}+ 1 \leq 2^{64}

A reverse map can be created as:

[l,h)(l264,h264l2641) \left[{l, h}\right) \mapsto \left({\left\lceil {l \cdot 2^{64}}\right\rceil , \left\lfloor {h \cdot 2^{64}}\right\rfloor - \left\lceil {l \cdot 2^{64}}\right\rceil - 1}\right)

In order for 𝚋\mathtt{b} to be in 𝔹64\mathbb{B} ^{64} we require 0l264<2640 \leq \left\lceil {l \cdot 2^{64}}\right\rceil < 2^{64}. We already have 0l<10 \leq l < 1. To satisfy the upper bound we additionally require:

l1264 l \leq 1 - 2^- ^{64}

In order for 𝚛\mathtt{r} to be in 𝔹64\mathbb{B} ^{64} we require 0h264l2641<2640 \leq \left\lfloor {h \cdot 2^{64}}\right\rfloor - \left\lceil {l \cdot 2^{64}}\right\rceil - 1 < 2^{64}. The upper bound is already satisfied. To satisfy the lower bound we require:

0h264l2641<(hl)26411l+263<h \begin{aligned} 0 &\leq \left\lfloor {h \cdot 2^{64}}\right\rfloor - \left\lceil {l \cdot 2^{64}}\right\rceil - 1 \\ &< \left({h - l}\right) \cdot 2^{64} - 1 - 1 \\ l + 2^- ^{63} &< h \end{aligned}

Since h1h \leq 1 this also implies the stronger bound l<1263l < 1 - 2^- ^{63} on ll. For the reverse map to work we need l+263<h1l + 2^- ^{63} < h \leq 1.

Going full circle results in:

[l,h)(𝚋,𝚛)[l264264,h264264) \left[{l, h}\right) \mapsto \left({\mathtt{b},\mathtt{r}}\right) \mapsto \left[{\frac{\left\lceil {l \cdot 2^{64}}\right\rceil }{2^{64}}, \frac{\left\lfloor {h \cdot 2^{64}}\right\rfloor }{2^{64}}}\right)

It can be easily shown that the later interval is always a subinterval of the former. It is also easily seen that this operation is idempotent.

64 bit interval multiply

(𝚋B,𝚛B)[𝚋B264,𝚋B+𝚛B+1264)(𝚋S,𝚛S)[𝚋S264,𝚋S+𝚛S+1264) \begin{aligned} \left({\mathtt{b}_B,\mathtt{r}_B}\right) &\mapsto \left[{\frac{\mathtt{b}_B}{2^{64}}, \frac{\mathtt{b}_B + \mathtt{r}_B + 1}{2^{64}}}\right) \\ \left({\mathtt{b}_S,\mathtt{r}_S}\right) &\mapsto \left[{\frac{\mathtt{b}_S}{2^{64}}, \frac{\mathtt{b}_S + \mathtt{r}_S + 1}{2^{64}}}\right) \end{aligned}

[𝚋B264+𝚋S264(𝚋B+𝚛B+1264𝚋B264),𝚋B264+𝚋S+𝚛S+1264(𝚋B+𝚛B+1264𝚋B264)) \left[{ \frac{\mathtt{b}_B}{2^{64}} + \frac{\mathtt{b}_S}{2^{64}} \left({\frac{\mathtt{b}_B + \mathtt{r}_B + 1}{2^{64}} - \frac{\mathtt{b}_B}{2^{64}}}\right), \frac{\mathtt{b}_B}{2^{64}} + \frac{\mathtt{b}_S + \mathtt{r}_S + 1}{2^{64}} \left({\frac{\mathtt{b}_B + \mathtt{r}_B + 1}{2^{64}} - \frac{\mathtt{b}_B}{2^{64}}}\right) }\right)

[𝚋B264+𝚋S(𝚛B+1)2128,𝚋B264+(𝚋S+𝚛S+1)(𝚛B+1)2128) \left[{ \frac{\mathtt{b}_B}{2^{64}} + \frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{128}}, \frac{\mathtt{b}_B}{2^{64}} + \frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{128}} }\right)

The first bound in l+263<h1l + 2^- ^{63} < h \leq 1 gives:

𝚋B264+𝚋S(𝚛B+1)2128+263<𝚋B264+(𝚋S+𝚛S+1)(𝚛B+1)2128𝚋S(𝚛B+1)+264<(𝚋S+𝚛S+1)(𝚛B+1) \begin{aligned} \frac{\mathtt{b}_B}{2^{64}} + \frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{128}} + 2^- ^{63} &< \frac{\mathtt{b}_B}{2^{64}} + \frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{128}} \\ \mathtt{b}_S\left({\mathtt{r}_B + 1}\right) + 2^{64} &< \left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right) \end{aligned}

The second bound in l+263<h1l + 2^- ^{63} < h \leq 1 gives:

𝚋B264+(𝚋S+𝚛S+1)(𝚛B+1)21281𝚋B264+(𝚋S+𝚛S+1)(𝚛B+1)2128 \begin{aligned} \frac{\mathtt{b}_B}{2^{64}} + \frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{128}} & \leq 1 \\ \mathtt{b}_B \cdot 2^{64} + \left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right) & \leq 2^{128} \end{aligned}

Mapping this back to (𝔹64)2\left({\mathbb{B} ^{64}}\right)^2 using the reverse map:

𝚋B=l264=(𝚋B264+𝚋S(𝚛B+1)2128)264=𝚋B+𝚋S(𝚛B+1)264 \begin{aligned} \mathtt{b}_B' &= \left\lceil {l \cdot 2^{64}}\right\rceil \\ &= \left\lceil {\left({ \frac{\mathtt{b}_B}{2^{64}} + \frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{128}} }\right)\cdot 2^{64}}\right\rceil \\ &= \mathtt{b}_B + \left\lceil {\frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{64}} }\right\rceil \\ \end{aligned}

Since we have 𝚋S<264\mathtt{b}_S < 2^{64} and 𝚛B+1264\mathtt{r}_B + 1 \leq 2^{64} we can implement this using a 64 bit multiply with 128 bit result, and an 128 bit and 64 bit add as 𝚋S𝚛B+𝚋S\mathtt{b}_S \cdot \mathtt{r}_B + \mathtt{b}_S, or in code:

Where the final term implements the ceiling operator.

𝚛B=h264l2641=(𝚋B264+(𝚋S+𝚛S+1)(𝚛B+1)2128)264𝚋B+𝚋S(𝚛B+1)2641=(𝚋S+𝚛S+1)(𝚛B+1)264𝚋S(𝚛B+1)2641 \begin{aligned} \mathtt{r}_B' &= \left\lfloor {h \cdot 2^{64}}\right\rfloor - \left\lceil {l \cdot 2^{64}}\right\rceil - 1 \\ &= \left\lfloor {\left({\frac{\mathtt{b}_B}{2^{64}} + \frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{128}}}\right)\cdot 2^{64}}\right\rfloor - \mathtt{b}_B + \left\lceil {\frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rceil - 1 \\ &= \left\lfloor {\frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rfloor - \left\lceil {\frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rceil - 1 \\ \end{aligned}

To obtain a valid result we need 0𝚛B<2640 \leq \mathtt{r}_B' < 2^{64}. The lower bound goes like:

0(𝚋S+𝚛S+1)(𝚛B+1)264𝚋S(𝚛B+1)26410(𝚋S+𝚛S+1)(𝚛B+1)264𝚋S(𝚛B+1)26411<(𝚛S+1)(𝚛B+1)264264(𝚛S+1)(𝚛B+1)𝚛S264𝚛B+11𝚛S1 \begin{aligned} 0 &\leq \left\lfloor {\frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rfloor - \left\lceil {\frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rceil - 1 \\ 0 &\leq \frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{64}} - \frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{64}} - 1 \\ 1 &< \frac{\left({\mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{64}} \\ 2^{64} &\leq \left({\mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right) \\ \mathtt{r}_S &\geq \frac{2^{64}}{\mathtt{r}_B + 1} - 1 \\ \mathtt{r}_S &\geq 1 \end{aligned}

Where in the last step I have used 𝚛B263\mathtt{r}_B \geq 2^{63}, which will be enforced later, in normalization. The upper bound 𝚛B<264\mathtt{r}_B' < 2^{64} goes like:

264>(𝚋S+𝚛S+1)(𝚛B+1)264𝚋S(𝚛B+1)2641264>(𝚋S+𝚛S+1)(𝚛B+1)264𝚋S(𝚛B+1)2641 \begin{aligned} 2^{64} &> \left\lfloor {\frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rfloor - \left\lceil {\frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rceil - 1 \\ 2^{64} &> \frac{\left({\mathtt{b}_S + \mathtt{r}_S + 1}\right)\left({\mathtt{r}_B + 1}\right)}{2^{64}} - \left\lceil {\frac{\mathtt{b}_S\left({\mathtt{r}_B + 1}\right)}{2^{64}}}\right\rceil - 1 \\ \end{aligned}

The second term we already have as t. For the first term we need another tricky multiply. In this case we have 𝚋S+𝚛S+1264\mathtt{b}_S + \mathtt{r}_S + 1 \leq 2^{64} and 𝚛B+1264\mathtt{r}_B + 1 \leq 2^{64}, so the intermediate result can actually be 2642^{64}. The 1-1 will make sure the final result will be in 𝔹64\mathbb{B} ^{64}. We must be careful, but modular arithmetic will handle the overflow fine. Let’s rewrite the multiplication in values <264<2^{64}:

(𝚋S+𝚛S)𝚛B+(𝚋S+𝚛S)+𝚛B+1 \left({\mathtt{b}_S + \mathtt{r}_S}\right)\cdot \mathtt{r}_B + \left({\mathtt{b}_S + \mathtt{r}_S}\right) + \mathtt{r}_B + 1

Now we can calculate the final interval:

Normalization

Given an interval [l,h)\left[{l,h}\right) we want to extract the prefix bits that won’t change anymore. To determine which bits won’t change it is useful to look at hlh-l:

l=0.0001101101000111111110110010001001h=0.0001101101001000000011000010000111hl=0.000000000000settled0000000outstanding100001111111110active \begin{aligned} l &= \mathtt{0.0001101101000111111110110010001001}\ldots \\ h &= \mathtt{0.0001101101001000000011000010000111}\ldots \\ h - l &= \mathtt{0.} \underbrace{\mathtt{000000000000}}_{\mathrm{settled}} \underbrace{\mathtt{0000000}}_{\mathrm{outstanding}} \underbrace{\mathtt{100001111111110}\ldots }_{\mathrm{active}} \end{aligned}

The settled bits can be written directly to the output. The outstanding bits can still change because of a carry, but are otherwise settled. To normalize this interval we output the first 12 bits, note that there are 7 bits outstanding, and rescale the interval by 212+72^{12}^+ ^7 . There is one edge case we can consider:

hl=0.0000000000000000000100000000000000=0.0000000000000000000011111111111111active \begin{aligned} h - l &= \mathtt{0.0000000000000000000100000000000000}\ldots \\ &= \mathtt{0.00000000000000000000} \underbrace{\mathtt{11111111111111}\ldots }_{\mathrm{active}} \end{aligned}

Writing the number in two different but equal ways can result in a different number of leading zeros. We take the one that results in the largest number of leading zeros. So in general we want to rescale by 2n2^n with nn given by:

n=log2(hl)1 n = \left\lfloor {-\log _2 \left({h - l}\right)}\right\rfloor - 1

The interval is normalized if n=0n=0 and thus iff ½<hl½ < h - l.

l=l2n=1101101000111111.110110010001001h=h2n=1101101001000000.011000010000111 \begin{aligned} l' = l\cdot 2^n &= \mathtt{1101101000111111.110110010001001}\ldots \\ h' = h\cdot 2^n &= \mathtt{1101101001000000.011000010000111}\ldots \\ \end{aligned}

Here we can be in one of two situations, either the integral parts are the same, or hh’s is one larger that ll’s. In this case hh’s is in fact one larger. We can subtract the integral part of ll:

l=ll=0.110110010001001h=hl=1.011000010000111 \begin{aligned} l'' = l'-\left\lfloor {l'}\right\rfloor &= \mathtt{0.110110010001001}\ldots \\ h'' = h'-\left\lfloor {l'}\right\rfloor &= \mathtt{1.011000010000111}\ldots \\ \end{aligned}

We have now scaled our [l,h)\left[{l,h}\right) interval to a subinterval of [0,2)\left[{0,2}\right), but we also have

0l<1 0 \leq l < 1

½<hl1 ½ < h-l \leq 1

In summary, an interval is normalized when it is a subinterval of [0,2)\left[{0,2}\right) and the above inequalities hold.

After normalization we end up in one of two cases:

Our situation is as above, there are no bits outstanding. This is when we have a carry outstanding and h>1h > 1. After further narrowing of the interval we will eventually end up in the first case and flush the carry buffer, or we will end up in a third case:

In this third case we should add the carry and subtract 11 from both hh and ll. We are then effectively back in the first case.

64 bit normalization

[𝚋264,𝚋+𝚛+1264) \left[{\frac{\mathtt{b}}{2^{64}}, \frac{\mathtt{b}+ \mathtt{r}+ 1}{2^{64}}}\right)

For this to be normalized in [0,2)\left[{0, 2}\right) we must have

And the normalization condition ½<hl1½ < h-l \leq 1 reduces to 263𝚛<2642^{63} \leq \mathtt{r}< 2^{64}. This essentially means that 𝚛𝔹64\mathtt{r}\in \mathbb{B} ^{64} and 𝚛\mathtt{r} must have the most significant bit set.

n=log2(𝚛+1264)1=63log2(𝚛+1) \begin{aligned} n &= \left\lfloor {-\log _2 \left({\frac{\mathtt{r}+ 1}{2^{64}}}\right)}\right\rfloor - 1 \\ &= 63 - \left\lceil {\log _2 \left({\mathtt{r}+ 1}\right)}\right\rceil \\ \end{aligned}

The ceiling and the +1+1 cancel, except when 𝚛=0\mathtt{r}=0. What remains is essentially a count leading zeroes operation:

l=l2nl2n=[𝚋264n]1h=h2nl2n=𝚋+𝚛+1264n𝚋264n \begin{aligned} l'' &= l\cdot 2^n - \left\lfloor {l\cdot 2^n }\right\rfloor = \left[{\frac{\mathtt{b}}{2^{64}^- ^n }}\right]_1 \\ h'' &= h\cdot 2^n - \left\lfloor {l\cdot 2^n }\right\rfloor = \frac{\mathtt{b}+ \mathtt{r}+ 1}{2^{64}^- ^n } - \left\lfloor {\frac{\mathtt{b}}{2^{64}^- ^n }}\right\rfloor \\ \end{aligned}

Where [x]1=xx\left[{x}\right]_1 = x - \left\lfloor {x}\right\rfloor is the fractional part operator.

𝚋=l264=[𝚋264n]1264=[𝚋264n264]264=[𝚋2n]264 \begin{aligned} \mathtt{b}'' &= \left\lceil {l'' \cdot 2^{64}}\right\rceil \\ &= \left\lceil {\left[{\frac{\mathtt{b}}{2^{64}^- ^n }}\right]_1 \cdot 2^{64}}\right\rceil \\ &= \left\lceil {\left[{\frac{\mathtt{b}}{2^{64}^- ^n } \cdot 2^{64}}\right]_{2^{64}}}\right\rceil \\ &= \left[{\mathtt{b}\cdot 2^n }\right]_{2^{64}} \end{aligned}

Where [x]264=xmod264\left[{x}\right]_{2^{64}} = x \;\mathrm{mod}\; 2^{64} is the modular operator. Since 𝚋2n\mathtt{b}\cdot 2^n is strictly integer, the ceiling has no effect and the operation reduces to a simple right shift:

𝚛=h264l2641=h264𝚋1=(𝚋+𝚛+1264n𝚋264n)264𝚋1=(𝚋+𝚛+1)2n𝚋264n264𝚋1=(𝚋2n264𝚋2n264)264+(𝚛+1)2n𝚋1=[𝚋2n]264+(𝚛+1)2n𝚋1=𝚋+(𝚛+1)2n𝚋1=𝚛2n+2n1 \begin{aligned} \mathtt{r}'' &= \left\lfloor {h'' \cdot 2^{64}}\right\rfloor - \left\lceil {l'' \cdot 2^{64}}\right\rceil - 1 \\ &= \left\lfloor {h'' \cdot 2^{64}}\right\rfloor - \mathtt{b}'' - 1 \\ &= \left\lfloor {\left({\frac{\mathtt{b}+ \mathtt{r}+ 1}{2^{64}^- ^n } - \left\lfloor {\frac{\mathtt{b}}{2^{64}^- ^n }}\right\rfloor }\right) \cdot 2^{64}}\right\rfloor - \mathtt{b}'' - 1 \\ &= \left\lfloor {\left({\mathtt{b}+ \mathtt{r}+ 1}\right)\cdot 2^n - \left\lfloor {\frac{\mathtt{b}}{2^{64}^- ^n }}\right\rfloor \cdot 2^{64} }\right\rfloor - \mathtt{b}'' - 1 \\ &= \left\lfloor {\left({\frac{\mathtt{b}\cdot 2^n }{2^{64}} - \left\lfloor {\frac{\mathtt{b}\cdot 2^n }{2^{64}}}\right\rfloor }\right)\cdot 2^{64} + \left({\mathtt{r}+ 1}\right)\cdot 2^n }\right\rfloor - \mathtt{b}'' - 1 \\ &= \left\lfloor {\left[{\mathtt{b}\cdot 2^n }\right]_{2^{64}} + \left({\mathtt{r}+ 1}\right)\cdot 2^n }\right\rfloor - \mathtt{b}'' - 1 \\ &= \left\lfloor {\mathtt{b}'' + \left({\mathtt{r}+ 1}\right)\cdot 2^n }\right\rfloor - \mathtt{b}'' - 1 \\ &= \mathtt{r}\cdot 2^n + 2^n - 1 \\ \end{aligned}

This is essentially shifting 𝚛\mathtt{r} to the right nn places while shifting in ones. We don’t need to worry about overflow here because 𝚛\mathtt{r} is strictly less than 264n2^{64}^- ^n .

Appendix: 128 Bit arithmetic