2024-02-16

Representing (masked) bits in rings

$\def\vec#1{\mathbf{#1}} \gdef\mat#1{\mathrm{#1}} \def\T{\mathsf{T}} \def\F{\mathsf{F}} \def\U{\mathsf{U}} \def\popcount{\mathtt{popcount}} \def\count{\mathtt{count}} \def\hamming{\mathtt{hamming}} \def\fhd{\mathtt{fhd}} \def\vsum{\mathtt{sum}}$

In cryptography it is often useful to represent bits in algebraic objects. I will specifically consider rings $ℤ_m$ of integers modulo $m$ . If $m$ is a prime number this happens to also be a field, but the difference is not very important here.

Bits are elements of $𝔹 = \{\T, \F\}$ , i.e. true and false, with the usual operations of not $\neg$ , and $\land$ , or $\lor$ and xor $\oplus$ .

Given bit vectors $\vec a, \vec b ∈ 𝔹^n$ let the $\popcount: 𝔹^n → [0,n]$ function return the number of $\T$ values in a vector. We can then define the hamming distance function as $\hamming(\vec a, \vec b) = \popcount(\vec a ⊕ \vec b)$ .

Masked bits are elements of $𝕋 = \{\T, \F, \U\}$ , i.e. true, false, and unavailable. The operations are the same as for binary, except when one of the arguments is $\U$ , the result is always $\U$ .

We take $\popcount$ to count the number of $\T$ 's and introduce $\count$ to count the number of available entries i.e. either $\F$ or $\T$ but not $\U$ . We can then define a fractional hamming distance $\fhd$ as

$\fhd(\vec a, \vec b) = \frac{\popcount(\vec a ⊕ \vec b)}{\count(\vec a ⊕ \vec b)}$

The $\{0,1\}$ representation of bits

We represent $\F, \T$ as $0,1$ respectively. The usual binary operations of not $\neg$ , and $\land$ , or $\lor$ , xor $\oplus$ , can respectively be represented using ring operations as

$\begin{aligned} ¬ a &= 1 - a \\ a ∧ b &= a ⋅ b \\ a ∨ b &= a + b - a ⋅ b \\ a ⊕ b &= a + b - 2 ⋅ a ⋅ b \\ \end{aligned}$

If we have a vector $\vec a ∈ 𝔹^n$ we can represent it element wise in $ℤ_m^n$ . This allows us to represent the $\popcount$ function:

$\begin{aligned} \popcount(\vec a) &= \vsum(\vec a) \mod m\\ \end{aligned}$

Observing that the sum over elementwise products is the dot product, we obtain the following useful identity:

$\popcount(\vec a ∧ \vec b) = \vec a ⋅ \vec b \mod m$

This representation is reversible for any $m \geq 2$ , except for $\popcount$ which need $m \geq n$ .

$\hamming(\vec a, \vec b) = \vec a ⋅ \vec a + \vec b ⋅ \vec b - 2 ⋅ \vec a ⋅ \vec b$

The $\{1, -1\}$ representation of bits

We represent $\F, \T$ as $1,-1$ respectively. The usual binary operations of not $\neg$ , and $\land$ , or $\lor$ , xor $\oplus$ , can respectively be represented using ring operations as

$\begin{aligned} ¬ a &= - a \\ a ∧ b &= ? \\ a ∨ b &= ? \\ a ⊕ b &= a ⋅ b \\ \end{aligned}$

If we have a vector $\vec a ∈ 𝔹^n$ we can represent it element wise in $ℤ_m^n$ . This allows us to represent the $\popcount$ function:

$\begin{aligned} \popcount(\vec a) &= ½ ⋅(n - \vsum(\vec a)) \\ \end{aligned}$

Given a $m$ bit vectors $\vec a_i ∈ 𝔹^n$ we can construct a matrix $\mat A ∈ 𝔹^{n×m}$ such tht $\mat A_{i,:} = \vec a_i$ . If we have a second matrix of hamming distances $\mat D ∈ [0,n]^{m×m}$ such that $\mat D_{ij} = \hamming(\vec a_i, \vec a_j)$ then the following holds:

$\mat D = ½ ⋅(n⋅\mat J - \mat A ⋅ \mat A^\T)$

Where $\mat J$ is the matrix of all ones. Suppose we are given $\mat D$ , we can compute $\mat D' = n⋅\mat J - 2⋅\mat D$ and solve the following quadratic system:

$\begin{aligned} \mat D' &= \mat A ⋅ \mat A^\T \\ \mat J &= \mat A ⊙ \mat A \end{aligned}$

The $\{-1,0,1\}$ representation of masked bits

We represent $\F, \U, \T$ as $-1, 0, 1$ respectively. The binary operations become

$\begin{aligned} ¬ a &= - a \\ a ∧ b &= ½ ⋅ a ⋅ b ⋅ (1 + a + b - a ⋅ b) \\ a ∨ b &= ½ ⋅ a ⋅ b ⋅ (a ⋅ b + a + b -1 ) \\ a ⊕ b &= -a ⋅ b \\ \end{aligned}$

Note that the formulas for $\land, \lor$ can be evaluated using two multiplies in a depth-2 circuit.

Given a masked bitvector $\vec a ∈ 𝕋^n$ , we can define

$\begin{aligned} \count(\vec a) &= \vsum\left(\vec a^{⊙2}\right) \\ \popcount(\vec a) &= ½ ⋅ \vsum\left(\vec a^{⊙2} + \vec a\right) \end{aligned}$

We can represent this on a suitably large (Q how large?) ring as $-1,0,1$ for $\F, \U, \T$ respectively.

Note that squaring, $\vec a^{⊙2}$ , produces a regular $0,1$ bitvector that is $0$ whenever the value is $\U$ and $1$ otherwise, i.e. it allows us to extract the mask as a regular bitvector. Similarly $½ ⋅(\vec a^{⊙2} + \vec a)$ extracts the data bits: $1$ for $\T$ and $0$ otherwise. The reverse mapping, converting data bits $\vec b$ and mask bits $\vec m$ to a masked bitvector is $\vec m - 2⋅\vec b⊙\vec m$ . If the data bits are known to be $\F$ in the in the unavailable region, then it simplifies to $\vec m - 2⋅\vec b⊙\vec m$ .

In this representation and and or have awkward fourth degree expressions, though they can be evaluated using only two multiplies. The expressions for xor, $\count$ and $\popcount$ are quite nice considering that they correctly account for masks. This makes it a suitable system for computing fractional hamming distances.

This representation is reversible for any $m \geq 2$ , except for $\popcount$ which need $m \geq n$ .

Fractional hamming distance

The fractional hamming weight of a masked bitvector $\vec a$ is defined as

$\begin{aligned} \mathtt{fhw}(\vec a) &= \frac {\popcount(\vec a)} {\count(\vec a )} \end{aligned}$

and the fractional hamming distance between two masked bitvectors $\vec a$ and $\vec b$ as

$\begin{aligned} \fhd(\vec a, \vec b) &= \mathtt{fhw}(\vec a ⊕ \vec b) = \frac {\popcount(\vec a ⊕ \vec b)} {\count(\vec a ⊕ \vec b)} \end{aligned}$

In the ring representation these can be computed as

$\begin{aligned} \fhd(\vec a, \vec b) & =\frac { ½ ⋅ \vsum\left((-\vec a ⊙ \vec b)^{⊙2} -\vec a ⊙ \vec b\right)} {\vsum\left((-\vec a ⊙ \vec b)^{⊙2}\right)} &&=\frac{1}{2} -\frac { \vsum\left(\vec a ⊙ \vec b\right)} {2⋅\vsum\left((\vec a ⊙ \vec b)^{⊙2}\right)} \\ \end{aligned}$

Note that $(\vec a ⊙ \vec b)^{⊙2} = \vec a^{⊙2} ⊙ \vec b^{⊙2}$ and thus depends only on the masks of $\vec a$ and $\vec b$ . Given the masks $\vec a_m$ and $\vec b_m$ it can be computed in binary as

$\vsum\left((\vec a ⊙ \vec b)^{⊙2}\right) = \popcount(\vec a_m ∧ \vec b_m)$

Representing (masked) bits in rings

The \{0,1\} representation of bits

The \{1, -1\} representation of bits

The \{-1,0,1\} representation of masked bits

Fractional hamming distance

The $\{0,1\}$ representation of bits

The $\{1, -1\}$ representation of bits

The $\{-1,0,1\}$ representation of masked bits