Representing (masked) bits in rings
\def\vec#1{\mathbf{#1}} \gdef\mat#1{\mathrm{#1}} \def\T{\mathsf{T}} \def\F{\mathsf{F}} \def\U{\mathsf{U}} \def\popcount{\mathtt{popcount}} \def\count{\mathtt{count}} \def\hamming{\mathtt{hamming}} \def\fhd{\mathtt{fhd}} \def\vsum{\mathtt{sum}}
In cryptography it is often useful to represent bits in algebraic objects. I will specifically consider rings ℤ_m of integers modulo m. If m is a prime number this happens to also be a field, but the difference is not very important here.
Bits are elements of 𝔹 = \{\T, \F\}, i.e. true and false, with the usual operations of not ¬, and ∧, or ∨ and xor ⊕.
Given bit vectors \vec a, \vec b ∈ 𝔹^n let the \popcount: 𝔹^n → [0,n] function return the number of \T values in a vector. We can then define the hamming distance function as \hamming(\vec a, \vec b) = \popcount(\vec a ⊕ \vec b).
Masked bits are elements of 𝕋 = \{\T, \F, \U\}, i.e. true, false, and unavailable. The operations are the same as for binary, except when one of the arguments is \U, the result is always \U.
We take \popcount to count the number of \T's and introduce \count to count the number of available entries i.e. either \F or \T but not \U. We can then define a fractional hamming distance \fhd as
\fhd(\vec a, \vec b) = \frac{\popcount(\vec a ⊕ \vec b)}{\count(\vec a ⊕ \vec b)}
The \{0,1\} representation of bits
We represent \F, \T as 0,1 respectively. The usual binary operations of not ¬, and ∧, or ∨, xor ⊕, can respectively be represented using ring operations as
\begin{aligned} ¬ a &= 1 - a \\ a ∧ b &= a ⋅ b \\ a ∨ b &= a + b - a ⋅ b \\ a ⊕ b &= a + b - 2 ⋅ a ⋅ b \\ \end{aligned}
If we have a vector \vec a ∈ 𝔹^n we can represent it element wise in ℤ_m^n. This allows us to represent the \popcount function:
\begin{aligned} \popcount(\vec a) &= \vsum(\vec a) \mod m\\ \end{aligned}
Observing that the sum over elementwise products is the dot product, we obtain the following useful identity:
\popcount(\vec a ∧ \vec b) = \vec a ⋅ \vec b \mod m
This representation is reversible for any m ≥ 2, except for \popcount which need m ≥ n.
\hamming(\vec a, \vec b) = \vec a ⋅ \vec a + \vec b ⋅ \vec b - 2 ⋅ \vec a ⋅ \vec b
The \{1, -1\} representation of bits
We represent \F, \T as 1,-1 respectively. The usual binary operations of not ¬, and ∧, or ∨, xor ⊕, can respectively be represented using ring operations as
\begin{aligned} ¬ a &= - a \\ a ∧ b &= ? \\ a ∨ b &= ? \\ a ⊕ b &= a ⋅ b \\ \end{aligned}
If we have a vector \vec a ∈ 𝔹^n we can represent it element wise in ℤ_m^n. This allows us to represent the \popcount function:
\begin{aligned} \popcount(\vec a) &= ½ ⋅(n - \vsum(\vec a)) \\ \end{aligned}
Given a m bit vectors \vec a_i ∈ 𝔹^n we can construct a matrix \mat A ∈ 𝔹^{n×m} such tht \mat A_{i,:} = \vec a_i. If we have a second matrix of hamming distances \mat D ∈ [0,n]^{m×m} such that \mat D_{ij} = \hamming(\vec a_i, \vec a_j) then the following holds:
\mat D = ½ ⋅(n⋅\mat J - \mat A ⋅ \mat A^\T)
Where \mat J is the matrix of all ones. Suppose we are given \mat D, we can compute \mat D' = n⋅\mat J - 2⋅\mat D and solve the following quadratic system:
\begin{aligned} \mat D' &= \mat A ⋅ \mat A^\T \\ \mat J &= \mat A ⊙ \mat A \end{aligned}
The \{-1,0,1\} representation of masked bits
We represent \F, \U, \T as -1, 0, 1 respectively. The binary operations become
\begin{aligned} ¬ a &= - a \\ a ∧ b &= ½ ⋅ a ⋅ b ⋅ (1 + a + b - a ⋅ b) \\ a ∨ b &= ½ ⋅ a ⋅ b ⋅ (a ⋅ b + a + b -1 ) \\ a ⊕ b &= -a ⋅ b \\ \end{aligned}
Note that the formulas for ∧, ∨ can be evaluated using two multiplies in a depth-2 circuit.
Given a masked bitvector \vec a ∈ 𝕋^n, we can define
\begin{aligned} \count(\vec a) &= \vsum\left(\vec a^{⊙2}\right) \\ \popcount(\vec a) &= ½ ⋅ \vsum\left(\vec a^{⊙2} + \vec a\right) \end{aligned}
We can represent this on a suitably large (Q how large?) ring as -1,0,1 for \F, \U, \T respectively.
Note that squaring, \vec a^{⊙2}, produces a regular 0,1 bitvector that is 0 whenever the value is \U and 1 otherwise, i.e. it allows us to extract the mask as a regular bitvector. Similarly ½ ⋅(\vec a^{⊙2} + \vec a) extracts the data bits: 1 for \T and 0 otherwise. The reverse mapping, converting data bits \vec b and mask bits \vec m to a masked bitvector is \vec m - 2⋅\vec b⊙\vec m. If the data bits are known to be \F in the in the unavailable region, then it simplifies to \vec m - 2⋅\vec b⊙\vec m.
In this representation and and or have awkward fourth degree expressions, though they can be evaluated using only two multiplies. The expressions for xor, \count and \popcount are quite nice considering that they correctly account for masks. This makes it a suitable system for computing fractional hamming distances.
This representation is reversible for any m ≥ 2, except for \popcount which need m ≥ n.
Fractional hamming distance
The fractional hamming weight of a masked bitvector \vec a is defined as
\begin{aligned} \mathtt{fhw}(\vec a) &= \frac {\popcount(\vec a)} {\count(\vec a )} \end{aligned}
and the fractional hamming distance between two masked bitvectors \vec a and \vec b as
\begin{aligned} \fhd(\vec a, \vec b) &= \mathtt{fhw}(\vec a ⊕ \vec b) = \frac {\popcount(\vec a ⊕ \vec b)} {\count(\vec a ⊕ \vec b)} \end{aligned}
In the ring representation these can be computed as
\begin{aligned} \fhd(\vec a, \vec b) & =\frac { ½ ⋅ \vsum\left((-\vec a ⊙ \vec b)^{⊙2} -\vec a ⊙ \vec b\right)} {\vsum\left((-\vec a ⊙ \vec b)^{⊙2}\right)} &&=\frac{1}{2} -\frac { \vsum\left(\vec a ⊙ \vec b\right)} {2⋅\vsum\left((\vec a ⊙ \vec b)^{⊙2}\right)} \\ \end{aligned}
Note that (\vec a ⊙ \vec b)^{⊙2} = \vec a^{⊙2} ⊙ \vec b^{⊙2} and thus depends only on the masks of \vec a and \vec b. Given the masks \vec a_m and \vec b_m it can be computed in binary as
\vsum\left((\vec a ⊙ \vec b)^{⊙2}\right) = \popcount(\vec a_m ∧ \vec b_m)