Succinct Multi-Linear Extensions

\gdef\p#1{\left({#1}\right)} \gdef\ceil#1{\left\lceil{#1}\right\rceil} \gdef\floor#1{\left\lfloor{#1}\right\rfloor} \gdef\norm#1{\left|{#1}\right|} \gdef\set#1{\left\{#1\right\}} \gdef\range#1{\left[#1\right)} \gdef\setn#1{\mathcal{#1}} \gdef\vec#1{\mathbf{#1}} \gdef\mat#1{\mathrm{#1}} \gdef\F{\mathbb{F}} \gdef\Z{\mathbb{Z}} \gdef\vd{𝛅} \gdef\i{\mathrm{i}} \gdef\j{\mathrm{j}} \gdef\om{\mathrm{\omega}}

Many interactive proof systems rely on multi-linear extensions (MLEs) that can be efficiently evaluated, e.g. Jolt (NTZ25 appendix A, see also the source code for many examples), SuperSpartan (STW23 section 5), or GKR (Tha13, Tha23 section 4.6.6).

Despite their importance, it is not easy to come up with specific efficient MLEs, or even to get a sense of what is possible. At least, that was the case for me until I read HJR+25 "Jagged Polynomial Commitments" and played with the ideas presented there. It turns out a lot more is possible than I initially expected! In the process I learned how to construct, combine, and transform succinct MLEs, which is what this article covers.

So, time to get good at succinct MLEs!

Multi-Linear Extensions

Here follows a brief review of multi-linear extensions (MLEs) and their properties. For a more extensive treatment see Tha23 § 3.5.

Denote with \mathsf{bits}_k: \mathbb{N} \rightarrow \F_q^k the map sending i to the bitvector \vec b \in \{0,1\}^k such that i = \sum_{j \in [0,k)} b_j \cdot 2^{k-j-1}, i.e. in big-endian order. We will leave the field \F_q implicit and often drop the subscript k when it is clear from context.

Definition 1. A k-variable Multi-Linear Extension (MLE) evaluation is a map

\mathsf{MLE}_k: \F_q^{2^k} \times \F_q^k \rightarrow \F_q

such that \mathsf{MLE}_k(\vec w, \mathsf{bits}(i)) = w_i for all i \in [0,2^k) and \mathsf{MLE}_k(\vec w, \vec r) is a polynomial in \vec r of degree at most 1 in each variable.

Now we show this map is unique and present several equivalent definitions.

Lemma 2. The MLE evaluation map is unique and equivalent to the recursion

\begin{align*} \mathsf{MLE}_0([w_0], []) &= w_0 \\[4pt] \mathsf{MLE}_{k+1}([\vec w_0 \, \vec w_1], [r_0 \, \vec r]) &= u_0 + r_0 \cdot (u_1 - u_0) \\ \text{ where } u_i &= \mathsf{MLE}_k(\vec w_i, \vec r) \text{.} \end{align*}

Proof: Uniqueness follows from the fact that a polynomial of degree at most 1 in each of the k variables has at most 2^k coefficients, which are pinned down by the 2^k interpolation points.

Multi-linearity in \vec r: The base case is trivial. For the inductive case, the expression is a linear combination of \mathsf{MLE}s that are multi-linear in the \vec r variables, and it is linear in the new variable r_0.

Interpolation: The base case is trivial. For the inductive case, observe that for b \in \{0,1\}, i \in [0,2^k) we have

\mathsf{MLE}_{k+1}([\vec w_0 \, \vec w_1], [b \, \mathsf{bits}_k(i)]) = \mathsf{MLE}_k(\vec w_b, \mathsf{bits}_k(i)) = w_{b, i}

and since concatenation results in [\vec w_0 \, \vec w_1]_{b \cdot 2^k + i} = w_{b,i} and \mathsf{bits}_{k+1}(b \cdot 2^k + i) = [b\, \mathsf{bits}_k(i)], the interpolation property holds.

This definition can be evaluated using 2^k - 1 field multiplications (and as many additions and subtractions). This makes it an O(n) algorithm where n = 2^k. Since in general all n coefficients of \vec w are needed to compute the result, O(n) is optimal. But as we will see, if \vec w has some special structure, we can do better.
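As a concrete reference, here is a minimal sketch of the recursion from Lemma 2 in plain Python (the Sage implementation at the end of this article is the fuller version; `mle_recursive` is an illustrative name):

```python
def mle_recursive(w, r):
    """Evaluate MLE_k(w, r) where len(w) == 2**len(r), big-endian bit order."""
    if not r:
        return w[0]
    half = len(w) // 2
    u0 = mle_recursive(w[:half], r[1:])  # sub-cube where the leading bit is 0
    u1 = mle_recursive(w[half:], r[1:])  # sub-cube where the leading bit is 1
    return u0 + r[0] * (u1 - u0)         # one multiplication per internal node

# Interpolation check on the hypercube: MLE(w, bits(i)) == w_i.
w = [3, 1, 4, 1, 5, 9, 2, 6]
for i in range(8):
    r = [(i >> 2) & 1, (i >> 1) & 1, i & 1]
    assert mle_recursive(w, r) == w[i]
```

The recursion does one multiplication per internal node of the binary tree, which is the 2^k - 1 count above.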

Another common and useful expression is in terms of Lagrange basis polynomials, which in the context of MLEs are commonly denoted as \mathsf{eq}_i(\vec r):

Lemma 3. The MLE evaluation map is equivalent to

\mathsf{MLE}_k(\vec w, \vec r) = \sum_{i \in [0, 2^k)} \mathsf{eq}_{k\,i}(\vec r) \cdot w_i

where

\mathsf{eq}_{k\,i}(\vec r) = \prod_{j \in [0, k)} \begin{cases} 1- r_j & \mathsf{bits}_k(i)_j = 0 \\ r_j & \mathsf{bits}_k(i)_j = 1 \text{.} \end{cases}

Proof: Observe that \mathsf{eq}_i(\vec r) is linear in each r_j and for i,j \in [0, 2^k) we have

\mathsf{eq}_i(\mathsf{bits}(j)) = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}

which makes \mathsf{eq}_i a multi-linear Lagrange basis for \{0,1\}^k. The properties of the expression for \mathsf{MLE} follow from this.

We can rewrite this lemma using the tensor product notation \otimes:

Lemma 4. The MLE evaluation map is equivalent to the inner product

\mathsf{MLE}_k(\vec w, \vec r) = \vec w \cdot \p{\bigotimes_{i \in \range{0,k}} \begin{bmatrix} 1 - r_i \\ r_i \end{bmatrix} } \text{.}

From this equivalent definition it follows directly that the MLE is linear in the \vec w argument:

Lemma 5. (Linearity) For all \alpha, \beta \in \F_q and \vec w, \vec w' \in \F_q^{2^k} and \vec r \in \F_q^k we have

\mathsf{MLE}(\alpha \cdot \vec w + \beta \cdot \vec w', \vec r) = \alpha \cdot \mathsf{MLE}(\vec w, \vec r) + \beta \cdot \mathsf{MLE}(\vec w', \vec r) \text{.}

However, it does not follow that the MLE is linear in the \vec r argument. Instead it is multi-linear, i.e. linear in each variable separately, which is a much weaker property.

The MLE evaluation respects the symmetries of the hypercube. The hypercube \{0,1\}^k has a large symmetry group, the hyper-octahedral group B_k of order k! \cdot 2^k, which is generated by permutations of the k axes and reflections in each axis. The MLE respects these symmetries in the following sense:

Lemma 6. (Hypercube Symmetries) Every symmetry of the hyper-cube \sigma \in B_k induces an equivalent symmetry on the \vec w and \vec r:

\mathsf{MLE}_k(\sigma_w(\vec w), \vec r) = \mathsf{MLE}_k(\vec w, \sigma_r(\vec r)) \text{.}

Proof: It suffices to show this for the generators of the group, which come in two types:

  1. Permutation of axes: \sigma_r(\vec r) = (r_{\pi(0)}, \dots, r_{\pi(k-1)}) for some permutation \pi on [0,k) and \sigma_w(\vec w) = (w_{\pi'(0)}, \dots, w_{\pi'(2^k - 1)}) where \pi' is the permutation on [0,2^k) induced by \pi on the bit representations of the indices.

  2. Reflection in axis j: \sigma_r(\vec r) = (r_0, \dots, 1 - r_j, \dots, r_{k-1}) and \sigma_w(\vec w) = (w_{0 \oplus 2^j}, \dots, w_{i \oplus 2^j}, \dots, w_{(2^k - 1) \oplus 2^j}) where \oplus is bitwise XOR.

It is sufficient to show that the left and right side of the equation are MLEs and agree on the interpolation points \set{0,1}^k. The only non-trivial case is the right side of the reflection, where we note that 1 - r_j is still linear in r_j.

The ability to permute variables plays nicely with other lemmas that allow us to split or combine variables. For example, we can compose two MLE evaluations into one:

Lemma 7. (Tensor Composition) Given k, k' and \vec w \in \F_q^{2^k}, \vec w' \in \F_q^{2^{k'}}, \vec r \in \F_q^k, and \vec r' \in \F_q^{k'} we have

\mathsf{MLE}_k(\vec w, \vec r) \cdot \mathsf{MLE}_{k'}(\vec w', \vec r') = \mathsf{MLE}_{k+k'}(\vec w \otimes \vec w', [\vec r\, \vec r']) \text{.}

In the opposite direction, factoring an \mathsf{MLE} is hard as \vec w doesn't generally have a tensor decomposition. But when we fix \vec r' to a bitvector we can do it:

Lemma 8. (Partial Fixing) Given \vec w \in \F_q^{2^{k +l}}, \vec r \in \F_q^k, and \vec b \in \set{0,1}^{l} we have

\mathsf{MLE}_{k+l}(\vec w, [\vec r\, \vec b]) = \mathsf{MLE}_{k}(\vec w', \vec r)

where \vec w' \in \F_q^{2^k} is given by w'_i = w_{i \cdot 2^{l} + j} with j such that \vec b = \mathsf{bits}_{l}(j).

That concludes the basic properties of general MLE evaluations. We will now look at special cases for \vec w that allow efficient evaluation of \mathsf{MLE}_k(\vec w, \vec r), i.e. sub-linear in the size of \vec w or, equivalently, sub-exponential in k.

Tensor Weights

For succinct verification we need to restrict \vec w to vectors that allow MLE evaluation in O(\log n) or better (where n = 2^k). A simple but useful class is those where \vec w can be decomposed as the tensor product of 2-vectors.

Lemma 9. Given \vec t_0,\dots,\vec t_{k-1} \in \F_q^2, there is an O(k) algorithm to compute \mathsf{MLE}_k(\vec w, \vec r) where

\vec w = \bigotimes_{i \in [0,k)} \vec t_i \text{.}

Proof: Consider the base case \mathsf{MLE}_0([1], []) = 1 and recursion:

\begin{align*} \mathsf{MLE}_{k+1}(\vec t_0 \otimes \vec w', [r_0, \vec r]) &= \p{(\vec t_0)_0 \cdot (1 - r_0) + (\vec t_0)_1 \cdot r_0} \cdot \mathsf{MLE}_k(\vec w', \vec r) \end{align*}

The recursive step is linear in r_0 and for r_0 \in \{0,1\} we have (\vec t_0)_0 or (\vec t_0)_1 times the MLE in the remaining variables, which results in the MLE of the tensor product by induction. Finally the algorithm is O(k) since it does O(1) work per variable.
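The O(k) evaluation of Lemma 9 can be sketched in plain Python, with a dense Kronecker expansion to check against. Both helper names (`kron`, `mle_tensor`) are illustrative, and the Kronecker convention here puts the first factor on the most significant bit:

```python
def kron(ts):
    """Dense weight vector t_0 ⊗ ... ⊗ t_{k-1} (first factor most significant)."""
    w = [1]
    for t0, t1 in ts:
        w = [wi * t for wi in w for t in (t0, t1)]
    return w

def mle_tensor(ts, r):
    """O(k) evaluation of MLE_k(kron(ts), r), one factor per variable."""
    result = 1
    for (t0, t1), ri in zip(ts, r):
        result *= t0 * (1 - ri) + t1 * ri
    return result

ts = [(2, 3), (5, 7), (1, 4)]
w = kron(ts)
# Agreement on the hypercube; multi-linearity then gives agreement everywhere.
for i in range(8):
    r = [(i >> 2) & 1, (i >> 1) & 1, i & 1]
    assert mle_tensor(ts, r) == w[i]
```

Note that `mle_tensor` touches each \vec t_i once, i.e. O(1) work per variable, while the dense vector has 2^k entries.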

This class is closed under the hypercube symmetries, as permutations just permute the \vec t_i and reflections swap the two elements of some \vec t_i. We can also scale tensor weights by scaling one of the \vec t_i, but we cannot generally add two tensor weight vectors, as the sum of two tensor products is not generally a tensor product. However, tensor weights are closed under element-wise multiplication of the weight vectors, by taking the element-wise products of the \vec t_i. This extends to element-wise exponentiation, division and inverses.

This is a very useful class of weights, as it allows us to evaluate polynomials in some useful bases. Here are some multi-linear examples:

Example 10. (Multi-Linear Monomial Basis) Given \vec x \in \F_q^k the tensor weights for evaluating a multi-linear polynomial in monomial basis are \vec t_i = \begin{bmatrix} 1 & x_i \end{bmatrix}.

Example 11. (Multi-Linear Boolean Hypercube Basis) Given \vec x \in \F_q^k the tensor weights for evaluating a polynomial in the \{0,1\}^k multi-linear basis are \vec t_i = \begin{bmatrix} 1 - x_i & x_i \end{bmatrix}.

Example 12. (General Multi-Linear Hypercube Basis) Given \vec x \in \F_q^k the tensor weights for evaluating a polynomial in the \{a_i,b_i\}_{i \in [0,k)} multi-linear basis are

\vec t_i = \begin{bmatrix} \displaystyle \frac{x_i - b_i}{a_i - b_i} & \displaystyle \frac{x_i - a_i}{b_i - a_i} \end{bmatrix} \text{.}

To do. The \set{0, \infty}^k basis.

We can also do some univariate evaluation, but it is much more restrictive. Monomial evaluation works:

Example 13. (Univariate Monomial Basis) Given x \in \F_q the weights vector for evaluating a polynomial in the univariate monomial basis has w_i = x^i and tensor weights \vec t_i = \begin{bmatrix} 1 & x^{2^{k-i-1}} \end{bmatrix}.
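A quick numeric check in plain Python that monomial weights factor as a tensor product; with the big-endian \mathsf{bits} convention and the Kronecker order of Lemma 4, the factor for variable i carries the exponent 2^{k-i-1}:

```python
k, x = 3, 5
ts = [(1, x ** (2 ** (k - i - 1))) for i in range(k)]
w = [1]
for t0, t1 in ts:  # Kronecker product, first factor most significant
    w = [wi * t for wi in w for t in (t0, t1)]
assert w == [x ** i for i in range(2 ** k)]  # w_i = x^i
```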

Evaluation on one or two Lagrange points is just a degenerate multi-linear case. We can also do three Lagrange points (where we ignore w_3), but barely:

Example 14. (Univariate 3-Point Lagrange Basis) Given Lagrange basis points x_0,x_1,x_2 \in \F_q and evaluation point x \in \F_q,

\begin{align*} \vec t_0 &= \begin{bmatrix} \displaystyle (x - x_2) \\[1em] \displaystyle \frac{(x - x_0) \cdot (x_1 - x_0)}{x_2 - x_1} \end{bmatrix} & \vec t_1 &= \begin{bmatrix} \displaystyle \frac{x - x_1}{(x_0 - x_1) \cdot (x_0 - x_2)} \\[1em] \displaystyle \frac{x - x_0}{(x_1 - x_0) \cdot (x_1 - x_2)} \end{bmatrix} \text{.} \\ \end{align*}

This is cheating a bit, as \vec w is length 4 and we simply ignore the last element. To show that we can't do better, we first introduce a lemma that succinctly characterizes tensor weights:

Lemma 15. (Face Identities) For m < k and i, i' \in \range{0, 2^m} and j, j' \in \range{0,2^{k-m}} we have for tensor weights \vec w:

w_{i + j \cdot 2^m} \cdot w_{i' + j' \cdot 2^m} - w_{i + j' \cdot 2^m} \cdot w_{i' + j \cdot 2^m} = 0 \text{.}

These identities correspond to the faces of the k-hypercube and they are necessary and sufficient conditions for \vec w to be a tensor product of 2-vectors.

Proof: To show necessity, group the tensor factors as \vec w = \vec w' \otimes \vec w'' with \vec w' \in \F_q^{2^{k-m}} and \vec w'' \in \F_q^{2^m} (possible by associativity of the tensor product), so that w_{i + j \cdot 2^m} = w'_j \cdot w''_i and similarly for the other terms. Then the identity becomes trivially true:

w'_j \cdot w''_i \cdot w'_{j'} \cdot w''_{i'} - w'_{j'} \cdot w''_i \cdot w'_j \cdot w''_{i'} = 0 \text{.}

To show sufficiency, we use induction on k. The base case k = 1 is trivial. For the inductive case, split \vec w into two halves \vec w = [\vec w_0\, \vec w_1]. We will show that \vec w_1 = \alpha \cdot \vec w_0 for some \alpha; then we can take \vec t_0 = [1\, \alpha] and use the inductive hypothesis on \vec w_0 to get the remaining \vec t_i. Consider the face identities for m = k-1, j = 0, j' = 1 and all i, i' \in \range{0,2^{k-1}}, which we can rewrite as

\frac{w_{i + 2^{k-1}}}{w_i} = \frac{w_{i' + 2^{k-1}}}{w_{i'}} \text{.}

If all the w_{i + 2^{k-1}} = 0 then we set \alpha = 0 and we are done. Otherwise these ratios are all equal, and we take their common value as \alpha.
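The necessity direction is easy to check numerically: any tensor product of 2-vectors satisfies all face identities. A small brute-force check in plain Python, with integer weights standing in for field elements:

```python
from itertools import product

def kron(ts):
    """Dense tensor product of 2-vectors, first factor most significant."""
    w = [1]
    for t0, t1 in ts:
        w = [wi * t for wi in w for t in (t0, t1)]
    return w

k = 4
w = kron([(2, 3), (5, 7), (1, 4), (6, 2)])
# Every face identity w_{i+j*2^m} w_{i'+j'*2^m} = w_{i+j'*2^m} w_{i'+j*2^m} holds.
for m in range(1, k):
    for i1, i2 in product(range(2 ** m), repeat=2):
        for j1, j2 in product(range(2 ** (k - m)), repeat=2):
            assert (w[i1 + j1 * 2 ** m] * w[i2 + j2 * 2 ** m]
                    - w[i1 + j2 * 2 ** m] * w[i2 + j1 * 2 ** m]) == 0
```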

We can now show that no dense univariate basis exists for more than 2 points using tensor weights:

Lemma 16. For k > 1 it is not possible to construct a univariate basis for n \geq 2^k interpolation points using tensor weights.

Proof: The n > 2^k case is trivial. For n = 2^k we are looking for tensor weights such that w_i = l_i(x), where l_i is the i-th Lagrange basis polynomial for some set of points \set{x_j}_{j \in \range{0, n}}. Picking four corners a, b, c, d of a face of the hypercube, the face identity becomes

l_a(x) \cdot l_b(x) - l_c(x) \cdot l_d(x) = 0 \text{.}

This identity needs to hold for any x and the left side is a degree 2(n-1) polynomial. Taking the derivative we get

\p{l_a(x) \cdot l_b'(x) + l_a'(x) \cdot l_b(x)} - \p{l_c(x) \cdot l_d'(x) + l_c'(x) \cdot l_d(x)} = 0 \text{.}

For x=x_a we have l_a(x_a) = 1 and l_b(x_a) = l_c(x_a) = l_d(x_a) = 0, and the above simplifies to l_b'(x_a) = 0. But x_a is a simple zero of l_b, so l_b'(x_a) \neq 0, a contradiction.

It's an open question (to me at least) if a dense univariate Lagrange basis MLE can be computed efficiently.

While tensor weights already capture a surprisingly useful class of succinct MLEs, they remain limited to factorizable structures and are, for example, not closed under addition. To go further, we can replace each scalar by a matrix, allowing more complex linear combinations of partial results. This leads to Weighted Ordered Binary Decision Diagrams.

Weighted OBDDs

A large class of efficient MLEs can be constructed by replacing the scalars in the tensors by matrices. This was first introduced in HR18 in the proof of thm. 5.4, later refined in HJR+25 § 4.1 where they are called read-once matrix branching programs. We will use terminology more in line with Weg00, augmented with terms from automata theory.

Definition 17. A Weighted Ordered Binary Decision Diagram (WOBDD) consists of a permutation \pi on \range{0,k} and weight matrices

\set{\mat M_{i\,0},\mat M_{i\,1} \in \mathbb{F}_q^{\,u_i \times u_{i+1}}}_{i \in [0,k)} \text{.}

where the u_i are matrix dimensions such that the initial and final dimensions are u_0 = u_k = 1. The evaluation of the program at \vec r is then defined as

\prod_{i \in [0,k)} \p{ (1 - r_{\pi(i)}) \cdot \mathrm{M}_{i\,0} + r_{\pi(i)} \cdot \mathrm{M}_{i\,1} } \text{.}

We call \max_i u_i the width of the WOBDD. When the permutation is given, we call it a \pi-WOBDD.

Note. This definition differs from HJR+25 and literature on weighted automata in allowing matrices of varying dimensions. The requirement that u_0 = u_k = 1 ensures that the result is a scalar, where in other literature this is achieved using separate source and sink vectors. This mostly makes the notation cleaner, but it's easy to convert between the two.

Let's establish that these are indeed MLE evaluations and can be evaluated efficiently.

Lemma 18. A WOBDD is an MLE evaluation for \vec w \in \F_q^{2^k} where

w_i = \prod_{j \in [0,k)} \mathrm{M}_{j\, b_{\pi(j)}}

where \vec b = \mathsf{bits}_k(i). If the width is bounded, evaluation takes O(k) operations.

Proof: Each r_i appears only once in the product and the expression is linear in each r_i, hence it is multi-linear in \vec r. For r_i \in \set{0, 1} the values of w_i follow directly from the definition.

To show how this works, let's look at two simple examples: one interpolating the bit-count of the index, and one interpolating the vector \vec w = (0, 1, \dots, 2^k -1).

Example 19. (Popcount) We can construct a WOBDD for the MLE of w_i = \mathsf{popcount}(i) \pmod{\F_q} using permutation \pi = \mathsf{id} and matrices

\begin{align*} \mat M_{0\,b} &= \begin{bmatrix} 1 & b \end{bmatrix} & \mat M_{i\,b} &= \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} & \mat M_{k-1\,b} &= \begin{bmatrix} b \\ 1 \end{bmatrix} \text{.} \end{align*}

Example 20. (Counter) We can construct a WOBDD for the MLE of w_i = i \pmod{\F_q} using the bit-reversal permutation \pi(i) = k - i - 1 (i.e. reading the least significant bit first) and matrices

\begin{align*} \mat M_{0\,b} &= \begin{bmatrix} 2 & b \end{bmatrix} & \mat M_{i\,b} &= \begin{bmatrix} 2 & b \\ 0 & 1 \end{bmatrix} & \mat M_{k-1\,b} &= \begin{bmatrix} b \\ 1 \end{bmatrix} \text{.} \end{align*}
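To make Definition 17 concrete, here is a minimal WOBDD evaluator in plain Python, checked against the popcount program of Example 19 (which is insensitive to the variable order, so the permutation is taken as the identity; matrices are nested lists and `wobdd_eval` is an illustrative name):

```python
def mat_mul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def wobdd_eval(matrices, r):
    """Evaluate prod_i ((1 - r_i) M_i0 + r_i M_i1); matrices[i] = (M_i0, M_i1)."""
    acc = [[1]]  # 1x1 start, dimensions u_0 = u_k = 1
    for (M0, M1), ri in zip(matrices, r):
        M = [[(1 - ri) * a + ri * b for a, b in zip(ra, rb)]
             for ra, rb in zip(M0, M1)]
        acc = mat_mul(acc, M)
    return acc[0][0]

# Popcount WOBDD from Example 19, k = 4.
k = 4
def M(i, b):
    if i == 0:
        return [[1, b]]        # 1x2 initial row
    if i == k - 1:
        return [[b], [1]]      # 2x1 final column
    return [[1, b], [0, 1]]    # 2x2 middle: accumulate the bit

mats = [(M(i, 0), M(i, 1)) for i in range(k)]
for n in range(2 ** k):
    bits = [(n >> (k - 1 - j)) & 1 for j in range(k)]
    assert wobdd_eval(mats, bits) == bin(n).count("1")
```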

In Jolt there are many lookup tables for m-ary functions where the k-bit inputs a_0, a_1, \dots, a_{m-1} \in \range{0,2^k} are combined into an index i = \sum_{j \in [0,m)} a_j \cdot 2^{j \cdot k} into the weight vector \vec w \in \F_q^{2^{m \cdot k}}. The goal is then to find a succinct MLE for this \vec w. For boolean binary operations we can do this as follows:

Example 21. (Bitwise Operation) Given a boolean binary operation \ast such as AND, OR or XOR, we can construct a WOBDD for the \mathsf{MLE}_{2\cdot k} evaluation of

w_{\vec a \, \vec b} =\sum_{i \in \range{0,k}} 2^i \cdot \p{a_i \ast b_i} \pmod{\F_q} \text{.}

Pick interleaved permutation \pi = (a_0, b_0, a_1, b_1, \dots) and matrices

\begin{align*} \mat M_{0\,b} &= \begin{bmatrix} 1 & 0 & 1 - b & b \end{bmatrix} & \mat M_{2\cdot i\,b} &= \begin{bmatrix} 1 & 0 & (1 - b) & b \\ 0 & 1 & 0 & 0 \end{bmatrix} \\ \mat M_{2\cdot i+1\,b} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 2^i \cdot (b \ast 0) \\ 0 & 2^i \cdot (b \ast 1) \end{bmatrix} & \mat M_{2\cdot k-1\,b} &= \begin{bmatrix} 0 \\ 1 \\ 2^{k-1} \cdot (b \ast 0) \\ 2^{k-1} \cdot (b \ast 1) \end{bmatrix} \text{.}\\ \end{align*}

When we have WOBDDs for two vectors \vec w, \vec w' we can combine them in several ways to get WOBDDs for new vectors. The following lemmas show some useful constructions.

Lemma 22. (Linear Combinations) Given two \pi-WOBDDs for \vec w, \vec w' we can construct a \pi-WOBDD for \vec w'' where w''_i = \alpha \cdot w_i + \beta \cdot w'_i by taking block matrices

\begin{align*} \mat M_{0\,j}'' &= \begin{bmatrix} \mat M_{0\,j} & \mat M_{0\,j}' \end{bmatrix} & \mat M_{i\,j}'' &= \begin{bmatrix} \mat M_{i\,j} & 0 \\[4pt] 0 & \mat M_{i\,j}' \end{bmatrix} & \mat M_{k-1\,j}'' &= \begin{bmatrix} \alpha \cdot \mat M_{k-1\,j} \\[4pt] \beta \cdot \mat M_{k-1\,j}' \end{bmatrix} \text{.} \end{align*}

The width of the resulting WOBDD is the sum of the widths of the input WOBDDs.

Lemma 23. (Multiplication) Given two k-variable \pi-WOBDDs for \vec w, \vec w' we can construct a \pi-WOBDD for \vec w'' where w''_i = w_i \cdot w'_i by taking tensor product matrices

\mat M_{ij}'' = \mat M_{ij} \otimes \mat M_{ij}'\text{.}

The width of the resulting WOBDD is the product of the widths of the input WOBDDs.

Lemma 24. (Reversing) Given a WOBDD with permutation i_0, \dots, i_{k-1} we can construct an equivalent WOBDD of same width with the reversed permutation i_{k-1},\dots,i_0 by transposing matrices \mat M_{i\,b}' = \mat M_{(k-i-1)\,b}^{\mathsf{T}}.

Lemma 25. (Composition) Given a k-variable WOBDD and a k'-variable WOBDD we can construct a (k+k')-variable WOBDD for \vec w'' = \vec w \otimes \vec w', i.e. w''_{i \cdot 2^{k'} + j} = w_i \cdot w'_j, by concatenating the permutations and matrices.

Using these constructions we can e.g. build a lookup table for squaring by taking the counter from example 20 and multiplying it with itself using lemma 23.

We could generalize the WOBDD definition by replacing the permutation with a partition of the variables r_i into l subvectors \vec r_i \in \F_q^{k_i}, having 2^{k_i} matrices for each i and evaluating using the \mathrm{eq} polynomials:

\prod_{i \in [0,l)} \sum_{j \in [0, 2^{k_i})} \mathrm{eq}_{k_i\,j}(\vec r_i) \cdot \mathrm{M}_{ij}\text{.}

This allows for a time-space trade-off, especially if we can eliminate a large intermediate dimension u_i. For simplicity we will not consider this generalization further.

Transducers

To create more powerful manipulations of WOBDDs we introduce transducers, which are like weighted finite-state transducers (WFSTs) except that they have a layered structure. They also extend the read-once branching programs from HJR+25 and OR-OBDDs from Weg00.

Definition 26. A Non-deterministic Weighted Ordered Binary Decision Transducer is a matrix \mat T \in \F_q^{2^k \times 2^k} specified as follows:

  • There is an input permutation \pi on \range{0,k}.
  • There are layered sets of states \setn{S}_0,\dots,\setn{S}_k.
  • For each i \in \range{0,k} there is a transition weight function \delta_i : \setn{S}_i \times \set{0,1} \times \setn{S}_{i+1} \times \set{0,1} \rightarrow \F_q such that in state s on input bit b we can transition to state s' outputting bit b' with weight \delta_i(s, b, s', b').

Take \widehat{\setn{S}} = \setn{S}_0 \times \setn{S}_1 \times \cdots \times \setn{S}_k. Then the matrix \mat T is specified by summing over all state paths while accumulating weights multiplicatively,

\mat T_{i j} = \sum_{\vec s \in \widehat{\setn S}} \prod_{l \in \range{0, k}} \delta_l(s_l, \mathsf{bits}(i)_{\pi(l)}, s_{l+1}, \mathsf{bits}(j)_{\pi(l)}) \text{.}

The width of the transducer is \max_i |\setn{S}_i|. For a given \pi we call it a \pi-transducer.

Transducers can be used to transform WOBDDs such that each resulting weight is a linear combination of the original weights:

Lemma 27. (Transducer Action) Given a \pi-transducer \mat T and a \pi-WOBDD for \vec w we can construct a \pi-WOBDD for \vec w' = \mat T \cdot \vec w using the block matrices

(\mat M'_{i\, b})_{s\,s'} = \sum_{b' \in \set{0,1}} \delta_i(s, b, s', b') \cdot \mat M_{i\, b'}

where the initial and final dimension are collapsed to 1 by summation. The width of the resulting WOBDD is at most the product of the width of the original WOBDD and the width of the transducer.

Proof: For each path in T, the construction picks the matrices from the original program according to the output and multiplies them.

The non-determinism complicates things compared to the deterministic read-once branching programs (ROBP) from HJR+25, but it also makes them more powerful. For one, it allows us to reverse the permutation without increasing the width. Without non-determinism, many useful programs can only be expressed in either MSB or LSB order and we would not be able to combine them. The class of efficiently computable functions is also larger (see e.g. Weg00 thm 12.2.1). In terms of the matrix \mat T, adding the output bit allows us to go from diagonal matrices to (generalized) permutation matrices, and adding non-determinism allows us to go to dense matrices.

For convenience, we will talk about Most-Significant Bit first (MSB) and Least-Significant Bit first (LSB) transducers, where the input permutation is \pi(i) = i and \pi(i) = k-i-1 respectively. When the transducer is deterministic we specify transitions as a function \delta_i: \setn{S}_i \times \set{0,1} \rightarrow \setn{S}_{i+1} \times \set{0,1} \times \F_q. We also omit the output and weight when they equal the input and 1 respectively. We say we associate a weight to an initial/final state to mean applying the weight to each transition entering or leaving that state respectively.

To show the utility of this class, let's look at some simple examples of transducers. The first example does a MSB lexicographic comparison with a constant:

Example 28. (MSB Comparison) A comparison for c \in \range{0, 2^{k}} is constructed as an MSB-transducer with states \setn S_0 = \set{\mathtt{equal}}, \setn S_i = \set{\mathtt{less}, \mathtt{equal}, \mathtt{greater}} and transition function

\begin{align*} \delta_i(\mathtt{less}, b) &= \mathtt{less} \\[4pt] \delta_i(\mathtt{equal}, b) &= \begin{cases} \mathtt{less} & \text{if } b < \mathsf{bits}(c)_i \\ \mathtt{equal} & \text{if } b = \mathsf{bits}(c)_i \\ \mathtt{greater} & \text{if } b > \mathsf{bits}(c)_i \end{cases} \\[6pt] \delta_i(\mathtt{greater}, b) &= \mathtt{greater} \text{.} \end{align*}

We associate weights to the final state (\mathtt{less}, \mathtt{equal}, \mathtt{greater}) as (1, 0, 0), (1, 1, 0), (0, 1, 0), (1, 0 ,1), (0, 1, 1) or (0, 0, 1) to get the transducers for <, \leq, =, \neq, \geq, > respectively.
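Since this transducer is deterministic and leaves the bits unchanged, we can sanity-check it by simply running the automaton. A small simulation in plain Python (`compare_msb` is an illustrative name; with the big-endian \mathsf{bits} convention, step i reads the bit of weight 2^{k-1-i}):

```python
def compare_msb(k, c, n):
    """Run the MSB comparison automaton on the big-endian bits of n."""
    state = "equal"
    for i in range(k):
        b = (n >> (k - 1 - i)) & 1   # step i reads the bit of weight 2^(k-1-i)
        cb = (c >> (k - 1 - i)) & 1  # the matching bit of the constant c
        if state == "equal":
            if b < cb:
                state = "less"
            elif b > cb:
                state = "greater"
    return state

k, c = 4, 6
# The (1, 0, 0) final weighting selects exactly the indices n < c.
for n in range(2 ** k):
    lt_weight = 1 if compare_msb(k, c, n) == "less" else 0
    assert lt_weight == (1 if n < c else 0)
```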

This method is width 3. There is also a width 2 LSB-transducer using subtraction:

Example 29. (LSB Comparison) A comparison for c \in \range{0, 2^{k}} is constructed as an LSB-transducer with states \setn S_0 = \set{0}, \setn S_i = \set{0,1} and transition function

\delta_i(s, b) = \begin{cases} 1 & \text{if } b < s + \mathsf{bits}(c)_{k-i-1} \\[4pt] 0 & \text{otherwise} \\[4pt] \end{cases}

The weights associated with the final states can be set to (0, 1), (1, 0) to get the transducers for <, \geq respectively.

The next example shows the utility of the output function to permute the weights, in this case shift them by any amount:

Example 30. (Shift) For a cyclic rotation of the indices by an amount c \in \range{0,2^k} we construct an LSB-transducer with states \setn S_0 = \set{0}, \setn S_i = \set{0,1} representing the carry bit, and transition function \delta_i(s, b) = (s', b') where

\begin{align*} t &= s + b + \mathsf{bits}(c)_{k-i-1} \\[4pt] (s', b') &= (\floor{t / 2}, t \bmod 2) \text{.} \end{align*}

The weights associated with final states can then be set to (1, 1), (1,0), (0,1) for cyclic rotation, left shift, and right shift (of 2^k - c) respectively.

Combining these two examples we can move any contiguous range of weights from a WOBDD to anywhere in the weight vector and set the rest to zero. In Inner Product Commitment Schemes this allows us to pack many polynomials of various shapes into a single commitment and open them with appropriately shifted weights: to move the range of weights \range{0,c} to \range{d, c+d} and set all others to zero, we apply a less-than-c comparison followed by a shift by d. This expands the width by a factor of 4. Neither c nor d needs to be a power of two.

Another interesting example to show the expressiveness of transducers is division, which for an arbitrary (but small) c transforms the weights as w_i' = w_{\floor{i / c}} using width c.

Example 31. (Division) For a floored division of the indices by an amount c we construct an MSB-transducer with states \setn S_0 = \set{0}, \setn S_i = \range{0,c} and transition function \delta_i(s, b) = (s', b') where

\begin{align*} t &= 2 \cdot s + b \\ (s', b') &= (t \bmod c, \floor{t / c}) \text{.} \end{align*}

Finally we show the power of non-determinism to construct dense matrices \mat T in bounded width:

Example 32. (Hadamard Transform) For a Hadamard transform of the weights such that w_i' = \sum_{j \in [0, 2^k)} (-1)^{i \cdot j} \cdot w_j we construct a LSB-transducer with one state \setn S_i = \set{\circ} and transition weight function

\delta_i(\circ, b, \circ, b') = (-1)^{b \cdot b'} \text{.}

Example 33. (Prefix Sum) For a prefix sum of the weights such that w_i' = \sum_{j \in [0, i]} w_j we construct an MSB-transducer with states \setn S_0 = \set{\mathsf{eq}}, \setn S_i = \set{\mathsf{lt}, \mathsf{eq}} and transition weight function

\begin{align*} \delta_i(\mathsf{lt}, b, \mathsf{lt}, b') &= 1 \text{ for all } b, b' \\ \delta_i(\mathsf{eq}, b, \mathsf{eq}, b) &= 1 \text{ for all } b \\ \delta_i(\mathsf{eq}, 1, \mathsf{lt}, 0) &= 1 \\ \delta_i(s,b, s', b') &= 0 \text { otherwise.} \\ \end{align*}
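The matrix \mat T of Definition 26 can be built by brute force over all state paths, which lets us check this example: the prefix-sum transducer should give the lower-triangular all-ones matrix. A small sketch in plain Python (`transducer_matrix` is an illustrative name, with \pi the identity):

```python
from itertools import product

def transducer_matrix(k, states, delta):
    """Build T per Definition 26 with pi = identity (MSB order).
    delta(l, s, bi, s2, bj): bi is a bit of the row index i, bj of column j."""
    n = 2 ** k
    bit = lambda x, l: (x >> (k - 1 - l)) & 1  # big-endian bit l of x
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0
            for path in product(*states):  # states = [S_0, ..., S_k]
                weight = 1
                for l in range(k):
                    weight *= delta(l, path[l], bit(i, l), path[l + 1], bit(j, l))
                    if weight == 0:
                        break
                total += weight
            T[i][j] = total
    return T

# The prefix-sum transducer from Example 33.
k = 3
states = [["eq"]] + [["lt", "eq"]] * k

def delta(l, s, bi, s2, bj):
    if s == "lt" and s2 == "lt":
        return 1
    if s == "eq" and s2 == "eq" and bi == bj:
        return 1
    if s == "eq" and s2 == "lt" and bi == 1 and bj == 0:
        return 1
    return 0

T = transducer_matrix(k, states, delta)
# w'_i = sum_{j <= i} w_j, so T is lower-triangular all-ones.
assert T == [[1 if j <= i else 0 for j in range(2 ** k)] for i in range(2 ** k)]
```

The brute force over paths is exponential and only useful for testing; the point of Lemma 27 is that the transducer never needs to be materialized.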

Extension fields and embeddings

With these tools, we can tackle the problem from my post on "Embedding Inner Products". Recall the setup: we have an inner product commitment scheme.

Definition 34. An \F_q-Inner Product Commitment Scheme is a tuple of algorithms (\mathsf{commit}, \mathsf{open}, \mathsf{verify}) such that:

  • The \mathsf{commit} algorithm takes a vector \vec v \in \F_q^n and outputs a commitment C to \vec v.
  • The \mathsf{open} algorithm takes \vec v, C, a vector \vec w \in \F_q^n, and a value y \in \F_q and outputs a proof \pi that the inner product \vec v \cdot \vec w = \sum_{i \in [0,n)} v_i \cdot w_i equals y.
  • The \mathsf{verify} algorithm takes C, \vec w, y, \pi and outputs true if the proof is valid and false otherwise.

In Ligerito and WHIR the running time of \mathsf{verify} is O(\log n) iff there is a succinct MLE for \vec w. Those schemes also require that \F_q is cryptographically large, which we can achieve by taking an extension field \F_{q^d} and an embedding:

Definition 35. An embedding of \F_q in \F_{q^d} is given by a \vec c \in \F_{q^d}^p such that the embedding map \mathsf{embed}_{\vec c}: \F_q^n \rightarrow \F_{q^d}^{n'}: \vec w \mapsto \vec w' with n' = \ceil{n / p} is

w_i' = \sum_{j \in [0, p)} c_j \cdot w_{i \cdot p + j} \text{.}

Furthermore, when p = d we call it a packed embedding and when \vec c = [1] we call it the natural embedding.
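The embedding map itself is straightforward to sketch. Here extension field elements are represented as length-d coefficient lists over \F_q, a toy representation chosen for illustration (only \F_q-linear operations appear, so the choice of modulus polynomial is irrelevant):

```python
q, d = 7, 2  # toy parameters; elements of F_{q^d} are length-d coefficient lists

def embed(c, w, p):
    """w'_i = sum_j c_j * w_{i*p+j}, with out-of-range entries of w taken as 0."""
    out = []
    for i in range((len(w) + p - 1) // p):  # ceil(n / p) output entries
        acc = [0] * d
        for j in range(p):
            wij = w[i * p + j] if i * p + j < len(w) else 0
            acc = [(a + cj * wij) % q for a, cj in zip(acc, c[j])]
        out.append(acc)
    return out

# The natural embedding (p = 1, c = [1]) just lifts each entry.
assert embed([[1, 0]], [1, 2, 3, 4, 5], 1) == [[1, 0], [2, 0], [3, 0], [4, 0], [5, 0]]
# A packed embedding with p = d = 2 and c = (1, beta) pairs up entries.
assert embed([[1, 0], [0, 1]], [1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5, 0]]
```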

We then apply embeddings to \vec v and \vec w, use an \F_{q^d}-inner product commitment scheme, and extract the \F_q result from the \F_{q^d} inner product. With careful choice of embeddings this can work.

Lemma 36. (Embedding Inner Products) For towers of extensions with suitable moduli, there exist packed embeddings \vec c, \vec c' such that for any \vec v, \vec w \in \F_q^n we have

\vec v \cdot \vec w = \mathsf{unembed}( \mathsf{embed}_{\vec c}(\vec v) \cdot \mathsf{embed}_{\vec c'}(\vec w)) \text{.}

where \mathsf{unembed}: \F_{q^d} \rightarrow \F_q simply takes the constant coefficient.

This raises the question: If we have a succinct MLE for \vec w \in \F_q^n, can we get a succinct MLE for \vec w' = \mathsf{embed}_{\vec c}(\vec w) \in \F_{q^d}^{n'}? It turns out we often can, using transducers.

Example 37. (Folding Transducer) For a folding of the weights such that w'_i = \sum_{j \in [0, p)} c_j \cdot w_{i \cdot p + j} we can construct an LSB-transducer with states \setn S_i = \range{0,p} and transition function \delta_i(s, b) = (s', b') where

\begin{align*} t &= s + p \cdot b \\ (s', b') &= (t \bmod 2, \floor{t / 2}) \text{.} \end{align*}

where the weights for the initial states are (c_0, \dots, c_{p-1}) and the weights for the final states are (1, 0, \dots, 0). This transducer has width p.

With this transducer, we can answer the question above positively for MSB- and LSB-WOBDDs:

Lemma 38. (Embedding Succinct MLEs) Given an embedding \vec c \in \F_{q^d}^p and an MSB- or LSB-WOBDD for \vec w \in \F_q^n we can construct a WOBDD for \vec w' = \mathsf{embed}_{\vec c}(\vec w) with width at most p times the width of the original WOBDD.

Proof: Reverse the original WOBDD if needed to make it LSB. Lift the matrices from \F_q to \F_{q^d} (i.e. take the natural embedding), then apply the folding transducer above.

This solves it for inner product commitments, but often there are more parts to a protocol and these may not necessarily allow for packed embeddings. For example if we have a naturally embedded sumcheck round, we may need to evaluate \mathsf{MLE}(\vec v', \vec r) for some \vec r \in \F_{q^d}^k where \vec v' \in \F_q^{2^k} is the natural embedding of \vec v.

We can do this using an inner product with MLE evaluation weights for \vec r (see example 11). But now we have an inner product between a vector \vec v over \F_q and a weights vector \vec w over \F_{q^d}, while we can only do inner products over \F_q. We can easily solve this by taking any basis \beta_0, \dots, \beta_{d-1} of \F_{q^d} over \F_q and expressing the \F_{q^d} inner product as a sum of d \F_q inner products, each with a different set of weights \vec w^{(l)}. These are the coefficients of \vec w in the basis, such that \vec w = \sum_{l \in \range{0,d}} \beta_l \cdot \vec w^{(l)} and \vec v \cdot \vec w = \sum_{l \in \range{0,d}} \beta_l \cdot \p{\vec v \cdot \vec w^{(l)}}. But this raises the question:

If we have a succinct MLE for \vec w \in \F_{q^d}^n and some basis \vec \beta of \F_{q^d} over \F_q, can we get succinct MLEs for each of the d coefficient vectors \vec w^{(l)} \in \F_q^n?

It turns out we can, and in fact for any WOBDD:

Lemma 39. (Lowering) Given an \F_{q^d} WOBDD and a basis \beta_0, \dots, \beta_{d-1} of \F_{q^d} we can construct WOBDDs for each of the d coefficient vectors \vec w^{(l)} \in \F_q^{2^k} such that w_i = \sum_{l \in [0,d)} w_i^{(l)} \cdot \beta_l. Take the original permutation and block matrices

(\mat M'_{i\,b})_{u\,v} = \mathsf{mul}_{\vec \beta}((\mat M_{i\,b})_{u\,v})

where \mathsf{mul}_{\vec \beta}(t) is the matrix of the \F_q-linear map x \mapsto t \cdot x in the basis \vec \beta. The initial matrix is multiplied by the \vec \beta-representation of 1, and the final matrix is split into d column vectors; each choice of final column yields the WOBDD for one of the coefficient vectors. The new WOBDDs have d times the original width.

It makes sense to extend the definition of WOBDDs to allow vector results, so that we can skip selecting a final column and compute all d evaluations in one pass. If we do this, the evaluation is optimal: it performs exactly the operations the original WOBDD would have done, just expressed in the basis. To use it in the packing above, we still need to apply the folding transducer, which multiplies the width by p again. It also lifts all intermediate values to \F_{q^d}, though the matrices remain in \F_q. On the other hand, we gain some efficiency because packing requires \floor{\log_2 p} fewer variables.
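Both the basis decomposition of the inner product and the \mathsf{mul}_{\vec \beta} construction of Lemma 39 can be sanity-checked in plain Python on a toy field. Below I use \F_{49} = \F_7[\i] with \i^2 = -1 and basis (1, \i); these parameters are an illustrative choice of mine, not from the text, and the sketch is independent of the Sage classes in the appendix.

```python
# Toy check, over F_49 = F_7[i] with i^2 = -1 and basis (1, i), of
#  (a) the decomposition v . w = sum_l beta_l * (v . w^(l)), and
#  (b) mul_beta(t) being the matrix of x -> t*x in that basis.
P = 7

def f49_mul(x, y):
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def f49_add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def inner(v, w):
    # v has F_7 entries (embedded as (v_i, 0)); w has F_49 entries.
    acc = (0, 0)
    for vi, wi in zip(v, w):
        acc = f49_add(acc, f49_mul((vi % P, 0), wi))
    return acc

# (a) The F_49 inner product splits into d = 2 inner products over F_7.
v = [3, 1, 4, 1]                      # F_7 vector
w = [(2, 5), (0, 6), (1, 1), (3, 0)]  # F_49 vector
w0 = [a for (a, _) in w]              # coefficients of beta_0 = 1
w1 = [b for (_, b) in w]              # coefficients of beta_1 = i
lhs = inner(v, w)
rhs = (sum(x * y for x, y in zip(v, w0)) % P,
       sum(x * y for x, y in zip(v, w1)) % P)
assert lhs == rhs

# (b) mul_beta(t): row l holds the coordinates of t * beta_l.
def mul_beta(t):
    a, b = t
    return [[a % P, b % P], [(-b) % P, a % P]]  # t*1 = (a,b), t*i = (-b,a)

def vecmat(x, m):
    return [sum(xi * m[i][j] for i, xi in enumerate(x)) % P for j in range(2)]

t, y = (3, 5), (2, 6)
# Acting on coefficient vectors with mul_beta(t) agrees with field multiplication.
assert vecmat(list(y), mul_beta(t)) == list(f49_mul(t, y))
```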

Further Thoughts and Questions


References

Sage Implementation

def bits(F, k, i):
    assert i < 2^k
    return vector(F, reversed(ZZ(i).digits(2, padto=k)))
def unbits(bits):
    return sum(Integer(b) * 2^(len(bits) - i - 1) for i, b in enumerate(bits))
def msb(k):
    P = Permutations(range(k))
    return P(range(k))
def lsb(k):
    P = Permutations(range(k))
    return P(range(k)[::-1])
def eq(k, i, r):
    if i >= 2^k:
        raise ValueError(f"index i={i} is not less than 2^{k} = {2^k}")
    if not isinstance(r, sage.structure.element.Vector):
        raise TypeError("r is not a vector")
    if len(r) != k:
        raise ValueError(f"length of r is not k = {k}")

    F = r.base_ring()
    result = F.one()
    for j in range(k):
        if i & (1 << (k - j - 1)) == 0:
            result *= F.one() - r[j]
        else:
            result *= r[j]
    return result
def mle(w, x):
    if not isinstance(w, sage.structure.element.Vector):
        raise TypeError("w is not a vector")
    if not isinstance(x, sage.structure.element.Vector):
        raise TypeError("x is not a vector")
    if w.base_ring() != x.base_ring():
        raise ValueError("w and x are not over the same field")
    if len(w) != 2^len(x):
        raise ValueError("length of w is not 2^len(x)")

    k = len(x)
    def rec(lo, hi, i):
        if i == k:
            return w[lo]
        mid = (lo + hi) // 2
        u0 = rec(lo, mid, i + 1)
        u1 = rec(mid, hi, i + 1)
        return u0 + x[i] * (u1 - u0)

    return rec(0, len(w), 0)

Class WOBDD

class WOBDD:
    def __init__(self, field, k, permutation, matrices):
        if not permutation in Permutations(range(k)):
            raise TypeError(f"{permutation} not in {Permutations(range(k))}")

        # Make sure matrix set size is correct
        if not isinstance(matrices, list):
            raise TypeError(f"matrices is not a list of list of matrices")
        if len(matrices) != k:
            raise ValueError(f"Number of matrix sets does not equal {k}")
        u = 1
        v = None
        for (i, mis) in enumerate(matrices):
            if not isinstance(mis, list) or len(mis) != 2:
                raise TypeError(f"matrices[{i}] is not a pair of matrices")
            for (j, m) in enumerate(mis):
                if not isinstance(m, sage.structure.element.Matrix):
                    raise TypeError(f"matrices[{i}][{j}] is not a matrix")
                if m.base_ring() != field:
                    raise TypeError(f"matrices[{i}][{j}] is not a matrix over the field")
                v = v or m.ncols()
                if i == k - 1:
                    v = 1
                if m.dimensions() != (u, v):
                    raise TypeError(f"matrices[{i}][{j}] is not of dimension {u} x {v}")
            (u, v) = (v, None)

        self.field = field
        self.k = k
        self.permutation = permutation
        self.matrices = matrices

    def random(field, k, permutation=None, max_width=5):
        permutation = permutation or Permutations(range(k)).random_element()
        u = [1] + [randint(1, max_width) for _ in range(k-1)] + [1]
        matrices = []
        for i in range(k):
            V = MatrixSpace(field, u[i], u[i+1])
            matrices.append([V.random_element(), V.random_element()])
        return WOBDD(field, k, permutation, matrices)

    def __repr__(self):
        return f"WOBDD width={self.width()} over {VectorSpace(self.field, self.k)}"

    def __call__(self, r):
        if isinstance(r, Integer) or isinstance(r, int):
            r = bits(self.field, self.k, r)
        if not isinstance(r, sage.structure.element.Vector):
            raise TypeError("r is not a vector")
        if r.base_ring() != self.field:
            raise ValueError("r is not over the field")
        r = [r[self.permutation[i]] for i in range(self.k)]
        M = matrix(self.field, [1])
        for ri, [Mi0, Mi1] in zip(r, self.matrices):
            M *= (1 - ri) * Mi0 + ri * Mi1
        return M[0,0]

    def width(self):
        return max(m.nrows() for ms in self.matrices for m in ms)

    def weights(self):
        return vector(self.field, [self(i) for i in range(2^self.k)])

    def assert_compatible(self, other):
        if self.field != other.field:
            raise ValueError("Not in same field")
        if self.permutation != other.permutation:
            raise ValueError("Different permutation")

    def is_compatible(self, other):
        try:
            self.assert_compatible(other)
        except:
            return False
        return True

    def lift(self, field):
        if not field.has_coerce_map_from(self.field):
            raise ValueError(f"Can not lift from {self.field} to {field}")
        matrices = [[m.change_ring(field) for m in ms] for ms in self.matrices]
        return WOBDD(field, self.k, self.permutation, matrices)

    def lower(self, F, i):
        (V, v_to_f, f_to_v) = self.field.vector_space(F)
        basis = [v_to_f(x) for x in V.basis()]

        def mul(x):
            return matrix([f_to_v(x * beta) for beta in basis])

        def lower_matrix(matrix):
            return block_matrix([[mul(x) for x in row] for row in matrix.rows()])

        matrices = [[lower_matrix(m) for m in ms] for ms in self.matrices]
        e = f_to_v(1)
        matrices[0] = [matrix(e * m) for m in matrices[0]]
        e = V.basis()[i]
        matrices[-1] = [matrix(m * e).T for m in matrices[-1]]
        return WOBDD(F, self.k, self.permutation, matrices)

    def __add__(self, other):
        if not isinstance(other, WOBDD):
            return NotImplemented
        self.assert_compatible(other)
        matrices = [
            [block_matrix([[m1, m2]]) for (m1, m2) in zip(self.matrices[0], other.matrices[0])],
        ]
        for i in range(1, len(self.matrices) - 1):
            matrices += [
                [block_matrix([[m1, 0], [0, m2]]) for (m1, m2) in zip(self.matrices[i], other.matrices[i])],
            ]
        matrices += [
            [block_matrix([[m1], [m2]]) for (m1, m2) in zip(self.matrices[-1], other.matrices[-1])],
        ]
        return WOBDD(self.field, self.k, self.permutation, matrices)

    def __mul__(self, other):
        # TODO: Scalar multiplication.
        if not isinstance(other, WOBDD):
            return NotImplemented
        self.assert_compatible(other)
        matrices = [
            [m1.tensor_product(m2) for (m1, m2) in zip(m1s, m2s)]
            for (m1s, m2s) in zip(self.matrices, other.matrices)
        ]
        return WOBDD(self.field, self.k, self.permutation, matrices)

    def reverse(self):
        P = Permutations(range(self.k))
        permutation = P(self.permutation[::-1])
        matrices = []
        for mis in self.matrices[::-1]:
            matrices += [[m.transpose() for m in mis]]

        return WOBDD(self.field, self.k, permutation, matrices)

WOBDD Examples

def popcount(F, k):
    matrices = (
        [[Matrix(F, [[1, b]]) for b in [0,1]]] +
        [[Matrix(F, [
            [1, b],
            [0, 1]
        ]) for b in [0,1]] for i in range(1,k-1)] +
        [[Matrix(F, [[b], [1]]) for b in [0,1]]])
    return WOBDD(F, k, lsb(k), matrices)
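The popcount chain can be sanity-checked without Sage by multiplying out the matrices over the integers at a Boolean point: evaluating at the bits of i should give popcount(i). A minimal Python sketch (my own restatement of the matrix chain above, assuming k >= 2):

```python
# Plain-Python sanity check of the popcount WOBDD matrix chain over the integers.
def popcount_chain(k, bits_lsb_first):
    # Row vector after the first (1 x 2) matrix [1, b].
    b = bits_lsb_first[0]
    row = [1, b]
    # Middle (2 x 2) matrices [[1, b], [0, 1]] accumulate the bit sum.
    for b in bits_lsb_first[1:-1]:
        row = [row[0], row[0] * b + row[1]]
    # Final (2 x 1) matrix [[b], [1]].
    b = bits_lsb_first[-1]
    return row[0] * b + row[1]

def eval_at(k, i):
    bits_lsb = [(i >> j) & 1 for j in range(k)]  # LSB variable order
    return popcount_chain(k, bits_lsb)

assert all(eval_at(5, i) == bin(i).count("1") for i in range(32))
```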
def counter(F, k, start=0):
    matrices = (
        [[Matrix(F, [[2, start + b]]) for b in [0,1]]] +
        [[Matrix(F, [
            [2, b],
            [0, 1]
        ]) for b in [0,1]] for i in range(1,k-1)] +
        [[Matrix(F, [[b], [1]]) for b in [0,1]]])
    return WOBDD(F, k, lsb(k), matrices)
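The counter chain admits the same kind of check: the row vector carries (2^j, start + partial sum of bits), so evaluating at the bits of i yields start + i. A plain-Python sketch (again my own restatement, assuming k >= 2):

```python
# Plain-Python sanity check of the counter WOBDD chain over the integers:
# evaluating at the bits of i (LSB variable order) yields start + i.
def counter_eval(k, start, i):
    b = i & 1
    row = [2, start + b]                      # first (1 x 2) matrix [2, start + b]
    for j in range(1, k - 1):                 # middle matrices [[2, b], [0, 1]]
        b = (i >> j) & 1
        row = [2 * row[0], row[0] * b + row[1]]
    b = (i >> (k - 1)) & 1                    # final (2 x 1) matrix [[b], [1]]
    return row[0] * b + row[1]

assert all(counter_eval(5, 7, i) == 7 + i for i in range(32))
```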
def bitwise(F, k, op):
    permutation = Permutations(range(2 * k))([j for i in range(k) for j in [i, i+k]])
    matrices = []
    matrices.append([Matrix(F, [[1, 0, 1-b, b]]) for b in [0,1]])
    for i in range(k-1):
        matrices.append([Matrix(F, [
            [1, 0, 0, 0],
            [0, 1, 2^(k - i - 1) * op(b, 0), 2^(k - i - 1) * op(b, 1)]
        ]).T for b in [0,1]])
        matrices.append([Matrix(F, [
            [1, 0, 1-b, b],
            [0, 1, 0, 0],
        ]) for b in [0,1]])
    i = k - 1
    matrices.append([Matrix(F, [[0, 1, 2^(k - i - 1) * op(b, 0), 2^(k - i - 1) * op(b, 1)]]).T for b in [0,1]])
    return WOBDD(F, 2*k, permutation, matrices)

Class Transducer

class Transducer:
    def __init__(self, field, k, permutation, initial_states, transitions):
        if not permutation in Permutations(range(k)):
            raise TypeError(f"{permutation} not in {Permutations(range(k))}")
        # Expand repeated rules to lists
        if not isinstance(transitions, list):
            transitions = [transitions] * k
        if not len(transitions) == k:
            raise TypeError(f"len(transitions) is {len(transitions)} not {k}")

        # Validate, extract reachable states by layer, and relabel states to [0, len(states))
        states = [len(initial_states)]
        labels = initial_states
        transition_relation = []
        for i, layer in enumerate(transitions):
            # Expand functions to relations
            if callable(layer):
                relation = []
                for s in range(states[-1]):
                    for b in [0, 1]:
                        [next_state, output, weight] = layer(i, labels[s], b)
                        relation.append([s, b, next_state, output, weight])
                layer = relation
            else:
                layer = [[labels.index(s), b, ns, o, w] for [s,b,ns,o,w] in layer if s in labels]
            for [s, b, ns, o, w] in layer:
                assert b in [0,1]
                assert o in [0,1]
            # Remove zero-weight transitions
            layer = [[s, b, ns, o, w] for [s,b,ns,o,w] in layer if w != 0]
            labels = list(set(t[2] for t in layer))
            states.append(len(labels))
            transition_relation.append([[s, b, labels.index(n), o, w] for [s,b,n,o,w] in layer])
        # TODO: Backwards reachability pass, general minimization, pow-2 tricks

        self.field = field
        self.k = k
        self.permutation = permutation
        self.states = states
        self.initial_states = initial_states
        self.transition = transition_relation

    def __repr__(self):
        return f"Transducer on {self.k} bits of width {self.width()}"

    def width(self):
        return max(self.states)

    def paths(self, input):
        if isinstance(input, Integer) or isinstance(input, int):
            input = bits(GF(2), self.k, input)
        if not isinstance(input, sage.structure.element.Vector):
            raise TypeError("input is not a vector")
        for ri in input:
            if ri != 0 and ri != 1:
                raise ValueError("input is not a bitvector")
        input = [input[self.permutation[i]] for i in range(self.k)]

        # States were relabeled to integer indices in __init__, so paths must
        # start from the index positions of the initial states, not their labels.
        paths = [(s, [], self.field(1)) for s in range(len(self.initial_states))]
        output = []
        for bit, layer in zip(input, self.transition):
            next_paths = []
            for (state, outputs, weight) in paths:
                for [s, b, ns, o, w] in layer:
                    if b != bit or s != state:
                        continue
                    next_paths.append((ns, outputs + [o], weight * w))
            paths = next_paths

        def unpermute(l):
            result = [None] * self.k
            for i in range(self.k):
                result[self.permutation[i]] = l[i]
            return result

        return [(unpermute(o), w) for (s, o, w) in paths]

    def matrix(self):
        m = matrix(self.field, 2^self.k, 2^self.k)
        for i in range(2^self.k):
            for (j, w) in self.paths(i):
                m[i, unbits(j)] = w
        return m

    def action(self, wobdd):
        if not isinstance(wobdd, WOBDD):
            raise TypeError("wobdd is not a WOBDD")
        if self.permutation != wobdd.permutation:
            raise TypeError("wobdd is not compatible")
        # TODO: This fails in the degenerate case where there are zero states.

        matrices = []
        for (i, (transitions, Mi)) in enumerate(zip(self.transition, wobdd.matrices)):
            u, v = Mi[0].dimensions()
            blocks = [[[matrix(wobdd.field, u, v) for _ in range(self.states[i + 1])] for _ in range(self.states[i])] for _ in range(2)]
            for [s, b, ns, o, w] in transitions:
                blocks[b][s][ns] += w * Mi[o]
            mi = [block_matrix(m) for m in blocks]
            if i == 0:
                mi = [matrix(sum(M.rows())) for M in mi]
            elif i == self.k - 1:
                mi = [matrix(sum(M.columns())).T for M in mi]

            matrices.append(mi)
        return WOBDD(wobdd.field, self.k, self.permutation, matrices)

Transducer Examples

def less_than(F, k, c):
    permutation = lsb(k)
    initial_states = [0]
    c = bits(ZZ, k, c)[::-1]
    def transition(i, state, input_bit):
        next_state = 1 if input_bit < state + c[i] else 0
        output_bit = input_bit
        weight = 1 if i < k-1 else next_state
        return (next_state, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
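The less_than transducer implements the borrow chain of the subtraction i - c: scanning LSB first, the state is the borrow, and the final borrow is 1 exactly when i < c. A plain-Python trace of the same transition rule, independent of the Transducer class:

```python
# Plain-Python trace of the borrow automaton behind less_than:
# scanning LSB first, borrow' = 1 iff input_bit < borrow + c_bit,
# and the final borrow equals [i < c].
def less_than_trace(k, c, i):
    borrow = 0
    for j in range(k):  # LSB-first
        c_bit = (c >> j) & 1
        i_bit = (i >> j) & 1
        borrow = 1 if i_bit < borrow + c_bit else 0
    return borrow

assert all(less_than_trace(6, c, i) == (1 if i < c else 0)
           for c in range(64) for i in range(64))
```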
def shift(F, k, c, kind='left'):
    if kind == 'left':
        weights = [1,0]
    elif kind == 'right':
        c = 2^k - c
        weights = [0,1]
    elif kind == 'cyclic':
        weights = [1,1]
    else:
        raise ValueError(f"Unknown kind {kind}")

    permutation = lsb(k)
    initial_states = [0]
    c = bits(ZZ, k, c)[::-1]
    def transition(i, s, b):
        t = s + b + c[i]
        next_state = t // 2
        output_bit = t % 2
        weight = weights[next_state] if i == k - 1  else 1
        return (next_state, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
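The shift transducer is bit-serial addition of the constant c: LSB first, the state is the carry, so input index i is routed to output index i + c mod 2^k, and the final weights decide how the wrap-around (carry-out) path is counted. A plain-Python trace of the carry chain, independent of the Transducer class:

```python
# Plain-Python trace of the carry automaton behind shift: LSB-first addition
# of the constant c routes input index i to output index (i + c) % 2^k,
# with the final carry telling whether the sum wrapped.
def shift_trace(k, c, i):
    carry, out = 0, 0
    for j in range(k):  # LSB-first
        t = carry + ((i >> j) & 1) + ((c >> j) & 1)
        carry, out = t // 2, out | ((t % 2) << j)
    return out, carry

assert all(shift_trace(5, 11, i) == ((i + 11) % 32, (i + 11) // 32)
           for i in range(32))
```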
def division(F, k, c):
    permutation = msb(k)
    initial_states = [0]
    def transition(i, state, input_bit):
        t = 2 * state + input_bit
        next_state = t % c
        output_bit = t // c
        weight = 1
        return (next_state, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
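The division transducer is textbook long division by the constant c, MSB first: the state is the running remainder and the output bits are the bits of the quotient. A plain-Python trace, independent of the Transducer class:

```python
# Plain-Python trace of the long-division automaton behind division:
# MSB first, the state is the running remainder mod c and the outputs
# are the bits of the quotient i // c.
def division_trace(k, c, i):
    rem, quot = 0, 0
    for j in reversed(range(k)):  # MSB-first
        t = 2 * rem + ((i >> j) & 1)
        rem, q_bit = t % c, t // c  # t < 2c, so q_bit is 0 or 1
        quot = (quot << 1) | q_bit
    return quot, rem

assert all(division_trace(6, 5, i) == (i // 5, i % 5) for i in range(64))
```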
def hadamard(F, k):
    permutation = lsb(k)
    initial_states = ['s']
    transitions = []
    for i in range(k):
        transitions.append([
            ('s', 0, 's', 0, F(1)),
            ('s', 0, 's', 1, F(1)),
            ('s', 1, 's', 0, F(1)),
            ('s', 1, 's', 1, F(-1))
        ])
    return Transducer(F, k, permutation, initial_states, transitions)
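The hadamard transducer assigns weight -1 exactly when the input and output bit are both 1, so the induced matrix is the Sylvester–Hadamard matrix with entries (-1)^{\mathsf{popcount}(i \wedge j)}. A plain-Python sketch of that identity (independent of the Sage classes):

```python
# The Sylvester recursion H_{2n} = [[H, H], [H, -H]] produces the same
# matrix as the per-bit rule: entry (i, j) is (-1)^popcount(i & j).
def sylvester(k):
    H = [[1]]
    for _ in range(k):
        H = [r + r for r in H] + [r + [-x for x in r] for r in H]
    return H

H = sylvester(3)
assert all(H[i][j] == (-1) ** bin(i & j).count("1")
           for i in range(8) for j in range(8))
```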
def prefix_sum(F, k):
    permutation = msb(k)
    initial_states = ['eq']
    transitions = []
    for i in range(k):
        transitions.append([
            ('lt', 0, 'lt', 0, F(1)),
            ('lt', 0, 'lt', 1, F(1)),
            ('lt', 1, 'lt', 0, F(1)),
            ('lt', 1, 'lt', 1, F(1)),
            ('eq', 0, 'eq', 0, F(1)),
            ('eq', 1, 'lt', 0, F(1)),
            ('eq', 1, 'eq', 1, F(1)),
        ])
    return Transducer(F, k, permutation, initial_states, transitions)
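A plain-Python enumeration of the automaton's paths confirms the intent: from state 'eq', scanning MSB first, input i reaches exactly the outputs j \le i, so the induced matrix is lower-triangular all-ones, i.e. a prefix sum. This sketch mirrors the transition table above without the Transducer class:

```python
# Plain-Python enumeration of the prefix_sum automaton's paths: scanning MSB
# first from state 'eq', the reachable outputs j for input i are exactly j <= i.
def prefix_outputs(k, i):
    outs = set()
    def walk(j, pos, state):
        if pos < 0:
            outs.add(j)
            return
        bit = (i >> pos) & 1
        if state == 'lt':                        # ('lt', b, 'lt', 0/1)
            walk(j, pos - 1, 'lt')
            walk(j | (1 << pos), pos - 1, 'lt')
        elif bit == 0:                           # ('eq', 0, 'eq', 0)
            walk(j, pos - 1, 'eq')
        else:                                    # ('eq', 1, 'lt', 0) / ('eq', 1, 'eq', 1)
            walk(j, pos - 1, 'lt')
            walk(j | (1 << pos), pos - 1, 'eq')
    walk(0, k - 1, 'eq')
    return outs

assert all(prefix_outputs(5, i) == set(range(i + 1)) for i in range(32))
```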
def fold(F, k, coeffs):
    p = len(coeffs)
    permutation = lsb(k)
    initial_states = range(p)
    def transition(i, carry, input_bit):
        t = carry + p * input_bit
        next_carry = t // 2
        output_bit = t % 2
        weight = coeffs[carry] if i == 0 else 1
        if i == k - 1:
            weight = 1 if next_carry == 0 else 0
        return (next_carry, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
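The fold transducer is bit-serial multiplication by p: starting from carry l (with initial weight coeffs[l]), the path on input i ends at output index p·i + l, and the final weight kills overflowing paths. This is exactly the index map of the packing, w'_i = \sum_l coeffs[l] \cdot w_{p \cdot i + l}. A plain-Python trace of the carry chain, independent of the Transducer class:

```python
# Plain-Python trace of the carry automaton behind fold: starting from carry l,
# LSB-first processing of the bits of i performs bit-serial multiplication by p,
# so the path ends at output index p*i + l (killed by the final weight if that
# index overflows k bits, i.e. the final carry is nonzero).
def fold_trace(k, p, l, i):
    carry, out = l, 0
    for j in range(k):  # LSB-first
        t = carry + p * ((i >> j) & 1)
        carry, out = t // 2, out | ((t % 2) << j)
    return out, carry  # carry == 0 iff no overflow

assert all(fold_trace(6, 3, l, i) == ((3 * i + l) % 64, (3 * i + l) // 64)
           for l in range(3) for i in range(64))
```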

Remco Bloemen
Math & Engineering
https://2π.com