Succinct Multi-Linear Extensions

\gdef\p#1{\left({#1}\right)} \gdef\ceil#1{\left\lceil{#1}\right\rceil} \gdef\floor#1{\left\lfloor{#1}\right\rfloor} \gdef\norm#1{\left|{#1}\right|} \gdef\set#1{\left\{#1\right\}} \gdef\range#1{\left[#1\right)} \gdef\setn#1{\mathcal{#1}} \gdef\vec#1{\mathbf{#1}} \gdef\mat#1{\mathrm{#1}} \gdef\F{\mathbb{F}} \gdef\Z{\mathbb{Z}} \gdef\vd{𝛅} \gdef\i{\mathrm{i}} \gdef\j{\mathrm{j}} \gdef\om{\mathrm{\omega}}

Many interactive proof systems rely on multi-linear extensions (MLEs) that can be efficiently evaluated, e.g. Jolt (NTZ25 appendix A, see also the source code for many examples), SuperSpartan (STW23 section 5), or GKR (Tha13, Tha23 section 4.6.6).

Despite their importance, it is not easy to come up with specific efficient MLEs, or even to get a sense of what is possible. At least, that was the case for me until I read HJR+25 "Jagged Polynomial Commitments" and played with the ideas presented there. It turns out a lot more is possible than I initially expected! In the process I learned how to construct, combine, and transform succinct MLEs, which is what this article covers.

So, time to get good at succinct MLEs!

Multi-Linear Extensions

Here follows a brief review of multi-linear extensions (MLEs) and their properties. For a more extensive treatment see Tha23 § 3.5.

Denote with \mathsf{bits}_k: \mathbb{N} \rightarrow \F_q^k the map sending i to the bitvector \vec b \in \{0,1\}^k such that i = \sum_{j \in [0,k)} b_j \cdot 2^{k-j-1}, i.e. in big-endian order. We will leave the field \F_q implicit and often drop the subscript k when it is clear from context.

Definition 1. A k-variable Multi-Linear Extension (MLE) evaluation is a map

\mathsf{MLE}_k: \F_q^{2^k} \times \F_q^k \rightarrow \F_q

such that \mathsf{MLE}_k(\vec w, \mathsf{bits}(i)) = w_i for all i \in [0,2^k) and \mathsf{MLE}_k(\vec w, \vec r) is a polynomial in \vec r of degree at most 1 in each variable.

Now we show this map is unique and present several equivalent definitions.

Lemma 2. The MLE evaluation map is unique and equivalent to the recursion

\begin{align*} \mathsf{MLE}_0([w_0], []) &= w_0 \\[4pt] \mathsf{MLE}_{k+1}([\vec w_0 \, \vec w_1], [r_0 \, \vec r]) &= u_0 + r_0 \cdot (u_1 - u_0) \\ \text{ where } u_i &= \mathsf{MLE}_k(\vec w_i, \vec r) \text{.} \end{align*}

Proof: Uniqueness follows from the fact that a polynomial of degree at most 1 in each of the k variables has at most 2^k coefficients, which are pinned down by the 2^k interpolation points.

Multi-linearity in \vec r: The base case is trivial. For the inductive case, the expression is a linear combination of \mathsf{MLE}s that are multi-linear in the \vec r variables, and it is linear in the new variable r_0.

Interpolation: The base case is trivial. For the inductive case, observe that for b \in \{0,1\}, i \in [0,2^k) we have

\mathsf{MLE}_{k+1}([\vec w_0 \, \vec w_1], [b \, \mathsf{bits}_k(i)]) = \mathsf{MLE}_k(\vec w_b, \mathsf{bits}_k(i)) = w_{b, i}

and since concatenation results in [\vec w_0 \, \vec w_1]_{b \cdot 2^k + i} = w_{b,i} and \mathsf{bits}_{k+1}(b \cdot 2^k + i) = [b\, \mathsf{bits}_k(i)], the interpolation property holds.

This definition can be evaluated using 2^k - 1 field multiplications (and as many additions and subtractions). This makes it an O(n) algorithm where n = 2^k. Since in general all n coefficients of \vec w are needed to compute the result, O(n) is optimal. But as we will see, if \vec w has some special structure, we can do better.
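As a concrete reference, here is a minimal sketch of the recursion from Lemma 2 in plain Python (the Sage implementation at the end of this article is the fuller version; `mle_recursive` is an illustrative name):

```python
def mle_recursive(w, r):
    """Evaluate MLE_k(w, r) where len(w) == 2**len(r), big-endian bit order."""
    if not r:
        return w[0]
    half = len(w) // 2
    u0 = mle_recursive(w[:half], r[1:])  # sub-cube where the leading bit is 0
    u1 = mle_recursive(w[half:], r[1:])  # sub-cube where the leading bit is 1
    return u0 + r[0] * (u1 - u0)         # one multiplication per internal node

# Interpolation check on the hypercube: MLE(w, bits(i)) == w_i.
w = [3, 1, 4, 1, 5, 9, 2, 6]
for i in range(8):
    r = [(i >> 2) & 1, (i >> 1) & 1, i & 1]
    assert mle_recursive(w, r) == w[i]
```

The recursion does one multiplication per internal node of the binary tree, which is the 2^k - 1 count above.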

Another common and useful expression is in terms of Lagrange basis polynomials, which in the context of MLEs are commonly denoted as \mathsf{eq}_i(\vec r):

Lemma 3. The MLE evaluation map is equivalent to

\mathsf{MLE}_k(\vec w, \vec r) = \sum_{i \in [0, 2^k)} \mathsf{eq}_{k\,i}(\vec r) \cdot w_i

where

\mathsf{eq}_{k\,i}(\vec r) = \prod_{j \in [0, k)} \begin{cases} 1- r_j & \mathsf{bits}_k(i)_j = 0 \\ r_j & \mathsf{bits}_k(i)_j = 1 \text{.} \end{cases}

Proof: Observe that \mathsf{eq}_i(\vec r) is linear in each r_j and for i,j \in [0, 2^k) we have

\mathsf{eq}_i(\mathsf{bits}(j)) = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}

which makes \mathsf{eq}_i a multi-linear Lagrange basis for \{0,1\}^k. The properties of the expression for \mathsf{MLE} follow from this.

We can rewrite this lemma using the tensor product notation \otimes:

Lemma 4. The MLE evaluation map is equivalent to the inner product

\mathsf{MLE}_k(\vec w, \vec r) = \vec w \cdot \p{\bigotimes_{i \in \range{0,k}} \begin{bmatrix} 1 - r_i \\ r_i \end{bmatrix} } \text{.}

From this equivalent definition it follows directly that the MLE is linear in the \vec w argument:

Lemma 5. (Linearity) For all \alpha, \beta \in \F_q and \vec w, \vec w' \in \F_q^{2^k} and \vec r \in \F_q^k we have

\mathsf{MLE}(\alpha \cdot \vec w + \beta \cdot \vec w', \vec r) = \alpha \cdot \mathsf{MLE}(\vec w, \vec r) + \beta \cdot \mathsf{MLE}(\vec w', \vec r) \text{.}

However, it does not follow that the MLE is linear in the \vec r argument. Instead it is multi-linear, i.e. linear in each variable separately, which is a much weaker property.

The MLE evaluation respects the symmetries of the hypercube. The hypercube \{0,1\}^k has a large symmetry group, the hyper-octahedral group B_k of order k! \cdot 2^k, which is generated by permutations of the k axes and reflections in each axis. The MLE respects these symmetries in the following sense:

Lemma 6. (Hypercube Symmetries) Every symmetry of the hyper-cube \sigma \in B_k induces an equivalent symmetry on the \vec w and \vec r:

\mathsf{MLE}_k(\sigma_w(\vec w), \vec r) = \mathsf{MLE}_k(\vec w, \sigma_r(\vec r)) \text{.}

Proof: It suffices to show this for the generators of the group, which come in two types:

  1. Permutation of axes: \sigma_r(\vec r) = (r_{\pi(0)}, \dots, r_{\pi(k-1)}) for some permutation \pi on [0,k) and \sigma_w(\vec w) = (w_{\pi'(0)}, \dots, w_{\pi'(2^k - 1)}) where \pi' is the permutation on [0,2^k) induced by \pi on the bit representations of the indices.

  2. Reflection in axis j: \sigma_r(\vec r) = (r_0, \dots, 1 - r_j, \dots, r_{k-1}) and \sigma_w(\vec w) = (w_{0 \oplus 2^j}, \dots, w_{i \oplus 2^j}, \dots, w_{(2^k - 1) \oplus 2^j}) where \oplus is bitwise XOR.

It is sufficient to show that the left and right side of the equation are MLEs and agree on the interpolation points \set{0,1}^k. The only non-trivial case is the right side of the reflection, where we note that 1 - r_j is still linear in r_j.

The ability to permute variables plays nicely with other lemmas that allow us to split or combine variables. For example, we can compose two MLE evaluations into one:

Lemma 7. (Tensor Composition) Given k, k' and \vec w \in \F_q^{2^k}, \vec w' \in \F_q^{2^{k'}}, \vec r \in \F_q^k, and \vec r' \in \F_q^{k'} we have

\mathsf{MLE}_k(\vec w, \vec r) \cdot \mathsf{MLE}_{k'}(\vec w', \vec r') = \mathsf{MLE}_{k+k'}(\vec w \otimes \vec w', [\vec r\, \vec r']) \text{.}

In the opposite direction, factoring an \mathsf{MLE} is hard as \vec w doesn't generally have a tensor decomposition. But when we fix \vec r' to a bitvector we can do it:

Lemma 8. (Partial Fixing) Given \vec w \in \F_q^{2^{k +l}}, \vec r \in \F_q^k, and \vec b \in \set{0,1}^{l} we have

\mathsf{MLE}_{k+l}(\vec w, [\vec r\, \vec b]) = \mathsf{MLE}_{k}(\vec w', \vec r)

where \vec w' \in \F_q^{2^k} is given by w'_i = w_{i \cdot 2^{l} + j} with j such that \vec b = \mathsf{bits}_{l}(j).

That concludes the basic properties of general MLE evaluations. We will now look at special cases for \vec w that allow efficient evaluation of \mathsf{MLE}_k(\vec w, \vec r), i.e. sub-linear in the size of \vec w or, equivalently, sub-exponential in k.

Tensor Weights

For succinct verification we need to restrict \vec w to vectors that allow MLE evaluation in O(\log n) or better (where n = 2^k). A simple but useful class is those where \vec w can be decomposed as the tensor product of 2-vectors.

Lemma 9. Given \vec t_0,\dots,\vec t_{k-1} \in \F_q^2, there is an O(k) algorithm to compute \mathsf{MLE}_k(\vec w, \vec r) where

\vec w = \bigotimes_{i \in [0,k)} \vec t_i \text{.}

Proof: Consider the base case \mathsf{MLE}_0([1], []) = 1 and recursion:

\begin{align*} \mathsf{MLE}_{k+1}(\vec t_0 \otimes \vec w', [r_0, \vec r]) &= \p{(\vec t_0)_0 \cdot (1 - r_0) + (\vec t_0)_1 \cdot r_0} \cdot \mathsf{MLE}_k(\vec w', \vec r) \end{align*}

The recursive step is linear in r_0 and for r_0 \in \{0,1\} we have (\vec t_0)_0 or (\vec t_0)_1 times the MLE in the remaining variables, which results in the MLE of the tensor product by induction. Finally the algorithm is O(k) since it does O(1) work per variable.
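The O(k) evaluation of Lemma 9 can be sketched in plain Python, with a dense Kronecker expansion to check against. Both helper names (`kron`, `mle_tensor`) are illustrative, and the Kronecker convention here puts the first factor on the most significant bit:

```python
def kron(ts):
    """Dense weight vector t_0 ⊗ ... ⊗ t_{k-1} (first factor most significant)."""
    w = [1]
    for t0, t1 in ts:
        w = [wi * t for wi in w for t in (t0, t1)]
    return w

def mle_tensor(ts, r):
    """O(k) evaluation of MLE_k(kron(ts), r), one factor per variable."""
    result = 1
    for (t0, t1), ri in zip(ts, r):
        result *= t0 * (1 - ri) + t1 * ri
    return result

ts = [(2, 3), (5, 7), (1, 4)]
w = kron(ts)
# Agreement on the hypercube; multi-linearity then gives agreement everywhere.
for i in range(8):
    r = [(i >> 2) & 1, (i >> 1) & 1, i & 1]
    assert mle_tensor(ts, r) == w[i]
```

Note that `mle_tensor` touches each \vec t_i once, i.e. O(1) work per variable, while the dense vector has 2^k entries.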

This class is closed under the hypercube symmetries, as permutations just permute the \vec t_i and reflections swap the two elements of some \vec t_i. We can also scale tensor weights by scaling one of the \vec t_i, but we cannot generally add two tensor weight vectors, as the sum of two tensor products is not generally a tensor product. However, tensor weights are closed under element-wise multiplication of the weight vectors, by taking the element-wise products of the \vec t_i. This extends to element-wise exponentiation, division and inverses.

This is a very useful class of weights, as it allows us to evaluate polynomials in some useful bases. Here are some multi-linear examples:

Example 10. (Multi-Linear Monomial Basis) Given \vec x \in \F_q^k the tensor weights for evaluating a multi-linear polynomial in monomial basis are \vec t_i = \begin{bmatrix} 1 & x_i \end{bmatrix}.

Example 11. (Multi-Linear Boolean Hypercube Basis) Given \vec x \in \F_q^k the tensor weights for evaluating a polynomial in the \{0,1\}^k multi-linear basis are \vec t_i = \begin{bmatrix} 1 - x_i & x_i \end{bmatrix}.

Example 12. (General Multi-Linear Hypercube Basis) Given \vec x \in \F_q^k the tensor weights for evaluating a polynomial in the \{a_i,b_i\}_{i \in [0,k)} multi-linear basis are

\vec t_i = \begin{bmatrix} \displaystyle \frac{x_i - b_i}{a_i - b_i} & \displaystyle \frac{x_i - a_i}{b_i - a_i} \end{bmatrix} \text{.}

To do. The \set{0, \infty}^k basis.

We can also do some univariate evaluation, but it is much more restrictive. Monomial evaluation works:

Example 13. (Univariate Monomial Basis) Given x \in \F_q the weights vector for evaluating a polynomial in the univariate monomial basis has w_i = x^i and tensor weights \vec t_i = \begin{bmatrix} 1 & x^{2^{k-i-1}} \end{bmatrix}.
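A quick numeric check in plain Python that monomial weights factor as a tensor product; with the big-endian \mathsf{bits} convention and the Kronecker order of Lemma 4, the factor for variable i carries the exponent 2^{k-i-1}:

```python
k, x = 3, 5
ts = [(1, x ** (2 ** (k - i - 1))) for i in range(k)]
w = [1]
for t0, t1 in ts:  # Kronecker product, first factor most significant
    w = [wi * t for wi in w for t in (t0, t1)]
assert w == [x ** i for i in range(2 ** k)]  # w_i = x^i
```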

Evaluation on one or two Lagrange points is just a degenerate multi-linear case. We can also do three Lagrange points (where we ignore w_3), but barely:

Example 14. (Univariate 3-Point Lagrange Basis) Given Lagrange basis points x_0,x_1,x_2 \in \F_q and evaluation point x \in \F_q,

\begin{align*} \vec t_0 &= \begin{bmatrix} \displaystyle (x - x_2) \\[1em] \displaystyle \frac{(x - x_0) \cdot (x_1 - x_0)}{x_2 - x_1} \end{bmatrix} & \vec t_1 &= \begin{bmatrix} \displaystyle \frac{x - x_1}{(x_0 - x_1) \cdot (x_0 - x_2)} \\[1em] \displaystyle \frac{x - x_0}{(x_1 - x_0) \cdot (x_1 - x_2)} \end{bmatrix} \text{.} \\ \end{align*}

This is cheating a bit, as \vec w is length 4 and we simply ignore the last element. To show that we can't do better, we first introduce a lemma that succinctly characterizes tensor weights:

Lemma 15. (Face Identities) For m < k and i, i' \in \range{0, 2^m} and j, j' \in \range{0,2^{k-m}} we have for tensor weights \vec w:

w_{i + j \cdot 2^m} \cdot w_{i' + j' \cdot 2^m} - w_{i + j' \cdot 2^m} \cdot w_{i' + j \cdot 2^m} = 0 \text{.}

These identities correspond to the faces of the k-hypercube and they are necessary and sufficient conditions for \vec w to be a tensor product of 2-vectors.

Proof: To show necessity, group the tensor factors as \vec w = \vec w' \otimes \vec w'' with \vec w' \in \F_q^{2^{k-m}} and \vec w'' \in \F_q^{2^m} (possible by associativity of the tensor product), so that w_{i + j \cdot 2^m} = w'_j \cdot w''_i and similarly for the other terms. Then the identity becomes trivially true:

w'_j \cdot w''_i \cdot w'_{j'} \cdot w''_{i'} - w'_{j'} \cdot w''_i \cdot w'_j \cdot w''_{i'} = 0 \text{.}

To show sufficiency, we use induction on k. The base case k = 1 is trivial. For the inductive case, split \vec w into two halves \vec w = [\vec w_0\, \vec w_1]. We will show that \vec w_1 = \alpha \cdot \vec w_0 for some \alpha; then we can take \vec t_0 = [1\, \alpha] and use the inductive hypothesis on \vec w_0 to get the remaining \vec t_i. Consider the face identities for m = k-1, j = 0, j' = 1 and all i, i' \in \range{0,2^{k-1}}, which we can rewrite as

\frac{w_{i + 2^{k-1}}}{w_i} = \frac{w_{i' + 2^{k-1}}}{w_{i'}} \text{.}

If all the w_{i + 2^{k-1}} = 0 then we set \alpha = 0 and we are done. Otherwise these ratios are all equal, and we take their common value as \alpha.
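The necessity direction is easy to check numerically: any tensor product of 2-vectors satisfies all face identities. A small brute-force check in plain Python, with integer weights standing in for field elements:

```python
from itertools import product

def kron(ts):
    """Dense tensor product of 2-vectors, first factor most significant."""
    w = [1]
    for t0, t1 in ts:
        w = [wi * t for wi in w for t in (t0, t1)]
    return w

k = 4
w = kron([(2, 3), (5, 7), (1, 4), (6, 2)])
# Every face identity w_{i+j*2^m} w_{i'+j'*2^m} = w_{i+j'*2^m} w_{i'+j*2^m} holds.
for m in range(1, k):
    for i1, i2 in product(range(2 ** m), repeat=2):
        for j1, j2 in product(range(2 ** (k - m)), repeat=2):
            assert (w[i1 + j1 * 2 ** m] * w[i2 + j2 * 2 ** m]
                    - w[i1 + j2 * 2 ** m] * w[i2 + j1 * 2 ** m]) == 0
```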

We can now show that no dense univariate basis exists for more than 2 points using tensor weights:

Lemma 16. For k > 1 it is not possible to construct a univariate basis for n \geq 2^k interpolation points using tensor weights.

Proof: The n > 2^k case is trivial. For n = 2^k we are looking for tensor weights such that w_i = l_i(x), where l_i is the i-th Lagrange basis polynomial for some set of points \set{x_j}_{j \in \range{0, n}}. Picking four corners a, b, c, d of a face of the hypercube, the face identity becomes

l_a(x) \cdot l_b(x) - l_c(x) \cdot l_d(x) = 0 \text{.}

This identity needs to hold for any x and the left side is a degree 2(n-1) polynomial. Taking the derivative we get

\p{l_a(x) \cdot l_b'(x) + l_a'(x) \cdot l_b(x)} - \p{l_c(x) \cdot l_d'(x) + l_c'(x) \cdot l_d(x)} = 0 \text{.}

For x=x_a we have l_a(x_a) = 1 and l_b(x_a) = l_c(x_a) = l_d(x_a) = 0, and the above simplifies to l_b'(x_a) = 0. But x_a is a simple zero of l_b, so l_b'(x_a) \neq 0, a contradiction.

It's an open question (to me at least) if a dense univariate Lagrange basis MLE can be computed efficiently.

While tensor weights already capture a surprisingly useful class of succinct MLEs, they remain limited to factorizable structures and are, for example, not closed under addition. To go further, we can replace each scalar by a matrix, allowing more complex linear combinations of partial results. This leads to Weighted Ordered Binary Decision Diagrams.

Weighted OBDDs

A large class of efficient MLEs can be constructed by replacing the scalars in the tensors by matrices. This was first introduced in HR18 in the proof of thm. 5.4, later refined in HJR+25 § 4.1 where they are called read-once matrix branching programs. We will use terminology more in line with Weg00, augmented with terms from automata theory.

Definition 17. A Weighted Ordered Binary Decision Diagram (WOBDD) consists of a permutation \pi on \range{0,k} and weight matrices

\set{\mat M_{i\,0},\mat M_{i\,1} \in \mathbb{F}_q^{\,u_i \times u_{i+1}}}_{i \in [0,k)} \text{.}

where the u_i are matrix dimensions such that the initial and final dimensions are u_0 = u_k = 1. The evaluation of the program at \vec r is then defined as

\prod_{i \in [0,k)} \p{ (1 - r_{\pi(i)}) \cdot \mathrm{M}_{i\,0} + r_{\pi(i)} \cdot \mathrm{M}_{i\,1} } \text{.}

We call \max_i u_i the width of the WOBDD. When the permutation is given, we call it a \pi-WOBDD.

Note. This definition differs from HJR+25 and literature on weighted automata in allowing matrices of varying dimensions. The requirement that u_0 = u_k = 1 ensures that the result is a scalar, where in other literature this is achieved using separate source and sink vectors. This mostly makes the notation cleaner, but it's easy to convert between the two.

Let's establish that these are indeed MLE evaluations and can be evaluated efficiently.

Lemma 18. A WOBDD is an MLE evaluation for \vec w \in \F_q^{2^k} where

w_i = \prod_{j \in [0,k)} \mathrm{M}_{j\, b_{\pi(j)}}

where \vec b = \mathsf{bits}_k(i). If the width is bounded, evaluation takes O(k) operations.

Proof: Each r_i appears only once in the product and the expression is linear in each r_i, hence it is multi-linear in \vec r. For r_i \in \set{0, 1} the values of w_i follow directly from the definition.

To show how this works, let's look at two simple examples: one interpolating the bit-count of the index, and one interpolating the vector \vec w = (0, 1, \dots, 2^k -1).

Example 19. (Popcount) We can construct a WOBDD for the MLE of w_i = \mathsf{popcount}(i) \pmod{\F_q} using permutation \pi = \mathsf{id} and matrices

\begin{align*} \mat M_{0\,b} &= \begin{bmatrix} 1 & b \end{bmatrix} & \mat M_{i\,b} &= \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} & \mat M_{k-1\,b} &= \begin{bmatrix} b \\ 1 \end{bmatrix} \text{.} \end{align*}

Example 20. (Counter) We can construct a WOBDD for the MLE of w_i = i \pmod{\F_q} using the bit-reversal permutation \pi(i) = k - i - 1 (i.e. reading the least significant bit first) and matrices

\begin{align*} \mat M_{0\,b} &= \begin{bmatrix} 2 & b \end{bmatrix} & \mat M_{i\,b} &= \begin{bmatrix} 2 & b \\ 0 & 1 \end{bmatrix} & \mat M_{k-1\,b} &= \begin{bmatrix} b \\ 1 \end{bmatrix} \text{.} \end{align*}
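To make Definition 17 concrete, here is a minimal WOBDD evaluator in plain Python, checked against the popcount program of Example 19 (which is insensitive to the variable order, so the permutation is taken as the identity; matrices are nested lists and `wobdd_eval` is an illustrative name):

```python
def mat_mul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def wobdd_eval(matrices, r):
    """Evaluate prod_i ((1 - r_i) M_i0 + r_i M_i1); matrices[i] = (M_i0, M_i1)."""
    acc = [[1]]  # 1x1 start, dimensions u_0 = u_k = 1
    for (M0, M1), ri in zip(matrices, r):
        M = [[(1 - ri) * a + ri * b for a, b in zip(ra, rb)]
             for ra, rb in zip(M0, M1)]
        acc = mat_mul(acc, M)
    return acc[0][0]

# Popcount WOBDD from Example 19, k = 4.
k = 4
def M(i, b):
    if i == 0:
        return [[1, b]]        # 1x2 initial row
    if i == k - 1:
        return [[b], [1]]      # 2x1 final column
    return [[1, b], [0, 1]]    # 2x2 middle: accumulate the bit

mats = [(M(i, 0), M(i, 1)) for i in range(k)]
for n in range(2 ** k):
    bits = [(n >> (k - 1 - j)) & 1 for j in range(k)]
    assert wobdd_eval(mats, bits) == bin(n).count("1")
```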

In Jolt there are many lookup tables for m-ary functions where the k-bit inputs a_0, a_1, \dots, a_{m-1} \in \range{0,2^k} are combined into an index i = \sum_{j \in [0,m)} a_j \cdot 2^{j \cdot k} into the weight vector \vec w \in \F_q^{2^{m \cdot k}}. The goal is then to find a succinct MLE for this \vec w. For boolean binary operations we can do this as follows:

Example 21. (Bitwise Operation) Given a boolean binary operation \ast such as AND, OR or XOR, we can construct a WOBDD for the \mathsf{MLE}_{2\cdot k} evaluation of

w_{\vec a \, \vec b} =\sum_{i \in \range{0,k}} 2^i \cdot \p{a_i \ast b_i} \pmod{\F_q} \text{.}

Pick interleaved permutation \pi = (a_0, b_0, a_1, b_1, \dots) and matrices

\begin{align*} \mat M_{0\,b} &= \begin{bmatrix} 1 & 0 & 1 - b & b \end{bmatrix} & \mat M_{2\cdot i\,b} &= \begin{bmatrix} 1 & 0 & (1 - b) & b \\ 0 & 1 & 0 & 0 \end{bmatrix} \\ \mat M_{2\cdot i+1\,b} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 2^i \cdot (b \ast 0) \\ 0 & 2^i \cdot (b \ast 1) \end{bmatrix} & \mat M_{2\cdot k-1\,b} &= \begin{bmatrix} 0 \\ 1 \\ 2^{k-1} \cdot (b \ast 0) \\ 2^{k-1} \cdot (b \ast 1) \end{bmatrix} \text{.}\\ \end{align*}

When we have WOBDDs for two vectors \vec w, \vec w' we can combine them in several ways to get WOBDDs for new vectors. The following lemmas show some useful constructions.

Lemma 22. (Linear Combinations) Given two \pi-WOBDDs for \vec w, \vec w' we can construct a \pi-WOBDD for \vec w'' where w''_i = \alpha \cdot w_i + \beta \cdot w'_i by taking block matrices

\begin{align*} \mat M_{0\,j}'' &= \begin{bmatrix} \mat M_{0\,j} & \mat M_{0\,j}' \end{bmatrix} & \mat M_{i\,j}'' &= \begin{bmatrix} \mat M_{i\,j} & 0 \\[4pt] 0 & \mat M_{i\,j}' \end{bmatrix} & \mat M_{k-1\,j}'' &= \begin{bmatrix} \alpha \cdot \mat M_{k-1\,j} \\[4pt] \beta \cdot \mat M_{k-1\,j}' \end{bmatrix} \text{.} \end{align*}

The width of the resulting WOBDD is the sum of the widths of the input WOBDDs.

Lemma 23. (Multiplication) Given two k-variable \pi-WOBDDs for \vec w, \vec w' we can construct a \pi-WOBDD for \vec w'' where w''_i = w_i \cdot w'_i by taking tensor product matrices

\mat M_{ij}'' = \mat M_{ij} \otimes \mat M_{ij}'\text{.}

The width of the resulting WOBDD is the product of the widths of the input WOBDDs.

Lemma 24. (Reversing) Given a WOBDD with permutation i_0, \dots, i_{k-1} we can construct an equivalent WOBDD of same width with the reversed permutation i_{k-1},\dots,i_0 by transposing matrices \mat M_{i\,b}' = \mat M_{(k-i-1)\,b}^{\mathsf{T}}.

Lemma 25. (Composition) Given a k-variable WOBDD and a k'-variable WOBDD we can construct a (k+k')-variable WOBDD for \vec w'' = \vec w \otimes \vec w', i.e. w''_{i \cdot 2^{k'} + j} = w_i \cdot w'_j, by concatenating the permutations and matrices.

Using these constructions we can e.g. build a lookup table for squaring by taking the counter from example 20 and multiplying it with itself using lemma 23.

We could generalize the WOBDD definition by replacing the permutation with a partition of the variables r_i into l subvectors \vec r_i \in \F_q^{k_i}, having 2^{k_i} matrices for each i and evaluating using the \mathrm{eq} polynomials:

\prod_{i \in [0,l)} \sum_{j \in [0, 2^{k_i})} \mathrm{eq}_{k_i\,j}(\vec r_i) \cdot \mathrm{M}_{ij}\text{.}

This allows for a time-space trade-off, especially if we can eliminate a large intermediate dimension u_i. For simplicity we will not consider this generalization further.

Transducers

To create more powerful manipulations of WOBDDs we introduce transducers, which are like weighted finite-state transducers (WFSTs) except that they have a layered structure. They also extend the read-once branching programs from HJR+25 and OR-OBDDs from Weg00.

Definition 26. A Non-deterministic Weighted Ordered Binary Decision Transducer is a matrix \mat T \in \F_q^{2^k \times 2^k} specified as follows:

  • There is an input permutation \pi on \range{0,k}.
  • There are layered sets of states \setn{S}_0,\dots,\setn{S}_k.
  • For each i \in \range{0,k} there is a transition weight function \delta_i : \setn{S}_i \times \set{0,1} \times \setn{S}_{i+1} \times \set{0,1} \rightarrow \F_q such that in state s on input bit b we can transition to state s' outputting bit b' with weight \delta_i(s, b, s', b').

Take \widehat{\setn{S}} = \setn{S}_0 \times \setn{S}_1 \times \cdots \times \setn{S}_k. Then the matrix \mat T is specified by summing over all state paths while accumulating weights multiplicatively,

\mat T_{i j} = \sum_{\vec s \in \widehat{\setn S}} \prod_{l \in \range{0, k}} \delta_l(s_l, \mathsf{bits}(i)_{\pi(l)}, s_{l+1}, \mathsf{bits}(j)_{\pi(l)}) \text{.}

The width of the transducer is \max_i |\setn{S}_i|. For a given \pi we call it a \pi-transducer.

Transducers can be used to transform WOBDDs such that each resulting weight is a linear combination of the original weights:

Lemma 27. (Transducer Action) Given a \pi-transducer \mat T and a \pi-WOBDD for \vec w we can construct a \pi-WOBDD for \vec w' = \mat T \cdot \vec w using the block matrices

(\mat M'_{i\, b})_{s\,s'} = \sum_{b' \in \set{0,1}} \delta_i(s, b, s', b') \cdot \mat M_{i\, b'}

where the initial and final dimension are collapsed to 1 by summation. The width of the resulting WOBDD is at most the product of the width of the original WOBDD and the width of the transducer.

Proof: For each path in T, the construction picks the matrices from the original program according to the output and multiplies them.

The non-determinism complicates things compared to the deterministic read-once branching programs (ROBP) from HJR+25, but it also makes them more powerful. For one, it allows us to reverse the permutation without increasing the width. Without non-determinism, many useful programs can only be expressed in either MSB or LSB order and we would not be able to combine them. The class of efficiently computable functions is also larger (see e.g. Weg00 thm 12.2.1). In terms of the matrix \mat T, adding the output bit allows us to go from diagonal matrices to (generalized) permutation matrices, and adding non-determinism allows us to go to dense matrices.

For convenience, we will talk about Most-Significant Bit first (MSB) and Least-Significant Bit first (LSB) transducers, where the input permutation is \pi(i) = i and \pi(i) = k-i-1 respectively. When the transducer is deterministic we specify transitions as a function \delta_i: \setn{S}_i \times \set{0,1} \rightarrow \setn{S}_{i+1} \times \set{0,1} \times \F_q. We also omit the output and weight when they equal the input and 1 respectively. We say we associate a weight to an initial/final state to mean applying the weight to each transition entering or leaving that state respectively.

To show the utility of this class, let's look at some simple examples of transducers. The first example does a MSB lexicographic comparison with a constant:

Example 28. (MSB Comparison) A comparison for c \in \range{0, 2^{k}} is constructed as an MSB-transducer with states \setn S_0 = \set{\mathtt{equal}}, \setn S_i = \set{\mathtt{less}, \mathtt{equal}, \mathtt{greater}} and transition function

\begin{align*} \delta_i(\mathtt{less}, b) &= \mathtt{less} \\[4pt] \delta_i(\mathtt{equal}, b) &= \begin{cases} \mathtt{less} & \text{if } b < \mathsf{bits}(c)_i \\ \mathtt{equal} & \text{if } b = \mathsf{bits}(c)_i \\ \mathtt{greater} & \text{if } b > \mathsf{bits}(c)_i \end{cases} \\[6pt] \delta_i(\mathtt{greater}, b) &= \mathtt{greater} \text{.} \end{align*}

We associate weights to the final state (\mathtt{less}, \mathtt{equal}, \mathtt{greater}) as (1, 0, 0), (1, 1, 0), (0, 1, 0), (1, 0 ,1), (0, 1, 1) or (0, 0, 1) to get the transducers for <, \leq, =, \neq, \geq, > respectively.
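Since this transducer is deterministic and leaves the bits unchanged, we can sanity-check it by simply running the automaton. A small simulation in plain Python (`compare_msb` is an illustrative name; with the big-endian \mathsf{bits} convention, step i reads the bit of weight 2^{k-1-i}):

```python
def compare_msb(k, c, n):
    """Run the MSB comparison automaton on the big-endian bits of n."""
    state = "equal"
    for i in range(k):
        b = (n >> (k - 1 - i)) & 1   # step i reads the bit of weight 2^(k-1-i)
        cb = (c >> (k - 1 - i)) & 1  # the matching bit of the constant c
        if state == "equal":
            if b < cb:
                state = "less"
            elif b > cb:
                state = "greater"
    return state

k, c = 4, 6
# The (1, 0, 0) final weighting selects exactly the indices n < c.
for n in range(2 ** k):
    lt_weight = 1 if compare_msb(k, c, n) == "less" else 0
    assert lt_weight == (1 if n < c else 0)
```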

This method is width 3. There is also a width 2 LSB-transducer using subtraction:

Example 29. (LSB Comparison) A comparison for c \in \range{0, 2^{k}} is constructed as an LSB-transducer with states \setn S_0 = \set{0}, \setn S_i = \set{0,1} and transition function

\delta_i(s, b) = \begin{cases} 1 & \text{if } b < s + \mathsf{bits}(c)_{k-i-1} \\[4pt] 0 & \text{otherwise} \\[4pt] \end{cases}

The weights associated with the final states can be set to (0, 1), (1, 0) to get the transducers for <, \geq respectively.

The next example shows the utility of the output function to permute the weights, in this case shift them by any amount:

Example 30. (Shift) For a cyclic rotation of the indices by an amount c \in \range{0,2^k} we construct an LSB-transducer with states \setn S_0 = \set{0}, \setn S_i = \set{0,1} representing the carry bit, and transition function \delta_i(s, b) = (s', b') where

\begin{align*} t &= s + b + \mathsf{bits}(c)_{k-i-1} \\[4pt] (s', b') &= (\floor{t / 2}, t \bmod 2) \text{.} \end{align*}

The weights associated with final states can then be set to (1, 1), (1,0), (0,1) for cyclic rotation, left shift, and right shift (of 2^k - c) respectively.

Combining these two examples we can move any contiguous range of weights from a WOBDD to anywhere in the weight vector and set the rest to zero. In Inner Product Commitment Schemes this allows us to pack many polynomials of various shapes into a single commitment and open them with appropriately shifted weights: to move the range of weights \range{0,c} to \range{d, c+d} and set all others to zero, we apply a less-than-c comparison followed by a shift by d. This expands the width by a factor of 4. Neither c nor d needs to be a power of two.

Another interesting example to show the expressiveness of transducers is division, which for an arbitrary (but small) c transforms the weights as w_i' = w_{\floor{i / c}} using width c.

Example 31. (Division) For a floored division of the indices by an amount c we construct an MSB-transducer with states \setn S_0 = \set{0}, \setn S_i = \range{0,c} and transition function \delta_i(s, b) = (s', b') where

\begin{align*} t &= 2 \cdot s + b \\ (s', b') &= (t \bmod c, \floor{t / c}) \text{.} \end{align*}

Finally we show the power of non-determinism to construct dense matrices \mat T in bounded width:

Example 32. (Hadamard Transform) For a Hadamard transform of the weights such that w_i' = \sum_{j \in [0, 2^k)} (-1)^{i \cdot j} \cdot w_j we construct a LSB-transducer with one state \setn S_i = \set{\circ} and transition weight function

\delta_i(\circ, b, \circ, b') = (-1)^{b \cdot b'} \text{.}

Example 33. (Prefix Sum) For a prefix sum of the weights such that w_i' = \sum_{j \in [0, i]} w_j we construct an MSB-transducer with states \setn S_0 = \set{\mathsf{eq}}, \setn S_i = \set{\mathsf{lt}, \mathsf{eq}} and transition weight function

\begin{align*} \delta_i(\mathsf{lt}, b, \mathsf{lt}, b') &= 1 \text{ for all } b, b' \\ \delta_i(\mathsf{eq}, b, \mathsf{eq}, b) &= 1 \text{ for all } b \\ \delta_i(\mathsf{eq}, 1, \mathsf{lt}, 0) &= 1 \\ \delta_i(s,b, s', b') &= 0 \text { otherwise.} \\ \end{align*}
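The matrix \mat T of Definition 26 can be built by brute force over all state paths, which lets us check this example: the prefix-sum transducer should give the lower-triangular all-ones matrix. A small sketch in plain Python (`transducer_matrix` is an illustrative name, with \pi the identity):

```python
from itertools import product

def transducer_matrix(k, states, delta):
    """Build T per Definition 26 with pi = identity (MSB order).
    delta(l, s, bi, s2, bj): bi is a bit of the row index i, bj of column j."""
    n = 2 ** k
    bit = lambda x, l: (x >> (k - 1 - l)) & 1  # big-endian bit l of x
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0
            for path in product(*states):  # states = [S_0, ..., S_k]
                weight = 1
                for l in range(k):
                    weight *= delta(l, path[l], bit(i, l), path[l + 1], bit(j, l))
                    if weight == 0:
                        break
                total += weight
            T[i][j] = total
    return T

# The prefix-sum transducer from Example 33.
k = 3
states = [["eq"]] + [["lt", "eq"]] * k

def delta(l, s, bi, s2, bj):
    if s == "lt" and s2 == "lt":
        return 1
    if s == "eq" and s2 == "eq" and bi == bj:
        return 1
    if s == "eq" and s2 == "lt" and bi == 1 and bj == 0:
        return 1
    return 0

T = transducer_matrix(k, states, delta)
# w'_i = sum_{j <= i} w_j, so T is lower-triangular all-ones.
assert T == [[1 if j <= i else 0 for j in range(2 ** k)] for i in range(2 ** k)]
```

The brute force over paths is exponential and only useful for testing; the point of Lemma 27 is that the transducer never needs to be materialized.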

Extension fields and embeddings

With these tools, we can tackle the problem from my post on "Embedding Inner Products". Recall the setup: we have an inner product commitment scheme.

Definition 34. An \F_q-Inner Product Commitment Scheme is a tuple of algorithms (\mathsf{commit}, \mathsf{open}, \mathsf{verify}) such that:

  • The \mathsf{commit} algorithm takes a vector \vec v \in \F_q^n and outputs a commitment C to \vec v.
  • The \mathsf{open} algorithm takes \vec v, C, a vector \vec w \in \F_q^n, and a value y \in \F_q and outputs a proof \pi that the inner product \vec v \cdot \vec w = \sum_{i \in [0,n)} v_i \cdot w_i equals y.
  • The \mathsf{verify} algorithm takes C, \vec w, y, \pi and outputs true if the proof is valid and false otherwise.

In Ligerito and WHIR the running time of \mathsf{verify} is O(\log n) iff there is a succinct MLE for \vec w. Those schemes also require that \F_q is cryptographically large, which we can achieve by taking an extension field \F_{q^d} and an embedding:

Definition 35. An embedding of \F_q in \F_{q^d} is given by a \vec c \in \F_{q^d}^p such that the embedding map \mathsf{embed}_{\vec c}: \F_q^n \rightarrow \F_{q^d}^{n'}: \vec w \mapsto \vec w' with n' = \ceil{n / p} is

w_i' = \sum_{j \in [0, p)} c_j \cdot w_{i \cdot p + j} \text{.}

Furthermore, when p = d we call it a packed embedding and when \vec c = [1] we call it the natural embedding.
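The embedding map itself is straightforward to sketch. Here extension field elements are represented as length-d coefficient lists over \F_q, a toy representation chosen for illustration (only \F_q-linear operations appear, so the choice of modulus polynomial is irrelevant):

```python
q, d = 7, 2  # toy parameters; elements of F_{q^d} are length-d coefficient lists

def embed(c, w, p):
    """w'_i = sum_j c_j * w_{i*p+j}, with out-of-range entries of w taken as 0."""
    out = []
    for i in range((len(w) + p - 1) // p):  # ceil(n / p) output entries
        acc = [0] * d
        for j in range(p):
            wij = w[i * p + j] if i * p + j < len(w) else 0
            acc = [(a + cj * wij) % q for a, cj in zip(acc, c[j])]
        out.append(acc)
    return out

# The natural embedding (p = 1, c = [1]) just lifts each entry.
assert embed([[1, 0]], [1, 2, 3, 4, 5], 1) == [[1, 0], [2, 0], [3, 0], [4, 0], [5, 0]]
# A packed embedding with p = d = 2 and c = (1, beta) pairs up entries.
assert embed([[1, 0], [0, 1]], [1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5, 0]]
```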

We then apply embeddings to \vec v and \vec w, use an \F_{q^d}-inner product commitment scheme, and extract the \F_q result from the \F_{q^d} inner product. With careful choice of embeddings this can work.

Lemma 36. (Embedding Inner Products) For towers of extensions with suitable moduli, there exist packed embeddings \vec c, \vec c' such that for any \vec v, \vec w \in \F_q^n we have

\vec v \cdot \vec w = \mathsf{unembed}( \mathsf{embed}_{\vec c}(\vec v) \cdot \mathsf{embed}_{\vec c'}(\vec w)) \text{.}

where \mathsf{unembed}: \F_{q^d} \rightarrow \F_q simply takes the constant coefficient.

This raises the question: If we have a succinct MLE for \vec w \in \F_q^n, can we get a succinct MLE for \vec w' = \mathsf{embed}_{\vec c}(\vec w) \in \F_{q^d}^{n'}? It turns out we often can, using transducers.

Example 37. (Folding Transducer) For a folding of the weights such that w'_i = \sum_{j \in [0, p)} c_j \cdot w_{i \cdot p + j} we can construct an LSB-transducer with states \setn S_i = \range{0,p} and transition function \delta_i(s, b) = (s', b') where

\begin{align*} t &= s + p \cdot b \\ (s', b') &= (t \bmod 2, \floor{t / 2}) \text{.} \end{align*}

where the weights for the initial states are (c_0, \dots, c_{p-1}) and the weights for the final states are (1, 0, \dots, 0). This transducer has width p.

With this transducer, we can answer the question above positively for MSB- and LSB-WOBDDs:

Lemma 38. (Embedding Succinct MLEs) Given an embedding \vec c \in \F_{q^d}^p and an MSB- or LSB-WOBDD for \vec w \in \F_q^n we can construct a WOBDD for \vec w' = \mathsf{embed}_{\vec c}(\vec w) with width at most p times the width of the original WOBDD.

Proof: Reverse the original WOBDD if needed to make it LSB. Lift the matrices from \F_q to \F_{q^d} (i.e. take the natural embedding), then apply the folding transducer above.

This solves it for inner product commitments, but often there are more parts to a protocol and these may not necessarily allow for packed embeddings. For example if we have a naturally embedded sumcheck round, we may need to evaluate \mathsf{MLE}(\vec v', \vec r) for some \vec r \in \F_{q^d}^k where \vec v' \in \F_q^{2^k} is the natural embedding of \vec v.

We can do this using an inner product with MLE evaluation weights for \vec r (see example 11). But now we have an inner product between a vector \vec v over \F_q and a weights vector \vec w over \F_{q^d}, while we can only do inner products over \F_q. We can easily solve this by taking any basis \beta_0, \dots, \beta_{d-1} of \F_{q^d} over \F_q and expressing the \F_{q^d} inner product as a sum of d \F_q inner products, each with a different set of weights \vec w^{(l)}. These are the coefficients of \vec w in the basis, such that \vec w = \sum_{l \in \range{0,d}} \beta_l \cdot \vec w^{(l)} and \vec v \cdot \vec w = \sum_{l \in \range{0,d}} \beta_l \cdot \p{\vec v \cdot \vec w^{(l)}}. But this raises the question:

If we have a succinct MLE for \vec w \in \F_{q^d}^n and some basis \vec \beta of \F_{q^d} over \F_q, can we get succinct MLEs for each of the d coefficient vectors \vec w^{(l)} \in \F_q^n?

It turns out we can, and in fact for any WOBDD:

Lemma 39. (Lowering) Given an \F_{q^d} WOBDD and a basis \beta_0, \dots, \beta_{d-1} of \F_{q^d} we can construct WOBDDs for each of the d coefficient vectors \vec w^{(l)} \in \F_q^{2^k} such that w_i = \sum_{l \in [0,d)} w_i^{(l)} \cdot \beta_l. Take the original permutation and block matrices

(\mat M'_{i\,b})_{u\,v} = \mathsf{mul}_{\vec \beta}((\mat M_{i\,b})_{u\,v})

where \mathsf{mul}_{\vec \beta}(t) is the matrix of the \F_q-linear map x \mapsto t \cdot x in the basis \vec \beta. The initial matrix is multiplied by the \vec \beta-representation of 1, and the final matrix is split into d column vectors; each choice of final column yields the WOBDD for one of the coefficient vectors. The new WOBDDs have d times the original width.

It makes sense to extend the definition of WOBDDs to allow vector results, so that we can skip selecting a final column and compute all d evaluations in one pass. If we do this, the evaluation is optimal: it performs exactly the operations the original WOBDD would have done, just expressed in the basis. To use it in the packing above, we still need to apply the folding transducer, which multiplies the width by p again. It also lifts all intermediate values to \F_{q^d}, though the matrices remain in \F_q. On the other hand, we gain some efficiency because packing requires \floor{\log_2 p} fewer variables.
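Both the basis decomposition of the inner product and the \mathsf{mul}_{\vec \beta} construction of Lemma 39 can be sanity-checked in plain Python on a toy field. Below I use \F_{49} = \F_7[\i] with \i^2 = -1 and basis (1, \i); these parameters are an illustrative choice of mine, not from the text, and the sketch is independent of the Sage classes in the appendix.

```python
# Toy check, over F_49 = F_7[i] with i^2 = -1 and basis (1, i), of
#  (a) the decomposition v . w = sum_l beta_l * (v . w^(l)), and
#  (b) mul_beta(t) being the matrix of x -> t*x in that basis.
P = 7

def f49_mul(x, y):
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def f49_add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def inner(v, w):
    # v has F_7 entries (embedded as (v_i, 0)); w has F_49 entries.
    acc = (0, 0)
    for vi, wi in zip(v, w):
        acc = f49_add(acc, f49_mul((vi % P, 0), wi))
    return acc

# (a) The F_49 inner product splits into d = 2 inner products over F_7.
v = [3, 1, 4, 1]                      # F_7 vector
w = [(2, 5), (0, 6), (1, 1), (3, 0)]  # F_49 vector
w0 = [a for (a, _) in w]              # coefficients of beta_0 = 1
w1 = [b for (_, b) in w]              # coefficients of beta_1 = i
lhs = inner(v, w)
rhs = (sum(x * y for x, y in zip(v, w0)) % P,
       sum(x * y for x, y in zip(v, w1)) % P)
assert lhs == rhs

# (b) mul_beta(t): row l holds the coordinates of t * beta_l.
def mul_beta(t):
    a, b = t
    return [[a % P, b % P], [(-b) % P, a % P]]  # t*1 = (a,b), t*i = (-b,a)

def vecmat(x, m):
    return [sum(xi * m[i][j] for i, xi in enumerate(x)) % P for j in range(2)]

t, y = (3, 5), (2, 6)
# Acting on coefficient vectors with mul_beta(t) agrees with field multiplication.
assert vecmat(list(y), mul_beta(t)) == list(f49_mul(t, y))
```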

Further Thoughts and Questions


References

Sage Implementation

def bits(F, k, i):
    assert i < 2^k
    return vector(F, reversed(ZZ(i).digits(2, padto=k)))
def unbits(bits):
    return sum(Integer(b) * 2^(len(bits) - i - 1) for i, b in enumerate(bits))
def msb(k):
    P = Permutations(range(k))
    return P(range(k))
def lsb(k):
    P = Permutations(range(k))
    return P(range(k)[::-1])
def eq(k, i, r):
    if i >= 2^k:
        raise ValueError(f"index i={i} is not less than 2^{k} = {2^k}")
    if not isinstance(r, sage.structure.element.Vector):
        raise TypeError("r is not a vector")
    if len(r) != k:
        raise ValueError(f"length of r is not k = {k}")

    F = r.base_ring()
    result = F.one()
    for j in range(k):
        if i & (1 << (k - j - 1)) == 0:
            result *= F.one() - r[j]
        else:
            result *= r[j]
    return result
def mle(w, x):
    if not isinstance(w, sage.structure.element.Vector):
        raise TypeError("w is not a vector")
    if not isinstance(x, sage.structure.element.Vector):
        raise TypeError("x is not a vector")
    if w.base_ring() != x.base_ring():
        raise ValueError("w and x are not over the same field")
    if len(w) != 2^len(x):
        raise ValueError("length of w is not 2^len(x)")

    k = len(x)
    def rec(lo, hi, i):
        if i == k:
            return w[lo]
        mid = (lo + hi) // 2
        u0 = rec(lo, mid, i + 1)
        u1 = rec(mid, hi, i + 1)
        return u0 + x[i] * (u1 - u0)

    return rec(0, len(w), 0)

Class WOBDD

class WOBDD:
    def __init__(self, field, k, permutation, matrices):
        if not permutation in Permutations(range(k)):
            raise TypeError(f"{permutation} not in {Permutations(range(k))}")

        # Make sure matrix set size is correct
        if not isinstance(matrices, list):
            raise TypeError(f"matrices is not a list of list of matrices")
        if len(matrices) != k:
            raise ValueError(f"Number of matrix sets does not equal {k}")
        u = 1
        v = None
        for (i, mis) in enumerate(matrices):
            if not isinstance(mis, list) or len(mis) != 2:
                raise TypeError(f"matrices[{i}] is not a pair of matrices")
            for (j, m) in enumerate(mis):
                if not isinstance(m, sage.structure.element.Matrix):
                    raise TypeError(f"matrices[{i}][{j}] is not a matrix")
                if m.base_ring() != field:
                    raise TypeError(f"matrices[{i}][{j}] is not a matrix over the field")
                v = v or m.ncols()
                if i == k - 1:
                    v = 1
                if m.dimensions() != (u, v):
                    raise TypeError(f"matrices[{i}][{j}] is not of dimension {u} x {v}")
            (u, v) = (v, None)

        self.field = field
        self.k = k
        self.permutation = permutation
        self.matrices = matrices

    def random(field, k, permutation=None, max_width=5):
        permutation = permutation or Permutations(range(k)).random_element()
        u = [1] + [randint(1, max_width) for _ in range(k-1)] + [1]
        matrices = []
        for i in range(k):
            V = MatrixSpace(field, u[i], u[i+1])
            matrices.append([V.random_element(), V.random_element()])
        return WOBDD(field, k, permutation, matrices)

    def __repr__(self):
        return f"WOBDD width={self.width()} over {VectorSpace(self.field, self.k)}"

    def __call__(self, r):
        if isinstance(r, Integer) or isinstance(r, int):
            r = bits(self.field, self.k, r)
        if not isinstance(r, sage.structure.element.Vector):
            raise TypeError("r is not a vector")
        if r.base_ring() != self.field:
            raise ValueError("r is not over the field")
        r = [r[self.permutation[i]] for i in range(self.k)]
        M = matrix(self.field, [1])
        for ri, [Mi0, Mi1] in zip(r, self.matrices):
            M *= (1 - ri) * Mi0 + ri * Mi1
        return M[0,0]

    def width(self):
        return max(m.nrows() for ms in self.matrices for m in ms)

    def weights(self):
        return vector(self.field, [self(i) for i in range(2^self.k)])

    def assert_compatible(self, other):
        if self.field != other.field:
            raise ValueError("Not in same field")
        if self.permutation != other.permutation:
            raise ValueError("Different permutation")

    def is_compatible(self, other):
        try:
            self.assert_compatible(other)
        except:
            return False
        return True

    def lift(self, field):
        if not field.has_coerce_map_from(self.field):
            raise ValueError(f"Can not lift from {self.field} to {field}")
        matrices = [[m.change_ring(field) for m in ms] for ms in self.matrices]
        return WOBDD(field, self.k, self.permutation, matrices)

    def lower(self, F, i):
        (V, v_to_f, f_to_v) = self.field.vector_space(F)
        basis = [v_to_f(x) for x in V.basis()]

        def mul(x):
            return matrix([f_to_v(x * beta) for beta in basis])

        def lower_matrix(matrix):
            return block_matrix([[mul(x) for x in row] for row in matrix.rows()])

        matrices = [[lower_matrix(m) for m in ms] for ms in self.matrices]
        e = f_to_v(1)
        matrices[0] = [matrix(e * m) for m in matrices[0]]
        e = V.basis()[i]
        matrices[-1] = [matrix(m * e).T for m in matrices[-1]]
        return WOBDD(F, self.k, self.permutation, matrices)

    def __add__(self, other):
        if not isinstance(other, WOBDD):
            return NotImplemented
        self.assert_compatible(other)
        matrices = [
            [block_matrix([[m1, m2]]) for (m1, m2) in zip(self.matrices[0], other.matrices[0])],
        ]
        for i in range(1, len(self.matrices) - 1):
            matrices += [
                [block_matrix([[m1, 0], [0, m2]]) for (m1, m2) in zip(self.matrices[i], other.matrices[i])],
            ]
        matrices += [
            [block_matrix([[m1], [m2]]) for (m1, m2) in zip(self.matrices[-1], other.matrices[-1])],
        ]
        return WOBDD(self.field, self.k, self.permutation, matrices)

    def __mul__(self, other):
        # TODO: Scalar multiplication.
        if not isinstance(other, WOBDD):
            return NotImplemented
        self.assert_compatible(other)
        matrices = [
            [m1.tensor_product(m2) for (m1, m2) in zip(m1s, m2s)]
            for (m1s, m2s) in zip(self.matrices, other.matrices)
        ]
        return WOBDD(self.field, self.k, self.permutation, matrices)

    def reverse(self):
        P = Permutations(range(self.k))
        permutation = P(self.permutation[::-1])
        matrices = []
        for mis in self.matrices[::-1]:
            matrices += [[m.transpose() for m in mis]]

        return WOBDD(self.field, self.k, permutation, matrices)

WOBDD Examples

def popcount(F, k):
    matrices = (
        [[Matrix(F, [[1, b]]) for b in [0,1]]] +
        [[Matrix(F, [
            [1, b],
            [0, 1]
        ]) for b in [0,1]] for i in range(1,k-1)] +
        [[Matrix(F, [[b], [1]]) for b in [0,1]]])
    return WOBDD(F, k, lsb(k), matrices)
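The popcount chain can be sanity-checked without Sage by multiplying out the matrices over the integers at a Boolean point: evaluating at the bits of i should give popcount(i). A minimal Python sketch (my own restatement of the matrix chain above, assuming k >= 2):

```python
# Plain-Python sanity check of the popcount WOBDD matrix chain over the integers.
def popcount_chain(k, bits_lsb_first):
    # Row vector after the first (1 x 2) matrix [1, b].
    b = bits_lsb_first[0]
    row = [1, b]
    # Middle (2 x 2) matrices [[1, b], [0, 1]] accumulate the bit sum.
    for b in bits_lsb_first[1:-1]:
        row = [row[0], row[0] * b + row[1]]
    # Final (2 x 1) matrix [[b], [1]].
    b = bits_lsb_first[-1]
    return row[0] * b + row[1]

def eval_at(k, i):
    bits_lsb = [(i >> j) & 1 for j in range(k)]  # LSB variable order
    return popcount_chain(k, bits_lsb)

assert all(eval_at(5, i) == bin(i).count("1") for i in range(32))
```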
def counter(F, k, start=0):
    matrices = (
        [[Matrix(F, [[2, start + b]]) for b in [0,1]]] +
        [[Matrix(F, [
            [2, b],
            [0, 1]
        ]) for b in [0,1]] for i in range(1,k-1)] +
        [[Matrix(F, [[b], [1]]) for b in [0,1]]])
    return WOBDD(F, k, lsb(k), matrices)
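The counter chain admits the same kind of check: the row vector carries (2^j, start + partial sum of bits), so evaluating at the bits of i yields start + i. A plain-Python sketch (again my own restatement, assuming k >= 2):

```python
# Plain-Python sanity check of the counter WOBDD chain over the integers:
# evaluating at the bits of i (LSB variable order) yields start + i.
def counter_eval(k, start, i):
    b = i & 1
    row = [2, start + b]                      # first (1 x 2) matrix [2, start + b]
    for j in range(1, k - 1):                 # middle matrices [[2, b], [0, 1]]
        b = (i >> j) & 1
        row = [2 * row[0], row[0] * b + row[1]]
    b = (i >> (k - 1)) & 1                    # final (2 x 1) matrix [[b], [1]]
    return row[0] * b + row[1]

assert all(counter_eval(5, 7, i) == 7 + i for i in range(32))
```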
def bitwise(F, k, op):
    permutation = Permutations(range(2 * k))([j for i in range(k) for j in [i, i+k]])
    matrices = []
    matrices.append([Matrix(F, [[1, 0, 1-b, b]]) for b in [0,1]])
    for i in range(k-1):
        matrices.append([Matrix(F, [
            [1, 0, 0, 0],
            [0, 1, 2^(k - i - 1) * op(b, 0), 2^(k - i - 1) * op(b, 1)]
        ]).T for b in [0,1]])
        matrices.append([Matrix(F, [
            [1, 0, 1-b, b],
            [0, 1, 0, 0],
        ]) for b in [0,1]])
    i = k - 1
    matrices.append([Matrix(F, [[0, 1, 2^(k - i - 1) * op(b, 0), 2^(k - i - 1) * op(b, 1)]]).T for b in [0,1]])
    return WOBDD(F, 2*k, permutation, matrices)

Class Transducer

class Transducer:
    def __init__(self, field, k, permutation, initial_states, transitions):
        if not permutation in Permutations(range(k)):
            raise TypeError(f"{permutation} not in {Permutations(range(k))}")
        # Expand repeated rules to lists
        if not isinstance(transitions, list):
            transitions = [transitions] * k
        if not len(transitions) == k:
            raise TypeError(f"len(transitions) is {len(transitions)} not {k}")

        # Validate, extract reachable states by layer, and relabel states to [0, len(states))
        states = [len(initial_states)]
        labels = initial_states
        transition_relation = []
        for i, layer in enumerate(transitions):
            # Expand functions to relations
            if callable(layer):
                relation = []
                for s in range(states[-1]):
                    for b in [0, 1]:
                        [next_state, output, weight] = layer(i, labels[s], b)
                        relation.append([s, b, next_state, output, weight])
                layer = relation
            else:
                layer = [[labels.index(s), b, ns, o, w] for [s,b,ns,o,w] in layer if s in labels]
            for [s, b, ns, o, w] in layer:
                assert b in [0,1]
                assert o in [0,1]
            # Remove zero-weight transitions
            layer = [[s, b, ns, o, w] for [s,b,ns,o,w] in layer if w != 0]
            labels = list(set(t[2] for t in layer))
            states.append(len(labels))
            transition_relation.append([[s, b, labels.index(n), o, w] for [s,b,n,o,w] in layer])
        # TODO: Backwards reachability pass, general minimization, pow-2 tricks

        self.field = field
        self.k = k
        self.permutation = permutation
        self.states = states
        self.initial_states = initial_states
        self.transition = transition_relation

    def __repr__(self):
        return f"Transducer on {self.k} bits of width {self.width()}"

    def width(self):
        return max(self.states)

    def paths(self, input):
        if isinstance(input, Integer) or isinstance(input, int):
            input = bits(GF(2), self.k, input)
        if not isinstance(input, sage.structure.element.Vector):
            raise TypeError("input is not a vector")
        for ri in input:
            if ri != 0 and ri != 1:
                raise ValueError("input is not a bitvector")
        input = [input[self.permutation[i]] for i in range(self.k)]

        # States were relabeled to integer indices in __init__, so paths must
        # start from the index positions of the initial states, not their labels.
        paths = [(s, [], self.field(1)) for s in range(len(self.initial_states))]
        output = []
        for bit, layer in zip(input, self.transition):
            next_paths = []
            for (state, outputs, weight) in paths:
                for [s, b, ns, o, w] in layer:
                    if b != bit or s != state:
                        continue
                    next_paths.append((ns, outputs + [o], weight * w))
            paths = next_paths

        def unpermute(l):
            result = [None] * self.k
            for i in range(self.k):
                result[self.permutation[i]] = l[i]
            return result

        return [(unpermute(o), w) for (s, o, w) in paths]

    def matrix(self):
        m = matrix(self.field, 2^self.k, 2^self.k)
        for i in range(2^self.k):
            for (j, w) in self.paths(i):
                m[i, unbits(j)] = w
        return m

    def action(self, wobdd):
        if not isinstance(wobdd, WOBDD):
            raise TypeError("wobdd is not a WOBDD")
        if self.permutation != wobdd.permutation:
            raise TypeError("wobdd is not compatible")
        # TODO: This fails in the degenerate case where there are zero states.

        matrices = []
        for (i, (transitions, Mi)) in enumerate(zip(self.transition, wobdd.matrices)):
            u, v = Mi[0].dimensions()
            blocks = [[[matrix(wobdd.field, u, v) for _ in range(self.states[i + 1])] for _ in range(self.states[i])] for _ in range(2)]
            for [s, b, ns, o, w] in transitions:
                blocks[b][s][ns] += w * Mi[o]
            mi = [block_matrix(m) for m in blocks]
            if i == 0:
                mi = [matrix(sum(M.rows())) for M in mi]
            elif i == self.k - 1:
                mi = [matrix(sum(M.columns())).T for M in mi]

            matrices.append(mi)
        return WOBDD(wobdd.field, self.k, self.permutation, matrices)

Transducer Examples

def less_than(F, k, c):
    permutation = lsb(k)
    initial_states = [0]
    c = bits(ZZ, k, c)[::-1]
    def transition(i, state, input_bit):
        next_state = 1 if input_bit < state + c[i] else 0
        output_bit = input_bit
        weight = 1 if i < k-1 else next_state
        return (next_state, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
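The less_than transducer implements the borrow chain of the subtraction i - c: scanning LSB first, the state is the borrow, and the final borrow is 1 exactly when i < c. A plain-Python trace of the same transition rule, independent of the Transducer class:

```python
# Plain-Python trace of the borrow automaton behind less_than:
# scanning LSB first, borrow' = 1 iff input_bit < borrow + c_bit,
# and the final borrow equals [i < c].
def less_than_trace(k, c, i):
    borrow = 0
    for j in range(k):  # LSB-first
        c_bit = (c >> j) & 1
        i_bit = (i >> j) & 1
        borrow = 1 if i_bit < borrow + c_bit else 0
    return borrow

assert all(less_than_trace(6, c, i) == (1 if i < c else 0)
           for c in range(64) for i in range(64))
```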
def shift(F, k, c, kind='left'):
    if kind == 'left':
        weights = [1,0]
    elif kind == 'right':
        c = 2^k - c
        weights = [0,1]
    elif kind == 'cyclic':
        weights = [1,1]
    else:
        raise ValueError(f"Unknown kind {kind}")

    permutation = lsb(k)
    initial_states = [0]
    c = bits(ZZ, k, c)[::-1]
    def transition(i, s, b):
        t = s + b + c[i]
        next_state = t // 2
        output_bit = t % 2
        weight = weights[next_state] if i == k - 1  else 1
        return (next_state, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
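The shift transducer is bit-serial addition of the constant c: LSB first, the state is the carry, so input index i is routed to output index i + c mod 2^k, and the final weights decide how the wrap-around (carry-out) path is counted. A plain-Python trace of the carry chain, independent of the Transducer class:

```python
# Plain-Python trace of the carry automaton behind shift: LSB-first addition
# of the constant c routes input index i to output index (i + c) % 2^k,
# with the final carry telling whether the sum wrapped.
def shift_trace(k, c, i):
    carry, out = 0, 0
    for j in range(k):  # LSB-first
        t = carry + ((i >> j) & 1) + ((c >> j) & 1)
        carry, out = t // 2, out | ((t % 2) << j)
    return out, carry

assert all(shift_trace(5, 11, i) == ((i + 11) % 32, (i + 11) // 32)
           for i in range(32))
```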
def division(F, k, c):
    permutation = msb(k)
    initial_states = [0]
    def transition(i, state, input_bit):
        t = 2 * state + input_bit
        next_state = t % c
        output_bit = t // c
        weight = 1
        return (next_state, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
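The division transducer is textbook long division by the constant c, MSB first: the state is the running remainder and the output bits are the bits of the quotient. A plain-Python trace, independent of the Transducer class:

```python
# Plain-Python trace of the long-division automaton behind division:
# MSB first, the state is the running remainder mod c and the outputs
# are the bits of the quotient i // c.
def division_trace(k, c, i):
    rem, quot = 0, 0
    for j in reversed(range(k)):  # MSB-first
        t = 2 * rem + ((i >> j) & 1)
        rem, q_bit = t % c, t // c  # t < 2c, so q_bit is 0 or 1
        quot = (quot << 1) | q_bit
    return quot, rem

assert all(division_trace(6, 5, i) == (i // 5, i % 5) for i in range(64))
```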
def hadamard(F, k):
    permutation = lsb(k)
    initial_states = ['s']
    transitions = []
    for i in range(k):
        transitions.append([
            ('s', 0, 's', 0, F(1)),
            ('s', 0, 's', 1, F(1)),
            ('s', 1, 's', 0, F(1)),
            ('s', 1, 's', 1, F(-1))
        ])
    return Transducer(F, k, permutation, initial_states, transitions)
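The hadamard transducer assigns weight -1 exactly when the input and output bit are both 1, so the induced matrix is the Sylvester–Hadamard matrix with entries (-1)^{\mathsf{popcount}(i \wedge j)}. A plain-Python sketch of that identity (independent of the Sage classes):

```python
# The Sylvester recursion H_{2n} = [[H, H], [H, -H]] produces the same
# matrix as the per-bit rule: entry (i, j) is (-1)^popcount(i & j).
def sylvester(k):
    H = [[1]]
    for _ in range(k):
        H = [r + r for r in H] + [r + [-x for x in r] for r in H]
    return H

H = sylvester(3)
assert all(H[i][j] == (-1) ** bin(i & j).count("1")
           for i in range(8) for j in range(8))
```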
def prefix_sum(F, k):
    permutation = msb(k)
    initial_states = ['eq']
    transitions = []
    for i in range(k):
        transitions.append([
            ('lt', 0, 'lt', 0, F(1)),
            ('lt', 0, 'lt', 1, F(1)),
            ('lt', 1, 'lt', 0, F(1)),
            ('lt', 1, 'lt', 1, F(1)),
            ('eq', 0, 'eq', 0, F(1)),
            ('eq', 1, 'lt', 0, F(1)),
            ('eq', 1, 'eq', 1, F(1)),
        ])
    return Transducer(F, k, permutation, initial_states, transitions)
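A plain-Python enumeration of the automaton's paths confirms the intent: from state 'eq', scanning MSB first, input i reaches exactly the outputs j \le i, so the induced matrix is lower-triangular all-ones, i.e. a prefix sum. This sketch mirrors the transition table above without the Transducer class:

```python
# Plain-Python enumeration of the prefix_sum automaton's paths: scanning MSB
# first from state 'eq', the reachable outputs j for input i are exactly j <= i.
def prefix_outputs(k, i):
    outs = set()
    def walk(j, pos, state):
        if pos < 0:
            outs.add(j)
            return
        bit = (i >> pos) & 1
        if state == 'lt':                        # ('lt', b, 'lt', 0/1)
            walk(j, pos - 1, 'lt')
            walk(j | (1 << pos), pos - 1, 'lt')
        elif bit == 0:                           # ('eq', 0, 'eq', 0)
            walk(j, pos - 1, 'eq')
        else:                                    # ('eq', 1, 'lt', 0) / ('eq', 1, 'eq', 1)
            walk(j, pos - 1, 'lt')
            walk(j | (1 << pos), pos - 1, 'eq')
    walk(0, k - 1, 'eq')
    return outs

assert all(prefix_outputs(5, i) == set(range(i + 1)) for i in range(32))
```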
def fold(F, k, coeffs):
    p = len(coeffs)
    permutation = lsb(k)
    initial_states = range(p)
    def transition(i, carry, input_bit):
        t = carry + p * input_bit
        next_carry = t // 2
        output_bit = t % 2
        weight = coeffs[carry] if i == 0 else 1
        if i == k - 1:
            weight = 1 if next_carry == 0 else 0
        return (next_carry, output_bit, weight)
    return Transducer(F, k, permutation, initial_states, transition)
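The fold transducer is bit-serial multiplication by p: starting from carry l (with initial weight coeffs[l]), the path on input i ends at output index p·i + l, and the final weight kills overflowing paths. This is exactly the index map of the packing, w'_i = \sum_l coeffs[l] \cdot w_{p \cdot i + l}. A plain-Python trace of the carry chain, independent of the Transducer class:

```python
# Plain-Python trace of the carry automaton behind fold: starting from carry l,
# LSB-first processing of the bits of i performs bit-serial multiplication by p,
# so the path ends at output index p*i + l (killed by the final weight if that
# index overflows k bits, i.e. the final carry is nonzero).
def fold_trace(k, p, l, i):
    carry, out = l, 0
    for j in range(k):  # LSB-first
        t = carry + p * ((i >> j) & 1)
        carry, out = t // 2, out | ((t % 2) << j)
    return out, carry  # carry == 0 iff no overflow

assert all(fold_trace(6, 3, l, i) == ((3 * i + l) % 64, (3 * i + l) // 64)
           for l in range(3) for i in range(64))
```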

Remco Bloemen
Math & Engineering
https://2π.com