2024-01-11

Sumcheck and MLEs

$\gdef\delim#1#2#3{\mathopen{}\mathclose{\left#1 #2 \right#3}} \gdef\p#1{\delim({#1})} \gdef\ps#1{\delim\{{#1}\}} \gdef\F{\mathbb F} \gdef\set#1{\mathcal #1} \gdef\mat#1{\mathrm #1} \gdef\vec#1{\bm #1} \gdef\popcount{\mathrm{popcount}} \gdef\eq{\mathrm{eq}} \gdef\∀{\mathop{\huge ∀}\limits} \gdef\mul{\mathsf{mul}} \gdef\add{\mathsf{add}} \gdef\addc{\mathsf{add^C}} \gdef\bar#1{\overline{#1}}$

Sumcheck

Sumcheck was introduced in LFKN92 and allows reducing certain sums of evaluations to a single evaluation. Given a field $\F$ multivariate polynomial $f ∈ \F[X_1,…,X_n]$ and an evaluation domain $\set B = \set B_1 × ⋯ × \set B_n$ with $\set B_i \subseteq \F$ . Typically the evaluation domain is a unit hypercube, $\set B = \ps{0,1}^n$ , but I'll present it generically here.

The goal is to reduce proving

$h = \sum_{\vec x ∈ \set B} f\p{\vec x}$

to proving $f(\vec r) = c$ for some $\vec r ∈ \F^n$ and $c ∈ \F$ . This can then be proven by other means, for example by an opening of a polynomial commitment. The protocol proceeds as follows:

Prover sends $f_1 ∈ \F[X]$ , a univariate polynomial such that $f_1(X) = \sum_{\vec x ∈ \set B_2×⋯×\set B_n} f\p{X, \vec x}$
Verifier sends a random $r_1 ∈ \F$ and checks that $\deg f_1 = \deg_1 f$ and computes $h = \sum_{x ∈ \set B_1} f_1\p{x}$
Prover sends $f_2(X)$ where $f_2(X) = \sum_{\vec x ∈ \set B_3×⋯×\set B_n} f\p{r_1, X, \vec x}$
Verifier sends a random $r_2 ∈ \F$ and checks that $\deg f_2 = \deg_2 f$ and $f_1(r_1) = \sum_{x ∈ \set B_2} f_2\p{x}$
The prover and verifier repeat this process $n-3$ more times.
Prover sends $f_n$ where $f_n(X) = f\p{\vec r, X}$
Verifier sends a random $r_n ∈ \F$ and computes $c = f_n(r_n)$ .
The prover and verifier use further methods to check $c = f(\vec r)$

See Justin Thaler's notes on Sum-Check T20a, T23a.

Q: Does this generalize to other reductions than addition? What is it about addition that makes this work?

Q: Does this generalize to other projections than on an axis? Would a sequence of linear independent lines $\F → \F^n$ work?

Analysis

Soundness. (TODO: check) Soundness $O\p{\frac {b ⋅ n}{|\F|}}$ where $b = \max_i |\set B_i|$ . See also Thaler's note on small fields T23b.

Communication. There are $n$ rounds in the protocol and at each round $i$ the prover sends a $\deg_i f$ univariate polynomial. Which in general takes $\sum_i \p{\deg_i f + 1}$ elements of $\F$ total. In each round the prover sends a random $\F$ , so $n$ elements of $\F$ total. Assuming

Prover. TODO: Prover complexity, especially it being better than the $k⋅\log k$ (in terms of number of values committed) that univariate proving systems have.

Verifier. In each round $i$ the verifier computes a sum over $|\set B_i|$ evaluations of a $\deg_i f$ degree polynomial. With direct evaluation this takes $\p{|\set B_i|⋅\p{1 + \deg_i f} - 1} ⋅ \add + |\set B_i| ⋅ \deg_i f ⋅ \mul$ but there are more efficient algorithms for polynomial multi-evaluation. It is also that likely $\set B_i$ and $f$ have additional structure that can be exploited in a particular application.

Multi-linear functions

A multilinear function is a multivariate polynomial such that no variable occurs with degree higher than one. In other words, when considering a single variable and holding the rest constant the function is linear.

The equality function

A particularly useful multilinear function is the equality function:

$\eq(\vec x, \vec y) = \prod_{i∈[0,n)}\p{x_i⋅y_i + (1 - x_i)⋅(1 - y_i)}$

It is called the equality function because it is the indicator function for equality of two vectors on the unit hypercube:

$\∀_{\vec x, \vec y ∈ \{0,1\}^n} \eq(\vec x, \vec y) = \begin{cases} 1 & \vec x = \vec y \\ 0 & \vec x ≠ \vec y \end{cases}$

It is symmetrical under swapping arguments and permuting the vectors, i.e. given a permutation matrix $P$ :

$\eq(\vec x, \vec y) = \eq(\vec y, \vec x) = \eq(P⋅\vec x, P⋅\vec y)$

It can be efficiently evaluated using the product formula above, requiring the following operation in $\F$ :

$(3⋅n-1) ⋅ \mul + n⋅\add + 2⋅n⋅\addc$

Addendum. It actually can be rewritten into one multiplication $x_i\cdoty_i + (1 - x_i)\cdot(1 - y_i) = 1 - x_i - y_i + 2\cdotx_i\cdoty_i$

The hypercube basis

The equality function makes it easy to define basis functions for the corners of the hypercube. Given a corner $\vec c ∈ \{0,1\}^n$ the function $\eq(\vec c, \vec x)$ is $1$ on $\vec c$ and $0$ on all other corners. The evaluation of this function simplifies to $\eq(\vec c, \vec x) = \prod_{i∈[0,n)}\begin{cases} x_i & c_i = 1 \\ 1 - x_i & c_i = 0 \end{cases}$

This simplifies the number of operations in $\F$ to: $(n - 1)⋅\mul + (n - \popcount(\vec c))⋅\addc$

If we need the values $\eq(\vec c, \vec r)$ for all $\vec c ∈ \{0,1\}^n$ and some arbitrary $\vec r$ we can compute these efficiently by building up one dimension at a time. Taking $\bar{r_i} = 1 - r_i$

$\begin{aligned} x_0 &= \bar{r_0} &\quad x_{00} &= x_0 ⋅ \bar{r_1} &\quad x_{000} &= x_{00} ⋅ \bar{r_2} & \quad&⋯\\ x_1 &= r_0 & x_{01} &= x_0 ⋅ r_1 & x_{001} &= x_{00} ⋅ r_2 & &⋯\\ & & x_{10} &= x_1 ⋅ \bar{r_1} & x_{010} &= x_{01} ⋅ \bar{r_2} & &⋯\\ & & x_{11} &= x_1 ⋅ r_1 & x_{011} &= x_{01} ⋅ r_2 & &⋯\\ & & & & x_{100} &= x_{10} ⋅ \bar{r_2} & &⋯\\ & & & & x_{100} &= x_{11} ⋅ r_2 & &⋯\\ & & & & &\ \ ⋮ & &⋱\\ \end{aligned}$

This can be done in-place and takes $\p{2^{n+1} - 4}⋅\mul + n⋅\addc$ operations in $\F$ . See VSBW13, T17.

This method generalizes to any function that factors into the form $f(\vec x) = \prod_{i∈[0,n)} f_i(x_i)$ where we get the same number of multiplications and also $2\cdotn$ evaluations of $f_i$ in $0, 1$ . If we want the sum of all these $f(\vec c)$ values we only need $\p{n-1}⋅\mul + n⋅\add$ operations using:

$\sum_{x ∈ \ps{0,1}^n} \prod_{i∈[0,n)} f_i(x_i) = \prod_{i∈[0,n)}\p{f_i(0) + f_i(1)}$

Both these methods generalize to arbitrary product domains $\set B = \set B_0 × ⋯ × \set B_{n-1}$ , e.g.

$\sum_{\vec x ∈ \set B} \prod_{i∈[0,n)} f_i(x_i) = \prod_{i∈[0,n)} \sum_{x ∈ \set B_i} f_i(x)$

Q. Would it be posible to compute it in $O(n)$ time with $O(1)$ memory if we permute the order of elements to be a Gray code?

Multilinear extensions (MLEs)

Given $k = 2^n$ values in $\F$ we can store them at the corners of a $n$ dimensional hypercube $\{0,1\}^n$ . Take some ordering of the corners $\vec c_i$ and values $y_i$ then the multilinear function interpolating these values is

$f(\vec x) = \sum_{i ∈ [0,k)} y_i ⋅ \eq(\vec c_i, \vec x)$

Directly evaluating this takes $O\p{k⋅\log k}$ operations (see CTY11, T17). This can be reduced to $O\p{k}$ using $k$ memory by first computing all $\eq(\vec c, \vec x)$ values using the method above.

Hyperrectangles

We have focussed on the unit hypercube, $\{0,1\}^n$ , so far. Another interesting basis is the symmetric hypercube $\{-1, 1\}$ . And really this should generalize straightforward with non-degenerate affine transformations.

Note. Wikipedia has an interesting remark that transforming between coefficient and evaluation form on the unit hypercube is a Möbius transform, and for the symmetric hypercube a Hadamard transform.

Note. Bit equality indicator in $\{0,1\}$ vs $\{-1,1\}$ : $\begin{aligned} 1 - a - b + 2⋅a⋅b &&\quad& \frac{a⋅b + 1}{2} \end{aligned}$ in the $\operatorname{eq}$ usecase the $\frac12$ factor can be moved out of the product, and likely absorbed somewhere else entirely. So this seems beneficial. It does not appear possible for any basis to do better than $a\cdotb + 1$ as neither addition or multiplication alone is sufficient.

Zero testing

Sumcheck can be used to prove $f(\vec x) = 0$ for all $\vec x ∈ \{0,1\}^n$ . To do this, we first construct a function $g(\vec x)$ that is a multilinear extension of $f$ resricted to the unit cube: $g(\vec x) = \sum_{\vec y ∈ \{0,1\}^n} \eq(\vec y, \vec x) ⋅ f(\vec y)$

Since the interpolation is unique, $g(\vec x)$ is the zero polynomial if and only if $f$ is zero on the unit cube. By Schwartz-Zippel if $g(\vec r) = 0$ for a random $\vec r ∈\F^n$ then it is likely the zero polynomial. Thus, to proof our claim we do the following:

Verifier send a random $\vec r ∈ \F^n$ .
Prover and verifier use sumcheck on the sum $0 = g(\vec r) = \sum_{\vec y ∈ \ps{0,1}^n} \eq(\vec y, \vec r) ⋅ f(\vec y)$
The prover and verifier use further methods to check $\eq(\vec r', \vec r) ⋅ f(\vec r') = c$

Reduction to one query

Suppose we have a polynomial $f: \F^n → \F$ and we want to prove $f(\vec x_1) = y_1$ and $f(\vec x_2) = y_2$ . We can reduce this to a single evaluation. Create a linear function $l: \F → \F^n$ such that $l(0) = \vec x_1$ and $l(1) = \vec x_2$ :

$l(x) = \vec x_1 + x⋅(\vec x_2 - \vec x_1)$

Prover sends univariate polynomial $f'(x) = f(l(x))$ .
Verifier sends random $r ∈ \F$ and checks $f'(0) = y_1$ and $f'(1) = y_2$ .
The prover and verifier use further methods to check $f'(r) = f(l(r))$

This last claim is of the form $f(\vec x) = y$ .

Q. What is the size of $f'$ ? In T17 it is stated to be at most degree $n$ , but this could be a special case.

If $f$ is multilinear then $f'$ is degree at most $n$ . This generalizes to $k$ evaluation points with $l$ interpolating $l(0)=\vec x_0, l(1) = \vec x_1, …, l(k-1) = x_{k-1}$ . And $\deg f'$ at most $\p{k-1}⋅ n$ .

Alternative using linear combinations

See CFS17 and XZZ⁺19.

Verifier sends random $r \in 𝔽$ .
Prover sends

Zero-knowledge

Note. This still leaks a single evaluation of $f$ .

The sum-check protocol by itself is not zero-knowledge, but can be made such. In CFS17 the prover does this using a random masking polynomial. Starting with the prover claiming $h = \sum_{\vec x ∈ \set B} f\p{\vec x}$ .

Prover picks a random $g ∈ \F[X_1,…,X_n]$ of equal degrees as $f$ and computes $h' = \sum_{\vec x ∈ \set B} g\p{\vec x}$
Prover sends $h'$ and a commitment to $g$ .
Verifier sends random $ρ \in 𝔽$ .
The prover and verifier use the sumcheck protocol on $h + ρ ⋅ h' = \sum_{\vec x ∈ \set B} f\p{\vec x} + ρ ⋅ g\p{\vec x}$ to reduce it to $c + ρ ⋅ c' = f\p{\vec r} + ρ ⋅ g\p{\vec r}$
Prover decommits $c' = g\p{\vec r}$ .
The prover and verifier now continue with the claim $c = f\p{\vec r}$ .

The masking polynomial $g$ has $\prod_i(1 + \deg_i f)$ random coefficients, which is as large as the original $f$ . But often $f$ has special structure that allows much faster evaluation than an arbitrary polynomial. But $g$ being arbitrary ruins these often crucial optimizations. In XZZ⁺19 (aka Libra) this is addressed by realizing that a smaller $g$ is sufficient. In particular $g(\vec x) = g_1(x_1) + g_2(x_2) + ⋯ + g_n(x_n)$ where $g_i \in 𝔽[X]$ are random with $\deg g_i = \deg_{i} f$ is sufficient. Note that the constant term only has to be generated once, resulting in only $1 + \sum_i \deg_i f$ random coefficients.

References

GKR08 Shafi Goldwasser, Yael Tauman Kalai, Guy N. Rothblum (2008). Delegating Computation: Interactive Proofs for Muggles.
CTY11 Graham Cormode, Justin Thaler, Ke Yi (2011). Verifying Computations with Streaming Interactive Proofs.
CMT12 Graham Cormode, Michael Mitzenmacher, Justin Thaler (2012). Practical Verified Computation with Streaming Interactive Proofs.
T13 Justin Thaler (2013). "Time-Optimal Interactive Proofs for Circuit Evaluation"
T15 Justin Thaler (2015). "A Note on the GKR Protocol"
CFS17 Alessandro Chiesa, Michael A. Forbes, Nicholas Spooner (2017). A Zero Knowledge Sumcheck and its Applications.
XZZ⁺19 Tiancheng Xie, Jiaheng Zhang, Yupeng Zhang, Charalampos Papamanthou, Dawn Song (2019). Libra: Succinct Zero-Knowledge Proofs with Optimal Prover.
T20a T20b: Justin Thaler (2020). "The Unreasonable Power of the Sum-Check Protocol"
K22 Erich Kaltofen (2022). "The GKR Protocol Revisited"
PH23 Shahar Papini, Ulrich Haböck (2023). "Improving logarithmic derivative lookups using GKR"
S23 Lev Soukhanov (2023). "Power circuits: a new arithmetization for GKR-styled sumcheck"
T23a Justin Thaler (2023). "Proofs, Arguments, and Zero-Knowledge". See also lecture notes and older lecture notes
T23b Justin Thaler (2023). The sumcheck protocol over fields of small characteristic.
G24 Ariel Gabizon (2024). zkWarsaw talk "The GKR method".
BDT24 Suyash Bagad, Yuval Domb, Justin Thaler (2024). The Sum-Check Protocol over Fields of Small Characteristic.
BI24 Suyash Bagad, Karthik Inbasekar (2024). Ingoyama Sumcheck 201.