List Decoding MDS Codes

\gdef\p#1{\left({#1}\right)} \gdef\set#1{\left\{{#1}\right\}} \gdef\ceil#1{\left\lceil{#1}\right\rceil} \gdef\norm#1{\left|{#1}\right|} \gdef\setn#1{\mathcal{#1}} \gdef\vec#1{\mathbf{#1}} \gdef\mat#1{\mathrm{#1}} \gdef\F{\mathbb{F}} \gdef\om{\mathrm{\omega}} \gdef\wt{\operatorname{wt}} \gdef\d{\operatorname{d}} \gdef\A{\operatorname{A}} \gdef\Aut{\operatorname{Aut}} \gdef\Iso{\operatorname{Iso}} \gdef\L{\operatorname{L}} \gdef\span{\operatorname{span}} \gdef\rank{\operatorname{rank}} \gdef\supp{\operatorname{supp}} \gdef\n{\operatorname{n}} \gdef\H{\operatorname{H}}

Ground rules:

The theory of linear codes is essentially linear algebra with extra vocabulary and an odd almost-norm: the Hamming weight. We fix a finite field \F_q and a linear space \F_q^n, called the ambient space. A linear code \setn C is then a subspace \setn C \subseteq \F_q^n. The dimension of the code is k = \dim \setn C. A linear code with these q, n, k parameters is called an [n,k]_q-code.

Preliminaries on MDS Codes

Hamming Weight and Distance

Definition 1. (Hamming weight) Given a vector \vec v \in \F_q^n, its support \supp(\vec v) \subseteq [0,n) is the set of indices of its non-zero coordinates,

\supp(\vec v) = \set{i \in [0,n) \mid v_i \neq 0} \text{,}

and its weight \wt(\vec v) \in [0,n] is the number of non-zero coordinates, denoted as

\wt(\vec v) = \norm{\supp(\vec v)} \text{.}

Given a subset \setn{S} \subseteq \F_q^n, its weight is

\wt(\setn S) = \min_{\vec v \in \setn S \setminus \{\vec 0\}} \wt(\vec v)

and its weight distribution is the sequence for i \in [0,n]

\A_i(\setn S) = \norm{\set{\vec v \in \setn S \mid \wt(\vec v) = i}} \text{.}

The Hamming weight is almost a norm on \F_q^n, except that it does not satisfy homogeneity: instead \wt(\alpha \cdot \vec v) = \wt(\vec v) for all \alpha \in \F_q^\times. It does however satisfy the triangle inequality \wt(\vec u + \vec v) \leq \wt(\vec u) + \wt(\vec v), non-negativity \wt(\vec v) \geq 0 and positive definiteness \wt(\vec v) = 0 iff \vec v = \vec 0. Furthermore, the distance derived from it, \d(\vec u, \vec v) = \wt(\vec u - \vec v) is a proper metric on \F_q^n.
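To make the definitions concrete, here is a toy sketch in Python (the field size q = 5 and the vectors are arbitrary choices):

```python
# Support, Hamming weight, and Hamming distance per definition 1.
def support(v):
    """Indices of the non-zero coordinates of v."""
    return {i for i, x in enumerate(v) if x != 0}

def wt(v):
    """Hamming weight: the number of non-zero coordinates."""
    return len(support(v))

def dist(u, v, q):
    """Hamming distance d(u, v) = wt(u - v), coordinates taken mod q."""
    return wt([(a - b) % q for a, b in zip(u, v)])

u = [1, 0, 2, 0, 3]
v = [1, 4, 2, 0, 0]
assert support(u) == {0, 2, 4}
assert wt(u) == 3
assert dist(u, v, 5) == 2  # u and v differ in coordinates 1 and 4
assert wt([(a + b) % 5 for a, b in zip(u, v)]) <= wt(u) + wt(v)  # triangle inequality
```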

The Hamming weight and the ambient space are invariant under permutations of the coordinates and scalings of individual coordinates. Together these are represented by monomial matrices: essentially permutation matrices whose non-zero entries can be any non-zero field element. It can be shown that, together with the field automorphisms of \F_q, these generate all automorphisms of \F_q^n that preserve the Hamming weight.

Definition 2. (Linear code) An [n,k,d]_q-code \setn C is a subspace \setn C \subseteq \F_q^n of length n, dimension k, and minimum distance d = \wt(\setn C). When d is not specified, we simply write [n,k]_q-code. The members of \setn C are called codewords. An [n,k]_q code \setn C can be represented as the row space of a generator matrix \mat G \in \F_q^{k \times n} of rank k:

\setn C = \set{\vec m \cdot \mat G \mid \vec m \in \F_q^k} \text{.}

A generator matrix in the form \mat G = [\mat I_k \mid \mat P] is said to be in systematic form, where \mat P is called the parity matrix and the first k coordinates form an information set of \setn C. The dual of \setn C is the [n,n-k]_q code \setn C^\perp; a generator matrix of the dual code is called a parity-check matrix of \setn C.

Two codes \setn C_1, \setn C_2 \subseteq \F_q^n are monomially equivalent if there exists a monomial matrix \mat M \in \F_q^{n \times n} such that \setn C_2 = \setn C_1 \cdot \mat M = \set{\vec c \cdot \mat M \mid \vec c \in \setn C_1}.

Note. In coding theory, one refers only to the set of codewords without specifying how messages are encoded into codewords. In other words, the encoding map \mathsf{enc} : \setn M \to \setn C is not specified and there are many valid encodings for the same code. Typically encoding is done through multiplication by a generator matrix.

I'm being loose here with the choice of coordinates and the definition of information set. For the codes we will consider, this doesn't matter.

Given any invertible matrix \mat M \in \F_q^{k \times k}, the generator matrix \mat G' = \mat M \cdot \mat G generates the same code \setn C. Similarly, given a matrix \mat M \in \F_q^{n \times n} then \mat G' = \mat G \cdot \mat M generates the code \setn C' = \setn C \cdot \mat M. If \mat M is a monomial matrix, then \setn C and \setn C' are monomially equivalent codes. Monomial equivalence preserves all the aspects of a code we care about here: dimension, minimum distance, weight distribution, list decoding size, etc.

Lemma 2. (MacWilliams equations) Given an [n,k]_q code \setn C we have the following relations between its weight distribution and that of its dual code \setn C^\perp, for i \in [0,n]:

\sum_{j \in [0,n-i]} \binom{n - j}{i} \A_j(\setn C) = q^{k-i} \sum_{j \in [0, i]} \binom{n - j}{n-i} \A_j(\setn C^\perp) \text{.}

We know that by definition 0 \leq \A_i \leq \binom{n}{i} (q-1)^{i}. We also have \sum_{i \in [0,n]} \A_i(\setn C) = q^k and \sum_{i \in [0,n]} \A_i(\setn C^\perp) = q^{n-k}. Together with the MacWilliams equations this forms an integer linear program to bound the weight distribution of any [n,k]_q code.
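These relations are easy to verify numerically. Below is a minimal brute-force sketch over an arbitrarily chosen [4,2]_3 code (the generator matrix is an assumption for the example; the identity is multiplied through by q^i so the check stays in integers):

```python
# Numerically check the MacWilliams equations and the sum constraints for a
# small example code. All parameters here are arbitrary choices.
from itertools import product
from math import comb

q, n, k = 3, 4, 2
G = [[1, 0, 1, 1],
     [0, 1, 1, 2]]  # generator matrix of an example [4,2]_3 code

# enumerate the code: all F_q-linear combinations of the rows of G
C = {tuple(sum(m * g[i] for m, g in zip(ms, G)) % q for i in range(n))
     for ms in product(range(q), repeat=k)}
# brute-force the dual: all vectors orthogonal to every codeword
Cd = {v for v in product(range(q), repeat=n)
      if all(sum(a * b for a, b in zip(v, c)) % q == 0 for c in C)}

def weight_distribution(S):
    A = [0] * (n + 1)
    for v in S:
        A[sum(x != 0 for x in v)] += 1
    return A

A_C, A_Cd = weight_distribution(C), weight_distribution(Cd)
assert sum(A_C) == q**k and sum(A_Cd) == q**(n - k)
for i in range(n + 1):
    lhs = sum(comb(n - j, i) * A_C[j] for j in range(n - i + 1))
    rhs = sum(comb(n - j, n - i) * A_Cd[j] for j in range(i + 1))
    assert lhs * q**i == q**k * rhs  # MacWilliams equations, cleared of q^(k-i)
```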

MDS Codes

A well known theorem in coding theory is the Singleton bound, which limits the minimum distance d of a code for given n and k.

Lemma 3. (Singleton bound) Given an [n,k]_q code \setn C we have

d = \wt(\setn C) \leq n - k + 1 \text{.}

Codes that achieve this bound with equality are called maximum distance separable (MDS) codes. There are various equivalent characterizations of MDS codes; we list some of them here:

Lemma 4. Given an [n,k]_q code \setn C the following are equivalent:

  1. \setn C is MDS.
  2. \setn C^\perp is MDS.
  3. \wt(\setn C) = d = n - k + 1.
  4. Any set of k coordinates is an information set.
  5. Any k columns of a generator matrix of \setn C are linearly independent.
  6. Every parity matrix \mat P is superregular, i.e., all square submatrices of \mat P are invertible.
  7. For every \setn S \subseteq [0,n) with \norm{\setn S} = d there is a codeword \vec c \in \setn C with support \supp(\vec c) = \setn S.
  8. For every \setn S \subseteq [0,n) with \norm{\setn S} = k the projection map \pi_{\setn S} : \setn C \to \F_q^k defined as \pi_{\setn S}(\vec c) = (c_i)_{i \in \setn S} is surjective.

Note. A matrix that is superregular is also called an MDS matrix, especially in the cryptographic literature where they are used as diffusion layers in block ciphers and hash functions (e.g. AES, Poseidon). An MDS code can be constructed from a superregular matrix \mat P as the image of [\,\mat I \mid \mat P\,].

For more on superregular matrices, and on when Vandermonde matrices fail to be superregular (they are superregular only for non-zero evaluation points over the reals, not over finite fields), see e.g.:

https://arxiv.org/pdf/2403.10372

https://greggkelly.com/VANDERMONDE.html

https://math.stackexchange.com/questions/1781867/rank-of-square-matrix-a-with-a-ij-lambda-jp-i-where-p-i-is-an-incre

Superregular matrices: https://arxiv.org/pdf/2507.08809

K. C. Gupta, S. K. Pandey, I. G. Ray, S. Samanta, Cryptographically significant mds matrices over finite fields: A brief survey and some generalized results, Advances in Mathematics of Communications 13 (4) (2019) 779–843. doi:10.3934/amc.2019045.

The [n,k]_q MDS codes not only share the same minimum distance, they have their complete weight distribution in common (see e.g. Huffman-Pless lemma 7.4.1 p. 262):

Lemma 5. (MDS weight distribution) The weight distribution of an [n,k]_q MDS code \setn C with d = n - k + 1 is

\A_i(\setn C) = \begin{cases} 1 & \text{if } i = 0 \\ \binom{n}{i} (q-1)\sum_{j\in[0, i - d]} (-1)^j \binom{i-1}{j} q^{i - d - j} & \text{otherwise.} \\ \end{cases}

It follows from the MDS property, and also from the summation range being empty, that \A_i(\setn C) = 0 for 0 < i < d = n - k + 1. The first nonzero value is \A_{d}(\setn C) = \binom{n}{d} (q-1) = \binom{n}{k-1} (q-1).
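As a sanity check, the MDS weight distribution can be compared against brute-force enumeration of a small MDS code. The sketch below uses the [4,2]_5 Reed-Solomon code on evaluation points 0,1,2,3 (an arbitrary choice, with d = n - k + 1 = 3):

```python
# Brute-force check of the MDS weight distribution formula on the [4,2]_5 RS code.
from itertools import product
from math import comb

q, n, k = 5, 4, 2
d = n - k + 1
xs = [0, 1, 2, 3]
# the code: evaluations of all polynomials a + b*x of degree < k
C = {tuple((a + b * x) % q for x in xs) for a, b in product(range(q), repeat=k)}

def A_formula(i):
    """MDS weight distribution A_i for an [n,k]_q MDS code."""
    if i == 0:
        return 1
    return comb(n, i) * (q - 1) * sum((-1)**j * comb(i - 1, j) * q**(i - d - j)
                                      for j in range(i - d + 1))

counts = [0] * (n + 1)
for c in C:
    counts[sum(x != 0 for x in c)] += 1
assert counts == [A_formula(i) for i in range(n + 1)]
```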

This brings us to the MDS conjecture, which bounds the length of an MDS code for a given q.

Conjecture 5. (MDS conjecture) For every [n,k]_q MDS code, either

  1. k \in \{0, 1, n - 1, n\}, or
  2. n \leq q + 1, or
  3. q is even, n = q + 2, and k \in \{3, q - 1\}.

Evidence. The conjecture has been proven in many cases, in particular for all prime fields, and for q = p^h with prime p and k < p. For an overview see [BL19].

The MDS Conjecture can equally be restated as a conjecture on the existence of superregular matrices of dimension k \times (n - k) over \F_q.

The first case, k \in \{0, 1, n - 1, n\}, consists of the trivial codes: the zero code, the full space, the repetition code, and its dual, the single parity-check code.

We will now look at a class of codes that achieves both the Singleton bound and the entire range allowed by n \leq q + 1: Reed-Solomon codes. In fact, conjecturally, generalized Reed-Solomon codes are essentially the only codes that do so.

Reed-Solomon Codes

Definition 6. (Reed-Solomon Code) An [n,k]_q-Reed-Solomon (RS) code is defined by a sequence of n distinct evaluation points \vec x \in \F_q^n; the code \setn C is generated by the Vandermonde matrix

\begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ x_0 & x_1 & x_2 & \cdots & x_{n-1} \\ x_0^2 & x_1^2 & x_2^2 & \cdots & x_{n-1}^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_0^{k-1} & x_1^{k-1} & x_2^{k-1} & \cdots & x_{n-1}^{k-1} \\ \end{bmatrix} \text{.}

Note. It is sometimes claimed that a Vandermonde matrix is superregular (or rather "is MDS"). While a Vandermonde matrix generates MDS codes, it is only superregular under specific conditions (e.g. over the reals when the evaluation points are positive). In particular Vandermonde matrices are not superregular when the evaluation points are a multiplicative subgroup of a finite field, as is common in practice.

An [n,k]_q-Reed-Solomon code is essentially the evaluation of all polynomials of degree less than k at the n distinct points x_i. Any k evaluations uniquely determine a polynomial of degree less than k, so any k columns of the generator matrix are linearly independent. By lemma 4 this means that Reed-Solomon codes are MDS.
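This criterion is easy to check mechanically for small instances. The sketch below verifies that every k \times k minor of a Vandermonde generator matrix is invertible mod p, for the arbitrary choice [5,3]_7 with evaluation points 0..4:

```python
# Check lemma 4, item 5 for a small RS code: all k x k minors of the
# Vandermonde generator matrix are nonsingular mod p.
from itertools import combinations

q, n, k = 7, 5, 3
xs = [0, 1, 2, 3, 4]  # distinct evaluation points, arbitrary choice
G = [[pow(x, i, q) for x in xs] for i in range(k)]  # rows 1, x, x^2

def det_mod(M, p):
    """Determinant mod a prime p by Gaussian elimination."""
    M = [row[:] for row in M]
    det = 1
    for col in range(len(M)):
        piv = next((r for r in range(col, len(M)) if M[r][col] % p), None)
        if piv is None:
            return 0
        if piv != col:
            M[col], M[piv] = M[piv], M[col]
            det = -det
        det = det * M[col][col] % p
        inv = pow(M[col][col], p - 2, p)
        for r in range(col + 1, len(M)):
            f = M[r][col] * inv % p
            M[r] = [(a - f * b) % p for a, b in zip(M[r], M[col])]
    return det % p

for cols in combinations(range(n), k):
    minor = [[G[i][j] for j in cols] for i in range(k)]
    assert det_mod(minor, q) != 0  # any k columns are linearly independent
```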

The above definition of Reed-Solomon codes can be extended by adding the point at infinity \infty to the evaluation points, which corresponds to reading off the x^{k-1} coefficient in much the same way that evaluation at x=0 is reading off the x^0 coefficient. In other words, its column looks like [0, 0, \dots, 0, 1]^\top. This allows RS codes with n = q + 1.

Extended RS codes are not closed under monomial equivalence: scaling a column by \alpha produces [\alpha, \alpha \cdot x, \alpha \cdot x^2, \dots]^\top. We can fix this by assigning a scaling value \alpha_j \in \F_q^{\times} to each evaluation point, producing a generalized Reed-Solomon (GRS) code generated by a generalized Vandermonde matrix \mat G_{ij} = \alpha_j \cdot x_j^{i}, with the column for the point at infinity similarly scaled.

Theorem 7. An [n,k]_q MDS code is a GRS code if and only if the parity matrix is a generalized Cauchy matrix with entries

\mat P_{ij} = \frac{u_i \, v_j}{x_i - y_j} \text{,}

with x_i, y_j \in \F_q \cup \{\infty\} all distinct and u_i, v_j \in \F_q^{\times}. At most one of the x_i or y_j can be \infty, for which the row/column will be \mat P_{ij} = u_i \, v_j.

It is known that square Cauchy matrices are invertible, and it is easy to see that all submatrices of a Cauchy matrix are again Cauchy matrices, so all Cauchy matrices are superregular. This is another way to see that (generalized) Reed-Solomon codes are MDS.
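This too can be confirmed by brute force on a small instance (the prime p = 13, the points, and the trivial scalings u_i = v_j = 1 are arbitrary choices):

```python
# Check superregularity of a small Cauchy matrix over F_13 by brute force:
# all square submatrices, of all sizes, have nonzero determinant mod p.
from itertools import combinations

p = 13
xs, ys = [1, 2, 3, 4], [5, 6, 7, 8]  # distinct points
P = [[pow(x - y, p - 2, p) for y in ys] for x in xs]  # P_ij = 1/(x_i - y_j)

def det_mod(M, p):
    """Determinant mod a prime p by Gaussian elimination."""
    M = [row[:] for row in M]
    det = 1
    for col in range(len(M)):
        piv = next((r for r in range(col, len(M)) if M[r][col] % p), None)
        if piv is None:
            return 0
        if piv != col:
            M[col], M[piv] = M[piv], M[col]
            det = -det
        det = det * M[col][col] % p
        inv = pow(M[col][col], p - 2, p)
        for r in range(col + 1, len(M)):
            f = M[r][col] * inv % p
            M[r] = [(a - f * b) % p for a, b in zip(M[r], M[col])]
    return det % p

for size in range(1, 5):
    for rows in combinations(range(4), size):
        for cols in combinations(range(4), size):
            sub = [[P[r][c] for c in cols] for r in rows]
            assert det_mod(sub, p) != 0  # every square submatrix is invertible
```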

An important result is that GRS codes are essentially the only MDS codes at the boundary of the MDS conjecture, with a few exceptions:

Conjecture 8. (GRS conjecture) An [n, k]_q MDS code with n = q + 1 is a GRS code, except for the following cases:

  1. [10, 5]_9, the Casse-Glynn code,
  2. q=2^h and
    1. k \in \{3, q-1\}, the hyperoval codes and their duals,
    2. k \in \{4, q-3\} and h odd, the Glynn codes and their duals.

Evidence. The conjecture has been proven in many cases, see [BL19].

The GRS conjecture can equally be stated (with equivalent exceptions) as a conjecture that all superregular matrices of dimension k \times (q - k + 1) over \F_q are generalized Cauchy, or that all (q+1)-point arcs in the projective space \mathrm{PG}(k-1, q) are normal rational curves. A similar theorem, which extends below the MDS boundary, is due to [RL89]:

Theorem 9. (Roth-Lempel) A [n,k]_q MDS code is GRS if q is odd, 3 \leq k \leq q-1 and n \geq k + q - \ceil{\frac{\sqrt{q} + 1}{4}}.

The hyperoval codes for q = 2^h and k = 3 come from adding an extra [0, 1, 0]^\top column to the RS code, which only works in this specific case. This can be seen as a coefficient-selection evaluation much like 0 and \infty, selecting the x^1 coefficient. So in that sense the exception is not too far from being a Reed-Solomon code, but unlike the point at infinity, there is no clean interpretation in projective geometry. The other exception, k = q - 1, consists of their duals. This extra point allows them to grow to n = q + 2, which explains the exception to the MDS conjecture.

The remaining exception is a single code, the Casse-Glynn code:

Example 10. (Casse-Glynn) The following is a [10,5]_9 MDS code that is not a GRS code, given by the generator matrix

\begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \delta^7 & \delta^5 & 1 & 1 & \delta^5 \\ 0 & 1 & 0 & 0 & 0 & \delta^5 & \delta^7 & \delta^5 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & \delta^5 & \delta^7 & \delta^5 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & \delta^5 & \delta^7 & \delta^5 \\ 0 & 0 & 0 & 0 & 1 & \delta^5 & 1 & 1 & \delta^5 & \delta^7 \\ \end{bmatrix}\text{.}

where \delta is a root of \delta^2 + 2 \, \delta + 2.

The binary field exceptions we already discussed. The main proofs, e.g. Roth–Lempel (1989), proceed constructively: bring the generator matrix to the form [\mat I_k \mid \mat A] and show that \mat A must be a generalized Cauchy matrix with entries A_{i,j} = \frac{u_i\,v_j}{x_i - y_j} for distinct x_i, y_j \in \F_q and nonzero u_i, v_j \in \F_q^{\times}. Roth-Seroussi (1985) showed that such generators correspond to generalized Reed-Solomon codes, which in turn are monomially equivalent to Reed-Solomon codes.

But such results hold only near the boundary n = q + 1. For smaller n there are MDS codes known not to be equivalent to Reed-Solomon codes. A concrete example is the [6,3]_7 MDS code generated by

\begin{bmatrix} 1 & 0 & 0 & 4 & 2 & 3 \\ 0 & 1 & 0 & 2 & 2 & 1 \\ 0 & 0 & 1 & 4 & 6 & 5 \\ \end{bmatrix}\text{.}

Question. Do non-RS MDS codes behave materially differently?

Higher order MDS codes:

https://arxiv.org/pdf/2111.03210v2

https://arxiv.org/pdf/2310.12888

Superregular matrices

https://www.aimsciences.org/article/doi/10.3934/amc.2019045

List Decoding

Definition 2. Given an [n,k]_q-code \setn C and a vector \vec r \in \F_q^n, the coset of \setn C with offset \vec r is the affine subspace

\vec r + \setn C = \set{\vec r + \vec c \mid \vec c \in \setn C} \text{.}

The list decoding size of \vec r is, for i \in [0,n]

\L_i(\setn C, \vec r) = \sum_{j \in [0,i]} \A_j(\vec r + \setn C) \text{,}

and the list decoding bound of \setn C is, for i \in [0,n]

\L_i(\setn C) = \max_{\vec r \in \F_q^n} \L_i(\setn C, \vec r) \text{.}

Let's quickly check that this definition matches the usual distance-to-code one:

\begin{align*} \L_i(\setn C, \vec r) &= |\set{\vec c \in \setn C \mid \d(\vec r, \vec c) \leq i}| \\ &= \sum_{j \in [0,i]} |\set{\vec c \in \setn C \mid \wt(\vec r - \vec c) = j}| \\ &= \sum_{j \in [0,i]} |\set{\vec v \in (\vec r + \setn C) \mid \wt(\vec v) = j}| \text{,} \end{align*}

where the last step uses \vec r - \setn C = \vec r + \setn C, since -\setn C = \setn C.
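The equivalence can also be spot-checked numerically; the sketch below compares both counts on the [4,2]_5 RS code with an arbitrary offset \vec r:

```python
# Check that the distance-to-code definition of L_i(C, r) matches summing the
# coset weight distribution, on a small example. Parameters are arbitrary.
from itertools import product

q, n = 5, 4
xs = [0, 1, 2, 3]
C = {tuple((a + b * x) % q for x in xs)
     for a, b in product(range(q), repeat=2)}  # [4,2]_5 RS code
r = (1, 0, 2, 0)

def wt(v):
    return sum(x != 0 for x in v)

coset = [tuple((a + b) % q for a, b in zip(r, c)) for c in C]
L = []
for i in range(n + 1):
    by_distance = sum(wt(tuple((a - b) % q for a, b in zip(r, c))) <= i for c in C)
    by_coset = sum(wt(v) <= i for v in coset)
    assert by_distance == by_coset
    L.append(by_coset)
assert L[n] == q**2  # at distance n every codeword is within reach
```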

We are interested in computing or bounding \L_i(\setn C) for a given code \setn C. To do this we need to understand the weight distributions of all cosets \A_i(\vec r + \setn C). We have already seen the weight distribution of the trivial coset \vec 0 + \setn C = \setn C. For MDS codes the trivial coset can be computed from the parameters [n,k]_q alone, but in general the weight distributions of the cosets depend on the code \setn C itself; we will see an example of this in the numerical experiments.

But before we get to computation, there is more that can be done on the theory front. Consider this explicitly solvable coset:

Lemma 5. Given an [n,k]_q MDS code \setn C and a vector \vec r \in \F_q^n \setminus \setn C such that \setn D = \span(\vec r + \setn C) is an [n,k+1]_q MDS code, we have

\A_i(\vec r + \setn C) = \begin{cases} 0 & \text{if } i < n - k \\ \A_i(\setn C) + (-1)^{n - k + i} \binom{n}{i} \binom{i-1}{n - k - 1} & \text{if } i \geq n - k \text{.} \end{cases}

Proof: By Huffman-Pless lemma 7.5.1.ii we have

\A_i(\vec r + \setn C) = \frac{\A_i(\setn D) - \A_i(\setn C)}{q - 1} \text{.}

For i = 0 both \A_i(\setn D) and \A_i(\setn C) are one and cancel out. For i < n - k both \A_i(\setn D) and \A_i(\setn C) are zero. For i = n - k we have \A_i(\setn D) = \binom{n}{k} (q-1) and \A_i(\setn C) = 0. For i > n - k we get


\footnotesize \begin{align*} &\frac{\A_i(\setn D) - \A_i(\setn C)}{q-1} \\ &= \binom{n}{i} \p {\sum_{j\in [0,i - n + k]} (-1)^j \binom{i-1}{j} q^{i - n+k - j} - \!\!\!\sum_{j\in[0, i - n +k -1]}\!\!\!\! (-1)^j \binom{i-1}{j} q^{i - n+k -1 - j}} \\ &= \binom{n}{i} \p{\p{q - 1}\sum_{j=0}^{i - d} (-1)^j \binom{i-1}{j} q^{i - d - j} + (-1)^{i - n + k} \binom{i-1}{i - d + 1}} \\ &= \A_i(\setn C) + (-1)^{n - k + i} \binom{n}{i} \binom{i-1}{n - k - 1} \text{.} \\ \end{align*}

In particular, for a non-extended RS code with k < n such a vector \vec r must exist. Question: How many such vectors exist?

To do. What are the conditions for this span-extension to exist? It doesn't seem to hold in general for MDS.

Computing coset weight distributions

Given a subspace \setn C \subseteq \F_q^n of dimension k we want to efficiently compute the weight distribution of all cosets \A_i(\vec r + \setn C) for \vec r \in \F_q^n. To reduce the search space we first note that for any \vec d \in \setn C we have \vec r + \setn C = (\vec r + \vec d) + \setn C. For any \alpha \in \F_q^\times we also have \A_i(\alpha \cdot \vec r + \setn C) = \A_i(\vec r + \setn C). This follows from \wt(\alpha \cdot \vec x) = \wt(\vec x) and \alpha(\vec r + \setn C) = \alpha \cdot \vec r + \setn C (see also Huffman-Pless lemma 7.5.1.i).

Under these symmetries, each coset class can be assigned a unique representative offset \vec r \in \setn C^\perp with first non-zero coordinate equal to 1, provided \setn C \cap \setn C^\perp = \set{\vec 0}; otherwise, representatives that vanish on an information set can be used instead.

To iterate through the reduced search space we first construct a basis for \setn C^\perp of the form [\mat I_r \mid \mat H'] where r = \dim \setn C^\perp = n - k and \mat H' \in \F_q^{r \times k}. Then we iterate through all vectors \vec s \in \F_q^r with first non-zero coordinate equal to 1, and construct the corresponding coset offset as \vec r = \vec s \cdot [\mat I_r \mid \mat H'] = [\vec s \mid \vec s \cdot \mat H']. In total, we need to consider 1 + (q^r - 1)/(q - 1) cosets, where the 1 accounts for the zero coset \setn C itself.

Once we have a specific offset \vec r, we need to compute the weight distribution \A_i(\vec r + \setn C). For \dim \setn C = 1 we can do this efficiently by observing that \vec r + \setn C = \{\vec r + \alpha \cdot \vec c \mid \alpha \in \F_q\}. Let \vec v = \vec r + \alpha \cdot \vec c, then for each coordinate v_j = r_j + \alpha \cdot c_j we have v_j = r_j if c_j = 0, and otherwise v_j = 0 exactly when \alpha = \alpha_j = -r_j / c_j. So we first count the weight of \vec r on the coordinates where c_j = 0, call this w_0, and let m be the number of coordinates with c_j \neq 0. Then we compute \alpha_j for those m coordinates and count repetitions: the weight of \vec r + \alpha \cdot \vec c is w_0 + m minus the number of j with \alpha_j = \alpha. This takes O(n + q) time and O(q) space (or O(n \log n) time and O(n) space).
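A sketch of this base case, for prime q (so that inverses are pow(c_j, q-2, q)), checked against direct enumeration on arbitrary toy inputs:

```python
def coset_weights_dim1(r, c, q):
    """Weight distribution of the coset r + <c> over F_q, q prime; O(n + q)."""
    n = len(r)
    w0 = sum(r_j != 0 for r_j, c_j in zip(r, c) if c_j == 0)  # fixed part
    nonzero = [(r_j, c_j) for r_j, c_j in zip(r, c) if c_j != 0]
    m = len(nonzero)
    hits = [0] * q  # hits[alpha] = number of coordinates zeroed by this alpha
    for r_j, c_j in nonzero:
        hits[(-r_j) * pow(c_j, q - 2, q) % q] += 1
    dist = [0] * (n + 1)
    for alpha in range(q):
        dist[w0 + m - hits[alpha]] += 1  # weight of r + alpha * c
    return dist

# compare against direct enumeration on arbitrary toy inputs
q = 7
r = [1, 0, 3, 0, 5, 2]
c = [2, 0, 1, 4, 0, 6]
brute = [0] * (len(r) + 1)
for alpha in range(q):
    v = [(r_j + alpha * c_j) % q for r_j, c_j in zip(r, c)]
    brute[sum(x != 0 for x in v)] += 1
assert coset_weights_dim1(r, c, q) == brute
```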

If \dim \setn C = k > 1 we can use the above method as the base case in a recursive algorithm. We pick a basis vector \vec c \in \setn C and construct the k-1 dimensional subspace \setn C' of the remaining basis vectors. Then we have

A_i(\vec r + \setn C) = \sum_{\alpha \in \F_q} A_i((\vec r + \alpha \cdot \vec c) + \setn C') \text{.}

Then there are small practical optimizations: We can pick a basis of \setn C of the form [\mat I_k \mid \mat C] so the first k coordinates are trivial. Then instead of iterating through \alpha and computing \vec r + \alpha \cdot \vec c we can initialize an accumulator with \vec r and repeatedly add \vec c to it q - 1 times. Finally we can precompute c_j^{-1} for the last dimension.
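Putting the recursion together (with the accumulator optimization but a naive dimension-0 base case; the [5,2]_5 RS code and the offset are arbitrary choices):

```python
# Recursive coset weight distribution, checked against brute force.
from itertools import product

q = 5
G = [[1, 1, 1, 1, 1],
     [0, 1, 2, 3, 4]]  # [5,2]_5 RS code on evaluation points 0..4
n = len(G[0])

def coset_weights(r, basis, q):
    """Weight distribution of r + span(basis) by recursion on the basis."""
    if not basis:
        dist = [0] * (len(r) + 1)
        dist[sum(x != 0 for x in r)] = 1
        return dist
    c, rest = basis[0], basis[1:]
    total = [0] * (len(r) + 1)
    acc = list(r)
    for _ in range(q):  # acc runs through r, r + c, r + 2c, ...
        for i, x in enumerate(coset_weights(acc, rest, q)):
            total[i] += x
        acc = [(a + b) % q for a, b in zip(acc, c)]
    return total

r = [3, 1, 4, 1, 0]
brute = [0] * (n + 1)
for ms in product(range(q), repeat=len(G)):
    v = [(rj + sum(m * g[j] for m, g in zip(ms, G))) % q
         for j, rj in enumerate(r)]
    brute[sum(x != 0 for x in v)] += 1
assert coset_weights(r, G, q) == brute
```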

Numerical experiments

To do. Define RS codes.

For every prime q \in \{2,3,5,7,\dots\} we construct all [n,k]_q Reed-Solomon codes with n \in[1,q) and k \in [1,n-1] up to permutation of the coordinates. For each code we compute the weight distribution of all cosets using the method above. As it turns out, there are only a few distinct coset weight distributions per code, and often many cosets share the same weight distribution. We record the distinct coset weight distributions and their frequencies.

TODO. Also do for n = q.

The first question we ask is whether all Reed-Solomon codes with the same parameters q,n,k have identical coset weight distributions. Despite there being \binom{q}{n} evaluation point sets, distinct up to permutation, most produce identical spectra. This first breaks down at [6,2]_{13}, where four distinct spectra exist.

Lemma. The coset weight distributions and the maximum list size of an [n,k]_q-Reed-Solomon code depend on the choice of evaluation points.

Proof. We meet the first example at [6,2]_{13}, where the frequencies of the coset weight distributions differ, but we need to look further, to [7,2]_{13}, to see differences in the maximum list size. Take the evaluation point sets

\begin{align*} \vec x_1 &= [0, 1, 2, 3, 4, 5, 6] \\ \vec x_2 &= [0, 1, 2, 3, 4, 5, 9] \\ \end{align*}

The resulting [7,2]_{13}-RS codes \setn C_1, \setn C_2 have the following list decoding bounds

\begin{align*} \L(\setn C_1) &= [1, 1, 1, 2, 6, 21, 85, 169] \\ \L(\setn C_2) &= [1, 1, 1, 2, 5, 21, 85, 169] \\ \end{align*}

and the following coset weight distributions

                                  C₁     C₂
  [1, 0, 0, 0, 0,  0, 84, 84]:     1      1
  [0, 1, 0, 0, 0,  6, 73, 89]:     7      7
  [0, 0, 1, 0, 1,  8, 67, 92]:   105    105
  [0, 0, 1, 0, 0, 11, 64, 93]:   147    147
  [0, 0, 0, 2, 0,  9, 65, 93]:    70     70
  [0, 0, 0, 1, 3,  6, 66, 93]:    72     56
  [0, 0, 0, 1, 2,  9, 63, 94]:  1044   1092
  [0, 0, 0, 1, 1, 12, 60, 95]:  2561   2513
  [0, 0, 0, 1, 0, 15, 57, 96]:  1223   1239
  [0, 0, 0, 0, 6,  3, 67, 93]:     2      0
  [0, 0, 0, 0, 5,  6, 64, 94]:    84    105
  [0, 0, 0, 0, 4,  9, 61, 95]:  1596   1533
  [0, 0, 0, 0, 3, 12, 58, 96]:  7259   7357
  [0, 0, 0, 0, 2, 15, 55, 97]: 10862  10766
  [0, 0, 0, 0, 1, 18, 52, 98]:  5193   5250
  [0, 0, 0, 0, 0, 21, 49, 99]:   716    701

We observe that the coset class with six words of weight 4 occurs only for \setn C_1, leading to the difference in \L_4.

Despite this, only very few distinct weight distribution classes appear for a given [n,k]_q. For [n,k]_q-RS codes with q \leq 13 and n \leq 8, only the following have more than one class, with the given frequencies per class (in no particular order):

\begin{align*} [6,2]_{13} &: [52, 104, 312,1248] \\ [6,3]_{13} &: [52, 104, 312, 1248] \\[1em] [7,2]_{13} &: [78, 182, 364, 546, 546] \\ [7,3]_{13} &: [78, 182, 364, 546, 546] \\ [7,4]_{13} &: [78, 182, 546, 910] \\[1em] [8,2]_{13} &: [39, 78, 234, 468, 468] \\ [8,3]_{13} &: [39, 78, 234, 468, 468] \\ [8,4]_{13} &: [39, 78, 234, 468, 468] \\ [8,5]_{13} &: [39, 78, 234, 468, 468] \\ \end{align*}

Observe that the class frequencies are identical for each n, except for [7,4]_{13} where it appears two classes have collapsed into one.

Generally we find that there is a lot more symmetry left to exploit: there are far fewer coset classes than what we identified, and far fewer code weight distributions than what is possible.

It is computationally demanding to explore this further, so instead we will focus on a particular set of evaluation points: the multiplicative cosets of subgroups of \F_q^\times, i.e. x_i = \alpha \cdot \om_n^i where \alpha \in \F_q^\times and \om_n \in \F_q is a primitive n-th root of unity. These are the point sets of practical concern as they can be efficiently implemented using number-theoretic transforms (NTTs).

We test whether \alpha influences the weight distributions, and find that it does not for prime q < 13; for q = 13, n < 12; and for q = 13, n = 12, k \in [3,10]. We skipped the other [12,k]_{13} cases due to computational complexity, but some of them follow from the lemmas below. We will from here on fix \alpha = 1.

TODO.

Exact solutions for MDS \setn C

Lemma 3. Given a subspace \setn C \subseteq \F_q^n we have the Singleton bound

\wt(\setn C) \leq n - \dim \setn C + 1 \text{,}

and when the bound is tight, i.e., when equality holds, then \setn C is called maximum distance separable (MDS).

Proof sketch. We have n coordinates, if we use all k = \dim \setn C degrees of freedom to try to zero them out, we end up with the zero vector which is excluded from the weight minimum. So we can at most zero out k - 1 coordinates, leaving at most n - (k - 1) = n - k + 1 non-zero coordinates.

When \setn C is MDS, we can solve the coset weight distribution of two special cases exactly. The first one is the trivial coset, when \vec r \in \setn C and the coset is \setn C itself:

Lemma 4. Given \setn C \subseteq \F_q^n with k = \dim \setn C the following are equivalent:

  1. \setn C is MDS.
  2. \setn C^\perp is MDS.
  3. \wt(\setn C) = n - k + 1.
  4. Any k columns of a generator matrix of \setn C are linearly independent.
  5. If \mat G = [\mat I \mid \mat A] is a generator matrix of \setn C, then \mat A is hyper-invertible, i.e., all square submatrices of \mat A are invertible.
  6. For every \setn S \subseteq [0,n) with \norm{\setn S} = n-k+1 there is a codeword \vec c \in \setn C with support \supp(\vec c) = \set{i \in [0,n) \mid c_i \neq 0} = \setn S.
  7. For every \setn S \subseteq [0,n) with \norm{\setn S} = k the projection map \pi_{\setn S} : \setn C \to \F_q^k defined as \pi_{\setn S}(\vec c) = (c_i)_{i \in \setn S} is surjective.

Lemma 4. Given an MDS \setn C \subseteq \F_q^n with k = \dim \setn C and d = n-k+1 we have

\A_i(\setn C) = f_i(q,n,k) + \begin{cases} 1 & \text{if } i = 0 \\ 0 & \text{otherwise,} \\ \end{cases}

where

f_i(q,n,k) = \binom{n}{i} (q-1)\sum_{j=0}^{i - d} (-1)^j \binom{i-1}{j} q^{i - d - j}

which has special cases f_i(q,n,k) = 0 for i < d and f_{d}(q,n,k) = \binom{n}{d} (q-1) = \binom{n}{k-1} (q-1).

Proof: See Huffman-Pless lemma 7.4.1 p. 262.


We must have \sum_{i\in[0,n]} \A_i(\setn C) = q^k, from which it follows that \sum_{i\in[0,n]} f_i(q,n,k) = q^k - 1.

The second special case is when \vec r is such that the span of \vec r + \setn C is also MDS. For an [n,k]_q-RS code with k < n and evaluation points x_i we can construct such an \vec r by choosing r_i = x_i^{k}. The resulting span is the [n,k+1]_q-RS code on the same evaluation points, and hence is also MDS.

Lemma 5. Given an MDS \setn C \subseteq \F_q^n and a vector \vec r \notin \setn C such that \setn D = \span(\vec r + \setn C) is also MDS, we have

\A_i(\vec r + \setn C) = f_i(q,n,k) + \begin{cases} 0 & \text{if } i < n - k \\ (-1)^{n - k + i} \binom{n}{i} \binom{i-1}{n - k - 1} & \text{if } i \geq n - k \text{.} \end{cases}

In particular \A_{n-k}(\vec r + \setn C) = \binom{n}{k}.

Proof: Since \setn D is MDS, its minimum distance is

d' = n - \dim \setn D + 1 = n - (\dim \setn C + 1) + 1 = d - 1 \text{.}

By Huffman-Pless lemma 7.5.1.ii we have

\A_i(\vec r + \setn C) = \frac{\A_i(\setn D) - \A_i(\setn C)}{q - 1} \text{.}

For i = 0 both \A_i(\setn D) and \A_i(\setn C) are one and cancel out. For 0 < i < d - 1 both \A_i(\setn D) and \A_i(\setn C) are zero. For i = d - 1 we have \A_i(\setn D) = \binom{n}{d-1} (q-1) and \A_i(\setn C) = 0. For i \geq d we get

\footnotesize \begin{align*} &\frac{\A_i(\setn D) - \A_i(\setn C)}{q-1} = \frac{f_i(q, n, k + 1) - f_i(q, n, k)}{q - 1} \\ &= \binom{n}{i} \p {\sum_{j=0}^{i - d + 1} (-1)^j \binom{i-1}{j} q^{i - d + 1 - j} - \sum_{j=0}^{i - d } (-1)^j \binom{i-1}{j} q^{i - d - j}} \\ &= \binom{n}{i} \p{\p{q - 1}\sum_{j=0}^{i - d} (-1)^j \binom{i-1}{j} q^{i - d - j} + (-1)^{i - d + 1} \binom{i-1}{i - d + 1}} \\ &= f_i(q,n,k) - (-1)^{d + i} \binom{n}{i} \binom{i-1}{d - 2} \text{.} \\ \end{align*}

It is interesting that these fluctuations do not depend on q.
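Lemma 5 can be verified numerically with the construction above: take the [5,2]_5 RS code on points 0..4 (an arbitrary small instance) and \vec r with r_i = x_i^k:

```python
# Brute-force check of lemma 5 on the [5,2]_5 RS code with r_i = x_i^k,
# which spans the [5,3]_5 RS code on the same points.
from itertools import product
from math import comb

q, n, k = 5, 5, 2
d = n - k + 1
xs = list(range(n))
C = {tuple((a + b * x) % q for x in xs) for a, b in product(range(q), repeat=k)}
r = tuple(pow(x, k, q) for x in xs)

def f(i):  # f_i(q, n, k)
    return comb(n, i) * (q - 1) * sum((-1)**j * comb(i - 1, j) * q**(i - d - j)
                                      for j in range(i - d + 1))

def A_lemma5(i):
    """Predicted coset weight distribution A_i(r + C) from lemma 5."""
    if i < n - k:
        return 0
    return f(i) + (-1)**(n - k + i) * comb(n, i) * comb(i - 1, n - k - 1)

dist = [0] * (n + 1)
for c in C:
    v = tuple((a + b) % q for a, b in zip(r, c))
    dist[sum(x != 0 for x in v)] += 1
assert dist == [A_lemma5(i) for i in range(n + 1)]
assert dist[n - k] == comb(n, k)
```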

The above two cosets provide a simple lower bound for the list decoding size of [n,k]_q-RS codes with k < n:

\begin{align*} \L_{d-1} \geq \binom{n}{d-1} && \L_d \geq 1 + \binom{n}{d} (q - 1) \text{.} \end{align*}

Coset Weight Distributions

Given \setn C \subseteq \F_q^n with \dim \setn C = k and \vec r \in \F_q^n, we want to compute \A_i = \A_i(\vec r + \setn C), the number of vectors in the coset \vec r + \setn C of weight i.

We will tackle this by considering all support sets \setn I \subseteq [0,n). Define the subset of the coset with support contained in \setn I and exactly equal to \setn I respectively as

\begin{align*} \setn A_{\setn I} &= \{\vec c \in \vec r + \setn C \mid \supp(\vec c) \subseteq \setn I\} \\ \setn A_{\setn I}' &= \{\vec c \in \vec r + \setn C \mid \supp(\vec c) = \setn I\} \\ \end{align*}

Then these are related by inclusion-exclusion as

\norm{\setn A_{\setn I}'} = \sum_{\setn J \subseteq \setn I} (-1)^{\norm{\setn I} - \norm{\setn J}} \norm{\setn A_{\setn J}} \text{,}

and to the coset weight distribution by

\A_i = \sum_{\substack{\setn I \subseteq [0,n)\\[.3em] \norm{\setn I} = i}} \norm{\setn A_{\setn I}'} = \sum_{\substack{\setn I \subseteq [0,n)\\[.3em] \norm{\setn I} = i}} \sum_{\setn J \subseteq \setn I} (-1)^{\norm{\setn I} - \norm{\setn J}} \norm{\setn A_{\setn J}} \text{.}

So the task is to compute \norm{\setn A_{\setn J}} for all \setn J \subseteq [0,n).

Fix \setn J \subseteq [0,n) and let its complement be \overline{\setn J} = [0,n) \setminus \setn J. For \vec x = \vec r + \vec c \in \vec r + \setn C to be in \setn A_{\setn J} it must be that x_j = 0 for all j \in \overline{\setn J}. This gives the coordinate-wise equations

\underset{j \in \overline{\setn J}}{\Large \forall}{}\ c_j = - r_j \text{.}

Denote with subscript \overline{\setn J} the restriction to coordinates in \overline{\setn J}, i.e. \vec c_{\overline{\setn J}} = (c_j)_{j \in \overline{\setn J}}, \setn C_{\overline{\setn J}} = \set{\vec c_{\overline{\setn J}} \mid \vec c \in \setn C}, etc. Let \pi_{\overline{\setn J}}: \setn C \to \setn C_{\overline{\setn J}} be the coordinate restriction projection. Then the above equations can be rewritten as \pi_{\overline{\setn J}}(\vec c) = - r_{\overline{\setn J}}. And we have

|\setn A_{\setn J}| = \begin{cases} 0 & \text{if } \vec r_{\overline{\setn J}} \notin C_{\overline{\setn J}} \\ \norm{\ker(\pi_{\overline{\setn J}})} & \text{if } \vec r_{\overline{\setn J}} \in C_{\overline{\setn J}} \text{,} \end{cases}

where \norm{\ker(\pi_{\overline{\setn J}})} = q^{\dim \ker(\pi_{\overline{\setn J}})}. Define s_{\!\setn J} = \dim \ker(\pi_{\overline{\setn J}}), then

\A_i = \sum_{\substack{\setn I \subseteq [0,n)\\[.3em] \norm{\setn I} = i}} \sum_{\substack{\setn J \subseteq \setn I \\[.3em] \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}} }} (-1)^{\norm{\setn I} - \norm{\setn J}} q^{s_{\!\setn J}} \text{.}

So the two pieces of information that matter are (for each \setn I \subseteq [0,n)),

  1. s_{\setn I} = \dim \ker(\pi_{\overline{\setn I}}), which depends only on \setn C, and
  2. whether \vec r_{\setn I} \in \setn C_{\setn I}, which additionally depends on the offset \vec r.

A natural next question is: when does s_{\setn J} depend only on the size j = \norm{\setn J} and not on its contents? It turns out this is exactly when \setn C is MDS:

MDS Codes: Coarse and Fine Structure

Assume s_{\setn J} depends only on the size j = \norm{\setn J}. It follows that \dim \ker (\pi_{\setn S}) depends only on \norm{\setn S}, and thus \rank(\pi_{\setn S}) also depends only on \norm{\setn S}. Let f(t) = \rank(\pi_{\setn S}) for any \setn S with \norm{\setn S} = t. Then we must have f(t) \leq \min(t, k), since the image of \pi_{\setn S} is a subspace of \F_q^t and the rank cannot exceed the dimension of \setn C. Recall that \pi simply deletes coordinates, so going from t to t+1 we are adding a single coordinate, hence f(t+1) - f(t) \in \{0,1\}. Moreover, if f(t+1) = f(t) < k for some t, then adding any coordinate to any \setn S of size t does not increase the rank, so every coordinate functional of \setn C lies in the span of those indexed by \setn S, giving f(t) = \rank(\pi_{[0,n)}) = k, a contradiction. It follows that f(t) = \min(t,k), i.e. \pi_{\setn S} is surjective when \norm{\setn S} \leq k. By lemma 4 this is equivalent to \setn C being MDS.

Conversely, when \setn C is MDS, \pi_{\setn I} is surjective for \norm{\setn I} \leq k, and hence \setn C_{\setn I} = \F_q^{\norm{\setn I}}. Therefore for \norm{\setn I} \leq k we have \vec r_{\setn I} \in \setn C_{\setn I} for all \vec r \in \F_q^n. Additionally we have \rank(\pi_{\setn I}) = \dim(\setn C_{\setn I}) = \min(\norm{\setn I}, k), and thus \dim \ker(\pi_{\setn I}) = k - \min(\norm{\setn I}, k).

Take j = \norm{\setn J}, so that \norm{\overline{\setn J}} = n - j, then we have

s_{\!\setn J} = k - \min(n-j, k) = \begin{cases} 0 & \text{if } j \leq n - k \\ j - n + k & \text{if } j \geq n - k \text{.} \end{cases}
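The case split for s_{\!\setn J} can be checked directly on a small MDS code. The sketch below uses a toy [4,2]_5 RS-style code of my own choosing:

```python
# Check that for an MDS code, s_J = dim ker(pi_Jbar) depends only on
# j = |J| and equals max(0, j - n + k). Toy [4,2]_5 RS code.
import itertools

q, n, k = 5, 4, 2
xs = [0, 1, 2, 3]  # distinct evaluation points in F_5
C = [tuple(sum(a * pow(x, e, q) for e, a in enumerate(cf)) % q for x in xs)
     for cf in itertools.product(range(q), repeat=k)]

def s(J):
    Jbar = [t for t in range(n) if t not in J]
    image = {tuple(c[t] for t in Jbar) for c in C}
    # dim ker = log_q(|C| / |image|) by rank-nullity
    size, dim = len(C) // len(image), 0
    while size > 1:
        size //= q
        dim += 1
    return dim

for j in range(n + 1):
    vals = {s(J) for J in itertools.combinations(range(n), j)}
    assert vals == {max(0, j - (n - k))}  # depends only on |J|
```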

The inner sum for \A_i splits into two parts: one for j < n - k, where q^{s_{\!\setn J}} = 1 and \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}} may or may not hold, and one for j \geq n - k, where q^{s_{\!\setn J}} = q^{j - n + k} and \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}} always holds. Hence

\A_i = \overbrace{ \sum_{\substack{\setn I \subseteq [0,n)\\[.3em] \norm{\setn I} = i}} \sum_{\substack{\setn J \subseteq \setn I \\[.3em] j \geq n - k }} (-1)^{i-j} q^{j - n + k} }^{\A_i^\text{coarse}} + \overbrace{ \sum_{\substack{\setn I \subseteq [0,n)\\[.3em] \norm{\setn I} = i}} \sum_{\substack{\setn J \subseteq \setn I \\[.3em] j < n - k \\[.3em] \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}} }} (-1)^{i-j} }^{\A_i^\text{fine}} \text{.}

The first term is independent of \vec r and is responsible for what we call the coarse structure of the coset weight distributions. The second term depends on \vec r and is responsible for the fine structure. Note that the second term does not immediately depend on q (though it may still do so indirectly through the membership condition \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}}).

Let's get the coarse structure term into a closed form. The double sum only depends on sizes i and j, so we can rewrite it as

\A_i^{\text{coarse}} = \binom{n}{i} \sum_{j \in [n - k, i]}(-1)^{i-j} \binom{i}{j} q^{j - n + k} \text{.}

In particular this is zero for i < n - k. The first non-trivial value is at i = n - k, where we have \A_{n-k}^{\text{coarse}} = \binom{n}{k}. We can show that the coarse structure sums to q^k over all weights:

\begin{align*} \sum_{i \in [0,n]} \A_i^{\text{coarse}} &= \sum_{i \in [n-k,n]} \binom{n}{i} \sum_{j \in [n - k, i]}(-1)^{i-j} \binom{i}{j} q^{j - n + k} \\ &= \sum_{j \in [n - k, n]} \sum_{i \in [j,n]} (-1)^{i-j} \binom{n}{i} \binom{i}{j} q^{j - n + k} \\ &= \sum_{j \in [n - k, n]} q^{j - n + k} \binom{n}{j} \sum_{i \in [0,n-j]} (-1)^i \binom{n - j}{i} \\ &= q^k \end{align*}

where we used that \sum_{i \in [0,n-j]} (-1)^i \binom{n - j}{i} = 0 unless j = n, in which case it is 1. Since this matches the total number of vectors in the coset, it follows that the fine structure term sums to zero.
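A quick numerical check of the closed form and its properties (a throwaway sketch; parameters are arbitrary):

```python
# Evaluate the coarse structure closed form and check its properties.
from math import comb

def A_coarse(n, k, q):
    # A_i^coarse = C(n,i) * sum_{j in [n-k, i]} (-1)^(i-j) C(i,j) q^(j-n+k)
    return [comb(n, i) * sum((-1) ** (i - j) * comb(i, j) * q ** (j - n + k)
                             for j in range(n - k, i + 1))
            for i in range(n + 1)]

n, k, q = 8, 3, 11
A = A_coarse(n, k, q)
assert all(a == 0 for a in A[: n - k])  # vanishes below weight n - k
assert A[n - k] == comb(n, k)           # first non-trivial value
assert sum(A) == q ** k                 # sums to the coset size
```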

We can also polish the \A_i^{\text{fine}} term a bit. Define the number of solutions of size j < n - k to the membership condition \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}} as

\n_j(\vec r, \setn C) = \norm{\set{ \setn J \subseteq [0,n) \mid \vec r_{\overline{\setn J} } \in \setn C_{\overline{\setn J}} \wedge \norm{\setn J} = j }} \text{.}

Then we can rewrite the fine structure term as

\A_i^{\text{fine}} = \sum_{j \in [0,\min(i, n-k-1)]} (-1)^{i-j} \binom{n-j}{i-j} \n_j \text{.}

I said "no ratios", but this one is worth it: if instead of the counts \n_j we use the fractions p_j = \n_j / \binom{n}{j} \in [0,1], we can use the identity \binom{n-j}{i-j} \binom{n}{j} = \binom{n}{i} \binom{i}{j} to rewrite the fine structure term as

\A_i^{\text{fine}} = \binom{n}{i} \sum_{j \in [0,\min(i, n-k-1)]} (-1)^{i-j} \binom{i}{j} p_j \text{.}

For i < n-k this is the binomial transform of the sequence p_j. It also shows the similarity with the coarse structure term. In fact, the sequence \A_i is simply a binomial transform of the sequence [p_0, p_1, \dots, p_{n-k-1}, 1, q^1, q^2, \dots, q^k] multiplied by factors (-1)^i\binom{n}{i}. It's useful to write down the reverse transformation as well, i.e. computing \n_j from \A_i^{\text{fine}} for j \in [0,n-k):

\n_j = \sum_{i \in [0, j]} \binom{n-i}{n-j} \A_i^{\text{fine}} \text{.}
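The two transforms can be checked to be mutual inverses numerically (a small sketch with arbitrary n, k and random counts):

```python
# Verify that the forward transform (n_j -> A_i^fine) and the reverse
# transform (A_i^fine -> n_j) are mutual inverses.
from math import comb
import random

n, k = 10, 4

def fine_from_counts(ns):
    # A_i^fine = sum_{j <= i} (-1)^(i-j) C(n-j, i-j) n_j, for i < n-k
    return [sum((-1) ** (i - j) * comb(n - j, i - j) * ns[j]
                for j in range(i + 1))
            for i in range(n - k)]

def counts_from_fine(fine):
    # n_j = sum_{i <= j} C(n-i, n-j) A_i^fine
    return [sum(comb(n - i, n - j) * fine[i] for i in range(j + 1))
            for j in range(n - k)]

random.seed(0)
ns = [random.randrange(comb(n, j) + 1) for j in range(n - k)]
assert counts_from_fine(fine_from_counts(ns)) == ns
```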

So for a given MDS code \setn C, the coset weight distributions are completely determined by the first n - k weight counts \A_i, solution counts \n_j, or fractions p_j (or some combination thereof). Now the task is to find these for all cosets \vec r + \setn C.

Fine Structure of the Trivial Coset

Let's solve the fine structure for \vec r = \vec 0. This is the trivial coset since \vec r + \setn C = \setn C is the code itself. In this case \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}} always holds, so we get p_j = 1 and thus

\begin{align*} \A_i^{\text{fine}} &= \binom{n}{i} \sum_{j \in [0,\min(i, n-k-1)]} (-1)^{i-j} \binom{i}{j} \text{.} \end{align*}

To make this resemble the \A_i^\text{coarse} we will reindex the sum. From the binomial identity \sum_{j\in[0,i]} (-1)^j \binom{i}{j} = 0 we have for i \geq n-k that \sum_{j \in [0,n-k)} (-1)^{i-j} \binom{i}{j} = - \sum_{j \in [n-k, i]} (-1)^{i-j} \binom{i}{j}, so we can rewrite as

\A_i^{\text{fine}} = - \binom{n}{i} \sum_{j \in [n-k, i]} (-1)^{i-j} \binom{i}{j}

Now the summation matches and we can combine the sums to get a nice closed form for the weight distribution of the MDS code itself:

\A_i(\setn C) = \binom{n}{i} \sum_{j \in [n-k, i]} (-1)^{i-j} \binom{i}{j} (q^{j-n+k} - 1) \text{.}

This is exactly the known weight distribution of MDS codes, e.g. Huffman-Pless theorem 7.4.1 (valid for i \geq 1; \A_0 = 1 is accounted for separately). A good sanity check!
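To back the sanity check with numbers, here is a brute-force comparison against a small Reed-Solomon code (a toy [4,2]_5 example of my choosing; the closed form is applied for i \geq 1, with \A_0 = 1):

```python
# Compare the closed-form MDS weight distribution against brute-force
# enumeration of a [4,2]_5 Reed-Solomon code.
import itertools
from math import comb

q, n, k = 5, 4, 2
xs = [0, 1, 2, 3]  # distinct evaluation points in F_5

A_brute = [0] * (n + 1)
for cf in itertools.product(range(q), repeat=k):
    c = [sum(a * pow(x, e, q) for e, a in enumerate(cf)) % q for x in xs]
    A_brute[sum(v != 0 for v in c)] += 1

def A_mds(i):  # closed form, valid for i >= 1
    return comb(n, i) * sum((-1) ** (i - j) * comb(i, j) * (q ** (j - n + k) - 1)
                            for j in range(n - k, i + 1))

assert A_brute == [1] + [A_mds(i) for i in range(1, n + 1)]
```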

Alternatively, we can solve the fine structure term explicitly using the binomial identity

\sum_{j\in[0,m]} (-1)^j \binom{r}{j} = (-1)^m \binom{r-1}{m} \text{,} \qquad \text{valid for all } m \geq 0 \text{ (in particular it is } 1 \text{ for } m = 0 \text{ and } 0 \text{ for } m = r \geq 1 \text{).}

Applying this with m = \min(i, n-k - 1) and r = i gives the closed form

\A_i^{\text{fine}} = \begin{cases} 1 & \text{if } i = 0 \\ 0 & \text{if } 0 < i < n-k \\ (-1)^{n - k -1 + i} \binom{n}{i} \binom{i - 1}{n - k -1} & \text{if } i \geq n - k \text{.} \end{cases}

The first non-trivial value after i = 0 happens at i = n - k, where we have \A_{n-k}^{\text{fine}} = -\binom{n}{k}, exactly cancelling \A_{n-k}^{\text{coarse}} = \binom{n}{k} as required by the minimum distance d = n - k + 1. From there on the values alternate in sign.

The set of \vec r which produce the trivial coset is exactly \setn C itself, which has size q^k. Recall \A_0(\vec r + \setn C) = \norm{ \{ \vec v \in (\vec r + \setn C) \mid \wt(\vec v) = 0 \} } and the only solution to \wt(\vec v) = 0 is the zero vector \vec v = \vec 0. So \A_0(\vec r + \setn C) = 1 if and only if \vec r \in \setn C. It follows that the only coset with \A_0 > 0 is the trivial coset. This leaves only n - k - 1 degrees of freedom for the fine structure of the other cosets.

Fine Structure of the MDS Coset

Now let's solve it for some offset \vec r such that \span(\vec r + \setn C) is also MDS. To show that such an \vec r always exists for [n,k]_q-RS codes with k < n, observe that there is a \vec r \notin \setn C such that \vec r is the evaluation of a degree k polynomial at the evaluation points of the RS code. The span \setn D = \span(\vec r + \setn C) is then a [n,k+1]_q-RS code on the same evaluation points, which is also MDS.

We need to solve the membership condition \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}}. From the MDS properties we have

\begin{align*} \rank\p{\pi_{\overline{\mathcal J}}^{(\setn C)}} &= \min(k, n-j) \\ \rank\p{\pi_{\overline{\mathcal J}}^{(\setn D)}} &= \min(k+1, n-j) \text{.} \end{align*}

We also have \setn D_{\overline{\setn J}} = \span(\vec r_{\overline{\setn J}} + \setn C_{\overline{\setn J}}), so

\begin{align*} \vec r_{\overline{\setn J}} \in \setn C_{\overline{\setn J}} &\iff \setn D_{\overline{\setn J}} = \setn C_{\overline{\setn J}} \\ &\iff \rank\p{\pi_{\overline{\mathcal J}}^{(\setn D)}} = \rank\p{\pi_{\overline{\mathcal J}}^{(\setn C)}} \\ &\iff \min(k, n-j) = \min(k+1, n-j) \\ &\iff j \geq n - k \text{.} \end{align*}

So for j < n - k we have p_j = 0. This gives \A_i^{\text{fine}} = 0 for all i, and thus the weight distribution of the MDS coset is exactly \A_i^{\text{coarse}}.
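This can be confirmed numerically: take a small RS code and the coset of x^k (a sketch using the same toy [4,2]_5 example as my other checks):

```python
# Check that the MDS coset (offset by evaluations of x^k) has weight
# distribution exactly A_i^coarse. Toy [4,2]_5 RS code.
import itertools
from math import comb

q, n, k = 5, 4, 2
xs = [0, 1, 2, 3]
C = [tuple(sum(a * pow(x, e, q) for e, a in enumerate(cf)) % q for x in xs)
     for cf in itertools.product(range(q), repeat=k)]
r = tuple(pow(x, k, q) for x in xs)  # span(r + C) is a [4,3]_5 RS code

A = [0] * (n + 1)
for c in C:
    A[sum((ci + ri) % q != 0 for ci, ri in zip(c, r))] += 1

coarse = [comb(n, i) * sum((-1) ** (i - j) * comb(i, j) * q ** (j - n + k)
                           for j in range(n - k, i + 1))
          for i in range(n + 1)]
assert A == coarse
```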

Conjecture. The MDS coset maximizes \A_{n-k} among all cosets.

Fine Structure of the Weight One Coset

A weight-one coset is one whose minimum weight vector \vec r has \wt(\vec r) = 1. For i < n-k the only coset vector of weight i is \vec r itself: any other \vec c + \vec r of weight i would give a codeword \vec c of weight at most i + 1 \leq n - k < d. So \A_i^{\text{fine}} = [0, 1, 0, \dots, 0] for i < n-k. This gives \n_j = \binom{n-1}{n-j} and thus

\A_i^{\text{fine}} = \sum_{j \in [0,\min(i, n-k-1)]} (-1)^{i-j} \binom{n-j}{i-j} \binom{n-1}{n-j} \text{.}

\A_i^{\text{fine}} = \begin{cases} 0 & \text{if } i = 0 \\ 1 & \text{if } i = 1 \\ 0 & \text{if } 1 < i < n-k \\ (-1)^{n - k + 1 + i} \binom{n-1}{i - 1} \binom{i-2}{n - k - 2} & \text{if } i \geq n - k \text{.} \end{cases}

A useful result is the value at i = n - k:

\A_{n-k}^{\text{fine}} = -\binom{n - 1}{k} \text{,}

so that

\A_{n-k} = \binom{n}{k} - \binom{n-1}{k} = \binom{n-1}{k-1} \text{.}
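These values can be verified on the same toy RS code (a sketch; the representative (1, 0, 0, 0) is an arbitrary weight-one vector outside the code):

```python
# Check A_i of a weight-one coset: A_i = [0, 1] below weight n-k, and
# A_{n-k} = C(n-1, k-1). Toy [4,2]_5 RS code.
import itertools
from math import comb

q, n, k = 5, 4, 2
xs = [0, 1, 2, 3]
C = [tuple(sum(a * pow(x, e, q) for e, a in enumerate(cf)) % q for x in xs)
     for cf in itertools.product(range(q), repeat=k)]
r = (1, 0, 0, 0)  # weight-one vector, not a codeword
assert r not in C

A = [0] * (n + 1)
for c in C:
    A[sum((ci + ri) % q != 0 for ci, ri in zip(c, r))] += 1

assert A[: n - k] == [0, 1]            # only the coset leader below n - k
assert A[n - k] == comb(n - 1, k - 1)  # = 3 here
```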

Bounds

For an MDS code we only need to consider i < n - k, since those determine \A_i for all i. From the definitions of \A_i and \n_j it follows immediately that, for all i,j \in [0,n]:

\begin{align*} 0 &\leq \A_i \leq \binom{n}{i} (q-1)^i & 0 &\leq \n_j \leq \binom{n}{j} \text{.} \end{align*}

We also have linear relationships between the \A_i and the \n_j from the MDS structure formulas. We can combine all this to derive bounds on \n_j and \A_i.

To do. This actually gives stronger upper bounds up to capacity! The transformation rules and non-negativity requirements give stronger upper bounds than the trivial ones above.

Given r \in [0,n], we want to maximize \sum_{i \in [0,r]} \A_i subject to the constraints above. We treat the \n_j as our unknowns, but as we will see, it is sufficient to consider only the \A_i for i \leq n - k to derive upper bounds. Take \mat M \in \mathbb{Z}^{(n-k+1)\times (n-k)} the matrix representing the linear transformation from \n_j to \A_i^{\text{fine}} given by

\begin{bmatrix} 1\ \, \\ -\binom{n}{1} & 1\ \ \\ \phantom{-}\binom{n}{2} & -\binom{n-1}{1} & 1\ \, \\ -\binom{n}{3} & \phantom{-}\binom{n-1}{2} & -\binom{n-2}{1} & 1\ \, \\ \vdots & \vdots & \vdots & \vdots & \ddots \\ \pm\binom{n}{n-k-1} & \mp\binom{n-1}{n-k-2} & \pm\binom{n-2}{n-k-3} & \mp\binom{n-3}{n-k-4} & \cdots & 1\ \, \\ \mp\binom{n}{n-k} & \pm\binom{n-1}{n-k-1} & \mp\binom{n-2}{n-k-2} & \pm\binom{n-3}{n-k-3} & \cdots & \pm\binom{k + 1}{1}\\ \end{bmatrix}

such that \A = \A^{\text{coarse}} + \mat M \n, where \A = (\A_0, \dots, \A_{n-k}) and \n = (\n_0, \dots, \n_{n-k-1}). Note that for i < n-k we have \A_i^{\text{coarse}} = 0 and \A_{n-k}^{\text{coarse}} = \binom{n}{k}. We can then formulate the following integer linear optimization problem that bounds the list decoding size \sum_{i \in [0,r]} \A_i for r \leq n-k for any coset of any [n,k]_q-MDS code:

\begin{align*} &\max_{\n}\ [\ \overbrace{1 \, 1 \, \cdots \, 1}^{r+1} \, 0 \, \cdots 0\ ]\, \mat M \n \\ &\text{s.t. } \begin{bmatrix} \mat I_{n-k} \\ \mat M \end{bmatrix} \n \geq \begin{bmatrix} \vec 0 \\ -\binom{n}{k} \\ \end{bmatrix} \text{.} \end{align*}

The constraints only enforce the non-negativity of \n_j and \A_i. We will relax the integrality constraint to get an upper bound. We claim that the active set of the solution is \n_j \geq 0 for all j \in [0,r) and \A_i \geq 0 for all i \neq r (we will prove this later). From this follows \n_j = 0 for all j < r and \A_i = 0 for all i \neq r. From the transform then follows \A_r = \n_r. The remaining constraints then are

\begin{bmatrix} -\binom{n-r}{1} & 1\ \, \\ \phantom{-}\binom{n-r}{2} & -\binom{n-r-1}{1} & 1\ \, \\ \vdots & \vdots & \vdots & \ddots \\ \pm\binom{n-r}{n-k-1-r} & \mp\binom{n-r-1}{n-k-2-r} & \pm\binom{n-r-2}{n-k-3-r} & \cdots & 1\ \, \\ \mp\binom{n-r}{n-k-r} & \pm\binom{n-r-1}{n-k-1-r} & \mp\binom{n-r-2}{n-k-2-r} & \cdots & -\binom{k + 1}{1}\\ \end{bmatrix} \begin{bmatrix} \n_r \\ \n_{r+1} \\ \vdots \\ \n_{n-k-1} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ -\binom{n}{k} \\ \end{bmatrix}

Expressed in terms of the \A_i via the reverse transform, the non-negativity of \n_j reads

0 \leq \n_j = \sum_{i \in [0,j]} \binom{n-i}{n-j} \p{\A_i - \A_i^{\text{coarse}}} \text{.}

From \n_{n-k} \geq 0, and \A_{n-k}^{\text{coarse}} = \binom{n}{k} (the other \A_i^{\text{coarse}} being zero), and the reverse transform we get the bound

\sum_{i \in [0,n-k]} \binom{n-i}{k} \A_i \geq \binom{n}{k} \text{.}

So we want to solve the relaxed linear optimization problem

\begin{align*} \max_{\A_0,\dots,\A_{n-k}} &\sum_{i \in [0,r]} \A_i \\ \text{s.t. } & \A_i \geq 0 \quad (0 \leq i \leq n-k) \\ &\sum_{i \in [0,n-k]} \binom{n-i}{k} \A_i \geq \binom{n}{k} \text{.} \\ \end{align*}

This bound only assumes \setn C is [n,k]-MDS, and does not use any properties of specific codes. Notably it does not depend on q.

We can get a tighter bound by assuming that, among non-trivial cosets, the weight-one coset minimizes \A_{n-k}.

The likely provable q-independent upper bound is

\frac{\binom{n}{k}}{\binom{n-r}{k}} = \frac{(n)_r}{(n-k)_r} \text{,}

where (n)_r = n (n-1) (n-2) \cdots (n-r+1) is the falling factorial.
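The binomial-ratio and falling-factorial forms agree, as a quick exact-arithmetic check shows (arbitrary parameters):

```python
# Check that C(n,k)/C(n-r,k) equals the falling factorial ratio
# (n)_r / (n-k)_r, exactly, for a few arbitrary parameter choices.
from math import comb
from fractions import Fraction

def falling(a, r):
    out = 1
    for t in range(r):
        out *= a - t
    return out

for n, k, r in [(12, 5, 3), (16, 4, 6), (10, 2, 7)]:
    assert r < n - k  # the bound regime from the text
    assert Fraction(comb(n, k), comb(n - r, k)) == \
           Fraction(falling(n, r), falling(n - k, r))
```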

For large n with fixed rate \rho = k/n and fixed relative distance \delta = r/n < (1-\rho) the upper bound on the list decoding size for MDS codes becomes

\log \L_{\delta n} \leq \log \frac{\binom{n}{\rho\, n}}{\binom{n-\delta\, n}{\rho\, n}} = n \p{\H(\rho) - (1-\delta) \H\p{\frac{\rho}{1-\delta}}} + O(1)

The more ambitious weight-one-coset q-independent upper bound is

\frac{\binom{n-1}{k-1}}{\binom{n-r}{k}} = \frac{k}{n} \frac{(n)_r}{(n-k)_r} \text{.}

For large n with fixed rate \rho = k/n and fixed relative distance \delta = r/n < (1-\rho) the upper bound on the list decoding size for MDS codes becomes

\log \L_{\delta n} \leq \log \frac{\binom{n-1}{\rho\, n -1}}{\binom{n-\delta\, n}{\rho\, n}} = \log \rho + n \p{\H(\rho) - (1-\delta) \H\p{\frac{\rho}{1-\delta}}} + O(1)

Previous Bounds

MDS codes from Cauchy matrices: https://ieeexplore.ieee.org/document/45291

https://people.seas.harvard.edu/~salil/research/list-lb-Jan10.pdf

The Unique Decoding Bound

\sum_{i \in [0, \frac{d-1}{2}]} \A_i \leq 1

The Johnson Bound

For all \epsilon \in (0, 1] and

r \leq n \p{1 - \frac{1}{q}} \p{1 - \sqrt{1 - (1- \epsilon)\frac{q}{q-1} \frac{d}{n}}}

we have

\sum_{i \in [0, r]} \A_i \leq \frac{1}{\epsilon}

To get the tightest bound for r we want to find the largest \epsilon \in (0,1] such that the above holds. We can solve the inequality to get

\epsilon = 1- \frac{q-1}{q} \frac{n}{d}\p{1- \p{1 - \frac{r}{n \p{1 - \frac{1}{q}}}}^2} \text{.}
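A small helper evaluating this \epsilon (and the resulting 1/\epsilon list size bound); a sketch, with parameters picked arbitrarily:

```python
# Johnson bound: largest eps such that radius r qualifies; the list
# size within radius r is then at most 1/eps.
def johnson_eps(n, d, q, r):
    theta = 1 - r / (n * (1 - 1 / q))
    return 1 - (q - 1) / q * (n / d) * (1 - theta ** 2)

n, k, q = 16, 4, 17
d = n - k + 1
assert johnson_eps(n, d, q, 0) == 1.0    # radius 0: list size at most 1
eps = johnson_eps(n, d, q, (d - 1) // 2)  # unique decoding radius
assert 0 < eps <= 1
# eps shrinks (and the list size bound 1/eps grows) as r grows
es = [johnson_eps(n, d, q, r) for r in range(7)]
assert all(a >= b for a, b in zip(es, es[1:]))
```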

Elias' Averaging Lemma

Huffman-Pless lemma 2.5.2.

For every r \in [0, n], there exists an \vec r such that

\sum_{i \in [0, r]} \A_i(\vec r + \setn C) \geq \frac{V_q(n, r)}{q^{n - k}}

where

V_q(n, r) = \sum_{i \in [0, r]} \binom{n}{i} (q-1)^i
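The averaging lower bound is easy to tabulate (sketch; arbitrary parameters):

```python
# Elias averaging: some coset has at least V_q(n,r)/q^(n-k) vectors of
# weight at most r.
from math import comb

def ball(q, n, r):  # V_q(n, r), the Hamming ball volume
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

q, n, k = 5, 8, 3
assert ball(q, n, n) == q ** n        # radius n covers the whole space
for r in range(n + 1):
    lower = ball(q, n, r) / q ** (n - k)
    assert lower <= q ** k            # never exceeds the coset size
```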

Justesen and Høholdt

https://www.math.uci.edu/~dwan/listcode.pdf

For any g < n there exists a coset such that

L_{n -g} \geq \frac{\binom{n}{g}}{q^{g-k}} \text{.}

Equivalently, substituting r = n - g: for any r < n there exists a coset such that

L_{r} \geq \frac{\binom{n}{r}}{q^{n-k-r}} \text{.}


https://ieeexplore.ieee.org/document/923744

https://www.sciencedirect.com/science/article/abs/pii/S0097316523001036 https://arxiv.org/pdf/2112.05592

Numerical Experiments

What we know and suspect:

Since the list decoding sizes are tightly bounded by the allowable set of weight distributions, we should focus on finding this universal list of weight distributions. For this we suspect:

Next steps.

https://ar5iv.labs.arxiv.org/html/2101.12722

https://dr.ntu.edu.sg/server/api/core/bitstreams/f6ca9d42-8532-42fa-8a1a-e28f5655c1d1/content

https://www.cs.utexas.edu/~danama/courses/codes/lec8-LP-bound.pdf


Lemma 2. The set of transformations that preserve the \F_q^n-linear structure and the Hamming weight (equivalently, the Hamming distance) is exactly the monomial transformations together with global \F_q-automorphisms. That is, for any \vec{v} \in \F_q^n, monomial matrix \mat{M} \in \operatorname{Mon}(n, \F_q), and field automorphism \sigma \in \Aut(\F_q), we have

\wt(\mat{M} \cdot \sigma(\vec{v}) ) = \wt(\vec{v}) \text{.}

The group of all such weight and distance preserving transformations is denoted

\Iso\!\p{\F_q^n} = \operatorname{Mon}(n, \F_q) \rtimes \Aut(\F_q) \text{.}

Recall that a monomial matrix is a square matrix with exactly one non-zero entry in each row and column, or equivalently, the product of a permutation matrix and an invertible diagonal matrix. For finite fields \F_{p^m} the automorphisms are precisely the m powers of the Frobenius automorphism \sigma(x) = x^p. For prime fields m = 1 and the only automorphism is the identity. The automorphism \sigma is applied coordinate-wise to higher-dimensional objects.

Definition 4. A Reed-Solomon (RS) code \operatorname{RS}_q(n, k, \vec{x}) with \vec{x} \in \F_q^n all distinct is generated by the n \times k matrix \mat G_{ij} = x_i^j. We can extend it with an additional evaluation point "at infinity" that has powers x^j = 0 for j < k - 1 and x^{k-1} = 1.

By monomial equivalence, permuting the evaluation points does not materially change the code. We can also scale the evaluation points by non-zero \lambda_i \in \F_q^\times to get a more general generator matrix \mat G_{ij} = \lambda_i x_i^j.

The \vec{\lambda} doesn't add much, other than making the class closed under monomial equivalence. Typically the x_i's form (a coset of) a subgroup of \F_q^\times to allow fast NTT algorithms, but the only requirement is that they are distinct. It is popular to interpret Reed–Solomon codes as evaluations of low-degree polynomials, but we will avoid introducing even more terminology here.

Absolute chaos below

References

https://arxiv.org/abs/2101.12722

https://errorcorrectionzoo.org/kingdom/q-ary_digits_into_q-ary_digits

https://errorcorrectionzoo.org/c/reed_solomon https://rptu.de/channel-codes

https://dspace.mit.edu/bitstream/handle/1721.1/4484/RLE-TR-335-04734756.pdf

Remco Bloemen
Math & Engineering
https://2π.com