Paper TLDR Project

$$ \def\field{\mathbb{F}} \def\ZZ{\mathbb{Z}} \def\eq{\text{eq}} \def\enc{\text{Enc}} \def\com{\text{Com}} \def\ring{\mathcal{R}} \def\distro{\mathcal{D}} \def\dist{\Delta} \def\bc{\pmb{c}} \def\be{\pmb{e}} \def\bm{\pmb{m}} \def\br{\pmb{r}} \def\bt{\pmb{t}} \def\bu{\pmb{u}} \def\bv{\pmb{v}} \def\bw{\pmb{w}} \def\bx{\pmb{x}} \def\by{\pmb{y}} \def\bz{\pmb{z}} \def\zo{{0,1}} \def\eq{{\rm eq}} \def\calR{\mathcal{R}} \def\calS{\mathcal{S}} \def\eps{\varepsilon} $$

This project serves as an intermediate index into my memory of paper reading and notes.

It also serves as a humble mimic to Oded’s choices, or my investment portfolio.


[Seen 2023/11 for Spartan line of work to zkVMs and folding schemes]

Spartan: Efficient and general-purpose zkSNARKs without trusted setup
  • To bridge R1CS with the sum-check protocol, Setty et al. encode the R1CS matrices and witness via low-degree extensions (LDEs) into multilinear polynomials.
  • The Spartan NIZK version requires the verifier to evaluate the encoded R1CS polynomials at random points, so they apply polynomial commitments and let the prover supply (and prove) those evaluations for the verifier.
  • R1CS matrices are sparse, i.e., a matrix in $\field^{m \times m}$ typically has only $n = O(m)$ non-zero entries. They show a gradual reduction in prover complexity:
    • An $O(m^2)$ solution directly commits to the multilinear LDE of the full $m \times m$ matrix.
    • An $O(n \log n)$ “strawman” solution takes the LDE of tuples of matrix coordinates ($\field^\mu$ where $\mu = 2s$, $s = \log m$) and the non-zero entries in $\field$.
      • In total this costs $O(n \log n)$, where the extra $\log n$ factor comes from the $\field^\mu$ coordinate components.
    • Finally, an $O(n)$ solution compresses the sparse matrix in $\field^{m \times m}$ into an $n$-length list of (row, col, val) tuples, then uses the offline memory checking technique from Spice to construct an $O(n)$-sized, $O(\log n)$-depth circuit.
      • Offline memory checking assigns timestamps to the accesses of the non-zero entries; the invariant is that init + write = read + audit as multisets.
      • A public-coin multiset hash function checks the equality of these multisets of (coordinate, value, timestamp) tuples (see the sketch below).
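
A minimal Python sketch of the init + write = read + audit invariant with a toy public-coin multiset hash (the field, hash shape, and read-only memory access pattern are my own simplifications, not Spice's exact construction):

```python
# Toy offline memory checking: pack (addr, val, ts) tuples with h_gamma and
# fingerprint multisets with H_beta; honest replays satisfy init + write = read + audit.
import random

P = 2**61 - 1  # a prime standing in for the field

def h_gamma(addr, val, ts, gamma):
    return (addr + gamma * val + gamma * gamma * ts) % P

def H_beta(multiset, beta):
    out = 1
    for s in multiset:
        out = out * ((s - beta) % P) % P
    return out

def simulate(memory, reads):
    """Replay read-only accesses, producing the init/read/write/audit multisets."""
    ts = {a: 0 for a in memory}            # per-cell timestamps
    init = [(a, memory[a], 0) for a in memory]
    RS, WS = [], []
    for a in reads:
        RS.append((a, memory[a], ts[a]))   # read the current (val, ts)
        ts[a] += 1                         # write back the same value with ts + 1
        WS.append((a, memory[a], ts[a]))
    audit = [(a, memory[a], ts[a]) for a in memory]
    return init, RS, WS, audit

memory = {0: 7, 1: 11, 2: 13}
init, RS, WS, audit = simulate(memory, reads=[1, 2, 1, 0, 1])

gamma, beta = random.randrange(P), random.randrange(P)
lhs = [h_gamma(*t, gamma) for t in init + WS]   # init + write
rhs = [h_gamma(*t, gamma) for t in RS + audit]  # read + audit
assert H_beta(lhs, beta) == H_beta(rhs, beta)   # equal multisets; forgery caught w.h.p.
```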

The hardest part to comprehend (for me) is the memory checking circuit for $\tilde{M}(\pmb{r_x},\pmb{r_y})$ in Section 7, for the SPARK compiler that converts a polynomial commitment into one that handles sparse multilinear polynomials.

Nova: Recursive Zero-Knowledge Arguments from Folding Schemes
  • The motivation: the vanilla R1CS relation does not allow merging 2 statement-witness pairs into one, and naive constructions of IVC are far from practical.
  • The plain R1CS relation is $(A, B, C \in \field^{m \times m}, \bz \in \field^m)$ where $\bz = 1 \parallel \bx \parallel \bw \in \field^m$ ($\bx$ the public IO, $\bw$ the witness), such that $(A \bz) \circ (B \bz) = C\bz$.
  • To make the relation foldable, we introduce relaxation terms $u \in \field$ and $\be \in \field^m$, so a relaxed instance satisfies $(A\bz) \circ (B\bz) = u \cdot (C\bz) + \be$; folding sets $\bz \gets \bz_1 + r \bz_2$, $u \gets u_1 + r u_2$, and absorbs the cross term $\bt$ into $\be \gets \be_1 + r \bt + r^2 \be_2$ (see the sketch after this list).
    • Additionally, we want to commit to the R1CS witness $\bw, \be$ with an additive homomorphic commitment as $\com_w$ and $\com_e$.
  • To make use of folding scheme, or non-interactive folding scheme (NIFS) in IVC, we want to let the circuit to fold 2 R1CS statement into 1:
    • An R1CS statement contains inputs and outputs, so it would grow as the steps accumulate, while we need the statement to stay constant-sized.
    • To keep the R1CS statement size constant, they use a hash of the inputs as the R1CS statement.
    • Such a hash can be viewed as a “constant-sized computationally binding” handle to the inputs we want to use at each step.
  • Chaining the folding components in series yields an IVC, where at each step a NIFS prover keeps track of the relaxation terms and the IVC proof, which is the running R1CS witness.
  • To compress the IVC proof, whose size equals that of an R1CS statement-witness pair, they apply Spartan to generate a proof of knowledge of the IVC proof, which compresses the concrete proof size.
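
A minimal Python sketch of folding two relaxed R1CS instances (commitments and the actual challenge derivation are omitted; the matrices, field size, and the single toy constraint are made up for illustration):

```python
# Toy folding of two *relaxed* R1CS instances (A z) o (B z) = u * (C z) + e
# over a small prime field; only the algebraic folding is shown.
import random

P = 97

def matvec(M, z):
    return [sum(m * x for m, x in zip(row, z)) % P for row in M]

def hadamard(a, b):
    return [(x * y) % P for x, y in zip(a, b)]

def satisfies(A, B, C, z, u, e):
    lhs = hadamard(matvec(A, z), matvec(B, z))
    rhs = [(u * c + ei) % P for c, ei in zip(matvec(C, z), e)]
    return lhs == rhs

def fold(A, B, C, inst1, inst2, r):
    (z1, u1, e1), (z2, u2, e2) = inst1, inst2
    az1, bz1, cz1 = matvec(A, z1), matvec(B, z1), matvec(C, z1)
    az2, bz2, cz2 = matvec(A, z2), matvec(B, z2), matvec(C, z2)
    # cross term, absorbed into the folded error vector
    T = [(a1 * b2 + a2 * b1 - u1 * c2 - u2 * c1) % P
         for a1, b1, c1, a2, b2, c2 in zip(az1, bz1, cz1, az2, bz2, cz2)]
    z = [(x + r * y) % P for x, y in zip(z1, z2)]
    u = (u1 + r * u2) % P
    e = [(x + r * t + r * r * y) % P for x, t, y in zip(e1, T, e2)]
    return z, u, e

# z = (1, x, w) satisfying the single constraint x * w = x  (m = 3)
A = [[0, 1, 0]]; B = [[0, 0, 1]]; C = [[0, 1, 0]]
i1 = ([1, 5, 1], 1, [0])          # strict instance: u = 1, e = 0
i2 = ([1, 9, 1], 1, [0])
assert satisfies(A, B, C, *i1) and satisfies(A, B, C, *i2)

r = random.randrange(1, P)
assert satisfies(A, B, C, *fold(A, B, C, i1, i2, r))
```

The same folding applies to the committed form of $\bw$ and $\be$, since the commitments are additively homomorphic.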

The trickiest part is using the hash of the R1CS statements of the folding scheme as the R1CS statement, which keeps the statement size constant.

Algebraic Reductions of Knowledge

TODO: SuperNova: Proving universal machine executions without universal circuits

TODO: HyperNova: Recursive arguments for customizable constraint systems

Unlocking the lookup singularity with Lasso
  • One thing I kinda regret is that, I didn’t read Spartan really carefully.
    • I thought Spartan-Spark is a compiler that compiles a multilinear PCS into a sparse multilinear PCS, but actually it is more of a “computation commitment”.
    • It is the Spartan verifier that encodes the sparse R1CS matrix into row/col/val and “metadata” for memory checking.
  • Noticing that for a multilinear poly $f(\bx) = \sum_{\br \in \zo^{\log N}} f_{\br} \cdot \tilde{\eq}(\br, \bx)$, if it is an $m$-sparse poly, then $f(\bx) = \sum_{\br \in \zo^{\log m}} {\rm val}(\br) \cdot \prod_{i = 1}^c \tilde{\eq}({\rm bits} ({\rm tableIndex}_i(\br)), \bx_i)$ (see the sketch after this list).
  • In the commit phase, one commits to $\rm val$ and the ${\rm tableIndex}_i$'s, in total $c + 1$ $\log m$-variate multilinear polys, as the dense representation of $f$.
  • In the proving phase, given the evaluation point $\bx$, the prover commits to
    • $c$ $\log m$-variate polys $\tilde{\eq}({\rm bits} ({\rm tableIndex}_i(\br)), \bx_i)$ for table looked up values
    • $c$ $\log m$-variate polys for $m$ time stamps on accessing each table cell value
    • $c$ $\log N^{1/c}$-variate polys for $N^{1/c}$ time stamps for final table cell time stamps
  • The memory checking invariant is ${\rm WS} = {\rm RS} \cup S$ as multisets, where in the LDEs the write timestamps in $\rm WS$ equal the corresponding read timestamps in $\rm RS$ plus one.
  • Letting a malicious prover commit to this memory-checking metadata is justified by Claim 2 of the paper: a dishonest memory cannot forge an audit set $S$ that fools the checker.
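
A toy Python check (with made-up parameters, not Lasso's) that the sparse representation above agrees with the dense multilinear evaluation:

```python
# m-sparse multilinear polynomial over log N variables, evaluated as
#   sum over nonzero entries of val * prod_i eq(bits_i(index), x_i),
# with the index split into c chunks; compared against the dense sum.
from itertools import product
from fractions import Fraction as F

def eq(bits, xs):
    out = F(1)
    for b, x in zip(bits, xs):
        out *= b * x + (1 - b) * (1 - x)
    return out

def bits_of(k, n):
    return [(k >> i) & 1 for i in range(n)]

log_N, c = 6, 2                              # N = 64 entries, split into c chunks
chunk = log_N // c
entries = {3: F(7), 17: F(2), 40: F(5)}      # m = 3 nonzero entries

x = [F(1, 3), F(2, 5), F(1, 7), F(3, 4), F(1, 2), F(2, 3)]   # evaluation point
xs = [x[i * chunk:(i + 1) * chunk] for i in range(c)]

# dense evaluation: sum over all N hypercube points
dense = sum(entries.get(k, F(0)) * eq(bits_of(k, log_N), x) for k in range(2 ** log_N))

# sparse evaluation: only the m nonzero entries contribute
sparse = F(0)
for k, val in entries.items():
    idx_bits = bits_of(k, log_N)
    term = val
    for i in range(c):
        term *= eq(idx_bits[i * chunk:(i + 1) * chunk], xs[i])
    sparse += term

assert dense == sparse
```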

TODO: Jolt: SNARKs for Virtual Machines via Lookups


[Seen 2023/11 for PlonK line of work]

see notes here

TODO: Efficient Zero-Knowledge Argument for Correctness of a Shuffle

PlonK: Permutations over Lagrange-bases for Oecumenical Non-Interactive arguments of Knowledge
  • A permutation argument checks that one polynomial is a permutation of another via $f(\sigma(x)) = g(x)$.
    • The permutation argument in Plonk is instantiated as a grand product argument, constructing $z(x)$ progressing over the domain $g^0, \dots, g^n$ (see the sketch after this list).
    • Each step multiplies $z(x)$ by a factor $f(x) + \beta x + \gamma$ and divides by $g(x) + \beta \cdot \sigma(x) + \gamma$.
    • At the end of the progression all factors should cancel, i.e., $z(g^n) = 1$.
    • $\gamma$ is necessary as a randomized shift for multiset equality, while $\beta$ forces the prover to stick to the same permutation $\sigma$, which upgrades the multiset argument into a permutation argument.
  • Another perspective of constraint system is that, view computation circuit as $n$ gates and $m$ wires, then the wire values attached to gates as input/output should be equal.
    • This yields the motivation of introducing the copy constraints, which is established by permutation argument.
  • A trick to combine multiple zeroity checks is to take a random linear combination of the zeroity terms with powers of a random $\alpha \in \field$.
    • The same trick applies to Spartan in sumcheck on multiple terms.
  • The KZG polynomial commitment supports degrees up to $n$ fixed by the CRS; linearization allows one to open a polynomial that is a product of sub-polynomials each of degree lower than $n$.
    • Giving the verifier evaluations of a subset of the sub-polynomials, and substituting those evaluations for the corresponding sub-polynomials in the original polynomial, we get the linearization polynomial.
    • Since KZG commitments are additively homomorphic and the evaluations sent over are plain $\field$ elements, the verifier can construct the commitment to the linearization polynomial and finish the opening check.
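
A minimal Python sketch of the grand-product permutation check (the field size and vector length are arbitrary, and Plonk's polynomial/quotient machinery is omitted; only the telescoping product is shown):

```python
# With g_i = f_{sigma(i)}, the running product of
#   (f_i + beta*i + gamma) / (g_i + beta*sigma(i) + gamma)
# telescopes back to 1; indices stand in for the evaluation domain g^i.
import random

P = 2**61 - 1

def grand_product_ok(f, g, sigma, beta, gamma):
    z = 1
    for i in range(len(f)):
        num = (f[i] + beta * i + gamma) % P
        den = (g[i] + beta * sigma[i] + gamma) % P
        z = z * num % P * pow(den, P - 2, P) % P   # field division
    return z == 1

f = [random.randrange(P) for _ in range(8)]
sigma = list(range(8)); random.shuffle(sigma)
g = [f[sigma[i]] for i in range(8)]               # g is f permuted by sigma

beta, gamma = random.randrange(P), random.randrange(P)
assert grand_product_ok(f, g, sigma, beta, gamma)

g_bad = list(g); g_bad[0] = (g_bad[0] + 1) % P    # tamper with one value
assert not grand_product_ok(f, g_bad, sigma, beta, gamma)  # fails w.h.p.
```
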
PlookUp: A simplified polynomial protocol for lookup tables
  • A similar application of the permutation-argument trick, now checking that every element of the multiset $F = f_1 … f_n$ appears in the table vector $T = t_1 … t_d$.
  • We only need multiset (subset) equality, not a fixed permutation, so the $\beta \cdot x$ index terms from Plonk are not needed.
  • To check $F \subseteq T$, we merge the two multisets and sort them (by $T$'s order), yielding a multiset $S'$.
    • Supposing $F \subseteq T$, every element of $F$ appears in $S'$ right after the equal-valued element of $T$.
    • We pair adjacent entries of $S'$ as $s_i + \beta s_{i+1}$; the randomized pairing is what lets the product check detect the subset relation.
    • The grand product over these adjacent pairs of $S'$ should equal the product over the $f_i (1 + \beta)$ terms (each $f_i$ duplicating an earlier $t_i$) and the $t_i + \beta t_{i+1}$ terms (see the sketch after this list).
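
A toy Python check of the plookup-style product identity (my rendering keeps both $\beta$ and $\gamma$; the table and lookups are made up, and the polynomial machinery is omitted):

```python
# With S the multiset F ∪ T sorted "by T", checking
#   (1+b)^n * prod(g + f_i) * prod(g(1+b) + t_i + b t_{i+1})
#     == prod(g(1+b) + s_i + b s_{i+1})
# at random b, g catches F ⊄ T with high probability.
import random

P = 2**61 - 1

def plookup_sides(F, T, S, b, g):
    lhs = pow(1 + b, len(F), P)
    for f in F:
        lhs = lhs * ((g + f) % P) % P
    for t0, t1 in zip(T, T[1:]):
        lhs = lhs * ((g * (1 + b) + t0 + b * t1) % P) % P
    rhs = 1
    for s0, s1 in zip(S, S[1:]):
        rhs = rhs * ((g * (1 + b) + s0 + b * s1) % P) % P
    return lhs, rhs

T = [1, 4, 9, 16]                 # the table
F = [4, 4, 16, 1]                 # lookups, all present in T
S = sorted(F + T)                 # sorted concatenation ("sorted by T")

b, g = random.randrange(P), random.randrange(P)
lhs, rhs = plookup_sides(F, T, S, b, g)
assert lhs == rhs

F_bad = [4, 5, 16, 1]             # 5 is not in the table
lhs, rhs = plookup_sides(F_bad, T, sorted(F_bad + T), b, g)
assert lhs != rhs                 # fails w.h.p.
```
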
PlonKup: Reconciling PlonK with plookup
  • Combines the above two works into a Plonk with lookup-table support.
  • To compress a table row of 2 inputs and 1 output into one $\field$ element, use the Spartan-style multiset hash: $t = t_1 + \gamma \cdot t_2 + \gamma^2 \cdot t_3$ for random $\gamma \in \field$.
  • The addition to the constraint system is $q_{k_i} (w_i + \gamma \cdot w_{n + i} + \gamma^2 \cdot w_{2n + i} - t_i) = 0$ for all $i \in [n]$, which checks that the selected witness triples match table entries.
HyperPlonk: Plonk with Linear-Time Prover and High-Degree Custom Gates
  • This paper gave off a good vibe from Spartan, with 2 contributions:
    • Formalization of a sumcheck-based PIOP toolkit to build Plonk-like SNARKs without involving FFTs.
    • Compression of the multilinear polynomial commitment proof size with the help of a Commit-and-Prove SNARK, which is similar to Spartan's Spark compiler in some sense.
  • This paper helped me realize the essence of Plonk, that is, a multiset hash function combined with a multiset product argument:
    • Recall the multiset hash function in Spartan, with $h_\gamma(\alpha_1, \alpha_2, \alpha_3) = \alpha_1 + \gamma \alpha_2 + \gamma^2 \alpha_3$ and $H_\beta(\pmb{\alpha}) = \prod (\alpha_i - \beta)$.
    • Each Plonk copy constraint can be viewed as an (index, witness value) pair merged into one element with $h_\gamma$ above, and the resulting multiset is fingerprinted with $H_\beta$.
    • Given a permutation that only permutes positions holding the same value, constructing the multiset hash through the permutation should yield the same hash as the vanilla construction.
    • We can check the above with a product-check argument: the grand product of the ratios comes out to 1.
  • I personally like the way they organize the PIOPs in Section 3; it is very well structured and easy to read.
    • We saw the sumcheck for high individual degree polynomials in Spartan before, but the prover complexity is $O(2^\mu d^2)$ rather than $O(2^\mu d \log^2 d)$ in implementation – this should not be a big issue when $d$ is low, just asymptotically different.
    • 3 queries per round (0, 1, $r$) in multilinear sumcheck can be merged into 1, with the help of last round’s claim of sum and multilinear polynomial dividing over $X (1-X)$.
    • They use Spartan's zero-over-subcube check trick extensively: with a random challenge $\br \in \field^\mu$, check $\sum\limits_{\bx \in \zo^\mu} f(\bx) \cdot \eq(\bx, \br) = 0$.
    • The product check builds on Quarks, checking a product over $\mu$ bits by carefully arranging the $2^\mu$ evaluations and the $2^\mu - 1$ sub-products into one polynomial over $\mu + 1$ bits (see the sketch after this list).
      • We check the encoding over $\mu + 1$ bits by seeing if it satisfies $v(1, \bx) = v(\bx, 0) \cdot v(\bx, 1)$ over $\mu$ bits of $\bx$.
      • The rest should follow from a standard sumcheck.
    • The permutation PIOP with small soundness error uses the inverse permutation $\sigma^{-1}$, which avoids blowing up the degree of the sumcheck round polynomials.
      • The first $\mu$ rounds can be done in $O(\mu 2^\mu)$ total complexity, with each round $O(2^\mu)$: $O(2^{\mu - i})$ on $\eq$ and $O(2^i)$ on the multilinear minus term (zeroity terms).
      • The last $\mu$ rounds can be done in $O(2^\mu \cdot \mu \log^2 \mu)$, since $\sigma^{-1}$ applied inside $\eq$ pushes the degree to $O(\mu)$, and the work is done by sumcheck over a high-degree polynomial.
    • Lookup PIOP deploys a degree-2 multivariate “generator” polynomial, but it gets linearized by computing $f$ twice rather than computing $f(g(\bx))$ directly.
      • The rest should follow from plookup and multiset check PIOP, for “shift-by-1” table and values over $\mu+1$ bits.
    • Batch opening for multilinear polynomials reaches an optimized prover complexity of $O(k 2^\mu)$ for batch size $k$.
      • This is achieved by rewriting $f(\bx) = \sum\limits_{\bz \in \zo^\mu} f(\bz) \cdot \eq(\bz, \bx)$ in its multilinear form (a variant of Spartan's trick).
      • All these subchecks can be combined together by Spartan’s zero-on-subcube checks, checking zeroity over sum of $\mu + \log k$ bits.
      • In the special case where all the opening points lie on the subcube, we can precompute the $\eq$ terms in the Spartan trick, and the prover complexity improves to $O(2^\mu)$.
  • The linear-time multilinear polynomial commitment scheme from Orion has a concretely large proof size, due to the full Merkle tree openings.
    • Similar to Spartan, to compress both the proof size and the verifier cost, they build an outer SNARK proving that the verification of the original proof goes through.
    • To decrease the circuit complexity of Merkle tree verification, they use multilinear KZG as part of the commitment, lowering the outer SNARK's circuit complexity.
    • For random access at the proximity-test query locations, they construct a lookup table for the queries.
    • The consistency check between the witness and the public IO can be moved out of the circuit thanks to the Commit-and-Prove SNARK.
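
A minimal Python sketch of the Quark-style product encoding referenced above (toy values; I set the all-ones slot to 0 by convention so the wiring identity closes at the top of the tree):

```python
# A table v over mu+1 bits holds f's evaluations in the v(0, x) slots and
# partial products in the v(1, x) slots, wired by v(1, x) = v(x, 0) * v(x, 1);
# the grand product of f sits at v(1, 1, ..., 1, 0).
from itertools import product
from functools import lru_cache
from fractions import Fraction as Fr

mu = 3
points = list(product((0, 1), repeat=mu))
f = {x: Fr(i + 2) for i, x in enumerate(points)}       # toy evaluations of f

@lru_cache(maxsize=None)
def v(idx):                                            # idx has mu+1 bits
    if idx[0] == 0:
        return f[idx[1:]]                              # layer 0: f itself
    if all(b == 1 for b in idx):
        return Fr(0)                                   # convention at the top corner
    x = idx[1:]
    return v(x + (0,)) * v(x + (1,))                   # partial products

# the wiring identity holds on the whole cube ...
for x in points:
    assert v((1,) + x) == v(x + (0,)) * v(x + (1,))

# ... and the root slot carries the grand product of f
grand = Fr(1)
for x in points:
    grand *= f[x]
assert v((1,) * mu + (0,)) == grand
```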

TODO: Succinct Arguments over Towers of Binary Fields


[Polynomial Commitments]

Doubly-efficient zkSNARKs without trusted setup (Hyrax) (multilinear PCS from matrix commitment and IPA, $\sqrt{|w|}$ communication complexity)
  • The contributions of this paper follows:
    • 2 options for squashing 2 sumchecks into 1 (“reducing from 2 points to 1 point”): either invoke the summed term at 2 evaluation points with slightly larger proof size, or at only 1 point with smaller proof size.
    • Verifier-side cost reduction for sumcheck via a dot-product proof of knowledge, taking a random linear combination of the linear checks to form a single dot product.
    • To reduce the verifier's work on the witness, which shows up in the last round of the GKR sumcheck, the verifier relies on a multilinear polynomial commitment rather than evaluating the $\tilde{V_x}$ term itself.
  • The multilinear polynomial commitment starts from the observation that $\tilde{\eq}(\bu, \bv)$ factors as $\prod_{1 \le i \le \ell}\tilde{\eq}(u_i, v_i)$ (see the sketch after this list).
    • Instead of taking the inner product between $\pmb{f} \in \field^{2^\ell}$ and $(\tilde{\eq}(\bu, \bt))_{\bt \in \zo^\ell}$, we write $\pmb{f}$ as a $2^{\ell / 2} \times 2^{\ell / 2}$ matrix over $\field$.
    • The commit phase uses a multi-commitment (Pedersen vector commitment) to commit to each row of $\pmb{f}$.
    • In the evaluation phase, the prover (and verifier, homomorphically) combines the row commitments with the first half of the $\tilde{\eq}$ tensor, and the two run a proof of dot product for $f(\bu) = y$.
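
A toy Python sketch of the matrix-evaluation structure Hyrax exploits (the Pedersen multi-commitments and the dot-product argument are omitted; only the algebra $f(\bu) = L^\top M R$ is checked, with made-up values):

```python
# Arrange the 2^l evaluations of a multilinear f as a 2^(l/2) x 2^(l/2) matrix M;
# then f(u) = L^T M R with L, R the eq-tensors of the two halves of u.
from fractions import Fraction as Fr

def eq_tensor(point):
    """All eq(b, point) for b in {0,1}^len(point); first variable = most significant bit."""
    out = [Fr(1)]
    for x in point:
        out = [t * s for t in out for s in (1 - x, x)]
    return out

l = 4
evals = [Fr(i * i + 1) for i in range(2 ** l)]       # toy evaluation table of f
u = [Fr(1, 2), Fr(2, 3), Fr(3, 5), Fr(1, 7)]         # evaluation point

direct = sum(e * t for e, t in zip(evals, eq_tensor(u)))   # plain MLE evaluation

half = 2 ** (l // 2)
M = [evals[r * half:(r + 1) * half] for r in range(half)]  # row index = first l/2 bits
L, R = eq_tensor(u[:l // 2]), eq_tensor(u[l // 2:])

# prover-side: combine the (committed) rows with L; verifier-side: one inner
# product of that combination with R recovers f(u)
row_combo = [sum(L[r] * M[r][c] for r in range(half)) for c in range(half)]
assert direct == sum(rc * Rc for rc, Rc in zip(row_combo, R))
```
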
RedShift: Transparent SNARKs from List Polynomial Commitments (univariate PCS from FRI)
  • This paper introduced a way to construct polynomial commitment from FRI.
    • Specifically, they proposed a construction for List Polynomial Commitment (LPC) from FRI.
    • By taking advantage of randomness, they can “patch” the LPC such that it commits to the actual polynomial uniquely, with minor soundness error.
  • The motivation behind LPC is that distinguishing a sufficiently close function from a valid RS codeword leads to large parameters.
    • This involves the concept of list decoding: $L_\delta(f)$ denotes the list of valid RS codewords that are at most $\delta$-far from $f$.
    • Supposing $\delta$ is sufficiently small (smaller than the unique decoding radius $\delta_0$), $L_\delta(f)$ contains only 1 codeword.
    • Here $f$ is a function given over the RS evaluation domain $D \subseteq \field$.
    • Instead, we can commit to the list of polynomials in $L_\delta(f)$ that also comply with an additional list of point-evaluation pairs.
  • However, the LPC above does not give the uniqueness of a normal polynomial commitment.
    • The rough idea is to take advantage of randomness sampling, picking points to patch the LPC point-evaluation pairs that $f$ must comply to.
    • The application of a $\mu$-dimensional $\eps$-distinguisher happens in the “offline” stage.
    • It appends $\mu$ pairs of random point-evaluation of $f$ to the list of point-evaluation pairs for LPC.
    • The PPT distinguisher picks points at which every other $g \in L_\delta(f)$ disagrees with $f$, and adds them (evaluated by $f$) to the list of point-evaluation pairs.
    • The soundness error follows from Schwartz-Zippel: for a $g \in L_\delta(f)$ that is not $f$, we have $\Pr\limits_{x \in \field}\left[f(x) = g(x)\right] \le \dfrac{d}{|\field|}$.
    • By a union bound over the $|L_\delta(f)| - 1$ other polynomials and iterating over $\mu$ points, $\eps \le \left(\dfrac{d}{|\field|} \cdot (|L_\delta| - 1)\right)^\mu$.
  • Finally, double-column ACM format is anti-human, definitely hate it.

TODO: Transparent Polynomial Delegation and Its Applications to Zero Knowledge Proof

TODO: Linear-Time Arguments with Sublinear Verification from Tensor Codes

Brakedown: Linear-time and field-agnostic SNARKs for R1CS (multilinear PCS from tensor-code IOPP)
  • On Brakedown, the main contributions are two folds:
    • They prove extractability of coding-based multilinear polynomial commitments without needing efficient decoding; the subtlety lies in the extractor.
    • On top of removing the need for efficient decoding, they construct an efficient linear code with linear-time encoding from expander codes.
  • The polynomial commitment scheme from BCG20 has the form $g(\br) = \langle \pmb{q}_1 \otimes \pmb{q}_2, \bu\rangle$, where $\bu$ is the vector of coefficients of $g$ in the Lagrange (evaluation) basis over $\zo^n$.
  • As for the efficient extractor without need of efficient decoder part:
    • We should start with the definition of the extractor: it should not need to know the odds of the prover convincing the verifier.
    • The first construction of the extractor needs to know the probability $\eps$ that the prover convinces the verifier, since it uses $\eps$ to bound its runtime.
    • The second extractor construction builds on top of the first one and “guesses” $\eps$.
      • The extractor must run regardless of what $\eps$ is; if $\eps$ is not inverse-polynomial in $m$ and $\lambda$, the first extractor would not run in expected polynomial time.
      • The guessing trick is from Hazay, Lindell, and Goldreich.
    • The resulting second extractor turns out to run in expected polynomial time, given that it aborts after $2^m$ steps.
  • Claim 1 and Lemma 1 for the testing phase are a bit hard to comprehend; using Ligero's Lemmas 4.2 and 4.4 instead, things are straightforward.
  • There is a lot to learn in coding theory and in expected-polynomial-time (EPT) extractor runtime proofs; I need to get better at them.
  • The linear code description in Brakedown is a bit hard to comprehend, so we defer this to the following writeup on expander code.

TODO: Brakedown’s expander code (supplementary file for Brakedown Section 4 construction)

TODO: Gemini: Elastic SNARKs for Diverse Environments (provides a way to convert univariate PCS to a multilinear PCS)

TODO: Orion: Zero Knowledge Proof with Linear Prover Time (from tensor-code IOPP)

TODO: SLAP: Succinct Lattice-Based Polynomial Commitments from Standard Assumptions

BaseFold: Efficient Field-Agnostic Polynomial Commitment Schemes from Foldable Codes

TODO: Vortex: A List Polynomial Commitment and its Application to Arguments of Knowledge

TODO: Polynomial Commitments from Lattices: Post-Quantum Security, Fast Verification and Transparent Setup


[Reed-Solomon Proximity Tests]

TODO: Interactive Proofs of Proximity: Delegating Computation in Sublinear Time

Ligero: Lightweight Sublinear Arguments Without a Trusted Setup (distance preserving in interleaved code)
  • The contributions of this paper are two folds as follows:
    • They introduced an interleaved linear code proximity test.
      • Given an interleaved word $U^\ast$ with large distance from $L^m$, i.e., $d(U^\ast,L^m) > e$, a random element $\bw^\ast$ of its row span $L^\ast$ preserves $d(\bw^\ast, L) > e$ with high probability.
      • Brakedown, Orion, and BCG20 papers follow from the proximity test over interleaved linear code, which plays a role in tensor code based polynomial commitment.
    • This work provides a viewpoint that connects MPC with argument system, that each column of an interleaved linear code can be viewed as a party’s view in MPC.
      • If the linear code is RS-code, then the secret sharing is Shamir secret sharing.
  • We start with the Lemma 4.2/4.3 version, where $e < \dfrac{d}{4}$: if $d(U^\ast,L^m)>e$, let $L^\ast = \text{rowspan}(U^\ast)$, then $\Pr\limits_{\bw^\ast \overset{R}{\gets} L^\ast}[d(\bw^\ast,L) \le e] \le \dfrac{e + 1}{|\field|}$.
    • The main trick is to write $\bw^\ast$ in the affine form $\alpha \cdot U^\ast_i + \bx$; then for a given $\bx$ and a position $j \in \dist(U_i^\ast,L)$, we argue there is at most one $\alpha$ such that $d(\bw^\ast,L) \le e$ yet $j \notin \dist(\bw^\ast,L)$.
    • Otherwise the span of such $\bw^\ast \in L^\ast$ would contradict $j \in \dist(U_i^\ast,L)$, since $U_i^\ast$ itself lies in the span. (The subtlety here is the unique decoding radius, that $2e < d/2$.)
  • An improved analysis by Roth and Zemor gives $e < \dfrac{d}{3}$: if $d(U^\ast,L^m)>e$, then $\Pr\limits_{\bw^\ast \overset{R}{\gets} L^\ast}[d(\bw^\ast,L) \le e] \le \dfrac{e + 1}{|\field|}$.
    • The improvement on the bound for $e$ comes from escaping the previous constraint imposed by the unique decoding radius.
    • This result derives from a lemma: if $d(U^\ast,L^m)>e$, then $\exists \bw^\ast \in L^\ast$ with $d(\bw^\ast,L) > e$.
    • The lemma is proved by contradiction: assuming all elements of $L^\ast$ are at most $e$-far from $L$, the basis rows $U_i^\ast$ are in particular all $e$-close to $L$.
    • Let $\bv_0^\ast$ be a vector maximizing the distance; then there exists another basis row $U_i^\ast$ holding error positions outside $\dist(\bv^\ast_0,L)$, i.e., $\dist(U_i^\ast,L)\setminus \dist(\bv^\ast_0,L) \neq \varnothing$.
    • Then some $\alpha \cdot U_i^\ast + \bv_0^\ast \in L^\ast$, by uniqueness of the closest codeword in $L$, ends up with a larger distance to $L$, contradicting the maximality of $\bv_0^\ast$.
  • Based on the interleaved-code proximity test, they develop a way to test whether an interleaved codeword contains a message satisfying linear/quadratic constraints.
    • The consistency test and evaluation test encode each row as an RS polynomial and use a random linear combination to combine all rows into one (see the sketch after this list).
    • To make this zero-knowledge, we pad in a row consisting of a random codeword, which randomizes the prover's response polynomial and makes it simulatable.
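
A toy Python sketch of the interleaved-row test for an honest prover (Reed-Solomon rows; Merkle commitments, zero-knowledge padding, and the linear/quadratic constraint tests are omitted, and all parameters are made up):

```python
# Rows of U are RS codewords; the verifier checks that a random linear
# combination of the rows is itself a codeword and matches spot-checked columns.
import random

P = 2**31 - 1                     # prime field
K, N, M = 4, 12, 5                # message length, codeword length, number of rows
DOMAIN = list(range(1, N + 1))    # RS evaluation points

def encode(msg):                  # evaluate the degree-<K polynomial on DOMAIN
    return [sum(c * pow(x, i, P) for i, c in enumerate(msg)) % P for x in DOMAIN]

def interpolate_prefix(word):
    """Lagrange-interpolate the first K positions and re-evaluate on DOMAIN."""
    out = []
    for x in DOMAIN:
        acc = 0
        for i in range(K):
            num, den = 1, 1
            for j in range(K):
                if j != i:
                    num = num * (x - DOMAIN[j]) % P
                    den = den * (DOMAIN[i] - DOMAIN[j]) % P
            acc = (acc + word[i] * num * pow(den, P - 2, P)) % P
        out.append(acc)
    return out

def is_codeword(word):            # true iff word is an RS codeword (degree < K)
    return interpolate_prefix(word) == word

U = [encode([random.randrange(P) for _ in range(K)]) for _ in range(M)]

r = [random.randrange(P) for _ in range(M)]                        # verifier challenge
w = [sum(r[i] * U[i][j] for i in range(M)) % P for j in range(N)]  # prover's combined row

assert is_codeword(w)                                              # proximity part
for j in random.sample(range(N), 3):                               # column spot-checks
    assert w[j] == sum(r[i] * U[i][j] for i in range(M)) % P
```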

Fast Reed-Solomon Interactive Oracle Proofs of Proximity
  • This paper lays the foundation for FRI, the fast Reed-Solomon interactive oracle proof of proximity, following the line of work on PCPs and PCPPs.
  • The key idea is the one BaseFold later generalizes: reduce a proximity-testing problem over a large domain into similar proximity-testing problems over smaller domains (see the folding sketch after this list).
    • In the probabilistic reduction to a proximity problem of halved (or quartered, eighthed, …) size, w.h.p. the distance of the reduced word does not decrease, which allows multiple rounds of reduction.
  • There are 3 main ingredients related to the probabilistic reduction of proximity problem:
    • Blockwise distance measure, or coset distance - a more coarse-grained measure than Hamming distance, that tells if alphabets over cosets are agreeing or not.
    • Probabilistically reducing the RS alphabets over a coset into one alphabet of the next-level RS code, where the coset measure tells w.h.p. whether the next alphabet agrees.
    • Distortion sets, or bad events for the verifier, where the folding challenges decrease the codeword's distance.
      • Distortion sets for codewords with coset distance outside of unique decoding radius means folded codeword has a Hamming distance to RS codes within unique decoding radius.
      • Distortion sets for codewords with coset distance inside of unique decoding radius means folded codeword has a Hamming distance to RS codes that is closer than the original codeword’s coset distance.
      • Note: distortion set is essentially a measure about how many challenges can make disagreeing cosets vanish in the next folded codeword.
  • The proving strategy splits into 2 parts:
    • The probability of verifier challenges falling into distortion sets is low.
    • Supposing none of the verifier challenges is in the distortion set, then the rejecting probability is at least $\min(\delta^{(0)}, \delta_0(\eps))$.
  • The round rejecting probability can be split into 2 cases:
    • A “close (noisy but still within the unique decoding radius) and consistent” adversary, the easy case, has rejecting probability $\delta^{(0)}$, which follows directly from the query algorithm.
    • A “far (outside the unique decoding radius) or inconsistent (the closest RS codeword changes during folding)” adversary, the more difficult case, has rejecting probability $\delta_0(\eps) = (1 - 3\rho - \eps) / 4$.
      • $\eps$ parameterize the size of the distortion set for codewords outside of unique decoding radius $(1 - \rho) / 2$.
      • In this case, an honestly folded codeword is always at least $\delta_0$-far (Hamming dist) from RS codewords.
      • More specifically, in inconsistent case, the honestly folded codeword from $i^{th}$-round $f^{(i+1)}_{f^{(i)}, x^{(i)}}$ is more than unique decoding radius away (Hamming dist) from unique closest RS codeword $\overline{f}^{(i + 1)}$ to $f^{(i + 1)}$.
      • Thus the rejecting probability, that is the errors in folding + the errors to closest RS codes against the evaluation set, is lower bounded by the Hamming distance above, and thus lower bounded by $\delta_0(\eps)$.
  • To argue the low probability of verifier challenges falling into distortion set, the proof still split into 2 cases by distance in/out of unique decoding radius:
    • Distortion sets for coset distance in unique decoding radius, the disagreeing cosets are unique, and thus at most $|L^{(i)}|$ challenges are in a distortion set by union bound.
    • Distortion sets for coset distance outside of unique decoding radius, by:
      • Constructing a polynomial $C$ that agrees on the closest RS code to $Q^{(i)}(X, Y)$ on all $X \in B$ and $Y \in L^{(i + 1)}$.
      • Constructing a non-zero “error locator polynomial” $E$ that vanishes at the places where $C$ and $Q^{(i)}$ disagree.
      • Supposing there are more than expected elements in distortion set, we want to prove by contraposition that eventually coset distance is in unique decoding radius.
      • We know the existence of a polynomial $P$ that agrees with $Q^{(i)} \cdot E$ and $C \cdot E$ over $X, Y \in B, L^{(i + 1)}$ with a lower degree, and $E \mid P$.
      • Then $Q \equiv P / E$ agrees with $Q^{(i)}$ on so many $Y$'s that the fraction of rows in $X$ (cosets) disagreeing with $Q^{(i)}$ drops below the unique decoding radius, so the contraposition goes through.
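
A toy Python sketch of one FRI folding round over a power-of-two domain (p = 257 and all sizes are made up; no Merkle trees or queries, only the even/odd split and the degree halving are checked):

```python
# Split f into even/odd parts, f(x) = f_e(x^2) + x * f_o(x^2), and fold the
# evaluation table along the 2-to-1 map x -> x^2 with a random challenge alpha.
import random

P = 257                                  # 3 is a primitive root mod 257, and 256 = 2^8
w = pow(3, (P - 1) // 16, P)             # generator of a size-16 evaluation domain
L0 = [pow(w, i, P) for i in range(16)]
L1 = sorted({x * x % P for x in L0})     # next domain, size 8

def evaluate(coeffs, xs):
    return [sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P for x in xs]

coeffs = [random.randrange(P) for _ in range(8)]   # degree < 8 over a size-16 domain
f0 = dict(zip(L0, evaluate(coeffs, L0)))           # the committed evaluation table

alpha = random.randrange(P)                        # verifier's folding challenge
inv2 = pow(2, P - 2, P)

f1 = {}
for x in L0:
    y = x * x % P
    fe = (f0[x] + f0[(-x) % P]) * inv2 % P                         # f_e(x^2)
    fo = (f0[x] - f0[(-x) % P]) * inv2 % P * pow(x, P - 2, P) % P  # f_o(x^2)
    f1[y] = (fe + alpha * fo) % P

# honest folding halves the degree: f1 matches a degree < 4 polynomial on L1
folded = [(coeffs[2 * i] + alpha * coeffs[2 * i + 1]) % P for i in range(4)]
assert all(f1[y] == v for y, v in zip(L1, evaluate(folded, L1)))
```

In the real protocol the verifier never sees these tables; it only spot-checks that claimed $f^{(i+1)}$ values are consistent with queried pairs $f^{(i)}(x)$, $f^{(i)}(-x)$.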

Worst-Case to Average Case Reductions for the Distance to a Code (distance preserving in interleaved code, improved distortion set and $\delta_0$ soundness)
  • The main contribution is on a special case of “worst-case to avg-case” problem, that is:
    • If $u^\ast$ in an affine space $U \subseteq \field^n$ is $\delta_{max}$-far from another linear space $V \subseteq \field^n$, what can we say about $\delta_{med}$ of most elements in $U$ to $V$?
  • The paper gives 3 versions of solutions:
    • One for general spaces, giving $\delta_{med} = J_\eps(D(U, V))$, where the divergence $D(U, V)$ is defined as the largest $\Delta(u, V)$ over $u \in U$.
    • The other two for $V$ being an error-correcting space with large $\Delta(V)$, where $\delta_{med}$:
      • One works up to $1/3 \cdot \Delta(V) - \eps$, the “unique-decoding” version.
      • The other works up to $J_\eps^2(\Delta(V)) - \eps$, the “list-decoding” version.
  • One of the main technique, or the spirit here, is related to how to measure distance between 2 affine (linear) spaces $U$ and $V$.
    • All 3 solutions above share the same pattern: foliate the affine space $U = u' + U'$ into lines $u^\ast + \alpha \cdot u$, where $u^\ast$ is $\delta_{max}$-far and $U'$ is a linear space.
    • By bounding how many points on each line can be $\delta_{med}$-close to $V$ (or else something bad happens that breaks the whole proof), this gives the $\delta_{med}$ bound w.h.p.
    • Reminds me of Moshkovitz’s Schwartz-Zippel proof - foliate $\field^n$ with lines, and upper bound the intersections on each line with the polynomial hyperplane.
  • The techniques for proving that each line has only a few points $\delta_{med}$-close to $V$ are rather distinct across the 3 solutions:
    • When $V$ is a general linear space, line closeness to $V$ can be reduced to a list-decodability problem by bounding $\delta_{med}$ with Johnson bound.
      • Let $u^\ast$ being the furthest codeword to $V$, that $\Delta(u^\ast, V) = D(U, V)$.
      • By the Johnson bound and list-decodability, “few points on $u^\ast + \alpha \cdot u$ being $J_\eps(\Delta(u^\ast, V))$-close to $V$” is transformed into “the distances among the $(v^\alpha - u^\ast) / \alpha$ being large”.
      • “The distances among the $(v^\alpha - u^\ast) / \alpha$ being large” reduces back to $u^\ast$'s distance to $V$ being large.
      • The Johnson bound gives a better $\delta_{med}$ than prior work.
    • When $V$ is an ECC with large $\Delta(V)$, $\delta_{med}$ being $1 / 3 \cdot \Delta(V) - \eps$ case.
      • $u^\ast + \alpha \cdot u$ being close to $V$ means colinearity in the points $v^\alpha \in A \subseteq V$ close to the line.
      • The colinearity in $v^\alpha \in V$ yields $\delta_{med} \le 1 / 3 \cdot \Delta(V)$, and $v^\alpha = v^\ast + \alpha \cdot v$ for some $v, v^\ast \in V$ from colinearity.
      • If too many points in $A$ are colinear to the line $u^\ast + \alpha \cdot u$, then we can argue $u^\ast$ and $v^\ast \in A \subseteq V$ are rather close (from avg argument), which contradict the assumption.
    • When $V$ is an ECC with large $\Delta(V)$, $\delta_{med} = J_\eps^2(\Delta(V)) - \eps$ case.
      • The proof idea is similar to the “unique-decoding” version, also finding the “colinearity” of $v^\alpha$ in $V$, just the colinearity set is smaller and more complex to construct.
      • Since we want to bound the number of $\alpha$ s.t. $\Delta(u^\ast + \alpha \cdot u, V) < \delta - \eps$ that $\delta < J_\eps^2(\Delta(V))$, we convert it to how many $v^\alpha - \alpha \cdot u$ that $\Delta(u^\ast, v^\alpha - \alpha \cdot u) < \delta - \eps$.
      • Not knowing the structure of $v^\alpha - \alpha \cdot u$, we use graph theory: $\alpha$ and $\alpha'$ are connected iff $\Delta(v^\alpha - \alpha \cdot u, v^{\alpha'} - \alpha' \cdot u) < J_\eps^{-1}(\delta)$.
      • Then by Johnson bound, that pairwise $J_\eps^{-1}(\delta)$-far $v^{\alpha} - \alpha \cdot u$ yet still $\delta$-close to $u^\ast$ is at most $1/\eps$.
      • Let $A$ be the set of all $\alpha$'s for which $v^\alpha - \alpha \cdot u$ is $(\delta - \eps)$-close to $u^\ast$.
      • By Turán's theorem, since the largest independent set in this graph on $|A|$ vertices has size at most $1 / \eps$, there exists $\alpha_0 \in A$ with at least $|A| \cdot \eps - 1$ neighbors.
      • Now let $B \subseteq A$ denote the set of $\alpha$'s with $\Delta(v^\alpha - \alpha \cdot u, v^{\alpha_0} - \alpha_0 \cdot u) < J^{-1}_{\eps}(\delta)$.
      • Then there are $|B| \geq |A| \cdot \eps - 1$ codewords $v' \in V$ with $\Delta(u, v' = (v^{\alpha_0} - v^\alpha) / (\alpha_0 - \alpha)) < J^{-1}_\eps(\delta)$.
      • On the other hand, by Johnson bound and list-decodability, there are at most $1 / \eps$ elements $v \in V$ has $\Delta(u, v) < J_\eps(\Delta(u^\ast, V))$.
      • Since $J_\eps^{-1}(\delta) < J_\eps(\Delta(u^\ast, V))$, then at most $1 / \eps$ elements of $v \in V$ has $\Delta(u, v) < J^{-1}_\eps(\delta)$, contradicting prior lower bound $|B| \geq |A| \cdot \eps - 1$.
      • Pigeonhole! Some element $v \in V$ has more than $\eps \cdot |B|$ of the $v' = (v^{\alpha_0} - v^\alpha) / (\alpha_0 - \alpha)$ mapping to it.
      • Let $C \subseteq B$ denote the set of $\alpha$'s mapping to that same $v'$ above; then $v^\alpha = (v^{\alpha_0} - \alpha_0 \cdot v') + \alpha \cdot v'$ - the colinearity kicks in again!
      • $|C| \ge |B| / (1 / \eps) \ge \eps \cdot (\eps \cdot |A| - 1)$, then follows from the following proof in unique decoding version.
  • On the other hand, for a word $u^\ast$ over a Reed-Solomon domain specifically, a random “affine shift” $u^\ast(x) + c \cdot u^\ast(a x + b)$ can amplify $u^\ast$'s distance to the RS code subspace.
    • This looks like a dual of the worst-case to avg-case result, that is, in the wild some codeword can have its distance to RS code subspace amplified up to $1 - o(1)$.
    • We want to prove $\Pr\limits_{a, b, c \in \field_q} \left[ \Delta(u' + c \cdot T_{a, b}(u''), V) < \overline{\delta} - \eps \right] < \frac{K}{q}$, where
      • $\Delta(u', V) > \delta'$, $\Delta(u'', V) > \delta''$,
      • $T_{a, b}(f)$ for $f: \field \mapsto \field$ is defined by $g(x) = f(a x + b)$,
      • $\overline{\delta} = \min(J_\eps^2(\Delta(V)), \delta' + \delta'' - \delta' \delta'' - \eps)$, and $K = 8 / \eps^4$.
    • Again, prove by contraposition, that assuming the probability is higher than $K / q$.
    • Fix $a, b$ choices, then the problem statement looks like list-decoding version of theorem for average-case distance.
    • The colinear $v'$ and $v''$ should be $(1 - \overline{\delta})$-close to $u'$ and $T_{a, b}(u'')$ jointly by the theorem above; again by the Johnson bound, each has at most $1/\eps$ choices.
    • Essentially, $v'$ and $v''$ do not agree too much with $u'$ and $u''$ (by the distance assumption), while the colinearity from the proof by contradiction forces the joint agreement set to be large.
    • Thus contraposition holds, and distance should be amplified to $\overline{\delta} - \eps$.
  • If one improves the distortion-set analysis for $\delta_0$ (extending its sensitivity beyond the unique decoding regime, giving a larger $\delta_0$), then one improves the soundness of FRI.
    • We directly apply the list-decoding version of the average-distance result to the distortion set, and we ball!
    • When it comes to FRI soundness, supposing $i^{th}$ round FRI challenge is not in distortion set, then $\delta^{(i+1)} \ge (\delta^{(i)} - \eps) - \beta^{(i + 1)}$ by triangle inequality, just like FRI analysis.
    • Thus, each query soundness error lower bounded by $\sum \beta^{(i)} \ge \delta^{(0)} - r \cdot \eps$.

BaseFold: Efficient Field-Agnostic Polynomial Commitment Schemes from Foldable Codes (multilinear PCS from Foldable Code IOPP)
  • The paper has following three contributions:
    • A new foldable linear code family whose generator matrices can be efficiently and randomly sampled, such that with high probability the code has large minimum distance.
    • A generalization of the FRI IOPP to foldable linear codes, which applies to a broad class of codes (and can be instantiated with RS codes).
    • A construction of a multilinear PCS based on interleaving the BaseFold IOPP with the sumcheck protocol.
  • The foldable linear code derives from a family of random foldable distributions, which yields large minimum distance with high probability.
    • The key idea behind the construction follows:
      • Starting from the generator matrix of an MDS base code, a natural question is how to encode a message of doubled length more efficiently.
      • The construction: write $\bm = \bm_l \parallel \bm_r$; the next-level codeword is $\enc_{i + 1}(\bm) = \big(\enc_i(\bm_l) + \bt_1 \circ \enc_i (\bm_r)\big) \parallel \big(\enc_i(\bm_l) + \bt_2 \circ \enc_i (\bm_r)\big)$ (see the sketch after this list).
      • The random foldable distribution additionally requires $\bt_1 \in (\field^\ast)^{n_i}$, and $\bt_1 + \bt_2 = 0$.
    • The construction above requires that the minimum relative distance stays sufficiently large over all layers of encoding.
    • Applying $\bt_1$ and $\bt_2$ to $\enc_i(\bm_r)$ when forming $\enc_{i + 1}(\bm)$ can introduce additional zeros, lowering the Hamming weight.
    • The key lemma uses the rank-nullity theorem to bound the kernel of the matrix formed by columns of the generator matrix, which controls how much the Hamming weight can drop.
    • The zeros introduced at each new level of encoding make the relative minimum distance shrink, that is, $\dist_{C_i} \ge \dist_{C_j}$ for $i \le j$.
  • The IOPP for foldable linear code gives a similar vibe of FRI.
    • The main idea of the foldable linear code IOPP is to go in the opposite direction of the foldable encoding:
      • That is, un-apply $\bt_1$ and $\bt_2$ on the left and right halves of $\enc_{i + 1}(\bm)$, and fold $\enc_i(\bm_l)$ and $\enc_i(\bm_r)$ with a challenge $\alpha$.
    • The main idea should derive from Ben-Sasson et al.’s paper in CCC 2018: “Worst-case to average case reductions for the distance to a code”.
  • The polynomial commitment scheme is a combination of BaseFold IOPP and sumcheck protocol.
    • The tricky part: the BaseFold IOPP only rules out that an encoding far from $C_d$ ends up close to some $C_i$ during the folding process.
    • But it does not rule out the prover shifting from the folding of the previous message to the encoding of a different message.
    • The soundness proof focuses on this bad case, i.e., swapping over to another encoded message instead of folding the previous one.
    • This is done by proof of contradiction: $\dist_{C_k} \le \dist_{C_d}$ contradicts $\dist_{C_i} \ge \dist_{C_k}$ for any $i \le k$, which holds by the foldable code construction.
    • Knowledge soundness requires constructing an EPT extractor: Gaussian elimination over $2^d$ samples of MLE basis vectors in $\field^{2^d}$ against the function's coefficient form.
    • The extractor is somewhat looking like Brakedown’s, but the difference lies in the predicate forking lemma for sampling the linearly independent MLE bases.
  • The whole paper is a bit mathematically heavy, which is good practice for my paper reading and proving skills.
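
A toy Python sketch of the foldable encoding and one IOPP folding step (the base code, the diagonals $\bt_i$, and all parameters are made up; $\bt_2 = -\bt_1$ as in the random foldable distribution):

```python
# Enc_{i+1}(m_l || m_r) = (Enc_i(m_l) + t1 o Enc_i(m_r)) || (Enc_i(m_l) - t1 o Enc_i(m_r));
# un-applying t1 and folding the halves with alpha lands on Enc_i(m_l + alpha * m_r).
import random

P = 2**31 - 1

def base_encode(msg):                      # toy base code: k0 = 1, n0 = 2 repetition
    return [msg[0], msg[0]]

def make_diagonals(levels, n0):
    ts, n = [], n0
    for _ in range(levels):
        ts.append([random.randrange(1, P) for _ in range(n)])   # entries in F*
        n *= 2
    return ts

def encode(msg, ts):                       # foldable encoding, len(msg) = 2^levels
    if len(msg) == 1:
        return base_encode(msg)
    half = len(msg) // 2
    t1 = ts[len(msg).bit_length() - 2]     # diagonal for this level
    el, er = encode(msg[:half], ts), encode(msg[half:], ts)
    left = [(a + t * b) % P for a, t, b in zip(el, t1, er)]
    right = [(a - t * b) % P for a, t, b in zip(el, t1, er)]
    return left + right

def fold(codeword, t1, alpha):             # one IOPP round: un-apply t1, fold by alpha
    n = len(codeword) // 2
    cl, cr = codeword[:n], codeword[n:]
    inv2 = pow(2, P - 2, P)
    out = []
    for a, b, t in zip(cl, cr, t1):
        el = (a + b) * inv2 % P
        er = (a - b) * inv2 % P * pow(t, P - 2, P) % P
        out.append((el + alpha * er) % P)
    return out

levels = 3
ts = make_diagonals(levels, 2)
msg = [random.randrange(P) for _ in range(2 ** levels)]
cw = encode(msg, ts)

alpha = random.randrange(P)
half = len(msg) // 2
folded_msg = [(a + alpha * b) % P for a, b in zip(msg[:half], msg[half:])]
assert fold(cw, ts[-1], alpha) == encode(folded_msg, ts)   # by linearity of Enc
```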

TODO: DEEP-FRI: Sampling Outside the Box Improves Soundness

TODO: Proximity Gaps for Reed-Solomon Codes

TODO: STIR: Reed–Solomon Proximity Testing with Fewer Queries

TODO: WHIR: Reed–Solomon Proximity Testing with Super-Fast Verification


[Interleaved code proximity test]

Ligero: Lightweight Sublinear Arguments Without a Trusted Setup

Worst-case to average case reductions for the distance to a code

Proximity Gaps for Reed-Solomon Codes


[FFT over curves]

TODO: Elliptic Curve Fast Fourier Transform (ECFFT) Part I: Fast Polynomial Algorithms over all Finite Fields

TODO: Scalable and Transparent Proofs over All Large Fields, via Elliptic Curves (ECFFT part II)

TODO: Circle STARKs


[MPC and distributed ZK related works]

TODO: Zero-Knowledge Proofs on Secret-Shared Data via Fully Linear PCPs

TODO: Efficient Fully Secure Computation via Distributed Zero-Knowledge Proofs

Experimenting with Collaborative zk-SNARKs: Zero-Knowledge Proofs for Distributed Secrets
  • This paper specializes in introducing MPC techniques into existing SNARK schemes and implementing the relevant components for the collaborative version.
  • The main technique here is linear secret sharing over the elliptic curve group, which avoids using MPC to lift elements onto the curve.
  • The definition of collaborative SNARK on knowledge soundness does not guarantee extraction of each sub-prover.
    • But rather, when all the provers pool their information together, they can determine the witness.
  • The definition of zero-knowledgeness in collaborative SNARK is also a bit different.
    • For a single prover, the witness has to be valid, or an honest prover cannot generate a proof.
    • For collaborative proofs, each prover holds a share of the witness, and only the eventual reconstruction of the witness is valid or not.
    • Thus the simulator needs to know whether the witness itself is valid.
  • To bring the KZG commitment into the collaborative scenario, the key is Lemma 4.
    • The goal is to prove that an MPC scheme which replaces a linear-function sub-circuit with local computation over secret shares remains secure.
EOS: Efficient Private Delegation of zkSNARK Provers
  • This paper achieves private delegation of SNARK proving with efficient throughput and malicious security at the same time.
  • The differences from OB22 are:
    • EOS uses the honest delegator to minimize preprocessing cost, avoiding the cost of generating correlated randomness with PKE.
    • EOS introduces a notion of PIOP consistency check, which lets the delegator check the consistency of the polynomial computation, achieving malicious security at low cost.
    • However, EOS requires the delegator to be also involved in the proof generation computation, while OB22 just delegates, and walks away.
zkSaaS: Zero-Knowledge SNARKs as a Service
  • An essence of secret sharing based MPC shown in this paper is the use of randomness:
    • Randomness comes in as jointly generated random secret sharings, is reconstructed and applied to the target, and is later removed by subtracting those random sharings.
  • This paper’s main contribution is the use of packed secret sharing:
    • The rewritten sub-protocols give lower memory and communication, performing “SIMD-like” computation over secrets in vector form.
    • The workload of servers involved in computation (not the large server in the center of star topology) is reduced due to packed secret sharing.
  • The nature of FFT leads to an $O(\log m)$ round naive MPC, but the paper provides a 2 round computation from packed secret sharing:
    • By arranging how the vector is packed into the secret, a party can go from round $\log m$ down to round $\log \ell$ by locally performing linear combinations.
    • Once round $\log \ell$ of the FFT is reached, the parties add masking randomness to their sharings and hand the rest of the computation to a server with large memory.
    • The large server takes over the rest of the FFT and sends out sharings once its computation is done.
    • Each small server performs only $O(m / \ell \cdot (\log m - \log \ell))$ operations with $O(m / \ell)$ memory, thanks to the SIMD-style operation from packed secret sharing.
    • The large server additionally needs $O(m \log \ell)$ operations for the FFT, $O(m \log n)$ operations for secret sharing over $n$ parties, and $O(m)$ memory.

TODO: Sometimes You Can’t Distribute Random-Oracle-Based Proofs


[GKR Proof system related]

Delegating Computation: Interactive Proofs for Muggles (GKR proof systems)
  • Behold, the GKR prover is here. I first encountered this in Alessandro Chiesa’s course, for doubly efficient proof systems.
  • The one-liner for GKR, or for proof systems that combine sumcheck with a multilinear PCS, is reduction:
    • Each step reduces a statement (in sumcheck form) to a random evaluation of a multilinear polynomial.
    • GKR is a chain of such reductions, going backwards from the output layer to the inputs, which are committed with a multilinear PCS. Supposing the intermediate circuit layers have succinct descriptions, the PCS is only needed for the input MLE.
  • I have only slightly started learning circuit complexity; basically the computation GKR can prove is uniform NC circuits, which encapsulates NL problems.
  • For problems related to IPCP or PCA, I really need to circle back after reading KR08, KR09.

TODO: Libra: Succinct Zero-Knowledge Proofs with Optimal Prover Computation

TODO: Doubly Efficient Interactive Proofs for General Arithmetic Circuits with Linear Prover Time

TODO: Ceno: Non-uniform, Segment and Parallel Zero-knowledge Virtual Machine


[Argument System in general]

TODO: How to Construct Constant-Round Zero-Knowledge Proof Systems for NP

TODO: Interactive PCP

TODO: Probabilistic Checkable Arguments

TODO: Succinct Non-Interactive Arguments via Linear Interactive Proofs

TODO: A Uniform Min-Max Theorem with Applications in Cryptography

TODO: Are PCPs Inherent in Efficient Arguments?

Separating SNARGs from All Falsifiable Assumptions
  • I had a mixed feeling about this paper:
    • On one hand, this paper has rather high information entropy to me (a year ago in 2023). The note taking is almost copying the paper word by word.
    • On the other hand, this paper destroyed my illusion of constructing succinct arguments from standard, i.e., falsifiable, assumptions in a direct manner.
    • When I was a grad student at UVA, I was constantly wondering whether the information-theoretic proof systems (e.g., LPCPs) could be replaced by something computationally sound, but the answer seems to be negative.
    • The real content of the proof is rather succinct. Alas, why can’t I figure out this result (oh wait I was in primary school by then).
  • A black box reduction is essentially using oracle access to a successful attacker to break certain underlying assumption.
    • Here, the oracle access is to some SNARG adversary, and breaks the cryptographic assumption for the SNARG.
  • A falsifiable assumption has efficient challenger, and thus we want to use black box reduction to break the falsifiable assumption from SNARG adversary.
    • Thus we introduce the notion of “simulatable adversary”, which essentially means, an efficient distinguisher cannot tell apart a (potentially) unbounded adversary from an efficient challenger.
    • Supposing every SNARG has a simulatable adversary, then an oracle reduction of simulator $\calR^\calS$ runs in poly time (even without oracle) and breaks the assumption, which makes the assumption itself false.
  • A takeaway is that, SNARG itself is an argument system, which means an unbounded prover can break a SNARG, yet an efficient challenger (verifier) cannot tell anything, which also account for the intuition of “simulatable adversary”.
  • As a result, the proof proceeds in 2 steps:
    • Supposing every SNARG has a simulatable adversary, an efficient challenger of a falsifiable assumption cannot tell whether the reduction is talking to the unbounded SNARG adversary or to the efficient simulator, yet in both cases the reduction should break the assumption.
    • Thus, the separation between SNARGs and falsifiable assumptions relies on the existence of a simulatable adversary for the SNARG.
  • Proving the existence of simulatable adversary is interesting:
    • Essentially it is trying to say: given 2 pairs $(x,\pi)$ and $(\bar{x}, \bar{\pi})$ of statements of a hard subset membership problem and correlated auxiliary information (the SNARG proof) $\pi$, even with the correlated auxiliary information an efficient distinguisher cannot tell them apart.
    • The flavor is very similar to proving leakage resilient, while $\pi$ here is the leakage.
    • The proving technique is proof by contradiction: supposing there is a distinguisher that can tell apart $(x, \pi)$ and $(\bar{x}, \bar{\pi})$, we use it to tell apart $x$ and $\bar{x}$.
  • Interesting min-max theorem.

TODO: Fiat-Shamir: From practice to Theory, Part I: Fiat-Shamir From Simpler Assumptions

TODO: Fiat-Shamir: From Practice to Theory, Part II: NIZK and Correlation Intractability from Circular-Secure FHE

TODO: On the (In)security of Kilian-Based SNARGs

TODO: Transparent SNARKs from DARK Compilers

TODO: Does Fiat-Shamir Require a Cryptographic Hash Function?

TODO: Fiat-Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW is not Zero-Knowledge)

TODO: Non-Interactive Batch Arguments for NP from Standard Assumptions

TODO: SNARGs for P from LWE

TODO: Algebraic Reductions of Knowledge

TODO: Batch Arguments for NP and More from Standard Bilinear Group Assumptions

TODO: Lower Bound on SNARGs in the Random Oracle Model

TODO: PPAD is as Hard as LWE and Iterated Squaring

TODO: Non-Interactive Zero-Knowledge from Non-Interactive Batch Arguments

TODO: Probabilistically Checkable Arguments for all NP


[Fiat-Shamir-with-Abort line of work]

Lattice Signatures Without Trapdoors (reject sampling technique)
  • This paper’s reject sampling technique made possible a fruitful line of work on lattice based zero-knowledge proof, really really impressive.
  • We start with the reject sampling technique in general, without instantiating it over any specific distribution.
    • This technique gives a way to “hammer” one distribution into looking like another, by accepting outputs with an appropriately bounded probability (see the sketch after this list).
    • Supposing we want to make a distribution $(h, g_v)$ statistically close to $(h, f)$, where $v \gets h$, and $f(z) / g_v(z) \le M$ w.p. $\ge 1 - \eps$ for $z \gets f$ and any $v$.
    • On the $(h, g_v)$ side, output w.p. $\min(1, \dfrac{f(z)}{M \cdot g_v(z)})$; on the $(h, f)$ side, output w.p. $M^{-1}$.
    • The statistical distance should be $\eps / M$, for no-output SD by $\eps / M$ and output case SD by $\eps / M$.
    • When instantiating with the discrete Gaussian, the goal is to hammer $(h, \distro^m_{\bv, \sigma})$, where $\bv \gets h$, into looking like $(h, \distro^m_\sigma)$.
    • This gives zero-knowledge (simulatability) without knowledge of the distribution's center $\bv$.
    • A key lemma shows that with overwhelming probability $\dfrac{\distro_\sigma^m(\bz)}{\distro^m_{\bv,\sigma}(\bz)} = O(1)$, which plugs into the general rejection sampling lemma above.
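
A toy 1-D Python sketch of the rejection sampling step (truncated discrete Gaussian over the integers with made-up $\sigma$ and $M$; the real scheme works over $\ZZ^m$ with vectors and a concrete bound on the ratio):

```python
# Output z = y + v with y ~ D_sigma, accepted with probability
# min(1, D_sigma(z) / (M * D_{v,sigma}(z))); the accepted z's look like a
# centered Gaussian, hiding the secret-dependent shift v.
import math, random

sigma, M = 10.0, 3.0

def rho(x, center=0.0):
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def sample_shifted(v):
    """One attempt: sample y ~ D_sigma (truncated), return z = y + v or None (abort)."""
    support = list(range(-8 * int(sigma), 8 * int(sigma) + 1))
    weights = [rho(x) for x in support]
    y = random.choices(support, weights)[0]
    z = y + v
    accept_prob = min(1.0, rho(z) / (M * rho(z, center=v)))
    return z if random.random() < accept_prob else None

v = 7                                    # secret-dependent shift (e.g. c * s)
accepted = [z for z in (sample_shifted(v) for _ in range(20000)) if z is not None]

# empirically, accepted samples are centered near 0 (not near v), at rate ~ 1/M
mean = sum(accepted) / len(accepted)
print(f"acceptance rate ~ {len(accepted) / 20000:.2f}, sample mean ~ {mean:.2f}")
```
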
Short Invertible Elements in Partially Splitting Cyclotomic Rings
  • First of all, think geometrically, prove algebraically.
  • Suppose we have a ring represented in CRT form and write the modulus as $p = \prod p_i^{e_i}$; short elements $v$ lying in $(-p_i, p_i) \setminus \{0\}$ are invertible.
    • Their CRT form won’t contain 0, thus they are invertible.
    • Their CRT form is a vector of same value $v$.
    • If the $\ell_\infty$-norm bound goes beyond $p_i$, the challenge space $C$ will contain non-invertible elements, where 0 shows up in their CRT form.
  • Suppose we have a $m^{th}$ cyclotomic polynomial $\Phi_m(x) \equiv \prod\limits_{i = 1}^{\varphi(m)}(x - r_i) \mod p$, with $r_i$ being $m^{th}$ primitive root of unities over $\ZZ_p^\ast$.
    • Then $\Phi_m(x) \equiv \Phi_{\delta(m)}(x^{m / \delta(m)}) \mod p$, with $\delta(m) = \prod\limits_{\text{prime}~p_i | m} p_i$.
    • More generally we have $\Phi_m(x) \equiv \Phi_{z}(x^{m / z}) \mod p$ for $z = \prod p_i^{f_i}$ with $1 \le f_i \le e_i$, because of $\varphi(m) = m / \delta(m) \cdot \varphi(\delta(m)) = m / z \cdot \varphi(z)$.
  • Now, let $\Phi_m(x) \equiv \prod\limits_{i = 1}^{d}f_i(x) \mod p$, and let $\Lambda$ be the set of $\bz \in R = \ZZ[x] / \Phi_m(x)$ with $\bz \equiv \pmb{0} \mod (f_i(x), p)$. Note that $\Lambda$ forms an ideal lattice in $R$.
    • A polynomial in $R_p$ that is invertible, then its modulo $f_i$ over $p$ should always be non zero.
    • We write $\lambda_1(\Lambda)$ as the Euclidean length of the shortest non-zero vector in $\Lambda$.
    • We have $\det(\Lambda) = | \ZZ^{\varphi(m)} / \Lambda | = p^{\varphi(m) / d}$, and we can use it to retrieve $\lambda_1(\Lambda)$ upper bounded by $\dfrac{\sqrt{\varphi(m)}}{s_1(m)} \cdot \det(\Lambda)^{1/ \varphi(m)}$.
    • Supposing $\by \in R$ has $\by \equiv \pmb{0} \mod (f_i(x), p)$ and $0 < \lVert \by \rVert_2 < \lambda_1(\Lambda)$, this contradicts the shortest vector of the ideal lattice. (It is not long enough to wrap around.)
  • In short, given such an ideal lattice $\Lambda$ in $R$, if every $\bc \in C$ is shorter than its shortest non-zero vector, then it is always invertible.
More Efficient Commitments from Structured Lattice Assumptions (BDLOP commitment)
  • This is one of the few papers I read when I started self-teaching in Algorand years.
  • The BDLOP paper has following contributions:
    • The BDLOP additive commitment itself, instantiated to be computationally hiding and statistically binding.
    • A connection between knapsack problems and LWE/SIS, giving a duality where the search/decision problems are statistically or computationally hard depending on parameters.
    • A proof of opening with relaxed soundness (which can be extended to a proof of linear relations), predating the introduction of lattice-based proofs of exact knowledge.
  • The search version of the knapsack problem ${\sf SKS}^2_{n, k, \beta}$ in the $\ell_2$ norm asks to find a $\by$ with $\lVert \by \rVert_2 \le \beta$ solving $\begin{bmatrix} {\bf I}_n & {\bf A}' \end{bmatrix} \cdot \by = \pmb{0}$, with ${\bf A}'$ uniformly random.
    • Supposing each element in $\by$ is invertible, the probability over ${\bf A}'$ that a fixed $\by$ solves the system above follows from a Schwartz-Zippel-style argument.
    • Statistical hardness derives from a union bound over all $\by$ with $\lVert \by \rVert_2 \le \beta$, each with its probability of solving ${\sf SKS}$ over the randomness of ${\bf A}'$.
  • The decision version of the knapsack problem ${\sf DKS}^\infty_{n, k, \beta}$ in the $\ell_\infty$ norm asks to distinguish $({\bf A}', \begin{bmatrix} {\bf I}_n & {\bf A}' \end{bmatrix} \cdot \by)$ from a uniformly random sample.
    • Statistical hardness derives from Leftover Hash Lemma (LHL), that ${\bf A}’$ serves as a hash function.
  • As the norm bound $\beta$ increases, the ${\sf SKS}$ degrades from statistically to computationally hard, while ${\sf DKS}$ strengthens from computationally to statistically hard.
    • The choice of $\beta$ gives the duality of search/decision problem being statistically/computationally hard.
  • Essentially, the BDLOP commitment hides the message by masking it with an (M)LWE sample, while binding comes from a statistically secure version of SIS, namely ${\sf SKS}$ (see the sketch below).
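
A toy Python sketch shaped like the BDLOP commitment but over plain $\ZZ_q$ vectors instead of a polynomial ring (all parameters are made up; only commit/open correctness is exercised, not hiding or binding):

```python
# c = (c1, c2) with c1 = A1 * r (binding part) and c2 = A2 * r + m (hiding part),
# where r is short; opening reveals (m, r) and checks shortness plus the equations.
import random

Q, N, L, K, BETA = 3329, 4, 2, 8, 1    # modulus, binding rows, msg len, randomness len, norm bound

def rand_matrix(rows, cols):
    return [[random.randrange(Q) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) % Q for row in M]

def commit(A1, A2, m):
    r = [random.randint(-BETA, BETA) for _ in range(K)]          # short randomness
    c1 = matvec(A1, r)                                           # binding part
    c2 = [(x + y) % Q for x, y in zip(matvec(A2, r), m)]         # hiding part masks m
    return (c1, c2), r

def verify_opening(A1, A2, com, m, r):
    if max(abs(x) for x in r) > BETA:                            # opening must be short
        return False
    c1, c2 = com
    return c1 == matvec(A1, r) and c2 == [(x + y) % Q for x, y in zip(matvec(A2, r), m)]

A1, A2 = rand_matrix(N, K), rand_matrix(L, K)                    # public parameters
m = [random.randrange(Q) for _ in range(L)]
com, r = commit(A1, A2, m)
assert verify_opening(A1, A2, com, m, r)
assert not verify_opening(A1, A2, com, [(x + 1) % Q for x in m], r)
```
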
Algebraic Techniques for Short(er) Exact Lattice-Based Zero-Knowledge Proofs (exact proof of knowledge)
  • This paper and Efficient Lattice-Based Zero-Knowledge Arguments with Standard Soundness: Construction and Applications both solved lattice based exact knowledge of linear system, in a similar flavor.
  • The trick: to prove exact knowledge of $\vec{s} \in \ZZ_q^m$ satisfying the linear system $A \vec{s} \equiv \vec{u} \pmod{q}$ for $A \in \ZZ^{n \times m}_q$, we can rely on $A$ being a strong (R)SIS instance, which binds a short $\vec{s}$ to the linear system.
  • The shortness of $\vec{s} \in \{0, 1, 2\}^m$ can be checked through $(\vec{s} - \vec{2}) \circ (\vec{s} - \vec{1}) \circ \vec{s} = \vec{0}$, and can be lifted to NTT form as follows:
    • Suppose $\vec{s}$ is lifted to a ring element $s \in \ring_q$ with NTT representation $\hat{s} = \vec{s}$; then the check becomes $(s - 2)(s - 1)s = 0$ over $\ring_q$.
  • The main protocol of exact knowledge is a sigma-protocol, that:
    • The prover samples a uniform masking polynomial $y \in \ring_q$ and sends $\vec{w}$ with $A \hat{y} \equiv \vec{w} \pmod q$;
    • Verifier samples a uniform challenge $c \in \ZZ_q \subseteq \ring_q$;
    • Finally prover outputs masked secret $z = y + c \cdot s$.
    • This gives extraction of $A (\hat{z} - \hat{z’}) \equiv (c - c’) \cdot \vec{u} \pmod q$.
  • Prover additionally sends out commitment to $y$ and $s$, such that verifier can check $C_z \stackrel{?}{=} c \cdot C_s + C_y$.
    • This improves extractability to $A (c - c’) \hat{s} \equiv (c - c’) \cdot \vec{u} \pmod q$.
    • Cancelling $c - c’$ terms on both sides, the rest is to prove the $\hat{s}$ over $C_s$.
  • Noticing $(z - 2c) (z - c) z = y^3 + 3y^2(s - 1)c + y(3s^2 - 6s + 2)c^2 + (s - 2)(s - 1)s c^3$.
    • The prover additionally commits to $y^3$, $3y^2(s - 1)$, and $y(3s^2 - 6s + 2)$.
    • Verifier checks $C_{(z - 2c)(z - c)z} \stackrel{?}{=} C_{y^3} + c \cdot C_{3y^2(s-1)} + c^2 \cdot C_{y(3s^2 - 6s + 2)}$.
  • Finally, the verifier needs to check that all the commitment terms are well formed, following the BDLOP commitment proof of opening, which is binding.
Practical Product Proofs for Lattice Commitments (short invertible elements on fully splitting rings, with high probability)
  • One of the main contributions of this paper over Short Invertible Elements in Partially Splitting Cyclotomic Rings is that it allows for short invertible elements in fully splitting rings (namely, no restriction on the modulus $q$) with high probability.
  • An invertible ring element has all its NTT coefficients being non-zero, while an NTT coefficient is a polynomial ring element’s evaluation over a cyclotomic ring root-of-unity.
  • Now write $\bc = \sum\limits^{d - 1}_{i = 0} c_i X^i \in \ZZ_q[X] / (X^d + 1)$, where $\ring_q \cong \prod\limits_{i \in \ZZ_{2 \ell}^\times} \ZZ_q [X] / (X^{d / \ell} - \xi^i)$ (note that $X^{d / \ell} - \xi^i$ does not necessarily have to be irreducible, which does not matter for the result here).
  • The proof’s main idea is that:
    • Supposing each coefficients of $\bc \in \ring_q$ are iid sampled, $\bc$ and $\sigma(\bc)$ are identically distributed, where $\sigma \in \mathsf{Aut}(\ring_q)$.
    • Since $\ring_q / (X^{d/ \ell} - \xi^i) \cong \ring_q / (X^{d / \ell} - \xi^j)$, the reductions $\bc \bmod (X^{d / \ell} - \xi^i)$ and $\bc \bmod (X^{d / \ell} - \xi^j)$ are identically distributed, for $\sigma(\bc \bmod (X^{d / \ell} - \xi^i)) = \sigma(\bc) \bmod (X^{d / \ell} - \xi^j)$.
    • Thus all the polynomial splits share the same distribution, so we can argue about one split, and the argument goes through for the others.
    • Observe that $\bc \bmod (X^{d / \ell} - \xi)$ can be computed as: $\bc = \sum\limits_{k = 0}^{\ell - 1} X^{k d / \ell} \sum\limits_{j = 0}^{d / \ell - 1} c_{k d / \ell + j} X^j \equiv \sum\limits_{k = 0}^{\ell - 1} \xi^k \sum\limits_{j = 0}^{d / \ell - 1} c_{k d / \ell + j} X^j \pmod{X^{d / \ell} - \xi}$.
    • The probability density of each coefficient in the polynomial split is a convolution of the (shifted) probability densities of coefficients of $\bc$, that $\mu_k(0) = p$, and $\mu_k(\xi^{\pm k}) = (1 - p) / 2$ ($\bc$’s probability density is $\mu_0$).
    • The rest follows from Fourier analysis to upper bound the $\ell_\infty$-norm of the probability density of the split's coefficient $\sum\limits_{k = 0}^{\ell - 1} c_{k d / \ell + j} \xi^k$ for $j \in [0, d/\ell - 1]$, which bounds the probability of the split being 0.
Practical Exact Proofs from Lattices: New Techniques to Exploit Fully-Splitting Rings (soundness amplification for exact proof of knowledge)
  • The soundness error of BLS19 is $1 / q$, since the challenge in the first round of the $\Sigma$-protocol is $c \in \ZZ_q \subset \ring_q$; the protocol needs to be repeated multiple times for sufficient soundness.

TODO: Practical Lattice-Based Zero-Knowledge Proofs for Integer Relations

TODO: Lattice-Based Zero-Knowledge Proofs and Applications: Shorter, Simpler, and More General


[Lattices]

TODO: Streaming Authenticated Data Structures

TODO: Leveled Fully Homomorphic Signatures from Standard Lattices

TODO: FHE Circuit Privacy Almost For Free

TODO: Lattice (List) Decoding Near Minkowski’s Inequality

TODO: Hardness of the (Approximate) Shortest Vector Problem: A Simple Proof via Reed-Solomon Codes

TODO: Succinct Vector, Polynomial, and Functional Commitments from Lattices

TODO: Attribute-Based Encryption for Circuits of Unbounded Depth from Lattices

TODO: Chipmunk: Better Synchronized Multi-Signatures from Lattices

TODO: LatticeFold: A Lattice-based Folding Scheme and its Applications to Succinct Proof Systems

TODO: A New Approach for Non-Interactive Zero-Knowledge from Learning with Errors


[Leakage Resilient Secret Sharing line of work]

TODO: On the Local Leakage Resilience of Linear Secret Sharing Scheme

TODO: Constructing Locally Leakage-resilient Linear Secret-sharing Schemes


[Combinatorics Papers]

An Alternative Proof of The Schwartz-Zippel Lemma

A degree $d$ polynomial in $n$ variables cuts out a hypersurface $S$ in $\field^n$. There are at most $d$ intersection points between a hypersurface $S$ and an independent line (i.e., the intersection is not the whole line).

This explains separating out the homogeneous part $g$ of total degree $d$ from $f$: we need $g$ to find a direction independent of the hypersurface $S$ in $\field^n$.

If one can find a direction that is everywhere independent of the hypersurface $S$, one can foliate $\field^n$ by parallel lines in that direction and count intersections within each line.

Question: why do we want to foliate $\field^n$?

Recall that we want to upper bound the number of $\pmb{\alpha}$'s such that $f(\pmb{\alpha}) = 0$ over $\field^n$, namely $|S|$, the number of points on $S$ in $\field^n$.

The foliation is parameterized by the direction's orthogonal complement, which is isomorphic to $\field^{n-1}$, so all points of $\field^n$ are covered by $|\field|^{n-1}$ parallel lines, and we count the intersection points between each line and the hypersurface $S$.

The total number of lines is $|\field|^{n-1}$, and each line contains either at most $d$ hypersurface points or the whole line (the latter case is eliminated by independence), so the total number of hypersurface points across all of $\field^n$ is at most $d \cdot |\field|^{n-1}$ (see the brute-force check below).
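
A brute-force Python check of the bound and of the foliation argument, on a made-up degree-3 polynomial over a small field:

```python
# The zero set of a nonzero degree-d polynomial in F_p^n has size at most
# d * p^(n-1); along a direction where the leading form does not vanish,
# every line meets the hypersurface in at most d points.
from itertools import product

p, n, d = 7, 3, 3

def f(x, y, z):                        # toy polynomial of total degree d = 3
    return (x * y * z + x * x + 2 * y + 1) % p

zeros = sum(1 for pt in product(range(p), repeat=n) if f(*pt) == 0)
assert zeros <= d * p ** (n - 1)
print(f"|S| = {zeros} <= d * p^(n-1) = {d * p ** (n - 1)}")

# foliation along direction (1,1,1), where the leading form g = xyz gives
# g(1,1,1) = 1 != 0: every parallel line meets S in at most d points
for a, b in product(range(p), repeat=2):
    on_line = sum(1 for t in range(p) if f((a + t) % p, (b + t) % p, t) == 0)
    assert on_line <= d
```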

TODO: Multilinear Schwartz-Zippel mod N with Applications to Succinct Arguments