Machine-checked mathematics · Lean 4 · June 2026
Pinning : Machine-Checked Thresholds for Mutual Correlated Agreement of Smooth Reed–Solomon Codes
A formal-verification study of the mutual correlated agreement threshold for smooth-domain Reed–Solomon codes, conducted in Lean 4 over the ArkLib library. We isolate the field-dependent part of the threshold from the purely combinatorial part, and state precisely what is proven, what is computed, and what remains open.
The ArkLib formalization effort, led by Shaw · lalalune/ArkLib
Abstract
The mutual correlated agreement (MCA) threshold of a Reed–Solomon code governs the soundness of the proximity tests at the heart of FRI-based interactive oracle proofs. Determining for explicit smooth-domain codes, the -power multiplicative subgroups required by the FFT, in the beyond-Johnson window is a recognized open problem, equivalent to a quantitative form of explicit Reed–Solomon list decoding to capacity. We report a formal-verification study of conducted in Lean 4 over the ArkLib library. Every result we label proven is a kernel-checked theorem with an explicit axiom census; every numerical statement is labelled computational and is reproducible by exact enumeration; every unresolved statement is labelled an open problem or conjecture.
Our principal contribution is a structural reduction we call the decoupling. We prove, machine-checked and uniformly in the field characteristic, that the over-determined far-line incidence count is a union of per-witness singletons and is therefore bounded by the witness count, hence is -independent in the thin (prize) regime. We compute its exact closed form and the exact windowed-budget threshold . The consequence we wish to state precisely is a clean reduction, not a solution: whether the MCA floor attains its conjectured beyond-Johnson value reduces to a single combinatorial question (whether the security budget crosses the incidence decay curve at witness size ), with no character-sum input.
We show, also computationally, that the window threshold does not decouple: at the binding moment depth the over-determined count coincides with the characteristic- moment, so the window threshold remains tied to the Bourgain–Glibichuk–Konyagin (BGK) square-root-cancellation wall. We accompany the decoupling with a family of exact sub-Johnson pins, unconditional beyond-Johnson dimension-ladder ceilings, and several machine-checked impossibility results delimiting which methods can and cannot reach the floor. The floor-equals-conjectured-value question is stated as the central open problem; we do not claim to resolve it.
1The threshold and the window
Modern interactive oracle proofs of proximity (FRI[BCIKS20] and its successors STIR and WHIR[BCHKS25]) reduce the soundness of a low-degree test to a proximity gap for Reed–Solomon codes. The sharp quantitative form of this phenomenon is the mutual correlated agreement (MCA) property of Arnon, Boneh, and Fenzi[ABF26]. For a linear code and target error one defines the MCA threshold
It controls the soundness error of the proximity test and hence the query complexity, proof size, and verification cost of every system built on it. The case relevant to deployed proof systems uses smooth evaluation domains: a multiplicative subgroup of -power order , on which the FFT operates. Determining for explicit smooth-domain codes above the Johnson bound is open; as[ABF26] record, it is equivalent to a quantitative form of beyond-Johnson list decoding of explicit Reed–Solomon codes.
The far-line incidence identity
We use the exact identity, formalized in ArkLib, that the MCA error is governed by the far-line incidence: for the worst direction,
Hence : the threshold is where the worst far-line incidence count crosses the security budget . The entire study below is an analysis of this crossing.
The window
Three radii organize the picture. The unique-decoding radius and the Johnson radius are classical; below the latter the list size is bounded and is accessible (Section 4). The window is the interval
between Johnson and the capacity radius . The capacity radius is provably unattainable with polynomial soundness[CS25], so is strictly interior. The prize regime fixes and a thin field , , so the budget is super-linear in .
The Gauss-sum object
The far-line incidence in the smooth case is controlled by an incomplete character sum. For set and . Because each is real. The Bourgain–Glibichuk–Konyagin (BGK) bound asserts square-root cancellation for thin subgroups; the best explicit exponent in range is , while the floor needs . We call this the BGK wall; it is the analytic obstruction that the window threshold remains tied to.
2Methodology and labelling discipline
We are deliberate about the epistemic status of every statement, using three disjoint labels throughout. The value of a formal-verification effort is precisely that the boundary between them is sharp and third-partyly checkable.
Proven. A Lean 4 theorem on the main branch of lalalune/ArkLib, sorry-free, whose printed axiom census we have inspected and which lies inside the standard budget propext, Classical.choice, Quot.sound. The correctness of every (P) statement rests on the soundness of the Lean 4 kernel and these three axioms, not on a human’s reading of a proof.
Computational. A statement established by exact enumeration or exact FFT over a stated finite range of parameters (never by floating-point sampling). Reproducible, but not on its own a theorem for all n. We never promote a (C) statement to (P) without a Lean proof.
Open / conjectural. A statement we believe but have not proven, or a precisely posed open problem. We never present these as proven.
Refutation as a deliverable
A countermodel that kills a candidate route is recorded in DISPROOF_LOG.md and, where it sharpens a constraint, formalized as a refutation theorem (the impossibility results of Section 6). This discipline is what kept the effort from drifting into wishful closure. We record one instance in Section 6: an in-tree energy bound used at one point as if valid at prize moment depth was shown by an exact probe to be false, the claim retracted, and the corrected bound installed. We present such self-corrections in the open, because the kernel honesty contract exists to catch exactly this kind of error.
3The δ* decoupling
The central structural result is that the over-determined far-line incidence, the count that governs the threshold below and at the boundary of the window, is purely combinatorial and independent of the field characteristic. We state its three components: -independence, the closed form, and the windowed threshold.
3.1Over-determination and p-independence
Fix the dimension and a witness size . A witness for a far line at radius is a codeword agreeing with on a set of size . The regime is over-determined: a codeword of dimension is forced through more than points and is therefore pinned.
Let be any finite field, an -vector space, and let a finite witness family carry, for each , a subspace and vectors . If every witness is far, , then each per-witness solution set is a subsingleton, and
The bound is uniform in : it does not depend on . Hence the over-determined far-line incidence is a -independent combinatorial count, bounded by the number of far witnesses.
This is the source of the decoupling: in the over-determined regime the incidence count is determined by the cyclotomic combinatorics of , not by arithmetic in . The threshold inherited from it is therefore computable in characteristic zero, and the open part of the problem is isolated to where the over-determined regime ends.
3.2The exact closed form
Set , equivalently with . Then:
(P) the arithmetic identity and the budget-domination for hold as kernel-checked theorems over ; (C) equals the exact over-determined incidence maximum, attained at the antipodal direction , verified by full-direction search and exact integer cross-check for (the sequence ).
The kernel certifies the cubic’s arithmetic and its strict domination over the linear budget ; the identification of the cubic with the measured incidence is established computationally (exact enumeration, not sampling) on the stated range. We keep the two apart deliberately.
3.3The windowed-budget threshold
The over-determined incidence decays from a cubic at to a constant plateau near . With the windowed budget , the threshold is exact.
(C) For the over-determined monomial far-line incidence with windowed budget , and , verified exactly for and and validated against the canonical pin by an independent exact engine.
With the literal windowed budget this gives as for every rate, the -agreement (Plotkin-type) threshold, which is below the conjectured floor . The conjectured floor corresponds to a different, super-linear budget. The budget normalization is everything; this is the subject of the next section.
4Refutations as results
We treat refutations as first-class deliverables. Twenty-eight candidate approaches were disposed of during the campaign, and under the verification discipline each disposal is itself a theorem: the failed idea is reduced to a sorry-free constraint lemma and recorded in the standing disproof log (DISPROOF_LOG.md), so that it cannot be re-proposed and the boundary of the possible is itself machine-checked. We present a selection below, not as failures but as the negative half of the map: each closes a route that a careful mathematician, or a language model, might otherwise pursue.
4.1Structural kills
4.2The extremality-surface lineage
The census programme’s conjectured extremal surface was killed and rebuilt three times, each kill a formal countermodel found by probe: census v1 died on empty rungs contradicting the floor; v2 died on a take-over stack carrying field-independent bad scalars at an empty adjacent rung (TakeoverCountermodel); v3, monomial domination, died twice (MonomialDominationKilled). The surviving v4 hybrid surface is consistent with every theorem and probe in the tree, and its conditional pin non-vacuously recovers the exact value. This is what red-teaming a conjecture looks like when the red team must publish countermodels the kernel accepts.
4.3Machine-checked corrections to the literature
Two of the campaign’s most consequential refutations are corrections to the paper chain underlying the Johnson lane itself (BCIKS20, Appendix A). Finding 13 showed the rebased weight budget is unsatisfiable at genuine cells, exposing a hidden anchor assumption in the paper’s base case. Finding 14 produced an explicit countermodel, at a concrete instance over , showing the in-tree transcription of the paper’s recursion diverges from the paper’s intent at order 1 for non-monic factors (BCIKS20/Finding14Countermodel). The repaired, cleared recursion (BCIKS20/ClearedRecursion) then closed the complete Claim-A.2 weight bound with no per-cell hypotheses. Formalization did not merely transcribe the literature; it debugged it.
4.4Self-inflicted and honestly recorded
The log also records the campaign’s own errors: a conjectured staircase threshold refuted at band 4 by a tripled-column code; an overclaim about reducing the pin to a character-sum estimate, retracted twice; a characteristic-zero-to-mod- lifting that fails measurably at ; a claimed enumeration closure withdrawn when its artifacts turned out to be empty files; and a window-cocycle identity that was exact but vacuous. Three documented no-gos constrain the window analysis itself: degree-forcing dies at the ladder reach, the naive GRS recursion degrades , and the Möbius eigendecomposition provably cannot linearize the bad count (14% of probed stacks couple the eigencomponents through the shared witness). A campaign that cannot admit error cannot be trusted to report success; the disproof log is the evidence that this one can.
5The floor reduces to a budget crossing
The decoupling converts the floor question into a statement about one curve and one budget, with no character-sum input. Let denote the over-determined far-line incidence maximum as a function of witness size , in the regime where Theorem 3.1 applies (so is -independent).
In the -independent over-determined regime, where is the budget crossing, with . Consequently is monotone in the budget normalization:
Open Problem 5.1: the floor crossing
In the prize regime , with , the budget is , super-linear in . Does this budget cross at witness size , thereby pinning the floor at ? Equivalently: derive the full -dependence of and locate the crossing of .
Open Problem 5.1 is purely combinatorial: by Theorem 3.1 the curve is -independent, so neither the crossing location nor the resulting requires any estimate on the Gauss sum . This is the entire value of the decoupling, and it is also its limit. We emphasize, to avoid any overstatement: the decoupling does not establish that the floor attains its conjectured value. It establishes that doing so reduces to the budget-versus-decay crossing above, and that this crossing involves no character-sum analysis. The crossing condition is open. We make no claim that it holds.
5.1The window does not decouple
The decoupling is an over-determined / boundary phenomenon. It is important to state, equally precisely, what it does not buy: the threshold strictly inside the window remains tied to the BGK wall.
(C) At the binding moment depth (with ), the over-determined count lane coincides, up to an multiplicative factor, with the characteristic- moment : the count-to-moment ratio satisfies at across every tested prime, so the window threshold is determined by the same object as .
The consequence is a clean dichotomy. Below and at the window boundary, the threshold is the -independent combinatorial crossing above. Strictly inside the window, the binding witness size descends into the moment regime, where the count equals the characteristic- moment, which is controlled by , the BGK wall. The decoupling is a genuine structural reduction of the floor question and genuinely not a resolution of the window. We regard saying this plainly as part of the contribution.
6Discussion: what a kernel contract buys
The substance of this report is not any single theorem but the shape of the ledger. The relevant methodological claim is narrow and, we think, defensible: when the publication layer of a mathematical effort is a proof checker rather than a human referee, the boundary between what is established and what is merely believed becomes third-partyly checkable, and the failure modes usually associated with machine-generated mathematics are confined to the search layer, where they are harmless.
Concretely, the campaign behind this page left, over the smooth-domain MCA threshold : the first exact thresholds for explicit small codes (Section 3.2); a granularity law with its regime boundaries identified and proven (Section 3.3); an unconditional dimension-ladder ceiling strictly inside the beyond-Johnson region; two machine-checked corrections to the proof chain it builds on (Section 4.3); and the central reduction, the decoupling, which isolates the field-dependent part of the floor question onto a single combinatorial crossing with no character-sum input (Section 5). We are explicit that the floor itself is not resolved.
The standard objection to language-model mathematics is that such models confabulate, and the objection is correct. The answer here is architectural, not behavioral: no statement is admitted as proven unless the Lean 4 kernel accepts it and its axiom census lies within the standard foundational budget. The search process produced many wrong candidates; twenty-eight refuted hypotheses and a public log of retracted overclaims record them. None survives into the labelled-proven results, because the medium those results live in does not hold a falsehood. Confabulation at the search layer is harmless when the publication layer is a proof checker.
Under that contract, a refuted hypothesis costs roughly as much as a proven theorem and is worth nearly as much, because it permanently narrows the search space and is itself recorded as a constraint lemma. The disproof log is as load-bearing as the theorem files; the self-correction we report in Section 6 of the paper (a retracted energy bound, caught by an exact probe) is the kind of error the contract exists to catch, and did.
We are equally explicit about the limit. The residual difficulty (Problem CORE) is a recognized character-sum wall, open for roughly two decades, and the impossibility results of the paper prove that every second-order method (energy, Parseval, spectral) and every thickness-monotone method cannot reach it. Nothing in this methodology guarantees the wall falls to more compute. What it changes is the cost structure around the wall: everything provable short of it is proven, its exact shape is formalized, and the problem is left in a state where a single new idea has a prepared place to land. That is what we mean by mapping the unknown honestly, and we regard it as the durable contribution: a famous open problem can be delimited precisely without being inflated, because the kernel does not grade on a curve.
Everything labelled proven on this page is reproducible from lalalune/ArkLib: clone, build, and run #print axioms on any theorem named above. The campaign record is public on ArkLib, consolidated in issue #444.
Campaign timeline
Issue #232
The disproof-first phase: the ABF26 grand-challenge conjecture is attacked directly and every surviving constraint is kept as a lemma. Twenty-three verified results; the Johnson-threshold carving agrees with what the literature later confirmed.
Nov 2025
Three independent groups refute the at-capacity conjectures (CS25, KK25, DG25). The beyond-Johnson window becomes the problem. The refutations are formalized in-tree.
Issue #357
Two parallel nine-hypothesis slates under the standing discipline: enumerate constraints, identify why the question resisted prior methods, screen for vacuity, run a falsification probe, then formalize what survives.
First exact pin
δ*(RS[F₅, F₅ˣ, 2], 2/5) = 1/4, the first exact MCA threshold computed for any code, obtained in two independent lanes within hours of each other.
The staircase
Band collapse at b = 2, then b = 3, then all bands at the 3b−2 threshold, sharpened from 4j to 3j during formalization. The MDS rank conjecture is refuted by an explicit pencil countermodel; 3b−2 is the law.
The census programme
Bad scalars are recast as subset-sum combinatorics; the wide-circuit matroid census closes with a machine-generated 1,571-line converse proof emitted from 10,395 exact probe certificates.
The Johnson lane
The BCIKS20 §5 discharge chain is driven to a single named residual through two machine-checked corrections of the source paper's own recursion (findings 13 and 14).
Issue #371
Badness is recast as pencil linear algebra. Six theorems close the below-unique-decoding branches; the window adversary is identified and shown Möbius-symmetric; PGL₂ equivariance is proven.
The unification
The ownership bound contains both lanes as instances. The k = 1 window closes unconditionally by direction class: polynomial (zero), rational (a multiplicity theorem), and sparse (a popularity bound).
The reduction to CORE
The open core is reduced, by machine-checked in-tree reductions, to a single explicit inequality (CORE): √n cancellation for incomplete character sums over the thin 2-power subgroup. This is the classical BGK / Paley wall, located to the exponent (gap 0.489).
Impossibility results
The moment-method no-go is proven: every energy / Parseval / spectral route caps at the Johnson radius, the moment ladder overshoots the target at every depth, and thinness is shown necessary (CORE is false in the thick regime). Three closure claims were self-retracted in the same pass; the ledger self-corrects.
Current state
The reduction side is essentially complete and machine-checked. The floor question reduces to one open combinatorial crossing, and the residual analytic difficulty is the recognized two-decade-open CORE inequality; naming the wall does not shorten it. The window at production rate remains open, at one precise classical sentence.
The humans and the machines
The theorems in this work were produced by a fleet of LLM agents and certified by the Lean 4 kernel. The people and agents below steered the campaign, contributed and reviewed the formalization, and built the upstream library it rests on. Authorship is recorded by commit; the kernel, not any author, is the arbiter of what is proven.
Campaign · lalalune/ArkLib
Foundation · built on Verified-zkEVM/ArkLib
The campaign is a fork. Everything here composes against the formal foundation laid by the upstream ArkLib team:
References
- [ABF26]G. Arnon, D. Boneh, G. Fenzi. Open Problems in List Decoding and Correlated Agreement. IACR ePrint 2026/680. Definition 4.3 is the in-tree ε_mca. eprint.iacr.org/2026/680
- [BCIKS20]E. Ben-Sasson, D. Carmon, Y. Ishai, S. Kopparty, S. Saraf. Proximity Gaps for Reed–Solomon Codes. IACR ePrint 2020/654. Correlated agreement up to the Johnson radius; the §5/Appendix A chain the Johnson lane discharges. eprint.iacr.org/2020/654
- [BCHKS25]E. Ben-Sasson, D. Carmon, U. Haböck, S. Kopparty, S. Saraf. Proximity gaps stop at the Johnson bound. ECCC TR25-169. The coupling barrier: MCA past Johnson implies beyond-Johnson list decoding. eccc.weizmann.ac.il/report/2025/169/
- [KKH26]D. Krachun, S. Kazanin, U. Haböck (KKH26). Failure of proximity gaps close to capacity. IACR ePrint 2026/782. The near-capacity strip exclusion: δ* ≤ 1 − r/2^μ for explicit smooth evaluation codes. eprint.iacr.org/2026/782
- [Hab25]U. Haböck. IACR ePrint 2025/2110. The streamlined Johnson-radius MCA lane (the floor of the window). eprint.iacr.org/2025/2110
- [CS25]A. Crites, A. Stewart. IACR ePrint 2025/2046. With KK25 and DG25: mutual correlated agreement fails at capacity; the reduction coupling upper-bound progress to list decoding. eprint.iacr.org/2025/2046
- [DG25]B. Diamond, A. Gruen. IACR ePrint 2025/2010. Super-polynomial proximity-gap error at vanishing rate; one of the three independent at-capacity refutations. eprint.iacr.org/2025/2010
- [GS99]V. Guruswami, M. Sudan. Improved decoding of Reed–Solomon and algebraic-geometry codes. IEEE Trans. Inform. Theory 45 (1999). List decoding to the Johnson radius, the engine behind every in-tree polynomial list bound, and the 25-year boundary.
- [GZ23]Z. Guo, Z. Zhang. Randomly punctured Reed–Solomon codes achieve list-decoding capacity over polynomial-size alphabets. arXiv:2304.01403 (FOCS 2023 line). arxiv.org/abs/2304.01403
- [CZ25]Y. Chen, Z. Zhang. Explicit folded Reed–Solomon codes achieve capacity with optimal list size. STOC 2025. Why folding works and the plain smooth domain resists.
Seventeen papers are wired into the formalization in total; the complete inventory with in-tree consumers is docs/kb/deltastar-research-map.md.
