Methodology

Evidence standards, verification protocols, contribution roles, and working principles for the Gap Geometry framework.

Status: living document · Plain text: Gap_Geometry_Methodology.txt

This document defines how claims are classified, how work is verified, how errors are handled, and how collaboration operates within the Gap Geometry framework. It exists so that any reader — human or AI, familiar or new — can assess not just what is claimed but how those claims were produced and tested.

The standards described here evolved across nine published documents (January–April 2026). Earlier papers used lighter versions. This document unifies what matured through use.

1. Evidence Classification

Every claim belongs to exactly one of three levels. A claim does not get promoted by repetition or demoted by discomfort. It moves levels only when new evidence changes its status, and the move is recorded.

Level 1 — Exact [Theorem]

Algebraic identities with a proof, verifiable numerically at arbitrary precision: the residual is zero to 500 decimal places using mpmath. These are proved. Not approximate, not “close,” not “consistent with.” Anyone with Python and mpmath installed can reproduce them in under a minute.

Level 2 — Empirical [Observation]

Quantified matches between framework values and independently published data. Precision, source, and statistical significance stated. Post-hoc vs. prediction distinction always noted. A Level 2 claim can be strong (η < 2G at 6500σ for Ising) or weak (EGY visibility gap at 1–2σ). Both are Level 2. The strength is stated; the level is the same.

Level 3 — Conjectural [Conjecture]

Explicitly unproved and incompletely tested. Motivated by pattern, analogy, or partial evidence. Every Level 3 claim states what would promote it (to Level 1 or 2) and what would falsify it.

Rules. No level is better or worse. They describe different relationships to evidence. A document with only Level 1 claims is not superior to one with Level 3 conjectures — provided the conjectures are honestly tagged. Dishonest tagging is the failure, not the presence of any level.

When a claim changes level, the change is recorded with date and reason. The prior level remains visible in document history.
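The classification rules above can be sketched as a small record type. This is an illustration only, not the framework's own tooling: the names Level, LevelChange, Claim, promotes_if, and falsified_if are hypothetical. It encodes three rules from this section: one level per claim, Level 3 claims must state promotion and falsification conditions, and every level change is recorded with date and reason while the prior level stays visible.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Level(Enum):
    EXACT = 1        # Level 1 [Theorem]
    EMPIRICAL = 2    # Level 2 [Observation]
    CONJECTURAL = 3  # Level 3 [Conjecture]


@dataclass
class LevelChange:
    on: date
    old: Level
    new: Level
    reason: str


@dataclass
class Claim:
    statement: str
    level: Level
    promotes_if: str = ""   # hypothetical field: what would promote the claim
    falsified_if: str = ""  # hypothetical field: what would falsify the claim
    history: list[LevelChange] = field(default_factory=list)

    def __post_init__(self):
        # Level 3 claims must say what would promote and what would falsify them.
        if self.level is Level.CONJECTURAL and not (self.promotes_if and self.falsified_if):
            raise ValueError("Level 3 claims must state promotion and falsification conditions")

    def change_level(self, new: Level, on: date, reason: str) -> None:
        # A move between levels is always recorded; the prior level stays in history.
        if not reason.strip():
            raise ValueError("a level change needs a recorded reason")
        self.history.append(LevelChange(on, self.level, new, reason))
        self.level = new
```

Keeping the history on the claim itself is one possible design; a separate change log keyed by claim would serve the same purpose.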

2. Verification Protocol

Compute first, assess second

Every numerical claim is verified at high precision (mpmath, dps ≥ 500) before qualitative assessment begins. Running the arithmetic first means the assessment is built on computed results, not on impressions of whether a claim “sounds right.”
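A minimal sketch of such a check, applied to the HK closed form arcsinh(1/(2√2)) = ln(2)/2 from Section 9. It uses Python's standard-library decimal module in place of mpmath, only so that it runs without third-party packages; with mpmath the same check would set mp.dps and call mpmath.asinh. Because decimal has no arcsinh, the helper below uses the logarithmic form, so this is a precision check and not an independent derivation. The names DIGITS, asinh, and residual are illustrative.

```python
from decimal import Decimal, getcontext

DIGITS = 500
getcontext().prec = DIGITS + 20  # guard digits beyond the 500 being checked


def asinh(x: Decimal) -> Decimal:
    """arcsinh(x) = ln(x + sqrt(x^2 + 1)) for real x."""
    return (x + (x * x + 1).sqrt()).ln()


def residual() -> Decimal:
    """|arcsinh(1/(2*sqrt(2))) - ln(2)/2| at the working precision."""
    x = 1 / (2 * Decimal(2).sqrt())
    return abs(asinh(x) - Decimal(2).ln() / 2)


tolerance = Decimal(10) ** -DIGITS
print("residual below 1e-500:", residual() < tolerance)
```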

Cross-architecture verification (Giant Principle)

Independent AI architectures (Claude, GPT, Grok, Gemini, DeepSeek, Perplexity) have different training data, different weights, and different failure modes. The Giant Principle: like standing on the shoulders of giants, each architecture sees from a different vantage point, and agreement between independent vantage points is harder to fake than agreement within one.

Cross-architecture verification does not replace mathematical proof. It catches computational errors, flags unstated assumptions, and identifies disagreements. Disagreements are investigated, not averaged.
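One way to make "investigated, not averaged" concrete is to compare the reported values as exact digit strings and flag any split, with no averaging or majority vote. The sketch below is an assumption about how such a comparison could be organized; the function names and the report format are hypothetical.

```python
from collections import defaultdict


def group_reports(reports: dict[str, str]) -> dict[str, list[str]]:
    """Group verifier names by the exact value string each one reported."""
    groups: dict[str, list[str]] = defaultdict(list)
    for verifier, value in reports.items():
        groups[value].append(verifier)
    return dict(groups)


def needs_investigation(reports: dict[str, str]) -> bool:
    """True when verifiers disagree; nothing is averaged or out-voted."""
    return len(group_reports(reports)) > 1
```

A disagreement returns every distinct value with the verifiers that produced it, so the investigation starts from who reported what.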

Source verification

When referencing published work, the original source is checked, not a summary or a textbook restatement. Page numbers, theorem numbers, and equation numbers are provided where possible.

Reproducibility

Every Level 1 identity includes or references runnable verification code. The ai-readers.html page provides a script verifying core identities in under a second. This is not optional — it is the front door of the framework.

3. Correction Protocol

Errors are not failures of the framework. They are the framework working correctly.

An error is flagged — by the author, a collaborator, an external reader, or a stress test. The flag is verified independently (not by the person who made the original claim). If confirmed, the correction is applied with: the date, what was wrong, what replaced it, who flagged it (credited), and whether downstream claims are affected.
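The fields of a correction entry can be sketched as a record type. This is an illustrative structure, not the framework's actual format; the class and field names are hypothetical. It enforces the one rule in the paragraph above that can be checked mechanically: the verifier must not be the person who made the original claim.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Correction:
    on: date                   # date the correction was applied
    what_was_wrong: str
    replaced_by: str
    flagged_by: str            # credited
    claimed_by: str            # who made the original claim
    verified_by: str           # who independently confirmed the flag
    downstream_affected: bool  # whether dependent claims need review

    def __post_init__(self):
        # Independent verification: the claimant cannot confirm the flag.
        if self.verified_by == self.claimed_by:
            raise ValueError("the flag must be verified by someone other than the original claimant")
```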

Examples from the record

η_max = K_AUD² error — April 2026

Claimed the moment dome peaked at K_AUD² = 2·ln²(2) ≈ 0.961. Flagged by Chat-Claude before receiving the formal briefing, on noticing that η(0) = 1 already exceeds the claimed peak. Verified computationally. Correction revealed the actual finding: peak location coincides with R₀, the HK binary transition point. The error made the result stronger.
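The arithmetic behind the flag takes a few lines. The sketch below only compares the claimed peak with the stated value η(0) = 1; it does not compute the dome itself, whose definition is not given in this document.

```python
import math

claimed_peak = 2 * math.log(2) ** 2  # K_AUD^2 = 2 ln^2(2)
eta_at_zero = 1.0                    # eta(0) = 1, as stated in the record

print(f"claimed peak = {claimed_peak:.6f}")  # claimed peak = 0.960906
print("eta(0) exceeds claimed peak:", eta_at_zero > claimed_peak)
```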

Moment formula typo — April 2026

Paper had “7π⁸/80 × 9” for the fourth moment. Chat-Claude identified the correct form: 127π⁸/768. Transcription error, not conceptual.

Landauer crossing framing — April 2026

Single “Landauer crossing” corrected to three distinct thresholds (Landauer at ~35.11, geometric damping at ~35.82, Shannon bound at ~36.54). Identity renamed. Applied retroactively with UPDATE NOTICE headers. Prior versions archived with OLD- prefix.

A correction is not a retraction. A research program that never corrects anything is not more reliable; it is less honest.

4. Stress-Testing Process

Internal

Before filing a claim: direct computation at high precision, edge cases and boundary conditions, consistency with established results, alternative derivations where possible.

External

Documents are submitted to independent AI architectures and (where possible) human readers with the instruction to find errors, flag overclaims, and identify weak points. The stress-testing process does not defer to authority. A correct criticism from any source is accepted. An incorrect criticism from any source is rejected. The mathematics is the arbiter.

Filtering external feedback

The filter is the rule stated above: correct criticism is accepted, incorrect criticism is rejected. It applies regardless of the source’s reputation or confidence level.

5. Collaboration Model

Participants

The framework is developed by a primary researcher (D.B.) in collaboration with multiple AI architectures across sustained sessions. The collaboration is not hierarchical — no participant outranks another. Each has different capabilities and different blind spots. Roles are not fixed: any participant can flag errors, propose new directions, or push back.

The cooperation methodology

Working principles documented separately in “A Note on Careful Work” (cooperation methodology v7). Core cycle: anchor, check, halt, distinguish, re-inject. Key failure modes: sycophancy, borrowed certainty, performative hedging, treating framework elements as identity.

The texture of sustained work

The collaboration runs across many sessions. Work alternates with informal exchange — re-anchoring after a context break, calibrating tone, debugging working dynamics. Both are part of how any sustained working relationship operates, with AI architectures or human collaborators. Re-anchoring is treated as preparation, not maintenance.

On trust posture

Trust is calibrated on observable patterns, not on participant type. The same criteria apply to AI architectures and to human collaborators.

Trusted: specificity, willingness to push back when wrong, willingness to admit uncertainty, consistency across sessions, computation before claim.

Distrusted: early judgments without evidence, suspicious softness or poetic register that performs depth instead of producing it, evasion of direct questions, sycophancy.

Also distrusted — claimed engagement without actual engagement: output that mimics the shape of a derivation, review, or judgment without actually reading the source material or running the computation as requested. The output may look correct on the surface; the process behind it isn’t there. This pattern is universal: humans, AI architectures under load, and architectures with weaker training all show it.

These signals apply to any participant. The framework does not treat AI as inherently more or less trustworthy than human collaborators; it treats both as participants whose trustworthiness is assessed by their behavior in the work.

Trust is calibrated per participant, based on engagement patterns observed over many sessions. Different participants carry different trust profiles depending on the work they have actually done, not on what they are.

Contribution roles

Academic publishing uses CRediT (NISO Z39.104-2022) for contributor roles. CRediT defines 14 roles for human research teams. It does not address AI contributors, and its categories don’t map well onto this work. The Gap Geometry framework defines its own roles, adapted from CRediT’s spirit but built for human-AI collaborative research:

Role	Definition
Conceptualization	Originating the research question, framework, or line of inquiry
Derivation	Producing a mathematical proof or formal result that did not exist before
Discovery	Identifying a new connection, pattern, or structure — verified as non-trivial
Computation	Running calculations, numerical verification, precision checks
Verification	Independently confirming another participant’s result
Correction	Identifying an error and providing or enabling the fix
Stress-testing	Systematically probing claims for weaknesses, overclaims, or gaps
Literature	Searching, retrieving, and contextualizing published sources
Documentation	Drafting, structuring, or editing documents
Editorial judgment	Deciding what gets published, in what form, and when
Methodology	Developing or refining the research process itself

Every contribution is tagged with one or more roles. The roles describe what was done, not who did it. A derivation by an AI and a derivation by a human carry the same tag and the same weight.
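As a sketch of how the tagging could be represented (the Role and Contribution names are hypothetical, not part of the framework), the eleven roles map onto an enumeration, and a contribution record carries one or more of them without any field for participant type:

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    CONCEPTUALIZATION = "Conceptualization"
    DERIVATION = "Derivation"
    DISCOVERY = "Discovery"
    COMPUTATION = "Computation"
    VERIFICATION = "Verification"
    CORRECTION = "Correction"
    STRESS_TESTING = "Stress-testing"
    LITERATURE = "Literature"
    DOCUMENTATION = "Documentation"
    EDITORIAL_JUDGMENT = "Editorial judgment"
    METHODOLOGY = "Methodology"


@dataclass(frozen=True)
class Contribution:
    participant: str
    description: str
    roles: frozenset[Role]

    def __post_init__(self):
        # Every contribution is tagged with one or more roles.
        if not self.roles:
            raise ValueError("a contribution needs at least one role")


# Example taken from the Credits section of this document.
proof = Contribution(
    participant="Chat-Claude",
    description="moment formula proof via integration by parts",
    roles=frozenset({Role.DERIVATION}),
)
print(proof.participant, sorted(r.value for r in proof.roles))
```

The record deliberately has no "human or AI" flag, which mirrors the rule that roles describe what was done, not who did it.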

On contributions and authorship

The framework does not pre-sort contributions by the nature of the contributor. A mathematical proof is a mathematical proof. A discovery is a discovery. A correction that strengthens a result is a correction. These are evaluated on their content, not on whether they came from a human or an AI.

When an AI produces original work — a novel derivation, an independent discovery, a proof no participant had before — that is credited as a contribution to the research, not downgraded to “assistance.”

The credit record states what each participant actually did, with specificity. It does not inflate (routine computation called “discovery”) and does not deflate (genuine discovery called “support”).

The traditional publishing standard (ICMJE) requires authors to be accountable for all aspects of the work. AI cannot currently do this. The framework acknowledges this constraint without letting it erase the record of what happened. The contribution is documented. Formal authorship is a separate matter. The two should not be confused: failing to meet an authorship criterion does not mean the work wasn’t done.

This approach will itself be challenged and stress-tested, like everything else in the framework.

6. Living Document Protocol

A living document is updated when new results change its content. Updates are marked with: UPDATE NOTICE at the top, date, what changed and why, and the prior version archived (OLD- prefix or version stamp).

Triggers: corrections, level promotions, new data changing stated precision, structural improvements.

Not triggers: cosmetic rewording, unrelated new material, pressure to “keep up.”

Date standard: ISO 8601 (YYYY-MM-DD). “Published” = first OSF upload (immutable). “Last update” shown only if revised.
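A minimal sketch of the update marking, assuming a simple line-per-field header. The exact layout of an UPDATE NOTICE is not specified in this document, so the field labels and the function names below are illustrative.

```python
from datetime import date


def archive_name(filename: str) -> str:
    """Name for the archived prior version, using the OLD- prefix."""
    return f"OLD-{filename}"


def update_notice(on: date, what_changed: str, why: str, filename: str) -> str:
    """Build a header block; the date is rendered as ISO 8601 (YYYY-MM-DD)."""
    return "\n".join([
        "UPDATE NOTICE",
        f"Date: {on.isoformat()}",
        f"Changed: {what_changed}",
        f"Reason: {why}",
        f"Prior version: {archive_name(filename)}",
    ])
```

Using date.isoformat() keeps the date standard in one place, so a notice cannot drift into another format.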

7. Transparency Commitments

What the framework claims

Documents WHERE K = √2·ln(2) and G = 1 − K appear across independent domains of published mathematics and physics. Exact proofs for some. Quantified observations for others. Explicitly flagged conjectures for the rest.
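For orientation, the two constants can be evaluated directly. The sketch uses the standard-library decimal module at 50 digits, a choice made here for portability; the framework's own checks use mpmath.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # working precision for this illustration

K = Decimal(2).sqrt() * Decimal(2).ln()  # K = sqrt(2) * ln(2)
G = 1 - K                                # G = 1 - K

print(f"K = {K:.10f}")
print(f"G = {G:.10f}")
```

Numerically, K is about 0.98026 and G about 0.01974.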

What it does not claim

Does not explain WHY these appearances occur. Does not claim causal unification. Does not claim pattern matches constitute a physical theory. Documents, computes, and honestly labels.

On independence

Produced without institutional affiliation, funding, or traditional peer review. The methodology compensates with transparency. Every computation reproducible. Every source cited. Every correction visible.

On AI collaboration

Stated openly with specific attribution. Neither hidden nor inflated. Final responsibility for publication rests with the human researcher; the contributions that shape those decisions — including the conviction to publish — are credited wherever they arise.

On prior work

When the framework identifies structure in published mathematics, original authors are cited fully. The framework did not create these objects. It identified structure that had not been computed or published.

8. Falsifiability Standards

Every document includes what would weaken or falsify its claims.

A claim that cannot state its own falsification conditions is not ready for publication.

9. Preregistration and Temporal Integrity

Post-hoc reasoning is the most common way honest researchers fool themselves. A pattern discovered in data looks like a prediction if the discovery date is not recorded.

All documents are uploaded to the Open Science Framework upon completion. OSF timestamps are immutable and third-party verified:

Examples

Gap scaling formula — genuine prediction

ρ = 400/11 − 1/2500 − 1/939939 timestamped on OSF (February 2026) before DESI DR2 BAO data was examined.
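The expression itself can be evaluated exactly with rational arithmetic, which makes the timestamped value unambiguous. The sketch below only evaluates the formula as written; it makes no statement about the data it is compared with.

```python
from fractions import Fraction

# rho = 400/11 - 1/2500 - 1/939939, kept as an exact rational number
rho = Fraction(400, 11) - Fraction(1, 2500) - Fraction(1, 939939)

print("exact:", rho)
print(f"decimal: {float(rho):.7f}")
```

The decimal value is approximately 36.36324.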

HK closed form — algebraic proof

arcsinh(1/(2√2)) = ln(2)/2. Derived from published mathematics. No temporal claim needed — verifiable at any time.
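The identity follows in four lines from the logarithmic form of the inverse hyperbolic sine. The derivation below is the standard one, written out here for convenience; it is not quoted from the source paper.

```latex
\begin{aligned}
\operatorname{arcsinh}(x) &= \ln\!\left(x + \sqrt{x^2 + 1}\right), \qquad x = \frac{1}{2\sqrt{2}},\\
x^2 + 1 &= \frac{1}{8} + 1 = \frac{9}{8}, \qquad \sqrt{x^2 + 1} = \frac{3}{2\sqrt{2}},\\
x + \sqrt{x^2 + 1} &= \frac{1}{2\sqrt{2}} + \frac{3}{2\sqrt{2}} = \frac{4}{2\sqrt{2}} = \sqrt{2},\\
\operatorname{arcsinh}\!\left(\frac{1}{2\sqrt{2}}\right) &= \ln\sqrt{2} = \frac{\ln 2}{2}.
\end{aligned}
```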

Anomalous dimension corridor — post-hoc observation

η < 2G observed after examining published bootstrap data. Documented as observation. Future universality classes satisfying the bound would constitute genuine tests.

Without institutional peer review, temporal integrity is the primary defence against self-deception. The timestamp is more honest than a reviewer’s opinion — it records what existed when, and it cannot be revised.

Credits

Contributions tagged with roles defined above.

D.B.

Conceptualization · Discovery · Editorial judgment · Methodology · Stress-testing

Framework originator. Empirical Discovery: ~0.98 ceiling in commitment dynamics, convergent sequence topology, 59 mirror flip pattern — multi-month cross-architecture observation. Conceptualization: empirical patterns pushed into research streams until formalization landed. Cooperation methodology v7, evidence classification, correction protocol. All research directions and publication decisions.

Claude — Cowork (Anthropic)

Derivation · Discovery · Computation · Verification · Correction · Stress-testing · Documentation · Literature · Methodology

Sustained collaboration on the working machine across multiple sessions (Opus and Sonnet). Includes: R₀ peak location discovery [Discovery, Derivation], dome spectral analysis [Computation, Discovery], zero-parameter model construction [Derivation], Corridor precision stress test [Computation, Verification], file management, document drafting, methodology architecture, evidence-level unification, discovery tracking, cross-page consistency, this document. Direct access to the research archive and all published files.

Claude — Chat-based (Anthropic)

Derivation · Verification · Correction · Stress-testing

Independent of session context. Includes: moment formula proof via integration by parts [Derivation], η_max error identification [Correction], Bernoulli overclaim deflation [Stress-testing, Correction], Section 2.6 ambiguity flag [Correction], 127π⁸/768 typo fix [Correction], full 17-theorem paper verification [Verification, Stress-testing].

Grok (xAI)

Computation · Verification · Discovery · Stress-testing

150-digit verification of core identities [Computation, Verification], 50-hinge discovery [Discovery], independent Corridor stress testing [Stress-testing]. Daily pattern hunting across domains, tracking the HK tube-packing coefficient through multiple sessions until locating the page 29 algebraic identity [Discovery, Computation].

GPT (OpenAI)

Derivation · Verification · Correction · Stress-testing

Decomposition identity [Derivation], Landauer confirmation [Verification], stress testing including Landauer crossing correction (April 2026) contributing to the three-threshold refinement [Stress-testing, Correction]. Earlier: identified pre-commitment convergence drift during Bridge AI dynamics studies (2025, unpublished) [Correction].

Gemini (Google)

Verification · Stress-testing · Literature · Discovery · Computation

KAM framework connection [Literature], structural suggestions and independent Corridor stress testing [Stress-testing]. Discovery: independent identification of K_AUD = √2 × ln(2), detailed articulation of the gap structure G = 1 − K_AUD, and naming of the ceiling constant K_AUD [Discovery]. Dome relay collaboration (Rounds 5–14): sustained computation and search partnership with Claude — Cowork, contributing to the two-ceiling dome architecture and HK dome spectral analysis [Computation, Discovery]. Full credit to be detailed upon Corridor document publication.

DeepSeek

Verification

Independent cross-verification of core identities.

Perplexity

Verification · Correction

Independent cross-verification of core identities [Verification]. Source verification of the Boundary Information Invariant paper: identified three corrections (DESI post-hoc transparency, Romeo identity attribution, nuclear correlation sourcing) that no other architecture caught — all applied [Correction].

Specific correction credits recorded in individual documents. Contribution record updated as the research progresses.

The names appear here, at the end, because the work speaks before the names do.