Consensus Change Standards: Bitcoin Protocol Governance Guide

A Legal and Technical Framework for Bitcoin Protocol Governance

The Forum Press · Fifth Edition · July 2026 · 104 Pages · CC BY 4.0

Abstract

Bitcoin has no formal process for evaluating proposed changes to its consensus rules. The BIP system provides a mechanism for proposing changes, but establishes no minimum standards that a proposal must meet before the community considers activation. There are no required review periods, no mandatory code audit standards, no agreed-upon activation thresholds, no chain split risk assessment methodology, and no framework for evaluating the legal and economic consequences of a failed activation.

This paper proposes a comprehensive framework for evaluating Bitcoin consensus change proposals. It begins from a claim these debates rarely confront. The catastrophic failure mode of a consensus change, a persistent chain split, has a computable risk profile, and that risk depends almost entirely on one variable: the share of hashrate that will enforce the new rules at activation. That risk is not a smooth function of the share but a cliff. The framework builds on that quantification, on the history of Bitcoin’s prior consensus changes, on the maturity-gate principles the open-standards tradition has applied to protocol governance for thirty years, and on a legal analysis of the liabilities an inadequately reviewed activation creates. At its center is a 20-point Consensus Change Readiness Checklist covering proposal quality, code quality, activation safety, and community process.

At 55% hashrate enforcement, the chance of a six-block reorganization during activation is roughly thirty percent. At 90% or above it is effectively zero, five to seven orders of magnitude lower. The catastrophic failure mode of a consensus change is computable, and much of the governance argument is a fight over that one number.

New in the Fifth Edition: an expanded §3.4 defense of the risk model’s static, worst-case design — why the floor is computed on the number the network can verify before it commits, not on assumed-favorable post-activation migration; the BIP-91 80% episode addressed directly; BIP-110’s schedule anchored to its block heights through the August–September 2026 activation window; and new 2026 sources, including Jameson Lopp’s block-size-debate retrospective.

The Problem It Solves

A consensus change that activates without broad enforcement does not produce an orderly upgrade — it produces a chain split. The risk is computable: modeled as a Bernoulli process over a six-block horizon, an activation at a 55% hashrate threshold carries roughly a 30% chance of reorganization, while the same change at 90–95% drives that exposure to effectively zero. Bitcoin governance has no standard that forces this number to be calculated before a change ships. The result is that activation decisions turn on rhetoric, signaling theater, and timing rather than on whether the change is actually ready.
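The quoted figures can be reproduced with a short calculation. The sketch below assumes the six-block figure follows the classic gambler’s-ruin catch-up probability (q/p)^z, where p is the enforcing hashrate share, q = 1 - p, and z = 6. That form is inferred from the numbers quoted here, not taken from the paper’s own model, and the function name is hypothetical.

```python
import math


def catch_up_probability(enforcing_share: float, depth: int = 6) -> float:
    """Chance that the non-enforcing minority ever erases a `depth`-block deficit.

    Assumption: gambler's-ruin form (q/p)**depth, which only applies when the
    enforcing share is a strict majority.
    """
    if not 0.5 < enforcing_share < 1.0:
        raise ValueError("enforcing_share must be strictly between 0.5 and 1.0")
    minority_share = 1.0 - enforcing_share
    return (minority_share / enforcing_share) ** depth


if __name__ == "__main__":
    baseline = catch_up_probability(0.55)
    for share in (0.55, 0.80, 0.90, 0.95):
        risk = catch_up_probability(share)
        # Orders of magnitude by which the risk falls below the 55% case.
        gap = math.log10(baseline / risk)
        print(f"{share:.0%}: risk={risk:.3g}, orders below 55%: {gap:.1f}")
```

Under that assumption, 55% gives roughly 0.30, while 90% and 95% land about five and seven orders of magnitude lower, which is consistent with the range stated above.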

The paper is motivated by a live case. BIP-110, a proposed temporary soft fork to restrict arbitrary data in Bitcoin transactions, was released with an activation client that multiple developers reported contained significant bugs, a 55% activation threshold far below that of any successful modern deployment, and a six-week timeline from initial proposal to activation client. That timeline is shorter than the review period of every successful modern soft fork and, for SegWit and Taproot, shorter by more than an order of magnitude. The activation client was first distributed alongside stable Bitcoin Knots releases with no risk disclosure, then bundled into the default release stream itself. The case is also on a fixed schedule: BIP-110’s mandatory-signaling period (blocks 961,632–963,647) arrives around August 2026, with activation at block 965,664, approximately September 2026. The framework is the standard that case is measured against and, deliberately, the standard the author’s own miner interest is measured against too.

What’s Inside

The paper opens with a Summary for Policymakers and runs seven chapters, followed by two appendices (an archived documentary record of the BIP-110 case study in Appendix A and an adoption kit of model policy language in Appendix B), a 24-term technical glossary, and a full bibliography of legal and comparative authorities. The chapters are:

The Problem — why the absence of standards produces predictable failures; the inscription-era backdrop; BIP-110 as the case study.
Historical Precedent — the block size wars (2015–2017), SegWit and the BIP-148 UASF, Taproot’s Speedy Trial activation, and the pattern they share.
The Framework — proposal requirements, review periods, code-audit standards, activation thresholds with a quantified chain-split model, sunset clauses, and contingency planning.
Legal Analysis — developer and mining-pool-operator liability under negligence, tortious interference, and fiduciary-duty theories, plus regulatory exposure and a comparative survey of common-law and EU jurisdictions.
Proposed Standards — the seven-flag red-flag screen, the 20-point scorecard, the classification bands, and four worked examples.
Objections and Responses — the strongest arguments against the framework, answered directly.
Conclusion.

The Framework

The 20-Point Consensus Change Readiness Checklist

Twenty binary (met / not-met) criteria across four categories:

Proposal Quality (1–5) — problem statement, technical specification, backward-compatibility analysis, activation mechanism, and sunset/rollback design.
Code Quality (6–10) — independent review, test coverage, testnet deployment, fuzzing, and reviewer comprehension.
Activation Safety (11–15) — threshold, signaling window, chain-split risk assessment, replay considerations, and contingency planning.
Community Process (16–20) — review period, economic-node support, discussion adequacy, opposition handling, and disclosure of conflicts.
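As a rough illustration, the checklist can be held as a plain mapping from category to criteria. The labels below are copied from the four category summaries above; the assumption that numbering within each category follows the listed order, and both function names, are illustrative rather than taken from the paper.

```python
# Criterion labels come from the four category summaries above.
# Assumption: numbering within each category follows the listed order.
CHECKLIST = {
    "Proposal Quality": [
        "problem statement", "technical specification",
        "backward-compatibility analysis", "activation mechanism",
        "sunset/rollback design",
    ],
    "Code Quality": [
        "independent review", "test coverage", "testnet deployment",
        "fuzzing", "reviewer comprehension",
    ],
    "Activation Safety": [
        "threshold", "signaling window", "chain-split risk assessment",
        "replay considerations", "contingency planning",
    ],
    "Community Process": [
        "review period", "economic-node support", "discussion adequacy",
        "opposition handling", "disclosure of conflicts",
    ],
}


def numbered_criteria() -> dict[int, tuple[str, str]]:
    """Map criterion number (1..20) to its (category, label) pair."""
    numbered = {}
    number = 1
    for category, labels in CHECKLIST.items():
        for label in labels:
            numbered[number] = (category, label)
            number += 1
    return numbered


def tally_by_category(met: set[int]) -> dict[str, int]:
    """Count how many met criteria fall in each category."""
    counts = {category: 0 for category in CHECKLIST}
    for number, (category, _label) in numbered_criteria().items():
        if number in met:
            counts[category] += 1
    return counts
```

Calling `tally_by_category({1, 2, 13})` with an illustrative set of met criteria reports two under Proposal Quality and one under Activation Safety.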

A standalone fillable scorecard (PDF and Markdown) is published alongside the paper in the GitHub repository, and an interactive version with Taproot and BIP-110 preloaded lets you run your own scoring in the browser.

Proposal Requirements

Every proposal should publish: a problem statement grounded in empirical data; a complete technical specification detailed enough to permit independent implementation; a backward-compatibility analysis identifying every transaction or script type that would become invalid and the value at risk; a fully specified activation mechanism (signaling method, threshold, window, timeout, failure mode); a rollback procedure (a self-executing sunset for anything described as “temporary”); and a complete, tested reference implementation.

Minimum Review Periods

Review floors are tied to a change’s risk to existing holdings and network unity, and they are neutral as between liberalizing and restricting changes:

Category 1 — Low-Risk: modest floor for uncontested, non-invalidating improvements.
Category 2 — Moderate-Risk (adds rules without invalidating any valid transaction): 12 months. (Taproot, SegWit.)
Category 3 — High-Risk (invalidates currently valid transaction types): 24 months. (BIP-110 falls here.)
Category 4 — Hard Forks: 36 months, plus explicit replay protection, demonstrated economic-node support, and a published chain-split contingency plan.
4a — Scheduled hard fork: a five-year horizon, reflecting the premium on getting an irreversible deployment right the first time.
4b — Emergency hard fork (vulnerability-driven): compressed expert-led review proportional to the threat, near-unanimous infrastructure coordination, and replay protection where a minority holds out.

The floors are sized to Bitcoin’s observed node-upgrade dynamics: reaching 95% adoption of a Core release historically took about a year, an interval that has since roughly doubled, with peak adoption lagging six to nine months behind release.
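The category floors can be sketched as a lookup table. The month values for Categories 2 through 4 are those listed above; treating the five-year horizon for 4a as a 60-month floor is an assumption, and Categories 1 and 4b are left unset because this summary gives no number for them. The names are hypothetical.

```python
from typing import Optional

# Minimum review period in months per category, as listed above.
REVIEW_FLOOR_MONTHS = {
    "1": None,   # "modest floor": no figure stated in this summary
    "2": 12,
    "3": 24,
    "4": 36,
    "4a": 60,    # assumption: the five-year horizon read as a floor
    "4b": None,  # compressed, threat-proportional review: no fixed figure
}


def meets_review_floor(category: str, months_reviewed: float) -> Optional[bool]:
    """Return True/False against the floor, or None where no figure is stated."""
    floor = REVIEW_FLOOR_MONTHS[category]
    if floor is None:
        return None
    return months_reviewed >= floor


if __name__ == "__main__":
    # A six-week timeline (about 1.4 months) measured against Category 3.
    print(meets_review_floor("3", 1.4))
```

A six-week timeline measured against the Category 3 floor of 24 months fails by a wide margin, which is the comparison the BIP-110 case turns on.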

Code Audit Standards

Independent review by at least three developers from distinct organizational affiliations (disclosed prior collaborations made legible, not disqualifying); comprehensive unit, integration, and regression test coverage that is publicly reproducible; a minimum three-month public-testnet deployment that demonstrates activation, enforcement, and — for sunset proposals — successful deactivation; fuzzing and adversarial testing; and, for every consensus-critical change, a named human reviewer who publicly attests they understand it and can defend its correctness — a requirement that applies equally whether the code was written by a human or generated by an AI tool. The framework requires comprehension, not origin disclosure; AI involvement may be disclosed voluntarily, but the load-bearing requirement is accountable human understanding.

Activation Thresholds

A minimum of 90% hashrate for miner-activated soft forks, measured over a defined signaling period of at least 2,016 blocks. Thresholds below 80% are presumptively dangerous; thresholds below 60% are reckless and should be rejected regardless of the proposal’s merits. The gap is not a smooth percentage difference: chain-split exposure between 55% and 95% spans five to seven orders of magnitude. The 90% floor is anchored rather than arbitrary: it is the lowest threshold at which a modern soft fork (Taproot) activated without a split, and it is defended against a published risk model rather than asserted. The Fifth Edition also answers why the model runs on the static signaling snapshot rather than on assumed post-activation hashrate migration: a go/no-go standard has to run on inputs observable before the decision it gates, and history, from Bitcoin Cash’s months of oscillating hashrate to minority chains persisting for years against the profit gradient, is no license for optimism.
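A minimal sketch of how these bands might be applied to a proposed threshold. The 60%, 80%, and 90% cut-offs and the 2,016-block minimum come from the paragraph above; the "below floor" label for anything between 80% and 90%, or for a window shorter than 2,016 blocks, is an assumption, as are the function names.

```python
MIN_WINDOW_BLOCKS = 2016
FLOOR_PCT = 90


def required_signaling_blocks(threshold_pct: int, window: int = MIN_WINDOW_BLOCKS) -> int:
    """Smallest whole number of signaling blocks that reaches the threshold."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-threshold_pct * window // 100)


def assess_threshold(threshold_pct: int, window: int = MIN_WINDOW_BLOCKS) -> str:
    """Place a miner-activated soft fork threshold in the bands described above."""
    if threshold_pct < 60:
        return "reckless"
    if threshold_pct < 80:
        return "presumptively dangerous"
    if threshold_pct < FLOOR_PCT or window < MIN_WINDOW_BLOCKS:
        return "below floor"  # assumption: label for the unnamed 80-90% gap
    return "meets floor"


if __name__ == "__main__":
    for pct in (55, 75, 85, 90, 95):
        print(pct, required_signaling_blocks(pct), assess_threshold(pct))
```

With a 2,016-block window, a 90% threshold works out to 1,815 signaling blocks, while a 55% threshold needs only 1,109.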

Sunset Clauses and Contingency Planning

A proposal described as “temporary” must include a self-executing sunset whose deactivation has actually been tested on testnet — an untested sunset is a promise, not a guarantee. Any high-risk activation should ship with a contingency plan: what happens if activation fails, what happens if it succeeds but leaves a persistent minority chain, and who is responsible for communicating a split to users and coordinating exchanges.

Two Evaluation Tools

§5.0 — Seven Red Flags (a fast screen): low activation threshold, rushed timeline, default-bundled activation client, undisclosed reviewer ties, untested sunset, conflicted promotion, and missing contingency plan.
§5.1–5.2 — The Scorecard and Classification Bands: score each applicable criterion — narrow not-applicable classes drop from the denominator (§5.2) — then read the share met: 100% Green (ready), 75–99% Yellow (gaps to close), 50–74% Orange (significant deficiencies), below 50% Red (not ready; candidate for the independent economic-node response).
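The band lookup can be sketched in a few lines. The band names and percentage edges are those given above; how fractional shares that fall between the stated ranges are handled (for example 74.5%) is an assumption, resolved here by treating each lower edge as inclusive. The function name is hypothetical.

```python
def classify(met: int, applicable: int) -> str:
    """Map a score of `met` out of `applicable` criteria to a classification band.

    Not-applicable criteria are assumed to be excluded from `applicable`
    before this function is called.
    """
    if applicable <= 0 or not 0 <= met <= applicable:
        raise ValueError("need 0 <= met <= applicable and applicable > 0")
    if met == applicable:
        return "Green"   # 100%: ready
    if 4 * met >= 3 * applicable:
        return "Yellow"  # at least 75%: gaps to close
    if 2 * met >= applicable:
        return "Orange"  # at least 50%: significant deficiencies
    return "Red"         # below 50%: not ready


if __name__ == "__main__":
    # Scores reported in the worked examples: Taproot, BIP-110, SegWit2x.
    for name, met, applicable in (("Taproot", 17, 17), ("BIP-110", 3, 20), ("SegWit2x", 5, 17)):
        print(name, classify(met, applicable))
```

Integer comparisons are used instead of a floating-point share so that exact boundary cases such as 15 of 20 classify consistently.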

Worked Examples

The scorecard is applied criterion-by-criterion to four proposals — three real, one hypothetical — chosen to test whether it measures process or merely the author’s preference:

Taproot — 17/17 applicable (Green). Full marks on every criterion applicable to it; three criteria drop from the denominator as inapplicable, not as defects — one structural (Taproot deployed as a miner-activated soft fork), two temporal (no published readiness standard existed in 2021). Years of review, a 90% threshold, multi-year testnet exposure, clean activation, no split.
BIP-110 — 3/20 (Red). All twenty criteria apply. Meets only the bare existence of a specification, an activation procedure, and a sunset height; fails every Code Quality, Activation Safety, and Community Process criterion.
SegWit2x — ≈5/17 applicable (Red). The 2017 block-size hard fork a coalition of miners and major exchanges pushed — ~80% of hashrate at its peak, no replay protection, a six-month timeline against a hard fork’s floor — then withdrew for lack of consensus. A block-size expansion scored Red on the same criteria as a data-restriction — but its Red is over-determined by its hard-fork structure, which makes it the easy neutrality test.
A hypothetical pro-data soft fork — Red. The cleaner neutrality test. A rushed soft fork that would draw a popular, fee-heavy new use onto the existing block space — the same direction the author’s own mining revenue runs — scored Red on pure process: a thin problem statement, no independent review, a sub-90% threshold, and no contingency plan. The framework flags the change its author would profit from on the identical grounds it flags the one he would lose by. That indifference to who benefits, not any single result, is the case for the framework.

The Legal Analysis

The paper develops an original analysis of the legal exposure a reckless consensus change can create — distinct from, and additional to, the technical case:

Negligence. California’s Biakanja factors and Restatement (Second) of Torts § 552 supply routes around the economic-loss rule where a foreseeable class of users relies on developers’ representations of readiness.
Tortious interference. Liability can attach where an actor knows that disruption to existing contractual relationships is substantially certain to follow — Quelimane Co. v. Stewart Title Guaranty Co., 19 Cal. 4th 26 (1998).
Fiduciary duty. The English Court of Appeal in Tulip Trading Ltd v van der Laan [2023] EWCA Civ 83 allowed both fiduciary and tortious duty-of-care claims against developers to proceed — neither decided on the merits, neither struck out; the claim was later discontinued in 2024 without trial, leaving the question justiciable but unresolved.
Mining-pool contract obligations and regulatory consequences of a contested split round out the exposure analysis.
Comparative survey. The analysis is mapped onto UK (Caparo, Hedley Byrne, and the Property (Digital Assets etc) Act 2025), Commonwealth (Sullivan v Moody, Cooper v Hobart), EU (the 2024 Product Liability Directive and its open-source carve-out), and civil-law (Swiss Code of Obligations) jurisdictions.

The deterrence theory ties it together. A completed scorecard is a dated, public record that a proposal was measured against a known standard and found unready — and notice and foreseeability are exactly what the negligence and tortious-interference theories turn on. A change activated against a documented Red score is no longer an honest mistake. The framework’s teeth, then, are not enforcement — Bitcoin has no enforcer — but deterrence: the documented score raises the legal and reputational cost of shipping a change the record already flagged, and unlike a coordinated economic-node refusal, it needs no one’s cooperation. It runs against one thing only — shipping past a documented red flag — never against good-faith contribution and never against declining a change. Compliance is the defense; the score is the exposure.

Objections, Answered

Chapter 6 takes on the strongest counter-arguments directly:

Wouldn’t these standards have blocked beneficial past changes like P2SH, CLTV/CSV, or SegWit? No — the floors are risk-tiered and stakes-indexed; P2SH activated on a tiny early network, CLTV/CSV were low-risk and uncontested, and BIP-148 is an instance of the UASF standard met, not a counterexample.
Can’t a well-funded proponent just game the criteria? The criteria a proponent can manufacture are not the ones that decide the question; genuine economic-node support and the absence of sustained opposition require the assent of parties money cannot buy.
Isn’t the framework itself a centralizing instrument? The paper confronts this rather than dodging it — coordinating around published, challengeable criteria distributes the power to argue; it does not concentrate the power to decide.
Won’t it chill good-faith development? The liability theories pre-exist the paper; following the standard is itself the defense; the deterrence runs only against shipping past a documented Red finding, and it points at identifiable campaign actors, not pseudonymous contributors.
It has no enforcement mechanism. Correct — and that is what the deterrence theory is for.

Who Should Read This

Bitcoin protocol developers and reviewers; mining-pool operators; exchange, custodian, and payment-processor risk and legal teams; node operators weighing whether to run an activation client; and lawyers and policy researchers working on decentralized-protocol governance and liability.

About the Author

Asaf Fulks is a practicing litigator, solo Bitcoin miner, full node operator, and computer scientist. He holds a J.D. magna cum laude from Taft Law School and a B.A. in Computer Science from Denison University. He is admitted to the California Bar (SBN 343622) and the U.S. District Court for the Central District of California.

Cite & Access

Asaf Fulks, Consensus Change Standards: A Legal and Technical Framework for Bitcoin Protocol Governance (5th ed. 2026), asaffulkslaw.com.

DOI: 10.5281/zenodo.20651832 (concept DOI — resolves to the current edition)
Repository: github.com/asaffulks/consensus-change-standards

Licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). This document does not constitute legal advice.