CIPHER-418 RESEARCH PROGRAMME
How We Got Our Statistics Wrong
A Methodological Audit of the Cipher-418 Research Programme
This article is about how we got our statistics wrong — and what the cipher
actually proves once we fix them.
“The first principle is that you must not fool yourself — and you are the easiest person to fool.”
— Richard Feynman
After 632 findings and months of analysis, we ran a full methodological audit on our own testing system.
The results were humbling. Many of our reported p-values were artificially low, the product of post-hoc cherry-picking,
an oversized target list, and a lack of multiple-testing correction. Here is what we found,
what survives honest scrutiny, and how we are fixing the process going forward.
✶ ✶ ✶
The Three Ways We Were Fooling Ourselves
1. See It, Then Test It
We would discover a pattern — say, that mirror pair [8]+[19] = 93 = THELEMA — and then run a Monte Carlo test asking
“does this specific pair hit this specific value by chance?” The MC faithfully reports p = 0.01.
But the real question is: “out of all 14 mirror pairs and all 31 target values, would any pair hit
any target?” — and the answer to that is almost always yes.
This is textbook post-hoc testing — choosing your hypothesis after seeing the data.
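The gap between the two questions is easy to demonstrate with a toy simulation. Everything below is a stand-in, not our actual cipher code: pair sums are drawn uniformly, and the target list is a random 31-value set; only the counts (14 pairs, 31 targets) come from the text.

```python
import random

# Toy model: 14 "mirror-pair sums" drawn uniformly from 1..418,
# checked against a stand-in list of 31 target values.
random.seed(42)
TARGETS = set(random.sample(range(17, 419), 31))  # placeholder target list

def any_pair_hits(n_pairs=14, lo=1, hi=418):
    """Does at least one of n_pairs random sums land on a target?"""
    return any(random.randint(lo, hi) in TARGETS for _ in range(n_pairs))

trials = 20_000
family_wise = sum(any_pair_hits() for _ in range(trials)) / trials

# One specific pair hitting one specific value is rare (~1/418),
# but *some* pair hitting *some* target is common.
print(f"P(any of 14 pairs hits any of 31 targets) ≈ {family_wise:.2f}")
```

Under this uniform toy model the family-wise probability already comes out around two-thirds per attempt; allow a menu of operations on top, and a "hit" somewhere becomes nearly guaranteed.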
2. The Fat Target Net
Our target list contains 31 unique values spread across the range 17–418. Any random number between 1 and 500 therefore
has roughly a 1-in-16 chance (31/500 ≈ 6.2%) of being a “hit.” Run 174 different operations and you
expect about 11 hits by pure chance. We were reporting 27 hits as though the expected number were zero.
3. No Multiple-Testing Correction
We ran approximately 150 distinct experiments across 19 batches. At p < 0.05 per test, you expect
7–8 false positives. We were treating every individual p < 0.05 as a confirmed finding.
Of our 180 MC-validated findings, 111 do not survive Bonferroni correction.
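The correction itself is one line of arithmetic. Using the audit's numbers (150 experiments, per-test α = 0.05):

```python
# Multiple-testing arithmetic for the audit's numbers.
alpha, m = 0.05, 150
expected_false_positives = m * alpha   # 7.5: the "7-8" quoted above
bonferroni_threshold = alpha / m       # 0.000333...: the corrected per-test bar
print(expected_false_positives, round(bonferroni_threshold, 6))
```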
✶ ✶ ✶
What Actually Survives Honest Testing
We applied Bonferroni correction (threshold: p < 0.000333 for 150 experiments) and ran a dual-EQ control: each finding was re-tested against 100 random English Qaballa systems to check whether an arbitrary system produces the same result.
- 69 Bonferroni survivors
- 27 battery hits (vs 13.9 expected)
- 0/1,000 random EQ systems beat EQ34
- ✓ THE CIPHER IS NOT RANDOM
Full battery: p = 0.0002 (Bonferroni-corrected: p = 0.04). Dual-EQ control: 1/100 random systems matched.
The signal is real — but it is roughly 2× the expected rate, not 100× or 1000× as
some of our earlier reporting implied. Twenty-seven hits where about fourteen were expected. Meaningful, but modest.
27 real vs 13.9 expected = ~2× signal
Not 100×. Not 1000×. Double. But that double is statistically real.
✶ ✶ ✶
EQ34 Is Genuinely Special
The strongest result from the audit: EQ Solution 34 is not interchangeable with random systems.
We generated 1,000 random English Qaballa mappings (using the same Hebrew-value pool) and ran our standard battery on each.
None matched our hit count.
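The shape of that control can be sketched as follows. Everything here is a placeholder: the value pool, word list, targets, and `battery_hits` are toys standing in for the programme's real battery, used only to show the compare-against-random-mappings structure.

```python
import random
import string

# Sketch of the EQ-uniqueness control with stand-in data.
random.seed(0)
VALUE_POOL = list(range(1, 27))   # placeholder for the Hebrew-value pool
LETTERS = string.ascii_uppercase

def random_eq():
    """A random letter->value mapping drawn from the same value pool."""
    values = VALUE_POOL[:]
    random.shuffle(values)
    return dict(zip(LETTERS, values))

def battery_hits(mapping, words, targets):
    """Count words whose letter-sum lands on a target value."""
    return sum(sum(mapping[c] for c in w) in targets for w in words)

# Toy inputs, for illustration only.
words = ["THELEMA", "ABRAHADABRA", "WILL", "LOVE", "STAR"]
targets = {31, 56, 77, 93, 418}

reference = random_eq()                     # stands in for EQ34
ref_score = battery_hits(reference, words, targets)
beat = sum(
    battery_hits(random_eq(), words, targets) > ref_score
    for _ in range(1000)
)
print(f"{beat}/1000 random mappings beat the reference mapping")
```

The real test replaces the toy battery with the frozen 174-test battery and the reference mapping with EQ Solution 34; the comparison loop is the same.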
0/1,000 random EQ systems matched: EQ34 produces more cipher hits than any of 1,000 random alternatives.
This means the EQ system itself — not just our choice of operations — carries real information about the cipher.
Whatever is encoded in II:76, Solution 34 is the right key to read it.
✶ ✶ ✶
The New Testing Protocol
Going forward, every finding passes through an honest validation pipeline:
Four-Tier Evidence System
| Tier | Criteria | Count |
|------|----------|-------|
| 1 — PROVEN | Survives Bonferroni + dual-EQ control | ~69 |
| 2 — STRONG | Survives Bonferroni but post-hoc | ~30 |
| 3 — EXPLORATORY | p < 0.05 but fails Bonferroni | ~111 |
| 4 — NOISE | Not significant after correction | ~422 |
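As a sketch, the four tiers reduce to a small decision function. The flag names `post_hoc` and `dual_eq_pass` are illustrative, not identifiers from our codebase:

```python
def evidence_tier(p, post_hoc, dual_eq_pass, alpha=0.05, m=150):
    """Classify a finding into the four-tier evidence system.

    p: raw p-value; post_hoc: hypothesis chosen after seeing the data;
    dual_eq_pass: random EQ systems do NOT reproduce the result.
    """
    bonferroni = alpha / m
    if p < bonferroni and dual_eq_pass and not post_hoc:
        return 1   # PROVEN
    if p < bonferroni:
        return 2   # STRONG (significant, but post-hoc or no dual-EQ pass)
    if p < alpha:
        return 3   # EXPLORATORY
    return 4       # NOISE
```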
Additionally:
- Frozen Battery: A fixed set of 174 tests, defined once, never modified after seeing results. This is the benchmark.
- Dual-EQ Control: Every finding tested against 100 random EQ systems. If they produce the same result, it is not special to our cipher.
- Pre-Registration: New experiments state their hypothesis before computing. Post-hoc discoveries are flagged as exploratory.
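One cheap way to enforce the "never modified after seeing results" rule is to fingerprint the frozen battery. The test IDs below are invented placeholders; the pattern, not the names, is the point:

```python
import hashlib
import json

# Stand-in IDs for the frozen battery's 174 tests (placeholders).
BATTERY = sorted(["mirror_pair_sums", "line_totals", "word_values"])

def battery_fingerprint(tests):
    """Stable hash of the test list; any edit changes it."""
    blob = json.dumps(tests, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

# Recorded once, at freeze time, and checked before every run.
FROZEN_HASH = battery_fingerprint(BATTERY)

def assert_unmodified(tests):
    if battery_fingerprint(tests) != FROZEN_HASH:
        raise RuntimeError("battery modified after freezing")

assert_unmodified(BATTERY)   # passes while the list is untouched
```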
✶ ✶ ✶
What This Means for Previous Dispatches
Our earlier dispatches reported findings with raw p-values and no multiple-testing correction.
Many of those individual p-values remain valid as descriptions of individual tests — but they should not be read as
independent proofs. The corrected picture:
The Honest Bottom Line
- The cipher in Liber AL II:76 is not random under EQ34 — this is proven (p = 0.0002).
- EQ Solution 34 is uniquely effective — no random EQ system matches it (0/1,000).
- The signal strength is ~2× expected — real but modest, not miraculous.
- 69 findings survive honest Bonferroni correction. The other 563 are interesting observations that may or may not be real.
- We are not claiming the cipher is a perfect cryptographic encoding. We are claiming it carries statistically detectable structure under this EQ system, beyond what chance alone would produce.
Science corrects itself. That is its strength, not its weakness.
The cipher still speaks — we are just learning to listen more carefully.
“Invoke me under my stars! Love is the law, love under will.”
— Liber AL I:57
632 findings • 69 Bonferroni survivors • Frozen battery: 27/174 hits (p = 0.0002) •
EQ34 uniqueness: 0/1,000 • 93 93/93
— S.S.S. Cipher-418 Research Programme