How Your Data Survives Corruption
2026-02-28 · error correction
You scan a QR code. It's scratched, torn, partially covered by a sticker. Yet somehow, your phone reads it perfectly.
You play a CD from 1995. It has visible scratches you can feel with your fingernail. The music plays flawlessly.
Voyager 2 sent photos from Uranus and Neptune, through radiation, across billions of kilometers, with a transmitter of roughly 20 watts, about as powerful as a refrigerator light bulb. The images arrived intact.
How? Reed-Solomon error correction — one of the most elegant ideas in computing. Let me show you how it works.
The Problem: Imperfect Channels
Data transmission and storage are never perfect. Noise and damage come from many sources:
- Radio waves through atmosphere
- Lasers reading microscopic pits on plastic discs
- Cosmic rays flipping bits in memory
- Physical damage to storage media
Errors are inevitable. The question is: how do we recover from them?
Send: HELLO
Corruption: HE?LO (one character lost)
Result: Irrecoverable. The message is gone forever.
We need redundancy. But not just any redundancy — smart redundancy.
The Insight: Polynomial Fingerprints
Here's the key idea behind Reed-Solomon:
A polynomial of degree at most n-1 is uniquely determined by n points with distinct x-values.
Let's say you have a message with 4 symbols. You can fit a cubic polynomial (degree 3) through them. Now here's the magic: you can evaluate this polynomial at more points — say, 6 total — and now you have 2 extra "check" symbols.
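Even if you lose 2 symbols (any 2!), you can still reconstruct the original polynomial from the remaining 4, as long as you know which ones went missing. That means you can recover your original message. Symbols that are silently wrong rather than missing are harder: each one costs two check symbols instead of one.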
[Interactive figure: a cubic curve through 4 data points (black) and 2 check points (white). As long as any 4 of the 6 points remain, the polynomial can be reconstructed.]
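This is the core of Reed-Solomon encoding: treat your message as a polynomial (its symbols can serve as values on the curve, as here, or as the coefficients), evaluate it at extra points, and you've built in redundancy that can survive corruption. Real implementations do this arithmetic in a finite field rather than with ordinary real numbers, so every symbol stays a fixed size and nothing is lost to rounding.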
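Here is a minimal sketch of that idea in Python. It is a toy, not a production encoder: it assumes the prime field GF(257) instead of the GF(2^8) that real Reed-Solomon codes usually use, and the helper names (`interpolate`, `encode`, `recover`) are made up for illustration. It treats the 4 message symbols as points on the curve, adds 2 check symbols, and rebuilds the message after 2 symbols at known positions are erased.

```python
# Toy erasure code over the prime field GF(257) (an assumption for simplicity;
# symbols range over 0..256, so a check symbol may not fit in one byte).
P = 257

def interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(message, n):
    """Positions 0..k-1 carry the message itself; positions k..n-1 are check symbols."""
    data_points = list(enumerate(message))
    return [interpolate(data_points, x) for x in range(n)]

def recover(surviving, k):
    """Rebuild the k message symbols from any k surviving (position, value) pairs."""
    if len(surviving) < k:
        raise ValueError("not enough symbols survived")
    return [interpolate(surviving[:k], x) for x in range(k)]

message = [72, 69, 76, 80]      # "HELP" as byte values
codeword = encode(message, 6)   # 4 data symbols + 2 check symbols

# Erase positions 1 and 4. The receiver knows which positions are missing.
surviving = [(x, y) for x, y in enumerate(codeword) if x not in (1, 4)]
assert recover(surviving, 4) == message
```

Any other pair of erased positions works the same way, because any 4 surviving points pin down the same cubic.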
How QR Codes Use This
QR codes are brilliant because they make error correction visible: scratches, tears, and stickers destroy part of the pattern, and the decoder recovers the message anyway.
QR codes offer four error correction levels:
| Level | Recoverable codewords | Use Case |
|---|---|---|
| L (Low) | ~7% | Maximum data capacity |
| M (Medium) | ~15% | Standard QR codes |
| Q (Quartile) | ~25% | Harsh environments |
| H (High) | ~30% | Industrial, art (logos in QR) |
Level H is why you can put a logo in the middle of a QR code and it still works.
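As a rough illustration of how you might use the table, the sketch below picks the lowest level whose approximate capacity covers a given amount of damage. The names `RECOVERY` and `lowest_level_for` are hypothetical, and the percentages are the approximate figures from the table; the exact capacity of a real QR code depends on its version and block layout, so treat this as a rule of thumb only.

```python
# Approximate recovery capacity per QR error correction level (from the table above).
RECOVERY = {"L": 0.07, "M": 0.15, "Q": 0.25, "H": 0.30}

def lowest_level_for(damage_fraction):
    """Return the lowest level whose approximate capacity covers the damage, or None."""
    for level in ("L", "M", "Q", "H"):
        if RECOVERY[level] >= damage_fraction:
            return level
    return None

print(lowest_level_for(0.10))  # M    (a logo hiding about 10% of the codewords)
print(lowest_level_for(0.28))  # H
print(lowest_level_for(0.40))  # None (no level tolerates that much damage)
```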
The Math (Simplified)
Reed-Solomon is specified as RS(n, k):
- k = number of data symbols
- n = total symbols (data + parity)
- n - k = parity symbols
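- t = ⌊(n - k) / 2⌋ = symbol errors correctable when their positions are unknown (if the positions are known, up to n - k erased symbols can be recovered)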
Example: RS(255, 223) with 8-bit symbols:
- 223 bytes of data
- 32 bytes of parity
- Can correct errors in any 16 bytes
RS(255, 223) can correct up to 16 corrupted bytes at any positions in the codeword. That covers up to 128 flipped bits, provided they all fall within those 16 bytes.
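To see why unknown error positions cost two parity symbols each, here is a toy decoder for the 6-symbol, 4-data-symbol case, where t = 1. It is a brute-force sketch under stated assumptions: it works over the prime field GF(257) rather than GF(2^8), the function names are invented for illustration, and real decoders use far more efficient algebraic methods (such as Berlekamp-Massey) instead of trying every subset.

```python
from itertools import combinations

P = 257  # toy prime field (assumption); production codes use GF(2^8)

def interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(message, n):
    """Positions 0..k-1 carry the message; positions k..n-1 are parity."""
    return [interpolate(list(enumerate(message)), x) for x in range(n)]

def decode(received, k):
    """Correct up to t = (n - k) // 2 symbol errors at unknown positions.

    Brute force: fit a polynomial through every choice of k received symbols
    and accept the one that agrees with at least n - t received positions.
    """
    n = len(received)
    t = (n - k) // 2
    for subset in combinations(range(n), k):
        points = [(x, received[x]) for x in subset]
        candidate = [interpolate(points, x) for x in range(n)]
        agreements = sum(c == r for c, r in zip(candidate, received))
        if agreements >= n - t:
            return candidate[:k]
    return None  # no codeword lies within t symbols of what was received

message = [72, 69, 76, 80]
received = encode(message, 6)
received[2] = 0  # corrupt one symbol; the decoder is not told which one
assert decode(received, 4) == message
```

With two corrupted symbols the same decoder either returns `None` or settles on a different, wrong codeword, which is exactly the limit t = 1 predicts for 2 parity symbols.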
Real-World Applications
Compact Discs
CDs use two Reed-Solomon codes in sequence, with interleaving between them (CIRC, the Cross-Interleaved Reed-Solomon Code). This combination can correct burst errors of up to about 4,000 bits, which corresponds to a scratch about 2.5 mm long. Your CD player is constantly correcting errors you never notice.
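The reason interleaving helps with scratches can be shown with a small sketch. This is a plain block interleaver with made-up sizes (4 codewords of 6 symbols), not the actual CIRC layout, which uses delay lines between its two codes. The point it illustrates is the same, though: a long burst on the disc turns into a few damaged symbols in each of several codewords, which each codeword can then correct on its own.

```python
def interleave(codewords):
    """Write codewords as rows, then transmit column by column."""
    return [cw[i] for i in range(len(codewords[0])) for cw in codewords]

def deinterleave(stream, depth):
    """Inverse of interleave: rebuild `depth` codewords from the stream."""
    return [stream[row::depth] for row in range(depth)]

depth, length = 4, 6
# Label each symbol by codeword (A-D) and position, e.g. "B3".
codewords = [[f"{chr(65 + r)}{c}" for c in range(length)] for r in range(depth)]
stream = interleave(codewords)

# A burst wipes out 8 consecutive symbols of the transmitted stream.
burst_start, burst_len = 5, 8
damaged = [None if burst_start <= i < burst_start + burst_len else s
           for i, s in enumerate(stream)]

losses = [row.count(None) for row in deinterleave(damaged, depth)]
print(losses)  # [2, 2, 2, 2]
```

Without interleaving, the same 8-symbol burst would wipe out one whole 6-symbol codeword and part of the next.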
Voyager Spacecraft
When Voyager 2 transmitted images from Uranus and Neptune, it used Reed-Solomon coding combined with convolutional codes. The signal-to-noise ratio was terrible and the raw bit error rate was high, but the images arrived intact.
Satellite Television
DVB (Digital Video Broadcasting) uses RS(204, 188): every 188-byte packet carries 16 bytes of parity, enough to correct up to 8 corrupted bytes. Your TV picture doesn't glitch because Reed-Solomon is silently correcting transmission errors.
Data Storage
RAID systems, QR codes, barcodes, archival tapes — anywhere data must survive imperfect storage or transmission, Reed-Solomon is there.
Why Not Just Copy the Data?
You might wonder: why use fancy math? Why not just send each byte three times?
Triple redundancy works, but it's inefficient and fragile. Sending each of 223 data bytes three times costs 446 extra bytes (200% overhead), and a majority vote still fails as soon as two copies of the same byte are corrupted. To guarantee correcting any 16 byte errors by repetition, you'd need 33 copies of every byte, so that 17 good copies always outvote 16 bad ones. That is 3,200% overhead. RS(255, 223) gives the same error-correction capability with just 32 extra bytes total, about 14% overhead.
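The comparison can be worked out in a few lines. This sketch assumes repetition is decoded by majority vote, so surviving t corrupted copies of a single byte takes 2t + 1 copies; the function names are illustrative only.

```python
def rs_overhead(n, k):
    """Parity symbols as a fraction of the data symbols."""
    return (n - k) / k

def repetition_overhead(t):
    """Extra copies per data byte needed for a majority vote to survive t bad copies."""
    copies = 2 * t + 1   # t bad copies must be outvoted by t + 1 good ones
    return copies - 1    # the first copy is the data itself

n, k = 255, 223
t = (n - k) // 2
print(f"t = {t}")                                                  # t = 16
print(f"Reed-Solomon overhead: {rs_overhead(n, k):.1%}")           # 14.3%
print(f"Repetition overhead:   {repetition_overhead(t):.0%}")      # 3200%
```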
The Elegance
What I love about Reed-Solomon is how it transforms a problem (corruption) into geometry (polynomial curves). Your message becomes a shape. That shape can be evaluated at any point. Lose some points? The shape is still there, hiding in the remaining ones.
It's like folding a paper crane — the original sheet is transformed, but the creases contain all the information needed to unfold it. Even if you tear off a corner, the remaining folds often contain enough to reconstruct the whole.
Redundancy isn't about copying.
It's about structure.
A polynomial is more than its points —
it's the relationship between them.
The Legacy
Reed-Solomon was invented in 1960 by Irving Reed and Gustave Solomon. Their original paper was just five pages long. It's now one of the most widely used error-correcting codes in history.
Every time you:
- Scan a QR code with your phone
- Watch satellite TV
- Play a music CD
- Stream video over imperfect networks
- Store data on a hard drive
Reed-Solomon is working silently in the background, correcting errors you never knew existed.
Next time you see a scratched CD play perfectly, or a damaged QR code scan successfully, remember: a mathematical idea from 1960 is making it possible.