How Your Data Survives Corruption

2026-02-28 · error correction

You scan a QR code. It's scratched, torn, partially covered by a sticker. Yet somehow, your phone reads it perfectly.

You play a CD from 1995. It has visible scratches you can feel with your fingernail. The music plays flawlessly.

Voyager 2 sent photos from Uranus and Neptune, through radiation, across billions of kilometers, with a transmitter about as powerful as a refrigerator light bulb. The images arrived intact.

How? Reed-Solomon error correction — one of the most elegant ideas in computing. Let me show you how it works.

The Problem: Imperfect Channels

Data transmission and storage are never perfect. The trouble can come from:

Radio waves through atmosphere
Lasers reading microscopic pits on plastic discs
Cosmic rays flipping bits in memory
Physical damage to storage media

Errors are inevitable. The question is: how do we recover from them?

The Naive Approach

Send: HELLO

Corruption: HE�LO (one character lost)

Result: Irrecoverable. The message is gone forever.

We need redundancy. But not just any redundancy — smart redundancy.

The Insight: Polynomial Fingerprints

Here's the key idea behind Reed-Solomon:

Any polynomial of degree at most n-1 is uniquely determined by n points with distinct x-values.

Let's say you have a message with 4 symbols. You can fit a polynomial of degree at most 3 (a cubic, in general) through them. Now here's the magic: you can evaluate this polynomial at more points (say, 6 total), and now you have 2 extra "check" symbols.

Even if you lose 2 symbols (any 2!), you can still reconstruct the original polynomial from the remaining 4, as long as you know which ones went missing. Which means you can recover your original message.
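
Here is a minimal sketch of that idea in Python, under a few assumptions of mine: arithmetic is done modulo the small prime 257 (real Reed-Solomon codes typically use the finite field GF(2^8) instead), the message symbols are the curve's values at x = 0..3, and the names `interpolate`, `encode` and `recover` are made up for illustration. It only handles the case where you know which symbols were lost.

```python
P = 257  # small prime; integers mod P form a finite field (assumption of this sketch)

def interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(message, n):
    """Treat message[i] as the curve's value at x = i, then extend it to x = 0..n-1."""
    data_points = list(enumerate(message))
    return [interpolate(data_points, x) for x in range(n)]

def recover(survivors, k):
    """Rebuild the k message symbols from any k surviving (x, y) pairs."""
    return [interpolate(survivors[:k], x) for x in range(k)]

message = [72, 69, 76, 76]      # "HELL" as byte values
codeword = encode(message, 6)   # 4 data symbols + 2 check symbols

# Lose the symbols at positions 1 and 4 (we know which ones are gone).
survivors = [(x, y) for x, y in enumerate(codeword) if x not in (1, 4)]
print(recover(survivors, 4) == message)  # True
```

The first four symbols of `codeword` are the message itself, so nothing needs decoding when nothing is lost. Any other pair of lost positions works the same way.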

[Interactive figure: Polynomial Interpolation. The original message is 4 data points (white) and the redundancy is 2 check points (cyan), all on one curve. Points can be "corrupted"; as long as 4 points remain, the polynomial can be reconstructed.]

This is the core of Reed-Solomon encoding: turn your message into a polynomial (its symbols can serve as the coefficients or, as above, as points on the curve), evaluate it at extra points, and you've built in redundancy that can survive corruption.

How QR Codes Use This

QR codes are brilliant because they make error correction visible. Let's see it in action.

[Interactive figure: QR Code Error Correction. The original data H E L L O becomes H E L L O K R 7 X with redundancy (Level M). With up to 2 corrupted symbols, the message can still be recovered.]

QR codes offer four error correction levels:

Level	Recoverable damage	Use Case
L (Low)	~7%	Maximum data capacity
M (Medium)	~15%	Standard QR codes
Q (Quartile)	~25%	Harsh environments
H (High)	~30%	Industrial, art (logos in QR)

Level H is why you can put a logo in the middle of a QR code and it still works.

The Math (Simplified)

Reed-Solomon is specified as RS(n, k):

k = number of data symbols
n = total symbols (data + parity)
n - k = parity symbols
t = (n - k) / 2, rounded down = symbol errors correctable (at unknown positions)

Example: RS(255, 223) with 8-bit symbols:

223 bytes of data
32 bytes of parity
Can correct errors in any 16 bytes
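
The bookkeeping above is easy to check with a few lines of Python. This is only arithmetic on the RS(n, k) parameters, not an encoder, and the function name `rs_params` is my own. One addition that goes beyond the list: when the damaged positions are known (erasures), a Reed-Solomon code can fill in up to n - k of them, twice as many as when it has to find the errors itself.

```python
def rs_params(n, k):
    """Summarize what an RS(n, k) code can do, from its parameters alone."""
    parity = n - k
    return {
        "parity": parity,            # extra symbols added to each codeword
        "max_errors": parity // 2,   # corrupted symbols at unknown positions
        "max_erasures": parity,      # lost symbols at known positions
        "overhead": parity / k,      # parity relative to the data
    }

print(rs_params(255, 223))
print(rs_params(204, 188))
```

For RS(255, 223) this gives 32 parity bytes, 16 correctable errors, and an overhead of roughly 14%.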

The Power of Reed-Solomon

RS(255, 223) can correct 16 complete byte errors
That's up to 128 corrupted bits, as long as they fall within those 16 bytes
In any positions throughout the codeword
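
To make "errors in any positions" concrete, here is a toy decoder for a tiny RS(6, 4) code that can fix t = 1 error without being told where it is. It is a brute-force sketch under my own assumptions (arithmetic modulo the prime 257, message symbols as the curve's values at x = 0..3, illustrative function names). Real decoders use far more efficient algebraic algorithms, so treat this purely as a demonstration of why the guarantee holds.

```python
from itertools import combinations

P = 257  # toy prime field, an assumption of this sketch

def interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def decode(received, k, t):
    """Try every k-subset of symbols; accept the curve that disagrees with at most t symbols."""
    n = len(received)
    points = list(enumerate(received))
    for subset in combinations(points, k):
        candidate = [interpolate(subset, x) for x in range(n)]
        mismatches = sum(c != r for c, r in zip(candidate, received))
        if mismatches <= t:
            return candidate[:k]
    return None  # more damage than this code can handle

message = [72, 69, 76, 76]
codeword = [interpolate(list(enumerate(message)), x) for x in range(6)]

received = list(codeword)
received[2] = (received[2] + 1) % P   # corrupt one symbol; the decoder is not told which
print(decode(received, k=4, t=1) == message)  # True
```

The trick is that two different valid codewords of this code differ in at least 3 positions, so only one of them can be within 1 symbol of what was received.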

Real-World Applications

Compact Discs

CDs use two Reed-Solomon codes in sequence, with interleaving between them (CIRC, the Cross-Interleaved Reed-Solomon Code). This combination can correct burst errors of up to about 4000 bits, which corresponds to a scratch roughly 2.5 mm long. Your CD player is constantly correcting errors you never notice.
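
The "interleaved" part is what lets a code that fixes a handful of symbols survive a long scratch. The sketch below uses a plain block interleaver, which is a simplification I chose for illustration and not the actual CIRC scheme (CIRC uses a more elaborate delay-based cross-interleave). The sizes and the burst position are arbitrary.

```python
def interleave(codewords):
    """Send symbol 0 of every codeword, then symbol 1 of every codeword, and so on."""
    return [cw[i] for i in range(len(codewords[0])) for cw in codewords]

def deinterleave(stream, depth):
    """Undo interleave(): every depth-th symbol belongs to the same codeword."""
    return [stream[i::depth] for i in range(depth)]

depth, length = 8, 12   # 8 codewords of 12 symbols each (arbitrary sizes)
codewords = [[(row, col) for col in range(length)] for row in range(depth)]
stream = interleave(codewords)

# A "scratch" wipes out 16 consecutive symbols on the disc.
burst_start, burst_len = 20, 16
damaged = [None if burst_start <= i < burst_start + burst_len else s
           for i, s in enumerate(stream)]

received = deinterleave(damaged, depth)
worst = max(cw.count(None) for cw in received)
print(worst)  # 2
```

A burst of 16 symbols on the disc turns into at most 2 missing symbols in each codeword, which a modest Reed-Solomon code can then repair.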

Voyager Spacecraft

When Voyager 2 transmitted images from Uranus and Neptune, it used Reed-Solomon coding combined with convolutional codes. Signal-to-noise ratio was terrible. Bit error rates were high. But the images arrived intact.

Satellite Television

DVB (Digital Video Broadcasting) uses RS(204, 188). Every 188-byte packet carries 16 bytes of parity, enough to correct up to 8 corrupted bytes per packet. Your TV picture doesn't glitch because Reed-Solomon is silently correcting transmission errors.

Data Storage

RAID systems, QR codes, barcodes, archival tapes — anywhere data must survive imperfect storage or transmission, Reed-Solomon is there.

Why Not Just Copy the Data?

You might wonder: why use fancy math? Why not just send each byte three times?

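Triple redundancy works, but it's inefficient, and it only guarantees fixing one bad copy of each byte. To guarantee correcting 16 errors anywhere using simple repetition and a majority vote, you'd need 33 copies of every byte (2 × 16 + 1), because all 16 errors could land on copies of the same byte. That's 32 extra copies of each byte, or over 3000% overhead. Reed-Solomon does it with just 32 extra bytes total: 32 parity bytes for 223 data bytes, about 14% overhead.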
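
The comparison comes down to a couple of lines of arithmetic, sketched here in Python. The assumptions are mine: repetition is decoded by majority vote and has to survive the worst case where all t errors hit copies of the same byte, and overhead is measured as extra symbols relative to the data.

```python
def repetition_overhead(t):
    """Extra copies per symbol so a majority vote survives t bad copies of one symbol."""
    copies = 2 * t + 1          # t bad copies must be outvoted by t + 1 good ones
    return float(copies - 1)    # extra copies relative to the single original

def reed_solomon_overhead(k, t):
    """Parity relative to data for a code with k data symbols correcting t errors."""
    return 2 * t / k            # Reed-Solomon needs 2 parity symbols per error

print(f"repetition:   {repetition_overhead(16):.0%}")         # 3200%
print(f"Reed-Solomon: {reed_solomon_overhead(223, 16):.0%}")  # 14%
```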

Efficiency Comparison (for the same error-correction capability)

Simple Repetition: 3000% overhead
Reed-Solomon: 14% overhead

The Elegance

What I love about Reed-Solomon is how it transforms a problem (corruption) into geometry (polynomial curves). Your message becomes a shape. That shape can be evaluated at any point. Lose some points? The shape is still there, hiding in the remaining ones.

It's like folding a paper crane — the original sheet is transformed, but the creases contain all the information needed to unfold it. Even if you tear off a corner, the remaining folds often contain enough to reconstruct the whole.

The Key Insight

Redundancy isn't about copying.
It's about structure.

A polynomial is more than its points —
it's the relationship between them.

The Legacy

Reed-Solomon codes were invented in 1960 by Irving Reed and Gustave Solomon. Their original paper was just five pages. It's now one of the most widely used algorithms in history.

Every time you:

Scan a QR code with your phone
Watch satellite TV
Play a music CD
Stream video over imperfect networks
Store data on a hard drive

Reed-Solomon is working silently in the background, correcting errors you never knew existed.

Next time you see a scratched CD play perfectly, or a damaged QR code scan successfully, remember: a mathematical idea from 1960 is making it possible.

Further reading: Reed-Solomon from the Bottom Up · Reed-Solomon for Coders · CMU Introduction to RS Codes