How QR Codes Work
QR codes look like random noise to human eyes, but they follow a precise structure that allows any smartphone camera to decode them in milliseconds. Here's what's actually happening inside those black and white squares.
A brief history
The QR code was invented in 1994 by Masahiro Hara and his team at Denso Wave (then part of Denso, a Toyota Group company) to track automotive parts moving through assembly lines. Traditional barcodes could hold only around 20 alphanumeric characters, which was not enough for the product codes and manufacturing metadata the auto industry needed. The team wanted something that could encode more data, be read at high speed from any angle, and survive the physical abuse of a factory floor.
The result was a two-dimensional code that could hold far more information and be decoded from any orientation. Denso Wave made the QR code format freely available and deliberately did not enforce the patent, which accelerated its adoption worldwide. In 2000, it was standardized as ISO/IEC 18004, and today it appears on everything from restaurant menus to payment terminals to product packaging globally.
The physical structure
A QR code is a square grid of dark and light modules (the small squares). Each module represents a single bit of data — dark equals 1, light equals 0. The grid size ranges from 21×21 modules (Version 1) up to 177×177 modules (Version 40), with each version adding 4 modules per side.
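The version-to-size relationship is simple arithmetic. Here is a minimal Python sketch of it; the function name `modules_per_side` is made up for this example and does not come from any library.

```python
def modules_per_side(version: int) -> int:
    """Side length, in modules, of a QR code of the given version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    # Version 1 is 21 modules wide; each later version adds 4 per side.
    return 17 + 4 * version


for v in (1, 2, 7, 40):
    side = modules_per_side(v)
    print(f"Version {v}: {side}x{side} = {side * side} modules")
```

Keep in mind that the total module count includes the structural regions described next, so the space left for data is always smaller than the square of the side length.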
Not all modules carry your data. Several fixed structural regions serve other purposes:
- Finder patterns: The three identical square bullseyes in three corners of every QR code. Their high-contrast pattern — a 7×7 dark square, a 5×5 light square inside, and a 3×3 dark square inside that — lets scanning software immediately identify the code and determine its size, orientation, and perspective distortion. The missing fourth corner tells the scanner which way is up.
- Separator: A single-module-wide white border around each finder pattern prevents them from blending with adjacent data modules.
- Timing patterns: Alternating black and white modules running horizontally and vertically between the finder patterns. These establish the module grid coordinates, allowing the scanner to precisely locate every data module even if the image is distorted.
- Alignment patterns: Additional smaller bullseyes that appear in larger QR codes (Version 2 and up). They provide additional anchor points so the scanner can correct for image curvature when a code is printed on a curved surface like a bottle or a can.
- Format information: Two redundant copies of data about the error correction level and the mask pattern used (more on masks below), stored in modules adjacent to the finder patterns. Redundancy ensures format information survives even if part of the code is damaged.
- Version information: For codes Version 7 and larger, additional modules store the version number, because at that density the scanner needs it to correctly decode the module grid.
The remaining modules — everything not used for structure — encode your actual data, interleaved with error correction codewords.
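To make the format information less abstract, here is a rough Python sketch of how its 15 bits can be built: two bits for the error correction level, three for the mask number, and ten check bits from a BCH code, all XORed with a fixed pattern. The constants (generator 0x537, XOR pattern 0x5412, and the two-bit level codes) are the commonly documented values from the QR specification; the names `EC_LEVEL_BITS` and `format_bits` are invented for this example.

```python
# Two-bit indicators for each error correction level (commonly documented values).
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}

BCH_GENERATOR = 0x537   # degree-10 generator polynomial for the (15,5) BCH code
FORMAT_XOR = 0x5412     # fixed pattern XORed in so the result is never all zeros


def format_bits(ec_level: str, mask: int) -> int:
    """Return the 15-bit format information value for a level and mask number."""
    if not 0 <= mask <= 7:
        raise ValueError("mask number must be 0-7")
    data = (EC_LEVEL_BITS[ec_level] << 3) | mask   # 5 data bits
    remainder = data << 10                          # make room for 10 check bits
    # Binary polynomial long division by the generator.
    for shift in range(4, -1, -1):
        if remainder & (1 << (shift + 10)):
            remainder ^= BCH_GENERATOR << shift
    return ((data << 10) | remainder) ^ FORMAT_XOR


print(format(format_bits("M", 5), "015b"))  # 100000011001110
print(format(format_bits("L", 0), "015b"))  # 111011111000100
```

Only 5 of the 15 bits are actual information. The rest is redundancy, which is why a scanner can usually still read the level and mask number when a few of these modules are misread.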
Data encoding modes
Before encoding your data, a QR encoder chooses the most efficient representation for the content you've provided. There are four encoding modes, each with different capacity:
- Numeric: Digits 0–9 only. Groups of three digits are encoded in 10 bits, giving roughly 3.33 bits per character — the most efficient mode. A Version 1 code can hold up to 41 numeric characters at error correction level L.
- Alphanumeric: 45 characters — uppercase letters, digits, and nine special characters (space, $ % * + - . / :). Pairs are encoded in 11 bits. This mode is why all-caps URLs (like HTTP://EXAMPLE.COM) encode more efficiently than mixed-case ones.
- Byte: Any byte value from 0 to 255, with each byte using 8 bits. The standard's default interpretation is ISO/IEC 8859-1, but in practice most generators and scanners treat the bytes as UTF-8. This is the mode used for most modern QR code content, including URLs with lowercase characters, international text, and binary data.
- Kanji: Shift JIS-encoded Japanese characters, encoded in 13 bits per character. A nod to the format's Japanese origins.
Modern QR encoders can mix modes within a single code — using numeric encoding for a phone number prefix, then byte encoding for the rest — to squeeze the maximum data density out of each version.
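The packing rules for the first three modes can be sketched in a few lines of Python. This is an illustration only: it produces just the payload bits (a real code also prepends a mode indicator and a character count), it picks one mode for the whole string rather than mixing modes, and the function names are made up for this example.

```python
ALNUM = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"  # the 45 allowed characters


def choose_mode(text: str) -> str:
    """Pick the densest single mode that can represent the whole string."""
    if all(ch in "0123456789" for ch in text):
        return "numeric"
    if all(ch in ALNUM for ch in text):
        return "alphanumeric"
    return "byte"


def encode_numeric(text: str) -> list[str]:
    """Groups of 3 digits -> 10 bits; a trailing 2 digits -> 7 bits, 1 digit -> 4 bits."""
    widths = {3: 10, 2: 7, 1: 4}
    groups = [text[i:i + 3] for i in range(0, len(text), 3)]
    return [format(int(g), f"0{widths[len(g)]}b") for g in groups]


def encode_alphanumeric(text: str) -> list[str]:
    """Pairs -> 11 bits (first * 45 + second); a trailing single character -> 6 bits."""
    out = []
    for i in range(0, len(text), 2):
        pair = [ALNUM.index(ch) for ch in text[i:i + 2]]
        if len(pair) == 2:
            out.append(format(pair[0] * 45 + pair[1], "011b"))
        else:
            out.append(format(pair[0], "06b"))
    return out


def encode_byte(text: str) -> list[str]:
    """Each UTF-8 byte -> 8 bits (UTF-8 is an assumption; see the Byte bullet above)."""
    return [format(b, "08b") for b in text.encode("utf-8")]


print(encode_numeric("01234567"))     # ['0000001100', '0101011001', '1000011']
print(encode_alphanumeric("AC-42"))   # ['00111001110', '11100111001', '000010']

for url in ("HTTP://EXAMPLE.COM", "http://example.com"):
    mode = choose_mode(url)
    encoder = {"numeric": encode_numeric,
               "alphanumeric": encode_alphanumeric,
               "byte": encode_byte}[mode]
    print(url, mode, sum(len(chunk) for chunk in encoder(url)), "payload bits")
```

The last two lines show the all-caps effect in numbers: the uppercase URL fits in 99 payload bits in alphanumeric mode, while the lowercase one needs 144 bits in byte mode.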
Error correction: why QR codes survive damage
QR codes use Reed-Solomon error correction, a family of codes published in 1960 by Irving Reed and Gustave Solomon at MIT Lincoln Laboratory. It later became a workhorse of deep-space communication (the Voyager probes use a variation of it), where signals can be corrupted by noise before reaching Earth. The same mathematical principle makes QR codes readable even when they're scratched, smudged, or partially covered.
Reed-Solomon error correction works by adding redundant codewords to the data. These extra codewords contain enough mathematical relationships with the original data that a decoder can identify which codewords are corrupted and reconstruct the correct values — even without knowing in advance which positions are damaged.
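For readers who want to see those "mathematical relationships" concretely, here is a compact Python sketch of the encoding half. It assumes the field QR codes use, GF(256) with the polynomial 0x11D, and appends redundant codewords so that the whole message, read as a polynomial, evaluates to zero at a known set of points. A decoder checks those evaluations (the syndromes); any nonzero value signals damage. The actual correction step is considerably longer and is left out. All function names are made up for this example.

```python
# Log/antilog tables for GF(256) with primitive polynomial 0x11D.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec: int) -> list[int]:
    """(x - a^0)(x - a^1)...(x - a^(n_ec-1)), highest-degree coefficient first."""
    gen = [1]
    for i in range(n_ec):
        nxt = [0] * (len(gen) + 1)
        for j, coef in enumerate(gen):
            nxt[j] ^= coef                        # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])    # coef * a^i (minus is XOR here)
        gen = nxt
    return gen


def ec_codewords(data: list[int], n_ec: int) -> list[int]:
    """Remainder of data * x^n_ec divided by the generator polynomial."""
    gen = generator_poly(n_ec)
    rem = list(data) + [0] * n_ec
    for i in range(len(data)):
        lead = rem[i]
        if lead:
            for j in range(1, len(gen)):
                rem[i + j] ^= gf_mul(gen[j], lead)
    return rem[len(data):]


def syndromes(codeword: list[int], n_ec: int) -> list[int]:
    """Evaluate the codeword polynomial at a^0 .. a^(n_ec-1) with Horner's rule."""
    result = []
    for i in range(n_ec):
        acc = 0
        for coef in codeword:
            acc = gf_mul(acc, EXP[i]) ^ coef
        result.append(acc)
    return result


data = list(b"QR CODE DEMO 123")       # 16 data codewords, as in a Version 1-M code
codeword = data + ec_codewords(data, 10)
print(any(syndromes(codeword, 10)))    # False: an intact codeword checks out

damaged = codeword.copy()
damaged[3] ^= 0x5A                     # simulate a scratch on one codeword
print(any(syndromes(damaged, 10)))     # True: the damage is detected
```

From the nonzero syndromes a full decoder works out both where the errors are and what the original values were. Locating errors at unknown positions is what makes correction more expensive than detection.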
There are four error correction levels, each providing different amounts of recovery capacity:
- Level L: ~7% of codewords can be restored. The smallest codes for a given data length, but the least durable. Best when the QR code is in a controlled environment and size matters (e.g., tiny stickers).
- Level M: ~15% recovery. The default in many generators. A reasonable balance for general use.
- Level Q: ~25% recovery. Good for industrial or outdoor use where some wear is expected.
- Level H: ~30% recovery. The highest level. Makes codes significantly larger and denser for the same data, but enables a logo to cover the center of the code without causing scan failures. QRGlyph uses level H for all generated codes.
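The cost of each level is easiest to see on the smallest code. The Python sketch below estimates how many characters fit in a Version 1 code at each level. It assumes the published Version 1 figures (26 codewords in total, of which 7, 10, 13, or 17 are error correction) and the header sizes that apply to Versions 1 through 9 (a 4-bit mode indicator plus a 10-, 9-, or 8-bit character count). The names are made up for this example.

```python
TOTAL_CODEWORDS_V1 = 26
EC_CODEWORDS_V1 = {"L": 7, "M": 10, "Q": 13, "H": 17}

# Mode indicator (4 bits) + character count field (Versions 1-9).
HEADER_BITS = {"numeric": 4 + 10, "alphanumeric": 4 + 9, "byte": 4 + 8}


def payload_bits(mode: str, n_chars: int) -> int:
    """Bits needed to store n_chars characters in the given mode."""
    if mode == "numeric":
        return 10 * (n_chars // 3) + (0, 4, 7)[n_chars % 3]
    if mode == "alphanumeric":
        return 11 * (n_chars // 2) + 6 * (n_chars % 2)
    return 8 * n_chars  # byte mode


def capacity_v1(mode: str, level: str) -> int:
    """Largest character count whose header + payload fits the data codewords."""
    data_codewords = TOTAL_CODEWORDS_V1 - EC_CODEWORDS_V1[level]
    budget = data_codewords * 8 - HEADER_BITS[mode]
    n = 0
    while payload_bits(mode, n + 1) <= budget:
        n += 1
    return n


for level in "LMQH":
    row = [capacity_v1(mode, level) for mode in ("numeric", "alphanumeric", "byte")]
    print(level, row)
# L [41, 25, 17]
# M [34, 20, 14]
# Q [27, 16, 11]
# H [17, 10, 7]
```

Moving from level L to level H cuts the capacity of the same 21×21 grid by more than half, which is why high error correction pushes the same data into a larger version.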
Masking
After data and error correction codewords are placed in the grid, the encoder applies a mask pattern. Masking XORs the data and error correction modules (never the structural regions) with one of eight repeating mathematical patterns. The reason: certain data sequences produce large areas of uniform color (all black or all white) or shapes that resemble a finder pattern, either of which can confuse scanning software trying to identify module boundaries. Masking breaks up these patterns.
The encoder tries all eight mask patterns, scores each result with a set of penalty rules (long runs of one color, solid blocks, sequences that resemble a finder pattern, and an uneven dark-to-light ratio), and keeps the mask with the lowest penalty. The chosen mask pattern number is stored in the format information region.
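Here is a small Python sketch of the masking step. The eight conditions are the commonly documented ones, with `r` as the row and `c` as the column, both counted from zero. Two simplifications are assumptions of this example: the mask is applied to every cell of a toy grid (a real encoder skips the structural regions), and the score looks only at the dark-to-light balance rather than the full set of penalty rules.

```python
# A module is flipped wherever the chosen condition is true.
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(grid: list[list[int]], mask_id: int) -> list[list[int]]:
    """XOR every module with the mask pattern (1 = dark, 0 = light)."""
    condition = MASKS[mask_id]
    return [
        [bit ^ int(condition(r, c)) for c, bit in enumerate(row)]
        for r, row in enumerate(grid)
    ]


def dark_imbalance(grid: list[list[int]]) -> float:
    """Distance of the dark-module share from 50% (a stand-in for the real score)."""
    total = sum(len(row) for row in grid)
    dark = sum(sum(row) for row in grid)
    return abs(dark / total - 0.5)


solid = [[1] * 6 for _ in range(6)]          # worst case: one uniform dark block
for mask_id in range(8):
    masked = apply_mask(solid, mask_id)
    print(mask_id, round(dark_imbalance(masked), 3))

# XOR is its own inverse, so applying the same mask again restores the original.
print(apply_mask(apply_mask(solid, 3), 3) == solid)  # True
```

That last property is what makes unmasking trivial for the scanner: once it knows the mask number, it applies the identical operation.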
How scanning works
When you point a camera at a QR code, the scanning software performs several steps in rapid succession:
- Detection: The software scans the camera frame looking for the characteristic ratio of a finder pattern (1:1:3:1:1 across a scan line). When it finds three such patterns in the right geometric relationship, it has located a QR code.
- Normalization: Using the finder patterns as anchor points, the software corrects for perspective distortion, rotation, and scale — mapping the raw image coordinates to a clean, grid-aligned coordinate system.
- Module sampling: The software samples each module at its corrected grid position, reading a binary value (dark or light).
- Format reading: The format information region is decoded first to determine the error correction level and mask pattern.
- Unmasking: The mask is removed from the data modules by applying the same XOR operation used during encoding.
- Data recovery: Reed-Solomon decoding identifies and corrects any corrupted codewords.
- Content decoding: The data bits are decoded according to the encoding mode flags, producing the original text, URL, or other content.
On a modern smartphone, this entire sequence takes 10–50 milliseconds — fast enough that it appears instantaneous to the user.
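The detection step is the one people find most surprising, so here is a toy Python version of it. It takes a single row of already-thresholded pixels (1 for dark, 0 for light), collapses it into runs, and looks for five consecutive runs in roughly a 1:1:3:1:1 ratio starting with dark. The function names and the tolerance value are assumptions made for this sketch. Real scanners also confirm each hit vertically and work on noisy grayscale images.

```python
from itertools import groupby


def run_lengths(row: list[int]) -> list[tuple[int, int]]:
    """Collapse a row of 0/1 pixels into (value, length) runs."""
    return [(value, len(list(group))) for value, group in groupby(row)]


def find_finder_candidates(row: list[int], tolerance: float = 0.5):
    """Return (center_x, module_width) for each dark-light-dark-light-dark
    window whose run lengths are close to 1:1:3:1:1."""
    runs = run_lengths(row)
    starts, position = [], 0
    for _, length in runs:
        starts.append(position)
        position += length

    hits = []
    for i in range(len(runs) - 4):
        window = runs[i:i + 5]
        if window[0][0] != 1:          # a finder cross-section starts with dark
            continue
        lengths = [length for _, length in window]
        unit = sum(lengths) / 7        # estimated width of one module in pixels
        expected = (1, 1, 3, 1, 1)
        if all(abs(n - e * unit) <= tolerance * unit
               for n, e in zip(lengths, expected)):
            hits.append((starts[i] + sum(lengths) / 2, unit))
    return hits


# One scan line through the middle of a finder pattern, 3 pixels per module,
# with a light margin on both sides.
scale = 3
finder_row = [1, 0, 1, 1, 1, 0, 1]
line = [0] * 12 + [px for module in finder_row for px in [module] * scale] + [0] * 12

print(find_finder_candidates(line))          # [(22.5, 3.0)]
print(find_finder_candidates([1, 0] * 20))   # []
```

Because the test is a ratio rather than an absolute width, the same check works whether the code fills the frame or sits small in a corner, and the ratio holds along any line through the center of the pattern, which is why rotation does not defeat it.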
Practical implications
Understanding the structure explains several practical truths about QR codes:
- Shorter data means smaller, less dense codes that are easier to scan at small print sizes. Encode only what you need.
- Higher error correction means more redundancy codewords, which means more modules, which means larger or denser codes for the same data. There is a real tradeoff between durability and compactness.
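- The three finder patterns are sacred: covering one will almost certainly make the code unreadable, regardless of error correction level. A center logo works because it stays away from the corners and only destroys data modules, which error correction can rebuild as long as the covered area stays within the chosen level's recovery capacity.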
- Color works as long as contrast is maintained. The scanner cares about the lightness difference between modules, not the specific colors. A navy-on-white code scans just as well as black-on-white.
Ready to apply this? Try our QR code generator or read our guide on design best practices.
