This post is a survey of some run-length encoding (RLE) data compression formats, most of which are used for bitmapped graphics.
One such format is PackBits. It’s sort of the granddaddy of them, or the canonical example of such a format. Sometimes, formats similar to it are said to be “PackBits-like”, though the term doesn’t mean anything specific, and the required degree of similarity varies a lot.
A typical RLE format works something like this: The compressed data is in the form of a sequence of code sequences. Each code sequence begin with a code byte. A code byte, depending on its value, means either a “run” of some length (the next byte is to be repeated), or it means that a certain number of following bytes are “uncompressed”, and are to be copied as-is. The 256 available code byte values are apportioned between those two things.
- There is nothing especially innovative about PackBits. I bet that many of these algorithms were developed independently, by someone who knew nothing about PackBits.
- Just because two formats are identical does not mean that anyone copied off of someone else. Many of these formats are simple enough that some amount of independent reinvention is inevitable.
- Bias alert: This survey only includes formats that I already knew about. I did not go out of my way to find more RLE formats.
- The scope of this survey is formats that can decompressed by reading one (or more) full byte at a time, and that are not too complex or configurable. RLE formats that have codes smaller than one byte definitely exist, but are not covered here.
- Some of these formats decompress to a sequence of bytes (to be interpreted later as pixels), others directly to a sequence of logical pixels. Sometimes (when a pixel is the same size as a byte) there is no difference between these interpretations.
How to read this survey
To decompress data, read the compressed bytes sequentially, and find the first pattern (the left part of each line, before the colon) that matches the byte(s) read. Follow the instructions for that pattern, reading and consuming any additional bytes that are mentioned. Repeat.
The first byte of a pattern will be named “N”, the second “N2”, the third “N3”, etc. Bytes in a pattern are separated by a space. The phrase “next byte(s)” mean the byte starting after the bytes represented in the pattern.
If a number is quoted, it means a byte value, not a number of bytes. If not quoted, the meaning should be clear from context.
All numbers are written in decimal, not hexadecimal.
Some formats are traditionally documented in terms of “signed bytes” (whose values range from -128 to +127). I have translated the algorithms to use only unsigned bytes (whose values range from 0 to 255).
Warning: In many cases, this information is not sufficient to properly decompress a format. It does not explain every detail. For example, some formats require rows to be padded in some way. And some formats allow byte runs to cross row boundaries, while others do not.
Warning: Some of these formats are not formally documented, so the information here is just a best guess based on reverse engineering.
Schemes listed by their own name: PackBits (Used by MacPaint, some Photoshop PSD, some TIFF, IFF-ILBM, some PICT, DEGAS, etc.) 0-127: emit the next N+1 bytes literally 128: special 129-255: emit the next byte 257-N times RLE90 (Used by ARC, StuffIt, BinHex, etc.) 0-143,145-255: emit this byte (N) literally 144 0 = emit one byte of value "144" 144 1-255 = emit the "previous byte" N2-1 MORE times (at least N2 times total) Note: In some variants, the "previous byte" is the previous decoded byte (which could be "144"). In others, it is not possible to compress a run of "144" bytes. Note: 144 is hex 0x90, hence the name RLE90. Schemes listed by application or file format: BMP RLE8 0 0-2: special 0 3-255: emit the next N2 bytes literally 1-255: emit the next byte literally Note: There are also padding rules. CRG - Calamus Raster = essentially identical to TGA Deskmate .PNT: N N2: emit byte value N, N2 times Dr. Halo .CUT 0: special 1-127: emit the next N bytes literally 128: special 129-255: emit the next byte N-128 times Note: Each row has a length prefix. Note: Similar to WPG. EPOC MBM RLE8 0-127: emit the next byte N+1 times 128-255: emit the next 256-N bytes literally icns 24-bit (Mac OS icon) 0-127: emit the next N+1 bytes literally 128-255: emit the next byte N-125 times Megapaint .BLD 0 N2 = emit N2+1 bytes of value 0 1-254 = emit this byte (N) literally 255 N2 = emit N2+1 bytes of value 255 Microsoft Paint .MSP 0 N2 = emit the next byte N2 times 1-255 = emit the next N bytes literally Paint Shop Pro pspbrwse.jbf v1.0 0-128: emit the next N bytes literally 129-255: emit the next byte N-128 times Paint Shop Pro pspbrwse.jbf v1.3 0-192: emit this byte (N) literally 193-255: emit the next byte N-192 times Palm Bitmap compression type 1 (RLE) N: emit the next byte N times Palm Database ImageViewer 0-128: emit the next N+1 bytes literally 129-255: emit the next byte N-127 times PCX 0-191: emit this byte (N) literally 192-255: emit the next byte N-192 times Portfolio Compressed .PGC 0-127: emit the next N bytes literally 128-255: emit the next byte N-128 times PrintPartner type 2 compression = Same as Portfolio .PGC SHG/MRB (Segmented Hypergraphics) 0-127: emit the next byte N times 128-255: emit the next N-128 bytes literally Spectrum 512 .SPC 0-127: emit the next N+1 bytes literally 128-255: emit the next byte 258-N times Spectrum 512 .SPS 0-127: emit the next byte N+3 times 128-255: emit the next N-127 bytes literally Sun Raster 0-127,129-255: emit this byte (N) literally 128 0: emit one "128" byte 128 N2: emit the next byte N2+1 times TGA 0-127: emit the next N+1 bytes (or pixels) literally 128-255: emit the next byte (or pixel) N-127 times VORT ray tracer 0-127: emit the next byte (or pixel) N+1 times 128-255: emit the next N-128 bytes (or pixels) literally WPG - WordPerfect Graphics 0 N2: repeat previous scanline N2 more times 1-127: emit the next N bytes literally 128 N2: emit the byte "255" N2 times 129-255: emit the next byte N-128 times Note: Similar to Dr. Halo .CUT
Here are some other RLE formats that I’m aware of, but deemed to be out of scope:
- GEM VDI Bit Image (.IMG)
- T.45 Run Length Colour Encoding
- Tiny Stuff
- Amiga GlowIcons
- PC Paint (.PIC)