Survey of PackBits-like RLE compression schemes

This post is a survey of some run-length encoding (RLE) data compression formats, most of which are used for bitmapped graphics.

One such format is PackBits. It’s sort of the granddaddy of them, or the canonical example of such a format. Sometimes, formats similar to it are said to be “PackBits-like”, though the term doesn’t mean anything specific, and the required degree of similarity varies a lot.

A typical RLE format works something like this: The compressed data is in the form of a sequence of code sequences. Each code sequence begin with a code byte. A code byte, depending on its value, means either a “run” of some length (the next byte is to be repeated), or it means that a certain number of following bytes are “uncompressed”, and are to be copied as-is. The 256 available code byte values are apportioned between those two things.

Notes

  • There is nothing especially innovative about PackBits. I bet that many of these algorithms were developed independently, by someone who knew nothing about PackBits.
  • Just because two formats are identical does not mean that anyone copied off of someone else. Many of these formats are simple enough that some amount of independent reinvention is inevitable.
  • Bias alert: This survey only includes formats that I already knew about. I did not go out of my way to find more RLE formats.
  • The scope of this survey is formats that can decompressed by reading one (or more) full byte at a time, and that are not too complex or configurable. RLE formats that have codes smaller than one byte definitely exist, but are not covered here.
  • Some of these formats decompress to a sequence of bytes (to be interpreted later as pixels), others directly to a sequence of logical pixels. Sometimes (when a pixel is the same size as a byte) there is no difference between these interpretations.

How to read this survey

To decompress data, read the compressed bytes sequentially, and find the first pattern (the left part of each line, before the colon) that matches the byte(s) read. Follow the instructions for that pattern, reading and consuming any additional bytes that are mentioned. Repeat.

The first byte of a pattern will be named “N”, the second “N2”, the third “N3”, etc. Bytes in a pattern are separated by a space. The phrase “next byte(s)” mean the byte starting after the bytes represented in the pattern.

If a number is quoted, it means a byte value, not a number of bytes. If not quoted, the meaning should be clear from context.

All numbers are written in decimal, not hexadecimal.

Some formats are traditionally documented in terms of “signed bytes” (whose values range from -128 to +127). I have translated the algorithms to use only unsigned bytes (whose values range from 0 to 255).

Warning: In many cases, this information is not sufficient to properly decompress a format. It does not explain every detail. For example, some formats require rows to be padded in some way. And some formats allow byte runs to cross row boundaries, while others do not.

Warning: Some of these formats are not formally documented, so the information here is just a best guess based on reverse engineering.

Survey

Schemes listed by their own name:

PackBits (Used by MacPaint, some Photoshop PSD, some TIFF, IFF-ILBM,
     some PICT, DEGAS, etc.)
  0-127: emit the next N+1 bytes literally
  128: special
  129-255: emit the next byte 257-N times

RLE90 (Used by ARC, StuffIt, BinHex, etc.)
  0-143,145-255: emit this byte (N) literally
  144 0 = emit one byte of value "144"
  144 1-255 = emit the "previous byte" N2-1 MORE times (at least N2
    times total)
  Note: In some variants, the "previous byte" is the previous decoded
    byte (which could be "144"). In others, it is not possible to
    compress a run of "144" bytes.
  Note: 144 is hex 0x90, hence the name RLE90.

Schemes listed by application or file format:

BMP RLE8
  0 0-2: special
  0 3-255: emit the next N2 bytes literally
  1-255: emit the next byte literally
  Note: There are also padding rules.

CRG - Calamus Raster
  = essentially identical to TGA

Deskmate .PNT:
  N N2: emit byte value N, N2 times

Dr. Halo .CUT
  0: special
  1-127: emit the next N bytes literally
  128: special
  129-255: emit the next byte N-128 times
  Note: Each row has a length prefix.
  Note: Similar to WPG.

EPOC MBM RLE8
  0-127: emit the next byte N+1 times
  128-255: emit the next 256-N bytes literally

icns 24-bit (Mac OS icon)
  0-127: emit the next N+1 bytes literally
  128-255: emit the next byte N-125 times

Megapaint .BLD
  0 N2  = emit N2+1 bytes of value 0
  1-254 = emit this byte (N) literally
  255 N2 = emit N2+1 bytes of value 255

Microsoft Paint .MSP
  0 N2  = emit the next byte N2 times
  1-255 = emit the next N bytes literally

Paint Shop Pro pspbrwse.jbf v1.0
  0-128: emit the next N bytes literally
  129-255: emit the next byte N-128 times

Paint Shop Pro pspbrwse.jbf v1.3
  0-192: emit this byte (N) literally
  193-255: emit the next byte N-192 times

Palm Bitmap compression type 1 (RLE)
  N: emit the next byte N times

Palm Database ImageViewer
  0-128: emit the next N+1 bytes literally
  129-255: emit the next byte N-127 times

PCX
  0-191: emit this byte (N) literally
  192-255: emit the next byte N-192 times

Portfolio Compressed .PGC
  0-127: emit the next N bytes literally
  128-255: emit the next byte N-128 times

PrintPartner type 2 compression
  = Same as Portfolio .PGC

SHG/MRB (Segmented Hypergraphics)
  0-127: emit the next byte N times
  128-255: emit the next N-128 bytes literally

Spectrum 512 .SPC
  0-127: emit the next N+1 bytes literally
  128-255: emit the next byte 258-N times

Spectrum 512 .SPS
  0-127: emit the next byte N+3 times
  128-255: emit the next N-127 bytes literally

Sun Raster
  0-127,129-255: emit this byte (N) literally
  128 0: emit one "128" byte
  128 N2: emit the next byte N2+1 times

TGA
  0-127: emit the next N+1 bytes (or pixels) literally
  128-255: emit the next byte (or pixel) N-127 times

VORT ray tracer
  0-127: emit the next byte (or pixel) N+1 times
  128-255: emit the next N-128 bytes (or pixels) literally

WPG - WordPerfect Graphics
  0 N2: repeat previous scanline N2 more times
  1-127: emit the next N bytes literally
  128 N2: emit the byte "255" N2 times
  129-255: emit the next byte N-128 times
  Note: Similar to Dr. Halo .CUT

Additional comments

Here are some other RLE formats that I’m aware of, but deemed to be out of scope:

  • GEM VDI Bit Image (.IMG)
  • BMP/RLE4
  • T.45 Run Length Colour Encoding
  • Tiny Stuff
  • Amiga GlowIcons
  • PC Paint (.PIC)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s