Testing some LZ77 compression limits

This post is about data compression algorithms that involve LZ77, or a similar kind of compression. It's mainly about old-school compression algorithms and software. There is some information about LZ77 in my post about LZ77 prehistory. I won't explain it in detail here, but here are some things to know about it. Both the compressor …

Win32 I/O character encoding supplement 2 – setlocale enhancement

This is part of a series of posts on using Unicode in Windows command-line applications. Here's the first post. Sometime in 2018, some functions in the Windows 10 C runtime system, and related development SDKs, were enhanced to support UTF-8. This feature is enabled by calling the setlocale function. For reference, Microsoft's current documentation of …

The blocksize field in LHA compression format

This post is about the data compression format I'll call "lh5". It is actually a family of formats that includes the compression methods often named lh{4, 5, 6, 7, 8}. It was most notably used by version 2.x of the old LHA/LZH/LHArc compressed archive format. It was used, often in modified form, in a number …

Win32 I/O character encoding supplement 1 – A Cygwin issue

A while back, I wrote a series of posts about using Unicode in Windows console mode programs: Part 1, Part 2, Part 3. In Part 2, I said that programmers should probably not be changing the console code page to UTF-8 (65001), and that if they must, they should change it back when they're done. But now …

Win32 I/O character encoding part 2: chcp 65001

In a previous post, I summarized the character encodings used by Windows console mode programs. This is a short post about a not-very-good mitigation technique for some of the resulting problems. In a future post, I'll go over some better solutions. [Edit 2020-05: Unfortunately, I've had to walk back the advice in this post a …

Summary of some Win32 I/O character encoding behavior

This is the first of a series of posts. Here are the others: Part 2, Part 3, Supplement 1, Supplement 2. This post is about programming a Windows Win32 application, mainly one that uses the console (command line). It summarizes the results of some tests I performed. Maybe you ported a Unix utility to Windows, but you find …