The blocksize field in LHA compression format

This post is about the data compression format I'll call "lh5". It is actually a family of formats that includes the compression methods often named lh{4, 5, 6, 7, 8}. It was most notably used by version 2.x of the old LHA/LZH/LHArc compressed archive format. It was used, often in modified form, in a number …

Win32 I/O character encoding supplement 1 – A Cygwin issue

A while back, I wrote a series of posts about using Unicode in Windows console mode programs: Part 1, Part 2, Part 3. In Part 2, I said that programmers should probably not be changing the console code page to UTF-8 (65001), and that if they must, they should change it back when they're done. But now …

Win32 I/O character encoding part 2: chcp 65001

In a previous post, I summarized the character encodings used by Windows console mode programs. This is a short post about a not-very-good mitigation technique for some of the resulting problems. In a future post, I'll go over some better solutions. [Edit 2020-05: Unfortunately, I've had to walk back the advice in this post a …

Summary of some Win32 I/O character encoding behavior

This post is about programming a Windows Win32 application, mainly one that uses the console (command line). It summarizes the results of some tests I performed. Maybe you ported a Unix utility to Windows, but you find that it doesn't work with filenames that contain Japanese characters. This information may help, though specific recommendations will …