Win32 I/O character encoding supplement 2 – setlocale enhancement

This is part of a series of posts on using Unicode in Windows command-line applications. Here's the first post. Sometime in 2018, some functions in the Windows 10 C runtime system, and related development SDKs, were enhanced to support UTF-8. This feature is enabled by calling the setlocale function. For reference, Microsoft's current documentation of …

Encoding Huffman codebooks

This post will assume you have a basic knowledge of the data compression technique known as Huffman coding. Though maybe, since I'm only concerned about decompression, I should call it something like "bit-oriented prefix codes". Huffman coding is really just one of the algorithms that can produce such a code, but it's the term everybody …
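To make the "bit-oriented prefix codes" idea concrete, here is a minimal Python sketch of decoding with such a code. The function name, the string-of-bits input, and the three-symbol codebook are all hypothetical illustrations, not taken from the post. It relies only on the defining property that no code is a prefix of another.

```python
def decode_prefix_code(bits, codebook):
    """Decode a string of '0'/'1' characters using a prefix code.

    `codebook` maps each code (a bit string) to its symbol. Because no code
    is a prefix of another, the first match found is always the right one.
    """
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in codebook:
            symbols.append(codebook[current])
            current = ""
    if current:
        # Leftover bits that do not form a complete code.
        raise ValueError("input ends in the middle of a code")
    return symbols


# Hypothetical codebook: shorter codes for more frequent symbols.
EXAMPLE_CODEBOOK = {"0": "A", "10": "B", "11": "C"}
```

With this codebook, the bits 0, 10, 11, 0 decode to A, B, C, A. A real decoder would read bits from a byte stream and typically use a table or tree lookup rather than string concatenation.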

The Cleveland baseball team

I see that the Cleveland Indians baseball team is finally going to change their nickname. I think that's probably a good thing. For one thing, the word "Indians" is ambiguous, and you wouldn't want to accidentally demean people from South Asia, when you're trying to demean people from North America. They say they haven't chosen …

LZ77 compression prehistory

LZ77 is a widely-used class of data compression algorithms. I'll start with a quick overview of it. Assuming you're compressing a stream of bytes (a "file"), your LZ77 compressed data, at a high level, would contain two possible kinds of instructions for the decompressor: "Emit literal: {byte value=A}" and "Copy from history: {match-offset=B, match-length=C}". The match-offset may …
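The two instruction kinds described above can be sketched as a small Python decompressor. This is only an illustration under stated assumptions: the tuple layout, the function name, and the convention that match-offset counts backwards from the end of the output so far are choices made here, not details confirmed by the post.

```python
def lz77_decode(instructions):
    """Run a list of LZ77-style instructions and return the decoded bytes.

    Assumed (hypothetical) instruction formats:
      ("literal", byte_value)
      ("copy", match_offset, match_length)
    where match_offset is the distance back from the current end of output.
    """
    out = bytearray()
    for ins in instructions:
        if ins[0] == "literal":
            out.append(ins[1])
        elif ins[0] == "copy":
            _, offset, length = ins
            if offset < 1 or offset > len(out):
                raise ValueError("match-offset points outside the history")
            start = len(out) - offset
            # Copy one byte at a time so a match may overlap the bytes
            # it is producing (length greater than offset).
            for i in range(length):
                out.append(out[start + i])
        else:
            raise ValueError("unknown instruction kind")
    return bytes(out)
```

For example, two literals for "a" and "b" followed by a copy with offset 2 and length 4 produce "ababab": the copy reads bytes it has only just written, which is why the loop copies byte by byte instead of slicing.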

Thoughts on timestamps of computer files

Computer files can have a number of different kinds of timestamps. Some of them are stored in the file's external metadata, alongside the file's name -- I'll call these external timestamps. Others are stored inside the file itself -- I'll call these internal timestamps. I use the term "timestamp" loosely. When something called a "timestamp" …

Win32 I/O character encoding supplement 1 – A Cygwin issue

A while back, I wrote a series of posts about using Unicode in Windows console mode programs: Part 1, Part 2, Part 3. In Part 2, I said that programmers should probably not be changing the console code page to UTF-8 (65001). And that if they must, they should change it back when they're done. But now …

What is the name of libjpeg?

Shortly after the development of the JPEG image format around 1991, an organization named the Independent JPEG Group (IJG) released an open source software package to help people use the format. While the software included a few utilities, such as cjpeg and djpeg, the important part of it was its C library. The library became …

Survey of PKZIP versions for MS-DOS

I wanted to know exactly what versions of the old PKZIP compression software were publicly released for MS-DOS, and some basic characteristics about them, particularly what compression methods they used when compressing files. Sure, Wikipedia has a list, but it wasn't quite what I wanted, and it omitted at least one version I was pretty …

Will the real PKZ110.EXE please stand up?

I've been researching the version history of PKZIP, the once-popular compression software that gave us the still-popular ZIP file format. There are two important MS-DOS versions of it: v1.10, released in March 1990, which was the latest official version for more than 2.5 years, until v2.04c(?) was released in December 1992; and v2.04g, released February 1993, which …

The 2019 TLS certificate serial number mess

Remember the Great TLS Certificate Serial Number Brouhaha of March, 2019? Millions of website certificates have been mis-issued! Everything is insecure! The sky is falling! Revoke and replace, ASAP! I barely do, but I remember thinking it was a really stupid overreaction. Now I've gone back and reviewed what happened, and I'll try to explain …