When you need a crane to unload your booze, it's time to admit you have a problem. -- Arthur, The Journeyman Project Part 3

Curiouser and Curiouser: I made An Art

Some of my process of self-discovery has involved being a little bit of an artist. Last Friday I finished a project I’d been working on for several years.

Well, a project I’d been working on for three months after starting it and then getting paralyzed with inertia and not doing anything for three years.

Either the well was very deep, or she fell very slowly, for she had plenty of time as she went down to look about her…

Rebirth

And we’re back. Ish. I guess. Getting there. My new computer arrived Saturday evening. Keep in mind that when the old computer died, I slapped the hard drive into an ancient laptop with a quarter of the RAM, and it basically just booted up. Aside from it not having enough horsepower, the main issue I had was convincing it which screen to use. So I figured this would be a snap. I popped the hard drive from the old one in and fired it up…

Now available in Right-Side-Up.

Yeah, of course nothing is ever easy. First, I spent ten minutes figuring out that the old hard drive uses MBR and the new computer defaults to UEFI, so I had to figure out how to change the BIOS settings to use the legacy boot mechanism. It finally made it to the Linux desktop… And showed me a warning that it was using software rendering because it didn’t know what to do with the GPU.

This is a Small Form Factor machine marketed as a digital signage player. It can output ridiculous resolutions, but it’s not a super powerful GPU: it’s an on-board Intel chip, same as the old one. I wasted too much time on the assumption that the week my OS had spent Rob Schneidering in a laptop had just donked up the video settings, because you get very little feedback about why you aren’t using hardware acceleration, just that you aren’t and it’s bad.

And it’s bad. This thing was pegging its 8-core i7 4.7 GHz processor to literally do nothing but show me a terminal window. Also, sound didn’t work, because we live in the HDMI era and sound is a function of video. The usual answer is, “You need to install card-specific video drivers,” except that’s what happens if you have a Fancy GPU, not if you have normal on-board Intel video. There’s an applet you can pull up to check for hardware drivers, and it just told me I didn’t need any, which is what I expected: Intel video drivers are built into the kernel. Finally – possibly after I started this process – I noticed that there’s a specific “Ask the Intel GPU how it’s doing” command, and I ran that. And it failed, which was kinda predictable, but it failed in a useful way: with the message that the 630-series video chip I had was not supported until the Linux kernel 4.15. I’d recently updated the OS on that box, but only to a 2018 version, and its kernel was 4.4. So, roll forward from Linux Mint Sarah to Sylvia. No help. Sylvia to Tara. Tara to Tricia. Tricia has a 4.15 kernel but it still doesn’t work. I am on the edge of a breakdown. Both of my children are pestering me because their network-attached music players don’t work because the computer that serves the music database is broken. Push it to Ulyana and it still doesn’t work. But finally, as I am about to cry myself to sleep, I see that there’s an upgrade outstanding: a 5.4 kernel. What the hell, I think, and install it.

Boom. Video drivers work now. Of course, I’ve basically Buck Rogers’d this computer through the whole of the modern “Make potentially system-breaking updates every six weeks” era, so my video card works, but basically nothing else does now.

The upgrade process itself had a few hiccups. Namely:

  • So, upgrading between OS releases in most Linux distributions is a little bit of a hack; they’d really prefer you start fresh each time. My systems are generally bespoke enough that I do not wish to do that if I can avoid it. In my distro of choice, Mint, there are two forms of upgrade. The upgrade between minor versions is easy: you just pick “Yes I really want to do the dangerous thing” from the update manager. If you’re on the last minor version, you have to use a special alternative updater to go to the next major version. So there is a lot of alternating back and forth, and every time, it disables some stuff and removes some stuff and warns you that it could easily destroy you and everyone you care about. Starting with Mint 19, you are required to have one specific backup program installed. You’re not required to use it. Just have it. It checks and refuses if you don’t.
  • Also you have to switch from mdm to lightdm when going from 18 to 19. It will walk you through this but it will not explain that just replacing mdm with lightdm won’t leave you with a working computer. Because you also need to install and configure a “greeter” for lightdm. The default lightdm greeter is super ugly. Also, automatic logins work differently in lightdm, and no one will explain this to you either. Even if you configure a particular user to auto-login, it will just not do it and not tell you why unless the user is also part of the autologin group.
  • I use a package called nut to monitor the status of my UPS. This isn’t the first time I’ve run into this on an upgrade: one of the nut config files gets replaced with a stock version which defaults the minimum number of power supplies to 1, and it won’t start like that. I think the deal is that the updates are asynchronous, so when you start it, there are always zero power supplies connected until the first update completes, so restarts always fail in that configuration. This prevents updating the packages.
  • The mysql options relating to the query cache size are not supported any more, so the mysql server wouldn’t restart until I edited the config file. This prevented anything which depends on mysql from updating.
  • By the way, at some point mysql decided that “system” is a reserved word. Hope you weren’t using it as a column name.
  • Also, at some point mysql decided a SELECT DISTINCT query could only sort on fields you were selecting on. Hope you weren’t doing functions on them.
  • I have a highly customized version of mythtv, so I don’t want to reinstall the stock version. The installer of course replaced my init scripts to start the default version, which was easy enough to fix. Harder was the fact that half the dependencies are no longer available, most notably, the Qt 4 libraries are just gone. You should be using Qt 5. You rely on something which hasn’t been updated for Qt? Well fuck you then. I found a PPA here that had forward-ports  of everything I needed except the MySQL driver. For that, I just downloaded the package from an older Debian and forced it to install despite the dependencies not being right. This required hand-fiddling the dpkg database.
  • There is also a dependency on an older version of VAAPI. I just made a symlink from the old filename to the new one and seem to have gotten away with it.
  • Or maybe not. Mythtv is working as well as ever except whenever I try to use the OSD api to pop a message up on screen, the frontend crashes. It displays the message, sure, but it crashes right after. No clue why; the logs are incredibly unhelpful about this, refusing to provide me with a stack trace or anything.
  • Synergy is an application for sharing a keyboard and mouse across multiple computers. This is a huge deal for a media center-type computer where you don’t want to have to walk over to the keyboard all the time. And it too is gone. Apparently the Synergy developers are dicks or something? Anyway, I managed to find a compatible package here.
  • The music player daemon’s config file was quietly replaced with a non-functional one. Joy.
  • Perl libraries I installed from apt survived fine, but ones where I had to get them out of CPAN did not. So I have to manually reinstall ZOOM, a protocol used by the Library of Congress, because I’ve got some CGI scripts running that need it.
  • Speaking of CGI Scripts, systemd now runs apache with a private /tmp, so CGI scripts can’t drop files there for someone else to use. And sure, that is a reasonable security thing to do, but a little warning would’ve been nice.
  • Might as well mention since I’ll need it next time I do this dance: you need to blow away clearlooks-phenix or else any time you forward an X window, its GTK theming will be all donked up.
  • Also you need to recreate root’s .Xauthority by running xauth list as the logged-in user and then feeding each entry it prints to xauth add with sudo.

Almost everything is mostly-working now. Still haven’t gotten the notification popups to not-crash, but you can’t have everything. The new box is much peppier, has a huge amount of ram, and can be mounted right-side-up without making a grinding sound, so I think this will do me for a while.

I hope.

menin aeide thea peleiadeo achilleos oulomenen

It turns out I never bothered posting the second half of the article on “The Raising of Lazarus“? Damn. Now I have to find the draft and redo all the jokes because IIRC they were mostly gallows humor about the election.

Anyway, I’m not doing that this week on account of this:

This picture is blurry because I am shaking with rage.

At about 6:15 this evening there was a lightning strike nearby and my computer up and died. So much for the surge suppressor on the UPS.

I am cobbling together a temporary solution and sorting through big feelings now and will try to have something useful to say next week.

Oof.

A failed update of the geolocation database caused the anti-malware on the site to go tits-up for a few hours and just block everything. And then it took me another couple of days to notice that when tech support “fixed” it, they did so by blowing away part of the apache configs so that only the front page was working afterward. So that was fun. Here’s some pictures.

Bisy Backson

Evelyn wants me to finish the Winnie-the-Pooh story, but I’ve been busy with work, so instead, here’s a picture I took while walking around Wilde Lake.

There are a lot of signs around Wilde Lake warning not to feed the geese. Many posted by the local government, but some not. I sense there was an incident. 

 


Ross Codes: Pathological Switches

Some years ago, a then-colleague described me as “A Developer with a big D,” which he quickly corrected to a capital D. Anyway, today I’d like to tell you about how C is insane.

So, an idiom not uncommon in the C programming language is the “switch” statement. It looks like this:

switch(x) {
case 1:	printf("Hello\n");
case 2: printf("World\n");
default: printf("Bye\n");
}

This all looks very sane and fine and if you were raised in the world of normal programming languages written in the 21st century by people who weren’t suspicious that the metric system was an agent of global communism, you might assume that what you’re looking at is just syntactic sugar and semantically, it’s the same as this:

if (x==1) printf("Hello\n");
else if (x==2) printf("World\n");
else printf("Bye\n");

Alas, you poor ignorant fool. If you know anything at all about C, you should know that C is not long on “syntactic sugar”.

C does have, to my knowledge, one piece of completely pure syntactic sugar. And it is, of all things, the array index operator. Yeah. The array index operator is exactly literally the same as pointer addition. That is, a[b] means exactly the same thing as *(a+b). Which doesn’t sound like a big deal until you remember that addition is commutative, and consequently, a[b] is the same as b[a]. Or worse, a[4] is the same as 4[a]. This is a good way to confuse someone with code. To make matters even worse, there’s reason to suspect that the reversed syntax was the original intention since in assembly language, the syntax [ax] means “Not ax itself; the memory cell whose address is currently in ax.”

No, a switch statement is actually a calculated GOTO. This isn’t a semantic control structure; it’s a jump table. One consequence of this is that it has to be able to resolve all the cases at compile-time. That is, this won’t work:

y=2;
switch(x) {
	case 1:	printf("Hello\n");
	case y: printf("World\n");
	default: printf("Bye\n");
}

Try that in gcc, and you’ll be told “case label does not reduce to an integer constant”. In other words, “I can’t be sure what ‘y’ is until the program is actually running, and it’s too late by then.” (In this little code snippet, you can indeed be sure. But this is C, so some other thread might be able to modify the value of y after that assignment, because C is like programming by sticking your hands into a big box of broken glass.) So semantically, a switch statement is more like this:

if (x==1) goto case_1;
else if (x==2) goto case_2;
else goto case_d;
case_1: printf("Hello\n");
case_2: printf("World\n");
case_d: printf("Bye\n");

Only slightly more insane. Anyway, by now perhaps you have realized the thing that everyone donks up the first time they use the switch statement. If you run this code with x set to 1, it does the thing you told it to, but possibly not the thing you meant:

Hello
World
Bye

Yeah. It prints all three things. Because the code after the “case” label isn’t a block. The switch as a whole is. So execution just keeps going. To do the thing you probably meant, you actually want this:

switch(x) {
	case 1:	printf("Hello\n");
		break;
	case 2: printf("World\n");
		break;
	default: printf("Bye\n");
		break;
}

That last break isn’t strictly necessary, but it’s a good idea in case you add more code later. At this point, people coming from languages that were engineered by people not alive during the height of the popularity of LSD would probably say that this is dumb and it should just break automatically because clearly that’s what you meant. And loads of languages do this. They even allow freeform expressions in the cases so that it really is just a sugar coating over a series of if statements. Except, of course, this is C, and we wouldn’t give you a construct which was exactly as expressive as an if-else tree; the ability to fall-through isn’t a bug or an oversight; it’s the desired behavior. The reason for this, beyond a pathological hatred of first year computer science students, is that the switch statement’s “killer app” is not “Which of these should I do?” scenarios, but rather “How many of these should I do?”

for(x=0;x<5;x++) {
	printf("There was a farmer had a dog and Bingo was his name-o: ");
	switch(x) {
		case 0: printf("B");
		case 1: printf("I");
		case 2: printf("N");
		case 3: printf("G");
		case 4: printf("O");
	}
	printf(" and Bingo was his name-o.\n");
}

Yes, I can think of better implementations of the Bingo Algorithm. Shut up. My point is, the reason you use a switch and not a tree of if-then-else clauses is that you can treat the body of the switch as a big block that you can enter at various points depending on the combination of things that need doing, which makes it very elegant for situations where there’s a multistep process you need to perform on some piece of input, but possibly your input already had the first few steps taken care of by someone else earlier.
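For a slightly less silly illustration of that “some of the steps are already done” idea, here is a minimal sketch. The stage names and the function are made up for this example, so treat it as one way the pattern might look and not as anything canonical:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stages, invented for this illustration. */
enum stage { STAGE_RAW, STAGE_TRIMMED, STAGE_LOWERED };

/* Finish cleaning up buf in place, starting from whatever stage the
   caller says has already been reached. Each case deliberately falls
   through into the next one. */
void finish_processing(char *buf, enum stage done_so_far)
{
	size_t i, len;

	switch (done_so_far) {
	case STAGE_RAW:      /* nothing done yet: strip trailing whitespace */
		len = strlen(buf);
		while (len > 0 && isspace((unsigned char)buf[len - 1]))
			buf[--len] = '\0';
		/* fall through */
	case STAGE_TRIMMED:  /* already trimmed: lowercase everything */
		for (i = 0; buf[i] != '\0'; i++)
			buf[i] = (char)tolower((unsigned char)buf[i]);
		/* fall through */
	case STAGE_LOWERED:  /* already lowercased: swap spaces for underscores */
		for (i = 0; buf[i] != '\0'; i++)
			if (buf[i] == ' ')
				buf[i] = '_';
	}
}
```

Hand it a raw string with STAGE_RAW and all three steps run; hand it something you already cleaned up with STAGE_LOWERED and only the last step runs. No breaks anywhere, on purpose.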

But now, let’s go completely nuts… After all, a switch is just a structured goto, and the body is just a block. Which means…

switch(x) {
	case 1:	printf("Hello\n");
		if (y!=0) {
	case 2: printf("World\n");
		} else
	case 3: 
	break;
	printf("Surprise!\n");
	default: printf("Bye\n");
}

What? But yeah, C is fine with you doing this. So what happens? Madness.

X               Y                  Output (one item per line)
1               0                  Hello
1               Anything else      Hello, World, Surprise!, Bye
2               Anything at all    World, Surprise!, Bye
3               Anything           (nothing)
Anything else   Anything           Bye

I need a drink. Yes, if x is 1 and y is nonzero, you can just ignore the switch altogether: Print “Hello”, then execute the if block, printing “World”, skip over the else block, and print “Surprise” and “Bye” because that’s what’s next. If y is zero, instead of executing the if, we execute the else, which skips us to the end of the switch. Okay. But now, if x is 2, we leap right straight into the middle of the IF. We don’t bother looking at y. We print “World”. We ignore the “else”, and that’s a little confusing if you are a normal, good-hearted person or even a Perl programmer, because how do you know what to do about the else? But it’s not that complicated: the close brace at the end of the “if” block is basically an invisible “Go to the next line after the else”. So we do that, which takes us to the line where we print “Surprise” and then “Bye”, helpfully labeled “default”.  If x is 3, now things get even stranger, because we jump in between the “else” and the “break”. But C, like a honey badger, don’t care. It’s just a line of code as far as the compiler’s concerned. So if x is 3, we jump right in front of a “break” and exit the switch without printing anything.

Does your brain hurt? Because notice that when x was 2, we fell through from the 2 case through the 3 case to get to the default case. The line that prints “Surprise” is clearly part of the 3 case, yet it can’t be reached in the 3 case.
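If you don’t believe the table, here is a sketch you can poke at. It is the same switch with the same control flow, except that it appends to a buffer instead of calling printf so the result can be checked. The function name and the buffer convention are my own inventions for this example:

```c
#include <string.h>

/* Same control flow as the switch above, but appending to out instead
   of printing. out must have room for at least 32 bytes. */
void pathological(int x, int y, char *out)
{
	out[0] = '\0';
	switch (x) {
	case 1: strcat(out, "Hello ");
		if (y != 0) {
	case 2: strcat(out, "World ");
		} else
	case 3:
	break;
	strcat(out, "Surprise! ");
	default: strcat(out, "Bye ");
	}
}
```

Calling pathological(2, 0, buf) leaves “World Surprise! Bye ” in buf, and pathological(3, 0, buf) leaves it empty. Expect the compiler to grumble about the indentation if you turn warnings on, and it has a point.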

It gets worse. Let’s go back to our first example and do something naughty:

switch(x) {
	case 1:	printf("Hello\n");
		int w[6];
	case 2: printf("World\n");
		printf("%zu\n", sizeof(w));
	default: printf("Bye\n");
}

Now here, if you come from a reasonable sort of language like Perl, you’re troubled by this. If you come from Python, this might even wipe that smug smirk off your face, because even though Python doesn’t make you declare your variables, you probably get an uncomfortable tingle in your spine when you run into code where the scope of variables is nontrivial.

The scary thing, of course, is that if x is 2, you’ve jumped over the declaration of w. So does w even exist? Is w defined one way if x is 2 and you skip straight to case 2, but a different way if x is 1 and you fall through from case 1?

Don’t be ridiculous. This is C. “int w[6]” doesn’t really create anything. Here’s the assembly language emitted by the C compiler in-line between the two print statements:

[this space intentionally left blank]

You get a clue to what’s going on if you remove the first print statement. GCC will issue this error:

a label can only be part of a statement and a declaration is not a statement

Yeah. Declarations aren’t statements. They don’t compile into executable code. Instead, they’re instructions to the compiler about how it should arrange memory. “int w[6]” doesn’t mean “Go create a Thing which is an Array, having 6 cells for holding ints”. What it means is actually, “By the way, when creating the stack frame for this function, leave 6*sizeof(int) contiguous bytes there, which I will refer to as w.”

Being able to just shove in a variable declaration anywhere you like is a comparatively new addition to C (it became standard with C99). It wasn’t standard back when I learned this bullshit in undergrad a very long time ago. Used to be you could only do it at the beginning of a block. I’m not even sure that was ever standard, but GCC used to allow it. Originally originally, you could only do it at the start of a function. And that’s what’s going on here. That declaration is not an order to allocate memory; it’s part of the surrounding function’s context. Mechanically, the “int w[6]” really “happens” back at the top of the function – that’s when the memory is allocated. Its placement within the switch statement is only defining its lexical scope – it’s telling the compiler which lines of source code are allowed to use the symbol “w” to refer to that block of memory; it has no meaning at run-time. (You can prove this, if you like, by sticking a trivial variable assignment on either side, say “q=101; int w[6]; q=202;” and compiling with gcc’s “-S” option. Even if you can’t read assembly – and a C compiler’s output can be particularly irascible – you’ll be able to see the 101 and 202 on consecutive lines.)
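If you want to try that experiment, a file along these lines should do. The function name is made up, and I’m assuming you compile without optimization (something like gcc -S -O0), since an optimizer is free to throw the first store away entirely:

```c
/* Compile with: gcc -S -O0 on this file, then look for 101 and 202
   in the resulting .s file. */
int decl_demo(void)
{
	int q;

	q = 101;
	int w[6];   /* a declaration, not a statement: no code is emitted here */
	q = 202;

	w[0] = q;   /* touch w so the compiler doesn't complain it's unused */
	return w[0];
}
```

On the setups I’d expect most people to have, the two stores to q come out back to back, with nothing for the declaration in between. The exact instructions will vary with your compiler and target.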

Well… Usually. Most of the time. Often. Because, of course, if you are an incredible dickwad, you can pull a stunt like this:

y+=10;
int z[y];

And now you’re boned. Because it can’t just go ahead and allocate that stack space as part of the function call, since it doesn’t know the value of y until runtime. If this were a sane language like Java, you’d just throw your hands up and say, “Actually no. Arrays shouldn’t live on the stack. z is really a reference to an array object which lives on the heap.” and C would laugh at you because that is nonsense and garbage collection is for hippies. That array has to live on the stack, because it has to have stack semantics. And the only way we can do that is if we embiggen the stack at runtime. If we do the same trick as before of emitting the assembly code, we’ll see that when you get to the place where we do the “int z[y]”, the assembly… Is a lot, really. The key thing happens all the way at the end, though, when it moves the stack pointer to accommodate the extra space, and then shoves a value into a local variable.

And here’s one thing that is subtle and different. It’s the one difference between a directly-declared non-variable-length local array and a pointer, and it’s a distinction that as far as I know, you can’t do anything with at the level of the C programming language. To wit: z is a local variable holding the memory address on the stack where the contiguous block of memory is. w is not. w is just the address of the contiguous memory block. When you access w[2], the code that’s emitted is, roughly, “add 2 * sizeof(int) to the constant value w. Add that to the base pointer, and look in that memory cell.” (Though it’s probably “subtract” rather than “add” because stack pointers are like electrons and the sign is wrong for historical reasons). When you do z[2], the code that comes out is “Add the constant value z to the base pointer and get the value from that memory cell. Add 2*sizeof(int) to that, and look in that memory cell.”  Basically, z behaves (at runtime) as it would if we’d declared it as “int *z = w”. Does this matter ever? I mean, it’s a couple more opcodes, but it’s 2021 and you absolutely do not care about that if you’re writing C. Possibly in some very pathological cases there’s some differing behavior you could trigger?

Speaking of pathological cases, how does all this nonsense figure into the switch? Well, let’s try it:

switch(x) {
	case 1:	printf("Hello\n");
		int w[y];
	case 2: printf("World\n");
		printf("%zu\n", sizeof(w));
	default: printf("Bye\n");
}

And now we’ve finally found something so awful that even C won’t let you do it. Try to compile that, and C will politely tell you to get stuffed:

error: switch jumps into scope of identifier with variably modified type;

The compiler has noticed that you’ve dicked around with the size of the stack frame, and you should not be allowed to jump over that. This isn’t the message it used to print, for what it’s worth. When I first discovered this behavior way back in undergrad, the message it printed was rather more wonderful: “Can’t jump across dynamic memory contour.”

What would happen if you could? Here, dear reader, I must only say that I don’t know. The specification says only that we are not allowed to do it; it does not attempt to justify this decision. Remember when I said that the variably-modified type of the array made it very slightly and subtly different from a normal statically-sized array? That it’s really more like a pointer? Well, if you really want to jump over a stack allocation, in gcc at least (It’s not standard, and now that variably-modified types are a thing, those are recommended instead), you can do it like this:

switch(x) {
	case 1:	printf("Hello\n");
		int *w;
		w= (int *) alloca(y*sizeof(int));
	case 2: printf("World\n");
		printf("%zu\n", sizeof(w));
	default: printf("Bye\n");
}

This is legal and it does the same thing as the previous code snippet, and the assembly language it compiles into is almost the same (I don’t actually understand the differences myself, but I suspect that they would optimize into a closer match if I turned on optimizations). Why does gcc allow this and not the other one? Remember, it didn’t allow the other form even before there was a specification telling it not to.

My best guess is that it offends the compile-time type checking. In particular, what does that penultimate printf print?

Well, now we get to that small oddity of how a locally-declared array is not quite the same as a pointer. Because let’s get rid of the switch and interrogate those variables…

int w[y];
int *z = (int *) alloca(y*sizeof(int));
printf("%zu, %zu\n", sizeof(w), sizeof(z));

What do you suppose comes out of that? Remember, under the hood, the generated code for w and z is pretty much the same. They’re both represented as a value on the stack which contains the address of another location on the stack. And yet, on a 64-bit system with 32-bit integers with y set to, say, 6, this prints “24, 8”: sizeof(w) is the size of the whole array – six four-byte ints – while sizeof(z) is the size of a pointer. The compiled code handling that “sizeof” is straightforward, if not all that interesting. sizeof(z) fetches the literal constant 8; sizeof(w) fetches the value that was stored in a register when the allocation occurred.
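Here is a self-contained sketch of the same comparison that you can call from a test. The function name is invented, and I’ve swapped the alloca call for a plain pointer to the array’s first element so it doesn’t lean on a nonstandard header. It still needs a compiler that supports variable-length arrays, which gcc does:

```c
#include <stddef.h>

/* Report what sizeof says about a variable-length array of y ints and
   about a pointer to its first element. y must be greater than zero. */
void vla_sizes(int y, size_t *array_size, size_t *pointer_size)
{
	int w[y];     /* variable-length array: size only known at run time */
	int *z = w;   /* ordinary pointer to the same stack memory */

	*array_size = sizeof(w);    /* computed at run time: y * sizeof(int) */
	*pointer_size = sizeof(z);  /* compile-time constant: sizeof(int *) */
}
```

Call vla_sizes(6, &a, &p) and a comes back as six ints’ worth of bytes while p is just the size of a pointer, whatever those happen to be on your machine.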

As far as I know, this is the only reason you can’t jump around the allocation: the compile-time type information would be wrong. If you could jump over the initialization of w, sizeof(w) wouldn’t be consistently defined. It’s not actually a problem for memory allocation at runtime: the stack frame will clean itself up when the function returns regardless of whether the allocation happened or not. This isn’t C++ where we need to be able to run destructors when the variable leaves scope.

But maybe there’s some other more subtle problem you could cause this way? Any C gurus out there with even darker wisdom than my own?