The Meditative Coder: 2018

Wednesday, September 12, 2018

Goodby my Wiki

This post is a little late in coming as I made the change earlier this year.

I used to use a hosting service, suso.com, for my main website and email. It ran a message board I built using Perl, and it also ran a older version of wikimedia for a personal wiki. But the service cost a fair amount, and neither the message board nor the wiki was being used much. So to save money, I cancelled it and moved the content to github.

The content is still available at geeky-boy.com, but I dumped the wiki pages to a flat directory. If I want to edit them, I need to edit the raw HTML.

I somewhat mourn the loss of my wiki. I like wikis for certain kinds of content. It is especially powerful for collaborative efforts, particularly for geographically-separated teams. But even for single-user personal-use, it presents such a low barrier to use. If I want to update a page, it's just a few button clicks away. And the update process is easy (assuming minimal use of fancy wiki markup). And it's easy to see change history, roll back changes, etc.

Contrast this with web pages on github where you have to edit them locally in HTML, check in the changes, and sync with "gh-pages" branch to make them live. It takes longer, and requires specialized software. E.g. I can't easily do it from a phone or tablet.

There are "free" wikis out there, but I don't like the ads, and most of them use non-wikimedia software; the few I've tried I haven't liked.

Maybe someday I will try some kind of third-party content management system. Or maybe I'll write my own wiki software as a fun personal project (maybe do the rendering of the markup in the browser in Javascript). Or maybe I just don't really need a wiki. Long ago, I imagined that the blog and the wiki would compliment each other, with content in each referring to content in the other. Blog for "news", wiki for "content". But it hasn't worked out that way.

So rest-in-peace wiki.geeky-boy.com. You were fun while you lasted.

Tuesday, September 11, 2018

Safe sscanf() Usage

The scanf() family of functions (man page) are used to parse ascii input into fields and convert those fields into desired data types. I have very rarely used them, due to the nature of the kind of software I write (system-level, not application), and I have always viewed the function with suspicion.

Actually, scanf() is not safe. Neither is fscanf(). But this is only because when an input conversion fails, it is not well-defined where the file pointer is left. Fortunately, this is pretty easy to deal with - just use fgets() to read a line and then use sscanf().

You can use sscanf() safely, but you need to follow some rules. For one thing, you should include field widths for pretty much all of the fields, not just strings. More on the rules later.

Input Validity Checking with sscanf()

I am a great believer in doing a good job of checking validity of inputs (see strtol preferred over atoi). It's not enough to process your input "safely" (i.e. preventing buffer overflows, etc.), you should also detect data errors and report them. For example, if I want to input the value "21" and I accidentally type "2`", I want the program to complain and re-prompt, not simply take the value "2" and ignore the "`".

One problem with doing a good job of input validation is that it often leads to lots of checking code. In strtol preferred over atoi, my preferred code consumes 4 lines, and even that is cheating (combining two statements on one line, putting close brace at end of line). And that's for a single integer! What if you have a line with multiple pieces of information on it?

Let's say I have input lines with 3 pipe-separated fields:

id_num|name|age

where "id_num" and "age" are 32-bit unsigned integers and "name" is alphabetic with a few non-alphas. We'll further assume that a name must not contain a pipe character.

Bottom line: here is my preferred code:

#define NAME_MAX_LEN 60
. . .
unsigned int id;
char name[NAME_MAX_LEN+1]; /* allow for null */
unsigned int age;
int null_ofs = 0;

(void)sscanf(iline,
"%9u|" /* id */
"%" STR(NAME_MAX_LEN) "[-A-Za-z,. ]|" /* name */
"%3u\n" /* age */
"%n", /* null_ofs */
&id, name, &age, &null_ofs);

if (null_ofs == 0 || iline[null_ofs] != '\0') {
printf("PARSE ERROR!\n");
}

Error Checking

My long-time readers (all one of you!) will take issue with me throwing away the sscanf() return value. However, it turns out that sscanf()'s return value really isn't as useful as it should be. It tells you how many successful conversions were done, but doesn't tell you if there was extra garbage at the end of the line. A better way to detect success is to add the "%n" at the end to get the string offset where conversion stopped, and make sure it is at the end of the string.

If any of the conversions fail, the "%n" will not be assigned, leaving it at 0, which is interpreted as an error. Or, if all conversions succeed, but there is garbage at the end, "iline[null_ofs]" will not be pointing at the string's null, which is also interpreted as an error.

STR Macro

More interesting is the "STR(NAME_MAX_LEN)" construct. This is a specialized macro I learned about on StackOverflow:

#define STR2(_s) #_s
#define STR(_s) STR2(_s)

Yes, both macros are needed. So the construct:
"%" STR(NAME_MAX_LEN) "[-A-Za-z,. ]"
gets compiled to the string:
"%60[-A-Za-z,. ]"

So why use that macro? If I had hard-coded the format string as "%60[-A-Za-z,. ]", it would risk getting out of sync with the actual size of "name" if we find somebody with a 65 character name. (I would like it even more if I could use "sizeof(name)", but macro expansion and "sizeof()" are handled by different compiler phases.)

Newline Specifier

This one a little subtle. The third conversion specifier is "%3u\n". This tells sscanf() to match newline at the end of the line. But guess what! If I don't include a newline on the string, the sscanf() call still succeeds, including setting null_ofs to the input string's null. In my opinion, this is a slight imperfection in sscanf(): I said I needed a newline, but sscanf() accepts my input without the newline. I suspect sscanf() is counting newline as whitespace, which it knows to skip over.

If I *really* want to require the newline, I can do:

if (iline[null_ofs - 1] != '\n') then ERROR...

Integer Field Widths

Finally, I add field widths to the unsigned integer conversions. Why? If you leave it out, then it will do a fine job of converting numbers from 0..4294967295. But try the next number, 4294967296, and what will you get? Arithmetic overflow, with a result of 0 (although the C standards do not guarantee that value). And sscanf() will not return any error. So I restrict the number of digits to prevent overflow. The disadvantage is that with "%9u" you cannot input the full range of an unsigned integer. An alternative is to use "%10Lu" and supply a long long variable. Then range check it in your code. Or you could just use "%8x" and have the user input hexidecimal. Heck, you could even input it as a 10-character string and use strtol()!

I guess "%9u" was good enough for my needs. I just wish you could just tell sscanf() to stop converting on arithmetic overflow!

Not Perfect

I already mentioned arithmetic overflow. Any other problems with the above code? Well, yeah, I guess. Even though I specify "id" and "age" to be unsigned integers, I notice that sscanf() will allow negative numbers to be inputted. Sometimes we engineers consider this a feature, not a bug; we like to use 0xFFFFFFFF as a flag value, and it's easier to type "-1". But I admit it still bothers me. If I input an age of -1, we end up with the 4-billiion-year-old man (apologies to Mel Brooks).

Tuesday, September 4, 2018

Safe C?

A Slashdot post led me to some good pages related to C safety.

https://outflux.net/slides/2018/lss/danger.pdf - slides from a presentation made by Linux kernel developer Kees Cook. Which led me to https://www.kernel.org/doc/html/latest/security/self-protection.html. I need to look more closely at all this material.
https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html, which led me to https://blog.regehr.org/archives/1307 - I didn't know there was so much "undefined" behavior! Also, while I may not buy the tee shirt, I absolutely love the graphic!

(Update: actually I did go ahead and order the tee shirt.)

Monday, August 27, 2018

Safer Malloc for C

Here's a fragment of code that I recently wrote. See anything wrong with it?

#define VOL(_var, _type) (*(volatile _type *)&(_var))
. . .
lbmbt_rcv_binding_t *binding = NULL;
. . .
binding = (lbmbt_rcv_binding_t *)malloc(sizeof(lbmbt_rcv_t));
memset(binding, 0xa5, sizeof(lbmbt_rcv_t));
binding->signature = SIGNATURE_OK;
. . .
VOL(binding->signature), int) = SIGNATURE_FREE;
free(binding);

No, the problem isn't in that strange VOL() macro; that is explained here.

The problem is that I am trying to allocate an "lbmbt_rcv_binding_t" structure, but because of cut-and-past programming and being in a hurry, I used the size of a different (and smaller) structure: lbmbt_rcv_t. Since the "signature" field is the last field in the structure, the assignments to it write past the end of the allocated memory block. GAH!

But we all knew that C is a dangerous language, with its casting and sizeof just begging to be used wrong. But could it be at least a *little* safer?

A Safer Malloc

#define VOL(_var, _type) (*(volatile _type *)&(_var))

/* Malloc "n" copies of a type. */
#define SAFE_MALLOC(_var, _type, _n) do {\
  _var = (_type *)malloc((_n) * sizeof(_type));\
  if (_var == NULL) abort();\
} while (0)
. . .
lbmbt_rcv_binding_t *binding = NULL;
. . .
SAFE_MALLOC(binding, lbmbt_rcv_t, 1);
memset(binding, 0xa5, sizeof(lbmbt_rcv_t));
binding->signature = SIGNATURE_OK;
. . .
VOL(binding->signature), int) = SIGNATURE_FREE;
free(binding);

The above code still has the bug -- it passed in the wrong type -- but at least it generates a compile warning. The cast of malloc doesn't match the type of the variable. Since I insist on getting rid of compiler warnings, that would have flagged the bug to me.

If you don't need portability, you can even use the gcc-specific extension "typeof()" in BOTH macros:

#define VOL(_var) (*(volatile typeof(_var) *)&(_var))

/* Malloc "n" copies of a type. */
#define SAFE_MALLOC(_var, _n) do {\
  _var = (typeof(*_var) *)malloc((_n) * sizeof(typeof(*_var)));\
  if (_var == NULL) abort();\
} while (0)
. . .
lbmbt_rcv_binding_t *binding = NULL;
. . .
SAFE_MALLOC(binding, 1);
memset(binding, 0xa5, sizeof(lbmbt_rcv_t));
binding->signature = SIGNATURE_OK;
. . .
VOL(binding->signature) = SIGNATURE_FREE;
free(binding);

Now the malloc bug is gone ... by design!

Safer Malloc and Memset

Finally, notice that the memset is also wrong. Since I frequently like to init my malloced segments to a known pattern (0xa5 is my habit), I can define a second macro:

#define VOL(_var) (*(volatile typeof(_var) *)&(_var))

/* Malloc "n" copies of a type and set its contents to a pattern. */
#define SAFE_MALLOC_SET(_var, _pat, _n) do {\
  _var = (typeof(*_var) *)malloc((_n) * sizeof(typeof(*_var)));\
  if (_var == NULL) abort();\
  memset(_var, _pat, (_n) * sizeof(typeof(*_var)));\
} while (0)
. . .
lbmbt_rcv_binding_t *binding = NULL;
. . .
SAFE_MALLOC_SET(binding, 0xa5, 1);
binding->signature = SIGNATURE_OK;
. . .
VOL(binding->signature) = SIGNATURE_FREE;
free(binding);

No more bugs.

P.S.

It took me WAY longer than it should have to track this down (an entire weekend). Why? The bug manifested as a segmentation fault. So I figured I could just valgrind it. But lo and behold, I didn't have Valgrind on my Mac. "No Problem," sez I. I sez, "I'll just install it."

$ brew install valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than Sierra due to an upstream incompatibility.

GAH! "Still no problem," sez I. I sez, "I'll just debug it on Linux."

No seg fault. Program worked fine. Valgrind makes no complaint. So I spent a the weekend doing a divide-and-conquer trackdown of the bug in a large codebase to find the "Mac-specific" bug. And finally found the bad malloc.

But wait! That's not Mac-specific! What's going on here? After wasting some more time, I finally printed the sizeof() for each structure type. On Mac, the sizes are different. On Linux, the structures are the same size. Of course valgrind on Linux said it was OK -- the malloc size was correct on Linux!

And now I have Valgrind on my mac which pinpointed the bug immediately. (Thanks Güngör Budak!).

The Meditative Coder