Wednesday, February 19, 2014

Type "n" for Safety! or Strncpy() Considered Risky

Everybody knows that "strncpy()" is the "safe" version of "strcpy()", right?  No chance of going past the destination buffer, right?

Um ...

Here's a nice code fragment I found in some old code (names changed to protect the innocent):

    char foo[MAX_LINE_LEN] = "";
    ...
        case 'F':
            strncpy(foo, optarg, sizeof(foo));
            foo_str = foo;
            foo_len = strlen(foo);

The "n" in "strncpy()" makes sure that if the "optarg" is longer than "foo", it only copies "MAX_LINE_LEN" characters.  No buffer overrun.  Nice and safe, right?

But look at what the man page for "strncpy()" says:
Warning:  If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
If optarg has "MAX_LINE_LEN" or more non-null characters, "foo" ends up without a null termination!  What happens to the "strlen(foo)"?  It runs right off the end and sooner or later will almost certainly result in a seg fault.  (I heard from a co-worker that we had to fix a lot of instances of this same bug in our code base.)

---

You could fix this bug by adding one line below the "strncpy()":
            foo[sizeof(foo)-1] = '\0';
But I don't like this either.  It leaves intact a different objection I have with the original code: it silently truncates something that the user might not want truncated.  Good tools should not silently do anything different than what the user requested.

So, how about this:
        case 'F':
            foo[sizeof(foo)-1] = '\0';
            strncpy(foo, optarg, sizeof(foo));
            if (foo[sizeof(foo)-1] != '\0') {
                fprintf(stderr, "Warning message!");
                foo[sizeof(foo)-1] = '\0';
            }
GAH!  I've seen code like this and I don't like it.  That code is non-obvious.

Here's a tip for writing good code.  How would you verbally describe the basic high-level approach to somebody?  Would you say:
First set a null at the end of the buffer.  We do this because we want to find out if the strncpy function truncates the string.  If so, then that null will be overwritten with the last character copied.  So we after copying, we check that last character; if the null was overwritten, then we warn the user that it was truncated, and then we add the final null (thus truncating it even further).
No, you wouldn't.  You would say something more like:
If the string fits, copy it.  If it doesn't, warn the user and copy in as much as fits, and null-terminate it.
How about this:
        case 'F':
            if (strlen(optarg) < sizeof(foo))
                strcpy(foo, optarg);
            else {
                fprintf(stderr, "Warning message!");
                memcpy(foo, optarg, sizeof(foo)-1);
                foo[sizeof(foo)-1] = '\0';
            }
            ...

I like this better; it flows almost exactly as the description does.

And yes, there is an irrational part of me that wants to use "strncpy()" inside the "if" statement, even though it isn't needed.  You're just "supposed to".  But "strcpy()" IS more efficient, and it is possible to go overboard with the whole "belt and braces" thing.  However, I would not object to somebody using "strncpy()" there.

I guess the moral of the story is: just because the function has an "n" in it doesn't mean that people are going to use it safely.

No comments: