Monday, April 28, 2014

Cute "sed" Trick: rotate file

I wanted a script that would invoke a process, passing in a port number from a circular pool of ports.  When the process exits and restarts, I want the script to pass in the *next* port from the pool.

    PORT=`head -1 port_file.txt`
    sed -i -e '1h;1d;$G' port_file.txt

In the above "sed" command:

  • The "-i" causes the file "port_file.txt" to be edited in-place.
  • The "1h" sed command yanks the first line into the "hold" space.
  • The "1d" sed command deletes the first line (prevents it from being output).
  • The "$G" command appends the hold space after the last line of the file.

Thus, given that "port_file.txt" contains:
    12000
    12001
    12002
    12003
the above two commands will leave the file with:
    12001
    12002
    12003
    12000
and "PORT" set to 12000.

Monday, April 21, 2014

Old C Coding Habits Die Hard

Old habits die hard.

In their 1978 book "The C Programming Language", Brian Kernighan and Dennis Ritchie described a version of C which has since become known as "K&R" C.  For those of you who aren't older than dirt, K&R C differed in many ways from modern C.  For example:

    foo(c, v)
    int c;
    char **v;
    {

Whoa!  What's that?  Original K&R C didn't let you declare the formal parameter types in-line with the function definition; you had to declare them between the ")" and the "{".  Also, functions defaulted to being of type "int".

Lots of fossilized programmers like me were forced into habits in the old days which have outlived their need.  Now, a lot of younger programmers getting out of school have as their first real-world experience the job of maintaining that old code, learning those same obsolete habits through osmosis.


LOCAL VARIABLES

In K&R C, local variables *had* to be declared at the top of the function.  With C89, locals could be declared at the start of any compound statement (i.e. after any "{").  Finally, as of C99 locals could be declared basically anywhere.

In my opinion, it makes sense for a variable to be declared and initialized immediately before (or very close to) its first usage.  Doing this conveys valuable information for a future code maintainer: this variable isn't used prior to this line.  I can't tell you how often I spent time doing reverse searches for a variable to see where else it is being used.

Here's some "old habits" code:

    foo_find(...)
    {
        int found = 0;
        ...tens of lines...
        while (! found) {

With this, you might go straight to the top of "foo_find()" to see how "found" is declared and initialized, but then you still have to search the tens of intervening lines to see how "found" might have been changed.

I prefer this:

    foo_find(...)
    {
        ...tens of lines...
        int found = 0;
        while (! found) {

Sure, there might be a 20-year-old compiler which can't handle it, but anybody using a 20-year-old compiler has bigger problems than this.


GLOBAL VARIABLES

Another habit which I think might be a K&Rism: declaration of global variables at the top of the file. I might be wrong, but I suspect that K&R C did not allow global variables to be declared after function definitions.  As with local variables, this restriction is no longer in force.  So, if you have a group of functions which implement an abstraction that uses globals, but the abstraction is not significant enough to justify putting them in their own file, I think it makes sense to put the globals associated with the abstraction in front of the first implementation function.  For example:

    #define FOO_MAX_NUM 67
    typedef struct {...} foo_t;
    foo_t foo_storage[FOO_MAX_NUM];
    static int foo_num_stored = 0;

    foo_t *foo_create(...)
    {

This code, including the #define, the typedef, and the global variables, may well be positioned after other functions are already defined.  Once again, it helps a code reader know that the definitions are not used in the preceding code.


INCLUDE FILES

In the above example, "FOO_MAX_NUM" and "foo_t" are defined close to the code, instead of at the top of the file.  But many programmers wouldn't even put them at the top of the file; they would put them in an include file "foo.h".

I don't think this is a K&Rism.  It's just a habit to put all typedefs and #defines in the include file.  But again, I advocate for keeping things as localized as possible.  Definitions and declarations which are internal implementation details to an abstraction should be hidden from general view.  C++ may do a better job of managing the external and internal details of an abstraction, but the guidelines should be followed in C as well.
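
As a rough illustration of that split, here is a minimal single-file sketch.  The first part is what a hypothetical public "foo.h" might expose; the rest is what would stay private in "foo.c".  The "value" field, the "int value" parameter and the "foo_value()" accessor are my own placeholders, not part of the earlier example.

```c
#include <stddef.h>

/* ---- what a hypothetical public "foo.h" would expose ---- */
typedef struct foo foo_t;          /* opaque: callers never see the fields */
foo_t *foo_create(int value);      /* "int value" is a placeholder parameter */
int foo_value(const foo_t *f);     /* placeholder accessor */

/* ---- internal details, kept in "foo.c" next to the code that uses them ---- */
#define FOO_MAX_NUM 67
struct foo { int value; };
static foo_t foo_storage[FOO_MAX_NUM];   /* static: invisible to other files */
static int foo_num_stored = 0;

foo_t *foo_create(int value)
{
    if (foo_num_stored >= FOO_MAX_NUM) {
        return NULL;               /* pool exhausted */
    }
    foo_t *f = &foo_storage[foo_num_stored++];
    f->value = value;
    return f;
}

int foo_value(const foo_t *f)
{
    return f->value;
}
```

Code outside this file can hold a "foo_t *" and call the two functions, but it cannot see "FOO_MAX_NUM", the storage array, or the struct layout.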

Wednesday, April 16, 2014

strtol preferred over atoi

We've all done it: parsed command-line parameters and converted numeric strings to integers using "atoi()".  And (hopefully) we've all felt guilty about it, because "atoi()" sucks.

Let's say I have a program, "blunjo", which takes a "-d" option with numeric debug level (0=no debug, 1=a little debug, 2=a lot of debug).  So I might use it like this:

    blunjo -d 1 input_file1 input_file2 ...

Inside the code I probably call "getopt()" and include the code fragment:

    case 'd':
        debug_opt = atoi(optarg);
        break;

So, what happens if the user forgets exactly how to use it and enters this:

    blunjo -d input_file1 input_file2 ...

In this case, "optarg" points at "input_file1", which "atoi()" happily converts to zero.  The program turns off debug and silently skips processing "input_file1".  It would be nice if it instead told the user that "input_file1" is an invalid integer and printed the usage string.

Enter "strtol()".  It's a little more complicated to use (also more flexible):

    long int strtol(const char *nptr, char **endptr, int base);

The Linux man page contains two interesting bits:

    If endptr is not NULL, strtol() stores the address of the first invalid character in *endptr. If there were no digits at all, strtol() stores the original value of nptr in *endptr (and returns 0). In particular, if *nptr is not '\0' but **endptr is '\0' on return, the entire string is valid.

    ... the calling program should set errno to 0 before the call, and then determine if an error occurred by checking whether errno has a non-zero value after the call.

So this is better code:

    case 'd': {
        /* Braces are needed: a declaration can't directly follow a case label. */
        char *p = NULL;
        errno = 0;
        debug_opt = strtol(optarg, &p, 10);
        if (errno != 0 || p == optarg || p == NULL || *p != '\0') {
            usage("Invalid numeric value for -d option");
        }
        break;
    }

Note that you must make sure that "optarg" is non-null before calling strtol.  If you use getopt and specify "d:" then getopt will guarantee a non-null "optarg".  But if you are parsing the command-line string yourself, beware of the user entering "-d" with nothing at all following it - the next "argv[]" will be null.  Also note that the "p==NULL" check is technically not necessary; so long as "optarg" is non-null, "p" will never be left at null.  However, given that I'm not responsible for the code that sets "p", it just seems like good practice to include the sanity check before dereferencing it.
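
If you would rather have a function than inline checks, the same tests can be wrapped up like this.  This is only a sketch; the name "parse_long_strict" and its 0/-1 return convention are my own invention, not a standard API.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 long, rejecting empty input,
 * trailing garbage, and out-of-range values.
 * Returns 0 on success (result stored in *out), -1 on any error. */
int parse_long_strict(const char *s, long *out)
{
    if (s == NULL || out == NULL) {
        return -1;
    }

    char *end = NULL;
    errno = 0;
    long val = strtol(s, &end, 10);

    if (errno != 0) {
        return -1;                 /* e.g. ERANGE on overflow */
    }
    if (end == s) {
        return -1;                 /* no digits at all */
    }
    if (*end != '\0') {
        return -1;                 /* trailing garbage after the number */
    }

    *out = val;
    return 0;
}
```

With this, "input_file1" and "12abc" are both reported as errors, whereas "atoi()" would quietly return 0 and 12 for them.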

Here's a macro to make it all even easier, handle 0x-prefixed hexadecimal, and even print a programmer-friendly error specifying the file:line of the call to it:

    #define SAFE_ATOL(a,l) do { \
      char *in_a = a; char *temp = NULL; long result; errno = 0; \
      if (*in_a == '0' && *(in_a+1) == 'x') \
        result = strtol(in_a+2, &temp, 16); \
      else \
        result = strtol(in_a, &temp, 10); \
      if (errno!=0 || temp==in_a || temp==NULL || *temp!='\0') { \
        fprintf(stderr, "%s:%d, Error, invalid numeric value for %s: '%s'\n", \
           __FILE__, __LINE__, #l, in_a); \
        exit(1); \
      } \
      l = result; /* "return" value of macro */ \
    } while (0)

Here's a usage of the macro:

    case 'd':
        SAFE_ATOL(optarg, debug_opt);
        break;

Note that on errors, it abruptly exits the program.

Finally, there is also "strtoll()" which returns a long long int, and has the same error-checking.  The functions "strtoul()" and "strtoull()" are similar but for unsigned.

EDIT: I'm pleased to discover that the function "inet_pton()" does a good job of error checking a dotted-decimal IP address.  For example, adding garbage to the end of a valid IP address is flagged as an error.
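
A minimal sketch of that check, for IPv4 only.  The wrapper name "is_valid_ipv4" is made up for illustration; "inet_pton()" itself is the standard call and returns 1 only when the whole string is a valid address.

```c
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 if s is a well-formed dotted-decimal
 * IPv4 address, 0 otherwise. */
int is_valid_ipv4(const char *s)
{
    struct in_addr addr;
    return inet_pton(AF_INET, s, &addr) == 1;
}
```

So "10.1.2.3" is accepted, while "10.1.2.3x", "10.1.2" and "256.1.2.3" are all rejected.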

EDIT2: I've enhanced the above macro in a few ways and put it on my github. See: https://github.com/fordsfords/safe_atoi

Monday, April 14, 2014

Password strength

I disagree with a lot of "password strength" measures.  Most measures want you to include upper and lower case, digits, special characters, etc.  I don't feel they are necessary, and don't give you as much "security" as you might think (like substituting zero for the letter "o").

Then along came Randall Munroe with an XKCD cartoon which does a much better job of explaining it than I ever could:

[XKCD cartoon: "Password Strength"]


Most of the password strength "meters" that you see on sites are based on the idea that digits, special characters, and mixed-case are the magic elixir for strong passwords.  I was quite dismayed to discover that most of them consider "P@ssw0rd" to be very secure, which is absurd.  Then I found zxcvbn:


Finally, a password strength meter which knows that "P@ssw0rd" is low security (score=0 of 4, crack time 0 seconds)!  Whereas "correcthorsebatterystaple" is very secure (score=4 of 4, crack time 65 years).

Another password method that I've heard hyped which I disagree with is the haystack approach.  According to this author, the password "D0g....................." (21 dots) is very strong.  This is ONLY true for brute-force password cracking, which is NOT how serious crackers work.  They do dictionary and repeated character analysis.  According to zxcvbn, "D0g....................." is weak (score=0, crack time 84 seconds).  YES!

One fly in all this ointment: many systems limit your password length, sometimes to as few as 8 characters.  This makes it very hard to use 4 random words, meaning that you probably need to go the random route.  For the 8 random character password "0ZhyUQ63", zxcvbn rates it 4 of 4, with centuries required to crack it.  Whereas "saytroll" is weak, with 22-second crack time.  (Note that "S@yTr0ll" is still weak, with a 7-minute crack time - so much for magic elixirs.)

BTW, my wiki has a somewhat longer article on password strength.

Monday, March 17, 2014

Environment for Cron Jobs

I never remember the environment variables that are set for "cron" jobs.  So I created a cron job on each of our platforms with the "set" command.

Unfortunately, this post is of limited usefulness.  Your system administrator can override the defaults, which would make my findings inaccurate for you.  And it is possible that some of the variables shown below already represent our local administrator's overrides.  Also, different versions of the OSes and of Bash will affect the symbols.  But it's a start.  :-)

I've highlighted the PATH since that is my most-often desired information.

Solaris

HOME=(account home dir)
IFS= 
LOGNAME=(account name)
MAILCHECK=600
OPTIND=1
PATH=/usr/bin:
SHELL=(account login shell)
TZ=(system timezone)

HP-UX

ERRNO=10
FCEDIT=/usr/bin/ed
HOME=(account home dir)
IFS=' 
'
LINENO=36
LOGNAME=(account name)
MAILCHECK=600
OLDPWD=(account home dir)
OPTARG
OPTIND=1
PATH=/usr/bin:/usr/sbin:.
PPID=(parent process ID)
PS2='> '
PS3='#? '
PS4='+ '
PWD=(account home dir)
RANDOM=(some random number)
SECONDS=0
SHELL=(account login shell)
TMOUT=0
TZ=(system timezone)
_=';'

I'm surprised to see "." in PATH.  I've always been taught that having "." in your path is dangerous, although most of the danger comes from having it early in your path (if you are "cd"ed into somebody else's directory, they might have an executable named "ls" which does an "rm -rf $HOME").

Linux

BASH=/bin/sh
BASHOPTS=cmdhist:extquote:force_fignore:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath
BASH_ALIASES=()
BASH_ARGC=()
BASH_ARGV=()
BASH_CMDS=()
BASH_LINENO=([0]="0")
BASH_SOURCE=([0]="(command)")
BASH_VERSINFO=([0]="4" [1]="1" [2]="2" [3]="1" [4]="release" [5]="x86_64-redhat-linux-gnu")
BASH_VERSION='4.1.2(1)-release'
DIRSTACK=()
EUID=(account effective UID)
GROUPS=()
HOME=(account home dir)
HOSTNAME=(system host name)
HOSTTYPE=x86_64
IFS=' 
'
LOGNAME=(account name)
MACHTYPE=x86_64-redhat-linux-gnu
OLDPWD=(account home dir)
OPTERR=1
OPTIND=1
OSTYPE=linux-gnu
PATH=/usr/bin:/bin
PIPESTATUS=([0]="0")
POSIXLY_CORRECT=y
PPID=(parent process ID)
PS4='+ '
PWD=(account home dir)
SHELL=(account login shell)
SHELLOPTS=braceexpand:hashall:interactive-comments:posix
SHLVL=2
TERM=dumb
UID=(account UID)
USER=(account name)
_=

I suspect that if a different login shell were chosen, a lot of symbols, especially "BASH_*", would not be there.

AIX

AUTHSTATE=compat
ERRNO=10
FCEDIT=/usr/bin/ed
HOME=(account home dir)
IFS=' 
'
LANG=en_US
LC__FASTMSG=true
LINENO=36
LOCPATH=/usr/lib/nls/loc
LOGIN=(account name)
LOGNAME=(account name)
MAILCHECK=600
NLSPATH=/usr/lib/nls/msg/%L/%N:/usr/lib/nls/msg/%L/%N.cat
ODMDIR=/etc/objrepos
OLDPWD=(account home dir)
OPTIND=1
PATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:/usr/bin/X11:/sbin:/usr/java14/jre/bin:/usr/java14/bin
PPID=(parent process ID)
PS2='> '
PS3='#? '
PS4='+ '
PWD=(account home dir)
RANDOM=(some random number)
SECONDS=0
SHELL=/bin/bash
TERM=dumb
TMOUT=0
TZ=(system timezone)
USER=(account name)
_=';'

At first, I thought that the PATH was being loaded from a .bashrc or something, but the PATH shown does not match the account's .bashrc, .profile, etc.  I wonder if perhaps the system admin overrode the default.

FreeBSD

HOME=(account home dir)
IFS=' 
'
LOGNAME=(account name)
OPTIND=1
PATH=/usr/bin:/bin
PPID=(parent process ID)
PS1='$ '
PS2='> '
PS4='+ '
SHELL=(account login shell)
USER=(account name)

MacOS

BASH=/bin/sh
BASH_ARGC=()
BASH_ARGV=()
BASH_EXECUTION_STRING=
BASH_LINENO=()
BASH_SOURCE=()
BASH_VERSINFO=([0]="3" [1]="2" [2]="48" [3]="1" [4]="release" [5]="x86_64-apple-darwin10.0")
BASH_VERSION='3.2.48(1)-release'
DIRSTACK=()
EUID=(account effective UID)
GROUPS=()
HOME=(account home dir)
HOSTNAME=(system host name)
HOSTTYPE=x86_64
IFS=' 
'
LOGNAME=(account name)
MACHTYPE=x86_64-apple-darwin10.0
OPTERR=1
OPTIND=1
OSTYPE=darwin10.0
PATH=/usr/bin:/bin
POSIXLY_CORRECT=y
PPID=(parent process ID)
PS4='+ '
PWD=(account home dir)
SHELL=(account login shell)
SHELLOPTS=braceexpand:hashall:interactive-comments:posix
SHLVL=1
TERM=dumb
UID=(account UID)
USER=(account name)
_=/bin/sh

I suspect that if a different login shell were chosen, a lot of symbols, especially "BASH_*", would not be there.

Wednesday, February 19, 2014

Type "n" for Safety! or Strncpy() Considered Risky

Everybody knows that "strncpy()" is the "safe" version of "strcpy()", right?  No chance of going past the destination buffer, right?

Um ...

Here's a nice code fragment I found in some old code (names changed to protect the innocent):

    char foo[MAX_LINE_LEN] = "";
    ...
        case 'F':
            strncpy(foo, optarg, sizeof(foo));
            foo_str = foo;
            foo_len = strlen(foo);

The "n" in "strncpy()" makes sure that if the "optarg" is longer than "foo", it only copies "MAX_LINE_LEN" characters.  No buffer overrun.  Nice and safe, right?

But look at what the man page for "strncpy()" says:

    Warning:  If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.

If optarg has "MAX_LINE_LEN" or more non-null characters, "foo" ends up without a null termination!  What happens to the "strlen(foo)"?  It runs right off the end and sooner or later will almost certainly result in a seg fault.  (I heard from a co-worker that we had to fix a lot of instances of this same bug in our code base.)

---

You could fix this bug by adding one line below the "strncpy()":
            foo[sizeof(foo)-1] = '\0';
But I don't like this either.  It leaves intact a different objection I have with the original code: it silently truncates something that the user might not want truncated.  Good tools should not silently do anything different than what the user requested.

So, how about this:
        case 'F':
            foo[sizeof(foo)-1] = '\0';
            strncpy(foo, optarg, sizeof(foo));
            if (foo[sizeof(foo)-1] != '\0') {
                fprintf(stderr, "Warning message!");
                foo[sizeof(foo)-1] = '\0';
            }
GAH!  I've seen code like this and I don't like it.  That code is non-obvious.

Here's a tip for writing good code.  How would you verbally describe the basic high-level approach to somebody?  Would you say:

    First set a null at the end of the buffer.  We do this because we want to find out if the strncpy function truncates the string.  If so, then that null will be overwritten with the last character copied.  So after copying, we check that last character; if the null was overwritten, then we warn the user that it was truncated, and then we add the final null (thus truncating it even further).

No, you wouldn't.  You would say something more like:

    If the string fits, copy it.  If it doesn't, warn the user and copy in as much as fits, and null-terminate it.

How about this:
        case 'F':
            if (strlen(optarg) < sizeof(foo))
                strcpy(foo, optarg);
            else {
                fprintf(stderr, "Warning message!");
                memcpy(foo, optarg, sizeof(foo)-1);
                foo[sizeof(foo)-1] = '\0';
            }
            ...

I like this better; it flows almost exactly as the description does.
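
If the same pattern shows up in many places, it can be pulled into a small function.  This is a sketch only; the name "copy_or_truncate" and its return convention are my own, and it assumes the destination size is at least 1.  The caller decides what to print when truncation happens.

```c
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for dstsize
 * bytes (dstsize must be > 0).
 * Returns 0 if the whole string fit, 1 if it had to be truncated.
 * dst is null-terminated either way. */
int copy_or_truncate(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);

    if (len < dstsize) {
        memcpy(dst, src, len + 1);     /* includes the terminating null */
        return 0;
    }

    memcpy(dst, src, dstsize - 1);     /* as much as fits ... */
    dst[dstsize - 1] = '\0';           /* ... then terminate */
    return 1;
}
```

The option handler then shrinks to a single "if (copy_or_truncate(foo, sizeof(foo), optarg))" followed by the warning.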

And yes, there is an irrational part of me that wants to use "strncpy()" inside the "if" statement, even though it isn't needed.  You're just "supposed to".  But "strcpy()" IS more efficient, and it is possible to go overboard with the whole "belt and braces" thing.  However, I would not object to somebody using "strncpy()" there.

I guess the moral of the story is: just because the function has an "n" in it doesn't mean that people are going to use it safely.

Friday, February 7, 2014

Configure a Cron Job with a Wiki

I have some periodic cron jobs that need extra configuration.  For example, one of them generates a report on bug statistics on a code branch basis.  So I need to tell it which code branches to process.  I could just put the list of branch tags on the command line of the report generator, and just use "crontab -e" while logged in to modify it.  However, I want anybody to be able to maintain the list, without having to know my password or the syntax for crontab.

It turns out that we installed Mediawiki locally for our own internal wiki.  So I created a wiki page with a table listing the code branches that are active.  Then I wrote a script which uses "curl" to fetch that wiki page and parse out the branches.  This gives me a nice web-based GUI interface to the tool that everybody is already familiar with.  Everybody here knows how to use Wikipedia, so anybody can go in and change the list of branches.

After doing some additional development, I wanted to be able to include additional configuration for the cron job which I didn't particularly want displayed on the wiki page.  You can use <!--HTML-style comments--> with Mediawiki and it won't display it on the page.  Unfortunately, it completely withholds the comment from the html of the page.  I.e. you can't see it even if you display source of the page.  You only see the comment when you edit the page.

So here's what I ended up with:

#!/bin/sh
# all.sh
# Generate QA report for all active releases.  Runs via cron nightly.

date
# Read wiki page and select rows in the release table ("|" in col 1).
# Uses "action=edit" so that <!--comments--> are included (for option processing).
curl 'http://localwiki/index.php?title=page_title&action=edit' | egrep "^\|" >all.list

# read the contents of all.list, line at a time.  Each line is an entry in the table of active releases.
while read ILINE; do :
    # Extract target milestone (link text of web-based report page).
    TARGET_MILESTONE=`echo "$ILINE" | sed -n 's/^.*_report.html \(.*\)\].*$/\1/p'`
    # Extract the (optional) set of command-line options.
    OPTS=`echo "$ILINE" | sed -n 's/^.*--OPTS:\([^:]*\):.*$/\1/p'`

    eval ./qa_report $OPTS \"$TARGET_MILESTONE\" 2>&1
done <all.list

The "sed" commands use "-n" to suppress printing of lines to stdout.  Adding a "p" suffix to the end of a sed command forces a print, if the command is successful.  So, for example, the line:
    OPTS=`echo "$ILINE" | sed -n 's/^.*--OPTS:\([^:]*\):.*$/\1/p'`
If the contents of $ILINE do not match the pattern (i.e. there is no option string), the "s" command is not successful and therefore doesn't print, leaving OPTS empty.

One final interesting note: the use of "eval" to run the qa_report.sh script.  Why couldn't you just use this?
    ./qa_report $OPTS "$TARGET_MILESTONE" 2>&1

Let's say that $TARGET_MILESTONE is "milestone" and the contents of $OPTS is:
    -a "b c" -d "e f"
If you omit the "eval", you would expect the resulting command line to be:
    ./qa_report -a "b c" -d "e f" "milestone" 2>&1
I.e. the qa_report tool will see "b c" as the value for the "-a" option, and "e f" as the value for the "-d" option.  But the shell doesn't work this way.  The line:
    ./qa_report $OPTS "$TARGET_MILESTONE" 2>&1
will expand $OPTS, but it won't group "b c" as a single entity for -a.  Without "eval", the "-a" option will only see the two-character value "b (with the quote mark).  I found a good explanation for this; the short version is that the shell does quote processing before it does symbol expansion.  So essentially, the thing you need to do is have the shell parse the command line twice.

The "eval" form of the command works like this:
    eval ./qa_report $OPTS \"$TARGET_MILESTONE\" 2>&1
First the shell looks at this command line and parses it with "eval" as the command and the rest as its parameters.  It does the symbol substitution.  Thus, the thing that gets passed to "eval" is:
    ./qa_report -a "b c" -d "e f" "milestone"
What does the eval command do with that?  It passes it to the shell for parsing!  In this pass, "./qa_report" is the command, and the rest are the parameters.  Since the shell is parsing it from scratch, it will group "b c" as a single entity, letting the "-a" option pick it up as a single string.

Monday, February 3, 2014

Syns, Syn Cookies, TCP Listen Backlog: More Complicated than You Think


No, syncookies don't have anything to do with dieting.  But they did come up as I learned that the TCP listen backlog is more complicated than I thought.  This article should help those of you trying to support TCP servers with lots of clients, especially if large numbers of clients can try to connect at the same time.  (For example, a popular web server.)  This is Linux-oriented; I'm not sure how applicable the info is for other OSes.

---

Here is an article which talks about the TCP listen backlog. Here are some quotes:
    The backlog has an effect on the maximum rate at which a server can accept new TCP connections on a socket. ... Many systems (particularly BSD-derived or influenced) silently truncate this value (the backlog parameter to the listen() system call) to 5 — version 1.2.13 of the Linux kernel [really old - SF] does this ... Using small values for the listen backlog was one of the major causes of poor web server performance with many operating systems up until recently. ... The backlog parameter is silently truncated to SOMAXCONN ... defined as 128 in /usr/src/linux/socket.h for 2.x kernels.
---

Here is a brilliant writeup that taught me about "syncookies", and how they can lead to hung clients. Basically, if the listen backlog (a.k.a. the SYN queue) fills up and more client connection requests (SYNs) come in, the server will *act* like it is accepting them by responding with syncookies. But the kernel won't actually set up state for those connections or inform the app of the new connection. Instead, the server waits for the client to respond with the ACK (the third step of the 3-way handshake). That ACK contains enough information for the server to reconstruct the initial SYN, and the kernel proceeds to open the connection as normal. HOWEVER, if the client's ACK gets lost in the switch or the NIC or whatever, then the client will be left thinking the connection was accepted and is ready, and the server will have no memory of it.

This leads to a genuine hang if the application protocol depends on the server sending the first message, like SMTP or MySQL. In these cases, the client app will hang forever waiting for the server to send its message.

---

Here is an article which gives advice on how to set up systems that can accept lots of TCP connections. Here's a quote:
    Three system configuration parameters must be set to support a large number of open files and TCP connections with large bursts of messages. Changes can be made using the /etc/rc.d/rc.local or /etc/sysctl.conf script to preserve changes after reboot. In either case, you can write values directly into these files (e.g. "echo 32832 > /proc/sys/fs/file-max").
    • /proc/sys/fs/file-max: The maximum number of concurrently open files. We recommend a limit of at least 32,832.
    • /proc/sys/net/ipv4/tcp_max_syn_backlog: Maximum number of remembered connection requests, which are still did not receive an acknowledgment from connecting client. The default value is 1024 for systems with more than 128Mb of memory, and 128 for low memory machines. If server suffers of overload, try to increase this number.
    • /proc/sys/net/core/somaxconn: Limit of socket listen() backlog, known in userspace as SOMAXCONN. Defaults to 128. The value should be raised substantially to support bursts of request. For example, to support a burst of 1024 requests, set somaxconn to 1024.
Here are some commands I entered on our host Saturn:

   sford@Saturn$ cat /proc/sys/net/core/somaxconn
   128
   sford@Saturn$ cat /proc/sys/net/ipv4/tcp_max_syn_backlog
   2048
   sford@Saturn$ cat /proc/sys/fs/file-max
   3263962
   sford@Saturn$

Looks like the main thing we need to do is increase somaxconn, and maybe tcp_max_syn_backlog as well.

---

One small concern. I saw various references to SOMAXCONN as being a constant in a system include file. It apparently lives in different places, depending on the OS flavor/version; I found it here:

   sford@Saturn$ find /usr/include | xargs egrep SOMAXCONN
   /usr/include/bits/socket.h:#define SOMAXCONN 128

So now the question becomes, if we update the tuning parameter, do we also have to modify the include file? My gut says no. I'm thinking that maybe if you built the kernel from source, you would perhaps change the default via that include, but on a running system you simply override that default and you can magically use larger numbers in the listen() call.
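
To make that reasoning concrete: the backlog argument to listen() is just an int handed to the kernel at run time, so the compile-time constant in the header never reaches the kernel.  Here is a minimal sketch, assuming Linux and assuming the kernel clamps the request to the value in /proc/sys/net/core/somaxconn.  Both function names are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: read the run-time limit from /proc (Linux only).
 * Returns the limit, or -1 if it cannot be read. */
long read_somaxconn(void)
{
    FILE *fp = fopen("/proc/sys/net/core/somaxconn", "r");
    if (fp == NULL) {
        return -1;
    }

    long limit = -1;
    if (fscanf(fp, "%ld", &limit) != 1) {
        limit = -1;
    }
    fclose(fp);
    return limit;
}

/* Hypothetical helper: the backlog that would effectively be honored for
 * listen(fd, requested), assuming the request is clamped to the run-time
 * limit.  A limit <= 0 means "unknown", so the request passes through. */
long effective_backlog(long requested, long limit)
{
    if (limit > 0 && requested > limit) {
        return limit;
    }
    return requested;
}
```

With the values shown above for Saturn, a listen() call asking for 1024 would be cut back to 128 until somaxconn is raised.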

Socket buffers: more complicated than you think

If you are receiving UDP datagrams (multicast or unicast, no difference), how much socket buffer does a datagram consume?  I.e. how many datagrams of a particular size can you fit in a socket buffer configured for a given size?

Well ... it's complicated.

I've tried some experiments on two of our Linux systems, and encountered some surprises.  Note that my experiments were performed with modified versions of the msend and mdump tools, i.e. simple UDP with no higher-level protocol on top of it.  (See my Github project for my modified versions.)  The modified mdump command sets up the socket, prints a prompt, and waits for the user to hit return before entering the receive loop.  I had msend sending 500 messages with 10 ms between sends (nice and slow so as not to overrun the NIC).  Since the mdump is not yet in its receive loop, the datagrams are stored in the socket buffer.  When the send finishes, I hit return on mdump, which enters the receive loop and empties the socket buffer, collecting statistics.  Then I hit control-c on mdump, and it reports the number of messages and bytes received.  Finally, I did experiments on both unicast and multicast; the results are the same.

Here are some results for a two-system test, sending from host "orion", receiving on host "saturn".  The message sizes and bytes received shown are for UDP payload.  Receive socket buffer configured for 100,000 bytes.  Note that 1472 is the largest UDP payload which can be sent in a single ethernet frame (i.e. no IP fragmentation).

    message size   messages received   bytes received
            1472                  61            89792
             215                  61            13115
             214                 157            33598
               1                 157              157
Interesting.  The number of messages seems to not depend on message size, except for a discontinuity at 215 bytes.  I checked a lot of other message sizes, and they all follow the pattern: 61 messages for sizes >= 215, 157 messages for sizes <= 214.


Now let's double the receiver socket to 200,000 bytes:

    message size   messages received   bytes received
            1472                 121           178112
             215                 121            26015
             214                 313            66982
               1                 313              313

The messages received are approximately doubled, with the discontinuity at the exact same message size.  Cutting the original socket buffer in half to 50,000 approximately cuts the message counts in half, with the discontinuity at the same place (I won't bother including the table).


Now let's switch the roles: send from saturn, receive on orion.  Socket buffer back to 100,000 bytes.

    message size   messages received   bytes received
            1472                  77           113344
             215                  77            16555
             214                 363            77682
               1                 363              363

The discontinuity is at the same place, but different numbers of messages are received.  The Linux kernel versions are very close to the same - saturn is 2.6.32-358.6.1.el6.x86_64 and orion is 2.6.32-431.1.2.0.1.el6.x86_64.  Both systems have 32 gig of memory and are using Intel 82576 NICs.  Saturn has 2 physical CPUs with 6 cores each, and orion has 2 physical CPUs with 4 cores each and hyperthreading turned on.  I don't know why they hold different numbers of messages in the same-sized socket buffer.


These machines also have 10G Solarflare NICs in them, so let's give that a try.  Send from saturn, receive on orion, socket buffer 100,000 bytes.

    message size   messages received   bytes received
            1472                 110           161920
               1                 110              110

Whoa! That's right - when using the Solarflare card, the socket buffer held more bytes of data than the configured socket buffer size!  But this isn't necessarily unexpected; the man page for socket(7) says this about setting the receive socket buffer: "The kernel doubles this value (to allow space for bookkeeping overhead)". Finally, it's interesting that there is no discontinuity - 110 messages, regardless of size.
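
The doubling is easy to see from a program: set the receive buffer size, then read it back.  Here is a minimal sketch; the function name is made up, and the value reported depends on the OS and on system limits such as net.core.rmem_max on Linux, so I won't promise a particular number.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: ask for a receive buffer of `requested` bytes on a
 * fresh UDP socket and return the size the kernel reports back,
 * or -1 on error. */
int rcvbuf_after_request(int requested)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }

    int reported = -1;
    socklen_t len = sizeof(reported);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) != 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &reported, &len) != 0) {
        reported = -1;
    }

    close(fd);
    return reported;
}
```

On a Linux system where the limit permits it, asking for 100,000 should report roughly twice that, which is the budget the datagrams (plus their bookkeeping overhead) are charged against.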


Let's stick with the Solarflare cards, and go back to orion sending, saturn receiving (still 100,000 byte socket buffer):

    message size   messages received   bytes received
            1472                  87           128064
               1                  87               87

Fewer messages, but still exceeds 100,000 bytes worth with large messages.


Now let's put both sender and receiver on saturn (loopback), with 100,000 byte socket buffer:

    message size   messages received   bytes received
            1472                  87           128064
             582                  87            50634
             581                 157            91217
              70                 157            10990
              69                 261            18009
               1                 261              261

Lookie there! Two discontinuities.


Someday maybe I'll try this on other OSes (our lab has Windows, Linux, Solaris, HP-UX, AIX, FreeBSD, MacOS).  Don't hold your breath.  :-)


I did try a bit with TCP instead of UDP.  It's a little trickier since instead of generating loss, TCP flow controls.  And you have to take into account the send-side socket buffer.  And I wanted to force small segments (packets), so I set the TCP_NODELAY socket option (to disable Nagle's algorithm). The results were much more what one might expect - the amount buffered depended very little on the segment size. With 1400-byte messages, it buffered 141,400 bytes. With 100-byte messages, it buffered 139,400 bytes. I suspect the reduction is due to more overhead bytes.  (I didn't try it with different NICs or different hosts.)


The moral of the story is: the socket buffer won't hold as much UDP data as you think it will, especially when using small messages.

UPDATE: on a colleague's suggestion, I looked at the "recv-Q" values reported by netstat.  On Linux, I sent a single UDP datagram with one payload byte.  The "recv-Q" value reported was 1280 for an Intel NIC, and 2304 for a Solarflare NIC.  When I set the socket buffer to 100,000 bytes and fill it with UDP datagrams, "recv-Q" reports a bit over 200,000 bytes - double the socket buffer size I specified.  (Remember that socket(7) says that the kernel doubles the buffer size to allow space for bookkeeping overhead.)
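Those "recv-Q" numbers invite a back-of-the-envelope check.  This is only a guess at the mechanism, not something verified against the kernel source: if each tiny datagram is charged its "recv-Q" cost against the doubled buffer, the buffer should hold roughly (2 * buffer size) / (per-datagram charge) small messages.  The function name below is made up for the sketch.

```python
def approx_capacity(requested_bufsize, per_datagram_charge):
    """Rough estimate of how many tiny datagrams fit, assuming each one
    is charged per_datagram_charge bytes against the doubled buffer."""
    doubled = 2 * requested_bufsize
    return doubled / per_datagram_charge


if __name__ == "__main__":
    print(approx_capacity(100000, 1280))  # Intel figure: 156.25
    print(approx_capacity(100000, 2304))  # Solarflare figure: about 86.8
```

The Solarflare estimate lands close to the 87 one-byte messages received in the orion-to-saturn run.  It does not explain the 110 messages seen in the other direction, so there is clearly more to the story.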

UPDATE2: I'm not the first one to wonder about this.  See https://www.unixguide.net/network/socketfaq/5.9 (that info is for BSD, not Linux).

Sunday, January 12, 2014

Functionally semi-literate?

Huh.

I'm slowly coming to the conclusion that humans have an amazing ability to adapt and change, and to be completely ignorant of those changes.

As far as the inside of my brain can tell, I am pretty much the same guy as I was in my college years.  If anything, I assume that my brain power was a little higher back then - I am 56 after all - but aside from being a little less able to focus intensely, I think I'm approximately as smart as I was 30 years ago.

A couple of things have happened in recent years which suggest I might be wrong.

Thing #1: In my college days, I tried and tried to get into Knuth's "The Art of Computer Programming".  But there was something about his writing style that made the material rather opaque to me.  Partly the insane MIX machine, but also ... I don't know ... a certain arrogance.  At least that's the conclusion I came to, and I've harbored a secret resentment towards Knuth ever since.

Imagine my surprise when, on a lark a couple of years ago, I picked up one of the volumes and started reading it.  Whoa!  Did somebody replace those impenetrable tomes with clear and concise writing?  And the arrogance, where did it go?  What I saw instead was a wry sense of humor which made it enjoyable to read.  I'm officially a fan.

After thinking long and hard about it, I've tentatively concluded that, yes I had some trouble with the material in college, and I became unconsciously defensive, blaming the writing for my difficulty.  But that leaves the question: why am I able to digest the material with relative ease now?

Thing #2: Also in my college days, I wanted to learn LISP.  So I got Winston and Horn and started in.  And although I was able to get the basic syntax, I didn't *get* it.  I couldn't understand the point of so-called "functional programming" where sequential execution is eschewed.  No side effects?  Immutable variables?  How could you write anything more complicated than towers of Hanoi?

I decided that I just didn't have a functional brain - I was brought up imperatively, and my brain was set.  No way I could "think" functionally.

Then, a little while ago, XKCD had a comic about Haskell, a functional programming language.  I looked in the forum and saw the same old religious wars as always between functional and imperative.  One of the posts mentioned a Haskell tutorial, "Learn You a Haskell for Great Good!"  The silly title piqued my interest and I looked it over.

Well, slap my ass and call me Sally!  Imperative programming consists of telling the compiler how to execute an algorithm; functional programming consists of defining what the algorithm IS, its definition!  Imperative is dynamic: you DO step 1, then you DO step 2.  Functional is static: you define a function's outputs in terms of its inputs, and let the compiler figure out how to evaluate it.

OK, all you imperative programmers probably didn't experience a blinding flash of insight by that paragraph.  (Go through the Haskell tutorial - it's really good.)  But basically, the tutorial disintegrated a little rock which had gotten stuck in my mental gears all those years ago.  Why hadn't anybody said it that way back then?
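For the imperative crowd, here is the contrast in miniature.  It's written in Python rather than Haskell, so it is only an illustration of the two styles and not real functional programming; the function names are made up.  The first version says what to DO, step by step.  The second says what the sum of squares IS, in terms of a smaller sum of squares.

```python
def sum_squares_imperative(n):
    """Imperative: start at zero, then DO an addition for each i."""
    total = 0
    for i in range(1, n + 1):
        total += i * i
    return total


def sum_squares_functional(n):
    """Functional flavor: define the result in terms of the input.
    The sum of the first n squares is n*n plus the sum of the first
    n-1 squares, and the sum of zero squares is 0."""
    return 0 if n == 0 else n * n + sum_squares_functional(n - 1)


if __name__ == "__main__":
    print(sum_squares_imperative(10))  # 385
    print(sum_squares_functional(10))  # 385
```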

But LISP, now that's another story.  Between car, cdr, cons, and lambdas, I just don't have the brain for it.  But I checked a LISP tutorial (not nearly as good as the Haskell one) and within about 30 minutes I got it.  I'm not a LISP (or Haskell) programmer by any means, but now I *get* them.

Without intending to detract from that great tutorial, I suspect that the biggest difference between "college me" and "now me" is my own lump of grey matter.  30 years of experiences (both technical and otherwise), 30 years of exploring other points of view, of having my mind expanded by continuing to learn the craft, 30 years of ... dare I say it ... getting smarter.

It is still true that I can't focus as intensely as I used to be able to do.  But maybe there is some truth in the tired old saw that geezers like me may not be able to put in hours like a 20-something-year-old, but the hours we do put in might be better spent.

Wednesday, August 28, 2013

Schools relying too heavily on Java

As a hiring manager looking for good software engineers, this article rings so true...

http://www.joelonsoftware.com/articles/ThePerilsofJavaSchools.html

Like he says, I'm all for hiring programmers who know Java, and even who have had most of their practical experience in Java.  But if they *think* exclusively in Java, then I don't want them for the kind of software we develop.

Sunday, June 16, 2013

Phone Number Word Search

Mnemonic phone numbers.  You know what I'm talking about: a computer store which lists its phone number as "Run-Fast".  Alternatively, they could have used "Sum-Fart" with that same number.  Or a florist with "4-flowers".  Notice it has 8 digits instead of the standard 7 - it always used to be that extra digits dialed at the end were ignored; I wonder if that still works.

You might be surprised how many phone numbers have interesting words hidden inside.  Go to DialABC to find out.  The other day I was just curious if my phone number had an interesting mnemonic.  I discovered that my work number can be listed as, "eye-plea" (maybe I'm a lawyer specializing in ophthalmology).  Or "Ex-Dr. Lea" (poor Lea, lost her license to practice medicine).

It's harder if you have 1s or 0s in your number since those don't have letters associated with them.
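Going from a mnemonic back to its digits is the easy direction, and it only takes a few lines.  Here is a small Python sketch using the standard telephone keypad lettering; the names "KEYPAD" and "mnemonic_to_digits" are just made up for the example.

```python
# Standard telephone keypad lettering.  Note that 1 and 0 carry no letters.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def mnemonic_to_digits(mnemonic):
    """Convert a mnemonic like "Run-Fast" to the digits you would dial.
    Digits pass through unchanged; punctuation and spaces are dropped."""
    digits = []
    for ch in mnemonic.upper():
        if ch in "0123456789":
            digits.append(ch)
        elif ch in LETTER_TO_DIGIT:
            digits.append(LETTER_TO_DIGIT[ch])
    return "".join(digits)


if __name__ == "__main__":
    print(mnemonic_to_digits("Run-Fast"))   # 7863278
    print(mnemonic_to_digits("Sum-Fart"))   # 7863278
    print(mnemonic_to_digits("4-flowers"))  # 43569377 (8 digits)
```

Finding the words hidden in a number is the harder, reverse direction, since each digit can stand for 3 or 4 letters and you need a dictionary to filter the combinations.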

Anyway, I was just curious, and vaguely remembered that there were sites which gave you suggestions.  The first several sites I tried ranged from OK to disappointing, and were mostly just platforms to serve up ads.  Then I found DialABC.  For the purpose I wanted, it was superior.  The visual presentation of the words was much better than a long list of combinations.  Plus, the site has other interesting goodies, like DTMF-number conversion, and even an anagram finder (my full name can be re-arranged to "vulgar fondest odes" - I guess I wrote dirty love limericks).

I like DialABC - it provides interesting services without being overly promotional.

Sunday, May 5, 2013

Transfer DNS management to GoDaddy

Some time ago, I transferred my geeky-boy.com domain from an old hosting service to GoDaddy.  At that time, I set the NS servers to my (then) new hosting service, suso.org.  I did this so that it would be transparent for me if suso decided to move my web page or mail server to a different host.  (And indeed, they did move me at one point, and connectivity was seamless and smooth, so it "worked".)

Now I plan to move to a different hosting service, so I wanted to move DNS management from suso to GoDaddy, which has a web-based control panel so that I could self-manage.  It was not clear to me how to transfer management from Suso to GoDaddy, but after some experimentation and reading, I figured it out.

The first thing to do is to get a printout of all the domain zone information.  I had to send a support request to Suso to get it, and it took a full week.  :-(

Log into GoDaddy and go to "My Account".  On the "DOMAINS" row click "Launch".  Select the domain name.

Under "Nameservers" click "Set Nameservers".  It probably has "I have specific nameservers for my domains" selected.  Change it to "I want to park my domains".  This should set your name servers to NSxx.DOMAINCONTROL.COM where "xx" is a 2-digit number.  Click "OK".

You will now have to wait about 5 minutes and then refresh the page.  At that point, you should see the Nameservers section listing the NSxx.DOMAINCONTROL.COM servers, and under "DNS Manager" there should be a bunch of pre-defined A, CNAME, and MX records, all pointing at GoDaddy's servers.  These are the "parking" settings, which you don't want.

Under that is the link "launch".  Click it.  That puts you into the zone editor, which is pretty easy to figure out.  I changed the "@" A record to point at Suso (I haven't moved yet), and added the other A records to match what Suso has.  I removed all the GoDaddy CNAME records and added the ones from Suso.  And I changed the MX records.

Now I just have to decide where to host everything.

Thursday, February 7, 2013

random.org

There have been quite a few times that I have wished for a quick and easy way to make random selections.  I mean in everyday situations, like if I want a random ordering of people for some task.  It's surprisingly hard to just "choose randomly".  Flipping a coin is fine when the number of choices is small and a power of two, but too time-consuming with 8 or more choices.  Slips of paper in a hat is also a bit unwieldy.

I suppose I could use this, but I guess I like more diversity in my random numbers.

Today I found www.random.org.  It is not a pseudo-random number generator; it uses atmospheric noise to generate *true* random numbers.
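For the everyday case of putting a list of people in random order, a few lines of Python will also do.  To be clear, this sketch does not talk to random.org and does not use atmospheric noise; it assumes that the operating system's entropy source (via Python's random.SystemRandom) is good enough for picking who goes first.  The function name is made up.

```python
import random


def random_order(people):
    """Return a new list with the same people in a random order, using
    the operating system's entropy source instead of the default
    pseudo-random generator."""
    shuffled = list(people)  # copy, so the caller's list is untouched
    random.SystemRandom().shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    print(random_order(["Alice", "Bob", "Carol", "Dave", "Eve"]))
```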

I know, I know ... overkill, right?  Think I'm OCD?  You're wrong.  I'm probably a bit OCPD.

Tuesday, February 5, 2013

When We Went MAD

I think you have to be of a certain age to really appreciate MAD Magazine.  When I was a kid, it was one of the *FEW* outlets of satire, sarcasm, and anti-grownup humor available to wee tykes.

I suspect there are few who grew up prior to The Simpsons who would have such fond memories, but old farts like me might be interested in:

http://www.kickstarter.com/projects/1438570989/when-we-went-mad-a-documentary-of-ecch-ic-proporti

I threw in $25, and I hope they make their goal.