The Meditative Coder: 2020

Sunday, November 29, 2020

Using sed "in place" (gnu vs bsd)

I'm not crazy after all!

Well, ok, I guess figuring out a difference in "sed" between gnu and bsd is not a sign of sanity.

(TL;DR this works on both: sed -i.bak -e "s/x/y/" x.txt)

I use sed a fair amount in my shell scripts. Recently, I've been using "-i" a lot to edit files "in-place". The "-i" option takes a value that is interpreted as a file name suffix to save the pre-edited form of the file. You know, in case you mess up your sed commands, you can get back your original file.

But for some scripts, the file being edited is itself generated, so there is no need to save a backup. So, just pass a null string in as the suffix. Right?

[ update: useful page: https://riptutorial.com/sed/topic/9436/bsd-macos-sed-vs--gnu-sed-vs--the-posix-sed-specification ]

BSD SED (FreeBSD and Mac)

$ echo "x" >x.txt
$ sed -i '' -e "s/x/y/" x.txt
$ cat x.txt
y
$ ls
x.txt

Looks good. Let's try Linux.

GNU SED (Linux and Cygwin)

$ echo "x" >x.txt
$ sed -i '' -e "s/x/y/" x.txt
sed: can't read : No such file or directory
$ cat x.txt
y
$ ls
x.txt

Hmm ... that's odd. It "worked", which is to say the file was properly edited. But what's with that "no such file" error? Man page to the rescue:

$ man sed
...
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)

Interesting, you can omit the suffix. And you mustn't supply a space between the "-i" and the suffix; GNU sed thinks you've omitted the suffix and treats the next argument (here, the empty string) as an input file. Here's an example with a non-empty suffix:

$ echo "x" >x.txt
$ sed -i .bak -e "s/x/y/" x.txt
sed: can't read .bak: No such file or directory
$ cat x.txt
y
$ ls
x.txt

See? With the space, it thinks ".bak" is an input file. But we don't want a backup file, so let's try just omitting the suffix, like the man page says.

$ echo "x" >x.txt
$ sed -i -e "s/x/y/" x.txt
$ cat x.txt
y
$ ls
x.txt

Works. Let's try it on BSD.

BSD SED (FreeBSD and Mac)

$ echo "x" >x.txt
$ sed -i -e "s/x/y/" x.txt
$ cat x.txt
y
$ ls
x.txt x.txt-e

Wait, what? Again, the file was edited properly, but what's with that file "x.txt-e"? Oh, BSD sed doesn't support a missing suffix. You can supply an empty one, but you can't just omit it. So sed looked at the above command line and thought "-e" was my desired suffix. And since the "-e" option is optional in front of an in-line sed program, the edit still ran.

ARGH!

I use both Mac and Linux, and want scripts that work on both!

THE SOLUTION

There is no portable way to tell both seds that you want in-place editing but don't want a backup suffix. So just go ahead and always generate a backup file. And remember, GNU doesn't like a space after the "-i". This works on both:

$ echo "x" >x.txt
$ sed -i.bak -e "s/x/y/" x.txt
$ cat x.txt
y
$ ls
x.txt x.txt.bak

Works on Mac and Linux. Just delete the .bak file.
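
If the leftover backup is a nuisance, the two steps can be wrapped in a small shell function. This is only a sketch: "sed_inplace" is a name made up for illustration, not a standard command, and it assumes the file name does not start with a dash.

```shell
# sed_inplace SCRIPT FILE
# Hypothetical helper (not part of sed): edit FILE in place on both GNU
# and BSD sed, then remove the backup that the portable form leaves behind.
sed_inplace() {
    script=$1
    file=$2
    # "-i.bak" with no space is accepted by both seds.
    # Only delete the backup if sed itself succeeded.
    sed -i.bak -e "$script" "$file" && rm -f "$file.bak"
}

# Demo in a scratch directory.
demo_dir=$(mktemp -d)
echo "x" >"$demo_dir/x.txt"
sed_inplace "s/x/y/" "$demo_dir/x.txt"
cat "$demo_dir/x.txt"    # the edited contents
ls "$demo_dir"           # only x.txt should remain
rm -rf "$demo_dir"
```

If sed fails, the backup is left alone, so the original file is still recoverable.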

It took a long time to figure all this out, largely because the incorrect usages basically worked. I.e. the intended file did get edited in place, but with undesired side effects. So I didn't notice there was a problem until I really looked at things and saw the error or the "x.txt-e" file.

Corner cases: the bane of programmers everywhere.

Friday, November 27, 2020

Sometimes you need eval

The Unix shell usually does a good job of doing what you expect it to do. Writing shell scripts is usually pretty straightforward. Yes, sometimes you can go crazy quoting special characters, but for most simple file maintenance, it's not too bad.

I *think* I've used the "eval" function before today, but I can't remember why. I am confident that I haven't used it more than twice, if that many. But today I was doing something that seemed like it shouldn't be too hard, but I don't think you can do it without "eval".

RSYNC

I want to use "rsync" to synchronize some source files between hosts. But I don't want to transfer object files. So my rsync command looks somewhat like this:

rsync -a --exclude "*.o" my_src_dir/ orion:my_src_dir

The double quotes around "*.o" are necessary because you don't want the shell to expand it; you want the actual string *.o to be passed to rsync, and rsync will do the file globbing. The double quotes prevent file glob expansion. And the shell strips the double quotes from the parameter. So what rsync sees is:

rsync -a --exclude *.o my_src_dir/ orion:my_src_dir

This is what rsync expects, so all is good.

PUT EXCLUDE OPTIONS IN A SYMBOL: FAIL

For various reasons, I wanted to be able to override that exclusion option. So I tried this:

EXCL='--exclude *.o' # default
... # code that might change EXCL
rsync -a $EXCL my_src_dir/ orion:my_src_dir

But this doesn't work right. The symbol "EXCL" will contain the string "--exclude *.o", but when the shell substitutes it into the rsync line, it then performs file globbing, and if the current directory contains any matching files, the "*.o" gets expanded to a list of them. For example, rsync might see:

rsync -a --exclude a.o b.o c.o my_src_dir/ orion:my_src_dir

The "--exclude" option only expects a single file specification.

SECOND TRY: FAIL

So maybe I can enclose $EXCL in double quotes:

rsync -a "$EXCL" my_src_dir/ orion:my_src_dir

This passes "--exclude *.o" as a *single* parameter. But rsync expects "--exclude" and the file spec to be two parameters, so it doesn't work either.

THIRD TRY: FAIL

Finally, maybe I can force quotes inside the EXCL symbol:

EXCL='--exclude "*.o"' # default
... # code that might change EXCL
rsync -a $EXCL my_src_dir/ orion:my_src_dir

This almost works, but what rsync sees is:

rsync -a --exclude "*.o" my_src_dir/ orion:my_src_dir

It thinks the double quotes are part of the file name, so it won't exclude the intended files.
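
To see what a command would actually receive in each of the three attempts, it helps to swap rsync for something that just prints its arguments. A small sketch: "show_args" is a hypothetical stand-in for rsync, and the scratch directory with "a.o" and "b.o" is only there to give the glob something to match.

```shell
# show_args: hypothetical stand-in for rsync; prints the argument count
# and each argument in brackets so word boundaries are visible.
show_args() {
    printf 'argc=%d:' "$#"
    for arg in "$@"; do
        printf ' [%s]' "$arg"
    done
    printf '\n'
}

# Scratch directory holding two object files for "*.o" to match.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.o" "$demo_dir/b.o"

# First try: unquoted expansion is word-split and then globbed.
EXCL='--exclude *.o'
try1=$(cd "$demo_dir" && show_args $EXCL)

# Second try: quoted expansion stays one single argument.
try2=$(cd "$demo_dir" && show_args "$EXCL")

# Third try: the embedded quotes are ordinary characters by now.
EXCL='--exclude "*.o"'
try3=$(cd "$demo_dir" && show_args $EXCL)

printf '%s\n' "$try1" "$try2" "$try3"
rm -rf "$demo_dir"
```

The first line shows three arguments with the pattern already expanded, the second shows a single argument containing a space, and the third shows the quote characters surviving into the pattern.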

EVAL TO THE RESCUE

The solution is to use eval:

EXCL='--exclude "*.o"' # default
... # code that might change EXCL
eval "rsync -a $EXCL my_src_dir/ orion:my_src_dir"

The shell does symbol substitution, so this is what eval sees:

rsync -a --exclude "*.o" my_src_dir/ orion:my_src_dir

And eval will re-process that string, including stripping the double quotes, so this is what rsync sees:

rsync -a --exclude *.o my_src_dir/ orion:my_src_dir

which is exactly correct.

P.S. - if anybody knows of a better way to do this, let me know!

EDIT: The great Sahir (one of my favorite engineers) pointed out a shell feature that I didn't know about:

Did you consider setting noglob? It will prevent the shell from expanding '*'. Something like:

EXCL='--exclude *.o' # default
set -o noglob
rsync -a $EXCL my_src_dir/ orion:my_src_dir
set +o noglob

I absolutely did not know about noglob! In some ways, I like it better. The goal is to pass the actual star character as a parameter, and symbol substitution is getting in the way. Explicitly setting noglob says, "hey shell, I want to pass a string without you globbing it up." I like code that says exactly what you mean.

One limitation of using noglob is that you might have a command line where you want parts of it not globbed, but other parts globbed. The noglob basically operates on a full line. So you would need to do some additional string building magic to get the right things done at the right time. But the same thing would be true if you were using eval. Bottom line: the shell was made powerful and flexible, but powerful flexible things tend to have subtle corner cases that must be handled in non-obvious ways. No matter what, a comment might be nice.

FULL DISCLOSURE: I tried it and it didn't work as expected. It's probably related to all the crazy quoting. Since my "eval" solution worked, I didn't invest the time to figure out why the "noglob" method didn't work. So I'm still using eval even though noglob is arguably better for this purpose.
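
One more approach sidesteps both eval and noglob: never flatten the options into a single string in the first place. The sketch below keeps them as separate words in the positional parameters (plain POSIX shell); in bash, an array such as EXCL=(--exclude '*.o') expanded as "${EXCL[@]}" plays the same role. This is untested against a real rsync host, so "show_args" is a hypothetical stand-in that just prints what it receives.

```shell
# show_args: hypothetical stand-in for rsync; prints each argument it
# receives on its own line, in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Store the option and its pattern as two separate words.
# The single quotes keep the shell from globbing the pattern here.
set -- --exclude '*.o'    # default
# ... code that might replace the list with another "set -- ..." ...

# "$@" expands to exactly the stored words: no re-splitting, no globbing.
show_args -a "$@" my_src_dir/ orion:my_src_dir
```

Replacing show_args with rsync gives the real command. The catch is that "set --" overwrites the script's own arguments, so save anything you need from them first.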

Tuesday, November 17, 2020

Ok, I guess I like Grammarly (grumble, grumble)

Ok, I grudgingly admit that I like Grammarly.

My complaints still hold: [UPDATE: these are all fixed now]

Mac users are second-class citizens. Mac Word integration has the file size limit, and there is no Mac Outlook integration. [UPDATE: Mac integration is now good]
Their desktop tool won't edit a locally-stored text file. You have to do cutting and pasting. [UPDATE: it integrates well with TextEdit. But not vim.]
The file size limit is too small for serious work. Yes, you can do cutting and pasting again, but really? In 2020? [UPDATE: it now operates on large files]

The grumpy old man in me really wants to mumble something about snot-nosed little kids and go back to a typewriter and liquid paper.

But ... well ... I do have some bad writing habits.

Mostly, I write unnecessarily complicated sentences, including useless phrases that I must have learned sound intellectual. It's a little humbling to have it pointed out over and over, but the result of more concise writing is worth it.

Mind you, there are many MANY times that I click the trash can because I don't like Grammarly's suggestions. In much of my technical writing, I use passive voice because active is too awkward. I also deviate from the standard practice of including punctuation inside quotes, especially when the quotes are not enclosing an actual quotation, but instead are calling out or highlighting a technical term, like a variable name. If I tell you to enter "ls xyz," and you type the comma, it won't work. You have to enter "ls xyz". I also sometimes include a comma that Grammarly thinks is not needed, but I think it helps separate two ideas.

Also, Grammarly isn't good at large-scale organization of content, which can have a MUCH greater effect on clarity than a few superfluous words.

In other words, *real* editors don't have to worry about being replaced by AIs for quite a while.

And yet ... and yet ... even with its limited ability to understand what I'm trying to say, it is still improving my writing. In small ways, perhaps. But improvement is improvement.

So yeah, I'll keep paying them money (grumble, grumble).

Friday, October 30, 2020

Software Sucks

Sorry, I had to say it. Software really does suck.

We just installed a new CentOS, and I wanted to do some apache work. I don't do that kind of thing very often, so I don't just remember how to do it. Thank goodness for search engines!

A quick Google search for "apache shutdown" led me to https://httpd.apache.org/docs/2.4/stopping.html, which tells me to do an "apachectl -k graceful-stop". Cool. Enter that command:

Passing arguments to httpd using apachectl is no longer supported.
You can only start/stop/restart httpd using this script.
If you want to pass extra arguments to httpd, edit the
/etc/sysconfig/httpd config file.

Um ... stopping httpd is exactly what I was trying to do. So I guessed that 2.4 must be old doc. Rather than trying to find new doc, I just entered

apachectl -h

It responded with:

Usage: /usr/sbin/httpd [-D name] [-d directory] [-f file]
                       [-C "directive"] [-c "directive"]
                       [-k start|restart|graceful|graceful-stop|stop]
                       [-v] [-V] [-h] [-l] [-L] [-t] [-T] [-S] [-X]
Options:
...

There's the "-k graceful-stop" all right. What's the problem? Well, except of course, for the stupid fact that the Usage line claims the command is "httpd", not "apachectl". Some newbie must have written the help screen for apachectl.

Another search for "Passing arguments to httpd using apachectl is no longer supported" wasn't very helpful either, but did suggest "man apachectl". Which says:

When acting in pass-through mode, apachectl can take all the arguments available for the httpd binary.
...
When acting in SysV init mode, apachectl takes simple, one-word commands, defined below.
...

How might I know which mode it's working in? Dunno. But a RedHat site gave an example of:

apachectl graceful

which matches the SysV mode. So apparently the right command is "apachectl graceful-stop" without the "-k". Which worked.

So why did "apachectl -h" give bad help? I think it just passed the "-h" to httpd (passthrough), so the help screen was printed by httpd. But shouldn't apachectl have complained about "-h"? GAH!

Software sucks.

Wednesday, October 14, 2020

Strace Buffer Display

The "strace" tool is powerful and very useful. Recently a user of our software sent us an strace output that included a packet send. Here's an excerpt:

sendmsg(88, {msg_name(16)={sa_family=AF_INET, sin_port=htons(14400), sin_addr=inet_addr("239.84.0.100")}, msg_iov(1)=[{"\2\0a\251C\27c;\0\0\2\322\0\0/\263\0\0\0\0\200\3\0\0", 24}], msg_controllen=0, msg_flags=0}, 0) = 24 <0.000076>

Obviously there are some binary bytes being displayed. I see a "\0a", so it's probably hex. But wait, there's also a \251. Does that mean 0x25 followed by ascii '1'? I decoded it assuming hex, and the packet wasn't valid.

So I did a bit of Googling. Unfortunately, I didn't note where I saw it, but somebody somewhere said that it follows the C string conventions. And C strings come from long ago, when phones had wires connecting them to wall jacks, stack overflow was a bug in a recursive program, and octal ruled the waves when it came to specifying binary data.

So \0a is 0x00 followed by ascii 'a' and \251 is 0xa9. Now the packet parses out correctly. (It's a "Session Message" if you're curious.)

So, I guess I'm a little annoyed by that choice as I would prefer hex, but I guess there's nothing all that wrong with it. Hex or octal: either way I need a calculator to convert to decimal. (And yes, I'm sure some Real Programmers can convert in their heads.)
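
For anyone else who would rather read hex: the shell's printf understands the same octal escapes, so one way to convert is to let printf rebuild the raw bytes and have od dump them. A sketch, using the buffer from the strace excerpt above; the "tr" at the end only strips od's spacing, which differs between systems.

```shell
# Rebuild the 24 bytes from strace's C-style string, then dump them as hex.
# printf reads "\251" as one octal byte and "\0a" as a NUL followed by "a".
hex=$(printf '\2\0a\251C\27c;\0\0\2\322\0\0/\263\0\0\0\0\200\3\0\0' |
    od -An -v -tx1 | tr -d ' \n')
echo "$hex"
```

The output starts with 020061a9, matching the decoding above: 0x00, then ASCII 'a' (0x61), then 0xa9. If the trace can be rerun, strace's "-x" and "-xx" options print strings in hex to begin with.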

Tuesday, August 25, 2020

I want to love Grammarly

UPDATE 2 (17-Nov-2021): Grammarly has released an update, and my first impression is THANK GOODNESS! They either listened to me, or more likely, they knew all along what their drawbacks were. Most of my complaints are now either resolved or at least better. It is now much more native to the Mac and works more cleanly with web pages, Mac's "text edit" app, and Office for Mac. Apparently, it leverages Apple's "accessibility" infrastructure somehow. But it doesn't work with MacVim or the Terminal app. Which doesn't surprise me. Also doesn't work with Teams, which maybe does surprise me a little.

Again, I'll try to write a proper review sometime.

EARLIER UPDATE: I've been using paid-for Grammarly for almost 3 months now, and despite my complaints, I have to say that I like it. I'll leave this post pretty much as I originally wrote it and write a follow-up post when I have time.

I've lived with a bit of a problem almost all my life. I'm a slow reader and a poor speller. I suspect I have a bit of a learning disability.

That's not my problem. If I really do have a learning disability, it is mild; my language abilities are not that far below average. And I'm an engineer, for goodness sake! I'm not expected to have a perfect command of English.

My problem is that I love to write. I've dabbled with fiction, humor, and non-fiction (this blog being a primary outlet). And I want the quality of my writing to be high.

When I had my first exposure to a spell-checker, I was ecstatic! Finally, a tool to save me huge amounts of time. When I first used one that suggested correct spellings, I thought I had died and gone to heaven. But their were still to many thymes that incorrect word choice, usually among homophones, led to mistakes not being caught. I knew that a grammar checker was needed to really get the spelling right.

Microsoft Word, for all the hate that is heaped on it, raised the bar. It catches many problems that "aspell" does not. Word is still not perfect, but it is *good*.

Enter Grammarly. There's a lot about it I really like. For example, instead of just showing you your mistake and offering suggestions, it explains why its a mistake (at least when using the Grammarly editor). I.e. it is both an error checker and a learning tool. I'm using the Grammarly editor to enter this blog post, and it found a couple of things that the Firefox checker did not flag.

... But ... did you notice my mistake in the previous paragraph? "...it explains why its a mistake." The "its" needs an apostrophe. Grammarly didn't catch it. Microsoft did. But neither one caught the "thymes" in the prior paragraph. (Interestingly, the Firefox checker does flag "thymes".) Grammar checking is still an inexact science. UPDATE: The latest version *does* catch the "its". But my point is that it won't be perfect.

But never let perfection be the enemy of good. And there's a lot of good in Grammarly. I really want to love it enough to pay the fee. Why don't I?

My biggest issue is the file size limitation, which does not grow with the paid-for version. I maintain the documentation for our product, and some of those files are pretty big. Way too big, it turns out. I would have to mess with splitting and recombining them. Never mind the annoyance of doing that, the recombining introduces more opportunities for mistakes. UPDATE: The file size limitation seems to be removed now. I can use TextEdit to edit very large files and it seems to work. But it doesn't remember the "errors" that I dismiss; again that seems to be only in the Grammarly editor.

Also, I use a Mac, and Grammarly doesn't integrate with Outlook on Mac. And even the Mac Word plugin is size-limited, although it looks like maybe that limitation would be lifted if I were using Windows instead of Mac. UPDATE: it does now.

Also, it doesn't work well with local ".txt" files except through copy-and-paste. (It can read text files, but not write them.) UPDATE: it does now with TextEdit.

I'll probably pay for a month's worth just to see what the 11 extra suggestions are for this blog post. Maybe seeing them will change my mind. But I kind of doubt it.

EDIT (19-Sep-2020): I did go ahead and shell out for the pro version. So far it has provided a small improvement. Not sure it's worth the cost yet, but still early days. In most cases, it challenges me on something that probably does deserve a second thought, but I ended up keeping as-is.

Sunday, July 12, 2020

Perl Diamond Operator

As my previous post indicates, I've done some Perl noodling this past week. (I can't believe that was my first Perl post on this blog! I've been a Perl fan for a loooooooong time.)

Anyway, one thing I like about Perl is it takes a use case that tends to be used a lot and adds language support for it. Case in point: the diamond operator "<>" (also called "null filehandle" or "null angle operator").

See the "Tutorial" section below if you are not familiar with the diamond operator.

Tips

I may expand this as time goes on.

Filename / Linenumber
Continue / Close
Skip Rest of File
Positional Parameter
Security Warning

Filename / Linenumber

Inside the loop, you can use "$." as the line number and "$ARGV" as the file name of the currently open file.

*BUT*, see next tip.

Continue / Close

Always code your loop as follows:

while (<>) {
...
} continue {
close ARGV if eof;
}

The continue clause is needed to have "$." refer to the line number within the *current* file. Without it "$." will refer to the total number of lines read so far.

In my opinion, even if what you want is total lines and not line within file, you should still code it like the above and just use your own counter for the total line number. This provides consistency of meaning for "$.". Plus, it's possible that in the future you will want to add functionality that requires line within file, and it's messy to code that with your own counter.

Skip Rest of File

Sometimes you get a little ways into a file and you decide that you're done with the file and would like to skip to the next (if any). Include this inside the loop:

close ARGV; # skip rest of current file

Positional Parameter

Let's say you're writing a perl version of grep, and you want the first positional parameter (after the options) to be the search pattern.

$ grep.pl "ford" *.txt

Unfortunately, this will try to read a file named "ford" as the first file. What to do?

my $pat = shift; # Pops off $ARGV[0].
while (<>) {
...

This works because "<>" doesn't actually look at the command line. It looks at the @ARGV array. The "shift" function defaults to operating on the @ARGV array.

Security Warning

Because of the way the diamond operator opens files, it is possible for a hostile user to construct a file that can produce very bad results. For example:

$ echo "hello world" >x
$ echo "goodby world" >'rm x|'
$ ls -1
rm x|
x
$ cat *
goodby world
hello world
$ cat x
hello world

So far, so good. "rm x|" is just an unusually-named file with a space in the middle and a pipe ("|") at the end. But now let's use my perl version of grep with a pattern of "." (matches all non-empty lines):

$ grep.pl "." *
Can't open x: No such file or directory at /home/sford/bin/grep.pl line 81.
$ cat x
cat: x: No such file or directory

Yup, grep.pl just deleted the file named "x". The pipe character at the end of the file name "rm x|" triggered Perl's ability to open a filehandle onto a command (a feature of the 2-argument open). In other words, by just naming a file in a particular way, you've made grep.pl do something unexpected and potentially dangerous.

This might look like a horrible security hole (what if the name of that rogue file resulted in deleting all your files?), but it can also be a very powerful (albeit rarely used) feature. The moral of the story is don't run *any* tool over a set of files that you aren't familiar with.

You can also use "<<>>" instead of "<>". This forces each input file to be opened as a file, not potentially as a command. But it requires Perl version 5.22 or newer, which rarely seems to be on any system I try to use.

Unfortunately, it also prevents the special handling of an input file named "-" to read standard input. This is a construct that I do use periodically.

Tutorial

Many Unix commands have the following semantics:

cmdname -options [input_file [input_file2 ...] ]

where the command will read from each input file sequentially, or from standard input if no input files are provided. File names can be wildcarded. Most such Unix commands allow you to supply "-" as a file name and the tool will read from standard input.

The diamond operator makes this ridiculously easy. Here's a minimal "cat" command in Perl:

#!/usr/bin/env perl
while (<>) {
print $_;
}

That's the whole thing. It takes zero or more input files (if none, reads from standard input) and concatenates them to standard out. Just like "cat".

Specifically what "<>" does is read one line from whatever input file is currently open. If it is at the end of the file, "<>" will automatically open the next file (if any) and read a line from it. As with many Perl built-ins, it leaves the just-read line in the "$_" variable.

You should be ready for the "Tips" section now.

Saturday, July 11, 2020

Perl Faster than Grep

So, I've been crawling through a debug log file that is 195 million lines long. I've been using a lot of "grep | wc" to count numbers of various log messages. Here are some timings for my MacBook Pro:

$ time cat dbglog.txt >/dev/null
real 0m35.423s

$ time wc dbglog.txt
195177935 1177117603 28533284864 dbglog.txt
real 1m44.560s

$ time egrep '999999' dbglog.txt
real 7m39.737s

(For this timing, I chose a pattern that would *NOT* be found.)

On the MacBook, the man page for fgrep claims that it is faster than grep. Let's see:

$ time fgrep '999999' dbglog.txt

real 7m11.365s

Well, I guess it's a little faster, but nothing to brag about.

Then I wanted to create a histogram of some findings, so I wrote a perl script to scan the file and create the histogram. Since it performed regular expression matching on every line, I assumed it would be a little slower than grep, since Perl is an interpreted language.

$ time ./count.pl dbglog.txt >count.out

real 3m9.427s

WOW! Less than half the time!

So I created a simple grep replacement: grep.pl. It doesn't do any histogramming, so it should be even faster.

$ time grep.pl '999999' dbglog.txt

real 2m8.341s

Amazing. Perl grep runs in less than a third the time of grep.

For small files, I bet Perl grep is slower starting up. Let's see.

$ time echo "hi" | grep 9999

real 0m0.051s

$ time echo "hi" | grep.pl 9999

real 0m0.113s

Yep. Grep saves you about 60 milliseconds. So if you had thousands of small files to grep, it might be faster to use grep.

See https://github.com/fordsfords/grep.pl

UPDATE:

I got another big log file today (70 million lines) and saw something pretty surprising given my initial findings.

See https://blog.geeky-boy.com/2021/07/more-perl-grep-performance.html

Wednesday, June 3, 2020

Multicast Routing Tutorial

I wanted to direct a colleague to a tutorial on Multicast that explained about IGMP Snooping and the IGMP Query function. But I couldn't find one. So I'm writing one with an expanded scope.

Multicast is a method of network addressing

Normally, when a publisher sends a packet, that packet has a destination address of a specific receiver. The sending application essentially chooses which host to send to, and the network's job is to transport that packet to that host. With multicast, the destination address is NOT the address of a specific host. Rather, it is a "multicast group", where the word "group" means zero or more destination hosts. Instead of the publisher choosing which receivers should get the packet, the receivers themselves tell the network that they want to receive packets sent to the group.

For example, if I send to multicast group 239.1.2.3, I have no idea which or even how many receivers might get that packet. Maybe none at all. Or maybe hundreds. It depends on the receivers - any receiver that has joined group 239.1.2.3 will get a copy of the packet.

Layer 2 (Ethernet)

For this tutorial, layer 2 means Ethernet, and layer 3 means IP. Yes, there are other possibilities. No, I don't care. :-)

Multicast exists in Ethernet and IP. They are NOT the same thing!

In the old days of Ethernet, you had a coaxial cable that had taps on it (hosts connected to the taps). At any given point in time, only one machine could be sending on the cable, and the signal was seen by all machines attached to the same cable. The NIC's job was to look at each and every packet that came by and consume the ones that are addressed to that NIC, ignoring packets not addressed to the NIC.

Ethernet multicast was a very simple affair. The NIC could be programmed to consume packets from a small number of application-specified Ethernet multicast groups. Multicast packets, like all other packets, were seen by every NIC on the network, but only NICs programmed to look for a multicast group would consume packets sent to that group.

Ethernet multicast was not routable to other networks.

Layer 3 (IP)

IP multicast was built on top of Ethernet multicast, but added a few things, like the ability to route multicast to other networks.
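
One concrete piece of that layering is how an IP group address turns into an Ethernet multicast address: the fixed prefix 01:00:5e followed by the low 23 bits of the IP address. That mapping is not described elsewhere in this tutorial; it comes from the standard IPv4-over-Ethernet rules (RFC 1112). A rough shell sketch, where "mcast_mac" is a made-up helper name and the input is assumed to be a well-formed dotted quad:

```shell
# mcast_mac GROUP: hypothetical helper that prints the Ethernet multicast
# MAC address corresponding to an IPv4 multicast group address.
mcast_mac() {
    rest=${1#*.}        # drop the first octet (its bits are not used)
    o2=${rest%%.*}      # second octet
    rest=${rest#*.}
    o3=${rest%%.*}      # third octet
    o4=${rest#*.}       # fourth octet
    # Only the low 23 bits survive, so clear the top bit of octet 2.
    printf '01:00:5e:%02x:%02x:%02x\n' "$(( $o2 & 127 ))" "$o3" "$o4"
}

mcast_mac 239.1.2.3
mcast_mac 239.84.0.100
```

Because the top 9 bits of the IP address are dropped, 32 different IP groups share each Ethernet address, so the NIC's filter alone cannot always tell them apart.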

When routing multicast, you don't want to just send *all* multicast packets to *all* networks. That would waste too much WAN bandwidth. So the IGMP protocol was created. Among other things, IGMP provides a way for routers to be informed about multicast *interest* in each network. So if a router sees a multicast packet for group X, and it knows that a remote network has one or more hosts joined to that group, it will forward the packet to that network. (That network's router will then use Ethernet multicast to distribute the packet.)

To maintain the router's group interest table, IGMP introduced a "join" command by which a host on a network tells the routers that it is interested in group X. There is also an IGMP "leave" message for when the host loses interest.

But what if the host crashes without sending a "leave" message? Routers have a lifetime for the multicast routing table entries. After three minutes, the router will remove a multicast group from its table unless the entry is refreshed. So every minute, the router sends out an "IGMP Query" command and the hosts will respond according to the groups they are joined to. This keeps the table refreshed for groups that still have active listeners, but lets dead table entries time out.

If the network administrator forgets to configure the routers to run an IGMP Querier, after 3 minutes the router will stop forwarding multicast packets (multicast "deafness" across a WAN).

IGMP Snooping

As mentioned earlier, in the old days of coaxial Ethernet, every packet was sent to every host, and the host's NIC was responsible for consuming only those packets it needed. This limited the total aggregate capacity of the network to the NIC speed. Modern switches greatly boost network capacity by isolating traffic flows from each other, so that host A's sending to host B doesn't interfere with host C's sending to host D. The total aggregate bandwidth capacity is MUCH greater than the NIC's link speed.

Multicast is different. Layer 2 Ethernet doesn't know which hosts have joined which multicast groups, so by default older switches "flood" multicast to all hosts. So similar to coaxial Ethernet, the total aggregate bandwidth capacity for multicast is essentially limited to NIC link speed.

So switch vendors cheated a little bit. Even though IGMP is a layer 3 IP protocol, layer 2 Ethernet switches can be configured to passively observe ("snoop") IGMP packets. And instead of keeping track of which networks are interested in what groups, the switch keeps track of which individual hosts on the network are interested in which groups. When the switch has a multicast packet, it can send it only to those hosts that are interested. So now you can have multiple multicast groups carrying heavy traffic, and so long as a given host doesn't join more groups than its NIC can handle, you can have aggregate multicast traffic far higher than NIC link speeds.

As with routers, the switch's IGMP Snooping table is aged out after 3 minutes. A router's IGMP query accomplishes the same function with the switch; it refreshes the active table entries.

Note that a layer 2 Ethernet switch is supposed to be a passive listener to the IGMP traffic between the hosts and the routers. However, for the special case where a network has no multicast routing, the IGMP Querier function is missing. So switches that support IGMP Snooping also support an IGMP querier. If the network administrator forgets to configure IGMP Snooping to query, the switch will stop delivering multicast packets after 3 minutes (multicast "deafness" on the LAN).

Saturday, May 23, 2020

Global STAC Live

I normally don't do product or vendor shilling on this blog. But I've always liked Peter Lankford at STAC, and I think he did a good job with this video:

www.STACresearch.com/spring2020

Yes, it's a funny advertisement, but instead of being just plain funny, it's actually pretty clever. And I don't just mean the jokes.

The Coronavirus has really disrupted professional conferences and events. Many are postponed, some are just plain cancelled. I assumed that STAC would be a casualty, but Peter is trying to make lemonade by making his 2020 summit virtual.

Now it's not like that has never been done before. Zoom has made quite a name for itself by saving the business meeting. (In fact, many meetings that used to be simple phone conferences have mysteriously and inexplicably drifted into Zoom meetings; I'm not sure why.)

So when I first heard that the 2020 summit was going virtual, I was like ho-hum ... another endless series of Zoom presentations. And it got me to thinking, what was going to be lost? The sales-oriented participants will of course miss glad-handing prospective customers. But what about engineers like me?

We'll miss getting together with our smart friends and colleagues that we haven't seen for a while, and maybe being introduced to their smart friends and colleagues for interesting conversation, war stories, and maybe even the occasional brilliant technical insight. Not going to get any of that out of a Zoom presentation.

But Peter's video suggests that he was fully aware of that drawback, and he is experimenting with approaches to address it. I'm skeptical that I will be blown away by it, but I'm intrigued enough to give it a go. And whether it turns out revolutionary, or just another chat room, I respect Peter for effort.

Plus, I just plain like the ad. :-)

UPDATE: June 3

At the end of day 2, I can safely say that the STAC Live event was well done. The speakers did a good job of dealing with the new format, and there were very few technical glitches. The system seemed to scale well.

In particular, it is nice to be able to replay sessions that you missed (or just couldn't pay attention to) the same afternoon. This is a clear *advantage* to this format as compared to a traditional in-person conference.

The technical content was also very good, but that's not what I want to talk about.

I'm a little sad to report that the social aspect turned out worse than I had hoped. It's not STAC's fault, nor the fault of Ubivent, the technology behind the virtual event. It's the fault of human nature.

When I travel to a venue to attend a function, I am engaged with the event. Yes, I sometimes check email, and will even sometimes hide in a corner to do an important call. But >90% of my time is spent being fully engaged with the people, both in and out of a session. So conversations happen naturally and productively. You get introduced to new people with similar interests. I made a very good friend at a conference a few years ago, just because I was there.

During STAC Live, the chat feature was there, but wasn't used very much. And like all chats, I might type something, and minutes would go by before the response came back. Why? BECAUSE WE'RE ALL BUSY, THAT'S WHY! During an in-person event, we are mostly engaged with our surroundings, but when attending remotely, we are balancing the virtual event with work. And work usually wins. And working from home means there are also family members and pets who need attention.

So even if the event technology were somehow more conducive to socializing, the attendees were not.

So, overall STAC was a good event, better than it had any right to be given the circumstances. And I've already learned some things that I plan to put into practice, so it's been well worth the time. But it being a virtual event does detract something meaningful. And I miss it.

Friday, May 1, 2020

Wireshark: "!=" May have unexpected results

How is it that I haven't sung the praises of Wireshark in this blog? It's one of my favorite tools of all time! I can't get over how powerful it has become.

Occasionally, there are little gotchas. Like when using the "!=" operator, a yellow warning flashes a few times saying:

"!=" may have unexpected results (See the User's Guide)

My first thought: "There's a User's Guide?" My second: "Couldn't you give me a link to it?"

OK, Fine. Here's a direct link to the section "A Common Mistake with !=". The problem is related to "combined expressions", like eth.addr, ip.addr, tcp.port, and udp.port. Those expressions do an implicit "or"ing of the corresponding source and destination fields. E.g. saying tcp.port==12000 is shorthand for:
(tcp.srcport==12000 || tcp.dstport==12000)

So if you want the opposite of tcp.port==12000, it seems logical to use tcp.port!=12000, thinking that it will eliminate all packets with either source or destination equal to 12000. But you're wrong. It expands to:
(tcp.srcport!=12000 || tcp.dstport!=12000)
which, after a little boolean algebra, is the same as:
!(tcp.srcport==12000 && tcp.dstport==12000)
In other words, instead of eliminating packets with either source or destination of 12000, it only eliminates packets with both source and destination of 12000. You'll get way more packets than you wanted.

The solution? Use:
!(tcp.port==12000)
or better yet, expand it out:
tcp.srcport!=12000 && tcp.dstport!=12000
I prefer that.
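
If you don't trust the boolean algebra, here's a small Python sketch that models it. This is not Wireshark code: each "packet" is just a made-up (srcport, dstport) pair, and the three functions are my stand-ins for the three filters above.

```python
# Toy model: each packet is just a (srcport, dstport) pair.
packets = [
    (12000, 12000),  # both ports are 12000
    (12000, 55555),  # only the source is 12000
    (55555, 12000),  # only the destination is 12000
    (55555, 44444),  # neither port is 12000
]


def naive_not_equal(src, dst):
    # Models tcp.port != 12000, which expands to an "or" of the two fields.
    return src != 12000 or dst != 12000


def negated_equal(src, dst):
    # Models !(tcp.port == 12000).
    return not (src == 12000 or dst == 12000)


def expanded(src, dst):
    # Models tcp.srcport != 12000 && tcp.dstport != 12000.
    return src != 12000 and dst != 12000


naive_kept = [p for p in packets if naive_not_equal(*p)]
negated_kept = [p for p in packets if negated_equal(*p)]
expanded_kept = [p for p in packets if expanded(*p)]

print(naive_kept)
print(negated_kept)
print(expanded_kept)
```

The naive filter only drops the first packet (both ports 12000), while the other two keep only the last packet (neither port 12000).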

What's So Special About "!="?

It got me to thinking about other forms of inequality, like ">" or "<". Let's say you want to display packets where source or destination (or both) are an ephemeral port:
tcp.port > 49152
Now let's say you want the opposite - packets where *neither* port is ephemeral. If you do this:
tcp.port <= 49152
you have the same problem (but this time Wireshark doesn't warn you). This displays all packets where source or destination (or both) are non-ephemeral.

At the root of all this is the concept of "opposite". With:
a compare b
you think the opposite is:
a opposite_compare b
But given the existence of combined expressions, it's usually safer to use:
!(a compare b)
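
The same idea as a Python sketch, assuming a combined expression behaves like "any underlying field matches". The combined() helper and the sample packet are hypothetical, and the threshold is just the number used in the filters above.

```python
import operator

THRESHOLD = 49152  # same number used in the filters above


def combined(compare, fields, value):
    # Models a combined expression like tcp.port: true if ANY
    # underlying field (source or destination) satisfies the comparison.
    return any(compare(f, value) for f in fields)


# Hypothetical packet: a client on a high port talking to a web server.
ports = (50000, 80)  # (srcport, dstport)

matches = combined(operator.gt, ports, THRESHOLD)           # tcp.port > 49152
opposite_compare = combined(operator.le, ports, THRESHOLD)  # tcp.port <= 49152
negated = not combined(operator.gt, ports, THRESHOLD)       # !(tcp.port > 49152)

print(matches, opposite_compare, negated)
```

The packet matches both the original filter and its "opposite" comparison, because each one only needs a single port to qualify. Only the negated form actually excludes it.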

Monday, April 20, 2020

Roadmap to Reopening

Vi Hart posted an important video: https://www.youtube.com/watch?v=HhRQxk9QA-o

Please watch it.

Full report:
https://ethics.harvard.edu/files/center-for-ethics/files/roadmaptopandemicresilience_final_0.pdf

Lots more info about it: https://www.pandemictesting.org/

Monday, April 6, 2020

Queen Elizabeth speaks

https://www.nytimes.com/video/world/europe/100000007072121/queen-address.html

(You have to click the little "speaker" in the lower right corner to turn on the audio.)

That's one classy lady.

Thank you, your Majesty, for your words of encouragement.

[edit 25-Aug-2020] I just re-watched this. I don't imagine that the British, as a people, are somehow more decent than the US, and even the political leadership in the UK can be just as nasty. But dammit, why can't even one US leader be this inspiring?

And yes, I already know the answer. The Queen doesn't have to worry about being re-elected, and can therefore side-step the politics. Elected officials *always* have to include the political angle in everything they say, especially on subjects that, rightly or wrongly, have become politicized.

But just because I understand it doesn't mean I have to like it.

Once again, thank you.

Steve

Monday, March 9, 2020

Don't Be a Spreader

Max Brooks PSA about the Coronavirus

https://www.youtube.com/watch?v=DthdYTLiqcI

Friday, February 28, 2020

New animated short by Ivan Maximov

Sorry, no computer software today.

I see that Ivan Maximov, one of my favorite animators, has finally released his latest work:

"Lonely Monster Goes Out"

I really love Ivan's gentle style. I also like how, instead of being intensely plot-driven, most of his stories are small explorations of a fantasy world, with an over-arching plot added for cohesiveness. I also like to believe that he is preaching tolerance of differences between good people, although I might just be projecting my own interpretations. (-:

Some of my other favorites:

Or binge his whole channel. (And also his site for some things not on his YouTube channel.)

I don't understand why he isn't more popular, although it might be related to his low output rate.

Sunday, January 12, 2020

Return of the Wiki

A while ago I migrated from a traditional hosting service to GitHub. One thing I lost in the process was access to a MediaWiki installation for a personal Wiki. I like Wikis, and I've missed it ever since.

That is, until I finally looked a little closer at GitHub and saw that they support Wikis! DOH! So now I'm experimenting with it and so far I like it (still very early days). Each Wiki is stored in its own git repo and can be cloned and worked on off-line.

One problem is that they haven't integrated the Wiki repos with the general GitHub software very well, so the GitHub Desktop GUI application can't deal with it. The doc says it does, but the doc lies. You have to use command-line git.

But that's not a big problem. One of the reasons I like Wikis is the low barrier to updating. And having to edit files, write them out, and then do the git commands to update is *not* low-barrier. So my usage will be almost exclusively via the web.

But it's nice to know that if my Wiki grows large, I can run sed scripts on the files to make global changes if I want.

Anyway, THANKS GitHub!

Matias keyboard FAIL

Like many Mac owners, I don't like the chiclet keyboards. I like an old-fashioned mechanical keyboard. So I finally got one: a very clicky Matias keyboard. And while I would have preferred a smaller keyboard (I don't need the keypad), I was very happy with it. For about a month.

Then the spacebar started "bouncing". I.e. maybe 1 time in 20 I get two spaces instead of one. I experimented quite a bit, and it's not the autorepeat, and it's not *me* that's bouncing. I can slowly press the key, and bam - two spaces. Or I can type at my normal (pretty fast) speed and get periodic bounces. (Sometimes I can go an hour without a bounce, but then they come back.)

So I contacted Matias support, and after leading me through a series of unsuccessful steps, they sent me a new one. It arrived with NO BOUNCE! Happy-happy.

For about a month. Now the bounces are back.

I know baseball says 3 strikes are an "out", but I'm only giving them 2. The Matias is going in the trash as soon as my new Keychron K2 (with blue switches) gets here.

Wish me luck!

Update: It's been over 3 months now, and I am VERY happy with the Keychron K2!

My only complaint is that the Bluetooth times out even if I use an external power supply. Timing out Bluetooth is a good thing when running on battery, but is unnecessary when it has external power. (NOTE: there is a simple solution: don't use Bluetooth. The same cable I'm connecting to an external power supply can simply be plugged into my laptop, at which point it's a wired keyboard. But I have some reasons for not wanting to do that. Oh well, I'm still very happy with the K2.)

Update 2: It's been over 7 months now, and the Keychron K2 is still going strong! It's a pleasure to use.
