Wednesday, January 5, 2022

Bash Process Substitution

I generally don't like surprises. I'm not a surprise kind of guy. If you decide you don't like me and want to make me feel miserable, just throw me a surprise party.

But there is one kind of surprise that I REALLY like. It's learning something new ... the sort of thing that makes you say, "how did I not learn this years ago???"

Let's say you want the standard output of one command to serve as the input to another command. On day one, a Unix shell beginner might use file redirection:

$ ls >ls_output.tmp
$ grep myfile <ls_output.tmp
$ rm ls_output.tmp

On day two, they will learn about the pipe:

$ ls | grep myfile

This is more concise, doesn't leave garbage, and runs faster.

But what about cases where the second program doesn't take its input from STDIN? For example, let's say you have two directories with very similar lists of files, but you want to know if there are any files in one that aren't in the other.

$ ls -1 dir1 >dir1_output.tmp
$ ls -1 dir2 >dir2_output.tmp
$ diff dir1_output.tmp dir2_output.tmp
$ rm dir[12]_output.tmp

So much for conciseness, garbage, and speed.

But, today I learned about Process Substitution:

$ diff <(ls -1 dir1) <(ls -1 dir2)

This basically creates two pipes, gives them names, and passes the pipe names as command-line parameters of the diff command. I HAVE WANTED THIS FOR DECADES!!!
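The same trick works with any command that expects file-name arguments. For instance, to directly answer the "files in one directory but not the other" question, here's a sketch using comm (the directory and file names are made up; comm needs sorted input, which "ls -1" provides):

```shell
# Set up two directories that differ by one file:
mkdir -p dir1 dir2
touch dir1/common.txt dir1/only_in_1.txt dir2/common.txt

# comm -23 suppresses column 2 (lines unique to the second input)
# and column 3 (lines common to both), leaving only lines unique
# to the first input.
comm -23 <(ls -1 dir1) <(ls -1 dir2)
# prints: only_in_1.txt
```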

And just for fun, let's see what those named pipes are named:

$ echo <(ls -1 dir1) <(ls -1 dir2)
/dev/fd/63 /dev/fd/62


(Note that echo doesn't actually read the pipes.)


The "cmda <(cmdb)" construct is for cmda getting its input from the output of cmdb. What about the other way around? I.e., what if cmda wants to write its output, not to STDOUT, but to a named file, and you want that output to be the standard input of cmdb? I'm having trouble thinking of a useful example, but here's a not-useful one:

cp file1 >(grep xyz)

I say this isn't useful because why use the "cp" command? Why not:

cat file1 | grep xyz

Or better yet:

grep xyz file1

Most shell commands write their primary output to STDOUT. I can think of some examples that don't, like giving an output file to tcpdump, or the object code out of gcc, but I can't imagine wanting to pipe that into another command.

If you can think of a good use case, let me know.
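For what it's worth, one place the ">(cmd)" form does earn its keep is with tee. tee writes only to named files, so process substitution lets a single stream fan out into several pipelines at once. A sketch with made-up file names:

```shell
# tee copies its stdin to every file argument; >() turns each
# "file" into a pipeline.  One listing feeds two consumers here:
printf 'a.txt\nb.log\nc.txt\n' |
  tee >(grep '\.txt$' > txt_only.tmp) > all_files.tmp
sleep 1   # the >() pipeline runs asynchronously; give it a moment
cat txt_only.tmp
# prints: a.txt
#         c.txt
```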


Here's something that I have occasionally wanted to do. Pipe a command's STDOUT to one command, and STDERR to a different command. Here's a contrived non-pipe example:

process_foo 2>err.tmp | format_foo >foo.txt
alert_operator <err.tmp
rm err.tmp

You could re-write this as:

process_foo > >(format_foo >foo.txt) 2> >(alert_operator)

Note the space between the two ">" characters - this is needed. Without the space, ">>" is treated as the append redirection.

Sorry for the contrived example. I know I've wanted this a few times in the past, but I can't remember why.
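Here's a small runnable version of the idea, with throwaway names, that shows the two streams landing in different places:

```shell
# gen writes one line to stdout and one to stderr:
gen() { echo out_line; echo err_line >&2; }

# stdout flows into tr (uppercased), stderr into a plain file:
gen > >(tr 'a-z' 'A-Z' > out.tmp) 2> >(cat > err.tmp)
sleep 1   # the >() pipelines run asynchronously; give them a moment

cat out.tmp   # prints: OUT_LINE
cat err.tmp   # prints: err_line
```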

And for completeness, you can also redirect STDIN:

cat < <(echo hi)

But this is the same as:

echo hi | cat

I can't think of a good use for the "< <(cmd)" construct. Let me know if you can.
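One candidate use case: a pipe runs the downstream command in a subshell, so variables set inside a "while read" loop are lost when the loop ends. Redirecting from a process substitution keeps the loop in the current shell. A bash-specific sketch:

```shell
# With a pipe, the while loop runs in a subshell; count stays 0:
count=0
printf 'a\nb\nc\n' | while read -r line; do count=$((count + 1)); done
echo "$count"   # prints: 0

# With < <( ), the loop runs in the current shell; count survives:
count=0
while read -r line; do count=$((count + 1)); done < <(printf 'a\nb\nc\n')
echo "$count"   # prints: 3
```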

Tuesday, November 9, 2021

Keychron K2 Keyboard

 I mentioned buying a Keychron K2 keyboard almost two years ago, but that post was primarily about a different vendor's keyboard which was a fail.

I just bought a second Keychron K2 keyboard ("blue" switches), but not because of a problem. It's because the keyboard is wonderful, and I want a second one to keep at an alternate worksite.

The laptop keyboard on the 2017-vintage Macbook Pro is almost unusable. Really, even the 2015-vintage Air's laptop keyboard is not that great. I prefer a full-stroke "clicky" keyboard with good tactile feedback.

Enter the Keychron K2. Lots of nice features that I won't bother listing since I don't use most of them and the site explains them fine. I like the noisy "blue" style switches, but you can get quieter ones.

Also, it has all the Mac-specific special keys, right there (I don't like the touch-bar at the top of the Macbooks.) And I also like the compact size.

My only complaint is that the key caps are not dual-injected, which means that the paint can wear off the tops of the frequently-used keys (E and A). But this is a problem for most keyboards I use; I apparently have a heavy typing hand.

Count me as a satisfied customer.

Thursday, August 5, 2021

Timing Short Durations

I don't have time for a long post (HA!), but I wanted to add a pointer to "nstm" ("Nano Second Timer"). It's a small repo that provides a nanosecond-precision time stamp portably between MacOS, Linux, and Windows.

Note that I said precision, not resolution. I don't know of an API on Windows that gives nanosecond resolution. The one Microsoft says you should use (QueryPerformanceCounter()) always returns "00" as the last two decimal digits, i.e. it is 100-nanosecond resolution. They warn against using "rdtsc" directly, although I wonder if most of their arguments are no longer applicable. I would love to hear if anybody knows of a Windows method of getting nanosecond-resolution timestamps that is reliable and efficient.

One way to measure a short duration "thing" is to time doing the "thing" a million times (or whatever) and take an average. One advantage of this approach is that taking a timestamp itself takes time; i.e. making the measurement changes the thing you are measuring. So amortizing that cost over many iterations minimizes its influence.

But sometimes, you just need to directly measure short things. Like if you are histogramming them to get the distribution of variations (jitter).

I put some results here:

Friday, July 9, 2021

More Perl "grep" performance

In an earlier post, I discovered that a simple Perl program can outperform grep by about double. Today I discovered that some patterns can cause the execution time to balloon tremendously.

I have a new big log file, this time with about 70 million lines. I'm running it on my newly-updated Mac, whose "time" command has slightly different output.

Let's start with this:

time grep 'asdf' cetasfit05.txt
... 39.388 total

time perl -ne 'print if /asdf/' cetasfit05.txt
... 21.388 total

About twice as fast.

Now let's change the pattern:

time grep 'XBT|XBM' cetasfit05.txt
... 24.787 total

time perl -ne 'print if /XBT|XBM/' cetasfit05.txt
... 18.940 total

Still faster, but nowhere near twice as fast. I don't know why.

Now let's add an anchor:

time grep '^XBT|^XBM' cetasfit05.txt
... 25.580 total

time perl -ne 'print if /^XBT|^XBM/' cetasfit05.txt
... 3:08.25 total

WHOA! Perl, what happened????? 3 MINUTES???

My only explanation is that Perl implements a very general regular expression engine, while grep implements a subset, and that generality might make Perl slow in some circumstances. For example, maybe the use of alternation with anchors introduces the need for "backtracking" under some circumstances, and maybe grep doesn't support backtracking. In this simple example, backtracking is probably not necessary, but to be general, Perl might do it "just in case". (Note: I'm not a regular expression expert and don't really know when "backtracking" is needed; I'm speculating without bothering to learn about it.)

Anyway, let's make a small adjustment:

time perl -ne 'print if /^(XBT|XBM)/' cetasfit05.txt
... 17.910 total

There, that got back to "normal".

I guess multiple anchors in a pattern are a bad idea.

P.S. - even though this post is about Perl, I tried one more test with grep:

time grep 'ASDF' cetasfit05.txt
... 26.132 total

Whaaa...? I tried multiple times, and lower-case 'asdf' always takes about 40 seconds, and upper-case 'ASDF' always takes about 27 seconds. I DON'T UNDERSTAND COMPUTERS!!! (sob)

Wednesday, March 17, 2021

Investment Advice from an Engineer

 I have some financial advice regarding investing in stocks. But be aware that, while I am pretty wealthy, the vast majority of my wealth came from salary from a good-paying career. You will *NOT* get rich off the stock market by following my advice.


Put money every month into a U.S. exchange-traded fund that covers a very broad market. Like SPDR. (I prefer VTI because it is even broader and has lower fees, but the differences aren't really that big.) The goal is to keep buying and never selling, every working month of your life until you retire. (I don't have retirement advice yet.)

If the market starts going down, do *NOT* stop buying. In fact, if you can afford it, put more in. Every time the market goes down, put more in. The market will go back up, don't worry. 

The same cannot be said for an individual stock -- sometimes a company's stock will dive down and then stay down, basically forever. But the market as a whole won't do that. A dive might take a few days to recover, or might take a few years to recover. But it will recover. DON'T sell a broad fund if the market is going down. BUY it.


No. It will give you the highest probability of a good return. Back when I was a kid, I was told to put money into a bank savings account. That was poor advice then, and is terrible advice now with interest rates close to zero. Putting money into a guaranteed bank account is guaranteed to underperform inflation. I.e. it will slowly but surely lose value.

Instead, tie your money to the overall market. The broad market has its ups and downs, but if you stick with it for a while, the overall trend is higher than inflation. 


Well, you could buy a lottery ticket. That will get you rich the quickest. Assuming you win. Which you won't. Or you could go to Vegas and put a quarter into a slot machine.

But you're smarter than that. You know the chances of getting rich off of lotteries or gambling are too low. You want something that has some risk, but which will probably make you rich. And you're thinking the stock market.

Your best bet? Find a company currently trading for a few dollars a share, but is definitely for sure going to be the next Apple or Microsoft. One tiny problem: if it is definitely for sure going to be the next Apple or Microsoft, the stock price won't be a few dollars a share. It will be priced as if it already IS the next Apple or Microsoft. This is because there are tens of thousands of smart traders out there who are spending a HELL of a lot more time than you are trying to find the next big company. For a "sure thing" company that is already publicly traded, those tens of thousands have already bid up the price.

The only real chance for you to get in on the ground floor is to find a garage shop start-up, and invest in them. I have a rich friend who has made a lot of money doing exactly that. For every ten companies he invests in, nine go bankrupt within 5 years. And one goes off like a rocket.

That's how you do it. And unfortunately, you have to start out pretty rich to make this work. And you need to research the hell out of startups, and have a good crystal ball.

The only other way to do it is to find a company that the tens of thousands of smart traders think will NOT be a big deal, but you believe something they don't. Microsoft was publicly traded at a low price for many years. The tens of thousands of smart traders in the 70s didn't think home computers would become a thing. And the few people who believed otherwise became rich.

The problem is that belief usually doesn't work very well at predicting success.

Look at BitCoin. I know several people who have made huge amounts of money on BitCoin. They did their initial investments based on a belief. A belief that BitCoin would become a real currency, just like dollars and euros; that people would use BitCoin every day to buy and sell everything from gasoline to chewing gum. They looked at the theories of money and the principles of distributed control, and thought it would definitely for sure replace national currencies.

Those friends of mine were wrong. They made money for a completely different reason: speculators. Speculators buy things that are cheap and that they think will go up in price. If enough speculators find the same thing to buy, the price *does* go up. And more speculators jump in. BitCoin is a big speculative bubble, with almost no intrinsic value. (And yes, I know BitCoin is more complicated than that. But I stand by my 10,000-foot summary.)

Now don't get me wrong. Successful speculators DO become rich. Who am I to argue with success? But getting rich off of speculation is all about timing. You have to find the next big thing before most of the other speculators do, and then jump back out *before* it has run its course. Will BitCoin come crashing back down? Not necessarily. If enough people simply *believe* in it, it will retain its value. My own suspicion is that it will eventually crash but what do I know? I thought it would have crashed by now.

That's also what happened with GameStop. A group of Reddit-based speculators decided to pump up the price of a company. If you were in on it from the start, you probably made a ton of money. But once it hit the news, it was too late for you to get rich off of it. The best you could hope for was to make a little money and then get out FAST. But most people who jumped into GameStop after it had already made the news ended up losing money.

(BTW, "pump-and-dump" is against the law. I will be interested to find out if any of the Reddit-based traders get in trouble.)

Anyway, I know of many people who have taken a chance on a stock, and made some money. But they didn't get rich. And if they keep trying to replicate their early success, they'll end up winning some and losing some. And if they're smart, and work hard at it, they may out-perform the overall market in the long run. But remember - those tens of thousands of smart traders are also trying to out-perform the overall market. For you to do it repeatedly for many years probably requires expertise that those smart traders don't have. And you don't get rich quick this way, you just make some good money.


(shrugs) Everybody needs a hobby. I have a friend who goes to Vegas once a year. Sometimes he comes back negative, sometimes positive. He has absolutely no illusion that he will get rich in Vegas. He assumes he will either make a little or lose a little. And he has fun doing it. There's nothing wrong with spending money to have fun.

If you play the stock market as a game, where you aren't risking your financial future, then more power to you. But I knew one person who had to stop for his own emotional well-being. He started feeling bad every time he lost some money because he should have invested less, but also felt bad when he made money because he should have invested more. Overall he made money, but he had so much anxiety doing it that he decided it wasn't worth it.

Sunday, March 14, 2021

Circuit simulation

 I've been playing with designing simple digital circuits this weekend. Since my breadboards are not with me at the moment, I decided to look for circuit simulators.

Here's a nice comparison of several:

Before I found that comparison site, I tried out CircuitLab and even threw them money for a month's worth. And I can say that I've gotten that much worth of enjoyment out of my tinkering this weekend, so money well-spent. But I knew that I didn't want to keep shelling out every month (I don't do digital design that much), and there was no way to export the circuits in a way that I could save them. So I kept looking.

Here's my CircuitLab home:

I haven't tried all the choices in the "top ten" list, but I did try the "CircuitJS1" simulator maintained by Paul Falstad and Iain Sharp. It isn't quite as nice as CircuitLab, but it's hard to argue with free, especially given my infrequency of use.

CircuitJS1 doesn't host users' designs. In fact, they don't integrate well with any form of storage. You can save your design to a local file, but the simulator doesn't do a good job of remembering file names. It presents you with a link containing a file name of the form "circuit-YYYYMMDD-HHMM.circuitjs.txt". You can save the linked contents with your own file name, but the next time you go to save, it obviously won't remember that name since it was a browser operation. All of this will make it a little inconvenient and perhaps error-prone to manage different projects. If I were doing a lot of hardware work, I would probably choose something else. But for occasional fiddling, this is fine.

Here's a simple state machine that checks even/odd parity of an input bit stream:

If I want to make anything public, I'll make them as github projects.

Friday, January 8, 2021

Racism in America

 As my readers have no doubt noticed (all 2 of you!), I keep this blog pretty technical, without a lot of politics. And I intend to keep it that way ... for the most part. But occasionally I will let my politics peek out.

Yeah, you're expecting me to talk about the events in Washington DC in January, 2021. I might post about that some day, but I'm nowhere near ready yet.

No, I'm going to talk about a class I took last fall.

These are left-leaning classes that not only teach history, they also encourage and facilitate activism. Their focus is on racism, but touch on other "isms" as well. I learned a heck of a lot of history that wasn't covered very well back when I went to high school. The material is well-researched and well-sourced. I consider myself a better person for having participated.

The bottom line is that it isn't enough to be "not racist". We have to be "anti-racist".

The classes are not cheap. As of this writing, they are $200 a pop. (And worth it, in my humble opinion.) That said, the class organizers don't want cost to be a barrier to participation, and are willing to make adjustments. Plus, I am willing to kick in $100 for anybody who comes to them from my recommendation. Tell 'em fordsfords sent ya. (-:

Anybody who wants more "informal" information on the classes, send me an email.


Sunday, November 29, 2020

Using sed "in place" (gnu vs bsd)

 I'm not crazy after all!

Well, ok, I guess figuring out a difference between gnu sed and bsd sed is not a sign of sanity.

I use sed a fair amount in my shell scripts. Recently, I've been using "-i" a lot to edit files "in-place". The "-i" option takes a value which is interpreted as a file name suffix to save the pre-edited form of the file. You know, in case you mess up your sed commands, you can get back your original file.

But for a lot of applications, the file being edited is itself generated, so there is no need to save a backup. So just pass a null string in as the suffix. No problem, right?

[ update: useful page: ]


GNU SED (Linux and Cygwin)

echo "x" >x
sed -i '' -e "s/x/y/" x
sed: can't read : No such file or directory

Hmm ... that's odd. It's trying to interpret that null string as a file name, not the value for the "-i" option. Maybe it doesn't like that space between the option and the value.

echo "x" >x
sed -i'' -e "s/x/y/" x

There. It worked. I'm generally in the habit of using a space between the option and the value, but oh well. Learn something new every day...

BSD SED (FreeBSD and Mac)

echo "x" >x
sed -i'' -e "s/x/y/" x
ls x*
x    x-e

Hey, what's that "x-e" file? Oh, it IGNORED the empty string and interpreted "-e" as the suffix! Put the space back in:

echo "x" >x
sed -i '' -e "s/x/y/" x

Works. No "x-e" file.


I use both Mac and Linux, and want scripts that work on both!



Go ahead and always generate a backup file. And don't use a space between the option and the value. This works on both:

echo "x" >x
sed -i.bak -e "s/x/y/" x
rm x.bak

Works on Mac and Linux.
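If you do this often, the workaround is easy to wrap in a little function (sed_inplace is a made-up name, not a standard command):

```shell
# Portable in-place edit: always make a backup, then delete it.
# "-i.bak" (no space) is accepted by both GNU sed and BSD sed.
sed_inplace() {
    script="$1"; file="$2"
    sed -i.bak -e "$script" "$file" && rm -f "$file.bak"
}

echo "x" > demo.txt
sed_inplace "s/x/y/" demo.txt
cat demo.txt   # prints: y
```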

IT TOOK ME A LONG TIME TO FIGURE ALL THIS OUT!!! Part of the reason it took so long is that the cases that don't work as intended tend to basically work. For example, in the first Linux case, sed tried to interpret '' as a file name and printed an error, but then it went on to the actual file and processed it correctly. The command did what it was supposed to do, but it printed an error. In the BSD case, sed created a backup file using "-e" as the suffix, but it went ahead and interpreted the sed command string as a command string and properly processed the file. In both cases, the end goal was accomplished, but with unintended side effects.

Corner cases: the bane of programmers everywhere.

Friday, November 27, 2020

Sometimes you need eval

 The Unix shell usually does a good job of doing what you expect it to do. Writing shell scripts is usually pretty straight-forward. Yes, sometimes you can go crazy quoting special characters, but for most simple file maintenance, it's not too bad.

I *think* I've used the "eval" function before today, but I can't remember why. I am confident that I haven't used it more than twice, if that many. But today I was doing something that seemed like it shouldn't be too hard, but I don't think you can do it without "eval".


I want to use "rsync" to synchronize some source files between hosts. But I don't want to transfer object files. So my rsync command looks somewhat like this:

rsync -a --exclude "*.o" my_src_dir/ orion:my_src_dir

The double quotes around "*.o" are necessary because you don't want the shell to expand it; you want the actual string *.o to be passed to rsync, and rsync will do the file globbing. The double quotes prevent file glob expansion. And the shell strips the double quotes from the parameter. So what rsync sees is:

rsync -a --exclude *.o my_src_dir/ orion:my_src_dir

This is what rsync expects, so all is good.


For various reasons, I wanted to be able to override that exclusion option. So I tried this:

EXCL='--exclude *.o'  # default
... # code that might change EXCL
rsync -a $EXCL my_src_dir/ orion:my_src_dir

But this doesn't work right. The symbol "EXCL" will contain the string "--exclude *.o", but when the shell substitutes it into the rsync line, it then performs file globbing, and the "*.o" gets expanded to a list of files. For example, rsync might see:

rsync -a --exclude a.o b.o c.o my_src_dir/ orion:my_src_dir

The "--exclude" option only expects a single file specification.


So maybe I can enclose $EXCL in double quotes:

rsync -a "$EXCL" my_src_dir/ orion:my_src_dir

This passes "--exclude *.o" as a *single* parameter. But rsync expects "--exclude" and the file spec to be two parameters, so it doesn't work either.


Finally, maybe I can force quotes inside the EXCL symbol:

EXCL='--exclude "*.o"'  # default
... # code that might change EXCL
rsync -a $EXCL my_src_dir/ orion:my_src_dir

This almost works, but what rsync sees is:

rsync -a --exclude "*.o" my_src_dir/ orion:my_src_dir

It thinks the double quotes are part of the file name, so it won't exclude the intended files.


The solution is to use eval:

EXCL='--exclude "*.o"'  # default
... # code that might change EXCL
eval "rsync -a $EXCL my_src_dir/ orion:my_src_dir"

The shell does symbol substitution, so this is what eval sees:

rsync -a --exclude "*.o" my_src_dir/ orion:my_src_dir

And eval will re-process that string, including stripping the double quotes, so this is what rsync sees:

rsync -a --exclude *.o my_src_dir/ orion:my_src_dir

which is exactly correct.

P.S. - if anybody knows of a better way to do this, let me know!

EDIT: The great Sahir (one of my favorite engineers) pointed out a shell feature that I didn't know about:

Did you consider setting noglob? It will prevent the shell from expanding '*'. Something like:

    EXCL='--exclude *.o' # default
    set -o noglob
    rsync -a $EXCL my_src_dir/ orion:my_src_dir
    set +o noglob

I absolutely did not know about noglob! In some ways, I like it better. The goal is to pass the actual star character as a parameter, and symbol substitution is getting in the way. Explicitly setting noglob says, "hey shell, I want to pass a string without you globbing it up." I like code that says exactly what you mean.

In contrast, my "eval" solution works fine, but the code does not make plain what my purpose was. I would need a comment that says, "using eval to prevent the shell from doing file substitution in the parameter string." And while that's fine, I much prefer code that better documents itself.

One limitation of using noglob is that you might have a command line where you want parts of it not globbed, but other parts globbed. The noglob basically operates on a full line. So you would need to do some additional string building magic to get the right things to be done at the right time. But the same thing would be true if you were using eval. Bottom line: the shell was made powerful and flexible, but powerful flexible things tend to have subtle corner cases that must be handled in non-obvious ways. No matter what, a comment might be nice.
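For completeness, bash arrays give a third way to attack the original rsync problem: quoted "${arr[@]}" expansion produces one word per element and does not re-glob them. A sketch (the rsync line is shown but not run; the printf just displays the words rsync would see):

```shell
EXCL=(--exclude '*.o')   # the glob stays literal inside the array

# Quoted array expansion yields exactly two words, glob intact,
# even if *.o files exist in the current directory:
touch a.o b.o
printf '[%s]\n' "${EXCL[@]}"
# prints: [--exclude]
#         [*.o]

# The real command would then be:
#   rsync -a "${EXCL[@]}" my_src_dir/ orion:my_src_dir
```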

Tuesday, November 17, 2020

Ok, I guess I like Grammarly (grumble, grumble)

Ok, I grudgingly admit that I like Grammarly.

My complaints still hold:

[UPDATE: these are all fixed now]

  1. Mac users are second-class citizens. Mac Word integration has the file size limit, and there is no Mac outlook integration. [UPDATE: Mac integration is now good]
  2. Their desktop tool won't edit a locally-stored text file. You have to do cutting and pasting. [UPDATE: it integrates well with TextEdit. But not vim.]
  3. The file size limit is too small for serious work. Yes, you can do cutting and pasting again, but really? In 2020? [UPDATE: it now operates on large files]

The grumpy old man in me really wants to mumble something about snot-nosed little kids and go back to a typewriter and liquid paper.

But ... well ... I do have some bad writing habits.

Mostly I sometimes write unnecessarily complicated sentences, including useless phrases that I must have learned because they sound intellectual. It's a little humbling to have it pointed out over and over, but the result of more concise writing is worth it.

Mind you, there are many MANY times that I click the trash can because I don't like Grammarly's suggestions. In much of my technical writing, I use passive voice because active is too awkward. I also deviate from the standard practice of including punctuation inside quotes, especially when the quotes are not enclosing an actual quotation, but instead are calling out or highlighting a technical term, like a variable name. If I tell you to enter "ls xyz," and you type the comma, it won't work. You have to enter "ls xyz". I also sometimes include a comma that Grammarly thinks is not needed, but I think it helps separate two ideas.

Also, Grammarly isn't good at large-scale organization of content, which can have a MUCH greater effect on clarity than a few superfluous words.

In other words, *real* editors don't have to worry about being replaced by AIs for quite a while.

And yet ... and yet ... even with its limited ability to understand what I'm trying to say, it is still improving my writing. In small ways, perhaps. But improvement is improvement.

So yeah, I'll keep paying them money (grumble, grumble).

Friday, October 30, 2020

Software Sucks

 Sorry, I had to say it. Software really does suck.

We just installed a new CentOS, and I wanted to do some apache work. I don't do that kind of thing very often, so I don't just remember how to do it. Thank goodness for search engines!

Do a quick google for "apache shutdown", which led me to a page that tells me to do an "apachectl -k graceful-stop". Cool. Enter that command.

Passing arguments to httpd using apachectl is no longer supported.
You can only start/stop/restart httpd using this script.
If you want to pass extra arguments to httpd, edit the
/etc/sysconfig/httpd config file.

Um ... stopping httpd is exactly what I was trying to do. So I guessed that 2.4 must be old doc. Rather than trying to find new doc, I just entered

apachectl -h

It responded with:

Usage: /usr/sbin/httpd [-D name] [-d directory] [-f file]
                       [-C "directive"] [-c "directive"]
                       [-k start|restart|graceful|graceful-stop|stop]
                       [-v] [-V] [-h] [-l] [-L] [-t] [-T] [-S] [-X]

There's the "-k graceful-stop" all right. What's the problem? Well, except of course, for the stupid fact that the Usage line claims the command is "httpd", not "apachectl". Some newbie must have written the help screen for apachectl.

Another search for "Passing arguments to httpd using apachectl is no longer supported" wasn't very helpful either, but did suggest "man apachectl". Which says:

When  acting in pass-through mode, apachectl can take all the arguments available for the httpd binary.
When acting in SysV init mode, apachectl takes simple, one-word commands, defined below.

How might I know which mode it's working in? Dunno. But a RedHat site gave an example of:

apachectl graceful

which matches the SysV mode. So apparently the right command is "apachectl graceful-stop" without the "-k". Which worked.

So why did "apachectl -h" give bad help? I think it just passed the "-h" to httpd (passthrough), so the help screen was printed by httpd. But shouldn't apachectl have complained about "-h"? GAH!

Software sucks.

Wednesday, October 14, 2020

Strace Buffer Display

 The "strace" tool is powerful and very useful. Recently a user of our software sent us an strace output that included a packet send. Here's an excerpt:

sendmsg(88, {msg_name(16)={sa_family=AF_INET, sin_port=htons(14400), sin_addr=inet_addr("")}, msg_iov(1)=[{"\2\0a\251C\27c;\0\0\2\322\0\0/\263\0\0\0\0\200\3\0\0", 24}], msg_controllen=0, msg_flags=0}, 0) = 24 <0.000076>

Obviously there's some binary bytes being displayed. I see a "\0a", so it's probably hex. But wait, there's also a \251. Does that mean 0x25 followed by ascii '1'? I decoded it assuming hex, and the packet wasn't valid.

So I did a bit of Googling. Unfortunately, I didn't note where I saw it, but somebody somewhere said that it follows the C string conventions. And C strings come from long ago, when phones had wires connecting them to wall jacks, stack overflow was a bug in a recursive program, and octal ruled the waves when it came to specifying binary data.

So \0a is 0x00 followed by ascii 'a' and \251 is 0xa9. Now the packet parses out correctly. (It's a "Session Message" if you're curious.)

So, I guess I'm a little annoyed by that choice as I would prefer hex, but I guess there's nothing all that wrong with it. Hex or octal: either way I need a calculator to convert to decimal. (And yes, I'm sure some Real Programmers can convert in their heads.)
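If you want to sanity-check a decoding, the shell's printf understands the same C-style octal escapes, so you can replay a chunk of strace's string through od. A quick sketch using the bytes discussed above:

```shell
# printf interprets \2, \0, and \251 as octal escapes, just like
# strace's C-string notation; od then dumps the raw bytes in hex.
printf '\2\0a\251' | od -An -tx1
# prints: 02 00 61 a9   (\0 is NUL, 'a' is 0x61, \251 is 0xa9)
```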

Tuesday, August 25, 2020

I want to love Grammarly

UPDATE 2 (17-Nov-2021): Grammarly has released an update, and my first impression is THANK GOODNESS! They either listened to me, or more likely, they knew all along what their drawbacks were. Most of my complaints are now either resolved or at least better. It is now much more native to the Mac and works more cleanly with web pages, Mac's "text edit" app, and Office for Mac. Apparently, it leverages Apple's "accessibility" infrastructure somehow. But it doesn't work with MacVim or the Terminal app. Which doesn't surprise me. Also doesn't work with Teams, which maybe does surprise me a little.

Again, I'll try to write a proper review sometime.

EARLIER UPDATE: I've been using paid-for Grammarly for almost 3 months now, and despite my complaints, I have to say that I like it. I'll leave this post pretty much as I originally wrote it and write a follow-up post when I have time.

I've lived with a bit of a problem almost all my life. I'm a slow reader and a poor speller. I suspect I have a bit of a learning disability.

That's not my problem. If I really do have a learning disability, it is mild; my language abilities are not that far below average. And I'm an engineer, for goodness sake! I'm not expected to have a perfect command of English.

My problem is that I love to write. I've dabbled with fiction, humor, and non-fiction (this blog being a primary outlet). And I want the quality of my writing to be high.

When I had my first exposure to a spell-checker, I was ecstatic! Finally, a tool to save me huge amounts of time. When I first used one that suggested correct spellings, I thought I had died and gone to heaven. But their were still to many thymes that incorrect word choice, usually among homophones, led to mistakes not being caught. I knew that a grammar checker was needed to really get the spelling right.

Microsoft Word, for all the hate that is heaped on it, raised the bar. It catches many problems that "aspell" does not. Word is still not perfect, but it is *good*.

Enter Grammarly. There's a lot about it I really like. For example, instead of just showing you your mistake and offering suggestions, it explains why its a mistake (at least when using the Grammarly editor). I.e. it is both an error checker and a learning tool. I'm using the Grammarly editor to enter this blog post, and it found a couple of things that the Firefox checker did not flag.

... But ... did you notice my mistake in the previous paragraph? " explains why its a mistake." The "its" needs an apostrophe. Grammarly didn't catch it. Microsoft did. But neither one caught the "thymes" in the prior paragraph. (Interestingly, the Firefox checker does flag "thymes".) Grammar checking is still an inexact science. UPDATE: The latest version *does* catch the "its". But my point is that it won't be perfect.

But never let perfection be the enemy of good. And there's a lot of good in Grammarly. I really want to love it enough to pay the fee. Why don't I?

My biggest issue is the file size limitation, which does not grow with the paid-for version. I maintain the documentation for our product, and some of those files are pretty big. Way too big, it turns out. I would have to mess with splitting and recombining them. Never mind the annoyance of doing that; the recombining introduces more opportunities for mistakes. UPDATE: The file size limitation seems to be removed now. I can use TextEdit to edit very large files and it seems to work. But it doesn't remember the "errors" that I dismiss; again, that seems to be only in the Grammarly editor.

Also, I use a Mac, and Grammarly doesn't integrate with Outlook on Mac. And even the Mac Word plugin is size-limited, although it looks like maybe that limitation would be lifted if I were using Windows instead of Mac. UPDATE: it does now.

Also, it doesn't work well with local ".txt" files except through copy-and-paste. (It can read text files, but not write them.) UPDATE: it does now with TextEdit.

I'll probably pay for a month's worth just to see what the 11 extra suggestions are for this blog post. Maybe seeing them will change my mind. But I kind of doubt it.

EDIT (19-Sep-2020): I did go ahead and shell out for the pro version. So far it has provided a small improvement. Not sure it's worth the cost yet, but still early days. In most cases, it challenges me on something that probably does deserve a second thought but that I end up keeping as-is.

Sunday, July 12, 2020

Perl Diamond Operator

As my previous post indicates, I've done some Perl noodling this past week. (I can't believe that was my first Perl post on this blog! I've been a Perl fan for a loooooooong time.)

Anyway, one thing I like about Perl is that it takes a common use case and adds language support for it. Case in point: the diamond operator "<>" (also called "null filehandle" or "null angle operator").

See the "Tutorial" section below if you are not familiar with the diamond operator.


Tips

I may expand this as time goes on.

Filename / Linenumber

Inside the loop, you can use "$." as the line number and "$ARGV" as the file name of the currently open file.

*BUT*, see next tip.

Continue / Close

Always code your loop as follows:

while (<>) {
} continue {
  close ARGV if eof;
}

The continue clause is needed to have "$." refer to the line number within the *current* file. Without it "$." will refer to the total number of lines read so far.

In my opinion, even if what you want is total lines and not line within file, you should still code it like the above and just use your own counter for the total line number. This provides consistency of meaning for "$.". Plus, it's possible that in the future you will want to add functionality that requires line within file, and it's messy to code that with your own counter.
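The idiom above can be seen end-to-end in a self-contained sketch (the demo file names and contents are my invention): it creates two tiny files, then reads them with "<>", printing the per-file "$." alongside a manual running total.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Create two small demo files (names are made up for this sketch).
for my $f ('demo_a.txt', 'demo_b.txt') {
    open my $fh, '>', $f or die "can't write $f: $!";
    print $fh "alpha\nbeta\n";
    close $fh;
}
@ARGV = ('demo_a.txt', 'demo_b.txt');

my $total = 0;                 # manual counter for the running total
while (<>) {
    $total++;
    print "$ARGV:$. (total $total): $_";
} continue {
    close ARGV if eof;         # resets "$." at each end-of-file
}
unlink 'demo_a.txt', 'demo_b.txt';
```

The output shows "$." restarting at 1 for the second file while $total keeps climbing, which is exactly the distinction the continue/close idiom buys you.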

Skip Rest of File

Sometimes you get a little ways into a file and you decide that you're done with the file and would like to skip to the next (if any). Include this inside the loop:

close ARGV;  # skip rest of current file
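Here is that skip in context, as a self-contained sketch (the demo file name and the /END/ trigger are my invention): once a line matches, "close ARGV" abandons the rest of the current file and "<>" moves on to the next one, if any.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Demo file (made-up name) with a marker line partway through.
open my $fh, '>', 'skip_demo.txt' or die "can't write: $!";
print $fh "keep this\nEND\nnever reached\n";
close $fh;

@ARGV = ('skip_demo.txt');
my @kept;
while (<>) {
    if (/END/) {
        close ARGV;    # skip rest of current file
        next;
    }
    push @kept, $_;
}
print @kept;
unlink 'skip_demo.txt';
```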

Positional Parameter

Let's say you're writing a Perl version of grep, and you want the first positional parameter (after the options) to be the search pattern.

$ "ford" *.txt

Unfortunately, this will try to read a file named "ford" as the first file. What to do?

my $pat = shift;  # Pops off $ARGV[0].
while (<>) {

This works because "<>" doesn't actually look at the command line. It looks at the @ARGV array. The "shift" function defaults to operating on the @ARGV array.
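Put together, the pattern-then-files idiom looks like this self-contained sketch (the script invocation, file name, and contents are made up): shifting the pattern off @ARGV first means "<>" only ever sees file names.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Demo input file (made-up name and contents).
open my $fh, '>', 'cars.txt' or die "can't write: $!";
print $fh "ford falcon\nchevy nova\nford pinto\n";
close $fh;

@ARGV = ('ford', 'cars.txt');   # as if invoked: perlgrep ford cars.txt
my $pat = shift;                # pops "ford" off @ARGV; "<>" never sees it
my $matches = 0;
while (<>) {
    if (/$pat/) { $matches++; print "$ARGV:$.: $_"; }
} continue {
    close ARGV if eof;
}
unlink 'cars.txt';
```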

Security Warning

Because of the way the diamond operator opens files, it is possible for a hostile user to construct a file that can produce very bad results. For example:

$ echo "hello world" >x
$ echo "goodby world" >'rm x|'
$ ls -1
rm x|
$ cat *
goodby world
hello world
$ cat x
hello world

So far, so good. "rm x|" is just an unusually named file with a space in the middle and a pipe ("|") at the end. But now let's use my Perl version of grep with a pattern of "." (matches all non-empty lines):

$ "." *
Can't open x: No such file or directory at /home/sford/bin/ line 81.
$ cat x
cat: x: No such file or directory

Yup, just deleted the file named "x". The pipe character at the end of the file name "rm x|" invoked Perl's open-a-filehandle-into-a-command functionality (a feature of the 2-argument open). In other words, just by naming a file in a particular way, you've made the script do something unexpected and potentially dangerous.

This might look like a horrible security hole (what if the name of that rogue file resulted in deleting all your files?), but it can also be a very powerful (albeit rarely used) feature. The moral of the story is don't run *any* tool over a set of files that you aren't familiar with.

You can instead use "<<>>" in place of "<>". This forces each input file to be opened as a file, never as a command. But it requires Perl version 5.22 or newer, which rarely seems to be on any system I try to use.

Unfortunately, it also prevents the special handling of the input file name "-" to mean standard input, which is a construct I do use periodically.
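The double diamond in action, as a self-contained sketch (the demo file name is made up; requires Perl 5.22+): every name in @ARGV is opened strictly as a file, so a booby-trapped name like "rm x|" would simply fail to open instead of running a command.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Demo file (made-up name).
open my $fh, '>', 'plain.txt' or die "can't write: $!";
print $fh "safe line\n";
close $fh;

@ARGV = ('plain.txt');
my $lines = 0;
while (<<>>) {       # double diamond: no magic 2-arg open, files only
    $lines++;
    print;
}
unlink 'plain.txt';
```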


Tutorial

Many Unix commands have the following semantics:

cmdname -options [input_file [input_file2 ...] ]

where the command will read from each input file sequentially, or from standard input if no input files are provided. File names can be wildcarded. Most such Unix commands allow you to supply "-" as a file name and the tool will read from standard input.

The diamond operator makes this ridiculously easy. Here's a minimal "cat" command in Perl:

#!/usr/bin/env perl
while (<>) {
  print $_;
}
That's the whole thing. It takes zero or more input files (if none, it reads from standard input) and concatenates them to standard out. Just like "cat".

Specifically what "<>" does is read one line from whatever input file is currently open. If it is at the end of the file, "<>" will automatically open the next file (if any) and read a line from it. As with many Perl built-ins, it leaves the just-read line in the "$_" variable.
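Under the hood, "while (<>)" behaves roughly like the following explicit loop. This is a simplified rendering of the expansion described in the perlop man page, wrapped in a self-contained demo (the file name is made up); the 2-argument open in the middle is where both the magic and the security risk discussed above live.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Demo file (made-up name).
open my $out, '>', 'expand_demo.txt' or die "can't write: $!";
print $out "one\ntwo\n";
close $out;

@ARGV = ('expand_demo.txt');
my $lines = 0;

unshift(@ARGV, '-') unless @ARGV;   # no files means read standard input
while (my $file = shift @ARGV) {
    # 2-arg open: a trailing "|" in $file would run a command instead
    open(ARGV, $file) or do { warn "Can't open $file: $!"; next; };
    while (<ARGV>) {
        $lines++;                   # per-line processing goes here
        print $_;
    }
}
unlink 'expand_demo.txt';
```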

You should be ready for the "Tips" section now.

Saturday, July 11, 2020

Perl Faster than Grep

So, I've been crawling through a debug log file that is 195 million lines long. I've been using a lot of "grep | wc" to count numbers of various log messages. Here are some timings from my MacBook Pro:

$ time cat dbglog.txt >/dev/null
real 0m35.423s

$ time wc dbglog.txt
195177935 1177117603 28533284864 dbglog.txt
real 1m44.560s

$ time egrep '999999' dbglog.txt
real 7m39.737s

(For this timing, I chose a pattern that would *NOT* be found.)

On the MacBook, the man page for fgrep claims that it is faster than grep. Let's see:

$ time fgrep '999999' dbglog.txt
real 7m11.365s

Well, I guess it's a little faster, but nothing to brag about.

Then I wanted to create a histogram of some findings, so I wrote a Perl script to scan the file and create the histogram. Since it performs regular-expression matching on every line, I assumed it would be a little slower than grep, Perl being an interpreted language.
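The actual script isn't shown here, but the general shape of such a histogram scan is sketched below. The log format and the captured "level" field are my invention for illustration, not the real script; the in-memory @log array stands in for reading the file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Invented sample log lines standing in for the 195-million-line file.
my @log = (
    "00:01 level=INFO started\n",
    "00:02 level=WARN slow\n",
    "00:03 level=INFO done\n",
);

my %count;
for (@log) {
    $count{$1}++ if /level=(\w+)/;   # tally whatever the pattern captures
}
print "$_ $count{$_}\n" for sort keys %count;
```

In the real script, the for loop over @log would be a "while (<>)" loop over the log file.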

$ time ./ dbglog.txt >count.out
real 3m9.427s

WOW! Less than half the time!

So I created a simple grep replacement. It doesn't do any histogramming, so it should be even faster.

$ time '999999' dbglog.txt
real  2m8.341s

Amazing. Perl grep runs in less than a third the time of grep.

For small files, I bet Perl grep is slower starting up. Let's see.

$ time echo "hi" | grep 9999
real        0m0.051s

$ time echo "hi" | 9999
real        0m0.113s

Yep. Grep saves you about 60 milliseconds. So if you had thousands of small files to grep, it might be faster to use grep.


I got another big log file today (70 million lines) and saw something pretty surprising given my initial findings.