But there is one kind of surprise that I REALLY like. It's learning something new ... the sort of thing that makes you say, "how did I not learn this years ago???"
Let's say you want the standard output of one command to serve as the input to another command. On day one, a Unix shell beginner might use file redirection:
$ ls >ls_output.tmp
$ grep myfile <ls_output.tmp
$ rm ls_output.tmp
On day two, they will learn about the pipe:
$ ls | grep myfile
This is more concise, doesn't leave garbage, and usually runs faster, since the two commands run concurrently and nothing is written to disk.
But what about cases where the second program doesn't take its input from STDIN? For example, let's say you have two directories with very similar lists of files, but you want to know if there are any files in one that aren't in the other.
$ ls -1 dir1 >dir1_output.tmp
$ ls -1 dir2 >dir2_output.tmp
$ diff dir1_output.tmp dir2_output.tmp
$ rm dir[12]_output.tmp
So much for conciseness, garbage, and speed.
But, today I learned about Process Substitution:
$ diff <(ls -1 dir1) <(ls -1 dir2)
This basically creates two pipes, gives each one a file name, and passes those file names as command-line arguments to diff, which opens and reads them like ordinary files. I HAVE WANTED THIS FOR DECADES!!!
And just for fun, let's see what names those pipes get:
$ echo <(ls -1 dir1) <(ls -1 dir2)
/dev/fd/63 /dev/fd/62
COOL!
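(Note that echo doesn't actually read the pipes; it only prints the names it was handed. The /dev/fd/N form means bash is naming already-open file descriptors here; on systems without /dev/fd it falls back to named pipes, i.e. FIFOs.)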
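Here is a self-contained sketch of the same idea that can be pasted into bash. The scratch directory and the file names in it are made up for the demo, and the exact paths printed by echo will likely differ from system to system.

```bash
#!/usr/bin/env bash
# Hypothetical scratch directories, created only for this demo.
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
touch "$work/dir1/a.txt" "$work/dir1/b.txt" "$work/dir1/c.txt"
touch "$work/dir2/a.txt" "$work/dir2/c.txt" "$work/dir2/d.txt"

# Compare the two listings without any temporary files.
# diff exits with status 1 when its inputs differ, hence the "|| true".
listing_diff=$(diff <(ls -1 "$work/dir1") <(ls -1 "$work/dir2") || true)
echo "$listing_diff"

# Each <(...) expands to a file name; echo prints the names and reads nothing.
paths=$(echo <(true) <(true))
echo "$paths"

rm -r "$work"
```

The diff output should flag b.txt as present only in dir1 and d.txt as present only in dir2.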
VARIATION 1 - OUTPUT
The "cmda <(cmdb)" construct is for cmda getting its input from the output of cmdb. What about the other way around? I.e., what if cmda wants to write its output, not to STDOUT, but to a named file, and you want that output to be the standard input of cmdb? I'm having trouble thinking here of a useful example, but here's a not-useful example:
cp file1 >(grep xyz)
I say this isn't useful because why use the "cp" command? Why not:
cat file1 | grep xyz
Or better yet:
grep xyz file1
Most shell commands write their primary output to STDOUT. I can think of some examples that don't, like giving an output file to tcpdump, or the object code out of gcc, but I can't imagine wanting to pipe that into another command.
If you can think of a good use case, let me know.
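One pattern that does seem to come up for ">(cmd)" is pairing it with tee, so that a single stream feeds several commands at once. This is only a minimal sketch with made-up input lines; the surrounding $(...) is there so the demo collects the output of both consumers before moving on, and the two result lines may appear in either order.

```bash
#!/usr/bin/env bash
# Send the same three input lines to two consumers at once.
# tee's own stdout is discarded; each >(...) receives a full copy.
report=$(printf 'alpha\nxyz one\nxyz two\n' |
    tee >(grep -c xyz | sed 's/^/matches: /') \
        >(wc -l | sed 's/^ */lines: /') \
        >/dev/null)
echo "$report"
```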
VARIATION 2 - REDIRECTING STANDARD I/O
Here's something that I have occasionally wanted to do: pipe a command's STDOUT to one command and its STDERR to a different command. Here's a contrived non-pipe example:
process_foo 2>err.tmp | format_foo >foo.txt
alert_operator <err.tmp
rm err.tmp
You could re-write this as:
process_foo > >(format_foo >foo.txt) 2> >(alert_operator)
Note the space between the two ">" characters - this is needed. Without the space, ">>" is treated as the append redirection.
Sorry for the contrived example. I know I've wanted this a few times in the past, but I can't remember why.
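For anyone who wants to try the STDOUT/STDERR split, here is a runnable sketch. The process_foo function is a hypothetical stand-in, and the two sed commands play the roles of format_foo and alert_operator. The fd 3 plumbing is my own addition so the demo can capture both results in one variable; it is not required by the technique itself.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for process_foo: one line on stdout, one on stderr.
process_foo() {
    echo "result line"
    echo "something went wrong" >&2
}

# fd 3 is a copy of the stdout that $(...) captures. Both consumers write
# to it explicitly, so the capture does not depend on where stdout pointed
# at the moment each >(...) was started.
combined=$(
    { process_foo > >(sed 's/^/OUT: /' >&3) 2> >(sed 's/^/ERR: /' >&3); } 3>&1
)
echo "$combined"
```

The two consumers run concurrently, so the OUT and ERR lines can come back in either order.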
And for completeness, you can also redirect STDIN:
cat < <(echo hi)
But this is the same as:
echo hi | cat
I can't think of a good use for the "< <(cmd)" construct. Let me know if you can.
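One case where "< <(cmd)" seems to earn its keep is a while-read loop that needs to set variables. With a plain pipe, bash (by default) runs the loop in a subshell, so the variables are gone once the loop ends. A small sketch with made-up input:

```bash
#!/usr/bin/env bash
# Loop fed by process substitution: runs in the current shell,
# so the counter is still set after the loop.
count=0
while IFS= read -r line; do
    count=$((count + 1))
done < <(printf 'a\nb\nc\n')
echo "with < <(...): $count"

# Same loop fed by a pipe: by default bash runs it in a subshell,
# so the outer variable is never updated.
piped=0
printf 'a\nb\nc\n' | while IFS= read -r line; do
    piped=$((piped + 1))
done
echo "with a pipe:   $piped"
```

In a default bash setup the first counter ends up at 3 and the second stays at 0.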
EDIT:
I'm always amused when I learn something new and pretty quickly come up with a good use for it. I had some files containing a mix of latency values and some log messages. I wanted to "paste" the different files into a single file with multiple columns to produce a .CSV. But the log messages were getting in the way.
paste -d "," <(grep "^[0-9]" file1) <(grep "^[0-9]" file2) ... >file.csv
Done! :-)
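Here is a tiny reproduction of that paste trick. The file contents are invented for the demo; my real files were larger and there were more of them.

```bash
#!/usr/bin/env bash
# Hypothetical input files: latency values mixed with log messages.
work=$(mktemp -d)
printf '12.5\nWARN: retry\n13.1\n' > "$work/file1"
printf 'INFO: start\n20.4\n21.0\n' > "$work/file2"

# Keep only the lines that start with a digit, then join them column-wise.
paste -d "," <(grep "^[0-9]" "$work/file1") <(grep "^[0-9]" "$work/file2") \
    > "$work/file.csv"

csv=$(cat "$work/file.csv")
echo "$csv"
rm -r "$work"
```

Each row of the result holds one latency value from each file, with the log lines gone.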
3 comments:
I learned about bash process substitution recently as well, and I share your sentiment! It's also handy for sourcing completion scripts that might be built into CLIs:
source <(kubectl completion bash)
source <(npm completion)
The other really cool recent shell 'innovations' IMO are https://www.shellcheck.net/ and https://github.com/bash-lsp/bash-language-server. Both can be integrated into vim via plugins and make editing shell scripts much safer.
Actually, I just ran across this article which may answer your question as to why this feature didn't exist earlier:
https://utcc.utoronto.ca/~cks/space/blog/unix/ProcessSubstitutionWhyLate
I guess the /dev/fd filesystem needed to exist first.
Also something to be aware of when using these is that there is no way in bash to abort the whole command if something inside <(...) fails.
> there is no way in bash to abort the whole command if something inside <(...) fails.
Ah, a good thing to keep in mind. Thanks for pointing it out.
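A quick way to see that caveat in action. The failing inner command is invented for the demo, and the temp-file version is shown only as a contrast, not as a recommendation:

```bash
#!/usr/bin/env bash
# The inner command exits with status 3, but cat only sees an early
# end of input, so the overall command still reports success.
procsub_status=0
cat <(echo "partial output"; exit 3) >/dev/null || procsub_status=$?
echo "with <(...):    $procsub_status"

# Temp-file version of the same step: the failure is visible to the script.
tmp=$(mktemp)
tmpfile_status=0
(echo "partial output"; exit 3) >"$tmp" || tmpfile_status=$?
echo "with temp file: $tmpfile_status"
rm -f "$tmp"
```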