Tuesday, December 27, 2022

Tgen: Traffic Generator Scripting Language

Oh no, not another little language! (Perhaps better known as a "domain-specific language".)

Yep. I wrote a little scripting language module intended to assist in the design and implementation of a useful network traffic generator. It's called "tgen" and can be found at https://github.com/fordsfords/tgen. I won't write much about it here other than to mention that it implements a simple interpreter. In fact, the simplicity of the parser might be the thing I'm most pleased about it, in terms of bang for buck. See the repo for details.

Little Languages Considered Harmful?

So yeah, I'm reasonably pleased with it. But I am also torn. Because "little languages" have both a good rap (Jon Bentley, 1986) and a bad rap (Olin Shivers, 1996).

In that second paper, Shivers complains that little languages:

  • Are usually ugly, idiosyncratic, and limited in expressiveness.
  • Basic linguistic elements such as loops, conditionals, variables, and subroutines must be reinvented and re-implemented. It is not an approach that is likely to produce a high-quality language design.
  • The designer is more interested in the task-specific aspects of his design, to the detriment of the language itself. For example, the little language often has a half-baked variable scoping discipline, weak procedural facilities, and a limited set of data types.
  • In practice, it often leads to fragile programs that rely on heuristic, error-prone parsers.
Of course, Shivers doesn't *really* think that little languages are a bad idea. He just thinks that they are usually implemented poorly, and his paper shows the right way to do it (in Scheme, a Lisp variant).

But there are some good arguments against developing little languages at all, even if implemented well. At my first job out of college, I wrote a little language to help in the implementation of a menu system. The menus were tedious and error-prone to write, and the little language improved my productivity. I was proud of it. An older and wiser colleague gently told me that there are some fundamental problems with the idea. His reasoning was as follows:

  • We already have a programming language that everybody knows and is rich and well-tested.
  • You've just invented a new programming language. It probably has bugs in the parser and interpreter that you'll have to find and fix. Maybe the time you spend doing that is paid for by the increased productivity in adding menus. Maybe not.
  • The new language is known by exactly one person on the planet. Someday you'll be on a different project or a different company, and we can't hire somebody who already knows it. There's an automatic learning curve.
  • Instead of writing an interpreter for a little language, you could have simply designed a good API for the functional elements of your language, and then used the base language to call those functions. Then you have all the base language features at your disposal, while still having a high level of abstraction to deal with the menus.

He was a nice guy, so while his criticism stung, he didn't make it personal, and he was tactful. I was able to see it as a learning experience. And ever since then, I've been skeptical of most little languages.

Then Why Did I Make a Little Language?

Well, first off, I *DID* create an API. So instead of writing scripts in the scripting language, you *can* write them in C/C++. I expect this to be interesting to QA engineers wanting to create automated tests that might need sophisticated usage patterns (like waiting to receive a message before sending a burst of outgoing traffic). I would not want to expand my scripting language enough to support that kind of script. So being able to write those tests in C gives me all the power of C while still giving me the high level of abstraction for sending traffic.

But also, a network traffic generator is a useful thing to be able to run interactively for ad-hoc testing or exploration. It would be annoying to have to recompile the whole tool from source each time you want to change the number or sizes of messages.

Of course, most traffic generation tools take care of that by letting you specify the number and sizes of the messages via command-line options or GUI dialogs. But most of them don't let you have a message rate that changes over time. My colleagues and I deal with bursty data. To properly test the networking software, you should be able to create bursty traffic. Send at this rate for X milliseconds, then at a much higher rate for Y milliseconds, etc. The "tgen" module lets you "shape" your data rate in non-trivial ways.

Did I get it right? My looping construct is ugly and could only be loved by an assembly language programmer. Maybe I should have made it better? Or just omitted it? Dunno. I'm open to discussion.

Anyway, I'm hoping that others will be able to take the tgen module and do something useful with it.

No comments: