shrun: A modern CLI testing framework


TL;DR: Test your CLI commands in isolated Docker containers using the Jest test environment you already love.

A few years ago, I was working as lead architect for a startup building a high-performance competitor to AWS Lambda. One of my responsibilities was maintaining a standalone CLI and SDK for the functions backend. The CLI/SDK was built with Node and commander (eventually yargs), and had a very similar structure and usage to the popular Serverless framework.

A while after I built out this initial SDK/CLI, we started having some internal frustrations with the process user-facing features went through before eventually reaching the CLI. We would often design a backend feature only to realize later that the CLI interface/API would need to be quite nasty to support it. This frustration had a measurably negative effect on both the quality of new features and the velocity at which they could be released.

Many readers might assume that we simply had bad communication and planning, and while there was definitely room for improvement in that area, it didn't help that our team was separated by a 10-11 hour time difference. Regardless of the cause, at some point one of my coworkers started a conversation with me to explore ways we could make our process more declarative and reliable. After an especially frustrating day, he came to me with an amazing idea: create a "spec" format that would allow us to both test the CLI and propose new user-facing features in a concrete way.

I understood exactly where he was going, so I immediately started building a prototype. A day later I had an MVP version of the tool, which consumed YAML-based spec tests and ran them automatically against our open source CLI. Below is an example to show you the format of the spec (testing the npm init --help command):

- test: Test init help output
  setup:
    - "curl -sL | sudo -E bash -"
    - "sudo apt install nodejs"
  steps:
    -   in: npm init --help
        out: |-
          npm init [--force|-f|--yes|-y|--scope]
          npm init <@scope> (same as `npx <@scope>/create`)
          npm init [<@scope>/]<name> (same as `npx [<@scope>/]create-<name>`)
          aliases: create, init

Spec format

test: string - each spec test must have a test stanza with a unique name. For those who are familiar with Jest/Ava/Mocha, this maps directly to the test("someName", () => {}) format used by those frameworks.

setup?: string[] - the setup section allows you to run a series of shell commands before the test itself runs. This is convenient for tests that rely on a specific set of environment variables, need iptables configured, and so on. For those who are familiar with Jest/Ava/Mocha, this partially maps to the beforeEach construct (it is more like a beforeThis, since you specify it per test).

steps: Step[] - steps are where the bulk of your test logic is defined, and there is no limit to the number you can have per test. Every step must have an in entry; this is the command that will actually be run in the container's internal shell. If a step is expected to succeed, it is a PassStep and must have an out entry. in and out map to actual and expected in traditional testing frameworks. If a step is not expected to succeed (nonzero exit code), it must have either an err or an exit entry. err is similar to out but is checked against stderr instead of stdout. exit lets you specify the exit code expected from running the step's in statement.
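To make the in/out/err/exit rules above concrete, here is a minimal TypeScript sketch of how a step could be checked against the result of running its command. This is not shrun's actual implementation: the FailStep, CommandResult and checkStep names are hypothetical, and comparing trimmed output for exact equality is an assumption about how matching might work.

```typescript
// Hypothetical types for illustration; shrun's real internals may differ.
interface PassStep {
  in: string; // command run in the container's shell
  out: string; // expected stdout
}

interface FailStep {
  in: string;
  err?: string; // expected stderr
  exit?: number; // expected (nonzero) exit code
}

type Step = PassStep | FailStep;

// What running a step's `in` command produced.
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

function isPassStep(step: Step): step is PassStep {
  return "out" in step;
}

// Returns a list of mismatch descriptions; an empty list means the step passed.
// Assumption: output is compared after trimming surrounding whitespace.
function checkStep(step: Step, result: CommandResult): string[] {
  const problems: string[] = [];

  if (isPassStep(step)) {
    if (result.exitCode !== 0) {
      problems.push(`expected exit code 0, got ${result.exitCode}`);
    }
    if (result.stdout.trim() !== step.out.trim()) {
      problems.push("stdout did not match `out`");
    }
    return problems;
  }

  if (step.err === undefined && step.exit === undefined) {
    problems.push("a failing step needs an `err` or `exit` entry");
    return problems;
  }
  if (result.exitCode === 0) {
    problems.push("expected a nonzero exit code, got 0");
  }
  if (step.err !== undefined && result.stderr.trim() !== step.err.trim()) {
    problems.push("stderr did not match `err`");
  }
  if (step.exit !== undefined && result.exitCode !== step.exit) {
    problems.push(`expected exit code ${step.exit}, got ${result.exitCode}`);
  }
  return problems;
}
```

In this sketch a step with an out entry is treated as a PassStep, and anything else must declare err, exit, or both.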

There are also two other stanzas not shown in the above spec:

cleanup?: string[] - the exact same as setup but runs after the test has finished. Useful for resource cleanup. Maps to the afterEach/afterThis construct in traditional testing frameworks.

foreach: Map<string, string>[] - allows a single test to be run multiple times with different input values.
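As a rough illustration of what foreach implies, the sketch below expands one test template into several concrete tests, one per map of input values. The substitution syntax is not described in this post, so the {{name}} placeholder, the SimpleTest shape and the expandForeach function are all assumptions made for the example, not shrun's real behavior.

```typescript
// One set of input values for a single run of the test.
type Values = Record<string, string>;

// Simplified, hypothetical test shape: a name plus the commands it runs.
interface SimpleTest {
  test: string;
  commands: string[];
}

// Replace every {{key}} with its value; unknown keys are left untouched.
// The {{key}} placeholder syntax is an assumption for this sketch.
function substitute(text: string, values: Values): string {
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : match
  );
}

// Produce one concrete test per entry in `foreach`.
function expandForeach(template: SimpleTest, foreach: Values[]): SimpleTest[] {
  return foreach.map((values) => ({
    test: substitute(template.test, values),
    commands: template.commands.map((command) => substitute(command, values)),
  }));
}

const expanded = expandForeach(
  { test: "help output for {{cmd}}", commands: ["npm {{cmd}} --help"] },
  [{ cmd: "init" }, { cmd: "install" }]
);
console.log(expanded.map((t) => t.test));
```

Each entry in the foreach list yields its own named test, so a failure can be traced back to the specific input values that caused it.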

Why shrun?

Some of you may think a dockerized solution like this is overkill. I understand that sentiment, but there are convincing reasons why shrun brings value:

  • Each test runs in its own isolated environment. CLI testing is unique in the sense that it's often the ultimate point of contact between your product and the user. Ensuring that a set of steps runs from start to finish on X environment is paramount.
  • Tests have minimal ability to interfere with each other. There are still issues such as noisy neighbors and throttling by external services, but generally speaking parallel test runs will not degrade the reliability of the tests.
  • The containers of troublesome failing tests can be sent to other developers and debugged quickly.
  • You can run shrun on any platform that supports Docker (basically all of them).


This is the initial release of shrun, so don't expect things to be perfect. In the future I hope to improve the framework and add all relevant but missing Jest flags. Contributors and feedback are welcome and desired, so I'd love to hear how shrun could be improved to better fit your needs. If you like what you see, please star the project on GitHub so it can be useful to a wider audience.