This article introduces the first release of ‘Lash#Cat9’, a different kind of command-line shell.
A big change is that it communicates with the display server directly, instead of being restricted and filtered by a terminal emulator. The source code repository, with instructions for running it yourself, can be found here: https://github.com/letoram/cat9. A concatenation of all the clips here can be found in this (youtube-link).
Cat9 serves as the practical complement to the article on ‘The day of a new command-line interface: shell’. That article also covers the design and architectural considerations on a system level, as well as more generic advancements towards displacing the terminal emulator.
The rest of the article will work through the major features and how they came about.
A guiding principle is the role of the textual shell as a frontend instead of a clunky programming environment. The shell presents a user-facing, interactive interface to make other complex tools more approachable or to glue them together into a more advanced weapon. Cat9 is entirely written in Lua, so scripting in it is a given, but also relatively uninteresting as a feature: there are better languages around for systems programming, and better UI paradigms for automating workflows.
Another is that of delegation. Textual shells naturally evolved without assuming that a graphical one was present. Today a graphical shell is almost always present, yet the language for sharing between the two is unrefined, crude and fragile. The graphical shell is infinitely more capable of decorating and managing windows, animating transitions, routing inputs and tuning pixels for specific displays. It should naturally be in charge of such actions.
Another is to make experience self-documenting: the patterns that emerge from how you use the command line should be extracted and remembered in a form where re-use becomes natural. Primitive forms of this are completions from command history and aliases, but there is much more to be done here.
I collected history from a few weeks of regular terminal use along with screen recordings of the desktop window management side. I then proceeded to manually sift through these, looking for signs of poor posture. I found plenty.
This is a humbling experience. The main conclusion drawn is that I am mostly a hapless twit who defaults to repeating the same things hoping for different outcomes. I consistently confuse ‘src’ and ‘dst’ for ‘ln -s’; ‘ls’ gets spelled ‘sl’ much too often; ifconfig remains my preferred choice over ‘ip’ even though its main output typically is ‘file not found’ these days; nearly every tool that expects regular expressions is first fed plaintext strings. When I actually want to use a regular expression I consistently pick the wrong expression language.
The signal to noise ratio in the history is abysmal. About 90% of scrollback contents were leftovers from cd, ls and tab completion sprinkled with repeated runs of the same command through sudo, with minor tweaks to the arguments or to get a redirection for stderr. Redirections that were then left in the file system, with descriptive names like “boogeraids2000”.
The screen recordings were also revealing. Some notable time sinks:
- Copy paste across line-feeds and resizing windows to deal with incorrect wrapping.
- Spinning up new terminals to work around man or vim hogging the alt screen.
- Digging around in ps/proc/… for PIDs.
- Redirecting to temporary files to transfer job outputs between windows or for later comparison.
- Switching vim buffers between horizontal/vertical to fight the tiling WM.
All these can be fixed with relatively minor effort.
Get the prompt out of the way.
Starting with the prompt – obvious bits are that its contents should be ephemeral and disappear after running a command. It should reflect information about the current context (directory, etc.) and whatever else of immediate short lived value. The point is to clean this up:
Instead we get this:
- Prompt is updated live regardless of input and can change its layout template dynamically.
- Prompt format and contents depend on window management state (focus, unfocus).
- Silent commands are kept away from the history.
- Completions come up without interaction and do not trample/shuffle actual contents.
- Commands that only resulted in errors are automatically purged after a delay.
Previously, compartmentation meant juggling one ‘foreground’ job and a set of ‘background’ jobs. For this to work you needed either a fragile weave of signalling (SIGTSTP, …) and file redirections, or to spin up new terminals, either through a terminal multiplexer (a terminal emulator inside a terminal emulator inside ..) or as new windows.
I find those solutions both noisy and distracting. Instead, I now have this:
- Every command-line submitted now becomes its own job.
- Jobs can reference each other.
- Job context (environment variables, working directory, …) is saved and tracked.
- The jobs are presented in order of importance (active ones take priority over passive ones).
- Spawning new jobs automatically folds old ones into a collapsed form.
- Individual controls, status and statistics are added to a stateful bar at the top of the job.
- Job contexts can be reused for new commands.
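To make the list above a bit more concrete, here is a minimal sketch of such a job model. It is written in Python purely for illustration; Cat9 itself is Lua, and the names used here (`Job`, `JobList`, `reuse_context`, `presentation_order`) are hypothetical and not taken from its source.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Job:
    """One submitted command line plus the context it was started in."""
    id: int
    command: str
    cwd: str
    env: dict
    active: bool = True       # still running or otherwise demanding attention
    collapsed: bool = False   # folded away in the presentation
    output: list = field(default_factory=list)


class JobList:
    """Hypothetical container; every submitted command becomes its own job."""

    def __init__(self):
        self.jobs = []
        self._next_id = 1

    def submit(self, command, cwd=None, env=None):
        # Spawning a new job folds the older ones into a collapsed form.
        for job in self.jobs:
            job.collapsed = True
        job = Job(
            id=self._next_id,
            command=command,
            cwd=cwd if cwd is not None else os.getcwd(),
            env=dict(env if env is not None else os.environ),
        )
        self._next_id += 1
        self.jobs.append(job)
        return job

    def find(self, job_id):
        # Jobs reference each other through their id.
        return next(job for job in self.jobs if job.id == job_id)

    def reuse_context(self, job_id, command):
        # Start a new command in the saved directory and environment of an older job.
        old = self.find(job_id)
        return self.submit(command, cwd=old.cwd, env=old.env)

    def presentation_order(self):
        # Active jobs take priority over passive ones; newest first within each group.
        return sorted(self.jobs, key=lambda job: (not job.active, -job.id))
```

The point of saving `cwd` and `env` per job is that a later command can pick up exactly where an earlier one ran, without the shell's current directory having to match.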
Remember everything, but right to be forgotten.
In the terminal world, all job outputs either get composed onto one shared buffer with a certain amount of memory (scrollback history), fight for a scratchpad (“altscreen mode”), or are redirected to files or other jobs. This happens regardless of stream source or job state (foreground/background).
With real compartmentation and much larger memory and CPU budgets thanks to server side text rendering, we can do much better:
- Stdout and Stderr are tracked separately.
- All job output is kept, tracked and addressed individually.
- Contents can be forgotten, or selectively processed.
- Completed jobs can be repeated, appending to the existing output or replacing that of previous runs.
- Jobs can be repeated with an edited command-line.
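The behaviour in the list above can be sketched roughly as follows. This is an illustrative Python model and not how Cat9 does it: the `TrackedJob` class and its methods are made up for the example, and it waits for the child to finish instead of streaming output as it arrives.

```python
import subprocess
import sys


class TrackedJob:
    """Keeps stdout and stderr of every run in separate, addressable buffers."""

    def __init__(self, argv):
        self.argv = list(argv)
        self.stdout = []
        self.stderr = []
        self.exit_code = None

    def run(self, replace=False):
        # replace=True discards output from earlier runs, otherwise we append.
        if replace:
            self.forget()
        done = subprocess.run(self.argv, capture_output=True, text=True)
        self.stdout.extend(done.stdout.splitlines())
        self.stderr.extend(done.stderr.splitlines())
        self.exit_code = done.returncode
        return self

    def forget(self):
        # The 'right to be forgotten': drop the contents, keep the job.
        self.stdout.clear()
        self.stderr.clear()

    def repeat_edited(self, argv):
        # Assumption: an edited repeat becomes a job of its own.
        return TrackedJob(argv).run()


# Usage: a child process that writes one line to each stream.
demo = TrackedJob([sys.executable, "-c",
                   "import sys; print('out'); print('err', file=sys.stderr)"])
demo.run()
```

After `demo.run()` the two lines end up in `demo.stdout` and `demo.stderr` respectively, so neither has to be redirected to a temporary file to be told apart later.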
Cooperate with the outer windowing system
Now that the shell can talk directly to the window manager without having the conversation dumbed down by a terminal emulator sitting in between, new integration options are possible:
- Snapshot the output of a job to a new window.
- Window creation hints to window manager, like vertical split or tabbed.
- Open applications and media embedded, with controls for position and size.
- Detach and reattach embedded media, preserving input routing.
- Directly route contents to clipboard and other data sharing mechanisms.
- Trigger GUI file pickers.
Let legacy in
Now with a fairly functional environment, the last part is to account for all the edge cases where we still need access to the old world in various degrees:
- Send data from a job to external processing pipes (#0 | grep hi).
- Request a new window, attach a terminal emulator to it and run a pty dependent command (!vim).
- Set up a PTY and attach a VTxxx view to it: (p! ls --color=yes).
Streamline command structure
The foundation of Cat9 is the command-line language itself. All the UI elements that you see, mouse gestures and key bindings map to the same things that you could type in manually:
- Hooks and event actions can be added after a command has been set up or is running.
- Mouse actions, bindings (clicking shown in clip: view #csel $=crow as in ‘cursor job, cursor row’).
- Aliases and pre-commit expansion.
With these basics sorted out, it is time to build something more interesting.
Special Topic: Views on Life
Now that jobs keep their data around in nicely tracked structures rather than a prematurely composed and broken ‘scrollback buffer’, we can do something more. While we have data in its raw form, we can look at it through various lenses to get different representations of the data. These are baked into the ‘view’ builtin.
Simply put, they parse the data and reformat the contents by adding annotations, structures, formatting and so on. The current builtin ones are all shown in this clip:
In this one you see ‘wrap’ and ‘filter’ along with some options like line numbers and column wrapping. Filter even goes so far as to have an interactive mode that live-applies the filter as it is being written.
With the original data retained, re-executing previous pipelines is not needed, and the choice between using the formatted output and the original data is available when copying in/out.
This is one of the features that will be expanded heavily in future versions as we try to improve the presentation of the many ad-hoc text formats.
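The idea of views as lenses over retained data can be sketched in a few lines. The following Python is an illustration only: the function names are hypothetical, the wrap is a plain hard wrap at a column count, and the filter is a plain substring match, none of which is claimed to match what the ‘view’ builtin really does.

```python
def view_wrap(lines, columns=40, numbered=False):
    """Hard-wrap each raw line at `columns`, optionally prefixing line numbers."""
    out = []
    for number, line in enumerate(lines, start=1):
        chunks = [line[i:i + columns] for i in range(0, len(line), columns)] or [""]
        for index, chunk in enumerate(chunks):
            prefix = ""
            if numbered:
                # Only the first chunk of a wrapped line carries the number.
                prefix = f"{number:>3} " if index == 0 else " " * 4
            out.append(prefix + chunk)
    return out


def view_filter(lines, needle):
    """Keep only the raw lines that contain `needle`."""
    return [line for line in lines if needle in line]


def live_filter(lines, typed):
    """Yield the filtered view after each typed character, as an interactive mode might."""
    for end in range(1, len(typed) + 1):
        yield typed[:end], view_filter(lines, typed[:end])


# The raw job output is never modified; each view is computed from it on demand.
raw = ["alpha beta gamma", "error: disk full", ""]
wrapped = view_wrap(raw, columns=10, numbered=True)
errors = view_filter(raw, "error")
```

Because every view is a pure function of `raw`, switching between them or copying out the original data never requires running the command again.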
Special Topic: State Actor
This is a good one. Regular windowing systems provide Clipboard as well as Drag and Drop as forms of interactive data sharing. Some go further and also allow sequenced picking/sharing, like the “share” button popular in mobile operating systems. Arcan adds a state store/restore action to the mix.
This means that at any point, the windowing system can request that a state snapshot is created, or request that the application reverts to a provided one.
Examples of what gets stored in such a state blob here are configuration changes; command history; environment variables; aliases and so on. While this offloads the ‘where are my dot files’ responsibility, more interesting is that states can be transferred between instances at runtime.
Combine this with the job system: by marking a job as persistent, the command creating a job will be added to the state store. In the following clip you can see it being used to an interesting effect:
I first start a new Cat9 session, run two jobs and mark one as persistent with manual start and the other as automatic. After shutting down and restarting, you can see how the jobs come back, with the automatic one starting immediately. In the next clip I go one step further and copy the state between two live instances.
When combined with remote shells, this becomes a really potent administration and automation tool. Perform a task once; visually confirm that the results matched expectations; save the state and replay wherever and whenever. Use that for knowledge sharing, or hook it up to an event source for snapshotting and rollback to give anything history/undo.
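As a rough model of what such a state blob could carry, here is a small Python sketch. The JSON layout, the `ShellState` class and the ‘manual’/‘auto’ mode strings are all assumptions made for the example; the article does not describe the real blob format.

```python
import json


class ShellState:
    """Hypothetical shell state that can be snapshotted and restored on request."""

    def __init__(self):
        self.history = []
        self.aliases = {}
        self.env = {}
        self.persistent = []  # entries of the form {"command": ..., "mode": ...}

    def mark_persistent(self, command, mode="manual"):
        if mode not in ("manual", "auto"):
            raise ValueError("mode must be 'manual' or 'auto'")
        self.persistent.append({"command": command, "mode": mode})

    def snapshot(self):
        # Everything needed to recreate the session goes into one opaque blob.
        data = {
            "history": self.history,
            "aliases": self.aliases,
            "env": self.env,
            "persistent": self.persistent,
        }
        return json.dumps(data, sort_keys=True).encode("utf-8")

    @classmethod
    def restore(cls, blob):
        data = json.loads(blob.decode("utf-8"))
        state = cls()
        state.history = data["history"]
        state.aliases = data["aliases"]
        state.env = data["env"]
        state.persistent = data["persistent"]
        return state

    def jobs_to_start(self):
        # Automatic jobs start right after a restore, manual ones wait for the user.
        return [job["command"] for job in self.persistent if job["mode"] == "auto"]
```

Since the blob is just bytes handed to whoever asked for it, the same `snapshot()` output could be fed to `restore()` in another live instance, which is the transfer shown in the second clip.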
Special Topic: Frontending
There is little consistency between many popular tools, no matter if they come as “argv hell”, “CLIs within the CLI” or “lots of small binaries”. This is natural, but also undesired from a user perspective. It feels rather futile to have gone through the strides of building a CLI that behaves like you want it to — just to have the work be undone by the tools you launch from it.
I am no stranger to uphill battles, but the odds of getting the likes of wpa_supplicant, git, gdb/lldb or ffmpeg to change their evil ways and follow the one true path are slim to none. The passive aggressive form of dealing with this is what bash_completion and the like do: create helper scripts that at least make polite suggestions while building the command line. This works poorly when the tool is interactive. Other options include defining better programmable interfaces or language server style external oracles, then hoping for the main drivers to convert.
With the extensive scripting, parsing and rendering options available to us now – there is a more actively aggressive way. In Cat9, you can define multiple sets of builtins and views, and switch between them. This means that you can create a set of builtins for a specific logical function, like networking, programming or debugging, then swap between those as needed.
This, along with views, will be the more active area being developed for future releases. The following short clip shows an early ‘in progress’ such set for networking.
In the clip you can see the set of builtins being swapped to ‘networking’, which adds new builtins such as ‘wifi’. You can see the live completion of available SSIDs appearing asynchronously as a scan completes. Commands can still be forwarded ‘raw’ with the output packaged into its own job that can be used by the other builtins. It can also attach polling status about signal levels and connection into the prompt, using all the same infrastructure as the previous demonstrations.
I hope this conveyed some of the benefits of leaving the shackles of terminal emulators and their more abstract form of ‘virtualisation for compatibility through emulation as default’ behind. There are a whole lot more ideas to squeeze into this setup now that all the grunt work has been dealt with.
Better CLIs as part of better TUIs are key for making professional computing more accessible to budding sprout experts and cognitively challenged alike. The building blocks are here for your ‘speech-assisted’ command-lines without having to have a screen reader try and make sense of a poorly segmented word soup, or for your red team approved secret “leave no trace” cleanup sauce.
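Swapping between sets of builtins, with anything unknown forwarded raw, can be modelled along these lines. This is a Python sketch under my own assumptions: the `Shell` class, the tuple return values and the toy handlers are invented for the example and say nothing about how Cat9 structures this in Lua.

```python
import shlex


class Shell:
    """Hypothetical dispatcher with switchable sets of builtins."""

    def __init__(self):
        self.sets = {}
        self.active = None

    def register(self, set_name, command, handler):
        self.sets.setdefault(set_name, {})[command] = handler

    def switch(self, set_name):
        if set_name not in self.sets:
            raise KeyError(f"unknown builtin set: {set_name}")
        self.active = set_name

    def completions(self, prefix):
        # Only builtins from the active set are suggested.
        return sorted(name for name in self.sets[self.active] if name.startswith(prefix))

    def run(self, line):
        words = shlex.split(line)
        handler = self.sets[self.active].get(words[0]) if words else None
        if handler is None:
            # Not a builtin in this set: forward as-is, to be packaged as its own job.
            return ("raw", line)
        return ("builtin", handler(words[1:]))


# Usage with toy handlers standing in for real builtins.
shell = Shell()
shell.register("default", "cd", lambda args: "cd " + " ".join(args))
shell.register("networking", "wifi", lambda args: "wifi " + " ".join(args))
shell.switch("networking")
```

With the ‘networking’ set active, `shell.run("wifi scan")` reaches the handler while `shell.run("ping -c1 localhost")` falls through to the raw path; after `shell.switch("default")` the same `wifi scan` line is treated as a raw command.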
The last article in this series will dip into the programmable surface – how the APIs replacing curses work and integrate with the display server / window manager.