“AWK” for Multimedia

(… and for system graphics, games and other interactive applications but that would make the title just a bit too long…)

Many of the articles here have focused on the use of Arcan as a “desktop engine” or “display server”, even though those are rather fringe application areas which only showcase a fraction of the feature set – the display server target just happens to be part of my current focus.

This post is about another application area. For those of you not ‘in the know’, AWK is a programming language aptly suited for scripted stream processing of textual data, and a notable part of basic command-line hygiene. Arcan can be used in a similar way, but for scripted- and/or interactive- stream processing of multimedia; and this post will demonstrate how that can be achieved.

The example processing pipeline we’ll set up starts with an interactive gaming session in one instance (our base content). This is forwarded to a second instance, which applies a simple greyscale effect (transformation). A third instance finally mixes in a video feed and an animated watermark (our overlay or metadata). The output gets spliced out to a display and to a video recording.

The end result from the recording looks like this:

The invocation looks like this:

./arcan_lwa -w 480 -h 640 --pipe-stdout ./runner demo default |
    ./arcan_lwa --pipe-stdin --pipe-stdout ./gscale |
    ./arcan --pipe-stdin -p data ./composite mark.png demo.mp4 out.mkv

There are a number of subtle details here, particularly the distinction between “arcan_lwa” and “arcan”. The main difference is that the former can only connect to another “arcan” or “arcan_lwa” instance, while “arcan” will connect to some outer display system; this might be another display server like Xorg, or it can be through a lower level system interface. This is important for accelerated graphics, format selection, zero-copy buffer transfers and so on – but also for interactive input.

Structurally, it becomes something like this:


All of these interconnects can be shifted around or merged by reconfiguring the components, to reduce synchronisation overhead or rebalance for system composition. The dashed squares indicate process- and possibly privilege- separation. The smaller arcan icons represent the lwa (lightweight) instances, while the bigger one shows the normal instance responsible for hardware/system integration.

Note that contents flow from some initial source to the output system, while input moves in the other direction. Both can be transformed, filtered or replaced with something synthesised at any step in the (arbitrarily long) chain. The example here only works with a single pipe-and-filter chain, but there is nothing preventing arbitrary, even dynamic, graphs from being created.

Going from left to right, let’s take a look at the script bundles (“appls”) for each individual instance. These have been simplified here by removing error handling, showing only the normal control flow.


This reads like “On start, hook up an external program defined by the two command line arguments, and make its buffers visible. Shut down when the program terminates, and force-scale it to fit whatever dimensions were provided at startup. Whenever input is received from upstream, forward it without modification to the external program”.

function runner(argv)
 client = launch_target(argv[1], argv[2], LAUNCH_INTERNAL, handler)
end

function handler(source, status)
 if status.kind == "terminated" then
  return shutdown("", EXIT_SUCCESS)
 elseif status.kind == "resized" then
  resize_image(source, VRESW, VRESH)
 end
end

function runner_input(iotbl)
 if valid_vid(client, TYPE_FRAMESERVER) then
  target_input(client, iotbl)
 end
end

Note that the scripting environment is a simple event-driven imperative style using the Lua language, but with a modified and extended API (extensions being marked in italics). There are a number of “entry points” that will be invoked when the system reaches a specific state. These are prefixed with the name of the set of scripts and resources (‘appl’) that you are currently running. In this case, it is “runner”.

Starting with the initialiser, runner(). Runner takes the first two command line arguments (“demo”, “default”) and passes them through to launch_target. This function performs a lookup in the current database for a ‘target’ (=demo) and a ‘configuration’ (=default). To set this up, I had done this from the command line:

arcan_db add_target demo RETRO /path/to/libretro-mrboom.so
arcan_db add_config demo default

The reason for this indirection is that the scripting API doesn’t expose any arbitrary eval/exec primitives (sorry, no “rm -rf /”). Instead, a database is used for managing allowed execution targets, their sets of arguments, environment, and so on. This doubles as a key/value store with separate namespaces for arcan/arcan_lwa configuration, script bundles and individual targets.
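As a sketch of the key/value store side: the Lua API also exposes the database for appl-local configuration (store_key/get_key are assumed here from memory of the API, not taken from this example):

```lua
-- hypothetical: persist and restore a setting in the appl namespace
store_key("last_filter", "greyscale")
local mode = get_key("last_filter") -- "greyscale" on the next run
```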

RETRO indicates that we’re using libretro as the binary format here, and the demo is the ‘MrBoom‘ core. This can be substituted for anything that has a backend or dependency that can render and interact via the low-level engine API, shmif. At the time of this article, this set includes Qemu (via this patched backend), Xorg (via this patched backend), SDL2 (via this patched backend), Wayland (via this tool), SDL1.2 (via this preload library injection). There’s also built-in support for video decoding (afsrv_decode), terminal emulation (afsrv_terminal) and a skeleton for quickly hooking your own data providers (afsrv_avfeed), though these are spawned via the related launch_avfeed call.
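For the built-in providers, spawning could look something like this (a sketch; the exact argument string accepted by launch_avfeed is an assumption here, the call and event loop shape follow the runner example above):

```lua
-- hypothetical: start afsrv_decode directly on a file
local vid = launch_avfeed("file=demo.mp4", "decode",
 function(source, status)
  if status.kind == "resized" then
   resize_image(source, VRESW, VRESH)
  end
 end)
```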


This reads like “On start, compile a GPU processing program (“shader”). Idle until the adoption handler provides a connection on standard input, then assign this shader and an event loop to the connection. Forward all received interactive input. If the client attempts a resize, orient its coordinate system to match.”

function gscale()
 shader = build_shader(nil, [[
uniform sampler2D map_tu0;
varying vec2 texco;
void main(){
 float i = dot(
  texture2D(map_tu0, texco).rgb,
  vec3(0.3, 0.59, 0.11)
 );
 gl_FragColor = vec4(i, i, i, 1.0);
}
]], "greyscale")
end

function gscale_adopt(source, type)
 if type ~= "_stdin" then
  return false
 end

 client = source
 target_updatehandler(source, handler)
 image_shader(source, shader)
 resize_image(source, VRESW, VRESH)
 return true
end

function handler(source, status)
 if status.kind == "terminated" then
  return shutdown("", EXIT_SUCCESS)
 elseif status.kind == "resized" then
  resize_image(source, VRESW, VRESH)
  if status.origo_ll then
   image_set_txcos_default(source, true)
  end
 end
end

function gscale_input(iotbl)
 if valid_vid(client, TYPE_FRAMESERVER) then
  target_input(client, iotbl)
 end
end

This shouldn’t be particularly surprising given the structure of the ‘launcher’. The first thing to note is that build_shader automatically uses the rather ancient GLSL120, for the simple reason that it was/is at the near-tolerable spot of GPU programming in terms of feature set versus driver bugs versus hardware compatibility.

The interesting part here is the _adopt handler. This can be activated in three very different scenarios. The first is when you want to explicitly switch or reload the set of scripts via the system_collapse function and want to keep external connections. The second is when there’s an error in a script and the engine has been instructed to automatically switch to a fallback appl to prevent data loss. The third is the one being demonstrated here, and relates to the --pipe-stdin argument. When this is set, the engine reads a connection point identifier from standard input and sets it up via target_alloc. When a connection arrives, it is forwarded to the adopt handler with a “_stdin” type. The return value of the _adopt handler tells the engine whether to keep or delete the connection that is up for adoption.
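In rough pseudocode, the --pipe-stdin behaviour amounts to something like this (a sketch of what the engine does internally, not script-facing code; appl_adopt is a stand-in name for dispatching into the _adopt entry point):

```lua
-- read a connection point name from standard input, expose it,
-- then hand the first connection to the appl with a "_stdin" type
local cpoint = io.read("*l")
target_alloc(cpoint,
 function(source, status)
  if not appl_adopt(source, "_stdin") then -- stand-in dispatch
   delete_image(source)                    -- appl declined adoption
  end
 end)
```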

A subtle detail that will be repeated later is the origo_ll handling in the “resized” part of the event handler.

The skippable backstory is that in this area of graphics programming there are many generic truths. Truths such as: color channels will somehow always come in an unexpected order; GPU uploads will copy the wrong things into the wrong storage format in the most inefficient way possible; things you expect to be linear will be non-linear and vice versa; if something seems easy to implement, the only output you’ll get is a blank screen. The one relevant here is that at least one axis in whatever coordinate system is used will be inverted for some reason.

Any dynamic data provider here actually needs to cover for when or if a data source decides that a full copy can be saved by having the origo in the lower left corner rather than the default upper left one. For this reason, the script needs to react when the origo_ll flag flips.


This reads like “On start, load an image into layer 2, force-scale it to 64×64 and animate it moving up and down forever. Spawn a video decoding process that loops a user-supplied video and draw it translucent in a corner at 20% of the output size. Record the contents of the screen and the mixed audio output into a file as h264/mp3/mkv. Terminate if the ESCAPE key is pressed, otherwise forward all input.”

function composite(argv)
 symtable = system_load("symtable.lua")()
 setup_watermark(argv[1])
 setup_overlay(argv[2])
 setup_recording(argv[3])
end

function setup_watermark(fn)
 watermark = load_image(fn, 2, 64, 64)
 if not valid_vid(watermark) then
  return
 end
 move_image(watermark, VRESW - 64, 0)
 move_image(watermark, VRESW - 64, VRESH - 64, 100, INTERP_SMOOTHSTEP)
 move_image(watermark, VRESW - 64, 0, 100, INTERP_SMOOTHSTEP)
 image_transform_cycle(watermark, true)
end

function setup_overlay(fn)
 overlay = launch_decode(fn, "loop",
  function(source, status)
   if status.kind == "resized" then
    blend_image(source, 0.8)
    resize_image(source, VRESW * 0.2, VRESH * 0.2)
    order_image(source, 2)
   end
  end
 )
end

function setup_recording(dst)
 local worldcopy = null_surface(VRESW, VRESH)
 local buffer = alloc_surface(VRESW, VRESH)
 image_sharestorage(WORLDID, worldcopy)
 define_recordtarget(buffer, dst, "", {worldcopy}, {},
  function(source, status)
   if status.kind == "terminated" then
    print("recording terminated")
   end
  end
 )
end

local function handler(source, status)
 if status.kind == "terminated" then
  return shutdown("", EXIT_SUCCESS)
 elseif status.kind == "resized" then
  resize_image(source, VRESW, VRESH)
  if status.origo_ll then
   image_set_txcos_default(source, true)
  end
 end
end

function composite_adopt(source, type)
 if type ~= "_stdin" then
  return false
 end
 client = source
 target_updatehandler(source, handler)
 return true
end

function composite_input(iotbl)
 if iotbl.translated and symtable[iotbl.keysym] == "ESCAPE" then
  return shutdown("", EXIT_SUCCESS)
 end
 if valid_vid(client, TYPE_FRAMESERVER) then
  target_input(client, iotbl)
 end
end

Composite is a bit beefier than the other two steps, but some of the structure should be familiar by now. The addition of system_load simply reads/parses/executes another script, and the symtable.lua used here provides additional keyboard translation (how we can know which key is ESCAPE).

In setup_watermark, the thing to note is the last two move_image commands and the image_transform_cycle one. The time and interpolation arguments tell the engine that it should schedule this as a transformation chain, and the transform_cycle says that when an animation step is completed, it should be reinserted at the back of the chain. This reduces the amount of scripting code that needs to be processed to update animations, and lets the engine heuristics determine when a new frame should be produced.
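The same mechanism works for other transform types; for instance, a pulsing opacity animation on some hypothetical `logo` object (using only blend_image and image_transform_cycle from the example above):

```lua
-- fade down over 50 ticks, back up over 50, then repeat forever
blend_image(logo, 0.3, 50)
blend_image(logo, 1.0, 50)
image_transform_cycle(logo, true)
```

Once the cycle is enabled, the engine replays the chain on its own; no per-frame scripting is involved.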

In setup_overlay, launch_decode is used to set up a video decoding process that loops a single clip. If the decoding process tries to renegotiate the displayed size, it will be forcibly overridden to 20% of the output width/height and set at 80% opacity.

The setup_recording function works similarly to setup_overlay, but uses the more complicated define_recordtarget, which is used to selectively share contents with another process. Internally, what happens is that a separate offscreen rendering pipeline is set up with the contents provided in a table. The output buffer is sampled and copied or referenced from the GPU at a configurable rate and forwarded to the target client. In this case, the offscreen pipeline is populated with a single object that shares the same underlying datastore as the display output. The empty table provided simply means that we do not add any audio sources to the mix.
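The same pattern can be narrowed down to sharing a single object rather than the whole screen; a sketch, reusing only the calls and the simplified argument list from the example above (the output filename is made up):

```lua
-- record just the 'client' object instead of WORLDID
local copy = null_surface(VRESW, VRESH)
image_sharestorage(client, copy)
local buffer = alloc_surface(VRESW, VRESH)
define_recordtarget(buffer, "client_only.mkv", "", {copy}, {},
 function(source, status)
  -- react to "terminated" etc. here
 end)
```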

Final Remarks

I hope this rather long walkthrough has demonstrated some of the potential that is hidden in here, even though we have only scratched the surface of the full API. While the example presented above is, in its current form, very much a toy – slight variations of the same basic setup have been successful in a number of related application areas, e.g. surveillance systems, computer vision, visual performances, embedded-UIs and so on.

Even more interesting opportunities present themselves when taking into account that most connections can be dynamically rerouted, and things can be proxied over networks with a fine granularity, but that remain as material for another article.


Arcan 0.5.3, Durden 0.3

It’s just about time for a new release of Arcan, and way past due for a new release of the reference desktop environment, Durden. Going through some of the visible changes on a ‘one-clip or screenshot per feature’ basis:


Most Arcan changes are internal engine modifications or changes to the assortment of support tools and libraries, without interesting visual results, so dig into the detailed changelog further below for more.

Crash Recovery Improvements:

All unhandled server-side scripting errors (i.e. no fallback application is set) are now interpreted by clients as display server crashes, triggering the crash recovery-reconnect behaviour rather than a clean shutdown. Two shmif-related bugs preventing Xarcan from recovering have also been squished. This leaves arcan_lwa and the waybridge tool as the remaining items for basic crash recovery (omitted for now as the allocation management is fiendishly difficult). The video shows me explicitly killing the display server process (it’s on a while true; do arcan durden; done loop) and the new instance getting a reconnect from the recovering xarcan bridge.

Improved support for Wayland clients:

Here we see SDL2 (GL-test), QT (konsole), EFL (terminology), Weston-terminal and gtk3-demo mixing wl_shm, wl_drm, wl_shell, wl_zxdg_shellv6, and a bunch of other protocols and subprotocols, with about as many variations of decorations and mouse cursors.

Initial Bringup on OpenBSD:

There’s a lot of things left to do on this backend, with the bigger issues (as usual) being system integration with pre-existing virtual terminal management schemes and getting access to input samples in the right states in the processing pipeline. The following screenshot shows that enough bring-up has been done to get keyboard input and Durden working well enough to show graphics, spawn terminal frameservers etc.

OpenBSD screenshot

Terminal/TUI experimental features:

Smooth scrolling, ligatures and non-monospace rendering. These are not really solvable problems for legacy terminal emulation (and are disabled by default), but they can be helpful in some limited settings, and will be relevant for upcoming work on terminal-emulator-liberated TUIs/CLIs. The following video shows three differently configured terminals (though hinting looks like crap when recorded in this way): one that smooth-scrolls with a bitmap font, one that shapes with a non-monospaced font, and one that performs ligature substitutions with the fira-code font.


A lot of the features and ideas hidden within durden have now been documented on the webpage, now situated at durden.arcan-fe.com

Window slicing:

This is useful when you only want to keep an eye on- or interact with- a specific part of a window (say an embedded video player in a web browser), slicing away crop-bars from video players or avoiding toolkit enforced decorations.

Overlay tool:

This is useful when you want some contents to follow you regardless of the context you’re working inside (except dedicated fullscreen). This can, of course, be combined with window slicing for better effect.

Input multicast:

This is useful when you have a hierarchy of clients you want to receive the same input in the same time frame (e.g. when doing latency measurements) or when controlling a large number of terminals, VMs or remote desktops.


Window- relayout/resize animations for float/tile:

This was primarily added as a cheap way of debugging performance and interactions between the animation subsystems and the window-type dependent scaling policies. The effect seemed pretty enough to be left as a configurable toggle (global/config/visual/window animation speed).

LED devices and profiles:

This is useful for reducing eye strain by mapping display contents to ambient light, for communicating system events like incoming alerts and for improved UI by showing which keybindings are currently available and a color hint as to what they do.

This clip shows a custom rule that maps the contents of the currently selected window to LEDs placed behind the monitor.

This clip shows the currently accepted keybindings (all lit = input locked to window) color coded by menu path, while the F1..n buttons indicate the global audio gain.

External clipboard manager support:

Though the tool and the interface has been provided by Arcan for a while, the WM (Durden in this case) still needs to allow/enable support since Arcan itself doesn’t dictate or care about clipboards as such. In Durden, this support now manifests itself as enabling a ‘clipboard bridge’ that allows one passive (global listen), active (global insert) or full (global listen/insert) clipboard connection.


External gamma controller support:

Although the subsystem still needs some work on the Durden side, it is possible to either allow all clients full gamma control access (any client can make the screen go unusably dark) or enable it on a per-client basis (target/video/advanced/color-gamma synch). When performed on an Xarcan instance, the currently active display will be exposed over XRandr, meaning that old/legacy tools that require ramp controls should work transparently.

What’s Next?

While we are still waiting for advancements outside of our control in regards to lower level primitives (synchronisation control, drivers switching over to atomic modesets, developments at xdc2017 in regards to usable buffer APIs, Vulkan driver stability, the list goes on) – the main development drivers for the 0.5 branch (which is, by far, the heaviest one planned for the entirety of the base project); heterogeneous multi-GPU, live driver updates, VR and low level system graphics in general – will keep progressing alongside the various supporting tools.

The more enticing developments in the near-future are partly putting the finishing touches on the TUI- set of libraries and partly the unwrapping of the Lua APIs. As mentioned in earlier posts, the Lua interface has acted as a staging ground for finding the set of necessary/sufficient features for writing quite advanced graphical applications. Quite some care was put into avoiding language-feature masturbation and object-oriented nonsense for the very reason of having the API being able to double as a privileged, GPU friendly drawing protocol. This means that the driving scripts can be decoupled from the engine and be controlled by an external process, creating some rather interesting possibilities – the least of which is re-enabling the separate window manager model from X, but without the intractable synchronisation issues.

Tracking for other, smaller, enhancements can be found in the issue tracker: arcan , durden.

Detailed Changelog

Arcan – Engine:

  • Refactored frameserver- spawning parts to cut down on duplicated code paths and make setup/control more streamlined.
  • Support for tessellated 2D objects, more fine-grained control over individual vertices.
  • Extended agp_mesh_store to cover what will be needed for full glTF2.0 support.
  • Crash-recovery procedure for external clients now also applies to scripting layer errors when there is no fallback appl set.
  • Reworked font/format string code to bleed less state and automatically re-raster if the outer object is attached to a rendertarget with a different output density.
  • Added additional anchoring points to linked images (center-left, center-top, center-right, center-bottom)
  • VR- mapping work for binding external sensor “limbs” to 3d models, continued bringup on managing vrbridge- instances and fast-path integration with vrbridge-provided sensor output.

Arcan – Lua:

  • New function: image_tesselation, used to change subdivisions in s and t directions, and to access and change individual mesh attributes for 3d objects.
  • New function: rendertarget_reconfigure, used to change the target horizontal and vertical density of a rendertarget.
  • New functions: vr_maplimb, vr_metadata
  • Updated function: define_rendertarget now returns status, accepts more mode flags (for MSAA) and allows target density specification.
  • Updated function: alloc_surface now allows additional backend storage formats (FP16, FP32, alpha-less RGB565, …)
  • Updated function: link_image, added additional anchoring points

 Arcan – Shmif:

  • New library, arcan-shmif-server. This is used for proxying / multiplexing additional connections onto an established one. Primary targets for this lib are a per-client networking proxy, and TUI/Terminal support for delegating decode/rendering to other processes.
  • Added support for HANDOVER subsegments. These are subsegments that mutate into primary segments in order to reuse a connection to negotiate new clients without exposing a listening channel, allowing a client to negotiate connections on behalf of another.
  • RESET_ level 3 events now carry a reference to the deregistered descriptor so clients have a chance to remove from custom select()/poll() hooks that cache descriptor use.

Arcan – TUI/Terminal:

  • Another dissemination/progress article: https://arcan-fe.com/2017/07/12/the-dawn-of-a-new-command-line-interface/
  • support for bitmapped fonts (PSFv2) as an optional path for faster rendering on weak hardware and freetype- less builds.
  • Built-in bitmapped terminus font for three densities/sizes (small, normal, large) as fallback when no font is provided by the display server connection.
  • Added dynamic color-scheme updates.
  • Rendering-layer reworked to support shaping, custom blits, …
  • Experimental double buffered mode (ARCAN_ARG=dblbuf)
  • Experimental smooth scrolling in normal mode (ARCAN_ARG=scroll=4)
  • Experimental shaping mode kerning for non-monospace fonts (ARCAN_ARG=shape)
  • Experimental ligature/substitution mode for BiDi/i8n/”code fonts” via Harfbuzz (ARCAN_ARG=substitute)
  • Lua bindings and tool for experimenting with them (src/tools/ltui)

Arcan – Platform:

  • Refactored use of environment variables to a configuration API
  • EGL-DRI: VT switching should be noticeably more robust, EGL libraries can now be dynamically loaded/reloaded to account for upgrades or per-GPU sets of libraries.
  • AGP: Updated GLES2 backend to work better with BCM drivers.
  • Evdev: Added optional support for using xkblayouts to populate the utf8 field.
  • EGL-GLES: quick fixes to bring BCM blobs back to life on rPI.
  • OpenBSD: initial port bring-up, keyboard input and graphics working.
  • SDL2: added SDL2 based video/event platform implementation, some input issues left to sort out before 1.2 support can be deprecated and this be the default on OSX.

Arcan – Tools:

  • Aloadimage: basic support for SVG images.
  • Doc: started refactoring lua API documentation format to double as IDL for re-use of lua API as privileged drawing and WM- protocol.
  • Xarcan: synched to upstream, parent crash recovery fixes, old-drawing mode (no -glamor, no dri3) synchronization and color management improvement.
  • Qemu/SDL2: synched to upstream.


  • XKB- Layout transfer support, basic pointer and pointer surface (wl_seat)
  • Damage Regions, dma-buf forwarding (wl_surf)
  • More stubs (data_device/data_device manager/data_offer/data source)
  • zxdg-shell mostly working (toplevel, positioners, popup)
  • added support for relative_pointer motion


  • Documentation moved to a separate webpage, http://durden.arcan-fe.com

  • allow client- defined mouse-cursor support
  • Window slicing: target/window/slice allows mouse-selected subregion to (active->input forward or passive) bind a subregion of one window to a new window.

  • External clipboard manager support: external clients can be permitted to read and/or inject entries unto the clipboard. See global/config/system/clipboard-bridge.

  • Gamma controls: external clients can be permitted to set custom color/ and gamma/ lookup tables, either per window or globally. See target/video/advanced/color-gamma synch and global/config/system/gamma-bridge.

  • Filesystem-like IPC: the iopipes IPC path has been extended to allow ls, read, write and exec like navigation of the menu subsystem. This can be bound to a FUSE-wrapper to fully control (script!) durden from a terminal.

  • LED devices: added support for profile-driven LED device control, see devmaps/led/README.md or global/config/led

  • Input multicast: added support for input multicast groups. Enable per window via target/input/multicast. Keyboard input received will be forwarded to all children.

  • Statusbar: can now be set to ‘HUD’ mode, where it is only visible on the global/ or target/ menu HUDs. (config/visual/bars/statusbar(HUD)/…)

  • Tools/Autolayout improvements: can now disable titlebars on side-columns, and allow a different shader on side-columns (see global/config/tools/autolayouting)

  • Tools/Overlay: [new], can now take the contents of a window and add to a vertical column stack at left or right edge as scaled-down previews.

  • Target/Video/Advanced: allow per-window output density overrides.

  • Atypes/wayland/x11: new scaling mode, ‘client’, to let the client know about the max dimensions, but let it choose its own actual size within those constraints.

  • Window- relayout/resize animations for float/tile: disable/enable via config/visual/window animation speed

  • Dynamically switchable visual/action schemes (devmaps/schemes/README.md) that can be used to set a global, per-display, per workspace or per window scheme of fonts and other configuration presets.

  • Allow GPU- authentication controls.

  • Split mouse cursors into sets.

  • More consistent font/font-size switching when migrating across displays with different densities.

  • Default-off display profiles for vive/psvr.

  • Defer window attachment to reduce initial storm of resize operations.

  • Menu options for appl- switching (global/system/reset/…).

  • Hidden bind path for suspend-state toggle (target/state/…).

  • Menu path to reset workspace background (global/workspace/…)

  • Menu path for global/workspace/switch/last.

  • Option to force bitmap font path for terminal.

  • A shader for luma (monochrome) – only mode.

  • Atype- profile for wayland clients.

  • Option to disable/block mouse (global/input/mouse/block).

  • Target menu path for set-x, set-y in float mode.

  • Mouse button debounce timer support (global/input/mouse/debounce).

  • Expose backlight controls per display (global/display/displays/…)

  • Tools/pulldown: can now set a shadow/colored border.


The Dawn of a new Command Line Interface

disclaimer: this is a technical post aimed at developers being somewhat aware of the problem space. There will be a concluding ‘the day of…’ post aimed at end users where some of the benefits will be demonstrated in a stronger light.

A few months back, I wrote a lighter post about an ongoing effort towards reshaping the venerable Linux/BSD CLI to be free of the legacy cruft that comes with having to deal with the emulation of old terminal protocols, stressing the point that these protocols make the CLI less efficient, and hard to work with from both a user- and a developer- perspective. In this post, we’ll recap some of the problems, go through the pending solution, update with the current progress and targets, and see what’s left to do.

To recap, some of the key issues to address were:

  • Split between terminal emulator and command line shell breaks desktop integration – visual partitions such as windows, borders and popups are simulated with characters that are unwanted in copy-paste operations and fail to integrate with an outer desktop shell (if any).
  • Code/data confusion – both the terminal emulator and text-oriented user interfaces (TUIs) try to separate content from metadata using a large assortment of encoding schemes, all prone to errors and abuse, difficult to parse and ridden with legacy.
  • Uncertain capabilities/feature-set – basic things like color depth, palette, character encoding schemes and so on are all probed through a broken mishmash of environment variables, capability databases and the actual support varies with the terminal emulator that is being used.
  • Confusion between user-input and data – programs can’t reliably distinguish between interactive (keyboard) input, pasted/”IPC” input and other forms of data entry.
  • Lack of synchronisation. This makes it impossible for the terminal emulator to know when it is supposed to draw, and signal propagation contributes to making resize operations slow.
  • Crazy encoding schemes for representing non-character data – such as Sixel.

This just scratches the surface: it doesn’t go into related issues of user interaction and consistency, and it ignores the entire problem space of system interaction when it comes to tty devices, input modes, virtual terminal switching and so on.

If you consider the entire feature-set of all protocols that are already around and in use, you get a very “Cronenberg”- take on a display server and I, at least, find the eerie similarities between terminal emulators and the insect typewriters from Naked Lunch amusing, tragic and frightening at the same time; the basic features one would expect are there, along with some very unwanted ones, but pieced together in an outright disgusting way. If we also include related libraries and tools like curses and Turbo Vision we get a clunky version of a regular point and click UI toolkit. Even though the scope is arguably more narrow and well-defined, these libraries are conceptually not far away from the likes of Qt, GTK and Electron. Study Unicode and it shouldn’t be hard to see that ‘text’ is mostly graphics; the largest difference by far is the smallest atom, and the biggest state-space explosion comes from saying ‘pixel’ instead of cell.

So the first question is, why even bother to do anything at all within this spectrum instead of just maintaining the status quo? One may argue that we can, after all, write good CLI/TUIs using QT running on Xorg today, no change needed – it’s just not the path people typically take; maybe it’s the paradigm of favouring mouse or touch oriented user interaction that is “at fault” here, along with favouring style and aesthetics over substance. One counterpoint is that the infrastructure needed to support the toolkit+display server approach is morbidly obese into the millions of lines of code, when the problem space should be solvable within the tens-of-thousands, but “so what, we have teraflops and gigabytes to spare!”. Ok, how about the individual investment of writing software? accommodating for disabilities? attack surface? mobility and mutability of produced output? efficiency for a (trained) operator? or when said infrastructure isn’t available? the list goes on.

There is arguably a rift here between those that prefer the ‘shove it in a browser’ or flashy UIs that animate and morph as you interact, and those that prefer staring into a text editor. It seems to me that the former category gets all the fancy new toys, while the latter mutters on about insurmountable levels of legacy. What I personally want is many more “one-purpose” TUIs, and for them to be much easier to develop. They need to be simpler, more consistent, obvious to use, and more configurable. That's nice and dreamy, but how are “we” supposed to get there?

First, let's consider some of the relevant components of the Arcan project as a whole, as the proposed solution reconfigures these in a very specific way. The following picture shows the span of the current components:

This time around, we're only interested in the parts marked SHMIF, Terminal and TUI; everything else can be ignored. SHMIF is the structural glue/client IPC. TUI is a developer-facing API built on top of SHMIF, but with actual guarantees of being a forward/backward compatible API. Terminal is a vtXXX terminal emulator/state machine built using a modified and extended version of libtsm.

Inside the ‘Arcan’ block from the picture, we have something like this:

Arcan (bin) layers

From this, we take the frameserver (IPC) block and put it into its own shmif-server library. We take the platform block and split it out into its own library, libarcan-abc. Terminal is extended to be able to use these two APIs, along with optional Lua (or other) bindings for the TUI API, so that the higher-level shell/CLI logic, with all its string-processing ickiness, can be written in something that isn't C. This opens the door for two configurations. Starting with the more complex one, we get this figure:


Here, Arcan is used as the main display server, or hooked up to render using another one (there are implementations of the platform layer for both low-level and high-level system integration). The running ‘appl’ acts as the window manager (which can practically be a trivial one that just runs everything fullscreen, or an alt+fN VT-switching style with only a few lines of code) and it may spawn one or many instances of afsrv_terminal. These can be run in ‘compatibility mode’ where the emulator state machine is activated and it acts just like xterm and friends.
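The trivial fullscreen ‘appl’ mentioned above can be sketched in a handful of lines of Lua. This is an illustrative sketch, not code from the repository: launch_avfeed, resize_image, show_image and the VRESW/VRESH globals follow the Arcan Lua API, but the exact event names and handler arguments should be checked against the current API documentation.

```lua
-- fullscreen.lua : hedged sketch of a minimal 'appl' that spawns one
-- afsrv_terminal instance and stretches it across the entire display
function fullscreen()
	launch_avfeed("", "terminal",
		function(source, status)
			if status.kind == "resized" then
				-- fit the terminal to the output and make it visible
				resize_image(source, VRESW, VRESH)
				show_image(source)
			elseif status.kind == "terminated" then
				delete_image(source)
			end
		end
	)
end
```

A real appl would also implement a fullscreen_input(iotbl) entry point and forward events to the terminal (via target_input), which is omitted here for brevity.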

We can also run it in a simpler form:


In this mode, the terminal works directly with the platform layer to drive displays and sample input. It can even perform this role directly at boot if need be. An interesting property of shmif here is the support for different connection modes (which I'll elaborate on in another post) where you can both interactively migrate and delegate connection primitives. This means that you can switch between these two configurations at runtime, without data loss – and even have the individual clients survive and reconnect in the event of a display server crash.

No matter the configuration, you (the ghost behind the shell) get access to all the features in shmif and can decide which ones should be used and which ones should be rejected. You are in control of the routing via the choice of shell (and the appl, for the complex version). Recall that the prime target now is local text-oriented command-line interfaces – not changing or tampering with the awk | sed | grep | … flow; that's an entirely different beast. In contrast to curses and similar solutions, this approach also avoids tampering with stdin, stdout, stderr or argv, because connection primitives and invocation arguments are inherited or passed via the environment. This should mean that retrofitting existing tools can be done without much in terms of ifdef hell or breaking existing code.

Anyhow, most of this is not just vapours from some hallucinogenic vision, but has in fact already been implemented and been in testing for quite some time. Current and near-future work focuses on improving the quality of some of the existing stages and adding:

  • Double buffering on the virtual cell screen level to add support for sub-cell “smooth” scrolling, text shaping, BiDi, and non-monospace, properly kerned, text rendering.
  • API and structures for designating regions (alt-screen mode) or lines (normal mode) for custom input, particularly mixing/composing contents from other tui clients or frameservers.

Then comes some more advanced refactoring:

  • Shmif-server API still being fleshed out.
  • Libarcan-abc platform split, as it depends on another refactoring effort.
  • Lua bindings and possibly an example shell.

And more advanced “some time in the future” things:

  • Shmif-server-proxy tool that can convert to-/from- a network or pipe-passed ‘line format’ (protocol) to enable networking support and test high latency/packet loss behavior.
  • CPU- only platform rasteriser (current form uses GL2.1+ or GLES2/3).
  • Ports to more OSes (currently only Linux, FreeBSD, OSX).

Should all these steps succeed, the last ‘nail in the coffin’ will be to provide an alternative platform output target that undoes all this work and outputs into a VT100-compliant mess again – all for the sake of backwards compatibility. That part is comparatively trivial, as it is the end result of ‘composition’ (merge all layers); it is the premature composition that is (primarily) at fault here, as information is irreversibly lost. It is just made worse in this case because the feature scope of the output side (desktop computer instead of dumb terminal) and the capability of the input side (clients) mismatch due to the communication language.

Posted in Uncategorized | Leave a comment

Arcan 0.5.2

A new version of Arcan has been tagged. There is no demo video or fancy screenshots this time around; those things will have to wait until updates come to the related projects (mainly Durden) in a few weeks. Most of the work that remains on the 0.5 series isn’t much to look at by itself – but is nevertheless conceptually and technically interesting.

Some generic highlights from the last ~8 months of work:

The ‘Prio’ side-project – described in more depth in the One night in Rio – Vacation photos from Plan9 post. Outside of its value as a security-research target, it is a homage to ‘the elders’ and a reminder that there are many more “ways of the desktop” out there than the ‘popular by legacy and traction’ Win/OSX/… one. Prio also serves as a decent base for rapidly putting together highly customised environments.

A number of new supporting tools (src/tools) – There's aloadimage for image loading, which will serve as a testing and development tool for security work like sandbox hardening, quality-improvement work in terms of performance and – more importantly – HDR rendering and HDR formats. There's aclip for building CLI- and scriptable external clipboard managers. There's shmmon for connection inspection, monitoring and debugging, and there's waybridge for Wayland client support. Speaking of the Wayland support, it's getting to a stage where things like gtk3-demo, weston-terminal etc. start showing up – but so far, it has unfortunately been a very unpleasant beast to work with, and at the current pacing, it will take at least another good month or two until it's really usable.

VR grunt work – One of the near-future prospects that interests me the most, on both a personal and a professional level, is getting rid of much of the unintuitive and overcomplicated cruft that browsers, UI toolkits and the “traditional” desktop put between the individual and computing. “Desktop VR” as it has been presented so far is little more than low-definition “planes-in-space”. With the layering and division of responsibility that Arcan brings to the table, much more interesting opportunities should arise. Putting flowery visions aside, the support that has been integrated right now is far from stellar, but the “heavy on experimentation, light on results” phase of figuring out how everything from device interfacing to the scripting API is supposed to work – is nearing its end. The first part of this is the tool vrbridge, which provides device control and aggregation (the current hardware reality of coding for VR is mostly a closed-source, vendor-locked-in mess where you won't get access to the primitives without swearing allegiance to bulky full-engine APIs) and which will be fleshed out over the coming few releases.

TUI – Covered in the (regrettably crap) blog post Chasing the dream of a terminal-free CLI is the (soon to be) developer-facing API that takes advantage of SHMIF features to provide a way of building text-oriented/command-line interfaces that get rid of the legacy baggage and limitations that stem from having the CLI shell always work through terminal emulator protocols like VT100 – instead making it talk with the display server directly. This will be accompanied by Lua bindings and a (bash/zsh/…)-like shell environment. The SensEYE sensors and translators will also be reworked to use this API.

Xarcan – Is a patched Xorg that interfaces with SHMIF and provides ‘X in a box’-like integration. For all the ‘beating up the old man’ that Xorg seems to get, the backend coding was neither more nor less painful than Qemu or SDL proved to be. Normal use should be just fine, but dipping into glamor+accelerated graphics seems to be good at provoking graphics driver crashes, possibly since we bind the GL context to a render node rather than the card node. See the README in the git repository for more details.

Platform refactoring – The main ‘low-level’ platform backend, egl-dri, has been extended with some basic switchable ‘synchronization strategies’ for dynamically changing scheduling priorities between energy efficiency, lower input latency, smoother animations and so on. The egl-nvidia code has been integrated with the egl-dri platform now that the unified buffer project seems to have stalled. There are some caveats to activating and using it with the NVIDIA closed-source blobs, covered further in the wiki. Most GL use has been refactored to be dynamically loadable and reloadable, getting us much closer to multi-vendor, multi-GPU use and live driver upgrades.

LED subsystem rework – This was very recently covered in the Playing with LEDs post, so I will only mention it briefly: the way LEDs – keyboard NumLock, CapsLock, ScrollLock, display backlights and more advanced “gamer keyboard”-like setups – were all hooked up has been reworked and mostly moved to a tiny protocol over a pipe.

SHMIF improvements – The internal segmentation/IPC API has been extended to support negotiation for privileged features, such as access to display lookup tables and VR metadata (also, placeholder: HDR, Vector). Extended accelerated graphics (OpenGL etc.) has been split out into a shmifext library, so that all the other project backends use the same method for getting accelerated GPU access. This will primarily play a role when we need to respond to hotplug events or load-balance between multiple GPUs. A new ‘preroll’ stage has been added to the connection process in order to provide an explicit synch-point for script-dependent initial metadata, which should cut down on connect-and-draw latency.

An interesting development that is being experimented with, but won't be pushed for a while, is reusing the Lua API (most of which was actually designed with this in mind) as an external protocol for moving the ‘appl’ layer out into its own process. This is relevant for two reasons. The first is to make it possible to control the engine using languages other than Lua; it also makes it possible to run things in an X11-like “separate window manager” division of responsibility in an ‘opt-in’ way. The second, more distant, reason is that it provides a render API for network-like transparency – though there are likely to be more hard-to-foresee problems lurking here once latency and round-trips are added.

A slightly more detailed changelog is available on the wiki.


Playing with LEDs

It's not uncommon to only bring up monitors when talking about display server outputs, but there are more subtle, yet arguably important, ones that don't always get the attention they deserve. One such output is the LED (along with the more fringe light emitters that get bundled together under the acronym), present in both laptop backlights and the keyboard latched-modifier indicators used for num-lock, scroll-lock and so on.

In the case of the display backlight, the low-level control interface may be bundled with backlight management, so there’s a natural fit. In the case of keyboard modifiers, it comes with the window management territory – someone needs privilege to change the device state, and different applications have different views on active modifiers. A lot of gaming devices, recent keyboards and mice today also come with more flexible LED interfaces that permit dozens or hundreds of different outputs.

Arcan has had support for fringe LED devices for a long time (since about ~2003-2004) in the form of the Ultimarc LED controller, used for very custom builds. Recently, this support has been refactored somewhat and extended to be usable both for internally managed LED devices that come bundled with other input and output devices, and via a custom FIFO protocol for connecting to external tools.

As part of developing and testing this rework, the following video was recorded:

Though it may not be easy to see (I don't have a good rig for filming at the moment), a few interesting things happen here. The hardware setup comprises a few Adafruit NeoPixel sticks attached to the back sides of the monitor, along with an Arduino and a G410 RGB keyboard. The software setup is a custom LED profile for Durden (an upcoming feature for the next release). This profile samples some state of the currently selected window, like contents or level of trust (since application origin and permissions are tracked), and maps it to the Arduino-controlled LEDs. It is updated in response to changes to the canvas contents of the window, and moves with window-selection state. There is also a ‘keymap’ profile that describes how the currently active keyboard layout translates into RGB keyboard LEDs. This allows the input state, like the currently available keybindings, to be reflected by the lights on the keyboard. When a meta key is pressed, only the keybindings relevant to that key will be shown on the keyboard.

This can be utilised for more ambient effects, like in the following video:

Here, the prio WM is used, and it maps the contents of the video being played back in the lower right corner to both the display backlight, and to the keyboard.

This system allows a number of window-manager states to be trivially exposed – things such as notification alerts, resource consumption issues, application crashes etc. – in ways that are (to some) much less intrusive than popup windows or sound alerts.

In the following clip you can see a profile running on Durden that maps global system state (the top bar corresponds to the current audio volume levels, and the all-white state indicates that input is locked to a specific window), while the other colors indicate valid keybindings and what they target.


Though the driver support may be sketchy (RGB Keyboards and related gaming peripherals can be absolutely terrible in this regard), the patches and tools used for the demos above can be found in this git repository.


One night in Rio – Vacation photos from Plan9

This post is about experimenting with imitating and extending the window management concepts from the venerable Plan9, Rio. The backstory and motivation is simply that I’ve had the need for a smaller and more ‘hackable’ base than the feature-heavy Durden environment for a while, and this seemed like a very nice fit.

For the TL;DR – here’s a video of a version with some added visual flair, showing it in action:

From this, the prio project has been added to the Arcan family. I wanted this experiment to exhibit three key features that I'll cover in more detail:

In addition to these features, it would be a nice bonus if the code base was simple enough to use as a starting point for playing around with kiosk-/mobile-/tablet- level window management schemes.

User-defined-confined Spaces

The user-designated confined spaces are shown in the beginning of the video via the green region that appears after picking [menu->new], which allows the user to “draw” where a new CLI/terminal group should spawn. By default, the clients are restricted to the dimensions of this space, and any client-initiated attempts to resize or reposition will be ignored or rejected.


This is not something that is perfect for all occasions; it is a “better” fit for situations where you have a virtual machine monitor, emulator, video playback, remote desktop session or a command-line shell.

The reason is simply that these types of applications have quite restrained window management integration requirements. This means that you can get away with forceful commands like “this will be your display dimensions and that’s final” or trust that a “hi, I’m new here and I’d like to be this big” request won’t become invalid while you are busy processing it. By contrast, a normal UI toolkit application may want to spawn sub-windows, popup windows, tooltips, dialogs and so on – often with relative positioning and sizing requirements.

The primary benefits of this model are that it:

  1. Drastically cuts down on resize events and resize requests.
  2. [more important] Provides a basis for compartmentalisation.

Resize events are among the more expensive things that can happen between a client and its display server: they both need to agree on a new acceptable size, and new memory buffers need to be allocated, possibly with intermediate datastores. Then, the new buffers need to be populated with data, synchronized, composited and scanned out to the display(s). At the 4k/8k HDR formats we are rapidly reaching as the new normal, a single buffer may reach sizes of 265MB (FP16 format @8k), amounting to at least a gigabyte of buffer transfers before you'd actually see it on the screen.
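As a quick sanity check of the 265MB figure, here is the arithmetic for a single 8k buffer in an FP16 RGBA format:

```lua
-- one "8k" (7680x4320) buffer, four channels at two bytes (FP16) each
local w, h = 7680, 4320
local bytes_per_px = 4 * 2
local total = w * h * bytes_per_px
print(total) -- 265420800 bytes, i.e. ~265MB per buffer
```

Four or five such buffers in flight (client back buffer, server copy, compositor output, scanout) is how a single botched resize crosses the gigabyte mark.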

You really do not want a resize negotiation to go wrong and have buffers be useless and discarded due to a mismatch between user expectations and client needs.

*very long explanation redacted*

Over the course of a laptop battery cycle, waste on this level matters. This is one of the reasons why ‘tricks of the trade’ like rate-limiting/coalescing resize requests during mouse-drag-resize event storms, auto-layouter heuristics for tiling window management, and border-colour padding for high-latency clients will become increasingly relevant. Another option for avoiding such tricks is to pick a window management scheme where the impact is lower.

The basic idea of compartmentalisation (and compartmentation in the fire-safety sense) here is that you define the location and boundary (your compartment) for a set of clients. The desired effect is that you know what each client is and where it comes from. There should be no effective means for any of these clients to escape its designated compartment or misrepresent itself. To strengthen the idea further, you can also assign a compartment additional visual identity markers, such as the color of window decorations. A nice example of this can be found in Qubes-OS.

With proper compartmentalisation, a surreptitious client cannot simply ‘mimic’ the look of another window in order to trick you into giving away information. This becomes important when your threat model includes “information parasites”: where every window that you are looking at is also potentially staring back at some part of you, taking notes. The catch is that even if you know this or have probable cause to suspect it, you are somehow still forced to interact with the surreptitious client in order to access some vital service or data – simply saying “no!” is not an option (see also: How the Internet sees you from 27c3 – a lack of a signal, is also a signal).

The natural countermeasure to this is deception, which suffers from a number of complications and unpleasant failure modes. This is highly uncharted territory, but this feature provides a reasonable starting point for UI-assisted compartmentalisation and per-compartment deception profiles – but that's a different story for another time.

Hierarchical Connection Structure

Generally speaking, X and friends maintain soft hierarchies between different ‘windows’ – a popup window is a child of a parent window and so on – forming a tree-like structure (for technical details, see, for instance, XGetWindowAttributes(3) and XQueryTree(3)). The ‘soft’ part comes from the fact that these relations can be manipulated (reparented) by any client that acts as a window manager. Such hierarchies are important for transformations, layouting, picking and similar operations.

A hierarchy that is not being tracked, however, is the one behind the display server connections themselves – the answer to the question “who was responsible for this connection?”. A simple example is that you run a program like ‘xterm’. Inside xterm, you launch ‘xeyes’ or something equally important. From the perspective of the window management scheme, ‘xeyes’ is just a new connection among many, and the relationship to the parent xterm is lost (there are hacks to sort of retrieve this information, but not reliably, as they require client cooperation).

In the model advocated here, things are a bit more nuanced: Some clients gets invited to the party, and some are even allowed to bring a friend along but if they start misbehaving, the entire party can be ejected at once without punishing the innocent.

In Plan9/Rio, when the command-line shell tries to run another application from within itself, the new application reuses (multiplexes) the drawing primitives and the setup that is already in place. While the arcan-shmif API still lacks a few features that would allow for this model to work in exactly this way, there is a certain middle ground that can be reached in the meanwhile. The following image is taken from the video around the ~40 mark:


Instead of multiplexing multiple clients on the same connection primitive, each confined space acts as its own logical group, mapping new clients within that group to individual tabs, coloured by their registered type. Tabs bound to the same window come from the same connection point. The way this works is as follows:

(feel free to skip, the description is rather long)

The connection model, by default, in Arcan is simple: nothing is allowed to connect to the display server. No DISPLAY=:0, no XDG_RUNTIME_DIR, nothing. That makes the server part rather pointless, so how does a data provider get hooked up?

First, you have the option to whitelist: add entries to a database and have a script issue an explicit launch_target call which references a database entry, or expose a user-facing interface that eventually leads down the same path. The engine will, in turn, spawn a new process which inherits the primitives needed to set up a connection and synchronise data transfers. When doing so, you also have the option of enabling additional capabilities, such as allowing the client to record screen contents, alter output display lookup tables or even inject input events into the main event loop, even though such actions are not permitted by default.
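In appl-script terms, the whitelist path could look something like the sketch below. launch_target is the real API entry point, but the target name, its configuration and the handler details are illustrative assumptions – the target must first have been registered in the database (e.g. with the arcan_db tool).

```lua
-- hedged sketch: launch a pre-whitelisted database target
-- ("xterm" / "default" are made-up database entries for illustration)
local vid = launch_target("xterm", "default", LAUNCH_INTERNAL,
	function(source, status)
		if status.kind == "resized" then
			-- accept the client-negotiated size and display it
			resize_image(source, status.width, status.height)
			show_image(source)
		elseif status.kind == "terminated" then
			delete_image(source)
		end
	end
)
```

Since the entry comes from the database, the script never deals with binary paths or arguments; those stay under the control of whoever administers the whitelist.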

Second, you have the option to use designated preset software which performs tasks that are typically error-prone or security-sensitive. These are the so-called frameservers – intended to improve code re-use of costly and complex features across sandbox domains. The decode frameserver takes care of collecting and managing media parsing, the encode frameserver records or streams video, and so on. Strong assumptions are made about their environment requirements, behaviours and volatility.

Lastly, you have explicit connection points. These are ‘consume-on-use’ connection primitives exposed in some build-time specific way (global shared namespace, domain socket in the home directory and so on). The running scripts explicitly allocate and bind connection points on a per-connection basis (with the choice to re-open after an accepted connection) using a custom name. This allows us to:

  1. Rate-limit connections: external connections can be disabled at will, while still allowing trusted ones to go through.
  2. Compartmentalise trust and specialise user-interface behaviour based on connection primitives used.
  3. Redirect connections: you can tell a client “in case of an emergency (failed connection), here is another connection point to use”. This is partly how crash recovery in the display server is managed, but it can also be used for much more interesting things.

When a new user-designated confined space is created, a connection point is randomly generated and forwarded (via the ARCAN_CONNPATH environment variable) to the shell that will be bound to the space. The shell can then choose to forward these connection primitives to the clients it spawns, and so on.
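The consume-on-use behaviour described above translates into a small recursive pattern on the scripting side. This is a sketch assuming the target_alloc call and a ‘connected’ event as in the Arcan Lua API; the connection point name is made up:

```lua
-- hedged sketch: bind a named connection point, and re-arm it each time
-- a client consumes it, so a whole group can share one name
local function open_cpoint(name)
	target_alloc(name,
		function(source, status)
			if status.kind == "connected" then
				open_cpoint(name) -- consumed on use; rearm for the next client
			elseif status.kind == "resized" then
				show_image(source)
			end
		end
	)
end

open_cpoint("demo_group")
```

A client in the group would then connect with ARCAN_CONNPATH=demo_group set in its environment; dropping the rearm call is how you rate-limit or shut the door entirely.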

User-definable preset roles

Time to add something of our own. An often-praised ability of X is its modularity; how you can mix and match things to your heart's content. The technical downside is that it adds quite a bit of complexity in pretty much every layer, with some intractable and serious performance and security tradeoffs.

Other systems have opted for a more rigid approach. Wayland, for instance, ties different surface types and interaction schemes together through the concept of a “sHell”. Roughly put, you write an XML specification of your new shell protocol that encapsulates the surface types you want, like “popup” or “statusbar”, and explains how they are supposed to behave and what you should and should not do with them. Then, you run that spec through a generator, adjust your compositor to implement the server side of this spec, and then develop clients that each implement the client side of this new spec. There are a number of sharp edges to this approach that we'll save for later, though it is an interesting model for comparison.

Arcan takes a middle ground: each “segment” (container for buffers, event delivery, etc.) has a preset/locked-down type model (e.g. popup, titlebar, …) but delegates the decision as to how these are to be used, presented or rejected to a user-controlled set of scripts (‘Prio’, ‘Durden’, ‘something-you-wrote’) running inside a scripting environment. This is complemented by the notion of script-defined connection points, which were covered at the end of the previous section.

This approach still decouples presentation and logic from ‘the server’, while maintaining the ‘window manager’ flexibility from X, but without the cost and burden of exposing the raw and privileged capabilities of the server over the same protocol that normal clients are supposed to use.

A direct consequence of this design is that you can quickly designate a connection point to fulfil some role tied to your window management scheme, and apply a different set of rules for drawing, input and so on, depending on the role and segment types. This can be achieved without modifying the clients, the underlying communication protocol, or rebuilding/restarting the server.

At the end of the video, you can see how I first launch a video clip normally, and how it appears as a tab. Then, I specify a designated connection point, ‘background’, and relaunch the video clip. Now its contents are routed to the wallpaper rather than being treated as a new client.

This means that you can split things up, with e.g. a single connection point for a statusbar, launch-bar, HUD or similar desktop elements, and enforce specific behaviours like a fixed screen position, filtered input and so on. You can even go to extremes, like a connection point for something like a screen recorder that only gets access to non-sensitive screen contents – or “lie” when you get an unexpected connection and redirect the output of something nastier to it.
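As a sketch of the ‘background’ example from the video, a role-specific connection point simply applies a different policy in its handler. target_alloc, resize_image and order_image are real API calls, but the exact policy shown here is illustrative:

```lua
-- hedged sketch: anything connecting via 'background' becomes the wallpaper,
-- stretched to the output and ordered below every other window
target_alloc("background",
	function(source, status)
		if status.kind == "resized" then
			resize_image(source, VRESW, VRESH)
			order_image(source, 0) -- draw below everything else
			show_image(source)
		end
		-- note: no input is ever routed to this client
	end
)
```

A client is launched into the role purely through its environment, e.g. ARCAN_CONNPATH=background ./video_player (illustrative), with no change to the client itself.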

Closing Remarks

As the purists will no doubt point out, these three key features do not really cover a big raison d'être of Rio itself – exposing window management, buffer access and drawing control in the spirit of an ‘everything is a file’ API, and through that, multiplexing/sharing the UI connection. That is indeed correct, and part of the reason why this is not supported right now is that the previous post on ‘Chasing the dream of a terminal-free CLI’ and this one stand to merge paths in another post in the future, when the missing pieces of code have all been found.

Update: These pieces have been added in an improved form as ‘HANDOVER’ connections, where one client negotiates a connection over an existing one. The ‘WM as a file-system’ part has been integrated as part of Arcan 0.5.5, Durden 0.5. Draw-to-spawn is exposed as ‘global/settings/wspaces/float/method=draw’ and the group connections as ‘global/open/terminal_group’.

As things stand, Prio is obviously not a ‘mature’ project, and outside the odd feature now and then, I will not give it much more attention, but rather merge some of it into the floating management mode in Durden. When the open-source Raspberry Pi graphics drivers mature somewhat, or I get around to writing a software rendering backend for Arcan, I'll likely return to Prio and make sure it is ‘desktop-complete’ and performant enough to be used efficiently on that class of devices.


Chasing the dream of a terminal-free CLI

TLDR; Crazy person is trying to do something about how nightmarishly bad the interface between you and the command line/command-line utilities really is, i.e. getting rid of terminal protocols.

To start with: why? If you have never had the pleasant experience of writing a terminal emulator, you might not have run into the details of how these actually work. As a user, maybe you have experienced a spurious ‘cat’ command with the wrong input file turning your screen into unreadable garbage, or tried to resize a terminal window and wondered why it is so slow to react or temporarily drawn with severe artifacts – but discarded such observations as mere annoyances. In fact, there are quite a lot of problems hiding under the surface that make the command-line interface in unix-inspired systems less than optimal, even unpleasant, to use. Instead of going into those details here, I'll refer to this thread on reddit to save some time. Incidentally, this thread also happened to remind me that I did the grunt work for this already some two months ago; then forgot to tell anyone.

Context/Recap (feel free to skip): As part of the Arcan 0.5.2 work, the terminal frameserver was refactored and split into two parts: the terminal frameserver and arcan-shmif-tui. TUI here does not just refer to the common acronym ‘Text-based User Interface’ but also the last part of the onomatopoetic form of the act of spitting saliva (rrrrrrpppptui, excluding any adjustments for locale). Frameservers in Arcan are a set of partially-trusted clients where each distinct one fulfills a specific role (archetype). There's one for encoding/translating, another for decoding and so on. The idea was to have the engine be able to outsource tasks that are crash-prone or easy targets for reliable exploitation. One of the many goals of this design is to remove all media parsers from the main Arcan process, but also to allow these to be interchangeable (swap the default set out for ones that fit your particular needs) and to act as services for other processes in a microkernel-like architecture, in order to reduce the system-wide spread of dangerous and irresponsible parser use.

Back to business: The terminal frameserver was heavily based on David Herrmann’s libtsm, adding a handful of additional escape codes and integrating with font transmission/tuning/rendering, clipboard management, and the myriad of other features hidden inside the internal shmif- API.

Thankfully enough, libtsm neatly separates between the insanely convoluted state machine required in order to act as a good little terminal emulator, and a virtual display which performs the whole cells/lines/metadata management that is sometimes rendered and queried.

Keeping the virtual display part around, the formula thus becomes:

TUI = [shiny new API] + [shmif- integration || XXX] + [tsm(display)] + [???]

This provides a building block for command-line driven applications that are not bound to the restrictions of the normal [terminal-emulator + sHell + application] threesome. Think of it as removing the middle man between the command line and the display server, but without being burdened by a full GUI toolkit. The core concept – text output, keyboard input – is maintained, and lines of uniform ‘cells’ in a tight grid are still kept. You get the benefits of integrating with the outer system (window manager, …) when it comes to clipboard, multiple windows and so on, but not the complexity from toolkits or from implementing VT100 or other terminal protocols. It will also be possible to ‘hand the connection over’ to any program the shell would be running, providing a Plan9-like multiplexed CLI style.
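For a feel of what that virtual-display building block amounts to, here is a toy, self-contained sketch of a tight grid of uniform cells. Illustrative only: this is not the libtsm or arcan-tui data layout, and all names here are made up.

```c
#include <stdlib.h>

/* One uniform cell: a codepoint plus some attribute bytes.
 * The real thing carries more state (colors, flags, age). */
struct cell { unsigned ch; unsigned char fg, bg, attr; };

struct grid {
    size_t rows, cols;
    struct cell* cells;
};

struct grid* grid_create(size_t rows, size_t cols)
{
    struct grid* g = malloc(sizeof *g);
    g->rows = rows;
    g->cols = cols;
    g->cells = calloc(rows * cols, sizeof(struct cell));
    return g;
}

/* writes are bounds-checked and silently dropped if outside */
void grid_write(struct grid* g, size_t row, size_t col, unsigned ch)
{
    if (row < g->rows && col < g->cols)
        g->cells[row * g->cols + col].ch = ch;
}

/* out-of-bounds reads yield an 'empty' cell (0) */
unsigned grid_read(struct grid* g, size_t row, size_t col)
{
    return (row < g->rows && col < g->cols)
        ? g->cells[row * g->cols + col].ch : 0;
}

void grid_free(struct grid* g)
{
    free(g->cells);
    free(g);
}
```

The point of keeping this model server-side is that the renderer can draw and query cells directly, instead of re-parsing an in-band escape-code stream.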

Wait, I hear you ask, won’t this actually introduce a gentle-push-off-a-cliff dependency to Arcan-as-a-Display-Server, break backwards compatibility with just about everything and in the process undermine-the-holiness-of-our-lord-and-saviour-UNIX-and-how-dare-you-I-love-my tscreenmux-bandaid-f-you-very-much-Hitler?

Well, that’s something to avoid – don’t be evil and all that. That’s why there’s a “|| XXX” and a [???] in the formula above. The XXX can be substituted for some other rendering/display system integration API and the [???] to some ‘to be written’ backend that can output to the curses/terminal-esc-termcap war zone. It won’t be that hard and it’s not that many lines of code. It is much easier to multiplex metadata and data into holy output-stream matrimony again, than it would ever be to safely divorce the two.

To step back and be a bit critical of the whole concept (but not really) – “is this actually something we need? We have a ton of awesome sparkling crisp UI toolkits and even that sweet Electr…”. What some of us (not necessarily all of us or even the majority of us) need is to get the f’ away from a world where everything needs GPU access and half a gigabyte of dependencies to download and draw a picture. That said, clinging to a world where you have to think “hmm, was it -v as in ls, or as in pkill” may be a bit too conservative.

Anyhow, the current state is more than usable, although the API is still in the middle of its first iteration. Hey look, a header file! The terminal frameserver has been rewritten to use this API, so the features that were previously present in there (dynamic font switching, multi-DPI aware rendering, clipboard, unicode, and so on) are exposed. The code is kept as part of the main Arcan git for the time being, but when things have stabilized, it will be split into a project of its own.

Small gains:

  1. Integration with window management: making life easier for keyboard cowboys and screen-readers alike, you no longer have to interpret a series of +-=/_+ (if the glyphs are even in the current font) as a popup or separator.
  2. Reliably copying and pasting things.
  3. Saving / Restoring State, Switching Locale, Modifying Environment – at runtime.
  4. Common interface for Open/Save/Pipe between otherwise isolated sessions.
  5. Audio Playback without /dev/dsp or worse.
  6. Drawing custom rasters into cells without Sixel.
  7. Emoji, free of charge – though debatable if that’s really a good thing.
  8. UI manageable alerts, not blinky-flashy-beepy-beepy.

(the list is really much longer but I’ll just stop here)

Bigger hopes:


  1. Empower/Inspire people to find a new terminal-emulator liberated, efficient model for a future-ready shell, like NOTTY is doing.
  2. That the emergent patterns may one day be cleaned up and spit-polished into a protocol — pure, free and not relying on any code that I’ve touched.
  3. Peace on earth and good will towards men.
  4. That children never again have to lose their innocence by learning what setterm -reset is, or why it is, at times, needed.

Personal plans:

  1. Migrate Senseye translators to this.
  2. One day make a text/command-line oriented UI to gdb/lldb that doesn’t suck.
  3. Slowly forget what ESC, OSC, DCS, CSI etc. actually did.
  4. Make bindings for the more sensible of programming languages.
  5. Tricking someone else into maintaining it.
  6. TrueColor 60fps version of sl.

Intrigued?  Excited? Aroused? Care to Join in?

Bring out the KY, pour a stiff drink, swing by IRC, fork on Github – and have a Happy New Year.


Dating my X

I spread the coding effort needed for protocols and 3rd party software compatibility out over longer periods of time because the underlying work is mundane, tedious and very, very repetitive. The QEmu backend is by far the more interesting and potent one – in terms of which Arcan capabilities can be bridged – but it is also more experimental with frequent failures; it’s not for everyone.

While I was working on the Wayland Server parts, it became clear to me that there are quite a few technical details involved which makes the balance between time spent, progression and possible gains quite unfavorable – though I won’t elaborate on that now. (There’s a big page on the wiki tracking status, limitations and my own, possibly flawed, notes and observations)

Therefore, I came to the conclusion that I needed (for the time being) another model and feature-set for compatibility with X, than what is currently offered by XWayland.

Gulp, that means I have to deal with the Xorg codebase, hmm what to do.

Digging around in there, I found one dusty part that felt out of place, but in a somewhat good way: ‘Kdrive’. At first glance, this seemed like it would lessen some of the boilerplate coding needed to stitch together a working minimal Xserver, compared to a full DDX implementation.

Added bonus: less Xorg exposure to rinse off in the shower later (however, it still requires a prescription shampoo, body scrub and medevac team on standby).

Results: Github:XArcan

Before going into more details and technical jibberjabber, here’s a demo video of it running in some weird window manager, along with early signs of Wayland life.

(No, the Arcan scripts for this particular window manager are not public, yet).

The biggest motivation hurdle was, as it almost always is, digging through autotools-hell and patching myself into the build system. At least it wasn’t a custom configure shell script (QEmu) or both automake and cmake (SDL2).

Desired features:

  • Containment – I didn’t want to have a 1:1 ratio between an X client window and a logical window in the Arcan scripts I was using for window management (XWayland model). I would much rather imitate a dumb ‘display’ confined to one logical window in Arcan. That approach blends more easily with both the tiling window management scheme and the one used in the video.
  • Compartmentation – To be able to spin up multiple Xservers and control which clients belong to which group in order to separate between privileges and to tag with visual privilege-level markers, so that I know which ones currently get to snoop on my keyboard input and therefore should get the ‘special’ credit card numbers, gmail accounts and phone-numbers. My honeypots, they hunger.
  • Clipboard – The clipboard model in Arcan is quite different from anything else, and is practically similar to how screen sharing is implemented. The model does allow for opt-in bidirectional global clipboard sharing and the Durden set of scripts will get a feature that can be toggled to set a client as global clipboard monitor and auto-promotion of new clipboard entries to global state. This should be able to bridge old xsel- scripts and similar tools.
  • Gamma Controls – There is bidirectional gamma table synchronisation between Arcan and its clients, though no scripts around that actually make use of them (that I know of). In Durden, this will be added as an advanced client video toggle to allow it to act as a gamma controller for the monitor it is currently bound to. When activated on an Xarcan window, things like redshift-xrandr should start to work.
  • Retain Input Tools – (your hotkey manager) The problem is comparable to gamma and clipboard, though this might take some more aggressive patches to the Xserver in order to find the right hooks. The Input- multicast group and global receiver feature hidden in Durden can be used to this effect, but something better is probably needed.
  • Controlled Screen Recording – With the way output segments work in the arcan-shmif API, I can extract and manipulate the subset of data sources that are being forwarded to an external ‘screen’ recorder. There is fundamentally no difference between a video camera, youtube video feed, or periodic screen snapshots in this regard even if they potentially live in different colour spaces. It seems possible to map a received output segment to the hooks used by X clients to record screen content, though you don’t reach 4k@60fps this way.

I also, of course, need some controls to be able to configure the compartmentation to decide if the very very scary GPU access should be allowed to an untrusted client or not.

Status and Limitations:

I haven’t spent that many hours on it (roughly a 1:1 ratio between arcan-wayland and xarcan), but progression is quite decent – and it’s definitely usable.

  • Containment / Compartmentation – there by design
  • Gamma Controls – not yet
  • Clipboard – soon, the X server does not provide easy access to selection buffers etc. Need to fork/popen into separate clipboard process.
  • Input Tools – injection: not yet (internationalization input is doable through some clipboard hacks), broadcast: yes

Some other limitations:

  • You really want to run a normal window manager with the X server, though I consider that a feature. For the other use cases, there will eventually be XWayland support too.
  • Glamor and GLX are working in a primitive state, there will be glitches.
  • 1 Display:1 Screen (so no stretched multiscreen) – spin up more servers on more displays, if needed. This constraint makes synchronisation and performance tricks easier and the codebase less painful.
  • It’s still X, synchronisation between WM, Xorg, Displays and clients will be bad forever.
  • No way of reliably pairing audio source to a window, so something more hack:y is needed for that. Got PA in the sniper scope though, looks like he’s grazing at the moment – filthy beast.
  • Keyboard Layout management synchronisation cannot really be fixed (I’m not building a dynamic translator between the internal keyboard layout state and XKB unless I restock with considerable amounts of alcohol and got a good suicide hotline on speed dial).

I also got a crazy idea or two in the process that’ll showcase some obscure Durden features, but that’s for another time.


Arcan “Monthly”, September Edition

Revising the approach to dissemination slightly, we will try out having a monthly (or bi-monthly, if there are not enough relevant changes for a monthly one) update on the project and sub-projects.

For this round, there’s a new tagged Arcan (i.e. the Display Server) version (0.5.1) and a new tagged Durden (i.e. the example “Desktop Environment”) version (0.2). Although some new features can’t be recorded with the setup I have here, the following demo video covers some of the major changes:

I did not have the opportunity to record voice overs this time around, but here are the rough notes on what’s happening.

1: Autolayouter

The autolayouter is an example of a complex drop-in tool script that adds additional optional features to Durden. It can be activated per workspace and takes control over the tiling layout mode, with the idea of removing the need for manual resizing/reassignment etc. It divides the screen into three distinct columns, with a configurable ratio between the focus area in the middle and the two side columns. New windows are spawned defocused in a column, spaced and sized evenly, and you either click the window or use the new target/window/swap menu path to swap with the focus area.

It can operate in two different modes, non-scaled and scaled. The non-scaled version acts like any normal tiling resize. The scaled version ‘lies’ to all the clients, saying that they have the properties of the focus area. This means the side- columns get ‘live previews’ that can be swapped instantly without any resize negotiation taking place, reducing the amount of costly resize operations.
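The ratio arithmetic itself is simple. As a minimal sketch (hypothetical helper, not Durden's actual code; the rounding policy here is chosen arbitrarily):

```c
/* Split a screen of width w into [side | focus | side] columns,
 * where `ratio` (0..1) is the fraction given to the middle focus
 * column and the remainder is shared evenly by the two sides. */
struct columns { int side_l, focus, side_r; };

struct columns split_columns(int w, float ratio)
{
    struct columns c;
    c.focus = (int)(w * ratio);
    c.side_l = (w - c.focus) / 2;
    c.side_r = w - c.focus - c.side_l; /* absorb rounding error */
    return c;
}
```

The scaled mode then reports the focus-area dimensions to every client regardless of which column it sits in, so a swap is just a texture-coordinate change rather than a renegotiation.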

You also see a ‘quake style’ drop down terminal being used. This is another drop-in tool script best bound to a keybinding. Its primary use is when you need a persistent terminal with a more restricted input path (keybindings etc. are actually disabled and there’s no activated scripting path to inject input) that works outside the normal desktop environment. In some ways safer than having a sudo terminal around somewhere…

2: Model Window

This is another example of a drop-in tool script, ported from the old AWB demo video (the amiga desktop meets BeOS demo from ~2013). What it does is simply load a 3d model, bind it to a window and let you map the contents of another window to a display part of the 3d model.

There’s clearly not much work put into the actual rendering here, and the model format itself is dated and not particularly well thought out, but it serves to illustrate a few codepaths that are the prerequisite for more serious 3D and VR related user interfaces – offscreen render-to-texture of a user-controlled view- and perspective- transform with content from a third party process, with working I/O routing from the model space back to the third party process.

3: Region-OCR to Clipboard

This is an addition to the encode frameserver (assuming the tesseract libraries are present) and re-uses the same code paths as display region monitor, record and share. What happens is that the selected region gets snapshotted and sent as a new input segment to the encode frameserver, that runs it through the OCR engine and puts any results back as a clipboard- style message.

4: Display Server Crash Recovery

We can already recover from errors in the scripts by having fallback applications that adopt external connections and continue from where they left off. A crash in the arcan process itself would still mean sessions were lost.

The new addition is that if the connection is terminated due to a parent process crash, external connections keep their state and try to migrate to a new connection point. This can be the same one they used, or a different one. Thus, this feature is an important part in allowing connections to switch display servers in order to migrate between local and networked operation, or as a means of load balancing.

5: Path- Activated Cheatsheets

The menu path activated widgets attached to the global and target menu screens were already in place in the last version, but as a primer to the new feature, we’ll show them again quickly. The idea is to have pluggable, but optional, dynamic information or configuration tools integrated in the normal workflow.

What is new this time is the support for target window identity activation. Any external process has a fixed archetype, a static identifier, a dynamic identifier and a user-definable tag. The dynamic identifier was previously just used to update titlebar text, but can now be used as an activation path for a widget.

To exemplify this, a cheatsheet widget was created that shows a different cheatsheet based on target identity. The actual sheets are simply text files with a regex- on the first line and empty lines between groups. The widget is set to activate on the root- level of the target menu.

The normal OSC- command for updating window title is used to update the target identity that is used as a selector for the sheet. Vim can be set to update with the filename of the current file and the shell can be set up to change the identity to the last executed command, as shown in the video when triggering the lldb cheat sheet.
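The title-update sequence referred to is the common OSC ‘set window title’ escape (ESC ] 2 ; text BEL in one of its standard variants). A small sketch of formatting it – the helper name is invented for illustration:

```c
#include <stdio.h>

/* Format the OSC 'set window title' sequence into buf.
 * A shell prompt or editor emitting this to its tty is what
 * updates the dynamic identity the cheatsheet widget selects on.
 * Returns the number of characters written (snprintf semantics). */
int format_identity(char* buf, size_t n, const char* ident)
{
    return snprintf(buf, n, "\x1b]2;%s\x07", ident);
}
```

For example, a shell hook that runs this with the last executed command is enough to make the lldb sheet in the video pop up.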

6. Connection- and Spawn- Rate Limiting

This is another safety feature to let you recover from the possible Denial-Of-Service a ‘connection bomb’ or ‘subwindow-spawn-bomb’ can do to your desktop session. In short, it’s a way to recover from something bad like:

while true; do connect_terminal & done

which has a tendency to crash, 100% live-lock or just stall some desktop environments. Here we add the option to cap the number of external connections at an upper limit, or to only allow a certain number of connections over a specified time slice.
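The time-slice variant can be pictured as a sliding-window limiter. A rough self-contained sketch (illustrative only, not Arcan's actual bookkeeping; names and bounds are made up):

```c
#include <stddef.h>

#define MAX_LIMIT 64

struct rlimiter {
    double stamps[MAX_LIMIT]; /* times of accepted connections */
    size_t count;             /* number of stamps still in window */
    size_t limit;             /* max connections per slice */
    double slice;             /* window length in seconds */
};

/* Return non-zero if a connection arriving at time `now`
 * (seconds, monotonic) should be accepted. */
int rlimit_allow(struct rlimiter* r, double now)
{
    /* expire stamps that fell out of the time slice */
    size_t keep = 0;
    for (size_t i = 0; i < r->count; i++)
        if (now - r->stamps[i] < r->slice)
            r->stamps[keep++] = r->stamps[i];
    r->count = keep;

    if (r->count >= r->limit)
        return 0; /* over budget: reject (or defer) the connect */

    r->stamps[r->count++] = now;
    return 1;
}
```

With a limit of, say, a handful of connections per second, the `while true; do … & done` bomb above degrades into a trickle the window manager can shrug off.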

7. Dedicated Fullscreen

This feature is still slightly immature and looks like the normal fullscreen but with a few caveats. One is that we circumvent normal composition, so post processing effects, shaders etc. stop working.

The benefit is that we reduce the amount of bandwidth required. The more important part is what this feature will be used for in the near future, and that is to prioritize bandwidth, latency and throughput for a specified target.

 8. QEmu/SDL1.2/SDL2

As part of slowly starting to allow 3rd party producers/consumers, there is now an Arcan QEmu display driver (maintained in a separate GIT) that’s at the point where single-display video and keyboard / mouse input is working.

The hacky ‘SDL1.2’ preload library has been updated somewhat to work better on systems with no X server available (and there’s an xlib- preload library to work around some parasitic dependencies many have on glX related functions, but it’s more a cute thing than a serious feature).

There is also an SDL2 driver (maintained in a separate GIT) that supports Audio/Video/Input right now, but with quite a lot of stability work and quirk-features (clipboard, file DnD, multi-window management) still missing.

Condensed Changelog:

Arcan – 0.5.1 Tagged

In addition to the normal round of bug fixes, this version introduces the following major changes:

  • Encode frameserver: OCR support added (if built with tesseract support)
  • Free/DragonflyBSD input layer [Experimental] : If the stars align, your hardware combination works and you have a very recent version of Free- or Dragonfly- BSD (10.3+, 4.4+), it should now be possible to run durden etc. using the egl-dri backend from the console. Some notes on setup and use can be found in the wiki as there are a few caveats to sort out.
  • Terminal: added support for some mouse protocols, OSC title command, bracket paste and individual palette overrides.
  • Shmif [Experimental] : Migration support – A shmif- connection can now migrate to a different connection point or server based on an external request or a monitored event (connection dropped due to server crash). This complements the previous safety feature with appl- adoption on Lua-VM script error. The effect is that external connections can transparently reconnect or migrate to another server, either upon request or with external connection adoption on a dropped connection in the event of a server crash. When this is combined with an upcoming networking proxy, it will also be used for re-attachable network transparency.
  • Evdev input: (multi) touch- fixes
  • Shmif- ext : Shmif now builds two libraries (if your build configuration enables ARCAN_LWA with the egl-dri VIDEO_PLATFORM), where the second library contains the helper code that was previously part of the main platform used for setup for accelerated buffer passing. This will swallow some of the text-based UI code from the terminal. The patched SDL2 build mentioned above requires this lib, and arcan_lwa and game frameserver (with 3D enabled) have been refactored to use it.

Lua API Changes:

  • target_displayhint : added PRIMARY flag to specify synch-group membership
  • rendertarget_forceupdate : can now change the update- rate after creation
  • new function: rendertarget_vids – use to enumerate primary attached vids
  • set_context_attachment : can now be used to query default attachment
  • system_collapse : added optional argument to disable frameserver-vid adoption
  • new function: target_devicehint – (experimental) can be used to force connection migration, send render-node descriptor or inform of lost GPU access
  • new function: video_displaygamma – get or set the gamma ramps for a display
  • target_seek : added argument to specify seek domain


Durden – 0.2 Tagged

  • New Tools: 3D Model Viewer, Autolayouter, Drop-Down Terminal
  • Dedicated fullscreen mode where a consumer is directly mapped to the output device without going through compositing. More engine work is needed for this to be minimal overhead/minimal latency though (part of 0.5.2 work).
  • Double-Tap meta1- or meta2- to toggle “raw” window input-lock / release.
  • Added display-region selection to clipboard OCR.
  • [Accessibility] Added support for sticky meta keys.
  • Consolidates most device profiles into the devmaps folder and its subfolders.
  • Added a ‘slotted grab’ that always forwards game-device input management to a separate window, meaning that you can have other windows focused and still play games.
  • Multiple- resize performance issues squashed.
  • Locked- input routing for mouse devices should work better now.
  • Basic trackpad/touch display/tablet input classifiers, see devmaps/touch.
  • Format-string controlled titlebar contents
  • External connection and window- rate limiting
  • Statusbar is now movable top/bottom and the default is top so that those trying things out using the SDL backend won’t be frightened when they are met with a black screen.
  • Target- identity triggered cheat sheets
  • Button release can now be bound to menu path


Senseye

The Senseye subproject is mainly undergoing refactoring (in a separate branch), changing all the UI code to use a subset of the Durden codebase, but with a somewhat more rigid window management model.

This UI refactoring along with Keystone based assembly code generation and live- injection will comprise the next release, although that is not a strong priority at the moment.

Upcoming Development

In addition to further refining the 3rd party compatibility targets, the following (bigger) changes are expected for the next (1-2) releases:

  • LED driver backend rework (led controllers, backlight, normal status LEDs and more advanced keyboards)
  • Text-to-Speech support
  • LWA bind subsegment to rendertarget
  • GPU(1) <-> GPU(2) migration, Multi-GPU support
  • Vulkan Graphics Backend
  • On-Drag bindable Mouse cursor regions
  • More UI tools: On-Screen Keyboard, Dock, Desktop Icons

Some Questions & Answers

A few days have gone by since the project was presented, and while I am not very active on the forums and other places where the project has been discussed, I have seen some questions and received some directed ones that I think should be replied to in public view.

1. If I would build and install Arcan, what can I do with it?
To just try things out and play with it, you can for starters build it with SDL as the video platform and run it from X or OSX. It won’t be as fast or have as many features as a more native one like egl-dri, but enough to try it out and play around. A few brave souls have started packaging, so that will also help soon. The main application you want to try with it is probably the desktop environment, durden. With it, you have access to the terminal emulator, libretro- cores for games, a video player and a VNC client. There is a work-in-progress QEmu integration git and soon an SDL-2 backend. If you are adventurous, it is also possible to build with -DDISABLE_HIJACK=OFF and get a libahijack_sdl12.so. Run with LD_PRELOAD=/path/to/libahijack_sdl12.so /my/sdl1.2/program and you should be able to run many (most?) SDL-1.2 based games and applications.

2. Will this replace X.org?
That depends on your needs. For me, it replaced X quite a while ago; I can run my terminal sessions, connect to VNC, run my QEMU virtual machines natively, and the emulators I like to play around with all work thanks to libretro. The default video decoder does its job ‘poorly but ok enough’ for my desktop viewing, and my multi-monitor setup works better now than it has ever done in my 20+ years of trying to stand XFree86/X.org. For others, that’s not enough, so that might be reason to wait or simply stay away. It is not like you lack options.

3. How does this all relate to Wayland?
I tried to answer that in the presentation, but it was at the end and perhaps I did not express myself clearly. I intend to support Wayland both as a server and as a client. I’ve had a good look at the protocol (and Quartz, SurfaceFlinger, DWM, for that matter…), and there’s nothing a Wayland implementation needs that isn’t already in place – in terms of features – but the API design and the amount of ‘X’ behaviors Wayland would introduce means that it will be an optional thing. There is nothing in Wayland that I have any use for, but there are many things I need in terms of better integration with virtual machine guests, and the recent developments in QEmu 2.5/2.6 in regards to dma-buf/render-nodes are highly interesting, so it comes down to priorities or waiting for pull-requests 😉

4. Is the Lua scripting necessary?
No, it should take little more effort than removing a compilation unit and about 50 lines of code for the scripting interface to disappear and the engine to run as C only – but it is a lot more work telling it what to do, and with less support code for you to re-use. A lot of scripts in Durden, for instance, were written so that you could cut and paste them into other projects. That’s how Senseye will be made usable for people other than myself 🙂

The engine will get a library build version for such purposes further down the road, but right now there’s no guarantee to the stability of internal interfaces. The same applies to the shared memory interface, even though that already has a library form. I have a few unresolved problems that may require larger changes in these interfaces, and I want to be able to make those without considering how any change would affect other people.

5. Will this run faster / better with games?
I have no data to support such a claim, so that’s a maybe. A big point, however, is that you can (if you know your Lua, which isn’t very hard) have very good control over what “actually happens” in order to optimize for your needs. For gaming, that would be things like mapping the game output directly to the selected display, without the insanity of the game trying to understand resolution switching and whatever ‘fullscreen’ means. Another possibility would be switching to a simpler set of scripts or mode of operation that suspends and ignores windows that don’t contribute to what you want to do.

6. Is the database- application whitelisting necessary?
No, you can connect to the server using another set of primitives (ARCAN_CONNPATH=…), if the set of scripts you are using allows you to. This is what is meant by “non-authoritative” connection mode, and the database can be entirely :memory: if you don’t want any settings to be stored. The whitelisting will come into better use later, when you can establish your own “chain of trust”.

7. Is there a way to contribute? 

There are many ways, besides ‘spreading the word’ (and I could use a Vive ;-)). See the wiki page here: https://github.com/letoram/arcan/wiki/contrib

8. The ‘Amiga UI’ is not working?

That’s the reason it was marked as abandoned (and practically since end of 2013). It was just a thing I did to get a feel for how much code it would take to do something like ‘Amiga meets BeOS’ and find out some places where the API had gone wrong. Afterwards, I changed those parts but never updated the related scripts. That said, it is not a big effort to get it up and running again, so maybe…

9. Where does this fit in the Linux/BSD ecosystem?

Where do awk, sed and grep fit? Arcan is a versatile tool that you can use for many kinds of graphics processing, and the desktop case illustrated by Durden is just one. I use a minimal init and boot straight into Durden, using a handful of preset mount and networking settings that render current state and controls into small widgets. No service manager, display manager, boot animation, login manager or message passing IPC.

One of the many problems with interactive graphics in a ‘pipes and filters’-like, ‘user freedom UNIX-way’ model is that performance and latency break down. You are much more sensitive to those things thanks to the wonders of human cognition. I know some people still think in the ways of ‘a framebuffer with pixels’ but the days of Mode 13 are gone. The process now is highly asynchronous and triggered by events far more complicated than a VBLANK interrupt. The design behind Arcan comes about as close to ‘pipes and filters’ as I think I can get without becoming slow or esoteric.

10. Why is there no X support?
This is a big question and ties in with answer 3. A small part is the cost and pain of implementing such a complete mess, which would mean less time for more interesting things. This is a completely self-financed project, fueled mostly by dissent, cocktails and electronic music, with no strong commercial ambitions — all in the tradition of dumb idealism.

A bigger part in committing to a protocol, or saying ‘I should be compatible with- or replace- project XYZ’, is that you limit yourself to thinking in terms of how those projects work and how you should be better than them or outcompete them in some way, rather than in terms of ‘how can I do something interesting with this problem in a way that is different from how others have approached it’.

Collectively speaking, we don’t need yet another project or implementation that takes on X and if that already feeds your needs, why change? Some of us, however, need something different.

