Towards Secure System Graphics: Arcan and OpenBSD

Let me preface this by saying that this is a (very) long and medium-rare technical article about the security considerations and minutiae of porting (most of) the Arcan ecosystem to work under OpenBSD.¬†The main point of this article is not so much flirting with the OpenBSD crowd or adding further noise to software engineering topics, but to go through the special considerations that had to be taken, as notes to anyone else that decides to go down this overgrown and lonesome trail, or are curious about some less than obvious differences between how these things “work” on Linux vs. other parts of the world.

A disclaimer is also that most of this have been discovered by experimentation and combining bits and pieces scattered in everything from Xorg code to man pages, there may be smarter ways to solve some of the problems mentioned – this is just the best I could find within the time allotted. I’d be happy to be corrected, in patch/pull request form that is ūüėČ

Each section will start with a short rant-like explanation of how it works in Linux, and what the translation to OpenBSD involved or, in the cases that are still partly or fully missing, will require. The topics that will be covered this time are:


One of the many lofty goals behind Arcan has been to not only reduce system-wide desktop complexity, to push the envelope in terms of system graphics features, quality and performance Рbut also to enable experimentation with security sensitive workflows and interaction schemes, with the One Night in Rio: Vacation Photos from Plan9 article covering one such experiment. Working off the classic Confidentiality- Integrity- Availability- information security staple, the article on Crash-Resilient Wayland Compositing also showed the effort being put into the Availability aspect.

The bigger picture of how much this actually entails will be saved for a different article, but the running notes on the current state of things, as well as higher level experiments are kept in [Engine Security] and [Durden-Security/Safety].

Outside of attack surface reduction, exploit mitigations and safety features, there is a more mundane, yet important aspect of security – and that is of software quality itself. This is an area where it is easy to succumb to fad language “fanboyism” or to drift in the Software Homeopathy¬†direction of some bastardised form of code autism, such as counting the number of lines of code per function, indentation style, comment:code ratio and many other ridiculous metrics.

While there are many parts to reaching a lofty goal of supposed ‘quality software’, the one that is the implicit target of this post is the value of a portable codebase as a way of weening out bugs (where #ifdef hell or ‘stick it in a VM masked as a language runtime’ does not count).

The road to- and value of- a portable codebase is built on the ability to swap out system integration layers as a way of questioning the existing codebase about hidden assumptions and reliance on non-standard, non-portable and possibly non-robust behaviours and interfaces.

This does not always have to be on the OS level, but in other layers in the stack as well. For instance, in the domain that Arcan targets, the OpenGL implementation can be switched out for one that uses GLES, though their respective feature sets manifest many painful yet subtle differences.

Similarly, the accelerated buffer sharing mechanism can be swapped between DMA-Buf and EGLStreams. Low-level DRI and evdev based access for graphics and input can be swapped out for high-level LibSDL based. Engine builds are tested with multiple libc:s to avoid “glibc-rot” and so on. Some of these cases are not without friction, and even painful at times, but the net contribution of being able to swap out key culprits as a means of finding and resolving bugs have, in my experience, been a positive one.

Arcan has been ported to quite a few platforms by now Рeven if not all of them are publicly accessible; there are certain ecosystems (e.g. android, iOS) we have very little interest in publicly supporting in a normal way. The latest addition to the list of actually supported platforms is OpenBSD, added about a year after we had support for FreeBSD.  So, enough with the high-level banter, time to dive into the technical meat of the matter.

Graphics Device Access Differences

Recall that in the BSDs, most of the DRM+GBM+Mesa (DRI) stack (or as Ilja van Sprundel poignantly puts it – the shit sandwich) is mostly lifted straight from Linux, hence why some of the interfaces are decidedly “non-BSD” in their look and feel.

Device Nodes and drmMaster

A normal (e)udev setup provides device nodes on linux as /dev/dri/Card + RenderD. In OpenBSD, these nodes are located at /dev/drm, but merely changing the corresponding ‘open’ calls won’t yield much of a success, unless you are running as root. The reason is a piece of ugly known as drmMaster.

As with most things linux these days, if something previously had permissions – there is now likely both a permission, some hidden additional permission layer and some kind of service manager that shift things around if you ever get too comfortable. Something that once looked like a file actually has the characteristics and behaviour of a file system; file permissions and ownership gets “extended” with hidden attributes, capabilities, and so on.

This is the case here as well. A user may well have R/W access to the device node, but only a user with CAP_SYS_ROOT/ADMIN is allowed to become “drmMaster”, a singleton role that determines who is allowed to configure display outputs (modeset) or send data (scanout) to a display, and who that is allowed to only use the device node to access certain buffers and setup accelerated graphics. As a note, this will get more complicated with the ongoing addition of ‘leases’ – tokenised outsourced slicing of display outputs connectors.

In Linux, there are about 2.5 solutions to this problem. The trivial one is¬†weston-launch, have it installed suid and there you go, along with all of the problems of suid and hard coded paths. The second involves an unholy union between Logind, D-Bus, udev and the ‘session’ and ‘seat’ abstractions that ultimately leads back to who is actually on the active VT, as the TTY driver layer is never going away (insert facepalm picture here) .

The last .5 is the one we have been using up until recently, and involves taking advantage of a corner case in that you do not have to set drmMaster – if no-one else has tried to; buffer scanout and modeset actually work anyway, just ignore the failed call. Set normal user permissions on the card node and there we go. This has the risk that an untrusted process running as the same user can do the same and break display state, which leads us to a case of “avoid mixing trust domains within the same uid/gid”. If you really desire to run software that you don’t trust, well, give such miscreants another uid/gid that has permission on the ‘RenderD’ node for privilege separation and the problem would be solved. Let multi-user on desktop unix just die.

Alas, neither the Master-less scanout nor the render node feature currently exist (#ifdef Linux removed) on OpenBSD, leaving the no-option of running as root or the remaining one of doing what weston-launch did, but in a saner way. This means taking an improved cue from Xenocara. Make the binary suid, and the first thing you do, no argument parsing, no hidden __constructor__ nonsense, just setup a socketpair, fork and drop privileges. Implement a trivial open/close protocol over the socketpair, and have a whitelist of permitted devices. (Fixed in OpenBSD 6.6)

Since we need dynamic and reactive access to a number of other devices anyhow, this tactic works well enough for that as well. There is a caveat here in that the user running the thing can successfully send kill -KILL to the supposedly root process, breaking certain operations like VT switching (see the Missing section).


One of the absolutely murkiest corner in any user – kernel space interface has to be that of input devices: keyboards, trackpads, joysticks, mice and so on. This is one of the rare few areas where I can sit and skim Android documentation and silently whisper “if only” before going back to alternating between screaming into a pillow and tracing through an ever expanding tree of “works 100% – 95% of the time” translation tables, where each new branch manages to be just a tad bit dumber than the ones that came before it.

Starting in Linux land again, first open the documentation. Now, you can close it again – it won’t be useful – I just wanted you to get a glimpse of its horrors. The only interface that can really be used, is¬†evdev,¬†written in cursive just to indicate that it should be pronounced with a hiss. It is a black lump of code that will never turn into a diamond. Instead of being replaced, it gets an ever increasing pile of “helper” libraries (libevdev, libudev, libinput, …) that tries to hide how ugly this layer is. This much resembles a transition towards the Windows approach of ‘kernel interfaces as library code’ rather than comparably clean and observable system calls. The reason for this pattern is, I assume, that no one in their right mind wants to actually work on such a thing within the Linux kernel ecosystem, if it can at all be avoided.

For OpenBSD, the better source of information is probably the PDF (Input Handling in wscons and X) along with the related manpages (yes, there are man pages) on wscons, uhid, wsmouse, wskbd. For the most part, these are obvious and simple, with the caveat that most of the low level controls are left to other tools, or that they work as multiplexer abstract devices without any ‘stream identifier’ for demultiplexation. Then there’s the case of the keyboard layout format. The underlying “problem”, depending on your point of view, is the legacy that exist in both Linux and BSDs in that tradition has meant going from BIOS -> boot loader(s) -> “text” console -> (display manager) -> Xorg and we at least need the option of being able to boot straight into UI.


There are two sides to the hotplug coin: input devices and output devices. The normal developer facing Linux strategy uses udev, the source code of which is recommended reading for anyone still looking for more reasons to get away from linux systems programming. If you want to have extra fun, take a look at how udev distinguishes a keyboard from a joystick.

Udev is an abomination that has been on my blacklist for a number of years. It has all the hallmark traits of a parasitic dependency and is a hard one to ignore. For input devices, the choice was simply to go with inotify on a user provided folder (default, /dev/input), and let whatever automatic or manual process he has populating that folder with the device nodes he thinks that Arcan should try and use.

For output, one might be easily fooled to think that the card device node that is used for everything (almost, see the backlight section) that relate to displays like mode setting, buffer scanout and synch triggers, should also be used to detect when a display is added or removed to this card. Welcome to Linux. How about, instead, we send an event with data over a socket (Netlink) in a comically shitty format, to a daemon (udev) which maps to a path in a filesystem (sysfs¬†best pronounced as¬†Sisyphus) which gets scraped and correlated to a database and then broadcasted to a set of listeners over D-Bus. Oh goodie. One might be so bold to call this an odd design, given that all we need is a signal to trigger a rescan of the CRTCs (“Cathode Ray Tube Controllers” for a fun legacy word) as the related API basically forces us to do our own tracking of connector card set deltas anyhow.

In OpenBSD (>= 6.3) you add the file descriptor of the card device node to a kqueue (‚̧ԳŹ) set and wait for it to signal.


Backlight is typically reserved for the laptop use case where it is an important one, as not only can modern screens be murdering bright, but also bat away at your battery. This subsystem is actually quite involved in Arcan as it maps as yet another LED controller, though one that gets paired to ‘display added’ events. Hence why we can do things like in¬† this (youtube video).

The approach we used for Linux is lifted from libbacklight before it became a part of the katamari damacy¬†game that is¬†systemd. Again one might be tempted to think that the backlight control could somehow be tied to the display that it regulates the brightness of, but that is not how this works. Just like with hotplug, there’s scraping around in sysfs.

In OpenBSD, you send an ioctl to your wscons handle.


As touched upon in the section on graphics device access, OpenBSD maintains its own Xorg fork known as Xenocara in order to maintain their specific brand of privilege separation. Arcan meanwhile maintains its own Xarcan fork to get more flexibility and security options than what XWayland can offer, at the cost of legacy application “transparency” (though that will be introduced via the Wayland bridge at some point).

The only real difference here right now, in addition to the much superior DRI3(000) rendering and data passing model is not working (see the Missing section further below) – was adding translation tables to go from SDL1.2 keysyms (the unfortunate format that we have to work with, no one, especially not me, is without sin here).


A large part of the work behind Arcan has been to resection the entire desktop side of user space to permit much more fine grained privilege separation, particularly when it comes to parsers working on untrusted data – but also for server to client sharing modes. Thus we split out as much of the data transformations and specialised data providers as possible into their own processes – and the same goes for data parsing.

One of the reasons for this is, of course, that these are error prone tasks that should not be allowed to damage the rest of the system in any way, segmenting these out and monitoring for crashes is a decent passive way of finding new less than exciting vulnerabilities in FFmpeg, something that still happens much too often some ten years later.

The other goal is, of course, to be able to apply more precise sandboxing as one process gets assigned one task, like a good Mr. Meeseeks. An indirect beneficial side effect to this is that we get a context for experimenting with the different sandboxing experiences that the various OSes provide, such as the case of the supplementary tool ‘aloadimage‘ (a quite paranoid version of xloadimage).


The OpenBSD port of DRI currently lacks two important features – render nodes and DMA buffers. This means that we have no sane way of sending accelerated buffers between a client using accelerated graphics and Arcan. This breaks the nested ‘arcan-in-arcan’ mode with the arcan_lwa binary, it breaks wayland support and it breaks Xarcan supporting DRI3 as these all rely on that feature. (Added in OpenBSD 6.5)

While it is technically possibly to explicitly enable older style GPU sharing (durden exposes these as config/system/GPU delegation) where clients to be trusted authenticate via a challenge response scheme to the “drm master” then gets the keys to the kingdom, it is far from a recommended way as this puts the client on just about the same privilege terms as the display server itself. That being said, it is murderously foolish to think that a client who is allowed any form of raw GPU access these days doesn’t have multiple ways of privilege escalation waiting to be discovered; the order of priority is getting GPUs to do what they are told without crashing the system, then to account for clients that rely on unspecified or buggy behaviour and maybe to get them to perform under these conditions. Security is a distant dot somewhere on the horizon.

Last and incidentally, my least favourite part of the entire ecosystem, “libwayland-server” still lacks a working port – which comes as no surprise as this asynch race condition packed, faux-vtable loving, use-after free factory is littered with no-nos; the least of which being its pointless reliance on ‘epoll‘. Funny how they practically reimagined Microsoft COM and managed to make it worse.

Posted in Uncategorized | 2 Comments

Safespaces: An Open Source VR Desktop

In this post, I will go through the current stages of work on a 3D and (optionally) VR desktop for the Arcan display server. It is tentatively called safespaces (Github link) as an ironic remark on the ‘anything but safe’ state of what is waiting inside. For the impatient, here is a video of it being used (Youtube link):

The explanation of what is going on can be found in the ‘High Level Use’ section further below.

To help navigate the post, here are a few links to the individual sections:

Background and Motivation

One of the absolute main goals with the Arcan project as a whole is to explore different models for all the technical aspects that goes into how we interact with computers, and to provide the infrastructure for reducing the barrier to entry for such explorations.

The overall ambition is that it should be ‘patching- scripts’ levels of difficult to piece together or tune a complete desktop environment to fit your individual fancy, and that major parts of such scripts could – with little to no modification – be shared or reused across projects.

For this reason, I’ve deliberately tried to avoid repeating- or imitating- the designs that have been used by projects such as Windows, Android, Xorg, OS X and so on. The reason was not to question or challenge their technical or business related soundness as such; those merits are already fact. Instead, the reason was to find things that was not exactly “sound” business – hence why this has all been kept as a non-profit, self-financed thing.

During the last few months, I’ve started unlocking engine parts that were intended for building VR environments, and many of the design choices had this idea as part of the implicit requirements specification very early on. The biggest hurdle has been hardware quality and availability, an area where we are finally starting to reach a workable level – the Vive/PSVR- level of hardware is definitely “good enough” to start experimenting – yet terrible enough to not be taken seriously.

Arcan as a whole is in¬†a prime position to do this “right”, in part because of its possible role as both a display server, and as a streaming multimedia processor / aggregator. Some of the details of the TUI subproject and the SHMIF IPC subsystem also fit into the bigger puzzle, as their implicit effect is to push for thinking of clients as interactive, streams of different synchronised data types – rather than opaque pixmaps produced in monolithic ‘GUI toolkit thiefdoms’.

High level use and Demo

Starting with a shorter but slightly more crowded video from the same desktop session:

In the video, you see parts of the first window management scheme that I am trying out. Although the video is presented as monoscopic, the contents itself is stereoscopic – 3D encoded video looks 3D when viewed on a HMD.

The window management scheme is a riff on the tiling window manager, mostly due to the current lack of more sophisticated input devices that would benefit from something more refined. Actual input is keyboard and mouse in this round, though there are ongoing experiments with building a suitable glove.

“Windows”, or rather, models – since you have a number of different shapes and mappings to chose from – are grouped in cylindrical layers. The user is fixed at the centre and the input focused window is positioned at 12 o clock, with other sibling windows rotated to face the user (“billboarding”), scaled down and positioned around the layer geometry.

Each layer can be arranged at either a fixed distance for heads-up display components and infinite geometry like skyboxes; or they can be swapped back and forth – with optional opacity fade-offs, level of detail triggers and so on based on layer distance.

In each layer you can attach various primitive models, from cylinders and spheres to rectangles, flat or curved. Each model can be assigned an external source, an image and an optional activation connection point. Individual properties, such as stereoscopic mapping modes, scale, opacity and so on can also be set, and the models can swap places with each-other within the layer. Each layer can have a ‘layouter’ attached that automatically determines window positions and scale. The builtin default works like in this figure:


This shows a single layer with the user is at a fixed position, dead center. The window that has input focus has the center slot at 12’o clock, and the other windows are evenly laid out around a circle that match the set per-layer radius. If the focus window spawns subwindows or hierarchically bound subconnections, like when a terminal starts a graphics program, such windows gets positioned vertically.

The activation connection point is a unique name where the ARCAN_CONNPATH environment variable points. In the video you can see a designated ‘moviescreen’ appearing at the end when I redirect video there, and disappearing when the specific client disconnects.

A little twist is that safespaces was actually written as a tool (‘vrviewer’) for Durden¬†– even though it is also a full window manager in disguise. The reason why I went this path is for prototyping agility, taking advantage of the tons of existing code and features in Durden shortens the ‘edit, run, test’ cycle drastically. It also eliminates the headache of picking the right display for the right purpose and other low level time consuming details – and you can move your workflows back and forth between the 3D/VR state and the 2D one.

The downside is that there is quite some overhead running it nested like this since Arcan also needs to take the normal desktop management into account, and there is some interference in the refresh rates of the different displays I have hooked up.

Setup and Architecture

There is a tool in the Arcan codebase that you have to build and enable, vrbridge (arcan_vr). It has some weird constraints since we definitely don’t want to do the device management in-process – yet the sampled data has a tiny window of opportunity for use. It takes extended, privileged features in the SHMIF API (so the engine will have to launch and manage the program. Instructions for enabling are in the file). The current version supports devices via OpenHMD, but it is trivial to interface with other VR- related APIs.

The design of the vrbridge interface is such that it allows for selective plugging of a wide range of devices (gloves, haptic suits, eye trackers, …). The scripting layer activates the VR bridge and gets announcements of ‘limbs’ arriving or disappearing. A limb gets activated by the script mapping the limb to a 3D model, and the engine will take care of synchronising limb orientation and so on at both a monotonic rate (for collision detection and response) and when preparing new output frames.

The biggest problem right now – is interfacing with devices and the vast array of possible pluggable input devices that extend the VR setup capabilities. It doesn’t help that practically everyone tries to¬†push their own little lock-in happy frameworks and makes it hard (deliberately or by accident) to access the very few control primitives and sensor samples that are actually needed. As an added bonus, there is also “incentive” to merge with the next round of walled garden app-stores, because things weren’t bad enough as is.

Next Steps

This setup is still in its infancy, and the work so far has highlighted a few sharp corners that will need sanding, although the biggest eyesores (literally) – the quality of the 3D pipeline and the positional audio will have to wait for a while until the relevant stages of the longer Arcan roadmap has been reached.

Another part of that reason is due to the low level of system integration and high level portability requirements that Arcan needs to follow; we are really restricted as to the set of GPU features that can be relied upon and purposely restrictive when it comes to introducing dependencies.

There are two big low level targets for the near- future:

The first is improved asymmetric multi-GPU support. The challenge of this task scales drastically with what you actually do, where sending textured quads back and forth is trivial, and then it goes bad quick and nightmarish almost as fast. The two worst parts is effective load balancing and rewriting much of the synchronisation and storage management code to multithread better and get advantage out of adaptive synchronisation outputs (FreeSync).

The second is fleshing out the interaction with clients so that there is intelligent level-of-detail, and the ability for the client to output 3D representations rather than just rendered pixel buffers – think mesh icons, voxel buffers etc. for server-side rendering in order to get better immersion and seamless, gradual transition / handover between desktop use and dedicated client use.


Posted in Uncategorized | 7 Comments

Argumenting Client-Side Decorations

Apparently it is the season to chime in on opinions on client side decorations (CSDs) versus server side decorations (SSDs) in the context of Wayland. I normally ignore these kinds of quarrels but this is a case where I consider the dominating solution being close to the worst possible one. The two pieces of note, so far, is This post from camp Gnome and This from camp KDE.

For starters, Martin claims:

Nothing in Wayland enforces CSD. Wayland itself is as ignorant about this as X11.

Well, it is the compositor’s job to enforce it (i.e. do nothing really) and he doesn’t want to – it’s a case of bad engineering hygiene for any implementor of a protocol to skew interpretation this blatantly as it counteracts the point of having one. You don’t have to dig deep to find the intention behind the protocol – even the first commit to the project has:

Window management is largely pushed to the clients, they
draw their own decorations and move and resize themselves, typically implemented in a toolkit library.

This has been debated again and again, and if the ‘me-too’:ers would have taken even a cursory google glance at the mailing list rather than skewing things to fit their narrative (or just not doing the basic research because why would you), they would immediately run into Kristians recurring opinion about it, and he designed much of the protocol. If that does not align with your wants and needs, perhaps this is not the protocol you should use, much less dig up worse extensions where the client can simply say “no, I want to decorate” and the server have to comply or break protocol, increasing the development cost on both sides as even compliant clients would have to deal with the absence of the particular extension. It also doesn’t do anything about the related mouse cursor issue.

Furthermore, a ton of stuff in Wayland make very little sense without CSDs – particularly wl_subsurface and wl_region but also, to some extent, the xdg_shell parts about window state (maximized, movable, fullscreen, resizing, context menu).

Then we have a long standing topic from the #wayland IRC channel itself:

Please do not argue about server-side vs. client side decorations. It's settled and won't change.

That said, I also think that the agreed upon approach to CSDs is a technically inadequate, and more a counter reaction to the state in X than other possibilities.

Lets first look at what the decorations chatter is all about. In the screenshot below you can see the top bar, the shadow region and the window border. What is not visible yet relevant is the mouse cursor. All these are possibly client defined UI components.


The options to convey this segmentation in wayland is practically as a single unit (a surface) or a complex aggregate that is one of the biggest pains to actually implement, a tree of “subsurfaces”.

Reading through comments on the matter, most seem to focus on the value of the top bar and how much stuff you can shove into it to save vertical space on a floating window manager with widescreen monitors.

Had that been the only thing, I would have cared less, because well, the strategy that will be used to counter this in the various Arcan related projects is a riff on this:

I’ll automatically detect where the top bar is, crop it out and hide it inside a titlebar toggle in the server side defined titlebar where I have the stuff that I want to be able to do to the client. There is practically nothing GTK, or anyone else for that matter, can do to counter that. In durden, the /target/window/crop and /target/window/titlebar/impostor features allows for that, per window.

UPDATE: This has recently been added in its first, quite rough, inception:

Protocol wise, it would be nice if the bar that GTK/GNOME (and to be fair, Chrome, Firefox, …) prefer in their design language had been added as another possible surface role to xdg-surface so the segmentation could go a bit smoother and the border and shadow could all be left entirely to the compositor – but it is, all in all, a minor thing.

There are, however, other areas that I argue that the choice matter a lot more, so lets look at those.


If you look like projects such as QubesOS (see also: border color spoofing) and Prio, the border part of the decorations can be used to signal “compartment” or “security domain”. The server side annotates each client surface with its knowledge about the contents, domain and trust level that the user should associate with the particular client. This part of the equation is invisible to the client and the client has no way of removing it and substituting with its own.


With client side decorations there is, by definition, no contribution from the compositor side of the equation. This means that any client can proxy the look, feel and behaviour of any other.

Even if I have a trusted path of execution where the compositor spawns the specific binary (like Arcan can with a sandboxing chainloader and a policy database) keeping the chain of trust intact. Without a visual way to indicate this, there is no difference from a compromised process spawning and proxying an instance of the same binary as a man in the middle in order to snoop on keyboard and clipboard input and client output. The problem is not insurmountable, but other design language need to be invented and trained.


Not that there are many serious attempts to allow networking for single wayland clients at the moment, but for networked clients, normal window operations like move and drag resize now require a full round-trip and a buffer swap to indicate that the operation is possible since the mouse  cursor need to be changed to reflect the current valid state. Having such operations jitter in their responsiveness is really awkward to work with.

This applies locally as well should the client input processing starts to stall, and now the user is faced with the problem of having to wait in order to move a window, or adapt/discover a possible compositor defined override (meta key +drag).

That said, there are other parts of the protocol that has similar problems, particularly the client being responsible for maintaining the keyboard layout state machine thus to implement key-repeats.

Performance and Memory Consumption

The obvious part is that all buffers need to be larger than necessary in order to account for the shadow and border region. This implies that more data needs to be transferred to- and cached on- the GPU side. About the scarcest resource I have on my systems is memory on GPUs and when it thrashes, it hurts. You can do border and even really fancy shadows as shaders during composition at a much lower systemic cost.

Everything the client draws has to be transferred in some way, with the absolute fastest way being one well-aligned buffer that gets sent to the GPU with a handle (descriptor) being returned in kind. This handle gets forwarded to the compositor that maps it to a texture and uses this texture to draw. The crutch here is that there are many ways to draw primitives that use this surface, and the one you chose will have impact on how expensive it will be for the GPU.

We can see with some toolkits, notably GTK, that fade-in, fade-out effects on state changes, such as the creation of a new surface, or focus shifting also now needs to update the entire surface multiple times. While no big deal locally, when using zero-copy handles, this stuff will need to go over the network. This makes such clients inherently much less network friendly, when it makes much more sense to have such effects implemented on the remote (server!) side, but that needs control over the decorations.

The shadow region needs to be drawn with alpha blending enabled or it will be ugly. The vast majority of UI contents can be drawn without any kind of blending enabled, but with client side decorations the compositor is left with no choice. To combat this, there is the wl_region part of the wayland protocol. Basically, you annotate the regions of a surface that is to be ‘opaque” so the compositor can perform the lovely dance of slicing up drawing into degenerate quads and toggle the blend states on and off. My memory may be a bit fuzzy, but I do not recall any compositor that actually bothers with this.

The position here is that the shadow and border- part of CSDs increase the total amount of surface that has to be considered for all drawing, buffering and layout operations. Consequently, they increase the amount of data that has to be updated, transferred and synched. The main contents, furthermore, gets pushed to offsets where they don’t align with efficient transfers, making the most expensive operation we have (texture uploads) worse. They also mutate the contents of the buffer to have a mixed origin and type. Because of this, they also adversely affect compression and other quality parameters like filtering, mipmapping and antialiasing.

UI Consistency

This is the part many discussion threads on the matter seem to boil down to so I will not provide much annotation on it. The stance that gets repeated from GTK people is a version of “users do not care, Chrome and friends do this anyhow”.  Here you have GTK, QT, EFL, SDL and Weston in loving harmony. Everyone with their own take on colours, icons, buttons and mouse cursor. “Beautiful”.

Suggestions on solving these brings the palm that much closer to the face. Things like “oh but you can have a shared library that implements the rendering”. One should rightfully ask what the point of a protocol is if, at the signs of design flaws, is to immediately reach for a sideband implementation, defeating the point of having a protocol in the first part.

Client and Protocol Complexity

It is considerably more complex to write a conforming wayland client than it is to write an Xorg client. A simple “hello world” press a key and flip a color output is in the range of thousands of lines of code, and the problem of actually knowing how to draw client decorations is open ended, as you get no hints in regards to how the user wants things to look or behave.

This means that some clients just ignore it – to some conflict and chagrin – Wayland SDL and GLFW backends or clients like Retroarch do not, at the time of writing this, implement them at all, for instance. The inflammatory thread in this MPV issue also highlights the problem.

The “I need to draw decorations?” problem becomes worse when the contents the client wants to present is in a format that is hard or costly to draw decorations in because of their rendering pipeline or restrictions on colour formats that should be used, incidentally, the case for games and video playback. This is where the wl_subsurface protocol is needed, and it maps back into the performance perspective.

The tactic is that you split the surface into the “main” area with one buffer format, and draw the decorations in slices using some other. These can be infinitely many, impose a requirement to “wait” for subsurface updates to synch. They are a fun way of making virtually all compositors crash or deadlock if you know what you are doing.

For operations like drag- resize and transforms, these becomes painfully complicated to deal with as you may need to wait for all surfaces in the tree to be “ready” before you can use them again. They can easily consume around ~12 file descriptors (also recall: double buffered state) even if the client is being nice, and with a few tricks it can become much worse.

It is to no surprise that GTK, EFL and others agree on the current situation, as they have already done the work, thus this empowers them at the cost of everyone else. Pundits also typically chime in and say something like “everyone is using a toolkit anyway” to which I say, step out of your filter bubble and widen your sample set, there are plenty of “raw” clients if you know where to look. The natural follow up from the same crowd is something to the effect of “you shouldn’t write a wayland client manually yourself, use a toolkit” – which is nothing but a terrible excuse and undermining the point of agreeing on a protocol when the details gets masked inside library code, it means moving the complexity around, making it less visible, rather than actually reducing complexity.

Adding a dependency to a million-line codebase is a really weird way of making something “simple”.¬†

If the point of wayland is, in fact, the often claimed “make it simpler” and the suggested solution turns out demonstrable worse¬†than the dominating one in that regard, a band aid suggestion of ‘hide the complexity in a library’ is farcical at best.

Real simplicity is observable and permeates all components in a solution, and the reality of Wayland is anything but simple.

Posted in Uncategorized | 1 Comment

Arcan 0.5.4, Durden 0.4

‘Tis the season to be jolly and just about time for the definitely last release of the year for both Arcan and its related subproject, the Durden desktop environment.¬†Following the pattern from the last release post, lets go through some highlights and related videos, but first – a family photo:


From left to right, we have a little Raspberry¬† running the ‘prio’ WM using the broadcom binary blob drivers (so lacks some of the features needed to run durden), with arcan and terminals eating up all of 20MB of ram. The left Macbook running OSX with Arcan/Durden in fullscreen, retina resolution, of course. The Macbook on the right is running the same system on OpenBSD 6.2. The three-headed monkey behind them is a voidlinux setup with two instances, one on an intel GPU, the other on an AMD GPU. If only the android devices on the wall could be brought in on the fun as well…


For Arcan itself, there has been rather few ‘visible’ changes, but many more things underneath.

The Xorg arcan backend ‘Xarcan’ has been ported to OpenBSD and is now up and running there. While on the subject, the OpenBSD port has also received some improvements on the input front, with support for the wsmouse interface and the engine itself now uses the same privilege separation setup as Xenocara.

The VRbridge tool has been updated with basic OpenHMD support. In the video, you can see the vrtest script where a cube is mapped to the reported HMD orientation. A number of engine improvements have also been made for managing VR related rendering. As a PoC – here’s windowmaker, xterm and xeyes connected to an xarcan instance that renders to arcan_lwa and then via arcan with this script (gist).


The wayland protocol service has seen quite a few improvements to crash recovery, wl-shell and xdg-shell protocol compliance. It has also received a single-exec mode for compartmentation of wayland clients, and has some basic least-privileges via seccomp- filters. The full details of the crash recovery steps are kept in a separate article, Crash Resilient Wayland compositing.

The Terminal and TUI libraries have been extended to get a window ‘cloning’ feature and a related pasteboard mode. The clone feature acts as a passive ‘screenshot’ of the contents of the terminal at activation time, and the pasteboard mode reroutes ‘copy’ operations to be added to a copy window rather than being added to the clipboard directly.

Arcan and the special ‘lwa’ build have gotten their argument handling and initial setup extended to allow arbitrarily long chains of ‘pipes and filters’ like media processing, the full details are kept in the separate article, AWK for multimedia.

Yet another tool has joined the ranks, arcan-netproxy, though not really usable for anything but experimentation at the moment, but it will become the main development focus target for the 0.6.x versions. Its role and purpose is to provide ‘per- window’ like networking features, and later on, full network transparency.


To weigh up for the lack of visual flair to Arcan itself, plenty of highly visible changes has been made to the durden desktop environment.


Flair is a new tool script that provides a Compiz- like UI effects framework for adding visual flair to normal operations. Though only a few effects have been added so far, more are sure to come now that the tough parts have been finished. In the video you see not only faux-physics ‘clothy windows’ (spring grid, verlet integration, obviously superior to wobbly windows) but also an xsnow- like effect layer and a burn-on-window-destroy effect. Now, if we only could get an adaptation of¬†Realtime 2D Radiosity¬†and end the era of drop shadows…


Overview is a tool that acts as an ‘expose’ like workspace switcher to more quickly see what is happening on the other workspaces. Though it lacks something in the looks department, the bigger work effort was, like with the flair tool, getting the infrastructure in place for hooking and handling. Later iterations will bring back more interesting workspace switching looks.

Advanced Float

Previously, the floating window management mode was quite crude, and efforts have started to spice things up a bit. It is still not entirely competitive with advanced floating window managers yet, but the missing features should appear around the next version. This round of enhancements adds support for:

  • spawn control (e.g. draw to spawn shown in the video)
  • minimise targets (e.g. minimise to statusbar or desktop-icon)
  • customised “on-drag-enter/over” screen regions
  • titlebar buttons can now be set based on workspace mode
  • auto-rearranger (simple binpack for now)
  • grid- like position/sizing controls (like the compiz plugin)

Here you see the spawn control used, to position/size a number of windows, then auto-rearranging them.

Terminal Group- Mode

Taking a trick from the Rio/Plan9 article, it is now possible to spawn a terminal ‘group’ which act as its own distinct connection point. Graphical clients that connect via this point forcibly shares group (window slot and settings) with the terminal it spawned from, making hybrid text/graphics clients easier to work with in tiling modes. In the video, you see how the terminal switches to an image when aloadimage is run, switching back/forth between the parent slot and children, and back again when the window is ‘destroyed’.

Menu and Browser Improvements

The browser filtering now respects wildcards, but also lua patterns and specialised sort-order controls. In addition, most menu paths now show a description of what each entry do, and navigation also works with the mouse.

What’s Next?

The tedious 0.5.x- lower system graphics interfacing is sure to continue for a while longer, with the next round of changes focusing further on multi-vendor-multi-GPU drawing, swapping and load balancing now that most of the necessary configuration system rework has been dealt with.

Another worthwhile work target will be improving synchronisation strategies now that drivers start to support FreeSync, particularly letting the WM decide if and when some client should be prioritised (think games and latency) to complement the existing ‘direct-to-screen’ mode.

It is also about time to push the work on TUI further: finishing the Lua API bindings and related advanced features e.g. subwindows, content embedding, standalone version (kmscon: the return).

For Durden, the feature race will soon (but not yet) start to slow down and instead focus on raising overall quality and usability (polish, actual default profiles, internationalisation support). Before then, however, it is likely that some more advanced input devices (eye trackers, customised mouse- and touchscreen gestures, onscreen keyboard) will get a little bit of love, icon and drag/drop for float mode, and the model-viewer tool being extended to support redirecting to a VR HMD, allowing for 360-, 180- video playback.

Senseye has been neglected for a long time, with only minor experiments on how to refactor the UI into something better. After some pondering, the individual sensors will be changed into “normal” arcan applications, and all the analysis tools will become plugins to Durden rather than acting as a completely separate UI.

Detailed Changelog



  • VR support now covers the full path from bridge communicating metadata and limb discovery/loss/map/updates
  • (71939f) -0,-1 pipes and filters input setup added, covered in¬†AWK for Multimedia
  • format-string render functions extended with vid-subimage blit


  • New function: define_linktarget¬†– used to create an offscreen render pipeline that is tied to the pipeline of another rendertarget
  • New function: subsystem_reset – used to rebuild subsystems (video only for now) to allow live driver upgrades, active GPU switching and so on – without losing state
  • Updated function: camtag_model, change to allow forcing destination rendertarget
  • Updated function: image_tesselation, expose index access
  • Updated function: render_text, added evid,w,h and Evid,w,h,x1,y1,x2,y2
  • Updated function: random_surface, added additional noise function controls


  • Support for ligatures improved
  • Highlighting/Inverse/Full-Block cursor changed for better visibility
  • Added controls to copy the current window into a new (input label: COPY_WINDOW)
  • Copy Windows can be set to be the primary clipboard receiver (input label: SELECT_TOGGLE)


  • OpenBSD: added mouse support
  • Egl-Dri: swap-GPU slot added to db- based configuration
  • SDL2: improved keyboard and mouse support


  • VRbridge – initial support for OpenHMD
  • Xarcan – Ported to OpenBSD
  • Waybridge:
    • fixes to subsurface allocations
    • reworked wl-shell to use (most) of the xdg-shell event mapping
    • -egl-shm argument added (perform shm->dma_buf conversion in bridge to offload server)
    • single exec mode (arcan-wayland -exec /my/bin) for stronger separation between clients
    • add support for rebuilding client (crash recovery and migration)


Big Items:

  • Display region sharing now supports force-pushing into clients that can handle input segments.
  • target/video/advance/migrate,fallback – send a migrate request to a client, which may prompt a client to jump to a different connection point or display server instance.
  • shader subsystem – added a multi-pass effect format along with some initial effects (gaussian blur, CRT-lottes).
  • tools/advfloat – extended float layout mode capabilities:
    • spawn control (draw2spawn)
    • hide-to-statusbar
    • cursor-action-region (see tools/advfloat/cregion.lua for definition)
    • automatic relayouter
  • tools/overview – added a HUD- like workspace switcher
  • tools/flair – added a visual effects layers and some initial effects
  • terminal-group spawn-mode – allows a connection primitive to be generated per terminal and clients which connect via this group share the same logical window tree slot
  • Tui/terminal clients are now allowed to spawn additional tui subsegments.
  • File browser now expose wild-card matching (asterisk), Lua patterns (%%) and sort-order modification (typing % lists options).
  • retain some window and workspace properties across script errors, crashes and resets
  • menu navigation can now shows a helper description of the currently selected item
  • mode-sensitive titlebar icons – window titlebar icons can now be set to be activated only in certain modes


  • Destroying a window in fullscreen mode now returns the workspace¬† ¬†to the last known mode instead of forcing to tile
  • Double-tap input-lock automatically unlocks if the locked window¬† ¬†is closed
  • Double-tap input-lock without a selected windows is now a no-op
  • Float mode border drag sizing, cursorhint and positioning fixes
  • Float mode drag now respects statusbar- imposed boundaries
  • Float mode canvas-drag/resize option for self-decorated clients
  • Improved (less broken) handling for wayland popups and subsurfaces
  • Step-resize keybinding now aligns to window- cell size (terminals)
  • statusbar can now be sized/padded to percentage of display output (config/statusbar/borderpad)
  • statusbar specific configuration moved to (config/statusbar) from¬† ¬†(config/visual/bars/…)
  • statusbar number prefix on known workspaces and statusbar mode button can now be toggled on/off

That’s all for now – Happy New Year and See you in 2018.

Posted in Uncategorized | 3 Comments

Crash-Resilient Wayland Compositing

A commonly held misbelief about one of the possibly negative consequences with migrating from X11 to Wayland is that the system as a whole will become more brittle due to the merger of the display server and the window manager into a ‘compositor’. The argument goes that a bug in the higher level window management parts would kill both the “display server” part, and subsequently its clients, and this is well illustrated in¬†this issue¬†from camp GNOME. To quote <jadahl>s comment in the thread:

I know I have been talking about this before but I think eventually
we really want to do the compositing/UI split. Not now, since it's a
huge undertaking and will involve a massive amount of work, but
eventually it's the solution that I think we should work towards -
long term. The reason is not only because of stability, but for
responsiveness and other things as well.

I think for this bug, a WONTFIX is appropriate. There is no single
"fix" for stability to do here. We just need to hunt down as many
mutter/gnome-shell crashes as we can.

Fortunately, the protocol does not have any provisions that would make a more resilient design impossible –¬†although, to the best of my knowledge, Arcan is the only* implementation that can readily demonstrate this fully. This article will therefore go into details as to how that is achieved, and show just how much work and planning that takes.

The following illustration shows the layers in place, and the remainder of this article will go through them one at a time, divided as follows:

Crash Recovery Stages

The two dashed boxes indicate some form of strict code- memory- or symbol- namespace separation. To quickly navigate to each section, use the links below:

  1. Surviving Window Manager crashes
  2. Surviving Protocol implementation crashes
  3. Surviving Compositor level crashes
  4. Surviving Display System crashes

A combined video clip (for those that dislike the wordpress- embedded video approach) can be found here:

Important: The setup here uses two separate instances of Arcan for demonstration purposes. The left screen is running on GPU1, which will be used as a monitoring console for checking the right screen (GPU2) where all the demo stuff will happen. It has been set up so that CTRL-ALT-F1 act as an on/off toggle for ignoring input on the instance controlling the left screen, with CTRL-ALT-F2 doing the same for the instance controlling the right screen.

*¬†Notes: It is worthwhile to note that KDE Plasma and KWin do have a separation that works to the extent of stage 1. It could be argued that MIR and Enlightenment has something to this effect, with the latter described in¬†Wayland Recovery: A Journey Of Discovery¬†– but I disagree; re-pairing client server-side metadata by agreeing on unique identifiers that persist across sessions is but one small and quite optional part of the solution, there are much stronger things that be done. That said, as visible at the end of the video you indirectly see this tactic being used also. The underlying implementation detail is simply that each resource bundle (“segment”) allocated across shmif always gets assigned a unique identifier, unless one is provided upon connection, allowing the window manager to store that identifier combined with layout information. Durden, for instance, does that periodically after something relevant in the layout has changed. When clients reconnect via the recovery mechanism covered later, this identifier is provided upon registration and window layout can be restored.

1. Surviving Window Manager crashes

For this demo, I have added a ‘Crash WM’ menu command that tries to call a function which doesn’t exist (a fatal scripting error), which you can see the highlighted error message on the left screen (this_function_will_crash_the_wm()).

The individual steps are, roughly:

  1. VM script error handler -> longjmp back into a part of main().
  2. Drop all ‘non-externally connected’ resources.
  3. Reinitialise VM using new or previous set of scripts.
  4. Expose externally connected resources through special ‘adopt’ event handler.

The first part of the strategy here is separation between the core engine and the policy layer (window manager) by a scripting interpreter (the Lua language). Lua was chosen here due to its strong track record, the quality of the available JIT, the way the VM integrates with C code, the detailed control over dependencies, the small code and memory footprint and it being “mostly harmless, with few surprises” as a language, and a low “barrier to entry”. Blizzard, for instance, used Lua as the UI scripting layer in World of Warcraft for good reasons and to great effect.

Note: A more subtle detail that is currently being implemented, is that, due to the way the engine binds to the Lua layer, it is easy to make the bindings double as a privileged protocol, using the documentation format and preexisting documentation as an interface description language. By wrapping the Lua VM integration API with a line format and building the binding code twice, with one round as a normal scripted build and another as a ‘protocol build’, we will be able to fully replicate the X model with a separate Window Manager ‘client’ process written – but without the synchronisation problems, security drawbacks or complex legacy.

When the Lua virtual machine encounters an error, it invokes an error handler. This error handler will deallocate all resources that are not designated for keeping clients alive, but keep the rest of the engine intact (the longjmp step).

When the virtual machine is rebuilt, the scripts will be invoked with a special ‘adoption’ event handler – a way to communicate with the window management scripts to determine if it accepts a specific external client connection, or if it should be discarded. The nifty part about this design is that it can also be used to swap out window manager schemes at runtime, or ‘reset’ the current one – all using the same execution path.

The active script also has the option to ‘tag’ a connection handle with metadata that can be accessed from the adoption event handler in order to reconstruct properties such as window position, size and workspace assignment. The DE in the video, Durden, uses the tag for this very purpose. It also periodically flushes these tags to disks, which works both as ‘remember client settings’ and for rebuilding workspace state on harder crashes (see section 3).

2. Surviving Protocol- implementation crashes

For this demo, I have modified the wayland server side implementation to crash when a client tries to spawn a popup. There are two gtk3-demo instances up and running, where I trigger the popup-crash on one, and you can see that the other is left intact.

Wayland support is implemented via a protocol bridge, waybridge¬†– or by the name of its binary: arcan-wayland. It communicates with the main process via the internal ‘SHMIF’ API as a means of segmenting the engine processing pipeline into multiple processes, as trying to perform normal multithreading inside the engine itself is a terrible idea with very few benefits.

At the time of the earlier presentations on Arcan development, the strategy for Wayland support was simply to let it live inside the main compositor process – and this is what others are also doing. After experimenting with the wayland-server API however, it ticked off all the check-boxes on my “danger, Will Robinson!” chart for working in memory-unsafe languages (including it being multithread-unsafe), placing it firmly in a resounding “not a chance, should not be used inside a privileged process” territory; there are worse dangers than¬†The Wayland Zombie Apocalypse¬†lurking here. Thus, the wayland support was implemented in front of SHMIF rather than behind it, but I digress.

Waybridge has 2.5 execution modes:

  • simple mode – all clients share the same resource and process compartment.
  • single-exec mode – only one client will be bridged, when it disconnects, we terminate.

With simple mode, things work as expected: one process is responsible for mediating connections and if the process happens to crash, everyone else go down with it. The single-exec mode, on the other hand, is neat because a protocol level implementation crash will only affect a single client. It can also achieve better parallelism for shared-memory type- buffers, as the cost for synchronising and transferring the contents to a GPU will be absorbed by the bridge-process assigned to the responsible client at times where the compositor may be prevented from doing so itself.

What happens in single-exec mode, i.e.

./arcan-wayland -exec gtk3-demo

Is that the wayland abstract ‘display’ will be created with all the supported protocols. The related environment variables (XDG_RUNTIME_DIR, …) and temporary directories will be generated, and then inherit+fork+exec the specified client.

Note: This does not currently “play well” with some other software that relies on the folder (notably pulseaudio), though that will be fixed shortly.

The missing “.5” execution mode is a slightly cheaper hybrid that will use the fork-and-continue pattern in order to multiplex on the same listening socket and thus removing the need to ‘wrap’ clients; yet still keep them in separate processes. Unfortunately, this mode still lacks some ugly tricks in order to work around issues with the design of the wayland-server API.

Another neat gain with this approach is that each client can be provided with a different set of supported protocols; not all clients need to be allowed direct GPU access, for instance. The decoupling also means that the implementation can be upgraded live and we can load balance. Other than that, if you, like me, mostly work with terminals or special-purpose graphics programs that do not speak Wayland, you can opt-in and out of the attack surface dynamically. Per-client sandboxing or inter-positioning tricks in order to stub or tamper with parasitic dependencies(D-Bus, …)¬† that tend to come with wayland clients due to the barebones nature of the protocol, also becomes relatively easy.

3. Surviving Compositor level crashes

Time to increase the level of difficulty. Normally, I run Arcan with something like the following:

while true; do arcan -b durden durden; done

This means that if the program terminates, it is just restarted. The -b argument is related to “1. surviving window manager crashes”, i.e. in the event of a crash, reload the last set of scripts, it could also be the name of another window manager – or when doing development work, combined with a git stash to fall back to the ‘safe’ version in the case of a repeatable error during startup.

In this clip, you see me manually sending SIGKILL to the display server controlling the right display, and the screen goes black. I thereafter start it again and you can see the clients come back without any intervention, preserving both position, size and hierarchy.

This one requires cooperation throughout the stack, with most work on the ‘client’ side, and here are some Wayland woes in that there isn’t any protocol in place to let the client know what to do in the event of a crash. However, due to the decoupling between protocol implementation and server process, that task is outsourced to the IPC subsystem (SHMIF) which ties the two together.

What happens first is that the window manager API has commands in place to instruct a client where to connect in the event of the connection dying unexpectedly. This covers two use cases, namely crash recovery, and for migrating a client between multiple server instances.

When the waybridge detects that the main process has died, it goes into a reconnect/sleep on fail – loop against a connection primitive that was provided as a ‘fallback’. When the new connection is enabled, waybridge enumerates all surfaces tied to a client, and renegotiates new server-side resources, updates buffers and hierarchy based on the last known state.

The ugly caveat to this solution right now, is that connections can be granted additional privileges on an individual basis, normally based on if they have been launched from a trusted path of execution (Arcan spawned the process with inherited connection primitives) or not. Since this chain is broken when the display server is restarted, such privileges are lost and the user (or scripts acting on behalf of the user) need to interactively re-enable such privileges.

The big (huge) upside, on top of the added resilience, is that it allows clients to jump between different display server instances, allowing configurations like multi-GPU via one-instance per GPU like patterns, or by selectively moving certain clients to other specialised server instances, e.g. a network proxy.

4. Surviving Display System crashes

This one is somewhat more difficult to demonstrate – and the full story would require more cooperation from the lower graphics stack and access to features that are scheduled for a later release, so it is only marginally interesting in the current state of things.

The basic display server level requirement is being able to, at any time, generate local copies of all GPU-bound assets, and then rebuild itself on another GPU and – failing that, telling clients to go somewhere else. On the other hand, this is, to some extent, also required by the CTRL+ALT+Fn style of virtual terminal switching.

The ‘go somewhere else’ step is easy, and is just reusing the same reconnect mechanism that was shown in ‘Surviving Compositor Level Crashes’ and combining it with a network proxy, through which the client can either remote-render- or migrate state.

The rebuild-step starts out as simple since initially, the rendering pipeline is typically only drawing a bunch of quadrilaterals with external texture sources – easy enough. With window manager specific resources and custom drawing, multi-pass, render-to-texture effects, shaders and so on, the difficulty level rises considerably as the supported feature set between the devices might not match. Then it becomes worse when the clients that draw using accelerated devices that might have been irreversibly lost also need to be able to do this – along with protocol for transferring new device primitives and dynamically reloading related support libraries.

The upside is that when the feature is robust, all major prerequisites for proper multi-Vendor-multi-GPU scan-out, load-balancing and synthesis, live driver updates and GPU/display-to-client handover have been fulfilled – and the volatility is mitigated by the crash level recovery from section 3, but that is a topic for another time.

Posted in Uncategorized | Leave a comment

“AWK” for Multimedia

(… and for system graphics, games and other interactive applications but that would make the title just a bit too long…)

Many of the articles here have focused on the use of Arcan as a “desktop engine” or “display server”; even though those¬†are rather fringe application areas which only showcases a fraction of the feature set – the display server target just happens to be part of my current focus.

This post is about another application area. For those of you not ‘in the know’, AWK is a programming language aptly suited for scripted stream processing of textual data, and a notable part of basic command-line hygiene. Arcan can be used in a similar way, but for scripted- and/or interactive- stream processing of multimedia; and this post will demonstrate how that can be achieved.

The example processing pipeline we’ll set up first¬†takes an interactive gaming session in one instance (our base contents). This is¬†forwarded¬†to a second instance which applies a simple greyscale effect (transformation). The third instance finally mixes in a video feed and an animated watermark (our overlay or metadata). The output gets spliced out to a display and to a video recording.

The end result from the recording looks like this:

The invocation looks like this:

./arcan_lwa -w 480 -h 640 --pipe-stdout ./runner demo default |
    ./arcan_lwa --pipe-stdin --pipe-stdout ./gscale |
    ./arcan --pipe-stdin -p data ./composite mark.png demo.mp4 out.mkv

There are a number of subtle details here, particularly the distinction between “arcan_lwa” and “arcan”. The main difference is that the former can only connect to another “arcan” or “arcan_lwa” instance, while “arcan”¬†will connect to some outer display system; this might be another display server like Xorg, or it can be through a lower level system interface. This is important for accelerated graphics, format selection, zero-copy buffer transfers and so on – but also for interactive input.

Structurally, it becomes something like this:


All of these interconnects can be shifted around or merged by reconfiguring the components to reduce synchronisation overhead or rebalance for system composition. The dashed squares indicate process- and possibly privilege- separation. The smaller arcan icons represent the lwa (lightweight) instance while the bigger showing the normal instance responsible for hardware/system integration.

Note that content flow from some initial source to the output system, while input move in the other direction. Both can be transformed, filtered or replaced with something synthesised at any step in the (arbitrary long) chain. For the example here only works with a single pipe-and-filter chain, but there is nothing preventing arbitrary, even dynamic, graphs to be created.

Going from left to right, lets take a look at the script bundles(“appls”) for each individual instance. These have been simplified here by removing error handling, showing only the normal control flow.


This reads like “On start, hook up an external program defined by the two command line arguments, and make its buffers visible. Shut down when the program terminates, and force-scale it to fit whatever dimensions was provided at startup. Whenever input is received from upstream, forward it without modification to the external program”.

function runner(argv)
 client = launch_target(argv[1], argv[2], LAUNCH_INTERNAL, handler)

function handler(source, status)
 if status.kind == "terminated" then
  return shutdown("", EXIT_SUCCESS)
 if status.kind == "resized" then
  resize_image(source, VRESW, VRESH)

function runner_input(iotbl)
 if valid_vid(client, TYPE_FRAMESERVER) then
   target_input(client, iotbl)

Note that the scripting environment is a simple event-driven imperative style using the Lua language, but with a modified and extended API (extensions being marked with cursive text). There are a number of “entry points” that will be invoked when the system reaches a specific state. These are prefixed with the nameof the set of scripts and resources (‘appl’) that you are currently running. In this case, it is “runner”.

Starting with the initialiser, runner(). Runner takes the first two command line arguments (“demo”, “default”) and passes through to¬†¬†launch_target. This function performs a lookup in the current database for a ‘target'(=demo) and a ‘configuration'(=default). To set this up, I had done this from the command line:

arcan_db add_target demo RETRO /path/to/
arcan_db add_config demo default

The reason for this indirection is that the scripting API doesn’t expose any arbitrary eval/exec primitives (sorry, no “rm -rf /”). Instead, a database is used for managing allowed execution targets, their sets of arguments, environment, and so on. This doubles as a key/value store with separate namespaces for both arcan/arcan_lwa configuration, script bundles and individual targets.

RETRO indicates that we’re using libretro¬†as the binary format here, and the demo is the ‘MrBoom‘ core. This can be substituted for anything that has a backend or dependency that can render and interact via the low-level engine API, shmif.¬†At the time of this article, this set includes Qemu (via this¬†patched backend), Xorg (via this patched backend), SDL2 (via this patched backend), Wayland (via this tool), SDL1.2 (via this preload library injection). There’s also built-in support for video decoding (afsrv_decode), terminal emulation (afsrv_terminal) and a skeleton for quickly hooking your own data providers (afsrv_avfeed), though these are spawned via the related launch_avfeed¬†call.


This reads like “On start, compile a GPU processing program (“shader”). Idle until the adoption handler provides a connection on standard input, then assign this shader and an event loop to the connection. Forward all received interactive input. If the client attempts a resize, orient its coordinate system to match.”

function gscale()
 shader = build_shader(nil,
uniform sampler2D map_tu0;
varying vec2 texco;
void main(){
 float i = dot(
  texture2D(map_ty0, texco).rgb,
  float3(0.3, 0.59, 0.11)

 gl_FragColor = vec4(i, i, i, 1.0);
]], "greyscale")

function gscale_adopt(source, type)
 if type ~= "_stdin" then
  return false

 client = source
 target_updatehandler(source, handler)
 image_shader(source, shader)
 resize_image(shader, VRESW, VRESH)
 return true

function handler(source, status)
 if status.kind == "terminated" then
   return shutdown("", EXIT_SUCCESS)
 elseif status.kind == "resized" then
   resize_image(source, VRESW, VRESH)
   if status.origo_ll then
    image_set_txcos_default(source, true)

function gscale_input(iotbl)
 if valid_vid(client, TYPE_FRAMESERVER) then
   target_input(client, iotbl)

This shouldn’t be particularly surprising given the structure of the ‘launcher’. First thing to note is that build_shader¬†automatically uses the rather ancient GLSL120 for the simply reason that it was/is at the near-tolerable span of GPU programming in terms of feature-set versus driver-bugs versus hardware compatibility.

The interesting part here is the _adopt handler. This can be activated in three very different kinds of scenarios. The first case is when you want to explicitly switch or reload the set of scripts via the system_collapse function and want to keep external connections. The second case is when there’s an error is a script and the engine has been instructed to automatically switch to a fallback application to prevent data loss. The third case is the one being demonstrated here, and relates to the –pipe-stdin argument. When this is set, the engine will read a connection point identifier from standard input and sets it up via¬†target_alloc. When a connection arrives, it is being forwarded to the adopt handler with a “_stdin” type. The return value of the _adopt handler tells the engine to keep or delete the connection that is up for adoption.

A subtle detail that will be repeated later is in the origio_ll “resized” part of the event handler.

The skippable backstory is that in this area of graphics programming there are many generic truths. Truths such as that color channels will somehow always come in an unexpected order; GPU uploads will copy the wrong things into the wrong storage format and in the most inefficient of ways possible; things you expect to be linear will be non-linear and vice versa; if something seem to be easy to implement, the only output you’ll get is a blank screen. The one relevant here is that at least one axis in whatever coordinate system that is used will be inverted for some reason.

Any dynamic data provider here actually needs to cover for when or if data source decides that a full copy can be saved by having the origo be in the lower left corner rather than the default upper left corner. For this reason, the script needs to react when the origo_ll flag flips.


This reads like “on start, load an image into layer 2, force-scale it to 64×64 and animate it moving up and down forever. Spawn a video decoding process that loops¬† a user supplied video and draw it translucent in the corner at 20%. Record the contents of the screen and the mixed audio output into a file as h264/mp3/mkv”. Terminate if the ESCAPE key is pressed, otherwise forward all input”.

function composite(argv)
 symtable = system_load("symtable.lua")()

function setup_watermark(fn)
 watermark = load_image(fn, 2, 64, 64)
 if not valid_vid(watermark) then
 move_image(watermark, VRESW-64, 0)
 move_image(watermark, VRESW-64, VRESH - 64, 100, INTERP_SMOOTHSTEP)
 move_image(watermark, VRESW-64, 0, 100, INTERP_SMOOTHSTEP)
 image_transform_cycle(watermark, true)

function setup_overlay(fn)
 launch_decode(fn, "loop",
  function(source, status)
   if status.kind == "resized" then
    blend_image(overlay, 0.8)
    resize_image(overlay, VRESW*0.2, VRESH*0.2)
    order_image(overlay, 2)

function setup_recording(dst)
 local worldcopy = null_surface(VRESW, VRESH)
 local buffer = alloc_surface(VRESW, VRESH)
 image_sharestorage(WORLDID, worldcopy)
 define_recordtarget(buffer, dst, "", {worldcopy}, {},
  function(source, status)
   if status.kind == "terminated" then
    print("recording terminated")

local function handler(source, status)
 if status.kind == "terminated" then
  return shutdown("", EXIT_SUCCESS)
 elseif status.kind == "resized" then
  resize_image(source, VRESW, VRESH)
  if status.origo_ll then
   image_set_txcos_default(source, true)

function composite_adopt(source, type)
 if type ~= "_stdin" then
  return false
 target_updatehandler(source, handler)
 return true

function composite_input(iotbl)
 if iotbl.translated and symtable[iotbl.keysym] == "ESCAPE" then
  return shutdown("", EXIT_SUCCESS)
 if valid_vid(client, TYPE_FRAMESERVER) then
  target_input(client, iotbl)

Composite is a bit beefier than the two other steps but some of the structure should be familiar by now. The addition of system_load is simply to read/parse/execute another script and the symtable.lua used here comes with additional keyboard translation (how we can know which key is ESCAPE).

In setup_watermark, the thing to note is the last two move_image commands and the image_transform_cycle one. The time and interpolation arguments tell the engine that it should schedule this as a transformation chain, and the transform_cycle says that when an animation step is completed, it should be reinserted at the back of the chain. This reduces the amount of scripting code that needs to be processed to update animations, and lets the engine heuristics determine when a new frame should be produced.

In setup_overlay, launch_decode is used to setup a video decoding process that loops a single clip. If the decoding process tries to renegotiate the displayed size, it will be forcibly overridden to 20% of output width,height and set at 80% opacity.

The setup_recording function works similarly to setup_overlay, but uses the more complicated define_recordtarget which is used to selectively share contents with another process. Internally, what happens is that a separate offscreen rendering pipeline is set up with the contents provided in a table. The output buffer is sampled and copied or referenced from the GPU at a configurable rate and forwarded to the target client. In this case, the offscreen pipeline is populated with a single object that shares the same underlying datastore as the display output. The empty provided table is simply that we do not add any audio sources to the mix.

Final Remarks

I hope this rather long walkthrough has demonstrated some of the potential that is hidden in here, even though we have only scratched the surface of the full API. While the example presented above is, in its current form, very much a toy – slight variations of the same basic setup have been successful in a number of related application areas, e.g. surveillance systems, computer vision, visual performances, embedded-UIs and so on.

Even more interesting opportunities present themselves when taking into account that most connections can be dynamically rerouted, and things can be proxied over networks with a fine granularity, but that remain as material for another article.

Posted in Uncategorized | Leave a comment

Arcan 0.5.3, Durden 0.3

It’s just about time for a new release of Arcan, and way past due for a new release of the reference desktop environment, Durden. Going through some of the visible changes on a ‘one-clip or screenshot per feature’ basis:


Most Arcan- changes are internal engine modifications or changes to the assortment of support tools and libraries, lacking interesting visual changes, so dig into the detailed changelog further below for some more detail.

Crash Recovery Improvements:

All unhandled server-side scripting errors (i.e. no fallback application is set) are now interpreted by clients as display server crashes, triggering the crash recovery-reconnect behaviour rather than a clean shutdown. Two shmif- related bugs preventing Xarcan from recovering have also been squished. This should leave us with arcan_lwa and the waybridge- tool left in terms of basic crash recovery (omitted now as the allocation- management is fiendishly difficult). The video shows me explicitly killing the display server process (it’s on a while true; do arcan durden; done loop) and the new instance getting a reconnect from the recovering xarcan bridge.

Improved support for Wayland clients:

Here we see SDL2 (GL-test), QT (konsole), EFL (terminology), Weston-terminal and gtk3-demo mixing wl_shm, wl_drm, wl_shell, wl_zxdg_shellv6, and a bunch of other protocols and subprotocols, with about as many variations of decorations and mouse cursors.

Initial Bringup on OpenBSD:

There’s a lot of things left to do on this backend, with the bigger issues (as usual) being system integration with pre-existing virtual terminal management schemes and getting access to input samples in the right states in the processing pipeline. The following screenshot show that enough bring-up has been done to get keyboard input and Durden working enough to show graphics, spawn terminal frameservers etc.

OpenBSD screenshot

Terminal/TUI experimental features:

Smooth scrolling, Ligatures and non-monospace rendering. These are not really solvable problems for legacy terminal emulation (and disabled by default) but can in some limited settings be helpful, and will be relevant for upcoming work on terminal-emulator liberated TUIs/CLIs. The following video shows three differently configured terminals (though hinting looks like crap when recorded in this way). One that smooth-scrolls a bitmap font, one that shapes with a non-monospaced font, and the last one that performs ligature substitutions with the fira-code font.


A lot of the features and ideas hidden within durden have now been documented on the webpage, now situated at

Window slicing:

This is useful when you only want to keep an eye on- or interact with- a specific part of a window (say an embedded video player in a web browser), slicing away crop-bars from video players or avoiding toolkit enforced decorations.

Overlay tool:

This is useful when you want some contents to follow you regardless of the context you’re working inside (except dedicated fullscreen). This can, of course, be combined with window slicing for better effect.

Input multicast:

This is useful when you have a hierarchy of clients you want to receive the same input in the same time frame (e.g. when doing latency measurements) or when controlling a large number of terminals, VMs or remote desktops.


Window- relayout/resize animations for float/tile:

This was primarily added as a cheap way of debugging performance and interactions between the animation subsystems and the window-type dependent scaling policies. The effect seemed pretty enough to be left as a configurable toggle (global/config/visual/window animation speed).

LED devices and profiles:

This is useful for reducing eye strain by mapping display contents to ambient light, for communicating system events like incoming alerts and for improved UI by showing which keybindings are currently available and a color hint as to what they do.

This clip shows a custom rule that maps the contents of the currently selected window to LEDs placed behind the monitor.

This clip shows the currently accepted keybindings (all lit = input locked to window) color coded by menu path, while the F1..n buttons indicate the global audio gain.

External clipboard manager support:

Though the tool and the interface has been provided by Arcan for a while, the WM (Durden in this case) still needs to allow/enable support since Arcan itself doesn’t dictate or care about clipboards as such. In Durden, this support now manifests itself as enabling a ‘clipboard bridge’ that allows one passive (global listen), active (global insert) or full (global listen/insert) clipboard connection.


External gamma controller support:

Although the subsystem still needs some work on the Durden side, it is possible to either allow all clients full gamma control access (any client can make the screen go unusable dark) or enable it on a per-client basis (target/video/advanced/color-gamma synch). When performed on an Xarcan instance, the currently active display will be exposed over XRandr, meaning that old/legacy tools that require ramp controls should work transparently.

What’s Next?

While we are still waiting for advancements outside of our control in regards to lower level primitives (synchronisation control, drivers switching over to atomic modesets, developments at xdc2017 in regards to usable buffer APIs, Vulkan driver stability, the list goes on)  Рthe main development drivers for the 0.5 branch (which is, by far, the heaviest one planned for the entirety of the base project); heterogenous multi-GPU, live driver updates, VR and low level system graphics in general Рwill keep progressing alongside the various supporting tools.

The more enticing developments in the near-future are partly putting the finishing touches on the TUI- set of libraries and partly the unwrapping of the Lua APIs. As mentioned in earlier posts, the Lua interface has acted as a staging grounds for finding the set of necessary/sufficient features for writing quite advanced graphical applications. Quite some care was put into avoiding language-feature masturbation and object-oriented nonsense for the very reason of having the API being able to double as a privileged, GPU friendly drawing protocol. This means that the driving scripts can be decoupled from the engine and be controlled by an external process, creating some rather interesting possibilities – the least of which is re-enabling the separate window manager model from X, but without the intractable synchronisation issues.

Tracking for other, smaller, enhancements can be found in the issue tracker: arcan , durden.

Detailed Changelog

Arcan – Engine:

  • Refactored frameserver- spawning parts to cut down on duplicated code paths and make setup/control more streamlined.
  • Support for tessellated 2D objects, more fine-grained control over individual vertices.
  • Extended agp_mesh_store to cover what will be needed for full glTF2.0 support.
  • Crash-recovery procedure for external clients now also applies to scripting layer errors when there is no fallback appl set.
  • Reworked font/format string code to bleed less state and automatically re-raster if the outer object is attached to a rendertarget with a different output density.
  • Added additional anchoring points to linked images (center-left, center-top, center-right, center-bottom)
  • VR- mapping work for binding external sensor “limbs” to 3d models, continued bringup on managing vrbridge- instances and fast-path integration vid vrbridge provided sensor output.

Arcan – Lua:

  • New function: image_tesselation, used to change subdivisions in s and t directions, and to access and change individual mesh attributes for 3d objects.
  • New function: rendertarget_reconfigure, used to change the target horizontal and vertical density of a rendertarget.
  • New functions: vr_maplimb, vr_metadata
  • Updated function: define_rendertarget to¬†now returns status, accept more mode flags (for MSAA) and allow target density specification.
  • Updated function: alloc_surface to¬†allows additional backend storage formats, (FP16, FP32, alpha-less RGB565, …)
  • Updated function: link_image, added additional anchoring points

 Arcan РShmif:

  • New library, arcan-shmif-server. This is used for proxying / multiplexing additional connection unto an established one. Primary targets for this lib is a per-client networking proxy, and for TUI/Terminal to support delegating decode/rendering to other processes.
  • Added support for HANDOVER subsegments. These are subsegments that mutate into primary segments in order to reuse a connection to negotiate new clients without exposing a listening channel, allowing a client to negotiate connections on behalf of another.
  • RESET_ level 3 events now carry a reference to the deregistered descriptor so clients have a chance to remove from custom select()/poll() hooks that cache descriptor use.

Arcan – TUI/Terminal:

  • Another dissemination/progress article:¬†
  • support for bitmapped fonts (PSFv2) as an optional path for faster rendering on weak hardware and freetype- less builds.
  • Built-in bitmapped terminus font for three densities/sizes (small, normal, large) as fallback when no font is provided by the display server connection.
  • Added dynamic color-scheme updates.
  • Rendering-layer reworked to support shaping, custom blits, …
  • Experimental double buffered mode (ARCAN_ARG=dblbuf)
  • Experimental smooth scrolling in normal mode (ARCAN_ARG=scroll=4)
  • Experimental shaping mode kerning for non-monospace fonts (ARCAN_ARG=shape)
  • Experimental ligature/substitution mode for BiDi/i8n/”code fonts” via Harfbuzz (ARCAN_ARG=substitute)
  • Lua bindings and tool for experimenting with them (src/tools/ltui)

Arcan – Platform:

  • Refactored use of environment variables to a configuration API
  • EGL-DRI: VT switching should be noticeably more robust, EGL libraries can now be dynamically loaded/reloaded to account for upgrades or per-GPU sets of libraries.
  • AGP: Updated GLES2 backend to work better with BCM drivers.
  • Evdev: Added optional support for using xkblayouts to populate the utf8 field.
  • EGL-GLES: quick fixes to bring BCM blobs back to life on rPI.
  • OpenBSD: initial port bring-up, keyboard input and graphics working.
  • SDL2: added SDL2 based video/event platform implementation, some input issues left to sort out before 1.2 support can be deprecated and this be the default on OSX.

Arcan – Tools:

  • Aloadimage: basic support for SVG images.
  • Doc: started refactoring lua API documentation format to double as IDL for re-use of lua API as privileged drawing and WM- protocol.
  • Xarcan: synched to upstream, parent crash recovery fixes, old-drawing mode (no -glamor, no dri3) synchronization and color management improvement.
  • Qemu/SDL2: synched to upstream.


  • XKB- Layout transfer support, basic pointer and pointer surface (wl_seat)
  • Damage Regions, dma-buf forwarding (wl_surf)
  • More stubs (data_device/data_device manager/data_offer/data source)
  • zxdg-shell mostly working (toplevel, positioners, popup)
  • added support for relative_pointer motion


  • Documentation moved to a separate webpage,

  • allow client- defined mouse-cursor support
  • Window slicing: target/window/slice allows mouse-selected¬†subregion to (active->input forward or passive) bind a subregion¬†of one window to a new window.

  • External clipboard manager support: external clients can be¬†permitted to read and/or inject entries unto the clipboard. See¬†global/config/system/clipboard-bridge.

  • Gamma controls: external clients can be permitted to set custom¬†color/ and gamma/ lookup tables, either per window or globally.¬†See target/video/advanced/color-gamma synch and¬†global/config/system/gamma-bridge.

  • Filesystem-like IPC: the iopipes IPC path has been extended¬†to¬†allow ls, read, write and exec like navigation of the menu subsystem. This can be bound to a FUSE-wrapper to fully control (script!)¬†durden from a terminal.

  • LED devices: added support for profile driven LED device control¬†see devmaps/led/ or global/config/led

  • Input multicast : added support for input multicast groups. Enable per window via target/input/multicast. Keyboard input¬†received will be forwarded to all children.

  • Statusbar: can now be set to ‘HUD’ mode, where it is only visible on the global/ or target/ menu HUDs. (config/visual/bars/statusbar(HUD)/…)

  • Tools/Autolayout improvements: can now disable titlebars on¬†side-columns, and allow a different shader on side-columns¬†(see global/config/tools/autolayouting)

  • Tools/Overlay: [new], can now take the contents of a window and add¬†to a vertical column stack at left or right edge as scaled-down¬†previews.

  • Target/Video/Advanced: allow per-window output density overrides.

  • Atypes/wayland/x11: new scaling mode, ‘client’ to allow the client to¬†know about the max dimensions, but let it chose its own actual size¬†within those constraints.

  • Window- relayout/resize animations for float/tile: disable/enable via config/visual/window animation speed

  • Dynamically switchable visual/action schemes (devmaps/schemes/¬†that can be used to set a global, per-display, per workspace or per window¬†scheme of fonts and other configuration presets.

  • Allow GPU- authentication controls.

  • Split mouse cursors into sets.

  • More consistent font/font-size switching when migrating across displays with different densities.

  • Default-off display profiles for vive/psvr.

  • Defer window attachment to reduce initial storm of resize operations.

  • Menu options for appl- switching (global/system/reset/…).

  • Hidden bind path for suspend-state toggle (target/state/…).

  • Menu path to reset workspace background (global/workspace/…)

  • Menu path for global/workspace/switch/last.

  • Option to force bitmap font path for terminal.

  • A shader for luma (monochrome) – only mode.

  • Atype- profile for wayland clients.

  • Option to disable/block mouse (global/input/mouse/block).

  • Target menu path for set-x, set-y in float mode.

  • Mouse button debounce timer support (global/inpput/mouse/debounce).

  • Expose backlight controls per display (global/display/displays/…)

  • Tools/pulldown: can now set a shadow/colored border.

Posted in Uncategorized | 3 Comments

The Dawn of a new Command Line Interface

disclaimer: this is a technical post aimed at developers being somewhat aware of the problem space. There will be a concluding ‘the day of…’ post aimed at end users where some of the benefits will be demonstrated in a stronger light.

A few months back, I wrote a lighter¬†post about an ongoing effort towards reshaping the venerable Linux/BSD CLI to be free of the legacy cruft that comes with having to deal with the emulation of old terminal protocols, stressing the point that these protocols make the CLI less efficient, and hard to work with from both a user- and a developer- perspective. In this post, we’ll recap some of the problems, go through the pending solution, update with the current progress and targets, and see what’s left to do.

To recap, some of the key issues to adress were:

  • Split between terminal emulator and command line shell breaks desktop integration –¬†Visual partitions such as windows, borders and popups are simulated with characters that are unwanted in copy-paste operations and fail to integration with an outer desktop shell (if any).
  • Code/data confusion – both the terminal emulator and text-oriented user interfaces (TUIs) tries to separate content from metadata using a large assortment of encoding schemes, all being prone to errors, abuse, difficult to parse and ridden with legacy.
  • Uncertain capabilities/feature-set – basic things like color depth, palette, character encoding schemes and so on are all probed through a broken mishmash of environment variables, capability databases and the actual support varies with the terminal emulator that is being used.
  • Confusion between user-input and data¬†– programs can’t reliably distinguish between interactive (keyboard) input, pasted/”IPC” input and other forms of data entry.
  • Lack of synchronisation. This¬†makes it impossible for the terminal emulator to know when it is supposed to draw, and signal propagation contributes to making resize operations slow.
  • Crazy encoding schemes for representing non-character data – ¬†such as Sixel.

This just scratches the surface and don’t go into related issues when it comes to user-interaction, consistency, and it ignores the entire problem space of system interaction when it comes to tty devices, input modes, virtual terminal switching and so on.

If you consider the entire feature-set of all protocols that are already around and in use, you get a very “Cronenberg“- take on a display server and I, at least, find the eerie similarities between terminal emulators and the insect typewriters from Naked Lunch¬†both amusing, tragic and frightening at the same time; the basic features one would expect are there, along with some very unwanted ones, but pieced together in an outright disgusting way. If we also include related libraries and tools like curses¬†and¬†turbo vision¬†we get a clunky version of a regular point and click UI toolkit. Even though the scope is arguably more narrow and well-defined, these libraries are conceptually not far away from the likes of Qt, GTK and Electron. Study unicode and it shouldn’t be hard to see that ‘text’ is mostly graphics, the largest difference by far is the smallest atom, and the biggest state-space explosion comes from saying ‘pixel’ instead of cell.

So the first question is, why even bother to do anything at all within this spectrum instead of just maintaining the status quo? One may argue that we can, after all, write good CLI/TUIs using QT running on Xorg today, no change needed – it’s just not the path people typically take; maybe it’s the paradigm of favouring mouse or touch oriented user interaction that is “at fault” here, along with favouring style and aesthetics over substance. One counterpoint is that the infrastructure needed to support the toolkit+display server approach is morbidly obese into the millions of lines of code, when the problem space should be solvable within the tens-of-thousands, but “so what, we have teraflops and gigabytes to spare!”. Ok, how about the individual investment of writing software? accommodating for disabilities? attack surface? mobility and mutability of produced output? efficiency for a (trained) operator? or when said infrastructure isn’t available? the list goes on.

There is arguably a rift here between those that prefer the ‘shove it in a browser’ or flashy UIs that animate and morph as you interact, and those that prefer staring into a text editor. It seems to me that the former category gets all the fancy new toys, while the latter mutters on about insurmountable levels of legacy. What I personally want is many more “one- purpose” TUIs and for them to be much easier to develop. They need to be simpler, more consistent, obvious to use, and more configurable. That’s nice and dreamy, but how are “we” supposed to get there?

First, lets consider some of the relevant components of the Arcan project as a whole, as the proposed solution reconfigures these in a very specific way. The following picture shows the span of current components:

family1.pngThis time around, we’re only interested in the parts marked SHMIF, Terminal and TUI. Everything else can be ignored.¬†SHMIF is the structural glue/client IPC. TUI is a developer facing API built on top of SHMIF but with actual guarantees of being a forward/backwards compatible API. Terminal is a vtXXX terminal emulator/state machine built using a modified and extended version of libtsm.

Inside the ‘Arcan’ block from the picture, we have something like this:

Arcan (bin) layers

From this, we take the frameserver(ipc)– block and we put it into its own shmif-server library. We take the platform block and split out into its own,¬†libarcan-abc. Terminal is extended to be able to use these two APIs along with optional Lua/whatever bindings for the TUI API so that the higher level shell CLI logic with all its string processing ickiness can be written in something that isn’t C. This opens the door for two configurations. ¬†Starting with the more complex one, we get this figure:


Here, Arcan is used as the main display server or hooked up to render using another one (there are implementations of the platform layer for both low-level and high-level system integration). The running ‘appl’ acts as the window manger (which can practically be a trivial one that just works as fullscreen or the alt+fN VT switching style with only a few lines of code) and it may spawn one or many of the afsrv_terminal. These can be run in ‘compatibility mode’ where the emulator state machine is activated and it acts just like xterm and friends.

We can also run it in a simpler form:


In this mode, the terminal works directly with the platform layer to ¬†drive displays and sample input. It can even perform this role directly at boot if need be. An interesting property of shmif here is the support for different connection modes (which I’ll elaborate on in another post) where you can both interactively¬†migrate and delegate connection primitives. This means that you can switch between these two configurations at runtime, without data loss – even have the individual clients survive and reconnect in the event of a display server crash.

No matter the configuration, you (the ghost behind the shell) get access to all the features in shmif¬†and can decide which ones that should be used and which ones that should be rejected. You are in control over the routing via the choice in shell (and the appl- for the complex version). Recall that the prime target now is local text-oriented, command line interfaces – not changing or tampering with the awk | sed | grep | … flow, that’s an entirely different beast. In contrast to curses and similar solutions, this approach also avoids tampering with stdin, stdout, stderr or argv, because¬†connection primitives and invocation arguments are inherited or passed via env. This should mean that retrofitting existing tools can be done without much in terms of ifdef hell or breaking existing code.

Anyhow, most of this is not just vapours from some hallucinogenic vision but has, in fact, already been implemented and been in testing for quite some time. What is currently being worked on now and for the near future is improving the quality in some of the existing stages and adding:

  • Double buffering on the virtual cell screen level to add support for sub-cell “smooth” scrolling, text shaping, BiDi, and non-monospace, properly kerned, text rendering.
  • API and structures for designating regions (alt-screen mode) or lines (normal mode) for custom input, particularly mixing/composing contents from other tui clients or frameservers.

Then comes some more advanced refactoring:

  • Shmif-server API still being fleshed out.
  • Libarcan-abc platform split, as it depends on another refactoring effort.
  • Lua bindings and possibly an example shell.

And more advanced “some time in the future” things:

  • Shmif-server-proxy tool that can convert to-/from- a network or pipe-passed ‘line format’ (protocol) to enable networking support and test high latency/packet loss behavior.
  • CPU- only platform rasteriser (current form uses GL2.1+ or GLES2/3).
  • Ports to more OSes (currently only Linux, FreeBSD, OSX).

Should all these steps succeed, the last ‘nail in the coffin’ will be to provide an alternative platform output target that undoes all this work and outputs into a VT100 compliant mess again – all for the sake of backwards compatibility. That part is comparably trivial as it is the end result of ‘composition’ (merge all layers), it is the premature composition that is (primarily) at fault here as information is irreversibly lost. It is just made worse in this case as the feature scope of the output side (desktop computer instead of dumb terminal) and the capability of the input side (clients) mismatch because of the communication language.

Posted in Uncategorized | Leave a comment

Arcan 0.5.2

A new version of Arcan has been tagged. There is no demo video or fancy screenshots this time around; those things¬†will have to wait until updates come to the related projects (mainly Durden) in a few weeks. Most of the work that remains on the 0.5 series isn’t much to look at by itself¬†– but is¬†nevertheless conceptually and technically interesting.

Some generic highlights from the last ~8 months of work:

The ‘Prio’ side-project – described in more depth in the One night in Rio – Vacation photos from Plan9¬†post. Outside of its value as a security- research target, an hommage to ‘the elders’ and a reminder that there are many more “ways of the desktop” out there than the ‘popular by legacy and traction’ Win/OSX/… one. Prio also serves as a decent base for rapidly¬†putting together highly customised¬†environments.

A number of new supporting¬†tools (src/tools) – There’s aloadimage¬†for image loading that will serve as a testing and development tool for security work like sandbox hardening, quality improvement work in terms of performance and – more importantly HDR rendering and HDR- formats. There’s¬†aclip for building CLI- and scriptable external clipboard managers. There’s shmmon¬†for connection- inspection, monitoring and debugging, and there’s¬†waybridge for wayland client support. Speaking on the Wayland support, it’s getting to a stage where things like gtk3-demo, weston-terminal etc. start showing up – but¬†so far, it has unfortunately been a very unpleasant beast to work with, and with the current pacing, it will take, at least, another good month or two still until it’s really usable.

VR grunt¬†work – One of the near-future prospects that interest me the most on both a personal and a professional level – is the one of getting rid of much of¬†the unintuitive and overcomplicated cruft that browsers, UI toolkits and the “traditional” desktop puts between the individual and computing.¬†“Desktop VR” as it has been presented so far is little more than low-definition “planes-in-space”. With the layering and divison of responsibility that Arcan brings to the table, much¬†more interesting opportunities should arise.¬†Putting flowery visions aside, the support that has been integrated right now is far from stellar, but the “heavy on experimentation, light on results” phase of figuring out how everything from device interfacing- to the scripting API- is supposed to work – is nearing its end. The first part in this is the tool¬†vrbridge that provides device- control and aggregation (the current hardware reality of coding for VR is mostly a closed-source vendor-lock in mess where you won’t get access to the primitives without swearing allegiance to bulky full-engine APIs) which will be fleshed out during the coming few releases.

TUI – Covered in the (regrettably crap) blog post on Chasing the dream of a terminal-free CLI¬†is the (soon to be) developer facing API that takes advantage of SHMIF features to try and provide a way for building text oriented / command-line interfaces that gets rid of the legacy baggage and limitations that stem from having the CLI shell to always work through terminal emulator protocols like VT100, instead making it talk with the display server directly.¬†This will be accompanied with Lua bindings and a (bash/zsh/…)- like shell environment. The SensEYE¬†sensors and translators will also be reworked to use this API.

Xarcan¬†– Is a patched Xorg that interfaces with SHMIF and¬†provides ‘X in a box’ like integration. For all the ‘beating up the old man’ that¬†Xorg seem to get, the backend coding was neither more or nor less painful than Qemu or SDL proved to be. Normal use should be just fine, but dipping into glamor+accelerated graphics seem to be good at provoking graphics driver crashes, possibly since we bind the GL context¬†to a render node rather than the card- node. See the README in the git repository for more details.

Platform refactoring –¬†The main ‘low-level’ platform backend, egl-dri – has been extended with some basic switchable¬†‘synchronization strategies’ for dynamically changing¬†scheduling priorities between energy efficiency, lower input latency, smoother animations and so on. The egl-nvidia code has been integrated with the¬†egl-dri platform now that unified buffer project seems to have stalled. There are¬†some caveats on activating and using it with the¬†NVIDIA¬†closed-source blobs, covered further in the¬†wiki. Most GL use has been refactored to be dynamically loadable and reloadable, getting us much closer to multi-vendor-multi-GPU use and live driver upgrades.

LED subsystem rework – This was very recently covered in the Playing with LEDs¬†post, so I will only mention this briefly, but the way LEDs like keyboard NumLock, CapsLock, ScrollLock, Display Backlights and more advanced “gamer keyboard” like setups – were all hooked up has been reworked and mostly moved to a tiny protocol over a pipe.

SHMIF improvements –¬†The internal segmentation/IPC API has been extended to support negotiation for privileged features, such as access to display lookup tables and VR metadata (also, placeholder: HDR, Vector). Extended accelerated graphics (OpenGL etc.) has been split out into a shmifext- library so that all the other project backends use the same method for getting accelerated GPU access. This will primarily play a role when we need to respond to hotplug events or load balance between multiple GPUs. A new ‘preroll’ stage has been added to the connection process in order to provide an explicit synch-point for script- dependent initial metadata, which should cut down on connect-and-draw latency.

An interesting development that is being experimented with, but won’t be pushed for a while, is reusing the Lua API (most of the API was actually designed with this in mind) as an external protocol for moving¬†the ‘appl’ layer out into its own process. This will be relevant for two reasons. The first one is to make it possible to control the engine using¬†languages other than Lua. It also makes it possible to run things in an X11- like “separate window manager” division of responsibility in an ‘opt-in’ way. The second, more distant, reason is that it works to provide a render- API for network like transparency – though¬†there are likely to be more hard-to-foresee problems lurking here when there’s added latency and round-trips.

A slightly more detailed changelog is available on the wiki.

Posted in Uncategorized | Leave a comment

Playing with LEDs

It’s not uncommon to only bring up monitors when talking about display server outputs, but there are more subtle yet arguably important ones¬†that doesn’t always get the attention they deserve. One such output are LEDs (and more fringe light emitters that gets bundled together with the acronym), present in both laptop backlights and keyboard latched-modifier indicators used for num-lock, scroll-lock and so on.

In the case of the display backlight, the low-level control interface may be bundled with backlight management, so there’s a natural fit. In the case of keyboard modifiers, it comes with the window management territory – someone needs privilege to change the device state, and different applications have different views on active modifiers. A lot of gaming devices, recent keyboards and mice today also come with more flexible LED interfaces that permit dozens or hundreds of different outputs.

Arcan has had support for fringe LED devices for a long time (since about ~2003-2004) in the form of the Ultimarc LED controller, used for very custom builds. Recently, this support has been refactored somewhat and extended to be usable both for internally managed LED devices that come bundled with other input and output devices – and for a custom FIFO- protocol for connecting to external tools.

As part of developing and testing this rework, the following video was recorded:

Thought it may not be easy to see (I don’t have a good rigging for filming at the moment), a few interesting things happen here. The hardware setup is comprised of a few Adafruit Neopixel sticks attached to the back sides of the monitor, along with an Arduino and a G410 RGB keyboard. The software setup is a custom LED profile for Durden (an upcoming feature for the next release). This profile samples some state of the currently selected window, like contents or level of trust (since application origin and permissions are tracked), and maps to the arduino controlled LEDs. It is updated in response to¬†changes to the canvas contents of the window, and moves with window- selection state. There is also a ‘keymap’ profile that describes how the currently active keyboard layout¬†translates into RGB keyboard LEDs. This allows the input-state, like the currently available keybindings, to be reflected by¬†the lights on keyboard. When a meta-key is pressed – only the keybindings relevant to that key will be shown on the keyboard.

This can be utilised for more ambient effects, like in the following video:

Here, the prio WM is used, and it maps the contents of the video being played back in the lower right corner to both the display backlight, and to the keyboard.

This system allows for a number of window- manager states to be trivially exposed — things such as notification alerts, resource consumption issues, application¬†crashes etc. in (to some) much less intrusive ways than something like popup windows or sound alerts.

In the following clip you can see a profile running on Durden that maps both global system state (the top bar corresponds to the current audio volume levels, and the all-white state indicate that input is locked to a specific window), while the other colors indicate valid keybindings and what they target.

Though the driver support may be sketchy (RGB Keyboards and related gaming peripherals can be absolutely terrible in this regard), the patches and tools used for the demos above can be found in this git repository.

Posted in Uncategorized | Leave a comment