Arcan versus Xorg – Approaching Feature Parity

This is the first article out of two where I will go through what I consider to be the relevant Xorg feature set, and compare it, point by point, to how the corresponding solution or category works in Arcan.

This article will solely focus on the Display Server set of features and how they relate to Xorg features, The second article will cover the features that are currently missing (e.g. network transparency) when they have been accounted for, as well a the features that are already present in Arcan (and there are quite a few of those) but does not exist in Xorg.

It is worthwhile to stress that this project in no way attempts to ‘replace’ Xorg in the sense that you can expect to transfer your individual workflow and mental model of how system graphics works without any kind of friction or effort. That said, it has also been an unspoken goal to make sure that everything that can be done in an Xorg environment should be possible here — in general there is nothing wrong with the feature set in X (though a bit limited), it is the nitty gritty details of how these features work, are implemented and interact that has not really kept up with the times or been modelled in a coherent way. Thus, it is a decent requirement specification to start with – just be careful with the implementation and much more can be had for a fraction of the code size.

The very idea of the project as a whole is to find new models for system graphics, preferably those that can survive the prospect of a coming ‘demise’ of the ‘desktop’ as the android/chrome alphabet soup monster keep slurping it up, all the while established ones make concessions after concessions to try and cater to a prospective user base that just want things to “work” at the expense of advanced user agency.

In terms of the ‘classic’ desktop, Xorg with DRI3 work is already quite close to that dreaded ‘good enough’, and the polish needed is mostly within reach. My skepticism is based on the long standing notion that it won’t actually matter or be anywhere near ‘good’ enough.

This article is about as dry an experience as quaffing a cup of freshly ground cinnamon – even after the wall of text was trimmed down to about half of its original length. To make it less painful to read, you can use these shortcuts to jump between the different feature categories.

Most of these sections start with a summary of the X(org) perspective, and then contrast it with how it is handled in Arcan.

As a precursor to the other sections, lets first take a look at two big conceptual differences. The first one is that in Arcan, the only primitive a client really works with is referred to as a ‘segment’. A segment is a typed IPC container that can stream audio and/or video data in one direction, ‘client to server’ or ‘server to client’. It also carry associated events, metadata and event-bound abstract data handles (e.g. file descriptors) in both directions.

The type is a hint to/from the policy layer (which is somewhat similar to a ‘Window Manager’), e.g. popup, multimedia, clipboard, game and so on – used to assign priority and influence decisions such as suppress-suspend or input grabs without letting clients request such state changes, which is otherwise the annoying default in many other environments.

Any connected client gets one segment (the ‘primary’) and closing that one will release all resources associated with the client. A client may negotiate for additional segments, but these can be rejected, and rejection is, just as in real life, the default behaviour. The policy layer has to explicitly opt in to further allocations. A client may also negotiate for extending the capabilities of the segment to support advanced and privileged features, such as access to VR related devices, hardware colour lookup tables and so on.

A second key difference is the aforementioned policy layer. This layer act as both the window manager and event driven I/O “routing” rules combined into one. It is not treated as a normal client, but instead has access to a unique set of privileged functions that can be leveraged to explicitly control everything from display synchronisation, source to output mapping, to event propagation and synthesis.

The premise is that the choice in policy layer should be modifiable at the user’s behest. It can be thought of as a firewall and router for all UI related activity – in control of what goes where, when, why, and in what shape. This layer also have the option to not only accept and map client segments, but also to explicitly push a segment to a client, as a means of probing- and announcing- support for some features.

Clients and Privileges

For those not entirely comfortable with display server parlance, recall the distinction between X11 (X protocol version 11) and Xorg. X11 is a protocol (a literal stack of papers), while Xorg is the current de facto server side implementation of that protocol – though the protocol itself only expose a subset of what the server itself can potentially do. A reasonable parallel is the one of the relationship between HTTP (or SPDY, QUIC) and the rest of the web browser (Firefox, Chrome, …) both in engineering effort and relative importance. For clients to communicate using the protocol, the de facto implementation comes via the xlib library or via the more recent ‘xcb’ library.

This is a setup that is quite rare for user-facing operating systems in general: Windows does not have a display server protocol, and the same goes for OSX and Android. The real value of a protocol is communication between hard system barriers, politics condensed so to speak, but interaction between userspace components is not a particularly hard boundary. The protocol distinction is also not very common in related subsystems like audio. What happens instead is that they do expose APIs for letting other clients interact and integrate with the outer system, but that, by its very definition, only describes the interface – nothing strictly about the communication channel, ordering rules, valid- and invalid- bit patterns or actions to take on non-compliance, as some of those parameters are regulated on an OS level in the shape of the lesser known ABI.

In a similar vein, Arcan does not have a protocol for normal clients as such, and compatibility support for protocols like X11 and Wayland come via translation services built as separate opt-in special clients for stronger separation, sanity and sandboxing reasons.

API-wise, Arcan exposes three APIs. ‘Shmif’, ‘Tui’ and ‘Alt’. The philosophy behind the design is to let the server side absorb systemic complexity, and strive to make clients as trivial as possible to both develop and inspect. No multi-million lines of code toolkit should ever be necessary. The perspective is to instead emphasise the extremes, that is to focus on the small “one task” kind of clients and on integration with bigger “embedded universes” like virtual machines and web browsers, and leave the middle to fester and rot.

The ‘shmif‘ API provides the low level interface and exposes a large set of features over a deceptively simple setup, one that was mostly inspired by old arcade hardware designs. This API is kept in lock-step with the engine. It may have incompatible changes between versions (but so far, not really), and will remain lock stepped up until the fabled ‘1.0’ release. This means that the running engine version needs to match that of the shmif API version. It enforces the attitude of “if you upgrade the server side, upgrade the clients” as well. Therefore, it does not provide means for “extensions”. The reasoning for this is that it should be absolutely clear from a version which features that are present or not. It should not be left up to guesswork, a “TERM=xxx ; termcap” style database or Wayland- “Charlie Foxtrot” style extension registry.

The ‘tui’ API builds upon shmif and primarily strikes at text-oriented user interfaces as a means of dissolving the dependency these have had traditionally to terminal protocols and the need to go via a terminal emulator. This opens the door to better integration with the window management style in the outer desktop; sane input models; multimedia- capable ‘rich text’ command line tools; removing the need for tmux style multiplexers and so on.

Lastly, the ‘alt’ API is the high level engine interface itself and is currently exposed via Lua (see also: AWK for multimedia). It is used on the server side to implement the policy layer, but it can also be used client side via a special Arcan build (referred to as LWA) that acts as a normal 2D/3D renderer.

Going back to the arcan-shmif API — it is used to implement support for a number of features, but also for supporting other protocols. These typically come as separate processes and services that you switch on and off during runtime. The big two examples, right now, being ‘waybridge’ for Wayland support and ‘xarcan’ for X11 client support. These are kept as separate, privilege separated and, optionally, sandboxed processes and can run in all combinations: one bridge to one client, one bridge to many clients and many bridges to many clients.

Programs that use the arcan shmif API are split into three distinct groups: ‘subjugated‘, ‘authenticated‘ and ‘non-authoritative‘ – collectively referred to as ‘frameservers‘ in most of the engine documentation and code.

The authenticated clients are started through the initiative of the window manager and their parameters are predefined in a database. They inherit their respective connection primitives, thus keeping the chain of trust intact.

Subjugated clients have a preset ‘archetype‘. This archetype regulates its intended role in the desktop ecosystem, e.g. media decoding,  encoding, CLI shell etc. and the window manager may forcefully create and destroy these with predictable consequences. They can therefore be considered as a form of ‘do one thing well then die’ graphical clients. The big thing with this is that a window manager can make stronger assumptions about what to make of their presence or lack thereof.

The last ‘Non-authoritative’ clients are ones that more closely match the model seen both in other Wayland compositors and in X11 (X has a very mild form of authentication, Wayland completely fails in this regard) and is simply some client started from a terminal (tty) or, more likely, pseudoterminal (ptty) using some OS specific connection primitive and discovery scheme.

In X11, clients connects and authenticates to an X11 server with a flat (disregarding certain extensions, there are always ‘extensions‘) set of quite far-reaching permissions – clients can iterate, modify, map, read and redirect parts of both the event routing and the scene graph (a structure that controls what to draw, how to draw and when to draw). It is a tried and true way of getting very flexible and dynamic clients – which is also its curse.

In Arcan, there is a minimal default set of permissions which roughly corresponds to drawing into a single re-sizeable buffer, along with a bidirectional event queue. Everything else is default-reject policy, and has to be opted in. This includes attempts to retrieve additional buffers, on the principle that a client should have a hard time to issue commands that induces uncontrolled server-side resource allocations. Partial motivation for the resource stringency is that the display server can run for extended periods of time, and small performance degradation from heap fragmentation and so on- accumulate slowly, presents itself as ‘jankyness’, ‘judder’, ‘lag’ and can be terrifyingly hard to attribute and solve.

The upside is that the ‘window manager’ can extend this on a sliding scale from accepting custom mouse cursors for certain clients, to allowing GPU accelerated transfers, access to colour controls, buffer readbacks and unfiltered event-loop injection. Principally, albeit unwise, there is nothing that stops this layer from opening the permission doors wide open immediately and just replicate the X11 model in its entirety, but it happens on the behest of the user, not as a default.

Window Managers (WM)

One of the hallmark features of any decent display system is to be able to allow the user to manipulate, tune, or replace the interface, event routing and event response as he sees fit.

In X, any client can achieve this effect, which is also one of its many problems – if one or many clients tries to manage windows you have the dubious pleasure of choosing between race conditions (which manifest as short intermittent graphics glitches in conflicted territories like decorations, or buffer-content tearing) or performance degradation in terms of increased latency and/or lowered throughput. In many cases, you actually get both.

Arcan keeps the Window Manager concept as such, but it is not treated as ‘just another client’ with access to the a same set of mechanisms as all the others. Instead, it act as a dominant force with access to a much stronger set of commands. Since the engine both knows and trusts the window manager, it can allow it to drive synchronisation and get around the synchronisation problems inherent to the X approach.

All example window management schemes that have been showed in the videos here over the years are implemented at this level and they have acted as driving forces for evolutionary design of the scripting API. Some of the high level scripts, mouse cursor management being one example, gets generalised and added to a shared set of opt-in ‘builtin’ scripts.

Because of these choice, most of the features that the individual schemes provide can be transplanted into other window managers at a much lower developer effort than what would be ever be possible in Xorg or any other current Wayland compositor for that matter.

Displays vs Connection Points

In X11, a client connects to a ‘display’ specified with the DISPLAY=[host]:[.screen] environment variable. This is an addressing scheme that maps to a domain or network socket through which you can connect to the server.

In Arcan, the environment variable is ARCAN_CONNPATH=”name” which also points to a domain socket (unless the connection is pre-authenticated and the primitives are inherited). The big difference is that the “name” is set by the Window Manager and it is consumed on use, meaning that when a client has connected through it, it disappears until it is reopened by the window manager.

This is a very powerful mechanism as it allows the Window Manager to treat clients differently based on what address they connect to. It means that the API can focus on mechanisms and not policy, and the Window Manager can select policy based on connection origins. Instead of just using the name as an address to a server, it is an address to a desktop location.

Trivial examples would be to have singleton connection points for an external status bar, launchers or for the desktop wallpaper – without having to pollute a protocol with window management policies and worse, modifying clients to enable what is principally window manager scheme dependent policy.

This mechanism has also been used to successfully implement load balancing and denial of service protection (something like while true; do connect_client &; done from a terminal should not be able to render the UI unresponsive or cause the system to crash or terminate).

Reading Window or Screen contents

An X client has a few options for reading the contents of a window, its own or others. This can be used to sample whole or parts of the display server output, but at a fairly notable cost. The actual means vary with the set of extensions that are available.

For instance, if GLX (the GL extensions to the X protocol) is available, the better tactic is to create an offscreen “redirected” window on the server side, bind that to a client accessible texture and draw/copy the intended source into it and then let the GPU sort out the transfer path.

In Arcan, the feature is much more nuanced. A client can request a subsegment of an ‘output’ type, but it has no explicit control over what this means more than “output intended for the client”. The same mechanism is used for the drop part in drag and drop, and the paste part in clipboard paste. The window manager explicitly specifies which sources and which transformations should be sampled and forwarded to the client. On the client side, the code can look like this.

This means that some clients may be provided full copies of the screen, or something else entirely, and the contents can be swapped out at will and at any time, transparently to the client. Since each segment can contain both audio and video, the same approach applies to audio. The philosophy is that the window manager provides user controls for defining selection and composition, and the clients provides processing of composited contents.

The window manager can also chose to force-push an output segment into a client as a means of saying “here’s output that the user decided you should have, do something with it”. The client gets the option to map it and starts reading, or just ignore and implicitly say “I don’t know, I don’t care”. A useful mental model of this is like normal ‘drag and drop’ but where the drag contents is user- defined, and the target is unaware of content origins.

Input and Input Injection

The input layer is a notoriously complicated part of X11. The specs for XInput2 and XKB alone are intimidating reads for sure, not including deprecated but present methods and so on. They also relate to one of the larger issues with X11, being the lack of WM mandated input coordinate space translation and scaling. You have access to zero race condition free ways of controlling who gets control over what input subsystem and when, which is a notable cause of all kinds of problems – but it also severely hinders the features a desktop can provide.

Not only is the input model complicated, but it is also limited. A lot of things has happened to input devices in terms of both speciality accessibility, gaming devices and hybrid systems like VR positioning and gesture input.

The philosophy in Arcan is to have the engine platform layer gather as much as it can from all input devices it has been given permission to use, and provide them to the window manager scripts with minimal processing in between – no acceleration or gesture detection. This is not always reasonable for performance reasons, which boils down to the sample rate of the device and the propagation cost for a sample. It is therefore possible to set rate limits or primitive filters as to not over-saturate event queues – but otherwise the input samples are as raw as possible.

This is done in order to defer decisions on what an input device action actually means. For that reason, there are default scripts the window manager can chose to attach to overlay gesture detection and similar higher level features – then translate and forward to a client.

For input injection, things are even more flexible. The “event model” in the arcan-shmif API match the one the engine uses to translate between the platform layer (where the input events from devices normally originate) and the scripting layer. Client event processing simply means that the event queues in the memory shared with each client gets multiplexed and translated onto a master queue, with a priority and saturation limiter.

A key difference however is that the events provided by a client gets masked and filtered before they are added to the master queue. The default mask is very restrictive, and the events associated with input devices are blocked. The scripting API, however, provides target flags, allowing this mask to be changed. Combine this with the pre-authenticated setup and other programs can act as input drivers in a safe way, yet still present UI components (such as on-screen keyboards) or configuration interface. This code example illustrates a trivial input injector which presses random keys on a random number of keyboards.

Displays, Density and Colour Management

X has the XRANDR extensions for exposing detailed information about output displays. This interface is also used for low level control of resolution, orientation, density and accelerated colour lookup tables.

For colour lookup tables, there are at least two big drawbacks.  The first is that any client can make a display useless by providing broken or bad tables. The second is that there is no coordination about which client knows what or in which order table transforms are to be applied or when – a game might need its brightness adjusted while an image viewer needs more advanced colour management. This will become more pronounced as we drift towards a larger variety in displays, where HDR will probably prove to be even more painful to get right than variations in density.

Exposing low level control of the other display parts comes with the problem that a lot of extra events and coordination is needed for the window manager to understand what is happening, and these solutions worked at a time when CRT displays were at their zenith and not as well now when their collective sun is setting.

Density is a related and complicated subject. A decent writeup of the interactions between X extensions and displays in the context of density can be found here: [Mixed DPI and the X Window System], and it is a better read than what can be covered here.

As for Arcan, there is an extended privileged setup of the SHMIF API where a client gets accessed to enumerating displays and their properties in an RANDR- like way. It can also be used to both retrieve and submit acceleration tables, but it ultimately requires that the window manager opt in and act as an intermediate in order to resolve conflict. Such tables are therefore tracked and managed on a per client level and not per display,.

For resolution and orientation control, see the section on ‘Management Interfaces‘ section further below, and for more details on Density, see the ‘Font Management‘ section. What is worth to mention here is that these properties are all handled ‘per client’ and the window manager is responsible to forward relevant information on a client basis. Thus, multiple clients can live on the same display but have different ideas on what the density and related parameters are.

Font Management

Font rendering and management is a strong candidate, if not a sure winner, for the murkiest part in all of X – though the input stack also come close (since the print server is principally dead). The better description that can be found is probably keithp’s “The Xft Font Library” but it also relies on quite some knowledge on the huge topic of text rendering and how it relates to network transparency and a historical perspective on printing protocols and printers as “very high on DPI, extremely low on refresh rate” kind of displays. The topic of X Font Management in the shape of XFS (the X Font Server) and XFLD (X logical font description) will be left in the history department on account on it practically being deprecated, but there is another aspect that is worth bringing up.

Text is one of the absolutely most important primitives being passed around. No doubt about that. Fonts are connected in order to transform text into something we can see.

This transform depends on the output display, such as its density, subpixel layout and orientation. It also requires user context sensitive information such as presentation language in order to provide proper shaping and substitutions.

As a short example of a ‘simple’ problem here: take subpixel hinting where the individual colour channels are biased in order to provide the appearance of higher text resolution. While it may look better locally, taking a ‘screenshot’ or ‘sharing’ such a window with a remote viewer has the opposite effect.

The division of responsibility for text processing is an intractable problem with no solution in sight that doesn’t have far reaching consequences and quality tradeoffs. The legacy compromise is to work in multiples of 90, 96 DPI or similar constants and then express a scale factor. For new designs, this should make every high school maths and physics teacher cringe in disgust.

In Arcan, a client is normally responsible for all text drawing while the window manager suggests some of the parameters. The way it is managed is that at the moment a connection is opened, the client immediately gets a set of initial properties that includes a set of fonts (primary, secondary etc.), preferred font size, output display properties and language.

These can be dynamically overridden later via a handful of events, primarily FONTHINT (size or hinting changes, transfer of new descriptors), DISPLAYHINT (preferred output size and target density) and GEOHINT (position and input-output languages).

As a twist, clients can switch a segment to use a different output format, TPACK. This format packages enough high level information (glyph, font index, formatting) in a line- or screen- oriented way – and the server side performs the actual font rendering.

Management Interfaces

This category is a bit more eccentric than the others in that for Xorg, a lot of ‘management’ behaviour can be exposed via clients and the quintessential tools that people associate with this is either part of RANDR or via xdotool which more falls into ‘creative use of XTEST and other game mechanics’. The randr- like management is covered in the section on displays.

Some of those roles are split up and covered in the other sections on display management and input injection. Due to the increased role of the window manager, most of the other automation controls have to go through it.

The compromise is that there are support scripts that can be pulled in by the window manager which expose a data model for exposing window management controls ‘as a file system’, along with a support FUSE (demo) driver that allows this file system to be mounted so that the normal command line tools can be used to discover, control and script higher level management in cooperation with the window manager – rather than in competition with it.

Client Contents and Decorations

This is a very infected subject, and it principally requires reaching an agreement  between the clients and the window manager as to the actual contents of client populated buffers. The client cannot do anything to stop the window manager from decorating its windows, and the window manager cannot stop the client from rendering decorations into the buffer used as canvas. The window manager only has a slight advantage as it can forcibly crop the buffer during composition and translate input coordinates accordingly, as shown in this (demo).

Thus, the client need to know if it should, or should not, draw titlebar, border and so on. This decision has far reaching consequences as to the information that needs to be conveyed, understood and synchronised between the window manager and the clients. For instance, titlebar buttons drawn on the client side need to reflect server side window state, and suddenly needs information if it is in a ‘minimised’, ‘maximised’ or some other state.

It is further confused by the part that technically, what many refer to as ‘server side’ decorations are not exactly ‘server side’ – the predominant X solution is that one or even multiple clients (window manager and compositor as possibly separate clients) decorate others. The big problem with this approach, other than it becomes almost uniquely complicated, is synchronisation. The decorations need to reflect the state and size of the decorated client and the speed at which you can do this is limited to the combined round-trip time for the one doing the decorating and the one being decorated.

The principal standpoint in Arcan is that the server is allowed to be complicated to write as it will only be written once, and the clients should be as trivial as possible. The ‘annotations’ that decorations provide should be entirely up to the window manager.

Since the WM is an entirely different and privileged thing to a client here, we can have ‘true’ server side decorations – with the typical approach being to just synthesise them on demand as a shader during the composition stage at a fraction of the cost, even for fancy things like drop shadows.

The ‘compromise’ is that clients can explicitly either indicate the ‘look’ of a cursor from a preset W3C derived list, or request subsegments of a ‘cursor’ and/or a ‘titlebar’ type. As the case is with all secondary allocations these requests are default-deny and window manager opt-in.


The clipboard is traditionally one of the most strange forms of IPC around, with the implementation in Windows arguably being the biggest piece of “WTF” around, tailgated by X. The hallmark “feature” is that you can’t be certain of whatever it is that you “paste” – its origins and its destination, as the interface is inherently prone to race conditions.

The venerable ICCCM defines three ‘selection buffers’: PRIMARY, SECONDARY and CLIPBOARD, with the real difference being the inputs these are bound to (ctrl+c, ctrl+v or menus for CLIPBOARD vs mouse actions for PRIMARY). External clients, clipboard managers, can be used to add / monitor these buffers. There is also the option for clients to negotiate content format by exchanging list of supported formats, and a lengthy protocol for Drag and Drop.

In Arcan, there is no abstract singleton ‘clipboard’ or ‘drag and drop as such’, but rather two types of segments: CLIPBOARD (client to server) and CLIPBOARD_PASTE (server to client). If a client wish to provide clipboard information, it can request a subsegment of the CLIPBOARD type, and if a paste operation occurs, the server force-pushes a CLIPBOARD_PASTE subsegment into the client. These will all accept both audio, video, short messages and file descriptor transfer routing.

This allows for an infinite number of “clipboards”, and how these are shared between different clients is determined by the window manager. Thus, it can be completely configurable and safe to allow one client to act as a clipboard manager or segment into trusted and non-trusted groups for that matter. Example code of a simple clipboard manager can be found here.

It can be set to directly copy one client clipboard with another client clipboard-paste with the window manager intercepting and modifying everything, or to just act as a file descriptor router, you pick your own poison.


While it might be a bit confusing to think of ‘synchronisation’ as a feature, there are a few things to consider that might make it a bit more clear. Modern system graphics is parallelised and asynchronous to a degree that cannot really be underestimated, and then hits like a brick where you fail to think things through. The task is not made any easier by the fact that most of the parts in the chain that needs synchronising are unreliable, limited or just plain broken.

A fantastic explanation of the absolute basics in this regard can be found in this article by Jasper st. Pierre et. all: XPLAIN. Then follow the developments in this regard to the PRESENT extension and check up on the XSYNC extension. I will not attempt to summarise those here and instead go with a model of the problem itself.

There are three big areas we should consider for synchronisation:

The first area is synchronisation to the displays.

Historically, display synchronisation was ‘easy’ in that you could literally have an interrupt handler that told you when the display wanted new contents and how much time you had left to produce it. This became difficult when you hooked up multiple displays that do not necessarily share the same refresh rate or refresh method.

As an example, when a 60Hz display cooperating with a 50Hz in trying to share contents, you’d likely prefer tearing to the latency of using the greatest common divider (10Hz) as the shared synch period.

Xorg tries its hardest to meet the specific deadlines of the displays that are connected. This is a necessity when trying to ‘race the beam’, down to counting microseconds. This was an admirable approach at the time of its conception, and not so much today.

The second area is the window manager.

The window manager needs to react when its UI components are being manipulated. It needs to relayout / redecorate accordingly, and be ready in time for composition and scanout (synch to display). This can be hard to impossible to strike with both accuracy and precision, with many external factors introducing jitter and delays. This is an area where animation quality tend to highlight how well the solution works as a whole, and actually a good reason to have them for troubleshooting, regardless if they are enabled in production or not.

The third area is synchronising to clients themselves.

A common, if not the most common, kind of client is interactive. This means that it mainly updates or produces new contents in response to the input that it receives. Thus the state of the input samples that are forwarded needs to be fresh or latency will suffer. At the same time, the client need to decide when to stop waiting for input and when to use that input to synthesise the output. The state itself might need client local coordinate transforms which might depend on surface dimensions. The source device might be a high sample-rate mouse device emitting 1khz of samples to a terminal emulator capable of handling a fraction of that. The list goes on and ends up in a web of judgement calls that balances latency to cpu utilisation.

Now, how does Arcan approach the range of synchronisation problems then? Well, on a number of levels.

High level: There is a runtime changeable ‘strategy’ that can be set to favour some property like input latency; animation smoothness; or energy conservation. This is a user choice that gets applied regardless of what the window manager itself decides.

Mid-level: The window manager has controls to set a ‘synchronisation target’ that act as input to the high level strategy. The currently selected window being a good target. This means that the synchronisation target gets favourable treatment.

Low-level: A single client can be ‘mapped’ to an output, side-stepping window management in its entirety. The window manager can also force a synchronisation mode and even manually clock and suggest or enforce the deadline per client.

API-level: A client has some movement range itself in that it can cancel current synchronisation attempts (at the cost of possible tearing), bind itself to ‘as early as possible’, against a dynamic deadline or a ‘triple-buffer’ like strategy where the latest complete frame is the one that gets synched out.

Phew. Needlessly to say, there is a lot more that can be said on the subject while still barely scratching the surface. The main missing pieces — If you have not guessed it by now — are mainly multi-level network transparency and drawing command-stream buffer formats. Then we can focus on what is in Arcan but not in Xorg, features like 3D compositing, VR, audio support, live migration, server recovery, multi-GPU rendering and so on.

This entry was posted in Uncategorized. Bookmark the permalink.

17 Responses to Arcan versus Xorg – Approaching Feature Parity

  1. madlos rantes says:

    Congrats on the great achievement
    modular design will make it easier for UI toolkits to take advantage of the extra features arcan has to offer while not having to go from scratch to support additional WM and display protocols

    Wish arcan can gain more momentum and support from the open source community.

  2. Paul says:

    Great work! I wonder if there would be interest in making bindings for MicroPython

  3. John Smith says:

    I’m yet to see those mumbling of desktop demise and its survival to do some DEVELOPMENT, or other creative, constructive and innovative activity, that moves world forward. Otherwise it kind of BULLSHIT: developing to promote state of thing where you … can’t develop?! Are there someone coding for life on non-desktop things? And it even being more efficient than classic desktop, to prove mumbling is valid? Or maybe drawing decent GFX? 3D modelling? Some cad work? Huh? You see, Android is ok to consume content, sure. But far less appealing to create it, except maybe taking photos and NOT doing any processing, because this environment is really hostile to that kind of activity. And I fail to imagine on how VR hood can help to write the code or, say, edit photo.

    So I do think Arcan’s author still misses some when disregarding desktops.

    • bjornstahl says:

      Missing some? I’ve literally implemented every popular and most obscure desktop interaction models possible by now and I do, in fact, code in VR as much as my eyes and body permit and prefer that as the environment is much less ‘noisy’. That doesn’t change the fact that neither me nor any other dev or professional are even close to an interesting/valid demographic for the vast, vast amount of users and the money they represent. The demise refers to the point that the classic desktop is a compromise biased on non-professionals, a user base that could not care less about the desktop anymore as dedicated devices etc. fill those holes. Thus they can be removed from the desktop interaction design and give professionals better tools. Near-future articles will demonstrate some of those ways.

  4. Anthony says:

    Thank you for the article. But I still can’t understand the framebuffer concept. May be you can provide picture/description of data/control flow from application function call DrawText(“Hello World” ) to the raster picture in window on user desktop through all arcan architecture/components?

    P.S. In your future article with things about network transparency I’d like to see your thoughts about why bitmap moving over anetwork (vnc, etc.) won over X or similiar protocol (I remember something about Z protocol try). Is this by concept or by realization? My common sense tell me that the moving a code of a command OpenFileDialog(*.txt) to display server is always effectively than moving raster containing drawn dialog.
    Thank you again.

    • bjornstahl says:

      My plan was to provide a more thorough walkthrough of writing a simple “kmscon” like Arcan WM, which might be a better reference point as the “how things get on screen” part is very much dependent on what the WM decides to do with your client buffer. By default it ends up nowhere it can end up being routed to another client, to the screen, to a video file or all three.

      I’m writing (right now) on a prelude article that doesn’t look at networking through Arcan (as that code isn’t working yet) but goes through the design, problem and approach of X networking and what the problem with ‘transparency’ is.

  5. Anthony says:

    I believe a current UIs is annoying and ineffective. This ineffectiveness accumulated day by day and the more time has pass the more wasted time and energy in my life. Also, after my first application for win3.1 with ugly loops of message dispatching and monstrous Win API till now I feel that UI API may be much better. I’m awaiting for a progress in UI and UI API for 20+ years now. So I fully understand what you are doing and why you are doing this.

    As for network transparency I’m interested are you going to provide possibillity for things other than vnc (raster transfer). I may ask the question in a form: does network transparency possible in applicationframebuffer or in framebufferarcan only. But I’m afraid I misunderstood a framebuffer and such a question is incorrect.

    So about framebuffer. I feel it as content (audio/video) creation helper. They has some API and they produce buffer (audio/video snapshot). This buffer then goes to arcan core. May be it is incorrect. You explain framebuffer mostly as internal part of arcan but I try to understand it from external view, out of arcan. If you explain step by step how DrawText(‘hello world’) call in native arcan application produce client buffer with “hello world” raster then this will overcome my imbecility may be.

    • bjornstahl says:

      For how the higher levels are connected, look no further than the recent post on writing a console replacement. For network transparency, there is a prelude article coming that only focuses on the X part and why that died, and you can find the arcan strategy for network transparency on the main wiki page.

      The framebuffer isn’t really a thing anymore, there is a ‘scanout buffer’ at the lower levels where you can select a buffer (which fills certain other criteria in terms of GPU friendly format etc.) that will be mapped to the signal processing circuitry of the GPU. It can be a direct client (but rarely is, many prerequisites need to be fulfilled). More often, the scanout buffer comes from some off-screen composition stage.

      For raw bitmap versus vectorized contents, the current ‘native’ buffer format is raw pixels only (or an opaque GPU bound handle if you are dealing with a GL/Vulkan client), but there will be at least two other buffer formats for a client to pick as well. One is a text based one that more resembles terminal ‘screens’, and another is a 2D/3D vector pack format that resembles a simplified SVG or glTF2 mesh.

  6. excited says:

    I wanted to thank you for putting in all this effort into Arcan and Durden. I have been using i3 and sway but have been fully happy with neither and have been looking for alternatives for a long time—Durden does not appear on anyone’s list of things to try and I don’t understand why. I don’t have the time or energy to make my own solution but if I did it would be just like what you’ve already done. The extensive configuration and scripting potential makes me excited for the future.

  7. bjornstahl says:

    excited: thanks for your comment. Some of the lack of public attention regarding Durden is probably my lack of marketing, and that is quite deliberate – you can reach out to people at any time, but you won’t be given much of a chance to correct any disappointment unless the potential is obvious. I might as well keep a low profile until the compatibility story is better, the rough edges smoothed out and the installation and setup is easier.

  8. brm says:

    >subjugated (kek)
    I want to believe this is sarcasm and you are not one of those morons.
    Thanks for the nice articles.

    • bjornstahl says:

      thanks and yes, it is partly sarcasm, partly “other” – another option for a (temporary) name would be mancave but there is something inside “safe” (as in storage container, not supression of critical thought).

  9. Dobbo says:

    Just a thought…
    if the Arcan project and the Invisible Things Lab (Qubes OS team) could put their head’s together (and do away with vulnerable Xen hypervisor entirely)
    – we’d have much improved security for online transactions.

    • bjornstahl says:

      It was discussed with some people (not invisible things though) at 35c3, I have a more radical idea in the works though ..

  10. Aaron Denney says:

    > least common divisor

    Do you mean least common multiple (300 Hz), or greatest common divider (10 Hz)?

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s