Writing a console replacement using Arcan

In this article, I will show just how little effort it takes to specify graphics and window management sufficient to provide features that surpass kmscon and the ‘regular’ Linux console, with a DirectFB-like API for clients to boot. It comes with the added bonus that it should work on OSX, FreeBSD, OpenBSD, and within a normal Arcan, X or Wayland desktop as well. Thus, it is not an entry in the pointlessly gimmicky ‘how small can you make XYZ’ category, but rather something useful yet instructional.

Two things motivated me to write this. The first is simply that some avid readers asked for it after the article on approaching feature parity with Xorg. The second is that this fits the necessary compatibility work needed with the TUI API subproject – a ‘terminal emulator’ (but in reality much more) that will be able to transition from a legacy terminal to handling terminal-freed applications.

The final git repository can be found here: https://github.com/letoram/console

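Here are shortcuts to the steps that we will go through:

  Prelude: Build / Setup
  Part 1: Hello Terminal
  Part 2: Workspaces and Keybindings
  Part 3: Clipboard and Pasteboard
  Part 4: External Clients
  Part 5: Audio
  Part 6: Extra Font Controls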

Each Part starts by referencing the relevant Git commit, and elaborates on some part of the commit that might be less than obvious. To get something out of this post, you should really look at both side by side.

Prelude: Build / Setup

For setup, we need two things – a working Arcan build and a directory tree that Arcan can interpret as the set of scripts (‘appl’) to use. Building from source:

git clone https://github.com/letoram/arcan arcan
cd arcan/external ; ./clone.sh ; mkdir ../build ; cd ../build
cmake -DVIDEO_PLATFORM=XXX ../src ; make

There are a lot of other options, but the important one here is the one marked with XXX. Some of the others are for embedded purposes, some for debugging or for performance versus security trade-offs.

Even though Arcan supports many output methods, the choice of video platform has far-reaching effects, and it is hardcoded into the build and thus into the supporting tools and libraries. Why it is kept like this and not as, say, “dynamically loadable plugins” is a long topic, but suffice it to say that it saves a lot of headache in deal-breaking edge cases.

There are two big options for video platform (and others, like input and audio are derived from the needs of the video platform). Option one is sdl or sdl2, which is a lot simpler as it relies on an outer display server to do much of the job, which also limits its features quite a bit.

Option two is the ‘egl-dri’ platform which is the more complicated beast needed to act the part of a display server. The ‘egl-dri’ platform build is the one packaged on voidlinux (xbps-install arcan). There is a smaller ‘egl-gles’ platform hacked together for some sordid binary-blob embedded platforms, but disregard that one for now.

The directory structure is simple: the name of the project as a folder, a .lua file with the same name inside that folder, with a function with the same name. The one we will use will simply be called ‘console’, so:

mkdir console
echo "function console() end" > console/console.lua

This is actually enough to get something that would be launchable with:

 arcan /path/to/console

But it won’t do much–and if you use the native egl-dri platform, you need to somehow kill the process for the resources to be released or rely on the normal keybindings to switch virtual terminal.

The first entry point for execution will always be named the same as the appl, tied to the name of the script file and the name of the folder. This is enforced to make it easy to find where things ‘start’. Any code executed in the scope outside of that will not have access to anything but a bare minimum Lua API.

All event hooks (like input, display hotplug, …) are implemented as suffixes to these naming rules, e.g. the engine will look for a console_input when providing input events.
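
The naming rule can be illustrated with a few lines of plain Lua. This is only a sketch of the lookup convention, not the engine's actual dispatch code: the dispatch helper and the fake input table are made up for the illustration, and display_state is used merely as an example of a hook that has not been defined.

```lua
-- Sketch of the entry point naming rule: <applname> and <applname>_<suffix>.
-- 'dispatch' is a hypothetical helper, the real lookup happens inside the engine.
local applname = "console"
seen = {}

function console()
	table.insert(seen, "main")
end

function console_input(iotbl)
	table.insert(seen, "input:" .. tostring(iotbl.translated))
end

local function dispatch(suffix, ...)
	local name = suffix and (applname .. "_" .. suffix) or applname
	local fn = _G[name]
	if type(fn) ~= "function" then
		return false -- hooks that are not defined are simply skipped
	end
	fn(...)
	return true
end

dispatch(nil)                                -- calls console()
dispatch("input", {translated = true})       -- calls console_input(...)
handled = dispatch("display_state", "added") -- no such hook defined here
```

The point of the convention is that a reader can grep for the appl name and find every place where the engine may call into the scripts.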

Part 1: Hello Terminal

Git Commit #1

Breaking down the first function:

function console()
	KEYBOARD = system_load("builtin/keyboard.lua")()
	KEYBOARD:load_keymap(get_key("keymap") or "devmaps/keyboard/default.lua")
	selected = spawn_terminal()
	show_image(selected)
end

The words in bold are reserved Lua keywords and the cursive ones refer to Arcan-specific functions. The Arcan-specific functions are documented in the /doc/*.lua files, with one file per available function.

The first function called, system_load, is used to pull in other scripts or native code (.dll/.so). The way it searches for resources is a bit special, as the engine works on a hierarchy of ‘namespaces’, basic file system paths that are processed in a specific order. Depending on what kind of resource you are looking for, different namespaces may be consulted. The ones relevant here are system scripts, shared resources and appl.

System scripts are shared scripts for common features that should be usable to most projects but are not forced in. These are things like keyboard map translation, mouse state machine and gestures and so on. ‘appl’ is the namespace of our own scripts here.

The get_key function is used for persistent configuration data storage. This is stored in a database with a table for Arcan-specific configuration, for shared launch targets (programs that are allowed to be executed by the engine itself) and for appl-specific configuration (that is what we want here). These can be handled in script, but there is also a command-line tool, arcan_db, with which you can modify these keys yourself.
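
As a small sketch of the read-with-fallback pattern used above: the two stubs below are stand-ins so that the snippet also runs in a plain Lua interpreter, and the custom keymap file name is invented. Inside Arcan, the engine-provided get_key and store_key would be picked up instead, and values are assumed to come back as strings.

```lua
-- Outside of Arcan there is no database, so fall back to an in-memory
-- stand-in. These two stubs are assumptions made for the illustration only.
local fake_db = {}
local get_key = get_key or function(key) return fake_db[key] end
local store_key = store_key or function(key, val) fake_db[key] = tostring(val) end

-- read a setting, falling back to a default when the key is missing
local function config(key, default)
	return get_key(key) or default
end

keymap_before = config("keymap", "devmaps/keyboard/default.lua")
store_key("keymap", "devmaps/keyboard/custom.lua") -- hypothetical file name
keymap_after = config("keymap", "devmaps/keyboard/default.lua")
```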

The show_image function takes a vid and sets its opacity to fully opaque (visible). A ‘vid’ is a central part in Arcan, and is a numeric reference to a video object. We will use these throughout the walkthrough as they also work as a ‘process/resource’ handle for talking to clients.

When created, VIDs start out invisible to encourage animation to ‘fade them in’ – something that can be achieved by switching to blend_image and providing a duration and, optionally, an interpolation method. The fancy animations and visuals are out of scope for now though.
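
For the curious, a fade-in could look roughly like the sketch below. The stub only exists so the snippet runs outside the engine, the vid numbers are placeholders, and the duration is assumed to be counted in engine ticks; check the blend_image documentation before relying on that.

```lua
-- Stub so the sketch also runs in a plain Lua interpreter; inside Arcan the
-- real function is used. The stub only records what was asked for.
calls = {}
local blend_image = blend_image or function(vid, opacity, time)
	table.insert(calls, {vid = vid, opacity = opacity, time = time})
end

-- fade a newly created object in over 'duration' ticks instead of
-- making it visible immediately with show_image
local function fade_in(vid, duration)
	blend_image(vid, 1.0, duration or 10)
end

fade_in(42)     -- 42 is a placeholder vid
fade_in(43, 25)
```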

Next we add the missing function referenced here, spawn_terminal:

function spawn_terminal()
	local term_arg = get_key("terminal") or "palette=solarized-white"
	return launch_avfeed(term_arg, "terminal", client_event_handler)
end

The get_key part has already been covered, and if we don’t find a ‘terminal’ key to use as an argument to our built-in terminal emulator, the palette=solarized-white argument will be selected.

launch_avfeed(arg, fsrv_type, handler) is where things get interesting. It is used to spawn one of the frameservers, or included external programs that we treat as special ‘one-purpose’ clients. There is one for encoding things, decoding things, networking and so on. There is also one that is a terminal emulator, which is what we are after here. The argument order is a bit out of whack due to legacy and how the function evolved; in hindsight, arg and type should have been swapped. Oh well.

Time for a really critical part, the event handler for a client. This can of course be shared between clients, unique to individual clients based on something like authenticated identity or type but also swapped out at runtime with target_updatehandler.

function client_event_handler(source, status)
	if status.kind == "terminated" then
		return shutdown()

	elseif status.kind == "resized" then
		resize_image(source, status.width, status.height)

	elseif status.kind == "preroll" then
		-- the target_displayhint call described below goes here (see the commit)
	end
end

There are a ton of possible events that can be handled here, and you can read the documentation for launch_target for more information. Most of them are related to more advanced opt-in desktop features and can be safely ignored. The ones we handle here are:

‘terminated’ means that the client has, for some reason, died. You can read the last_words field from the status table for a user-presentable motivation. The backing store, that is the video object and the last submitted frame, is kept alive so that we can still render, compose or do other things with the data. Normally you would call delete_image here, but we chose to just shut down.

‘resized’ is kind of important. It means that the size of the backing store has changed, and the next frame drawn will actually be scaled unless you do something. That is, there is a separation between the presentation size that the scripts set with a resize_image call and whatever size the backing store has. Here we just sync the presentation to the store.

‘preroll’ is a construct specific to Arcan’s client communication design. Basically, the client is synchronously blocking and waiting for you to tell it as much as you care to about its parameters. Instead of the client reacting to each one as part of an event loop, they get collected into a structure of parameters like presentation language, display density and so on. Here we only use target_displayhint to tell it about the preferred output dimensions, focus state and specific display properties like density.
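
As the number of handled events grows, the if/elseif chain can get unwieldy. One possible alternative, sketched here and not what the commit does, is a table with one function per event kind. The stubs are only there so the sketch runs outside the engine; they record calls instead of acting on them.

```lua
-- Stubs for running outside Arcan; they only record the calls.
log = {}
local resize_image = resize_image or function(vid, w, h)
	table.insert(log, string.format("resize %d %dx%d", vid, w, h))
end
local shutdown = shutdown or function()
	table.insert(log, "shutdown")
end

-- one function per event kind instead of a growing if/elseif chain
local handlers = {
	terminated = function(source, status)
		shutdown()
	end,
	resized = function(source, status)
		resize_image(source, status.width, status.height)
	end,
}

function client_event_handler(source, status)
	local fn = handlers[status.kind]
	if fn then
		return fn(source, status)
	end
	-- events we have no opinion about are ignored
end
```

Swapping behaviour for a single event kind then becomes a matter of replacing one table entry rather than editing the handler body.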

Finally, we need some input:

function console_input(input)
	if input.translated then
		-- the keymap loaded earlier is applied to the input here (see the commit)
	end
	target_input(selected, input)
end

This is one of the event handlers that was talked about before, and one of the more varied ones, as it pertains to all possible input states. It grabs everything it can from the lower system levels, and the sources can be as diverse as sensors, game controllers, touch screens, etc.

The two more common ones from the perspective here, though, are ‘translated’ devices (your keyboard) and mice. Here we just apply the keyboard translation table (map) that we loaded earlier and forward everything to the selected window. The target_input function is responsible for that, with the possibility to forego routing a source table at all and instead synthesise the input that you want to ‘inject’.

Part 2: Workspaces and Keybindings

Git Commit #2

While this commit is meatier than the last one, most of it is the refactoring needed to go from one fullscreen client to multiple workspaces and workspace switching, and barely any of it is Arcan specific. The two small details of note here would be the calls to valid_vid and decode_modifiers.

decode_modifiers is trivial, though its context is not. Keyboards are full of states, and they are transmitted as a bitmap. This function call helps decompose that bitmap into a more manageable type. There is much more to be said about the input model itself, as it is much more refined and necessarily complex, and it will span multiple articles.
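
To make the idea concrete, here is a sketch of turning a modifier bitmap plus a key name into a keybinding lookup. It assumes that decode_modifiers returns a list of modifier names; the bit values in the stand-in are made up, and the binding names are invented for the example.

```lua
-- Outside Arcan, emulate decode_modifiers with made-up bit values. The real
-- function and the real bit layout belong to the engine.
local fake_bits = { {1, "lshift"}, {2, "lctrl"}, {4, "lalt"} }
local decode_modifiers = decode_modifiers or function(mask)
	local out = {}
	for _, pair in ipairs(fake_bits) do
		-- test a single bit without relying on a bit library
		if math.floor(mask / pair[1]) % 2 == 1 then
			table.insert(out, pair[2])
		end
	end
	return out
end

-- hypothetical bindings, keyed by "modifier_modifier_key"
local bindings = {
	["lalt_1"] = "workspace 1",
	["lalt_2"] = "workspace 2",
	["lctrl_lalt_v"] = "paste",
}

-- build a lookup string such as "lctrl_lalt_v" from modifier mask and key
function binding_for(mask, key)
	local parts = decode_modifiers(mask)
	table.insert(parts, key)
	return bindings[table.concat(parts, "_")]
end
```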

valid_vid will be used a lot due to a ‘Fail-Early-Often-Hard’ principle in the API design. A lot of functions have a ‘terminal state transition’ note somewhere, meaning that if the arguments you provide mismatch with what is expected, the engine will terminate, generate a snapshot, traceback etc. Depending on how the program was started, it is likely that it will also switch to one of the crash recovery strategies.

This is to make things easier to debug and preserve as much relevant program state as possible. Misuse of VIDs is the more common API mistake, and valid_vid calls can be used as a safeguard against that. It is also the function you need to distinguish a video object with say, a static image source, from one with an external client tied to it.
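
A guard of that kind might look like the following sketch. It assumes that valid_vid accepts an optional type argument and that TYPE_FRAMESERVER is the constant for client-backed objects; both should be checked against the valid_vid documentation. The small object table is a stand-in for running outside the engine, and safe_forward is a name invented here.

```lua
-- Stand-ins for running outside Arcan: a tiny table of known objects.
-- TYPE_FRAMESERVER and valid_vid come from the engine when present.
local objects = { [10] = "image", [11] = "frameserver" }
local TYPE_FRAMESERVER = TYPE_FRAMESERVER or "frameserver"
local valid_vid = valid_vid or function(vid, kind)
	local obj = objects[vid]
	if not obj then
		return false
	end
	return kind == nil or obj == kind
end

-- refuse to touch anything that is not a live, client-backed object
function safe_forward(vid, input, forward)
	if not valid_vid(vid, TYPE_FRAMESERVER) then
		return false
	end
	forward(vid, input)
	return true
end
```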

Part 3: Clipboard and Pasteboard

Git Commit #3

More interesting things in this one and it is a rather complex feature to boot. In fact, it is the most complex part of this entire story. In the client_event_handler you can spot the following:

elseif status.kind == "segment_request" and
       status.segkind == "clipboard" then
		local vid = accept_target(clipboard_handler)
		if not valid_vid(vid) then
			return
		end
		link_image(vid, source)

The event itself is that the client (asynchronously) wants a new subsegment tied to its primary one. If we do not handle this event, a reject will be sent instead and the client will have to go on without one.

By calling accept_target we say that the requested type is something we can handle. This function is context sensitive and only valid within the scope of an event handler processing a segment request. Since all VID allocations can fail (there is a soft configurable limit defaulting to a few thousand, and a hard one at 64k) we verify the result.

The link_image call is also vastly important as it ties properties like coordinate space and lifecycle management of one object to another. When building more complex server side UIs and decorations, this is typically responsible for hierarchically tying things together. Here we use it to make sure the newly allocated clipboard resources are destroyed automatically when the client VID is deleted.

Looking at the clipboard handler:

elseif status.kind == "message" then
    tbl, _ = find_client(image_parent(source))
    tbl.clipboard_temp = tbl.clipboard_temp .. status.message
    if not status.multipart then
        clipboard_last = tbl.clipboard_temp
        tbl.clipboard_temp = ""
    end

This is a simple text-only clipboard. There are many facilities for enabling more advanced type and data retrieval, both client to client directly and intercepted. Here we stick to just short UTF-8 messages. On the lower levels, every event is actually transmitted and packed with a fixed size into a fixed ring buffer in shared memory.

This serves multiple purposes. One is to avoid the copy-in-copy-out semantics of the write/read calls over a socket that X or Wayland would use. Other reasons are to allow ‘peek/out-of-order’ event processing as a heavy optimisation for costly event types like resize, but also to act as a rate-limit and punish noisy clients that try to saturate event queues to stall / delay / provoke race conditions in other clients or the WM itself. For this reason, larger paste events need to be split up into multiple messages, and really large ones will likely stall this part of the client in favour of saving the world.

Long story short, the WM thus has to explicitly concatenate these messages, and, optionally, say when enough is enough. Here we just buffer indefinitely, but a normal approach would be to cut at a certain length and just kill the per-client clipboard as punishment. For longer data streams, we have the ability to open up asynchronous pipes, either intercepted by the WM or by sending the read end to one clipboard, and the write end to another.

The call to image_parent simply retrieves the parent object of the clipboard, which is the one we linked to earlier and the code itself just pairs a per-client table where we build the clipboard message that the client wants to add.
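
The ‘cut at a certain length’ approach mentioned above can be sketched in plain Lua as follows. This is not what the commit does (it buffers indefinitely); the limit and the function name are arbitrary choices for the illustration.

```lua
local CLIPBOARD_LIMIT = 64 * 1024 -- arbitrary cap chosen for the sketch

-- returns "pending" while more parts are expected, "done" plus the complete
-- text when finished, or "overflow" when the client exceeded the cap
function clipboard_append(state, message, multipart)
	state.buffer = (state.buffer or "") .. message
	if #state.buffer > CLIPBOARD_LIMIT then
		state.buffer = ""
		return "overflow"
	end
	if multipart then
		return "pending"
	end
	local text = state.buffer
	state.buffer = ""
	return "done", text
end
```

In a clipboard handler, an "overflow" result would be the place to delete the clipboard VID of the offending client, and "done" the place to update the shared paste buffer.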

Lastly, pasting. In the clipboard_paste function we can spot the following:

if not valid_vid(dst_ws.clipboard) then
    dst_ws.clipboard = define_nulltarget(dst_ws.vid,
     "clipboard", function(source, status)
	if status.kind == "terminated" then

The important one here is define_nulltarget. There are a number of define_XXXtarget functions depending on how information is to be shared. The common denominator is that they are about allocating and sending data to a client, while most other functions deal with presentation or data coming from a client.

The nulltarget is simple in the sense that it only really allocates and uses the event queues, with no costly audio or video sharing. It allocates the IPC primitives and forces them onto the client, saying ‘here is a new window of a certain type, do something with me!’. Here we use that to create a clipboard inside the recipient (if one doesn’t exist for the purpose already).

We can then target_input the VID of this nulltarget to act as our ‘paste’ operation.

Part 4: External Clients

Git Commit #4

There are quite a few interesting things in this one as well. In the initial console() function, we added a call to target_alloc. This is a special one as it opens up a connection point — a way for an external client to connect to the server. All our previous terminals have been spawned by initiative of the WM itself and using a special code path to ensure that it is the terminal we are getting.

With this connection point, a custom socket name is opened up that a client can access using the ARCAN_CONNPATH environment variable (or by specifying it explicitly with a low level API). Otherwise it behaves just like normal, with the addition of an event or two.

Thus in the client_event_handler, we add a handler for “registered” and “connected”.

“Connected” simply means that someone has opened the socket and that it is now consumed and unlinked. No other client can connect using it. This is by design to encourage rate limiting, tight resource controls and segmenting the UI into multiple different connection groups with a different policy based on the connection point used. For old X-like roles, think of having one for an external wallpaper, statusbar or launcher.

Our behaviour here is simply to re-open it by calling ‘target_alloc’ again in the connected stage of the event handler.

“Registered” means that the client has now provided some kind of authentication primitive (optional) and a type. This type can also be used to further segment the policy that is applied to a connection. In the ‘whitelisted’ function below, we select the ones we accept, and assign a relevant handler. If the type is not whitelisted, the connection is killed by deleting the VID associated with the connection.
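
A rough sketch of how the two events could be handled together is shown here. It is not the code from the commit: the connection point name "console", the list of accepted kinds and the external_handler name are assumptions, and the stubs merely record calls so the snippet runs outside the engine.

```lua
-- Stubs for running outside Arcan; they only record what happened.
events = {}
local target_alloc = target_alloc or function(name, handler)
	table.insert(events, "listen:" .. name)
end
local delete_image = delete_image or function(vid)
	table.insert(events, "delete:" .. vid)
end
local target_updatehandler = target_updatehandler or function(vid, handler)
	table.insert(events, "handler:" .. vid)
end

local function default_handler(source, status) end

-- segment kinds we accept, and which handler each one gets (illustrative)
local whitelist = {
	terminal = default_handler,
	tui = default_handler,
}

function external_handler(source, status)
	if status.kind == "connected" then
		-- the connection point was consumed, open it again
		target_alloc("console", external_handler)
	elseif status.kind == "registered" then
		local handler = whitelist[status.segkind]
		if handler then
			target_updatehandler(source, handler)
		else
			delete_image(source)
		end
	end
end
```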

Lastly we add a new event handler, _adopt (so console_adopt). This one is a bit special.

function console_adopt(vid, kind, title, have_parent, last)

We will only concern ourselves with the prototype here. Adopt is called just after the main function entry point, if the engine is in a recovery state. It covers three use cases:

  1. Crash Recovery on Scripting Error
  2. ‘Reset / Reload’ feature for a WM (using system_collapse)
  3. Switch WMs (also using system_collapse)

The engine will save / hide the VIDs of each externally bound connection, and re-expose them to the scripts via this function. The ‘last’ argument will be set on the last one in the chain, and have_parent if it is a subsegment linked to another one, like clipboards.

It is possible for the WM to tag more state in a vid using image_tracetag which can also be recovered here, and that is one way that Durden keeps track of window positions etc. so that they can survive crashes (along with store_key and get_key to get database persistence).

In the handler for this WM, we keep only the primary segment, and we filter type through whitelisted so that we do not inherit connections from a WM switch that we do not know what to do with.
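
The filtering described above can be sketched like this. The accepted list is illustrative, and how a rejected VID is actually disposed of (an explicit delete_image or a return value to the engine) is deliberately left out, since that should be taken from the commit and the adopt documentation. The sketch only classifies and records.

```lua
kept, dropped = {}, {}
local accepted = { terminal = true, tui = true } -- illustrative list

function console_adopt(vid, kind, title, have_parent, last)
	-- subsegments (clipboards and similar) and unknown types are not kept
	local keep = not have_parent and accepted[kind] == true
	table.insert(keep and kept or dropped, vid)

	if last then
		-- every surviving connection has been seen, so this is the point
		-- where a workspace could be selected and shown
		adoption_done = true
	end
	return keep
end
```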

Part 5: Audio

Git Commit #5

Time for a really short one. Arcan is not just a Display Server, and there are reasons for why it is described as a Multimedia Server or a Desktop Engine. One such reason is that it also handles audio. This goes against the grain and traditional wisdom (or lack thereof) of separating a display server and an audio server and then spending a ton of effort getting broken synch, device routing and meta-IPC as a result.

With every event, you can extract the ‘source_audio’ field, which gives an AID, the audio identifier that matches a VID. Though the interface is currently much more primitive, as advanced audio is later on in the roadmap, the basics are being able to pair an audio source with a video one and being able to control the volume.

In this patch, we simply add a keybinding that calls audio_gain, and extract the AID to store with the other state in the workspace structure.
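
A volume keybinding along those lines might be sketched as below. The aid and gain fields on the workspace table are names chosen for the example, the gain range is assumed to be 0..1, and the stub only records the last gain set so that the snippet runs outside the engine.

```lua
-- Stub for running outside Arcan; records the last gain per audio id.
gains = {}
local audio_gain = audio_gain or function(aid, gain, time)
	gains[aid] = gain
end

-- adjust the stored volume of a workspace by 'delta', clamped to 0..1
function adjust_volume(ws, delta)
	if not ws.aid then
		return -- no audio source seen for this client yet
	end
	local level = (ws.gain or 1.0) + delta
	ws.gain = math.max(0.0, math.min(1.0, level))
	audio_gain(ws.aid, ws.gain)
	return ws.gain
end
```

The ws.aid field would be filled in from status.source_audio in the client event handler.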

Part 6: Extra Font Controls

Git Commit #6

Concluding with another short one. To get into the details somewhat on this one, you should read the Font section in the Arcan vs Xorg article.

We simply add calls to target_fonthint during the ‘preroll’ stage in the client event handler:

local font = get_key("terminal_font")
local font_sz = get_key("font_size")

if font and (status.segkind == "tui" or status.segkind == "terminal") then
    target_fonthint(source, font, (tonumber(font_sz) or 12) * FONT_PT_SZ, 2)
else
    target_fonthint(source, (tonumber(font_sz) or 12) * FONT_PT_SZ, 2)
end

The main quirk is possibly that the size is expressed in cm (as density is expressed in ppcm and not imperial garbage), thus a constant multiplier (built-in) is needed to convert from the familiar font PT size.

That is it for now. As stated before, this is supposed to be a small (not minimal) viable WM model that would support many ‘I just need to do XYZ’ users, while at the same time providing most of the building blocks needed for our intended terminal emulator replacement project. For that case, we will revisit this WM once more a little later.

At this stage, most kinds of clients should be working: fsrv_game, fsrv_decode etc. can be used for libretro cores and video decoding, other Arcan scripts can be built and tested with the arcan_lwa binary, aloadimage can be used for image viewing and Xarcan for running an X server.

The “only” thing missing client-support wise is the arcan-wayland bridge for Wayland client support, which is left out due to the additional complexity from the Wayland allocation scheme and because specific window management behaviour has such a strong presence in the protocol.


11 Responses to Writing a console replacement using Arcan

  1. Anthony says:

    Thank you for a good sample of code. I’ve seen something similar earlier in the arcan wiki but this is much more detailed. Unfortunately I have some misunderstanding on concepts rather than on “how to do things”. May I just list a few of the questions that have bothered me for years? (Further I will refer to a native arcan application as NAA)
    1. Is running an NAA passive or active? Can I exec an NAA binary and get results on the arcan managed screen (active), or is it passive and I should always call a special arcan function that will load and execute the NAA?
    2. Is all arcan NAA must have appl entry?
    3. I see appl entry as a thing that may be modified by user to tune and integrate NAA into arcan desktop. Am I right?
    4. For example above (as for many example in the arcan-wiki) there is lua scripting and understand that arcan will load those scripts provide functions for them and install handlers from scripts internally. But I can’t expand this logic to a hard application (like MS Excel) written in C/C++ presented by binary ELF file. Is there some guide for NAA written in C/C++? Is there general arcan API and framebuffers API for NAA?

  2. bjornstahl says:

    See: https://imgur.com/a/F3gf9R3

    First recall that there are two very different kinds of ‘native’ arcan applications. We have the ones not covered by this article, that is the external clients that use the SHMIF or TUI API to ‘connect’ similarly to how a program would connect to Xorg using Xlib or XCB. Those are normal C libraries. Here is a trivial example: https://github.com/letoram/arcan/blob/master/tests/frameservers/counter/counter.c

    Then there are those that use the Arcan Lua API specifically, which this article describes a small part of just as the ‘Awk for multimedia’ article describes another part — thus I assume it is this kind of app your questions pertain to. In that case:
    1. there is no ‘binary’ for the NAA as such, Arcan is the ‘binary format / loader’ and programs have to have the kind of structure that is shown in the article.
    2. yes, it refuses to start if it isn’t there.
    3. It works just like int main(int argc, char** argv) would do for a C program, but for arcan lua scripts.
    4. It is deliberately in the other direction, the lua script can use system_load(“mylib.so”) for native like extensions, this model is similar to how android apps do it for their “native” apps.

  3. Anthony says:

    It’s interesting. I was waiting for this picture(https://imgur.com/a/F3gf9R3). Can you add frameservers on it?
    1. I forget about great SHMIF. So OK, I understand now that there is a direct native arcan application (DNAA) which make content (frame raster/audio) by own and put it into arcan by SHMIF. DNAA will use API from arcan_shmif.h documented somewhere. Example of DNAA is a counter.c
    2. Counter.c binary (CB) does not need appl entry. Is it true?
    3. May I create appl entry for CB for tuning? For example, I want CB video output to be grayscaled and started on the upper left corner on screen. How to achieve this?

    It’s all about I didn’t catch how to draw line/text in arcan system. Lua scripts is not the preferred choice for an application development. There is a content creation API somewhere in arcan (at least SDL?) but I dont understand how to use it.

  4. bjornstahl says:

    Frameservers (terminal, encode, decode, …) are just special DNAAs that are built as part of the system codebase.
    1. Correct
    2. CB (and all DNAAs) are plain ol’ binaries, no entry.
    3. The CB wouldn’t have those controls, i.e. it can ask to please put me in the upper left corner (enqueue VIEWPORT hint event), but it is up to the appl to determine the coordinate space and post-process.

    If you would want the ‘generic’ feature of allowing one client to be placed as a greyscale overlay in the top left corner regardless of what the appl does:

    TUI is a DNAA API for text-oriented applications. SDL is only used for display output when running inside other display systems, e.g. x11/osx/wayland.

    Far away on the roadmap is “libification” of the arcan core as well and then it would work as a content creation API as well.

  5. Anthony says:

    1. Will various content creation API be realized as a frameservers? Suppose I want to supply arcan by small toolkit that draw two lines that forms X. This toolkit will provide one function PutX(x,y). What the right way to do this? I think I must create framebuffer server “xline” and then say that all NAA which only want to draw X’es must use “xline” headers and link with “xline” lib.

    2. ” but it is up to the appl to determine the coordinate space and post-process”. So can I create appl that executes CB?
    3. or Can I create appl for CB that will be used on any CB execution (appl will be applied)?

  6. bjornstahl says:

    1. toolkit as a lib that renders into shmif- connection using the shmif-lib.

    2/3. yes, and there is a generic mechanism to help you execute and retain chain of trust:

    arcan_db add_target my_binary BINARY /path/to/bin argv

    arcan_db add_config my_binary default extra_argv

    appl- script can then do list_targets(), list_configurations(), launch_target().

    you can do everything like that and not even expose alloc_target (i.e. no option for untrusted clients to connect at all)

  7. Anthony says:

    1. “lib that renders into shmif- connection using the shmif-lib” == frameserver?
    2. Is there a default appl for DNAA that has not arcan_db entry?
    3. If I want CB to appear in upper left corner I have to:
    a) arcan_db add_target counter_sample BINARY /usr/local/bin/counter_sample
    b) arcan_db add_config counter_sample default
    c) mkdir /path/to/arcan/appls/counter_sample
    d) touch /path/to/arcan/appls/counter_sample/entry_script (sorry, I forget the appl structure)
    e) edit script: add … for positioning in upper left corner and end with launch_target().
    After that I can call arcan function XXX(‘counter_sample’) for executing appl. Is this correct?
    4. In case of direct executing:
    is appl ‘counter_sample’ bypassed?

  8. bjornstahl says:

    Most of your reasoning seems sound.

    so the arcan_db stuff is to provide a uniform interface for letting users define what the appl is allowed to exec() or not, and a bit later, control sandboxing parameters etc.

    if you look in durden for instance, the menu/open/target will list all the targets and configuration in the database and select what you want to launch but it is read-only from that context. Durden itself could not do exec(“rm -rf /”).

    d) entry_script is called the same as the appl (counter_sample) with a .lua and the first function invoked also has the same name.

    launch_target(‘counter_sample’) (but via a resolve function for going from string to id, list_targets(), list_configurations()) is the function for executing appl.

    Directly executing /usr/local/bin/counter_sample would have the problem of counter_sample checking ARCAN_CONNPATH=name env for a path to a socket. The appl need to open that socket (target_alloc(“name”..)) to say that it accepts an external connection.

  9. Anthony says:

    I think that providing ARCAN_CONNPATH environment is not a big problem.
    As I understand arcan_shmif_open(…) do not check that calling binary has entry in arcan_db and launch corresponding appl.
    So one can’t say that arcan launch an application (in case of DNAA) strictly in a user defined way (via corresponding appl). Am I right?

  10. bjornstahl says:

    A binary that is launched via launch_target (arcan_db entry) does not use ARCAN_CONNPATH in shmif_open, it inherits the socket descriptor from its parent. Since the appl- launch the binary, it already knows what it wants to do with it (pre-authenticated).
    A binary that connects via ARCAN_CONNPATH must have a corresponding “target_alloc(name, [key], …)” call, so the appl knows that it can possibly be untrusted, man-in-the-middle etc.

  11. Anthony says:

    Ok. Thank you for your answers.
