The X Network Transparency Myth

This article presents an interpretation of the history surrounding the ability of X clients to interact with X servers running on other machines, and of recent arguments that this ability is defunct and broken. It covers problems with the feature itself, what it was, what happened along the way, and where things seem to be heading.

The high-level summary of the argument is that there is validity to the claim that, to this day, there is such a thing as network transparency in X on a higher level than streaming pixel buffers. It has a diminishing degree of practical usability and interest, however, and the technical underpinnings are fundamentally flawed, dated and criminally inefficient; yet similarly dated (VNC) or perversely complex (RDP) solutions are not reasonable alternatives.

What are the network features of X?

If you play things strict, all of them are. That should be the very point of having a client/server protocol and not an API. Communication goes across hard system barriers, and data packets need to consider things like endianness, loss, addressing and so on, while the state machine(s) need to account for delays, latencies and buffer back-pressure. Interplay between versions and revisions matters much more in protocol design than in API design, and the effects of tradeoffs between synchronous (processing being blocked waiting for a reply) and asynchronous forms of communication become highly visible.
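As a small illustration of the endianness point: the very first byte an X11 client sends tells the server which byte order the rest of the connection will use ('B' for most significant byte first, 'l' for least significant byte first). The sketch below builds that initial setup request in Python. The function name and its defaults are illustrative choices, not part of any X library, and it does not open a connection.

```python
import struct
import sys

def x11_setup_request(byte_order=sys.byteorder, auth_name=b"", auth_data=b""):
    """Build the initial X11 connection setup request (hypothetical helper)."""
    # First byte selects the byte order for everything that follows:
    # 'B' (0x42) = most significant byte first, 'l' (0x6C) = least significant first.
    marker, fmt = (b"B", ">") if byte_order == "big" else (b"l", "<")

    def pad(n):
        # Variable-length fields are padded to a multiple of four bytes.
        return (4 - n % 4) % 4

    header = marker + b"\x00" + struct.pack(
        fmt + "HHHHH",
        11, 0,                           # protocol major / minor version
        len(auth_name), len(auth_data),  # lengths of the authorization fields
        0,                               # unused
    )
    return (header
            + auth_name + b"\x00" * pad(len(auth_name))
            + auth_data + b"\x00" * pad(len(auth_data)))
```

The same logical request yields different bytes on the wire depending on the declared order, which is exactly the kind of detail an in-process API never has to state explicitly.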

The real (and only) deal for X networking is in its practical nature; the way things work from a user standpoint. In the days of old, one could simply utter the following incantation:

DISPLAY=some.ip:display xeyes

and have its soul stare back at you through heavily aliased portals. This was practically similar to the local “DISPLAY=:0 xeyes” form, hence transparent.
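The similarity between the remote and local forms comes from how the DISPLAY value is interpreted. Strictly speaking, the number after the colon is a display number rather than a port: display n is conventionally reachable on TCP port 6000 + n, while an empty host part means a local connection. A minimal sketch of that mapping follows; the names are hypothetical, the local socket path is the conventional /tmp/.X11-unix location, and less common forms (such as bracketed IPv6 addresses) are deliberately ignored.

```python
from typing import NamedTuple, Optional

X11_TCP_BASE_PORT = 6000  # display n conventionally listens on TCP port 6000 + n

class Display(NamedTuple):
    host: Optional[str]  # None means a local connection
    display: int
    screen: int

def parse_display(value: str) -> Display:
    """Split a DISPLAY value of the form [host]:display[.screen]."""
    host, sep, rest = value.rpartition(":")
    if not sep:
        raise ValueError(f"no ':' in DISPLAY value {value!r}")
    number, _, screen = rest.partition(".")
    return Display(host or None, int(number), int(screen or 0))

def endpoint(d: Display):
    """Return where a client would connect for this display."""
    if d.host is None:
        return ("unix", f"/tmp/.X11-unix/X{d.display}")
    return ("tcp", (d.host, X11_TCP_BASE_PORT + d.display))
```

From the client's point of view, only the endpoint changes; the protocol spoken afterwards is the same, which is the 'transparent' part.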

The remote display thing assumes that you were OK with anyone being able to listen in “on the wire” and do all kinds of nasty things with the information gathered. Pixel buffers were not compressed, so when they were too numerous or too large, the network was anything but happy. It was good only through the rose-tinted glasses of nostalgia and for a local area network (your home, school, or business), certainly not across the internet.
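To put rough numbers on why uncompressed pixel buffers hurt, here is a back-of-the-envelope sketch. It assumes the worst case of sending every frame in full at 4 bytes per pixel; the resolutions and frame rates are illustrative assumptions, not measurements, and real clients often update only parts of a window.

```python
def raw_bandwidth_mbit(width, height, bytes_per_pixel=4, fps=30):
    """Bandwidth in Mbit/s for sending every frame as uncompressed pixels."""
    bits_per_second = width * height * bytes_per_pixel * fps * 8
    return bits_per_second / 1_000_000

# Illustrative cases: a small window and a full HD surface.
for w, h, fps in [(640, 480, 30), (1920, 1080, 60)]:
    print(f"{w}x{h} @ {fps} fps: {raw_bandwidth_mbit(w, h, fps=fps):.0f} Mbit/s")
```

Under these assumptions the small window alone needs roughly 295 Mbit/s and the full HD surface just under 4 Gbit/s, which is why this only ever felt tolerable on a quiet local network.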

It also assumes that the X server itself was not started with the “-nolisten tcp” argument set, or that you were using the better option of letting SSH set up forwarding, compress, and otherwise provide preferential treatment. Even then, it assumes that you were practically fine with the idea that some of your communication could be deduced from side channel analysis and so on. Digressing a bit: in spite of speculative vulnerabilities being all the rage yet hard to exploit practically at scale, the side channel thing is highly relevant in much more modern protocols and efforts in this ‘web browser as a dumb terminal’ day and age. Just sayin’.

Those details aside, this was a workable scenario for a long time, including for relatively complex clients like Quake3. The reason is that even GLX, the X related extensions to OpenGL, only had local ‘direct rendering’ as a very optional thing. That said, it was just about at the tipping point where the distance between locally optimal rendering and remotely optimal rendering became much too great, and the large swath of developers and users in charge largely favoured the locally optimal case for desktop-like workloads.

The big advantage non-local X had over other remote desktop solutions, of which there were far too many, is exactly this part. As far as the pragmatic user could care, the idea of transparency (or should it be translucency?) was simply to be able to say “hey you, this program, and only this program on this remote machine, get over here!”.

The principal quality was the relative seamlessness of the entire set of features on a per window basis, and that goes unmatched to this very day, but with every ‘integrated desktop environment’ advancement, the feature grows weaker and the likelihood of applications working tolerably like this decreases drastically.

What Happened?

An unusably short answer would be: the convergence of many things happened. A slightly longer answer can be found here: X’s network transparency has wound up mostly being a failure. My condensed take is this:

Evolution of accelerated graphics happened, or the ‘Direct Rendering Infrastructure, DRI’ as it is generationally referenced in the Xorg and Linux ecosystem. Applications started to depend heavily on network-unfriendly IPC systems that were being used in tandem with X. You wanted sound to go with your application? Sorry. Notification popups going to the wrong machine? Oops, D-Bus! And so on.

This is what one side of the argument is poking at when they go ‘X is not network transparent!’, while the other side is quick to retort that they are, in fact, running emacs over X on the network to this very day. Try it for yourself; it is not that the mechanisms have suddenly disappeared, and it should be a short exercise to gain some experience. From my own experiments just prior to writing this article, the results varied wildly from pleasant to painful depending on the toolkit. No points are awarded for guessing which ones fared the worst.

Thus far, I have mostly painted a grim portrait, yet there are more interesting sides to this, perhaps best represented by XPRA or X2go. X2go addresses some of the shortcomings in ways that still leverage parts of X without falling back to the lowest “no way out” denominator of a composited framebuffer. It does so by using a custom X server with a different line protocol for external communication and a carrier for muxing in sound, among other things.

While this approach still falls flat when it comes to accelerated composition past a specific feature-set, as can be seen in the compatibility documentation notes, it is still very actively developed and used. The activity on mailing lists, IRC and gatherings all acts as a testament to the relevance of the feature and its current form, from both a user and a developer perspective.

What does the future hold?

… or the alternative section title of “Wayland, and the irony of dressing up the past as the future”.

It is no secret that I have become increasingly disappointed as to the technical and architectural merits of Wayland over the course of the last few years. This comes empirically from my quite public code, along with some not so public experiments and consultation work. That said, the Wayland ecosystem (because hey, it’s not “just a protocol”) is useful for reference here as an indicator of things that might become reality for some poor souls.

There is a total of one blog post/PoC that, in stark contrast to the rambling fever dreams of most forum threads on the subject, experiments technically with the possibility of being transparent in the sense of “a client connecting/bridged to a remote server” and not opaque in the sense of “a server compositing and translating n clients to a different protocol”. Particularly note the issues around keyboard and descriptor passing. Those are significant yet still only the tip of a very unpleasant iceberg.

The post itself does a fair enough job providing notes on some of the problems, and you can discover a few more for yourself if you patch or proxy the wayland client library implementation to simulate various latencies in the buffer dispatch routine. Enjoy troubleshooting why clients get disconnected or crash sporadically (or more than usual). It turns out that testing asynchronous event-driven implementations reliably is hard, and not that much effort is being put into toolkit backends for Wayland; too bad nearly all the responsibility has been pushed to the toolkit backends in order to claim that the server side looks simple.

The reason I bring this up is that what will eventually happen is alluded to in the Wayland FAQ. All of the options mentioned there are ‘opaque’ in the aforementioned sense.

The dumbest thing that can happen is that people take it too literally and actually embed VNC on the compositor side. RFB, the underlying protocol, is seriously terrible; even if you factor in the many extensions, proprietary as well as public. Making fun of X having a dated view on graphics and in the next breath considering VNC has some air of irony to it.

Putting aside the seriously dodgy implementations that persist in all the places you would not want them to be, the one redeeming feature is the inertia in that there are clients on lots of platforms.

The counterpoint to that feature is that they practically all have subtle incompatibilities with the many servers that exist, so you do not know which features can be relied on, when, or to what extent; assuming the connection does not just terminate during the handshake, as was the case for many years with Apple’s VNC client or server connecting to one not written by Apple. At least the public part of the protocol (RFC 6143) is documented in such a coherent and beautiful way that it puts the soup of scattered XML files and TODO-sprinkled PDFs that is “modern” Wayland forever in the corner.

A fun holiday depression exercise and CCC tradition in some circles, I have heard, is to script a scan of the open internet for port 5900, connect, take a screenshot, print them all out and stitch them together into a seriously depressing virtual quilt.
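As a hedged illustration of how early such incompatibilities can bite: the first thing exchanged in RFB is a 12-byte ProtocolVersion message of the form "RFB xxx.yyy\n". The sketch below parses it and, following my reading of RFC 6143 section 7.1.1, treats any version other than 3.3, 3.7 and 3.8 as 3.3. The function names are made up for this example, and it does not cover anything past this first message.

```python
import re

_VERSION_RE = re.compile(rb"RFB (\d{3})\.(\d{3})\n")
PUBLISHED = {(3, 3), (3, 7), (3, 8)}

def parse_protocol_version(message: bytes):
    """Parse the 12-byte RFB ProtocolVersion message into (major, minor)."""
    m = _VERSION_RE.fullmatch(message)
    if m is None:
        raise ValueError(f"not an RFB ProtocolVersion message: {message!r}")
    return int(m.group(1)), int(m.group(2))

def effective_version(message: bytes):
    """Map versions outside the published set to 3.3 (assumption: RFC 6143 guidance)."""
    version = parse_protocol_version(message)
    return version if version in PUBLISHED else (3, 3)
```

A peer that announces a non-standard number and then expects behaviour beyond 3.3 has already diverged from what a strict implementation will do, before a single pixel is sent.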

The second dumbest thing is to use RDP. It has features. Lots of them. Even a printer server and file system mount translation. Heck, all the things that Xorg was made fun of for having are in there, and then some. The reverse engineered implementation of this proprietary Microsoft monstrosity is about the code size of the actually used parts of Xorg, give or take some dependencies. In C. In network facing code. See where this is heading? Does that not sound like the greatest thing to embed in that oh so simple and minimalistic Wayland Compositor you pretend to use for other things than XWayland?

The least bad option – outside of the more remote possibility of a purpose-fit design – is to realise that SPICE still exists, mostly being wasted as a way of integrating with KVM/Qemu, and it too has quite some room for improvement.

Rounding things off, the abstract point of the referenced ‘VNC-’ idea argument is, of course, the core concept of treating client buffers as opaque texture bitmaps in relation to an ordered stream of input and display events; not the underlying protocol as such.

The core of the argument is that networked ‘vector’ drawing is defunct and dead or dying. The problem with that argument is that it is trivially false, well illustrated by the web browser which shows some of the potential, and then only partially right in the X case, as X2go shows that there is validity to proper segmentation of the buffers so that the networking part can optimize its contents and caching thereof.
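To make the point about segmentation and caching concrete, here is a minimal sketch of one way a sender could exploit separately identified buffers: hash each segment's contents and only transmit pixel data the receiver has not already seen. This is an illustration of the general idea only; it is not how X2go or any other named project does it, and every name in it is hypothetical.

```python
import hashlib

class SegmentSender:
    """Decide per segment whether to send pixel data or a cache reference."""

    def __init__(self):
        # Digests the receiver is assumed to already hold in its cache.
        self.remote_cache = set()

    def update(self, segment_id, pixels: bytes):
        digest = hashlib.sha256(pixels).hexdigest()
        if digest in self.remote_cache:
            # Receiver can redraw this segment from its own cache.
            return ("ref", segment_id, digest)
        self.remote_cache.add(digest)
        return ("data", segment_id, digest, pixels)
```

With a single composited framebuffer, one changed pixel invalidates the entire image; with per-segment buffers, an unchanged icon or a window switching back to earlier contents costs only a short reference.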

If you made it this far and want to punish yourself extra – visit or revisit this forum thread.

10 Responses to The X Network Transparency Myth

  1. Is there any good set of specifications for a remote transparent graphic (multi-media really) terminal protocol? Don’t complain, write it down!

    • bjornstahl says:

      Not really no. In terms of the quality and accuracy of the documentation, the referenced SPICE is probably as good as it gets and it is still quite a limited one. What is being worked on in the scope of Arcan (roadmap target for 0.6, teasers in net-056 branch) is a bit more ambitious, but documenting those details is the final quality assurance stage of that target, not something that will happen in the near future.

  2. Anthony says:

    Thank you for the interesting article.
    May you announce your thoughts for what things will have to be?

    Maybe you’ll shed some light on the following issues:

    1. Arcan core doesn’t care about network transparency. There may be different implementations of frameservers(?) which will do network transfers and finally put the picture (raster) into the arcan client buffer via shmif.

    2. What about a chain of frameservers? For example, I have a direct native Arcan library/toolkit (==frameserver?) DrawKit which provides drawing of two lines in the form of an X via one function drawX(). What if I want to provide network transparency for this library? I must create some specific program DrawKitNet linked to DrawKit which will listen to the network, decode commands and call drawX(). Does Arcan care about the existence of DrawKitNet?

    3. About VR. I didn’t see in your VR experiments that you’ve introduced a new type of Arcan client buffer. Maybe I’ve missed something. As for me, there must be a new type of client buffer because things go from raster to volume. If there is a client buffer of Volume type in Arcan then we can say that in the case of VR/3D scenes the logic remains the same and Arcan will handle network transparency for VR by additional frameservers(?) which will do network transfers and finally put the things into an arcan client buffer of type ‘Volume’ via shmif.

    • bjornstahl says:

      1/2. There is a branch (net-056) where a semi-workable netpipe prototype is evolving. See src/tools/netproxy there. It works by exposing itself as a headless simplified shmif-server. It can currently do video and (most) events for one segment.

      Recall that from crash recovery we can tell a client where to go if its current server disappears and rebuild itself here. So dynamic handover works with the WM instructing (in durden, target/video/advanced/migrate) the client to leave and go somewhere else, but to come back if something goes wrong.

      Then there’s more on

      3. VR from a client perspective is so far focused on input, but all the other parts except for the data format(s) itself are prepared for, and it is the same code path that is used to allow some clients to share and modify display color LUTs (see shmif_sub and the vrbridge tool source). The attack plan is roughly to refactor out the current mesh handling from the core (arcan_3dbase.c) and move it into afsrv_decode, where added dependencies and parsing do much less damage and will be sandboxed. Then use a standard mesh packing format (like the one needed by glTF anyhow) for models, and some voxel representation. When the current test cases work and so on, there is little left to let other clients provide 3D output. The ‘pixel buffer’ part of the segment will be used for textures then.

  3. Anthony says:

    It seems that I got it wrong again.
    As “It works by exposing itself as a headless simplified shmif-server” and “..gets rendered off-screen and forwarded to the ‘encode’ frameserver which implements VNC-style compression and serving” my assumption is wrong. Arcan will ‘embed’ network transparency. It slightly disappoints me.

    I still think you will have to cover Arcan with an API (content creation API) for application developers, and from that point of view things may have changed. Also, I think that issues like security constraints and network transparency must be external to the Arcan core.

    • bjornstahl says:

      I think that it is just language that confuses us. What do you mean by ‘embed’ here? It is a separate, replaceable, optional tool. It acts as both a shmif-api client and a shmif-api server, but one that doesn’t need GPU, input device access and so on, as it will just proxy the “server” side. From the perspective of application developers, nothing is needed; this is entirely transparent to them.

  4. Anthony says:

    I fully agree that our non-native English is a bit/very poor.

    But “exposing itself as a headless simplified shmif-server” gave no chance to interpret this in the wrong way. You are going to transfer various arcan buffer types from shmif-local to shmif-remote. But I really think that this is not the point where network transparency should be introduced.

    As you saw, I’m looking for an Arcan application API layer (toolkits). Also, I believe that shmif will be fully covered by this API (toolkits). In this case there is no need to require all native Arcan applications to connect to shmif; they do not need ARCAN_CONNPATH. They are linked to a toolkit and it is up to the toolkit to provide network transparency in the best way.

    After the review it is surprising to see that you announce “ARCAN_CONNPATH (comparable to X DISPLAY=:number)” for network transparency. My thought after your article is that network transparency is a complex problem, and a design with a universal solution for network transparency inside is a mistake. You are going to provide a universal solution for network transfer of all Arcan buffer types which will ‘automatically’ make all native Arcan applications network transparent. It is strange at this stage, as there is no ‘heavy’ Arcan application to prove the concept at least.

    Again, I think that designing architecture and infrastructure apart from the real applications and application APIs can lead to non-optimal decisions.

    • bjornstahl says:

      arcan-shmif, arcan-shmif-tui is the API? More abstract / advanced rendering (say Qt, GTK style) would render using that as their ‘backend’. A shmif-client connects to arcan (shmif-server), which is found via CONNPATH. How else would they do it?

      There are certainly ‘heavy’ clients enough to cover the gamut (Xarcan, afsrv_decode, afsrv_terminal, afsrv_game, qemu-arcan, arcan-wayland, arcan-lwa). Full screen VM style “browser-class” applications (complex input patterns), 2D/3D games (via the libretro backend, high bandwidth, latency sensitive) and video playback (high-bandwidth, good buffer case) – what category would you say is missing? All except the arcan-wayland bridge work fine (better than VNC or X-fwd) networked on my LAN right now.

      • Anthony says:

        “How else would they do it?” This is the really important question which should be carefully analyzed. It comes up when the issue of the API/toolkits layer arises.
        Is Arcan here to drop the megabytes of shims and helpers over poor/outdated APIs? It seems to me that the Arcan design can suggest some flexible and efficient answer.

        Again, “How else would they do it?”. Further, something of a brainstorm.
        First of all, let’s show some cases:
        1. An Excel-like application built with the toolkit ArcanGTK.
        2. ProteinViewer. An application that renders a protein molecule and allows the user to examine it (rotate/zoom).
        Is there one ‘network transparency solution’ that is most efficient for both? The ProteinViewer will do heavy rendering of a 3D image by some logic based on a relatively small dataset of protein characteristics. A VNC style is bad because the server which calculates the characteristics must also do heavy renders for 1, 2 … 20 users. The most efficient way is to put the network border at an early stage and just transfer the protein characteristics to the ProteinRenderer.

        Do you think that the GTK and Qt guys are unable to add efficient network transparency if they do it from scratch? Just by profiling applications and by merging the most common function sequences into one network command.

        I think that the arcan core must be bordered by buffer transfer via shmif. But in the Arcan infrastructure there must be renderers – libraries that do hardware accelerated rendering. So there are two stages:
        – buffer production using renderers;
        – outputting the buffer to arcan via shmif;

        So there will be two toolkits, ArcanGTK and ProteinRenderer. The Excel-like application will use ArcanGTK and ProteinViewer will use ProteinRenderer. If ProteinRenderer (as any arcan toolkit) wants to be ‘network transparent’ they must… (here carefully analyzed requirements)

        I think the Arcan infrastructure must have some library with helpers for typical marshalling/command producing/parsing and a shell application ‘onremote’.
        So putting protein ABCD4533 on user screen on station12 may be achieved by command:

        onremote station12 proteinviewer ABCD4533

        onremote will (these are some quick, crazy thoughts):
        – establish connection with station12 via SCTP;
        – prepare environment;
        – link proteinviewer with network version of ProteinRenderer library;
        – execute proteinviewer with argument ABCD4533.

      • bjornstahl says:

        Ok, I think I know how to structure an explanation that can take this forward – but wordpress/browser comment section format is a bit hard to work with. Please send me an e-mail if that is ok with you.
