Time to continue to explain what Arcan actually “is” on a higher level. Previous articles have invited the comparison to Xorg ( part1, part2 ). Another possibility would have been Plan9, but Xorg was also a better fit also for the next (and last) article in this series.
To start with a grand statement:
Arcan is a single-user, user-facing, networked overlay operating system.
With “single-user, user-facing” I mean that you are the core concern; it is about providing you with controls. There is no compromise made to “serve” a large number of concurrent users, to route and filter the most traffic, or to store and access data the fastest anywhere on earth.
With “overlay operating system” I mean that it is built from user-facing components. Arcan takes whatever you have access to and expands from there. It is not hinged on the life and death of neither the Linux kernel, the BSD ones or any other for that matter. Instead it is a vagabond that will move to whatever ecosystem you can develop and run programs on, even if that means being walled inside an app store somewhere.
As such it follows a most pervasive trend in hardware. That trend is treating the traditional OS kernel as a necessary evil to work around while building the “real” operating system elsewhere. For a more thorough perspective on that subject, refer to USENIX ATC’21: It’s time for Operating Systems to Rediscover Hardware (Video).
This is a trick from the old G👀gle book: they did it to GNU/Linux with Android and with ChromeOS. It is not as much the predecessor mantra of “embrace, extend and extinguish” as it is one of “living off the land” — understanding where the best fit is within the ecosystem at large.
From this description of what Arcan is — what is the purpose of that?
The purpose of Arcan is to give you autonomy over all the compute around you.
With “autonomy” I mean your ability to move, wipe, replace, relocate or otherwise alter the state that is created, mutated or otherwise modified on each of these computing devices.
The story behind this is not hard in retrospect; user-facing computers are omnipresent in modern life — they outnumber us. You have phones, watches, tablets, “IoT” devices, e-Readers, “Desktop” Workstations, Laptops, Gaming Consoles and various “smart”- fridges, cars and meters and so on. The reality is that if a computer can be fitted into something, be sure that one or several will be shoved into it, repeatedly.
The fundamentals on how these computers work differ very little; even “embedded” is often grossly overpowered for the task at hand. On the other hand, getting these things you supposedly own to collaborate or even simply share state you directly or indirectly create is often hard- or impossible- to achieve without parasitic intermediaries. The latest take on this subject on the time of writing this is parts of “cloud”; routing- and subjecting- things to someone else’s computing between source (producer) and sink (consumer).
Part of the reason for this is persistent and deliberate balkanisation combined with manipulative monetisation strategies that have been permitted to co-evolve and steer development for a very long time; advances in cryptography has cemented this.
An example: In the nineties it was absurd to think that the entire vertical (all the ‘layers’) from datastore all the way to display would have an unbroken chain of trust. The wettest of dreams that the Hollywood establishment had was that media playback was completely protected from the user tampering with- or even observing- the data stream until presented. That is now far from absurd; it is the assumed default reality and you are rarely allowed to set or change the keys.
The scary next evolution of this is making you into a sensor and it is sold through claims of stronger security features that are supposed to protect you from some aggressor. A convenient side effect is that it actually serves to safeguards the authenticity of the data collected from-, of- and about- you. As a simple indicator: when no authentication primitive (password etc.) is needed, the “ai in the cloud” model of you has locked things down to the point that you best behave if you want to keep accessing those delicious services your comfort depend on.
The overall high-level vision how this development can be countered on a design basis is covered by the 12 principles for a diverging desktop future on our sister blog — but the societal implementation is left as an exercise for the reader.
A short example of the general idea in play can be seen in this old clip:
This demonstrates live migration between different kinds of clients moving to arcan instances with different levels of native integration and no intermediaries. As a native display server on OpenBSD in the centre to a limited application on the laptop to the left (OSX) and to a native display server on the pinephone on the right.
The meat and remainder of this article will provide an overview of the following:
The following diagram illustrates the different building blocks and how they fit together.
SHMIF – Shared Memory Interface
SHMIF is a privilege barrier between tasks built only using shared memory and synchronisation primitives (commonly semaphores) as necessary and sufficient components. This means that if an application has few other needs over what shmif provides, all native OS calls can be dropped after connection setup. That makes for a strong ‘least-privilege’ building block.
It fills the role of both asynchronous system calls and asynchronous inter-process communication (IPC), rolled into one interface. The main inspiration for this is the ‘cartridges‘ of yore — how entire computers were plugged in and removed at the user’s behest.
There is a lot of nuance to the layout and specifics of SHMIF which are currently out of scope, notes can be found in the Wiki. The main piece is a shared 128b event structure that is serialised over two fixed size ring buffers, one in-bound and one out-bound. The rest is contested metadata that is synchronously negotiated and transferred to current metadata that both sides maintain a copy of.
This is optionally extended with handle/resource-token blob references when conditions so permit (e.g. unix domain socket availability or the less terrible Win32 style DuplicateHandle calls), as that is useful for by-reference passing large buffers around which can be preferred for accelerated graphics among many other things.
The data model for the events passed around is static and lock stepped. This model is derived from the needs of a single user “desktop”. It has been verified against both X11, Android as well as special purpose “computer-in-computer” clients like whole-system emulation à QEmu, specialised hybrid input/out devices (e.g. streamdeck) and virtual- augmented- and mixed- reality (e.g. safespaces).
A summary of major bits of shmif:
- Connection: named, inheritance, delegation, redirection and recovery.
- Rendering: pixel buffers, formatted text, accelerated device buffers.
- Audio: sources, sinks and synchronisation to/from video.
- Input: digital, translated (keyboard), mice, touch / tablet, analog sensors, game / assistive, eye tracker, application announced custom labels.
- Sensors: Positional, Locational and Analog.
- Window management (relative position, presentation size, ordering, annotation).
- Color profiles and Display controls (transfer functions, metadata).
- Non-blocking State transfers (clipboard, drag and drop, universal open/save, fonts).
- Blocking state controls (snapshot, restore, reset, suspend, resume).
- Synchronisation (to display, to event handler, to custom clock, to fence, free-running).
- Color scheme preferences.
- Key/value config persistence and “per window” UUID.
- Coarse grained timers.
Most of this is additive – there is a very small base and the rest are opt-in events to respond to or discard. All event routing goes through user-scriptable paths that are blocked by default, forcing explicit forwarding. A client does not get to know something unless the active set of scripts (typically your ‘window manager’) routes it.
Upon establishing a connection, requesting a new ‘window’ (pull) or receiving one initiated by the server side (push), the window is bound to an immutable type. This type hints at policy and content to influence window management, routing and to the engine scheduler.
A game has different scheduling demands from movie playback; an icon can attach to other UI components such as a tray or dock and so on. This solves for accessibility and similar tough edge cases, and translates to better network performance.
A12 – Network Protocol
SHMIF is locally optimal. Whenever the primitives needed for inter-process SHMIF cannot be fulfilled, there is A12. The obvious case is networked communication where there is no low latency shared memory, only comparably high-latency copy-in-copy-out transfers.
There are other less obvious cases, with the rule of thumb being that two different system components cannot synchronise over shared memory with predictable latency and bandwidth. For instance, ‘walled garden’ ecosystems tend to disallow most forms of interprocess communication, while still allowing networked communication to go through.
A12 has lossless translation to/from SHMIF, but comes with an additional set of constraints to consider and builds on the type model of SHMIF to influence buffer behaviour, congestion control and compression parameters.
The constraints placed on the design are many. A12 needs to usable for bootstrapping; operate in hostile environments; isolated/islanded networks and between machines with unreliable clocks; incompatible namespaces and possibly ephemeral-transitive trust. For these reasons, A12 deviates from the TLS model of cryptography. It relies on a static selection of asymmetric- and symmetric- primitives with pre-shared secret ‘Trust On First Use’ management rather than Certificate Authorities.
Around the same era that browsers started investing heavily into sandboxing (late 2000s, early 2010s) Arcan, then a closed source research project, also focused on ephemeral least privilege process separation of security and stability sensitive tasks. The processes that carry out such tasks are referred to as ‘frameservers’.
In principle ‘frameservers’ are simply normal applications using SHMIF with an added naming scheme to them and a chainloader (arcan_frameserver) that is responsible for sanitising and setting up respective execution environments.
In practice ‘frameservers’ have designated roles (archetypes). These control how the rest of the system delegates certain tasks, and gives predictable consequences to what happens if one would crash or be forcibly terminated. It is also used to put a stronger contract on accepted arguments and behavioural response to the various SHMIF defined events.
The main roles worth covering here are ‘encode‘, ‘decode‘ and to some extent, ‘net‘.
Decode samples some untrusted input source e.g. a video file, a camera device or a vector image description, and converts it into something that you can see and/or hear, tuned to some output display. This consolidates ‘parsing’ to single task processes across the entire system. These processes have discrete, synchronous stages where incrementally more privileges, e.g. allocating memory or accessing file storage, can be dropped. The security story section goes a bit deeper into the value of this.
Encode transforms something you can see and/or hear into some alternate representation intended for an untrusted output. Some examples of this are video recording, image-to-text (OCR) and similar forms of lossy irreversible transforms.
Net sets up transition and handover between shmif and a12. It also acts as networked service discovery of pre-established trust relationships (“which devices that I trust are available”, “have any new devices that I trust become available”) and as a name resource intermediary e.g. “give me a descriptor for an outbound connection to <name in namespace>”.
Splitting clients into this “ephemeral one-task” and regular ones lead to dedicated higher level APIs for traditionally complex tasks, as well as acting as delegates for other programs.
It is possible for another shmif client to say “allocate a new window, delegate this to a decode frameserver provided with this file and embed into my own at this anchor point” with very few lines of code. This lets the decode frameserver act as a parser/decode/dependency sponge. Clients can be made simpler and not invite more of the troubles of toolkits.
The ‘engine’ here is covered by the main arcan binary and invites parallels to app frameworks in the node.js/electron sense, as well as the ‘Zygote’ of Android.
It fills two roles — one is to act as the outer ‘display server’ which performs the last composition and binds with the host OS. The scripts that run within this role is roughly comparable to a ‘window manager’, but with much stronger controls as it acts as ‘firewall/filter/deep inspection’ for all your interactions and data.
The other role is marked as ‘lwa’ (lightweight arcan) in the diagram and is a separate build of the engine. This build act as a SHMIF client and is able to connect to another ‘lwa’ instance or the outermost instance running as the display server. This lets the same code and API act as both display server, ‘stream processor’ (see: AWK for Multimedia) and the ‘primitives’ half of a traditional UI toolkit.
Both of these roles are programmable with the API marked as ‘ALT’ in the diagram and will be revisited in the sections on ‘Programmable Interfaces’ and ‘Appl’.
The architecture and internal design of the engine itself is too niche to cover in sufficient detail. Instead we will merely lay out the main requirements that distinguish it against the many strong players in the core 2D- supportive 3D- game engine genre.
Capability – enough advanced graphics and animation support for writing applications and user interfaces on the visual and interactive span of something ranging from your run-of-the-mill web or mobile ‘app’ to the Sci-Fi movie ‘flair over function’ UIs. It should not rely on techniques that would exclude networked rendering or exclude devices which cannot provide hardware acceleration.
Abstraction – The programmable APIs should be focused on primitives (geometry, texturing, filtering), not aggregates/patterns (look and feel). Transforms and animations should be declarative (“I want this to move to here over 20 units of time”), (“I want position of a to be relative to position of b”) and let the engine solve for scheduling, interpolation and other quality of experience parameters.
Robust – The engine should be able to operate reliably in a constrained environment with little to no other support (service managers, display/device managers). It should avoid external dependencies. It should be able to run extended periods of time — months, not hours or days.
Resilient – The engine should be able to recover from reasonable amounts of failures in its own processing, and that of volatile hardware components (primarily GPU). It should be able to hand over/migrate clients to other instances of itself.
Recursive – The engine should be able to treat additional instances of itself as it would any other node in the scene graph, either as an external source node or a subgraph output sink node.
SHMIF has been covered already from the perspective of its use as an IPC system. As an effect of this, it is also a programmable low-level interface. A thorough article on using it can be found in (writing a low level arcan client), and a more complex/nuanced follow up in (writing a tray icon handler). The QEmu UI driver, the arcan-wayland bridge and Xarcan are also interesting points of reference to hack on.
TUI is an abstraction built on top of SHMIF. It masks out some features and provides default handlers for certain events — as well as translating to/from a grid of cells- or row-list- of formatted text. It comes with the very basics of widgets (readline, listview, bufferview). Its main role in the stack is to replace (n)curses style libraries and improve on text dominant tools as a migration strategy for finally leaving terminal emulation to rest.
ALT is the high level API (and protocol*) for controlling the Engine. The primary way of using it is as Lua scripts, but the intention is a bit more subtle than that. For half of this story see ‘Appl’ below. Lua was chosen for the engine script interface in part for its small size (with that low memory overhead and short start times), the easy binding API and minimal set of included functions. It is treated and thought of as a “safe” function decoration for C more than it is treated as a normal language.
The *protocol part is that the documentation for the API double as a high level Interface Description Language to generate bindings that would use the API out of process — allowing both Lua “monkey patching” by the user and process separation with an intermediate protocol. This makes the render process and ALT into a dynamic sort of postscript for applications with animations and composition effects rather than static printer-friendly pages.
This is not a discrete component, but rather a set of restrictions and naming conventions added on top of the core engine. To understand this, a rough comparison to Android is again in order.
The Android App is, grossly simplified, a Zip archive with some hacks, a manifest XML file, some Java VM byte code, optional resources and optional native code. The byte code traditionally came from compiling Java code, but several languages can compile to it. The manifest covers some metadata, importantly which system resources that the app should have access to.
The Arcan Appl (the ‘l’ is pronounced with a sigh or ‘blowing raspberries’) has a folder structure:
- A subdirectory to some appl root store with a unique name
- A .lua file in that directory with the same name.
- A function with the same name as the directory and the .lua file.
Resources the appl can access and data stores it can create and delete files within are broken down into several namespaces. The main such namespaces are roughly: application-local-dynamic, application-local-static, fonts, library code and shared.
Similarly to how the Android app can load native code via JNI, the Arcan appl can dynamically load shared libraries. In contrast to Android where native code is pulled in for supporting the high level Java/Kotlin/…, the high level scripting in Arcan is to support the native code so that the tedious and error prone tasks are written in a memory safe, user hackable/patchable code by default.
The mapping of the namespaces themselves, restrictions or additional permissions, configuration database and even different sets of frameservers are all controlled by the arguments that was used to start each engine process.
The database act as a key / value store for the appl itself, but also as a policy model for which other shmif capable clients should be allowed to launch (* enforcement for native code is subject to controls provided by the host OS), as well as a key / value store for tracking information for each client launched in such a way.
Other resources permissions are not directly requested or statically defined by the appl itself, it the window manager that ultimately maps and routes such things.
There are a number of reference appls that have been written and presented on this page throughout the years. These are mainly focused on filling the ‘window manager’ role, but can indeed also be used as building blocks for other applications.
These have been used to drive the design of ALT API itself; to demonstrate the rough scope of what can be accomplished, and they are usable tools in their own right.
The ones suitable as building blocks for custom setups or ‘daily drivers’ are as follows:
Console — (writing a console replacement using Arcan) which act as fullscreen workspaces dedicated to individual clients, with a terminal spawned by default in an empty one. This is analogous to the console in Linux/BSD setups and comes bundled with the default build (but with unicode fonts, OSD keyboard, touchscreen support, …).
Durden — Implements the feature set of traditional desktops and established window management schemes (tiling, stacking/floating and several hybrids). It is arranged as a virtual filesystem tree that UI elements and device inputs reference.
Safespaces — Structurally similar to Durden as far as a ‘virtual filesystem’ goes, but intended for Augmented-, Mixed- and Virtual Reality.
Pipeworld — Covers ‘Dataflow’; a hybrid between a programming environment, a spreadsheet and a desktop.
There are other shorter ones that are not kept up to date but rather written to demonstrate something. A notable such example is the Plan9-like Prio (One night in Rio – Vacation Photos from Plan9).
There are several intimidating uphill battles with established conventions and the network/lock-in effect of large platforms — no interesting applications equals no invested users; no developers equals no interesting applications.
The most problematic part for compatibility comes with ‘toolkit’ (e.q. Qt and GTK) built ones. Although often touted as ‘portable’, what has happened time and again is a convergence to some crude and uninteresting capability set tied to whatever platform abstraction can be found deep in the toolkit code base — it is never pretty.
There is fair reason as for why many impactful projects went with ‘the browser as the toolkit’ (i.e. Electron). The portability aspects for the big toolkits will keep on loosing relevance; the long term survival rate for well-integrated ‘native’ feel portable software looks slim to none. The end-game for these rather looks like banking on one fixed idea/style or niche.
The compatibility strategy for Arcan is “emphasis at the extremes” — first to focus on the extreme that is treating other applications as opaque virtual machines (which include Browsers). Virtualization for compatibility is the strongest tactic we have for taking legacy with us. This calls for multiple tactics to handle integration edge cases and incrementally breaking the opaqueness – such as forced decomposition through side-band communicated annotations, “guest-additions” and virtual device drivers.
The second part to the strategy is to focus on the other extreme that is the ‘text dominant’ applications, hence all the work done on the TUI API. As mentioned before, it is needed as a way out of getting ‘Terminals’ with command lines that are not stuck with hopelessly dated and broken assumptions on anything computing. Terminal emulators will be necessary for a long time, and Arcan comes with one by default — but as a normal TUI client.
TUI is also used as a way of building front-ends to notoriously problematic system controls such as WiFi authentication and dynamic storage management. It is also useful for ‘wrapping’ data around interactive transfer controls; leave the UI wrapping and composition up to the appl stage.
The distant third part of the compatibility strategy are protocol bridges — the main one currently being ‘arcan-wayland’. For a while, this was the intended first strategy, but after so many years of the spec being woefully incomplete, then seriously confused, it is now completely deranged and ready for the asylum. That might sound grim, yet this is nothing compared to the ‘quality’ of the implementations out there.
One area that warrants special treatment is security (and the overlap with some of privacy). This is an area where Arcan is especially opinionated. A much longer treatment of the topic is needed, and an article for that is in the queue.
The much condensed overarching problem of major platforms are that they keep piling on ‘security features’ (for your own good they say) and (often pointless) restrictions or interruptions that are incrementally and silently added through updates — with you in the dark of what they are actually supposed to protect you from, and the caveats with that.
The following two screenshots illustrate the entry level of this problem:
Note: The very idea that the second one even became a dialog is surprising. Most UIs that predates this idiocy had trained users to route data using their own initiative and interaction alone through “drag and drop”, “copy and paste” and so on. It is a dangerous pattern in its own right, and a mildly competent threat actor knows how to leverage this side channel.
There is a lot to unpack with these two alone, but that is for another time.
The core matter is that these fit in some threat model, but unlikely to be part of your threat model. The tools to actually let you express what your threat model currently is, and tools to select mitigations to fit your circumstances — are both practically nonexistent.
Compare to accessibility: supporting vision impairment, or blindness, has vastly different needs from deafness which has different needs from immobility. Running a screen reader will provide little comfort to someone that is hard of hearing and so on. Turning such features on without the user being informed; or even rudely interrupting by repeatedly asking at every possible junction, is rightly to be met with some contention.
In the other end, someone working on malware analysis has different needs from someone approving financial transactions for a company has different needs from someone forced to use a tracking app from an abusive partner or employer. Yet here protections with different edge cases and failure modes are silently enabled without considering the user.
The security story is dynamic and context dependent by its very nature. A single person could be switching between having any of the needs expressed above at different times over the course of a single day. More technically, it might be fine for your laptop to “on-lid-open: automatically enable baseband radio, scan for known access points, connect to them, request/apply DHCP configuration” coupled with some other service waking up to “on network: request and apply pending updates” and so on from the comforts of your home. It might also get you royally owned while at an airport in seemingly infinitely many ways.
To tie things back to the Arcan design. The larger story comes from the 12 principles linked earlier, and a few of those are further expanded into the following maxims:
The Window Manager defines the set of mitigations for your current threat model.
This is hopefully the least complicated one to understand. To break it down further:
- The window manager is first in line to operationalise your intents and actions.
- The window manager reflects your preferences as to how your computer should behave.
- You, or someone acting on your behalf, should always have the agency to tune or work around undesired behaviours or appearances.
- Any interaction should be transformable into automation through your initiative and bindable to triggers that you pick.
- Automation and convenience should be easy to define and discover, but not a default.
The point is to make the set of scripts that is the Appl controlling the outermost instance of Arcan (as a display server) the primary control plane for your interests. If you think of each client/application as possibly sandboxed-local or sandboxed-networked, the scripts define the routing/filtering/translation rules between any source to any sink — a firewall for your senses.
There is no IPC but SHMIF.
Memory safety vulnerabilities (typically data/protocol parsers) were for a long time a cheap and easy way to gain access into a system (RCE – Remote Code Execution).
The cost and difficulty increased drastically with certain mitigations, e.g. ASLR, NX, W^X and stack canaries – but also through least-privilege separation of sensitive tasks (sandboxing). Neither of these are panaceas but they have raised the price and effort so substantially that there is serious economy and engineering effort behind just remote code execution alone (public examples) — which is far from what goes into a full implant.
Bad programming patterns break mitigations. If you don’t design the entire solution around least-privilege, very little can safely be sandboxed. In UNIX, everything is a file-descriptor. Subsequently, blocking write() to file-descriptors breaks everything.
What happens when trying to sandbox around non least-privilege friendly components is that you get IPC systems. Without a systemic perspective you end up with a lot of them, and they are really hard to get right. Android developers had serious rigour and a lot of effort in Binder (their primary IPC system) yet it was both directly and indirectly used to break phones for many years — and probably still is.
Few IPC systems actually gets tested or treated as security boundaries, and eventually you get what in offensive security is called ‘lateral movement’.
This is the story of (*deep breath*) how the sandboxed but vulnerable GTK based image parser triggered by a D-Bus (IPC) activated indexer service on your mail attachment exposed RCE via a heap corruption exploited with a payload that proceeded to leverage one of the many Use-after-Frees in the compositor Wayland (IPC) implementation, where it gained persistence by dropping a second stage loader into dconf (IPC) that used PulseAudio (IPC) over a PipeWire (IPC) initiated SIP session to exfiltrate stolen data and receive further commands without your expensive NBAD/IDS or lazy blue team noticing — probably just another voice call.
In reality the scenario above will just be used to win some exotic CTF challenge or impress women. What will actually happen is that some uninteresting pip-installed python script dependency (or glorious crash collection script) just netcats $HOME/.ssh/id_rsa (that you just used everywhere didn’t you?) to a pastebin like service– but that’ll get fixed when everything is rewritten in Rust so stay calm and continue doing nothing.
The point of SHMIF is to have that one IPC system (omitting link to the tired xkcd strip, you know the one); not to end them all, or be gloriously flexible with versioned code-gen ready for another edition of ‘programming pearls’ — but to solve for the data transport, hardening, monitoring and sandboxing for only the flows necessary and sufficient for the desktop.
Least privilege parsing and scrubbing
Far from all memory safety vulnerabilities are created equal. The interesting subset is quite small, and somehow need to be reachable by aggressor controlled data. That tend to be ‘parsers’ for various protocols and document/image formats. If you don’t believe me, believe Sergey and Meredith (Science of Insecurity).
This can (and should be) leveraged. Even parsing the most insane of data formats (PDF) have fairly predictable system demands. With a little care they can do without any system calls at all after they have been buffered, and it is really hard to do anything from that point even with robust RCE.
This is where we return to the ‘decode’ frameserver — a known binary that any application can, and should, delegate parsing of non-native formats to. One which aggressively sandboxes the worst of offenders. With support from the IPC system that tunes the parsing and consumes the results, it also becomes analysis, collection and fuzzing harness in one. Leveraging the display server to improve debugging.
Someone slightly more mischievous can then, run these delegates on single-purpose devices that network boot and reset to a steady state on power-cycle. Let them consume the hand-crafted and targeted phishing expedition, remote-forward a debugger as a crash collector to a team that extract and reverse exploit chain and payload, replicate-inject into a few honey pots with some prices for the threat actor to enjoy. This clip from pipeworld looks surprisingly much like part of that scenario, no?
Many data formats are becoming apt at embedded compromising metadata. Most know about EXIF today, fewer are aware just how much can be shoved into XMP – where you can find such delicacies as metadata on your motion (gait), or tracking images hidden inside the image as a base64 encoded jpeg inside the xmp block of a jpeg. A good rule of thumb is to never let anything touched by Adobe near your loved ones. If you would manage to systematically strip the one, something new is bound to pop up elsewhere.
By splitting importing (afsrv_decode) and (afsrv_encode) to very distinct tasks with a human observable representation and scriptable intermediary model the transition from the one to the other — you also naturally get designs that lets you define other metadata to encode. If that then is what gets forwarded and uploaded to whatever “information parasite” (social media as some tend to call it) that pretends to strip it away, but really collects it for future modelling, starts to trust it, well shucks, that degrades the value of the signal/selector. The point is not to “win”, the point is to make this kind of creepiness cost more than what you are worth so some are incentivised to try and make a more honest living.
Compartment per device.
With A12 comes the capability to transition back and forth between device-bound desktop computing and several networked forms. This opens up for a mentality of ‘one task well’ per discrete (‘galvanically isolated’) device, practically the strongest form of compartmenting risk that we can reasonably afford.
The best client to compartment on ‘throwaway devices’ is the web browser. The browser has dug its claws so deep into the flesh of the OS and hardware itself, and expose so much of it to web applications that the distance between that and downloading and running a standalone binary is tiny and getting shorter by the day — we just collectively managed to design a binary format that is somehow worse than ELF, a true feat if there ever was one.
The browser offer ample opportunity for persistence and lateral movement, yet itself aggregates so much sensitive and useful information that you rarely need to seek elsewhere on the system.
In these cases lateral movement as covered before is less interesting. Enough ‘gold’ exists within the browser processes that it is a comfortable target for your disk-less ephemeral process parasite to sit and scrape credentials and proxy through; ‘smash and grab’ as the kids say.
There is generally an overemphasis on memory safety to the point that it becomes the proverbial ‘finger pointing towards the moon’ and you miss out on all the other heavenly glory. There are enough great and fun vulnerabilities that require little of the contortionist practices of exploiting memory corruptions, and a few have been referenced already.
A category that has not been mentioned yet is micro-architectural attacks — one reason why the same piece of hardware is getting incrementally slower these days. You might have heard about these attacks through names indicative of movie villains and vaguely sexual sounding positions, e.g. SPECTRE and ROWHAMMER. Judging by various errata sections between CPU microcode revisions alone, there is a lingering scent in the air that we are far away from the end of this interesting journey.
Instead of handicapping ourselves further, assume that ‘process separation’ and similar forms of whole-system virtualization is useful for resiliency and for compatibility still, but not a fair security mechanism; sorry docker. Instead, split up the worst offenders over multiple devices that, again, are wiped and replaced on a regular basis. You now cost enough to exploit that a thug with a wrench is the cheaper solution.
At this juncture, we might as well also make it easier to extract state (snapshot/serialize) and re-inject into another instance of the same software on another device (restore/deserialize). In the end, it is a prerequisite for making the workflow transparent and quick enough that spinning up a new ephemeral browser tab should be near instant.
Another gain is that you reduce the amount of accidental state that accumulates, and you get the ability to inspect what that state entails. This is the story about how your filthy habits built up inside and between the tabs that you, for some reason, refuse to close. Think of the poor forensics examiner that has to sift through it — toss more of your data.
Anything that provides data, should be able to transition to producing noise.
Consider the microphone screenshot from earlier, or ‘screen’ sharing for that matter. What value does it have to your actions over abstracting away from the device?. Having a design language that is ‘provide what you want to share’ might look similar enough to a browser asking for permission to use your microphone or record your desktop — but there is quite some benefit to basic the user interaction to work at this other level of abstraction.
Some are strictly practically, like getting the user to think of ‘what’ to ‘present’ rather than to try and clean the desktop from any accidentally compromising material. Being explicit about source makes it much less likely that the iMessage notification popups from your loved one showing something embarrassing will appear in the middle of a zoom call with upper management.
By decoupling the media stream from the source (again, fsrv_decode and fsrv_encode), there is little stopping you from switching out or modifying the stream as it progress. While it can be used for cutesy effects such as adding googly eyes to everything without the application being directly the wiser — it also permits a semantically valid stream to slowly be morphed to and from a synthetic one.
With style transferring GANs improving, as well as adversarial machine learning for that matter, AI will no doubt push beyond that creepy point where AI synthetic versions of you will plausibly pass the Turing test to any colleagues and loved ones. This also implies that your cyber relations or verbal faux pas will be more plausibly deniable. You can end a conversation, and let the AI keep it going for a while. Let a few months pass and not even you will remember what you actually said. Taint the message history by inserting GPT3 stories in between. Ideally the cost for separating truth from nature’s lies will cost as much as dear old Science.
This is a building block for ‘Offensive Privacy’ — Just stay away from VR.