Digging For Pixels

A few months back, there was a buzz on Twitter and Reddit regarding the possibilities of automatically extracting raw images from memory dumps or live memory.

I was a bit indisposed at the time, but thought that now — with GPU malware and similar nastiness appearing on the horizon — was as good a time as any to contribute a tiny bit on the subject.

This post is the first in what will hopefully (the whole ‘if time permits’ thing) be a series that will double as a scratch pad for when I take the time to work on features for Senseye.

The initial question from the neighbourly master of hiding things inside things that are themselves hiding inside of other things, @angealbertini, went exactly like this:

any script/tool worth checking to automagically identify raw pictures (and their width) in memory dumps?

The discussion that followed is summarized rather well in this blogpost by @gynvael.

Let's massage this little problem a bit and break it down:

  1. Classification
  2. Format detection
  3. Pitch tuning
  4. Edge detection


The first part of the problem is distinguishing the bytes that correspond to what we want from the bytes that are irrelevant or, in other words, grouping based on some property or pattern and identifying each group as either what we are looking for (our pixel buffer) or as something to discard — finding the outline of the image in our virtual pile.

The problem lies in the beholder's eyes; consider the following images:

(Images: case1, kitchen-5, manul)


Their respective statistical profiles are all quite different and appear distinct here, but rip them out from the context of a formatted and rendered webpage, hide them in some 4 GB of VRAM (where they are probably hiding on your computer as you read this) and then try to recover them.

Which ones are the most interesting to you will, of course, depend on context, as will the criteria that would make them stand out from other bits in a bytestream, so we will need some way of specifying what we want.

The options that immediately come to mind:

  • “Human Vision” – This means cheating and letting a human help us, which already slightly fails the »automatically« part and corresponds to the solutions from the discussion link.
  • Various degrees and levels of statistics and signal processing hurt – Histogram matching against databases and auto-correlation are possibilities, albeit rather expensive ones.
  • Magical machine learning algorithms – These require proper training, regular massage and a hefty portion of luck, and can still mistake a penguin for a polar bear.

Add to these the traditionally ‘easy’ ones – context hints in the case of pointers, metadata leftovers, intercepting execution and so on. These have been done to death by forensics tools already, and I would consider them outside the scope here.
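To make the ‘statistics hurt’ option a bit more concrete, here is a hypothetical Python sketch (not a Senseye feature) of the cheapest possible classifier: slide a window over the dump and flag regions whose Shannon entropy falls in the band where uncompressed pixel data tends to live, above constant fill but below compressed or encrypted noise. The window size and thresholds are illustrative guesses, not tuned values.

```python
import math

def shannon_entropy(window: bytes) -> float:
    """Bits per byte, in [0, 8], for a byte window."""
    counts = [0] * 256
    for b in window:
        counts[b] += 1
    total = len(window)
    ent = 0.0
    for c in counts:
        if c:
            p = c / total
            ent -= p * math.log2(p)
    return ent

def flag_candidates(dump: bytes, win=4096, lo=2.0, hi=7.5):
    """Yield (offset, entropy) for windows in the 'typical pixel data'
    band; lo/hi are guesses and would need tuning against real dumps."""
    for off in range(0, len(dump) - win + 1, win):
        e = shannon_entropy(dump[off:off + win])
        if lo <= e <= hi:
            yield off, e
```

Anything this flags still has to survive the format detection and pitch tuning stages below, so false positives here are cheap.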

Format Detection

»Assuming that we manage to actually classify parts of a byte stream as belonging to a pixel buffer«, we then have the matter of determining the underlying storage format. While it may be tempting to think that this would just be a matter of some reds, greens and blues, that is hopelessly naive and quite far from the harsher reality: depending on the display context (graphics adapter, output signal connector, state in the graphics processing chain, etc.) there is a large space of possible raw (uncompressed) pixel storage formats.

To name a few parameters that ought to at least be considered:

  • layout – interlaced, progressive, planar, tiled (tons of GPU and driver specific formats)
  • orientation – horizontal, vertical, bi-/quad-directional, polar(?).
  • numerical format – floating point, integral
  • channel count/depth – [1,3,4] channels * [8, 10, 15, 16, 24, 30, 32, …] bits per channel
  • color space – [monochromatic, RGB, YUV, CMYK, HSV, HSL] * [linear,
    non-linear] * [indexed (palette) or direct]
  • row-padding – pixel buffers may at times need to be padded to fit power-of-two or perhaps 16-byte vector instruction alignments, see also, Image stride.

Along with the possibility of interleaving additional buffers in the padding byte areas, and I’ve probably missed quite a few options here. Even compressed and uncompressed images can ‘look’ surprisingly similar:

One of the two images above is decodable without considering compression; the other is not.
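To get a feel for how quickly the parameter space above multiplies, here is a toy Python enumeration of my own (it covers only a thin slice of the real space, ignoring tiling, planar layouts, color space and orientation entirely) of candidate row strides for one guessed pixel width:

```python
from itertools import product

# A deliberately small slice of the raw-format parameter space;
# real GPU tiled/planar layouts are far messier than this.
CHANNELS = (1, 3, 4)   # mono, RGB, RGBA
BITS     = (8, 16)     # bits per channel
ALIGNS   = (1, 4, 16)  # row padding alignments

def candidate_strides(width_px):
    """Enumerate (channels, bits, align, stride_bytes) candidates for a
    guessed pixel width, rounding each row up to the alignment."""
    out = []
    for ch, bits, al in product(CHANNELS, BITS, ALIGNS):
        row = width_px * ch * (bits // 8)
        stride = (row + al - 1) // al * al
        out.append((ch, bits, al, stride))
    return out
```

Even this crippled subset yields 18 stride candidates per width guess, and every one of them changes how the buffer should be deinterleaved.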

Pitch Tuning and Edge Detection

After finding a possible pixel buffer and detecting or forcing a specific format, the next step is finding out what pitch (width) it has, and my guess is that we will need some heuristic (fancy word for evaluation function) to move forward.

This heuristic will, similarly to whatever search strategy is used in step 1, have its fair share of false positives, false negatives and, hopefully, true positives (matches). The images below are from three different automatic pitch detection runs with different testing heuristics.

(Pitch detection runs: stage1, stage2, stage3)

With the width figured out, all that remains is the matter of the beginning, the end (end – beginning = height) and an offset (because chances are that our search window landed a few bytes in, as is the case with the rightmost image above).

For the most part, this last step should be reducible to applying an edge detection filter (like a Sobel operator) and then looking for horizontal and vertical lines. Note: this will be something of a problem for multiple similar images (or gradients) stacked tightly after each other.

detected edges

The image above shows two copies of a horizontally easy- and vertically moderately difficult- case, with an edge detection filter applied to one of them.

Another alternative (or complement) for detecting the edges would be to do a histogram comparison on a per-row basis, as the row-to-row changes are usually rather small.
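A minimal sketch of that per-row histogram comparison, again as hypothetical Python and assuming the candidate buffer is already unpacked to a known stride: build a 256-bin histogram per row and flag rows where the normalized L1 distance to the previous row jumps. The 0.5 threshold is an illustrative guess.

```python
def row_histograms(buf, stride):
    """Per-row 256-bin byte histograms for an unpacked candidate buffer."""
    rows = len(buf) // stride
    hists = []
    for r in range(rows):
        h = [0] * 256
        for b in buf[r * stride:(r + 1) * stride]:
            h[b] += 1
        hists.append(h)
    return hists

def edge_rows(buf, stride, thresh=0.5):
    """Rows where the histogram distance to the previous row spikes."""
    hists = row_histograms(buf, stride)
    edges = []
    for r in range(1, len(hists)):
        d = sum(abs(a - b) for a, b in zip(hists[r - 1], hists[r]))
        if d / (2 * stride) > thresh:  # normalize L1 distance to [0, 1]
            edges.append(r)
    return edges
```

A flagged row is a candidate top or bottom edge of an image in the pile.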


OK, so developing, selecting, and evaluating solutions for all of the above is within the reasonable scope of a Ph.D. thesis, lovely. Massage the regular ‘needle in a haystack’ cash cows such as forensics for finding traces of child pornography or drawings of airport security checkpoints (while secretly polishing your master plan of scraping credit card numbers, passwords and extortion-able webcam sessions from GPU memory dumps – just sayin’) and the next few years are pretty much all lined up. Let's take a look at what Senseye has to offer on the subject.

Note: Senseye is not intended as an automated push-button tool but rather as a long list of instruments and measuring techniques for the informed explorer, so there are a lot of trade-offs to consider that remove the possibility of an optimal solution here — but the approach could of course be lifted into a dedicated tool.

Which parts of Senseye would be useful for experimenting with this? Well, for classification we have histogram matching and pattern matching. Both of them require some kind of reference picture related to what you are looking for; both are also rather young features. At the time of writing, they can’t load presets from external files (these have to be part of the sensor data stream) and the comparison functions included in the pattern matching feature all work on the form:

Reference image + per pixel comparison shader = 1-bit output. Sum output and compare against threshold value. Collect Underpants. Profit.


With the changes planned for 0.3, the ‘automatic search’ part can probably be fulfilled, so we’ll postpone the classification discussion until then. The screenshots above were taken by just bigram-mapping (first byte X, second byte Y) the same sample images used in the training grounds section further below. Judging from them, it seems like a lot of pictures could probably be classified by modelling bigrams as a distance function from the XY diagonal (or something actually clever; my computer graphics background is pretty much a mix of high-school level math, curiosity and insomnia).
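For what it is worth, that diagonal idea fits in a few lines; this is my own toy metric, not something Senseye ships:

```python
def bigram_diag_score(data: bytes) -> float:
    """Mean |x - y| over consecutive byte pairs, i.e. the average
    distance from the XY diagonal in a bigram plot, normalized to
    [0, 1]. Smooth pixel data tends to hug the diagonal."""
    if len(data) < 2:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(data, data[1:]))
    return total / (255 * (len(data) - 1))
```

Smooth gradients score close to 0, while uniform random bytes average around a third (E|X−Y| ≈ 255/3 for independent uniform bytes), so even a crude threshold separates the two.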

For tuning, we have the Picture Tuner, which has some limited capability for automatic tuning along with manual adjustments for the starting offset. Underlying design problems and coding issues related to working with a seriously outdated OpenGL version (2.1, because portability and drivers are shit) limit this in a few ways, the big one being that the image size must be > sample window width, and the maximum detectable width is also a function of the sample window size. This is usually not a problem unless you are looking for single icons.

The automatic tuner works roughly like this:

Unpack Shader (input buffer being RGBx) → Tuning Shader → Sample Tiles → CPU readback → Scoring Function.

This is just repeated with a brute-force linear search through the range of useful widths, keeping the one with the highest score. The purpose of the tile-sampling stage is to cut down on memory bandwidth requirements and cost per evaluation (conservative numbers are some 256x256x4 bytes per buffer, 4-5 intermediate copy steps and some 2000-3000 evaluations per image).

The included scoring function looks for vertical continuity, meaning run lengths of similar vertical lines, discarding single-coloured block results. As can be seen in the picture below, the wrong pitch will give sharp breaks in vertical continuity (a lower score). This also happens to favour borders and other common desktop screenshot features.
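A rough CPU-side rendition of that idea, as hypothetical Python rather than the shader-and-tiles pipeline described above: score each candidate width by the fraction of vertically adjacent bytes that stay similar, discard single-coloured blocks, and brute-force the width range. The similarity threshold of 8 is an arbitrary stand-in.

```python
def vertical_score(buf, width):
    """Fraction of vertically adjacent byte pairs that are 'similar';
    single-valued (single-coloured) buffers score zero."""
    rows = len(buf) // width
    if rows < 2:
        return 0.0
    if len(set(buf[:rows * width])) == 1:  # single-colour block
        return 0.0
    hits = total = 0
    for r in range(rows - 1):
        for c in range(width):
            total += 1
            if abs(buf[r * width + c] - buf[(r + 1) * width + c]) < 8:
                hits += 1
    return hits / total

def tune_pitch(buf, widths=range(16, 2048)):
    """Brute-force linear search over candidate widths, keeping the
    highest-scoring one, roughly as described above."""
    return max(widths, key=lambda w: vertical_score(buf, w))
```

The real tuner evaluates sampled tiles on the GPU precisely because doing this naively over whole buffers, as here, gets expensive fast.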


Training Grounds

We will just pick a common enough combination and see where the rabbit hole takes us: RGB color format, progressive, 8 bits per channel, vertically oriented. We strip the alpha channel just because having it there would be cheating (hmm, every 4th byte suddenly turns into 0xff or has very low entropy, what could it ever be?).
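That ‘every 4th byte’ giveaway is easy to test for mechanically. Here is a hypothetical helper (again, not part of Senseye) that looks for a near-constant byte plane at each offset modulo the assumed pixel stride:

```python
def alpha_stride_offset(buf, stride=4):
    """Return (offset, dominant_value, frequency) for the most constant
    byte plane among offsets modulo `stride`; an RGBA buffer typically
    shows up as every 4th byte being 0xff."""
    best = (0, 0, 0.0)
    for off in range(stride):
        plane = buf[off::stride]
        if not plane:
            continue
        counts = {}
        for b in plane:
            counts[b] = counts.get(b, 0) + 1
        val, n = max(counts.items(), key=lambda kv: kv[1])
        freq = n / len(plane)
        if freq > best[2]:
            best = (off, val, freq)
    return best
```

A plane with frequency near 1.0 at some offset is a strong hint of a 4-byte pixel format with a constant alpha channel, which is exactly why we strip it from the needles.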

Before going out into the harsh real world, let's take a few needles:

(Needle images: SONY DSC, snake, sala centrifughe)

Note that the bunny snake (wouldn’t that be a terrifying cross-breed?) has some black padding added to simulate the “padding for alignment” case. Run them through ImageMagick's ‘convert’ utility to get a raw r8g8b8 out:

convert snake.png snake.rgb

and sandwich them between some nice slices of CSPRNG (because uniforms are hot) like this:

# wrap the noise writer in a function so a fresh $RANDOM is drawn per slice
noise() {
  /bin/dd if=/dev/urandom of=pile conv=notrunc \
    bs=1024 count=$(expr $RANDOM % 512 + 64) oflag=append
}

for fn in ./*.rgb ; do
  noise
  /bin/dd if="$fn" of=pile conv=notrunc oflag=append
done
noise

Loading it (from the build directory of a senseye checkout):

arcan -p ../res ../senseye &
sense_file ./pile

And, like all good cooking shows, this video here is a realtime recording of how it would go down:

There are a lot of corner cases that are not covered, as you can see in the first attempt to autotune Mr. Bunny-snake; the SCADA image usually takes more tries due to the high number of continuous regions that cause the evaluation tiles to be ignored. Running through the list of everything that is flawed or too immature in this process would make this already too lengthy post even worse, but good starting points would be the tile management: placement needs to be smarter, evaluation quicker, and readback transfers should pack multiple evaluation widths into the same buffer (the readback synchronization cost is insane).

Among the many things left as an exercise for the reader: try to open some pictures in a program of your liking, dump process memory with your favourite tool (I am slightly biased towards ECFS because of all the other cool features in the format) and try to dig out the pixels. When that is over and done with, look into something a lot more hardcore (which won’t work by just clicking around in the tools provided):

2015 DFRWS Forensic Challenge

Posted in Uncategorized | Leave a comment

Senseye 0.2

Senseye has received quite a lot of attention, fixes and enhancements over the last few months and is well overdue for a new tagged release. Highlights (not including tweaks, performance boosts, UI and bugfixes) since last time include:

  • Translators: It is now possible to connect high-level decoders that track selected cursor position in data window to give more abstract views e.g. Hex, ASCII, Disassembly (using Capstone)
  • New sensor: MFile – This sensor takes multiple input files of (preferably) the same format and shows them side by side in a tile-style layout, along with a support window that highlights
  • New measuring tool: Byte Distance – This view looks a lot like the normal histogram, but each bin shows the distance in bytes from a set marker position to the next occurrence of each possible value.
  • New visual tool: Picture Tuner – This tool is used for manually and/or automatically finding raw image parameters (stride, color format and so on)
  • Pattern Matching: Pattern finding using a visual pattern reference (like n-gram based mapping) and/or histograms matching.
  • Improved seeking and playback control: multiple stepping sizes to choose from, along with the option to align to a specific value.
  • The file sensor now updates the preview window progressively, and works a lot better with larger (multi-gigabyte) input sources.

True to form, the “quick” demo video below tries to show off all the new features. Be sure to watch with annotations enabled. In addition, a more detailed write-up on the picture tuner will be posted in a day or two. Make sure that you sync and rebuild Arcan before trying this out, as a lot of core engine changes have been made.

  • 0:00 – 1:20, MFile sensor.
  • 1:20 – 2:50, File enhancements, Coloring, Histogram Updates, Byte Distance.
  • 2:50 – 5:40, Memsense updates, Translator feature.
  • 5:40 – 7:00, Picture Tuner.

Next Experiment, Senseye

The development strategy behind Arcan has always been to work with experimental proofs of concept doing ‘traditional’ tasks in odd ways and to use that as feedback to refactor and improve the engine, API, testing and documentation.

For instance — the video decoding, encoding and tagging experiments added process separation and greatly helped shape the shared memory interface. The arcade frontend experiments à la Gridle improved support for odd input combinations (2 mice, 3 keyboards and 5 gamepads? not a problem) and for synchronization with time-sensitive processes (the libretro frameserver) where buffering and other common solutions were not available. The AWB experiments helped define controlled and segmented data sharing, along with performance considerations in tricky UI situations (hierarchies of windows where size and position rely on dynamic sources; drag+resize and watch hell break loose).

The end goal (getting a portable, graphics-focused backend for putting together embedded, mobile and desktop system interfaces that balances security, performance, stability and no-nonsense ease of use) is still out of reach, but great strides have been made. The last couple of months have mostly been spent documenting, testing and working with the corner cases that dynamic multiscreen entails.

The next experiment in this regard is Senseye, which is targeted towards the more rugged of computing travellers; the reverse engineers, the security ‘enthusiasts’ and the system analysts. It is a tool for navigating and controlling non-native representations of large, unknown, binary data streams and blocks. Both statically in terms of files and dumps, and dynamic through live acquisition of memory from running processes.

The video above shows senseye being used to navigate a suspicious binary and to poke around the memory pages allocated by pid 1 (still init, though in its twilight..).


Removing the Scaffolding

Those of you monitoring the Github page have likely noticed that a lot of things have changed during the last few days. The lack of activity on this page in recent months does not correlate with a lack of activity in the project overall; on the contrary.

The upcoming version, 0.5, was planned to be the last version that had anything directly to do with emulators, but that plan was scrapped in favour of moving the project to its next intended phase slightly ahead of schedule. The previous emulation / HTPC frontend work was always intended as an intermediate target for aggressive prototyping when figuring out what features were actually needed in the Lua API — not as a goal in and by itself.

At the moment, the contents of the git are unsuitable for emulation and frontend purposes — stick with the latest tagged release for that. The tools and scripts used for emulation are being updated and moved to a separate Git, and the plan is still to make sure those of you that are using gridle/awb/other projects can continue to do so while still taking advantage of advances in the engine.

What is this ‘next phase’ then? Well, some of the details will be left hanging for a while longer, but one of the major end-goals for this project is to find a different approach to how desktop environments should work by taking advantage of some of the good parts from the game development world (API design, performance, minimizing latency), emulation (state management, odd and weird input devices), the X Windows System (data sharing, allowing window managers to provide a choice in interaction schemes) and other areas.

To see this plan in another perspective:

Part of this goal is also to reduce the barrier-to-entry for developing real-time graphics applications, even desktop environments, and especially ones targeting the recent onslaught of low to medium powered ‘all in one board’ devices like the Raspberry Pi.

The other side of this is to act as a counterweight to current development trends towards ‘fixing the desktop’. One such trend is embedding the most overly complex pile of legacy and liability that humanity has produced so far (also known as “the web browser”) everywhere — ultimately ending up in a place where nothing works if you’re behind a malicious DNS server, forcing you to constantly relearn how to interact with it because the interface had yet another lousy morning and decided to rearrange itself by moving the features you used the most to the place where you are the least likely to find them.

Another one, which is possibly even more damaging, is trying to ‘fix things’ by introducing hordes of PolicyServiceKitManagerSystemInstanceFactoryBusObserver-dependent applications that try to automate actions on behalf of the user in response to some “complex” dynamic system event, like the user plugging <some device> into a USB port (assuming that means the user wants to mount the filesystem on the device and sync all its contents to a cloud storage service while being given helpful shopping advice based on the pictures stored) — a world where every little component is unknowingly responsible for something that is unrelated to its own function and thus unable to take responsibility for malfunction, so no-one will be pushed to act on the fact that your music player in the background grinds to a halt because the SoundServiceSystemManagerListener live-locked when your network connection dropped because you moved to a different WiFi access point.

The ambition here is, in stark contrast to the two scenarios mentioned above, to provide the user with the tools necessary to easily and clearly convey his or her actual intent behind the action that the system observed (some device was plugged in), along with the tools to grant applications only temporary and limited access to selected parts of the user’s data, and to upgrade the UNIX pipes-and-filters model of ‘do one thing but do it well’, rather than to try and replace it with a selected mix of all the bad parts associated with Windows and Android.

But that is a longer story — to be continued.


Not Dead

Far from it! Unfortunately, a whole lot of the changes ahead are a step up in technical difficulty and low on visual goodies. The tentative plan at the moment is a 0.5 push in August, which depends a bit on how long it takes to port all the internal changes to the Windows platform and on how hard I fail at cryptography engineering in the networking protocols.

As an example of the current state of new features, the screenshot below (from the AWB scripts) illustrates:

  1. An external, non-authoritative connection of lightweight arcan (LWA) running ‘Gridle’, where the engine is used as both display server/window manager and has another version of itself connected. LWA mode still lacks sound, as I have to write an OpenAL-soft backend driver that uses the shared memory API (the coordinate system is messed up in the screenshot).
  2. A remoting frameserver connected to a VirtualBox headless VM running PC-BSD over VNC.
  3. A VNC server that (using a version of the recording tool) maps the LWA connection from 1 and the client from 2, with input translation and all.
Preview of some of the upcoming features in 0.5


Looking Ahead

While there are quite a few enhancements planned, as is evident from the issue tracker on github, the subset currently being focused on and what it will be used for is not quite as clear. A new release is many weeks away, but here is the current focus and a status update of sorts:

“Big changes”:

  • Extending the shared memory interface to accept multiple input/output segments for one frameserver. This will, in the libretro and hijack cases, be used for in situ texture manipulation and, in the longer perspective, allow support for multiple windows (some MAME cases..) and for streaming network A/V transfers (minor details and a lot of testing left).
  • Extending the shared memory interface to accept (foreign, external, “non-authoritative”) connections. This is partially related to the previous point but will, under certain conditions, allow external processes to connect and act as any other frameserver even though the trust domain has not been explicitly defined beforehand (as is done with the database for target_launch today). (minor details and a lot of testing left).
  • Extending the platform support to EGL/KMS/ARM/… Parts of this were present, but not really emphasized, in the 0.4 release in that we have support for EGL and, in particular, the Raspberry Pi brand. This was an experimental step to determine how much work would be needed to get things working on more competent ARM platforms. The big things still missing here are BSD/Linux raw /dev/input-style support for various devices, along with select optimizations for some of the math-intensive parts (particularly a few ARM NEON/softfpu implementations of 4x4 matrix-matrix and 4x4 matrix-vec3 transforms).
  • Getting recursive. Arcan will be capable of using itself as a “hardware” platform, meaning that the shared memory interface will be used for A/V/input, and that the engine can launch itself (with different sets of scripts) so that it fills the role of desktop environment, compositor and “app engine” all in one. This is experimentally working for video (with some manual labour), but there is some way to go still with audio (~a day's work to hack together an OpenAL-soft backend) and input (~another day to translate events properly).
  • Engine optimization: tracking changes on a per-rendertarget basis and only updating when something really has changed, only updating dirty areas in the 2D pipeline (quadtree with batching), and “blit-vid-to-vid” using FBOs as an intermediary for dynamic sprite sheets. Clipping, finally, will get a cheaper version for CLIP_SHALLOW on non-rotated objects that works without needing the stencil buffer.

At the same time, some parallel projects are being worked on (like a BSD/Linux terminal emulator, no points for guessing why), with the bigger one being to integrate and tune all the existing networking code for state synchronisation (aka netplay).


0.4.0 Released

This was long overdue, but I finally got satisfied enough to tag a new release. There is a lot of engine “under the covers” work in this one with regard to the Lua API, documentation, build systems, portability and performance, to prepare for things to come. Among the features that some end users might appreciate, we have:

  • enhanced configuration, filtering and tuning for analog input devices
  • support for 3D libretro cores
  • support for libretro core options
  • mouse-analog device translation
  • (AWB) per game/per target coreoptions/input configuration
  • (Gridle) custom layout can now include the game display as part of the layout

No video this time around, but the individual wikis (AWB, Gridle) have annotated screenshots to better explain the features.

The boring and full release notes can be found here: [Full Release Notes]

Arcan Workbench Screenshot (0.4.0)

Customview Layout Editor (0.4.0)

N64 Libretro core in Gridle internal launch

Infrastructure-wise, we are now rid of spam-forge. The main repository will be kept @ github and binary releases @ bintray.

This is the last version where Gridle / AWB will be released synchronously and bundled. They now have their own github repositories, so you can get updates more frequently than the release cycle of the engine. In the coming versions, we’ll focus more on social features (netplay, …), ARM devices (and living life without X) and a package format.

Download links are in the menu above.

Enjoy /B
