An IRC client in your motherboard



Note
This post was discussed on Hacker News and Lobsters.

I made a graphical IRC client that runs in UEFI. It’s written in Rust and leverages the GUI toolkit and TrueType renderer that I wrote for axle’s userspace. I was able to develop it thanks to the vmnet network backend that I implemented for QEMU. I’ve published the code here.

You can connect to an IRC server, chat and read messages, all from the comfort of your motherboard’s pre-boot environment.

“Why”? What kind of question is “why”?

A quick refresher on UEFI

The bootloader for any OS is itself loaded with the help of firmware that’s stored in ROM on the motherboard. Back in the Bad Old Days, this motherboard firmware was a BIOS implementation. This pre-boot environment is often thought of as the first major step in the computer starting up.

BIOS imposed a bunch of annoying limitations, so an industry consortium came up with a new standard to replace it, UEFI.

Note
For example, BIOS mandates that the bootloader starts off in 16-bit mode, as all APs (application processors) ostensibly do. It also imposes weird legacy requirements on the bootloader, such as mandating a self-contained first-stage loader program that fits in 512 bytes.

Just like the BIOS, a UEFI implementation is shipped on each motherboard in ROM. This UEFI firmware provides an environment for the operating system’s bootloader to run in, and provides various APIs that the bootloader can leverage to do its thing.

UEFI is a massive step forwards from BIOS! The bootloader is dropped into a 64-bit environment from the get-go, and UEFI provides tons of helpful APIs for switching VESA display resolutions, allocating memory, and interacting with the EFI filesystem.

UEFI is also somewhat maligned for being over-engineered.

Network boot

One use case that’s kind of fun is that some bootloaders allow the operating system to be loaded over the network, instead of being loaded from a stored installation on a local block device. This can be useful in some corporate environments.

Supporting this use case means that the UEFI firmware ships a network stack, complete with NIC drivers and a TCP implementation, and exposes APIs to interact with this stack directly to any applications running in the pre-boot environment.

Of course, there’s no obligation for the bootloader to actually load an operating system. Behold, social media!

Rust networking in UEFI

The most finicky part of this project by far was implementing a client for UEFI’s TCP protocol in Rust. Making this all work with its scatter-gather buffers was quite tricky.

The nuts and bolts of actually using UEFI’s TCP protocol can be fairly wacky, especially when trying to explain the lifetimes and data interactions to Rust. UEFI’s TCP protocol design enforces the use of global state and re-entrant callbacks, scatter-gather buffers, and an involved set of concepts (events, tokens, handles, protocols, oh my!).

I spent a number of days carefully testing my Rust code to make sure I wasn’t leaking memory, and to squash TCP receive buffer UAFs.

As an example of how the UEFI programming model can be somewhat obtuse, how do you think this UEFI API is meant to be used?

Note
I’ve Rustified the syntax and simplified the API for readability. Here’s part of the UEFI API in question.

If you gave this to me on a paper napkin, the behavior I’d expect is pretty straightforward:

  1. Specifying NOTIFY_SIGNAL will invoke my callback when an event occurs.
  2. Specifying NOTIFY_WAIT, then calling wait(), will block until an event occurs, invoke my callback, then continue execution.

This is not at all what UEFI does! Here’s how it really works:

  1. If you specify NOTIFY_SIGNAL, UEFI will invoke the callback when the event occurs. Using wait() will raise an error.
  2. If you specify NOTIFY_WAIT, then call wait(), UEFI will invoke your callback whenever it feels like it, multiple times, until the event occurs. Then, the wait() call will unblock.

This is quite confusing, because the callback has completely different semantics depending on which listening mode you use.
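To make the difference concrete, here is a toy model of the two modes. It is a sketch under stated assumptions, not the real UEFI API: `ToyEvent`, `NotifyMode`, `WaitError`, and `polls_until_signaled` are all hypothetical names, and the firmware's polling is reduced to a simple counter.

```rust
/// Toy model of the two notification modes. This is NOT the real UEFI API:
/// every name here is hypothetical and only illustrates the semantics.
#[derive(Clone, Copy, Debug, PartialEq)]
enum NotifyMode {
    Signal,
    Wait,
}

#[derive(Debug, PartialEq)]
enum WaitError {
    /// Waiting on a NOTIFY_SIGNAL event is rejected.
    InvalidParameter,
}

struct ToyEvent<F: FnMut()> {
    mode: NotifyMode,
    callback: F,
    /// How many times the firmware polls before the event becomes signaled.
    polls_until_signaled: u32,
}

impl<F: FnMut()> ToyEvent<F> {
    fn new(mode: NotifyMode, polls_until_signaled: u32, callback: F) -> Self {
        ToyEvent { mode, callback, polls_until_signaled }
    }

    /// Models the firmware signaling the event.
    /// Only NOTIFY_SIGNAL events run their callback at this point.
    fn signal(&mut self) {
        self.polls_until_signaled = 0;
        if self.mode == NotifyMode::Signal {
            (self.callback)();
        }
    }

    /// Models wait(): the callback runs once per poll while the event is
    /// still pending, and wait() returns once the event is signaled.
    fn wait(&mut self) -> Result<(), WaitError> {
        if self.mode == NotifyMode::Signal {
            return Err(WaitError::InvalidParameter);
        }
        while self.polls_until_signaled > 0 {
            (self.callback)();
            self.polls_until_signaled -= 1;
        }
        Ok(())
    }
}
```

In this model, a `Wait` event that needs three polls runs its callback three times before `wait()` returns, while a `Signal` event refuses `wait()` outright and runs its callback exactly once when signaled.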

Note
The UEFI C API uses the typical pattern of accepting a function pointer and an opaque context pointer to manage callbacks. I used Sage Griffin’s excellent trick to neatly trigger Rust closures as UEFI callbacks here.
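A minimal sketch of that closure trick, as it is commonly described: a generic trampoline function is monomorphized per closure type, and the closure itself travels through the opaque context pointer. The names `RawCallback`, `trampoline_for`, and `fake_firmware_fire` are hypothetical, and plain `extern "C"` stands in for the UEFI calling convention, so treat this as an illustration of the pattern rather than UEFIRC's actual code.

```rust
use std::ffi::c_void;

/// C-style callback signature: a function pointer plus an opaque context.
/// (Assumption: real UEFI callbacks use a different ABI and extra arguments.)
type RawCallback = unsafe extern "C" fn(context: *mut c_void);

/// One copy of this function exists per closure type `F`.
/// It casts the opaque context back to `F` and calls it.
unsafe extern "C" fn trampoline<F: FnMut()>(context: *mut c_void) {
    let closure = unsafe { &mut *(context as *mut F) };
    closure();
}

/// Returns the trampoline that matches the closure's concrete type.
/// The argument is only used to let the compiler infer `F`.
fn trampoline_for<F: FnMut()>(_closure: &F) -> RawCallback {
    trampoline::<F>
}

/// Hypothetical stand-in for firmware that holds a callback and a context
/// pointer, then invokes the callback some number of times.
fn fake_firmware_fire(callback: RawCallback, context: *mut c_void, times: usize) {
    for _ in 0..times {
        // Safety: the caller guarantees `context` points to a live closure
        // of the type the trampoline was instantiated for.
        unsafe { callback(context) };
    }
}
```

The caller passes `trampoline_for(&closure)` as the function pointer and `&mut closure as *mut _ as *mut c_void` as the context, and must keep the closure alive for as long as the callback can fire.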

According to the UEFI specification, when using NOTIFY_SIGNAL, the callback’s world-model should be “The event has occurred, it’s time to do the next thing!”

When using NOTIFY_WAIT, the callback’s world-model should be “The event still hasn’t happened, I should poke or prod something to move things along.”

In other words, UEFI allows you to flip a switch to vacillate between two completely different callback paradigms, one of which is quite nontraditional, and neither of which provides the ‘block until ready’ behavior that both their names sort of imply.

The name on the tin is literally NOTIFY_WAIT, but if you expect it to notify you after the wait() completes, you’ll be in for a bout of confusion.

The actual behavior is spelled out in the docs, but you need to read the WaitForEvent docstring quite closely.

It was a huge pain to successfully set things up such that I could asynchronously buffer received packet data.

One sensible way to structure this would be to set up a NOTIFY_SIGNAL callback that appends to a buffer each time it’s triggered. Unfortunately, due to Rust’s borrowing rules, this was painful to get working.

I ended up with a NOTIFY_WAIT loop that also includes a short timeout timer. Each pass of the event loop either trips the timer and times out, or receives some packet data and appends it to the buffer. If the timer tripped, we don’t remove the previous RX transfer handle, as the underlying UEFI implementation didn’t like that.

Cursor support

While a mouse isn’t a strict requirement for an IRC client, having one makes the whole app feel more interactive. I used UEFI’s Simple Pointer Protocol to read mouse movement and button presses, and included visual feedback on the cursor’s current position in the GUI.

Note
Unfortunately, the Simple Pointer Protocol doesn’t support scroll wheels. To scroll in UEFIRC, you can either use the arrow keys, or drag the scroll bar with the cursor.

If you try to use the Simple Pointer Protocol with the OVMF UEFI firmware, you won’t manage to get any mouse events, and you won’t get many helpful errors from the API.

Note
Here’s an example from the OSDev forums in which someone notices that the Simple Pointer Protocol is unusable with the standard OVMF build.

I compiled a custom UEFI firmware build that had all the right drivers and protocols compiled in (particularly UsbMouseDxe), which allowed me to get on with it. To make it easier for others to try out UEFIRC in QEMU, I’ve also uploaded my UEFI firmware to the repo.

Mouse drivers report a change in the mouse’s position. A naive way to code up a mouse cursor might be something like:
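A minimal sketch of the naive approach, assuming a hypothetical `NaiveCursor` type and a screen size passed in by the caller (neither is taken from UEFIRC's code): the reported movement is simply added to the position, one pixel per unit.

```rust
/// Hypothetical cursor state; the names are illustrative only.
#[derive(Debug, PartialEq)]
struct NaiveCursor {
    x: i32,
    y: i32,
}

impl NaiveCursor {
    /// Adds the reported movement directly to the position (1:1 scaling),
    /// keeping the cursor inside a screen of the given size.
    fn apply_relative_move(&mut self, dx: i32, dy: i32, screen_w: i32, screen_h: i32) {
        self.x = (self.x + dx).clamp(0, screen_w - 1);
        self.y = (self.y + dy).clamp(0, screen_h - 1);
    }
}
```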

As it turns out, this ends up feeling quite sluggish. Operating systems tend to use an approach more like this:
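One plausible form of the scaled approach is sketched below. The `1 + log2(|delta|)` factor is an assumption chosen for illustration (the exact formula may well differ), and `scale_delta` and `ScaledCursor` are hypothetical names. Single-pixel nudges pass through unchanged, while larger movements are amplified.

```rust
/// Scales a relative movement so that larger movements are amplified.
/// The `1 + log2(|delta|)` factor is an assumed formula, for illustration.
fn scale_delta(delta: i32) -> i32 {
    if delta == 0 {
        return 0;
    }
    let magnitude = f64::from(delta.unsigned_abs());
    // log2(1) == 0, so a one-unit movement keeps a factor of exactly 1.
    let factor = 1.0 + magnitude.log2();
    (f64::from(delta) * factor).round() as i32
}

/// Hypothetical cursor state; the names are illustrative only.
#[derive(Debug, PartialEq)]
struct ScaledCursor {
    x: i32,
    y: i32,
}

impl ScaledCursor {
    /// Applies the scaled movement and keeps the cursor on screen.
    fn apply_relative_move(&mut self, dx: i32, dy: i32, screen_w: i32, screen_h: i32) {
        self.x = (self.x + scale_delta(dx)).clamp(0, screen_w - 1);
        self.y = (self.y + scale_delta(dy)).clamp(0, screen_h - 1);
    }
}
```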

People tend to like log2 scaling because you can get where you’re going quicker: drag with confidence, and the mouse will fly across the screen, while still allowing for fine adjustments when homing in on an area more slowly. It’s also just what most people are used to. When presented with a linear movement scaling cursor, people tend to perceive the whole environment as slow and unresponsive.

Modelling IRC messages

Modelling the IRC messages was straightforward and pleasant. IRC uses a textual, line-based format that’s easy to parse, though it’s clearly encumbered by decades of slow expansion, only some of which is standardized.
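As a rough illustration of how little parsing the line format needs, here is a sketch that splits a line into the classic `[:prefix] COMMAND params [:trailing]` layout. `IrcMessage` and `parse_irc_line` are hypothetical names rather than UEFIRC's actual types, and the sketch ignores extensions such as message tags.

```rust
/// Hypothetical model of one IRC line; not UEFIRC's actual type.
#[derive(Debug, PartialEq)]
struct IrcMessage {
    prefix: Option<String>,
    command: String,
    params: Vec<String>,
}

/// Parses `[:prefix] COMMAND param1 param2 [:trailing text]`.
/// Returns None if the line has no command.
fn parse_irc_line(line: &str) -> Option<IrcMessage> {
    let mut rest = line.trim_end_matches(|c: char| c == '\r' || c == '\n');

    // An optional prefix starts with ':' and runs to the first space.
    let mut prefix = None;
    if let Some(stripped) = rest.strip_prefix(':') {
        let (p, remainder) = stripped.split_once(' ')?;
        prefix = Some(p.to_string());
        rest = remainder;
    }

    // The trailing parameter starts at " :" and may contain spaces.
    let (middle, trailing) = match rest.split_once(" :") {
        Some((m, t)) => (m, Some(t)),
        None => (rest, None),
    };

    let mut parts = middle.split(' ').filter(|s| !s.is_empty());
    let command = parts.next()?.to_string();
    let mut params: Vec<String> = parts.map(str::to_string).collect();
    if let Some(t) = trailing {
        params.push(t.to_string());
    }

    Some(IrcMessage { prefix, command, params })
}
```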

Using libgui in UEFI

It wasn’t too bad to get my GUI toolkit running in UEFI, as I’ve already done most of the heavy lifting to make axle’s Rust GUI toolkit available in contexts other than axle itself. The first bulk of work here was providing an implementation of AwmWindow that can be used from within UEFI. After that, most of libgui comes for free, including event management, font rendering, layer compositing, view decorations, and tricky components like scroll views.

Scroll bars

axle’s Rust-based libgui toolkit came after axle’s C-based libgui toolkit, and the C toolkit still boasts a few features that I haven’t caught up to in the Rust version yet.

For example, the C toolkit displays these nice scroll bars on scroll views.

Since UEFIRC’s primary interaction takes place in a scrolling view filled with text, I couldn’t do without this any longer. I reimplemented scroll bar functionality in the Rust libgui, with the famous ‘tuck-in’ behavior at the top and bottom of the viewport that has made axle the OS of choice for the hip and fashionable the world over.

Text rendering on scroll views

To make UEFIRC usable, I had to make some minor, but notable, changes to how text gets rendered to scrolling views. To see why, let’s first take a look at a simpler case of representing and manipulating pixels.

Representing pixel data is easy if all you have to worry about is a fixed-size rectangle of content. Imagine a buffer with width * height elements, containing RGB data.

When we need to render this rectangle of content somewhere, it’s straightforward to conceptualize copying the buffer corresponding to the desired rectangular region.
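A minimal sketch of the fixed-size case, assuming a row-major layout and hypothetical `Rgb` and `PixelBuffer` types (libgui's real layer types are not shown here): a pixel lives at index `y * width + x`, and rendering a region elsewhere is a clipped copy.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rgb {
    r: u8,
    g: u8,
    b: u8,
}

/// Hypothetical fixed-size layer: `width * height` pixels, row-major.
struct PixelBuffer {
    width: usize,
    height: usize,
    pixels: Vec<Rgb>,
}

impl PixelBuffer {
    fn new(width: usize, height: usize, fill: Rgb) -> Self {
        PixelBuffer { width, height, pixels: vec![fill; width * height] }
    }

    /// Writes one pixel; out-of-bounds writes are ignored.
    fn putpixel(&mut self, x: usize, y: usize, color: Rgb) {
        if x < self.width && y < self.height {
            self.pixels[y * self.width + x] = color;
        }
    }

    /// Reads one pixel, or None if the coordinate is out of bounds.
    fn getpixel(&self, x: usize, y: usize) -> Option<Rgb> {
        if x < self.width && y < self.height {
            Some(self.pixels[y * self.width + x])
        } else {
            None
        }
    }

    /// Copies the `w` x `h` region at (src_x, src_y) into `dest` at
    /// (dst_x, dst_y), clipping against both buffers.
    fn copy_rect_to(
        &self,
        src_x: usize,
        src_y: usize,
        w: usize,
        h: usize,
        dest: &mut PixelBuffer,
        dst_x: usize,
        dst_y: usize,
    ) {
        for row in 0..h {
            for col in 0..w {
                if let Some(color) = self.getpixel(src_x + col, src_y + row) {
                    dest.putpixel(dst_x + col, dst_y + row, color);
                }
            }
        }
    }
}
```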

Views that allow scrolling their content are much more difficult.

With a fixed-size rectangle, we never need to think twice about how much memory we’ll need to allocate for the pixel buffer. With a scroll view, however, all of a sudden we have an infinitely extensible canvas to think about. Should we impose a ‘maximum size’ on the scroll view and allocate a huge buffer upfront? Should we resize a backing buffer that grows as we draw more content into it?

The approach for scroll views that I went with in axle’s Rust GUI toolkit is based on ‘tiles’ of content. Each tile is a square pixel buffer, a few hundred pixels wide, and is the fundamental unit for scroll views.

Each time we draw a bit of graphical content to a scroll view, we first allocate the tiles necessary to display the corresponding visual area.

When rendering a scroll view to another layer, the visible tiles are computed and stitched together into a final image.

This allows the scroll view to expand without bound, and ensures that the scroll view only allocates pixel buffer area that actually contains rendered content, instead of allocating pixel buffer memory for any empty space that the user could scroll to.
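The tile scheme can be sketched roughly as follows. This assumes a 256-pixel tile size, a hash map keyed by tile coordinate, and hypothetical names (`TiledLayer`, `Tile`, `visible_tiles`); the real libgui data structures may look quite different.

```rust
use std::collections::HashMap;

/// Assumed tile edge length in pixels.
const TILE_SIZE: i64 = 256;

/// One square tile of packed RGB pixels.
struct Tile {
    pixels: Vec<u32>,
}

impl Tile {
    fn new() -> Self {
        Tile { pixels: vec![0; (TILE_SIZE * TILE_SIZE) as usize] }
    }
}

/// Hypothetical scroll layer that only allocates tiles that were drawn to.
#[derive(Default)]
struct TiledLayer {
    tiles: HashMap<(i64, i64), Tile>,
}

impl TiledLayer {
    /// Maps a pixel coordinate to the coordinate of the tile containing it.
    fn tile_coord(x: i64, y: i64) -> (i64, i64) {
        (x.div_euclid(TILE_SIZE), y.div_euclid(TILE_SIZE))
    }

    /// Plots a pixel, allocating the containing tile on first use.
    fn putpixel(&mut self, x: i64, y: i64, color: u32) {
        let tile = self.tiles.entry(Self::tile_coord(x, y)).or_insert_with(Tile::new);
        let (lx, ly) = (x.rem_euclid(TILE_SIZE), y.rem_euclid(TILE_SIZE));
        tile.pixels[(ly * TILE_SIZE + lx) as usize] = color;
    }

    /// Reads a pixel, or None if nothing was ever drawn in that tile.
    fn getpixel(&self, x: i64, y: i64) -> Option<u32> {
        let tile = self.tiles.get(&Self::tile_coord(x, y))?;
        let (lx, ly) = (x.rem_euclid(TILE_SIZE), y.rem_euclid(TILE_SIZE));
        Some(tile.pixels[(ly * TILE_SIZE + lx) as usize])
    }

    /// Coordinates of the allocated tiles that intersect a viewport.
    fn visible_tiles(&self, origin_x: i64, origin_y: i64, width: i64, height: i64) -> Vec<(i64, i64)> {
        let (min_tx, min_ty) = Self::tile_coord(origin_x, origin_y);
        let (max_tx, max_ty) = Self::tile_coord(origin_x + width - 1, origin_y + height - 1);
        let mut visible: Vec<(i64, i64)> = self
            .tiles
            .keys()
            .copied()
            .filter(|&(tx, ty)| tx >= min_tx && tx <= max_tx && ty >= min_ty && ty <= max_ty)
            .collect();
        visible.sort();
        visible
    }
}
```

Drawing near the top of the content and again several thousand pixels further down allocates just two tiles in this sketch; the empty space between them costs nothing.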

This is all to say: it’s a lot more expensive to plot pixels to scrolling views than to fixed-size views, because we have to do more work to maintain the representation.

Drawing graphics primitives to scroll views isn’t too bad, because the scroll view can pre-determine which tiles it’ll need to populate upfront. For example, if a caller asks to draw a circle at a given origin and radius:
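A sketch of how that might look, under stated assumptions: the tile size, the `ScrollLayer` type, the `allocation_passes` counter, and `plot_preallocated` are hypothetical, and only the name `allocate_tiles_to_cover_bounding_box` comes from the toolkit. The body shown here is a guess at its behavior.

```rust
use std::collections::HashMap;

/// Assumed tile edge length in pixels.
const TILE_SIZE: i64 = 256;

#[derive(Default)]
struct ScrollLayer {
    tiles: HashMap<(i64, i64), Vec<u32>>,
    /// Counts how often the expensive allocation path runs.
    allocation_passes: usize,
}

impl ScrollLayer {
    /// Ensures every tile touched by the bounding box exists.
    fn allocate_tiles_to_cover_bounding_box(&mut self, min_x: i64, min_y: i64, max_x: i64, max_y: i64) {
        self.allocation_passes += 1;
        for ty in min_y.div_euclid(TILE_SIZE)..=max_y.div_euclid(TILE_SIZE) {
            for tx in min_x.div_euclid(TILE_SIZE)..=max_x.div_euclid(TILE_SIZE) {
                self.tiles
                    .entry((tx, ty))
                    .or_insert_with(|| vec![0; (TILE_SIZE * TILE_SIZE) as usize]);
            }
        }
    }

    /// Writes a pixel into a tile that is assumed to exist already.
    fn plot_preallocated(&mut self, x: i64, y: i64, color: u32) {
        let key = (x.div_euclid(TILE_SIZE), y.div_euclid(TILE_SIZE));
        let idx = (y.rem_euclid(TILE_SIZE) * TILE_SIZE + x.rem_euclid(TILE_SIZE)) as usize;
        if let Some(tile) = self.tiles.get_mut(&key) {
            tile[idx] = color;
        }
    }

    /// Fills a circle: one allocation pass for the whole bounding box,
    /// then cheap per-pixel writes.
    fn fill_circle(&mut self, cx: i64, cy: i64, radius: i64, color: u32) {
        self.allocate_tiles_to_cover_bounding_box(cx - radius, cy - radius, cx + radius, cy + radius);
        for y in (cy - radius)..=(cy + radius) {
            for x in (cx - radius)..=(cx + radius) {
                let (dx, dy) = (x - cx, y - cy);
                if dx * dx + dy * dy <= radius * radius {
                    self.plot_preallocated(x, y, color);
                }
            }
        }
    }
}
```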

The call to allocate_tiles_to_cover_bounding_box() is fairly expensive, but we only have to pay it once for the entire shape. The absolute pathological worst case is putpixel():
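To illustrate why, here is a sketch in which `putpixel()` has to treat each pixel as its own 1x1 bounding box. As before, the `ScrollLayer` type, the tile size, the `allocation_passes` counter, and the helper `fill_rect_pixel_by_pixel` are hypothetical, and the allocation function's body is a guess.

```rust
use std::collections::HashMap;

/// Assumed tile edge length in pixels.
const TILE_SIZE: i64 = 256;

#[derive(Default)]
struct ScrollLayer {
    tiles: HashMap<(i64, i64), Vec<u32>>,
    /// Counts how often the expensive allocation path runs.
    allocation_passes: usize,
}

impl ScrollLayer {
    /// Ensures every tile touched by the bounding box exists.
    fn allocate_tiles_to_cover_bounding_box(&mut self, min_x: i64, min_y: i64, max_x: i64, max_y: i64) {
        self.allocation_passes += 1;
        for ty in min_y.div_euclid(TILE_SIZE)..=max_y.div_euclid(TILE_SIZE) {
            for tx in min_x.div_euclid(TILE_SIZE)..=max_x.div_euclid(TILE_SIZE) {
                self.tiles
                    .entry((tx, ty))
                    .or_insert_with(|| vec![0; (TILE_SIZE * TILE_SIZE) as usize]);
            }
        }
    }

    /// putpixel() has no wider context, so every call pays for its own
    /// allocation pass before it can write a single pixel.
    fn putpixel(&mut self, x: i64, y: i64, color: u32) {
        self.allocate_tiles_to_cover_bounding_box(x, y, x, y);
        let key = (x.div_euclid(TILE_SIZE), y.div_euclid(TILE_SIZE));
        let idx = (y.rem_euclid(TILE_SIZE) * TILE_SIZE + x.rem_euclid(TILE_SIZE)) as usize;
        if let Some(tile) = self.tiles.get_mut(&key) {
            tile[idx] = color;
        }
    }
}

/// What a per-pixel renderer effectively does: one putpixel() call,
/// and therefore one allocation pass, per covered pixel.
fn fill_rect_pixel_by_pixel(layer: &mut ScrollLayer, x0: i64, y0: i64, w: i64, h: i64, color: u32) {
    for y in y0..y0 + h {
        for x in x0..x0 + w {
            layer.putpixel(x, y, color);
        }
    }
}
```

Filling a 20x10 rectangle this way runs the allocation path 200 times, even though every pixel lands in the same tile.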

Now, let’s look at how the TrueType renderer draws glyphs:

Uh oh! The TrueType renderer is making a ludicrous number of calls to putpixel(), and the underlying scrolling view doesn’t have the opportunity to understand the wider context of how much area the renderer is going to draw to.

To resolve this, I added polygon stacks as one of the ‘fundamental things’ that all view buffer implementations need to know how to draw, alongside primitives like lines, circles and rectangles. This gives the scroll view an opportunity to say “Ah ha, we’re drawing a big polygon! I can allocate all the tiles upfront!”, which is much quicker than the alternative. I don’t love having this as a fundamental primitive, as filling arbitrary polygons conceptually seems quite a bit heftier than rasterizing simpler shapes, but it’s practical and works well.

Note
Another option that I weighed for this API design was to let clients ask a graphics layer to “get this area ready for drawing”, explicitly priming the layer for a bunch of draw calls in one region. I didn’t love this either.

Improving libgui

Every time I implement a new graphical application against my stack, I come up against little limitations or papercuts that I’ve never run into before. These could be in the GUI toolkit, IPC, driver interface, kernel features, etc. This gives me an opportunity to just-in-time improve the system and APIs to facilitate whatever I’m building at the moment. It’s the fun of OS development! When building out UEFIRC, I made a few notable tweaks and fixes to libgui:

Completely unnecessary

The IRC client itself, as a client, isn’t that usable because this project is an elaborate joke.

Note
I told a friend I was making a joke project, then explained it. She said she wasn’t sure when to laugh. I’m not so sure either.

However, if you’re ever feeling mad about UEFI’s TCP/IP stack, I know just the tool to complain about it with.

As a final tip of the hat, I joined the UEFI #edk2 development IRC channel from UEFI and bid good tidings.

Here’s a demo showing UEFIRC in action.

Below are some screenshots from the course of UEFIRC’s development.

