---
date: 2018-07-17
layout: post
title: Input handling in wlroots
tags: [wlroots, wayland, instructional]
---
I’ve said before that wlroots is a “batteries not included” kind of library, and one of the places where that is most apparent is with our approach to input handling. We implemented a very hands-off design for input, in order to support many use-cases: desktop input, phones with and without USB-OTG HIDs plugged in, multiple mice bound to a single cursor, multiple keyboards per seat, simulated input from fake input devices, on-screen keyboards, input which is processed by the compositor but not sent to clients… we support all of these use-cases and even more. However, the drawback of our powerful design is that it’s confusing. Very confusing.
Let’s begin by forgetting about the Wayland part entirely. After all, wlroots is flexible enough that you can use it without writing a Wayland compositor at all! It can be used in a similar fashion to tools like GLFW and SDL, to abstract low-level input (via e.g. libinput) and graphical output (via e.g. DRM). Let’s start here, simply getting input events from wlroots in the first place.
One of the fundamental building blocks of wlroots is the wlr_backend,
which is a resource that abstracts the underlying hardware and exposes a
consistent API for outputs and input devices. Outputs have been discussed
elsewhere, so let’s focus just on input devices. Each backend provides an
event, wlr_backend.events.new_input. The signal is called with a
reference to a wlr_input_device each time a new input
device appears on the backend - for example, when you plug a mouse into your
computer when using the libinput backend.
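The input device can be one of five types, appropriately identified by the
type field. The types are:
- WLR_INPUT_DEVICE_KEYBOARD
- WLR_INPUT_DEVICE_POINTER
- WLR_INPUT_DEVICE_TOUCH
- WLR_INPUT_DEVICE_TABLET_TOOL
- WLR_INPUT_DEVICE_TABLET_PAD
The type indicates which member of the anonymous union is valid. If
wlr_input_device->type == WLR_INPUT_DEVICE_KEYBOARD, then
wlr_input_device->keyboard is a valid pointer to a wlr_keyboard.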
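
The type-plus-union pattern can be sketched in plain C. This is only an illustration: struct input_device, struct keyboard, struct pointer and describe_device are hypothetical stand-ins, not the real wlroots definitions, which carry many more fields and more device types.

```c
/* Stand-in types for illustration only; the real definitions live in the
 * wlroots headers. */
struct keyboard { int num_pressed_keys; };
struct pointer { double last_dx, last_dy; };

enum input_device_type {
	INPUT_DEVICE_KEYBOARD,
	INPUT_DEVICE_POINTER,
};

struct input_device {
	enum input_device_type type;
	const char *name;
	/* Anonymous union (C11): only the member selected by `type` is valid. */
	union {
		struct keyboard *keyboard;
		struct pointer *pointer;
	};
};

/* Returns a short label for the device, reading only the union member that
 * matches the device type. */
const char *describe_device(const struct input_device *dev) {
	switch (dev->type) {
	case INPUT_DEVICE_KEYBOARD:
		return dev->keyboard ? "keyboard" : "keyboard (no state)";
	case INPUT_DEVICE_POINTER:
		return dev->pointer ? "pointer" : "pointer (no state)";
	}
	return "unknown";
}
```

A new_input handler in a real compositor would typically contain a switch like this one, setting up keyboard listeners in one branch and attaching pointers to a cursor in another.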
Let’s examine wlr_keyboard more closely now. The keyboard struct also
provides its own events, like key and keymap. If you want to process input
from this keyboard, you need to set up an xkbcommon context for
ingesting the raw scancodes emitted by the key event and converting them to
Unicode and keysyms (e.g. “Up”) with an XKB keymap. Most of the wlroots examples
implement this if you’re looking for a simple reference.
When these events are sent, we just let you process them as you please. They do
not automatically get propagated to any Wayland clients. Communicating these
events to the clients is your responsibility, though we provide you tools to
help - we’ll get into that shortly. You don’t even have to source the input you
give to Wayland clients from a wlr_input_device; you can just as easily make
it up or get it from the network or anywhere else.
Before we get into details on how to send events to clients, let’s examine the other components in your compositor’s input code. First, let’s talk about the cursor.
We provide the
wlr_pointer abstraction for getting events from
a “pointer” device, like a mouse. However, because batteries are not included,
you will find that we only tell you what the pointer device is doing - we don’t
act on it. If you want to, for example, display a cursor image
on screen which moves around when the mouse does, you need to wire this up
yourself. We have tools which can help.
First, let’s talk about getting the cursor image to show. You can source the
image from anywhere you want, but you will probably want to leverage
wlr_xcursor. This is a small wlroots module (forked from the
wayland-cursor library used by Wayland clients) which can read Xcursor themes,
the kind your user will already have installed on their system. Loading up a
cursor theme and getting the pixels from it is pretty straightforward. But what
should you do with those pixels?
Well, now we have to introduce hardware cursors. Many backends support
“hardware” cursors, which is a feature provided by your low-level graphics stack
(e.g. GPU drivers) for rendering a cursor on the screen. Hardware cursors are
composited by the GPU, which means you can move the cursor around without
re-drawing the things underneath it. This is the most energy- and CPU-efficient
way of drawing your cursor, and you can do it with
wlr_output_cursor_set_image, specifying which wlr_output
you want it to appear on and at what coordinates. Not all configurations support
hardware cursors, but wlr_output automatically falls back to software cursors
if need be.
Now you have all of the pieces to show a cursor on screen that moves with the
mouse. You can store some X and Y coordinates somewhere, grab an image from an
Xcursor theme, and throw it at your
wlr_output, then process input events and
move it around. Then… you need to consider multiple outputs. And you need to
make sure that it can’t be moved outside of an output. And you need to let the
user move it around with a drawing tablet or touch screen as well. And… well,
it’s about to get complicated. That’s where our next tool comes in!
wlr_cursor is how wlroots saves you from some of this work. It
can display a cursor image on-screen, tie it to multiple input devices,
constrain it to your outputs and move it across multiple displays. It can also
map input from certain devices to certain outputs or regions of the output
layout, change the geometry of inputs from a drawing tablet, and more.
To use wlr_cursor, you should create one (wlr_cursor_create) and, as the
backend raises new_input events, bind the new devices to the cursor with
wlr_cursor_attach_input_device. wlr_cursor then raises aggregated events
from all of its devices, which you can catch and handle accordingly - usually
calling a function like wlr_cursor_move and propagating the event to Wayland
clients. You also need to attach a wlr_output_layout to
the cursor, so it knows how to constrain the cursor movement and can handle
hardware cursors for you.
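
To make the aggregation idea concrete, here is a minimal model of one cursor fed by several devices. Everything in it (struct cursor, cursor_attach_device, cursor_feed_motion, cursor_move, the integer device IDs) is a hypothetical stand-in written for this sketch; it only mirrors the roles of the real wlroots functions and does not use their signatures.

```c
#include <stddef.h>

#define MAX_DEVICES 8

struct cursor;

/* Callback invoked once per motion event, whichever device produced it. */
typedef void (*motion_handler)(struct cursor *cursor, double dx, double dy);

struct cursor {
	double x, y;
	int device_ids[MAX_DEVICES]; /* hypothetical device handles */
	size_t num_devices;
	motion_handler on_motion;    /* the "aggregated" event */
};

/* Binds a device to the cursor. Returns 0 on success, -1 if the table is full. */
int cursor_attach_device(struct cursor *cursor, int device_id) {
	if (cursor->num_devices == MAX_DEVICES) {
		return -1;
	}
	cursor->device_ids[cursor->num_devices++] = device_id;
	return 0;
}

/* Applies a relative motion to the cursor position. */
void cursor_move(struct cursor *cursor, double dx, double dy) {
	cursor->x += dx;
	cursor->y += dy;
}

/* Called for every raw device event. Events from attached devices are
 * forwarded to the single on_motion handler; others are ignored.
 * Returns 1 if the event was forwarded, 0 otherwise. */
int cursor_feed_motion(struct cursor *cursor, int device_id,
		double dx, double dy) {
	for (size_t i = 0; i < cursor->num_devices; i++) {
		if (cursor->device_ids[i] == device_id) {
			if (cursor->on_motion != NULL) {
				cursor->on_motion(cursor, dx, dy);
			}
			return 1;
		}
	}
	return 0;
}
```

With two mice attached and cursor_move installed as the handler, motion from either mouse moves the same cursor, while a device that was never attached has no effect.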
The wlr_output_layout module allows you to configure an arrangement of
wlr_outputs in physical space. Its function is fairly straightforward and
largely unrelated to our topic - I suggest reading through the header and asking
questions if you need help. Once you make one of these and hand it to a
wlr_cursor, you have a cursor on-screen which moves around when you provide
input and correctly moves throughout a multi-display setup.
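
As a rough idea of what "constrained by the output layout" means, the sketch below snaps a point to the nearest position inside any of a set of output rectangles. It is a simplified model under my own assumptions (axis-aligned boxes, no scaling or transforms); struct box and layout_closest_point are hypothetical names, not the wlroots API.

```c
#include <float.h>
#include <stddef.h>

/* One output's rectangle in layout coordinates. */
struct box { double x, y, width, height; };

static double clamp(double v, double lo, double hi) {
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Moves (*x, *y) to the nearest point that lies on one of the outputs.
 * Returns the index of the output that now contains the point, or -1 if
 * there are no outputs (in which case the point is left untouched). */
int layout_closest_point(const struct box *boxes, size_t n,
		double *x, double *y) {
	int best = -1;
	double best_dist = DBL_MAX, best_x = *x, best_y = *y;
	for (size_t i = 0; i < n; i++) {
		/* An output covers [x, x + width), so width - 1 is the last pixel. */
		double cx = clamp(*x, boxes[i].x, boxes[i].x + boxes[i].width - 1);
		double cy = clamp(*y, boxes[i].y, boxes[i].y + boxes[i].height - 1);
		double dx = cx - *x, dy = cy - *y;
		double dist = dx * dx + dy * dy;
		if (dist < best_dist) {
			best_dist = dist;
			best_x = cx;
			best_y = cy;
			best = (int)i;
		}
	}
	if (best >= 0) {
		*x = best_x;
		*y = best_y;
	}
	return best;
}
```

For a 1920x1080 output next to a 1280x720 one, a cursor pushed below the smaller output is pulled back to whichever output edge is closest rather than drifting into empty space.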
Okay, now that we have all of those pieces in place, we can finally start talking about sending input events to Wayland clients! Before we get into how wlroots does it, let’s talk about how Wayland does it in general.
The top-level resource which manages input for a Wayland client is the
wl_seat. One seat, in rough terms, maps to a single set of input devices used
by a user (a user who is presumably sitting at a seat in front of their
computer). A seat can have up to one keyboard, pointer, touch device, or drawing
tablet each. Each of these devices can then enter or leave any of the
client’s surfaces at the compositor’s orders.
When you bind to a seat’s wl_keyboard and wl_keyboard.enter is raised
on a surface, it means your surface has keyboard focus. The compositor will
follow up with (or will have already sent) a wl_keyboard.keymap signal to let
you know the layout of this keyboard (e.g. ru, etc) in the
form of an xkbcommon keymap (the same format we were using with wlr_keyboard
earlier - hint hint). Some number of key and modifier events will likely
follow as the user taps away.
When you bind to a seat’s wl_pointer and wl_pointer.enter is raised, it
means a pointer has moved over one of your surfaces. Note that this can be an
entirely separate occasion from receiving keyboard focus. The client is then
expected to provide a cursor image to display. (At the moment, Wayland requires
client-side cursors, so clients have to do the whole Xcursor dance we did on the
wlroots side earlier, too. We have some plans to correct this…) Some number
of motion and button events will likely follow as the user wiggles their
mouse and clicks your windows.
So, how does a wlroots-based compositor facilitate these interactions? With
wlr_seat, our abstraction on top of
wl_seat. This implements the
wl_seat state machine, but again leaves it to you to tweak the knobs as
you wish. You need to decide how your compositor is going to deal with focus -
KDE, Sway, the Librem5 phone UI, an in-vehicle infotainment system; all of these
will have a different approach to focus.
wlroots doesn’t render client surfaces for you, and doesn’t know where you put
them. Once you figure out where they go, you need to notice when the
wlr_cursor is moved over one of them and call
wlr_seat_pointer_notify_enter with the
pointer’s coordinates relative to the surface it entered, along with any
motion or button events through the relevant
functions. The client will also likely send you a cursor image to display - this
is done with the wlr_seat request_set_cursor event.
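
The bookkeeping described here can be sketched as follows. The model keeps a list of surfaces, hit-tests the cursor position against them, converts to surface-local coordinates, and reports whether the compositor should send an enter, a motion or a leave. All names (struct surface, surface_at, pointer_focus_update, the event enum) are hypothetical; a real compositor would call the wlr_seat pointer functions at the points marked in the comments.

```c
#include <stddef.h>

/* A client surface placed somewhere in layout coordinates. */
struct surface { int id; double x, y, width, height; };

enum pointer_event { POINTER_NONE, POINTER_ENTER, POINTER_MOTION, POINTER_LEAVE };

struct pointer_focus {
	const struct surface *surface; /* NULL when over no surface */
	double sx, sy;                 /* surface-local coordinates */
};

/* Finds the topmost surface under (lx, ly); `surfaces` is ordered topmost
 * first. On a hit, writes the surface-local coordinates to *sx and *sy. */
const struct surface *surface_at(const struct surface *surfaces, size_t n,
		double lx, double ly, double *sx, double *sy) {
	for (size_t i = 0; i < n; i++) {
		const struct surface *s = &surfaces[i];
		if (lx >= s->x && lx < s->x + s->width &&
				ly >= s->y && ly < s->y + s->height) {
			*sx = lx - s->x;
			*sy = ly - s->y;
			return s;
		}
	}
	return NULL;
}

/* Updates the focus for a new cursor position and reports which event the
 * compositor should forward to clients. */
enum pointer_event pointer_focus_update(struct pointer_focus *focus,
		const struct surface *surfaces, size_t n, double lx, double ly) {
	double sx = 0, sy = 0;
	const struct surface *hit = surface_at(surfaces, n, lx, ly, &sx, &sy);
	if (hit == NULL) {
		if (focus->surface == NULL) {
			return POINTER_NONE;
		}
		focus->surface = NULL; /* here: clear the seat's pointer focus */
		return POINTER_LEAVE;
	}
	/* New surface: notify enter. Same surface: notify motion. */
	enum pointer_event ev =
		hit == focus->surface ? POINTER_MOTION : POINTER_ENTER;
	focus->surface = hit;
	focus->sx = sx;
	focus->sy = sy;
	return ev;
}
```

Calling pointer_focus_update from the cursor's motion handler gives one decision per event, with the surface-local coordinates already computed for whichever notify call follows.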
When you decide that a surface should receive keyboard focus, call
wlr_seat_keyboard_notify_enter. wlr_seat will automatically handle removing
focus from whatever had it last, and will also grab the keymap and send it to
the client for you, assuming you configured it with wlr_seat_set_keyboard -
you did, right?
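
The "removing focus from whatever had it last" behaviour amounts to a tiny state machine. The sketch below models only that part: it records the leave and enter events a seat would emit when keyboard focus changes, and leaves keymap delivery out entirely. struct seat, seat_keyboard_enter and the event log are hypothetical names for this illustration, not the wlr_seat implementation.

```c
#include <stddef.h>

#define MAX_EVENTS 16

enum kb_event_type { KB_LEAVE, KB_ENTER };

struct kb_event { enum kb_event_type type; int surface_id; };

struct seat {
	int focused_surface;             /* 0 means nothing has focus */
	struct kb_event log[MAX_EVENTS]; /* events "sent" to clients */
	size_t log_len;
};

/* Records an event, silently dropping it if the log is full. */
static void seat_emit(struct seat *seat, enum kb_event_type type, int id) {
	if (seat->log_len < MAX_EVENTS) {
		seat->log[seat->log_len].type = type;
		seat->log[seat->log_len].surface_id = id;
		seat->log_len++;
	}
}

/* Gives keyboard focus to surface_id. The previously focused surface, if
 * any, gets a leave first. Re-focusing the same surface is a no-op. */
void seat_keyboard_enter(struct seat *seat, int surface_id) {
	if (seat->focused_surface == surface_id) {
		return;
	}
	if (seat->focused_surface != 0) {
		seat_emit(seat, KB_LEAVE, seat->focused_surface);
	}
	seat->focused_surface = surface_id;
	seat_emit(seat, KB_ENTER, surface_id);
}
```

The compositor's focus policy decides when to call the enter function; the ordering of leave before enter is what the seat takes care of for you.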
wlr_seat also semi-transparently deals with grabs, the sort of
situation where a client wants to keep keyboard focus for longer than it
normally would, to deal with a context menu or something.
Touch events are similar and should be self-explanatory when you read the
header. Drawing tablet events are a bit different - they’re not actually
specified by the core Wayland protocol. Instead, we rig these up with the
tablet protocol extension and wlr_tablet. It
works in much the same way, but you have to explicitly set it up for each
wlr_seat yourself.
So, in short, if you wiggle your mouse, here’s what happens:
- Before you wiggled your mouse, the libinput backend noticed it was plugged in and raised a new_input event.
- Your compositor attached the resulting wlr_input_device to a wlr_cursor, which it had prepared earlier by looking up an appropriate cursor theme and letting it know about the display layout.
- wlr_pointer bubbled up a motion event, which was caught by wlr_cursor and bubbled up to your compositor.
- Your compositor called wlr_cursor_move to apply the resulting motion, constrained by the output layout, which in turn caused the cursor image on your display to move.
- Your compositor then looked around to see if the pointer had moved over any new surfaces. Since wlroots doesn’t handle rendering or know where anything is displayed, this was a rather introspective question.
- You did wiggle it over a new surface, so the compositor called wlr_seat_pointer_notify_enter after translating the pointer coordinates to surface-local space. It sent a wlr_seat_pointer_notify_motion for good measure.
- The client noticed the pointer entered it and sent back a cursor image to show. The compositor was informed of this via the wlr_seat request_set_cursor event.
- The compositor handed the client’s cursor image to wlr_cursor, throwing away all of that hard work loading up a cursor theme just for a client-side cursor to come in and ruin it.
And there you have it, that’s how input works in wlroots. It’s really fucking complicated, isn’t it? I think this article puts on display both the incredible advantages and serious drawbacks of wlroots. Because you have to plug all of these pieces together yourself, you are afforded an enormous amount of flexibility. However, you have to do a lot of work and understand a whole lot of different pieces to get there. Libraries like wlc are much easier to use in this respect, but if you want to change even a small detail of this process with wlc you are unable to.
If you have any questions about this article, please reach out to the developers hanging out in #sway-devel on irc.freenode.net. We know this is confusing, and we’re happy to help.