Thirty years ago, it was 1987. The US stock market crashed, there was a fire at King’s Cross station in London killing 31 people, the first Final Fantasy game was released for NES, the USA had 75 million fewer inhabitants than it does today, Disney agreed to open a theme park in Paris, the world’s population was 2.5 billion fewer than today, Reagan was president and I was just three years old. Computers were quite a bit different back then – it was only in December of that year that Windows 2.0 was launched. A lot changes over thirty years but that is also the year that the X Window System reached protocol version 11 and X11 is still here some thirty years later.
The graphical capabilities of computers have changed more than a smidgen since then. When I first used X in 1997 it already felt outdated and clunky. At just ten years old X11 was never really designed for transparency, circular windows (who doesn’t love xeyes), multiple monitors or 3D. The first time I managed to get X11 working with multiple monitors felt like I’d achieved one of the most difficult feats known to man. I had my xf86config file backed up in so many places due to the terrifying thought that I’d have to go through that ordeal again. But it was all going to be OK – soon it would be the year of the desktop for Linux (and by extension likely all other kernels that sat underneath the same stack). The first time I saw Enlightenment my mind was blown at what the future would herald for us in the new millennium.
It’s now 20 years since I first struggled with X and yet all my Linux desktops are still running X11 and I’m still underwhelmed. During that time the major implementation has changed, extensions have been added, extensions have been removed, the source repository has been tidied up and we’ve got reasonable 3D capabilities in many Unix-style operating systems – especially Linux. There are even manufacturer-supported proprietary drivers for graphics cards and a whole bunch of compositing window managers… but I can’t help but feel it still sucks compared to what it should be.
Nobody really liked Windows Vista but under the hood it felt like a thing of beauty for the time (I know I’m in the minority here). DWM arrived in Windows back in 2007. Sure, the glass theme was overused and Flip 3D became a novelty really quickly but DWM settled into being an incredibly robust, reliable compositing window manager that – frankly – works really well. That was 10 years ago, and it still feels like X is playing catch-up.
The biggest problem with X is its requirement for backwards compatibility. You’re never going to want to run a Motif app from decades ago but the protocol mandates that current X servers must implement the functionality for it. As far as software products go it’s amazing what the X team have managed to cram into it – but even they admit they’ve taken it about as far as they can without even the tiniest change taking huge amounts of time to implement.
In really basic terms X is nothing more than a network protocol. A client (i.e. an application) is supposed to say to the server “Draw me a window here, put a circle in it, then put some squares here and here and fill the second square green”. All that goes over the network, without any image data being transferred. This is super cool and lightweight. It also takes the burden away from the client and puts it on to the server. An amazing feat back in 1987 when you might have wanted a terminal server farm but not so useful these days on your home desktop.
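To give a feel for how small those drawing commands are, here is a rough Python sketch that packs a single "fill these two squares" request. The byte layout is modelled on my understanding of the core protocol's PolyFillRectangle request (opcode 70, a length counted in 4-byte units, a drawable ID, a graphics context ID, then 8 bytes per rectangle). The window and GC IDs are made-up placeholders, and nothing is actually sent to a server.

```python
import struct

POLY_FILL_RECTANGLE = 70  # core-protocol opcode, as I understand the spec


def poly_fill_rectangle(drawable, gc, rects):
    """Pack a fill request for a list of (x, y, width, height) tuples."""
    length_in_words = 3 + 2 * len(rects)  # total request size in 4-byte units
    header = struct.pack("<BxHII", POLY_FILL_RECTANGLE, length_in_words, drawable, gc)
    body = b"".join(struct.pack("<hhHH", *rect) for rect in rects)
    return header + body


# Placeholder resource IDs; a real client would get these from the server.
WINDOW_ID = 0x02000001
GC_ID = 0x02000002

request = poly_fill_rectangle(
    WINDOW_ID, GC_ID, [(10, 10, 100, 100), (150, 10, 100, 100)]
)
print(len(request), "bytes to describe two filled squares")
```

The point is only the order of magnitude: a few dozen bytes describe the shapes, whatever size they end up on screen.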
These days, the most common use-case for an application framework (Qt, KDE, EFL, etc.) is to use 3D accelerated code to generate a “pretty” user interface for an application then to use one of the X extensions to throw all of that pretty window (expressed as raw pixel data – very data intensive) towards the X server. The window manager can then listen in on that, get a handle to it and store a copy of that buffer and do things with it (all those pretty 3D effects where things stack on top of each other or expose, etc.) and eventually chuck it all back to the X server. So, all the clever bits we used X for we’re not using here and somehow, we’ve got it mostly working for this task. But ultimately all X is now is a middle man.
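Just how data intensive is raw pixel data? A quick back-of-the-envelope calculation in Python, assuming a full-HD window, 4 bytes per pixel and 60 redraws a second (all three numbers are my own illustrative assumptions, not measurements):

```python
def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def stream_bytes_per_second(width, height, fps, bytes_per_pixel=4):
    """Bytes moved per second if every frame is pushed in full."""
    return frame_bytes(width, height, bytes_per_pixel) * fps


one_frame = frame_bytes(1920, 1080)
per_second = stream_bytes_per_second(1920, 1080, 60)
print(f"{one_frame:,} bytes per frame, {per_second:,} bytes per second")
```

That is roughly 8 MB for a single frame and close to half a gigabyte per second for one window, which is why shoving it through a protocol designed for tiny drawing commands feels so wrong.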
What’s happened here is that X has become a really, really bad communication protocol trying to wedge in some 3D surfaces where they aren’t supposed to be. More than that, X servers get in the way of everything whilst modern window managers spend a lot of their code trying to work around it. There are all kinds of weird consequences – there will be screen tearing, window resizing is kind of ugly to say the least, modern input devices don’t translate well to the world of X, you cannot have different DPI settings on different monitors, hybrid graphics cards don’t really work at all, there’s a lot more overhead than there needs to be for rendering, popups have all kinds of dirty hacks that stop screensavers from working, full-screen transitions are ugly and the list could go on for a very long time. There are many blogs listing all the problems with X so I won’t bother here – but suffice to say it has serious issues that are holding it back.
But, some people say, X works across a network and that’s amazing. True – your Motif app will still work across the network and a basic WM in a 2D context will likely work but most of those flashy hacks won’t work across the network, we broke that a long time ago. You may, for your circumstances, get X working perfectly across a network but for the person running the latest fancy-pantsy 3D WM it isn’t going to work and, ultimately, that’s how a lot of people use it these days.
Sorry, X has become a really horrible, monolithic beast and it’s holding us back. Flame away if you disagree. I am nowhere near an expert on X – far from it – but the people that are experts are those who have written X, supported X and who maintain it to this day. They are the ones saying this and unless you’re part of that fairly small group odds are you know less than them.
That’s why the X.org developers came up with Wayland…
So, Wayland’s Our Saviour?
First things first – Wayland is not an X server, it’s not a window manager. Wayland is just an extensible communication protocol with an XML schema. It defines very, very little out of the box.
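To make "a protocol with an XML schema" a bit more concrete, here is a small Python sketch that reads a heavily trimmed fragment shaped like the core protocol description and lists the requests each interface offers. The fragment is abbreviated by me for illustration (the real file has far more interfaces, events, enums and documentation), so treat it as a sketch of the shape rather than a copy of the spec.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative fragment in the style of the protocol XML.
PROTOCOL_XML = """
<protocol name="wayland">
  <interface name="wl_compositor" version="1">
    <request name="create_surface">
      <arg name="id" type="new_id" interface="wl_surface"/>
    </request>
  </interface>
  <interface name="wl_surface" version="1">
    <request name="attach">
      <arg name="buffer" type="object" interface="wl_buffer"/>
      <arg name="x" type="int"/>
      <arg name="y" type="int"/>
    </request>
    <request name="commit"/>
  </interface>
</protocol>
"""


def summarise(xml_text):
    """Map each interface name to the list of request names it defines."""
    root = ET.fromstring(xml_text)
    return {
        iface.get("name"): [req.get("name") for req in iface.findall("request")]
        for iface in root.findall("interface")
    }


interfaces = summarise(PROTOCOL_XML)
for name, requests in interfaces.items():
    print(name, "->", ", ".join(requests))
```

Notice what is missing: there is no "draw a circle" anywhere. A client asks for a surface, attaches a buffer of pixels it rendered itself and commits it.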
Unlike the X architecture (X server with some apps and exactly one window manager all talking through it), the Wayland architecture defines a single compositor (potentially the equivalent of the X server plus a window manager) which talks to all apps itself via shared memory on the same machine. There’s no network, there is no definition of drawing primitives, in fact there’s not really much of anything defined in Wayland other than as a way for a client to talk to some kind of compositor that draws something on a screen. There are some requirements of a server (knowledge of chucking shared memory around, input devices and output devices at a basic level) but they’re kept to the minimum. The protocol, by its very nature, is extensible so more shared functionality between compositor and applications can be added but not relied upon nor required.
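The shared memory idea can be sketched in a few lines of Python. This is only an analogy: a real Wayland client hands the compositor a file descriptor over a Unix socket, whereas here I use the standard library's named shared memory and keep both "sides" in one process. The buffer size and the 32-bit ARGB pixel layout are my own choices for the example.

```python
import struct
from multiprocessing import shared_memory

WIDTH, HEIGHT, BYTES_PER_PIXEL = 64, 48, 4
SIZE = WIDTH * HEIGHT * BYTES_PER_PIXEL
OPAQUE_GREEN = struct.pack("<I", 0xFF00FF00)  # 32-bit ARGB, little-endian

# "Client" side: allocate the buffer and render into it.
client_buf = shared_memory.SharedMemory(create=True, size=SIZE)
client_buf.buf[:SIZE] = OPAQUE_GREEN * (WIDTH * HEIGHT)

# "Compositor" side: attach to the same memory by name and read a pixel.
# No pixel data is copied through a socket; both sides see the same bytes.
compositor_buf = shared_memory.SharedMemory(name=client_buf.name)
first_pixel = struct.unpack_from("<I", compositor_buf.buf, 0)[0]
print(hex(first_pixel))

# Tidy up: detach both views, then free the underlying memory.
compositor_buf.close()
client_buf.close()
client_buf.unlink()
```

The only thing that has to travel between the two sides is a small handle to the memory, which is also why none of this works across a network.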
Wayland is designed to resolve the issues with complexity, graphics, architecture, technical debt and code maintenance described above but also aims at resolving security concerns. In X one app can easily-enough snoop on another and perform all kinds of bad things – key logging made simple. Let’s not pretend we’ve got one-up on Windows here, if Windows was this poorly designed and X wasn’t we’d cite it every time we wanted someone to switch platforms. This direct path between client and server means that the server (compositor) is the only application with a world-state and it’s completely in control of what it does with that.
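The difference in who gets to see input can be shown with a deliberately simplified toy model in Python. Every class and method name here is invented for illustration and neither class resembles a real server implementation; the only thing being modelled is the routing of key presses.

```python
class Client:
    """A stand-in for an application; it just records what it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []


class XStyleServer:
    """Toy model: any connected client may subscribe to every key press."""

    def __init__(self):
        self.listeners = []

    def listen_to_keys(self, client):
        self.listeners.append(client)

    def key_press(self, key):
        for client in self.listeners:
            client.received.append(key)


class WaylandStyleCompositor:
    """Toy model: key presses go only to the client that has focus."""

    def __init__(self):
        self.focused = None

    def key_press(self, key):
        if self.focused is not None:
            self.focused.received.append(key)


# X-style: the snooper simply asks to listen and sees the password.
x_terminal, x_snooper = Client("terminal"), Client("snooper")
x_server = XStyleServer()
x_server.listen_to_keys(x_terminal)
x_server.listen_to_keys(x_snooper)
for key in "hunter2":
    x_server.key_press(key)

# Wayland-style: only the focused terminal receives anything.
w_terminal, w_snooper = Client("terminal"), Client("snooper")
compositor = WaylandStyleCompositor()
compositor.focused = w_terminal
for key in "hunter2":
    compositor.key_press(key)

print("X snooper saw:", "".join(x_snooper.received))
print("Wayland snooper saw:", "".join(w_snooper.received))
```

The compositor alone holds the world-state, so the decision about who hears what lives in exactly one place.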
In addition to the core protocol itself there are a number of other components that the Wayland team provide. These are Weston – an example compositor – and a libwayland-client/libwayland-server pair of libraries. The libraries are basic extensible implementations of the protocol that allow anybody developing clients/GUI frameworks or compositors to not have to write the basic plumbing. They just deal with the communications layer of server/client but by themselves do absolutely nothing – it’s (almost) like having a full telephone network with no telephones plugged in anywhere.
Weston is what most people consider to be Wayland. It’s a compositor – so if you run Wayland instead of X you see a desktop and can run applications that appear. There’s even a terminal emulator and a few test apps that come with Weston. Unlike X this is not the server though – it’s just a proof of concept. There are also Gnome, KDE and E compositors – none of which need Weston – and there are extensions to Weston that change how it functions. There are other compositors specifically designed for in-car entertainment or smartphone usage (Tizen runs on Wayland for example). This is one area people seem to slightly misunderstand – many people appear to see the clunky Weston and believe that this is the future then become disheartened. It’s not. Weston is just a quick-and-dirty “here’s what you guys can do with this” example. Gnome and KDE in particular have been making huge strides in their compositors.
What about all your old apps? Sure Qt, GTK and EFL support Wayland but what about all the old stuff? Good news! There’s support for that as well so you can run your X apps within Wayland compositors! Vice versa, you can also run a Wayland compositor within X so everything works in both directions – pretty cool?
So, we’ve got this new protocol that is lightweight, fast, designed for flexibility, considers things like touch pads and smartphones and doesn’t tie down to a specific implementation. KDE, Gnome and E all have compositors. It’s even used on all the Tizen smartphones already. Fedora has even switched over to using it by default at last, Gnome on Wayland works with the proprietary Nvidia drivers… so why do most people still wonder when it will be out and not use it?
The problem is that Wayland, too, is still a bit rubbish.
Wait, So Wayland Sucks Too?
Yes, there are some known bugs and KDE, Gnome and E are all not quite at feature-parity with their X counterparts – but that’s all OK. Until more people are using it (bear in mind it’s only just become the default on a single distribution this year) that’s to be excused. Whilst I’ve not looked through the Gnome or E source code I have perused the KWin source and found it surprisingly mature for the fairly low user-base. Give it a little more time and we’ll certainly see parity between X and Wayland compositors. So what’s the problem?
Well Wayland is not a network protocol and it is also expressly designed so that applications cannot spy on other applications. Whilst this has many benefits it does mean that remote rendering like you used to do with X is out and that instead we’re more likely to be in the realms of VNC. As far as I’m aware there is no “one-size-fits-all” approach to this at the moment so common remote desktop access is out the window if you’re using Wayland for the time being. Furthermore, if you want to do desktop video capture with something like OBS then the same issue exists. Whilst there is no technical barrier for these being overcome it’s not standardised and it’s not there now. This may be a deal breaker for some.
Twin that with worse graphics support than X. If you’re happy using binary drivers then Nvidia have you covered well in X. The Wayland support is still lagging here and whilst there are options it’s still not as good as X, which is a little frustrating considering the benefits that Wayland as an architecture overall offers.
But you know what? None of those are big issues to most users. The biggest issue I see is that Wayland defines too little. This obviously became apparent because the core shell protocol (which, in effect, describes the basics of a window being created somewhere) has already had a second version created with far greater functionality. Considering the low uptake of Wayland you’d think that wl_shell would have been retired with xdg_shell’s existence – but no, they’re both there and there’s far less documentation for the enhanced shell than the basic one.
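In practice, having two shells means every client has to look at what the compositor advertises and pick one. Here is a minimal Python sketch of that decision, using the two shell names from this post. The helper and the advertised lists are hypothetical, and the interface name a real compositor exposes for the newer shell depends on which revision of it the compositor implements.

```python
# Newest first: prefer the richer shell, fall back to the basic one.
PREFERRED_SHELLS = ["xdg_shell", "wl_shell"]


def pick_shell(advertised_globals):
    """Return the best shell the compositor offers, or None if it has neither."""
    for shell in PREFERRED_SHELLS:
        if shell in advertised_globals:
            return shell
    return None


# Hypothetical lists of globals announced by three different compositors.
modern = ["wl_compositor", "wl_shm", "wl_seat", "wl_shell", "xdg_shell"]
basic = ["wl_compositor", "wl_shm", "wl_seat", "wl_shell"]
embedded = ["wl_compositor", "wl_shm"]

print(pick_shell(modern))
print(pick_shell(basic))
print(pick_shell(embedded))
```

The third case is the awkward one: the client connected fine, but it has no agreed way to ask for a top-level window at all.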
What about if you want to have borders on your windows and a maximise or minimise button? Well you can get your apps to self-decorate or you can let the compositor do it. Weston expects apps to do it, KDE is good with the compositor doing it. So, if you run a non-Weston app in Weston it will have no border decorations and a Weston app in KDE with server-side decorations enabled could end up with two sets of window borders. If you want to work around that then every compositor and UI framework needs to understand the others in order to interoperate with them.
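The mismatch is easy to see if you write out the combinations. This tiny Python sketch counts how many sets of borders end up on screen when neither side knows what the other is doing. The function and the app and compositor labels are mine, based purely on the behaviour described above.

```python
def frames_drawn(app_self_decorates, compositor_decorates):
    """Count the window frames drawn when app and compositor do not negotiate."""
    return int(app_self_decorates) + int(compositor_decorates)


# (app name, does the app draw its own borders?)
apps = [("Weston-style app", True), ("app expecting server borders", False)]
# (compositor name, does the compositor draw borders?)
compositors = [("Weston", False), ("KDE with server-side borders", True)]

results = {}
for app_name, app_draws in apps:
    for comp_name, comp_draws in compositors:
        results[(app_name, comp_name)] = frames_drawn(app_draws, comp_draws)
        print(f"{app_name} on {comp_name}: {results[(app_name, comp_name)]} frame(s)")
```

Only two of the four combinations give you exactly one frame. Getting the other two right needs the app and the compositor to agree on who draws, and that agreement is exactly what the core protocol leaves undefined.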
And this is where the problem starts to scale really badly. If we have X11 mandating everything then WMs ignore it and work around it leaving the mess we have now. If we go with the Wayland approach then too little is defined for currently common trends (a desire for notifications overlays, lock-screens, window minimisation and maximisation, etc.) to be shared between WMs meaning that we lose one of the biggest benefits of X11 – any app can run under any other windowing system almost seamlessly. Applications will either lose functionality, won’t run at all or will have huge overheads for developers. We’re not going to see one toolkit winning any time soon so this is a real problem and as a conceptual idea I see this as far worse than any of the fairly trivial technical issues that may remain.
So What to Do?
X cannot last and Wayland, in many ways, is an improvement. The only issue I really see with it is unfortunately a big one that I cannot see anybody having the desire or ability to easily solve in the near future. It would require all major toolkit creators (KDE, Qt, EFL) to work together on a lowest-common-denominator standard for their windowing systems that could at least share most functionality. This feels like it’s still a way off, and it is another obstacle to Wayland taking off any time soon.
I think the uptake and technical issues outside this more conceptual and political problem will be easily solved. With one major distribution now shipping with Wayland out of the box and it appearing in cars and smartphones it won’t be long before we see it being used more regularly. I’m excited by what it brings to us.
I, for one, am going to be attempting to swap full time to Wayland and will be chronicling my experiences shortly. I’ve already started working on my own UI toolkit designed solely for Wayland and am fully behind it but I do think we need to solve this underlying issue sooner rather than later. The slow uptake is actually a good thing as it allows a strategy to appear before we end up with the Wayland equivalent of X’s extensions but without common implementation (unlike in the X world). Defining everything is bad, defining nothing is bad – perhaps defining just a little bit more is what’s needed.
Regardless – I think Wayland is a massive step forward and am glad that after 30 years something other than X11 is now shipping with a Linux distribution. Kudos to Wayland.