Using HTML as a compile target
If you've been around an industry for long enough you end up with Opinions™ about things. Here's one of mine: I think one of the biggest missing pieces in web programming is support for treating HTML as a compile target.
Most folks reading this blog may not be aware of this, but I started my career working with web technologies, which I kept at for the better part of a decade. I co-authored one of the (at least to my knowledge) first [1] CSS-in-JS engines, I've written frontend frameworks, designed web compilers, and authored entire HTTP stacks. Even if web tech isn't really my current area of focus, it's something I've worked on a fair bit, and still care about a lot. So here's a post about an opinion I have, in a field I don't do much in currently, but which I'd still like to share.
[1] To my knowledge at the time at least. I'll totally buy that e.g. PHP
or Google's web compiler may have supported similar stuff - but we didn't know
about it at the time. When we designed sheetify in ~2015 ish
all the rage, and we figured we could do something similar without needing to
overload the meaning of
require. And we didn't know of anything that worked
similar at the time. Later,
styled-components became a thing, citing our
work on sheetify as an inspiration. But that's all ancient history at this point.
A taste of HTML
There are well over 100 unique HTML elements in the spec. Let's zoom in on just one of those: the <li> element. It inherits from a class hierarchy which is 4 levels deep, exposes two unique attributes, of which 1 has been deprecated and shouldn't be used. And it can't just be placed anywhere in the DOM: it has exactly 4 valid parent elements, of which 1 also has been obsoleted. Here are some examples of it in action:
<aside>
  <li>first</li>
  <li>second</li>
</aside>
<menu>
  <li>first</li>
  <li>second</li>
</menu>
Quiz time: which of these is valid? Which of these isn't? None, both, either one? When you write the code, you don't know. It's only when you try and render the code in the browser that you'll receive feedback.
Okay fine, the answer to the last question was that <aside><li> is invalid, while <menu><li> is valid. Quiz number two (no peeking at the reference): what is rendered here?
<ol>
  <li value="2">second</li>
  <li value="1">first</li>
</ol>
<menu>
  <li value="2">second</li>
  <li value="1">first</li>
</menu>
Okay, answer time: the first one is valid, the second one is not. But both will render, without error. It's just that in the first case value is respected (the items are numbered "2. second", then "1. first") while in the second case value is ignored and the items render as plain bullets. Did you know that? I sure didn't.
As you're probably aware I'm not actually trying to teach you the rules of <li>, but just trying to show that this teeny tiny basic element has a host of interactions that will affect what you build. And we're only scratching the surface of this element. It also interacts with styling, event handlers, ARIA roles, and elements contained within. The fact that value attributes on <li> are valid when nested within <ol>, but not within <menu> is just one interaction of thousands out there. Everyone gets things wrong all the time with this. And we need to do better. All these rules are specced, implemented, and (mostly) consistent. We need a way to automate this knowledge.
I think we need to start by acknowledging that HTML is complex. Not in a derogatory way, HTML is also really useful, but as a statement of fact. There are more rules and interactions than any reasonable person can manage. Which means we shouldn't expect people to be the ones managing it.
Instead, I believe that HTML would benefit from type safety. Right now the way HTML is validated is by shipping it to a browser, and then checking the console tab for errors. Or sometimes people run linters on it such as lighthouse or pa11y. Sometimes these things even happen in CI.
A shortcoming of this approach is that in order to validate whether you've done things right, you first have to load your web page into a browser engine, and only then you can validate whatever was rendered. But that's where it stops: it only validates what was rendered at that point. If your page has different components which can be in different states, or also has different pages, you need to ensure every one of those is validated. And that's often more work than it's worth, so people don't bother.
Instead the way I'm thinking about it is: what if we moved the validation to a compilation pass? That way we can prevent errors from occurring altogether. A compiler is able to reason about all states, and validate they're all valid, all in the blink of an eye. No browsers required. For people who're looking to reduce the number of runtime issues in their websites: this is by far the most feasible way to actually get the number of HTML bugs down to zero.
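To make the idea of a compilation pass a bit more concrete, here's a minimal sketch in Rust of the <li> rules from earlier encoded as types. Everything in it (List, ListParent, item_with_value) is made up for this example and isn't taken from typed-html or any other crate; it only covers three of the valid parents, and it's meant to show the shape of the idea rather than a real API.

```rust
use std::marker::PhantomData;

/// Marker trait: implemented only by elements that may contain `<li>`.
/// (Hypothetical name, invented for this sketch.)
pub trait ListParent {
    const TAG: &'static str;
}

pub struct Ol;
pub struct Ul;
pub struct Menu;
/// `<aside>` deliberately does NOT implement `ListParent`.
pub struct Aside;

impl ListParent for Ol { const TAG: &'static str = "ol"; }
impl ListParent for Ul { const TAG: &'static str = "ul"; }
impl ListParent for Menu { const TAG: &'static str = "menu"; }

/// A single `<li>`. Private, so items can only be created through `List`.
struct Li {
    text: String,
    value: Option<i64>,
}

/// A list whose parent element is tracked in the type.
pub struct List<P: ListParent> {
    items: Vec<Li>,
    parent: PhantomData<P>,
}

impl<P: ListParent> List<P> {
    pub fn new() -> Self {
        List { items: Vec::new(), parent: PhantomData }
    }

    /// Add a plain `<li>`; valid for every list parent.
    pub fn item(mut self, text: &str) -> Self {
        self.items.push(Li { text: text.to_string(), value: None });
        self
    }

    pub fn render(&self) -> String {
        let mut out = format!("<{}>", P::TAG);
        for li in &self.items {
            match li.value {
                Some(v) => out.push_str(&format!("<li value=\"{}\">", v)),
                None => out.push_str("<li>"),
            }
            out.push_str(&escape(&li.text));
            out.push_str("</li>");
        }
        out.push_str(&format!("</{}>", P::TAG));
        out
    }
}

/// The `value` attribute only exists on items of an `<ol>`.
impl List<Ol> {
    pub fn item_with_value(mut self, value: i64, text: &str) -> Self {
        self.items.push(Li { text: text.to_string(), value: Some(value) });
        self
    }
}

/// Minimal text escaping, just enough for this sketch.
fn escape(text: &str) -> String {
    text.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}
```

With this, List::<Ol>::new().item_with_value(2, "second") compiles, while List::<Menu>::new().item_with_value(2, "second") and List::<Aside>::new() are rejected by the compiler. Both mistakes from the quiz are caught before a browser is ever involved.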
This isn't the first time I'm writing about this: I first pitched this in 2019 because I didn't have the bandwidth to implement it myself. But Bodil did, and ran with it, subsequently publishing the typed-html crate. This is good; I'd love to see different approaches to this same idea spring up in Rust and in other languages. All that's needed to get started is an idea of what the resulting API should look like. And then write a compiler which takes .webidl node definitions, and outputs types in the right shape on the other end [2].
[2] It'd be amazing if the W3C kept a repo with these definitions up to date themselves, so everyone else could just pull them in. But one can wish.
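As a rough illustration of the "definitions in, types out" step, here's a small sketch in Rust. It assumes the .webidl files have already been parsed into a simple in-memory description; ElementDef, AttrDef and generate are hypothetical names invented for this example, and parsing real WebIDL is left out entirely.

```rust
/// One attribute of an element interface. A hand-written stand-in for
/// what a real WebIDL parser would produce (hypothetical type).
pub struct AttrDef {
    pub name: &'static str,
    pub rust_type: &'static str,
    pub deprecated: bool,
}

/// One element interface, e.g. `HTMLLIElement` for `<li>` (hypothetical type).
pub struct ElementDef {
    pub interface: &'static str,
    pub tag: &'static str,
    pub attrs: Vec<AttrDef>,
}

/// Attribute names such as `type` are Rust keywords, so the generated
/// field needs the raw-identifier prefix. Only a few keywords are listed.
fn field_name(attr: &str) -> String {
    match attr {
        "type" | "for" | "loop" | "as" | "async" => format!("r#{}", attr),
        _ => attr.to_string(),
    }
}

/// Emit Rust source text for a struct describing the element.
pub fn generate(def: &ElementDef) -> String {
    let mut out = String::new();
    out.push_str(&format!(
        "/// Generated from `{}` (`<{}>`).\n",
        def.interface, def.tag
    ));
    out.push_str(&format!("pub struct {} {{\n", def.interface));
    for attr in &def.attrs {
        if attr.deprecated {
            out.push_str("    #[deprecated]\n");
        }
        out.push_str(&format!(
            "    pub {}: Option<{}>,\n",
            field_name(attr.name),
            attr.rust_type
        ));
    }
    out.push_str("}\n");
    out
}
```

Feeding it a description of <li> with a value attribute and a deprecated type attribute yields a struct with a value field and a #[deprecated] r#type field. A real generator would also have to emit the parent/child rules, which is where most of the design work sits.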
Type-Safe Semantic Representation
Once you have a base of checked HTML types that's pretty good. But that doesn't mean we're quite done yet. Remember when I said we weren't getting into ARIA roles? Well, we are now.
So see, if HTML is structure, and CSS is representation, then ARIA is
"semantics". WAI-ARIA, or "ARIA roles" for short, enables you to ascribe meaning to
your structure. When you see a bunch of
<li> items in text form, what are you looking at?
Are they tabs? Are they a menu? Are they a dropdown? ARIA roles enable you to
meaningfully distinguish between them. How many of these roles are there? Over
250, and they all have different properties and interactions.
Again: the path away from trouble here is by creating abstractions. Instead of having
to remember which elements to apply the
roles on, you should be able to create something akin to Swift's
which encapsulates all of those differences for you.
This is a higher level of abstraction than just typed HTML provides. But it's equally important, since it's a requirement for useful keyboard navigation and accessibility. It'll take some more work than just plain typed HTML, and some more design work is required to incorporate best practices on how to actually structure elements since you can't just scrape a spec for definitions. But once you can reason about e.g. "tabs" or "feeds", building more complex things becomes a lot easier. Check out the ARIA Patterns page to get a sense of the kind of stuff this could cover.
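Here's a minimal sketch of what reasoning about "tabs" could look like in Rust. The Tabs type and its methods are hypothetical; the roles and attributes it emits (tablist, tab, tabpanel, aria-selected, aria-controls, aria-labelledby) follow the tabs pattern from the ARIA Patterns page as I understand it. It assumes the id passed in is a plain identifier, always selects the first tab, and leaves out the keyboard handling the full pattern asks for.

```rust
/// A tabs component: the author supplies labels and panel text,
/// the component decides which elements and ARIA roles to emit.
/// (Hypothetical type, invented for this sketch.)
pub struct Tabs {
    id: String,
    tabs: Vec<(String, String)>, // (label, panel text)
}

impl Tabs {
    /// `id` is assumed to be a plain identifier; it is not escaped.
    pub fn new(id: &str) -> Self {
        Tabs { id: id.to_string(), tabs: Vec::new() }
    }

    pub fn tab(mut self, label: &str, panel: &str) -> Self {
        self.tabs.push((label.to_string(), panel.to_string()));
        self
    }

    /// Render the tab list followed by one panel per tab.
    /// The first tab is treated as the selected one.
    pub fn render(&self) -> String {
        let mut tablist = String::from("<div role=\"tablist\">");
        let mut panels = String::new();
        for (i, (label, panel)) in self.tabs.iter().enumerate() {
            let selected = i == 0;
            tablist.push_str(&format!(
                "<button role=\"tab\" id=\"{id}-tab-{i}\" aria-controls=\"{id}-panel-{i}\" aria-selected=\"{selected}\" tabindex=\"{tabindex}\">{label}</button>",
                id = self.id,
                i = i,
                selected = selected,
                tabindex = if selected { 0 } else { -1 },
                label = escape(label),
            ));
            panels.push_str(&format!(
                "<div role=\"tabpanel\" id=\"{id}-panel-{i}\" aria-labelledby=\"{id}-tab-{i}\"{hidden}>{panel}</div>",
                id = self.id,
                i = i,
                hidden = if selected { "" } else { " hidden" },
                panel = escape(panel),
            ));
        }
        tablist.push_str("</div>");
        tablist + &panels
    }
}

/// Minimal text escaping, just enough for this sketch.
fn escape(text: &str) -> String {
    text.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}
```

The author of a page only writes Tabs::new("settings").tab("General", "...").tab("Security", "..."), and never has to remember which element gets which role or how the ids need to line up.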
To recap so far: we've established that HTML and ARIA are complex interfaces, that checking during compilation rather than at runtime has a lot of benefits, and that applying this to ARIA would enable us to reason about elements at a higher level of abstraction.
I believe that in order to make this all make the most sense, the web is missing
"sourcemaps for HTML". If we're going to be compiling a bunch of semantic
components down to HTML, then devtools should enable us to reason about them as
components. When inspecting HTML in the DOM, we shouldn't have to look at a
<span> tags. But more meaningful things, such as
The way I think about this is that devtools are like having access to a debugger. When
debugging a compiled program we usually don't reason in terms of
MOV rax jax and
POWF bonk donk. But the operations we actually wrote such as:
big_stretch. It's relatively infrequent that the actual output is what we're
interested in, but it's always there if you need it.
Some of you web heads might be wondering: "What about Web Components? - Isn't that the same?" And, like, kind of! It's similar in how a reflection in a mirror is similar to the thing it reflects: similar at first sight, but actually everything is the opposite.
The point of the Custom Elements part of Web Components is that you can send
<custom-tags> to the browser which are then interpreted at runtime to
introduce interactivity, but preserve qualities such as cacheability of shared
resources. A canonical example here is the
relative-time-element tag used by
GitHub. What we're doing is the opposite: the HTML we send to the browser is
already fully expanded, and doesn't need to interpret any JS to enable interactivity.
What we want is for the browser to be able to interpret all of our
<span> tags generated by our components as the
elements that we've authored. This is much closer to source maps: associate
the building blocks we generated with the semantic components they actually came from.
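To sketch what such an association could look like on the compiler side, here's a small Rust example that records, for every chunk of generated HTML, which component produced it. This is purely hypothetical: Output, Mapping, emit and component_at are names invented for this example, and as far as I know there's no browser mechanism today that would consume a map like this.

```rust
/// One entry of the map: a byte range in the generated HTML and the
/// authored component it came from. (Hypothetical type.)
pub struct Mapping {
    pub component: &'static str,
    pub start: usize,
    pub end: usize,
}

/// Generated HTML together with its component map. (Hypothetical type.)
pub struct Output {
    pub html: String,
    pub map: Vec<Mapping>,
}

impl Output {
    pub fn new() -> Self {
        Output { html: String::new(), map: Vec::new() }
    }

    /// Append the HTML a component expanded to, and remember its range.
    pub fn emit(&mut self, component: &'static str, html: &str) {
        let start = self.html.len();
        self.html.push_str(html);
        let end = self.html.len();
        self.map.push(Mapping { component, start, end });
    }

    /// Look up which component produced the byte at `offset`, if any.
    /// This is the question devtools would ask when inspecting a node.
    pub fn component_at(&self, offset: usize) -> Option<&'static str> {
        self.map
            .iter()
            .find(|m| m.start <= offset && offset < m.end)
            .map(|m| m.component)
    }
}
```

The lookup is the same one a debugger performs with a source map: given a position in the output, find the thing the author actually wrote.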
The similarity with custom elements comes back up when we take a look at how custom-elements are rendered in the browser's Elements tab:
<relative-time tense="past" datetime="2016-11-23T11:12:43-08:00" data-view-component="true" title="Nov 23, 2016, 8:12 PM GMT+1">
  #shadow-root (open)
    "6 years ago"
  November 23, 2016 11:12
</relative-time>
This contains all the information anyone needs to debug the element. And what I'd love is to be able to access this view for server-rendered components without ever needing to use any JS. And the uses don't stop there either: frameworks such as React also reason in terms of components and require custom devtools to re-surface that view. What we really need is a way to ascribe semantic meaning to our elements in a way that works for both server- and client-side applications.
Now, it could be that I've missed a capability that would enable doing exactly this with custom elements in browsers today - or perhaps some other mechanism. It's been a minute since I last used custom elements. But if that's the case I'd love to learn about it, because in my opinion every framework under the sun should start making use of that. I know I will!
In this post we've talked about the complexity of getting HTML right, how a compiler could be used to mitigate those issues, what ARIA roles are, how we could model higher-level type-safe components, and finally what's missing in the browser's devtools to round this out.
I strongly believe that type-safety would be a boon for the browser space. People have already experienced some of it with the adoption of TypeScript, and I believe leaning further into types and type-safety would enable the web platform to become far more accessible - both for developers and users.
The only piece we can't directly control is the ability to teach browsers to
reason about the semantic meaning of elements - almost like being able to
declare source maps for the resulting HTML. In an alternate universe this is
what Custom Elements would have been. But I believe in order to make the web
platform more accessible, we must make it possible to actually reason about
<calendar> as first-class items. And not just the
<p> tags they're comprised of.
Update (2023-04-07): As coincidence would have it, two weeks after publishing this post the Chrome and Safari teams began discussing the upcoming "declarative shadow DOM" feature. This seems like it will resolve the outstanding issues involving shipping semantically meaningful elements from the server.