Anne van Kesteren

HTML components

Hayato left a rather flattering review comment on my pull request for integrating shadow tree event dispatch into the DOM Standard. It made me reflect upon all the effort that came before us with regards to adding components to DOM and HTML. It has been a nearly two-decade journey to get to a point where all browsers are willing to implement, and then ship. It is not quite a glacial pace, but you can see why folks say that about standards.

What I think was the first proposal was simply titled HTML Components, better known as HTC, a technology by the Microsoft Internet Explorer team. Then in 2000, published in early 2001, came XBL, a technology developed at Netscape by Dave Hyatt (now at Apple). In some form that variant of XBL still lives on in Firefox today, although at this point it is considered technical debt.

In 2004 we got sXBL and in 2006 XBL 2.0, the latter largely driven by Ian Hickson with design input from Dave Hyatt. sXBL had various design disputes that could not be resolved among the participants. Selectors versus XPath was a big one. Though even with XBL 2.0 the lesson that namespaces are an unnecessary evil for rather tightly coupled languages was not yet learned. A late half-hearted revision of XBL 2.0 did drop most of the XML aspects, but by that time interest had waned.

There was another multi-year gap and then from 2011 onwards the Google Chrome team put effort into a different, more API-y approach towards HTML components. This was rather contentious initially, but after recent compromises with regards to encapsulation, constructors for custom elements, and moving from selectors to an even more simplistic model (basically strings), this seems to be the winning formula. A lot of it is now part of the DOM Standard and we also started updating the HTML Standard to account for shadow trees, e.g., making sure script elements execute.

Hopefully implementations follow soon and then widespread usage to cement it for a long time to come.

Network effects affecting standards

In the context of another WebKit issue around URL parsing, Alexey pointed me to WebKit bug 116887. It handily demonstrates why the web needs the URL Standard. The server reacts differently to %7B than it does to {. It expects the latter, despite the IETF STD not even mentioning that code point.

Partly to blame here are the browsers. In the early days code shipped without much quality assurance and many features got added in a short period of time. While standards evolved there was not much of a feedback loop going on with the browsers. There was no organized testing effort either, so the mismatch grew.

On the other side, you have the standards folks ignoring the browsers. While back then they did not necessarily partake that much in the standards debate, browsers have had an enormous influence on the web. They are the single most important piece of client software out there. They are even kind of a meta-client. They are used to fetch clients that in turn talk to the internet. As an example, the Firefox meta-client can be used to get and use the FastMail email client.

And this kind of dominance means that it does not matter much what standards say, it matters what the most-used clients ship. Since when you are putting together some server software, and have deadlines to make, you typically do not start with reading standards. You figure out what bits you get from the client and operate on that. And that is typically rather simplistic. You would not use a URL parser, but rather various kinds of string manipulation. Dare I say, regular expressions.
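
To make that concrete, here is a minimal sketch, with a made-up URL, of that kind of string manipulation next to an actual URL parser (the URL API defined by the URL Standard):

// Ad hoc: whatever follows "?" is treated as the query, byte for byte,
// so it matters a great deal whether the client sent { or %7B.
const input = "https://example.com/search?q={test}";
const naive = input.split("?")[1]; // "q={test}"

// With a parser, the URL Standard defines exactly which code points get
// percent-encoded, so every conforming client serializes this the same way.
const url = new URL(input);
console.log(url.search);                // "?q={test}" per the URL Standard
console.log(url.searchParams.get("q")); // "{test}", decoded consistently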

This might ring some bells. If it did, that is because this story also applies to HTML parsing, text encodings, cookies, and basically anything that browsers have deployed at scale and developers have made use of. This is why standards are hardly ever finished. Most of them require decades of iteration to get the details right, but as you know that does not mean you cannot start using any of it right now. And getting the details right is important. We need interoperable URL parsing for security, for developers to build upon them without tons of cross-browser workarounds, and to elevate the overall abstraction level at which engineering needs to happen.

Upstreaming custom elements and shadow DOM

We have reached the point where custom elements and shadow DOM slowly make their way into other standards. The Custom Elements draft has been reformatted as a series of patches by Domenic and I have been migrating parts of the Shadow DOM draft into the DOM Standard and HTML Standard. Once done this should remove a whole bunch of implementation questions that have come up and essentially turn these features into new fundamental primitives all other features will have to cope with.

There are still a number of open issues for which we need to reach rough consensus and input on those would certainly be appreciated. And there is still quite a bit of work to be done once those issues are resolved. Many features in HTML and CSS will need to be adjusted to work correctly within shadow trees. E.g., as things stand today in the HTML Standard a script element does not execute, an img element does not load, et cetera.
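
As a hypothetical sketch of the kind of case that needs defining (the element contents and the image URL are made up):

// Nothing in the HTML Standard, as it stands, says whether this script runs
// or this image loads once they end up inside a shadow tree.
const host = document.createElement("div");
const shadowRoot = host.attachShadow({ mode: "open" });

const script = document.createElement("script");
script.textContent = "console.log('does this run inside a shadow tree?');";
shadowRoot.appendChild(script);

const img = document.createElement("img");
img.src = "/icon.png"; // made-up URL
shadowRoot.appendChild(img);

document.body.appendChild(host);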

Enabling HTTPS and HSTS on DreamHost

DreamHost recently enabled Let’s Encrypt support. This is great and makes HTTPS accessible to a great many people. For new domains there is a simple HTTPS checkbox, could not be easier. For existing domains you need to make sure the domain’s “Web Hosting” is set to “Fully Hosted” and there are no funny redirects. If you have an Internationalized Domain Name it appears you are out of luck. If you have a great many subdomains (for which you should also enable HTTPS), beware of rate limits and wildcard certificates being unsupported.

The way DreamHost manages the rate limits is by rescheduling requests that do not succeed for a week later. Coupled with the fact that Let’s Encrypt certificates are relatively short-lived this places an upper bound on the number of subdomains you can have (likely around sixty). If you manage certificate requests from Let’s Encrypt yourself you could of course share a certificate across several subdomains, thereby increasing the theoretical limit to six thousand subdomains, but there is no way that I know of to do it this way through DreamHost.
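
As a back-of-the-envelope sketch, assuming the limits are roughly five certificates per registered domain per week, ninety-day certificate lifetimes (about twelve weeks between renewals), and up to a hundred names per certificate:

// Assumed limits, not official numbers: ~5 certificates per registered
// domain per week, ~12 weeks between renewals, 100 names per certificate.
const certsPerWeek = 5;
const weeksBetweenRenewals = 12;
const namesPerCertificate = 100;
console.log(certsPerWeek * weeksBetweenRenewals);                       // ~60 subdomains with one certificate each
console.log(certsPerWeek * weeksBetweenRenewals * namesPerCertificate); // ~6000 names when sharing certificates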

To make sure visitors actually get on HTTPS, use this in your .htaccess for each domain (assuming you use shared hosting):

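# Redirect anything that did not arrive over HTTPS to the same host and path over HTTPS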
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

(As long as domains are not using rewrite rules of their own you can in fact share this across many domains by placing it in a directory above the domains, but you will need to copy it for each domain that does use rewrite rules. RewriteOptions InheritDownBefore requires Apache 2.4.8 and DreamHost sits on Apache 2.2.22, although they claim they will update this in the near future. Very much unclear why the DreamHost wiki is still without HTTPS.)

The next thing you want to do is enable HSTS by adding this to your .htaccess (first make sure all your subdomains are on HTTPS too):

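# Only send HSTS on HTTPS responses (that is what the env=HTTPS condition is for)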
Header set Strict-Transport-Security "max-age=31415926; includeSubDomains; preload" env=HTTPS

The preload directive is non-standard, but important, since once this is all up and running you want to submit your domain for HSTS preloading. You can remove the preload directive after submitting your domain, if you care for the bytes (or standards). With that done, and thanks to DreamHost’s upgraded HTTPS story, you will get an A+ on the SSL [sic] Server Test.

Custom elements no longer contentious

I was in San Francisco two weeks ago. Always fun to see friends and complain about how poorly Caltrain compares to trains in most civilized countries. Custom elements day was at Apple, which you cannot reasonably get to via public transport from San Francisco. The “express” Caltrain to Mountain View and a surged Uber is your best bet. On the way back you can count on Ryosuke, who knows exactly how much gas is in his car well after the meter indicates it’s depleted.

While there are still details to be sorted with both custom elements and shadow DOM, we have made major headway since last time in getting cross-browser agreement on the contentious issues.

For those paying attention, none of this provides a consistent world view throughout. We gave up on that and hope that the combination of the parser doing things synchronously and other contexts not doing that will be enough to get folks to write their components in a way that is resilient to different developer practices.
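
As a rough sketch of what that resilience looks like for component authors, using the class-based custom element API that came out of these discussions (the element name and its contents are made up): keep the constructor minimal and do the real setup in connectedCallback(), so the element behaves the same whether the parser constructs it synchronously or it gets upgraded later.

class MyElement extends HTMLElement {
  constructor() {
    super();
    // Only initialize internal state here; children and attributes
    // may not be there yet, depending on how the element was created.
    this._initialized = false;
  }
  connectedCallback() {
    // Do the actual work once the element is in a document.
    if (!this._initialized) {
      this.attachShadow({ mode: "open" }).textContent = "hello";
      this._initialized = true;
    }
  }
}
customElements.define("my-element", MyElement);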

Three years at Mozilla

I started in London during a “work week” of the Platform group. I had also just moved to London the weekend prior so everything was rather new. I don’t remember much from that week, but it was a nice way to get to know the people I had not met yet through standards and figure out what things I could be contributing to.

Fifteen months later I moved to Switzerland to prepare for the arrival of Oscar and Mozilla has been hugely supportive of that move. That was so awesome. Oscar is too of course and might I add he is a little bigger now and able to walk around the house.

Over the years I have helped out with many different features that ended up in Gecko and Servo (the web engines Mozilla develops) through a common theme. I standardize the way the web works to the best of my ability. In the form of answering questions, working out fixes to standards such as the security model of the Location and Window objects, and helping out with the development of new features such as “foreign fetch”. I hope to continue doing this at Mozilla for many years to come.

W3C forks HTML yet again

The W3C has forked the HTML Standard for the nth time. As always, it is pretty disastrous:

So far this fork has been soundly ignored by the HTML community, which is as expected and desired. We hesitated to post this since we did not want to bring undeserved attention to the fork. But we wanted to make the situation clear to the web standards community, which might otherwise be getting the wrong message. Thus, proceed as before: the standards with green stylesheets are the up-to-date ones that should be used by implementers and developers, and referred to by other standards. They are where work on crucial bugfixes such as setting the correct flags for <img> fetches and exciting new features such as <script type=module> will take place.

If there are blockers preventing your organization from working with the WHATWG, feel free to reach out to us for help in resolving the matter. Deficient forks are not the answer.

— The editors of the HTML Standard

Firefox OS is not helping the web

Mozilla has been working on Firefox OS for quite a while now and ever since I joined I have not been comfortable with it. Not the high-level goal of turning the web into an OS, that seems great, but the misguided approach we are taking to get there.

The problem with Firefox OS is that it started from an ecosystem parallel to the web. Packaged applications written using HTML, JavaScript, and CSS. Distributed through an app store, rather than a URL. And because Mozilla can vet what goes through the store, these applications have access to APIs we could never ship on the web due to the same-origin policy.

This approach was chosen in part because the web does offline poorly, and in part because certain native APIs could not be made to work for the web and alternatives were not duly considered. The latest thinking on Firefox OS does include URLs for applications, but the approach still necessitates a parallel security model to that of the web. Implemented through a second certificate authority system, for code. With Mozilla as the sole authority, and a “plan” to decentralize that over time.

As stated, the reason is APIs that violate the same-origin policy, or more generally, go against the assumed browser sandbox. E.g., if Mozilla decides your code is trustworthy, you get access to TCP and can poke around the user’s local network. This is quite similar to app stores, where typically a single authority decides what is trustworthy and what is not. With app stores the user has to install, but has the expectation that the authority (e.g., Apple) only distributes trustworthy software.

I think it is wishful thinking that we could get the wider web community to adopt a parallel certificate authority system for code. The implications for the assumed browser sandbox are huge. Cross-site scripting vulnerabilities in sites with extra authority suddenly result in the user’s local network being compromised. If an authority made a mistake during code review, the user will be at far more risk than usual.

The certificate authority system the web uses today basically verifies that when you connect to example.com, it actually is example.com, and all the bits come from there. And that is already massively complicated and highly political. Scaling that system, or introducing a parallel one as Firefox OS proposes, to work for arbitrary code seems incredibly farfetched.

What we should do instead is double down on the browser. Leverage the assumed browser sandbox. Use all the engineering power this frees up to introduce new APIs that do not require the introduction and adoption of a parallel ecosystem. If we want web email clients to be able to connect to arbitrary email servers, let’s back JMAP. If we want to connect to nearby devices, FlyWeb. If we want to do telephony, let’s solidify and enhance the WebRTC, Push, and Service Worker APIs to make that happen.

There are many great things we could do if we put everyone behind the browser. And we would have the support of the wider web community. In the sense that our competitors would feel compelled to also implement these APIs, thereby furthering the growth of the web. As we have learned time and again, the way to change the web is through evolution, not revolution. Small incremental steps that make the web better.

Update on standardizing shadow DOM and custom elements

There has been revived interest in standardizing shadow DOM and custom elements across all browsers. To that end we had a bunch of discussion online, met in April to discuss shadow DOM, and met earlier this month to discuss custom elements (custom elements minutes). There is agreement around shadow DOM now. host.attachShadow() will give you a ShadowRoot instance. And <slot> elements can be used to populate the shadow tree with children from the host. The shadow DOM specification will remain largely unchanged otherwise. Hayato is working on updates.
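
In code that looks roughly like this (the host element and its contents are made up):

// A host element with a child that will be rendered through the slot.
const host = document.createElement("div");
host.innerHTML = "<span>light tree content</span>";

// attachShadow() returns a ShadowRoot; the <slot> pulls in the host's children.
const shadowRoot = host.attachShadow({ mode: "open" });
shadowRoot.innerHTML = "<p>before</p><slot></slot><p>after</p>";

document.body.appendChild(host);
// Renders "before", then the host's <span>, then "after".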

This is great, we can start implementing these changes in Gecko and ship them. Other browsers plan on doing the same.

Custom elements however are still somewhat astray. The main pain points revolve around consistency and upgrades.

During a break in the meeting Maciej imagined various hacks that would meet the consistency and upgrade requirements, but none seemed workable on closer scrutiny, although an attempt will still be made. That and figuring out whether JavaScript needs to run during DOM operations will be our next set of steps. Hopefully with some more research a clearer answer for custom elements will emerge.
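
To illustrate the upgrade problem with a made-up element name (and using the customElements.define() registry purely for illustration): an element can already be in the tree before its definition is registered, and the open question is how and when the two get reconciled.

// <my-element> is parsed before any definition exists, so it starts out
// as a plain element the registry knows nothing about.
document.body.innerHTML = "<my-element></my-element>";

// Registering the definition afterwards has to upgrade the existing
// instance: when does its constructor run, and what may it observe?
class MyElement extends HTMLElement {
  connectedCallback() {
    console.log("connected");
  }
}
customElements.define("my-element", MyElement);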

Still looking for alternatives to CORS and WebSocket

Due to the same-origin policy protecting servers behind a firewall, cross-origin HTTP came to browsers as CORS and TCP as WebSocket. Neither CORS nor WebSocket addresses the problem of accessing existing services over protocols such as IRC and SMTP. A proxy of sorts is needed.

We came up with an idea whereby the browser would ship with an HTTP/TCP/UDP API by default. Instead of opening direct connections, a user-configurable public internet proxy would be used, with a default provided by the browser. The end goal would be having routers announce their own public internet proxy to reduce latency and increase privacy. Unfortunately routers have no way of knowing whether they are connected to the public internet, so this plan falls short. (There were other concerns too, e.g. shipping and supporting an open proxy indefinitely has its share of issues.)

There might still be value in standardizing some kind of proxy for HTTP/TCP/UDP traffic that is selected by web developers rather than the browser. Similar to TURN servers in WebRTC. Thoughts welcome.
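
For comparison, this is roughly how web developers already pick TURN servers in WebRTC today (the server URL and credentials are placeholders):

// The page, not the browser, decides which relay server to use.
const connection = new RTCPeerConnection({
  iceServers: [{
    urls: "turn:turn.example.com", // placeholder
    username: "user",              // placeholder
    credential: "secret"           // placeholder
  }]
});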