Anne van Kesteren

Contributing to standards

I was asked how one contributes to standards. Before anything else, it is worth watching Domenic Denicola’s presentation on making friends and influencing standards bodies. It is awesome and will teach you a great deal.

I think the core thing to understand when considering contributing to web standards is that they are created by communities. Typically there are a few people leading the charge and many people contributing with critique, research, and tests. Usually there is a combination of mailing lists, IRC, and the occasional face-to-face meeting, to keep everyone roughly synchronized.

A lot of discussion still happens through email and given the volume you will need to filter it to some extent. An effective way of doing this is by paying more attention to the peers you know and trust, and checking from time to time whether that list of peers needs adjusting. For example, if you follow the development of JavaScript you want to read email from Allen Wirfs-Brock and Brendan Eich. If you follow the development of HTML you want to read email from Ian Hickson. You’ll quickly find out Boris Zbarsky is insightful irrespective of the mailing list involved. If the right people are not immediately obvious to you, you can always ask on IRC. These people will often reply to the key points within a thread and make it immediately obvious what it is about and why it might be worth paying attention to. That way you save yourself reading the whole thing. Of course you will need to judge for yourself how to filter, but some amount of filtering will be required if you want to keep up with the community and still get some work done.

You want to figure out what community to participate in:

Unfortunately there’s a myriad of other smaller lists for particular APIs. Usually the standard you care about has relevant pointers. If it doesn’t, please file a bug or let someone else know as it definitely should.

Studying the output of the community (the standard and tests) and its ongoing progress (the mailing list) is a good way to get a feel for how things work and what you should pay attention to. It can help to read the WHATWG FAQ too, as it documents answers to many common questions. Having familiarized yourself with the material and the environment, you should feel more than ready to start participating more actively, in particular if you see something worthy of improvement.

Monkey patch

There appears to be a trend where specifications monkey patch a base specification. A monkey patch is a subtle change to an existing algorithm that is only observable if you have read both the new and the base specification. Some examples: Custom Elements attempts to redefine the createElement() method; Resource Timing adds a hook into each fetching end-point within a document without actually defining this in any amount of detail; Content Security Policy hijacks JavaScript’s eval(). (Using dated TR/ URLs here as an exception so these examples remain useful going forward.)

Apparently it is not clear that this is bad design. We should avoid monkey patching (hereafter patching). It has at least these problems:

Note that it is fine to have extension points. Both adopting and cloning of nodes can be hooked into by other specifications (and soon JavaScript, for Custom Elements). Explicit extension points make the model clear. If adopting were instead merely patched from HTML’s img element definition, it would not be clear to someone reading the adopting algorithm that adopting is actually more involved.
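By analogy, in JavaScript terms (a hypothetical sketch, not code from any of the specifications above): a patch rewires a function from the outside, whereas an extension point is part of the base definition itself.

```js
// Patching: wrap a function you do not own. Nothing in the base
// definition reveals that its behavior has changed.
const original = document.createElement;
document.createElement = function (name) {
  console.log("extra behavior"); // invisible to readers of the base
  return original.call(this, name);
};

// Explicit extension point: the base itself declares where extensions
// run, so a reader of the base sees that more may be involved.
const hooks = [];
function createElementWithHooks(name) {
  const element = original.call(document, name);
  for (const hook of hooks) hook(element); // declared part of the contract
  return element;
}
```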

If you encounter patching, please file a bug. If you are writing a specification and temporarily want to patch a base specification to help implementations along, file a bug on the base specification so the community is informed of what you are trying to do.

One year at Mozilla

I love figuring out the web platform and making it better.

Last year (since I started February 4) I worked on Fetch, URLs, DOM, XMLHttpRequest, Fullscreen, Notifications, and Encoding, all published through the WHATWG under CC0. Apart from that I focused on bringing JavaScript and the web platform closer together by trying to foster better mutual understanding. The intersection of DOM, HTML, IDL, and JavaScript around the details of script execution, tasks, microtasks, and multiple globals has also been a recurring theme. This year the plan is to solve offline.

Countries in 2013

I moved to the United Kingdom to work for Mozilla last year and it has been excellent so far. Getting close to a full year now. Since I have listed countries in 2008, 2009, 2010, 2011, and 2012, I thought I would do it again:

State of promises

A brief overview of the current state of promises, as people have been asking me all over. We thought we were fully done, but decided on a course correction. Based on discussions with Mark and Tab, Domenic has been doing great work writing up the new algorithms in Promise Unwrapping Algorithm. This defines a subset of the envisioned model (it does not support promises-for-promises, but could in due course). This design will be integrated into DOM until it can move to JavaScript proper. At the next TC39 meeting in a month we hope to declare consensus. After that we can start shipping in browsers.

To be clear, the fundamental aspects of promises remain unchanged. And we should continue using them for all new features that require asynchronous values.
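To illustrate with the promise API as it later shipped in JavaScript (a minimal sketch, not the Promise Unwrapping Algorithm itself): resolving a promise with another promise adopts that promise’s state, so consumers never observe a promise-for-a-promise.

```js
const inner = Promise.resolve(42);

// Resolving with a promise (or any thenable) adopts its state rather
// than nesting it.
const outer = new Promise((resolve) => resolve(inner));

outer.then((value) => {
  console.log(value); // 42
  console.log(value instanceof Promise); // false
});
```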

URL: IDNA2003

Previously, in reverse chronological order: IDNA Hell, URL: IDNA2008, and URL: domain names.

IDNA2003 consists of two important algorithms: ToASCII and ToUnicode. Both operate on a single domain label (i.e. not a whole domain name). To obtain one or more domain labels from a domain name it needs to be split on dots (U+002E, U+3002, U+FF0E, and U+FF61).
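A minimal sketch of that splitting step in JavaScript (the function name is made up; the character class simply lists the four separators):

```js
// Split a domain name into labels on the four dots: U+002E full stop,
// U+3002 ideographic full stop, U+FF0E fullwidth full stop, and
// U+FF61 halfwidth ideographic full stop.
function domainToLabels(domain) {
  return domain.split(/[\u002E\u3002\uFF0E\uFF61]/);
}

domainToLabels("example。com"); // ["example", "com"]
```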

Apart from doing a range check and checks for certain code points, ToASCII encompasses two major algorithms: Nameprep and Punycode (see Wikipedia’s Punycode). Nameprep is a specific profile of Stringprep. Stringprep, in turn, does a number of things: mapping code points, Unicode normalization (NFKC — “Die, heretic scum!”), checking forbidden code points, checking proper use of bidirectional code points, and checking unassigned code points (although this last one will not happen in browsers).

ToUnicode does the reverse, with the caveat that it cannot fail. If it fails at any point the original input is returned instead.

The URL Standard standardizes on IDNA2003 as that is what the most widely deployed clients implement. It does override one requirement, namely to use the latest version of Unicode rather than Unicode 3.2.

The IDNA section of the URL Standard references IDNA2003’s ToASCII and ToUnicode and makes appropriate requirements around them. The status quo now has better documentation than before. It seems unlikely clients will update to IDNA2008 as it’s not a straightforward replacement (it has nothing equivalent to ToASCII and ToUnicode) and is not backwards compatible.
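Assuming a runtime that implements the URL Standard, the effect of ToASCII is observable through the hostname getter (the host name here is made up):

```js
// The label "mañana" Punycode-encodes to "maana-pta", prefixed "xn--".
new URL("https://mañana.example/").hostname; // "xn--maana-pta.example"
```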

Decade

Ten years now since First Item!. And five months since I started at Mozilla. Pretty sweet.

Currently working on URLs again. In particular, file URLs. Does file: relative to file:///test/path yield file:/// or file:///test/path? I don’t even…
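For what it is worth, feeding the question to a modern implementation of the URL Standard (where this was eventually pinned down) suggests the latter:

```js
new URL("file:", "file:///test/path").href; // "file:///test/path"
```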

London TAG F2F

Last week was the second reformed TAG meeting, this time with new chairs, and hosted by me at Mozilla in London. I felt that overall it went well, though there was quite a bit of repetition. Getting to a shared understanding takes more time than desired. Takeaways:

Also, the W3C TAG is now on GitHub. It took some arguing internally, but this will make us more approachable to the community. We also plan to have a developer meetup of sorts around our meetings (a little more structured than the first one in London) to talk these things through in person. Feel free to drop me a line if something is unclear.

Fetching URLs

There are a ton of features in the web platform that take a URL. (As the platform is built around URLs, that makes a ton of sense, too.) XMLHttpRequest, <img>, background-image, <script>, @font-face, … The semantics around obtaining a resource from such a URL, however, are not very well defined. Are redirects followed? What if the server uses HTTP authentication? What if the server returns 700 as the status code for the resource? Does a data URL work? Does about:blank work? Is the request synchronous? What if I use a skype URL? Or mailto? Is CORS used? What value will the Referer header have? Can I read data from the resource returned (e.g. via the canvas element)? Can I display it?

What seems rather trivial is actually rather complicated.

At the moment Ian Hickson has defined some of this in the HTML Standard. In an algorithm named fetch. Then CORS came about (for sharing cross-origin resources) and the idea was that it would neatly layer on top, but it ended up rather intertwined. And now there is another layer controlling fetching, named CSP. To reduce some of this intertwinedness and simplify defining new features that take a URL, I wrote the Fetch Standard. It supersedes HTML fetch and CORS and should be quite a bit clearer about the actual model as well as fix a number of edge cases.

It is not entirely done, but it is at the point where review would be much appreciated.
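Many of these questions correspond to knobs on a request in the Fetch model. As a rough sketch, using the fetch() API that grew out of this work (the URL is made up):

```js
// Each option answers one of the questions above.
const response = await fetch("https://api.example/data.json", {
  mode: "cors",               // is CORS used? can I read the response?
  redirect: "follow",         // are redirects followed?
  credentials: "same-origin", // is HTTP authentication used?
  referrer: "about:client",   // what value will the Referer header have?
});
console.log(response.status); // the model defines which statuses surface
```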

Making the web platform more suitable for “apps”

At Mozilla we’re trying to bring the web platform closer to what is taken for granted in the “walled gardens” of our time (Apple’s App Store, Google Play, and friends). A big thing we need to solve is offline. Like applications, sites should just work without network connectivity. Some variant of “NavigationController” (the name is bad) will give us that, but we need to iterate on it more. And in particular we need to test it to make sure performance is adequate and the API is simple enough.
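As a rough sketch of the kind of control under discussion, in terms of what this eventually shipped as (service workers; the cache name and URLs are illustrative):

```js
// A worker script that intercepts fetches so the site works offline.
self.addEventListener("install", (event) => {
  event.waitUntil(
    caches.open("app-v1").then((cache) => cache.addAll(["/", "/app.js"]))
  );
});

self.addEventListener("fetch", (event) => {
  // Serve from the cache when possible; fall back to the network.
  event.respondWith(
    caches.match(event.request).then((hit) => hit || fetch(event.request))
  );
});
```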

We have an API for end-user notifications, but after the site is closed, clicking the notification from the notification center will fail (what should happen?), and if there are multiple browsing contexts with the same site open there is also some ambiguity as to which should receive focus. The permission grant is per-origin, but a single origin can host multiple sites. Push notifications face similar issues: the site is not open, but a push notification for it comes in; where should it be delivered?

The idea we have been toying with is a worker that could be fired up whenever there is some external event that cannot be directly handled by the site (e.g. when the site is not open). This idea is not new; Google suggested it long ago, but it did not take off. A change from their model would be to not make these workers persistent, but rather short-lived, so they are not too wasteful. Part of the application logic would move to the server, and push notifications can be used to wake the worker (we have been using “event worker” as a name) to e.g. notify the user or synchronize state for when the user next navigates to the relevant site.
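A rough sketch of that flow, again in terms of what the event worker idea eventually became (the notification text is illustrative): the worker is woken for the event, does a small amount of work, and can then be shut down again.

```js
// A short-lived worker woken by a push message while the site is closed.
self.addEventListener("push", (event) => {
  event.waitUntil(
    self.registration.showNotification("New message", {
      body: "Something happened while the site was closed.",
    })
  );
});
```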