Anne van Kesteren

Shadow tree encapsulation theory

A long time ago Maciej wrote down five types of encapsulation for shadow trees (i.e., node trees that are hidden in the shadows from the document tree):

  1. Encapsulation against accidental exposure — DOM nodes from the shadow tree are not leaked via pre-existing generic APIs — for example, events flowing out of a shadow tree don't expose shadow nodes as the event target.
  2. Encapsulation against deliberate access — no API is provided which lets code outside the component poke at the shadow DOM. Only internals that the component chooses to expose are exposed.
  3. Inverse encapsulation — no API is provided which lets code inside the component see content from the page embedding it (this would have the effect of something like sandboxed iframes or Caja).
  4. Isolation for security purposes — it is strongly guaranteed that there is no way for code outside the component to violate its confidentiality or integrity.
  5. Inverse isolation for security purposes — it is strongly guaranteed that there is no way for code inside the component to violate the confidentiality or integrity of the embedding page.

Types 3 through 5 have no support at all, and type 4 and 5 encapsulation would be hard to pull off due to Spectre. User agents typically use a weaker variant of type 4 for their internal controls, such as the video and input elements, one that does not protect confidentiality. The DOM and HTML standards provide type 1 and 2 encapsulation to web developers, type 2 mostly because Apple and Mozilla pushed rather hard for it. It might be worth providing an updated definition for the first two as we’ve come to understand them:

  1. Open shadow trees — no standardized web platform API provides access to nodes in an open shadow tree, except APIs that have been explicitly named and designed to do so (e.g., composedPath()). Nothing should be able to observe these shadow trees other than through those designated APIs (or “monkey patching”, i.e., modifying objects). Limited form of information hiding, but no integrity or confidentiality.
  2. Closed shadow trees — very similar to open shadow trees, except that the designated APIs also do not get access to nodes in a closed shadow tree (a minimal sketch of the difference follows below).
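Assuming two hypothetical custom elements, defined purely for illustration, the difference shows up in whether code outside the component can obtain the ShadowRoot object at all:

class MyOpen extends HTMLElement {
  constructor() {
    super();
    this.attachShadow({ mode: "open" }).innerHTML = "<button>open</button>";
  }
}
class MyClosed extends HTMLElement {
  constructor() {
    super();
    // The closed root is only reachable through references the component keeps.
    const shadowRoot = this.attachShadow({ mode: "closed" });
    shadowRoot.innerHTML = "<button>closed</button>";
  }
}
customElements.define("my-open", MyOpen);
customElements.define("my-closed", MyClosed);

const openElement = document.createElement("my-open");
const closedElement = document.createElement("my-closed");
console.log(openElement.shadowRoot);   // ShadowRoot object
console.log(closedElement.shadowRoot); // null; not even designated APIs get in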

Type 2 encapsulation gives component developers control over what remains encapsulated and what is exposed. You still need to take all your users into account and expose the best possible public API for them, but at the same time it protects you from folks taking a dependency on the guts of the component: aspects you might want to refactor or add functionality to over time. This is much harder with type 1 encapsulation, as there will be APIs that can reach into the details of your component, and once users rely on them you cannot refactor it without updating all the callers.

Now, both type 1 and 2 encapsulation can be circumvented, e.g., by a script overriding the attachShadow() method or mutating another built-in that the component has taken a dependency on (a sketch of such a patch follows below). In other words, there is no integrity, and since component and page run in the same origin there is no security boundary either. The limited form of information hiding is primarily a maintenance boon and a way to manage complexity. Maciej addresses this as well:

If the shadow DOM is exposed, then you have the following risks:

  1. A page using the component starts poking at the shadow DOM because it can — perhaps in a rarely used code path.
  2. The component is updated, unaware that the page is poking at its guts.
  3. Page adopts new version of component.
  4. Page breaks.
  5. Page author blames component author or rolls back to old version.

This is not good. Information hiding and hiding of implementation details are key aspects of encapsulation, and are good software engineering practices.
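As for the circumvention mentioned above: a page script that runs before any component code can patch attachShadow() and capture every shadow root, open or closed. A sketch:

// Patch a built-in before any component code runs; every shadow root created
// afterwards leaks to the page, whether its mode is "open" or "closed".
const originalAttachShadow = Element.prototype.attachShadow;
Element.prototype.attachShadow = function (init) {
  const shadowRoot = originalAttachShadow.call(this, init);
  console.log("captured", init.mode, "shadow root of", this.localName);
  return shadowRoot;
};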

Heading levels

The HTML Standard has contained an algorithm to compute heading levels for the past fifteen years or so. It is fairly complex and not implemented anywhere. E.g., for the following fragment

<body>
 <h4>Apples</h4>
 <p>Apples are fruit.</p>
 <section>
  <div>
   <h2>Taste</h2>
   <p>They taste lovely.</p>
  </div>
  <h6>Sweet</h6>
  <p>Red apples are sweeter than green ones.</p>
  <h1>Color</h1>
  <p>Apples come in various colors.</p>
 </section>
</body>

the headings would be “Apples” (level 1), “Taste” (level 2), “Sweet” (level 3), “Color” (level 2). Determining the level of any given heading requires traversing through its previous siblings and their descendants, its parent and the previous siblings and descendants of that, et cetera. That is too much complexity and optimizing it with caches is evidently not deemed worth it for such a simple feature.

However, throwing out the entire feature and requiring everyone to use h1 through h6 forever, adjusting them accordingly based on the document they end up in, is not very appealing to me. So I’ve been trying to come up with an alternative algorithm that would allow folks to use h1 with sectioning elements exclusively while giving assistive technology the right information (default styling of h1 is already adjusted based on nesting depth).

The simpler algorithm only looks at ancestors for a given heading and effectively only does so for h1 (unless you use hgroup). This leaves the above example in the same weird state it is in in today’s browsers, except that the h1 (“Color”) would become level 2. It does so to minimally impact existing documents, which usually either use h1 only as a top-level element or, per the somewhat-erroneous recommendation of the HTML Standard, use it everywhere; in the latter case the outcome would improve dramatically.
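To illustrate, here is a rough sketch of such an ancestor-based computation. It is purely my own interpretation, ignores hgroup, and is not the text being worked on in the standard:

// h2 through h6 keep their number; the level of an h1 is derived from the
// number of sectioning elements it is nested in.
const sectioningElements = new Set(["article", "aside", "nav", "section"]);

function headingLevel(heading) {
  const number = Number(heading.localName[1]); // 1 for h1 through 6 for h6
  if (number !== 1) return number;
  let level = 1;
  for (let ancestor = heading.parentElement; ancestor; ancestor = ancestor.parentElement) {
    if (sectioningElements.has(ancestor.localName)) level++;
  }
  return level;
}

For the fragment above this yields level 2 for “Color” and leaves the other headings at their numeric level.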

I’m hopeful we can have a prototype of this in Firefox soon and eventually supplement it with a :heading/:heading(…) pseudo-class to provide additional benefits to folks who level their headings correctly. Standards-wise much of this is being sorted out in whatwg/html #3499 and various issues linked from there.

The case for XML5

My XML5 idea is over twelve years old now. I still like it as web developers keep running into problems with text/html:

XML in browsers has much less of a compatibility footprint. Coupled with the fact that XML does not always return a tree for a given byte stream, backwards-compatible extensions (in the sense that old well-formed documents still parse the same way) are possible. There is a chance for it to ossify like text/html though, so perhaps XML5 ought to be amended somewhat to leave room for future changes.

(Another alternative is a new kind of format to express node trees, but then we have at least three problems.)

any.js

Thanks to Ms2ger web-platform-tests is now even more awesome (not in the American sense). To avoid writing HTML boilerplate, web-platform-tests supports .window.js, .worker.js, and .any.js resources, for writing JavaScript that needs to run in a window, dedicated worker, or both at once. I very much recommend using these resource formats as they ease writing and reviewing tests and ensure APIs get tested across globals.

Ms2ger extended .any.js to also cover shared and service workers. To test all four globals, create a single your-test.any.js resource:

// META: global=window,worker
promise_test(async () => {
  const json = await new Response(1).json();
  assert_equals(json, 1);
}, "Response object: very basic JSON parsing test");

And then you can load it from your-test.any.html, your-test.any.worker.html, your-test.any.sharedworker.html, and your-test.https.any.serviceworker.html (requires enabling HTTPS) to see the results of running that code in those globals.

The default globals for your-test.any.js are a window and a dedicated worker. You can unset the default using !default. So if you just want to run some code in a service worker:

// META: global=!default,serviceworker

Please give this a try and donate some tests for your favorite API annoyances.

Five

Five years at Mozilla today. I’m very humbled to be able to push the web forward with so many great people and leave a tiny imprint on web architecture and the way the web platform gets standardized. Being able to watch from the sidelines as more people are empowered to be systems programmers and as graphics for the web is reinvented is hugely exciting. It’s a very tough competitive landscape, and Firefox is very much the underdog, but despite that Mozilla manages to challenge rather fundamental assumptions about web browsers and deliver on them.

And ultimately, I think that is a huge part of what makes the web platform so great. Multiple independent implementations competing with each other and thereby avoiding ossification of bugs, vendor lock-in, platform lock-in, software monoculture, and overall reluctance to invest in fundamentally improving the web platform. Really grateful to be part of all this.

Testing standards

At a high level, standards organizations operate in similar ways. A standard is produced and implementations follow. Taking a cue from software engineering, WHATWG added active maintenance to the mix by producing Living Standards. The idea being that just like unmaintained software, unmaintained standards lead to security issues and shaky foundations.

The W3C worked on test suites, but never drove that work to the point of test-driven development or of ensuring the test suites fully covered the standards. The WHATWG community produced some tests, e.g., for the HTML parser and the canvas API, but there was never a concerted effort. The idea being that as long as you have a detailed enough standard, interoperable implementations will follow.

Those with a background in quality assurance, and those who might have read Mark Pilgrim’s Why specs matter, probably know this to be false, yet it has taken a long time for tests to be considered an essential part of the standardization process. We’re getting there in terms of acceptance, which is great as crucial parts of the web platform, such as CSS, HTML, HTTP, and smaller things like MIME types and URLs, all have the same kind of long-standing interoperability issues.

These interoperability issues are detrimental to all constituencies:

Therefore I’d like everyone to take this far more seriously than they have been. Always ask about the testing story for a standard. If it doesn’t have one, consider that a red flag. If you’re working on a standard, figure out how you can test it (hint: web-platform-tests). If you work on a standard that can be implemented by lots of different software, ensure the test suite is generic enough to accommodate that (shared JSON resources with software-specific wrappers have been a favorite of mine).
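To illustrate that last pattern: the test cases live in a JSON resource that any implementation can consume, and each piece of software adds a thin wrapper around it. The sketch below is loosely modeled on how web-platform-tests exercises URL parsing; treat the file name and fields as illustrative:

promise_test(async () => {
  const response = await fetch("resources/urltestdata.json");
  const testData = await response.json();
  for (const test of testData) {
    // The shared data mixes in comment strings and expected-failure cases;
    // skip those to keep the sketch short.
    if (typeof test !== "object" || test.failure) continue;
    const url = new URL(test.input, test.base ?? undefined);
    assert_equals(url.href, test.href, test.input);
  }
}, "URL parsing against shared JSON test data");

A non-browser URL parser can consume the same JSON resource from its own harness, which is the point of keeping the data software-agnostic.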

Effectively, this is another cue standards need to take from modern software development practices. Serious software requires tests to accompany changes; standards should too. Ensuring standards, tests, and implementations are developed in tandem results in a virtuous cycle of interoperability goodness.

(It would be wrong not to acknowledge Ecma’s TC39 here, who produced a standard for JavaScript that is industry-leading with everything derived from first principles, and also produced a corresponding comprehensive test suite shared among all implementations. It’s a complex standard to read, but the resulting robust implementations are hard to argue with.)

Using GitHub

I’ve been asked a few times how I stay on top of GitHub:

This works well for me; it may work for you.

What I miss is Bugzilla’s needinfo. I could see this as a persistent notification that cannot be dismissed until you go into the thread and perform the action asked of you. What I also miss on /notifications is the ability to see if someone mentioned me in a thread. I often want to unsubscribe based on the title, but I don’t always do it out of fear of neglecting someone.

Dara

Dara was born.

MIME type interoperability

In order to figure out data: URL processing requirements I have been studying MIME types (also known as media types) lately. I thought I would share some examples that yield different results across user agents, mostly to demonstrate that even simple things are far from interoperable:

These are the relatively simple issues to deal with, though it would have been nice if they had been sorted by now. The MIME type parsing issue also looks at parsing for the Content-Type header, which is even messier, with different requirements for its request and response variants.
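One quick way to observe such differences from a browser console is to put a MIME type in a data: URL and read back what the user agent ended up with; the inputs below are generic illustrations rather than the exact cases referred to above:

// Fetch a data: URL and inspect the resulting Content-Type header to see how
// the MIME type was parsed and reserialized.
const inputs = ["text/html", "TEXT/HTML", "text/html;charset=GBK;x=y"];
for (const input of inputs) {
  fetch(`data:${input},x`).then(response =>
    console.log(JSON.stringify(input), "=>", response.headers.get("Content-Type")));
}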

Browser differences in IDNA ToASCII processing between ASCII and non-ASCII input

At the moment the URL Standard passes the domain of certain schemes through the ToASCII operation for further processing. I believe this to be in line with how the ToASCII operation is defined. It expects a domain, whether ASCII or non-ASCII, and either returns it normalized or errors out.

Unfortunately, it seems like the web depends on ToASCII effectively being a no-op when applied to ASCII-only input (at least for some cases), as that is how browsers seem to behave in the tests below:

For each input, the expected ToASCII result is given first, followed by the observed behavior in Chrome 58 dev, Edge 14.14393, Firefox 54.0a1, and Safari TP 23:

  x01234567890123456789012345678901234567890123456789012345678901x
  A domain that is longer than 63 code points. ToASCII: error, unless VerifyDnsLength is passed. Chrome: no error. Edge: no error. Firefox: no error. Safari: no error.

  x01234567890123456789012345678901234567890123456789012345678901†
  The same domain with a non-ASCII code point. ToASCII: error, unless VerifyDnsLength is passed. Chrome: error. Edge: error. Firefox: error. Safari: error.

  aa--
  A domain that contains hyphens at the third and fourth position. ToASCII: error. Chrome: no error. Edge: no error. Firefox: no error. Safari: no error.

  a†--
  The same, with a non-ASCII code point. ToASCII: error. Chrome: error. Edge: no error, returns the input. Firefox: no error, returns xn--a---kp0a. Safari: error.

  -x
  A domain that begins with a hyphen. ToASCII: error. Chrome: no error. Edge: no error. Firefox: no error. Safari: no error.

  -†
  The same, with a non-ASCII code point. ToASCII: error. Chrome: error. Edge: no error, returns the input. Firefox: no error, returns xn----xhn. Safari: error.

There is also a slight difference in error handling: rather than returning the input, Chrome returns the input percent-encoded.

(I used the Live URL Viewer and Live DOM Viewer to get these results, typically prefixing the input with https://.)
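For what it is worth, the hyphen cases can also be reproduced from a browser console with the URL API; a parse error surfaces as a TypeError from the constructor and otherwise the normalized host can be inspected:

// \u2020 is †, the non-ASCII code point used in the tests above.
for (const input of ["aa--", "a\u2020--", "-x", "-\u2020"]) {
  try {
    console.log(input, "=>", new URL(`https://${input}/`).host);
  } catch (e) {
    console.log(input, "=> error");
  }
}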