In order to figure out data: URL processing requirements, I have been studying MIME types (also known as media types) lately. I thought I would share some examples that yield different results across user agents, mostly to demonstrate that even simple things are far from interoperable:
These are the relatively simple issues to deal with, though it would have been nice if they had been sorted by now. The MIME type parsing issue also looks at parsing for the Content-Type header, which is even messier, with different requirements for its request and response variants.
At the moment the URL Standard passes the domain of certain schemes through the ToASCII operation for further processing. I believe this to be in line with how the ToASCII operation is defined. It expects a domain, whether ASCII or non-ASCII, and either returns it normalized or errors out.
Unfortunately, it seems like the web depends on ToASCII effectively being a no-op when applied to ASCII-only input (at least for some cases), judging by how browsers behave in these tests:
|Input||Description||ToASCII Expected||Chrome 58 dev||Edge 14.14393||Firefox 54.0a1||Safari TP 23|
||A domain that is longer than 63 code points.||Error, unless VerifyDnsLength is passed.||No error.||No error.||No error.||No error.|
||A domain that contains hyphens at the third and fourth position.||Error.||No error.||No error.||No error.||No error.|
||Error.||No error, returns input.||No error, returns
||A domain that begins with a hyphen.||Error.||No error.||No error.||No error.||No error.|
||Error.||No error, returns input.||No error, returns
Error handling also differs slightly: rather than returning the input unchanged, Chrome returns it percent-encoded.
(I used the Live URL Viewer and Live DOM Viewer to get these results, typically prefixing the input with
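The behavior the browsers appear to share can be sketched as follows. This is only an illustration, not how any browser or the URL Standard implements it: the function names are made up for this example, and Python's built-in idna codec implements IDNA 2003 rather than UTS #46, so it merely stands in for a real ToASCII. It also only demonstrates the length check, not the hyphen checks.

```python
def strict_to_ascii(domain: str) -> str:
    """Always run the conversion, which rejects labels longer than 63."""
    # Assumption: Python's "idna" codec (IDNA 2003) stands in for ToASCII.
    return domain.encode("idna").decode("ascii")


def lenient_to_ascii(domain: str) -> str:
    """Treat the conversion as a no-op for ASCII-only input."""
    if domain.isascii():
        # Mirrors the observed behavior: no length checks are applied.
        return domain
    return strict_to_ascii(domain)
```

With a 64-character label, strict_to_ascii raises UnicodeError while lenient_to_ascii returns the input unchanged. Non-ASCII input still gets converted in both cases.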
We have been moving WHATWG standards to be deployed through GitHub and Travis CI. This way we can generate snapshots for each commit which in turn makes it easier to read older obsolete copies of the standard. The final step in our build process moves the resources to the server using SSH.
Unfortunately we have been doing this in a bad way. The documentation from Travis suggests using ssh_known_hosts, and lots of other documentation suggests passing -o StrictHostKeyChecking=no as an argument. The risks of these approaches and their secure alternatives are unfortunately not always outlined. Both of these open you up to network attackers: you effectively do not know what server you end up connecting to. It could be the one you know, or it could be that of an attacker. Note also that in the case of Travis’s ssh_known_hosts it is not even trust-on-first-use. It is trust-on-each-use (i.e., trust-always), so you can be attacked each time Travis runs. I filed issue 472 since what we need is trust-never, as the network is unsafe.
As far as I can tell this is not a big deal for WHATWG standards, since they are completely public and the worst that could happen is that an attacker stops publication of the standard, which they could do even if we had a proper setup (by terminating the network connection). However, it does set a bad example and we would not want folks to copy our code and have to know the limitations of it. It should just be good.
The easiest way to do Travis deployments securely that I have found is to create a known_hosts resource and pass -o UserKnownHostsFile=known_hosts as an argument (thanks Tim). That ensures the rsync --rsh="ssh" program will not prompt. However, rather than not prompting because you told it to bypass a security check, it is not prompting because everything is in order. Of course, this does require that the contents of known_hosts are obtained out-of-band from a secure location, but you need to be doing that anyway.
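As a rough sketch of what such an invocation could look like, the helper below assembles the rsync command line. The helper name, the source directory, and the destination are hypothetical, and StrictHostKeyChecking=yes is an addition of this sketch rather than something the setup above requires. It makes ssh refuse unknown or changed host keys outright.

```python
import shlex


def build_rsync_command(source: str, destination: str,
                        known_hosts: str = "known_hosts") -> list[str]:
    """Build an rsync invocation that only trusts keys in known_hosts."""
    ssh_command = shlex.join([
        "ssh",
        # Only accept host keys listed in the file obtained out-of-band.
        "-o", f"UserKnownHostsFile={known_hosts}",
        # Assumption of this sketch: fail instead of prompting on mismatch.
        "-o", "StrictHostKeyChecking=yes",
    ])
    return ["rsync", "--recursive", f"--rsh={ssh_command}",
            source, destination]
```

The resulting list can be handed to subprocess.run(..., check=True), so that a host key mismatch fails the build instead of being silently accepted.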
The XMLHttpRequest Standard now makes use of that secure deployment process and the remainder of WHATWG standards will soon follow.
With that, if any of the following is true, you probably need to fix your setup:
A couple years ago I wrote Contributing to standards and it is worth noting how everything has gotten so much better since then. Basically all due to GitHub and standards groups such as TC39, WHATWG, and W3C embracing it. You can more easily engage with only those standards you are interested in. You can even subscribe to particular issues that interest you and disregard everything else. If you contrast that with mailing lists where you likely get email about dozens of standards and many issues across them, it’s not hard to see how the move to GitHub has democratized standards development. You will get much further with a lot less lost time.
Thanks to pull requests, changing standards is easier too. Drive-by-grammar-fixes are a thing now and “good first bug” issue labels help you get started with contributing. Not all groups have adopted one-repository-per-standard yet, which can make it a little trickier to contribute to CSS for instance, but hopefully they’ll get there too.
(See also: my reminder on the WHATWG blog that WHATWG standards are developed on GitHub.)
Andrew pointed out webrender yesterday: a new rendering technology for CSS from the folks that are reinventing C++ with Rust and browsers with Servo. There is a great talk about this technology by Patrick Walton. It is worth watching in its entirety, but 26 minutes in has the examples. The key insight is that using a retained mode approach to rendering CSS is much more efficient than an immediate mode approach. The latter is what browsers have been using thus far and makes sense for the canvas element (which offers an immediate mode rendering API), but is apparently suboptimal when talking to the GPU. Patrick mentioned this was pointed out back in 2012 by Mark J. Kilgard and Jeff Bolz from NVIDIA in a paper titled GPU-accelerated Path Rendering:
We believe web browsers should behave more like video games in this respect to exploit the GPU.
The WHATWG Fetch Standard is an essential part of the browser networking subsystem. Basically any API that involves networking (e.g., <a href> (through navigation), WebSocket) goes through Fetch. The exception is WebRTC’s RTCDataChannel and perhaps not surprisingly it has a security issue. The fetch() API is also defined in terms of Fetch and the similar naming has led to some confusion. Fetch is basically the subsystem and fetch() is one of the many APIs that exposes (part of) the capabilities of Fetch.
The basic setup is that an API prepares a request, which consists of a URL and a number of variables, feeds that to Fetch, and at some point gets a response, which consists of a body and a number of variables. Fetch takes care of content security policies, referrer policies, invoking service workers, credentials, cache modes, CORS, HSTS, port blocking, default headers (and whether they get exposed to service workers), X-Content-Type-Options: nosniff, and more. In part Fetch defines essential infrastructure such as CORS, redirect handling, port blocking, and overall terminology, and in part it serves as glue between the now numerous standards that together define the browser networking subsystem.
E.g., for redirects, Fetch defines which headers are preserved, whether a request body gets cloned and reused (it usually does), how the referrer policy gets updated, what happens with redirects to non-HTTP schemes (fail, except when navigating sometimes), but the actual connection opening and request transmission is largely left to TLS and HTTP. And as a consequence of all APIs using Fetch, redirects behave the same throughout. There are exceptions to the rule of course, but redirects are no longer a problem we need to solve on a per-API basis. And when you extrapolate this redirects example to content security policies, referrer policies, service workers, and all the other little things Fetch takes care of, it should be clear why it is essential.
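To make the shape of this concrete, here is a toy model of the request-in, response-out setup with shared redirect handling. It is a sketch under heavy assumptions and not the Fetch algorithm: the class and function names are invented, the transmit callback stands in for HTTP and TLS, and headers, bodies, referrer policy, and the navigation exception are all ignored. The limit of 20 redirects does match the Fetch Standard.

```python
from dataclasses import dataclass, field
from urllib.parse import urljoin, urlsplit

MAX_REDIRECTS = 20  # the redirect limit used by the Fetch Standard
REDIRECT_STATUSES = (301, 302, 303, 307, 308)


@dataclass
class Request:
    url: str
    redirect_count: int = 0


@dataclass
class Response:
    status: int
    headers: dict[str, str] = field(default_factory=dict)
    body: bytes = b""


NETWORK_ERROR = Response(status=0)


def fetch(request: Request, transmit) -> Response:
    """Toy shared subsystem; every API would call this, not transmit."""
    while True:
        response = transmit(request)
        location = response.headers.get("location")
        if response.status not in REDIRECT_STATUSES or location is None:
            return response
        target = urljoin(request.url, location)
        # Redirects to non-HTTP schemes fail (navigation is ignored here).
        if urlsplit(target).scheme not in ("http", "https"):
            return NETWORK_ERROR
        if request.redirect_count == MAX_REDIRECTS:
            return NETWORK_ERROR
        request = Request(target, request.redirect_count + 1)
```

Because every hypothetical API in this model would go through the same fetch() function, a change to the redirect rules applies to all of them at once, which is the point made above.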
(See Fetching URLs for an earlier introduction.)
There are two computing models today that have mass-market appeal, are safe-by-default, are app-driven (no OS access), and provide some degree of sandboxing for their apps: Web and Store. The major difference is that Web computing has decentralized publishing (it would be distributed if not for domain registrars and certificate authorities) and Store computing is by definition centralized. Decentralizing Store computing is unlikely to ever succeed and I have argued before that such a system cannot reasonably exist as part of Web computing. (Arguably Web computing is a form of centralized computing. Certificate authorities are ultimately grounded in a list managed by the browser or the OS the browser runs in.)
Web and Store computing both rely on the end user for a number of permission decisions that control powerful APIs. Can this app use geolocation? Can this app use notifications? Can this app use the camera? User-controlled permissions have been a great innovation in computing.
As discussed previously Web computing does not offer HTTP/TCP/UDP access. Web computing might do Bluetooth, but what is on offer is less capable than Store computing and sits behind a permission decision. USB is a similar story and there are undoubtedly more APIs.
Another way of looking at this is that Store computing is vulnerable to exfiltration of intranet data and other “local” attacks. Web computing protects the intranet and the “local network” through the same-origin policy and simply not providing certain APIs. Store computing relies on an initial installation/trust decision by the user, review by the Store owner, and app revocation by the Store owner. Store computing does not require permission decisions for these APIs. And Web computing does not offer permission decisions for these APIs as they are deemed too powerful.
Developers looking to solve problems in the HTTP/TCP/UDP/Bluetooth space will likely become Store computing developers as Web computing cannot address their needs. In turn they might convince their colleagues that Store computing is “better” and slowly grow that ecosystem at the expense of Web computing. The question is then whether there is a mismatch in the security requirements between Web and Store computing or whether this disparity of functionality is intrinsic in their respective security architectures.
The track record of reviewing apps has not been perfect. Google now performs manual reviews for Play Store submissions after getting into ActiveX-level badness and Apple had to battle a malicious version of Xcode. Assuming such mistakes continue to happen, users will continue to be vulnerable, as they can easily be guided towards installing a Store computing app through directions offered on Web computing (access to which is typically offered as part of Store computing). Of course, were Web computing to offer such APIs, users would be vulnerable too. The only recourse would be using the anti-phishing and malware infrastructure, which is not too dissimilar from app revocation. The question is whether users would be more vulnerable.
Assume that Web computing got a trust decision that goes further than just trusting the lock-domain combination in the address bar. The next problem is Web computing apps lacking isolation, i.e., they are vulnerable to XSRF and XSS. Those are a direct result of a shared cookie jar among all Web computing apps for a given user, the ability to manipulate URLs, and the ability to inject code through forms that might end up executing in the app. Store computing apps on iOS have been attacked through URL schemes and on Android through intents, but not to the same extent as Web computing I believe. So apart from a trust decision and revocation, Web computing apps might need new isolation primitives before even being allowed to ask for more trust.
None of that addresses the aspect of app review and the Store having some kind of relationship with the app developer, other than perhaps dismissing that as not being a crucial part of the security story. The question is whether app review can reasonably protect against intranet data exfiltration.
The goal of this post is to frame the problem space and encourage exploration towards solutions that make Web computing more powerful, without ceding the aspects we hold dear. That is, if Web computing provides a new trust decision, isolation, and revocation, can it expose HTTP/TCP/UDP/Bluetooth and more?
From “Perspectives on security research, consensus and W3C Process”:
There have been a number articles and blog posts about the W3C EME work but we’ve not been able to offer counterpoints to every public post…
There has been enough of a shitstorm about W3C and DRM that we had to write something.
First, the W3C is concerned about risks for security researchers.
We are concerned with the PR-optics of the EFF rallying against us.
W3C TAG statements have policy weight.
The W3C TAG has no place in the W3C Process.
This TAG statement was reiterated in an EME Factsheet, published before the W3C Advisory Committee meeting in March 2016 as well as in the W3C blog post in April 2016 published when the EME work was allowed to continue.
The W3C TAG gets some publicity, but has no place in the W3C Process.
Second, EME is not a DRM standard.
We are actively enabling DRM systems to integrate with the web.
The W3C took the EFF covenant proposal extremely seriously.
We proposed it to the four-hundred-and-some conservative Member companies and let them do the dirty work, per usual. We will only lead the web to its full potential when there is agreement among the four-hundred-and-some conservative Member companies.
One criteria for Recommendation is that the ideas in the technical report are appropriate for widespread deployment and EME is already deployed in almost all browsers.
We will continue to ignore the actual ramifications of browsers shipping DRM systems.
What are the various security boundaries the platform offers? I have an idea, but I’m not completely sure whether it is exhaustive:
document.domain has forced this upon us.
There is also the HTTP cache, which leaks everywhere, but is far less reliable.