Anne van Kesteren

Browser differences in IDNA ToASCII processing between ASCII and non-ASCII input

At the moment the URL Standard passes the domain of certain schemes through the ToASCII operation for further processing. I believe this to be in line with how the ToASCII operation is defined. It expects a domain, whether ASCII or non-ASCII, and either returns it normalized or errors out.

Unfortunately, it seems like the web depends on ToASCII effectively being a no-op when applied to ASCII-only input (at least for some cases), as is the way browsers seem to behave from these tests:

Input Description ToASCII Expected Chrome 58 dev Edge 14.14393 Firefox 54.0a1 Safari TP 23
x01234567890123456789012345678901234567890123456789012345678901x A domain that is longer than 63 code points. Error, unless VerifyDnsLength is passed. No error. No error. No error. No error.
x01234567890123456789012345678901234567890123456789012345678901† Error. Error. Error. Error.
aa-- A domain that contains hyphens at the third and fourth position. Error. No error. No error. No error. No error.
a†-- Error. No error, returns input. No error, returns xn--a---kp0a. Error.
-x A domain that begins with a hyphen. Error. No error. No error. No error. No error.
-† Error. No error, returns input. No error, returns xn----xhn. Error.

There is also a slight difference in error handling as rather than returning input, Chrome returns the input percent-encoded.

(I used the Live URL Viewer and Live DOM Viewer to get these results, typically prefixing the input with https://.)