Years ago Sam Ruby posted URI Equivalence (thanks Robbert!). I have been studying URLs lately to write a new URL standard. Turns out that something so fundamental to the platform is non-interoperable in various ways. Quelle surprise! And yes, the plan is to do away with IRI/URI and just call them all URLs. Anyway:
http://example.com/ http://example.com true
HTTP://example.com/ http://example.com/ true
http://example.com/ http://example.com:/ true
http://example.com/ http://example.com:80/ true
http://example.com/ http://Example.com/ true
http://example.com/~smith/ http://example.com/%7Esmith/ false
http://example.com/~smith/ http://example.com/%7esmith/ false
http://example.com/%7Esmith/ http://example.com/%7esmith/ false
http://example.com/%C3%87 http://example.com/C%CC%A7 false
The last four are false because browsers (apart from Chrome) do not unescape URL escapes. Well, and the last one is really false because Unicode normalization is not performed throughout the web platform. These are equal, as Ç expands to %C3%87 during parsing:
http://example.com/%C3%87 http://example.com/Ç
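To make the results above concrete, here is a rough Python sketch of a comparison that reproduces them. It is not the algorithm from any standard or browser. The names `normalize`, `equivalent`, and `DEFAULT_PORTS` are made up for illustration, only the http and https default ports are assumed, and userinfo, query, and fragment are ignored. The idea is to lowercase the scheme and host, drop an empty or default port, treat an empty path as "/", and percent-encode non-ASCII path characters as UTF-8, while never decoding or case-folding escapes that are already there.

```python
from urllib.parse import urlsplit, quote

# Assumption for this sketch: only these two schemes have a default port.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Characters left untouched in the path. "%" is included so that existing
# escapes are kept byte-for-byte (no unescaping, no case folding).
PATH_SAFE = "/%~:@!$&'()*+,;="


def normalize(url: str) -> str:
    """Hypothetical helper: return a comparable form of an http(s) URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # parts.port is None both when the port is missing and when it is empty.
    port = parts.port
    if port == DEFAULT_PORTS.get(scheme):
        port = None
    netloc = host if port is None else f"{host}:{port}"

    # An empty path is treated as "/"; non-ASCII characters are encoded as
    # UTF-8 percent escapes, existing escapes are left alone.
    path = quote(parts.path or "/", safe=PATH_SAFE)
    return f"{scheme}://{netloc}{path}"


def equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two URLs after normalize()."""
    return normalize(a) == normalize(b)


if __name__ == "__main__":
    pairs = [
        ("http://example.com/", "http://example.com"),
        ("http://example.com/", "http://example.com:80/"),
        ("http://example.com/~smith/", "http://example.com/%7Esmith/"),
        ("http://example.com/%C3%87", "http://example.com/C%CC%A7"),
        ("http://example.com/%C3%87", "http://example.com/\u00c7"),
    ]
    for a, b in pairs:
        print(a, b, equivalent(a, b))
```

Keeping "%" in the safe set is the deliberate choice here: it is what makes %7E, %7e, and ~ compare as different, matching the behavior described above rather than the more aggressive normalization some libraries apply.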