text/xml?
Sam made the following remark in his rather nice response to my flamebait: "I will also note that the same reasons that I think text/xml should be deprecated apply equally well to text/html."
draft-murata-kohn-lilley-xml-02.txt is the last edition of the draft that tries to deprecate text/xml. Section 3.1, paragraph 3 of XML media types is the reason (section 8.5 gives an example) some people want to deprecate it in favor of application/xml. Browsers seem to ignore that particular rule though and treat it the same as application/xml. (Do I hear someone saying "silly browser vendors" in the background?) Any chance we can get that in the specification? It seems pretty compatible to me. And more pragmatic.
There exists (existed?) a class of software called transcoders. Their purpose in life was to convert from one character set to another.
Such software does not understand every possible data format; in fact, the text/* class of MIME types was designed for it. A transcoder merely needed to look for this pattern and possibly a charset parameter. By the rules, this charset was supposed to override everything you might find inside the document.
The reality is that some data formats embed their own charset information. XML does so in the prolog. HTML permits meta http-equiv. The knowledge of XML required to parse the prolog is minimal. The knowledge of HTML required to parse the meta tag is somewhat deeper. Furthermore, the popular usage of the meta tag, namely to specify the charset, is not supported by the specification (see references below).
Needless to say, a transcoder that changes the character set of the body and updates only the external HTTP header produces broken results for these formats.
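To make the failure concrete, here is a minimal sketch of such a transcoder in Python. The function names (parse_content_type, naive_transcode, declared_xml_encoding) are made up for illustration, and the regular expression is a deliberately crude stand-in for a real XML declaration parser. It assumes the body is byte-oriented (not UTF-16) and that the header carries a charset parameter.

```python
import re
from email.message import Message

# Crude pattern for the encoding pseudo-attribute of an XML declaration.
# An illustrative shortcut, not a conforming XML parser.
XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def parse_content_type(value):
    """Return (media_type, charset or None) from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


def naive_transcode(content_type, body, target="utf-8"):
    """Hypothetical transcoder: knows text/* and the charset parameter only."""
    media_type, charset = parse_content_type(content_type)
    if not media_type.startswith("text/") or charset is None:
        return content_type, body  # not ours to touch
    new_body = body.decode(charset).encode(target)
    # Only the external header is updated; the document is not inspected.
    return f"{media_type}; charset={target}", new_body


def declared_xml_encoding(body):
    """Return the encoding named inside the XML declaration, if any."""
    match = XML_DECL_ENCODING.match(body)
    return match.group(1).decode("ascii").lower() if match else None


original = '<?xml version="1.0" encoding="iso-8859-1"?><n>é</n>'.encode("iso-8859-1")
header, body = naive_transcode("text/xml; charset=iso-8859-1", original)
# The header now says utf-8, but the declaration still says iso-8859-1.
mismatch = parse_content_type(header)[1] != declared_xml_encoding(body)
```

After the call, the header and the bytes agree with each other, but the declaration inside the document is stale. Anything that later trusts the declaration, for example after the file has been saved to disk and the header is gone, will misread the non-ASCII characters.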
This leads to some tension. Should one follow the specifications, or follow actual usage? Should one make exceptions only for popular formats, but not for unpopular ones?
References
I know that Outlook does ‘transcoding’. I sent an attachment with the MIME type correctly specified as text/plain; charset=UTF-8, and the damn thing actually converted the file to Latin-1 when saving it (breaking all the non-English names in the file). Transcoding bad :).
~Grauw
There exists (existed?) a class of software called transcoders.
Until someone proves otherwise, I continue assuming that transcoding proxies (that are not tightly coupled with a mobile browser to form a distributed UA) are a myth on the Web today. Most often people who point to the transcoder problem in the context of HTTP have heard about transcoding proxies but have not seen one in the wild recently. (Russian Apache does not count. It is a transcoding origin server.)
Besides, considering how form submissions work, I highly doubt that a transcoding proxy could be deployed without breaking form submissions spectacularly.
So this problem exists for text/html, text/css, text/javascript and text/xml. Look at that, all the popular web formats!
So this problem exists for text/html, text/css, text/javascript and text/xml. Look at that, all the popular web formats!
I am not aware of a mechanism to specify a character set inside of text/css or text/javascript. If no such mechanism exists, one could transcode text/css from iso-8859-1 to utf-8, for example, and do so safely by simply adjusting the charset parameter on the Content-Type HTTP header to match.
For CSS there is @charset.
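A rough Python sketch of what that means for a transcoder: before re-encoding a stylesheet and adjusting only the header, it would have to check whether the sheet opens with an @charset rule. The helper names below are hypothetical, and the check assumes the rule appears in its exact byte form at the very start of the file, which is where CSS expects it.

```python
import re

# An @charset rule only counts at the very start of the stylesheet,
# written exactly as: @charset "name";
CSS_CHARSET = re.compile(rb'^@charset "([^"]*)";')


def css_declared_charset(body):
    """Return the charset named by a leading @charset rule, or None."""
    match = CSS_CHARSET.match(body)
    return match.group(1).decode("ascii", "replace").lower() if match else None


def css_safe_to_transcode(body):
    """Hypothetical check: header-only bookkeeping suffices only if the
    stylesheet carries no charset declaration of its own."""
    return css_declared_charset(body) is None


plain = b"body { color: black }"
labelled = b'@charset "iso-8859-1";\nbody { color: black }'
```

With these inputs, plain can be re-encoded and relabelled through the header alone, while labelled would also need its @charset rule rewritten.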
Files with the .js extension are generally served as application/x-javascript by servers and thus aren't affected by transcoding problems. Now that there are registered MIME types for JavaScript one can still switch to application/javascript or application/ecmascript since browsers ignore the media type for files referenced from <script src>.
Until someone proves otherwise, I continue assuming that transcoding proxies ... are a myth on the Web today.
I bet Google Mobile transcodes. It's not tightly-coupled with a specific UA, either.
Until someone proves otherwise, I continue assuming that transcoding proxies ... are a myth on the Web today.
I bet Google Mobile transcodes. It's not tightly-coupled with a specific UA, either.
Isn’t it a URL rewriter and not a proxy in the HTTP sense?
It does a variety of things, but you're correct that it's not a transparent proxy (unless Google uses the same system in a proxy for its own apps).