Anne van Kesteren

Why MIME types are not like handing someone a cup of coffee saying it’s hot chocolate

Consider the following document:

<html></html>

When someone hands me a cup of coffee saying it’s hot chocolate, I can figure out it is coffee quite easily. But can you figure out the MIME type of the above document? You probably can’t. The above document can be served with multiple MIME types, and each will make the browser parse it differently.

text/plain
The browser will show you the text as a literal string.
text/html
The browser will generate an HTML DOM consisting of one root node, HTML (in uppercase, yes), and two child nodes: HEAD and BODY. (I used Mozilla to get the document object model; HTML5 will probably define this.)
application/xml
The browser will generate an XML DOM consisting of one root node, html.
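
The three cases above can be sketched in a few lines of Python. This is only an illustration, not what a browser does internally: `interpret` and `TagCollector` are hypothetical names made up for this sketch, and Python's standard library does not implement a browser's HTML tree construction, so the text/html branch reports only the tags literally present and does not add HEAD or BODY the way Mozilla does.

```python
from html.parser import HTMLParser
from xml.dom import minidom

DOCUMENT = "<html></html>"


class TagCollector(HTMLParser):
    """Records the start tags seen by Python's tolerant HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def interpret(body, content_type):
    """Hypothetical helper: choose a parser from the declared MIME type alone."""
    if content_type == "text/plain":
        # No parsing at all: the markup is just characters to display.
        return ("text", body)
    if content_type == "text/html":
        # Assumption: a tokenizer stands in for a real HTML parser here.
        # Unlike a browser, it does not insert implied HEAD and BODY elements.
        collector = TagCollector()
        collector.feed(body)
        collector.close()
        return ("html", collector.start_tags)
    if content_type == "application/xml":
        # Strict XML parse: one root element, name kept exactly as written.
        root = minidom.parseString(body).documentElement
        return ("xml", root.tagName, len(root.childNodes))
    raise ValueError(f"unsupported content type: {content_type}")


for mime in ("text/plain", "text/html", "application/xml"):
    print(mime, "->", interpret(DOCUMENT, mime))
```

The bytes never change between the three calls; only the label does, and the label alone decides which result you get.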

(This also shows why you want to have slug support in your weblog, in case you didn’t figure that one out already.)

Comments

  1. (This also shows why you want to have slug support in your weblog, in case you didn’t figure that one out already.)

    Sorry for being slow on the uptake, but WTF?

    Posted by Turnip at

  2. it's more like handing somebody a cup of decaf and claiming it's regular (a crime in itself)...

    Posted by patrick h. lauke at

  3. I don't grasp the issue here.

    It sounds like you're complaining that you can't ID a file's MIME type from its contents. But shouldn't MIME type stuff be a server-side issue? File extensions take care of the issue on the local side (except on Macs). If you're looking through files in a non-GUI system (like telnet/ssh), then you're most likely an uber-geek who knows what's what anyway. So I'm confused about where the issue is.

    Posted by Devon at

  4. If you hand me that document (regardless of extension) and I open it in Notepad or some other text editor, then to me at that moment it will be just plain text. You cannot, however, judge what it really is based on a textual representation; you need to see it in the context it was meant to be in. The MIME type tells a computer how the document should be treated and what it should be. If I send an XHTML document with a text/html MIME type, then eventually, within the browser context, which is what it was meant for, IT IS HTML. No XHTML DTD will change that.

    To me XHTML is more than syntax alone; it's a kind of technology and it has special features and properties. When sent as text/html it lacks all of those but in return gets all the features and properties that belong to HTML. Therefore, in my opinion, it IS HTML at that moment, unless the browser tells me it cannot possibly be HTML based on the content.

    Posted by Tino Zijdel at

  5. Turnip, well you don’t want the title of this post to be hyphenated for the URI.

    Posted by Anne at

  6. I'm not sure whether Mozilla still appends a dummy body element to the empty html element: bug 57717.

    Posted by minghong at

  7. "its" in the title should of course read "it's". Is this the state of the web, that we cannot allow proper language due to deficiencies in our blog software? Or was it a simple mistake?

    Posted by Chris Hester at

  8. I’m using Word 2002 from time to time.

    minghong, I tested this in a recent nightly… Either that bug fixed something else or I made a mistake.

    Posted by Anne at

  9. Thanks for clarifying Anne. I thought you were in some way relating it to MIME types ; ).

    Posted by Turnip at

  10. MIME types are not like handing someone a cup of coffee saying it’s hot chocolate.

    Isn’t that what MIME types are for — to prevent situations like this from occurring? MIME types are (or at least, should be) like handing someone a cup of coffee, explicitly stating it’s a cup of coffee. They have to make clear to user agents what kind of document they’re dealing with.

    When a text/html document (coffee) is served as application/xml (it’s hot chocolate), it will be interpreted as an application/xml document, regardless of the source code. (Certain people like their coffee with gallons of milk and tons of sugar; others just like it black. This is where user agents come in.)

    So what’s the point in being able to tell what type of document the above example is just by looking at its source code?! The Content-Type header will tell you whether it’s text/plain, text/html, or application/xml.

    Posted by Mathias Bynens at

  11. Mathias, you misquoted me (the important part is missing) and missed the point.

    Posted by Anne at

  12. Slug support is something so basic and necessary that I can't believe there were (and are) blog systems that don't have it straight out of the box.

    Posted by Faruk Ates at