Anne van Kesteren

innerHTML in XML

Firefox 1.1 promises to be a great browser for developers. It will ship with native SVG support (enabled) and will support cursor:url(link), the CSS3 cursor values, and CSS3 overflow-x and overflow-y. It will also support the outline property, the new CANVAS element, and the TABINDEX attribute on every element for accessibility. (See Extending tabindex for custom HTML widgets and The tabindex Attribute.) It also has support for ‘ECMAScript for XML’ (E4X) and many more enhancements.

Among those cool new features, it will also support innerHTML in XML documents. To give you an idea of how it should work I put two innerHTML testcases online here: 001 & 002. For it to work, the string you assign as the value needs to be escaped using either CDATA sections or the five predefined entities. If you mess it up by entering ill-formed XML, an error is reported to the JavaScript console and nothing happens with your data.
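
The two escaping options can be sketched in script. This is only an illustration under assumptions, not code from the testcases: it assumes the markup string sits in an inline script inside the XML document, and the helper names escapeXml and wrapInCdata are made up for the example.

```javascript
// Hypothetical helpers; neither is part of any browser API.

// Option 1: escape with the five predefined XML entities.
// The ampersand goes first so the entities produced by the
// later replacements are not escaped a second time.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Option 2: wrap the text in a CDATA section.
// A literal "]]>" would end the section early, so it is split
// across two adjacent CDATA sections.
function wrapInCdata(text) {
  return '<![CDATA[' + String(text).split(']]>').join(']]]]><![CDATA[>') + ']]>';
}

// Script source that is meant to be embedded in an XML document.
const source = "el.innerHTML = '<p>Tom & Jerry</p>';";
const asEntities = escapeXml(source);
const asCdata = wrapInCdata(source);
```

Either result can be placed inside a SCRIPT element in the XML source; the XML parser hands the original script text back to the script engine.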

“Dude, that’s non standard!”

Well yeah, but it is part of the forthcoming ‘HTML5 specification’: Serialization and parsed fragment replacement. Besides, it appears to be a lot faster, and when using CDATA sections it is a lot easier to use than the W3C DOM as well, isn't it?

It appears that at least Opera 8 supports this as well. Very cool!

Comments

  1. Interesting, I didn't know Opera supported that. :P

    Posted by Frenzie at

  2. Will document.write() also work in XML?

    Posted by zcorpan at

  3. document.write() is evil! Will Firefox 1.1 support counters and list-styles that actually work with arbitrary xml structures?

    Posted by Jimmy Cerra at

  4. Opera supports document.write in XML mode as well. It hooks into the parser stream and there was quite some discussion about this on the beta mailing lists.

    Posted by Erik Arvidsson at

  5. Jimmy Cerra, why not? If document.write is specified properly and implemented correctly I don’t see any problems with it. It must be consistent though.

    Posted by Anne at

  6. Are you sure innerHTML will work in XHTML? The bug status is still NEW.

    Posted by minghong at

  7. If yes, they should create an alias called innerXML, as it is not limited to HTML. :-P

    Posted by minghong at

  8. Sounds very hot! Although I have never heard that Opera 8 supports this...

    Posted by Bjørn Olav at

  9. minghong, it works quite ok. There are some small bugs though and that bug covers those. And yes, you can enter any kind of XML as value and no, it probably won’t be renamed.

    Posted by Anne at

  10. If innerHTML does nothing, or perhaps gives a warning, on ill-formed content, then I don't think there would be a problem implementing it. The DOM may be good for more than just browser ECMAScript, and who knows what else, but it only seems to pay off in bigger browser-based applications that re-use lots of things, and even then it looks like an overly complicated way to me. Of course the DOM is platform and language neutral, but that doesn't make it any less complicated compared to coding "the old fashioned way". I'm aware of a few of the good sides of the DOM (and there are probably many more than I'm aware of), but I like this method a lot more anyway. :)

    Posted by Frenzie at

  11. If document.write() worked in XML, what would happen with this document?

    <script xmlns="http://www.w3.org/1999/xhtml" type="application/x-javascript">
    <![CDATA[
     document.write('<foo>Test</foo>'); 
    ]]>
    </script>

    If we run the script, it will create an element foo right after the script element. But XML only allows one root element... Well of course we could simply ignore any document.write() if the script block is the root element.

    Posted by zcorpan at

  12. Opera does not support document.write() for XML, only for text/html. See Why document.write() doesn't work in XML.

    Opera's implementation of innerHTML for XML also seems to accept any kind of ill-formed tag soup as input. It seems to just parse it as HTML and insert the resultant DOM, thus keeping the whole document well formed.

    Personally, I prefer using the existing DOM methods, since XML should not be generated by scripts using strings of markup. The only thing that makes innerHTML better than document.write() is that it actually creates DOM nodes before inserting into the document, unlike document.write() which just inserts strings of markup and doesn't guarantee well-formed output. However, innerHTML should reject ill-formed input, unlike Opera's current implementation. If you're correct about Firefox's implementation rejecting ill-formed input, then that's good.

    It would be better to implement a more generic document.parse() function which is not specific to HTML and accepts any string of HTML or XML markup, depending on the type of document; and which returns a node for insertion or other processing using the other DOM methods, like appendChild().

    Posted by Lachlan Hunt at

  13. Mozilla DOM Range method createContextualFragment()

    Posted by hemebond at
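
The createContextualFragment() method mentioned in comment 13 already comes close to the generic document.parse() idea from comment 12: it takes a string of markup and returns a node that can be inserted with the usual DOM methods. A minimal sketch, assuming a browser DOM with Range support; parseFragment is a hypothetical name, and whether ill-formed input is rejected depends on the browser and the type of document.

```javascript
// Hypothetical wrapper around the DOM Range API.
// doc:         a Document (for example window.document in a browser)
// contextNode: the node whose contents set the parsing context
// markup:      the string of HTML or XML markup to parse
function parseFragment(doc, contextNode, markup) {
  const range = doc.createRange();
  // Parse the markup as if it appeared inside contextNode.
  range.selectNodeContents(contextNode);
  // Returns a DocumentFragment; nothing is inserted yet.
  return range.createContextualFragment(markup);
}

// Browser usage (assumes an element with id "target" exists):
//   const target = document.getElementById('target');
//   target.appendChild(parseFragment(document, target, '<p>Test</p>'));
```

Unlike innerHTML, this keeps parsing and insertion as separate steps, so the returned fragment can be inspected or modified before appendChild() is called.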

  14. What's wrong with the DOM methods? document.write() and .innerHTML are just hacks for lazy developers, aren't they?

    Posted by Tommy Olsson at

  15. Lachlan, that document is outdated. document.write will work in some future version of at least both Opera and Firefox.

    zcorpan, I can’t get it to work. I also tried it with a more generic root element as SCRIPT can cause trouble. (There is a bug about SCRIPT as root element in Firefox for example.) I also tried it with another TYPE attribute value and combinations with the methods I tried above but no luck. Perhaps Erik can give a working example?

    Tommy, they are much more difficult and slow. And lazy is what the web is all about.

    Posted by Anne at

  16. if innerHTML is a hack, then HTML (the document format) is a hack.

    It's much simpler and clearer to express documents and document fragments declaratively (as HTML syntax), rather than procedurally (creating and inserting one DOM node after another).

    The DOM methods are OK if you only need to insert/move/remove a single node, but if you want to insert/change subtrees, innerHTML is cleaner.

    Posted by Olav Junker Kjær at
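
To make the declarative versus procedural contrast concrete, here is a small sketch that builds the same list both ways. All function names are invented for the illustration, and the example assumes plain text items that only need basic escaping.

```javascript
// Hypothetical helper: escape text for use as element content.
function escapeText(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Declarative style: describe the whole subtree as one markup string,
// suitable for assigning to innerHTML.
function listMarkup(items) {
  return '<ul>' +
    items.map(function (item) {
      return '<li>' + escapeText(item) + '</li>';
    }).join('') +
    '</ul>';
}

// Procedural style: create and attach one node after another.
// doc is a Document, passed in so the function does not depend on a global.
function listNodes(doc, items) {
  const ul = doc.createElement('ul');
  items.forEach(function (item) {
    const li = doc.createElement('li');
    li.appendChild(doc.createTextNode(item));
    ul.appendChild(li);
  });
  return ul;
}

// Browser usage (container is assumed to be an existing element):
//   container.innerHTML = listMarkup(['one', 'two']);
//   container.appendChild(listNodes(document, ['one', 'two']));
```

The string version has to escape its input by hand, which is the price of the shorter code; the DOM version gets that for free from createTextNode().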

  17. There is a bug about SCRIPT as root element in Firefox for example

    Yup, it actually froze my Mozilla/Firefox some time ago. :'-(

    Posted by minghong at

  18. I don't know if this will make it into 1.1 (I doubt it), but it sure sounds like sweet music to the (small) web designer part of my brain. The main bug is here.

    Also, bzbarsky blogs that two Acid2 bugs have been worked out lately. But if I understand him correctly, these fixes haven't landed yet, and won't ship until Firefox 1.5 (Gecko 1.9).

    Posted by David Naylor at

  19. Regarding the rounded corners: Am I the only person who actually cringes at the sight of them in Firefox??? Come on, they can do better than that! They're severely pixelated – what about some sort of non-intensive anti-aliasing, perhaps? But at least it can be done, I suppose... For the moment, though, I actually tend to prefer the fall-back borders for IE and co. Despite the fact that I'm sick of the sight of the same boring dashed lines... Design clichés move on, eh? :D

    Posted by trojjer at

  20. Here's the first post I was rambling on about, finally... (As I guessed, it was yet again something niggling and silly. *sigh* But at least I noticed it: I had left out a semi-colon at the end of an entity declaration! Argh! lol... The Validator said it was "tentatively valid" like that, which I was surprised about, but now it reckons that it's perfect XHTML 1.0 Strict, if you add in the necessary "skeleton tags". So I'm clicking preview for the last time today, and hopefully it'll work!)

    True, it may in part arise out of laziness, lol, but I still heavily advocate the "clean" use of innerHTML in situations when you have a well-formed HTML string all ready to go, and all you want to do is append the whole thing to an element. Then it's done and dusted.

    I made a simple example page, which, I believe, proves that using innerHTML seems like a perfectly reasonable option in some circumstances. The DIV node and its descendants are appended perfectly into the DOM tree, just as if you had used appendChild directly (without the additional work involved in having to document.createElement() the fragment beforehand); and as you can see, the CSS rules are applied transparently, too.

    (Feel free to free this page from the extra weight of the following outburst, if desired... ;)

    <rant>Alright, so sue me: There's no DTD, and I used the *shudder* deprecated language attribute to declare that the script is indeed Javascript (I might get around to changing my habits slightly and using type="text/javascript" instead, but I'm not sure – does this work in absolutely any script-capable browser? Sounds laughable, I know, but I'm not using a silly version number anyway, and I'm not too bothered about XHTML at the moment (actually, considering that I use "valid" HTML 4.01, I'll bite the bullet here and admit that I tend to stick with Strict, so technically I shouldn't be using it there either); I just write "decent", well-formed HTML for the most part, because I don't see the benefits of including all that seemingly superfluous stuff yet, when I'm not knee-deep in bloomin' namespaces and all that... And also, why the hell do you have to state the seemingly obvious, that a <script> tag contains a form of text content??? Isn't it a bit ambiguous, when the "proper" MIME type is actually application/x-javascript, anyway?). Phew, closed the brackets, ahem.

    Oh, and I left out all that CDATA block clutter, too, because it's not an XHTML page (hang on there! Must say document, oh yes. Pah...) – and it seems daft to me that, given the importance they associate with defining the right entity content type or whatever they call it, they didn't make it an inherent feature of the <script> tag in the first place, within the DTD! Why is it "PCDATA", when it obviously should be excluded from XML parsing? It's like the href attribute, in my opinion – with that, they complicate matters by saying that you need to escape ampersands and the rest of its Gang of [X]HTML-Unfriendly Characters, despite the fact that URI/Ls have their own encoding in the first place... Couldn't attributes like that be set to CDATA too, to make life a bit easier for XHTML users?

    One last thing I forgot: The whole confusion arising from how most browsers automatically turn attributes into properties of the element's JS object. What's going on there? Yes, I set the DIV's ID by simply using div1.id, and it bugs me when I see people using element.set/getAttribute for core attrib's like that. Surely, if we weren't supposed to do it at all, they wouldn't've given us className, for instance? Or are we technically "supposed" to use getAttribute('class'), even? I've seen people use that, and they seem just as confused. And, as I've said on another BBS before, what about the special case of the faithful (or not so) element.style – does element.getAttribute('style') officially return the unprocessed cssText-like string, instead...?</rant>

    (Having realised that I've spanned a lot of topics here, I hope you keep this text in its original, long-winded format! :S You don't have to, though, of course; it's just that I'd like some answers and I haven't been satisfied much at all, anywhere. You could email me, and I'll split them up and post them separately for you.)

    Posted by trojjer at

  21. Not one reply yet... Oh well.

    I'd like to point out that the behaviour I would expect from element.getAttribute('style'), surprise surprise, is not present in IE[6]. The processed object is still returned, just as if I'd used element.style; which means that I don't know of a possible method of fetching the unparsed cssText from a style attribute in that friggin' browser (bearing in mind that, at least for IE, element.style.cssText doesn't seem to include such inline text). Meh, it's not the worst it has to offer, by far...

    Posted by trojjer at

  22. IE6 has an extra argument to getAttribute. F.e. getAttribute("href") on a link returns an absolute URI, but getAttribute("href", 2) returns the exact value of the attribute. Guess what: it doesn't work with "style"!

    I don't understand what's missing from element.style.cssText?

    Posted by Sjoerd Visscher at

  23. As I said, IE doesn't consider CSS that's added via inline style attributes to be within the reaches of element.style.cssText – it only seems to let you access stuff that's referenced by, or inside of, a <style> tag. At least, from that string property, anyway.

    Hmm, maybe it's an ironic case of the IE developers trying to get people to frown upon such inline attributes? :'D It'd be just a tad bit hypocritical, though; and besides, they have their uses at times, when content/presentation needs to be more flexible. Or something like that.

    I mean, I'm sure a few people would agree that adding “individual touches” to a particular instance of an element's appearance, without being constrained to a predefined class or ID, or even, occasionally, having to use one at all (if it's just a one-off, perhaps, and it would be otherwise insignificant), sounds reasonable?

    Posted by trojjer at

  24. Can't you just use FONT for that? ;o]

    Posted by Krijn Hoetmer at

  25. I tried this:

    <div style="color:red" onclick="alert(this.style.cssText)">test</div>

    And I get COLOR: red. So I'm not sure what your problem is.

    Posted by Sjoerd Visscher at
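
The behaviours described in comments 21 to 25 could be wrapped roughly like this. It is a hedged sketch only: readInlineStyle is a made-up helper, and the IE6 behaviour it works around (getAttribute('style') returning the processed style object instead of a string) is taken from the comments above, not verified here.

```javascript
// Hypothetical helper: return the inline style of an element as a string.
function readInlineStyle(element) {
  const raw = element.getAttribute('style');
  // In most browsers getAttribute returns the attribute text as written.
  if (typeof raw === 'string') {
    return raw;
  }
  // Fallback for the reported IE6 behaviour, and for a missing attribute:
  // use the serialised form from the style object, which may be normalised
  // (for example "COLOR: red" instead of "color:red").
  return element.style ? element.style.cssText : '';
}

// Browser usage:
//   alert(readInlineStyle(document.getElementsByTagName('div')[0]));
```

Note that the fallback does not recover the unparsed attribute text; it only returns whatever the browser chooses to serialise.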