Anne van Kesteren

HTML5 and Bidirectional Text

I am not at all proficient in bidirectional (commonly called bidi) issues, and unfortunately the community that deals with right-to-left text rendering on a daily basis gives very little input to the community working on standards. However, this seems to have improved somewhat recently, and via the W3C Internationalization Core Working Group a great number of suggestions are being made for better dealing with bidirectional problems on the web.

The motivation for these changes seems to be that the Unicode control characters for bidirectional text are not great, that CSS is the wrong layer, and that the model of “weak” and “strong” characters in the Unicode Bidirectional Algorithm does not always work out. Not knowing in advance the direction of the text you are going to display plays a role too.

One of the new features introduced to HTML5 is the bdi element. It contrasts with the existing bdo element in that, rather than overriding the direction, it isolates a particular piece of bidirectional content. It is needed, for instance, in the following example (uppercase characters represent characters that are supposed to be rendered right-to-left):

<span dir=rtl>PURPLE PIZZA</span> — <a href="r?pp">3 reviews</a>

You want this to render as follows:

AZZIP ELPRUP — 3 reviews

But it will in fact render (correctly!) as follows, because the Unicode Bidirectional Algorithm treats the digit 3 as a “weak” character and lets it join the preceding right-to-left run:

3 — AZZIP ELPRUP reviews

If you use the new bdi element (not implemented yet) instead of the span element above, the right-to-left text will be isolated and everything will render as you intended, at least in user agents that implement this feature. Like this:

<bdi dir=rtl>PURPLE PIZZA</bdi> — <a href="r?pp">3 reviews</a>

Another new feature is the auto value for the dir attribute. Lots of content online these days comes from all over the place, and its direction is not always carried along with it. When dealing with such scenarios you can set the dir attribute to its auto value (i.e. dir=auto) and let the user agent figure out what the rendering direction should be. The dir attribute on the new bdi element defaults to this state.
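
To make the two features concrete, here is a minimal sketch of a list of user-supplied names whose direction is not known in advance. The list, the names, and the post counts are made up for illustration, and the markup assumes a user agent that implements both bdi and dir=auto.

```html
<!-- Hypothetical listing: each name comes from user input, so its
     direction is unknown when the page is generated. -->
<ul>
 <!-- No dir attribute on bdi: it defaults to the auto state. -->
 <li>User <bdi>jcranmer</bdi>: 12 posts.</li>
 <li>User <bdi>إيان</bdi>: 3 posts.</li>
</ul>

<!-- dir=auto set explicitly on an ordinary element holding
     text of unknown direction. -->
<p dir=auto>إيان wrote this comment.</p>
```

Each bdi element isolates its name, so the Arabic name should be laid out right-to-left without pulling the number after the colon into its run. The paragraph at the end starts with a right-to-left character, so a user agent resolving dir=auto would be expected to render the whole paragraph right-to-left.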

Hopefully, once these features (and others) get implemented, more documentation will be written to help authors cope with bidirectional text, which is quite complicated.