Robust Vertical Text Layout == THIS IS AN INCOMPLETE DRAFT

by fantasai

Most formatting systems today can only handle horizontal text layout. This document outlines a system that can not only handle common vertical right-to-left cases, but that can gracefully accept uncommon script combinations and left-to-right text columns. The model is described here as a CSS system, but the concepts can apply to non-CSS systems as well.

The examples in this text require support for Unicode BIDI and Arabic shaping, and fonts for Simplified Chinese and Arabic/Farsi.

Background

The Cascade in Cascading Style Sheets

Unlike many formatting systems, in which styling properties are definitively applied to a page element at one point, CSS collects and applies to the element multiple style rules from the author, reader, and user agent. In case of a conflict, the origin of the rule and the specificity of the rule's selector determine which property value takes effect on the element. This process of sorting and applying style rules is called cascading, and it allows style rules from multiple sources and with separate formatting purposes to interact in a rigorous way.
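
As a rough illustration of the sorting described above, the sketch below picks a winning declaration by origin, then by selector specificity, then by source order. It is a simplification under stated assumptions: !important is ignored, and the Declaration type, the ORIGIN_RANK table, and the winner function are names invented for this example, not part of any CSS API.

```python
from dataclasses import dataclass

# Assumed precedence for normal (non-!important) declarations:
# user agent < reader < author.
ORIGIN_RANK = {"user-agent": 0, "reader": 1, "author": 2}


@dataclass(frozen=True)
class Declaration:
    prop: str           # property name, e.g. "direction"
    value: str          # declared value, e.g. "rtl"
    origin: str         # "user-agent", "reader", or "author"
    specificity: tuple  # (ids, classes/attributes, element names)
    order: int          # position in the style sheets; later wins ties


def winner(declarations, prop):
    """Return the winning value for `prop`, or None if nothing declares it."""
    candidates = [d for d in declarations if d.prop == prop]
    if not candidates:
        return None
    best = max(candidates,
               key=lambda d: (ORIGIN_RANK[d.origin], d.specificity, d.order))
    return best.value


# Hypothetical declarations that all match the same element.
RULES = [
    Declaration("direction", "ltr", "user-agent", (0, 0, 1), 0),
    Declaration("direction", "rtl", "author", (0, 1, 0), 1),
    Declaration("unicode-bidi", "embed", "author", (0, 0, 1), 2),
    Declaration("direction", "ltr", "author", (0, 0, 1), 3),
]
```

Each property is resolved on its own: in this sample, direction and unicode-bidi end up coming from different rules.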

Cascading means that style properties specified together are not guaranteed to take effect together. This raises the design standards for creating CSS properties and pushes them towards a more logical, rather than physical, description of the intended design.

CSS and Unicode BIDI

CSS2 introduced the direction and unicode-bidi properties to incorporate markup directives such as HTML's dir attribute into the CSS rendering model, and to allow the use of markup semantics in assigning BIDI embeddings. The direction property can take the values ltr and rtl. The unicode-bidi property assigns embeddings and overrides in the direction given by the direction property. Its behavior is defined in terms of the Unicode embedding and override codes.

/* map 'dir' attribute to 'direction' */
*[dir="ltr"] {direction: ltr;}
*[dir="rtl"] {direction: rtl;}
/* embed quotations so they stay as a single unit */
q {unicode-bidi: embed;}

The direction property inherits to descendant elements. When applied to a block of text, the direction property specifies the block's embedding direction; CSS documents do not use heuristics to guess the block's embedding direction.

These properties were meant to reflect BIDI distinctions necessary for the proper ordering of text. Authors in general were discouraged from using the properties in favor of the direct markup that would trigger the appropriate values.

Describing Text Flow

To describe how a text flows into lines, one needs to know three things:

  • which way the text flows within a line (inline progression)
  • which way the lines stack (block progression)
  • which way the glyphs are facing (glyph orientation)

However, not all combinations of text direction and glyph orientation are valid, and if you know certain of the characters' inherent properties, you can often derive one from the other. Unicode systems take advantage of this model in horizontal text: you don't have to manually tell every run of Hebrew to order itself right-to-left, and you don't need to specify that Mongolian turn itself sideways when it's running horizontally left-to-right.

Logical vs. Physical Description

In a purely physical layout scheme, each of these text layout properties would be given as an absolute: The inline progression of this run of English is top to bottom, its glyph orientation is 90 degrees (clockwise), its block progression is from right to left. However, because the interrelationships among these properties are realized in the author's mind and not in the system,

  • The author must manually intervene any time there is a script change in a block with non-default settings.
  • If one of the three properties fails to take effect (because of the Cascade or lack of UA support), then the layout breaks and the text becomes unreadable.

A better system would embed knowledge of different scripts' intrinsic characteristics and define style properties in terms of the relationships among the properties.

Intrinsic Directionality and Orientation

Each script has a characteristic writing direction, and each character in Unicode is assigned a directionality value based on these directions. Unfortunately, Unicode currently only defines horizontal directionality even though vertical and bi-orientational scripts have a vertical directionality as well. For example, while English can go either top to bottom or bottom to top (since it doesn't have a vertical directionality), Japanese must only go from top to bottom, even in a left-to-right block progression. Mongolian also has top-to-bottom vertical directionality. Unlike Japanese, however, it has no definite horizontal directionality.
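
The horizontal-only nature of Unicode's directionality data can be seen with Python's standard unicodedata module, which reports each character's Bidi_Class. The sample characters and the bidi_classes helper are chosen here for illustration. Mongolian is reported with a left-to-right class even though it is natively vertical.

```python
import unicodedata

# One sample character per script (an arbitrary choice for this sketch).
SAMPLES = {
    "Latin": "a",
    "Hebrew": "\u05d0",     # HEBREW LETTER ALEF
    "Arabic": "\u0627",     # ARABIC LETTER ALEF
    "Han": "\u4e2d",        # CJK ideograph
    "Mongolian": "\u1820",  # MONGOLIAN LETTER A
}


def bidi_classes(samples):
    """Map each script name to the Bidi_Class of its sample character."""
    return {script: unicodedata.bidirectional(ch)
            for script, ch in samples.items()}


for script, bidi_class in bidi_classes(SAMPLES).items():
    print(script, bidi_class)
```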

Script Classification by Directionality

Scripts can be classified into three orientational categories:

horizontal
Scripts that have horizontal, but not vertical, directionality. Includes: Latin, Arabic, Hebrew, Devanagari
vertical
Scripts that have vertical, but not horizontal, directionality. Includes: Mongolian, Manchu
bi-orientational
Scripts that have both vertical and horizontal directionality. Includes: Han, Hangul, Yi

Bi-orientational scripts may be further classified by how their glyphs transform when switching orientations. CJK characters translate; they are always upright. Other scripts, such as Ogham and some variants of classical Yi, must be rotated.
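
A layout system could store this classification as a simple lookup. The sketch below is one possible encoding; the script lists come from the categories above, while the SCRIPT_CATEGORY table and the is_native helper are names assumed for illustration.

```python
# Orientational category of each script, as listed in the classification.
SCRIPT_CATEGORY = {
    "Latin": "horizontal",
    "Arabic": "horizontal",
    "Hebrew": "horizontal",
    "Devanagari": "horizontal",
    "Mongolian": "vertical",
    "Manchu": "vertical",
    "Han": "bi-orientational",
    "Hangul": "bi-orientational",
    "Yi": "bi-orientational",
}


def is_native(script, line_orientation):
    """True if `script` has an intrinsic directionality for lines that run
    in `line_orientation`, which is "horizontal" or "vertical"."""
    category = SCRIPT_CATEGORY[script]
    return category == "bi-orientational" or category == line_orientation
```

A script for which is_native returns False is in a foreign orientation and needs the extra orientation hints discussed later.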

Logical Text Flow

Implying Direction

Scripts in their native orientation need no additional stylistic hints for proper layout: their inline progression and glyph orientation are both intrinsically mandated, so the style system can know by itself how to lay them out. Directionality and glyph orientation overrides [1] are not necessary and should not be used. (In fact, using them degrades the system by creating a tangle of dependencies, as demonstrated in the next section.)

Scripts in a foreign orientation don't need directionality or glyph overrides either. They just need a few hints: whether to translate upright, or, if they're rotated sideways, which side is "up". Given that, the rules for laying out the text in its native orientation are enough to determine the inline progression and exact glyph orientation.

Accounting for Block Progression

For scripts in a non-native orientation, the natural inline text flow depends on the direction of line stacking: the text is most comfortably laid out as if the whole text block were merely rotated from the horizontal. For example, English text in vertical lines that stack from left to right will face with the glyphs' tops towards the left and the text direction running from bottom to top. The same text, by the same logic, would in a right-to-left line stacking context face right and flow within each line from top to bottom.

Putting this logic into the style system is straightforward: define "up" for non-native glyphs to point to the beginning of the line stack, and the inline progression follows from that orientation. The glyph orientation and inline progression will thus adapt to whichever block progression happens to take effect.
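
The rule can be sketched in a few lines for a left-to-right horizontal script such as English. The function and table names below are assumptions for this example; the only inputs are the block progression values tb, lr, and rl.

```python
# The four sides of the block, in clockwise order.
SIDES = ["top", "right", "bottom", "left"]

# Side on which the line stack begins for each block progression.
STACK_START = {"tb": "top", "lr": "left", "rl": "right"}


def natural_orientation(block_progression):
    """Return (up, toward) for left-to-right horizontal-script text.

    `up` is the side the glyph tops face: the beginning of the line stack.
    `toward` is the side the text advances to within a line.
    """
    up = STACK_START[block_progression]
    # A left-to-right script advances toward the side that lies 90 degrees
    # clockwise from the glyphs' "up" side.
    toward = SIDES[(SIDES.index(up) + 1) % 4]
    return up, toward
```

For lr this yields glyph tops facing left and text advancing toward the top (bottom to top); for rl it yields tops facing right and text advancing toward the bottom, matching the English example above.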

This layout scheme is most appropriate for dealing with text that has been turned on its side for layout purposes—as for page headers or captions or table headings. (Merely rotating the rendered text from a horizontal layout is not sufficient because while the primary script is horizontal, it may include some vertical as well, which would need to be appropriately handled.) However, a major use case for laying out text in a non-native orientation is mixing horizontal and vertical scripts, which introduces the requirement of making the secondary scripts flow well in the context of the primary script.

For example, a primarily Mongolian document, which has vertical lines stacking left to right, usually lays its Latin text with the glyphs facing the right. This makes the text run in the same inline progression as Mongolian and face the same direction it does in other East Asian layouts (which have vertical lines stacking right to left), but the glyphs are facing the bottom of the line stack rather than the top, something they wouldn't do in a primarily-English paragraph.

Yet another common layout is to keep the horizontal script's glyphs upright and order them from top to bottom; this is frequently done with Latin-script acronyms in vertical East Asian text.

To handle these layouts, the style system needs to offer controls for choosing among these different layout schemes. Note, however, that scripts in their native orientations do not need these hints; only the non-native ones do. Also, this is only one simple scheme switch here: there's no need for the designer to set separate absolute inline progression and glyph orientation controls or to set styling properties on each text run of a different script.

The Three Switches of Logical Text Layout

In summary, to lay out a block of arbitrary, mixed-script text, the layout system needs to offer only three controls:

  • primary script's directionality (BIDI property)
  • block progression direction (stylistic property)
  • glyph orientation scheme (stylistic property)

Formalized into CSS syntax, this becomes:

direction

Directionality. Can take the following values

ltr
Left-to-right directionality in horizontal text; No inherent directionality in vertical text. (Horizontal script) Examples: Latin, Tibetan
rtl
Right-to-left directionality in horizontal text; No inherent directionality in vertical text. (Horizontal script) Examples: Arabic, Hebrew
ttb
Top to bottom directionality in vertical text; No inherent directionality in horizontal text. (Vertical script) Example: traditional Mongolian
ltr-ttb
Left to right directionality in horizontal text; Top to bottom directionality in vertical text. (Bi-orientational script) Examples: Han, modern Yi
ltr-btt
Left to right directionality in horizontal text; Bottom to top directionality in vertical text. (Bi-orientational script) Example: Ogham

block-progression

Block progression (line stacking) direction. Can take the following values

tb
Top-to-bottom line stacking (horizontal text). Typically used for most non-East-Asian layout.
rl
Right-to-left line stacking (vertical text). Typically used for traditional CJK layout.
lr
Left-to-right line stacking (vertical text). Typically used for traditional Mongolian layout.

text-orientation

Glyph orientation scheme to use in vertical text. Can take the following values

natural
Non-vertical script runs are laid out as if text had been flowed as horizontal text and then the page had been rotated 90 degrees—to the left for left-to-right block progression and to the right for right-to-left block progression. (Vertical scripts are laid out as vertical scripts.)
left
Non-vertical script runs are laid out as if text had been flowed as horizontal text and then the page had been rotated 90 degrees to the left. (Vertical scripts are laid out as vertical scripts.)
right
Non-vertical script runs are laid out as if text had been flowed as horizontal text and then the page had been rotated 90 degrees to the right. (Vertical scripts are laid out as vertical scripts.)
upright
Non-vertical scripts' characters read top to bottom, with each grapheme cluster oriented upright. (Vertical scripts are laid out as vertical scripts.)

Note: For handling vertical-only scripts in horizontal layout, a text-orientation-horizontal property is also necessary; it takes effect only when the block progression is top-to-bottom. To keep the discussion less verbose, I am delegating consideration of horizontal layout to the appendix.

As long as the directionality is set correctly for the text (and it should be set automatically from the content/markup as long as the designer doesn't touch it later), any combination of the block-progression and text-orientation stylistic values will result in a correct (though perhaps not optimally-designed) text layout.

The style system can thus handle most of the intricacies of laying out both usual and unusual combinations of text by itself. What it needs to do this, however, is to know the intrinsic properties of the characters it is laying out.

Implementing A Logical Text Layout System

To lay out an arbitrary block of multi-script text using this logical model, the system only needs from the author the text, its primary directionality, and the values of two stylistic switches: block-progression and text-orientation. From the programmer, it needs both the logic and the data necessary to lay out the text.

Handling block-progression is very straightforward: just stack the composed lines in the stacking direction. Composing the lines of text is more complicated. The text needs to go through three processing steps.

Composing Lines of Text

Character Ordering

Character ordering is where the BIDI algorithm gets applied. The algorithm remains essentially unchanged when dealing with vertical text: what changes is the data. Specifically, the directionality values of certain characters are mapped depending on the styling context.

The Unicode BIDI algorithm deals with two directions: left-to-right (towards right) and right-to-left (towards left), precisely the same as the script directionalities involved. Although this multi-directional model has several more directionality values, the BIDI algorithm here also deals with only two directions: it just abstracts them so that they could just as easily be bottom-to-top (towards top) and top-to-bottom (towards bottom). To avoid the apparent absurdity of mapping right to left and such things, I will call the two BIDI directions "high" (H) and "low" (W). (Implementations, no doubt, will prefer to call them "left" and "right" to map directly into the Unicode BIDI algorithm.)

It is important to keep in mind that these directions are abstract. We will map "left", "right", "top", and "bottom" to "high" or "low" based on the values of text-orientation and block-progression. The mapping applies to everything: the individual character's directionality, embedding and override codes, the CSS direction values, HTML dir attributes, etc. Once the line is composed, we then lock "high" and "low" to the appropriate sides of the block as we stack the lines according to block-progression.

Directionality Mapping: Vertical Case

In vertical context, bi-orientational scripts use their vertical directionality and behave as vertical, not horizontal, scripts. Han, for example, as a ltr-ttb script, is treated as ttb (top to bottom), not ltr (left to right). The ltr-ttb value for direction is correspondingly treated the same way as the value ttb.

For text-orientation: right (and text-orientation: natural in a right-to-left block progression):
  • Map ttb to htl (high to low)
  • Map btt to lth (low to high)
  • Map ltr to htl (high to low)
  • Map rtl to lth (low to high)

Run the Unicode BIDI Algorithm with its "left" being our "high" and its "right" being our "low".

For text-orientation: left (and text-orientation: natural in a left-to-right block progression):
  • Map ttb to lth (low to high)
  • Map btt to htl (high to low)
  • Map ltr to htl (high to low)
  • Map rtl to lth (low to high)

Run the Unicode BIDI Algorithm with its "left" being our "high" and its "right" being our "low".

For text-orientation: upright
  • Map ttb to htl (high to low)
  • Map btt to lth (low to high)
  • Map ltr to htl (high to low)
  • Map rtl to htl (high to low)

Run the Unicode BIDI Algorithm with its "left" being our "high" and its "right" being our "low".
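
The three mapping lists above can be collected into one lookup. This is a minimal sketch, assuming that bi-orientational values have already been reduced to their vertical component as described at the start of this section; the table and function names are invented for the example.

```python
HIGH_TO_LOW = "high-to-low"
LOW_TO_HIGH = "low-to-high"

# Abstract direction of each directionality value under each scheme.
DIRECTION_MAP = {
    "right": {"ttb": HIGH_TO_LOW, "btt": LOW_TO_HIGH,
              "ltr": HIGH_TO_LOW, "rtl": LOW_TO_HIGH},
    "left": {"ttb": LOW_TO_HIGH, "btt": HIGH_TO_LOW,
             "ltr": HIGH_TO_LOW, "rtl": LOW_TO_HIGH},
    "upright": {"ttb": HIGH_TO_LOW, "btt": LOW_TO_HIGH,
                "ltr": HIGH_TO_LOW, "rtl": HIGH_TO_LOW},
}


def effective_scheme(text_orientation, block_progression):
    """Resolve 'natural' to 'right' or 'left' from the block progression."""
    if text_orientation != "natural":
        return text_orientation
    if block_progression == "rl":
        return "right"
    if block_progression == "lr":
        return "left"
    raise ValueError("this sketch covers vertical block progressions only")


def map_direction(direction, text_orientation, block_progression):
    """Map ltr, rtl, ttb, or btt to an abstract BIDI direction."""
    scheme = effective_scheme(text_orientation, block_progression)
    return DIRECTION_MAP[scheme][direction]
```

The result would then be handed to a standard BIDI implementation with "high" standing in for its "left" and "low" for its "right".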

Glyph Orientation

Before the system can paint the text (or even do alignment), it needs to know how to rotate the glyphs. For vertical and bi-orientational scripts, this is simply "rotate me to my intrinsic position". This doesn't mean "don't rotate me, I'm supposed to be upright", however, because the standard representation of a character in a font is the one used in horizontal text using the canonical directionality. Han and Kana and Hangul and Yi do need to be kept upright (0° rotation) because they use the same orientation in both horizontal and vertical text. Mongolian (and Ogham), however, rotate from one context to the other and so their glyphs must be rotated 90° from their horizontal orientation when used in vertical context. Part of the system's knowledge, therefore, needs to be which scripts need to be rotated and which merely translated into place. Given that and the script's directionality, the exact rotation can be derived as follows:

System's Knowledge of Vertical Scripts' Properties

                                      | Han/Hangul/Kana/Yi | Mongolian/Manchu | Ogham
(canonical) horizontal directionality | LTR                | (LTR)            | LTR
vertical directionality               | TTB                | TTB              | BTT
transformation                        | translation        | rotation         | rotation

System's Derivation of Vertical Scripts' Orientation

                                          | Han/Hangul/Kana/Yi | Mongolian/Manchu | Ogham
horizontal orientation (vector direction) | glyph orientation: 0deg; inline-progression: 90deg | glyph orientation: 0deg; inline-progression: 90deg | glyph orientation: 0deg; inline-progression: 90deg
transformation                            | Glyph orientation static; Inline progression rotates 90deg (from ltr to ttb) | Glyph orientation and Inline progression rotate together 90deg (from ltr to ttb) | Glyph orientation and Inline progression rotate together -90deg (from ltr to btt)
vertical orientation                      | glyph orientation: 0deg; inline-progression: top-to-bottom | glyph orientation: 90deg; inline-progression: top-to-bottom | glyph orientation: 270deg; inline-progression: bottom-to-top
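
The derivation in the tables can be expressed as a small function. This is a sketch under the assumptions stated in the tables, with one representative script per column; SCRIPT_PROPERTIES and vertical_orientation are hypothetical names.

```python
# Per-script knowledge: vertical directionality and how glyphs transform
# when moving from horizontal to vertical text.
SCRIPT_PROPERTIES = {
    "Han": {"vertical": "ttb", "transformation": "translation"},
    "Mongolian": {"vertical": "ttb", "transformation": "rotation"},
    "Ogham": {"vertical": "btt", "transformation": "rotation"},
}


def vertical_orientation(script):
    """Return (glyph rotation in degrees, inline progression) in vertical text."""
    props = SCRIPT_PROPERTIES[script]
    if props["transformation"] == "translation":
        glyph_deg = 0      # glyphs stay upright; only the progression turns
    elif props["vertical"] == "ttb":
        glyph_deg = 90     # glyphs turn together with the progression
    else:
        glyph_deg = 270    # -90 degrees, for bottom-to-top scripts
    if props["vertical"] == "ttb":
        inline = "top-to-bottom"
    else:
        inline = "bottom-to-top"
    return glyph_deg, inline
```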

For horizontal scripts, the method is "rotate me according to the relevant text-orientation style".

For text-orientation: right or text-orientation: natural in a right-to-left block progression:

Rotate horizontal scripts' grapheme clusters 90° to the right.

For text-orientation: left or text-orientation: natural in a left-to-right block progression:

Rotate horizontal scripts' grapheme clusters 90° to the left.

For text-orientation: upright

Keep glyphs for horizontal scripts upright.

Transformations for punctuation, being somewhat arbitrary and stylistic, should be handled by using vertical glyph variants given in the font, but only when the direction of the text is a vertical or bi-orientational directionality. (If the text is primarily horizontal text rotated sideways, then the punctuation should likewise be horizontal punctuation rotated sideways.)

Character Shaping

Character shaping is the process of selecting, based on context, which of several allographs of a letter should be used. This is typical of cursive scripts like Arabic and Mongolian, where the shape of a letter depends on whether it comes at the start of a word, in the middle of a word, or at the end of a word.

According to UAX 9, character shaping occurs after BIDI reordering: the Arabic character shaped as an "initial" will always be on the right, even if the text is given a left-to-right override. This ensures that the letters always connect. (An initial on the left side of the word would be trying to connect to nothing.)

To deal with the multiple orientations of vertical layout, the shaping code needs to know not just the reordered string of characters, but which side of the line is "up". If we turn the glyphs all upside-down, for instance, the shaping needs to be done in reverse. Because in vertical text Arabic and Mongolian sometimes go in the same direction and sometimes in opposite directions, merely inverting the entire character string before passing it to standard Unicode shaping functions doesn't work.

Shaping occurs only within each directional level run. Shaping is also constrained to runs of text in the same script; Mongolian characters, from Arabic's point of view, form as concrete a boundary as Latin ones do. It is therefore possible to break up the text into pieces that have characters from no more than one shaping-affected script without compromising the accuracy of the shaping. Then, for each run of text, one can use the shaping script characters' glyph orientation (derived above) to determine which way is "up" (0deg) and hence which are the "left" (-90deg) and "right" (+90deg) sides of the text run. Once that's known the text run can be shaped, in reverse if necessary.
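
The splitting step might look like the sketch below. Python's standard library has no script property, so the prefix of each character's Unicode name is used here as a rough stand-in; that shortcut, and the names shaping_script and shaping_runs, are assumptions of this example. A real implementation would use the Unicode Script property and would split at directional level run boundaries as well.

```python
import unicodedata
from itertools import groupby

# Scripts whose letters change shape with context (a partial list).
SHAPING_SCRIPTS = ("ARABIC", "MONGOLIAN")


def shaping_script(ch):
    """Return the shaping script of `ch`, or None if it is not a shaped one."""
    name = unicodedata.name(ch, "")
    for script in SHAPING_SCRIPTS:
        if name.startswith(script):
            return script
    return None


def shaping_runs(text):
    """Split `text` into (script, substring) runs, each containing
    characters from at most one shaping-affected script."""
    return [(script, "".join(chars))
            for script, chars in groupby(text, key=shaping_script)]
```

Each run can then be shaped on its own, with the run's glyph orientation deciding whether it is shaped forwards or in reverse.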

Abusing Directionality and Its Consequences: A Case Study of CSS3 Text

CSS3 Text was intended to update and expand the text layout capabilities of CSS2 by adding support for more international typesetting features and introducing controls for laying out vertical text. It introduces a block-progression property, which switches the line stacking direction, and hijacks rtl and ltr values of the direction property to use as an inline-progression control in vertical text.

writing-mode | direction | block-progression | Common Usage
lr-tb        | ltr       | tb                | Latin-based, Greek, Cyrillic writing systems (and many others)
rl-tb        | rtl       | tb                | Arabic, Hebrew writing systems
tb-rl        | ltr       | rl                | some East Asian writing systems
tb-lr        | rtl       | lr                | Mongolian writing system

It is a good example of how not to set up a vertical text system.

In order to interface with the Unicode BIDI Algorithm, CSS3 Text maps characters' directionality based on the block progression. For example, top-to-bottom characters in a right-to-left block progression will be treated as left to right (L) characters, just like Latin. However, if the columns of text are stacking the other way—from left to right—then the same characters (which so far are all assigned left-to-right directionality in Unicode) are treated as right-to-left characters (R). This is done because, if you recall, left-to-right scripts such as Latin read bottom to top when the lines of text are ordered left to right. Top-to-bottom scripts must therefore go in the opposite direction, and the opposite of ltr is rtl.
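
The flip described above can be stated compactly. The function below is a sketch of the behavior as this section describes it, not text from the CSS3 Text draft; its name and its string values are assumptions.

```python
def css3_text_bidi_class(directionality, block_progression):
    """BIDI class a character is treated as under the scheme described here.

    `directionality` is "ltr", "rtl", or "ttb"; `block_progression` is
    "tb", "rl", or "lr".
    """
    if directionality == "ttb":
        # Top-to-bottom characters flip with the line stacking direction.
        # In horizontal text ("tb") they keep their Unicode class, L.
        return "R" if block_progression == "lr" else "L"
    # Horizontal scripts keep their own class in every block progression.
    return "L" if directionality == "ltr" else "R"
```

The same top-to-bottom character is therefore L in one vertical layout and R in the other, which is the source of the problems listed below.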

Assuming support for Arabic and Mongolian, the code speaking to the shaping engine would now be very confused. The Mongolian, which as a top-to-bottom script is now being treated as rtl, needs to be shaped as if its directionality hadn't been tweaked: the first character in each word, even though it's now on the "right" side of the word and not the "left", still needs to be shaped as an initial, not a final. The font rendering code will have to then make sure that the glyph is "upside-down" (from its point of view) so that the letters connect.

With all these tweaks in BIDI reordering and character shaping and font rendering, the layout system can't simply pass the string to standard Unicode functions. But let's assume it manages to hold up the pretense that "top-to-bottom" is really "right-to-left" internally. It still needs to interact with BIDI instructions from the outside world, which doesn't share the delusion. CSS3 Text therefore requires the designer to use "direction: rtl" when assigning "block-progression: lr" to a block of top-to-bottom text (such as Mongolian or Chinese), in effect asking him to lie about the text's properties. Like most lies, it seems to work in the general case, but as the situation gets complicated, the system breaks down...

  • Foremost, if the expected block progression fails to take effect—whether through the cascade or through lack of UA support—the text direction and the assigned embedding direction no longer match and the subtleties of Unicode BIDI can wreak havoc on the order of the text.

    <p style="/*block-progression: lr;*/ direction: rtl">
      (这是)一些中国字.
    </p>
    

    (这是)一些中国字.

  • BIDI control characters placed by the content author to order horizontal text are now affecting a topsy-turvy set of character directionalities.

    In this sequence:

    在这个对话, بهار 告诉 الیکا‎ :‎ این کتاب را به برادرم ببر و برایش بخوان.

    The punctuation in the middle is given as "thin space", "colon", "space". The punctuation sequence is surrounded by some rtl text on one side (الیکا) and an rtl embedding (the quotation) on the other. If left alone, the thin space and the colon will join the Persian name and the quotation in a single right-to-left sequence, like this:

    在这个对话, بهار 告诉 الیکا : این کتاب را به برادرم ببر و برایش بخوان.

    The text's author (or the authoring software) was intelligent enough to insert two left-to-right marks (LRM) around the punctuation sequence so that this does not happen.

    If, however, this Chinese paragraph is treated as rtl when put in a vertical context, then the punctuation sequence will be ordered the wrong way. (It will still be left-to-right because of the codes, instead of right-to-left like the rest of the text.)

  • CSS embeddings set on elements within the formatted block are no longer necessarily going the right way.

    This bit of code assigns the wrong directionality if the text is to be inserted in a page with vertical text layout settings that make Chinese behave as rtl.

    note:before {
      content: "注意: "; /* Insert "Note:" before the note. */
      direction: ltr;
      unicode-bidi: embed; /* Document is mixed Chinese/Arabic, so
                              make sure this stays together properly. */
    }
    
  • HTML dir attributes that were added with the assumption of regular, horizontal text might or might not need to have their effects be reversed.

    <p dir="rtl">وسط این خملة فرسی,
      <q dir="ltr">在中国(PRC)人民用人民币.</q>
      یک خملة چینی هست</p>
    

    وسط این خملة فرسی, 在中国(PRC)人民用人民币. یک خملة چینی هست

    If this paragraph finds its way into a left-to-right block progression, the Chinese quote will need a right-to-left embedding, not a left-to-right one.

  • There is no mention of how character shaping should happen.

In conclusion, abusing directionality controls to make a limited system lay out text correctly doesn't scale. It's a hack, not a solution.