Python, long words and the soft hyphen


The official HTML specifications neither require nor allow a user agent to hyphenate text automatically. Instead, the user agent must rely on manual hints provided by the document:

The soft hyphen tells the user agent where a line break can occur.
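
In Python, the soft hyphen is simply the character U+00AD. As a small illustration (the example word and the helper name to_html_entities are made up for this sketch), a manually hinted word could be built and written out for HTML like this:

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, written as &shy; in HTML

# A manually hinted German word. The break points stay invisible until the
# user agent actually needs to wrap the line at one of them.
word = SOFT_HYPHEN.join(["Donau", "dampf", "schiff", "fahrt"])


def to_html_entities(text):
    """Replace raw soft hyphens with the equivalent &shy; entity."""
    return text.replace(SOFT_HYPHEN, "&shy;")


print(to_html_entities(word))
```

Either form should work in an HTML document; the entity is just easier to spot when reading the markup.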

In English, (very) long words are relatively infrequent, so hyphenation is not particularly important: the lack of good hyphenation rarely breaks a page layout.

In German – and in Germanic languages in general – words are long! If you have a page layout with narrow columns (a left-hand side navigation for instance), lines might extend further than you'd like them to.

The following Python function inserts soft hyphens into a text string, guided by a hyphenation dictionary:
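
As a rough sketch of how such a function might look (this is not the original script), assume a hypothetical dictionary, HYPHENATION, that maps lower-cased words to their syllables. The names hyphenate and min_length and the sample entries are likewise illustrative only:

```python
import re

SOFT_HYPHEN = "\u00ad"

# Hypothetical hyphenation dictionary: lower-cased word -> list of syllables.
# A real one would be loaded from a file or a hyphenation library.
HYPHENATION = {
    "geschwindigkeitsbegrenzung": [
        "ge", "schwin", "dig", "keits", "be", "gren", "zung",
    ],
    "navigation": ["na", "vi", "ga", "ti", "on"],
}

WORD_RE = re.compile(r"\w+")


def hyphenate(text, dictionary=HYPHENATION, min_length=8):
    """Insert soft hyphens into the words of text that the dictionary knows."""

    def replace(match):
        word = match.group(0)
        syllables = dictionary.get(word.lower())
        # Leave short or unknown words untouched.
        if syllables is None or len(word) < min_length:
            return word
        # Guard against entries that do not line up with the matched word.
        if sum(len(s) for s in syllables) != len(word):
            return word
        # Slice the original word so that its capitalisation is preserved.
        parts, pos = [], 0
        for syllable in syllables:
            parts.append(word[pos:pos + len(syllable)])
            pos += len(syllable)
        return SOFT_HYPHEN.join(parts)

    return WORD_RE.sub(replace, text)
```

Calling hyphenate("Die Geschwindigkeitsbegrenzung gilt") leaves "Die" and "gilt" alone and puts a soft hyphen between each syllable of the long word. Slicing the original word rather than joining the dictionary's syllables keeps the capital "G" intact.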


Note that with CSS Text Level 3 (see draft spec) you can use the hyphens property:

p {
  -webkit-hyphens: auto;
  -moz-hyphens: auto;
  -ms-hyphens: auto;
  hyphens: auto;
}

This currently works in recent Firefox, IE10 and Safari (see compatibility table).
