SmartWordBreakPlugin
Inserts word-breaks and soft hyphens into long words
Long words hurt readability
Long words make web pages harder to read. First prize is to use short words instead, but sometimes you cannot.
WikiWords can be alarmingly long, e.g. SmartWordBreakPlugin, TextFormattingRules and HierarchicalNavigationChildExample.
Technical content sometimes requires long words. For example, long names tend to crop up in software source code and thus also in the corresponding documentation.
Finally, some authors simply use (too many) long words, and sometimes there is not much you can do about it.
(When the Vice President uses long words, it might be more productive to celebrate that senior management is using the wiki and just live with the long words.)
Long words can make a mess of an otherwise-clean table layout.
Browsers tend not to split words automatically, and table columns must be wide enough for the longest word in each column.
Thus, long words make content interfere with presentation.
You can tell the browser that it may split a word, by using the <wbr> tag,
but this tag is not supported by all browsers.
In fact, there is *no* completely portable way to tell the browser that it may split a word.
Besides, you would rather concentrate on content and let the wiki take care of presentation and portability issues
like choosing the correct word break for the browser and deciding when and where to split long words.
Split long words portably
SmartWordBreakPlugin lets you split long words portably.
The plugin inserts the correct form of word break for the reader's browser.
SmartWordBreakPlugin uses a combination of approaches to split words.
- WikiWords are split at the start of each word
- Words_With_Underscores are split after underscores
- Word breaks are inserted after punctuation
- Hyphenation (see below)
SmartWordBreakPlugin inserts word breaks at these points. They inform the browser that it
may split the words at these points.
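The three non-hyphenation heuristics listed above can be sketched in a few lines of Python. This is an illustration only, not the plugin's own code: the function name `insert_breaks`, the `<wbr>` marker and the exact punctuation set are assumptions made for the example.

```python
import re

# Assumption: "<wbr>" stands in for whatever break markup the plugin
# actually emits for the reader's browser.
BREAK = "<wbr>"

# Zero-width positions where a break may be inserted:
#   1. WikiWords: between a lowercase letter or digit and an uppercase letter
#   2. Underscores: directly after an underscore that is followed by text
#   3. Punctuation: directly after punctuation that is followed by text
#      (the punctuation set below is an assumption)
BREAK_POINTS = re.compile(
    r"(?<=[a-z0-9])(?=[A-Z])"
    r"|(?<=_)(?=[^_\W])"
    r"|(?<=[.,;:/\-])(?=\w)"
)


def insert_breaks(word: str, marker: str = BREAK) -> str:
    """Return word with a break marker at every heuristic break point."""
    return BREAK_POINTS.sub(lambda match: marker, word)


print(insert_breaks("SmartWordBreakPlugin"))
print(insert_breaks("Navigate_With_Euclidean_Geometry"))
```

All three rules are matched in a single pass over the original word, so a marker inserted by one rule is never re-examined by another.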
If hyphenation is enabled, SmartWordBreakPlugin tries to hyphenate words that are longer than the longest-unbroken-word-segment setting, which is controlled with the SMARTWORDBREAKPLUGIN_LONGEST preference and the longest parameter to %SMARTWORDBREAK{...}%.
The plugin will apply hyphenation both to ordinary words and to the segments of words split using other heuristics.
The plugin inserts soft hyphens (&shy;) into ordinary words.
Soft hyphens can look odd when inserted into WikiWords and words_with_underscores, so the plugin inserts word breaks there instead.
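The choice between soft hyphens and word breaks can be pictured with a small Python sketch. It assumes the word has already been cut into pieces by some hyphenator; the names `is_compound` and `join_pieces`, the `<wbr>` marker and the test used to recognise a WikiWord are all hypothetical.

```python
import re
from typing import List

SOFT_HYPHEN = "&shy;"
WORD_BREAK = "<wbr>"  # assumption: placeholder for the browser-specific break


def is_compound(word: str) -> bool:
    """Treat a word as compound if it has an underscore or a WikiWord hump."""
    return "_" in word or re.search(r"[a-z0-9][A-Z]", word) is not None


def join_pieces(word_pieces: List[str]) -> str:
    """Rejoin pieces with soft hyphens for ordinary words, breaks otherwise."""
    word = "".join(word_pieces)
    separator = WORD_BREAK if is_compound(word) else SOFT_HYPHEN
    return separator.join(word_pieces)


print(join_pieces(["hy", "phen", "ation"]))
print(join_pieces(["Smart", "Word", "Break", "Plugin"]))
```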
Some search engines do not handle word breaks and soft hyphens well; they may register the parts of the word but not the whole word.
This could render search facilities useless. To mitigate this, SmartWordBreakPlugin inserts the unsplit (original) version of each split word as an HTML comment.
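As a rough Python sketch of this idea: the helper name `with_search_comment` is hypothetical, and both the placement of the comment after the split word and the handling of words containing "--" are assumptions, since the text above does not specify them.

```python
def with_search_comment(original: str, split: str) -> str:
    """Append the unsplit word as an HTML comment when the word was split."""
    if split == original:
        # Nothing was inserted, so no comment is needed.
        return original
    if "--" in original:
        # "--" may not appear inside an HTML comment; this sketch simply
        # skips the comment in that case (an assumption, not plugin behaviour).
        return split
    return f"{split}<!-- {original} -->"


print(with_search_comment("Very_Long_Name", "Very_<wbr>Long_<wbr>Name"))
```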
Control which words are split
The plugin is configurable.
You can adjust how aggressively and intelligently the plugin splits words,
and you can control which parts of each topic are processed. This is a tradeoff between convenience and performance.
At the one extreme,
SmartWordBreakPlugin will split all long words on a wiki page using a combination of wiki-specific heuristics
and the TeX hyphenation algorithm, preserving the original text in HTML comments to assist search engines.
At the other extreme, you can insert word breaks only where you want them.
Automatic word breaks in tables
Setting the SMARTWORDBREAKPLUGIN_TABLES preference makes SmartWordBreakPlugin insert word breaks automatically in all tables on a page.
This preference probably provides the best balance between convenience and performance for most applications.
The SMARTWORDBREAKPLUGIN_LONGEST and SMARTWORDBREAKPLUGIN_HYPHENATE preferences affect table-based insertion of word breaks.
The table-based processing does not work well with nested tables, so it is
not useful with skins that use tables to lay out the page.
Focussed automatic word breaks: %SMARTWORDBREAK{...}%
The %SMARTWORDBREAK{...}% macro inserts word-breaks automatically. This macro lets you apply
SmartWordBreakPlugin's automatic processing to a portion of a page.
Using this macro, it is possible to
automatically insert word breaks in a single table. This macro lets you enable automatic processing only where you need it.
| *Argument* | *Comment* | *Default value* | *Example* |
| hyphenate | Enables the hyphenation algorithm. | Value of SMARTWORDBREAKPLUGIN_HYPHENATE preference, which defaults to "on" | hyphenate="off" |
| longest | Specifies how long a sequence of letters may be; word-breaks or soft hyphens are inserted into words longer than this setting. | Value of SMARTWORDBREAKPLUGIN_LONGEST preference, which defaults to 8 | longest="5" |
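The defaulting rule in the table (macro argument first, then the preference, then the built-in default) can be sketched in Python as follows. The function `macro_settings` and its dictionary arguments are hypothetical, and only the spelling "on" is treated as enabled here; other spellings are not modelled.

```python
def macro_settings(params: dict, prefs: dict) -> dict:
    """Resolve the effective settings for one %SMARTWORDBREAK{...}% call."""
    # Macro argument wins; otherwise fall back to the preference, then to 8.
    longest = int(
        params.get("longest", prefs.get("SMARTWORDBREAKPLUGIN_LONGEST", 8))
    )
    # Same lookup order for hyphenate, whose built-in default is "on".
    hyphenate = params.get(
        "hyphenate", prefs.get("SMARTWORDBREAKPLUGIN_HYPHENATE", "on")
    )
    return {
        "longest": longest,
        "hyphenate": str(hyphenate).strip().lower() == "on",
    }


print(macro_settings({"longest": "5"}, {}))
print(macro_settings({"hyphenate": "off"}, {"SMARTWORDBREAKPLUGIN_LONGEST": "12"}))
```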
Calling all pockets: Automatic word breaks for whole pages
The SMARTWORDBREAKPLUGIN_WHOLEPAGE preference enables processing for the whole page, including the header, footer and side-bar.
This can hurt performance, so it should probably not be enabled in SitePreferences.
This preference may be useful on specific pages.
It may also be useful to wiki users who use narrow browser windows - they could set this preference in their user topic.
The SMARTWORDBREAKPLUGIN_LONGEST and SMARTWORDBREAKPLUGIN_HYPHENATE preferences also affect whole-page processing.
The %WBR% macro inserts the correct form of word-break for your browser. You choose where to put the break; SmartWordBreakPlugin inserts the correct one.
The %WBR% macro has the lowest overhead of all options provided by SmartWordBreakPlugin.
This macro may be tedious to use, and topics that use it may be more difficult to read when editing.
This macro may also interfere with the wiki search facility.
Demonstration
This table demonstrates what SmartWordBreakPlugin does (change the window width to see the effect):
%TABLE{columnwidths="30%,70%"}%
| *Selection* | *Manifestation* |
| %SMARTWORDBREAK{Navigate_With_Euclidean_Geometry}% | Works for short distances, based on the approximation that the world is flat. |
| %SMARTWORDBREAK{Navigate_With_Elliptical_Geometry}% | Works for long distances. |
Simulation (should work on Firefox 2 & 3 and IE 6 & 7; other browsers may vary):
| *Selection* | *Manifestation* |
| Navigate_With_Euclidean_Geometry | Works for short distances, based on the approximation that the world is flat. |
| Navigate_With_Elliptical_Geometry | Works for long distances. |
If you have the plugin installed and enabled:
| *Selection* | *Manifestation* |
| %SMARTWORDBREAK{Navigate_With_Euclidean_Geometry}% | Works for short distances, based on the approximation that the world is flat. |
| %SMARTWORDBREAK{Navigate_With_Elliptical_Geometry}% | Works for long distances. |
Without word-breaks:
| *Selection* | *Manifestation* |
| Navigate_With_Euclidean_Geometry | Works for short distances, based on the approximation that the world is flat. |
| Navigate_With_Elliptical_Geometry | Works for long distances. |
Examples
Insert a single word break:
| *Name* | *Description* |
| Very_Long_%WBR%Function_Name | Passes the foo to the crumblicator, which turns it into biscuits. This function is not re-entrant because foo cannot be articulated. |
Make SmartWordBreakPlugin process a single table:
%SMARTWORDBREAK{"
%TABLE{columnwidths="30%,70%"}%
| *Selection* | *Manifestation* |
| Navigate_With_Euclidean_Geometry | Works for short distances, based on the approximation that the world is flat. |
| Navigate_With_Elliptical_Geometry | Works for long distances. |
"}%
Add this to specific pages to make SmartWordBreakPlugin process all tables on those pages (probably the best way to use this plugin):
<!--
* Set SMARTWORDBREAKPLUGIN_TABLES=on
-->
Add this to a page to make SmartWordBreakPlugin process the whole of the page:
<!--
* Set SMARTWORDBREAKPLUGIN_WHOLEPAGE=on
-->
Caveats
Browsers do not all wrap text in the same way. Some browsers only respect word-breaks for text in tables.
Others respect word-breaks for all text.
The SMARTWORDBREAKPLUGIN_TABLES preference does not work well with nested tables.
As a consequence, it does not work well with skins that use tables for controlling the page layout.
After attempting to split words using hyphenation rules, the plugin simply chops up any remaining long word-segments into shorter fixed-length segments.
The resulting breaks are unfortunately not grammatically correct.
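The fallback chopping step might look like the following Python sketch. The function name `chop_segment` is hypothetical, and using the longest setting (default 8) as the fixed chunk length is an assumption; the text above only says that the chunks are of fixed length.

```python
from typing import List


def chop_segment(segment: str, longest: int = 8) -> List[str]:
    """Cut a segment into consecutive chunks of at most `longest` characters."""
    if longest < 1:
        raise ValueError("longest must be at least 1")
    return [segment[i:i + longest] for i in range(0, len(segment), longest)]


# A 12-letter segment with the default limit of 8 is cut after the 8th letter,
# regardless of where a syllable boundary would fall.
print(chop_segment("Crumblicator"))
```

Because the cut positions depend only on character counts, the pieces ignore syllable boundaries, which is why such breaks can look wrong to a reader.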
SmartWordBreakPlugin can hurt performance. It is better to use it only where it is needed, and to avoid widespread hyphenation.
Only use the plugin where it is needed:
- Use %SMARTWORDBREAK{}% in preference to setting SMARTWORDBREAKPLUGIN_TABLES to on.
- Set SMARTWORDBREAKPLUGIN_TABLES to on for a single page in preference to setting it on for a whole web.
- Set SMARTWORDBREAKPLUGIN_TABLES to on for a single web in preference to setting it on for a whole site.
- Avoid using SMARTWORDBREAKPLUGIN_WHOLEPAGE, unless you really need the whole page (including header and footer) processed, or you really do not mind the performance hit.
Hyphenation hurts performance too:
- Only enable hyphenation where it is needed.
- Reduce the number of words to be hyphenated by increasing the length of the longest unsplit word-segment (use the longest parameter or the SMARTWORDBREAKPLUGIN_LONGEST preference).
Installation Instructions
You do not need to install anything in the browser to use this extension. The following instructions are for the administrator who installs the extension on the server.
Open configure, and open the "Extensions" section. Use "Find More Extensions" to get a list of available extensions. Select "Install".
If you have any problems, or if the extension isn't available in configure, then you can still install manually from the command-line. See http://foswiki.org/Support/ManuallyInstallingExtensions for more help.
Preferences
No preferences are stored in this topic. The example settings here have no effect.
To learn more about setting preference variables, see the
PreferenceSettings topic.
| *Variable* | *Default* | *Description* |
| SMARTWORDBREAKPLUGIN_WHOLEPAGE | off | Enables processing of the whole web page. |
| SMARTWORDBREAKPLUGIN_TABLES | off | Enables processing of all text in tables. This should be faster than SMARTWORDBREAKPLUGIN_WHOLEPAGE. This works better than %SMARTWORDBREAK{}%, but it is slower, so this preference should only be set on the pages where it is needed. If SMARTWORDBREAKPLUGIN_WHOLEPAGE is true, then SMARTWORDBREAKPLUGIN_TABLES is ignored. |
| SMARTWORDBREAKPLUGIN_LONGEST | 8 | Sets the length of the longest unbroken sequence of letters. This preference affects the whole page when using the SMARTWORDBREAKPLUGIN_WHOLEPAGE setting, or all tables when using the SMARTWORDBREAKPLUGIN_TABLES setting. It also sets the default value for the longest parameter to %SMARTWORDBREAK{}%. |
| SMARTWORDBREAKPLUGIN_HYPHENATE | on | Enables hyphenation, i.e. splitting words at hyphenation points. This preference affects the whole page when using the SMARTWORDBREAKPLUGIN_WHOLEPAGE setting, or all tables when using the SMARTWORDBREAKPLUGIN_TABLES setting. It also sets the default value for the hyphenate parameter to %SMARTWORDBREAK{}%. |
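The precedence between the two page-level preferences described in the table can be summarised with a short Python sketch. The function `processing_mode` and its return labels are hypothetical, and only the value "on" is treated as enabled.

```python
def is_on(value) -> bool:
    """Assumption: only the spelling "on" (any case) counts as enabled."""
    return str(value).strip().lower() == "on"


def processing_mode(prefs: dict) -> str:
    """Return which automatic processing applies to a page."""
    # WHOLEPAGE takes precedence: when it is on, TABLES is ignored.
    if is_on(prefs.get("SMARTWORDBREAKPLUGIN_WHOLEPAGE", "off")):
        return "wholepage"
    if is_on(prefs.get("SMARTWORDBREAKPLUGIN_TABLES", "off")):
        return "tables"
    # Both default to off, leaving only explicit macros on the page.
    return "macros-only"


print(processing_mode({"SMARTWORDBREAKPLUGIN_TABLES": "on"}))
```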
Configuration
Some SmartWordBreakPlugin settings affect the whole site, and are not intended to have different values for different topics and/or webs.
These settings are adjustable via configure. The default settings are suitable for English.
Hyphenation configuration files for additional languages are available from http://www.ctan.org/tex-archive/language/.
Load the configuration file for your language onto your server and set the path to that configuration file via configure.
For more information, see TeX::Hyphen on CPAN.
Supported browsers
This plugin should support the following browsers, but they have not all been tested:
- Internet Explorer 6, 7 and 8
- Firefox 3.x
- Opera 9.62 and 10.x
- Chrome 1 & 2
- Konqueror 3.5.7
The list is not exhaustive.