[Lazarus] Howto work with wikiget and wikiconvert tools

Mon Jan 19 23:06:06 CET 2015

On Mon, 19 Jan 2015 19:11:39 +0100
Frieß <friess at gmx.at> wrote:

>[...]
> > I _feel_ that pagenames should be pure ASCII to avoid confusion, but I 
> > am not an expert on that.
> >
> It looks like this naming is allowed. See
> http://en.wikipedia.org/wiki/Wikipedia:Naming_conventions_(technical_restrictions)
> and 'Title length' there. So the representation in the file system must
> handle this, or the naming convention of the wiki site must have an
> (internal) rule not to use UTF-8.

Of course the Wiki supports UTF-8 for page names.
The problem is that wikiget creates one file per page and so it needs
to map a wiki page name to a file name. The file name must also
work on all common file systems and version control systems.
For example, the Wiki is case sensitive, while several common file
systems are not.
At the moment wikiget uses a simple mapping that keeps English letters
and encodes the rest. This makes debugging somewhat easier.
Unfortunately it can triple the length, leading to file names that are
too long.
An alternative would be UTF-7 (with special characters like \ and /
encoded).
See the functions WikiPageToFilename and WikiFilenameToPage.
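
To illustrate the kind of mapping described above, here is a minimal
sketch in Python. It is not wikiget's code: the names page_to_filename
and filename_to_page, the "=XX" escape format and the choice to also
escape upper case letters (so that names stay distinct on case
insensitive file systems) are all assumptions made for this example.

```python
# Hypothetical sketch, not the actual wikiget mapping.
# Lower case ASCII letters and digits are kept, every other UTF-8 byte
# is written as "=XX" (two upper case hex digits).
SAFE = set("abcdefghijklmnopqrstuvwxyz0123456789")


def page_to_filename(page: str) -> str:
    """Map a wiki page name to a file name using only [a-z0-9=A-F]."""
    out = []
    for byte in page.encode("utf-8"):
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)
        else:
            # One byte becomes three characters, so the name can
            # grow to three times its UTF-8 length.
            out.append("=%02X" % byte)
    return "".join(out)


def filename_to_page(name: str) -> str:
    """Reverse page_to_filename."""
    raw = bytearray()
    i = 0
    while i < len(name):
        if name[i] == "=":
            # "=" is always followed by exactly two hex digits.
            raw.append(int(name[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(name[i]))
            i += 1
    return raw.decode("utf-8")
```

With this sketch the five characters of "Frieß" become a twelve
character file name, which shows how quickly non-English page names
grow under such a scheme.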

The Wiki allows pretty long page names, so very long page names need
special treatment, for example using an md5sum. OTOH long page names are
bad style, so it is better to rename the wiki page.
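
The md5sum idea could look roughly like the following Python sketch.
The function name shorten_filename and the limit of 100 characters are
assumptions for the example, not something wikiget does. Note that the
shortened name cannot be mapped back to the page name, so the original
name would have to be stored somewhere else.

```python
import hashlib

# Hypothetical limit, chosen only for this example.
MAX_LEN = 100


def shorten_filename(name: str, max_len: int = MAX_LEN) -> str:
    """Return name unchanged if it fits, else a prefix plus its md5 sum."""
    if len(name) <= max_len:
        return name
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()  # 32 hex chars
    # Keep a readable prefix, then "-" and the digest, so that the
    # result is exactly max_len characters long (needs max_len > 33).
    keep = max_len - len(digest) - 1
    return name[:keep] + "-" + digest
```

Two long names that share the same prefix still get different file
names, because the digest is computed over the whole name.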

Mattias