[Lazarus] GB18030 support in Lazarus
Mattias Gaertner
nc-gaertnma at netcologne.de
Fri Oct 16 16:00:24 CEST 2015
On Fri, 16 Oct 2015 14:33:03 +0100
Martin Frb <lazarus at mfriebe.de> wrote:
> On 16/10/2015 10:19, Tony Whyman wrote:
> >
> > In terms of "work", if I use functions such as UTF8Length and
> > ValidUTF8String on a GB18030 string should they always work, or are
> > there exceptions?
>
> IIRC ... UTF8Length counts codepoints, not chars. So if the text you
> are interested in has chars that need more than one codepoint, then this
> is not the length in chars.
True.
> This can even happen with some western languages, but it is not likely
> with them.
Actually, decomposed characters are pretty common in western languages,
for example in file names on OS X HFS+. And AFAIK Chinese text in Unicode
usually uses precomposed characters, does it not?
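
To make the codepoint versus character difference concrete, here is a
minimal sketch. It is written in Python rather than Pascal, so it does not
call UTF8Length itself; it only assumes that a codepoint count behaves like
Python's len() on a str. The helper name nfd is made up for this example.

```python
import unicodedata


def nfd(text: str) -> str:
    """Return the canonically decomposed (NFD) form of text."""
    return unicodedata.normalize("NFD", text)


# One user-perceived character, stored in two different ways.
precomposed = "\u00e9"           # U+00E9, e with acute as a single codepoint
decomposed = nfd(precomposed)    # U+0065 followed by combining U+0301

# len() on a Python str counts codepoints, not perceived characters.
precomposed_codepoints = len(precomposed)
decomposed_codepoints = len(decomposed)

# The UTF-8 byte lengths differ as well.
precomposed_bytes = len(precomposed.encode("utf-8"))
decomposed_bytes = len(decomposed.encode("utf-8"))

# A common Han character has no canonical decomposition,
# so NFD leaves it as a single codepoint.
han = "\u4e2d"
han_unchanged = nfd(han) == han

if __name__ == "__main__":
    print(precomposed_codepoints, decomposed_codepoints)
    print(precomposed_bytes, decomposed_bytes)
    print(han_unchanged)
```

Both strings display as the same character, yet the codepoint count is 1
for the precomposed form and 2 for the decomposed one.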
> The same goes for char-accessing functions (NextUtf8CharByteLen or
> similar). They only see codepoints.
Mattias