[Lazarus] String vs WideString

Mattias Gaertner nc-gaertnma at netcologne.de
Mon Aug 14 15:46:54 CEST 2017


On Mon, 14 Aug 2017 14:21:57 +0100
Tony Whyman via Lazarus <lazarus at lists.lazarus-ide.org> wrote:

>[...]
> Lazarus is already a UTF8 environment.
> 
> Much of the LCL assumes UTF8.

True.

 
> UTF8 is arguably a much more efficient way to store and transfer data

It depends.

 
> UTF-16/Unicode can only store 65,536 characters while the Unicode 
> standard (that covers UTF8 as well) defines 136,755 characters.

No. 
UTF-16 can encode the full Unicode range of 1,114,112 codepoints. It
uses one or two 16-bit words per codepoint. UTF-8 uses 1 to 4 bytes.
See here for more details:
https://en.wikipedia.org/wiki/UTF-16
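To make the size comparison concrete, here is a small sketch. Python is used
only for brevity (the thread is about Lazarus/FPC strings, but the encodings
themselves are language-independent); the sample characters are illustrative:

```python
# Compare how many UTF-8 bytes and UTF-16 words each codepoint needs.
# U+0041 'A', U+00E9 'e-acute', U+20AC euro sign, U+1F600 an emoji
# outside the BMP (which therefore needs a UTF-16 surrogate pair).
for ch in ("A", "\u00E9", "\u20AC", "\U0001F600"):
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")   # little-endian, no BOM
    print(f"U+{ord(ch):04X}: UTF-8 {len(utf8)} byte(s), "
          f"UTF-16 {len(utf16) // 2} word(s)")
```

For ASCII-heavy text UTF-8 is smaller (1 byte vs 2); for many CJK characters
UTF-16 is smaller (2 bytes vs 3) -- hence "it depends".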

You are right, though, that there are still many applications that
falsely claim to support UTF-16 but in fact only handle the first $D800
codepoints (the BMP below the surrogate range).

 
> UTF-16/Unicode's main advantage seems to be for rapid indexing of large 
> strings.

That's only true for UCS-2, which is obsolete.
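To illustrate why the fast-indexing argument only holds for UCS-2: once a
string contains a character outside the BMP, the UTF-16 word index no longer
matches the codepoint index. A minimal sketch in Python (the string literal
is illustrative):

```python
# "a" + U+1F600 (outside the BMP) + "b" -- 3 codepoints.
s = "a\U0001F600b"
utf16 = s.encode("utf-16-le")   # little-endian, no BOM
words = len(utf16) // 2         # number of 16-bit code units
print(words)                    # 4: the astral char takes a surrogate pair
# Indexing by 16-bit word is only O(1)-correct while every character
# fits in a single word -- i.e. under the obsolete UCS-2 assumption.
```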

 
> You may need UTF-16/Unicode support for accessing Microsoft APIs but 
> apart from that, why is it being promoted as the universal standard?

Who does that?

Mattias

