[Lazarus] UTF16 2 utf8

Mattias Gaertner nc-gaertnma at netcologne.de
Wed May 4 19:32:41 CEST 2011


On Wed, 04 May 2011 17:58:32 +0200
Marc Weustink <marc.weustink at cuperus.nl> wrote:

> Graeme Geldenhuys wrote:
> > On 04/05/2011 14:38, Eduard Filipas wrote:
> >> I read that MS SQL server 2008 uses UCS2 which is UTF-16 ... I'm really
> >> confused by all this
> >
> >
> > UCS2 <> UTF-16
> >
> > UCS2 is a subset of UTF-16. It doesn't cover all code points that
> > Unicode defines, which UTF-8, UTF-16 and UTF-32 do.
> 
> Till 1996 (Unicode 2.0) you are right. After that date they are equal.

Ehm, isn't it the other way round?
Unicode 2.0 added UTF-16 as the successor to UCS2. UCS2 is still a
fixed 2-byte encoding. They are not the same (Graeme is right).
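
To make the difference concrete, here is a minimal sketch in Python (not
Lazarus code). The helper name fits_ucs2 is made up for this example. It
shows that a code point above U+FFFF needs two 16-bit units (a surrogate
pair) in UTF-16, which a fixed 2-byte encoding like UCS2 cannot express.

```python
# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
ch = "\U0001D11E"

# Encode as big-endian UTF-16 without a BOM and split into 16-bit code units.
utf16 = ch.encode("utf-16-be")
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]


def fits_ucs2(text):
    """Hypothetical helper: True if every code point fits in one 16-bit unit."""
    return all(ord(c) <= 0xFFFF for c in text)


print(len(utf16), [hex(u) for u in units])  # 4 bytes, i.e. two code units
print(fits_ucs2("A\u20ac"), fits_ucs2(ch))
```

Text that only uses code points up to U+FFFF produces the same bytes in
both encodings, which is probably why the two names get mixed up so often.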

BTW, GuessEncoding in lconvencoding.pas seems to treat the UTF16 BOMs
as "UCS2" BOMs. In practice this is correct for many existing
documents, but officially it is wrong.
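
As an illustration only, a BOM check that uses the UTF-16 names could look
like the Python sketch below. This is not the GuessEncoding implementation
from lconvencoding.pas; the function name guess_utf16_bom is an assumption
made for this example.

```python
import codecs


def guess_utf16_bom(data):
    """Hypothetical helper: return a UTF-16 codec name based on the BOM, or None.

    Note: the UTF-32 LE BOM (FF FE 00 00) also starts with FF FE, so a
    complete detector would have to check for UTF-32 first.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None


# A document with a BOM followed by a character outside the BMP.
doc = codecs.BOM_UTF16_LE + "\U0001D11E".encode("utf-16-le")
encoding = guess_utf16_bom(doc)
text = doc[2:].decode(encoding)  # skip the 2-byte BOM before decoding
print(encoding, len(text))
```

Decoding the payload as UTF-16 turns the surrogate pair back into a single
character. A decoder that really assumed fixed 2-byte UCS2 would see two
separate units here.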


Mattias



