[Lazarus] UTF8 RTL for Windows

Mattias Gaertner nc-gaertnma at netcologne.de
Mon Nov 24 23:39:03 CET 2014


On Mon, 24 Nov 2014 22:15:29 +0100
Hans-Peter Diettrich <DrDiettrich1 at aol.com> wrote:

>[...]
> The Delphi (and FPC) encoding model allows for strings of different 
> static (declared) and dynamic (true content) encoding, see the special 
> handling of RawByteString (Wiki).
> 
> So far it's not a good idea to simply *assume* that a string variable 
> contains bytes of the declared encoding. In detail one should check or 
> force the right dynamic encoding of every string variable, before 
> searching for specific bytes (chars) in it.
> 
> I'm missing documentation for working safely (and efficiently) with such 
> irregular strings, most probably none of the FPC (and Delphi) developers 
> ever noticed how users are left alone with this problem :-(

Maybe I don't understand the question, but it seems to me this is
documented where the static and dynamic code pages and RawByteString
are explained:

http://wiki.freepascal.org/FPC_Unicode_support#Ansistring

When a procedure requires a specific encoding, it uses a specific string
type. If it works with CP_ACP, it uses "String". If it needs UTF-8, it
uses UTF8String. If it can work with any 8-bit encoding, it uses
RawByteString. If you need more detail than that, use the
StringCodePage function to query the dynamic code page of a string at
run time.
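
To make the distinction concrete, here is a minimal sketch, assuming
FPC 3.0 or later (where code page aware strings are available). The
program name and the procedures ShowCodePage and NeedsUTF8 are made up
for illustration; only RawByteString, UTF8String and StringCodePage
come from the RTL. The printed values depend on the platform and on
the code page settings of the source file, so I don't list any here.

```pascal
program CodePageDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: a RawByteString parameter accepts any 8-bit
// encoding without an implicit conversion, so StringCodePage reports
// the dynamic code page the caller's string actually carries.
procedure ShowCodePage(const Caption: string; const S: RawByteString);
begin
  WriteLn(Caption, ': dynamic code page = ', StringCodePage(S));
end;

// Hypothetical helper: declaring the parameter as UTF8String states
// that the routine needs UTF-8. An argument with a different declared
// code page is expected to be converted by the compiler at the call.
procedure NeedsUTF8(const S: UTF8String);
begin
  WriteLn('NeedsUTF8: dynamic code page = ', StringCodePage(S));
end;

var
  A: AnsiString;   // declared (static) code page CP_ACP
  U: UTF8String;   // declared (static) code page CP_UTF8
begin
  A := 'abc';
  U := 'abc';
  ShowCodePage('A', A);
  ShowCodePage('U', U);
  NeedsUTF8(A);
end.
```

So the parameter type documents what the routine expects, and
StringCodePage is only needed in the RawByteString case, where the
routine has to find out for itself what it was given.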

What else do you need?

Mattias



