[Lazarus] Does Lazarus support a complete Unicode Component Library?

Hans-Peter Diettrich DrDiettrich1 at aol.com
Wed Feb 23 03:12:46 CET 2011


Jürgen Hestermann wrote:
> Hans-Peter Diettrich wrote:
>  > Generic types should never evolve in a breaking way.
> 
> But that's the whole reason for generic types: They should
> not map to a specific type but vary depending on certain
> environment settings.

Generic types IMO should vary according to the target machine, the most 
prominent such type being "pointer".

> If the generic type String means
> the same type all the time than it's just an alias for
> that type.

When the string type can be redefined per unit, using the $H switch, 
then it's (kind of) an alias.
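For example (a sketch; the unit name is made up), with $H- the generic 
String becomes an alias for ShortString inside one unit:

```pascal
unit ShortDemo;          // hypothetical unit name
{$H-}                    // here String = ShortString (max 255 chars)
interface
procedure Demo;
implementation
procedure Demo;
var s: String;           // a ShortString in this unit
begin
  WriteLn(SizeOf(s));    // 256: one length byte plus 255 data bytes
end;
end.
```

With $H+ the same declaration would give an AnsiString, and SizeOf(s) 
would report only the size of the reference.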

> I never understood how this can be of advantage but that's
> the reason for the design of generic types.

Strings are somewhat special types. Just the move from ShortString to 
AnsiString broke the char at index zero, which holds the string length 
of a ShortString, but is no longer accessible in an AnsiString.
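A short illustration, assuming standard FPC behaviour:

```pascal
var
  ss: ShortString;
  sa: AnsiString;
begin
  ss := 'hello';
  WriteLn(Ord(ss[0]));   // 5: the length byte lives at index 0
  sa := 'hello';
  WriteLn(Length(sa));   // 5: an AnsiString has no sa[0]; only Length works
end.
```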


>  > Adding encodings to the string type looks to me like turning 
> "integer" into "complex".
> 
> But what exactly should be the type "String"? Because of the generic
> nature it must map to different string types under certain environment
> conditions. Otherwise it would be just another name for a fixed string 
> type.

A ShortString16 could hold 16 bit (wide) chars, and a ShortString32 
could hold the full range of Unicode codepoints. This would be 
compatible with other generic types, where the (element) byte count is 
not fixed. If s[0] still held the string length, the maximum length of 
the strings would have grown "naturally" with the char size. Had this 
model been followed, we would have no discussion about future string 
types, and Unicode issues (like UTF-8) would never have been a problem.

But of course such string implementations are incompatible with Delphi, 
so...
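Such a ShortString16 might have looked like this (purely hypothetical; 
no compiler implements it):

```pascal
type
  ShortString16 = record
    Len:  Word;                         // the s[0] analogue, now 16 bit
    Data: array[1..65535] of WideChar;  // 16 bit (wide) chars
  end;
```

The length field grows with the char size, so the maximum length grows 
"naturally", as described above.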


> Again, what reason exists for a generic integer type
> if not providing the "best" integer representation
> for a certain platform?

IMO it's not the platform, but the machine architecture and mode, that 
is essential for the determination of "integer" (and "pointer").

> If you compile for DOS it
> should be 16 bit, for Win32 it should be 32 bit and
> of course for Win64 it needs to be 64 bit.

There is no advantage in the use of 64 bit integers on current 64 bit 
machines. There is no speed penalty in the use of values shorter than 
the register size, and then data and code size have to be taken into 
account, too. Even if data size doesn't matter with regard to available 
RAM, reading and writing twice the number of bytes costs twice the 
number of RAM cycles, which are much slower than CPU clock cycles.
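Free Pascal reflects this: in {$mode objfpc} the generic Integer stays 
32 bit even on 64 bit targets, while Pointer follows the architecture. 
A minimal check:

```pascal
{$mode objfpc}
begin
  WriteLn(SizeOf(Integer)); // 4: Integer stays 32 bit in this mode
  WriteLn(SizeOf(Pointer)); // 8 on a 64 bit target, 4 on a 32 bit one
end.
```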

When the same code has to run on target machines with different 
architectures, then the use of overly long generic integers is not a 
good idea.

> Either you vary it or not. If there is no variation
> then what reason exists for a generic type?

Generic types should not affect any (existing) code. That's hard to 
accomplish when new attributes are to be added to the string type.

Otherwise generic types should be forward-compatible, i.e. code assuming 
(at least) 16 bit integers will never fail with integers of any higher 
bitcount. Backwards compatibility cannot be achieved, i.e. code assuming 
32 bit integers is not guaranteed to work with the 16 bit integers 
occurring on older or "smaller" targets. A "backport" should address 
such issues by replacing "integer" with more specific types, wherever 
the allowed value range is known and targets with smaller integers are 
a possibility.
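The "backport" idea can be sketched like this (LongInt and SmallInt are 
standard FPC types; TPercent is a made-up illustration):

```pascal
type
  TPercent = 0..100;     // subrange: fits the integer of any target
var
  Big:   LongInt;        // exactly 32 bit on every target
  Small: SmallInt;       // exactly 16 bit on every target
begin
  Big := 100000;         // safe even where "integer" is only 16 bit
  Small := 20000;        // the declared range documents the assumption
end.
```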

DoDi





More information about the Lazarus mailing list