[Lazarus] dynamic string proposal
Juha Manninen
juha.manninen62 at gmail.com
Wed Aug 16 17:55:54 CEST 2017
On Wed, Aug 16, 2017 at 6:24 PM, Martin Frb via Lazarus
<lazarus at lists.lazarus-ide.org> wrote:
> Actually no.
I know CodeUnit and CodePoint are not officially called "character" by
the Unicode Standard.
In everyday communication, however, both are called "character".
For example, in the "String vs WideString" thread most people used
"character" as a synonym for CodePoint.
For CodeUnit the term is logical for historical reasons, because the
type "Char" is short for "Character". This meaning is important
because CodeUnit resolution is useful even with variable-width
encodings.
For example the following code works perfectly with UTF-8 and UTF-16:
function SplitInHalf(Txt, Separator: string; out Half1, Half2: string): Boolean;
var
  i: Integer;
begin
  i := Pos(Separator, Txt);
  Result := i > 0;
  if Result then
  begin
    Half1 := Copy(Txt, 1, i-1);
    Half2 := Copy(Txt, i+Length(Separator), Length(Txt));
  end;
end;
even though Pos(), Copy() and Length() all work at CodeUnit resolution.
I wonder how the new fancy string types would handle it without a
performance penalty.
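
A minimal sketch of the claim above, assuming Free Pascal in objfpc
mode with {$H+}, so that "string" is a byte-oriented AnsiString holding
UTF-8. The program name, the constant names and the sample texts are
made up for illustration, and the "const" on the parameters is a small
addition. The UTF-8 bytes are written as escapes so the result does not
depend on the source file encoding. The point it tries to show: a
well-formed UTF-8 separator can only match at a CodePoint boundary, so
the CodeUnit offsets from Pos() never cut a multi-byte character in two.

```pascal
program SplitDemo;
{$mode objfpc}{$H+}

function SplitInHalf(const Txt, Separator: string;
  out Half1, Half2: string): Boolean;
var
  i: Integer;
begin
  i := Pos(Separator, Txt);            // index in CodeUnits (bytes here)
  Result := i > 0;
  if Result then
  begin
    Half1 := Copy(Txt, 1, i - 1);
    Half2 := Copy(Txt, i + Length(Separator), Length(Txt));
  end;
end;

const
  // Hypothetical sample data, spelled out as raw UTF-8 bytes.
  AUml  = #$C3#$A4;       // U+00E4, 2 CodeUnits
  OUml  = #$C3#$B6;       // U+00F6, shares its lead byte with AUml
  Arrow = #$E2#$86#$92;   // U+2192, 3 CodeUnits
var
  A, B: string;
begin
  // The separator is a 3-byte CodePoint surrounded by 2-byte CodePoints.
  if SplitInHalf('p' + AUml + AUml + Arrow + 'y' + AUml, Arrow, A, B) then
  begin
    WriteLn(A = 'p' + AUml + AUml);    // both halves stay intact
    WriteLn(B = 'y' + AUml);
    WriteLn(Length(A), ' ', Length(B)); // lengths in bytes, not CodePoints
  end;

  // A shared lead byte is not enough for a match: the whole byte
  // sequence of the separator has to be present.
  WriteLn(SplitInHalf('x' + OUml, AUml, A, B));
end.
```

The same reasoning applies to UTF-16 with surrogate pairs, provided
both Txt and Separator are well-formed. What this sketch does not
cover is user-perceived characters built from several CodePoints
(combining marks), which is a separate question from the encoding.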
Juha