[Lazarus] Updating on Mac from PPC to Intel, GTK/Carbon problems

Tom Verhoeff T.Verhoeff at tue.nl
Thu Aug 14 11:36:57 CEST 2008


Mattias Gärtner wrote:
> Quoting Brad Campbell <brad at wasp.net.au>:
> 
>> G'day all,
>>
>> I'm just starting my foray into Unicode (and I'm not sure I like it, but
>> nonetheless...). I have to read UTF-16 formatted text files produced by
>> some bletcherous Windows software.
>>
>> I figured that if I could read them, I could probably get them into a
>> TSynEdit for editing and then convert them back out to UTF-16 when I
>> rewrite them. I've found plenty of material on dealing with UTF-8, but
>> very little on UTF-16. Is there an easy way to do this, or do I need to
>> hit the Unicode site and start writing my own conversion routines?
> 
> For example unit LCLProc:
> UTF16ToUTF8 and UTF8ToUTF16.

Hey, thanks for the quick answers :)

OK, I've found that. Maybe this is a dumb question, but how would I populate a UTF16String from a 
file? Do I have to strip the BOM manually? How do I convert what is effectively a stream of bytes 
into valid UTF-16 code units to feed into the conversion routine?

Looking closer at the following function, it looks like if I load the file into a buffer and skip 
the BOM, I should be able to pass the address of the first code unit after the BOM as a PWideChar 
(provided the file's endianness matches the machine's). Is there an easy way to figure out how many 
encoded characters there are in the source file without parsing each word in the stream looking for 
double-width characters (surrogate pairs)? Or can I just pass a very high count and make sure the 
stream ends with an invalid character?

function ConvertUTF16ToUTF8(Dest: PChar; DestCharCount: SizeUInt;
   Src: PWideChar; SrcWideCharCount: SizeUInt; Options: TConvertOptions;
   out ActualCharCount: SizeUInt): TConvertResult;
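
For what it's worth, here is a minimal sketch of the buffer approach described above. It is
not the LCLProc routine: it only uses UTF8Encode from the RTL, and the names UTF16BytesToUTF8
and LoadUTF16FileAsUTF8 are hypothetical helpers made up for illustration. It assumes that a
file without a BOM is little-endian, which is a guess based on the files coming from Windows.

```pascal
program ReadUtf16Sketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: turn raw UTF-16 bytes (with or without a BOM) into UTF-8.
function UTF16BytesToUTF8(const Data: array of Byte): UTF8String;
var
  Offset, UnitCount, i: SizeInt;
  BigEndian: Boolean;
  B0, B1: Word;
  W: WideString;
begin
  Offset := 0;
  BigEndian := False; // assumption: no BOM means little-endian
  if Length(Data) >= 2 then
  begin
    if (Data[0] = $FF) and (Data[1] = $FE) then
      Offset := 2                      // UTF-16LE BOM, skip it
    else if (Data[0] = $FE) and (Data[1] = $FF) then
    begin
      BigEndian := True;               // UTF-16BE BOM, skip it and swap bytes
      Offset := 2;
    end;
  end;

  // Number of 16-bit code units; a trailing odd byte is ignored.
  UnitCount := (Length(Data) - Offset) div 2;
  SetLength(W, UnitCount);
  for i := 0 to UnitCount - 1 do
  begin
    B0 := Data[Offset + 2 * i];
    B1 := Data[Offset + 2 * i + 1];
    if BigEndian then
      W[i + 1] := WideChar(Word((B0 shl 8) or B1))
    else
      W[i + 1] := WideChar(Word(B0 or (B1 shl 8)));
  end;
  Result := UTF8Encode(W);
end;

// Hypothetical helper: read a whole file and convert it.
function LoadUTF16FileAsUTF8(const FileName: string): UTF8String;
var
  Stream: TFileStream;
  Data: TBytes;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Data, Stream.Size);
    if Length(Data) > 0 then
      Stream.ReadBuffer(Data[0], Length(Data));
  finally
    Stream.Free;
  end;
  Result := UTF16BytesToUTF8(Data);
end;

begin
  // 'Hi' encoded as UTF-16LE with a leading BOM
  WriteLn(UTF16BytesToUTF8([$FF, $FE, $48, $00, $69, $00]));
end.
```

In this sketch the count falls straight out of the file size: (bytes minus BOM) div 2 gives
the number of 16-bit code units, and a surrogate pair simply counts as two of them. If
SrcWideCharCount is a count of WideChars, as its name suggests, then presumably no pre-scan
of the stream is needed, but I have not verified that against the LCLProc source.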

> 
> I have not yet added UTF-16 to the IDE. If you need it I can add it.

I'm not sure I'll require that.

My plan (such as it is) is to take the UTF-16 text file and convert/squirt it into a TSynEdit 
component, where I am hoping to be able to colour it / edit it / print it / syntax highlight it, 
and then strip all the fancy formatting out and write the file back out as a UTF-16 text file.
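
The write-back half of that plan might look roughly like the sketch below. It assumes the
edited text is available as UTF-8 and that the Windows application wants little-endian
UTF-16 with a BOM; both are assumptions, and UTF8ToUTF16LEBytes and SaveUTF8AsUTF16File
are hypothetical names, not existing library routines. Only UTF8Decode comes from the RTL.

```pascal
program WriteUtf16Sketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: UTF-8 text to UTF-16LE bytes, prefixed with a BOM.
function UTF8ToUTF16LEBytes(const S: UTF8String): TBytes;
var
  W: WideString;
  i: SizeInt;
  U: Word;
begin
  W := UTF8Decode(S);
  SetLength(Result, 2 + 2 * Length(W));
  Result[0] := $FF;                    // UTF-16LE BOM
  Result[1] := $FE;
  for i := 1 to Length(W) do
  begin
    U := Ord(W[i]);
    Result[2 * i] := Byte(U and $FF);  // low byte first (little-endian)
    Result[2 * i + 1] := Byte(U shr 8);
  end;
end;

// Hypothetical helper: write the converted bytes to a file.
procedure SaveUTF8AsUTF16File(const FileName: string; const S: UTF8String);
var
  Data: TBytes;
  Stream: TFileStream;
begin
  Data := UTF8ToUTF16LEBytes(S);       // always at least the 2 BOM bytes
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.WriteBuffer(Data[0], Length(Data));
  finally
    Stream.Free;
  end;
end;

begin
  // Two ASCII characters plus the BOM
  WriteLn(Length(UTF8ToUTF16LEBytes('Hi')));
end.
```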

It needs to be Unicode because there is a corresponding TTF font file that has "special" characters 
in it, effectively like the old text/graphic characters, specific to the Windows application that 
created the files. Luckily, I have the font file.

-- 
Dolphins are so intelligent that within a few weeks they can
train Americans to stand at the edge of the pool and throw them
fish.




