[Lazarus] Any chance of changing the LCL Unicode encoding to UTF-16?

Ivan P. Gan ivan at comchatter.com
Mon Aug 4 21:40:49 CEST 2008


Graeme Geldenhuys wrote:
> 2008/8/4 Ivan Gan <ivan at comchatter.com>:
>   
>> UTF-8 is the better encoding system to use.
>> It is more compact than UTF-16, & why make applications more bloated than
>> they already are?
>>     
>
> Unfortunately the argument isn't that easy! There are many compelling
> reasons to implement utf-8 or utf-16. It's not a black & white
> argument.
>
> And please don't read this as I'm against utf-8. On the contrary, I
> implemented UTF-8 as our internal encoding in fpGUI as well.
>   
I never suggested that you are against UTF-8. I understand both 
arguments: on Windows, from a performance point of view, UTF-16 is 
likely to prove faster, as it is the native encoding on that platform.
It is, however, better to have a single encoding format for all platforms 
that Free Pascal & Lazarus support.
The use of UTF-8 allows easier upgrading of existing ANSI-encoded 
applications; this backward compatibility is something I consider a strong plus.
Updating existing components should be easier, though variable-length 
characters would have to be taken into account.
Like you say, it's not black & white.
>> Lazarus is now in my view a better environment to work with & is beginning
>> to take the lead anyhow
>>     
>
> The Free Pascal Compiler can be placed in the same context.  For my
> line of work, I must agree with your statement. ;-)
>   
Lazarus is impressive indeed, as is the Free Pascal Compiler.
>> We have effective Unicode support in Lazarus. OK, next problem please; that's
>> my view.
>>     
>
> Again a narrow point of view... It may work fine if you only write
> English applications. A simple example where FPC (and Lazarus) is
> failing with unicode support (and no easy solution is available), is
> in locale information.
>   
Actually, I work with non-English, mixed-language content daily. I am 
aware of many of the difficulties, though the Russian separator character 
is one I was unaware of.
As my work is web-based, I find UTF-8 very effective, providing a good 
balance of size & function.
Does using UTF-16 correct the Russian separator problem?
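
A rough way to check the size argument is a small Python sketch (Python is 
used here only for illustration; it is not FPC or LCL code). The sample 
strings and the helper name `encoded_sizes` are assumptions made up for this 
example. It counts the bytes each string needs in UTF-8 and in UTF-16 
without a byte order mark.

```python
# Sample strings chosen only for illustration.
samples = {
    "ascii": "Hello, world",
    "russian": "Привет, мир",
    "chinese": "你好，世界",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no BOM in either."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for name, text in samples.items():
    utf8_size, utf16_size = encoded_sizes(text)
    print(f"{name}: UTF-8 = {utf8_size} bytes, UTF-16 = {utf16_size} bytes")
```

For these samples UTF-8 is half the size for ASCII text, slightly smaller 
for the Russian string, and larger for the Chinese string, so which encoding 
is more compact depends on the content.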
> For example:
> The ThousandSeparator variable is a Char type (1 byte type), yet the
> Russian character for a thousand separator (U+00A0 character) requires
> 2 bytes (the utf-8 encoding would be $C2 $A0).
>
> There are a few other such cases with locale information alone. Simply
> changing the ThousandSeparator type or other locale types is also not
> an option, because they get used in many SysUtils functions like
> FormatDateTime(), FormatFloat() etc...
>
>   
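The byte counts in the example above can be checked with a short Python 
sketch (again Python only for illustration, not Pascal; the helper name 
`fits_in_one_code_unit` is hypothetical). It encodes U+00A0 in UTF-8, in 
UTF-16, and in the single-byte Windows-1251 code page, and asks whether the 
result fits in one code unit of the given size.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, the separator character discussed above


def fits_in_one_code_unit(ch, encoding, unit_size):
    """True if ch encodes to exactly one code unit of unit_size bytes."""
    return len(ch.encode(encoding)) == unit_size


utf8_bytes = NBSP.encode("utf-8")       # two bytes: C2 A0
utf16_bytes = NBSP.encode("utf-16-le")  # one 16-bit unit: A0 00
ansi_bytes = NBSP.encode("cp1251")      # one byte in the Cyrillic ANSI page

print(utf8_bytes.hex(), utf16_bytes.hex(), ansi_bytes.hex())
print(fits_in_one_code_unit(NBSP, "utf-8", 1))      # a 1-byte Char is too small
print(fits_in_one_code_unit(NBSP, "utf-16-le", 2))  # a 2-byte character holds it
```

So a 16-bit character type would hold this separator in a single code unit, 
while a 1-byte Char holding UTF-8 cannot. That by itself says nothing about 
how the existing SysUtils functions would cope with a changed type.
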
I have noticed one common failing on many projects: people tie language 
to locale. The two need to be independent, to allow immigrants, for example, 
to choose Chinese in the USA or French in Israel, while the locale 
information still follows local conventions for dates etc.
It is a difficult task, even for those of us who work with it frequently.
Most locale setting systems, however, tie language to the location, which 
totally fails to take immigration into account. For example, one of the 
most widely spoken languages in the state of Texas, USA, after English and 
Spanish, is Vietnamese.
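
A minimal sketch of keeping the two settings apart, in Python for 
illustration only. The `UserPreferences` class, the `greet_with_date` 
function and both lookup tables are hypothetical placeholders, not real 
locale data or any existing API. The point is only that the language key 
and the region key are looked up independently.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative placeholder tables, not authoritative locale data.
DATE_FORMATS = {"US": "%m/%d/%Y", "FR": "%d/%m/%Y", "IL": "%d/%m/%Y"}
GREETINGS = {"en": "Hello", "fr": "Bonjour", "vi": "Xin chào", "zh": "你好"}


@dataclass
class UserPreferences:
    language: str  # language of the user interface text
    region: str    # region whose formatting conventions apply


def greet_with_date(prefs, today):
    """Pick the text by language and the date format by region."""
    greeting = GREETINGS[prefs.language]
    formatted = today.strftime(DATE_FORMATS[prefs.region])
    return f"{greeting} ({formatted})"


# Vietnamese interface text with US date conventions.
print(greet_with_date(UserPreferences("vi", "US"), date(2008, 8, 4)))
# French interface text with the region set to Israel.
print(greet_with_date(UserPreferences("fr", "IL"), date(2008, 8, 4)))
```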

Regards
Ivan
> Regards,
>  - Graeme -
>
>
> _______________________________________________
> fpGUI - a cross-platform Free Pascal GUI toolkit
> http://opensoft.homeip.net/fpgui/
> _______________________________________________
> Lazarus mailing list
> Lazarus at lazarus.freepascal.org
> http://www.lazarus.freepascal.org/mailman/listinfo/lazarus
>   
