[Lazarus] [?? Probable Spam] Re: Unicode text stored in database

Bogusław Brandys brandys at o2.pl
Sun May 24 15:50:53 CEST 2009


Joost van der Sluis pisze:
> Op vrijdag 15-05-2009 om 12:12 uur [tijdzone +0200], schreef Graeme
> Geldenhuys:
>> Hi,
>>
>> I briefly tried to store some unicode text (UTF-8) into a Firebird
>> database and got some strange results in the Lazarus DB components.
>> I'm not sure if the issue is in the Sqldb components or in the actual
>> GUI DB components of LCL.
>>
>> Anyway, can one define a VarChar() field with unicode text? I'm using
>> Firebird 2.1.2
>>
>> For example:
>>     varchar(10) field definition
>>
>>  - that would allow max. of 10 ansi characters
>>  - what about UTF-8 characters?  Some chars can be anything from 1-4 bytes.
>>  - does varchar() use bytes or characters? I believe it's bytes if I'm
>> not mistaken.
> 
> Depends. You have to define which character-set you use for each
> (n)(var)char(x) field on a FB database. But (x) is always the
> character-count, not the byte-count.
> 
> When you don't supply a character-set for your (n)(var)char-field, the
> default character-set is used which is supplied while creating the
> database. If you haven't done that either, ansi is used.
> 
> The n-prefix makes Firebird store the data in blocks of two bytes.
> That's made for UCS-16 systems like windows. Then each ucs-16 character
> fits inside one such block. (But using a nvarchar with a UTF-8
> characterset is obviously stupid, use a normal varchar instead) for
> UTF-16, nvarchar could also be handy, offcourse.
> 
>> If I can't use varchar(), then what? Char() :-( That would waste a lot
>> of space, but if I have no alternative, then I guess it will have to
>> do.
> 
>>   http://www.firebirdsql.org/manual/migration-mssql-data-types.html
>> I found a web page that mentions the migration of SQL Server to
>> Firebird. It mentions NVarChar (sql server specific type for unicode
>> text) to VARCHAR(x) CHARACTER SET UNICODE_FSS (firebird support). So I
> 
> The UNICODE_FSS charset is obsolete. You can still use it, though. But I
> would simply use UTF-8.
> 
>> guess that's what I need to try. But which unicode format is firebird
>> going to use? utf-8, utf-16, etc...
> 
> That's completely up to you.
> 
>> Has anybody successfully used Firebird DB to store Unicode text
>> (preferably UTF-8 text)?  Anybody done that with Lazarus?
> 
> You could do that, but that's not what you want. All text above is not
> very interesting in this case.
> 
> How Firebird STORES the data is irrelevant in this case. What you want
> is that Firebird sends and recieves the data in the UTF-8 format. When
> you make a connection with a Firebird-server, you have to tell the
> server in which character-set you want to talk to it. Then it will do
> all the necessary conversions for you. 
> 
> Setting the character-set which should be used to communicate with the
> server can be done with a TIBConnection parameter or the
> TIBConnection.charset property.
> 
> (The characterset used to store the data is relevant, offcourse, since
> when it's the same characterset as is used to communicate, no conversion
> is necessary. But the point is clear, I think)
> 
> Joost.
> 
> 

Yep.Just create database with UTF-8 and connect using UTF-8, using 
Lazarus UTF-8. No problems then ,other then fpc RTL ANSI functions ;-)

Boguslaw




More information about the Lazarus mailing list