[Lazarus] TStringField and Multibyte character sets

Tony Whyman tony.whyman at mccallumwhyman.com
Wed Apr 1 14:32:26 CEST 2015


I've just submitted a bug report:

http://bugs.freepascal.org/view.php?id=27766

plus the related http://bugs.freepascal.org/view.php?id=27768

27766 is intended to improve handling of multibyte character sets by 
TStringField, only I am not quite sure about whether I have fully 
understood the meaning of TField.Size and hoped to get some feedback on 
this.

The basic problem is that TStringField is only really set up for single 
byte character sets and TFieldDef just ignores the whole issue.

TField descendants have three properties to play with: DisplayWidth,
Size and DataSize. The documentation for DisplayWidth is pretty clear - 
it's the character width. Ditto DataSize, which is the storage size, 
while Size is defined as the "logical size", whatever that means.

In the current TStringField, DisplayWidth = Size and DataSize = Size +
1. For single byte character sets that's fine, with DataSize adding an 
extra byte for the null terminator. However, with multi-byte character 
sets, is the (logical) Field Size the number of characters or the number 
of bytes (less the null terminator) - or something else? My reading is 
that it is the number of bytes.
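
To illustrate why the two readings differ, here is a minimal sketch in
plain Free Pascal (no db unit involved). Utf8CharCount is a helper
written only for this example, not an FCL routine. It counts UTF-8
characters by skipping continuation bytes, so the byte length and the
character length of the same string can be compared:

```pascal
program Utf8Sizes;
{$mode objfpc}{$H+}

{ Hypothetical helper for this example only: counts UTF-8 code points
  by ignoring continuation bytes (bit pattern 10xxxxxx). }
function Utf8CharCount(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: AnsiString;
begin
  { "cafe" with an acute accent on the e, encoded as UTF-8:
    the last character occupies two bytes ($C3 $A9). }
  S := 'caf' + Chr($C3) + Chr($A9);
  WriteLn('bytes:      ', Length(S));         { 5 }
  WriteLn('characters: ', Utf8CharCount(S));  { 4 }
  { If Size counts bytes, DataSize = Size + 1 for the null terminator. }
  WriteLn('DataSize:   ', Length(S) + 1);     { 6 }
end.
```

If Size is a byte count, this value needs Size >= 5 even though it
displays as only 4 characters.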

This is important because TFieldDef currently only allows the database 
driver (in a TDataSet descendant) to set the Field Size. There is no
property for declaring a multibyte character set or the maximum
number of bytes per character. It is possible to work around this, but
only by some rather dirty coding.

My proposal is to enable a better approach by adding an extra property 
to TFieldDef - CharSetWidth - which is the max. number of bytes per 
character (e.g. 4 for UTF-8) and to pass this on to TStringField when it 
is created by TFieldDef.CreateField. But then what should TStringField
do with this information? Should it:

a) Set the default DisplayWidth to the Field Size div CharSetWidth, and
set DataSize to Field Size + 1, or

b) Set both the DisplayWidth and the Field Size to the TFieldDef Size
div CharSetWidth and set DataSize to the TFieldDef Size + 1, or

c) Something else.
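
To make (a) and (b) concrete, here is a rough sketch of the arithmetic
only. TFieldSizes, OptionA and OptionB are names invented for this
example and are not part of the FCL; CharSetWidth is the proposed
property, which does not exist yet. The example assumes a TFieldDef
Size of 40 bytes and a CharSetWidth of 4:

```pascal
program FieldSizeOptions;
{$mode objfpc}{$H+}

type
  { Hypothetical stand-in for the three TField properties discussed. }
  TFieldSizes = record
    Size: Integer;
    DisplayWidth: Integer;
    DataSize: Integer;
  end;

{ Option (a): Size stays a byte count, only DisplayWidth is in characters. }
function OptionA(DefSize, CharSetWidth: Integer): TFieldSizes;
begin
  Result.Size := DefSize;
  Result.DisplayWidth := DefSize div CharSetWidth;
  Result.DataSize := DefSize + 1;
end;

{ Option (b): both Size and DisplayWidth are character counts. }
function OptionB(DefSize, CharSetWidth: Integer): TFieldSizes;
begin
  Result.Size := DefSize div CharSetWidth;
  Result.DisplayWidth := DefSize div CharSetWidth;
  Result.DataSize := DefSize + 1;
end;

procedure Show(const Caption: string; const F: TFieldSizes);
begin
  WriteLn(Caption, ': Size=', F.Size,
    ' DisplayWidth=', F.DisplayWidth,
    ' DataSize=', F.DataSize);
end;

begin
  Show('a', OptionA(40, 4));  { a: Size=40 DisplayWidth=10 DataSize=41 }
  Show('b', OptionB(40, 4));  { b: Size=10 DisplayWidth=10 DataSize=41 }
end.
```

The two options agree on DisplayWidth and DataSize and differ only in
what Size reports, which is why the meaning of Size matters here.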

Really, the answer depends on what the "logical" size means.




