<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">On Wed, Feb 24, 2021 at 12:22 PM José Mejuto via lazarus <<a href="mailto:lazarus@lists.lazarus-ide.org">lazarus@lists.lazarus-ide.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">My code is not 100% Unicode compatible when using the <br>
"CaseInsensitive" mode, as it lowercases both the mask and the string <br>
to perform the test, which is wrong by definition, but I was unable to <br>
find a way to compare codepoints case-insensitively without pulling in big <br>
Unicode tables.<br>
<br>
I was thinking of importing the NTFS (the filesystem) case comparison <br>
tables which are 128 KB "only".<br></blockquote><div><br></div><div>That is not necessary.</div><div>LazUTF8 has functions like UTF8CompareText(), UTF8CompareTextP() and the latest UTF8CompareLatinTextFast().</div><div>UTF8CompareLatinTextFast supports full Unicode but is optimized for mostly Latin text.<br></div><div>We should add a PChar version UTF8CompareLatinTextFastP() and use it in your mask code.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> Comprehensive unit tests are a way to prevent breaking things.<br>
<br>
And they also define whether a compatibility break is a bug in the new code or in <br>
the old code. For example, my mask supports (there is a define to disable it) <br>
"[z-a]" converting it to "[a-z]" which is a compatibility break.</blockquote><div><br></div><div>Your code does not compile when RANGES_AUTOREVERSE is not defined.</div><div>cMask is not found.<br></div><div>The reverse logic can be enabled by default. It does not break anybody's masks as I understand it. Earlier it was an error, now it does something sensible.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Also there is the support (also can be disabled) for the mask "[?]" <br>
which is the counterpart for "*" but with one char position.<br></blockquote><div><br></div><div>Where did you get this "[?]" syntax? There must be reference documentation somewhere, but I have not seen it.</div><div>What is the difference between "?" and "[?]"?</div><div><br></div><div><br></div><div><div dir="ltr">On Wed, Feb 24, 2021 at 1:28 PM José Mejuto via lazarus <<a href="mailto:lazarus@lists.lazarus-ide.org">lazarus@lists.lazarus-ide.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">> Sometimes I wish we would migrate to using UnicodeString by default.<br>
> It would make life a bit easier.<br>
> (And yes I know you would have to deal with composed characters<br>
> (grapheme defined by more than 1 16-bit word)).<br>
<br>
That's a can of worms! UTF8 forces you to write "correct code" (or at least <br>
try to) for any character >127; with UnicodeString you get the false <br>
appearance that everything magically works until everything cracks when a <br>
string with surrogate pairs comes into play :-) and ALL your text handling <br>
must be rewritten, and most of it completely rewritten.<br></blockquote><div><br></div><div>Exactly. UnicodeString uses UTF-16, which is also a variable-length encoding. The same rules should be applied, but often they are not. There is plenty of sloppy UTF-16 code out there.</div><div>Writing proper UTF-8 code is not difficult once you wrap your mind around the concept. There is a learning curve, true. I also scratched my head for some time when studying it.</div><div><br></div><div>Juha</div><div><br></div><div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>
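<div>To make the variable-length point concrete, here is a minimal sketch in Python, chosen only because it is easy to run; nothing in it is LazUTF8 or FPC API. The helper names utf16_units and utf8_bytes are made up for this illustration. They count code units, which is what Length() reports for a UnicodeString and for a UTF-8 encoded string respectively.</div>

```python
def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text):
    """Number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


bmp_char = "\u00e9"         # e with acute accent, inside the Basic Multilingual Plane
astral_char = "\U0001F600"  # an emoji outside the BMP, stored as a surrogate pair

# One codepoint each, but neither encoding stores both in a single unit.
print(utf16_units(bmp_char), utf8_bytes(bmp_char))        # prints: 1 2
print(utf16_units(astral_char), utf8_bytes(astral_char))  # prints: 2 4
```

<div>The second line of output is the case that sloppy UTF-16 code tends to miss: one character, two 16-bit units, so indexing by position has the same pitfalls as indexing UTF-8 bytes.</div>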