[Lazarus] unit Masks vs. unit FPMasks

José Mejuto joshyfun at gmail.com
Wed Feb 24 12:28:16 CET 2021


On 24/02/2021 at 11:58, Bart via lazarus wrote:

Hello,

>> In my code there is not 100% Unicode compatibility when using the
>> "CaseInsensitive" mode, as it uses a lowercase mask and a lowercase string
>> to perform the test, which is wrong by definition.
> 
> Currently Masks unit does the same.

Yes, but for example, in my case I cannot successfully match the mask "ä*"
against the string "Ä*" because "Ä" is not lowercased to "ä" (Windows 7).
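To illustrate why "lowercase both sides" is not the same as a case-insensitive comparison, here is a minimal Python sketch (not the Pascal code under discussion; the function names are made up for illustration). Python's str.lower() is Unicode-aware, so it does not reproduce the Windows 7 failure with "Ä", but it shows another case where lowercasing falls short and Unicode case folding is needed:

```python
def naive_equal(a: str, b: str) -> bool:
    # Compare after lowercasing both sides (the approach described above).
    return a.lower() == b.lower()


def folded_equal(a: str, b: str) -> bool:
    # Compare after Unicode case folding, which is designed for caseless matching.
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    for left, right in [("\u00c4", "\u00e4"), ("STRASSE", "stra\u00dfe")]:
        print(left, right, naive_equal(left, right), folded_equal(left, right))
```

Case folding still does not cover composed versus decomposed characters; that would need normalization as a separate step.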

> Sometimes I wish we would migrate to using UnicodeString by default.
> It would make life a bit easier.
> (And yes I know you would have to deal with composed characters
> (grapheme defined by more than 1 16-bit word)).

That's a can of worms! UTF-8 forces you to write "correct code" (or at
least try to) for any character >127. With UnicodeString you get the false
appearance that everything magically works, until everything cracks when a
string with surrogate pairs comes into play :-) and ALL your text handling
must be revised, and most of it completely rewritten.
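A small Python sketch of that point (the helper names are hypothetical): "ä" already takes two bytes in UTF-8 but only one 16-bit unit in UTF-16, so per-unit code looks fine in UTF-16 until a character outside the BMP arrives and needs a surrogate pair.

```python
def utf8_bytes(s: str) -> int:
    # Number of bytes the string occupies in UTF-8.
    return len(s.encode("utf-8"))


def utf16_units(s: str) -> int:
    # Number of 16-bit code units the string occupies in UTF-16.
    return len(s.encode("utf-16-le")) // 2


if __name__ == "__main__":
    # U+00E4 is inside the BMP; U+1F600 is outside it and needs a surrogate pair.
    for ch in ("a", "\u00e4", "\U0001F600"):
        print("U+%04X" % ord(ch), utf8_bytes(ch), utf16_units(ch))
```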

>>> There are no tests for MatchesWindowsMask() yet.
> I tested that extensively on my machine with all scenarios I could think of.
> But others most likely can think of scenarios I did not test.
> It was based on current behaviour of Windows NT platform (Win7 at the
> time to be precise).
>> Who defines which are right and which are wrong ?
> Well, I did ;-)
> (Nobody else bothered at the time, and nobody complained either.)

And most likely nobody will, as almost everything matches the behaviour a
user expects, like the typical "*.txt", but there are some unsupported cases
like:

Filename:='test.txt'
Mask:='test??.txt?'
Match must be true

This is the doc from my code about Windows matching. Quirks can be
enabled or disabled for compatibility:

----------------8<----------------------------8<---------------------
Windows masks work differently from regular masks: they have many quirks
and corner cases inherited from CP/M, then adapted to DOS (8.3) filenames,
and adapted again for long file names.

         Anyth?ng.abc    = "?" matches exactly 1 char
         Anyth*ng.abc    = "*" matches 0 or more chars

         ------- Quirks -------

         --eWindowsQuirk_AnyExtension
           Anything*.*     = ".*" is removed.

         --eWindowsQuirk_FilenameEnd
           Anything??.abc  = "?" matches 1 or 0 chars (except '.')
                          (Not the same as "Anything*.abc", but the same
                           as regex "Anything.{0,2}\.abc")
                           Internally converted to "Anything[??].abc"

         --eWindowsQuirk_Extension3More
           Anything.abc    = Matches "Anything.abc" but also
                            "Anything.abc*" (3 char extension)
           Anything.ab     = Matches "Anything.ab" and never
                            "anything.abcd"

         --eWindowsQuirk_EmptyIsAny
           ""              = Empty string matches anything "*"

         --eWindowsQuirk_AllByExtension (Not in use anymore)
           .abc            = Runs as "*.abc"

         --eWindowsQuirk_NoExtension
           Anything*.      = Matches "Anything*" without extension

----------------8<----------------------------8<---------------------
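As a rough illustration of the quirk list above, here is a simplified Python sketch that translates a mask to a regular expression. It is not the Pascal implementation; matches_windows_mask is a hypothetical helper. It covers only EmptyIsAny, AnyExtension, FilenameEnd and NoExtension, leaves out Extension3More and AllByExtension, and ignores case handling. Two assumptions of mine: AnyExtension is applied only when the mask ends in "*.*", and FilenameEnd is also applied to a run of "?" at the very end of the mask.

```python
import re


def matches_windows_mask(filename: str, mask: str) -> bool:
    """Simplified sketch of four of the quirks listed above (hypothetical helper)."""
    # eWindowsQuirk_EmptyIsAny: an empty mask behaves like "*".
    if mask == "":
        mask = "*"
    # eWindowsQuirk_AnyExtension: drop a trailing ".*" that follows a "*".
    if mask.endswith("*.*"):
        mask = mask[:-2]
    # eWindowsQuirk_NoExtension: a trailing "." accepts only names without a dot.
    if mask.endswith("."):
        if "." in filename:
            return False
        mask = mask[:-1]

    parts = []
    i, n = 0, len(mask)
    while i < n:
        ch = mask[i]
        if ch == "*":
            parts.append(".*")  # "*" matches 0 or more chars
            i += 1
        elif ch == "?":
            j = i
            while j < n and mask[j] == "?":
                j += 1
            run = j - i
            if j == n or mask[j] == ".":
                # eWindowsQuirk_FilenameEnd: each "?" matches 0 or 1 char, never "."
                parts.append("[^.]{0,%d}" % run)
            else:
                parts.append(".{%d}" % run)  # each "?" matches exactly 1 char
            i = j
        else:
            parts.append(re.escape(ch))  # literal character
            i += 1
    return re.fullmatch("".join(parts), filename, re.DOTALL) is not None


if __name__ == "__main__":
    print(matches_windows_mask("test.txt", "test??.txt?"))
    print(matches_windows_mask("test123.txt", "test??.txt?"))
```

Under these assumptions the 'test.txt' against 'test??.txt?' case from earlier in this message matches, because both runs of "?" are allowed to match zero characters.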
