[Lazarus] Proposal: Allow Umlaute and Accented Characters in Identifiers

Fri Jul 3 18:42:32 CEST 2020

Martin Frb via lazarus <lazarus at lists.lazarus-ide.org> wrote on Fri, 3
July 2020, 17:02:

> On 03/07/2020 16:37, Michael Van Canneyt via lazarus wrote:
> >
> >
> > On Fri, 3 Jul 2020, Martin Frb via lazarus wrote:
> >
> >> On 03/07/2020 16:21, Péter Gábor via lazarus wrote:
> >>> Hi!
> >>>
> >>> I hope that you did not misread my words/sentences.
> >>> Your example is perfect to illustrate the reason why I don't want
> >>> international characters in the language itself (and identifiers).
> >> Yes, that was my understanding.
> >>
> >> You gave reasons why it would be a bad idea. I added a reason, that I
> >> think would make the idea even worse.
> >> In other words, I supported the current a-z0-9_ set.
> >
> > I did a quick test in Delphi:
> >
> >
> > [dcc32 Error] doti.dpr(9): E2003 Undeclared identifier: 'ß'
> >
> > So indeed, case-insensitivity is lost. Even in German.
>
> And that even though the German ß is not locale dependent. It has exactly
> one uppercase version.
> Whereas "i" has 2. (But not within any one locale)
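To illustrate the case-mapping point, here is a rough sketch in Python (not Delphi or FPC, so it says nothing about what dcc32 actually does). It only shows what the locale-independent default Unicode mappings, as exposed by Python's str methods, do with "ß" and the dotted/dotless "i". The helper name same_identifier is made up for this example.

```python
def same_identifier(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names using Unicode full case folding."""
    return a.casefold() == b.casefold()


# Simple lowercasing leaves "ß" alone, so it does not match "SS" ...
naive_equal = "STRASSE".lower() == "straße".lower()
# ... while full case folding maps "ß" to "ss" and does match.
folded_equal = same_identifier("STRASSE", "straße")

# Python's str methods apply the default (locale-independent) mappings.
upper_sharp_s = "ß".upper()          # expands to two letters
upper_dotless_i = "\u0131".upper()   # dotless ı
lower_dotted_i = "\u0130".lower()    # dotted capital İ
```

With these default mappings, a comparison based on plain lower() or upper() and one based on casefold() can disagree, which is one way case-insensitivity can appear to be "lost".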
>
> I would guess that if you copy and paste, and some of your umlauts/chars
> are composed, some decomposed, that will likely not work either.
> And for composed chars with more than one combining codepoint, if the
> order of the combining codepoints does not matter, the problem will
> likely be the same.
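The composed/decomposed concern can be sketched like this, again in Python and only as an illustration. It assumes a compiler would have to normalize identifiers (here to NFC) before comparing them; whether Delphi does so is not known from this thread.

```python
import unicodedata

composed = "\u00e4"        # "ä" as a single code point
decomposed = "a\u0308"     # "a" followed by COMBINING DIAERESIS

# Byte-wise (or code point-wise) the two spellings differ.
raw_equal = composed == decomposed

# After normalization to NFC they compare equal.
nfc_equal = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))

# Two combining marks typed in a different order: dot below and diaeresis.
order_1 = "a\u0323\u0308"
order_2 = "a\u0308\u0323"
raw_order_equal = order_1 == order_2
nfc_order_equal = (unicodedata.normalize("NFC", order_1)
                   == unicodedata.normalize("NFC", order_2))
```

Normalization reorders combining marks of different combining classes into a canonical order, so both spellings end up identical; without that step they are simply different strings.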
>
> Then there are full width codepoints for some chars. (They could be
> argued to be ignored, but readability would be gone...)
> https://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms_(Unicode_block)
> So "A" and "Ａ" should also be the same.
> And full width digits should be allowed in numbers.
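For the full width forms, a small Python sketch of what compatibility normalization (NFKC) does. This is only one possible policy, not a statement about any Pascal compiler.

```python
import unicodedata

fullwidth_a = "\uff21"              # FULLWIDTH LATIN CAPITAL LETTER A
fullwidth_digits = "\uff11\uff12\uff13"   # fullwidth 1, 2, 3

# Canonical normalization (NFC) keeps the fullwidth letter distinct from "A".
nfc_a = unicodedata.normalize("NFC", fullwidth_a)

# Compatibility normalization (NFKC) folds it to the ASCII letter.
nfkc_a = unicodedata.normalize("NFKC", fullwidth_a)
nfkc_digits = unicodedata.normalize("NFKC", fullwidth_digits)

# Python's int() happens to accept any Unicode decimal digits directly.
parsed = int(fullwidth_digits)
```

So treating "A" and "Ａ" as the same identifier needs NFKC (or something equivalent), which is a stronger folding than what is needed for composed versus decomposed umlauts.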
>
> I do wonder whether Delphi accepts any of the Unicode spaces for separating
> identifiers.
> https://www.compart.com/en/unicode/category/Zs
> and https://en.wikipedia.org/wiki/Word_divider
> Especially the zero width space....
>
> And the soft hyphen? Will it be ignored, so the same identifier in
> different locations of the source can have it, or not have it?
> https://en.wikipedia.org/wiki/Soft_hyphen
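As a quick check of which general categories these characters fall into, here is a Python sketch. Note that the zero width space and the soft hyphen are both format characters (Cf), not space separators (Zs). The function strip_format_chars is a hypothetical policy for this example, not something any compiler is known to do.

```python
import unicodedata

CANDIDATES = {
    "NO-BREAK SPACE": "\u00a0",
    "IDEOGRAPHIC SPACE": "\u3000",
    "ZERO WIDTH SPACE": "\u200b",
    "SOFT HYPHEN": "\u00ad",
}

# Map each name to its Unicode general category.
categories = {name: unicodedata.category(ch) for name, ch in CANDIDATES.items()}


def strip_format_chars(name: str) -> str:
    """Hypothetical policy: drop all format (Cf) characters before comparing."""
    return "".join(ch for ch in name if unicodedata.category(ch) != "Cf")


with_soft_hyphen = "foo\u00adbar"
cleaned = strip_format_chars(with_soft_hyphen)
```

Under such a policy "foo" + soft hyphen + "bar" and "foobar" would name the same identifier; without it they are two different, visually identical names.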

It could very well be that Delphi (and other languages) follow Unicode
Standard Annex #31, which covers Unicode identifiers in programming
languages and also deals with case-insensitive identifiers
(https://unicode.org/reports/tr31/).
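As an example of a language that bases its identifier syntax on that annex: Python's str.isidentifier() uses the XID_Start/XID_Continue properties from UAX #31 (see PEP 3131). The sketch below uses it to test a few candidate names. The identifier_key function is only an approximation of a case-insensitive comparison key in the spirit of the annex, written for this example; it is not the exact NFKC_Casefold operation and not what Delphi does.

```python
import unicodedata


def identifier_key(name: str) -> str:
    """Approximate case-insensitive key: NFKC, then case fold, then NFKC again."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


# Which candidates count as identifiers under the UAX #31 based rules?
umlaut_ok = "Größe".isidentifier()
leading_digit_ok = "1abc".isidentifier()
zero_width_space_ok = "a\u200bb".isidentifier()
soft_hyphen_ok = "a\u00adb".isidentifier()

# Fullwidth "Straße" versus ASCII upper case "STRASSE".
same_key = identifier_key("\uff33\uff54\uff52\uff41\u00df\uff45") == identifier_key("STRASSE")
```

With rules like these, the zero width space and soft hyphen questions do not arise inside identifiers at all, because those characters are not allowed there in the first place.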

Regards,
Sven