<div dir="auto"><div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Martin Frb via lazarus <<a href="mailto:lazarus@lists.lazarus-ide.org">lazarus@lists.lazarus-ide.org</a>> wrote on Fri, 3 July 2020, 17:02:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 03/07/2020 16:37, Michael Van Canneyt via lazarus wrote:<br>

><br>

><br>

> On Fri, 3 Jul 2020, Martin Frb via lazarus wrote:<br>

><br>

>> On 03/07/2020 16:21, Péter Gábor via lazarus wrote:<br>

>>> Hi!<br>

>>><br>

>>> I hope that you did not misread my words/sentences.<br>

>>> Your example is perfect to illustrate the reason why I don't want<br>

>>> international characters in the language itself (and identifiers).<br>

>> Yes, that was my understanding.<br>

>><br>

>> You gave reasons why it would be a bad idea. I added a reason that I <br>

>> think would make the idea even worse.<br>

>> In other words, I supported the current a-z, 0-9, _ set.<br>

><br>

> I did a quick test in Delphi:<br>

><br>

><br>

> [dcc32 Error] doti.dpr(9): E2003 Undeclared identifier: 'ß'<br>

><br>

> So indeed, case-insensitivity is lost. Even in German.<br>

<br>

And that even though the German ß is not locale dependent. It has exactly <br>

one uppercase version.<br>

Whereas "i" has two (though not within any one locale).<br>
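
To make the case-mapping point concrete, here is a small sketch in Python (used only for illustration; it says nothing about how Delphi or FPC map case). It relies on Python's built-in, locale-independent string methods. Note that the default Unicode mapping uppercases ß to "SS", while the capital sharp s (U+1E9E) lowercases back to ß:<br>

```python
# Locale-independent Unicode case mappings as implemented by Python's str methods.
sharp_s_upper = "\u00df".upper()            # ß -> "SS" (full uppercase mapping)
sharp_s_fold = "\u00df".casefold()          # ß -> "ss" (case folding)
capital_sharp_s_lower = "\u1e9e".lower()    # capital sharp s -> ß

# The two extra "i" letters used in Turkish and Azerbaijani:
dotted_capital_i_lower = "\u0130".lower()   # "i" followed by U+0307 (combining dot above)
dotless_i_upper = "\u0131".upper()          # dotless i -> plain "I"

print(sharp_s_upper, sharp_s_fold, capital_sharp_s_lower)
print([hex(ord(c)) for c in dotted_capital_i_lower], dotless_i_upper)
```

A compiler that wants case-insensitive identifiers has to pick one of these mappings and apply it consistently, which is where the trouble starts.<br>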

<br>

I would guess that if you copy and paste, and some of your umlauts/chars <br>

are composed, some decomposed, that will likely not work either.<br>

And for composed chars with more than one combining codepoint, if the <br>

order of the combining codepoints does not matter, the problem will <br>

likely be the same.<br>
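
The composed/decomposed problem can be sketched in Python with the standard unicodedata module (an illustration only; whether a given compiler normalizes identifiers is an open question here). The helper name nfc is made up for this example:<br>

```python
import unicodedata

def nfc(text: str) -> str:
    """Return the canonical composed form (NFC) of text."""
    return unicodedata.normalize("NFC", text)

composed = "\u00e4"       # "ä" as a single codepoint
decomposed = "a\u0308"    # "a" followed by a combining diaeresis

raw_equal = composed == decomposed               # codepoint-wise comparison
nfc_equal = nfc(composed) == nfc(decomposed)     # comparison after normalization

# Two combining marks (dot below, diaeresis) typed in different orders.
order_a = "a\u0323\u0308"
order_b = "a\u0308\u0323"
order_raw_equal = order_a == order_b
order_nfc_equal = nfc(order_a) == nfc(order_b)

print(raw_equal, nfc_equal, order_raw_equal, order_nfc_equal)
```

Without a normalization step, both pairs look identical on screen but compare as different identifiers.<br>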

<br>

Then there are fullwidth codepoints for some chars. (One could argue <br>

they should be ignored, but readability would be gone...)<br>

<a href="https://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms_(Unicode_block)" rel="noreferrer noreferrer" target="_blank">https://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms_(Unicode_block)</a><br>

So "A" and "Ａ" should also be the same.<br>

And full width digits should be allowed in numbers.<br>
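
As a rough Python illustration (not a statement about Delphi), compatibility normalization (NFKC) maps the fullwidth letter to its ASCII counterpart, and Python's int() happens to accept fullwidth decimal digits:<br>

```python
import unicodedata

fullwidth_a = "\uff21"                      # FULLWIDTH LATIN CAPITAL LETTER A
same_raw = fullwidth_a == "A"               # different codepoints
folded = unicodedata.normalize("NFKC", fullwidth_a)  # compatibility mapping

fullwidth_digits = "\uff11\uff12\uff13"     # fullwidth "123"
value = int(fullwidth_digits)               # int() accepts Unicode decimal digits

print(same_raw, folded, value)
```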

<br>

I do wonder whether Delphi accepts any of the Unicode space characters for separating <br>

identifiers.<br>

<a href="https://www.compart.com/en/unicode/category/Zs" rel="noreferrer noreferrer" target="_blank">https://www.compart.com/en/unicode/category/Zs</a><br>

and <a href="https://en.wikipedia.org/wiki/Word_divider" rel="noreferrer noreferrer" target="_blank">https://en.wikipedia.org/wiki/Word_divider</a><br>

Especially the zero width space....<br>
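
For reference, a short Python sketch that looks up the general category of a few of these characters. The selection of characters is arbitrary; the point is that the zero width space is not in category Zs at all but in Cf (format), so a scanner that keys on Zs would not treat it as a separator:<br>

```python
import unicodedata

# A hand-picked sample of space-like characters (selection is illustrative).
candidates = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00a0",
    "EM SPACE": "\u2003",
    "IDEOGRAPHIC SPACE": "\u3000",
    "ZERO WIDTH SPACE": "\u200b",
}

# General category per character, and whether Python's str.isspace() accepts it.
categories = {name: unicodedata.category(ch) for name, ch in candidates.items()}
is_space = {name: ch.isspace() for name, ch in candidates.items()}

for name in candidates:
    print(f"{name}: category={categories[name]} isspace={is_space[name]}")
```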

<br>

And the soft hyphen? Will it be ignored, so the same identifier in <br>

different locations of the source can have it, or not have it?<br>

<a href="https://en.wikipedia.org/wiki/Soft_hyphen" rel="noreferrer noreferrer" target="_blank">https://en.wikipedia.org/wiki/Soft_hyphen</a></blockquote></div></div><div dir="auto"><br></div><div dir="auto">It could very well be that Delphi (and other languages) follow Unicode Standard Annex #31, which covers Unicode identifiers in programming languages and also deals with case-insensitive identifiers ( <a href="https://unicode.org/reports/tr31/">https://unicode.org/reports/tr31/</a> ). </div><div dir="auto"><br></div><div dir="auto">Regards, </div><div dir="auto">Sven</div></div>
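
A minimal Python sketch of what case-insensitive identifier matching could look like, assuming a comparison key built from NFKC normalization plus case folding. The helper identifier_key is hypothetical and only approximates the kind of folding the annex describes; it is not taken from Delphi, FPC, or the annex text itself:

```python
import unicodedata

def identifier_key(name: str) -> str:
    """Hypothetical comparison key: NFKC-normalize, case-fold, normalize again."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)

# ß and "SS" fold to the same key.
sharp_s_match = identifier_key("Stra\u00dfe") == identifier_key("STRASSE")

# A fullwidth "A" folds to the same key as ASCII "a".
fullwidth_match = identifier_key("\uff21bc") == identifier_key("abc")

# This simple key does NOT drop the soft hyphen (U+00AD).
soft_hyphen_match = identifier_key("foo\u00adbar") == identifier_key("foobar")

print(sharp_s_match, fullwidth_match, soft_hyphen_match)
```

With this key the ß and fullwidth cases from the thread compare equal, while the soft hyphen still makes two identifiers differ unless it is stripped in a separate step.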