[Lazarus] Arabic beta tester for SynEdit needed

Martin lazarus at mfriebe.de
Thu Dec 6 23:53:12 CET 2012


A while ago, I started adding support for mixed LTR/RTL text in SynEdit.

The actual display of RTL text now works (that is, if you have some 
Arabic chars in the text, they display RTL, and the caret moves 
accordingly; a caret between RTL and LTR text always counts as being 
at the LTR text).
UTF-8 LTR/RTL markers are not supported. This is the absolute basics only.

Unfortunately, with RTL came other Unicode features that so far no one 
had missed. At the very least, those are:
- combining codepoints
- ligatures
- maybe reordering of codepoints.
- other?
These tasks differ in extent, and I need to find out what is 
mandatory and what is optional, so I can then decide what fits into 
my schedule.

The current state is:
- combining: Only Arabic has been done (but that should be complete). So 
non-Arabic RTL text will not work.
- ligatures: see below
- reordering: not researched, hopefully optional.
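
To illustrate what "combining" means for the cell calculation, here is 
a minimal sketch in Python (not SynEdit's actual Pascal code; the name 
count_cells is made up for this example). It assumes a monospaced font 
in which every non-combining codepoint takes exactly one cell and a 
combining mark takes none. Ligatures and double-width chars are ignored.

```python
import unicodedata

def count_cells(text):
    """Count screen cells, assuming one cell per non-combining codepoint."""
    cells = 0
    for ch in text:
        # Mn = nonspacing mark, Me = enclosing mark. These are drawn on
        # the preceding base char, so they get no cell of their own.
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue
        cells += 1
    return cells

# BEH + FATHA (a combining vowel mark) + ALEF: 3 codepoints, 2 cells
sample = "\u0628\u064E\u0627"
print(len(sample), count_cells(sample))
```

If the editor counted 3 cells for this sample while the painted text 
covers only 2, the caret would end up off by one cell after the mark.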

"work"
means that the text is stable (for ligatures, only with the workaround) 
and does not expand/shrink when selecting text or moving the caret. 
It also means that the caret is at the correct position: a newly 
inserted char will appear where the caret was. This can be tested by 
hitting the "End" key and checking whether the caret is at the end of 
the visual text. If SynEdit thinks the text is shorter/longer than the 
actual painted display, then there is an issue.

ligatures:
The editor does not handle ligatures yet, so it calculates 2 screen 
cells where only one is needed. However, a stable "workaround" exists 
(it currently depends on the configuration).

On Windows, and Windows only (others will be done if this turns out to 
be any good): in Options / Editor / Display, set "Extra CHAR spacing" to 1.
This will slightly widen the script; ignore that, it's temporary.
It requires a proper monospaced font (e.g. DejaVu Sans Mono).

What it will do: it tells Windows that the ligature is expected to 
cover 2 display cells.
Display: Arabic is a script whose glyphs are connected by a continuous 
line. The ligature will be in one cell; the next cell will be empty, 
except for the connecting line.
Editing: The caret can be at either cell. Each cell stands for one of 
the 2 chars in the ligature, so the 2nd char can be edited when the 
caret is at the empty cell.
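
As a rough sketch of the cell layout described above (Python, for 
illustration only; this is not how SynEdit implements it, and the names 
layout_cells and ALEF_FORMS are invented here). It assumes that the 
only ligatures are LAM followed by one of the ALEF forms; which pairs 
really form a ligature depends on the font. Combining marks are not 
handled.

```python
LAM = "\u0644"
# ALEF, ALEF WITH MADDA ABOVE, ALEF WITH HAMZA ABOVE, ALEF WITH HAMZA BELOW
ALEF_FORMS = {"\u0627", "\u0622", "\u0623", "\u0625"}

def layout_cells(text):
    """Return one (char, role) pair per screen cell, in logical order.

    role is "ligature" for the cell that shows the ligature glyph,
    "empty" for the following cell (only the connecting line is drawn),
    and "normal" for everything else.
    """
    cells = []
    i = 0
    while i < len(text):
        if text[i] == LAM and i + 1 < len(text) and text[i + 1] in ALEF_FORMS:
            cells.append((text[i], "ligature"))
            cells.append((text[i + 1], "empty"))
            i += 2
        else:
            cells.append((text[i], "normal"))
            i += 1
    return cells

# LAM + ALEF + BEH: the ligature takes cell 0, cell 1 stays empty
for cell, (ch, role) in enumerate(layout_cells("\u0644\u0627\u0628")):
    print(cell, "U+%04X" % ord(ch), role)
```

With this layout the number of cells equals the number of chars, so a 
caret at the "empty" cell maps directly to the 2nd char of the ligature.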

------------------
I need feedback from people who actually speak (or at least read and 
write) Arabic. I need to know if the above situation is "usable".

If so, then:
- it can be fixed to work without the extra char spacing
- on gtk, carbon, qt (well at least I hope)
- combining can be added for other languages.

If not, well I don't know yet.





