[Lazarus] Code Structure / SourceEdit and SyneEdit [Re: Mouse Link in SynEdit (only link-able items)]

Tue Dec 16 07:21:49 CET 2008

Martin Friebe wrote:

> Then how do you handle double width chars? This is one of the problems I 
> still have to address.
> Even in a proportional font, some chars (Chinese, and other) have twice 
> the width of a normal char. They will take 2 positions in the grid.
> Actually it is the same issue as tabs.

Are these characters inside the Unicode BMP?

Actually I did a translation into WideChar, so that only characters from 
the Unicode BMP can be processed. In this model I'd insert a dummy 
character into the output string, that is not displayed but compensates 
for the width of the preceding double-width character.
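
A minimal sketch of that dummy-character idea, in Python rather than 
the Pascal of the real code. The FILLER marker and the to_cells name are 
made up for this illustration, and the width test relies on Python's 
unicodedata module, not on anything in LazEdit:

```python
import unicodedata

FILLER = "\0"  # hypothetical marker for the undisplayed dummy cell


def to_cells(text):
    """Expand text into a list with exactly one entry per grid cell."""
    cells = []
    for ch in text:
        if ord(ch) > 0xFFFF:
            # this model only handles characters from the Unicode BMP
            raise ValueError("character outside the BMP: %r" % ch)
        cells.append(ch)
        # 'W' (wide) and 'F' (fullwidth) characters occupy two cells,
        # so a dummy entry keeps the following columns aligned
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            cells.append(FILLER)
    return cells
```

A painter would skip FILLER entries, while column arithmetic can treat 
the list index as the grid column.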

And what about RTL text? IMO there are limits to the task of a 
source code editor, beyond which it's easier to use or port some 
existing text processing component instead of reinventing the wheel.

Full Unicode requires assistance from some sophisticated library that 
can deal with all the oddities of character sequences (ligatures...), 
and that should be provided and maintained by the platform. Windows has 
such a library, Uniscribe, see
<http://www.catch22.net/tuts/neatpad/11>
This editor tutorial convinced me to either stay with truly monospaced 
characters, or let an appropriate library do all the drawing. Nothing 
half-baked in between. The same goes for the *input* of Chinese or other 
exceptional character sets (IMEs...).

> Column Blocks, as well as horizontal caret movement have to deal with 
> this anyway, since you may allow to be in the middle of a tab. But you 
> can not permit to have the caret in the middle of a chinese char.

No problem, my tabs have a special display encoding, that forces the 
caret to move to the next ordinary character cell.
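
Roughly like this, sketched in Python. The TAB_FILL marker and both 
function names are invented for the example; the actual encoding in my 
code differs in detail:

```python
TAB_FILL = "\x01"  # hypothetical marker for cells covered by a tab


def expand_tabs(text, tab_width=8):
    """Return one entry per grid cell; a tab spans up to tab_width cells."""
    cells = []
    for ch in text:
        if ch == "\t":
            # cells needed to reach the next tab stop
            pad = tab_width - len(cells) % tab_width
            cells.append("\t")                   # first cell holds the tab
            cells.extend(TAB_FILL * (pad - 1))   # the rest are fill cells
        else:
            cells.append(ch)
    return cells


def snap_caret(cells, col):
    """Move the caret right until it leaves the inside of a tab."""
    while col < len(cells) and cells[col] == TAB_FILL:
        col += 1
    return col
```

With a tab width of 4, "a<TAB>b" yields five cells, and a caret placed 
on column 2 or 3 ends up on column 4, the cell of the "b".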

Is it really still a column block, when a double-width character hangs 
out on its right boundary?

> The question still is how do your painters do all that. IMHO the mapping 
> into a grid requires a lot of info (folded/ word wrapped/ tabs,...) as 
> well as highlighting info. The painter as I see it collects all this 
> info, but the info is provided by other objects.

Right, I left the implementation of the highlighters and folding to 
dedicated objects. The classes have to implement only the very slim 
interface of the base class, everything else is open-ended.

BTW, I stored "characteristic" info in fixed size records, which can 
easily be saved and exchanged together with the source file or the 
global settings. No encapsulation, but easy to use, and little chance 
for coding errors.

> The View port for example does not do the painting, it is a helper class 
> to map the right code into the grid. It is used by the painters, but 
> also used outside.
> It allows (without accessing the painter) to check if a char or the 
> caret is in the visible area. ( There still is the question if it will 
> be the painter or the viewport who defines the size of each grid cell 
> (basically the font size))

In my model nobody has to know about the painting; all required 
information resides in the line buffer and the viewport outline. Apart 
from the basic client and gutter size (in pixels) and the font size, the 
viewport outline contains:
- the overall grid dimension, corresponding to the line/row count of the 
document after block folding.
- the visible area, described by a scrollable offset within the grid, or 
(0,0) for painting in client address space.
- the current extent of the viewport (client area of the control), in 
both fully and partially visible characters.

The separation into fully (page size) and partially visible (viewport 
size) characters makes it possible to e.g. scroll by (fully visible) 
pages in both directions, while painting and display of the caret stop 
only at the viewport margins. The latter (caret display) is not properly 
implemented in the current LazEdit!
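
As an illustration only, such an outline could look like the following 
Python sketch. All field and method names are assumptions made for this 
example, not the names used in my records:

```python
from dataclasses import dataclass


@dataclass
class ViewportOutline:
    client_w: int    # client area of the control, in pixels
    client_h: int
    gutter_w: int    # gutter width, in pixels
    char_w: int      # cell size, derived from the font
    char_h: int
    grid_cols: int   # overall grid dimension after block folding
    grid_rows: int
    left: int = 0    # scroll offset within the grid
    top: int = 0

    @property
    def page_size(self):
        """(columns, rows) that are fully visible."""
        return ((self.client_w - self.gutter_w) // self.char_w,
                self.client_h // self.char_h)

    @property
    def viewport_size(self):
        """(columns, rows) including partially visible cells."""
        text_w = self.client_w - self.gutter_w
        # -(-a // b) is integer division rounded up
        return (-(-text_w // self.char_w), -(-self.client_h // self.char_h))

    def page_down(self):
        """Scroll by one page of fully visible rows, clamped to the grid."""
        rows = self.page_size[1]
        self.top = min(self.top + rows, max(0, self.grid_rows - rows))
```

Scrolling uses page_size, while a painter or a caret visibility check 
would use viewport_size.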

>> ACK. A MVC (model-view-controller) approach might be better. The model 
>> holds the source files, the view manages painting and user interface 
>> (mouse and keyboard), and the controller updates the document upon input 
>> or other commands, and synchronizes the related view(s) afterwards.
>>   
> The view IMHO is more than one class; these classes gradually apply the 
> mapping from a Source-Holder (TStringList) to a char-grid. The Painter 
> then transfers each char from the grid to the canvas.

All that has to be encapsulated in every single view(er). When the code 
explorer is a view, its internals have almost nothing in common with 
the text viewer. A gutter also is a kind of viewer, coupled with the 
text viewer only by a common TopLine, but independent otherwise.

Of course the various viewers are related to a distinct document and 
helper objects, from which they obtain all the information to be 
displayed, but the kind of required information depends on their 
individual tasks.

The document base class has (virtual) functions for the conversion 
between stored (file based) and visible (possibly folded) coordinates, 
so that the details of folding are encapsulated. When the gutter 
painter/viewer requires information about blocks, it also can obtain 
that information from the corresponding methods in the document base class.
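
A sketch of what I mean, written in Python for brevity. The class and 
method names are hypothetical, and folds are simplified to 
non-overlapping ranges of hidden file lines:

```python
class Document:
    """Base class: nothing is folded, file and visible lines coincide."""

    def __init__(self, lines):
        self.lines = lines

    def file_to_visible(self, file_line):
        return file_line

    def visible_to_file(self, row):
        return row

    def visible_count(self):
        return len(self.lines)


class FoldingDocument(Document):
    """Derived class: hides the folding details behind the same methods."""

    def __init__(self, lines):
        super().__init__(lines)
        self.folds = []  # sorted (first_hidden, last_hidden) file lines

    def fold(self, first_hidden, last_hidden):
        self.folds.append((first_hidden, last_hidden))
        self.folds.sort()

    def file_to_visible(self, file_line):
        row = file_line
        for first, last in self.folds:
            if file_line > last:
                row -= last - first + 1   # fold lies entirely above
            elif file_line >= first:
                return None               # the line itself is hidden
        return row

    def visible_to_file(self, row):
        file_line = row
        for first, last in self.folds:
            if file_line >= first:
                file_line += last - first + 1  # skip the hidden lines
        return file_line

    def visible_count(self):
        return len(self.lines) - sum(l - f + 1 for f, l in self.folds)
```

A viewer or gutter only ever calls the base class methods and never 
sees the list of folds.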

> That is what I am currently trying to do
> 
> the painter looks at a "grid-provider",  a grid provider may read either 
> the source or another grid-provider as input. I currently call those 
> grid-provider "View".

Okay - with the subtle difference that in my model the painter is 
provided by the grid-provider with all required information, eliminating 
the need for any callbacks to other objects. The extent of information 
required by a painter can be determined easily, and only that 
information has to be passed to the painter.

Did you realize that multiple views of the same source file can have 
different TopLines, viewport sizes, word-wrap settings, and much more? I 
really cannot see how one grid-provider can fit these different needs of 
multiple views. A meaningful separation IMO would be:
- file provider: access in file coordinates (by characters or lines).
- block manager: access to visible (expanded) blocks, in display 
lines/columns.
- view: mapping of visible lines into display (row/col) coordinates.
- painter: mapping of viewport into pixel coordinates.

The latter (pixel mapping) occurs in two places: in the grid itself, 
from mouse coordinates into grid coordinates, and in the painter, from 
grid coordinates into canvas coordinates (different directions). The 
required information is stored in the viewport outline.
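
The two directions can be sketched as follows (Python, with made-up 
function names; the parameters stand for the values kept in the viewport 
outline):

```python
def mouse_to_grid(x, y, gutter_w, char_w, char_h, left, top):
    """Pixel position in the control -> (col, row) in the grid.

    Returns None when the position lies in the gutter.
    """
    if x < gutter_w:
        return None
    return (left + (x - gutter_w) // char_w, top + y // char_h)


def grid_to_canvas(col, row, gutter_w, char_w, char_h, left, top):
    """(col, row) in the grid -> pixel position of the cell's top-left."""
    return (gutter_w + (col - left) * char_w, (row - top) * char_h)
```

Both functions need nothing but the outline values, so neither the grid 
nor the painter has to call back into any other object.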

> One thing must change, currently the PaintLines code, combines the 
> highlight info with the grid-view result. But that means mapping the 
> highlight info.
> The highlight info must be applied to the unmodified source, and then 
> share the way through the grid-providers
> (That's actually something I realized from this discussion => good)

That's why I keep both the characters and their text attributes in the 
line buffer that is passed to the painter. I started with separate 
regions, describing the begin and length of tokens, selection etc., but 
it turned out that overlapping regions are handled more easily by a 
direct mapping of the text attributes to every single character. The 
text attributes then can be used to e.g. determine whether the mouse is 
over a (character in a) hyperlink.
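
A small Python sketch of the per-character approach. The Attr flags and 
the helper names are assumptions for this example, not my actual record 
layout:

```python
from enum import IntFlag


class Attr(IntFlag):
    NONE = 0
    KEYWORD = 1
    SELECTED = 2
    LINK = 4


def make_line(text):
    """Line buffer entry: one [character, attributes] pair per cell."""
    return [[ch, Attr.NONE] for ch in text]


def mark(line, start, length, attr):
    """Add an attribute to a run of cells; overlaps simply combine."""
    for cell in line[start:start + length]:
        cell[1] |= attr


def is_link_at(line, col):
    """True when the cell under the given column belongs to a hyperlink."""
    return 0 <= col < len(line) and bool(line[col][1] & Attr.LINK)
```

Marking a selection across a hyperlink needs no region splitting; the 
affected cells just carry both flags.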

> So If  I display a text that has no tabs, no double width chars, no 
> folds, no ...., then all I need is:
> 
> -source-buffer
> -highlight info (does not re-organize the layout)
> -viewport-grid
> -painter
> 
> The view port grid, selects the correct lines, and within each line the 
> correct substring. So if the text is horizontally scrolled it cuts the 
> beginning of each line, and in any case, it cuts any line that is too long.
> There are 2 objects:
> - The TViewPort => which defines the corner points / the rectangle
> - The TViewPortTextView => which reveals the text in the ViewPort. E.g. 
> returns 20 lines, with 80 chars in each line (like a grid)

I see no need for such (separate) objects. The coordinate 
transformations are determined by the information in the grid outline. 
The outline is updated when the viewport size changes, the font size 
changes, or when the viewport is scrolled - i.e. by user actions. It 
also is updated when the current file changes, either to a different 
file, or by insertion/removal of text, or by block folding. When the 
information has been updated, the display is refreshed.

Line wrapping can occur only in a fixed width viewport with no 
horizontal scrolling capabilities. In this case the document lines can 
be broken into corresponding display lines, which are stored in the line 
buffer cache as distinct lines, possibly containing continuation 
markers (characters). The mapping between physical (display) and logical 
(document based) lines can be stored in the line buffer records.
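
For illustration, a Python sketch of such wrapping. The DisplayLine 
record, the CONT marker and the wrap_lines name are all hypothetical, 
and the wrapping is by plain character count, without regard to word 
boundaries:

```python
from dataclasses import dataclass

CONT = "\\"  # hypothetical continuation marker in the last cell of a row


@dataclass
class DisplayLine:
    text: str       # what is shown in this physical (display) line
    doc_line: int   # logical (document) line this row belongs to
    doc_col: int    # column in the document line where this row starts


def wrap_lines(lines, width):
    """Break document lines into display lines of at most width cells."""
    if width < 2:
        raise ValueError("width must leave room for the marker")
    rows = []
    for n, line in enumerate(lines):
        col = 0
        # the last cell of a wrapped row is reserved for the marker
        while len(line) - col > width:
            rows.append(DisplayLine(line[col:col + width - 1] + CONT, n, col))
            col += width - 1
        rows.append(DisplayLine(line[col:], n, col))
    return rows
```

The doc_line and doc_col fields are the mapping back from a display 
line to its position in the document.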

> Again I do not know the exact organization of your grid.

Attached the source code and documentation, if I don't forget...
[Too big for attachment, sorry]

> But for me 
> (as an example) the tab-view/expander is not a subclass of the painter 
> (or grid). The tab-view/expander class is a class of its own 
> (inheriting from an abstract TextView/GridMapper).

Why should tab expansion require a separate class, extending or 
descending from any other class? The tab settings are global (IDE wide), 
and can be reflected in a commonly used data structure or tab-expander 
singleton.

> All the individual view/grid/mapping classes are organized in a stack. 
> You can at any time add/exchange members of the stack to achieve new 
> functionality.

Sounds good, but I doubt that this is feasible. The modules are so 
tightly coupled that an implementation in distinct units will be almost 
impossible. This way adding new functionality will require editing the 
common unit, so that it doesn't matter in which class (common or 
separate) the functionality is implemented.

One such case is the docking manager, where I still don't see a chance 
to implement a different manager separately from Controls.pp. The 
anchor docking sample in fact lacks the drag-dock functionality, because 
the implementation would require access to and modification of the 
existing code base, hidden in the implementation section of Controls.pp. 
An extraction of the hidden classes leads to circular unit references 
all over, protected methods are inaccessible from other classes etc.

If you want to make LazEdit that modular and extensible, please supply a 
unit structure that really allows for such extensions. And don't forget 
proper object management, when a reference is changed to a different 
object - the docking manager implementation will result in memory leaks 
or other quirks, when the automatically created manager is replaced. 
When different viewers (or other objects) can share other objects, 
interfaces instead of classes may be a better solution for the lifetime 
management of the exchangeable objects.

> See above. You always speak of your Grid in singular, as one class. For 
> me this is a list of classes (the stack), plus the helper classes 
> (Highlighter and Markup)

My design is bottom up and open-ended. The base class implements a 
default behaviour that can be modified in a derived class, as can be 
seen in the TTextViewer class. The base classes and the base unit(!) 
never have to be touched when the functionality is extended.

A user of a CharGrid class, e.g. the Lazarus IDE, doesn't have to care 
about any related classes, it only uses the interface provided in 
the single (maybe derived) component class. Then it's also easy to hunt 
bugs introduced by the implementation of extensions. Either the bug 
resides in the base class, then it can be fixed there without having to 
wade through countless extensions, or it resides in the extension 
itself and has to be fixed there. The more helper classes are put into 
the base component, the harder the maintenance and extension of such a 
class.

> Of course the Stack I am speaking of will, depending on the situation, 
> present itself through a single interface.

I do not really understand why you need a stack. A pool or list of 
exchangeable object references looks more appropriate to me.

>> TopLine and LeftChar are not of any interest outside the view. A 
>> ScrollIntoView method will be sufficient for the outer world, with 
>> document based coordinates, perhaps with an anchor (alTop, alBottom, 
>> alCenter).
>>   
> The View here being a SynEdit drawing a (possibly shared) Textbuffer? 
> True TopLine should not be needed outside, but it is needed for Caret 
> Control.

A shared text buffer with a shared caret or scroll position makes no 
sense to me. It is debatable whether bookmarks or block folding should 
be the same in multiple views of some file, with regards to the amount 
and management of such block trees, but the user must be allowed to move 
to different places in every view, select different parts of the text, 
have different (hyperlink) history lists, insert/overwrite modes etc.

Thus caret control has to be private to every view. Please try to 
separate all your intended helper objects, with regards to their later 
use, as being bound to a single document, a single view, or whatever 
else. Also keep in mind what has to be saved and restored when the user 
tabs through the file list of a notebook.

> Therefore I differ between the ViewPort (defining the rectangle) and The 
> ViewportTextView  using the rectangle to provide the grid of chars which 
> is to be displayed.

We agree to disagree.

DoDi