[Lazarus] Regex and Syntax Highlighting

Hans-Peter Diettrich DrDiettrich1 at aol.com
Thu May 27 02:33:17 CEST 2010


Graeme Geldenhuys wrote:

> THE SLOWNESS YOU GUYS ARE MENTIONING IS BASED ON AN CRAP IMPLEMENTATION.

Well, the goal has shifted away from only *syntax* highlighting.

> I don't know what editor you guys used to test syntax highlighting,
> but clearly it was a crap editor. jEdit being a Java program is damn
> fast (imagine that, a Java app being fast.) and extremely efficient
> with LARGE files. So regexp syntax highlighting, implemented
> correctly, does not slow down syntax highlighting!!

A single RegExp can be matched by a DFA, but recognizing multiple possible 
tokens at once requires an NFA, unless the combined automaton is converted 
to a DFA up front. Simulating the NFA increases the O() complexity of the 
algorithm.
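
To make the "multiple possible tokens" case concrete, here is a minimal 
sketch of one combined pattern with a named alternative per token type. 
The token set is a made-up Pascal-like subset, and the names TOKEN_SPEC, 
MASTER and tokenize are my own. Note that Python's re module is a 
backtracking matcher, so this only shows the structure of the problem, 
not the speed of a real DFA-based lexer.

```python
import re

# Hypothetical token set for a tiny Pascal-like language (illustration only).
# Order matters: the first alternative that matches at a position wins.
TOKEN_SPEC = [
    ("COMMENT", r"\{[^}]*\}"),
    ("STRING", r"'[^']*'"),
    ("NUMBER", r"\d+"),
    ("KEYWORD", r"\b(?:begin|end|if|then|else)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("SPACE", r"\s+"),
    ("OTHER", r"."),
]

# One alternation with a named group per token type.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.IGNORECASE,
)


def tokenize(text):
    """Return (token_type, lexeme) pairs for text, skipping whitespace."""
    tokens = []
    for match in MASTER.finditer(text):
        # lastgroup is the name of the alternative that matched.
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
    return tokens
```

Every additional token type adds one more alternative that has to be 
considered at each position, which is where the extra cost comes from 
when the alternatives are not merged into a single automaton.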

Multi-line comments, or even worse, nested comments, require a pre-scan of 
the file before the highlighter can be used.
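
A rough sketch of such a pre-scan, assuming (* ... *) comments that may 
nest: it records the comment depth in effect at the start of every line, 
so that a highlighter can later start on any line without rereading the 
text before it. The function name is hypothetical, and string literals 
and other comment styles are deliberately ignored to keep it short.

```python
def comment_depth_at_line_starts(lines, open_tok="(*", close_tok="*)"):
    """Return a list where entry i is the comment nesting depth
    in effect at the start of lines[i]."""
    depths = []
    depth = 0
    for line in lines:
        depths.append(depth)
        i = 0
        while i < len(line):
            if line.startswith(open_tok, i):
                depth += 1              # a comment opens (possibly nested)
                i += len(open_tok)
            elif depth > 0 and line.startswith(close_tok, i):
                depth -= 1              # the innermost open comment closes
                i += len(close_tok)
            else:
                i += 1                  # ordinary character
    return depths
```

Without nesting a single "inside a comment" flag per line would do. With 
nesting a counter is needed, and that is no longer something a plain 
regular expression can express.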

So it depends on the *concrete* syntax, how fast or slow the lexer 
automaton can be.

Next comes the storage of modifications. A mere viewer can work directly 
on the immutable file. But when the source can be modified at runtime, 
and undo tracking, foldable blocks, UTF-8 and tab expansion come into 
play, it may take longer to retrieve the text to show. The highlighter 
must also be fault-tolerant to cover temporarily invalid tokens, and the 
source may need a reparse on every single insert/delete.
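
One common way to limit that reparse is to cache the lexer state at the 
start of each line and, after an edit, rescan only until the state matches 
the cached one again. The sketch below assumes the state is just a nesting 
depth for (* ... *) comments. scan_line and LineStateCache are hypothetical 
names, not taken from any existing editor component.

```python
def scan_line(line, depth):
    """Return the comment nesting depth after line, given the depth before it."""
    i = 0
    while i < len(line):
        if line.startswith("(*", i):
            depth += 1
            i += 2
        elif depth > 0 and line.startswith("*)", i):
            depth -= 1
            i += 2
        else:
            i += 1
    return depth


class LineStateCache:
    """Keeps the lexer state (here: comment depth) at the start of every line."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.start = []
        depth = 0
        for line in self.lines:
            self.start.append(depth)
            depth = scan_line(line, depth)

    def replace_line(self, index, new_text):
        """Replace one line and return the indices of all lines
        whose highlighting has to be redone."""
        self.lines[index] = new_text
        dirty = [index]
        depth = scan_line(new_text, self.start[index])
        nxt = index + 1
        # Propagate only while the state at a line start actually changed.
        while nxt < len(self.lines) and self.start[nxt] != depth:
            self.start[nxt] = depth
            dirty.append(nxt)
            depth = scan_line(self.lines[nxt], depth)
            nxt += 1
        return dirty
```

An ordinary edit then touches a single line, while typing an unterminated 
"(*" can still invalidate everything up to the next "*)", or to the end of 
the file if there is none.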


So yes, a syntax highlighter *can* be amazingly fast, as I know from my 
own experiments, but this can vary dramatically with more complex 
requirements.

IMO it's a matter of preference whether one wants to construct an 
editor in the first place and add syntax highlighting to it, or whether 
one wants to implement a syntax highlighter for a file viewer. So it 
doesn't make sense to compare apples and oranges, or to suspect a 
crappy implementation, unless one knows all related requirements.

DoDi




