[Lazarus] Using FPC parser/tokenizer for code formatting

Adem listmember at letterboxes.org
Mon May 31 11:29:47 CEST 2010


On 2010-05-31 05:36, Doug Chamberlin wrote:

Doug,

 > I have great interest in a parser-based code formatting tool.
 > However, I have doubt that using the compiler's parser is the best
 > solution.

It may not be the best solution, basically because it was written solely 
for the compiler's needs. IOW, it is a very customized, special-purpose one.

Yet, as far as parsers go, it is the best solution, because it is the 
original source of information. Any new additions as well as 
corrections go in there --if it doesn't work, nothing does. IOW, it is 
also the main specification.

At the moment, as I gather, either because the need has never really 
arisen or because of the workload, that code (which is also the best 
specification) has not been modularized enough to be usable for other 
purposes. [By 'other purposes' I don't mean code formatting alone; there 
are others, and more will doubtless come up in the future.]

 > So, I'd like to discuss some of the issues surrounding this decision.
 >
 > I'd like to get your ideas on the following problem. There are
 > basically two philosophies of code reformatting. These I'll call
 > "reorganization" and "rebuild".
 >
 > Using the reorganization approach, the source file is largely
 > retained intact with some lines modified to enforce some formatting
 > rules. The idea is to keep everything as it is except those things that
 > have specifically been requested to be examined and modified. Minimal
 > change is called for. Parts of the source file that the parser does not
 > understand are left intact.
 >
 > Using the rebuild approach, the source file is examined by the
 > reformatter, digested for content, and completely rewritten by the
 > reformatter according to its rules. Everything is re-written according
 > to the formatting rules so nothing can be included that the reformatter
 > does not understand. For example, any manually formatted details are
 > lost. The reformatter must know about and understand how to retain all
 > valuable content. The developer cannot add anything important that the
 > reformatter does not know how to handle.
 >
 > Which approach do you think should be used?
 >
 > I prefer the reorganization approach because I view source code as
 > carefully sculpted art. Others treat source code as essentially a
 > machine-generated intermediate representation of some abstract concepts
 > that can be transformed over and over again as needed. These people
 > would be more comfortable with the rebuild approach.
 >
 > I'm just curious how you look at the problem because it has a big
 > impact on what parser is used. Of course, I'm also interested in
 > everyone else's viewpoint as well.

What follows is simply my own personal opinion. IOW, I am not 
out to press it as a precondition.

As far as I can see, JCF itself does (or, rather, can do) some of what 
the latter approach you described involves. IOW, IIRC, it can add code 
(if the user opts for it) such as 'begin-end' pairs for 'if' blocks.

if something then something else somethingelse;

becomes

if something then begin
   something;
end else begin
   somethingelse;
end;

It can also reorganize some comments in such constructs.

While I do remember talking about 'un-with'ing (or, 'de-with'ing), I am 
not sure if Anthony ever got around to implementing it in JCF. This 
(admittedly an extreme example, one which I haven't even checked to see 
if it compiles) is the kind of stuff I am talking about:

var
   Box1: TRect;
   Box2: TRect;
begin
   With Box1, Box2 do begin
      Left := Left;
      Top := Top;
      Right := MyRight;
      Bottom := Bottom;
   end;
end;

I'd like something to take care of this sort of stuff; but I am not sure 
it has to be the code formatter --a refactoring tool sounds like a 
better fit for the job.
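
To make the ambiguity concrete, here is a rough sketch of what a 
'de-with'ed version of the example might look like. It rests on two 
assumptions that are not stated above: that TRect comes from FPC's Types 
unit, and that MyRight is an Integer variable declared outside the 
records. Since 'with Box1, Box2 do' behaves like 'with Box1 do with Box2 
do', every field name the two records share resolves to Box2, and Box1 
is never touched.

```pascal
program DeWithSketch;

{$mode objfpc}

uses
  Types; // TRect (assumed to be the type meant in the example)

var
  Box1, Box2: TRect;
  MyRight: Integer; // hypothetical outer variable; not a field of TRect

begin
  Box1.Left := 1;  Box1.Top := 2;  Box1.Right := 3;  Box1.Bottom := 4;
  Box2.Left := 10; Box2.Top := 20; Box2.Right := 30; Box2.Bottom := 40;
  MyRight := 100;

  // Explicit equivalent of the body of "with Box1, Box2 do ...":
  // the innermost record (Box2) wins for every shared field name.
  Box2.Left := Box2.Left;
  Box2.Top := Box2.Top;
  Box2.Right := MyRight;   // falls through to the outer variable
  Box2.Bottom := Box2.Bottom;

  // Box1 is unchanged; only Box2.Right has a new value.
  WriteLn('Box1.Right = ', Box1.Right, ', Box2.Right = ', Box2.Right);
end.
```

Compiled with FPC this should print "Box1.Right = 3, Box2.Right = 100", 
which is probably not what someone reading the 'with' version would 
guess at first glance. Doing this rewrite automatically requires knowing 
which identifiers are fields of which record, i.e. real symbol 
information and not just tokens.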

If so, that brings us to the 'refactoring tool'; and --unless I am 
mistaken-- there isn't one for FPC/Lazarus/others. If such a thing 
doesn't exist, I can only guess it is simply because the infrastructure 
for it isn't suitably ready --how many of us would undertake 
writing a fully-fledged parser engine before writing a refactoring tool 
(which is hard enough) for non-commercial purposes? The same goes for 
code analysis tools etc.

These are the reasons why I am pinning so much hope on modularizing the 
tokenizer/parser/compiler trio. Once that is done, someone else (singular 
or plural) --I am hoping-- will step in to write those tools, which in 
turn will help in developing all sorts of other tools better and faster.

But I digress.

The short answer to your question is: I don't have a preference carved in stone.

-- 
Cheers,

Adem




