[Lazarus] Why is SAX so slow?
Werner Pamler
werner.pamler at freenet.de
Sun Dec 25 12:47:15 CET 2016
Motivated by a user comment on the excessive memory consumption of
fpspreadsheet when reading large xlsx files
(http://forum.lazarus.freepascal.org/index.php/topic,33292.msg231598.html#msg231598),
I began to investigate an alternative approach to reading xlsx files,
based on SAX instead of the DOM that is currently used.
However, I found that SAX is considerably slower than DOM. I had always
thought it would be the other way round, because SAX avoids building the
tree of DOM nodes.
I am probably doing something wrong.
If somebody wants to look into this issue, here is a little demo. It
consists of three projects:
* *create_xml* creates an XML file similar to the sharedStrings.xml
  used internally by xlsx files. The file consists of 500,000 nodes
  with random strings, and is about 20 MB in size.
* *read_dom* reads this file using the DOM routines of FPC. On my system
  this takes about 1.2 seconds.
* *read_sax* reads the same file using the SAX routines of FPC. On my
  system this takes 4.3 seconds.
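
For readers without the attachment, the structure of the three projects can be sketched roughly as follows. This is only an illustration in Python, not the Free Pascal code from sax.zip. The function names, the file name, and the fixed string length of 20 characters are assumptions. The element names sst, si and t follow the layout of sharedStrings.xml. Timings from this sketch say nothing about the FPC XML units. It only shows the two reading styles: DOM builds the whole tree first, while SAX collects the strings in event handlers.

```python
import random
import string
import time
import xml.sax
from xml.dom import minidom


def create_xml(path, node_count):
    """Write a sharedStrings-like file with node_count <si><t>..</t></si> entries."""
    with open(path, "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n<sst>\n')
        for _ in range(node_count):
            # Assumption: 20 random ASCII letters per string.
            text = "".join(random.choices(string.ascii_letters, k=20))
            f.write(f"<si><t>{text}</t></si>\n")
        f.write("</sst>\n")


def read_dom(path):
    """DOM style: parse the whole file into a node tree, then walk it."""
    doc = minidom.parse(path)
    return [
        "".join(child.data for child in t.childNodes)
        for t in doc.getElementsByTagName("t")
    ]


class SharedStringHandler(xml.sax.ContentHandler):
    """SAX style: no tree is built, strings are collected in callbacks."""

    def __init__(self):
        super().__init__()
        self.strings = []
        self._parts = None  # list of text chunks while inside <t>, else None

    def startElement(self, name, attrs):
        if name == "t":
            self._parts = []

    def characters(self, content):
        # The parser may deliver the text of one element in several chunks.
        if self._parts is not None:
            self._parts.append(content)

    def endElement(self, name):
        if name == "t":
            self.strings.append("".join(self._parts))
            self._parts = None


def read_sax(path):
    handler = SharedStringHandler()
    xml.sax.parse(path, handler)
    return handler.strings


def timed(func, path):
    """Run func(path) and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = func(path)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    path = "sharedstrings_test.xml"  # hypothetical file name
    create_xml(path, 500_000)
    for reader in (read_dom, read_sax):
        strings, seconds = timed(reader, path)
        print(f"{reader.__name__}: {len(strings)} strings in {seconds:.2f} s")
```

Both readers return the same list of strings, so any difference in run time comes from the parsing approach and from how the handlers accumulate text, not from the amount of data extracted.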
So, why is the SAX project slower than the DOM project?
Werner
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sax.zip
Type: application/x-zip-compressed
Size: 4921 bytes
Desc: not available
URL: <http://lists.lazarus-ide.org/pipermail/lazarus/attachments/20161225/e038b067/attachment.bin>