[Lazarus] FPC index/searcher classes committed.
Michael Van Canneyt
michael at freepascal.org
Sat Mar 10 18:29:51 CET 2012
Hello,
I've committed a file/database indexing and search engine, fpindexer, to SVN.
It was developed by Darius Blaszijk, with help from me. One of the planned uses is
to create searchable documentation. I've also (in a private project) used it to
implement full-text search on a database that doesn't support that natively.
It is in packages/fpindexer. It can be compiled with make or fpmake, but it is
not yet built when you 'make' all packages (it needs testing first).
We'd like people to try it and comment on the design and speed of the engine.
Suggestions for improvements are more than welcome.
One of the planned extensions is automatic language detection.
An implementation has been created, but it needs testing and speed improvements.
A Lazarus package that installs the components on the component palette will
be submitted to the Lazarus developers soon (as soon as I've tested it ;))
Below you'll find part of the README file, which explains more or less what
is available.
Michael.
-----------------------------------------------------------------------
Architecture:
=============
The design of the indexer and search mechanism is modular. It consists of:
- A storage mechanism
- An indexer class
- A search class
- Text processing classes.
The indexer uses a text processing class and a storage mechanism to create a
search database. The search class uses the same storage mechanism to search
the database.
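
To make the division of work concrete, here is a minimal conceptual sketch in
Python. It is not the fpindexer API: the class names (MemoryStore,
PlainTextProcessor, Indexer, Searcher) and their methods are hypothetical and
only mirror the four roles listed above.

```python
import re
from collections import defaultdict


class MemoryStore:
    """Hypothetical storage mechanism: maps each word to the documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, word, doc):
        self.postings[word].add(doc)

    def lookup(self, word):
        return sorted(self.postings.get(word, ()))


class PlainTextProcessor:
    """Hypothetical text processor: splits text into lowercase words."""

    def words(self, text):
        return re.findall(r"[a-z0-9]+", text.lower())


class Indexer:
    """Uses a text processor and a storage mechanism to build the search database."""

    def __init__(self, store, processor):
        self.store = store
        self.processor = processor

    def index(self, doc, text):
        for word in self.processor.words(text):
            self.store.add(word, doc)


class Searcher:
    """Uses the same storage mechanism to answer queries."""

    def __init__(self, store):
        self.store = store

    def search(self, word):
        return self.store.lookup(word.lower())


store = MemoryStore()
indexer = Indexer(store, PlainTextProcessor())
indexer.index("readme.txt", "The indexer is modular")
indexer.index("notes.txt", "A modular search class")
print(Searcher(store).search("Modular"))  # ['notes.txt', 'readme.txt']
```

Because the indexer and the searcher only talk to the storage object, swapping
the in-memory store for a database-backed one would not change either class.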
Currently, 3 databases are supported:
- In memory database (plus flat file storage)
- Firebird database
- SQLite database.
3 input text processors are supported:
- Plain text
- HTML
- Pascal source files.
When a file is processed, the text processor is selected based on the file's
extension.
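
Selecting a processor by extension could look roughly like the following
Python sketch. The function names, the extension table and the fallback to
plain text for unknown extensions are all assumptions made for illustration;
they do not describe how fpindexer itself registers its processors, and the
Pascal source processor is left out to keep the example short.

```python
import os
import re
from html.parser import HTMLParser


def plain_words(text):
    """Split plain text into lowercase words."""
    return re.findall(r"\w+", text.lower())


class _TextExtractor(HTMLParser):
    """Collects the text between HTML tags and drops the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_words(text):
    """Strip the markup, then treat the remainder as plain text."""
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()
    return plain_words(" ".join(parser.chunks))


# Hypothetical registry: file extension -> text processor.
PROCESSORS = {
    ".txt": plain_words,
    ".html": html_words,
    ".htm": html_words,
}


def processor_for(filename):
    """Pick a processor from the extension; fall back to plain text (an assumption)."""
    extension = os.path.splitext(filename)[1].lower()
    return PROCESSORS.get(extension, plain_words)


print(processor_for("index.HTML")("<p>Hello <b>world</b></p>"))  # ['hello', 'world']
```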
It is possible to specify a list of words to ignore per language, and a mask
for words to ignore.
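
As an illustration of the idea, the Python sketch below filters a word list
with a per-language ignore list and a mask. The word lists, the language codes
and the choice of a shell-style wildcard as the mask syntax are assumptions;
the mask syntax that fpindexer actually accepts is not described here.

```python
from fnmatch import fnmatchcase

# Hypothetical ignore lists, keyed by language code.
IGNORE_WORDS = {
    "en": {"the", "a", "and", "of"},
    "nl": {"de", "het", "een", "en"},
}

# Hypothetical mask: ignore every word that starts with a digit.
IGNORE_MASK = "[0-9]*"


def keep_words(words, language, mask=IGNORE_MASK):
    """Drop words on the language's ignore list or matching the mask."""
    ignored = IGNORE_WORDS.get(language, set())
    return [
        word
        for word in words
        if word not in ignored and not fnmatchcase(word, mask)
    ]


print(keep_words(["the", "indexer", "2012", "and", "search"], "en"))
# ['indexer', 'search']
```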
On top of the file/stream indexer, a database indexer is implemented.
It can be used to implement full-text search on a database.
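
The general technique can be sketched with Python's sqlite3 module: read the
text column of each row, split it into words, and store (word, row id) pairs
in a separate index table that is then queried instead of the original text.
The table and column names are hypothetical, and this is only a sketch of the
idea, not a description of fpindexer's database indexer or its schema.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")

# A table whose text column we want to search (hypothetical sample data).
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO articles (id, body) VALUES (?, ?)",
    [
        (1, "Free Pascal compiler notes"),
        (2, "Lazarus component palette"),
        (3, "Pascal units for Lazarus"),
    ],
)

# The search index: one row per distinct word per article.
conn.execute("CREATE TABLE word_index (word TEXT, article_id INTEGER)")

# Indexing pass: split each body into words and record where they occur.
for article_id, body in conn.execute("SELECT id, body FROM articles").fetchall():
    words = set(re.findall(r"\w+", body.lower()))
    conn.executemany(
        "INSERT INTO word_index (word, article_id) VALUES (?, ?)",
        [(word, article_id) for word in words],
    )


def search(word):
    """Return the ids of all articles containing the given word."""
    rows = conn.execute(
        "SELECT DISTINCT article_id FROM word_index "
        "WHERE word = ? ORDER BY article_id",
        (word.lower(),),
    )
    return [row[0] for row in rows]


print(search("Pascal"))  # [1, 3]
```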
Sample programs for all 3 classes (search, index and database index) are
provided in the examples directory.