[Lazarus] data matrix with thousands of columns

Marc Santhoff M.Santhoff at web.de
Fri Mar 29 18:17:42 CET 2013


On Tuesday, 26.03.2013, at 15:44 +0100, Andrea Mauri wrote:
> One more thing: my data is more like a huge spreadsheet than a
> relational DB. Anyway, I am looking for the best option already available
> in lazarus/fpc to store and query the dataset.
> 
> On 26/03/2013 15:38, Andrea Mauri wrote:
> > Dear all,
> > I am looking for the best option in order to store big datasets with
> > thousands of columns.
> > The dataset can contain from tens to hundreds of thousands of lines
> > and thousands of columns (some columns are strings, some numbers).
> > Which is the best option to store and retrieve information from a
> > dataset like this?
> > Currently I am using sqlite, with tens of tables of at most 200
> > columns each. But I am not sure this is the best option.
> > Within my application I have to query this dataset:
> > - retrieve a particular line (all or some columns);
> > - retrieve a particular column (all or some lines);
> > - order the dataset with respect to a particular column;
> > - delete/add line(s);
> > - delete/add column(s);
> > ...
> >
> > SQLite is easy to use when I need to query the dataset but I am not sure
> > that it is the most suitable option.
> >
> > Any hint?

Scientific data formats may be best for this task, something like HDF or
NetCDF.

I wrapped the header of HDF5 version 1.6.4 (or was it .5?) a long while
ago with the help of MvC, and the result is lingering somewhere on my
hard disk in an unknown state, but it worked quite well in the cases I
tested. That might help somehow.

I'd have to clean up a bit before sending you a copy, but be aware that
HDF5 currently comes in two flavors: the "old" style, which is more or
less abandoned but still usable, was at version 1.6.9 the last time I
looked. The new "product line" starts with version 1.8.x, and I have no
idea what changes would be necessary. The two flavors cannot be
installed in parallel (at least on FreeBSD, which I'm using).

There are some problems with declaring HDF's "compound types" in source
code: the original C source uses macro voodoo that is not easily
portable. But you should not be forced to use those data types; the
simple "array" type should do for you, I think.

Have a look here to get an impression:

http://www.hdfgroup.org/HDF5/

If you're interested and can wait a little for results from my side,
I'll send you a copy.

Have fun,
Marc

-- 
Marc Santhoff <M.Santhoff at web.de>




