[Lazarus] storing big dynamic matrices in file ?

R.Smith ryansmithhe at gmail.com
Sun Aug 27 23:56:23 CEST 2023


On 2023/08/27 21:46, Etienne Leblois via lazarus wrote:
> Dear all,
>
> I want to manipulate, store, later retrieve, 2-dimensions matrices of 
> single or double ;
>
> matrices may turn big and I wanted to turn to dynamic array of array
>
> the basic solution below works, but storing (and retrieval) of big 
> matrices is VERY slow.
>
> any hint on the good way to make things quick ?


Hi Etienne,

There are two things in the method that are slow, and the file writing 
in particular can be improved by about three orders of magnitude.

The first problem is the way the matrix is sized and filled, though this 
only costs a factor of about 1.5 in the timings below. The second 
problem is the millions of individual file writes, which are horribly 
slow even on a very fast SSD. The whole matrix can (and should) be 
written with a single write.

To show the timings of the method you've used and of the proposed better 
method, I've made a short program (see below) that you can run yourself; 
I simply named the two methods Old-way and New-way. Both fill and write 
a 10000 x 10000 matrix of single (100 million values, about 400 MB on 
disk).
Your way (old way) takes 1.032 seconds to fill the data, and 6 minutes 
and 13.702 seconds to write it to disk on my relatively fast machine 
with a fast SSD drive.
The good way (new way) takes 0.685 seconds to fill the data and 0.315 
seconds to write it to disk, i.e. near instant, and more than a thousand 
times faster for the write.

Your mileage may vary based on what kind of drive, speed, drive-cache, 
etc. you run - but simply take the code and try it.
Everything used is standard FPC code with only "Classes" and "SysUtils" 
units used - nothing fancy, no special libraries or tools.


    program lm;

    {$R *.res}

    uses Classes, SysUtils;


    const matrixSize = 10000;

    var
       dt, bt, wt : TDateTime;



    procedure oldWay;
    var
       fsingle : file of single;
       m       : array of array of single;
       i,j,n   : Integer;
    begin
       dt := Now();
       n  := matrixSize;
       m  := nil;

       // I like i and j to run in 1..n, so I accept to lose row and column 0
       setlength(m, 1 + n, 1 + n);

       for i := 1 to n do for j := 1 to n do M[i, j] := random();
       bt := Now();

       assignfile(fsingle, 'single_test_old.bin');
       rewrite(fsingle);
       for i := 1 to n do for j := 1 to n do write(fsingle, M[i, j]);
       closefile(fsingle);
       wt := Now();

    end;



    procedure newWay;
    var
       a         : array of single;
       i,j,n     : Int64;

       aIdx,
       aFullSize : Int64;
       FS        : TFileStream;

    begin
       dt := Now();
       n  := matrixSize;
       a  := nil;
       aFullSize :=  n * n;
       SetLength(a, aFullSize);

       for i := 0 to (n - 1) do for j := 0 to (n - 1) do begin
         aIdx := (i * n) + j;
         a[aIdx] := random();
       end;
       bt := Now();

       FS := TFileStream.Create('single_test_new.bin',
                                fmCreate or fmShareDenyWrite);
       FS.WriteBuffer(a[0], aFullSize * SizeOf(Single));
       FS.Free;
       wt := Now();

    end;




    begin

       WriteLn();
       WriteLn('Doing the old way...');
       oldWay;
       WriteLn('  Old way matrix assignment time: ',
               FormatDateTime('hh:nn:ss.zzz', bt - dt));
       WriteLn('  Old way File-Write time: ',
               FormatDateTime('hh:nn:ss.zzz', wt - bt));


       WriteLn();
       WriteLn('Doing the new way...');
       newWay;
       WriteLn('  New way matrix assignment time: ',
               FormatDateTime('hh:nn:ss.zzz', bt - dt));
       WriteLn('  New way File-Write time: ',
               FormatDateTime('hh:nn:ss.zzz', wt - bt));


       ReadLn();
    end.
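
If you would rather keep the m[i, j] syntax, a possible middle ground 
(this is not what the program above does, and I have not timed it) is to 
keep the array of array but write one whole row per call. The sketch 
below assumes zero-based rows and columns; the program name, the file 
name 'single_test_rows.bin' and the smaller size n = 1000 are made up 
for the illustration.

```pascal
program rowwise;

{$mode objfpc}

uses Classes, SysUtils;

const
  n = 1000;  // hypothetical size for this sketch

var
  m    : array of array of Single;
  i, j : Integer;
  FS   : TFileStream;
begin
  m := nil;
  SetLength(m, n, n);            // zero-based: rows and columns 0..n-1
  for i := 0 to n - 1 do
    for j := 0 to n - 1 do
      m[i, j] := Random();

  // File name is an assumption for this sketch.
  FS := TFileStream.Create('single_test_rows.bin', fmCreate or fmShareDenyWrite);
  try
    // Each row of a dynamic array of arrays is its own contiguous block,
    // so a row can be written with a single call: n writes in total
    // instead of n * n.
    for i := 0 to n - 1 do
      FS.WriteBuffer(m[i][0], n * SizeOf(Single));
    WriteLn('Wrote ', FS.Size, ' bytes');
  finally
    FS.Free;
  end;
end.
```

The rows are separate memory blocks, so the whole matrix cannot go out 
in one WriteBuffer call this way; the flat array in newWay is what makes 
the single write possible.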


PS: Getting an index into the flat matrix array a from the i and j 
values is easy, as done above:
     aIdx := (i * n) + j;
and to get the i and j values back from the index, you can simply do:
     i := (aIdx div n);
     j := (aIdx mod n);

Unfortunately, this requires using proper Zero-based indexing, else the 
formulae get unnecessarily complicated.
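
For reference, if i and j really must run in 1..n, the adjusted formulae 
could look like the sketch below. The names ToIndex and FromIndex and 
the tiny n = 4 are only made up for the illustration; the flat array 
itself stays zero-based.

```pascal
program idxdemo;

{$mode objfpc}

const
  n = 4;  // hypothetical small matrix size, for illustration only

// Map 1-based (i, j) to a 0-based position in the flat array.
function ToIndex(i, j: Int64): Int64;
begin
  Result := (i - 1) * n + (j - 1);
end;

// Recover 1-based (i, j) from a 0-based flat position.
procedure FromIndex(aIdx: Int64; out i, j: Int64);
begin
  i := (aIdx div n) + 1;
  j := (aIdx mod n) + 1;
end;

var
  i, j: Int64;
begin
  FromIndex(ToIndex(3, 2), i, j);
  // prints: index of (3,2) = 9, back to (3,2)
  WriteLn('index of (3,2) = ', ToIndex(3, 2), ', back to (', i, ',', j, ')');
end.
```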

Reading the data from the file back into the matrix array is just as 
easy, as long as the array is sized before the read:

   SetLength(a, aFullSize);
   FS := TFileStream.Create('single_test_new.bin',
                            fmOpenRead or fmShareDenyWrite);
   FS.ReadBuffer(a[0], aFullSize * SizeOf(Single));
   FS.Free;
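
If the size is not known in advance when reading, it can be taken from 
the file length. Below is a minimal round-trip sketch that writes a 
small flat array the same way as newWay, reads it back and compares; the 
program name, the file name 'roundtrip_test.bin' and the size n = 100 
are assumptions chosen only so the check runs instantly.

```pascal
program roundtrip;

{$mode objfpc}

uses Classes, SysUtils;

const
  FileName = 'roundtrip_test.bin';  // hypothetical file name for this sketch
  n        = 100;                   // small size so the check runs instantly

var
  a, b  : array of Single;
  FS    : TFileStream;
  count : Int64;
  k     : Int64;
  same  : Boolean;
begin
  // Fill and write a flat n * n array, as in newWay.
  a := nil;
  SetLength(a, n * n);
  for k := 0 to High(a) do a[k] := Random();
  FS := TFileStream.Create(FileName, fmCreate or fmShareDenyWrite);
  try
    FS.WriteBuffer(a[0], Length(a) * SizeOf(Single));
  finally
    FS.Free;
  end;

  // Read it back: size the target array from the file length first.
  b := nil;
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    count := FS.Size div SizeOf(Single);
    SetLength(b, count);
    FS.ReadBuffer(b[0], count * SizeOf(Single));
  finally
    FS.Free;
  end;

  // Verify the round trip element by element.
  same := Length(a) = Length(b);
  if same then
    for k := 0 to High(a) do
      if a[k] <> b[k] then same := False;
  WriteLn('Read ', count, ' values; identical to what was written: ', same);
end.
```

Note that the file holds only the raw values, so for a non-square matrix 
the row and column counts would have to be stored somewhere as well.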

Hope that answers the question.
