[Lazarus] Threads in Lazarus code base
Dariusz Mazur
darekm at emadar.com
Mon Sep 20 15:22:17 CEST 2010
W dniu 2010-09-20 13:55, Mattias Gärtner pisze:
> Zitat von Dariusz Mazur <darekm at emadar.com>:
>
>> [...]
>>> Before using multi threading, try to use a better single threaded
>>> algorithm. And when comparing multi with single threading speed, you
>>> must not compare a multi threaded algorithm running single threaded,
>>> but
>>> a multi threaded algorithm with a good single threaded algorithm.
>> I don;t talk about best solution of fibbonaci computing. But about
>> possibility do deal with some class of task using multithreading.
>> Its easy imagine task, which is hard to resolve with loops.
>
> Of course.
> I think, we both agree that CILK is a nice approach.
Nice to hear this.
> I just said, that there are not many multi threading examples that are
> easy to read and are practical at the same time. Computing Fibonacci
> numbers does not need multi threading.
>
Most of programs are huge count of simple computing. There are many
places, that things may run parallel without pain. But very often this
places are computing short, thus very often overhead of starting thread
is bigger than using second core. But CILK potentially (I can prof that,
only imagine) reduce this overhead. Second : its very readability
solution, and compiler potentially can warning about some danger . If we
try to concurrency smaller things, without loss efficiency, danger
will be minimize
>
>>>> Its simple example to show, that CILK can play with recurrence (even
>>>> with one invoke).
>>> Yes.
>>> BTW: Same for parallel loops. You can write the above as parallel loop:
>>>
>>> ...
>>> for i:=0 to 1 paralleldo begin
>>> if i=0 then x := fib (n-1)
>>> else y := fib (n-2);
>>> end;
>>> exit(x+y);
>>> ...
>>>
>> I doubt this run. Very soon you receive lack of threads. Second this
>> two iteration has very different time of computing.
>> I don;t know approach, that balance this.
>
> Even the mtprocs does that.
> http://wiki.lazarus.freepascal.org/Parallel_procedures#Features
realy?
cite:" a procedure or method is executed with an Index running from an
arbitrary StartIndex to an arbitrary EndIndex"
assigning iteration to threads are done on start loop, eq. we must know
count of iteration.
what when iteration 1 is ten times longer than rest?
>
> Of course mtprocs has some disadvantages, e.g. it uses slow critical
> sections and task size must be tuned by the programmer.
>
>> Third: readability is poor.
>
> Compare this to OpenMP programs. ;)
>
>
>> [...]
>>>> CILK has different attempt to task scheduling., its fullfil
>>>> Work-requesting idea from
>>>> http://groups.google.com/group/lock-free/browse_frm/thread/18f90bdd8c721880
>>>>
>>> That's an implementation detail.
>> This is not detail, but quite different attempt.
>>
>>> Parallel loops implementations can use
>>> this too. There are for example plugins for OpenMP using work stealing.
>>> Probably some use some form of work requesting too.
>> OpenMP is heavy framework, not suitable to all area. Some need
>> something light. But as You notice, work stealing is not useless. Am
>> best with strong integration with FPC.
>
> Yes, that would be nice.
But some one should choose on approach, even weak, and start play with
them. Till now every one (me too) complaint: my solution is better.
>
>
>>>> More on CILK documentation, I;m to weak in english.
>>>>
>>>>>> Second: we not initialize thread on SPAWN (with
>>>>>> loops threads are initializing on start loop), Workers wait for
>>>>>> task on
>>>>>> idle. Push and pop task from queue can be very fast. Of course this
>>>>>> depend on implementation (for example each worker should deal
>>>>>> with own
>>>>>> queue, but should be possibility to draw task form other queue, this
>>>>>> approach is faster than single MPMC FIFO queue)
>>>>> Ehm, you know that parallel loops normally use thread pools, do you?
>>>> Thats resolved only one problem.
>>>> Thread switching is done slowly (less than 100Hz) and rather
>>>> expensive,
>>>> thus we cant start one thread per iteration. We should a priori pass
>>>> iteration to threads.
>>> Of course. I don't know any parallel loop implementation starting one
>>> thread per iteration (only in extreme cases).
>> But CILK can effective play with this.
>
> I doubt that. For example:
>
> for i:=1 to 100000000 do
> spawn DoSomething(i);
>
> It will work (unless run out of mem), but it obviously needs some tuning.
but
for i:=1 to 1000000 do
spawn DoSomething(i);
should work,
I thinking about Fifo queue, I use one Lock free,
its need 16 byte per spawn (func addres, param and result)
time of swich less than 1us
problem is with assigning local stack per task, my asm is too weak.
--
Darek
More information about the Lazarus
mailing list