#1
Intel, Parallelism and OpenMP
Hi,
I'm using OpenMP under the Intel compiler suite. Take a for() loop such as: for(i=0; i<k; i++) ... The Intel docs say NOT to use the auto-parallel pragmas if the value of 'k' is not fixed. Does anybody know why?

If k varies each time the loop is encountered but remains constant during execution of the loop, why can't the loop be split into 4 threads, each handling approximately k/4 iterations, using #pragma omp parallel sections and four #pragma omp section blocks?

Is this a shortcoming of the pragmas, or a fundamental point that I am missing?

Git
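For reference, the four-way split described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: the loop body (scaling an array), the helper scale_range and the function scale_sections are hypothetical names, not anything from the Intel docs, and the code assumes k does not change while the sections run.

```c
/* Hypothetical helper: multiply a[lo..hi) by f. */
static void scale_range(double *a, int lo, int hi, double f)
{
    for (int i = lo; i < hi; i++)
        a[i] *= f;
}

/* Splits [0, k) into four contiguous chunks, one per OpenMP section.
   k is read once on entry, so it may differ from call to call, but it
   is assumed not to change while the sections are running. */
void scale_sections(double *a, int k, double f)
{
    const int q1 = k / 4;
    const int q2 = k / 2;
    const int q3 = (int)((3LL * k) / 4);   /* widen to avoid overflow */

    #pragma omp parallel sections
    {
        #pragma omp section
        scale_range(a, 0, q1, f);

        #pragma omp section
        scale_range(a, q1, q2, f);

        #pragma omp section
        scale_range(a, q2, q3, f);

        #pragma omp section
        scale_range(a, q3, k, f);
    }
}
```

If the file is compiled without OpenMP enabled, the pragmas are ignored and the four chunks simply run one after another, which makes it easy to compare results against the serial version.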
#2
I guess that's because the condition "i<k" has to be evaluated after each iteration, and C/C++ allows changes to k during the loop.

Other languages (like MATLAB) would run through that loop k_0 times, with k_0 being the value of k when entering the loop. Changes to k will be ignored. In this case it's possible to parallelize that loop (called "parfor" in MATLAB): parfor i=0:k-1 ... end

But in C/C++ you have to check the expression each time, therefore it's just not possible to parallelize that loop, because you do NOT know in advance how many times you have to run through that loop if k is a non-constant variable. Maybe you should try a "const int k" to see if that gets parallelized.
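One way to try the "const int k" idea is to copy k into a local const before the loop, so the bound visibly cannot change inside it. The sketch below uses an explicit OpenMP worksharing loop rather than Intel's auto-parallelizer, so it says nothing about what the auto-parallel pragmas will do; the function name sum_squares and the loop body are made up for illustration. OpenMP requires the loop bound to be loop-invariant, and the trip count is then computed once on entry to the loop.

```c
/* Sums i*i for i in [0, k). The bound is copied into a local const so
   that it is fixed for the lifetime of the loop, even though k itself
   may be different on every call. */
long long sum_squares(int k)
{
    const int n = k;          /* snapshot of k taken on loop entry */
    long long total = 0;

    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++)
        total += (long long)i * i;

    return total;
}
```

Calling sum_squares(4) and then sum_squares(10) exercises the case from the question: the bound varies between encounters of the loop but is constant while the loop executes.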
#3
As I said, k remains constant during the execution of the loop. The compiler should see that as it compiles the loop, whose body has no external references and no references to k. I'll just split it up manually into four loops and run each on a thread. If I can do it, then the compiler should be able to.
Git