#pragma omp simd

Forum for the public review of the OpenMP 4.0 API Release Candidates. (Read Only)
This forum is now closed.

Post by jakub » Fri Apr 05, 2013 2:24 am

1) When a safelen (x) clause is present, I understand that for any iteration i in [0, N), where N is the number of loop iterations, and any j in (i, i + x), iterations i and j can be executed concurrently. But if the safelen clause is missing, does that mean the same as safelen (N), or as safelen (0)? In the latter case the compiler would be responsible for finding out whether there are any possible overlaps and whether it needs, for example, runtime versioning for aliasing, as it normally does during autovectorization when it can't prove there is no overlap.

2) What exactly is aligned when a var is specified in aligned clause?
Code:
extern int *p, q, r[1024];
void
foo (int *s)
{
  #pragma omp simd aligned (p, q, r, s : 8 * sizeof (int))
  for (int i = 0; i < 1024; i++)
    r[i] = p[i] + q + s[i];
}

Can the compiler assume that (((uintptr_t) p) & 31) == 0, or that (((uintptr_t) &p) & 31) == 0? (The mask is 31 because 8 * sizeof (int) is 32 with a 4-byte int.) For q, being a non-pointer scalar, I bet it doesn't make sense to talk about alignment at all (should the standard forbid specifying it in an aligned clause?), but if it can be specified validly, the only possibility is (((uintptr_t) &q) & 31) == 0. For r, I guess (((uintptr_t) &r[0]) & 31) == 0 is the only possibility, and for s there are again two possibilities, as for p.
For vectorization, I guess the first alternative for p (and similarly for s) and the only alternative for r are what is useful, but in that case I'd say the standard should have detailed wording on it.

3) aligned clauses without a specified alignment: what are other compilers intending to do here, and what are users expected to use in their code to align variables, especially on targets with multiple vector sizes?
Say on i?86/x86_64, where (currently) we have 16-byte vectors and, with AVX and later, 32-byte vectors: with -mavx (or -mavx2 etc.) some loops are better vectorized using 32-byte vectors and some using 16-byte vectors. For example, with -mavx when AVX2 isn't available, a loop that needs integer math has to use 16-byte vectors, while a loop that only does float/double math can use 32-byte vectors. Would the default alignment be 16 when -mavx isn't specified (or similar, i.e. when the CPU the code is generated for can never use 32-byte vectors) and 32 when -mavx is specified? Or 16 vs. 32 depending on whether the vectorizer actually sees that it could use 32-byte vectors?
Or is the plan, say, for Intel CC to always use 16-byte alignment? Unfortunately this isn't just the compiler's private decision, because it requires the user to add alignas(16) vs. alignas(32) somewhere (or an aligned attribute, or whatever the compiler allows), or to use posix_memalign etc., or whatever the language or its extensions provide to align variables.
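To make the user-side burden in question 3 concrete, here is a minimal sketch in C of one way a user might hedge against the unknown default. It is an assumption, not anything the draft prescribes: the macro VEC_ALIGN, the arrays vec_in and vec_out, and the function scale are hypothetical names, and the policy (pick 32 when the build enables AVX, else 16) is only one possible choice. It relies on GCC and Clang predefining __AVX__ when AVX code generation is enabled.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical policy: use the widest vector size the build target allows.
   GCC and Clang predefine __AVX__ when AVX is enabled (e.g. with -mavx). */
#if defined(__AVX__)
#define VEC_ALIGN 32
#else
#define VEC_ALIGN 16
#endif

#define N 1024

/* The user, not the compiler, has to request the storage alignment. */
alignas(VEC_ALIGN) float vec_in[N];
alignas(VEC_ALIGN) float vec_out[N];

/* Spell the alignment out in the clause instead of relying on an
   implementation-defined default. Without OpenMP support the pragma is
   ignored and the loop runs as plain scalar code. */
void scale(float factor)
{
    #pragma omp simd aligned(vec_in, vec_out : VEC_ALIGN)
    for (int i = 0; i < N; i++)
        vec_out[i] = vec_in[i] * factor;
}
```

Stating the alignment explicitly keeps the declaration and the clause in agreement whatever default a given compiler picks; always over-aligning to 32 would be a simpler alternative.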
jakub
 
Posts: 74
Joined: Fri Oct 26, 2007 3:19 am

Re: #pragma omp simd

Post by xtian » Sat Apr 06, 2013 2:03 pm

If safelen is not specified, it implies safelen = N. Programmers need to make sure the entire loop is vectorizable (i.e. there is no loop-carried lexical backward dependency).
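As an illustration of why the distinction matters, here is a minimal sketch in C of a loop that does have a loop-carried backward dependency. The function names and the dependence distance of 4 are hypothetical choices for this example. Iteration i reads what iteration i - 4 wrote, so the loop is only safe to vectorize in chunks of at most 4 iterations, and a bare #pragma omp simd (implying safelen = N) would be an incorrect promise.

```c
#include <assert.h>

#define N 1024

/* Iteration i depends on iteration i - 4, so only iterations fewer than
   4 apart are independent of each other. */
void shift_add_simd(int *a)
{
    /* safelen(4) limits how far apart concurrently executed iterations
       may be; without it the programmer would be claiming the whole loop
       is free of such dependencies. */
    #pragma omp simd safelen(4)
    for (int i = 4; i < N; i++)
        a[i] = a[i - 4] + 1;
}

/* Scalar reference with the same loop body and no pragma. */
void shift_add_scalar(int *a)
{
    for (int i = 4; i < N; i++)
        a[i] = a[i - 4] + 1;
}

/* Returns 1 if both versions produce identical arrays, 0 otherwise. */
int results_match(void)
{
    static int x[N], y[N];
    for (int i = 0; i < N; i++)
        x[i] = y[i] = i % 7;
    shift_add_simd(x);
    shift_add_scalar(y);
    for (int i = 0; i < N; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}
```

Compiled without OpenMP support the pragma is simply ignored, so results_match() also serves as a sanity check of the scalar semantics.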
xtian
 
Posts: 4
Joined: Fri Apr 05, 2013 4:55 pm

Re: #pragma omp simd

Post by xtian » Sat Apr 06, 2013 2:08 pm

We will clarify what can be specified in the "aligned" clause. The "aligned" clause applies to a "memory address", e.g. *p, a[100]; aligned(p:64), aligned(a:4). Note that OpenMP does not allow &p or &a[0] in the clause.
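Reading this as "aligned(p : n) describes the memory that p points to, not the storage of the pointer variable itself", a minimal sketch in C could look as follows. The names ALIGNMENT, is_aligned, add_one and run_aligned_demo are hypothetical, and the use of C11 aligned_alloc is an assumption about the platform (posix_memalign would be an alternative on POSIX systems).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define N 1024
#define ALIGNMENT 32  /* assumed: 8 * sizeof(int) with a 4-byte int */

/* True when the address held in ptr is a multiple of ALIGNMENT. */
static int is_aligned(const void *ptr)
{
    return ((uintptr_t) ptr % ALIGNMENT) == 0;
}

/* The clause is a promise about the pointed-to memory of dst and src.
   Where the two pointer variables live is irrelevant. */
void add_one(int *dst, const int *src)
{
    #pragma omp simd aligned(dst, src : 32)
    for (int i = 0; i < N; i++)
        dst[i] = src[i] + 1;
}

/* Returns 1 on success, 0 on allocation failure or a wrong result. */
int run_aligned_demo(void)
{
    /* aligned_alloc requires the size to be a multiple of the alignment. */
    assert((N * sizeof(int)) % ALIGNMENT == 0);
    int *src = aligned_alloc(ALIGNMENT, N * sizeof(int));
    int *dst = aligned_alloc(ALIGNMENT, N * sizeof(int));
    int ok = src && dst && is_aligned(src) && is_aligned(dst);
    if (ok) {
        for (int i = 0; i < N; i++)
            src[i] = i;
        add_one(dst, src);
        for (int i = 0; i < N; i++)
            if (dst[i] != i + 1)
                ok = 0;
    }
    free(src);
    free(dst);
    return ok;
}
```

Passing a pointer whose target is not actually 32-byte aligned would break the promise made by the clause, which is why the demo checks the allocations before calling add_one.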

Re: #pragma omp simd

Post by jakub » Thu Apr 11, 2013 3:59 am

Also, there doesn't seem to be any restriction that items specified in aligned clause(s) can't appear in other #pragma omp simd clauses. The question is, what is the behavior if there is say:
Code:
extern int *p1, *p2, *p3, *p4, a[1024];
...
#pragma omp simd private(p1, a) lastprivate(p2) reduction(+:p3) linear(p4:1) aligned(p1, p2, p3, a:32)
for (i = 0; i < 1024; i++)
  ...

Does that say something just about the alignment of the original list items, or, say, in the case of the array a, does it request that the compiler align the privatized array to 32 bytes? If aligned(p1) talks about the alignment of *p1 and aligned(a) talks about the alignment of the array itself, then the privatized pointer is simply uninitialized, so its privatized copy can hardly have any well-defined alignment. Or should the standard just require that list items in an aligned clause not be specified in a certain subset of the other clauses, or in any of them except perhaps the shared clause for the case of #pragma omp parallel for simd?

Is it intentional that nothing says the list items mentioned in aligned clauses cause an implicit reference in outer constructs (if any)? It makes sense; I just wanted to make it clear that if an aligned clause mentions variables that aren't actually used in the construct, the compiler is expected to just ignore those clauses.

Re: #pragma omp simd

Post by xtian » Wed Apr 24, 2013 5:05 pm

> Is it intentional that there is nothing that talks about the list items mentioned in aligned clauses causing implicit reference in outer constructs (if any)? It makes sense, just wanted to make it clear that it is expected that if aligned clause mentions any variables that aren't actually used in the construct that the compiler just ignores those clauses.

Right, it is intentional. Yes, if a variable is not used in the simd loop, the compiler ignores it.

