Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6, ABataev
Differential Revision: https://reviews.llvm.org/D102107
When F calls G calls H, G is nounwind, and G is inlined into F, then the
inlined call-site to H should be effectively nounwind so as not to lose
information during inlining.
If H itself is nounwind (which often happens when H is an intrinsic), we
no longer mark the callsite explicitly as nounwind. Previously, there
were cases where the inlined call-site of H differs from a pre-existing
call-site of H in F *only* in the explicitly added nounwind attribute,
thus preventing common subexpression elimination.
v2:
- just check CI->doesNotThrow
v3 (resubmit after revert at 3443788087):
- update Clang tests
Differential Revision: https://reviews.llvm.org/D129860
Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.6.
So, this
```
for (int i=0; i<N; i++) {
res = res+i
}
```
will become
```
#pragma omp task in_reduction(+:res) if(0)
#pragma omp target map(res)
for (int i=0; i<N; i++) {
res = res+i
}
```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D125669