Commit Graph

1710 Commits

Author SHA1 Message Date
Jennifer Yu c274b19866 Add implicit map for a list item that appears in a reduction clause.
A new rule is added in 5.0:
If a list item appears in a reduction, lastprivate or linear clause
on a combined target construct then it is treated as if it also appears
in a map clause with a map-type of tofrom.

Currently, map clauses are added implicitly for all captured variables,
but not for list items that are array-element or array-section
expressions.

This change adds an implicit map clause for array elements and array
sections used in a reduction clause. The map clause is skipped if the
expression is not mappable.
Note: linear and lastprivate accept only variable names, so their maps
are already added through the captured variables.

To do so:
During the mappable check, if an error is found, suppress the diagnostic
and skip adding the implicit map clause.

The changes:
1> Add code to generate the implicit map in ActOnOpenMPExecutableDirective
   for OpenMP 5.0 and up.
2> Add an extra default parameter, NoDiagnose, to ActOnOpenMPMapClause.
It is used to suppress errors and to skip adding the implicit map during
the mappable check.

Note: only two places need to check NoDiagnose. In the rest, either the
check applies only to OpenMP < 5.0 or the error has already been
generated for the reduction clause.

Differential Revision: https://reviews.llvm.org/D108132
2021-08-19 12:53:47 -07:00
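To illustrate the case commit c274b19866 addresses, here is a minimal sketch of a combined target construct whose reduction list item is an array section rather than a plain variable. The names `EvenOddSums` and `sum_even_odd` are hypothetical and are not taken from the patch or its tests. Under the OpenMP 5.0 rule quoted in the commit message, `sums[0:2]` would be treated as if it also appeared in a map clause with map-type tofrom.

```cpp
#include <cassert>

// Hypothetical result type for this sketch.
struct EvenOddSums {
  int even;
  int odd;
};

// Sums the even-indexed and odd-indexed elements of `data` separately.
EvenOddSums sum_even_odd(const int *data, int n) {
  assert(n >= 0);
  int sums[2] = {0, 0};
  // The reduction list item is an array section. No explicit map clause is
  // written for `sums`; the 5.0 rule implies map(tofrom: sums[0:2]).
#pragma omp target parallel for reduction(+ : sums[0:2]) map(to : data[0:n])
  for (int i = 0; i < n; ++i)
    sums[i % 2] += data[i];
  return {sums[0], sums[1]};
}
```

When compiled without OpenMP support the pragma is ignored and the loop runs serially on the host, producing the same sums.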
Jon Chesterfield 21d91a8ef3 [libomptarget][devicertl] Replace lanemask with uint64 at interface
Use uint64_t for lanemask on all GPU architectures at the interface
with clang. Updates tests. The deviceRTL is always linked as IR so the zext
and trunc introduced for wave32 architectures will fold after inlining.

Simplification partly motivated by amdgpu gfx10 which will be wave32 and
is awkward to express in the current arch-dependent typedef interface.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108317
2021-08-18 20:47:33 +01:00
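The widening and narrowing that commit 21d91a8ef3 describes can be sketched in plain C++. This is only an illustration of the zext/trunc pair at a uniform 64-bit interface; `NativeMask32`, `to_interface_mask` and `from_interface_mask` are hypothetical names and do not appear in the deviceRTL.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the native lane mask of a wave32 architecture.
using NativeMask32 = std::uint32_t;

// Widen a native 32-bit mask to the uniform interface type (a zero extension).
std::uint64_t to_interface_mask(NativeMask32 native) {
  return static_cast<std::uint64_t>(native);
}

// Narrow an interface mask back to the native width (a truncation).
// For a wave32 target the upper 32 bits are expected to be zero.
NativeMask32 from_interface_mask(std::uint64_t mask) {
  assert((mask >> 32) == 0 && "wave32 masks only use the low 32 lanes");
  return static_cast<NativeMask32>(mask);
}
```

A round trip through both helpers returns the original mask, which is why such a pair can fold away once both sides are inlined.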
Roger Ferrer Ibanez bfb77364d0 [OpenMP] Fix accidental reuse of VLA size
We were using an OpaqueValueExpr allocated on the stack to store
the size of a VLA. Because the VLASizeMap in CodeGenFunction
uses the address of the expression to avoid recomputing VLAs,
we were accidentally reusing an earlier llvm::Value. This led to
invalid LLVM IR.

This is a temporary solution until VLASizeMap can be pushed and popped
based on the context.

Differential Revision: https://reviews.llvm.org/D107666
2021-08-07 05:55:27 +00:00
Joseph Huber 41a6b50c25 [OpenMP]Fix PR51349: Remove AlwaysInline for if regions.
After D94315 we add the `NoInline` attribute to the outlined function to handle
data environments in the OpenMP if clause. This conflicted with the `AlwaysInline`
attribute added to the outlined function for better performance in D106799.
The data environments should ideally not require NoInline, but for now this
fixes PR51349.

Reviewed By: mikerice

Differential Revision: https://reviews.llvm.org/D107649
2021-08-06 17:53:04 -04:00
Jennifer Yu 6b0f35931a Fix signal during the call to checkOpenMPLoop.
The root problem is that a null pointer is accessed during the call to
checkOpenMPLoop, because the loop upper-bound expression is an error
expression: an error diagnostic was emitted earlier.

To fix this, setLCDeclAndLB, setUB and setStep now return true instead
of false when LB, UB or Step contains an error, so that checking stops
in checkOpenMPLoop.

Differential Revision: https://reviews.llvm.org/D107385
2021-08-05 08:59:35 -07:00
Aaron Ballman 530ea28fef Correct a lot of diagnostic wordings for the driver
Clang diagnostics should not start with a capital letter or use
trailing punctuation (https://clang.llvm.org/docs/InternalsManual.html#the-format-string),
but quite a few driver diagnostics were not following this advice. This
corrects the grammar and punctuation to improve consistency, but does
not change the circumstances under which the diagnostics are produced.
2021-08-05 07:04:55 -04:00
Jennifer Yu 656d022331 Stop emitting an incomplete type error for a variable in a map clause
where it should not be emitted.

Currently we use QTy->isIncompleteType(&ND) to check for an incomplete
type. Before doing that, however, we need to instantiate a class template
specialization, a class member of a class template specialization,
or an array with known size of such..., so that we know the type really
is incomplete.

To fix this, use RequireCompleteType instead.

The new test is added to "test/OpenMP/target_update_messages.cpp".

The difference when using RequireCompleteType is that, when an incomplete
type error is emitted, an additional note is also emitted pointing to where
the incomplete type is declared. Because of this change, many tests had to
be fixed by adding the additional note.

This is to fix https://bugs.llvm.org/show_bug.cgi?id=50508

Differential Revision: https://reviews.llvm.org/D107200
2021-08-03 10:51:32 -07:00
Chirag Khandelwal 77ebfba68b [Flang][Openmp] Upgrade TASKGROUP construct to 5.0.
In the OpenMP 5.0 specification, a clause-list with
* task_reduction
* allocate
is allowed on the taskgroup construct.

Fix XFAIL - omp-taskloop01.f90.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D93373
2021-08-03 10:27:47 +05:30
Eli Friedman 2a2847823f [ConstantFold] Get rid of special cases for sizeof etc.
Target-dependent constant folding will fold these down to simple
constants (or at least, expressions that don't involve a GEP).  We don't
need heroics to try to optimize the form of the expression before that
happens.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 .

Differential Revision: https://reviews.llvm.org/D107116
2021-07-31 13:20:47 -07:00
Jose M Monsalve Diaz 0276db1416 [OpenMP] Creating the `omp_target_num_teams` and `omp_target_thread_limit` attributes to outlined functions
The device runtime contains several calls to __kmpc_get_hardware_num_threads_in_block
and __kmpc_get_hardware_num_blocks. If the thread_limit and the num_teams are constant,
these calls can be folded to the constant value.

The optimization phase was added in D106033. This commit adds the grid-size attributes to
the outlined function. The two attributes are `omp_target_num_teams` and
`omp_target_thread_limit`. These values are added as long as they are constant.

Two functions are created: `getNumThreadsExprForTargetDirective` and
`getNumTeamsExprForTargetDirective`. The original functions `emitNumTeamsForTargetDirective`
and `emitNumThreadsForTargetDirective` identify the expression and emit the code.
However, for the device version of the outlined function, we cannot emit anything.
Therefore, this is a first attempt to separate emission of code from deduction of the
values.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106298
2021-07-27 17:21:04 -04:00
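For context on commit 0276db1416, the sketch below shows a target region whose `num_teams` and `thread_limit` arguments are compile-time constants, which is the situation in which the commit says the two attributes are added. The function name `fill_on_device` and the values 8 and 128 are arbitrary choices for this example.

```cpp
#include <cassert>

// Writes `value` into every element of `out`.
// Both clause arguments are constants, so a compiler with this change could
// record them on the outlined function as omp_target_num_teams and
// omp_target_thread_limit.
void fill_on_device(int *out, int n, int value) {
  assert(n >= 0);
#pragma omp target teams distribute parallel for num_teams(8) \
    thread_limit(128) map(from : out[0:n])
  for (int i = 0; i < n; ++i)
    out[i] = value;
}
```

If either argument were a runtime value instead, the commit message implies no attribute would be attached for it.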
Joseph Huber af000197c4 [OpenMP] Always inline the OpenMP outlined function
This patch adds the always inline attribute to the outlined functions generated
by OpenMP regions. Because there is only a single instance of this function and
it always has internal linkage it is safe to inline in every instance it is
created. This could potentially lead to performance degradation due to
inflated register counts in the parallel region.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106799
2021-07-26 17:27:59 -04:00
Shilei Tian 3274cdc83e [Clang][OpenMP] Remove the mandatory flush for capture for OpenMP 5.1
In OpenMP 5.1:
> If the `write` or `update` clause is specified, the atomic operation is not an atomic conditional update for which the comparison fails, and the effective memory ordering is `release`, `acq_rel`, or `seq_cst`, the strong flush on entry to the atomic operation is also a release flush. If the `read` or `update` clause is specified and the effective memory ordering is `acquire`, `acq_rel`, or `seq_cst` then the strong flush on exit from the atomic operation is also an acquire flush.

In OpenMP 5.0:
> If the `write`, `update`, or **`capture`** clause is specified and the `release`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on entry to the atomic operation is also a release flush. If the `read` or `capture` clause is specified and the `acquire`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on exit from the atomic operation is also an acquire flush.

From my understanding, in OpenMP 5.1, `capture` is removed from the requirement for flush, therefore we don't have to enforce it.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100768
2021-07-26 11:00:44 -04:00
Alexey Bataev b88a68c45e [OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missing arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
function calls.

Differential Revision: https://reviews.llvm.org/D106542
2021-07-22 08:44:37 -07:00
Alexey Bataev f828f0a90f Revert "[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments."
This reverts commit b455f7f225 to fix
buildbots.
2021-07-22 08:06:29 -07:00
Alexey Bataev b455f7f225 [OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missing arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
function calls.

Differential Revision: https://reviews.llvm.org/D106542
2021-07-22 07:53:37 -07:00
Joseph Huber 754eb1c210 [OpenMP] Change `__kmpc_free_shared` to include the paired allocation size
This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106496
2021-07-21 20:56:21 -04:00
Giorgis Georgakoudis fb0cf01795 Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit e9c7291cb2.

Fix failing tests
2021-07-19 07:54:26 -07:00
Giorgis Georgakoudis e9c7291cb2 [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must be cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102107
2021-07-16 23:27:44 -07:00
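The idea behind commit e9c7291cb2 can be sketched without the real runtime. In the sketch below the captured variables travel in one struct, so the fork entry point has a fixed, non-variadic signature. Everything here is hypothetical: `Captures`, `outlined_body`, `fake_fork_call` and `scale_ids` are illustrative names, and `fake_fork_call` simply runs the body serially instead of calling the OpenMP runtime.

```cpp
#include <cassert>

// Hypothetical aggregate holding every captured variable of one region.
struct Captures {
  int *out;
  int scale;
  int count;
};

// Outlined region body: receives one pointer to the aggregate instead of
// one parameter per captured variable.
void outlined_body(int thread_id, void *aggregate) {
  assert(aggregate != nullptr);
  auto *c = static_cast<Captures *>(aggregate);
  if (thread_id < c->count)
    c->out[thread_id] = thread_id * c->scale;
}

// Stand-in for a fork call with a fixed signature. It is not the runtime's
// real entry point; it just invokes the body once per "thread".
void fake_fork_call(void (*body)(int, void *), void *aggregate,
                    int num_threads) {
  for (int tid = 0; tid < num_threads; ++tid)
    body(tid, aggregate);
}

// Packs the captures into the aggregate and forks.
void scale_ids(int *out, int count, int scale) {
  Captures captures{out, scale, count};
  fake_fork_call(outlined_body, &captures, count);
}
```

Because only one pointer is forwarded, the number of captured variables no longer affects the fork signature, which is the property the commit message argues for.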
Joseph Huber 2c31d5ebfb [OpenMP] Add IDs to OpenMP remarks
This patch adds unique identifiers to the existing OpenMP remarks. This makes
it easier to identify the corresponding documentation for each remark that will
be hosted on the OpenMP webpage.

Depends on D105898

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D105939
2021-07-16 14:07:03 -04:00
Joseph Huber eef6601b0f [OpenMP] Rework OpenMP remarks
This patch rewrites and reworks a few of the existing remarks to make them more
concise and consistent prior to writing the documentation for them.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D105898
2021-07-16 14:07:00 -04:00
Aaron Ballman de59f56440 [OpenMP] Support OpenMP 5.1 attributes
OpenMP 5.1 added support for writing OpenMP directives using [[]]
syntax in addition to using #pragma and this introduces support for the
new syntax.

In OpenMP, the attributes take one of two forms:
[[omp::directive(...)]] or [[omp::sequence(...)]]. A directive
attribute contains an OpenMP directive clause that is identical to the
analogous #pragma syntax. A sequence attribute can contain either
sequence or directive arguments and is used to ensure that the
attributes are processed sequentially for situations where the order of
the attributes matters (remember:
https://eel.is/c++draft/dcl.attr.grammar#4.sentence-4).

The approach taken here is somewhat novel and deserves mention. We
could refactor much of the OpenMP parsing logic to work for either
pragma annotation tokens or for attribute clauses. It would be a fair
amount of effort to share the logic for both, but it's certainly
doable. However, the semantic attribute system is not designed to
handle the arbitrarily complex arguments that OpenMP directives
contain. Adding support to thread the novel parsed information until we
can produce a semantic attribute would be considerably more effort.
What's more, existing OpenMP constructs are not (often) represented as
semantic attributes. So doing this through Attr.td would be a massive
undertaking that would likely only benefit OpenMP and comes with
additional risks. Rather than walk down that path, I am taking
advantage of the fact that the syntax of the directives within the
directive clause is identical to that of the #pragma form. Once the
parser recognizes that we're processing an OpenMP attribute, it caches
all of the directive argument tokens and then replays them as though
the user wrote a pragma. This reuses the same OpenMP parsing and
semantic logic directly, but does come with a risk if the OpenMP
committee decides to purposefully diverge their pragma and attribute
syntaxes. So, despite this being a novel approach that does token
replay, I think it's actually a better approach than trying to do this
through the declarative syntax in Attr.td.
2021-07-12 06:51:19 -04:00
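As a small illustration of the two spellings commit de59f56440 discusses, the sketch below writes the same directive once as an attribute and once as a pragma. The function names are hypothetical. The attribute argument is the directive text exactly as it would follow `#pragma omp`, which is what makes the token-replay approach described above possible.

```cpp
#include <cassert>

// Attribute form, as introduced by OpenMP 5.1.
int sum_with_attribute(const int *data, int n) {
  assert(n >= 0);
  int total = 0;
  [[omp::directive(parallel for reduction(+ : total))]]
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}

// Equivalent pragma form.
int sum_with_pragma(const int *data, int n) {
  assert(n >= 0);
  int total = 0;
#pragma omp parallel for reduction(+ : total)
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}
```

A compiler without OpenMP 5.1 attribute support is expected to ignore the unknown attribute, possibly with a warning, and run the loop serially.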
Johannes Doerfert 514c033db1 [OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.

The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D102307
2021-07-10 18:44:25 -05:00
Johannes Doerfert a706b94ea5 [OpenMP][NFCI] Re-enable two remarks tests after D101977 landed
2021-07-10 18:18:34 -05:00
Johannes Doerfert e2cfbfcc0c [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but will also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 17:53:56 -05:00
Alexey Bataev ab8989ab87 [OPENMP]Fix overlapped mapping for dereferenced pointer members.
If the base is used in a map clause and a later member expression uses
this base, the member is a pointer, and that pointer is dereferenced in
any way (subscript, array section, dereference, etc.), such components
should be considered overlapped. Otherwise size computations may be
incorrect, since we would try to map the pointee as part of the whole
struct, which is not true for pointer members.

Differential Revision: https://reviews.llvm.org/D105562
2021-07-09 12:51:26 -07:00
Alexey Bataev f57d396dca [OPENMP]Do not privatize const firstprivates in target regions.
No need to emit a private copy for firstprivate constants in target
regions, we can use the original copy instead.

Differential Revision: https://reviews.llvm.org/D105647
2021-07-08 11:55:37 -07:00
Alexey Bataev b3c80dd894 [OPENMP]Remove const firstprivate allocation as a variable in a constant space.
The current implementation is not compatible with asynchronous target
regions, so it needs to be removed.

Differential Revision: https://reviews.llvm.org/D105375
2021-07-07 05:56:48 -07:00
Alexey Bataev 3eb2158f4f [OPENMP]Fix PR50640: OpenMP target clause implicitly scaling loop bounds to uint64_t.
Need to add some conversions to suppress possible warning messages.

Differential Revision: https://reviews.llvm.org/D105187
2021-07-01 07:52:22 -07:00
Alexey Bataev d93ca4d27e Revert "[OPENMP]Fix PR50640: OpenMP target clause implicitly scaling loop bounds to uint64_t."
This reverts commit 67643f46ee to fix
unexpected diagnostic notes.
2021-07-01 06:40:19 -07:00
Alexey Bataev 67643f46ee [OPENMP]Fix PR50640: OpenMP target clause implicitly scaling loop bounds to uint64_t.
Need to add some conversions to suppress possible warning messages.

Differential Revision: https://reviews.llvm.org/D105187
2021-07-01 05:59:49 -07:00
Alexey Bataev 7fab1146e4 [OPENMP]Fix PR50929: Ignored initializer clause in user-defined reduction.
No need to try to create the default constructor for the private copy, it
will be called automatically in the initializer of the declare
reduction. Fixes the balance between constructor and destructor calls.

Differential Revision: https://reviews.llvm.org/D105143
2021-06-30 04:55:38 -07:00
Joseph Huber 9ce02ea8c9 [OpenMP] Add Module metadata for OpenMP compilation
This patch adds a module level metadata flag indicating that the module
was compiled with the `-fopenmp` flag. This will make it easier for
passes like OpenMPOpt to determine if they should be run.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102361
2021-06-25 16:34:19 -04:00
Joseph Huber 03d7e61c87 [OpenMP] Internalize functions in OpenMPOpt to improve IPO passes
Summary:
Currently the attributor needs to give up if a function has external linkage.
This means that the optimization introduced in D97818 will only apply to static
functions. This change uses the Attributor to internalize OpenMP device
routines by making a copy of each function with private linkage and replacing
the uses in the module with it. This allows for the optimization to be applied
to any regular function.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102824
2021-06-22 12:38:10 -04:00
Joseph Huber 68d133a3e8 [OpenMP] Simplify GPU memory globalization
Summary:
Memory globalization is required to maintain OpenMP standard semantics for data sharing between
worker and master threads. The GPU cannot share data between its threads so must allocate global or
shared memory to store the data in. Currently this is implemented fully in the frontend using the
`__kmpc_data_sharing_push_stack` and `__kmpc_data_sharing_pop_stack` functions to emulate standard
CPU stack sharing. The front-end scans the target region for variables that escape the region and
must be shared between the threads. Each variable then has a field created for it in a global record
type.

This patch replaces this functionality with a single allocation command, effectively mimicking an
alloca instruction for the variables that must be shared between the threads. This will be much
slower than the current solution, but makes it much easier to optimize as we can analyze each
variable independently and determine if it is not captured. In the future, we can replace these
calls with an `alloca` and small allocations can be pushed to shared memory.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D97680
2021-06-22 10:52:46 -04:00
Graham Hunter c6a91ee6aa [Clang][OpenMP] Monotonic does not apply to SIMD
The codegen for simd constructs was affected by the presence (or
absence) of the 'monotonic' schedule modifier for worksharing
loops. The modifier is only intended to apply to the scheduling of
chunks for a thread, not iterations of a loop inside a chunk.

In addition, the monotonic modifier was applied to worksharing loops
by default if no schedule clause was present; the referenced part of
the OpenMP 4.5 spec in the code (section 2.7.1) only applies if the
user specified a schedule clause with a static kind but no modifier.
Without a user-specified schedule clause we should default to
nonmonotonic scheduling.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D103793
2021-06-22 10:24:11 +01:00
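A loop of the kind commit c6a91ee6aa is concerned with might look like the sketch below: a worksharing loop combined with simd and an explicit `monotonic` schedule modifier. The function name `scale_in_place` and the chunk size 64 are arbitrary choices for this example. According to the commit message, the modifier governs how chunks are handed to a thread and should not affect how iterations inside a chunk are vectorized.

```cpp
#include <cassert>

// Multiplies every element of `values` by `factor`.
void scale_in_place(float *values, int n, float factor) {
  assert(n >= 0);
  // `monotonic` applies to the scheduling of chunks for a thread, not to
  // the iterations inside a chunk that the simd part may vectorize.
#pragma omp parallel for simd schedule(monotonic : static, 64)
  for (int i = 0; i < n; ++i)
    values[i] *= factor;
}
```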
Alexey Bataev 45ae766e78 [OPENMP]Fix PR50699: capture locals in combined directives for aligned clause.
Need to capture locals in aligned clauses for the combined directives to
fix the crash in the codegen.

Differential Revision: https://reviews.llvm.org/D104258
2021-06-15 04:58:02 -07:00
Alexey Bataev 4e15560879 [OPENMP][C++20]Add support for CXXRewrittenBinaryOperator in ranged for loops.
Added support for CXXRewrittenBinaryOperator as a condition in ranged
for loops. This is a new kind of expression, so support needs to be
extended for C++20 constructs.
It fixes PR49970: range-based for compilation fails for libstdc++ vector
with -std=c++20.

Differential Revision: https://reviews.llvm.org/D104240
2021-06-14 11:50:27 -07:00
Alexey Bataev 44f197e94b [OpenMP] Fix C-only clang assert on parsing use_allocator clause of target directive
The parser code assumes it is building with a C++ compiler and asserts when using clang (not clang++) on a C file. I made the code dependent on the input language. This shows up for the amdgpu target.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D103899
2021-06-14 10:36:27 -07:00
Zahira Ammarguellat 150f7cedfb Referencing a static function defined in an OpenMP clause
generates an erroneous warning.

See here: https://godbolt.org/z/ajKPc36M7
2021-06-11 06:56:01 -07:00
Michael Kruse a22236120f [OpenMP] Implement '#pragma omp unroll'.
Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations).

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99459
2021-06-10 14:30:17 -05:00
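The two situations commit a22236120f distinguishes can be sketched as follows. In the first function the unrolled loop stands alone, so unroll metadata could suffice. In the second, the partially unrolled loop is consumed by another directive, which is the case where the commit message says an AST representation of the outer loop is needed. The function names and the unroll factor 4 are arbitrary choices for this example.

```cpp
#include <cassert>

// Stand-alone partial unroll.
int sum_unrolled(const int *data, int n) {
  assert(n >= 0);
  int total = 0;
#pragma omp unroll partial(4)
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}

// The generated (partially unrolled) loop is associated with another
// directive, which needs to know its iteration count.
int sum_unrolled_parallel(const int *data, int n) {
  assert(n >= 0);
  int total = 0;
#pragma omp parallel for reduction(+ : total)
#pragma omp unroll partial(4)
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}
```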
Joseph Huber 0c32ffceed [OpenMP] Add type to firstprivate symbol for const firstprivate values
Clang will create a global value put in constant memory if an aggregate value
is declared firstprivate in the target device. The symbol name only uses the
name of the firstprivate variable, so symbol name conflicts will occur if the
variable is allowed to have different types through templates. An example of
this behaviour is shown in https://godbolt.org/z/EsMjYh47n. This patch adds the
mangled type name to the symbol to avoid such naming conflicts. This fixes
https://bugs.llvm.org/show_bug.cgi?id=50642.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D103995
2021-06-10 09:02:20 -04:00
AndreyChurbanov 9ce2e5e700 Revert "[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type"
This reverts commit a1f550e052.

Revert in order to fix backwards compatibility breakage
caused by type size change for task dependence flag.
2021-06-09 17:38:38 +03:00
AndreyChurbanov a1f550e052 [OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type
Refactored the dependence-processing code and added the new inoutset dependence type.
The compiler can set the dependence flag to 0x8 when calling __kmpc_omp_task_with_deps.
The size of the dependence flag type changed from 1 to 4 bytes in clang.
All dependence flags the library receives so far, with the corresponding dependence types:
1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET.

Differential Revision: https://reviews.llvm.org/D97085
2021-06-07 21:42:51 +03:00
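The flag values listed in commit a1f550e052 can be written down as a small enumeration. This is only a sketch of the numbering given in the commit message: the type `DepFlag` and the helper `dep_flag_name` are hypothetical names, not the runtime's actual declarations, and the 32-bit underlying type mirrors the 4-byte size the message mentions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical enumeration mirroring the values from the commit message.
enum class DepFlag : std::uint32_t {
  In = 1,
  Out = 2,
  InOut = 3,
  MutexInOutSet = 4,
  InOutSet = 8
};

static_assert(sizeof(DepFlag) == 4, "flag is modelled as 4 bytes wide");

// Maps a raw flag value to the dependence type name used in the message.
std::string dep_flag_name(std::uint32_t raw) {
  switch (static_cast<DepFlag>(raw)) {
  case DepFlag::In:
    return "IN";
  case DepFlag::Out:
    return "OUT";
  case DepFlag::InOut:
    return "INOUT";
  case DepFlag::MutexInOutSet:
    return "MUTEXINOUTSET";
  case DepFlag::InOutSet:
    return "INOUTSET";
  default:
    return "UNKNOWN";
  }
}
```

Widening the flag type is the change that the later revert (9ce2e5e700) cites as the cause of the backwards compatibility breakage.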
Alexey Bataev c84a5448b5 [OPENMP]Fix PR50129: omp cancel parallel not working as expected.
Need to emit a call for __kmpc_cancel_barrier in the exit block for
__kmpc_cancel function call if cancellation of the parallel block is
requested.

Differential Revision: https://reviews.llvm.org/D103646
2021-06-04 08:24:55 -07:00
Alexey Bataev 827b5c2154 [OPENMP]Fix PR49790: Constexpr values not handled in `omp declare mapper` clause.
The patch allows constexpr variables that evaluate to a constant value to be
used in the declare mapper construct.

Differential Revision: https://reviews.llvm.org/D103642
2021-06-04 07:32:14 -07:00
Michael Kruse 64e5a3bbdd [clang] Fix fail of OpenMP/tile_codegen_tile_for.cpp.
Clang's version string can be customized using CLANG_VENDOR which the
test did not consider. Change the test to accept any version string.
2021-06-02 21:02:05 -05:00
Michael Kruse 07a6beb402 [Clang][OpenMP] Emit dependent PreInits before directive.
The PreInits of a loop transformation (at the moment only tile) include the computation of the trip count. The trip count is needed by any loop-associated directive that consumes the transformation-generated loop. Hence, we must ensure that the PreInits of consumed loop transformations are emitted with the consuming directive.

This is done by adding the inner loop transformation's PreInits to the outer loop-directive's PreInits. The outer loop-directive will consume the de-sugared AST such that the inner PreInits are not emitted twice. The PreInits of a loop transformation are still emitted directly if its generated loop(s) are not associated with another loop-associated directive.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D102180
2021-06-02 16:59:35 -05:00
Johannes Doerfert 6ff380f439 [OpenMP][NFC] Remove SIMD check lines for non-simd tests
If a test does not contain " simd" but has -fopenmp-simd RUN lines, we can
just check that we do not create __kmpc|__tgt calls.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D101973
2021-05-19 21:35:33 -05:00
Fangrui Song 37561ba89b -fno-semantic-interposition: Don't set dso_local on GlobalVariable
`clang -fpic -fno-semantic-interposition` may set dso_local on variables for -fpic.

GCC folks consider that there are 'address interposition' and 'semantic interposition',
and 'disabling semantic interposition' can optimize function calls but
cannot change variable references to use local aliases
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483).

This patch removes dso_local for variables in
`clang -fpic -fno-semantic-interposition` mode so that the built shared objects can
work with copy relocations. Building llvm-project itself with
-fno-semantic-interposition (D102453) should now be safe with trunk Clang.

Example:
```
// a.c
int var;
long load() { return var; }

// old: cannot be interposed
movslq  .Lvar$local(%rip), %rax
// new: can be interposed
movq    var@GOTPCREL(%rip), %rax
movslq  (%rax), %rax
```

The local alias lowering for `GlobalVariable`s is kept in case there is a
future option allowing local aliases.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D102583
2021-05-19 16:08:28 -07:00
Joseph Huber 2db182ff8d [Diagnostics] Allow emitting analysis and missed remarks on functions
Summary:
Currently, only `OptimizationRemark` can be emitted using a Function.
Add constructors to allow this for `OptimizationRemarkAnalysis` and
`OptimizationRemarkMissed` as well.

Reviewed By: jdoerfert thegameg

Differential Revision: https://reviews.llvm.org/D102784
2021-05-19 15:10:20 -04:00