Commit Graph

14730 Commits

Author SHA1 Message Date
Masoud Ataei b0f68791f0 [clang] Option control afn flag
Clang option to set/unset afn fast-math flag.

 Differential: https://reviews.llvm.org/D106191
 Reviewed with: aaron.ballman, erichkeane, and others
2021-10-08 14:26:14 -04:00
Keith Smiley 68e49aea9a Revert "[clang] Fix absolute file paths with -fdebug-prefix-map"
This reverts commit a23a596793.

This broke a windows test https://buildkite.com/llvm-project/premerge-checks/builds/59492#7dad207c-6cbe-40ad-95e4-c48b47fe2527

Differential Revision: https://reviews.llvm.org/D111444
2021-10-08 10:39:44 -07:00
Keith Smiley a23a596793 [clang] Fix absolute file paths with -fdebug-prefix-map
Previously if you passed an absolute path to clang, where only part of
the path to the file was remapped, it would result in the file's DIFile
being stored with a duplicate path, for example:

```
!DIFile(filename: "./ios/Sources/bar.c", directory: "./ios/Sources")
```

This change handles absolute paths, specifically in the case they are
remapped to something relative, and uses the dirname for the directory,
and basename for the filename.

This also adds a test verifying this behavior for more standard uses as
well.

Differential Revision: https://reviews.llvm.org/D111352
2021-10-08 10:35:17 -07:00
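The dirname/basename split described in a23a596793 (since reverted by 68e49aea9a) can be illustrated outside of Clang. This is a minimal Python sketch, not the code from the patch: `remap_prefix`, `split_for_difile`, and the example paths are hypothetical names chosen for illustration, and prefix matching is assumed to be a plain string comparison.

```python
import posixpath


def remap_prefix(path, prefix_map):
    """Apply the first matching (old, new) prefix pair; hypothetical helper."""
    for old, new in prefix_map:
        if path.startswith(old):
            return new + path[len(old):]
    return path


def split_for_difile(abs_path, prefix_map):
    """Return (filename, directory) for an absolute path after remapping.

    The directory is the dirname of the remapped path and the filename is
    its basename, so the directory is not repeated inside the filename.
    """
    remapped = remap_prefix(abs_path, prefix_map)
    return posixpath.basename(remapped), posixpath.dirname(remapped)


filename, directory = split_for_difile(
    "/Users/me/proj/ios/Sources/bar.c", [("/Users/me/proj", ".")]
)
print(filename, directory)
```

With the assumed mapping of `/Users/me/proj` to `.`, the filename is `bar.c` and the directory is `./ios/Sources`, rather than the duplicated form shown in the commit message.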
John McCall 5ab6ee7599 Fix a variety of bugs with nil-receiver checks when targeting
non-Darwin ObjC runtimes:

- Use the same logic the Darwin runtime does for inferring that a
  receiver is non-null and therefore doesn't require null checks.
  Previously we weren't skipping these for non-super dispatch.

- Emit a null check when there's a consumed parameter so that we can
  destroy the argument if the call doesn't happen.  This mostly
  involves extracting some common logic from the Darwin-runtime code.

- Generate a zero aggregate by zeroing the same memory that was used
  in the method call instead of zeroing separate memory and then
  merging them with a phi.  This uses less memory and avoids unnecessary
  copies.

- Emit zero initialization, and generate zero values in phis, using
  the proper zero-value routines instead of assuming that the zero
  value of the result type has a bitwise-zero representation.
2021-10-08 05:44:06 -04:00
Wang, Pengfei c0f9c7c015 [X86] Check if struct is blank before getting the inner types
This fixes pr52011.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D111037
2021-10-08 17:09:34 +08:00
Joseph Huber 9efdca87c7 [OpenMP] Introduce new flags to assert thread and team usage in the runtime
This patch adds two flags to be supported for the new runtime. The flags
are `-fopenmp-assume-threads-oversubscription` and
`-fopenmp-assume-teams-oversubscription`. These add global values that
can be checked by the work sharing runtime functions to make better
judgements about how to distribute work between the threads.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D111348
2021-10-07 22:23:09 -04:00
Itay Bookstein 40ec1c0f16 [IR][NFC] Rename getBaseObject to getAliaseeObject
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
2021-10-06 19:33:10 -07:00
David Blaikie f6a561c4d6 DebugInfo: Use clang's preferred names for integer types
This reverts c7f16ab3e3 / r109694 - which
suggested this was done to improve consistency with the gdb test suite.
Possible that at the time GCC did not canonicalize integer types, and so
matching types was important for cross-compiler validity, or that it was
only a case of over-constrained test cases that printed out/tested the
exact names of integer types.

In any case neither issue seems to exist today based on my limited
testing - both gdb and lldb canonicalize integer types (in a way that
happens to match Clang's preferred naming, incidentally) and so never
print the original text name produced in the DWARF by GCC or Clang.

This canonicalization appears to be in `integer_types_same_name_p` for
GDB and in `TypeSystemClang::GetBasicTypeEnumeration` for lldb.

(I tested this with one translation unit defining 3 variables - `long`,
`long (*)()`, and `int (*)()`, and another translation unit that had
main, and a function that took `long (*)()` as a parameter - then
compiled them with mismatched compilers (either GCC+Clang, or
Clang+(Clang with this patch applied)) and no matter the combination,
despite the debug info for one CU naming the type "long int" and the
other naming it "long", both debuggers printed out the name as "long"
and were able to correctly perform overload resolution and pass the
`long int (*)()` variable to the `long (*)()` function parameter)

Did find one hiccup, identified by the lldb test suite - that CodeView
was relying on these names to map them to builtin types in that format.
So added some handling for that in LLVM. (these could be split out into
separate patches, but seems small enough to not warrant it - will do
that if it ends up needing any reverting/revisiting)

Differential Revision: https://reviews.llvm.org/D110455
2021-10-06 16:02:34 -07:00
Jennifer Yu a4743eba3c Fix assert of "Unable to find base lambda address" from
adjustMemberOfForLambdaCaptures.

The problem happens when the user passes a lambda function with reference
type in the map clause.

The nature of the problem: when processing generateInfoForCapture,
the BasePointer is generated with a new load for a lambda variable with
reference type.  This is not expected in adjustMemberOfForLambdaCaptures.

One way to fix this is to skip the call to generateInfoForCapture for
map(to:lambda).  The map info will be generated later in the call to
generateDefaultMapInfo, similar to the firstprivate clause.

This fixes https://bugs.llvm.org/show_bug.cgi?id=52071

Differential Revision: https://reviews.llvm.org/D111115
2021-10-06 14:14:28 -07:00
Arthur Eubanks 6522b7cc32 [clang] Add option to clear AST memory before running LLVM passes
This is to save memory for Clang compiles.
Measuring building PassBuilder.cpp under /usr/bin/time, max rss goes from 0.93GB to 0.7GB.

This does not turn it on by default yet.

I've turned on the option locally and run it over a good amount of files without any issues.

For more background, see
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D111105
2021-10-06 13:42:22 -07:00
Arthur Eubanks 05392466f0 Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 13:29:23 -07:00
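The backward-compatibility rule for the alloca alignment fields in 05392466f0 can be sketched as follows. This is a hedged Python illustration, not the bitcode reader: it assumes the field stores log2(alignment) + 1 with 0 meaning "no value" (the commit message does not spell out the encoding), and `encode_align`, `fits_old_field`, and `decode_alloca_align` are hypothetical names.

```python
OLD_FIELD_BITS = 5  # width of the original alignment field, per the message
NEW_FIELD_BITS = 8  # width of the added field, per the message


def encode_align(align_bytes):
    """Encode a power-of-two alignment as log2(align) + 1 (assumed encoding)."""
    if align_bytes <= 0 or align_bytes & (align_bytes - 1):
        raise ValueError("alignment must be a positive power of two")
    return align_bytes.bit_length()  # equals log2(align) + 1 for powers of two


def fits_old_field(align_bytes):
    """True if the encoded alignment is representable in the 5-bit field."""
    return encode_align(align_bytes) < (1 << OLD_FIELD_BITS)


def decode_alloca_align(old_field, new_field):
    """Prefer the old field when it holds a value, otherwise use the new one."""
    encoded = old_field if old_field != 0 else new_field
    if encoded == 0:
        return None  # no alignment recorded in either field
    return 1 << (encoded - 1)


print(fits_old_field(1 << 30), fits_old_field(1 << 32))
print(decode_alloca_align(0, encode_align(1 << 32)))
```

Under the assumed encoding, a 5-bit field tops out at 2^30 bytes (1GB), which matches the previous maximum stated in the message, while 4GB needs the wider field.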
Arthur Eubanks 569346f274 Revert "Reland [IR] Increase max alignment to 4GB"
This reverts commit 8d64314ffe.
2021-10-06 11:38:11 -07:00
Arthur Eubanks 8d64314ffe Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 11:03:51 -07:00
Arthur Eubanks 72cf8b6044 Revert "[IR] Increase max alignment to 4GB"
This reverts commit df84c1fe78.

Breaks some bots
2021-10-06 10:21:35 -07:00
Arthur Eubanks df84c1fe78 [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 09:54:14 -07:00
Michael Kruse f37e8b0b83 [Clang][OpenMP] Infix OMPLoopTransformationDirective abstract class. NFC.
Insert OMPLoopTransformationDirective between OMPLoopBasedDirective and the loop transformations OMPTileDirective and OMPUnrollDirective. This simplifies handling of loop transformations not requiring distinguishing between OMPTileDirective and OMPUnrollDirective anymore.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D111119
2021-10-06 10:49:07 -05:00
Simon Pilgrim b9b90bb542 [clang] Replace report_fatal_error(std::string) uses with report_fatal_error(Twine)
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
2021-10-06 11:43:19 +01:00
Corentin Jabot 424733c12a Implement if consteval (P1938)
Modify the IfStmt node to support constant evaluated expressions.

Add a new ExpressionEvaluationContext::ImmediateFunctionContext to
keep track of immediate function contexts.

This proved easier/better/probably more efficient than walking the AST
backward as it allows diagnosing nested if consteval statements.
2021-10-05 08:04:14 -04:00
Arthur Eubanks 2568286892 [clang] Don't use the AST to display backend diagnostics
We keep a map from function name to source location so we don't have to
do it via looking up a source location from the AST. However, since
function names can be long, we actually use a hash of the function name
as the key.

Additionally, we can't rely on Clang's printing of function names via
the AST, so we just demangle the name instead.

This is necessary to implement
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D110665
2021-10-04 14:14:32 -07:00
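The lookup scheme described in 2568286892 (a map keyed by a hash of the function name rather than by the name itself) can be sketched like this. It is an illustrative Python model only: the hash function actually used by Clang is not stated in the message, so BLAKE2b truncated to 64 bits stands in for it, and `name_key` and `LocationTable` are hypothetical names.

```python
import hashlib


def name_key(mangled_name):
    """Map a (possibly very long) function name to a fixed-size 64-bit key."""
    digest = hashlib.blake2b(mangled_name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class LocationTable:
    """Source locations keyed by the hash of the function name."""

    def __init__(self):
        self._locations = {}

    def record(self, mangled_name, file, line):
        # Only the 64-bit key is stored, not the name itself.
        self._locations[name_key(mangled_name)] = (file, line)

    def lookup(self, mangled_name):
        # Returns None when no location was recorded for this name.
        return self._locations.get(name_key(mangled_name))


table = LocationTable()
table.record("_Z3foov", "a.cpp", 12)
print(table.lookup("_Z3foov"))
```

The trade-off in this sketch is the usual one for hashed keys: memory use no longer depends on name length, at the cost of a small collision probability.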
serge-sans-paille 0f0e31cf51 Update inline builtin handling to honor gnu inline attribute
Per the GCC info page:

    If the function is declared 'extern', then this definition of the
    function is used only for inlining.  In no case is the function
    compiled as a standalone function, not even if you take its address
    explicitly.  Such an address becomes an external reference, as if
    you had only declared the function, and had not defined it.

Respect that behavior for inline builtins: keep the original definition, and
generate a copy of the declaration suffixed by '.inline' that's only referenced
in direct call.

This fixes holes in c3717b6858.

Differential Revision: https://reviews.llvm.org/D111009
2021-10-04 22:26:25 +02:00
Alexey Bataev bfc8f9e9b0 [clang] Fix computation of number of dependencies using OpenMP iterator,
by Raul Penacoba.

The size of kmp_depend_info and the number of dependencies are computed by multiplying the iterator sizes, which is not right.
Now size is computed as:

itersize1*numclausedeps1 + itersize2*numclausedeps2 + ... + itersizeN*numclausedepsN

where itersizeX is the size of the iterator and numclausedepsX the number of dependencies in that depend clause.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D111045
2021-10-04 07:06:51 -07:00
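The corrected size formula from bfc8f9e9b0 is a sum of per-clause products. A minimal Python sketch of that arithmetic, with `total_dependencies` and the sample numbers as hypothetical illustrations rather than anything taken from the patch:

```python
def total_dependencies(clauses):
    """Sum itersize * numclausedeps over all depend clauses.

    `clauses` is a list of (itersize, numclausedeps) pairs, one pair per
    depend clause that uses an iterator.
    """
    return sum(itersize * numclausedeps for itersize, numclausedeps in clauses)


# Two depend clauses: an iterator of size 4 with 2 dependencies,
# and an iterator of size 3 with 1 dependency.
print(total_dependencies([(4, 2), (3, 1)]))  # 4*2 + 3*1 = 11
```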
Stefan Pintilie 4fc2f4979c [PowerPC] Fix __builtin_ppc_load2r to return short instead of int.
This patch fixes the return value of the builtin __builtin_ppc_load2r to
correctly return short instead of int.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D110771
2021-10-04 06:17:02 -05:00
Jay Foad d933adeaca [APInt] Stop using soft-deprecated constructors and methods in clang. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in clang.

Differential Revision: https://reviews.llvm.org/D110808
2021-10-04 09:38:11 +01:00
Dávid Bolvanský b1fcca3884 Fixed warnings in LLVM produced by -Wbitwise-instead-of-logical 2021-10-03 13:04:18 +02:00
Joseph Huber d12502a3ab [OpenMP] Apply OpenMP assumptions to applicable call sites
This patch adds OpenMP assumption attributes to call sites in applicable
regions. Currently this applies the caller's assumption attributes to
any calls contained within it. So, if a call occurs inside an OpenMP
assumes region to a function outside that region, we will assume that
call respects the assumptions. This is primarily useful for inline
assembly calls used heavily in the OpenMP GPU device runtime, which
allows us to then make judgements about what the ASM will do.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110655
2021-09-29 16:08:21 -04:00
Quinn Pham 67a3d1e275 [PowerPC] swdiv builtins for XL compatibility
This patch is in a series of patches to provide builtins for compatibility with
the XL compiler. This patch implements the software divide builtin as
wrappers for a floating point divide. XL provided these builtins because it
didn't produce software estimates by default at `-Ofast`. When compiled
with `-Ofast` these builtins will produce the software estimate for divide.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D106959
2021-09-29 11:31:07 -05:00
Sven van Haastregt 4da744a20f [OpenCL] Fix as_type3 invalid store creation
With -fpreserve-vec3-type enabled, a cast was not created when
converting from a non-vec3 type to a vec3 type, even though a
conversion to vec3 was performed.  This resulted in creation of
invalid store instructions.

Differential Revision: https://reviews.llvm.org/D108470
2021-09-29 09:40:06 +01:00
Arthur Eubanks aa53785f23 Reland [clang] Rework dontcall attributes
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).

One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.

Previous revisions didn't properly declare the new dependencies.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D110364
2021-09-28 15:31:30 -07:00
Arthur Eubanks 7833d20f1f Revert "[clang] Rework dontcall attributes"
This reverts commit 2943071e2e.

Breaks bots
2021-09-28 14:49:27 -07:00
Arthur Eubanks 2943071e2e [clang] Rework dontcall attributes
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).

One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D110364
2021-09-28 14:21:10 -07:00
serge-sans-paille c3717b6858 Simplify handling of builtin with inline redefinition
(This is a recommit of 3d6f49a569 that should no longer break validation since
bd379915de).

It is a common practice in glibc header to provide an inline redefinition of an
existing function. It is especially the case for fortified function.

Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and noinline attribute.

Simplify the logic by suffixing these functions by `.inline` during codegen, so
that they are not recognized as builtin by llvm.

After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite

Differential Revision: https://reviews.llvm.org/D109967
2021-09-28 21:00:47 +02:00
Kevin Athey 0d76d4833d Revert "Simplify handling of builtin with inline redefinition"
This reverts commit 3d6f49a569.

Broke bot: https://lab.llvm.org/buildbot/#/builders/5/builds/12360
2021-09-28 11:30:37 -07:00
David Blaikie 85f612efeb DebugInfo: Use sugared function type when emitting function declarations for call sites
Otherwise we're losing type information for these functions.
2021-09-28 10:44:35 -07:00
Quinn Pham 70391b3468 [PowerPC] FP compare and test XL compat builtins.
This patch is in a series of patches to provide builtins for
compatibility with the XL compiler. This patch adds builtins for compare
exponent and test data class operations on floating point values.

Reviewed By: #powerpc, lei

Differential Revision: https://reviews.llvm.org/D109437
2021-09-28 11:01:51 -05:00
serge-sans-paille 3d6f49a569 Simplify handling of builtin with inline redefinition
It is a common practice in glibc header to provide an inline redefinition of an
existing function. It is especially the case for fortified function.

Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and noinline attribute.

Simplify the logic by suffixing these functions by `.inline` during codegen, so
that they are not recognized as builtin by llvm.

After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite

Differential Revision: https://reviews.llvm.org/D109967
2021-09-28 13:24:25 +02:00
Ahsan Saghir 593b074a09 [PowerPC] MMA - Add __builtin_vsx_build_pair and __builtin_mma_build_acc builtins
This patch adds the following built-ins:

__builtin_vsx_build_pair
__builtin_mma_build_acc

Reviewed By: #powerpc, nemanjai, lei

Differential Revision: https://reviews.llvm.org/D107647
2021-09-27 19:51:28 -05:00
Joseph Huber b4a5543624 [OpenMP] Introduce a new worksharing RTL function for distribute
This patch adds a new RTL function for worksharing. Currently we use
`__kmpc_for_static_init` for both the `distribute` and `parallel`
portion of the loop clause. This patch replaces the `distribute` portion
with a new runtime call `__kmpc_distribute_static_init`. Currently this
will be used exactly the same way, but will make it easier in the future
to fine-tune the distribute and parallel portion of the loop.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110429
2021-09-27 11:36:37 -04:00
Wang, Pengfei 7d6889964a [X86][FP16] Add more builtins to avoid multi evaluation problems & add 2 missed intrinsics
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D110336
2021-09-27 09:27:04 +08:00
David Blaikie 8d9ddd4f50 DebugInfo: STN: Handle unreconstitutable types in function types 2021-09-23 21:13:16 -07:00
David Blaikie 25ac0d3c73 DebugInfo: Implement the -gsimple-template-names functionality
This excludes certain names that can't be rebuilt from the available
DWARF:

* Atomic types - no DWARF differentiating int from atomic int.
* Vector types - enough DWARF (an attribute on the array type) to do
  this, but I haven't written the extra code to add the attributes
  required for this
* Lambdas - ambiguous with any other unnamed class
* Unnamed classes/enums - would need column info for the type in
  addition to file/line number
* noexcept function types - not encoded in DWARF
2021-09-23 19:58:32 -07:00
Thomas Lively 2f519825ba [WebAssembly] Add prototype relaxed SIMD fma/fms instructions
Add experimental clang builtins, LLVM intrinsics, and backend definitions for
the new {f32x4,f64x2}.{fma,fms} instructions in the relaxed SIMD proposal:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.
Do not allow these instructions to be selected without explicit user opt-in.

Differential Revision: https://reviews.llvm.org/D110295
2021-09-23 11:01:36 -07:00
Hongtao Yu d9b511d8e8 [CSSPGO] Set PseudoProbeInserter as a default pass.
Currently PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e., through the linker or llc), where the user always has to pass -pseudo-probe-for-profiling explicitly. I'm making the pass a default pass that requires no command line arg to trigger, but will actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110209
2021-09-22 09:09:48 -07:00
Shilei Tian ca999f7191 [OpenMP][Offloading] Use bitset to indicate execution mode instead of value
The execution mode of a kernel is stored in a global variable, whose value means:
- 0 - SPMD mode
- 1 - generic mode
- 2 - SPMD mode execution with generic mode semantics

We are going to add support for SIMD execution mode. It will come with another
execution mode, such as SIMD-generic mode. As a result, this value-based indicator
is not flexible.

This patch changes to a bitset-based solution to encode the execution mode. Each
position is:
[0] - generic mode
[1] - SPMD mode
[2] - SIMD mode (will be added later)

In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode
execution with generic mode semantics. In the future after we add the support for
SIMD mode, `0b1xx` will be in SIMD mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110029
2021-09-22 11:40:52 -04:00
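The bitset encoding from ca999f7191 can be modelled with a flag enum. This is a Python sketch under stated assumptions: the bit positions and the values `0x1`, `0x2`, `0x3` come from the commit message, while `ExecMode` and `describe` are hypothetical names and the SIMD bit is only reserved, as the message says it will be added later.

```python
from enum import IntFlag


class ExecMode(IntFlag):
    GENERIC = 1 << 0  # bit [0]
    SPMD = 1 << 1     # bit [1]
    SIMD = 1 << 2     # bit [2], reserved for the future SIMD mode


def describe(value):
    """Translate an execution-mode bitset into a human-readable label."""
    mode = ExecMode(value)
    if mode & ExecMode.SIMD:
        return "SIMD"
    if mode == ExecMode.GENERIC | ExecMode.SPMD:
        return "SPMD with generic semantics"
    if mode == ExecMode.SPMD:
        return "SPMD"
    if mode == ExecMode.GENERIC:
        return "generic"
    return "unknown"


for value in (0x1, 0x2, 0x3):
    print(hex(value), describe(value))
```

Compared with the old value-based indicator, a new mode only needs a new bit, and combined modes fall out of OR-ing existing bits.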
Florian Hahn ea21d688dc [Matrix] Emit assumption that matrix indices are valid.
The matrix extension requires the indices for matrix subscript
expressions to be valid; it is UB otherwise.

extract/insertelement produce poison if the index is invalid, which
prevents the optimizer from scalarizing load/extract pairs, for
example. This causes very suboptimal code to be generated when using
matrix subscript expressions with variable indices for large matrices.

This patch updates IRGen to emit assumes for index expressions to
convey the information that the index must be valid.

This also adjusts the order in which operations are emitted slightly, so
indices & assumes are added before the load of the matrix value.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D102478
2021-09-22 12:27:37 +01:00
David Blaikie 2ff049b12e DebugInfo: Don't use preferred template names in debug info
Using the preferred name creates a mismatch between the textual name of
a type and the DWARF tags describing the parameters as well as possible
inconsistency between DWARF producers (like Clang and GCC, or
older/newer Clang versions, etc).
2021-09-21 20:08:16 -07:00
David Blaikie db6f1e8a88 DebugInfo: Don't suppress inline namespaces when printing template template parameter names 2021-09-21 19:30:13 -07:00
David Blaikie d31dfc3011 DebugInfo: Unify some printing policy adjustments 2021-09-21 19:30:12 -07:00
Giorgis Georgakoudis ac90dfc43a Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 1d66649adf.

Revert to fix AMD GPU issue.
2021-09-21 13:20:39 -07:00
Matheus Izvekov d9308aa39b [clang] don't mark as Elidable CXXConstruct expressions used in NRVO
See PR51862.

The consumers of the Elidable flag in CXXConstructExpr assume that
an elidable construction just goes through a single copy/move construction,
so that the source object is immediately passed as an argument and is the same
type as the parameter itself.

With the implementation of P2266 and after some adjustments to the
implementation of P1825, we started (correctly, as per standard)
allowing more cases where the copy initialization goes through
user defined conversions.

With this patch we stop using this flag in NRVO contexts, to preserve code
that relies on that assumption.
This causes no known functional changes; we just stop firing some asserts
in a couple of included test cases.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D109800
2021-09-21 21:41:20 +02:00
Giorgis Georgakoudis 1d66649adf [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6

Differential Revision: https://reviews.llvm.org/D102107
2021-09-21 10:50:04 -07:00
Wang, Pengfei 227673398c [X86] Always check the size of SourceTy before getting the next type
D109607 results in a regression in llvm-test-suite.
The reason is that we didn't check the size of SourceTy, so we would
return the wrong SSE type when SourceTy is overlapped.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D110037
2021-09-20 23:34:19 +08:00
alokmishra.besu 000875c127 OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function returns the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in an OpenMP 5.1 implementation to select all the valid when clauses which can be resolved at runtime. Currently this array is set to null by default and its implementation is left for the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-18 13:40:44 -05:00
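The selection rule described for getBestWhenMatchForContext (pick the applicable when clause with the highest score) can be sketched as below. This is a hedged Python illustration and does not reproduce the real signature in llvm/Frontend/OpenMP/OMPContext.h: `best_when_match`, its parameters, the tie-breaking rule (first clause wins), and the `None` result for "no applicable clause" are all assumptions.

```python
def best_when_match(scores, applicable):
    """Return the index of the highest-scoring applicable when clause.

    `scores[i]` is the score of clause i and `applicable[i]` says whether
    clause i applies in the current context. Ties keep the earlier clause
    (an assumption). Returns None when no clause is applicable.
    """
    best = None
    for index, (score, is_applicable) in enumerate(zip(scores, applicable)):
        if is_applicable and (best is None or score > scores[best]):
            best = index
    return best


# Clause 1 has the highest score but is not applicable, so clause 2 wins.
print(best_when_match([1, 5, 3], [True, False, True]))
```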
Nico Weber 31cca21565 Revert "OpenMP 5.0 metadirective"
This reverts commit c7d7b98e52.
Breaks tests on macOS, see comment on https://reviews.llvm.org/D91944
2021-09-18 09:10:37 -04:00
Adrian Prantl 843390c58a Apply proper source location to fallthrough switch cases.
This fixes a bug in clang where, when clang sees a switch with a
fallthrough to a default like this:

    static void funcA(void) {}
    static void funcB(void) {}

    int main(int argc, char **argv) {
      switch (argc) {
      case 0:
        funcA();
        break;
      case 10:
      default:
        funcB();
        break;
      }
    }

It does not add a proper debug location for that switch case, such as
case 10: above.

Patch by Shubham Rastogi!

Differential Revision: https://reviews.llvm.org/D109940
2021-09-17 14:45:04 -07:00
cchen 9ff848c5cd Revert "[OpenMP] Use irbuilder as default for masked and master construct"
This reverts commit 2908fc0d3f.
2021-09-17 16:44:09 -05:00
alokmishra.besu 347f3c186d OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function returns the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in an OpenMP 5.1 implementation to select all the valid when clauses which can be resolved at runtime. Currently this array is set to null by default and its implementation is left for the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-17 16:30:06 -05:00
cchen 7efb825382 Revert "OpenMP 5.0 metadirective"
This reverts commit c7d7b98e52.
2021-09-17 16:14:16 -05:00
cchen c7d7b98e52 OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function returns the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in an OpenMP 5.1 implementation to select all the valid when clauses which can be resolved at runtime. Currently this array is set to null by default and its implementation is left for the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-17 16:03:13 -05:00
cchen 2908fc0d3f [OpenMP] Use irbuilder as default for masked and master construct
Use irbuilder as default and remove redundant Clang codegen for masked construct and master construct.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D100874
2021-09-17 15:54:11 -05:00
Erich Keane e3b10525b4 Make multiversioning work with internal linkage
We previously made all multiversioning resolvers/ifuncs have weak
ODR linkage in IR, since we NEED to emit the whole resolver every time
we see a call, but it is not necessarily the place where all the
definitions live.

HOWEVER, when doing so, we neglected the case where the versions have
internal linkage.  This patch ensures we do this, so you don't get weird
behavior with static functions.
2021-09-17 05:56:38 -07:00
Wang, Pengfei e9e1d4751b [X86] Refactor GetSSETypeAtOffset to fix pr51813
D105263 added support for the _Float16 type. It introduced a bug (pr51813) that generates a <4 x half> type instead of the default double when passing a blank structure in SSE registers.

Although I doubt it may expose a bug somewhere other than D105263, it's good to avoid returning a half type when there is no half type in the arguments.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D109607
2021-09-17 10:51:59 +08:00
Aaron Ballman aefb81a33a Removing some spurious whitespace; NFC 2021-09-16 13:14:08 -04:00
Arnold Schwaighofer f670c5aeee Add a new frontend flag `-fswift-async-fp={auto|always|never}`
Summary:
Introduce a new frontend flag `-fswift-async-fp={auto|always|never}`
that controls how code generation sets the Swift extended async frame
info bit. There are three possibilities:

* `auto`: which determines how to set the bit based on deployment target, either
statically or dynamically via `swift_async_extendedFramePointerFlags`.
* `always`: default, always set the bit statically, regardless of deployment
target.
* `never`: never set the bit, regardless of deployment target.

Differential Revision: https://reviews.llvm.org/D109451
2021-09-16 08:48:51 -07:00
Yaxun (Sam) Liu abe8b354e3 Fix vtbl field addr space
Storing the vtable field of an object should use the same address space as
the this pointer. Currently it is assumed to be addr space 0 but this may not
be true.

This assumption (added in 054cc3b1b4) caused
issues for the out-of-tree CHERI targets.

Reviewed by: John McCall, Alexander Richardson

Differential Revision: https://reviews.llvm.org/D109841
2021-09-16 10:57:31 -04:00
Zarko Todorovski 1b0a71c5fc [PowerPC][AIX] Add support for varargs for complex types on AIX
Remove the previous error and add support for special handling of small
complex types as in PPC64 ELF ABI. As in, generate code to load from
varargs location and pack it in a temp variable, then return a pointer to
the struct.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D106393
2021-09-16 09:38:03 -04:00
Bjorn Pettersson 8f8616655c [NewPM] Use a separate struct for ModuleThreadSanitizerPass
Split ThreadSanitizerPass into ThreadSanitizerPass (as a function
pass) and ModuleThreadSanitizerPass (as a module pass).
Main reason is to make sure that we have a unique mapping from
ClassName to PassName in the new passmanager framework, making it
possible to correctly identify the passes when dealing with options
such as -print-after and -print-pipeline-passes.

This is a follow-up to D105006 and D105007.
2021-09-16 14:58:42 +02:00
Bjorn Pettersson ab41eef9ac [NewPM] Use a separate struct for ModuleMemorySanitizerPass
Split MemorySanitizerPass into MemorySanitizerPass (as a function
pass) and ModuleMemorySanitizerPass (as a module pass).
Main reason is to make sure that we have a unique mapping from
ClassName to PassName in the new passmanager framework, making it
possible to correctly identify the passes when dealing with options
such as -print-after and -print-pipeline-passes.

This is a follow-up to D105006 and D105007.
2021-09-16 14:58:42 +02:00
David Blaikie 8264846c0e Senticify some comments - post-commit review for e4b9f5e851
Based on feedback from Paul Robinson.
2021-09-15 13:59:11 -07:00
Walter Lee 66c6bbe7ff Put code that avoids heapifying local blocks behind a flag
This change puts the functionality in commit
c5792aa90f behind a flag that is off by
default.  The original commit is not in Apple's Clang fork (and blocks
are an Apple extension in the first place), and there is one known
issue that needs to be addressed before it can be enabled safely.

Differential Revision: https://reviews.llvm.org/D108243
2021-09-14 14:06:05 -04:00
David Blaikie 13e34f9fc1 Fixup some formatting from a recent commit 2021-09-14 00:41:19 -07:00
David Blaikie e4b9f5e851 DebugInfo: Add support for template parameters with reference qualifiers
Followon from the previous commit supporting cvr qualifiers.
2021-09-14 00:39:47 -07:00
David Blaikie db4ff98bf9 DebugInfo: Add support for template parameters with qualifiers
eg: t1<void () const> - DWARF doesn't have a particularly nice way to
encode this, for real member function types (like `void (t1::*)()
const`) the const-ness is encoded in the type of the artificial first
parameter. But `void () const` has no parameters, so encode it like a
normal const-qualified type, using DW_TAG_const_type. (similarly for
restrict and volatile)
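
As a small illustration of why the qualifier has to be encoded at all: cv-qualified function types are distinct types in C++, so the specializations below are different entities. The alias names are made up for this sketch.

```cpp
#include <type_traits>

template <typename T> struct t1 {};

// Function types that differ only in their cv-qualifier are distinct
// types, so each alias names a different specialization of t1.
using Plain = t1<void()>;
using Const = t1<void() const>;
using Volatile = t1<void() volatile>;

static_assert(!std::is_same<Plain, Const>::value,
              "const is part of the function type");
static_assert(!std::is_same<Const, Volatile>::value,
              "volatile is part of the function type");

// For a real member function the qualifier shows up in the
// pointer-to-member type instead.
struct S {
  void f() const;
};
static_assert(std::is_same<decltype(&S::f), void (S::*)() const>::value,
              "const member function pointer");
```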

Reference qualifiers (& and &&) coming in a separate commit shortly.
2021-09-14 00:04:40 -07:00
Xiang1 Zhang c81d6ab875 [X86] Adjust Keylocker handle mem size
Reviewed By: Topper Craig

Differential Revision: https://reviews.llvm.org/D109488
2021-09-13 18:03:27 +08:00
Xiang1 Zhang bdce8d40c6 Revert "[X86] Adjust Keylocker handle mem size"
This reverts commit 3731de6b7f.
2021-09-13 18:00:46 +08:00
Xiang1 Zhang 3731de6b7f [X86] Adjust Keylocker handle mem size
Reviewed By: Topper Craig

Differential Revision: https://reviews.llvm.org/D109354
2021-09-13 17:59:33 +08:00
Joseph Huber 29b44ca896 [OpenMP] Add flag for setting debug in the offloading device
This patch introduces the flags `-fopenmp-target-debug` and
`-fopenmp-target-debug=` to set the value of a global in the device.
This will be used to enable or disable debugging features statically in
the device runtime library.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D109544
2021-09-10 18:19:19 -04:00
Roman Lebedev 85ba583eba [NFCI][clang] Move allocation alignment manifestation for malloc-like into Sema from Codegen
... so that it happens right next to `AddKnownFunctionAttributesForReplaceableGlobalAllocationFunction()`,
which is good for consistency.
2021-09-10 20:49:28 +03:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.
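
A rough sketch of the "soft deprecation" pattern, using a hypothetical miniature class rather than the real llvm::APInt: the old names stay callable and forward to the new ones, with only a comment steering readers to the preferred spelling.

```cpp
#include <cstdint>

// Hypothetical miniature integer wrapper; not the real llvm::APInt.
class MiniInt {
  std::uint64_t Bits;

public:
  explicit MiniInt(std::uint64_t V) : Bits(V) {}

  // Preferred spellings.
  static MiniInt getZero() { return MiniInt(0); }
  bool isZero() const { return Bits == 0; }
  bool isAllOnes() const { return Bits == ~std::uint64_t(0); }

  // NOTE: This is soft deprecated. Please use `getZero()` instead.
  static MiniInt getNullValue() { return getZero(); }

  // NOTE: This is soft deprecated. Please use `isAllOnes()` instead.
  bool isAllOnesValue() const { return isAllOnes(); }
};
```

Because nothing is removed or marked with a compiler-level deprecation attribute, existing callers keep building without warnings and can be migrated incrementally.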

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Akira Hatanaka 59cc39ae14 [ObjC][ARC] Use the addresses of the ARC runtime functions instead of
integer 0/1 for the operand of bundle "clang.arc.attachedcall"

This should make it easier to understand what the IR is doing and also
simplify some of the passes as they no longer have to translate the
integer values to the runtime functions.

Differential Revision: https://reviews.llvm.org/D102996
2021-09-08 11:56:22 -07:00
Wang, Pengfei e6e8d25920 [X86][mingw] Modify the alignment of __m128/__m256/__m512 vector type for mingw
This is a follow up patch after D78564 and D108887.

Martin helped to confirm the alignment in GCC mingw is the same as the
size of vector. https://reviews.llvm.org/D108887#inline-1040893

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D109265
2021-09-06 20:28:09 +08:00
Qiu Chaofan fae0dfa642 [Clang] Add __ibm128 type to represent ppc_fp128
Currently, we have no front-end type for the ppc_fp128 type in IR. The PowerPC
target generates ppc_fp128 from long double now, but there's an option
(-mabi=(ieee|ibm)longdouble) to control it, and we're going to
transition from IBM extended double-double ppc_fp128 to IEEE fp128 in
the future.

This patch adds the type __ibm128, which always represents ppc_fp128 in IR,
as GCC does for that type. Without this type in Clang, compilation
will fail when compiling against a future version of libstdcxx (which uses
__ibm128 in headers).

Although all operations in the backend for __ibm128 are done in software,
only PowerPC enables support for it.

Some things are not implemented in this commit and can be done in
future ones:

- Literal suffix for __ibm128 type. w/W is suitable as GCC documented.
- __attribute__((mode(IF))) should be for __ibm128.
- Complex __ibm128 type.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D93377
2021-09-06 18:00:58 +08:00
Michael Kruse 650bbc5620 [OpenMP][OpenMPIRBuilder] Implement loop unrolling.
Recommit of 707ce34b06. Don't introduce a
dependency to the LLVMPasses component, instead register the required
passes individually.

Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:

 * `unrollLoopFull`
 * `unrollLoopPartial`
 * `unrollLoopHeuristic`

`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heuristics as used by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic`, allows greater flexibility.

With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.
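
The "tile, then fully unroll the inner loop" idea can be shown at source level. This is only a hand-written analogy with an assumed unroll factor of 4; the OpenMPIRBuilder works on LLVM IR, and the function names here are invented for the sketch.

```cpp
#include <vector>

// Reference loop: visit 0..N-1 in order.
std::vector<int> visitRolled(int N) {
  std::vector<int> Order;
  for (int I = 0; I < N; ++I)
    Order.push_back(I);
  return Order;
}

// Partial unrolling by 4, expressed as tiling followed by a fully
// unrolled inner loop.
std::vector<int> visitTiledUnrolled(int N) {
  const int Factor = 4; // assumed unroll factor
  std::vector<int> Order;
  for (int Tile = 0; Tile < N; Tile += Factor) {
    // The inner loop over one tile has at most Factor iterations, so it
    // is written out in full; the guards handle a partial last tile.
    if (Tile + 0 < N) Order.push_back(Tile + 0);
    if (Tile + 1 < N) Order.push_back(Tile + 1);
    if (Tile + 2 < N) Order.push_back(Tile + 2);
    if (Tile + 3 < N) Order.push_back(Tile + 3);
  }
  return Order;
}
```

Both functions visit the same indices in the same order for any `N`, including trip counts that are not a multiple of the factor.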

Reviewed By: jdoerfert, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107764
2021-09-04 19:18:58 -05:00
Kazuaki Ishizaki 8f77dc459e [clang] NFC: Fix trivial typo in comments and document
`the the` -> `the`

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D77470
2021-09-04 12:59:42 +05:30
PeixinQiao a42380ce83 [OMPIRBuilder] Add ordered directive to OMPBuilder
Add support for ordered directive in the OpenMPIRBuilder.

This patch also modifies clang to use the ordered directive when the
option -fopenmp-enable-irbuilder is enabled.

Also fix an ICE when parsing a canonical for loop with the relational
operator LE or GE in an OpenMP region by replacing the unary increment
of the expression "Expr A" minus "Expr B" (++(Expr A - Expr B)) with a
binary addition of that expression and the constant value "1"
(Expr A - Expr B + "1").

Reviewed By: Meinersbur, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107430
2021-09-03 09:37:58 +08:00
David Blaikie 5fb3f43778 Fully qualify template template parameters when printing
I discovered this quirk when working on some DWARF - AST printing prints
type template parameters fully qualified, but printed template template
parameters the way they were written syntactically, or wholly
unqualified - instead, we should print them consistently with the way we
print type template parameters: fully qualified.

The one place this got weird was for partial specializations like in
ast-print-temp-class.cpp - hence the need for checking for
TemplateNameDependenceScope::DependentInstantiation template template
parameters. (not 100% sure that's the right solution to that, though -
open to ideas)

Differential Revision: https://reviews.llvm.org/D108794
2021-09-02 15:04:34 -07:00
Roman Lebedev 3f1f08f0ed Revert @llvm.isnan intrinsic patchset.
Please refer to
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html
(and that whole thread.)

TLDR: the original patch had no prior RFC, yet it had some changes that
really need a proper RFC discussion. It won't be productive to discuss
such an RFC, once it's actually posted, while said patch is already
committed, because that introduces bias towards already-committed stuff,
and the tree is potentially in broken state meanwhile.

While the end result of discussion may lead back to the current design,
it may also not lead to the current design.

Therefore i take it upon myself
to revert the tree back to last known good state.

This reverts commit 4c4093e6e3.
This reverts commit 0a2b1ba33a.
This reverts commit d9873711cb.
This reverts commit 791006fb8c.
This reverts commit c22b64ef66.
This reverts commit 72ebcd3198.
This reverts commit 5fa6039a5f.
This reverts commit 9efda541bf.
This reverts commit 94d3ff09cf.
2021-09-02 13:53:56 +03:00
Roman Lebedev 50634deaa5 Revert "[OpenMP][OpenMPIRBuilder] Implement loop unrolling."
Breaks build with -DBUILD_SHARED_LIBS=ON
```
CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
  "LLVMFrontendOpenMP" of type SHARED_LIBRARY
    depends on "LLVMPasses" (weak)
  "LLVMipo" of type SHARED_LIBRARY
    depends on "LLVMFrontendOpenMP" (weak)
  "LLVMCoroutines" of type SHARED_LIBRARY
    depends on "LLVMipo" (weak)
  "LLVMPasses" of type SHARED_LIBRARY
    depends on "LLVMCoroutines" (weak)
    depends on "LLVMipo" (weak)
At least one of these targets is not a STATIC_LIBRARY.  Cyclic dependencies are allowed only among static libraries.
CMake Generate step failed.  Build files cannot be regenerated correctly.
```

This reverts commit 707ce34b06.
2021-09-02 12:42:23 +03:00
Michael Kruse 707ce34b06 [OpenMP][OpenMPIRBuilder] Implement loop unrolling.
Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:

 * `unrollLoopFull`
 * `unrollLoopPartial`
 * `unrollLoopHeuristic`

`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heuristics as used by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic`, allows greater flexibility.

With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.

Reviewed By: jdoerfert, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107764
2021-09-02 02:37:25 -05:00
Erich Keane 42ae7eb581 Ensure field-annotations on pointers properly match the AS of the field.
Discovered in SYCL, the field annotations were always cast to an i8*,
which is an invalid bitcast for a pointer type with an address space.
This patch makes sure that we create an intrinsic that takes a pointer
to the correct address-space and properly do our casts.

Differential Revision: https://reviews.llvm.org/D109003
2021-09-01 06:12:24 -07:00
Joel E. Denny 83ddfa0d22 [OpenMP][OpenACC] Implement `ompx_hold` map type modifier extension in Clang (1/2)
This patch implements Clang support for an original OpenMP extension
we have developed to support OpenACC: the `ompx_hold` map type
modifier.  The next patch in this series, D106510, implements OpenMP
runtime support.

Consider the following example:

```
 #pragma omp target data map(ompx_hold, tofrom: x) // holds onto mapping of x
 {
   foo(); // might have map(delete: x)
   #pragma omp target map(present, alloc: x) // x is guaranteed to be present
   printf("%d\n", x);
 }
```

The `ompx_hold` map type modifier above specifies that the `target
data` directive holds onto the mapping for `x` throughout the
associated region regardless of any `target exit data` directives
executed during the call to `foo`.  Thus, the presence assertion for
`x` at the enclosed `target` construct cannot fail.  (As usual, the
standard OpenMP reference count for `x` must also reach zero before
the data is unmapped.)
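
The interaction of the two counts can be modelled with a toy structure. This is a sketch under stated assumptions, not the runtime's data structure: `MappingEntry` and its members are hypothetical, and it assumes that `map(delete: x)` resets only the standard count.

```cpp
// Hypothetical model of one mapped variable; not the libomptarget type.
struct MappingEntry {
  int DynRefCount = 0;  // standard OpenMP reference count
  int HoldRefCount = 0; // extra count owned by ompx_hold regions

  // The data stays mapped while either count is non-zero.
  bool isPresent() const { return DynRefCount > 0 || HoldRefCount > 0; }

  // Entering a region increments the count that matches its map type.
  void enter(bool Hold) {
    if (Hold)
      ++HoldRefCount;
    else
      ++DynRefCount;
  }

  // Leaving a region decrements only the matching count.
  void exit(bool Hold) {
    int &Count = Hold ? HoldRefCount : DynRefCount;
    if (Count > 0)
      --Count;
  }

  // Models map(delete: x): assumed to zero only the standard count.
  void deleteMapping() { DynRefCount = 0; }
};
```

In this model, a `deleteMapping()` issued while a hold region is active leaves `isPresent()` true, which mirrors why the presence assertion in the example cannot fail.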

Justification for inclusion in Clang and LLVM's OpenMP runtime:

* The `ompx_hold` modifier supports OpenACC functionality (structured
  reference count) that cannot be achieved in standard OpenMP, as of
  5.1.
* The runtime implementation for `ompx_hold` (next patch) will thus be
  used by Flang's OpenACC support.
* The Clang implementation for `ompx_hold` (this patch) as well as the
  runtime implementation are required for the Clang OpenACC support
  being developed as part of the ECP Clacc project, which translates
  OpenACC to OpenMP at the directive AST level.  These patches are the
  first step in upstreaming OpenACC functionality from Clacc.
* The Clang implementation for `ompx_hold` is also used by the tests
  in the runtime implementation.  That syntactic support makes the
  tests more readable than low-level runtime calls can.  Moreover,
  upstream Flang and Clang do not yet support OpenACC syntax
  sufficiently for writing the tests.
* More generally, the Clang implementation enables a clean separation
  of concerns between OpenACC and OpenMP development in LLVM.  That
  is, LLVM's OpenMP developers can discuss, modify, and debug LLVM's
  extended OpenMP implementation and test suite without directly
  considering OpenACC's language and execution model, which can be
  handled by LLVM's OpenACC developers.
* OpenMP users might find the `ompx_hold` modifier useful, as in the
  above example.

See new documentation introduced by this patch in `openmp/docs` for
more detail on the functionality of this extension and its
relationship with OpenACC.  For example, it explains how the runtime
must support two reference counts, as specified by OpenACC.

Clang recognizes `ompx_hold` unless `-fno-openmp-extensions`, a new
command-line option introduced by this patch, is specified.

Reviewed By: ABataev, jdoerfert, protze.joachim, grokos

Differential Revision: https://reviews.llvm.org/D106509
2021-08-31 16:13:49 -04:00
Kazu Hirata b8debabb77 [clang] Remove redundant calls to c_str() (NFC)
Identified with readability-redundant-string-cstr.
2021-08-31 08:53:51 -07:00
Justas Janickas f9bc1b3bee [OpenCL] Defines helper function for kernel language compatible OpenCL version
This change defines a helper function getOpenCLCompatibleVersion()
inside LangOptions class. The function contains mapping between
C++ for OpenCL versions and their corresponding compatible OpenCL
versions. This mapping function should be updated each time a new
C++ for OpenCL language version is introduced. The helper function
is expected to simplify conditions on OpenCL C and C++ for OpenCL
versions inside compiler code.
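
A minimal sketch of such a mapping helper, using a hypothetical trimmed-down struct instead of the real LangOptions class. The numeric encodings (120, 200, 100) and the single mapping entry are assumptions made for illustration.

```cpp
// Hypothetical, trimmed-down stand-in for the relevant LangOptions fields.
struct MiniLangOptions {
  bool OpenCLCPlusPlus;            // compiling C++ for OpenCL?
  unsigned OpenCLVersion;          // e.g. 120 for OpenCL C 1.2
  unsigned OpenCLCPlusPlusVersion; // e.g. 100 for C++ for OpenCL 1.0

  // Returns the OpenCL C version whose rules apply to this compilation.
  unsigned getOpenCLCompatibleVersion() const {
    if (!OpenCLCPlusPlus)
      return OpenCLVersion;
    // Assumed mapping: C++ for OpenCL 1.0 is compatible with OpenCL C 2.0.
    if (OpenCLCPlusPlusVersion == 100)
      return 200;
    return 0; // unknown version; extend when a new one is introduced
  }
};
```

With a helper like this, a check such as "OpenCL C 2.0 or the equivalent C++ for OpenCL version" collapses to a single comparison against the returned value.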

Code refactoring performed.

Differential Revision: https://reviews.llvm.org/D108693
2021-08-31 10:08:38 +01:00
David Blaikie 4f3a92ca0a DebugInfo: Refactor/deduplicate various template argument list emission
Streamline template arguments across types, variables, and functions -
for convenient reuse in experiments related to template argument list
reconstitution (not including template argument lists in the "name" of
those entities, and leaving it to debug info consumers to rebuild the
full template name from the semantic descriptions of the argument lists)

But the change seems like a good refactoring/cleanup anyway.

I'd certainly be open to suggestions about how this might be more
streamlined - like is there no generic way to query template argument
lists across the 3 kinds of entities, rather than needing special case
code?
2021-08-30 22:39:46 -07:00
Andrei Elovikov 1724a16437 [NFC][clang] Move IR-independent parts of target MV support to X86TargetParser.cpp
...that is located under llvm/lib/Support/.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D108423
2021-08-30 09:48:48 -07:00
Steven Wan 73733ae526 TypeInfo records more information about align requirement
Extend the information preserved in `TypeInfo` by replacing the `AlignIsRequired` bool flag with a three-valued enum. The enum also indicates where the alignment attribute comes from, which could be helpful in determining whether the attribute should overrule.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D108858
2021-08-28 19:47:48 -04:00
Yonghong Song 82d9cb34a2 [DebugInfo] convert btf_tag attrs to DI annotations for func parameters
Generate btf_tag annotations for DILocalVariable. The annotations
are represented as a DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106620
2021-08-26 14:27:58 -07:00
Yonghong Song d2d7a90ced [DebugInfo] convert btf_tag attrs to DI annotations for DIGlobalVariable
Generate btf_tag annotations for DIGlobalVariable. The annotations
are represented as a DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106619
2021-08-26 10:36:33 -07:00
Yonghong Song 2de051ba12 [DebugInfo] convert btf_tag attrs to DI annotations for DISubprograms
Generate btf_tag annotations for DISubprograms. The annotations
are represented as a DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106618
2021-08-26 08:54:11 -07:00
Sindhu Chittireddy de15979bc3 Assert pointer cannot be null; NFC
Klocwork static code analysis exposed this concern:
Pointer 'SubExpr' returned from call to getSubExpr() function which may
return NULL from 'cast_or_null<Expr>(Operand)', which will be
dereferenced in the statement following it

Add an assert on SubExpr to make it clear this pointer cannot be null.
2021-08-26 06:58:56 -04:00
Alex Richardson 7cab90a7b1 Fix __attribute__((annotate("")) with non-zero globals AS
The existing code attempting to bitcast from a value in the default globals AS
to i8 addrspace(0)* was triggering an assertion failure in our downstream fork.
I found this while compiling poppler for CHERI-RISC-V (we use AS200 for all
globals). The test case uses AMDGPU since that is one of the in-tree targets
with a non-zero default globals address space.
The new test previously triggered a "Invalid constantexpr bitcast!" assertion
and now correctly generates code with addrspace(1) pointers.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D105972
2021-08-26 10:09:40 +01:00
Nick Desaulniers 846e562dcc [Clang] add support for error+warning fn attrs
Add support for the GNU C style __attribute__((error(""))) and
__attribute__((warning(""))). These attributes are meant to be put on
declarations of functions that should not be called.

They are frequently used to provide compile time diagnostics similar to
_Static_assert, but which may rely on non-ICE conditions (ie. relying on
compiler optimizations). This is also similar to diagnose_if function
attribute, but can diagnose after optimizations have been run.

While users may instead simply call undefined functions in such cases to
get a linkage failure from the linker, these provide a much more
ergonomic and actionable diagnostic to users and do so at compile time
rather than at link time. Users may instead be able to use inline asm .err
directives.

These are used throughout the Linux kernel in its implementation of
BUILD_BUG and BUILD_BUG_ON macros. These macros generally cannot be
converted to use _Static_assert because many of the parameters are not
ICEs. The Linux kernel still needs to be modified to make use of these
when building with Clang; I have a patch that does so I will send once
this feature is landed.

To do so, we create a new IR level Function attribute, "dontcall" (both
error and warning boil down to one IR Fn Attr).  Then, similar to calls
to inline asm, we attach a !srcloc Metadata node to call sites of such
attributed callees.

The backend diagnoses these during instruction selection, while we still
know that a call is a call (vs say a JMP that's a tail call) in an arch
agnostic manner.

The frontend then reconstructs the SourceLocation from that Metadata,
and determines whether to emit an error or warning based on the callee's
attribute.

Link: https://bugs.llvm.org/show_bug.cgi?id=16428
Link: https://github.com/ClangBuiltLinux/linux/issues/1173

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D106030
2021-08-25 10:34:18 -07:00
Jonas Hahnfeld ea08c4cd1c [CUDA] Fix static device variables with -fgpu-rdc
NVPTX does not allow dots in the identifier, so ptxas errors out with
   fatal   : Parsing error near '.static': syntax error
because it parses .static as a directive. Avoid this problem by using
two underscores, similar to what OpenMP does for outlined functions.

Differential Revision: https://reviews.llvm.org/D108456
2021-08-25 09:31:22 +02:00
Yi Kong 5fc4828aa6 [clang] Don't generate warn-stack-size when the warning is ignored
8ace121305 introduced a regression for code that explicitly ignores the
-Wframe-larger-than= warning. Make sure we don't generate the
warn-stack-size attribute for that case.

Differential Revision: https://reviews.llvm.org/D108686
2021-08-25 14:58:45 +08:00
Richard Smith cd4d6d718b PR48030: Fix COMDAT-related linking problem with C++ thread_local static data members.
Previously when emitting a C++ guarded initializer, we tried to work out what
the enclosing function would be used for and added it to the COMDAT containing
the variable if we thought that doing so would be correct. But this was done
from a context in which we didn't -- and realistically couldn't -- correctly
infer how the enclosing function would be used.

Instead, add the initialization function to a COMDAT from the code that
creates it, in the case where it makes sense to do so: when we know that
the one and only reference to the initialization function is in
@llvm.global.ctors and that reference is in the same COMDAT.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D108680
2021-08-24 19:53:44 -07:00
Bob Haarman 1c829ce1e3 [clang][codegen] Set CurLinkModule in CodeGenAction::ExecuteAction
CodeGenAction::ExecuteAction creates a BackendConsumer for the
purpose of handling diagnostics. The BackendConsumer's
DiagnosticHandlerImpl method expects CurLinkModule to be set,
but this did not happen on the code path that goes through
ExecuteAction. This change makes it so that the BackendConsumer
constructor used by ExecuteAction requires the Module to be
specified and passes the appropriate module in ExecuteAction.

The change also adds a test that fails without this change
and passes with it. To make the test work, the FIXME in the
handling of DK_Linker diagnostics was addressed so that warnings
and notes are no longer silently discarded. Since this introduces
a new warning diagnostic, a flag to control it (-Wlinker-warnings)
has also been added.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D108603
2021-08-24 21:25:49 +00:00
Wang, Pengfei c728bd5bba [X86] AVX512FP16 instructions enabling 5/6
Enable FP16 FMA instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105268
2021-08-24 09:07:19 +08:00
Andrei Elovikov f5c2889488 [NFC][clang] Use X86 Features declaration from X86TargetParser
...instead of redeclaring them in clang's own X86Target.def. They were already
required to be in sync (IIUC), so no reason to maintain two identical lists.

Reviewed By: erichkeane, craig.topper

Differential Revision: https://reviews.llvm.org/D108151
2021-08-23 12:30:28 -07:00
Jon Chesterfield c2574e63ff [openmp][nfc] Refactor GridValues
Remove redundant fields and replace pointer with virtual function

Of fourteen fields, three are dead and four can be computed from the
remainder. This leaves a couple of currently dead fields in place as
they are expected to be used from the deviceRTL shortly. Two of the
fields that can be computed are only used from codegen and require a
log2() implementation so are inlined into codegen instead.
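
As an illustration of "can be computed from the remainder", the sketch below stores only a warp size and derives the log2 and lane mask on demand. `MiniGridValues` and its member names are hypothetical and do not reflect the actual struct.

```cpp
// Hypothetical reduced struct: only the warp size is stored.
struct MiniGridValues {
  unsigned WarpSize; // assumed to be a power of two

  // Formerly a stored field in this sketch; now computed on demand.
  unsigned warpSizeLog2() const {
    unsigned Log = 0;
    for (unsigned V = WarpSize; V > 1; V >>= 1)
      ++Log;
    return Log;
  }

  // Mask that selects the lane id within a warp.
  unsigned laneMask() const { return WarpSize - 1; }
};
```

Deriving the values removes the risk of the stored fields drifting out of sync with the warp size they depend on.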

This change leaves the new methods in the same location in the struct
as the previous fields for convenience at review.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108380
2021-08-23 16:19:11 +01:00
Andy Wingo 4fb0c08342 [clang][CodeGen] GetDefaultAlignTempAlloca uses preferred alignment
This function was defaulting to use the ABI alignment for the LLVM
type.  Here we change to use the preferred alignment.  This will allow
unification with GetTempAlloca, which if alignment isn't specified, uses
the preferred alignment.

Differential Revision: https://reviews.llvm.org/D108450
2021-08-23 14:55:58 +02:00
Andy Wingo 8da70fed70 [clang][NFC] Tighten up code for GetGlobalVarAddressSpace
The LangAS local is only used in the OpenCL case; move its decl
inwards.

Differential Revision: https://reviews.llvm.org/D108449
2021-08-23 14:55:58 +02:00
Andy Wingo d3d4d98576 [clang][NFC] GetOrCreateLLVMGlobal takes LangAS
Pass a LangAS instead of a target address space to
GetOrCreateLLVMGlobal, to remove a place where the frontend assumes that
target address space 0 is special.

Differential Revision: https://reviews.llvm.org/D108445
2021-08-23 14:55:58 +02:00
Shilei Tian 2c6ffb4eb2 [NFC] clang-format -i clang/lib/CodeGen/CGStmtOpenMP.cpp 2021-08-22 22:57:05 -04:00
Simon Pilgrim 7f48bd3bed CGBuiltin.cpp - pass SVETypeFlags by const reference. NFC.
Don't pass the struct by value.
2021-08-22 12:13:17 +01:00
Wang, Pengfei b088536ce9 [X86] AVX512FP16 instructions enabling 4/6
Enable FP16 unary operator instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105267
2021-08-22 08:59:35 +08:00
Joseph Huber ec66ed79f4 [OpenMP] Correctly add member expressions to OpenMP info
Mapping expressions that have `this` as their base expression aren't
considered a valid base variable and the rest of the runtime expects
this. However, if we have an expression with no value declaration we can
try to extract it manually to provide more helpful debugging information.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108483
2021-08-20 20:45:14 -04:00
Arthur Eubanks 644f88a25b [NFC] addAttribute(FunctionIndex) => addFnAttribute() 2021-08-20 14:18:59 -07:00
Yonghong Song 5ca7131eb3 [DebugInfo] convert btf_tag attrs to DI annotations for record fields
Generate btf_tag annotations for record fields. The annotations
are represented as a DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106616
2021-08-20 12:52:51 -07:00
Jon Chesterfield b1efeface7 Revert "[openmp][nfc] Refactor GridValues"
Failed a nvptx codegen test
This reverts commit 2a47a84b40.
2021-08-20 18:17:27 +01:00
Thomas Lively 88962cea46 [WebAssembly] Restore builtins and intrinsics for pmin/pmax
Partially reverts 85157c0079, which had removed these builtins and intrinsics
in favor of normal codegen patterns. It turns out that it is possible for the
patterns to be split over multiple basic blocks, however, which means that DAG
ISel is not able to select them to the pmin/pmax instructions. To make sure the
SIMD intrinsics generate the correct instructions in these cases, reintroduce
the clang builtins and corresponding LLVM intrinsics, but also keep the normal
pattern matching as well.

Differential Revision: https://reviews.llvm.org/D108387
2021-08-20 09:21:31 -07:00
Jon Chesterfield 2a47a84b40 [openmp][nfc] Refactor GridValues
Remove redundant fields and replace pointer with virtual function

Of fourteen fields, three are dead and four can be computed from the
remainder. This leaves a couple of currently dead fields in place as
they are expected to be used from the deviceRTL shortly. Two of the
fields that can be computed are only used from codegen and require a
log2() implementation so are inlined into codegen instead.

This change leaves the new methods in the same location in the struct
as the previous fields for convenience at review.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108380
2021-08-20 16:41:26 +01:00
Alexander Potapenko b0391dfc73 [clang][Codegen] Introduce the disable_sanitizer_instrumentation attribute
The purpose of __attribute__((disable_sanitizer_instrumentation)) is to
prevent all kinds of sanitizer instrumentation from being applied to a
certain function, Objective-C method, or global variable.

The no_sanitize(...) attribute drops instrumentation checks, but may
still insert code preventing false positive reports. In some cases
though (e.g. when building the Linux kernel with -fsanitize=kernel-memory
or -fsanitize=thread) the users may want to avoid any kind of
instrumentation.

Differential Revision: https://reviews.llvm.org/D108029
2021-08-20 14:01:06 +02:00
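A minimal sketch of how the attribute from the commit above might be applied. The macro name `NO_SANITIZER_INSTRUMENTATION` and the function `read_raw_counter` are hypothetical and exist only for illustration; the `__has_attribute` guard is an assumption that keeps the file buildable on compilers that do not know the attribute.

```cpp
// Sketch only: expand to the attribute when the compiler supports it,
// and to nothing otherwise.
#if defined(__has_attribute)
#  if __has_attribute(disable_sanitizer_instrumentation)
#    define NO_SANITIZER_INSTRUMENTATION \
       __attribute__((disable_sanitizer_instrumentation))
#  endif
#endif
#ifndef NO_SANITIZER_INSTRUMENTATION
#  define NO_SANITIZER_INSTRUMENTATION
#endif

// Hypothetical low-level helper that should carry no sanitizer
// instrumentation at all, not even the code that no_sanitize(...) may
// still insert to prevent false positive reports.
NO_SANITIZER_INSTRUMENTATION
int read_raw_counter(const volatile int *counter) {
  return *counter;
}
```

On a compiler without the attribute the macro expands to nothing, so the function is then instrumented as usual.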
Yonghong Song cab12fc28c [DebugInfo] convert btf_tag attrs to annotations for DIComposite types
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotation.

Differential Revision: https://reviews.llvm.org/D106615
2021-08-19 18:01:29 -07:00
Jon Chesterfield 77579b99e9 [openmp][nfc] Replace OMPGridValues array with struct
[nfc] Replaces enum indices into an array with a struct. Named the
fields to match the enum, leaves memory layout and initialization unchanged.

Motivation is to later safely remove dead fields and replace redundant ones
with (compile time) computation. It should also be possible to factor some
common fields into a base and introduce a gfx10 amdgpu instance with less
duplication than the arrays of integers require.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D108339
2021-08-19 13:25:42 +01:00
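The array-to-struct refactoring described in the commit above can be sketched as follows. All names here (`GridIndex`, `GridArray`, `GridValuesSketch`, and the three fields) are hypothetical and are not the real OMPGridValues definitions; the sketch only illustrates how named fields can replace enum indices while the memory layout stays the same.

```cpp
#include <cstddef>

// Before (hypothetical): enum values index into a plain array of integers.
enum GridIndex { kWarpSize = 0, kMaxThreads = 1, kMaxTeams = 2, kNumIndices };
using GridArray = unsigned[kNumIndices];

// After (hypothetical): one named field per former enum value, in the same
// order, so the layout and aggregate initialization are unchanged.
struct GridValuesSketch {
  unsigned WarpSize;
  unsigned MaxThreads;
  unsigned MaxTeams;
};

// Compile-time checks that the struct matches the array layout.
static_assert(sizeof(GridValuesSketch) == sizeof(GridArray),
              "struct must be the same size as the array it replaces");
static_assert(offsetof(GridValuesSketch, MaxThreads) ==
                  kMaxThreads * sizeof(unsigned),
              "field offset must match the former array index");

// The initializer list is the same one the array form would have used.
constexpr GridValuesSketch kExample = {64, 1024, 128};
```

With named fields, a dead field can later be removed or replaced by a computed method without every user having to track an index.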
Sven van Haastregt 7bda1a0711 [OpenCL] Fix as_type(vec3) invalid store creation
With -fpreserve-vec3-type enabled, a cast was not created when
converting from a vec3 type to a non-vec3 type, even though a
conversion to vec4 was performed.  This resulted in creation of
invalid store instructions.

Differential Revision: https://reviews.llvm.org/D107963
2021-08-19 11:57:09 +01:00
Bjorn Pettersson 36d5138619 [NewPM] Make some sanitizer passes parameterized in the PassRegistry
Refactored implementation of AddressSanitizerPass and
HWAddressSanitizerPass to use pass options similar to passes like
MemorySanitizerPass. This makes sure that there is a single mapping
from class name to pass name (needed by D108298), and options like
-debug-only and -print-after make a bit more sense (even though it is
the unparameterized pass name that should be used in those
options).

A result of the above is that some pass names are removed in favor
of the parameterized versions:
- "khwasan" is now "hwasan<kernel;recover>"
- "kasan" is now "asan<kernel>"
- "kmsan" is now "msan<kernel>"

Differential Revision: https://reviews.llvm.org/D105007
2021-08-19 12:43:37 +02:00
Martin Storsjö cc3affd8b0 [clang] [MSVC] Implement __mulh and __umulh builtins for aarch64
The code is based on the same __mulh and __umulh intrinsics for
x86.

This should fix PR51128.

Differential Revision: https://reviews.llvm.org/D106721
2021-08-19 11:29:55 +03:00
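For readers unfamiliar with the intrinsics named in the commit above: `__mulh` and `__umulh` return the high 64 bits of a signed or unsigned 64x64-bit product. A portable sketch of that arithmetic follows. The helper names are hypothetical, this is not the code clang emits, and it assumes the `__int128` extension that GCC and Clang provide on 64-bit targets.

```cpp
#include <cstdint>

// Hypothetical stand-in for __umulh: widen to 128 bits, multiply, and keep
// the upper 64 bits of the unsigned product.
constexpr std::uint64_t umulh_sketch(std::uint64_t a, std::uint64_t b) {
  return static_cast<std::uint64_t>(
      (static_cast<unsigned __int128>(a) * b) >> 64);
}

// Hypothetical stand-in for __mulh: the same idea with a signed product.
// The right shift of a negative value is arithmetic on GCC and Clang.
constexpr std::int64_t mulh_sketch(std::int64_t a, std::int64_t b) {
  return static_cast<std::int64_t>((static_cast<__int128>(a) * b) >> 64);
}
```

For example, `umulh_sketch(0xFFFFFFFFFFFFFFFF, 2)` yields 1, because the full product is 2^65 - 2.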
Rong Xu 5fdaaf7fd8 [SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader
This patch implements Flow Sensitive Sample FDO (FSAFDO) profile
loader. We have two profile loaders for FS profile,
one before RegAlloc and one before BlockPlacement.

To enable it, when -fprofile-sample-use=<profile> is specified,
add "-enable-fs-discriminator=true \
     -disable-ra-fsprofile-loader=false \
     -disable-layout-fsprofile-loader=false"
to turn on the FS profile loaders.

Differential Revision: https://reviews.llvm.org/D107878
2021-08-18 18:37:35 -07:00
Arthur Eubanks 3f4d00bc3b [NFC] More get/removeAttribute() cleanup 2021-08-17 21:05:41 -07:00
Arthur Eubanks de0ae9e89e [NFC] Cleanup more AttributeList::addAttribute() 2021-08-17 21:05:41 -07:00
Arthur Eubanks ad727ab7d9 [NFC] Migrate some callers away from Function/AttributeLists methods that take an index
These methods can be confusing.
2021-08-17 21:05:40 -07:00
Arthur Eubanks 46cf82532c [NFC] Replace Function handling of attributes with less confusing calls
To avoid magic constants and confusing indexes.
2021-08-17 21:05:40 -07:00
Wang, Pengfei 5aeca3b0a5 [CFE][X86] Enable complex _Float16 support
Support complex _Float16 on X86 in C/C++ following the latest X86 psABI. (https://gitlab.com/x86-psABIs)

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105331
2021-08-18 11:16:14 +08:00
Wang, Pengfei 2379949aad [X86] AVX512FP16 instructions enabling 3/6
Enable FP16 conversion instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105265
2021-08-18 09:03:41 +08:00
Dylan Fleming ef198cd99e [SVE] Remove usage of getMaxVScale for AArch64, in favour of IR Attribute
Removed AArch64 usage of the getMaxVScale interface, replacing it with
the vscale_range(min, max) IR Attribute.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D106277
2021-08-17 14:42:47 +01:00
Wang, Pengfei f1de9d6dae [X86] AVX512FP16 instructions enabling 2/6
Enable FP16 binary operator instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105264
2021-08-15 08:56:33 +08:00
Arthur Eubanks 8e9ffa1dc6 [NFC] Cleanup callers of AttributeList::hasAttributes()
AttributeList::hasAttributes() is confusing, use clearer methods like
hasFnAttrs().
2021-08-13 12:16:52 -07:00
Arthur Eubanks 80ea2bb574 [NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes()
This is more consistent with similar methods.
2021-08-13 11:16:52 -07:00
Arthur Eubanks 92ce6db9ee [NFC] Rename AttributeList::hasFnAttribute() -> hasFnAttr()
This is more consistent with similar methods.
2021-08-13 11:09:18 -07:00
Michael Kruse b1de32d6dd [OMPIRBuilder] Clarify CanonicalLoopInfo. NFC.
Add in-source documentation on how CanonicalLoopInfo is intended to be used. In particular, clarify what parts of a CanonicalLoopInfo are considered part of the loop, that those parts must be side-effect free, and that InsertPoints to instructions outside those parts can be expected to be preserved after method calls implementing loop-associated directives.

A CanonicalLoopInfo is now invalidated once it no longer describes a canonical loop, and asserts when someone tries to use it afterwards.

In addition, rename `createXYZWorkshareLoop` to `applyXYZWorkshareLoop` and remove the update location to avoid the impression that they insert something from scratch at that location, when in reality its InsertPoint is ignored. createStaticWorkshareLoop does not return a CanonicalLoopInfo anymore. First, it was not a canonical loop in the clarified sense (it contained side effects in the form of calls to the OpenMP runtime). Second, it is ambiguous which of the two possible canonical loops it should actually return. It will not be needed before a feature expected to be introduced in OpenMP 6.0.

Also see discussion in D105706.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D107540
2021-08-12 21:02:19 -05:00
Arnold Schwaighofer 9eb99d2e73 CodeGen: No need to check for isExternC if HasStrictReturn is already false
NFC intended.

Differential Revision: https://reviews.llvm.org/D107841
2021-08-11 07:42:48 -07:00
Wang, Pengfei 6f7f5b54c8 [X86] AVX512FP16 instructions enabling 1/6
1. Enable FP16 type support and basic declarations used by following patches.
2. Enable new instructions VMOVW and VMOVSH.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105263
2021-08-10 12:46:01 +08:00
Michael Liao 6ec36d18ec [cuda] Mark builtin texture/surface reference variable as 'externally_initialized'.
- They need to be preserved even if there's no reference within the
  device code as the host code may need to initialize them based on the
  application logic.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D107718
2021-08-09 13:27:40 -04:00
Roger Ferrer Ibanez bfb77364d0 [OpenMP] Fix accidental reuse of VLA size
We were using an OpaqueValueExpr allocated on the stack to store
the size of a VLA. Because the VLASizeMap in CodegenFunction
uses the address of the expression to avoid recomputing VLAs,
we were accidentally reusing an earlier llvm::Value. This led to
invalid LLVM IR.

This is a temporary solution until VLASizeMap can be pushed and popped
based on the context.

Differential Revision: https://reviews.llvm.org/D107666
2021-08-07 05:55:27 +00:00
Joseph Huber 41a6b50c25 [OpenMP]Fix PR51349: Remove AlwaysInline for if regions.
After D94315 we add the `NoInline` attribute to the outlined function to handle
data environments in the OpenMP if clause. This conflicted with the `AlwaysInline`
attribute added to the outlined function for better performance in D106799.
The data environments should ideally not require NoInline, but for now this
fixes PR51349.

Reviewed By: mikerice

Differential Revision: https://reviews.llvm.org/D107649
2021-08-06 17:53:04 -04:00
Serge Pavlov 4c4093e6e3 Introduce intrinsic llvm.isnan
This is a recommit of patch 16ff91ebcc,
reverted in 0c28a7c990 because it had
an error in the call to getFastMathFlags (the base type should be FPMathOperator,
not Instruction). The original commit message is duplicated below:

    Clang has a builtin function '__builtin_isnan', which implements the C
    library function 'isnan'. This function is currently implemented entirely in
    clang codegen, which expands the function into a set of IR operations.
    There are three mechanisms by which the expansion can be made.

    * The most common mechanism is using an unordered comparison made by
      instruction 'fcmp uno'. This simple solution is target-independent
      and works well in most cases. It however is not suitable if floating
      point exceptions are tracked. The corresponding IEEE 754 operation and C
      function must never raise an FP exception, even if the argument is a
      signaling NaN. Compare instructions usually do not have this
      property; they raise the 'invalid' exception in such a case. So this
      mechanism is unsuitable when exception behavior is strict. In
      particular, it could result in unexpected trapping if the argument is an SNaN.

    * Another solution was implemented in https://reviews.llvm.org/D95948.
      It is used in the cases when raising FP exceptions by 'isnan' is not
      allowed. This solution implements 'isnan' using integer operations.
      It solves the problem of exceptions but offers a single solution for all
      targets, although some can do the check in a more efficient way.

    * Solution implemented by https://reviews.llvm.org/D96568 introduced a
      hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
      specific code into IR. Currently only SystemZ implements this hook, and it
      generates a call to a target-specific intrinsic function.

    Although these mechanisms allow 'isnan' to be implemented efficiently
    enough, expanding 'isnan' in clang has drawbacks:

    * The operation 'isnan' is hidden behind generic integer operations or
      target-specific intrinsics. It complicates analysis and can prevent
      some optimizations.

    * IR can be created by tools other than clang; in this case the treatment
      of 'isnan' has to be duplicated in that tool.

    Another issue with the current implementation of 'isnan' comes from the
    use of options '-ffast-math' or '-fno-honor-nans'. If such an option is
    specified, 'fcmp uno' may be optimized to 'false'. It is a valid
    optimization in general, but it results in 'isnan' always returning
    'false'. For example, in some libc++ implementations the following code
    returns 'false':

        std::isnan(std::numeric_limits<float>::quiet_NaN())

    The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
    operands are never NaNs. This assumption however should not be applied
    to the functions that check FP number properties, including 'isnan'. If
    such a function returns the expected result instead of actually making
    checks, it becomes useless in many cases. The option '-ffast-math' is
    often used for performance critical code, as it can speed up execution
    at the expense of manual treatment of corner cases. If 'isnan' returns
    an assumed result, a user cannot use it in the manual treatment of NaNs
    and has to invent replacements, like making the check using integer
    operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
    which also expresses the opinion that limitations imposed by
    '-ffast-math' should be applied only to 'math' functions but not to
    'tests'.

    To overcome these drawbacks, this change introduces a new IR intrinsic
    function 'llvm.isnan', which realizes the check as specified by IEEE-754
    and C standards in a target-agnostic way. During IR transformations it
    does not undergo undesirable optimizations. It reaches instruction
    selection, where it is lowered in a target-dependent way. The lowering can
    vary depending on options like '-ffast-math' or '-ffp-model' so the
    resulting code satisfies requested semantics.

    Differential Revision: https://reviews.llvm.org/D104854
2021-08-06 14:32:27 +07:00
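The message above mentions implementing 'isnan' using integer operations, both as the second expansion mechanism and as the replacement users have to invent under '-ffast-math'. A rough sketch of such a check for IEEE-754 binary32 values follows. The helper names are hypothetical, and this is neither the code from D95948 nor the lowering of 'llvm.isnan'.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as a float.
float from_bits(std::uint32_t bits) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  float value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}

// Hypothetical helper: a value is NaN when its exponent bits are all ones
// and its mantissa is non-zero. After clearing the sign bit, that is
// exactly the set of patterns greater than the one for +infinity.
bool is_nan_bits(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return (bits & 0x7fffffffu) > 0x7f800000u;
}
```

Because the test uses only an integer compare, it raises no FP exception for a signaling NaN, and it contains no 'fcmp uno' that a NaN-free assumption could fold to 'false'.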
Fangrui Song c38efb4899 [clang] Implement -falign-loops=N (N is a power of 2) for non-LTO
GCC supports multiple forms of -falign-loops=.
-falign-loops= is currently ignored in Clang.

This patch implements the simplest but the most useful form where N is a
power of 2.

The underlying implementation uses a `llvm::TargetOptions` option for now.
Bitcode generation ignores this option.

Differential Revision: https://reviews.llvm.org/D106701
2021-08-05 12:17:50 -07:00
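The "N is a power of 2" restriction in the commit above can be illustrated with a small check. This is only a sketch with hypothetical helper names; it does not claim to show how clang validates or stores the -falign-loops= value.

```cpp
#include <cstdint>

// Hypothetical helper: a power of two has exactly one bit set, so clearing
// the lowest set bit with n & (n - 1) must leave zero.
constexpr bool is_power_of_two(std::uint64_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical helper: for a power of two, the base-2 logarithm is the
// number of right shifts needed to reach 1.
constexpr unsigned log2_of_power_of_two(std::uint64_t n) {
  unsigned shift = 0;
  while (n > 1) {
    n >>= 1;
    ++shift;
  }
  return shift;
}
```

Under this sketch a value such as 16 passes the check and corresponds to a 2^4-byte alignment, while 24 is rejected.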
Anshil Gandhi 39dac1f7f6 [clang] Add clang builtins support for gfx90a
Implement target builtins for gfx90a including fadd64, fadd32, add2h,
max and min on various global, flat and ds address spaces for which
intrinsics are implemented.

Differential Revision: https://reviews.llvm.org/D106909
2021-08-05 02:08:06 -06:00
Pavel Asyutchenko 7df405e079 Apply -fmacro-prefix-map to __builtin_FILE()
This matches the behavior of GCC.
The patch does not change the remapping logic itself, so adding one simple smoke test should be enough.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107393
2021-08-04 16:42:14 -07:00
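A small sketch of the builtin affected by the commit above. `__builtin_FILE()` is provided by both GCC and Clang; the wrapper name `current_file` is hypothetical. Used as a default argument, the builtin is evaluated at the call site, so the wrapper reports the caller's file name.

```cpp
// Hypothetical wrapper: returns the file name of the calling translation
// unit. With the change above, compiling with -fmacro-prefix-map=OLD=NEW is
// expected to remap this string the same way it remaps __FILE__.
inline const char *current_file(const char *file = __builtin_FILE()) {
  return file;
}
```

The exact string depends on how the source path was passed to the compiler, so no particular output is assumed here.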
Bradley Smith e57e1e4e00 [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts
For fixed SVE types, predicates are represented using vectors of i8,
whereas for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.

Differential Revision: https://reviews.llvm.org/D106860
2021-08-04 16:10:37 +00:00
Serge Pavlov 0c28a7c990 Revert "Introduce intrinsic llvm.isnan"
This reverts commit 16ff91ebcc.
Several errors were reported, mainly in test-suite execution time. Reverted
for investigation.
2021-08-04 17:18:15 +07:00