Commit Graph

14259 Commits

Author SHA1 Message Date
Melanie Blower b55d723ed6 Revert "Modify FPFeatures to use delta not absolute settings"
This reverts commit 3a748cbf86.
I'm reverting this commit because I forgot to format the commit message
propertly. Sorry for the thrash.
2020-06-26 07:52:57 -07:00
Melanie Blower 3a748cbf86 Modify FPFeatures to use delta not absolute settings 2020-06-26 07:41:09 -07:00
Francesco Petrogalli 7200fa38a9 [sve][acle] Add some C intrinsics for brain float types.
Summary:
The following intrinsics has been added:

svuint16_t svcnt[_bf16]_m(svuint16_t inactive, svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_x(svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_z(svbool_t pg, svbfloat16_t op)

svbfloat16_t svtbl[_bf16](svbfloat16_t data, svuint16_t indices)

svbfloat16_t svtbl2[_bf16](svbfloat16x2_t data, svuint16_t indices)

svbfloat16_t svtbx[_bf16](svbfloat16_t fallback, svbfloat16_t data, svuint16_t indices)

Reviewers: c-rhodes, kmclaughlin, efriedma, sdesmalen, ctetreau

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82429
2020-06-25 16:31:01 +00:00
Andrew Wock 15edd7aaa7 [FPEnv] PowerPC-specific builtin constrained FP enablement
This change enables PowerPC compiler builtins to generate constrained
floating point operations when clang is indicated to do so.

A couple of possibly unexpected backend divergences between constrained
floating point and regular behavior are highlighted under the test tag
FIXME-CHECK. This may be something for those on the PPC backend to look
at.

Patch by: Drew Wock <drew.wock@sas.com>

Differential Revision: https://reviews.llvm.org/D82020
2020-06-25 11:42:58 -04:00
Alexey Bataev 32ea3397be [OPENMP]Dynamic globalization for parallel target regions.
Summary:
Added support for dynamic memory allocation for globalized variables in
case if execution of target regions in parallel is required.

Reviewers: jdoerfert

Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82324
2020-06-25 08:25:24 -04:00
Tyker c95ffadb24 [AssumeBundles] Use operand bundles to encode alignment assumptions
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complemantary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignemnt
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as a "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.

Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71739
2020-06-25 12:59:44 +02:00
Nigel Perks dc3f8913d2 Fix crash on XCore on unused inline in EmitTargetMetadata
EmitTargetMetadata passed to emitTargetMD a null pointer as returned
from GetGlobalValue, for an unused inline function which has been
removed from the module at that point.

A FIXME in CodeGenModule.cpp commented that the calling code in
EmitTargetMetadata should be moved into the one target that needs it
(XCore). A review comment agreed. So the calling loop has been moved
into the XCore subclass. The check for null is done in that loop.

Differential Revision: https://reviews.llvm.org/D77068
2020-06-24 12:48:17 -07:00
Michael Liao ebc9e0f1f0 Fix coding style. NFC.
- Remove `else` after `return`.
2020-06-24 13:13:42 -04:00
Cullen Rhodes 05e10ee0ae [AArch64][SVE2] Add bfloat16 support to whilerw/whilewr intrinsics
Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D82399
2020-06-24 10:06:31 +00:00
Cullen Rhodes fd2c4b8999 [AArch64][SVE] Add bfloat16 support to svlen intrinsic
Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D82186
2020-06-24 10:05:51 +00:00
Kazushi (Jam) Marukawa 96d4ccf00c [VE] Clang toolchain for VE
Summary:
This patch enables compilation of C code for the VE target with Clang.

Differential Revision: https://reviews.llvm.org/D79411
2020-06-24 10:12:09 +02:00
Eli Friedman bf8b63ed29 [clang codegen] Fix alignment of "Address" for incomplete array pointer.
The code was assuming all incomplete types don't have meaningful
alignment, but incomplete arrays do have meaningful alignment.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45710

Differential Revision: https://reviews.llvm.org/D79052
2020-06-23 17:16:17 -07:00
David Blaikie 4935419d77 Remove clang::Codegen::EHPadEndScope as unused
Unused since r255423 / D15140 /  4e52d6f811

Found indirectly by assessing -debug-info-kind=constructors and
observing the EHPadEndScope type was never emitted because the
constructor is never called. (all credit to Amy Huang for identifying
this issue)
2020-06-23 15:18:49 -07:00
Mikhail Maltsev 3f353a2e5a [BFloat] Add convert/copy instrinsic support
This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

Specifically it adds intrinsic support in clang and llvm for Arm and AArch64.

The bfloat type, and its properties are specified in the Arm Architecture Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:
  - Alexandros Lamprineas
  - Luke Cheeseman
  - Mikhail Maltsev
  - Momchil Velikov
  - Luke Geeson

Differential Revision: https://reviews.llvm.org/D80928
2020-06-23 14:27:05 +00:00
Alexey Bataev cb90e6a7c0 [OPENMP50]Codegen for scan directives in parallel for simd regions.
Summary:
Added codegen for scan directives in parallel for simd regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp parallel for simd reduction(inscan, op : ...)
for() {
  <input phase>;
  #pragma omp scan (in)exclusive(...)
  <scan phase>
}
```
is transformed to something:
```
 #pragma omp parallel
{
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --k)
  buffer[i] op= buffer[i-pow(2,k)];
 #pragma omp for simd
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
}
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82115
2020-06-23 08:41:11 -04:00
Mikhail Maltsev 9c579540ff [ARM] BFloat MatMul Intrinsics&CodeGen
Summary:
This patch adds support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are
provided as needed.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman
 - Simon Tatham

Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM

Reviewed By: MarkMurrayARM

Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81740
2020-06-23 12:06:37 +00:00
Sander de Smalen 121e585ec8 [AArch64][SVE] ACLE: Add bfloat16 to struct load/stores.
This patch contains:
- Support in LLVM CodeGen for bfloat16 types for ld2/3/4 and st2/3/4.
- New bfloat16 ACLE builtins for svld(2|3|4)[_vnum] and svst(2|3|4)[_vnum]

Reviewers: stuij, efriedma, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Tags: #clang, #lldb, #llvm

Differential Revision: https://reviews.llvm.org/D82187
2020-06-23 12:12:35 +01:00
Craig Topper 0dfc8e1837 [X86] Remove encoding value from the X86_FEATURE and X86_FEATURE_COMPAT macro. NFCI
This was orignally done so we could separate the compatibility
values and the llvm internal only features into a separate entries
in the feature array. This was needed when we explicitly had to
convert the feature into the proper 32-bit chunk at every reference
and we didn't want things moving around.

Now everything is in an array and we have helper funtions or macros
to convert encoding to index. So we renumbering is no longer an
issue.
2020-06-22 11:46:21 -07:00
Mikhail Maltsev 3a4feb1d53 [ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors
Currently, in order to extract an element from a bf16 vector, we cast
the vector to an i16 vector, perform the extraction, and cast the result to
bfloat. This behavior was copied from the old fp16 implementation.

The goal of this patch is to achieve optimal code generation for lane
copying intrinsics in a subsequent patch (LLVM fails to fold certain
combinations of bitcast, insertelement, extractelement and
shufflevector instructions leading to the generation of suboptimal code).

Differential Revision: https://reviews.llvm.org/D82206
2020-06-22 17:35:43 +00:00
Zhi Zhuang 37fb860301 Add support of __builtin_expect_with_probability
Add a new builtin-function __builtin_expect_with_probability and
intrinsic llvm.expect.with.probability.
The interface is __builtin_expect_with_probability(long expr, long
expected, double probability).
It is mainly the same as __builtin_expect besides one more argument
indicating the probability of expression equal to expected value. The
probability should be a constant floating-point expression and be in
range [0.0, 1.0] inclusive.
It is similar to builtin-expect-with-probability function in GCC
built-in functions.

Differential Revision: https://reviews.llvm.org/D79830
2020-06-22 10:21:28 -07:00
Eric Christopher 0861889be1 [clang/llvm] As part of using inclusive language within
the llvm project, migrate away from the use of blacklist and whitelist.
2020-06-20 16:03:58 -07:00
Eric Christopher 10563e16aa [Analysis/Transforms/Sanitizers] As part of using inclusive language
within the llvm project, migrate away from the use of blacklist and
whitelist.
2020-06-20 00:42:26 -07:00
Fangrui Song 2a4317bfb3 [SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list
Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now.

Reviewed By: echristo

Differential Revision: https://reviews.llvm.org/D82244
2020-06-19 22:22:47 -07:00
Xiangling Liao 3f2e61c1fe [AIX] Default AIX to using -fno-use-cxa-atexit
On AIX, we use __atexit to register dtor functions rather than __cxa_atexit.
So a driver change is needed to default AIX to using -fno-use-cxa-atexit.

Windows platform does not uses __cxa_atexit either. Following its precedent,
we remove the assertion for when -fuse-cxa-atexit is specified by the user,
do not produce a message and silently default to -fno-use-cxa-atexit behavior.

Differential Revision: https://reviews.llvm.org/D82136
2020-06-19 08:27:07 -04:00
Xiangling Liao 22337bfe7d [AIX][Frontend] Static init implementation for AIX considering no priority
1. Provides no piroirity supoort && disables three priority related
   attributes: init_priority, ctor attr, dtor attr;
2. '-qunique' in XL compiler equivalent behavior of emitting sinit
    and sterm functions name using getUniqueModuleId() util function
    in LLVM (currently no support for InternalLinkage and WeakODRLinkage
    symbols);
3. Add testcases to emit IR sample with __sinit80000000, __dtor, and
    __sterm80000000;
4. Temporarily side-steps the need to implement the functionality of
   llvm.global_ctors and llvm.global_dtors arrays. The uses of that
   functionality in this patch (with respect to the name of the functions
   involved) are not representative of how the functionality will be used
   once implemented.

Differential Revision: https://reviews.llvm.org/D74166
2020-06-19 08:27:07 -04:00
Sander de Smalen ad828e3f4d [SveEmitter] Add builtins for struct loads/stores (ld2/ld3/etc)
The struct store intrinsics in LLVM IR take the individual parts
as arguments, so this patch uses the intrinsics used for `svget`
to break the tuples into individual parts.

Reviewers: c-rhodes, efriedma, ctetreau, david-arm

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81466
2020-06-19 10:35:42 +01:00
Xiangling Liao ed1b556954 [NFC] Cleanup of EmitCXXGlobalInitFunc() and EmitCXXGlobalDtorFunc()
Tidy up some code of EmitCXXGlobalInitFunc() and EmitCXXGlobalDtorFunc() as the
pre-work of D74166 patch.

Differential Revision: https://reviews.llvm.org/D81972
2020-06-18 18:49:23 -04:00
Ties Stuij 035795659b [ARM][bfloat] Do not coerce bfloat arguments and returns to integers
Summary:
As part of moving the argument lowering handling for bfloat arguments and
returns to the backend, this patch removes the code that was responsible for
handling the coercion of those arguments in Clang's Codegen.

Subscribers: kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81837
2020-06-18 18:26:01 +01:00
Francesco Petrogalli 3e59dfc301 [llvm][SveEmitter] Emit the bfloat version of `svld1ro`.
Summary:
The new SVE builtin type __SVBFloat16_t` is used to represent scalable
vectors of bfloat elements.

Reviewers: sdesmalen, efriedma, stuij, ctetreau, shafik, rengolin

Subscribers: tschuett, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81304
2020-06-18 16:36:31 +00:00
Alexey Bataev 4971d0b8ec [OPENMP50]Allow nonmonotonic modifier for all schedule kinds.
Summary:
According to OpenMP 5.0, nonmonotonic modifier can be used with all
schedule kinds, not only dynamic and guided as in OpenMP 4.5.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82026
2020-06-18 12:30:50 -04:00
Alexey Bataev 1ec469cf4c [OPENMP50]Codegen for scan directives in parallel for regions.
Summary:
Added codegen for scan directives in parallel for regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp parallel for reduction(inscan, op : ...)
 for() {
   <input phase>;
   #pragma omp scan (in)exclusive(...)
   <scan phase>
 }
```
is transformed to something:

```
 #pragma omp parallel
{
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --k)
  buffer[i] op= buffer[i-pow(2,k)];
 #pragma omp for
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
}
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81478
2020-06-18 11:56:55 -04:00
Alexandre Ganea 89ea0b0520 [MC] Pass down argv0 & cc1 cmd-line to the back-end and store in MCTargetOptions
When targetting CodeView, the goal is to store argv0 & cc1 cmd-line in the emitted .OBJ, in order to allow a reproducer from the .OBJ alone.

This patch is to simplify https://reviews.llvm.org/D80833
2020-06-18 09:17:14 -04:00
Lucas Prates ada4c9dc4a [ARM][Clang] Removing lowering of half-precision FP arguments and returns from Clang's CodeGen
Summary:
On the process of moving the argument lowering handling for
half-precision floating point arguments and returns to the backend, this
patch removes the code that was responsible for handling the coercion of
those arguments in Clang's Codegen.

Reviewers: rjmccall, chill, ostannard, dnsampaio

Reviewed By: ostannard

Subscribers: stuij, kristof.beyls, dmgreen, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81451
2020-06-18 13:17:07 +01:00
Florian Hahn b5e082e728 [Matrix] Add __builtin_matrix_column_store to Clang.
This patch add __builtin_matrix_column_major_store to Clang,
as described in clang/docs/MatrixTypes.rst. In the initial version,
the stride is not optional yet.

Reviewers: rjmccall, jfb, rsmith, Bigcheese

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D72782
2020-06-18 11:39:02 +01:00
Sander de Smalen 4ea8e27a64 [SveEmitter] Add builtins to insert/extract subvectors from tuples (svget/svset)
For example:
  svint32_t svget4(svint32x4_t tuple, uint64_t imm_index)

returns the subvector at `index`, which must be in range `0..3`.
  svint32x3_t svset3(svint32x3_t tuple, uint64_t index, svint32_t vec)

returns a tuple vector with `vec` inserted into `tuple` at `index`,
which must be in range `0..2`.

Reviewers: c-rhodes, efriedma

Reviewed By: c-rhodes

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81464
2020-06-18 11:06:16 +01:00
Florian Hahn 934bcaf10b [Matrix] Add __builtin_matrix_column_load to Clang.
This patch add __builtin_matrix_column_major_load to Clang,
as described in clang/docs/MatrixTypes.rst. In the initial version,
the stride is not optional yet.

Reviewers: rjmccall, rsmith, jfb, Bigcheese

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D72781
2020-06-18 10:47:55 +01:00
Sander de Smalen 1d7b4a7e5e [SveEmitter] Add builtins for tuple creation (svcreate2/svcreate3/etc)
The svcreate builtins allow constructing a tuple from individual vectors, e.g.

  svint32x2_t svcreate2(svint32_t v2, svint32_t v2)`

Reviewers: c-rhodes, david-arm, efriedma

Reviewed By: c-rhodes, efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81463
2020-06-18 10:07:09 +01:00
Huihui Zhang 9d8d0646d7 [NFC] Silence compiler warning [-Wmissing-braces].
clang/lib/CodeGen/CGNonTrivialStruct.cpp:330:7: warning: suggest braces around initialization of subobject [-Wmissing-braces]
  Address(CGF->Builder.CreateLoad(CGF->GetAddrOfLocalVar(Args[Ints])),
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  {
2020-06-17 13:01:53 -07:00
Ian Levesque 7c7c8e0da4 [xray] Option to omit the function index
Summary:
Add a flag to omit the xray_fn_idx to cut size overhead and relocations
roughly in half at the cost of reduced performance for single function
patching.  Minor additions to compiler-rt support per-function patching
without the index.

Reviewers: dberris, MaskRay, johnislarry

Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D81995
2020-06-17 13:49:01 -04:00
Alexey Bataev 34ee2549a7 [OPENMP50]Codegen for scan directive in for simd regions.
Summary:
Added codegen for scan directives in parallel for regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp for simd reduction(inscan, op : ...)
for(...) {
  <input phase>;
  #pragma omp scan (in)exclusive(...)
  <scan phase>
}
```
is transformed to something:
```
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --k)
  buffer[i] op= buffer[i-pow(2,k)];
 #pragma omp for simd
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
```

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81658
2020-06-17 08:43:17 -04:00
Sander de Smalen e51c1d06a9 [SveEmitter] Add builtins for svtbl2
Reviewers: david-arm, efriedma, c-rhodes

Reviewed By: c-rhodes

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81462
2020-06-17 09:41:38 +01:00
Jun Ma 4a1776979f [CodeGen][TLS] Set TLS Model for __tls_guard as well.
Differential Revision: https://reviews.llvm.org/D81543
2020-06-17 08:31:13 +08:00
Christopher Tetreault eb81c85afd [SVE] Deprecate default false variant of VectorType::get
Reviewers: efriedma, fpetrogalli, kmclaughlin, huntergr

Reviewed By: fpetrogalli

Subscribers: cfe-commits, tschuett, rkruppe, psnobl, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D80342
2020-06-16 15:16:11 -07:00
Alexey Bataev 0f631bd3bb Revert "[OPENMP50]Codegen for scan directive in for simd regions."
This reverts commit 6e78a3086a to solve
the problem with mem leak.
2020-06-16 17:01:59 -04:00
Alexey Bataev 6e78a3086a [OPENMP50]Codegen for scan directive in for simd regions.
Summary:
Added codegen for scan directives in parallel for regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp for simd reduction(inscan, op : ...)
for(...) {
  <input phase>;
  #pragma omp scan (in)exclusive(...)
  <scan phase>
}
```
is transformed to something:
```
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --k)
  buffer[i] op= buffer[i-pow(2,k)];
 #pragma omp for simd
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81658
2020-06-16 16:13:27 -04:00
Luke Geeson 10b6567f49 [AArch64]: BFloat MatMul Intrinsics&CodeGen
This patch upstreams support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed. AArch32 Intrinsics + CodeGen will come after this
patch.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman

Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki,
stuij

Reviewed By: miyuki, stuij

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki, chill, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80752

Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f
2020-06-16 15:23:30 +01:00
Stanislav Mekhanoshin 9ee272f13d [AMDGPU] Add gfx1030 target
Differential Revision: https://reviews.llvm.org/D81886
2020-06-15 16:18:05 -07:00
Akira Hatanaka 2cfb027369 [CodeGen][NFC] Add a helper function that returns the addresses of
parameters of non-trivial C struct special functions

This removes the need to pass std::array of Addresses to getFunction,
which were overwritten in the function.
2020-06-15 15:59:16 -07:00
Arnold Schwaighofer 4a8120ca9f Fix ConstantAggregateBuilderBase::getRelativeOffset
Summary:
If a record has a mix of relative pointers and other fields they
wouldn't necessarily be the same.

Fallout from D77592.

rdar://64309883

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81857
2020-06-15 12:23:20 -07:00
Jeff Mott 8799ebbc1f [clang] Fix or emit diagnostic for checked arithmetic builtins with
_ExtInt types

- Fix computed size for _ExtInt types passed to checked arithmetic
  builtins.
- Emit diagnostic when signed _ExtInt larger than 128-bits is passed
    to __builtin_mul_overflow.
- Change Sema checks for builtins to accept placeholder types.

Differential Revision: https://reviews.llvm.org/D81420
2020-06-15 06:51:54 -07:00
Tyker 51e4aa87e0 attempt to fix failing buildbots after 3bab88b7ba
Prevent IR-gen from emitting consteval declarations

Summary: with this patch instead of emitting calls to consteval function. the IR-gen will emit a store of the already computed result.
2020-06-15 12:58:37 +02:00
Kirill Bobyrev 550c4562d1 Revert "Prevent IR-gen from emitting consteval declarations"
This reverts commit 3bab88b7ba.

This patch causes test failures:
http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/17260
2020-06-15 12:14:15 +02:00
Tyker 3bab88b7ba Prevent IR-gen from emitting consteval declarations
Summary: with this patch instead of emitting calls to consteval function. the IR-gen will emit a store of the already computed result.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76420
2020-06-15 10:47:14 +02:00
Nikita Popov 7cac7e0cfc [IR] Prefer hasFnAttribute() where possible (NFC)
When checking for an enum function attribute, use hasFnAttribute()
rather than hasAttribute() at FunctionIndex, because it is
significantly faster (and more concise to boot).
2020-06-15 09:30:35 +02:00
Sander de Smalen 91a4a592ed [SveEmitter] Add SVE tuple types and builtins for svundef.
This patch adds new SVE types to Clang that describe tuples of SVE
vectors. For example `svint32x2_t` which maps to the twice-as-wide
vector `<vscale x 8 x i32>`. Similarly, `svint32x3_t` will map to
`<vscale x 12 x i32>`.

It also adds builtins to return an `undef` vector for a given
SVE type.

Reviewers: c-rhodes, david-arm, ctetreau, efriedma, rengolin

Reviewed By: c-rhodes

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81459
2020-06-15 07:36:01 +01:00
Alex Bradbury 3dcfd482cb [CodeGen] Increase applicability of ffine-grained-bitfield-accesses for targets with limited native integer widths
As pointed out in PR45708, -ffine-grained-bitfield-accesses doesn't
trigger in all cases you think it might for RISC-V. The logic in
CGRecordLowering::accumulateBitFields checks OffsetInRecord is a legal
integer according to the datalayout. RISC targets will typically only
have the native width as a legal integer type so this check will fail
for OffsetInRecord of 8 or 16 when you would expect the transformation
is still worthwhile.

This patch changes the logic to check for an OffsetInRecord of a at
least 1 byte, that fits in a legal integer, and is a power of 2. We
would prefer to query whether native load/store operations are
available, but I don't believe that is possible.

Differential Revision: https://reviews.llvm.org/D79155
2020-06-12 10:33:47 +01:00
Akira Hatanaka c9a52de002 [CodeGen] Simplify the way lifetime of block captures is extended
Rather than pushing inactive cleanups for the block captures at the
entry of a full expression and activating them during the creation of
the block literal, just call pushLifetimeExtendedDestroy to ensure the
cleanups are popped at the end of the scope enclosing the block
expression.

rdar://problem/63996471

Differential Revision: https://reviews.llvm.org/D81624
2020-06-11 16:06:22 -07:00
John McCall 7fac1acc61 Set the LLVM FP optimization flags conservatively.
Functions can have local pragmas that override the global settings.
We set the flags eagerly based on global settings, but if we emit
an expression under the influence of a pragma, we clear the
appropriate flags from the function.

In order to avoid doing a ton of redundant work whenever we emit
an FP expression, configure the IRBuilder to default to global
settings, and only reconfigure it when we see an FP expression
that's not using the global settings.

Patch by Michele Scandale!

https://reviews.llvm.org/D80462
2020-06-11 18:16:41 -04:00
Alexey Bataev 43101d10db [OPENMP50]Codegen for scan directive in simd loops.
Added codegen for scan directives in simd loop. The codegen transforms
original code:
```
int x = 0;
 #pragma omp simd reduction(inscan, +: x)
for (..) {
  <first part>
  #pragma omp scan inclusive(x)
  <second part>
}
```
into
```
int x = 0;
for (..) {
  int x_priv = 0;
  <first part>
  x = x_priv + x;
  x_priv = x;
  <second part>
}
```
and
```
int x = 0;
 #pragma omp simd reduction(inscan, +: x)
for (..) {
  <first part>
  #pragma omp scan exclusive(x)
  <second part>
}
```
into
```
int x = 0;
for (..) {
  int x_priv = 0;
  <second part>
  int temp = x;
  x = x_priv + x;
  x_priv = temp;
  <first part>
}
```

Differential revision: https://reviews.llvm.org/D78232
2020-06-11 14:48:43 -04:00
Leonard Chan 71568a9e28 [clang] Frontend components for the relative vtables ABI (round 2)
This patch contains all of the clang changes from D72959.

- Generalize the relative vtables ABI such that it can be used by other targets.
- Add an enum VTableComponentLayout which controls whether components in the
  vtable should be pointers to other structs or relative offsets to those structs.
  Other ABIs can change this enum to restructure how components in the vtable
  are laid out/accessed.
- Add methods to ConstantInitBuilder for inserting relative offsets to a
  specified position in the aggregate being constructed.
- Fix failing tests under new PM and ASan and MSan issues.

See D72959 for background info.

Differential Revision: https://reviews.llvm.org/D77592
2020-06-11 11:17:08 -07:00
Alexey Bataev fac7259c81 Revert "[OPENMP50]Codegen for scan directive in simd loops."
This reverts commit fb80e67f10 to resolve
the issue with asan buildbots.
2020-06-11 11:22:51 -04:00
Alexey Bataev 90b54fa045 [OPENMP50]Codegen for use_device_addr clauses.
Summary:
Added codegen for use_device_addr clause. The components of the list
items are mapped as a kind of RETURN components and then the returned
base address is used instead of the real address of the base declaration
used in the use_device_addr expressions.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80730
2020-06-11 09:54:51 -04:00
Alexey Bataev fb80e67f10 [OPENMP50]Codegen for scan directive in simd loops.
Added codegen for scandirectives in simd loop. The codegen transforms
original code:

```
int x = 0;
 #pragma omp simd reduction(inscan, +: x)
for (..) {
  <first part>
  #pragma omp scan inclusive(x)
  <second part>
}
```
into
```
int x = 0;
for (..) {
  int x_priv = 0;
  <first part>
  x = x_priv + x;
  x_priv = x;
  <second part>
}
```
and
```
int x = 0;
 #pragma omp simd reduction(inscan, +: x)
for (..) {
  <first part>
  #pragma omp scan exclusive(x)
  <second part>
}
```
into
```
int x = 0;
for (..) {
  int x_priv = 0;
  <second part>
  int temp = x;
  x = x_priv + x;
  x_priv = temp;
  <first part>
}
```

Differential revision: https://reviews.llvm.org/D78232
2020-06-11 09:01:23 -04:00
Daniel Grumberg e87e55edbc Make ASTFileSignature an array of 20 uint8_t instead of 5 uint32_t
Reviewers: aprantl, dexonsmith, Bigcheese

Subscribers: arphaman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81347
2020-06-11 09:12:29 +01:00
Craig Topper ed34140e11 [X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC 2020-06-10 22:06:34 -07:00
Leonard Chan 7201272d4c Revert "[clang] Frontend components for the relative vtables ABI"
This reverts commit 2e009dbcb3.

Reverting since there were some test failures on buildbots that used the
new pass manager. ASan and MSan are also finding some bugs in this that
I'll need to address.
2020-06-10 13:50:05 -07:00
Leonard Chan 2e009dbcb3 [clang] Frontend components for the relative vtables ABI
This patch contains all of the clang changes from D72959.

- Generalize the relative vtables ABI such that it can be used by other targets.
- Add an enum VTableComponentLayout which controls whether components in the
  vtable should be pointers to other structs or relative offsets to those structs.
  Other ABIs can change this enum to restructure how components in the vtable
  are laid out/accessed.
- Add methods to ConstantInitBuilder for inserting relative offsets to a
  specified position in the aggregate being constructed.

See D72959 for background info.

Differential Revision: https://reviews.llvm.org/D77592
2020-06-10 12:48:10 -07:00
Arthur Eubanks bc38793852 Change debuginfo check for addHeapAllocSiteMetadata
Summary:
Move check inside of addHeapAllocSiteMetadata().
Change check to DebugInfo <= DebugLineTablesOnly.

Reviewers: akhuang

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81481
2020-06-09 11:01:06 -07:00
Thomas Lively b7d369280b [WebAssembly] Implement prototype SIMD rounding instructions
Summary:
As specified in https://github.com/WebAssembly/simd/pull/232. These
instructions are implemented as LLVM intrinsics for now rather than
normal ISel patterns to make these instructions opt-in. Once the
instructions are merged to the spec proposal, the intrinsics will be
replaced with proper ISel patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81222
2020-06-09 10:14:14 -07:00
Saiyedul Islam 675cefbf60 [AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics
Summary:
__builtin_amdgcn_atomic_inc32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_inc64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_dec32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_dec64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope)

First and second arguments gets transparently passed to the amdgcn atomic
inc/dec intrinsic. Fifth argument of the intrinsic is set as true if the
first argument of the builtin is a volatile pointer. The third argument of
this builtin is one of the memory-ordering specifiers ATOMIC_ACQUIRE,
ATOMIC_RELEASE, ATOMIC_ACQ_REL, or ATOMIC_SEQ_CST following C++11 memory
model semantics. This is mapped to corresponding LLVM atomic memory ordering
for the atomic inc/dec instruction using CLANG atomic C ABI. The fourth
argument is an AMDGPU-specific synchronization scope defined as string.

Reviewers: arsenm, sameerds, JonChesterfield, jdoerfert

Reviewed By: arsenm, sameerds

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, kerbowa, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80804
2020-06-09 17:02:58 +00:00
Arthur Eubanks ce7d3e1c55 Reland (again) D80966 [codeview] Put !heapallocsite on calls to operator new
Check that getDebugInfo() is not null, as in the first revision, before
calling getDebugInfo()->addHeapAllocSiteMetadata().
Else would cause a crash with a new expression in a default arg.

---

Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.

Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.

Differential Revision: https://reviews.llvm.org/D80966
2020-06-09 09:27:32 -07:00
Alexey Bataev cb9191c042 [OPENMP]Improve code readability, NFC.
Reuse existing function instead of code duplication and use better type.
2020-06-09 08:50:36 -04:00
Florian Hahn 3323a628ec [Matrix] Add __builtin_matrix_transpose to Clang.
This patch add __builtin_matrix_transpose to Clang, as described in
clang/docs/MatrixTypes.rst.

Reviewers: rjmccall, jfb, rsmith, Bigcheese

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D72778
2020-06-09 10:14:37 +01:00
Arthur Eubanks a92ce3b706 Revert "Reland D80966 [codeview] Put !heapallocsite on calls to operator new"
This reverts commit b6e143aa54.

Causes https://bugs.chromium.org/p/chromium/issues/detail?id=1092370#c5.
Will investigate and reland (again).
2020-06-08 12:49:41 -07:00
Jian Cai 4db2b70248 Add a flag to debug automatic variable initialization
Summary:
Add -ftrivial-auto-var-init-stop-after= to limit the number of times
stack variables are initialized when -ftrivial-auto-var-init= is used to
initialize stack variables to zero or a pattern. This flag can be used
to bisect uninitialized uses of a stack variable exposed by automatic
variable initialization, such as http://crrev.com/c/2020401.

Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich

Reviewed By: jfb

Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77168
2020-06-08 12:30:56 -07:00
Arthur Eubanks c07339c675 Move *San module passes later in the NPM pipeline
Summary:
This fixes pr33372.cpp under the new pass manager.

ASan adds padding to globals. For example, it will change a {i32, i32, i32} to a {{i32, i32, i32}, [52 x i8]}. However, when loading from the {i32, i32, i32}, InstCombine may (after various optimizations) end up loading 16 bytes instead of 12, likely because it thinks the [52 x i8] padding is ok to load from. But ASan checks that padding should not be loaded from.

Ultimately this is an issue of *San passes wanting to be run after all optimizations. This change moves the module passes right next to the corresponding function passes.

Also remove comment that's no longer relevant, this is the last ASan/MSan/TSan failure under the NPM (hopefully...).

As mentioned in https://reviews.llvm.org/rG1285e8bcac2c54ddd924ffb813b2b187467ac2a6, NPM doesn't support LTO + sanitizers, so modified some tests that test for that.

Reviewers: leonardchan, vitalybuka

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81323
2020-06-08 12:08:49 -07:00
Fangrui Song fc935fc35b Reland D80979 [clang] Implement VectorType logic not operator
With a fix to use -triple %itanium_abi_triple

Differential Revision: https://reviews.llvm.org/D80979
2020-06-08 09:32:30 -07:00
Nico Weber abca3b7b2c Revert "[clang] Implement VectorType logic not operator."
This reverts commit a0de3335ed.
Breaks check-clang on Windows, see e.g.
https://reviews.llvm.org/D80979#2078750 (but fails on all
other Windows bots too).
2020-06-08 06:45:21 -04:00
Jun Ma a0de3335ed [clang] Implement VectorType logic not operator.
Differential Revision: https://reviews.llvm.org/D80979
2020-06-08 08:41:01 +08:00
Fangrui Song b6e143aa54 Reland D80966 [codeview] Put !heapallocsite on calls to operator new
With a change to use `CGM.getCodeGenOpts().getDebugInfo() != codegenoptions::NoDebugInfo`
instead of `getDebugInfo()`,
to fix `Profile-<arch> :: instrprof-gcov-multithread_fork.test`

See CodeGenModule::CodeGenModule, `EmitGcovArcs || EmitGcovNotes` can
set `clang::CodeGen::CodeGenModule::DebugInfo`.

---

Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.

Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.

Differential Revision: https://reviews.llvm.org/D80966
2020-06-07 13:35:20 -07:00
Florian Hahn 4affc444b4 [Matrix] Implement * binary operator for MatrixType.
This patch implements the * binary operator for values of
MatrixType. It adds support for matrix * matrix, scalar * matrix and
matrix * scalar.

For the matrix, matrix case, the number of columns of the first operand
must match the number of rows of the second. For the scalar,matrix variants,
the element type of the matrix must match the scalar type.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76794
2020-06-07 11:11:27 +01:00
Douglas Yung 059ba74bb6 Revert "[codeview] Put !heapallocsite on calls to operator new"
This reverts commit 672ed53860.

This commit is hitting an assertion failure across multiple bots in the test:
Profile-<arch> :: instrprof-gcov-multithread_fork.test

Failing bots include:
http://lab.llvm.org:8011/builders/llvm-avr-linux/builds/2205
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/8967
http://lab.llvm.org:8011/builders/clang-cmake-armv7-full/builds/10789
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/27750
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/16751
2020-06-06 23:30:46 +00:00
Richard Smith f39e12a06b PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz.
Summary:
This transformation is correct for a builtin call to 'free(p)', but not
for 'operator delete(p)'. There is no guarantee that a user replacement
'operator delete' has no effect when called on a null pointer.

However, the principle behind the transformation *is* correct, and can
be applied more broadly: a 'delete p' expression is permitted to
unconditionally call 'operator delete(p)'. So do that in Clang under
-Oz where possible. We do this whether or not 'p' has trivial
destruction, since the destruction might turn out to be trivial after
inlining, and even for a class-specific (but non-virtual,
non-destroying, non-array) 'operator delete'.

Reviewers: davide, dnsampaio, rjmccall

Reviewed By: dnsampaio

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79378
2020-06-05 17:13:43 -07:00
Reid Kleckner 672ed53860 [codeview] Put !heapallocsite on calls to operator new
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.

Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D80966
2020-06-05 12:52:38 -07:00
Ties Stuij 8b137a4306 [clang][BFloat] Add create/set/get/dup intrinsics
Summary:
This patch is part of a series that adds support for the Bfloat16 extension of
the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Luke Geeson
- Ties Stuij
- Mikhail Maltsev

Reviewers: t.p.northover, sdesmalen, fpetrogalli, LukeGeeson, stuij, labrinea

Reviewed By: labrinea

Subscribers: miyuki, dmgreen, labrinea, kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79710
2020-06-05 14:35:10 +01:00
Ties Stuij ecd682bbf5 [ARM] Add __bf16 as new Bfloat16 C Type
Summary:
This patch upstreams support for a new storage only bfloat16 C type.
This type is used to implement primitive support for bfloat16 data, in
line with the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

In detail this patch:
- introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type.

This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.

The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Alexandros Lamprineas
- Luke Geeson
- Simon Tatham
- Ties Stuij

Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli

Reviewed By: SjoerdMeijer

Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76077
2020-06-05 10:32:43 +01:00
Alexey Bataev 4e3d4622b1 Fix undefined behaviour when trying to deref nullptr. 2020-06-04 17:52:06 -04:00
Alexey Bataev bd1c03d7b7 [OPENMP50]Codegen for inscan reductions in worksharing directives.
Summary:
Implemented codegen for reduction clauses with inscan modifiers in
worksharing constructs.

Emits the code for the directive with inscan reductions.
The code is the following:
```
size num_iters = <num_iters>;
<type> buffer[num_iters];
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --k)
  buffer[i] op= buffer[i-pow(2,k)];
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, arphaman, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79948
2020-06-04 16:29:33 -04:00
Alexey Bataev 9ca5a6d3b5 [OPENMP]Fix PR46146: Do not consider globalized variables as NRVO candidates.
Summary:
If the variables must be globalized in OpenMP mode (local automatic
variable, GPU compilation mode, the variable may escape its declaration
context by the reference or by the pointer), it should not be considered
as the NRVO candidate. Otherwise, incorrect the return value of the
function might not be updated.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80936
2020-06-04 12:33:25 -04:00
Craig Topper dd863ccae1 [X86] Separate X86_CPU_TYPE_COMPAT_WITH_ALIAS from X86_CPU_TYPE_COMPAT. NFC
Add a separate X86_CPU_TYPE_COMPAT_ALIAS that carries alias string
and the enum from X86_CPU_TYPE_COMPAT.
2020-06-03 14:13:12 -07:00
Yaxun (Sam) Liu 04abbb3a78 [HIP] Change default --gpu-max-threads-per-block value to 1024
Differential Revision: https://reviews.llvm.org/D76795
2020-06-03 11:09:22 -04:00
Andrew Wock 15a1780a10 [PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins
This is a re-revert with a corrected test.

This patch adds a test for the PowerPC fma compiler builtins, some variations
of which negate inputs and outputs. The code to generate IR for these
builtins was untested before this patch.

Originally, the code used the outdated method of subtracting floating point
values from -0.0 as floating point negation. This patch remedies that.

Patch by: Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D76949
2020-06-03 09:45:27 -04:00
Alexey Bataev 59e0987a06 [OPENMP]Fix PR46170: partial mapping for array sections of data members.
Summary:
If the data member is mapped as an array section, need to emit the
pointer to the last element of this array section and use this pointer
as the highest element in partial struct data.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81037
2020-06-03 09:10:20 -04:00
Lucas Prates 8beaba13b8 [Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts
Summary:
During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly
assuming all the pointers from which loads were being generated for vld1
intrinsics were aligned according to the intrinsics result type, causing
alignment faults on the code generated by the backend.

This patch updates vld1 intrinsics' CodeGen to properly capture the
correct load alignment based on the type of the pointer provided as
input for the intrinsic.

Reviewers: t.p.northover, ostannard, pcc, efriedma

Reviewed By: ostannard, efriedma

Subscribers: echristo, plotfi, nickdesaulniers, efriedma, kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79721
2020-06-03 11:39:27 +01:00
Wei Mi 7a6c89427c [SampleFDO] Add use-sample-profile function attribute.
When sampleFDO is enabled, people may expect they can use
-fno-profile-sample-use to opt-out using sample profile for a certain file.
That could be either for debugging purpose or for performance tuning purpose.
However, when thinlto is enabled, if a function in file A compiled with
-fno-profile-sample-use is imported to another file B compiled with
-fprofile-sample-use, the inlined copy of the function in file B may still
get its profile annotated.

The inconsistency may even introduce profile unused warning because if the
target is not compiled with explicit debug information flag, the function
in file A won't have its debug information enabled (debug information will
be enabled implicitly only when -fprofile-sample-use is used). After it is
imported into file B which is compiled with -fprofile-sample-use, profile
annotation for the outline copy of the function will fail because the
function has no debug information, and that will trigger  profile unused
warning.

We add a new attribute use-sample-profile to control whether a function
will use its sample profile no matter for its outline or inline copies.
That will make the behavior of -fno-profile-sample-use consistent.

Differential Revision: https://reviews.llvm.org/D79959
2020-06-02 17:23:17 -07:00
Vitaly Buka 232d348c6e [MTE] Convert StackSafety into analysis
This lets us to remove !stack-safe metadata and
better controll when to perform StackSafety
analysis.

Reviewers: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80771
2020-06-02 16:08:14 -07:00
Min-Yih Hsu 4431d64c10 Support ExtVectorType conditional operator
Extension vectors now can be used in element-wise conditional selector.
For example:
```
R[i] = C[i]? A[i] : B[i]
```
This feature was previously only enabled in OpenCL C. Now it's also
available in C. Not that it has different behaviors than GNU vectors
(i.e. __vector_size__). Extension vectors selects on signdness of the
vector. GNU vectors on the other hand do normal bool conversions. Also,
this feature is not available in C++.

Differential Revision: https://reviews.llvm.org/D80574
2020-06-02 16:35:42 +00:00
Alexey Bataev 89d9dba2c6 [OPENMP50]Initial codegen for 'affinity' clauses.
Summary:
Added initial codegen for 'affinity' clauses on task directives.
Emits next code:
```
kmp_task_affinity_info_t affs[<num_elems>];

void *td = __kmpc_task_alloc(..);

affs[<i>].base = &data_i;
affs[<i>].size = sizeof(data_i);
__kmpc_omp_reg_task_with_affinity(&loc, <gtid>, td, <num_elems>, affs);
```

The result returned by the call of `__kmpc_omp_reg_task_with_affinity`
function is ignored currently sincethe  runtime currently ignores args
and returns 0 uncoditionally.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, llvm-commits, cfe-commits, caomhin

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80240
2020-06-02 10:50:08 -04:00
Sriraman Tallam e0bca46b08 Options for Basic Block Sections, enabled in D68063 and D73674.
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.

+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.

Differential Revision: https://reviews.llvm.org/D68049
2020-06-02 00:23:32 -07:00
John McCall 8a8d703be0 Fix how cc1 command line options are mapped into FP options.
Canonicalize on storing FP options in LangOptions instead of
redundantly in CodeGenOptions.  Incorporate -ffast-math directly
into the values of those LangOptions rather than considering it
separately when building FPOptions.  Build IR attributes from
those options rather than a mix of sources.

We should really simplify the driver/cc1 interaction here and have
the driver pass down options that cc1 directly honors.  That can
happen in a follow-up, though.

Patch by Michele Scandale!
https://reviews.llvm.org/D80315
2020-06-01 22:00:30 -04:00
Joseph Huber 1a4fb2edcb [OpenMP] Replace Clang's OpenMP RTL Definitions with OMPKinds.def
Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changed the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits

Tags: #openmp, #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80222
2020-06-01 16:23:10 -04:00
Florian Hahn 8f3f88d2f5 [Matrix] Implement matrix index expressions ([][]).
This patch implements matrix index expressions
(matrix[RowIdx][ColumnIdx]).

It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx).
MatrixSubscriptExprs are built in 2 steps in ActOnMatrixSubscriptExpr. First,
if the base of a subscript is of matrix type, we create a incomplete
MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete
MatrixSubscriptExpr, we create a complete
MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx)

Similar to vector elements, it is not possible to take the address of
a MatrixSubscriptExpr.
For CodeGen, a new MatrixElt type is added to LValue, which is very
similar to VectorElt. The only difference is that we may need to cast
the type of the base from an array to a vector type when accessing it.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76791
2020-06-01 20:08:49 +01:00
Christopher Tetreault 796898172c [SVE] Eliminate calls to default-false VectorType::get() from Clang
Reviewers: efriedma, david-arm, fpetrogalli, ddunbar, rjmccall

Reviewed By: fpetrogalli, rjmccall

Subscribers: tschuett, rkruppe, psnobl, dmgreen, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80323
2020-06-01 10:02:14 -07:00
Nick Desaulniers ef1d4bec89 [Clang][CGM] style cleanups NFC
Summary:
Forked from:
https://reviews.llvm.org/D80242

Use the getter for access to DebugInfo consistently.
Use break in switch in CodeGenModule::EmitTopLevelDecl consistently.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: cfe-commits, srhines

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80840
2020-06-01 09:33:08 -07:00
Djordje Todorovic 40a3fcb05c [DebugInfo][CallSites] Remove decl subprograms from 'retainedTypes:'
After the D70350, the retainedTypes: isn't being used for the purpose
of call site debug info for extern calls, so it is safe to delete it
from IR representation.
We are also adding a test to ensure the subprogram isn't stored within
the retainedTypes: from corresponding DICompileUnit.

Differential Revision: https://reviews.llvm.org/D80369
2020-06-01 09:10:05 +02:00
Florian Hahn 6f6e91d193 [Matrix] Implement + and - operators for MatrixType.
This patch implements the + and - binary operators for values of
MatrixType. It adds support for matrix +/- matrix, scalar +/- matrix and
matrix +/- scalar.

For the matrix, matrix case, the types must initially be structurally
equivalent. For the scalar,matrix variants, the element type of the
matrix must match the scalar type.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76793
2020-05-29 20:42:22 +01:00
Arthur Eubanks 1285e8bcac Run Coverage pass before other *San passes under new pass manager, round 2
Summary:
This was attempted once before in https://reviews.llvm.org/D79698, but
was reverted due to the coverage pass running in the wrong part of the
pipeline. This commit puts it in the same place as the other sanitizers.

This changes PassBuilder.OptimizerLastEPCallbacks to work on a
ModulePassManager instead of a FunctionPassManager. That is because
SanitizerCoverage cannot (easily) be split into a module pass and a
function pass like some of the other sanitizers since in its current
implementation it conditionally inserts module constructors based on
whether or not it successfully modified functions.

This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass
manager (last check-msan test).

Currently sanitizers + LTO don't work together under the new pass
manager, so I removed tests that checked that this combination works for
sancov.

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80692
2020-05-28 17:04:47 -07:00
Arthur Eubanks e3fb8446f2 Revert "Run Coverage pass before other *San passes under new pass manager, round 2"
This reverts commit 922fa2fce3.
2020-05-28 14:38:05 -07:00
Arthur Eubanks 922fa2fce3 Run Coverage pass before other *San passes under new pass manager, round 2
Summary:
This was attempted once before in https://reviews.llvm.org/D79698, but
was reverted due to the coverage pass running in the wrong part of the
pipeline. This commit puts it in the same place as the other sanitizers.

This changes PassBuilder.OptimizerLastEPCallbacks to work on a
ModulePassManager instead of a FunctionPassManager. That is because
SanitizerCoverage cannot (easily) be split into a module pass and a
function pass like some of the other sanitizers since in its current
implementation it conditionally inserts module constructors based on
whether or not it successfully modified functions.

This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass
manager (last check-msan test).

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80692
2020-05-28 14:25:23 -07:00
Vitaly Buka 2f430f7a51 [StackSafety] Remove SetMetadata parameter 2020-05-28 13:32:57 -07:00
Sam McCall d283fc4f9d [DebugInfo] Use SplitTemplateClosers (foo<bar<baz> >) in DWARF too
Summary:
D76801 caused some regressions in debuginfo compatibility by changing how
certain functions were named.

For CodeView we try to mirror MSVC exactly: this was fixed in a549c0d004
For DWARF the situation is murkier. Per David Blaikie:
> In general DWARF doesn't specify this at all.
> [...]
> This isn't the only naming divergence between GCC and Clang

Nevertheless, including the space seems to provide better compatibility with
GCC and GDB. E.g. cpexprs.cc in the GDB testsuite requires this formatting.
And there was no particular desire to change the printing of names in debug
info in the first place (just in diagnostics and other more user-facing text).

Fixes PR46052

Reviewers: dblaikie, labath

Subscribers: aprantl, cfe-commits, dyung

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80554
2020-05-28 12:30:38 +02:00
Alok Kumar Sharma d20bf5a725 [DebugInfo] Upgrade DISubrange to support Fortran dynamic arrays
This patch upgrades DISubrange to support fortran requirements.

Summary:
Below are the updates/addition of fields.
lowerBound - Now accepts signed integer or DIVariable or DIExpression,
earlier it accepted only signed integer.
upperBound - This field is now added and accepts signed interger or
DIVariable or DIExpression.
stride - This field is now added and accepts signed interger or
DIVariable or DIExpression.
This is required to describe bounds of array which are known at runtime.

Testing:
unit test cases added (hand-written)
check clang
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D80197
2020-05-28 13:46:41 +05:30
James Y Knight aca3d067ef Fix Darwin 'constinit thread_local' variables.
Unlike other platforms using ItaniumCXXABI, Darwin does not allow the
creation of a thread-wrapper function for a variable in the TU of
users. Because of this, it can set the linkage of the thread-local
symbol to internal, with the assumption that no TUs other than the one
defining the variable will need it.

However, constinit thread_local variables do not require the use of
the thread-wrapper call, so users reference the variable
directly. Thus, it must not be converted to internal, or users will
get a link failure.

This was a regression introduced by the optimization in
00223827a9.

Differential Revision: https://reviews.llvm.org/D80417
2020-05-27 11:59:30 -04:00
Alexey Bataev a888fc6b34 [OPENMP50]Initial support for use_device_addr clause.
Summary:
Added parsing/sema analysis/serialization support for use_device_addr
clauses.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, arphaman, sstefan1, llvm-commits, cfe-commits, caomhin

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80404
2020-05-27 11:35:31 -04:00
serge-sans-paille de02a75e39 [PGO] Fix computation of function Hash
And bump its version number accordingly.

This is a patched recommit of 7c298c104b

Previous hash implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.

Added a new test case that trivially tests that a small function change is
reflected in the hash value.

Not that as this patch fixes the hash computation, it would invalidate all hashes
computed before that patch applies, this is why we bumped the version number.

Update profile data hash entries due to hash function update, except for binary
version, in which case we keep the buggy behavior for backward compatibility.

Differential Revision: https://reviews.llvm.org/D79961
2020-05-27 09:15:21 +02:00
Eric Christopher 97a133f157 Temporarily Revert "[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts"
as it's causing crashes on code generation and https://bugs.llvm.org/show_bug.cgi?id=46084

This reverts commit 98cad555e2.
2020-05-26 18:51:00 -07:00
Adrian Prantl b59b3640bc Debug Info: Mark os_log helper functions as artificial
The os_log helper functions are linkonce_odr and supposed to be
uniqued across TUs, so attachine a DW_AT_decl_line on it is highly
misleading. By setting the function decl to implicit, CGDebugInfo
properly marks the functions as artificial and uses a default file /
line 0 location for the function.

rdar://problem/63450824

Differential Revision: https://reviews.llvm.org/D80463
2020-05-26 09:08:27 -07:00
Lucas Prates 98cad555e2 [Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts
Summary:
During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly
assuming all the pointers from which loads were being generated for vld1
intrinsics were aligned according to the intrinsics result type, causing
alignment faults on the code generated by the backend.

This patch updates vld1 intrinsics' CodeGen to properly capture the
correct load alignment based on the type of the pointer provided as
input for the intrinsic.

Reviewers: t.p.northover, ostannard, pcc

Reviewed By: ostannard

Subscribers: kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79721
2020-05-26 10:09:35 +01:00
Fangrui Song 9d55e4ee13 Make explicit -fno-semantic-interposition (in -fpic mode) infer dso_local
-fno-semantic-interposition is currently the CC1 default. (The opposite
disables some interprocedural optimizations.) However, it does not infer
dso_local: on most targets accesses to ExternalLinkage functions/variables
defined in the current module still need PLT/GOT.

This patch makes explicit -fno-semantic-interposition infer dso_local,
so that PLT/GOT can be eliminated if targets implement local aliases
for AsmPrinter::getSymbolPreferLocal (currently only x86).

Currently we check whether the module flag "SemanticInterposition" is 0.
If yes, infer dso_local. In the future, we can infer dso_local unless
"SemanticInterposition" is 1: frontends other than clang will also
benefit from the optimization if they don't bother setting the flag.
(There will be risks if they do want ELF interposition: they need to set
"SemanticInterposition" to 1.)
2020-05-25 20:48:18 -07:00
Benjamin Kramer 2b8d6fa0ac Revert "[PGO] Fix computation of function Hash"
This reverts commit 7c298c104b.
Fails make check-clang.

Failing Tests (8):
	Clang :: Profile/c-counter-overflows.c
	Clang :: Profile/c-general.c
	Clang :: Profile/c-unprofiled-blocks.c
	Clang :: Profile/cxx-rangefor.cpp
	Clang :: Profile/cxx-throws.cpp
	Clang :: Profile/misexpect-switch-default.c
	Clang :: Profile/misexpect-switch-nonconst.c
	Clang :: Profile/misexpect-switch.c
2020-05-25 20:14:28 +02:00
serge-sans-paille 7c298c104b [PGO] Fix computation of function Hash
Previous implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.

Added a new test case that trivially tests that a small function change is
reflected in the hash value.

Not that as this patch fixes the hash computation, it invalidates all hashes
computed before that patch applies, which could be an issue for large build
system that pre-compute the profile data and let client download them as part of
the build process.

Differential Revision: https://reviews.llvm.org/D79961
2020-05-25 17:17:29 +02:00
ISHIGURO, Hiroshi ac2c5af67f [OPENMP] Fix mixture of omp and clang pragmas
Fixes PR45753

When a program that contains a loop to which both `omp parallel for`
pragma and `clang loop` pragma are associated is compiled with the
-fopenmp option, `clang loop` pragma did not take effect. The example
below should not be vectorized by the `clang loop` pragma but it was
actually vectorized. The cause is that `llvm.loop.vectorize.width`
was not output to the IR when -fopenmp is specified.

The fix attaches attributes if they exist for the loop.

[example.c]

```
int a[100], b[100];
void foo() {
  #pragma omp parallel for
  #pragma clang loop vectorize(disable)
  for (int i = 0; i < 100; i++)
    a[i] += b[i] * i;
}
```

[compile]

```
$ clang -O2 -fopenmp example.c -c -Rpass=vect
example.c:3:11: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
  #pragma omp parallel for
          ^
```

[IR with -fopenmp]

```
$ clang -O2 exmaple.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fopenmp | grep 'vectorize\.width'
```

[IR with -fno-openmp]

```
$ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fno-openmp | grep 'vectorize\.width'
!7 = !{!"llvm.loop.vectorize.width", i32 1}
```

Differential Revision: https://reviews.llvm.org/D79921
2020-05-22 12:53:37 +09:00
Heejin Ahn 48acac3629 [WebAssembly] Warn on exception spec only when Wasm EH is used
Summary:
In D80061 we added warning for exception specifications with types (such
as `throw(int)`), but it was enabled every time the target was wasm,
which means it warned (and ignored) exception specifications even if
wasm EH was not used. This fixes it and we only have the warning when we
enable `-fwasm-exceptions`.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80362
2020-05-21 17:08:35 -07:00
Zequan Wu e36076ee3a [clang] Add nomerge function attribute to clang
Differential Revision: https://reviews.llvm.org/D79121
2020-05-21 17:07:39 -07:00
Zequan Wu b0a0f01bc1 Revert "Add nomerge function attribute to clang"
This reverts commit 307e853954.
2020-05-21 16:13:18 -07:00
Zequan Wu 307e853954 Add nomerge function attribute to clang 2020-05-21 15:28:27 -07:00
Yaxun (Sam) Liu 361e4f14e3 Fix debug info for NoDebug attr
NoDebug attr does not totally eliminate debug info about a function when
inlining is enabled. This is inconsistent with when inlining is disabled.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D79967
2020-05-21 09:02:56 -04:00
Alexey Bataev 414afdf940 [OPENMP]Fix PR45911: Data sharing and lambda capture.
Summary:
No need to generate inlined OpenMP region for variables captured in
lambdas or block decls, only for implicitly captured variables in the
OpenMP region.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79966
2020-05-20 15:01:02 -04:00
Melanie Blower 827be690dc [clang] FastMathFlags.allowContract should be initialized only from FPFeatures.allowFPContractAcrossStatement
Summary: Fix bug introduced in D72841 adding support for pragma float_control

Reviewers: rjmccall, Anastasia

Differential Revision: https://reviews.llvm.org/D79903
2020-05-20 06:19:10 -07:00
Eli Friedman 62f3ef2b53 [CGCall] Annotate references with "align" attribute.
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.

See also D80072.

Differential Revision: https://reviews.llvm.org/D80166
2020-05-19 20:21:30 -07:00
Erich Keane 74ef6a1147 Fix X86_64 complex-returns for regcall.
D35259 introduced a case where complex types of non-long-double would
result in FI.getReturnInfo() to not be initialized properly.  This
resulted in a crash under some very specific circumstances when
dereferencing the LLVMContext.

This patch makes sure that these types have the intended getReturnInfo
initialization.
2020-05-19 13:21:15 -07:00
jasonliu 7f5d91d3ff [clang][AIX] Implement ABIInfo and TargetCodeGenInfo for AIX
Summary:
Created AIXABIInfo and AIXTargetCodeGenInfo for AIX ABI.

Reviewed By: Xiangling_L, ZarkoCA

Differential Revision: https://reviews.llvm.org/D79035
2020-05-19 15:00:48 +00:00
Alexey Bataev 2e499eee58 [OPENMP50]Add initial support for 'affinity' clause.
Summary:
Added parsing/sema/serialization support for affinity clause in task
directives.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, arphaman, llvm-commits, cfe-commits, caomhin

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80148
2020-05-19 08:19:09 -04:00
Heejin Ahn d94bacbcf8 [WebAssembly] Handle exception specifications
Summary:
Wasm currently does not fully handle exception specifications. Rather
than crashing,
- This treats `throw()` in the same way as `noexcept`.
- This ignores and prints a warning for `throw(type, ..)`, for a
  temporary measure. This warning is controlled by
  `-Wwasm-exception-spec`, which is on by default. You can suppress the
  warning by using `-Wno-wasm-exception-spec`.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80061
2020-05-19 01:16:09 -07:00
Martin Böhme 4c09289f63 [clang] Add an API to retrieve implicit constructor arguments.
Summary:
This is needed in Swift for C++ interop -- see here for the corresponding Swift change:

https://github.com/apple/swift/pull/30630

As part of this change, I've had to make some changes to the interface of CGCXXABI to return the additional parameters separately rather than adding them directly to a `CallArgList`.

Reviewers: rjmccall

Reviewed By: rjmccall

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79942
2020-05-19 09:21:26 +02:00
Francesco Petrogalli b593bfd4d8 [clang][SveEmitter] SVE builtins for `svusdot` and `svsudot` ACLE.
Summary:
Intrinsics, guarded by `__ARM_FEATURE_SVE_MATMUL_INT8`:

* svusdot[_s32]
* svusdot[_n_s32]
* svusdot_lane[_s32]
* svsudot[_s32]
* svsudot[_n_s32]
* svsudot_lane[_s32]

Reviewers: sdesmalen, efriedma, david-arm, rengolin

Subscribers: tschuett, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79877
2020-05-18 23:07:23 +00:00
Anastasia Stulova a6a237f204 [OpenCL] Added addrspace_cast operator in C++ mode.
This operator is intended for casting between
pointers to objects in different address spaces
and follows similar logic as const_cast in C++.

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60193
2020-05-18 12:07:54 +01:00
Jim Lin 7ee479a760 [RISCV] Fix passing two floating-point values in complex separately by two GPRs on RV64
Summary:
This patch fixed the error of counting the remaining FPRs. Complex floating-point
values should be passed by two FPRs for the hard-float ABI. If no two FPRs are
available, it should be passed via a 64-bit GPR (fp+fp). `ArgFPRsLeft` is only
decreased one while the type is complex floating-point. It causes two floating-point
values in the complex are passed separately by two GPRs.

Reviewers: asb, luismarques, lenary

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, pzheng, sameer.abuasal, apazos, evandro, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79770
2020-05-18 13:13:22 +08:00
John McCall 32870a84d9 Expose IRGen API to add the default IR attributes to a function definition.
I've also made a stab at imposing some more order on where and how we add
attributes; this part should be NFC.  I wasn't sure whether the CUDA use
case for libdevice should propagate CPU/features attributes, so there's a
bit of unnecessary duplication.
2020-05-16 14:44:54 -04:00
Heejin Ahn 945ad141ce Revert "[WebAssembly] Handle exception specifications"
This reverts commit bca347508c.

This broke clang/test/Misc/warning-flags.c, because the newly added
warning option in this commit didn't have a matching flag.
2020-05-15 21:33:44 -07:00
Heejin Ahn bca347508c [WebAssembly] Handle exception specifications
Summary:
Wasm currently does not fully handle exception specifications. Rather
than crashing, this treats `throw()` in the same way as `noexcept`, and
ignores and prints a warning for `throw(type, ..)`, for a temporary
measure.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79655
2020-05-15 21:03:38 -07:00
Eli Friedman 11aa3707e3 StoreInst should store Align, not MaybeAlign
This is D77454, except for stores.  All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.

Differential Revision: https://reviews.llvm.org/D79968
2020-05-15 12:26:58 -07:00
Nikita Popov f89f7da999 [IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.

In the future, this attribute may be replaced with data layout
properties.

Differential Revision: https://reviews.llvm.org/D78862
2020-05-15 19:41:07 +02:00
Yonghong Song 072cde03aa [Clang][BPF] implement __builtin_btf_type_id() builtin function
Such a builtin function is mostly useful to preserve btf type id
for non-global data. For example,
   extern void foo(..., void *data, int size);
   int test(...) {
     struct t { int a; int b; int c; } d;
     d.a = ...; d.b = ...; d.c = ...;
     foo(..., &d, sizeof(d));
   }

The function "foo" in the above only see raw data and does not
know what type of the data is. In certain cases, e.g., logging,
the additional type information will help pretty print.

This patch implemented a BPF specific builtin
  u32 btf_type_id = __builtin_btf_type_id(param, flag)
which will return a btf type id for the "param".
flag == 0 will indicate a BTF local relocation,
which means btf type_id only adjusted when bpf program BTF changes.
flag == 1 will indicate a BTF remote relocation,
which means btf type_id is adjusted against linux kernel or
future other entities.

Differential Revision: https://reviews.llvm.org/D74668
2020-05-15 09:44:54 -07:00
Leonard Chan e9802aa422 Revert "Run Coverage pass before other *San passes under new pass manager"
This reverts commit 7d5bb94d78.

Reverting since this leads to a linker error we're seeing on Fuchsia.
The underlying issue seems to be that inlining is run after sanitizers
and causes different comdat groups instrumented by Sancov to reference
non-key symbols defined in other comdat groups.

Will re-land this patch after a fix for that is landed.
2020-05-14 15:19:27 -07:00
Alexey Bataev 0363ae97ab [OPENMP50]Codegen for uses_allocators clause.
Summary:
Predefined allocators should not be mapped at all (they are just enumeric
constants). FOr user-defined allocators need to map the traits only as
firstprivates, the allocator itself is private.
At the beginning of the target region the user-defined allocatores must
be created and then destroyed at the end of the target region:
```
omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>,
/*default memhandle*/ 0, <number_of_traits>, &<traits>);
...
call void @__kmpc_destroy_allocator(<gtid>, my_allocator);
```

Reviewers: jdoerfert, aaron.ballman

Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79257
2020-05-14 18:02:12 -04:00
Eli Friedman 428d0b6f77 Fix clang test failures from D77454 2020-05-14 14:10:51 -07:00
Adrian McCarthy a549c0d004 Fix template class debug info for Visual Studio visualizers
An earlier change eliminated spaces between the close brackets of nested
template lists.  Unfortunately that prevents the Windows debuggers from
matching some types to their corresponding visualizers (e.g., std::map).

This selects the SeparateTemplateClosers flag when generating CodeView.
Note that we were already making formatting adjustments under similar
circumstances for similar reasons.

This wasn't caught by existing tests because they were using only
-std=c++98.

Differential Revision: https://reviews.llvm.org/D79274
2020-05-13 14:20:18 -07:00
Sylvain Audi 7a8edcb212 [Clang] Restore replace_path_prefix instead of startswith
In D49466, sys::path::replace_path_prefix was used instead startswith for -f[macro/debug/file]-prefix-map options.
However those were reverted later (commit rG3bb24bf25767ef5bbcef958b484e7a06d8689204) due to broken Windows tests.

This patch restores those replace_path_prefix calls.
It also modifies the prefix matching to be case-insensitive under Windows.

Differential Revision : https://reviews.llvm.org/D76869
2020-05-13 13:49:14 -04:00
Shengchen Kan ad60ff70eb [NFC] Code cleanup in TargetInfo.cpp
Fix the signed/unsigned mismatch issue
2020-05-13 14:48:46 +08:00
Thomas Lively 3d49d1cfa7 [WebAssembly] Implement pseudo-min/max SIMD instructions
Summary:
As proposed in https://github.com/WebAssembly/simd/pull/122. Since
these instructions are not yet merged to the SIMD spec proposal, this
patch makes them entirely opt-in by surfacing them only through LLVM
intrinsics and clang builtins. If these instructions are made
official, these intrinsics and builtins should be replaced with simple
instruction patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79742
2020-05-12 09:39:01 -07:00
Fangrui Song b56b1e67e3 [gcov] Default coverage version to '408*' and delete CC1 option -coverage-exit-block-before-body
gcov 4.8 (r189778) moved the exit block from the last to the second.
The .gcda format is compatible with 4.7 but

* decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the
  exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings,
  and print wrong `" returned %s` for branch statistics (-b).
* decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues.

Also, rename "return block" to "exit block" because the latter is the
appropriate term.
2020-05-12 09:14:03 -07:00
Sander de Smalen d6936be2ef [SveEmitter] Add builtins for svdup and svindex
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79357
2020-05-12 11:02:32 +01:00
Arthur Eubanks 7d5bb94d78 Run Coverage pass before other *San passes under new pass manager
Summary:
This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass manager (final check-msan test!).
Under the old pass manager, the coverage pass would run before the MSan pass. The opposite happened under the new pass manager. The MSan pass adds extra basic blocks, changing the number of coverage callbacks.

Reviewers: vitalybuka, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79698
2020-05-11 12:59:09 -07:00
Florian Hahn 1065869195 [Matrix] Add matrix type to Clang.
This patch adds a matrix type to Clang as described in the draft
specification in clang/docs/MatrixSupport.rst. It introduces a new option
-fenable-matrix, which can be used to enable the matrix support.

The patch adds new MatrixType and DependentSizedMatrixType types along
with the plumbing required. Loads of and stores to pointers to matrix
values are lowered to memory operations on 1-D IR arrays. After loading,
the loaded values are cast to a vector. This ensures matrix values use
the alignment of the element type, instead of LLVM's large vector
alignment.

The operators and builtins described in the draft spec will will be added in
follow-up patches.

Reviewers: martong, rsmith, Bigcheese, anemet, dexonsmith, rjmccall, aaron.ballman

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D72281
2020-05-11 18:55:45 +01:00
Thomas Lively 8e3e56f2a3 [WebAssembly] Add wasm-specific vector shuffle builtin and intrinsic
Summary:

Although using `__builtin_shufflevector` and the `shufflevector`
instruction works fine, they are not opaque to the optimizer. As a
result, DAGCombine can potentially reduce the number of shuffles and
change the shuffle masks. This is unexpected behavior for users of the
WebAssembly SIMD intrinsics who have crafted their shuffles to
optimize the code generated by engines. This patch solves the problem
by adding a new shuffle intrinsic that is opaque to the optimizers in
line with the decision of the WebAssembly SIMD contributors at
https://github.com/WebAssembly/simd/issues/196#issuecomment-622494748. In
the future we may implement custom DAG combines to properly optimize
shuffles and replace this solution.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D66983
2020-05-11 10:01:55 -07:00
Sander de Smalen 4cad97595f [SveEmitter] Add builtins for svmovlb and svmovlt
These builtins are expanded in CGBuiltin to use intrinsics
for (signed/unsigned) shift left long top/bottom.

Reviewers: efriedma, SjoerdMeijer

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79579
2020-05-11 09:41:58 +01:00
Fangrui Song 25544ce2df [gcov] Default coverage version to '407*' and delete CC1 option -coverage-cfg-checksum
Defaulting to -Xclang -coverage-version='407*' makes .gcno/.gcda
compatible with gcov [4.7,8)

In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum.
We can infer the information from the version.

With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7.
We don't need other -Xclang -coverage* options.
There may be a mismatching version warning, though.

(Note, GCC r173147 "split checksum into cfg checksum and line checksum"
 made gcov 4.7 incompatible with previous versions.)
2020-05-10 16:14:07 -07:00
Fangrui Song 13a633b438 [gcov] Delete CC1 option -coverage-no-function-names-in-data
rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION
(this might be part of the reasons the header says
"We emit files in a corrupt version of GCOV's "gcda" file format").

rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data
to work around the issue. (However, the description is wrong.
libgcov never writes function names, even before GCC 4.2).

In reality, the linker command line has to look like:

clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data

Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7
either produce wrong results (for one gcov-4.9 program, I see "No executable lines")
or segfault (gcov-7).
(gcov-8 uses an incompatible format.)

This patch deletes -coverage-no-function-names-in-data and the related
function names support from libclang_rt.profile
2020-05-10 12:37:44 -07:00
Jinsong Ji a72b9dfd45 [sanitizer] Enable whitelist/blacklist in new PM
https://reviews.llvm.org/D63616 added `-fsanitize-coverage-whitelist`
and `-fsanitize-coverage-blacklist` for clang.

However, it was done only for legacy pass manager.
This patch enable it for new pass manager as well.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D79653
2020-05-10 02:34:29 +00:00
Matt Arsenault a881dc1103 Fix typo 2020-05-09 16:00:17 -04:00
Simon Pilgrim 0b9783350b LTO.h - reduce includes to forward declarations. NFC.
Add missing ToolOutputFile.h dependency to BackendUtil.cpp
2020-05-09 15:10:51 +01:00
Matt Arsenault 03cb328d6f clang: Cleanup usage of CreateMemCpy
It handles the the pointee type casts in preparation for opaque
pointers.
2020-05-08 20:57:56 -04:00
Sriraman Tallam e8147ad822 Uniuqe Names for Internal Linkage Symbols.
This is a standalone patch and this would help Propeller do a better job of code
layout as it can accurately attribute the profiles to the right internal linkage
function.

This also helps SampledFDO/AutoFDO correctly associate sampled profiles to the
right internal function. Currently, if there is more than one internal symbol
foo, their profiles are aggregated by SampledFDO.

This patch adds a new clang option, -funique-internal-funcnames, to generate
unique names for functions with internal linkage. This patch appends the md5
hash of the module name to the function symbol as a best effort to generate a
unique name for symbols with internal linkage.

Differential Revision: https://reviews.llvm.org/D73307
2020-05-07 18:18:37 -07:00
Arthur Eubanks 48451ee6a7 [MSan] Pass MSan command line options under new pass manager
Summary:
Properly forward TrackOrigins and Recover user options to the MSan pass under the new pass manager.
This makes the number of check-msan failures when ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER is TRUE go from 52 to 2.

Based on https://reviews.llvm.org/D77249.

Reviewers: nemanjai, vitalybuka, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79445
2020-05-07 08:21:35 -07:00
Alexey Bataev 8026394d3c [OPENMP]Consider 'omp_null_allocator' as a predefined allocator.
Summary:
omp.h header file defines omp_null_allocator as a predefined allocator,
need to consider it also as a predefined allocator.

Reviewers: jdoerfert

Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79186
2020-05-07 10:11:06 -04:00
Sander de Smalen 3cb8b4c193 [SveEmitter] Add builtins for SVE2 Polynomial arithmetic
This patch adds builtins for:
- sveorbt
- sveortb
- svpmul
- svpmullb, svpmullb_pair
- svpmullt, svpmullt_pair

The svpmullb and svpmullt builtins are expressed using the svpmullb_pair
and svpmullt_pair LLVM IR intrinsics, respectively.

Reviewers: SjoerdMeijer, efriedma, rengolin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79480
2020-05-07 11:53:04 +01:00
Akira Hatanaka dc4e25d4f2 [CodeGen][ObjC] Don't try to retain a __unsafe_unretained ARC pointer
passed to __builtin_os_log_format to extend its lifetime to the end of
its enclosing block

Extend only lifetimes of pointers returned by function calls or message
sends instead. In the long term, we should lifetime-extend pointers in
more complex expressions and non-ARC objects (e.g., C++ temporaries)
too.

rdar://problem/61846261
2020-05-06 12:47:17 -07:00
Melanie Blower c355bec749 Add support for #pragma clang fp reassociate(on|off)
Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D78827
2020-05-06 08:05:44 -07:00
Erich Keane 8a1c999c9b Implement _ExtInt ABI for all ABIs in Clang, enable type for ABIs
This is the result of an audit of all of the ABIs in clang to implement
and enable the type for those targets.

Additionally, this finds an issue with integer-promotion passing for a
few platforms when using _ExtInt of < int, so this also corrects that
resulting in signext/zeroext being on a params of those types in some
platforms.

Differential Revisions: https://reviews.llvm.org/D79118
2020-05-06 06:52:18 -07:00
Michael Liao 9142c0b46b [clang][codegen] Hoist parameter attribute setting in function prolog.
Summary:
- If the coerced type is still a pointer, it should be set with proper
  parameter attributes, such as `noalias`, `nonnull`, and etc. Hoist
  that (pointer) parameter attribute setting so that the coerced pointer
  parameter could be marked properly.

Depends on D79394

Reviewers: rjmccall, kerbowa, yaxunl

Subscribers: jvesely, nhaehnle, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79395
2020-05-05 15:31:51 -04:00
Michael Liao 276c8dde0b [clang][codegen] Refactor argument loading in function prolog. NFC.
Summary:
- Skip copying function arguments and unnecessary casting by using them
  directly.

Reviewers: rjmccall, kerbowa, yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79394
2020-05-05 15:31:51 -04:00
Hans Wennborg 55b9b11fea Don't assert about missing profile info in createProfileWeightsForLoop
The compiler shouldn't crash if the profile info is slightly off. We hit
this in Chromium.

Differential revision: https://reviews.llvm.org/D79417
2020-05-05 18:59:00 +02:00
Francesco Petrogalli 4fa13a3dac [clang][OpenMP] Fix getNDSWDS for aarch64.
Summary:
This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS.

The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64.

Reviewers: ABataev, jdoerfert, andwar

Reviewed By: jdoerfert

Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78969
2020-05-05 16:27:20 +00:00
Sander de Smalen 5ba329059f [SveEmitter] Add builtins for svreinterpret
The reinterpret builtins are generated separately because they
need the cross product of all types, 121 functions in total,
which is inconvenient to specify in the arm_sve.td file.

Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78756
2020-05-05 13:04:44 +01:00
Sander de Smalen aed6bd6f42 Reland D78750: [SveEmitter] Add builtins for svdupq and svdupq_lane
Edit: Changed a few CHECK lines into CHECK-DAG lines.

This reverts commit 90f3f62cb0.
2020-05-05 10:42:11 +01:00
Sander de Smalen 90f3f62cb0 Revert "[SveEmitter] Add builtins for svdupq and svdupq_lane"
It seems this patch broke some buildbots, so reverting until I
have had a chance to investigate.

This reverts commit 6b90a6887d.
2020-05-04 21:31:55 +01:00
Sander de Smalen 6b90a6887d [SveEmitter] Add builtins for svdupq and svdupq_lane
* svdupq builtins that duplicate scalars to every quadword of a vector
  are defined using builtins for svld1rq (load and replicate quadword).
* svdupq builtins that duplicate boolean values to fill a predicate vector
  are defined using `svcmpne`.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78750
2020-05-04 20:38:47 +01:00
Zarko cb78376433 Test commit. Modified comment to add a period at the end. 2020-05-04 13:06:21 -04:00
Melanie Blower f5360d4bb3 Reapply "Add support for #pragma float_control" with buildbot fixes
Add support for #pragma float_control

Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D72841

This reverts commit fce82c0ed3.
2020-05-04 05:51:25 -07:00
Thomas Lively e0f52842c8 [WebAssembly] Renumber SIMD opcodes
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79224
2020-05-01 17:20:49 -07:00
Francesco Petrogalli 7585ba208e [clang][OpenMP] Fix mangling of linear parameters.
Summary:
The linear parameter token in the mangling function must be multiplied
by the pointee size in bytes when the parameter is a pointer.

Reviewers: ABataev, andwar, jdoerfert

Subscribers: yaxunl, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78965
2020-05-01 21:19:00 +00:00
Melanie Blower fce82c0ed3 Revert "Reapply "Add support for #pragma float_control" with improvements to"
This reverts commit 69aacaf699.
2020-05-01 10:31:09 -07:00
Melanie Blower 69aacaf699 Reapply "Add support for #pragma float_control" with improvements to
test cases
Add support for #pragma float_control

Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D72841

This reverts commit 85dc033cac, and makes
corrections to the test cases that failed on buildbots.
2020-05-01 10:03:30 -07:00
Alexey Bataev 8c2f4e0e85 [OPENMP50]Codegen for reduction clauses with 'task' modifier.
Summary:
Added codegen for reduction clause with task modifier.
```
  #pragma omp ... reduction(task, +: a)
  {
  #pragma omp ... in_reduction(+: a)
  }
```
is translated into something like this:
```
  #pragma omp ... reduction(+:a)
  {
    struct red_input_t {
      void *reduce_shar;
      void *reduce_orig;
      size_t reduce_size;
      void *reduce_init;
      void *reduce_fini;
      void *reduce_comb;
      unsigned flags;
    } r_var;
    r_var.reduce_shar = &a;
    r_var.reduce_orig = &original a;
    r_var.reduce_size = sizeof(a);
    r_var.reduce_init = [](void* l,void*){return *(int*)l=0;};
    r_var.reduce_fini = nullptr;
    r_var.reduce_comb = [](void* l,void* r){return *(int*)l += *(int)r;};
    void *tg = __kmpc_taskred_modifier_init(<loc_addr>,<gtid>,
      <flag - 0 for parallel, 1 for worksharing>,
      <1 - number of reduction elements>,
      &r_var);
    {
    #pragma omp ... in_reduction(+: a) firstprivate(tg)
    ...
    }
    __kmpc_task_reduction_modifier_fini(<loc_addr>,<gtid>,
      <flag - 0 for parallel, 1 for worksharing>);
  }
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, jfb, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79034
2020-05-01 11:40:27 -04:00
Melanie Blower 85dc033cac Revert "Add support for #pragma float_control"
This reverts commit 4f1e9a17e9.
due to fail on buildbot, sorry for the noise
2020-05-01 06:36:58 -07:00
Melanie Blower 4f1e9a17e9 Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D72841
2020-05-01 06:14:24 -07:00
Alexey Bataev b5be1c5419 [OPENMP50]Basic support for uses_allocators clause.
Summary: Added parsing/sema/serialization supoprt for uses_allocators clause.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, arphaman, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78577
2020-04-30 16:24:36 -04:00
Alexey Bataev b737b814fe [OPENMP]Allow cancellation constructs in target parallel regions.
Summary:
omp cancellation point parallel and omp cancel parallel directives are
allowed in target paralle regions.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, caomhin, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78941
2020-04-30 15:10:52 -04:00
Aaron Smith 4eabd00612 [Windows SEH] Fix abnormal-exits in _try
Summary:
Per Windows SEH Spec, except _leave, all other early exits of a _try (goto/return/continue/break) are considered abnormal exits.  In those cases, the first parameter passes to its _finally funclet should be TRUE to indicate an abnormal-termination.

One way to implement abnormal exits in _try is to invoke Windows runtime _local_unwind() (MSVC approach) that will invoke _dtor funclet where abnormal-termination flag is always TRUE when calling _finally.  Obviously this approach is less optimal and is complicated to implement in Clang.

Clang today has a NormalCleanupDestSlot mechanism to dispatch multiple exits at the end of _try.  Since  _leave (or try-end fall-through) is always Indexed with 0 in that NormalCleanupDestSlot,  this fix takes the advantage of that mechanism and just passes NormalCleanupDest ID as 1st Arg to _finally.

Reviewers: rnk, eli.friedman, JosephTremoulet, asmith, efriedma

Reviewed By: efriedma

Subscribers: efriedma, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77936
2020-04-30 09:38:19 -07:00
jasonliu e0c356582d [NFC][clang] Replace raw new/delete with unique_ptr to store ABIInfo in TargetCodeGenInfo
Use unique_ptr to manage the lifetime of ABIInfo member inside TargetCodeGenInfo.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D79033
2020-04-30 12:31:50 +00:00
Erich Keane 5a1d9c0f5a Fix x86/x86_64 calling convention for _ExtInt
After speaking with Craig Topper about some recent defects, he pointed
out that _ExtInts should be passed indirectly if larger than the largest
int register, and like ints when smaller than that.  This patch
implements that.

Note that this changed the way vaargs worked quite a bit, but they still
work.

Differential Revision: https://reviews.llvm.org/D78785
2020-04-29 11:04:25 -07:00
Sander de Smalen a4dac6d4e0 [SveEmitter] Add builtins for svmov_b and svnot_b.
These are custom expanded in CGBuiltin:

  svmov_b_z(pg, op) <=> svand_b_z(pg, op, op)
  svnot_b_z(pg, op) <=> sveor_b_z(pg, op, pg)

Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79039
2020-04-29 13:33:18 +01:00
Simon Pilgrim db97a12454 Fix Wparentheses gcc warning. NFC.
It should be either a float(32) or an int(32).
2020-04-29 12:21:05 +01:00
Sander de Smalen 42a56bf63f [SveEmitter] Add builtins for gather prefetches
Patch by Andrzej Warzynski

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78677
2020-04-29 11:52:49 +01:00
David Blaikie 9b77242c9a CodeGenTypes::CGRecordLayouts: Use unique_ptr to simplify memory management 2020-04-28 22:31:16 -07:00
Christopher Tetreault ef3678cfee [SVE] Update EmitSVEPredicateCast to take a ScalableVectorType
Summary:
Removes usage of VectorType::getNumElements identified by test located
at CodeGen/aarch64-sve-intrinsics/acle_sve_abs.c. Since the type is an
SVE predicate vector, it makes sense to specialize the code for scalable
vectors only.

Reviewers: rengolin, efriedma

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78958
2020-04-28 11:22:20 -07:00
Momchil Velikov 102b4105e3 [CMSE] Clear padding bits of struct/unions/fp16 passed by value
When passing a value of a struct/union type from secure to non-secure
state (that is returning from a CMSE entry function or passing an
argument to CMSE-non-secure call), there is a potential sensitive
information leak via the padding bits in the structure. It is not
possible in the general case to ensure those bits are cleared by using
Standard C/C++.

This patch makes the compiler emit code to clear such padding
bits. Since type information is lost in LLVM IR, the code generation
is done by Clang.

For each interesting record type, we build a bitmask, in which all the
bits, corresponding to user declared members, are set. Values of
record types are returned by coercing them to an integer. After the
coercion, the coerced value is masked (with bitwise AND) and then
returned by the function. In a similar manner, values of record types
are passed as arguments by coercing them to an array of integers, and
the coerced values themselves are masked.

For union types, we effectively clear only bits, which aren't part of
any member, since we don't know which is the currently active one.
The compiler will issue a warning, whenever a union is passed to
non-secure state.

Values of half-precision floating-point types are passed in the least
significant bits of a 32-bit register (GPR or FPR) with the most
significant bits unspecified. Since this is also a potential leak of
sensitive information, this patch also clears those unspecified bits.

Differential Revision: https://reviews.llvm.org/D76369
2020-04-28 17:05:58 +01:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Christopher Tetreault da8918f27e [SVE][NFC] Use ScalableVectorType in CGBuiltin
Summary: * Upgrade some usages of VectorType to use ScalableVectorType

Reviewers: efriedma, david-arm, fpetrogalli, kmclaughlin

Reviewed By: efriedma

Subscribers: tschuett, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78842
2020-04-27 16:29:45 -07:00
Sander de Smalen e4872d7f08 [SveEmitter] Add builtins for svlen
The svlen builtins return the number of elements in a vector
and are implemented using `llvm.vscale`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78755
2020-04-27 21:27:32 +01:00
Matt Arsenault 5c03beefa7 clang: Allow backend unsupported warnings
Currently this asserts on anything other than errors. In one
workaround scenario, AMDGPU emits DiagnosticInfoUnsupported as a
warning for functions that can't be correctly codegened, but should
never be executed.
2020-04-27 12:14:51 -04:00
Sander de Smalen 03f419f3eb [SveEmitter] IsInsertOp1SVALL and builtins for svqdec[bhwd] and svqinc[bhwd]
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.

This patch adds the flag IsInsertOp1SVALL to insert SV_ALL as the
second operand.

Reviewers: efriedma, SjoerdMeijer

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78401
2020-04-27 11:45:10 +01:00
Saiyedul Islam 06bdffb2bb [AMDGPU] Expose llvm fence instruction as clang intrinsic
Expose llvm fence instruction as clang builtin for AMDGPU target

__builtin_amdgcn_fence(unsigned int memoryOrdering, const char *syncScope)

The first argument of this builtin is one of the memory-ordering specifiers
__ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, or __ATOMIC_SEQ_CST
following C++11 memory model semantics. This is mapped to corresponding
LLVM atomic memory ordering for the fence instruction using LLVM atomic C
ABI. The second argument is an AMDGPU-specific synchronization scope
defined as string.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D75917
2020-04-27 09:39:03 +05:30
Sander de Smalen 3817ca7dbf [SveEmitter] Add IsAppendSVALL and builtins for svptrue and svcnt[bhwd]
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.

This patch adds the flag IsAppendSVALL to append SV_ALL as the final
operand.

Reviewers: SjoerdMeijer, efriedma, rovka, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77597
2020-04-26 12:44:26 +01:00
Craig Topper 0ed5b0d517 [X86] Don't use types when getting the intrinsic declaration for x86_avx512_mask_vcvtph2ps_512.
This intrinsic isn't overloaded so we should query with types.
Doing so causes the backend to miss the intrinsic and not codegen it.
This eventually leads to a linker error.
2020-04-24 11:01:22 -07:00
Luke Geeson 7da1905125 [AArch32] Armv8.6-a Matrix Mult Assembly + Intrinsics
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch32
- Intrinsics Support for AArch32 Neon Intrinsics for Matrix
  Multiplication

Note: these extensions are optional in the 8.6a architecture and so have
to be enabled by default

No additional IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: t.p.northover, miyuki

Reviewed By: miyuki

Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss,
cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77872
2020-04-24 15:54:06 +01:00
Luke Geeson 832cd74913 [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)

No IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin

Reviewed By: kmclaughlin

Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77871
2020-04-24 15:54:06 +01:00
Alexey Bataev e9bfa1dd38 [OPENMP]Use new interface for task reduction.
Summary:
Patch forces codegen to use the new runtime functions for task reductions where
the issue with passing the address of the original variables to the UDR
initializers is fixed. Also, this patch is required for upcoming
support of task modifier inreduction clause.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78733
2020-04-24 09:41:48 -04:00
Sander de Smalen 0ddb2034c1 [SveEmitter] Add builtins for compares and ReverseCompare flag.
The IsReverseCompare flag tells CGBuiltin to swap the operands,
so that a LT/LE intrinsics can be expressed in terms of GE/GT
intrinsics.

This patch also adds builtins for the wide-variants of the compares.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78747
2020-04-24 14:33:47 +01:00
Sander de Smalen 823e2a670a [SveEmitter] Add builtins for contiguous prefetches
This patch also adds the enum `sv_prfop` for the prefetch operation specifier
and checks to ensure the passed enum values are valid.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78674
2020-04-24 11:35:59 +01:00
serge-sans-paille 8f766e382b Update compiler extension integration into the build system
The approach here is to create a new (empty) component, `Extensions', where all
statically compiled extensions dynamically register their dependencies. That way
we're more natively compatible with LLVMBuild and llvm-config.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=44870

Differential Revision: https://reviews.llvm.org/D78192
2020-04-24 09:40:14 +02:00
Puyan Lotfi 9721fbf85b [NFC] Refactoring PropertyAttributeKind for ObjCPropertyDecl and ObjCDeclSpec.
This is a code clean up of the PropertyAttributeKind and
ObjCPropertyAttributeKind enums in ObjCPropertyDecl and ObjCDeclSpec that are
exactly identical. This non-functional change consolidates these enums
into one. The changes are to many files across clang (and comments in LLVM) so
that everything refers to the new consolidated enum in DeclObjCCommon.h.

2nd Landing Attempt...

Differential Revision: https://reviews.llvm.org/D77233
2020-04-23 17:21:25 -04:00
Sander de Smalen 7003a1da37 [SveEmitter] Use llvm.aarch64.sve.ld1/st1 for contiguous load/store builtins
This patch changes the codegen of the builtins for contiguous loads
to map onto the SVE specific IR intrinsics llvm.aarch64.sve.ld1/st1.

Reviewers: SjoerdMeijer, efriedma, kmclaughlin, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78673
2020-04-23 15:15:41 +01:00
Sander de Smalen a5e0389b2a [AArch64] Define ACLE FP conversion intrinsics with more specific predicate.
This patch changes the FP conversion intrinsics to take a predicate
that matches the number of lanes for the vector with the widest element
type as opposed to using <vscale x 16 x i1>.

For example:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f16(<vscale x 4 x float>, <vscale x 4 x i1>, <vscale x 8 x half>)```
now uses <vscale x 4 x i1> instead of <vscale x 16 x i1>

And similar for:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f64(<vscale x 4 x float>, <vscale x 2 x i1>, <vscale x 2 x double>)```
where the predicate now matches the wider type, so <vscale x 2 x i1>.

Reviewers: efriedma, SjoerdMeijer, paulwalker-arm, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78402
2020-04-23 10:53:23 +01:00
Sander de Smalen 002164461b [SveEmitter] Add builtins for FP conversions
This adds the flag IsOverloadCvt which tells CGBulitin to use
the result type and the type of the last operand as the
overloaded types for the LLVM IR intrinsic.

This also adds the flag IsFPConvert, which is needed to avoid
converting the predicate of the operation from svbool_t to
a predicate with fewer lanes, as the LLVM IR intrinsics use
the <vscale x 16 x i1> as the predicate.

Reviewers: SjoerdMeijer, efriedma

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78239
2020-04-23 10:49:06 +01:00
Puyan Lotfi bbf386f02b Revert "[NFC] Refactoring PropertyAttributeKind for ObjCPropertyDecl and ObjCDeclSpec."
This reverts commit 2aa044ed08.

Reverting due to bot failure in lldb.
2020-04-23 00:05:08 -04:00
Puyan Lotfi 2aa044ed08 [NFC] Refactoring PropertyAttributeKind for ObjCPropertyDecl and ObjCDeclSpec.
This is a code clean up of the PropertyAttributeKind and
ObjCPropertyAttributeKind enums in ObjCPropertyDecl and ObjCDeclSpec that are
exactly identical. This non-functional change consolidates these enums
into one. The changes are to many files across clang (and comments in LLVM) so
that everything refers to the new consolidated enum in DeclObjCCommon.h.

Differential Revision: https://reviews.llvm.org/D77233
2020-04-22 23:27:06 -04:00
Sander de Smalen 2d1baf606a [SveEmitter] Add builtins for svwhilerw/svwhilewr
This also adds the IsOverloadWhileRW flag which tells CGBuiltin to use
the result predicate type and the first pointer type as the
overloaded types for the LLVM IR intrinsic.

Reviewers: SjoerdMeijer, efriedma

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78238
2020-04-22 21:49:18 +01:00
Sander de Smalen 1559485e60 [SveEmitter] Add builtins for svwhile
This also adds the IsOverloadWhile flag which tells CGBuiltin to use
both the default type (predicate) and the type of the second operand
(scalar) as the overloaded types for the LLMV IR intrinsic.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77595
2020-04-22 21:47:47 +01:00
Sander de Smalen 662cbaf647 [SveEmitter] Add IsOverloadNone flag and builtins for svpfalse and svcnt[bhwd]_pat
Add the IsOverloadNone flag to tell CGBuiltin that it does not have
an overloaded type. This is used for e.g. svpfalse which does
not take any arguments and always returns a svbool_t.

This patch also adds builtins for svcntb_pat, svcnth_pat, svcntw_pat
and svcntd_pat, as those don't require custom codegen.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77596
2020-04-22 16:42:08 +01:00
Sander de Smalen 41d52662d5 [SveEmitter] Add support for _n form builtins
The ACLE has builtins that take a scalar value that is to be expanded
into a vector by the operation. While the ISA may have an instruction
that takes an immediate or a scalar to represent this, the LLVM IR
intrinsic may not, so Clang will have to splat the scalar value.

This patch also adds the _n forms for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77594
2020-04-22 14:23:54 +01:00
Andrzej Warzynski 72f565899d [SveEmitter] Implement builtins for gathers/scatters
This patch adds builtins for:
  * regular, first-faulting and non-temporal gather loads
  * regular and non-temporal scatter stores

Differential Revision: https://reviews.llvm.org/D77735
2020-04-22 13:21:39 +01:00
Justin Hibbits 4ca2cad947 [PowerPC] Add clang -msvr4-struct-return for 32-bit ELF
Summary:

Change the default ABI to be compatible with GCC.  For 32-bit ELF
targets other than Linux, Clang now returns small structs in registers
r3/r4.  This affects FreeBSD, NetBSD, OpenBSD.  There is no change for
32-bit Linux, where Clang continues to return all structs in memory.

Add clang options -maix-struct-return (to return structs in memory) and
-msvr4-struct-return (to return structs in registers) to be compatible
with gcc.  These options are only for PPC32; reject them on PPC64 and
other targets.  The options are like -fpcc-struct-return and
-freg-struct-return for X86_32, and use similar code.

To actually return a struct in registers, coerce it to an integer of the
same size.  LLVM may optimize the code to remove unnecessary accesses to
memory, and will return i32 in r3 or i64 in r3:r4.

Fixes PR#40736

Patch by George Koehler!

Reviewed By: jhibbits, nemanjai
Differential Revision: https://reviews.llvm.org/D73290
2020-04-21 20:17:25 -05:00
Aaron Ballman 6a30894391 C++2a -> C++20 in some identifiers; NFC. 2020-04-21 15:37:19 -04:00
Craig Topper 68b2e507e4 [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
2020-04-20 21:31:44 -07:00
Sander de Smalen 06c980df46 [SveEmitter] Implement zeroing of false lanes
This implements zeroing of false lanes for binary operations,
where instead of merging into the first operand vector (_m)
a `select` is placed on the first input vector. This approach
easily translates to the use of the `zeroing movprfx` instruction.

This patch also adds builtins for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77593
2020-04-20 17:02:48 +01:00
Sander de Smalen 9986b3de26 [SveEmitter] Explicitly merge with zero/undef
Builtins that have the merge type MergeAnyExp or MergeZeroExp,
merge into a 'undef' or 'zero' vector respectively, which enables the
_x and _z behaviour for unary operations.

This patch also adds builtins for svabs and svneg.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77591
2020-04-20 16:26:20 +01:00
Sander de Smalen 515020c091 [SveEmitter] Add more immediate operand checks.
This patch adds a number of intrinsics that take immediates with
varying ranges based on the element size one of the operands.

    svext:   immediate ranging 0 to (2048/sizeinbits(elt) - 1)
    svasrd:  immediate ranging 1..sizeinbits(elt)
    svqshlu: immediate ranging 1..sizeinbits(elt)/2
    ftmad:   immediate ranging 0..(sizeinbits(elt) - 1)

Reviewers: efriedma, SjoerdMeijer, rovka, rengolin

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76679
2020-04-20 14:41:58 +01:00
Christopher Tetreault c858debebc Remove asserting getters from base Type
Summary:
Remove asserting vector getters from Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: dexonsmith, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: cfe-commits, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D77278
2020-04-17 14:03:31 -07:00
Erich Keane 5f0903e9be Reland Implement _ExtInt as an extended int type specifier.
I fixed the LLDB issue, so re-applying the patch.

This reverts commit a4b88c0449.
2020-04-17 10:45:48 -07:00
Sterling Augustine a4b88c0449 Revert "Implement _ExtInt as an extended int type specifier."
This reverts commit 61ba1481e2.

I'm reverting this because it breaks the lldb build with
incomplete switch coverage warnings. I would fix it forward,
but am not familiar enough with lldb to determine the correct
fix.

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
2020-04-17 10:29:40 -07:00
Benjamin Kramer b639091c02 Change users of CreateShuffleVector to pass the masks as int instead of Constants
No functionality change intended.
2020-04-17 16:34:29 +02:00
Erich Keane 61ba1481e2 Implement _ExtInt as an extended int type specifier.
Introduction/Motivation:
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax.
Integers of non-power-of-two aren't particularly interesting or useful
on most hardware, so much so that no language in Clang has been
motivated to expose it before.

However, in the case of FPGA hardware normal integer types where the
full bitwidth isn't used, is extremely wasteful and has severe
performance/space concerns.  Because of this, Intel has introduced this
functionality in the High Level Synthesis compiler[0]
under the name "Arbitrary Precision Integer" (ap_int for short). This
has been extremely useful and effective for our users, permitting them
to optimize their storage and operation space on an architecture where
both can be extremely expensive.

We are proposing upstreaming a more palatable version of this to the
community, in the form of this proposal and accompanying patch.  We are
proposing the syntax _ExtInt(N).  We intend to propose this to the WG14
committee[1], and the underscore-capital seems like the active direction
for a WG14 paper's acceptance.  An alternative that Richard Smith
suggested on the initial review was __int(N), however we believe that
is much less acceptable by WG14.  We considered _Int, however _Int is
used as an identifier in libstdc++ and there is no good way to fall
back to an identifier (since _Int(5) is indistinguishable from an
unnamed initializer of a template type named _Int).

[0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html)
[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf

Differential Revision: https://reviews.llvm.org/D73967
2020-04-17 07:10:57 -07:00
George Burgess IV 94908088a8 [CodeGen] fix inline builtin-related breakage from D78162
In cases where we have multiple decls of an inline builtin, we may need
to go hunting for the one with a definition when setting function
attributes.

An additional test-case was provided on
https://github.com/ClangBuiltLinux/linux/issues/979
2020-04-16 11:54:10 -07:00
Ehud Katz 03a9526fe5 [CGExprAgg] Fix infinite loop in `findPeephole`
Simplify the function using IgnoreParenNoopCasts.

Fix PR45476

Differential Revision: https://reviews.llvm.org/D78098
2020-04-16 13:26:23 +03:00
Benjamin Kramer 3ee1ec0b9d LangOptions cannot depend on ASTContext, make it not use ASTContext directly
Fixes a layering violation introduced in 2ba4e3a459.
2020-04-16 11:46:35 +02:00
Ayke van Laethem 215dc2e203
[AVR] Use the correct address space for non-prototyped function calls
Some function declarations like this:

    void foo();

do not have a type declaration, for that you'd use:

    void foo(void);

Clang internally bitcasts the variadic function declaration to a
function pointer, but doesn't use the correct address space on AVR. This
commit fixes that.

This fix is necessary to let Clang compile compiler-rt for AVR.

Differential Revision: https://reviews.llvm.org/D78125
2020-04-15 23:44:51 +02:00
Melanie Blower 2ba4e3a459 Move BinaryOperators.FPOptions to trailing storage
Reviewers: rjmccall

Differential Revision: https://reviews.llvm.org/D76384
2020-04-15 12:57:31 -07:00
Richard Smith bab6df86ae Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented.
Summary:
Previously, we treated CXXUuidofExpr as quite a special case: it was the
only kind of expression that could be a canonical template argument, it
could be a constant lvalue base object, and so on. In addition, we
represented the UUID value as a string, whose source form we did not
preserve faithfully, and that we partially parsed in multiple different
places.

With this patch, we create an MSGuidDecl object to represent the
implicit object of type 'struct _GUID' created by a UuidAttr. Each
UuidAttr holds a pointer to its 'struct _GUID' and its original
(as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves
like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue
representation of the GUID on the MSGuidDecl and use it from constant
evaluation where needed.

This allows removing a lot of the special-case logic to handle these
expressions. Unfortunately, many parts of Clang assume there are only
a couple of interesting kinds of ValueDecl, so the total amount of
special-case logic is not really reduced very much.

This fixes a few bugs and issues:
 * PR38490: we now support reading from GUID objects returned from
   __uuidof during constant evaluation.
 * Our Itanium mangling for a non-instantiation-dependent template
   argument involving __uuidof no longer depends on which CXXUuidofExpr
   template argument we happened to see first.
 * We now predeclare ::_GUID, and permit use of __uuidof without
   any header inclusion, better matching MSVC's behavior. We do not
   predefine ::__s_GUID, though; that seems like a step too far.
 * Our IR representation for GUID constants now uses the correct IR type
   wherever possible. We will still fall back to using the
      {i32, i16, i16, [8 x i8]}
   layout if a definition of struct _GUID is not available. This is not
   ideal: in principle the two layouts could have different padding.

Reviewers: rnk, jdoerfert

Subscribers: arphaman, cfe-commits, aeubanks

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78171
2020-04-15 12:20:42 -07:00
George Burgess IV 2dd17ff081 [CodeGen] only add nobuiltin to inline builtins if we'll emit them
There are some inline builtin definitions that we can't emit
(isTriviallyRecursive & callers go into why). Marking these
nobuiltin is only useful if we actually emit the body, so don't mark
these as such unless we _do_ plan on emitting that.

This suboptimality was encountered in Linux (see some discussion on
D71082, and https://github.com/ClangBuiltLinux/linux/issues/979).

Differential Revision: https://reviews.llvm.org/D78162
2020-04-15 11:05:22 -07:00
Benjamin Kramer 316b49d373 Pass shufflevector indices as int instead of unsigned.
No functionality change intended.
2020-04-15 15:52:49 +02:00
Benjamin Kramer 6f64daca8f Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints
No functionality change intended.
2020-04-15 12:51:38 +02:00
George Burgess IV 91c8c74180 [CodeGen] clarify a comment; NFC
Prompted by discussion on https://reviews.llvm.org/D78148.
2020-04-14 14:33:01 -07:00
Joerg Sonnenberger 9d2d6e71f0 Emit Objective-C constructors as writable
They end up as .init_array sections and those need to be writable,
otherwise bad merging will happen.
2020-04-14 22:32:34 +02:00
Christopher Tetreault 670f2f694b [SVE] Remove calls to getBitWidth from clang
Reviewers: efriedma

Reviewed By: efriedma

Subscribers: tschuett, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77903
2020-04-14 13:29:09 -07:00
Sander de Smalen c8a5b30bac [SveEmitter] Add range checks for immediates and predicate patterns.
Summary:
This patch adds a mechanism to easily add range checks for a builtin's
immediate operands. This patch is tested with the qdech intrinsic, which takes
both an enum for the predicate pattern, as well as an immediate for the
multiplier.

Reviewers: efriedma, SjoerdMeijer, rovka

Reviewed By: efriedma, SjoerdMeijer

Subscribers: mgorny, tschuett, mgrang, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76678
2020-04-14 16:49:32 +01:00
Sander de Smalen 17a68c61a9 [SveEmitter] Implement builtins for contiguous loads/stores
This adds builtins for all contiguous loads/stores, including
non-temporal, first-faulting and non-faulting.

Reviewers: efriedma, SjoerdMeijer

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76238
2020-04-14 15:24:57 +01:00
Georgii Rymar 1647ff6e27 [ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers.
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.

Differential revision: https://reviews.llvm.org/D78016
2020-04-14 14:11:02 +03:00
Ayke van Laethem cfc002714a
[AVR] Support aliases in non-zero address space
This fixes code like the following on AVR:

void foo(void) {
}
void bar(void) __attribute__((alias("foo")));

Code like this is present in compiler-rt, which I'm trying to build.

Differential Revision: https://reviews.llvm.org/D76182
2020-04-14 00:42:19 +02:00
Christopher Tetreault f22fbe3a15 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sdesmalen, efriedma, krememek

Reviewed By: sdesmalen, efriedma

Subscribers: dexonsmith, Charusso, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77257
2020-04-13 13:01:40 -07:00
Mehdi Amini ed03d9485e Revert "[TLI] Per-function fveclib for math library used for vectorization"
This reverts commit 60c642e74b.

This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.

Differential Revision: https://reviews.llvm.org/D77925
2020-04-11 01:05:01 +00:00
Matt Morehouse bef187c750 Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang
Summary:
This commit adds two command-line options to clang.
These options let the user decide which functions will receive SanitizerCoverage instrumentation.
This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing.

Patch by Yannis Juglaret of DGA-MI, Rennes, France

libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy.

Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions.

The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`.

Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists:

```
  # (a)
  src:*
  fun:*

  # (b)
  src:SRC/*
  fun:*

  # (c)
  src:SRC/src/woff2_dec.cc
  fun:*
```

Running the built fuzzers shows how many instrumentation points the compiler adds, the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist, it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumented instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points.

For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s.

We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction.

The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise.

Reviewers: kcc, morehouse, vitalybuka

Reviewed By: morehouse, vitalybuka

Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D63616
2020-04-10 10:44:03 -07:00
Kevin P. Neal 7f38812d5b [FPEnv][AArch64] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that.

Neon is part of this patch, so ARM is affected as well.

Differential Revision: https://reviews.llvm.org/D77074
2020-04-10 13:02:00 -04:00
Wenlei He 60c642e74b [TLI] Per-function fveclib for math library used for vectorization
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.

Subscribers: hiraditya, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77632
2020-04-09 18:26:38 -07:00
Pratyai Mazumder ced398fdc8 [SanitizerCoverage] Add -fsanitize-coverage=inline-bool-flag
Reviewers: kcc, vitalybuka

Reviewed By: vitalybuka

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77637
2020-04-09 02:40:55 -07:00
Serge Pavlov c7ff5b38f2 [FPEnv] Use single enum to represent rounding mode
Now compiler defines 5 sets of constants to represent rounding mode.
These are:

1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.

2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.

3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.

4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.

5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.

The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.

This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.

Differential Revision: https://reviews.llvm.org/D77379
2020-04-09 13:26:47 +07:00
Erich Keane 30588a7395 Make target features check work with ctor and dtor-
The problem was reported in PR45468, applying target features to an
always_inline constructor/destructor runs afoul of GlobalDecl
construction assert when checking for target-feature compatibility.

The core problem is fixed by using the version of the check that takes a
FunctionDecl rather than the GlobalDecl. However, while writing the
test, I discovered that source locations weren't properly set for this
check on ctors/dtors. This patch also fixes constructors and CALLED destructors.

Unfortunately, it doesn't seem too possible to get a meaningful source
location for a 'cleanup' destructor, so those are still 'frontend' level
errors unfortunately. A fixme was added to the test to cover that
situation.
2020-04-08 13:19:55 -07:00
Raul Tambre 878d96011a [clang][CodeGen] Handle throw expression in conditional operator constant folding
Summary:
We're smart and do constant folding when emitting conditional operators.
Thus we emit the live value as a lvalue. This doesn't work if the live value is a throw expression.
Handle this by emitting the throw and returning the dead value as the lvalue.

Fixes PR28184.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77502
2020-04-08 12:32:21 -07:00
Artem Belevich a9627b7ea7 [CUDA] Add partial support for recent CUDA versions.
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.

Differential Revision: https://reviews.llvm.org/D77670
2020-04-08 11:19:44 -07:00
Bevin Hansson 313461f6d8 [CodeGen] Emit IR for compound assignment with fixed-point operands.
Reviewers: rjmccall, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73184
2020-04-08 14:33:04 +02:00
Bevin Hansson 39baaabf6d [CodeGen] Emit IR for fixed-point unary operators.
Reviewers: rjmccall, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73183
2020-04-08 14:33:04 +02:00
Bevin Hansson 0b9922e67a [CodeGen] Emit IR for fixed-point multiplication and division.
Reviewers: rjmccall, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73182
2020-04-08 14:33:04 +02:00
Alexey Bataev be99c61588 [OPENMP50]Codegen for iterator construct.
Implemented codegen for the iterator expression in the depend clauses.
Iterator construct is emitted the following way:
iterator(cnt1, cnt2, ...), in : <dep>

<TotalNumDeps> = <cnt1_size> * <cnt2_size> * ...;
kmp_depend_t deps[<TotalNumDeps>];
deps_counter = 0;
for (cnt1) {
  for (cnt2) {
    ...
    deps[deps_counter].base_addr = &<dep>;
    deps[deps_counter].size = sizeof(<dep>);
    deps[deps_counter].flags = in;
    deps_counter += 1;
    ...
  }
}

For depobj construct the codegen is very similar, but the memory is
allocated dynamically and added extra first item reserved for internal use.
2020-04-07 15:26:00 -04:00
Amy Huang bcf66084ed [DebugInfo] Fix for adding "returns cxx udt" option to functions in CodeView.
Summary:
This change adds DIFlagNonTrivial to forward declarations of
DICompositeType. It adds the flag to nontrivial types and types with
unknown triviality.

It fixes adding the "CxxReturnUdt" flag to functions inconsistently,
since it is added based on whether the return type is marked NonTrivial, and
that changes if the return type was a forward declaration.

continues the discussion at https://reviews.llvm.org/D75215

Bug: https://bugs.llvm.org/show_bug.cgi?id=44785

Reviewers: rnk, dblaikie, aprantl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77436
2020-04-07 09:10:27 -07:00
Michael Liao c97be2c377 [hip] Remove `hip_pinned_shadow`.
Summary:
- Use `device_builtin_surface` and `device_builtin_texture` for
  surface/texture reference support. So far, both the host and device
  use the same reference type, which could be revised later when
  interface/implementation is stablized.

Reviewers: yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77583
2020-04-07 09:51:49 -04:00
Florian Hahn 338be9c595 [Clang] Add llvm.loop.unroll.disable to loops with -fno-unroll-loops.
Currently Clang does not respect -fno-unroll-loops during LTO. During
D76916 it was suggested to respect -fno-unroll-loops on a TU basis.

This patch uses the existing llvm.loop.unroll.disable metadata to
disable loop unrolling explicitly for each loop in the TU if
unrolling is disabled. This should ensure that loops from TUs compiled
with -fno-unroll-loops are skipped by the unroller during LTO.

This also means that if a loop from a TU with -fno-unroll-loops
gets inlined into a TU without this option, the loop won't be
unrolled.

Due to the fact that some transforms might drop loop metadata, there
potentially are cases in which we still unroll loops from TUs with
-fno-unroll-loops. I think we should fix those issues rather than
introducing a function attribute to disable loop unrolling during LTO.
Improving the metadata handling will benefit other use cases, like
various loop pragmas, too. And it is an improvement to clang completely
ignoring -fno-unroll-loops during LTO.

If that direction looks good, we can use a similar approach to also
respect -fno-vectorize during LTO, at least for LoopVectorize.

In the future, this might also allow us to remove the UnrollLoops option
LLVM's PassManagerBuilder.

Reviewers: Meinersbur, hfinkel, dexonsmith, tejohnson

Reviewed By: Meinersbur, tejohnson

Differential Revision: https://reviews.llvm.org/D77058
2020-04-07 14:01:55 +01:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Erik Pilkington d33c7de8e1 [CodeGenObjC] Fix a crash when attempting to copy a zero-sized bit-field in a non-trivial C struct
Zero sized bit-fields aren't included in the CGRecordLayout, so we shouldn't be
calling EmitLValueForField for them. rdar://60695105

Differential revision: https://reviews.llvm.org/D76782
2020-04-06 16:04:13 -04:00
Reid Kleckner b36c19bc4f [AST] Remove DeclCXX.h dep on ASTContext.h
Saves only 36 includes of ASTContext.h and related headers.

There are two deps on ASTContext.h:
- C++ method overrides iterator types (TinyPtrVector)
- getting LangOptions

For #1, duplicate the iterator type, which is
TinyPtrVector<>::const_iterator.

For #2, add an out-of-line accessor to get the language options. Getting
the ASTContext from a Decl is already an out of line method that loops
over the parent DeclContexts, so if it is ever performance critical, the
proper fix is to pass the context (or LangOpts) into the predicate in
question.

Other changes are just header fixups.
2020-04-06 10:09:01 -07:00
Amy Huang 11a04a64aa [DebugInfo] Change to constructor homing debug info mode: skip literal types
Summary:
In constructor type homing mode sometimes complete debug info for constexpr
types was missing, because there was not a constructor emitted. This change
makes constructor type homing ignore constexpr types.

Reviewers: rnk, dblaikie

Subscribers: aprantl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77432
2020-04-06 09:52:53 -07:00
Alexey Bataev 1c92448656 [OPENMP]Fix PR45439: `omp for collapse(2) ordered(2)` generates invalid
IR.

Fixed a crash because of the not quite correct casting of the value of
iterations.
2020-04-06 12:07:43 -04:00
Johannes Doerfert 419a559c5a [OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D77112
2020-04-05 22:30:29 -05:00
David Blaikie e9644e6f4f DebugInfo: Fix default template parameter computation for dependent non-type template parameters
This addresses the immediate bug, though in theory we could still
produce a default parameter for the DWARF in this test case - but other
cases will be definitely unachievable (you could have a default
parameter that cannot be evaluated - so long as the user overrode it
with another value rather than relying on that default)
2020-04-05 16:31:30 -07:00
Eli Friedman 83fa811e5b [clang][opaque pointers] Fix up a bunch of "getType()->getElementType()"
In contexts where we know an LLVM type is a pointer, there's generally
some simpler way to get the pointee type.
2020-04-03 18:00:33 -07:00
Eli Friedman b11decc221 [clang codegen][opaque pointers] Remove use of deprecated constructor
(See also D76269.)
2020-04-03 18:00:33 -07:00
Kevin P. Neal 9f1c35d8b1 Revert "[PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins"
The new test case causes bot failures.

This reverts commit ba87430cad.
2020-04-03 15:47:19 -04:00
Andrew Wock ba87430cad [PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins
This patch adds a test for the PowerPC fma compiler builtins, some variations
of which negate inputs and outputs. The code to generate IR for these
builtins was untested before this patch.

Originally, the code used the outdated method of subtracting floating point
values from -0.0 as floating point negation. This patch remedies that.

Patch by: Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D76949
2020-04-03 14:59:33 -04:00
Michael Liao b952d799ca [cuda][hip] Fix `RegisterVar` function prototype.
Summary:
- `RegisterVar` has `void` return type and `size_t` in its variable size
  parameter in HIP or CUDA 9.0+.

Reviewers: tra, yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77398
2020-04-03 12:57:09 -04:00
Lucas Prates e6cb4b659a [Clang][CodeGen] Fixing mismatch between memory layout and const expressions for oversized bitfields
Summary:
The construction of constants for structs/unions was conflicting the
expected memory layout for over-sized bit-fields. When building the
necessary bits for those fields, clang was ignoring the size information
computed for the struct/union memory layout and using the original data
from the AST's FieldDecl information. This caused an issue in big-endian
targets, where the field's contant was incorrectly misplaced due to
endian calculations.

This patch aims to separate the constant value from the necessary
padding bits, using the proper size information for each one of them.
With this, the layout of constants for over-sized bit-fields matches the
ABI requirements.

Reviewers: rsmith, eli.friedman, efriedma

Reviewed By: efriedma

Subscribers: efriedma, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77048
2020-04-02 11:55:20 +01:00
Daniel Kiss 7314aea5a4 [clang] Move branch-protection from CodeGenOptions to LangOptions
Summary:
Reason: the option has an effect on preprocessing.

Also see thread: http://lists.llvm.org/pipermail/cfe-dev/2020-March/065014.html

Reviewers: chill, efriedma

Reviewed By: efriedma

Subscribers: efriedma, danielkiss, cfe-commits, kristof.beyls

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77131
2020-04-02 10:31:52 +02:00
Johannes Doerfert 1858f4b50d Revert "[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`"
This reverts commit c18d55998b.

Bots have reported uses that need changing, e.g.,
  clang-tools-extra/clang-tidy/openmp/UseDefaultNoneCheck.cp
as reported by
  http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/46591
2020-04-02 02:23:22 -05:00
Johannes Doerfert c18d55998b [OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.

Differential Revision: https://reviews.llvm.org/D77112
2020-04-02 01:39:07 -05:00
David Blaikie db92719c1d DebugInfo: Defaulted non-type template parameters of bool type
Caused an assertion due to mismatched bit widths - this seems like the
right API to use for a possibly width-varying equality test. Though
certainly open to some post-commit review feedback if there's a more
suitable way to do this comparison/test.
2020-04-01 13:21:13 -07:00
Arnold Schwaighofer 153dadf3a3 [clang] CodeGen: Make getOrEmitProtocol public for Swift
Summary:
Swift would like to use clang's apis to emit protocol declarations.

This commits adds the public API:

```
 emitObjCProtocolObject(CodeGenModule &CGM, const ObjCProtocolDecl *p);
```

rdar://60888524

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77077
2020-04-01 08:55:56 -07:00
Vlastimil Labsky 57fd86de87 [AVR] Fix function pointer address space
Summary:
Function pointers should be created with program address space.
This fixes function pointers on AVR.

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: Jim, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77119
2020-04-01 21:08:37 +13:00
Ian Levesque bb3111cbaf [clang][xray] Add xray attributes to functions without decls too
Summary: This allows instrumenting things like global initializers

Reviewers: dberris, MaskRay, smeenai

Subscribers: cfe-commits, johnislarry

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77191
2020-04-01 00:02:39 -04:00
Alexey Bataev c2aa543237 [OPENMP50]Codegen for array shaping expression in map clauses.
Added codegen support for array shaping operations in map/to/from
clauses.
2020-03-31 19:06:49 -04:00
Alexey Bataev e094dd5adc [OPENMP50]Fix size calculation for array shaping expression in the
codegen.

Need to include the size of the pointee type when trying to calculate
the total size of the array shaping expression.
2020-03-31 18:45:21 -04:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
zhizhouy 94d912296d [NFC] Do not run CGProfilePass when not using integrated assembler
Summary:
CGProfilePass is run by default in certain new pass manager optimization pipeline. Assemblers other than llvm as (such as gnu as) cannot recognize the .cgprofile entries generated and emitted from this pass, causing build time error.

This patch adds new options in clang CodeGenOpts and PassBuilder options so that we can turn cgprofile off when not using integrated assembler.

Reviewers: Bigcheese, xur, george.burgess.iv, chandlerc, manojgupta

Reviewed By: manojgupta

Subscribers: manojgupta, void, hiraditya, dexonsmith, llvm-commits, tcwang, llozano

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D62627
2020-03-31 10:31:31 -07:00
Alexey Bataev a4f74f377b [OPENMP50]Do not imply lvalue as base expression in array shaping
expression.

We should not assume that the base expression in the array shaping
operation is an lvalue of some form, it may be an rvalue.
2020-03-30 17:07:08 -04:00
Alexey Bataev 7842e7ebbf [OPENMP50]Add codegen support for array shaping expression in depend
clauses.

Implemented codegen for array shaping operation in depend clauses. The
begin of the expression is the pointer itself, while the size of the
dependence data is the mukltiplacation of all dimensions in the array
shaping expression.
2020-03-30 13:37:21 -04:00
Michael Liao cb6389360b Fix GCC warning on enum class bitfield. NFC. 2020-03-28 10:20:34 -04:00
Yaxun (Sam) Liu 369e26ca9e [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z
The main purpose of introducing these builtins is to add a range
metadata [1, 1025) on the work group size loaded from dispatch
ptr, which cannot be done by source code.

Differential Revision: https://reviews.llvm.org/D76772
2020-03-28 01:03:20 -04:00
Alexey Bataev 0fca766458 [OPENMP50]Fix PR45117: Orphaned task reduction should be allowed.
Add support for orpahned task reductions.
2020-03-27 17:47:30 -04:00
Adrian Prantl 22d5bd0e3b Allow remapping Clang module include paths
in the debug info with -fdebug-prefix-map.

rdar://problem/55685132

This reapplies an earlier attempt to commit this without
modifications.

Differential Revision: https://reviews.llvm.org/D76385
2020-03-27 14:23:30 -07:00
Michael Liao 5be9b8cbe2 [cuda][hip] Add CUDA builtin surface/texture reference support.
Summary: - Re-commit after fix Sema checks on partial template specialization.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365
2020-03-27 17:18:49 -04:00
Artem Belevich fe8063e1a0 Revert "[cuda][hip] Add CUDA builtin surface/texture reference support."
This reverts commit 6a9ad5f3f4.
The patch breaks CUDA copmilation.

Differential Revision: https://reviews.llvm.org/D76365
2020-03-27 10:01:38 -07:00
Mikael Holmen 7d482e9213 Fix TBAA for unsigned fixed-point types
Summary:
Unsigned types can alias the corresponding signed types. I don't see
that this is explicitly mentioned in the Embedded-C specification, but
I think it should work the same as for the integer types.

Patch by: materi

Reviewers: ebevhan, leonardchan

Reviewed By: leonardchan

Subscribers: kosarev, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76856
2020-03-27 10:35:24 +01:00
Johannes Doerfert befb4be3a8 [OpenMP] `omp begin/end declare variant` - part 2, sema ("+CG")
This is the second part loosely extracted from D71179 and cleaned up.

This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.

A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].

The logic is as follows:

If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
  1) Create a function declaration for the definition we were about to generate.
  2) Create a function definition but with a mangled name (according to
     `<SELECTOR>`).
  3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
     one used already for `omp declare variant`, using and the mangled
     function definition as specialization for the context defined by
     `<SELECTOR>`.

When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.

[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman

Subscribers: bollu, guansong, openmp-commits, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75779
2020-03-27 02:30:58 -05:00
Johannes Doerfert 095cecbe0d [OpenMP] `omp begin/end declare variant` - part 1, parsing
This is the first part extracted from D71179 and cleaned up.

This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].

A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].

[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D74941
2020-03-27 02:30:58 -05:00
Sid Manning b0da094983 [Hexagon] Add support for Linux/Musl ABI (part 2)
A continuation of https://reviews.llvm.org/D72701.  This
adds support needed in clang.

Differential Revision: https://reviews.llvm.org/D75638
2020-03-26 17:19:46 -05:00
Michael Liao 6a9ad5f3f4 [cuda][hip] Add CUDA builtin surface/texture reference support.
Summary:
- Even though the bindless surface/texture interfaces are promoted,
  there are still code using surface/texture references. For example,
  [PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
  compilation issue for code using `tex2D` with texture references. For
  better compatibility, this patch proposes the support of
  surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
  `nvcc` does use builtins for texture support. From the limited NVVM
  documentation[^nvvm] and NVPTX backend texture/surface related
  tests[^test], it's believed that surface/texture references are
  supported by replacing their reference types, which are annotated with
  `device_builtin_surface_type`/`device_builtin_texture_type`, with the
  corresponding handle-like object types, `cudaSurfaceObject_t` or
  `cudaTextureObject_t`, in the device-side compilation. On the host
  side, that global handle variables are registered and will be
  established and updated later when corresponding binding/unbinding
  APIs are called[^bind]. Surface/texture references are most like
  device global variables but represented in different types on the host
  and device sides.
- In this patch, the following changes are proposed to support that
  behavior:
  + Refine `device_builtin_surface_type` and
    `device_builtin_texture_type` attributes to be applied on `Type`
    decl only to check whether a variable is of the surface/texture
    reference type.
  + Add hooks in code generation to replace that reference types with
    the correponding object types as well as all accesses to them. In
    particular, `nvvm.texsurf.handle.internal` should be used to load
    object handles from global reference variables[^texsurf] as well as
    metadata annotations.
  + Generate host-side registration with proper template argument
    parsing.

---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used.  But, the current backend doesn't have that supported. We may revise that later.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365
2020-03-26 14:44:52 -04:00
Eli Friedman c46a0c07a6 [clang codegen] Address review comment on comment in constWithPadding. 2020-03-25 10:58:03 -07:00
Adrian Prantl c025235e96 Revert "Allow remapping Clang module include paths"
to investigate why this commit broke a test in the LLDB testsuite.

This reverts commit dca920a904.
2020-03-24 17:57:34 -07:00
Adrian Prantl dca920a904 Allow remapping Clang module include paths
rdar://problem/55685132

Differential Revision: https://reviews.llvm.org/D76385
2020-03-24 17:14:27 -07:00
Eli Friedman 3f1defa6e2 [clang codegen] Clean up handling of vectors with trivial-auto-var-init.
The code was pretending to be doing something useful with vectors, but
really it was doing nothing: the element type of a vector is always a
scalar type, so constWithPadding would always just return the input constant.

Split off from D75661 so it can be reviewed separately.

While I'm here, also add testcase to show missing vector handling.

Differential Revision: https://reviews.llvm.org/D76528
2020-03-24 14:34:40 -07:00
Erik Pilkington de98cf92e3 [CodeGen] Add an alignment attribute to all sret parameters
This fixes a miscompile when the parameter is actually underaligned.
rdar://58316406

Differential revision: https://reviews.llvm.org/D74183
2020-03-24 15:31:57 -04:00
Momchil Velikov 080d046c91 [ARM][CMSE] Implement CMSE attributes
This patch adds CMSE attributes `cmse_nonsecure_call` and
`cmse_nonsecure_entry`.  As usual, specification is available here:
https://developer.arm.com/docs/ecm0359818/latest

Patch by Javed Absar, Bradley Smith, David Green, Momchil Velikov,
possibly others.

Differential Revision: https://reviews.llvm.org/D71129
2020-03-24 10:21:26 +00:00
Jun Ma d0f4af8f30 [Coroutines] Insert lifetime intrinsics even O0 is used
Differential Revision: https://reviews.llvm.org/D76119
2020-03-24 13:41:55 +08:00
Matt Arsenault 3f533006ba AMDGPU: Emit llvm.fshr for __builtin_amdgcn_alignbit
These are equivalent. The generic rotate builtins do not directly map
to the fshr intrinsic.
2020-03-23 16:51:25 -04:00
Johannes Doerfert 55eca2853e [OpenMP][NFC] Minimize memory usage and copying of `OMPTraitInfo`s
See rational here: https://reviews.llvm.org/D71830#1922656

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D76173
2020-03-23 14:23:46 -05:00
Alexey Bataev 63828a35da [OPENMP50]Bassic support for exclusive clause.
Added basic support (parsing/sema/serialization) for exclusive clause in
scan directives.
2020-03-23 13:12:52 -04:00
David Blaikie b5eafda8d3 Revert "EHScopeStack::Cleanup has virtual functions so the destructor should be too."
This type was already well designed - having a protected destructor, and
derived classes being final/public non-virtual destructors, the type
couldn't be destroyed polymorphically & accidentally cause slicing.

This reverts commit 736385c0b4.
2020-03-21 21:17:33 -07:00
Thomas Lively de6cd3e836 [WebAssembly] Add SIMD integer abs builtins
Summary:
Since the conditional operator cannot be used with vector conditions
in C, we need a builtin to be able to express this operation in C
source.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76538
2020-03-21 00:21:24 -07:00
Akira Hatanaka d35a454170 [CodeGen] Emit destructor calls to destruct non-trivial C struct objects
returned by function calls or loaded from volatile objects

rdar://problem/51867864

Differential Revision: https://reviews.llvm.org/D66094
2020-03-20 18:34:22 -07:00
Adrian Prantl ceae47143b Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 16:27:50 -07:00
Adrian Prantl bde15de3ca Revert "Allow remapping the sysroot with -fdebug-prefix-map."
This reverts commit 6725c4836a.
2020-03-20 16:27:23 -07:00
Adrian Prantl 6725c4836a Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 15:52:39 -07:00
Adrian Prantl 43580a5c5a Allow remapping Clang module skeleton CU references with -fdebug-prefix-map
Differential Revision: https://reviews.llvm.org/D76383
2020-03-20 15:15:56 -07:00
Adrian Prantl 97f490d87b Don't set the isOptimized flag in module skeleton DICompileUnits.
It's not used for anything.
2020-03-20 14:18:15 -07:00
Adrian Prantl 079c6ddaf5 Correctly initialize the DW_AT_comp_dir attribute of Clang module skeleton CUs
Before this patch a Clang module skeleton CU would have a
DW_AT_comp_dir pointing to the directory of the module map file, and
this information was not used by anyone. Even worse, LLDB actually
resolves relative DWO paths by appending it to DW_AT_comp_dir. This
patch sets it to the same directory that is used as the main CU's
compilation directory, which would make the LLDB code work.

Differential Revision: https://reviews.llvm.org/D76377
2020-03-20 14:18:14 -07:00
Alexey Bataev 06dea73307 [OPENMP50]Initial support for inclusive clause.
Added parsing/sema/serialization support for inclusive clause in scan
directive.
2020-03-20 14:20:38 -04:00
Reid Kleckner ce5173c0e1 Use FinishThunk to finish musttail thunks
FinishThunk, and the invariant of setting and then unsetting
CurCodeDecl, was added in 7f416cc426 (2015). The invariant didn't
exist when I added this musttail codepath in ab2090d107 (2014).
Recently in 28328c3771, I started using this codepath on non-Windows
platforms, and users reported problems during release testing (PR44987).

The issue was already present for users of EH on i686-windows-msvc, so I
added a test for that case as well.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D76444
2020-03-20 09:02:21 -07:00
Alexey Bataev fcba7c3534 [OPENMP50]Initial support for scan directive.
Addedi basic parsing/sema/serialization support for scan directive.
2020-03-20 07:58:15 -04:00
Shiva Chen fc3752665f [RISCV] Passing small data limitation value to RISCV backend
Passing small data limit to RISCVELFTargetObjectFile by module flag,
So the backend can set small data section threshold by the value.
The data will be put into the small data section if the data smaller than
the threshold.

Differential Revision: https://reviews.llvm.org/D57497
2020-03-20 11:03:51 +08:00
Thomas Lively a3f974f3c3 [WebAssembly] SIMD bitmask intrinsics and builtin functions
Summary:
These experimental new instructions are proposed in
https://github.com/WebAssembly/simd/pull/201.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76397
2020-03-19 17:15:37 -07:00
Djordje Todorovic d9b9621009 Reland D73534: [DebugInfo] Enable the debug entry values feature by default
The issue that was causing the build failures was fixed with the D76164.
2020-03-19 13:57:30 +01:00
Lucas Prates d4ad386ee1 [ARM] Fixing range checks for Neon's vqdmulhq_lane and vqrdmulhq_lane intrinsics
Summary:
The range checks performed for the vqrdmulh_lane and vqrdmulh_lane Neon
intrinsics were incorrectly using their return type as the base type for
the range check performed on their 'lane' argument.

This patch updates those intrisics to use the type of the proper reference
argument to perform the range checks.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74766
2020-03-19 12:08:12 +00:00
Lucas Prates f56550cf7f [ARM] Enabling range checks on Neon intrinsics' lane arguments
Summary:
Range checks were not properly performed in the lane arguments of Neon
intrinsics implemented based on splat operations. Calls to those
intrinsics where translated to `__builtin__shufflevector` calls directly
by the pre-processor through the arm_neon.h macros, missing the chance
for the proper range checks.

This patch enables the range check by introducing an auxiliary splat
instruction in arm_neon.td, delaying the translation to shufflevector
calls to CGBuiltin.cpp in clang after the checks were performed.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, ostannard

Reviewed By: ostannard

Subscribers: ostannard, dnsampaio, danielkiss, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74619
2020-03-19 12:07:23 +00:00
Lucas Prates 7bf23563f4 Revert "[ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions"
This reverts commit 62ab15ffa3.

Multiple commits were unintentionally squashed into this one. Reverting
so each of them can be pushed properly.
2020-03-19 12:01:13 +00:00
Lucas Prates 62ab15ffa3 [ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions
Summary:
Some of the `*_laneq` intrinsics defined in arm_neon.td were missing the
setting of the `isLaneQ` attribute. This patch sets the attribute on the
related definitions, as they will be required to properly perform range
checks on their lane arguments.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74616
2020-03-19 11:52:41 +00:00
Richard Smith f18233dad4 Fix -fsanitize=array-bound to treat T[0] union members as flexible array
members regardless of whether they're the last member of the union.
2020-03-18 15:47:24 -07:00
Alexey Bataev f3c857fae2 [OPENMP50]Add basic codegen support for ancestor device modifier.
If the ancestor device modifier is used and the value of the device
clause is evaluated to 1, the ancestor device shall be used for the
execution.
Since the reverse offloading is not supported yet, the target construct
execution is always initiated from the host, not from the device. So, if
the ancestor modifier is specified, just execute target region on the
host.
2020-03-18 17:53:18 -04:00
Alexey Bataev 2f8894a5b8 [OPENMP50]Add support for extended device clause in target directives.
Added parsing/sema/serialization support for extended device clause in
executable target directives.
2020-03-18 15:02:37 -04:00
Michael Liao 4cf01ed75e [hip] Revise `GlobalDecl` constructors. NFC.
Summary:
- https://reviews.llvm.org/D68578 revises the `GlobalDecl` constructors
  to ensure all GPU kernels have `ReferenceKenelKind` initialized
  properly with an explicit constructor and static one. But, there are
  lots of places using the implicit constructor triggering the assertion
  on non-GPU kernels. That's found in compilation of many tests and
  workloads.
- Fixing all of them may change more code and, more importantly, all of
  them assumes the default kernel reference kind. This patch changes
  that constructor to tell `CUDAGlobalAttr` and construct `GlobalDecl`
  properly.

Reviewers: yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76344
2020-03-18 09:33:39 -04:00
Alexey Bataev b09cce07c7 [OPENMP50]Codegen for detach clause.
Implemented codegen for detach clause in task directives.
2020-03-18 09:01:17 -04:00
Sander de Smalen c5b81466c2 Reland D75470 [SVE] Auto-generate builtins and header for svld1.
Reworked the patch to avoid sharing a header (SVETypeFlags.h) between
include/clang/Basic and utils/TableGen/SveEmitter.cpp. Now the patch
generates the enum/flags which is included in TargetBuiltins.h.

Also renamed one of the SveEmitter options to be in line with MVE.

Summary:

This is a first patch in a series for the SveEmitter to generate the arm_sve.h
header file and builtins.

I've tried my best to strip down this patch as best as I could, but there
are still a few changes that are not necessarily exercised by the load intrinsics
in this patch, mostly around the SVEType class which has some common logic to
represent types from a type and prototype string. I thought it didn't make
much sense to remove that from this patch and split it up.
2020-03-18 11:16:28 +00:00
Michael Liao a2920c4ea9 [codegen] Fix one more case where `getGlobalDecl` should be used. NFC.
- After https://reviews.llvm.org/D68578, the implicit conversion from
  `FunctionDecl` to `GlobalDecl` needs replacing with `getGlobalDecl`;
  otherwise, assertion is triggered.
2020-03-17 17:56:47 -04:00
Jon Chesterfield c45eaeabb7 [Clang] Undef attribute for global variables
Summary:
[Clang] Attribute to allow defining undef global variables

Initializing global variables is very cheap on hosted implementations. The
C semantics of zero initializing globals work very well there. It is not
necessarily cheap on freestanding implementations. Where there is no loader
available, code must be emitted near the start point to write the appropriate
values into memory.

At present, external variables can be declared in C++ and definitions provided
in assembly (or IR) to achive this effect. This patch provides an attribute in
order to remove this reason for writing assembly for performance sensitive
freestanding implementations.

A close analogue in tree is LDS memory for amdgcn, where the kernel is
responsible for initializing the memory after it starts executing on the gpu.
Uninitalized variables in LDS are observably cheaper than zero initialized.

Patch is loosely based on the cuda __shared__ and opencl __local variable
implementation which also produces undef global variables.

Reviewers: kcc, rjmccall, rsmith, glider, vitalybuka, pcc, eugenis, vlad.tsyrklevich, jdoerfert, gregrodgers, jfb, aaron.ballman

Reviewed By: rjmccall, aaron.ballman

Subscribers: Anastasia, aaron.ballman, davidb, Quuxplusone, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74361
2020-03-17 21:22:23 +00:00
Alexey Bataev 0f0564bb9a [OPENMP50]Initial support for detach clause in task directive.
Added parsing/sema/serialization support for detach clause.
2020-03-17 09:19:03 -04:00
Kerry McLaughlin af64948e2a [SVE][Inline-Asm] Add constraints for SVE ACLE types
Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
 - y: SVE registers Z0-Z7
 - Upl: One of the low eight SVE predicate registers (P0-P7)
 - Upa: Full range of SVE predicate registers (P0-P15)

Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin

Reviewed By: efriedma

Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75690
2020-03-17 11:04:19 +00:00
Evgenii Stepanov 2a3723ef11 [memtag] Plug in stack safety analysis.
Summary:
Run StackSafetyAnalysis at the end of the IR pipeline and annotate
proven safe allocas with !stack-safe metadata. Do not instrument such
allocas in the AArch64StackTagging pass.

Reviewers: pcc, vitalybuka, ostannard

Reviewed By: vitalybuka

Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, gilang, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73513
2020-03-16 16:35:25 -07:00
Sander de Smalen 6ce537ccfc Revert "[SVE] Auto-generate builtins and header for svld1."
This reverts commit 8b409eabaf.

Reverting this patch for now because it breaks some buildbots.
2020-03-16 15:22:15 +00:00
Sander de Smalen 8b409eabaf [SVE] Auto-generate builtins and header for svld1.
This is a first patch in a series for the SveEmitter to generate the arm_sve.h
header file and builtins.

I've tried my best to strip down this patch as best as I could, but there
are still a few changes that are not necessarily exercised by the load intrinsics
in this patch, mostly around the SVEType class which has some common logic to
represent types from a type and prototype string. I thought it didn't make
much sense to remove that from this patch and split it up.

Reviewers: efriedma, rovka, SjoerdMeijer, rsandifo-arm, rengolin

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75470
2020-03-16 10:52:37 +00:00
Jun Ma 53c2e10fb8 [Coroutines] Do not evaluate InitListExpr of a co_return
Differential Revision: https://reviews.llvm.org/D76118
2020-03-16 12:42:44 +08:00
Sander de Smalen 5087ace651 [Clang][SVE] Parse builtin type string for scalable vectors
This patch adds 'q' to mean 'scalable vector' in the builtin
type string, and for SVE will return the matching builtin
type as defined in the C/C++ language extensions for SVE.

This patch also adds some scaffolding to generate the arm_sve.h
header file, and some builtin definitions (+CodeGen) to be able
to implement some simple masked load intrinsics that use the
ACLE types, such as:

 svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) {
   return svld1_s8(pg, base);
 }

Reviewers: efriedma, rjmccall, rovka, rsandifo-arm, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75298
2020-03-15 14:34:52 +00:00
Alexey Bataev b3998a0edb [OPENMP]Fix PR45047: Do not copy firstprivates in tasks twice.
Avoid copying of the orignal variable if it is going to be marked as
firstprivate in task regions. For taskloops, still need to copy the
non-trvially copyable variables to correctly construct them upon task
creation.
2020-03-13 18:04:16 -04:00
Nico Weber f82b32a51e Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit 5aa5c943f7.
Causes clang to assert, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1061533#c4
for a repro.
2020-03-13 15:37:44 -04:00
Adrian Prantl 842ea709e4 Debug Info: Store the SDK in the DICompileUnit.
This is another intermediate step for PR44213
(https://bugs.llvm.org/show_bug.cgi?id=44213).

This stores the SDK *name* in the debug info, to make it possible to
`-fdebug-prefix-map`-replace the sysroot with a recognizable string
and allowing the debugger to find a fitting SDK relative to itself,
not the machine the executable was compiled on.

rdar://problem/51645582
2020-03-13 11:21:30 -07:00
Alexey Bataev 172f1460ae [OPENMP]Reduce number of captured global vars.
Try to reduce the number of global vars captured in the OpenMP regions
by capturing them only the regions, which mark them as not-shared.
2020-03-13 10:47:54 -04:00
Yaxun (Sam) Liu 0ffb12ca67 [HIP] Mark kernels with uniform-work-group-size=true
Differential Revision: https://reviews.llvm.org/D76076
2020-03-13 06:56:56 -04:00
Huihui Zhang 118abf2017 [SVE] Update API ConstantVector::getSplat() to use ElementCount.
Summary:
Support ConstantInt::get() and Constant::getAllOnesValue() for scalable
vector type, this requires ConstantVector::getSplat() to take in 'ElementCount',
instead of 'unsigned' number of element count.

This change is needed for D73753.

Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74386
2020-03-12 13:22:41 -07:00
Reid Kleckner 26d254f084 Sink more Attr.h inline methods, NFC
This has very little impact on build time, but is a mechanical pre-req
to removing the OpenMPClause.h include, which matters. Most of these
pretty print methods require Expr to be complete.
2020-03-12 11:54:31 -07:00
Simon Pilgrim adeb8c5428 Replace getAs with castAs to fix null dereference static analyzer warning.
Use castAs as we know the cast should succeed (and castAs will assert if it doesn't) and we're dereferencing it directly in the BuildRCBlockVarRecordLayout call.
2020-03-12 18:52:58 +00:00
Simon Pilgrim 336530be07 CGOpenMPRuntime::emitDeclareTargetVarDefinition - fix static analyzer null dereference warning. NFCI.
All paths test for or dereference the VD pointer, so just assert that its not null.
2020-03-12 18:52:57 +00:00
Reid Kleckner e08464fb45 Avoid including FileManager.h from SourceManager.h
Most clients of SourceManager.h need to do things like turning source
locations into file & line number pairs, but this doesn't require
bringing in FileManager.h and LLVM's FS headers.

The main code change here is to sink SM::createFileID into the cpp file.
I reason that this is not performance critical because it doesn't happen
on the diagnostic path, it happens along the paths of macro expansion
(could be hot) and new includes (less hot).

Saves some includes:
    309 -    /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileManager.h
    272 -    /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileSystemOptions.h
    271 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/VirtualFileSystem.h
    267 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/FileSystem.h
    266 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Chrono.h

Differential Revision: https://reviews.llvm.org/D75406
2020-03-11 13:53:12 -07:00
Reid Kleckner c915cb957d Avoid including Module.h from ExternalASTSource.h
Module.h takes 86ms to parse, mostly parsing the class itself. Avoid it
if possible. ASTContext.h depends on ExternalASTSource.h.

A few NFC changes were needed to make this possible:

- Move ASTSourceDescriptor to Module.h. This needs Module to be
  complete, and seems more related to modules and AST files than
  external AST sources.
- Move "import complete" bit from Module* pointer int pair to
  NextLocalImport pointer. Required because PointerIntPair<Module*,...>
  requires Module to be complete, and now it may not be.

Reviewed By: aaron.ballman, hans

Differential Revision: https://reviews.llvm.org/D75784
2020-03-11 13:37:41 -07:00
Jin Lin a0cacb6054 Fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode
Summary:
The change is to fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode.
The purpose is to provide the support of LTO for swift and Objective-C mixed project.

Reviewers: rjmccall, ahatanak, steven_wu

Reviewed By: rjmccall, steven_wu

Subscribers: manmanren, mehdi_amini, hiraditya, dexonsmith, llvm-commits, jinlin

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71219
2020-03-11 13:26:06 -07:00
Akira Hatanaka 37fa9d65ea [CodeGen][ObjC] Don't extend lifetime of ObjC pointers passed to calls
to __builtin_os_log_format if ARC isn't enabled

Fixes a bug introduced in this commit:
f4d791f833

rdar://problem/60301219
2020-03-10 22:10:32 -07:00
Erik Pilkington 75af694a6d [CodeGenObjC] Place property names in __objc_methname
This allows the property name to deduplicate with the accessor method name.
rdar://58927964
2020-03-10 14:31:00 -07:00
Akira Hatanaka 40568fec7e [CodeGen] Emit destructor calls to destruct compound literals
Fix a bug in IRGen where it wasn't destructing compound literals in C
that are ObjC pointer arrays or non-trivial structs. Also diagnose jumps
that enter or exit the lifetime of the compound literals.

rdar://problem/51867864

Differential Revision: https://reviews.llvm.org/D64464
2020-03-10 14:08:28 -07:00
Mikhail Maltsev 47edf5bafb [ARM,CDE] Generalize MVE intrinsics infrastructure to support CDE
Summary:
This patch generalizes the existing code to support CDE intrinsics
which will share some properties with existing MVE intrinsics
(some of the intrinsics will be polymorphic and accept/return values
of MVE vector types).
Specifically the patch:
* Adds new tablegen backends -gen-arm-cde-builtin-def,
  -gen-arm-cde-builtin-codegen, -gen-arm-cde-builtin-sema,
  -gen-arm-cde-builtin-aliases, -gen-arm-cde-builtin-header based on
  existing MVE backends.
* Renames the '__clang_arm_mve_alias' attribute into
  '__clang_arm_builtin_alias' (it will be used with CDE intrinsics as
  well as MVE intrinsics)
* Implements semantic checks for the coprocessor argument of the CDE
  intrinsics as well as the existing coprocessor intrinsics.
* Adds one CDE intrinsic __arm_cx1 to test the above changes

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: sdesmalen, mgorny, kristof.beyls, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75850
2020-03-10 14:03:16 +00:00
Djordje Todorovic 5aa5c943f7 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-03-10 09:15:06 +01:00
Erik Pilkington 7fbf15a8f2 [CodeGenObjC] Privatize some ObjC metadata symbols
Nobody needs these symbols, so there isn't any benefit in including them. This
saves some code-size in Objective-C binaries. Partially reverts:
https://reviews.llvm.org/D61454. rdar://56579760

Differential revision: https://reviews.llvm.org/D75491
2020-03-09 15:40:24 -07:00
Alexey Bataev 6309334b95 [OPENMP50]Codegen for depobj dependency kind.
Implemented codegen for depobj modifier in depend clauses.
2020-03-09 17:46:06 -04:00
Yaxun (Sam) Liu 22c457a869 [HIP] Fix device stub name
HIP emits a device stub function for each kernel in host code.

The HIP debugger requires device stub function to have a different unmangled name as the kernel.

Currently the name of the device stub function is the mangled name with a postfix .stub. However,
this does not work with the HIP debugger since the unmangled name is the same as the kernel.

This patch adds prefix __device__stub__ to the unmangled name of the device stub before mangling,
therefore the device stub function has a valid mangled name which is different than the device kernel
name. The device side kernel name is kept unchanged. kernels with extern "C" also gets the prefix added
to the corresponding device stub function.

Differential Revision: https://reviews.llvm.org/D68578
2020-03-09 16:40:05 -04:00
Krzysztof Parzyszek d0ca1041ba [Hexagon] Refactor handling of circular load/store builtins, NFC 2020-03-09 14:40:08 -05:00
Erich Keane 7b66160828 Fix Target Multiversioning renaming.
The initial implementation only did 'first declaration renaming' when
a default version came after. This is insufficient in cases where a
default does not exist, so this patch makes sure that we do the renaming
in all cases.

This renaming is necessary because we emit the first declaration before
knowing that it IS a target multiversion function, which would change
its name. The second declaration (the one that caused the
multiversioning) then needs to make sure that the first one has its name
changed to be consistent with the resolver usage.
2020-03-09 08:29:18 -07:00
Djordje Todorovic c15c68abdc [CallSiteInfo] Enable the call site info only for -g + optimizations
Emit call site info only in the case of '-g' + 'O>0' level.

Differential Revision: https://reviews.llvm.org/D75175
2020-03-09 12:12:44 +01:00
Yaxun (Sam) Liu 29e1a16be8 [NFC] Let mangler accept GlobalDecl
Differential Revision: https://reviews.llvm.org/D75700
2020-03-07 23:51:41 -05:00
Matt Arsenault a4e71f01c0 Assume ieee behavior without denormal-fp-math attribute 2020-03-07 12:10:56 -05:00
Akira Hatanaka f4d791f833 [CodeGen][ObjC] Extend lifetime of ObjC pointers passed to calls to
__builtin_os_log_format

This is needed to keep all the objects, including temporaries returned
by function calls, written to the buffer alive until os_log_pack_send is
called.

rdar://problem/60105410
2020-03-06 16:46:50 -08:00
Thomas Lively d43fcd0c04 [WebAssembly] Add SIMD integer min/max builtins
Summary:
Although SIMD integer min/max operations can be expressed using the ?:
operator in C++, that operator is disallowed for vectors in C. As a
workaround, this change introduces new WebAssembly-specific builtin
functions that lower to the desired vector icmp/select sequences.

Reviewers: aheejin, dschuff, kripken

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75770
2020-03-06 14:28:52 -08:00
Alexey Bataev 5dadf577d5 [OPENMP50]Add 'depobj' modifier in 'depend' clauses.
Added basic support (parsing/sema/serialization) for depobj dependency
kind in depend clauses.
2020-03-06 11:44:57 -05:00
Alexey Bataev 8d7b118875 [OPENMP50]Add codegen for update clause in depobj directive.
Added codegen for update clause in depobj. Reads the number of the
elements from the first element and updates flags for each element in
the loop.
```
omp_depend_t x;
kmp_depend_info *base = (kmp_depend_info *)x;
intptr_t num = x[-1].base_addr;
kmp_depend_info *end = x + num;
kmp_depend_info *el = base;
do {
  el.flags = new_flag;
  el = &el[1];
} while (el != end);
```
2020-03-05 14:31:07 -05:00
Alexey Bataev ea5b3ef593 [OPENMP50]Skip the first element when storing the list of dependencies
in depobj object.

The first element in the list of the dependencies is used for internal
purposes to store the number of the elements in the provided list.
The first element now is skipped and depobj object poits exactly to the
list of dependencies.
2020-03-05 14:26:07 -05:00
Adrian Prantl 314b9278f0 Revert "[CGBlocks] Improve line info in backtraces containing *_helper_block"
Block copy/destroy helpers are now linkonce_odr functions, meant to be uniqued, and thus attaching debug information from one translation unit (or even just from one instance of many inside one translation unit) would be misleading and wrong in the general case.

This effectively reverts commit 9c6b6826ce.

<rdar://problem/59137040>

Differential Revision: https://reviews.llvm.org/D75615
2020-03-05 09:58:42 -08:00
Alexey Bataev b27ff4d07d [OPENMP50]Codegen for 'destroy' clause in depobj directive.
If the destroy clause is appplied, the previously allocated memory for
the dependency object must be destroyed.
2020-03-04 16:30:34 -05:00
Alexey Bataev e46f0fee30 [OPENMP50]Codegen for 'depend' clause in depobj directive.
Added codegen for 'depend' clause in depobj directive. The depend clause
is emitted as kmp_depend_info <deps>[<number_of_items_in_clause> + 1]. The
first element in this array is reserved for storing the number of
elements in this array: <deps>[0].base_addr =
<number_of_items_in_clause>;

This extra element is required to implement 'update' and 'destroy'
clauses. It is required to know the size of array to destroy it
correctly and to update depency kind.
2020-03-04 15:01:53 -05:00
hsmahesha cac068600e [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code
Summary:
hip-pinned-shadow global var should remain in the final code object irrespective
of whether it is used or not within the code. Add it to used list, so that it
will not get eliminated when it is unused.

Reviewers: yaxunl, tra, hliao

Reviewed By: yaxunl

Subscribers: hliao, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75402
2020-03-04 10:54:26 +05:30
Alexey Bataev 375437ab92 [OPENMP50]Support 'destroy' clause on 'depobj' directives.
Added basic support (parsing/sema/serialization) for 'destroy' clause in
depobj directives.
2020-03-02 14:40:53 -05:00
Alexey Bataev c112e941a0 [OPENMP50]Add basic support for depobj construct.
Added basic parsing/sema/serialization support for depobj directive.
2020-03-02 13:10:32 -05:00
Simon Pilgrim dc8680eceb [CodeGenPGO] Fix shadow variable warning. NFC. 2020-03-02 15:06:34 +00:00
Simon Pilgrim 736385c0b4 EHScopeStack::Cleanup has virtual functions so the destructor should be too.
Fixes cppcheck warning.
2020-03-02 15:06:34 +00:00
Simon Pilgrim 842c5c7994 Fix shadow variable warning. NFC. 2020-03-02 11:41:20 +00:00
Awanish Pandey 7a42babeb8 Reland "[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters
in C++ templates."

This was reverted in 802b22b5c8 due to
missing .bc file and a chromium bot failure.
https://bugs.chromium.org/p/chromium/issues/detail?id=1057559#c1
This revision address both of them.

Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.

Reviewers: probinson, aprantl, dblaikie

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D73462
2020-03-02 16:45:48 +05:30
Hans Wennborg 802b22b5c8 Revert "[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters"
The Bitcode/DITemplateParameter-5.0.ll test is failing:

FAIL: LLVM :: Bitcode/DITemplateParameter-5.0.ll (5894 of 36324)
******************** TEST 'LLVM :: Bitcode/DITemplateParameter-5.0.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/llvm-dis -o - /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll.bc | /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/FileCheck /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll
--
Exit Code: 2

Command Output (stderr):
--

It looks like the Bitcode/DITemplateParameter-5.0.ll.bc file was never checked in.

This reverts commit c2b437d53d.
2020-03-02 09:30:52 +01:00
Awanish Pandey c2b437d53d [DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters
in C++ templates.

Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.

Reviewers: probinson, aprantl, dblaikie

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D73462
2020-03-02 12:33:05 +05:30
Simon Pilgrim 7e9747b50b [X86][F16C] Remove cvtph2ps intrinsics and use generic half2float conversion (PR37554)
This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default.

Differential Revision: https://reviews.llvm.org/D75162
2020-02-29 18:57:35 +00:00
Vedant Kumar dd1ea9de2e Reland: [Coverage] Revise format to reduce binary size
Try again with an up-to-date version of D69471 (99317124 was a stale
revision).

---

Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2020-02-28 18:12:04 -08:00
Vedant Kumar 3388871714 Revert "[Coverage] Revise format to reduce binary size"
This reverts commit 99317124e1. This is
still busted on Windows:

http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/40873

The llvm-cov tests report 'error: Could not load coverage information'.
2020-02-28 18:03:15 -08:00
Vedant Kumar 99317124e1 [Coverage] Revise format to reduce binary size
Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2020-02-28 17:33:25 -08:00
Vedant Kumar c54597b99d [ubsan] Add support for -fsanitize=nullability-* suppressions
rdar://59402904
2020-02-28 14:30:40 -08:00
cchen 6ee6fa28a7 [OpenMP5.0] Allow pointer arithmetic in motion/map clause, by Chi Chun
Chen

Summary:
Base declaration in pointer arithmetic expression is determined by
binary search with type information. Take "int *a, *b; *(a+*b)" as an
example, we determine the base by checking the type of LHS and RHS. In
this case the type of LHS is "int *", the type of RHS is "int",
therefore, we know that we need to visit LHS in order to find base
declaration.

Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: guansong, cfe-commits, sandoval, dreachem

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75077
2020-02-28 15:07:32 -05:00
Reid Kleckner 4c2a6567bb Avoid ASTContext.h -> TargetInfo.h dep
This has been done before in 2008: ab13857072
But these things regress easily.
Move some things out of line.

Saves 316 includes + transitive stuff:
    316 -    ../clang/include/clang/Basic/TargetOptions.h
    316 -    ../clang/include/clang/Basic/TargetInfo.h
    316 -    ../clang/include/clang/Basic/TargetCXXABI.h
    316 -    ../clang/include/clang/Basic/OpenCLOptions.h
    316 -    ../clang/include/clang/Basic/OpenCLExtensions.def
    302 -    ../llvm/include/llvm/Target/TargetOptions.h
    302 -    ../llvm/include/llvm/Support/CodeGen.h
    302 -    ../llvm/include/llvm/MC/MCTargetOptions.h
    302 -    ../llvm/include/llvm/ADT/FloatingPointMode.h
    302 -    ../clang/include/clang/Basic/XRayInstr.h
    302 -    ../clang/include/clang/Basic/DebugInfoOptions.h
    302 -    ../clang/include/clang/Basic/CodeGenOptions.h
    302 -    ../clang/include/clang/Basic/CodeGenOptions.def
    257 -    ../llvm/include/llvm/Support/Regex.h
     79 -    ../llvm/include/llvm/ADT/SmallSet.h
     68 -    MSVCSTL/include/set
     66 -    ../llvm/include/llvm/ADT/SmallPtrSet.h
     62 -    ../llvm/include/llvm/ADT/StringSwitch.h
2020-02-27 14:35:00 -08:00
Reid Kleckner 86565c1309 Avoid SourceManager.h include in RawCommentList.h, add missing incs
SourceManager.h includes FileManager.h, which is expensive due to
dependencies on LLVM FS headers.

Remove dead BeforeThanCompare specialization.

Sink ASTContext::addComment to cpp file.

This reduces the time to compile a file that does nothing but include
ASTContext.h from ~3.4s to ~2.8s for me.

Saves these includes:
    219 -    ../clang/include/clang/Basic/SourceManager.h
    204 -    ../clang/include/clang/Basic/FileSystemOptions.h
    204 -    ../clang/include/clang/Basic/FileManager.h
    165 -    ../llvm/include/llvm/Support/VirtualFileSystem.h
    164 -    ../llvm/include/llvm/Support/SourceMgr.h
    164 -    ../llvm/include/llvm/Support/SMLoc.h
    161 -    ../llvm/include/llvm/Support/Path.h
    141 -    ../llvm/include/llvm/ADT/BitVector.h
    128 -    ../llvm/include/llvm/Support/MemoryBuffer.h
    124 -    ../llvm/include/llvm/Support/FileSystem.h
    124 -    ../llvm/include/llvm/Support/Chrono.h
    124 -    .../MSVCSTL/include/stack
    122 -    ../llvm/include/llvm-c/Types.h
    122 -    ../llvm/include/llvm/Support/NativeFormatting.h
    122 -    ../llvm/include/llvm/Support/FormatProviders.h
    122 -    ../llvm/include/llvm/Support/CBindingWrapping.h
    122 -    .../MSVCSTL/include/xtimec.h
    122 -    .../MSVCSTL/include/ratio
    122 -    .../MSVCSTL/include/chrono
    121 -    ../llvm/include/llvm/Support/FormatVariadicDetails.h
    118 -    ../llvm/include/llvm/Support/MD5.h
    109 -    .../MSVCSTL/include/deque
    105 -    ../llvm/include/llvm/Support/Host.h
    105 -    ../llvm/include/llvm/Support/Endian.h

Reviewed By: aaron.ballman, hans

Differential Revision: https://reviews.llvm.org/D75279
2020-02-27 13:49:40 -08:00
Dan Gohman 00072c08c7 [WebAssembly] Mangle the argc/argv `main` as `__wasm_argc_argv`.
WebAssembly enforces a rule that caller and callee signatures must
match. This means that the traditional technique of passing `main`
`argc` and `argv` even when it doesn't need them doesn't work.

Currently the backend renames `main` to `__original_main`, however this
doesn't interact well with LTO'ing libc, and the name isn't intuitive.
This patch allows us to transition to `__main_argc_argv` instead.

This implements the proposal in
https://github.com/WebAssembly/tool-conventions/pull/134
with a flag to disable it when targeting Emscripten, though this is
expected to be temporary, as discussed in the proposal comments.

Differential Revision: https://reviews.llvm.org/D70700
2020-02-27 07:55:36 -08:00
Roman Lebedev 3dd5a298bf
[clang] Annotating C++'s `operator new` with more attributes
Summary:
Right now we annotate C++'s `operator new` with `noalias` attribute,
which very much is healthy for optimizations.

However as per [[ http://eel.is/c++draft/basic.stc.dynamic.allocation | `[basic.stc.dynamic.allocation]` ]],
there are more promises on global `operator new`, namely:
* non-`std::nothrow_t` `operator new` *never* returns `nullptr`
* If `std::align_val_t align` parameter is taken, the pointer will also be `align`-aligned
* ~~global `operator new`-returned pointer is `__STDCPP_DEFAULT_NEW_ALIGNMENT__`-aligned ~~ It's more caveated than that.

Supplying this information may not cause immediate landslide effects
on any specific benchmarks, but it for sure will be healthy for optimizer
in the sense that the IR will better reflect the guarantees provided in the source code.

The caveat is `-fno-assume-sane-operator-new`, which currently prevents emitting `noalias`
attribute, and is automatically passed by Sanitizers ([[ https://bugs.llvm.org/show_bug.cgi?id=16386 | PR16386 ]]) - should it also cover these attributes?
The problem is that the flag is back-end-specific, as seen in `test/Modules/explicit-build-flags.cpp`.
But while it is okay to add `noalias` metadata in backend, we really should be adding at least
the alignment metadata to the AST, since that allows us to perform sema checks on it.

Reviewers: erichkeane, rjmccall, jdoerfert, eugenis, rsmith

Reviewed By: rsmith

Subscribers: xbolva00, jrtc27, atanasyan, nlopes, cfe-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D73380
2020-02-26 01:37:17 +03:00
Yaxun (Sam) Liu a57d9652a0 Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4
Differential Revision: https://reviews.llvm.org/D75028
2020-02-25 13:58:20 -05:00
Rong Xu 11857d4994 [remark][diagnostics] [codegen] Fix PR44896
This patch fixes PR44896. For IR input files, option fdiscard-value-names
should be ignored as we need named values in loadModule().
Commit 60d3947922 sets this option after loadModule() where valued names
already created. This creates an inconsistent state in setNameImpl()
that leads to a seg fault.
This patch forces fdiscard-value-names to be false for IR input files.

This patch also emits a warning of "ignoring -fdiscard-value-names" if
option fdiscard-value-names is explictly enabled in the commandline for
IR input files.

Differential Revision: https://reviews.llvm.org/D74878
2020-02-25 08:15:17 -08:00
Bill Wendling 50cac24877 Support output constraints on "asm goto"
Summary:
Clang's "asm goto" feature didn't initially support outputs constraints. That
was the same behavior as gcc's implementation. The decision by gcc not to
support outputs was based on a restriction in their IR regarding terminators.
LLVM doesn't restrict terminators from returning values (e.g. 'invoke'), so
it made sense to support this feature.

Output values are valid only on the 'fallthrough' path. If an output value's used
on an indirect branch, then it's 'poisoned'.

In theory, outputs *could* be valid on the 'indirect' paths, but it's very
difficult to guarantee that the original semantics would be retained. E.g.
because indirect labels could be used as data, we wouldn't be able to split
critical edges in situations where two 'callbr' instructions have the same
indirect label, because the indirect branch's destination would no longer be
the same.

Reviewers: jyknight, nickdesaulniers, hfinkel

Reviewed By: jyknight, nickdesaulniers

Subscribers: MaskRay, rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69876
2020-02-24 18:51:29 -08:00
Craig Topper 727328433a [X86] Add back fmaddsub intrinsics to work towards fixing the strict fp implementation
Previously we emitted an fmadd and a fmadd+fneg and combined them with a shufflevector. But this doesn't follow the correct exception behavior for unselected elements so the backend can't merge them into the fmaddsub/fmsubadd instructions.

This patch restores the the fmaddsub intrinsics so we don't have two arithmetic operations. We lose out on optimization opportunity in the non-strict FP case, but I don't think this is a big loss. If someone gives us a test case we can look into adding instcombine/dagcombine improvements. I'd rather not have the frontend do completely different things for strict and non-strict.

This still has problems because target specific intrinsics don't support strict semantics yet. We also still have all of the problems with masking. But we at least generate the right instruction in constrained mode now.

Differential Revision: https://reviews.llvm.org/D74268
2020-02-24 12:07:21 -08:00
Xiangling Liao 8bee52bdb5 [AIX][Frontend] C++ ABI customizations for AIX boilerplate
This PR enables "XL" C++ ABI in frontend AST to IR codegen. And it is driven by
static init work. The current kind in Clang by default is Generic Itanium, which
has different behavior on static init with IBM xlclang compiler on AIX.

Differential Revision: https://reviews.llvm.org/D74015
2020-02-24 10:26:51 -05:00
Johannes Doerfert 4b540fa8a1 [OpenMP][NFC] Remove leftover debug messages 2020-02-20 20:28:42 -06:00
Djordje Todorovic 2f215cf36a Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGfaff707db82d.
A failure found on an ARM 2-stage buildbot.
The investigation is needed.
2020-02-20 14:41:39 +01:00
Roman Lebedev 9ea5d17cc9
[Sema] Demote call-site-based 'alignment is a power of two' check for AllocAlignAttr into a warning
Summary:
As @rsmith notes in https://reviews.llvm.org/D73020#inline-672219
while that is certainly UB land, it may not be actually reachable at runtime, e.g.:
```
template<int N> void *make() {
  if ((N & (N-1)) == 0)
    return operator new(N, std::align_val_t(N));
  else
    return operator new(N);
}
void *p = make<7>();
```
and we shouldn't really error-out there.

That being said, i'm not really following the logic here.
Which ones of these cases should remain being an error?

Reviewers: rsmith, erichkeane

Reviewed By: erichkeane

Subscribers: cfe-commits, rsmith

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73996
2020-02-20 16:39:26 +03:00
Reid Kleckner 0edb212925 [MS] Mark vectorcall FP and vector args inreg
This has no effect on how LLVM passes the arguments, but it prevents
rewriteWithInAlloca from thinking that these parameters should be part
of the inalloca pack.

Follow-up to D72114

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D74452
2020-02-19 16:37:50 -08:00
Krzysztof Parzyszek b1d47467e2 [Hexagon] Change HVX vector predicate types from v512/1024i1 to v64/128i1
This commit removes the artificial types <512 x i1> and <1024 x i1>
from HVX intrinsics, and makes v512i1 and v1024i1 no longer legal on
Hexagon.

It may cause existing bitcode files to become invalid.

* Converting between vector predicates and vector registers must be
  done explicitly via vandvrt/vandqrt instructions (their intrinsics),
  i.e. (for 64-byte mode):
    %Q = call <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32> %V, i32 -1)
    %V = call <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1> %Q, i32 -1)

  The conversion intrinsics are:
    declare  <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32>, i32)
    declare <128 x i1> @llvm.hexagon.V6.vandvrt.128B(<32 x i32>, i32)
    declare <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1>, i32)
    declare <32 x i32> @llvm.hexagon.V6.vandqrt.128B(<128 x i1>, i32)
  They are all pure.

* Vector predicate values cannot be loaded/stored directly. This directly
  reflects the architecture restriction. Loading and storing or vector
  predicates must be done indirectly via vector registers and explicit
  conversions via vandvrt/vandqrt instructions.
2020-02-19 14:14:56 -06:00
Fady Ghanim ba3f863dfb [OpenMP][OMPIRBuilder] Introducing the `OMPBuilderCBHelpers` helper class
This patch introduces a new helper class `OMPBuilderCBHelpers`,
which will contain all reusable C/C++ language specific function-
alities required by the `OMPIRBuilder`.

Initially, this helper class contains the body and finalization
codegen functionalities implemented using callbacks which were
moved here for reusability among the different directives
implemented in the `OMPIRBuilder`, along with RAIIs for preserving
state prior to emitting outlined and/or inlined OpenMP regions.

In the future this helper class will also contain all the different
call backs required by OpenMP clauses/variable privatization.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D74562
2020-02-19 14:11:17 -06:00
Sander de Smalen 49b307e96d [AArch64][SVE] CodeGen of ACLE Builtin Types
Summary:
This patch adds codegen support for the ACLE builtin types added in:

  https://reviews.llvm.org/D62960

so that the ACLE builtin types are emitted as corresponding scalable
vector types in LLVM.

Reviewers: rsandifo-arm, rovka, rjmccall, efriedma

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74724
2020-02-19 12:10:47 +00:00
Djordje Todorovic faff707db8 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-02-19 11:12:26 +01:00
Brian Gesiak 048239e46e [Coroutines][6/6] Clang schedules new passes
Summary:
Depends on https://reviews.llvm.org/D71902.

The last in a series of six patches that ports the LLVM coroutines
passes to the new pass manager infrastructure.

This patch has Clang schedule the new coroutines passes when the
`-fexperimental-new-pass-manager` option is used. With this and the
previous 5 patches, Clang is capable of building and successfully
running the test suite of large coroutines projects such as
https://github.com/lewissbaker/cppcoro with
`ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On`.

Reviewers: GorNishanov, lewissbaker, chandlerc, junparser

Subscribers: EricWF, cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71903
2020-02-19 01:03:28 -05:00
Djordje Todorovic 2bf44d11cb Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGa82d3e8a6e67.
2020-02-18 16:38:11 +01:00
Djordje Todorovic a82d3e8a6e Reland "[DebugInfo] Enable the debug entry values feature by default"
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-18 14:41:08 +01:00
Simon Tatham c32af4447f [ARM,MVE] Add the vmovnbq,vmovntq intrinsic family.
Summary:
These are in some sense the inverse of vmovl[bt]q: they take a vector
of n wide elements and truncate each to half its width. So they only
write half a vector's worth of output data, and therefore they also
take an 'inactive' parameter to provide the other half of the data in
the output vector. So vmovnb overwrites the even lanes of 'inactive'
with the narrowed values from the main input, and vmovnt overwrites
the odd lanes.

LLVM had existing codegen which generates these MVE instructions in
response to IR that takes two vectors of wide elements, or two vectors
of narrow ones. But in this case, we have one vector of each. So my
clang codegen strategy is to narrow the input vector of wide elements
by simply reinterpreting it as the output type, and then we have two
narrow vectors and can represent the operation as a vector shuffle
that interleaves lanes from both of them.

Even so, not all the cases I needed ended up being selected as a
single MVE instruction, so I've added a couple more patterns that spot
combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be
generated as a VMOVN instruction with operands swapped.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74337
2020-02-18 09:34:50 +00:00
Simon Tatham 5e97940cd2 [ARM,MVE] Add the vmovlbq,vmovltq intrinsic family.
Summary:
These intrinsics take a vector of 2n elements, and return a vector of
n wider elements obtained by sign- or zero-extending every other
element of the input vector. They're represented in IR as a
shufflevector that extracts the odd or even elements of the input,
followed by a sext or zext.

Existing LLVM codegen already matches this pattern and generates the
VMOVLB instruction (which widens the even-index input lanes). But no
existing isel rule was generating VMOVLT, so I've added some. However,
the new rules currently only work in little-endian MVE, because the
pattern they expect from isel lowering includes a bitconvert which
doesn't have the right semantics in big-endian.

The output of one existing codegen test is improved by those new
rules.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74336
2020-02-18 09:34:50 +00:00
Simon Tatham b6236e9479 [ARM,MVE] Add the vrev16q, vrev32q, vrev64q family.
Summary:
These intrinsics just reorder the lanes of a vector, so the natural IR
representation is as a shufflevector operation. Existing LLVM codegen
already recognizes those particular shufflevectors and generates the
MVE VREV instruction.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74334
2020-02-18 09:34:50 +00:00
Simon Tatham 90dc78bc62 [ARM,MVE] Add intrinsics for abs, neg and not operations.
Summary:
This commit adds the unpredicated intrinsics for the unary operations
vabsq (absolute value), vnegq (arithmetic negation), vmvnq (bitwise
complement), vqabsq and vqnegq (saturating versions of abs and neg for
signed integers, in the sense that they give INT_MAX if an input lane
is INT_MIN).

This is done entirely in clang: all of these operations have existing
isel patterns and existing tests for them on the LLVM side, so I've
just made clang emit the same IR that those patterns already match.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74331
2020-02-18 09:34:50 +00:00
Jim Lin 466f8843f5 [NFC] Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}
2020-02-18 10:49:13 +08:00
Nicolai Hähnle bf197304a6 CGBuiltin: Remove uses of deprecated CreateCall overloads
Reviewers: t.p.northover

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74673
2020-02-18 00:24:09 +01:00
Nikita Popov 3eaa53e805 Reapply "[IRBuilder] Virtualize IRBuilder"
Relative to the original commit, this fixes some warnings,
and is based on the deletion of the IRBuilder copy constructor
in D74693. The automatic copy constructor would no longer be
safe.

-----

Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-17 19:04:11 +01:00
Benjamin Kramer 5fc5c7db38 Strength reduce vectors into arrays. NFCI. 2020-02-17 15:37:35 +01:00
Yaxun (Sam) Liu fb44b9db95 [OpenCL][CUDA][HIP][SYCL] Add norecurse
norecurse function attr indicates the function is not called recursively
directly or indirectly.

Add norecurse to OpenCL functions, SYCL functions in device compilation
and CUDA/HIP kernels.

Although there is LLVM pass adding norecurse to functions, it only works
for whole-program compilation. Also FE adding norecurse can make that
pass run faster since functions with norecurse do not need to be checked
again.

Differential Revision: https://reviews.llvm.org/D73651
2020-02-16 20:41:00 -05:00
Nikita Popov 7c362b25d7 [IRBuilder] Fix unnecessary IRBuilder copies; NFC
Fix a few cases where an IRBuilder is passed to a helper function
by value, while a by reference pass was intended.
2020-02-16 17:57:18 +01:00
Nikita Popov af480e8c63 Revert "[IRBuilder] Virtualize IRBuilder"
This reverts commit 0765d3824d.
This reverts commit 1b04866a3d.

Relevant looking crashes observed on:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
2020-02-16 17:01:10 +01:00
Nikita Popov 1b04866a3d [IRBuilder] Try to fix warnings
Try to fix -Wnon-virtual-dtor warnings that cause build failure
on clang-pcc64le-rhel.
2020-02-16 15:32:11 +01:00
Nikita Popov 0765d3824d [IRBuilder] Virtualize IRBuilder
Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-16 13:48:55 +01:00
Johannes Doerfert b86bf83c28 [FIX] Remove pointer in attribute to eliminate leaks (see D71830) 2020-02-15 18:09:54 -06:00
Fady Ghanim 7438059a90 [OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.

Also this patch modifies clang to use the new directives when  `-fopenmp-enable-irbuilder` commandline option is passed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72304
2020-02-15 01:15:45 -06:00
Johannes Doerfert 1228d42dda [OpenMP][Part 2] Use reusable OpenMP context/traits handling
This patch implements an almost complete handling of OpenMP
contexts/traits such that we can reuse most of the logic in Flang
through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP.

All but construct SIMD specifiers, e.g., inbranch, and the device ISA
selector are define in `llvm/lib/Frontend/OpenMP/OMPKinds.def`. From
these definitions we generate the enum classes `TraitSet`,
`TraitSelector`, and `TraitProperty` as well as conversion and helper
functions in `llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}`.

The above enum classes are used in the parser, sema, and the AST
attribute. The latter is not a collection of multiple primitive variant
arguments that contain encodings via numbers and strings but instead a
tree that mirrors the `match` clause (see `struct OpenMPTraitInfo`).

The changes to the parser make it more forgiving when wrong syntax is
read and they also resulted in more specialized diagnostics. The tests
are updated and the core issues are detected as before. Here and
elsewhere this patch tries to be generic, thus we do not distinguish
what selector set, selector, or property is parsed except if they do
behave exceptionally, as for example `user={condition(EXPR)}` does.

The sema logic changed in two ways: First, the OMPDeclareVariantAttr
representation changed, as mentioned above, and the sema was adjusted to
work with the new `OpenMPTraitInfo`. Second, the matching and scoring
logic moved into `OMPContext.{h,cpp}`. It is implemented on a flat
representation of the `match` clause that is not tied to clang.
`OpenMPTraitInfo` provides a method to generate this flat structure (see
`struct VariantMatchInfo`) by computing integer score values and boolean
user conditions from the `clang::Expr` we keep for them.

The OpenMP context is now an explicit object (see `struct OMPContext`).
This is in anticipation of construct traits that need to be tracked. The
OpenMP context, as well as the `VariantMatchInfo`, are basically made up
of a set of active or respectively required traits, e.g., 'host', and an
ordered container of constructs which allows duplication. Matching and
scoring is kept as generic as possible to allow easy extension in the
future.

---

Test changes:

The messages checked in `OpenMP/declare_variant_messages.{c,cpp}` have
been auto generated to match the new warnings and notes of the parser.
The "subset" checks were reversed causing the wrong version to be
picked. The tests have been adjusted to correct this.
We do not print scores if the user did not provide one.
We print spaces to make lists in the `match` clause more legible.

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim

Subscribers: merge_guards_bot, rampitec, mgorny, hiraditya, aheejin, fedor.sergeev, simoncook, bollu, guansong, dexonsmith, jfb, s.egerton, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71830
2020-02-14 16:37:42 -06:00
Roger Ferrer Ibanez 2bef1c0e56 [OpenMP] Lower taskyield using OpenMP IR Builder
This is similar to D69828.

Special codegen for enclosing untied tasks is still done in clang.

Differential Revision: https://reviews.llvm.org/D70799
2020-02-14 11:35:17 +00:00
Roger Ferrer Ibanez a82f35e176 [OpenMP] Lower taskwait using OpenMP IR Builder
The code generation is exactly the same as it was.

But not that the special handling of untied tasks is still handled by
emitUntiedSwitch in clang.

Differential Revision: https://reviews.llvm.org/D69828
2020-02-14 09:53:02 +00:00
Fangrui Song 1d49eb00d9 [AsmPrinter] De-capitalize all AsmPrinter::Emit* but EmitInstruction
Similar to rL328848.
2020-02-13 17:06:24 -08:00
Alexey Bataev e0ca4792fa [OPENMP50]Add cancellation support in taskloop-based directives.
According to OpenMP 5.0, cancel and cancellation point constructs are
supported in taskloop directive. Added support for cancellation in
taskloop, master taskloop and parallel master taskloop.
2020-02-13 12:03:43 -05:00
Alexey Bataev 18789bfe3a [OPENMP50]Fix handling of clauses in parallel master taskloop directive.
We need to capture correctly the value of num_tasks clause and should
not try to emit the if clause at all in the task region.
2020-02-13 11:00:01 -05:00
Johannes Doerfert 70cac41a2b Reapply "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
Reapply 8a56d64d76 with minor fixes.

The problem was that cancellation can cause new edges to the parallel
region exit block which is not outlined. The CodeExtractor will encode
the information which "exit" was taken as a return value. The fix is to
ensure we do not return any value from the outlined function, to prevent
control to value conversion we ensure a single exit block for the
outlined region.

This reverts commit 3aac953afa.
2020-02-12 22:29:07 -06:00
Johannes Doerfert 3aac953afa Revert "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
This reverts commit 8a56d64d76.

Will be recommitted once the clang test problem is addressed.
2020-02-12 18:50:43 -06:00
Johannes Doerfert 8a56d64d76 [OpenMP][IRBuilder] Perform finalization (incl. outlining) late
In order to fix PR44560 and to prepare for loop transformations we now
finalize a function late, which will also do the outlining late. The
logic is as before but the actual outlining step happens now after the
function was fully constructed. Once we have loop transformations we
can apply them in the finalize step before the outlining.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74372
2020-02-12 17:55:01 -06:00
Erik Pilkington e26c24b849 Revert "[IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas"
This reverts commit fafc6e4fdf.

Should fix ppc stage2 failure: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/23546

Conflicts:
	clang/lib/CodeGen/CGCall.cpp
2020-02-12 12:26:46 -08:00
Michael Liao f6a3ac150b Fix `-Wunused-variable` warning. NFC. 2020-02-12 12:45:14 -05:00
Djordje Todorovic 97ed706a96 Revert "[DebugInfo] Enable the debug entry values feature by default"
This reverts commit rG9f6ff07f8a39.

Found a test failure on clang-with-thin-lto-ubuntu buildbot.
2020-02-12 11:59:04 +01:00
jasonliu 55e2678fcd [clang] Add -fignore-exceptions
Summary:

This is trying to implement the functionality proposed in:
http://lists.llvm.org/pipermail/cfe-dev/2017-April/053417.html
An exception can throw, but no cleanup is going to happen.
A module compiled with exceptions on, can catch the exception throws
from module compiled with -fignore-exceptions.

The use cases for enabling this option are:
1. Performance analysis of EH instrumentation overhead
2. The ability to QA non EH functionality when EH functionality is not available.
3. User of EH enabled headers knows the calls won't throw in their program and
   wants the performance gain from ignoring EH construct.

The implementation tried to accomplish that by removing any landing pad code
 that might get generated.

Reviewed by: aaron.ballman

Differential Revision: https://reviews.llvm.org/D72644
2020-02-12 09:56:18 +00:00
Djordje Todorovic 9f6ff07f8a [DebugInfo] Enable the debug entry values feature by default
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-12 10:25:14 +01:00
Reid Kleckner 2c6a3896ab Re-land "[MS] Overhaul how clang passes overaligned args on x86_32"
This brings back 2af74e27ed and reverts
eaabaf7e04.

The changes were correct, the code that was broken contained an ODR
violation that assumed that these types are passed equivalently:
  struct alignas(uint64_t) Wrapper { uint64_t P };
  void f(uint64_t p);
  void f(Wrapper p);

MSVC does not pass them the same way, and so clang-cl should not pass
them the same way either.
2020-02-11 16:49:28 -08:00
Ian Levesque 14f870366a [xray][clang] Always add xray-skip-entry/exit and xray-ignore-loops attrs
The function attributes xray-skip-entry, xray-skip-exit, and
xray-ignore-loops were only being applied if a function had an
xray-instrument attribute, but they should apply if xray is enabled
globally too.

Differential Revision: https://reviews.llvm.org/D73842
2020-02-11 14:00:41 -08:00
Alexey Bataev 2d4f80f78a [OPENMP50]Full handling of atomic_default_mem_order in requires
directive.

According to OpenMP 5.0, The atomic_default_mem_order clause specifies the default memory ordering behavior for atomic constructs that must be provided by an implementation. If the default memory ordering is specified as seq_cst, all atomic constructs on which memory-order-clause is not specified behave as if the seq_cst clause appears. If the default memory ordering is specified as relaxed, all atomic constructs on which memory-order-clause is not specified behave as if the relaxed clause appears.
If the default memory ordering is specified as acq_rel, atomic constructs on which memory-order-clause is not specified behave as if the release clause appears if the atomic write or atomic update operation is specified, as if the acquire clause appears if the atomic read operation is specified, and as if the acq_rel clause appears if the atomic captured update operation is specified.
2020-02-11 15:42:34 -05:00
Krzysztof Parzyszek 57148e0379 [Hexagon] Fix ABI info for returning HVX vectors 2020-02-11 12:38:54 -06:00
Justin Lebar 027eb71696 Use std::foo_t rather than std::foo in clang.
Summary: No functional change.

Reviewers: bkramer, MaskRay, martong, shafik

Subscribers: martong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74414
2020-02-11 10:37:08 -08:00
Alexey Bataev 9a8defcc34 [OPENMP50]Add support for relaxed clause in atomic directive.
Added full support for relaxed clause.
2020-02-11 11:54:46 -05:00
Vedant Kumar 8b81ebfe7e [ubsan] Null-check and adjust TypeLoc before using it
Null-check and adjut a TypeLoc before casting it to a FunctionTypeLoc.
This fixes a crash in -fsanitize=nullability-return, and also makes the
location of the nonnull type available when the return type is adjusted.

rdar://59263039

Differential Revision: https://reviews.llvm.org/D74355
2020-02-10 14:10:06 -08:00
Alexey Bataev 9559834a5c [OPENMP50]Add support for 'release' clause.
Added full support for 'release' clause in flush|atomic directives.
2020-02-10 16:01:41 -05:00
Alexey Bataev 04a830f80a [OPENMP50]Support for acquire clause.
Added full support for acquire clause in flush|atomic directives.
2020-02-10 14:51:46 -05:00
Kadir Cetinkaya 5731b6672d
Revert "[OpenMP] Fix unused variable"
This breaks under asan, see http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/38597/steps/check-clang%20asan/logs/stdio

This reverts commit bb50454295.

Revert "[FIX] Ordering problem accidentally introduced with D72304"

This reverts commit 08c0a06d8f.

Revert "[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder."

This reverts commit e8a436c5ea.
2020-02-10 16:34:59 +01:00
Michael Liao a067891389 [clang][codegen] Fix another lifetime emission on alloca on non-default address space.
- Lifetime intrinsics expect the pointer directly from alloca. Need
  extra handling for targets with alloca on non-default (or non-zero)
  address space.
2020-02-10 00:15:56 -05:00
serge_sans_paille e67cbac812 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 10:42:45 +01:00
serge-sans-paille 4546211600 Revert "Support -fstack-clash-protection for x86"
This reverts commit 0fd51a4554.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354
2020-02-09 10:06:31 +01:00
serge_sans_paille 0fd51a4554 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 09:35:42 +01:00
fady e8a436c5ea [OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.

Also this patch modifies clang to use the new directives when  `-fopenmp-enable-irbuilder` commandline option is passed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72304
2020-02-08 18:55:48 -06:00
serge-sans-paille 658495e6ec Revert "Support -fstack-clash-protection for x86"
This reverts commit e229017732.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
2020-02-08 14:26:22 +01:00
serge_sans_paille e229017732 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with better option
handling and more portable testing

Differential Revision: https://reviews.llvm.org/D68720
2020-02-08 13:31:52 +01:00
Guillaume Chatelet d65bbf81f8 [clang] Add support for __builtin_memcpy_inline
Summary: This is a follow up on D61634 and the last step to implement http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html

Reviewers: efriedma, courbet, tejohnson

Subscribers: hiraditya, cfe-commits, llvm-commits, jdoerfert, t.p.northover

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73543
2020-02-07 23:55:26 +01:00
Erik Pilkington fafc6e4fdf [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas
These temporaries are only used in the callee, and their memory can be reused
after the call is complete.

rdar://58552124

Differential revision: https://reviews.llvm.org/D74094
2020-02-07 14:39:31 -08:00
Alexey Bataev e8e05de08b [OPENMP50]Add codegen for acq_rel clause in atomic|flush directives.
Added codegen support for atomic|flush directives with acq_rel clause.
2020-02-07 15:05:09 -05:00
Nico Weber b03c3d8c62 Revert "Support -fstack-clash-protection for x86"
This reverts commit 4a1a0690ad.
Breaks tests on mac and win, see https://reviews.llvm.org/D68720
2020-02-07 14:49:38 -05:00
serge_sans_paille 4a1a0690ad Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with correct option
flags set.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 19:54:39 +01:00
Alexey Bataev ea9166b5a8 [OPENMP50]Add parsing/sema for acq_rel clause.
Added basic support (representation + parsing/sema/(de)serialization)
for acq_rel clause in flush/atomic directives.
2020-02-07 09:21:10 -05:00
serge-sans-paille f6d98429fc Revert "Support -fstack-clash-protection for x86"
This reverts commit 39f50da2a3.

The -fstack-clash-protection is being passed to the linker too, which
is not intended.

Reverting and fixing that in a later commit.
2020-02-07 11:36:53 +01:00
Diogo Sampaio 9d869180c4 [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields
Summary:
Following the AAPCS, every store to a volatile bit-field requires to generate one load of that field, even if all the bits are going to be replaced.
This patch allows the user to opt-in in following such rule, whenever the a.

AAPCS Release 2019Q1.1 (https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, paragraph: Volatile bit-fields – preserving number and width of container accesses

```
When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.

```

Reviewers: lebedev.ri, ostannard, jfb, eli.friedman

Reviewed By: jfb

Subscribers: rsmith, rjmccall, dexonsmith, kristof.beyls, jfb, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67399
2020-02-07 10:11:54 +00:00
serge_sans_paille 39f50da2a3 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 10:56:15 +01:00
Craig Topper 96400ae2a4 Recommit "[FPEnv][X86] Platform-specific builtin constrained FP enablement"
With REQUIRES: x86-register-target added to the tests.

Also remove some unneeded FIXMEs

But add a FIXME for bad IR generation for FMADDSUB/FMSUBADD with
constrained FP.

Original patch by Kevin P. Neal
2020-02-06 16:54:35 -08:00
Richard Smith 96c899449b C++ DR2026: static storage duration variables are not zeroed before
constant initialization.

Removing this zeroing regressed our code generation in a few cases, also
fixed here. We now compute whether a variable has constant destruction
even if it doesn't have a constant initializer, by trying to destroy a
default-initialized value, and skip emitting a trivial default
constructor for a variable even if it has non-trivial (but perhaps
constant) destruction.
2020-02-06 16:37:22 -08:00
Kevin P. Neal ad0e03fd4c Revert "[FPEnv][X86] Platform-specific builtin constrained FP enablement"
This reverts commit 208470dd5d.

Tests fail:
error: unable to create target: 'No available targets are compatible with triple "x86_64-apple-darwin"'

This happens on clang-hexagon-elf, clang-cmake-armv7-quick, and
clang-cmake-armv7-quick bots.

If anyone has any suggestions on why then I'm all ears.

Differential Revision: https://reviews.llvm.org/D73570

Revert "[FPEnv][X86] Speculative fix for failures introduced by eda495426."

This reverts commit 80e17e5fcc.

The speculative fix didn't solve the test failures on Hexagon, ARMv6, and
MSVC AArch64.
2020-02-06 19:17:14 -05:00
Kevin P. Neal 208470dd5d [FPEnv][X86] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the X86-specific builtins don't
use constrained intrinsics in some cases. Fix that.

Differential Revision: https://reviews.llvm.org/D73570
2020-02-06 14:20:44 -05:00
Vedant Kumar 65f0785fff [ubsan] Omit return value check when return block is unreachable
If the return block is unreachable, clang removes it in
CodeGenFunction::FinishFunction(). This removal can leave dangling
references to values defined in the return block if the return block has
successors, which it /would/ if UBSan's return value check is emitted.

In this case, as the UBSan check wouldn't be reachable, it's better to
simply not emit it.

rdar://59196131
2020-02-06 10:24:03 -08:00
Michael Liao 318d0ede57 Fix warning on unused variables. NFC. 2020-02-06 12:21:20 -05:00
shafik 428583dd22 [DebugInfo] Fix debug-info generation for block invocations so that we set the LinkageName
Currently when generating debug-info for a BlockDecl we are setting the Name to the mangled name and not setting the LinkageName.
This means we see the mangled name for block invcations ends up in DW_AT_Name and not in DW_AT_linkage_name.

This patch fixes this case so that we also set the LinkageName as well.

Differential Revision: https://reviews.llvm.org/D73282
2020-02-05 11:07:30 -08:00
Thomas Lively 8c3e6af71b [WebAssembly] Add experimental multivalue calling ABI
Summary:
For now, this ABI simply expands all possible aggregate arguments and
returns all possible aggregates directly. This ABI will change rapidly
as we prototype and benchmark a new ABI that takes advantage of
multivalue return and possibly other changes from the MVP ABI.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72972
2020-02-04 21:09:49 -08:00
Francis Visoiu Mistrih 7531a5039f [Remarks] Extend the RemarkStreamer to support other emitters
This extends the RemarkStreamer to allow for other emitters (e.g.
frontends, SIL, etc.) to emit remarks through a common interface.

See changes in llvm/docs/Remarks.rst for motivation and design choices.

Differential Revision: https://reviews.llvm.org/D73676
2020-02-04 17:16:02 -08:00
Kiran Chandramohan a969e051a5 [OpenMP] Add Flush directive to OpenMPIRBuilder
Add support for Flush in the OMPIRBuilder. This patch also adds changes
to clang to use the OMPIRBuilder when '-fopenmp-enable-irbuilder'
commandline option is used.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70712
2020-02-04 22:48:02 +00:00
Matt Arsenault a3c814d234 Separately track input and output denormal mode
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
2020-02-04 12:59:21 -05:00
Yonghong Song 9271cab270 [BPF] use base lvalue type for preserve_{struct,union}_access_index metadata
Linux commit
  1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
      apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
  #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
  typedef struct {
    int a;
  } __t;
  #pragma clang attribute pop
  int test(__t *arg) { return arg->a; }

The current approach to use struct type in the relocation record will
result in an anonymous struct, which make later type matching difficult
    in bpf loader. In fact, current BPF backend will fail the above program
with assertion:
  clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
     Assertion `TypeName.size()' failed.

The patch use the base lvalue type for the "base" value to annotate
preservee_{struct,union}_access_index intrinsics. In the above example,
the type will be "__t" which preserved the type name.

Differential Revision: https://reviews.llvm.org/D73900
2020-02-04 09:28:30 -08:00
Jonas Paulsson 563e84790f [SystemZ] Support -msoft-float
This is needed when building the Linux kernel.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D72189
2020-02-04 10:32:45 -05:00
Fangrui Song dbc96b518b Revert "[CodeGenModule] Assume dso_local for -fpic -fno-semantic-interposition"
This reverts commit 789a46f2d7.

Accidentally committed.
2020-02-03 10:09:39 -08:00
Fangrui Song 789a46f2d7 [CodeGenModule] Assume dso_local for -fpic -fno-semantic-interposition
Summary:
Clang -fpic defaults to -fno-semantic-interposition (GCC -fpic defaults
to -fsemantic-interposition).
Users need to specify -fsemantic-interposition to get semantic
interposition behavior.

Semantic interposition is currently a best-effort feature. There may
still be some cases where it is not handled well.

Reviewers: peter.smith, rnk, serge-sans-paille, sfertile, jfb, jdoerfert

Subscribers: dschuff, jyknight, dylanmckay, nemanjai, jvesely, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, arphaman, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73865
2020-02-03 09:52:48 -08:00
Alexey Bataev a781521867 [OPENMP50]Codegen support for order(concurrent) clause.
Emit llvm parallel access metadata for the loops if they are marked as
order(concurrent).
2020-02-03 12:27:33 -05:00
Alexey Bataev cb8e69148d [OPENMP50]Basic parsing/sema analysis for order(concurrent) clause.
Added parsing/sema/serialization support for order(concurrent) clause in
loop|simd-based directives.
2020-02-03 10:31:02 -05:00
Johannes Doerfert 9dcfc7cd64 Revert "[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder."
This reverts commit 1ca740387b.

The bots break [0], investigation is needed.

[0] http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/22899
2020-02-03 08:59:14 -06:00
Fady Ghanim 1ca740387b [OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.

Also this patch modifies clang to use the new directives when  `-fopenmp-enable-irbuilder` commandline option is passed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72304
2020-02-03 08:44:23 -06:00
Simon Tatham 961530fdc9 [ARM,MVE] Fix vreinterpretq in big-endian mode.
Summary:
In big-endian MVE, the simple vector load/store instructions (i.e.
both contiguous and non-widening) don't all store the bytes of a
register to memory in the same order: it matters whether you did a
VSTRB.8, VSTRH.16 or VSTRW.32. Put another way, the in-register
formats of different vector types relate to each other in a different
way from the in-memory formats.

So, if you want to 'bitcast' or 'reinterpret' one vector type as
another, you have to carefully specify which you mean: did you want to
reinterpret the //register// format of one type as that of the other,
or the //memory// format?

The ACLE `vreinterpretq` intrinsics are specified to reinterpret the
register format. But I had implemented them as LLVM IR bitcast, which
is specified for all types as a reinterpretation of the memory format.
So a `vreinterpretq` intrinsic, applied to values already in registers,
would code-generate incorrectly if compiled big-endian: instead of
emitting no code, it would emit a `vrev`.

To fix this, I've introduced a new IR intrinsic to perform a
register-format reinterpretation: `@llvm.arm.mve.vreinterpretq`. It's
implemented by a trivial isel pattern that expects the input in an
MQPR register, and just returns it unchanged.

In the clang codegen, I only emit this new intrinsic where it's
actually needed: I prefer a bitcast wherever it will have the right
effect, because LLVM understands bitcasts better. So we still generate
bitcasts in little-endian mode, and even in big-endian when you're
casting between two vector types with the same lane size.

For testing, I've moved all the codegen tests of vreinterpretq out
into their own file, so that they can have a different set of RUN
lines to check both big- and little-endian.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73786
2020-02-03 11:20:06 +00:00
Richard Smith 0130b6cb5a Don't assume a reference refers to at least sizeof(T) bytes.
When T is a class type, only nvsize(T) bytes need be accessible through
the reference. We had matching bugs in the application of the
dereferenceable attribute and in -fsanitize=undefined.
2020-01-31 19:08:17 -08:00
serge-sans-paille fd09f12f32 Implement -fsemantic-interposition
First attempt at implementing -fsemantic-interposition.

Rely on GlobalValue::isInterposable that already captures most of the expected
behavior.

Rely on a ModuleFlag to state whether we should respect SemanticInterposition or
not. The default remains no.

So this should be a no-op if -fsemantic-interposition isn't used, and if it is,
isInterposable being already used in most optimisation, they should honor it
properly.

Note that it only impacts architecture compiled with -fPIC and no pie.

Differential Revision: https://reviews.llvm.org/D72829
2020-01-31 14:02:33 +01:00
Pierre Habouzit 6eb969b7c5 [objc_direct] fix codegen for mismatched Decl/Impl return types
For non direct methods, the codegen uses the type of the Implementation.
Because Objective-C rules allow some differences between the Declaration
and Implementation return types, when the Implementation is in this
translation unit, the type of the Implementation should be preferred to
emit the Function over the Declaration.

Radar-Id: rdar://problem/58797748
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Differential Revision: https://reviews.llvm.org/D73208
2020-01-30 18:17:45 -08:00
Reid Kleckner 01943a59f5 Move verification of Sema::MaximumAlignment to a .cpp file
Saves these transitive includes:
2020-01-30 13:37:52 -08:00
Alexey Bataev 4697874c28 [OPENMP50]Handle lastprivate conditionals passed as shared in inner
regions.

If the lastprivate conditional is passed as shared in inner region, we
shall check if it was ever changed and use this updated value after exit
from the inner region as an update value.
2020-01-30 11:35:23 -05:00
Jonas Devlieghere 509e21a1b9 [clang] Replace SmallStr.str().str() with std::string conversion operator.
Use the std::string conversion operator introduced in
d7049213d0.
2020-01-29 21:27:46 -08:00
Craig Topper a10cec02f7 [X86] Improve X86 cmpps/cmppd/cmpss/cmpsd intrinsics with strictfp
The constrained fcmp intrinsics don't allow the TRUE/FALSE predicates.
Using them will assert. To workaround this I'm emitting the old X86 specific intrinsics that were never removed from the backend when we switched to using fcmp in IR. We have no way to mark them as being strict, but that's true of all target specific intrinsics so doesn't seem like we need to solve that here.

I've also added support for selecting between signaling and quiet.

Still need to support SAE which will require using a target specific
intrinsic. Also need to fix masking to not use an AND instruction
after the compare.

Differential Revision: https://reviews.llvm.org/D72906
2020-01-29 15:52:11 -08:00
Sanne Wouda 2939fc13c8 [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q)
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:

   %shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
   %vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)

When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.

This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure and
trade-off materialising several constants for a single vector load from the
constant pool, like so:

   int16x8_t v = {2,3,4,5,6,7,8,9};
   a = vqdmulh_laneq_s16(a, v, 0);
   b = vqdmulh_laneq_s16(b, v, 1);
   c = vqdmulh_laneq_s16(c, v, 2);
   d = vqdmulh_laneq_s16(d, v, 3);
   [...]

In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.

We could teach the compiler to recover the lane variants, but this would likely
require its own pass.  (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)

This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optmization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.

These 'lane' variants need an additional register class.  The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.

Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.

This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).

Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71469
2020-01-29 13:25:23 +00:00
Benjamin Kramer bd31243a34 Fix more implicit conversions. Getting closer to having clang working with gcc 5 again 2020-01-29 02:57:59 +01:00
Francis Visoiu Mistrih b1a8189d7d [NFC] Fix comment typo 2020-01-28 15:23:28 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Francis Visoiu Mistrih 4e799ada58 [CodeGen] Attach no-builtin attributes to function definitions with no Decl
When using -fno-builtin[-<name>], we don't attach the IR attributes to
function definitions with no Decl, like the ones created through
`CreateGlobalInitOrDestructFunction`.

This results in projects using -fno-builtin or -ffreestanding to start
seeing symbols like _memset_pattern16.

The fix changes the behavior to always add the attribute if LangOptions
requests it.

Differential Revision: https://reviews.llvm.org/D73495
2020-01-28 13:59:08 -08:00
Hans Wennborg eaabaf7e04 Revert "[MS] Overhaul how clang passes overaligned args on x86_32"
It broke some Chromium tests, so let's revert until it can be fixed; see
https://crbug.com/1046362

This reverts commit 2af74e27ed.
2020-01-28 22:25:07 +01:00
Aaron Ballman 5547919280 Fix a crash when casting _Complex and ignoring the results.
Performing a cast where the result is ignored caused Clang to crash when
performing codegen for the conversion:

  _Complex int a;
  void fn1() { (_Complex double) a; }

This patch addresses the crash by not trying to emit the scalar conversions,
causing it to be a noop. Fixes PR44624.
2020-01-28 13:05:56 -05:00
Alexey Bataev f117f2cc78 [OPENMP50]Check for lastprivate conditional updates in atomic
constructs.

Added analysis in atomic constrcuts to support checks for updates of
conditional lastprivate variables.
2020-01-28 11:40:31 -05:00
Wang, Pengfei 3239b5034e [FPEnv] Add pragma FP_CONTRACT support under strict FP.
Summary: Support pragma FP_CONTRACT under strict FP.

Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3

Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72820
2020-01-28 20:43:43 +08:00
Alexey Bataev e6d2583e45 [OPENMP50]Track changes of lastprivate conditional in parallel-based
regions with reductions, lastprivates or linears clauses.

If the lastprivate conditional variable is updated in inner parallel
region with reduction, lastprivate or linear clause, the value must be
considred as a candidate for lastprivate conditional. Also, tracking in
inner parallel regions is not required.
2020-01-27 14:53:25 -05:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Teresa Johnson af954e441a [WPD] Emit vcall_visibility metadata for MicrosoftCXXABI
Summary:
The MicrosoftCXXABI uses a separate mechanism for emitting vtable
type metadata, and thus didn't pick up the change from D71907
to emit the vcall_visibility metadata under -fwhole-program-vtables.

I believe this is the cause of a Windows bot failure when I committed
follow on change D71913 that required a revert. The failure occurred
in a CFI test that was expecting to not abort because it expected a
devirtualization to occur, and without the necessary vcall_visibility
metadata we would not get devirtualization.

Note in the equivalent code in CodeGenModule::EmitVTableTypeMetadata
(used by the ItaniumCXXABI), we also emit the vcall_visibility metadata
when Virtual Function Elimination is enabled. Since I am not as familiar
with the details of that optimization, I have marked that as a TODO and
am only inserting under -fwhole-program-vtables.

Reviewers: evgeny777

Subscribers: Prazek, ostannard, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73418
2020-01-27 06:22:24 -08:00
Guillaume Chatelet 07c9d53266 [Alignment][NFC] Use Align with CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73449
2020-01-27 10:58:36 +01:00
Reid Kleckner 8a81daaa8b [AST] Split parent map traversal logic into ParentMapContext.h
The only part of ASTContext.h that requires most AST types to be
complete is the parent map. Nothing in Clang proper uses the ParentMap,
so split it out into its own class. Make ASTContext own the
ParentMapContext so there is still a one-to-one relationship.

After this change, 562 fewer files depend on ASTTypeTraits.h, and 66
fewer depend on TypeLoc.h:
  $ diff -u deps-before.txt deps-after.txt | \
    grep '^[-+] ' | sort | uniq -c | sort -nr | less
      562 -    ../clang/include/clang/AST/ASTTypeTraits.h
      340 +    ../clang/include/clang/AST/ParentMapContext.h
       66 -    ../clang/include/clang/AST/TypeLocNodes.def
       66 -    ../clang/include/clang/AST/TypeLoc.h
       15 -    ../clang/include/clang/AST/TemplateBase.h
  ...
I computed deps-before.txt and deps-after.txt with `ninja -t deps`.

This removes a common and key dependency on TemplateBase.h and
TypeLoc.h.

This also has the effect of breaking the ParentMap RecursiveASTVisitor
instantiation into its own file, which roughly halves the compilation
time of ASTContext.cpp (29.75s -> 17.66s). The new file takes 13.8s to
compile.

I left behind forwarding methods for getParents(), but clients will need
to include a new header to make them work:
  #include "clang/AST/ParentMapContext.h"

I noticed that this parent map functionality is unfortunately duplicated
in ParentMap.h, which only works for Stmt nodes.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D71313
2020-01-24 13:42:28 -08:00
David Zarzycki 0d61cd25a6
Verify that clang's max alignment is <= LLVM's max alignment
Reviewers: lebedev.ri

Reviewed By: lebedev.ri

Subscribers: cfe-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D73363
2020-01-24 12:37:05 -05:00
Guillaume Chatelet 805c157e8a [Alignment][NFC] Deprecate Align::None()
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.

Reviewers: xbolva00, courbet, bollu

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73099
2020-01-24 12:53:58 +01:00
Awanish Pandey c83602fdf5 Recommit "[DWARF5][clang]: Added support for DebugInfo generation for auto return type for C++ member functions."
Summary:
This was reverted in e45fcfc3aa due to
libcxx build failure. This revision addresses that case.

Original commit message:
    This patch will provide support for auto return type for the C++ member
    functions.

    This patch includes clang side implementation of this feature.

    Patch by: Awanish Pandey <Awanish.Pandey@amd.com>

    Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george
    Reviewed by: dblaikie

    Differential Revision: https://reviews.llvm.org/D70524
2020-01-24 14:50:17 +05:30
Pierre Habouzit 52311d0483 [objc_direct] do not add direct properties to the serialization array
If we do, then the property_list_t length is wrong
and class_getProperty gets very sad.

Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Radar-Id: rdar://problem/58804805
Differential Revision: https://reviews.llvm.org/D73219
2020-01-23 22:39:47 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Fangrui Song 69bf40c45f [Driver][CodeGen] Support -fpatchable-function-entry=N,M and __attribute__((patchable_function_entry(N,M))) where M>0
Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D73072
2020-01-23 17:02:54 -08:00
Teresa Johnson 59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Reid Kleckner 2af74e27ed [MS] Overhaul how clang passes overaligned args on x86_32
MSVC 2013 would refuse to pass highly aligned things (typically vectors
and aggregates) by value. Users would receive this error:
  t.cpp(11) : error C2719: 'w': formal parameter with __declspec(align('32')) won't be aligned
  t.cpp(11) : error C2719: 'q': formal parameter with __declspec(align('32')) won't be aligned

However, in MSVC 2015, this behavior was changed, and highly aligned
things are now passed indirectly. To avoid breaking backwards
incompatibility, objects that do not have a *required* high alignment
(i.e. double) are still passed directly, even though they are not
naturally aligned. This change implements the new behavior of passing
things indirectly.

The new behavior is:
- up to three vector parameters can be passed in [XYZ]MM0-2
- remaining arguments with required alignment greater than 4 bytes are
  passed indirectly

Previously, MSVC never passed things truly indirectly, meaning clang
would always apply the byval attribute to indirect arguments. We had to
go to the trouble of adding inalloca so that non-trivially copyable C++
types could be passed in place without copying the object
representation. When inalloca was added, we asserted that all arguments
passed indirectly must use byval. With this change, that assert no
longer holds, and I had to update inalloca to handle that case. The
implicit sret pointer parameter was already handled this way, and this
change generalizes some of that logic to arguments.

There are two cases that this change leaves unfixed:
1. objects that are non-trivially copyable *and* overaligned
2. vectorcall + inalloca + vectors

For case 1, I need to touch C++ ABI code in MicrosoftCXXABI.cpp, so I
want to do it in a follow-up.

For case 2, my fix is one line, but it will require updating IR tests to
use lots of inreg, so I wanted to separate it out.

Related to D71915 and D72110

Fixes most of PR44395

Reviewed By: rjmccall, craig.topper, erichkeane

Differential Revision: https://reviews.llvm.org/D72114
2020-01-23 16:04:00 -08:00
Roman Lebedev 5ffe6408ff
[Codegen] If reasonable, materialize clang's `AllocAlignAttr` as llvm's Alignment Attribute on call-site function return value
Summary:
Much like with the previous patch (D73005) with `AssumeAlignedAttr`
handling, results in mildly more readable IR,
and will improve test coverage in upcoming patch.

Note that in `AllocAlignAttr`'s case, there is no requirement
for that alignment parameter to end up being an I-C-E.

Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith

Reviewed By: erichkeane

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73006
2020-01-23 22:50:49 +03:00
Roman Lebedev e819f7c9fe
[Codegen] If reasonable, materialize clang's `AssumeAlignedAttr` as llvm's Alignment Attribute on call-site function return value
Summary:
This should be mostly NFC - we still lower the same alignment
knowledge to the IR. The main reasoning here is that
this somewhat improves readability of IR like this,
and will improve test coverage in upcoming patch.

Even though the alignment is guaranteed to always be an I-C-E,
we don't always materialize it as llvm's Alignment Attribute because:
1. There may be a non-zero offset
2. We may be sanitizing for alignment

Note that if there already was an IR alignment attribute
on return value, we union them, and thus the alignment
only ever rises.

Also, there is a second relevant clang attribute `AllocAlignAttr`,
so that is why `AbstractAssumeAlignedAttrEmitter` is templated.

Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith

Reviewed By: erichkeane

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73005
2020-01-23 22:50:49 +03:00
Teresa Johnson 458676db6e [WPD/VFE] Always emit vcall_visibility metadata for -fwhole-program-vtables
Summary:
First patch to support Safe Whole Program Devirtualization Enablement,
see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

Always emit !vcall_visibility metadata under -fwhole-program-vtables,
and not just for -fvirtual-function-elimination. The vcall visibility
metadata will (in a subsequent patch) be used to communicate to WPD
which vtables are safe to devirtualize, and we will optionally convert
the metadata to hidden visibility at link time. Subsequent follow on
patches will help enable this by adding vcall_visibility metadata to the
ThinLTO summaries, and always emit type test intrinsics under
-fwhole-program-vtables (and not just for vtables with hidden
visibility).

In order to do this safely with VFE, since for VFE all vtable loads must
be type checked loads which will no longer be the case, this patch adds
a new "Virtual Function Elim" module flag to communicate to GlobalDCE
whether to perform VFE using the vcall_visibility metadata.

One additional advantage of using the vcall_visibility metadata to drive
more WPD at LTO link time is that we can use the same mechanism to
enable more aggressive VFE at LTO link time as well. The link time
option proposed in the RFC will convert vcall_visibility metadata to
hidden (aka linkage unit visibility), which combined with
-fvirtual-function-elimination will allow it to be done more
aggressively at LTO link time under the same conditions.

Reviewers: pcc, ostannard, evgeny777, steven_wu

Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71907
2020-01-23 11:36:01 -08:00
Guillaume Chatelet 59f95222d4 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Guillaume Chatelet 0957233320 [Alignment][NFC] Use Align with CreateMaskedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73106
2020-01-22 11:04:39 +01:00
Roman Lebedev 6b2f820221
[NFC][Codegen] Use MaybeAlign + APInt::getLimitedValue() when creating Alignment attr
Summary: Just an NFC code cleanup i stumbled upon when stumbling through clang alignment attribute handling.

Reviewers: erichkeane, gchatelet, courbet, jdoerfert

Reviewed By: gchatelet

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72993
2020-01-21 21:18:29 +03:00
Roman Lebedev 372cb38f45
[Codegen] Emit both AssumeAlignedAttr and AllocAlignAttr assumptions if they exist
Summary:
We shouldn't be just giving up if we find one of them
(like we currently do with `AssumeAlignedAttr`),
we should emit them all.

As the tests show, even if we materialized good knowledge
from `__attribute__((assume_aligned(32)`, it doesn't mean
`__attribute__((alloc_align([...])))` info won't be useful.
It might be, but that isn't given.

Reviewers: erichkeane, jdoerfert, aaron.ballman

Reviewed By: erichkeane

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72979
2020-01-21 21:18:27 +03:00
Kevin P. Neal 2e667d07c7 [FPEnv][SystemZ] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the SystemZ-specific builtins
don't use constrained intrinsics in some cases. Fix that.

Differential Revision: https://reviews.llvm.org/D72722
2020-01-21 12:44:39 -05:00
Diogo Sampaio 2147703bde Revert "[ARM] Follow AACPS standard for volatile bit-fields access width"
This reverts commit 6a24339a45.
Submitted using ide button by mistake
2020-01-21 15:31:33 +00:00
Diogo Sampaio 6a24339a45 [ARM] Follow AACPS standard for volatile bit-fields access width
Summary:
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the widht of their
declarative type. In such case:
```
struct S1 {
  short a : 1;
}
```
should be accessed using load and stores of the width
(sizeof(short)), where now the compiler does only load
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicted with C and C++ object models by creating
data race conditions that are not part of the bit-field,
e.g.
```
struct S2 {
  short a;
  int  b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2019Q1.1
(https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths
of bit-fields where the container overlaps with a non-bit-field member.
 This is because the C/C++ memory model defines these as being separate
memory locations, which can be accessed by two threads
 simultaneously. For this reason, compilers must be permitted to use a
narrower memory access width (including splitting the access
 into multiple instructions) to avoid writing to a different memory location.
```

I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
 - it won't overlap with any other non-bit-field member
 - we only access memory inside the bounds of the record

Regarding the number of memory accesses, that should be preserved, that will
be implemented by D67399.

Reviewers: rsmith, rjmccall, eli.friedman, ostannard

Subscribers: ostannard, kristof.beyls, cfe-commits, carwil, olista01

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72932
2020-01-21 15:23:38 +00:00
Guillaume Chatelet bc8a1ab26f [Alignment][NFC] Use Align with CreateMaskedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73087
2020-01-21 14:13:22 +01:00
Zakk Chen e15fb06e2d [RISCV] Pass target-abi via module flag metadata
Reviewers: lenary, asb

Reviewed By: lenary

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72755
2020-01-20 23:30:54 -08:00
Saar Raz a0f50d7316 [Concepts] Requires Expressions
Implement support for C++2a requires-expressions.

Re-commit after compilation failure on some platforms due to alignment issues with PointerIntPair.

Differential Revision: https://reviews.llvm.org/D50360
2020-01-19 00:23:26 +02:00
Saar Raz baa84d8cde Revert "[Concepts] Requires Expressions"
This reverts commit 0279318997.

There have been some failing tests on some platforms, reverting while investigating.
2020-01-18 14:58:01 +02:00
Saar Raz 0279318997 [Concepts] Requires Expressions
Implement support for C++2a requires-expressions.

Differential Revision: https://reviews.llvm.org/D50360
2020-01-18 09:15:36 +02:00
Matt Arsenault a4451d88ee Consolidate internal denormal flushing controls
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.

- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute

AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).

Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.

Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.

-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.

Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.

This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.

AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
2020-01-17 20:09:53 -05:00
Ian Levesque 97ba483026 [xray] Allow instrumenting only function entry and/or only function exit
Extend -fxray-instrumentation-bundle to split function-entry and
function-exit into two separate options, so that it is possible to
instrument only function entry or only function exit.  For use cases
that only care about one or the other this will save significant overhead
and code size.

Differential Revision: https://reviews.llvm.org/D72890
2020-01-17 13:32:34 -08:00
Ian Levesque 1d62be2441 [clang][xray] Add -fxray-ignore-loops option
XRay allows tuning by minimum function size, but also always instruments
functions with loops in them. If the minimum function size is set to a
large value the loop instrumention ends up causing most functions to be
instrumented anyway. This adds a new flag, -fxray-ignore-loops, to disable
the loop detection logic.

Differential Revision: https://reviews.llvm.org/D72873
2020-01-17 13:32:24 -08:00
Adrian Prantl 7b30370e5b Move the sysroot attribute from DIModule to DICompileUnit
[this re-applies c0176916a4
 with the correct commit message and phabricator link]

This addresses point 1 of PR44213.
https://bugs.llvm.org/show_bug.cgi?id=44213

The DW_AT_LLVM_sysroot attribute is used for Clang module debug info,
to allow LLDB to import a Clang module from source. Currently it is
part of each DW_TAG_module, however, it is the same for all modules in
a compile unit. It is more efficient and less ambiguous to store it
once in the DW_TAG_compile_unit.

This should have no effect on DWARF consumers other than LLDB.

Differential Revision: https://reviews.llvm.org/D71732
2020-01-17 12:55:40 -08:00
Adrian Prantl c17aee67f1 Revert "Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot"
This reverts commit 12e479475a.

I accidentally landed this patch with the wrong commit message ...
2020-01-17 12:52:36 -08:00
Alexey Bataev c33ba8c158 [OPENMP]Improve debug locations in OpenMP regions.
Emit more precise debug locations for the OpenMP outlined regions.
2020-01-17 14:24:32 -05:00
Sanne Wouda ecfd6d3e84 [clang] Set function attributes on SEH filter functions correctly.
Summary:
When compiling with -munwind-tables, the SEH filter funclet needs the uwtable
function attribute, which gets automatically added if we use
SetInternalFunctionAttributes.  The filter funclet is internal so this seems
appropriate.

Reviewers: rnk

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72786
2020-01-17 18:09:42 +00:00
Adrian Prantl 12e479475a Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot
This is a purely cosmetic change that is NFC in terms of the binary
output. I bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.

This attribute only appears in Clang module debug info.

Differential Revision: https://reviews.llvm.org/D71722
2020-01-17 09:36:48 -08:00
serge-sans-paille d293417931 Add __warn_memset_zero_len builtin as a workaround for glibc issue
Glibc issue: https://sourceware.org/bugzilla/show_bug.cgi?id=25399
The fix consist in considering the missing function as a builtin lowered to a nop.

Differential Revision: https://reviews.llvm.org/D72869
2020-01-17 09:58:32 +01:00
serge-sans-paille d437fba8ef Reapply Allow system header to provide their own implementation of some builtin
This reverts commit 3d210ed3d1.

See https://reviews.llvm.org/D71082 for the patch and discussion that make it
possible to reapply this patch.
2020-01-17 09:58:32 +01:00
Alexey Bataev 25b542c61f [OPENMP]Do not emit RTTI descriptor for NVPTX devices.
Need to disable emission of RTTI descriptors for NVPTX devices to be
able to use dynamic classes without unresolved symbols at link stage.
2020-01-16 18:12:50 -05:00
Alexey Bataev 8b32192948 [OPENMP]Avoid string concat where possible and use standard name
generation function, NFC.
2020-01-16 16:39:45 -05:00
Mircea Trofin 7acfda633f [llvm] Make new pass manager's OptimizationLevel a class
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coallescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.

In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.

This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.

The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72547
2020-01-16 09:00:56 -08:00
Simon Pilgrim 19c5057e8d Fix "pointer is null" static analyzer warnings. NFCI.
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately in all cases and castAs will perform the null assertion for us.
2020-01-16 13:02:40 +00:00
Sameer Sahasrabuddhe ed181efa17 [HIP][AMDGPU] expand printf when compiling HIP to AMDGPU
Summary:
This change implements the expansion in two parts:
- Add a utility function emitAMDGPUPrintfCall() in LLVM.
- Invoke the above function from Clang CodeGen, when processing a HIP
  program for the AMDGPU target.

The printf expansion has undefined behaviour if the format string is
not a compile-time constant. As a sufficient condition, the HIP
ToolChain now emits -Werror=format-nonliteral.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D71365
2020-01-16 15:15:38 +05:30
Amy Huang 3d210ed3d1 Revert "Allow system header to provide their own implementation of some builtin"
This reverts commit 921f871ac4 because it
causes libc++ code to trigger __warn_memset_zero_len.

See https://reviews.llvm.org/D71082.
2020-01-15 15:03:45 -08:00
Pierre Habouzit d18fbfc097 Relax the rules around objc_alloc and objc_alloc_init optimizations.
Today the optimization is limited to:
- `[ClassName alloc]`
- `[self alloc]` when within a class method

However it means that when code is written this way:

```
    @interface MyObject
    - (id)copyWithZone:(NSZone *)zone
    {
        return [[self.class alloc] _initWith...];
    }

    @end
```

... then the optimization doesn't kick in and `+[NSObject alloc]` ends
up in IMP caches where it could have been avoided. It turns out that
`+alloc` -> `+[NSObject alloc]` is the most cached SEL/IMP pair in the
entire platform which is rather silly).

There's two theoretical risks allowing this optimization:

1. if the receiver is nil (which it can't be today), but it turns out
   that `objc_alloc()`/`objc_alloc_init()` cope with a nil receiver,

2. if the `Clas` type for the receiver is a lie. However, for such a
   code to work today (and not fail witn an unrecognized selector
   anyway) you'd have to have implemented the `-alloc` **instance
   method**.

   Fortunately, `objc_alloc()` doesn't assume that the receiver is a
   Class, it basically starts with a test that is similar to

       `if (receiver->isa->bits & hasDefaultAWZ) { /* fastpath */ }`.

   This bit is only set on metaclasses by the runtime, so if an instance
   is passed to this function by accident, its isa will fail this test,
   and `objc_alloc()` will gracefully fallback to `objc_msgSend()`.

   The one thing `objc_alloc()` doesn't support is tagged pointer
   instances. None of the tagged pointer classes implement an instance
   method called `'alloc'` (actually there's a single class in the
   entire Apple codebase that has such a method).

Differential Revision: https://reviews.llvm.org/D71682
Radar-Id: rdar://problem/58058316
Reviewed-By: Akira Hatanaka
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
2020-01-14 19:48:33 -08:00
Reid Kleckner 8e780252a7 [X86] ABI compat bugfix for MSVC vectorcall
Summary:
Before this change, X86_32ABIInfo::classifyArgument would be called
twice on vector arguments to vectorcall functions. This function has
side effects to track GPR register usage, and this would lead to
incorrect GPR usage in some cases.  The specific case I noticed is from
running out of XMM registers with mixed FP and vector arguments and no
aggregates of any kind. Consider this prototype:

  void __vectorcall vectorcall_indirect_vec(
      double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
      __m128 xmm5,
      __m128 ecx,
      int edx,
      __m128 mem);

classifyArgument has no effects when called on a plain FP type, but when
called on a vector type, it modifies FreeRegs to model GPR consumption.
However, this should not happen during the vector call first pass.

I refactored the code to unify vectorcall HVA logic with regcall HVA
logic. The conventions pass HVAs in registers differently (expanded vs.
not expanded), but if they do not fit in registers, they both pass them
indirectly by address.

Reviewers: erichkeane, craig.topper

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72110
2020-01-14 17:49:13 -08:00
Rong Xu 60d3947922 [remark][diagnostics] Using clang diagnostic handler for IR input files
For IR input files, we currently use LLVM diagnostic handler even the
compilation is from clang. As a result, we are not able to use -Rpass
to get the transformation reports. Some warnings are not handled
properly either: We found many mysterious warnings in our ThinLTO backend
compilations in SamplePGO and CSPGO. An example of the warning:
"warning: net/proto2/public/metadata_lite.h:51:21: 0.02% (1 / 4999)"

This turns out to be a warning by Wmisexpect, which is supposed to be
filtered out by default. But since the filter is in clang's
diagnostic hander, we emit these incomplete warnings from LLVM's
diagnostic handler.

This patch uses clang diagnostic handler for IR input files. We create
a fake backendconsumer just to install the diagnostic handler.

With this change, we will have proper handling of all the warnings and we can
use -Rpass* options in IR input files compilation.
Also note that with is patch, LLVM's diagnostic options, like
"-mllvm -pass-remarks=*", are no longer be able to get optimization remarks.

Differential Revision: https://reviews.llvm.org/D72523
2020-01-14 15:44:57 -08:00
Alexey Bataev a48600c0a6 [OPENMP]Do not emit special virtual function for NVPTX target.
There are no special virtual function handlers (like __cxa_pure_virtual)
defined for NVPTX target, so just emit such functions as null pointers
to prevent issues with linking and unresolved references.
2020-01-14 16:59:22 -05:00
Amy Huang 651128f557 [DebugInfo] Add option to clang to limit debug info that is emitted for classes.
Summary:
This patch adds an option to limit debug info by only emitting complete class
type information when its constructor is emitted. This applies to classes
that have nontrivial user defined constructors.

I implemented the option by adding another level to `DebugInfoKind`, and
a flag `-flimit-debug-info-constructor`.

Total object file size on Windows, compiling with RelWithDebInfo:
  before: 4,257,448 kb
  after:  2,104,963 kb

And on Linux
  before: 9,225,140 kb
  after:  4,387,464 kb

According to the Windows clang.pdb files, here is a list of types that are no
longer complete with this option enabled: https://reviews.llvm.org/P8182

Reviewers: rnk, dblaikie

Subscribers: aprantl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72427
2020-01-14 12:40:21 -08:00
Simon Pilgrim 25dc5c7cd1 Fix "pointer is null" static analyzer warnings. NFCI.
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
2020-01-14 14:00:37 +00:00
Benjamin Kramer df186507e1 Make helper functions static or move them into anonymous namespaces. NFC. 2020-01-14 14:06:37 +01:00
James Clarke 3d6c492d7a [RISCV] Fix ILP32D lowering for double+double/double+int return types
Summary:
Previously, since these aggregates are > 2*XLen, Clang would think they
were being returned indirectly and thus would decrease the number of
available GPRs available by 1. For long argument lists this could lead
to a struct argument incorrectly being passed indirectly.

Reviewers: asb, lenary

Reviewed By: asb, lenary

Subscribers: luismarques, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69590
2020-01-14 11:17:19 +00:00
Amy Huang 53539bb032 [DebugInfo] Add another level to DebugInfoKind called Constructor
The option will limit debug info by only emitting complete class
type information when its constructor is emitted.
This patch changes comparisons with LimitedDebugInfo to use the new
level instead.

Differential Revision: https://reviews.llvm.org/D72427
2020-01-13 15:59:03 -08:00
Martin Storsjö 810b28edb3 [ItaniumCXXABI] Make tls wrappers properly comdat
Just marking a symbol as weak_odr/linkonce_odr isn't enough for
actually tolerating multiple copies of it at linking on windows,
it has to be made a proper comdat; make it comdat for all platforms
for consistency.

This should hopefully fix
https://bugzilla.mozilla.org/show_bug.cgi?id=1566288.

Differential Revision: https://reviews.llvm.org/D71572
2020-01-13 23:36:26 +02:00
Erich Keane 349636d2bf Implement VectorType conditional operator GNU extension.
GCC supports the conditional operator on VectorTypes that acts as a
'select' in C++ mode. This patch implements the support. Types are
converted as closely to GCC's behavior as possible, though in a few
places consistency with our existing vector type support was preferred.

Note that this implementation is different from the OpenCL version in a
number of ways, so it unfortunately required a different implementation.

First, the SEMA rules and promotion rules are significantly different.

Secondly, GCC implements COND[i] != 0 ? LHS[i] : RHS[i] (where i is in
the range 0- VectorSize, for each element).  In OpenCL, the condition is
COND[i] < 0 ? LHS[i]: RHS[i].

In the process of implementing this, it was also required to make the
expression COND ? LHS : RHS type dependent if COND is type dependent,
since the type is now dependent on the condition.  For example:

    T ? 1 : 2;

Is not typically type dependent, since the result can be deduced from
the operands.  HOWEVER, if T is a VectorType now, it could change this
to a 'select' (basically a swizzle with a non-constant mask) with the 1
and 2 being promoted to vectors themselves.

While this is a change, it is NOT a standards incompatible change. Based
on my (and D. Gregor's, at the time of writing the code) reading of the
standard, the expression is supposed to be type dependent if ANY
sub-expression is type dependent.

Differential Revision: https://reviews.llvm.org/D71463
2020-01-13 13:27:20 -08:00
KAWASHIMA Takahiro 10c11e4e2d This option allows selecting the TLS size in the local exec TLS model,
which is the default TLS model for non-PIC objects. This allows large/
many thread local variables or a compact/fast code in an executable.

Specification is same as that of GCC. For example, the code model
option precedes the TLS size option.

TLS access models other than local-exec are not changed. It means
supoort of the large code model is only in the local exec TLS model.

Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>)
Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard
Reviewd By: peter.smith
Committed by: peter.smith

Differential Revision: https://reviews.llvm.org/D71688
2020-01-13 10:16:53 +00:00
Sam McCall e45fcfc3aa Revert "[DWARF5][clang]: Added support for DebugInfo generation for auto return type for C++ member functions."
This reverts commit 6d6a4590c5, which
introduces a crash.

See https://reviews.llvm.org/D70524 for details.
2020-01-13 11:13:16 +01:00
Awanish Pandey 6d6a4590c5 [DWARF5][clang]: Added support for DebugInfo generation for auto return type for C++ member functions.
Summary:
This patch will provide support for auto return type for the C++ member
functions.

This patch includes clang side implementation of this feature.

Patch by: Awanish Pandey <Awanish.Pandey@amd.com>

Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george
Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D70524
2020-01-13 12:40:18 +05:30
Simon Pilgrim 93431f96a7 Fix "pointer is null" static analyzer warning. NFCI.
Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately).
2020-01-11 16:02:23 +00:00
Simon Pilgrim 16c53ffcb9 Fix "pointer is null" static analyzer warnings. NFCI.
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
2020-01-11 16:02:23 +00:00
Richard Smith 7a38468e34 Only destroy static locals if they have non-trivial destructors.
This fixes a regression introduced in
2b4fa5348e that caused us to emit
shutdown-time destruction for variables with ARC ownership, using
C++-specific functions that don't exist in C implementations.
2020-01-10 15:18:36 -08:00
Fangrui Song f17ae668a9 [Driver][CodeGen] Add -fpatchable-function-entry=N[,0]
In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.

Reviewed By: ostannard, MaskRay

Differential Revision: https://reviews.llvm.org/D72222
2020-01-10 09:57:39 -08:00
Fangrui Song a44c434b68 Support function attribute patchable_function_entry
This feature is generic. Make it applicable for AArch64 and X86 because
the backend has only implemented NOP insertion for AArch64 and X86.

Reviewed By: nickdesaulniers, aaron.ballman

Differential Revision: https://reviews.llvm.org/D72221
2020-01-10 09:57:34 -08:00
Simon Pilgrim fd8ded99fe Fix "pointer is null" static analyzer warning. NFCI.
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
2020-01-10 17:41:26 +00:00
Andrew Paverd bdd88b7ed3 Add support for __declspec(guard(nocf))
Summary:
Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a
new `"guard_nocf"` function attribute to indicate that checks should not be
added on indirect calls within that function. Add support for
`__declspec(guard(nocf))` following the same syntax as MSVC.

Reviewers: rnk, dmajor, pcc, hans, aaron.ballman

Reviewed By: aaron.ballman

Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72167
2020-01-10 16:04:12 +00:00
Ulrich Weigand 76e9c2a987 [FPEnv] Generate constrained FP comparisons from clang
Update the IRBuilder to generate constrained FP comparisons in
CreateFCmp when IsFPConstrained is true, similar to the other
places in the IRBuilder.

Also, add a new CreateFCmpS to emit signaling FP comparisons,
and use it in clang where comparisons are supposed to be signaling
(currently, only when emitting code for the <, <=, >, >= operators).

Note that there is currently no way to add fast-math flags to a
constrained FP comparison, since this is implemented as an intrinsic
call that returns a boolean type, and FMF are only allowed for calls
returning a floating-point type. However, given the discussion around
https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself
really shouldn't have any FMF either, so this is probably OK.

Reviewed by: craig.topper

Differential Revision: https://reviews.llvm.org/D71467
2020-01-10 14:33:10 +01:00
serge-sans-paille 921f871ac4 Allow system header to provide their own implementation of some builtin
If a system header provides an (inline) implementation of some of their
function, clang still matches on the function name and generate the appropriate
llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation «
users may not provide their own version of symbols » but doesn't account for the
fact that glibc itself can provide inline version of some functions.

It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that
case an inline version of memcpy calls __memcpy_chk, a function that performs
extra runtime checks. Clang currently ignores the inline version and thus
provides no runtime check.

This code fixes the issue by detecting functions whose name is a builtin name
but also have an inline implementation.

Differential Revision: https://reviews.llvm.org/D71082
2020-01-10 09:44:20 +01:00
Wei Mi 21a4710c67 [ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP
down to pass builder in ltobackend.

Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.

Differential Revision: https://reviews.llvm.org/D72386
2020-01-09 21:13:11 -08:00
Alex Richardson 8c387cbea7 Add builtins for aligning and checking alignment of pointers and integers
This change introduces three new builtins (which work on both pointers
and integers) that can be used instead of common bitwise arithmetic:
__builtin_align_up(x, alignment), __builtin_align_down(x, alignment) and
__builtin_is_aligned(x, alignment).

I originally added these builtins to the CHERI fork of LLVM a few years ago
to handle the slightly different C semantics that we use for CHERI [1].
Until recently these builtins (or sequences of other builtins) were
required to generate correct code. I have since made changes to the default
C semantics so that they are no longer strictly necessary (but using them
does generate slightly more efficient code). However, based on our experience
using them in various projects over the past few years, I believe that adding
these builtins to clang would be useful.

These builtins have the following benefit over bit-manipulation and casts
via uintptr_t:

- The named builtins clearly convey the semantics of the operation. While
  checking alignment using __builtin_is_aligned(x, 16) versus
  ((x & 15) == 0) is probably not a huge win in readably, I personally find
  __builtin_align_up(x, N) a lot easier to read than (x+(N-1))&~(N-1).
- They preserve the type of the argument (including const qualifiers). When
  using casts via uintptr_t, it is easy to cast to the wrong type or strip
  qualifiers such as const.
- If the alignment argument is a constant value, clang can check that it is
  a power-of-two and within the range of the type. Since the semantics of
  these builtins is well defined compared to arbitrary bit-manipulation,
  it is possible to add a UBSAN checker that the run-time value is a valid
  power-of-two. I intend to add this as a follow-up to this change.
- The builtins avoids int-to-pointer casts both in C and LLVM IR.
  In the future (i.e. once most optimizations handle it), we could use the new
  llvm.ptrmask intrinsic to avoid the ptrtoint instruction that would normally
  be generated.
- They can be used to round up/down to the next aligned value for both
  integers and pointers without requiring two separate macros.
- In many projects the alignment operations are already wrapped in macros (e.g.
  roundup2 and rounddown2 in FreeBSD), so by replacing the macro implementation
  with a builtin call, we get improved diagnostics for many call-sites while
  only having to change a few lines.
- Finally, the builtins also emit assume_aligned metadata when used on pointers.
  This can improve code generation compared to the uintptr_t casts.

[1] In our CHERI compiler we have compilation mode where all pointers are
implemented as capabilities (essentially unforgeable 128-bit fat pointers).
In our original model, casts from uintptr_t (which is a 128-bit capability)
to an integer value returned the "offset" of the capability (i.e. the
difference between the virtual address and the base of the allocation).
This causes problems for cases such as checking the alignment: for example, the
expression `if ((uintptr_t)ptr & 63) == 0` is generally used to check if the
pointer is aligned to a multiple of 64 bytes. The problem with offsets is that
any pointer to the beginning of an allocation will have an offset of zero, so
this check always succeeds in that case (even if the address is not correctly
aligned). The same issues also exist when aligning up or down. Using the
alignment builtins ensures that the address is used instead of the offset. While
I have since changed the default C semantics to return the address instead of
the offset when casting, this offset compilation mode can still be used by
passing a command-line flag.

Reviewers: rsmith, aaron.ballman, theraven, fhahn, lebedev.ri, nlopes, aqjune
Reviewed By: aaron.ballman, lebedev.ri
Differential Revision: https://reviews.llvm.org/D71499
2020-01-09 21:48:29 +00:00
Simon Tatham 06d07ec4a3 [Clang] Handle target-specific builtins returning aggregates.
Summary:
A few of the ARM MVE builtins directly return a structure type. This
causes an assertion failure at code-gen time if you try to assign the
result of the builtin to a variable, because the `RValue` created in
`EmitBuiltinExpr` from the `llvm::Value` produced by codegen is always
made by `RValue::get()`, which creates a non-aggregate `RValue` that
will fail an assertion when `AggExprEmitter::withReturnValueSlot` calls
`Src.getAggregatePointer()`. A similar failure occurs if you try to use
the struct return value directly to extract one field, e.g.
`vld2q(address).val[0]`.

The existing code-gen tests for those MVE builtins pass the returned
structure type directly to the C `return` statement, which apparently
managed to avoid that particular code path, so we didn't notice the
crash.

Now `EmitBuiltinExpr` checks the evaluation kind of the builtin's return
value, and does the necessary handling for aggregate returns. I've added
two extra test cases, both of which crashed before this change.

Reviewers: dmgreen, rjmccall

Reviewed By: rjmccall

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72271
2020-01-09 17:28:37 +00:00
serge-sans-paille cee4a1c957 Improve support of GNU mempcpy
- Lower to the memcpy intrinsic
- Raise warnings when size/bounds are known

Differential Revision: https://reviews.llvm.org/D71374
2020-01-09 17:31:00 +01:00
Alexey Bataev 4c11703b3d [OPENMP]Remove unused code, NFC. 2020-01-09 09:50:46 -05:00
Simon Pilgrim e3e72a2619 Fix "pointer is null" static analyzer warnings. NFCI.
Assert that the pointers are non-null before dereferencing them.
2020-01-09 12:05:47 +00:00
Simon Pilgrim 0d5407987a Fix MSVC unhandled enum warning. NFCI. 2020-01-09 11:11:01 +00:00
Simon Pilgrim 5936717fa6 Fix "pointer is null" static analyzer warning. NFCI.
Use castAs<> instead of getAs<> since we know that the pointer will be valid (and is dereferenced immediately below).
2020-01-08 17:19:08 +00:00
Alexey Bataev 4558842891 [OPENMP]Reduce calls for the mangled names.
Use canonical decls instead of mangled names in the set of already
emitted decls. This allows to reduce the number of function calls for
getting declarations mangled names and speedup the compilation.
2020-01-07 14:28:17 -05:00
Yaxun (Sam) Liu 9f2d8b5c0c [HIP] Add option --gpu-max-threads-per-block=n
Add this option to change the default launch bounds.

Differential Revision: https://reviews.llvm.org/D71221
2020-01-07 11:18:00 -05:00
Jim Lin ab1bcda851 [NFC] Use isX86() instead of getArch()
Summary: This is a clean up for https://reviews.llvm.org/D72247.

Reviewers: MaskRay, craig.topper, jhenderson

Reviewed By: MaskRay

Subscribers: hiraditya, rupprecht, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72320
2020-01-07 17:35:44 +08:00
Akira Hatanaka 20f005d25f [CodeGen][ObjC] Push the properties of a protocol before pushing the
properties of the protocol it inherits

This fixes a bug where the type string for a @dynamic property of an
@implementation didn't have 'D' in it when the protocol it conforms to
redeclares the property declared in the base protocol.

rdar://problem/45503561
2020-01-06 16:16:02 -08:00
Fangrui Song 6904cd9486 Add Triple::isX86()
Reviewed By: craig.topper, skan

Differential Revision: https://reviews.llvm.org/D72247
2020-01-06 15:51:02 -08:00
Alexey Bataev 7b518dcb29 [OPENMP50]Support lastprivate conditional updates in inc/dec unary ops.
Added support for checking of updates of variables used in unary
pre(pos) inc/dec expressions.
2020-01-06 16:37:01 -05:00
Alexey Bataev a58da1a2ff [OPENMP50]Codegen for lastprivate conditional list items.
Added codegen support for lastprivate conditional. According to the
standard, if  when the conditional modifier appears on the clause, if an
assignment to a list item is encountered in the construct then the
original list item is assigned the value that is assigned to the new
list item in the sequentially last iteration or lexically last section
in which such an assignment is encountered.
We look for the assignment operations and check if the left side
references lastprivate conditional variable. Then the next code is
emitted:
if (last_iv_a <= iv) {
  last_iv_a = iv;
  last_a = lp_a;
}

At the end the implicit barrier is generated to wait for the end of all
threads and then in the check for the last iteration the private copy is
assigned the last value.

if (last_iter) {
  lp_a = last_a; // <--- new code
  a = lp_a;      // <--- store of private value to the original  variable.
}
2020-01-02 16:43:00 -05:00
Kevin P. Neal 89d6c288ba [SystemZ] Use FNeg in s390x clang builtins
The s390x builtins are still using FSub instead of FNeg. Correct that.
2020-01-02 12:14:43 -05:00
serge_sans_paille 24ab9b537e Generalize the pass registration mechanism used by Polly to any third-party tool
There's quite a lot of references to Polly in the LLVM CMake codebase. However
the registration pattern used by Polly could be useful to other external
projects: thanks to that mechanism it would be possible to develop LLVM
extension without touching the LLVM code base.

This patch has two effects:

1. Remove all code specific to Polly in the llvm/clang codebase, replaicing it
   with a generic mechanism

2. Provide a generic mechanism to register compiler extensions.

A compiler extension is similar to a pass plugin, with the notable difference
that the compiler extension can be configured to be built dynamically (like
plugins) or statically (like regular passes).

As a result, people willing to add extra passes to clang/opt can do it using a
separate code repo, but still have their pass be linked in clang/opt as built-in
passes.

Differential Revision: https://reviews.llvm.org/D61446
2020-01-02 16:45:31 +01:00
Mark de Wever 8dc7b982b4 [NFC] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D71857
2020-01-01 20:01:37 +01:00
Fangrui Song d2bb8c16e7 [MC][TargetMachine] Delete MCTargetOptions::MCPIECopyRelocations
clang/lib/CodeGen/CodeGenModule performs the -mpie-copy-relocations
check and sets dso_local on applicable global variables. We don't need
to duplicate the work in TargetMachine shouldAssumeDSOLocal.

Verified that -mpie-copy-relocations can still emit PC relative
relocations for external variable accesses.

clang -target x86_64 -fpie -mpie-copy-relocations -c => R_X86_64_PC32
clang -target aarch64 -fpie -mpie-copy-relocations -c => R_AARCH64_ADR_PREL_PG_HI21+R_AARCH64_LDST64_ABS_LO12_NC
2020-01-01 00:50:18 -08:00
Alexey Bataev 8be5a0fe12 [OPENMP]Emit artificial threprivate vars as threadlocal, if possible.
It may improve performance for declare reduction constructs.
2019-12-31 14:11:36 -05:00
Craig Topper 5e5a1d2790 [CodeGen] Emit conj/conjf/confjl libcalls as fneg instructions if possible.
We already recognize the __builtin versions of these, might as well
recognize the libcall version.

Differential Revision: https://reviews.llvm.org/D72028
2019-12-31 10:41:00 -08:00
Craig Topper 70f8dd4cf6 [CodeGen] Use IRBuilder::CreateFNeg for __builtin_conj
This replaces the fsub -0.0 idiom with an fneg instruction. We didn't see to have a test that showed the current codegen. Just some tests for constant folding and a test that was only checking the declare lines for libcalls. The latter just checked that we did not have a declare for @conj when using __builtin_conj.

Differential Revision: https://reviews.llvm.org/D72012
2019-12-30 13:25:23 -08:00
Craig Topper 8b23b2bbd9 [CodeGen] Use CreateFNeg in buildFMulAdd
We have an fneg instruction now and should use it instead of the fsub -0.0 idiom. Looks like we had no test that showed that we handled the negation cases here so I've added new tests.

Differential Revision: https://reviews.llvm.org/D72010
2019-12-30 13:24:11 -08:00
Johannes Doerfert 10fedd94b4 [OpenMP] Use the OpenMPIRBuilder for `omp parallel`
This allows to use the OpenMPIRBuilder for parallel regions. Code was
extracted from D61953 and adapted to work with the new version (D70109).

All but one feature should be supported. An update of this patch will
provide test coverage and privatization other than shared.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D70290
2019-12-30 13:57:13 -06:00
Johannes Doerfert 6c5d1f40ff [OpenMP][NFCI] Use the libFrontend ProcBindKind in Clang
This removes the OpenMPProcBindClauseKind enum in favor of
llvm::omp::ProcBindKind which lives in OpenMPConstants.h and was
introduced in D70109.

No change in behavior is expected.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D70289
2019-12-26 11:04:07 -06:00
Johannes Doerfert f9c3c5da19 [OpenMP][IR-Builder] Introduce the finalization stack
As a permanent and generic solution to the problem of variable
finalization (destructors, lastprivate, ...), this patch introduces the
finalization stack. The objects on the stack describe (1) the
(structured) regions the OpenMP-IR-Builder is currently constructing,
(2) if these are cancellable, and (3) the callback that will perform the
finalization (=cleanup) when necessary.

As the finalization can be necessary multiple times, at different source
locations, the callback takes the position at which code is currently
generated. This position will also encode the destination of the "region
exit" block *iff* the finalization call was issues for a region
generated by the OpenMPIRBuilder. For regions generated through the old
Clang OpenMP code geneneration, the "region exit" is determined by Clang
inside the finalization call instead (see getOMPCancelDestination).

As a first user, the parallel + cancel barrier interaction is changed.
In contrast to the temporary solution before, the barrier generation in
Clang does not need to be aware of the "CancelDestination" block.
Instead, the finalization callback is and, as described above, later
even that one does not need to be.

D70109 will be updated to use this scheme.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D70258
2019-12-25 16:57:08 -06:00
Kevin P. Neal 0293b5d671 [NFC] Remove some dead code from CGBuiltin.cpp. 2019-12-24 09:38:34 -05:00
Craig Topper d35bcbbb5d [Sema][X86] Consider target attribute into the checks in validateOutputSize and validateInputSize.
The validateOutputSize and validateInputSize need to check whether
AVX or AVX512 are enabled. But this can be affected by the
target attribute so we need to factor that in.

This patch moves some of the code from CodeGen to create an
appropriate feature map that we can pass to the function.

Differential Revision: https://reviews.llvm.org/D68627
2019-12-23 11:23:30 -08:00
Alexey Bataev 0860db966a [OPENMP50]Codegen for nontemporal clause.
Summary:
Basic codegen for the declarations marked as nontemporal. Also, if the
base declaration in the member expression is marked as nontemporal,
lvalue for member decl access inherits nonteporal flag from the base
lvalue.

Reviewers: rjmccall, hfinkel, jdoerfert

Subscribers: guansong, arphaman, caomhin, kkwli0, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71708
2019-12-23 10:04:46 -05:00
Martin Storsjö 86c9831bb4 [ItaniumCXXABI] Don't mark an extern_weak init function as dso_local on windows
Since 6bf108d77a, we try to not mark extern_weak symbols as
dso_local, to allow using COFF stubs for references to those symbols
(as the symbol may be missing, resolving to an absolute address zero,
outside of the current DSO).

Differential Revision: https://reviews.llvm.org/D71716
2019-12-23 12:13:48 +02:00
Yonghong Song e3d8ee35e4 reland "[DebugInfo] Support to emit debugInfo for extern variables"
Commit d77ae1552f
("[DebugInfo] Support to emit debugInfo for extern variables")
added deebugInfo for extern variables for BPF target.
The commit is reverted by 891e25b02d
as the committed tests using %clang instead of %clang_cc1 causing
test failed in certain scenarios as reported by Reid Kleckner.

This patch fixed the tests by using %clang_cc1.

Differential Revision: https://reviews.llvm.org/D71818
2019-12-22 18:28:50 -08:00
Reid Kleckner 891e25b02d Revert "[DebugInfo] Support to emit debugInfo for extern variables"
This reverts commit d77ae1552f.

The tests committed along with this change do not pass, and should be
changed to use %clang_cc1.
2019-12-22 12:54:06 -08:00
Eric Astor dc5b614fa9 [ms] [X86] Use "P" modifier on operands to call instructions in inline X86 assembly.
Summary:
This is documented as the appropriate template modifier for call operands.
Fixes PR44272, and adds a regression test.

Also adds support for operand modifiers in Intel-style inline assembly.

Reviewers: rnk

Reviewed By: rnk

Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71677
2019-12-22 09:16:34 -05:00
Fangrui Song 0792ef7256 [Driver] Verify -mrecord-mcount in Driver, instead of CodeGen after D71627
GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the
option.

  aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’

Allowing this option can cause failures when building Linux kernel for
aarch64, powerpc64, etc, which will think the feature is available if
the clang command returns 0.
2019-12-21 22:47:24 -08:00
Pierre Habouzit 42f9d0c0be [objc_direct] Tigthen checks for direct methods
Because the name of a direct method must be agreed upon by the caller
and the implementation, certain bad practices that one can get away with
when using dynamism are fatal with direct methods.

To avoid really weird and unscruttable linker error, tighten the
front-end error reporting.

Rule 1:
  Direct methods can only have at most one declaration in an @interface
  container. Any redeclaration is strictly forbidden.

  Today some amount of redeclaration is tolerated between the main
  interface and categories for dynamic methods, but we can't have that.

Rule 2:
  Direct method implementations can only be declared in a matching
  @interface container: when implemented in the primary @implementation
  then the declaration must be in the primary @interface or an
  extension, and when implemented in a category, the declaration must be
  in the @interface for the same category.

Also fix another issue with ObjCMethod::getCanonicalDecl(): when an
implementation lives in the primary @interface, then its canonical
declaration can be in any extension, even when it's not an accessor.

Add Sema tests to cover the new errors, and CG tests to beef up testing
around function names for categories and extensions.

Radar-Id: <rdar://problem/58054563>

Differential Revision: https://reviews.llvm.org/D71694
2019-12-20 10:57:36 -08:00
Tim Northover 85cb560b8a ConstrainedFP: use API compatible with opaque pointers.
This just updates an IRBuilder interface to take Functions instead of
Values so the type can be derived, and fixes some callsites in Clang to
call the updated API.
2019-12-19 21:50:47 +00:00
Jonas Paulsson 2520bef865 [Clang FE, SystemZ] Recognize -mrecord-mcount CL option.
Recognize -mrecord-mcount from the command line and add a function attribute
"mrecord-mcount" when passed.

Only valid on SystemZ (when used with -mfentry).

Review: Ulrich Weigand
https://reviews.llvm.org/D71627
2019-12-19 08:51:55 -08:00
Thomas Lively 71eb8023d8 [WebAssembly] Add avgr_u intrinsics and require nuw in patterns
Summary:
The vector pattern `(a + b + 1) / 2` was previously selected to an
avgr_u instruction regardless of nuw flags, but this is incorrect in
the case where either addition may have an unsigned wrap. This CL
changes the existing pattern to require both adds to have nuw flags
and adds builtin functions and intrinsics for the avgr_u instructions
because the corrected pattern is not representable in C.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71648
2019-12-18 15:31:38 -08:00
Richard Smith 3ced23976a Refactor CompareReferenceRelationship and its callers in preparation for
implementing the resolution of CWG2352.

No functionality change, except that we now convert the referent of a
reference binding to the underlying type of the reference in more cases;
we used to happen to preserve the type sugar from the referent if the
only type change was in the cv-qualifiers.

This exposed a bug in how we generate code for trivial assignment
operators: if the type sugar (particularly the may_alias attribute)
got lost during reference binding, we'd use the "wrong" TBAA information
for the load during the assignment.
2019-12-18 14:05:57 -08:00
Akira Hatanaka a6d57a8cd4 Use hasOffsetApplied to initialize member HasOffsetApplied
This is NFC since none of the constructor calls in trunk pass
hasOffsetApplied=true.
2019-12-18 13:56:59 -08:00
Jonas Paulsson ca520592c0 [Clang FE, SystemZ] Don't add "true" value for the "mnop-mcount" attribute.
Let the "mnop-mcount" function attribute simply be present or non-present.
Update SystemZ backend as well to use hasFnAttribute() instead.

Review: Ulrich Weigand
https://reviews.llvm.org/D71669
2019-12-18 11:04:13 -08:00
Alexey Bataev b6e7084e25 [OPENMP50]Add parsing/sema analysis for nontemporal clause.
Add basic support for parsing/sema analysis of the nontemporal clause in
simd-based directives.
2019-12-17 14:46:32 -05:00
Jonas Paulsson 599d1cc07a [Clang FE, SystemZ] Recognize -mpacked-stack CL option
Recognize -mpacked-stack from the command line and add a function attribute
"mpacked-stack" when passed. This is needed for building the Linux kernel.

If this option is passed for any other target than SystemZ, an error is
generated.

Review: Ulrich Weigand
https://reviews.llvm.org/D71441
2019-12-17 11:26:17 -08:00
Raphael Isemann ccfab8e459 [ObjC][DWARF] Emit DW_AT_APPLE_objc_direct for methods marked as __attribute__((objc_direct))
Summary:
With DWARF5 it is no longer possible to distinguish normal methods and methods with `__attribute__((objc_direct))` by just looking at the debug information
as they are both now children of the of the DW_TAG_structure_type that defines them (before only the `__attribute__((objc_direct))` methods were children).

This means that in LLDB we are no longer able to create a correct Clang AST of a module by just looking at the debug information. Instead we would
need to call the Objective-C runtime to see which of the methods have a `__attribute__((objc_direct))` and then add the attribute to our own Clang AST
depending on what the runtime returns. This would mean that we either let the module AST be dependent on the Objective-C runtime (which doesn't
seem right) or we retroactively add the missing attribute to the imported AST in our expressions.

A third option is to annotate methods with `__attribute__((objc_direct))` as `DW_AT_APPLE_objc_direct` which is what this patch implements. This way
LLDB doesn't have to call the runtime for any `__attribute__((objc_direct))` method and the AST in our module will already be correct when we create it.

Reviewers: aprantl, SouraVX

Reviewed By: aprantl

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D71201
2019-12-17 09:40:36 +01:00
Richard Smith f495de43bd [c++20] P1959R0: Remove support for std::*_equality. 2019-12-16 17:49:45 -08:00
Thomas Lively 3a93756dfb [WebAssembly] Replace SIMD int min/max builtins with patterns
Summary:
The instructions were originally implemented via builtins and
intrinsics so users would have to explicitly opt-in to using
them. This was useful while were validating whether these instructions
should have been merged into the spec proposal. Now that they have
been, we can use normal codegen patterns, so the intrinsics and
builtins are no longer useful.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71500
2019-12-16 11:48:49 -08:00
Teresa Johnson 878ab6df03 [TLI] Support for per-Function TLI that overrides available libfuncs
Summary:

Follow-on to D66428 and D71193, to build the TLI per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

With D71193, the -fno-builtin* flags are converted to function
attributes, so we can now set this information per-function on the TLI.

In this patch, the TLI constructor is changed to take a Function, which
can be used to override the available builtins. The TLI is augmented
with an array that can be used to specify which builtins are not
available for the corresponding function. The available function checks
are changed to consult this override before checking the underlying
module level baseline TLII. New code is added to set this override
array based on the attributes.

I also removed the code that sets availability in the TLII in clang from
the options, which is no longer needed.

I removed a per-Triple caching of TLII objects in the analysis object,
as it is based on the Module's Triple which is the same for all
functions in any case. Is there a case where we would be compiling
multiple Modules with different Triples in one compilation?

Finally, I have changed the legacy analysis wrapper to create and use
the new PM analysis class (TargetLibraryAnalysis) in getTLI. This is
consistent with the behavior of getTTI for the legacy
TargetTransformInfo analysis. This change means that getTLI now creates
a new TLI on each call (although that should be very cheap as we cache
the module level TLII, and computing the per-function
attribute based availability should also be reasonably efficient).
I measured the compile time for a large C++ file with tens of thousands
of functions and as expected there was no increase.

Reviewers: chandlerc, hfinkel, gchatelet

Subscribers: mehdi_amini, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67923
2019-12-16 09:19:30 -08:00
Nicola Zaghen 97572775d2 Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

This fixes the buildbot failures.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-13 14:30:21 +00:00
Alexey Bataev 5ad52587ec [OPENMP50]Fix possible conflict when emitting an alias for the functions
in declare variant.

If the types of the fnction are not equal, but match, at the codegen
thei may have different types. This may lead to compiler crash.
2019-12-12 15:48:33 -05:00
Teresa Johnson c8e0bb3b2c [LTO] Support for embedding bitcode section during LTO
Summary:
This adds support for embedding bitcode in a binary during LTO. The libLTO gains supports the `-lto-embed-bitcode` flag. The option allows users of the LTO library to embed a bitcode section. For example, LLD can pass the option via `ld.lld -mllvm=-lto-embed-bitcode`.

This feature allows doing something comparable to `clang -c -fembed-bitcode`, but on the (LTO) linker level. Having bitcode alongside native code has many use-cases. To give an example, the MacOS linker can create a `-bitcode_bundle` section containing bitcode. Also, having this feature built into LLVM is an alternative to 3rd party tools such as [[ https://github.com/travitch/whole-program-llvm | wllvm ]] or [[ https://github.com/SRI-CSL/gllvm | gllvm ]]. As with these tools, this feature simplifies creating "whole-program" llvm bitcode files, but in contrast to wllvm/gllvm it does not rely on a specific llvm frontend/driver.

Patch by Josef Eisl <josef.eisl@oracle.com>

Reviewers: #llvm, #clang, rsmith, pcc, alexshap, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mehdi_amini, inglorion, hiraditya, aheejin, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits, #llvm, #clang

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68213
2019-12-12 12:34:19 -08:00
Guillaume Chatelet 0508c994f0 [clang] Turn -fno-builtin flag into an IR Attribute
Summary:
This is a follow up on https://reviews.llvm.org/D61634#1742154 to turn the clang driver -fno-builtin flag into an IR attribute.
I also investigated pushing the attribute earlier on (in Sema) but it looks like this patch is simple and will cover all function calls.

Reviewers: aaron.ballman, courbet

Subscribers: cfe-commits, tejohnson

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71193
2019-12-12 17:21:12 +01:00
Guillaume Chatelet dbc5acf8ce [Alignment][NFC] Adding Align compatible methods to IntrinsicInst/IRBuilder
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71420
2019-12-12 16:22:15 +01:00
Nicola Zaghen f798eb21ec Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same."
This reverts commit 5f6208778f.

This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll
const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
2019-12-12 10:29:54 +00:00
Nicola Zaghen 5f6208778f [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-12 10:07:01 +00:00
Reid Kleckner 5d986953c8 [IR] Split out target specific intrinsic enums into separate headers
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
  Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
  object file size.
- Incremental step towards decoupling target intrinsics.

The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.

Part of PR34259

Reviewers: efriedma, echristo, MaskRay

Reviewed By: echristo, MaskRay

Differential Revision: https://reviews.llvm.org/D71320
2019-12-11 18:02:14 -08:00
Johannes Doerfert b3c06db456 [OpenMP] Use the OpenMP-IR-Builder
This is a follow up patch to use the OpenMP-IR-Builder, as discussed on
the mailing list ([1] and later) and at the US Dev Meeting'19.

[1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim

Subscribers: ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69922
2019-12-11 16:51:13 -06:00
Richard Smith e0e07a7e41 Fix detection of __attribute__((may_alias)) to properly look through
type sugar.

We previously missed the attribute in a lot of cases in C++, because
there's often other type sugar there (eg, ElaboratedType).
2019-12-11 14:04:37 -08:00
Alexey Bataev 0b9789456b [OPENMP50]Add if clause in teams distribute parallel for simd directive.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
2019-12-11 16:11:41 -05:00
Sam Clegg 881d877846 [WebAssembly] Add new `export_name` clang attribute for controlling wasm export names
This is equivalent to the existing `import_name` and `import_module`
attributes which control the import names in the final wasm binary
produced by lld.

This maps the existing

This attribute currently requires a string rather than using the
symbol name for a couple of reasons:

1. Avoid confusion with static and dynamic linking which is
   based on symbol name.  Exporting a function from a wasm module using
   this directive is orthogonal to both static and dynamic linking.
2. Avoids name mangling.

Differential Revision: https://reviews.llvm.org/D70520
2019-12-11 11:54:57 -08:00
Russell Gallop df494f7512 [Support] Add TimeTraceScope constructor without detail arg
This simplifies code where no extra details are required
Also don't write out detail when it is empty.

Differential Revision: https://reviews.llvm.org/D71347
2019-12-11 14:32:21 +00:00
Nicolai Hähnle f21c081b78 CodeGen: Allow annotations on globals in non-zero address space
Summary:
Attribute annotations are recorded in a special global composite variable
that points to annotation strings and the annotated objects.

As a restriction of the LLVM IR type system, those pointers are all
pointers to address space 0, so let's insert an addrspacecast when the
annotated global is in a non-0 address space.

Since this addrspacecast is only reachable from the global annotations
object, this should allow us to represent annotations on all globals
regardless of which addrspacecasts are usually legal for the target.

Reviewers: rjmccall

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71208
2019-12-11 13:24:32 +01:00
Sjoerd Meijer 0216854917 [Clang] Pragma vectorize_width() implies vectorize(enable)
Let's try this again; this has been reverted/recommited a few times. Last time
this got reverted because for this loop:

  void a() {
    #pragma clang loop vectorize(disable)
    for (;;)
      ;
  }

vectorisation was incorrectly enabled and the vectorize.enable metadata was set
due to a logic error. But with this fixed, we now imply vectorisation when:

1) vectorisation is enabled, which means: VectorizeWidth > 1,
2) and don't want to add it when it is disabled or enabled, otherwise we would
   be incorrectly setting it or duplicating the metadata, respectively.

This should fix PR27643.

Differential Revision: https://reviews.llvm.org/D69628
2019-12-11 10:37:40 +00:00
Simon Tatham bd0f271c9e [ARM][MVE] Add intrinsics for immediate shifts. (reland)
This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which
shift every lane of a vector left or right by a compile-time
immediate. They mostly work by expanding to the IR `shl`, `lshr` and
`ashr` operations, with their second operand being a vector splat of
the immediate.

There's a fiddly special case, though. ACLE specifies that the
immediate in `vshrq_n` can take values up to //and including// the bit
size of the vector lane. But LLVM IR thinks that shifting right by the
full size of the lane is UB, and feels free to replace the `lshr` with
an `undef` half way through the optimization pipeline. Hence, to keep
this legal in source code, I have to detect it at codegen time.
Logical (unsigned) right shifts by the element size are handled by
simply emitting the zero vector; arithmetic ones are converted into a
shift of one bit less, which will always give the same output.

In order to do that check, I also had to enhance the tablegen
MveEmitter so that it can cope with converting a builtin function's
operand into a bare integer to pass to a code-generating subfunction.
Previously the only bare integers it knew how to handle were flags
generated from within `arm_mve.td`.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen, MarkMurrayARM

Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71065
2019-12-11 10:10:09 +00:00
Erik Pilkington d5e66f0e06 NFC: Get rid of an unused parameter to CGObjCMac::EmitSelectorAddr. 2019-12-10 16:54:48 -08:00
Yaxun (Sam) Liu 21b43885b8 Fix bug 44190 - wrong code with #pragma pack(1)
5b330e8d61 caused
a regression on s390:

https://bugs.llvm.org/show_bug.cgi?id=44190

we need to copy if if either the argument is non-byval or the argument is underaligned.

Differential Revision: https://reviews.llvm.org/D71282
2019-12-10 13:56:34 -05:00
Kevin P. Neal 6515c524b0 [FPEnv] clang support for constrained FP builtins
Change the IRBuilder and clang so that constrained FP intrinsics will be
emitted for builtins when appropriate. Only non-target-specific builtins
are affected in this patch.

Differential Revision: https://reviews.llvm.org/D70256
2019-12-10 13:09:12 -05:00
Yonghong Song d77ae1552f [DebugInfo] Support to emit debugInfo for extern variables
Extern variable usage in BPF is different from traditional
pure user space application. Recent discussion in linux bpf
mailing list has two use cases where debug info types are
required to use extern variables:
  - extern types are required to have a suitable interface
    in libbpf (bpf loader) to provide kernel config parameters
    to bpf programs.
    https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23PtwaZs-yAyafmXC9g@mail.gmail.com/T/#t
  - extern types are required so kernel bpf verifier can
    verify program which uses external functions more precisely.
    This will make later link with actual external function no
    need to reverify.
    https://lore.kernel.org/bpf/87eez4odqp.fsf@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed

This patch added clang support to emit debuginfo for extern variables
with a TargetInfo hook to enable it. The debuginfo for the
extern variable is emitted only if that extern variable is
referenced in the current compilation unit.

Currently, only BPF target enables to generate debug info for
extern variables. The emission of such debuginfo is disabled for C++
 at this moment since BPF only supports a subset of C language.
Emission with C++ can be enabled later if an appropriate use case
is identified.

-fstandalone-debug permits us to see more debuginfo with the cost
of bloated binary size. This patch did not add emission of extern
variable debug info with -fstandalone-debug. This can be
re-evaluated if there is a real need.

Differential Revision: https://reviews.llvm.org/D70696
2019-12-10 08:09:51 -08:00
Guillaume Chatelet 1b2842bf90 [Alignment][NFC] CreateMemSet use MaybeAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71213
2019-12-10 15:17:44 +01:00
Johannes Doerfert eb3e81f43f [OpenMP][NFCI] Introduce llvm/IR/OpenMPConstants.h
Summary:
The new OpenMPConstants.h is a location for all OpenMP related constants
(and helpers) to live.

This patch moves the directives there (the enum OpenMPDirectiveKind) and
rewires Clang to use the new location.

Initially part of D69785.

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim

Subscribers: jholewinski, ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69853
2019-12-10 00:10:09 -06:00
Eric Christopher 9c6b7f68b8 Revert "[ARM][MVE] Add intrinsics for immediate shifts."
and two follow-on commits: one warning fix and one functionality.

As it's breaking at least the lto bot:

http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15132/steps/test-stage1-compiler/logs/stdio

This reverts commits:

 8d70f3c933
 ff4dceef92
 d97b3e3e65
2019-12-09 16:47:38 -08:00
Reid Kleckner 9803178a78 Avoid Attr.h includes, CodeGen edition
This saves around 20 includes of Attr.h. Not much.
2019-12-09 16:17:18 -08:00
Craig Topper 505aa2410d [Attr] Move ParsedTargetAttr out of the TargetAttr class
Need to forward declare it in ASTContext.h for D68627, so it can't be a nested struct.

Differential Revision: https://reviews.llvm.org/D71159
2019-12-09 12:40:41 -08:00
Haojian Wu ff4dceef92 Fix the compiler warnings: "-Winconsistent-missing-override", "-Wunused-variable"
for d97b3e3e65
2019-12-09 17:09:07 +01:00
Simon Tatham d97b3e3e65 [ARM][MVE] Add intrinsics for immediate shifts.
Summary:
This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which
shift every lane of a vector left or right by a compile-time
immediate. They mostly work by expanding to the IR `shl`, `lshr` and
`ashr` operations, with their second operand being a vector splat of
the immediate.

There's a fiddly special case, though. ACLE specifies that the
immediate in `vshrq_n` can take values up to //and including// the bit
size of the vector lane. But LLVM IR thinks that shifting right by the
full size of the lane is UB, and feels free to replace the `lshr` with
an `undef` half way through the optimization pipeline. Hence, to keep
this legal in source code, I have to detect it at codegen time.
Logical (unsigned) right shifts by the element size are handled by
simply emitting the zero vector; arithmetic ones are converted into a
shift of one bit less, which will always give the same output.

In order to do that check, I also had to enhance the tablegen
MveEmitter so that it can cope with converting a builtin function's
operand into a bare integer to pass to a code-generating subfunction.
Previously the only bare integers it knew how to handle were flags
generated from within `arm_mve.td`.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71065
2019-12-09 15:44:09 +00:00
Reid Kleckner eff08f4097 Revert "[Sema][X86] Consider target attribute into the checks in validateOutputSize and validateInputSize."
This reverts commit e1578fd2b7.

It introduces a dependency on Attr.h which I am removing from
ASTContext.h.
2019-12-06 15:42:14 -08:00
Craig Topper e1578fd2b7 [Sema][X86] Consider target attribute into the checks in validateOutputSize and validateInputSize.
The validateOutputSize and validateInputSize need to check whether
AVX or AVX512 are enabled. But this can be affected by the
target attribute so we need to factor that in.

This patch copies some of the code from CodeGen to create an
appropriate feature map that we can pass to the function. Probably
need some refactoring here to share more code with Codegen. Is
there a good place to do that? Also need to support the cpu_specific
attribute as well.

Differential Revision: https://reviews.llvm.org/D68627
2019-12-06 15:30:59 -08:00
Alex Lorenz f3efd69574 [ObjC] Make sure that the implicit arguments for direct methods have been setup
This commit sets the Self and Imp declarations for ObjC method declarations,
in addition to the definitions. It also fixes
a bunch of code in clang that had wrong assumptions about when getSelfDecl() would be set:

- CGDebugInfo::getObjCMethodName and AnalysisConsumer::getFunctionName would assume that it was
  set for method declarations part of a protocol, which they never were,
  and that self would be a Class type, which it isn't as it is id for a protocol.

Also use the Canonical Decl to index the set of Direct methods so that
when calls and implementations interleave, the same llvm::Function is
used and the same symbol name emitted.

Radar-Id: rdar://problem/57661767

Patch by: Pierre Habouzit

Differential Revision: https://reviews.llvm.org/D71091
2019-12-06 14:28:28 -08:00
Alexey Bataev 779a180d96 [OPENMP50]Add if clause in distribute simd directive.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
2019-12-06 14:49:49 -05:00
Zahira Ammarguellat a3b2552575 Fix for PR44000. Optimization record for bytecode input missing.
Review is here:  https://reviews.llvm.org/D70691
2019-12-06 07:48:42 -05:00
Adrian Prantl 338588d7cf Debug Info: Apply a default location for cleanups if none is available.
This unbreaks the debuginfo-tests testsuite by replacing the assertion
with a default location. There are cleanups in helper functions that
don't have a valid source location such as block copy helpers and it's
not worth tracking each of them down.

rdar://57630879
2019-12-05 13:30:23 -08:00
Adrian Prantl ce7d35988d Debug Info: Assert that location is available for cleanups
rdar://57630879

Differential Revision: https://reviews.llvm.org/D71042
2019-12-05 12:45:10 -08:00
cchen 47d6094d7f [OpenMP50] Add parallel master construct
Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: rnk, jholewinski, guansong, arphaman, jfb, cfe-commits, sandoval, dreachem

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70726
2019-12-05 14:35:27 -05:00
Melanie Blower 7f9b513847 Reapply af57dbf12e "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior="
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
        The original patch is modified to set the strictfp IR attribute
        explicitly in CodeGen instead of as a side effect of IRBuilder.
        In the 2nd attempt to reapply there was a windows lit test fail, the
        tests were fixed to use wildcard matching.

        Differential Revision: https://reviews.llvm.org/D62731
2019-12-05 03:48:04 -08:00
Reid Kleckner 33f6d465d7 Revert "[OpenMP50] Add parallel master construct, by Chi Chun Chen."
This reverts commit 713dab21e2.

Tests do not pass on Windows.
2019-12-04 14:50:06 -08:00
Alexey Bataev 61205821ca [OPENMP50]Add support for if clause for simd part in taskloop simd
directive.

According to OpenMP 5.0, the `if` clause can be applied to simd
subdirective in the combined directive.
2019-12-04 15:50:39 -05:00
Melanie Blower 5412913631 Revert " Reapply af57dbf12e "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=""
This reverts commit cdbed2dd85.
Build break on Windows (lit fail)
2019-12-04 12:21:23 -08:00
cchen 713dab21e2 [OpenMP50] Add parallel master construct, by Chi Chun Chen.
Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: jholewinski, guansong, arphaman, jfb, cfe-commits, sandoval, dreachem

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70726
2019-12-04 14:53:17 -05:00
Melanie Blower cdbed2dd85 Reapply af57dbf12e "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior="
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
        The original patch is modified to set the strictfp IR attribute
        explicitly in CodeGen instead of as a side effect of IRBuilder

        Differential Revision: https://reviews.llvm.org/D62731
2019-12-04 11:32:33 -08:00
Vedant Kumar f208b70fbc Revert "[Coverage] Revise format to reduce binary size"
This reverts commit e18531595b.

On Windows, there is an error:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/54963/steps/stage%201%20check/logs/stdio

error: C:\b\slave\sanitizer-windows\build\stage1\projects\compiler-rt\test\profile\Profile-x86_64\Output\instrprof-merging.cpp.tmp.v1.o: Failed to load coverage: Malformed coverage data
2019-12-04 10:35:14 -08:00
Vedant Kumar e18531595b [Coverage] Revise format to reduce binary size
Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2019-12-04 10:10:55 -08:00
Akira Hatanaka d8136f14f1 [CodeGen][ObjC] Emit a primitive store to store a __strong field in
ExpandTypeFromArgs

This fixes a bug in IRGen where a call to `llvm.objc.storeStrong` was
being emitted to initialize a __strong field of an uninitialized
temporary struct, which caused crashes at runtime.

rdar://problem/51807365
2019-12-03 23:44:30 -08:00
Petr Hosek 9c3f9b9c12 [Clang] Define Fuchsia C++ABI
Currently, it is a modified version of the Itanium ABI, with the only
change being that constructors and destructors return 'this'.

Differential Revision: https://reviews.llvm.org/D70575
2019-12-03 18:35:57 -08:00
Akira Hatanaka f139ae3d93 [NFC] Pass a reference to CodeGenFunction to methods of LValue and
AggValueSlot

This reapplies 8a5b7c3570 after a null
dereference bug in CGOpenMPRuntime::emitUserDefinedMapper.

Original commit message:

This is needed for the pointer authentication work we plan to do in the
near future.

a63a81bd99/clang/docs/PointerAuthentication.rst
2019-12-03 15:22:13 -08:00
Reid Kleckner 705a6aef35 [MS] Emit exported complete/vbase destructors
Summary:
Fixes PR44205

I checked, and deleting destructors are not affected.

Reviewers: hans

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70931
2019-12-03 14:46:32 -08:00
Akira Hatanaka 9f37c0e703 Revert "[NFC] Pass a reference to CodeGenFunction to methods of LValue and"
This reverts commit 8a5b7c3570. This seems
to have broken UBSan because of a null dereference.
2019-12-03 13:08:01 -08:00
Vedant Kumar 859bf4d2be [Coverage] Emit a gap region to cover switch bodies
Emit a gap region beginning where the switch body begins. This sets line
execution counts in the areas between non-overlapping cases to 0.

This also removes some special handling of the first case in a switch:
these are now treated like any other case.

This does not resolve an outstanding issue with case statement regions
that do not end when a region is terminated. But it should address
llvm.org/PR44011.

Differential Revision: https://reviews.llvm.org/D70571
2019-12-03 12:35:54 -08:00
Akira Hatanaka 8a5b7c3570 [NFC] Pass a reference to CodeGenFunction to methods of LValue and
AggValueSlot

This is needed for the pointer authentication work we plan to do in the
near future.

a63a81bd99/clang/docs/PointerAuthentication.rst
2019-12-03 11:30:09 -08:00
Sourabh Singh Tomar f1e3988aa6 Recommit "[DWARF5]Addition of alignment atrribute in typedef DIE."
This revision is revised to update Go-bindings and Release Notes.

The original commit message follows.

This patch, adds support for DW_AT_alignment[DWARF5] attribute, to be emitted with typdef DIE.
When explicit alignment is specified.

Patch by Awanish Pandey <Awanish.Pandey@amd.com>

Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok,
deadalinx

Differential Revision: https://reviews.llvm.org/D70111
2019-12-03 09:51:43 +05:30
Victor Campos dcf11c5e86 [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-A
Summary:
Add support for vcadd_* family of intrinsics. This set of intrinsics is
available in Armv8.3-A.

The fp16 versions require the FP16 extension, which has been available
(opt-in) since Armv8.2-A.

Reviewers: t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70862
2019-12-02 14:38:39 +00:00
Johannes Altmanninger 1ac700cdef [CodeGen] Fix clang crash on aggregate initialization of array of labels
Summary: Fix PR43700

The ConstantEmitter in AggExprEmitter::EmitArrayInit was initialized
with the CodeGenFunction set to null, which caused the crash.
Also simplify another call, and make the CGF member a const pointer
since it is public but only assigned in the constructor.

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70302
2019-11-28 00:59:25 +01:00
Roman Lebedev b98a0c7f6c
[clang][CodeGen] Implicit Conversion Sanitizer: handle increment/decrement (PR44054)(take 2)
Summary:
Implicit Conversion Sanitizer is *almost* feature complete.
There aren't *that* much unsanitized things left,
two major ones are increment/decrement (this patch) and bit fields.

As it was discussed in
[[ https://bugs.llvm.org/show_bug.cgi?id=39519 | PR39519 ]],
unlike `CompoundAssignOperator` (which is promoted internally),
or `BinaryOperator` (for which we always have promotion/demotion in AST)
or parts of `UnaryOperator` (we have promotion/demotion but only for
certain operations), for inc/dec, clang omits promotion/demotion
altogether, under as-if rule.

This is technically correct: https://rise4fun.com/Alive/zPgD
As it can be seen in `InstCombineCasts.cpp` `canEvaluateTruncated()`,
`add`/`sub`/`mul`/`and`/`or`/`xor` operators can all arbitrarily
be extended or truncated:
901cd3b3f6/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L1320-L1334)

But that has serious implications:
1. Since we no longer model implicit casts, do we pessimise
   their AST representation and everything that uses it?
2. There is no demotion, so lossy demotion sanitizer does not trigger :]

Now, i'm not going to argue about the first problem here,
but the second one **needs** to be addressed. As it was stated
in the report, this is done intentionally, so changing
this in all modes would be considered a penalization/regression.
Which means, the sanitization-less codegen must not be altered.

It was also suggested to not change the sanitized codegen
to the one with demotion, but i quite strongly believe
that will not be the wise choice here:
1. One will need to re-engineer the check that the inc/dec was lossy
   in terms of `@llvm.{u,s}{add,sub}.with.overflow` builtins
2. We will still need to compute the result we would lossily demote.
   (i.e. the result of wide `add`ition/`sub`traction)
3. I suspect it would need to be done right here, in sanitization.
   Which kinda defeats the point of
   using `@llvm.{u,s}{add,sub}.with.overflow` builtins:
   we'd have two `add`s with basically the same arguments,
   one of which is used for check+error-less codepath and other one
   for the error reporting. That seems worse than a single wide op+check.
4. OR, we would need to do that in the compiler-rt handler.
   Which means we'll need a whole new handler.
   But then what about the `CompoundAssignOperator`,
   it would also be applicable for it.
   So this also doesn't really seem like the right path to me.
5. At least X86 (but likely others) pessimizes all sub-`i32` operations
   (due to partial register stalls), so even if we avoid promotion+demotion,
   the computations will //likely// be performed in `i32` anyways.

So i'm not really seeing much benefit of
not doing the straight-forward thing.

While looking into this, i have noticed a few more LLVM middle-end
missed canonicalizations, and filed
[[ https://bugs.llvm.org/show_bug.cgi?id=44100 | PR44100 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=44102 | PR44102 ]].

Those are not specific to inc/dec, we also have them for
`CompoundAssignOperator`, and it can happen for normal arithmetics, too.
But if we take some other path in the patch, it will not be applicable
here, and we will have most likely played ourselves.

TLDR: front-end should emit canonical, easy-to-optimize yet
un-optimized code. It is middle-end's job to make it optimal.

I'm really hoping reviewers agree with my personal assessment
of the path this patch should take..

This originally landed in 9872ea4ed1
but got immediately reverted in cbfa237892
because the assertion was faulty. That fault ended up being caused
by the enum - while there will be promotion, both types are unsigned,
with same width. So we still don't need to sanitize non-signed cases.
So far. Maybe the assert will tell us this isn't so.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44054 | PR44054 ]].
Refs. https://github.com/google/sanitizers/issues/940

Reviewers: rjmccall, erichkeane, rsmith, vsk

Reviewed By: erichkeane

Subscribers: mehdi_amini, dexonsmith, cfe-commits, #sanitizers, llvm-commits, aaron.ballman, t.p.northover, efriedma, regehr

Tags: #llvm, #clang, #sanitizers

Differential Revision: https://reviews.llvm.org/D70539
2019-11-27 21:52:41 +03:00
Roman Lebedev cbfa237892
Revert "[clang][CodeGen] Implicit Conversion Sanitizer: handle increment/decrement (PR44054)"
The asssertion that was added does not hold,
breaks on test-suite/MultiSource/Applications/SPASS/analyze.c
Will reduce the testcase and revisit.

This reverts commit 9872ea4ed1, 870f3542d3.
2019-11-27 17:05:21 +03:00
David Green 9f15fcc271 [ARM] Replace arm_neon_vqadds with sadd_sat
This replaces the A32 NEON vqadds, vqaddu, vqsubs and vqsubu intrinsics
with the target independent sadd_sat, uadd_sat, ssub_sat and usub_sat.
This helps generate vqadds from standard IR nodes, which might be
produced from the vectoriser. The old variants are removed in the
process.

Differential Revision: https://reviews.llvm.org/D69350
2019-11-27 13:32:29 +00:00
Roman Lebedev 9872ea4ed1
[clang][CodeGen] Implicit Conversion Sanitizer: handle increment/decrement (PR44054)
Summary:
Implicit Conversion Sanitizer is *almost* feature complete.
There aren't *that* much unsanitized things left,
two major ones are increment/decrement (this patch) and bit fields.

As it was discussed in
[[ https://bugs.llvm.org/show_bug.cgi?id=39519 | PR39519 ]],
unlike `CompoundAssignOperator` (which is promoted internally),
or `BinaryOperator` (for which we always have promotion/demotion in AST)
or parts of `UnaryOperator` (we have promotion/demotion but only for
certain operations), for inc/dec, clang omits promotion/demotion
altogether, under as-if rule.

This is technically correct: https://rise4fun.com/Alive/zPgD
As it can be seen in `InstCombineCasts.cpp` `canEvaluateTruncated()`,
`add`/`sub`/`mul`/`and`/`or`/`xor` operators can all arbitrarily
be extended or truncated:
901cd3b3f6/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L1320-L1334)

But that has serious implications:
1. Since we no longer model implicit casts, do we pessimise
   their AST representation and everything that uses it?
2. There is no demotion, so lossy demotion sanitizer does not trigger :]

Now, i'm not going to argue about the first problem here,
but the second one **needs** to be addressed. As it was stated
in the report, this is done intentionally, so changing
this in all modes would be considered a penalization/regression.
Which means, the sanitization-less codegen must not be altered.

It was also suggested to not change the sanitized codegen
to the one with demotion, but i quite strongly believe
that will not be the wise choice here:
1. One will need to re-engineer the check that the inc/dec was lossy
   in terms of `@llvm.{u,s}{add,sub}.with.overflow` builtins
2. We will still need to compute the result we would lossily demote.
   (i.e. the result of wide `add`ition/`sub`traction)
3. I suspect it would need to be done right here, in sanitization.
   Which kinda defeats the point of
   using `@llvm.{u,s}{add,sub}.with.overflow` builtins:
   we'd have two `add`s with basically the same arguments,
   one of which is used for check+error-less codepath and other one
   for the error reporting. That seems worse than a single wide op+check.
4. OR, we would need to do that in the compiler-rt handler.
   Which means we'll need a whole new handler.
   But then what about the `CompoundAssignOperator`,
   it would also be applicable for it.
   So this also doesn't really seem like the right path to me.
5. At least X86 (but likely others) pessimizes all sub-`i32` operations
   (due to partial register stalls), so even if we avoid promotion+demotion,
   the computations will //likely// be performed in `i32` anyways.

So i'm not really seeing much benefit of
not doing the straight-forward thing.

While looking into this, i have noticed a few more LLVM middle-end
missed canonicalizations, and filed
[[ https://bugs.llvm.org/show_bug.cgi?id=44100 | PR44100 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=44102 | PR44102 ]].

Those are not specific to inc/dec, we also have them for
`CompoundAssignOperator`, and it can happen for normal arithmetics, too.
But if we take some other path in the patch, it will not be applicable
here, and we will have most likely played ourselves.

TLDR: front-end should emit canonical, easy-to-optimize yet
un-optimized code. It is middle-end's job to make it optimal.

I'm really hoping reviewers agree with my personal assessment
of the path this patch should take..

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44054 | PR44054 ]].

Reviewers: rjmccall, erichkeane, rsmith, vsk

Reviewed By: erichkeane

Subscribers: mehdi_amini, dexonsmith, cfe-commits, #sanitizers, llvm-commits, aaron.ballman, t.p.northover, efriedma, regehr

Tags: #llvm, #clang, #sanitizers

Differential Revision: https://reviews.llvm.org/D70539
2019-11-27 15:39:55 +03:00
Fangrui Song 3bb24bf257 Fix tests on Windows after D49466
It is tricky to use replace_path_prefix correctly on Windows which uses
backslashes as native path separators. Switch back to the old approach
(startswith is not ideal) to appease build bots for now.
2019-11-26 16:15:39 -08:00
Dan McGregor 6c92cdff72 Initial implementation of -fmacro-prefix-map and -ffile-prefix-map
GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map

Reviewed By: rnk, Lekensteyn, maskray

Differential Revision: https://reviews.llvm.org/D49466
2019-11-26 15:17:49 -08:00
Senran Zhang 01d8e09fdb [clang][CodeGen] Fix wrong memcpy size of no_unique_address in FieldMemcpyizer
When generating ctor, FieldMemcpyizer wrongly treated zero-sized class members
as what should be copied, and generated wrong memcpy size under some special
circumstances. This patch tries to fix it.

Reviewed By: MaskRay, rjmccall

Differential Revision: https://reviews.llvm.org/D70671
2019-11-25 18:15:34 -08:00
Peter Collingbourne 90b8bc003c IRGen: Call SetLLVMFunctionAttributes{,ForDefinition} on __cfi_check_fail.
This has the main effect of causing target-cpu and target-features to be set
on __cfi_check_fail, causing the function to become ABI-compatible with other
functions in the case where these attributes affect ABI (e.g. reserve-x18).

Technically we only need to call SetLLVMFunctionAttributes to get the target-*
attributes set, but since we're creating a definition we probably ought to
call the ForDefinition function as well.

Fixes PR44094.

Differential Revision: https://reviews.llvm.org/D70692
2019-11-25 15:16:43 -08:00
Alexey Bataev bbc328c624 [OPENMP]Fix PR41826: symbols visibility in device code.
Summary:
Currently, we ignore all locality attributes/info when building for
the device and thus all symblos are externally visible and can be
preemted at the runtime. It may lead to incorrect results. We need to
follow the same logic, compiler uses for static/pie builds. But in some
cases changing of dso locality may lead to problems with codegen, so
instead mark external symbols as hidden instead in the device code.

Reviewers: jdoerfert

Subscribers: guansong, caomhin, kkwli0, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70549
2019-11-25 15:01:28 -05:00
David Blaikie e956952ede DebugInfo: Flag Dwarf Version metadata for merging during LTO
When the Dwarf Version metadata was initially added (r184276) there was
no support for Module::Max - though the comment suggested that was the
desired behavior. The original behavior was Module::Warn which would
warn and then pick whichever version came first - which is pretty
arbitrary/luck-based if the consumer has some need for one version or
the other.

Now that the functionality's been added (r303590) this change updates
the implementation to match the desired goal.

The general logic here is - if you compile /some/ of your program with a
more recent DWARF version, you must have a consumer that can handle it,
so might as well use it for /everything/.

The only place where this might fall down is if you have a need to use
an old tool (supporting only the older DWARF version) for some subset of
your program. In which case now it'll all be the higher version. That
seems pretty narrow (& the inverse could happen too - you specifically
/need/ the higher DWARF version for some extra expressivity, etc, in
some part of the program)
2019-11-22 17:16:35 -08:00
Reid Kleckner a9cc64e50e Separate the MS inheritance model enum from the attribute, NFC
This avoids the need to include Attr.h in DeclCXX.h for a four-value
enum. Removing the include will be done separately, since it is large
and risky change.
2019-11-22 16:06:30 -08:00
Alexey Bataev 5459a905c2 [OPENMP]Simplify processing of context selectors, NFC. 2019-11-22 11:53:06 -05:00
Alexey Bataev f8ff3d7ebd [OPENMP]Remove unused template parameter, NFC. 2019-11-21 16:42:26 -05:00
Adrian Prantl e0cabe280b Debug info: Emit objc_direct methods as members of their containing class
even in DWARF 4 and earlier. This allows the debugger to recognize
them as direct functions as opposed to Objective-C methods.

<rdar://problem/57327663>

Differential Revision: https://reviews.llvm.org/D70544
2019-11-21 11:01:10 -08:00
Alexey Bataev 4e8231b5cf [OPENMP50]Add device/kind context selector support.
Summary: Added basic parsing/sema support for device/kind context selector.

Reviewers: jdoerfert

Subscribers: rampitec, aheejin, fedor.sergeev, simoncook, guansong, s.egerton, hfinkel, kkwli0, caomhin, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70245
2019-11-21 13:28:11 -05:00
Alexey Bataev 103f3c9e3b [OPENMP50]Add if clause in for simd directive.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
2019-11-21 09:29:12 -05:00
Ehud Katz c63f1b160e [DeclCXX] Remove unknown external linkage specifications
Partial revert of r372681 "Support for DWARF-5 C++ language tags".

The change introduced new external linkage languages ("C++11" and
"C++14") which not supported in C++.

It also changed the definition of the existing enum to use the DWARF
constants. The problem is that "LinkageSpecDeclBits.Language" (the field
that reserves this enum) is actually defined as 3 bits length
(bitfield), which cannot contain the new DWARF constants. Defining the
enum as integer literals is more appropriate for maintaining valid
values.

Differential Revision: https://reviews.llvm.org/D69935
2019-11-21 15:23:05 +02:00
Tim Northover 5cf58768cb Atomics: support min/max orthogonally
We seem to have been gradually growing support for atomic min/max operations
(exposing longstanding IR atomicrmw instructions). But until now there have
been gaps in the expected intrinsics. This adds support for the C11-style
intrinsics (i.e. taking _Atomic, rather than individually blessed by C11
standard), and the variants that return the new value instead of the original
one.

That way, people won't be misled by trying one form and it not working, and the
front-end is more friendly to people using _Atomic types, as we recommend.
2019-11-21 10:37:56 +00:00
Tim Northover e23d6f3184 NeonEmitter: remove special case on casting polymorphic builtins.
For some reason we were not casting a fairly obscure class of builtin calls we
expected to be polymorphic to vectors of char. It worked because the only
affected intrinsics weren't actually polymorphic after all, but is
unnecessarily complicated.
2019-11-20 13:20:02 +00:00
Djordje Todorovic ce1f95a6e0 Reland "[clang] Remove the DIFlagArgumentNotModified debug info flag"
It turns out that the ExprMutationAnalyzer can be very slow when AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, in the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us front-end independent implementation.

Differential Revision: https://reviews.llvm.org/D68206
2019-11-20 10:08:07 +01:00
Alexey Bataev d08c056695 [OPENMP50]Add if clause in simd directive.
According to OpenMP 5.0, if clause can be used in simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
2019-11-19 15:58:19 -05:00
Vedant Kumar 568db780bb [CGDebugInfo] Emit subprograms for decls when AT_tail_call is understood (reland with fixes)
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).

Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.

Pre-patch debug session with a cross-TU tail call:

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

Post-patch (note that the tail-calling frame, "helper", is visible):

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
    frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

This was reverted in 5b9a072c because it attached declaration
subprograms to inlinable builtin calls, which interacted badly with the
MergeICmps pass. The fix is to not attach declarations to builtins.

rdar://46577651

Differential Revision: https://reviews.llvm.org/D69743
2019-11-19 12:49:27 -08:00
Alexey Bataev 1d943ae44c [OPENMP]Rename function, NFC.
Change the name of the CGOpenMPRuntime::emitOMPIfClause to CGOpenMPRuntime::emitIfClause.
2019-11-19 12:27:10 -05:00
Tyker b0561b3346 [NFC] Refactor representation of materialized temporaries
Summary:
this patch refactor representation of materialized temporaries to prevent an issue raised by rsmith in https://reviews.llvm.org/D63640#inline-612718

Reviewers: rsmith, martong, shafik

Reviewed By: rsmith

Subscribers: thakis, sammccall, ilya-biryukov, rnkovacs, arphaman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69360
2019-11-19 18:20:45 +01:00
Matt Arsenault 7fe9435dc8 Work on cleaning up denormal mode handling
Cleanup handling of the denormal-fp-math attribute. Consolidate places
checking the allowed names in one place.

This is in preparation for introducing FP type specific variants of
the denormal-fp-mode attribute. AMDGPU will switch to using this in
place of the current hacky use of subtarget features for the denormal
mode.

Introduce a new header for dealing with FP modes. The constrained
intrinsic classes define related enums that should also be moved into
this header for uses in other contexts.

The verifier could use a check to make sure the denorm-fp-mode
attribute is sane, but there currently isn't one.

Currently, DAGCombiner incorrectly asssumes non-IEEE behavior by
default in the one current user. Clang must be taught to start
emitting this attribute by default to avoid regressions when this is
switched to assume ieee behavior if the attribute isn't present.
2019-11-19 22:01:14 +05:30
Vedant Kumar ea1db31d20 [CodeGen] Assign locations to calls to special struct helpers
Assign artificial locations to calls to special struct-related helper
functions.

Such calls may not inherit a location if emitted within FinishFunction,
at which point the lexical scope stack may be empty, causing CGDebugInfo
to report the current DebugLoc as empty.

Fixes an IR verifier complaint about a call to '__destructor_8_s0' not
having a !dbg location attached.

rdar://57293361
2019-11-18 15:07:59 -08:00
Erich Keane 0213adde21 [NFC] Fix 'target' condition in checkTargetFeatures
checkTargetFeatures was incorrectly checking for cpu_specific instead of
just 'target'. While this function was never called in that situation,
it seemed correct to fix the condition.  Additionally, multiversion
functions can never be always_inline, but if any function accidentially
ended up here we shouldn't diagnose.

Note that the adding of target-features to the list is unnecessary since
the getFunctionFeatureMap actually considers attribute target,
however adding it results in significantly better error messages by
putting the 'target' features first (and thus first to fail).
Otherwise, the error message would be the first feature 'implied' by the
target attribute, and not necessarily the feature listed in the
attribute itself.
2019-11-18 13:43:52 -08:00
Pierre Habouzit d4e1ba3fa9 Implement __attribute__((objc_direct)), __attribute__((objc_direct_members))
__attribute__((objc_direct)) is an attribute on methods declaration, and
__attribute__((objc_direct_members)) on implementation, categories or
extensions.

A `direct` property specifier is added (@property(direct) type name)

These attributes / specifiers cause the method to have no associated
Objective-C metadata (for the property or the method itself), and the
calling convention to be a direct C function call.

The symbol for the method has enforced hidden visibility and such direct
calls are hence unreachable cross image. An explicit C function must be
made if so desired to wrap them.

The implicit `self` and `_cmd` arguments are preserved, however to
maintain compatibility with the usual `objc_msgSend` semantics,
3 fundamental precautions are taken:

1) for instance methods, `self` is nil-checked. On arm64 backends this
   typically adds a single instruction (cbz x0, <closest-ret>) to the
   codegen, for the vast majority of the cases when the return type is a
   scalar.

2) for class methods, because the class may not be realized/initialized
   yet, a call to `[self self]` is emitted. When the proper deployment
   target is used, this is optimized to `objc_opt_self(self)`.

   However, long term we might want to emit something better that the
   optimizer can reason about. When inlining kicks in, these calls
   aren't optimized away as the optimizer has no idea that a single call
   is really necessary.

3) the calling convention for the `_cmd` argument is changed: the caller
   leaves the second argument to the call undefined, and the selector is
   loaded inside the body when it's referenced only.

As far as error reporting goes, the compiler refuses:
- making any overloads direct,
- making an overload of a direct method,
- implementations marked as direct when the declaration in the
  interface isn't (the other way around is allowed, as the direct
  attribute is inherited from the declaration),
- marking methods required for protocol conformance as direct,
- messaging an unqualified `id` with a direct method,
- forming any @selector() expression with only direct selectors.

As warnings:
- any inconsistency of direct-related calling convention when
  @selector() or messaging is used,
- forming any @selector() expression with a possibly direct selector.

Lastly an `objc_direct_members` attribute is added that can decorate
`@implementation` blocks and causes methods only declared there (and in
no `@interface`) to be automatically direct. When decorating an
`@interface` then all methods and properties declared in this block are
marked direct.

Radar-ID: rdar://problem/2684889
Differential Revision: https://reviews.llvm.org/D69991
Reviewed-By: John McCall
2019-11-18 11:48:40 -08:00
Eric Christopher 30e7ee3c4b Temporarily Revert "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior="
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.

This reverts commits af57dbf12e and e6584b2b7b
2019-11-18 10:46:48 -08:00
Alexey Bataev c3eded068c [OPENMP50]Fix PR44024: runtime assert in distribute construct.
If the code is emitted for distribute construct, the nonmonotonic
modifier should not be added.
2019-11-18 11:14:27 -05:00
Sam McCall d27a16eb39 Revert "[DWARF5]Addition of alignment atrribute in typedef DIE."
This reverts commit 423f541c1a, which
breaks llvm-c ABI.
2019-11-18 15:53:22 +01:00
Simon Tatham 4a4dd85e5a [ARM,MVE] Add intrinsics for vector comparisons.
This adds the `vcmp` family of ACLE MVE intrinsics: vector/vector,
vector/scalar, and the predicated forms of both. All are represented
using standard existing IR: vector/scalar comparisons are represented
by making a vector out of the scalar first, and predicated forms are
represented by taking the bitwise AND of the input predicate and the
output of the comparison. Existing LLVM-side tests demonstrate that
ISel will pattern-match all of that back down to single MVE VCMPs.

The idiom of handling a vector/scalar operation by generating IR to
expand the scalar into a second vector is going to be needed for a lot
of MVE intrinsics, so to make that easy, I've provided a helper
function that automatically works out the element count.

The comparison intrinsics are the first ones that have to //return// a
predicate, in the user-facing `mve_pred16_t` format. This means we
have to use the `arm_mve_pred_v2i` low-level intrinsic to convert it
back from the logical `<n x i1>` form used in IR. I've done that
explicitly in the code gen specification for the builtins, because it
happens much more rarely in the ACLE API than passing a Predicate as
input, so it didn't seem worth automating in MveEmitter.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70297
2019-11-18 10:39:30 +00:00
Nico Weber c9276fbfdf Revert "[NFC] Refactor representation of materialized temporaries"
This reverts commit 08ea1ee2db.
It broke ./ClangdTests/FindExplicitReferencesTest.All
on the bots, see comments on https://reviews.llvm.org/D69360
2019-11-17 02:09:25 -05:00
Tyker 08ea1ee2db [NFC] Refactor representation of materialized temporaries
Summary:
this patch refactor representation of materialized temporaries to prevent an issue raised by rsmith in https://reviews.llvm.org/D63640#inline-612718

Reviewers: rsmith, martong, shafik

Reviewed By: rsmith

Subscribers: rnkovacs, arphaman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69360
2019-11-16 17:56:09 +01:00
Sourabh Singh Tomar 423f541c1a [DWARF5]Addition of alignment atrribute in typedef DIE.
This patch, adds support for DW_AT_alignment[DWARF5] attribute, to be emitted with typdef DIE.
When explicit alignment is specified.

Patch by Awanish Pandey <Awanish.Pandey@amd.com>

Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok,
deadalinx

Differential Revision: https://reviews.llvm.org/D70111
2019-11-16 21:56:53 +05:30
Akira Hatanaka 4516dc1c20 Don't add optnone or noinline if the function is already marked as
always_inline.

The assertion in SetLLVMFunctionAttributesForDefinition used to fail
when there was attribute OptimizeNone on the AST function and attribute
always_inline on the IR function. This happens because base destructors
are annotated with always_inline when the code is compiled with
-fapple-kext (see r124757).

rdar://problem/57169694
2019-11-15 15:44:04 -08:00
Alexandre Ganea caf3166d40 Revert "re-land [DebugInfo] Add debug location to stubs generated by CGDeclCXX and mark them as artificial"
This reverts commit 9c1baa2352.
2019-11-15 16:21:17 -05:00
Alexandre Ganea 9c1baa2352 re-land [DebugInfo] Add debug location to stubs generated by CGDeclCXX and mark them as artificial
Differential Revision: https://reviews.llvm.org/D66328
2019-11-15 16:01:39 -05:00
Momchil Velikov aa6d48fa70 Implement target(branch-protection) attribute for AArch64
This patch implements `__attribute__((target("branch-protection=...")))`
in a manner, compatible with the analogous GCC feature:

https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes

Differential Revision: https://reviews.llvm.org/D68711
2019-11-15 15:40:46 +00:00
Serge Pavlov e6584b2b7b Move floating point related entities to namespace level
Enumerations that describe rounding mode and exception behavior were
defined inside ConstrainedFPIntrinsic. It makes sense to use the same
definitions to represent the same properties in other cases, not only
in constrained intrinsics. It was however inconvenient as required to
include constrained intrinsics definitions even if they were not needed.
Also using long scope prefix reduced readability.

This change moves these definitioins to the namespace llvm::fp.
No functional changes.

Differential Revision: https://reviews.llvm.org/D69552
2019-11-15 19:56:33 +07:00
Djordje Todorovic 41d6ad6efd Revert "[clang] Remove the DIFlagArgumentNotModified debug info flag"
This reverts commit rG1643734741d2 due to LLDB test failure.
2019-11-15 12:16:44 +01:00
Djordje Todorovic 1643734741 [clang] Remove the DIFlagArgumentNotModified debug info flag
It turns out that the ExprMutationAnalyzer can be very slow when AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, in the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us front-end independent implementation.

Differential Revision: https://reviews.llvm.org/D68206
2019-11-15 11:10:19 +01:00
Reid Kleckner 4c1a1d3cf9 Add missing includes needed to prune LLVMContext.h include, NFC
These are a pre-requisite to removing #include "llvm/Support/Options.h"
from LLVMContext.h: https://reviews.llvm.org/D70280
2019-11-14 15:23:15 -08:00
Yonghong Song dd16b3fe25 [BPF] Restrict preserve_access_index attribute to C only
This patch is a follow-up for commit 4e2ce228ae
  [BPF] Add preserve_access_index attribute for record definition
to restrict attribute for C only. A new test case is added
to check for this restriction.

Additional code polishing is done based on
Aaron Ballman's suggestion in https://reviews.llvm.org/D69759/new/.

Differential Revision: https://reviews.llvm.org/D70257
2019-11-14 14:14:59 -08:00
Reid Kleckner 1dfede3122 Move CodeGenFileType enum to Support/CodeGen.h
Avoids the need to include TargetMachine.h from various places just for
an enum. Various other enums live here, such as the optimization level,
TLS model, etc. Data suggests that this change probably doesn't matter,
but it seems nice to have anyway.
2019-11-13 16:39:34 -08:00
Yonghong Song 4e2ce228ae [BPF] Add preserve_access_index attribute for record definition
This is a resubmission for the previous reverted commit
9434360401 with the same subject. This commit fixed the
segfault issue and addressed additional review comments.

This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
  struct s { ... } __attribute__((preserve_access_index));
  union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user codes for cases
where preserve access index happens for certain struct/union,
so user does not need to use clang __builtin_preserve_access_index
for every members.

The attribute has no effect if -g is not specified.

When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
  __builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.

The following is an example to illustrate the usage:
  -bash-4.4$ cat t.c
  #define __reloc__ __attribute__((preserve_access_index))
  struct s1 {
    int c;
  } __reloc__;

  struct s2 {
    union {
      struct s1 b[3];
    };
  } __reloc__;

  struct s3 {
    struct s2 a;
  } __reloc__;

  int test(struct s3 *arg) {
    return arg->a.b[2].c;
  }
  -bash-4.4$ clang -target bpf -g -S -O2 t.c

A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.

forward declaration with attribute is also handled properly such
that the attribute is copied and populated in real record definition.

Differential Revision: https://reviews.llvm.org/D69759
2019-11-13 08:23:44 -08:00
Simon Tatham a12f588ebb [ARM,MVE] Add intrinsics for contiguous load/stores.
This patch adds the ACLE intrinsics for all the MVE load and store
instructions not already handled by D69791. These ones don't need new
IR intrinsics, because they can be implemented in terms of standard
LLVM IR constructions.

Some of the load and store instructions access less than 128 bits of
memory, sign/zero extending each value to a wider vector lane on load
or truncating it on store. These are represented in IR by a load of a
shorter vector followed by a zext/sext, and conversely, a trunc
followed by a short store. Existing ISel patterns already recognize
those combinations and turn them into the right MVE instructions.

The predicated forms of all these instructions are represented in the
same way, except that the ordinary load/store operation is replaced
with the existing intrinsics @llvm.masked.{load,store}. These are
currently only code-generated as predicated MVE load/store
instructions if you give LLVM the `-enable-arm-maskedldst` option; so
I've done that in the LLVM codegen test. When we make that the
default, that option can be removed.

In the Tablegen backend, I've had to add a handful of extra support
features:

* We need to be able to make clang::Address objects out of a
  pointer and an alignment (previously we only needed these when the
  user passed us an existing one).

* We can now specify vector types that aren't 128 bits wide (for use
  in those intermediate values in IR), the parametrized type system
  can make one starting from two existing vector types (using the lane
  count of one and the element type of the other).

* I've added support for code generation of pointer casts, and for
  specifying LLVM types as operands to IRBuilder operations (for zext
  and sext, though I think they'll come in useful again).

* Now not all IR construction operations need to be specified as
  Builder.CreateFoo; some don't involve a Builder at all, and one
  passes it as a parameter to a tiny static helper function in
  CGBuiltin.cpp.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Subscribers: kristof.beyls, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70088
2019-11-13 12:47:00 +00:00
Mark de Wever 51abcebbb6 [OpenMP] Use an explicit copy in a range-based for
The std::pair<const clang::ValueDecl *, llvm::ArrayRef<clang::OMPClauseMappableExprCommon::MappableComponent>>
type will be copied in a range-based for loop. Make the copy explicit to
avoid the -Wrange-loop-analysis warning.

This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D70046
2019-11-12 20:50:38 +01:00
Tim Northover 44e5879f0f AArch64: add arm64_32 support to Clang. 2019-11-12 12:45:18 +00:00
Alexey Bataev fde11e9f23 [OPENMP50]Generalize handling of context matching/scoring.
Summary:
Untie context matching/scoring from the attribute for declare variant
directive to simplify future uses in other context-dependent directives.

Reviewers: jdoerfert

Subscribers: guansong, kkwli0, caomhin, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69952
2019-11-11 14:41:10 -05:00
Yonghong Song 9434360401 Revert "[BPF] Add preserve_access_index attribute for record definition"
This reverts commit 4a5aa1a7bf.

There are some other test failures. Investigate them first.
2019-11-09 08:32:44 -08:00
Yonghong Song 4a5aa1a7bf [BPF] Add preserve_access_index attribute for record definition
This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
  struct s { ... } __attribute__((preserve_access_index));
  union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user codes for cases
where preserve access index happens for certain struct/union,
so user does not need to use clang __builtin_preserve_access_index
for every members.

The attribute has no effect if -g is not specified.

When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
  __builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.

The following is an example to illustrate the usage:
  -bash-4.4$ cat t.c
  #define __reloc__ __attribute__((preserve_access_index))
  struct s1 {
    int c;
  } __reloc__;

  struct s2 {
    union {
      struct s1 b[3];
    };
  } __reloc__;

  struct s3 {
    struct s2 a;
  } __reloc__;

  int test(struct s3 *arg) {
    return arg->a.b[2].c;
  }
  -bash-4.4$ clang -target bpf -g -S -O2 t.c

A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.

forward declaration with attribute is also handled properly such
that the attribute is copied and populated in real record definition.

Differential Revision: https://reviews.llvm.org/D69759
2019-11-09 08:17:12 -08:00
Adrian Prantl 901cc4a4bc Debug Info: Nest Objective-C property function decls inside their container.
This has the nice side-effect of also fixing a crash in Clang.

Starting with DWARF 5 we are emitting ObjC method declarations as
children of their containing entity. This worked for interfaces, but
didn't consider the case of synthessized properties. When a property
of a protocol is synthesized in an interface implementation the
ObjCMethodDecl that was passed to CGF::StartFunction was the property
*declaration* which obviously couldn't have a containing
interface. This patch passes the containing interface all the way
through to CGDebugInfo, so the function declaration can be created
with the correct parent (= the class implementing the protocol).

rdar://problem/53782400

Differential Revision: https://reviews.llvm.org/D66121
2019-11-08 15:14:00 -08:00
Adrian Prantl 2073dd2da7 Redeclare Objective-C property accessors inside the ObjCImplDecl in which they are synthesized.
This patch is motivated by (and factored out from)
https://reviews.llvm.org/D66121 which is a debug info bugfix. Starting
with DWARF 5 all Objective-C methods are nested inside their
containing type, and that patch implements this for synthesized
Objective-C properties.

1. SemaObjCProperty populates a list of synthesized accessors that may
   need to inserted into an ObjCImplDecl.

2. SemaDeclObjC::ActOnEnd inserts forward-declarations for all
   accessors for which no override was provided into their
   ObjCImplDecl. This patch does *not* synthesize AST function
   *bodies*. Moving that code from the static analyzer into Sema may
   be a good idea though.

3. Places that expect all methods to have bodies have been updated.

I did not update the static analyzer's inliner for synthesized
properties to point back to the property declaration (see
test/Analysis/Inputs/expected-plists/nullability-notes.m.plist), which
I believed to be more bug than a feature.

Differential Revision: https://reviews.llvm.org/D68108

rdar://problem/53782400
2019-11-08 08:23:22 -08:00
Vedant Kumar b95bb0847a [CodeGenModule] Group blocks runtime globals together, NFC 2019-11-07 12:46:26 -08:00
Melanie Blower af57dbf12e Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=
Add options to control floating point behavior: trapping and
    exception behavior, rounding, and control of optimizations that affect
    floating point calculations. More details in UsersManual.rst.

    Reviewers: rjmccall

    Differential Revision: https://reviews.llvm.org/D62731
2019-11-07 07:22:45 -08:00
Tim Northover 10e0d64337 CodeGen: set correct result for atomic compound expressions
Atomic compound expressions try to use atomicrmw if possible, but this
path doesn't set the Result variable, leaving it to crash in later code
if anything ever tries to use the result of the expression. This fixes
that issue by recalculating the new value based on the old one
atomically loaded.
2019-11-07 13:36:44 +00:00
Hans Wennborg 5b9a072c39 Revert a5c8ec4 "[CGDebugInfo] Emit subprograms for decls when AT_tail_call is understood"
This caused Chromium builds to fail with "inlinable function call in a function
with debug info must have a !dbg location" errors. See
https://bugs.chromium.org/p/chromium/issues/detail?id=1022296#c1 for a
reproducer.

> Currently, clang emits subprograms for declared functions when the
> target debugger or DWARF standard is known to support entry values
> (DW_OP_entry_value & the GNU equivalent).
>
> Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
> tail calls.
>
> Pre-patch debug session with a cross-TU tail call:
>
> ```
>   * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
>     frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> Post-patch (note that the tail-calling frame, "helper", is visible):
>
> ```
>   * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
>     frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
>     frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> rdar://46577651
>
> Differential Revision: https://reviews.llvm.org/D69743
2019-11-07 10:30:07 +01:00
Alexey Bataev dcec2ac4f3 [OPENMP50]Simplify processing of context selector scores.
If the context selector score was not specified, its value must be set
to 0. Simplify the processing of unspecified scores + save memory in
attribute representation.
2019-11-05 15:59:22 -05:00
Michael Liao 0a220de9e9 [HIP] Fix visibility for 'extern' device variables.
Summary:
- Fix a bug which misses the change for a variable to be set with
  target-specific attributes.

Reviewers: yaxunl

Subscribers: jvesely, nhaehnle, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D63020
2019-11-05 14:19:32 -05:00
Michael Liao 15140e4bac [hip] Enable pointer argument lowering through coercing type.
Reviewers: tra, rjmccall, yaxunl

Subscribers: jvesely, nhaehnle, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69826
2019-11-05 13:05:05 -05:00
Alexey Bataev 7b710a4294 [OPENMP]Improve diagnostics for unsupported unified addressing.
Improved diagnostics for better user experience.
2019-11-05 10:31:59 -05:00
Jonas Paulsson 9376714314 [Clang FE] Recognize -mnop-mcount CL option (SystemZ only).
Recognize -mnop-mcount from the command line and add a function attribute
"mnop-mcount"="true" when passed.

When this option is used, a nop is added instead of a call to fentry. This
is used when building the Linux Kernel.

If this option is passed for any other target than SystemZ, an error is
generated.

Review: Ulrich Weigand
https://reviews.llvm.org/D67763
2019-11-05 12:12:36 +01:00
Vedant Kumar a5c8ec4baa [CGDebugInfo] Emit subprograms for decls when AT_tail_call is understood
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).

Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.

Pre-patch debug session with a cross-TU tail call:

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

Post-patch (note that the tail-calling frame, "helper", is visible):

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
    frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

rdar://46577651

Differential Revision: https://reviews.llvm.org/D69743
2019-11-04 15:14:24 -08:00
Alexey Bataev 8bbf2e3716 [OPENMP50]Support for imperfectly nested loops.
Added support for imperfectly nested loops introduced in OpenMP 5.0.
2019-11-04 16:09:25 -05:00
Amy Huang ab76cfdd20 Recommit "[CodeView] Add option to disable inline line tables."
This reverts commit 004ed2b0d1.
Original commit hash 6d03890384

Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.

https://reviews.llvm.org/D67723
2019-11-04 09:15:26 -08:00
Craig Topper 910718bd03 [opaque pointer types] Add element type argument to IRBuilder CreatePreserveStructAccessIndex and CreatePreserveArrayAccessIndex
Summary:
These were the only remaining users of the GetElementPtrInst::getGEPReturnType
method that gets the element type from the pointer type.

Remove that method since its now dead.

Reviewers: jyknight, t.p.northover, arsenm

Reviewed By: arsenm

Subscribers: wdng, arsenm, arphaman, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69756
2019-11-03 10:27:18 -08:00
Luboš Luňák 4f2104c5ad make -ftime-trace also trace time spent creating debug info
Differential Revision: https://reviews.llvm.org/D69750
2019-11-02 18:12:51 +01:00
Sourabh Singh Tomar 52ea308f70 [NFC]: Removed an implicit capture argument from lambda. 2019-11-02 01:37:46 +05:30
Thomas Lively 935c84c3c2 [WebAssembly] Add experimental SIMD dot product instruction
Summary:
This instruction is not merged to the spec proposal, but we need it to
be implemented in the toolchain to experiment with it. It is available
only on an opt-in basis through a clang builtin.

Defined in https://github.com/WebAssembly/simd/pull/127.

Depends on D69696.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69697
2019-11-01 10:45:48 -07:00
Thomas Lively a07019a275 [WebAssembly] SIMD integer min and max instructions
Summary:
Introduces a clang builtins and LLVM intrinsics representing integer
min/max instructions. These instructions have not been merged to the
SIMD spec proposal yet, so they are currently opt-in only via builtins
and not produced by general pattern matching. If these instructions
are accepted into the spec proposal the builtins and intrinsics will
be replaced with normal pattern matching.

Defined in https://github.com/WebAssembly/simd/pull/27.

Reviewers: aheejin

Reviewed By: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69696
2019-10-31 20:22:11 -07:00
Matt Arsenault c6da9ec0e9 clang: Fix assert on void pointer arithmetic with address_space
This attempted to always use the default address space void pointer
type instead of preserving the source address space.
2019-10-31 20:07:23 -07:00
Yaxun (Sam) Liu bb1616ba47 [CodeGen] Fix invalid llvm.linker.options about pragma detect_mismatch
When a target does not support pragma detect_mismatch, an llvm.linker.options
metadata with an empty entry is created, which causes diagnostic in backend
since backend expects name/value pair in llvm.linker.options entries.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D69678
2019-10-31 22:27:35 -04:00
David Candler 92aa0c2dbc [cfi] Add flag to always generate .debug_frame
This adds a flag to LLVM and clang to always generate a .debug_frame
section, even if other debug information is not being generated. In
situations where .eh_frame would normally be emitted, both .debug_frame
and .eh_frame will be used.

Differential Revision: https://reviews.llvm.org/D67216
2019-10-31 09:48:30 +00:00
Akira Hatanaka c1d2927cc6 Run clang-format on lib/CodeGen/CGCall.h and fix indentation 2019-10-30 18:06:12 -07:00
Amy Huang 004ed2b0d1 Revert "[CodeView] Add option to disable inline line tables."
because it breaks compiler-rt tests.

This reverts commit 6d03890384.
2019-10-30 17:31:12 -07:00