Commit Graph

13743 Commits

Sriraman Tallam 7d0bbe4090 Re-apply https://reviews.llvm.org/D87921, which was reverted to triage a PPC bot failure.
D87921 was reverted in commit b89059a313
as it was causing an unknown llvm PPC bot failure. Reapplying the patch
after confirming that it is not responsible. The build bot failure that
caused the revert: https://reviews.llvm.org/D87921#2286644

The wrong placement of the addPass call with optimizations enabled led to
-funique-internal-linkage-names being disabled.

Fixed the placement of the MPM.addPass call for UniqueInternalLinkageNames to
make it work correctly with -O2 and the new pass manager. Updated the tests to
explicitly check O0 and O1.

Differential Revision: https://reviews.llvm.org/D87921
2020-09-23 10:28:40 -07:00
Yaxun (Sam) Liu 301e23305d [CUDA][HIP] Fix static device var used by host code only
A static device variable may be accessed in host code through
cudaMemcpyFromSymbol etc. Currently clang does not
emit the static device variable if it is only referenced by
host code, which causes host code to fail at run time.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D88115
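
A minimal sketch of the failure mode this fixes (not from the patch; the
variable name and values are hypothetical). Before the fix, clang could
omit `counter` because no device code references it, so the host-side
copy below fails at run time. CUDA C++:

```
#include <cuda_runtime.h>
#include <cstdio>

static __device__ int counter = 42;  // referenced only by host code

int main() {
  int host_val = 0;
  // Host-side access through the runtime symbol API; this fails if the
  // static device variable was never emitted.
  cudaError_t err = cudaMemcpyFromSymbol(&host_val, counter, sizeof(int));
  std::printf("counter = %d (%s)\n", host_val, cudaGetErrorString(err));
  return 0;
}
```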
2020-09-23 08:18:19 -04:00
Mircea Trofin cf112382dd [ThinLTO] Option to bypass function importing.
This completes the circle, complementing -lto-embed-bitcode
(specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips
function importing. The index file is still needed for the other data it
contains.

Differential Revision: https://reviews.llvm.org/D87949
2020-09-22 13:12:11 -07:00
Sriraman Tallam b89059a313 Revert "The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled."
This reverts commit 6950db36d3.
2020-09-22 12:32:43 -07:00
Zequan Wu 9caa3fbe03 [Coverage] Add empty line regions to SkippedRegions
Differential Revision: https://reviews.llvm.org/D84988
2020-09-21 12:42:53 -07:00
Reid Kleckner 3b3a165485 [MS] On x86_32, pass overaligned, non-copyable arguments indirectly
This updates the C++ ABI argument classification code to use the logic
from D72114, fixing an ABI incompatibility with MSVC.

Part of PR44395.

Differential Revision: https://reviews.llvm.org/D87923
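
A sketch of the kind of argument affected, with a representative type
(not taken from the patch's tests): overaligned and not trivially
copyable, which MSVC on x86_32 passes indirectly.

```
// On i686-pc-windows-msvc, `s` is now passed by address, matching MSVC.
struct alignas(16) S {
  S(const S &other);  // user-provided copy ctor: not trivially copyable
  int x;
};

int consume(S s);
```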
2020-09-21 11:49:17 -07:00
Sriraman Tallam 6950db36d3 The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled.
Fixed the placement of the MPM.addPass call for UniqueInternalLinkageNames to
make it work correctly with -O2 and the new pass manager. Updated the tests to
explicitly check O0 and O2.

Previously, the addPass call was placed before BackendUtil.cpp#L1373, which is
wrong: MPM gets assigned at that point, so any passes added to it beforehand
are discarded. This change moves the call after MPM is assigned and places it
at a point that O0 and higher optimization levels can share.

Differential Revision: https://reviews.llvm.org/D87921
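
A simplified sketch of the ordering issue, assuming the pipeline is built
roughly as in BackendUtil.cpp (names abbreviated; not the verbatim code):

```
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Transforms/Utils/UniqueInternalLinkageNames.h"
using namespace llvm;

ModulePassManager buildPipeline(PassBuilder &PB,
                                PassBuilder::OptimizationLevel Level) {
  ModulePassManager MPM;
  // Wrong (old code): a pass added here is thrown away when MPM is
  // reassigned just below.
  //   MPM.addPass(UniqueInternalLinkageNamesPass());
  MPM = PB.buildPerModuleDefaultPipeline(Level);
  // Fixed: add the pass once MPM holds the real pipeline, at a point
  // that O0 and higher levels can share.
  MPM.addPass(UniqueInternalLinkageNamesPass());
  return MPM;
}
```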
2020-09-21 10:00:12 -07:00
Alexey Bataev d5ce8233bf [OpenMP 5.0] Fix user-defined mapper privatization in tasks
This patch fixes the problem that the user-defined mapper array is not correctly privatized inside a task, which causes openmp/libomptarget/test/offloading/target_depend_nowait.cpp to fail.

Differential Revision: https://reviews.llvm.org/D84470
2020-09-17 11:21:10 -04:00
Michael Liao 4d4f092283 [clang][codegen] Skip adding default function attributes on intrinsics.
- After loading builtin bitcode for linking, skip adding default
  function attributes on LLVM intrinsics, as their attributes are
  well-defined and retrieved directly from internal definitions. Adding
  extra attributes on intrinsics produces inconsistent results when
  `-save-temps` is present. It also makes a few optimizations more
  conservative.

Differential Revision: https://reviews.llvm.org/D87761
2020-09-16 14:10:05 -04:00
Sam McCall f5c7102dbc Update dead links to Itanium and ARM ABIs. NFC 2020-09-16 13:42:01 +02:00
Simon Pilgrim 4abb5cd839 CGBlocks.cpp - assert non-null CGF pointer. NFCI.
Fixes static analyzer warning.
2020-09-16 12:30:24 +01:00
Mircea Trofin 61fc10d6a5 [ThinLTO] add post-thinlto-merge option to -lto-embed-bitcode
This will embed bitcode after (Thin)LTO merge, but before optimizations.
When the thinlto backend is called from clang, the .llvmcmd
section is also produced. There is not yet a motivation for doing so when
the caller is the linker, and it would require plumbing through
command line args.

Differential Revision: https://reviews.llvm.org/D87636
2020-09-15 15:56:11 -07:00
Alexey Bataev 9e3842d603 [OPENMP]Fix codegen for is_device_ptr component, captured by reference.
Need to map the component as TO instead of the literal, because we need to
pass a reference to the component if the pointer is overaligned.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D84887
2020-09-15 17:21:38 -04:00
Snehasish Kumar f1a3ab9044 [clang] Add a command line flag for the Machine Function Splitter.
This patch adds a command line flag for the machine function splitter
(added in rG94faadaca4e1).

-fsplit-machine-functions
Split machine functions using profile information (x86 ELF). On
other targets an error is emitted. If profile information is not
provided, a warning is emitted notifying the user that profile
information is required.

Differential Revision: https://reviews.llvm.org/D87047
2020-09-15 12:41:58 -07:00
Zequan Wu f975ae4867 [CodeGen][typeid] Emit typeinfo directly if type is known at compile-time
Differential Revision: https://reviews.llvm.org/D87425
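
One case this plausibly covers, assuming `final` classes count as
compile-time-known types (the one-line summary does not enumerate the
cases):

```
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct Derived final : Base {};

const std::type_info &typeOf(Derived &d) {
  // The dynamic type of `d` is known statically, so the typeinfo can be
  // referenced directly instead of loaded through the vtable.
  return typeid(d);
}
```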
2020-09-15 12:15:47 -07:00
Alexey Bataev 738bab743b [OPENMP]Add support for allocate vars in untied tasks.
Local vars marked with the allocate pragma must be allocated by a call to
the runtime function and cannot be allocated like other local variables.
Instead, we allocate space for the pointer in the private record and store
the address returned by the kmpc_alloc call in this pointer.
So, for untied tasks

```
 #pragma omp task untied
 {
   S s;
    #pragma omp allocate(s) allocator(allocator)
   s = x;
 }
```
compiler generates something like this:
```
struct task_with_privates {
  S *ptr;
};

void entry(task_with_privates *p) {
  S *s = p->ptr;
  switch(partid) {
  case 1:
    // Allocate space for s via the runtime and stash the pointer.
    p->ptr = (S*)kmpc_alloc();
    kmpc_omp_task();
    br exit;
  case 2:
    *s = x;
    kmpc_omp_task();
    br exit;
  case 3:
    // Destroy s and free the runtime-allocated space.
    ~S(s);
    kmpc_free((void*)s);
    br exit;
  }
exit:
}
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86558
2020-09-15 13:39:14 -04:00
Simon Pilgrim 9eab73fa17 [X86] Update SSE/AVX integer MINMAX intrinsics to emit llvm.smax.* etc. (PR46851)
We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.smax/llvm.smin/llvm.umax/llvm.umin intrinsics.

This patch updates the SSE/AVX integer MINMAX intrinsics to emit the generic equivalents instead of the icmp+select code pattern.

Differential Revision: https://reviews.llvm.org/D87603
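
A hedged sketch of the change in IRBuilder terms (simplified; not the
actual CGBuiltin.cpp code):

```
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
using namespace llvm;

Value *emitSMax(IRBuilder<> &B, Value *A, Value *C) {
  // Old pattern: icmp + select.
  //   Value *Cmp = B.CreateICmpSGT(A, C);
  //   return B.CreateSelect(Cmp, A, C);
  // New pattern: a single generic intrinsic the optimizer understands.
  return B.CreateBinaryIntrinsic(Intrinsic::smax, A, C);
}
```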
2020-09-15 11:19:08 +01:00
Teresa Johnson 226d80ebe2 [MemProf] Rename HeapProfiler to MemProfiler for consistency
This is consistent with the clang option added in
7ed8124d46, and the comments on the
runtime patch in D87120.

Differential Revision: https://reviews.llvm.org/D87622
2020-09-14 13:14:57 -07:00
Simon Pilgrim 3b7708e2de Assert we've found the size of each (non-overlapping) structure. NFCI.
Fixes clang static analyzer warning.
2020-09-14 16:10:52 +01:00
Serge Pavlov f1cd6593da [AST][FPEnv] Keep FP options in trailing storage of CastExpr
This is a recommit of 6c8041aa0f, reverted in de044f7562 because of some
test failures. The original commit message is below.

This change allows a CastExpr to have an optional FPOptionsOverride object,
stored in trailing storage. Of all cast nodes only ImplicitCastExpr,
CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed
to have FPOptions.

Differential Revision: https://reviews.llvm.org/D85960
2020-09-14 12:15:21 +07:00
Florian Hahn a874d63344 [Clang] Add option to allow marking pass-by-value args as noalias.
After the recent discussion on cfe-dev 'Can indirect class parameters be
noalias?' [1], it seems like using noalias is problematic for
current C++ but should be allowed for C-only code.

This patch introduces a new option to let the user indicate that it is
safe to mark indirect class parameters as noalias. Note that this also
applies to external callers, e.g. it might not be safe to use this flag
for C functions that are called by C++ functions.

In targets that allocate indirect arguments in the called function, this
enables more aggressive optimizations with respect to memory operations
and brings a ~1%-2% code size reduction for some programs.

[1] : http://lists.llvm.org/pipermail/cfe-dev/2020-July/066353.html

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D85473
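
An illustration, assuming the option name -fpass-by-value-is-noalias from
D85473 (the struct here is hypothetical):

```
// A C-style aggregate passed by value. On targets that pass it
// indirectly, the option marks the hidden pointer argument `noalias`,
// enabling the memory optimizations described above.
struct Big { int data[32]; };

int sum(struct Big b) {
  int s = 0;
  for (int i = 0; i < 32; ++i) s += b.data[i];
  return s;
}
```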
2020-09-12 14:56:13 +01:00
Tyker 78de7297ab Reland [AssumeBundles] Use operand bundles to encode alignment assumptions
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complementary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignment
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as an "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.
2020-09-12 15:36:06 +02:00
Serge Pavlov de044f7562 Revert "[AST][FPEnv] Keep FP options in trailing storage of CastExpr"
This reverts commit 6c8041aa0f.
It caused some failures on buildbots.
2020-09-12 17:06:42 +07:00
Serge Pavlov 6c8041aa0f [AST][FPEnv] Keep FP options in trailing storage of CastExpr
This change allows a CastExpr to have an optional FPOptionsOverride object,
stored in trailing storage. Of all cast nodes only ImplicitCastExpr,
CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed
to have FPOptions.

Differential Revision: https://reviews.llvm.org/D85960
2020-09-12 14:30:44 +07:00
Cullen Rhodes 002f5ab3b1 [clang][aarch64] Fix ILP32 ABI for arm_sve_vector_bits
The element types of scalable vectors are defined in terms of stdint
types in the ACLE. This patch fixes the mapping to builtin types for the
ILP32 ABI when creating VLS types with the arm_sve_vector_bits attribute, where
the mapping is as follows:

  int32_t -> LongTy
  int64_t -> LongLongTy
  uint32_t -> UnsignedLongTy
  uint64_t -> UnsignedLongLongTy

This is implemented by leveraging getBuiltinVectorTypeInfo which is
target agnostic since it calls ASTContext::getIntTypeForBitwidth for
integer types. The element type for svfloat16_t is changed from
Float16Ty to HalfTy when creating VLS types since this is what is used
elsewhere.

For more information, see:

https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#types-varying-by-data-model
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-support-for-scalable-vectors

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87358
2020-09-11 09:46:35 +00:00
Michael Liao b22d450496 Remove dependency on clangASTMatchers.
- It seems no longer required for shared library builds.
2020-09-10 22:17:48 -04:00
Mark de Wever 08196e0b2e Implements [[likely]] and [[unlikely]] in IfStmt.
This is the initial part of the implementation of the C++20 likelihood
attributes. It handles the attributes in an if statement.

Differential Revision: https://reviews.llvm.org/D85091
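
A minimal usage example of what is now handled (C++20 attributes on the
branches of an if statement):

```
int classify(int x) {
  if (x == 0) [[unlikely]]
    return 0;
  else [[likely]]
    return 1;
}
```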
2020-09-09 20:48:37 +02:00
Qiu Chaofan 88ff4d2ca1 [PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering
In the standard C library, both rint and nearbyint return the rounded
result in the current rounding mode, but nearbyint never raises the
inexact exception. On PowerPC, x(v|s)r(d|s)pic may modify FPSCR XX,
raising the inexact exception, so we can't select constrained
fnearbyint into xvrdpic.

One exception here is xsrqpi, which will not raise the inexact exception,
so fnearbyint f128 is okay here.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87220
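
For reference, the C library contract behind this (standard semantics,
not code from the patch):

```
#include <cfenv>
#include <cmath>

double roundBoth(double x) {
  std::feclearexcept(FE_INEXACT);
  double a = std::nearbyint(x);  // must not raise FE_INEXACT
  double b = std::rint(x);       // may raise FE_INEXACT
  return a + b;
}
```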
2020-09-09 22:40:58 +08:00
Ties Stuij d6f3f61231 Revert "[ARM] Follow AAPCS standard for volatile bit-fields access width"
This reverts commit 514df1b2bb.

Some of the buildbots got llvm-lit errors on CodeGen/volatile.c
2020-09-08 18:46:27 +01:00
Ties Stuij 514df1b2bb [ARM] Follow AAPCS standard for volatile bit-fields access width
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the width of their
declared type. In such a case:
```
struct S1 {
  short a : 1;
};
```
should be accessed using loads and stores of the width of its
declared type (sizeof(short)), whereas the compiler currently only loads
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite adjacent non-volatile members, which
conflicted with the C and C++ object models by creating
data races on memory locations that are not part of the bit-field,
e.g.
```
struct S2 {
  short a;
  int  b : 16;
};
```
Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2020Q2
(https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=)
section 8.1 Data Types, page 36, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths of bit-fields where the container
overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field
placed between two other bit-fields. This is because the C/C++ memory model defines these as being
separate memory locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including splitting the access into
multiple instructions) to avoid writing to a different memory location. For example, in
struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };,
writes to a or b must not overwrite each other.
```

Patch D16586 was updated to follow this behavior by verifying that we
only change volatile bit-field access when:
 - it won't overlap with any other non-bit-field member
 - we only access memory inside the bounds of the record
 - it doesn't overlap any zero-length bit-field.

Preserving the number of memory accesses will be implemented separately
in D67399.

Differential Revision: https://reviews.llvm.org/D72932

The following people contributed to this patch:
- Diogo Sampaio
- Ties Stuij
2020-09-08 17:49:49 +01:00
Simon Pilgrim 58970eb7d1 [OpenMP] Fix typo in CodeGenFunction::EmitOMPWorksharingLoop (PR46412)
Fixes an issue noticed by static analysis where we had a copy+paste typo, testing ScheduleKind.M1 twice instead of ScheduleKind.M2.

Differential Revision: https://reviews.llvm.org/D87250
2020-09-08 11:59:38 +01:00
Simon Pilgrim 2853ae3c1b [X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851)
We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.abs.* intrinsics.

This patch updates the SSE/AVX ABS vector intrinsics to emit the generic equivalents instead of the icmp+sub+select code pattern.

Differential Revision: https://reviews.llvm.org/D87101
2020-09-07 13:54:12 +01:00
Simon Pilgrim a8a91533dd [X86] Replace EmitX86AddSubSatExpr with EmitX86BinaryIntrinsic generic helper. NFCI.
Feed the Intrinsic::ID value directly instead of via the IsSigned/IsAddition bool flags.
2020-09-07 13:33:48 +01:00
Eduardo Caldas 1a7a2cd747 [Ignore Expressions][NFC] Refactor to better use `IgnoreExpr.h` and nits
This change groups
* Rename: `ignoreParenBaseCasts` -> `IgnoreParenBaseCasts` for uniformity
* Rename: `IgnoreConversionOperator` -> `IgnoreConversionOperatorSingleStep` for uniformity
* Inline `IgnoreNoopCastsSingleStep` into a lambda inside `IgnoreNoopCasts`
* Refactor `IgnoreUnlessSpelledInSource` to make adequate use of `IgnoreExprNodes`

Differential Revision: https://reviews.llvm.org/D86880
2020-09-07 09:32:30 +00:00
Amy Huang aaf1a96408 [DebugInfo] Add size to class declarations in debug info.
This adds the size to forward declared class DITypes, if the size is known.

Fixes an issue where we determine whether to emit fragments based on the
type size, so fragments would sometimes be incorrectly emitted if there
was no size.

Bug: https://bugs.llvm.org/show_bug.cgi?id=47338

Differential Revision: https://reviews.llvm.org/D87062
2020-09-03 15:42:27 -07:00
Yaxun (Sam) Liu 62dbb7e54c Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Temporarily revert commit 04abbb3a78
due to regressions in some HIP apps caused by backend issues revealed by
this change.

Will re-commit it when backend issues are fixed.
2020-09-02 16:12:28 -04:00
Erik Pilkington 2d11ae0a40 Fix a -Wparenthesis warning in 8ff44e644b, NFC 2020-09-02 15:01:54 -04:00
Erik Pilkington 8ff44e644b [IRGen] Fix an assert when __attribute__((used)) is used on an ObjC method
This assert doesn't really make sense for functions in general, since they
start life as declarations, and there isn't really any reason to require them
to be defined before attributes are applied to them.

rdar://67895846
2020-09-02 12:19:11 -04:00
Erik Pilkington a9a6e62ddf [CodeGen] Make sure the EH cleanup for block captures is conditional when the block literal is in a conditional context
Previously, clang was crashing on the attached test because the EH cleanup for
the block capture was incorrectly emitted under the assumption that the
expression wasn't conditionally evaluated. This was because before 9a52de00260,
pushLifetimeExtendedDestroy was mainly used with C++ automatic lifetime
extension, where a conditionally evaluated expression wasn't possible. Now that
we're using this path for block captures, we need to handle this case.

rdar://66250047

Differential revision: https://reviews.llvm.org/D86854
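
A hypothetical reduction of the crash pattern (the attached test is not
shown here): the block literal sits in a conditionally evaluated
expression, so the EH cleanup for the copied capture must be conditional
too. Assumes -fblocks and -fexceptions.

```
struct Capture {
  Capture(const Capture &);
  ~Capture();  // the EH cleanup destroys the block's copy on unwind
};

void take(void (^blk)(void));

void maybeCapture(bool cond, Capture c) {
  // The block, and its copy of `c`, is only created when `cond` is true.
  take(cond ? ^{ (void)&c; } : (void (^)(void))nullptr);
}
```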
2020-08-31 10:12:17 -04:00
Arnold Schwaighofer 41634497d4 Teach the swift calling convention about _Atomic types
rdar://67351073

Differential Revision: https://reviews.llvm.org/D86218
2020-08-31 07:07:25 -07:00
Fangrui Song b5ef137c11 [gcov] Increment counters with atomicrmw if -fsanitize=thread
Without this patch, `clang --coverage -fsanitize=thread` may fail spuriously
because non-atomic counter increments can be detected as data races.
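
A sketch of the spurious report this avoids (hypothetical program): both
threads run the same covered code, so gcov bumps the same counter, and
non-atomic increments look like a data race to TSan.

```
// Build roughly as: clang++ --coverage -fsanitize=thread repro.cpp
#include <thread>

void covered() { /* the gcov counter for this line is shared */ }

int main() {
  std::thread t1(covered), t2(covered);
  t1.join();
  t2.join();
  return 0;
}
```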
2020-08-28 16:32:35 -07:00
Cullen Rhodes 2ddf795e8c Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
This relands D85743 with a fix for test
CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass
manager with '-fno-experimental-new-pass-manager'. The test was failing due
to IR differences with the new pass manager, which broke the Fuchsia
builder [1]. Reverted in 2e7041f.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375

Original summary:

This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme; it will be specified in the next revision. The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-28 15:57:09 +00:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
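
A short sketch of the accessor-based usage (the accessors are named in
the message; the surrounding code is illustrative):

```
#include "llvm/Support/TypeSize.h"

void inspect(llvm::ElementCount EC) {
  unsigned MinElts = EC.getKnownMinValue();  // previously the Min member
  bool Scalable = EC.isScalable();           // previously Scalable
  (void)MinElts;
  (void)Scalable;
}
```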
2020-08-28 14:43:53 +01:00
JF Bastien 82d29b397b Add an unsigned shift base sanitizer
It's not undefined behavior for an unsigned left shift to overflow (i.e. to
shift bits out), but it has been the source of bugs and exploits in certain
codebases in the past. As in other parts of UBSan, this patch adds a dynamic
checker that goes beyond undefined behavior and catches other sources of
errors. The option is enabled as part of -fsanitize=integer.

The flag is named: -fsanitize=unsigned-shift-base
This matches the existing shift-base and shift-exponent flags.

<rdar://problem/46129047>

Differential Revision: https://reviews.llvm.org/D86000
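
An example of what the new check flags (well-defined in C and C++, but
often unintended):

```
// Build sketch: clang -fsanitize=unsigned-shift-base shift.c
unsigned scale(unsigned x) {
  return x << 24;  // flagged at run time when x > 0xFF: bits shifted out
}
```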
2020-08-27 19:50:10 -07:00
Cullen Rhodes 2e7041fdc2 Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders
[1][2]. Reverting whilst I investigate.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
[2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112

This reverts commit 42587345a3.
2020-08-27 21:31:05 +00:00
Craig Topper 17ceda99d3 [CodeGen] Use an AttrBuilder to bulk remove 'target-cpu', 'target-features', and 'tune-cpu' before re-adding in CodeGenModule::setNonAliasAttributes.
I think the removeAttributes interface should be faster than
calling removeAttribute 3 times.
2020-08-27 12:54:20 -07:00
Mikhail Maltsev ae1396c7d4 [ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
This patch adjusts the following ARM/AArch64 LLVM IR intrinsics:
- neon_bfmmla
- neon_bfmlalb
- neon_bfmlalt
so that they take and return bf16 and float types. Previously these
intrinsics used <8 x i8> and <4 x i8> vectors (a remnant from an
implementation lacking the bf16 IR type).

The neon_vbfdot[q] intrinsics are adjusted similarly. This change
required some additional selection patterns for vbfdot itself and
also for vector shuffles (in a previous patch) because of SelectionDAG
transformations kicking in and mangling the original code.

This patch makes the generated IR cleaner (less useless bitcasts are
produced), but it does not affect the final assembly.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D86146
2020-08-27 18:43:16 +01:00
Teresa Johnson 7ed8124d46 [HeapProf] Clang and LLVM support for heap profiling instrumentation
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow-on work.

Differential Revision: https://reviews.llvm.org/D85948
2020-08-27 08:50:35 -07:00
Cullen Rhodes 42587345a3 [CodeGen][AArch64] Support arm_sve_vector_bits attribute
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme; it will be specified in the next revision. The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-27 15:11:58 +00:00
Sander de Smalen 4e9b66de3f [AArch64][SVE] Add missing debug info for ACLE types.
This patch adds type information for SVE ACLE vector types,
by describing them as vectors, with a lower bound of 0, and
an upper bound described by a DWARF expression using the
AArch64 Vector Granule register (VG), which contains the
runtime multiple of 64-bit granules in an SVE vector.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86101
2020-08-27 10:56:42 +01:00