Commit Graph

14947 Commits

Author SHA1 Message Date
Nikita Popov caff8591ef [OpenMP] Simplify pointer comparison
Rather than checking ptrdiff(a, b) != 0, directly check a != b.
2022-01-25 12:38:37 +01:00
Nikita Popov 99adacbcb7 [clang] Remove some getPointerElementType() uses
Same cases where the call can be removed in a straightforward way.
2022-01-25 12:09:06 +01:00
Nikita Popov aa97bc116d [NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().

This is part of D117885, in preparation for deprecating the API.
2022-01-25 09:44:52 +01:00
Simon Pilgrim e4074432d5 [X86] Remove avx512f integer and/or/xor/min/max reduction intrinsics and use generic equivalents
None of these have any reordering issues, and they still emit the same reduction intrinsics without any change in the existing test coverage:

llvm-project\clang\test\CodeGen\X86\avx512-reduceIntrin.c
llvm-project\clang\test\CodeGen\X86\avx512-reduceMinMaxIntrin.c

Differential Revision: https://reviews.llvm.org/D117881
2022-01-24 11:57:53 +00:00
Simon Pilgrim 3e50593b18 [X86] Remove `__builtin_ia32_pmax/min` intrinsics and use generic `__builtin_elementwise_max/min`
D111985 added the generic `__builtin_elementwise_max` and `__builtin_elementwise_min` intrinsics with the same integer behaviour as the SSE/AVX instructions

This patch removes the `__builtin_ia32_pmax/min` intrinsics and just uses `__builtin_elementwise_max/min` - the existing tests see no changes:
```
__m256i test_mm256_max_epu32(__m256i a, __m256i b) {
  // CHECK-LABEL: test_mm256_max_epu32
  // CHECK: call <8 x i32> @llvm.umax.v8i32(<8 x i32> %{{.*}}, <8 x i32> %{{.*}})
  return _mm256_max_epu32(a, b);
}
```
This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`).

Sibling patch to D117791

Differential Revision: https://reviews.llvm.org/D117798
2022-01-24 11:40:29 +00:00
Simon Pilgrim e5147f82e1 [X86] Remove __builtin_ia32_pabs intrinsics and use generic __builtin_elementwise_abs
D111986 added the generic `__builtin_elementwise_abs()` intrinsic with the same integer absolute behaviour as the SSE/AVX instructions (abs(INT_MIN) == INT_MIN)

This patch removes the `__builtin_ia32_pabs*` intrinsics and just uses `__builtin_elementwise_abs` - the existing tests see no changes:
```
__m256i test_mm256_abs_epi8(__m256i a) {
  // CHECK-LABEL: test_mm256_abs_epi8
  // CHECK: [[ABS:%.*]] = call <32 x i8> @llvm.abs.v32i8(<32 x i8> %{{.*}}, i1 false)
  return _mm256_abs_epi8(a);
}
```
This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`).

Differential Revision: https://reviews.llvm.org/D117791
2022-01-24 11:25:21 +00:00
Wei Wang 55d887b833 [time-trace] Add optimizer and codegen regions to NPM
Optimizer and codegen regions were only added to legacy PM. Add
them to NPM as well.

Differential Revision: https://reviews.llvm.org/D117605
2022-01-21 19:17:57 -08:00
Simon Pilgrim 0abaf64580 Revert rG4727d29d908f9dd608dd97a58c0af1ad579fd3ca "[X86] Remove __builtin_ia32_pabs intrinsics and use generic __builtin_elementwise_abs"
Some build bots are referencing the `__builtin_ia32_pabs` intrinsics via alternative headers
2022-01-21 12:35:36 +00:00
Simon Pilgrim 3ef88b3184 Revert rG8ee135dcf8ff060656ad481c3e980fe8763576f5 "[X86] Remove `__builtin_ia32_pmax/min` intrinsics and use generic `__builtin_elementwise_max/min`"
Some build bots are referencing the `__builtin_ia32_pmax/min` intrinsics via alternative headers
2022-01-21 12:34:19 +00:00
Simon Pilgrim 8ee135dcf8 [X86] Remove `__builtin_ia32_pmax/min` intrinsics and use generic `__builtin_elementwise_max/min`
D111985 added the generic `__builtin_elementwise_max` and `__builtin_elementwise_min` intrinsics with the same integer behaviour as the SSE/AVX instructions

This patch removes the `__builtin_ia32_pmax/min` intrinsics and just uses `__builtin_elementwise_max/min` - the existing tests see no changes:
```
__m256i test_mm256_max_epu32(__m256i a, __m256i b) {
  // CHECK-LABEL: test_mm256_max_epu32
  // CHECK: call <8 x i32> @llvm.umax.v8i32(<8 x i32> %{{.*}}, <8 x i32> %{{.*}})
  return _mm256_max_epu32(a, b);
}
```
This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`).

Sibling patch to D117791

Differential Revision: https://reviews.llvm.org/D117798
2022-01-21 12:24:58 +00:00
Simon Pilgrim 4727d29d90 [X86] Remove __builtin_ia32_pabs intrinsics and use generic __builtin_elementwise_abs
D111986 added the generic `__builtin_elementwise_abs()` intrinsic with the same integer absolute behaviour as the SSE/AVX instructions (abs(INT_MIN) == INT_MIN)

This patch removes the `__builtin_ia32_pabs*` intrinsics and just uses `__builtin_elementwise_abs` - the existing tests see no changes:
```
__m256i test_mm256_abs_epi8(__m256i a) {
  // CHECK-LABEL: test_mm256_abs_epi8
  // CHECK: [[ABS:%.*]] = call <32 x i8> @llvm.abs.v32i8(<32 x i8> %{{.*}}, i1 false)
  return _mm256_abs_epi8(a);
}
```
This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`).

Differential Revision: https://reviews.llvm.org/D117791
2022-01-21 11:59:08 +00:00
Joao Moreira 82af95029e [X86] Enable ibt-seal optimization when LTO is used in Kernel
Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instruction on function's prologues. Because this is a security feature, it is desirable that only actual indirect-branch-targeted functions are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end being reachable through PLT entries, that will use an indirect branch for such. Because this cannot be determined during compilation-time, the compiler currently emits ENDBRs to every non-local-linkage function.

Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the most fit ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs for NOPs directly in the binary. The discussion of this feature can be seen in [1].

This diff brings the enablement of the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement in when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs to address taken functions, ignoring non-address taken functions that are don't have local linkage.

A comparison between an LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540.

The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged.

[1] - https://lkml.org/lkml/2021/11/22/591

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D116070
2022-01-21 10:55:34 +08:00
Alexandre Ganea 5af2433e17 [clang-cl] Support the /HOTPATCH flag
This patch adds support for the MSVC /HOTPATCH flag: https://docs.microsoft.com/sv-se/cpp/build/reference/hotpatch-create-hotpatchable-image?view=msvc-170&viewFallbackFrom=vs-2019

The flag is translated to a new -fms-hotpatch flag, which in turn adds a 'patchable-function' attribute for each function in the TU. This is then picked up by the PatchableFunction pass which would generate a TargetOpcode::PATCHABLE_OP of minsize = 2 (which means the target instruction must resolve to at least two bytes). TargetOpcode::PATCHABLE_OP is only implemented for x86/x64. When targetting ARM/ARM64, /HOTPATCH isn't required (instructions are always 2/4 bytes and suitable for hotpatching).

Additionally, when using /Z7, we generate a 'hot patchable' flag in the CodeView debug stream, in the S_COMPILE3 record. This flag is then picked up by LLD (or link.exe) and is used in conjunction with the linker /FUNCTIONPADMIN flag to generate extra space before each function, to accommodate for live patching long jumps. Please see: d703b92296/lld/COFF/Writer.cpp (L1298)

The outcome is that we can finally use Live++ or Recode along with clang-cl.

NOTE: It seems that MSVC cl.exe always enables /HOTPATCH on x64 by default, although if we did the same I thought we might generate sub-optimal code (if this flag was active by default). Additionally, MSVC always generates a .debug$S section and a S_COMPILE3 record, which Clang doesn't do without /Z7. Therefore, the following MSVC command-line "cl /c file.cpp" would have to be written with Clang such as "clang-cl /c file.cpp /HOTPATCH /Z7" in order to obtain the same result.

Depends on D43002, D80833 and D81301 for the full feature.

Differential Revision: https://reviews.llvm.org/D116511
2022-01-20 12:57:19 -05:00
Florian Hahn 67aa314bce
[IRGen] Do not overwrite existing attributes in CGCall.
When adding new attributes, existing attributes are dropped. While
this appears to be a longstanding issue, this was highlighted by D105169
which dropped a lot of attributes due to adding the new noundef
attribute.

Ahmed Bougacha (@ab) tracked down the issue and provided the fix in
CGCall.cpp. I bundled it up and updated the tests.
2022-01-20 13:45:19 +00:00
Chenbing.Zheng 0be3da1fab [RISCV] Add intrinsic for Zbt extension
RV32: fsl, fsr, fsri
RV64: fsl, fsr, fsri, fslw, fsrw, fsriw

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117468
2022-01-20 08:27:05 +00:00
Yaxun (Sam) Liu 85c2bd2a0e Prevent adding module flag amdgpu_hostcall multiple times
HIP program with printf call fails to compile with -fsanitize=address
option, because of appending module flag - amdgpu_hostcall twice, one
for printf and one for sanitize option. This patch fixes that issue.

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Roman Lebedev

Differential Revision: https://reviews.llvm.org/D116216
2022-01-19 12:52:33 -05:00
Arnamoy Bhattacharyya 9fbd33ad62 [OMPIRBuilder] Add support for simd (loop) directive.
This patch adds OMPIRBuilder support for the simd directive (without any clause).  This will be a first step towards lowering simd directive in LLVM_Flang.  The patch uses existing CanonicalLoop infrastructure of IRBuilder to add the support.  Also adds necessary code to add llvm.access.group and llvm.loop metadata wherever needed.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D114379
2022-01-19 11:32:17 -05:00
Ben Shi a2f488c6a5 [clang][AVR] Implement '__flashN' for variables on different flash banks
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D115982
2022-01-19 11:24:01 +00:00
hyeongyu kim 1b1c8d83d3 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169
2022-01-16 18:54:17 +09:00
Nikita Popov c63a3175c2 [AttrBuilder] Remove ctor accepting AttributeList and Index
Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should exact the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.
2022-01-15 22:39:31 +01:00
James Y Knight 0d3f2fd269 Revert "Skip exception cleanups when the innermost scope is EHTerminateScope."
Breaks tests on some platforms. Reverting while investigating.

This reverts commit a4e255f9c6.
2022-01-14 18:59:24 -05:00
Matt Arsenault 33315ef321 clang/AMDGPU: Don't set implicit arg attribute to default size
Since 2959e082e1, we conservatively
assume all inputs are enabled by default. This isn't the best
interface for controlling these anyway, since it's not granular and
only allows trimming the last fields.
2022-01-14 18:43:30 -05:00
James Y Knight a4e255f9c6 Skip exception cleanups when the innermost scope is EHTerminateScope.
EHTerminateScope is used to implement C++ noexcept semantics. Per C++
[except.terminate], it is implemented-defined whether no, some, or all
cleanups are run prior to terminatation.

Therefore, the code to run cleanups on the way towards termination is
unnecessary, and may be omitted.

After this change, we will still run some cleanups: any cleanups in a
function called from the noexcept function will continue to run, while
those in the noexcept function itself will not.

Differential Revision: https://reviews.llvm.org/D113620
2022-01-14 18:01:29 -05:00
Erich Keane 2bcba21c8b [CPU-Dispatch] Make sure Dispatch names get updated if previously mangled
Cases where there is a mangling of a cpu-dispatch/cpu-specific function
before the function becomes 'multiversion' (such as a member function)
causes the wrong name to be emitted for one of the variants/resolver,
since the name is cached.  Make sure we invalidate the cache in
cpu-dispatch/cpu-specific modes, like we previously did for just target
multiversioning.
2022-01-14 10:45:55 -08:00
Benjamin Kramer 765dd8b8a4 [CGBuiltin] Simplify code. NFCI. 2022-01-14 16:02:02 +01:00
Jun Zhang 8de0c1feca
[Clang] Add __builtin_reduce_or and __builtin_reduce_and
This patch implements two builtins specified in D111529.
The last __builtin_reduce_add will be seperated into another one.

Differential Revision: https://reviews.llvm.org/D116736
2022-01-14 22:05:26 +08:00
Kevin Athey a0458b531c Add -fsanitize-address-param-retval to clang.
With the introduction of this flag, it is no longer necessary to enable noundef analysis with 4 separate flags.
(-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1).
This change only covers the introduction into the compiler.

This is a follow up to: https://reviews.llvm.org/D116855

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D116633
2022-01-14 00:41:28 -08:00
Maurice Heumann 072e2a7c67 [MS] Implement on-demand TLS initialization for Microsoft CXX ABI
TLS initializers, for example constructors of thread-local variables, don't necessarily get called. If a thread was created before a module is loaded, the module's TLS initializers are not executed for this particular thread.

This is why Microsoft added support for dynamic TLS initialization. Before every use of thread-local variables, a check is added that runs the module's TLS initializers on-demand.

To do this, the method `__dyn_tls_on_demand_init` gets called. Internally, it simply calls `__dyn_tls_init`.

No additional TLS initializer that sets the guard needs to be emitted, as the guard always gets set by `__dyn_tls_init`.
The guard is also checked again within `__dyn_tls_init`. This makes our check redundant, however, as Microsoft's compiler also emits this check, the behaviour is adopted here.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D115456
2022-01-13 21:23:23 -08:00
Elizabeth Andrews 4eaf5846d0 [clang] Fix function pointer address space
Functions pointers should be created with program address space. This
patch introduces program address space in TargetInfo. Targets with
non-default (default is 0) address space for functions should explicitly
set this value. This patch fixes a crash on lvalue reference to function
pointer (in device code) when using oneAPI DPC++ compiler.

Differential Revision: https://reviews.llvm.org/D111566
2022-01-13 08:06:19 -08:00
Erich Keane b699e8b11a Add another assert to cpu-dispatch emission to help track down a tough
to repro error.

As mentioned yesterday, I've got a problem that I can only reproduce on
Godbolt (none of the build configs on my local machine!), so this is at
least somewhat usable until I figure out a cause.
2022-01-13 06:54:08 -08:00
Lian Wang 16877c5d2c [RISCV] Add bfp and bfpw intrinsic in zbf extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116994
2022-01-13 02:53:00 +00:00
Kevin Athey a141e47138 [NFC] Minimize noundef analysis when disabled
Minor adjustment in order of noundef analysis to be a bit more optimal (when disabled).

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D117078
2022-01-12 17:21:19 -08:00
Erich Keane 6e77ad11ff Add an assert in cpudispatch emit to try to track down an error.
I'm attempting to debug an issue that I can only get to happen on
godbolt, where the cpu-dispatch resolver for an out of line member
function is generated with the wrong name, causing a link failure.
2022-01-12 10:31:28 -08:00
Simon Pilgrim 497a4b26c4 CGBuiltin - Use castAs<> instead of getAs<> to avoid dereference of nullptr
The pointer is always dereferenced immediately below, so assert the cast is correct instead of returning nullptr
2022-01-12 15:35:37 +00:00
Marco Elver 732ad8ea62 [clang][auto-init] Provide __builtin_alloca*_uninitialized variants
When `-ftrivial-auto-var-init=` is enabled, allocas unconditionally
receive auto-initialization since [1].

In certain cases, it turns out, this is causing problems. For example,
when using alloca to add a random stack offset, as the Linux kernel does
on syscall entry [2]. In this case, none of the alloca'd stack memory is
ever used, and initializing it should be controllable; furthermore, it
is not always possible to safely call memset (see [2]).

Introduce `__builtin_alloca_uninitialized()` (and
`__builtin_alloca_with_align_uninitialized`), which never performs
initialization when `-ftrivial-auto-var-init=` is enabled.

[1] https://reviews.llvm.org/D60548
[2] https://lkml.kernel.org/r/YbHTKUjEejZCLyhX@elver.google.com

Reviewed By: glider

Differential Revision: https://reviews.llvm.org/D115440
2022-01-12 15:13:10 +01:00
Sven van Haastregt 4b85800bfd [OpenCL] Set external linkage for block enqueue kernels
All kernels can be called from the host as per the SPIR_KERNEL calling
convention.  As such, all kernels should have external linkage, but
block enqueue kernels were created with internal linkage.

Reported-by: Pedro Olsen Ferreira

Differential Revision: https://reviews.llvm.org/D115523
2022-01-12 13:30:09 +00:00
Adam Magier b2715660ed [clang][CodeGen][UBSan] VLA size checking for unsigned integer parameter
The code generation for the UBSan VLA size check was qualified by a con-
dition that the parameter must be a signed integer, however the C spec
does not make any distinction that only signed integer parameters can be
used to declare a VLA, only qualifying that it must be greater than zero
if it is not a constant.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D116048
2022-01-12 01:11:52 +01:00
Nick Desaulniers 5c562f62a4 [clang] number labels in asm goto strings after tied inputs
I noticed that the following case would compile in Clang but not GCC:
    void *x(void) {
      void *p = &&foo;
      asm goto ("# %0\n\t# %l1":"+r"(p):::foo);
      foo:;
      return p;
    }

Changing the output template above from %l2 would compile in GCC but not
Clang.

This demonstrates that when using tied outputs (say via the "+r" output
constraint), the hidden inputs occur or are numbered BEFORE the labels,
at least with GCC.

In fact, GCC does denote this in its documentation:
https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels

> Output operand with constraint modifier ‘+’ is counted as two operands
> because it is considered as one output and one input operand.

For the sake of compatibility, I think it's worthwhile to just make this
change.

It's better to use symbolic names for compatibility (especially now
between released version of Clang that support asm goto with outputs).
ie. %l1 from the above would be %l[foo]. The GCC docs also make this
recommendation.

Also, I cleaned up some cruft in GCCAsmStmt::getNamedOperand. AFAICT,
NumPlusOperands was no longer used, though I couldn't find which commit
didn't clean that up correctly.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98096
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103640
Link: https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D115471
2022-01-11 12:09:24 -08:00
Nick Desaulniers c8463fd22b [clang][CGStmt] emit i constraint rather than X for asm goto indirect dests
As suggested in:
https://reviews.llvm.org/D114895#3177794
X will be converted to i by SelectionDAGISEL anyways.

Reviewed By: void, jyknight

Differential Revision: https://reviews.llvm.org/D115311
2022-01-11 11:48:40 -08:00
Akira Hatanaka e5df9cc098 [CodeGen] Treat ObjC `__unsafe_unretained` and class types as trivial
when generating copy/dispose helper functions

Analyze the block captures just once before generating copy/dispose
block helper functions and honor the inert `__unsafe_unretained`
qualifier. This refactor fixes a bug where captures of ObjC
`__unsafe_unretained` and class types were needlessly retained/released
by the copy/dispose helper functions.

Differential Revision: https://reviews.llvm.org/D116948
2022-01-11 11:18:24 -08:00
Nikita Popov acc39873b7 [CodeGen] Avoid deprecated Address constructor 2022-01-11 13:07:02 +01:00
Hans Wennborg 0b48d0fe12 [ADT] Add an in-place version of toHex()
and use that to simplify MD5's hex string code which was previously
using a string stream, as well as Clang's
CGDebugInfo::computeChecksum().

Differential revision: https://reviews.llvm.org/D116960
2022-01-11 11:51:04 +01:00
Nikita Popov 2d1b55ebea [CodeGen] Make element type in emitArrayDestroy() predictable
When calling emitArrayDestroy(), the pointer will usually have
ConvertTypeForMem(EltType) as the element type, as one would expect.
However, globals with initializers sometimes don't use the same
types as values normally would, e.g. here the global uses
{ double, i32 } rather than %struct.T as element type.

Add an early cast to the global destruction path to avoid this
special case. The cast would happen lateron anyway, it only gets
moved to an earlier point.

Differential Revision: https://reviews.llvm.org/D116219
2022-01-11 09:25:29 +01:00
Jennifer Yu 140a6b1e5c [clang][OpenMP5.1] Initial parsing/sema for 'indirect' clause
Differential Revision: https://reviews.llvm.org/D116764
2022-01-10 16:58:56 -08:00
Adrian Prantl eb200e584e Emit the C++ dialect in -gmodules .pcm files.
Because of commit: https://reviews.llvm.org/D104291 the -gmodules .pcm
files do not have the same DW_AT_language dialect as the .o file. This
was a simple matter of passing the DebugStrictDwarf flag to the
PCHContainerGenerator object's CodeGenOpts from the CompilerInstance
passed in to it.

Before this change if you ran dwarfdump on the gmodule cache folder
you would get DW_AT_language (DW_LANG_C_plus_plus) even when using
-std=c++14 with clang

Patch by Shubham Rastogi!

Differential Revision: https://reviews.llvm.org/D116790
2022-01-10 16:13:40 -08:00
Nikita Popov 7725331ccd [CodeGen] Avoid some pointer element type accesses
Possibly this is sufficient to fix PR53089.
2022-01-10 15:02:55 +01:00
Serge Guelton d2cc6c2d0c Use a sorted array instead of a map to store AttrBuilder string attributes
Using and std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599
2022-01-10 14:49:53 +01:00
Kazu Hirata 17d4bd3d78 [clang] Fix bugprone argument comments (NFC)
Identified with bugprone-argument-comment.
2022-01-09 00:19:49 -08:00
Kazu Hirata 40446663c7 [clang] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-09 00:19:47 -08:00
Johannes Doerfert 37639b72a1 [OpenMP][FIX] Emit debug declares only if debug info is available
The `EmitDeclareOfAutoVariable` introduced in D114504 and D115510 has a
precondition that cannot be violated. It is unclear if we should call it
directly given the sparse usage in clang but for now we should at least
not crash if the debug info kind is too low.

Fixes #52938.

Differential Revision: https://reviews.llvm.org/D116865
2022-01-08 17:01:19 -06:00
Kazu Hirata d1b127b5b7 [clang] Remove unused forward declarations (NFC) 2022-01-08 11:56:40 -08:00
Simon Pilgrim 6ee589e2f5 [CGObjCMac] Use castAs<> instead of getAs<> to avoid dereference of nullptr inside BuildRCBlockVarRecordLayout
This will assert the cast is correct instead of returning nullptr (UnionType is a subtype of RecordType so this should be clean).
2022-01-08 16:18:55 +00:00
Simon Pilgrim 06e9733fec [CGExpr] Use castAs<> instead of getAs<> to avoid dereference of nullptr
This will assert the cast is correct instead of returning nullptr
2022-01-08 14:26:09 +00:00
Jun Zhang 5be131922c
[NFC] Test commit.
This is just a test commit to check whether the permission I got is
correct or not.
2022-01-08 10:36:09 +08:00
Jun Zhang b2ed9f3f44
[Clang] Implement the rest of __builtin_elementwise_* functions.
The patch implement the rest of __builtin_elementwise_* functions
specified in D111529, including:
* __builtin_elementwise_floor
* __builtin_elementwise_roundeven
* __builtin_elementwise_trunc

Signed-off-by: Jun <jun@junz.org>

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D115429
2022-01-07 15:11:36 +00:00
Nikita Popov e8b98a5216 [CodeGen] Emit elementtype attributes for indirect inline asm constraints
This implements the clang side of D116531. The elementtype
attribute is added for all indirect constraints (*) and tests are
updated accordingly.

Differential Revision: https://reviews.llvm.org/D116666
2022-01-06 09:29:22 +01:00
David Pagan 7df2371bc6 Add codegen for allocate directive's 'align' clause 2022-01-05 12:40:58 -05:00
Chuanqi Xu c75cedc237 [Coroutines] Set presplit attribute in Clang and mlir
This fixes bug49264.

Simply, coroutine shouldn't be inlined before CoroSplit. And the marker
for pre-splited coroutine is created in CoroEarly pass, which ran after
AlwaysInliner Pass in O0 pipeline. So that the AlwaysInliner couldn't
detect it shouldn't inline a coroutine. So here is the error.

This patch set the presplit attribute in clang and mlir. So the inliner
would always detect the attribute before splitting.

Reviewed By: rjmccall, ezhulenev

Differential Revision: https://reviews.llvm.org/D115790
2022-01-05 10:25:02 +08:00
serge-sans-paille 9290ccc3c1 Introduce the AttributeMask class
This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for building and removing, which introduced odd situation like creation of
Attribute with dummy value because the only relevant part was the attribute
kind.

Differential Revision: https://reviews.llvm.org/D116110
2022-01-04 15:37:46 +01:00
Jun Zhang 82020de532
Recommit "[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.""
This reverts the revert commit f552ba6e84.

Recommit with fixed author name.
2022-01-04 13:46:41 +00:00
Florian Hahn f552ba6e84
Revert "[Clang] Extend emitUnaryBuiltin to avoid duplicate logic."
This reverts commit 5c57e6aa57.

Reverted due to a typo in the authors name. Will recommit soon with
fixed authorship.
2022-01-04 13:45:28 +00:00
Jun Zhan 5c57e6aa57
[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.
This patch extends `emitUnaryBuiltin` so that we can better emitting IR when
implement builtins specified in D111529.

Also contains some NFC, applying it to existing code.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D116161
2022-01-04 11:47:41 +00:00
Kazu Hirata d677a7cb05 [clang] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-02 10:20:23 -08:00
Kazu Hirata 683e6ee7d0 [CodeGen] Remove redundant string initialization (NFC)
Identified with readability-redundant-string-init.
2022-01-01 09:14:23 -08:00
Kazu Hirata 298367ee6e [clang] Use nullptr instead of 0 or NULL (NFC)
Identified with modernize-use-nullptr.
2021-12-29 08:34:20 -08:00
Johannes Doerfert 944aa0421c Reapply "[OpenMP][NFCI] Embed the source location string size in the ident_t"
This reverts commit 73ece231ee and
reapplies 7bfcdbcbf3 with mlir changes.
Also reverts commit 423ba12971 and
includes the unit test changes of
16da214004.
2021-12-29 01:10:38 -06:00
Mehdi Amini 73ece231ee Revert "[OpenMP][NFCI] Embed the source location string size in the ident_t"
This reverts commit 7bfcdbcbf3.
Broke MLIR build
2021-12-29 06:57:36 +00:00
Johannes Doerfert 7bfcdbcbf3 [OpenMP][NFCI] Embed the source location string size in the ident_t
One of the unused ident_t fields now holds the size of the string
(=const char *) field so we have an easier time dealing with those
in the future.

Differential Revision: https://reviews.llvm.org/D113126
2021-12-28 23:53:29 -06:00
Joseph Huber 7cdaa5a94e [OpenMP][FIX] Change globalization alignment to 16
This patch changes the default aligntment from 8 to 16, and encodes this
information in the `__kmpc_alloc_shared` runtime call to communicate it
to the HeapToStack pass. The previous alignment of 8 was not sufficient
for the maximum size of primitive types on 64-bit systems, and needs to
be increaesd. This reduces the amount of space availible in the data
sharing stack, so this implementation will need to be improved later to
include the alignment requirements in the allocation call, and use it
properly in the data sharing stack in the runtime.

Depends on D115888

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D115971
2021-12-27 16:58:25 -05:00
Nikita Popov 3e65861131 [CodeGen] Avoid one more pointer element type access
The number of elements is always a SizeTy here.
2021-12-27 12:58:22 +01:00
Nikita Popov 1f07a4a569 [CodeGen] Avoid more pointer element type accesses 2021-12-27 12:00:22 +01:00
Shao-Ce SUN ec501f15a8 [clang][CodeGen] Remove the signed version of createExpression
Fix a TODO. Remove the callers of this signed version and delete.

Reviewed By: CodaFi

Differential Revision: https://reviews.llvm.org/D116014
2021-12-27 14:16:08 +08:00
Kazu Hirata 31cfb3f4f6 [clang] Remove redundant calls to c_str() (NFC)
Identified with readability-redundant-string-cstr.
2021-12-26 13:31:40 -08:00
Kazu Hirata 0542d15211 Remove redundant string initialization (NFC)
Identified with readability-redundant-string-init.
2021-12-26 09:39:26 -08:00
Kazu Hirata 2d303e6781 Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-12-24 23:17:54 -08:00
Kazu Hirata 76f0f1cc5c Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC) 2021-12-24 21:43:06 -08:00
Kazu Hirata 9c0a4227a9 Use Optional::getValueOr (NFC) 2021-12-24 20:57:40 -08:00
Shilei Tian c7a589a2c4 [Clang][OpenMP] Add the support for atomic compare in parser
This patch adds the support for `atomic compare` in parser. The support
in Sema and CodeGen will come soon. For now, it simply eimits an error when it
is encountered.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D115561
2021-12-24 08:16:51 -05:00
Nikita Popov dd903173c0 [OpenMP] Avoid creating null pointer lvalue (NFC)
The reduction initialization code creates a "naturally aligned null
pointer to void lvalue", which I found somewhat odd, even though it
works out in the end because it is not actually used. It doesn't
look like this code actually needs an LValue for anything though,
and we can use an invalid Address to represent this case instead.

Differential Revision: https://reviews.llvm.org/D116214
2021-12-24 09:01:56 +01:00
Chuanqi Xu f3d4e168db [C++20] Conform coroutine's comments in clang (NFC-ish)
The comments for coroutine in clang wrote for coroutine-TS. Now
coroutine is merged into standard. Try to conform the comments.
2021-12-24 12:41:44 +08:00
Nikita Popov 7977fd7cfc [OpenMP] Remove no-op cast (NFC)
This was casting the address to its own element type, which is
a no-op.
2021-12-23 15:15:26 +01:00
Nikita Popov bf2b5551f9 [CodeGen] Use CreateConstInBoundsGEP() in one more place
This does exactly what this code manually implemented.
2021-12-23 14:58:47 +01:00
Nikita Popov 2c7dc13146 [CGBuilder] Add CreateGEP() overload that accepts an Address
Add an overload for an Address and a single non-constant offset.
This makes it easier to preserve the element type and adjust the
alignment appropriately.
2021-12-23 14:53:42 +01:00
Nikita Popov 53f0538181 [CodeGen] Use correct element type for store to sret
sret is special in that it does not use the memory type
representation. Manually construct the LValue using ConvertType
instead of ConvertTypeForMem here.

This fixes matrix-lowering-opt-levels.c on s390x.
2021-12-23 13:02:49 +01:00
Nikita Popov 09669e6c5f [CodeGen] Avoid pointer element type access when creating LValue
This required fixing two places that were passing the pointer type
rather than the expected pointee type to the method.
2021-12-23 10:53:15 +01:00
Nikita Popov 1201a0f395 [OpenMP] Fix incorrect type when casting from uintptr
MakeNaturalAlignAddrLValue() expects the pointee type, but the
pointer type was passed. As a result, the natural alignment of
the pointer (usually 8) was always used in place of the natural
alignment of the value type.

Differential Revision: https://reviews.llvm.org/D116171
2021-12-23 08:57:11 +01:00
Krzysztof Parzyszek dcb3e8083a [Hexagon] Make conversions to vector predicate types explicit for builtins
HVX does not have load/store instructions for vector predicates (i.e. bool
vectors). Because of that, vector predicates need to be converted to another
type before being stored, and the most convenient representation is an HVX
vector.
As a consequence, in C/C++, source-level builtins that either take or
produce vector predicates take or return regular vectors instead. On the
other hand, the corresponding LLVM intrinsics do have boolean types that,
and so a conversion of the operand or the return value was necessary.
This conversion would happen inside clang's codegen, but was somewhat
fragile.

This patch changes the strategy: a builtin that takes a vector predicate
now really expects a vector predicate. Since such a predicate cannot be
provided via a variable, this builtin must be composed with other builtins
that either convert vector to a predicate (V6_vandvrt) or predicate to a
vector (V6_vandqrt).

For users using builtins defined in hvx_hexagon_protos.h there is no impact:
the conversions were added to that file. Other users will need to insert
- __builtin_HEXAGON_V6_vandvrt[_128B](V, -1) to convert vector V to a
  vector predicate, or
- __builtin_HEXAGON_V6_vandqrt[_128B](Q, -1) to convert vector predicate Q
  to a vector.

Builtins __builtin_HEXAGON_V6_vmaskedstore.* are a temporary exception to
that, but they are deprecated and should not be used anyway. In the future
they will either follow the same rule, or be removed.
2021-12-22 12:52:24 -08:00
Jeremy Morse ea22fdd120 [Clang][DebugInfo] Cease turning instruction-referencing off by default
Over in D114631 I turned this debug-info feature on by default, for x86_64
only. I'd previously stripped out the clang cc1 option that controlled it
in 651122fc4a, unfortunately that turned out to not be completely
effective, and the two things deleted in this patch continued to keep it
off-by-default.  Oooff.

As a follow-up, this patch removes the last few things to do with
ValueTrackingVariableLocations from clang, which was the original purpose
of D114631. In an ideal world, if this patch causes you trouble you'd
revert 3c04507088 instead, which was where this behaviour was supposed
to start being the default, although that might not be practical any more.
2021-12-22 16:30:05 +00:00
Alok Kumar Sharma 5eb271880c [clang][OpenMP][DebugInfo] Debug support for variables in shared clause of OpenMP task construct
Currently variables appearing inside shared clause of OpenMP task construct
are not visible inside lldb debugger.

After the current patch, lldb is able to show the variable

```
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000400934 a.out`.omp_task_entry. [inlined] .omp_outlined.(.global_tid.=0, .part_id.=0x000000000071f0d0, .privates.=0x000000000071f0e8, .copy_fn.=(a.out`.omp_task_privates_map. at testshared.cxx:8), .task_t.=0x000000000071f0c0, __context=0x000000000071f0f0) at testshared.cxx:10:34
   7      else {
   8    #pragma omp task shared(svar) firstprivate(n)
   9        {
-> 10         printf("Task svar = %d\n", svar);
   11         printf("Task n = %d\n", n);
   12         svar = fib(n - 1);
   13       }
(lldb) p svar
(int) $0 = 9
```

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D115510
2021-12-22 20:04:21 +05:30
Jun Zhan b55ea2fbc0
[Clang] Add __builtin_reduce_xor
This patch implements __builtin_reduce_xor as specified in D111529.

Reviewed By: fhahn, aaron.ballman

Differential Revision: https://reviews.llvm.org/D115231
2021-12-22 10:00:27 +00:00
Alexandre Ganea a282ea4898 Reland - [CodeView] Emit S_OBJNAME record
Reland integrates build fixes & further review suggestions.

Thanks to @zturner for the initial S_OBJNAME patch!

Differential Revision: https://reviews.llvm.org/D43002
2021-12-21 19:02:14 -05:00
Alexandre Ganea 5bb5142e80 Revert [CodeView] Emit S_OBJNAME record
Also revert all subsequent fixes:
- abd1cbf5e5 [Clang] Disable debug-info-objname.cpp test on Unix until I sort out the issue.
- 00ec441253 [Clang] debug-info-objname.cpp test: explictly encode a x86 target when using %clang_cl to avoid falling back to a native CPU triple.
- cd407f6e52 [Clang] Fix build by restricting debug-info-objname.cpp test to x86.
2021-12-21 19:02:14 -05:00
Nikita Popov a995cdab19 [CodeGen] Avoid more pointer element type accesses 2021-12-21 15:52:18 +01:00
Alexandre Ganea f44e3fbadd [CodeView] Emit S_OBJNAME record
Thanks to @zturner for the initial patch!

Differential Revision: https://reviews.llvm.org/D43002
2021-12-21 09:26:36 -05:00
Nikita Popov 9a05a7b00c [CodeGen] Accept Address in CreateLaunderInvariantGroup
Add an overload that accepts and returns an Address, as we
generally just want to replace the pointer with a laundered one,
while retaining remaining information.
2021-12-21 14:43:20 +01:00
Nikita Popov e751d97863 [CodeGen] Avoid some pointer element type accesses
This avoids some pointer element type accesses when compiling
C++ code.
2021-12-21 14:16:28 +01:00
Nikita Popov 55d7a12b86 [CodeGen] Avoid pointee type access during global var declaration
All callers pass in a GlobalVariable, so we can conveniently fetch
the type from there.
2021-12-21 11:48:37 +01:00
Sami Tolvanen ec2e26eaf6 [Clang] Add __builtin_function_start
Control-Flow Integrity (CFI) replaces references to address-taken
functions with pointers to the CFI jump table. This is a problem
for low-level code, such as operating system kernels, which may
need the address of an actual function body without the jump table
indirection.

This change adds the __builtin_function_start() builtin, which
accepts an argument that can be constant-evaluated to a function,
and returns the address of the function body.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Depends on D108478

Reviewed By: pcc, rjmccall

Differential Revision: https://reviews.llvm.org/D108479
2021-12-20 12:55:33 -08:00
Ellis Hoag ac719d7c9a [InstrProf] Don't profile merge by default in lightweight mode
Profile merging is not supported when using debug info profile
correlation because the data section won't be in the binary at runtime.
Change the default profile name in this mode to `default_%p.proflite` so
we don't use profile merging.

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D115979
2021-12-20 09:51:49 -08:00
Sam McCall af27466c50 Reland "[AST] Add UsingType: a sugar type for types found via UsingDecl"
This reverts commit cc56c66f27.
Fixed a bad assertion, the target of a UsingShadowDecl must not have
*local* qualifiers, but it can be a typedef whose underlying type is qualified.
2021-12-20 18:03:15 +01:00