14947 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu 1d97cb1f6e [HIP] Emit amdgpu_code_object_version module flag
The code object version determines the ABI, so modules built for different versions should not be mixed.

This patch emits amdgpu_code_object_version module flag in LLVM IR
based on code object version (default 4).

The amdgpu_code_object_version value is code object version times 100.

LLVM IR with different amdgpu_code_object_version module flag cannot
be linked.
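
The rule described above can be sketched in a few lines. This is only an illustration of the arithmetic and of the compatibility rule as stated in the message; the helper names `codeObjectVersionFlag` and `canLinkModules` are hypothetical and are not part of Clang or LLVM.

```cpp
#include <cassert>

// Hypothetical helper: the module flag value is the code object version
// multiplied by 100 (for example, version 4 becomes 400).
constexpr unsigned codeObjectVersionFlag(unsigned version) {
  return version * 100;
}

// Hypothetical helper: per the commit message, two modules can only be
// linked when their flag values are identical.
constexpr bool canLinkModules(unsigned flagA, unsigned flagB) {
  return flagA == flagB;
}

// Small usage demo for the two helpers above.
inline void demoCodeObjectVersion() {
  assert(codeObjectVersionFlag(4) == 400);
  assert(canLinkModules(codeObjectVersionFlag(4), codeObjectVersionFlag(4)));
  assert(!canLinkModules(codeObjectVersionFlag(4), codeObjectVersionFlag(5)));
}
```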

The -cc1 option -mcode-object-version=none is for ROCm device library use
only, which supports multiple ABIs.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D119026
2022-02-08 21:58:40 -05:00
Bill Wendling deaf22bc0e [X86] Implement -fzero-call-used-regs option
The "-fzero-call-used-regs" option tells the compiler to zero out
certain registers before the function returns. It's also available as a
function attribute: zero_call_used_regs.

The two upper categories are:

  - "used": Zero out used registers.
  - "all": Zero out all registers, whether used or not.

The individual options are:

  - "skip": Don't zero out any registers. This is the default.
  - "used": Zero out all used registers.
  - "used-arg": Zero out used registers that are used for arguments.
  - "used-gpr": Zero out used registers that are GPRs.
  - "used-gpr-arg": Zero out used GPRs that are used as arguments.
  - "all": Zero out all registers.
  - "all-arg": Zero out all registers used for arguments.
  - "all-gpr": Zero out all GPRs.
  - "all-gpr-arg": Zero out all GPRs used for arguments.

This is used to help mitigate Return-Oriented Programming exploits.
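
As a rough illustration of the attribute form, the sketch below marks one function with `zero_call_used_regs("used-gpr")`. The macro name `ZERO_USED_GPR` and the function `constantTimeEqual` are made up for this example, and the `__has_attribute` guard is an assumption added so the code still compiles on compilers that lack the attribute.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: expands to the attribute only when the compiler
// reports support for it, and to nothing otherwise.
#if defined(__has_attribute)
#if __has_attribute(zero_call_used_regs)
#define ZERO_USED_GPR __attribute__((zero_call_used_regs("used-gpr")))
#endif
#endif
#ifndef ZERO_USED_GPR
#define ZERO_USED_GPR
#endif

// Example function handling sensitive bytes. With the attribute active,
// the compiler is asked to zero the GPRs this function used before it returns.
ZERO_USED_GPR bool constantTimeEqual(const unsigned char *a,
                                     const unsigned char *b, std::size_t n) {
  unsigned char diff = 0;
  for (std::size_t i = 0; i < n; ++i)
    diff = static_cast<unsigned char>(diff | (a[i] ^ b[i]));
  return diff == 0;
}

// Small usage demo: the attribute does not change the function's result.
inline void demoConstantTimeEqual() {
  const unsigned char a[] = {1, 2, 3};
  const unsigned char b[] = {1, 2, 4};
  assert(constantTimeEqual(a, a, 3));
  assert(!constantTimeEqual(a, b, 3));
}
```

The same policy can be applied to a whole translation unit with the command-line option, for example `-fzero-call-used-regs=used-gpr`.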

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D110869
2022-02-08 17:42:54 -08:00
Arthur Eubanks f05a63f9a0 [clang] Properly cache member pointer LLVM types
When not going through the main Clang->LLVM type cache, we'd
accidentally create multiple different opaque types for a member pointer
type.

This allows us to remove the -verify-type-cache flag now that
check-clang passes with it on. We can do the verification in expensive
builds. Previously microsoft-abi-member-pointers.cpp was failing with
-verify-type-cache.

I suspect that there may be more issues when we have multiple member
pointer types and we clear the cache, but we can leave that for later.

Followup to D118744.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D119215
2022-02-08 13:22:24 -08:00
Dawid Jurczak 5d8d3a11c4 [NFC] Increase initial size of FoldingSets used in ASTContext and CodeGenTypes
Among the many FoldingSet users, the most notable seem to be ASTContext and CodeGenTypes.
We spend a not-so-tiny amount of time in FoldingSet calls from there for the following reasons:

  1. Default FoldingSet capacity for 2^6 items very often is not enough.
     For PointerTypes/ElaboratedTypes/ParenTypes it's not unlikely to observe growing it to 256 or 512 items.
     FunctionProtoTypes can easily exceed 1k items capacity growing up to 4k or even 8k size.

  2. The cost of FoldingSetBase::GrowBucketCount itself is not very bad (pure reallocations are rather cheap thanks to BumpPtrAllocator).
     What matters is the high collision rate when a lot of items end up in the same bucket, slowing down FoldingSetBase::FindNodeOrInsertPos and thrashing the CPU cache
     (as items with the same hash are organized in an intrusive linked list which needs to be traversed).

This change addresses both issues by increasing the initial size of the FoldingSets used in ASTContext and CodeGenTypes.
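
To make the first point concrete, here is a deliberately simplified model, not the FoldingSet implementation: it assumes a table whose capacity starts at a power of two and doubles whenever an insert would exceed it. The function name `growthsNeeded` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model (assumption): capacity starts at 2^initialLog2 items and
// doubles each time one more item would not fit. Returns how many times the
// table has to grow while inserting `items` elements.
inline unsigned growthsNeeded(unsigned initialLog2, std::size_t items) {
  std::size_t capacity = std::size_t{1} << initialLog2;
  unsigned growths = 0;
  for (std::size_t inserted = 0; inserted < items; ++inserted) {
    if (inserted + 1 > capacity) {
      capacity *= 2;
      ++growths;
    }
  }
  return growths;
}

// Small usage demo: a larger initial size removes every regrowth step.
inline void demoGrowths() {
  assert(growthsNeeded(6, 4096) == 6);   // 64 -> 128 -> ... -> 4096
  assert(growthsNeeded(12, 4096) == 0);  // already large enough
}
```

Under this model, a set that ends up holding 4k items grows six times from the 2^6 default, and each growth is preceded by a period in which buckets are overfull.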

Extracted from: https://reviews.llvm.org/D118385

Differential Revision: https://reviews.llvm.org/D118608
2022-02-08 17:54:04 +01:00
Nikita Popov 18834dca2d [OpenCL] Mark kernel arguments as ABI aligned
Following the discussion on D118229, this marks all pointer-typed
kernel arguments as having ABI alignment, per section 6.3.5 of
the OpenCL spec:

> For arguments to a __kernel function declared to be a pointer to
> a data type, the OpenCL compiler can assume that the pointee is
> always appropriately aligned as required by the data type.

Differential Revision: https://reviews.llvm.org/D118894
2022-02-08 16:12:51 +01:00
Simon Pilgrim 09857a4bd1 [X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions

This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:

__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
  // CHECK-LABEL: test_mm256_adds_epi8
  // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
  return _mm256_adds_epi8(a, b);
}
2022-02-08 15:00:10 +00:00
Simon Pilgrim a59faf272e Revert rG6c174ab2ad0676b295f11f6c3913eff9289fa6b9 "[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat"
Missed some legacy builtin tests that need cleaning up first
2022-02-08 14:45:28 +00:00
Simon Pilgrim 6c174ab2ad [X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions

This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:

__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
  // CHECK-LABEL: test_mm256_adds_epi8
  // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
  return _mm256_adds_epi8(a, b);
}
2022-02-08 14:21:20 +00:00
David Pagan 0a7cc078ac Enable inoutset dependency-type in depend clause.
Done in manner similar to mutexinoutset
(see https://reviews.llvm.org/D57576)

Runtime support already exists in LLVM OpenMP runtime (see
https://reviews.llvm.org/D97085).

The value used to identify an inoutset dependency type in the LLVM
OpenMP runtime is 8.

Some tests updated due to change in dependency type error messages that
now include new dependency type. Also updated
test/OpenMP/task_codegen.cpp to verify we emit the right code.
2022-02-08 08:35:36 -05:00
Simon Pilgrim c00db97159 [Clang] Add elementwise saturated add/sub builtins
This patch implements `__builtin_elementwise_add_sat` and `__builtin_elementwise_sub_sat` builtins.

These map to the add/sub saturated math intrinsics described here:
https://llvm.org/docs/LangRef.html#saturation-arithmetic-intrinsics

With this in place we should then be able to replace the x86 SSE adds/subs intrinsics with these generic variants - it looks like other targets should be able to use these as well (arm/aarch64/webassembly all have similar examples in cgbuiltin).
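
For readers unfamiliar with saturating arithmetic, the following scalar sketch shows what a signed saturating add does for one 8-bit lane. It is a portable reference model only; `saddSatI8` is a hypothetical name and this is not how the builtin is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Reference model for one signed 8-bit lane: compute the sum in a wider type,
// then clamp it to the int8_t range instead of letting it wrap.
constexpr std::int8_t saddSatI8(std::int8_t a, std::int8_t b) {
  return (int{a} + int{b}) > INT8_MAX
             ? static_cast<std::int8_t>(INT8_MAX)
             : ((int{a} + int{b}) < INT8_MIN
                    ? static_cast<std::int8_t>(INT8_MIN)
                    : static_cast<std::int8_t>(int{a} + int{b}));
}

// Small usage demo covering both clamping directions and the normal case.
inline void demoSaturatingAdd() {
  assert(saddSatI8(100, 100) == 127);
  assert(saddSatI8(-100, -100) == -128);
  assert(saddSatI8(1, 2) == 3);
}
```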

Differential Revision: https://reviews.llvm.org/D117898
2022-02-08 11:22:01 +00:00
Arthur Eubanks 45084eab5e [clang] Fix some clang->llvm type cache invalidation issues
Take the following as an example

  struct z {
    z (*p)();
  };

  z f();

When we attempt to get the LLVM type of f, we recurse into z. z itself
has a function pointer with the same type as f. Given the recursion,
Clang simply treats z::p as a pointer to an empty struct `{}*`. The
LLVM type of f is as expected. So we have two different potential
LLVM types for a given Clang type. If we store one of those into the
cache, when we access the cache with a different context (e.g. we
are/aren't recursing on z) we may get an incorrect result. There is some
attempt to clear the cache in these cases, but it doesn't seem to handle
all cases.

This change makes it so we only use the cache when we are not in any
sort of function context, i.e. `noRecordsBeingLaidOut() &&
FunctionsBeingProcessed.empty()`, which are the cases where we may
decide to choose a different LLVM type for a given Clang type. LLVM
types for builtin types are never recursive so they're always ok.

This allows us to clear the type cache less often (as seen with the
removal of one of the calls to `TypeCache.clear()`). We
still need to clear it when we use a placeholder type then replace it
later with the final type and other dependent types need to be
recalculated.

I've added a check that the cached type matches what we compute. It
triggered in this test case without the fix. It's currently not
check-clang clean so it's not on by default for something like expensive
checks builds.

This change uncovered another issue where the LLVM types for an argument
and its local temporary don't match. For example in type-cache-3, when
expanding z::dc's argument into a temporary alloca, we ConvertType() the
type of z::p which is `void ({}*)*`, which doesn't match the alloca GEP
type of `{}*`.

No noticeable compile time changes:
https://llvm-compile-time-tracker.com/compare.php?from=3918dd6b8acf8c5886b9921138312d1c638b2937&to=50bdec9836ed40e38ece0657f3058e730adffc4c&stat=instructions

Fixes #53465.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D118744
2022-02-07 18:59:09 -08:00
Arthur Eubanks 2724c153f9 [clang] Cache OpenCL types
If we call CGOpenCLRuntime::convertOpenCLSpecificType() multiple times
we should get the same type back.

Reviewed By: svenvh

Differential Revision: https://reviews.llvm.org/D119011
2022-02-07 09:23:04 -08:00
Nikita Popov c45a99f36b [MatrixBuilder] Require explicit element type in CreateColumnMajorLoad()
This makes the method compatible with opaque pointers.
2022-02-07 16:57:33 +01:00
Nikita Popov cdc0573f75 [MatrixBuilder] Remove unnecessary IRBuilder template (NFC)
IRBuilderBase exists specifically to avoid the need for this.
2022-02-07 16:42:38 +01:00
Yaxun (Sam) Liu 171da443d5 [HIPSPV] Fix literals are mapped to Generic address space
This issue is an oversight in D108621.

Literals in HIP are emitted as global constant variables with default
address space which maps to Generic address space for HIPSPV. In
SPIR-V such variables translate to OpVariable instructions with
Generic storage class which are not legal. Fix by mapping literals
to CrossWorkGroup address space.

The literals are not mapped to UniformConstant because the “flat”
pointers in HIP may reference them and “flat” pointers are modeled
as Generic pointers in SPIR-V. In SPIR-V/OpenCL UniformConstant
pointers may not be cast to Generic.

Patch by: Henry Linjamäki

Reviewed by: Yaxun Liu

Differential Revision: https://reviews.llvm.org/D118876
2022-02-05 17:26:52 -05:00
James Y Knight caa1ebde70 Don't assume that a new cleanup was added to InnermostEHScope.
After fa87fa97fb, this was no longer guaranteed to be the cleanup
just added by this code, if IsEHCleanup got disabled. Instead, use
stable_begin(), which _is_ guaranteed to be the cleanup just added.

This caused a crash when an object that is callee destroyed (e.g. with the MS ABI) was passed in a call from a noexcept function.

Added a test to verify.

Fixes: fa87fa97fb
2022-02-04 23:39:42 -05:00
Joseph Huber 034adaf5be [OpenMP] Completely remove old device runtime
This patch completely removes the old OpenMP device runtime. Previously,
the new runtime had the prefix `libomptarget-new-` and the old runtime
was simply called `libomptarget-`. This patch makes the formerly new
runtime the only runtime available. The entire old project has been deleted,
and all references to the `libomptarget-new` runtime have been replaced
with `libomptarget-`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D118934
2022-02-04 15:31:33 -05:00
Shilei Tian b35be6fe98 [Clang][Sema][OpenMP] Sema support for `atomic compare`
This patch adds the Sema support for `atomic compare`.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D116637
2022-02-04 12:30:56 -05:00
Hans Wennborg 853e0aa424 Don't dllexport reference temporaries
Even if the reference itself is dllexport, the temporary should not be.
In fact, we're already giving it internal linkage, so dllexporting it
is not just wasteful, but will fail to link, as in the example below:

  $ cat /tmp/a.cc
  void _DllMainCRTStartup() {}
  const int __declspec(dllexport) &foo = 42;

  $ clang-cl -fuse-ld=lld /tmp/a.cc /Zl /link /dll /out:a.dll
  lld-link: error: <root>: undefined symbol: int const &foo::$RT1

Differential revision: https://reviews.llvm.org/D118980
2022-02-04 16:31:51 +01:00
John Brawn bca998ed3c [AArch64] Generate fcmps when appropriate for neon intrinsics
Differential Revision: https://reviews.llvm.org/D118257
2022-02-04 12:55:38 +00:00
Jan Svoboda 42afaf7f47 [clang][CodeGen] Use memory type representation in `va_arg`
Some types (e.g. `_Bool`) have different scalar and memory representations. CodeGen for `va_arg` didn't take this into account, leading to assertion failures with such types.

This patch makes sure we use memory representation for `va_arg`.

Reviewed By: ahatanak

Differential Revision: https://reviews.llvm.org/D118904
2022-02-04 12:10:57 +01:00
James Y Knight fa87fa97fb Skip exception cleanups when the innermost scope is EHTerminateScope.
EHTerminateScope is used to implement C++ noexcept semantics. Per C++
[except.terminate], it is implementation-defined whether no, some, or all
cleanups are run prior to termination.

Therefore, the code to run cleanups on the way towards termination is
unnecessary, and may be omitted.

After this change, we will still run some cleanups: any cleanups in a
function called from the noexcept function will continue to run, while
those in the noexcept function itself will not.

(Commit attempt 2: check InnermostEHScope != stable_end() before accessing it.)

Differential Revision: https://reviews.llvm.org/D113620
2022-02-02 17:50:18 -05:00
Rainer Orth efdd0a29b7 [clang][Sparc] Fix __builtin_extract_return_addr etc.
While investigating the failures of `symbolize_pc.cpp` and
`symbolize_pc_inline.cpp` on SPARC (both Solaris and Linux), I noticed that
`__builtin_extract_return_addr` is a no-op in `clang` on all targets, while
`gcc` has non-default implementations for arm, mips, s390, and sparc.

This patch provides the SPARC implementation.  For background see
`SparcISelLowering.cpp` (`SparcTargetLowering::LowerReturn_32`), the SPARC
psABI p.3-12, `%i7` and p.3-16/17, and SCD 2.4.1, p.3P-10, `%i7` and
p.3P-15.

Tested (after enabling the `sanitizer_common` tests on SPARC) on
`sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D91607
2022-02-02 19:20:02 +01:00
Alex Lorenz 116c1bea65 [clang][macho] add clang frontend support for emitting macho files with two build version load commands
This patch extends clang frontend to add metadata that can be used to emit macho files with two build version load commands.
It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that.

MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target,
and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native
macOS or a Mac Catalyst app on a macOS system. We want to upstream support for this to LLVM to be able to build
compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable
by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support.

Differential Revision: https://reviews.llvm.org/D115415
2022-02-02 08:30:39 -08:00
serge-sans-paille e188aae406 Cleanup header dependencies in LLVMCore
Based on the output of include-what-you-use.

This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden header dependencies, something
the LLVM codebase doesn't do that well :-/

I've tried to summarize the biggest changes below:

- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h

And the usual count of preprocessed lines:
$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after:  6189948

200k fewer lines to process is not that bad ;-)

Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup

Differential Revision: https://reviews.llvm.org/D118652
2022-02-02 06:54:20 +01:00
Joseph Huber 53d5757ea2 [OpenMP] Add kernel string attribute to kernel function
This patch adds a function attribute to the kernel function generated in
OpenMP offloading. We already create a `nvvm.annotations` metadata node
indicating the kernels present in the program. However, this created
some indirection when trying to identify if a specific function was an
entry. We add a single function attribute for each function now to
simplify this.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D118708
2022-02-01 13:49:31 -05:00
Fangrui Song 7aaf024dac [BitcodeWriter] Fix cases of some functions
`WriteIndexToFile` is used by external projects so I do not touch it.
2022-01-31 16:46:11 -08:00
Fangrui Song 85dfe19b36 [ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils
D116542 adds EmbedBufferInModule which introduces a layer violation
(https://llvm.org/docs/CodingStandards.html#library-layering).
See 2d5f857a1e for detail.

EmbedBufferInModule does not use BitcodeWriter functionality and should be moved
to LLVMTransformsUtils. While here, change the function case to the prevailing
convention.

It seems that EmbedBufferInModule just follows the steps of
EmbedBitcodeInModule. EmbedBitcodeInModule calls WriteBitcodeToFile but has IR
update operations which ideally should be refactored to another library.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D118666
2022-01-31 16:33:57 -08:00
Itay Bookstein 2a868802a3 [clang][CodeGen][NFC] Remove unused CodeGenModule fields
Signed-off-by: Itay Bookstein <ibookstein@gmail.com>

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D118619
2022-01-31 23:45:53 +02:00
Joseph Huber 551b177452 [OpenMP] Add a flag for embedding a file into the module
This patch adds support for a flag `-fembed-offload-binary` to embed a
file as an ELF section in the output by placing it in a global variable.
This can be used to bundle offloading files with the host binary so it
can be accessed by the linker. The section is named using the
`-fembed-offload-section` option.

Depends on D116541

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D116542
2022-01-31 15:56:00 -05:00
tyb0807 51e188d079 [AArch64] Support for memset tagged intrinsic
This introduces a new ACLE intrinsic for memset tagged
(https://github.com/ARM-software/acle/blob/next-release/main/acle.md#memcpy-family-of-operations-intrinsics---mops).

  void *__builtin_arm_mops_memset_tag(void *, int, size_t)

A corresponding LLVM intrinsic is introduced:

  i8* llvm.aarch64.mops.memset.tag(i8*, i8, i64)

The types match llvm.memset but the return type is not void.

This is part 1/4 of a series of patches split from
https://reviews.llvm.org/D117405 to facilitate reviewing.

Patch by Tomas Matheson

Differential Revision: https://reviews.llvm.org/D117753
2022-01-31 20:49:34 +00:00
Ben Shi 653836251a [clang][AVR] Set '-fno-use-cxa-atexit' to default
AVR is a bare-metal environment, so the avr-libc does not support
'__cxa_atexit()'.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D118445
2022-01-30 02:26:19 +00:00
Weverything be2147db05 Remove reference type when checking const structs
ConstStructBuilder::Finalize in CGExprConstant.cpp assumes that the
passed in QualType is a RecordType.  In some instances, the type is a
reference to a RecordType and the reference needs to be removed first.

Differential Revision: https://reviews.llvm.org/D117376
2022-01-28 13:08:58 -08:00
Amilendra Kodithuwakku 1f08b08674 [clang][ARM] Emit warnings when PACBTI-M is used with unsupported architectures
Branch protection in M-class is supported by
 - Armv8.1-M.Main
 - Armv8-M.Main
 - Armv7-M

Attempting to enable this for other architectures, either by
command-line (e.g -mbranch-protection=bti) or by target attribute
in source code (e.g.  __attribute__((target("branch-protection=..."))) )
will generate a warning.

In both cases function attributes related to branch protection will not
be emitted. Regardless of the warning, module level attributes related to
branch protection will be emitted when it is enabled via the command-line.

The following people also contributed to this patch:
- Victor Campos

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D115501
2022-01-28 09:59:58 +00:00
Joseph Huber 2945f11c60 [OpenMP] Only generate runtime flags with host input
This patch changes the code generation of runtime flags to only occur if
a host bitcode file was passed in. This is a cheap way to determine if
we are compiling the OpenMP device runtime itself or user code. This is
needed because the global flags we generate for the device runtime e.g.
__omp_rtl_debug_kind were being generated with default values when we
compiled the runtime library. This would then invalidate the ones we
want to be able to add in when the user defines it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D118399
2022-01-27 18:43:41 -05:00
Arthur Eubanks 662ef6d177 [NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in VisitArrayInitLoopExpr
With this we can bootstrap an `-O0 -g0` clang with `-mllvm -opaque-pointers`!
2022-01-27 14:44:53 -08:00
Arthur Eubanks 6e8a66bdad [NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in EmitCXXMemberDataPointerAddress()
2022-01-27 14:44:53 -08:00
Arthur Eubanks f17123831e [NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in CreateTempAlloca()
Specify the Address element type, which is the bitcast destination type.
(the whole bitcast won't be needed after opaque pointers)
2022-01-27 14:18:54 -08:00
Arthur Eubanks 63cf2063a2 [NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in EmitNewArrayInitializer()
Specify the Address element type, which is the same for all pointers in the array.
2022-01-27 14:00:16 -08:00
Sri Hari Krishna Narayanan 5aa24558cf OMPIRBuilder for Interop directive
Implements the OMPIRBuilder portion for the
Interop directive.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D105876
2022-01-27 14:53:18 -05:00
David Green 82973edfb7 [ARM][AArch64] Introduce qrdmlah and qrdmlsh intrinsics
Since its introduction, the qrdmlah has been represented as a qrdmulh
and a sadd_sat. This doesn't produce the same result for all input
values though. This patch fixes that by introducing a qrdmlah (and
qrdmlsh) intrinsic specifically for the vqrdmlah and sqrdmlah
instructions. The old test cases will now produce a qrdmulh and sqadd,
as expected.
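
The difference between the two forms can be seen with a scalar model of one 16-bit lane. The sketch below reflects my reading of the architecture pseudocode, in which the fused instruction adds the accumulator before the single rounding and saturation step; treat that as an assumption. All function names are hypothetical, and 64-bit intermediates are used to avoid overflow.

```cpp
#include <cassert>
#include <cstdint>

// Clamp a wide intermediate value to the int16_t range.
constexpr std::int64_t sat16(std::int64_t v) {
  return v > INT16_MAX ? INT16_MAX : (v < INT16_MIN ? INT16_MIN : v);
}

// floor(v / 2^16), written without shifting negative values.
constexpr std::int64_t floorDiv65536(std::int64_t v) {
  return v >= 0 ? v / 65536 : -((-v + 65535) / 65536);
}

// Rounding doubling multiply returning the high half, saturated.
constexpr std::int64_t qrdmulh(std::int64_t a, std::int64_t b) {
  return sat16(floorDiv65536(2 * a * b + 32768));
}

// Plain saturating add of two 16-bit values.
constexpr std::int64_t saddSat(std::int64_t a, std::int64_t b) {
  return sat16(a + b);
}

// Assumed fused form: accumulate first, then round and saturate once.
constexpr std::int64_t qrdmlahFused(std::int64_t acc, std::int64_t a,
                                    std::int64_t b) {
  return sat16(floorDiv65536(acc * 65536 + 2 * a * b + 32768));
}

// Small usage demo: the two forms disagree when the multiply alone saturates.
inline void demoQrdmlah() {
  assert(saddSat(-1, qrdmulh(-32768, -32768)) == 32766);
  assert(qrdmlahFused(-1, -32768, -32768) == 32767);
}
```

With `a = b = -32768` the standalone multiply saturates to 32767 before the add, so the two-step sequence loses the information that the fused form keeps.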

Fixes #53120 and #50905 and #51761.

Differential Revision: https://reviews.llvm.org/D117592
2022-01-27 19:19:46 +00:00
Dawid Jurczak b88ca619d3 [NFC][CodeGen] Use llvm::DenseMap for DeferredDecls
CodeGenModule::DeferredDecls std::map::operator[] seems to be hot, especially when generating code for huge compilation units.
In such cases using DenseMap instead gives observable compile time improvement. Patch was tested on Linux build with default config acting as benchmark.
Build was performed on isolated CPU cores in silent x86-64 Linux environment following: https://llvm.org/docs/Benchmarking.html#linux rules.
Compile time statistics diff produced by perf and time before and after change are following:
instructions -0.15%, cycles -0.7%, max-rss +0.65%.
Using StringMap instead of DenseMap doesn't bring any visible gains.

Differential Revision: https://reviews.llvm.org/D118169
2022-01-27 10:57:48 +01:00
Ahmed Bougacha ecb502342c [ObjC] Emit selector load right before msgSend call.
We currently emit the selector load early, but only because we need
it to compute the signature (so that we know which msgSend variant to
call).  We can prepare the signature with a plain undef, and replace
it with the materialized selector value if (and only if) needed, later.

Concretely, this usually doesn't have an effect, but tests need updating
because we reordered the receiver bitcast and the selector load, which
is always fine.

There is one notable change: with this, when a msgSend needs a
receiver null check, the selector is now loaded in the non-null
block, instead of before the null check.  That should be a mild
improvement.
2022-01-26 20:52:54 -08:00
Arthur Eubanks eee97f1617 [clang] Use proper type to left shift after D117262
Causing warnings like
warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits
as reported in D117262.
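
The class of warning quoted above comes from shifting a 32-bit `1` and only then widening the result. A minimal sketch of the usual fix follows; `alignmentFromLog2` is a hypothetical name and not the function changed by this commit.

```cpp
#include <cassert>
#include <cstdint>

// Widen the left operand first so the shift itself is done in 64 bits.
// Writing `1 << log2` would shift a 32-bit int and widen only afterwards.
constexpr std::uint64_t alignmentFromLog2(unsigned log2) {
  return std::uint64_t{1} << log2;
}

// Small usage demo, including a shift amount that does not fit in 32 bits.
inline void demoAlignmentFromLog2() {
  assert(alignmentFromLog2(3) == 8);
  assert(alignmentFromLog2(40) == 1099511627776ULL);
}
```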
2022-01-26 17:54:37 -08:00
Arthur Eubanks 6a953d931c [clang] Fix -Wsubobject-linkage after D117262
/home/buildbot/llvm-avr-linux/llvm-avr-linux/llvm/clang/lib/CodeGen/Address.h:76:7: warning: 'clang::CodeGen::Address' has a field 'clang::CodeGen::Address::A' whose type uses the anonymous namespace [-Wsubobject-linkage]

https://lab.llvm.org/buildbot/#/builders/112/builds/12047
2022-01-26 11:43:44 -08:00
Arthur Eubanks b1613f05ae [NFC] Store Address's alignment into PointerIntPairs
This mitigates the extra memory caused by D115725.

On 32-bit arches where we only have 2 bits per PointerIntPair we fall
back to simply storing alignment separately.
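
The general idea of hiding a small integer in the unused low bits of an aligned pointer can be sketched as follows. This is not `llvm::PointerIntPair`; the helpers `packPointer`, `unpackPointer`, and `unpackTag` are hypothetical, and the sketch assumes pointers aligned to at least 8 bytes so that 3 low bits are free.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: pointees are at least 8-byte aligned, leaving 3 spare low bits.
constexpr unsigned kTagBits = 3;
constexpr std::uintptr_t kTagMask = (std::uintptr_t{1} << kTagBits) - 1;

// Store a small value (for example, part of a log2 alignment) in the low bits.
inline std::uintptr_t packPointer(const void *p, unsigned tag) {
  const auto raw = reinterpret_cast<std::uintptr_t>(p);
  assert((raw & kTagMask) == 0 && "pointer must be 8-byte aligned");
  assert(tag <= kTagMask && "tag must fit in the spare bits");
  return raw | tag;
}

// Recover the original pointer by masking the tag bits away.
inline const void *unpackPointer(std::uintptr_t packed) {
  return reinterpret_cast<const void *>(packed & ~kTagMask);
}

// Recover the small value stored in the low bits.
inline unsigned unpackTag(std::uintptr_t packed) {
  return static_cast<unsigned>(packed & kTagMask);
}
```

When fewer spare bits are available, as the message notes for 32-bit targets, the value no longer fits and has to be kept in a separate field.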

Reviewed By: rnk, nikic

Differential Revision: https://reviews.llvm.org/D117262
2022-01-26 10:35:28 -08:00
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a consequence, move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
JackAKirk 0ad19a8331 [CUDA,NVPTX] Corrected fragment size for tf32 LD B matrix.
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D118023
2022-01-25 11:29:19 -08:00
Nikita Popov 30d4a7e295 [IRBuilder] Require explicit element type in CreatePtrDiff()
For opaque pointer compatibility, we cannot derive the element
type from the pointer type.
2022-01-25 12:43:57 +01:00