Commit Graph

32875 Commits

Author SHA1 Message Date
Kazu Hirata 32aa35b504 Drop empty string literals from static_assert (NFC)
Identified with modernize-unary-static-assert.
2022-09-03 11:17:47 -07:00
Kazu Hirata fedc59734a [llvm] Use range-based for loops (NFC) 2022-09-03 11:17:40 -07:00
Kazu Hirata 89f1433225 Use llvm::lower_bound (NFC) 2022-09-03 11:17:37 -07:00
Kazu Hirata bc96b36a41 [CodeGen] Use std::lcm (NFC) 2022-09-03 11:17:33 -07:00
Simon Pilgrim 62cdfdab4d [DAG] canCreateUndefOrPoison - add freeze(insert_subvector(x,y,c)) -> insert_subvector(freeze(x),freeze(y),c) support
We already have plenty of assertions in place to ensure that the insertion index is constant and inrange
2022-09-03 13:41:33 +01:00
Simon Pilgrim e2d140e9c3 [TTI] Add isExpensiveToSpeculativelyExecute wrapper
CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if its better to move an expensive instruction used in a select behind a branch instead.

This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86 as we need to use TCK_SizeAndLatency as an uop count (so its compatible with various target-specific buffer sizes - see D132288), but we can have instructions that have a low TCK_SizeAndLatency value but should still be treated as 'expensive' (FDIV for example) - by adding a isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.
2022-09-03 13:12:22 +01:00
Daniil Fukalov b4e1b0e00d [LiveIntervals] Split live intervals on any dead def
Each dead def of the same virtual register is required to be split into multiple
virtual registers with separate live intervals to avoid MachineVerifier error.

Partially fixes https://github.com/llvm/llvm-project/issues/56050 and
https://github.com/llvm/llvm-project/issues/56051

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D130477
2022-09-02 20:00:22 +03:00
David Green 5073499b69 [TypePromotionPass] Rename variable to avoid name conflict. NFC 2022-09-02 12:35:15 +01:00
Fangrui Song 8d95fd7e56 [MachineFunctionPass] Support -filter-passes for -print-changed
[MachineFunctionPass] Support -filter-passes for -print-changed

-filter-passes specifies a `PassID` (a lower-case dashed-separated pass name,
also used by -print-after, -stop-after, etc) instead of a CamelCasePass.

`-filter-passes=CamelCaseNewPMPass` seems like a workaround for new PM passes before
we can use lower-case dashed-separated pass names (as used by `-passes=`).

Example:
```
# getPassName() is "IRTranslator". PassID is "irtranslator"
llc -mtriple=aarch64 -print-changed -filter-passes=irtranslator < print-changed-machine.ll
```

Close https://github.com/llvm/llvm-project/issues/57453

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133055
2022-09-01 11:06:06 -07:00
Nikita Popov 5134bd432f [DwarfEhPrepare] Assign dummy debug location for inserted _Unwind_Resume calls (PR57469)
DwarfEhPrepare inserts calls to _Unwind_Resume into landing pads.
If _Unwind_Resume happens to be defined in the same module and
debug info is used, then this leads to a verifier error:

  inlinable function call in a function with debug info must
    have a !dbg location
  call void @_Unwind_Resume(ptr %exn.obj) #0

Fix this by assigning a dummy location to the call. (As this
happens in the backend, inlining is not actually relevant here.)

Fixes https://github.com/llvm/llvm-project/issues/57469.

Differential Revision: https://reviews.llvm.org/D133095
2022-09-01 16:35:49 +02:00
Nikita Popov c635ea5c50 [CombinerHelper] Avoid deprecated method (NFC) 2022-09-01 16:09:05 +02:00
Stephen Tozer 211efaa1ce Reapply "[DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops"
Re-landing with an erroneous assert removed.

This reverts commit 58d104b352.
2022-09-01 14:20:24 +01:00
Amara Emerson 4cf3db41da [GlobalISel] Add sdiv exact (X, constant) -> mul combine.
This port of the SDAG optimization is only for exact sdiv case.

Differential Revision: https://reviews.llvm.org/D130517
2022-09-01 13:34:00 +01:00
Craig Topper 77dbc5200b [MachineCSE] Use TargetInstrInfo::isAsCheapAsAMove in isPRECandidate.
Some targets like RISC-V require operands to be inspected to
determine if an instruction is similar to a move.

Spotted while investigating code differences between using an ADDI
vs an ADDIW. RISC-V has the isAsCheapAsAMove flag for ADDI, but
the TII hook checks the immediate is 0 or the register is X0. ADDIW
is never generated with X0 or with an immediate of 0 so it doesn't
have the isAsCheapAsAMove flag.

I don't know enough about the PRE code to write a test for this yet.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132981
2022-08-31 15:39:41 -07:00
Sam Clegg c5c4ba37b1 [WebAssembly][MC] Avoid the need for .size directives for functions
Warn if `.size` is specified for a function symbol.  The size of a
function symbol is determined solely by its content.

I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.

Differential Revision: https://reviews.llvm.org/D132929
2022-08-31 14:28:56 -07:00
Nick Desaulniers d7474bef77 [llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets
This fixes a crash observed after
https://reviews.llvm.org/D129997.

Similar to D88823.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D130127
2022-08-31 13:07:10 -07:00
Simon Pilgrim eaede4b5b7 [DAG] extractShiftForRotate - replace assertion for shift opcode with an early-out
We feed the result from the first extractShiftForRotate call into the second, and that result might no longer be a shift op (usually due to constant folding).

NOTE: We REALLY need to stop creating nodes on the fly inside extractShiftForRotate!

Fixes Issue #57474
2022-08-31 15:50:48 +01:00
Simon Pilgrim 9d22800275 [DAG] visitFreeze - account for operand depth when calling isGuaranteedNotToBeUndefOrPoison (PR57402)
We were calling isGuaranteedNotToBeUndefOrPoison on operands (with Depth = 0), but wasn't accounting for the fact that a later isGuaranteedNotToBeUndefOrPoison assertion will call from the new node (with Depth = 0 as well) - which will then recursively call isGuaranteedNotToBeUndefOrPoison for its operands with Depth = 1

Fixes #57402
2022-08-31 12:20:30 +01:00
Kai Luo ad2f7fd286 [AtomicExpand] Make floating point conversion happens before fence insertion
IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations.
This also fixes atomic load of floating point values which requires fence on PowerPC.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D127609
2022-08-31 09:54:58 +08:00
Markus Böck 2fdf963daf [GlobalISel] Explicitly fail trying to translate `gc.statepoint` and related intrinsics
The provided testcase would previously fail with an assertion due to later down below trying to allocate registers for `token` return types and arguments. This is especially problematic as the process would then exit instead of falling back to using FastIsel.

This patch fixes that by simply explicitly failing translation if either of these intrinsics are encountered.

Fixes https://github.com/llvm/llvm-project/issues/57349

Differential Revision: https://reviews.llvm.org/D132974
2022-08-31 00:47:17 +02:00
David Penry 9aca7b0217 [ModuloScheduler] Fix missing LLVM_DEBUG
Guard a debug message with LLVM_DEBUG

Differential Revision: https://reviews.llvm.org/D132895
2022-08-30 09:20:37 -07:00
Tomas Matheson 9a390d6692 [AArch64][GISel] fix G_ADD*/G_SUB* legalization
widenScalarDst updates the insert point to after MI, so
widenScalarSrc must be called before widenScalarDst. Otherwise
The updated Src values will appear after MI and break SSA. e.g.:

  %14:_(s64), %15:_(s1) = G_UADDE %9:_, %11:_, %13:_

becomes

  %14:_(s64), %16:_(s32) = G_UADDE %9:_, %11:_, %17:_
  %15:_(s1) = G_TRUNC %16:_(s32)
  %17:_(s32) = G_ZEXT %13:_(s1)

Differential Revision: https://reviews.llvm.org/D132547

Change-Id: Ie3458747a6879433f4d5ab9939d2bd102dd0f2db
2022-08-30 10:59:32 +01:00
Xiang1 Zhang a808ac2e42 [NFC] Clang-format for CodeGenPrepare.cpp 2022-08-30 13:42:36 +08:00
wanglian e2bb9774b1 [LegalizeTypes] Support widen result for VECTOR_REVERSE.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D132359
2022-08-30 10:01:26 +08:00
Craig Topper 2f811a6c7f [VP][RISCV] Add vp.fabs intrinsic and RISC-V support.
Mostly just modeled after vp.fneg except there is a
"functional instruction" for fneg while fabs is always an
intrinsic.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D132793
2022-08-29 09:32:06 -07:00
Kazu Hirata 267f21a21b Use std::gcd (NFC)
This patch replaces calls to greatestCommonDivisor with std::gcd where
two arguments are of the same type.  This means that
std::common_type_t of the argument type is the same as the argument
type.

We could drop calls to std::abs in some cases, but that's left for
another patch.
2022-08-28 10:41:51 -07:00
Kazu Hirata d1688e9ddf [llvm] Use std::gcd (NFC)
This patch replaces calls to greatestCommonDivisor with std::gcd where
both arguments are known to be of unsigned.  This means that
std::common_type_t of the two argument types should just be the wider
one of the two.
2022-08-27 23:54:29 -07:00
Kazu Hirata 9d6ab7230b [GlobalISel] Use std::lcm (NFC)
This patch replaces getLCMSize with std::lcm, a C++17 feature.

Note that all the arguments are of unsigned with no implicit type
conversion as they are passed to getLCMSize.
2022-08-27 09:53:16 -07:00
Kazu Hirata 21de2888a4 Use llvm::is_contained (NFC) 2022-08-27 09:53:11 -07:00
Matthias Gehre 3e39b27101 [llvm/CodeGen] Add ExpandLargeDivRem pass
Adds a pass ExpandLargeDivRem to expand div/rem instructions
with more than 128 bits into a loop computing that value.

As discussed on https://reviews.llvm.org/D120327, this approach has the advantage
that it is independent of the runtime library. This also helps the clang driver,
which otherwise would need to understand enough about the runtime library
to know whether to allow _BitInts with more than 128 bits.

Targets are still free to disable this pass and instead provide a faster
implementation in a runtime library.

Fixes https://github.com/llvm/llvm-project/issues/44994

Differential Revision: https://reviews.llvm.org/D126644
2022-08-26 11:55:15 +01:00
Simon Pilgrim 88c7b16bed [DAG] Strip poison generating flags in freeze(op()) -> op(freeze()) fold
This patch follows the InstCombine approach of stripping poison generating flags (nsw/nuw from add/sub etc.) to allow us to push a freeze() through the op. Unlike InstCombine it doesn't retain any flags, but we have plenty of DAG folds that do the same thing already. We assert that the newly generated op isGuaranteedNotToBeUndefOrPoison.

Similar to the ValueTracking approach, isGuaranteedNotToBeUndefOrPoison has been updated to confirm that if an op can't create undef/poison and its operands are guaranteed not to be undef/poison - then its not undef/poison. This is just for the generic opcodes - target specific opcodes will need to do this manually just in case they have some special cases.

Differential Revision: https://reviews.llvm.org/D132333
2022-08-26 11:47:51 +01:00
Matthias Gehre 6d13b80fcb Revert "[SelectionDAG] Emit calls to __divei4 and friends for division/remainder of large integers"
This reverts https://reviews.llvm.org/D120329.
I abandoned the PR [0] to add __divei4 functions to compiler-rt
in favor of adding a pass to transform div/rem [1].

This removes the backend code that was supposed to emit calls to the __divei4 functions.

[0] https://reviews.llvm.org/D120327
[1] https://reviews.llvm.org/D130076

Differential Revision: https://reviews.llvm.org/D130079
2022-08-26 10:52:56 +01:00
Alex Richardson 0483b00875 Mark the $local function begin symbol as a function
While this does not matter for most targets, when building for Arm Morello,
we have to mark the symbol as a function and add size information, so that
LLD can correctly evaluate relocations against the local symbol.
Since Morello is an out-of-tree target, I tried to reproduce this with
in-tree backends and with the previous reviews applied this results in
a noticeable difference when targeting Thumb.

Background: Morello uses a method similar Thumb where the encoding mode is
specified in the LSB of the symbol. If we don't mark the target as a
function, the relocation will not have the LSB set and calls will end up
using the wrong encoding mode (which will almost certainly crash).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D131429
2022-08-26 09:34:04 +00:00
wanglian 2887d7786f [DAGCombiner] Use FoldConstantArithmetic instead of dyn_cast in visitFP_ROUND.
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132439
2022-08-25 11:29:05 +08:00
Matthias Braun 5364f49407 Fix CSR update check
D132080 introduced a bug leading to `RegisterClassInfo` caches not
getting invalidated when there was exactly one more CSR register added.

Differential Revision: https://reviews.llvm.org/D132606
2022-08-24 18:09:49 -07:00
Sami Tolvanen cff5bef948 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Relands 67504c9549 with a fix for
32-bit builds.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 22:41:38 +00:00
Sami Tolvanen a79060e275 Revert "KCFI sanitizer"
This reverts commit 67504c9549 as using
PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.
2022-08-24 19:30:13 +00:00
Sami Tolvanen 67504c9549 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 18:52:42 +00:00
spupyrev 8d5b694da1 extending code layout alg
The diff modifies ext-tsp code layout algorithm in the following ways:
(i) fixes merging of cold block chains (this is a port of D129397);
(ii) adjusts the cost model utilized for optimization;
(iii) adjusts some APIs so that the implementation can be used in BOLT; this is
a prerequisite for D129895.

The only non-trivial change is (ii). Here we introduce different weights for
conditional and unconditional branches in the cost model. Based on the new model
it is slightly more important to increase the number of "fall-through
unconditional" jumps, which makes sense, as placing two blocks with an
unconditional jump next to each other reduces the number of jump instructions in
the generated code. Experimentally, this makes a mild impact on the performance;
I've seen up to 0.2%-0.3% perf win on some benchmarks.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D129893
2022-08-24 09:40:25 -07:00
Simon Pilgrim f9de13232f [X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch
This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis.

For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling.

Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU.

Differential Revision: https://reviews.llvm.org/D132520
2022-08-24 17:28:18 +01:00
Stephen Tozer 58d104b352 Revert "[DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops"
Reverting due to reported errors when running Linux kernel builds with
KMSAN -gdwarf-4.

This reverts commit 2cb9e1ac42.
2022-08-24 15:24:32 +01:00
Simon Pilgrim 5377abcde2 [DAG] matchRotateHalf - constify SelectionDAG arg. NFC.
Based off Issue #57283 - we need to try harder to ensure we're not creating nodes on-the-fly - so make sure we're just using SelectionDAG for analysis where possible
2022-08-24 10:57:38 +01:00
Simon Pilgrim e624f8a3bb [DAG] MatchRotate - bail if we fail to match a shl/srl pair
extractShiftForRotate may fail to return canonicalized shifts due to constant folding or other simplification that can occur in getNode()

Fixes Issue #57283
2022-08-24 03:05:07 +01:00
Sanjay Patel f8dfbea324 [SDAG] expand more is-power-of-2 patterns that use popcount
(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)

Adjust the legality check to avoid the poor codegen on AArch64.
We probably only want to use popcount on this pattern when it
is a single instruction.

fixes #57225

Differential Revision: https://reviews.llvm.org/D132237
2022-08-23 17:53:53 -04:00
Stephen Tozer 2cb9e1ac42 [DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops
This patch builds on prior support patches to enable support for
variadic debug values in InstrRefLDV, allowing DBG_VALUE_LISTs to
have their ranges extended.

Differential Revision: https://reviews.llvm.org/D128212
2022-08-23 20:17:09 +01:00
Arthur Eubanks d6cc7a5b46 [FastISel] Respect musttail over "disable-tail-calls"
musttail should be honored even in the presence of attributes like "disable-tail-calls". SelectionDAG properly handles this.

Update LangRef to explicitly mention that this is the semantics of musttail.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D132193
2022-08-23 08:55:40 -07:00
Jakub Kuderski 6fa87ec10f [ADT] Deprecate is_splat and replace all uses with all_equal
See the discussion thread for more details:
https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D132335
2022-08-23 11:36:27 -04:00
Stephen Tozer 89d0cc99ec [DebugInfo][InstrRef] Handle transfers of variadic debug values in LDV
This patch adds the last of the changes required to enable
DBG_VALUE_LIST handling in InstrRefLDV, handling variadic debug values
during the transfer tracking step. Most of the changes are fairly
straightforward, and based around tracking multiple locations per
variable in TransferTracker::VLocTracker.

Differential Revision: https://reviews.llvm.org/D128211
2022-08-23 15:01:28 +01:00
Thomas Symalla 562accddaa
[NFC] Fix typo in dbg message in RegisterCoalescer.
funcion => function
2022-08-23 15:14:45 +02:00
Stephen Tozer b12e5c884f [DebugInfo][InstrRef][NFC] Emit variadic debug values from InstrRefLDV
In preparation for supporting DBG_VALUE_LIST in InstrRefLDV, this patch
adds the logic for emitting DBG_VALUE_LIST instructions from
InstrRefLDV. The logical changes here are fairly simple, with the main
change being that instead of directly prepending offsets to the DIExpr,
we use appendOpsToArg to modify the expression for individual debug
operands in the expression. The function emitLoc is also changed to take
a list of debug ops, with an empty list meaning an undef value.

Differential Revision: https://reviews.llvm.org/D128209
2022-08-23 13:22:56 +01:00