Commit Graph

11556 Commits

Author SHA1 Message Date
Carl Ritson cfbb92441f [SDAG] Fix pow2 assumption when splitting vectors
When reducing vector builds to shuffles it possible that
the DAG combiner may try to extract invalid subvectors.

This happens as the existing code assumes vectors will be power
of 2 sizes, which is already untrue, but becomes more noticable
with v6 and v7 types.
Specifically the existing code assumes that half PowerOf2Ceil of
a given vector index will fit twice into a given vector.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103880
2021-06-11 08:58:16 +09:00
David Spickett 64de8763aa Revert "Implementation of global.get/set for reftypes in LLVM IR"
This reverts commit 31859f896c.

Causing SVE and RISCV-V test failures on bots.
2021-06-10 10:11:17 +00:00
Paulo Matos 31859f896c Implementation of global.get/set for reftypes in LLVM IR
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D95425
2021-06-10 10:07:45 +02:00
Sanjay Patel dd763ac791 [SDAG] fix miscompile from merging stores of different sizes
As shown in:
https://llvm.org/PR50623
...and the similar tests here, we were not accounting for
store merging of different sizes that do not cover the
entire range of the wide value to be stored.

This is the easy fix: just make sure that all of the
original stores are the same size, so when we calculate
the wide width, it's a simple N * M check.

This still allows all of the motivating optimizations from:
D86420 / 54a5dd485c
D87112 / 7a06b166b1

We could enhance this code to track individual bytes and
allow merging multiple sizes.
2021-06-09 09:51:39 -04:00
Simon Pilgrim 114e712c34 InstrEmitter.cpp - don't dereference a dyn_cast<>.
dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.
2021-06-08 17:59:04 +01:00
Simon Pilgrim 61a2d6bfe4 [DAG] foldShuffleOfConcatUndefs - ensure shuffles of upper (undef) subvector elements is undef (PR50609)
shuffle(concat(x,undef),concat(y,undef)) -> concat(shuffle(x,y),shuffle(x,y))

If the original shuffle references any of the upper (undef) subvector elements, ensure the split shuffle masks uses undef instead of an out-of-bounds value.

Fixes PR50609
2021-06-08 15:49:41 +01:00
Hans Wennborg 386b66b2fc Revert "3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands""
> This reapplies c0f3dfb9, which was reverted following the discovery of
> crashes on linux kernel and chromium builds - these issues have since
> been fixed, allowing this patch to re-land.

This reverts commit 36ec97f76a.

The change caused non-determinism in the compiler, see comments on the code
review at https://reviews.llvm.org/D91722.

Reverting to unbreak people's builds until that can be addressed.

This also reverts the follow-up "[DebugInfo] Limit the number of values
that may be referenced by a dbg.value" in
a0bd6105d8.
2021-06-08 14:54:08 +02:00
David Green b889c6ee99 [DAG] Allow isNullOrNullSplat to see truncated zeroes
This sets the AllowTruncation flag on isConstOrConstSplat in
isNullOrNullSplat, allowing it to see truncated constant zeroes on
architectures such as AArch64, where only a i32.i64 are legal. As a
truncation of 0 is always 0, this should always be valid, allowing some
extra folding to happen including some of the cases from D103755.

Differential Revision: https://reviews.llvm.org/D103756
2021-06-08 10:18:58 +01:00
Arthur Eubanks 47211fa889 Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
Needs to be discussed more.

This reverts commit 255a5c1baa6020c009934b4fa342f9f6dbbcc46
This reverts commit df2056ff3730316f376f29d9986c9913b95ceb1
This reverts commit faff79b7ca144e505da6bc74aa2b2f7cffbbf23
This reverts commit d2a9020785c6e02afebc876aa2778fa64c5cafd
2021-06-07 16:07:44 -07:00
Guillaume Chatelet 1da2c7d25c [NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE
Differential Revision: https://reviews.llvm.org/D103251
2021-06-07 10:04:16 +00:00
Fraser Cormack aec9cbbeb8 [SelectionDAG] Extend FoldConstantVectorArithmetic to SPLAT_VECTOR
This patch extends the SelectionDAG's ability to constant-fold vector
arithmetic to include support for SPLAT_VECTOR. This is not only for
scalable-vector types but also for fixed-length vector types, which
helps Hexagon in a couple of cases.

The original RISC-V test case was in fact an infinite DAGCombine loop.
The pattern `and (truncate v1), (truncate v2)` can be combined to
`truncate (and v1, v2)` but the truncate can similarly be combined back
to `truncate (and v1, v2)` (but, crucially, only when one of `v1` or
`v2` is a constant vector).

It wasn't exposed in on fixed-length types because a TRUNCATE of a
constant BUILD_VECTOR was folded into the BUILD_VECTOR itself, whereas
this did not happen for the equivalent (scalable-vector) SPLAT_VECTOR.

Reviewed By: RKSimon, craig.topper

Differential Revision: https://reviews.llvm.org/D103246
2021-06-04 09:53:15 +01:00
Arthur Eubanks 9255a5c1ba [TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

Issues can be diagnosed with D103412.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806
2021-06-03 15:52:01 -07:00
Fraser Cormack 1de1887f5f [CodeGen] Fix a scalable-vector crash in VSELECT legalization
The `DAGTypeLegalizer::WidenVSELECTMask` function is not (yet) ready for
scalable vector types, and has numerous places in which it tries to grab
either the fixed size or number of elements of its types.

I believe that it should be possible to update this method to properly
account for scalable-vector types, but we don't have test cases for
that; RISC-V bails out early on as it has legal i1 vector masks. As
such, this patch just prevents it from crashing.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103536
2021-06-03 10:24:55 +01:00
Sanjay Patel 0718ac706d [SDAG] allow cast folding for vector sext-of-setcc with signed compare
This extends 434c8e013a and ede3982792 to handle signed
predicates by sign-extending the setcc operands.

This is not shown directly in https://llvm.org/PR50055 ,
but the pattern is visible by changing the unsigned convert
to signed in the source code.
2021-06-02 15:05:02 -04:00
Sanjay Patel ede3982792 [SDAG] allow more cast folding for vector sext-of-setcc
This is a follow-up to D103280 that eases the use restrictions,
so we can handle the motivating case from:
https://llvm.org/PR50055

The loop code is adapted from similar use checks in
ExtendUsesToFormExtLoad() and SliceUpLoad(). I did not see an
easier way to filter out non-chain uses of load values.

Differential Revision: https://reviews.llvm.org/D103462
2021-06-02 13:14:49 -04:00
Bjorn Pettersson 536e02a23c [CodeGen] Refactor libcall lookups for RTLIB::POWI_*
Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_*
libcall for a given value type.

This includes a small refactoring of ExpandFPLibCall and
ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code,
plus adding an ExpandFPLibCall version that can be called directly
when expanding FPOWI/STRICT_FPOWI to ensure that we actually use
the same RTLIB::Libcall when expanding the libcall as we used when
checking the legality of such a call by doing a getLibcallName check.

Differential Revision: https://reviews.llvm.org/D103050
2021-06-02 11:40:34 +02:00
Bjorn Pettersson d1273d39d3 [LegalizeTypes] Avoid promotion of exponent in FPOWI
The FPOWI DAG node is normally lowered to a libcall to one of the
RTLIB::POWI* runtime functions and the exponent should normally
have a type matching sizeof(int) when making the call. Thus,
type promotion of the exponent could lead to an FPOWI with a type
for the second operand that would be incorrect when doing the
libcall (a situation which would be hard to detect post-legalization
if we allow such FPOWI nodes).

This patch is changing DAGTypeLegalizer::PromoteIntOp_FPOWI to
do the rewrite into a libcall directly instead of promoting the
operand. This way we can check that the exponent is smaller than
sizeof(int) and we can let TargetLowering handle promotion as
part of making the libcall. It could be noticed here that makeLibCall
has some knowledge about targets such as 64-bit RISCV, for which the
libcall argument should be extended to a type larger than sizeof(int).

Differential Revision: https://reviews.llvm.org/D102950
2021-06-02 11:40:34 +02:00
Sanjay Patel 1b14f3951a [SDAG] add helper function for sext-of-setcc folds; NFC
Try to make this easier to read as noted in D103280
2021-06-01 08:07:17 -04:00
Sanjay Patel 63fe4cb082 [SDAG] add check to sext-of-setcc fold to bypass changing a legal op
I accidentaly pushed a draft of D103280 that was discussed
during the review, but it was not supposed to be the final
version.

Rather than revert and recommit, I'm updating the existing
code. This way we have a record of the codegen diff that
would result if we decide to remove this predicate in the
future.
2021-05-31 08:58:11 -04:00
Sanjay Patel 434c8e013a [SDAG] try harder to fold casts into vector compare
sext (vsetcc X, Y) --> vsetcc (zext X), (zext Y) --
(when the zexts are free and a bunch of other conditions)

We have a couple of similar folds to this already for vector selects,
but this pattern slips through because it is only a setcc.

The tests are based on the motivating case from:
https://llvm.org/PR50055
...but we need extra logic to get that example, so I've left that as
a TODO for now.

Differential Revision: https://reviews.llvm.org/D103280
2021-05-31 07:14:01 -04:00
Florian Hahn 126f90b252
[DAGCombine] Poison-prove scalarizeExtractedVectorLoad.
extractelement is poison if the index is out-of-bounds, so just
scalarizing the load may introduce an out-of-bounds load, which is UB.

To avoid introducing new UB, we can mask the index so it only contains
valid indices.

Fixes PR50382.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D103077
2021-05-30 11:40:55 +01:00
Arthur Eubanks 71cca4f728 Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
This reverts commit 1c7f32334d.

Some code still needs to properly set parameter ABI attributes, see
D101806.
2021-05-29 23:08:15 -07:00
Arthur Eubanks 3a6f12f915 Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering"
This reverts commit bc7d15c61d.

Dependent change is to be reverted.
2021-05-29 22:40:33 -07:00
Eli Friedman 0b3b0a727a [AArch64][RISCV] Make sure isel correctly honors failure orderings.
If a cmpxchg specifies acquire or seq_cst on failure, make sure we
generate code consistent with that ordering even if the success ordering
is not acquire/seq_cst.

At one point, it was ambiguous whether this sort of construct was valid,
but the C++ standad and LLVM now accept arbitrary combinations of
success/failure orderings.

This doesn't address the corresponding issue in AtomicExpand. (This was
reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .)

Fixes https://bugs.llvm.org/show_bug.cgi?id=50512.

Differential Revision: https://reviews.llvm.org/D103284
2021-05-28 12:47:40 -07:00
Craig Topper 2830d924b0 [VP] Make getMaskParamPos/getVectorLengthParamPos return unsigned. Lowercase function names.
Parameter positions seem like they should be unsigned.

While there, make function names lowercase per coding standards.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D103224
2021-05-28 11:28:47 -07:00
Craig Topper d24d2447cd [SelectionDAG] Fix typo in assert. NFC 2021-05-28 10:37:11 -07:00
Tim Northover 9ff2eb1ea5 SwiftTailCC: teach verifier musttail rules applicable to this CC.
SwiftTailCC has a different set of requirements than the C calling convention
for a tail call. The exact argument sequence doesn't have to match, but fewer
ABI-affecting attributes are allowed.

Also make sure the musttail diagnostic triggers if a musttail call isn't
actually a tail call.
2021-05-28 11:12:00 +01:00
Fraser Cormack 5a80dc4988 [VP][SelectionDAG] Add a target-configurable EVL operand type
This patch adds a way for the target to configure the type it uses for
the explicit vector length operands of VP SDNodes. The type must be a
legal integer type (there is still no target-independent legalization of
this operand) and must currently be at least as big as i32, the type
used by the IR intrinsics. An implicit zero-extension takes place on
targets which choose a larger type. All VP nodes should be created with
this type used for the EVL operand.

This allows 64-bit RISC-V to avoid custom legalization of all VP nodes,
keeping them in their target-independent form for that bit longer.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D103027
2021-05-27 15:27:36 +01:00
Fraser Cormack b7101e218c [DAGCombine][RISCV] Don't try to trunc-store combined vector stores
DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told
whether it's to use vector types and also whether it's to issue a
truncating store. However, the truncating store code path assumes a
scalar integer `ConstantSDNode`, and when using vector types it creates
either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which
is a constant.

The `riscv64` target is able to expose a crash here because it switches
on both code paths at the same time. The `f32` is stored as `i32` which
must be promoted to `i64`, necessitating a truncating store.
It also decides later that it prefers a vector store of `v2f32`.

While vector truncating stores are legal, this combine is not able to
emit them. We also don't have a test case. This patch adds an assert to
catch this case more gracefully, and updates one of the caller functions
to the function to turn off the use of truncating stores when preferring
vectors.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103173
2021-05-27 14:16:32 +01:00
Fraser Cormack 772b58a641 [SelectionDAG][RISCV] Don't unroll 0/1-type bool VSELECTs
This patch extends the cases in which the legalizer is able to express
VSELECT in terms of XOR/AND/OR. When dealing with a VSELECT between
boolean vector types, the mask itself is an all-ones or all-ones value
of the operand type, so a 0/1 boolean type behaves identically to a 0/-1
type.

This greatly helps RISC-V which relies on expansion for these nodes. It
also allows scalable-vector bool VSELECTs to use the default expansion,
where before it would crash in SelectionDAG::UnrollVectorOp.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103147
2021-05-27 10:08:57 +01:00
Jonas Paulsson d058262b14 [SystemZ] Support i128 inline asm operands.
Support virtual, physical and tied i128 register operands in inline assembly.

i128 is on SystemZ not really supported and is not a legal type and generally
such a value will be split into two i64 parts. There are however some
instructions that require a pair of two GPR64 registers contained in the GR128
bit reg class, which is untyped.

For inline assmebly operands, it proved to be very cumbersome to first follow
the general behavior of splitting an i128 operand into two parts and then
later rebuild the INLINEASM MI to have one GR128 register. Instead, some
minor common code changes were made to SelectionDAGBUilder to only create one
GR128 register part to begin with. In particular:

- getNumRegisters() now has an optional parameter "RegisterVT" which is
  passed by AddInlineAsmOperands() and GetRegistersForValue().

- The bitcasting in GetRegistersForValue is not performed if RegVT is
  Untyped.

- The RC for a tied use in AddInlineAsmOperands() is now computed either from
  the tied def (virtual register), or by getMinimalPhysRegClass() (physical
  register).

- InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register
  class (DstRC) can also be computed for an illegal type.

In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and
joinRegisterPartsIntoValue() have been implemented to handle i128 operands.

Differential Revision: https://reviews.llvm.org/D100788

Review: Ulrich Weigand
2021-05-26 10:08:32 -05:00
Michael Liao c9dd29925f [SelectionDAG] Propagate scoped AA metadata when lowering mem intrinsics.
- When memory intrinsics, such as memcpy, the attached scoped AA
  metadata is not passed down to the backend. As a result, the backend
  cannot schedule relevant memory operations around them following that
  hint. In this patch, SelectionDAG is enhanced to propagate that
  metadata (scoped AA only) when they are lowered into loads and stores.

Differential Revision: https://reviews.llvm.org/D102215
2021-05-25 14:42:26 -04:00
LemonBoy fd5cc41818 [SelectionDAG] Fix argument copy elision with irregular types
D29668 enabled to avoid a useless copy of the argument value into an alloca if the caller places it in memory (as it often happens on x86) by directly forwarding the pointer to it. This optimization is illegal if the type contains padding bytes: if a truncating store into the alloca is replaced the upper bits are filled with garbage and produce code misbehaving at runtime.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D102153
2021-05-22 09:43:37 +02:00
Stephen Tozer 36ec97f76a 3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
This reapplies c0f3dfb9, which was reverted following the discovery of
crashes on linux kernel and chromium builds - these issues have since
been fixed, allowing this patch to re-land.

This reverts commit 4397b7095d.
2021-05-21 11:06:20 +01:00
Jessica Clarke e10958c807 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Re-landed again after D102819 fixed PowerPC to correctly zero-extend all
of its atomics as it claimed to do, since the combination of that bug
and this optimisation caused buildbot regressions.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-20 20:34:23 +01:00
Fraser Cormack 26bd2250c1 [RISCV] Ensure shuffle splat operands are type-legal
The use of `SelectionDAG::getSplatValue` isn't guaranteed to return a
type-legal splat value as it may implicitly extract a vector element
from another shuffle. It is not permitted to introduce an illegal type
when lowering shuffles.

This patch addresses the crash by adding a boolean flag to
`getSplatValue`, defaulting to false, which when set will ensure a
type-legal return value. If it is unable to do that it will fail to
return a splat value.

I've been through the existing uses of `getSplatValue` in other targets
and was unable to find a need or test cases showing a need to update
their uses. In some cases, the call is made during `LegalizeVectorOps`
which may still produce illegal scalar types. In other situations, the
illegally-typed splat value may be quickly patched up to a legal type
(such as any-extending the returned `extract_vector_elt` up to a legal
type) before `LegalizeDAG` notices.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102687
2021-05-20 18:00:03 +01:00
Stephen Tozer cf725dde9c [DebugInfo] Handle DIArgList in FastISel or GlobalIsel
Currently, variadic dbg.values (i.e. those using a DIArgList as part of
their location) are not handled properly by FastISel or GlobalISel, and
will produce invalid DBG_VALUE instructions if they encounter them. This
patch fixes this issue by emitting undef DBG_VALUE instructions for
variadic dbg.values, so that no incorrect instruction is produced and
any prior variable location is terminated.

This is simply a quick-fix to prevent errors; a correct implementation
should come later for these ISel pipelines to ensure that we do not drop
debug information unnecessarily.

Differential Revision: https://reviews.llvm.org/D102500
2021-05-20 17:37:28 +01:00
David Sherwood a21bff0673 [CodeGen] Add support for widening the result of EXTRACT_SUBVECTOR
When trying to return a type such as <vscale x 1 x i32> from a
function we crash in DAGTypeLegalizer::WidenVecRes_EXTRACT_SUBVECTOR
when attempting to get the fixed number of elements in the vector.

For the simple case we are dealing with, i.e. extracting
<vscale x 1 x i32> from index 0 of input vector <vscale x 4 x i32>
we can simply rely upon existing code that just returns the input.

Differential Revision: https://reviews.llvm.org/D102605
2021-05-20 12:27:08 +01:00
David Sherwood d07d5c1b06 [CodeGen] Add support for widening INSERT_SUBVECTOR operands
When attempting to return something like a <vscale x 1 x i32>
type from a function we end up trying to widen the vector by
inserting a <vscale x 1 x i32> subvector into an undefined
<vscale x 4 x i32> vector. However, during legalisation we
then attempt to widen the INSERT_SUBVECTOR operands and hit
an error in WidenVectorOperand.

This patch adds a new WidenVecOp_INSERT_SUBVECTOR function
that currently only supports inserting subvectors into undefined
vectors.

Differential Revision: https://reviews.llvm.org/D102501
2021-05-20 10:37:03 +01:00
Sanjay Patel 6025663578 [SDAG] propagate FMF from target-specific IR intrinsics
This is a step towards relying more on node-level FMF rather than function-wide
or target settings.
I think it was just an oversight that we didn't get this path in D87361
or follow-on patches.

The lack of FMF propagation is blocking D90901 from converting tests to IR-level FMF.

We can't do much more than this currently because we also fail to propagate flags
from x86-specific node to generic FMA node. That would be another patch, so the
test just verifies that we can transfer from IR to initial SDAG node.

Differential Revision: https://reviews.llvm.org/D102725
2021-05-19 07:50:50 -04:00
Arthur Eubanks bc7d15c61d [NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().

A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.

The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.

This is a reland after an MSan fix in D102667.

Differential Revision: https://reviews.llvm.org/D101713
2021-05-18 14:30:22 -07:00
Arthur Eubanks 1c7f32334d [TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

This is a reland after fixing MSan issues in D102667.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806
2021-05-18 14:30:22 -07:00
Ten Tzen 797ad70152 [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.

This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).
This is the first step of this project; only X86_64 target is enabled in this patch.

Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow
three rules:
* First, no exception can move in or out of _try region., i.e., no "potential
  faulty instruction can be moved across _try boundary.
* Second, the order of exceptions for instructions 'directly' under a _try
  must be preserved (not applied to those in callees).
* Finally, global states (local/global/heap variables) that can be read
  outside of _try region must be updated in memory (not just in register)
  before the subsequent exception occurs.

The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding
process.

Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.

This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.

One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:

A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing
SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others.  If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:

Warning: jump bypasses variable with a non-trivial destructor

In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.

Implementation:
Part-1: Clang implementation described below.

Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end().
_scope_begin() is immediately added after ctor() is called and EHStack is pushed.
So it must be an invoke, not a call. With that it's also guaranteed an
EH-cleanup-pad is created regardless whether there exists a call in this scope.
_scope_end is added before dtor(). These two intrinsics make the computation of
Block-State possible in downstream code gen pass, even in the presence of
ctor/dtor inlining.

Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark
_try boundary and to prevent from exceptions being moved across _try boundary.
All memory instructions inside a _try are considered as 'volatile' to assure
2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's
acceptable as the amount of code directly under _try is very small.

Part-2 (will be in Part-2 patch): LLVM implementation described below.

For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).

For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.

The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Differential Revision: https://reviews.llvm.org/D80344/new/
2021-05-17 22:42:17 -07:00
Serguei Katkov 57c660f374 [Statepoint Lowering] Cleanup: remove unused option statepoint-always-spill-base. 2021-05-18 12:15:15 +07:00
Simon Pilgrim c29522d648 [TargetLowering] prepareUREMEqFold/prepareSREMEqFold - account for non legal shift types
Ensure we tell getShiftAmountTy that we're working with pre-legalized types to prevent cases where the (legalized) shift type can no longer handle the (non-legalized) type width.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34366
2021-05-17 11:03:27 +01:00
Fraser Cormack 85e31eddf2 [DAGCombiner] Relax an assertion to an early return
The select-of-constants transform was asserting that its constant vector
inputs did not implicitly truncate their input without that as an
explicit precondition to the function. This patch relaxes that assertion
into an early return to skip the optimization.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D102393
2021-05-17 09:15:55 +01:00
Arthur Eubanks 341902672c Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
This reverts commit 16748bd2fb.

Causes https://crbug.com/1209013
2021-05-16 22:02:10 -07:00
Arthur Eubanks 7647cb14dc Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering"
This reverts commit 85af8a8c1b.
2021-05-16 22:00:54 -07:00
Pan, Tao 976a3e5f61 [SelectionDAG] Make fast and linearize visible by clang -pre-RA-sched
ScheduleDAGFast.cpp is compiled to object file, but the ScheduleDAGFast
object file isn't linked into clang executable file as no symbol is
referred by outside. Add calling to createXxx of ScheduleDAGFast.cpp,
then the ScheduleDAGFast object file will be linked into clang
executable file. The static RegisterScheduler will register scheduler
fast and linearize at clang boot time.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D101601
2021-05-17 11:25:15 +08:00
Nikita Popov fb9ed1979a [IR] Add BasicBlock::isEntryBlock() (NFC)
This is a recurring and somewhat awkward pattern. Add a helper
method for it.
2021-05-15 12:41:58 +02:00
Sanjay Patel 9dfd7f9b67 [SDAG] reduce code duplication for extend_vec_inreg combines; NFC
These are identical so far, and I was looking at adding a fold
for a pattern with scalar_to_vector which would also nd up duplicated.
2021-05-14 08:29:57 -04:00
Tim Northover ea0eec69f1 IR+AArch64: add a "swiftasync" argument attribute.
This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
2021-05-14 11:43:58 +01:00
cynecx 8ec9fd4839 Support unwinding from inline assembly
I've taken the following steps to add unwinding support from inline assembly:

1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:

```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
    to label %exit unwind label %uexit
```

2.) Add Bitcode writing/reading support + LLVM-IR parsing.

3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.

4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.

5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.

6.) Don't allow unwinding callbr.

Reviewed By: Amanieu

Differential Revision: https://reviews.llvm.org/D95745
2021-05-13 19:13:03 +01:00
Max Kazantsev d8b37de8a4 [GC][NFC] Move GCStrategy from CodeGen to IR
We want it to be available in analyzes so that we could use the
CodeGen notion in middle-end passes (for example, to check if
a GC may free some particular pointer).

This is a preparatory patch that simply moves the files around.

Note: if this causes some build issues, this patch must just be reverted.

Differential Revision: https://reviews.llvm.org/D100557
Reviewed By: reames
2021-05-13 12:31:59 +07:00
Fraser Cormack c5ec00e62b [TargetLowering] Improve legalization of scalable vector types
This patch extends the vector type-conversion and legalization capabilities of
scalable vector types.

Firstly, `vscale x 1` types now behave more like the corresponding `vscale x
2+` types. This enables the integer promotion legalization of extended scalable
types, such as the promotion of `<vscale x 1 x i5>` to `<vscale x 1 x i8>`.

These `vscale x 1` types are also now better handled by
`getVectorTypeBreakdown`, where what looks like older handling for 1-element
fixed-length vector types was spuriously updated to include scalable types.

Widening of scalable types is now better supported, by using `INSERT_SUBVECTOR`
to insert the smaller scalable vector "value" type into the wider scalable
vector "part" type. This allows AArch64 to pass and return `vscale x 1` types
by value by widening.

There are still cases where we are unable to legalize `vscale x 1` types, such
as where expansion would require splitting the vector in two.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102073
2021-05-12 16:33:07 +01:00
Stefan Pintilie 8d37411e48 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This reverts commit 6c80361b84.
Breaks PowerPC Big Endian buildbots.
2021-05-12 09:46:18 -05:00
Hendrik Greving 762ac725bf [DAGCombiner] Fix DAG combine store elimination, different address space.
Fixes a bug in the DAG combiner that eliminates the stores because it missed
to inspect the address space of the pointers.

%v = load %ptr_as1
// no chain side effect
store %v, %ptr_as2

As well as

store %v, %ptr_as1
store %v, %ptr_as2

Fixes a test for above in X86.

Differential Revision: https://reviews.llvm.org/D102096
2021-05-12 07:14:22 -07:00
Arthur Eubanks 85af8a8c1b [NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().

A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.

The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.

Differential Revision: https://reviews.llvm.org/D101713
2021-05-10 13:05:15 -07:00
Arthur Eubanks 16748bd2fb [TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806
2021-05-10 12:35:11 -07:00
Bradley Smith 635164b95a [AArch64][SVE] Improve SVE codegen for fixed length BITCAST
Expanding a fixed length operation involves wrapping the operation in an
insert/extract subvector pair, as such, when this is done to bitcast we
end up with an extract_subvector of a bitcast. DAGCombine tries to
convert this into a bitcast of an extract_subvector which restores the
initial fixed length bitcast, causing an infinite loop of legalization.

As part of this patch, we must make sure the above DAGCombine does not
trigger after legalization if the created bitcast would not be legal.

Differential Revision: https://reviews.llvm.org/D101990
2021-05-10 14:43:53 +01:00
Fraser Cormack 6db0cedd23 [LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion
This patch extends VectorLegalizer::ExpandSELECT to permit expansion
also for scalable vector types. The only real change is conditionally
checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the
vector type.

We can use this to fix "cannot select" errors for scalable vector
selects on the RISCV target. Note that in future patches RISCV will
possibly custom-lower vector SELECTs to VSELECTs for branchless codegen.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102063
2021-05-10 08:22:35 +01:00
Simon Pilgrim dd21c6b843 [DAG] Ensure all SD classes consistently return a const reference with getDebugLoc(). NFCI.
Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
2021-05-07 14:48:23 +01:00
Stephen Tozer ce0c1f3ced [DebugInfo] Fix crash when emitting an invalidated SDDbgValue
This patch fixes a crash in the compiler that occurs when certain
invalidated SDDbgValues are emitted. The cause of this was that we would
attempt to check the liveness of the debug value's operands, which
triggers an assert if any of those operands are invalid. This patch
changes this check such that it only occurs if the SDDbgValue is valid;
if not, the check is irrelevant anyway, so can be safely ignored.

Differential Revision: https://reviews.llvm.org/D101540
2021-05-07 13:13:56 +01:00
Simon Pilgrim 280aa3415e [DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts
Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts.

Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication.

I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to).

NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly.

Differential Revision: https://reviews.llvm.org/D101987
2021-05-07 13:12:30 +01:00
Jessica Clarke 6c80361b84 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-06 04:01:20 +01:00
Jessica Clarke 897d7bceb9 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This seems to have broken sanitizers, giving lots of

  Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed.

failures across multiple targets (currently X86 and PowerPC). Reverting
until I have a chance to reproduce and debug.

This reverts commit 6e876f9ded.
2021-05-05 17:02:05 +01:00
Jessica Clarke 6e876f9ded [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-05 16:34:45 +01:00
Christudasan Devadasan 80c79035ef DAG: Cleanup assertion in EmitFuncArgumentDbgValue
Removing an assertion introduced with D68945. The
patch was later reverted with 6531a78ac4, but failed
to remove this assertion. It causes a problem while
trying to split a 64-bit argument into sub registers.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D101594
2021-05-04 21:48:58 +05:30
Tomas Matheson 9d86095ff8 Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0"
This reverts commit 753185031d.
2021-05-03 21:48:20 +01:00
Tomas Matheson 753185031d [CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0
atomicrmw instructions are expanded by AtomicExpandPass before register allocation
into cmpxchg loops. Register allocation can insert spills between the exclusive loads
and stores, which invalidates the exclusive monitor and can lead to infinite loops.

To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them
after register allocation.

Floating point legalisation:
f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to
f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually
f32 ATOMIC_LOAD_FADD_16(*i16, f32)

Differential Revision: https://reviews.llvm.org/D101164

Originally submitted as 3338290c18.
Reverted in c7df6b1223.
2021-05-03 20:25:15 +01:00
Craig Topper 6430430958 [TableGen] Use sign rotated VBR for OPC_EmitInteger.
This allows for a much more efficient encoding for small negative
numbers by storing the sign bit first and negating the rest of
the bits. This was already being used for OPC_CheckInteger.

For every in tree target this affects, the table got smaller.
R600GenDAGISel.inc saw the largest reduction of 7K.

I did have to add a new opcode for StringIntegers used for
register class ids and subregister indices since we don't have the
integer value to encode. The enum name is emitted directly into
the table. Previously assumed the enum would expand to a positive
7-bit number. We might be able to just shift that right by 1 and
assume it is a positive 6 bit number, but that will need more
investigation.
2021-05-02 12:40:44 -07:00
Nathan Chancellor 4397b7095d
Revert "Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands""
This reverts commit 791930d740, as per
https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy.

I observed breakage with the Linux kernel, as reported at
https://reviews.llvm.org/D91722#2724321

Fixes exist at
https://reviews.llvm.org/D101523
https://reviews.llvm.org/D101540

but they have not landed so to unbreak the tree for the weekend, revert
this commit.

Commit b11e4c9907 ("Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an
excessive number of debug operands"") only reverted one follow-up fix,
not the original patch that broke the kernel.

e
2021-04-30 20:23:21 -07:00
Tomas Matheson c7df6b1223 Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0"
This reverts commit 3338290c18.

Broke expensive checks on debian.
2021-04-30 16:53:14 +01:00
Tomas Matheson 3338290c18 [CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0
atomicrmw instructions are expanded by AtomicExpandPass before register allocation
into cmpxchg loops. Register allocation can insert spills between the exclusive loads
and stores, which invalidates the exclusive monitor and can lead to infinite loops.

To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them
after register allocation.

Floating point legalisation:
f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to
f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually
f32 ATOMIC_LOAD_FADD_16(*i16, f32)

Differential Revision: https://reviews.llvm.org/D101164
2021-04-30 16:40:33 +01:00
Craig Topper 0c330afdfa [RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32.
This replaces D98479.

This allows type legalization to form SPLAT_VECTOR_PARTS so we don't
lose the splattedness when the scalar type is split.

I'm handling SPLAT_VECTOR_PARTS for fixed vectors separately so
we can continue using non-VL nodes for scalable vectors.

I limited to RV32+vXi64 because DAGCombiner::visitBUILD_VECTOR likes
to form SPLAT_VECTOR before seeing if it can replace the BUILD_VECTOR
with other operations. Especially interesting is a splat BUILD_VECTOR of
the extract_vector_elt which can become a splat shuffle, but won't if
we form SPLAT_VECTOR first. We either need to reorder visitBUILD_VECTOR
or add visitSPLAT_VECTOR.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D100803
2021-04-29 08:20:09 -07:00
Craig Topper 3067520bf4 [SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT
Previously we used an i32 constant to store the saturation width, but i32 isn't
legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the
type legalizer.

This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes
it opaque to the type legalizer.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101262
2021-04-27 14:38:42 -07:00
Dávid Bolvanský ef2dc7ed9f [Analysis] Attribute alignment should not prevent tail call optimization
Fixes tail folding issue mentioned in D100879.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D101230
2021-04-24 19:57:42 +02:00
Stephen Tozer 791930d740 Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
Previous build failures were caused by an error in bitcode reading and
writing for DIArgList metadata, which has been fixed in e5d844b587.
There were also some unnecessary asserts that were being triggered on
certain builds, which have been removed.

This reverts commit dad5caa59e.
2021-04-23 10:54:01 +01:00
Craig Topper 5185b52988 [RISCV] Fix crash with fptosi.sat/fptoui.sat intrinsics on RV64. Add test cases.
Add PromoteIntOp_FP_TO_XINT_SAT to type legalize the bit width
operand from i32 to i64 for RV64.

Add test cases for the saturating intrinsics for half/float/double
and i32/i64. CodeGen is definitely not optimal. We can probably
make use of the native behavior of fcvt instructions in many cases.

Fixes PR50083
2021-04-22 15:18:15 -07:00
Jun Ma 978eb3f168 [DAGCombiner] Allow operand of step_vector to be negative.
It is proper to relax non-negative limitation of step_vector.
Also this patch adds more combines for step_vector:
(sub X, step_vector(C)) -> (add X,  step_vector(-C))

Differential Revision: https://reviews.llvm.org/D100812
2021-04-22 20:58:03 +08:00
Simon Pilgrim d860bf2d0e [DAG] TargetLowering.cpp - breakup if-else chains where each block returns. NFCI.
Match style guide that requests that if+return blocks are separate.
2021-04-21 11:17:27 +01:00
Fraser Cormack c141bd3cf9 [DAGCombiner] Support all-ones/all-zeros SPLAT_VECTOR in more combines
This patch adds incrementally-better support for SPLAT_VECTOR in a
handful of vector combines by changing a few more
isBuildVectorAllOnes/isBuildVectorAllZeros to the equivalent
isConstantSplatVectorAllOnes/Zeros calls.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D100851
2021-04-21 11:05:37 +01:00
Philip Reames 4824d876f0 Revert "Allow invokable sub-classes of IntrinsicInst"
This reverts commit d87b9b81cc.

Post commit review raised concerns, reverting while discussion happens.
2021-04-20 15:38:38 -07:00
Philip Reames d87b9b81cc Allow invokable sub-classes of IntrinsicInst
It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked.

This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke.

After this lands and has baked for a couple days, planned cleanups:
    Make GCStatepointInst a IntrinsicInst subclass.
    Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor.
    Do the same in SelectionDAG.
    Do the same in FastISEL.

Differential Revision: https://reviews.llvm.org/D99976
2021-04-20 15:03:49 -07:00
Simon Pilgrim 2a419a0b99 [X86][SSE] combineX86ShuffleChain - check if we're blending with zero into already zero elements
Add a SelectionDAG::MaskedElementsAreZero helper that wraps SelectionDAG::MaskedValueIsZero testing for entirely zero vector elements
2021-04-20 17:09:49 +01:00
Simon Pilgrim e156f2515c [DAG] SelectionDAG.cpp - breakup if-else chains where each block returns. NFCI.
Match style guide that requests that if+return blocks are separate.
2021-04-20 12:37:00 +01:00
Jun Ma 1ef5699d1a [DAGCombiner] Support fold zero scalar vector.
This patch changes ISD::isBuildVectorAllZeros to
ISD::isConstantSplatVectorAllZeros which handles zero sclar vector.

TestPlan: check-llvm

Differential Revision: https://reviews.llvm.org/D100813
2021-04-20 16:28:43 +08:00
Fraser Cormack 457da7f298 [SelectionDAG] Relax constraints on STEP_VECTOR step operand
This patch relaxes the requirement that the STEP_VECTOR step constant
must be of a type at least as large as the vector element type. This
does not permit its use on targets which have legal vector element types
larger than the largest legal scalar type, such as i64 vectors on RV32.

As such, the requirement has been loosened so that the step operand must
be any scalar type so long as the constant immediate is non-negative and
the value fits inside the vector element type.

This limits combining optimizations in certain circumstances but in
practice it's unlikely to be a hindrance.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D100660
2021-04-20 08:41:42 +01:00
David Sherwood 83f5fa519e [CodeGen] Improve code generation for clamping of constant indices with scalable vectors
When trying to clamp a constant index into a scalable vector we can
test if the index is less than the minimum number of elements in the
vector. If so, we can simply return the index because we know it is
guaranteed to fit inside the vector.

Differential Revision: https://reviews.llvm.org/D100639
2021-04-19 08:34:17 +01:00
Serge Guelton d6de1e1a71 Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string).
throughout the codebase, this led to inelegant checks ranging from

        if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

to

        if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

Introduce a getValueAsBool that normalize the check, with the following
behavior:

no attributes or attribute set to "false" => return false
attribute set to "true" => return true

Differential Revision: https://reviews.llvm.org/D99299
2021-04-17 08:17:33 +02:00
Simon Pilgrim 37a4621fb6 [DAG] SelectionDAG::isSplatValue - early out if binop is not splat. NFCI.
Just return false if we fail to match splats - the remainder of the code is for (fixed)vector operations - shuffles/insertions etc.
2021-04-16 18:26:33 +01:00
Momchil Velikov f9d932e673 [clang][AArch64] Correctly align HFA arguments when passed on the stack
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example

    struct S {
      __attribute__ ((__aligned__(16))) double v[4];
    };

Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)

Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.

This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.

The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.

For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.

On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.

Patch by Momchil Velikov and Lucas Prates.

Differential Revision: https://reviews.llvm.org/D98794
2021-04-15 22:58:14 +01:00
Jun Ma 7e1422c1e4 [DAGCombiner] Fold step_vector with add/mul/shl
This patch implements some DAG combines for STEP_VECTOR:
add step_vector(C1), step_vector(C2) -> step_vector(C1+C2)
add (add X step_vector(C1)), step_vector(C2) -> add X step_vector(C1+C2)
mul step_vector(C1), C2 -> step_vector(C1*C2)
shl step_vector(C1), C2 -> step_vector(C1<<C2)

TestPlan: check-llvm

Differential Revision: https://reviews.llvm.org/D100088
2021-04-15 18:06:35 +08:00
Tim Northover 6401b78ab3 SDAG: constant fold bf16 -> i16 casts
This direction is particularly useful because i16 constants are much more
likely to be legal than bf16.
2021-04-14 11:27:46 +01:00
Tim Northover 5e3d9fcc3a StackProtector: ensure protection does not interfere with tail call frame.
The IR stack protector pass must insert stack checks before the call instead of
between it and the return.

Similarly, SDAG one should recognize that ADJCALLFRAME instructions could be
part of the terminal sequence of a tail call. In this case because such call
frames cannot be nested in LLVM the stack protection code must skip over the
whole sequence (or risk clobbering argument registers).
2021-04-13 15:14:57 +01:00
Amy Huang dad5caa59e Revert "Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands""
This change causes an assert / segmentation fault in LTO builds.

This reverts commit f2e4f3eff3.
2021-04-12 20:10:17 -07:00
Stephen Tozer f2e4f3eff3 Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
The causes of the previous build errors have been fixed in revisions
aa3e78a59f, and
140757bfaa

This reverts commit f40976bd01.
2021-04-12 16:57:29 +01:00
Stephen Tozer aa3e78a59f Reapply "[DebugInfo] Correctly track SDNode dependencies for list debug values"
Fixed memory leak error by using BumpAllocator for SDDbgValue arrays.

This reverts commit 1b589172bd.
2021-04-12 12:51:29 +01:00
dfukalov d066079728 [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.
Main reason is preparation to transform AliasResult to class that contains
offset for PartialAlias case.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98027
2021-04-09 12:54:22 +03:00
Stephen Tozer 1b589172bd Revert "[DebugInfo] Correctly track SDNode dependencies for list debug values"
Reverted due to failure on the sanitizer-x86_64-linux-fast bot.

This reverts commit e10493eb50.
2021-04-08 17:55:45 +01:00
Stephen Tozer e10493eb50 [DebugInfo] Correctly track SDNode dependencies for list debug values
During SelectionDAG, we must track the SDNodes that each SDDbgValue depends on
to compute its value. These are ultimately derived from the location operands to
the SDDbgValue, but were stored in a separate vector prior to this patch. This
resulted in cases where one of the lists was updated incorrectly, resulting in
crashes during compilation. This patch fixes the issue by directly recomputing
the dependency list from the SDDbgOperands in getDependencies().

Differential Revision: https://reviews.llvm.org/D99423
2021-04-08 17:01:45 +01:00
Craig Topper 67953311e2 [SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR
This allows FoldConstantArithmetic to handle SPLAT_VECTOR in
addition to BUILD_VECTOR. This allows it to support scalable
vectors. I'm also allowing fixed length SPLAT_VECTOR which is
used by some targets, but I'm not familiar enough to write tests
for those targets.

I had to block this function from running on CONCAT_VECTORS to
avoid calling getNode for a CONCAT_VECTORS of 2 scalars.
This can happen because the 2 operand getNode calls this
function for any opcode. Previously we were protected because
CONCAT_VECTORs of BUILD_VECTOR is folded to a larger BUILD_VECTOR
before that call. But it's not always possible to fold a CONCAT_VECTORS
of SPLAT_VECTORs, and we don't even try.

This fixes PR49781 where DAG combine thought constant folding
should be possible, but FoldConstantArithmetic couldn't do it.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D99682
2021-04-07 10:03:33 -07:00
Yevgeny Rouban 3e738afae4 [Statepoint Lowering] Allow other than N byte sized types in deopt bundle
I do not see any bit-width restriction from the point of the
LLVM Lang Ref - Operand Bundles on the types of the deopt bundle
operands. Statepoint Lowering seems to be able to work with any
types.
This patch relaxes the two related assertions and adds a new test
for this change.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100006
2021-04-07 17:48:31 +07:00
Philip Reames fb41cae039 More precisely type code used for gc.relocate assertions [nfc] 2021-04-06 11:27:36 -07:00
Simon Pilgrim ddbb58736a [KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI.
As promised in D98866
2021-04-06 10:11:41 +01:00
Nikita Popov 665065821e [FastISel] Remove kill tracking
This is a followup to D98145: As far as I know, tracking of kill
flags in FastISel is just a compile-time optimization. However,
I'm not actually seeing any compile-time regression when removing
the tracking. This probably used to be more important in the past,
before FastRA was switched to allocate instructions in reverse
order, which means that it discovers kills as a matter of course.

As such, the kill tracking doesn't really seem to serve a purpose
anymore, and just adds additional complexity and potential for
errors. This patch removes it entirely. The primary changes are
dropping the hasTrivialKill() method and removing the kill
arguments from the emitFast methods. The rest is mechanical fixup.

Differential Revision: https://reviews.llvm.org/D98294
2021-04-03 15:50:13 +02:00
Simon Pilgrim 4ea5475a3f [KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI.
Include exhaustive test coverage.
2021-04-02 21:44:33 +01:00
Jun Ma 274ac9d40e [AArch64][SVE] Lowering sve.dot to DOT node
Differential Revision: https://reviews.llvm.org/D99699
2021-04-02 20:05:17 +08:00
Sander de Smalen 0f7bbbc481 Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed.
In order to bring up scalable vector support in LLVM incrementally,
we introduced behaviour to emit a warning, instead of an error, when
asking the wrong question of a scalable vector, like asking for the
fixed number of elements.

This patch puts that behaviour under a flag. The default behaviour is
that the compiler will always error, which means that all LLVM unit
tests and regression tests will now fail when a code-path is taken that
still uses the wrong interface.

The behaviour to demote an error to a warning can be individually enabled
for tools that want to support experimental use of scalable vectors.
This patch enables that behaviour when driving compilation from Clang.
This means that for users who want to try out scalable-vector support,
fixed-width codegen support, or build user-code with scalable vector
intrinsics, Clang will not crash and burn when the compiler encounters
such a case.

This allows us to do away with the following pattern in many of the SVE tests:
  RUN: .... 2>%t
  RUN: cat %t | FileCheck --check-prefix=WARN
  WARN-NOT: warning: ...

The behaviour to emit warnings is only temporary and we expect this flag
to be removed in the future when scalable vector support is more stable.

This patch also has fixes the following tests:
 unittests:
   ScalableVectorMVTsTest.SizeQueries
   SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects
   AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG

 regression tests:
   Transforms/InstCombine/vscale_gep.ll

Reviewed By: paulwalker-arm, ctetreau

Differential Revision: https://reviews.llvm.org/D98856
2021-04-02 10:55:22 +01:00
Simon Pilgrim 77d625f8d8 [DAG] MergeInnerShuffle with BinOps - sometimes accept undef mask elements
If the inner shuffle already contains undef elements, then accept them in the merged shuffle as well.

This helps some X86 HADD/SUB patterns where slow targets were ending up with HADD/SUB because the (un)merged shuffles were stuck either side of the ADD/SUB - meaning we ended up with a total cost much higher than the "2*shuffle+add" that a slow target usually expands a HADD/SUB to.
2021-04-01 14:33:00 +01:00
Simonas Kazlauskas 777a58e05b Support {S,U}REMEqFold before legalization
This allows these optimisations to apply to e.g. `urem i16` directly
before `urem` is promoted to i32 on architectures where i16 operations
are not intrinsically legal (such as on Aarch64). The legalization then
later can happen more directly and generated code gets a chance to avoid
wasting time on computing results in types wider than necessary, in the end.

Seems like mostly an improvement in terms of results at least as far as x86_64 and aarch64 are concerned, with a few regressions here and there. It also helps in preventing regressions in changes like {D87976}.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88785
2021-04-01 01:35:41 +03:00
Craig Topper 9e00b6660d [SelectionDAG] Remove unneeded vector resize from the end of FoldConstantArithmetic. NFC
There's an assert right before that makes sure the size already matches.
Earlier in this function's life, scalars and vectors shared more
code.
2021-03-31 12:33:10 -07:00
Tomas Matheson a9968c0a33 [NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions
Currently needsStackRealignment returns false if canRealignStack returns false.
This means that the behavior of needsStackRealignment does not correspond to
it's name and description; a function might need stack realignment, but if it
is not possible then this function returns false. Furthermore,
needsStackRealignment is not virtual and therefore some backends have made use
of canRealignStack to indicate whether a function needs stack realignment.

This patch attempts to clarify the situation by separating them and introducing
new names:

 - shouldRealignStack - true if there is any reason the stack should be
   realigned

 - canRealignStack - true if we are still able to realign the stack (e.g. we
   can still reserve/have reserved a frame pointer)

 - hasStackRealignment = shouldRealignStack && canRealignStack (not target
   customisable)

Targets can now override shouldRealignStack to indicate that stack realignment
is required.

This change will make it easier in a future change to handle the case where we
need to realign the stack but can't do so (for example when the register
allocator creates an aligned spill after the frame pointer has been
eliminated).

Differential Revision: https://reviews.llvm.org/D98716

Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
2021-03-30 17:31:39 +01:00
Bradley Smith 9745dce8c3 [SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps
This is currently performed in SelectionDAGLegalize, here we make it also
happen in LegalizeVectorOps, allowing a target to lower the SETCC condition
codes first in LegalizeVectorOps and then lower to a custom node afterwards,
without having to duplicate all of the SETCC condition legalization in the
target specific lowering.

As a result of this, fixed length floating point SETCC nodes can now be
properly lowered for SVE.

Differential Revision: https://reviews.llvm.org/D98939
2021-03-29 15:32:25 +01:00
Florian Hahn eb3d9f2eb6
[SelDag] Add isIntOrFPConstant helper function.
This patch adds a new isIntOrFPConstant  helper function to check if a
SDValue is a integer of FP constant. This pattern is used in various
places.

There also are places that incorrectly just check for integer constants,
e.g. D99384, so hopefully this helper will help people avoid that issue.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D99428
2021-03-28 12:48:58 +01:00
David Sherwood 748ae5281d [IR][SVE] Add new llvm.experimental.stepvector intrinsic
This patch adds a new llvm.experimental.stepvector intrinsic,
which takes no arguments and returns a linear integer sequence of
values of the form <0, 1, ...>. It is primarily intended for
scalable vectors, although it will work for fixed width vectors
too. It is intended that later patches will make use of this
new intrinsic when vectorising induction variables, currently only
supported for fixed width. I've added a new CreateStepVector
method to the IRBuilder, which will generate a call to this
intrinsic for scalable vectors and fall back on creating a
ConstantVector for fixed width.

For scalable vectors this intrinsic is lowered to a new ISD node
called STEP_VECTOR, which takes a single constant integer argument
as the step. During lowering this argument is set to a value of 1.
The reason for this additional argument at the codegen level is
because in future patches we will introduce various generic DAG
combines such as

  mul step_vector(1), 2 -> step_vector(2)
  add step_vector(1), step_vector(1) -> step_vector(2)
  shl step_vector(1), 1 -> step_vector(2)
  etc.

that encourage a canonical format for all targets. This hopefully
means all other targets supporting scalable vectors can benefit
from this too.

I've added cost model tests for both fixed width and scalable
vectors:

  llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll
  llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll

as well as codegen lowering tests for fixed width and scalable
vectors:

  llvm/test/CodeGen/AArch64/neon-stepvector.ll
  llvm/test/CodeGen/AArch64/sve-stepvector.ll

See this thread for discussion of the intrinsic:
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html
2021-03-23 10:43:35 +00:00
Craig Topper 2f13e63f9e [LegalizeDAG] Add asserts to verify the types of custom legalized operation matches the original node.
We've messed this up a few times recently on RISCV. Experiments
with these asserts found a couple issues on other targets as well.
They've all been cleaned up now so we can put in these asserts to
catch future issues

I had to waive Glue because ADDC/ADDE/etc legalization replaces
Glue with i32 on at least AArch64. X86 used to do the same before
we switched to ADDCARRY. So I guess that's just how that works.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D98979
2021-03-22 10:28:51 -07:00
Craig Topper 30080b003e [DAGCombiner] Minor compile time improvement to (sext_in_reg (sign_extend_vector_inreg x)) optimization.
Don't bother calling ComputeNumSignBits if N00Bits < ExtVTBits. No
matter what answer we get back this will be true:
(N00Bits - DAG.ComputeNumSignBits(N00, DemandedSrcElts)) < ExtVTBits)

So we might as well save the computation. This makes the code more
consistent with the similar (sext_in_reg (sext x)) handling above.
2021-03-21 11:16:41 -07:00
Simon Pilgrim 64c2641c89 [DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension
As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just sign bit. So, revert to the old code for zero_extend_vector_inreg cases.
2021-03-21 14:01:37 +00:00
Simon Pilgrim 9d2df96407 [DAG] computeKnownBits - add ISD::MULHS/MULHU/SMUL_LOHI/UMUL_LOHI handling
Reuse the existing KnownBits multiplication code to handle the 'extend + multiply + extract high bits' pattern for multiply-high ops.

Noticed while looking at the codegen for D88785 / D98587 - the patch helps division-by-constant expansion code in particular, which suggests that we might have some further KnownBits div/rem cases we could handle - but this was far easier to implement.

Differential Revision: https://reviews.llvm.org/D98857
2021-03-19 16:02:31 +00:00
Simon Pilgrim ffb2887103 [DAG] Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),undef) -> bop(shuffle'(x,y),shuffle'(z,w))
Followup to D96345, handle unary shuffles of binops (as well as binary shuffles) if we can merge the shuffle with inner operand shuffles.

Differential Revision: https://reviews.llvm.org/D98646
2021-03-19 14:14:56 +00:00
Craig Topper 182b831aeb [DAGCombiner][RISCV] Teach visitMGATHER/MSCATTER to remove gather/scatters with all zeros masks that use SPLAT_VECTOR.
Previously only all zeros BUILD_VECTOR was recognized.
2021-03-18 15:34:14 -07:00
Simon Pilgrim 1ba5c550d4 [DAG] Improve folding (sext_in_reg (*_extend_vector_inreg x)) -> (sext_vector_inreg x)
Extend this to support ComputeNumSignBits of the (used) source vector elements so that we can handle more than just the case where we're sext_in_reg from the source element signbit.

Noticed while investigating the poor codegen in D98587.
2021-03-18 15:34:53 +00:00
Simon Pilgrim b1afa187c8 [DAG] SelectionDAG::isSplatValue - add ISD::ABS handling
Add ISD::ABS to the existing unary instructions handling for splat detection

This is similar to D83605, but doesn't appear to need to touch any of the wasm refactoring.

Differential Revision: https://reviews.llvm.org/D98778
2021-03-18 10:28:29 +00:00
Stephen Tozer 3bfddc2593 Reapply "[DebugInfo] Handle multiple variable location operands in IR"
Fixed section of code that iterated through a SmallDenseMap and added
instructions in each iteration, causing non-deterministic code; replaced
SmallDenseMap with MapVector to prevent non-determinism.

This reverts commit 01ac6d1587.
2021-03-17 16:45:25 +00:00
Hans Wennborg 01ac6d1587 Revert "[DebugInfo] Handle multiple variable location operands in IR"
This caused non-deterministic compiler output; see comment on the
code review.

> This patch updates the various IR passes to correctly handle dbg.values with a
> DIArgList location. This patch does not actually allow DIArgLists to be produced
> by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
> Other than that, it should cover every IR pass.
>
> Most of the changes simply extend code that operated on a single debug value to
> operate on the list of debug values in the style of any_of, all_of, for_each,
> etc. Instances of setOperand(0, ...) have been replaced with with
> replaceVariableLocationOp, which takes the value that is being replaced as an
> additional argument. In places where this value isn't readily available, we have
> to track the old value through to the point where it gets replaced.
>
> Differential Revision: https://reviews.llvm.org/D88232

This reverts commit df69c69427.
2021-03-17 13:36:48 +01:00
serge-sans-paille 6e040a19db [NFC] Wisely nest dyn_cast in FunctionLoweringInfo
Take advantage of the inheritance tree to avoid a few comparison.
2021-03-16 10:22:44 +01:00
Fraser Cormack 0035decae7 [CodeGen] Fix issues with scalable-vector INSERT/EXTRACT_SUBVECTORs
This patch addresses a few issues when dealing with scalable-vector
INSERT_SUBVECTOR and EXTRACT_SUBVECTOR nodes.

When legalizing in DAGTypeLegalizer::SplitVecRes_INSERT_SUBVECTOR, we
store the low and high halves to the stack separately. The offset for
the high half was calculated incorrectly.

Additionally, we can optimize this process when we can detect that the
subvector is contained entirely within the low/high split vector type.
While this optimization is valid on scalable vectors, when performing
the 'high' optimization, the subvector must also be a scalable vector.
Note that the 'low' optimization is still conservative: it may be
possible to insert v2i32 into the low half of a split nxv1i32/nxv1i32,
but we can't guarantee it. It is always possible to insert v2i32 into
nxv2i32 or v2i32 into nxv4i32+2 as we know vscale is at least 1.

Lastly, in SelectionDAG::isSplatValue, we early-exit on the extracted subvector value
type being a scalable vector, forgetting that we can also extract a
fixed-length vector from a scalable one.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98495
2021-03-15 17:04:21 +00:00
Craig Topper 5b825433d7 [DAGCombiner] Optimize 1-bit smulo to AND+SETNE.
A 1-bit smulo overflows is both inputs are -1 since the result
should be +1 which can't be represented in a signed 1 bit value.

We can detect this with an AND and a setcc. The multiply result
can also use the same AND.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97634
2021-03-13 09:39:36 -08:00
Craig Topper 2ea7014089 [DAGCombiner] Use isConstantSplatVectorAllZeros/Ones instead of isBuildVectorAllZeros/Ones in visitMSTORE and visitMLOAD.
This allows us to optimize when the mask is a splat_vector in
addition to build_vector.
2021-03-12 12:14:56 -08:00
LemonBoy cfe69c8efd [SelectionDAG] Improve scalarization of irregular vector types
Use a more general strategy when splitting a vector into scalar parts (and vice-versa) to correctly handle vector types whose element size is not a power of 2 (and a multiple of 8).

Reviewed By: atanasyan

Differential Revision: https://reviews.llvm.org/D98273
2021-03-11 19:57:13 +01:00
Stephen Tozer f40976bd01 Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
This reverts commit c0f3dfb9f1.

Reverted due to an error on the clang-x64-windows-msvc buildbot.
2021-03-11 14:48:01 +00:00
gbtozers c0f3dfb9f1 [DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands
This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic
operations with two or more non-const operands; this includes the GetElementPtr
instruction, and most Binary Operator instructions. These salvages produce
DIArgList locations and are only valid for dbg.values, as currently variadic
DIExpressions must use DW_OP_stack_value. This functionality is also only added
for salvageDebugInfoForDbgValues; other functions that directly call
salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be
updated in a later patch.

Differential Revision: https://reviews.llvm.org/D91722
2021-03-11 13:33:49 +00:00
Serguei Katkov 0480927712 [Statepoint Lowering] Handle the case with several gc.result
Recently gc.result has been marked with readnone instead of readonly and
this opens a door for different optimization to duplicate gc.result.
Statepoint lowering is not ready to see several gc.results.
The problem appears when there are gc.results with one located in the same
basic block and another located in other basic block.
In this case we need both export VR and fill local setValue.

Note that this case is not sufficient optimization done before CodeGen.
It is evident that local gc.result dominates all other gc.results and it is handled
by GVN and EarlyCSE.

But anyway, even if IR is not optimal Backend should not crash on a valid IR.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98393
2021-03-11 18:44:44 +07:00
Quentin Colombet 66dab2fa84 [NFC] Fix compiler warnings
Fix warnings caused by -Wrange-loop-analysis.

Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>

Differential Revision: https://reviews.llvm.org/D98298
2021-03-10 11:03:50 -08:00
Craig Topper 9106d04554 [RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32.
On riscv32, i64 isn't a legal scalar type but we would like to
support scalable vectors of i64.

This patch introduces a new node that can represent a splat made
of multiple scalar values. I've used this new node to solve the current
crashes we experience when getConstant is used after type legalization.

For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS
when needed and then handling the SPLAT_VECTOR_PARTS later during
LegalizeOps. I've remove the special case I previously put in for
ABS for D97991 as the default expansion is now able to succesfully
use getConstant.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98004
2021-03-10 09:46:18 -08:00
Jinzheng Tu 481079e284 [NFC] Unify FIME with FIXME in comments
There are 5 occurrences FIME and 15333 FIXME. All of them should be FIXME.

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D98321
2021-03-10 14:00:51 +01:00
Serguei Katkov 2fccd1b00a [Statepoint Lowering] Fix the crash with gc.relocate in a separate block
If it was decided to relocate derived pointer using the spill its value is
not exported in general case.
When gc.relocate is located in an another block than a statepoint we cannot
get SD for derived value but for spill case it is not required at all.
However implementation of gc.relocate lowering unconditionally request SD value
causing the assert triggering.

The CL fixes this by handling spill case earlier than SD is really required.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98324
2021-03-10 19:51:04 +07:00
Nikita Popov 55ae279ba7 [FastISel] Don't trivially kill extractvalues (PR49467)
All extractvalues of the same value at the same index will map to
the same register, so even if one specific extractvalue only has
one use, we should not mark it as a trivial kill, as there may be
more extractvalues later.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49467.

Differential Revision: https://reviews.llvm.org/D98145
2021-03-09 18:46:38 +01:00
gbtozers df69c69427 [DebugInfo] Handle multiple variable location operands in IR
This patch updates the various IR passes to correctly handle dbg.values with a
DIArgList location. This patch does not actually allow DIArgLists to be produced
by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
Other than that, it should cover every IR pass.

Most of the changes simply extend code that operated on a single debug value to
operate on the list of debug values in the style of any_of, all_of, for_each,
etc. Instances of setOperand(0, ...) have been replaced with with
replaceVariableLocationOp, which takes the value that is being replaced as an
additional argument. In places where this value isn't readily available, we have
to track the old value through to the point where it gets replaced.

Differential Revision: https://reviews.llvm.org/D88232
2021-03-09 16:44:38 +00:00
gbtozers 5491a86f59 [DebugInfo] Emit DBG_VALUE_LIST from ISel
This patch completes ISel support for DIArgList dbg.values by allowing
SDDbgValues with multiple location operands to be emitted as DBG_VALUE_LIST
instructions.

The primary change of this patch is refactoring EmitDbgValue by pulling location
operand emission out to the new function AddDbgValueLocationOps, which is used
for both DIArgList and single value dbg.values. Outside of that, the only
behaviour change is that the scheduler has a lambda added, HasUnknownVReg, to
prevent us from attempting to emit a DBG_VALUE_LIST before all of its used VRegs
have become available.

Differential Revision: https://reviews.llvm.org/D88592
2021-03-09 12:17:39 +00:00
Cullen Rhodes 2750f3ed31 [IR] Introduce llvm.experimental.vector.splice intrinsic
This patch introduces a new intrinsic @llvm.experimental.vector.splice
that constructs a vector of the same type as the two input vectors,
based on a immediate where the sign of the immediate distinguishes two
variants. A positive immediate specifies an index into the first vector
and a negative immediate specifies the number of trailing elements to
extract from the first vector.

For example:

  @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E>  ; index
  @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing element count

These intrinsics support both fixed and scalable vectors, where the
former is lowered to a shufflevector to maintain existing behaviour,
although while marked as experimental the recommended way to express
this operation for fixed-width vectors is to use shufflevector. For
scalable vectors where it is not possible to express a shufflevector
mask for this operation, a new ISD node has been implemented.

This is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].

Patch by Paul Walker and Cullen Rhodes.

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D94708
2021-03-09 10:44:22 +00:00
gbtozers 93b170ea24 [DebugInfo] Handle dbg.values with multiple variable location operands in ISel
This patch adds partial support in Instruction Selection for dbg.values that use
a DIArgList. This patch does not add support for producing DBG_VALUE_LIST, but
adds the logic for processing DIArgLists within the ISel pass. This change is
largely focused on handleDebugValue and some of the functions that it calls.
Outside of this, salvageDebugInfo and transferDbgValues have been modified to
replace individual operands instead of the entire value; dangling debug info for
variadic debug values is not currently supported (but may be added later).

Differential Revision: https://reviews.llvm.org/D88589
2021-03-09 09:48:03 +00:00
Jessica Paquette f7d73a6b9e [SelectionDAG] Don't scalarize vector fpround sources that don't need it.
Similar to the workaround code in ScalarizeVecRes_UnaryOp, ScalarizeVecRes_SETCC
, ScalarizeVecRes_VSELECT, etc.

If we have a case like this:

```
define <1 x half> @func(<1 x float> %x) {
  %tmp = fptrunc <1 x float> %x to <1 x half>
  ret <1 x half> %tmp
}
```

On AArch64, the <1 x float> is legal. So, this will crash if we call
GetScalarizedVector on it.

Differential Revision: https://reviews.llvm.org/D98208
2021-03-08 14:37:33 -08:00
Stephen Tozer c0450af559 Fix: [DebugInfo] Support representation of multiple location operands in SDDbgValue
Removes a "default" label from a fully covered switch, causing errors on
-Wcovered-switch-default builds.
2021-03-08 19:14:12 +00:00
gbtozers 9525af7b91 [DebugInfo] Support representation of multiple location operands in SDDbgValue
This patch modifies the class that represents debug values during ISel,
SDDbgValue, to support multiple location operands (to represent a dbg.value that
uses a DIArgList). Part of this class's functionality has been split off into a
new class, SDDbgOperand.

The new class SDDbgOperand represents a single value, corresponding to an SSA
value or MachineOperand in the IR and MIR respectively. Members of SDDbgValue
that were previously related to that specific value (as opposed to the
variable or DIExpression), such as the Kind enum, have been moved to
SDDbgOperand. SDDbgValue now contains an array of SDDbgOperand instead, allowing
it to hold more than one of these values.

All changes outside SDDbgValue are simply updates to use the new interface.

Differential Revision: https://reviews.llvm.org/D88585
2021-03-08 18:45:17 +00:00
gbtozers e5d958c456 [DebugInfo] Support DIArgList in DbgVariableIntrinsic
This patch updates DbgVariableIntrinsics to support use of a DIArgList for the
location operand, resulting in a significant change to its interface. This patch
does not update all IR passes to support multiple location operands in a
dbg.value; the only change is to update the DbgVariableIntrinsic interface and
its uses. All code outside of the intrinsic classes assumes that an intrinsic
will always have exactly one location operand; they will still support
DIArgLists, but only if they contain exactly one Value.

Among other changes, the setOperand and setArgOperand functions in
DbgVariableIntrinsic have been made private. This is to prevent code from
setting the operands of these intrinsics directly, which could easily result in
incorrect/invalid operands being set. This does not prevent these functions from
being called on a debug intrinsic at all, as they can still be called on any
CallInst pointer; it is assumed that any code directly setting the operands on a
generic call instruction is doing so safely. The intention for making these
functions private is to prevent DIArgLists from being overwritten by code that's
naively trying to replace one of the Values it points to, and also to fail fast
if a DbgVariableIntrinsic is updated to use a DIArgList without a valid
corresponding DIExpression.
2021-03-08 14:36:13 +00:00
Craig Topper 0eb405c3b8 [SelectionDAG] Add computeKnownBits support for ISD::USUBSAT.
The result of ISD::USUBSAT will never be larger than the LHS. We
can use this to put a bound on the number of leading zeros.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D98133
2021-03-07 09:48:42 -08:00
LemonBoy 2ec43e4167 [LegalizeDAG] Implement promotion rules for SELECT_CC
Implement the promotion rule for SELECT_CC nodes by upcasting all the parameters and downcasting the result.
The AArch64 target makes use of this rule and, since it was not implemented, in some cases the instruction selector would hit an assertion upon encountering the illegal node.

This patch requires D97840, the included test cases hit both problems.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97859
2021-03-05 18:22:55 +01:00
Craig Topper ad532be012 [SelectionDAG] Assert that operands to SelectionDAG::getNode are not DELETED_NODE to catch issues like PR49393 earlier.
I'm not sure this would catch all such issues, but it would catch some.

The problem for PR49393 was that we were holding a reference to a node that
wasn't connect edto the DAG across a function that could delete unused nodes. In
this particular case we managed to try to use the deleted node while it was in
the deleted state before its memory got recycled.

It could also happen that we delete the node, something allocates a new node
which recycles the memory. Then  we try to use the reference we were holding and
it is now a completely different node with different valid opcode. This patch
would not catch that.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97969
2021-03-04 23:05:32 -08:00