Commit Graph

32796 Commits

Author SHA1 Message Date
Andre Vieira 49223e0a2d [TypePromotion] Don't promote PHI + ZExt if wider than RegisterBitWidth
Differential Revision: https://reviews.llvm.org/D131966
2022-08-17 09:54:15 +01:00
Eli Friedman cfd2c5ce58 Untangle the mess which is MachineBasicBlock::hasAddressTaken().
There are two different senses in which a block can be "address-taken".
There can be a BlockAddress involved, which means we need to map the
IR-level value to some specific block of machine code.  Or there can be
constructs inside a function which involve using the address of a basic
block to implement certain kinds of control flow.

Mixing these together causes a problem: if target-specific passes are
marking random blocks "address-taken", if we have a BlockAddress, we
can't actually tell which MachineBasicBlock corresponds to the
BlockAddress.

So split this into two separate bits: one for BlockAddress, and one for
the machine-specific bits.

Discovered while trying to sort out related stuff on D102817.

Differential Revision: https://reviews.llvm.org/D124697
2022-08-16 16:15:44 -07:00
Nicolas Miller ccfabfbb1f Fix subrange liveness checking at rematerialization
This patch fixes an issue where an instruction reading a whole register would be moved during register allocation into a spot where one of the subregisters was dead.

The code to check whether an instruction can be rematerialized at a given point or not was already checking for subranges to ensure that subregisters are live, but only when the instruction being moved was using a subregister, this patch changes that so the subranges are checked even when the moved instruction uses the full register.

This patch also adds a case to the original test for the subrange checking that trigger the issue described above.

The original subrange checking code was introduced in this revision: https://reviews.llvm.org/D115278

And I've encountered this issue on AMDGPUs while working with DPC++: https://github.com/intel/llvm/issues/6209

Essentially the greedy register allocator attempts to move the following instruction:

```
%3961:vreg_64 = V_LSHLREV_B64_e64 3, %3078:vreg_64, implicit $exec
```

From `@3440` into the body of a loop `@16312`, but `%3078` has the following live ranges:

```
%3078 [2224r,2240r:0)[2240r,3488B:1)[16192B,38336B:1) 0@2224r 1@2240r  L0000000000000003 [2224r,3440r:0) 0@2224r  L000000000000000C [2240r,3488B:0)[16192B,38336B:0) 0@2240r
```

So `@16312e` `%3078.sub1` is alive but `%3078.sub0` is dead, so this instruction being moved there leads to invalid memory accesses as `3078.sub0` ends up being trashed and the result of this instruction is used as part of an address calculation for a load.

On the original ticket this issue showed up on gfx906 and gfx90a but not on gfx908, this turned out to be because on gfx908 instead of moving the shift instruction into the loop, its value is spilled into an ACC register, gfx906 doesn't have ACC registers and for gfx90a ACC registers are used like regular vector registers and so aren't used for spilling.

With this patch the original application from the DPC++ ticket works properly on gfx906, and the result of the shift instruction is correctly spilled instead of moving the instruction in the loop.

Original Author: npmiller

Reviewed by: rampitec

Submitted by: rampitec

Differential Revision: https://reviews.llvm.org/D131884
2022-08-16 10:50:09 -07:00
Arthur Eubanks 9181ce623f [Windows] Put init_seg(compiler/lib) in llvm.global_ctors
Currently we treat initializers with init_seg(compiler/lib) as similar
to any other init_seg, they simply have a global variable in the proper
section (".CRT$XCC" for compiler/".CRT$XCL" for lib) and are added to
llvm.used. However, this doesn't match with how LLVM sees normal (or
init_seg(user)) initializers via llvm.global_ctors. This
causes issues like incorrect init_seg(compiler) vs init_seg(user)
ordering due to GlobalOpt evaluating constructors, and the
ability to remove init_seg(compiler/lib) initializers at all.

Currently we use 'A' for priorities less than 200. Use 200 for
init_seg(compiler) (".CRT$XCC") and 400 for init_seg(lib) (".CRT$XCL"),
which do not append the priority to the section name. Priorities
between 200 and 400 use ".CRT$XCC${Priority}". This allows for
some wiggle room for people/future extensions that want to add
initializers between compiler and lib.

Fixes #56922

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D131910
2022-08-16 08:16:18 -07:00
Steve Merritt ec60fca752 [CodeView] Use non-qualified names for static local variables
Static variables declared within a routine or lexical block should
be emitted with a non-qualified name.  This allows the variables to
be visible to the Visual Studio watch window.

Differential Revision: https://reviews.llvm.org/D131400
2022-08-16 10:33:43 -04:00
Andre Vieira c6b5a13b7a [TypePromotion] Only search for PHI + ZExt promotion of Integers
Differential Revision: https://reviews.llvm.org/D131948
2022-08-16 10:15:32 +01:00
wanglian fbc4c26e9a [SelectionDAG][NFC] Fix return type when used isConstantIntBuildVectorOrConstantInt
and isConstantFPBuildVectorOrConstantFP

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D131870
2022-08-16 10:07:24 +08:00
Rahman Lavaee df2213f345 [EHStreamer] Omit @LPStart when function has no landing pads
When no landing pads exist for a function, `@LPStart` is undefined and must be omitted.

EH table is generally not emitted for functions without landing pads, except when the personality function is uknown (`!isNoOpWithoutInvoke(classifyEHPersonality(Per))`). In that case, we must omit `@LPStart` even when machine function splitting is enabled.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D131626
2022-08-15 17:09:46 -07:00
Aiden Grossman 24cdf97d63 [mlgo] Add ability to create feature-gated development features in regalloc advisor
Currently there is no way to add in development features to the ML
regalloc evict advisor which is useful to have when working on feature
engineering/improving the current model. This patch adds in the ability
to add in development features to the ML regalloc evict advisor which
are gated by a runtime flag and not added in at all if not compiled in
LLVM development mode. This sets the stage for future work where we are
planning on upstreaming some of the newer features that we are currently
experimenting with.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D131209
2022-08-15 16:01:37 -07:00
David Green dfc95bab07 [DAG] Ensure more Legal BUILD_VECTOR elements types in shuffle->And combine
This is a followup to D131350, which caused another problem for i64
types being split into i32 on i32 targets. This patch tries to make sure
that either Illegal types are OK, or that the element types of a
buildvector are legal and bigger than or equal to the size of the
original elements.

Differential Revision: https://reviews.llvm.org/D131883
2022-08-15 14:41:45 +01:00
Luo, Yuanke 853bb192c4 Revert "(Reland) [fastalloc] Support allocating specific register class in fastalloc"
This reverts commit 30f9e6ebd3.
2022-08-15 20:33:15 +08:00
Ayke van Laethem de48717fcf
[AVR] Support unaligned store
This patch really just extends D39946 towards stores as well as loads.
While the patch is in SelectionDAGBuilder, it only applies to AVR (the
only target that supports unaligned atomic operations).

Differential Revision: https://reviews.llvm.org/D128483
2022-08-15 14:29:37 +02:00
Simon Pilgrim 3a73133217 [DAG] canCreateUndefOrPoison - add freeze(sign_extend_inreg(x,vt)) -> sign_extend_inreg(freeze(x),vt) support
Guaranteed not to create undef/poison
2022-08-15 12:18:59 +01:00
Peter Waller 6e85db7293 [DAGCombine] Combine signext_inreg of extract-extend
The outer signext_inreg is redundant in the following:

  Fold (signext_inreg (extract_subvector (zext|anyext|sext iN_value to _) _) from iN)
       -> (extract_subvector (signext iN_value to iM))

Tests are precommitted and clone those by analogy from the AND case in
the same file. Add a negative test to check extension width is handled
correctly.

This patch supersedes D130700.

Differential Revision: https://reviews.llvm.org/D131503
2022-08-15 10:58:07 +00:00
Simon Pilgrim 7e294e676e [DAG] canCreateUndefOrPoison - add freeze(assertsext/zext(x,bt)) -> assertsext/zext(freeze(x),vt) support
These are guaranteed not to create undef/poison (although they may pass through) - the associated ISD::VALUETYPE node is also guaranteed never to generate poison
2022-08-15 11:13:43 +01:00
Kazu Hirata f5a68feab3 Use llvm::none_of (NFC) 2022-08-14 16:25:39 -07:00
Simon Pilgrim e2d13fd096 [DAG] canCreateUndefOrPoison - add freeze(shl(x,y)) -> shl(freeze(x),y) support
These are guaranteed not to create undef/poison if the shift amount is known to be in range
2022-08-14 14:38:10 +01:00
Simon Pilgrim a621d38bcb [DAG] canCreateUndefOrPoison - add freeze(and/or/xor(x,y)) -> and/or/xor(freeze(x),y) support
These are guaranteed not to create undef/poison
2022-08-14 13:14:53 +01:00
Simon Pilgrim 60534b8879 [DAG] canCreateUndefOrPoison - add freeze(add/sub/mul(x,y)) -> add/sub/mul(freeze(x),y,z) support
These are guaranteed not to create undef/poison as long as there are no poison generating flags
2022-08-13 20:58:00 +01:00
Luo, Yuanke 30f9e6ebd3 (Reland) [fastalloc] Support allocating specific register class in fastalloc
Reland commit 719658d078

The base RA support infrastructure that only allow a specific register
class be allocated in RA pss. Since greedy RA, basic RA derived from
base RA, they all allow allocating specific register class. Fast RA
doesn't support allocating register for specific register class. This
patch is to enable ShouldAllocateClass in fast RA, so that it can
support allocating register for specific register class.

Differential Revision: https://reviews.llvm.org/D131825
2022-08-13 13:57:34 +08:00
Joe Loser b12aa497cd
[DAGCombine] Replace std::monostate equivalent in DAGCombiner.cpp
Remove the `UnitT` type and operators in favor of using `std::monostate`
directly.

Differential Revision: https://reviews.llvm.org/D131778
2022-08-12 21:42:09 -06:00
Simon Pilgrim 4de35f4bbf [DAG] Add TODO to remove creation of INSERT_SUBVECTOR nodes from SimplifyMultipleUseDemandedBits
SimplifyMultipleUseDemandedBits shouldn't be creating general nodes like this - although we allow bitcasts, even general constant folding is avoided.

Removing it causes a number of regressions that need addressing first, but I've added a TODO for now.
2022-08-12 10:45:30 +01:00
Filipp Zhinkin 1626ee6a95 [DAGCombine] Hoist shifts out of a logic operations tree.
Hoist and combine shift operations from logic operations tree:
logic (logic (SH x0, s), y), (logic (SH x1, s), z)  --> logic (SH (logic x0, x1), s), (logic y, z)

The transformation improves code generated for some cases related to the issue https://github.com/llvm/llvm-project/issues/49541.

Correctness:
https://alive2.llvm.org/ce/z/pVqVgY
https://alive2.llvm.org/ce/z/YVvT-q
https://alive2.llvm.org/ce/z/W5zTBq
https://alive2.llvm.org/ce/z/YfJsvJ
https://alive2.llvm.org/ce/z/3YSyDM
https://alive2.llvm.org/ce/z/Bs2kzk
https://alive2.llvm.org/ce/z/EoQpzU
https://alive2.llvm.org/ce/z/Jnc_5H
https://alive2.llvm.org/ce/z/_LP6k_
https://alive2.llvm.org/ce/z/KvZNC9

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D131189
2022-08-12 12:42:16 +03:00
wanglian 061f7ec9fa [LegalizeTypes][NFC] Use getConstantOperandVal instead of cast constant getvalue
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D131642
2022-08-12 14:35:10 +08:00
wanglian 1303057888 [LegalizeTypes][NFC] Use dyn_cast instead of isa and cast
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D131544
2022-08-12 14:18:49 +08:00
Chen Zheng 8d19cfb72e [PowerPC] omit location attribute for TLS variable on AIX
TLS debug on AIX is not ready for now.
The location generated in no-integrated-as mode is wrong and
in integrated-as mode causes AIX linker error.

Reviewed By: Esme

Differential Revision: https://reviews.llvm.org/D130245
2022-08-12 00:54:48 -04:00
wanglian 3b71f1d5ab [LegalizeTypes][NFC] Use getConstantOperandAPInt instead of cast constant getAPInt
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D131653
2022-08-12 10:21:54 +08:00
Peter Waller 898699831b [DAGCombine] Check zext legality in zext-extract-extend combine
Discussed in D131503.

Fix to D130782.
2022-08-11 14:30:42 +00:00
Andre Vieira 1640679187 [TypePromotion] Search from ZExt + PHI
Expand TypePromotion pass to try to promote PHI-nodes in loops that are the
operand of a ZExt, using the ZExt's result type to determine the Promote Width.

Differential Revision: https://reviews.llvm.org/D111237
2022-08-11 09:50:10 +01:00
Andre Vieira 05fc5037cd [TypePromotion] Hoist out Promote Width calculation
Hoist out promote width calculation to simplify runOnFunction.

Differential Revision: https://reviews.llvm.org/D131489
2022-08-11 09:50:10 +01:00
Andre Vieira e524d61f35 [TypePromotion] Don't delete Insns when iterating
Differential Revision: https://reviews.llvm.org/D131488
2022-08-11 09:50:10 +01:00
Andre Vieira 57de4e059d [TypePromotion] Don't insert Truncate for a no-op ZExt
Differential Revision: https://reviews.llvm.org/D131487
2022-08-11 09:50:10 +01:00
aqjune 02e56e2533 [CodeGen] Generate efficient assembly for freeze(poison) version of `mm*_cast*` intel intrinsics
This patch makes the variants of `mm*_cast*` intel intrinsics that use `shufflevector(freeze(poison), ..)` emit efficient assembly.
(These intrinsics are planned to use `shufflevector(freeze(poison), ..)` after shufflevector's semantics update; relevant thread: D103874)

To do so, this patch

1. Updates `LowerAVXCONCAT_VECTORS` in X86ISelLowering.cpp to recognize `FREEZE(UNDEF)` operand of `CONCAT_VECTOR` in addition to `UNDEF`
2. Updates X86InstrVecCompiler.td to recognize `insert_subvector` of `FREEZE(UNDEF)` vector as its first operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D130339
2022-08-11 13:36:21 +09:00
Simon Pilgrim 8623da5f74 [DAG] visitFREEZE - generalize freeze(op()) -> op(freeze()) to any number of operands
canCreateUndefOrPoison currently only handles unary ops, but we intend to change that soon - this more closely matches the pushFreezeToPreventPoisonFromPropagating behaviour where the freeze is pushed up to a single operand value, as long as all others are guaranteed not to be poison/undef.

However, pushFreezeToPreventPoisonFromPropagating would freeze all uses of the value - whilst this variant requires the frozen value to be only used in the op - we can look at generalize multiple uses later if the need arises.
2022-08-10 13:12:46 +01:00
Simon Pilgrim bbc27d0148 [DAG] canCreateUndefOrPoison - add freeze(truncate(x)) -> truncate(freeze(x)) support 2022-08-10 11:27:22 +01:00
David Truby b1b9c39629 [AArch64][SVE] Use SVE for VLS fcopysign for wide vectors
Currently fcopysign for VLS vectors lowers through NEON even when the
vector width is wider than a NEON vector, causing bad codegen as the
vectors are split. This patch causes SVE to be used for these vectors
instead, giving much better codegen on wide VLS vectors.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D128642
2022-08-10 10:17:19 +00:00
Simon Pilgrim df3ea7365e [DAG] Use DAG.getFreeze() to create freeze node. NFC. 2022-08-10 10:26:26 +01:00
Adrian Prantl 68f97d2f78 LiveDebugValues: Fix another crash related to unreachable blocks
This is a follow-up patch to D130999. In the test, the MIR contains an
unreachable MBB but the code attempts to look it up in MLocs. This
patch fixes this issue by checking for the default-constructed value.

rdar://97226240

Differential Revision: https://reviews.llvm.org/D131453
2022-08-09 10:34:57 -07:00
Simon Pilgrim ed162d455a [DAG] Avoid hasOneUse() calls if the cheaper !AssumeSingleUse test has already failed. NFC.
Very minor optimization, but every little helps..
2022-08-09 16:42:19 +01:00
Simon Pilgrim d79e7dc939 [DAG] SimplifyDemandedVectorElts - and/mul(x,y) - if a demanded element of y is known zero then we don't need to demand it in x
This fixes most of the remaining regressions from the fixes in rG293899c64b75
2022-08-09 16:24:08 +01:00
Simon Pilgrim 2724143551 [DAG] canCreateUndefOrPoison - add freeze(ctpop(x)) -> ctpop(freeze(x)) and freeze(parity(x)) -> parity(freeze(x)) support
Both are guaranteed not to create undef/poison
2022-08-09 10:10:29 +01:00
Luo, Yuanke aaf6c7b05c [globalisel] Select register bank for DBG_VALUE
The register operand of DBG_VALUE is not selected to a proper register
bank in both AArch64 and X86. This would cause getRegClass crash after
global ISel. After discussion, we think the MIR should assume all
vritual register should be set proper register class after global ISel,
so this patch is to fix the gap of DBG_VALUE for AArch64 and X86.

Differential Revision: https://reviews.llvm.org/D129037
2022-08-09 13:11:51 +08:00
Yuta Mukai 5357dd2f43 [MachinePipeliner] Fix Phi generation failure for large stages
The previous code overwrites VRMap for prologue stages during Phi
generation if a register spans many stages.
As a result, the wrong register is used as the one coming from
the prologue in Phis at later stages. (A process exists to correct
this, but it does not work in all cases.)
In addition, VRMap for prologue must be preserved until addBranches().

This patch fixes them by separating the map for Phis into a different
variable (VRMapPhi).

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D127840
2022-08-09 13:14:26 +09:00
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Simon Pilgrim 6f2bee667a [DAG] canCreateUndefOrPoison - add freeze(bswap(x)) -> bswap(freeze(x)) and freeze(bitreverse(x)) -> bitreverse(freeze(x)) support
Both are guaranteed not to create undef/poison
2022-08-08 17:27:17 +01:00
Simon Pilgrim e4b2c52420 [DAG] canCreateUndefOrPoison - add freeze(sext(x)) -> sext(freeze(x)) and freeze(zext(x)) -> zext(freeze(x)) support
Both are guaranteed not to create undef/poison
2022-08-08 16:43:40 +01:00
Krzysztof Parzyszek 0f5385b70e Recommit [RDF] Remove explicit template arguments from Print
The build breakages should be addressed by d4abdd2e3d:
  [CMake] Check CMAKE_CXX_STANDARD and error if it's to old

Thanks to Tobias and Roy for addressing these issues.
2022-08-08 07:28:45 -07:00
Simon Pilgrim 9641a201a5 [DAG] Add initial SelectionDAG::canCreateUndefOrPoison support
This patch adds basic support for a DAG variant of the canCreateUndefOrPoison call and updates DAGCombiner::visitFREEZE to use it, further Opcodes (including target specific Opcodes) can be handled when we have test coverage.

So far, I've left visitFREEZE to just use this for unary nodes (which currently means the existing BITCAST/FREEZE cases) - later patches will add other unary opcodes (with test coverage) and we can also refactor visitFREEZE to support a general number of operands like we do in InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating.

I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place. Similarly we will be able to handle poison generating SDNodeFlags as and when it becomes an issue.

Part of the work for D106675 / PR50468

Differential Revision: https://reviews.llvm.org/D130646
2022-08-08 15:16:06 +01:00
Simon Pilgrim b334709467 Remove superfluous ; outside of a function 2022-08-08 12:14:03 +01:00
Shubham Narlawar ab4fc87a9d [DAG] Emit table lookup from TargetLowering::expandCTTZ()
This patch emits table lookup in expandCTTZ.

Context -
https://reviews.llvm.org/D113291 transforms set of IR instructions to
cttz intrinsic but there are some targets which does not support CTTZ or
CTLZ. Hence, I generate a table lookup in TargetLowering::expandCTTZ().

Differential Revision: https://reviews.llvm.org/D128911
2022-08-08 12:08:05 +01:00