Commit Graph

32682 Commits

Author SHA1 Message Date
Fangrui Song adf4142f76 [MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Eli Friedman 0ff51d5dde Fix interaction of CFI instructions with MachineOutliner.
1. When checking if a candidate contains a CFI instruction, actually
iterate over all of the instructions, instead of stopping halfway
through.
2. Make sure copied CFI directives refer to the correct instruction.

Fixes https://github.com/llvm/llvm-project/issues/55842

Differential Revision: https://reviews.llvm.org/D126930
2022-06-10 13:37:49 -07:00
Guillaume Chatelet 95083fa3b8 [NFC] Remove deadcode 2022-06-10 15:13:42 +00:00
Simon Pilgrim 91adbc3208 [DAG] SimplifyDemandedVectorElts - adding SimplifyMultipleUseDemandedVectorElts handling to ISD::CONCAT_VECTORS
Attempt to look through multiple use operands of ISD::CONCAT_VECTORS nodes

Another minor improvement for D127115
2022-06-10 16:06:43 +01:00
Guillaume Chatelet 38637ee477 [clang] Add support for __builtin_memset_inline
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.

The idea is to get support from the compiler to easily write efficient memory function implementations.

This patch could be split in two:
 - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
 - and another one for the Clang part providing the instrinsic as a builtin.

Differential Revision: https://reviews.llvm.org/D126903
2022-06-10 13:13:59 +00:00
Nikita Popov c10921fa1a [CGP] Also freeze ctlz/cttz operand when despeculating
D125887 changed the ctlz/cttz despeculation transform to insert
a freeze for the introduced branch on zero. While this does fix
the "branch on poison" issue, we may still get in trouble if we
pick a different value for the branch and for the ctz argument
(i.e. non-zero for the branch, but zero for the ctz). To avoid
this, we should use the same frozen value in both positions.

This does cause a regression in RISCV codegen by introducing an
additional sext. The DAG looks like this:

    t0: ch = EntryToken
        t2: i64,ch = CopyFromReg t0, Register:i64 %3
      t4: i64 = AssertSext t2, ValueType:ch:i32
    t23: i64 = freeze t4
          t9: ch = CopyToReg t0, Register:i64 %0, t23
          t16: ch = CopyToReg t0, Register:i64 %4, Constant:i64<32>
        t18: ch = TokenFactor t9, t16
            t25: i64 = sign_extend_inreg t23, ValueType:ch:i32
          t24: i64 = setcc t25, Constant:i64<0>, seteq:ch
        t28: i64 = and t24, Constant:i64<1>
      t19: ch = brcond t18, t28, BasicBlock:ch<cond.end 0x8311f68>
    t21: ch = br t19, BasicBlock:ch<cond.false 0x8311e80>

I don't see a really obvious way to improve this, as we can't push
the freeze past the AssertSext (which may produce poison).

Differential Revision: https://reviews.llvm.org/D126638
2022-06-10 09:46:10 +02:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Simon Pilgrim 7dbfcfa735 [DAG] combineInsertEltToShuffle - if EXTRACT_VECTOR_ELT fails to match an existing shuffle op, try to replace an undef op if there is one.
This should fix a number of shuffle regressions in D127115 where the re-ordered combines mean we fail to fold a EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT sequence into a BUILD_VECTOR if we extract from more than one vector source.
2022-06-09 14:56:14 +01:00
Guillaume Chatelet dc3367970e [SelectionDAG] Handle bzero/memset libcalls globally instead of per target
Differential Revision: https://reviews.llvm.org/D127279
2022-06-09 08:34:55 +00:00
Craig Topper 4bcfc41846 [SelectionDAG] Teach computeKnownBits that a nsw self multiply produce a positive value.
This matches what we do in IR. For the RISC-V test case, this allows
us to use -8 for the AND mask instead of materializing a constant in a register.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D127335
2022-06-08 14:55:58 -07:00
Kai Nacke d897a14c2e [SystemZ] Fix check for zero size when lowering memcmp.
During lowering of memcmp/bcmp, the check for a size of 0 is done
in 2 different ways. In rare cases this can lead to a crash in
SystemZSelectionDAGInfo::EmitTargetCodeForMemcmp(). The root cause
is that SelectionDAGBuilder::visitMemCmpBCmpCall() checks for a
constant int value which is not yet evaluated. When the value is
turned into a SDValue, then the evaluation is done and results in
a ConstantSDNode. But EmitTargetCodeForMemcmp() expects the special
case of 0 length to be handled, which results in an assertion.

The fix is to turn the value into a SDValue, so that both functions
use the same check.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D126900
2022-06-08 14:52:13 -04:00
Simon Pilgrim b84c10d4bc [DAG] visitVSELECT - don't wait for truncation of sub before attempting to match with getTruncatedUSUBSAT
Fixes some X86 PSUBUS regressions encountered in D127115 where the truncate was being replaced with a PACKSS/PACKUS before the fold got called again
2022-06-08 16:16:35 +01:00
Joseph Huber 9e0dbd2a2a [Target] Remove `startswith` for adding `SHF_EXCLUDE` to offload section
Summary:
We use the special section name `.llvm.offloading` to store device
imagees in the host object file. We want these to be stripped by the
linker as they are not used after linking so we use the `SHF_EXCLUDE`
flag to instruct the linker to drop them. We used to do this for all
sections that started with `.llvm.offloading` when we encoded metadata
in the section name itself. Now we embed a special binary containing the
metadata, we should only add the flag on this name specifically.
2022-06-08 09:56:51 -04:00
Paul Walker d88354213c [SelectionDAG] Remove invalid TypeSize conversion from PromoteIntRes_BITCAST.
Extend the TypeWidenVector case of PromoteIntRes_BITCAST to work
with TypeSize directly rather than silently casting to unsigned.

To accomplish this I've extended TypeSize with an interface that
essentially allows TypeSize division when both operands have the
same number of dimensions.

There still exists combinations of scalable vector bitcasts that
cause compiler crashes. I call these out by adding "is missing"
entries to sve-bitcast.

Depends on D126957.
Fixes: #55114

Differential Revision: https://reviews.llvm.org/D127126
2022-06-08 10:30:07 +01:00
Paul Walker a1121c31d8 [SVE] Fix incorrect code generation for bitcasts of unpacked vector types.
Bitcasting between unpacked scalable vector types of different
element counts is not a NOP because the live elements are laid out
differently.
               01234567
e.g. nxv2i32 = XX??XX??
     nxv4f16 = X?X?X?X?

Differential Revision: https://reviews.llvm.org/D126957
2022-06-08 10:30:07 +01:00
Chuanqi Xu 0e10f12844 [NFC] Remove commented cerr debugging loggings
There are some unused cerr debugging loggings in the codes. It is weird
to remain such commented debug helpers in the product.
2022-06-08 15:58:06 +08:00
Kito Cheng 7207373e1e Revert "[SplitKit] Handle early clobber + tied to def correctly"
Revert due to failed on LLVM_ENABLE_EXPENSIVE_CHECKS.

This reverts commit e14d04909d.
2022-06-08 13:05:35 +08:00
Kito Cheng e14d04909d [SplitKit] Handle early clobber + tied to def correctly
Spliter will try to extend a live range into `r` slot for a use operand,
that's works on most situaion, however that not work correctly when the operand
has tied to def, and the def operand is early clobber.

Give an example to demo what's wrong:
  0  %0 = ...
 16  early-clobber %0 = Op %0 (tied-def 0), ...
 32  ... = Op %0

Before extend:
 %0 = [0r, 0d) [16e, 32d)

The point we want to extend is 0d to 16e not 16r in this case, but if
we use 16r here we will extend nothing because that already contained
in [16e, 32d).

This patch add check for detect such case and adjust the extend point.

Detailed explanation for testcase: https://reviews.llvm.org/D126047

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D126048
2022-06-08 11:33:05 +08:00
David Penry 907aedbb3d [NFC] Fix spelling/newlines in comments/debug messages
Just a few spelling mistakes and missing newlines

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D127162
2022-06-07 09:38:53 -07:00
Simon Pilgrim a083f3caa1 [DAG] combineShuffleOfSplatVal - fold shuffle(splat,undef) -> splat, iff the splat contains no UNDEF elements
As noticed on D127115 - we were missing this fold, instead just having the shuffle(shuffle(x,undef,splatmask),undef) fold. We should be able to merge these into one using SelectionDAG::isSplatValue, but we'll need to match the shuffle's undef handling first.

This also exposed an issue in SelectionDAG::isSplatValue which was incorrectly propagating the undef mask across a bitcast (it was trying to just bail with a APInt::isSubsetOf if it found any undefs but that was actually the wrong way around so didn't fire for partial undef cases).
2022-06-07 16:42:24 +01:00
Matt Arsenault 56303223ac llvm-reduce: Don't assert on functions which don't track liveness
Use the query that doesn't assert if TracksLiveness isn't set, which
needs to always be available. We also need to start printing liveins
regardless of TracksLiveness.
2022-06-07 10:00:25 -04:00
Guillaume Chatelet 0788186182 [Alignment][NFC] Remove usage of MemSDNode::getAlignment
I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with TypeSize since it came up a few times.

Differential Revision: https://reviews.llvm.org/D126910
2022-06-07 13:52:20 +00:00
Nikita Popov 5a64bc207e [DAGCombiner] Remove overzealous assertion when folding assert+trunc+assert (PR55846)
These assert that there are no "useless" assertzext/assertsext nodes
(that assert a wider width than a following trunc), but I don't think
there is anything preventing such nodes from reaching this code.
I don't think the assertion is relevant for correctness of this
transform either -- if such an assert is present, then the other
one will always be to a smaller width, and we'll pick that one.
The assertion dates back to D37017.

Fixes https://github.com/llvm/llvm-project/issues/55846.

Differential Revision: https://reviews.llvm.org/D126952
2022-06-07 09:50:26 +02:00
Fangrui Song 15d82c62dc [MC] De-capitalize MCStreamer functions
Follow-up to c031378ce0 .
The class is mostly consistent now.
2022-06-07 00:31:02 -07:00
Hendrik Greving a43d25734a [ModuloSchedule] Fix terminator update when peeling.
Fixes a bug of us not correctly updating the terminator of the loop's
preheader, if multiple terminating branch instructions are present.

This is tested through existing tests. The bug itself is hard or not
possible to get exposed with the upstream Hexagon backend, because
the machine pipeliner checks for an existing preheader, which is
defined as a block with only 1 edge into the header.

The condition of this bug is a block into the loop with more than 1
edge, and not every downstream target checks for an existing preheader.

Differential Revision: https://reviews.llvm.org/D126386
2022-06-06 19:52:28 +00:00
Michael Kitzan b7fcf6632f [GISel] Add new combines for G_ADD
Patch adds new GICombineRules for G_ADD:

G_ADD(x, G_SUB(y, x)) -> y
G_ADD(G_SUB(y, x), x) -> y

Patch additionally adds new combine tests for AArch64 target for
these new rules.

Reviewed by: paquette

Differential Revision: https://reviews.llvm.org/D87936
2022-06-06 11:19:45 -07:00
Craig Topper be398100ea [SelectionDAG] Further improve computeKnownBits for (smax X, C) where C is non-negative.
Move the code that was added for D126896 after the normal recursive calls
to computeKnownBits. This allows us to calculate trailing zeros.
Previously we would break out of the switch before the recursive calls.
2022-06-06 09:59:23 -07:00
Kazu Hirata 5c06f7168f [CodeGen] Remove splitCanCauseEvictionChain and its helpers (NFC)
The last use was removed on Mar 7, 2022 in commit
294eca35a0.
2022-06-05 20:22:47 -07:00
Kazu Hirata 43d4585e64 [GlobalISel] Remove widenWithUnmerge (NFC)
The last use was removed on Dec 23, 2021 in commit
29f88b93fd.
2022-06-05 19:58:18 -07:00
Kazu Hirata 61abcb0b37 [GlobalISel] Remove valueIsSplit (NFC)
The last use was removed on Jun 27, 2019 in commit
8138996128.
2022-06-05 19:51:03 -07:00
Lian Wang 20cf77f776 [LegalizeTypes][VP] Add widen and split support for vp.fptrunc and vp.fpext
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D126439
2022-06-06 02:28:01 +00:00
Kazu Hirata 3b9707dbc0 [llvm] Convert for_each to range-based for loops (NFC) 2022-06-05 12:07:14 -07:00
Alexey Lapshin 501d5b24db [Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef - remove isIndexed().
This patch is extraction from the https://reviews.llvm.org/D126883.
It removes DwarfStringPoolEntryRef::isIndexed() and isIndexed bit
since they are not used.

Differential Revision: https://reviews.llvm.org/D126958
2022-06-05 21:18:31 +03:00
Fangrui Song 95a134254a Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 01:07:51 -07:00
Fangrui Song d86a206f06 Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 00:31:44 -07:00
Kazu Hirata bcf4fa458a [CodeGen] Use a range-based for loop (NFC) 2022-06-04 22:26:55 -07:00
Kazu Hirata 4969a6924d Use llvm::less_first (NFC) 2022-06-04 21:23:18 -07:00
Kazu Hirata 32ce076d78 [CodeGen] Use StringRef::contains (NFC) 2022-06-04 20:58:58 -07:00
Fangrui Song 36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Fangrui Song 557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Benjamin Kramer e8e4b741dd [DAGCombiner] Add bf16 to the matrix of types that we don't promote to integer stores
Remove a few stray semicolons while there.
2022-06-03 13:28:34 +02:00
Nikita Popov ad742cf85d [DAGCombine] Handle promotion of shift with both operands the same
When promoting a shift, make sure we only fetch the second operand
after promoting the first. Load promotion may replace users of the
old load, and we don't want to be left with a dangling reference to
the old load instruction.

The crashing test case is from https://reviews.llvm.org/D126689#3553212.

Differential Revision: https://reviews.llvm.org/D126886
2022-06-03 10:00:44 +02:00
Craig Topper fa20bf1636 [DAGCombiner][RISCV] Improve computeKnownBits for (smax X, C) where C is non-negative.
If C is non-negative, the result of the smax must also be
non-negative, so all sign bits of the result are 0.

This allows DAGCombiner to remove a zext_inreg in the modified test.
This zext_inreg started as a sext that became zext before type
legalization then was promoted to a zext_inreg.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126896
2022-06-02 12:34:24 -07:00
jacquesguan 5482ae6328 [LegalizeTypes][VP] Add widen and split support for VP FP integer casting op.
This patch adds widen and split support for VP_FPTOSI, VP_FPTOUI, VP_SITOFP and VP_UITOFP.

Differential Revision: https://reviews.llvm.org/D126847
2022-06-02 09:05:27 +00:00
jacquesguan 058791d8f2 [LegalizeTypes][VP] Add widen and split support for VP_SIGN_EXTEND and VP_ZERO_EXTEND.
Differential Revision: https://reviews.llvm.org/D126442
2022-06-02 02:21:22 +00:00
Matt Arsenault 4cb722acbc BranchFolder: Require NoPHIs
The pass doesn't handle SSA and breaks any phis.
2022-06-01 21:14:49 -04:00
Hendrik Greving a92ed167f2 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-02 00:49:11 +00:00
Quentin Colombet 1a155ee7de [RegisterClassInfo] Invalidate cached information if ignoreCSRForAllocationOrder changes
Even if CSR list is same between functions, we could have had a different
allocation order if ignoreCSRForAllocationOrder is evaluated differently.
Hence invalidate cached register class information if
ignoreCSRForAllocationOrder changes.

Patch by Srividya Karumuri <srividya_karumuri@apple.com>

Differential Revision: https://reviews.llvm.org/D126565
2022-06-01 17:15:51 -07:00
Hendrik Greving e9d05cc7d8 Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4."
This reverts commit 430ac5c302.

Due to failures in Clang tests.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 13:27:49 -07:00
Hendrik Greving 430ac5c302 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 12:48:01 -07:00
Denis Antrushin 7047d79fde [TwoAddressInstructionPass] Relax assert in statepoint processing.
D124631 added special processing for STATEPOINT instructions.
It appears that assertion added there is too strong. We can get two
tied operands with the same register tied to different defs. If we
hit such case, do not process it in statepoint-specific code and
delegate it to common case.
2022-06-01 21:34:52 +07:00
Matt Arsenault 0e1c71e4a4 CodeGen: Move getAddressSpaceForPseudoSourceKind into TargetMachine
Avoid the dependency on TargetInstrInfo, which depends on the subtarget
and therefore the individual function.

Currently AMDGPU is constructing PseudoSourceValue instances in MachineFunctionInfo.
In order to facilitate copying MachineFunctionInfo, we need to stop allocating these
there. Alternatively we could allow targets to subclass PseudoSourceValueManager,
and allocate them similarly to MachineFunctionInfo.
2022-06-01 09:45:40 -04:00
Martin Storsjö 6b75a3523f [ARM] [MC] Add support for writing ARM WinEH unwind info
This includes .seh_* directives for generating it from assembly.
It is designed fairly similarly to the ARM64 handling.

For .seh_handler directives, such as
".seh_handler __C_specific_handler, @except" (which is supported
on x86_64 and aarch64 so far), the "@except" bit doesn't work in
ARM assembly, as '@' is used as a comment character (on all current
platforms).

Allow using '%' instead of '@' for this purpose. This convention
is used by GAS in similar contexts already,
e.g. [1]:

    Note on targets where the @ character is the start of a comment
    (eg ARM) then another character is used instead. For example the
    ARM port uses the % character.

In practice, this unfortunately means that all such .seh_handler
directives will need ifdefs for ARM.

Contrary to ARM64, on ARM, it's quite common that we can't evaluate
e.g. the function length at this point, due to instructions whose
length is finalized later. (Also, inline jump tables end with
a ".p2align 1".)

If unable to to evaluate the function length immediately, emit
it as an MCExpr instead. If we'd implement splitting the unwind
info for a function (which isn't implemented for ARM64 yet either),
we wouldn't know whether we need to split it though.

Avoid calling getFrameIndexOffset() on an unset
FuncInfo.UnwindHelpFrameIdx, to avoid triggering asserts in the
preexisting testcase CodeGen/ARM/Windows/wineh-basic.ll. (Once
MSVC exception handling is fully implemented, those changes
can be reverted.)

[1] https://sourceware.org/binutils/docs/as/Section.html#Section

Differential Revision: https://reviews.llvm.org/D125645
2022-06-01 11:25:48 +03:00
Ping Deng ae8ae45e2a [DAGCombine][NFC] Add braces to 'else' to match braced 'if'
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D126624
2022-06-01 07:54:05 +00:00
Bjorn Pettersson 86caa03718 Revert "Round up zero-sized symbols to 1 byte in `.debug_aranges`."
This reverts commit 256a52d9aa (and
also the follow-up commit 38eb4fe74b that moved a test
case to a different directory).

As discussed in https://reviews.llvm.org/D126257 there is a suspicion
that something was wrong with this commit as text section range was
shortened to 1 byte rather than rounded up as shown in the
llvm/test/DebugInfo/X86/dwarf-aranges.ll test case.
2022-05-31 11:03:44 +02:00
Denis Antrushin 85322e82be [TwoAddressInstructionPass] Special processing of STATEPOINT instruction.
STATEPOINT is a special pseudo instruction which represent Moving GC semantic to LLVM.
Every tied def/use VReg pair in STATEPOINT represent same physical register which can
'magically' change during call wrapped by statepoint.
(By construction, tied use operand  is not live across  STATEPOINT).

This means that when converting into two-address form, there is not need to insert COPY
instruction before stateppoint, what TwoAddressInstruction pass does for 'regular'
instructions.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D124631
2022-05-30 19:07:30 +03:00
Simon Moll 18c1ee04de Re-land "[VP] vp intrinsics are not speculatable" with test fix
Update the llvmir-intrinsics.mlir test to account for the modified
attribute sets.

This reverts commit 2e2a8a2d90.
2022-05-30 14:41:15 +02:00
Mehdi Amini 2e2a8a2d90 Revert "[VP] vp intrinsics are not speculatable"
This reverts commit 78a18d2b54.

Break MLIR bot: https://lab.llvm.org/buildbot/#/builders/61/builds/27127
2022-05-30 12:26:16 +00:00
Simon Moll 78a18d2b54 [VP] vp intrinsics are not speculatable
VP intrinsics show UB if the %evl parameter is out of bounds - they must
not carry the speculatable attribute.  The out-of-bounds UB disappears
when the %evl parameter is expanded into the mask or expansion replaces
the entire VP intrinsic with non-VP code.

This patch
- Removes the speculatable attribute on all VP intrinsics.
- Generalizes the isSafeToSpeculativelyExecute function to let VP
  expansion know whether the VP intrinsic replacement will be
  speculatable.  VP expansion may only discard %evl where this is the
  case.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125296
2022-05-30 12:20:05 +02:00
Ping Deng 88af539c0e [RISCV] Support VP_REDUCE_MUL mask operation
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126520
2022-05-30 03:05:39 +00:00
Ping Deng 083798e270 [LegalizeTypes][VP] Add integer promotion support for vp.fptosi/vp.fptoui
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125760
2022-05-30 03:05:39 +00:00
Serge Pavlov bdd0093f4d [GlobalISel] Add G_IS_FPCLASS
Add a generic opcode to represent `llvm.is_fpclass` intrinsic.

Differential Revision: https://reviews.llvm.org/D121454
2022-05-27 13:49:47 +07:00
Ping Deng 121689a62e [SelectionDAG][NFC] Simplify integer promotion in setcc/vp.setcc
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126516
2022-05-27 05:50:19 +00:00
Rahman Lavaee 08cc058518 Reland "[Propeller] Promote functions with propeller profiles to .text.hot."
This relands commit 4d8d2580c5.

The major change here is using 'addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests.

Differential Revision: https://reviews.llvm.org/D126518
2022-05-26 19:53:14 -07:00
Rahman Lavaee 3aa249329f Revert "[Propeller] Promote functions with propeller profiles to .text.hot."
This reverts commit 4d8d2580c5.
2022-05-26 18:45:40 -07:00
Rahman Lavaee 4d8d2580c5 [Propeller] Promote functions with propeller profiles to .text.hot.
Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely).

This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`.

The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`.

Differential Revision: https://reviews.llvm.org/D122930
2022-05-26 16:23:21 -07:00
Craig Topper 460781feef [LegalizeTypes] Fix bug in expensive checks verification
With a fix for an expensive checks build failure exposed by new RISC-V tests.
Something about expanding two rotates in type legalization caused a change
in the remapping tables that the expensive checks verifying wasn't expecting.
See comment in the code for how it was fixed.

Tests came from this commit that exposed the bug
[RISCV] Add test cases showing failure to remove mask on rotate amounts.

If the masking AND has multiple users we fail to remove it.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D126036
2022-05-26 13:13:32 -07:00
Adrian Tong 7c13ae6490 Give option to use isCopyInstr to determine which MI is
treated as Copy instruction in MCP.

This is then used in AArch64 to remove copy instructions after taildup
ran in machine block placement

Differential Revision: https://reviews.llvm.org/D125335
2022-05-26 18:43:16 +00:00
Simon Pilgrim f366acdbf6 [DAG] Generalize (sra (trunc (sra x, c1)), c2) -> (trunc (sra x, c1 + c2)) constant folding
Remove local (uniform) constant folding and rely on getNode() to perform it

Minor cleanup step toward adding non-uniform shift amount support
2022-05-26 14:05:09 +01:00
Simon Pilgrim 7b617eef80 [DAG] Cleanup "and/or of cmp with single bit diff" fold to use ISD::matchBinaryPredicate
Prep work as I'm investigating some cases where TLI::convertSetCCLogicToBitwiseLogic should accept vectors.
2022-05-26 12:34:09 +01:00
Chen Zheng d79275238f [MachineSink] replace MachineLoop with MachineCycle
reapply 62a9b36fcf and fix module build
failue:
1: remove MachineCycleInfoWrapperPass in MachinePassRegistry.def
   MachineCycleInfoWrapperPass is a anylysis pass, should not be there.
2: move the definition for MachineCycleInfoPrinterPass to cpp file.

Otherwise, there are module conflicit for MachineCycleInfoWrapperPass
in MachinePassRegistry.def and MachineCycleAnalysis.h after
62a9b36fcf.

MachineCycle can handle irreducible loop. Natural loop
analysis (MachineLoop) can not return correct loop depth if
the loop is irreducible loop. And MachineSink is sensitive
to the loop depth, see MachineSinking::isProfitableToSinkTo().

This patch tries to use MachineCycle so that we can handle
irreducible loop better.

Reviewed By: sameerds, MatzeB

Differential Revision: https://reviews.llvm.org/D123995
2022-05-26 06:45:23 -04:00
Fangrui Song 9ee15bba47 [MC] Lower case the first letter of EmitCOFF* EmitWin* EmitCV*. NFC 2022-05-26 00:14:08 -07:00
serge-sans-paille fb67d683db [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since 7030654296 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D126417
2022-05-26 08:12:34 +02:00
Lian Wang 8aa6b05deb [LegalizeTypes][VP] Add widen and split support for VP_TRUNCATE
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125950
2022-05-26 02:03:27 +00:00
Patrick Walton 256a52d9aa Round up zero-sized symbols to 1 byte in `.debug_aranges`.
This commit modifies the AsmPrinter to avoid emitting any zero-sized symbols to
the .debug_aranges table, by rounding their size up to 1. Entries with zero
length violate the DWARF 5 spec, which states:

> Each descriptor is a triple consisting of a segment selector, the beginning
> address within that segment of a range of text or data covered by some entry
> owned by the corresponding compilation unit, followed by the non-zero length
> of that range.

In practice, these zero-sized entries produce annoying warnings in lld and
cause GNU binutils to truncate the table when parsing it.

Other parts of LLVM, such as DWARFDebugARanges in the DebugInfo module
(specifically the appendRange method), already avoid emitting zero-sized
symbols to .debug_aranges, but not comprehensively in the AsmPrinter. In fact,
the AsmPrinter does try to avoid emitting such zero-sized symbols when labels
aren't involved, but doesn't when the symbol to emitted is a difference of two
labels; this patch extends that logic to handle the case in which the symbol is
defined via labels.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D126257
2022-05-25 13:31:36 -07:00
Takafumi Arakaki 18e6b8234a Allow pointer types for atomicrmw xchg
This adds support for pointer types for `atomic xchg` and let us write
instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This
is similar to the patch for allowing atomicrmw xchg on floating point
types: https://reviews.llvm.org/D52416.

Differential Revision: https://reviews.llvm.org/D124728
2022-05-25 16:20:26 +00:00
Simon Moll 6e12711081 [VP][fix] Don't discard masks in reductions
When expanding VP reductions to non VP-code, the reduction pass was
ignoring the mask before. Fix this by keeping the mask and selecting
neutral elements where the mask is zero.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D126362
2022-05-25 15:54:45 +02:00
Chen Zheng 80c4910f3d Revert "[MachineSink] replace MachineLoop with MachineCycle"
This reverts commit 62a9b36fcf.
Cause build failure on lldb incremental buildbot:
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/43994/changes
2022-05-24 22:43:37 -04:00
Paul Walker 6f215ca680 [SelectionDAG] Add support to widen ISD::STEP_VECTOR operations.
Fixes: #55165

Differential Revision: https://reviews.llvm.org/D126168
2022-05-24 22:42:37 +01:00
Sotiris Apostolakis 67be40df6e Recommit "[SelectOpti][5/5] Optimize select-to-branch transformation"
Use container::size_type directly to avoid type mismatch causing build failures in Windows.

Original commit message:
This patch optimizes the transformation of selects to a branch when the heuristics deemed it profitable.
It aggressively sinks eligible instructions to the newly created true/false blocks to prevent their
execution on the common path and interleaves dependence slices to maximize ILP.

Depends on D120232

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120233
2022-05-24 14:08:09 -04:00
Serge Pavlov 6fc0bc5b0f Fix behavior of is_fp_class on empty class set
The second argument to is_fp_class specifies the set of floating-point
class to test against. It can be zero, in this case the intrinsic is
expected to return zero value.

Differential Revision: https://reviews.llvm.org/D112025
2022-05-24 21:50:18 +07:00
Simon Pilgrim 11455e4758 [DAG] Unroll vectorized FPOW instructions before widening that will scalarize to libcalls anyway
Followup to D125988 - FPOW is similar to FREM and will most likely scalarize to libcalls, so unroll before widening to prevent use making additional libcalls with UNDEF args.
2022-05-24 15:44:53 +01:00
Sam Parker e0fe9785d3 [TypePromotion] Avoid unnecessary trunc zext pairs
Any zext 'sink' should already have an operand that is in the legal
value, so avoid using a trunc and just use the trunc operand instead.

Differential Revision: https://reviews.llvm.org/D118905
2022-05-24 15:34:36 +01:00
Nabeel Omer 8b5d9cbbfe [x86][DAG] Unroll vectorized FREMs that will become libcalls
Currently, two element vectors produced as the result of a binary op are
widened to four element vectors on x86 by
DAGTypeLegalizer::WidenVecRes_BinaryCanTrap. If the op still isn't legal
after widening it is unrolled into 4 scalar ops in SelectionDAG before
being converted into a libcall. This way we end up with 4 libcalls (two of
them on known undef elements) instead of the original two libcalls.

This patch modifies DAGTypeLegalizer::WidenVectorResult to ensure that if
it is known that a binary op will be tunred into a libcall, it is unrolled
instead of being widened. This prevents the creation of the extra scalar
instructions on known undef elements and (eventually) libacalls with known
undef parameters which would otherwise be created when the op gets expanded
post widening.

Differential Revision: https://reviews.llvm.org/D125988
2022-05-24 13:34:51 +01:00
Fraser Cormack 7f7ef0ed61 [LegalizeTypes][NFC] Fix node name in assertion message
This was probably copy/pasted from the MSCATTER widening.
2022-05-24 09:16:18 +01:00
Lian Wang be84f91f87 [LegalizeTypes][VP] Fix OpNo in WidenVecOp_VP_SCATTER
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126276
2022-05-24 07:14:46 +00:00
Chen Zheng 62a9b36fcf [MachineSink] replace MachineLoop with MachineCycle
MachineCycle can handle irreducible loop. Natural loop
analysis (MachineLoop) can not return correct loop depth if
the loop is irreducible loop. And MachineSink is sensitive
to the loop depth, see MachineSinking::isProfitableToSinkTo().

This patch tries to use MachineCycle so that we can handle
irreducible loop better.

Reviewed By: sameerds, MatzeB

Differential Revision: https://reviews.llvm.org/D123995
2022-05-24 01:16:19 -04:00
Sotiris Apostolakis 1786e70bd8 Revert "[SelectOpti][5/5] Optimize select-to-branch transformation"
This reverts commit a111fb9601.
2022-05-24 00:02:00 -04:00
Sotiris Apostolakis a111fb9601 [SelectOpti][5/5] Optimize select-to-branch transformation
This patch optimizes the transformation of selects to a branch when the heuristics deemed it profitable.
It aggressively sinks eligible instructions to the newly created true/false blocks to prevent their
execution on the common path and interleaves dependence slices to maximize ILP.

Depends on D120232

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120233
2022-05-23 23:31:27 -04:00
Sotiris Apostolakis d7ebb74611 [SelectOpti][4/5] Loop Heuristics
This patch adds the loop-level heuristics for determining whether branches are more profitable than conditional moves.
These heuristics apply to only inner-most loops.

Depends on D120231

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120232
2022-05-23 22:05:41 -04:00
Sotiris Apostolakis 8b42bc5662 [SelectOpti][3/5] Base Heuristics
This patch adds the base heuristics for determining whether branches are more profitable than conditional moves.
Base heuristics apply to all code apart from inner-most loops.

Depends on D122259

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120231
2022-05-23 22:01:12 -04:00
Sotiris Apostolakis 97c3ef5c8a [SelectOpti][2/5] Select-to-branch base transformation
This patch implements the actual transformation of selects to branches.
It includes only the base transformation without any sinking.

Depends on D120230

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D122259
2022-05-23 16:11:40 -04:00
Qunyan Mangus 12bae5f3e2 Remove duplicate fields in RAGreedy
RAGreedy has two fields of RegisterClassInfo, one called RCI and another RegClassInfo from its base class.
RCI is initialized without freezeReservedRegs first, while RegClassInfo does. Therefore, if reserved registers
information is changed between last time freezeReservedRegs is called and RAGreedy, it's not picked up by RCI.
Instead of having both fields in RAGreedy, remove RCI and use RegClassInfo instead. Also removed is the TRI field
which is present in its base class.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D125926
2022-05-23 13:08:25 -07:00
Craig Topper 569d8945f3 [DAGCombiner][AArch64] Don't fold (smulo x, 2) -> (saddo x, x) if VT is i2.
If the VT is i2, then 2 is really -2.

Test has not been commited yet, but diff shows the change.

Fixes PR55644.

Differential Revision: https://reviews.llvm.org/D126213
2022-05-23 11:13:57 -07:00
Nikita Popov 5126c38012 [CGP] Freeze condition when despeculating ctlz/cttz
Freeze the condition of the newly introduced conditional branch,
to avoid immediate undefined behavior if the input to ctlz/cttz
was originally poison.

Differential Revision: https://reviews.llvm.org/D125887
2022-05-23 11:01:18 +02:00
Craig Topper c11051a400 [SelectionDAG] Add a freeze to ISD::ABS expansion.
I had initially assumed this was the problem with
https://github.com/llvm/llvm-project/issues/55271#issuecomment-1133426243

But it turns out that was a simpler issue. This patch is still
more correct than what we were doing before so figured I'd submit
it anyway.

No test case because I'm not sure how to get an undef around
until expansion.

Looking at the test deltas I wonder if it be valid to combine
(sext_inreg (freeze (aextload X))) -> (freeze (sextload X)).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D126175
2022-05-22 14:29:58 -07:00
Craig Topper 768a1ca5ec [SelectionDAG] Fold abs(undef) to 0 instead of undef.
abs should only produce a positive value or the signed minimum
value. This means we can't fold abs(undef) to undef as that would
allow more values. Fold to 0 instead to match InstSimplify.

Fixes test mentioned in comment on pr55271.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126174
2022-05-22 12:47:32 -07:00
Paul Walker 258dac43d6 [SVE] Enable use of 32bit gather/scatter indices for fixed length vectors
Differential Revision: https://reviews.llvm.org/D125193
2022-05-22 12:32:30 +01:00
Ping Deng 0e8ac3a797 [LegalizeTypes][VP] Add integer promotion support for vp.sitofp/vp.uitofp
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125960
2022-05-22 02:13:45 +00:00
Craig Topper 4638766794 [TypePromotion] Refine fix sext/zext for promoted constant from D125294.
Reviewing the code again, I believe the sext is needed on the LHS
or RHS for ICmp and only on the RHS for Add.

Add an opcode check before checking the operand number.

Fixes PR55627.

Differential Revision: https://reviews.llvm.org/D125654
2022-05-21 14:08:15 -07:00
Craig Topper 003b95acf2 [LegalizeTypes] Remove double map lookup in DAGTypeLegalizer::PerformExpensiveChecks. NFC
Remove repeated checks for ResId being 0.
2022-05-21 00:06:59 -07:00
Craig Topper 66875dbcc0 [LegalizeTypes] Use SmallDenseMap::count instead of SmallDenseMap::find. NFC
It's more readable and more efficient.
2022-05-21 00:06:55 -07:00
Shilei Tian ff60a0a364 [LLVM] Add a check if should cast atomic operations to integer type
Currently for atomic load, store, and rmw instructions, as long as the
operand is floating-point value, they are casted to integer. Nowadays many
targets can actually support part of atomic operations with floating-point
operands. For example, NVPTX supports atomic load and store of floating-point
values. This patch adds a series interface functions `shouldCastAtomicXXXInIR`,
and the default implementations are same as what we currently do. Later for
targets can have their specialization.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125652
2022-05-20 17:23:53 -04:00
Zequan Wu 9886046289 [CodeView] Combine variable def ranges that are continuous.
It saves about 1.13% size for chrome.dll.pdb on chrome official build.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D125721
2022-05-20 12:12:14 -07:00
Craig Topper 8d3894f67e [TypePromotion] Fix another case for sext vs zext in promoted constant.
If the SafeWrap operation is a subtract, we negated the constant
to treat the subtract as an addition. The sext was based on the
operation being addition. So we really need to do (neg (sext (neg C)))
when promoting the constant. This is equivalent to (sext C) for
every value of C except the min signed value. For min signed value
we need to do (zext C) instead.

Fixes PR55490.

Differential Revision: https://reviews.llvm.org/D125653
2022-05-20 09:30:07 -07:00
Ivan Kosarev 86803008ea [MIR] Provide location of extra instruction operand when diagnosing it.
Also resolves misspelled FileCheck directives caught with D125604.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D125965
2022-05-20 05:56:25 +01:00
Sotiris Apostolakis ca7c307d18 [SelectOpti][1/5] Setup new select-optimize pass
This is the first commit for the cmov-vs-branch optimization pass.
The goal is to develop a new profile-guided and target-independent cost/benefit analysis
for selecting conditional moves over branches when optimizing for performance.

Initially, this new pass is expected to be enabled only for instrumentation-based PGO.

RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D120230
2022-05-19 16:31:10 +00:00
Jay Foad 6bec3e9303 [APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557
2022-05-19 11:23:13 +01:00
Lian Wang 530bab1f93 [RISCV][SelectionDAG] Support VECREDUCE_ADD mask operation
Re-landed D125206

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125206
2022-05-19 09:53:33 +00:00
Lian Wang f035068bb3 [LegalizeVectorTypes][VP] Add widen and split support for VP_SETCC
Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D125446
2022-05-19 07:42:39 +00:00
Lian Wang bbc6834e26 [LegalizeTypes][VP] Add integer promotions support for VP_TRUNCATE
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125739
2022-05-19 07:36:10 +00:00
Lian Wang 993070d11f [LegalizeTypes][VP][NFC] Use an if and two returns instead of ?: operator
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125858
2022-05-19 07:18:24 +00:00
Jon Roelofs d699e54ca2 Fix an or+and miscompile w/ GlobalISel
Fixes #55284
2022-05-18 19:09:47 -07:00
Matthias Braun 8d03c49f49 Extend switch condition in optimizeSwitchPhiConst when free
In a case like:

    switch((i32)x) { case 42: phi((i64)42, ...); }

replace `(i64)42` with `zext(x)` when we can do so for free.

This fixes a part of https://github.com/llvm/llvm-project/issues/55153

Differential Revision: https://reviews.llvm.org/D124897
2022-05-18 16:23:53 -07:00
Mitch Phillips 7aa1fa0a0a Reland "[dwarf] Emit a DIGlobalVariable for constant strings."
An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.

Reviewed By: dblaikie, rnk, aprantl

Differential Revision: https://reviews.llvm.org/D123534
2022-05-18 13:56:45 -07:00
Michael Kitzan 29bebb0237 [GISel] Add new combines for G_FMINNUM/MAXNUM and G_FMINIMUM/MAXIMUM
I noticed https://reviews.llvm.org/D87415 added SDAG combines to fold
FMIN/MAX instrs with NaNs.

The patch implements the same NaN combines for GISel GMIR FMIN/MAX opcodes:
G_FMINNUM(X, NaN) -> X
G_FMAXNUM(X, NaN) -> X
G_FMINIMUM(X, NaN) -> NaN
G_FMAXIMUM(X, NaN) -> NaN

The patch adds AArch64 tests for these combines as well.

Reviewed by: arsenm

Differential revision: https://reviews.llvm.org/D125819
2022-05-18 12:08:53 -07:00
Yusra Syeda 5ac411aea8 [SystemZ][z/OS] Add the PPA1 to SystemZAsmPrinter
Differential Revision: https://reviews.llvm.org/D125725
2022-05-18 14:13:17 -04:00
Craig Topper 46eef76876 [DAGCombiner] Fix bug in MatchBSwapHWordLow.
This function tries to match (a >> 8) | (a << 8) as (bswap a) >> 16.

If the SRL isn't masked and the high bits aren't demanded, we still
need to ensure that bits 23:16 are zero. After the right shift they
will be in bits 15:8 which is where the important bits from the SHL
end up. It's only a bswap if the OR on bits 15:8 only takes the bits
from the SHL.

Fixes PR55484.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D125641
2022-05-18 09:23:18 -07:00
Yeting Kuo 00999fb6e1 [SelectionDAGBuilder] Pass fast math flags to most of VP SDNodes.
The patch does not pass math flags to float VPCmpIntrinsics because LLParser
could not identify float VPCmpIntrinsics as FPMathOperators.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125600
2022-05-18 16:15:47 +08:00
Simon Pilgrim d40b7f0d5a [DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses
If we're using shift pairs to mask, then relax the one use limit if the shift amounts are equal - we'll only be generating a single AND node.

AArch64 has a couple of regressions due to this, so I've enforced the existing one use limit inside a AArch64TargetLowering::shouldFoldConstantShiftPairToMask callback.

Part of the work to fix the regressions in D77804

Differential Revision: https://reviews.llvm.org/D125607
2022-05-17 13:40:11 +01:00
Jay Foad 77480556c4 [RegAllocGreedy] New hook regClassPriorityTrumpsGlobalness
Add a new TargetRegisterInfo hook to allow targets to tweak the
priority of live ranges, so that AllocationPriority of the register
class will be treated as more important than whether the range is local
to a basic block or global. This is determined per-MachineFunction.

Differential Revision: https://reviews.llvm.org/D125102
2022-05-17 12:35:21 +01:00
jacquesguan 26593e7314 [SelectionDAG] Support more VP reduction mask operation.
This patch uses VP_REDUCE_AND and VP_REDUCE_OR to replace VP_REDUCE_SMAX,VP_REDUCE_SMIN,VP_REDUCE_UMAX and VP_REDUCE_UMIN for mask vector type.

Differential Revision: https://reviews.llvm.org/D125002
2022-05-17 09:14:21 +00:00
Fraser Cormack 599ff247de [StackColoring] Don't merge slots with differing StackIDs
The documentation for this specifically mentions that this should not
happen. We could think about adding target hooks to permit it (and how
to merge IDs) in the future if that is desirable.

This specific test case was merging a scalable-vector slot into a
non-scalable one and dropping the notion of scalability, meaning we
failed to allocate enough stack space for the object.

Reviewed By: arsenm, MaskRay, sdesmalen

Differential Revision: https://reviews.llvm.org/D125699
2022-05-17 08:28:49 +01:00
Mitch Phillips ed2c3218f5 Revert "[dwarf] Emit a DIGlobalVariable for constant strings."
This reverts commit 4680982b36.

Broke a fuchsia windows bot. More details in the review:
https://reviews.llvm.org/D123534
2022-05-16 19:07:38 -07:00
Mitch Phillips 4680982b36 [dwarf] Emit a DIGlobalVariable for constant strings.
An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.

Reviewed By: dblaikie, rnk, aprantl

Differential Revision: https://reviews.llvm.org/D123534
2022-05-16 16:52:16 -07:00
Philip Reames 7dbf2e7b57 Teach PeepholeOpt to eliminate redundant copy from constant physreg (e.g VLENB on RISCV)
The existing redundant copy elimination required a virtual register source, but the same logic works for any physreg where we don't have to worry about clobbers.  On RISCV, this helps eliminate redundant CSR reads from VLENB.

Differential Revision: https://reviews.llvm.org/D125564
2022-05-16 16:38:30 -07:00
Paul Walker 7dd05ba9ed [SelectionDAG] Remove duplicate "is scaled" information from gather/scatter SDNodes.
During early gather/scatter enablement two different approaches
were taken to represent scaled indices:

* A Scale operand whereby byte_offsets = Index * Scale
* An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType)

Having multiple representations is bad as shown by this patch which
fixes instances where the two are out of sync. The dedicated scale
operand is more flexible and pervasive so this patch removes the
UNSCALED values from IndexType. This means all indices are scaled
but the scale can be one, hence unscaled. SDNodes now use the scale
operand to answer the "isScaledIndex" question.

I toyed with the idea of keeping the UNSCALED enums and helper
functions but because they will have no uses and force SDNodes to
validate the set of supported values I figured it's best to remove
them. We can re-add them if there's a real need. For similar
reasons I've kept the IndexType enum when a bool could be used as I
think being explicitly looks better.

Depends On D123347

Differential Revision: https://reviews.llvm.org/D123381
2022-05-16 20:47:52 +01:00
Craig Topper 1c4880a2d3 [TargetLowering] Expand the last stage of i16 popcnt using shift+add+and instead of mul+shift.
If we use multiply it would be with 0x0101 which is 1 more than a power
of 2. On some targets we would expand this to shl+add. By avoiding the
multiply earlier, we can generate better code.

Note, PowerPC doesn't do the shl+add expansion of multiply so one of
the tests increased in instruction count.

Limiting to scalars because it almost always increased the number of
instructions in vector tests.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D125638
2022-05-16 09:27:44 -07:00
Craig Topper e6fc8454be [DAGCombiner] Fix incorrect indentation. NFC 2022-05-16 09:27:15 -07:00
Philip Reames 55e2df7285 [LiveIntervals] Add range accessors for value numbers [nfc] 2022-05-16 08:23:12 -07:00
Bradley Smith 7ff5148d64 [DAGCombine] Support splat_vector nodes in (and (extload)) dagcombine
Differential Revision: https://reviews.llvm.org/D125367
2022-05-16 11:25:20 +00:00
Abinav Puthan Purayil 485dd0b752 [GlobalISel] Handle constant splat in funnel shift combine
This change adds the constant splat versions of m_ICst() (by using
getBuildVectorConstantSplat()) and uses it in
matchOrShiftToFunnelShift(). The getBuildVectorConstantSplat() name is
shortened to getIConstantSplatVal() so that the *SExtVal() version would
have a more compact name.

Differential Revision: https://reviews.llvm.org/D125516
2022-05-16 16:03:30 +05:30
Yeting Kuo 26a61ab678 [SelectionDAG] Make getNode which uses single element SDVTList pass SDNodeFlags.
The patch make users not need to know getNode with SDNodeFlags argument may not
pass its flags.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125659
2022-05-16 18:19:46 +08:00
Denis Antrushin 8903dbef8f [StatepointLowering] Properly handle local and non-local relocates of the same value.
FunctionLoweringInfo::StatepointRelocationMaps map is used to pass GC pointer
lowering information from statepoint to gc.relocate  which may appear ini
different block.
D124444 introduced different lowering for local and non-local relocates.
Local relocates use SDValue and non-local relocates use value exported to VReg.
But I overlooked the fact that StatepointRelocationMap is indexed not by
GCRelocate instruction, but by derived pointer. This works incorrectly when
we have two relocates (one local and another non-local) of the same value,
because they need different relocation records.

This patch fixes the problem by recording relocation information per relocate
instruction, not per derived pointer. This way, each gc.relocate can be lowered
differently.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D125538
2022-05-16 17:02:34 +07:00
Nikita Popov 05c3fe075d [FastISel] Fix load folding for registers with fixups
FastISel tries to fold loads into the single using instruction.
However, if the register has fixups, then there may be additional
uses through an alias of the register.

In particular, this fixes the problem reported at
https://reviews.llvm.org/D119432#3507087. The load register is
(at the time of load folding) only used in a single call instruction.
However, selection of the bitcast has added a fixup between the
load register and the cross-BB register of the bitcast result.
After fixups are applied, there would now be two uses of the load
register, so load folding is not legal.

Differential Revision: https://reviews.llvm.org/D125459
2022-05-16 10:25:25 +02:00
Craig Topper b4ad450953 [TargetLowering] expandCTPOP don't create an used constant mask for i8 ctpop. NFC
Use early out for the i8 case.

I'm looking at avoiding MUL on targets that use libcalls for MUL.
So doing a little pre-refactoring.
2022-05-14 20:35:38 -07:00
Simon Pilgrim f4eac6e5f6 [DAG] visitOR - merge isa/cast<ShuffleVectorSDNode> into dyn_cast<ShuffleVectorSDNode>. NFC.
Also, initialize entire mask to -1 to simplify undefined cases.
2022-05-14 20:49:26 +01:00
Simon Pilgrim 95cdd63b87 [DAG] visitADDLike - use SelectionDAG::FoldConstantArithmetic directly to match constant operands
SelectionDAG::FoldConstantArithmetic determines if operands are foldable constants, so we don't need to bother with isConstantOrConstantVector / Opaque tests before calling it directly.
2022-05-14 18:39:41 +01:00
Simon Pilgrim 8db72d9d04 [DAG] visitMUL - pull out repeated SDLoc() calls. NFC. 2022-05-14 14:28:39 +01:00
Simon Pilgrim 8d4d4988e4 [DAG] Use SelectionDAG::FoldConstantArithmetic directly to match constant operands
SelectionDAG::FoldConstantArithmetic determines if operands are foldable constants, so we don't need to bother with isConstantOrConstantVector / Opaque tests before calling it directly.
2022-05-14 14:19:12 +01:00
Simon Pilgrim 1ecc3d86ae [DAG] Enable ISD::SHL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
Pulled out of D77804 as its going to be easier to address the regressions individually.

This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.

The lost RISCV gorc2 fold shouldn't be a problem - instcombine would have already destroyed that pattern - see https://github.com/llvm/llvm-project/issues/50553

Differential Revision: https://reviews.llvm.org/D124839
2022-05-14 09:50:01 +01:00
Eli Friedman 96c2a0c9ff [GlobalIsel] Fix fallback if stack protector isn't supported.
When GlobalISel fails, we need to report the error, and we need to set
the FailedISel property.  We skipped those steps if stack protector
insertion failed, which led to a very strange miscompile.

Differential Revision: https://reviews.llvm.org/D125584
2022-05-13 14:17:27 -07:00
Simon Pilgrim 3fc33ced10 DAGCombiner.cpp - break if-else chains that always return (style) 2022-05-13 18:31:39 +01:00
Sanjay Patel e52e1dab2a [SDAG] freeze operand when expanging urem
This is a potential miscompile as discussed in issue #55291.

The related IR transform was patched with:
d428f09b2c
2022-05-13 10:55:14 -04:00
Nikita Popov ed1cb01baf [IRBuilder] Add IsInBounds parameter to CreateGEP()
We commonly want to create either an inbounds or non-inbounds GEP
based on a boolean value, e.g. when preserving inbounds from
existing GEPs. Directly accept such a boolean in the API, rather
than requiring a ternary between CreateGEP and CreateInBoundsGEP.

This change is not entirely NFC, because we now preserve an
inbounds flag in a constant expression edge-case in InstCombine.
2022-05-13 14:30:55 +02:00
Sam Parker 6d53d35efd [TypePromotion] Avoid some unnecessary truncs
Recommit.

Check for legal zext 'sinks' before inserting a trunc.

Differential Revision: https://reviews.llvm.org/D115451
2022-05-13 09:45:20 +01:00
Jay Foad 26e1ebd3ea [GlobalISel] Change ConstantFoldVectorBinop to return vector of APInt
Previously it built MIR for the results and returned a Register.

This avoids building constants for earlier elements of the vector if
later elements will fail to fold, and allows CSEMIRBuilder::buildInstr
to avoid unconditionally building a copy from the result.

Use a new helper function MachineIRBuilder::buildBuildVectorConstant
to build a G_BUILD_VECTOR of G_CONSTANTs.

Differential Revision: https://reviews.llvm.org/D117758
2022-05-13 09:33:07 +01:00
Lian Wang 693758b282 [LegalizeTypes][VP] Add integer promotion support for vp.setcc
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125453
2022-05-13 06:25:13 +00:00
Lian Wang 8050ba6678 [LegalizeTypes][VP] Add integer promotion support for vp.merge
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125452
2022-05-13 03:28:29 +00:00
Craig Topper cec249c60d [TypePromotion] Promote undef by converting to 0.
If we're promoting an undef I think that means that we expect the
upper bits are zero. undef doesn't guarantee that.

This patch replaces undef with 0 to ensure this. This matches how
a zext or sext of undef would be folded by InstCombine/InstSimplify.

I haven't found a failure from this was just thinking through the code.

Differential Revision: https://reviews.llvm.org/D123174
2022-05-12 09:09:24 -07:00
Fraser Cormack 1106bc208c [CodeGen][NFC] Move some comments from the end of lines to above them
This avoids wrapping the line itself awkwardly when it exceeds 80 chars.
It also better matches our style most other places.
2022-05-12 15:45:04 +01:00
Jeremy Morse a975472fa6 [DebugInfo][InstrRef] Describe value sizes when spilt to stack
This is a re-apply of D123599, which was reverted in 4fe2ab5279, now
with a more appropriate assertion. Original commit message follow:

InstrRefBasedLDV can track and describe variable values that are spilt to
the stack -- however it does not current describe the size of the value on
the stack. This can cause uninitialized bytes to be read from the stack if
a small register is spilt for a larger variable, or theoretically on
big-endian machines if a large value on the stack is used for a small
variable.

Fix this by using DW_OP_deref_size to specify the amount of data to load
from the stack, if there's any possibility for ambiguity. There are a few
scenarios where this can be omitted (such as when using DW_OP_piece and a
non-DW_OP_stack_value location), see deref-spills-with-size.mir for an
explicit table of inputs flavours and output expressions.

Differential Revision: https://reviews.llvm.org/D123599
2022-05-12 15:52:55 +01:00
Nikita Popov 50f846d634 [FastISel] Add some debug output (NFC)
Print a debug message when aborting isel (next to the ORE report)
and when folding a load.
2022-05-12 12:25:20 +02:00
Lian Wang 9176096c86 [LegalizeVectorTypes] Enable WidenVecRes_SETCC work for scalable vector.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125359
2022-05-12 02:52:43 +00:00
Craig Topper edbf390d10 [CodeGenPrepare] Use const reference to avoid unnecessary APInt copy. NFC
Spotted while looking at Matthias' patches.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124985
2022-05-11 12:06:45 -07:00
Matthias Braun de9ad98d2d Fix endless loop in optimizePhiConst with integer constant switch condition
Avoid endless loop in degenerate case with an integer constant as switch
condition as reported in https://reviews.llvm.org/D124552
2022-05-11 08:49:01 -07:00
David Green 5feeceddb2 [TypePromotion] Fix sext vs zext in promoted constant
As pointed out in #55342, given non-canonical IR with multiple
constants, we check the second operand in isSafeWrap, but can promote
both with sext. Fix that as suggested by @craig.topper by ensuring we
only extend the second constant if multiple are present.

Fixes #55342

Differential Revision: https://reviews.llvm.org/D125294
2022-05-11 10:47:44 +01:00
David Green 764a7f4864 [TypePromotion] Format Type Promotion. NFC
This clang-formats the TypePromotion code, with the only meaningful
change being the removal of a verifyFunction call inside a LLVM_DEBUG,
and the printing of the entire function which can be better handled
via -print-after-all.
2022-05-11 08:18:58 +01:00
Xiang1 Zhang 2ea8f203cd [CodeGen] Fix ConvertNodeToLibcall for STRICT_FPOWI
Reviewed By: PengfeiWang

Differential Revision: https://reviews.llvm.org/D125159
2022-05-11 08:58:06 +08:00
Matthias Braun f0ea9c9cec CodeGenPrepare: Replace constant PHI arguments with switch condition value
We often see code like the following after running SCCP:

    switch (x) { case 42: phi(42, ...); }

This tends to produce bad code as we currently materialize the constant
phi-argument in the switch-block. This increases register pressure and
if the pattern repeats for `n` case statements, we end up generating `n`
constant values.

This changes CodeGenPrepare to catch this pattern and revert it back to:

    switch (x) { case 42: phi(x, ...); }

Differential Revision: https://reviews.llvm.org/D124552
2022-05-10 10:00:10 -07:00
Matthias Braun cd19af74c0 Avoid 8 and 16bit switch conditions on x86
This adds a `TargetLoweringBase::getSwitchConditionType` callback to
give targets a chance to control the type used in
`CodeGenPrepare::optimizeSwitchInst`.

Implement callback for X86 to avoid i8 and i16 types where possible as
they often incur extra zero-extensions.

This is NFC for non-X86 targets.

Differential Revision: https://reviews.llvm.org/D124894
2022-05-10 10:00:10 -07:00
Lian Wang f14a1f26ad Revert "[RISCV][SelectionDAG] Support VECREDUCE_ADD mask operation"
This patch make CodeGen/test/AArch64/vecreduce-add-legalization.ll fail.

This reverts commit 17a8a1bb71.
2022-05-10 09:25:25 +00:00
Lian Wang 17a8a1bb71 [RISCV][SelectionDAG] Support VECREDUCE_ADD mask operation
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125206
2022-05-10 08:52:48 +00:00
Mircea Trofin c35ad9ee4f [mlgo] Support exposing more features than those supported by models
This allows the compiler to support more features than those supported by a
model. The only requirement (development mode only) is that the new
features must be appended at the end of the list of features requested
from the model. The support is transparent to compiler code: for
unsupported features, we provide a valid buffer to copy their values;
it's just that this buffer is disconnected from the model, so insofar
as the model is concerned (AOT or development mode), these features don't
exist. The buffers are allocated at setup - meaning, at steady state,
there is no extra allocation (maintaining the current invariant). These
buffers has 2 roles: one, keep the compiler code simple. Second, allow
logging their values in development mode. The latter allows retraining
a model supporting the larger feature set starting from traces produced
with the old model.

For release mode (AOT-ed models), this decouples compiler evolution from
model evolution, which we want in scenarios where the toolchain is
frequently rebuilt and redeployed: we can first deploy the new features,
and continue working with the older model, until a new model is made
available, which can then be picked up the next time the compiler is built.

Differential Revision: https://reviews.llvm.org/D124565
2022-05-09 18:01:21 -07:00
David Green 2cfb243bcd [DAG] Use isAnyConstantBuildVector. NFC
As suggested from 02f8519502, this uses the
isAnyConstantBuildVector method in lieu of separate
isBuildVectorOfConstantSDNodes calls. It should
otherwise be an NFC.
2022-05-09 14:13:03 +01:00
David Green 02f8519502 [DAG] Prevent infinite loop combining bitcast shuffle
This prevents an infinite loop from D123801, where code trying to reduce
the total number of bitcasts, but also handling constants, could create
the opposite transform. Prevent the transform in these case to let the
bitcast of a constant transform naturally.

Fixes #55345
2022-05-09 09:36:22 +01:00
Simon Pilgrim 800d36cf32 [DAG] Only perform the fold (A-B)+(C-D) --> (A+C)-(B+D) when both inner subs have one use
Fixes #51381
2022-05-08 13:51:58 +01:00
Craig Topper b81bf7bb2f [LegalizeTypes] Make use of SelectionDAG::getShiftAmountConstant. NFC
Instead of calling getShiftAmountTy and getConstant separately.
2022-05-07 12:16:53 -07:00
Craig Topper 00bfaba997 [LegalizeTypes] Don't assume fshl/fshr shift amount type matches the other operands.
Like other shifts, the type isn't required to match. We shouldn't
assume we can call ZExtPromotedInteger.

I tested the PromoteIntOp_FunnelShift locally by removing the promotion
of the shift amount from PromoteIntRes_FunnelShift. But with the final
version of this patch it is never executed on any tests.

Differential Revision: https://reviews.llvm.org/D125106
2022-05-07 11:44:07 -07:00
Amaury Séchet 06fad8bc05 [DAGCombine] Add node in the worklist in topological order in CombineTo
This is part of an ongoing effort toward making DAGCombine process the nodes in topological order.

This is able to discover a couple of new optimizations, but also causes a couple of regression. I nevertheless chose to submit this patch for review as to start the discussion with people working on the backend so we can find a good way forward.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124743
2022-05-07 16:24:31 +00:00
Paul Walker 702c4ade22 [ISD::IndexType] Helper functions for common queries.
Add helper functions to query the signed and scaled properties
of ISD::IndexType along with functions to change them.

Remove setIndexType from MaskedGatherSDNode because it only has
one usage and typically should only be changed alongside its
index operand.

Minimise the direct use of the enum values to lay the groundwork
for more refactoring.

Differential Revision: https://reviews.llvm.org/D123347
2022-05-07 11:23:42 +01:00
David Green 5930691ee1 Revert "[DAGCombine] Make combineShuffleOfBitcast LittleEndian specific"
This reverts commit 891c3cf99e as it turns
out that the error was not caused by this commit, the error caming
from D124526 instead.
2022-05-06 21:03:22 +01:00
David Green 891c3cf99e [DAGCombine] Make combineShuffleOfBitcast LittleEndian specific
Something is going wrong with the BigEndian PowerPC bot. It is hard to
tell what is wrong from here, but attempt to fix it by disabling the
combineShuffleOfBitcast combine for bigendian.
2022-05-06 18:42:44 +01:00
Craig Topper 76f90a9d71 [SelectionDAG] Clear promoted bits before UREM on shift amount in PromoteIntRes_FunnelShift.
Otherwise we have garbage in the upper bits that can affect the
results of the UREM.

Fixes PR55296.

Differential Revision: https://reviews.llvm.org/D125076
2022-05-06 09:26:30 -07:00
Simon Pilgrim c0bebc12f0 [DAG] visitREM - merge buildOptimizedSREM into if(). NFCI. 2022-05-06 15:39:17 +01:00
David Green 115c188807 [DAG][PowerPC] Combine shuffle(bitcast(X), Mask) to bitcast(shuffle(X, Mask'))
If the mask is made up of elements that form a mask in the higher type
we can convert shuffle(bitcast into the bitcast type, simplifying the
instruction sequence. A v4i32 2,3,0,1 for example can be treated as a
1,0 v2i64 shuffle. This helps clean up some of the AArch64 concat load
combines, along with helping simplify a number of other tests.

The PowerPC combine for v16i8 splat vector loads needed some fixes to
keep it working for v16i8 vectors. This improves the handling of v2i64
shuffles to match too, hopefully improving them in general.

Differential Revision: https://reviews.llvm.org/D123801
2022-05-06 10:50:31 +01:00
Lian Wang fb0d636f28 [RISCV][SelectionDAG] Support VP_REDUCE_ADD mask operation.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124986
2022-05-06 01:49:21 +00:00
Craig Topper 5140e0d219 [SelectionDAGISel] Add back a comment to MergeInputChains handling. NFC
This comment used to exist, but was lost in a refactor over 10 years
ago, but still seems relevant and improves readability.
2022-05-05 12:59:21 -07:00
Craig Topper 084f967370 [SelectionDAG] Constant fold (sext_inreg undef, VT) to 0 instead of undef.
The result of sign_extend_inreg needs to have as many sign bits
as requested by the VT argument. The easiest way to guarantee this
is to fold it to 0.

SystemZ test was modified to avoid using undef.

Fixes https://github.com/llvm/llvm-project/issues/55178

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124696
2022-05-05 09:45:35 -07:00
Craig Topper 4e2d1a6c18 [DAGCombiner] Fold (sext/zext undef) -> 0 and aext(undef) -> undef.
Differential Revision: https://reviews.llvm.org/D124988
2022-05-05 09:34:18 -07:00
Craig Topper fd13192aa5 [DAGCombiner] Fold (max/min X, X) -> X.
Differential Revision: https://reviews.llvm.org/D124951
2022-05-05 09:34:17 -07:00
Brian Tracy 87a55137e2 Fix "the the" typo in documentation and user facing strings
There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users.

Reviewed By: #libc, Mordante, martong, ldionne

Differential Revision: https://reviews.llvm.org/D124708
2022-05-05 17:52:08 +02:00
Thomas Preud'homme 68dee83923 [MachinePipeliner] Fix unscheduled instruction
Prior to ordering instructions to be scheduled, the machine pipeliner
update recurrence node sets in groupRemainingNodes() by adding in a
given node set any node on the dependency path from a node set with
higher priority to the given node set. The function computePath() that
determine what constitutes a path follows artificial dependencies.

However, when ordering the nodes in the resulting node sets,
computeNodeOrder() calls ignoreDependence when looking at dependencies
which ignores artificial dependencies. This can cause a node not to be
scheduled which then causes wrong code generation and in the case of a
debug build will lead to an assert failure in generatePhis() in
ModuloScheduler.cpp.

This commit adds calls to ignoreDependence() in computePath() to not add
any node in groupRemainingNodes() that would not be ordered by
computeNodeOrder().

Reviewed By: sgundapa

Differential Revision: https://reviews.llvm.org/D124267
2022-05-05 16:01:41 +01:00
Xing Xue e5926906eb [XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction-sections
Summary:
When -ffunction-sections is on, this patch makes the compiler to generate unique LSDA and EH info sections for functions on AIX by appending the function name to the section name as a suffix. This will allow the AIX linker to garbage-collect unused function.

Reviewed by: MaskRay, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D124855
2022-05-05 09:01:36 -04:00
Jay Foad 9ebbe25034 RegAllocGreedy: Common up part of the priority calculation. NFC. 2022-05-05 10:35:33 +01:00
Nikita Popov 9678936f18 [DAGCombine] Fold (X & ~Y) | Y with truncated not
This extends the (X & ~Y) | Y to X | Y fold to also work if ~Y is
a truncated not (when taking into account the mask X). This is
done by exporting the infrastructure added in D124856 and reusing
it here.

I've retained the old value of AllowUndefs=false, though probably
this can be switched to true with extra test coverage.

Differential Revision: https://reviews.llvm.org/D124930
2022-05-05 11:10:11 +02:00
Craig Topper 572dfef1db [SelectionDAG] Use llvm::any_of to simplify a loop. NFC 2022-05-04 19:09:06 -07:00
Nikita Popov 451bc723ae [SDAG] Handle truncated not in haveNoCommonBitsSet()
Demanded bits analysis may replace a full-width not with a
any_extend (not (truncate X)) pattern. This patch looks through
this kind of pattern in haveNoCommonBitsSet(). Of course, we can
only do this if we only need negated bits in the non-extended part,
as the other bits may now be arbitrary. For example, if we have
haveNoCommonBitsSet(~X & Y, X) then ~X only needs to actually
negate bits set in Y.

This is only a partial solution to the problem in that it allows
add -> or conversion, but the resulting or doesn't get folded yet.
(I guess that will involve exposing getBitwiseNotOperand() as a
more general helper and using that in the relevant transform.)

Differential Revision: https://reviews.llvm.org/D124856
2022-05-04 15:30:44 +02:00
serge-sans-paille 7030654296 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since fa5a4e1b95 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D124847
2022-05-04 08:32:38 +02:00
Luo, Yuanke 764676b737 [fastregalloc] Fix bug when undef value is tied to def.
If the tied use is undef value, fastregalloc should free the def
register. There is no reload needed for the undef value.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D124834
2022-05-04 12:12:55 +08:00
Jon Roelofs e1c808b36e Fix zero-width bitfield extracts to emit 0
Fixes #55129
2022-05-03 14:46:42 -07:00
Simon Pilgrim faa35fc873 [DAG] Fix issue with rot(rot(x,c1),c2) -> rot(x,c1+c2) fold with unnormalized rotation amounts
Don't assume the rotation amounts have been correctly normalized - do it as part of the constant folding.

Also, the normalization should be performed with UREM not SREM.
2022-05-03 17:16:26 +01:00
Nikita Popov 2171a896ed [SDAG] Handle A and B&~A in haveNoCommonBitsSet()
This is the DAG variant of D124763. The code already handles the
general pattern, but not this degenerate case.

This allows folding A + (B&~A) to A | (B&~A) which further holds
to A | B.

Handling on the SDAG level is needed because in the motivating
case the add is actually a getelementptr, which only gets converted
into an add on the SDAG level. However, this patch is not quite
sufficient to handle the getelementptr case yet, because of an
interfering demanded bits simplification.

Differential Revision: https://reviews.llvm.org/D124772
2022-05-03 15:47:02 +02:00
Nikita Popov e0892614b1 [SDAG] Extract commutative helper from haveNoCommonBitsSet() (NFC)
To make it easier to add additional patterns, which will generally
want to handle commuted top-level operands.
2022-05-03 12:28:35 +02:00
Jeremy Morse 1d712c3818 [DebugInfo][InstrRef] Don't generate redundant DBG_PHIs
In SelectionDAG, DBG_PHI instructions are created to "read" physreg values
and give them an instruction number, when they can't be traced back to a
defining instruction. The most common scenario if arguments to a function.
Unfortunately, if you have 100 inlined methods, each of which has the same
"this" pointer, then the 100 dbg.value instructions become 100
DBG_INSTR_REFs plus 100 DBG_PHIs, where only one DBG_PHI would suffice.

This patch adds a vreg cache for MachienFunction::salvageCopySSA, if we've
already traced a value back to the start of a block and created a DBG_PHI
then it allows us to re-use the DBG_PHI, as well as reducing work.

Differential Revision: https://reviews.llvm.org/D124517
2022-05-03 09:56:12 +01:00
David Green 6f81903e89 [LV][SLP] Mark fptosi_sat as vectorizable
This adds fptosi_sat and fptoui_sat to the list of trivially
vectorizable functions, mainly so that the loop vectorizer can vectorize
the instruction. Marking them as trivially vectorizable also allows them
to be SLP vectorized, and Scalarized.

The signature of a fptosi_sat requires two type overrides
(@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only
take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd
to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first
operand of the intrinsic as a overloaded (but not scalar) operand.

Differential Revision: https://reviews.llvm.org/D124358
2022-05-03 09:32:34 +01:00
Hsiangkai Wang eaaa31ff2c [RISCV][TargetLowering] Special case overflow expansion for (uaddo X, C).
Follow-up to D122933.

Differential Revision: https://reviews.llvm.org/D124374
2022-05-03 03:51:36 +00:00
Craig Topper 5f057eaa0d [DAGCombiner] reassociationCanBreakAddressingModePattern should check uses of the outer add.
When looking for memory uses,
reassociationCanBreakAddressingModePattern should check uses of
the outer ADD rather than the inner ADD. We want to know if the
two ops we're reassociating are used by a load/store.

In practice, the existing check usually works because CodeGenPrepare
will make one of the load/stores have an offset of 0 relative to
split GEP. That will make the inner add have a memory use.

To test this, I've manually split the GEPs so there is no 0 offset
store.

This issue was recently discussed in the original review D60294.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D124644
2022-05-02 16:38:53 -07:00
Sanjay Patel 747c6a0c73 [SDAG] fix miscompile when casting int->FP->int
This is the codegen equivalent of D124692.

As shown in https://github.com/llvm/llvm-project/issues/55150 -
the existing fold may be wrong when converting to a signed value.
This is a quick fix to avoid the miscompile.
https://alive2.llvm.org/ce/z/KtaDmd

Differential Revision: https://reviews.llvm.org/D124771
2022-05-02 14:57:27 -04:00
Simon Pilgrim ae8b10e543 [DAG] (style) Break apart if-else chain as they all return 2022-05-01 17:56:59 +01:00
Paul Walker f10a8f6752 [LegalizeDAG] Fix TypeSize conversion error when expanding SIGN_EXTEND_INREG
SIGN_EXTEND_INREG expansion can trigger a TypeSize error because
"VT.getSizeInBits() == 1" is used to detect for a boolean without
first verifying VT is a scalar.
2022-04-30 19:21:48 +01:00
Craig Topper 6affe87bda [DAGCombiner] When matching a disguised rotate by constant don't forget to apply LHSMask/RHSMask.
We try to match as a disguised rotate by constant of these forms
(shl (X | Y), C1) | (srl X, C2) --> (rotl X, C1) | (shl Y, C1)
(shl X, C1) | (srl (X | Y), C2) --> (rotl X, C1) | (srl Y, C2)

We may have also looked through an AND to find the shift. If we
did, we need to apply a mask to the result.

I'll add an AArch64 test and pre-commit it and the RISC-V test
tomorrow.

Fixes PR55201.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124711
2022-04-30 11:02:30 -07:00
David Penry dcb77643e3 Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM
Fixed "private field is not used" warning when compiled
with clang.

original commit: 28d09bbbc3
reverted in: fa49021c68

------

This patch permits Swing Modulo Scheduling for ARM targets
turns it on by default for the Cortex-M7.  The t2Bcc
instruction is recognized as a loop-ending branch.

MachinePipeliner is extended by adding support for
"unpipelineable" instructions.  These instructions are
those which contribute to the loop exit test; in the SMS
papers they are removed before creating the dependence graph
and then inserted into the final schedule of the kernel and
prologues. Support for these instructions was not previously
necessary because current targets supporting SMS have only
supported it for hardware loop branches, which have no
loop-exit-contributing instructions in the loop body.

The current structure of the MachinePipeliner makes it difficult
to remove/exclude these instructions from the dependence graph.
Therefore, this patch leaves them in the graph, but adds a
"normalization" method which moves them in the schedule to
stage 0, which causes them to appear properly in kernel and
prologues.

It was also necessary to be more careful about boundary nodes
when iterating across successors in the dependence graph because
the loop exit branch is now a non-artificial successor to
instructions in the graph. In additional, schedules with physical
use/def pairs in the same cycle should be treated as creating an
invalid schedule because the scheduling logic doesn't respect
physical register dependence once scheduled to the same cycle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D122672
2022-04-29 10:54:39 -07:00
Paul Walker 23c509754d [DAGCombiner] Stop invalid sign conversion in refineIndexType.
When looking through extends of gather/scatter indices it's safe
to convert a known positive signed index to unsigned, but unsigned
indices must remain unsigned.

Depends On D123318

Differential Revision: https://reviews.llvm.org/D123326
2022-04-29 14:20:13 +01:00
Nikita Popov 027c728f29 [SelectionDAGBuilder] Don't create MGATHER/MSCATTER with Scale != ElemSize
This is an alternative to D124530. In getUniformBase() only create
scales that match the gather/scatter element size. If targets also
support other scales, then they can produce those scales in target
DAG combines. This is what X86 already does (as long as the
resulting scale would be 1, 2, 4 or 8).

This essentially restores the pre-opaque-pointer state of things.

Fixes https://github.com/llvm/llvm-project/issues/55021.

Differential Revision: https://reviews.llvm.org/D124605
2022-04-29 14:57:53 +02:00
Paul Walker 7a0b897e86 [DAGCombiner][SVE] Ensure MGATHER/MSCATTER addressing mode combines preserve index scaling
refineUniformBase and selectGatherScatterAddrMode both attempt the
transformation:

  base(0) + index(A+splat(B)) => base(B) + index(A)

However, this is only safe when index is not implicitly scaled.

Differential Revision: https://reviews.llvm.org/D123222
2022-04-29 12:35:16 +01:00
Serge Pavlov 9fc58f1820 [PowerPC] Support of ppc_fp128 in lowering of llvm.is_fpclass
PowerPC supports `ppc_fp128`, which is not an IEEE floating point
type. The generic lowering of llvm.is_fpclass could not handle it
properly. This change extends the generic lowering code to
support `ppc_fp128`.

The change was tested on emulator using runtime tests from
https://reviews.llvm.org/D112933 and the patch for clang
https://reviews.llvm.org/D112932.

Differential Revision: https://reviews.llvm.org/D113908
2022-04-29 11:10:47 +07:00
Zequan Wu 4fe2ab5279 Revert "[DebugInfo][InstrRef] Describe value sizes when spilt to stack"
This reverts commit a15b66e76d.

This causes linker to crash at assertion: `Assertion failed: !Expr->isComplex(), file C:\b\s\w\ir\cache\builder\src\third_party\llvm\llvm\lib\CodeGen\LiveDebugValues\InstrRefBasedImpl.cpp, line 907`.
2022-04-28 16:18:16 -07:00
David Penry fa49021c68 Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM"
This reverts commit 28d09bbbc3
while I investigate a buildbot failure.
2022-04-28 13:29:27 -07:00
David Penry 28d09bbbc3 [CodeGen][ARM] Enable Swing Module Scheduling for ARM
This patch permits Swing Modulo Scheduling for ARM targets
turns it on by default for the Cortex-M7.  The t2Bcc
instruction is recognized as a loop-ending branch.

MachinePipeliner is extended by adding support for
"unpipelineable" instructions.  These instructions are
those which contribute to the loop exit test; in the SMS
papers they are removed before creating the dependence graph
and then inserted into the final schedule of the kernel and
prologues. Support for these instructions was not previously
necessary because current targets supporting SMS have only
supported it for hardware loop branches, which have no
loop-exit-contributing instructions in the loop body.

The current structure of the MachinePipeliner makes it difficult
to remove/exclude these instructions from the dependence graph.
Therefore, this patch leaves them in the graph, but adds a
"normalization" method which moves them in the schedule to
stage 0, which causes them to appear properly in kernel and
prologues.

It was also necessary to be more careful about boundary nodes
when iterating across successors in the dependence graph because
the loop exit branch is now a non-artificial successor to
instructions in the graph. In additional, schedules with physical
use/def pairs in the same cycle should be treated as creating an
invalid schedule because the scheduling logic doesn't respect
physical register dependence once scheduled to the same cycle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D122672
2022-04-28 13:01:18 -07:00
Alexey Bataev 75e1cf4a6a [COST]Improve cost model for shuffles in SLP.
Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486
2022-04-28 10:04:41 -07:00
Bjorn Pettersson 3a39bb96ca [SelectionDAG] Use correct boolean representation in FoldConstantArithmetic
The description of SETCC says
  /// SetCC operator - This evaluates to a true value iff the condition is
  /// true.  If the result value type is not i1 then the high bits conform
  /// to getBooleanContents.

Without this patch, we sign extended the i1 to the used larger type
regardless of getBooleanContents. This resulted in miscompiles, as
shown in the attached testcase that ended up returning -1 instead of
1 when using -mattr=+v.

Fixes https://github.com/llvm/llvm-project/issues/55168

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124618
2022-04-28 18:42:16 +02:00
Alexey Bataev 9861ca0c23 Revert "[COST]Improve cost model for shuffles in SLP."
This reverts commit 29a470e380 to fix
a crash reported in https://reviews.llvm.org/D100486#3479989.
2022-04-28 08:11:56 -07:00
Matt Arsenault 7762a3ce18 Revert "BranchFolder: Assert on SSA functions"
This reverts commit 6ff91d17d6.
2022-04-27 19:02:15 -04:00
Matt Arsenault 6ff91d17d6 BranchFolder: Assert on SSA functions
We probably should have the opposite of getRequiredProperties for this
2022-04-27 18:51:37 -04:00
Bill Wendling 8f2ec974d1 [X86] Move target-generic code into CodeGen [NFC]
This code is the same for all platforms.

Differential Revision: https://reviews.llvm.org/D124566
2022-04-27 15:37:28 -07:00
Matt Arsenault 7c2db66632 llvm-reduce: Support multiple MachineFunctions
The current testcase I'm trying to reduce only reproduces with IPRA
enabled and requires handling multiple functions.

The only real difference vs. the IR is the extra indirect to look for
the underlying MachineFunction, so treat the ReduceWorkItem as the
module instead of the function.

The ugliest piece of this is really the ugliness of
MachineModuleInfo. It not only tracks actual module state, but has a
number of transient fields used for isel and/or the asm printer. These
shouldn't do any harm for the use here, though they should be
separated out.
2022-04-27 18:11:59 -04:00
Alexey Bataev 29a470e380 [COST]Improve cost model for shuffles in SLP.
Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486
2022-04-27 10:56:26 -07:00
Denis Antrushin 4059770af5 [StatepointLowering] Only export STATEPOINT results if used in nonlocal blocks.
Cuurently we always export STATEPOINT results (GC pointers lowered via VRegs)
to virtual registers. When processing gc.relocate instructions we have to
generate CopyFromRegs node and then export it to VReg again if gc.relocate
is used in other basic blocks. This results in generation of extra COPY MIR
instruction if statepoint and its gc.relocate are in the same BB, but gc.relocate
result is used in other blocks.

This patch changes this behavior to export statepoint results only if used
in other basic blocks. For local uses StatepointLoweringState.(get|set)Location()
API is used to communicate appropriate statepoint result from `LowerStatepoint()`
to `visitGCRelocate()`

This is NFC and is purely compile time optimization. On big methids it can improve
codegen compile time up to 10%.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124444
2022-04-27 15:53:24 +03:00
Jeremy Morse a15b66e76d [DebugInfo][InstrRef] Describe value sizes when spilt to stack
InstrRefBasedLDV can track and describe variable values that are spilt to
the stack -- however it does not current describe the size of the value on
the stack. This can cause uninitialized bytes to be read from the stack if
a small register is spilt for a larger variable, or theoretically on
big-endian machines if a large value on the stack is used for a small
variable.

Fix this by using DW_OP_deref_size to specify the amount of data to load
from the stack, if there's any possibility for ambiguity. There are a few
scenarios where this can be omitted (such as when using DW_OP_piece and a
non-DW_OP_stack_value location), see deref-spills-with-size.mir for an
explicit table of inputs flavours and output expressions.

Differential Revision: https://reviews.llvm.org/D123599
2022-04-27 09:54:50 +01:00
Andrew Savonichev 0a27622a1d [NVPTX] Disable DWARF .file directory for PTX
Default behavior for .file directory was changed in D105856, but
ptxas (CUDA 11.5 release) refuses to parse it:

    $ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll
    $ ptxas debug-file-loc.s
    ptxas debug-file-loc.s, line 42; fatal   : Parsing error near
    '"foo.h"': syntax error

Added a new field to MCAsmInfo to control default value of
UseDwarfDirectory. This value is used if -dwarf-directory command line
option is not specified.

Differential Revision: https://reviews.llvm.org/D121299
2022-04-26 21:40:36 +03:00
Jeremy Morse 65d5beca13 Reapply D124184, [DebugInfo][InstrRef] Add a size operand to DBG_PHI
This was reverted twice, in 987cd7c3ed and 13815e8cbf. The latter
stemed from not accounting for rare register classes in a pre-allocated
array, and the former from an array not being completely initialized,
leading to asan complaining.
2022-04-26 15:49:22 +01:00
Serge Pavlov 170a903144 Intrinsic for checking floating point class
This change introduces a new intrinsic, `llvm.is.fpclass`, which checks
if the provided floating-point number belongs to any of the the specified
value classes. The intrinsic implements the checks made by C standard
library functions `isnan`, `isinf`, `isfinite`, `isnormal`, `issubnormal`,
`issignaling` and corresponding IEEE-754 operations.

The primary motivation for this intrinsic is the support of strict FP
mode. In this mode using compare instructions or other FP operations is
not possible, because if the value is a signaling NaN, floating-point
exception `Invalid` is raised, but the aforementioned functions must
never raise exceptions.

Currently there are two solutions for this problem, both are
implemented partially. One of them is using integer operations to
implement the check. It was implemented in https://reviews.llvm.org/D95948
for `isnan`. It solves the problem of exceptions, but offers one
solution for all targets, although some can do the check in more
efficient way.

The other, implemented in https://reviews.llvm.org/D96568, introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects a target
specific code into IR to implement `isnan` and some other functions. It is
convenient for targets that have dedicated instruction to determine FP data
class. However using target-specific intrinsic complicates analysis and can
prevent some optimizations.

A special intrinsic for value class checks allows representing data class
tests with enough flexibility. During IR transformations it represents the
check in target-independent way and saves it from undesired transformations.
In the instruction selector it allows efficient lowering depending on the
used target and mode.

This implementation is an extended variant of `llvm.isnan` introduced
in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic
support. Target-specific treatment will be implemented in separate
patches.

Differential Revision: https://reviews.llvm.org/D112025
2022-04-26 13:09:16 +07:00
Lian Wang 9980148305 [RISCV][SelectionDAG] Support VP_ADD/VP_MUL/VP_SUB mask operations
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124144
2022-04-26 02:30:22 +00:00
Jeremy Morse 987cd7c3ed Revert "Reapply D124184, [DebugInfo][InstrRef] Add a size operand to DBG_PHI"
This reverts commit 5db9250231.

Further to the early revert, the sanitizers have found something wrong with
this.
2022-04-25 23:30:15 +01:00
Matt Arsenault 7714e03175 RegAllocGreedy: Allow last chance recolor to retry overlapping tuples
Last chance recoloring didn't try recoloring a done register with the
same class since it believed there was no point. This doesn't
necessarily apply if the members in that class overlap. Allow the
recoloring to proceed if the assigned interfering physical register
overlaps with the candidate register.

This avoids an allocation failure with overlapping tuples. This
testcase could be handled better, and I don't believe should reach
last chance recoloring. The failure only manifests with the mutually
unsatisfiable register hints to overlapping tuples. The earlier
assignment decisions probably should have figured out that using these
hints was a bad idea.
2022-04-25 17:07:17 -04:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Jeremy Morse 5db9250231 Reapply D124184, [DebugInfo][InstrRef] Add a size operand to DBG_PHI
This was applied in fda4305e53, reverted in 13815e8cbf, the problem
was that fp80 X86 registers that were spilt to the stack aren't expected by
LiveDebugValues. It pre-allocates a position number for all register sizes
that can be spilt, and 80 bits isn't exactly common.

The solution is to scan the register classes to find any unrecognised
register sizes, adn pre-allocate those position numbers, avoiding a later
assertion.
2022-04-25 15:50:15 +01:00
Jeremy Morse 13815e8cbf Revert "[DebugInfo][InstrRef] Add a size operand to DBG_PHI"
This reverts commit fda4305e53.

Green dragon has spotted a problem -- it's understood, but might be fiddly
to fix, reverting in the meantime.
2022-04-25 14:06:12 +01:00
Jeremy Morse fda4305e53 [DebugInfo][InstrRef] Add a size operand to DBG_PHI
DBG_PHI instructions can refer to stack slots, to indicate that multiple
values merge together on control flow joins in that slot. This is fine --
however the slot might be merged at a later date with a slot of a different
size. In doing so, we lose information about the size the eliminated PHI.
Later analysis passes have to guess.

Improve this by attaching an optional "bit size" operand to DBG_PHI, which
only gets added for stack slots, to let us know how large a size the value
on the stack is.

Differential Revision: https://reviews.llvm.org/D124184
2022-04-25 13:41:34 +01:00
Matt Arsenault 6fa1d12b3c ProcessImplicitDefs: Use required properties instead of isSSA assert 2022-04-22 18:28:45 -04:00
Simon Pilgrim 34e7243464 [DAG] Fold freeze(bitcast(x)) -> bitcast(freeze(x))
This is a very specific fold to fix an upstream poor codegen issue.

InstCombine has the much more flexible pushFreezeToPreventPoisonFromPropagating but I don't think we're quite there with DAG/TLI handling for canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison value tracking yet.

Fixes #54911

Differential Revision: https://reviews.llvm.org/D124185
2022-04-22 16:39:25 +01:00
Matt Arsenault 9c122537cd MIR: Serialize FunctionContextIdx in MachineFrameInfo 2022-04-22 11:07:41 -04:00
Matt Arsenault 40bc9112c0 GlobalISel: Relax handling of G_ASSERT_* with source register classes
The most common situation where G_ASSERT_ZEXT appears for AMDGPU is a
copy from a physical register, which happens to use set the actual
register class on the virtual register. After copy coalescing, the
assert's source operand had a vreg with a set class. The verifier was
strictly rejecting cases where the set class/bank weren't an exact
match. Additionally, RegBankSelect was also expecting a register bank
to be set on the register, not a class.

This is much stricter than regular copies so relax this behavior. This
now allows these 2 cases:

1. Source register has either class or bank, and the result does not
2. Source register has a register class, and the result is a register
with a matching bank.

This should avoid needing some kind of special handling to avoid
violating this constraint when folding copies.
2022-04-22 10:49:50 -04:00
Vitaly Buka 9be90748f1 Revert "[asan] Emit .size directive for global object size before redzone"
Revert "[docs] Fix underline"

Breaks a lot of asan tests in google.

This reverts commit 365c3e85bc.
This reverts commit 78a784bea4.
2022-04-21 16:21:17 -07:00
Alex Brachet 78a784bea4 [asan] Emit .size directive for global object size before redzone
This emits an `st_size` that represents the actual useable size of an object before the redzone is added.

Reviewed By: vitalybuka, MaskRay, hctim

Differential Revision: https://reviews.llvm.org/D123010
2022-04-21 20:46:38 +00:00
Paul Kirth 61e36e87df [safestack] Support safestack in stack size diagnostics
Current stack size diagnostics ignore the size of the unsafe stack.
This patch attaches the size of the static portion of the unsafe stack
to the function as metadata, which can be used by the backend to emit
diagnostics regarding stack usage.

Reviewed By: phosek, mcgrathr

Differential Revision: https://reviews.llvm.org/D119996
2022-04-20 18:29:40 +00:00
Alexey Bataev 2cca53c815 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 09:37:16 -07:00
Matt Arsenault 9209a51918 MachineModuleInfo: Move AddrLabelSymbols to AsmPrinter
This was tracking global state only used by the AsmPrinter, which can
store its own module global state.
2022-04-20 11:21:40 -04:00
Matt Arsenault 3659780d58 MachineModuleInfo: Remove UsesMorestackAddr
This is x86 specific, and adds statefulness to
MachineModuleInfo. Instead of explicitly tracking this, infer if we
need to declare the symbol based on the reference previously inserted.

This produces a small change in the output due to the move from
AsmPrinter::doFinalization to X86's emitEndOfAsmFile. This will now be
moved relative to other end of file fields, which I'm assuming doesn't
matter (e.g. the __morestack_addr declaration is now after the
.note.GNU-split-stack part)

This also produces another small change in code if the module happened
to define/declare __morestack_addr, but I assume that's invalid and
doesn't really matter.
2022-04-20 11:10:20 -04:00
Matt Arsenault d7938b1a81 MachineModuleInfo: Move HasSplitStack handling to AsmPrinter
This is used to emit one field in doFinalization for the module. We
can accumulate this when emitting all individual functions directly in
the AsmPrinter, rather than accumulating additional state in
MachineModuleInfo.

Move the special case behavior predicate into MachineFrameInfo to
share it. This now promotes it to generic behavior. I'm assuming this
is fine because no other target implements adjustForSegmentedStacks,
or has tests using the split-stack attribute.
2022-04-20 10:54:29 -04:00
Alexey Bataev 5f7ac15912 Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer."
This reverts commit 2f49163b33 to fix
a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284
2022-04-20 06:35:55 -07:00
Matt Arsenault 26d575eb08 LocalStackSlotAllocation: Combine debug printing statements 2022-04-20 09:31:14 -04:00
Matt Arsenault 4575f35ea1 LocalStackSlotAllocation: Stop creating unused virtual register 2022-04-20 09:31:14 -04:00
Alexey Bataev 2f49163b33 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 05:32:56 -07:00
Matt Arsenault 9592e88f59 MachineModuleInfo: Don't allow dynamically setting DbgInfoAvailable
This can be set up front, and used only as a cache. This avoids a
field that looks like it requires MIR serialization.

I believe this fixes 2 bugs for CodeView. First, this addresses a
FIXME that the flag -diable-debug-info-print only works with
DWARF. Second, it fixes emitting debug info with emissionKind NoDebug.
2022-04-19 21:08:37 -04:00
Matt Arsenault 8591328e15 Intrinsics: Mark llvm.eh.sjlj.callsite argument as immarg
The assert in SelectionDAG implies that it is
2022-04-19 21:04:33 -04:00
Matt Arsenault 507259820a GlobalISel: Add LegalizeMutations to help use More/FewerElements 2022-04-19 21:04:32 -04:00
Vitaly Buka 33c5d8f939 [msan] Disable assert with msan
The assert uses data from just destroyed BasicBlock.
2022-04-19 16:42:05 -07:00
chenglin.bi 222adf338a [Arch64][SelectionDAG] Add target-specific implementation of srem
1. X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first.
2. Add AArch64 faster path for SREM only pow2 case.

Fix https://github.com/llvm/llvm-project/issues/54649

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122968
2022-04-19 02:49:42 +08:00
chenglin.bi acfc025a72 Revert "[Arch64][SelectionDAG] Add target-specific implementation of srem"
This reverts commit 9d9eddd3dd.
2022-04-18 10:35:09 +08:00
chenglin.bi 9d9eddd3dd [Arch64][SelectionDAG] Add target-specific implementation of srem
X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first. Add AArch64 faster path for SREM only pow2 case.

Fix https://github.com/llvm/llvm-project/issues/54649

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122968
2022-04-16 12:29:11 +08:00
Matt Arsenault b8033de063 MIR: Serialize a few bool function fields 2022-04-15 20:31:07 -04:00
Craig Topper c6dc229a6d [DAGCombiner] Move call to hasOneUse after opcode checks. NFC
Checking the opcode is cheap, counting the number of uses is not.
2022-04-15 17:02:16 -07:00
Craig Topper a7b9d75e7a [DAGCombiner] Move or/xor/and opcode check in ReduceLoadOpStoreWidth before hasOneUse check.
hasOneUse is not cheap on nodes with chain results that might have
many uses. By checking the opcode first, we can avoid a costly walk
of the use list on nodes we aren't interested in.

Found by investigating calls to hasNUsesOfValue from the example
provided in D123857.
2022-04-15 16:38:27 -07:00
Chih-Ping Chen eab6e94f91 [DebugInfo] Add a TargetFuncName field in DISubprogram for
specifying DW_AT_trampoline as a string. Also update the signature
of DIBuilder::createFunction to reflect this addition.

Differential Revision: https://reviews.llvm.org/D123697
2022-04-15 16:38:23 -04:00
Johannes Doerfert 3f7a6ce0de [DWARF][FIX] Handle the use of multiple registers gracefully
Certain applications crashed for us with the AMDGPU backend. While this
is not a proper fix it allows us to compile the code for now. I left a
TODO for someone that understands DWARF.

Differential Revision: https://reviews.llvm.org/D123717
2022-04-15 13:43:50 -05:00
Clement Courbet 46a13a0ef8 [ExpandMemCmp] Properly expand `bcmp` to an equality pattern.
Before that change, constant-size `bcmp` would miss an opportunity to generate
a more efficient equality pattern and would generate a -1/0-1 pattern
instead.

Differential Revision: https://reviews.llvm.org/D123849
2022-04-15 11:26:24 +02:00
Matt Arsenault 9196f5dab7 MachineCSE: Report this requires SSA 2022-04-14 20:21:21 -04:00
John Brawn 12c1022679 [AArch64] Lowering and legalization of strict FP16
For strict FP16 to work correctly needs some changes in lowering and
legalization:
 * SelectionDAGLegalize::PromoteNode was missing handling for some
   strict fp opcodes.
 * Some of the custom lowering of strict fp operations needed to be
   adjusted to work with FP16.
 * Custom lowering needed to be added for round-to-int operations.

With this, and the previous patches for the rest of the strict fp
isel, we can set IsStrictFPEnabled = true.

Differential Revision: https://reviews.llvm.org/D115620
2022-04-14 16:51:22 +01:00
Joseph Huber 11f47b791f [OpenMP] Make offloading sections have the SHF_EXCLUDE flag
Offloading sections can be embedded in the host during codegen via a
section. This section was originally marked as metadata to prevent it
from being loaded, but these sections are completely unused at runtime
so the linker should automatically drop them from the final executable
or shard library. This flag adds support for the SHF_EXCLUDE flag in
target lowering and uses it.

Reviewed By: JonChesterfield, MaskRay

Differential Revision: https://reviews.llvm.org/D122987
2022-04-14 10:50:49 -04:00
Paul Walker 0c44115e51 [SVE] Add support for non-element-type sized scaling when lowering MGATHER/MSCATTER.
The lowering code did not use the scale operand of MGATHER/MSCATTER
nodes, but instead assumed scaled indices were always scaled based
on the element type of the memory type. This patch adds the missing
support by rewritting the nodes as unscaled variants.

Differential Revision: https://reviews.llvm.org/D123670
2022-04-14 11:54:46 +01:00
Matt Arsenault 1732242bee RegAlloc: Fix remaining virtual registers after allocation failure
This testcase fails register allocation, but at the failure point
there were also new split virtual registers. Previously this was
assigning the failing register and not enqueueing the newly created
split virtual registers. These would then never be allocated and
assert in VirtRegRewriter.
2022-04-13 16:25:30 -04:00
Matt Arsenault 681b9466c9 RegAllocGreedy: Remove redundant check for virtual registers
The set of interfering virtual registers obviously only includes
virtual registers.
2022-04-13 15:00:18 -04:00
serge-sans-paille fa5a4e1b95 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since a96638e50e detected a few
regressions, fixing them.
2022-04-13 20:53:19 +02:00
Simon Pilgrim fef221bf1f [DAG] Enable SimplifyVBinOp folds on add/sub sat intrinsics 2022-04-13 12:53:23 +01:00
Jonas Paulsson 46f83caebc [InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220
2022-04-13 12:50:21 +02:00
Simon Pilgrim cfb3ee2185 [DAG] Add non-uniform vector support to (shl (srl x, c1), c2) -> (and (shift x, c3))
Another part of D77804 yak shaving

Differential Revision: https://reviews.llvm.org/D123523
2022-04-13 11:37:33 +01:00
Matt Arsenault d4b1be20f6 RegAllocGreedy: Fix illegal eviction assert for urgent evictions
The condition in canEvictInterferenceBasedOnCost is slightly different
from the assertion in evictInteference.
canEvictInterferenceBasedOnCost uses a <= check for the cascade number
for legality, but the assert was checking for <. For equal cascade
numbers for an urgent eviction, canEvictInterferenceBasedOnCost could
return success. The actual eviction would then hit this assert. Avoid
ever returning true for equivalent cascade numbers.

The resulting failed allocation seems a bit off to me. e.g. in
illegal-eviction-assert.mir, I wuold assume %0 gets allocated starting
at $vgpr0. That was its initial allocation choice, but was later
evicted. In this example no evictions can help improve anything.
2022-04-12 19:16:56 -04:00
Matt Arsenault eefed1dbf0 RegAllocGreedy: Roll back successful recolorings on failure
This is a replacement for the original fix attempted in
c46aab01c0.

This fixes "overlapping insert" assertion failures when trying to
unwind an unsuccessful recoloring attempt.

The problem would occur when there are multiple recoloring candidates
which recursively required recoloring. If one recoloring candidate was
successfully recolored at one level, and the next recoloring candidate
was unsuccessful, we would not roll back the first candidates
successful recoloring. The forgotten successful recoloring may have
been assigned to something that conflicts with a register that needs
to be restored in a parent recoloring attempt.

See the testcase added in issue48473 for a more concrete example with
explanation.
2022-04-12 19:02:48 -04:00
Matt Arsenault 3754f60112 GlobalISel: Implement MoreElements for select of vector conditions 2022-04-12 16:54:04 -04:00
Matt Arsenault 3f2cc7cc2b GlobalISel: Fix lowerSelect handling of boolean high bits
This was making several invalid assumptions about the incoming
select. First, it was assuming the incoming condition was either s1 or
already sign extended, not accounting for different boolean high bits
behavior between scalar and vector conditions. We only had a vector
boolean due to the intermediate step vector select, which is now
avoided.

Second, it was assuming it can use the result vector type as a boolean
mask. These types don't have anything to do with other, and only makes
sense in the context of the expansion to bit operations. Since these
logically are part of the same lowering, do the complete expansion in
a single step.

The added select_v4s1_s1 test does fail to legalize, since it seems
AArch64's vector legalization support is pretty incomplete.
2022-04-12 16:54:03 -04:00
Matt Arsenault 0e489926be GlobalISel: Handle widening addo/subo booleans
This will be tested in a future patch
2022-04-12 16:54:03 -04:00
Matt Arsenault 95c2bcbf8b GlobalISel: Handle widening umulo/smulo condition outputs 2022-04-12 16:54:03 -04:00
Matt Arsenault abe171df06 GlobalISel: Update mutationIsSane assert for scalable vectors 2022-04-12 16:54:03 -04:00
Shao-Ce SUN e90110e696 [NFC][CodeGen] Use ArrayRef in TargetLowering functions
This patch is similar to D122557, adding an `ArrayRef` version for `setOperationAction`, `setLoadExtAction`, `setCondCodeAction`, `setLibcallName`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123467
2022-04-13 00:46:05 +08:00
Simon Pilgrim bc32a1dd76 [DAG] Add non-uniform vector support to (shl (sr[la] exact X, C1), C2) folds 2022-04-12 12:57:56 +01:00
Craig Topper 35be4a7af3 [SelectionDAG] Remove unecessary null check after call to getNode. NFC
As far as I know getNode will never return a null SDValue.

I'm guessing this was modeled after the FoldConstantArithmetic
call earlier.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D123550
2022-04-11 18:03:44 -07:00
Matt Arsenault 5a5034d508 GlobalISel: Verify atomic load/store ordering restriction
Reject acquire stores and release loads. This matches the restriction
imposed by the LLParser and IR verifier.
2022-04-11 20:12:22 -04:00
Matt Arsenault d1f97a3419 GlobalISel: Add memSizeNotByteSizePow2 legality helper
This is really a replacement for memSizeInBytesNotPow2 that actually
does what most every target wants. In particular, since s1 rounds to 1
byte, it wasn't lowered by this predicate. This results in targets
needing to think harder and add more matchers to catch all the
degenerate cases.

Also small bug fix that prevented the correct insertion of
G_ASSERT_ZEXT in the AArch64 use case.
2022-04-11 19:43:37 -04:00
Matt Arsenault 1416744f84 GlobalISel: Implement computeKnownBits for overflow bool results 2022-04-11 19:43:37 -04:00
Craig Topper 2ce2562876 [RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64.
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.

This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122951
2022-04-11 14:38:39 -07:00
Craig Topper 28cb508195 [TargetLowering][RISCV] Allow truncation when checking if the arguments of a setcc are splats.
We're just trying to canonicalize here and won't be using the constant
value returned.

The attached test changes are because we were previously commuting
a seteq X, (splat_vector 0) because we also have (sub 0, X). The
0 is larger than the element type so we don't detect it as a splat
without the AllowTruncation flag. By preventing the commute we are
able to match it to the vmseq.vx instruction during isel. We only
look for constants on the RHS in isel.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D123256
2022-04-11 09:49:36 -07:00
Momchil Velikov b4ad28da19 [CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CGA
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints taht LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

  * there is a single basic block, containing the function prologue

  * possibly multiple epilogue blocks, where each epilogue block is
    complete and self-contained, i.e. CSR restore instructions (and the
    corresponding CFI instructions are not split across two or more
    blocks.

  * prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

  - "has a call frame", if the function has executed the prologue, or
     has not executed any epilogue

  - "does not have a call frame", if the function has not executed the
    prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
    initial one.  This is done by a target specific hook and is
    expected to be trivial to implement, for example it could be:
```
     .cfi_def_cfa <sp>, 0
     .cfi_same_value <rN>
     .cfi_same_value <rN-1>
     ...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
    created by the function prologue. These are the sequence:
```
       .cfi_restore_state
       .cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545
2022-04-11 13:27:26 +01:00
Sanjay Patel 2ed15984b4 [SDAG] try to reduce compare of funnel shift equal 0
fshl (or X, Y), X, C ==/!= 0 --> or (shl Y, C), X ==/!= 0
fshl X, (or X, Y), C ==/!= 0 --> or (srl Y, BW-C), X ==/!= 0

This is similar to an existing setcc-of-rotate fold, but the
matching requires more checks for the more general funnel op:
https://alive2.llvm.org/ce/z/Ab2jDd

We are effectively decomposing the funnel shift into logical
shifts, reassociating, and removing a shift.

This should get us the final improvements for x86-64 that were
originally shown in D111530
( https://github.com/llvm/llvm-project/issues/49541 );
x86-32 still shows some SHLD/SHRD, so the pattern is not
matching there yet.

Differential Revision: https://reviews.llvm.org/D122919
2022-04-11 07:44:58 -04:00
Tim Northover 6c85668d28 Tail calls: look through AssertZExt to find register copy.
arm64_32 guarantees the high 32 bits of pointer parameters are passed as 0, and
this is modelled in the IR by inserting an AssertZExt after the CopyFromReg.
The function deciding whether registers that need to be preserved actually are
wasn't expecting this so it banned perfectly legitimate tail calls.
2022-04-11 12:24:47 +01:00
Matt Arsenault 9fdd25848a Transforms: Fix code duplication between LowerAtomic and AtomicExpand 2022-04-08 19:06:36 -04:00
Fraser Cormack 18106b99f0 [VP] Explicitly map from VP intrinsic to ISD opcode
This patch aims to overcome an issue in these mappings where, when an ISD
node was registered with BEGIN_REGISTER_VP_SDNODE but outwidth the scope
of a pair of BEGIN_REGISTER_VP_INTRINSIC/END_REGISTER_VP_INTRINSIC
macros, the switch cases fell apart. This in particular happened with
VP_SETCC, where we'd end up with something along the lines of:

  case Intrinsic::vp_fcmp:
    break;
  case Intrinsic::vp_icmp:
    break;
    ResOpc = ISD::VP_SETCC;
  case Intrinsic::vp_store:
    ...

To remedy this, we introduce a special-purpose mapping macro which can
map any number of VP intrinsic opcodes to an ISD opcode.

As a result, we no longer need to special-case the mapping from vp.icmp
and vp.fcmp to VP_SETCC, as the new helper macro does it for us.

Thanks to @craig.topper for noticing this and to @rogfer01 for the idea.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123324
2022-04-08 12:30:22 +01:00
Nikita Popov a5a272a491 [SafeStack] Don't create SCEV min between pointer and integer (PR54784)
Rather than rewriting the alloca pointer to zero, use
removePointerBase() to drop the base pointer. This will simply bail
if the base pointer is not the alloca. We could try doing something
more fancy here (like dropping the sources not based on the alloca
on the premise that they aren't SafeStack-relevant), but I don't
think that's worthwhile.

Fixes https://github.com/llvm/llvm-project/issues/54784.

Differential Revision: https://reviews.llvm.org/D123309
2022-04-08 09:44:00 +02:00
Chih-Ping Chen c226a5c4d7 [DebugInfo] Use DW_ATE_signed encoding when creating a Fortran
array index type.
2022-04-07 07:00:56 -04:00
Fraser Cormack 8216255c9f [RISCV][VP] Add basic RVV codegen for vp.fcmp
This patch adds the necessary infrastructure to lower vp.fcmp via
ISD::VP_SETCC to RVV instructions.

Most notably this patch adds cond-code legalization for VP_SETCC,
reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in
additional SDValue parameters for the Mask and EVL. This method then
uses VP operations to legalize the condcode.

There is still a general lack of canonicalization on VP_SETCC as opposed
to SETCC which results in worse code than is theoretically possible.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123051
2022-04-07 09:16:07 +01:00
Matt Arsenault 7f14a1d46b AtomicExpand: Add NotAtomic lowering strategy
Currently LowerAtomics exists as a separate pass which blindly
replaces all atomics. Add a new lowering strategy option to eliminate
the atomics which the target can control on a per-instruction level.
2022-04-06 22:34:35 -04:00
Matt Arsenault c4ea925f50 AtomicExpand: Change return type for shouldExpandAtomicStoreInIR
Use the same enum as the other atomic instructions for consistency, in
preparation for addition of another strategy.

Introduce a new "Expand" option, since the store expansion does not
use cmpxchg. Alternatively, the existing CmpXChg strategy could be
renamed to Expand.
2022-04-06 22:34:04 -04:00
Craig Topper bdb1ab9804 [LegalizeTypes][VP] Use LoVT/HiVT when splitting VP operations in SplitVecRes_UnaryOp.
The VP path was using the split source VTs instead of the split
destination VTs. This may not be a problem today because the VP
nodes going through this have the same source and dest VTs.
It will be a problem when we start using this function for legalizing
VP cast operations.
2022-04-06 10:51:49 -07:00
Daniil Kovalev 62a983ebc5 Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if"
This reverts commit 83a798d4b0.

As discussed in D120714 with @thakis, the patch added unneeded complexity
without noticeable benefits.
2022-04-06 20:32:53 +03:00
Craig Topper 8fc19185e3 [LegalizeTypes] Move SplitVecRes_VECTOR_REVERSE/VECTOR_SPLICE near other SplitVecRes methods. NFC
This file is divided into sections for different legalization actions.
We should keep similar methods together.
2022-04-06 10:29:32 -07:00
Craig Topper 1ad36487e9 [LegalizeDAG] Use SelectionDAG::getBoolConstant to simplify some code. NFC 2022-04-06 10:08:11 -07:00
Craig Topper 5b5f59428c [DAGCombiner] Replace call getSExtOrTrunc with a truncate. NFC
The extend case should never occur. The sign extend would be an
arbitrary choice, remove it to avoid confusion.
2022-04-06 09:59:45 -07:00
Paul Walker 7d3af9ef0f [DAGCombine] insert_subvector undef, (splat X), N2 -> splat X
Differential Revision: https://reviews.llvm.org/D120328
2022-04-06 17:15:38 +01:00
Fraser Cormack 6be5e875be [RISCV][VP] Add basic RVV codegen for vp.icmp
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.

Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on which has not hereto been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.

In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.

Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.

Portions of this code were taken from the VP reference patches.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122743
2022-04-06 16:51:22 +01:00