Commit Graph

33072 Commits

Author SHA1 Message Date
Yuanfang Chen 24c6ea917c [JMCInstrument] rename ELF section name from ".just.my.code" to ".data.just.my.code"
This gives linker scripts a hint about where to place the section.
2022-10-19 10:49:54 -07:00
Simon Pilgrim 9708d88017 Revert rG42230efccf8fe1185be5fa6c23dce0a8183d6ec9 "[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1"
@foad was right - this isn't actually going to help with D136042 as much as hoped, we need a better AMDGPU-specific solution as other targets are likely to make use of it
2022-10-19 12:07:41 +01:00
Simon Pilgrim 42230efccf [DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1
Helps with some of the AMDGPU regressions identified in D136042 where we were losing signed BFE patterns after sinking shifts behind logic ops.

Differential Revision: https://reviews.llvm.org/D136081
2022-10-19 11:18:49 +01:00
Koakuma d3fcbee10d [SPARC] Make calls to function with big return values work
Implement CanLowerReturn and associated CallingConv changes for SPARC/SPARC64.

In particular, for SPARC64 there's new `RetCC_Sparc64_*` functions that handles the return case of the calling convention.
It uses the same analysis as `CC_Sparc64_*` family of funtions, but fails if the return value doesn't fit into the return registers.

This makes calls to functions with big return values converted to an sret function as expected, instead of crashing LLVM.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D132465
2022-10-18 00:01:55 +00:00
Craig Topper 30305d7948 [TargetLowering][RISCV][Sparc] Don't emit zero check in CTTZTableLookup for CTTZ_ZERO_UNDEF.
The code incorrectly checked for CTLZ_ZERO_UNDEF instead of
CTTZ_ZERO_UNDEF.

While I was there I flipped the condition into an early out.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D136010
2022-10-17 10:15:39 -07:00
Kazu Hirata ef9956f434 [IR] Rename FuncletPadInst::getNumArgOperands to arg_size (NFC)
This patch renames FuncletPadInst::getNumArgOperands to arg_size for
consistency with CallBase, where getNumArgOperands was removed in
favor of arg_size in commit 3e1c787b31

Differential Revision: https://reviews.llvm.org/D136048
2022-10-17 10:15:10 -07:00
Simon Pilgrim 8e77458578 [DAG] visitShiftByConstant - replace constant detection with FoldConstantArithmetic
Instead of checking that an operand is constant/opaque before calling getNode() and then checking that the result is a constant, just use FoldConstantArithmetic which will just early-out if the operands are not constant foldable.
2022-10-17 16:19:10 +01:00
Simon Pilgrim af5942cc09 Remove trailing whitespace. NFC. 2022-10-17 15:20:26 +01:00
Peter Rong c2e7c9cb33 [CodeGen] Using ZExt for extractelement indices.
In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`.
This is because IRTranslator uses SExt for indices.

In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt.
This change includes both documentation, SelectionDAG and IRTranslator.
We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86

This patch fixes issue #57452.

Differential Revision: https://reviews.llvm.org/D132978
2022-10-15 15:45:35 -07:00
Filipp Zhinkin ef774bec63 [AArch64] Support SETCCCARRY lowering
Support SETCCCARRY lowering to SBCS instruction.

Related issue: https://github.com/llvm/llvm-project/issues/44629

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D135302
2022-10-14 22:29:31 +03:00
chenglin.bi c1909d7337 [DAGCombiner] Fix crash for the merge stores with different value type
The crash case comes from #58350. It have two stores, one store is type f32 and the other is v1f32.
When we try to merge these two stores on v1f32, the memVT is vector type so the old code will use ISD::EXTRACT_SUBVECTOR for type f32 also then compiler crash.
So this patch insert a build_vector for f32 store to generate v1f32 also when memVT is v1f32.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135954
2022-10-15 01:16:35 +08:00
Nicola Lancellotti ce1a2ccf94 [NFC] Fix typo in DAGCombiner 2022-10-14 17:47:25 +01:00
Sander de Smalen 02df03c5b7 [AArch64][SME] Add support for arm_locally_streaming functions.
Functions with `aarch64_sme_pstatesm_body` will emit a SMSTART at the start
of the function, and a SMSTOP at the end of the function, such that all
operations use the right value for vscale.

Because the placement of these nodes is critically important (i.e. no
vscale-dependent operations should be done before SMSTART has been issued),
we require glueing the CopyFromReg to the Entry node such that we can
insert the SMSTART as part of that glued chain.

More details about the SME attributes and design can be found
in D131562.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D131582
2022-10-14 13:47:53 +00:00
Matt Arsenault d0750ec475 AtomicExpand: Avoid some operations if the atomic is overaligned
Let some of the pointer bithacking fold away if we know the LSB are 0.
2022-10-13 23:31:00 -07:00
Anshil Gandhi d383adec4d [BranchRelaxation] Fall through only if block has no unconditional branches
Prior to inserting an unconditional branch from X to its
fall through basic block, check if X has any terminators to
avoid inserting additional branches.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134557
2022-10-13 22:48:41 -06:00
Matt Arsenault c427ee9798 AsmPrinter: Remove pointless code in inline asm emission
This was scanning through def operands looking for the
symbol operand. This is pointless because the symbol is always
the first operand as enforced by the verifier, and all operands
are implicit.
2022-10-13 21:12:11 -07:00
Xiang1 Zhang aad013de41 [InlineAsm][bugfix] Correct function addressing in inline asm
In Linux PIC model, there are 4 cases about value/label addressing:
Case 1: Function call or Label jmp inside the module.
Case 2: Data access (such as global variable, static variable) inside the module.
Case 3: Function call or Label jmp outside the module.
Case 4: Data access (such as global variable) outside the module.

Due to current llvm inline asm architecture designed to not "recognize" the asm
code, there are quite troubles for us to treat mem addressing differently for
same value/adress used in different instuctions.
For example, in pic model, call a func may in plt way or direclty pc-related,
but lea/mov a function adress may use got.

This patch fix/refine the case 1 and case 2 in inline asm.
Due to currently inline asm didn't support jmp the outsider lable, this patch
mainly focus on fix the function call addressing bugs in inline asm.

Reviewed By: Pengfei, RKSimon

Differential Revision: https://reviews.llvm.org/D133914
2022-10-14 09:47:26 +08:00
David Green 16e4e4ab87 [CodeGenPrep] Handle constants in ConvertPhiType
This is a simple addition to the convertPhiTypes in CodeGenPrepare to
consider and convert constants as it converts the phi type. Someone
fixed the bug in the motivating example, so the undef is now a constant
0. This does mean converting between integer and floating point
constants, which may have different materialization.

Differential Revision: https://reviews.llvm.org/D135561
2022-10-13 16:41:44 +01:00
Anton Sidorenko 4431e705cc [NFC] Use forward decl of MachineCombinerPattern enum to reduce dependencies
Differential Revision: https://reviews.llvm.org/D135776
2022-10-13 14:56:14 +01:00
Simon Tatham 526ce9c929 Propagate tied operands when copying a MachineInstr.
MachineInstr's copy constructor works by calling the addOperand method
to add each operand of the old MachineInstr to the new one, one by
one. But addOperand deliberately avoids trying to replicate ties
between operands, on the grounds that the tie refers to operands by
index, and the indices aren't necessarily finalized yet.

This led to a code generation fault when the machine pipeliner cloned
an Arm conditional instruction, and lost the tie between the output
register and the input value to be used when the condition failed to
execute.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D135434
2022-10-13 09:40:35 +01:00
Mirko Brkusanin 8b8463ef6c [SelectionDAG] Use consistent type sizes for opcode 2022-10-12 17:33:04 +02:00
Craig Topper ac9209751a Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))"
This reverts commit 0148df8157.

Getting a lit test failures on AMDGPU but I can't reproduce it so far.
Reverting to investigate.
2022-10-11 16:30:40 -07:00
Craig Topper 0148df8157 [DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))
(sra X, BW-1) is either 0 or -1. So the multiply is a conditional
negate of Y.

This pattern shows up when type legalizing wide multiplies involving
a sign extended value.

Fixes PR57549.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D133399
2022-10-11 16:20:55 -07:00
Jessica Paquette 0f1a51e173 [GlobalISel] Allow vectors in redundant or + add combines
We support KnownBits for vectors, so we can enable these.

https://godbolt.org/z/r9a9W4Gj1

Differential Revision: https://reviews.llvm.org/D135719
2022-10-11 15:31:09 -07:00
Jessica Paquette 036a13065b [GlobalISel] Combine (X op Y) == X --> Y == 0
This matches patterns of the form

```
(X op Y) == X
```

And transforms them to

```
Y == 0
```

where appropriate.

Example: https://godbolt.org/z/hfW811c7W

Differential Revision: https://reviews.llvm.org/D135380
2022-10-11 09:52:48 -07:00
Philip Reames 487695e7c9 [SDAG] Treat DemandedElts argument to isSplatVector as splat for scalable vectors [nfc]
The previous code used a APInt(1, 0) to represent the demanded elts of a scalable vector, and then ignored that argument if type was scalable.  This was inconsistent with the UndefElts parameter which is set to either APInt(1, 0) or APInt(1,1) - that is, implicitly broadcast across all lanes.  Particularly since the undef code relied on the DemandedElts parameter having bitwidth 1 to achieve that result!

This change switches the demanded parameter to APInt(1,1), documents the broadcast semantics, and takes advantage of it to remove one special case for scalable vectors which is no longer required.
2022-10-11 09:49:28 -07:00
Philip Reames ac4f3fff8c [SDAG] Clarify behavior of scalable demanded/undef elts in isSplatValue [nfc]
Update comment, and add an assertion to check property expected by sole (non-test) caller.  Remove tests which appear to have been copied from fixed vector tests, and whose demanded bits don't correspond to the way this interface is otherwise used.
2022-10-11 07:28:34 -07:00
Craig Topper 0121b1a4ac Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant."
This reverts commit d4facda414.

This has been reported to cause failures. Reverting while I investigate.
2022-10-10 14:53:29 -07:00
Craig Topper d4facda414 [TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant.
If the divisor is even, we can first shift the dividend and divisor
right by the number of trailing zeros. Now the divisor is odd and we
can do the original algorithm to calculate a remainder. Then we shift
that remainder left by the number of trailing zeros and add the bits
that were shifted out of the dividend.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135541
2022-10-10 11:02:22 -07:00
wanglei 730ee6568c [LoongArch] Set correct encodings for DWARF exception handling
This patch sets correct encodings for DWARF exception handling for
LoongArch.

Differential Revision: https://reviews.llvm.org/D134710
2022-10-08 11:53:48 +08:00
Craig Topper 9f67047cf0 [VP][RISCV] Add vp.smax/smin/umax/umin intrinsics
Differential Revision: https://reviews.llvm.org/D135418
2022-10-07 17:14:31 -07:00
Amaury Séchet 62ea6c5be7
[DAGCombine] Deduplicate addcarry node using commutativity.
The first two parameters of addcarry are commutative. We may face a situation where both variant are present in the DAG, in which case we benefit from using just one.

Depends on D57302 and D33587

Reviewed By: RKSimon, chfast

Differential Revision: https://reviews.llvm.org/D57317
2022-10-08 00:55:14 +02:00
eopXD dbc681c98e [VP][RISCV] Add vp.roundtozero and its RISC-V support
The scalar instruction of this is `llvm.trunc`. However the naming of
ISD::VP_TRUNC is already taken by `trunc` of the LLVM IR. Naming this as
`vp.ftrunc` would likely cause confusion with `vp.fptrunc`. So adding
`vp.roundtozero` that will look similar to `vp.roundeven`.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D135233
2022-10-07 02:15:23 -07:00
Pierre van Houtryve 36c3833783 [GISel] Add Trunc/Lshr/BuildVector Folding
Similar to the current "Trunc/BuildVector" folding - which folds low element extracts of BuildVectors, folds hi element extracts done using bitshifts.

For D134354

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135148
2022-10-07 08:44:03 +00:00
Pierre van Houtryve a34977c4d0 [GISel] Handle G_TRUNC in `matchExtractVecEltBuildVec`
Spotted some cases in D134354 where this was an issue.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135147
2022-10-07 08:37:18 +00:00
Mike Hommey d3b0e745e8 [CodeView] Avoid NULL deref of Scope
Regression from D131400: cross-language LTO causes a crash in the
compiler on the NULL deref of Scope in `isa` call when Rust IR is
involved. Presumably, this might affect other languages too, and
even Rust itself without cross-language LTO when the Rust compiler
switched to LLVM 16.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D134616
2022-10-07 08:34:57 +02:00
Philip Reames 04bb32e58a [DAG] Extract helper for (neg x) [nfc]
This is a frequently reoccurring pattern, let's factor it out.

Differential Revision: https://reviews.llvm.org/D135301
2022-10-06 13:23:52 -07:00
Pierre van Houtryve 3ec0085c3f [DAG] Update `isKnownNeverNaN` for `FMA/FMAD`
We can still get a NaN even if none of the operands are NaN,
e.g. from +inf/-inf. D50804 didn't catch that.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134854
2022-10-06 06:52:36 +00:00
Ellis Hoag 549773f9e9 [Dwarf] Reference the correct CU when inlining
Sometimes when a function is inlined into a different CU, `llvm-dwarfdump --verify` would find an inlined subroutine with an invalid abstract origin. This is because `DwarfUnit::addDIEEntry()` will incorrectly assume the inlined subroutine and the abstract origin are from the same CU if it can't find the CU for the inlined subroutine.

In the added test, the inlined subroutine for `bar()` is created before the CU for `B.swift` is created, so it tries to point to `goo()` in the wrong CU. Interestingly, if we swap the order of the two functions then we don't see a crash since the module for `goo()` is created first.

The fix is to give a parent DIE to `ScopeDIE` before calling `addDIEEntry()` so that its CU can be found. Luckily, `constructInlinedScopeDIE()` is only called once so we can pass it the DIE of the scope's parent and give it a child just after it's created.

`constructInlinedScopeDIE()` should always return a DIE, so assert that it is not null.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D135114
2022-10-05 09:19:12 -07:00
Fraser Cormack 08497a785b [VP] Fix unused variable in release configurations 2022-10-05 10:33:07 +01:00
Fraser Cormack a3a9b0743e [VP][NFC] Remove \brief commands from doxygen comments
Following a precedent set in D46861.
2022-10-05 08:08:30 +01:00
Fraser Cormack 3362e2d57f [VP] Add IR expansion for vp.icmp and vp.fcmp
These intrinsics are simply expanded to regular icmp/fcmp instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121594
2022-10-05 08:07:39 +01:00
Serguei Katkov d330731f94 [RegAllocFast] Clean-up. Remove redundant operations. NFC.
Reviewed By: MatzeB, arsenm
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D109213
2022-10-05 11:38:54 +07:00
Amara Emerson c5cebf78bd [GlobalISel] Add computeNumSignBits() support for compares.
Doing so allows G_SEXT_INREG to be combined away for many vector cases.

Differential Revision: https://reviews.llvm.org/D135168
2022-10-05 00:28:08 +01:00
Amara Emerson 8055aa8e8a [AArch64][GlobalISel] Make vector G_SEXT_INREG legal and allow combining.
As a result of making these legal, and tweaking the combine to allow vectors,
we generate vector G_SEXT_INREG during legalization.

The reason we want to make these legal in the first place is to allow for
more combine opportunities. Once those have been done, we can just lower them
back to shifts in the post-legalizer lowering.

This needs to be one commit otherwise we start causing tests to fail due to
incomplete support for selection etc.
2022-10-05 00:28:08 +01:00
jeff cebec42089 [DAGCombiner] [AMDGPU] Allow vector loads in MatchLoadCombine
Since SROA chooses promotion based on reaching load / stores of allocas, we may run into scenarios in which we alloca a vector, but promote it to an integer. The result of which is the familiar LoadCombine pattern (i.e. ZEXT, SHL, OR). However, instead of coming directly from distinct loads, the elements to be combined are coming from ExtractVectorElements which stem from a shared load.

This patch identifies such a pattern and combines it into a load.

Change-Id: I0bc06588f11e88a0a975cde1fd71e9143e6c42dd
2022-10-04 12:16:00 -07:00
Sanjay Patel 17dcbd8165 [SDAG] don't hoist div/rem through a select with neutral constant
This bug was introduced with D134966.
2022-10-04 13:15:01 -04:00
Jay Foad af947d9fcb [ISel] Fix crash in new FMA DAG combine
Fix a crash in the FMA combine added by D132837 and amended by D134810.
In cases where the newly created node could be folded, the combiner
would fail this assertion:

llc: DAGCombiner.cpp:268: void (anonymous namespace)::DAGCombiner::AddToWorklist(llvm::SDNode *): Assertion `N->getOpcode() != ISD::DELETED_NODE && "Deleted Node added to Worklist"' failed.

Differential Revision: https://reviews.llvm.org/D135150
2022-10-04 15:13:18 +01:00
Amara Emerson 07ccf651b9 x[AArch64][GlobalISel] Enable vector support for G_SELECT->G_FMAXIMUM/MINIMUM.
Vector support seems to work immediately, as long as we run the combine before
legalization (so the vector SELECTs don't get lowered) and the legalizer rules
are there to enable generation.

Differential Revision: https://reviews.llvm.org/D135047
2022-10-03 21:39:52 +01:00
Philip Reames a200b0fc25 [DAG] Introduce getSplat utility for common dispatch pattern [nfc]
We have a very common pattern of dispatching between BUILD_VECTOR and SPLAT_VECTOR creation repeated in many cases in code.  Common the pattern into a utility function.
2022-10-03 12:49:39 -07:00