Commit Graph

33283 Commits

Author SHA1 Message Date
Dmitry Vyukov a1255dc467 Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-11-29 17:37:36 +01:00
Philip Reames fc0efb7e78 [SDAG] Allow scalable vectors in ComputeNumSignBits (try 2)
I had reverted this before the holiday week because a problem was reported with a related change (D137140 - scalable vector known bits in DAG).  I had initially confused the two patches, and then decided to leave this reverted out an abundance of caution.  Now that we're through the holiday week, reapplying.

I also roled in fixes for several post commit review comments that hadn't landed with the original change.

Original commit message

This is a continuation of the series of patches adding lane wise support for scalable vectors in various knownbit-esq routines.

The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations.

Differential Revision: https://reviews.llvm.org/D137141
2022-11-29 08:25:05 -08:00
Mateja Marjanovic 68057c2b8d Add new vector types for LLVM
Add v9i32, v9f32, v10i32, v10f32, v11i32, v11f32, v12i32 and v12f32.

Differential Revision: https://reviews.llvm.org/D138136
2022-11-29 17:02:04 +01:00
Simon Pilgrim 30eff7f29f [DAG] Attempt to replace a mul node with an existing umul_lohi/smul_lohi node (PR59217)
As discussed on Issue #59217, under certain circumstances the DAG can generate duplicate MUL and MUL_LOHI nodes, often during MULO legalization.

This patch attempts to replace MUL nodes with additional uses of the LO result from the MUL_LOHI node

Differential Revision: https://reviews.llvm.org/D138790
2022-11-29 12:51:30 +00:00
Janek van Oirschot 322966f8f8 [AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp
class support and introduce GlobalISel implementation for AMDGPU

Uses existing SelectionDAG lowering of the llvm.amdgcn.class intrinsic
for llvm.is.fpclass
2022-11-28 16:00:36 -05:00
Guillaume Chatelet f5dd9dda63 Remove support for 10.4 Tiger from AsmPrinter
I stumbled on this while trying to tighten Alignment in MCStreamer (D138705).
From the [wikipedia page](https://en.wikipedia.org/wiki/Mac_OS_X_Tiger), last release of MacOSX Tiger was released 15 years ago and is not supported anymore by Apple.

Relevant commit : 9f06f911d1 (diff-17b326b45ef392288420bed274616afa7df81b27576c96723b3c25f5198dc398)

Differential Revision: https://reviews.llvm.org/D138707
2022-11-28 08:31:49 +00:00
Kazu Hirata d6f0ab47a7 [CodeGen] Use std::optional in TargetPassConfig.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:12:11 -08:00
Kazu Hirata 5b839fc2d0 [CodeGen] Use std::optional in ShadowStackGCLowering.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:09:25 -08:00
Kazu Hirata a5ef7bb5c1 [SelectionDAG] Use std::optional in SelectionDAGISel.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:07:23 -08:00
Kazu Hirata 01e998e752 [SelectionDAG] Use std::optional in SelectionDAGBuilder.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:05:06 -08:00
Kazu Hirata d82f7fbfce [SelectionDAG] Use std::optional in FastISel.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:02:45 -08:00
Kazu Hirata dd698b7777 [SelectionDAG] Use std::optional in DAGCombiner.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:00:23 -08:00
Kazu Hirata d77ecb675b [CodeGen] Use std::optional in SafeStack.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:57:44 -08:00
Kazu Hirata 8a45032e5c [CodeGen] Use std::optional in MachineOperand.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:55:08 -08:00
Kazu Hirata 3ff6ed8103 [LiveDebugValues] Use std::optional in InstrRefBasedImpl.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:52:33 -08:00
Kazu Hirata 5076bdf6e9 [CodeGen] Use std::optional in IndirectBrExpandPass.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:50:12 -08:00
Kazu Hirata af0d385693 [GlobalISel] Use std::optional in Utils.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:47:46 -08:00
Kazu Hirata 3ccbfc34c0 [GlobalISel] Use std::optional in LegalizerHelper.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:44:54 -08:00
Kazu Hirata 4531b61208 [GlobalISel] Use std::optional in CombinerHelper.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:33:45 -08:00
Kazu Hirata 214646d6b5 [CodeGen] Use std::optional in ExpandMemCmp.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:29:56 -08:00
Kazu Hirata 000749d753 [CodeGen] Use std::optional in CodeGenPrepare.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:27:19 -08:00
Kazu Hirata 07ce3b8abd [CodeGen] Use std::optional in BasicBlockSections.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:24:38 -08:00
Kazu Hirata 644159e20b [AsmPrinter] Use std::optional::value_or (NFC) 2022-11-26 14:21:32 -08:00
Kazu Hirata 15bb5c9253 [AsmPrinter] Use std::optional in DwarfCompileUnit.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:16:31 -08:00
Kazu Hirata fb2f3b30b2 [AsmPrinter] Use std::optional in DbgEntityHistoryCalculator.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:13:47 -08:00
Kazu Hirata 6a9ef0dd4e [AsmPrinter] Use std::optional in AsmPrinter.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:11:17 -08:00
Benjamin Maxwell 79b5829a15 [TargetLowering][AArch64] Teach DemandedBits about SVE count intrinsics
This allows DemandedBits to see that the SVE count intrinsics (CNTB,
CNTH, CNTW, CNTD) sans multiplier will only ever produce small
positive integers. The maximum value you could get here is 256, which
is CNTB on a machine with a 2048bit vector size (the maximum for SVE).

Using this various redundant operations (zexts, sexts, ands, ors, etc)
can be eliminated.

Differential Revision: https://reviews.llvm.org/D138424
2022-11-25 10:15:14 +00:00
Phoebe Wang 2e5366ac2e [NFC] Change `dyn_cast` to `cast` to make sure no dereference on nullptr 2022-11-25 17:40:37 +08:00
Anton Sidorenko 8e3545a64e [Debugify] Accumulate the number of variables in debugify metadata
When a module contains more than one function, we should update debugify metadata
by increasing the number of variables in the function rather than overwritting it.

Previous revert issue is fixed: I forgot to strip all x86-related info from the
test.

Differential Revision: https://reviews.llvm.org/D136949
2022-11-25 10:53:55 +03:00
Anton Sidorenko 5e04d8b72e Revert "[Debugify] Accumulate the number of variables in debugify metadata"
This brokes some builds
This reverts commit a1bbe8a4e2.
2022-11-24 19:09:52 +03:00
Guillaume Chatelet 6c09ea3fdd [Alignment][NFC] Use Align in MCStreamer::emitValueToAlignment
Differential Revision: https://reviews.llvm.org/D138674
2022-11-24 16:09:44 +00:00
Anton Sidorenko a1bbe8a4e2 [Debugify] Accumulate the number of variables in debugify metadata
When a module contains more than one function, we should update debugify metadata
by increasing the number of variables in the function rather than overwritting it.

Differential Revision: https://reviews.llvm.org/D136949
2022-11-24 18:49:49 +03:00
Guillaume Chatelet 4f17734175 [Alignment][NFC] Use Align in MCStreamer::emitCodeAlignment
This patch makes code less readable but it will clean itself after all functions are converted.

Differential Revision: https://reviews.llvm.org/D138665
2022-11-24 14:51:46 +00:00
David Green ca78b56014 [SelectOpt] Don't treat LogicalAnd/LogicalOr as selects
A `select i1 %c, i1 true, i1 %d` is just an or and a `select i1 %c, i1 %d, i1 false`
is just an and. There are better treated as such in the logic of SelectOpt, allowing
the backend to optimize them to and/or directly.

Differential Revision: https://reviews.llvm.org/D138490
2022-11-24 14:29:57 +00:00
Manuel Brito f408635b26 [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]
Differential Revision: https://reviews.llvm.org/D138483
2022-11-24 08:42:28 +00:00
Kazu Hirata 34bcadc38c Use std::nullopt_t instead of NoneType (NFC)
This patch replaces those occurrences of NoneType that would trigger
an error if the definition of NoneType were missing in None.h.

To keep this patch focused, I am deliberately not replacing None with
std::nullopt in this patch or updating comments.  They will be
addressed in subsequent patches.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

Differential Revision: https://reviews.llvm.org/D138539
2022-11-23 14:16:04 -08:00
chenglin.bi cdb7b804f6 [DAGCombiner] fold or (xor x, y),? patterns
or (xor x, y), x --> or x, y
or (xor x, y), y --> or x, y
or (xor x, y), (and x, y) --> or x, y
or (xor x, y), (or x, y) --> or x, y

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D138401
2022-11-23 09:28:10 +08:00
James Y Knight 4af73d7ebb Refactor AsmPrinterHandler callbacks. NFCI.
The existing behaviors and callbacks were overlapping and had very
confusing semantics: beginBasicBlock/endBasicBlock were not always
called, beginFragment/endFragment seemed like they were meant to mean
the same thing, but were slightly different, etc. This resulted in
confusing semantics, virtual method overloads, and control flow.

Remove the above, and replace with new beginBasicBlockSection and
endBasicBlockSection callbacks. And document them.

These are always called before the first and after the last blocks in
a function, even when basic-block-sections are disabled.
2022-11-22 18:25:22 -05:00
Simon Pilgrim 629f17c516 [DAG] isGuaranteedNotToBeUndefOrPoison - handle FrameIndex/TargetFrameIndex
Fixes #58904
2022-11-22 18:16:15 +00:00
David Green 7d098988bc [SelectOptimize] Add some debug logging. NFC
This is some quick debug messages for the SelectOptimize pass, adding
some information for the costs that are measured from getInstructionCost
calls, and re-using the existing optimization remarks to print some
information about if transforms were performed or not.

Differential Revision: https://reviews.llvm.org/D138108
2022-11-22 13:47:56 +00:00
Nuno Lopes b50e1bd605 Revert "[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]"
This reverts commit f50423c1a4.
2022-11-22 12:41:22 +00:00
Manuel Brito f50423c1a4 [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]
Differential Revision: https://reviews.llvm.org/D138483
2022-11-22 11:40:25 +00:00
Han-Kuan Chen caa9f63022 [CodeGen] Refactor visitSCALAR_TO_VECTOR. NFC.
Differential Revision: https://reviews.llvm.org/D137688
2022-11-22 01:29:04 -08:00
Kazu Hirata 6ba4b62af8 Return None instead of Optional<T>() (NFC)
This patch replaces:

  return Optional<T>();

with:

  return None;

to make the migration from llvm::Optional to std::optional easier.
Specifically, I can deprecate None (in my source tree, that is) to
identify all the instances of None that should be replaced with
std::nullopt.

Note that "return None" far outnumbers "return Optional<T>();".  There
are more than 2000 instances of "return None" in our source tree.

All of the instances in this patch come from functions that return
Optional<T> except Archive::findSym and ASTNodeImporter::import, where
we return Expected<Optional<T>>.  Note that we can construct
Expected<Optional<T>> from any parameter convertible to Optional<T>,
which None certainly is.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

Differential Revision: https://reviews.llvm.org/D138464
2022-11-21 19:06:42 -08:00
Kazu Hirata 1f914944b6 Don't use Optional::getPointer (NFC)
Since std::optional does not offer getPointer(), this patch replaces
X.getPointer() with &*X to make the migration from llvm::Optional to
std::optional easier.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

Differential Revision: https://reviews.llvm.org/D138466
2022-11-21 19:03:40 -08:00
Phoebe Wang b39b76f2ef [X86] Allow no X87 on 32-bit
This patch is an alternative of D100091. It solved the problems in `f80` type lowering.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D137946
2022-11-22 10:47:47 +08:00
OCHyams 3115e6828c [Assignment Tracking][25/*] Replace sunk address uses in dbg.assign intrinsics
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D136255
2022-11-21 15:50:47 +00:00
chenglin.bi ac1b999e85 [DAGCombiner] fold or (and x, y), x --> x
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D138398
2022-11-21 22:11:12 +08:00
Anton Sidorenko fb47bb37e4 [MachineTraceMetrics] Pick the trace successor for an entry block
We generate erroneous trace for a basic block if it does not have at least one
predecessor when MinInstr strategy is used. Currently only this strategy is
implemented, so we always have a wrong trace for any entry block. This results in
wrong instructions heights calculation and also leads to wrong critical path.

The described behavior is demonstrated on a simple test. It shows that early
if-conv pass makes wrong decisions due to incorrectly calculated critical path
lenght.

Differential Revision: https://reviews.llvm.org/D138272
2022-11-21 12:56:40 +03:00
Kazu Hirata 7524db4d44 [llvm] Remove unused forward declarations (NFC) 2022-11-20 09:59:36 -08:00
Kazu Hirata 1fa870b1bd Use None consistently (NFC)
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.

In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.

Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:

  using NoneType = std::nullopt_t;
  inline constexpr std::nullopt_t None = std::nullopt;

to ease the migration from llvm::Optional to std::optional.

Differential Revision: https://reviews.llvm.org/D138376
2022-11-20 00:24:40 -08:00
Kazu Hirata 6ccf1d23d7 [SelectionDAG] Teach getRegistersForValue to return std::optional (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716/11
2022-11-19 15:00:19 -08:00
Alexandre Ganea 49e483d3d6 [CodeView] Replace GHASH hasher by BLAKE3
Previously, we used SHA-1 for hashing the CodeView type records.
SHA-1 in `GloballyHashedType::hashType()` is coming top in the profiles. By simply replacing with BLAKE3, the link time is reduced in our case from 15 sec to 13 sec. I am only using MSVC .OBJs in this case. As a reference, the resulting .PDB is approx 2.1GiB and .EXE is approx 250MiB.

Differential Revision: https://reviews.llvm.org/D137101
2022-11-19 15:17:42 -05:00
chenglin.bi fe07eeb825 [GlobalISel] Fix crash in applyShiftOfShiftedLogic caused by CSEMIRBuilder reuse instruction
If LogicNonShiftReg is the same to Shift1Base, and shift1 const is the same to MatchInfo.Shift2 const, CSEMIRBuilder will reuse the old shift1 when build shift2.
So, if we erase MatchInfo.Shift2 at the end, actually we remove old shift1. And it will cause crash later.

Solution for this issue is just erase it earlier to avoid the crash.

Fix #58423

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138187
2022-11-19 09:13:44 +08:00
Philip Reames 6705d94005 Revert "[SDAG] Allow scalable vectors in ComputeKnownBits"
This reverts commit bc0fea0d55.

There was a "timeout for a Halide Hexagon test" reported.  Revert until investigation complete.
2022-11-18 15:29:14 -08:00
Philip Reames 102f05bd34 Revert "[SDAG] Allow scalable vectors in ComputeNumSignBits" and follow up
This reverts commits 3fb08d14a6 and f8c63a7fbf.

There was a "timeout for a Halide Hexagon test" reported.  Revert until investigation complete.
2022-11-18 15:25:59 -08:00
Matt Arsenault 1fe1299a93 GlobalISel: Legalize strict_fsub
In the future should probably have a more convenient
way to switch between building strict and non-strict ops.
2022-11-18 15:21:41 -08:00
Philip Reames 3fb08d14a6 [SDAG] Address post commit review feedback from f8c63a7f
The major change is falling through to ComputeKnownBits when we don't have an implementation of ComputeNumSignBits due to conservatism over scalable vectors.  Right now, we're mostly conservative in the same cases, but this allows our results to improve when we change ComputeKnownBits without also needing to improve ComputeNumSignBits at the same time.
2022-11-18 12:30:10 -08:00
Philip Reames f8c63a7fbf [SDAG] Allow scalable vectors in ComputeNumSignBits
This is a continuation of the series of patches adding lane wise support for scalable vectors in various knownbit-esq routines.

The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations.

Differential Revision: https://reviews.llvm.org/D137141
2022-11-18 10:50:06 -08:00
Matt Arsenault 08ec15e44b AMDGPU/GlobalISel: Fix strictfp fmul 2022-11-18 08:53:49 -08:00
Philip Reames bc0fea0d55 [SDAG] Allow scalable vectors in ComputeKnownBits
his is the SelectionDAG equivalent of D136470, and is thus an alternate patch to D128159.

The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations.

This patch also includes an implementation for SPLAT_VECTOR as without it, the lane wise reasoning has no base case. The original patch which inspired this (D128159), also included STEP_VECTOR. I plan to do that as a separate patch.

Differential Revision: https://reviews.llvm.org/D137140
2022-11-18 07:40:32 -08:00
Alexander Timofeev 32bd75716c PEI should be able to use backward walk in replaceFrameIndicesBackward.
The backward register scavenger has correct register
liveness information. PEI should leverage the backward register scavenger.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137574
2022-11-18 15:57:34 +01:00
Phoebe Wang d558255650 [X86] Use lock add/sub for cases that we only care about the EFLAGS
This fixes #36373, #36905 and partial of #58685.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D137711
2022-11-18 21:43:47 +08:00
Benjamin Maxwell 34d88cf6cf [DAG] Allow folding AND of anyext masked_load with >1 user to zext version
This now allows folding an AND of a anyext masked_load to a
zext_masked_load even if the masked load has multiple users.  Doing is
eliminates some redundant ANDs/MOVs for certain AArch64 SVE code.

I'm not sure if there's any cases where doing this could negatively the
other users of the masked_load.  Looking at other optimizations of
masked loads, most don't apply if the load is used more than once, so it
doesn't look like this would interfere.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D137844
2022-11-18 10:38:09 +00:00
luxufan 18c5f3c35d [RegisterScavenger][RISCV] Don't search for FrameSetup instrs if we were searching from Non-FrameSetup instrs
Otherwise, the spill position may point to position where before
FrameSetup instructions. In which case, the spill instruction may store
to caller's frame since the stack pointer has not been adjustted.

Fixes https://github.com/llvm/llvm-project/issues/58286

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135693
2022-11-18 15:13:52 +08:00
Matt Arsenault fe5b9a6a11 AMDGPU/GlobalISel: Make strict fadd, fmul and fma legal 2022-11-17 20:50:04 -08:00
YingChi Long 7a715bf317
[VP] Add support for vp.inttoptr & vp.ptrtoint
Add vp.inttoptr & vp.ptrtoint support by lowering them into
vp.zext / vp.truncate with in SelectionDAGBuilder.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137169
2022-11-18 10:42:24 +08:00
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can return if a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.

Subsequent patch will start using an actual value of speed in
the load/store vectorizer to compare if a vectorized access going
to be not just fast, but not slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Philip Reames 4105794e66 [SDAG] Assert we don't see scalable VECTOR_SHUFFLES
It was pointed out in review of D137140 that this case should be impossible.  This patch converts an existing bailout into an assert instead.
2022-11-17 08:18:51 -08:00
Alex Richardson 754d25844a [CGP] Update MemIntrinsic alignment if possible
Previously it was only being done if shouldAlignPointerArgs() returned
true, which right now is only true for ARM targets.
Updating the argument alignment attributes of memcpy/memset intrinsics
if the underlying object has larger alignment can be beneficial even
when CGP didn't increase alignment (as can be seen from the test changes),
so invert the loop and if condition.

Differential Revision: https://reviews.llvm.org/D134281
2022-11-17 11:59:35 +00:00
Anton Sidorenko b6c790736e [MachineCombiner][RISCV] Add fmadd/fmsub/fnmsub instructions patterns
This patch adds tranformation of fmul+fadd/fsub chains to fused multiply
instructions:
  * fmul+fadd->fmadd
  * fmul+fsub->fmsub/fnmsub

We also will try to combine these instructions if the fmul has more than one use
and cannot be deleted. However, removing the dependence between fmul and fadd can
still be profitable, and we rely on machine combiner approximations of scheduling.

Differential Revision: https://reviews.llvm.org/D136764
2022-11-17 13:24:04 +03:00
Jay Foad 96a661de4b [GlobalISel] Better verification of G_UNMERGE_VALUES
Verify three cases of G_UNMERGE_VALUES separately:

1. Splitting a vector into subvectors (the converse of
   G_CONCAT_VECTORS).
2. Splitting a vector into its elements (the converse of
   G_BUILD_VECTOR).
3. Splitting a scalar into smaller scalars (the converse of
   G_MERGE_VALUES).

Previously #1 allowed strange combinations like this:
  %1:_(<2 x s16>),%2:_(<2 x s16>) = G_UNMERGE_VALUES %0(<2 x s32>)
This has been tightened up to check that the source and destination
element types match, and some MIR test cases updated accordingly.

Differential Revision: https://reviews.llvm.org/D111132
2022-11-17 08:19:57 +00:00
Sinan Lin 4ad8952d2d [CodeGen][BasicBlockSections] Fix wrong alignment directive placement in
basic block section cases

MachineBlockPlacement pass sets an alignment attribute to the loop
header MBB and this attribute will lead to an alignment directive during
emitting asm. In the case of the basic block section, the alignment
directive is put before the section label, and thus the alignment is set
to the predecessor of the loop header, which is not what we expect and
increases the code size (both inserting nop and set section alignment).

Reviewed By: rahmanl

Differential Revision: https://reviews.llvm.org/D137535
2022-11-17 15:01:57 +08:00
zhongyunde 8fbb6f8678 [NFC] Fix typo in comment
Address comment in https://reviews.llvm.org/D137936

Differential Revision: https://reviews.llvm.org/D138124
2022-11-16 23:35:53 +08:00
David Green 71609871dd [AArch64][MachineCombiner] Use MIMetadata to copy pcsections metadata to reassociated instructions.
D134260/D138107 exposed that the MachineCombiner was not copying
pcsections metadata where it should. This patch switches the MIBuild
methods to use MIMetadata that can copy the debug loc and pcsections at
the same time.

Differential Revision: https://reviews.llvm.org/D138112
2022-11-16 13:22:48 +00:00
Simon Pilgrim a92f5a08a1 [DAG] simplifySelect - add support for vselect(0, T, F) -> F fold
We still need to add handling for the non-zero T fold (which requires getBooleanContents handling)
2022-11-16 13:11:14 +00:00
OCHyams a1ac6efcb0 [NFC][SelectionDAG][DebugInfo] Refactor DanglingDebugInfo class
Hide the underlying DbgValueInst by adding methods to extract the necessary
information and by adding a raw_ostream &operator<< overload to print it.

Remove the DebugLoc field as this is always the same as the DbgValueInst's
DebugLoc (see D136247).

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D136249
2022-11-16 10:10:24 +00:00
OCHyams 9792744650 [NFC][SelectionDAG][DebugInfo] Remove duplicate parameter from handleDebugValue
handleDebugValue has two DebugLoc parameters that appear to always take the
same value. Remove one of the duplicate parameters. See phabricator review for
more detail.

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D136247
2022-11-16 09:59:35 +00:00
Matt Arsenault 116c894d72 DAG: Fix assert on load casted to vector with attached range metadata
AMDGPU legalizes i64 loads to loads of <2 x i32>, leaving the
i64 MMO with attached range metadata alone. The known bit width
was using the scalar element type, and asserting on a mismatch.
2022-11-15 23:28:55 -08:00
Yeting Kuo ed9638c44b [VP][RISCV] Add vp.nearbyint and RISC-V support.
nearbyint has the property to execute without exception.
For not modifying fflags, the patch added new machine opcode
PseudoVFROUND_NOEXCEPT_V that expands vfcvt.x.f.v and vfcvt.f.x.v between a pair
of frflags and fsflags.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137685
2022-11-16 14:05:35 +08:00
Yeting Kuo 5c3ca10b09 [VP][RISCV] Add vp.bswap and RISC-V support.
The patch also added function expandVPBSWAP to expand ISD::VP_BSWAP nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137928
2022-11-16 11:36:38 +08:00
Craig Topper f387918dd8 [TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion.
We can reuse constants if we use SRL followed by AND and AND followed by SHL.
Similar was done to bitreverse previously.

Differential Revision: https://reviews.llvm.org/D138045
2022-11-15 14:36:01 -08:00
Fangrui Song 6c7666a408 Revert D137574 "PEI should be able to use backward walk in replaceFrameIndicesBackward."
This reverts commit e05ce03cfa.

Caused asan use-after-poison to 4 DebugInfo/AMDGPU/ tests.
Triggered in PEI::replaceFrameIndicesBackward called llvm::MachineInstr::getNumOperands
2022-11-15 19:19:46 +00:00
Sanjay Patel fe05a0a3dd [SDAG] avoid udiv/urem transform for vector/scalar type mismatches
This solves the crashing from issue #58994.
I don't know anything about VE, so I don't know if the output
is as expected or even correct.
2022-11-15 11:01:18 -05:00
Alexander Timofeev e05ce03cfa PEI should be able to use backward walk in replaceFrameIndicesBackward.
The backward register scavenger has correct register
liveness information. PEI should leverage the backward register scavenger.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137574
2022-11-15 15:20:25 +01:00
Serge Pavlov ec893da990 [GlobalISel] Remove semantic operand of G_IS_FPCLASS
Instruction G_IS_FPCLASS had an operand that represented floating-point
semantics of its first operand. It allowed types that have the same length,
like `bfloat16` and `half`, to be distinguished. Unfortunately, it is
not sufficient, as other operation still cannot distinguish such types.
Solution of this problem must be more general, so now this operand is removed.

Differential Revision: https://reviews.llvm.org/D138004
2022-11-15 15:48:05 +07:00
Guozhi Wei 11e86868c1 [MachineCSE] Allow CSE for instructions with ignorable operands
Ignorable operands don't impact instruction's behavior, we can safely do CSE on
the instruction.

It is split from D130919. It has big impact to some AMDGPU test cases.
For example in atomic_optimizations_raw_buffer.ll, when trying to check if the
following instruction can be CSEed

  %37:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

Function isCallerPreservedOrConstPhysReg is called on operand "implicit $exec",
this function is implemented as

  -  return TRI.isCallerPreservedPhysReg(Reg, MF) ||
  +  return TRI.isCallerPreservedPhysReg(Reg, MF) || TII.isIgnorableUse(MO) ||
            (MRI.reservedRegsFrozen() && MRI.isConstantPhysReg(Reg));

Both TRI.isCallerPreservedPhysReg and MRI.isConstantPhysReg return false on this
operand, so isCallerPreservedOrConstPhysReg is also false, it causes LLVM failed
to CSE this instruction.

With this patch TII.isIgnorableUse returns true for the operand $exec, so
isCallerPreservedOrConstPhysReg also returns true, it causes this instruction to
be CSEed with previous instruction

  %14:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

So I got different result from here. AMDGPU's implementation of isIgnorableUse
is

  bool SIInstrInfo::isIgnorableUse(const MachineOperand &MO) const {
    // Any implicit use of exec by VALU is not a real register read.
    return MO.getReg() == AMDGPU::EXEC && MO.isImplicit() &&
           isVALU(*MO.getParent()) && !resultDependsOnExec(*MO.getParent());
  }

Since the operand $exec is not a real register read, my understanding is it's
reasonable to do CSE on such instructions.

Because more instructions are CSEed, so I get less instructions generated for
these tests.

Differential Revision: https://reviews.llvm.org/D137222
2022-11-14 19:34:59 +00:00
Nicholas Guy d52e2839f3 [ARM][CodeGen] Add support for complex deinterleaving
Adds the Complex Deinterleaving Pass implementing support for complex numbers in a target-independent manner, deferring to the TargetLowering for the given target to create a target-specific intrinsic.

Differential Revision: https://reviews.llvm.org/D114174
2022-11-14 14:02:27 +00:00
Nikita Popov feda983ff8 [TableGen] Use MemoryEffects to represent intrinsic memory effects (NFCI)
The TableGen implementation was using a homegrown implementation of
FunctionModRefInfo. This switches it to use MemoryEffects instead.
This makes the code simpler, and will allow exposing the full
representational power of MemoryEffects in the future. Among other
things, this will allow us to map IntrHasSideEffects to an
inaccessiblemem readwrite, rather than just ignoring it entirely
in most cases.

To avoid layering issues, this moves the ModRef.h header from IR
to Support, so that it can be included in the TableGen layer.

Differential Revision: https://reviews.llvm.org/D137641
2022-11-14 10:52:04 +01:00
chenglin.bi 8482247900 [GlobalISel] Correct constant type in matchReassocConstantInnerLHS
When we match a pattern from m_GCst, the register type could be different from original op. So we can't replace the original op to vreg direct.
This code create a new constant with original op type then replace the original op.

Fix #58906

Reviewed By: arsenm, aemerson

Differential Revision: https://reviews.llvm.org/D137778
2022-11-13 19:20:07 +08:00
Matt Arsenault 3cfa03856f AtomicExpand: Support cmpxchg expansion for small FP types
Handles f16 atomics for AMDGPU.
2022-11-10 22:16:11 -08:00
Nick Desaulniers f2981a3bc9 [SelectDagISEL] refactor HandlePHINodesInSuccessorBlocks NFC.
While working on this code to support outputs from callbr along indirect
branches, I kept making these changes again and again. Precommit these.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137445
2022-11-10 14:34:23 -08:00
Alexander Timofeev 27091e6227 [PEI][NFC] Refactoring of the debug instructions frame index replacement
This is required for the upcoming backward PEI::replaceFrameIndices version.
Both forward and backward versions will use same code for debug instruction processing.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137741
2022-11-10 14:02:03 +01:00
wlei 47b0758049 [SampleFDO] Persist profile staleness metrics into binary
With https://reviews.llvm.org/D136627, now we have the metrics for profile staleness based on profile statistics, monitoring the profile staleness in real-time can help user quickly identify performance issues. For a production scenario, the build is usually incremental and if we want the real-time metrics, we should store/cache all the old object's metrics somewhere and pull them in a post-build time. To make it more convenient, this patch add an option to persist them into the object binary, the metrics can be reported right away by decoding the binary rather than polling the previous stdout/stderrs from a cache system.

For implementation, it writes the statistics first into a new metadata section(llvm.stats) then encode into a special ELF `.llvm_stats` section. The section data is formatted as a list of key/value pair so that future statistics can be easily extended. This is also under a new switch(`-persist-profile-staleness`)

In terms of size overhead, the metrics are computed at module level, so the size overhead should be small, measured on one of our internal service, it costs less than < 1MB for a 10GB+ binary.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D136698
2022-11-09 22:34:33 -08:00
chenglin.bi 597f444092 [TypePromotion] Replace Zext to Truncate for the case src bitwidth is larger
Fix: https://github.com/llvm/llvm-project/issues/58843

Reviewed By: samtebbs

Differential Revision: https://reviews.llvm.org/D137613
2022-11-09 05:08:01 +08:00
Nathan James 6aa050a690 Reland "[llvm][NFC] Use c++17 style variable type traits"
This reverts commit 632a389f96.

This relands commit
1834a310d0.

Differential Revision: https://reviews.llvm.org/D137493
2022-11-08 14:15:15 +00:00
Nathan James 632a389f96 Revert "[llvm][NFC] Use c++17 style variable type traits"
This reverts commit 1834a310d0.
2022-11-08 13:11:41 +00:00
Nathan James 1834a310d0
[llvm][NFC] Use c++17 style variable type traits
This was done as a test for D137302 and it makes sense to push these changes

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D137493
2022-11-08 12:22:52 +00:00
Petar Avramovic 838d5d371a AMDGPU/GlobalISel: Fix combine crash because LI is not set in prelegalizer
Caused by legacy min/max combines (select + cmp) asking for legalizer info
in prelegalizer (D135047 added combine to all_combines).
Combine still does not work for AMDGPU since destination opcode is custom,
not legal. Similar combine works on DAG since it asks for legal or custom.

Differential Revision: https://reviews.llvm.org/D137274
2022-11-08 12:46:16 +01:00
Tobias Hieta aa99b607b5
[clang][pdb] Don't include -fmessage-length in PDB buildinfo
As discussed in https://reviews.llvm.org/D136474 -fmessage-length
creates problems with reproduciability in the PDB files.

This patch just drops that argument when writing the PDB file.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D137322
2022-11-08 10:05:59 +01:00