Commit Graph

32796 Commits

Author SHA1 Message Date
Simon Pilgrim e5e93b6130 [DAG] FoldConstantArithmetic - add initial support for undef elements in bitcasted binop constant folding
FoldConstantArithmetic can fold constant vectors hidden behind bitcasts (e.g. vXi64 -> v2Xi32 on 32-bit platforms), but currently bails if either vector contains undef elements. These undefs can often occur due to SimplifyDemandedBits/VectorElts calls recognising that the upper bits are often unnecessary (e.g. funnel-shift/rotate implicit-modulo and AND masks).

This patch adds a basic 'FoldValueWithUndef' handler that will attempt to constant fold if one or both of the ops are undef - so far this just handles the AND and MUL cases where we always fold to zero.

The RISCV codegen increase is interesting - it looks like the BUILD_VECTOR lowering was loading a constant pool entry but now (with all elements defined constant) it can materialize the constant instead?

Differential Revision: https://reviews.llvm.org/D130839
2022-08-08 11:53:56 +01:00
David Green 061e0189a3 [DAG] Ensure Legal BUILD_VECTOR elements types in shuffle->And combine
D129150 added a combine from shuffles to And that creates a BUILD_VECTOR
of constant elements. We need to ensure that the elements are of a legal
type, to prevent asserts during lowering.

Fixes #56970.

Differential Revision: https://reviews.llvm.org/D131350
2022-08-08 09:47:55 +01:00
Aaron Ballman 32fd0b7fd5 Revert "[RDF] Remove explicit template arguments from Print"
This reverts commit ede96de751.

This breaks the build on Windows with Visual Studio:
https://lab.llvm.org/buildbot/#/builders/123/builds/12134
2022-08-07 08:24:01 -04:00
Kazu Hirata a2d4501718 [llvm] Fix comment typos (NFC) 2022-08-07 00:16:14 -07:00
Kazu Hirata 3b114087c3 [llvm] Drop unnecessary const from return types (NFC)
Identified with readability-const-return-type.
2022-08-07 00:16:11 -07:00
Fangrui Song fa66789d06 [llvm] LLVM_NODISCARD => [[nodiscard]]. NFC
With C++17 there is no Clang pedantic warning.
2022-08-07 00:26:33 +00:00
Krzysztof Parzyszek 2bc390bdd6 [RDF] Use default TargetOperandInfo if not given in constructor
All current in-tree users use the default implementation.
2022-08-06 14:32:52 -05:00
Krzysztof Parzyszek ede96de751 [RDF] Remove explicit template arguments from Print
CTAD takes care of it.
2022-08-06 13:29:15 -05:00
Filipp Zhinkin c55899f763 [DAGCombiner] Hoist funnel shifts from logic operation
Hoist funnel shift from logic op:
logic_op (FSH x0, x1, s), (FSH y0, y1, s) --> FSH (logic_op x0, y0), (logic_op x1, y1), s

The transformation improves code generated for some cases related to
issue https://github.com/llvm/llvm-project/issues/49541.

Reduced amount of funnel shifts can also improve throughput on x86 CPUs by utilizing more
available ports: https://quick-bench.com/q/gC7AKkJJsDZzRrs_JWDzm9t_iDM

Transformation correctness checks:
https://alive2.llvm.org/ce/z/TKPULH
https://alive2.llvm.org/ce/z/UvTd_9
https://alive2.llvm.org/ce/z/j8qW3_
https://alive2.llvm.org/ce/z/7Wq7gE
https://alive2.llvm.org/ce/z/Xr5w8R
https://alive2.llvm.org/ce/z/D5xe_E
https://alive2.llvm.org/ce/z/2yBZiy

Differential Revision: https://reviews.llvm.org/D130994
2022-08-05 17:02:22 -04:00
Dawid Jurczak 1bd31a6898 [NFC] Add SmallVector constructor to allow creation of SmallVector<T> from ArrayRef of items convertible to type T
Extracted from https://reviews.llvm.org/D129781 and address comment:
https://reviews.llvm.org/D129781#3655571

Differential Revision: https://reviews.llvm.org/D130268
2022-08-05 13:35:41 +02:00
Fangrui Song 7d6017fd31 [TTI] Change new getVectorInstrCost overload to use const reference after D131114
A const reference is preferred over a non-null const pointer.
`Type *` is kept as is to match the other overload.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D131197
2022-08-04 15:16:51 -07:00
Mingming Liu bc8f2f3649 [AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation.
1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method.
2) This patch also changes a few callsites (VectorCombine.cpp,
   SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method.
3) This is a split of D128302.

Differential Revision: https://reviews.llvm.org/D131114
2022-08-04 12:58:25 -07:00
Lorenzo Albano 74940d2668 [VP] Add widening for VP_STRIDED_LOAD and VP_STRIDED_STORE
Reviewed By: frasercrmck, craig.topper

Differential Revision: https://reviews.llvm.org/D121114
2022-08-04 16:12:01 +02:00
wanglian b6b0690355 [LegalizeTypes][VP] Add split operand support for VP float and integer casting
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D130685
2022-08-04 15:41:50 +08:00
Mircea Trofin 0cb9746a7d [nfc][mlgo] Separate logger and training-mode model evaluator
This just shuffles implementations and declarations around. Now the
logger and the TF C API-based model evaluator are separate.

Differential Revision: https://reviews.llvm.org/D131116
2022-08-03 16:20:28 -07:00
Adrian Prantl 905f2d1ecb Fix LDV InstrRefBasedImpl to not crash when encountering unreachable MBBs.
The testcase was delta-reduced from an LTO build with sanitizer
coverage and the MIR tail duplication pass caused a machine basic
block to become unreachable in MIR. This caused the MBB to be invisible
to the reverse post-order traversal used to initialize the MBB <->
RPONumber lookup tables.

rdar://97226240

Differential Revision: https://reviews.llvm.org/D130999
2022-08-03 13:05:05 -07:00
Felipe de Azevedo Piovezan a5a8a05c78 [SelectionDAG] Handle IntToPtr constants in dbg.value
The function `handleDebugValue` has custom logic to handle certain kinds
constants, namely integers, floats and null pointers. However, it does
not handle constant pointers created from IntToPtr ConstantExpressions.
This patch addresses the issue by replacing the Constant with its
integer operand.

A similar bug was addressed for GlobalISel in D130642.

Reviewed By: aprantl, #debug-info

Differential Revision: https://reviews.llvm.org/D130908
2022-08-03 14:10:05 -04:00
David Truby 9a976f3661 [llvm] Always use TargetConstant for FP_ROUND ISD Nodes
This patch ensures consistency in the construction of FP_ROUND nodes
such that they always use ISD::TargetConstant instead of ISD::Constant.

This additionally fixes a bug in the AArch64 SVE backend where patterns
were matching against TargetConstant nodes and sometimes failing when
passed a Constant node.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D130370
2022-08-03 14:02:11 +01:00
Fraser Cormack 646e2f4803 [VP] Rename VP int<->float conversion ISD opcodes
These should be named like the non-VP versions for consistency.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D130967
2022-08-03 10:04:38 +01:00
Paul Kirth d434e40f39 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-08-03 00:09:45 +00:00
Mircea Trofin 4146c1756d [nfc] Remove unused parameter in TailDuplicator::duplicateSimpleBB
Differential Revision: https://reviews.llvm.org/D131008
2022-08-02 13:39:34 -07:00
Kai Nacke b38375378d [GIsel] Add missing libcall for G_MUL to LegalizerHelper
The LegalizerHelper misses the code to lower G_MUL to a library call,
which this change adds.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D130987
2022-08-02 13:35:25 -04:00
Simon Pilgrim b651fdff79 [DAG] matchRotateSub - ensure the (pre-extended) shift amount is wide enough for the amount mask (PR56859)
matchRotateSub is given shift amounts that will already have stripped any/zero-extend nodes from - so make sure those values are wide enough to take a mask.
2022-08-02 11:38:52 +01:00
Tim Northover b586dc21a7 Outliner: add "target-cpu" feature from source function to outlined
The CPU is used to determine which inline asm instructions are allowed, so
needs to be copied across in case the outlined function contains any.
2022-08-02 09:33:29 +01:00
Sotiris Apostolakis 995b61cdac [SelectOpti] Auto-disable other cmov optis when the new select-opti pass is enabled
Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D129817
2022-08-02 00:19:59 +00:00
Fangrui Song 2b70bebc6d [MachineFunctionPass] Support -print-changed={,c}diff{,-quiet}
Follow-up to D130434.
Move doSystemDiff to PrintPasses.cpp and call it in MachineFunctionPass.cpp.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D130833
2022-08-01 12:56:15 -07:00
Marius Brehler ddb6c28638 Avoid comparison of integers of different signs
Otherwiese a warning is emitted when compiling with `-Wsign-compare`.
2022-08-01 11:20:41 +00:00
Simon Pilgrim b43d7aacf8 [DAG] visitINSERT_VECTOR_ELT - extend folding to BUILD_VECTOR if all missing elements from an insertion chain are known zero 2022-08-01 11:32:33 +01:00
David Sherwood 41119a0f52 [DAGCombiner] Extend visitAND to include EXTRACT_SUBVECTOR
Eliminate an AND by redefining an anyext|sext|zext.

     (and (extract_subvector (anyext|sext|zext v) _) iN_mask)
  => (extract_subvector (zeroext_iN v))

Differential Revision: https://reviews.llvm.org/D130782
2022-08-01 10:32:32 +01:00
Vladislav Dzhidzhoev facb3ac385 [GlobalISel][DebugInfo] salvageDebugInfo analogue for gMIR
Salvage debug info of instruction that is about to be deleted as dead in
Combiner pass. Currently supported instructions are COPY and G_TRUNC.

It allows to salvage debug info of some dead arguments of functions, by putting
DWARF expression corresponding to the instruction being deleted into related
DBG_VALUE instruction.

Here is an example of missing variables location https://godbolt.org/z/K48osb9dK.
We see that arguments x, y of function foo are not available in debugger, and
corresponding DBG_VALUE instructions have undefined register operand instead of
variables locaton after Aarch64PreLegalizerCombiner pass. The reason is that
registers where variables are located are removed as dead (with instruction
G_TRUNC). We can use salvageDebugInfo analogue for gMIR to preserve debug
locations of dead variables.

Statistics of llvm object files built with vs without this commit on -O2
optimization level (CMAKE_BUILD_TYPE=RelWithDebInfo, -fglobal-isel) on Aarch64 (macOS):

Number of variables with 100% of parent scope covered by DW_AT_location has been increased by 7,9%.
Number of variables with 0% coverage of parent scope has been decreased by 1,2%.
Number of variables processed by location statistics has been increased by 2,9%.
Average PC ranges coverage has been increased by 1,8 percentage points.

Coverage can be improved by supporting more instructions, or by calling
salvageDebugInfo for instructions that are deleted during Combiner rules exection.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D129909
2022-08-01 11:14:53 +02:00
Chuanqi Xu 9701053517 Introduce @llvm.threadlocal.address intrinsic to access TLS variable
This belongs to a series of patches which try to solve the thread
identification problem in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for a full background.

The problem consists of two concrete problems: TLS variable and readnone
functions. This patch tries to convert the TLS problem to readnone
problem by converting the access of TLS variable to an intrinsic which
is marked as readnone.

The readnone problem would be addressed in following patches.

Reviewed By: nikic, jyknight, nhaehnle, ychen

Differential Revision: https://reviews.llvm.org/D125291
2022-08-01 10:51:30 +08:00
Luís Marques 260a641068 [RISCV] Pre-RA expand pseudos pass
Expand load address pseudo-instructions earlier (pre-ra) to allow follow-up
patches to fold the addi of PseudoLLA instructions into the immediate
operand of load/store instructions.

Differential Revision: https://reviews.llvm.org/D123264
2022-07-31 23:19:00 +02:00
Kazu Hirata 12b29900a1 Use any_of (NFC) 2022-07-30 10:35:56 -07:00
Dmitry Vassiliev adc387460d [CodeGen] Fixed undeclared MISchedCutoff in case of NDEBUG and LLVM_ENABLE_ABI_BREAKING_CHECKS
This patch fixes the error llvm/lib/CodeGen/MachineScheduler.cpp(755): error C2065: 'MISchedCutoff': undeclared identifier in case of NDEBUG and LLVM_ENABLE_ABI_BREAKING_CHECKS.
Note MISchedCutoff is declared under #ifndef NDEBUG.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130425
2022-07-30 18:24:50 +02:00
Simon Pilgrim 9ad082eb5a [DAG] Pull out repeated getOperand() calls for shuffle ops. NFC. 2022-07-30 14:02:54 +01:00
Amaury Séchet 226086230c [DAG] Use recursivelyDeleteUnusedNodes in CommitTargetLoweringOpt.
It simplifies the logic and removes the need for manual bookkeeping.

Differential Revision: https://reviews.llvm.org/D130445
2022-07-29 13:49:03 +00:00
Simon Pilgrim af1b7ebcdf [TargetLowering] Move a few hasOneUse() tests later to reduce unnecessary computations. NFC.
Many of these cases, an early-out on the much cheaper getOpcode() check will avoid us needing to call hasOneUse() entirely.
2022-07-29 14:20:35 +01:00
Matt Arsenault a4834ad068 RegisterCoalescer: Shrink main range after shrinking subranges
If the subregister uses were dead, this would leave the main range
segment pointing to a deleted instruction.

Not sure if this should try to avoid shrinking if we know we don't
have dead components.
2022-07-29 08:57:28 -04:00
Simon Pilgrim 641dba9e28 [DAG] Move a few hasOneUse() tests later to reduce unnecessary computations. NFC.
Many of these cases, an early-out on the much cheaper getOpcode() check will avoid us needing to call hasOneUse() entirely.
2022-07-29 11:34:39 +01:00
Simon Pilgrim 9082c13106 [Support] Add KnownBits::concat method
Add a method for the various cases where we need to concatenate 2 KnownBits together (BUILD_PAIR and SHIFT_PARTS in particular) - uses the existing APInt::concat 'HiBits.concat(LoBits)' convention

Differential Revision: https://reviews.llvm.org/D130557
2022-07-29 11:06:39 +01:00
Felipe de Azevedo Piovezan 58526b2d2b [GlobalISel] Handle nullptr constants in dbg.value
Currently, the LLVM IR -> MIR translator fails to translate dbg.values
whose first argument is a null pointer. However, in other portions of
the code, such pointers are always lowered to the constant zero, for
example see IRTranslator::Translate(Constant, Register).

This patch addresses the limitation by following the same approach of
lowering null pointers to zero.

A prior test was checking that null pointers were always lowered to
$noreg; this test is changed to check for zero, and the previous
behavior is now checked by introducing a dbg.value whose first argument
is the address of a global variable.

Differential Revision: https://reviews.llvm.org/D130721
2022-07-28 14:58:14 -07:00
Felipe de Azevedo Piovezan 0ef6809c48 [GlobalISel][nfc] Remove unnecessary cast
The getOperand method already returns a Constant when it is called on
a ConstantExpression, as such the cast is not needed. To prevent a type
mismatch between the different return statements of the lambda, the
lambda return type is explicitly provided.

Differential Revision: https://reviews.llvm.org/D130719
2022-07-28 14:55:07 -07:00
Simon Pilgrim 8c99cef1e7 [DAG] Remove SelectionDAG::GetDemandedBits and use SimplifyMultipleUseDemandedBits directly.
GetDemandedBits is mainly a wrapper around SimplifyMultipleUseDemandedBits now, and is only used by DAGCombiner::visitSTORE so I've moved all remaining functionality there.

visitSTORE was making use of this to 'simplify' constants for a trunc-store. Just removing this code left to a mixture of regressions and gains - it came down to whether a target preferred a sign or zero extended constant for materialization/truncation. I've just moved the code over for now, but a next step would be to move this to targetShrinkDemandedConstant, but some targets that override the method expect a basic binop, and might react badly to a store node.....
2022-07-28 17:03:44 +01:00
Simon Pilgrim be488ba7de [DAG] DAGCombiner::visitTRUNCATE - remove GetDemandedBits call
This should now all be handled by SimplifyDemandedBits.
2022-07-28 15:23:04 +01:00
Simon Pilgrim ea7f14dad0 [DAG] SelectionDAG::GetDemandedBits - don't simplify opaque constants
I'm actually trying to get rid of GetDemandedBits - but while dismantling it I noticed that we were altering opaque constants. Fixing that causes a FP_TO_INT_SAT regression that should be addressed separately - I'll raise a bug.
2022-07-28 14:46:59 +01:00
Simon Pilgrim 69d5a038b9 [DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.

This is another step towards removing SelectionDAG::GetDemandedBits and just using TargetLowering::SimplifyMultipleUseDemandedBits.

There a few cases where we end up with extra register moves which I think we can accept in exchange for the increased ILP.

Differential Revision: https://reviews.llvm.org/D77804
2022-07-28 14:10:44 +01:00
Amaury Séchet 474a8ee03d [DAG] Use recursivelyDeleteUnusedNodes in PromoteLoad
It simplifies the code overall and removes the need for manual bookkeeping.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130447
2022-07-28 12:54:52 +00:00
Amaury Séchet 7920805b27 [DAG] Use recursivelyDeleteUnusedNodes in ReplaceLoadWithPromotedLoad
It simplifies the code overall and removes the need for manual bookkeeping.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130444
2022-07-28 12:32:37 +00:00
David Spickett a0ccba5e19 [llvm] Fix some test failures with EXPENSIVE_CHECKS and libstdc++
DebugLocEntry assumes that it either contains 1 item that has no fragment
or many items that all have fragments (see the assert in addValues).

When EXPENSIVE_CHECKS is enabled, _GLIBCXX_DEBUG is defined. On a few machines
I've checked, this causes std::sort to call the comparator even
if there is only 1 item to sort. Perhaps to check that it is implemented
properly ordering wise, I didn't find out exactly why.

operator< for a DbgValueLoc will crash if this happens because the
optional Fragment is empty.

Compiler/linker/optimisation level seems to make this happen
or not. So I've seen this happen on x86 Ubuntu but the buildbot
for release EXPENSIVE_CHECKS did not have this issue.

Add an explicit check whether we have 1 item.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D130156
2022-07-28 08:53:38 +00:00
Matt Arsenault bfdca1535c RegAllocGreedy: Fix nondeterminism in tryLastChanceRecoloring
tryLastChanceRecoloring iterates over the set of LiveInterval pointers
and used that to seed the recoloring stack, which was
nondeterministic. Fixes a future test failing about 20% of the time.

This just takes the order the interfering vreg was encountered. Not
sure if we should try to order this more intelligently.
2022-07-27 19:02:06 -04:00
Paul Kirth 6e9bab71b6 Revert "[llvm][NFC] Refactor code to use ProfDataUtils"
This reverts commit 300c9a7881.

We will reland once these issues are ironed out.
2022-07-27 21:38:11 +00:00
Paul Kirth 300c9a7881 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-07-27 21:13:54 +00:00
Adrian Prantl 719ab04acf [GlobalISel] Handle IntToPtr constants in dbg.value
Currently, the IR to MIR translator can only handle two kinds of constant
inputs to dbg.values intrinsics: constant integers and constant floats. In
particular, it cannot handle pointers created from IntToPtr ConstantExpression
objects.

This patch addresses the limitation above by replacing the IntToPtr with
its input integer prior to converting the dbg.value input.

Patch by Felipe Piovezan!

Differential Revision: https://reviews.llvm.org/D130642
2022-07-27 13:42:07 -07:00
Amara Emerson 65246d3eb4 Use hasNItemsOrLess() in MRI::hasAtMostUserInstrs(). 2022-07-27 11:42:14 -07:00
Amara Emerson 19cdd1908b [AArch64][GlobalISel] Add heuristics for localizing G_CONSTANT.
This adds similar heuristics to G_GLOBAL_VALUE, querying the cost of
materializing a specific constant in code size. Doing so prevents us from
sinking constants which require multiple instructions to generate into
use blocks.

Code size savings on CTMark -Os:
Program                                       size.__text
                                              before         after           diff
ClamAV/clamscan                               381940.00      382052.00       0.0%
lencod/lencod                                 428408.00      428428.00       0.0%
SPASS/SPASS                                   411868.00      411876.00       0.0%
kimwitu++/kc                                  449944.00      449944.00       0.0%
Bullet/bullet                                 463588.00      463556.00      -0.0%
sqlite3/sqlite3                               284696.00      284668.00      -0.0%
consumer-typeset/consumer-typeset             414492.00      414424.00      -0.0%
7zip/7zip-benchmark                           595244.00      594972.00      -0.0%
mafft/pairlocalalign                          247512.00      247368.00      -0.1%
tramp3d-v4/tramp3d-v4                         372884.00      372044.00      -0.2%
                           Geomean difference                               -0.0%

Differential Revision: https://reviews.llvm.org/D130554
2022-07-27 10:51:16 -07:00
Simon Pilgrim c0b3f7a50f [DAG] SimplifyDemandedBits - ensure we clear known One bits that AssertZext asserts are really known Zero
Matches ComputeKnownBits behaviour

Thanks to @uabelho for the fuzz regression report on D129765
2022-07-27 13:57:47 +01:00
Simon Pilgrim 529bd4f352 [DAG] SimplifyDemandedBits - don't early-out for multiple use values
SimplifyDemandedBits currently early-outs for multi-use values beyond the root node (just returning the knownbits), which is missing a number of optimizations as there are plenty of cases where we can still simplify when initially demanding all elements/bits.

@lenary has confirmed that the test cases in aea-erratum-fix.ll need refactoring and the current increase codegen is not a major concern.

Differential Revision: https://reviews.llvm.org/D129765
2022-07-27 10:54:06 +01:00
Dmitry Vassiliev e3e63f30a5 [CodeGen] Fixed ambiguous symbol ExtAddrMode in case of NDEBUG and LLVM_ENABLE_DUMP
This patch fixes the following error with MSVC 16.9.2 in case of NDEBUG and LLVM_ENABLE_DUMP:
llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2872: 'ExtAddrMode': ambiguous symbol
llvm/include/llvm/CodeGen/TargetInstrInfo.h(86): note: could be 'llvm::ExtAddrMode'
llvm/lib/CodeGen/CodeGenPrepare.cpp(2447): note: or '`anonymous-namespace'::ExtAddrMode'
llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2039: 'print': is not a member of 'llvm::ExtAddrMode'

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D130426
2022-07-27 00:21:57 +02:00
Fangrui Song f106525de2 [MachineFunctionPass] Support -print-changed and -print-changed=quiet
-print-changed for new pass manager is handy beside -print-after-all.
Port it to MachineFunctionPass.

Note: lib/Passes/StandardInstrumentations.cpp implements a number of
misc features. If we want to use them for codegen, we may need to lift
some functionality to LLVMIR.

Reviewed By: aeubanks, jamieschmeiser

Differential Revision: https://reviews.llvm.org/D130434
2022-07-26 10:16:49 -07:00
Simon Pilgrim 1ea7b9c6ee [DAG] matchRotateSub - set demanded bits to the shift amount type size, not the shift result size.
This should fix a report on D130251 of an assert due to a bitwidth mismatch in APInt::isSubSetOf
2022-07-26 17:58:51 +01:00
Stefan Gränitz 1e30820483 [WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms
WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This
affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222

This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs.
* Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand.
* PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers.
* LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites.

Reviewed By: theraven

Differential Revision: https://reviews.llvm.org/D128190
2022-07-26 17:52:43 +02:00
Paul Walker e5c892dd85 [SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR.
This patch starts small, only detecting sequences of the form
<a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNodes.

Differential Revision: https://reviews.llvm.org/D125194
2022-07-26 15:28:37 +00:00
wangpc 1a7078d106 [DAGCombine] Mask doesn't have to be (EltSize - 1) exactly when combining rotation
I think what we need is the least Log2(EltSize) significant bits are known to be ones.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130251
2022-07-26 21:14:45 +08:00
Sven van Haastregt c8d91b07bb Reassoc FMF should not optimize FMA(a, 0, b) to (b)
Optimizing (a * 0 + b) to (b) requires assuming that a is finite and not
NaN. DAGCombiner will do this optimization when the reassoc fast math
flag is set, which is not correct. Change DAGCombiner to only consider
UnsafeMath for this optimization.

Differential Revision: https://reviews.llvm.org/D130232

Co-authored-by: Andrea Faulds <andrea.faulds@arm.com>
2022-07-26 09:39:12 +01:00
Kazu Hirata 3f3930a451 Remove redundaunt virtual specifiers (NFC)
Identified with tidy-modernize-use-override.
2022-07-25 23:00:59 -07:00
jacquesguan cb370cf413 [DAGCombiner] Teach scalarizeExtractedBinop to support scalable splat.
This patch supports the scalable splat part for scalarizeExtractedBinop.

Differential Revision: https://reviews.llvm.org/D129725
2022-07-26 09:31:45 +08:00
Amara Emerson 5ae0472694 [GlobalISel] Fix miscompile of G_UREM + G_UDIV due to not checking for equality
of the first operands of each.

Fixes issue #55287

Differential Revision: https://reviews.llvm.org/D130525
2022-07-25 16:03:05 -07:00
Alexander Shaposhnikov 1e636f2676 [IRBuilder] Add assert for AtomicRMW ordering
Add assert for AtomicRMW: Ordering != AtomicOrdering::Unordered
(https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/Verifier.cpp#L3944)
and adjust expandAtomicStore accordingly.

Test plan:
1/ ninja check-llvm check-clang check-lld
2/ Bootstrapped LLVM/Clang pass tests

Differential revision: https://reviews.llvm.org/D130457
2022-07-25 22:51:25 +00:00
Matt Arsenault 62531518f9 RegAllocGreedy: Add a command line flag for reverseLocalAssignment
Introduce a flag like for some of the other target heuristic controls
to help with experimentation.
2022-07-25 15:47:15 -04:00
Vladislav Dzhidzhoev fc93ba061a [GlobalISel][DebugInfo] Remove debug info with zero line from constants inserted at entry block
Emission of constants having DebugLoc with line 0 causes significant increase of debug_line section size for some source files.

To illustrate, we can compare section sizes of several files from llvm test-suite, built with SelectionDAG vs GlobalISel, on Aarch64 (macOS), using -O0 optimization level:

| Source path                                                    | SDAG text sz | GISel text sz | SDAG debug_line sz |  GISel debug_line sz
| -------------------------------------------------------------- | ------------ | ------------- | ------------------ | --------------------
| `SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c`   | 15320        | 660           | 14872              | 6340
| `SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c` | 33640        | 26300         | 2812               | 6693
| `SingleSource/Benchmarks/Misc/flops-4.c`                       | 1428         | 1196          | 594                | 1008
| `MultiSource/Benchmarks/MiBench/consumer-typeset/z31.c`        | 2716         | 964           | 809                | 903
| `MultiSource/Benchmarks/Prolangs-C/gnugo/showinst.c`           | 2534         | 2502          | 189                | 573

For instance, here is a fragment of `flops-4.c.o` debug line section dump

```
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000000000    174      0      1   0             0  is_stmt
0x0000000000000010      0      0      1   0             0
0x0000000000000018    185      4      1   0             0  is_stmt prologue_end
0x000000000000001c      0      0      1   0             0
0x0000000000000024    186      4      1   0             0  is_stmt
0x000000000000002c    189     10      1   0             0  is_stmt
0x0000000000000030      0      0      1   0             0
0x0000000000000038    207     11      1   0             0  is_stmt
0x0000000000000044    208     11      1   0             0  is_stmt
0x0000000000000048      0      0      1   0             0
0x0000000000000058    210     10      1   0             0  is_stmt
0x000000000000005c      0      0      1   0             0
0x0000000000000060    211     10      1   0             0  is_stmt
0x0000000000000064      0      0      1   0             0
0x000000000000006c    212     10      1   0             0  is_stmt
0x0000000000000070      0      0      1   0             0
0x000000000000007c    213     10      1   0             0  is_stmt
0x0000000000000080      0      0      1   0             0
0x0000000000000088    214     10      1   0             0  is_stmt
0x000000000000008c      0      0      1   0             0
0x0000000000000094    215     10      1   0             0  is_stmt
```

Lot of zero lines are produced by constants (global values) having DebugLoc with line 0.
It seems that they're not significant for debugging experience.

With the commit applied, total size of debug_line sections of llvm shared libraries has reduced by 2.5%.
Change of debug line section size of files listed above:

| Source path                                                    | GISel debug_line sz | Patch debug_line sz
| -------------------------------------------------------------- | ------------------- | --------------------
| `SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c`   | 6340                | 1465
| `SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c` | 6693                | 3782
| `SingleSource/Benchmarks/Misc/flops-4.c`                       | 1008                | 609
| `MultiSource/Benchmarks/MiBench/consumer-typeset/z31.c`        | 903                 | 841
| `MultiSource/Benchmarks/Prolangs-C/gnugo/showinst.c`           | 573                 | 190

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D127488
2022-07-25 17:19:01 +00:00
Nikita Popov fb7caa3c7b [AsmPrinter] Reject ptrtoint to larger size in lowerConstant()
When using a ptrtoint to a size larger than the pointer width in a
global initializer, we currently create a ptr & low_bit_mask style
MCExpr, which will later result in a relocation error during object
file emission.

This patch rejects the constant expression already during
lowerConstant(), which results in a much clearer error message
that references the constant expression at fault.

This fixes https://github.com/llvm/llvm-project/issues/56400,
for certain definitions of "fix".

Differential Revision: https://reviews.llvm.org/D130366
2022-07-25 10:18:27 +02:00
Kazu Hirata b5188591a0 [llvm] Remove redundaunt virtual specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 21:50:35 -07:00
Kazu Hirata acf648b5e9 Use llvm::less_first and llvm::less_second (NFC) 2022-07-24 16:21:29 -07:00
Kazu Hirata ea29810c9d [CodeGen] Remove a redundant void (NFC)
Identified with modernize-redundant-void-arg.
2022-07-24 12:27:14 -07:00
Matt Arsenault 40abb28f61 RegAllocGreedy: Fix subranges when rematerializing dead subreg defs
This would create a new interval missing the subrange and hit this
verifier error:

*** Bad machine code: Live interval for subreg operand has no subranges ***
- function:    test_remat_subreg_def
- basic block: %bb.0  (0xa568758) [0B;128B)
- instruction: 32B	dead undef %4.sub0:vreg_64 = V_MOV_B32_e32 2, implicit $exec
2022-07-24 11:51:59 -04:00
Simon Pilgrim 562ee7cc5f [DAG] visitSMUL_LOHI/visitUMUL_LOHI - ensure we canonicalize constants to the RHS 2022-07-24 16:09:56 +01:00
Simon Pilgrim 428c0f2adc [DAG] getNode - assert that SMUL_LOHI/UMUL_LOHI nodes have the correct ops + types 2022-07-24 15:30:57 +01:00
Simon Pilgrim 0708771cce [DAG] MaskedVectorIsZero - don't bother with (-1).isSubsetOf mask check. NFC.
Just use KnownBits::isZero() to ensure all the bits are known zero.
2022-07-24 13:12:21 +01:00
Simon Pilgrim e82d49bfed [DAG] SimplifyMultipleUseDemandedBits - early-out for any scalable vector types
Noticed while working to remove SelectionDAG::GetDemandedBits - we were relying on the callers to have already bailed for scalable vectors
2022-07-24 12:59:43 +01:00
Simon Pilgrim a3e38b4a20 [DAG] SimplifyDemandedVectorElts - if every and/mul element-pair has a zero/undef then just constant fold to zero 2022-07-24 12:00:31 +01:00
Kazu Hirata 7bfa06f6c0 [CodeGen] Use range-based for loops (NFC) 2022-07-23 16:10:46 -07:00
Simon Pilgrim ac8be21365 [DAG] isSplatValue - don't attempt to merge any BITCAST sub elements if they contain UNDEFs
We still haven't found a solution that correctly handles 'don't care' sub elements properly - given how close it is to the next release branch, I'm making this fail safe change and we can revisit this later if we can't find alternatives.

NOTE: This isn't a reversion of D128570 - it's the removal of undef handling across bitcasts entirely

Fixes #56520
2022-07-23 18:38:48 +01:00
Dmitri Gribenko aba43035bd Use llvm::sort instead of std::sort where possible
llvm::sort is beneficial even when we use the iterator-based overload,
since it can optionally shuffle the elements (to detect
non-determinism). However llvm::sort is not usable everywhere, for
example, in compiler-rt.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D130406
2022-07-23 15:19:05 +02:00
Simon Pilgrim 5f89d2bae9 [DAG] Move OR(AND(X,C1),AND(OR(X,Y),C2)) -> OR(AND(X,OR(C1,C2)),AND(Y,C2)) fold to SimplifyDemandedBits
This will fix the SystemZ v3i31 memcpy regression in D77804 (with the help of D129765 as well....).

It should also allow us to /bend/ the oneuse limitation for cases where we can use demanded bits to safely peek though multiple uses of the AND ops.
2022-07-23 13:17:24 +01:00
Simon Pilgrim 6aff1b7b3c [DAG] SimplifyDemandedBits - pull out repeated getValueType() calls. NFC. 2022-07-23 12:01:54 +01:00
Simon Pilgrim 2421a5af72 [DAG] ExpandIntRes_ADDSUB - create UADDO/USUBO instead of ADDCARRY/SUBCARRY if overflow is known to be zero
As noticed on D127115, when splitting ADD/SUB nodes we often end up with cases where overflow from the lower bits is impossible - in such cases we're better off breaking the carry chain dependency as soon as possible.

This path is being exercised by llvm/test/CodeGen/ARM/dsp-mlal.ll, although I haven't been able to get any codegen diff without a topological worklist.
2022-07-23 11:13:44 +01:00
Simon Pilgrim 8937252465 [DAG] computeKnownBits - add basic shift-by-parts handling
Concat KnownBits from ISD::SHL_PARTS / ISD::SRA_PARTS / ISD::SRL_PARTS lo/hi operands and perform the KnownBits calculation by the shift amount on the extended type, before splitting the KnownBits based on the requested lo/hi result.
2022-07-23 09:46:30 +01:00
ARCHIT SAXENA 3bb1ce2319 Add a nop instruction if a section starts with landing pad for function splitter
This change adds a nop instruction if section starts with landing pad. This change is like [D73739](https://reviews.llvm.org/D73739) which avoids zero offset landing pad in basic block sections.

Detailed description:
The current machine functions splitter can create ˜sections which start with a landing pad themselves. This places landing pad at offset zero from LPStart.
```
	.section	.text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
	.cfi_startproc
	.cfi_personality 3, __gxx_personality_v0
	.cfi_lsda 3, .Lexception5
	.cfi_def_cfa %rsp, 16
.Ltmp11: <--- This is a Landing pad and also LP Start as it is start of this section
	movq	%rax, %rdi <--- first instruction is at offest 0 from LPStart
	callq	_Unwind_Resume@PLT

 ```
This will cause landing pad entries to become zero (.Ltmp11-foo10.cold)
```
.Lcst_begin4:
	.uleb128 .Ltmp9-.Lfunc_begin2           # >> Call Site 1 <<
	.uleb128 .Ltmp10-.Ltmp9                 #   Call between .Ltmp9 and .Ltmp10
	.uleb128 .Ltmp11-foo10.cold  <---This is zero           #     jumps to .Ltmp11
	.byte	3                               #   On action: 2
	.uleb128 .Ltmp10-.Lfunc_begin2          # >> Call Site 2 <<
	.uleb128 .Lfunc_end9-.Ltmp10            #   Call between .Ltmp10 and .Lfunc_end9
	.byte	0                               #     has no landing pad
	.byte	0                               #   On action: cleanup
	.p2align	2
```
The C++ ABI somehow assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at start of such sections so that such a case could be avoided. Output:
```
	.section	.text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
	.cfi_startproc
	.cfi_personality 3, __gxx_personality_v0
	.cfi_lsda 3, .Lexception5
	.cfi_def_cfa %rsp, 16
	nop <--- new instruction that is added
.Ltmp11:
	movq	%rax, %rdi
	callq	_Unwind_Resume@PLT
```

Reviewed By: modimo, snehasish, rahmanl

Differential Revision: https://reviews.llvm.org/D130133
2022-07-22 15:20:10 -07:00
Craig Topper be208b40c1 [DAGCombiner] Simplify code around call to reduceLoadWidth in visitAND. NFC
We were looking for loads or any_extend+load. reduceLoadWidth
hasn't known how to look through such an any_extend to find the
load since D40667 almost 5 years ago.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130333
2022-07-22 08:36:56 -07:00
Nikita Popov c2be703c6c [AsmPrinter] Move lowerConstant() error code out of switch (NFC)
Move this out of the switch, so that different branches can
indicate an error by breaking out of the switch. This becomes
important if there are more than the two current error cases.
2022-07-22 16:08:28 +02:00
Cullen Rhodes bf268a05cd [AArch64] Emit vector FP cmp when LE is used with fast-math
Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D130093
2022-07-22 07:53:55 +00:00
jacquesguan e60eb7053d recommit "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat."
With fix for AArch64 and Hexgon test cases.
2022-07-21 17:34:34 +08:00
David Green 23d6186be0 [SelectionDAG] Fix fptoi.sat scalable vector lowering
Vector fptosi_sat and fptoui_sat were being expanded by unrolling the
vector operation. This doesn't work for scalable vector, so this patch
adds a call to TLI.expandFP_TO_INT_SAT if the vector is scalable.

Scalable tests are added for AArch64 and RISCV. Some of the AArch64
fptoi_sat operations should be legal, but that will be handled in
another patch.

Differential Revision: https://reviews.llvm.org/D130028
2022-07-21 08:00:22 +01:00
esmeyi 339392ecf2 [AIX] follow-up of D124654.
Emitting the remaining aliases instead of reporting
an error to avoid SPEC2017 PEAK failures.
And mark this as a TODO.
2022-07-21 01:10:09 -04:00
Simon Pilgrim 029e83b401 [DAG] getNode - don't bother creating ADDO(X,0) or SUBO(X,0) nodes.
Similar to what we already do in getNode for basic ADD/SUB nodes, return the X operand directly, but here we know that there will be no/zero overflow as well.

As noted on D127115 - this path is being exercised by llvm/test/CodeGen/ARM/dsp-mlal.ll, although I haven't been able to get any codegen without a topological worklist.
2022-07-20 12:04:33 +01:00
Simon Pilgrim 766cd95481 [DAG] getNode - assert that ADDO/SUBO nodes have the correct ops + types 2022-07-20 11:23:58 +01:00
Simon Pilgrim 9fc347aa4e [DAG] PromoteIntRes_BUILD_VECTOR - extend constant boolean vectors according to target BooleanContents
PromoteIntRes_BUILD_VECTOR currently always ANY_EXTENDs build vector operands, but if this is a constant boolean vector we're losing the useful ability to keep the vector matching the BooleanContents mode used by the target.

This patch extends constant boolean vectors according to target BooleanContents, allowing a number of additional all-bits folds (notable XOR -> NOT conversions) to occur.

Differential Revision: https://reviews.llvm.org/D129641
2022-07-20 10:49:31 +01:00
Lorenzo Albano 07d69d9fc9 [VP] Legalize the stride operand for EXPERIMENTAL_VP_STRIDED SDNodes
Add promotion and expansion of integer operands for
experimental_vp_strided SelectionDAG nodes; the expansion is actually
just a truncation of the stride operand.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D123112
2022-07-20 10:22:43 +02:00
Kazu Hirata 76e18cc4f6 [llvm] Use llvm::any_of and llvm::none_of (NFC) 2022-07-20 00:36:19 -07:00
Kazu Hirata 0387da6f4f Use value instead of getValue (NFC) 2022-07-19 21:18:26 -07:00