Treat hint instructions like G_ASSERT_ZEXT like COPY instructions in helpers
which walk through copies.
This ensures that instructions like G_ASSERT_ZEXT won't impact any optimizations
that rely on these helpers.
Differential Revision: https://reviews.llvm.org/D95577
If constants are hidden behind G_ANYEXT we can treat them same way as G_SEXT.
For that purpose we extend getConstantVRegValWithLookThrough with option
to handle G_ANYEXT same way as G_SEXT.
Differential Revision: https://reviews.llvm.org/D92219
When constraining an operand register using constrainOperandRegClass(),
the function may emit a COPY in case the provided register class does
not match the current operand register class. However, the operand
itself is not updated to make use of the COPY, thereby resulting in
incorrect code. This patch fixes that bug by updating the machine
operand accordingly.
Reviewed By: dsanders
Differential Revision: https://reviews.llvm.org/D91244
Returning int64_t was arbitrarily limiting for wide integer types, and
the functions should handle the full generality of the IR.
Also changes the full form which returns the originally defined
vreg. Add another wrapper for the common case of just immediately
converting to int64_t (arguably this would be useful for the full
return value case as well).
One possible issue with this change is some of the existing uses did
break without conversion to getConstantVRegSExtVal, and it's possible
some without adequate test coverage are now broken.
It's fairly common to need matchers for a specific constant value, or for
common idioms like finding a negated register.
Add
- `m_SpecificICst`, which returns true when matching a specific value..
- `m_ZeroInt`, which returns true when an integer 0 is matched.
- `m_Neg`, which returns when a register is negated.
Also update a few places which use idioms related to the new matchers.
Differential Revision: https://reviews.llvm.org/D91397
Check for NoNaNsFPMath function attribute in isKnownNeverSNaN.
Function attributes are in held in 'TargetMachine.Options'.
Among other things, this allows selection of some patterns imported
in D87351 since G_FCANONICALIZE is not generated when isKnownNeverSNaN
returns true in lowerFMinNumMaxNum.
However we notice some incorrect results since function attributes are
not correctly written in TargetMachine.Options when next function is
processed. Take a look at @v_test_no_global_nnans_med3_f32_pat0_srcmod0,
it has "no-nans-fp-math"="false" but TargetMachine.Options still has it
set to true since first function in test file had this attribute set to
true. This will be fixed in D87511.
Differential Revision: https://reviews.llvm.org/D87456
This combine previously tried to take sequences like:
%cond = G_ICMP pred, a, b
G_BRCOND %cond, %truebb
G_BR %falsebb
%truebb:
...
%falsebb:
...
and by inverting the compare predicate and swapping branch targets, delete the
G_BR and instead have a single conditional branch to the falsebb. Since in an
earlier patch we have a combine to fold not(icmp) into just an inverted icmp,
we don't need this combine to do as much. This patch instead generalizes the
combine by just looking for:
G_BRCOND %cond, %truebb
G_BR %falsebb
%truebb:
...
%falsebb:
...
and then inverting the condition using a not (xor). The xor can be folded away
in a separate combine. This change also lets us avoid some optimization code
in the IRTranslator.
I also think that deleting G_BRs in the combiner is unnecessary. That's
something that targets can decide to do at selection time and could simplify
generic code in future.
Differential Revision: https://reviews.llvm.org/D86664
This is needed for an upcoming change to how we translate conditional branches
which might generate these.
Differential Revision: https://reviews.llvm.org/D86383
These should really match either G_BUILD_VECTOR or
G_BUILD_VECTOR_TRUNC, but there doesn't seem to be an existing
mechanism for matching alternative opcodes. There is GIM_SwitchOpcode,
but it seems to assume it's oly only used for matcher optimization.
I could also omit any opcode check and rely on the matcher directly
checking the opcode, but the table optimizer currently assumes there
has to be an opcode check.
Also doesn't try to handle undef elements like the DAG version.
Get the argument register and ensure there's a copy to the virtual
register. AMDGPU and AArch64 have similarish code to get the livein
value, and I also want to use this in multiple places.
This is a bit more aggressive about setting the register class than
the original function, but that's probably OK.
I think we're missing a few verifier checks for function live ins. I
noticed AArch64's calling convention code is not actually adding
liveins to functions, only the entry block (which apparently might not
matter that much?). There should probably be a verifier check that
entry block live ins are also live into the function. We also might
need a verifier check that the copy to the livein virtual register is
in the entry block.
This one is pretty easy and shrinks the list of unhandled
intrinsics. I'm not sure how relevant the insert point is. Using the
insert position of EntryBuilder will place this after
constants. SelectionDAG seems to end up emitting these after argument
copies and before anything else, but I don't think it really
matters. This also ends up emitting these in the opposite order from
SelectionDAG, but I don't think that matters either.
This also needs a fix to stop the later passes dropping this as a dead
instruction. DeadMachineInstructionElim's version of isDead special
cases LOCAL_ESCAPE for some reason, and I'm not sure why it's excluded
from MachineInstr::isLabel (or why isDead doesn't check it).
I also noticed DeadMachineInstructionElim never considers inline asm
as dead, but GlobalISel will drop asm with no constraints.
In future, we'd like to use the perfect-shuffle mechanism to deal with these
shuffle permutations. For now, this improves performance by avoiding the
super-expensive const-pool load + tbl instruction.
Differential Revision: https://reviews.llvm.org/D84866
Try to make the behavior more consistent with getGCDType, and bias
towards returning something closer to the source type whenever there's
an ambiguity.
Try harder to find a canonical unmerge type when trying to cover the
desired target type. Handle finding a compatible unmerge type for two
vectors with different element types. This will return the largest
multiple of the source vector element that will evenly divide the
target vector type.
Also make the handling mixing scalars and vectors, and prefer the
source element type as the unmerge target type.
Summary:
This will allow us to fix the issue where the lost locations
verifier causes CodeGen changes on lost locations because it
falls back on DAGISel
Reviewers: qcolombet, bogner, aprantl, vsk, paquette
Subscribers: rovka, hiraditya, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78261
https://reviews.llvm.org/D67133
While investigating some non determinism (CSE doesn't produce wrong
code, it just doesn't CSE some times) in GISel CSE on an out of tree
target, I realized that the core issue was that there were lots of code
that mutates (setReg, setRegClass etc), but doesn't notify observers
(CSE in this case but this could be any other observer). In order to
make the Observer be available in various parts of code and to avoid
having to thread it through various API, the MachineFunction now has the
observer as field. This allows it to be easily used in helper functions
such as constrainOperandRegClass.
Also added some invariant verification method in CSEInfo which can
catch these issues (when CSE is enabled).
Like COPY instructions explained in D70616, we don't check the constraints
when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check
if registers can be replaced, or a COPY instruction needs to be built.
https://reviews.llvm.org/D70564
contractCrossBankCopyIntoStore() finds the instruction defines the
source register and uses its output to replace the register. There are,
however, instructions that have multiple outputs, e.g. G_UNMERGE_VALUES.
Current implementation hardcodes to operand 0 and has no way of knowing
which output should be used.
This change adds another function to directly return the register that
is the source of the register and use that for folding.
This fixes https://bugs.llvm.org/show_bug.cgi?id=44783
Differential Revision: https://reviews.llvm.org/D74005
As an intermediate step, some TLI functions can be converted to using
LLT instead of MVT. Move this somewhere out of GlobalISel so DAG
functions can use these.
In GISel we have both G_CONSTANT and G_FCONSTANT, but because
in GISel we don't really have a concept of Float vs Int value
the only difference between the two is where the data originates
from.
What both G_CONSTANT and G_FCONSTANT return is just a bag of bits
with the constant representation in it.
By making getConstantVRegVal() return G_FCONSTANTs bit representation
as well we allow ConstantFold and other things to operate with
G_FCONSTANT.
Adding tests that show ConstantFolding to work on mixed G_CONSTANT
and G_FCONSTANT sources.
Differential Revision: https://reviews.llvm.org/D68739
llvm-svn: 374458
On AArch64, s128 types have to be split into s64 GPRs when passed as arguments.
This change adds the generic support in call lowering for dealing with multiple
registers, for incoming and outgoing args.
Support for splitting for return types not yet implemented.
Differential Revision: https://reviews.llvm.org/D66180
llvm-svn: 370822
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
%1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
%2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
%3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 23
All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.
To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
%1 = G_CONSTANT 16
%2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
%2 = G_SEXT_INREG %0, 16
or:
%1 = G_CONSTANT 16
%2:_(s32) = G_SHL %0, %1
%3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.
Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61289
llvm-svn: 368487
Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes
obstruct attempts to find constant source values. These usually come about when
try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough
about this operation allows the CBZ/CBNZ optimization to catch more cases.
This change also improves the case where we can't find a constant source at all.
Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit
a cmp and conditional branch, saving an instruction.
The cumulative code size improvement of this change plus D64354 is 5.5% geomean
on arm64 CTMark -O0.
Differential Revision: https://reviews.llvm.org/D64377
llvm-svn: 365690
When constrainRegClass is called if the constraining happens on a use the COPY
needs to be inserted before the instruction that contains the MachineOperand,
but if we are constraining a definition it actually needs to be added
after the instruction. In addition, the COPY needs to have its operands
flipped (in the use case we are copying from the old unconstrained register
to the new constrained register, while in the definition case we are copying
from the new constrained register that the instruction defines to the old
unconstrained register).
llvm-svn: 359282
getConstantVRegVal used to only look for G_CONSTANT when looking at
unboxing the value of a vreg. However, constants are sometimes not
directly used and are hidden behind trunc, s|zext or copy chain of
computation.
In particular this may be introduced by the legalization process that
doesn't want to simplify these patterns because it can lead to infine
loop when legalizing a constant.
To circumvent that problem, add a new variant of getConstantVRegVal,
named getConstantVRegValWithLookThrough, that allow to look through
extensions.
Differential Revision: https://reviews.llvm.org/D59227
llvm-svn: 356116
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
https://reviews.llvm.org/D52803
This patch adds support to continuously CSE instructions during
each of the GISel passes. It consists of a GISelCSEInfo analysis pass
that can be used by the CSEMIRBuilder.
llvm-svn: 351283
https://reviews.llvm.org/D53304
Currently dead phis are not cleaned up during DCE. This patch allows
dead PHI and G_PHI insts to be deleted.
Reviewed by: dsanders
llvm-svn: 344811
This re-applies r336929 with a fix to accomodate for the Mips target
scheduling multiple SelectionDAG instances into the pass pipeline.
PrologEpilogInserter and StackColoring depend on the StackProtector analysis
being alive from the point it is run until PEI, which requires that they are all
scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass
between StackProtector and PEI results in these passes being in separate
FunctionPassManagers and the StackProtector is not available for PEI.
PEI and StackColoring don't use much information from the StackProtector pass,
so transfering the required information to MachineFrameInfo is cleaner than
keeping the StackProtector pass around. This commit moves the SSP layout
information to MFI instead of keeping it in the pass.
This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587)
is a first draft of the pagerando implementation described in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.
Patch by Stephen Crane <sjc@immunant.com>
Differential Revision: https://reviews.llvm.org/D49256
llvm-svn: 336964