Summary:
This fixes a build break with r331269.
test/Transforms/LoopVectorize/pr23997.ll
should be in:
test/Transforms/LoopVectorize/X86/pr23997.ll
llvm-svn: 331281
Summary:
This is a fix for PR23997.
The loop vectorizer is not preserving the inbounds property of GEPs that it creates.
This is inhibiting some optimizations. This patch preserves the inbounds property in
the case where a load/store is being fed by an inbounds GEP.
Reviewers: mkuper, javed.absar, hsaito
Reviewed By: hsaito
Subscribers: dcaballe, hsaito, llvm-commits
Differential Revision: https://reviews.llvm.org/D46191
llvm-svn: 331269
phi is on lhs of a comparison op.
For the following testcase,
L1:
%t0 = add i32 %m, 7
%t3 = icmp eq i32* %t2, null
br i1 %t3, label %L3, label %L2
L2:
%t4 = load i32, i32* %t2, align 4
br label %L3
L3:
%t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ]
%t6 = icmp eq i32 %t0, %t5
br i1 %t6, label %L4, label %L5
We know if we go through the path L1 --> L3, %t6 should always be true. However
currently, if the rhs of the eq comparison is phi, JumpThreading fails to
evaluate %t6 to true. And we know that Instcombine cannot guarantee always
canonicalizing phi to the left hand side of the comparison operation according
to the operand priority comparison mechanism in instcombine. The patch handles
the case when rhs of the comparison op is a phi.
Differential Revision: https://reviews.llvm.org/D46275
llvm-svn: 331266
instcombine should transform the relevant cases if the OverflowingBinaryOperator/PossiblyExactOperator can be proven to be safe.
Change-Id: I7aec62a31a894e465e00eb06aed80c3ea0c9dd45
llvm-svn: 331265
The previous version of this patch restricted the 'jal' instruction to MIPS and
microMIPSr3. microMIPS32r6 does not have this instruction and instead uses jal
as an alias for balc.
Original commit message:
> Reviewers: smaksimovic, atanasyan, abeserminji
>
> Differential Revision: https://reviews.llvm.org/D46114
>
llvm-svn: 331259
This patch fixes a bug introduced by revision 330778 (originally reviewed at:
https://reviews.llvm.org/D44782), where function isFrameLoadOpcode returned
the wrong number of bytes read for opcodes VMOVSSrm and VMOVSDrm.
This corrects that mistake, and extends the regression test to catch cases where
the dead stores should be removed.
Patch by Jeremy Morse.
Differential Revision: https://reviews.llvm.org/D46256
llvm-svn: 331252
Previously for instructions like fxsave we would print "opaque ptr" as part of the memory operand. Now we print nothing.
We also no longer accept "opaque ptr" in the parser. We still accept any size to be specified for these instructions, but we may want to consider only parsing when no explicit size is specified. This what gas does.
llvm-svn: 331243
This appears to have some issues associated with the file directive output
causing multiple global symbols with the name "file" to be emitted into a
startup section. I'm investigating more specific causes and working with the
original author.
This reverts commit r330271.
Also Revert "[DEBUGINFO, NVPTX] Add the test for the debug info of the local"
This reverts commit r330592 and the follow up of 330779 as the testcase is dependent upon r330271.
llvm-svn: 331237
This test had values that differed in only in capitalization,
and that causes problems for the auto-generating check line
script. So I changed that in rL331226, but I accidentally
forgot to change a subsequent use of a param.
llvm-svn: 331228
Dead defs were being removed from the live set (in stepForward), but
registers clobbered by regmasks weren't (more specifically, they were
actually removed by removeRegsInMask, but then they were added back in).
llvm-svn: 331219
Teach AsmParser to check with Assembler for when evaluating constant
expressions. This improves the handing of preprocessor expressions
that must be resolved at parse time. This idiom can be found as
assembling-time assertion checks in source-level assemblers. Note that
this relies on the MCStreamer to keep sufficient tabs on Section /
Fragment information which the MCAsmStreamer does not. As a result the
textual output may fail where the equivalent object generation would
pass. This can most easily be resolved by folding the MCAsmStreamer
and MCObjectStreamer together which is planned for in a separate
patch.
Currently, this feature is only enabled for assembly input, keeping IR
compilation consistent between assembly and object generation.
Reviewers: echristo, rnk, probinson, espindola, peter.smith
Reviewed By: peter.smith
Subscribers: eraman, peter.smith, arichardson, jyknight, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45164
llvm-svn: 331218
No need to waste space nor number MBBs differently if MF gets recreated.
Reviewers: qcolombet, stoklund, t.p.northover, bogner, javed.absar
Reviewed By: qcolombet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46078
llvm-svn: 331213
Summary:
As discussed in D45733, we want to do this in InstCombine.
https://rise4fun.com/Alive/LGk
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: chandlerc, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D45867
llvm-svn: 331205
This provides an optimized implementation of SADDO/SSUBO/UADDO/USUBO
as well as ADDCARRY/SUBCARRY on top of the new CC implementation.
In particular, multi-word arithmetic now uses UADDO/ADDCARRY instead
of the old ADDC/ADDE logic, which means we no longer need to use
"glue" links for those instructions. This also allows making full
use of the memory-based instructions like ALSI, which couldn't be
recognized due to limitations in the DAG matcher previously.
Also, the llvm.sadd.with.overflow et.al. intrinsincs now expand to
directly using the ADD instructions and checking for a CC 3 result.
llvm-svn: 331203
Currently, an instruction setting the condition code is linked to
the instruction using the condition code via a "glue" link in the
SelectionDAG. This has a number of drawbacks; in particular, it
means the same CC cannot be used by multiple users. It also makes
it more difficult to efficiently implement SADDO et. al.
This patch changes the back-end to represent CC dependencies as
normal values during SelectionDAG matching, along the lines of
how this is handled in the X86 back-end already.
In addition to the core mechanics of updating all relevant patterns,
this requires a number of additional changes:
- We now need to be able to spill/restore a CC value into a GPR
if necessary. This means providing a copyPhysReg implementation
for moves involving CC, and defining getCrossCopyRegClass.
- Since we still prefer to avoid such spills, we provide an override
for IsProfitableToFold to avoid creating a merged LOAD / ICMP if
this would result in multiple users of the CC.
- combineCCMask no longer requires a single CC user, and no longer
need to be careful about preventing invalid glue/chain cycles.
- emitSelect needs to be more careful in marking CC live-in to
the basic block it generates. Also, we can now optimize the
case of multiple subsequent selects with the same condition
just like X86 does.
llvm-svn: 331202
There are two separate fixes here:
* The lowering code for non-extending loads should report UnableToLegalize instead of emitting the same instruction.
* The target should not be requesting lowering of non-extending loads.
llvm-svn: 331201
This prevents infinite recursion in DWARFDie::findRecursively for
malformed DWARF where a DIE references itself.
This fixes PR36257.
Differential revision: https://reviews.llvm.org/D43092
llvm-svn: 331200
This fixes PR37293.
We can have scheduling classes with no write latency entries, that still consume
processor resources. We don't want to treat those instructions as zero-latency
instructions; they still have to be issued to the underlying pipelines, so they
still consume resource cycles.
This is likely to be a regression which I have accidentally introduced at
revision 330807. Now, if an instruction has a non-empty set of write processor
resources, we conservatively treat it as a normal (i.e. non zero-latency)
instruction.
llvm-svn: 331193
If we have LOCR instructions, select them directly from SelectionDAG
instead of first going through a pseudo instruction and then using
the custom inserter to emit the LOCR.
Provide Select pseudo-instructions for VR32/VR64 if we have vector
instructions, to avoid having to go through the first 16 FPRs
unnecessarily.
If we do not have LOCFHR, prefer using LOCR followed by a move
over a conditional branch.
llvm-svn: 331191
Summary:
This patch will introduce copying of DBG_VALUE instructions
from an otherwise empty basic block to predecessor/successor
blocks in case the empty block is eliminated/bypassed. It
is currently only done in one identified situation in the
BranchFolding pass, before optimizing on empty block.
It can be seen as a light variant of the propagation done
by the LiveDebugValues pass, which unfortunately is executed
after the BranchFolding pass.
We only propagate (copy) DBG_VALUE instructions in a limited
number of situations:
a) If the empty BB is the only predecessor of a successor
we can copy the DBG_VALUE instruction to the beginning of
the successor (because the DBG_VALUE instruction is always
part of the flow between the blocks).
b) If the empty BB is the only successor of a predecessor
we can copy the DBG_VALUE instruction to the end of the
predecessor (because the DBG_VALUE instruction is always
part of the flow between the blocks). In this case we add
the DBG_VALUE just before the first terminator (assuming
that the terminators do not impact the DBG_VALUE).
A future solution, to handle more situations, could perhaps
be to run the LiveDebugValues pass before branch folding?
This fix is related to PR37234. It is expected to resolve
the problem seen, when applied together with the fix in
SelectionDAG from here: https://reviews.llvm.org/D46129
Reviewers: #debug-info, aprantl, rnk
Reviewed By: #debug-info, aprantl
Subscribers: ormris, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D46184
llvm-svn: 331183
Summary:
When building the selection DAG at ISel all PHI nodes are
selected and lowered to Machine Instruction PHI nodes before
we start to create any SDNodes. So there are no SDNodes for
values produced by the PHI nodes.
In the past when selecting a dbg.value intrinsic that uses
the value produced by a PHI node we have been handling such
dbg.value intrinsics as "dangling debug info". I.e. we have
not created a SDDbgValue node directly, because there is
no existing SDNode for the PHI result, instead we deferred
the creationg of a SDDbgValue until we found the first use
of the PHI result.
The old solution had a couple of flaws. The position of the
selected DBG_VALUE instruction would end up quite late in a
basic block, and for example not directly after the PHI node
as in the LLVM IR input. And in case there were no use at all
in the basic block the dbg.value could be dropped completely.
This patch introduces a new VREG kind of SDDbgValue nodes.
It is similar to a SDNODE kind of node, but it refers directly
to a virtual register and not a SDNode. When we do selection
for a dbg.value that is using the result of a PHI node we
can do a lookup of the virtual register directly (as it already
is determined for the PHI node) and create a SDDbgValue node
immediately instead of delaying the selection until we find a
use.
This should fix a problem with losing debug info at ISel
as seen in PR37234 (https://bugs.llvm.org/show_bug.cgi?id=37234).
It does not resolve PR37234 completely, because the debug info
is dropped later on in the BranchFolder (see D46184).
Reviewers: #debug-info, aprantl
Reviewed By: #debug-info, aprantl
Subscribers: rnk, gbedwell, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D46129
llvm-svn: 331182
Previously these instructions were unselectable and instead were generated
through the instruction mapping tables.
Reviewers: atanasyan, smaksimovic, abeserminji
Differential Revision: https://reviews.llvm.org/D46055
llvm-svn: 331165
This patch extends the 'isSVEVectorRegWithShiftExtend' function to
improve diagnostics for SVE's gather load (scalar + vector) addressing
modes. Instead of always suggesting the 'unscaled' addressing mode,
the use of DiagnosticPredicate enables a more specific error message
in the context where the scaling is incorrect. For example:
ld1h z0.d, p0/z, [x0, z0.d, lsl #2]
^
shift amount should be '1'
Instead of suggesting the packed, unscaled addressing mode:
expected 'z[0..31].d, (uxtw|sxtw)'
the assembler now suggests using the proper scaling:
expected 'z[0..31].d, (lsl|uxtw|sxtw) #1'
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D46124
llvm-svn: 331162