In this patch I've fixed some warnings that arose from the implicit
cast of TypeSize -> uint64_t. I tried writing a variety of different
cases to show how this optimisation might work for scalable vectors
and found:
1. The optimisation does not work for cases where the cast type
is scalable and the allocated type is not. This because we need to
know how many times the cast type fits into the allocated type.
2. If we pass all the various checks for the case when the allocated
type is scalable and the cast type is not, then when creating the
new alloca we have to take vscale into account. This leads to
sub-optimal IR that is worse than the original IR.
3. For the remaining case when both the alloca and cast types are
scalable it is hard to find examples where the optimisation would
kick in, except for simple bitcasts, because we typically fail the
ABI alignment checks.
For now I've changed the code to bail out if only one of the alloca
and cast types is scalable. This means we continue to support the
existing cases where both types are fixed, and also the specific case
when both types are scalable with the same size and alignment, for
example a simple bitcast of an alloca to another type.
I've added tests that show we don't attempt to promote the alloca,
except for simple bitcasts:
Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll
Differential revision: https://reviews.llvm.org/D87378
For a long time, the InstCombine pass handled target specific
intrinsics. Having target specific code in general passes was noted as
an area for improvement for a long time.
D81728 moves most target specific code out of the InstCombine pass.
Applying the target specific combinations in an extra pass would
probably result in inferior optimizations compared to the current
fixed-point iteration, therefore the InstCombine pass resorts to newly
introduced functions in the TargetTransformInfo when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).
This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic
A few target specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between arm and aarch64.
This allows to move about 3000 lines out from InstCombine to the targets.
Differential Revision: https://reviews.llvm.org/D81728
Narrowing an input expression of a truncate to a type larger than the
result of the truncate won't allow removing the truncate, but it may
enable further optimizations, e.g. allowing for larger vectorization
factors.
For now this is intentionally limited to integer types only, to avoid
producing new vector ops that might not be suitable for the target.
If we know that the only user is a trunc, we can also be allow more
cases, e.g. also shortening expressions with some additional shifts.
I would appreciate feedback on the best place to do such a narrowing.
This fixes PR43580.
Reviewers: spatel, RKSimon, lebedev.ri, xbolva00
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D82973
Currently canEvaluateTruncated can only attempt to truncate shifts if they are scalar/uniform constant amounts that are in range.
This patch replaces the constant extraction code with KnownBits handling, using the KnownBits::getMaxValue to check that the amounts are inrange.
This enables support for nonuniform constant cases, and also variable shift amounts that have been masked somehow. Annoyingly, this still won't work for vectors with (demanded) undefs as KnownBits returns nothing in those cases, but its a definite improvement on what we currently have.
Differential Revision: https://reviews.llvm.org/D83127
As noted on PR46531, we were only performing this transform on uniform vectors as we were using the m_APInt pattern matcher to extract the shift amount.
Differential Revision: https://reviews.llvm.org/D83035
This is intended to preserve the logic of the existing transform,
but remove unnecessary restrictions on uses and types.
https://rise4fun.com/Alive/pYfR
Pre: C1 <= width(C1) - 8
%B = sext i8 %A
%C = lshr %B, C1
%r = trunc %C to i8
=>
%r = ashr i8 %A, trunc(umin(C1, 7))
Whilst trying to compile this test to assembly:
CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret.c
I discovered some warnings were firing in InstCombiner::visitBitCast
due to calls to getNumElements() for scalable vector types. These
calls only really made sense for fixed width vectors so I have fixed
up the code appropriately.
Differential Revision: https://reviews.llvm.org/D80559
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.
Differential Revision: https://reviews.llvm.org/D75670
This was originally in D79116.
Converting from a narrow-enough FP source value to integer and
back to FP guarantees that the conversion to FP is exact because
of UB/poison-on-overflow.
This was suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19
Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
We can combine a floating-point extension cast with a conversion
from integer if we know the earlier cast is exact.
This is an optimization suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19
However, this patch does not change the example suggested there.
This patch only uses the existing analysis to handle cases where
the integer source value magnitude is narrower than the
intermediate FP mantissa (guarantees that the conversion to FP is
exact). Follow-up patches to the analysis function can enable
more cases.
Differential Revision: https://reviews.llvm.org/D79116
As suggested in D79116 - there's shared logic between the
existing code and potential new folds. This could go in
ValueTracking if it seems generally useful.
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sdesmalen, rriddle, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77263
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.
This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.
I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors. Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it. Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.
Differential Revision: https://reviews.llvm.org/D72467
Canonicalize the case when a scalar extracted from a vector is
truncated. Transform such cases to bitcast-then-extractelement.
This will enable erasing the truncate operation.
This commit fixes PR45314.
reviewers: spatel
Differential revision: https://reviews.llvm.org/D76983
The existence of the class is more confusing than helpful, I think; the
commonality is mostly just "GEP is legal", which can be queried using
APIs on GetElementPtrInst.
Differential Revision: https://reviews.llvm.org/D75660
Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp
negation. This also extends the scalarization cost in instcombine for unary
operators to result in the same IR rewrites for fneg as for the idiom.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75467
This makes sure that the constant expression bitcast goes through
target-dependent constant folding, and thus avoids an additional
iteration of InstCombine.
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.
This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.
Differential Revision: https://reviews.llvm.org/D75543
Use UnaryOperator::CreateFNeg instead.
Summary:
With the introduction of the native fneg instruction, the
fsub -0.0, %x idiom is obsolete. This patch makes LLVM
emit fneg instead of the idiom in all places.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75130
This is a bug noted in the recent D72733 and seen
in the similar transform just above the changed source code.
I added tests with illegal types and zexts to show the bug -
we could transform legal phi ops to illegal, etc. I did not add
tests with trunc because we won't see any diffs on those patterns.
That is because InstCombiner::SliceUpIllegalIntegerPHI() appears to
do those transforms independently of datalayout. It can also create
more casts than are present in existing code.
There are some existing regression tests that do not include a
datalayout that would be altered by this fix. I assumed that the
lack of a datalayout in those regression files is an oversight, so
I added the minimal layout (make i32 legal) necessary to preserve
behavior on those tests.
Differential Revision: https://reviews.llvm.org/D73907
Adds a replaceOperand() helper, which is like Instruction.setOperand()
but adds the old operand to the worklist. This reduces the amount of
missing or incorrect worklist management.
This only applies the helper to a relatively small subset of
setOperand() calls in InstCombine, namely those of the pattern
`I.setOperand(); return &I;`, where it is most obviously applicable.
Differential Revision: https://reviews.llvm.org/D73803
This renames Worklist.AddDeferred() to Worklist.add() and
Worklist.Add() to Worklist.push(). The intention here is that
Worklist.add() should be the go-to method for explicit worklist
management, while the raw Worklist.push() is mostly for
InstCombine internals. I will then migrate uses of Worklist.push()
to Worklist.add() in followup changes.
As suggested by spatel on D73411 I'm also changing the remaining
method names to lowercase first character, in line with current
coding standards.
Differential Revision: https://reviews.llvm.org/D73745
D47163 created a rule that we should not change the casted
type of a select when we have matching types in its compare condition.
That was intended to help vector codegen, but it also could create
situations where we miss subsequent folds as shown in PR44545:
https://bugs.llvm.org/show_bug.cgi?id=44545
By using shouldChangeType(), we can continue to get the vector folds
(because we always return false for vector types). But we also solve
the motivating bug because it's ok to narrow the scalar select in that
example.
Our canonicalization rules around select are a mess, but AFAICT, this
will not induce any infinite looping from the reverse transform (but
we'll need to watch for that possibility if committed).
Side note: there's a similar use of shouldChangeType() for phi ops
just below this diff, and the source and destination types appear to
be reversed.
Differential Revision: https://reviews.llvm.org/D72733
This fixes the issue encountered in D71164. Instead of using a
range-based for, manually iterate over the users and advance the
iterator beforehand, so we do not skip any users due to iterator
invalidation.
Differential Revision: https://reviews.llvm.org/D72657
This reverts commit a041c4ec6f.
This looks like a non-trivial change and there has been no code
reviews (at least there were no phabricator revisions attached to the
commit description). It is also causing a regression in one of our
downstream integration tests, we haven't been able to come up with a
minimal reproducer yet.
This does not solve PR17101, but it is one of the
underlying diffs noted here:
https://bugs.llvm.org/show_bug.cgi?id=17101#c8
We could ease the one-use checks for the 'clear'
(no 'not' op) half of the transform, but I do not
know if that asymmetry would make things better
or worse.
Proofs:
https://rise4fun.com/Alive/uVB
Name: masked bit set
%sh1 = shl i32 1, %y
%and = and i32 %sh1, %x
%cmp = icmp ne i32 %and, 0
%r = zext i1 %cmp to i32
=>
%s = lshr i32 %x, %y
%r = and i32 %s, 1
Name: masked bit clear
%sh1 = shl i32 1, %y
%and = and i32 %sh1, %x
%cmp = icmp eq i32 %and, 0
%r = zext i1 %cmp to i32
=>
%xn = xor i32 %x, -1
%s = lshr i32 %xn, %y
%r = and i32 %s, 1
Judging by the existing comments, this was the intention, but the
transform never actually checked if the existing phi's would be removed.
See https://bugs.llvm.org/show_bug.cgi?id=44242 for an example where
this causes much worse code generation on AMDGPU.
Differential Revision: https://reviews.llvm.org/D71209
Summary:
This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions.
It also adds a check in `Worklist::Add` that ensures that all added instructions have parents.
Simple test case that illustrates the difference when run with `--debug-only=instcombine`:
```
define i32 @test35(i32 %a, i32 %b) {
%1 = or i32 %a, 1135
%2 = or i32 %1, %b
ret i32 %2
}
```
Before this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting: %1 = or i32 %a, 1135
IC: Visiting: %2 = or i32 %1, %b
IC: ADD: %2 = or i32 %a, %b
IC: Old = %3 = or i32 %1, %b
New = <badref> = or i32 %2, 1135
IC: ADD: <badref> = or i32 %2, 1135
...
```
With this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting: %1 = or i32 %a, 1135
IC: Visiting: %2 = or i32 %1, %b
IC: ADD: %2 = or i32 %a, %b
IC: Old = %3 = or i32 %1, %b
New = <badref> = or i32 %2, 1135
IC: ADD: %3 = or i32 %2, 1135
...
```
Reviewers: fhahn, davide, spatel, foad, grosser, nikic
Reviewed By: nikic
Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71093
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.
This fixes the buildbot failures.
Differential Revision: https://reviews.llvm.org/D68328
Patch by Joseph Faulls!
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.
Differential Revision: https://reviews.llvm.org/D68328
Patch by Joseph Faulls!
This makes no difference currently because we don't apply FMF
to FP casts, but that may change.
This could also be a place to add a fold for select with fptrunc,
so it will make that patch easier/smaller.
This reverts these two commits
[InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double.
[InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement
We're seeing at least one internal test failure related to a
bitcast that was previously before an inline assembly block
containing emms being placed after it. This leads to the mmx
state ending up not empty after the emms. IR has no way to
make any specific guarantees about this. Reverting these patches
to get back to previous behavior which at least worked for this
test.
Summary:
optimizeVectorResize is rewriting patterns like:
%1 = bitcast vector %src to integer
%2 = trunc/zext %1
%dst = bitcast %2 to vector
Since bitcasting between integer an vector types gives
different integer values depending on endianness, we need
to take endianness into account. As it happens the old
implementation only produced the correct result for little
endian targets.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=44178
Reviewers: spatel, lattner, lebedev.ri
Reviewed By: spatel, lebedev.ri
Subscribers: lebedev.ri, hiraditya, uabelho, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70844
x86_mmx is conceptually a vector already. Don't introduce an extra conversion between it and scalar i64.
I'm using VectorType::isValidElementType which checks for floating point, integer, and pointers to hopefully make this more readable than just blacklisting x86_mmx.
Differential Revision: https://reviews.llvm.org/D69964
Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501
Add one more restriction: if the pointer is deref-or-null and in a non-default
(non-zero) address space, we can't assume inbounds.
Differential Revision: https://reviews.llvm.org/D68706
llvm-svn: 374728
bitcast <N x i8> (shuf X, undef, <N, N-1,...0>) to i{N*8} --> bswap (bitcast X to i{N*8})
In PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146
...we have a more complicated case where SLP is making a mess of bswap. This patch won't
do anything for that currently, but we need to improve bswap recognition in instcombine,
SLP, and/or a standalone pass to avoid that problem.
This is limited using the data-layout so we don't try to do this transform with actual
vector types. The backend does not appear to have folds to convert in either direction,
so we don't want to mess up something that is actually better lowered as a shuffle.
On x86, we're trading something like this:
vmovd %edi, %xmm0
vpshufb LCPI0_0(%rip), %xmm0, %xmm0 ## xmm0 = xmm0[3,2,1,0,u,u,u,u,u,u,u,u,u,u,u,u]
vmovd %xmm0, %eax
For:
movl %edi, %eax
bswapl %eax
Differential Revision: https://reviews.llvm.org/D66965
llvm-svn: 370659
This reverts commit bc4a63fd3c, this is a
speculative revert to fix a number of sanitizer bots (like
sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2
compiler crashes, presumably due to a miscompile.
llvm-svn: 367029
trunc (load X) --> load (bitcast X to narrow type)
We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated
load pattern can interfere with other instcombine transforms, so I'd like to
allow the fold sooner.
Example:
https://bugs.llvm.org/show_bug.cgi?id=16739
...in that report, we have bitcasts bracketing these ops, so those could get
eliminated too.
We've generally ruled out widening of loads early in IR ( LoadCombine -
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but
that reasoning may not apply to narrowing if we can preserve information
such as the dereferenceable range.
Differential Revision: https://reviews.llvm.org/D64432
llvm-svn: 367011
The worklist loop that we're returning back to should be able to do the repacement itself. This is how we normally do replacements.
My main motivation was that I observed that we weren't preserving the name of the result when we do this transform. The replacement code in the worklist loop will call takeName as part of the replacement.
Differential Revision: https://reviews.llvm.org/D61695
llvm-svn: 360284
For some specific cases with bitcast A->B->A with intervening PHI nodes InstCombiner::optimizeBitCastFromPhi transformation creates extra PHI nodes, which are actually a copy of already created PHI or in another words, they are redundant. These extra PHI nodes could lead to extra move instructions generated after DeSSA transformation. This happens when several conditions are met
- SROA kicks in and creates new alloca;
- there is a simple assignment L = R, which falls under 'canonicalize loads' done by combineLoadToOperationType (this transformation is by default). Exactly this transformation is the reason of bitcasts generated;
- the alloca is then used in A->B->A + PHI chain;
- there is a loop unrolling.
As a result optimizeBitCastFromPhi creates as many of PHI nodes for each new SROA alloca as loop unrolling factor is. These new extra PHI nodes are redundant actually except of one and should not be created. Moreover the idea of optimizeBitCastFromPhi is to get rid of the cast (when possible) but that doesn't happen in these conditions.
The proposed fix is to do the cast replacement for the whole calculated/accumulated PHI closure not for one cast only, which is an argument to the optimizeBitCastFromPhi. These will help to accomplish several things: 1) avoid extra PHI nodes generated as all casts which may trigger optimizeBitCastFromPhi transformation will be replaced, 3) bitcasts will be replaced, and 3) create more opportunities to remove dead code, which appears after the replacement.
A new test case shows that it's possible to get rid of all bitcasts completely and get quite good code reduction.
Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com>
Reviewed By: Carrot
Differential Revision: https://reviews.llvm.org/D57053
llvm-svn: 353595
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57173
llvm-svn: 352913
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57170
llvm-svn: 352909
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Similar to rL350199 - there are no known analysis/codegen holes for
funnel shift intrinsics now, so we can canonicalize the 6+ regular
instructions to funnel shift to improve vectorization, inlining,
unrolling, etc.
llvm-svn: 350419
The problem is shown specifically for a case with vector multiply here:
https://bugs.llvm.org/show_bug.cgi?id=40032
...and this might mask the original backend bug for ARM shown in:
https://bugs.llvm.org/show_bug.cgi?id=39967
As the test diffs here show, we were (and probably still aren't) doing
these kinds of transforms in a principled way. We are producing more or
equal wide instructions than we started with in some cases, so we still
need to restrict/correct other transforms from overstepping.
If there are perf regressions from this change, we can either carve out
exceptions to the general IR rules, or improve the backend to do these
transforms when we know the transform is profitable. That's probably
similar to a change like D55448.
Differential Revision: https://reviews.llvm.org/D55744
llvm-svn: 349389
This is a longer variant for the pattern handled in
rL346713
This one includes zexts.
Eventually, we should canonicalize all rotate patterns
to the funnel shift intrinsics, but we need a bit more
infrastructure to make sure the vectorizers handle those
intrinsics as well as the shift+logic ops.
https://rise4fun.com/Alive/FMn
Name: narrow rotateright
%neg = sub i8 0, %shamt
%rshamt = and i8 %shamt, 7
%rshamtconv = zext i8 %rshamt to i32
%lshamt = and i8 %neg, 7
%lshamtconv = zext i8 %lshamt to i32
%conv = zext i8 %x to i32
%shr = lshr i32 %conv, %rshamtconv
%shl = shl i32 %conv, %lshamtconv
%or = or i32 %shl, %shr
%r = trunc i32 %or to i8
=>
%maskedShAmt2 = and i8 %shamt, 7
%negShAmt2 = sub i8 0, %shamt
%maskedNegShAmt2 = and i8 %negShAmt2, 7
%shl2 = lshr i8 %x, %maskedShAmt2
%shr2 = shl i8 %x, %maskedNegShAmt2
%r = or i8 %shl2, %shr2
llvm-svn: 346716
The sub-pattern for the shift amount in a rotate can take on
several different forms, and there's apparently no way to
canonicalize those without seeing the entire rotate sequence.
This is the form noted in:
https://bugs.llvm.org/show_bug.cgi?id=39624https://rise4fun.com/Alive/qnT
%zx = zext i8 %x to i32
%maskedShAmt = and i32 %shAmt, 7
%shl = shl i32 %zx, %maskedShAmt
%negShAmt = sub i32 0, %shAmt
%maskedNegShAmt = and i32 %negShAmt, 7
%shr = lshr i32 %zx, %maskedNegShAmt
%rot = or i32 %shl, %shr
%r = trunc i32 %rot to i8
=>
%truncShAmt = trunc i32 %shAmt to i8
%maskedShAmt2 = and i8 %truncShAmt, 7
%shl2 = shl i8 %x, %maskedShAmt2
%negShAmt2 = sub i8 0, %truncShAmt
%maskedNegShAmt2 = and i8 %negShAmt2, 7
%shr2 = lshr i8 %x, %maskedNegShAmt2
%r = or i8 %shl2, %shr2
llvm-svn: 346713
Replacing BinaryOperator::isFNeg(...) to avoid regressions when we
separate FNeg from the FSub IR instruction.
Differential Revision: https://reviews.llvm.org/D53650
llvm-svn: 345295
Workaround bug where the InstCombine pass was asserting on the IR added in lit
test, where we have a bitcast instruction after a GEP from an addrspace cast.
The second bitcast in the test was getting combined into
`bitcast <16 x i32>* %0 to <16 x i32> addrspace(3)*`, which looks like it should
be an addrspace cast instruction instead. Otherwise if control flow is allowed
to continue as it is now we create a GEP instruction
`<badref> = getelementptr inbounds <16 x i32>, <16 x i32>* %0, i32 0`. However
because the type of this instruction doesn't match the address space we hit an
assert when replacing the bitcast with that GEP.
```
void llvm::Value::doRAUW(llvm::Value*, bool): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
```
Differential Revision: https://reviews.llvm.org/D50058
llvm-svn: 338395
InstCombine has a cast transform that matches a cast-of-select:
Orig = cast (Src = select Cond TV FV)
And tries to replace it with a select which has the cast folded in:
NewSel = select Cond (cast TV) (cast FV)
The combiner does RAUW(Orig, NewSel), so any debug values for Orig would
survive the transform. But debug values for Src would be lost.
This patch teaches InstCombine to replace all debug uses of Src with
NewSel (taking care of doing any necessary DIExpression rewriting).
Differential Revision: https://reviews.llvm.org/D49270
llvm-svn: 337310
The replaceAllDbgUsesWith utility helps passes preserve debug info when
replacing one value with another.
This improves upon the existing insertReplacementDbgValues API by:
- Updating debug intrinsics in-place, while preventing use-before-def of
the replacement value.
- Falling back to salvageDebugInfo when a replacement can't be made.
- Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into
common utility code.
Along with the API change, this teaches replaceAllDbgUsesWith how to
create DIExpressions for three basic integer and pointer conversions:
- The no-op conversion. Applies when the values have the same width, or
have bit-for-bit compatible pointer representations.
- Truncation. Applies when the new value is wider than the old one.
- Zero/sign extension. Applies when the new value is narrower than the
old one.
Testing:
- check-llvm, check-clang, a stage2 `-g -O3` build of clang,
regression/unit testing.
- This resolves a number of mis-sized dbg.value diagnostics from
Debugify.
Differential Revision: https://reviews.llvm.org/D48676
llvm-svn: 336451
We have bailout hacks based on min/max in various places in instcombine
that shouldn't be necessary. The affected test was added for:
D48930
...which is a consequence of the improvement in:
D48584 (https://reviews.llvm.org/rL336172)
I'm assuming the visitTrunc bailout in this patch was added specifically
to avoid a change from SimplifyDemandedBits, so I'm just moving that
below the EvaluateInDifferentType optimization. A narrow min/max is still
a min/max.
llvm-svn: 336293
When zext is EvaluatedInDifferentType, InstCombine
drops the dbg.value intrinsic. This patch tries to
preserve said DI, by inserting the zext's old DI in the
resulting instruction. (Only for integer type for now)
Differential Revision: https://reviews.llvm.org/D48331
llvm-svn: 336254
This prevents InstCombine from creating mis-sized dbg.values when
replacing a sequence of casts with a simpler cast. For example, in:
(fptrunc (floor (fpext X))) -> (floorf X)
We no longer emit dbg.value(X) (with a 32-bit float operand) to describe
(fpext X) (which is a 64-bit float).
This was diagnosed by the debugify check added in r335682.
llvm-svn: 335696
The previous code worked with vectors, but it failed when the
vector constants contained undef elements.
The matchers handle those cases.
llvm-svn: 335262
The purpose of this utility is to make it easier for optimizations to
insert replacement dbg.values for instructions they are deleting. This
is useful in situations where salvageDebugInfo is inapplicable, say,
because the new dbg.value cannot refer to an operand of the dying value.
The utility is called insertReplacementDbgValues.
It assumes that the instruction 'From' is going to be deleted, and
inserts replacement dbg.values for each debug user of 'From'. The
newly-inserted dbg.values refer to 'To' instead of 'From'. Each
replacement dbg.value has the same location and variable as the debug
user it replaces, has a DIExpression determined by the result of
'RewriteExpr' applied to an old debug user of 'From', and is placed
before 'InsertBefore'.
This should simplify future patches, like D48331.
llvm-svn: 335144