Commit Graph

2189 Commits

Author SHA1 Message Date
Reid Kleckner b518054b87 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding to the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

llvm-svn: 298393
2017-03-21 16:57:19 +00:00
Artur Pilipenko 4cc6130f52 NFC. InstCombiner::visitFAdd: extract LHSIntVal/RHSIntVal local variables
llvm-svn: 298359
2017-03-21 11:32:15 +00:00
Matt Arsenault 6b00d40900 InstCombine: Check source value precision when reducing cast intrinsic
Missed this check when porting from the libcall version.

llvm-svn: 298312
2017-03-20 21:59:24 +00:00
Craig Topper d92d2fc763 [InstCombine] Print a debug message when we constant fold an operand during worklist creation
InstCombine tries to constant fold instruction operands during worklist building, but we don't print that we're doing this.

We also set a change flag here that causes us to rebuild and rerun the worklist one more time even if processing the worklist itself created no additional changes. So in the log I saw two InstCombine runs that visited all instructions without printing that anything had changed. I may submit another patch to remove the change flag unless I can find some reason why we should be doing that.

Differential Revision: https://reviews.llvm.org/D31091

llvm-svn: 298264
2017-03-20 16:31:14 +00:00
Craig Topper ff9749f759 [InstCombine] Remove duplicate code in SimplifyDemandedUseBits for URem. NFC
llvm-svn: 298231
2017-03-19 21:45:57 +00:00
Craig Topper 3a86a04404 [InstCombine] Use setHighBits/setLowBits/setBitsFrom in place of getLowBitsSet/getHighBitsSet.
llvm-svn: 298204
2017-03-19 05:49:16 +00:00
Adrian Prantl 47ea6478ed Salvage debug info from instructions about to be deleted
[Reapplies r297971 and punting on finding a better API for findDbgValues()]

This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value intrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.

In the example in the testcase (which was extracted from XNU) there is a sequence of

 %4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
 %5 = bitcast %struct.entry* %4 to i8*, !dbg !42
 %add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
 %6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
 call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg !34

When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:

- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic

The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.

rdar://problem/30725338

Differential Revision: https://reviews.llvm.org/D30919

llvm-svn: 297994
2017-03-16 21:14:09 +00:00
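
For illustration, a minimal sketch of the bitcast case described above: when the final bitcast (%6) is erased, the dbg.value is rewritten to refer to the bitcast's operand instead (a hypothetical result, not taken from the patch):

  call void @llvm.dbg.value(metadata i8* %add.ptr4, i64 0, metadata !20, metadata !21), !dbg !34
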
Sanjay Patel 6105bb5eaf [InstCombine] avoid breaking up bitcasted vector min/max patterns (PR32306)
As the related tests show, we're not canonicalizing to this form for scalars or vectors yet,
but this solves the immediate problem in:
https://bugs.llvm.org/show_bug.cgi?id=32306

llvm-svn: 297989
2017-03-16 20:42:45 +00:00
Adrian Prantl fa9e84eb6d Revert commit r297971 because of issues reported by msan.
llvm-svn: 297982
2017-03-16 20:11:54 +00:00
Adrian Prantl 4377314a98 Salvage debug info from instructions about to be deleted
This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value intrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.

In the example in the testcase (which was extracted from XNU) there is a sequence of

  %4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
  %5 = bitcast %struct.entry* %4 to i8*, !dbg !42
  %add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
  %6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
  call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg !34

When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:

- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic

The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.

rdar://problem/30725338

Differential Revision: https://reviews.llvm.org/D30919

llvm-svn: 297971
2017-03-16 18:22:52 +00:00
Bjorn Pettersson c98dabb1a0 [InstCombine] Liberate assert in InstCombiner::visitZExt
Summary:
The call to canEvaluateZExtd in InstCombiner::visitZExt may
return with BitsToClear == SrcTy->getScalarSizeInBits(), but
there is an assert that BitsToClear should be smaller than
SrcTy->getScalarSizeInBits().

I have a test case that triggers the assert, but it only happens
for my downstream target. I've not been able to trigger it for
any upstream target.

The assert triggered for a piece of code such as this
  %shr1 = lshr i16 undef, 15
  ...
  %shr2 = lshr i16 %shr1, 1
  %conv = zext i16 %shr2 to i32

Normally the lshr instructions are constant folded before we
visit the zext (that is why it is so hard to reproduce).
The original pattern, before instcombine, is of course a lot more
complicated in my test case. The shift count in the second lshr
is for example determined by the outcome of a PHI instruction.
It seems like other rewrites by instcombine lead up to
the pattern above. The zext is then pulled from the
worklist and visited (hitting the assert) before we detect
that the lshr instructions can be constant folded.

Anyway, since canEvaluateZExtd may return with BitsToClear
equal to SrcTy->getScalarSizeInBits(), and since the rewrite
that converts the expression type to avoid a zero extend also
works for the case where SrcBitsKept ends up being zero, it
should be OK to liberate the assert to
  assert(BitsToClear <= SrcTy->getScalarSizeInBits() &&
         "Unreasonable BitsToClear");

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D30993

llvm-svn: 297952
2017-03-16 13:22:01 +00:00
Sanjay Patel a0a5682d00 [InstCombine] improve readability; NFCI
llvm-svn: 297755
2017-03-14 17:27:27 +00:00
Matt Arsenault d81f557fe2 AMDGPU: Fold icmp/fcmp into icmp intrinsic
The typical use is a library vote function that
compares the result to 0. Fold the user's condition into the intrinsic.

llvm-svn: 297650
2017-03-13 18:14:02 +00:00
Matt Arsenault a3bdd8f27b AMDGPU: Fix insertion point when reducing load intrinsics
The insertion point may be later than the next instruction,
so it is necessary to set it when replacing the call.

llvm-svn: 297439
2017-03-10 05:25:49 +00:00
Matt Arsenault efe949cc67 AMDGPU: Support for SimplifyDemandedVectorElts for load intrinsics
llvm-svn: 297408
2017-03-09 20:34:27 +00:00
Sanjay Patel 62906af379 [InstCombine] avoid crashing on shuffle shrinkage when input type is not same as result type
llvm-svn: 297280
2017-03-08 15:02:23 +00:00
Sanjay Patel fe9705149b [InstCombine] shrink truncated insertelement into undef vector
This is the 2nd part of solving:
http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html

D30123 moves the trunc ahead of the shuffle, and this moves the trunc ahead of the insertelement. 
We're limiting this transform to undef rather than any constant to avoid backend problems.

Differential Revision: https://reviews.llvm.org/D30137

llvm-svn: 297242
2017-03-07 23:27:14 +00:00
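
A minimal IR sketch of the described move, using hypothetical types and names (not taken from the patch's tests):

  ; before
  %ins = insertelement <4 x i32> undef, i32 %x, i32 0
  %t   = trunc <4 x i32> %ins to <4 x i16>
  ; after
  %xt  = trunc i32 %x to i16
  %t   = insertelement <4 x i16> undef, i16 %xt, i32 0
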
Sanjay Patel 53fa17a014 [InstCombine] shrink truncated splat shuffle (2nd try)
This was committed at r297155 and reverted at r297166 because of an
over-reaching clang test. That should be fixed with r297189.

This is one part of solving a recent bug report:
http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html

This keeps with our general approach: changing arbitrary shuffles is off-limits,
but changing splat is ok. The transform is very similar to the existing
shrinkBitwiseLogic() canonicalization.

Differential Revision: https://reviews.llvm.org/D30123

llvm-svn: 297232
2017-03-07 21:45:16 +00:00
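
A minimal IR sketch of the splat case, with hypothetical types (not taken from the patch's tests):

  ; before
  %splat = shufflevector <4 x i32> %v, <4 x i32> undef, <4 x i32> zeroinitializer
  %t     = trunc <4 x i32> %splat to <4 x i16>
  ; after
  %tv    = trunc <4 x i32> %v to <4 x i16>
  %t     = shufflevector <4 x i16> %tv, <4 x i16> undef, <4 x i32> zeroinitializer
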
Sanjay Patel 6d30606168 revert r297155 because there's a clang test that depends on InstCombine:
tools/clang/test/CodeGen/zvector.c

llvm-svn: 297166
2017-03-07 17:41:45 +00:00
Sanjay Patel defdb7bed5 [InstCombine] shrink truncated splat shuffle
This is one part of solving a recent bug report:
http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html

This keeps with our general approach: changing arbitrary shuffles is off-limits,
but changing splat is ok. The transform is very similar to the existing 
shrinkBitwiseLogic() canonicalization.

Differential Revision: https://reviews.llvm.org/D30123

llvm-svn: 297155
2017-03-07 16:10:36 +00:00
Sanjay Patel c3b4735b6f [InstCombine] use dyn_cast instead of isa+cast; NFCI
llvm-svn: 297092
2017-03-06 23:25:28 +00:00
Simon Pilgrim e938a152c5 Use APInt::getLowBitsSet instead of APInt::getBitsSet for lower bit mask creation
llvm-svn: 296882
2017-03-03 16:56:33 +00:00
Bjorn Pettersson e5027cfbcc [InstCombine] Avoid faulty combines of select-cmp-br
Summary:
When InstCombine is optimizing certain select-cmp-br patterns
it replaces the result of the select in uses outside of the
basic block containing the select. This is only legal if the
path from the select to the outside use is disjoint from all
other paths out from the originating basic block.

The problem found was that InstCombiner::replacedSelectWithOperand
did not consider the case when both edges out from the br pointed
to the same label. In that case the paths aren't disjoint and the
transformation is illegal. This patch avoids the faulty rewrites
by verifying that there is a single flow to the successor where
we want to replace uses.

Reviewers: llvm-commits, spatel, majnemer

Differential Revision: https://reviews.llvm.org/D30455

llvm-svn: 296752
2017-03-02 15:18:58 +00:00
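
A rough sketch of the problematic shape, with hypothetical names (not taken from the patch): when both edges of the branch target the same block, that block is also reached when the comparison is false, so the select cannot be replaced by its operand there.

  bb:
    %sel = select i1 %c, i32 %a, i32 %b
    %cmp = icmp eq i32 %sel, %a
    br i1 %cmp, label %succ, label %succ
  succ:
    %use = add i32 %sel, 1   ; must not be rewritten to use %a
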
Mikael Holmen 760dc9aba7 Remove sometimes faulty rewrite of memcpy in instcombine.
Summary:
Solves PR 31990.

The bad rewrite could replace a memcpy of one word with
 store i4 -1
while it should actually be
 store i8 -1

Hopefully opt and llc have improved enough that the original optimization
done by the code isn't needed anymore.

One already existing testcase is affected. It originally tested that
the memcpy was replaced with
 load double
but since we now remove that rewrite it will be
 load i64
instead.

Patch suggestion by Eli Friedman.

Reviewers: eli.friedman, majnemer, efriedma

Reviewed By: efriedma

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D30254

llvm-svn: 296585
2017-03-01 06:45:20 +00:00
Matt Arsenault cdb468c0f9 AMDGPU: Basic folds for fmed3 intrinsic
Constant fold, canonicalize constants to RHS,
reduce to minnum/maxnum when inputs are nan/undef.

llvm-svn: 296409
2017-02-27 23:08:49 +00:00
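
A hedged illustration of the constant-fold case (the exact intrinsic signature shown here is an assumption, not taken from the patch):

  %r = call float @llvm.amdgcn.fmed3.f32(float 1.0, float 4.0, float 2.0)
  ; all three operands are constants, so the call folds to the median value
  ; and uses of %r are replaced with float 2.0
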
Yaxun Liu e6d1ce59c0 [InstCombine] Fix bug in pointer replacement
This optimisation was crashing when there was a chain of more than one bitcast
instruction to replace, as a result of the changes in D27283.

Patch by James Price.

Differential Revision: https://reviews.llvm.org/D30347

llvm-svn: 296163
2017-02-24 20:27:25 +00:00
Sanjay Patel ec9a8de0e6 [InstCombine] don't try SimplifyDemandedInstructionBits from zext/sext because it's slow and unnecessary
Compared to D30270, it seems even more obvious that this can't make improvements, because an extension
always needs all of the incoming bits. There's one specific transform in SimplifyDemandedInstructionBits
that converts a sext to a zext when the sign-bit is known zero, but that is handled explicitly in
visitSext() with ComputeSignBit().

Like D30270, there are no IR differences (other than instruction names) for the case in PR32037:
https://bugs.llvm.org//show_bug.cgi?id=32037
...and no regression test differences.

Zext/sext are a smaller part of the profile, but this still appears to shave off another 0.5% or so from
'opt -O2'.

Differential Revision: https://reviews.llvm.org/D30280

llvm-svn: 296129
2017-02-24 15:18:42 +00:00
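
The sext-to-zext case mentioned above, as a hedged illustration with hypothetical values (not from the patch):

  %x = and i8 %v, 127          ; sign bit of %x is known zero
  %e = sext i8 %x to i32
  ; visitSext() can rewrite this as
  %e = zext i8 %x to i32
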
Sanjay Patel 68e4cb3c86 [InstCombine] use loop instead of recursion to peek through FPExt; NFCI
llvm-svn: 295992
2017-02-23 16:39:51 +00:00
Sanjay Patel adf2ab16e4 [InstCombine] use 'match' to reduce code; NFCI
llvm-svn: 295991
2017-02-23 16:26:03 +00:00
Matt Arsenault d4bca1e9ef AMDGPU: Replace disabled exp inputs with undef
llvm-svn: 295914
2017-02-23 00:44:03 +00:00
Matt Arsenault f5262256a1 AMDGPU: Add replacement bfe intrinsics
llvm-svn: 295899
2017-02-22 23:04:58 +00:00
Sanjay Patel 4805ce0b17 [InstCombine] don't try SimplifyDemandedInstructionBits from add/sub because it's slow and unlikely to succeed
Notably, no regression tests change when we remove these calls, and these are expensive calls.

The motivation comes from the general acknowledgement that the compiler is getting slower:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109188.html
http://lists.llvm.org/pipermail/llvm-dev/2016-December/108279.html

And specifically the test case attached to PR32037:
https://bugs.llvm.org//show_bug.cgi?id=32037

Profiling the middle-end (opt) part of the compile:
$ ./opt -O2 row_common.bc -o /dev/null

...visitAdd and visitSub are near the top of the instcombine list, and the calls to SimplifyDemandedInstructionBits()
are high within each of those. Those calls account for 1%+ of the opt time in either debug or release profiles. And 
that's the rough win I see from this patch when testing opt built release from r295864 on an iMac with Haswell 4GHz
(model 4790K).

It seems unlikely that we'd be able to eliminate add/sub or change their operands given that add/sub normally affect
all bits, and the PR32037 example shows no IR difference after this change using -O2.

Also worth noting - the code comment in visitAdd:
// This handles stuff like (X & 254)+1 -> (X&254)|1
...isn't true. That transform is handled later with a call to haveNoCommonBitsSet().

Differential Revision: https://reviews.llvm.org/D30270

llvm-svn: 295898
2017-02-22 23:01:12 +00:00
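
For reference, a sketch of the transform named in that comment, which is performed later via haveNoCommonBitsSet() (hypothetical i8 values, not from the patch):

  %a = and i8 %x, -2      ; 254 as a signed i8 constant; low bit known zero
  %r = add i8 %a, 1
  ; no bits are shared between %a and 1, so the add cannot carry and becomes
  %r = or i8 %a, 1
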
Matt Arsenault 1f17c66890 AMDGPU: Add cvt.pkrtz intrinsic
Convert llvm.SI.packf16 test uses

llvm-svn: 295797
2017-02-22 00:27:34 +00:00
Sanjay Patel cb731f1538 [InstCombine] canonicalize non-obvious forms of integer min/max
This is part of trying to clean up our handling of min/max patterns in IR.
By converting these to canonical form, we're more likely to recognize them
because there are various places in InstCombine that don't use 
matchSelectPattern or m_SMax and friends.

The backend fixups referenced in the now deleted TODO comment were added with:
https://reviews.llvm.org/rL291392
https://reviews.llvm.org/rL289738

If there's any codegen fallout from this change, we should be able to address
it in DAGCombiner or target-specific lowering. 

llvm-svn: 295758
2017-02-21 19:33:53 +00:00
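
As an illustration, the canonical smax form that matchSelectPattern and m_SMax recognize (a generic sketch, not taken from the patch):

  %cmp = icmp sgt i32 %x, %y
  %max = select i1 %cmp, i32 %x, i32 %y
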
Anna Thomas ec36f3b79a [InstCombine] Do not exercise nested max/min pattern on abs
Summary:
This is a fix for an assertion failure in
`getInverseMinMaxSelectPattern` when ABS is passed in as a select pattern.

We should not be invoking the simplification rule for
ABS(MIN(~x, y)) or ABS(MAX(~x, y)) combinations.

Added a test case which would cause an assertion failure without the patch.

Reviewers: sanjoy, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30051

llvm-svn: 295719
2017-02-21 14:40:28 +00:00
Sanjay Patel 53c5c3d65d [InstCombine] add nsw/nuw X, signbit --> or X, signbit
Changing to 'or' (rather than 'xor' when no wrapping flags are set)
allows icmp simplifies to happen as expected.

Differential Revision: https://reviews.llvm.org/D29729

llvm-svn: 295574
2017-02-18 22:20:09 +00:00
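
A minimal illustration for i8, where the sign bit is -128 (0x80); the nsw flag guarantees the sign bit of %x is clear, so the add cannot carry into it (hypothetical example, not from the patch's tests):

  %r = add nsw i8 %x, -128
  ; becomes
  %r = or i8 %x, -128
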
Eugene Leviant 958fcd7502 InstCombine: fix extraction when performing vector/array punning
Differential revision: https://reviews.llvm.org/D29491

llvm-svn: 295429
2017-02-17 07:36:03 +00:00
Matt Arsenault 920576042d InstCombine: Canonicalize fast fmuladd to fmul + fadd
llvm-svn: 295353
2017-02-16 18:46:24 +00:00
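
A hedged sketch of the canonicalization (the exact fast-math flags the patch requires are an assumption here):

  %r = call fast float @llvm.fmuladd.f32(float %a, float %b, float %c)
  ; becomes
  %m = fmul fast float %a, %b
  %r = fadd fast float %m, %c
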
Craig Topper 3731f4d173 [AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit.
llvm-svn: 295294
2017-02-16 07:35:23 +00:00
Sanjay Patel 845ea963aa [InstCombine] improve formatting; NFC
llvm-svn: 295237
2017-02-15 21:31:34 +00:00
Sanjay Patel 45b7e69fef [InstCombine] fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2)
I found one special case of this transform for 'slt 0', so I removed that and added the general transform.

Alive code to check correctness:

Name: slt_no_overflow
Pre: WillNotOverflowSignedSub(C1, C2)
%a = add nsw i8 %x, C2
%b = icmp slt %a, C1
  =>
%b = icmp slt %x, C1 - C2

Name: sgt_no_overflow
Pre: WillNotOverflowSignedSub(C1, C2)
%a = add nsw i8 %x, C2
%b = icmp sgt %a, C1
  =>
%b = icmp sgt %x, C1 - C2

http://rise4fun.com/Alive/MH

Differential Revision: https://reviews.llvm.org/D29774

llvm-svn: 294898
2017-02-12 16:40:30 +00:00
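
A concrete instance of the slt rule above, with hypothetical constants C1 = 3 and C2 = 5, where C1 - C2 = -2 does not overflow i8:

  %a = add nsw i8 %x, 5
  %b = icmp slt i8 %a, 3
  ; becomes
  %b = icmp slt i8 %x, -2
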
Benjamin Kramer 03ab8a366e [InstCombine] Move class into anonymous namespace. NFC.
This is necessary to avoid warnings from GCC.
InstCombineLoadStoreAlloca.cpp:238:7: error: 'PointerReplacer' declared
with greater visibility than the type of its field 'PointerReplacer::IC'

llvm-svn: 294794
2017-02-10 22:26:35 +00:00
Benjamin Kramer 684c87be4f [InstCombine] Silence unused variable warning in Release builds.
llvm-svn: 294788
2017-02-10 22:04:17 +00:00
Yaxun Liu ba01ed00fe Fix invalid addrspacecast due to combining alloca with global var
For function-scope variables with a large initialisation list, the FE usually
generates a global variable to hold the initializer, then generates a
memcpy intrinsic to initialize the alloca. InstCombiner::visitAllocaInst
identifies such allocas which are accessed only by reading and replaces 
them with the global variable. This is done by casting the global variable 
to the type of the alloca and replacing all references.

However, when the global variable is in a different address space that
is disjoint from addr space 0 (e.g. for IR generated from OpenCL, a
global variable cannot be in the private addr space, i.e. addr space 0), casting
the global variable to addr space 0 results in invalid IR for certain
targets (e.g. amdgpu).

To fix this issue, when the global variable is not in addr space 0,
instead of casting it to addr space 0, this patch chases down the uses
of the alloca until reaching the load instructions, then replaces loads from
the alloca with loads from the global variable. If bitcasts and GEPs are
encountered during the chasing, new bitcasts and GEPs based on the global
variable are generated and used in the load instructions.

Differential Revision: https://reviews.llvm.org/D27283

llvm-svn: 294786
2017-02-10 21:46:07 +00:00
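
A simplified sketch of the rewrite described above (address-space numbers and names are assumptions, not taken from the patch):

  @g = addrspace(2) constant [4 x i32] [i32 0, i32 1, i32 2, i32 3]
  ...
  %a = alloca [4 x i32]
  ; instead of replacing %a with an addrspacecast of @g, a read such as
  %p = getelementptr inbounds [4 x i32], [4 x i32]* %a, i32 0, i32 %i
  %v = load i32, i32* %p
  ; is rewritten to load directly from the global:
  %p2 = getelementptr inbounds [4 x i32], [4 x i32] addrspace(2)* @g, i32 0, i32 %i
  %v  = load i32, i32 addrspace(2)* %p2
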
Sanjay Patel f38bab73aa [InstCombine] allow (X * C2) << C1 --> X * (C2 << C1) for vectors
This fold already existed for vectors but only when 'C1' was a splat
constant (but 'C2' could be any constant). 

There were no tests for any vector constants, so I'm adding a test
that shows non-splat constants for both operands.  

llvm-svn: 294650
2017-02-09 23:13:04 +00:00
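
An illustration with hypothetical non-splat vector constants, where (3 << 1) = 6 and (5 << 2) = 20:

  %m = mul <2 x i32> %x, <i32 3, i32 5>
  %r = shl <2 x i32> %m, <i32 1, i32 2>
  ; becomes
  %r = mul <2 x i32> %x, <i32 6, i32 20>
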
Sanjay Patel ae3b43e488 [InstCombine] use m_APInt to allow demanded bits analysis on splat constants
llvm-svn: 294628
2017-02-09 21:43:06 +00:00
Sanjay Patel 6dd2eae76a [InstCombine] add local name for repeated calls; NFC
llvm-svn: 294470
2017-02-08 16:19:36 +00:00
Igor Laevsky a9b6872908 [InstCombineCalls] Fix buildbot failures after r294453.
Some targets don't support uint64_t options. Change type to unsigned.

Differential Revision: https://reviews.llvm.org/D28909

llvm-svn: 294461
2017-02-08 15:21:48 +00:00
Igor Laevsky 900ffa34c8 [InstCombineCalls] Unfold element atomic memcpy instruction
Differential Revision: https://reviews.llvm.org/D28909

llvm-svn: 294453
2017-02-08 14:32:04 +00:00
Igor Laevsky 4b317fa24e [InstCombineCalls] Remove zero length atomic memcpy intrinsics
Differential Revision: https://reviews.llvm.org/D28909

llvm-svn: 294452
2017-02-08 14:23:47 +00:00