Commit Graph

2189 Commits

Author SHA1 Message Date
David Blaikie 4c01af203e Fix the -Werror build for some sign-comparisons
llvm-svn: 294331
2017-02-07 18:58:17 +00:00
Davide Italiano 2133bf5562 [InstCombine] Make max size array combine a tunable.
Requested by Sanjoy/Hal a while ago, and forgotten by me
(r283612).

llvm-svn: 294323
2017-02-07 17:56:50 +00:00
Paul Robinson 383c5c228f Merge DebugLoc on combined stores; in this case, when combining stores
from the end of two blocks, merge instead of arbitrarily picking one.

Differential Revision: http://reviews.llvm.org/D29504

llvm-svn: 294251
2017-02-06 22:19:04 +00:00
Sanjay Patel cf4c90f3d3 [InstCombine] simplify dyn_cast + isa; NFCI
llvm-svn: 294198
2017-02-06 17:16:16 +00:00
Sanjay Patel 0fe32ac256 [InstCombine] treat i1 as a special type in shouldChangeType()
This patch is based on the llvm-dev discussion here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html

Folding to i1 should always be desirable because that's better for value tracking 
and we have special folds for i1 types.

I checked for other users of shouldChangeType() where this might have an effect, 
but we already handle the i1 case differently than other types in all of those cases.

Side note: the default datalayout includes i1, so it seems we only find this gap in 
shouldChangeType + phi folding for the case when there is (1) an explicit datalayout 
without i1, (2) casting to i1 from a legal type, and (3) a phi with exactly 2 incoming
casted operands (as Björn mentioned).

Differential Revision: https://reviews.llvm.org/D29336

llvm-svn: 294066
2017-02-03 23:13:11 +00:00
Sanjay Patel 73fc8ddb06 [InstCombine] fix operand-complexity-based canonicalization (PR28296)
The code comments didn't match the code logic, and we didn't actually distinguish the fake unary (not/neg/fneg) 
operators from arguments. Adding another level to the weighting scheme provides more structure and can help 
simplify the pattern matching in InstCombine and other places.

I fixed regressions that would have shown up from this change in:
rL290067
rL290127

But that doesn't mean there are no pattern-matching logic holes left; some combines may just be missing regression tests.

Should fix:
https://llvm.org/bugs/show_bug.cgi?id=28296

Differential Revision: https://reviews.llvm.org/D27933

llvm-svn: 294049
2017-02-03 21:43:34 +00:00
Sanjay Patel c56d1ccd79 [InstCombine] move folds for shift-shift pairs; NFCI
Although this is 'no-functional-change-intended', I'm adding tests
for shl-shl and lshr-lshr pairs because there is no existing test 
coverage for those folds.

It seems like we should be able to remove some code from foldShiftedShift()
at this point because we're handling those patterns on the general path.

llvm-svn: 293814
2017-02-01 21:31:34 +00:00
Sanjoy Das e0e5795f6b [InstCombine] Allow InstCombine to merge adjacent guards
Summary:
If there are two adjacent guards with different conditions, we can
remove one of them and include its condition into the condition of
another one. This patch allows InstCombine to merge them by the
following pattern:

    guard(a); guard(b) -> guard(a & b).

Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29378

llvm-svn: 293778
2017-02-01 16:34:55 +00:00
Davide Italiano aec4617dc8 [Instcombine] Combine consecutive identical fences
Differential Revision:  https://reviews.llvm.org/D29314

llvm-svn: 293661
2017-01-31 18:09:05 +00:00
Arnold Schwaighofer c368563bd6 Don't combine stores to a swifterror pointer operand to a different type
llvm-svn: 293658
2017-01-31 17:53:49 +00:00
Sanjay Patel 2217f75ad1 fix formatting; NFC
llvm-svn: 293652
2017-01-31 17:25:42 +00:00
Silviu Baranga c6d21eba0e [InstCombine] Make sure that LHS and RHS have the same type in
transformToIndexedCompare

If they don't have the same type, the size of the constant
index would need to be adjusted (and this wouldn't be always
possible).

Alternatively we could try the analysis with the initial
RHS value, which would guarantee that the two sides have
the same type. However it is unlikely that in practice this
would pass our transformation requirements.

Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808).

llvm-svn: 293629
2017-01-31 14:04:15 +00:00
Sanjay Patel 8c5f236197 [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants
llvm-svn: 293570
2017-01-30 23:35:52 +00:00
Sanjay Patel 0c39d56a60 [InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants
llvm-svn: 293562
2017-01-30 23:01:05 +00:00
Sanjay Patel 373db5ba6c [InstCombine] enable (X >>?exact C1) << C2 --> X >>?exact (C1-C2) for vectors with splat constants
llvm-svn: 293524
2017-01-30 18:40:23 +00:00
Sanjay Patel 062c14af5c [InstCombine] use auto with obvious type; NFC
llvm-svn: 293508
2017-01-30 17:38:55 +00:00
Sanjay Patel 77732d5033 [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1-C2) for vectors with splat constants
llvm-svn: 293507
2017-01-30 17:19:32 +00:00
Sanjay Patel 8e644c08ee [InstCombine] fixed to propagate 'exact' on lshr
The original shift is bigger, so this may qualify as 'obvious', 
but here's an attempt at an Alive-based proof:

Name: exact
Pre: (C1 u< C2)
%a = shl i8 %x, C1
%b = lshr exact i8 %a, C2 
  =>
%c = lshr exact i8 %x, C2 - C1
%b = and i8 %c, ((1 << width(C1)) - 1) u>> C2

Optimization is correct!

llvm-svn: 293498
2017-01-30 16:53:03 +00:00
Sanjay Patel 1196d7cd7f [InstCombine] enable lshr(shl X, C1), C2 folds for vectors with splat constants
llvm-svn: 293489
2017-01-30 16:11:40 +00:00
Sanjay Patel 062adaab83 [InstCombine] enable (X >>?,exact C1) << C2 --> X << (C2 - C1) for vectors with splats
llvm-svn: 293435
2017-01-29 17:11:18 +00:00
Sanjay Patel febcb9ce54 [InstCombine] move icmp transforms that might be recognized as min/max and inf-loop (PR31751)
This is a minimal patch to avoid the infinite loop in:
https://llvm.org/bugs/show_bug.cgi?id=31751

But the general problem is bigger: we're not canonicalizing all of the min/max forms reported
by value tracking's matchSelectPattern(), and we don't define min/max consistently. Some code
uses matchSelectPattern(), other code uses matchers like m_Umax, and others have their own
inline definitions which may be subtly different from any of the above.

The reason that the test cases in this patch need a cast op to trigger is because we don't
(yet) canonicalize all min/max forms based on matchSelectPattern() in 
canonicalizeMinMaxWithConstant(), but we do make min/max+cast transforms based on 
matchSelectPattern() in visitSelectInst().

The location of the icmp transforms that trigger the inf-loop seems arbitrary at best, so
I'm moving those behind the min/max fence in visitICmpInst() as the quick fix.

llvm-svn: 293345
2017-01-27 23:26:27 +00:00
Justin Lebar 25ebe2d767 [NVPTX] [InstCombine] Add llvm_unreachable to appease MSVC.
llvm-svn: 293253
2017-01-27 02:04:07 +00:00
Justin Lebar e3ac0fb948 [NVPTX] Fix use-after-stack-free bug in InstCombineCalls.
Introduced in r293244.

llvm-svn: 293251
2017-01-27 01:49:39 +00:00
Justin Lebar 698c31b8db [NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.
Summary:
There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.

For example, if flush denormals to zero (ftz) is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32.  On the other hand, if ftz is
disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a
non-ftz PTX instruction.  In this case, we can, however, simplify the
non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.

These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.

Reviewers: tra

Subscribers: hfinkel, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D28794

llvm-svn: 293244
2017-01-27 00:58:58 +00:00
Sanjoy Das 7516192a71 Revert a couple of InstCombine/Guard checkins
This change reverts:

r293061: "[InstCombine] Canonicalize guards for NOT OR condition"
r293058: "[InstCombine] Canonicalize guards for AND condition"

They miscompile cases like:

```
declare void @llvm.experimental.guard(i1, ...)

define void @test_guard_not_or(i1 %A, i1 %B) {
  %C = or i1 %A, %B
  %D = xor i1 %C, true
  call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
  ret void
}
```

because they do transfer the `i32 20, i32 30` parameters to newly
created guard instructions.

llvm-svn: 293227
2017-01-26 23:38:11 +00:00
Sanjay Patel 50753f02c2 [InstCombine] fold (X >>u C) << C --> X & (-1 << C)
We already have this fold when the lshr has one use, but it doesn't need that
restriction. We may be able to remove some code from foldShiftedShift().

Also, move the similar:
(X << C) >>u C --> X & (-1 >>u C)
...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().

That whole function seems questionable since it is called by commonShiftTransforms(),
but there's really not much in common if we're checking the shift opcodes for every
fold.

llvm-svn: 293215
2017-01-26 22:08:10 +00:00
Sanjay Patel b0d96d327e [InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors
llvm-svn: 293208
2017-01-26 20:52:27 +00:00
Craig Topper b6122122c9 [X86] Add demanded elts support for the inputs to pclmul intrinsic
This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors.

Differential Revision: https://reviews.llvm.org/D28979

llvm-svn: 293151
2017-01-26 05:17:13 +00:00
Artur Pilipenko b85f7a5d99 [InstCombine] Canonicalize guards for NOT OR condition
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D29075

Patch by Maxim Kazantsev.

llvm-svn: 293061
2017-01-25 14:45:12 +00:00
Simon Pilgrim 6f6b279109 [InstCombine][SSE] Add support for PACKSS/PACKUS constant folding
Differential Revision: https://reviews.llvm.org/D28949

llvm-svn: 293060
2017-01-25 14:37:24 +00:00
Artur Pilipenko 4df4c4a4aa [InstCombine] Canonicalize guards for AND condition
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D29074

Patch by Maxim Kazantsev.

llvm-svn: 293058
2017-01-25 14:20:52 +00:00
Artur Pilipenko e812ca00bb [InstCombine] Allow InstrCombine to remove one of adjacent guards if they are equivalent
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: majnemer, apilipenko

Differential Revision: https://reviews.llvm.org/D29071

Patch by Maxim Kazantsev.

llvm-svn: 293056
2017-01-25 14:12:12 +00:00
Amaury Sechet d90f5f6698 Use InstCombine's builder in foldSelectCttzCtlz instead of creating a new one.
Summary: As per title. This will add the instructiions we are interested in in the worklist.

Reviewers: mehdi_amini, majnemer, andreadb

Differential Revision: https://reviews.llvm.org/D29081

llvm-svn: 292957
2017-01-24 17:48:25 +00:00
Amaury Sechet 5da456e6a1 Fix formating in foldSelectCttzCtlz. NFC
llvm-svn: 292934
2017-01-24 14:22:27 +00:00
Simon Pilgrim 78f8630ac0 [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now

Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst

llvm-svn: 292913
2017-01-24 11:07:41 +00:00
Matt Arsenault 954a624fb9 SimplifyLibCalls: Replace more unary libcalls with intrinsics
llvm-svn: 292855
2017-01-23 23:55:08 +00:00
Simon Pilgrim f6f3a36159 [InstCombine][X86] Add MULDQ/MULUDQ constant folding support
llvm-svn: 292793
2017-01-23 15:22:59 +00:00
Simon Pilgrim bb13fdabec [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Match generic mul behaviour so that <X x i64> multiply and muldq/muludq pattern act the same

llvm-svn: 292784
2017-01-23 12:07:32 +00:00
Sanjay Patel 478a83c905 [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants
We may be able to assert that no shl-shl or lshr-lshr pairs ever get here
because we should have already handled those in foldShiftedShift().

llvm-svn: 292726
2017-01-21 17:59:59 +00:00
Simon Pilgrim a50a93fcd0 [InstCombine][X86] Add MULDQ/MULUDQ undef handling
llvm-svn: 292627
2017-01-20 18:20:30 +00:00
Simon Pilgrim 51b3b98e3a [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions
Simplify a packss/packus truncation based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28777

llvm-svn: 292591
2017-01-20 09:28:21 +00:00
Davide Italiano 2ef8c4e708 [InstCombine] Simplify gep (gep p, a), (b-a)
Patch by Andrea Canciani.

Differential Revision:  https://reviews.llvm.org/D27413

llvm-svn: 292506
2017-01-19 18:51:56 +00:00
Sanjay Patel 291c3d8ff2 [InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1
Try harder to fold icmp with shl nsw as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html

This is similar to the 'shl nuw' transforms that were added with D25913.

This may eventually help solve:
https://llvm.org/bugs/show_bug.cgi?id=30773

Differential Revision: https://reviews.llvm.org/D28406

llvm-svn: 292492
2017-01-19 16:12:10 +00:00
Sanjay Patel ae23d65a7d [InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI
llvm-svn: 292440
2017-01-18 21:16:12 +00:00
Sanjay Patel 589de5ea4e [InstCombine] remove a redundant check; NFCI
I missed deleting this check when I refactored this chunk in:
https://reviews.llvm.org/rL292260

llvm-svn: 292433
2017-01-18 20:09:59 +00:00
Simon Pilgrim fe2c0ed4cf [InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles
Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.

llvm-svn: 292371
2017-01-18 14:47:49 +00:00
Simon Pilgrim a22c3a1c0f [InstCombine] Remove unnecessary intrinsics demanded elts handling
As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need.

llvm-svn: 292365
2017-01-18 13:44:04 +00:00
Sanjay Patel 14715b3c2a [InstCombine] refactor foldICmpShlConstant(); NFCI
This reduces the size of and increases the symmetry with the planned functional change in:
https://reviews.llvm.org/D28406

llvm-svn: 292260
2017-01-17 21:25:16 +00:00
David Majnemer de55c606d1 [InstCombine] Fold ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2)
This further extends r292179 to support additional binary operators
beyond subtraction.

llvm-svn: 292238
2017-01-17 18:08:06 +00:00
Sanjay Patel 5424bd2625 [InstCombine] reduce indent; NFCI
llvm-svn: 292230
2017-01-17 16:59:09 +00:00
Simon Pilgrim d4eb800b03 [InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions
Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded.

llvm-svn: 292209
2017-01-17 11:35:03 +00:00
Sanjoy Das 679bc32c6a [InstCombine] Don't DSE across readnone functions that may throw
Summary: Depends on D28740

Reviewers: dberlin, chandlerc, hfinkel, majnemer

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D28742

llvm-svn: 292197
2017-01-17 05:45:09 +00:00
David Majnemer 36d382b773 [InstCombine] Fold ((C1-zext(X)) & C2) -> zext((C1-X) & C2)
This is valid if C2 fits within the bitwidth of X thanks to two's
complement modulo arithmetic.

llvm-svn: 292179
2017-01-17 00:45:57 +00:00
Matt Arsenault 7233344c28 SimplifyLibCalls: Replace fabs libcalls with intrinsics
Add missing fabs(fpext) optimzation that worked with the call,
and also fixes it creating a second fpext when there were multiple
uses.

llvm-svn: 292172
2017-01-17 00:10:40 +00:00
Sanjay Patel da5682afdd [InstCombine] use m_APInt instead of faking it
llvm-svn: 292164
2017-01-16 21:24:41 +00:00
Sanjay Patel 65cce20caa [InstCombine] fix names in canEvaluateShiftedShift(); NFC
It's not clear what 'First' and 'Second' mean, so use 'Inner' and 'Outer'
to match foldShiftedShift() and add comments with formulas, so it's easier
to see what's going on.

llvm-svn: 292153
2017-01-16 20:05:26 +00:00
Sanjay Patel ab8b32de71 [InstCombine] use m_APInt to allow shift-shift folds for vectors with splat constants
Some existing 'FIXME' tests are still not folded because of splat holes in value tracking.

llvm-svn: 292151
2017-01-16 19:35:45 +00:00
Sanjay Patel 646734a6cd [InstCombine] refactor shift-of-shift folds; NFCI
Reduces code duplication and makes it easier to extend these folds for vectors.

llvm-svn: 292145
2017-01-16 17:27:50 +00:00
Simon Pilgrim 73a68c25a0 [InstCombine][SSE] Add DemandedElts support for PSHUFB instructions
Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28745

llvm-svn: 292101
2017-01-16 11:30:41 +00:00
Sanjay Patel 20aaf58543 [InstCombine] fix formatting; NFC
llvm-svn: 292073
2017-01-15 17:55:35 +00:00
Sanjay Patel 5f8451afad [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants
llvm-svn: 292064
2017-01-15 16:38:19 +00:00
Chandler Carruth ca68a3ec47 [PM] Introduce an analysis set used to preserve all analyses over
a function's CFG when that CFG is unchanged.

This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.

I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.

Differential Revision: https://reviews.llvm.org/D28627

llvm-svn: 292054
2017-01-15 06:32:49 +00:00
Chandler Carruth 2f19a324cb [PM] The assumption cache is fundamentally designed to be self-updating,
mark it as never invalidated in the new PM.

The old PM already required this to work, and after a discussion with
Hal this seems to really be the only sensible answer. The cache
gracefully degrades as the IR is mutated, and most things which do this
should already be incrementally updating the cache.

This gets rid of a bunch of logic preserving and testing the
invalidation of this analysis.

llvm-svn: 292039
2017-01-15 00:26:18 +00:00
Chandler Carruth 5edfd4d99e [PM] Fix instcombine's analysis preservation in the new pass manager to
cover domtree and alias analysis. These are the pretty clear analyses
that we would always want to survive this pass.

To make these survive, we also need to preserve the assumption cache.

Added a test that verifies the important bits of this preservation.

llvm-svn: 292037
2017-01-14 23:25:22 +00:00
Sanjay Patel ca3124f74b [InstCombine] clean up visitAshr(); NFCI
llvm-svn: 292036
2017-01-14 23:13:50 +00:00
Sanjay Patel 40f401776b [InstCombine] optimize unsigned icmp of increment
Allows LLVM to optimize sequences like the following:

%add = add nuw i32 %x, 1
%cmp = icmp ugt i32 %add, %y

Into:

%cmp = icmp uge i32 %x, %y

Previously, only signed comparisons were being handled.

Decrements could also be handled, but 'sub nuw %x, 1' is currently canonicalized to
'add %x, -1' in InstCombineAddSub, losing the nuw flag. Removing that canonicalization
seems like it might have far-reaching ramifications so I kept this simple for now.

Patch by Matti Niemenmaa!

Differential Revision: https://reviews.llvm.org/D24700

llvm-svn: 291975
2017-01-13 23:25:46 +00:00
Sanjay Patel 2d4b456427 [InstCombine] use m_APInt to allow lshr folds for vectors with splat constants
llvm-svn: 291972
2017-01-13 23:04:10 +00:00
Sanjay Patel acd24c7b6a [InstCombine] use 'match' and other clean-up; NFCI
llvm-svn: 291937
2017-01-13 18:52:10 +00:00
Sanjay Patel b22f6c5f26 [InstCombine] use m_APInt to allow shl folds for vectors with splat constants
llvm-svn: 291934
2017-01-13 18:39:09 +00:00
Sanjay Patel cf08203105 [InstCombine] use Op0/Op1 local variables more consistently with shifts; NFC
llvm-svn: 291923
2017-01-13 18:08:25 +00:00
Sanjay Patel 5178363687 [InstCombine] if the condition of a select may be known via assumes, eliminate the select
This is a limited solution for PR31512:
https://llvm.org/bugs/show_bug.cgi?id=31512

The motivation is that we will need to increase usage of llvm.assume and/or metadata to solve PR28430:
https://llvm.org/bugs/show_bug.cgi?id=28430

...and this kind of simplification is needed to take advantage of that extra information.

The 'not' test case would be handled by:
https://reviews.llvm.org/D28485

Differential Revision:
https://reviews.llvm.org/D28337

llvm-svn: 291915
2017-01-13 17:02:42 +00:00
Robert Lougher f5df7a18dd [DebugInfo] Add const to DILocation variable declaration; NFC.
llvm-svn: 291785
2017-01-12 18:29:28 +00:00
Hal Finkel 8a9a783f2c Make processing @llvm.assume more efficient - Add affected values to the assumption cache
Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.

As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).

Differential Revision: https://reviews.llvm.org/D28459

llvm-svn: 291671
2017-01-11 13:24:24 +00:00
Sanjay Patel db0938fd9a [InstCombine] add a wrapper for a common pair of transforms; NFCI
Some of the callers are artificially limiting this transform to integer types;
this should make it easier to incrementally remove that restriction.

llvm-svn: 291620
2017-01-10 23:49:07 +00:00
Matt Arsenault 3f509042b0 InstCombine: Set operands instead of creating new call
llvm-svn: 291612
2017-01-10 23:17:52 +00:00
Matt Arsenault fdb78f8bae InstCombine: fdiv -x, -y -> fdiv x, y
llvm-svn: 291611
2017-01-10 23:08:54 +00:00
Sanjay Patel 940c06188e fix comment typos; NFC
llvm-svn: 291447
2017-01-09 16:27:56 +00:00
Matt Arsenault 3bdd75d01e InstCombine: Fold cos(-x) -> cos(x)
Also cos(fabs(x)) -> cos(x)

llvm-svn: 291022
2017-01-04 22:49:03 +00:00
David Majnemer cb892e9066 [InstCombine] Move casts around shift operations
It is possible to perform a left shift before zero extending if the
shift would only shift out zeros.

llvm-svn: 290928
2017-01-04 02:21:34 +00:00
David Majnemer 022d2a563b [InstCombine] Combine adds across a zext
We can perform the following:
(add (zext (add nuw X, C1)), C2) -> (zext (add nuw X, C1+C2))

This is only possible if C2 is negative and C2 is greater than or equal to negative C1.

llvm-svn: 290927
2017-01-04 02:21:31 +00:00
Matt Arsenault 56ff4839ae InstCombine: Fold fabs on select of constants
llvm-svn: 290913
2017-01-03 22:40:34 +00:00
Sanjay Patel f0d1e77373 [InstCombine] use 'match' to reduce code bloat; NFCI
I wrote this patch before seeing the comment in:
https://reviews.llvm.org/D27114
...that suggests we should actually be canonicalizing the other way.

So just in case we decide this is the right way, we might as well
have a cleaner implementation.

llvm-svn: 290912
2017-01-03 22:25:31 +00:00
Matt Arsenault b264c94963 InstCombine: Add fma with constant transforms
DAGCombine already does these.

llvm-svn: 290860
2017-01-03 04:32:35 +00:00
Matt Arsenault 1cc294c85d InstCombine: Add fma + fabs/fneg transforms
fma (fneg x), (fneg y), z -> fma x, y, z
fma (fabs x), (fabs x), z -> fma x, x, z

llvm-svn: 290859
2017-01-03 04:32:31 +00:00
Sanjay Patel b38ad88e9f [InstCombine] use combineMetadataForCSE instead of copying it; NFCI
llvm-svn: 290844
2017-01-02 23:25:28 +00:00
Craig Topper d00db69227 [InstCombine][AVX-512] Teach InstCombine that llvm.x86.avx512.vcomi.sd and llvm.x86.avx512.vcomi.ss don't use the upper elements of their input.
This was already done for the SSE/SSE2 version of the intrinsics.

llvm-svn: 290776
2016-12-31 00:45:06 +00:00
Craig Topper 991636312b [InstCombine][AVX-512] When turning intrinsics with masking into native IR, don't emit a select if the mask is known to be all ones.
This saves InstCombine the burden of having to optimize the select later.

llvm-svn: 290774
2016-12-30 23:06:28 +00:00
David Majnemer 5ec5f278c9 [InstCombine] Address post-commit feedback
llvm-svn: 290741
2016-12-30 03:36:17 +00:00
David Majnemer a1cfd7c5f8 [InstCombine] More thoroughly canonicalize the position of zexts
We correctly canonicalized (add (sext x), (sext y)) to (sext (add x, y))
where possible.  However, we didn't perform the same canonicalization
for zexts or for muls.

llvm-svn: 290733
2016-12-30 00:28:58 +00:00
Craig Topper 17b5568bc7 [InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC
llvm-svn: 290707
2016-12-29 07:03:18 +00:00
Craig Topper 62f06e241b [InstCombine] Fix typo in comment. NFC
llvm-svn: 290706
2016-12-29 05:38:31 +00:00
Craig Topper 2e18bcfc60 [InstCombine] Use a 32-bits instead of 64-bits for storing the number of elements in VectorType for a ShuffleVector. While there getVectorNumElements to avoid an explicit cast. NFC
llvm-svn: 290705
2016-12-29 04:24:32 +00:00
Craig Topper 1a8a3377cc [InstCombine][X86] If the lowest element of a scalar intrinsic isn't used make sure we add it to the worklist so we can DCE it sooner.
We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since its now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded.

llvm-svn: 290704
2016-12-29 03:30:17 +00:00
Craig Topper 28ec3460e4 [InstCombine] Remove a piece of a comment that said that InstCombiner contains pass infrastructure. That hasn't been true since r226618. NFC
llvm-svn: 290648
2016-12-28 03:12:42 +00:00
Michael Kuperstein cd7ad7130f [InstCombine] Canonicalize insert splat sequences into an insert + shuffle
This adds a combine that canonicalizes a chain of inserts which broadcasts
a value into a single insert + a splat shufflevector.

This fixes PR31286.

Differential Revision: https://reviews.llvm.org/D27992

llvm-svn: 290641
2016-12-28 00:18:08 +00:00
Craig Topper 72f2d4e8d6 [InstCombine][X86] Add DemandedElts support for 512-bit PMULDQ/PMULUDQ instructions
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

This builds on r290554 which added supported for 128 and 256-bit.

llvm-svn: 290582
2016-12-27 05:30:09 +00:00
Craig Topper 7f8540b5e7 [AVX-512][InstCombine] Teach InstCombine to turn masked scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.
An earlier commit added support for unmasked scalar operations. At that time isel wouldn't generate an optimal sequence for masked operations, but that has now been fixed.

llvm-svn: 290566
2016-12-27 01:56:30 +00:00
Craig Topper 020b228155 [AVX-512][InstCombine] Teach InstCombine to turn packed add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.
llvm-svn: 290559
2016-12-27 00:23:16 +00:00
Simon Pilgrim c9cf7fc7a4 [InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

Differential Revision: https://reviews.llvm.org/D28119

llvm-svn: 290554
2016-12-26 23:28:17 +00:00
Craig Topper 7b788ada2d [AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.
Summary:
I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon.

I'll do the same thing for packed add/sub/mul/div in a future patch.

Reviewers: delena, RKSimon, zvi, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27879

llvm-svn: 290535
2016-12-26 06:33:19 +00:00