Commit Graph

17134 Commits

Author SHA1 Message Date
Alexey Bataev 4015bf8372 [SLP] Refactoring of horizontal reduction analysis, NFC.
Some checks in SLP horizontal reduction analysis function are performed
several times, though it is enough to perform these checks only once
during an initial attempt at adding candidate for the reduction
instruction/reduced value.

Differential Revision: https://reviews.llvm.org/D29175

llvm-svn: 293274
2017-01-27 10:54:04 +00:00
Chandler Carruth fd2d7c72fc [LICM] When we are recomputing the alias sets for a subloop, we cannot
skip sub-subloops.

The logic to skip subloops dated from when this code was shared with the
cached case. Once it was factored out to only run in the case of
recomputed subloops it became a dangerous bug. If a subsubloop contained
an interfering instruction it would be silently skipped from the alias
sets for LICM.

With the old pass manager this was extremely hard to trigger as it would
require failing to visit these subloops with the LICM pass but then
visiting the outer loop somehow. I've not yet contrived any test case
that actually manages to trigger this.

But with the new pass manager we don't do the cross-loop caching hack
that the old PM does and so we recompute alias set information from
first principles. While this seems much cleaner and simpler it exposed
this bug and would subtly miscompile code due to failing to correctly
model the aliasing constraints of deeply nested loops.

llvm-svn: 293273
2017-01-27 10:27:32 +00:00
Richard Trieu 0b79aa3373 Fix unused variable warning.
llvm-svn: 293260
2017-01-27 06:06:05 +00:00
Daniel Berlin c479686af2 NewGVN: Add basic dead and redundant store elimination
Summary:
This adds basic dead and redundant store elimination to
NewGVN.  Unlike our current DSE, it will happily do cross-block DSE if
it meets our requirements.

We get a bunch of DSE's simple.ll cases, and some stuff it doesn't.
Unlike DSE, however, we only try to eliminate stores of the same value
to the same memory location, not just general stores to the same
memory location.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29149

llvm-svn: 293258
2017-01-27 02:37:11 +00:00
Justin Lebar 25ebe2d767 [NVPTX] [InstCombine] Add llvm_unreachable to appease MSVC.
llvm-svn: 293253
2017-01-27 02:04:07 +00:00
Justin Lebar e3ac0fb948 [NVPTX] Fix use-after-stack-free bug in InstCombineCalls.
Introduced in r293244.

llvm-svn: 293251
2017-01-27 01:49:39 +00:00
Xin Tong e5f8d643d4 Constant fold switch inst when looking for trivial conditions to unswitch on.
Summary: Constant fold switch inst when looking for trivial conditions to unswitch on.

Reviewers: sanjoy, chenli, hfinkel, efriedma

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D29037

llvm-svn: 293250
2017-01-27 01:42:20 +00:00
Chandler Carruth baabda9317 [PM] Port LoopLoadElimination to the new pass manager and wire it into
the main pipeline.

This is a very straight forward port. Nothing weird or surprising.

This brings the number of missing passes from the new PM's pipeline down
to three.

llvm-svn: 293249
2017-01-27 01:32:26 +00:00
Justin Lebar 698c31b8db [NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.
Summary:
There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.

For example, if flush denormals to zero (ftz) is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32.  On the other hand, if ftz is
disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a
non-ftz PTX instruction.  In this case, we can, however, simplify the
non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.

These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.

Reviewers: tra

Subscribers: hfinkel, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D28794

llvm-svn: 293244
2017-01-27 00:58:58 +00:00
Justin Lebar cb9b41dd76 [LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.
Summary:
Some frontends emit a speculate-and-select idiom for sqrt, wherein they compute
sqrt(x), check if x is negative, and select NaN if it is:

  %cmp = fcmp olt double %a, -0.000000e+00
  %sqrt = call double @llvm.sqrt.f64(double %a)
  %ret = select i1 %cmp, double 0x7FF8000000000000, double %sqrt

This is technically UB as the LangRef is written today if %a is ever less than
-0.  But emitting code that's compliant with the current definition of sqrt
would require a branch, which would then prevent us from matching this idiom in
SelectionDAG (which we do today -- ISD::FSQRT has defined behavior on negative
inputs), because SelectionDAG looks at one BB at a time.

Nothing in LLVM takes advantage of this undefined behavior, as far as we can
tell, and the fact that llvm.sqrt has UB dates from its initial addition to the
LangRef.

Reviewers: arsenm, mehdi_amini, hfinkel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D28797

llvm-svn: 293242
2017-01-27 00:58:03 +00:00
Sanjoy Das 7516192a71 Revert a couple of InstCombine/Guard checkins
This change reverts:

r293061: "[InstCombine] Canonicalize guards for NOT OR condition"
r293058: "[InstCombine] Canonicalize guards for AND condition"

They miscompile cases like:

```
declare void @llvm.experimental.guard(i1, ...)

define void @test_guard_not_or(i1 %A, i1 %B) {
  %C = or i1 %A, %B
  %D = xor i1 %C, true
  call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
  ret void
}
```

because they do transfer the `i32 20, i32 30` parameters to newly
created guard instructions.

llvm-svn: 293227
2017-01-26 23:38:11 +00:00
Daniel Berlin 1ea5f324bd NewGVN: Fix bug exposed by PR31761
Summary:
This does not actually fix the testcase in PR31761 (discussion is
ongoing on the testcase), but does fix a bug it exposes, where stores
were not properly clobbering loads.

We accomplish this by unifying the memory equivalence infratructure
back into the normal congruence infrastructure, and then properly
destroying congruence classes when memory state leaders disappear.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29195

llvm-svn: 293216
2017-01-26 22:21:48 +00:00
Sanjay Patel 50753f02c2 [InstCombine] fold (X >>u C) << C --> X & (-1 << C)
We already have this fold when the lshr has one use, but it doesn't need that
restriction. We may be able to remove some code from foldShiftedShift().

Also, move the similar:
(X << C) >>u C --> X & (-1 >>u C)
...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().

That whole function seems questionable since it is called by commonShiftTransforms(),
but there's really not much in common if we're checking the shift opcodes for every
fold.

llvm-svn: 293215
2017-01-26 22:08:10 +00:00
Daniel Berlin db3c7be069 NewGVN: Add algorithm overview
llvm-svn: 293212
2017-01-26 21:39:49 +00:00
Sanjay Patel b0d96d327e [InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors
llvm-svn: 293208
2017-01-26 20:52:27 +00:00
Daniel Berlin 2b83492eee NewGVN: Make unreachable blocks be marked with unreachable
llvm-svn: 293196
2017-01-26 18:30:29 +00:00
Chandler Carruth 6f4ed077d0 [LV] Fix an issue where forming LCSSA in the place that we did would
change the set of uniform instructions in the loop causing an assert
failure.

The problem is that the legalization checking also builds data
structures mapping various facts about the loop body. The immediate
cause was the set of uniform instructions. If these then change when
LCSSA is formed, the data structures would already have been built and
become stale. The included test case triggered an assert in loop
vectorize that was reduced out of the new PM's pipeline.

The solution is to form LCSSA early enough that no information is cached
across the changes made. The only really obvious position is outside of
the main logic to vectorize the loop. This also has the advantage of
removing one case where forming LCSSA could mutate the loop but we
wouldn't track that as a "Changed" state.

If it is significantly advantageous to do some legalization checking
prior to this, we can do a more careful positioning but it seemed best
to just back off to a safe position first.

llvm-svn: 293168
2017-01-26 10:41:09 +00:00
Jonas Paulsson 8e2f948ef0 [TargetTransformInfo] Refactor and improve getScalarizationOverhead()
Refactoring to remove duplications of this method.

New method getOperandsScalarizationOverhead() that looks at the present unique
operands and add extract costs for them. Old behaviour was to just add extract
costs for one operand of the type always, which still happens in
getArithmeticInstrCost() if no operands are provided by the caller.

This is a good start of improving on this, but there are more places
that can be improved by using getOperandsScalarizationOverhead().

Review: Hal Finkel
https://reviews.llvm.org/D29017

llvm-svn: 293155
2017-01-26 07:03:25 +00:00
Craig Topper b6122122c9 [X86] Add demanded elts support for the inputs to pclmul intrinsic
This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors.

Differential Revision: https://reviews.llvm.org/D28979

llvm-svn: 293151
2017-01-26 05:17:13 +00:00
Taewook Oh 0d26a5376c Revert test commit
llvm-svn: 293150
2017-01-26 04:34:25 +00:00
Taewook Oh d3f1ec9962 test commit
llvm-svn: 293148
2017-01-26 04:32:40 +00:00
Chandler Carruth eab3b90a14 [PM] Simplify the new PM interface to the loop unroller and expose two
factory functions for the two modes the loop unroller is actually used
in in-tree: simplified full-unrolling and the entire thing including
partial unrolling.

I've also wired these up to nice names so you can express both of these
being in a pipeline easily. This is a precursor to actually enabling
these parts of the O2 pipeline.

Differential Revision: https://reviews.llvm.org/D28897

llvm-svn: 293136
2017-01-26 02:13:50 +00:00
Michael Kuperstein 5dd55e8405 [LoopUnroll] Properly update loopinfo for runtime unrolling by 2
Even when we don't create a remainder loop (that is, when we unroll by 2), we
may duplicate nested loops into the remainder. This is complicated by the fact
the remainder may itself be either inserted into an outer loop, or at the top
level. In the latter case, we may need to create new top-level loops.

Differential Revision: https://reviews.llvm.org/D29156

llvm-svn: 293124
2017-01-26 01:04:11 +00:00
Davide Italiano ccbbc8313f [NewGVN] Skip uses in unreachable blocks.
Otherwise we ask for a domtree node that's not there, and we crash.

Differential Revision:  https://reviews.llvm.org/D29145

llvm-svn: 293122
2017-01-26 00:42:42 +00:00
Peter Collingbourne 1df6e858ef LowerTypeTests: Ignore external globals with type metadata.
Thanks to Davide Italiano for finding the problem and providing a test case.

llvm-svn: 293119
2017-01-26 00:32:15 +00:00
Davide Italiano b3886dd84f [NewGVN] Simplify folding a lambda used only once. NFCI.
llvm-svn: 293112
2017-01-25 23:37:49 +00:00
Daniel Berlin d602e04c9e MemorySSA: Link all defs together into an intrusive defslist, to make updater easier
Summary:
This is the first in a series of patches to add a simple, generalized updater to MemorySSA.

For MemorySSA, every def is may-def, instead of the normal must-def.
(the best way to think of memoryssa is "everything is really one variable, with different versions of that variable at different points in the program).
This means when updating, we end up having to do a bunch of work to touch defs below and above us.

In order to support this quickly, i have ilist'd all the defs for each block.  ilist supports tags, so this is quite easy. the only slightly messy part is that you can't have two iplists for the same type that differ only whether they have the ownership part enabled or not, because the traits are for the value type.

The verifiers have been updated to test that the def order is correct.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29046

llvm-svn: 293085
2017-01-25 20:56:19 +00:00
Krzysztof Parzyszek 0fd6296b82 Add loop pass insertion point EP_LateLoopOptimizations
Differential Revision: https://reviews.llvm.org/D28694

llvm-svn: 293067
2017-01-25 16:12:25 +00:00
Artur Pilipenko 8fb3d57e67 [Guards] Introduce loop-predication pass
This patch introduces guard based loop predication optimization. The new LoopPredication pass tries to convert loop variant range checks to loop invariant by widening checks across loop iterations. For example, it will convert

  for (i = 0; i < n; i++) {
    guard(i < len);
    ...
  }

to

  for (i = 0; i < n; i++) {
    guard(n - 1 < len);
    ...
  }

After this transformation the condition of the guard is loop invariant, so loop-unswitch can later unswitch the loop by this condition which basically predicates the loop by the widened condition:

  if (n - 1 < len)
    for (i = 0; i < n; i++) {
      ...
    } 
  else
    deoptimize

This patch relies on an NFC change to make ScalarEvolution::isMonotonicPredicate public (revision 293062).

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D29034

llvm-svn: 293064
2017-01-25 16:00:44 +00:00
Artur Pilipenko b85f7a5d99 [InstCombine] Canonicalize guards for NOT OR condition
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D29075

Patch by Maxim Kazantsev.

llvm-svn: 293061
2017-01-25 14:45:12 +00:00
Simon Pilgrim 6f6b279109 [InstCombine][SSE] Add support for PACKSS/PACKUS constant folding
Differential Revision: https://reviews.llvm.org/D28949

llvm-svn: 293060
2017-01-25 14:37:24 +00:00
Artur Pilipenko 4df4c4a4aa [InstCombine] Canonicalize guards for AND condition
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D29074

Patch by Maxim Kazantsev.

llvm-svn: 293058
2017-01-25 14:20:52 +00:00
Artur Pilipenko e812ca00bb [InstCombine] Allow InstrCombine to remove one of adjacent guards if they are equivalent
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: majnemer, apilipenko

Differential Revision: https://reviews.llvm.org/D29071

Patch by Maxim Kazantsev.

llvm-svn: 293056
2017-01-25 14:12:12 +00:00
Alexey Bataev d28ab559a7 [SLP] Improve horizontal vectorization for non-power-of-2 number of
instructions.

If number of instructions in horizontal reduction list is not power of 2
then only PowerOf2Floor(NumberOfInstructions) last elements are actually
vectorized, other instructions remain scalar. Patch tries to vectorize
the remaining elements either.

Differential Revision: https://reviews.llvm.org/D28959

llvm-svn: 293042
2017-01-25 09:54:38 +00:00
Akira Hatanaka 4ec7b20ef6 [SimplifyCFG] Do not sink and merge inline-asm instructions.
Conservatively disable sinking and merging inline-asm instructions as doing so
can potentially create arguments that cannot satisfy the inline-asm constraints.

For example, SimplifyCFG used to do the following transformation:

(before)
if.then:
  %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 8)
  br label %if.end
if.else:
  %1 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 6)
  br label %if.end

(after)
  %.sink = select i1 %tobool, i32 6, i32 8
  %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 %.sink)

This would result in a crash in the backend since only immediate integer operands
are permitted for constraint "n".

rdar://problem/30110806

Differential Revision: https://reviews.llvm.org/D29111

llvm-svn: 293025
2017-01-25 06:21:51 +00:00
Chandler Carruth ce40fa13ce [PM] Teach LoopUnroll to update the LPM infrastructure as it unrolls
loops.

We do this by reconstructing the newly added loops after the unroll
completes to avoid threading pass manager details through all the mess
of the unrolling infrastructure.

I've enabled some extra assertions in the LPM to try and catch issues
here and enabled a bunch of unroller tests to try and make sure this is
sane.

Currently, I'm manually running loop-simplify when needed. That should
go away once it is folded into the LPM infrastructure.

Differential Revision: https://reviews.llvm.org/D28848

llvm-svn: 293011
2017-01-25 02:49:01 +00:00
Gor Nishanov df3d71a7a9 [coroutines] Spill the result of the invoke instruction correctly
Summary:
When we decide that the result of the invoke instruction need to be spilled, we need to insert the spill into a block that is on the normal edge coming out of the invoke instruction. (Prior to this change the code would insert the spill immediately after the invoke instruction, which breaks the IR, since invoke is a terminator instruction).

In the following example, we will split the edge going into %cont and insert the spill there.

```
  %r = invoke double @print(double 0.0) to label %cont unwind label %pad

  cont:
    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %0, label %suspend [i8 0, label %resume
                                  i8 1, label %cleanup]
  resume:
    call double @print(double %r)
```

Reviewers: majnemer

Reviewed By: majnemer

Subscribers: mehdi_amini, llvm-commits, EricWF

Differential Revision: https://reviews.llvm.org/D29102

llvm-svn: 293006
2017-01-25 02:25:54 +00:00
Dehao Chen a5eb1689dc Explicitly promote indirect calls before sample profile annotation.
Summary: In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation.

Reviewers: xur, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29040

llvm-svn: 292979
2017-01-24 21:05:51 +00:00
Daniel Berlin 390dfde0f3 Remove the load hoisting code of MLSM, it is completely subsumed by GVNHoist
Summary:
GVNHoist performs all the optimizations that MLSM does to loads, in a
more general way, and in a faster time bound (MLSM is N^3 in most
cases, N^4 in a few edge cases).

This disables the load portion.

Note that the way ld_hoist_st_sink.ll is written makes one think that
the loads should be moved to the while.preheader block, but

1. Neither MLSM nor GVNHoist do it (they both move them to identical places).

2. MLSM couldn't possibly do it anyway, as the while.preheader block
is not the head of the diamond, while.body is.  (GVNHoist could do it
if it was legal).

3. At a glance, it's not legal anyway because the in-loop load
conflict with the in-loop store, so the loads must stay in-loop.

I am happy to update the test to use update_test_checks so that
checking is tighter, just was going to do it as a followup.

Note that i can find no particular benefit to the store portion on any
real testcase/benchmark i have (even size-wise).  If we really still
want it, i am happy to commit to writing a targeted store sinker, just
taking the code from the MemorySSA port of MergedLoadStoreMotion
(which is N^2 worst case, and N most of the time).

We can do what it does in a much better time bound.

We also should be both hoisting and sinking stores, not just sinking
them, anyway, since whether we should hoist or sink to merge depends
basically on luck of the draw of where the blockers are placed.

Nonetheless, i have left it alone for now.

Reviewers: chandlerc, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29079

llvm-svn: 292971
2017-01-24 19:55:36 +00:00
Amaury Sechet d90f5f6698 Use InstCombine's builder in foldSelectCttzCtlz instead of creating a new one.
Summary: As per title. This will add the instructiions we are interested in in the worklist.

Reviewers: mehdi_amini, majnemer, andreadb

Differential Revision: https://reviews.llvm.org/D29081

llvm-svn: 292957
2017-01-24 17:48:25 +00:00
Amaury Sechet 5da456e6a1 Fix formating in foldSelectCttzCtlz. NFC
llvm-svn: 292934
2017-01-24 14:22:27 +00:00
Chandler Carruth 6acdca78a0 [PH] Replace uses of AssertingVH from members of analysis results with
a lazy-asserting PoisoningVH.

AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.

This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.

The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.

The rest is straight cleanup.

I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.

Differential Revision: https://reviews.llvm.org/D29006

llvm-svn: 292928
2017-01-24 12:55:57 +00:00
Simon Pilgrim 78f8630ac0 [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now

Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst

llvm-svn: 292913
2017-01-24 11:07:41 +00:00
Alexey Bataev 9f8bb384af [SLP] Refactoring of HorizontalReduction class, NFC.
Removed data members ReduxWidth and MinVecRegSize + some C++11 stylish
improvements.

Differential Revision: https://reviews.llvm.org/D29010

llvm-svn: 292899
2017-01-24 08:57:17 +00:00
Serge Pavlov 098ee2fe02 Update domtree incrementally in loop peeling.
With this change dominator tree remains in sync after each step of loop
peeling.

Differential Revision: https://reviews.llvm.org/D29029

llvm-svn: 292895
2017-01-24 06:58:39 +00:00
Kostya Serebryany 4b2ff07c11 [sanitizer-coverage] emit __sanitizer_cov_trace_pc_guard w/o a preceding 'if' by default. Update the docs, also add deprecation notes around other parts of sanitizer coverage
llvm-svn: 292862
2017-01-24 00:57:31 +00:00
Matt Arsenault 954a624fb9 SimplifyLibCalls: Replace more unary libcalls with intrinsics
llvm-svn: 292855
2017-01-23 23:55:08 +00:00
Michael Kuperstein 461aa57ad3 [LoopUnroll] First form LCSSA, then loop-simplify
Running non-LCSSA-preserving LoopSimplify followed by LCSSA on (roughly) the
same loop is incorrect, since LoopSimplify may break LCSSA arbitrarily higher
in the loop nest. Instead, run LCSSA first, and then run LCSSA-preserving
LoopSimplify on the result.

This fixes PR31718.

Differential Revision: https://reviews.llvm.org/D29055

llvm-svn: 292854
2017-01-23 23:45:42 +00:00
Dehao Chen 14bf029053 Makes promoteIndirectCall an external function.
Summary: promoteIndirectCall should be a utility function that could be invoked by other optimization passes.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29051

llvm-svn: 292850
2017-01-23 23:18:24 +00:00
David L. Jones d21529fa0d [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).

Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.

The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)

There are additional changes required in clang.

Reviewers: rsmith

Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28476

llvm-svn: 292848
2017-01-23 23:16:46 +00:00
Evgeniy Stepanov 29e56bceec Revert "Refactor SampleProfile.cpp to move computation inside a branch. (NFC)"
Causes MSan failures on the buildbot.

llvm-svn: 292840
2017-01-23 22:40:08 +00:00
Xinliang David Li cb253ce90b [PGO] add debug option to view annotated cfg after prof use annotation
Differential Revision: http://reviews.llvm.org/D28967 

llvm-svn: 292815
2017-01-23 18:58:24 +00:00
Dehao Chen a53219a5f2 Refactor SampleProfile.cpp to move computation inside a branch. (NFC)
llvm-svn: 292803
2017-01-23 17:09:02 +00:00
Simon Pilgrim f6f3a36159 [InstCombine][X86] Add MULDQ/MULUDQ constant folding support
llvm-svn: 292793
2017-01-23 15:22:59 +00:00
Amaury Sechet 2fec7e4f44 Tweak ASCII art in Simplify CFG. NFC
llvm-svn: 292792
2017-01-23 15:13:01 +00:00
Simon Pilgrim bb13fdabec [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Match generic mul behaviour so that <X x i64> multiply and muldq/muludq pattern act the same

llvm-svn: 292784
2017-01-23 12:07:32 +00:00
Chandler Carruth e8c66b2766 [PM] Replace the hard invalidate in JumpThreading for LVI with correct
invalidation of deleted functions in GlobalDCE.

This was always testing a bug really triggered in GlobalDCE. Right now
we have analyses with asserting value handles into IR. As long as those
remain, when *deleting* an IR unit, we cannot wait for the normal
invalidation scheme to kick in even though it was designed to work
correctly in the face of these kinds of deletions. Instead, the pass
needs to directly handle invalidating the analysis results pointing at
that IR unit.

I've tought the Inliner about this and this patch teaches GlobalDCE.
This will handle the asserting VH case in the existing test as well as
other issues of the same fundamental variety. I've moved the test into
the GlobalDCE directory and added a comment explaining what is going on.

Note that we cannot simply require LVI here because LVI is too lazy.

llvm-svn: 292773
2017-01-23 08:33:24 +00:00
Chandler Carruth 9524af4aac [PM] Clear any analyses for a dead function after inlining it and before
clearing its body. This is essential to avoid triggering asserting value
handles in analyses on the function's body.

I'm working on a test case for this behavior in LLVM, but Clang has
a great one that managed to trigger this on all of the bots already.

llvm-svn: 292770
2017-01-23 07:03:41 +00:00
Chandler Carruth a504f2b8e8 [PM] Teach LVI to correctly invalidate itself when its dependencies
become unavailable.

The AssumptionCache is now immutable but it still needs to respond to
DomTree invalidation if it ended up caching one.

This lets us remove one of the explicit invalidates of LVI but the
other one continues to avoid hitting a latent bug.

llvm-svn: 292769
2017-01-23 06:35:12 +00:00
Chandler Carruth b698d5964d [PM] Fix a really nasty bug introduced when adding PGO support to the
new PM's inliner.

The bug happens when we refine an SCC after having computed a proxy for
the FunctionAnalysisManager, and then proceed to compute fresh analyses
for functions in the *new* SCC using the manager provided by the old
SCC's proxy. *And* when we manage to mutate a function in this new SCC
in a way that invalidates those analyses. This can be... challenging to
reproduce.

I've managed to contrive a set of functions that trigger this and added
a test case, but it is a bit brittle. I've directly checked that the
passes run in the expected ways to help avoid the test just becoming
silently irrelevant.

This gets the new PM back to passing the LLVM test suite after the PGO
improvements landed.

llvm-svn: 292757
2017-01-22 10:34:01 +00:00
Chandler Carruth d4be9f4b8d [PM] Add some debug logging to the new PM inliner to make it easier to
trace its behavior.

llvm-svn: 292756
2017-01-22 10:33:58 +00:00
Sanjay Patel 478a83c905 [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants
We may be able to assert that no shl-shl or lshr-lshr pairs ever get here
because we should have already handled those in foldShiftedShift().

llvm-svn: 292726
2017-01-21 17:59:59 +00:00
Chandler Carruth 7fd29cef42 [PM] Sink an LCSSA preservation assert from the LoopSimplify pass into
the library routine shared with the new PM and other code.

This assert checks that when LCSSA preservation is requested we start in
LCSSA form. Without this early assert, given *very* complex test cases
we can hit an assert or crash much later on when trying to preserve
LCSSA.

The new PM's loop simplify doesn't need to (and indeed can't) preserve
LCSSA as the new PM doesn't deal in transforms in the dependency graph.
But we asked the library to and shockingly, this didn't work very well!
Stop doing that. Now the assert will tell us immediately with existing
test cases. Before this, it took a pretty convoluted input to trigger
this.

However, sinking the assert also found a bug in LoopUnroll where we
asked simplifyLoop to preserve LCSSA *right before we reform it*. That's
kinda silly and unsurprising that it wasn't available. =D Stop doing
that too.

We also would assert that the unrolled loop was in LCSSA even if
preserving LCSSA was never requested! I don't have a test case or
anything here. I spotted it by inspection and it seems quite obvious. No
logic change anyways, that's just avoiding a spurrious assert.

llvm-svn: 292710
2017-01-21 04:16:53 +00:00
Michael Kuperstein 807982359d [SLP] Make ReductionOpcode have the right (enum) type. NFC.
llvm-svn: 292703
2017-01-21 02:03:03 +00:00
Anmol P. Paralkar 910dc8de3f MergeFunctions: Preserve debug info in thunks, under option -mergefunc-preserve-debug-info
Summary:
Under option -mergefunc-preserve-debug-info we:
- Do not create a new function for a thunk.
- Retain the debug info for a thunk's parameters (and associated
  instructions for the debug info) from the entry block.
  Note: -debug will display the algorithm at work.
- Create debug-info for the call (to the shared implementation) made by
  a thunk and its return value.
- Erase the rest of the function, retaining the (minimally sized) entry
  block to create a thunk.
- Preserve a thunk's call site to point to the thunk even when both occur
  within the same translation unit, to aid debugability. Note that this
  behaviour differs from the underlying -mergefunc implementation which
  modifies the thunk's call site to point to the shared implementation
  when both occur within the same translation unit.

Reviewers: echristo, eeckstein, dblaikie, aprantl, friss

Reviewed By: aprantl

Subscribers: davide, fhahn, jfb, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D28075

llvm-svn: 292702
2017-01-21 02:02:56 +00:00
Peter Collingbourne b365d921cf LowerTypeTests: Fix use-after-free. Found by asan/msan.
llvm-svn: 292700
2017-01-21 01:57:44 +00:00
Michael Kuperstein f8458593cf [SLP] Delete useless helper. NFC.
The helper contained a branch for a special case that is unnecessary,
and a cast.

llvm-svn: 292698
2017-01-21 01:33:25 +00:00
Davide Italiano 71f2d9c2d5 [NewGVN] Optimize processing for instructions found trivially dead.
Don't call `isTriviallyDeadInstructions()` once we discover that
an instruction is dead. Instead, set DFS number zero (as suggested
by Danny) and forget about it (this also speeds up things as we
won't try to reprocess that block).

Differential Revision:  https://reviews.llvm.org/D28930

llvm-svn: 292676
2017-01-20 23:29:28 +00:00
Peter Collingbourne 67addbcacf LowerTypeTests: Simplify; always create SizeM1 with type IntPtrTy, move initialization out of if statement.
llvm-svn: 292674
2017-01-20 23:22:28 +00:00
Dehao Chen 77079003dd Add indirect call promotion to SamplePGO
Summary: This patch adds metadata for indirect call promotion in the sample profile loader.

Reviewers: xur, davidxl, dnovillo

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28923

llvm-svn: 292672
2017-01-20 22:56:07 +00:00
Easwaran Raman 12585b0148 Improve PGO support for the new inliner
This adds the following to the new PM based inliner in PGO mode:

* Use block frequency analysis to derive callsite's profile count and use
that to adjust thresholds of hot and cold callsites.

* Incrementally update the BFI of the caller after a callee gets inlined
into it. This incremental update is only within an invocation of the run
method - BFI is not preserved across calls to run.
Update the function entry count of the callee after inlining it into a
caller.

* I've tuned the thresholds for the hot and cold callsites using a hacked
up version of the old inliner that explicitly computes BFI on a set of
internal benchmarks and spec. Once the new PM based pipeline stabilizes
(IIRC Chandler mentioned there are known issues) I'll benchmark this
again and adjust the thresholds if required.
Inliner PGO support.

Differential revision: https://reviews.llvm.org/D28331

llvm-svn: 292666
2017-01-20 22:44:04 +00:00
Peter Collingbourne e02b74e294 IPO, LTO: Plumb the summary from the LTO API into the pass manager.
Differential Revision: https://reviews.llvm.org/D28840

llvm-svn: 292661
2017-01-20 22:18:52 +00:00
Teresa Johnson 4566c6db87 [ThinLTO] Drop non-prevailing non-ODR weak to declarations
Summary:
Allow non-ODR weak/linkonce non-prevailing copies to be marked
as available_externally in the index. Add support for dropping these to
declarations in the backend.

Reviewers: mehdi_amini, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28806

llvm-svn: 292656
2017-01-20 21:54:58 +00:00
Peter Collingbourne f04a390099 LowerTypeTests: Implement importing of type identifiers.
To import a type identifier we read the summary and create external
references to the symbols defined when exporting.

Differential Revision: https://reviews.llvm.org/D28546

llvm-svn: 292654
2017-01-20 21:49:34 +00:00
Daniel Berlin 26addef1a0 NewGVN: Fix PR 31686 and PR 31698 by rewriting store leader handling.
Summary:

This rewrites store expression/leader handling.  We no longer use the
value operand as the leader, instead, we store it separately.  We also
now store the stored value as part of the expression, and compare it
when comparing stores for equality.  This enables us to get rid of a
bunch of our previous hacks and machinations, as the existing
machinery takes care of everything *except* updating the stored value
on classes.  The only time we have to update it is if the storecount
goes to 0, and when we do, we destroy it.

Since we no longer use the value operand as the leader, during elimination, we have to use the value operand.  Doing this also fixes a bunch of store forwarding cases we were missing.

Any value operand we use is guaranteed to either be updated by previous eliminations, or minimized by future ones.

(IE the fact that we don't use the most dominating value operand when it's not a constant does not affect anything).

Sadly, this change also exposes that we didn't pay attention to the
output of the pr31594.ll test, as it also very clearly exposes the
same store leader bug we are fixing here.

(I added pr31682.ll anyway, but maybe we think that's too large to be useful)

On the plus side, propagate-ir-flags.ll now passes due to the
corrected store forwarding.

This change was 3 stage'd on darwin and linux, with the full test-suite.

Reviewers:
davide
Subscribers:
llvm-commits

llvm-svn: 292648
2017-01-20 21:04:30 +00:00
Peter Collingbourne ee1416037e LowerTypeTests: Compute SizeM1BitWidth in exportTypeId. NFCI.
This avoids needing to store it in a separate field in TypeIdLowering.

llvm-svn: 292647
2017-01-20 20:57:40 +00:00
Simon Pilgrim a50a93fcd0 [InstCombine][X86] Add MULDQ/MULUDQ undef handling
llvm-svn: 292627
2017-01-20 18:20:30 +00:00
Simon Pilgrim 51b3b98e3a [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions
Simplify a packss/packus truncation based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28777

llvm-svn: 292591
2017-01-20 09:28:21 +00:00
Chandler Carruth e9b18e3d34 [PM] Port LoopSink to the new pass manager.
Like several other loop passes (the vectorizer, etc) this pass doesn't
really fit the model of a loop pass. The critical distinction is that it
isn't intended to be pipelined together with other loop passes. I plan
to add some documentation to the loop pass manager to make this more
clear on that side.

LoopSink is also different because it doesn't really need a lot of the
infrastructure of our loop passes. For example, if there aren't loop
invariant instructions causing a preheader to exist, there is no need to
form a preheader. It also doesn't need LCSSA because this pass is
only involved in sinking invariant instructions from a preheader into
the loop, not reasoning about live-outs.

This allows some nice simplifications to the pass in the new PM where we
can directly walk the loops once without restructuring them.

Differential Revision: https://reviews.llvm.org/D28921

llvm-svn: 292589
2017-01-20 08:42:19 +00:00
Chandler Carruth 1725c8c315 [LoopSink] Trivial comment cleanup.
llvm-svn: 292588
2017-01-20 08:42:14 +00:00
Daniel Berlin 89fea6fd9d NewGVN: Fix PR 31682, an overactive assert.
Part of the assert has been left active for further debugging.
The other part has been turned into a stat for tracking for the
moment.

llvm-svn: 292583
2017-01-20 06:38:41 +00:00
Dehao Chen 94f369fc7f clang-format SampleProfile.cpp (NFC)
llvm-svn: 292533
2017-01-19 23:20:31 +00:00
Davide Italiano 6c2c3e07bf [SCCP] Teach the pass how to handle `div` with overdefined operands.
This can prove that:

extern int f;
int g() {
    int x = 0;
    for (int i = 0; i < 365; ++i) {
        x /= f;
    }
    return x;
}

always returns zero. Thanks to Sanjoy for confirming this
transformation actually made sense (bugs are mine).

llvm-svn: 292531
2017-01-19 23:07:51 +00:00
Davide Italiano 93c6c18a85 [SCCP] Update comment in visitBinaryOp() after recent changes.
llvm-svn: 292519
2017-01-19 21:07:42 +00:00
Xin Tong 5ee40ba400 Improve what can be promoted in LICM.
Summary:
In case of non-alloca pointers, we check for whether it is a pointer
from malloc-like calls and it is not captured. In such case, we can
promote the pointer, as the caller will have no way to access this pointer
even if there is unwinding in middle of the loop.

Reviewers: hfinkel, sanjoy, reames, eli.friedman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28834

llvm-svn: 292510
2017-01-19 19:31:40 +00:00
Davide Italiano 2ef8c4e708 [InstCombine] Simplify gep (gep p, a), (b-a)
Patch by Andrea Canciani.

Differential Revision:  https://reviews.llvm.org/D27413

llvm-svn: 292506
2017-01-19 18:51:56 +00:00
Sanjay Patel 291c3d8ff2 [InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1
Try harder to fold icmp with shl nsw as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html

This is similar to the 'shl nuw' transforms that were added with D25913.

This may eventually help solve:
https://llvm.org/bugs/show_bug.cgi?id=30773

Differential Revision: https://reviews.llvm.org/D28406

llvm-svn: 292492
2017-01-19 16:12:10 +00:00
Mikael Holmen 8bf15614fb Test commit access, remove trailing whitespace
llvm-svn: 292482
2017-01-19 13:35:13 +00:00
Peter Collingbourne 22d9d3cdce LowerTypeTests: Implement exporting of type identifiers.
Type identifiers are exported by:
- Adding coarse-grained information about how to test the type
  identifier to the summary.
- Creating symbols in the object file (aliases and absolute symbols)
  containing fine-grained information about the type identifier.

Differential Revision: https://reviews.llvm.org/D28424

llvm-svn: 292462
2017-01-19 01:20:11 +00:00
Michael Kuperstein 230867e583 [LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them
This changes the vectorizer to explicitly use the loopsimplify and lcssa utils,
instead of "requiring" the transformations as if they were analyses.

This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA
for all loops, but rather only for the loops we expect to modify.

Differential Revision: https://reviews.llvm.org/D28868

llvm-svn: 292456
2017-01-19 00:42:28 +00:00
Eli Friedman 0a2174533e Preserve domtree and loop-simplify for runtime unrolling.
Mostly straightforward changes; we just didn't do the computation before.
One sort of interesting change in LoopUnroll.cpp: we weren't handling
dominance for children of the loop latch correctly, but
foldBlockIntoPredecessor hid the problem for complete unrolling.

Currently punting on loop peeling; made some minor changes to isolate
that problem to LoopUnrollPeel.cpp.

Adds a flag -unroll-verify-domtree; it verifies the domtree immediately
after we finish updating it. This is on by default for +Asserts builds.

Differential Revision: https://reviews.llvm.org/D28073

llvm-svn: 292447
2017-01-18 23:26:37 +00:00
Sanjay Patel ae23d65a7d [InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI
llvm-svn: 292440
2017-01-18 21:16:12 +00:00
Sanjay Patel 589de5ea4e [InstCombine] remove a redundant check; NFCI
I missed deleting this check when I refactored this chunk in:
https://reviews.llvm.org/rL292260

llvm-svn: 292433
2017-01-18 20:09:59 +00:00
Peter Collingbourne 20a00933fb ThinLTOBitcodeWriter: Clear comdats on filtered globals.
Differential Revision: https://reviews.llvm.org/D28839

llvm-svn: 292431
2017-01-18 20:03:02 +00:00
Peter Collingbourne 10e3b12c7a Cloning: Copy comdats when cloning globals.
Differential Revision: https://reviews.llvm.org/D28838

llvm-svn: 292430
2017-01-18 20:02:31 +00:00
Michael Kuperstein 0de990da16 Fix up a comment. NFC.
llvm-svn: 292425
2017-01-18 19:05:48 +00:00
Michael Kuperstein 7cefb409b0 [LV] Allow reductions that have several uses outside the loop
We currently check whether a reduction has a single outside user. We don't
really need to require that - we just need to make sure a single value is
used externally. The number of external users of that value shouldn't actually
matter.

Differential Revision: https://reviews.llvm.org/D28830

llvm-svn: 292424
2017-01-18 19:02:52 +00:00
Davide Italiano bca9d73309 [NewGVN] We don't use postdom info anymore. Update.
Differential Revision:  https://reviews.llvm.org/D28842

llvm-svn: 292421
2017-01-18 18:42:28 +00:00
Simon Pilgrim fe2c0ed4cf [InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles
Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.

llvm-svn: 292371
2017-01-18 14:47:49 +00:00
Simon Pilgrim a22c3a1c0f [InstCombine] Remove unnecessary intrinsics demanded elts handling
As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need.

llvm-svn: 292365
2017-01-18 13:44:04 +00:00
Chandler Carruth 8aaad7c4d9 [LoopDeletion] (cleanup, NFC) Fix one more local variable that didn't
follow LLVM's naming conventions while I'm here.

Again, sorry I didn't spot this earlier to coalesce with other cleanup
changes.

llvm-svn: 292333
2017-01-18 02:43:01 +00:00
Chandler Carruth d50c5fb13f [PM] Teach LoopDeletion to correctly update the LPM when loops are
deleted.

I've expanded its test coverage a bit including adding one test that
will crash clearly without this change.

llvm-svn: 292332
2017-01-18 02:41:26 +00:00
Eugene Zelenko 34c23279c2 [Target, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292320
2017-01-18 00:57:48 +00:00
Xin Tong 99c3da0e8b Skip loop header while we can when computing loop safety info
llvm-svn: 292310
2017-01-18 00:15:11 +00:00
Dehao Chen c3f87f02b1 Introduce -unroll-partial-threshold to separate PartialThreshold from Threshold in loop unorller.
Summary: Partial unrolling should have separate threshold with full unrolling.

Reviewers: efriedma, mzolotukhin

Reviewed By: efriedma, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28831

llvm-svn: 292293
2017-01-17 23:39:33 +00:00
Chandler Carruth 80de5e6e01 [LoopDeletion] (cleanup, NFC) Use the dedicated helper to get a single
unique exit block if available rather than rolling it ourselves.

This is a little disappointing because that helper doesn't do anything
clever to short-circuit the (surprisingly expensive) computation of all
exit blocks. What's worse is that the way we compute this is hopelessly,
hilariously inefficient. We're literally computing the same information
two different ways and multiple times each way:
- hasDedicatedExits computes the exit block set and then looks at the
  predecessors of each
- getExitingBlocks computes the set of loop blocks which have exiting
  successors
- getUniqueExitBlock(s) computes the set of non-loop blocks reached from
  loop blocks (sound familiar?)

Anyways, at some point we should clean all of this up in the LoopInfo
API, but for now just simplifying the user I'm about to touch.

llvm-svn: 292282
2017-01-17 22:28:52 +00:00
Chandler Carruth aa885c990b [LoopDeletion] (cleanup, NFC) Fix another variable name to match LLVM
conventions, missed this one in a previous cleanup patch (sorry).

llvm-svn: 292279
2017-01-17 22:19:56 +00:00
Chandler Carruth bd551e9674 [LoopDeletion] (cleanup, NFC) Remove a pointless comment.
I hope that for any code, it is changed only with good reason and only
when the author knows what they are doing...

There is of course good reason to comment here about the subtlety of the
process, and I've left that comment in tact.

llvm-svn: 292275
2017-01-17 22:09:28 +00:00
Chandler Carruth 26169f001c [LoopDeletion] (cleanup, NFC) Make simple helper functions static
instead of members.

No state was being provided by the object so this seems strictly
simpler.

I've also tried to improve the name and comments for the functions to
more thoroughly document what they are doing.

llvm-svn: 292274
2017-01-17 22:07:26 +00:00
Chandler Carruth bb7e4b46e9 [LoopDeletion] (cleanup, NFC) Stop passing around reference to a vector
that we know has exactly one element when all we are going to do is get
that one element out of it.

Instead, pass around that one element.

There are more simplifications to come in this code...

llvm-svn: 292273
2017-01-17 22:00:52 +00:00
Chandler Carruth 04a73879a8 [PM] Clean up variable and parameter names to match modern LLVM naming
conventions more conistently before hacking on this code to integrate
nicely with new PM's loop pass infrastructure. NFC.

llvm-svn: 292272
2017-01-17 21:51:39 +00:00
Sanjay Patel 14715b3c2a [InstCombine] refactor foldICmpShlConstant(); NFCI
This reduces the size of and increases the symmetry with the planned functional change in:
https://reviews.llvm.org/D28406

llvm-svn: 292260
2017-01-17 21:25:16 +00:00
Matthew Simpson 3fbdaa5906 [LV] Mark non-consecutive-like pointers non-uniform
If a memory instruction will be vectorized, but it's pointer operand is
non-consecutive-like, the instruction is a gather or scatter operation. Its
pointer operand will be non-uniform. This should fix PR31671.

Reference: https://llvm.org/bugs/show_bug.cgi?id=31671
Differential Revision: https://reviews.llvm.org/D28819

llvm-svn: 292254
2017-01-17 20:51:39 +00:00
Dan Gohman 1209c7ac16 [WebAssembly] Add triple support for the new wasm object format
Differential Revision: https://reviews.llvm.org/D26701

llvm-svn: 292252
2017-01-17 20:34:09 +00:00
Sanjoy Das 6de072a712 [EarlyCSE] Don't DSE across readnone functions that may throw
Summary: Depends on D28740

Reviewers: dberlin, chandlerc, hfinkel, majnemer

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D28741

llvm-svn: 292249
2017-01-17 20:15:47 +00:00
David Majnemer de55c606d1 [InstCombine] Fold ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2)
This further extends r292179 to support additional binary operators
beyond subtraction.

llvm-svn: 292238
2017-01-17 18:08:06 +00:00
Sanjay Patel 5424bd2625 [InstCombine] reduce indent; NFCI
llvm-svn: 292230
2017-01-17 16:59:09 +00:00
Simon Pilgrim d4eb800b03 [InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions
Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded.

llvm-svn: 292209
2017-01-17 11:35:03 +00:00
Sanjoy Das 679bc32c6a [InstCombine] Don't DSE across readnone functions that may throw
Summary: Depends on D28740

Reviewers: dberlin, chandlerc, hfinkel, majnemer

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D28742

llvm-svn: 292197
2017-01-17 05:45:09 +00:00
David Majnemer 36d382b773 [InstCombine] Fold ((C1-zext(X)) & C2) -> zext((C1-X) & C2)
This is valid if C2 fits within the bitwidth of X thanks to two's
complement modulo arithmetic.

llvm-svn: 292179
2017-01-17 00:45:57 +00:00
Matt Arsenault b948b4d8df SimplifyLibCalls: Remove checks for fabs
Use the intrinsic instead of emitting the libcall which
will be replaced by the intrinsic.

llvm-svn: 292176
2017-01-17 00:30:31 +00:00
Matt Arsenault 7233344c28 SimplifyLibCalls: Replace fabs libcalls with intrinsics
Add missing fabs(fpext) optimzation that worked with the call,
and also fixes it creating a second fpext when there were multiple
uses.

llvm-svn: 292172
2017-01-17 00:10:40 +00:00
Sanjay Patel da5682afdd [InstCombine] use m_APInt instead of faking it
llvm-svn: 292164
2017-01-16 21:24:41 +00:00
Sanjay Patel 65cce20caa [InstCombine] fix names in canEvaluateShiftedShift(); NFC
It's not clear what 'First' and 'Second' mean, so use 'Inner' and 'Outer'
to match foldShiftedShift() and add comments with formulas, so it's easier
to see what's going on.

llvm-svn: 292153
2017-01-16 20:05:26 +00:00
Sanjay Patel ab8b32de71 [InstCombine] use m_APInt to allow shift-shift folds for vectors with splat constants
Some existing 'FIXME' tests are still not folded because of splat holes in value tracking.

llvm-svn: 292151
2017-01-16 19:35:45 +00:00
Sanjay Patel 646734a6cd [InstCombine] refactor shift-of-shift folds; NFCI
Reduces code duplication and makes it easier to extend these folds for vectors.

llvm-svn: 292145
2017-01-16 17:27:50 +00:00
Simon Pilgrim 73a68c25a0 [InstCombine][SSE] Add DemandedElts support for PSHUFB instructions
Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28745

llvm-svn: 292101
2017-01-16 11:30:41 +00:00
Sanjay Patel 20aaf58543 [InstCombine] fix formatting; NFC
llvm-svn: 292073
2017-01-15 17:55:35 +00:00
Sanjay Patel 5f8451afad [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants
llvm-svn: 292064
2017-01-15 16:38:19 +00:00
Daniel Berlin aac56849a1 NewGVN: Change a bunch of densemap find_or_creates to lookups, since they should not be creating new entries
llvm-svn: 292059
2017-01-15 09:18:41 +00:00
Chandler Carruth ca68a3ec47 [PM] Introduce an analysis set used to preserve all analyses over
a function's CFG when that CFG is unchanged.

This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.

I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.

Differential Revision: https://reviews.llvm.org/D28627

llvm-svn: 292054
2017-01-15 06:32:49 +00:00
Eric Fiselier 0a9eb89cf9 Give comparator const call operator
llvm-svn: 292043
2017-01-15 02:06:44 +00:00
Chandler Carruth 2f19a324cb [PM] The assumption cache is fundamentally designed to be self-updating,
mark it as never invalidated in the new PM.

The old PM already required this to work, and after a discussion with
Hal this seems to really be the only sensible answer. The cache
gracefully degrades as the IR is mutated, and most things which do this
should already be incrementally updating the cache.

This gets rid of a bunch of logic preserving and testing the
invalidation of this analysis.

llvm-svn: 292039
2017-01-15 00:26:18 +00:00
Chandler Carruth 5edfd4d99e [PM] Fix instcombine's analysis preservation in the new pass manager to
cover domtree and alias analysis. These are the pretty clear analyses
that we would always want to survive this pass.

To make these survive, we also need to preserve the assumption cache.

Added a test that verifies the important bits of this preservation.

llvm-svn: 292037
2017-01-14 23:25:22 +00:00
Sanjay Patel ca3124f74b [InstCombine] clean up visitAshr(); NFCI
llvm-svn: 292036
2017-01-14 23:13:50 +00:00
Davide Italiano 6d28500ff9 [NewGVN] Fix a warning from GCC.
Patch by Gonsolo.
Differential Revision:  https://reviews.llvm.org/D28731

llvm-svn: 292031
2017-01-14 20:44:08 +00:00
Davide Italiano ed67f1978e [NewGVN] clang-format this file after recent changes.
llvm-svn: 292026
2017-01-14 20:15:04 +00:00
Davide Italiano 7cf29dcca5 [NewGVN] Try to be consistent wit the style used in this file. NFCI.
llvm-svn: 292025
2017-01-14 20:13:18 +00:00
Eugene Zelenko 5fa43960f3 [Transforms/Utils] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291983
2017-01-14 00:32:38 +00:00
Daniel Berlin b66164ca34 NewGVN: Kill unneeded DFSDomMap, cleanup a few comments.
llvm-svn: 291981
2017-01-14 00:24:23 +00:00
Sanjay Patel 40f401776b [InstCombine] optimize unsigned icmp of increment
Allows LLVM to optimize sequences like the following:

%add = add nuw i32 %x, 1
%cmp = icmp ugt i32 %add, %y

Into:

%cmp = icmp uge i32 %x, %y

Previously, only signed comparisons were being handled.

Decrements could also be handled, but 'sub nuw %x, 1' is currently canonicalized to
'add %x, -1' in InstCombineAddSub, losing the nuw flag. Removing that canonicalization
seems like it might have far-reaching ramifications so I kept this simple for now.

Patch by Matti Niemenmaa!

Differential Revision: https://reviews.llvm.org/D24700

llvm-svn: 291975
2017-01-13 23:25:46 +00:00
Sanjay Patel 2d4b456427 [InstCombine] use m_APInt to allow lshr folds for vectors with splat constants
llvm-svn: 291972
2017-01-13 23:04:10 +00:00
Daniel Berlin c0431fd02d NewGVN: Move leaders around properly to ensure we have a canonical dominating leader. Fixes PR 31613.
Summary:
This is a testcase where phi node cycling happens, and because we do
not order the leaders by domination or anything similar, the leader
keeps changing.

Using std::set for the members is too expensive, and we actually don't
need them sorted all the time, only at leader changes.

We could keep both a set and a vector, and keep them mostly sorted and
resort as necessary, or use a set and a fibheap, but all of this seems
premature.

After running some statistics, we are able to avoid the vast majority
of sorting by keeping a "next leader" field.  Most congruence classes only have
leader changes once or twice during GVN.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28594

llvm-svn: 291968
2017-01-13 22:40:01 +00:00
David Majnemer bba17390c7 [LoopStrengthReduce] Don't bother rewriting PHIs in catchswitch blocks
The catchswitch instruction cannot be split, don't bother trying to
rewrite it.

This fixes PR31627.

llvm-svn: 291966
2017-01-13 22:24:27 +00:00
David L. Jones 41cecba8e9 "Use" lambda captures which are otherwise only used in asserts. NFC
Summary:
The LLVM coding standards recommend "using" values that are only
needed by asserts:
http://llvm.org/docs/CodingStandards.html#assert-liberally

Without this change, LLVM cannot bootstrap with -Werror as the second
stage fails with this new warning:
https://reviews.llvm.org/rL291905

See also the previous fixes:
https://reviews.llvm.org/rL291916
https://reviews.llvm.org/rL291939
https://reviews.llvm.org/rL291940
https://reviews.llvm.org/rL291941

Reviewers: rsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28695

llvm-svn: 291957
2017-01-13 21:02:41 +00:00
Sanjay Patel acd24c7b6a [InstCombine] use 'match' and other clean-up; NFCI
llvm-svn: 291937
2017-01-13 18:52:10 +00:00
Sanjay Patel b22f6c5f26 [InstCombine] use m_APInt to allow shl folds for vectors with splat constants
llvm-svn: 291934
2017-01-13 18:39:09 +00:00
Sanjay Patel cf08203105 [InstCombine] use Op0/Op1 local variables more consistently with shifts; NFC
llvm-svn: 291923
2017-01-13 18:08:25 +00:00
Malcolm Parsons 17d266bc96 Remove unused lambda captures. NFC
llvm-svn: 291916
2017-01-13 17:12:16 +00:00
Sanjay Patel 5178363687 [InstCombine] if the condition of a select may be known via assumes, eliminate the select
This is a limited solution for PR31512:
https://llvm.org/bugs/show_bug.cgi?id=31512

The motivation is that we will need to increase usage of llvm.assume and/or metadata to solve PR28430:
https://llvm.org/bugs/show_bug.cgi?id=28430

...and this kind of simplification is needed to take advantage of that extra information.

The 'not' test case would be handled by:
https://reviews.llvm.org/D28485

Differential Revision:
https://reviews.llvm.org/D28337

llvm-svn: 291915
2017-01-13 17:02:42 +00:00