Commit Graph

18307 Commits

Author SHA1 Message Date
Teresa Johnson 32d95742b8 Recommit "r306541 - Add zero-length check to memcpy/memset load store loop expansion""
With fix for use-after-free errors. We can't add the new branch and
remove the old one until we are done with the Builder constructed for
the block.

llvm-svn: 306937
2017-07-01 03:24:10 +00:00
Teresa Johnson c12306c0ad Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default."
This still breaks PPC tests we have. I'll forward reproduction
instructions to dehao.

llvm-svn: 306936
2017-07-01 03:24:09 +00:00
Teresa Johnson eb4fba9d61 re-commit r306336: Enable vectorizer-maximize-bandwidth by default.
Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 306935
2017-07-01 03:24:08 +00:00
Teresa Johnson de56903bde revert r306336 for breaking ppc test.
llvm-svn: 306934
2017-07-01 03:24:07 +00:00
Teresa Johnson 1fbaffeba1 Enable vectorizer-maximize-bandwidth by default.
Summary:
vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:

spec/2006/fp/C++/444.namd                 26.84  -0.31%
spec/2006/fp/C++/447.dealII               46.19  +0.89%
spec/2006/fp/C++/450.soplex               42.92  -0.44%
spec/2006/fp/C++/453.povray               38.57  -2.25%
spec/2006/fp/C/433.milc                   24.54  -0.76%
spec/2006/fp/C/470.lbm                    41.08  +0.26%
spec/2006/fp/C/482.sphinx3                47.58  -0.99%
spec/2006/int/C++/471.omnetpp             22.06  +1.87%
spec/2006/int/C++/473.astar               22.65  -0.12%
spec/2006/int/C++/483.xalancbmk           33.69  +4.97%
spec/2006/int/C/400.perlbench             33.43  +1.70%
spec/2006/int/C/401.bzip2                 23.02  -0.19%
spec/2006/int/C/403.gcc                   32.57  -0.43%
spec/2006/int/C/429.mcf                   40.35  +0.27%
spec/2006/int/C/445.gobmk                 26.96  +0.06%
spec/2006/int/C/456.hmmer                  24.4  +0.19%
spec/2006/int/C/458.sjeng                 27.91  -0.08%
spec/2006/int/C/462.libquantum            57.47  -0.20%
spec/2006/int/C/464.h264ref               46.52  +1.35%

geometric mean                                   +0.29%

The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.

I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent.

Reviewers: hfinkel, mkuper, davidxl, chandlerc

Reviewed By: chandlerc

Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 306933
2017-07-01 03:24:06 +00:00
Dinar Temirbulatov 2fb1075f14 [SLPVectorizer] Add isOdd() helper function, NFCI.
llvm-svn: 306887
2017-06-30 21:16:26 +00:00
Craig Topper bcf511c0da [InstCombine] Replace an unnecessary use of a matcher with just an isa and a cast. NFC
We aren't looking through any levels of IR here so I don't think we need the power of a matcher or the temporary variable it requires.

llvm-svn: 306885
2017-06-30 21:09:34 +00:00
Ayal Zaks 2ff59d4350 [LV] Sink casts to unravel first order recurrence
Check if a single cast is preventing handling a first-order-recurrence Phi,
because the scheduling constraints it imposes on the first-order-recurrence
shuffle are infeasible; but they can be made feasible by moving the cast
downwards. Record such casts and move them when vectorizing the loop.

Differential Revision: https://reviews.llvm.org/D33058

llvm-svn: 306884
2017-06-30 21:05:06 +00:00
Sumanth Gundapaneni 5372f0a73e [SimplifyCFG] Update the name of switch generated lookup table.
This patch appends the name of the function to the switch generated lookup
table. This will ease the visual debugging in identifying the function the table
is generated from.

Differential Revision: https://reviews.llvm.org/D34817

llvm-svn: 306867
2017-06-30 20:00:01 +00:00
Simon Pilgrim 77c3c5f9b8 [InstCombine] Add m_BitReverse pattern match helper. NFCI.
llvm-svn: 306860
2017-06-30 18:58:29 +00:00
Anna Thomas e5e5e59d8b [RuntimeUnrolling] Add logic for loops with multiple exit blocks
Summary:
Runtime unrolling is done for loops with a single exit block and a
single exiting block (and this exiting block should be the latch block).
This patch adds logic to support unrolling in the presence of multiple exit
blocks (which also means multiple exiting blocks).
Currently this is under an off-by-default option and is supported when
epilog code is generated. Support in presence of prolog code will be in
a future patch (we just need to add more tests, and update comments).

This patch is essentially an implementation patch. I have not added any
heuristic (in terms of branches added or code size) to decide when
this should be enabled.

Reviewers: mkuper, sanjoy, reames, evstupac

Reviewed by: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33001

llvm-svn: 306846
2017-06-30 17:57:07 +00:00
Nikolai Bozhenov bde9b14c6f Revert of r306525: "Canonicalize clamp of float types to minmax"
llvm-svn: 306815
2017-06-30 10:39:09 +00:00
Ayal Zaks 8d26f0a602 [LV] Optimize for size when vectorizing loops with tiny trip count
It may be detrimental to vectorize loops with very small trip count, as various
costs of the vectorized loop body as well as enclosing overheads including
runtime tests and scalar iterations may outweigh the gains of vectorizing. The
current cost model measures the cost of the vectorized loop body only, expecting
it will amortize other costs, and loops with known or expected very small trip
counts are not vectorized at all. This patch allows loops with very small trip
counts to be vectorized, but under OptForSize constraints, which ensure the cost
of the loop body is dominant, having no runtime guards nor scalar iterations.

Patch inspired by D32451.

Differential Revision: https://reviews.llvm.org/D34373

llvm-svn: 306803
2017-06-30 08:02:35 +00:00
Craig Topper 880bf82685 [InstCombine] In foldXorToXor, move the commutable matcher from the LHS match to the RHS match. No meaningful change intended.
There are two conditions ORed here with similar checks and each contain two matches that must be true for the if to succeed. With the commutable match on the first half of the OR then both ifs basically have the same first part and only the second part distinguishs. With this change we move the commutable match to second half and make the first half unique.

This caused some tests to change because we now produce a commuted result, but this shouldn't matter in practice.

llvm-svn: 306800
2017-06-30 07:37:41 +00:00
Chandler Carruth 3545a9e1f9 Remove the BBVectorize pass.
It served us well, helped kick-start much of the vectorization efforts
in LLVM, etc. Its time has come and past. Back in 2014:
http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html

Time to actually let go and move forward. =]

I've updated the release notes both about the removal and the
deprecation of the corresponding C API.

llvm-svn: 306797
2017-06-30 07:09:08 +00:00
Daniel Jasper 3b704ceba1 Revert "r306541 - Add zero-length check to memcpy/memset load store loop expansion"
Segfaults in non-optimized builds. I'll get a stack trace and a
reproducer to Teresa.

llvm-svn: 306793
2017-06-30 06:37:33 +00:00
Daniel Jasper 5ce1ce742e Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default."
This still breaks PPC tests we have. I'll forward reproduction
instructions to dehao.

llvm-svn: 306792
2017-06-30 06:32:21 +00:00
Max Kazantsev 8d0322e612 [SCEV] Use depth limit instead of local cache for SExt and ZExt
In rL300494 there was an attempt to deal with excessive compile time on
invocations of getSign/ZeroExtExpr using local caching. This approach only
helps if we request the same SCEV multiple times throughout recursion. But
in the bug PR33431 we see a case where we request different values all the time,
so caching does not help and the size of the cache grows enormously.

In this patch we remove the local cache for this methods and add the recursion
depth limit instead, as we do for arithmetics. This gives us a guarantee that the
invocation sequence is limited and reasonably short.

Differential Revision: https://reviews.llvm.org/D34273

llvm-svn: 306785
2017-06-30 05:04:09 +00:00
Eric Christopher a95aac3751 Reduce indenting and clean up comparisons around sign bit.
llvm-svn: 306781
2017-06-30 01:57:48 +00:00
Eric Christopher 710c1c8faa Reduce the complexity of the signbit/branch test functions.
llvm-svn: 306779
2017-06-30 01:35:31 +00:00
Dehao Chen 2f31d0d86e Hook the sample PGO machinery in the new PM
Summary: This patch hooks up SampleProfileLoaderPass with the new PM.

Reviewers: chandlerc, davidxl, davide, tejohnson

Reviewed By: chandlerc, tejohnson

Subscribers: tejohnson, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34720

llvm-svn: 306763
2017-06-29 23:33:05 +00:00
Dinar Temirbulatov f05c73c132 [SLPVectorizer] Moving Entry->NeedToGather check out of inner loop,
since it is invariant there. NFCI.

llvm-svn: 306749
2017-06-29 21:56:33 +00:00
Sam Clegg 3d65030c45 Remove `inline` keyword from inline `classof` methods
The style guide states that the explicit `inline`
should not be used with inline methods.  classof is
very common inline method with a fair amount on
inconsistency:

$ git grep classof ./include | grep inline | wc -l
230
$ git grep classof ./include | grep -v inline | wc -l
257

I chose to target this method rather the larger change
since this method is easily cargo-culted (I did it at
least once).  I considered doing the larger change and
removing all occurrences but that would be a much larger
change.

Differential Revision: https://reviews.llvm.org/D33906

llvm-svn: 306731
2017-06-29 19:35:17 +00:00
Xin Tong 02008c30b5 Remove useless header. NFC
llvm-svn: 306712
2017-06-29 17:48:12 +00:00
Leo Li 20fbad9307 [ConstantHoisting] Avoid hoisting constants in GEPs that index into a struct type.
Summary:
Indices for GEPs that index into a struct type should always be
constants. This added more checks in `collectConstantCandidates:` which make
sure constants for GEP pointer type are not hoisted.

This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538

Reviewers: ributzka, rnk

Reviewed By: ributzka

Subscribers: efriedma, llvm-commits, srhines, javed.absar, pirama

Differential Revision: https://reviews.llvm.org/D34576

llvm-svn: 306704
2017-06-29 17:03:34 +00:00
Daniel Berlin b7df17ec59 PredicateInfo: Use OrderedInstructions instead of our homemade
version.

llvm-svn: 306703
2017-06-29 17:01:14 +00:00
Daniel Berlin b779db7ebc NewGVN: Remove useless test in addPhiOfOps.
llvm-svn: 306702
2017-06-29 17:01:10 +00:00
Daniel Berlin 7c757aee38 Remove unneeded else from OrderedInstructions::dominates.
llvm-svn: 306701
2017-06-29 17:01:03 +00:00
Dinar Temirbulatov 7b96266a16 [SLPVectorizer] Introducing getTreeEntry() helper function [NFC]
Differential Revision: https://reviews.llvm.org/D34756

llvm-svn: 306655
2017-06-29 08:46:18 +00:00
Craig Topper 798a19ab8e [InstCombine] In visitXor, use m_Not on the instruction itself instead of looking for all ones in Op1. This is consistent with 3 other not checks before this one. NFCI
llvm-svn: 306617
2017-06-29 00:07:08 +00:00
Keno Fischer a236dae5d1 [InstCombine] Retain TBAA when narrowing memory accesses
Summary:
As discussed on the mailing list it is legal to propagate TBAA to loads/stores
from/to smaller regions of a larger load tagged with TBAA. Do so for
(load->extractvalue)=>(gep->load) and similar foldings.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D31954

llvm-svn: 306615
2017-06-28 23:36:40 +00:00
Ayal Zaks d9bc43ef2a [LV] Fix PR33613 - retain order of insertelement per part
r306381 caused PR33613, by reversing the order in which insertelements were
generated per unroll part. This patch fixes PR33613 by retraining this order,
placing each set of insertelements per part immediately after the last scalar
being packed for this part. Includes a test case derived from PR33613.

Reference: https://bugs.llvm.org/show_bug.cgi?id=33613
Differential Revision: https://reviews.llvm.org/D34760

llvm-svn: 306575
2017-06-28 17:59:33 +00:00
Geoff Berry b0573547f6 [LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D34532

llvm-svn: 306564
2017-06-28 17:01:15 +00:00
Sanjay Patel 4e96f19052 [InstCombine] use local variable to reduce code; NFCI
llvm-svn: 306560
2017-06-28 16:39:06 +00:00
Geoff Berry 66d9bdbca8 [LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D34531

llvm-svn: 306554
2017-06-28 15:53:17 +00:00
Teresa Johnson 538b8d25f0 Add zero-length check to memcpy/memset load store loop expansion
Summary:
I was testing using this expansion logic in other cases besides
NVPTX, and found some runtime failures due to the lack of a check
for a zero length memcpy/memset before the loop. There is already
such a check in the memmove expansion code though.

Reviewers: hfinkel

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D34707

llvm-svn: 306541
2017-06-28 13:07:37 +00:00
Nikolai Bozhenov b01e6b5a52 [InstCombine] Canonicalize clamp of float types to minmax in fast mode.
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.

This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.

Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper

Reviewed By: efriedma

Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D33186

llvm-svn: 306525
2017-06-28 09:26:20 +00:00
Max Kazantsev 6c466a376e [IRCE][NFC] Better get SCEV for 1 in calculateSubRanges
A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive
invocations, and we don't create a ConstantInt if 'true' branch is taken.

Differential Revision: https://reviews.llvm.org/D34672

llvm-svn: 306503
2017-06-28 04:57:45 +00:00
Kyle Butt f73c8a06a9 Inlining: Don't re-map simplified cloned instructions.
When simplifying an instruction that has been re-mapped, it should never
simplify to an instruction in the original function. In the edge case
where we are inlining a function into itself, the existing code led to
incorrect behavior. Replace the incorrect code with an assert verifying
that we never expect simplification to produce an instruction in the old
function, unless the functions are the same.

Differential Revision: https://reviews.llvm.org/D33850

llvm-svn: 306495
2017-06-28 01:41:25 +00:00
Peter Collingbourne 92648c25a4 Bitcode: Write the irsymtab to disk.
Differential Revision: https://reviews.llvm.org/D33973

llvm-svn: 306487
2017-06-27 23:50:11 +00:00
Geoff Berry 2573a19fe6 [EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE.
llvm-svn: 306477
2017-06-27 22:25:02 +00:00
Dehao Chen 920d022519 re-commit r306336: Enable vectorizer-maximize-bandwidth by default.
Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 306473
2017-06-27 22:05:58 +00:00
Craig Topper 5fe0197622 [InstCombine] Propagate nsw flag when turning mul by pow2 into shift when the constant is a vector splat or the scalar bit width is larger than 64-bits
The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64-bits are less.

This patch changes it to use m_APInt to remove both these issues

Differential Revision: https://reviews.llvm.org/D34699

llvm-svn: 306457
2017-06-27 19:57:53 +00:00
Serge Guelton 7bc405aa4c [CodeExtractor] Prevent extraction of block involving blockaddress
BlockAddress are only valid within their function context, which does not
interact well with CodeExtractor. Detect this case and prevent it.

Differential Revision: https://reviews.llvm.org/D33839

llvm-svn: 306448
2017-06-27 18:57:53 +00:00
Yaxun Liu 7c44f340de [SROA] Fix APInt size when alloca address space is not 0
SROA assumes alloca address space is 0, which causes assertion. This patch fixes that.

Differential Revision: https://reviews.llvm.org/D34104

llvm-svn: 306440
2017-06-27 18:26:06 +00:00
Sanjay Patel 7227276d41 [InstCombine] canonicalize icmp predicate feeding select
This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. 
We have this transform for icmp+br, so unless there's some reason that icmp+select should be 
treated differently, we should do the same thing here.

The benefit comes from increasing the chances of creating identical instructions. This is shown in
the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE 
can simplify the identical cmps, and then InstCombine can fold the selects together.

The possible regression for the tests in select.ll raises questions about poison/undef:
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html

...but that transform is just as likely to be triggered by this canonicalization as it is to be 
missed, so we're just pointing out a commutation deficiency in the pattern matching:
https://reviews.llvm.org/rL228409

Differential Revision: https://reviews.llvm.org/D34242

llvm-svn: 306435
2017-06-27 17:53:22 +00:00
Dehao Chen 66131665c4 Enable ICP for AutoFDO.
Summary: AutoFDO should have ICP enabled.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D34662

llvm-svn: 306429
2017-06-27 17:23:33 +00:00
Anna Thomas dc935a6eb6 [LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCI
Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block
(which is proven to be the only exiting block in the loop to be unrolled).

llvm-svn: 306410
2017-06-27 14:14:35 +00:00
Ayal Zaks fc1e210d44 Recommitting 306331.
Undoing revert 306338 after fixed bug: add metadata to the load instead of the
reverse shuffle added to it, retaining the original ValueMap implementation.

llvm-svn: 306381
2017-06-27 08:41:19 +00:00
Chandler Carruth 3f81d8024c [SROA] Fix PR32902 by more carefully propagating !nonnull metadata.
This is based heavily on the work done ni D34285. I mostly wanted to do
test cleanup for the author to save them some time, but I had a really
hard time understanding why it was so hard to write better test cases
for these issues.

The problem is that because SROA does a second rewrite of the loads and
because we *don't* propagate !nonnull for non-pointer loads, we first
introduced invalid !nonnull metadata and then stripped it back off just
in time to avoid most ways of this PR manifesting. Moving to the more
careful utility only fixes this by changing the predicate to look at the
new load's type rather than the target type. However, that *does* fix
the bug, and the utility is much nicer including adding range metadata
to model the nonnull property after a conversion to an integer.

However, we have bigger problems because we don't actually propagate
*range* metadata, and the utility to do this extracted from instcombine
isn't really in good shape to do this currently. It *only* handles the
case of copying range metadata from an integer load to a pointer load.
It doesn't even handle the trivial cases of propagating from one integer
load to another when they are the same width! This utility will need to
be beefed up prior to using in this location to get the metadata to
fully survive.

And even then, we need to go and teach things to turn the range metadata
into an assume the way we do with nonnull so that when we *promote* an
integer we don't lose the information.

All of this will require a new test case that looks kind-of like
`preserve-nonnull.ll` does here but focuses on range metadata. It will
also likely require more testing because it needs to correctly handle
changes to the integer width, especially as SROA actively tries to
change the integer width!

Last but not least, I'm a little worried about hooking the range
metadata up here because the instcombine logic for converting from
a range metadata *to* a nonnull metadata node seems broken in the face
of non-zero address spaces where null is not mapped to the integer `0`.
So that probably needs to get fixed with test cases both in SROA and in
instcombine to cover it.

But this *does* extract the core PR fix from D34285 of preventing the
!nonnull metadata from being propagated in a broken state just long
enough to feed into promotion and crash value tracking.

On D34285 there is some discussion of zero-extend handling because it
isn't necessary. First, the new load size covers all of the non-undef
(ie, possibly initialized) bits. This may even extend past the original
alloca if loading those bits could produce valid data. The only way its
valid for us to zero-extend an integer load in SROA is if the original
code had a zero extend or those bits were undef. And we get to assume
things like undef *never* satifies nonnull, so non undef bits can
participate here. No need to special case the zero-extend handling, it
just falls out correctly.

The original credit goes to Ariel Ben-Yehuda! I'm mostly landing this to
save a few rounds of trivial edits fixing style issues and test case
formulation.

Differental Revision: D34285

llvm-svn: 306379
2017-06-27 08:32:03 +00:00
Mikael Holmen 37b5120a9a [Reassociate] Make sure EraseInst sets MadeChange
Summary:
EraseInst didn't report that it made IR changes through MadeChange.

It is essential that changes to the IR are reported correctly,
since for example ReassociatePass::run() will indicate that all
analyses are preserved otherwise.
And the CGPassManager determines if the CallGraph is up-to-date
based on status from InstructionCombiningPass::runOnFunction().

Reviewers: craig.topper, rnk, davide

Reviewed By: rnk, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34616

llvm-svn: 306368
2017-06-27 05:32:13 +00:00
Dehao Chen 8b7effb344 revert r306336 for breaking ppc test.
llvm-svn: 306344
2017-06-26 23:05:35 +00:00
Ayal Zaks 3923c0c46b reverting 306331.
Causes TBAA metadata to be generates on reverse shuffles, investigating.

llvm-svn: 306338
2017-06-26 22:26:54 +00:00
Dehao Chen 79655792cc Enable vectorizer-maximize-bandwidth by default.
Summary:
vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:

spec/2006/fp/C++/444.namd                 26.84  -0.31%
spec/2006/fp/C++/447.dealII               46.19  +0.89%
spec/2006/fp/C++/450.soplex               42.92  -0.44%
spec/2006/fp/C++/453.povray               38.57  -2.25%
spec/2006/fp/C/433.milc                   24.54  -0.76%
spec/2006/fp/C/470.lbm                    41.08  +0.26%
spec/2006/fp/C/482.sphinx3                47.58  -0.99%
spec/2006/int/C++/471.omnetpp             22.06  +1.87%
spec/2006/int/C++/473.astar               22.65  -0.12%
spec/2006/int/C++/483.xalancbmk           33.69  +4.97%
spec/2006/int/C/400.perlbench             33.43  +1.70%
spec/2006/int/C/401.bzip2                 23.02  -0.19%
spec/2006/int/C/403.gcc                   32.57  -0.43%
spec/2006/int/C/429.mcf                   40.35  +0.27%
spec/2006/int/C/445.gobmk                 26.96  +0.06%
spec/2006/int/C/456.hmmer                  24.4  +0.19%
spec/2006/int/C/458.sjeng                 27.91  -0.08%
spec/2006/int/C/462.libquantum            57.47  -0.20%
spec/2006/int/C/464.h264ref               46.52  +1.35%

geometric mean                                   +0.29%

The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.

I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent.

Reviewers: hfinkel, mkuper, davidxl, chandlerc

Reviewed By: chandlerc

Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 306336
2017-06-26 21:41:09 +00:00
Ayal Zaks e7e15d186b [LV] Changing the interface of ValueMap, NFC.
Instead of providing access to the internal MapStorage holding all Values
associated with a given Key, used for setting or resetting them all together,
ValueMap keeps its MapStorage internal; its new interface allows getting,
setting or resetting a single Value, per part or per part-and-lane.
Follows the discussion in https://reviews.llvm.org/D32871.

Differential Revision: https://reviews.llvm.org/D34473

llvm-svn: 306331
2017-06-26 21:03:51 +00:00
Wei Mi 71f06420e4 [GVN] Recommit the patch "Add phi-translate support in scalarpre".
The recommit fixes three bugs: The first one is to use CurrentBlock instead of
PREInstr's Parent as param of performScalarPREInsertion because the Parent
of a clone instruction may be uninitialized. The second one is stop PRE when
CurrentBlock to its predecessor is a backedge and an operand of CurInst is
defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available. The third one is an out-of-bound
array access in a flipped if guard.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {

  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.

}

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

llvm-svn: 306313
2017-06-26 18:16:10 +00:00
Chandler Carruth 2abb65ae11 [InstCombine] Factor the logic for propagating !nonnull and !range
metadata out of InstCombine and into helpers.

NFC, this just exposes the logic used by InstCombine when propagating
metadata from one load instruction to another. The plan is to use this
in SROA to address PR32902.

If anyone has better ideas about how to factor this or name variables,
I'm all ears, but this seemed like a pretty good start and lets us make
progress on the PR.

This is based on a patch by Ariel Ben-Yehuda (D34285).

llvm-svn: 306267
2017-06-26 03:31:31 +00:00
Chandler Carruth 4a000883c7 [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.
This was reverted in r306252, but I already had the bug fixed and was
just trying to form a test case.

The original commit factored the logic for forming dedicated exits
inside of LoopSimplify into a helper that could be used elsewhere and
with an approach that required fewer intermediate data structures. See
that commit for full details including the change to the statistic, etc.

The code looked fine to me and my reviewers, but in fact didn't handle
indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty.

If you have code that looks *just* right, you can end up leaking these
predecessors into a subsequent rewrite, and crash deep down when trying
to update PHI nodes for predecessors that don't exist.

I've added an assert that makes the bug much more obvious, and then
changed the code to reliably clear the vector so we don't get this bug
again in some other form as the code changes.

I've also added a test case that *does* manage to catch this while also
giving some nice positive coverage in the face of indirectbr.

The real code that found this came out of what I think is CPython's
interpreter loop, but any code with really "creative" interpreter loops
mixing indirectbr and other exit paths could manage to tickle the bug.
I was hard to reduce the original test case because in addition to
having a particular pattern of IR, the whole thing depends on the order
of the predecessors which is in turn depends on use list order. The test
case added here was designed so that in multiple different predecessor
orderings it should always end up going down the same path and tripping
the same bug. I hope. At least, it tripped it for me without
manipulating the use list order which is better than anything bugpoint
could do...

llvm-svn: 306257
2017-06-25 22:45:31 +00:00
Anna Thomas e7cb633d29 [LoopDeletion] NFC: Move phi node value setting into prepass
Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator,
which caused loop deletion tests to hang due to infinite loop.
Had reverted it in rL306162.

rL306157 commit message:
Currently, the implementation of delete dead loops has a special case
when the loop being deleted is never executed. This special case
(updating of exit block's incoming values for phis) can be
run as a prepass for non-executable loops before performing
the actual deletion.

llvm-svn: 306254
2017-06-25 21:13:58 +00:00
Daniel Jasper 4c6cd4ccb7 Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."
This leads to a segfault. Chandler already has a test case and should be
able to recommit with a fix soon.

llvm-svn: 306252
2017-06-25 17:58:25 +00:00
Sanjay Patel 2f3ead7adc [InstCombine] add (sext i1 X), 1 --> zext (not X)
http://rise4fun.com/Alive/i8Q

A narrow bitwise logic op is obviously better than math for value tracking, 
and zext is better than sext. Typically, the 'not' will be folded into an 
icmp predicate.

The IR difference would even survive through codegen for x86, so we would see 
worse code:

https://godbolt.org/g/C14HMF

one_or_zero(int, int):                      # @one_or_zero(int, int)
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setle   %al
        retq

one_or_zero_alt(int, int):                  # @one_or_zero_alt(int, int)
        xorl    %ecx, %ecx
        cmpl    %esi, %edi
        setg    %cl
        movl    $1, %eax
        subl    %ecx, %eax
        retq

llvm-svn: 306243
2017-06-25 14:15:28 +00:00
Xinliang David Li b67530e9b9 [PGO] Implementate profile counter regiser promotion
Differential Revision: http://reviews.llvm.org/D34085

llvm-svn: 306231
2017-06-25 00:26:43 +00:00
Hiroshi Inoue b300824ee7 fix trivial typos in comment, NFC
dereferencable -> dereferenceable

llvm-svn: 306210
2017-06-24 15:43:33 +00:00
Craig Topper 7b66ffe875 [ValueTracking][InstCombine] Use m_Shr instead m_CombineOr(m_LShr, m_AShr). NFC
llvm-svn: 306205
2017-06-24 06:24:04 +00:00
Craig Topper 72ee6945af [Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC
llvm-svn: 306204
2017-06-24 06:24:01 +00:00
Vitaly Buka df19ad456e [InstCombine] Don't replace allocas with smaller globals
Summary:
InstCombine replaces large allocas with small globals consts causing buffer overflows
on valid code, see PR33372.

This fix permits this optimization only if the global is dereference for alloca size.

Fixes PR33372

Reviewers: eugenis, majnemer, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34311

llvm-svn: 306194
2017-06-24 01:35:19 +00:00
Anna Thomas 77a2e6b198 Revert "[LoopDeletion] NFC: Move phi node value setting into prepass"
This reverts commit r306157.
It caused some timeouts in clang tests. Perhaps unreachable loops have
far too many phi nodes.
Reverting and investigating.

llvm-svn: 306162
2017-06-23 21:30:48 +00:00
Anna Thomas a43b387f27 [LoopDeletion] NFC: Move phi node value setting into prepass
Currently, the implementation of delete dead loops has a special case
when the loop being deleted is never executed. This special case
(updating of exit block's incoming values for phis) can be
run as a prepass for non-executable loops before performing
the actual deletion.

llvm-svn: 306157
2017-06-23 20:38:50 +00:00
Craig Topper 68ed55e06a [CorrelatedValuePropagation] Fix typo in comment sense->since. NFC
llvm-svn: 306152
2017-06-23 20:28:40 +00:00
Craig Topper 29cdfe2cd9 [CorrelatedValuePropagation] Remove comment about iterating switch cases in reverse order. This is no longer being done after r298791. NFC
llvm-svn: 306151
2017-06-23 20:28:35 +00:00
Anna Thomas 91eed9ac1a [RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFC
The single exit block allowed in runtime unrolling is guaranteed to be
the Latch's successor, so rename it as LatchExitBlock.

llvm-svn: 306105
2017-06-23 14:28:01 +00:00
Anna Thomas d67165c93c [InstCombine] Recognize and simplify three way comparison idioms
Summary:
Many languages have a three way comparison idiom where comparing two values
produces not a boolean, but a tri-state value. Typical values (e.g. as used in
the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality, and +1
for greater than.

We actually do a great job already of converting three way comparisons into
binary comparisons when the result produced has one a single use. Unfortunately,
such values can have more than one use, and in that case, our existing
optimizations break down.

The patch adds a peephole which converts a three-way compare + test idiom into a
binary comparison on the original inputs. It focused on replacing the test on
the result of the three way compare and does nothing about removing the three
way compare itself. That's left to other optimizations (which do actually kick
in commonly.)
We currently recognize one idiom on signed integer compare. In the future, we
plan to recognize and simplify other comparison idioms on
other signed/unsigned datatypes such as floats, vectors etc.

This is a resurrection of Philip Reames' original patch:
https://reviews.llvm.org/D19452

Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev

Reviewed by: mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34278

llvm-svn: 306100
2017-06-23 13:41:45 +00:00
Craig Topper 2c20c42cb6 [JumpThreading] Teach jump threading how to analyze (and (cmp A, C1), (cmp A, C2)) after InstCombine has turned it into (cmp (add A, C3), C4)
Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compare if the compare is fed by a livein of a basic block. This can be used to to prove the condition can't be met for some predecessor and the jump from that predecessor can be moved to the false path of the condition.

But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare.

This patch adds a new method to LVI to get a ConstantRange on an edge. Then we teach jump threading to detect the add livein feeding a compare and to get the ConstantRange and propagate it.

Differential Revision: https://reviews.llvm.org/D33262

llvm-svn: 306085
2017-06-23 05:41:35 +00:00
Craig Topper 7927996140 [JumpThreading] Use some temporary variables to reduce the number of times we call the same methods. NFC
A future patch will add even more uses of these variables.

llvm-svn: 306084
2017-06-23 05:41:32 +00:00
Chandler Carruth 4ab0f4910a [LoopSimplify] Factor the logic to form dedicated exits into a utility.
I want to use the same logic as LoopSimplify to form dedicated exits in
another pass (SimpleLoopUnswitch) so I wanted to factor it out here.

I also noticed that there is a pretty significantly more efficient way
to implement this than the way the code in LoopSimplify worked. We don't
need to actually retain the set of unique exit blocks, we can just
rewrite them as we find them and use only a set to deduplicate.

This did require changing one part of LoopSimplify to not re-use the
unique set of exits, but it only used it to check that there was
a single unique exit. That part of the code is about to walk the exiting
blocks anyways, so it seemed better to rewrite it to use those exiting
blocks to compute this property on-demand.

I also had to ditch a statistic, but it doesn't seem terribly valuable.

Differential Revision: https://reviews.llvm.org/D34049

llvm-svn: 306081
2017-06-23 04:03:04 +00:00
Eric Christopher 5a7c2f1700 Remove the LoadCombine pass. It was never enabled and is unsupported.
Based on discussions with the author on mailing lists.

llvm-svn: 306067
2017-06-22 22:58:12 +00:00
Anna Thomas 72c90c87f8 [LoopDeletion] Update exits correctly when multiple duplicate edges from an exiting block
Summary:
Currently, we incorrectly update exit blocks of loops when there are multiple
edges from a single exiting block to the exit block. This can happen when we
have switches as the terminator of the exiting blocks.
The fix here is to correctly update the phi nodes in the exit block, and remove
all incoming values *except* for one which is from the preheader.

Note: Currently, this error can manifest only while deleting non-executed loops. However, it
is possible to trigger this error in invariant loops, once we enhance the logic
around the exit conditions for the loop check.

Reviewers: chandlerc, dberlin, sanjoy, efriedma

Reviewed by: efriedma

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D34516

llvm-svn: 306048
2017-06-22 20:20:56 +00:00
Craig Topper dffbbcb3fd [InstCombine] Teach foldSelectICmpAndOr to recognize (select (icmp slt (trunc (X)), 0), Y, (or Y, C2))
Summary:
InstCombine likes to turn (icmp eq (and X, C1), 0) into (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's ability to recognize (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y).

This patch tries to recover this. I had to flip around some of the early out checks so that I could create a new And instruction during the compare processing without it possibly never getting used.

Reviewers: spatel, majnemer, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34184

llvm-svn: 306029
2017-06-22 16:23:30 +00:00
Craig Topper 0de5e6a729 [InstCombine] Add one use checks to or/and->xnor folding
If the components of the and/or had multiple uses, this transform created an additional instruction.

This patch makes sure we remove one of the components.

Differential Revision: https://reviews.llvm.org/D34498

llvm-svn: 306027
2017-06-22 16:12:02 +00:00
Sanjay Patel d1e811979c [InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138)
There are 2 parts to this patch made simultaneously to avoid a regression.

We're reversing the canonicalization that moves bitwise vector ops before bitcasts. 
We're moving bitwise vector ops *after* bitcasts instead. That's the 1st and 3rd hunks 
of the patch. The motivation is that there's only one fold that currently depends on 
the existing canonicalization (see next), but there are many folds that would 
automatically benefit from the new canonicalization. 
PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these 
patterns in IR.

There's an or(and,andn) pattern that requires an adjustment in order to continue matching
to 'select' because the bitcast changes position. This match is unfortunately complicated 
because it requires 4 logic ops with optional bitcast and sext ops.

Test diffs:

  1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference - 
     bitcast comes before logic.
  2. There are also tests with no diffs in bitcast.ll that verify that we're still doing 
     folds that were enabled by the previous canonicalization.
  3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns 
     to look through bitcasts.
  4. logical-select.ll contains several tests for the or(and,andn) --> select fold to 
     verify that we are still handling those cases. The lone diff shows the movement of 
     the bitcast from the new canonicalization rule.

Differential Revision: https://reviews.llvm.org/D33517

llvm-svn: 306011
2017-06-22 15:46:54 +00:00
Sanjay Patel e800df8eac [InstCombine] add peekThroughBitcast() helper; NFC
This is an NFC portion of D33517. We have similar helpers in the backend.

llvm-svn: 306008
2017-06-22 15:28:01 +00:00
Diana Picus b512e91515 Revert "Enable vectorizer-maximize-bandwidth by default."
This reverts commit r305960 because it broke self-hosting on AArch64.

llvm-svn: 305990
2017-06-22 10:00:28 +00:00
Sam Clegg 705f798bff Mark dump() methods as const. NFC
Add const qualifier to any dump() method where adding one
was trivial.

Differential Revision: https://reviews.llvm.org/D34481

llvm-svn: 305963
2017-06-21 22:19:17 +00:00
Dehao Chen 014db29b89 Enable vectorizer-maximize-bandwidth by default.
Summary:
vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:

spec/2006/fp/C++/444.namd                 26.84  -0.31%
spec/2006/fp/C++/447.dealII               46.19  +0.89%
spec/2006/fp/C++/450.soplex               42.92  -0.44%
spec/2006/fp/C++/453.povray               38.57  -2.25%
spec/2006/fp/C/433.milc                   24.54  -0.76%
spec/2006/fp/C/470.lbm                    41.08  +0.26%
spec/2006/fp/C/482.sphinx3                47.58  -0.99%
spec/2006/int/C++/471.omnetpp             22.06  +1.87%
spec/2006/int/C++/473.astar               22.65  -0.12%
spec/2006/int/C++/483.xalancbmk           33.69  +4.97%
spec/2006/int/C/400.perlbench             33.43  +1.70%
spec/2006/int/C/401.bzip2                 23.02  -0.19%
spec/2006/int/C/403.gcc                   32.57  -0.43%
spec/2006/int/C/429.mcf                   40.35  +0.27%
spec/2006/int/C/445.gobmk                 26.96  +0.06%
spec/2006/int/C/456.hmmer                  24.4  +0.19%
spec/2006/int/C/458.sjeng                 27.91  -0.08%
spec/2006/int/C/462.libquantum            57.47  -0.20%
spec/2006/int/C/464.h264ref               46.52  +1.35%

geometric mean                                   +0.29%

The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.

I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent.

Reviewers: hfinkel, mkuper, davidxl, chandlerc

Reviewed By: chandlerc

Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 305960
2017-06-21 22:01:32 +00:00
Craig Topper 34caf5396f [Reassociate] Use early returns in a couple places to reduce indentation and improve readability. NFC
llvm-svn: 305946
2017-06-21 19:39:35 +00:00
Craig Topper 99a2e89920 [Reassociate] Const correct a helper function. NFC
llvm-svn: 305945
2017-06-21 19:39:33 +00:00
Craig Topper a074c101e5 [InstCombine] Cleanup using commutable matchers. Make a couple helper methods standalone static functions. Put 'if' around variable declaration instead of after. NFC
llvm-svn: 305941
2017-06-21 18:57:00 +00:00
Dehao Chen 50f2aa19e8 Do not inline recursive direct calls in sample loader pass.
Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls.

Reviewers: iteratee, davidxl

Reviewed By: iteratee

Subscribers: sanjoy, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D34456

llvm-svn: 305934
2017-06-21 17:57:43 +00:00
Craig Topper 5b173f2bb3 [InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits
Summary:
I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know.

This patch adds range metadata to these intrinsics based on the known bits.

Currently the patch punts if we already have range metadata present.

Reviewers: spatel, RKSimon, davide, majnemer

Reviewed By: RKSimon

Subscribers: sanjoy, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D32582

llvm-svn: 305927
2017-06-21 16:32:35 +00:00
Craig Topper ae86cc725d [InstCombine] Don't let folding (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) create more instructions than it removes
Summary:
Previously this folding had no checks to see if it was going to result in less instructions. This was pointed out during the review of D34184

This patch adds code to count how many instructions its going to create vs how many its going to remove so we can make a proper decision.

Reviewers: spatel, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34437

llvm-svn: 305926
2017-06-21 16:07:13 +00:00
Craig Topper cbac691c4b [Reassociate] Support xor reassociating for splat vectors
Summary: This patch adds support for xors of splat vectors.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34354

llvm-svn: 305925
2017-06-21 16:07:09 +00:00
Davide Italiano 0ec715be1f [NewGVN] Fix a bug that made the store verifier less effective.
We weren't actually checking for duplicated stores, as the condition
was always actually false. This was found by Coverity, and I have
no clue how to trigger this in real-world code (although I
 tried for a bit).

llvm-svn: 305867
2017-06-20 22:57:40 +00:00
Sanjay Patel 4ccbd58d70 [InstCombine] fix code/test comments for r305792; NFC
These diffs were in the last version of the patch in D33342,
but I accidentally committed the previous rev. 

llvm-svn: 305793
2017-06-20 12:45:46 +00:00
Sanjay Patel adca825dc1 [InstCombine] try to canonicalize xor-of-icmps to and-of-icmps
We have a large portfolio of folds for and-of-icmps and or-of-icmps in InstSimplify and InstCombine, 
but hardly anything for xor-of-icmps. Rather than trying to rethink and translate all of those folds, 
we can use the truth table definition of xor:

X ^ Y --> (X | Y) & !(X & Y)

...to see if we can convert the xor to and/or and then use the existing folds.

http://rise4fun.com/Alive/J9v

Differential Revision: https://reviews.llvm.org/D33342

llvm-svn: 305792
2017-06-20 12:40:55 +00:00
Vedant Kumar b5794ca90c [ProfileData] PR33517: Check for failure of symtab creation
With PR33517, it became apparent that symbol table creation can fail
when presented with malformed inputs. This patch makes that sort of
error detectable, so llvm-cov etc. can fail more gracefully.

Specifically, we now check that function names within the symbol table
aren't empty.

Testing: check-{llvm,clang,profile}, some unit test updates.
llvm-svn: 305765
2017-06-20 01:38:56 +00:00
Ana Pazos f731bde064 [PATCH] [PGO] Fixed cast operation in emIntrinsicVisitor::instrumentOneMemIntrinsic.
Reviewers: xur, efriedma, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34293

llvm-svn: 305737
2017-06-19 20:04:33 +00:00
Taewook Oh 9083547ae3 Improve profile-guided heuristics to use estimated trip count.
Summary:
Existing heuristic uses the ratio between the function entry
frequency and the loop invocation frequency to find cold loops. However,
even if the loop executes frequently, if it has a small trip count per
each invocation, vectorization is not beneficial. On the other hand,
even if the loop invocation frequency is much smaller than the function
invocation frequency, if the trip count is high it is still beneficial
to vectorize the loop.

This patch uses estimated trip count computed from the profile metadata
as a primary metric to determine coldness of the loop. If the estimated
trip count cannot be computed, it falls back to the original heuristics.

Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32451

llvm-svn: 305729
2017-06-19 18:48:58 +00:00
Bjorn Pettersson 475fcd9cd8 [InstCombine] Make sure AddReachableCodeToWorklist sets MadeIRChange
Summary:
Some optimizations in AddReachableCodeToWorklist did not update
the MadeIRChange state. This could happen both when removing
trivially dead instructions (DCE) and at constant folds.

It is essential that changes to the IR is reported correctly,
since for example InstCombinePass::run() will indicate that all
analyses are preserved otherwise.
And the CGPassManager determines if the CallGraph is up-to-date
based on status from InstructionCombiningPass::runOnFunction().

The new test case early_dce_clobbers_callgraph.ll is a reproducer
for some asserts that started to trigger after changes in the
inliner in r305245. With this patch the test case passes again.

Reviewers: sanjoy, craig.topper, dblaikie

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34346

llvm-svn: 305725
2017-06-19 18:00:27 +00:00
Hans Wennborg ca69fc1cb7 Revert r304824 "Fix PR23384 (part 3 of 3)"
This seems to be interacting badly with ASan somehow, causing false reports of
heap-buffer overflows: PR33514.

> Summary:
> The patch makes instruction count the highest priority for
> LSR solution for X86 (previously registers had highest priority).
>
> Reviewers: qcolombet
>
> Differential Revision: http://reviews.llvm.org/D30562
>
> From: Evgeny Stupachenko <evstupac@gmail.com>

llvm-svn: 305720
2017-06-19 17:57:15 +00:00
Davide Italiano daa9c0e403 [NewGVN] Simplify findConditionEquivalence(). NFCI.
llvm-svn: 305707
2017-06-19 16:46:15 +00:00