Commit Graph

6838 Commits

Author SHA1 Message Date
Jingyue Wu 1238f341ba [SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow.
Summary:
This patch implements my promised optimization to reunites certain sexts from
operands after we extract the constant offset. See the header comment of
reuniteExts for its motivation.

One key building block that enables this optimization is Bjarke's poison value
analysis (D11212). That helps to prove "a +nsw b" can't overflow.

Reviewers: broune

Subscribers: jholewinski, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12016

llvm-svn: 245003
2015-08-14 02:02:05 +00:00
Chandler Carruth bf143e2a20 [LIR] Re-instate r244880, reverted in r244884, factoring the handling of
AliasAnalysis in LoopIdiomRecognize.

The previous commit to LIR, r244879, exposed some scary bug in the loop
pass pipeline with an assert failure that showed up on several bots.
This patch got reverted as part of getting that revision reverted, but
they're actually independent and unrelated. This patch has no functional
change and should be completely safe. It is also useful for my current
work on the AA infrastructure.

llvm-svn: 244993
2015-08-14 00:21:10 +00:00
Sanjay Patel a75c41e5f3 don't repeat function names in comments; NFC
llvm-svn: 244977
2015-08-13 22:53:20 +00:00
Jingyue Wu 13a80eaceb [SeparateConstOffsetFromGEP] strengthen the inbounds attribute
We used to be over-conservative about preserving inbounds. Actually, the second
GEP (which applies the constant offset) can inherit the inbounds attribute of
the original GEP, because the resultant pointer is equivalent to that of the
original GEP. For example,

  x  = GEP inbounds a, i+5
    =>
  y = GEP a, i               // inbounds removed
  x = GEP inbounds y, 5      // inbounds preserved

llvm-svn: 244937
2015-08-13 18:48:49 +00:00
Erik Eckstein 11fc8175d9 [DeadStoreElimination] remove a redundant store even if the load is in a different block.
DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location.
So far this worked only if the store is in the same block as the load.
Now we can also handle stores which are in a different block than the load.
Example:

define i32 @test(i1, i32*) {
entry:
  %l2 = load i32, i32* %1, align 4
  br i1 %0, label %bb1, label %bb2
bb1:
  br label %bb3
bb2:
  ; This store is redundant
  store i32 %l2, i32* %1, align 4
  br label %bb3
bb3:
  ret i32 0
}

Differential Revision: http://reviews.llvm.org/D11854

llvm-svn: 244901
2015-08-13 15:36:11 +00:00
Renato Golin 655348f0b2 Revert "[LIR] Start leveraging the fundamental guarantees of a loop..."
This reverts commit r244879, as it broke the test-suite on
SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64.

llvm-svn: 244885
2015-08-13 11:25:38 +00:00
Renato Golin 4d57906b0e Revert "[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize."
This reverts commit r244880, as it broke the test-suite on
SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64.

llvm-svn: 244884
2015-08-13 11:25:35 +00:00
Ashutosh Nema 47802628f7 Test Commit.
llvm-svn: 244883
2015-08-13 11:18:35 +00:00
Chandler Carruth c2af09823f [LIR] Handle access to AliasAnalysis the same way as the other analysis
in LoopIdiomRecognize. This is what started me staring at this code. Now
migrating it with the new AA stuff will be trivial.

llvm-svn: 244880
2015-08-13 10:00:53 +00:00
Chandler Carruth 8ae7b81559 [LIR] Start leveraging the fundamental guarantees of a loop in
simplified form to remove redundant checks and simplify the code for
popcount recognition. We don't actually need to handle all of these
cases.

I've left a FIXME for one in particular until I finish inspecting to
make sure we don't actually *rely* on the predicate in any way.

llvm-svn: 244879
2015-08-13 09:56:20 +00:00
Chandler Carruth 18c2669aca [LIR] Handle the LoopInfo the same as all the other analyses. No utility
really in breaking pattern just for this analysis.

llvm-svn: 244878
2015-08-13 09:27:01 +00:00
Chen Li f458c6f313 [LoopUnswitch] Check OptimizeForSize before traversing over all basic blocks in current loop
Summary: This patch moves the check of OptimizeForSize before traversing over all basic blocks in current loop. If OptimizeForSize is set to true, no non-trivial unswitch is ever allowed. Therefore, the early exit will help reduce compilation time. This patch should be NFC. 

Reviewers: reames, weimingz, broune

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11997

llvm-svn: 244868
2015-08-13 05:24:29 +00:00
Chandler Carruth dc298329cc [LIR] Make the LoopIdiomRecognize pass get analyses essentially the same
way as every other pass. This simplifies the code quite a bit and is
also more idiomatic! <ba-dum!>

llvm-svn: 244853
2015-08-13 01:03:26 +00:00
Chandler Carruth 8219a501da [LIR] Remove the dedicated class for popcount recognition and sink the
code into methods on LoopIdiomRecognize.

This simplifies the code somewhat and also makes it much easier to move
the analyses around. Ultimately, the separate class wasn't providing
significant value over methods -- it contained the precondition basic
block and the current loop. The current loop is already available and
the precondition block wasn't needed everywhere and is easy to pass
around.

In several cases I just moved things to be static functions because they
already accepted most of their inputs as arguments.

This doesn't fix the way we manage analyses yet, that will be the next
patch, but it already makes the code over 50 lines shorter.

No functionality changed.

llvm-svn: 244851
2015-08-13 00:44:29 +00:00
Chandler Carruth d9c6070c98 [LIR] Move all the helpers to be private and re-order the methods in
a way that groups things logically. No functionality changed.

llvm-svn: 244845
2015-08-13 00:10:03 +00:00
Chandler Carruth be158b17db [LIR] Remove the 'LIRUtils' abstraction which was unnecessary and adding
complexity.

There is only one function that was called from multiple locations, and
that was 'getBranch' which has a reasonable one-line spelling already:
dyn_cast<BranchInst>(BB->getTerminator). We could make this shorter, but
it doesn't seem to add much value. Instead, we should avoid calling it
so many times on the same basic blocks, but that will be in a subsequent
patch.

The other functions are only called in one location, so inline them
there, and take advantage of this to use direct early exit and reduce
indentation. This makes it much more clear what is being tested for, and
in fact makes it clear now to me that there are simpler ways to do this
work. However, this patch just does the mechanical inlining. I'll clean
up the functionality of the code to leverage loop simplified form more
effectively in a follow-up.

Despite lots of early line breaks due to early-exit, this is still
shorter than it was before.

llvm-svn: 244841
2015-08-12 23:55:56 +00:00
Chandler Carruth bad690e8f7 [LIR] Run clang-format over LoopIdiomRecognize in preparation for
a significant code cleanup here.

The handling of analyses in this pass is overly complex and can be
simplified significantly, but the right way to do that is to simplify
all of the code not just the analyses, and that'll require pretty
extensive edits that would be noisy with formatting changes mixed into
them.

llvm-svn: 244828
2015-08-12 23:06:37 +00:00
Philip Reames 971dc3a82a [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints
To be clear: this is an *optimization* not a correctness change.

CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll.

One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust.

I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially.

Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly.

Differential Revision: http://reviews.llvm.org/D11819

llvm-svn: 244821
2015-08-12 22:11:45 +00:00
Philip Reames 9ac4e38a16 [RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm
When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :)

This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed.  Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement.  That change will follow in the near future.  

llvm-svn: 244808
2015-08-12 21:00:20 +00:00
Chandler Carruth 19ac7d5b29 [PM/AA] Add missing static dependency edges from DSE and memdep to TLI.
I forgot to add these in r244780 and r244778. Sorry about that.

Also order the static dependencies in a lexicographical order.

llvm-svn: 244787
2015-08-12 18:10:45 +00:00
Chandler Carruth dbe40fb45e [PM/AA] Stop getting the TargetLibraryInfo out of the AliasAnalysis and
just depend on it directly.

This was particularly frustrating because there was a really wide
mixture of using a member variable and re-extracting it from the AA that
happened to be around. I think the result is much more clear.

I've also deleted all of the pointless null checks and used references
across the APIs where I could to make it explicit that this cannot be
null in a useful fashion.

llvm-svn: 244780
2015-08-12 18:01:44 +00:00
Sanjay Patel 956e29cc8e don't repeat function names in comments; NFC
llvm-svn: 244672
2015-08-11 21:24:04 +00:00
Sanjay Patel 41f3d95f76 fix 80-cols; NFC
llvm-svn: 244668
2015-08-11 21:11:56 +00:00
Igor Laevsky 4709c03715 [IndVarSimplify] Make cost estimation in RewriteLoopExitValues smarter
Differential Revision: http://reviews.llvm.org/D11687

llvm-svn: 244474
2015-08-10 18:23:58 +00:00
Mark Heffernan 8939154a22 Add new llvm.loop.unroll.enable metadata.
This change adds the unroll metadata "llvm.loop.unroll.enable" which directs
the optimizer to unroll a loop fully if the trip count is known at compile time, and
unroll partially if the trip count is not known at compile time. This differs from
"llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not
known at compile time.

The "llvm.loop.unroll.enable" is intended to be added for loops annotated with
"#pragma unroll".

llvm-svn: 244466
2015-08-10 17:28:08 +00:00
Fraser Cormack e29ab2bfab Prevent the scalarizer from caching incorrect entries
The scalarizer can cache incorrect entries when walking up a chain of
insertelement instructions. This occurs when it encounters more than one
instruction that it is not actively searching for, as it unconditionally caches
every element it finds. The fix is to only cache the first element that it
isn't searching for so we don't overwrite correct entries.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D11559

llvm-svn: 244448
2015-08-10 14:48:47 +00:00
Benjamin Kramer df005cbe19 Fix some comment typos.
llvm-svn: 244402
2015-08-08 18:27:36 +00:00
Adam Nemet 15840393f3 [LAA] Make the set of runtime checks part of the state of LAA, NFC
This is the full set of checks that clients can further filter. IOW,
it's client-agnostic.  This makes LAA complete in the sense that it now
provides the two main results of its analysis precomputed:

1. memory dependences via getDepChecker().getInsterestingDependences()
2. run-time checks via getRuntimePointerCheck().getChecks()

However, as a consequence we now compute this information pro-actively.
Thus if the client decides to skip the loop based on the dependences
we've computed the checks unnecessarily.  In order to see whether this
was a significant overhead I checked compile time on SPEC2k6 LTO bitcode
files.  The change was in the noise.

The checks are generated in canCheckPtrAtRT, at the same place where we
used to call groupChecks to merge checks.

llvm-svn: 244368
2015-08-07 22:44:15 +00:00
Pete Cooper ebcd748927 Convert a bunch of loops to foreach. NFC.
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst.  This commit changes a bunch
of eligible loops to use it.

llvm-svn: 244260
2015-08-06 20:22:46 +00:00
Nico Rieck 78199518c4 Rename inst_range() to instructions() for consistency. NFC
llvm-svn: 244248
2015-08-06 19:10:45 +00:00
Quentin Colombet 6443cce233 [Reassociation] Fix miscompile for va_arg arguments.
iisUnmovableInstruction() had a list of instructions hardcoded which are
considered unmovable. The list lacked (at least) an entry for the va_arg
and cmpxchg instructions.
Fix this by introducing a new Instruction::mayBeMemoryDependent()
instead of maintaining another instruction list.

Patch by Matthias Braun <matze@braunis.de>.

Differential Revision: http://reviews.llvm.org/D11577

rdar://problem/22118647

llvm-svn: 244244
2015-08-06 18:44:34 +00:00
Chandler Carruth 17e0bc37fd [PM/AA] Hoist the interface for BasicAA into a header file.
This is the first mechanical step in preparation for making this and all
the other alias analysis passes available to the new pass manager. I'm
factoring out all the totally boring changes I can so I'm moving code
around here with no other changes. I've even minimized the formatting
churn.

I'll reformat and freshen comments on the interface now that its located
in the right place so that the substantive changes don't triger this.

llvm-svn: 244197
2015-08-06 07:33:15 +00:00
Chandler Carruth 50fee93926 [PM/AA] Simplify the AliasAnalysis interface by removing a wrapper
around a DataLayout interface in favor of directly querying DataLayout.

This wrapper specifically helped handle the case where this no
DataLayout, but LLVM now requires it simplifynig all of this. I've
updated callers to directly query DataLayout. This in turn exposed
a bunch of places where we should have DataLayout readily available but
don't which I've fixed. This then in turn exposed that we were passing
DataLayout around in a bunch of arguments rather than making it readily
available so I've also fixed that.

No functionality changed.

llvm-svn: 244189
2015-08-06 02:05:46 +00:00
Chen Li 50efd9220a [LoopUnswitch] Preserve make.implicit metadata for unswitched conditions
Summary: This patch adds support to preserve make.implicit metadata for unswitched conditions in loop pre-header.

Reviewers: sanjoy, weimingz

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11769

llvm-svn: 244132
2015-08-05 21:13:26 +00:00
Chandler Carruth b2fda0d95c [Unroll] Switch to using 'int' cost types in preparation for a somewhat
more involved change to the cost computation pattern.

llvm-svn: 244095
2015-08-05 18:46:21 +00:00
Tanya Lattner 0d28f80bd1 Rename all references to old mailing lists to new lists.llvm.org address.
llvm-svn: 243999
2015-08-05 03:51:17 +00:00
Sanjay Patel 924879ad2c wrap OptSize and MinSize attributes for easier and consistent access (NFCI)
Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994
2015-08-04 15:49:57 +00:00
David Majnemer eb518bd5d8 Drive-by fixes for LandingPad -> EHPad
This change was done as an audit and is by inspection.  The new EH
system is still very much a work in progress.  NFC for the landingpad
case.

llvm-svn: 243965
2015-08-04 08:21:40 +00:00
Sanjoy Das 215df9ed98 Revert "[LSR] Generate and use zero extends"
This reverts commit r243348 and r243357.  They caused PR24347.

llvm-svn: 243939
2015-08-04 01:52:05 +00:00
Chandler Carruth 87adb7a2e2 [Unroll] Improve the brute force loop unroll estimate by propagating
through PHI nodes across iterations.

This patch teaches the new advanced loop unrolling heuristics to propagate
constants into the loop from the preheader and around the backedge after
simulating each iteration. This lets us brute force solve simple recurrances
that aren't modeled effectively by SCEV. It also makes it more clear why we
need to process the loop in-order rather than bottom-up which might otherwise
make much more sense (for example, for DCE).

This came out of an attempt I'm making to develop a principled way to account
for dead code in the unroll estimation. When I implemented
a forward-propagating version of that it produced incorrect results due to
failing to propagate *cost* between loop iterations through the PHI nodes, and
it occured to me we really should at least propagate simplifications across
those edges, and it is quite easy thanks to the loop being in canonical and
LCSSA form.

Differential Revision: http://reviews.llvm.org/D11706

llvm-svn: 243900
2015-08-03 20:32:27 +00:00
Craig Topper e3dcce9700 De-constify pointers to Type since they can't be modified. NFC
This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842
2015-08-01 22:20:21 +00:00
David Majnemer 654e130b6e New EH representation for MSVC compatibility
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support.  Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.

Differential Revision: http://reviews.llvm.org/D11097

llvm-svn: 243766
2015-07-31 17:58:14 +00:00
Adam Nemet c75ad69ca5 [LDist] Filter the checks locally rather than in LAA, NFC
Before, we were passing the pointer partitions to LAA.  Now, we get all
the checks from LAA and filter out the checks within partitions in
LoopDistribution.

This effectively concludes the steps to move filtering memchecks from
LAA into its clients.  There is still some cleanup left to remove the
unused interfaces in LAA that still take PtrPartition.

(Moving this functionality to LoopDistribution also requires
needsChecking on pointers to be made public.)

llvm-svn: 243613
2015-07-30 03:29:16 +00:00
Nick Lewycky c3890d2969 Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.)
Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h.

llvm-svn: 243585
2015-07-29 22:32:47 +00:00
Michael Zolotukhin 9f06ef76d3 [Unroll] Handle SwitchInst properly.
Previously successor selection was simply wrong.

llvm-svn: 243545
2015-07-29 18:10:33 +00:00
Michael Zolotukhin 3a7d55b623 [Unroll] Don't crash when simplified branch condition is undef.
llvm-svn: 243544
2015-07-29 18:10:29 +00:00
Sanjoy Das cfe41f050c [Statepoints] Let patchable statepoints have a symbolic call target.
Summary:
As added initially, statepoints required their call targets to be a
constant pointer null if ``numPatchBytes`` was non-zero.  This turns out
to be a problem ergonomically, since there is no way to mark patchable
statepoints as calling a (readable) symbolic value.

This change remove the restriction of requiring ``null`` call targets
for patchable statepoints, and changes PlaceSafepoints to maintain the
symbolic call target through its transformation.

Reviewers: reames, swaroop.sridhar

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11550

llvm-svn: 243502
2015-07-28 23:50:30 +00:00
Michael Zolotukhin 80d13bac02 [Unroll] Add debug dumps to loop-unroll analyzer.
llvm-svn: 243471
2015-07-28 20:07:29 +00:00
Michael Zolotukhin a425c9d0e3 [Unroll] Don't analyze blocks outside the loop.
llvm-svn: 243466
2015-07-28 19:21:21 +00:00
Adam Nemet 0a674401bf [LDist][LVer] Explicitly pass the set of memchecks to LoopVersioning, NFC
Before the patch, the checks were generated internally in
addRuntimeCheck.  Now, we use the new overloaded version of
addRuntimeCheck that takes the ready-made set of checks as a parameter.

The checks are now generated by the client (LoopDistribution) with the
new RuntimePointerChecking::generateChecks API.

Also the new printChecks API is used to print out the checks for
debugging.

This is to continue the transition over to the new model whereby clients
will get the full set of checks from LAA, filter it and then pass it to
LoopVersioning and in turn to addRuntimeCheck.

llvm-svn: 243382
2015-07-28 05:01:53 +00:00
Sanjoy Das 93b3504aa8 [LSR] Generate and use zero extends
Summary:
If a scale or a base register can be rewritten as "Zext({A,+,1})" then
LSR will now consider a formula of that form in its normal cost
computation.

Depends on D9180

Reviewers: qcolombet, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9181

llvm-svn: 243348
2015-07-27 23:27:51 +00:00
Pete Cooper 11bd958cb6 Revert "Remove unnecessary null check. NFC."
This reverts commit r243167.

Duncan pointed out that dyn_cast can return null in these cases, so this
was an unsafe commit to make.  Sorry for the noise.

Worryingly there were no tests which fail...

llvm-svn: 243302
2015-07-27 18:37:58 +00:00
Jingyue Wu bfefff555e Roll forward r243250
r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build
slaves, but I couldn't reproduce this failure locally. Probably a false
positive as I saw this test was broken by r243246 or r243247 too but passed
later without people fixing anything.

llvm-svn: 243253
2015-07-26 19:10:03 +00:00
Jingyue Wu 84879b71a9 Revert r243250
breaks tests

llvm-svn: 243251
2015-07-26 18:30:13 +00:00
Jingyue Wu bf485f059c [TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost
Summary:
This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider
addressing modes. It now returns TCC_Free when the GEP can be completely folded
to an addresing mode.

I started this patch as I refactored SLSR. Function isGEPFoldable looks common
and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost.

Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best
testing bed seems CostModel, but its getInstructionCost method invokes
getAddressComputationCost for GEPs which provides very coarse estimation. So
this patch also makes getInstructionCost call the updated getGEPCost for GEPs.
This change inevitably breaks some tests because the cost model changes, but
nothing looks seriously wrong -- if we believe the new cost model is the right
way to go, these tests should be updated.

This patch is not perfect yet -- the comments in some tests need to be updated.
I want to know whether this is a right approach before fixing those details.

Reviewers: chandlerc, hfinkel

Subscribers: aschwaighofer, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D9819

llvm-svn: 243250
2015-07-26 17:28:13 +00:00
Chen Li 145c2f57ae [LoopUnswitch] Improve loop unswitch pass to find trivial unswitch conditions more effectively
Summary:
This patch improves trivial loop unswitch. 

The current trivial loop unswitch only checks if loop header's terminator contains a trivial unswitch condition. But if the loop header only has one reachable successor (due to intentionally or unintentionally missed code simplification), we should consider the successor as part of the loop header. Therefore, instead of stopping at loop header's terminator, we should keep traversing its successors within loop until reach a *real* conditional branch or switch (whose condition can not be constant folded). This change will enable a single -loop-unswitch pass to unswitch multiple trivial conditions (unswitch one trivial condition could open opportunity to unswitch another one in the same loop), while the old implementation can unswitch only one per pass. 

Reviewers: reames, broune

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11481

llvm-svn: 243203
2015-07-25 03:21:06 +00:00
Lawrence Hu dc8a83b53b Handle loop with negtive induction variable increment
This patch extend LoopReroll pass to hand the loops which
is similar to the following:

      while (len > 1) {
            sum4 += buf[len];
            sum4 += buf[len-1];
            len -= 2;
        }

llvm-svn: 243171
2015-07-24 22:01:49 +00:00
Pete Cooper 3191697138 Remove unnecessary null check. NFC.
Since both places which set this variable do so with dyn_cast, and not
dyn_cast_or_null, its impossible to get a nullptr here, so we can remove
the check.

llvm-svn: 243167
2015-07-24 21:38:01 +00:00
Pete Cooper 7679afda82 Use make_range(rbegin(), rend()) to allow foreach loops. NFC.
Instead of the pattern

for (auto I = x.rbegin(), E = x.end(); I != E; ++I)

we can use make_range to construct the reverse range and iterate using
that instead.

llvm-svn: 243163
2015-07-24 21:13:43 +00:00
Philip Reames fa2c630f79 [RewriteStatepointsForGC] Adjust naming scheme to be more stable
The names for instructions inserted were previous dependent on iteration order.  By deriving the names from the original instructions, we can avoid instability in tests without resorting to ordered traversals.  It also makes the IR mildly easier to read at large scale.

llvm-svn: 243140
2015-07-24 19:01:39 +00:00
Michael Zolotukhin 57776b8159 Handle resolvable branches in complete loop unroll heuristic.
Summary:
Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate.
This patch is intended to be applied after D10205 (though it could be applied independently).

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10206

llvm-svn: 243084
2015-07-24 01:53:04 +00:00
Philip Reames 29e9ae7891 [RewriteStatepointsForGC] Fix release build warning
llvm-svn: 243076
2015-07-24 00:42:55 +00:00
Philip Reames 88958b2df3 [RewriteStatepointsForGC] Use a worklist algorithm for first part of base pointer algorithm [NFC]
The new code should hopefully be equivalent to the old code; it just uses a worklist to track instructions which need to visited rather than iterating over all instructions visited each time. This should be faster, but the primary benefit is that the purpose should be more clear and the diff of adding another instruction type (forthcoming) much more obvious.

Differential Revision: http://reviews.llvm.org/D11480

llvm-svn: 243071
2015-07-24 00:02:11 +00:00
Jingyue Wu 2e424da39b [NaryReassociate] remove redundant code
This check is already done by findClosestMatchingDominator.

llvm-svn: 243065
2015-07-23 23:13:37 +00:00
Philip Reames 9b141ed48e [RewriteStatepointsForGC] Rename PhiState to reflect that it's associated w/more than just PHIs
Today, Select instructions also have associated PhiStates.  In the near future, so will ExtractElement and SuffleVector.

llvm-svn: 243056
2015-07-23 22:49:14 +00:00
Philip Reames 2a892a630b [RewriteStatepointsForGC] Use idomatic mechanisms for debug tracing [NFC]
Deleting much of the code using trace-rewrite-statepoints and use idiomatic DEBUG statements instead.  This includes adding operator<< to a helper class.

llvm-svn: 243054
2015-07-23 22:25:26 +00:00
Philip Reames 273e6bbd11 [RewriteStatepointsForGC] Simplify code around meet of PhiStates [NFC]
We don't need to pass in the map from BDV to PhiStates; we can instead handle that externally and let the MeetPhiStates helper class just meet PhiStates.

llvm-svn: 243045
2015-07-23 21:41:27 +00:00
Matt Wala 878c144f8a [Scalarizer] Fix potential for stale data in Scattered across invocations
Summary:
Scalarizer has two data structures that hold information about changes
to the function, Gathered and Scattered. These are cleared in finish()
at the end of runOnFunction() if finish() detects any changes to the
function.

However, finish() was checking for changes by only checking if
Gathered was non-empty. The function visitStore() only modifies
Scattered without touching Gathered. As a result, Scattered could have
ended up having stale data if Scalarizer only scalarized store
instructions. Since the data in Scattered is used during the execution
of the pass, this introduced dangling pointer errors.

The fix is to check whether both Scattered and Gathered are empty
before deciding what to do in finish(). This also fixes a problem
where the Function can be modified although the pass returns false.

Reviewers: rnk

Subscribers: rnk, srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D10459

llvm-svn: 243040
2015-07-23 20:53:46 +00:00
Chandler Carruth 08eebe2074 [GMR] Add a late run of GlobalsModRef to the main pass pipeline behind
the general GMR-in-non-LTO flag.

Without this, we have the global information during the CGSCC pipeline
for GVN and such, but don't have it available during the late loop
optimizations such as the vectorizer. Moreover, after the CGSCC pipeline
has finished we have substantially more accurate and refined call graph
information, function annotations, etc, which will make GMR even more
powerful than it is early in the pipelien.

Note that we have to play silly games with preserving AliasAnalysis
(which is now trivially preserved) in order to let a module analysis
magically be preserved into the entire function pass pipeline.
Simultaneously we have to not make GMR an immutable pass in order to be
able to re-run it and collect fresh data on the final call graph.

llvm-svn: 242999
2015-07-23 09:34:01 +00:00
Chandler Carruth 194f59ca5d [PM/AA] Extract the ModRef enums from the AliasAnalysis class in
preparation for de-coupling the AA implementations.

In order to do this, they had to become fake-scoped using the
traditional LLVM pattern of a leading initialism. These can't be actual
scoped enumerations because they're bitfields and thus inherently we use
them as integers.

I've also renamed the behavior enums that are specific to reasoning
about the mod/ref behavior of functions when called. This makes it more
clear that they have a very narrow domain of applicability.

I think there is a significantly cleaner API for all of this, but
I don't want to try to do really substantive changes for now, I just
want to refactor the things away from analysis groups so I'm preserving
the exact original design and just cleaning up the names, style, and
lifting out of the class.

Differential Revision: http://reviews.llvm.org/D10564

llvm-svn: 242963
2015-07-22 23:15:57 +00:00
Chandler Carruth 96ada25bf3 [PM/AA] Remove all of the dead AliasAnalysis pointers being threaded
through APIs that are no longer necessary now that the update API has
been removed.

This will make changes to the AA interfaces significantly less
disruptive (I hope). Either way, it seems like a really nice cleanup.

llvm-svn: 242882
2015-07-22 09:52:54 +00:00
Chen Li c0f3a158f0 [LoopUnswitch] Code refactoring to separate trivial loop unswitch and non-trivial loop unswitch in processCurrentLoop()
Summary: The current code in LoopUnswtich::processCurrentLoop() mixes trivial loop unswitch and non-trivial loop unswitch together. It goes over all basic blocks in the loop and checks if a condition is trivial or non-trivial unswitch condition. However, trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does something at all). This refactoring separate trivial loop unswitch and non-trivial loop unswitch. Before going over all basic blocks in the loop, it checks if the loop header contains a trivial unswitch condition. If so, unswitch it. Otherwise, go over all blocks like before but don't check trivial condition any more since they are not possible to be in the other blocks. This code has no functionality change.

Reviewers: meheff, reames, broune

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11276

llvm-svn: 242873
2015-07-22 05:26:29 +00:00
Chandler Carruth ccffdaf7ed [SROA] Fix a nasty pile of bugs to do with big-endian, different alloca
types and loads, loads or stores widened past the size of an alloca,
etc.

This started off with a bug report about big-endian behavior with
bitfields and loads and stores to a { i32, i24 } struct. An initial
attempt to fix this was sent for review in D10357, but that didn't
really get to the root of the problem.

The core issue was that canConvertValue and convertValue in SROA were
handling different bitwidth integers by doing a zext of the integer. It
wouldn't do a trunc though, only a zext! This would in turn lead SROA to
form an i24 load from an i24 alloca, zext it to i32, and then use it.
This would at least produce the wrong value for big-endian systems.

One of my many false starts here was to correct the computation for
big-endian systems by shifting. But this doesn't actually work because
the original code has a 64-bit store to the entire 8 bytes, and a 32-bit
load of the last 4 bytes, and because the alloc size is 8 bytes, we
can't lose that last (least significant if bigendian) byte! The real
problem here is that we're forming an i24 load in SROA which is actually
not sufficiently wide to load all of the necessary bits here. The source
has an i32 load, and SROA needs to form that as well.

The straightforward way to do this is to disable the zext logic in
canConvertValue and convertValue, forcing us to actually load all
32-bits. This seems like a really good change, but it in turn breaks
several other parts of SROA.

First in the chain of knock-on failures, we had places where we were
doing integer-widening promotion even though some of the integer loads
or stores extended *past the end* of the alloca's memory! There was even
a comment about preventing this, but it only prevented the case where
the type had a different bit size from its store size. So I added checks
to handle the cases where we actually have a widened load or store and
to avoid trying to special integer widening promotion in those cases.

Second, we actually rely on the ability to promote in the face of loads
past the end of an alloca! This is important so that we can (for
example) speculate loads around PHI nodes to do more promotion. The bits
loaded are garbage, but as long as they aren't used and the alignment is
suitable high (which it wasn't in the test case!) this is "fine". And we
can't stop promoting here, lots of things stop working well if we do. So
we need to add specific logic to handle the extension (and truncation)
case, but *only* where that extension or truncation are over bytes that
*are outside the alloca's allocated storage* and thus totally bogus to
load or store.

And of course, once we add back this correct handling of extension or
truncation, we need to correctly handle bigendian systems to avoid
re-introducing the exact bug that started us off on this chain of misery
in the first place, but this time even more subtle as it only happens
along speculated loads atop a PHI node.

I've ported an existing test for PHI speculation to the big-endian test
file and checked that we get that part correct, and I've added several
more interesting big-endian test cases that should help check that we're
getting this correct.

Fun times.

llvm-svn: 242869
2015-07-22 03:32:42 +00:00
Nick Lewycky f836c89c49 Fix a performance problem in memcpyopt by removing a linear scan over ranges when inserting a new range. No functionality change intended. Patch by Anthony Pesch!
llvm-svn: 242843
2015-07-21 21:56:26 +00:00
Philip Reames 6ff1a1e3d6 [RewriteStatepointsForGC] minor style cleanup
Use a named lambda for readability, common some code, remove a stale comments, and use llvm style variable names.

llvm-svn: 242827
2015-07-21 19:04:38 +00:00
Philip Reames 94babb7030 [RewriteStatepointsForGC] Hoist some code out of a loop
llvm-svn: 242808
2015-07-21 17:18:03 +00:00
Philip Reames 74ce2e7691 [RewriteStatepointsForGC] Delete trivial code
A bit more code cleanup: delete some a trivial true assertion and supporting code, remove a redundant cast, and use count in assertions where feasible.

llvm-svn: 242805
2015-07-21 16:51:17 +00:00
Philip Reames f388050105 [RewriteStatepointsForGC] Minor code cleanup [NFC]
We can use builders to simplify part of the code and we only check for the existance of the metadata value; this enables us to delete some redundant code.

llvm-svn: 242751
2015-07-21 00:49:55 +00:00
Chandler Carruth 9f2bf1aff5 [PM/AA] Remove the addEscapingUse update API that won't be easy to
directly model in the new PM.

This also was an incredibly brittle and expensive update API that was
never fully utilized by all the passes that claimed to preserve AA, nor
could it reasonably have been extended to all of them. Any number of
places add uses of values. If we ever wanted to reliably instrument
this, we would want a callback hook much like we have with ValueHandles,
but doing this for every use addition seems *extremely* expensive in
terms of compile time.

The only user of this update mechanism is GlobalsModRef. The idea of
using this to keep it up to date doesn't really work anyways as its
analysis requires a symmetric analysis of two different memory
locations. It would be very hard to make updates be sufficiently
rigorous to *guarantee* symmetric analysis in this way, and it pretty
certainly isn't true today.

However, folks have been using GMR with this update for a long time and
seem to not be hitting the issues. The reported issue that the update
hook fixes isn't even a problem any more as other changes to
GetUnderlyingObject worked around it, and that issue stemmed from *many*
years ago. As a consequence, a prior patch provided a flag to control
the unsafe behavior of GMR, and this patch removes the update mechanism
that has questionable compile-time tradeoffs and is causing problems
with moving to the new pass manager. Note the lack of test updates --
not one test in tree actually requires this update, even for a contrived
case.

All of this was extensively discussed on the dev list, this patch will
just enact what that discussion decides on. I'm sending it for review in
part to show what I'm planning, and in part to show the *amazing* amount
of work this avoids. Every call to the AA here is something like three
to six indirect function calls, which in the non-LTO pipeline never do
any work! =[

Differential Revision: http://reviews.llvm.org/D11214

llvm-svn: 242605
2015-07-18 03:26:46 +00:00
Cong Hou ab23bfbc0e Create a wrapper pass for BranchProbabilityInfo.
This new wrapper pass is useful when we want to do branch probability analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence.

http://reviews.llvm.org/D11241

llvm-svn: 242349
2015-07-15 22:48:29 +00:00
Chen Li 3f5ed1566e [LoopUnswitch] Add an else clause to IsTrivialUnswitchCondition() when checking HeaderTerm instruction type
Summary:
This is a trivial code change with no functionality effect. 

When LoopUnswitch determines trivial unswitch condition, it checks whether the loop header's terminator instruction is a branch instruction or switch instruction since trivial unswitch condition can only apply to these two instruction types. The current code does not fail the check directly on other instruction types, but check the nullness of LoopExitBB variable instead. The added else clause makes the check fail immediately on other instruction types and makes the code more obvious.  

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11239

llvm-svn: 242345
2015-07-15 22:41:13 +00:00
Michael Zolotukhin 31b3eaaf28 [LoopUnrolling] Handle cast instructions.
During estimation of unrolling effect we should be able to propagate
constants through casts.

Differential Revision: http://reviews.llvm.org/D10207

llvm-svn: 242257
2015-07-15 00:19:51 +00:00
Adam Nemet 9f7dedc376 [LAA] Introduce RuntimePointerChecking::PointerInfo, NFC
Turn this structure-of-arrays (i.e. the various pointer attributes) into
array-of-structures.

llvm-svn: 242219
2015-07-14 22:32:50 +00:00
Adam Nemet 7cdebac0c8 [LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC
I am planning to add more nested classes inside RuntimePointerCheck so
all these triple-nesting would be hard to follow.

Also rename it to RuntimePointerChecking (i.e. append 'ing').

llvm-svn: 242218
2015-07-14 22:32:44 +00:00
Tim Northover 586b741959 GVN: use a static array instead of regenerating it each time. NFC.
llvm-svn: 242202
2015-07-14 21:14:58 +00:00
Tim Northover d5fdef016d GVN: tolerate an instruction being replaced without existing in the leaderboard
Sometimes an incidentally created instruction can duplicate a Value used
elsewhere. It then often doesn't end up in the leader table. If it's later
removed, we attempt to remove it from the leader table and segfault.

Instead we should just ignore the removal request, which won't cause any
problems. The reverse situation, where the original instruction is replaced by
the new one (which you might think could leave the leader table empty) cannot
occur, because the incidental instruction will never be found in the first
place.

llvm-svn: 242199
2015-07-14 21:03:18 +00:00
David Majnemer 62690b1952 [SROA] Don't de-atomic volatile loads and stores
Volatile loads and stores are made visible in global state regardless of
what memory is involved.  It is not correct to disregard the ordering
and synchronization scope because it is possible to synchronize with
memory operations performed by hardware.

This partially addresses PR23737.

llvm-svn: 242126
2015-07-14 06:19:58 +00:00
Pete Cooper 90d95edbb4 Loop idiom recognizer was replacing too many uses of popcount.
When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.

The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed

 %tobool.5 = icmp ne i32 %num, 0
 store i1 %tobool.5, i1* %ptr
 br i1 %tobool.5, label %for.body.lr.ph, label %for.end

to

 store i1 %1, i1* %ptr
 %0 = call i32 @llvm.ctpop.i32(i32 %num)
 %1 = icmp ne i32 %0, 0
 br i1 %1, label %for.body.lr.ph, label %for.end

Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect.  The store doesn’t want the result of ctpop.

The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.

Reviewed by Hal Finkel.

llvm-svn: 242068
2015-07-13 21:25:33 +00:00
Mark Heffernan d7ebc24112 Enable runtime unrolling with unroll pragma metadata
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N")
and a runtime trip count. Also, do not unroll loops with unroll full metadata if the
loop has a runtime loop count. Previously, such loops would be unrolled with a
very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be
enabled resulting in a very large (and likely unwise) unroll factor.

llvm-svn: 242047
2015-07-13 18:26:27 +00:00
Benjamin Kramer e448b5be05 Avoid using Loop::getSubLoopsVector.
Passes should never modify it, just use the const version. While there
reduce copying in LoopInterchange. No functional change intended.

llvm-svn: 242041
2015-07-13 17:21:14 +00:00
David Majnemer 6bc83e0f43 [LICM] Don't try to sink values out of loops without any exits
There is no suitable basic block to sink instructions in loops without
exits.  The only way an instruction in a loop without exits can be used
is as an incoming value to a PHI.  In such cases, the incoming block for
the corresponding value is unreachable.

This fixes PR24013.

Differential Revision: http://reviews.llvm.org/D10903

llvm-svn: 241987
2015-07-12 03:53:05 +00:00
Chandler Carruth 00ebdbcc47 [PM/AA] Completely remove the AliasAnalysis::copyValue interface.
No in-tree alias analysis used this facility, and it was not called in
any particularly rigorous way, so it seems unlikely to be correct.

Note that one of the only stateful AA implementations in-tree,
GlobalsModRef is completely broken currently (and any AA passes like it
are equally broken) because Module AA passes are not effectively
invalidated when a function pass that fails to update the AA stack runs.

Ultimately, it doesn't seem like we know how we want to build stateful
AA, and until then trying to support and maintain correctness for an
untested API is essentially impossible. To that end, I'm planning to rip
out all of the update API. It can return if and when we need it and know
how to build it on top of the new pass manager and as part of *tested*
stateful AA implementations in the tree.

Differential Revision: http://reviews.llvm.org/D10889

llvm-svn: 241975
2015-07-11 04:39:00 +00:00
Adam Nemet 215746b45a [LoopDist/LoopVer] Move LoopVersioning to a new module, NFC
Summary:
The class will obviously need improvement down the road.  For one, there
is no reason that addPHINodes would have to be exposed like that.  I
will make this and other improvements in follow-up patches.

The main goal is to be able to share this functionality.  The
LoopLoadElimination pass I am working on needs it too.  Later we can
move other clients as well (LV and Ashutosh's LICMVer).

Reviewers: hfinkel, ashutosh.nema

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10577

llvm-svn: 241932
2015-07-10 18:55:13 +00:00
Adam Nemet 1a689188c4 [LoopDist] Move loop-versioning helper functions to Cloning, NFC
Summary:
This makes them available to the LoopVersioning class as that is moved
to its own module in the next patch.

Reviewers: ashutosh.nema, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10576

llvm-svn: 241931
2015-07-10 18:55:09 +00:00
David Majnemer db82d2f338 Revert the new EH instructions
This reverts commits r241888-r241891, I didn't mean to commit them.

llvm-svn: 241893
2015-07-10 07:15:17 +00:00
David Majnemer ae2ffc8a8c New EH representation for MSVC compatibility
Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support.  Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.

Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11041

llvm-svn: 241888
2015-07-10 07:00:44 +00:00
Sanjoy Das 6f062c8c2a [IndVars] Try to use existing values in RewriteLoopExitValues.
Summary:
In RewriteLoopExitValues, before expanding out an SCEV expression using
SCEVExpander, try to see if an existing LLVM IR expression already
computes the value we're interested in.  If so use that existing
expression.

Apart from reducing IndVars' reliance on the rest of the compilation
pipeline, this also prevents IndVars from concluding some expressions as
"high cost" when they're not.  For instance,
`InductiveRangeCheckElimination` often emits code of the following form:

```
len = umin(len_A, len_B)

loop:
  ...
  if (i++ < len)
    goto loop

outside_loop:
    use(i)
```

`SCEVExpander` refuses to rewrite the use of `i` in `outside_loop`,
since it thinks the value of `i` on loop exit, `len`, is a high cost
expansion since it contains an `umax` in it.  With this change,
`IndVars` can see that it can re-use `len` instead of creating a new
expression to compute `umin(len_A, len_B)`.

I considered putting this cleverness in `SCEVExpander`, but I was
worried that it may then have a deterimental effect on other passes
that use it.  So I decided it was better to just do this in the one
place where it seems like an obviously good idea, with the intent of
generalizing later if needed.

Reviewers: atrick, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10782

llvm-svn: 241838
2015-07-09 18:46:12 +00:00
Reid Kleckner 60381791b5 Rename llvm.frameescape and llvm.framerecover to localescape and localrecover
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.

These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.

Naming suggestions at this point are welcome, I'm happy to re-run sed.

Reviewers: majnemer, nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11011

llvm-svn: 241633
2015-07-07 22:25:32 +00:00
David Majnemer 6cc21f909c Revert "Revert r241570, it caused PR24053"
This reverts commit r241602.  We had a latent bug in SCCP where we would
make a basic block empty and then proceed to ask questions about it's
terminator.

llvm-svn: 241616
2015-07-07 18:49:41 +00:00
David Majnemer 9402e27ae0 [SCCP] Turn loads of null into undef instead of zero initialized values
Surprisingly, this is a correctness issue: the mmx type exists for
calling convention purposes, LLVM doesn't have a zero representation for
them.

This partially fixes PR23999.

llvm-svn: 241142
2015-07-01 05:37:57 +00:00
Jingyue Wu cf02ef315f [NaryReassociate] enhances nsw by leveraging @llvm.assume
Summary:
nsw are flaky and can often be removed by optimizations. This patch enhances
nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate now
understands that

    assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b

As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE.

Test Plan: nary-gep.ll

Reviewers: broune, meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10822

llvm-svn: 241139
2015-07-01 03:38:49 +00:00
Jingyue Wu 3abde7bea5 [SLSR] S's basis must have the same type as S
llvm-svn: 240910
2015-06-28 17:45:05 +00:00
Philip Reames 8fe7f13af8 [RewriteStatepointsForGC] Generalized vector phi/select handling for base pointers
This change extends the detection of base pointers for vector constructs to handle arbitrary phi and select nodes. The existing non-vector code already handles those, so this is basically just extending the vector special case to be less special cased. It still isn't generalized vector handling since we can't handle arbitrary vector instructions (e.g. shufflevectors), but it's a lot closer.

The general structure of the change is as follows:
 * Extend the base defining value relation over a subset of vector instructions and vector typed phi & select instructions.
 * Move scalarization from before base pointer rewriting to after base pointer rewriting. The extension of the BDV relation is sufficient to find vector base phis for vector inputs.
 * Preserve the existing special case logic for when the base of a vector element is locally obvious. This general idea could be extended to the scalar case as well.

Differential Revision: http://reviews.llvm.org/D10461#inline-84275

llvm-svn: 240850
2015-06-26 22:47:37 +00:00
Peter Collingbourne 2a3443c7c5 GVN: If a branch has two identical successors, we cannot declare either dead.
This previously caused miscompilations as a result of phi nodes receiving
undef incoming values from blocks dominated by such successors.

Differential Revision: http://reviews.llvm.org/D10726

llvm-svn: 240670
2015-06-25 18:32:02 +00:00
Duncan P. N. Exon Smith 817ac8f40a Add simplify_type<const WeakVH>; simplify IndVarSimplify
r240214 fixed some UB in IndVarSimplify, and it needed a temporary
`WeakVH` to do it.  Add `simplify_type<const WeakVH>` so that this
temporary isn't necessary.

llvm-svn: 240599
2015-06-24 22:23:21 +00:00
David Majnemer 63d606bdcb [GVN] Intersect the IR flags when CSE'ing two instructions
We performed a simple, but incomplete, intersection when it came time to
CSE instructions.  It didn't handle, for example, the 'exact' flag.

This fixes PR23922.

llvm-svn: 240595
2015-06-24 21:52:25 +00:00
David Majnemer f6e500a0dc [Reassociate] Don't propogate flags when creating negations
Reassociate mutated existing instructions in order to form negations
which would create additional reassociate opportunities.

This fixes PR23926.

llvm-svn: 240593
2015-06-24 21:27:36 +00:00
Sanjay Patel 64ea207027 fix typos; NFC
llvm-svn: 240592
2015-06-24 20:42:33 +00:00
Sanjay Patel 09159b8f47 don't repeat function names in comments; NFC
llvm-svn: 240591
2015-06-24 20:40:57 +00:00
Mark Heffernan 9b536a640b This change fixes three bugs in loop unswitching. This change causes an 81% speed-up on a benchmark that is based on EigenConvolutionKernel2D from Eigen3, where the lack of loop unswitching blocks hoisting of loads out of a nested loop (see bug 23816 for how loop unswitching and load hoisting are related).
Change 1: Unswitching on trivial conditions should always happen regardless of the computed unswitching cost, as really the cost is zero. While there is code to make that happen, the logic that checks the unswitching cost against a threshold was moved to an earlier point (revision 147935) than the point where trivial unswitching is detected, so trivial unswitching is currently blocked by the cost threshold. This change fixes that.

Change 2: Before revision 147935 (from 2012-01-11), the threshold parameter was a per-loop threshold. So an unswitching happened only if the cost of the unswitching was less than the threshold. In an indirect way (and I believe unintentionally), the logic for this since then has been that the threshold is an over-all budget across all loops for all loop unswitching done by a given LoopUnswitch loop pass object. So if an unswitching with cost 100 happens in one function, that in effect reduces the threshold from 100 to 0 for the loops even in another function. This persists for the lifetime of that loop pass object. This makes no difference for most small examples but it is important for large examples. This revision fixes that.

Change 3: The cost is currently calculated as std::min(NumInstructions, 5 * NumBlocks). So a loop with 2 blocks and a million instructions will have an unswitching cost of 10. I changed this to just NumInstructions, as it were before revision 147935, though I'm open to e.g. instead replacing std::min with std::max.

I've tried to make the change minimally invasive while staying with what I think was the original intent of the code.
Submitted on behalf of broune@.

llvm-svn: 240438
2015-06-23 18:26:50 +00:00
Alexander Kornienko f00654e31b Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Weiming Zhao f1abad57da Fix PR13851: Preserve metadata for the unswitched branch
This patch copies the metadata of the unswitched branch to the newly
crreated branch in loop unswitch pass.

llvm-svn: 240378
2015-06-23 05:31:09 +00:00
Adam Nemet f530b329c7 [LoopDist] Improve variable names and comments in LoopVersioning class, NFC
As with the previous patch, the goal is to turn the class into a general
loop-versioning class.  This patch removes any references to loop
distribution.

llvm-svn: 240352
2015-06-22 22:59:40 +00:00
Justin Bogner 485212f67c IndVarSimplify: Avoid UB from binding a reference to a null pointer
Calling operator* on a WeakVH whose Value is null hits undefined
behaviour, since we bind the value to a reference. Instead, go through
`operator Value*` so that we work with the pointer itself.

Found by ubsan.

llvm-svn: 240214
2015-06-20 06:24:05 +00:00
Adam Nemet 7632500d7a [LoopDist] Rename RuntimeCheckEmitter to LoopVersioning, NFC
llvm-svn: 240165
2015-06-19 19:32:48 +00:00
Adam Nemet 772a150614 [LoopDist] Move pointer-to-partition computation out of RuntimeCheckEmitter, NFC
This starts preparing the class to become a (more) general
LoopVersioning utility class.

llvm-svn: 240164
2015-06-19 19:32:41 +00:00
Alexander Kornienko 70bc5f1398 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
Eric Christopher 572e03a396 Fix "the the" in comments.
llvm-svn: 240112
2015-06-19 01:53:21 +00:00
Jingyue Wu a941129d00 [NFC] more comments in SLSR
llvm-svn: 239984
2015-06-18 03:35:57 +00:00
Chandler Carruth ecbd16829a [PM/AA] Remove the UnknownSize static member from AliasAnalysis.
This is now living in MemoryLocation, which is what it pertains to. It
is also an enum there rather than a static data member which is left
never defined.

llvm-svn: 239886
2015-06-17 07:21:38 +00:00
Chandler Carruth ac80dc7532 [PM/AA] Remove the Location typedef from the AliasAnalysis class now
that it is its own entity in the form of MemoryLocation, and update all
the callers.

This is an entirely mechanical change. References to "Location" within
AA subclases become "MemoryLocation", and elsewhere
"AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps
out-of-tree folks update.

llvm-svn: 239885
2015-06-17 07:18:54 +00:00
Tyler Nowicki 0a91310c7f Rename Reduction variables/structures to Recurrence.
A reduction is a special kind of recurrence. In the loop vectorizer we currently
identify basic reductions. Future patches will extend this to identifying basic
recurrences.

llvm-svn: 239835
2015-06-16 18:07:34 +00:00
Philip Reames 66ab0f045a Move logic from JumpThreading into LazyValue info to simplify caller.
This change is hopefully NFC. The only tricky part is that I changed the context instruction being used to the branch rather than the comparison. I believe both to be correct, but the branch is strictly more powerful. With the moved code, using the branch instruction is required for the basic block comparison test to return the same result. The previous code was able to directly access both the branch and the comparison where the revised code is not.

Differential Revision: http://reviews.llvm.org/D9652

llvm-svn: 239797
2015-06-16 00:49:59 +00:00
Benjamin Kramer 258ea0dbdf [Statepoints] Skip a vector copy when uniquing values.
No functionality change intended.

llvm-svn: 239688
2015-06-13 19:50:38 +00:00
Matt Wala bfb5368cc7 Revert 239644.
llvm-svn: 239650
2015-06-13 01:08:00 +00:00
Matt Wala 1f48192d7c [Scalarizer] Fix potential for stale data in Scattered across invocations
Summary:
Scalarizer has two data structures that hold information about changes
to the function, Gathered and Scattered. These are cleared in finish()
at the end of runOnFunction() if finish() detects any changes to the
function. 

However, finish() was checking for changes by only checking if
Gathered was non-empty. The function visitStore() only modifies
Scattered without touching Gathered. As a result, Scattered could have
ended up having stale data if Scalarizer only scalarized store
instructions. Since the data in Scattered is used during the execution
of the pass, this introduced dangling pointer errors. 

The fix is to check whether both Scattered and Gathered are empty
before deciding what to do in finish().

Reviewers: srhines

Reviewed By: srhines

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10422

llvm-svn: 239644
2015-06-12 22:49:11 +00:00
Matt Wala a4afccd8a8 Fix a typo in a comment in MemCpyOpt (test commit)
llvm-svn: 239628
2015-06-12 18:16:51 +00:00
Alexey Samsonov 9947e48cd1 [GVN] Use a simpler form of IRBuilder constructor.
Summary:
A side effect of this change is that it IRBuilder now automatically
created debug info locations for new instructions, which is the
same as debug location of insertion point. This is fine for the
functions in questions (GetStoreValueForLoad and
GetMemInstValueForLoad), as they are used in two situations:
  * GVN::processLoad, which tries to eliminate a load. In this case
    new instructions would have the same debug location as the load they
    eventually replace;
  * MaterializeAdjustedValue, which adds new instructions to the end
    of the basic blocks, which could later be used to replace the load
    definition. In this case we don't yet know the way the load would
    be eventually replaced (either by assembling the precomputed values
    via PHI, or by using them directly), so just using the basic block
    strategy seems to be reasonable. There is also a special case
    in the code that *would* adjust the location of the last
    instruction replacing the load definition to the location of the
    load.

Test Plan: regression test suite

Reviewers: echristo, dberlin, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10405

llvm-svn: 239585
2015-06-12 01:39:48 +00:00
Alexey Samsonov ff449802c2 [GVN] Use IRBuilder more actively instead of creating instructions manually.
llvm-svn: 239584
2015-06-12 01:39:45 +00:00
Michael Zolotukhin c4e4f33e29 Update stale comment before analyzeLoopUnrollCost. NFC.
llvm-svn: 239565
2015-06-11 22:17:39 +00:00
Matt Arsenault 91f90e694f SLSR: Pass address space to isLegalAddressingMode
This only updates one of the uses. The other is used in cases
that may never touch memory, so I'm not sure why this is even
calling it. Perhaps there should be a new, similar hook for such
cases or pass -1 for unknown address space.

llvm-svn: 239540
2015-06-11 16:13:39 +00:00
Alexey Samsonov 89645dfa4d [GVN] Set proper debug locations for some instructions created by GVN.
Determining proper debug locations for instructions created in
PHITransAddr is tricky. We use a simple approach here and simply copy
debug locations from instructions computing load address to
"corresponding" instructions re-creating the address computation
in predecessor basic blocks.

This may not always be correct, given all the rearrangement and
simplification going on, and debug locations may jump around a lot,
as the basic blocks we copy locations between may be very far from
each other.

Still, this would work good in most simple cases (e.g. when chain
of address computing instruction is short, or our mapping turns out
to be 1-to-1), and we desire to have *some* reasonable debug locations
associated with newly inserted instructions.

See http://reviews.llvm.org/D10351 review thread for more details.

Test Plan: regression test suite

Reviewers: spatel, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10351

llvm-svn: 239479
2015-06-10 17:37:38 +00:00
Akira Hatanaka d9699bc7bd Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099

llvm-svn: 239427
2015-06-09 19:07:19 +00:00
Akira Hatanaka 4a61619ff5 [ARM] Pass a callback to FunctionPass constructors to enable skipping execution
on a per-function basis.

Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.

rdar://problem/20542263

Differential Revision: http://reviews.llvm.org/D8717

llvm-svn: 239325
2015-06-08 18:50:43 +00:00
Michael Zolotukhin a60bdb5639 Remove SCEVCache and FindConstantPointers from complete loop unrolling heuristic.
Summary:
Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor.
Also, this makes the code more universal - I'll take advandate of it in next patches where I start handling additional types of instructions.

Test Plan: Tests would be submitted in subsequent patches.

Reviewers: atrick, chandlerc

Reviewed By: atrick, chandlerc

Subscribers: atrick, llvm-commits

Differential Revision: http://reviews.llvm.org/D10205

llvm-svn: 239282
2015-06-08 03:28:06 +00:00
Matt Arsenault e81944fd5e SeparateConstOffsetFromGEP: Pass address space to isLegalAddressingMode
llvm-svn: 239262
2015-06-07 20:17:44 +00:00
Matt Arsenault fb88aca348 Make NaryReassociate pass the address space to isLegalAddressingMode
No test since the kinds of transforms this prevents seem to not really
be relevant for SI's different addressing modes.

llvm-svn: 239261
2015-06-07 20:17:42 +00:00
Benjamin Kramer 82f865277e Remove global std::string. NFC.
llvm-svn: 239254
2015-06-07 16:36:28 +00:00
Sanjoy Das ad714b1af3 [LoopUnroll] Fix truncation bug in canUnrollCompletely.
Summary:
canUnrollCompletely takes `unsigned` values for `UnrolledCost` and
`RolledDynamicCost` but is passed in `uint64_t`s that are silently
truncated.  Because of this, when `UnrolledSize` is a large integer
that has a small remainder with UINT32_MAX, LLVM tries to completely
unroll loops with high trip counts.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10293

llvm-svn: 239218
2015-06-06 05:24:10 +00:00
David Majnemer 1c297e66fb [CVP] Don't assume Constants of type i1 can be known to be true or false
CVP wants to analyze the condition operand of a select along an edge.
It succeeds in getting back a Constant but not a ConstantInt.  Instead,
it gets a ConstantExpr.  It then assumes that the Constant must be equal
to false because it isn't equal to true.

Instead, perform an additional comparison.

This fixes PR23752.

llvm-svn: 239217
2015-06-06 04:56:51 +00:00
Chandler Carruth 9dabd14d59 [Unroll] Rework the naming and structure of the new unroll heuristics.
The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:

- '*Threshold' is the threshold for full unrolling. It is measured
  against the estimated unrolled cost as computed by getUserCost in TTI
  (or CodeMetrics, etc). We will exceed this threshold when unrolling
  loops where unrolling exposes a significant degree of simplification
  of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
  estimated dynamic execution cost which needs to be saved by unrolling
  to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
  unrolling cost when the dynamic savings are expected to be high.

When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.

While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.

This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heursistics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.

The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placehold numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a tow-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.

Differential Revision: http://reviews.llvm.org/D9966

llvm-svn: 239164
2015-06-05 17:01:43 +00:00
Chandler Carruth 70c61c1a8a [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and
port it to the new pass manager.

All this does is extract the inner "location" class used by AA into its
own full fledged type. This seems *much* cleaner as MemoryDependence and
soon MemorySSA also use this heavily, and it doesn't make much sense
being inside the AA infrastructure.

This will also make it much easier to break apart the AA infrastructure
into something that stands on its own rather than using the analysis
group design.

There are a few places where this makes APIs not make sense -- they were
taking an AliasAnalysis pointer just to build locations. I'll try to
clean those up in follow-up commits.

Differential Revision: http://reviews.llvm.org/D10228

llvm-svn: 239003
2015-06-04 02:03:15 +00:00
Vasileios Kalintiris 9f77f61ef3 Remove stray semicolon. NFC.
llvm-svn: 238908
2015-06-03 08:51:30 +00:00
Sanjoy Das 353a19e13c [RewriteStatepointsForGC] Strip deref info after rewriting.
Summary:
Once a gc.statepoint has been rewritten to relocate live references, the
SSA values represent physical pointers instead of logical references.
Logical dereferencability does not imply physical dereferencability and
after RewriteStatepointsForGC has run any attributes that imply
dereferencability of the logical references need to be stripped.

This current approach is conservative, and can be made more precise
later if needed.  For starters, we need to strip dereferencable
attributes only from pointers that live in the GC address space.

Reviewers: reames, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10105

llvm-svn: 238883
2015-06-02 22:33:37 +00:00
Sanjoy Das ea45f0e054 [NFCI] Change RewriteStatepointsForGC to a ModulePass.
Summary:
A later change that has RewriteStatepointsForGC change function
attributes throughout the module depends on this.

Reviewers: reames, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10104

llvm-svn: 238882
2015-06-02 22:33:34 +00:00
Owen Anderson 15d1805504 Teach the IR Sink pass to (conservatively) respect convergent annotations.
llvm-svn: 238762
2015-06-01 17:20:31 +00:00
Benjamin Kramer f5e2fc474d Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.


Call sites were found with the ASTMatcher + some semi-automated cleanup.

memberCallExpr(
    argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
    on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
    hasArgument(0, bindTemporaryExpr(
                       hasType(recordDecl(hasNonTrivialDestructor())),
                       has(constructExpr()))),
    unless(isInTemplateInstantiation()))

No functional change intended.

llvm-svn: 238602
2015-05-29 19:43:39 +00:00
Wei Mi e2538b5639 Enable exitValue rewrite only when the cost of expansion is low.
The patch evaluates the expansion cost of exitValue in indVarSimplify pass, and only does the rewriting when the expansion cost is low or loop can be deleted with the rewriting. It provides an option "-replexitval=" to control the default aggressiveness of the exitvalue rewriting. It also fixes some missing cases in SCEVExpander::isHighCostExpansionHelper to enhance the evaluation of SCEV expansion cost.

Differential Revision: http://reviews.llvm.org/D9800

llvm-svn: 238507
2015-05-28 21:49:07 +00:00
David Majnemer 587336d2ad [Reassociate] Canonicalizing 'x [+-] (-Constant * y)' isn't always a win
Canonicalizing 'x [+-] (-Constant * y)' is not a win if we don't *know*
we will open up CSE opportunities.

If the multiply was 'nsw', then negating 'y' requires us to clear the
'nsw' flag.  If this is actually worth pursuing, it is probably more
appropriate to do so in GVN or EarlyCSE.

This fixes PR23675.

llvm-svn: 238397
2015-05-28 06:16:39 +00:00
Jingyue Wu c2a014697a [NaryReassociate] Run EarlyCSE after NaryReassociate
Summary:
This patch made two improvements to NaryReassociate and the NVPTX pipeline

1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common
expressions.

2. When adding an instruction to SeenExprs, maps both the SCEV before and after
reassociation to that instruction.

Test Plan: updated @reassociate_gep_nsw in nary-gep.ll

Reviewers: meheff, broune

Reviewed By: broune

Subscribers: dberlin, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9947

llvm-svn: 238396
2015-05-28 04:56:52 +00:00