Commit Graph

7406 Commits

Author SHA1 Message Date
Adam Nemet c1d21817d1 [LAA, LV] Port to new streaming interface for opt remarks. Update LV
OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

llvm-svn: 282758
2016-09-29 20:12:18 +00:00
Dehao Chen 5461d8bdb5 Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update.
Summary: This refactors the change in r282616

Reviewers: davidxl, eraman, mehdi_amini

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D25041

llvm-svn: 282630
2016-09-28 21:00:58 +00:00
Dehao Chen 80c8ebb4d8 Fix the bug when -compile-twice is specified, the PSI will be invalidated.
Summary:
When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary.
The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24993

llvm-svn: 282616
2016-09-28 18:41:14 +00:00
Artur Pilipenko b6ce6e5dac Don't look through addrspacecast in GetPointerBaseWithConstantOffset
Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value.

This is similar to D13008.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D24729

llvm-svn: 282612
2016-09-28 17:57:16 +00:00
Sanjoy Das f0022125e0 [SCEV] Use a SmallPtrSet as a temporary union predicate; NFC
Summary:
Instead of creating and destroying SCEVUnionPredicate instances (which
internally creates and destroys a DenseMap), use temporary SmallPtrSet
instances of remember the set of predicates that will get reified into a
SCEVUnionPredicate.

Reviewers: silviu.baranga, sbaranga

Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D25000

llvm-svn: 282606
2016-09-28 17:14:58 +00:00
Sanjay Patel 220a8730fb [InstSimplify] allow or-of-icmps folds with vector splat constants
llvm-svn: 282592
2016-09-28 14:27:21 +00:00
Sanjay Patel 1b312ad42d [InstSimplify] allow and-of-icmps folds with vector splat constants
llvm-svn: 282590
2016-09-28 13:53:13 +00:00
Adam Nemet 69330e0bc5 [LAA] Rename emitAnalysis to recordAnalys. NFC
Ever since LAA was split out into an analysis on its own, this function
stopped emitting the report directly.  Instead it stores it to be
retrieved by the client which can then emit it as its own report
(e.g. -Rpass-analysis=loop-vectorize).

llvm-svn: 282561
2016-09-28 00:58:36 +00:00
Adam Nemet c507ac96f5 [Inliner] Port all opt remarks to new streaming API
llvm-svn: 282559
2016-09-27 23:47:03 +00:00
Adam Nemet 04758ba385 Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC
With the new streaming interface, these class names need to be typed a
lot and it's way too looong.

llvm-svn: 282544
2016-09-27 22:19:23 +00:00
Adam Nemet a62b7e1a28 Output optimization remarks in YAML
(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282539
2016-09-27 20:55:07 +00:00
Sanjoy Das 237c84540f [SCEV] Replace a struct with a function; NFC
We can do this now thanks to C++11 lambdas.

llvm-svn: 282515
2016-09-27 18:01:48 +00:00
Sanjoy Das a26021414a [SCEV] Use find instead of find_as; NFC
We don't need the extra generality here.

llvm-svn: 282514
2016-09-27 18:01:46 +00:00
Sanjoy Das c220ac79c4 [SCEV] Reduce the scope of a struct; NFC
llvm-svn: 282513
2016-09-27 18:01:44 +00:00
Sanjoy Das c46bceb632 [SCEV] Remove custom RAII wrapper; NFC
Instead use the pre-existing `scope_exit` class.

llvm-svn: 282512
2016-09-27 18:01:42 +00:00
Sanjoy Das db93375711 [SCEV] Make PendingLoopPredicates more frugal; NFCI
I don't expect `PendingLoopPredicates` to have very many
elements (e.g. when -O3'ing the sqlite3 amalgamation,
`PendingLoopPredicates` has at most 3 elements).  So now we use a
`SmallPtrSet` for it instead of the more heavyweight `DenseSet`.

llvm-svn: 282511
2016-09-27 18:01:38 +00:00
Adam Nemet cc2a3fa8e8 Revert "Output optimization remarks in YAML"
This reverts commit r282499.

The GCC bots are failing

llvm-svn: 282503
2016-09-27 16:39:24 +00:00
Adam Nemet 92e928c10a Output optimization remarks in YAML
This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282499
2016-09-27 16:15:16 +00:00
Piotr Padlewski d9830eb79f [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.

I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.

I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.

Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

llvm-svn: 282437
2016-09-26 20:37:32 +00:00
Chandler Carruth 68abda52c2 [SCEV] Fix the order of members in the initializer list.
Noticed due to the warning on this line. Sanjoy is on
a less-than-awesome internet connection, so committing on his behalf.

llvm-svn: 282380
2016-09-26 04:49:58 +00:00
Sanjoy Das 5cb11b6423 [SCEV] Assign LoopPropertiesCache in the move constructor
In a previous change I collapsed two different caches into one.  When
doing that I noticed that ScalarEvolution's move constructor was not
moving those caches.

To keep the previous change simple, I've moved that bugfix into this
separate change.

llvm-svn: 282376
2016-09-26 02:44:10 +00:00
Sanjoy Das 5603fc00a6 [SCEV] Combine two predicates into one; NFC
Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking
the loop and maintaining similar sorts of caches.  This commit changes
SCEV to compute both the predicates via a single walk, and maintain a
single cache instead of two.

llvm-svn: 282375
2016-09-26 02:44:07 +00:00
Sanjoy Das 5c4869b39d [SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage
Specifically, it moves SCEVUnionPredicates from its input into its own
storage.  Make this obvious at the type level.

llvm-svn: 282374
2016-09-26 01:10:27 +00:00
Sanjoy Das 6b76cdf0d5 [SCEV] Further isolate incidental data structure; NFC
llvm-svn: 282373
2016-09-26 01:10:25 +00:00
Sanjoy Das 7326861abd [SCEV] Simplify BackedgeTakenInfo::getMax; NFC
llvm-svn: 282372
2016-09-26 01:10:22 +00:00
Sanjoy Das e935c77e20 [SCEV] Reserve space in SmallVector; NFC
llvm-svn: 282368
2016-09-25 23:12:08 +00:00
Sanjoy Das c9bbf56358 [SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFC
SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to
store the (optional) data out of line.

llvm-svn: 282366
2016-09-25 23:12:04 +00:00
Sanjoy Das d1eb62ad11 [SCEV] Simplify tracking ExitNotTakenInfo instances; NFC
This change simplifies a data structure optimization in the
`BackedgeTakenInfo` class for loops with exactly one computable exit.

I've sanity checked that this does not regress compile time performance,
using sqlite3's amalgamated build.

llvm-svn: 282365
2016-09-25 23:12:00 +00:00
Sanjoy Das 89eea6b2ed [SCEV] Rename a couple of fields; NFC
llvm-svn: 282364
2016-09-25 23:11:57 +00:00
Sanjoy Das bdd9710252 [SCEV] Remove incidental data structure; NFC
llvm-svn: 282363
2016-09-25 23:11:55 +00:00
Duncan P. N. Exon Smith b1b208a1f5 Analysis: Return early for UndefValue in computeKnownBits
There is no benefit in looking through assumptions on UndefValue to
guess known bits.  Return early to avoid walking their use-lists, and
assert that all instances of ConstantData are handled here for similar
reasons (UndefValue was the only integer/pointer holdout).

llvm-svn: 282337
2016-09-24 20:42:02 +00:00
Duncan P. N. Exon Smith b479873912 Analysis: Return early in isKnownNonNullAt for ConstantData
Check and return early for ConstantPointerNull and UndefValue
specifically in isKnownNonNullAt, and assert that ConstantData never
make it to isKnownNonNullFromDominatingCondition.

This confirms that isKnownNonNullFromDominatingCondition never walks
through the use-list of an instance of ConstantData.  Given that such
use-lists cross module boundaries, it never really made sense to do so,
and was potentially very expensive.

llvm-svn: 282333
2016-09-24 19:39:47 +00:00
Sanjay Patel 04949faf69 [TLI] isdigit / isascii / toascii param type should match return type (PR30484)
We crash in LibCallSimplifier if we don't check the validity of the function signature properly.

llvm-svn: 282278
2016-09-23 18:44:09 +00:00
Jun Bum Lim 3822939ba7 Enhance calcColdCallHeuristics for InvokeInst
Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst.

Reviewers: mcrosier, hfinkel, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D24868

llvm-svn: 282262
2016-09-23 17:26:14 +00:00
Adam Nemet e3cef93727 [LV] When reporting about a specific instruction without debug location use loop's
This can occur for example if some optimization drops the debug location.

llvm-svn: 282048
2016-09-21 03:14:20 +00:00
Michael Kuperstein 79dcc274b4 [InferAttributes] Don't access parameters that don't exist.
Check for the correct number of parameters before querying their type.
This fixes PR30455.

llvm-svn: 282038
2016-09-20 23:10:31 +00:00
Sanjay Patel b2332e1931 move variables closer to their uses; add FIXMEs; NFC
llvm-svn: 281972
2016-09-20 14:36:14 +00:00
Elena Demikhovsky 5f8cc0c346 [Loop Vectorizer] Consecutive memory access - fixed and simplified
Amended consecutive memory access detection in Loop Vectorizer.
Load/Store were not handled properly without preceding GEP instruction.

Differential Revision: https://reviews.llvm.org/D20789

llvm-svn: 281853
2016-09-18 13:56:08 +00:00
Sanjay Patel c96f6db246 [InstCombine] allow vector types for constant folding / computeKnownBits (PR24942)
computeKnownBits() already works for integer vectors, so allow vector types when calling that from InstCombine.

I don't think the change to use m_APInt in computeKnownBits is strictly necessary because we do check for 
ConstantVector later, but it's more efficient to handle the splat case without needing to loop on vector elements.

This should work with InstSimplify, but doesn't yet, so I made that a FIXME comment on the test for PR24942:
https://llvm.org/bugs/show_bug.cgi?id=24942

Differential Revision: https://reviews.llvm.org/D24677

llvm-svn: 281777
2016-09-16 21:20:36 +00:00
David L Kreitzer 8bbabee21a Reapplying r278731 after fixing the problem that caused it to be reverted.
Enhance SCEV to compute the trip count for some loops with unknown stride.

Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 281732
2016-09-16 14:38:13 +00:00
Chandler Carruth 49d728ad21 [LCG] Redesign the lazy post-order iteration mechanism for the
LazyCallGraph to support repeated, stable iterations, even in the face
of graph updates.

This is particularly important to allow the CGSCC pass manager to walk
the RefSCCs (and thus everything else) in a module more than once. Lots
of unittests and other tests were hard or impossible to write because
repeated CGSCC pass managers which didn't invalidate the LazyCallGraph
would conclude the module was empty after the first one. =[ Really,
really bad.

The interesting thing is that in many ways this simplifies the code. We
can now re-use the same code for handling reference edge insertion
updates of the RefSCC graph as we use for handling call edge insertion
updates of the SCC graph. Outside of adapting to the shared logic for
this (which isn't trivial, but is *much* simpler than the DFS it
replaces!), the new code involves putting newly created RefSCCs when
deleting a reference edge into the cached list in the correct way, and
to re-formulate the iterator to be stable and effective even in the face
of these kinds of updates.

I've updated the unittests for the LazyCallGraph to re-iterate the
postorder sequence and verify that this all works. We even check for
using alternating iterators to trigger the lazy formation of RefSCCs
after mutation has occured.

It's worth noting that there are a reasonable number of likely
simplifications we can make past this. It isn't clear that we need to
keep the "LeafRefSCCs" around any more. But I've not removed that mostly
because I want this to be a more isolated change.

Differential Revision: https://reviews.llvm.org/D24219

llvm-svn: 281716
2016-09-16 10:20:17 +00:00
Sriraman Tallam 06a67ba57d [PM] Port CFGViewer and CFGPrinter to the new Pass Manager
Differential Revision: https://reviews.llvm.org/D24592

llvm-svn: 281640
2016-09-15 18:35:27 +00:00
Wei Mi f160e345be Add some shortcuts in LazyValueInfo to reduce compile time of Correlated Value Propagation.
The patch is to partially fix PR10584. Correlated Value Propagation queries LVI
to check non-null for pointer params of each callsite. If we know the def of
param is an alloca instruction, we know it is non-null and can return early from
LVI. Similarly, CVP queries LVI to check whether pointer for each mem access is
constant. If the def of the pointer is an alloca instruction, we know it is not
a constant pointer. These shortcuts can reduce the cost of CVP significantly.

Differential Revision: https://reviews.llvm.org/D18066

llvm-svn: 281586
2016-09-15 06:28:34 +00:00
Wei Mi 24662395df Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.

This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.

Differential Revision: https://reviews.llvm.org/D24088

llvm-svn: 281439
2016-09-14 04:39:50 +00:00
Andrea Di Biagio 7277afeec1 [ConstantFold] Improve the bitcast folding logic for constant vectors.
The constant folder didn't know how to always fold bitcasts of constant integer
vectors. In particular, it was unable to handle the case where a constant vector
had some undef elements, and the resulting (i.e. bitcasted) vector type had more
elements than the original vector type.

Example:
  %cast = bitcast <2 x i64><i64 undef, i64 2> to <4 x i32>

On a little endian target, %cast could have been folded to:
  <4 x i32><i32 undef, i32 undef, i32 2, i32 0>

This patch improves the folding logic by teaching how to correctly propagate
undef elements in the folded vector.

Differential Revision: https://reviews.llvm.org/D24301

llvm-svn: 281343
2016-09-13 14:50:47 +00:00
Philip Reames 9db7948e90 [LVI] Complete the abstract of the cache layer [NFCI]
Convert the previous introduced is-a relationship between the LVICache and LVIImple clases into a has-a relationship and hide all the implementation details of the cache from the lazy query layer.

The only slightly concerning change here is removing the addition of a queried block into the SeenBlock set in LVIImpl::getBlockValue.  As far as I can tell, this was effectively dead code.  I think it *used* to be the case that getCachedValueInfo wasn't const and might end up inserting elements in the cache during lookup.  That's no longer true and hasn't been for a while.  I did fixup the const usage to make that more obvious.

llvm-svn: 281272
2016-09-12 22:38:44 +00:00
Philip Reames b627aec407 [LVI] Sink a couple more cache manipulation routines into the cache itself [NFCI]
The only interesting bit here is the refactor of the handle callback and even that's pretty straight-forward.

llvm-svn: 281267
2016-09-12 22:03:36 +00:00
Philip Reames 92e5e1b92d [LVI] Abstract out the actual cache logic [NFCI]
Seperate the caching logic from the implementation of the lazy analysis.  For the moment, the lazy analysis impl has a is-a relationship with the cache; this will change to a has-a relationship shortly.  This was done as two steps merely to keep the changes simple and the diff understandable.

llvm-svn: 281266
2016-09-12 21:46:58 +00:00
Justin Lebar 11a3204355 Add handling of !invariant.load to PropagateMetadata.
Summary:
This will let e.g. the load/store vectorizer propagate this metadata
appropriately.

Reviewers: arsenm

Subscribers: tra, jholewinski, hfinkel, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23479

llvm-svn: 281153
2016-09-11 01:39:08 +00:00
Dehao Chen 22ce5eb051 Do not widen load for different variable in GVN.
Summary:
Widening load in GVN is too early because it will block other optimizations like PRE, LICM.

https://llvm.org/bugs/show_bug.cgi?id=29110

The SPECCPU2006 benchmark impact of this patch:

Reference: o2_nopatch
(1): o2_patched

           Benchmark             Base:Reference   (1)  
-------------------------------------------------------
spec/2006/fp/C++/444.namd                  25.2  -0.08%
spec/2006/fp/C++/447.dealII               45.92  +1.05%
spec/2006/fp/C++/450.soplex                41.7  -0.26%
spec/2006/fp/C++/453.povray               35.65  +1.68%
spec/2006/fp/C/433.milc                   23.79  +0.42%
spec/2006/fp/C/470.lbm                    41.88  -1.12%
spec/2006/fp/C/482.sphinx3                47.94  +1.67%
spec/2006/int/C++/471.omnetpp             22.46  -0.36%
spec/2006/int/C++/473.astar               21.19  +0.24%
spec/2006/int/C++/483.xalancbmk           36.09  -0.11%
spec/2006/int/C/400.perlbench             33.28  +1.35%
spec/2006/int/C/401.bzip2                 22.76  -0.04%
spec/2006/int/C/403.gcc                   32.36  +0.12%
spec/2006/int/C/429.mcf                   41.04  -0.41%
spec/2006/int/C/445.gobmk                 26.94  +0.04%
spec/2006/int/C/456.hmmer                  24.5  -0.20%
spec/2006/int/C/458.sjeng                    28  -0.46%
spec/2006/int/C/462.libquantum            55.25  +0.27%
spec/2006/int/C/464.h264ref               45.87  +0.72%

geometric mean                                   +0.23%

For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.

Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames

Subscribers: gberry, junbuml

Differential Revision: https://reviews.llvm.org/D24096

llvm-svn: 281074
2016-09-09 18:42:35 +00:00
Chandler Carruth 11b3f60cd9 [LCG] Clean up and make NDEBUG verify calls more rigorous with
make_scope_exit now that we have that utility.

This makes the code much more clear and readable by isolating the check.
It also makes it easy to go through and make sure all the interesting
update routines have a start and end verify so we don't slowly let the
graph drift into an invalid state.

llvm-svn: 280619
2016-09-04 08:34:31 +00:00
Chandler Carruth 1f621f0a70 [LCG] A NFC refactoring to extract the logic for doing
a postorder-sequence based update after edge insertion into a generic
helper function.

This separates the SCC-specific logic into two fairly simple lambdas and
extracts the rest into a generic helper template function. I think this
is a net win on its own merits because it disentangles different pieces
of the algorithm. Now there is one place that does the two-step
partition to identify a set of newly connected components and at the
same time update the postorder sequence.

However, I'm also hoping to re-use this an upcoming patch to update
a cached post-order sequence of RefSCCs when doing the analogous update
to the RefSCC graph, and I don't want to have two copies.

The diff is quite messy but this really is just moving things around and
making types generic rather than specific.

llvm-svn: 280618
2016-09-04 08:34:24 +00:00
Andrea Di Biagio bff3fd6700 Simplify code a bit. No functional change intended.
We don't need to call `GetCompareTy(LHS)' every single time true or false is
returned from function SimplifyFCmpInst as suggested by Sanjay in review D24142.

llvm-svn: 280491
2016-09-02 15:55:25 +00:00
Andrea Di Biagio 805815f407 [instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all NaN.
This patch fixes a crash caused by an incorrect folding of an ordered comparison
between a packed floating point vector and a splat vector of NaN.

An ordered comparison between a vector and a constant vector of NaN, should
always be folded into a constant vector where each element is i1 false.

Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar
'false'. Later on, this would cause an assertion failure, since the value type
of the folded value doesn't match the expected value type of the uses of the
original instruction: "Assertion failed: New->getType() == getType() &&
"replaceAllUses of value with new value of different type!".

This patch fixes the issue and adds a test case to the already existing test
InstSimplify/floating-point-compares.ll.

Differential Revision: https://reviews.llvm.org/D24143

llvm-svn: 280488
2016-09-02 14:47:43 +00:00
Michael Zolotukhin e0b2d97b52 [LoopInfo] Add verification by recomputation.
Summary:
Current implementation of LI verifier isn't ideal and fails to detect
some cases when LI is incorrect. For instance, it checks that all
recorded loops are in a correct form, but it has no way to check if
there are no more other (unrecorded in LI) loops in the function. This
patch adds a way to detect such bugs.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits, silvas, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23437

llvm-svn: 280280
2016-08-31 19:26:19 +00:00
Chad Rosier 83a120337a Fix indent. NFC.
llvm-svn: 280270
2016-08-31 18:37:52 +00:00
Tim Shen 48f814e8a3 s/static inline/static/ for headers I have changed in r279475. NFC.
llvm-svn: 280257
2016-08-31 16:48:13 +00:00
David Majnemer a90e51e106 [Loads] Properly populate the visited set in isDereferenceableAndAlignedPointer
There were paths where we wouldn't populate the visited set, causing us
to recurse forever if an SSA variable was defined in terms of itself.

This fixes PR30210.

llvm-svn: 280191
2016-08-31 03:22:32 +00:00
NAKAMURA Takumi b673b16857 Fixup r279618, instantiate *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC>, instead of *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID.
Or they were not instantiated as expected;

  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID
  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID

llvm-svn: 280105
2016-08-30 15:47:13 +00:00
Piotr Padlewski 442d38c0b4 NFC: add early exit in ModuleSummaryAnalysis
Summary:
Changed this code because it was not very readable.
The one question that I got after changing it is, should we
count calls to intrinsics? We don't add them to caller summary,
so maybe we shouldn't also count them?

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23949

llvm-svn: 280036
2016-08-30 00:46:26 +00:00
Easwaran Raman 7060af9d22 Fix a thinko in r278189.
llvm-svn: 280008
2016-08-29 20:45:51 +00:00
Elena Demikhovsky 3622fbfc68 [Loop Vectorizer] Fixed memory confilict checks.
Fixed a bug in run-time checks for possible memory conflicts inside loop.
The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size".

Differential revision: https://reviews.llvm.org/D23176

llvm-svn: 279930
2016-08-28 08:53:53 +00:00
Adam Nemet cef3314156 [Inliner] Report when inlining fails because callee's def is unavailable
Summary:
This is obviously an interesting case because it may motivate code
restructuring or LTO.

Reporting this requires instantiation of ORE in the loop where the call
sites are first gathered.  I've checked compile-time
overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in
mcf and quickly tailing off.  As before without -Rpass-with-hotness
there is no overhead.

Because this could be a pretty noisy diagnostics, it is currently
qualified as 'verbose'.  As of this patch, 'verbose' diagnostics are
only emitted with -Rpass-with-hotness, i.e. when the output is expected
to be filtered.

Reviewers: eraman, chandlerc, davidxl, hfinkel

Subscribers: tejohnson, Prazek, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D23415

llvm-svn: 279860
2016-08-26 20:21:05 +00:00
Bob Haarman 3db176410a limit the number of instructions per block examined by dead store elimination
Summary: Dead store elimination gets very expensive when large numbers of instructions need to be analyzed. This patch limits the number of instructions analyzed per store to the value of the memdep-block-scan-limit parameter (which defaults to 100). This resulted in no observed difference in performance of the generated code, and no change in the statistics for the dead store elimination pass, but improved compilation time on some files by more than an order of magnitude.

Reviewers: dexonsmith, bruno, george.burgess.iv, dberlin, reames, davidxl

Subscribers: davide, chandlerc, dberlin, davidxl, eraman, tejohnson, mbodart, llvm-commits

Differential Revision: https://reviews.llvm.org/D15537

llvm-svn: 279833
2016-08-26 16:34:27 +00:00
George Burgess IV ff7205ca85 Update a comment.
r279696, which changed `LLVM_CONSTEXPR AliasAttr` to `const AliasAttr`,
made this comment make less sense.

llvm-svn: 279699
2016-08-25 01:29:55 +00:00
George Burgess IV 381fc0ee3c Make some LLVM_CONSTEXPR variables const. NFC.
This patch changes LLVM_CONSTEXPR variable declarations to const
variable declarations, since LLVM_CONSTEXPR expands to nothing if the
current compiler doesn't support constexpr. In all of the changed
cases, it looks like the code intended the variable to be const instead
of sometimes-constexpr sometimes-not.

llvm-svn: 279696
2016-08-25 01:05:08 +00:00
Eugene Zelenko 1804a77b2a Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes.
Differential revision: https://reviews.llvm.org/D23861

llvm-svn: 279695
2016-08-25 00:45:04 +00:00
Evgeny Stupachenko d7f9c3564a The patch improves ValueTracking on left shift with nsw flag.
Summary:
The patch fixes PR28946.

Reviewers: majnemer, sanjoy

Differential Revision: http://reviews.llvm.org/D23296

From: Li Huang
llvm-svn: 279684
2016-08-24 23:01:33 +00:00
Chandler Carruth 8882346842 [PM] Introduce basic update capabilities to the new PM's CGSCC pass
manager, including both plumbing and logic to handle function pass
updates.

There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
   CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
   the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.

I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.

The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.

I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.

The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:

- We operate at three levels within the infrastructure: RefSCC, SCC, and
  Node. In each case, we are working bottom up and so we want to
  continue to iterate on the "lowest" node as the graph changes. Look at
  how we iterate over nodes in an SCC running function passes as those
  function passes mutate the CG. We continue to iterate on the "lowest"
  SCC, which is the one that continues to contain the function just
  processed.

- The call graph structure re-uses SCCs (and RefSCCs) during mutation
  events for the *highest* entry in the resulting new subgraph, not the
  lowest. This means that it is necessary to continually update the
  current SCC or RefSCC as it shifts. This is really surprising and
  subtle, and took a long time for me to work out. I actually tried
  changing the call graph to provide the opposite behavior, and it
  breaks *EVERYTHING*. The graph update algorithms are really deeply
  tied to this particualr pattern.

- When SCCs or RefSCCs are split apart and refined and we continually
  re-pin our processing to the bottom one in the subgraph, we need to
  enqueue the newly formed SCCs and RefSCCs for subsequent processing.
  Queuing them presents a few challenges:
  1) SCCs and RefSCCs use wildly different iteration strategies at
     a high level. We end up needing to converge them on worklist
     approaches that can be extended in order to be able to handle the
     mutations.
  2) The order of the enqueuing need to remain bottom-up post-order so
     that we don't get surprising order of visitation for things like
     the inliner.
  3) We need the worklists to have set semantics so we don't duplicate
     things endlessly. We don't need a *persistent* set though because
     we always keep processing the bottom node!!!! This is super, super
     surprising to me and took a long time to convince myself this is
     correct, but I'm pretty sure it is... Once we sink down to the
     bottom node, we can't re-split out the same node in any way, and
     the postorder of the current queue is fixed and unchanging.
  4) We need to make sure that the "current" SCC or RefSCC actually gets
     enqueued here such that we re-visit it because we continue
     processing a *new*, *bottom* SCC/RefSCC.

- We also need the ability to *skip* SCCs and RefSCCs that get merged
  into a larger component. We even need the ability to skip *nodes* from
  an SCC that are no longer part of that SCC.

This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.

We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.

Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:

- It is really nice to do this a function at a time because that
  function is likely hot in the cache. This means we want even the
  function pass adaptor to support online updates to the call graph!

- To update the call graph after arbitrary function pass mutations is
  quite hard. We have to build a fairly comprehensive set of
  data structures and then process them. Fortunately, some of this code
  is related to the code for building the cal graph in the first place.
  Unfortunately, very little of it makes any sense to share because the
  nature of what we're doing is so very different. I've factored out the
  one part that made sense at least.

- We need to transfer these updates into the various structures for the
  CGSCC pass manager. Once those were more sanely worked out, this
  became relatively easier. But some of those needs necessitated changes
  to the LazyCallGraph interface to make it significantly easier to
  extract the changed SCCs from an update operation.

- We also need to update the CGSCC analysis manager as the shape of the
  graph changes. When an SCC is merged away we need to clear analyses
  associated with it from the analysis manager which we didn't have
  support for in the analysis manager infrsatructure. New SCCs are easy!
  But then we have the case that the original SCC has its shape changed
  but remains in the call graph. There we need to *invalidate* the
  analyses associated with it.

- We also need to invalidate analyses after we *finish* processing an
  SCC. But the analyses we need to invalidate here are *only those for
  the newly updated SCC*!!! Because we only continue processing the
  bottom SCC, if we split SCCs apart the original one gets invalidated
  once when its shape changes and is not processed farther so its
  analyses will be correct. It is the bottom SCC which continues being
  processed and needs to have the "normal" invalidation done based on
  the preserved analyses set.

All of this is mostly background and context for the changes here.

Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.

Differential Revision: http://reviews.llvm.org/D21464

llvm-svn: 279618
2016-08-24 09:37:14 +00:00
David Majnemer 54690dcdb0 [ValueTracking] Use a function_ref to avoid multiple instantiations
No functional change intended, this should just be a code size
improvement.

llvm-svn: 279563
2016-08-23 20:52:00 +00:00
Sanjay Patel 6946e2ade3 [InstSimplify] allow icmp with constant folds for splat vectors, part 2
Completes the m_APInt changes for simplifyICmpWithConstant().

Other commits in this series:
https://reviews.llvm.org/rL279492
https://reviews.llvm.org/rL279530
https://reviews.llvm.org/rL279534
https://reviews.llvm.org/rL279538

llvm-svn: 279543
2016-08-23 18:00:51 +00:00
Sanjay Patel 200e3cbfb0 [InstSimplify] allow icmp with constant folds for splat vectors, part 1
llvm-svn: 279538
2016-08-23 17:30:56 +00:00
Sanjay Patel 67bde28627 [InstSimplify] add helper function for SimplifyICmpInst(); NFCI
And add a FIXME because the helper excludes folds for vectors. It's
not clear yet how many of these are actually testable (and therefore
necessary?) because later analysis uses computeKnownBits and other
methods to catch many of these cases.

llvm-svn: 279492
2016-08-22 23:12:02 +00:00
Tim Shen f2187ed321 [GraphTraits] Replace all NodeType usage with NodeRef
This should finish the GraphTraits migration.

Differential Revision: http://reviews.llvm.org/D23730

llvm-svn: 279475
2016-08-22 21:09:30 +00:00
Artur Pilipenko bc76ecada0 Revert -r278267 [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
This change cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks.

See https://reviews.llvm.org/D18777 for details.

llvm-svn: 279433
2016-08-22 13:14:07 +00:00
Tim Shen b5e0f5ac95 [GraphTraits] Make nodes_iterator dereference to NodeType*/NodeRef
Currently nodes_iterator may dereference to a NodeType* or a NodeType&. Make them all dereference to NodeType*, which is NodeRef later.

Differential Revision: https://reviews.llvm.org/D23704
Differential Revision: https://reviews.llvm.org/D23705

llvm-svn: 279326
2016-08-19 21:20:13 +00:00
Michael Kuperstein 41898f0396 [AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large.
Repeated inserts into AliasSetTracker have quadratic behavior - inserting a
pointer into AST is linear, since it requires walking over all "may" alias
sets and running an alias check vs. every pointer in the set.

We can avoid this by tracking the total number of pointers in "may" sets,
and when that number exceeds a threshold, declare the tracker "saturated".
This lumps all pointers into a single "may" set that aliases every other
pointer.

(This is a stop-gap solution until we migrate to MemorySSA)

This fixes PR28832.
Differential Revision: https://reviews.llvm.org/D23432

llvm-svn: 279274
2016-08-19 17:05:22 +00:00
Chandler Carruth b7be5b6479 [PM] Rework the new PM support for building the ModuleSummaryIndex to
directly produce the index as the value type result.

This requires making the index movable which is straightforward. It
greatly simplifies things by allowing us to completely avoid the builder
API and the layers of abstraction inherent there. Instead both pass
managers can directly construct these when run by value. They still
won't be constructed truly eagerly thanks to the optional in the legacy
PM. The code that directly builds the index can also just share a direct
function.

A notable change here is that the result type of the analysis for the
new PM is no longer a reference type. This was really problematic when
making changes to how we handle result types to make our interface
requirements *much* more strict and precise. But I think this is an
overall improvement.

Differential Revision: https://reviews.llvm.org/D23701

llvm-svn: 279216
2016-08-19 07:49:19 +00:00
Chandler Carruth e2f36bcb84 [Assumptions] Make collecting ephemeral values not quadratic in the
number of assume intrinsics.

The classical way to have a cache-friendly vector style container when
we need queue semantics for BFS instead of stack semantics for DFS is to
use an ever-growing vector and an index. Erasing from the front requires
O(size) work, and unless we expect the worklist to grow *very* large,
its probably cheaper to just grow and race down the list.

But that makes it more bad that we're putting the assume intrinsics in
this at all. We end up looking at the (by definition empty) use list to
see if they're ephemeral (when we've already put them in that set), etc.

Instead, directly populate the worklist with the operands when we mark
the assume intrinsics as ephemeral. Also, test the visited set *before*
putting things into the worklist so we don't accumulate the same value
in the list 100s of times.

It would be nice to use a set-vector for this but I think its useful to
test the set earlier to avoid repeatedly querying whether the same
instruction is safe to speculate.

Hopefully with these changes the number of values pushed onto the
worklist is smaller, and we avoid quadratic work by letting it grow as
necessary.

Differential Revision: https://reviews.llvm.org/D23396

llvm-svn: 279099
2016-08-18 17:51:24 +00:00
Hans Wennborg 3879035e66 SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932)
Differential Revision: https://reviews.llvm.org/D23594

llvm-svn: 278999
2016-08-17 22:50:18 +00:00
Justin Bogner cd1d5aaf2e Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

llvm-svn: 278970
2016-08-17 20:30:52 +00:00
Tim Shen eb3958fafd [GraphWriter] Change GraphWriter to use NodeRef in GraphTraits
Summary:
This is part of the "NodeType* -> NodeRef" migration. Notice that since
GraphWriter prints object address as identity, I added a static_assert on
NodeRef to be a pointer type.

Reviewers: dblaikie

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D23580

llvm-svn: 278966
2016-08-17 20:07:29 +00:00
Jonas Paulsson 7a79422536 [LoopStrenghtReduce] Refactoring and addition of a new target cost function.
Refactored so that a LSRUse owns its fixups, as oppsed to letting the
LSRInstance own them. This makes it easier to rate formulas for
LSRUses, since the fixups are available directly. The Offsets vector
has been removed since it was no longer necessary.

New target hook isFoldableMemAccessOffset(), which is used during formula
rating.

For SystemZ, this is useful to express that loads and stores with
float or vector types with a big/negative offset should be avoided in
loops. Without this, LSR will generate a lot of negative offsets that
would require extra instructions for loading the address.

Updated tests:
test/CodeGen/SystemZ/loop-01.ll

Reviewed by: Quentin Colombet and Ulrich Weigand.
https://reviews.llvm.org/D19152

llvm-svn: 278927
2016-08-17 13:24:19 +00:00
Justin Bogner b03fd12cef Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Duncan P. N. Exon Smith 6331dc171c ObjCARC: Don't increment or dereference end() when scanning args
When there's only one argument and it doesn't match one of the known
functions, return ARCInstKind::CallOrUser rather than falling through
to the two argument case.  The old behaviour both incremented past and
dereferenced end().

llvm-svn: 278881
2016-08-17 01:02:18 +00:00
Reid Kleckner b99b709068 Revert "Enhance SCEV to compute the trip count for some loops with unknown stride."
This reverts commit r278731. It caused http://crbug.com/638314

llvm-svn: 278853
2016-08-16 21:02:04 +00:00
David Majnemer 5c5df6283a [InstSimplify] Fold gep (gep V, C), (xor V, -1) to C-1
llvm-svn: 278779
2016-08-16 06:13:46 +00:00
Sanjoy Das 78db2963f6 Revert "[ValueTracking] Improve ValueTracking on left shift with nsw flag"
This reverts commit r278172.  It causes PR28946.

llvm-svn: 278740
2016-08-15 21:01:31 +00:00
David L Kreitzer 7fe18251a5 Enhance SCEV to compute the trip count for some loops with unknown stride.
Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 278731
2016-08-15 20:21:41 +00:00
David Majnemer 3b47a5a562 [ScopedNoAliasAA] collectMDInDomain should be a free function
collectMDInDomain doesn't use any class members, making it a free
function is not a functional change.

llvm-svn: 278651
2016-08-15 03:56:06 +00:00
David Majnemer 8b8869f8ef [ScopedNoAliasAA] Only collect noalias nodes if we have alias.scope nodes
No functional change is intended.

llvm-svn: 278646
2016-08-15 02:23:50 +00:00
David Majnemer ddc7ab26fc [ScopedNoAliasAA] Replace !ScopeNodes.size() with ScopeNodes.empty()
No functional change is intended.

llvm-svn: 278645
2016-08-15 02:23:48 +00:00
David Majnemer c77a1390de Revert "[ScopedNoAliasAA] Remove an unneccesary set"
This reverts commit r278641.  I'm not sure why but this has upset the
multistage builders...

llvm-svn: 278644
2016-08-15 02:23:46 +00:00
David Majnemer 5ec9c58f13 [ScopedNoAliasAA] Remove an unneccesary set
We are trying to prove that one group of operands is a subset of
another.  We did this by populating two Sets and determining that every
element within one was inside the other.

However, this is unnecessary.  We can simply construct a single set and
test if each operand is within it.

llvm-svn: 278641
2016-08-15 00:13:04 +00:00
Pete Cooper 35b00d5d9e Constify ValueTracking. NFC.
Almost all of the method here are only analysing Value's as opposed to
mutating them.  Mark all of the easy ones as const.

llvm-svn: 278585
2016-08-13 01:05:32 +00:00
Eugene Zelenko 3e3a057c20 Fix some Clang-tidy modernize-use-using and Include What You Use warnings.
Differential revision: https://reviews.llvm.org/D23478

llvm-svn: 278583
2016-08-13 00:50:41 +00:00
Ehsan Amiri 17e1701075 [BasicAA] Avoid calling GetUnderlyingObject, when the result of a previous call can be reused.
Recursive calls to aliasCheck from alias[GEP|Select|PHI] may result in a second call to GetUnderlyingObject for a Value, whose underlying object is already computed. This patch ensures that in this situations, the underlying object is not computed again, and the result of the previous call is resued.

https://reviews.llvm.org/D22305

llvm-svn: 278519
2016-08-12 16:05:03 +00:00
Artur Pilipenko 2e8f82d962 [LVI] Take guards into account
Teach LVI to gather control dependant constraints from guards.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23358

llvm-svn: 278518
2016-08-12 15:52:23 +00:00
Artur Pilipenko b623088abe [LVI] Fix potential memory corruption in getValueFromCondition
Rewrite Visited[Cond] = getValueFromConditionImpl(..., Visited) statement which can lead to a memory corruption since getValueFromConditionImpl changes Visited map and invalidates the iterators.

llvm-svn: 278514
2016-08-12 15:08:15 +00:00
Teresa Johnson f93b246f8b [PM] Port ModuleSummaryIndex analysis to new pass manager
Summary:
Port the ModuleSummaryAnalysisWrapperPass to the new pass manager.
Use it in the ported BitcodeWriterPass (similar to how we use the
legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass).

Also, pass the -module-summary opt flag through to the new pass
manager pipeline and through to the bitcode writer pass, and add
a test that uses it.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23439

llvm-svn: 278508
2016-08-12 13:53:02 +00:00
Artur Pilipenko 6669f253d5 [LVI] Take range metadata into account while calculating icmp condition constraints
Take range metadata into account for conditions like this:

%length = load i32, i32* %length_ptr, !range !{i32 0, i32 2147483647}
%cmp = icmp ult i32 %a, %length

This is a common pattern for range checks where the length of the array is dynamically loaded.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23267

llvm-svn: 278496
2016-08-12 10:14:11 +00:00
Artur Pilipenko 635625855f [LVI] Handle any predicate in comparisons like icmp <pred> (add Val, Offset), ...
Currently LVI can only gather value constraints from comparisons like:

* icmp <pred> Val, ...
* icmp ult (add Val, Offset), ...

In fact we can handle any predicate in latter comparisons.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23357

llvm-svn: 278493
2016-08-12 10:05:11 +00:00
David Majnemer 2d006e7673 Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278476
2016-08-12 04:32:42 +00:00
David Majnemer c700490f48 Use the range variant of remove_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278475
2016-08-12 04:32:37 +00:00
David Majnemer 42531260b3 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
Pete Cooper 54a0255679 Refactor isValidAssumeForContext to reduce duplication and indentation. NFC.
This method had some duplicate code when we did or did not have a dom tree.  Refactor
it to remove the duplication, but also clean up the control flow to have less duplication.

llvm-svn: 278450
2016-08-12 01:00:15 +00:00
Xinliang David Li 1ce88fa0a5 Add comment /NFC
llvm-svn: 278438
2016-08-11 23:09:56 +00:00
Pete Cooper fa7ae4f3b6 Remove unnecessary extra version of isValidAssumeForContext. NFC.
There were 2 versions of this method.  A public one which takes a
const Instruction* and a private implementation which takes a mutable
Value* and casts to an Instruction*.

There was no need for the 2 versions as all callers pass a const Instruction*
and there was no need for a mutable pointer as we only do analysis here.

llvm-svn: 278434
2016-08-11 22:23:07 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Geoff Berry d01828096f [SCEV] Update interface to handle SCEVExpander insert point motion.
Summary:
This is an extension of the fix in r271424.  That fix dealt with builder
insert points being moved by SCEV expansion, but only for the lifetime
of the expand call.  This change modifies the interface so that LSR can
safely call expand multiple times at the same insert point and do the
right thing if one of the expansions decides to move the original insert
point.

This is a fix for PR28719.

Reviewers: sanjoy

Subscribers: llvm-commits, mcrosier, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23342

llvm-svn: 278413
2016-08-11 21:05:17 +00:00
Michael Kuperstein ee900b62ef [AliasSetTracker] Delete dead code
Deletes unused remove() and containsPointer() interfaces. NFC.

Differential Revision: https://reviews.llvm.org/D23360

llvm-svn: 278365
2016-08-11 17:20:20 +00:00
Easwaran Raman 0d58fcac99 Make more fields of InlineParams Optional.
Differential revision: https://reviews.llvm.org/D23386

llvm-svn: 278312
2016-08-11 03:58:05 +00:00
Piotr Padlewski d89875ca39 Changed sign of LastCallToStaticBouns
Summary:
I think it is much better this way.
When I firstly saw line:
  Cost += InlineConstants::LastCallToStaticBonus;
I though that this is a bug, because everywhere where the cost is being reduced
it is usuing -=.

Reviewers: eraman, tejohnson, mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23222

llvm-svn: 278290
2016-08-10 21:15:22 +00:00
Andrew Kaylor b10f6876cd [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18777

llvm-svn: 278267
2016-08-10 18:47:19 +00:00
Artur Pilipenko fd223d5d25 [LVI] Handle conditions in the form of (cond1 && cond2)
Teach LVI how to gather information from conditions in the form of (cond1 && cond2). Our out-of-tree front-end emits range checks in this form.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23200

llvm-svn: 278231
2016-08-10 15:13:15 +00:00
Artur Pilipenko 933c07a4fb [LVI] NFC. Make getValueFromCondition return LVILatticeValue instead of changing reference argument
Instead of returning bool and setting LVILatticeValue reference argument return LVILattice value. Use overdefined value to denote the case when we didn't gather any information from the condition.

This change was separated from the review "[LVI] Handle conditions in the form of (cond1 && cond2)" (https://reviews.llvm.org/D23200#inline-199531). Once getValueFromCondition returns LVILatticeValue we can cache the result in Visited map.

llvm-svn: 278224
2016-08-10 13:38:07 +00:00
Artur Pilipenko a4b6a70a9c [LVI] Relax the assertion about LVILatticeVal type in getConstantRange
The problem was triggered by my recent change in CVP (D23059). Current code expected that integer constants are represented by constantrange LVILatticeVal and never represented as LVILatticeVal with constant tag. That is true for ConstantInt constants, although ConstantExpr integer type constants are legally represented as constant LVILatticeVal.

This code fails with CVP change in:

@b = global i32 0, align 4
define void @test6(i32 %a) {
bb:
  %add = add i32 %a, ptrtoint (i32* @b to i32)
  ret void
}
Currently getConstantRange code is not executed by any of the upstream passes. I'm going to add a test case to test/Transforms/CorrelatedValuePropagation/add.ll once I resubmit the CVP change.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23194

llvm-svn: 278217
2016-08-10 12:54:54 +00:00
Easwaran Raman 1c57cc2b68 Do not directly use inline threshold cl options in cost analysis.
This adds an InlineParams struct which is populated from the command line options by getInlineParams and passed to getInlineCost for the call analyzer to use.

Differential revision: https://reviews.llvm.org/D22120

llvm-svn: 278189
2016-08-10 00:48:04 +00:00
Adam Nemet 896c09bd10 [Inliner,OptDiag] Add hotness attribute to opt diagnostics
Summary:
The inliner not being a function pass requires the work-around of
generating the OptimizationRemarkEmitter and in turn BFI on demand.
This will go away after the new PM is ready.

BFI is only computed inside ORE if the user has requested hotness
information for optimization diagnostitics (-pass-remark-with-hotness at
the 'opt' level).  Thus there is no additional overhead without the
flag.

Reviewers: hfinkel, davidxl, eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22694

llvm-svn: 278185
2016-08-10 00:44:44 +00:00
Andrew Kaylor 3c05edfd5e [ValueTracking] Improve ValueTracking on left shift with nsw flag
Patch by Li Huang

Differential Revison: https://reviews.llvm.org/D23296

llvm-svn: 278172
2016-08-09 22:41:35 +00:00
Wei Mi 575435012c Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.

Differential Revision: https://reviews.llvm.org/D22942

llvm-svn: 278161
2016-08-09 20:40:03 +00:00
Wei Mi 785858cf6c Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The fix for PR28705 will be committed consecutively.

In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 278160
2016-08-09 20:37:50 +00:00
Anna Thomas 037e540f08 [AliasAnalysis] Treat invariant.start as read-memory
Summary:
We teach alias analysis that invariant.start is readonly.
This helps with GVN and memcopy optimizations that currently treat.
invariant.start as a clobber.
We need to treat this as readonly, so that DSE does not incorrectly
remove stores prior to the invariant.start

Reviewers: sanjoy, reames, majnemer, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23214

llvm-svn: 278138
2016-08-09 17:18:05 +00:00
Artur Pilipenko c710a461b5 [LVI] Make LVI smarter about comparisons with non-constants
Make LVI smarter about comparisons with a non-constant. For example, a s< b constraints a to be in [INT_MIN, INT_MAX) range. This is a part of https://llvm.org/bugs/show_bug.cgi?id=28620 fix.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23205

llvm-svn: 278122
2016-08-09 14:50:08 +00:00
Artur Pilipenko d97eedff40 Revert 278107 which causes buildbot failures and in addition has wrong commit message
llvm-svn: 278109
2016-08-09 10:00:22 +00:00
Artur Pilipenko a410d81f64 Teach CorrelatedValuePropagation to mark adds as no wrap
Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23059

llvm-svn: 278107
2016-08-09 09:41:34 +00:00
Artur Pilipenko adcd01f6cd [LVI] NFC. Fix a typo Bofore -> Before
llvm-svn: 278105
2016-08-09 09:14:29 +00:00
Sean Silva 0746f3bfa4 Consistently use LoopAnalysisManager
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).

Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.

Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278079
2016-08-09 00:28:52 +00:00
Sean Silva fd03ac6a0c Consistently use ModuleAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278078
2016-08-09 00:28:38 +00:00
Sean Silva 36e0d01e13 Consistently use FunctionAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278077
2016-08-09 00:28:15 +00:00
Charles Davis e9c32c7ed3 Revert "[X86] Support the "ms-hotpatch" attribute."
This reverts commit r278048. Something changed between the last time I
built this--it takes awhile on my ridiculously slow and ancient
computer--and now that broke this.

llvm-svn: 278053
2016-08-08 21:20:15 +00:00
Charles Davis 0822aa118e [X86] Support the "ms-hotpatch" attribute.
Summary:
Based on two patches by Michael Mueller.

This is a target attribute that causes a function marked with it to be
emitted as "hotpatchable". This particular mechanism was originally
devised by Microsoft for patching their binaries (which they are
constantly updating to stay ahead of crackers, script kiddies, and other
ne'er-do-wells on the Internet), but is now commonly abused by Windows
programs to hook API functions.

This mechanism is target-specific. For x86, a two-byte no-op instruction
is emitted at the function's entry point; the entry point must be
immediately preceded by 64 (32-bit) or 128 (64-bit) bytes of padding.
This padding is where the patch code is written. The two byte no-op is
then overwritten with a short jump into this code. The no-op is usually
a `movl %edi, %edi` instruction; this is used as a magic value
indicating that this is a hotpatchable function.

Reviewers: majnemer, sanjoy, rnk

Subscribers: dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D19908

llvm-svn: 278048
2016-08-08 21:01:39 +00:00
Mehdi Amini c137c28c8b RefreshCallGraph does not modify the SCC, adding "const" to make it clear (NFC)
llvm-svn: 278037
2016-08-08 18:51:05 +00:00
Artur Pilipenko eed618d5c0 [LVI] NFC. On the fast dest path use inverse predicate instead of inverse range result
Gathering constantins from a condition on the false path ask makeAllowedICmpRegion about inverse predicate instead of inversing the resulting range.

This change was separated from the review "[LVI] Make LVI smarter about comparisons with non-constants" (https://reviews.llvm.org/D23205#inline-198361)

llvm-svn: 278009
2016-08-08 14:33:11 +00:00
Artur Pilipenko 54b50cc1a8 [LVI] NFC. Rename confusing local NegOffset to Offset
NegOffset is not necessarily negative

llvm-svn: 278008
2016-08-08 14:13:56 +00:00
Artur Pilipenko 21472910c1 [LVI] NFC. Extract LHS, RHS, Predicate locals in getValueFromCondition
llvm-svn: 278007
2016-08-08 14:08:37 +00:00
Eli Friedman 02419a9849 [JumpThreading] Fix handling of aliasing metadata.
Summary:
The correctness fix here is that when we CSE a load with another load,
we need to combine the metadata on the two loads. This matches the
behavior of other passes, like instcombine and GVN.

There's also a minor optimization improvement here: for load PRE, the
aliasing metadata on the inserted load should be the same as the
metadata on the original load. Not sure why the old code was throwing
it away.

Issue found by inspection.

Differential Revision: http://reviews.llvm.org/D21460

llvm-svn: 277977
2016-08-08 04:10:22 +00:00
David Majnemer d150137f64 [InstSimplify] Fold gep (gep V, C), (sub 0, V) to C
llvm-svn: 277952
2016-08-07 07:58:12 +00:00
David Majnemer dc8767a49a [InstSimplify] Try hard to simplify pointer comparisons
Simplify ptrtoint comparisons involving operands with different source
types.

llvm-svn: 277951
2016-08-07 07:58:10 +00:00
David Majnemer a19d0f2f3e [ValueTracking] Teach computeKnownBits about [su]min/max
Reasoning about a select in terms of a min or max allows us to derive a
tigher bound on the result.

llvm-svn: 277914
2016-08-06 08:16:00 +00:00
David Majnemer 1665d8635e [CallGraphSCCPass] Use an ArrayRef instead of a pair of iterators
No functional change is intended.

llvm-svn: 277913
2016-08-06 06:21:02 +00:00
Dehao Chen e1c7c57d11 Remove cold callsite heuristic that is not necessary because of cold callee heuristic.
llvm-svn: 277863
2016-08-05 20:49:04 +00:00
Dehao Chen de39cb9384 Replace hot-callsite based heuristic to use its own threshold parameter instead of share inline-hint parameter
Summary: Hot callsites should have higher threshold than inline hints. This patch uses separate threshold parameter for hot callsites.

Reviewers: davidxl, eraman

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D22368

llvm-svn: 277860
2016-08-05 20:28:41 +00:00
Sanjoy Das 6fa08aafcc [ConstantFolding] Don't create illegal (non-integral) inttoptrs
Reviewers: majnemer, arsenm

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D23182

llvm-svn: 277854
2016-08-05 19:23:29 +00:00
Sanjoy Das b0b4e86215 [SCEV] Don't infinitely recurse on unreachable code
llvm-svn: 277848
2016-08-05 18:34:14 +00:00
Michael Kuperstein 3ceac2bbd5 [LV, X86] Be more optimistic about vectorizing shifts.
Shifts with a uniform but non-constant count were considered very expensive to
vectorize, because the splat of the uniform count and the shift would tend to
appear in different blocks. That made the splat invisible to ISel, and we'd
scalarize the shift at codegen time.

Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
are able to select the appropriate vector shifts. This updates the cost model to
to take this into account by making shifts by a uniform cheap again.

Differential Revision: https://reviews.llvm.org/D23049

llvm-svn: 277782
2016-08-04 22:48:03 +00:00
Sanjay Patel bcaf6f39dd [InstCombine] use m_APInt to allow icmp eq (op X, Y), C folds for splat constant vectors
I'm removing a misplaced pair of more specific folds from InstCombine in this patch as well,
so we know where those folds are happening in InstSimplify.

llvm-svn: 277738
2016-08-04 17:48:04 +00:00
Alina Sbirlea 6f937b1144 LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments.
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.

Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:
- x86 failing tests either did not have the right target, or the right alignment.
- NVPTX failing tests did not have the right alignment.
- AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info
  considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks
  for a maximum size allowed and relies on the next condition checking for %4 for correctness.
  This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list).

Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure),
so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.

Reviewers: arsenm, jlebar, tstellarAMD

Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23068

llvm-svn: 277735
2016-08-04 16:38:44 +00:00
David Majnemer 909793fa63 Reinstate "[CloneFunction] Don't remove side effecting calls"
This reinstates r277611 + r277614 and reverts r277642.  A cast_or_null
should have been a dyn_cast_or_null.

llvm-svn: 277691
2016-08-04 04:24:02 +00:00
Reid Kleckner a6be60871f Revert "[CloneFunction] Don't remove side effecting calls"
This reverts commit r277611 and the followup r277614.

Bootstrap builds and chromium builds are crashing during inlining after
this change.

llvm-svn: 277642
2016-08-03 20:01:01 +00:00
Sebastian Pop 031b1bc06f Pass EphValues by const-ref as it is not modified in the callee
Patch by Aditya Kumar.

Differential Revision: https://reviews.llvm.org/D22967

llvm-svn: 277634
2016-08-03 19:13:50 +00:00
David Majnemer fad0490869 [CloneFunction] Don't remove side effecting calls
We were able to figure out that the result of a call is some constant.
While propagating that fact, we added the constant to the value map.
This is problematic because it results in us losing the call site when
processing the value map.

This fixes PR28802.

llvm-svn: 277611
2016-08-03 17:12:47 +00:00
George Burgess IV 777efb1620 [CFLAA] Be more conservative with values we haven't seen.
There were issues with simply reporting AttrUnknown on
previously-unknown values in CFLAnders. So, we now act *entirely*
conservatively for values we haven't seen before. As in the prior patch
(r277362), writing a lit test for this isn't exactly trivial. If someone
wants a test badly, I'm willing to try to write one.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D23077

llvm-svn: 277533
2016-08-02 22:17:25 +00:00
Artur Pilipenko 2e19f59304 [LVI] NFC. Sink a condition type check from the caller down to getValueFromCondition
This is a preparatory refactoring to support conditions other than ICmpInst.

llvm-svn: 277479
2016-08-02 16:20:48 +00:00
Artur Pilipenko 2a8f96f5bc [LVI] NFC. Fix a typo getValueFromFromCondition -> getValueFromCondition
llvm-svn: 277466
2016-08-02 14:44:32 +00:00
Sean Silva f801575fd0 CodeExtractor : Add ability to preserve profile data.
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22744

llvm-svn: 277411
2016-08-02 02:15:45 +00:00
Tim Shen b44909eccb [ADT] NFC: Generalize GraphTraits requirement of "NodeType *" in interfaces to "NodeRef", and migrate SCCIterator.h to use NodeRef
Summary: By generalize the interface, users are able to inject more flexible Node token into the algorithm, for example, a pair of vector<Node>* and index integer. Currently I only migrated SCCIterator to use NodeRef, but more is coming. It's a NFC.

Reviewers: dblaikie, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22937

llvm-svn: 277399
2016-08-01 22:32:20 +00:00
George Burgess IV 5f0e76dca6 [CFLAA] Remove modref queries from CFLAA.
As it turns out, modref queries are broken with CFLAA. Specifically,
the data source we were using for determining modref behaviors
explicitly ignores operations on non-pointer values. So, it wouldn't
note e.g. storing an i32 to an i32* (or loading an i64 from an i64*).
It also ignores external function calls, rather than acting
conservatively for them.

(N.B. These operations, where necessary, *are* tracked by CFLAA; we just
use a different mechanism to do so. Said mechanism is relatively
imprecise, so it's unlikely that we can provide reasonably good modref
answers with it as implemented.)

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22978

llvm-svn: 277366
2016-08-01 18:47:28 +00:00
George Burgess IV 4c58266038 [CFLAA] Make CFLAnders more conservative with new Values.
Currently, CFLAnders assumes that values it hasn't seen don't alias
anything. This patch fixes that. Given that the only way for this to
happen is to query AA, rely on specific transformations happening, then
query AA again (looking for a specific set of queries), lit testing is a
bit difficult. If someone really wants a test, I'm happy to add one.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22981

llvm-svn: 277362
2016-08-01 18:27:33 +00:00
Sean Silva 423c7149dc Revert r277313 and r277314.
They seem to trigger an LSan failure:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/15140/steps/check-llvm%20asan/logs/stdio

Revert "Add the tests for r277313"

This reverts commit r277314.

Revert "CodeExtractor : Add ability to preserve profile data."

This reverts commit r277313.

llvm-svn: 277317
2016-08-01 04:16:09 +00:00
Sean Silva 6208924323 CodeExtractor : Add ability to preserve profile data.
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22744

llvm-svn: 277313
2016-08-01 02:59:26 +00:00
David Majnemer 718da3d1f6 [ConstantFolding] Handle bitcasts of undef fp vector elements
We used the wrong type for constructing a zero vector element which led
to type mismatches.

This fixes PR28771.

llvm-svn: 277197
2016-07-29 18:48:27 +00:00
Andrew Kaylor b99d1cc7ed Recommitting r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Third try, now following fixes to MSan to handle mempcy in such a way that this commit won't break the MSan buildbots. (Thanks, Evegenii!)

llvm-svn: 277189
2016-07-29 18:23:18 +00:00
Matt Masten a6669a1e05 Initial support for vectorization using svml (short vector math library).
Differential Revision: https://reviews.llvm.org/D19544

llvm-svn: 277166
2016-07-29 16:42:44 +00:00
David Majnemer e4218cf11e [ConstantFolding] Fold bitcasts of vectors w/ undef elements
An undef vector element can be treated as if it had any value.  Folding
such a vector element to 0 in a bitcast can open up further folding
opportunities.

llvm-svn: 277104
2016-07-29 04:06:09 +00:00
David Majnemer a926b3e71b [ConstantFolding] Remove an unused ConstantFoldInstOperands overload
No functional change is intended.

llvm-svn: 277101
2016-07-29 03:27:33 +00:00
David Majnemer 57b94c8d6a [ConstantFolding] Use ConstantExpr::getWithOperands
ConstantExpr::getWithOperands does much of the hard work that
ConstantFoldInstOperandsImpl tries to do but more completely.

This lets us fold ExtractValue/InsertValue expressions.

llvm-svn: 277100
2016-07-29 03:27:31 +00:00
David Majnemer d536f2328e [ConstnatFolding] Teach the folder how to fold ConstantVector
A ConstantVector can have ConstantExpr operands and vice versa.
However, the folder had no ability to fold ConstantVectors which, in
some cases, was an optimization barrier.

Instead, rephrase the folder in terms of Constants instead of
ConstantExprs and teach callers how to deal with failure.

llvm-svn: 277099
2016-07-29 03:27:26 +00:00
George Burgess IV 0a9cbd4743 [CFLAA] Check for pointer types in more places.
This patch fixes an assertion that fires when we try to add non-pointer
Values to the CFLGraph. Centralizing the check for whether something
is/isn't a pointer type isn't completely trivial (and, in some cases,
would end up being entirely redundant), but it may be beneficial to do
so if this trips us up more in the future.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22947

llvm-svn: 277096
2016-07-29 01:23:45 +00:00
Adam Nemet aa3506c5f0 [BPI] Add new LazyBPI analysis
Summary:
The motivation is the same as in D22141: In order to add the hotness
attribute to optimization remarks we need BFI to be available in all
passes that emit optimization remarks.  BFI depends on BPI so unless we
make this lazy as well we would still compute BPI unconditionally.

The solution is to use the new LazyBPI pass in LazyBFI and only compute
BPI when computation of BFI is requested by the client.

I extended the laziness test using a LoopDistribute test to also cover
BPI.

Reviewers: hfinkel, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22835

llvm-svn: 277083
2016-07-28 23:31:12 +00:00
David Majnemer 19d024b2fd [ConstantFolding] Don't bail on folding if ConstantFoldConstantExpression fails
When folding an expression, we run ConstantFoldConstantExpression on
each operand of that expression.
However, ConstantFoldConstantExpression can fail and retur nullptr.

Previously, we would bail on further refining the expression.
Instead, use the original operand and see if we can refine a later
operand.

llvm-svn: 276959
2016-07-28 06:39:48 +00:00
George Burgess IV dbd35c44d4 [CFLAA] Add getModRefBehavior to CFLAnders.
This patch lets CFLAnders respond to mod-ref queries. It also includes
a small bugfix to CFLSteens.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22823

llvm-svn: 276939
2016-07-27 23:07:07 +00:00
Justin Lebar 58b377e87d [LVI] Use DenseMap::find_as in LazyValueInfo.
Summary:
This lets us avoid creating and destroying a CallbackVH every time we
check the cache.

This is good for a 2% e2e speedup when compiling one of the large Eigen
tests at -O3.

FTR, I tried making the ValueCache hashtable one-level -- i.e., mapping
a pair (Value*, BasicBlock*) to a lattice value, and that didn't seem to
provide any additional improvement.  Saving a word in LVILatticeVal by
merging the Tag and Val fields also didn't yield a speedup.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21951

llvm-svn: 276926
2016-07-27 22:33:36 +00:00
Matt Masten f5f7d1e878 test commit
llvm-svn: 276911
2016-07-27 20:22:21 +00:00
Sebastian Pop 18c964d7a4 add a verbose mode to Loop->print() to print all the basic blocks of a loop
Differential Revision: https://reviews.llvm.org/D22817

llvm-svn: 276838
2016-07-27 05:02:17 +00:00
David Majnemer bc36b15253 [ConstantFolding] Correctly handle failures in ConstantFoldConstantExpressionImpl
Failures in ConstantFoldConstantExpressionImpl were ignored causing
crashes down the line.

This fixes PR28725.

llvm-svn: 276827
2016-07-27 02:39:16 +00:00
Andrew Kaylor f990fa5f7b Reverting r276771 due to MSan failures.
llvm-svn: 276824
2016-07-27 01:19:24 +00:00
Hans Wennborg 685e8ff953 Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion."
It causes Clang tests to fail after Windows self-host (PR28705).

(Also reverts follow-up r276139.)

llvm-svn: 276822
2016-07-26 23:25:13 +00:00
David Majnemer 6774d612d4 [InstSimplify] Cast folding can be made more generic
Use isEliminableCastPair to determine if a pair of casts are foldable.

llvm-svn: 276777
2016-07-26 17:58:05 +00:00
Andrew Kaylor 3104a6bad0 Re-committing r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 276771
2016-07-26 17:23:13 +00:00
David Majnemer a90a621d1e Reapply: [InstSimplify] Add support for bitcasts"
This reverts commit r276700 and reapplies r276698.
The relevant clang tests have been updated.

llvm-svn: 276727
2016-07-26 05:52:29 +00:00
David Majnemer 6e06b577cc Revert "[InstSimplify] Add support for bitcasts"
This reverts commit r276698.  Clang has tests which rely on the
optimizer :(

llvm-svn: 276700
2016-07-25 22:24:59 +00:00
David Majnemer 62611fd3f7 [InstSimplify] Add support for bitcasts
BitCasts of BitCasts can be folded away as can BitCasts which don't
change the type of the operand.

llvm-svn: 276698
2016-07-25 22:04:58 +00:00
David Majnemer 126de5d4b4 [InstSimplify] Fold trunc([zs]ext(%V)) -> %V
Truncates can completely cancel out a zext or sext instruction.

llvm-svn: 276604
2016-07-25 03:39:21 +00:00
NAKAMURA Takumi bd072a9220 Trailing whitespace.
llvm-svn: 276596
2016-07-25 00:59:46 +00:00
Sean Silva ab6a683765 Avoid using a raw AssumptionCacheTracker in various inliner functions.
This unblocks the new PM part of River's patch in
https://reviews.llvm.org/D22706

Conveniently, this same change was needed for D21921 and so these
changes are just spun out from there.

llvm-svn: 276515
2016-07-23 04:22:50 +00:00
David Majnemer 796331c026 [LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad
While we handed loads past the end of an array, we didn't handle loads
_before_ the array.

This fixes PR28062.

N.B. While the bug in the code is obvious, I am struggling to craft a
test case which is reasonable in size.

llvm-svn: 276510
2016-07-23 02:56:49 +00:00
Sanjoy Das a7d9ec8751 [SCEV] Make isImpliedCondOperandsViaRanges smarter
This change lets us prove things like

  "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow"

It does this by replacing replacing getConstantDifference by
computeConstantDifference (which is smarter) in
isImpliedCondOperandsViaRanges.

llvm-svn: 276505
2016-07-23 00:54:36 +00:00
Sanjoy Das 0b1af85cc2 [SCEV] Change the interface of computeConstantDifference; NFC
This is in preparation of
s/getConstantDifference/computeConstantDifference/ in a later change.

llvm-svn: 276503
2016-07-23 00:28:56 +00:00
George Burgess IV 4ec1753ff4 [CFLAA] Add more offset-sensitivity tracking.
This patch teaches FunctionInfo about offsets.

Like the last patch, this one doesn't introduce any visible
functionality change (the core algorithm knows nothing about offsets;
they're just plumbed through). Tests will come when we start acting
differently because of the offsets.

Patch by Jia Chen.

(N.B. I made a tiny change to Jia's patch to avoid warnings by GCC: I
put DenseMapInfo specializations in the `llvm` namespace. Only realized
that those appeared when compiling locally. :) )

Differential Revision: https://reviews.llvm.org/D22634

llvm-svn: 276486
2016-07-22 22:30:48 +00:00
Sanjoy Das 095f5b204f [SCEV] Extract out a helper function; NFC
The helper will get smarter in a later change, but right now this is
just code reorganization.

llvm-svn: 276467
2016-07-22 20:47:55 +00:00
Reid Kleckner db10b5009d Use INT64_MAX instead of LLONG_MAX
llvm-svn: 276419
2016-07-22 14:11:58 +00:00
Sanjay Patel e9fc79bb13 [InstSimplify] don't crash handling a pointer or aggregate type
llvm-svn: 276345
2016-07-21 21:56:00 +00:00
Sanjay Patel a3bfb4e313 [InstSimplify] recognize trunc + icmp sgt/slt variants of select simplifications (PR28466)
rL245171 exposed a hole in InstSimplify that manifested in a strange way in PR28466:
https://llvm.org/bugs/show_bug.cgi?id=28466

It's possible to use trunc + icmp sgt/slt in place of an and + icmp eq/ne, so we need to
recognize that pattern to eliminate selects that are choosing between some value and some
bitmasked version of that value.

Note that there is significant room for improvement (refactoring) and enhancement (more
patterns, possibly in InstCombine rather than here).

Differential Revision: https://reviews.llvm.org/D22537

llvm-svn: 276341
2016-07-21 21:26:45 +00:00
Reid Kleckner a4de84604a Fix the clang-cl self-host with VS 2013 headers
std::numeric_limits<int64_t>::max() is not constexpr in VC 2013 headers,
and Clang complains that it isn't. MSVC 2013 itself is emitting a
dynamic initializer for this thing. Instead, use an enum.

llvm-svn: 276334
2016-07-21 21:06:04 +00:00
George Burgess IV 825a868796 Normalize file docs. NFC.
Having the added `\brief` made doxygen interpret it as the summary for
the `llvm` namespace (visible at:
http://llvm.org/doxygen/namespaces.html).

llvm-svn: 276331
2016-07-21 20:52:35 +00:00
Benjamin Kramer a9e477b286 [DemandedBits] Reduce number of duplicated DenseMap lookups.
No functionality change intended.

llvm-svn: 276278
2016-07-21 13:37:55 +00:00
Adam Nemet cbe2a9b213 [OptDiag] Missed these when making the IR Value a const pointer
llvm-svn: 276224
2016-07-21 01:11:12 +00:00
Adam Nemet 7cfd5971ab [OptDiag,LV] Add hotness attribute to applied-optimization remarks
Test coverage is provided by modifying the function in the FP-math
testcase that we are allowed to vectorize.

llvm-svn: 276223
2016-07-21 01:07:13 +00:00
Adam Nemet 0e0e2d5d26 [OptDiag,LV] Add hotness attribute to the derived analysis remarks
This includes FPCompute and Aliasing.

Testcase is based on no_fpmath.ll.

llvm-svn: 276211
2016-07-20 23:50:32 +00:00
Sanjay Patel 5f3c70307d [InstSimplify][InstCombine] don't crash when folding vector selects of icmp
Differential Revision: https://reviews.llvm.org/D22602

llvm-svn: 276209
2016-07-20 23:40:01 +00:00
George Burgess IV 6febd834b6 [CFLAA] Add offset tracking in CFLGraph.
(Also, refactor our constexpr handling to be less insane).

This patch lets us track field offsets in the CFL Graph, which is the
first step to making CFLAA field/offset sensitive. Woohoo! Note that
this patch shouldn't visibly change our behavior (since we make no use
of the offsets we're now tracking), so we can't quite add tests for this
yet.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22598

llvm-svn: 276201
2016-07-20 22:53:30 +00:00
Adam Nemet 5b3a5cf6b0 [OptDiag,LV] Add hotness attribute to analysis remarks
The earlier change added hotness attribute to missed-optimization
remarks.  This follows up with the analysis remarks (the ones explaining
the reason for the missed optimization).

llvm-svn: 276192
2016-07-20 21:44:26 +00:00
Adam Nemet 6100d16e7d [OptDiag] Take the IR Value as a const pointer
This helps because LoopAccessReport is passed around as a const
reference and we derive the basic block passed as the Value parameter
from the instruction in LoopAccessReport.

llvm-svn: 276191
2016-07-20 21:44:22 +00:00
Adam Nemet 78d1091c91 [OptDiag] Wrap a long line
llvm-svn: 276190
2016-07-20 21:44:18 +00:00
Wei Mi db80c0c77f Use ValueOffsetPair to enhance value reuse during SCEV expansion.
In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 276136
2016-07-20 16:40:33 +00:00
Sean Silva e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
George Burgess IV 22a0f1a0b9 Attempt to appease MSVC buildbots.
Broken by r276026.

llvm-svn: 276032
2016-07-19 21:35:47 +00:00
George Burgess IV 3b059841ff [CFLAA] Add some interproc. analysis to CFLAnders.
This patch adds function summary support to CFLAnders. It also comes
with a lot of tests! Woohoo!

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22450

llvm-svn: 276026
2016-07-19 20:47:15 +00:00
George Burgess IV c01b42faa5 [CFLAA] Teach CFLAnders to distinguish reads from writes.
This patch adds more specific edges to CFLAndersAliasAnalysis. The goal
of these edges is to give us more information about *how* two values
that MayAlias alias. With this, we can now tell cases like

a = b; // ergo, a may alias b

apart from

a = c;
b = c;

// so, a may alias b, but only because they were both assigned to c.

...And others.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22429

llvm-svn: 276023
2016-07-19 20:38:21 +00:00
David Majnemer f29b7bafb6 [RegionPass] Some minor cleanups
No functional change is intended.

llvm-svn: 276000
2016-07-19 17:50:27 +00:00
David Majnemer 1a4576e79d [LoopPass] Some minor cleanups
No functional change is intended.

llvm-svn: 275999
2016-07-19 17:50:24 +00:00
Simon Pilgrim 0ea8d275cc [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

This patch changes both scalar and packed versions back to using x86-specific builtins.

It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.

A companion clang patch is at D22105

Differential Revision: https://reviews.llvm.org/D22106

llvm-svn: 275981
2016-07-19 15:07:43 +00:00
Sanjay Patel 5f5eb58eb5 refactor SimplifySelectInst; NFCI
llvm-svn: 275911
2016-07-18 20:56:53 +00:00
Adam Nemet 79ac42a5c9 [OptRemarkEmitter] Port to new PM
Summary:
The main goal is to able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer.  Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.

This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.

Reviewers: davidxl, silvas

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22436

llvm-svn: 275810
2016-07-18 16:29:21 +00:00
Teresa Johnson cd21a646f6 [ThinLTO] Perform profile-guided indirect call promotion
Summary:
To enable profile-guided indirect call promotion in ThinLTO mode, we
simply add call graph edges for each profitable target from the profile
to the summaries, then the summary-guided importing will consider the
callee for importing as usual.

Also we need to enable the indirect call promotion pass creation in the
PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO
backend), so that the newly imported functions are considered for
promotion in the backends.

The IC promotion profiles refer to callees by GUID, which required
adding GUIDs to the per-module VST in bitcode (and assigning them
valueIds similar to how they are assigned valueIds in the combined
index).

Reviewers: mehdi_amini, xur

Subscribers: mehdi_amini, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21932

llvm-svn: 275707
2016-07-17 14:47:01 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
George Burgess IV 22682e293b [CFLAA] Add attributes handling for CFLAnders.
This patch adds proper handling of stratified attributes into our
anders-style CFLAA implementation. It also comes bundled with more
CFLAnders tests. :)

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22325

llvm-svn: 275604
2016-07-15 20:02:49 +00:00
George Burgess IV 6d30aa03a0 [CFLAA] Add an initial CFLAnders implementation.
This adds an incomplete anders-style implementation for CFLAA. It's
incomplete in that it's missing interprocedural analysis, attrs
handling, etc. and that it needs more tests. More tests and features
will be added in future commits.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22291

llvm-svn: 275602
2016-07-15 19:53:25 +00:00
Adam Nemet aad816083e [OptRemark,LDist] RFC: Add hotness attribute
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334

This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution.  My goal is to shake out the design issues before scaling
it up to other types and passes.

Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count.  It's only printed in opt
currently since clang prints the diagnostic fields directly.  E.g.:

  remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)

A new API added is similar to emitOptimizationRemarkMissed.  The
difference is that it additionally takes a code region that the
diagnostic corresponds to.  From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI.  (Thanks to Hal for the analysis pass idea.)

This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context.  If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.

A new command-line option is added to turn this on in opt.

My plan is to switch all user of emitOptimizationRemark* to use this
module instead.

Reviewers: hfinkel

Subscribers: rcox2, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21771

llvm-svn: 275583
2016-07-15 17:23:20 +00:00
David Majnemer a940f360cb [AliasAnalysis] Give back AA results for fence instructions
Calling getModRefInfo with a fence resulted in crashes because fences
don't have a memory location.  Add a new predicate to Instruction
called isFenceLike which indicates that the instruction mutates memory
but not any single memory location in particular. In practice, it is a
proxy for the set of instructions which "mayWriteToMemory" but cannot be
used with MemoryLocation::get.

This fixes PR28570.

llvm-svn: 275581
2016-07-15 17:19:24 +00:00
Igor Laevsky ee40d1e8da Re-submit r272891 "Prevent dangling pointer problems in BranchProbabilityInfo"
Most possibly problem was caused by the same reason as PR28400. This change
bypasses it by using CallbackVH instead of AssertingVH.

Differential Revision: https://reviews.llvm.org/D20957

llvm-svn: 275563
2016-07-15 14:31:16 +00:00
Sanjoy Das b66374c634 [ValueTracking] Use Instruction::getFunction; NFC
llvm-svn: 275465
2016-07-14 20:19:01 +00:00
Tom Stellard 1b5cf6217e GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals
Summary:
In preparation for changing GlobalsAA to stop assuming that intrinsics
can't read arbitrary globals, we need to make sure GlobalsAA is querying
function attributes rather than relying on this assumption.

This patch was inspired by: http://reviews.llvm.org/D20206

Reviewers: jmolloy, hfinkel

Subscribers: eli.friedman, llvm-commits

Differential Revision: https://reviews.llvm.org/D21318

llvm-svn: 275433
2016-07-14 15:50:27 +00:00
Sjoerd Meijer 38c2cd0c14 This implements a more optimal algorithm for selecting a base constant in
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because less
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.

Differential Revision: http://reviews.llvm.org/D21183

llvm-svn: 275382
2016-07-14 07:44:20 +00:00
David Majnemer 17a95aaa7b Simplify llvm.masked.load w/ undef masks
We can always pick the passthru value if the mask is undef: we are
permitted to treat the mask as-if it were filled with zeros.

llvm-svn: 275379
2016-07-14 06:58:37 +00:00
David Majnemer 7f781aba97 [ConstantFolding] Fold masked loads
We can constant fold a masked load if the operands are appropriately
constant.

Differential Revision: http://reviews.llvm.org/D22324

llvm-svn: 275352
2016-07-14 00:29:50 +00:00
David Majnemer f89660aba7 [ConstantFolding] Extend FoldReinterpretLoadFromConstPtr to handle negative offsets
Treat loads which clip before the start of a global initializer the same
way we treat clipping beyond the end of the initializer: use zeros.

llvm-svn: 275345
2016-07-13 23:33:07 +00:00
David Majnemer d77a3b61eb Move a transform from InstCombine to InstSimplify.
This transform doesn't require any new instructions, it can safely live
in InstSimplify.

llvm-svn: 275344
2016-07-13 23:32:53 +00:00
Adam Nemet 7da74abf3d [LAA] Don't hold on to DominatorTree in the analysis result
llvm-svn: 275335
2016-07-13 22:36:35 +00:00
Adam Nemet b49d9a56eb [LAA] Don't hold on to TargetLibraryInfo in the analysis result
llvm-svn: 275334
2016-07-13 22:36:27 +00:00
Adam Nemet 1824e411c6 [LAA] Don't hold on to DataLayout in the analysis result
In fact, don't even pass this to the ctor since we can get it from the
module.

llvm-svn: 275326
2016-07-13 22:18:51 +00:00
Adam Nemet 6616ad08f6 [LAA] Don't hold on to LoopInfo in the analysis result
llvm-svn: 275325
2016-07-13 22:18:48 +00:00
Adam Nemet 1556357677 [LAA] Don't hold on to AliasAnalysis in the analysis result
llvm-svn: 275322
2016-07-13 21:39:09 +00:00
Andrew Kaylor 346dd7f1bd Reverting r275284 due to platform-specific test failures
llvm-svn: 275304
2016-07-13 19:09:16 +00:00
Andrew Kaylor 12cccdd731 Fix for Bug 26903, adds support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 275284
2016-07-13 17:25:11 +00:00
David Majnemer 4cff2f8d49 [ConstantFolding] Use sdiv_ov
This is a simplification, there should be no functional change.

llvm-svn: 275273
2016-07-13 15:53:46 +00:00
David Majnemer 1b3db33e3d [ConstantFolding] Don't treat negative GEP offsets as positive
GEP offsets are signed, don't treat them as huge positive numbers.

llvm-svn: 275251
2016-07-13 05:16:16 +00:00
Adam Nemet c2f791d8a7 [BFI] Add new LazyBFI analysis pass
Summary:
This is necessary for D21771.  In order to add the hotness attribute to
optimization remarks we need BFI to be available in all passes that emit
optimization remarks.

However we don't want to pay for computing BFI unless the hotness
attribute is requested.

This is achieved by making BFI lazy at the very high-level through a new
analysis pass -- BFI is not calculated unless requested.

I am adding a test to check the laziness under D21771 where the first
user of the analysis is added.

Reviewers: hfinkel, dexonsmith, davidxl

Subscribers: davidxl, dexonsmith, llvm-commits

Differential Revision: http://reviews.llvm.org/D22141

llvm-svn: 275250
2016-07-13 05:01:48 +00:00
David Majnemer 90a9704a41 [ConstantFolding] Cleanups
No functional change is intended, just a minor cleanup.

llvm-svn: 275249
2016-07-13 04:22:12 +00:00
David Majnemer 17bdf445e4 [IR] Make getIndexedOffsetInType return a signed result
A GEPed offset can go negative, the result of getIndexedOffsetInType
should according be a signed type.

llvm-svn: 275246
2016-07-13 03:42:38 +00:00
Keno Fischer 1efc3b70c5 Fix ScalarEvolutionExpander step scaling bug
The expandAddRecExprLiterally function incorrectly transforms
`[Start + Step * X]` into `Step * [Start + X]` instead of the correct
transform of `[Step * X] + Start`.

This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219
due to what appeared to be sufficiently complicated loop interactions.

Patch by Jameson Nash (jameson@juliacomputing.com).

Reviewers: sanjoy
Differential Revision: http://reviews.llvm.org/D16505

llvm-svn: 275239
2016-07-13 01:28:12 +00:00
Teresa Johnson 835df56cb3 Remove another unused variable from r275216
Remove another variable added in r275216 that was only used in debug
mode.

llvm-svn: 275238
2016-07-12 23:49:17 +00:00
Teresa Johnson 1e44b5d3ab Refactor indirect call promotion profitability analysis (NFC)
Summary:
Refactored the profitability analysis out of the IC promotion pass and
into lib/Analysis so that it can be accessed by the summary index
builder in a follow-on patch to enable IC promotion in ThinLTO (D21932).

Reviewers: davidxl, xur

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22182

llvm-svn: 275216
2016-07-12 21:13:44 +00:00
David Majnemer 8b401013c1 [LoopAccessAnalysis] Some minor cleanups
Use range-base for loops.
Use auto when appropriate.

No functional change is intended.

llvm-svn: 275213
2016-07-12 20:31:46 +00:00
George Burgess IV 1cbd039234 Attempt to make buildbots happy.
Woohoo, unused variable warnings in builds without asserts (as a result
of r275122).

llvm-svn: 275126
2016-07-11 23:18:32 +00:00
George Burgess IV de1be7171a [CFLAA] Simplify CFLGraphBuilder. NFC.
This patch simplifies the graph builder by encoding nodes as {Value,
Dereference Level} pairs. This lets us kill edge types, and allows us to
get rid of hacks in StratifiedSets (like addAttrsBelow/...). This
simplification also allows us to remove InstantiatedRelations and
InstantiatedAttrs.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D22080

llvm-svn: 275122
2016-07-11 22:59:09 +00:00
Alina Sbirlea 327955e057 Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer
Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Add additional parameters: AddressSpace, Alignment, Fast.

Reviewers: llvm-commits, jlebar

Subscribers: arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21935

llvm-svn: 275100
2016-07-11 20:46:17 +00:00
Dehao Chen 9232f98279 Implement callsite-hotness based inline cost for Sample-based PGO
Summary:
For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch.

E.g.

if (A1 && A2 && A3 && ..... && A10) {
  for (i=0; i < 100000000; i++) {
    callsite();
  }
}

Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.

In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.

Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.

Reviewers: davidxl, eraman, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22118

llvm-svn: 275073
2016-07-11 16:48:54 +00:00
Nicolai Haehnle 9343f36f8e AliasAnalysis: unify getModRefInfo(I, CS) semantics with other overloads
This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to
ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when
the Instruction is a call-site.

This is now more in line with getModRefInfo generally: it returns Mod when
I modifies a memory location that is accessed (read or written) by CS and
Ref when I reads a memory location that is written by CS.

From a grep of the code, the only uses of this particular getModRefInfo
overload are in MemorySSA and MemCpyOptimizer, and they only care about
where the result is MR_NoModRef or not. Therefore, this change should have
no visible effect.

Separated out from D17279 upon request.

llvm-svn: 275065
2016-07-11 14:11:45 +00:00
Hal Finkel 2cac58f604 Pointer-comparison folding should look through returned-argument functions
For functions which are known to return a specific argument, pointer-comparison
folding can look through the function calls as part of its analysis.

Differential Revision: http://reviews.llvm.org/D9387

llvm-svn: 275039
2016-07-11 03:37:59 +00:00
Hal Finkel bf3957a553 Teach isDereferenceablePointer to look through returned-argument functions
For functions which are known to return their argument,
isDereferenceableAndAlignedPointer can examine the argument value.

Differential Revision: http://reviews.llvm.org/D9384

llvm-svn: 275038
2016-07-11 03:08:49 +00:00
Hal Finkel e186debb8b Teach SCEV to look through returned-argument functions
When building SCEVs, if a function is known to return its argument, then we can
build the SCEV using the corresponding argument value.

Differential Revision: http://reviews.llvm.org/D9381

llvm-svn: 275037
2016-07-11 02:48:23 +00:00
Hal Finkel 6fd5e1f02b Teach computeKnownBits to look through returned-argument functions
If a function is known to return one of its arguments, we can use that in order
to compute known bits of the return value.

Differential Revision: http://reviews.llvm.org/D9397

llvm-svn: 275036
2016-07-11 02:25:14 +00:00
Hal Finkel 5c12d8fe8f BasicAA should look through functions with returned arguments
Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look
through returned-argument functions when answering queries. This is essential
so that we don't loose all other AA information when supplementing with
llvm.noalias.

Differential Revision: http://reviews.llvm.org/D9383

llvm-svn: 275035
2016-07-11 01:32:20 +00:00
George Burgess IV 53b195c39c [CFLAA] Make a constant variable `const`. NFC.
`const` was dropped by r274958, and the lack of `const` makes GCC6
(correctly) complain.

llvm-svn: 274961
2016-07-09 03:21:25 +00:00
George Burgess IV c294d0dcc2 [CFLAA] Move the graph builder out from CFLSteens. NFC.
Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D22022

llvm-svn: 274958
2016-07-09 02:54:42 +00:00
George Burgess IV 1c4e7962dd [CFLAA] Simplify CFLGraphBuilder. NFC.
This removes a few fields from the graph builder by making us compute
things (that we'd always compute anyway) more eagerly.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D22009

llvm-svn: 274957
2016-07-09 02:48:56 +00:00
Anna Thomas 9ad45adfd7 Revert "InstCombine rule to fold truncs whose value is available"
This reverts commit r274853.
Caused failure in ppcBE build

llvm-svn: 274943
2016-07-08 22:15:08 +00:00
Jingyue Wu 15f3e82d42 [TTI] Expose TTI::getGEPCost and use it in SLSR and NaryReassociate.
NFC.

llvm-svn: 274940
2016-07-08 21:48:05 +00:00
Xinliang David Li 07e08fa36b [PM] name the new PM LAA class LoopAccessAnalysis (LAA) /NFC
llvm-svn: 274934
2016-07-08 21:21:44 +00:00
Xinliang David Li 7853c1dd73 Rename LoopAccessAnalysis to LoopAccessLegacyAnalysis /NFC
llvm-svn: 274927
2016-07-08 20:55:26 +00:00
Anna Thomas 3124f6273a InstCombine rule to fold truncs whose value is available
We can fold truncs whose operand feeds from a load, if the trunc value
is available through a prior load/store.

This change is from: http://reviews.llvm.org/D21246, which folded the
trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW
call, when the load type didnt match the prior load/store type.

Differential Revision: http://reviews.llvm.org/D21791

llvm-svn: 274853
2016-07-08 15:18:56 +00:00
Sanjay Patel 490193d2e9 fix formatting; NFC
llvm-svn: 274765
2016-07-07 16:19:09 +00:00
Chandler Carruth 168800c97d [LCG] Hoist the definitions of the stream operator friends to be inline
friend definitions.

Based on the experiments Sean Silva and Reid did, this seems the safest
course of action and also will work around a questionable warning
provided by GCC6 on the old form of the code. Thanks for Davide pointing
out the issue and other suggesting ways to fix.

llvm-svn: 274740
2016-07-07 07:52:07 +00:00
David Majnemer 7afb46d3c8 [LoopAccessAnalysis] Fix an integer overflow
We were inappropriately using 32-bit types to account for quantities
that can be far larger.

Fixed in PR28443.

llvm-svn: 274737
2016-07-07 06:24:36 +00:00
Sean Silva 284b0324e2 [PM] Avoid getResult on a higher level in LoopAccessAnalysis
Note that require<domtree> and require<loops> aren't needed because they
come in implicitly via the loop pass manager.

llvm-svn: 274712
2016-07-07 01:01:53 +00:00
George Burgess IV e191996a57 [CFLAA] Split out more things from CFLSteens. NFC.
"More things" = StratifiedAttrs and various bits like interprocedural
summaries.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21964

llvm-svn: 274592
2016-07-06 00:47:21 +00:00
George Burgess IV 1ca8affb24 [CFLAA] Split the CFL graph out from CFLSteens. NFC.
Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21963

llvm-svn: 274591
2016-07-06 00:36:12 +00:00
George Burgess IV bfa401e5ad [CFLAA] Split into Anders+Steens analysis.
StratifiedSets (as implemented) is very fast, but its accuracy is also
limited. If we take a more aggressive andersens-like approach, we can be
way more accurate, but we'll also end up being slower.

So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA.

Long-term, we want to end up in a place where CFLSteens is queried
first; if it can provide an answer, great (since queries are basically
map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc.

This patch splits everything out so we can try to do something like
that when we get a reasonable CFLAnders implementation.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21910

llvm-svn: 274589
2016-07-06 00:26:41 +00:00
Nicolai Haehnle 84c9f9919a Add writeonly IR attribute
Summary:
This complements the earlier addition of IntrWriteMem and IntrWriteArgMem
LLVM intrinsic properties, see D18291.

Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis.

Reviewers: reames, joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18714

llvm-svn: 274485
2016-07-04 08:01:29 +00:00
NAKAMURA Takumi 4cb46e6747 Reformat blank lines.
llvm-svn: 274481
2016-07-04 01:26:33 +00:00
NAKAMURA Takumi f252951e90 Reformat comment lines.
llvm-svn: 274480
2016-07-04 01:26:27 +00:00
NAKAMURA Takumi 940cd9368d Untabify.
llvm-svn: 274479
2016-07-04 01:26:21 +00:00
NAKAMURA Takumi f4c6441b01 Reformat.
llvm-svn: 274478
2016-07-04 01:26:14 +00:00
Sean Silva 45835e731d Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

llvm-svn: 274455
2016-07-02 23:47:27 +00:00
Xinliang David Li 8a021317a2 [PM] Port LoopAccessInfo analysis to new PM
It is implemented as a LoopAnalysis pass as 
discussed and agreed upon.

llvm-svn: 274452
2016-07-02 21:18:40 +00:00
Benjamin Kramer 3bc1edf95b Use arrays or initializer lists to feed ArrayRefs instead of SmallVector where possible.
No functionality change intended.

llvm-svn: 274431
2016-07-02 11:41:39 +00:00
Xinliang David Li 94734eef33 [PM] refactor LoopAccessInfo code part-2
Differential Revision: http://reviews.llvm.org/D21636

llvm-svn: 274334
2016-07-01 05:59:55 +00:00
Adam Nemet f45594c912 [LAA] Fix alphabetical sorting of headers. NFC
llvm-svn: 274302
2016-07-01 00:09:02 +00:00
Matt Arsenault 727e279ac4 SLPVectorizer: Move propagateMetadata to VectorUtils
This will be re-used by the LoadStoreVectorizer.

Fix handling of range metadata and testcase by Justin Lebar.

llvm-svn: 274281
2016-06-30 21:17:59 +00:00
Sanjoy Das 0da2d14766 [SCEV] Compute max be count from shift operator only if all else fails
In particular, check to see if we can compute a precise trip count by
exhaustively simulating the loop first.

llvm-svn: 274199
2016-06-30 02:47:28 +00:00
George Burgess IV d86e38e1db [CFLAA] Add support for ModRef queries.
This patch makes CFLAA answer some ModRef queries. Because we don't
distinguish between reading/writing when making StratifiedSets, we're
unable to offer any of the readonly-related answers.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21858

llvm-svn: 274197
2016-06-30 02:11:26 +00:00
Elena Demikhovsky 5e21c94f25 Reverted patch 273864
llvm-svn: 274115
2016-06-29 10:01:06 +00:00
Craig Topper df7454f94b Revert "[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition."
This is breaking an optimizaton remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything.

llvm-svn: 274102
2016-06-29 04:57:00 +00:00
Craig Topper 2cc199baff [ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition.
If a operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case.

I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at.

Differential revision: http://reviews.llvm.org/D21493

llvm-svn: 274098
2016-06-29 03:46:47 +00:00
Chad Rosier 02e831cf2f Typos. NFC.
llvm-svn: 274038
2016-06-28 17:19:10 +00:00
Xinliang David Li 3e176c77ab [BFI/MBFI]: cfg graph view with color scheme
This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).

llvm-svn: 273996
2016-06-28 06:58:21 +00:00
Xinliang David Li 8dd5ce97f9 [BFI]: enhance BFI graph dump
MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..

llvm-svn: 273992
2016-06-28 04:07:03 +00:00
Xinliang David Li 55415f2565 [BFI]: graph viewer code refactoring
BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.

This patch also enables BFI graph to show branch probability
on edges as MBFI does before.

llvm-svn: 273990
2016-06-28 03:41:29 +00:00
Chandler Carruth dca834089a [PM] Improve the debugging and logging facilities of the CGSCC bits of
the new pass manager.

This adds operator<< overloads for the various bits of the
LazyCallGraph, dump methods for use from the debugger, and debug logging
using them to the CGSCC pass manager.

Having this was essential for debugging the call graph update patch, and
I've extracted what I could from that patch here to minimize the delta.

llvm-svn: 273961
2016-06-27 23:26:08 +00:00
George Burgess IV f10c7fc286 [CFLAA] Make MSVC happy. NFC.
Apparently, MSVC complains if there's an implicit conversion from
`unsigned` to `unsigned long long`, if the `unsigned` is the result of
a bit shift.

llvm-svn: 273955
2016-06-27 22:50:01 +00:00
Easwaran Raman 22eb80a114 Fix size computation of array allocation in inline cost analysis
Differential revision: http://reviews.llvm.org/D21690

llvm-svn: 273952
2016-06-27 22:31:53 +00:00
George Burgess IV 11ff52bffc [CFLAA] Use unsigned numbers for bit-shifts.
This uses `1U` instead of `1ULL` because StratifiedAttrs is a 32-bit
bitset.

Thanks to Hans-Bernhard Broker for bringing this up.

llvm-svn: 273902
2016-06-27 18:35:00 +00:00
Elena Demikhovsky 4c58b2761a Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)

  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Re-commit rL273257 - revision: http://reviews.llvm.org/D20789

llvm-svn: 273864
2016-06-27 11:19:23 +00:00
Igor Breger 7357849dca [ConstantFolding] Fix bitcast vector of i1.
Differential Revision: http://reviews.llvm.org/D21735

llvm-svn: 273845
2016-06-27 06:42:54 +00:00
Benjamin Kramer aa2091505f Apply clang-tidy's modernize-loop-convert to lib/Analysis.
Only minor manual fixes. No functionality change intended.

llvm-svn: 273816
2016-06-26 17:27:42 +00:00
David Majnemer bb53d23ef8 [InstSimplify] Replace calls to null with undef
Calling null is undefined behavior, we can simplify the resulting value
to undef.

llvm-svn: 273777
2016-06-25 07:37:30 +00:00
Peter Collingbourne 0312f614b1 IR: Introduce llvm.type.checked.load intrinsic.
This intrinsic safely loads a function pointer from a virtual table pointer
using type metadata. This intrinsic is used to implement control flow integrity
in conjunction with virtual call optimization. The virtual call optimization
pass will optimize away llvm.type.checked.load intrinsics associated with
devirtualized calls, thereby removing the type check in cases where it is
not needed to enforce the control flow integrity constraint.

This patch also introduces the capability to copy type metadata between
global variables, and teaches the virtual call optimization pass to do so.

Differential Revision: http://reviews.llvm.org/D21121

llvm-svn: 273756
2016-06-25 00:23:04 +00:00
Eli Friedman 5a52856cc8 Fix documentation for FindAvailableLoadedValue.
llvm-svn: 273734
2016-06-24 21:32:15 +00:00
Peter Collingbourne 7efd750607 IR: New representation for CFI and virtual call optimization pass metadata.
The bitset metadata currently used in LLVM has a few problems:

1. It has the wrong name. The name "bitset" refers to an implementation
   detail of one use of the metadata (i.e. its original use case, CFI).
   This makes it harder to understand, as the name makes no sense in the
   context of virtual call optimization.

2. It is represented using a global named metadata node, rather than
   being directly associated with a global. This makes it harder to
   manipulate the metadata when rebuilding global variables, summarise it
   as part of ThinLTO and drop unused metadata when associated globals are
   dropped. For this reason, CFI does not currently work correctly when
   both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable
   globals, and fails to associate metadata with the rebuilt globals. As I
   understand it, the same problem could also affect ASan, which rebuilds
   globals with a red zone.

This patch solves both of those problems in the following way:

1. Rename the metadata to "type metadata". This new name reflects how
   the metadata is currently being used (i.e. to represent type information
   for CFI and vtable opt). The new name is reflected in the name for the
   associated intrinsic (llvm.type.test) and pass (LowerTypeTests).

2. Attach metadata directly to the globals that it pertains to, rather
   than using the "llvm.bitsets" global metadata node as we are doing now.
   This is done using the newly introduced capability to attach
   metadata to global variables (r271348 and r271358).

See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html

Differential Revision: http://reviews.llvm.org/D21053

llvm-svn: 273729
2016-06-24 21:21:32 +00:00
Reid Kleckner fbd5eef691 Revert "InstCombine rule to fold trunc when value available"
This reverts commit r273608.

Broke building code with sanitizers, where apparently these kinds of
loads, casts, and truncations are common:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502
http://crbug.com/623099

llvm-svn: 273703
2016-06-24 18:42:58 +00:00
George Burgess IV d9c39fcbca Attempt to fix MSVC breakage caused by r273636.
Apparently earlier versions of MSVC don't have constexpr bitset ctors.

llvm-svn: 273637
2016-06-24 01:41:29 +00:00
George Burgess IV a3d62be733 [CFLAA] Propagate StratifiedAttrs in interproc. analysis.
This patch also has a refactor that kills StratifiedAttr, and leaves us
with StratifiedAttrs, because having both was mildly redundant.

This patch makes us correctly handle stratified attributes when doing
interprocedural analysis. It also adds another attribute, AttrCaller,
which acts like AttrUnknown. We can filter out AttrCaller values when
during interprocedural analysis, since the caller should have
information about what arguments it's passing to its callee.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21645

llvm-svn: 273636
2016-06-24 01:00:03 +00:00
George Burgess IV d14d05affe Attempt #2 to unbreak bots broken by r273596.
Some of the bots running GCC 4.7 seem to be having trouble with lambdas
that explicitly capture `this`. Relevant-looking bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53137

llvm-svn: 273613
2016-06-23 20:59:13 +00:00
Anna Thomas 31a0b2088f InstCombine rule to fold trunc when value available
Summary:
This instcombine rule folds away trunc operations that have value available from a prior load or store.
This kind of code can be generated as a result of GVN widening the load or from source code as well.

Reviewers: reames, majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21246

llvm-svn: 273608
2016-06-23 20:22:22 +00:00
George Burgess IV fe1397b977 Attempt to fix breakage caused by r273596.
llvm-svn: 273601
2016-06-23 19:16:04 +00:00
George Burgess IV 1f99da54c2 [CFLAA] Use better interprocedural function summaries.
Previously, we just unified any arguments that seemed to be related to
each other. With this patch, we now respect dereference levels, etc.
which should make us substantially more accurate. Proper handling of
StratifiedAttrs will be done in a later patch.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21536

llvm-svn: 273596
2016-06-23 18:55:23 +00:00
Sanjay Patel e053621071 [ValueTracking] simplify logic in ComputeNumSignBits (NFCI)
This was noted in http://reviews.llvm.org/D21610 . The previous code
predated the use of APInt ( http://reviews.llvm.org/rL47654 ), so it
had to account for the fixed width of uint64_t.

Now that we're using the variable width APInt, we can remove some
complexity.

llvm-svn: 273584
2016-06-23 17:41:59 +00:00
Michael Zolotukhin 2d3592d481 [LoopUnrollAnalyzer] Fix a bug in UnrolledInstAnalyzer::visitLoad.
When simplifying a load we need to make sure that the type of the
simplified value matches the type of the instruction we're processing.
In theory, we can handle casts here as we deal with constant data, but
since it's not implemented at the moment, we at least need to bail out.

This fixes PR28262.

llvm-svn: 273562
2016-06-23 14:31:31 +00:00
Xinliang David Li ce030acb4e [PM]: LoopAccessInfo simple refactoring
To make definition of mov ctors easier.
Differential Revision: http://reviews.llvm.org/D21563

llvm-svn: 273506
2016-06-22 23:20:59 +00:00
Sanjay Patel a06d989552 [ValueTracking] improve ComputeNumSignBits for vector constants
This is similar to the computeKnownBits improvement in rL268479. 
There's probably more we can do for vector logic instructions, but 
this should let us see non-splat constant masking ops that can
become vector selects instead of and/andn/or sequences.

Differential Revision: http://reviews.llvm.org/D21610

llvm-svn: 273459
2016-06-22 19:20:59 +00:00
Xinliang David Li b12b353a41 [BFI]: NFC refactoring
move getBlockProfileCount implementation to the
base class so that MBFI can share too.

llvm-svn: 273442
2016-06-22 17:12:12 +00:00
Elena Demikhovsky a266cf0518 reverted the prev commit due to assertion failure
llvm-svn: 273258
2016-06-21 12:10:11 +00:00
Elena Demikhovsky 9823c995bc Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)
  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Differential revision: http://reviews.llvm.org/D20789

llvm-svn: 273257
2016-06-21 11:32:01 +00:00
David Majnemer e61e4bfd87 Replace silly uses of 'signed' with 'int'
llvm-svn: 273244
2016-06-21 05:10:24 +00:00
Davide Italiano 9cc0bca23c [TargetLibraryInfo] Reduce code duplication.
llvm-svn: 273241
2016-06-21 04:32:21 +00:00
George Burgess IV 9fdbfe17a8 [CFLAA] Be more aggressive with interprocedural analysis.
This patch makes us perform interprocedural analysis on functions that
don't have internal linkage. It also removes a test that should've been
deleted in an earlier commit (since other tests now cover everything
that the newly-removed test covers).

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21513

llvm-svn: 273229
2016-06-21 01:42:47 +00:00
George Burgess IV a99cd049d0 Attempt to make MSVC buildbots happy.
Broken by r273219.

llvm-svn: 273220
2016-06-20 23:20:49 +00:00
George Burgess IV 87b2e41416 [CFLAA] Add interprocedural function summaries.
This patch adds function summaries, so that we don't need to recompute
various properties about function parameters/return values at each
callsite of a function. It also adds many interprocedural tests for
CFLAA.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21475#inline-182390

llvm-svn: 273219
2016-06-20 23:10:56 +00:00
Sanjay Patel 9ad8fb68f7 [InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869)
By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question
raised by PR27869:
https://llvm.org/bugs/show_bug.cgi?id=27869
...where InstCombine turns an icmp+zext into a shift causing us to miss the fold.

Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp.

Differential Revision: http://reviews.llvm.org/D21512

llvm-svn: 273200
2016-06-20 20:59:59 +00:00
Patrik Hagglund 96f13afcbc Avoid output indeterminism between GCC and Clang builds.
Remove dependency of the evalution order of function arguments, which
is unspecified.

Patch by David Stenberg.

llvm-svn: 273145
2016-06-20 10:19:04 +00:00
Eli Friedman f3b71581dd Fix dynamically linked debug builds.
On the surface, this might not look like it does anything... but
actually it brings in the declaration "extern template class
AnalysisManager<Loop>;", which suppresses the instantiation of the
constructor, which avoids the funny interaction between "extern
template" and -fvisibility-inlines-hidden.

llvm-svn: 273133
2016-06-20 02:48:11 +00:00
Sanjay Patel f8ee0e0218 fix formatting, typo; NFC
llvm-svn: 273118
2016-06-19 17:20:27 +00:00
Sean Silva 7cb30664fc Add a super basic LazyCallGraph DOT printer.
Access it through -passes=print-lcg-dot

Let me know any suggestions for changing the rendering; I'm not
particularly attached to what is implemented here.

llvm-svn: 273082
2016-06-18 09:17:32 +00:00
Sanjoy Das e8fd9561cb [SCEV] Fix incorrect trip count computation
The way we elide max expressions when computing trip counts is incorrect
-- it breaks cases like this:

```
static int wrapping_add(int a, int b) {
  return (int)((unsigned)a + (unsigned)b);
}

void test() {
  volatile int end_buf = 2147483548; // INT_MIN - 100
  int end = end_buf;

  unsigned counter = 0;
  for (int start = wrapping_add(end,  200); start < end; start++)
    counter++;

  print(counter);
}
```

Note: the `NoWrap` variable that was being tested has little to do with
the values flowing into the max expression; it is a property of the
induction variable.

test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test
functionality I'm reverting in this change, so I've deleted the test
fully.

llvm-svn: 273079
2016-06-18 04:38:31 +00:00
Adam Nemet a9f09c6245 [LAA] Enable symbolic stride speculation for all LAA clients
This is a functional change for LLE and LDist.  The other clients (LV,
LVerLICM) already had this explicitly enabled.

The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides.  This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter.  This makes the
interface more friendly to the new Pass Manager.

The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.

llvm-svn: 273064
2016-06-17 22:35:41 +00:00
Benjamin Kramer 4dea8f542b Avoid duplicated map lookups. No functionality change intended.
llvm-svn: 273030
2016-06-17 18:59:41 +00:00
Benjamin Kramer 1d67ac5639 [PPC] Strength-reduce SmallVectors into arrays.
No functionality change intended.

llvm-svn: 272999
2016-06-17 13:15:10 +00:00
Chandler Carruth 164a2aa6f4 [PM] Remove support for omitting the AnalysisManager argument to new
pass manager passes' `run` methods.

This removes a bunch of SFINAE goop from the pass manager and just
requires pass authors to accept `AnalysisManager<IRUnitT> &` as a dead
argument. This is a small price to pay for the simplicity of the system
as a whole, despite the noise that changing it causes at this stage.

This will also helpfull allow us to make the signature of the run
methods much more flexible for different kinds af passes to support
things like intelligently updating the pass's progression over IR units.

While this touches many, many, files, the changes are really boring.
Mostly made with the help of my trusty perl one liners.

Thanks to Sean and Hal for bouncing ideas for this with me in IRC.

llvm-svn: 272978
2016-06-17 00:11:01 +00:00
Adam Nemet c953bb9953 [LV] Move management of symbolic strides to LAA. NFCI
This is still NFCI, so the list of clients that allow symbolic stride
speculation does not change (yes: LV and LoopVersioningLICM, no: LLE,
LDist).  However since the symbolic strides are now managed by LAA
rather than passed by client a new bool parameter is used to enable
symbolic stride speculation.

The existing test Transforms/LoopVectorize/version-mem-access.ll checks
that stride speculation is performed for LV.

The previously added test Transforms/LoopLoadElim/symbolic-stride.ll
ensures that no speculation is performed for LLE.

The next patch will change the functionality and turn on symbolic stride
speculation in all of LAA's clients and remove the bool parameter.

llvm-svn: 272970
2016-06-16 22:57:55 +00:00
Matt Arsenault 8dad57cc49 TTI: Add hook for memory width to vectorize
llvm-svn: 272964
2016-06-16 21:43:12 +00:00
Igor Laevsky 87f0d0e185 Revert r272891 "[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo"
It was causing failures in Profile-i386 and Profile-x86_64 tests.

llvm-svn: 272912
2016-06-16 16:25:53 +00:00
Igor Laevsky c9179fd2c2 [JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo
We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise 
we will get dangling pointer inside BranchProbabilityInfo cache.

Differential Revision: http://reviews.llvm.org/D20957

llvm-svn: 272891
2016-06-16 13:28:25 +00:00
Adam Nemet 139ffba398 [LAA] Rename Strides to SymblicStrides in analyzeLoop. NFC
This is to facilitate to move of SymblicStrides from LV to LAA.

llvm-svn: 272879
2016-06-16 08:27:03 +00:00
Adam Nemet bdbc5227ce [LAA] Default getInfo to not speculate symbolic strides. NFC
Soon we won't be passing Strides to getInfo and then we'll have fewer
call sites to update.

llvm-svn: 272878
2016-06-16 08:26:56 +00:00
Eli Friedman bd254a6f45 [InstCombine] Don't widen metadata on store-to-load forwarding
The original check for load CSE or store-to-load forwarding is wrong
when the forwarded stored value happened to be a load.

Ref https://github.com/JuliaLang/julia/issues/16894

Differential Revision: http://reviews.llvm.org/D21271

Patch by Yichao Yu!

llvm-svn: 272868
2016-06-16 02:33:42 +00:00
George Burgess IV 259d90194e [CFLAA] Ignore non-pointers, move Attrs to graph nodes.
This patch makes CFLAA ignore non-pointer values, since we can now
sanely do that with the escaping/unknown attributes. Additionally,
StratifiedAttrs make more sense to sit on nodes than edges (since
they're properties of values, and ultimately end up on the nodes of
StratifiedSets). So, this patch puts said attributes on nodes.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21387

llvm-svn: 272833
2016-06-15 20:43:41 +00:00
David Majnemer b62692e2e0 [TargetLibraryInfo] Teach isValidProtoForLibFunc about tan
We would fail to validate the type of the tan function which would cause
downstream users of isValidProtoForLibFunc to assert.

This fixes PR28143.

llvm-svn: 272802
2016-06-15 16:47:23 +00:00
Sanjoy Das b277a425c4 [SCEV] Use dyn_cast<T> instead of dyn_cast<const T>; NFC
The const is unnecessary.

llvm-svn: 272759
2016-06-15 06:53:55 +00:00
Sanjoy Das aba989f454 [SCEV] Use cast<> instead of dyn_cast; NFC
llvm-svn: 272758
2016-06-15 06:53:51 +00:00
Sanjoy Das 0e392d5dd7 [SCEV] clang-format some sections
llvm-svn: 272753
2016-06-15 04:37:50 +00:00
Sanjoy Das 5a3d893b48 [SCEV] Change the interface for SolveQuadraticEquation; NFC
Use Optional<T> to denote the absence of a solution, not
SCEVCouldNotCompute.  This makes the usage of SolveQuadraticEquation
somewhat simpler.

llvm-svn: 272752
2016-06-15 04:37:47 +00:00
Peter Collingbourne 96efdd6107 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

llvm-svn: 272709
2016-06-14 21:01:22 +00:00
Sanjoy Das d7e8206b58 [ValueTracking] Calls to @llvm.assume always return
This change teaches llvm::isGuaranteedToTransferExecutionToSuccessor
that calls to @llvm.assume always terminate.  Most other relevant
intrinsics should be covered by the "CS.onlyReadsMemory() ||
CS.onlyAccessesArgMemory()" bit but we were missing @llvm.assumes
because we state that it clobbers memory.

Added an LICM test case, but this change is not specific to LICM.

llvm-svn: 272703
2016-06-14 20:23:16 +00:00
George Burgess IV 24eb0daf7c [CFLAA] Tag arguments as escaped instead of unknown.
This patch also includes some refactoring.

Prior to this patch, we tagged all CFLAA attributes as unknown. This is
suboptimal, since it meant that any Value used as an argument would be
considered to alias any other Value that existed.

Now that we have the machinery to tag sets below the set for an
arbitrary value with attributes, it's okay to be less conservative with
arguments. (Specifically, we still tag the set under an argument with
unknown).

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21262

llvm-svn: 272690
2016-06-14 18:12:28 +00:00
George Burgess IV e17756e0fe [CFLAA] Refactor graph-building code. NFC.
This patch refactors CFLAA's graph building code. This makes keeping
track of common state (TargetLibraryInfo, ...) easier.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21261

llvm-svn: 272688
2016-06-14 18:02:27 +00:00
Nicolai Haehnle 377975f2f7 AMDGPU: mark {exp,log}10{,f,l} library functions as unavailable
Summary:
The SimplifyLibCalls part of InstCombine generates calls to those otherwise.

I wonder if at some point we shouldn't just call disableAllFunctions() and
then enable functions on a whitelist basis...

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96495

Reviewers: arsenm, tstellarAMD

Subscribers: llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21282

llvm-svn: 272664
2016-06-14 13:14:53 +00:00
Sean Silva 687019facb [PM] Port LVI to the new PM.
This is a bit gnarly since LVI is maintaining its own cache.
I think this port could be somewhat cleaner, but I'd rather not spend
too much time on it while we still have the old pass hanging around and
limiting how much we can clean things up.
Once the old pass is gone it will be easier (less time spent) to clean
it up anyway.

This is the last dependency needed for porting JumpThreading which I'll
do in a follow-up commit (there's no printer pass for LVI or anything to
test it, so porting a pass that depends on it seems best).

I've been mostly following:
r269370 / D18834 which ported Dependence Analysis
r268601 / D19839 which ported BPI

llvm-svn: 272593
2016-06-13 22:01:25 +00:00
Sanjoy Das d0bdf3e02b Fix AAResults::callCapturesBefore for operand bundles
Summary:
AAResults::callCapturesBefore would previously ignore operand
bundles. It was possible for a later instruction to miss its memory
dependency on a call site that would only access the pointer through a
bundle.

Patch by Oscar Blumberg!

Reviewers: sanjoy

Differential Revision: http://reviews.llvm.org/D21286

llvm-svn: 272580
2016-06-13 19:55:04 +00:00
George Burgess IV 99646871a7 Attempt to make windows buildbots happy.
Broken by r272578. I didn't realize that the default move ctor
complaints would happen for non-template classes. :)

llvm-svn: 272579
2016-06-13 19:38:49 +00:00
George Burgess IV dc96febc37 [CFLAA] Refactor to remove redundant maps. NFC.
Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21233

llvm-svn: 272578
2016-06-13 19:21:18 +00:00
Eli Friedman f1da33e4d3 [LICM] Make isGuaranteedToExecute more accurate.
Summary:
Make isGuaranteedToExecute use the
isGuaranteedToTransferExecutionToSuccessor helper, and make that helper
a bit more accurate.

There's a potential performance impact here from assuming that arbitrary
calls might not return. This probably has little impact on loads and
stores to a pointer because most things alias analysis can reason about
are dereferenceable anyway. The other impacts, like less aggressive
hoisting of sdiv by a variable and less aggressive hoisting around
volatile memory operations, are unlikely to matter for real code.

This also impacts SCEV, which uses the same helper.  It's a minor
improvement there because we can tell that, for example, memcpy always
returns normally. Strictly speaking, it's also introducing
a bug, but it's not any worse than everywhere else we assume readonly
functions terminate.

Fixes http://llvm.org/PR27857.

Reviewers: hfinkel, reames, chandlerc, sanjoy

Subscribers: broune, llvm-commits

Differential Revision: http://reviews.llvm.org/D21167

llvm-svn: 272489
2016-06-11 21:48:25 +00:00
Mehdi Amini bbacddfe92 Interprocedural Register Allocation (IPRA) Analysis
Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.

When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.

The analysis is split in two parts, RegUsageInfoCollector is the
MachineFunction Pass that runs post-RA and collect the list of
clobbered registers to produce a register mask.

An immutable pass, RegisterUsageInfo, stores the RegMask produced by
RegUsageInfoCollector, and keep them available. A future tranformation
pass will use this information to update every call-sites after
instruction selection.

Patch by Vivek Pandya <vivekvpandya@gmail.com>

Differential Revision: http://reviews.llvm.org/D20769

llvm-svn: 272403
2016-06-10 16:19:46 +00:00
Richard Trieu fae70b6f2b Add null checks before using a pointer.
llvm-svn: 272359
2016-06-10 01:42:05 +00:00
George Burgess IV 652ec4f595 [CFLAA] Handle global/arg attrs more sanely.
Prior to this patch, we used argument/global stratified attributes in
order to note that a value could have come from either dereferencing a
global/arg, or from the assignment from a global/arg.

Now, AttrUnknown is placed on sets when we see a dereference, instead of
the global/arg attributes. This allows us to be more aggressive in the
future when we see global/arg attributes without AttrUnknown.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21110

llvm-svn: 272335
2016-06-09 23:15:04 +00:00
Easwaran Raman 71069cf67d Use ProfileSummaryInfo in inline cost analysis.
Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo.

Differential revision: http://reviews.llvm.org/D21045

llvm-svn: 272321
2016-06-09 22:23:21 +00:00
Xinliang David Li ecde1c7f3d Revert r272194 No need for it if loop Analysis Manager is used
llvm-svn: 272243
2016-06-09 03:22:39 +00:00
Sanjoy Das 1eade91513 Minor clean up in loopHasNoAbnormalExits; NFC
llvm-svn: 272238
2016-06-09 01:14:03 +00:00
Sanjoy Das c7f69b921f Be wary of abnormal exits from loop when exploiting UB
We can safely rely on a NoWrap add recurrence causing UB down the road
only if we know the loop does not have a exit expressed in a way that is
opaque to ScalarEvolution (e.g. by a function call that conditionally
calls exit(0)).

I believe with this change PR28012 is fixed.

Note: I had to change some llvm-lit tests in LoopReroll, since it looks
like they were depending on this incorrect behavior.

llvm-svn: 272237
2016-06-09 01:13:59 +00:00
Sanjoy Das 97cd7d5d44 Factor out a loopHasNoAbnormalExits; NFC
llvm-svn: 272236
2016-06-09 01:13:54 +00:00
Xinliang David Li 572135f717 [PM] Refector LoopAccessInfo analysis code
This is the preparation patch to port the analysis to new PM

Differential Revision: http://reviews.llvm.org/D20560

llvm-svn: 272194
2016-06-08 20:15:37 +00:00
Benjamin Kramer c321e53402 Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
2016-06-08 19:09:22 +00:00
George Burgess IV fd4e2f7cb2 Attempt #2 to appease the buildbots.
MSVC calls the copy ctor on StratifiedSets for some reason. So,
undelete it.

llvm-svn: 272184
2016-06-08 17:56:35 +00:00
Sanjoy Das 2401c98475 [SCEV] Break out of loop if there is no more work to do
This is NFC as far as externally visible behavior is concerned, but will
keep us from spinning in the worklist traversal algorithm unnecessarily.

llvm-svn: 272182
2016-06-08 17:48:46 +00:00
Sanjoy Das 8598412e24 [SCEV] Track no-abnormal-exits instead of no-throw calls
Absence of may-unwind calls is not enough to guarantee that a
UB-generating use of an add-rec poison in the loop latch will actually
cause UB.  We also need to guard against calls that terminate the thread
or infinite loop themselves.

This partially addresses PR28012.

llvm-svn: 272181
2016-06-08 17:48:42 +00:00
Sanjoy Das 9a65cd214d Teach isGuarantdToTransferExecToSuccessor about debug info intrinsics
Calls to `@llvm.dbg.*` can be assumed to terminate.

llvm-svn: 272180
2016-06-08 17:48:36 +00:00
Sanjoy Das a19edc4d15 Fix a bug in SCEV's poison value propagation
The worklist algorithm introduced in rL271151 didn't check to see if the
direct users of the post-inc add recurrence propagates poison.  This
change fixes the problem and makes the code structure more obvious.

Note for release managers: correctness wise, this bug wasn't a
regression introduced by rL271151 -- the behavior of SCEV around
post-inc add recurrences was strictly improved (in terms of correctness)
in rL271151.

llvm-svn: 272179
2016-06-08 17:48:31 +00:00
George Burgess IV 785f391131 Try to appease buildbots.
r272064 apparently made them angry. This undoes some changes made in
r272064 (defaulting move ctors) to make them happy again.

llvm-svn: 272173
2016-06-08 17:27:14 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
George Burgess IV 60af226b86 [CFLAA] Kill dead code/fix comments in StratifiedSets.
Also use default/delete instead of hand-written ctors.

Thanks to Jia Chen for bringing this stuff up.

llvm-svn: 272064
2016-06-07 21:41:18 +00:00
George Burgess IV a1f9a2daeb [CFLAA] Add AttrEscaped, remove bit twiddling functions.
This patch does a few things:

- Unifies AttrAll and AttrUnknown (since they were used for more or less
  the same purpose anyway).

- Introduces AttrEscaped, an attribute that notes that a value escapes
  our analysis for a given set, but not that an unknown value flows into
  said set.

- Removes functions that take bit indices, since we also had functions
  that took bitsets, and the use of both (with similar names) was
  unclear and bug-prone.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21000

llvm-svn: 272040
2016-06-07 18:35:37 +00:00
Andrey Turetskiy 9f02c58670 [LAA] Improve non-wrapping pointer detection by handling loop-invariant case.
This fixes PR26314. This patch adds new helper “isNoWrap” with detection of
loop-invariant pointer case.

Patch by Roman Shirokiy.

Ref: https://llvm.org/bugs/show_bug.cgi?id=26314

Differential Revision: http://reviews.llvm.org/D17268

llvm-svn: 272014
2016-06-07 14:55:27 +00:00
Michael Zolotukhin 19edbadfc5 [LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.
In some cases, when simplifying with SCEV, we might consider pointer values as
just usual integer values.  Thus, we might get a different type from what we
had originally in the map of simplified values, and hence we need to check
types before operating on the values.

This fixes PR28015.

llvm-svn: 271931
2016-06-06 19:21:40 +00:00
Matthew Simpson e3e3b994ae [LAA] Use load and store vectors (NFC)
Contributed-by: Aditya Kumar <hiraditya@msn.com>
Differential Revision: http://reviews.llvm.org/D20953

llvm-svn: 271895
2016-06-06 14:15:41 +00:00
Simon Pilgrim ba319ded5e [Analysis] Enabled BITREVERSE as a vectorizable intrinsic
Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve.

llvm-svn: 271803
2016-06-04 20:21:07 +00:00
Easwaran Raman 019e0bf592 Reapply r271728 after adding move cobstructor for ProfileSummaryInfo
llvm-svn: 271745
2016-06-03 22:54:26 +00:00
Easwaran Raman 94edaaaefb Revert r271728 as it breaks Windows build
llvm-svn: 271738
2016-06-03 21:14:26 +00:00
Easwaran Raman d142050f3a Analysis pass to access profile summary info
Differential Revision: http://reviews.llvm.org/D20648

llvm-svn: 271728
2016-06-03 20:37:19 +00:00
Sanjay Patel dba8b4c04d transform obscured FP sign bit ops into a fabs/fneg using TLI hook
This is effectively a revert of:
http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
and:
http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.

This is intended to resolve the objections raised on the dev list:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html
and:
https://llvm.org/bugs/show_bug.cgi?id=24886#c4

In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later.

Differential Revision: http://reviews.llvm.org/D19391

llvm-svn: 271573
2016-06-02 20:01:37 +00:00
Sanjoy Das 48cad71243 Inline isDereferenceableFromAttribute; NFC
Now that `Value::getPointerDereferenceableBytes` looks beyond just
attributes, the name `isDereferenceableFromAttribute` is misleading.
Just inline the function, since it is small and only used once.

llvm-svn: 271456
2016-06-02 00:52:53 +00:00
Sanjoy Das 00953cbe1d Remove Value::isPointerDereferenceable; NFCI
... and merge into `Value::getPointerDereferenceableBytes`. This was
suggested by Artur Pilipenko in D20764 -- since we no longer allow loads
of unsized types, there is no need anymore to have this special logic.

llvm-svn: 271455
2016-06-02 00:52:48 +00:00
Geoff Berry 0c09517867 [SCEV] Keep SCEVExpander insert points consistent.
Summary:
Make sure that the SCEVExpander Builder insert point and any
saved/restored insert points are kept consistent (i.e. their Instruction
and BasicBlock match) when moving instructions in SCEVExpander.

This fixes an issue triggered by
http://reviews.llvm.org/D18001 [LSR] Create fewer redundant instructions.

Test case will be added in reapply commit of above change:
http://reviews.llvm.org/D18480 Reapply [LSR] Create fewer redundant instructions.

Reviewers: sanjoy

Subscribers: mzolotukhin, sanjoy, qcolombet, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20703

llvm-svn: 271424
2016-06-01 20:03:09 +00:00
Daniel Berlin 73694bb92b Revert "Claim NoAlias if two GEPs index different fields of the same struct"
This reverts commit 2d5d6493f43eb68493a3852b8c226ac9fafdc7eb.

llvm-svn: 271422
2016-06-01 18:55:32 +00:00
George Burgess IV 18b83fe6cf [CFLAA] Recognize builtin allocation functions.
This patch extends CFLAA to recognize allocation functions such as
malloc, free, etc, so we can treat them more aggressively.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D20776

llvm-svn: 271421
2016-06-01 18:39:54 +00:00
Daniel Berlin e846c9dc52 Claim NoAlias if two GEPs index different fields of the same struct
Patch by Taewook Oh

Summary: Patch for Bug 27478. Make BasicAliasAnalysis claims NoAlias if two GEPs index different fields of the same structure.

Reviewers: hfinkel, dberlin

Subscribers: dberlin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20665

llvm-svn: 271415
2016-06-01 18:12:01 +00:00
Sanjoy Das 10df497a1f Reduce dependence on pointee types when deducing dereferenceability
Summary:
Change some of the internal interfaces in Loads.cpp to keep track of the
number of bytes we're trying to prove dereferenceable using an explicit
`Size` parameter.

Before this, the `Size` parameter was implicitly inferred from the
pointee type of the pointer whose dereferenceability we were trying to
prove, causing us to be conservative around bitcasts. This was
unfortunate since bitcast instructions are no-ops and should never
break optimizations.  With an explicit `Size` parameter, we're more
precise (as shown in the test cases), and the code is simpler.

We should eventually move towards a `DerefQuery` struct that groups
together a base pointer, an offset, a size and an alignment; but this
patch is a first step.

Reviewers: apilipenko, dblaikie, hfinkel, reames

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20764

llvm-svn: 271406
2016-06-01 16:47:45 +00:00
George Burgess IV a880146925 [CFLAA] Don't link GEP pointers to GEP indices.
Code like the following is considered broken, and doesn't need to be
supported by our AA magicks:

void getFoo(int *P) {
  int *PAlias = (int *)((char *)NULL + (uintptr_t)P);
}

This patch makes CFLAA drop support for code like this.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D20775

llvm-svn: 271322
2016-05-31 19:55:05 +00:00
Saleem Abdulrasool d2f705ddf9 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backed to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Sanjoy Das f857081c8c [SCEV] Consolidate comments; NFC
Consolidate documentation by removing comments from the .cpp file where
the comments in the .cpp file were copy-pasted from the header.

llvm-svn: 271157
2016-05-29 00:38:22 +00:00
Sanjoy Das 108fcf2e2c [SCEV] Rename functions to LLVM style; NFC
llvm-svn: 271156
2016-05-29 00:38:00 +00:00
Sanjoy Das f49ca52b9d [SCEV] See through op.with.overflow intrinsics (re-apply)
Summary:
This change teaches SCEV to see reduce `(extractvalue
0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if
possible).

This was first checked in at r265912 but reverted in r265950 because it
exposed some issues around how SCEV handled post-inc add recurrences.
Those issues have now been fixed.

Reviewers: atrick, regehr

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18684

llvm-svn: 271152
2016-05-29 00:34:42 +00:00
Sanjoy Das 7e4a64167d [SCEV] Don't always add no-wrap flags to post-inc add recs
Fixes PR27315.

The post-inc version of an add recurrence needs to "follow the same
rules" as a normal add or subtract expression.  Otherwise we miscompile
programs like

```
int main() {
  int a = 0;
  unsigned a_u = 0;
  volatile long last_value;
  do {
    a_u += 3;
    last_value = (long) ((int) a_u);
    if (will_add_overflow(a, 3)) {
      // Leave, and don't actually do the increment, so no UB.
      printf("last_value = %ld\n", last_value);
      exit(0);
    }
    a += 3;
  } while (a != 46);
  return 0;
}
```

This patch changes SCEV to put no-wrap flags on post-inc add recurrences
only when the poison from a potential overflow will go ahead to cause
undefined behavior.

To avoid regressing performance too much, I've assumed infinite loops
without side effects is undefined behavior to prove poison<->UB
equivalence in more cases.  This isn't ideal, but is not new to LLVM as
a whole, and far better than the situation I'm trying to fix.

llvm-svn: 271151
2016-05-29 00:32:17 +00:00
Sanjoy Das 70c2bbd29c [ValueTracking] ICmp instructions propagate poison
This is a stripped down version of D19211, leaving out the questionable
"branching in poison is UB" bit.

llvm-svn: 271150
2016-05-29 00:31:18 +00:00
Michael Zolotukhin d69cd1e086 [LoopUnrollAnalyzer] Add a comment to visitCastInst.
llvm-svn: 271086
2016-05-28 01:40:14 +00:00
Benjamin Kramer 82de7d323d Apply clang-tidy's misc-move-constructor-init throughout LLVM.
No functionality change intended, maybe a tiny performance improvement.

llvm-svn: 270997
2016-05-27 14:27:24 +00:00
Michael Zolotukhin 15e745133e [LoopUnrollAnalyzer] Bail out instead of dying with assert when facing huge index.
This fixes PR27902.

llvm-svn: 270946
2016-05-27 00:55:16 +00:00
Michael Kuperstein ae21491819 [BasicAA] Extend inbound GEP negative offset logic to GlobalVariables
r270777 improved the precision of alloca vs. inbounbds GEP alias queries: if
we have (a) an inbounds GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points to would have a negative offset with
respect to the alloca, then the GEP can not alias pointer (b).

This makes the same logic fire when (b) is based on a GlobalVariable instead
of an alloca.

Differential Revision: http://reviews.llvm.org/D20652

llvm-svn: 270893
2016-05-26 19:30:49 +00:00
David Majnemer 7f32420ed5 [CaptureTracking] Volatile operations capture their memory location
The memory location that corresponds to a volatile operation is very
special.  They are observed by the machine in ways which we cannot
reason about.

Differential Revision: http://reviews.llvm.org/D20555

llvm-svn: 270879
2016-05-26 17:36:22 +00:00
Peter Collingbourne b9aa1f4a03 MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA.
It turns out that too many passes are relying on alias analysis results
for control dependencies. Until we fix that by introducing a more accurate
modelling of control dependencies, special case assume in MemorySSA instead.

Also introduce tests to ensure we don't regress the FunctionAttrs or LICM
passes.

Differential Revision: http://reviews.llvm.org/D20658

llvm-svn: 270823
2016-05-26 04:58:46 +00:00
Davide Italiano bd543d0a0b [LazyValueInfo] Simplify `return after else`. NFCI.
llvm-svn: 270779
2016-05-25 22:29:34 +00:00
Michael Kuperstein 82069c44ca [BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).

For example, consider code like:

struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;

Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1

Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.

Differential Revision: http://reviews.llvm.org/D20495

llvm-svn: 270777
2016-05-25 22:23:08 +00:00
Hal Finkel 2f6886844e Look for a loop's starting location in the llvm.loop metadata
Getting accurate locations for loops is important, because those locations are
used by the frontend to generate optimization remarks. Currently, optimization
remarks for loops often appear on the wrong line, often the first line of the
loop body instead of the loop itself. This is confusing because that line might
itself be another loop, or might be somewhere else completely if the body was
inlined function call. This happens because of the way we find the loop's
starting location. First, we look for a preheader, and if we find one, and its
terminator has a debug location, then we use that. Otherwise, we look for a
location on an instruction in the loop header.

The fallback heuristic is not bad, but will almost always find the beginning of
the body, and not the loop statement itself. The preheader location search
often fails because there's often not a preheader, and even when there is a
preheader, depending on how it was formed, it sometimes carries the location of
some preceeding code.

I don't see any good theoretical way to fix this problem. On the other hand,
this seems like a straightforward solution: Put the debug location in the
loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert
llvm.loop metadata with appropriate locations when generating debugging
information. With these changes, our loop remarks have much more accurate
locations.

Differential Revision: http://reviews.llvm.org/D19738

llvm-svn: 270771
2016-05-25 21:42:37 +00:00
Ahmed Bougacha 201b97f550 [TLI] Also cover Linux 64 libfunc (stat64, ...) prototype checking.
My script missed those in r270750.

llvm-svn: 270763
2016-05-25 21:16:33 +00:00
Ahmed Bougacha 1fe3f1ca50 [TLI] Fix NumParams==0 prototype checking typo.
There was a typo in r267758. It caused invalid accesses when
given something like "void @free(...)", as NumParams == 0, and
we then try to look at the 0th parameter.

Turns out, most of these were untested; add both attribute
and missing-prototype checks for all libc libfuncs.

Differential Revision: http://reviews.llvm.org/D20543

llvm-svn: 270750
2016-05-25 20:22:45 +00:00
Oleg Ranevskyy eb4eccae5c [SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"
Summary:
**Description**

This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code.

When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call.
`getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`.

Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr.

This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction.

**Minimal reproducer:**
```
int foo(int a, int b, int c);
int baz();

void bar()
{
   int arr[20];
   int i = 0;

   for (i = 0; i < 4; ++i)
     arr[i] = baz();

   for (; i < 20; ++i)
     arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]);
}
```

**Clang command line:**
```
clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir
```

**Expected result:**
The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated.

Reviewers: sanjoy

Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D20058

llvm-svn: 270695
2016-05-25 13:01:33 +00:00
Michael Zolotukhin 7216dd4668 [LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst.
This fixes PR27847. Now for real.

llvm-svn: 270629
2016-05-24 22:59:58 +00:00
Sanjay Patel 23019d1006 [ValueTracking, InstSimplify] extend isKnownNonZero() to handle vector constants
Similar in spirit to D20497 :
If all elements of a constant vector are known non-zero, then we can say that the
whole vector is known non-zero.

It seems like we could extend this to FP scalar/vector too, but isKnownNonZero()
says it only works for integers and pointers for now.

Differential Revision: http://reviews.llvm.org/D20544

llvm-svn: 270562
2016-05-24 14:18:49 +00:00
Michael Zolotukhin 3898b2b587 [LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst.
This fixes PR27847.

llvm-svn: 270517
2016-05-24 00:51:01 +00:00
Sanjay Patel e8dc090a2b fix formatting; NFC
llvm-svn: 270465
2016-05-23 17:57:54 +00:00
Sanjay Patel 8ec7e7c216 use 'auto' with 'dyn_cast'; fix formatting; NFC
llvm-svn: 270370
2016-05-22 16:07:20 +00:00
Sanjay Patel e2e89ef936 [ValueTracking, InstCombine] extend isKnownToBeAPowerOfTwo() to handle vector splat constants
We could try harder to handle non-splat vector constants too, 
but that seems much rarer to me.

Note that the div test isn't resolved because there's a check
for isIntegerTy() guarding that transform.

Differential Revision: http://reviews.llvm.org/D20497

llvm-svn: 270369
2016-05-22 15:41:53 +00:00
Michael Kuperstein c6de57e47a Revert r270268 due to unused variable warnings.
llvm-svn: 270272
2016-05-20 20:55:51 +00:00
Michael Kuperstein f45e5b58b8 [BasicAA] Turn DecomposeGEPExpression runtime checks into asserts.
When it has a DataLayout, DecomposeGEPExpression() should return the same object
as GetUnderlyingObject(). Per the FIXME, it currently always has a DL, so the
runtime check is redundant and can become an assert.

llvm-svn: 270268
2016-05-20 20:26:50 +00:00
Easwaran Raman bb578ef0dd Allow -inline-threshold to override default threshold.
Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior.

Differential Revision: http://reviews.llvm.org/D20452

llvm-svn: 270153
2016-05-19 23:02:09 +00:00
Matthew Simpson 6feebe9847 [LAA] Check independence of strided accesses before forward case
This patch changes the order in which we attempt to prove the independence of
strided accesses. We previously did this after we knew the dependence distance
was positive. With this change, we check for independence before handling the
negative distance case. The patch prevents LAA from reporting forward
dependences for independent strided accesses.

This change was requested in the review of D19984.

llvm-svn: 270072
2016-05-19 15:37:19 +00:00
Sanjoy Das f5d40d5350 [SCEV] Be more aggressive in proving NUW
... for AddRec's in loops for which SCEV is unable to compute a max
tripcount.  This is the NUW variant of r269211 and fixes PR27691.

(Note: PR27691 is not a correct or stability bug, it was created to
track a pending task).

llvm-svn: 269790
2016-05-17 17:51:14 +00:00
Geoff Berry 9b4ff336ce [BasicAA] Update comments based on feedback from hfinkel. NFCI.
Original change Hal's comments were based on:
http://reviews.llvm.org/D19730

llvm-svn: 269678
2016-05-16 18:51:54 +00:00
Matthew Simpson 37ec5f914e [LAA] Rename forwarding conflict detection option (NFC)
This patch renames the option enabling the store-to-load forwarding conflict
detection optimization. This change was requested in the review of D20241.

llvm-svn: 269668
2016-05-16 17:00:56 +00:00
Adam Nemet 884d313b7f [LAA] Comment couldPreventStoreLoadForward. NFC
Also s/Cycles/Iters/ in NumCyclesForStoreLoadThroughMemory to make it
clear that this is not about clock cycles but loop cycles/iterations.

llvm-svn: 269667
2016-05-16 16:57:47 +00:00
Adam Nemet 9b5852aeb2 [LAA] clang-format the function couldPreventStoreLoadForward. NFC
llvm-svn: 269666
2016-05-16 16:57:42 +00:00
Matthew Simpson a250dc9f11 [LAA] Add option to disable conflict detection (NFC)
llvm-svn: 269654
2016-05-16 14:14:49 +00:00
Adam Nemet c62e554e9a [LAA] Include MaxSafeDepDistBytes in the analysis print-out
llvm-svn: 269508
2016-05-13 22:49:13 +00:00
Adam Nemet 4ad38b63d5 [LAA] Prepare the code to print more things in the summary. NFC
llvm-svn: 269507
2016-05-13 22:49:09 +00:00
Michael Zolotukhin 963a6d9c69 Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...""
This reverts commit r269395.

Try to reapply with a fix from chapuni.

llvm-svn: 269486
2016-05-13 21:23:25 +00:00
Silviu Baranga 24dbd2e760 [scan-build] fix warnings emiited on LLVM Analysis code base
Fix "Logic error" warnings of the type "Called C++ object pointer is
null" reported by Clang Static Analyzer on the following files:

lib/Analysis/ScalarEvolution.cpp,
lib/Analysis/LoopInfo.cpp.

Patch by Apelete Seketeli!

llvm-svn: 269424
2016-05-13 14:54:50 +00:00
Michael Zolotukhin 9be3b8b9bb Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."
This reverts commit r269388.

It caused some bots to fail, I'm reverting it until I investigate the
issue.

llvm-svn: 269395
2016-05-13 06:32:25 +00:00
Michael Zolotukhin b7b8052982 [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...
Summary:
...loop after the last iteration.

This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.

This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
avoid this from becoming an excessively slow analysis.

The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up, I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look at.

Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrances around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc, would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.

Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when if we had continued, the cost would have gone back
below the threshold. These kinds of bugs can cause incredibly hard to
track down random changes to behavior.

We could use a techinque similar (but much simpler) within the inliner
as well to avoid considering speculated code in the inline cost.

Reviewers: chandlerc

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11758

llvm-svn: 269388
2016-05-13 01:42:39 +00:00
Michael Zolotukhin a59a308e8d [LoopUnrollAnalyzer] Don't treat gep-instructions with simplified offset as simplified.
Summary:
Currently we consider such instructions as simplified, which is incorrect,
because if their user isn't simplified, we can't actually simplify them too.
This biases our estimates of profitability: for instance the analyzer expects
much more gains from unrolling memcpy loops than there actually are.

Reviewers: hfinkel, chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17365

llvm-svn: 269387
2016-05-13 01:42:34 +00:00
Chandler Carruth 49c22190d0 [PM] Port of the DepndenceAnalysis to the new PM.
Ported DA to the new PM by splitting the former DependenceAnalysis Pass
into a DependenceInfo result type and DependenceAnalysisWrapperPass type
and adding a new PM-style DependenceAnalysis analysis pass returning the
DependenceInfo.

Patch by Philip Pfaffe, most of the review by Justin.

Differential Revision: http://reviews.llvm.org/D18834

llvm-svn: 269370
2016-05-12 22:19:39 +00:00
Adam Nemet 2c34ab51a4 [LAA] Use std::min. NFC
llvm-svn: 269356
2016-05-12 21:41:53 +00:00
Sanjoy Das 4e8c80382f [SCEVExpander] Fix a failed cast<> assertion
SCEVExpander::replaceCongruentIVs assumes the backedge value of an
SCEV-analysable PHI to always be an instruction, when this is not
necessarily true.  For now address this by bailing out of the
optimization if the backedge value of the PHI is a non-Instruction.

llvm-svn: 269213
2016-05-11 17:41:41 +00:00
Sanjoy Das abb7b93eb9 [SCEVExpander] Don't break SSA in replaceCongruentIVs
`SCEVExpander::replaceCongruentIVs` bypasses `hoistIVInc` if both the
original and the isomorphic increments are PHI nodes.  Doing this can
break SSA if the isomorphic increment is not dominated by the original
increment.  Get rid of the bypass, and let `hoistIVInc` do the right
thing.

Fixes PR27232 (compile time crash/hang).

llvm-svn: 269212
2016-05-11 17:41:34 +00:00
Sanjoy Das 787c2460c2 [SCEV] Be more aggressive around proving no-wrap
... for AddRec's in loops for which SCEV is unable to compute a max
tripcount.  This is not a problem for "normal" loops[0] that don't have
guards or assumes, but helps in cases where we have guards or assumes in
the loop that can be used to constrain incoming values over the backedge.

This partially fixes PR27691 (we still don't handle the NUW case).

[0]: for "normal" loops, in the cases where we'd be able to prove
no-wrap via isKnownPredicate, we'd also be able to compute a max
tripcount.

llvm-svn: 269211
2016-05-11 17:41:26 +00:00
Vedant Kumar ee20294af5 [BasicAA] Compare GEP indices based on value (Fix PR27418)
Equivalent GEP indices with different types are treated as different
indices altogether, leading to an incorrect AA result. Fix the issue
by comparing indices based on their values.

Thanks to Mikael Holmén for reporting the issue!

Differential Revision: http://reviews.llvm.org/D19935

llvm-svn: 269197
2016-05-11 15:45:43 +00:00
Artur Pilipenko 7a26326442 NFC. Introduce Value::isPointerDereferenceable
Extract a part of isDereferenceableAndAlignedPointer functionality to Value:
    
Reviewed By: hfinkel, sanjoy
    
Differential Revision: http://reviews.llvm.org/D17611

llvm-svn: 269190
2016-05-11 14:43:28 +00:00
Easwaran Raman 9b792923d0 Revert r269131
llvm-svn: 269138
2016-05-10 23:26:04 +00:00
Easwaran Raman 7eccf4ee0e Reapply r266477 and r266488
llvm-svn: 269131
2016-05-10 22:03:23 +00:00
Sanjay Patel 6786bc5390 [InstSimplify] use computeKnownBits on shift amount operands
Do simplifications common to all shift instructions based on the amount shifted:
1. If the shift amount is known larger than the bitwidth, the result is undefined.
2. If the valid bits of the shift amount are all known to be 0, it's a shift by zero, so the shift operand is the result.

Note that we could generalize the shift-by-zero transform into a shift-by-constant if all of the valid bits in the shift
amount are known, but that would have to be done in InstCombine rather than here because it would mean we need to create
a new shift instruction.

Differential Revision: http://reviews.llvm.org/D19874

llvm-svn: 269114
2016-05-10 20:46:54 +00:00
Peter Collingbourne ccdc225c27 Re-apply r269081 and r269082 with a fix for MSVC.
llvm-svn: 269094
2016-05-10 18:07:21 +00:00
Peter Collingbourne 4d41cb6cc6 Revert r269081 and r269082 while I try to find the right incantation to fix MSVC build.
llvm-svn: 269091
2016-05-10 17:54:43 +00:00
Peter Collingbourne 0df2b085bc WholeProgramDevirt: Move logic for finding devirtualizable call sites to Analysis.
The plan is to eventually make this logic simpler, however I expect it to
be a little tricky for the foreseeable future (at least until we're rid of
pointee types), so move it here so that it can be reused to build a summary
index for devirtualization.

Differential Revision: http://reviews.llvm.org/D20005

llvm-svn: 269081
2016-05-10 17:34:21 +00:00
Silviu Baranga adf4b739ea [LAA] Use re-written SCEV expressions when computing distances
This removes a redundant stride versioning step (we already
do it in getPtrStride, so it has no effect) and uses PSE to
get the SCEV expressions for the source and destination
(this might have changed when getPtrStride was called).

I discovered this through code inspection, and couldn't
produce a regression test for it.

llvm-svn: 269052
2016-05-10 12:28:49 +00:00
James Molloy aa1d638800 Revert "[VectorUtils] Query number of sign bits to allow more truncations"
This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690.

This reverts commit r268921.

llvm-svn: 269051
2016-05-10 12:27:23 +00:00
Denis Zobnin 15d1e64b2b [LAA] Rename "isStridedPtr" with "getPtrStride". NFC.
Changing misleading function name was approved in http://reviews.llvm.org/D17268.
Patch by Roman Shirokiy.

llvm-svn: 269021
2016-05-10 05:55:16 +00:00
Sanjoy Das 12c91dc4c8 [ValueTracking] Use guards to prove non-nullness of a value
Reviewers: apilipenko, majnemer, reames

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20044

llvm-svn: 269008
2016-05-10 02:35:44 +00:00
Sanjoy Das d47f42435a [BasicAA] Guard intrinsics don't write to memory
Summary:
The idea is very close to what we do for assume intrinsics: we mark the
guard intrinsics as writing to arbitrary memory to maintain control
dependence, but under the covers we teach AA that they do not mod any
particular memory location.

Reviewers: chandlerc, hfinkel, gbiv, reames

Subscribers: george.burgess.iv, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19575

llvm-svn: 269007
2016-05-10 02:35:41 +00:00
Sanjoy Das 0b6518d24e [SCEVExpander] Clang format expressions; NFC
The boolean expressions are somewhat hard to read otherwise.

llvm-svn: 268998
2016-05-10 00:32:31 +00:00
Sanjoy Das 2512d0c837 [SCEV] Use guards to prove predicates
We can use calls to @llvm.experimental.guard to prove predicates,
relying on the fact that in all locations domianted by a call to
@llvm.experimental.guard the predicate it is guarding is known to be
true.

llvm-svn: 268997
2016-05-10 00:31:49 +00:00
Adam Nemet 0a77dfad95 [LV] Hint at the new loop distribution pragma in optimization remark
When we encounter unsafe memory dependencies, loop distribution could
help.

Even though, the diagnostics is in LAA, it's only currently emitted in
the vectorizer.

llvm-svn: 268987
2016-05-09 23:03:44 +00:00
Sanjay Patel 0f153424a9 [Inliner] don't assume that a Constant alloca size is a ConstantInt (PR27277)
Differential Revision: http://reviews.llvm.org/D20077

llvm-svn: 268980
2016-05-09 21:51:53 +00:00
Matt Arsenault 1af53a91c0 DivergenceAnalysis: Fix crash with no return blocks
The post dominator tree does not have a root node in this case.

llvm-svn: 268933
2016-05-09 16:57:08 +00:00
Sanjay Patel 0fb9880bf5 fix spelling; NFC
llvm-svn: 268929
2016-05-09 16:07:45 +00:00
James Molloy 5c20e27b7f [VectorUtils] Query number of sign bits to allow more truncations
When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations.

llvm-svn: 268921
2016-05-09 14:32:30 +00:00
David Majnemer eac58d8f68 [X86] Promote several single precision FP libcalls on Windows
A number of libcalls don't exist in any particular lib but are, instead,
defined in math.h as inline functions (even in C mode!).  Don't rely on
their existence when lowering @llvm.{cos,sin,floor,..}.f32, promote them
instead.

N.B. We had logic to handle FREM but were missing out on a number of
others.  This change generalizes the FREM handling.

llvm-svn: 268875
2016-05-08 08:15:50 +00:00
Sanjoy Das 987aaa1374 [ValueTracking] Hoist some computation out of a loop; NFC
There is no need to match the comparison instruction repeatedly.

llvm-svn: 268836
2016-05-07 02:08:24 +00:00
Sanjoy Das 5056e19fce Clean up comment; NFC
llvm-svn: 268835
2016-05-07 02:08:22 +00:00
Sanjoy Das 6082c1a39c Delete trailing whitespace; NFC
llvm-svn: 268834
2016-05-07 02:08:15 +00:00
Mehdi Amini 3b132e34b0 ThinLTO: fix assertion and refactor check for hidden use from inline ASM in a helper function
This test was crashing, and currently it breaks bootstrapping clang with debuginfo

Differential Revision: http://reviews.llvm.org/D20008

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268715
2016-05-06 08:25:33 +00:00
Adam Nemet 724ab22378 [LAA] Fix confusing debug message
This message used to be correct, when all we cared about was whether the
dependence was safe (i.e. NoDep) or unsafe.  With the current more
precise characterization, this is a forward dep.

llvm-svn: 268695
2016-05-05 23:41:28 +00:00
Xinliang David Li 28a932742c [PM] port Branch Frequency Analaysis pass to new PM
llvm-svn: 268687
2016-05-05 21:13:27 +00:00
Chad Rosier 226a734f1a [ValueTracking] Early exit when further analysis won't be fruitful.
This should have NFC in the context of codegen, but may have positive
implications on compile-time.

llvm-svn: 268651
2016-05-05 17:41:19 +00:00
Chad Rosier 25cfb7dbd6 [ValueTracking] Improve isImpliedCondition for matching LHS and Imm RHSs.
llvm-svn: 268636
2016-05-05 15:39:18 +00:00
Xinliang David Li 6e5dd41481 [PM] Port Branch Probability Analysis pass to the new pass manager.
Differential Revision: http://reviews.llvm.org/D19839

llvm-svn: 268601
2016-05-05 02:59:57 +00:00
David Majnemer 3918cdd2a1 [ConstantFolding, ValueTracking] Fold constants involving bitcasts of ConstantVector
We assumed that ConstantVectors would be rather uninteresting from the
perspective of analysis.  However, this is not the case due to a quirk
of how LLVM handles vectors of i1.  Vectors of i1 are not
ConstantDataVectors like vectors of i8, i16, i32 or i64 because i1's
SizeInBits differs from it's StoreSizeInBytes.  This leads to it being
categorized as a ConstantVector instead of a ConstantDataVector.

Instead, treat ConstantVector more uniformly.

This fixes PR27591.

llvm-svn: 268479
2016-05-04 06:13:33 +00:00
Justin Bogner e839c3e6ab PM: Check that loop passes preserve a basic set of analyses
A loop pass that didn't preserve this entire set of passes wouldn't
play well with other loop passes, since these are generally a basic
requirement to do any interesting transformations to a loop.

Adds a helper to get the set of analyses a loop pass should preserve,
and checks that any loop pass we run satisfies the requirement.

llvm-svn: 268444
2016-05-03 21:35:08 +00:00
Sanjoy Das 013a4ac4aa [SCEV] Tweak the output format and content of -analyze
In the "LoopDispositions:" section:

 - Instead of printing out a list, print out a "dictionary" to make it
   obvious by inspection which disposition is for which loop.  This is
   just a cosmetic change.

 - Print dispositions for parent _and_ sibling loops.  I will use this
   to write a test case.

llvm-svn: 268405
2016-05-03 17:49:57 +00:00
Anna Thomas 43d7e1cbff Fold compares irrespective of whether allocation can be elided
Summary
When a non-escaping pointer is compared to a global value, the
comparison can be folded even if the corresponding malloc/allocation
call cannot be elided.
We need to make sure the global value is not null, since comparisons to
null cannot be folded.

In future, we should also handle cases when the the comparison
instruction dominates the pointer escape.

Reviewers: sanjoy
Subscribers s.egerton, llvm-commits

Differential Revision: http://reviews.llvm.org/D19549

llvm-svn: 268390
2016-05-03 14:58:21 +00:00
David Majnemer 3d90bb79c4 [LoopUnroll] Unroll loops which have exit blocks to EH pads
We were overly cautious in our analysis of loops which have invokes
which unwind to EH pads.  The loop unroll transform is safe because it
only clones blocks in the loop body, it does not try to split critical
edges involving EH pads.  Instead, move the necessary safety check to
LoopUnswitch.

N.B. The safety check for loop unswitch is covered by an existing test
which fails without it.

llvm-svn: 268357
2016-05-03 03:57:40 +00:00
John Regehr e1c481dccf [LVI] Add an API to LazyValueInfo so that it can export ConstantRanges
that it computes. Currently this is used for testing and precision
tuning, but it might be used by optimizations later.

Differential Revision: http://reviews.llvm.org/D19179

llvm-svn: 268291
2016-05-02 19:58:00 +00:00
George Burgess IV 6edb891c8e [CFLAA] Fix a use-of-invalid-pointer bug.
As shown in the diff, we used to add to CFLAA's cache by doing
`Cache[Fn] = buildSetsFrom(Fn)`. `buildSetsFrom(Fn)` may cause `Cache`
to reallocate its underlying storage, if this happens and `Cache[Fn]`
was evaluated prior to `buildSetsFrom(Fn)`, then we'll store the result
to a bad address.

Patch by Jia Chen.

llvm-svn: 268269
2016-05-02 18:09:19 +00:00
Simon Pilgrim 33ae13d3c3 Fixed MSVC 'not all control paths return a value' warning
llvm-svn: 268198
2016-05-01 15:52:31 +00:00
Sanjoy Das f2f00fb11a [SCEV] When printing via -analysis, dump loop disposition
There are currently some bugs in tree around SCEV caching an incorrect
loop disposition.  Printing out loop dispositions will let us write
whitebox tests as those are fixed.

The dispositions are printed as a list in "inside out" order,
i.e. innermost loop first.

llvm-svn: 268177
2016-05-01 04:51:05 +00:00
David Majnemer 826e9831a7 [ValueTracking] Make the code in lookThroughCast
No functionality change is intended.

llvm-svn: 268108
2016-04-29 21:22:04 +00:00
Chad Rosier cd62bf5821 [InstCombine] Determine the result of a select based on a dominating condition.
Differential Revision: http://reviews.llvm.org/D19550

llvm-svn: 268104
2016-04-29 21:12:31 +00:00
David Majnemer d2a074b1f4 [ValueTracking] matchSelectPattern needs to be more careful around FP
matchSelectPattern attempts to see through casts which mask min/max
patterns from being more obvious.  Under certain circumstances, it would
misidentify a sequence of instructions as a min/max because it assumed
that folding casts would preserve the result.  This is not the case for
floating point <-> integer casts.

This fixes PR27575.

llvm-svn: 268086
2016-04-29 18:40:34 +00:00
Geoff Berry b92cd5293e [BasicAA] Treat llvm.assume as not accessing memory in getModRefBehavior(Function)
Reviewers: dberlin, chandlerc, hfinkel, reames, sanjoy

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19730

llvm-svn: 268068
2016-04-29 17:18:28 +00:00
Filipe Cabecinhas 0da9937517 Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.

This will also make it easier to turn it on in buildbots.

Reviewers: chandlerc

Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D19723

llvm-svn: 268050
2016-04-29 15:22:48 +00:00
Matt Arsenault 790eb1c490 DivergenceAnalysis: Fix crash with unreachable blocks
Unreachable blocks may not be in the dominator tree,
so don't crash on them.

llvm-svn: 268001
2016-04-29 06:17:47 +00:00
Chad Rosier 567556aa9c [Inliner] Formatting. NFC.
Patch by Aditya Kumar!
Differential Revision: http://reviews.llvm.org/D19047

llvm-svn: 267888
2016-04-28 14:47:23 +00:00
Ahmed Bougacha d765a82b54 [TLI] Unify LibFunc signature checking. NFCI.
I tried to be as close as possible to the strongest check that
existed before; cleaning these up properly is left for future work.

Differential Revision: http://reviews.llvm.org/D19469

llvm-svn: 267758
2016-04-27 19:04:35 +00:00
Ahmed Bougacha 220c4010bf [TLI] Fix indentation. NFC.
llvm-svn: 267757
2016-04-27 19:04:29 +00:00
Matthew Simpson e5dfb08fcb [TTI] Add hook for vector extract with extension
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.

Differential Revision: http://reviews.llvm.org/D18523

llvm-svn: 267725
2016-04-27 15:20:21 +00:00
Teresa Johnson df5ef8711f [ThinLTO] Refine fix to avoid renaming of uses in inline assembly.
Summary:
Refine the workaround from r266877 that attempts to prevent
renaming of locals in inline assembly, so that in addition to looking
for a llvm.used local value, that there is at least one inline assembly
call in the module. Otherwise, debug functions added to the llvm.used
can block importing/exporting unnecessarily.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19573

llvm-svn: 267717
2016-04-27 14:19:38 +00:00
Artur Pilipenko 345f01481b NFC. Introduce Value::getPointerDerferecnceableBytes
Extract a part of isDereferenceableAndAlignedPointer functionality to Value::getPointerDerferecnceableBytes. Currently it's a NFC, but in future I'm going to accumulate all the logic about value dereferenceability in this function similarly to Value::getPointerAlignment function (D16144).

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17572

llvm-svn: 267708
2016-04-27 12:51:01 +00:00
Artur Pilipenko 9bb6beabf4 isSafeToLoadUnconditionally support queries without a context
This is required to use this function from isSafeToSpeculativelyExecute

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16231

llvm-svn: 267692
2016-04-27 11:00:48 +00:00
Philip Reames c67651dd70 [LVI] Delete stale and misleading comment.
llvm-svn: 267661
2016-04-27 03:03:15 +00:00
Philip Reames 2ab964e263 [LVI] Add a comment explaining a subtle piece of code
Or at least, I didn't understand the implications the first several times I read it it.

llvm-svn: 267648
2016-04-27 01:02:25 +00:00
Philip Reames 3f83dbeed9 [LVI] Reduce compile time by lazily scanning blocks if needed
When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null.  That's all well and good, but *then* we'd go recurse through our input blocks.  As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain.  The previous code papered over this by using the isKnownNonNull routine from value tracking.  This made the duplication less painful in the common case.

Instead, we know do the block scan only *after* we've gotten the recursive results back.  This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks.  For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so.

Note that the case where we *can't* prove something non-null is still the really expensive case.  We end up scanning each and every block looking for a dereference and never end up finding one.

llvm-svn: 267642
2016-04-27 00:30:55 +00:00
Philip Reames f105db4fc3 [LVI] Cut short search if we know we can't return a useful result
Previously we were recursing on our operands for unary and binary operators regardless of whether we knew how to reason about the operator in question.  This has the effect of doing a potentially large amount of work, only to throw it away.  By checking whether the operation is one LVI can handle, we can cut short the search and return the (overdefined) answer more quickly.  The quality of the results produced should not change.

llvm-svn: 267626
2016-04-26 23:27:33 +00:00
Philip Reames 053c2a6f25 [LVI] Apply transfer rule for overdefine inputs for binary operators
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules.  Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined.  This greatly impacts the precision of the overall analysis and makes it far more fragile as well.

This patch builds on 267609 which did the same thing for unary casts.

llvm-svn: 267620
2016-04-26 23:10:35 +00:00
Philip Reames e5030e85ea [LVI] A better fix for the assertion error introduced by 267609
Essentially, I was using the wrong size function.  For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert.  I fixed this, and also added a guard that the input is a sized type.  Test case is for the original mistake.  I'm not sure how to actually exercise the sized type check.

llvm-svn: 267618
2016-04-26 22:52:30 +00:00
Philip Reames d5c62a0aad [LVI] Speculative fix for assertion seen in clang bots
I'll clean this up and add a test case shortly.  I want to make sure this does actually fix the bots; if not, I'll revert.

llvm-svn: 267617
2016-04-26 22:31:53 +00:00
Philip Reames 38c87c2e50 [LVI] Infer local facts from unary expressions
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.

This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations.

Differential Revision: http://reviews.llvm.org/D19492

llvm-svn: 267609
2016-04-26 21:48:16 +00:00
Philip Reames 1918384155 [LVI] Make a precondition explicit rather than handling a case which never happens [NFC]
llvm-svn: 267481
2016-04-25 22:21:24 +00:00
Philip Reames 3bb2832900 [LVI] Clarify comments describing the lattice values
There has been much recent confusion about the partition in the lattice between constant and non-constant values.  Hopefully, documenting this will prevent confusion going forward.

llvm-svn: 267440
2016-04-25 18:48:43 +00:00
Philip Reames 6671577eb3 [LVI] Split solveBlockValueConstantRange into two [NFC]
This function handled both unary and binary operators.  Cloning and specializing leads to much easier to follow code with minimal duplicatation.

llvm-svn: 267438
2016-04-25 18:30:31 +00:00
Chad Rosier e2cbd13e56 [ValueTracking] Improve isImpliedCondition when the dominating cond is false.
llvm-svn: 267430
2016-04-25 17:23:36 +00:00
Silviu Baranga 795c629ec9 [SCEV] Improve the run-time checking of the NoWrap predicate
Summary:
This implements a new method of run-time checking the NoWrap
SCEV predicates, which should be easier to optimize and nicer
for targets that don't correctly handle multiplication/addition
of large integer types (like i128).

If the AddRec is {a,+,b} and the backedge taken count is c,
the idea is to check that |b| * c doesn't have unsigned overflow,
and depending on the sign of b, that:

   a + |b| * c >= a (b >= 0) or
   a - |b| * c <= a (b <= 0)

where the comparisons above are signed or unsigned, depending on
the flag that we're checking.

The advantage of doing this is that we avoid extending to a larger
type and we avoid the multiplication of large types (multiplying
i128 can be expensive).

Reviewers: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D19266

llvm-svn: 267389
2016-04-25 09:27:16 +00:00
Nick Lewycky af50837a31 Remove emacs mode markers from .cpp files. NFC
.cpp files are unambiguously C++, you only need the mode markers on .h files.

llvm-svn: 267353
2016-04-24 17:55:41 +00:00
Teresa Johnson 28e457bccd [ThinLTO] Remove GlobalValueInfo class from index
Summary:
Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly
reference summary objects. The info structure was there to support lazy
parsing of the combined index summary objects, which is no longer
needed and not supported.

Reviewers: joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19462

llvm-svn: 267344
2016-04-24 14:57:11 +00:00
Mehdi Amini c3ed48c1bd Reorganize GlobalValueSummary with a "Flags" bitfield.
Right now it only contains the LinkageType, but will be extended
with "hasSection", "isOptSize", "hasInlineAssembly", etc.

Differential Revision: http://reviews.llvm.org/D19404

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267319
2016-04-24 03:18:18 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Peter Collingbourne 7dd8dbf486 Introduce llvm.load.relative intrinsic.
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

Differential Revision: http://reviews.llvm.org/D18367

llvm-svn: 267223
2016-04-22 21:18:02 +00:00
Peter Collingbourne 265ebd7d70 CodeGen: Use PLT relocations for relative references to unnamed_addr functions.
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211
2016-04-22 20:40:10 +00:00
Sanjoy Das a6155b659a Have isKnownNotFullPoison be smarter around control flow
Summary:
(... while still not using a PostDomTree)

The way we use isKnownNotFullPoison from SCEV today, the new CFG walking
logic will not trigger for any realistic cases -- it will kick in only
for situations where we could have merged the contiguous basic blocks
anyway[0], since the poison generating instruction dominates all of its
non-PHI uses (which are the only uses we consider right now).

However, having this change in place will allow a later bugfix to break
fewer llvm-lit tests.

[0]: i.e. cases where block A branches to block B and B is A's only
successor and A is B's only predecessor.

Reviewers: broune, bjarke.roune

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19212

llvm-svn: 267175
2016-04-22 17:41:06 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Sanjoy Das efdeb45ffd [SCEV] Extract out a `isSCEVExprNeverPoison` helper; NFCI
Summary:
Also adds a small comment blurb on control flow + no-wrap flags, since
that question came up a few days back on llvm-dev.

Reviewers: bjarke.roune, broune

Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D19209

llvm-svn: 267110
2016-04-22 05:38:54 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Philip Reames 92c43699bc [unordered] Add tests and conservative handling in support of future changes [NFCI]
This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing.  At the moment, the code added is dead, but separating it makes follow on changes far more obvious.

llvm-svn: 266999
2016-04-21 16:51:08 +00:00
Chad Rosier 99bc480bc3 Address Philip's post-commit feedback for r266987. NFC.
llvm-svn: 266998
2016-04-21 16:18:02 +00:00
Chad Rosier af83e40dee Refactor implied condition logic from ValueTracking directly into CmpInst. NFC.
Differential Revision: http://reviews.llvm.org/D19330

llvm-svn: 266987
2016-04-21 14:04:54 +00:00
Nick Lewycky 762f8a8549 Add optimization for 'icmp slt (or A, B), A' and some related idioms based on knowledge of the sign bit for A and B.
No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table:

```
A is
+, -, +/-
F  F   F   +    B is
T  F   ?   -
?  F   ?   +/-
```

The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate.

There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers.

llvm-svn: 266939
2016-04-21 00:53:14 +00:00
Chad Rosier 41dd31f0b0 [ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC.
Phabricator Revision: http://reviews.llvm.org/D19277

llvm-svn: 266904
2016-04-20 19:15:26 +00:00
Teresa Johnson b35cc691ea [ThinLTO] Prevent importing of "llvm.used" values
Summary:
This patch prevents importing from (and therefore exporting from) any
module with a "llvm.used" local value. Local values need to be promoted
and renamed when importing, and their presense on the llvm.used variable
indicates that there are opaque uses that won't see the rename. One such
example is a use in inline assembly.

See also the discussion at:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html

As part of this, move collectUsedGlobalVariables out of Transforms/Utils
and into IR/Module so that it can be used more widely. There are several
other places in LLVM that used copies of this code that can be cleaned
up as a follow on NFC patch.

Reviewers: joker.eph

Subscribers: pcc, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18986

llvm-svn: 266877
2016-04-20 14:39:45 +00:00
David Majnemer b4b27230bf [ValueTracking, VectorUtils] Refactor getIntrinsicIDForCall
The functionality contained within getIntrinsicIDForCall is two-fold: it
checks if a CallInst's callee is a vectorizable intrinsic.  If it isn't
an intrinsic, it attempts to map the call's target to a suitable
intrinsic.

Move the mapping functionality into getIntrinsicForCallSite and rename
getIntrinsicIDForCall to getVectorIntrinsicIDForCall while
reimplementing it in terms of getIntrinsicForCallSite.

llvm-svn: 266801
2016-04-19 19:10:21 +00:00
Chad Rosier b7dfbb40a3 [ValueTracking] Improve isImpliedCondition for conditions with matching operands.
This patch improves SimplifyCFG to catch cases like:

  if (a < b) {
    if (a > b) <- known to be false
      unreachable;
  }

Phabricator Revision: http://reviews.llvm.org/D18905

llvm-svn: 266767
2016-04-19 17:19:14 +00:00
Brendon Cahoon be2da82cd8 [DependenceAnalysis] Refactor uses of getConstantPart. NFC.
Rather than checking for the SCEV type prior to calling
getContantPart, perform the checks in the function. This reduces
the number of places where the checks are needed.

Differential Revision: http://reviews.llvm.org/D19241

llvm-svn: 266759
2016-04-19 16:46:57 +00:00
Daniel Berlin 77fa84eadd Correct IDF calculator for ReverseIDF
Summary:
Need to use predecessors for reverse graph, successors for forward graph.
succ_iterator/pred_iterator are not compatible, this patch is all the work necessary to work around that (which is what everywhere else does).  Not sure if there is a better way, so cc'ing some random folks to take a gander :)

Reviewers: dblaikie, qcolombet, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18796

llvm-svn: 266718
2016-04-19 06:13:28 +00:00
Michael Kuperstein de16b44f74 Port DemandedBits to the new pass manager.
Differential Revision: http://reviews.llvm.org/D18679

llvm-svn: 266699
2016-04-18 23:55:01 +00:00
Sanjoy Das 432c1c3fb3 [BPI] Consider deoptimize calls as "unreachable"
Summary:
Calls to @llvm.experimental.deoptimize are expected to "never execute",
so optimize them as such.

Reviewers: chandlerc

Subscribers: junbuml, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19095

llvm-svn: 266654
2016-04-18 19:01:28 +00:00
Easwaran Raman a163d27611 Revert r266488.
This goes with r266477 which has been  reverted.

llvm-svn: 266631
2016-04-18 17:10:17 +00:00
Eric Liu d09f15ea6f Revert "Replace the use of MaxFunctionCount module flag"
This reverts commit r266477.

This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData",
while "ProfileData" depends on "Object", which depends on "BitCode", which
depends on "Analysis".

llvm-svn: 266619
2016-04-18 15:31:11 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Easwaran Raman 6d09f993c2 Add ProfileData to required_libraries
This should fix ppc64be build breakage due to r266477

llvm-svn: 266488
2016-04-15 23:08:52 +00:00
Easwaran Raman f53baca686 Replace the use of MaxFunctionCount module flag
Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count.

Differential Revision: http://reviews.llvm.org/D18622

llvm-svn: 266477
2016-04-15 21:39:58 +00:00
Justin Lebar 8650a4da93 [TTI] Add getInliningThresholdMultiplier.
Summary:
InlineCost's threshold is multiplied by this value.  This lets us adjust
the inlining threshold up or down on a per-target basis.  For example,
we might want to increase the threshold on targets where calls are
unusually expensive.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18560

llvm-svn: 266405
2016-04-15 01:38:48 +00:00
Michael Kuperstein 16f13e252b [AliasSetTracker] Correctly handle changing the size of an entry
If the size of an AST entry changes, we also need to make sure we perform
necessary alias set merges, as the new size may overlap pointers in other sets.
We happen to run into this with memset, because memset allows an entry for a
i8* pointer to have a decidedly non-i8 size.

This fixes PR27262.

Differential Revision: http://reviews.llvm.org/D18939

llvm-svn: 266381
2016-04-14 22:00:11 +00:00
Renato Golin 5cb666add7 [ARM] Adding IEEE-754 SIMD detection to loop vectorizer
Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON.

This patch teaches the loop vectorizer to only allow transformations of loops
that either contain no floating-point operations or have enough allowance
flags supporting lack of precision (ex. -ffast-math, Darwin).

For that, the target description now has a method which tells us if the
vectorizer is allowed to handle FP math without falling into unsafe
representations, plus a check on every FP instruction in the candidate loop
to check for the safety flags.

This commit makes LLVM behave like GCC with respect to ARM NEON support, but
it stops short of fixing the underlying problem: sub-normals. Neither GCC
nor LLVM have a flag for allowing sub-normal operations. Before this patch,
GCC only allows it using unsafe-math flags and LLVM allows it by default with
no way to turn it off (short of not using NEON at all).

As a first step, we push this change to make it safe and in sync with GCC.
The second step is to discuss a new sub-normal's flag on both communitues
and come up with a common solution. The third step is to improve the FastMath
flags in LLVM to encode sub-normals and use those flags to restrict NEON FP.

Fixes PR16275.

llvm-svn: 266363
2016-04-14 20:42:18 +00:00
Nicolai Haehnle 13d90f324c [DivergenceAnalysis] Treat PHI with incoming undef as constant
Summary:
If a PHI has an incoming undef, we can pretend that it is equal to one
non-undef, non-self incoming value.

This is particularly relevant in combination with the StructurizeCFG
pass, which introduces PHI nodes with undefs. Previously, this lead to
branch conditions that were uniform before StructurizeCFG to become
non-uniform afterwards, which confused the SIAnnotateControlFlow
pass.

This fixes a crash when Mesa radeonsi compiles a shader from
dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex

Reviewers: arsenm, tstellarAMD, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19013

llvm-svn: 266347
2016-04-14 17:42:47 +00:00
Silviu Baranga b77365b595 [SCEV][LAA] Add tests for SCEV expression transformations performed during LAA
Summary:
Add a print method to Predicated Scalar Evolution which prints all interesting
transformations done by PSE.

Loop Access Analysis will now print this as part of the analysis output.
We now use this to check the exact expression transformations that were done
by PSE in LAA.

The additional checking also acts as white-box testing for the getAsAddRec method.

Reviewers: anemet, sanjoy

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18792

llvm-svn: 266334
2016-04-14 16:08:45 +00:00
David Majnemer 0f26b0aeb4 [CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN
The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only
one of the inputs is NaN: -NUM will return the non-NaN argument while
-NAN would return NaN.

It is desirable to lower to @llvm.{min,max}num to -NAN if they don't
have a native instruction for -NUM.  Notably, ARMv7 NEON's vmin has the
-NAN semantics.

N.B.  Of course, it is only safe to do this if the intrinsic call is
marked nnan.

llvm-svn: 266279
2016-04-14 07:13:24 +00:00
George Burgess IV cae581d13f [CFLAA] Fix up code style a bit. NFC.
llvm-svn: 266262
2016-04-13 23:27:37 +00:00
Easwaran Raman d295b00ae9 Return immediately from analyzeCall if analyzeBlock returns false.
This is part of the patch reviewed at http://reviews.llvm.org/D17584

llvm-svn: 266249
2016-04-13 21:20:22 +00:00
David L Kreitzer 752c1448fe Simplify strlen to a subtraction for certain cases.
Patch by Li Huang (li1.huang@intel.com)

Differential Revision: http://reviews.llvm.org/D18230

llvm-svn: 266200
2016-04-13 14:31:06 +00:00
Petar Jovanovic 644b8c1a5d Calculate __builtin_object_size when pointer depends on a condition
This patch fixes calculating of builtin_object_size if it depends on a
condition. Before this patch compiler did not know how to calculate the
object size when it finds a condition that cannot be eliminated.
This patch enables calculating of builtin_object_size even in case when
condition cannot be eliminated by choosing minimum or maximum value as a
result from condition. Choosing minimum or maximum value from condition
is based on the second argument of __builtin_object_size function.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D18438

llvm-svn: 266193
2016-04-13 12:25:25 +00:00
David Majnemer 3ee5f34469 [InstCombine] We folded an fcmp to an i1 instead of a vector of i1
Remove an ad-hoc transform in InstCombine and replace it with more
general machinery (ValueTracking, InstructionSimplify and VectorUtils).

This fixes PR27332.

llvm-svn: 266175
2016-04-13 06:55:52 +00:00
Jeroen Ketema e48e393729 Add space between words in verify-scev-maps option help message
llvm-svn: 266149
2016-04-12 23:21:46 +00:00
George Burgess IV 278199f615 Add the allocsize attribute to LLVM.
`allocsize` is a function attribute that allows users to request that
LLVM treat arbitrary functions as allocation functions.

This patch makes LLVM accept the `allocsize` attribute, and makes
`@llvm.objectsize` recognize said attribute.

The review for this was split into two patches for ease of reviewing:
D18974 and D14933. As promised on the revisions, I'm landing both
patches as a single commit.

Differential Revision: http://reviews.llvm.org/D14933

llvm-svn: 266032
2016-04-12 01:05:35 +00:00
Sanjoy Das f9d88e650b This reverts commit r265913 and r265912
See PR27315

r265913: "[IndVars] Eliminate op.with.overflow when possible"

r265912: "[SCEV] See through op.with.overflow intrinsics"
llvm-svn: 265950
2016-04-11 15:26:18 +00:00
Teresa Johnson 2d5487cf44 [ThinLTO] Move summary computation from BitcodeWriter to new pass
Summary:
This is the first step in also serializing the index out to LLVM
assembly.

The per-module summary written to bitcode is moved out of the bitcode
writer and to a new analysis pass (ModuleSummaryIndexWrapperPass).
The pass itself uses a new builder class to compute index, and the
builder class is used directly in places where we don't have a pass
manager (e.g. llvm-as).

Because we are computing summaries outside of the bitcode writer, we no
longer can use value ids created by the bitcode writer's
ValueEnumerator. This required changing the reference graph edge type
to use a new ValueInfo class holding a union between a GUID (combined
index) and Value* (permodule index). The Value* are converted to the
appropriate value ID during bitcode writing.

Also, this enables removal of the BitWriter library's dependence on the
Analysis library that was previously required for the summary computation.

Reviewers: joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18763

llvm-svn: 265941
2016-04-11 13:58:45 +00:00
Sanjoy Das 3c529a40ca [SCEV] See through op.with.overflow intrinsics
Summary:
This change teaches SCEV to see reduce `(extractvalue
0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if
possible).

Reviewers: atrick, regehr

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18684

llvm-svn: 265912
2016-04-10 22:50:26 +00:00
Easwaran Raman 9a3fc17ad4 Refactor Threshold computation. NFC.
This is part of changes reviewed in http://reviews.llvm.org/D17584.

llvm-svn: 265852
2016-04-08 21:28:02 +00:00
Sanjoy Das 87b9e1b727 Propagate Undef in llvm.cos Intrinsic
Summary:
The llvm cos intrinsic currently does not propagate undef's. This change
transforms cos(undef) to null value or 0.

There are 2 test cases added as well.

Patch by Anna Thomas!

Reviewers: sanjoy

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D18863

llvm-svn: 265825
2016-04-08 18:21:11 +00:00
Silviu Baranga 6f444dfd55 Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV
This re-commits r265535 which was reverted in r265541 because it
broke the windows bots. The problem was that we had a PointerIntPair
which took a pointer to a struct allocated with new. The problem
was that new doesn't provide sufficient alignment guarantees.
This pattern was already present before r265535 and it just happened
to work. To fix this, we now separate the PointerToIntPair from the
ExitNotTakenInfo struct into a pointer and a bool.

Original commit message:

Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.

However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.

In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.

We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.

Reviewers: anemet, mzolotukhin, hfinkel, sanjoy

Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17201

llvm-svn: 265786
2016-04-08 14:29:09 +00:00
Sanjoy Das 5ce3272833 Don't IPO over functions that can be de-refined
Summary:
Fixes PR26774.

If you're aware of the issue, feel free to skip the "Motivation"
section and jump directly to "This patch".

Motivation:

I define "refinement" as discarding behaviors from a program that the
optimizer has license to discard.  So transforming:

```
void f(unsigned x) {
  unsigned t = 5 / x;
  (void)t;
}
```

to

```
void f(unsigned x) { }
```

is refinement, since the behavior went from "if x == 0 then undefined
else nothing" to "nothing" (the optimizer has license to discard
undefined behavior).

Refinement is a fundamental aspect of many mid-level optimizations done
by LLVM.  For instance, transforming `x == (x + 1)` to `false` also
involves refinement since the expression's value went from "if x is
`undef` then { `true` or `false` } else { `false` }" to "`false`" (by
definition, the optimizer has license to fold `undef` to any non-`undef`
value).

Unfortunately, refinement implies that the optimizer cannot assume
that the implementation of a function it can see has all of the
behavior an unoptimized or a differently optimized version of the same
function can have.  This is a problem for functions with comdat
linkage, where a function can be replaced by an unoptimized or a
differently optimized version of the same source level function.

For instance, FunctionAttrs cannot assume a comdat function is
actually `readnone` even if it does not have any loads or stores in
it; since there may have been loads and stores in the "original
function" that were refined out in the currently visible variant, and
at the link step the linker may in fact choose an implementation with
a load or a store.  As an example, consider a function that does two
atomic loads from the same memory location, and writes to memory only
if the two values are not equal.  The optimizer is allowed to refine
this function by first CSE'ing the two loads, and the folding the
comparision to always report that the two values are equal.  Such a
refined variant will look like it is `readonly`.  However, the
unoptimized version of the function can still write to memory (since
the two loads //can// result in different values), and selecting the
unoptimized version at link time will retroactively invalidate
transforms we may have done under the assumption that the function
does not write to memory.

Note: this is not just a problem with atomics or with linking
differently optimized object files.  See PR26774 for more realistic
examples that involved neither.

This patch:

This change introduces a new set of linkage types, predicated as
`GlobalValue::mayBeDerefined` that returns true if the linkage type
allows a function to be replaced by a differently optimized variant at
link time.  It then changes a set of IPO passes to bail out if they see
such a function.

Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18634

llvm-svn: 265762
2016-04-08 00:48:30 +00:00
Mehdi Amini a797877a7e Const correctness for BranchProbabilityInfo (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265731
2016-04-07 21:59:28 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Silviu Baranga a393baf1fd Revert r265535 until we know how we can fix the bots
llvm-svn: 265541
2016-04-06 14:06:32 +00:00
Silviu Baranga 72b4a4a330 [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV
Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.

However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.

In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.

We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.

Reviewers: anemet, mzolotukhin, hfinkel, sanjoy

Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17201

llvm-svn: 265535
2016-04-06 13:18:26 +00:00
David Majnemer 12fd50410d [SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan
To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has
undefined behavior for negative numbers other than -0.0 (which allows
for better optimization, because there is no need to worry about errno
being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt."

This means that it's unsafe to replace sqrt with llvm.sqrt unless the
call is annotated with nnan.

Thanks to Hal Finkel for pointing this out!

llvm-svn: 265521
2016-04-06 07:04:53 +00:00
David Majnemer 25d03dbcde [SLPVectorizer] Vectorize libcalls of sqrt
We didn't realize that we could transform the libcall into a vectorized
intrinsic.

llvm-svn: 265493
2016-04-06 00:14:59 +00:00
George Burgess IV 7e5404cc20 [CFLAA] Fix PR27213; incorrect tagging of args/globals
Prior to this patch, CFLAA wouldn't tag arguments/globals properly if
it didn't find any "interesting" edges on them. This means that, if all
you do is store constants to a global or argument, we would never
actually treat it as a global/argument.

Test case:

define void @foo(i32* %A, i32* %B) #0 {
entry:
  store i32 0, i32* %A, align 4
  store i32 0, i32* %B, align 4
  ret void
}

CFLAA would say that %A can't alias %B, because neither pointer was
used in an interesting way. This patch makes us note whether something
is an argument, global, ... regardless of how interesting CFLAA thinks
its uses are.

(For the record, using a value in an interesting way means loading
from it, using it in a GEP, ...)

llvm-svn: 265474
2016-04-05 21:40:45 +00:00
Junmo Park 53470fc451 Minor code cleanups. NFC.
llvm-svn: 265468
2016-04-05 21:14:31 +00:00
Duncan P. N. Exon Smith 1de3c7e790 IR: Introduce ConstantAggregate, NFC
Add a common parent class for ConstantArray, ConstantVector, and
ConstantStruct called ConstantAggregate.  These are the aggregate
subclasses of Constant that take operands.

This is mainly a cleanup, adding common `isa` target and removing
duplicated code.  However, it also simplifies caching which constants
point transitively at `GlobalValue` (a possible future direction).

llvm-svn: 265466
2016-04-05 21:10:45 +00:00
Brendon Cahoon 86f783e315 [DependenceAnalysis] Check if result of getConstantPart is null
A seg-fault occurs due to a reference of a null pointer, which is
the value returned by getConstantPart. This function returns
null if the constant part is not found. The code that calls this
function needs to check for the null return value.

Differential Revision: http://reviews.llvm.org/D18718

llvm-svn: 265319
2016-04-04 18:13:18 +00:00
Peter Zotov 0218d0f383 Mark some FP intrinsics as safe to speculatively execute
Floating point intrinsics in LLVM are generally not speculatively
executed, since most of them are defined to behave the same as libm
functions, which set errno.

However, the only error that can happen  when executing ceil, floor,
nearbyint, rint and round libm functions per POSIX.1-2001 is -ERANGE,
and that requires the maximum value of the exponent to be smaller
than  the number of mantissa bits, which is not the case with any of
the floating point types supported by LLVM.

The trunc and copysign functions never set errno per per POSIX.1-2001.

Differential Revision: http://reviews.llvm.org/D18643

llvm-svn: 265262
2016-04-03 12:30:46 +00:00
Mehdi Amini 89038a1071 Fix "warning: variabl 'XX’ set but not used" in release build (variable used in assertion, NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265220
2016-04-02 05:34:19 +00:00
David Majnemer ae272d718e [NVPTX] Infer __nvvm_reflect as nounwind, readnone
This patch simply mirrors the attributes we give to @llvm.nvvm.reflect
to the __nvvm_reflect libdevice call.  This shaves about 30% of the code
in libdevice away because of CSE opportunities.  It's also helps us
figure out that libdevice implementations of transcendental functions
don't have side-effects.

llvm-svn: 265060
2016-03-31 21:29:57 +00:00
Sanjoy Das 56df0ec610 [InstCombine] Fix incorrect rule from rL236202
The rule for SMIN introduced in rL236202 doesn't work as advertised: the
check for Pred == ICmpInst::ICMP_SGT was missing.

llvm-svn: 264996
2016-03-31 05:14:34 +00:00
Sanjoy Das c9d6d8b106 Delete trailing whitespace
llvm-svn: 264995
2016-03-31 05:14:29 +00:00
Sanjoy Das e12c0e5159 [SCEV] Track NoWrap properties using MatchBinaryOp, NFC
This way once we teach MatchBinaryOp to map more things into arithmetic,
the non-wrapping add recurrence construction would understand it too.
Right now MatchBinaryOp still only understands arithmetic, so this is
solely a code-reorganization change.

llvm-svn: 264994
2016-03-31 05:14:26 +00:00
Sanjoy Das 118d919a6a [SCEV] NFC code motion to simplify later change
llvm-svn: 264993
2016-03-31 05:14:22 +00:00
James Molloy 8e46cd05a1 [VectorUtils] Don't try and truncate PHIs to a smaller bitwidth
We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means.

However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018.

This fixes PR17018.

llvm-svn: 264852
2016-03-30 10:11:43 +00:00
Sanjoy Das 2381fcd557 [SCEV] Extract out a MatchBinaryOp; NFCI
MatchBinaryOp abstracts out the IR instructions from the operations they
represent.  While this change is NFC, we will use this factoring later
to map things like `(extractvalue 0 (sadd.with.overflow X Y))` to `(add
X Y)`.

llvm-svn: 264747
2016-03-29 16:40:44 +00:00
Sanjoy Das 260ad4dd63 [SCEV] Use Operator::getOpcode instead of manual dispatch; NFC
llvm-svn: 264746
2016-03-29 16:40:39 +00:00
Eugene Zelenko 35623fb7d5 Fix Clang-tidy modernize-deprecated-headers warnings in some files; other minor fixes.
Differential revision: http://reviews.llvm.org/D18469

llvm-svn: 264598
2016-03-28 17:40:08 +00:00
Philip Reames b5681138e4 Allow value forwarding past release fences in GVN
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence.  

We choose to be much more conservative about stores.  In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceeding (previously fenced) store.  Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time.

The LangRef indicates only atomic loads and stores are effected by fences. This patch chooses to be far more conservative then that. 

This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now.

Differential Revision: http://reviews.llvm.org/D11436

llvm-svn: 264472
2016-03-25 22:40:35 +00:00
Duncan P. N. Exon Smith 1d15a9f0c9 IR: Reserve an MDKind for !llvm.loop; NFC
This reserves an MDKind for !llvm.loop, which allows callers to avoid a
string-based lookup.  I'm not sure why it was missing.

There should be no functionality change here, just a small compile-time
speedup.

llvm-svn: 264371
2016-03-25 00:35:38 +00:00
Adam Nemet 59a6550425 [LAA] Formatting fix in previous change
llvm-svn: 264244
2016-03-24 05:15:24 +00:00
Adam Nemet 279784ffc4 [LAA] Support memchecks involving loop-invariant addresses
We used to only allow SCEVAddRecExpr for pointer expressions in order to
be able to compute the bounds.  However this is also trivially possible
for loop-invariant addresses (scUnknown) since then the bounds are the
address itself.

Interestingly, we used allow this for the special case when the
loop-invariant address happens to also be an SCEVAddRecExpr (in an outer
loop).

There are a couple more loops that are vectorized in SPEC after this.
My guess is that the main reason we don't see more because for example a
loop-invariant load is vectorized into a splat vector with several
vector-inserts.  This is likely to make the vectorization unprofitable.
I.e. we don't notice that a later LICM will move all of this out of the
loop so the cost estimate should really be 0.

llvm-svn: 264243
2016-03-24 04:28:47 +00:00
Easwaran Raman 12b79aa0f1 Add getBlockProfileCount method to BlockFrequencyInfo
Differential Revision: http://reviews.llvm.org/D18233

llvm-svn: 264179
2016-03-23 18:18:26 +00:00
Silviu Baranga d68ed85401 [SCEV] Change the SCEV Predicates interfaces for conversion to AddRecExpr to return SCEVAddRecExpr* instead of SCEV*
Summary:
This changes the conversion functions from SCEV * to SCEVAddRecExpr from
ScalarEvolution and PredicatedScalarEvolution to return a SCEVAddRecExpr*
instead of a SCEV* (which removes the need of most clients to do a
dyn_cast right after calling these functions).

We also don't add new predicates if the transformation was not successful.

This is not entirely a NFC (as it can theoretically remove some predicates
from LAA when we have an unknown dependece), but I couldn't find an obvious
regression test for it.

Reviewers: sanjoy

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18368

llvm-svn: 264161
2016-03-23 15:29:30 +00:00
Mehdi Amini c04fc7a60f Rename DenseMap::resize() into DenseMap::reserve() (NFC)
This is more coherent with usual containers.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264026
2016-03-22 07:20:00 +00:00
Matt Arsenault 155dda9134 Implement constant folding for bitreverse
llvm-svn: 263945
2016-03-21 15:00:35 +00:00
Silviu Baranga f875e4fd92 [IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA
Summary:
replaceCongruentIVs can break LCSSA when trying to replace IV increments
since it tries to replace all uses of a phi node with another phi node
while both of the phi nodes are not necessarily in the processed loop.
This will cause an assert in IndVars.

To fix this, we add a check to make sure that the replacement maintains
LCSSA.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18266

llvm-svn: 263941
2016-03-21 12:44:29 +00:00
Adam Nemet 709e3046ee [LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead
Summary:
It can hurt performance to prefetch ahead too much.  Be conservative for
now and don't prefetch ahead more than 3 iterations on Cyclone.

Reviewers: hfinkel

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17949

llvm-svn: 263772
2016-03-18 00:27:43 +00:00
Adam Nemet 6d8beeca53 [LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses
Summary:
And use this TTI for Cyclone.  As it was explained in the original RFC
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW
prefetcher work up to 2KB strides.

I am also adding tests for this and the previous change (D17943):

* Cyclone prefetching accesses with a large stride
* Cyclone not prefetching accesses with a small stride
* Generic Aarch64 subtarget not prefetching either

Reviewers: hfinkel

Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17945

llvm-svn: 263771
2016-03-18 00:27:38 +00:00
Bjorn Steinbrink 59fdec673d Add Rust's personality function to the list of known personality functions
Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18192

llvm-svn: 263581
2016-03-15 20:35:45 +00:00
Manuel Jacob 6be355961e Re-add ConstantFoldInstOperands form taking opcode and return type.
Summary:
This form was replaced by a form taking an instruction instead of opcode and
return type in r258391.  After committing this change (and some depending,
follow-up changes) it turned out in the review thread to be controversial.  The
discussion didn't come to a conclusion yet.  I'm re-adding the old form to fix
the API regression and to provide a better base for discussion, possibly on
llvm-dev.

A difference to the original function is that it can't be called with GEPs
(similarly to how it was already the case for compares).  In order to support
opaque pointers in the future, folding GEPs needs to be passed the source
element type, which is not possible with the current API.

Reviewers: dberlin, reames

Subscribers: dblaikie, eddyb

Differential Revision: http://reviews.llvm.org/D17901

llvm-svn: 263501
2016-03-14 22:34:17 +00:00
Michael Kuperstein b7860fedd4 [AliasSetTracker] Do not strip pointer casts when processing MemSetInst
This fixes PR26843.

llvm-svn: 263462
2016-03-14 18:34:29 +00:00
Fiona Glaser 2e5c0c2858 ConstantFoldInstruction: avoid wasted calls to ConstantFoldConstantExpression
Check to see if all operands are constant before calling simplify on them
so that we don't perform wasted simplifications.

llvm-svn: 263374
2016-03-13 05:36:15 +00:00
Chandler Carruth 5bfbc3f941 [AA] Make BasicAA just require domtree.
This doesn't change how many times we construct domtrees in the normal
pipeline, and it removes fragility and instability where basic-aa may
not be run in time to see domtrees because they happen to be constructed
afterward.

This isn't quite as clean as the change to memdep because there is
a mode where basic-aa specifically runs without domtrees -- in the
hacking version used by function-attrs with the legacy pass manager.

llvm-svn: 263234
2016-03-11 13:53:18 +00:00
Chandler Carruth aef32bd319 [memdep] Just require domtree for memdep.
This doesn't cause us to construct dominator trees any more often in the
normal pipeline, and removes an entire mode of memdep that needed to be
reasoned about and maintained. Perhaps more importantly, it removes the
ability for the results of memdep to be different because of accidental
pass scheduling goofs or the order of evaluation of 'getResult' calls.

Essentially, 'getCachedResult', unless across IR-unit boundaries, is
extremely dangerous. We need to work much harder to avoid it (or its
analog in the old pass manager).

llvm-svn: 263232
2016-03-11 13:46:00 +00:00
Chandler Carruth b47f8010a9 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

llvm-svn: 263219
2016-03-11 11:05:24 +00:00
Chandler Carruth b4faf13c15 [PM] Implement the final conclusion as to how the analysis IDs should
work in the face of the limitations of DLLs and templated static
variables.

This requires passes that use the AnalysisBase mixin provide a static
variable themselves. So as to keep their APIs clean, I've made these
private and befriended the CRTP base class (which is the common
practice).

I've added documentation to AnalysisBase for why this is necessary and
at what point we can go back to the much simpler system.

This is clearly a better pattern than the extern template as it caught
*numerous* places where the template magic hadn't been applied and
things were "just working" but would eventually have broken
mysteriously.

llvm-svn: 263216
2016-03-11 10:22:49 +00:00
Chandler Carruth 45a9c203a0 [PM/AA] Teach the AAManager how to handle module analyses in addition to
function analyses, and use it to wire up globals-aa to the new pass
manager.

llvm-svn: 263211
2016-03-11 09:15:11 +00:00
Chandler Carruth cf3f4f25ca [CG] Back out my pointless move ctor and add the explicit template
instantiation needed for the mingw dll build bot.

llvm-svn: 263114
2016-03-10 14:33:10 +00:00
Chandler Carruth 4c660f7087 [CG] Add a new pass manager printer pass for the old call graph and
actually finish wiring up the old call graph.

There were bugs in the old call graph that hadn't been caught because it
wasn't being tested. It wasn't being tested because it wasn't in the
pipeline system and we didn't have a printing pass to run in tests. This
fixes all of that.

As for why I'm still keeping the old call graph alive its so that I can
port GlobalsAA to the new pass manager with out forking it to work with
the lazy call graph. That's clearly the right eventual design, but it
seems pragmatic to defer that until its necessary. The old call graph
works just fine for GlobalsAA.

llvm-svn: 263104
2016-03-10 11:24:11 +00:00
Chandler Carruth 1ecd740cf0 [CG] Actually hoist up the generic CallGraphPrinter pass from a weird
location in the opt tool to live along side the analysis in LLVM's
libraries.

No functionality changed here, but this will allow me to port the
printer to the new pass manager as well.

llvm-svn: 263101
2016-03-10 11:08:44 +00:00
Chandler Carruth 5f432292a6 [CG] Rename the DOT printing pass to actually reference "DOT".
There is another pass by the generic name 'CallGraphPrinter' which is
actually just a call graph printer tucked away inside the opt tool. I'd
like to bring it out and make it follow the same patterns as the rest of
the CallGraph code, but doing so would end up conflicting with the name
of the DOT printing pass. So this makes the DOT printing pass name be
more precise.

No functionality changed here.

llvm-svn: 263100
2016-03-10 11:04:40 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Philip Reames d9f4a3d18c [BasicAA/MDA] Sink aliasing rules for malloc and calloc into BasicAA
MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA.

In principal, this can expose slightly more optimization opportunities, but since essentially all of our aliasing aware memory optimization passes go through MDA, this will likely be NFC in practice.

Differential Revision: http://reviews.llvm.org/D15912

llvm-svn: 263075
2016-03-09 23:19:56 +00:00
Philip Reames 8f12eba78d [ValueTracking] Extract isKnownPositive [NFCI]
Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.

llvm-svn: 263062
2016-03-09 21:31:47 +00:00
Sanjoy Das 97d19bd95f [SCEV] Slightly generalize getRangeViaFactoring
Building on the previous change, this generalizes
ScalarEvolution::getRangeViaFactoring to work with
{Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign
extend or truncate operation, and k0 and k1 are constants.

llvm-svn: 262979
2016-03-09 01:51:02 +00:00
Sanjoy Das d3488c6060 [SCEV] Slightly generalize getRangeViaFactoring
This change generalizes ScalarEvolution::getRangeViaFactoring to work
with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign
extend or truncate operation.

llvm-svn: 262978
2016-03-09 01:50:57 +00:00
Sanjay Patel b8d071bc8a use range-based for loop; NFCI
llvm-svn: 262956
2016-03-08 20:53:48 +00:00
Easwaran Raman b1bd398ceb Revert revisions 262636, 262643, 262679, and 262682.
llvm-svn: 262883
2016-03-08 00:36:35 +00:00
Chandler Carruth af8321ecf7 [memdep] Switch to range based for loops.
llvm-svn: 262831
2016-03-07 15:12:57 +00:00
Chandler Carruth b32febe48e [memdep] Switch a function to return true on success instead of false.
This is much more clear and less surprising IMO. It also makes things
more consistent with the increasingly large chunk of LLVM code that
assumes true-on-success.

llvm-svn: 262826
2016-03-07 12:45:07 +00:00
Chandler Carruth 40e21f2a20 [memdep] Cleanup the implementation doxygen comments and remove
duplicated comments.

In several cases these had diverged making them especially nice to
canonicalize. I checked to make sure we weren't losing important
information of course.

llvm-svn: 262825
2016-03-07 12:30:06 +00:00
Chandler Carruth 60fb1b4bd2 [memdep] Run clang-format over the header before porting it to
the new pass manager.

The port will involve substantial edits here, and would likely introduce
bad formatting if formatted in isolation, so just get all the formatting
up to snuff. I'll also go through and try to freshen the doxygen here as
well as modernizing some of the code.

llvm-svn: 262821
2016-03-07 10:19:30 +00:00
Philip Reames a0c9f6e736 [LVI] Fix a bug which prevented use of !range metadata within a query
The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further.  The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor.

llvm-svn: 262751
2016-03-04 22:27:39 +00:00
Easwaran Raman 588c68a87b Fix a memory leak.
llvm-svn: 262682
2016-03-04 01:18:40 +00:00
Philip Reames b7270446cf [ValueTracking] "constant fold" an experimental hidden option
llvm-svn: 262648
2016-03-03 19:50:32 +00:00
Philip Reames 146307eb52 [ValueTracking] Remove dead code from an old experiment
This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.  

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.  

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.  

llvm-svn: 262646
2016-03-03 19:44:06 +00:00
Easwaran Raman fd6557e368 Fix breakage caused by r262636.
Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused))

llvm-svn: 262643
2016-03-03 18:53:20 +00:00
Sanjoy Das 724f5cf278 [SCEV] Prove no-overflow via constant ranges
Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.

llvm-svn: 262639
2016-03-03 18:31:29 +00:00
Sanjoy Das 11ef606f1d [SCEV] Be less eager about demoting zexts to sexts
After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges).  This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension.  This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

llvm-svn: 262638
2016-03-03 18:31:23 +00:00
Easwaran Raman 3035719c86 Infrastructure for PGO enhancements in inliner
This patch provides the following infrastructure for PGO enhancements in inliner:

Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.

Differential Revision: http://reviews.llvm.org/D16381

llvm-svn: 262636
2016-03-03 18:26:33 +00:00
Chandler Carruth 12884f7f80 [AA] Hoist the logic to reformulate various AA queries in terms of other
parts of the AA interface out of the base class of every single AA
result object.

Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.

When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.

To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).

However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.

The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.

It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.

However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.

Differential Revision: http://reviews.llvm.org/D17329

llvm-svn: 262490
2016-03-02 15:56:53 +00:00
Sanjoy Das dcd3a88e29 [SCEV] Minor naming, braces cleanup; NFC
llvm-svn: 262459
2016-03-02 04:52:22 +00:00
Sanjoy Das 6b017a11ba Add a comment with a rational for the unusual code structure
llvm-svn: 262454
2016-03-02 02:56:29 +00:00
Sanjoy Das eca1b53b95 Qualify getRangeForAffineAR with this-> for MSVC
llvm-svn: 262453
2016-03-02 02:44:08 +00:00
Sanjoy Das 1168f93c2b Perturb code in an attempt to appease MSVC
For some reason MSVC seems to think I'm calling getConstant() from a
static context.  Try to avoid this issue by explicitly specifying
'this->' (though I'm not confident that this will actually work).

llvm-svn: 262451
2016-03-02 02:34:20 +00:00
Sanjoy Das 62a1c33929 More code permutation to appease MSVC
llvm-svn: 262449
2016-03-02 02:15:42 +00:00
Sanjoy Das 9e5ebf145c Remove "auto" to appease the MSVC bots
llvm-svn: 262448
2016-03-02 01:59:37 +00:00
Sanjoy Das bf73098472 [SCEV] Make getRange smarter around selects
Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q})
instead.

The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).

llvm-svn: 262438
2016-03-02 00:57:54 +00:00
Sanjoy Das b765b633cb [SCEV] Extract out a getRangeForAffineAR; NFC
Pure code-motion change.  Will be used later in making getRange more clever.

llvm-svn: 262437
2016-03-02 00:57:39 +00:00
Sanjoy Das f1e9cae00e [SCEV] Minor cleanup: rename method, C++11'ify; NFC
llvm-svn: 262374
2016-03-01 19:28:01 +00:00
Adam Nemet b8486e5a32 [LAA] Add missing debug output
llvm-svn: 262279
2016-03-01 00:50:08 +00:00
Benjamin Kramer 6bb15021b3 [InstSimplify] Restore fsub 0.0, (fsub 0.0, X) ==> X optzn
I accidentally removed this in r262212 but there was no test coverage to
detect it.

llvm-svn: 262215
2016-02-29 12:18:25 +00:00
Benjamin Kramer f5b2a47ac6 [InstSimplify] fsub 0.0, (fsub -0.0, X) ==> X is only safe if signed zeros are ignored.
Only allow fsub -0.0, (fsub -0.0, X) ==> X without nsz. PR26746.

llvm-svn: 262212
2016-02-29 11:12:23 +00:00
NAKAMURA Takumi df0cd72657 [PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix for clang.
char AnalysisBase::ID should be declared as extern and defined in one module.

llvm-svn: 262188
2016-02-28 17:17:00 +00:00
NAKAMURA Takumi ca04a1f720 Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal tweaks."
I'll rework soon.

llvm-svn: 262186
2016-02-28 16:54:06 +00:00
NAKAMURA Takumi de40e7437e [PM] Appease mingw32's auto-import DLL build with minimal tweaks.
char AnalysisBase::ID should be declared as extern and defined in one module.

llvm-svn: 262185
2016-02-28 16:38:46 +00:00
Chandler Carruth afcec4c55a [PM] Provide explicit instantiation declarations and definitions for the
PassManager and AnalysisManager template specializations as well.

llvm-svn: 262128
2016-02-27 10:45:35 +00:00
Chandler Carruth 2a54094d40 [PM] Provide two templates for the two directionalities of analysis
manager proxies and use those rather than repeating their definition
four times.

There are real differences between the two directions: outer AMs are
const and don't need to have invalidation tracked. But every proxy in
a particular direction is identical except for the analysis manager type
and the IR unit they proxy into. This makes them prime candidates for
nice templates.

I've started introducing explicit template instantiation declarations
and definitions as well because we really shouldn't be emitting all this
everywhere. I'm going to go back and add the same for the other
templates like this in a follow-up patch.

I've left the analysis manager as an opaque type rather than using two
IR units and requiring it to be an AnalysisManager template
specialization. I think its important that users retain the ability to
provide their own custom analysis management layer and provided it has
the appropriate API everything should Just Work.

llvm-svn: 262127
2016-02-27 10:38:10 +00:00
Philip Reames 70b391864d Suppress an uncovered switch warning [NFC]
llvm-svn: 262109
2016-02-27 05:18:30 +00:00
Philip Reames adf0e35308 [LVI] Extend select handling to catch min/max/clamp idioms
Most of this is fairly straight forward. Add handling for min/max via existing matcher utility and ConstantRange routines.  Add handling for clamp by exploiting condition constraints on inputs.  

Note that I'm only handling two constant ranges at this point. It would be reasonable to consider treating overdefined as a full range if the instruction is typed as an integer, but that should be a separate change.

Differential Revision: http://reviews.llvm.org/D17184

llvm-svn: 262085
2016-02-26 22:53:59 +00:00
Chandler Carruth 3a63435551 [PM] Introduce CRTP mixin base classes to help define passes and
analyses in the new pass manager.

These just handle really basic stuff: turning a type name into a string
statically that is nice to print in logs, and getting a static unique ID
for each analysis.

Sadly, the format of passes in anonymous namespaces makes using their
names in tests really annoying so I've customized the names of the no-op
passes to keep tests sane to read.

This is the first of a few simplifying refactorings for the new pass
manager that should reduce boilerplate and confusion.

llvm-svn: 262004
2016-02-26 11:44:45 +00:00
Michael Zolotukhin 9f520ebc54 [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating.
Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect.

Reviewers: chandlerc, hfinkel

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17632

llvm-svn: 261958
2016-02-26 02:57:05 +00:00
Hongbin Zheng bc53977a0d Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC
Differential Revision: http://reviews.llvm.org/D17571

llvm-svn: 261904
2016-02-25 17:54:25 +00:00
Hongbin Zheng 751337faa7 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261903
2016-02-25 17:54:15 +00:00
Hongbin Zheng 3f97840721 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261902
2016-02-25 17:54:07 +00:00
Hongbin Zheng 66b19fbc4e Revert "Introduce analysis pass to compute PostDominators in the new pass manager. NFC"
This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018.

llvm-svn: 261891
2016-02-25 16:45:53 +00:00
Hongbin Zheng ad782ce3f7 Revert "Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC"
This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2.

llvm-svn: 261890
2016-02-25 16:45:46 +00:00
Hongbin Zheng 921fabf34b Revert "Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC"
This reverts commit 8228b4d374edeb4cc0c5fddf6e1ab876918ee126.

llvm-svn: 261889
2016-02-25 16:45:37 +00:00
Hongbin Zheng 2fa386fd6c Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC
Differential Revision: http://reviews.llvm.org/D17571

llvm-svn: 261884
2016-02-25 16:33:26 +00:00
Hongbin Zheng 237197ba63 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261883
2016-02-25 16:33:15 +00:00
Hongbin Zheng a0273a04f5 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261882
2016-02-25 16:33:06 +00:00
Justin Bogner eecc3c826a PM: Implement a basic loop pass manager
This creates the new-style LoopPassManager and wires it up with dummy
and print passes.

This version doesn't support modifying the loop nest at all. It will
be far easier to discuss and evaluate the approaches to that with this
in place so that the boilerplate is out of the way.

llvm-svn: 261831
2016-02-25 07:23:08 +00:00
Artur Pilipenko 31bcca47d3 NFC. Move isDereferenceable to Loads.h/cpp
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.   

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16180

llvm-svn: 261736
2016-02-24 12:49:04 +00:00
Artur Pilipenko ae51afc5c7 NFC. Move getAlignment helper function from ValueTracking to Value class.
Reviewed By: reames, hfinkel

Differential Revision: http://reviews.llvm.org/D16144

llvm-svn: 261735
2016-02-24 12:25:10 +00:00
Chandler Carruth c5d211ef2c [PM] Remove an overly aggressive assert now that I can actually test the
pattern that triggers it. This essentially requires an immutable
function analysis, as that will survive anything we do to invalidate it.
When we have such patterns, the function analysis manager will not get
cleared between runs of the proxy.

If we actually need an assert about how things are queried, we can add
more elaborate machinery for computing it, but so far I'm not aware of
significant value provided.

Thanks to Justin Lebar for noticing this when he made a (seemingly
innocuous) change to FunctionAttrs that is enough to trigger it in one
test there. Now it is covered by a direct test of the pass manager code.

llvm-svn: 261627
2016-02-23 10:47:57 +00:00
Chandler Carruth 77b6e47f74 [PM] Improve the API and comments around the analysis manager proxies.
These are really handles that ensure the analyses get cleared at
appropriate places, and as such copying doesn't really make sense.
Instead, they should look more like unique ownership objects. Make that
the case.

Relatedly, if you create a temporary of one and move out of it
its destructor shouldn't actually clear anything. I don't think there is
any code that can trigger this currently, but it seems like a more
robust implementation.

If folks want, I can add a unittest that forces this to be exercised,
but that seems somewhat pointless -- whether a temporary is ever created
in the innards of AnalysisManager is not really something we should be
adding a reliance on, but I didn't want to leave a timebomb in the code
here.

If anyone has a cleaner way to represent this, I'm all ears, but
I wanted to assure myself that this wasn't in fact responsible for
another bug I'm chasing down (it wasn't) and figured I'd commit that.

llvm-svn: 261594
2016-02-23 00:05:00 +00:00
Krzysztof Parzyszek e261e5ac47 More detailed dependence test between volatile and non-volatile accesses
Differential Revision: http://reviews.llvm.org/D16857

llvm-svn: 261589
2016-02-22 23:07:43 +00:00
Sanjoy Das 5079f6260f [ConstantRange] Rename a method and add more doc
Rename makeNoWrapRegion to a more obvious makeGuaranteedNoWrapRegion,
and add a comment about the counter-intuitive aspects of the function.
This is to help prevent cases like PR26628.

llvm-svn: 261532
2016-02-22 16:13:02 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Tobias Grosser 934fcf4dc6 ScalerEvolution: Only erase temporary values if they actually have been added
This addresses post-review comments from Sanjoy Das for r261485.

llvm-svn: 261486
2016-02-21 18:50:09 +00:00
Tobias Grosser 11332e5ec5 ScalarEvolution: Do not keep temporary PHI values in ValueExprMap
Before this patch simplified SCEV expressions for PHI nodes were only returned
the very first time getSCEV() was called, but later calls to getSCEV always
returned the non-simplified value, which had "temporarily" been stored in the
ValueExprMap, but was never removed and consequently blocked the caching of the
simplified PHI expression.

llvm-svn: 261485
2016-02-21 17:42:10 +00:00
Joerg Sonnenberger 36894dcfed When MemoryDependenceAnalysis hits a CFG with many transparent blocks,
the algorithm easily degrades into quadratic memory and time complexity.
The easiest example is a long chain of BBs that don't otherwise use a
location. The caching will add an entry for every intermediate block and
limiting the number of results doesn't help as no results are produced
until a definition is found.

Introduce a limit similar to the existing instructions-per-block limit.
This limit counts the total number of blocks checked. If the limit is
reached, entries are considered unknown. The initial value is 1000,
which avoids regressions for normal sized functions while still
limiting edge cases to reasnable memory consumption and execution time.

Differential Revision: http://reviews.llvm.org/D16123

llvm-svn: 261430
2016-02-20 11:24:44 +00:00
Benjamin Kramer 2337c1fe13 [LVI] Move ConstantRanges instead of copying.
No functional change intended. Copying small (<= 64 bits) APInts isn't
expensive but bloats code by generating the slow path everywhere. Moving
doesn't care about the size of the value.

llvm-svn: 261426
2016-02-20 10:40:34 +00:00
Chandler Carruth 342c671b66 [PM/AA] Wire up CFLAA to the new pass manager fully, and port one of its
tests over to exercise this code.

This uncovered a few missing bits here and there in the analysis, but
nothing interesting.

llvm-svn: 261404
2016-02-20 03:52:02 +00:00
Chandler Carruth 4f846a5f15 [PM/AA] Port alias analysis evaluator to the new pass manager, and use
it to actually test the new pass manager AA wiring.

This patch was extracted from the (somewhat too large) D12357 and
rebosed on top of the slightly different design of the new pass manager
AA wiring that I just landed. With this we can start testing the AA in
a thorough way with the new pass manager.

Some minor cleanups to the code in the pass was necessitated here, but
otherwise it is a very minimal change.

Differential Revision: http://reviews.llvm.org/D17372

llvm-svn: 261403
2016-02-20 03:46:03 +00:00
Sanjoy Das 807d33da96 [SCEV] Don't spell `SCEV *` variables as `Scev`; NFC
It reads odd since most other places name a `SCEV *` as `S`.  Pure
renaming change.

llvm-svn: 261393
2016-02-20 01:44:10 +00:00
Sanjoy Das c42f7cc3f8 [SCEV] Don't use std::make_pair; NFC
`{A, B}` reads cleaner than `std::make_pair(A, B)`.

llvm-svn: 261392
2016-02-20 01:35:56 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Philip Reames bd09e86f82 [CaptureTracking] Support atomicrmw and cmpxchg
These atomic operations are conceptually both a load and store from the same location. As such, we can treat them as the most conservative of those two components which in practice, means we can treat them like stores. An cmpxchg or atomicrmw captures the values, but not the locations accessed.

Note: We can probably be more aggressive about the comparison value in an cmpxhg since to have it be in memory, it must already be captured, but I figured it was better to avoid that for the moment.

Note 2: It turns out that since we don't actually support cmpxchg of pointer type, writing a negative test is impossible.

Differential Revision: http://reviews.llvm.org/D17400

llvm-svn: 261245
2016-02-18 19:23:27 +00:00
Haicheng Wu 5cf99095bb [AliasSetTracker] Teach AliasSetTracker about MemSetInst
This change is to fix the problem discussed in
http://lists.llvm.org/pipermail/llvm-dev/2016-February/095446.html.

llvm-svn: 261052
2016-02-17 02:01:50 +00:00
Chandler Carruth e5944d97d8 [LCG] Construct an actual call graph with call-edge SCCs nested inside
reference-edge SCCs.

This essentially builds a more normal call graph as a subgraph of the
"reference graph" that was the old model. This allows both to exist and
the different use cases to use the aspect which addresses their needs.
Specifically, the pass manager and other *ordering* constrained logic
can use the reference graph to achieve conservative order of visit,
while analyses reasoning about attributes and other properties derived
from reachability can reason about the direct call graph.

Note that this isn't necessarily complete: it doesn't model edges to
declarations or indirect calls. Those can be found by scanning the
instructions of the function if desirable, and in fact every user
currently does this in order to handle things like calls to instrinsics.
If useful, we could consider caching this information in the call graph
to save the instruction scans, but currently that doesn't seem to be
important.

An important realization for why the representation chosen here works is
that the call graph is a formal subset of the reference graph and thus
both can live within the same data structure. All SCCs of the call graph
are necessarily contained within an SCC of the reference graph, etc.

The design is to build 'RefSCC's to model SCCs of the reference graph,
and then within them more literal SCCs for the call graph.

The formation of actual call edge SCCs is not done lazily, unlike
reference edge 'RefSCC's. Instead, once a reference SCC is formed, it
directly builds the call SCCs within it and stores them in a post-order
sequence. This is used to provide a consistent platform for mutation and
update of the graph. The post-order also allows for very efficient
updates in common cases by bounding the number of nodes (and thus edges)
considered.

There is considerable common code that I'm still looking for the best
way to factor out between the various DFS implementations here. So far,
my attempts have made the code harder to read and understand despite
reducing the duplication, which seems a poor tradeoff. I've not given up
on figuring out the right way to do this, but I wanted to wait until
I at least had the system working and tested to continue attempting to
factor it differently.

This also requires introducing several new algorithms in order to handle
all of the incremental update scenarios for the more complex structure
involving two edge colorings. I've tried to comment the algorithms
sufficiently to make it clear how this is expected to work, but they may
still need more extensive documentation.

I know that there are some changes which are not strictly necessarily
coupled here. The process of developing this started out with a very
focused set of changes for the new structure of the graph and
algorithms, but subsequent changes to bring the APIs and code into
consistent and understandable patterns also ended up touching on other
aspects. There was no good way to separate these out without causing
*massive* merge conflicts. Ultimately, to a large degree this is
a rewrite of most of the core algorithms in the LCG class and so I don't
think it really matters much.

Many thanks to the careful review by Sanjoy Das!

Differential Revision: http://reviews.llvm.org/D16802

llvm-svn: 261040
2016-02-17 00:18:16 +00:00
Philip Reames 845435c86a Revert 260705, it appears to be causing pr26628
The root issue appears to be a confusion around what makeNoWrapRegion actually does.   It seems likely we need two versions of this function with slightly different semantics.

llvm-svn: 260981
2016-02-16 17:14:30 +00:00
Junmo Park 6ebdc14cf1 [SCEVExpander] Make findExistingExpansion smarter
Summary:
Extending findExistingExpansion can use existing value in ExprValueMap.
This patch gives 0.3~0.5% performance improvements on 
benchmarks(test-suite, spec2000, spec2006, commercial benchmark)
   
Reviewers: mzolotukhin, sanjoy, zzheng

Differential Revision: http://reviews.llvm.org/D15559

llvm-svn: 260938
2016-02-16 06:46:58 +00:00
Chandler Carruth 6f5770b10f [PM/AA] Actually wire the AAManager I built for the new pass manager
into the new pass manager and fix the latent bugs there.

This lets everything live together nicely, but it isn't really useful
yet. I never finished wiring the AA layer up for the new pass manager,
and so subsequent patches will change this to do that wiring and get AA
stuff more fully integrated into the new pass manager. Turns out this is
necessary even to get functionattrs ported over. =]

llvm-svn: 260836
2016-02-13 23:32:00 +00:00
Benjamin Kramer 8f59adb217 [ConstantFolding] Reduce APInt and APFloat copying.
llvm-svn: 260826
2016-02-13 16:54:14 +00:00
Justin Lebar 144c5a6c15 Add convergent property to CodeMetrics.
Summary: No functional changes.

Reviewers: jingyue, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17126

llvm-svn: 260728
2016-02-12 21:01:31 +00:00
Philip Reames 2b9100dfbd [LVI] Exploit nsw/nuw when computing constant ranges
As the title says. Modelled after similar code in SCEV.

This is useful when analysing induction variables in loops which have been canonicalized by other passes. I wrote the tests as non-loops specifically to avoid the generality introduced in http://reviews.llvm.org/D17174. While that can handle many induction variables without *needing* to exploit nsw, there's no reason not to use it if we've already proven it.

Differential Revision: http://reviews.llvm.org/D17177

llvm-svn: 260705
2016-02-12 19:05:16 +00:00
Philip Reames 854a84c0b0 [LVI] Improve select handling to use condition
This patches teaches LVI to recognize clamp idioms (e.g. select(a > 5, a, 5) will always produce something greater than 5.

The tests end up being somewhat simplistic because trying to exercise the case I actually care about (a loop with a range check on a clamped secondary induction variable) ends up tripping across a couple of other imprecisions in the analysis. Ah, the joys of LVI...

Differential Revision: http://reviews.llvm.org/D16827

llvm-svn: 260627
2016-02-12 00:09:18 +00:00
Artur Pilipenko 66d6d3eb2d Make context-sensitive isDereferenceable queries in isSafeToLoadUnconditionally
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In the subsequent change isSafeToSpeculativelyExecute will be modified to use isSafeToLoadUnconditionally instead of isDereferenceableAndAlignedPointer.   

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D16227

llvm-svn: 260520
2016-02-11 13:42:59 +00:00
Philip Reames bb781b46e2 [LVI] Handle constants defensively
There's nothing preventing callers of LVI from asking for lattice values representing a Constant.  In fact, given that several callers are walking back through PHI nodes and trying to simplify predicates, such queries are actually quite common.  This is mostly harmless today, but we start volatiling assertions if we add new calls to getBlockValue in otherwise reasonable places.

Note that this change is not NFC.  Specifically:
1) The result returned through getValueAt will now be more precise.  In principle, this could trigger any latent infinite optimization loops in callers, but in practice, we're unlikely to see this.
2) The result returned through getBlockValueAt is potentially weakened for non-constants that were previously queried.  With the old code, you had the possibility that a later query might bypass the cache and discover some information the original query did not.  I can't find a scenario which actually causes this to happen, but it was in principle possible.  On the other hand, this may end up reducing compile time when the same value is queried repeatedly.  

llvm-svn: 260439
2016-02-10 21:46:32 +00:00
Sanjoy Das ef8ed0c0db [MemoryBuiltins] Fix an issue with hasNoAliasAttr
Summary:
`hasNoAliasAttr` is buggy: it checks to see if the called function has
a `noalias` attribute, which is incorrect since functions are not even
allowed to have the `noalias` attribute.  The comment on its only
caller, `llvm::isNoAliasFn`, makes it pretty clear that the intention
to do the `noalias` check on the return value, and not the callee.

Unfortunately I couldn't find a way to test this upstream -- fixing
this does not change the observable behavior of any of the passes that
use this.  This is not very surprising, since `noalias` does not tell
anything about the contents of the allocated memory (so, e.g., you
still cannot fold loads).  I'll be happy to be proven wrong though.

Reviewers: chandlerc, reames

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17037

llvm-svn: 260298
2016-02-09 21:54:18 +00:00
Sanjoy Das ca2edc7ad5 [GMR/OperandBundles] Teach getModRefBehavior about operand bundles
In general, memory restrictions on a called function (e.g. readnone)
cannot be transferred to a CallSite that has operand bundles.  It is
possible to make this inference smarter, but lets fix the behavior to be
correct first.

llvm-svn: 260193
2016-02-09 02:31:47 +00:00
Sanjoy Das 1c481f50d2 Add an "addUsedAAAnalyses" helper function
Summary:
Passes that call `getAnalysisIfAvailable<T>` also need to call
`addUsedIfAvailable<T>` in `getAnalysisUsage` to indicate to the
legacy pass manager that it uses `T`.  This contract was being
violated by passes that used `createLegacyPMAAResults`.  This change
fixes this by exposing a helper in AliasAnalysis.h,
`addUsedAAAnalyses`, that is complementary to createLegacyPMAAResults
and does the right thing when called from `getAnalysisUsage`.

Reviewers: chandlerc

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17010

llvm-svn: 260183
2016-02-09 01:21:57 +00:00
Sanjoy Das 55394d929c Remove SCEVAAWrapperPass from createLegacyPMAAResults; NFC
Summary:
createLegacyPMAAResults is only called by CGSCC and Module passes, so
the call to getAnalysisIfAvailable<SCEVAAWrapperPass>() never
succeeds (SCEVAAWrapperPass is a function pass).

Reviewers: chandlerc

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17009

llvm-svn: 260182
2016-02-09 01:21:50 +00:00
Wei Mi fc1cab305f This patch is to fix PR26529 caused by r259736.
IndVarSimplify assumes scAddRecExpr to be expanded in literal form instead of
canonical form by calling disableCanonicalMode after it creates SCEVExpander.
When CanonicalMode is disabled, SCEVExpander::expand should always return PHI
node for scAddRecExpr. r259736 broke the assumption.

The fix is to let SCEVExpander::expand skip the reuse Value logic if
CanonicalMode is false.

In addition, Besides IndVarSimplify, LSR pass also calls disableCanonicalMode
before doing rewrite. We can remove the original check of LSRMode in reuse
Value logic and use CanonicalMode instead.

llvm-svn: 260174
2016-02-09 00:07:08 +00:00
Michael Zolotukhin 1da4afdfc9 Factor out UnrollAnalyzer to Analysis, and add unit tests for it.
Summary:
Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests.
The change itself is supposed to be NFC, except adding a couple of tests.

I plan to add more tests as I add new functionality and find/fix bugs.

Reviewers: chandlerc, hfinkel, sanjoy

Subscribers: zzheng, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D16623

llvm-svn: 260169
2016-02-08 23:03:59 +00:00
Silviu Baranga ea63a7f512 [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory
sanitizer issue. The PredicatedScalarEvolution's copy constructor
wasn't copying the Generation value, and was leaving it un-initialized.

Original commit message:

[SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection

Summary:
This change adds no wrap SCEV predicates with:
  - support for runtime checking
  - support for expression rewriting:
      (sext ({x,+,y}) -> {sext(x),+,sext(y)}
      (zext ({x,+,y}) -> {zext(x),+,sext(y)}

Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.

We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.

Reviewers: mzolotukhin, sanjoy, anemet

Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel

Differential Revision: http://reviews.llvm.org/D15412

llvm-svn: 260112
2016-02-08 17:02:45 +00:00
Silviu Baranga 41b4973329 Revert r260086 and r260085. They have broken the memory
sanitizer bots.

llvm-svn: 260087
2016-02-08 11:56:15 +00:00
Silviu Baranga a35fadc7c4 [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection
Summary:
This change adds no wrap SCEV predicates with:
  - support for runtime checking
  - support for expression rewriting:
      (sext ({x,+,y}) -> {sext(x),+,sext(y)}
      (zext ({x,+,y}) -> {zext(x),+,sext(y)}

Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.

We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.

Reviewers: mzolotukhin, sanjoy, anemet

Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel

Differential Revision: http://reviews.llvm.org/D15412

llvm-svn: 260085
2016-02-08 10:45:50 +00:00
Hans Wennborg 00ab73dcb0 CallAnalyzer::analyzeCall: change the condition back to "Cost < Threshold"
In r252595, I inadvertently changed the condition to "Cost <= Threshold",
which caused a significant size regression in Chrome. This commit rectifies
that.

llvm-svn: 259915
2016-02-05 20:32:42 +00:00
Wei Mi 33e7bc0029 Fix a regression for r259736.
When SCEV expansion tries to reuse an existing value, it is needed to ensure
that using the Value at the InsertPt will not break LCSSA. The fix adds a
check that InsertPt is either inside the candidate Value's parent loop, or
the candidate Value's parent loop is nullptr.

llvm-svn: 259815
2016-02-04 19:17:33 +00:00
Sanjoy Das 76c48e0e70 [SCEV] Add boolean accessors for NSW, NUW and NW; NFC
llvm-svn: 259809
2016-02-04 18:21:54 +00:00
Wei Mi a49559befb [SCEV] Try to reuse existing value during SCEV expansion
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize the value already existed. This will introduce
redundent computation which may not be cleaned up throughly by
following optimizations.

This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.

The original commit triggered regressions in Polly tests. The regressions
exposed two problems which have been fixed in current version.

1. Polly will generate a new function based on the old one. To generate an
instruction for the new function, it builds SCEV for the old instruction,
applies some tranformation on the SCEV generated, then expands the transformed
SCEV and insert the expanded value into new function. Because SCEV expansion
may reuse value cached in ExprValueMap, the value in old function may be
inserted into new function, which is wrong.
   In SCEVExpander::expand, there is a logic to check the cached value to
be used should dominate the insertion point. However, for the above
case, the check always passes. That is because the insertion point is
in a new function, which is unreachable from the old function. However
for unreachable node, DominatorTreeBase::dominates thinks it will be
dominated by any other node.
   The fix is to simply add a check that the cached value to be used in
expansion should be in the same function as the insertion point instruction.

2. When the SCEV is of scConstant type, expanding it directly is cheaper than
reusing a normal value cached. Although in the cached value set in ExprValueMap,
there is a Constant type value, but it is not easy to find it out -- the cached
Value set is not sorted according to the potential cost. Existing reuse logic
in SCEVExpander::expand simply chooses the first legal element from the cached
value set.
   The fix is that when the SCEV is of scConstant type, don't try the reuse
logic. simply expand it.

Differential Revision: http://reviews.llvm.org/D12090

llvm-svn: 259736
2016-02-04 01:27:38 +00:00
David Majnemer fa8681e452 [ScalarEvolutionExpander] Simplify findInsertPointAfter
No functional change is intended.  The loop could only execute, at most,
once.

llvm-svn: 259701
2016-02-03 21:30:31 +00:00
Wei Mi 97de385868 Revert r259662, which caused regressions on polly tests.
llvm-svn: 259675
2016-02-03 18:05:57 +00:00
Wei Mi ed133978a0 [SCEV] Try to reuse existing value during SCEV expansion
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize the value already existed. This will introduce
redundent computation which may not be cleaned up throughly by
following optimizations.

This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.

Differential Revision: http://reviews.llvm.org/D12090

llvm-svn: 259662
2016-02-03 17:05:12 +00:00
James Molloy 6e518a3b50 [DemandedBits] Revert r249687 due to PR26071
This regresses a test in LoopVectorize, so I'll need to go away and think about how to solve this in a way that isn't broken.

From the writeup in PR26071:

What's happening is that ComputeKnownZeroes is telling us that all bits except the LSB are zero. We're then deciding that only the LSB needs to be demanded from the icmp's inputs.

This is where we're wrong - we're assuming that after simplification the bits that were known zero will continue to be known zero. But they're not - during trivialization the upper bits get changed (because an XOR isn't shrunk), so the icmp fails.

The fault is in demandedbits - its contract does clearly state that a non-demanded bit may either be zero or one.

llvm-svn: 259649
2016-02-03 15:05:06 +00:00
Philip Reames b7571043f2 [LVI] Fix debug output
Due to staleness in a patch I committed yesterday, the debug output was reporting overdefined cases as being undefined.  Confusing to say the least.  The mistake appears to have only effected the debug output thankfully.

llvm-svn: 259594
2016-02-02 22:43:08 +00:00
Philip Reames ed8cd0d36e [LVI] Code motion only [NFC]
I introduced a declaration in 259583 to keep the diff readable.  This change just moves the definition up to remove the declaration again.

llvm-svn: 259585
2016-02-02 22:03:19 +00:00
Philip Reames d1f829d374 [LVI] Refactor to use newly introduced intersect utility
This patch uses the newly introduced 'intersect' utility (from 259461: [LVI] Introduce an intersect operation on lattice values) to simplify existing code in LVI.

While not introducing any new concepts, this change is probably not NFC.  The common 'intersect' function is more powerful that the ad-hoc implementations we'd had in a couple of places.  Given that, we may see optimizations triggering a bit more often.

llvm-svn: 259583
2016-02-02 21:57:37 +00:00
Eugene Zelenko ecefe5a81f Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D16793

llvm-svn: 259539
2016-02-02 18:20:45 +00:00
Chandler Carruth a4499e9f73 [LCG] Build an edge abstraction for the LazyCallGraph and use it to
differentiate between indirect references to functions an direct calls.

This doesn't do a whole lot yet other than change the print out produced
by the analysis, but it lays the groundwork for a very major change I'm
working on next: teaching the call graph to actually be a call graph,
modeling *both* the indirect reference graph and the call graph
simultaneously. More details on that in the next patch though.

The rest of this is essentially a bunch of over-engineering that won't
be interesting until the next patch. But this also isolates essentially
all of the churn necessary to introduce the edge abstraction from the
very important behavior change necessary in order to separately model
the two graphs. So it should make review of the subsequent patch a bit
easier at the cost of making this patch seem poorly motivated. ;]

Differential Revision: http://reviews.llvm.org/D16038

llvm-svn: 259463
2016-02-02 03:57:13 +00:00
Philip Reames 44456b8963 [LVI] Introduce an intersect operation on lattice values
LVI has several separate sources of facts - edge local conditions, recursive queries, assumes, and control independent value facts - which all apply to the same value at the same location. The existing implementation was very conservative about exploiting all of these facts at once.

This change introduces an "intersect" function specifically to abstract the action of picking a good set of facts from all of the separate facts given. At the moment, this function is relatively simple (i.e. mostly just reuses the bits which were already there), but even the minor additions reveal the inherent power. For example, JumpThreading is now capable of doing an inductive proof that a particular value is always positive and removing a half range check.

I'm currently only using the new intersect function in one place. If folks are happy with the direction of the work, I plan on making a series of small changes without review to replace mergeIn with intersect at all the appropriate places.

Differential Revision: http://reviews.llvm.org/D14476

llvm-svn: 259461
2016-02-02 03:15:40 +00:00
Philip Reames 2c275cc686 [LVI] Fix a latent bug in getValueAt
This routine was returning Undefined for most queries.  This was utterly wrong.  Amusingly, we do not appear to have any callers of this which are actually trying to exploit unreachable code or this would have broken the world.

A better approach would be to explicit describe the intersection of facts.  That's blocked behind http://reviews.llvm.org/D14476 and I wanted to fix the current bug.

llvm-svn: 259446
2016-02-02 00:45:30 +00:00
Philip Reames 13f7324b86 [LVI] Remove overly tight assert from 259429
I'll submit a test case shortly which covers this, but it's causing clang self host problems in the builders so I wanted to get it removed.

llvm-svn: 259432
2016-02-01 23:21:11 +00:00
Philip Reames c0bdb0c1e5 [LVI] Add select handling
Teach LVI to handle select instructions in the exact same way it handles PHI nodes.  This is useful since various parts of the optimizer convert PHI nodes into selects and we don't want these transformations to cause inferior optimization.  

Note that this patch does nothing to exploit the implied constraint on the inputs represented by the select condition itself.  That will be a later patch and is blocked on http://reviews.llvm.org/D14476

llvm-svn: 259429
2016-02-01 22:57:53 +00:00
Jun Bum Lim 53907161cc Avoid inlining call sites in unreachable-terminated block
Summary:
If the normal destination of the invoke or the parent block of the call site is unreachable-terminated, there is little point in inlining the call site unless there is literally zero cost. Unlike my previous change (D15289), this change specifically handle the call sites followed by unreachable in the same basic block for call or in the normal destination for the invoke. This change could be a reasonable first step to conservatively inline call sites leading to an unreachable-terminated block while BFI / BPI is not yet available in inliner.

Reviewers: manmanren, majnemer, hfinkel, davidxl, mcrosier, dblaikie, eraman

Subscribers: dblaikie, davidxl, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16616

llvm-svn: 259403
2016-02-01 20:55:11 +00:00
Sanjoy Das 4c7b6d79c0 [SCEV] Clean up isKnownPredicateViaConstantRanges; NFCI
- ScalarEvolution::isKnownPredicateViaConstantRanges duplicates some
   logic already present in ConstantRange, use ConstantRange for those
   bits.

 - In some cases ScalarEvolution::isKnownPredicateViaConstantRanges
   returns `false` to mean "definitely false" (e.g. see the
   `LHSRange.getSignedMin().sge(RHSRange.getSignedMax())` case for
   `ICmpInst::ICMP_SLT`), but for `isKnownPredicateViaConstantRanges`,
   `false` actually means "don't know".  Get rid of this extra bit of
   code to avoid confusion.

llvm-svn: 259401
2016-02-01 20:48:14 +00:00
Sanjoy Das 401e631c4b [SCEV] Rename isKnownPredicateWithRanges; NFC
Make it obvious that it uses constant ranges, and use `Via` instead of
`With`, like other similar functions in SCEV.

llvm-svn: 259400
2016-02-01 20:48:10 +00:00
Jun Bum Lim ca832660ae [ValueTracking] Improve isKnownNonZero for PHI of non-zero constants
It is clear that a PHI is a non-zero if all incoming values are non-zero constants.

llvm-svn: 259370
2016-02-01 17:03:07 +00:00
Gerolf Hoflehner d24671f880 [BasicAA] NFC - revised comment for function adjustToPointerSize()
llvm-svn: 259300
2016-01-30 05:58:38 +00:00
Gerolf Hoflehner 87ddb65fa6 [BasicAA] Fix for missing must alias (D16343)
llvm-svn: 259299
2016-01-30 05:52:53 +00:00
Gerolf Hoflehner 73fc84bfe9 [BasicAA] Update on r259290 - added missing cast
llvm-svn: 259298
2016-01-30 05:35:09 +00:00
Gerolf Hoflehner 1d1fbb52e3 [BasicAA] NFC - utility function for two's complement wrap-around
llvm-svn: 259290
2016-01-30 02:42:11 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Yaron Keren eb2a25467e Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.
clang part in r259232, this is the LLVM part of the patch.

llvm-svn: 259240
2016-01-29 20:50:44 +00:00
Easwaran Raman 30a93c1848 Lower inlining threshold when the caller has minsize attribute.
When the caller has optsize attribute, we reduce the inlinining threshold
to OptSizeThreshold (=75) if it is not already lower than that. We don't do
the same for minsize and I suspect it was not intentional. This also addresses
a FIXME regarding checking optsize attribute explicitly instead of using the
right wrapper.

Differential Revision: http://reviews.llvm.org/D16493

llvm-svn: 259120
2016-01-28 23:44:41 +00:00
Matthias Braun 37e5d79c9c ValueTracking: Use fixed array for assumption exclude set in Query.
The Query structure is constructed often and is relevant for compiletime
performance. We can replace the SmallPtrSet for assumption exclusions in
this structure with a fixed size array because we know the maximum
number of elements.  This improves typical clang -O3 -emit-llvm compiletime
by 1.2% in my measurements.

Differential Revision: http://reviews.llvm.org/D16204

llvm-svn: 259025
2016-01-28 06:29:33 +00:00
George Burgess IV 77351ba3ae Minor style cleanup of CFLAA. NFC.
llvm-svn: 259008
2016-01-28 00:54:01 +00:00
Adam Nemet dadfbb52f7 [TTI] Add getPrefetchDistance from PPCLoopDataPrefetch, NFC
This patch is part of the work to make PPCLoopDataPrefetch
target-independent
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758).

As it was discussed in the above thread, getPrefetchDistance is
currently using instruction count which may change in the future.

llvm-svn: 258995
2016-01-27 22:21:25 +00:00
John McCall 3fe604f89f Add support for objc_unsafeClaimAutoreleasedReturnValue to the
ObjC ARC Optimizer.

The main implication of this is:

1. Ensuring that we treat it conservatively in terms of optimization.
2. We put the ASM marker on it so that the runtime can recognize
objc_unsafeClaimAutoreleasedReturnValue from releaseRV.

<rdar://problem/21567064>

Patch by Michael Gottesman!

llvm-svn: 258970
2016-01-27 19:05:08 +00:00
Benjamin Kramer 820f7548a1 Make some headers self-contained, remove unused includes that violate layering.
llvm-svn: 258937
2016-01-27 16:05:37 +00:00
Justin Lebar 511328817d Disable all standard lib functions for NVVM.
Summary:
NVVM doesn't have a standard library, as currently implemented, so this
just isn't going to work.  I'd like to revisit this, since it's hiding
opportunities for optimization, but correctness comes first.

Thank you to hfinkel for pointing me in the right direction here.

Reviewers: tra

Subscribers: echristo, jhen, llvm-commits, hfinkel

Differential Revision: http://reviews.llvm.org/D16604

llvm-svn: 258884
2016-01-26 23:51:06 +00:00
Chris Bieneman e49730d4ba Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Haicheng Wu f1c00a22be [LIR] Add support for structs and hand unrolled loops
This is a recommit of r258620 which causes PR26293.

The original message:

Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258777
2016-01-26 02:27:47 +00:00
Quentin Colombet a392810bea Speculatively revert r258620 as it is the likely culprid of PR26293.
llvm-svn: 258703
2016-01-25 19:12:49 +00:00
James Molloy 121de0bcfa [DemandedBits] Fix computation of demanded bits for ICmps
The computation of ICmp demanded bits is independent of the individual operand being evaluated. We simply return a mask consisting of the minimum leading zeroes of both operands.

We were incorrectly passing "I" to ComputeKnownBits - this should be "UserI->getOperand(0)". In cases where we were evaluating the 1th operand, we were taking the minimum leading zeroes of it and itself.

This should fix PR26266.

llvm-svn: 258690
2016-01-25 14:49:36 +00:00
Manuel Jacob 0af37b21c8 Remove duplicate documentation in ConstantFolding.cpp. NFC.
The documentation for these functions is already present in the header file.

llvm-svn: 258649
2016-01-23 22:49:54 +00:00
Haicheng Wu dd5e9d2159 [LIR] Add support for structs and hand unrolled loops
Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258620
2016-01-23 06:52:41 +00:00
Eduard Burtescu 68e7f49f8e [opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16282

llvm-svn: 258478
2016-01-22 03:08:27 +00:00
Eduard Burtescu e2a6917849 [opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16422

llvm-svn: 258477
2016-01-22 01:51:51 +00:00
Eduard Burtescu 1423921a24 [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16418

llvm-svn: 258472
2016-01-22 01:17:26 +00:00
Eduard Burtescu 2f4758b1cc [opaque pointer types] [NFC] Take advantage of get{Source,Result}ElementType when folding GEPs.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16302

llvm-svn: 258456
2016-01-21 23:42:06 +00:00
David Majnemer 3af5bf30e3 [InstCombine] Simplify (x >> y) <= x
This commit extends the patterns recognised by InstSimplify to also handle (x >> y) <= x in the same way as (x /u y) <= x.

The missing optimisation was found investigating why LLVM did not optimise away bound checks in a binary search: https://github.com/rust-lang/rust/pull/30917

Patch by Andrea Canciani!

Differential Revision: http://reviews.llvm.org/D16402

llvm-svn: 258422
2016-01-21 18:55:54 +00:00
Adam Nemet af761104ba [TTI] Add getCacheLineSize
Summary:
And use it in PPCLoopDataPrefetch.cpp.

@hfinkel, please let me know if your preference would be to preserve the
ppc-loop-prefetch-cache-line option in order to be able to override the
value of TTI::getCacheLineSize for PPC.

Reviewers: hfinkel

Subscribers: hulx2000, mcrosier, mssimpso, hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D16306

llvm-svn: 258419
2016-01-21 18:28:36 +00:00
Manuel Jacob f3ee254bc2 Undo r258163 "Move part of an if condition into an assertion. NFC."
This undoes the change made in r258163.  The assertion fails if `Ptr` is of a
vector type.  The previous code doesn't look completely correct either, so I'll
investigate this more.

llvm-svn: 258411
2016-01-21 17:36:14 +00:00
Manuel Jacob e902459c4b Change ConstantFoldInstOperands to take Instruction instead of opcode and type. NFC.
Summary:
The previous form, taking opcode and type, is moved to an internal
helper and the new form, taking an instruction, is a wrapper around this
helper.

Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16383

llvm-svn: 258391
2016-01-21 06:33:22 +00:00
Manuel Jacob 925d029461 Introduce ConstantFoldCastOperand function and migrate some callers of ConstantFoldInstOperands to use it. NFC.
Summary:
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: zzheng, dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16380

llvm-svn: 258390
2016-01-21 06:31:08 +00:00
Manuel Jacob a61ca37b6d Introduce ConstantFoldBinaryOpOperands function and migrate some callers of ConstantFoldInstOperands to use it. NFC.
Summary:
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16378

llvm-svn: 258389
2016-01-21 06:26:35 +00:00
Sanjay Patel f44bd38092 fix typo; NFC
llvm-svn: 258332
2016-01-20 18:59:48 +00:00
Sanjoy Das 29a4b5dc0d [SCEV] Fix PR26207
In some cases, the max backedge taken count can be more conservative
than the exact backedge taken count (for instance, because
ScalarEvolution::getRange is not control-flow sensitive whereas
computeExitLimitFromICmp can be).  In these cases,
computeExitLimitFromCond (specifically the bit that deals with `and` and
`or` instructions) can create an ExitLimit instance with a
`SCEVCouldNotCompute` max backedge count expression, but a computable
exact backedge count expression.  This violates an implicit SCEV
assumption: a computable exact BE count should imply a computable max BE
count.

This change

 - Makes the above implicit invariant explicit by adding an assert to
   ExitLimit's constructor

 - Changes `computeExitLimitFromCond` to be more robust around
   conservative max backedge counts

llvm-svn: 258184
2016-01-19 20:53:51 +00:00
Sanjoy Das 0ff078736f [SCEV] Use range-for; NFC
llvm-svn: 258183
2016-01-19 20:53:46 +00:00
Manuel Jacob 3f49f654a2 Move part of an if condition into an assertion. NFC.
llvm-svn: 258163
2016-01-19 19:04:49 +00:00
Eduard Burtescu 19eb03106d [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Dan Gohman 0553299586 [WebAssembly] Re-enable loop idiom recognition for memcpy et al.
llvm-svn: 258125
2016-01-19 14:49:23 +00:00
Adam Nemet d8968f0945 [LAA] Include function name in debug output
llvm-svn: 258088
2016-01-18 21:16:33 +00:00
Eduard Burtescu 90c4449128 [opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16272

llvm-svn: 258028
2016-01-18 00:10:01 +00:00
Sanjay Patel 6435c6ede0 fix variable names; NFC
llvm-svn: 258027
2016-01-17 23:18:05 +00:00
Sanjay Patel 9613b29927 fix typos; NFC
llvm-svn: 258026
2016-01-17 23:13:48 +00:00
Manuel Jacob 20c6d5bcb8 [opaque pointer types] [breaking-change] [NFC] SimplifyGEPInst: take the source element type of the GEP as an argument.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16281

llvm-svn: 258024
2016-01-17 22:46:43 +00:00
Manuel Jacob da2c9baa07 [NFC] Remove one dead PointerType::getElementType() call.
Reviewers: dblaikie, mjacob

Subscribers: llvm-commits, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16274

llvm-svn: 258022
2016-01-17 22:28:28 +00:00
Artur Pilipenko f84dc06e5b Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D16226

llvm-svn: 258010
2016-01-17 12:35:29 +00:00
Manuel Jacob 5f6eaac611 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260

llvm-svn: 257999
2016-01-16 20:30:46 +00:00
Igor Laevsky 28eeb3f66c [BasicAliasAnalysis] Take into account operand bundles in the getModRefInfo function
Differential Revision: http://reviews.llvm.org/D16225

llvm-svn: 257991
2016-01-16 12:15:53 +00:00
Matthias Braun feb81bc682 ValueTracking: Put DataLayout reference into the Query structure, NFC.
It looks nicer and improves the compiletime of a typical
clang -O3 -emit-llvm run by ~0.6% for me.

Differential Revision: http://reviews.llvm.org/D16205

llvm-svn: 257944
2016-01-15 22:22:04 +00:00
Joseph Tremoulet 44b3f961e1 [WinEH] Rename CatchReturnInst::getParentPad, NFC
Summary:
Rename to getCatchSwitchParentPad, to make it more clear which ancestor
the "parent" in question is.  Add a comment pointing out the key feature
that the returned pad indicates which funclet contains the successor
block.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16222

llvm-svn: 257933
2016-01-15 21:16:19 +00:00
Artur Pilipenko 6dd6969cee Change isSafeToLoadUnconditionally arguments order. Separated from http://reviews.llvm.org/D10920.
llvm-svn: 257894
2016-01-15 15:27:46 +00:00
Sanjay Patel 960e5349af rangify; NFCI
llvm-svn: 257845
2016-01-15 00:08:10 +00:00
Sanjay Patel 784b5e3ff0 remove duplicate documentation comments (already in the header file) ; NFC
llvm-svn: 257835
2016-01-14 23:23:04 +00:00