Commit Graph

12034 Commits

Author SHA1 Message Date
David Peixotto 0d4d5e64ec Fix assertion in LICM doFinalization()
The doFinalization method checks that the LoopToAliasSetMap is
empty. LICM populates that map as it runs through the loop nest,
deleting the entries for child loops as it goes. However, if a child
loop is deleted by another pass (e.g. unrolling) then the loop will
never be deleted from the map because LICM walks the loop nest to
find entries it can delete.

The fix is to delete the loop from the map and free the alias set
when the loop is deleted from the loop nest.

Differential Revision: http://reviews.llvm.org/D5305

llvm-svn: 218387
2014-09-24 16:48:31 +00:00
Michael Liao d120916ca7 Allow BB duplication threshold to be adjusted through JumpThreading's ctor
- BB duplication may not be desired on targets where there is no or small
  branch penalty and code duplication needs restrict control.

llvm-svn: 218375
2014-09-24 04:59:06 +00:00
Reid Kleckner 78927e884b GlobalOpt: Preserve comdats of unoptimized initializers
Rather than slurping in and splatting out the whole ctor list, preserve
the existing array entries without trying to understand them.  Only
remove the entries that we know we can optimize away.  This way we don't
need to wire through priority and comdats or anything else we might add.

Fixes a linker issue where the .init_array or .ctors entry would point
to discarded initialization code if the comdat group from the TU with
the faulty global_ctors entry was dropped.

llvm-svn: 218337
2014-09-23 22:33:01 +00:00
Lenny Maiorani 9eefc81219 Using a deque to manage the stack of nodes is faster here.
Vector is slow due to many reallocations as the size regularly changes in
  unpredictable ways. See the investigation provided on the mailing list for
  more information:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120116/135228.html

llvm-svn: 218182
2014-09-20 13:29:20 +00:00
Eric Christopher d85ffb1fc0 Add a new pass FunctionTargetTransformInfo. This pass serves as a
shim between the TargetTransformInfo immutable pass and the Subtarget
via the TargetMachine and Function. Migrate a single call from
BasicTargetTransformInfo as an example and provide shims where TargetMachine
begins taking a Function to determine the subtarget.

No functional change.

llvm-svn: 218004
2014-09-18 00:34:14 +00:00
David Blaikie dba94ec3c7 Reapply fix in r217988 (reverted in r217989) and remove the alternative fix committed in r217987.
This type isn't owned polymorphically (as demonstrated by making the
dtor protected and everything still compiling) so just address the
warning by protecting the base dtor and making the derived class final.

llvm-svn: 217990
2014-09-17 22:27:36 +00:00
David Blaikie d8978ec085 Revert "Fix -Wnon-virtual-dtor warning introduced in r217982."
An alternative fix was already committed.

This reverts commit r217988.

llvm-svn: 217989
2014-09-17 22:17:59 +00:00
David Blaikie 20dd05ccfd Fix -Wnon-virtual-dtor warning introduced in r217982.
llvm-svn: 217988
2014-09-17 22:15:40 +00:00
Chris Bieneman cf93cbb7a4 Fixing a build error.
llvm-svn: 217983
2014-09-17 21:06:59 +00:00
Chris Bieneman ad070d0588 Refactoring SimplifyLibCalls to remove static initializers and generally cleaning up the code.
Summary: This eliminates ~200 lines of code mostly file scoped struct definitions that were unnecessary.

Reviewers: chandlerc, resistor

Reviewed By: resistor

Subscribers: morisset, resistor, llvm-commits

Differential Revision: http://reviews.llvm.org/D5364

llvm-svn: 217982
2014-09-17 20:55:46 +00:00
Chad Rosier 307b50b0f6 [IndVarSimplify] Partially revert r217953 to see if this fixes the bots.
Specifically, disable widening of unsigned compare instructions.

llvm-svn: 217962
2014-09-17 16:35:09 +00:00
Chad Rosier bb99f40530 [IndVarSimplify] Widen loop compare instructions.
This improves other optimizations such as LSR.  A sext may be added to the
compare's other operand, but this can often be hoisted outside of the loop.

llvm-svn: 217953
2014-09-17 14:10:33 +00:00
Andrea Di Biagio 5b92b4971a [InstCombine] Fix wrong folding of constant comparison involving ahsr and negative quantities (PR20945).
Example:
define i1 @foo(i32 %a) {
  %shr = ashr i32 -9, %a
  %cmp = icmp ne i32 %shr, -5
  ret i1 %cmp
}

Before this fix, the instruction combiner wrongly thought that %shr
could have never been equal to -5. Therefore, %cmp was always folded to 'true'.
However, when %a is equal to 1, then %cmp evaluates to 'false'. Therefore,
in this example, it is not valid to fold %cmp to 'true'.
The problem was only affecting the case where the comparison was between
negative quantities where one of the quantities was obtained from arithmetic
shift of a negative constant.

This patch fixes the problem with the wrong folding (fixes PR20945).
With this patch, the 'icmp' from the example is now simplified to a
comparison between %a and 1. This still allows us to get rid of the arithmetic
shift (%shr).

llvm-svn: 217950
2014-09-17 11:32:31 +00:00
Jingyue Wu b67140b812 Remove dead code in SimplifyCFG
Summary: UsedByBranch is always true according to how BonusInst is defined.

Test Plan:
Passes check-all, and also verified 

if (BonusInst && !UsedByBranch) {
  ...
}

is never entered during check-all.

Reviewers: resistor, nadav, jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, eliben, meheff

Differential Revision: http://reviews.llvm.org/D5324

llvm-svn: 217824
2014-09-15 20:48:13 +00:00
Nick Lewycky 9e6d184803 Add control of function merging to the PMBuilder.
llvm-svn: 217731
2014-09-13 21:46:00 +00:00
Benjamin Kramer 0bd147da17 Simplify code. No functionality change.
llvm-svn: 217726
2014-09-13 12:38:49 +00:00
Juergen Ributzka 14ae60407d [C API] Make the 'lower switch' pass available via the C API.
llvm-svn: 217630
2014-09-11 21:32:32 +00:00
Hal Finkel f83e1f7f66 [AlignmentFromAssumptions] Don't crash just because the target is 32-bit
We used to crash processing any relevant @llvm.assume on a 32-bit target
(because we'd ask SE to subtract expressions of differing types). I've copied
our 'simple.ll' test, but with the data layout from arm-linux-gnueabihf to get
some meaningful test coverage here.

llvm-svn: 217574
2014-09-11 08:40:17 +00:00
Rafael Espindola c435adcde0 Add doInitialization/doFinalization to DataLayoutPass.
With this a DataLayoutPass can be reused for multiple modules.

Once we have doInitialization/doFinalization, it doesn't seem necessary to pass
a Module to the constructor.

Overall this change seems in line with the idea of making DataLayout a required
part of Module. With it the only way of having a DataLayout used is to add it
to the Module.

llvm-svn: 217548
2014-09-10 21:27:43 +00:00
Hal Finkel 71b7084112 [AlignmentFromAssumptions] Don't divide by zero for unknown starting alignment
The routine that determines an alignment given some SCEV returns zero if the
answer is unknown. In a case where we could determine the increment of an
AddRec but not the starting alignment, we would compute the integer modulus by
zero (which is illegal and traps). Prevent this by returning early if either
the start or increment alignment is unknown (zero).

llvm-svn: 217544
2014-09-10 21:05:52 +00:00
Gerolf Hoflehner 008e5cdcba [PassManager] Adding Hidden attribute to EnableMLSM option
llvm-svn: 217539
2014-09-10 20:24:03 +00:00
Gerolf Hoflehner 24815d9b8f [MergedLoadStoreMotion] Move pass enabling option to PassManagerBuilder
llvm-svn: 217538
2014-09-10 19:55:29 +00:00
Sanjay Patel b653de1ada Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable.
"Unroll" is not the appropriate name for this variable. Clang already uses 
the term "interleave" in pragmas and metadata for this.

Differential Revision: http://reviews.llvm.org/D5066

llvm-svn: 217528
2014-09-10 17:58:16 +00:00
Gerolf Hoflehner e4f6684d1b Removed misleading comment.
llvm-svn: 217527
2014-09-10 17:54:50 +00:00
Stepan Dyatkovskiy fe134cdfa7 MergeFunctions: FunctionPtr has been renamed to FunctionNode.
It's supposed to store additional pass information for current function here.
That was the reason for name change.

llvm-svn: 217483
2014-09-10 10:08:25 +00:00
NAKAMURA Takumi 1ab0cf0e28 SampleProfile.cpp: Prune a stray \param added in r217437. [-Wdocumentation]
llvm-svn: 217465
2014-09-09 22:44:30 +00:00
NAKAMURA Takumi bb4fac9050 ScalarOpts/LLVMBuild.txt: Prune unused dependency to IPA.
llvm-svn: 217448
2014-09-09 15:00:38 +00:00
NAKAMURA Takumi 37ffecf06b ScalarOpts/LLVMBuild.txt: Reorder.
llvm-svn: 217447
2014-09-09 15:00:26 +00:00
Diego Novillo de1ab26f52 Re-factor sample profile reader into lib/ProfileData.
Summary:
This patch moves the profile reading logic out of the Sample Profile
transformation into a generic profile reader facility in
lib/ProfileData.

The intent is to use this new reader to implement a sample profile
reader/writer that can be used to convert sample profiles from external
sources into LLVM.

This first patch introduces no functional changes. It moves the profile
reading code from lib/Transforms/SampleProfile.cpp into
lib/ProfileData/SampleProfReader.cpp.

In subsequent patches I will:

- Add a bitcode format for sample profiles to allow for more efficient
  encoding of the profile.
- Add a writer for both text and bitcode format profiles.
- Add a 'convert' command to llvm-profdata to be able to convert between
  the two (and serve as entry point for other sample profile formats).

Reviewers: bogner, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5250

llvm-svn: 217437
2014-09-09 12:40:50 +00:00
Andrew Trick 8fc3c6c093 Add a comment to getNewAlignmentDiff.
llvm-svn: 217350
2014-09-07 23:16:24 +00:00
Hal Finkel 93873cc10e Check for all known bits on ret in InstCombine
From a combination of @llvm.assume calls (and perhaps through other means, such
as range metadata), it is possible that all bits of a return value might be
known. Previously, InstCombine did not check for this (which is understandable
given assumptions of constant propagation), but means that we'd miss simple
cases where assumptions are involved.

llvm-svn: 217346
2014-09-07 21:28:34 +00:00
Hal Finkel 7e1844940e Make use of @llvm.assume from LazyValueInfo
This change teaches LazyValueInfo to use the @llvm.assume intrinsic. Like with
the known-bits change (r217342), this requires feeding a "context" instruction
pointer through many functions. Aside from a little refactoring to reuse the
logic that turns predicates into constant ranges in LVI, the only new code is
that which can 'merge' the range from an assumption into that otherwise
computed. There is also a small addition to JumpThreading so that it can have
LVI use assumptions in the same block as the comparison feeding a conditional
branch.

With this patch, we can now simplify this as expected:
int foo(int a) {
  __builtin_assume(a > 5);
  if (a > 3) {
    bar();
    return 1;
  }
  return 0;
}

llvm-svn: 217345
2014-09-07 20:29:59 +00:00
Hal Finkel d67e463901 Add an AlignmentFromAssumptions Pass
This adds a ScalarEvolution-powered transformation that updates load, store and
memory intrinsic pointer alignments based on invariant((a+q) & b == 0)
expressions. Many of the simple cases we can get with ValueTracking, but we
still need something like this for the more complicated cases (such as those
with an offset) that require some algebra. Note that gcc's
__builtin_assume_aligned's optional third argument provides exactly for this
kind of 'misalignment' offset for which this kind of logic is necessary.

The primary motivation is to fixup alignments for vector loads/stores after
vectorization (and unrolling). This pass is added to the optimization pipeline
just after the SLP vectorizer runs (which, admittedly, does not preserve SE,
although I imagine it could).  Regardless, I actually don't think that the
preservation matters too much in this case: SE computes lazily, and this pass
won't issue any SE queries unless there are any assume intrinsics, so there
should be no real additional cost in the common case (SLP does preserve DT and
LoopInfo).

llvm-svn: 217344
2014-09-07 20:05:11 +00:00
Hal Finkel 15aeaaf24a Add additional patterns for @llvm.assume in ValueTracking
This builds on r217342, which added the infrastructure to compute known bits
using assumptions (@llvm.assume calls). That original commit added only a few
patterns (to catch common cases related to determining pointer alignment); this
change adds several other patterns for simple cases.

r217342 contained that, for assume(v & b = a), bits in the mask
that are known to be one, we can propagate known bits from the a to v. It also
had a known-bits transfer for assume(a = b). This patch adds:

assume(~(v & b) = a) : For those bits in the mask that are known to be one, we
                       can propagate inverted known bits from the a to v.

assume(v | b = a) :    For those bits in b that are known to be zero, we can
                       propagate known bits from the a to v.

assume(~(v | b) = a):  For those bits in b that are known to be zero, we can
                       propagate inverted known bits from the a to v.

assume(v ^ b = a) :    For those bits in b that are known to be zero, we can
		       propagate known bits from the a to v. For those bits in
		       b that are known to be one, we can propagate inverted
                       known bits from the a to v.

assume(~(v ^ b) = a) : For those bits in b that are known to be zero, we can
		       propagate inverted known bits from the a to v. For those
		       bits in b that are known to be one, we can propagate
                       known bits from the a to v.

assume(v << c = a) :   For those bits in a that are known, we can propagate them
                       to known bits in v shifted to the right by c.

assume(~(v << c) = a) : For those bits in a that are known, we can propagate
                        them inverted to known bits in v shifted to the right by c.

assume(v >> c = a) :   For those bits in a that are known, we can propagate them
                       to known bits in v shifted to the right by c.

assume(~(v >> c) = a) : For those bits in a that are known, we can propagate
                        them inverted to known bits in v shifted to the right by c.

assume(v >=_s c) where c is non-negative: The sign bit of v is zero

assume(v >_s c) where c is at least -1: The sign bit of v is zero

assume(v <=_s c) where c is negative: The sign bit of v is one

assume(v <_s c) where c is non-positive: The sign bit of v is one

assume(v <=_u c): Transfer the known high zero bits

assume(v <_u c): Transfer the known high zero bits (if c is know to be a power
                 of 2, transfer one more)

A small addition to InstCombine was necessary for some of the test cases. The
problem is that when InstCombine was simplifying and, or, etc. it would fail to
check the 'do I know all of the bits' condition before checking less specific
conditions and would not fully constant-fold the result. I'm not sure how to
trigger this aside from using assumptions, so I've just included the change
here.

llvm-svn: 217343
2014-09-07 19:21:07 +00:00
Hal Finkel 60db05896a Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.

As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.

The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.

Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.

This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).

llvm-svn: 217342
2014-09-07 18:57:58 +00:00
Hal Finkel 57f03dda49 Add functions for finding ephemeral values
This adds a set of utility functions for collecting 'ephemeral' values. These
are LLVM IR values that are used only by @llvm.assume intrinsics (directly or
indirectly), and thus will be removed prior to code generation, implying that
they should be considered free for certain purposes (like inlining). The
inliner's cost analysis, and a few other passes, have been updated to account
for ephemeral values using the provided functionality.

This functionality is important for the usability of @llvm.assume, because it
limits the "non-local" side-effects of adding llvm.assume on inlining, loop
unrolling, etc. (these are hints, and do not generate code, so they should not
directly contribute to estimates of execution cost).

llvm-svn: 217335
2014-09-07 13:49:57 +00:00
Hal Finkel 74c2f355d2 Add an Assumption-Tracking Pass
This adds an immutable pass, AssumptionTracker, which keeps a cache of
@llvm.assume call instructions within a module. It uses callback value handles
to keep stale functions and intrinsics out of the map, and it relies on any
code that creates new @llvm.assume calls to notify it of the new instructions.
The benefit is that code needing to find @llvm.assume intrinsics can do so
directly, without scanning the function, thus allowing the cost of @llvm.assume
handling to be negligible when none are present.

The current design is intended to be lightweight. We don't keep track of
anything until we need a list of assumptions in some function. The first time
this happens, we scan the function. After that, we add/remove @llvm.assume
calls from the cache in response to registration calls and ValueHandle
callbacks.

There are no new direct test cases for this pass, but because it calls it
validation function upon module finalization, we'll pick up detectable
inconsistencies from the other tests that touch @llvm.assume calls.

This pass will be used by follow-up commits that make use of @llvm.assume.

llvm-svn: 217334
2014-09-07 12:44:26 +00:00
David Majnemer 6fe6ea740c InstCombine: Remove a special case pattern
The special case did not work when run under -reassociate and can easily
be expressed by a further generalization of an existing pattern.

llvm-svn: 217227
2014-09-05 06:09:24 +00:00
James Molloy 6b95d8ed36 Enable noalias metadata by default and swap the order of the SLP and Loop vectorizers by default.
After some time maturing, hopefully the flags themselves will be removed.

llvm-svn: 217144
2014-09-04 13:23:08 +00:00
Tilmann Scheller faabbb5fb6 [GVN] Format variable name.
Local variables need to start with an upper case letter.

llvm-svn: 217133
2014-09-04 06:38:00 +00:00
David Majnemer 13046deef3 IndVarSimplify: Address review comments for r217102
No functional change intended, just some cleanups and comments added.

llvm-svn: 217115
2014-09-04 00:23:13 +00:00
Kostya Serebryany 3175521844 [asan] fix debug info produced for asan-coverage=2
llvm-svn: 217106
2014-09-03 23:24:18 +00:00
David Majnemer c6ab01ecca IndVarSimplify: Don't let LFTR compare against a poison value
LinearFunctionTestReplace tries to use the *next* indvar to compare
against when possible.  However, it may be the case that the calculation
for the next indvar has NUW/NSW flags and that it may only be safely
used inside the loop.  Using it in a comparison to calculate the exit
condition could result in observing poison.

This fixes PR20680.

Differential Revision: http://reviews.llvm.org/D5174

llvm-svn: 217102
2014-09-03 23:03:18 +00:00
Kostya Serebryany 351b078b6d [asan] add -asan-coverage=3: instrument all blocks and critical edges.
llvm-svn: 217098
2014-09-03 22:37:37 +00:00
Benjamin Kramer 89854ebe8e Make some helpers static or move into the llvm namespace.
llvm-svn: 217077
2014-09-03 21:04:12 +00:00
Sanjay Patel 9433a28845 Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802).
The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math)
when converting scalar instructions into vectors. But this isn't a simple copy - we need to take
the intersection (the logical 'and') of the sets of flags on the scalars.

The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after:
http://reviews.llvm.org/D4015
http://llvm.org/viewvc/llvm-project?view=revision&revision=211339

The vast majority of changed files are existing tests that were not propagating IR flags, but I've
also added a new test file for focused testing of IR flag possibilities.

Differential Revision: http://reviews.llvm.org/D5172

llvm-svn: 217051
2014-09-03 17:40:30 +00:00
Sanjay Patel a982d992f0 Change name of copyFlags() to copyIRFlags(). Add convenience method for logical 'and' of all flags. NFC.
Adding 'IR' to the names in an attempt to be less ambiguous about the flags we're dealing with here.

The 'and' method is needed by the SLPVectorizer (PR20802) and possibly other passes.

llvm-svn: 217004
2014-09-03 01:06:50 +00:00
Hal Finkel 445dda5c4a Add pass-manager flags to use CFL AA
Add -use-cfl-aa (and -use-cfl-aa-in-codegen) to add CFL AA in the default pass
managers (for easy testing).

llvm-svn: 216978
2014-09-02 22:12:54 +00:00
Kostya Serebryany ad23852ac3 [asan] Assign a low branch weight to ASan's slow path, patch by Jonas Wagner. This speeds up asan (at least on SPEC) by 1%-5% or more. Also fix lint in dfsan.
llvm-svn: 216972
2014-09-02 21:46:51 +00:00
Yi Jiang 77a609b556 Generate extract for in-tree uses if the use is scalar operand in vectorized instruction. radar://18144665
llvm-svn: 216946
2014-09-02 21:00:39 +00:00
David Blaikie 15913f46b2 unique_ptrify the result of SpecialCaseList::create
llvm-svn: 216925
2014-09-02 18:13:54 +00:00
David Majnemer 49428105aa LICM: Don't crash when an instruction is used by an unreachable BB
Summary:
BBs might contain non-LCSSA'd values after the LCSSA pass is run if they
are unreachable from the entry block.

Normally, the users of the instruction would be PHIs but the unreachable
BBs have normal users; rewrite their uses to be undef values.

An alternative fix could involve fixing this at LCSSA but that would
require this invariant to hold after subsequent transforms.  If a BB
created an unreachable block, they would be in violation of this.

This fixes PR19798.

Differential Revision: http://reviews.llvm.org/D5146

llvm-svn: 216911
2014-09-02 16:22:00 +00:00
David Majnemer d4cffcf073 SROA: Don't insert instructions before a PHI
SROA may decide that it needs to insert a bitcast and would set it's
insertion point before a PHI.  This will create an invalid module
right quick.

Instead, choose the first insertion point in the basic block that holds
our PHI.

This fixes PR20822.

Differential Revision: http://reviews.llvm.org/D5141

llvm-svn: 216891
2014-09-01 21:20:14 +00:00
David Majnemer d2df50196f Revert "Revert two GEP-related InstCombine commits"
This reverts commit r216698 which reverted r216523 and r216598.

We would attempt to perform the transformation even if the match()
failed because, as a side effect, it would set V.  This would trick us
into believing that we correctly found a place to correctly apply the
transform.

An additional test case was added to getelementptr.ll so that we might
not regress in the future.

llvm-svn: 216890
2014-09-01 21:10:02 +00:00
Sanjay Patel 5ad239e15a Add a convenience method to copy wrapping, exact, and fast-math flags (NFC).
The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions.
This patch adds a convenience method to make that operation easier because we need to do this
in the loop vectorizer, SLP vectorizer, and possibly other places.

Although this is a 'no functional change' patch, I've added a testcase to verify that the exact
flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked
in existing testcases.

Differential Revision: http://reviews.llvm.org/D5138

llvm-svn: 216886
2014-09-01 18:44:57 +00:00
Chandler Carruth 18cee1defc Fix a really bad miscompile introduced in r216865 - the else-if logic
chain became completely broken here as *all* intrinsic users ended up
being skipped, and the ones that seemed to be singled out were actually
the exact wrong set.

This is a great example of why long else-if chains can be easily
confusing. Switch the entire code to use early exits and early continues
to have simpler (and more importantly, correct) logic here, as well as
fixing the reversed logic for detecting and continuing on lifetime
intrinsics.

I've also significantly cleaned up the test case and added another test
case demonstrating an example where the optimization is not (trivially)
safe to perform.

llvm-svn: 216871
2014-09-01 10:09:18 +00:00
Renato Golin 86a6c3f269 Small refactor on VectorizerHint for deduplication
Previously, the hint mechanism relied on clean up passes to remove redundant
metadata, which still showed up if running opt at low levels of optimization.
That also has shown that multiple nodes of the same type, but with different
values could still coexist, even if temporary, and cause confusion if the
next pass got the wrong value.

This patch makes sure that, if metadata already exists in a loop, the hint
mechanism will never append a new node, but always replace the existing one.
It also enhances the algorithm to cope with more metadata types in the future
by just adding a new type, not a lot of code.

Re-applying again due to MSVC 2013 being minimum requirement, and this patch
having C++11 that MSVC 2012 didn't support.

Fixes PR20655.

llvm-svn: 216870
2014-09-01 10:00:17 +00:00
Hal Finkel 0c083024f0 Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata
This feeds AA through the IFI structure into the inliner so that
AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which
functions only access their arguments (instead of just hard-coding some
knowledge of memory intrinsics). Most of the information is only available from
BasicAA; this is important for preserving alias scoping information for
target-specific intrinsics when doing the noalias parameter attribute to
metadata conversion.

llvm-svn: 216866
2014-09-01 09:01:39 +00:00
Nick Lewycky fc243d54d2 Ignore lifetime intrinsics in use list for MemCpyOptimizer. Patch by Luqman Aden, review by Hal Finkel.
llvm-svn: 216865
2014-09-01 06:03:11 +00:00
Hal Finkel cbb85f249e Fix AddAliasScopeMetadata again - alias.scope must be a complete description
I thought that I had fixed this problem in r216818, but I did not do a very
good job. The underlying issue is that when we add alias.scope metadata we are
asserting that this metadata completely describes the aliasing relationships
within the current aliasing scope domain, and so in the context of translating
noalias argument attributes, the pointers must all be based on noalias
arguments (as underlying objects) and have no other kind of underlying object.
In r216818 excluding appropriate accesses from getting alias.scope metadata is
done by looking for underlying objects that are not identified function-local
objects -- but that's wrong because allocas, etc. are also function-local
objects and we need to explicitly check that all underlying objects are the
noalias arguments for which we're adding metadata aliasing scopes.

This fixes the underlying-object check for adding alias.scope metadata, and
does some refactoring of the related capture-checking eligibility logic (and
adds more comments; hopefully making everything a bit clearer).

Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the
feature is still disabled by default).

llvm-svn: 216863
2014-09-01 04:26:40 +00:00
Craig Topper 6dc4a8bc2c Fix some cases where StringRef was being passed by const reference. Remove const from some other StringRefs since its implicitly const already.
llvm-svn: 216820
2014-08-30 16:48:02 +00:00
Hal Finkel a3708df41a Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointers
The previous implementation of AddAliasScopeMetadata, which adds noalias
metadata to preserve noalias parameter attribute information when inlining had
a flaw: it would add alias.scope metadata to accesses which might have been
derived from pointers other than noalias function parameters. This was
incorrect because even some access known not to alias with all noalias function
parameters could easily alias with an access derived from some other pointer.
Instead, when deriving from some unknown pointer, we cannot add alias.scope
metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4.
Furthermore, we cannot add alias.scope to functions unless we know they
access only argument-derived pointers (currently, we know this only for
memory intrinsics).

Also, we fix a theoretical problem with using the NoCapture attribute to skip
the capture check. This is incorrect (as explained in the comment added), but
would not matter in any code generated by Clang because we get only inferred
nocapture attributes in Clang-generated IR.

This functionality is not yet enabled by default.

llvm-svn: 216818
2014-08-30 12:48:33 +00:00
David Majnemer 492e612e01 InstCombine: Respect recursion depth in visitUDivOperand
llvm-svn: 216817
2014-08-30 09:19:05 +00:00
David Majnemer 5e96f1b4c8 InstCombine: Try harder to combine icmp instructions
consider: (and (icmp X, Y), (and Z, (icmp A, B)))
It may be possible to combine (icmp X, Y) with (icmp A, B).
If we successfully combine, create an 'and' instruction with Z.

This fixes PR20814.

N.B. There is room for improvement after this change but I'm not
convinced it's worth chasing yet.

llvm-svn: 216814
2014-08-30 06:18:20 +00:00
Hal Finkel 2d3d6da44b Fix a typo in AddAliasScopeMetadata
llvm-svn: 216741
2014-08-29 16:33:41 +00:00
David Majnemer 400e725bde Revert two GEP-related InstCombine commits
This reverts commit r216523 and r216598; people have reported
regressions.

llvm-svn: 216698
2014-08-29 00:06:43 +00:00
Reid Kleckner febb279c9c Don't promote byval pointer arguments when padding matters
Don't promote byval pointer arguments when when their size in bits is
not equal to their alloc size in bits. This can happen for x86_fp80,
where the size in bits is 80 but the alloca size in bits in 128.
Promoting these types can break passing unions of x86_fp80s and other
types.

Patch by Thomas Jablin!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D5057

llvm-svn: 216693
2014-08-28 22:42:00 +00:00
David Majnemer 074052b623 InstCombine: Remove redundant combines
InstSimplify already handles icmp (X+Y), X (and things like it)
appropriately.  The first thing that InstCombine does is run
InstSimplify on the instruction.

llvm-svn: 216659
2014-08-28 10:08:37 +00:00
Erik Eckstein 8354cfaf95 Fix: SLPVectorizer tried to move an instruction which was replaced by a vector instruction.
For a detailed description of the problem see the comment in the test file.
The problematic moveBefore() calls are not required anymore because the new
scheduling algorithm ensures a correct ordering anyway.

llvm-svn: 216656
2014-08-28 07:04:02 +00:00
David Majnemer 76d06bc613 InstSimplify: Move a transform from InstCombine to InstSimplify
Several combines involving icmp (shl C2, %X) C1 can be simplified
without introducing any new instructions.  Move them to InstSimplify;
while we are at it, make them more powerful.

llvm-svn: 216642
2014-08-28 03:34:28 +00:00
David Majnemer 22ccfc4484 InstCombine: Combine gep X, (Y-X) to Y
We try to perform this transform in InstSimplify but we aren't always
able to.  Sometimes, we need to insert a bitcast if X and Y don't have
the same time.

llvm-svn: 216598
2014-08-27 20:08:37 +00:00
Michael Zolotukhin 5dc466b863 [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix).
llvm-svn: 216549
2014-08-27 15:01:18 +00:00
Craig Topper e1d1294853 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
Craig Topper 3af9722529 Fix some cases were ArrayRefs were being passed by reference. Also remove 'const' from some other ArrayRef uses since its implicitly const already.
llvm-svn: 216524
2014-08-27 05:25:00 +00:00
David Majnemer 54e97d5dc0 InstCombine: Optimize GEP's involving ptrtoint better
We supported transforming:
(gep i8* X, -(ptrtoint Y))

to:
(inttoptr (sub (ptrtoint X), (ptrtoint Y)))

However, this only fired if 'X' had type i8*.  Generalize this to
support various types of different sizes.  This results in much better
CodeGen, especially for pointers to packed structs.

llvm-svn: 216523
2014-08-27 05:16:04 +00:00
Joerg Sonnenberger cb5674b9c2 Revert r210342 and r210343, add test case for the crasher.
PR 20642.

llvm-svn: 216475
2014-08-26 19:06:41 +00:00
Dinesh Dwivedi 4919bbe29d This patch enables SimplifyUsingDistributiveLaws() to handle following pattens.
(X >> Z) & (Y >> Z)  -> (X&Y) >> Z  for all shifts.
(X >> Z) | (Y >> Z)  -> (X|Y) >> Z  for all shifts.
(X >> Z) ^ (Y >> Z)  -> (X^Y) >> Z  for all shifts.

These patterns were previously handled separately in visitAnd()/visitOr()/visitXor().

Differential Revision: http://reviews.llvm.org/D4951

llvm-svn: 216443
2014-08-26 08:53:32 +00:00
Reid Kleckner 3715461b48 musttail: Don't eliminate varargs packs if there is a forwarding call
Also clean up and beef up this grep test for the feature.

llvm-svn: 216425
2014-08-26 00:59:51 +00:00
Sanjay Patel 4e31cdabd1 fix typos in comments
llvm-svn: 216424
2014-08-26 00:59:15 +00:00
Reid Kleckner e6e88f99b3 ArgPromotion: Don't touch variadic functions
Adding, removing, or changing non-pack parameters can change the ABI
classification of pack parameters. Clang and other frontends encode the
classification in the IR of the call site, but the callee side
determines it dynamically based on the number of registers consumed so
far. Changing the prototype affects the number of registers consumed
would break such code.

Dead argument elimination performs a similar task and already has a
similar check to avoid this problem.

Patch by Thomas Jablin!

llvm-svn: 216421
2014-08-25 23:58:48 +00:00
Rafael Espindola 3fd1e9933f Modernize raw_fd_ostream's constructor a bit.
Take a StringRef instead of a "const char *".
Take a "std::error_code &" instead of a "std::string &" for error.

A create static method would be even better, but this patch is already a bit too
big.

llvm-svn: 216393
2014-08-25 18:16:47 +00:00
Bruno Cardoso Lopes e2a1fa35df Remove dangling initializers in GlobalDCE
GlobalDCE deletes global vars and updates their initializers to nullptr
while leaving underlying constants to be cleaned up later by its uses.
The clean up may never happen, fix this by forcing it every time it's
safe to destroy constants.

Final patch by Rafael Espindola
http://reviews.llvm.org/D4931

<rdar://problem/17523868>

llvm-svn: 216390
2014-08-25 17:51:14 +00:00
Stepan Dyatkovskiy c90308bf83 MergeFunctions, tiny refactoring:
cmpAPFloat has been renamed to cmpAPFloats (multiple form).

llvm-svn: 216376
2014-08-25 08:22:46 +00:00
Stepan Dyatkovskiy 7f895c1184 MergeFunctions, tiny refactoring:
cmpAPInt has been renamed to cmpAPInts (multiple form).

llvm-svn: 216375
2014-08-25 08:19:50 +00:00
Stepan Dyatkovskiy 0b765dee6e MergeFunctions, tiny refactoring:
cmpType has been renamed to cmpTypes (multiple form).

llvm-svn: 216374
2014-08-25 08:16:39 +00:00
Stepan Dyatkovskiy 016daddc52 MergeFunctions, tiny refactoring:
cmpGEP has been renamed to cmpGEPs (multiple form).

llvm-svn: 216373
2014-08-25 08:12:45 +00:00
Karthik Bhat 7f33ff7dea Allow vectorization of division by uniform power of 2.
This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible.
Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)

llvm-svn: 216371
2014-08-25 04:56:54 +00:00
Craig Topper 4627679cec Use range based for loops to avoid needing to re-mention SmallPtrSet size.
llvm-svn: 216351
2014-08-24 23:23:06 +00:00
David Majnemer 0ffccf7fb5 InstCombine: Properly optimize or'ing bittests together
CFE, with -03, would turn:
bool f(unsigned x) {
  bool a = x & 1;
  bool b = x & 2;
  return a | b;
}

into:
  %1 = lshr i32 %x, 1
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

This sort of thing exposes a nasty pathology in GCC, ICC and LLVM.

Instead, we would rather want:
  %1 = and i32 %x, 3
  %2 = icmp ne i32 %1, 0

Things get a bit more interesting in the following case:
  %1 = lshr i32 %x, %y
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

Replacing it with the following sequence is better:
  %1 = shl nuw i32 1, %y
  %2 = or i32 %1, 1
  %3 = and i32 %2, %x
  %4 = icmp ne i32 %3, 0

This sequence is preferable because %1 doesn't involve %x and could
potentially be hoisted out of loops if it is invariant; only perform
this transform in the non-constant case if we know we won't increase
register pressure.

llvm-svn: 216343
2014-08-24 09:10:57 +00:00
Jingyue Wu ec33fa9aca [SROA] Fold a PHI node if all its incoming values are the same
Summary:
Fixes PR20425.

During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.

Test Plan: Added three more tests in phi-and-select.ll

Reviewers: nlewycky, eliben, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits

Differential Revision: http://reviews.llvm.org/D4659

llvm-svn: 216299
2014-08-22 22:45:57 +00:00
David Majnemer 49775e0173 InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants
Consider:
  %add = add nuw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nuw' from the
instruction.

llvm-svn: 216273
2014-08-22 17:11:04 +00:00
David Majnemer 0e6c986696 InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN
We can preserve nsw during this transform if -C won't overflow.

llvm-svn: 216269
2014-08-22 16:41:23 +00:00
David Majnemer 42b83a5e36 InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants
Consider:
  %add = add nsw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nsw' from the
instruction.

This fixes PR20377.

llvm-svn: 216261
2014-08-22 07:56:32 +00:00
Erik Eckstein b49d7abb7b fix: SLPVectorizer crashes for unreachable blocks containing not schedulable instructions.
In unreachable blocks it's legal to have instructions like "%x = op %x".
Such instuctions are not schedulable. Therefore the SLPVectorizer has to check for
unreachable blocks and ignore them.

Fixes bug 20646.

llvm-svn: 216256
2014-08-22 01:18:39 +00:00
Peter Collingbourne fab565a56b [dfsan] Fix non-determinism bug in non-zero label check annotator.
We now use a std::vector instead of a DenseSet to store the list of
label checks so that we can iterate over it deterministically.

llvm-svn: 216255
2014-08-22 01:18:18 +00:00
Reid Kleckner c36f48f08a SROA: Handle a case of store size being smaller than allocation size
In this case, we are creating an x86_fp80 slice for a union from C where
the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes,
and that's just fine. We can't, however, use regular loads and stores to
access the slice, because the store size is only 10 bytes / 80 bits.
Instead, use memcpy and memset.

Fixes PR18726.

Reviewed By: chandlerc

Differential Revision: http://reviews.llvm.org/D5012

llvm-svn: 216248
2014-08-22 00:09:56 +00:00
David Blaikie 2f3f76fdb1 Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks.
Somewhat unnoticed in the original implementation of discriminators, but
it could cause instructions to end up in new, small,
DW_TAG_lexical_blocks due to the use of DILexicalBlock to track
discriminator changes.

Instead, use DILexicalBlockFile which we already use to track file
changes without introducing new scopes, so it works well to track
discriminator changes in the same way.

llvm-svn: 216239
2014-08-21 22:45:21 +00:00
Rafael Espindola 7cebf36a95 Move some logic to populateLTOPassManager.
This will avoid code duplication in the next commit which calls it directly
from the gold plugin.

llvm-svn: 216211
2014-08-21 20:03:44 +00:00
Rafael Espindola 216e0c0617 Respect LibraryInfo in populateLTOPassManager and use it. NFC.
llvm-svn: 216203
2014-08-21 18:49:52 +00:00
Rafael Espindola e07caad9e7 Handle inlining in populateLTOPassManager like in populateModulePassManager.
No functionality change.

llvm-svn: 216178
2014-08-21 13:35:30 +00:00
Zinovy Nis 33406da5f4 [CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing.
llvm-svn: 216176
2014-08-21 13:30:05 +00:00
Rafael Espindola 208bc533cd Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder.
llvm-svn: 216174
2014-08-21 13:13:17 +00:00
Erik Verbruggen 2b98bd2a80 Reassociate x + -0.1234 * y into x - 0.1234 * y
This does not require -ffast-math, and it gives CSE/GVN more options to
eliminate duplicate expressions in, e.g.:

  return ((x + 0.1234 * y) * (x - 0.1234 * y));

Differential Revision: http://reviews.llvm.org/D4904

llvm-svn: 216169
2014-08-21 10:45:30 +00:00
Zinovy Nis 0a36cba29d [INDVARS] Extend using of widening of induction variables for the cases of "sub nsw" and "mul nsw" instructions.
Currently only "add nsw" are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like:

int N = 100;
float **A;

void foo(int x0, int x1)
{
        float * A_cur = &A[0][0];
        float * A_next = &A[1][0];
        for(int x = x0; x < x1; ++x).
        {
          // Currently only [x+N] case is widened. Others 2 cases lead to sext.
          // This patch fixes it, so all 3 cases do not need sext.
          const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
          A_next[x] = div;
        }
}
...
> clang++ test.cpp -march=core-avx2 -Ofast  -fno-unroll-loops -fno-tree-vectorize -S -o -

Differential Revision: http://reviews.llvm.org/D4695

llvm-svn: 216160
2014-08-21 08:25:45 +00:00
Craig Topper 71b7b68b74 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 216158
2014-08-21 05:55:13 +00:00
David Majnemer 5d1aeba2ea InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1
Adapted from a patch by Richard Smith, test-case written by me.

llvm-svn: 216157
2014-08-21 05:14:48 +00:00
James Molloy 82c995d450 [LoopVectorizer] Limit unroll factor in the presence of nested reductions.
If we have a scalar reduction, we can increase the critical path length if the loop we're unrolling is inside another loop. Limit, by default to 2, so the critical path only gets increased by one reduction operation.

llvm-svn: 216140
2014-08-20 23:53:52 +00:00
Yi Jiang 1a4e73d7bf New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain condition
llvm-svn: 216135
2014-08-20 22:55:40 +00:00
David Majnemer 42158f3eea InstCombine: Annotate sub with nuw when we prove it's safe
We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is
negative and the right-hand side is non-negative.

llvm-svn: 216045
2014-08-20 07:17:31 +00:00
Peter Collingbourne f39430bd4a [dfsan] Treat vararg custom functions like unimplemented functions.
Because declarations of these functions can appear in places like autoconf
checks, they have to be handled somehow, even though we do not support
vararg custom functions. We do so by printing a warning and calling the
uninstrumented function, as we do for unimplemented functions.

llvm-svn: 216042
2014-08-20 01:40:23 +00:00
David Majnemer 57d5bc8849 InstCombine: Annotate sub with nsw when we prove it's safe
We can prove that a 'sub' can be a 'sub nsw' under certain conditions:
- The sign bits of the operands is the same.
- Both operands have more than 1 sign bit.

The subtraction cannot be a signed overflow in either case.

llvm-svn: 216037
2014-08-19 23:36:30 +00:00
Renato Golin 06d601fb3e Revert "Small refactor on VectorizerHint for deduplication"
This reverts commit r215994 because MSVC 2012 can't cope with its C++11 goodness.

llvm-svn: 215999
2014-08-19 18:08:50 +00:00
Renato Golin dd6394d833 Small refactor on VectorizerHint for deduplication
Previously, the hint mechanism relied on clean up passes to remove redundant
metadata, which still showed up if running opt at low levels of optimization.
That also has shown that multiple nodes of the same type, but with different
values could still coexist, even if temporary, and cause confusion if the
next pass got the wrong value.

This patch makes sure that, if metadata already exists in a loop, the hint
mechanism will never append a new node, but always replace the existing one.
It also enhances the algorithm to cope with more metadata types in the future
by just adding a new type, not a lot of code.

llvm-svn: 215994
2014-08-19 17:30:43 +00:00
Mayur Pandey 960507beb4 InstCombine: ((A & ~B) ^ (~A & B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.

Differential Revision: http://reviews.llvm.org/D4898

llvm-svn: 215974
2014-08-19 08:19:19 +00:00
Craig Topper 97ebe53032 Const-correct and prevent a copy of a SmallPtrSet.
llvm-svn: 215973
2014-08-19 07:44:27 +00:00
Mayur Pandey 75b76c6a92 test commit (spelling correction)
llvm-svn: 215970
2014-08-19 06:41:55 +00:00
Craig Topper 6230691c91 Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.

llvm-svn: 215870
2014-08-18 00:24:38 +00:00
Craig Topper 5229cfd163 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 215868
2014-08-17 23:47:00 +00:00
Owen Anderson a4428aa484 Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set.
While this might seem like an obvious canonicalization, there is one subtle problem with it.  The result of the original expression
is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN.  This means that the
new expression is strictly more defined than the original one.  One unfortunate consequence of this is that the transform is not reversible!
It's always legal to make increase the defined-ness of an expression, but it's not legal to reduce it.  Thus, targets that prefer the original
form of the expression cannot reverse the transform to recover it.  Another way to think of it is that the transform has lost source-level
information (the fast math flags), which is undesirable.

llvm-svn: 215825
2014-08-17 03:51:29 +00:00
David Majnemer 1a0bbc8a5c InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C)
While *most* (X sdiv 1) operations will get caught by InstSimplify, it
is still possible for a sdiv to appear in the worklist which hasn't been
simplified yet.

This means that it is possible for 0 - (X sdiv 1) to get transformed
into (X sdiv -1); dividing by -1 can make the transform produce undef
values instead of the proper result.

Sorry for the lack of testcase, it's a bit problematic because it relies
on the exact order of operations in the worklist.

llvm-svn: 215818
2014-08-16 09:23:42 +00:00
David Majnemer f9a095d606 InstCombine: Combine mul with div.
We can combne a mul with a div if one of the operands is a multiple of
the other:

%mul = mul nsw nuw %a, C1
%ret = udiv %mul, C2
  =>
%ret = mul nsw %a, (C1 / C2)

This can expose further optimization opportunities if we end up
multiplying or dividing by a power of 2.

Consider this small example:

define i32 @f(i32 %a) {
  %mul = mul nuw i32 %a, 14
  %div = udiv exact i32 %mul, 7
  ret i32 %div
}

which gets CodeGen'd to:

    imull       $14, %edi, %eax
    imulq       $613566757, %rax, %rcx
    shrq        $32, %rcx
    subl        %ecx, %eax
    shrl        %eax
    addl        %ecx, %eax
    shrl        $2, %eax
    retq

We can now transform this into:
define i32 @f(i32 %a) {
  %shl = shl nuw i32 %a, 1
  ret i32 %shl
}

which gets CodeGen'd to:

    leal        (%rdi,%rdi), %eax
    retq

This fixes PR20681.

llvm-svn: 215815
2014-08-16 08:55:06 +00:00
Rafael Espindola ea46c32f81 Introduce a helper to combine instruction metadata.
Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use
it.

Patch by Björn Steinbrink!

llvm-svn: 215723
2014-08-15 15:46:38 +00:00
Hal Finkel 61c386126b Copy noalias metadata from call sites to inlined instructions
When a call site with noalias metadata is inlined, that metadata can be
propagated directly to the inlined instructions (only those that might access
memory because it is not useful on the others). Prior to inlining, the noalias
metadata could express that a call would not alias with some other memory
access, which implies that no instruction within that called function would
alias. By propagating the metadata to the inlined instructions, we preserve
that knowledge.

This should complete the enhancements requested in PR20500.

llvm-svn: 215676
2014-08-14 21:09:37 +00:00
Hal Finkel d2dee16c27 Add noalias metadata for general calls (not just memory intrinsics) during inlining
When preserving noalias function parameter attributes by adding noalias
metadata in the inliner, we should do this for general function calls (not just
memory intrinsics). The logic is very similar to what already existed (except
that we want to add this metadata even for functions taking no relevant
parameters). This metadata can be used by ModRef queries in the caller after
inlining.

This addresses the first part of PR20500. Adding noalias metadata during
inlining is still turned off by default.

llvm-svn: 215657
2014-08-14 16:44:03 +00:00
Chad Rosier 11ab941644 [Reassociation] Add support for reassociation with unsafe algebra.
Vector instructions are (still) not supported for either integer or floating
point.  Hopefully, that work will be landed shortly.

llvm-svn: 215647
2014-08-14 15:23:01 +00:00
David Majnemer 698dca0b95 InstCombine: ((A | ~B) ^ (~A | B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A | ~B),(~A |B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.

Patch by Mayur Pandey!

Differential Revision: http://reviews.llvm.org/D4883

llvm-svn: 215621
2014-08-14 06:46:25 +00:00
David Majnemer f1eda23514 Added InstCombine Transform for ((B | C) & A) | B -> B | (A & C)
Transform ((B | C) & A) | B --> B | (A & C)

Z3 Link: http://rise4fun.com/Z3/hP6p

Patch by Sonam Kumari!

Differential Revision: http://reviews.llvm.org/D4865

llvm-svn: 215619
2014-08-14 06:41:38 +00:00
Jan Vesely 0cd3ec6cfa utils: Fix segfault in flattencfg
v2: continue iterating through the rest of the bb
    use for loop

v3: initialize FlattenCFG pass in ScalarOps
    add test

v4: split off initializing flattencfg to a separate patch
    add comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215574
2014-08-13 20:31:53 +00:00
Jan Vesely 5a956d49f7 Initialize FlattenCFG pass
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215573
2014-08-13 20:31:52 +00:00
Benjamin Kramer a7c40ef022 Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)

Changes made by clang-tidy with minor tweaks.

llvm-svn: 215558
2014-08-13 16:26:38 +00:00
Chandler Carruth 0fb998110a [optnone] Make the optnone attribute effective at suppressing function
attribute and function argument attribute synthesizing and propagating.

As with the other uses of this attribute, the goal remains a best-effort
(no guarantees) attempt to not optimize the function or assume things
about the function when optimizing. This is particularly useful for
compiler testing, bisecting miscompiles, triaging things, etc. I was
hitting specific issues using optnone to isolate test code from a test
driver for my fuzz testing, and this is one step of fixing that.

llvm-svn: 215538
2014-08-13 10:49:33 +00:00
Chandler Carruth 3f92ecc2a0 Revert r215415 which causse MSan to crash on a great deal of C++ code.
I've followed up on the original commit as well.

llvm-svn: 215532
2014-08-13 09:19:39 +00:00
Karthik Bhat a4a4db91be InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (add %a, %b)
Correctness proof of the transform using CVC3-

$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR(A | B, BVXOR(A,B) ) = A & B;

$ cvc3 t.cvc
Valid.

llvm-svn: 215524
2014-08-13 05:13:14 +00:00
Matt Arsenault 4815f09bbe Allwo bitcast + struct GEP transform to work with addrspacecast
llvm-svn: 215467
2014-08-12 19:46:13 +00:00
Reid Kleckner 3ae6e1528a msan: Handle musttail calls
First, avoid calling setTailCall(false) on musttail calls.  The funciton
prototypes should be "congruent", so the shadow layout should be exactly
the same.

Second, avoid inserting instrumentation after a musttail call to
propagate the return value shadow.  We don't need to propagate the
result of a tail call, it should already be in the right place.

Reviewed By: eugenis

Differential Revision: http://reviews.llvm.org/D4331

llvm-svn: 215415
2014-08-12 00:12:43 +00:00
Reid Kleckner e31acf239a Move helper for getting a terminating musttail call to BasicBlock
No functional change.  To be used in future commits that need to look
for such instructions.

Reviewed By: rafael

Differential Revision: http://reviews.llvm.org/D4504

llvm-svn: 215413
2014-08-12 00:05:15 +00:00
David Majnemer ab07f00c64 InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b)
What follows bellow is a correctness proof of the transform using CVC3.

$ < t.cvc
A, B : BITVECTOR(32);

QUERY BVPLUS(32, A & B, A | B) = BVPLUS(32, A, B);

$ cvc3 < t.cvc
Valid.

llvm-svn: 215400
2014-08-11 22:32:02 +00:00
James Molloy 65b08f5e46 [LoopVectorizer] Enable support for floating-point subtraction reductions
llvm-svn: 215200
2014-08-08 12:41:08 +00:00
David Majnemer fe8c7540b0 GlobalOpt: Optimize in the face of insertvalue/extractvalue
GlobalOpt didn't know how to simulate InsertValueInst or
ExtractValueInst.  Optimizing these is pretty straightforward.

N.B. This came up when looking at clang's IRGen for MS ABI member
pointers; they are represented as aggregates.

llvm-svn: 215184
2014-08-08 05:50:43 +00:00
Gerolf Hoflehner ea96a3d336 Fix for multi-line comment warning
llvm-svn: 215169
2014-08-07 23:19:55 +00:00
Arnold Schwaighofer 4fb3c47456 SLPVectorizer: Use the type of the value loaded/stored to get the ABI alignment
We were using the pointer type which is incorrect.

llvm-svn: 215162
2014-08-07 22:47:27 +00:00
Owen Anderson 6c19ab1b5d Fix a case in SROA where lifetime intrinsics could inhibit alloca promotion. In
this case, the code path dealing with vector promotion was missing the explicit
checks for lifetime intrinsics that were present on the corresponding integer
promotion path.

llvm-svn: 215148
2014-08-07 21:07:35 +00:00
Rui Ueyama c487f7728e Revert "r214897 - Remove dead zero store to calloc initialized memory"
It broke msan.

llvm-svn: 214989
2014-08-06 19:30:38 +00:00
James Molloy 568da0990e Add a new option -run-slp-after-loop-vectorization.
This swaps the order of the loop vectorizer and the SLP/BB vectorizers. It is disabled by default so we can do performance testing - ideally we want to change to having the loop vectorizer running first, and the SLP vectorizer using its leftovers instead of the other way around.

llvm-svn: 214963
2014-08-06 12:56:19 +00:00
Peter Collingbourne df240b252a [dfsan] Try not to create too many additional basic blocks in functions which
already have a large number of blocks. Works around a performance issue with
the greedy register allocator.

llvm-svn: 214944
2014-08-06 00:33:40 +00:00
JF Bastien ac8b66b32c Fix typos in comments and doc
Committing http://reviews.llvm.org/D4798 for Robin Morisset (morisset@google.com)

llvm-svn: 214934
2014-08-05 23:27:34 +00:00
Rafael Espindola f9e52cf015 Don't internalize all but main by default.
This is mostly a cleanup, but it changes a fairly old behavior.

Every "real" LTO user was already disabling the silly internalize pass
and creating the internalize pass itself. The difference with this
patch is for "opt -std-link-opts" and the C api.

Now to get a usable behavior out of opt one doesn't need the funny
looking command line:

opt -internalize -disable-internalize -internalize-public-api-list=foo,bar -std-link-opts

llvm-svn: 214919
2014-08-05 20:10:38 +00:00
Philip Reames 00c9b6461f Remove dead zero store to calloc initialized memory
Optimize the following IR:

%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This store is dead and should be removed
store i32 0, i32* %2, align 4

Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store.  If the store is to an out of bounds address, it is undefined and thus also removable.

Reviewed By: nicholas

Differential Revision: http://reviews.llvm.org/D3942

llvm-svn: 214897
2014-08-05 17:48:20 +00:00
James Molloy 2b8933c354 Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost.
Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.

llvm-svn: 214859
2014-08-05 12:30:34 +00:00
Manman Ren 062f58d550 [SimplifyCFG] fix accessing deleted PHINodes in switch-to-table conversion.
When we have a covered lookup table, make sure we don't delete PHINodes that
are cached in PHIs.

rdar://17887153

llvm-svn: 214642
2014-08-02 23:41:54 +00:00
Erik Eckstein 26a1bf7d84 fix bug 20513 - Crash in SLP Vectorizer
llvm-svn: 214638
2014-08-02 19:39:42 +00:00
Alexey Samsonov d9ad5cec0c [ASan] Use metadata to pass source-level information from Clang to ASan.
Instead of creating global variables for source locations and global names,
just create metadata nodes and strings. They will be transformed into actual
globals in the instrumentation pass (if necessary). This approach is more
flexible:
1) we don't have to ensure that our custom globals survive all the optimizations
2) if globals are discarded for some reason, we will simply ignore metadata for them
   and won't have to erase corresponding globals
3) metadata for source locations can be reused for other purposes: e.g. we may
   attach source location metadata to alloca instructions and provide better descriptions
   for stack variables in ASan error reports.

No functionality change.

llvm-svn: 214604
2014-08-02 00:35:50 +00:00
Tyler Nowicki 064896bbc5 Add diagnostics to the vectorizer cost model.
When the cost model determines vectorization is not possible/profitable these remarks print an analysis of that decision.

Note that in selectVectorizationFactor() we can assume that OptForSize and ForceVectorization are mutually exclusive.

Reviewed by Arnold Schwaighofer

llvm-svn: 214599
2014-08-02 00:14:03 +00:00
Peter Collingbourne e52646cd80 PartiallyInlineLibCalls: Check sqrt result type before transforming it.
Some configure scripts declare this with the wrong prototype, which can lead
to an assertion failure.

llvm-svn: 214593
2014-08-01 23:21:21 +00:00
Peter Collingbourne 142fdff0d5 [dfsan] Correctly handle loads and stores of zero size.
llvm-svn: 214561
2014-08-01 21:18:18 +00:00
Rafael Espindola 3f6481d0d3 Remove some calls to std::move.
Instead of moving out the data in a ErrorOr<std::unique_ptr<Foo>>, get
a reference to it.

Thanks to David Blaikie for the suggestion.

llvm-svn: 214516
2014-08-01 14:31:55 +00:00
Erik Eckstein 690dd037d9 SLPVectorizer: fix build problem in Release configuration
llvm-svn: 214496
2014-08-01 09:47:38 +00:00
Erik Eckstein c80e1dc081 SLPVectorizer: improved scheduling algorithm.
llvm-svn: 214494
2014-08-01 09:20:42 +00:00
Erik Eckstein f16a808292 SLP Vectorizer: added statistics counter
llvm-svn: 214487
2014-08-01 08:14:28 +00:00
Erik Eckstein 4944b2ff94 SLP Vectorizer: improve canonicalize tree operands of commutitive binary operands.
This reverts r214338 (except the test file) and replaces it with a more general algorithm.

llvm-svn: 214485
2014-08-01 08:05:55 +00:00
Suyog Sarda 56c9a87035 This patch implements transform for pattern "(A & ~B) ^ (~A) -> ~(A & B)".
Differential Revision: http://reviews.llvm.org/D4653

llvm-svn: 214479
2014-08-01 05:07:20 +00:00
Suyog Sarda 1c6c2f69f7 This patch implements transform for pattern "(A | B) & ((~A) ^ B) -> (A & B)".
Differential Revision: http://reviews.llvm.org/D4628

llvm-svn: 214478
2014-08-01 04:59:26 +00:00
Suyog Sarda 52324c82cc This patch implements transform for pattern "( A & (~B)) | (A ^ B) -> (A ^ B)"
Differential Revision: http://reviews.llvm.org/D4652

llvm-svn: 214477
2014-08-01 04:50:31 +00:00
Suyog Sarda 16d646594e This patch implements transform for pattern "(A & B) | ((~A) ^ B) -> (~A ^ B)".
Patch Credit to Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4655

llvm-svn: 214476
2014-08-01 04:41:43 +00:00
Tyler Nowicki b5a65395cc Improve the remark generated for -Rpass-missed.
The current remark is ambiguous and makes it sounds like explicitly specifying vectorization will allow the loop to be vectorized. This is not the case. The improved remark directs the user to -Rpass-analysis=loop-vectorize to determine the cause of the pass-miss.

Reviewed by Arnold Schwaighofer`

llvm-svn: 214445
2014-07-31 21:22:22 +00:00
Tyler Nowicki 9fe497fcac Improve the remark generated when a variable that is used outside the loop is not a reduction or induction variable.
Reviewed by Arnold Schwaighofer

llvm-svn: 214440
2014-07-31 21:02:40 +00:00
Evgeniy Stepanov 5997feb7dc [msan] Fix handling of array types.
Switch array type shadow from a single integer to
an array of integers (i.e. make it per-element).
This simplifies instrumentation of extractvalue and fixes PR20493.

llvm-svn: 214398
2014-07-31 11:02:27 +00:00
Stepan Dyatkovskiy 87c046189d MergeFunctions, tiny refactoring:
cmpOperation has been renamed to cmpOperations (multiple form).

llvm-svn: 214392
2014-07-31 07:16:59 +00:00
David Majnemer a92687d636 InstCombine: Correctly propagate NSW/NUW for x-(-A) -> x+A
We can only propagate the nsw bits if both subtraction instructions are
marked with the appropriate bit.

N.B.  We only propagate the nsw bit in InstCombine because the nuw case
is already handled in InstSimplify.

This fixes PR20189.

llvm-svn: 214385
2014-07-31 04:49:29 +00:00
David Majnemer 42af3601c2 InstCombine: Simplify (A ^ B) or/and (A ^ B ^ C)
While we can already transform A | (A ^ B) into A | B, things get bad
once we have (A ^ B) | (A ^ B ^ Cst) because reassociation will morph
this into (A ^ B) | ((A ^ Cst) ^ B).  Our existing patterns fail once
this happens.

To fix this, we add a new pattern which looks through the tree of xor
binary operators to see that, in fact, there exists a redundant xor
operation.

What follows bellow is a correctness proof of the transform using CVC3.

$ cat t.cvc
A, B, C : BITVECTOR(64);

QUERY BVXOR(A, B) | BVXOR(BVXOR(B, C), A) = BVXOR(A, B) | C;
QUERY BVXOR(BVXOR(A, C), B) | BVXOR(A, B) = BVXOR(A, B) | C;

QUERY BVXOR(A, B) & BVXOR(BVXOR(B, C), A) = BVXOR(A, B) & ~C;
QUERY BVXOR(BVXOR(A, C), B) & BVXOR(A, B) = BVXOR(A, B) & ~C;

$ cvc3 < t.cvc
Valid.
Valid.
Valid.
Valid.

llvm-svn: 214342
2014-07-30 21:26:37 +00:00
Chad Rosier 78f41b3ca7 SLP Vectorizer: Canonicalize tree operands of commutitive binary operands.
llvm-svn: 214338
2014-07-30 21:07:56 +00:00
Rafael Espindola d07cf400ab SimplifyCFG: Avoid miscompilations due to removed lifetime intrinsics.
The lifetime intrinsics need some work in order to make it clear which
optimizations are or are not valid.

For now dropping this optimization avoids a miscompilation.

Patch by Björn Steinbrink.

llvm-svn: 214336
2014-07-30 21:04:00 +00:00
Aaron Ballman 573f3b5313 Fixing a few -Woverloaded-virtual warnings by exposing the hidden virtual function as well. No functional changes intended.
llvm-svn: 214325
2014-07-30 19:23:59 +00:00
Rafael Espindola 3cf4af11d5 Add the missing hasLinkOnceODRLinkage predicate.
llvm-svn: 214312
2014-07-30 15:57:51 +00:00
Manman Ren f8a1967c8c [Debug Info] add DISubroutineType and its creation takes DITypeArray.
DITypeArray is an array of DITypeRef, at its creation, we will create
DITypeRef (i.e use the identifier if the type node has an identifier).

This is the last patch to unique the type array of a subroutine type.

rdar://17628609

llvm-svn: 214132
2014-07-28 22:24:06 +00:00
Manman Ren ab8ffbaaee [Debug Info] rename getTypeArray to getElements, setTypeArray to setArrays.
This is the second of a series of patches to handle type uniqueing of the
type array for a subroutine type.

For vector and array types, getElements returns the array of subranges, so it
is a better name than getTypeArray. Even for class, struct and enum types,
getElements returns the members, which can be subprograms.

setArrays can set up to two arrays, the second is the templates.

This commit should have no functionality change.

llvm-svn: 214112
2014-07-28 19:14:13 +00:00
Hal Finkel f5867a79c5 Canonicalization for @llvm.assume
Adds simple logical canonicalization of assumption intrinsics to instcombine,
currently:
 - invariant(a && b) -> invariant(a); invariant(b)
 - invariant(!(a || b)) -> invariant(!a); invariant(!b)

llvm-svn: 213977
2014-07-25 21:45:17 +00:00
Hal Finkel 930469107d Add @llvm.assume, lowering, and some basic properties
This is the first commit in a series that add an @llvm.assume intrinsic which
can be used to provide the optimizer with a condition it may assume to be true
(when the control flow would hit the intrinsic call). Some basic properties are added here:

 - llvm.invariant(true) is dead.
 - llvm.invariant(false) is unreachable (this directly corresponds to the
   documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef).

The intrinsic is tagged as writing arbitrarily, in order to maintain control
dependencies. BasicAA has been updated, however, to return NoModRef for any
particular location-based query so that we don't unnecessarily block code
motion.

llvm-svn: 213973
2014-07-25 21:13:35 +00:00
Duncan P. N. Exon Smith 4b4d8ecde1 Move -verify-use-list-order into llvm-uselistorder
Ugh.  Turns out not even transformation passes link in how to read IR.
I sincerely believe the buildbots will finally agree with my system
after this though.  (I don't really understand why all of this has been
working on my system, but not on all the buildbots.)

Create a new tool called llvm-uselistorder to use for verifying use-list
order.  For now, just dump everything from the (now defunct)
-verify-use-list-order pass into the tool.

This might be a better way to test use-list order anyway.

Part of PR5680.

llvm-svn: 213957
2014-07-25 17:13:03 +00:00
Hal Finkel ff0bcb60c9 Convert noalias parameter attributes into noalias metadata during inlining
This functionality is currently turned off by default.

Part of the motivation for introducing scoped-noalias metadata is to enable the
preservation of noalias parameter attribute information after inlining.
Sometimes this can be inferred from the code in the caller after inlining, but
often we simply lose valuable information.

The overall process if fairly simple:
 1. Create a new unqiue scope domain.
 2. For each (used) noalias parameter, create a new alias scope.
 3. For each pointer, collect the underlying objects. Add a noalias scope for
    each noalias parameter from which we're not derived (and has not been
    captured prior to that point).
 4. Add an alias.scope for each noalias parameter from which we might be
    derived (or has been captured before that point).

Note that the capture checks apply only if one of the underlying objects is not
an identified function-local object.

llvm-svn: 213949
2014-07-25 15:50:08 +00:00
Duncan P. N. Exon Smith 20a005f27a Try to fix a layering violation introduced by r213945
The dragonegg buildbot (and others?) started failing after
r213945/r213946 because `llvm-as` wasn't linking in the bitcode reader.
I think moving the verify functions to the same file as the verify pass
should fix the build.  Adding a command-line option for maintaining
use-list order in assembly as a drive-by to prevent warnings about
unused static functions.

llvm-svn: 213947
2014-07-25 15:41:49 +00:00
Duncan P. N. Exon Smith 6b6fdc992a IPO: Add use-list-order verifier
Add a -verify-use-list-order pass, which shuffles use-list order, writes
to bitcode, reads back, and verifies that the (shuffled) order matches.

  - The utility functions live in lib/IR/UseListOrder.cpp.

  - Moved (and renamed) the command-line option to enable writing
    use-lists, so that this pass can return early if the use-list orders
    aren't being serialized.

It's not clear that this pass is the right direction long-term (perhaps
a separate tool instead?), but short-term it's a great way to test the
use-list order prototype.  I've added an XFAIL-ed testcase that I'm
hoping to get working pretty quickly.

This is part of PR5680.

llvm-svn: 213945
2014-07-25 14:49:26 +00:00
Mark Heffernan 8ec1474f7f After unrolling a loop with llvm.loop.unroll.count metadata (unroll factor
hint) the loop unroller replaces the llvm.loop.unroll.count metadata with
llvm.loop.unroll.disable metadata to prevent any subsequent unrolling
passes from unrolling more than the hint indicates.  This patch fixes
an issue where loop unrolling could be disabled for other loops as well which
share the same llvm.loop metadata.

llvm-svn: 213900
2014-07-24 22:36:40 +00:00
Manman Ren 4d189fb9a6 Feedback from Hans on r213815. No functionaility change.
llvm-svn: 213895
2014-07-24 21:13:20 +00:00
Hal Finkel 9414665a3b Add scoped-noalias metadata
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
  1. To preserve noalias function attribute information when inlining
  2. To provide the ability to model block-scope C99 restrict pointers

Neither of these two abilities are added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.

What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA
nodes:

!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

Loads and stores can be tagged with an alias-analysis scope, and also, with a
noalias tag for a specific scope:

... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.

Note that is the first element of the scope metadata is a string, then it can
be combined accross functions and translation units. The string can be replaced
by a self-reference to create globally unqiue scope identifiers.

[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]

Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functons have the metadata imply that a1 does not alias with b2.

llvm-svn: 213864
2014-07-24 14:25:39 +00:00
Aaron Ballman 99e0ea0aa8 Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended.
llvm-svn: 213863
2014-07-24 14:24:59 +00:00
Hal Finkel cc39b67530 AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).

This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.

No functionality change intended.

llvm-svn: 213859
2014-07-24 12:16:19 +00:00
Manman Ren edc60376ed SimplifyCFG: fix a bug in switch to table conversion
We use gep to access the global array "switch.table", and the table index
should be treated as unsigned. When the highest bit is 1, this commit
zero-extends the index to an integer type with larger size.

For a switch on i2, we used to generate:
%switch.tableidx = sub i2 %0, -2
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx

It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate
%switch.tableidx = sub i2 %0, -2
%switch.tableidx.zext = zext i2 %switch.tableidx to i3
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext

rdar://17735071

llvm-svn: 213815
2014-07-23 23:13:23 +00:00
David Blaikie 8e9cfa5497 ArgPromo+DebugInfo: Handle updating debug info over multiple applications of argument promotion.
While the subprogram map cache used by Dead Argument Elimination works
there, I made a mistake when reusing it for Argument Promotion in
r212128 because ArgPromo may transform functions more than once whereas
DAE transforms each function only once, removing all the dead arguments
in one go.

To address this, ensure that the map is updated after each argument
promotion.

In retrospect it might be a little wasteful to create a map of all
subprograms when only handling a single CGSCC, but the alternative is
walking the debug info for each function in the CGSCC that gets updated.
It's not clear to me what the right tradeoff is there, but since the
current tradeoff seems to be working OK (and the code to keep things
updated is very cheap), let's stick with that for now.

llvm-svn: 213805
2014-07-23 22:09:29 +00:00
Mark Heffernan 9e112443b6 Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full).
llvm-svn: 213789
2014-07-23 20:05:44 +00:00
Mark Heffernan e6b4ba1c41 In unroll pragma syntax and loop hint metadata, change "enable" forms to a new form using the string "full".
llvm-svn: 213772
2014-07-23 17:31:37 +00:00
Nick Lewycky aba900c252 We may visit a call that uses an alloca multiple times in callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false. We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405!
llvm-svn: 213726
2014-07-23 06:24:49 +00:00
Suyog Sarda 3a8c2c1e6c This patch implements optimization as mentioned in PR19753: Optimize comparisons with "ashr/lshr exact" of a constanst.
It handles the errors which were seen in PR19958 where wrong code was being emitted due to earlier patch.
Added code for lshr as well as non-exact right shifts.

It implements : 
(icmp eq/ne (ashr/lshr const2, A), const1)" ->
(icmp eq/ne A, Log2(const2/const1)) ->
(icmp eq/ne A, Log2(const2) - Log2(const1))

Differential Revision: http://reviews.llvm.org/D4068
 

llvm-svn: 213678
2014-07-22 19:19:36 +00:00
Suyog Sarda b60ec909ca Added InstCombine transform for pattern "(A & B) ^ (A ^ B) -> (A | B)"
Patch idea by Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4618

llvm-svn: 213677
2014-07-22 18:30:54 +00:00
Suyog Sarda d64faf6cae Added InstCombine Transform for patterns:
"((~A & B) | A) -> (A | B)" and "((A & B) | ~A) -> (~A | B)"

Original Patch credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4591

llvm-svn: 213676
2014-07-22 18:09:41 +00:00
Alexey Samsonov bad4d0c38a [ASan] Fix comments about __sanitizer_cov function
llvm-svn: 213673
2014-07-22 17:46:09 +00:00
Suyog Sarda 521237cad6 This patch implements transform for pattern "(A | B) ^ (~A) -> (A | ~B)".
Patch Credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4588

llvm-svn: 213662
2014-07-22 15:37:39 +00:00
Sanjay Patel fc3d8f0a78 fixed typo in comment
llvm-svn: 213614
2014-07-22 04:57:06 +00:00
Mark Heffernan 9d20e42765 Rename metadata llvm.loop.vectorize.unroll to llvm.loop.vectorize.interleave.
llvm-svn: 213588
2014-07-21 23:11:03 +00:00
Duncan P. N. Exon Smith 6c99015fe2 Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges."
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build.  I'll reply on the list in a moment.

llvm-svn: 213562
2014-07-21 17:06:51 +00:00