Commit Graph

13630 Commits

Author SHA1 Message Date
Duncan P. N. Exon Smith 51149d5589 modules: Add explicit dependency on intrinsics_gen
`LLVM_ENABLE_MODULES` builds sometimes fail because `Intrinsics.td`
needs to regenerate `Instrinsics.h` before anyone can include anything
from the LLVM_IR module.  Represent the dependency explicitly to prevent
that.

llvm-svn: 239796
2015-06-16 00:44:12 +00:00
Philip Reames dfc29fba60 [InstCombine] Propagate non-null facts to call parameters
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.

Differential Revision: http://reviews.llvm.org/D9132

llvm-svn: 239795
2015-06-16 00:43:54 +00:00
Peter Collingbourne 82437bf7a5 Protection against stack-based memory corruption errors using SafeStack
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).

The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.

Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.

This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:

- Add the safestack function attribute, similar to the ssp, sspstrong and
  sspreq attributes.

- Add the SafeStack instrumentation pass that applies the safe stack to all
  functions that have the safestack attribute. This pass moves all unsafe local
  variables to the unsafe stack with a separate stack pointer, whereas all
  safe variables remain on the regular stack that is managed by LLVM as usual.

- Invoke the pass as the last stage before code generation (at the same time
  the existing cookie-based stack protector pass is invoked).

- Add unit tests for the safe stack.

Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.

Differential Revision: http://reviews.llvm.org/D6094

llvm-svn: 239761
2015-06-15 21:07:11 +00:00
Benjamin Kramer 258ea0dbdf [Statepoints] Skip a vector copy when uniquing values.
No functionality change intended.

llvm-svn: 239688
2015-06-13 19:50:38 +00:00
Matt Wala bfb5368cc7 Revert 239644.
llvm-svn: 239650
2015-06-13 01:08:00 +00:00
Matt Wala 1f48192d7c [Scalarizer] Fix potential for stale data in Scattered across invocations
Summary:
Scalarizer has two data structures that hold information about changes
to the function, Gathered and Scattered. These are cleared in finish()
at the end of runOnFunction() if finish() detects any changes to the
function. 

However, finish() was checking for changes by only checking if
Gathered was non-empty. The function visitStore() only modifies
Scattered without touching Gathered. As a result, Scattered could have
ended up having stale data if Scalarizer only scalarized store
instructions. Since the data in Scattered is used during the execution
of the pass, this introduced dangling pointer errors. 

The fix is to check whether both Scattered and Gathered are empty
before deciding what to do in finish().

Reviewers: srhines

Reviewed By: srhines

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10422

llvm-svn: 239644
2015-06-12 22:49:11 +00:00
Matt Wala a4afccd8a8 Fix a typo in a comment in MemCpyOpt (test commit)
llvm-svn: 239628
2015-06-12 18:16:51 +00:00
Alexander Potapenko f90556efb8 [ASan] format AddressSanitizer.cpp with `clang-format -style=Google`, NFC
llvm-svn: 239601
2015-06-12 11:27:06 +00:00
Peter Collingbourne 005354b1f4 LowerBitSets: Give names to aliases of unnamed bitset element objects.
It is valid for globals to be unnamed, but aliases must have a name. To avoid
creating invalid IR, we need to assign names to any aliases we create that
point to unnamed objects that have been moved into combined globals.

llvm-svn: 239590
2015-06-12 03:25:05 +00:00
Teresa Johnson 43a65d9529 Revert commit r239480 as it causes https://code.google.com/p/chromium/issues/detail?id=499508#c3.
llvm-svn: 239589
2015-06-12 03:12:00 +00:00
Alexey Samsonov 201733b7f0 [SanitizerCoverage] Use llvm::getDISubprogram() to get location of the entry basic block.
DebugLoc::getFnDebugLoc() should soon be removed. Also,
getDISubprogram() might become more effective soon and wouldn't need to
scan debug locations at all, if function-level metadata would be emitted
by Clang.

llvm-svn: 239586
2015-06-12 01:48:47 +00:00
Alexey Samsonov 9947e48cd1 [GVN] Use a simpler form of IRBuilder constructor.
Summary:
A side effect of this change is that it IRBuilder now automatically
created debug info locations for new instructions, which is the
same as debug location of insertion point. This is fine for the
functions in questions (GetStoreValueForLoad and
GetMemInstValueForLoad), as they are used in two situations:
  * GVN::processLoad, which tries to eliminate a load. In this case
    new instructions would have the same debug location as the load they
    eventually replace;
  * MaterializeAdjustedValue, which adds new instructions to the end
    of the basic blocks, which could later be used to replace the load
    definition. In this case we don't yet know the way the load would
    be eventually replaced (either by assembling the precomputed values
    via PHI, or by using them directly), so just using the basic block
    strategy seems to be reasonable. There is also a special case
    in the code that *would* adjust the location of the last
    instruction replacing the load definition to the location of the
    load.

Test Plan: regression test suite

Reviewers: echristo, dberlin, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10405

llvm-svn: 239585
2015-06-12 01:39:48 +00:00
Alexey Samsonov ff449802c2 [GVN] Use IRBuilder more actively instead of creating instructions manually.
llvm-svn: 239584
2015-06-12 01:39:45 +00:00
Michael Zolotukhin c4e4f33e29 Update stale comment before analyzeLoopUnrollCost. NFC.
llvm-svn: 239565
2015-06-11 22:17:39 +00:00
Alexey Samsonov ea20199b48 [LoopUnroll] Use IRBuilder to create branch instructions.
Use IRBuilder::Create(Cond)?Br instead of constructing instructions
manually with BranchInst::Create(). It's consistent with other
uses of IRBuilder in this pass, and has an additional important
benefit:

Using IRBuilder will ensure that new branch instruction will get
the same debug location as original terminator instruction it will
eventually replace.

For now I'm not adding a testcase, as currently original terminator
instruction also lack debug location due to missing debug location
propagation in BasicBlock::splitBasicBlock. That is, the testcase
will accompany the fix for the latter I'm going to mail soon.

llvm-svn: 239550
2015-06-11 18:25:44 +00:00
Matt Arsenault 91f90e694f SLSR: Pass address space to isLegalAddressingMode
This only updates one of the uses. The other is used in cases
that may never touch memory, so I'm not sure why this is even
calling it. Perhaps there should be a new, similar hook for such
cases or pass -1 for unknown address space.

llvm-svn: 239540
2015-06-11 16:13:39 +00:00
Hao Liu 405f1d1651 [LoopVectorize] Revert the enabling of interleaved memory access in Loop Vectorizor, which was wrongly committed in r239514.
llvm-svn: 239515
2015-06-11 09:18:07 +00:00
Hao Liu 4566d18e89 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsic. As Loop Vectorizor disables optimization on interleaved accesses by default, this optimization is also disabled by default. To enable it by "-aarch64-interleaved-access-opt=true"

E.g. Transform an interleaved load (Factor = 2):
       %wide.vec = load <8 x i32>, <8 x i32>* %ptr
       %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
       %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements
     Into:
       %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
       %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
       %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Transform an interleaved store (Factor = 2):
       %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
       store <8 x i32> %i.vec, <8 x i32>* %ptr
     Into:
       %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
       %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
       call void aarch64.neon.st2(%v0, %v1, %ptr)

llvm-svn: 239514
2015-06-11 09:05:02 +00:00
Peter Collingbourne 115fe37621 ArgumentPromotion: Drop sret attribute on functions that are only called directly.
If the first argument to a function is a 'this' argument and the second
has the sret attribute, the ArgumentPromotion pass may promote the 'this'
argument to more than one argument, violating the IR constraint that 'sret'
may only be applied to the first or second argument.

Although this IR constraint is arguably unnecessary, it highlighted the fact
that ArgPromotion does not need to preserve this attribute. Dropping the
attribute reduces register pressure in the backend by avoiding the register
copy required by sret. Because sret implies noalias, we also replace the
former with the latter.

Differential Revision: http://reviews.llvm.org/D10353

llvm-svn: 239488
2015-06-10 21:14:34 +00:00
Teresa Johnson 232fa9af3b Add new EliminateAvailableExternally module pass, which is performed in
O2 compiles just before GlobalDCE, unless we are preparing for LTO.

This pass eliminates available externally globals (turning them into
declarations), regardless of whether they are dead/unreferenced, since
we are guaranteed to have a copy available elsewhere at link time.
This enables additional opportunities for GlobalDCE.

If we are preparing for LTO (e.g. a -flto -c compile), the pass is not
included as we want to preserve available externally functions for possible
link time inlining. The FE indicates whether we are doing an -flto compile
via the new PrepareForLTO flag on the PassManagerBuilder.

llvm-svn: 239480
2015-06-10 17:49:28 +00:00
Alexey Samsonov 89645dfa4d [GVN] Set proper debug locations for some instructions created by GVN.
Determining proper debug locations for instructions created in
PHITransAddr is tricky. We use a simple approach here and simply copy
debug locations from instructions computing load address to
"corresponding" instructions re-creating the address computation
in predecessor basic blocks.

This may not always be correct, given all the rearrangement and
simplification going on, and debug locations may jump around a lot,
as the basic blocks we copy locations between may be very far from
each other.

Still, this would work good in most simple cases (e.g. when chain
of address computing instruction is short, or our mapping turns out
to be 1-to-1), and we desire to have *some* reasonable debug locations
associated with newly inserted instructions.

See http://reviews.llvm.org/D10351 review thread for more details.

Test Plan: regression test suite

Reviewers: spatel, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10351

llvm-svn: 239479
2015-06-10 17:37:38 +00:00
Alexey Samsonov b7f02d371f [BasicBlockUtils] Set debug locations for instructions created in SplitBlockPredecessors.
Test Plan: regression test suite

Reviewers: eugenis, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10343

llvm-svn: 239438
2015-06-09 22:10:29 +00:00
Akira Hatanaka d9699bc7bd Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099

llvm-svn: 239427
2015-06-09 19:07:19 +00:00
Arnold Schwaighofer 7e226271a1 MergeFunctions: Don't replace a weak function use by another equivalent weak function
We don't know whether the weak functions definition is the definitive definition.

rdar://21303727

llvm-svn: 239422
2015-06-09 18:19:17 +00:00
Denis Protivensky c09e376d8e MergeFunctions: Fix gcc warning in condition
llvm-svn: 239391
2015-06-09 09:28:37 +00:00
Anna Zaks 119046098a [asan] Prevent __attribute__((annotate)) triggering errors on Darwin
The following code triggers a fatal error in the compiler instrumentation
of ASan on Darwin because we place the attribute into llvm.metadata section,
which does not have the proper MachO section name.

void foo() __attribute__((annotate("custom")));
void foo() {;}

This commit reorders the checks so that we skip everything in llvm.metadata
first. It also removes the hard failure in case the section name does not
parse. That check will be done lower in the compilation pipeline anyway.

(Reviewed in http://reviews.llvm.org/D9093.)

llvm-svn: 239379
2015-06-09 00:58:08 +00:00
Arnold Schwaighofer 003c2e937b Fix unused variable warning
llvm-svn: 239369
2015-06-09 00:17:40 +00:00
Arnold Schwaighofer 0302da614a MergeFunctions: Impose a total order on the replacement of functions
We don't want to replace function A by Function B in one module and Function B
by Function A in another module.

If these functions are marked with linkonce_odr we would end up with a function
stub calling B in one module and a function stub calling A in another module. If
the linker decides to pick these two we will have two stubs calling each other.

rdar://21265586

llvm-svn: 239367
2015-06-09 00:03:29 +00:00
Akira Hatanaka 4a61619ff5 [ARM] Pass a callback to FunctionPass constructors to enable skipping execution
on a per-function basis.

Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.

rdar://problem/20542263

Differential Revision: http://reviews.llvm.org/D8717

llvm-svn: 239325
2015-06-08 18:50:43 +00:00
Hao Liu 32c0539691 [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
       a = A[i];         // load of even element
       b = A[i+1];       // load of odd element
       ...               // operations on a, b, c, d
       A[i] = c;         // store of even element
       A[i+1] = d;       // store of odd element
     }

  The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
     %wide.vec = load <8 x i32>, <8 x i32>* %ptr
     %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
     %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>

  The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
     %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> 
     store <8 x i32> %interleaved.vec, <8 x i32>* %ptr

This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'. 

llvm-svn: 239291
2015-06-08 06:39:56 +00:00
Michael Zolotukhin a60bdb5639 Remove SCEVCache and FindConstantPointers from complete loop unrolling heuristic.
Summary:
Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor.
Also, this makes the code more universal - I'll take advandate of it in next patches where I start handling additional types of instructions.

Test Plan: Tests would be submitted in subsequent patches.

Reviewers: atrick, chandlerc

Reviewed By: atrick, chandlerc

Subscribers: atrick, llvm-commits

Differential Revision: http://reviews.llvm.org/D10205

llvm-svn: 239282
2015-06-08 03:28:06 +00:00
Matt Arsenault e81944fd5e SeparateConstOffsetFromGEP: Pass address space to isLegalAddressingMode
llvm-svn: 239262
2015-06-07 20:17:44 +00:00
Matt Arsenault fb88aca348 Make NaryReassociate pass the address space to isLegalAddressingMode
No test since the kinds of transforms this prevents seem to not really
be relevant for SI's different addressing modes.

llvm-svn: 239261
2015-06-07 20:17:42 +00:00
Benjamin Kramer 82f865277e Remove global std::string. NFC.
llvm-svn: 239254
2015-06-07 16:36:28 +00:00
David Majnemer 3f0fb98d01 [InstCombine, InstSimplify] Move xforms from Combine to Simplify
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.

llvm-svn: 239229
2015-06-06 22:40:21 +00:00
Sanjoy Das ad714b1af3 [LoopUnroll] Fix truncation bug in canUnrollCompletely.
Summary:
canUnrollCompletely takes `unsigned` values for `UnrolledCost` and
`RolledDynamicCost` but is passed in `uint64_t`s that are silently
truncated.  Because of this, when `UnrolledSize` is a large integer
that has a small remainder with UINT32_MAX, LLVM tries to completely
unroll loops with high trip counts.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10293

llvm-svn: 239218
2015-06-06 05:24:10 +00:00
David Majnemer 1c297e66fb [CVP] Don't assume Constants of type i1 can be known to be true or false
CVP wants to analyze the condition operand of a select along an edge.
It succeeds in getting back a Constant but not a ConstantInt.  Instead,
it gets a ConstantExpr.  It then assumes that the Constant must be equal
to false because it isn't equal to true.

Instead, perform an additional comparison.

This fixes PR23752.

llvm-svn: 239217
2015-06-06 04:56:51 +00:00
David Majnemer 468f670021 [InstCombine] Don't miscompile select to poison
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand.  However, doing so is only valid if the
computation doesn't inject poison into the computation.

It might be helpful to consider the following example:
  (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)

The select is equivalent to (add %i, 1) but not (add nsw %i, 1).

Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.

llvm-svn: 239215
2015-06-06 02:30:43 +00:00
Renato Golin 3dabb23384 Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least  on ARM, increasing the
compilation time (stage 2, clang's) by 5x.

llvm-svn: 239175
2015-06-05 18:24:12 +00:00
Sanjoy Das c80dad6f18 [InstCombine][NFC] Add a ``break;`` statement.
This change is NFC because both the ``break;`` and the fall through end
up returning immediately. However, this helps clarify intent and also
ensures correctness in case more ``case`` blocks are added later.

llvm-svn: 239172
2015-06-05 18:04:46 +00:00
Sanjoy Das 72cb5e1087 [InstCombine] Fix PR23751.
PR23751 was caused by a missing ``break;`` in r234388.

llvm-svn: 239171
2015-06-05 18:04:42 +00:00
Chandler Carruth 9dabd14d59 [Unroll] Rework the naming and structure of the new unroll heuristics.
The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:

- '*Threshold' is the threshold for full unrolling. It is measured
  against the estimated unrolled cost as computed by getUserCost in TTI
  (or CodeMetrics, etc). We will exceed this threshold when unrolling
  loops where unrolling exposes a significant degree of simplification
  of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
  estimated dynamic execution cost which needs to be saved by unrolling
  to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
  unrolling cost when the dynamic savings are expected to be high.

When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.

While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.

This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heursistics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.

The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placehold numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a tow-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.

Differential Revision: http://reviews.llvm.org/D9966

llvm-svn: 239164
2015-06-05 17:01:43 +00:00
David Majnemer b58f32f7a8 [LoopVectorize] Don't crash on zero-sized types in isInductionPHI
isInductionPHI wants to calculate the stride based on the pointee size.
However, this is not possible when the pointee is zero sized.

This fixes PR23763.

llvm-svn: 239143
2015-06-05 10:52:40 +00:00
David Majnemer 6d8081835d [InstCombine] Rephrase fix to SimplifyWithOpReplaced
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
   it's simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
   rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
   the value.  One of the instructions is the original select, causing
   LLVM to never reach fixpoint.

Instead, strip the flags only when we are sure we are going to perform
the simplification.

llvm-svn: 239141
2015-06-05 09:57:57 +00:00
Daniel Jasper 917fa5ee66 Revert "[InstCombine] Don't miscompile safe increment idiom"
This is breaking a lot of build bots and is causing very long-running
compiles (infinite loops)?

Likely, we shouldn't return nullptr?

llvm-svn: 239139
2015-06-05 09:31:20 +00:00
David Majnemer 00f7d9ecc8 [InstCombine] Don't miscompile safe increment idiom
We cleverly handle cases where computation done in one argument of a select
instruction is suitable for the other operand, thus obviating the need
of the select and the comparison.  However, the other operand cannot
have flags.

This fixes PR23757.

llvm-svn: 239115
2015-06-04 23:11:30 +00:00
Diego Novillo b3029d26b8 Tidy code in InstrProfiling.cpp. NFC.
Removed the redundant "llvm::" from class names in InstrProfiling.cpp
clang-format is ran on the changes.

Patch from Betul Buyukkurt.

llvm-svn: 239034
2015-06-04 11:45:32 +00:00
Chandler Carruth 70c61c1a8a [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and
port it to the new pass manager.

All this does is extract the inner "location" class used by AA into its
own full fledged type. This seems *much* cleaner as MemoryDependence and
soon MemorySSA also use this heavily, and it doesn't make much sense
being inside the AA infrastructure.

This will also make it much easier to break apart the AA infrastructure
into something that stands on its own rather than using the analysis
group design.

There are a few places where this makes APIs not make sense -- they were
taking an AliasAnalysis pointer just to build locations. I'll try to
clean those up in follow-up commits.

Differential Revision: http://reviews.llvm.org/D10228

llvm-svn: 239003
2015-06-04 02:03:15 +00:00
Vasileios Kalintiris 9f77f61ef3 Remove stray semicolon. NFC.
llvm-svn: 238908
2015-06-03 08:51:30 +00:00
Sanjoy Das 353a19e13c [RewriteStatepointsForGC] Strip deref info after rewriting.
Summary:
Once a gc.statepoint has been rewritten to relocate live references, the
SSA values represent physical pointers instead of logical references.
Logical dereferencability does not imply physical dereferencability and
after RewriteStatepointsForGC has run any attributes that imply
dereferencability of the logical references need to be stripped.

This current approach is conservative, and can be made more precise
later if needed.  For starters, we need to strip dereferencable
attributes only from pointers that live in the GC address space.

Reviewers: reames, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10105

llvm-svn: 238883
2015-06-02 22:33:37 +00:00
Sanjoy Das ea45f0e054 [NFCI] Change RewriteStatepointsForGC to a ModulePass.
Summary:
A later change that has RewriteStatepointsForGC change function
attributes throughout the module depends on this.

Reviewers: reames, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10104

llvm-svn: 238882
2015-06-02 22:33:34 +00:00
Owen Anderson 15d1805504 Teach the IR Sink pass to (conservatively) respect convergent annotations.
llvm-svn: 238762
2015-06-01 17:20:31 +00:00
David Blaikie f5147ef0b9 [opaque pointer type] Explicitly store the pointee type of the result of a GEP
Alternatively, this type could be derived on-demand whenever
getResultElementType is called - if someone thinks that's the better
choice (simple time/space tradeoff), I'm happy to give it a go.

llvm-svn: 238716
2015-06-01 03:09:34 +00:00
Benjamin Kramer f5e2fc474d Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.


Call sites were found with the ASTMatcher + some semi-automated cleanup.

memberCallExpr(
    argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
    on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
    hasArgument(0, bindTemporaryExpr(
                       hasType(recordDecl(hasNonTrivialDestructor())),
                       has(constructExpr()))),
    unless(isInTemplateInstantiation()))

No functional change intended.

llvm-svn: 238602
2015-05-29 19:43:39 +00:00
Wei Mi e2538b5639 Enable exitValue rewrite only when the cost of expansion is low.
The patch evaluates the expansion cost of exitValue in indVarSimplify pass, and only does the rewriting when the expansion cost is low or loop can be deleted with the rewriting. It provides an option "-replexitval=" to control the default aggressiveness of the exitvalue rewriting. It also fixes some missing cases in SCEVExpander::isHighCostExpansionHelper to enhance the evaluation of SCEV expansion cost.

Differential Revision: http://reviews.llvm.org/D9800

llvm-svn: 238507
2015-05-28 21:49:07 +00:00
David Majnemer dd04352558 [InstCombine] Fold IntToPtr and PtrToInt into preceding loads.
Currently we only fold a BitCast into a Load when the BitCast is its
only user.

Do the same for any no-op cast.

Differential Revision: http://reviews.llvm.org/D9152

llvm-svn: 238452
2015-05-28 18:39:17 +00:00
Benjamin Kramer dba7ee90b5 Don't call utostr in Twine/raw_ostream contexts.
Creating temporary std::strings there is unnecessary.

llvm-svn: 238412
2015-05-28 11:24:24 +00:00
Yury Gribov 781bce2b94 [ASan] Fix previous commit. Patch by Max Ostapenko!
llvm-svn: 238403
2015-05-28 08:03:28 +00:00
Yury Gribov 98b18599a6 [ASan] New approach to dynamic allocas unpoisoning. Patch by Max Ostapenko!
Differential Revision: http://reviews.llvm.org/D7098

llvm-svn: 238402
2015-05-28 07:51:49 +00:00
David Majnemer 587336d2ad [Reassociate] Canonicalizing 'x [+-] (-Constant * y)' isn't always a win
Canonicalizing 'x [+-] (-Constant * y)' is not a win if we don't *know*
we will open up CSE opportunities.

If the multiply was 'nsw', then negating 'y' requires us to clear the
'nsw' flag.  If this is actually worth pursuing, it is probably more
appropriate to do so in GVN or EarlyCSE.

This fixes PR23675.

llvm-svn: 238397
2015-05-28 06:16:39 +00:00
Jingyue Wu c2a014697a [NaryReassociate] Run EarlyCSE after NaryReassociate
Summary:
This patch made two improvements to NaryReassociate and the NVPTX pipeline

1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common
expressions.

2. When adding an instruction to SeenExprs, maps both the SCEV before and after
reassociation to that instruction.

Test Plan: updated @reassociate_gep_nsw in nary-gep.ll

Reviewers: meheff, broune

Reviewed By: broune

Subscribers: dberlin, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9947

llvm-svn: 238396
2015-05-28 04:56:52 +00:00
Diego Novillo df4837ba6b Final fix for PR 23499 and IR test case.
This fixes a bit I forgot in r238335. In addition to the data record and
the counter, we can also move the name of the counter to the comdat for
the associated function.

I'm also adding an IR test case to check that these three elements are
placed in the proper comdat.

llvm-svn: 238351
2015-05-27 19:34:01 +00:00
Diego Novillo 98b4cf8fca Fix PR 23499 - Avoid multiple profile counters for functions in comdat sections.
Counter symbols created for linkonce functions are not discarded by ELF
linkers unless the symbols are placed in the same comdat section as its
associated function.

llvm-svn: 238335
2015-05-27 16:44:47 +00:00
Philip Reames 52e7a59e50 [PlaceSafepoints] Entry safepoint location doesn't need to be a terminator
Long ago, the poll insertion code assumed that the insertion site was a terminator.  As a result, the entry selection code would split a basic block to ensure it could pass a terminator.  The insertion code was updated quite a while ago - possibly before it ever landed upstream - but the now redundant work was never removed.  

While I'm at it, remove a comment which doesn't apply to the upstreamed code.  

NFC intended.

llvm-svn: 238254
2015-05-26 21:16:42 +00:00
Philip Reames 38840245e4 [PlaceSafepoints] Cleanup InsertSafepointPoll function
While working on another change, I noticed that the naming in this function was mildly deceptive.  While fixing that, I took the oppurtunity to modernize some of the code.  NFC intended.

llvm-svn: 238252
2015-05-26 21:03:23 +00:00
Craig Topper 042a39274a Use range-based for loops. NFC.
llvm-svn: 238154
2015-05-25 20:01:18 +00:00
Bjorn Steinbrink 236446cd4c Remove conflicting attributes before adding deduced readonly/readnone
Summary:
In case of functions that have a pointer argument and only pass it to
each other, the function attributes pass deduces that the pointer should
get the readnone attribute, but fails to remove a readonly attribute
that may already have been present.

Reviewers: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9995

llvm-svn: 238152
2015-05-25 19:46:38 +00:00
NAKAMURA Takumi 5582a6a4a5 Reformat.
llvm-svn: 238126
2015-05-25 01:43:34 +00:00
NAKAMURA Takumi fb3bd7127a Prune CRLFs.
llvm-svn: 238125
2015-05-25 01:43:23 +00:00
Chandler Carruth 04cc665cef [Unroll] Switch from an eagerly populated SCEV cache to one that is
lazily built.

Also, make it a much more generic SCEV cache, which today exposes only
a reduced GEP model description but could be extended in the future to
do other profitable caching of SCEV information.

llvm-svn: 238124
2015-05-25 01:00:46 +00:00
Craig Topper 10949ae742 Give more meaningful names than I and J to some for loop variables after converting to range-based loops.
llvm-svn: 238095
2015-05-23 08:45:10 +00:00
Craig Topper 37d0d866f4 Fix an unused variable warning in release builds.
llvm-svn: 238094
2015-05-23 08:20:33 +00:00
Craig Topper 77b9941ab9 Use range-based for loops. NFC.
llvm-svn: 238093
2015-05-23 08:01:41 +00:00
Philip Reames 7c78ef7dd9 Extend EarlyCSE to handle basic cases from JumpThreading and CVP
This patch extends EarlyCSE to take advantage of the information that a controlling branch gives us about the value of a Value within this and dominated basic blocks. If the current block has a single predecessor with a controlling branch, we can infer what the branch condition must have been to execute this block. The actual change to support this is downright simple because EarlyCSE's existing scoped hash table logic deals with most of the complexity around merging.

The patch actually implements two optimizations.
1) The first is analogous to JumpThreading in that it enables EarlyCSE's CSE handling to fold branches which are exactly redundant due to a previous branch to branches on constants. (It doesn't actually replace the branch or change the CFG.) This is pretty clearly a win since it enables substantial CFG simplification before we start trying to inline.
2) The second is analogous to CVP in that it exploits the knowledge gained to replace dominated *uses* of the original value. EarlyCSE does not otherwise reason about specific uses, so this is the more arguable one. It does enable further simplication and constant folding within the rest of the visit by EarlyCSE.

In both cases, the added code only handles the easy dominance based case of each optimization. The general case is deferred to the existing passes.

Differential Revision: http://reviews.llvm.org/D9763

llvm-svn: 238071
2015-05-22 23:53:24 +00:00
David Majnemer 4c3753c4d4 [InstCombine] Don't eagerly propagate nsw for A*B+A*C => A*(B+C)
InstCombine transforms A *nsw B +nsw A *nsw C to A *nsw (B + C).
This is incorrect -- e.g. if A = -1, B = 1, C = INT_SMAX. Then
nothing in the LHS overflows, but the multiplication in RHS overflows.

We need to first make sure that we won't multiple by INT_SMAX + 1.

Test case `add_of_mul` contributed by Sanjoy Das.

This fixes PR23635.

Differential Revision: http://reviews.llvm.org/D9629

llvm-svn: 238066
2015-05-22 23:02:11 +00:00
Chandler Carruth 0215608bda [Unroll] Separate the logic for testing each iteration of the loop,
accumulating estimated cost, and other loop-centric logic from the logic
used to analyze instructions in a particular iteration.

This makes the visitor very narrow in scope -- all it does is visit
instructions, update a map of simplified values, and return whether it
is able to optimize away a particular instruction.

The two cost metrics are now returned as an optional struct. When the
optional is left unengaged, there is no information about the unrolled
cost of the loop, when it is engaged the cost metrics are available to
run against the thresholds.

No functionality changed.

llvm-svn: 238033
2015-05-22 17:41:35 +00:00
David Majnemer 1503258157 [InstSimplify] Handle some overflow intrinsics in InstSimplify
This change does a few things:
- Move some InstCombine transforms to InstSimplify
- Run SimplifyCall from within InstCombine::visitCallInst
- Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0.

llvm-svn: 237995
2015-05-22 03:56:46 +00:00
Chandler Carruth 5189559905 [Unroll] Replace a hand-wavy FIXME with a FIXME that explains the actual
problem instead of suggesting doing something that is trivial to do but
incorrect given the current design of the libraries.

llvm-svn: 237994
2015-05-22 03:07:28 +00:00
Chandler Carruth e1a0462dcc [Unroll] Extract the logic for caching SCEV-modeled GEPs with their
simplified model for use simulating each iteration into a separate
helper function that just returns the cache.

Building this cache had nothing to do with the rest of the unroll
analysis and so this removes an unnecessary coupling, etc. It should
also make it easier to think about the concept of providing fast cached
access to basic SCEV models as an orthogonal concept to the overall
unroll simulation.

I'd really like to see this kind of caching logic folded into SCEV
itself, it seems weird for us to provide it at this layer rather than
making repeated queries into SCEV fast all on their own.

No functionality changed.

llvm-svn: 237993
2015-05-22 03:02:22 +00:00
Chandler Carruth f174a156c3 [Unroll] Refactor the accumulation of optimized instruction costs into
a single location.

This reduces code duplication a bit and will also pave the way for
a better separation between the visitation algorithm and the unroll
analysis.

No functionality changed.

llvm-svn: 237990
2015-05-22 02:47:29 +00:00
Philip Reames b47b9c2b2b [LICM] Sinking doesn't involve the preheader
PR23608 pointed out that using the preheader to gain a context instruction isn't always legal because a loop might not have a preheader.  When looking into that, I realized that using the preheader to determine legality for sinking is questionable at best.  Given no test covers that case and the original commit didn't seem to intend it, I restructured the code to only ask context sensative queries for hoising of loads and stores.  This is effectively a partial revert of 237593.

llvm-svn: 237985
2015-05-22 02:14:05 +00:00
Daniel Berlin b301533ef1 MergedLoadStoreMotion preserves MemoryDependenceAnalysis, it does not require it.
(It already was coded assuming it can sometimes be null, so no other changes are necessary)

llvm-svn: 237978
2015-05-22 00:13:05 +00:00
Jingyue Wu 4fc97f6df8 [NaryReassoc] reassociate GEP for CSE
Summary:
x = &a[i];
y = &a[i + j];

=>

y = x + j;

along with some refactoring work such as extracting method
findClosestMatchingDominator.

Depends on D9786 which provides the ScalarEvolution::getGEPExpr interface.

Test Plan: nary-gep.ll

Reviewers: meheff, broune

Reviewed By: broune

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9802

llvm-svn: 237971
2015-05-21 23:17:30 +00:00
David Majnemer 27e89ba24c [InstCombine] X - 0 is equal to X, not undef
A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform
into undef instead of %X.

This fixes PR23624.

llvm-svn: 237968
2015-05-21 23:04:21 +00:00
Benjamin Kramer e6987bf351 [LoopDistribute] Remove a layer of pointer indirection.
Just store InstPartitions directly into the std::list. No functional change
intended.

llvm-svn: 237930
2015-05-21 18:32:07 +00:00
Igor Laevsky d83f6976ba [RewriteStatepointsForGC] Fix debug assertion during derivable pointer rematerialization
Correct assertion would be that there is no other uses from chain we are currently cloning. It is ok to have other uses of values not from this chain.

Differential Revision: http://reviews.llvm.org/D9882

llvm-svn: 237899
2015-05-21 13:02:14 +00:00
Ahmed Bougacha 97876fa894 [MemCpyOpt] Do move the memset, but look at its dest's dependencies.
In effect a partial revert of r237858, which was a dumb shortcut.
Looking at the dependencies of the destination should be the proper
fix: if the new memset would depend on anything other than itself,
the transformation isn't correct.

llvm-svn: 237874
2015-05-21 01:43:39 +00:00
Ahmed Bougacha 0541c67ae7 [MemCpyOpt] Pass Instruction to IRBuilder, no need for NextNode. NFC.
We're erasing the instructions anyway.

llvm-svn: 237861
2015-05-21 00:08:35 +00:00
Ahmed Bougacha 5e0f425c27 [MemCpyOpt] Don't move the memset when optimizing memset+memcpy.
Fixes PR23599, another miscompile introduced by r235232: when there is
another dependency on the destination of the created memset (i.e., the
part of the original destination that the memcpy doesn't depend on)
between the memcpy and the original memset, we would insert the created
memset after the memcpy, and thus after the other dependency.

Instead, insert the created memset right after the old one.

llvm-svn: 237858
2015-05-20 23:55:16 +00:00
James Molloy 2b21a7cf36 Reapply r237539 with a fix for the Chromium build.
Make sure if we're truncating a constant that would then be sign extended
that the sign extension of the truncated constant is the same as the
original constant.

> Canonicalize min/max expressions correctly.
>
> This patch introduces a canonical form for min/max idioms where one operand
> is extended or truncated. This often happens when the other operand is a
> constant. For example:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = sext i32 %a to i64
> %3 = select i1 %1, i64 %2, i64 0
>
> Would now be canonicalized into:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = select i1 %1, i32 %a, i32 0
> %3 = sext i32 %2 to i64
>
> This builds upon a patch posted by David Majenemer
> (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
> passively stopped instcombine from ruining canonical patterns. This
> patch additionally actively makes instcombine canonicalize too.
>
> Canonicalization of expressions involving a change in type from int->fp
> or fp->int are not yet implemented.

llvm-svn: 237821
2015-05-20 18:41:25 +00:00
Pete Cooper 9e1d335697 Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC.
Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method.

This updates all users which were either using an unsigned to store it, or had a now unnecessary cast.

llvm-svn: 237810
2015-05-20 17:16:39 +00:00
Aaron Ballman ff7d4fad54 Silencing a -Wsign-compare warning; NFC.
llvm-svn: 237794
2015-05-20 14:53:50 +00:00
Swaroop Sridhar 665bc9c936 Add a GCStrategy for CoreCLR
This change adds a new GC strategy for supporting the CoreCLR runtime.

This strategy is currently identical to Statepoint-example GC, 
but is necessary for several upcoming changes specific to CoreCLR, such as:

1. Base-pointers not explicitly reported for interior pointers
2. Different format for stack-map encoding
3. Location of Safe-point polls: polls are only needed before loop-back edges and before tail-calls (not needed at function-entry)
4. Runtime specific handshake between calls to managed/unmanaged functions.

llvm-svn: 237753
2015-05-20 01:07:23 +00:00
Philip Reames d97cdf28e6 [PlaceSafepoints] Stop special casing some intrinsics
We were special casing a handful of intrinsics as not needing a safepoint before them.  After running into another valid case - memset - I took a closer look and realized that almost no intrinsics need to have a safepoint poll before them.  Restructure the code to make that apparent so that we stop hitting these bugs.  The only intrinsics which need a safepoint poll before them are ones which can run arbitrary code.

llvm-svn: 237744
2015-05-19 23:40:11 +00:00
Hans Wennborg 2f21b8760e Revert r237539: "Reapply r237520 with another fix for infinite looping"
This caused PR23583.

llvm-svn: 237739
2015-05-19 23:06:30 +00:00
Jingyue Wu 5db9cba066 [Speculation] NFC: more header comments
explaining how it differs from SpeculativeExecuteBB in SimplifyCFG.

llvm-svn: 237724
2015-05-19 20:52:45 +00:00
Igor Laevsky 285fe84edd [RewriteStatepointsForGC] Fix up naming in "relocationViaAlloca" and run it through clang-format.
Differential Revision: http://reviews.llvm.org/D9774

llvm-svn: 237703
2015-05-19 16:29:43 +00:00
Wei Mi 6a671635e6 Remove the InstructionSimplifierPass immediately after InstructionCombiningPass.
InstructionCombiningPass was added after LoopUnrollPass in r237395. Because
InstructionCombiningPass is strictly more powerful than InstructionSimplifierPass,
remove the unnecessary InstructionSimplifierPass.

Differential Revision: http://reviews.llvm.org/D9838

llvm-svn: 237702
2015-05-19 16:09:11 +00:00
Igor Laevsky e03171863d [RewriteStatepointsForGC] For some values (like gep's and bitcasts) it's cheaper to clone them after statepoint than to emit proper relocates for them. This change implements this logic. There is alredy similar optimization in CodeGenPrepare, but doing so during RewriteStatepointsForGC allows to capture more opprtunities such as relocates in loops and longer instruction chains.
Differential Revision: http://reviews.llvm.org/D9774

llvm-svn: 237701
2015-05-19 15:59:05 +00:00
David Blaikie ff6409d096 Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only
llvm-svn: 237624
2015-05-18 22:13:54 +00:00
Chen Li 74ca2a8777 [PlaceSafepoints] Assertion on that gc_result can not have preceding phis should only apply to invoke statepoint
Summary: When PlaceSafepoints pass replaces old return result with gc_result from statepoint, it asserts that gc_result can not have preceding phis in its parent block. This is only true on invoke statepoint, which terminates the block and puts its result at the beginning of the normal successor block. Call statepoint does not terminate the block and thus its result is in the same block with it. There should be no restriction on whether there are phis or not.

Reviewers: reames, igor-laevsky

Reviewed By: igor-laevsky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9803

llvm-svn: 237597
2015-05-18 19:02:25 +00:00
Sanjoy Das f8a0db50b2 Exploit dereferenceable_or_null attribute in LICM pass
Summary:
Allow hoisting of loads from values marked with dereferenceable_or_null
attribute. For values marked with the attribute perform
context-sensitive analysis to determine whether it's known-non-null or
not.

Patch by Artur Pilipenko!

Reviewers: hfinkel, sanjoy, reames

Reviewed By: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9253

llvm-svn: 237593
2015-05-18 18:07:00 +00:00
Jingyue Wu 2982d4d795 [ScalarEvolution] refactor: extract interface getGEPExpr
Summary:
This allows other passes (such as SLSR) to compute the SCEV expression for an
imaginary GEP.

Test Plan: no regression

Reviewers: atrick, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9786

llvm-svn: 237589
2015-05-18 17:03:25 +00:00
Andrew Trick 715b27f058 indvars cruft: don't replace phi nodes for no reason.
Don't replace a phi with an identical phi. This was done long ago to
"preserve" IVUsers analysis. The code has already called
SE->forgetValue(PN) so I see no purpose in creating a new value for
the phi.

llvm-svn: 237587
2015-05-18 16:49:34 +00:00
Andrew Trick 018e55a187 SimplifyIV comments and dead argument cleanup.
Remove crufty comments. IVUsers hasn't been used here for a long time.

llvm-svn: 237586
2015-05-18 16:49:31 +00:00
James Molloy 53958e187a Reapply r237520 with another fix for infinite looping
SimplifyDemandedBits was "simplifying" a constant by removing just sign bits.
This caused a canonicalization race between different parts of instcombine.

Fix and regression test added - third time lucky?

llvm-svn: 237539
2015-05-17 08:27:27 +00:00
James Molloy e8698ae3e1 Revert commits r237521 and r237520.
The AArch64 LNT bot is unhappy - I've found that the problem is in
SimpliftDemandedBits, but that's going to require another code review
so reverting in the meantime.

llvm-svn: 237528
2015-05-16 21:27:14 +00:00
Benjamin Kramer d9f06c8b87 Move Pass into anonymous namespace. NFC.
llvm-svn: 237526
2015-05-16 16:16:35 +00:00
James Molloy b5aa200a33 Reapply r237453 with a fix for the test timeouts.
The test timeouts were due to instcombine fighting itself. Regression test added.
Original log message:

Canonicalize min/max expressions correctly.

This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
    %2 = sext i32 %a to i64
      %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
    %2 = select i1 %1, i32 %a, i32 0
      %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237520
2015-05-16 13:10:45 +00:00
Ahmed Bougacha f8fa3b8d4b [MemCpyOpt] Turn memcpy from just-memset'd source into memset.
There's no point in copying around constants, so, when all else fails,
we can still transform memcpy of memset into two independent memsets.

To quote the example, we can turn:
  memset(dst1, c, dst1_size);
  memcpy(dst2, dst1, dst2_size);
into:
  memset(dst1, c, dst1_size);
  memset(dst2, c, dst2_size);
When dst2_size <= dst1_size.

Like r235232 for copy constructors, this can occur in move constructors.

Differential Revision: http://reviews.llvm.org/D9682

llvm-svn: 237506
2015-05-16 01:32:26 +00:00
Ahmed Bougacha 15a31f67f7 [MemCpyOpt] Remove dead argument. NFC.
llvm-svn: 237503
2015-05-16 01:23:47 +00:00
Jingyue Wu 25e2500ac8 [NFC] remove an extra new line
llvm-svn: 237462
2015-05-15 18:32:21 +00:00
Jingyue Wu 154eb5aa1d Add a speculative execution pass
Summary:
This is a pass for speculative execution of instructions for simple if-then (triangle) control flow. It's aimed at GPUs, but could perhaps be used in other contexts. Enabling this pass gives us a 1.0% geomean improvement on Google benchmark suites, with one benchmark improving 33%.

Credit goes to Jingyue Wu for writing an earlier version of this pass.

Patched by Bjarke Roune. 

Test Plan:
This patch adds a set of tests in test/Transforms/SpeculativeExecution/spec.ll
The pass is controlled by a flag which defaults to having the pass not run.

Reviewers: eliben, dberlin, meheff, jingyue, hfinkel

Reviewed By: jingyue, hfinkel

Subscribers: majnemer, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9360

llvm-svn: 237459
2015-05-15 17:54:48 +00:00
James Molloy 1675b4a57f Revert "Canonicalize min/max expressions correctly."
This reverts r237453 - it was causing timeouts on some bots. Reverting
while I investigate (it's probably InstCombine fighting itself...)

llvm-svn: 237458
2015-05-15 17:45:09 +00:00
Jingyue Wu 80a96d299a [SLSR] handle (B | i) * S
Summary:
Consider (B | i) * S as (B + i) * S if B and i have no bits set in
common.

Test Plan: @or in slsr-mul.ll

Reviewers: broune, meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9788

llvm-svn: 237456
2015-05-15 17:07:48 +00:00
James Molloy 6edf0b4cd4 Canonicalize min/max expressions correctly.
This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
  %2 = sext i32 %a to i64
  %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
  %2 = select i1 %1, i32 %a, i32 0
  %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237453
2015-05-15 16:10:59 +00:00
Sanjoy Das 2c2661456e [PlaceSafepoints] Fix a bug that came in with rL236672.
Transfer the calling convention from the invoke being replaced by
PlaceStatepoints to the new invoke to gc.statepoint created.  Add a test
case that would have caught this issue.

llvm-svn: 237414
2015-05-15 00:26:21 +00:00
Sanjoy Das 8045810c58 [PlaceSafepoints] Fix a bug that came in with rL236672.
rL236672 would generate all invoke statepoints with deopt args set to a
list containing the single element "0", instead of an empty list.

Also add a test case that would have caught this.

llvm-svn: 237413
2015-05-15 00:26:15 +00:00
Jingyue Wu ca32190379 [ValueTracking] refactor: extract method haveNoCommonBitsSet
Summary:
Extract method haveNoCommonBitsSet so that we don't have to duplicate this logic in
InstCombine and SeparateConstOffsetFromGEP.

This patch also makes SeparateConstOffsetFromGEP more precise by passing
DominatorTree to computeKnownBits.

Test Plan: value-tracking-domtree.ll that tests ValueTracking indeed leverages dominating conditions

Reviewers: broune, meheff, majnemer

Reviewed By: majnemer

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9734

llvm-svn: 237407
2015-05-14 23:53:19 +00:00
Wei Mi bf727ba371 Add another InstCombine pass after LoopUnroll.
This is to cleanup some redundency generated by LoopUnroll pass. Such redundency may not be cleaned up by existing passes after LoopUnroll.

Differential Revision: http://reviews.llvm.org/D9777

llvm-svn: 237395
2015-05-14 22:02:54 +00:00
Davide Italiano 95a77e8901 Don't rely on implicit pointerness of 'auto'.
This ends up being a copy. Pointy hat to me.
Reported by: dexonsmith, dblaikie

llvm-svn: 237394
2015-05-14 21:52:12 +00:00
Adam Nemet 2f85b7372c Attempt to fix MSVC bots
llvm-svn: 237359
2015-05-14 12:33:32 +00:00
Adam Nemet 938d3d63d6 New Loop Distribution pass
Summary:
This implements the initial version as was proposed earlier this year
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080462.html).
Since then Loop Access Analysis was split out from the Loop Vectorizer
and was made into a separate analysis pass.  Loop Distribution becomes
the second user of this analysis.

The pass is off by default and can be enabled
with -enable-loop-distribution.  There is currently no notion of
profitability; if there is a loop with dependence cycles, the pass will
try to split them off from other memory operations into a separate loop.

I decided to remove the control-dependence calculation from this first
version.  This and the issues with the PDT are actively discussed so it
probably makes sense to treat it separately.  Right now I just mark all
terminator instruction required which keeps identical CFGs for each
distributed loop.  This seems to be working pretty well for 456.hmmer
where even though there is an empty if-then block in the distributed
loop initially, it gets completely removed.

The pass keeps DominatorTree and LoopInfo updated.  I've tested this
with -loop-distribute-verify with the testsuite where we distribute ~90
loops.  SimplifyLoop is violated in some cases and I have a FIXME
covering this.

Reviewers: hfinkel, nadav, aschwaighofer

Reviewed By: aschwaighofer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8831

llvm-svn: 237358
2015-05-14 12:05:18 +00:00
Pete Cooper 7c4d7b8fbe Construct ArrayRef<const T*> from vector<T>
ArrayRef already has a SFINAE constructor which can construct ArrayRef<const T*> from ArrayRef<T*>.

This adds methods to do the same directly from SmallVector and std::vector.  This avoids an intermediate step through the use of makeArrayRef.

Also update the users of this in LICM and SROA to remove the now unnecessary makeArrayRef call.

Reviewed by David Blaikie.

llvm-svn: 237309
2015-05-13 22:43:09 +00:00
Sanjoy Das ba74e645d8 [PlaceSafepoints] New attributes for patchable statepoints.
Summary:
This patch teaches the PlaceSafepoints pass about two `CallSite`
function attributes:

 * "statepoint-id": if the string value of this attribute can be parsed
   as an integer, then it is propagated to the ID parameter of the
   statepoint created.

 * "statepoint-num-patch-bytes": if the string value of this attribute
   can be parsed as an integer, then it is propagated to the `num patch
   bytes` parameter of the statepoint created.

This change intentionally does not assert on a malformed value for these
attributes, given that they're not "official" attributes.

Reviewers: reames, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9735

llvm-svn: 237286
2015-05-13 20:11:31 +00:00
Davide Italiano 80625afea8 [LoopIdiomRecognize] Use auto + range-based loop. NFC intended.
llvm-svn: 237284
2015-05-13 19:51:21 +00:00
Jingyue Wu c74e33bffe [NaryReassociate] avoid running forever
Avoid running forever by checking we are not reassociating an expression into
the same form.

Tested with @avoid_infinite_loops in nary-add.ll

llvm-svn: 237269
2015-05-13 18:12:24 +00:00
Diego Novillo ffc84e378a Add function entry counts from sample profiles.
This patch uses the new function profile metadata "function_entry_count"
to annotate entry counts from sample profiles.

In a sampling profile, the total samples collected at the function entry
are an approximation for the number of times that function was invoked.

llvm-svn: 237265
2015-05-13 17:04:29 +00:00
Pete Cooper 0cabcf211a Constify arguments to methods in LICM. NFC
llvm-svn: 237227
2015-05-13 01:12:18 +00:00
Pete Cooper 41e0ee3074 Change LoadAndStorePromoter to take ArrayRef instead of SmallVectorImpl&.
The array passed to LoadAndStorePromoter's constructor was a constant reference to a SmallVectorImpl, which is just the same as passing an ArrayRef.

Also, the data in the array can be 'const Instruction*' instead of 'Instruction*'.  Its not possible to convert a SmallVectorImpl<T*> to SmallVectorImpl<const T*>, but ArrayRef does provide such a method.

Currently this added calls to makeArrayRef which should be a nop, but i'm going to kick off a discussion about improving ArrayRef to not need these.

llvm-svn: 237226
2015-05-13 01:12:16 +00:00
Philip Reames 4d1a3ef659 [PlaceSafepoints] Reduce dominator tree recalculation
Reduce recalculation of the dominator tree by identifying all sites that will need a safepoint poll before doing any of the insertion. This allows us to invalidate the dominator info once, rather than once per safepoint poll inserted.

While I'm at it, update findLocationForEntrySafepoint to properly update the dom tree now that the interface has been made easy. When first written, it wasn't per comment in the code.

Differential Revision: http://reviews.llvm.org/D9727

llvm-svn: 237220
2015-05-13 00:32:23 +00:00
Jingyue Wu 4b6125d788 [SLSR] handles non-canonicalized Mul candidates
such as (2 + B) * S.

Tested by @non_canonicalized in slsr-mul.ll

llvm-svn: 237216
2015-05-13 00:03:17 +00:00
Sanjoy Das a1d39ba940 [Statepoints] Support for "patchable" statepoints.
Summary:
This change adds two new parameters to the statepoint intrinsic, `i64 id`
and `i32 num_patch_bytes`.  `id` gets propagated to the ID field
in the generated StackMap section.  If the `num_patch_bytes` is
non-zero then the statepoint is lowered to `num_patch_bytes` bytes of
nops instead of a call (the spill and reload code remains unchanged).
A non-zero `num_patch_bytes` is useful in situations where a language
runtime requires complete control over how a call is lowered.

This change brings statepoints one step closer to patchpoints.  With
some additional work (that is not part of this patch) it should be
possible to get rid of `TargetOpcode::STATEPOINT` altogether.

PlaceSafepoints generates `statepoint` wrappers with `id` set to
`0xABCDEF00` (the old default value for the ID reported in the stackmap)
and `num_patch_bytes` set to `0`.  This can be made more sophisticated
later.

Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9546

llvm-svn: 237214
2015-05-12 23:52:24 +00:00
Philip Reames 89fe570958 [PlaceSafepoints] Followup to commit L237172
Responding to review feedback from http://reviews.llvm.org/D9585

1) Remove a variable shadow by converting the outer loop to a range for loop.  We never really used the 'i' variable which was being shadowed.
2) Reduce DominatorTree recalculations by passing the DT to SplitEdge.

llvm-svn: 237212
2015-05-12 23:39:23 +00:00
Chandler Carruth a6ae877aec [Unrolling] Refactor the start and step offsets to simplify overflow
checking and make the cache faster and smaller.

I had thought that using an APInt here would be useful, but I think
I was just wrong. Notably, we don't have to do any fancy overflow
checking, we can just bound the values as quite small and do the math in
a higher precision integer. I've switched to a signed integer so that
UBSan will even point out if we ever have integer overflow. I've added
various asserts to try to catch things as well and hoisted the overflow
checks so that we just leave the too-large offsets out of the SCEV-GEP
cache. This makes the value in the cache quite a bit smaller which is
probably worthwhile.

No functionality changed here (for trip counts under 1 billion).

llvm-svn: 237209
2015-05-12 23:32:56 +00:00
Bjorn Steinbrink 2833966a3c CVP: Improve handling of Selects used as incoming PHI values
Summary:
If the branch that leads to the PHI node and the Select instruction
depend on correlated conditions, we might be able to directly use the
corresponding value from the Select instruction as the incoming value
for the PHI node, allowing later removal of the select instruction.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9051

llvm-svn: 237201
2015-05-12 22:31:47 +00:00
Philip Reames 311f710654 [RewriteStatepointsForGC] Extend base pointer to handle more cases w/vectors
When relocating a pointer, we need to determine a base pointer for the derived pointer being relocated. We have limited support for handling a pointer extracted from a vector; the current code only handled the case where the entire vector was known to contain base pointers. This patch extends the reasoning to handle chains of insertelements where the indices are constants. This case turns out to be fairly common in vectorized code. We can now handle vectors which contains mixtures of base and derived pointers provided the insertelements use constant indices.

Note that this doesn't solve the general problem. To handle variable indexed insertelements, we'd need to scalarize and introduce conditional branching based on the index. Alternatively, we could eagerly scalarize, but the code structure doesn't currently make either fix easy. The patch also doesn't handle shufflevector or other vector manipulation for much the same reasons. I plan to defer this work until I have a motivating test case.

Differential Revision: http://reviews.llvm.org/D9676

llvm-svn: 237200
2015-05-12 22:19:52 +00:00
Justin Bogner 383749af55 [PlaceSafepoints] Add missing "override" to PlaceBackedgeSafepointsImpl::runOnFunction
Pointed out by -Winconsistent-missing-override.

llvm-svn: 237196
2015-05-12 21:49:47 +00:00
Arnold Schwaighofer 6a8c5f6403 MergeFunctions: Two different sized allocas are *not* the same
llvm-svn: 237193
2015-05-12 21:42:22 +00:00
Justin Bogner 03038a56fe InstrProf: Update name of compiler-rt routine for setting filename
Patch by Teresa Johnson.

llvm-svn: 237186
2015-05-12 21:23:09 +00:00
Philip Reames 7b9817927a [PlaceSafepoints] Switch to being a FunctionPass
The pass doesn't actually modify the module outside of the function being processed. The only confusing piece is that it both inserts calls and then inlines the resulting calls. Given that, it definitely invalidates module level analysis results, but many FunctionPasses do that.

Differential Revision: http://reviews.llvm.org/D9590

llvm-svn: 237185
2015-05-12 21:21:18 +00:00
Philip Reames 9f12904ec9 [PlaceSafepoints] Make internal helper pass a FunctionPass
Switch from using a LoopPass to using a FunctionPass for the internal helper analysis pass. The next step is going to be to make this a true analysis pass which is required by the PlaceSafepoints pass itself.

p.s. The interesting semantic part here is that we're changing the iteration order over the loops. It shouldn't matter, but that's the reason to separate this into it's own distinct patch.

Differential Revision: http://reviews.llvm.org/D9588

llvm-svn: 237180
2015-05-12 21:09:36 +00:00
Philip Reames 57bdac96d9 [PlaceSafepoints] Use analysis infrastructure to get dominator tree
The old code computed dominators for every loop. This was terribly slow with no good reason. Just use the standard infrastructure for analysis passes.

Differential Revision: http://reviews.llvm.org/D9586

llvm-svn: 237176
2015-05-12 20:56:33 +00:00
Philip Reames 5708cca7ab [PlaceSafepoints] Remove dependence on LoopSimplify
As a step towards getting rid of internal pass manager hack entirely, remove the need for loop simplify to run in the inner pass manager. The new code does produce slightly different loop structures, so this isn't technically NFC.

Differential Revision: http://reviews.llvm.org/D9585

llvm-svn: 237172
2015-05-12 20:43:48 +00:00
Pete Cooper 833f34d837 Convert PHI getIncomingValue() to foreach over incoming_values(). NFC.
We already had a method to iterate over all the incoming values of a PHI.  This just changes all eligible code to use it.

Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB.

llvm-svn: 237169
2015-05-12 20:05:31 +00:00
Pete Cooper 47e80cd796 Constify method. NFC
llvm-svn: 237167
2015-05-12 20:05:20 +00:00
Michael Zolotukhin 8c68171fef Reimplement heuristic for estimating complete-unroll optimization effects.
Summary:
This patch reimplements heuristic that tries to estimate optimization beneftis
from complete loop unrolling.

In this patch I kept the minimal changes - e.g. I removed code handling
branches and folding compares. That's a promising area, but now there
are too many questions to discuss before we can enable it.

Test Plan: Tests are included in the patch.

Reviewers: hfinkel, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8816

llvm-svn: 237156
2015-05-12 17:20:03 +00:00
Sanjoy Das 5665c999c2 Rename variables in gc_relocate related functions to follow LLVM's naming conventions.
Summary:
This patch is to rename some variables to CamelCase in gc_relocate
related functions. There is no functionality change.

Patch by Chen Li!

Reviewers: reames, AndyAyers, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9681

llvm-svn: 237069
2015-05-11 23:47:27 +00:00
Ahmed Bougacha b61696656e [MemCpyOpt] Look at any dependency -not just source- for memset+memcpy.
This fixes another miscompile introduced by r235232: when there was a
dependency on the memcpy destination other than the memset, we would
ignore it, because we only looked at the source dependency.

It was a mistake to use SrcDepInfo.  Instead, just use DepInfo.

llvm-svn: 237066
2015-05-11 23:09:46 +00:00
Davide Italiano 8ed0446e97 [LoopIdiomRecognize] Transform backedge-taken count check into an assertion.
runOnCountable() allowed the caller to call on a loop without a
predictable backedge-taken count. Change the code so that only loops
with computable backdge-count can call this function, in order to catch
abuses.

llvm-svn: 237044
2015-05-11 21:02:34 +00:00
Sanjoy Das 89c5491a72 [RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to vector of pointers
Summary:
In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for
each relocated pointer, and the gc_relocate has the same type with the
pointer. During the creation of gc_relocate intrinsic, llvm requires to
mangle its type. However, llvm does not support mangling of all possible
types. RewriteStatepointsForGC will hit an assertion failure when it
tries to create a gc_relocate for pointer to vector of pointers because
mangling for vector of pointers is not supported.

This patch changes the way RewriteStatepointsForGC pass creates
gc_relocate. For each relocated pointer, we erase the type of pointers
and create an unified gc_relocate of type i8 addrspace(1)*. Then a
bitcast is inserted to convert the gc_relocate to the correct type. In
this way, gc_relocate does not need to deal with different types of
pointers and the unsupported type mangling is no longer a problem. This
change would also ease further merge when LLVM erases types of pointers
and introduces an unified pointer type.

Some minor changes are also introduced to gc_relocate related part in
InstCombineCalls, CodeGenPrepare, and Verifier accordingly.

Patch by Chen Li!

Reviewers: reames, AndyAyers, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9592

llvm-svn: 237009
2015-05-11 18:49:34 +00:00
James Molloy 71b91c2dba Rip min/max pattern matching out of InstCombine and into
ValueTracking.

This matching functionality is useful in more than just InstCombine, so
make it available in ValueTracking.

NFC.

llvm-svn: 236998
2015-05-11 14:42:20 +00:00
Hal Finkel f0d68d788b [InstCombine/PowerPC] Fix single-precision QPX load/store replacement
The QPX single-precision load/store intrinsics have implied
truncation/extension from/to the declared value type of <4 x double> to the
memory type of <4 x float>. When we can prove the alignment of the pointer
argument, and thus replace the intrinsic with a regular load or store, we need
to load or store the correct data type (<4 x float>) instead of (<4 x double>).

llvm-svn: 236973
2015-05-11 06:37:03 +00:00
David Majnemer 7536460c0f [InstCombine] Canonicalize single element array store
Use the element type instead of the aggregate type.

Differential Revision: http://reviews.llvm.org/D9591

llvm-svn: 236969
2015-05-11 05:04:27 +00:00
David Majnemer 58fb038b1b [InstCombine] Canonicalize single element array load
Use the element type instead of the aggregate type.

Differential Revision: http://reviews.llvm.org/D9596

llvm-svn: 236968
2015-05-11 05:04:22 +00:00
Ismail Pazarbasi d02ce13bd9 SanitizerCoverage: Use `createSanitizerCtor` to create ctor and call init
Second attempt; instead of using a named local variable, passing
arguments directly to `createSanitizerCtorAndInitFunctions` worked
on Windows.

Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8780

llvm-svn: 236951
2015-05-10 13:45:05 +00:00
Igor Laevsky 5e23e16c6c This change is refactoring only. It moves basic block normalization for invokes to happen before replacement of a call with safepoint in "ReplaceWithStatepoint". Previously it was partly done before replacement of calls with safepoint and partly after call replacement but before RAUW's for gc_relocates, which was confusing.
llvm-svn: 236829
2015-05-08 11:59:09 +00:00
Alexey Samsonov ebd22570b2 Delete unused createSanitizerCoverageModulePass overload.
llvm-svn: 236791
2015-05-07 22:46:06 +00:00
Ismail Pazarbasi 416071e20a Revert "SanitizerCoverage: Use `createSanitizerCtor` to create ctor and call init"
Will fix tomorrow. Unbreak build bots now.

llvm-svn: 236786
2015-05-07 22:17:48 +00:00
Ismail Pazarbasi 5bc0feb3de SanitizerCoverage: Use `createSanitizerCtor` to create ctor and call init
Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8780

llvm-svn: 236780
2015-05-07 21:43:28 +00:00
Ismail Pazarbasi e5048e153a MSan: Use `createSanitizerCtor` to create ctor, and call `__msan_init`
Reviewers: kcc, eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8781

llvm-svn: 236779
2015-05-07 21:41:52 +00:00
Ismail Pazarbasi 2d4ae9f0d5 TSan: Use `createSanitizerCtor` to create ctor, and call `__tsan_init`
Reviewers: kcc, dvyukov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8779

llvm-svn: 236778
2015-05-07 21:41:23 +00:00
Ismail Pazarbasi 09c3709e75 ASan: Use `createSanitizerCtor` to create ctor, and call `__asan_init`
Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8778

llvm-svn: 236777
2015-05-07 21:40:46 +00:00
David Blaikie d9d900c05b Recommit r236670: [opaque pointer type] Pass explicit pointer type through GEP constant folding""
Clang regressions were caused by more stringent assertion checking
introduced by this change. Small fix needed to clang has been committed
in r236751.

llvm-svn: 236752
2015-05-07 17:28:58 +00:00
NAKAMURA Takumi 2a5bd54f4e Scalar/PlaceSafepoints.cpp: Fix a warning introduced in r228090. [-Wunused-variable]
llvm-svn: 236711
2015-05-07 10:18:46 +00:00
Mehdi Amini 2668a487a7 Update InstCombine to transform aggregate loads into scalar loads.
Summary:
One step further getting aggregate loads and store being optimized
properly. This will only handle struct with one element at this point.

Test Plan: Added unit tests for the new supported cases.

Reviewers: chandlerc, joker-eph, joker.eph, majnemer

Reviewed By: majnemer

Subscribers: pete, llvm-commits

Differential Revision: http://reviews.llvm.org/D8339

Patch by Amaury Sechet.

From: Amaury Sechet <amaury@fb.com>
llvm-svn: 236695
2015-05-07 05:52:40 +00:00
Alexey Samsonov 3514f27456 [SanitizerCoverage] Introduce SanitizerCoverageOptions struct.
Summary:
This gives frontend more precise control over collected coverage
information. User can still override these options by passing
-mllvm flags.

No functionality change.

Test Plan: regression test suite.

Reviewers: kcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9539

llvm-svn: 236687
2015-05-07 01:00:31 +00:00
Philip Reames 7a738dd94c [JumpThreading] Simplify comparisons when simplifying branches
If we have recognized that a conditional is constant at a particular location in the code (while trying to decide if we can simplify a conditional branch), we can eagerly replace that condition with a constant if it's definition is post dominated by the branch in question.

In practice, this ends up being a compile time savings at most. JumpThreading would have visited each using branch anyways. CVP would have visited the cmp itself again. Unless LVI gives up early, we shouldn't gain any addition power by doing this transformation early. What we do gain is simplicity and compile time.

Differential Revision: http://reviews.llvm.org/D9312

llvm-svn: 236684
2015-05-07 00:19:14 +00:00
David Blaikie 567d0e5a90 Revert "[opaque pointer type] Pass explicit pointer type through GEP constant folding"
Causes regressions in Clang. Reverting while I investigate.

This reverts commit r236670.

llvm-svn: 236678
2015-05-06 23:56:21 +00:00
Sanjoy Das abf15608a7 [Statepoints] Clean up PlaceSafepoints.cpp: de-duplicate code.
Common duplicated code and remove unnecessary code.

llvm-svn: 236674
2015-05-06 23:53:21 +00:00
Sanjoy Das 93abd813ec [Statepoints] Clean up PlaceSafepoints.cpp: variable naming.
Use CamelCase.  NFC.

llvm-svn: 236673
2015-05-06 23:53:19 +00:00
Sanjoy Das abe1c685ac [IRBuilder] Add a CreateGCStatepointInvoke.
Renames the original CreateGCStatepoint to CreateGCStatepointCall, and
moves invoke creating functionality from PlaceSafepoints.cpp to
IRBuilder.cpp.

This changes the labels generated for PlaceSafepoints/invokes.ll so use
a regex there to make the basic block labels more resilient.

llvm-svn: 236672
2015-05-06 23:53:09 +00:00
David Blaikie e66a45fdb4 [opaque pointer type] Pass explicit pointer type through GEP constant folding
llvm-svn: 236670
2015-05-06 23:49:14 +00:00
Pete Cooper 2777d88745 Change typeIncompatible to return an AttrBuilder instead of new-ing an AttributeSet.
This makes use of the new API which can remove attributes from a set given a builder.

This is much faster than creating a temporary set and reduces llc time by about 0.3% which was all spent creating temporary attributes sets on the context.

llvm-svn: 236668
2015-05-06 23:19:56 +00:00
Alexey Samsonov 0a648a4bfe [SanitizerCoverage] Fix a couple of typos. NFC.
llvm-svn: 236643
2015-05-06 21:35:25 +00:00
Ismail Pazarbasi 56ccf1c9d5 Implement `createSanitizerCtor`, common helper function for all sanitizers
Summary:
This helper function creates a ctor function, which calls sanitizer's
init function with given arguments. This constructor is then expected
to be added to module's ctors. The patch helps unifying how sanitizer
constructor functions are created, and how init functions are called
across all sanitizers.

Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8777

llvm-svn: 236627
2015-05-06 18:48:22 +00:00
Wei Mi 062c74484d [X86] Disable loop unrolling in loop vectorization pass when VF is 1.
The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture,
by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce
the cost of overflow check, memory boundary check and extra prologue/epilogue code when
regular unroller will unroll the loop another time. Disable it when VF==1 remove the
unnecessary cost on x86. The same can be done for other platforms after verifying
interleaving/memory bound checking to be not perf critical on those platforms.

Differential Revision: http://reviews.llvm.org/D9515

llvm-svn: 236613
2015-05-06 17:12:25 +00:00
Adam Nemet e340f851a3 [DomTree] verifyDomTree to unconditionally perform DT verification
I folded the check for the flag -verify-dom-info into the only caller
where I think it is supposed to be checked: verifyAnalysis.  (The idea
of the flag is to enable this expensive verification in
verifyPreservedAnalysis.)

I'm assuming that when manually scheduling the verification pass
with -passes=verify<domtree>, we do want to perform the verification.

llvm-svn: 236575
2015-05-06 08:18:41 +00:00
Sanjoy Das 499d703f52 [Statepoint] Clean up Statepoint.h: accessor names.
Use getFoo() as accessors consistently and some other naming changes.

llvm-svn: 236564
2015-05-06 02:36:26 +00:00
David Majnemer ac256cfed2 [Inliner] Discard empty COMDAT groups
COMDAT groups which have become rendered unused because of inline are
discardable if we can prove that we've made the group empty.

This fixes PR22285.

llvm-svn: 236539
2015-05-05 20:14:22 +00:00
David Blaikie 73cf872adb [opaque pointer type] Track explicit GEP pointee type through in-memory IR
llvm-svn: 236510
2015-05-05 18:03:48 +00:00
Justin Bogner ba1900cefd InstrProf: Instrumenter support for setting profile output from command line
This change is the second of 3 patches to add support for specifying
the profile output from the command line via -fprofile-instr-generate=<path>,
where the specified output path/file will be overridden by the
LLVM_PROFILE_FILE environment variable.

This patch adds the necessary support to the llvm instrumenter, specifically
a new member of GCOVOptions for clang to save the specified filename, and
support for calling the new compiler-rt interface from __llvm_profile_init.

Patch by Teresa Johnson. Thanks!

llvm-svn: 236288
2015-04-30 23:49:23 +00:00
Matthias Braun e48484c64f InstCombineSimplifyDemanded: Remove nsw/nuw flags when optimizing demanded bits
When optimizing demanded bits of the operands of an Add we have to
remove the nsw/nuw flags as we have no guarantee anymore that we don't
wrap.  This is legal here because the top bit is not demanded.  In fact
this operaion was already performed but missed in the case of an Add
with a constant on the right side.  To fix this this patch refactors the
code to unify the code paths in SimplifyDemandedUseBits() handling of
Add/Sub:

- The transformation of Add->Or is removed from the simplify demand
  code because the equivalent transformation exists in
  InstCombiner::visitAdd()
- KnownOnes/KnownZero are not adjusted for Add x, C anymore as
  computeKnownBits() already performs these computations.
- The simplification of the operands is unified. In this new version
  constant on the right side of a Sub are shrunk now as I could not find
  a reason why not to do so.
- The special case for clearing nsw/nuw in ShrinkDemandedConstant() is
  not necessary anymore as the caller does that already.

Differential Revision: http://reviews.llvm.org/D9415

llvm-svn: 236269
2015-04-30 22:05:30 +00:00
Matthias Braun ec6833420f InstCombine: Move Sub->Xor rule from SimplifyDemanded to InstCombine
The rule that turns a sub to xor if the LHS is 2^n-1 and the remaining bits
are known zero, does not use the demanded bits at all: Move it to the
normal InstCombine code path.

Differential Revision: http://reviews.llvm.org/D9417

llvm-svn: 236268
2015-04-30 22:04:26 +00:00
Sanjoy Das 08e95b4703 [InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al.
Summary:
Optimizing these well are especially interesting for IRCE since it
"clamps" values by generating this sort of pattern through SCEV
expressions.

Depends on D9352.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9353

llvm-svn: 236203
2015-04-30 04:56:04 +00:00
Sanjoy Das a8c178f280 [InstCombine] Add a new formula for SMIN.
Summary:
After this change `MatchSelectPattern` recognizes the following form
of SMIN:

  Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C)

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9352

llvm-svn: 236202
2015-04-30 04:56:00 +00:00
David Blaikie bf0a42ac09 [opaque pointer type] Store the value type of an alloca
llvm-svn: 236175
2015-04-29 23:00:35 +00:00
David Blaikie f64246be72 [opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space
Many of the callers already have the pointer type anyway, and for the
couple of callers that don't it's pretty easy to call PointerType::get
on the pointee type and address space.

This avoids LLParser from using PointerType::getElementType when parsing
GlobalAliases from IR.

llvm-svn: 236160
2015-04-29 21:22:39 +00:00
Duncan P. N. Exon Smith a9308c49ef IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

llvm-svn: 236120
2015-04-29 16:38:44 +00:00
Philip Reames 63294cbb6a [RewriteStatepointsForGC] Exclude constant values from being considered live at a safepoint
There can be various constant pointers in the IR which do not get relocated at a safepoint. One example is the address of a global variable. Another example is a pointer created via inttoptr. Note that the optimizer itself likes to create such inttoptrs when locally propagating constants through dynamically dead code.

To deal with this, we need to exclude uses of constants from contributing to the liveness of a safepoint which might reach that use. At some later date, it might be worth exploring what could be done to support the relocation of various special types of "constants", but that's future work.

Differential Revision: http://reviews.llvm.org/D9236

llvm-svn: 235821
2015-04-26 19:48:03 +00:00
Philip Reames 2e78fa49a8 Don't Place Entry Safepoints Before the llvm.frameescape() Intrinsic
llvm.frameescape() intrinsic is not a real call. The intrinsic can only exist in the entry block. Inserting a gc.statepoint() before llvm.frameescape() may split the entry block, and push the intrinsic out of the entry block.

Patch by: Swaroop.Sridhar@microsoft.com
Differential Revision: http://reviews.llvm.org/D8910

llvm-svn: 235820
2015-04-26 19:41:23 +00:00
Sanjay Patel c1d20a36fb [x86] instcombine more cases of insertps into a shufflevector
This is a follow-on to D8833 (insertps optimization when the zero mask is not used).

In this patch, we check for the case where the zmask is used, but both input vectors
to the insertps intrinsic are the same operand or the zmask overrides the destination
lane. This lets us replace the 2nd shuffle input operand with the zero vector.

Differential Revision: http://reviews.llvm.org/D9257

llvm-svn: 235810
2015-04-25 20:55:25 +00:00
Hans Wennborg 86ac630585 SimplifyCFG: Correctly handle switch lookup tables which fully cover the input type and use bit tests to check for holes
When using bit tests for hole checks, we call AddPredecessorToBlock to give the
phi node a value from the bit test block. This would break if we've
previously called removePredecessor on the default destination because the
switch is fully covered.

Test case by Mark Lacey.

llvm-svn: 235771
2015-04-24 20:57:56 +00:00
Andrew Kaylor 08c5f1efc1 Fix LoopInterchange/reductions.ll test for debug builds
llvm-svn: 235734
2015-04-24 17:39:16 +00:00
Aaron Ballman 5e90906c0d Removing dead code; NFC. This code was triggering a C4718 warning (recursive call has no side effects, deleting) with MSVC.
llvm-svn: 235717
2015-04-24 12:51:45 +00:00
Jingyue Wu 72fca6c89b Resurrect r235688
We should skip vector types which are not SCEVable.

test/CodeGen/NVPTX/sched2.ll passes

llvm-svn: 235695
2015-04-24 04:22:39 +00:00
Michael Zolotukhin d98330c424 Fix a couple of typos in comments.
llvm-svn: 235674
2015-04-24 00:10:27 +00:00
Michael Zolotukhin 10ed96bf09 Fix comment for NoCommonBits.
Maybe there is a better wording, but at least it should be technically
correct now.

llvm-svn: 235660
2015-04-23 22:55:48 +00:00
David Blaikie 348de69a30 Recommit r235458: [opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst
(reverted in r235533)

Original commit message:

"Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.

Totally open to ideas/suggestions/improvements, of course.

With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)"

The remapping done in ValueMapper for LTO was insufficient as the types
weren't correctly mapped (though I was using the post-mapped operands,
some of those operands might not have been mapped yet so the type
wouldn't be post-mapped yet). Instead use the pre-mapped type and
explicitly map all the types.

llvm-svn: 235651
2015-04-23 21:36:23 +00:00
Philip Reames 5461d45abf Move Value.isDereferenceablePointer to ValueTracking [NFC]
Move isDereferenceablePointer function to Analysis. This function recursively tracks dereferencability over a chain of values like other functions in ValueTracking.

This refactoring is motivated by further changes to support dereferenceable_or_null attribute (http://reviews.llvm.org/D8650). isDereferenceablePointer will be extended to perform context-sensitive analysis and IR is not a good place to have such functionality.

Patch by: Artur Pilipenko <apilipenko@azulsystems.com>
Differential Revision: reviews.llvm.org/D9075

llvm-svn: 235611
2015-04-23 17:36:48 +00:00
Karthik Bhat 24e6cc2de4 Move common loop utility function isInductionPHI into LoopUtils.cpp
This patch refactors the definition of common utility function "isInductionPHI" to LoopUtils.cpp.
This fixes compilation error when configured with -DBUILD_SHARED_LIBS=ON

llvm-svn: 235577
2015-04-23 08:29:20 +00:00
Karthik Bhat 8210fdf26e Add support to interchange loops with reductions.
This patch enables interchanging of tightly nested loops with reductions.
Differential Revision: http://reviews.llvm.org/D8314

llvm-svn: 235571
2015-04-23 04:51:44 +00:00
David Majnemer 7d0e99c601 [InstCombine] Use a more targeted fix instead of r235544
Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while
taking advantage that the sign bit is not set.  We do this optimization
to further shrink the mask but shrinking the mask isn't NSW/NUW
preserving in this case.

llvm-svn: 235558
2015-04-22 22:42:05 +00:00
David Majnemer fe58d13a17 [InstCombine] Clear out nsw/nuw if we modify computation in the chain
An nsw/nuw operation relies on the values feeding into it to not
overflow if 'poison' is not to be produced.  This means that
optimizations which make modifications to the bottom of a chain (like
SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that
they will be preserved.

This fixes PR23309.

llvm-svn: 235544
2015-04-22 20:59:28 +00:00
David Blaikie d2db881e85 Revert "[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst"
This reverts commit r235458.

It looks like this might be breaking something LTO-ish. Looking into it
& will recommit with a fix/test case/etc once I've got more to go on.

llvm-svn: 235533
2015-04-22 18:16:49 +00:00
Sanjay Patel c96ee08016 don't repeat function names in comments; NFC
llvm-svn: 235531
2015-04-22 18:04:46 +00:00
David Blaikie 506993636e [opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst
Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.

Totally open to ideas/suggestions/improvements, of course.

With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)

llvm-svn: 235458
2015-04-21 23:26:57 +00:00
Wei Mi a0adf9fd41 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimization,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D8911

llvm-svn: 235455
2015-04-21 23:02:15 +00:00
Wei Mi 2940bc82ac Revert r235451 since it is attached to a wrong Differential Revision. Sorry.
llvm-svn: 235453
2015-04-21 22:56:09 +00:00
Wei Mi 6e3344ed98 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimizations,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D9007

llvm-svn: 235451
2015-04-21 22:37:09 +00:00
Ahmed Bougacha 9692e30e8b [MemCpyOpt] Use the raw i8* dest when optimizing memset+memcpy.
MemIntrinsic::getDest() looks through pointer casts, and using it
directly when building the new GEP+memset results in stuff like:

  %0 = getelementptr i64* %p, i32 16
  %1 = bitcast i64* %0 to i8*
  call ..memset(i8* %1, ...)

instead of the correct:

  %0 = bitcast i64* %p to i8*
  %1 = getelementptr i8* %0, i32 16
  call ..memset(i8* %1, ...)

Instead, use getRawDest, which just gives you the i8* value.
While there, use the memcpy's dest, as it's live anyway.

In most cases, when the optimization triggers, the memset and memcpy
sizes are the same, so the built memset is 0-sized and eliminated.
The problem occurs when they're different.

Fixes a regression caused by r235232: PR23300.

llvm-svn: 235419
2015-04-21 21:28:33 +00:00
Daniel Berlin b4e7a4a40c Revamp PredIteratorCache interface to be cleaner.
Summary:
This lets us use range based for loops.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9169

llvm-svn: 235416
2015-04-21 21:11:50 +00:00
Sanjoy Das 7be03d69e5 [LSR][NFC] Remove a stale comment.
The comment was made stale in r171735.

llvm-svn: 235414
2015-04-21 20:42:50 +00:00
Jingyue Wu f1edf3e88f [SLSR] garbage-collect unused instructions
Summary:
After we rewrite a candidate, the instructions used by the old form may
become unused. This patch cleans up these unused instructions so that we
needn't run DCE after SLSR.

Test Plan: removed -dce in all the SLSR tests

Reviewers: broune, meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9101

llvm-svn: 235410
2015-04-21 19:56:18 +00:00
Jingyue Wu f763c3fd45 [SeparateConstOffsetFromGEP] garbage-collect intermediate instructions
Summary: so that we needn't run DCE after this pass.

Test Plan: removed -dce from the commandline in split-gep.ll and split-gep-and-gvn.ll

Reviewers: meheff

Subscribers: llvm-commits, HaoLiu, hfinkel, jholewinski

Differential Revision: http://reviews.llvm.org/D9096

llvm-svn: 235409
2015-04-21 19:53:18 +00:00
Daniel Berlin 2372a193ba Move IDF Calculation to a separate file, expose an interface to it.
Summary:
MemorySSA uses this algorithm as well, and this enables us to reuse the code in both places.

There are no actual algorithm or datastructure changes in here, just code movement.

Reviewers: qcolombet, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9118

llvm-svn: 235406
2015-04-21 19:13:02 +00:00
Duncan P. N. Exon Smith 60635e39b6 DebugInfo: Drop rest of DIDescriptor subclasses
Delete the remaining subclasses of (the already deleted) `DIDescriptor`.
Part of PR23080.

llvm-svn: 235404
2015-04-21 18:44:06 +00:00
Duncan P. N. Exon Smith d4a19a396d DebugInfo: Assert dbg.declare/value insts are valid
Remove early returns for when `getVariable()` is null, and just assert
that it never happens.  The Verifier already confirms that there's a
valid variable on these intrinsics, so we should assume the debug info
isn't broken.  I also updated a check for a `!dbg` attachment, which the
Verifier similarly guarantees.

llvm-svn: 235400
2015-04-21 18:24:23 +00:00
Duncan P. N. Exon Smith 2fbe13540a DebugInfo: Delete subclasses of DIScope
Delete subclasses of (the already defunct) `DIScope`, updating users to
use the raw pointers from the `Metadata` hierarchy directly.

llvm-svn: 235356
2015-04-20 22:10:08 +00:00
Akira Hatanaka 2cc2b63f53 [InlineFunction] Don't add lifetime markers for zero-sized allocas.
This commit fixes the code which adds lifetime markers in InlineFunction to skip
zero-sized allocas instead of asserting on them.

rdar://problem/20531155

llvm-svn: 235312
2015-04-20 16:11:05 +00:00
Karthik Bhat 76aa662cf0 [NFC] Refactor identification of reductions as common utility function.
This patch refactors reduction identification code out of LoopVectorizer and
exposes them as common utilities.
No functional change.
Review: http://reviews.llvm.org/D9046

llvm-svn: 235284
2015-04-20 04:38:33 +00:00
Ahmed Bougacha 05b72c1fd8 [MemCpyOpt] Don't force i64 when promoting memset/memcpy sizes.
Harden r235258 to support any integer bitwidth.  The quick glance at
the reference made me think only i32 and i64 were valid types, but
they're not special, so any overload is legal.

Thanks to David Majnemer for noticing!

llvm-svn: 235261
2015-04-18 23:06:04 +00:00
Ahmed Bougacha 7216ccc3f3 [MemCpyOpt] Promote both memset/memcpy sizes if differently typed.
Followup to r235232, which caused PR23278.

We can't assume the memset and memcpy sizes have the same type, as
nothing in the language reference prevents that.
Instead, zext both to i64 if they disagree.

While there, robustify tests by using i8 %c rather than i8 0 for the
memset character.

llvm-svn: 235258
2015-04-18 17:57:41 +00:00
Benjamin Kramer 2a7404a907 [InstCombine] Create zero constants on demand.
No functional change intended.

llvm-svn: 235257
2015-04-18 16:52:08 +00:00
David Majnemer 45951a6626 [InstCombine] (mul nsw 1, INT_MIN) != (shl nsw 1, 31)
Multiplying INT_MIN by 1 doesn't trigger nsw.  However, shifting 1 into
the sign bit *does* trigger nsw.

llvm-svn: 235250
2015-04-18 04:41:30 +00:00
Duncan P. N. Exon Smith ed557b55ee DebugInfo: Remove DIDescriptor from the DebugInfo API
Stop using `DIDescriptor` and its subclasses in the `DebugInfoFinder`
API, as well as the rest of the API hanging around in `DebugInfo.h`.

llvm-svn: 235240
2015-04-17 23:20:10 +00:00
Ahmed Bougacha 83f78a459a [MemCpyOpt] Optimize double-storing by memset+memcpy.
A common idiom in some code is to do the following:

  memset(dst, 0, dst_size);
  memcpy(dst, src, src_size);

Some of the memset is redundant; instead, we can do:

  memcpy(dst, src, src_size);
  memset(dst + src_size, 0,
         dst_size <= src_size ? 0 : dst_size - src_size);

Original patch by: Joel Jones
Differential Revision: http://reviews.llvm.org/D498

llvm-svn: 235232
2015-04-17 22:20:57 +00:00
Jingyue Wu 8579b81329 [NaryReassociate] run NaryReassociate iteratively
Summary:
An alternative is to use a worklist approach. However, that approach
would break the traversing order so that we couldn't lookup SeenExprs
efficiently. I don't see a clear winner here, so I picked the easier approach.

Along with two minor improvements:
1. preserves ScalarEvolution by forgetting instructions replaced
2. removes dead code locally avoiding the need of running DCE afterwards

Test Plan: add to slsr-add.ll a test that requires multiple iterations

Reviewers: broune, dberlin, atrick, meheff

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9058

llvm-svn: 235151
2015-04-17 00:25:10 +00:00
Jingyue Wu 771dfe91cf [NaryReassociate] speeds up candidate searching
Summary:
This fixes a left-over efficiency issue in D8950.

As Andrew and Daniel suggested, we can store the candidates in a stack
and pop the top element when it does not dominate the current
instruction. This reduces the worst-case time complexity to O(n).

Test Plan: a new test in nary-add.ll that exercises this optimization.

Reviewers: broune, dberlin, meheff, atrick

Reviewed By: atrick

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D9055

llvm-svn: 235129
2015-04-16 18:42:31 +00:00
Sanjay Patel c86867cd5f [X86, SSE] instcombine common cases of insertps intrinsics into shuffles
This is very similar to D8486 / r232852 (vperm2). If we treat insertps intrinsics
as shufflevectors, we can optimize them better.

I've left all but the full zero case of the zero mask variants out of this patch. 
I don't think those can be converted into a single shuffle in all cases, but I'd
be happy to be proven wrong as I was for vperm2f128.

Either way, we'd need to support whatever sequence we come up with for those cases
in the backend before converting them here.

Differential Revision: http://reviews.llvm.org/D8833

llvm-svn: 235124
2015-04-16 17:52:13 +00:00
Aaron Ballman a2f9943cf6 Silencing a -Wunused-but-set-variable warning; NFC.
llvm-svn: 235094
2015-04-16 13:29:36 +00:00
Duncan P. N. Exon Smith b273d06b63 DebugInfo: Gut DIScope, DIEnumerator and DISubrange
The only class the still has API left is `DIDescriptor` itself.

llvm-svn: 235067
2015-04-16 01:37:00 +00:00
Duncan P. N. Exon Smith 35ef22cf53 DebugInfo: Gut DICompileUnit and DIFile
Continuing gutting `DIDescriptor` subclasses; this edition,
`DICompileUnit` and `DIFile`.  In the name of PR23080.

llvm-svn: 235055
2015-04-15 23:19:27 +00:00
Duncan P. N. Exon Smith 62e0f454a0 DebugInfo: Remove 'inlinedAt:' field from MDLocalVariable
Remove 'inlinedAt:' from MDLocalVariable.  Besides saving some memory
(variables with it seem to be single largest `Metadata` contributer to
memory usage right now in -g -flto builds), this stops optimization and
backend passes from having to change local variables.

The 'inlinedAt:' field was used by the backend in two ways:

 1. To tell the backend whether and into what a variable was inlined.
 2. To create a unique id for each inlined variable.

Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg`
attachment, and change the DWARF backend to use a typedef called
`InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`.
This `DebugLoc` is already passed reliably through the backend (as
verified by r234021).

This commit removes the check from r234021, but I added a new check
(that will survive) in r235048, and changed the `DIBuilder` API in
r235041 to require a `!dbg` attachment whose 'scope:` is in the same
`MDSubprogram` as the variable's.

If this breaks your out-of-tree testcases, perhaps the script I used
(mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778
in a moment.

llvm-svn: 235050
2015-04-15 22:29:27 +00:00
Duncan P. N. Exon Smith cd1aecfe36 DebugInfo: Require a DebugLoc in DIBuilder::insertDeclare()
Change `DIBuilder::insertDeclare()` and `insertDbgValueIntrinsic()` to
take an `MDLocation*`/`DebugLoc` parameter which it attaches to the
created intrinsic.  Assert at creation time that the `scope:` field's
subprogram matches the variable's.  There's a matching `clang` commit to
use the API.

The context for this is PR22778, which is removing the `inlinedAt:`
field from `MDLocalVariable`, instead deferring to the `!dbg` location
attached to the debug info intrinsic.  The best way to ensure we always
have a `!dbg` attachment is to require one at creation time.  I'll be
adding verifier checks next, but this API change is the best way to
shake out frontend bugs.

Note: I added an `llvm_unreachable()` in `bindings/go` and passed in
`nullptr` for the `DebugLoc`.  The `llgo` folks will eventually need to
pass a valid `DebugLoc` here.

llvm-svn: 235041
2015-04-15 21:18:07 +00:00
Daniel Berlin 25db4f4141 Add range iterators for post order and inverse post order. Use them
llvm-svn: 235026
2015-04-15 17:41:42 +00:00
Jingyue Wu 43885ebb3a [SLSR] handle candidate form (B + i * S)
Summary:
With this patch, SLSR may rewrite

S1: X = B + i * S
S2: Y = B + i' * S

to

S2: Y = X + (i' - i) * S

A secondary improvement: if (i' - i) is a power of 2, emit Y as X + (S << log(i' - i)). (S << log(i' -i)) is in a canonical form and thus more likely GVN'ed than (i' - i) * S.

Test Plan: slsr-add.ll

Reviewers: hfinkel, sanjoy, meheff, broune, eliben

Reviewed By: eliben

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8983

llvm-svn: 235019
2015-04-15 16:46:13 +00:00
Richard Trieu 6b1aa5f5e1 Change range-based for-loops to be -Wrange-loop-analysis clean.
No functionality change.

llvm-svn: 234963
2015-04-15 01:21:15 +00:00
Jingyue Wu 8cb6b2a292 Simplify n-ary adds by reassociation
Summary:
This transformation reassociates a n-ary add so that the add can partially reuse
existing instructions. For example, this pass can simplify

  void foo(int a, int b) {
    bar(a + b);
    bar((a + 2) + b);
  }

to

  void foo(int a, int b) {
    int t = a + b;
    bar(t);
    bar(t + 2);
  }

saving one add instruction.

Fixes PR22357 (https://llvm.org/bugs/show_bug.cgi?id=22357).

Test Plan: nary-add.ll

Reviewers: broune, dberlin, hfinkel, meheff, sanjoy, atrick

Reviewed By: sanjoy, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8950

llvm-svn: 234855
2015-04-14 04:59:22 +00:00
Duncan P. N. Exon Smith acdee690c8 DebugInfo: Update signature of DICompileUnit::replace*()
Change `DICompileUnit::replaceSubprograms()` and
`DICompileUnit::replaceGlobalVariables()` to match the `MDCompileUnit`
equivalents that they're wrapping.

llvm-svn: 234852
2015-04-14 03:51:36 +00:00
Duncan P. N. Exon Smith 537b4a8159 DebugInfo: Gut DISubprogram and DILexicalBlock*
Gut the `DIDescriptor` wrappers around `MDLocalScope` subclasses.  Note
that `DILexicalBlock` wraps `MDLexicalBlockBase`, not `MDLexicalBlock`.

llvm-svn: 234850
2015-04-14 03:40:37 +00:00
Sanjoy Das e178f46965 [LoopUnrollRuntime] Avoid high-cost trip count computation.
Summary:
Runtime unrolling of loops needs to emit an expression to compute the
loop's runtime trip-count.  Avoid runtime unrolling if this computation
will be expensive.

Depends on D8993.

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8994

llvm-svn: 234846
2015-04-14 03:20:38 +00:00
Sanjoy Das 2e6bb3b947 [SCEV] Refactor out isHighCostExpansion. NFCI.
Summary:
Move isHighCostExpansion from IndVarSimplify to SCEVExpander.  This
exposed function will be used in a subsequent change.

Reviewers: bogner, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8995

llvm-svn: 234844
2015-04-14 03:20:28 +00:00
Duncan P. N. Exon Smith 7348ddaa74 DebugInfo: Gut DIVariable and DIGlobalVariable
Gut all the non-pointer API from the variable wrappers, except an
implicit conversion from `DIGlobalVariable` to `DIDescriptor`.  Note
that if you're updating out-of-tree code, `DIVariable` wraps
`MDLocalVariable` (`MDVariable` is a common base class shared with
`MDGlobalVariable`).

llvm-svn: 234840
2015-04-14 02:22:36 +00:00
Duncan P. N. Exon Smith b7e221ba55 DebugInfo: Gut DILocation
This is along the same lines as r234832, but for `DILocation`.  Clean
out all accessors from `DILocation`.  Any callers should be using
`MDLocation` directly (e.g., via `operator->()`).

llvm-svn: 234835
2015-04-14 01:35:55 +00:00
Duncan P. N. Exon Smith 6a0320a991 DebugInfo: Gut DIExpression
Completely gut `DIExpression`, turning it into a simple wrapper around
`MDExpression *`.  There are two bits of magic left:

  - It's constructed from `const MDExpression*` but convertible to
    `MDExpression*`.
  - It's default-constructed to `nullptr`.

Otherwise, it should behave quite like a raw pointer.  Once I've done
the same to the rest of the `DIDescriptor` subclasses, I'll come back to
delete them entirely (and update call sites as necessary to deal with
the missing magic).

llvm-svn: 234832
2015-04-14 01:12:42 +00:00
Philip Reames ba1984958d [RewriteStatepointsForGC] Delete dead code [NFC]
Before we had real liveness, we needed to track every value that base pointer
insertion code created because these now might be live.  We now just rerun 
the data flow liveness algorithm (which is actually faster!) and no longer 
need the associated code.

llvm-svn: 234827
2015-04-14 00:41:34 +00:00
Duncan P. N. Exon Smith 843237f573 DebugInfo: Move DILocation::computeNewDiscriminators()
As documented in PR23200 (and the FIXMEs I've added to the code here),
this logic is fairly broken: it modifies the `LLVMContext` in a way that
affects other modules and cannot be serialized to assembly/bitcode.  For
now, move it over to `MDLocation::computeNewDiscriminators()` anyway.

llvm-svn: 234825
2015-04-14 00:35:42 +00:00
Duncan P. N. Exon Smith 4fd839b0da AddDiscriminators: Create new MDLocation directly
I don't see a reason to add the `copyWithNewScope()` API over to
`MDLocation` -- it seems to be a holdover from when creating locations
required knowing details of operand layout -- so change
`AddDiscriminators` to call `MDLocation::get()` directly.  Should be no
functionality change here.

llvm-svn: 234824
2015-04-14 00:34:30 +00:00
Duncan P. N. Exon Smith 7fa6629d7d StripSymbols: Use DIGlobalVariable::getConstant() instead of getGlobal()
The only difference between the two is a `dyn_cast<>` to
`GlobalVariable`.  If optimizations have left anything behind when a
global gets replaced, then it doesn't seem like the debug info is dead.

I can't seem to find an optimization that would leave behind a
non-`GlobalVariable` without nulling the reference entirely, so I
haven't added a testcase (but I'll be deleting `getGlobal()` in a future
commit).

llvm-svn: 234792
2015-04-13 20:13:30 +00:00
Nick Lewycky d6f241d53b GCC complains thusly: "attributes at the beginning of statement are ignored [-Werror=attributes]". Very well then! NFC
llvm-svn: 234788
2015-04-13 20:03:08 +00:00
Philip Reames f209a153f1 [RwriteStatepointsForGC] Minor indentation and naming [NFC]
Use early-return style that's preferred in LLVM and updating the naming in places I touched with other changes in the last few days.  Hopefully, NFC.

llvm-svn: 234785
2015-04-13 20:00:30 +00:00
Nick Lewycky abe2cc17da Subtraction is not commutative. Fixes PR23212!
llvm-svn: 234780
2015-04-13 19:17:37 +00:00
Philip Reames 2114275263 [RewriteStatepointsForGC] Avoid inserting empty holder
We use dummy calls to adjust the liveness of values over statepoints in the midst of the insertion.  If there are no values which need held live, there's no point in actually inserting the holder.  

llvm-svn: 234779
2015-04-13 19:07:47 +00:00
Philip Reames 69e51cae33 [RewriteStatepointsForGC] Fix a latent bug in normalization for invoke statepoint [NFC]
Since we're restructuring the CFG, we also need to make sure to update the analsis passes. While I'm touching the code, I dedicided to restructure it a bit.  The code involved here was very confusing.  This change moves the normalization to essentially being a pre-pass before the main insertion work and updates a few comments to actually say what is happening and *why*.

The restructuring should be covered by existing tests.  I couldn't easily see how to create a test for the invalidation bug.  Suggestions welcome.

llvm-svn: 234769
2015-04-13 18:07:21 +00:00
Philip Reames 9a2e01d908 [RewriteStatepointsForGC] Strengthen assertions around liveness
This is related to the issues addressed in 234651.  These assertions check the properties ensured by that change at the place of use.  Note that a similiar property is checked in checkBasicSSA, but without the reachability constraint.  Technically, the liveness would be correct to include unreachable values, but this would be problematic for actual relocation.

llvm-svn: 234766
2015-04-13 17:35:55 +00:00
Philip Reames e73300b925 [RewriteStatepointsForGC] Move an expensive debugging check to XDEBUG
The check in question is attempting to help find cases where we haven't relocated a pointer at a safepoint we should have.  It does this by coercing the value to null at any safepoint which doesn't relocate it.  

Unfortunately, this turns out to be rather expensive in terms of memory usage and time.  The number of stores inserted can grow with O(number of values x number of statepoints).  On at least one example I looked at, over half of peak memory usage was coming from this check.  

With this change, the check is no longer enabled by default in Asserts builds.  It is enabled for expensive asserts builds and has a command line option to enable it in both Asserts and non-Asserts builds.  

llvm-svn: 234761
2015-04-13 16:41:32 +00:00
Mark Lacey 274f48b5a8 Fix typo.
llvm-svn: 234706
2015-04-12 18:18:51 +00:00
Sanjoy Das 71190feca5 [LoopUnrollRuntime] Clean up a predicate.
Clean up a predicate I added in r229731, fix the relevant comment and
add a test case.  The earlier version is confusing to read and was also
buggy (probably not a coincidence) till Alexey fixed it in r233881.

llvm-svn: 234701
2015-04-12 01:24:01 +00:00
Benjamin Kramer 79de6e6d89 Mark empty default constructors as =default if it makes the type POD
NFC

llvm-svn: 234694
2015-04-11 18:57:14 +00:00
Benjamin Kramer dd0ff85701 Remove empty non-virtual destructors or mark them =default when non-public
These add no value but can make a class non-trivially copyable. NFC.

llvm-svn: 234688
2015-04-11 15:32:26 +00:00
Alexander Kornienko f817c1cb9a Use 'override/final' instead of 'virtual' for overridden methods
The patch is generated using clang-tidy misc-use-override check.

This command was used:

  tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
    -checks='-*,misc-use-override' -header-filter='llvm|clang' \
    -j=32 -fix -format

http://reviews.llvm.org/D8925

llvm-svn: 234679
2015-04-11 02:11:45 +00:00
Duncan P. N. Exon Smith 63ffa21d90 DebugInfo: Rewrite atSameLineAs() as MDLocation::canDiscriminate()
Rewrite `DILocation::atSameLineAs()` as `MDLocation::canDiscriminate()`
with a doxygen comment explaining its purpose.  I've added a few FIXMEs
where I think this check is too weak; fixing that is tracked by PR23199.

llvm-svn: 234674
2015-04-11 01:00:47 +00:00
Philip Reames 9638ff9b37 [Statepoints] Fix a release only build failure
A function which is used only in Asserts builds needs to be defined only in Asserts builds.

llvm-svn: 234667
2015-04-11 00:06:47 +00:00
Philip Reames 4d80ede538 [RewriteStatepointsForGC] Use a SetVector for a worklist [NFC]
Using a SetVector to replace equivelent but more verbose functionality.

llvm-svn: 234662
2015-04-10 23:11:26 +00:00
Philip Reames df1ef08c0c [RewriteStatepointsForGC] Use an actual liveness algorithm
When rewriting statepoints to make relocations explicit, we need to have a conservative but consistent notion of where a particular pointer is live at a particular site. The old code just used dominance, which is correct, but decidedly more conservative then it needed to be. This patch implements a simple dataflow algorithm that's run one per function (well, twice counting fixup after base pointer insertion). There's still lots of room to make this faster, but it's fast enough for all practical purposes today.

Differential Revision: http://reviews.llvm.org/D8674

llvm-svn: 234657
2015-04-10 22:53:14 +00:00
Philip Reames 704e78b149 [RewriteStatepointsForGC] clang-format file
Format the entire file to reduce diff of change to follow.

llvm-svn: 234656
2015-04-10 22:34:56 +00:00
Philip Reames f66d73708b [RewriteStatepointsForGC] Missed review comment from 234651 & build fix
After submitting 234651, I noticed I hadn't responded to a review comment by mjacob.  This patch addresses that comment and fixes a Release only build problem due to an unused variable.  

llvm-svn: 234653
2015-04-10 22:16:58 +00:00
Philip Reames 85b36a8157 [RewriteStatepointsForGC] Preprocess the IR to remove unreachable blocks and single entry phis
Two related small changes:

    Various dominance based queries about liveness can get confused if we're talking about unreachable blocks. To avoid reasoning about such cases, just remove them before rewriting statepoints.
    Remove single entry phis (likely left behind by LCSSA) to reduce the number of live values.

Both of these are motivated by http://reviews.llvm.org/D8674 which will be submitted shortly.

Differential Revision: http://reviews.llvm.org/D8675

llvm-svn: 234651
2015-04-10 22:07:04 +00:00
Philip Reames 8531d8c491 [RewriteStatepointsForGC] Limited support for vectors of pointers
This patch adds limited support for inserting explicit relocations when there's a vector of pointers live over the statepoint. This doesn't handle the case where the vector contains a mix of base and non-base pointers; that's future work.

The current implementation just scalarizes the vector over the gc.statepoint before doing the explicit rewrite. An alternate approach would be to plumb the vector all the way though the backend lowering, but doing that appears challenging. In particular, the size of the indirect spill slot is currently assumed to be sizeof(pointer) throughout the backend.

In practice, this is enough to allow running the SLP and Loop vectorizers before RewriteStatepointsForGC.

Differential Revision: http://reviews.llvm.org/D8671

llvm-svn: 234647
2015-04-10 21:48:25 +00:00
Sanjoy Das b6c5914308 [InstCombine][CodeGenPrep] Create llvm.uadd.with.overflow in CGP.
Summary:
This change moves creating calls to `llvm.uadd.with.overflow` from
InstCombine to CodeGenPrep.  Combining overflow check patterns into
calls to the said intrinsic in InstCombine inhibits optimization because
it introduces an intrinsic call that not all other transforms and
analyses understand.

Depends on D8888.

Reviewers: majnemer, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8889

llvm-svn: 234638
2015-04-10 21:07:09 +00:00
Reid Kleckner 6e48a826e8 [WinEH] Try to make outlining invokes work a little better
WinEH currently turns invokes into calls. Long term, we will reconsider
this, but for now, make sure we remap the operands and clone the
successors of the new terminator.

llvm-svn: 234608
2015-04-10 16:26:42 +00:00
Benjamin Kramer 3a09ef64ee [CallSite] Make construction from Value* (or Instruction*) explicit.
CallSite roughly behaves as a common base CallInst and InvokeInst. Bring
the behavior closer to that model by making upcasts explicit. Downcasts
remain implicit and work as before.

Following dyn_cast as a mental model checking whether a Value *V isa
CallSite now looks like this: 
  if (auto CS = CallSite(V)) // think dyn_cast
instead of:
  if (CallSite CS = V)

This is an extra token but I think it is slightly clearer. Making the
ctor explicit has the advantage of not accidentally creating nullptr
CallSites, e.g. when you pass a Value * to a function taking a CallSite
argument.

llvm-svn: 234601
2015-04-10 14:50:08 +00:00
Benjamin Kramer 619c4e57ba Reduce dyn_cast<> to isa<> or cast<> where possible.
No functional change intended.

llvm-svn: 234586
2015-04-10 11:24:51 +00:00
Cameron Zwarich b282ef0111 Eliminate O(n^2) worst-case behavior in SSA construction
The code uses a priority queue and a worklist, which share the same
visited set, but the visited set is only updated when inserting into
the priority queue. Instead, switch to using separate visited sets
for the priority queue and worklist.

llvm-svn: 234425
2015-04-08 18:26:20 +00:00
Adam Nemet ce48250f11 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
(Re-apply r234361 with a fix and a testcase for PR23157)

Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

llvm-svn: 234424
2015-04-08 17:48:40 +00:00
Sanjoy Das b098447128 [InstCombine] Refactor out OptimizeOverflowCheck. NFCI.
Summary:
This patch adds an enum `OverflowCheckFlavor` and a function
`OptimizeOverflowCheck`.  This will allow InstCombine to optimize
overflow checks without directly introducing an intermediate call to the
`llvm.$op.with.overflow` instrinsics.

This specific change is a refactoring and does not intend to change
behavior.

Reviewers: majnemer, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8888

llvm-svn: 234388
2015-04-08 04:27:22 +00:00
Adam Nemet e09a928c80 Revert "[LoopAccesses] Allow analysis to complete in the presence of uniform stores"
This reverts commit r234361.

It caused PR23157.

llvm-svn: 234387
2015-04-08 04:16:55 +00:00
Adam Nemet 0515c33b70 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

llvm-svn: 234361
2015-04-07 21:46:16 +00:00
Duncan P. N. Exon Smith 000fa2c646 DebugInfo: Remove DITypedArray<>, replace with typedefs
Replace all uses of `DITypedArray<>` with `MDTupleTypedArrayWrapper<>`
and `MDTypeRefArray`.  The APIs are completely different, but the
provided functionality is the same: treat an `MDTuple` as if it's an
array of a particular element type.

To simplify this patch a bit, I've temporarily typedef'ed
`DebugNodeArray` to `DIArray` and `MDTypeRefArray` to `DITypeArray`.
I've also temporarily conditionalized the accessors to check for null --
eventually these should be changed to asserts and the callers should
check for null themselves.

There's a tiny accompanying patch to clang.

llvm-svn: 234290
2015-04-07 04:14:33 +00:00
Duncan P. N. Exon Smith 6186fb2cd0 Transforms: Stop using DIDescriptor::is*() and auto-casting
Same as r234255, but for lib/Analysis and lib/Transforms.

llvm-svn: 234257
2015-04-06 23:27:00 +00:00
David Blaikie 9ac527037c ArgPromo: Bail out earlier for varargs functions
llvm-svn: 234224
2015-04-06 21:27:31 +00:00
Ismail Pazarbasi 198d6d53e2 Move `checkInterfaceFunction` to ModuleUtils
Summary:
Instead of making a local copy of `checkInterfaceFunction` for each
sanitizer, move the function in a common place.

Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8775

llvm-svn: 234220
2015-04-06 21:09:08 +00:00
Duncan P. N. Exon Smith a5099dce62 DebugInfo: Remove DIDescriptor::Verify()
Remove `DIDescriptor::Verify()` and the `Verify()`s from subclasses.
They had already been gutted, and just did an `isa<>` check.

In a couple of cases I've temporarily dropped the check entirely, but
subsequent commits are going to disallow conversions to the
`DIDescriptor`s directly from `MDNode`, so the checks will come back in
another form soon enough.

llvm-svn: 234201
2015-04-06 19:49:39 +00:00
Jingyue Wu 96d74006fd [SLSR] consider &B[S << i] as &B[(1 << i) * S]
Summary: This reduces handling &B[(1 << i) * s] to handling &B[i * S].

Test Plan: slsr-gep.ll

Reviewers: meheff

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D8837

llvm-svn: 234180
2015-04-06 17:15:48 +00:00
David Blaikie 1b01e7e263 clang-format my last commit
llvm-svn: 234127
2015-04-05 22:44:57 +00:00
David Blaikie 64646029bf [opaque pointer type] The last of the GEP IRBuilder API migrations
There's still lots of callers passing nullptr, of course - some because
they'll never be migrated (InstCombines for bitcasts - well they don't
make any sense when the pointer type is opaque anyway, for example) and
others that will need more engineering to pass Types around.

llvm-svn: 234126
2015-04-05 22:41:44 +00:00
David Blaikie 4e5d47f436 [opaque pointer type] More GEP API migrations
llvm-svn: 234108
2015-04-04 21:07:10 +00:00
David Blaikie 95d3e53720 [opaque pointer type] More GEP IRBuilder API migrations
llvm-svn: 234064
2015-04-03 23:03:54 +00:00
David Blaikie aa41cd57e0 [opaque pointer type] More GEP IRBuilder API migrations...
llvm-svn: 234058
2015-04-03 21:33:42 +00:00
David Blaikie 65fab6d896 Use early returns to reduce indentation.
llvm-svn: 234057
2015-04-03 21:32:06 +00:00
David Majnemer 98cfe2b7a5 [InstCombine] Use DataLayout to determine vector element width
InstCombine didn't realize that it needs to use DataLayout to determine
how wide pointers are.  This lead to assertion failures.

This fixes PR23113.

llvm-svn: 234046
2015-04-03 20:18:40 +00:00
David Blaikie 93c5444fe0 [opaque pointer type] More GEP API migrations in IRBuilder uses
The plan here is to push the API changes out from the common components
(like Constant::getGetElementPtr and IRBuilder::CreateGEP related
functions) and just update callers to either pass the type if it's
obvious, or pass null.

Do this with LoadInst as well and anything else that comes up, then to
start porting specific uses to not pass null anymore - this may require
some refactoring in each case.

llvm-svn: 234042
2015-04-03 19:41:44 +00:00
Reid Kleckner 92b9c6e6fe [ASan] Don't use stack malloc for 32-bit functions using inline asm
This prevents us from running out of registers in the backend.

Introducing stack malloc calls prevents the backend from recognizing the
inline asm operands as stack objects. When the backend recognizes a
stack object, it doesn't need to materialize the address of the memory
in a physical register. Instead it generates a simple SP-based memory
operand. Introducing a stack malloc forces the backend to find a free
register for every memory operand. 32-bit x86 simply doesn't have enough
registers for this to succeed in most cases.

Reviewers: kcc, samsonov

Differential Revision: http://reviews.llvm.org/D8790

llvm-svn: 233979
2015-04-02 21:44:55 +00:00
Jingyue Wu 99a6bed965 [SLSR] handles off bounds GEPs
Summary:
The old requirement on GEP candidates being in bounds is unnecessary.
For off-bound GEPs, we still have

  &B[i * S] = B + (i * S) * e = B + (i * e) * S

Test Plan: slsr_offbound_gep in slsr-gep.ll

Reviewers: meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8809

llvm-svn: 233949
2015-04-02 21:18:32 +00:00
David Blaikie 4a2e73b066 [opaque pointer type] API migration for GEP constant factories
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.

llvm-svn: 233938
2015-04-02 18:55:32 +00:00
Alexey Samsonov f96cde9f68 Fix a bug indicated by -fsanitize=shift-exponent.
llvm-svn: 233881
2015-04-02 01:30:10 +00:00
Ahmed Bougacha 408d010a7c [SimplifyLibCalls] Ignore nobuiltin/unavailable fortified libcalls.
We used to do this before refactorings around r225640.
Some clang users checked for _chk libcall availability using:
  __has_builtin(__builtin___memcpy_chk)
When compiling with -fno-builtin, this is always true.
When passing -ffreestanding/-mkernel, which both imply -fno-builtin, we
end up with fortified libcalls, which isn't acceptable in a freestanding
environment which only provides their non-fortified counterparts.

Until we change clang and/or teach external users to check for availability
differently, disregard the "nobuiltin" attribute and TLI::has.

Workaround for PR23093.

llvm-svn: 233776
2015-04-01 00:45:09 +00:00
David Blaikie d288fb8681 [opaque pointer type] Change GetElementPtrInst::getIndexedType to take the pointee type
This pushes the use of PointerType::getElementType up into several
callers - I'll essentially just have to keep pushing that up the stack
until I can eliminate every call to it...

llvm-svn: 233604
2015-03-30 21:41:43 +00:00
David Blaikie 3909da7f4b [opaque pointer type] More IRBuilder::createGEP (non-inbounds) migrations: CodeGenPrepare and SimplifyLibCalls
llvm-svn: 233596
2015-03-30 20:42:56 +00:00
Duncan P. N. Exon Smith ec819c096b Transforms: Use the new DebugLoc API, NFC
Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API.

llvm-svn: 233587
2015-03-30 19:49:49 +00:00
David Blaikie 87ca1b6e0c Constrain the type of a parameter now that callers without this constraint have been removed.
llvm-svn: 233419
2015-03-27 20:56:11 +00:00
David Blaikie e15dcbdf3e Recommit r233116 better: Remove a redundant instcombine involving bitcasts of geps of bitcasts
This just didn't need to be here at all, but the assertion I tried to
add wasn't appropriate either - the circumstance isn't impossible, it's
just not important to deal with it here - the gep-rooted version of this
instcombine will handle this case, we don't need to duplicate it for the
case where the gep happens to be used in a bitcast.

llvm-svn: 233404
2015-03-27 20:13:55 +00:00
Anna Zaks bf28d3aa33 [asan] Speed up isInterestingAlloca check
We make many redundant calls to isInterestingAlloca in the AddressSanitzier
pass. This is especially inefficient for allocas that have many uses. Let's
cache the results to speed up compilation.

The compile time improvements depend on the input. I did not see much
difference on benchmarks; however, I have a test case where compile time
goes from minutes to under a second.

llvm-svn: 233397
2015-03-27 18:52:01 +00:00
Yaron Keren 75e0c4b060 Remove superfluous .str() and replace std::string concatenation with Twine.
llvm-svn: 233392
2015-03-27 17:51:30 +00:00
James Molloy 0cbb2a8603 Reapply r233175 and r233183: float2int.
This re-adds float2int to the tree, after fixing PR23038. It turns
out the argument to APSInt() is true-if-unsigned, rather than
true-if-signed :(. Added testcase and explanatory comment.

llvm-svn: 233370
2015-03-27 10:36:57 +00:00
Sanjoy Das 7041fb1c13 [NFC] Fix typo in comment.
llvm-svn: 233363
2015-03-27 06:01:56 +00:00
Philip Reames a6ebf075b1 Code cleanup [NFC]
The assertion here was more expensive then it needed to be.  We're only inserting allocas in the entry block, so we only need to consider ones in the entry block.

llvm-svn: 233362
2015-03-27 05:53:16 +00:00
Philip Reames 24c6cd52e0 More code cleanup [NFC]
llvm-svn: 233361
2015-03-27 05:47:00 +00:00
Philip Reames 18d0feb7d2 More code cleanup [NFC]
Minor naming, one potentially unsafe cast

llvm-svn: 233359
2015-03-27 05:39:32 +00:00
Philip Reames aa66dfa028 Code simplification and style cleanup
All the removed assertions are either implied locally by the assert at the top of the function or properties of the verifier.

llvm-svn: 233358
2015-03-27 05:34:44 +00:00
Karthik Bhat 0f8c908934 Refactor Code inside LoopVectorizer's function isInductionVariable.
This patch exposes LoopVectorizer's isInductionVariable function as common
a functionality.
http://reviews.llvm.org/D8608

llvm-svn: 233352
2015-03-27 03:44:15 +00:00
Nick Lewycky ffb0864b44 Revert r233175 and r233183 with it. This pulls float2int back out of the tree, due to PR23038.
llvm-svn: 233350
2015-03-27 02:00:11 +00:00
Benjamin Kramer 7fa8c430f7 InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0
Anding and comparing with zero can be done in a single instruction on
most archs so this is a bit cheaper.

llvm-svn: 233291
2015-03-26 17:12:06 +00:00
Jingyue Wu 177a81578f [SLSR] handle candidate form &B[i * S]
Summary:
This patch enhances SLSR to handle another candidate form &B[i * S]. If
we found two candidates

S1: X = &B[i * S]
S2: Y = &B[i' * S]

and S1 dominates S2, we can replace S2 with

Y = &X[(i' - i) * S]

Test Plan:
slsr-gep.ll
X86/no-slsr.ll: verify that we do not run SLSR on GEPs that already fit into
an addressing mode

Reviewers: eliben, atrick, meheff, hfinkel

Reviewed By: hfinkel

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D7459

llvm-svn: 233286
2015-03-26 16:49:24 +00:00
Andrea Di Biagio 460948c9ab [optnone] Skip pass Float2Int on optnone functions.
Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int
is not run on optnone functions.

llvm-svn: 233183
2015-03-25 12:22:37 +00:00
James Molloy cb75d92458 Reapply r233062: "float2int": Add a new pass to demote from float to int where possible.
Now with a fix for PR23008 and extra regression test.

llvm-svn: 233175
2015-03-25 10:03:42 +00:00
David Blaikie 156d46eda0 Opaque Pointer Types: GEP API migrations to specify the gep type explicitly
The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)

SCEV looks like it'll need some restructuring - we'll have to do a bit
more work for GEP canonicalization, since it'll depend on how it's used
if we can even manage to canonicalize it to a non-ugly GEP. I guess we
can do some fun stuff like voting (do 2 out of 3 load from the GEP with
a certain type that gives a pretty GEP? Does every typed use of the GEP
use either a specific type or a generic type (i8*, etc)?)

llvm-svn: 233131
2015-03-24 23:34:31 +00:00
Sanjay Patel e304bea010 optimize the AVX2 (integer) version of vperm2 into a shuffle
...because this is what happens when an instruction
set puts its underwear on after its pants.

This is an extension of r232852, r233100, and 233110:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232852
http://llvm.org/viewvc/llvm-project?view=revision&revision=233100
http://llvm.org/viewvc/llvm-project?view=revision&revision=233110

llvm-svn: 233127
2015-03-24 22:39:29 +00:00
David Blaikie 68d535c45f Opaque Pointer Types: GEP API migrations to specify the gep type explicitly
The changes to InstCombine do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)

llvm-svn: 233126
2015-03-24 22:38:16 +00:00
Philip Reames 2b969d7010 Merge empty landing pads in SimplifyCFG
This patch tries to merge duplicate landing pads when they branch to a common shared target.

Given IR that looks like this:
lpad1:
  %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
         cleanup
  br label %shared_resume
lpad2:
  %exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
          cleanup
  br label %shared_resume
shared_resume:
  call void @fn()
  ret void
}

We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.

Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.

Differential Revision: http://reviews.llvm.org/D8297

llvm-svn: 233125
2015-03-24 22:28:45 +00:00
David Blaikie 1a6bb9fcf6 Revert "Remove an InstCombine that seems to have become redundant."
Assertion fires in compiler-rt. Guess it does fire..

This reverts commit r233116.

llvm-svn: 233121
2015-03-24 21:50:35 +00:00
David Blaikie e37e10dc57 Remove an InstCombine that seems to have become redundant.
Assert that this doesn't fire - I'll remove all of this later, but just
leaving it in for a while in case this is firing & we just don't have
test coverage.

llvm-svn: 233116
2015-03-24 21:31:31 +00:00
Sanjay Patel 43a87fdc79 [X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles
This is the IR optimizer follow-on patch for D8563: the x86 backend patch
that converts this kind of shuffle back into a vperm2.

This is also a continuation of the transform that started in D8486. 
In that patch, Andrea suggested that we could convert vperm2 intrinsics that
use zero masks into a single shuffle. 

This is an implementation of that suggestion.

Differential Revision: http://reviews.llvm.org/D8567

llvm-svn: 233110
2015-03-24 20:36:42 +00:00
Hans Wennborg e42c64551a Revert r233062 ""float2int": Add a new pass to demote from float to int where possible."
This caused PR23008, compiles failing with: "Use still stuck around after Def is
destroyed: %.sroa.speculated"

Also reverting follow-up r233064.

llvm-svn: 233105
2015-03-24 20:07:08 +00:00
Sanjoy Das 45dc94a856 [IRCE] Fix how IRCE checks for no-sign-overflow.
IRCE requires the induction variables it handles to not sign-overflow.
The current scheme of checking if sext({X,+,S}) == {sext(X),+,sext(S)}
fails when SCEV simplifies sext(X) too.  After this change we //also//
check no-signed-wrap by looking at the flags set on the SCEVAddRecExpr.

llvm-svn: 233102
2015-03-24 19:29:22 +00:00
Sanjoy Das 337d46b36f [IRCE] Fix a regression introduced in r232444.
IRCE should not try to eliminate range checks that check an induction
variable against a loop-varying length.

llvm-svn: 233101
2015-03-24 19:29:18 +00:00
Benjamin Kramer e3b961a6e2 [float2int] Sort includes and add missing raw_ostream include.
llvm-svn: 233064
2015-03-24 11:28:47 +00:00
James Molloy 408df5160c "float2int": Add a new pass to demote from float to int where possible.
It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used.

This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers.

This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point.

To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around.

llvm-svn: 233062
2015-03-24 11:15:23 +00:00
Benjamin Kramer 799003bf8c Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.
llvm-svn: 232998
2015-03-23 19:32:43 +00:00
Benjamin Kramer 1f7c328bf2 [ctorutils] Update and sort includes. NFC.
llvm-svn: 232995
2015-03-23 19:06:17 +00:00
Benjamin Kramer b85d3756a6 Another set of missing raw_ostream.h. Still no functional change.
llvm-svn: 232993
2015-03-23 18:45:56 +00:00
Benjamin Kramer 16132e6faa Purge unused includes throughout libSupport.
NFC.

llvm-svn: 232976
2015-03-23 18:07:13 +00:00
Benjamin Kramer 51f6096cf8 Move private classes into anonymous namespaces
NFC.

llvm-svn: 232944
2015-03-23 12:30:58 +00:00
Benjamin Kramer d6aa0ec737 [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform.
llvm-svn: 232903
2015-03-21 22:04:26 +00:00
Benjamin Kramer 7857d723f1 [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.

int foo(char C) { return strchr("123!", C) != nullptr; } now becomes

	cmpl	$64, %edi ## range check
	sbbb	%al, %al
	movabsq	$0xE000200000001, %rcx
	btq	%rdi, %rcx ## bit test
	sbbb	%cl, %cl
	andb	%al, %cl ## and the two conditions
	andb	$1, %cl
	movzbl	%cl, %eax ## returning an int
	ret

(imho the backend should expand this into a series of branches, but
that's a different story)

The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.

llvm-svn: 232902
2015-03-21 21:09:33 +00:00
Benjamin Kramer 691363e7f2 SimplifyLibCalls: Add basic optimization of memchr calls.
This is just memchr(x, y, 0) -> nullptr and constant folding.

llvm-svn: 232896
2015-03-21 15:36:21 +00:00
Kostya Serebryany f4e35cc47d [sanitizer] experimental tracing for cmp instructions
llvm-svn: 232873
2015-03-21 01:29:36 +00:00
Sanjay Patel ccf5f24b7b [X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles
vperm2* intrinsics are just shuffles. 
In a few special cases, they're not even shuffles.

Optimizing intrinsics in InstCombine is better than
handling this in the front-end for at least two reasons:

1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders
   really angry (and so I have regrets about some patches from last week).

2. Doing mask conversion logic in header files is hard to write and 
   subsequently read.

There are a couple of TODOs in this patch to complete this optimization.

Differential Revision: http://reviews.llvm.org/D8486

llvm-svn: 232852
2015-03-20 21:47:56 +00:00
Andrew Kaylor 3170e5620e Fixing a bug with WinEH PHI handling
llvm-svn: 232851
2015-03-20 21:42:54 +00:00
Duncan P. N. Exon Smith 18c97fa2a0 SanitizerCoverage: Check for null DebugLocs
After a WIP patch to make `DIDescriptor` accessors more strict, this
started asserting.

llvm-svn: 232832
2015-03-20 18:48:45 +00:00
Duncan P. N. Exon Smith 41a1546ebc SampleProfile: Check for missing debug locations
Don't use `DebugLoc` accessors if we're pointing at null, which will be
a problem after a WIP patch to make the `DIDescriptor` accessors more
strict.  Caught by Frontend/profile-sample-use-loc-tracking.c (in
clang).

llvm-svn: 232792
2015-03-20 00:56:55 +00:00
Duncan P. N. Exon Smith ab58a568ee Verifier: Remove the separate -verify-di pass
Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass.
Instead, call into the `DebugInfoVerifier` from inside
`VerifierLegacyPass::finalizeModule()`.  This better matches the logic
in `verifyModule()` (used by the new PassManager), avoids requiring two
separate passes to verify the IR, and makes the API for "add a pass to
verify the IR" simple.

Note: the `-verify-debug-info` flag still works (for now, at least;
eventually it might make sense to just remove it).

llvm-svn: 232772
2015-03-19 22:24:17 +00:00
Peter Collingbourne 994ba3d29c LowerBitSets: Avoid reusing byte set addresses.
Each use of the byte array uses a different alias. This makes the
backend less likely to reuse previously computed byte array addresses,
improving the security of the CFI mechanism based on this pass.

Differential Revision: http://reviews.llvm.org/D8455

llvm-svn: 232770
2015-03-19 22:02:10 +00:00
Peter Collingbourne 070843d60b libLTO, llvm-lto, gold: Introduce flag for controlling optimization level.
This change also introduces a link-time optimization level of 1. This
optimization level runs only the globaldce pass as well as cleanup passes for
passes that run at -O0, specifically simplifycfg which cleans up lowerbitsets.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html

llvm-svn: 232769
2015-03-19 22:01:00 +00:00
Duncan P. N. Exon Smith 0a93e2db9c PassManagerBuilder: Remove effectively dead 'StripDebug' option
`StripDebug` was only used by tools/opt/opt.cpp in
`AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its
command-line flag before it calls `AddStandardLinkPasses()`.  Stripping
debug info twice isn't very useful.

llvm-svn: 232765
2015-03-19 21:37:17 +00:00
Peter Collingbourne 0dbc7088da GlobalDCE: Improve performance for large modules containing comdats.
When we encounter a global with a comdat, rather than iterating over
every global in the module to find globals in the same comdat, store the
members in a multimap. This effectively lowers the complexity to O(N log N),
improving performance significantly for large modules such as might be
encountered during LTO.

It looks like we used to do something like this until r219191.

No functional change.

Differential Revision: http://reviews.llvm.org/D8431

llvm-svn: 232743
2015-03-19 18:23:29 +00:00
Daniel Jasper 5add63f21e [InstCombine] Don't fold a GEP into itself through a PHI node
This can only occur (I think) through the back-edge of the loop.

However, folding a GEP into itself means that the value of the previous
iteration needs to be stored in the meantime, thus requiring an
additional register variable to be live, but not actually achieving
anything (the gep still needs to be executed once per loop iteration).

The attached test case is derived from:
  typedef unsigned uint32;
  typedef unsigned char uint8;
  inline uint8 *f(uint32 value, uint8 *target) {
    while (value >= 0x80) {
      value >>= 7;
      ++target;
    }
    ++target;
    return target;
  }
  uint8 *g(uint32 b, uint8 *target) {
    target = f(b, f(42, target));
    return target;
  }

What happens is that the GEP stored in incptr2 is folded into itself
through the loop's back-edge and the phi-node stored in loopptr,
effectively incrementing the ptr by "2" in each iteration instead of "1".

In this case, it is actually increasing the number of GEPs required as
the GEP before the loop can't be folded away anymore. For comparison:

With this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1
    %shr = lshr i32 %newval, 7
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %0, i64 2
    ret i8* %incptr3
  }

Without this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %incptr = getelementptr inbounds i8, i8* %buffer, i64 1
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %shr = lshr i32 %newval, 7
    %incptr2 = getelementptr inbounds i8, i8* %0, i64 2
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1
    ret i8* %incptr3
  }

Review: http://reviews.llvm.org/D8245
llvm-svn: 232718
2015-03-19 11:05:08 +00:00
Sanjoy Das 7182d36f66 [ConstantRange] Split makeICmpRegion in two.
Summary:
This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
`makeSatisfyingICmpRegion` with slightly different contracts.  The first
one is useful for determining what values some expression //may// take,
given that a certain `icmp` evaluates to true.  The second one is useful
for determining what values are guaranteed to //satisfy// a given
`icmp`.

Reviewers: nlewycky

Reviewed By: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8345

llvm-svn: 232575
2015-03-18 00:41:24 +00:00
Michael Zolotukhin 9ef5671d36 Try to fix a test broken by one of my previous commits.
llvm-svn: 232536
2015-03-17 20:31:56 +00:00
Michael Zolotukhin 9b3cf604ce LoopVectorize: teach loop vectorizer to vectorize calls.
The tests would be committed in a commit for http://reviews.llvm.org/D8131

Review: http://reviews.llvm.org/D8095
llvm-svn: 232530
2015-03-17 19:46:50 +00:00
Michael Zolotukhin 1d4e52512c LoopVectorizer: Add TargetTransformInfo.
Review: http://reviews.llvm.org/D8092
llvm-svn: 232522
2015-03-17 19:17:18 +00:00
Kostya Serebryany b1870a64cf [asan] remove redundant ifndefs. NFC
llvm-svn: 232521
2015-03-17 19:13:23 +00:00
Michael Liao 24fcae8fa0 [SwitchLowering] Remove incoming values in the reverse order
- To prevent invalidating *successive* indices.
 

llvm-svn: 232510
2015-03-17 18:03:10 +00:00
David Blaikie c4dfa63928 Fix GCC -Wparentheses warning (& reformat now that the precedence is fixed)
Benign warning (clang deliberately suppresses this case) but does
regularly produce bad formatting, so it's nice to fix/reformat.

llvm-svn: 232508
2015-03-17 17:48:24 +00:00
Dmitry Vyukov 618d580ec9 asan: optimization experiments
The experiments can be used to evaluate potential optimizations that remove
instrumentation (assess false negatives). Instead of completely removing
some instrumentation, you set Exp to a non-zero value (mask of optimization
experiments that want to remove instrumentation of this instruction).
If Exp is non-zero, this pass will emit special calls into runtime
(e.g. __asan_report_exp_load1 instead of __asan_report_load1). These calls
make runtime terminate the program in a special way (with a different
exit status). Then you run the new compiler on a buggy corpus, collect
the special terminations (ideally, you don't see them at all -- no false
negatives) and make the decision on the optimization.

The exact reaction to experiments in runtime is not implemented in this patch.
It will be defined and implemented in a subsequent patch.

http://reviews.llvm.org/D8198

llvm-svn: 232502
2015-03-17 16:59:19 +00:00
Reid Kleckner 0b16859805 Use an underlying enum type of unsigned to silence a -Wmicrosoft warning about being unable to put (unsigned)-1 into the default underyling type of int
llvm-svn: 232498
2015-03-17 16:50:20 +00:00
Sanjoy Das 9c1bfae604 [IRCE] Add a -irce-print-range-checks option.
-irce-print-range-checks prints out the set of range checks recognized
by IRCE.

llvm-svn: 232451
2015-03-17 01:40:22 +00:00
Duncan P. N. Exon Smith 170c26d75e MapMetadata: Allow unresolved metadata if it won't change
Allow unresolved nodes through the `MapMetadata()` if
`RF_NoModuleLevelChanges`, since there's no remapping to do anyway.

This fixes PR22929.  I'll add a clang test as a follow-up.

llvm-svn: 232449
2015-03-17 01:14:40 +00:00
Sanjoy Das 7a0b7f5996 [IRCE] Add comments, NFC.
This change adds some comments that justify why a potentially
overflowing operation is safe.

llvm-svn: 232445
2015-03-17 00:42:16 +00:00
Sanjoy Das e2cde6f195 [IRCE] Support half-range checks.
This change to IRCE gets it to recognize "half" range checks.  Half
range checks are range checks that only either check if the index is
`slt` some positive integer ("length") or if the index is `sge` `0`.

The range solver does not try to be clever / aggressive about solving
half-range checks -- it transforms "I < L" to "0 <= I < L" and "0 <= I"
to "0 <= I < INT_SMAX".  This is safe, but not always optimal.

llvm-svn: 232444
2015-03-17 00:42:13 +00:00
Justin Bogner 3faa76bfab GCOV: Make the exit block placement from r223193 optional
By default we want our gcov emission to stay 4.2 compatible, which
means we need to continue emit the exit block last by default. We add
an option to emit it before the body for users that need it.

llvm-svn: 232438
2015-03-16 23:52:03 +00:00
Peter Collingbourne ad0bdcd238 LowerBitSets: do not use private aliases at all on Darwin.
LLVM currently turns these into linker-private symbols, which can be dead
stripped by the Darwin linker.

llvm-svn: 232435
2015-03-16 23:36:24 +00:00
Gabor Horvath fee043439c [llvm] Replacing asserts with static_asserts where appropriate
Summary:
This patch consists of the suggestions of clang-tidy/misc-static-assert check.


Reviewers: alexfh

Reviewed By: alexfh

Subscribers: xazax.hun, llvm-commits

Differential Revision: http://reviews.llvm.org/D8343

llvm-svn: 232366
2015-03-16 09:53:42 +00:00
Dmitry Vyukov ee842385ad asan: fix overflows in isSafeAccess
As pointed out in http://reviews.llvm.org/D7583
The current checks can cause overflows when object size/access offset cross Quintillion bytes.

http://reviews.llvm.org/D8193

llvm-svn: 232358
2015-03-16 08:04:26 +00:00
Michael Gottesman d63436fb2e One more try with unused.
llvm-svn: 232357
2015-03-16 08:00:27 +00:00
Michael Gottesman a0d2d3379e Add in an unreachable after a covered switch to appease certain bots.
llvm-svn: 232356
2015-03-16 07:46:34 +00:00
Michael Gottesman c219dd1de1 Remove a used that snuck in that seems to be triggering the MSVC buildbots.
llvm-svn: 232355
2015-03-16 07:34:17 +00:00
Michael Gottesman c01ab519e6 [objc-arc] Fix indentation of debug logging so it is easy to read the output.
llvm-svn: 232352
2015-03-16 07:02:39 +00:00
Michael Gottesman dd60f9bb09 [objc-arc] Make the ARC optimizer more conservative by forcing it to be non-safe in both direction, but mitigate the problem by noting that we just care if there was a further use.
The problem here is the infamous one direction known safe. I was
hesitant to turn it off before b/c of the potential for regressions
without an actual bug from users hitting the problem. This is that bug ;
).

The main performance impact of having known safe in both directions is
that often times it is very difficult to find two releases without a use
in-between them since we are so conservative with determining potential
uses. The one direction known safe gets around that problem by taking
advantage of many situations where we have two retains in a row,
allowing us to avoid that problem. That being said, the one direction
known safe is unsafe. Consider the following situation:

retain(x)
retain(x)
call(x)
call(x)
release(x)

Then we know the following about the reference count of x:

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
retain(x)
// rc(x) == N+2
call A(x)
call B(x)
// rc(x) >= 1 (since we can not release a deallocated pointer).
release(x)
// rc(x) >= 0

That is all the information that we can know statically. That means that
we know that A(x), B(x) together can release (x) at most N+1 times. Lets
say that we remove the inner retain, release pair.

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
call A(x)
call B(x)
// rc(x) >= 1
release(x)
// rc(x) >= 0

We knew before that A(x), B(x) could release x up to N+1 times meaning
that rc(x) may be zero at the release(x). That is not safe. On the other
hand, consider the following situation where we have a must use of
release(x) that x must be kept alive for after the release(x)**. Then we
know that:

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
retain(x)
// rc(x) == N+2
call A(x)
call B(x)
// rc(x) >= 2 (since we know that we are going to release x and that that release can not be the last use of x).
release(x)
// rc(x) >= 1 (since we can not deallocate the pointer since we have a must use after x).
…
// rc(x) >= 1
use(x)

Thus we know that statically the calls to A(x), B(x) can together only
release rc(x) N times. Thus if we remove the inner retain, release pair:

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
call A(x)
call B(x)
// rc(x) >= 1
…
// rc(x) >= 1
use(x)

We are still safe unless in the final … there are unbalanced retains,
releases which would have caused the program to blow up anyways even
before optimization occurred. The simplest form of must use is an
additional release that has not been paired up with any retain (if we
had paired the release with a retain and removed it we would not have
the additional use). This fits nicely into the ARC framework since
basically what you do is say that given any nested releases regardless
of what is in between, the inner release is known safe. This enables us to get
back the lost performance.

<rdar://problem/19023795>

llvm-svn: 232351
2015-03-16 07:02:36 +00:00
Michael Gottesman 7a26d8fa54 [objc-arc] Treat memcpy, memove, memset as just using pointers, not decrementing them.
This will be tested in the next commit (which required it). The commit
is going to update a bunch of tests at the same time.

llvm-svn: 232350
2015-03-16 07:02:32 +00:00
Michael Gottesman 6779217ec5 [objc-arc] Rename ConnectTDBUTraversals => PairUpRetainsReleases.
This is a name that is more descriptive of what the method really does. NFC.

llvm-svn: 232349
2015-03-16 07:02:30 +00:00
Michael Gottesman 65cb7377fd [objc-arc] Move initialization of ARCMDKindCache into the class itself. I also made it lazy.
llvm-svn: 232348
2015-03-16 07:02:27 +00:00
Michael Gottesman ca3a47288b [objc-arc] Change EntryPointType to an enum class outside of ARCRuntimeEntryPoints called ARCRuntimeEntryPointKind.
llvm-svn: 232347
2015-03-16 07:02:24 +00:00
David Blaikie 86ecb1bdaf [opaque pointer type] IRBuilder gep migration progress
llvm-svn: 232294
2015-03-15 01:03:19 +00:00
Mehdi Amini b344ac9afe Update InstCombine to transform aggregate stores into scalar stores.
Summary: This is a first step toward getting proper support for aggregate loads and stores.

Test Plan: Added unittests

Reviewers: reames, chandlerc

Reviewed By: chandlerc

Subscribers: majnemer, joker.eph, chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D7780

Patch by Amaury Sechet

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 232284
2015-03-14 22:19:33 +00:00
David Blaikie 72edd88273 Add some missed formatting
llvm-svn: 232281
2015-03-14 21:40:12 +00:00
David Blaikie 7682663ef6 [opaque pointer type] gep API migration, ArgPromo
This involved threading the type-to-gep through a data structure, since
the code was relying on the pointer type to carry this information. I
imagine there will be a lot of this work across the project... slow
work chasing each use case, but the assertions will help keep me honest.

llvm-svn: 232277
2015-03-14 21:11:26 +00:00
David Blaikie 096b1da29d [opaque pointer type] more gep API migration
llvm-svn: 232274
2015-03-14 19:53:33 +00:00
David Blaikie 22319eb920 [opaque pointer type] more gep API migrations
Adding nullptr to all the IRBuilder stuff because it's the first thing
that fails to build when testing without the back-compat functions, so
I'll keep having to re-add these locally for each chunk of migration I
do. Might as well check them in to save me the churn. Eventually I'll
have to migrate these too, but I'm going breadth-first.

llvm-svn: 232270
2015-03-14 19:24:04 +00:00
David Blaikie 741c8f81e4 [opaque pointer type] Start migrating GEP creation to explicitly specify the pointee type
I'm just going to migrate these in a pretty ad-hoc & incremental way -
providing the backwards compatible API for now, then locally removing
it, fixing a few callers, adding it back in and commiting those callers.
Rinse, repeat.

The assertions should ensure that if I get this wrong we'll find out
about it and not just have one giant patch to revert, recommit, revert,
recommit, etc.

llvm-svn: 232240
2015-03-14 01:53:18 +00:00
Peter Collingbourne c9f277f754 LowerBitSets: Do not export symbols for bit set referenced globals on Darwin.
The linker on that platform may re-order symbols or strip dead symbols, which
will break bit set checks. Avoid this by hiding the symbols from the linker.

llvm-svn: 232235
2015-03-14 00:00:49 +00:00
Robert Lougher 1858ba7626 Reapply "[Reassociate] Add initial support for vector instructions."
This reapplies the patch previously committed at revision 232190.  This was
reverted at revision 232196 as it caused test failures in tests that did not
expect operands to be commuted.  I have made the tests more resilient to
reassociation in revision 232206.

llvm-svn: 232209
2015-03-13 20:53:01 +00:00
Duncan P. N. Exon Smith be95b4afc6 instcombine: alloca: Canonicalize scalar allocation array size
As a follow-up to r232200, add an `-instcombine` to canonicalize scalar
allocations to `i32 1`.  Since r232200, `iX 1` (for X != 32) are only
created by RAUWs, so this shouldn't fire too often.  Nevertheless, it's
a cheap check and a nice cleanup.

llvm-svn: 232202
2015-03-13 19:42:09 +00:00
Duncan P. N. Exon Smith 07ff9b03f6 instcombine: alloca: Limit array size type promotion
Move type promotion of the size of the array allocation to the end of
`simplifyAllocaArraySize()`.  This avoids promoting the type of the
array size if it's a `ConstantInt`, since the next -instcombine
iteration will drop it to a scalar allocation anyway.  Similarly, this
avoids promoting the type if it's an `UndefValue`, in which case the
alloca gets RAUW'ed.

This is NFC when considered over the lifetime of -instcombine, since
it's just reducing the number of iterations needed to reach fixed point.

llvm-svn: 232201
2015-03-13 19:34:55 +00:00
Duncan P. N. Exon Smith 720762e2c0 AsmWriter: Write alloca array size explicitly (and -instcombine fixup)
Write the `alloca` array size explicitly when it's non-canonical.
Previously, if the array size was `iX 1` (where X is not 32), the type
would mutate to `i32` when round-tripping through assembly.

The testcase I added fails in `verify-uselistorder` (as well as
`FileCheck`), since the use-lists for `i32 1` and `i64 1` change.
(Manman Ren came across this when running `verify-uselistorder` on some
non-trivial, optimized code as part of PR5680.)

The type mutation started with r104911, which allowed array sizes to be
something other than an `i32`.  Starting with r204945, we
"canonicalized" to `i64` on 64-bit platforms -- and then on every
round-trip through assembly, mutated back to `i32`.

I bundled a fixup for `-instcombine` to avoid r204945 on scalar
allocations.  (There wasn't a clean way to sequence this into two
commits, since the assembly change on its own caused testcase churn, and
the `-instcombine` change can't be tested without the assembly changes.)

An obvious alternative fix -- change `AllocaInst::AllocaInst()`,
`AsmWriter` and `LLParser` to treat `intptr_t` as the canonical type for
scalar allocations -- was rejected out of hand, since this required
teaching them each about the data layout.

A follow-up commit will add an `-instcombine` to canonicalize the scalar
allocation array size to `i32 1` rather than leaving `iX 1` alone.

rdar://problem/20075773

llvm-svn: 232200
2015-03-13 19:30:44 +00:00
Duncan P. N. Exon Smith bb730135c9 instcombine: alloca: Remove nesting in simplifyAllocaArraySize(), NFC
llvm-svn: 232199
2015-03-13 19:26:33 +00:00
Duncan P. N. Exon Smith c6820ec1c2 instcombine: alloca: Split out simplifyAllocaArraySize(), NFC
Follow-up commits will change some of the logic here.  Splitting into a
separate function simplifies the logic by allowing early returns instead
of deeper nesting.

llvm-svn: 232197
2015-03-13 19:22:03 +00:00
Robert Lougher 5e0ea66d59 Revert: "[Reassociate] Add initial support for vector instructions."
This reverts revision 232190 due to buildbot failure reported on clang-hexagon-elf
for test arm64_vtst.c.  To be investigated.

llvm-svn: 232196
2015-03-13 19:20:46 +00:00
Robert Lougher 1bad505c3c [Reassociate] Add initial support for vector instructions.
This patch adds initial support for vector instructions to the reassociation
pass. It enables most parts of the pass to work with vectors but to keep the
size of the patch small, optimization of Xor trees, canonicalization of
negative constants and converting shifts to muls, etc., have been left out.
This will be handled in later patches.

The patch is based on an initial patch by Chad Rosier.

Differential Revision: http://reviews.llvm.org/D7566

llvm-svn: 232190
2015-03-13 18:33:27 +00:00
Kevin Qin 49bc764310 Reapply 'Run LICM pass after loop unrolling pass.'
It's firstly committed at r231630, and reverted at r231635.

Function pass InstructionSimplifier is inserted as barrier to
make sure loop unroll pass won't affect on LICM pass.

llvm-svn: 232011
2015-03-12 05:36:01 +00:00
Andrew Kaylor 6b67d42773 Extended support for native Windows C++ EH outlining
Differential Review: http://reviews.llvm.org/D7886

llvm-svn: 231981
2015-03-11 23:22:06 +00:00
David Majnemer d61a6fd8ed InstCombine: Don't fold call bitcast into args if callee is byval
This fixes a bug reported here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150309/265341.html

llvm-svn: 231948
2015-03-11 18:03:05 +00:00
Sanjay Patel c04b6f242c Inliner should not add callgraph edges for intrinsic calls (PR22857)
The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics.

This patch makes sure that the Inliner does not add such an edge to the callgraph.

Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857

Differential Revision: http://reviews.llvm.org/D8231

llvm-svn: 231927
2015-03-11 15:12:32 +00:00
Philip Reames 71c4035c18 If a conditional branch jumps to the same target, remove the condition
Given that large parts of inst combine is restricted to instructions which have one use, getting rid of a use on the condition can help the effectiveness of the optimizer. Also, it allows the condition to potentially be deleted by instcombine rather than waiting for another pass.

I noticed this completely by accident in another test case. It's not anything that actually came from a real workload.

p.s. We should probably do the same thing for switch instructions.

Differential Revision: http://reviews.llvm.org/D8220

llvm-svn: 231881
2015-03-10 22:52:37 +00:00
Sanjay Patel 0fdb437b25 remove function names from comments; NFC
llvm-svn: 231826
2015-03-10 19:42:57 +00:00
Michael Zolotukhin 267e12f714 Enable loop-rotate before loop-vectorize by default
llvm-svn: 231820
2015-03-10 19:07:41 +00:00
Adam Nemet 98c4c5dd78 [LAA-memchecks 2/3] Move number of memcheck threshold checking to LV
Now the analysis won't "fail" if the memchecks exceed the threshold.  It
is the transform pass' responsibility to perform the check.

This allows the transform pass to further analyze/eliminate the
memchecks.  E.g. in Loop distribution we only need to check pointers
that end up in different partitions.

Note that there is a slight change of functionality here.  The logic in
analyzeLoop is that if dependence checking fails due to non-constant
distance between the pointers, another attempt is made to prove safety
of the dependences purely using run-time checks.

Before this patch we could fail the loop due to exceeding the memcheck
threshold after the first step, now we only check the threshold in the
client after the full analysis.  There is no measurable compile-time
effect but I wanted to record this here.

llvm-svn: 231817
2015-03-10 18:54:23 +00:00
Sanjay Patel abf7023c63 remove names from comments; NFC
llvm-svn: 231813
2015-03-10 18:41:22 +00:00
Sanjay Patel 51bd9421ac fix typos; NFC
llvm-svn: 231812
2015-03-10 18:37:05 +00:00
Sanjay Patel f1b0db1545 remove function names from comments; NFC
llvm-svn: 231801
2015-03-10 16:42:24 +00:00
Owen Anderson 58364dc4da Fix a crash in InstCombine where we could try to truncate a switch comparison to zero width.
llvm-svn: 231761
2015-03-10 06:51:39 +00:00
Owen Anderson 51b75b8c34 Fix an infinite loop in InstCombine when an instruction with no users and side effects can be constant folded.
ReplaceInstUsesWith needs to return nullptr when the input has no users,
because in that case it does not mutate the program.  Otherwise, we can
get stuck in an infinite loop of repeatedly attempting to constant fold
and instruction with no users.

llvm-svn: 231755
2015-03-10 05:13:47 +00:00
Mehdi Amini a28d91d81b DataLayout is mandatory, update the API to reflect it with references.
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.

This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.

I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.

I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.

Test Plan:

Reviewers: echristo

Subscribers: llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
2015-03-10 02:37:25 +00:00
Kostya Serebryany 48a4023f40 [sanitizer] fix instrumentation with -mllvm -sanitizer-coverage-block-threshold=0 to actually do something useful.
llvm-svn: 231736
2015-03-10 01:58:27 +00:00
Kostya Serebryany 8fb05ac998 [sanitizer] decrease sanitizer-coverage-block-threshold from 1000 to 500 as another horrible workaround for PR17409
llvm-svn: 231733
2015-03-10 01:11:53 +00:00
Benjamin Kramer 7bd1f7cb58 Remove the remaining uses of abs64 and nuke it.
std::abs works just fine and we're already using it in many places. NFC intended.

llvm-svn: 231696
2015-03-09 20:20:16 +00:00
Benjamin Kramer f044d3f93b Make helper functions static.
Found by -Wmissing-prototypes. NFC.

llvm-svn: 231664
2015-03-09 16:23:46 +00:00
Benjamin Kramer fd3bc74460 SymbolRewriter: Hide implementation details
NFC.

llvm-svn: 231660
2015-03-09 15:50:47 +00:00
Kevin Qin 65b07b8e1b Revert r231630 - Run LICM pass after loop unrolling pass.
As it broke llvm bootstrap.

llvm-svn: 231635
2015-03-09 07:26:37 +00:00
Kevin Qin 715b01e979 Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization.
Runtime unrolling is an expensive optimization which can bring benefit
only if the loop is hot and iteration number is relatively large enough.
For some loops, we know they are not worth to be runtime unrolled.
The scalar loop from vectorization is one of the cases.

llvm-svn: 231631
2015-03-09 06:14:18 +00:00
Kevin Qin a998735def Run LICM pass after loop unrolling pass.
Runtime unrollng will introduce a runtime check in loop prologue.
If the unrolled loop is a inner loop, then the proglogue will be inside
the outer loop. LICM pass can help to promote the runtime check out if
the checked value is loop invariant.

llvm-svn: 231630
2015-03-09 06:14:07 +00:00
David Blaikie dc3f01e9cf Simplify expressions involving boolean constants with clang-tidy
Patch by Richard (legalize at xmission dot com).

Differential Revision: http://reviews.llvm.org/D8154

llvm-svn: 231617
2015-03-09 01:57:13 +00:00
Olivier Sallenave 049d803ce0 Do not restrict interleaved unrolling to small loops, depending on the target.
llvm-svn: 231528
2015-03-06 23:12:04 +00:00
Benjamin Kramer e8a64a20f2 LoopInterchange: Remove empty method.
llvm-svn: 231503
2015-03-06 19:37:26 +00:00
Benjamin Kramer 79442920bf LoopInterchange: Rephrase instruction moving using ilist's splice and factor it into a function
+ Random cleanups. No functional change.

llvm-svn: 231501
2015-03-06 18:59:14 +00:00
Benjamin Kramer 298a3a0567 Fold init() helpers into constructors. NFC.
llvm-svn: 231486
2015-03-06 16:21:15 +00:00
Daniel Jasper 6adbd7aecf Change the way in which error case is being handled.
Specifically this:
* Prevents an "unused" warning in non-assert builds.
* In that error case return with out removing a child loop instead of
  looping forever.

llvm-svn: 231459
2015-03-06 10:39:14 +00:00
Karthik Bhat 88db86dd29 Add a new pass "Loop Interchange"
This pass interchanges loops to provide a more cache-friendly memory access.

For e.g. given a loop like -
  for(int i=0;i<N;i++)
    for(int j=0;j<N;j++)
      A[j][i] = A[j][i]+B[j][i];

is interchanged to -
  for(int j=0;j<N;j++)
    for(int i=0;i<N;i++)
      A[j][i] = A[j][i]+B[j][i];

This pass is currently disabled by default.

To give a brief introduction it consists of 3 stages-

LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix.
LoopInterchangeProfitability: A very basic heuristic has been added to check for profitibility. This will evolve over time.
LoopInterchangeTransform : Which does the actual transform.

LNT Performance tests shows improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver becnmarks.

TODO:
1) Add support for reductions and lcssa phi.
2) Improve profitability model.
3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange.
4) Improve compile time regression found in llvm lnt due to this pass.
5) Fix issues in Dependency Analysis module.

A special thanks to Hal for reviewing this code.
Review: http://reviews.llvm.org/D7499

llvm-svn: 231458
2015-03-06 10:11:25 +00:00
Yaron Keren 322bdad085 Silence C4715 'not all control paths return a value' warnings.
llvm-svn: 231455
2015-03-06 07:49:14 +00:00
Michael Gottesman 6ff10c959a [objc-arc] Sprinkle some more auto on some iterators.
llvm-svn: 231447
2015-03-06 02:10:03 +00:00
Michael Gottesman 16e6a2057f [objc-arc] Move the detection of potential uses or altering of a ref count onto PtrState.
llvm-svn: 231446
2015-03-06 02:07:12 +00:00
Michael Gottesman 6080596328 [objc-arc] Move the checking of whether or not we can match onto PtrStates and out of the main dataflow.
These refactored computations check whether or not we are at a stage
of the sequence where we can perform a match. This patch moves the
computation out of the main dataflow and into
{BottomUp,TopDown}PtrState.

llvm-svn: 231439
2015-03-06 00:34:42 +00:00
Michael Gottesman 4eae396ae9 [objc-arc] Refactor (Re-)initialization of PtrState from dataflow -> {TopDown,BottomUp}PtrState Class.
This initialization occurs when we see a new retain or release. Before
we performed the actual initialization inline in the dataflow. That is
just messy.

llvm-svn: 231438
2015-03-06 00:34:39 +00:00
Michael Gottesman feb138e211 [objc-arc] Create two subclasses of PtrState in preparation for moving per ptr state change behavior onto a PtrState class.
This will enable the main ObjCARCOpts dataflow to work with higher
level concepts such as "can this ptr state be modified by this ref
count" and not need to understand the nitty gritty details of how that
is determined. This makes the dataflow cleaner.

llvm-svn: 231437
2015-03-06 00:34:36 +00:00
Michael Gottesman 41c01005ed [objc-arc] Extract out MDNodes into a cache structure so the information can be passed around.
llvm-svn: 231436
2015-03-06 00:34:33 +00:00
Michael Gottesman f6bcb81000 [objc-arc] Remove annotations code.
It will always be in the history if it is needed again. Now it is just dead
code.

llvm-svn: 231435
2015-03-06 00:34:29 +00:00
Michael Gottesman d45907bd38 Fix build error.
llvm-svn: 231430
2015-03-05 23:57:07 +00:00
Michael Gottesman a9fc016281 [objc-arc] Change some casts and loop iterators to use auto.
llvm-svn: 231427
2015-03-05 23:29:06 +00:00
Michael Gottesman 68b91dbf84 [objc-arc] Extract out state specific to a ref count from the main objc arc sequence dataflow. This will allow me to separate the actual ARC queries from the meat of the dataflow algorithm.
llvm-svn: 231426
2015-03-05 23:29:03 +00:00
Michael Gottesman 0be6920e23 [objc-arc] Extract blot map vector into its own file. NFC.
llvm-svn: 231425
2015-03-05 23:28:58 +00:00
Michael Kuperstein bcb26d6880 [InstCombine] Fix an assertion when fmul has a ConstantExpr operand
isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions.

Patch by Pawel Jurek <pawel.jurek@intel.com>
Differential Revision: http://reviews.llvm.org/D8053

llvm-svn: 231359
2015-03-05 08:38:57 +00:00
Kostya Serebryany 83ce8779d5 [sanitizer] add nosanitize metadata to more coverage instrumentation instructions
llvm-svn: 231333
2015-03-05 01:20:05 +00:00
Sanjoy Das a5397c0198 [IndVarSimplify] use the "canonical" way to infer no-wrap.
Summary:
rL225282 introduced an ad-hoc way to promote some additions to nuw or
nsw.  Since then SCEV has become smarter in directly proving no-wrap;
and using the canonical "ext(A op B) == ext(A) op ext(B)" method of
proving no-wrap is just as powerful now.  Rip out the existing
complexity in favor of getting SCEV to do all the heaving lifting
internally.

This change does not add any unit tests because it is supposed to be a
non-functional change.  Tests added in rL225282 and rL226075 are valid
tests for this change.

Reviewers: atrick, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7981

llvm-svn: 231306
2015-03-04 22:24:23 +00:00
Reid Kleckner 4276945161 Try to satisfy sanitizer lint check
llvm-svn: 231284
2015-03-04 20:38:59 +00:00
Mehdi Amini 46a43556db Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
2015-03-04 18:43:29 +00:00
Dmitry Vyukov b37b95ed3e asan: do not instrument direct inbounds accesses to stack variables
Do not instrument direct accesses to stack variables that can be
proven to be inbounds, e.g. accesses to fields of structs on stack.

But it eliminates 33% of instrumentation on webrtc/modules_unittests
(number of memory accesses goes down from 290152 to 193998) and
reduces binary size by 15% (from 74M to 64M) and improved compilation time by 6-12%.

The optimization is guarded by asan-opt-stack flag that is off by default.

http://reviews.llvm.org/D7583

llvm-svn: 231241
2015-03-04 13:27:53 +00:00
Philip Reames 6da37857d1 [RewriteStatepointsForGC] Fix a relocation bug w.r.t values defined by invoke instructions
RewriteStatepointsForGC pass emits an alloca for each GC pointer which will be relocated. It then inserts stores after def and all relocations, and inserts loads before each use as well. In the end, mem2reg is used to update IR with relocations in SSA form.

However, there is a problem with inserting stores for values defined by invoke instructions. The code didn't expect a def was a terminator instruction, and inserting instructions after these terminators resulted in malformed IR.

This patch fixes this problem by handling invoke instructions as a special case. If the def is an invoke instruction, the store will be inserted at the beginning of the normal destination block. Since return value from invoke instruction does not dominate the unwind destination block, no action is needed there.

Patch by: Chen Li
Differential Revision: http://reviews.llvm.org/D7923

llvm-svn: 231183
2015-03-04 00:13:52 +00:00
Kostya Serebryany be5e0ed919 [sanitizer/coverage] Add AFL-style coverage counters (search heuristic for fuzzing).
Introduce -mllvm -sanitizer-coverage-8bit-counters=1
which adds imprecise thread-unfriendly 8-bit coverage counters.

The run-time library maps these 8-bit counters to 8-bit bitsets in the same way
AFL (http://lcamtuf.coredump.cx/afl/technical_details.txt) does:
counter values are divided into 8 ranges and based on the counter
value one of the bits in the bitset is set.
The AFL ranges are used here: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+.

These counters provide a search heuristic for single-threaded
coverage-guided fuzzers, we do not expect them to be useful for other purposes.

Depending on the value of -fsanitize-coverage=[123] flag,
these counters will be added to the function entry blocks (=1),
every basic block (=2), or every edge (=3).

Use these counters as an optional search heuristic in the Fuzzer library.
Add a test where this heuristic is critical.

llvm-svn: 231166
2015-03-03 23:27:02 +00:00
David Majnemer 1bacc0abc9 InstCombine: Ensure select condition types are identical before merging
Selection conditions may be vectors or scalars.  Make sure InstCombine
doesn't indiscriminately assume that a select which is value dependent
on another select have identical select condition types.

This fixes PR22773.

llvm-svn: 231156
2015-03-03 22:40:36 +00:00
David Blaikie 9469072367 RewriteStatepointsForGC::PhiState: Remove explicit copy ctor in favor of the Rule of Zero
The assertion was just checking a class invariant that's pretty easy to
verify by inspection (no mutating operations, and the two non-copy ctors
already ensure the state is maintained) so remove the explicit copy ctor
in favor of the default, thus allowing the use of the default copy
assignment operator without hitting the C++11 deprecation here.

llvm-svn: 231143
2015-03-03 21:49:07 +00:00
David Blaikie 7f1e0565b3 Revert "Remove the explicit SDNodeIterator::operator= in favor of the implicit default"
Accidentally committed a few more of these cleanup changes than
intended. Still breaking these out & tidying them up.

This reverts commit r231135.

llvm-svn: 231136
2015-03-03 21:18:16 +00:00
David Blaikie bb8da4c08f Remove the explicit SDNodeIterator::operator= in favor of the implicit default
There doesn't seem to be any need to assert that iterator assignment is
between iterators over the same node - if you want to reuse an iterator
variable to iterate another node, that's perfectly acceptable. Just
don't mix comparisons between iterators into disjoint sequences, as
usual.

llvm-svn: 231135
2015-03-03 21:17:08 +00:00
Peter Collingbourne da2dbf21a9 LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets.
By loading from indexed offsets into a byte array and applying a mask, a
program can test bits from the bit set with a relatively short instruction
sequence. For example, suppose we have 15 bit sets to lay out:

A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits),
F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits),
L (4 bits), M (3 bits), N (2 bits), O (1 bit)

These bits can be laid out in a 16-byte array like this:

      Byte Offset
    0123456789ABCDEF
Bit
  7 HHHHHHHHHIIIIIII
  6 GGGGGGGGGGJJJJJJ
  5 FFFFFFFFFFFKKKKK
  4 EEEEEEEEEEEELLLL
  3 DDDDDDDDDDDDDMMM
  2 CCCCCCCCCCCCCCNN
  1 BBBBBBBBBBBBBBBO
  0 AAAAAAAAAAAAAAAA

For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to
test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done
in 1-2 machine instructions on x86, or 4-6 instructions on ARM.

This uses the LPT multiprocessor scheduling algorithm to lay out the bits
efficiently.

Saves ~450KB of instructions in a recent build of Chromium.

Differential Revision: http://reviews.llvm.org/D7954

llvm-svn: 231043
2015-03-03 00:49:28 +00:00
Benjamin Kramer 838752d3f6 LoopIdiom: Give globals for memset_pattern16 private linkage.
There's really no reason to have them have entries in the symbol table
anymore. Old versions of ld64 had some bugs in this area but those have
been fixed long ago.

llvm-svn: 231041
2015-03-03 00:17:09 +00:00
Sanjoy Das 2d38031271 Revert some changes that were made to fix PR20680.
This re-lands change r230921.  r230921 was reverted because it broke a
clang test; a checkin fixing the clang test will be commited shortly.

Summary:
As far as I can tell, the real bug causing the issue was fixed in
r230533.  SCEVExpander should mark an increment operation as nuw or nsw
only if it can *prove* that the operation does not overflow.  There
shouldn't be any situation where we have to do something different
because of no-wrap flags generated by SCEVExpander.

Revert "IndVarSimplify: Allow LFTR to fire more often"

This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213).

Revert "IndVarSimplify: Don't let LFTR compare against a poison value"

This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102).

Reviewers: majnemer, atrick, spatel

Differential Revision: http://reviews.llvm.org/D7979

llvm-svn: 231018
2015-03-02 21:41:07 +00:00
Michael Zolotukhin 9302236680 Make ToVectorTy static.
llvm-svn: 231007
2015-03-02 20:43:24 +00:00
Benjamin Kramer cc34ba687b SLPVectorizer: Rewrite ArrayRef slice compare to be more idiomatic.
NFC intended.

llvm-svn: 230965
2015-03-02 15:24:36 +00:00
NAKAMURA Takumi 0cd23c842e Revert r230921, "Revert some changes that were made to fix PR20680.", for now.
It caused a failure on clang/test/Misc/backend-optimization-failure.cpp .

llvm-svn: 230929
2015-03-02 01:14:03 +00:00
Sanjoy Das 876bd51486 Revert some changes that were made to fix PR20680.
Summary:
As far as I can tell, the real bug causing the issue was fixed in
r230533.  SCEVExpander should mark an increment operation as nuw or nsw
only if it can *prove* that the operation does not overflow.  There
shouldn't be any situation where we have to do something different
because of no-wrap flags generated by SCEVExpander.

Revert "IndVarSimplify: Allow LFTR to fire more often"

This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213).

Revert "IndVarSimplify: Don't let LFTR compare against a poison value"

This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102).

Reviewers: majnemer, atrick, spatel

Differential Revision: http://reviews.llvm.org/D7979

llvm-svn: 230921
2015-03-01 23:36:26 +00:00
Benjamin Kramer cb570f1bc9 TRE: Just erase dead BBs and tweak the iteration loop not to increment the deleted BB iterator.
Leaving empty blocks around just opens up a can of bugs like PR22704. Deleting
them early also slightly simplifies code.

Thanks to Sanjay for the IR test case.

llvm-svn: 230856
2015-02-28 16:47:27 +00:00
Benjamin Kramer 5fbfe2ffdc Convert push_back loops into append calls.
No functionality change intended.

llvm-svn: 230849
2015-02-28 13:20:15 +00:00
Yaron Keren 42a7adf171 Silence variable set but not used warning, NFC.
llvm-svn: 230848
2015-02-28 13:11:24 +00:00
Benjamin Kramer 4f6ac16292 Replace std::copy with a back inserter with vector append where feasible
All of the cases were just appending from random access iterators to a
vector. Using insert/append can grow the vector to the perfect size
directly and moves the growing out of the loop. No intended functionalty
change.

llvm-svn: 230845
2015-02-28 10:11:12 +00:00
Philip Reames 28e61ce60f [RewriteStatepointsForGC] Reduce indentation via early continue [NFC]
llvm-svn: 230836
2015-02-28 01:57:44 +00:00
Philip Reames 2e5bcbe8d5 [RewriteStatepointsForGC] Fix another order of iteration bug
It turns out the naming of inserted phis and selects is sensative to the order in which two sets are iterated.  We need to nail this down to avoid non-deterministic output and possible test failures.  

The modified test is the one I first noticed something odd in.  The change is making it more strict to report the error.  With the test change, but without the code change, the test fails roughly 1 in 5.  With the code change, I've run ~30 runs without error.

Long term, the right fix here is to adjust the naming scheme.  I'm checking in this hack to avoid any possible non-determinism in the tests over the weekend.  HJust because I only noticed one case doesn't mean it's actually the only case.  I hope to get to the right change Monday.

std->llvm data structure changes bugfix change #3

llvm-svn: 230835
2015-02-28 01:52:09 +00:00
Philip Reames f986d68b36 [RewriteStatepointsForGC] Reduce indentation via early continue [NFC]
llvm-svn: 230829
2015-02-28 00:54:41 +00:00
Philip Reames a226e6115c [RewriteStatepointsForGC] Fix iterator invalidation bug
Inserting into a DenseMap you're iterating over is not well defined.  This is unfortunate since this is well defined on a std::map.

"cleanup per llvm code style standards" bug #2

llvm-svn: 230827
2015-02-28 00:47:50 +00:00
Philip Reames a5aeaf4b4f [RewriteStatepointsForGC] Add tests for the base pointer identification algorithm
These tests cover the 'base object' identification and rewritting portion of RewriteStatepointsForGC.  These aren't completely exhaustive, but they've proven to be reasonable effective over time at finding regressions.

In the process of porting these tests over, I found my first "cleanup per llvm code style standards" bug.  We were relying on the order of iteration when testing the base pointers found for a derived pointer.  When we switched from std::set to DenseSet, this stopped being a safe assumption.  I'm suspecting I'm going to find more of those.  In particular, I'm now really wondering about the main iteration loop for this algorithm.  I need to go take a closer look at the assumptions there.

I'm not really happy with the fact these are testing what is essentially debug output (i.e. enabled via command line flags).  Suggestions for how to structure this better are very welcome.  

llvm-svn: 230818
2015-02-28 00:20:48 +00:00
Sanjay Patel b92e9164d2 remove function names from comments; NFC
llvm-svn: 230766
2015-02-27 17:27:15 +00:00
Anna Zaks 8ed1d8196b [asan] Skip promotable allocas to improve performance at -O0
Currently, the ASan executables built with -O0 are unnecessarily slow.
The main reason is that ASan instrumentation pass inserts redundant
checks around promotable allocas. These allocas do not get instrumented
under -O1 because they get converted to virtual registered by mem2reg.
With this patch, ASan instrumentation pass will only instrument non
promotable allocas, giving us a speedup of 39% on a collection of
benchmarks with -O0. (There is no measurable speedup at -O1.)

llvm-svn: 230724
2015-02-27 03:12:36 +00:00
Eric Christopher b9f0009b5a Remove DebugLoc::print(LLVMContext, raw_ostream), it was just
forwarding to the one that didn't take a context.

llvm-svn: 230700
2015-02-26 23:32:17 +00:00
Hal Finkel 221f467185 [InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores
InstCombine has long had logic to convert aligned Altivec load/store intrinsics
into regular loads and stores. This mirrors that functionality for QPX vector
load/store intrinsics.

llvm-svn: 230660
2015-02-26 18:56:03 +00:00
Sanjoy Das e91665de39 IRCE: only touch loops that have been shown to have a high
backedge-taken count in profiliing data.

llvm-svn: 230619
2015-02-26 08:56:04 +00:00
Sanjoy Das e75ed92630 IRCE: generalize to handle loops with decreasing induction variables.
IRCE can now split the iteration space for loops like:

   for (i = n; i >= 0; i--)
     a[i + k] = 42; // bounds check on access

llvm-svn: 230618
2015-02-26 08:19:31 +00:00
Sanjoy Das 48c75814a5 IRCE: print newline after printing an InductiveRangeCheck.
llvm-svn: 230607
2015-02-26 04:03:31 +00:00
Ramkumar Ramachandra 3408f3e296 PlaceSafepoints: use IRBuilder helpers
Use the IRBuilder helpers for gc.statepoint and gc.result, instead of
coding the construction by hand. Note that the gc.statepoint IRBuilder
handles only CallInst, not InvokeInst; retain that part of hand-coding.

Differential Revision: http://reviews.llvm.org/D7518

llvm-svn: 230591
2015-02-26 00:35:56 +00:00
Justin Bogner 2e427d4dbd InstrProf: Make the __llvm_profile_runtime_user symbol hidden
This symbol exists only to pull in the required pieces of the runtime,
so nothing ever needs to refer to it. Making it hidden avoids the
potential for issues with duplicate symbols when linking profiled
libraries together.

llvm-svn: 230566
2015-02-25 22:52:20 +00:00
Sanjay Patel cc29f4f2cb only propagate equality comparisons of FP values that we are certain are non-zero
This is a follow-on to r227491 which tightens the check for propagating FP
values. If a non-constant value happens to be a zero, we would hit the same
bug as before.

Bug noted and patch suggested by Eli Friedman.

llvm-svn: 230564
2015-02-25 22:46:08 +00:00
JF Bastien d52c990a90 InstCombine: extract instead of shuffle when performing vector/array type punning
Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is.

Test Plan: make check

Reviewers: jvoung, chandlerc

Subscribers: llvm-commits
llvm-svn: 230560
2015-02-25 22:30:51 +00:00
Peter Collingbourne eba7f73ff9 LowerBitSets: Align referenced globals.
This change aligns globals to the next highest power of 2 bytes, up to a
maximum of 128. This makes it more likely that we will be able to compress
bit sets with a greater alignment. In many more cases, we can now take
advantage of a new optimization also introduced in this patch that removes
bit set checks if the bit set is all ones.

The 128 byte maximum was found to provide the best tradeoff between instruction
overhead and data overhead in a recent build of Chromium. It allows us to
remove ~2.4MB of instructions at the cost of ~250KB of data.

Differential Revision: http://reviews.llvm.org/D7873

llvm-svn: 230540
2015-02-25 20:42:41 +00:00
Charles Davis 33d1dc0008 [IC] Turn non-null MD on pointer loads to range MD on integer loads.
Summary:
This change fixes the FIXME that you recently added when you committed
(a modified version of) my patch.  When `InstCombine` combines a load and
store of an pointer to those of an equivalently-sized integer, it currently
drops any `!nonnull` metadata that might be present.  This change replaces
`!nonnull` metadata with `!range !{ 1, -1 }` metadata instead.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7621

llvm-svn: 230462
2015-02-25 05:10:25 +00:00
Peter Collingbourne 1baeaa395a LowerBitSets: Introduce global layout builder.
The builder is based on a layout algorithm that tries to keep members of
small bit sets together. The new layout compresses Chromium's bit sets to
around 15% of their original size.

Differential Revision: http://reviews.llvm.org/D7796

llvm-svn: 230394
2015-02-24 23:17:02 +00:00
Sanjay Patel cee38616c8 remove function names from comments; NFC
llvm-svn: 230391
2015-02-24 22:43:06 +00:00
Kuba Brecka f5875d3026 Fix alloca_instruments_all_paddings.cc test to work under higher -O levels (llvm part)
When AddressSanitizer only a single dynamic alloca and no static allocas, due to an early exit from FunctionStackPoisoner::poisonStack we forget to unpoison the dynamic alloca.  This patch fixes that.

Reviewed at http://reviews.llvm.org/D7810

llvm-svn: 230316
2015-02-24 09:47:05 +00:00
Sanjoy Das 82ea3d45b5 New instcombine rule: max(~a,~b) -> ~min(a, b)
This case is interesting because ScalarEvolutionExpander lowers min(a,
b) as ~max(~a,~b).  I think the profitability heuristics can be made
more clever/aggressive, but this is a start.

Differential Revision: http://reviews.llvm.org/D7821

llvm-svn: 230285
2015-02-24 00:08:41 +00:00
Sanjay Patel 27aa1423d2 add newline for easier reading; NFC
llvm-svn: 230265
2015-02-23 21:32:09 +00:00
Andrew Kaylor f22fe4ae18 Remap frame variables for native Windows exception handling.
Differential Revision: http://reviews.llvm.org/D7770

llvm-svn: 230249
2015-02-23 20:01:56 +00:00
Chad Rosier 543900539f Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity.
This patch adds the isProfitableToHoist API.  For AArch64, we want to prevent a
fmul from being hoisted in cases where it is more profitable to form a
fmsub/fmadd.

Phabricator Review: http://reviews.llvm.org/D7299
Patch by Lawrence Hu <lawrence@codeaurora.org>

llvm-svn: 230241
2015-02-23 19:15:16 +00:00
Mehdi Amini cd3ca6f7dd InstSimplify: simplify 0 / X if nnan and nsz
From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 230238
2015-02-23 18:30:25 +00:00
David Blaikie 5e5d7840fb Roll condition into an assert then wrap it 'ifndef NDEBUG' to protect from the inevitable "unused variable" warning in a non-asserts build.
llvm-svn: 230181
2015-02-22 20:58:38 +00:00
Hal Finkel 3d4269ab05 [LICM] Refactor to expose functionality as utility functions
This refactors the core functionality of LICM: HoistRegion, SinkRegion and
PromoteAliasSet (renamed to promoteLoopAccessesToScalars) as utility functions
in LoopUtils. This will enable other transformations to make use of them
directly.

Patch by Ashutosh Nema.

llvm-svn: 230178
2015-02-22 18:35:32 +00:00
NAKAMURA Takumi f7d08f6dcc RewriteStatepointsForGC.cpp: Fix for -Asserts to mark isNullConstant() as LLVM_ATTRIBUTE_UNUSED. [-Wunused-function]
llvm-svn: 230169
2015-02-22 09:58:19 +00:00
NAKAMURA Takumi 02aa295a00 RewriteStatepointsForGC.cpp: Fix for -Asserts. [-Wunused-variable]
llvm-svn: 230168
2015-02-22 09:58:13 +00:00
NAKAMURA Takumi 6c24684c95 LowerBitSets.cpp: Prune incorrect \param(s). [-Wdocumentation]
\param should be used as itemized.

llvm-svn: 230167
2015-02-22 09:51:42 +00:00
Sanjoy Das 95c476db94 IRCE: generalize InductiveRangeCheck::computeSafeIterationSpace to
work with a non-canonical induction variable.

This is currently a non-functional change because we only ever call
computeSafeIterationSpace on a canonical induction variable; but the
generalization will be useful in a later commit.

llvm-svn: 230151
2015-02-21 22:20:22 +00:00
Sanjoy Das 7fc60da2f5 IRCE: use SCEVs instead of llvm::Value's for intermediate
calculations.  Semantically non-functional change.

This gets rid of some of the SCEV -> Value -> SCEV round tripping and
the Construct(SMin|SMax)Of and MaybeSimplify helper routines.

llvm-svn: 230150
2015-02-21 22:07:32 +00:00
Philip Reames 0b1b387441 [PlaceSafepoints] Adjust enablement logic to default to off and be GC configurable per GC
Previously, this pass ran over every function in the Module if added to the pass order.  With this change, it runs only over those with a GC attribute where the GC explicitly opts in.  A GC can also choose which of entry safepoint polls, backedge safepoint polls, and call safepoints it wants.  I hope to get these exposed as checks on the GCStrategy at some point, but for now, the checks are manual string comparisons.

llvm-svn: 230097
2015-02-21 00:09:09 +00:00
David Blaikie 82ad78771b Remove some unnecessary unreachables in favor of (sometimes implicit) assertions
Also simplify some else-after-return cases including some standard
algorithm convenience/use.

llvm-svn: 230094
2015-02-20 23:44:24 +00:00
Philip Reames 1f3e5c195c Hide a bunch of advanced testing options in default opt --help output
These are internal options.  I need to go through, evaluate which are worth keeping and which not.  Many of them should probably be renamed as well.  Until I have time to do that, we can at least stop poluting the standard opt -help output.

llvm-svn: 230088
2015-02-20 23:32:03 +00:00
Philip Reames 1f017547bb [RewriteStatepointsForGC] Use DenseSet in place of std::set [NFC]
This should be the last cleanup on non-llvm preferred data structures.  I left one use of std::set in an assertion; DenseSet didn't seem to have a tombstone for CallSite defined.  That might be worth fixing, but wasn't worth it for a debug only use.

llvm-svn: 230084
2015-02-20 23:16:52 +00:00
Philip Reames e9c3b9bd46 [RewriteStatepointsForGC] Replace std::map with DenseMap
I'd done the work of extracting the typedef in a previous commit, but didn't actually change it.  Hopefully this will make any subtle changes easier to isolate.

llvm-svn: 230081
2015-02-20 22:48:20 +00:00
Philip Reames d2b664642f [RewriteStatepointsForGC] Cleanup - replace std::vector usage [NFC]
Migrate std::vector usage to a combination of SmallVector and ArrayRef.

llvm-svn: 230079
2015-02-20 22:39:41 +00:00
Philip Reames 860660ea5e [RewriteStatepointsForGC] More style cleanup [NFC]
Use llvm_unreachable where appropriate, use SmallVector where easy to do so, introduce typedefs for planned type migrations.

llvm-svn: 230068
2015-02-20 22:05:18 +00:00
Philip Reames 0a3240f4de [RewriteStatepointsForGC] Remove notion of SafepointBounds [NFC]
The notion of a range of inserted safepoint related code is no longer really applicable.  This survived over from an earlier implementation.  Just saving the inserted gc.statepoint and working from that is far clearer given the current code structure.  Particularly when invokable statepoints get involved.

llvm-svn: 230063
2015-02-20 21:34:11 +00:00
Benjamin Kramer 911d5b3ace LoopRotate: When reconstructing loop simplify form don't split edges from indirectbrs.
Yet another chapter in the endless story. While this looks like we leave
the loop in a non-canonical state this replicates the logic in
LoopSimplify so it doesn't diverge from the canonical form in any way.

PR21968

llvm-svn: 230058
2015-02-20 20:49:25 +00:00
Peter Collingbourne e6909c8e8b Introduce bitset metadata format and bitset lowering pass.
This patch introduces a new mechanism that allows IR modules to co-operatively
build pointer sets corresponding to addresses within a given set of
globals. One particular use case for this is to allow a C++ program to
efficiently verify (at each call site) that a vtable pointer is in the set
of valid vtable pointers for the class or its derived classes. One way of
doing this is for a toolchain component to build, for each class, a bit set
that maps to the memory region allocated for the vtables, such that each 1
bit in the bit set maps to a valid vtable for that class, and lay out the
vtables next to each other, to minimize the total size of the bit sets.

The patch introduces a metadata format for representing pointer sets, an
'@llvm.bitset.test' intrinsic and an LTO lowering pass that lays out the globals
and builds the bitsets, and documents the new feature.

Differential Revision: http://reviews.llvm.org/D7288

llvm-svn: 230054
2015-02-20 20:30:47 +00:00
Philip Reames fa2fcf173b [GC, RewriteStatepointsForGC] Style cleanup and bug fix
When doing style cleanup, I noticed a minor bug in this code.  If we have a pointer that we think is unused after a statepoint and thus doesn't need relocation, we store a null pointer into the alloca we're about to promote.  This helps turn a mistake in liveness analysis into an easily debuggable crash.  It turned out this code had never been updated to handle invoke statepoints.  

There's no test for this.  Without a bug in liveness, it appears impossible to make this trigger in a way which is visible in the resulting IR.  We might store the null, but when promoting the alloca, there will be no uses and thus nothing to test against.  Suggestions on how to test are very welcome.

llvm-svn: 230047
2015-02-20 19:51:56 +00:00
Reid Kleckner a070ee5ef5 Use unreachable instead of assert(false) to silence MSVC warning
llvm-svn: 230045
2015-02-20 19:46:02 +00:00
Philip Reames f20413245a [GC] Style cleanup for RewriteStatepointForGC (1 of many) [NFC]
Starting to update variable naming and types to match LLVM style.  This will be an incremental process to minimize the chance of breakage as I work.  Step one, rename member variables to LLVM CamelCase and use llvm's ADT.  Much more to come.

llvm-svn: 230042
2015-02-20 19:26:04 +00:00
Philip Reames 2ef029c7ae Bugfix for 229954
Before calling Function::getGC to test for enablement, we need to make sure there's actually a GC at all via Function::hasGC.  Otherwise, we'd crash on functions without a GC.  Thankfully, this only mattered if you manually scheduled the pass, but still, oops. :(

llvm-svn: 230040
2015-02-20 18:56:14 +00:00
Benjamin Kramer 6f66545ae6 RewriteStatepointsForGC: Move details into anonymous namespaces. NFC.
While there reduce the number of duplicated std::map lookups.

llvm-svn: 230012
2015-02-20 14:00:58 +00:00
Benjamin Kramer d4a3a55564 Wrap recursive function only used in assert in #ifndef NDEBUG.
Avoids unused function warnings in Release builds.

llvm-svn: 230009
2015-02-20 13:15:49 +00:00
Nick Lewycky eb3231eefa Fix build in release mode, four cases of -Wunused-variable.
llvm-svn: 229976
2015-02-20 07:14:02 +00:00
Hal Finkel 847e05f569 [InstCombine] Remove unnecessary variable indexing into single-element arrays
This change addresses a deficiency pointed out in PR22629. To copy from the bug
report:

[from the bug report]

Consider this code:

int f(int x) {
  int a[] = {12};
  return a[x];
}

GCC knows to optimize this to

movl     $12, %eax
ret

The code generated by recent Clang at -O3 is:

movslq   %edi, %rax
movl     .L_ZZ1fiE1a(,%rax,4), %eax
retq

.L_ZZ1fiE1a:
  .long    12                      # 0xc

[end from the bug report]

This definitely seems worth fixing. I've also seen this kind of code before (as
the base case of generic vector wrapper templates with one element).

The general idea is to look at the GEP feeding a load or a store, which has
some variable as its first non-zero index, and determine if that index must be
zero (or else an out-of-bounds access would occur). We can do this for allocas
and globals with constant initializers where we know the maximum size of the
underlying object. When we find such a GEP, we create a new one for the memory
access with that first variable index replaced with a constant zero.

Even if we can't eliminate the memory access (and sometimes we can't), it is
still useful because it removes unnecessary indexing calculations.

llvm-svn: 229959
2015-02-20 03:05:53 +00:00
Philip Reames 6faacf4772 Adjust enablement of RewriteStatepointsForGC
When back merging the changes in 229945 I noticed that I forgot to mark the test cases with the appropriate GC.  We want the rewriting to be off by default (even when manually added to the pass order), not on-by default.  To keep the current test working, mark them as using the statepoint-example GC and whitelist that GC.  

Longer term, we need a better selection mechanism here for both actual usage and testing.  As I migrate more tests to the in tree version of this pass, I will probably need to update the enable/disable logic as well. 

llvm-svn: 229954
2015-02-20 02:34:49 +00:00
Philip Reames d16a9b1fdc Add a pass for constructing gc.statepoint sequences w/explicit relocations
This patch consists of a single pass whose only purpose is to visit previous inserted gc.statepoints which do not have gc.relocates inserted yet, and insert them. This can be used either immediately after IR generation to perform 'early safepoint insertion' or late in the pass order to perform 'late insertion'.

This patch is setting the stage for work to continue in tree.  In particular, there are known naming and style violations in the current patch.  I'll try to get those resolved over the next week or so.  As I touch each area to make style changes, I need to make sure we have adequate testing in place.  As part of the cleanup, I will be cleaning up a collection of test cases we have out of tree and submitting them upstream. The tests included in this change are very basic and mostly to provide examples of usage.

The pass has several main subproblems it needs to address:
- First, it has identify any live pointers. In the current code, the use of address spaces to distinguish pointers to GC managed objects is hard coded, but this will become parametrizable in the near future.  Note that the current change doesn't actually contain a useful liveness analysis.  It was seperated into a followup change as the code wasn't ready to be shared.  Instead, the current implementation just considers any dominating def of appropriate pointer type to be live.
- Second, it has to identify base pointers for each live pointer. This is a fairly straight forward data flow algorithm. 
- Third, the information in the previous steps is used to actually introduce rewrites. Rather than trying to do this by hand, we simply re-purpose the code behind Mem2Reg to do this for us.

llvm-svn: 229945
2015-02-20 01:06:44 +00:00
Kostya Serebryany 885994618c [sanitizer] when dumping the basic block trace, also dump the module names. Patch by Laszlo Szekeres
llvm-svn: 229940
2015-02-20 00:30:44 +00:00
Michael Gottesman 0fc2accb58 [objc-arc-contract] We can not move retains over instructions which can not conservatively be proven to not decrement the retain's RCIdentity.
I also cleaned up the code to make it more understandable for mere mortals.

<rdar://problem/19853758>

llvm-svn: 229937
2015-02-20 00:02:49 +00:00
Michael Gottesman 5ab64de62b [objc-arc] Add the predicate CanDecrementRefCount.
This is different from CanAlterRefCount since CanDecrementRefCount is
attempting to prove specifically whether or not an instruction can
decrement instead of the more general question of whether it can
decrement or increment.

llvm-svn: 229936
2015-02-20 00:02:45 +00:00
Benjamin Kramer dfedfeb298 SSAUpdater: Use range-based for. NFC.
llvm-svn: 229908
2015-02-19 20:04:02 +00:00
Michael Gottesman 2e0e4e07b4 [objc-arc] Convert the bodies of ARCInstKind predicates into covered switches.
This is much better than the previous manner of just using
short-curcuiting booleans from:

1. A "naive" efficiency perspective: we do not have to rely on the
compiler to change the short circuiting boolean operations into a
switch.
2. An understanding perspective by making the implicit behavior of
negative predicates explicit.
3. A maintainability perspective through the covered switch flag making
it easy to know where to update code when adding new ARCInstKinds.

llvm-svn: 229906
2015-02-19 19:51:36 +00:00
Michael Gottesman 6f729fa675 [objc-arc] Change the InstructionClass to be an enum class called ARCInstKind.
I also renamed ObjCARCUtil.cpp -> ARCInstKind.cpp. That file only contained
items related to ARCInstKind anyways.

llvm-svn: 229905
2015-02-19 19:51:32 +00:00
Adam Nemet 57ac766ee9 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229899
2015-02-19 19:15:21 +00:00
Adam Nemet 2bd6e984ef [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229897
2015-02-19 19:15:15 +00:00
Adam Nemet 3e87634fd8 [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229896
2015-02-19 19:15:13 +00:00
Adam Nemet 339f42b396 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229894
2015-02-19 19:15:07 +00:00
Adam Nemet 3bfd93d789 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229893
2015-02-19 19:15:04 +00:00
Adam Nemet 436018c3ff [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229892
2015-02-19 19:15:00 +00:00
Adam Nemet c922853b93 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229891
2015-02-19 19:14:56 +00:00
Adam Nemet f219c64723 [LoopAccesses] Make VectorizerParams global + fix for cyclic dep
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This commits also has the fix (D7731) to the break dependence cycle
between the analysis and vector libraries.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229890
2015-02-19 19:14:52 +00:00
Adam Nemet 04d4163e95 Revert "Reformat."
This reverts commit r229651.

I'd like to ultimately revert r229650 but this reformat stands in the
way.  I'll reformat the affected files once the the loop-access pass is
fully committed.

llvm-svn: 229889
2015-02-19 19:14:34 +00:00
Benjamin Kramer 1c2beed7fd LSR: Move set instead of copying. NFC.
llvm-svn: 229871
2015-02-19 17:19:43 +00:00
Benjamin Kramer ea68a944a1 Demote vectors to arrays. No functionality change.
llvm-svn: 229861
2015-02-19 15:26:17 +00:00
Michael Gottesman e5ad66f8a9 [objc-arc] Introduce the concept of RCIdentity and rename all relevant functions to use that name. NFC.
The RCIdentity root ("Reference Count Identity Root") of a value V is a
dominating value U for which retaining or releasing U is equivalent to
retaining or releasing V. In other words, ARC operations on V are
equivalent to ARC operations on U.

This is a useful property to ascertain since we can use this in the ARC
optimizer to make it easier to match up ARC operations by always mapping
ARC operations to RCIdentityRoots instead of pointers themselves. Then
we perform pairing of retains, releases which are applied to the same
RCIdentityRoot.

In general, the two ways that we see RCIdentical values in ObjC are via:

  1. PointerCasts
  2. Forwarding Calls that return their argument verbatim.

As such in ObjC, two RCIdentical pointers must always point to the same
memory location.

Previously this concept was implicit in the code and various methods
that dealt with this concept were given functional names that did not
conform to any name in the "ARC" model. This often times resulted in
code that was hard for the non-ARC acquanted to understand resulting in
unhappiness and confusion.

llvm-svn: 229796
2015-02-19 00:42:38 +00:00
Michael Gottesman dfa3e4b08a [objc-arc-contract] Rename contractRelease => tryToContractReleaseIntoStoreStrong.
NFC. Makes it clearer what this method is actually supposed to do.

llvm-svn: 229795
2015-02-19 00:42:34 +00:00
Michael Gottesman 1827973f80 [objc-arc-contract] Refactor out tryToPeepholeInstruction into its own method. NFC.
The main method of ObjCARCContract is really large and busy. By refactoring this
out, it becomes easier to reason about.

llvm-svn: 229794
2015-02-19 00:42:30 +00:00
Michael Gottesman 56bd6a077a [objc-arc-contract] Reorganize the code a bit and make the debug output easier to read.
llvm-svn: 229793
2015-02-19 00:42:27 +00:00
Sanjoy Das 11b279a832 Partial fix for bug 22589
Don't spend the entire iteration space in the scalar loop prologue if
computing the trip count overflows.  This change also gets rid of the
backedge check in the prologue loop and the extra check for
overflowing trip-count.

Differential Revision: http://reviews.llvm.org/D7715

llvm-svn: 229731
2015-02-18 19:32:25 +00:00
Andrew Kaylor 527c5dc68d Adding implementation to outline C++ catch handlers for native Windows 64 exception handling.
Differential Revision: http://reviews.llvm.org/D7363

llvm-svn: 229715
2015-02-18 18:31:51 +00:00
Mohit K. Bhakkad 518946e440 [MSan][MIPS] VarArgHelper for MIPS64
Reviewers: Reviewers: eugenis, kcc, samsonov, petarj

Subscribers: dsanders, sagar, llvm-commits

Differential Revision: http://reviews.llvm.org/D7182

llvm-svn: 229667
2015-02-18 11:41:24 +00:00
NAKAMURA Takumi a250484c4c Reformat.
llvm-svn: 229651
2015-02-18 08:36:14 +00:00
NAKAMURA Takumi fa520c5f49 Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector.
r229622: "[LoopAccesses] Make VectorizerParams global"
  r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it"
  r229624: "[LoopAccesses] Cache the result of canVectorizeMemory"
  r229626: "[LoopAccesses] Create the analysis pass"
  r229628: "[LoopAccesses] Change debug messages from LV to LAA"
  r229630: "[LoopAccesses] Add canAnalyzeLoop"
  r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport"
  r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport"
  r229633: "[LoopAccesses] Add -analyze support"
  r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference"
  r229638: "Analysis: fix buildbots"

llvm-svn: 229650
2015-02-18 08:34:47 +00:00
Craig Topper 1348f17205 [X86] Remove AVX512 pslldq/psrldq shift intrinsics. They aren't implemented yet and when they are they should be done with shuffles like SSE2 and AVX2.
llvm-svn: 229641
2015-02-18 06:24:49 +00:00
Craig Topper b324e43aed [X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles.
llvm-svn: 229640
2015-02-18 06:24:44 +00:00
Adam Nemet 85fd9f8d09 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229634
2015-02-18 03:44:33 +00:00
Adam Nemet d7350dbb85 [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229632
2015-02-18 03:44:25 +00:00
Adam Nemet 8b12afbeee [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229631
2015-02-18 03:44:20 +00:00
Adam Nemet d0db4c1395 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229628
2015-02-18 03:43:37 +00:00
Adam Nemet d6b7e29815 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229626
2015-02-18 03:43:24 +00:00
Adam Nemet 01abb2c355 [LoopAccesses] Make blockNeedsPredication static
blockNeedsPredication is in LoopAccess in order to share it with the
vectorizer.  It's a utility needed by LoopAccess not strictly provided
by it but it's a good place to share it.  This makes the function static
so that it no longer required to create an LoopAccessInfo instance in
order to access it from LV.

This was actually causing problems because it would have required
creating LAI much earlier that LV::canVectorizeMemory().

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229625
2015-02-18 03:43:19 +00:00
Adam Nemet 3cf32ad6db [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229624
2015-02-18 03:42:57 +00:00
Adam Nemet 5474be2c80 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229623
2015-02-18 03:42:50 +00:00
Adam Nemet 4f3ede5a01 [LoopAccesses] Make VectorizerParams global
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229622
2015-02-18 03:42:43 +00:00
Adam Nemet 30f16e1696 [LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo
LoopAccessAnalysis will be used as the name of the pass.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229621
2015-02-18 03:42:35 +00:00
Akira Hatanaka 1defd5afbd [InstCombine] Do not insert a GEP instruction before a landingpad instruction.
InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the
insertion point, which caused the verifier to complain when a GEP was inserted
before a landingpad instruction. This commit fixes it to use getFirstInsertionPt
instead.

rdar://problem/19394964

llvm-svn: 229619
2015-02-18 03:30:11 +00:00
Hal Finkel 4393559621 [BDCE] Don't forget uses of root instructions seen before the instruction itself
When visiting the initial list of "root" instructions (those which must always
be alive), for those that are integer-valued (such as invokes returning an
integer), we mark their bits as (initially) all dead (we might, obviously, find
uses of those bits later, but all bits are assumed dead until proven
otherwise). Don't do so, however, if we're already seen a use of those bits by
another root instruction (such as a store).

Fixes a miscompile of the sanitizer unit tests on x86_64.

Also, add a debug line for visiting the root instructions, and remove a debug
line which tried to print instructions being removed (printing dead
instructions is dangerous, and can sometimes crash).

llvm-svn: 229618
2015-02-18 03:12:28 +00:00
Benjamin Kramer 6cd780ff21 Prefer SmallVector::append/insert over push_back loops.
Same functionality, but hoists the vector growth out of the loop.

llvm-svn: 229500
2015-02-17 15:29:18 +00:00
Elena Demikhovsky ef035bb974 Fixed a bug in store sinking.
The problem was in store-sink barrier check.

Store sink barrier should be checked for ModRef (read-write) mode.

http://llvm.org/bugs/show_bug.cgi?id=22613

llvm-svn: 229495
2015-02-17 13:10:05 +00:00
Hal Finkel 2bb61ba2fe [BDCE] Add a bit-tracking DCE pass
BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the
"aggressive DCE" pass), with the added capability to track dead bits of integer
valued instructions and remove those instructions when all of the bits are
dead.

Currently, it does not actually do this all-bits-dead removal, but rather
replaces the instruction's uses with a constant zero, and lets instcombine (and
the later run of ADCE) do the rest. Because we essentially get a run of ADCE
"for free" while tracking the dead bits, we also do what ADCE does and removes
actually-dead instructions as well (this includes instructions newly trivially
dead because all bits were dead, but not all such instructions can be removed).

The motivation for this is a case like:

int __attribute__((const)) foo(int i);
int bar(int x) {
  x |= (4 & foo(5));
  x |= (8 & foo(3));
  x |= (16 & foo(2));
  x |= (32 & foo(1));
  x |= (64 & foo(0));
  x |= (128& foo(4));
  return x >> 4;
}

As it turns out, if you order the bit-field insertions so that all of the dead
ones come last, then instcombine will remove them. However, if you pick some
other order (such as the one above), the fact that some of the calls to foo()
are useless is not locally obvious, and we don't remove them (without this
pass).

I did a quick compile-time overhead check using sqlite from the test suite
(Release+Asserts). BDCE took ~0.4% of the compilation time (making it about
twice as expensive as ADCE).

I've not looked at why yet, but we eliminate instructions due to having
all-dead bits in:
External/SPEC/CFP2006/447.dealII/447.dealII
External/SPEC/CINT2006/400.perlbench/400.perlbench
External/SPEC/CINT2006/403.gcc/403.gcc
MultiSource/Applications/ClamAV/clamscan
MultiSource/Benchmarks/7zip/7zip-benchmark

llvm-svn: 229462
2015-02-17 01:36:59 +00:00
Mehdi Amini b9a0fa4822 InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val))
Fixes radar 15486701.

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229437
2015-02-16 21:47:54 +00:00
James Molloy 83570247f1 Run LICM as part of the cleanup phase from the scalar optimizer.
Things like LoopUnrolling can produce loop invariant values - make sure
we pick them up.

llvm-svn: 229419
2015-02-16 18:59:54 +00:00
Hal Finkel c64150b8f3 [ADCE] Don't indent inside an anonymous namespace
To be consistent with what clang-format does, don't add extra indentation
inside an anonymous namespace. NFC.

llvm-svn: 229412
2015-02-16 18:08:00 +00:00
James Molloy e32d806b5f [LoopReroll] Relax some assumptions a little.
We won't find a root with index zero in any loop that we are able to reroll.
However, we may find one in a non-rerollable loop, so bail gracefully instead
of failing hard.

llvm-svn: 229406
2015-02-16 17:02:00 +00:00
James Molloy 4c7deb2259 [LoopReroll] Don't crash on dead code
If a PHI has no users, don't crash; bail gracefully. This shouldn't
happen often, but we can make no guarantees that previous passes didn't leave
dead code around.

llvm-svn: 229405
2015-02-16 17:01:52 +00:00
Evgeniy Stepanov 292acab847 [asan] Reuse a common function.
Do not reimplement RoundUpToAlignment.

llvm-svn: 229397
2015-02-16 14:49:37 +00:00
Aaron Ballman f9a1897c72 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229340
2015-02-15 22:54:22 +00:00
Hal Finkel 8626ed2eae [ADCE] Convert another loop for a range-based for
We can use a range-based for for the operands loop too; NFC.

llvm-svn: 229319
2015-02-15 15:51:25 +00:00
Hal Finkel 92fb2d3803 [ADCE] Use inst_range and range-based fors
Convert a few loops to range-based fors; NFC.

llvm-svn: 229318
2015-02-15 15:51:23 +00:00
Hal Finkel c6035cff55 [ADCE] Fix formatting of pointer types
We prefer to put the * with the variable, not with the type; NFC.

llvm-svn: 229317
2015-02-15 15:47:52 +00:00
Hal Finkel 234d8fea7b [ADCE] Fix capitalization of another local variable
Bring another local variable in compliance with our naming conventions, NFC.

llvm-svn: 229316
2015-02-15 15:45:30 +00:00
Hal Finkel 75901293a1 [ADCE] Fix capitalization of some local variables
Bring some local variables in compliance with our naming conventions, NFC.

llvm-svn: 229315
2015-02-15 15:45:28 +00:00
Elena Demikhovsky 6f5a859633 Enabled cost calculation for masked memory operations.
We already have implementation for cost calculation for
masked memory operations. I just call it from the loop vectorizer.

llvm-svn: 229290
2015-02-15 08:08:48 +00:00
Ramkumar Ramachandra 8fcb498a9a InstCombine: propagate deref via new addDereferenceableAttr
The "dereferenceable" attribute cannot be added via .addAttribute(),
since it also expects a size in bytes. AttrBuilder#addAttribute or
AttributeSet#addAttribute is wrapped by classes Function, InvokeInst,
and CallInst. Add corresponding wrappers to
AttrBuilder#addDereferenceableAttr.

Having done this, propagate the dereferenceable attribute via
gc.relocate, adding a test to exercise it. Note that -datalayout is
required during execution over and above -instcombine, because
InstCombine only optionally requires DataLayoutPass.

Differential Revision: http://reviews.llvm.org/D7510

llvm-svn: 229265
2015-02-14 19:37:54 +00:00
Andrea Di Biagio f54432388f [optnone] Skip pass Constant Hoisting on optnone functions.
Added test CodeGen/X86/constant-hoisting-optnone.ll to verify that
pass Constant Hoisting is not run on optnone functions.

llvm-svn: 229258
2015-02-14 15:11:48 +00:00
Duncan P. N. Exon Smith 2c79ad974c Transforms: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229202
2015-02-14 01:11:29 +00:00
Philip Reames 9ae15209ad [InstCombine] When canonicalizing gep indices, prefer zext when possible
If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead.  This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?).  We already apply a similar transform in DAGCombine, this just extends that to the IR level.

This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. 

Differential Revision: http://reviews.llvm.org/D7255

llvm-svn: 229189
2015-02-14 00:05:36 +00:00
Andrea Di Biagio 30d471f6aa [InstCombine] Fix regression introduced at r227197.
This patch fixes a problem I accidentally introduced in an instruction combine
on select instructions added at r227197. That revision taught the instruction
combiner how to fold a cttz/ctlz followed by a icmp plus select into a single
cttz/ctlz with flag 'is_zero_undef' cleared.

However, the new rule added at r227197 would have produced wrong results in the
case where a cttz/ctlz with flag 'is_zero_undef' cleared was follwed by a
zero-extend or truncate. In that case, the folded instruction would have
been inserted in a wrong location thus leaving the CFG in an inconsistent
state.

This patch fixes the problem and add two reproducible test cases to
existing test 'InstCombine/select-cmp-cttz-ctlz.ll'.

llvm-svn: 229124
2015-02-13 16:33:34 +00:00
James Molloy 1b6207e6eb [SimplifyCFG] Be more aggressive
Up the phi node folding threshold from a cheap "1" to a meagre "2".

Update tests for extra added selects and slight code churn.

llvm-svn: 229099
2015-02-13 10:48:30 +00:00
Chandler Carruth 30d69c2e36 [PM] Remove the old 'PassManager.h' header file at the top level of
LLVM's include tree and the use of using declarations to hide the
'legacy' namespace for the old pass manager.

This undoes the primary modules-hostile change I made to keep
out-of-tree targets building. I sent an email inquiring about whether
this would be reasonable to do at this phase and people seemed fine with
it, so making it a reality. This should allow us to start bootstrapping
with modules to a certain extent along with making it easier to mix and
match headers in general.

The updates to any code for users of LLVM are very mechanical. Switch
from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h".
Qualify the types which now produce compile errors with "legacy::". The
most common ones are "PassManager", "PassManagerBase", and
"FunctionPassManager".

llvm-svn: 229094
2015-02-13 10:01:29 +00:00
Chandler Carruth 1fbc316534 [unroll] Concede defeat and disable the unroll analyzer for now.
The issues with the new unroll analyzer are more fundamental than code
cleanup, algorithm, or data structure changes. I've sent an email to the
original commit thread with details and a proposal for how to redesign
things. I'm disabling this for now so that we don't spend time
debugging issues with it in its current state.

llvm-svn: 229064
2015-02-13 05:31:46 +00:00
Michael Liao d266b928ae [InstCombine] Fix a bug when combining `icmp` from `ptrtoint`
- First, there's a crash when we try to combine that pointers into `icmp`
  directly by creating a `bitcast`, which is invalid if that two pointers are
  from different address spaces.

- It's not always appropriate to cast one pointer to another if they are from
  different address spaces as that is not no-op cast. Instead, we only combine
  `icmp` from `ptrtoint` if that two pointers are of the same address space.

llvm-svn: 229063
2015-02-13 04:51:26 +00:00
Chandler Carruth 6c03dff7cc [unroll] Merge the simplification and DCE estimation methods on the
UnrollAnalyzer.

Now they share a single worklist and have less implicit state between
them. There was no real benefit to separating these two things out.

I'm going to subsequently refactor things to share even more code.

llvm-svn: 229062
2015-02-13 04:39:05 +00:00
Chandler Carruth d9591d8922 [unroll] Remove pointless dyn_cast<>s to Instruction - the users of an
instruction must by definition be instructions.

llvm-svn: 229061
2015-02-13 04:33:21 +00:00
Chandler Carruth 5457e20d27 [unroll] Don't check the loop set for whether an instruction is
contained in it each time we try to add it to the worklist, just check
this when pulling it off the worklist. That way we do it at most once
per instruction with the cost of the worklist set we would need to pay
anyways.

llvm-svn: 229060
2015-02-13 04:30:44 +00:00
Chandler Carruth e5c30e4e10 [unroll] Change the other worklist in the unroll analyzer to be a set
vector.

In addition to dramatically reducing the work required for contrived
example loops, this also has to correct some serious latent bugs in the
cost computation. Previously, we might add an instruction onto the
worklist once for every load which it used and was simplified. Then we
would visit it many times and accumulate "savings" each time.

I mean, fortunately this couldn't matter for things like calls with 100s
of operands, but even for binary operators this code seems like it must
be double counting the savings.

I just noticed this by inspection and due to the runtime problems it can
introduce, I don't have any test cases for cases where the cost produced
by this routine is unacceptable.

llvm-svn: 229059
2015-02-13 04:27:50 +00:00
Chandler Carruth 7824bc9241 [unroll] Replace a boolean, for loop, condition, and break with
std::all_of and a lambda. Much cleaner, no functionality
changed.

llvm-svn: 229058
2015-02-13 04:18:14 +00:00
Chandler Carruth 06d537cdd6 [unroll] Directly query for dead instructions.
In the unroll analyzer, it is checking each user to see if that user
will become dead. However, it first checked if that user was missing
from the simplified values map, and then if was also missing from the
dead instructions set. We add everything from the simplified values map
to the dead instructions set, so the first step is completely subsumed
by the second. Moreover, the first step requires *inserting* something
into the simplified value map which isn't what we want at all.

This also replaces a dyn_cast with a cast as an instruction cannot be
used by a non-instruction.

llvm-svn: 229057
2015-02-13 04:14:05 +00:00
Chandler Carruth 82cb30f10c [unroll] Replace a linear time check for no uses with a constant time
check.

Also hoist this into the enqueue process as it is faster even than
testing the worklist set, we should just directly filter these out much
like we filter out constants and such.

llvm-svn: 229056
2015-02-13 04:06:08 +00:00
Chandler Carruth 3b057b3216 [unroll] Rather than an operand set, use a setvector for the worklist.
We don't just want to handle duplicate operands within an instruction,
but also duplicates across operands of different instructions. I should
have gone straight to this, but I had convinced myself that it wasn't
going to be necessary briefly. I've come to my senses after chatting
more with Nick, and am now happier here.

llvm-svn: 229054
2015-02-13 03:57:40 +00:00
Chandler Carruth 17a0496b5a [unroll] Extract the code to enqueue operansd for the worklist in the
unroll analysis into a lambda and call it. That's much simpler than
duplicating all the code.

llvm-svn: 229053
2015-02-13 03:49:41 +00:00
Chandler Carruth 8c86375a10 [unroll] Use a small set to de-duplicate operands prior to putting them
into the worklist. This avoids allocating lots of worklist memory for
them when there are large numbers of repeated operands.

llvm-svn: 229052
2015-02-13 03:48:38 +00:00
Chandler Carruth 93063e6191 [unroll] Make the unroll cost analysis terminate deterministically and
reasonably quickly.

I don't have a reduced test case, but for a version of FFMPEG, this
makes the loop unroller start finishing at all (after over 15 minutes of
running, it hadn't terminated for me, no idea if it was a true infloop
or just exponential work).

The key thing here is to check the DeadInstructions set when pulling
things off the worklist. Without this, we would re-walk the user list of
already dead instructions again and again and again. Consider phi nodes
with many, many operands and other patterns.

The other important aspect of this is that because we would keep
re-visiting instructions that were already known dead, we kept adding
their cost savings to this! This would cause our cost savings to be
*insanely* inflated from this.

While I was here, I also rotated the operand walk out of the worklist
loop to make the code easier to read. There is still work to be done to
minimize worklist traffic because we don't de-duplicate operands. This
means we may add the same instruction onto the worklist 1000s of times
if it shows up in 1000s of operansd to a PHI node for example.

Still, with this patch, the ffmpeg testcase I have finishes quickly and
I can't measure the runtime impact of the unroll analysis any more. I'll
probably try to do a few more cleanups to this code, but not sure how
much cleanup I can justify right now.

llvm-svn: 229038
2015-02-13 03:40:58 +00:00
Chandler Carruth dd6029fc6e [unroll] Make range based for loops a bit more explicit and more
readable.

The biggest thing that was causing me problems is recognizing the
references vs. poniters here. I also found that for maps naming the loop
variable as KeyValue helps make it obvious why you don't actually use it
directly. Finally, using 'auto' instead of 'User *' doesn't seem like
a good tradeoff. Much like with the other cases, I like to know its
a pointer, and 'User' is just as long and tells the reader a lot more.

llvm-svn: 229033
2015-02-13 02:45:17 +00:00
Chandler Carruth 87fdafc7b2 [IC] Fix a bug with the instcombine canonicalizing of loads and
propagating of metadata.

We were propagating !nonnull metadata even when the newly formed load is
no longer of a pointer type. This is clearly broken and results in LLVM
failing the verifier and aborting. This patch just restricts the
propagation of !nonnull metadata to when we actually have a pointer
type.

This bug report and the initial version of this patch was provided by
Charles Davis! Many thanks for finding this!

We still need to add logic to round-trip the metadata correctly if we
combine from pointer types to integer types and then back by using range
metadata for the integer type loads. But this is the minimal and safe
version of the patch, which is important so we can backport it into 3.6.

llvm-svn: 229029
2015-02-13 02:30:01 +00:00
Chandler Carruth 415f41258f [unroll] Avoid the "Insn" abbreviation of Instruction. This is quite
hard to type and read for me, and is inconsistent with the other
abbreviation in the base class "Inst". For most of these (where they are
used widely) I prefer just spelling it out as Instruction. I've changed
two of the short-lived variables to use "Inst" to match the base class.

llvm-svn: 229028
2015-02-13 02:17:39 +00:00
Chandler Carruth 302a133b1e [unroll] Tidy up the integer we use to accumululate the number of
instructions optimized. NFC, just separating this out from the
functionality changing commit.

llvm-svn: 229026
2015-02-13 02:10:56 +00:00
Chandler Carruth 10a9926ab5 [unroll] Don't use a map from pointer to bool. Use a set.
This is much more efficient. In particular, the query with the user
instruction has to insert a false for every missing instruction into the
set. This is just a cleanup a long the way to fixing the underlying
algorithm problems here.

llvm-svn: 228994
2015-02-13 00:29:39 +00:00
Michael Zolotukhin 1b48019751 Prevent division by 0.
When we try to estimate number of potentially removed instructions in
loop unroller, we analyze first N iterations and then scale the
computed number by TripCount/N. We should bail out early if N is 0.

llvm-svn: 228988
2015-02-13 00:17:03 +00:00
Chandler Carruth 186ad60815 [unroll] Update the new analysis logic from r228265 to use modern coding
conventions for function names consistently. Some were already using
this but not all.

llvm-svn: 228987
2015-02-13 00:00:24 +00:00
Benjamin Kramer 443c7967ea InstCombine: Allow folding of xor into icmp by changing the predicate for vectors
The loop vectorizer can create this pattern.

llvm-svn: 228954
2015-02-12 20:26:46 +00:00
James Molloy e805ad95dc [LoopRerolling] Be more forgiving with instruction order.
We can't solve the full subgraph isomorphism problem. But we can
allow obvious cases, where for example two instructions of different
types are out of order. Due to them having different types/opcodes,
there is no ambiguity.

llvm-svn: 228931
2015-02-12 15:54:14 +00:00
Dmitry Vyukov 2e8d82e607 tsan: do not instrument not captured values
I've built some tests in WebRTC with and without this change. With this change number of __tsan_read/write calls is reduced by 20-40%, binary size decreases by 5-10% and execution time drops by ~5%. For example:

$ ls -l old/modules_unittests new/modules_unittests
-rwxr-x--- 1 dvyukov 41708976 Jan 20 18:35 old/modules_unittests
-rwxr-x--- 1 dvyukov 38294008 Jan 20 18:29 new/modules_unittests
$ objdump -d old/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
239871
$ objdump -d new/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
148365

http://reviews.llvm.org/D7069

llvm-svn: 228917
2015-02-12 09:55:28 +00:00
Chandler Carruth 63aaa98d94 [slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.

The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.

Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.

I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.

I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.

llvm-svn: 228899
2015-02-12 02:30:56 +00:00
Tim Northover 02438033e8 DeadArgElim: aggregate Return assessment properly.
I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.

llvm-svn: 228885
2015-02-11 23:13:11 +00:00
Mehdi Amini 9730116bd6 Reassociate: cannot negate a INT_MIN value
Summary:
When trying to canonicalize negative constants out of
multiplication expressions, we need to check that the
constant is not INT_MIN which cannot be negated.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7286

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 228872
2015-02-11 19:54:44 +00:00
James Molloy 7c336576a5 [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

llvm-svn: 228826
2015-02-11 12:15:41 +00:00
James Molloy f147359376 [LoopReroll] Introduce the concept of DAGRootSets.
A DAGRootSet models an induction variable being used in a rerollable
loop. For example:

   x[i*3+0] = y1
   x[i*3+1] = y2
   x[i*3+2] = y3

   Base instruction -> i*3
                    +---+----+
                   /    |     \
               ST[y1]  +1     +2  <-- Roots
                        |      |
                      ST[y2] ST[y3]

There may be multiple DAGRootSets, for example:

   x[i*2+0] = ...   (1)
   x[i*2+1] = ...   (1)
   x[i*2+4] = ...   (2)
   x[i*2+5] = ...   (2)
   x[(i+1234)*2+5678] = ... (3)
   x[(i+1234)*2+5679] = ... (3)

This concept is similar to the "Scale" member used previously, but allows
multiple independent sets of roots based off the same induction variable.

llvm-svn: 228821
2015-02-11 09:19:47 +00:00
Zachary Turner 3bd47cee78 Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects.
This allows IDEs to recognize the entire set of header files for
each of the core LLVM projects.

Differential Revision: http://reviews.llvm.org/D7526
Reviewed By: Chris Bieneman

llvm-svn: 228798
2015-02-11 03:28:02 +00:00
Justin Bogner d24e185784 InstrProf: Lower coverage mappings by setting their sections appropriately
Add handling for __llvm_coverage_mapping to the InstrProfiling
pass. We need to make sure the constant and any profile names it
refers to are in the correct sections, which is easier and cleaner to
do here where we have to know about profiling sections anyway.

This is really tricky to test without a frontend, so I'm committing
the test for the fix in clang. If anyone knows a good way to test this
within LLVM, please let me know.

Fixes PR22531.

llvm-svn: 228793
2015-02-11 02:52:44 +00:00