Commit Graph

2849 Commits

Author SHA1 Message Date
David Blaikie 65fab6d896 Use early returns to reduce indentation.
llvm-svn: 234057
2015-04-03 21:32:06 +00:00
Alexey Samsonov f96cde9f68 Fix a bug indicated by -fsanitize=shift-exponent.
llvm-svn: 233881
2015-04-02 01:30:10 +00:00
Ahmed Bougacha 408d010a7c [SimplifyLibCalls] Ignore nobuiltin/unavailable fortified libcalls.
We used to do this before refactorings around r225640.
Some clang users checked for _chk libcall availability using:
  __has_builtin(__builtin___memcpy_chk)
When compiling with -fno-builtin, this is always true.
When passing -ffreestanding/-mkernel, which both imply -fno-builtin, we
end up with fortified libcalls, which isn't acceptable in a freestanding
environment which only provides their non-fortified counterparts.

Until we change clang and/or teach external users to check for availability
differently, disregard the "nobuiltin" attribute and TLI::has.

Workaround for PR23093.

llvm-svn: 233776
2015-04-01 00:45:09 +00:00
David Blaikie 3909da7f4b [opaque pointer type] More IRBuilder::createGEP (non-inbounds) migrations: CodeGenPrepare and SimplifyLibCalls
llvm-svn: 233596
2015-03-30 20:42:56 +00:00
Duncan P. N. Exon Smith ec819c096b Transforms: Use the new DebugLoc API, NFC
Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API.

llvm-svn: 233587
2015-03-30 19:49:49 +00:00
Philip Reames 2b969d7010 Merge empty landing pads in SimplifyCFG
This patch tries to merge duplicate landing pads when they branch to a common shared target.

Given IR that looks like this:
lpad1:
  %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
         cleanup
  br label %shared_resume
lpad2:
  %exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
          cleanup
  br label %shared_resume
shared_resume:
  call void @fn()
  ret void
}

We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.

Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.

Differential Revision: http://reviews.llvm.org/D8297

llvm-svn: 233125
2015-03-24 22:28:45 +00:00
Benjamin Kramer 799003bf8c Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.
llvm-svn: 232998
2015-03-23 19:32:43 +00:00
Benjamin Kramer 1f7c328bf2 [ctorutils] Update and sort includes. NFC.
llvm-svn: 232995
2015-03-23 19:06:17 +00:00
Benjamin Kramer 16132e6faa Purge unused includes throughout libSupport.
NFC.

llvm-svn: 232976
2015-03-23 18:07:13 +00:00
Benjamin Kramer d6aa0ec737 [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform.
llvm-svn: 232903
2015-03-21 22:04:26 +00:00
Benjamin Kramer 7857d723f1 [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.

int foo(char C) { return strchr("123!", C) != nullptr; } now becomes

	cmpl	$64, %edi ## range check
	sbbb	%al, %al
	movabsq	$0xE000200000001, %rcx
	btq	%rdi, %rcx ## bit test
	sbbb	%cl, %cl
	andb	%al, %cl ## and the two conditions
	andb	$1, %cl
	movzbl	%cl, %eax ## returning an int
	ret

(imho the backend should expand this into a series of branches, but
that's a different story)

The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.

llvm-svn: 232902
2015-03-21 21:09:33 +00:00
Benjamin Kramer 691363e7f2 SimplifyLibCalls: Add basic optimization of memchr calls.
This is just memchr(x, y, 0) -> nullptr and constant folding.

llvm-svn: 232896
2015-03-21 15:36:21 +00:00
Andrew Kaylor 3170e5620e Fixing a bug with WinEH PHI handling
llvm-svn: 232851
2015-03-20 21:42:54 +00:00
Sanjoy Das 7182d36f66 [ConstantRange] Split makeICmpRegion in two.
Summary:
This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
`makeSatisfyingICmpRegion` with slightly different contracts.  The first
one is useful for determining what values some expression //may// take,
given that a certain `icmp` evaluates to true.  The second one is useful
for determining what values are guaranteed to //satisfy// a given
`icmp`.

Reviewers: nlewycky

Reviewed By: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8345

llvm-svn: 232575
2015-03-18 00:41:24 +00:00
Michael Liao 24fcae8fa0 [SwitchLowering] Remove incoming values in the reverse order
- To prevent invalidating *successive* indices.
 

llvm-svn: 232510
2015-03-17 18:03:10 +00:00
Duncan P. N. Exon Smith 170c26d75e MapMetadata: Allow unresolved metadata if it won't change
Allow unresolved nodes through the `MapMetadata()` if
`RF_NoModuleLevelChanges`, since there's no remapping to do anyway.

This fixes PR22929.  I'll add a clang test as a follow-up.

llvm-svn: 232449
2015-03-17 01:14:40 +00:00
David Blaikie 741c8f81e4 [opaque pointer type] Start migrating GEP creation to explicitly specify the pointee type
I'm just going to migrate these in a pretty ad-hoc & incremental way -
providing the backwards compatible API for now, then locally removing
it, fixing a few callers, adding it back in and commiting those callers.
Rinse, repeat.

The assertions should ensure that if I get this wrong we'll find out
about it and not just have one giant patch to revert, recommit, revert,
recommit, etc.

llvm-svn: 232240
2015-03-14 01:53:18 +00:00
Andrew Kaylor 6b67d42773 Extended support for native Windows C++ EH outlining
Differential Review: http://reviews.llvm.org/D7886

llvm-svn: 231981
2015-03-11 23:22:06 +00:00
Sanjay Patel c04b6f242c Inliner should not add callgraph edges for intrinsic calls (PR22857)
The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics.

This patch makes sure that the Inliner does not add such an edge to the callgraph.

Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857

Differential Revision: http://reviews.llvm.org/D8231

llvm-svn: 231927
2015-03-11 15:12:32 +00:00
Sanjay Patel 0fdb437b25 remove function names from comments; NFC
llvm-svn: 231826
2015-03-10 19:42:57 +00:00
Sanjay Patel abf7023c63 remove names from comments; NFC
llvm-svn: 231813
2015-03-10 18:41:22 +00:00
Sanjay Patel 51bd9421ac fix typos; NFC
llvm-svn: 231812
2015-03-10 18:37:05 +00:00
Mehdi Amini a28d91d81b DataLayout is mandatory, update the API to reflect it with references.
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.

This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.

I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.

I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.

Test Plan:

Reviewers: echristo

Subscribers: llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
2015-03-10 02:37:25 +00:00
Benjamin Kramer fd3bc74460 SymbolRewriter: Hide implementation details
NFC.

llvm-svn: 231660
2015-03-09 15:50:47 +00:00
Kevin Qin 65b07b8e1b Revert r231630 - Run LICM pass after loop unrolling pass.
As it broke llvm bootstrap.

llvm-svn: 231635
2015-03-09 07:26:37 +00:00
Kevin Qin a998735def Run LICM pass after loop unrolling pass.
Runtime unrollng will introduce a runtime check in loop prologue.
If the unrolled loop is a inner loop, then the proglogue will be inside
the outer loop. LICM pass can help to promote the runtime check out if
the checked value is loop invariant.

llvm-svn: 231630
2015-03-09 06:14:07 +00:00
Sanjoy Das a5397c0198 [IndVarSimplify] use the "canonical" way to infer no-wrap.
Summary:
rL225282 introduced an ad-hoc way to promote some additions to nuw or
nsw.  Since then SCEV has become smarter in directly proving no-wrap;
and using the canonical "ext(A op B) == ext(A) op ext(B)" method of
proving no-wrap is just as powerful now.  Rip out the existing
complexity in favor of getting SCEV to do all the heaving lifting
internally.

This change does not add any unit tests because it is supposed to be a
non-functional change.  Tests added in rL225282 and rL226075 are valid
tests for this change.

Reviewers: atrick, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7981

llvm-svn: 231306
2015-03-04 22:24:23 +00:00
Mehdi Amini 46a43556db Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
2015-03-04 18:43:29 +00:00
Andrew Kaylor f22fe4ae18 Remap frame variables for native Windows exception handling.
Differential Revision: http://reviews.llvm.org/D7770

llvm-svn: 230249
2015-02-23 20:01:56 +00:00
Chad Rosier 543900539f Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity.
This patch adds the isProfitableToHoist API.  For AArch64, we want to prevent a
fmul from being hoisted in cases where it is more profitable to form a
fmsub/fmadd.

Phabricator Review: http://reviews.llvm.org/D7299
Patch by Lawrence Hu <lawrence@codeaurora.org>

llvm-svn: 230241
2015-02-23 19:15:16 +00:00
Benjamin Kramer dfedfeb298 SSAUpdater: Use range-based for. NFC.
llvm-svn: 229908
2015-02-19 20:04:02 +00:00
Benjamin Kramer ea68a944a1 Demote vectors to arrays. No functionality change.
llvm-svn: 229861
2015-02-19 15:26:17 +00:00
Sanjoy Das 11b279a832 Partial fix for bug 22589
Don't spend the entire iteration space in the scalar loop prologue if
computing the trip count overflows.  This change also gets rid of the
backedge check in the prologue loop and the extra check for
overflowing trip-count.

Differential Revision: http://reviews.llvm.org/D7715

llvm-svn: 229731
2015-02-18 19:32:25 +00:00
Andrew Kaylor 527c5dc68d Adding implementation to outline C++ catch handlers for native Windows 64 exception handling.
Differential Revision: http://reviews.llvm.org/D7363

llvm-svn: 229715
2015-02-18 18:31:51 +00:00
Benjamin Kramer 6cd780ff21 Prefer SmallVector::append/insert over push_back loops.
Same functionality, but hoists the vector growth out of the loop.

llvm-svn: 229500
2015-02-17 15:29:18 +00:00
Evgeniy Stepanov 292acab847 [asan] Reuse a common function.
Do not reimplement RoundUpToAlignment.

llvm-svn: 229397
2015-02-16 14:49:37 +00:00
Aaron Ballman f9a1897c72 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229340
2015-02-15 22:54:22 +00:00
James Molloy 1b6207e6eb [SimplifyCFG] Be more aggressive
Up the phi node folding threshold from a cheap "1" to a meagre "2".

Update tests for extra added selects and slight code churn.

llvm-svn: 229099
2015-02-13 10:48:30 +00:00
Chandler Carruth 30d69c2e36 [PM] Remove the old 'PassManager.h' header file at the top level of
LLVM's include tree and the use of using declarations to hide the
'legacy' namespace for the old pass manager.

This undoes the primary modules-hostile change I made to keep
out-of-tree targets building. I sent an email inquiring about whether
this would be reasonable to do at this phase and people seemed fine with
it, so making it a reality. This should allow us to start bootstrapping
with modules to a certain extent along with making it easier to mix and
match headers in general.

The updates to any code for users of LLVM are very mechanical. Switch
from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h".
Qualify the types which now produce compile errors with "legacy::". The
most common ones are "PassManager", "PassManagerBase", and
"FunctionPassManager".

llvm-svn: 229094
2015-02-13 10:01:29 +00:00
James Molloy 7c336576a5 [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

llvm-svn: 228826
2015-02-11 12:15:41 +00:00
Zachary Turner 3bd47cee78 Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects.
This allows IDEs to recognize the entire set of header files for
each of the core LLVM projects.

Differential Revision: http://reviews.llvm.org/D7526
Reviewed By: Chris Bieneman

llvm-svn: 228798
2015-02-11 03:28:02 +00:00
Reid Kleckner 96d011315a Don't promote asynch EH invokes of nounwind functions to calls
If the landingpad of the invoke is using a personality function that
catches asynch exceptions, then it can catch a trap.

Also add some landingpads to invalid LLVM IR test cases that lack them.

Over-the-shoulder reviewed by David Majnemer.

llvm-svn: 228782
2015-02-11 01:23:16 +00:00
Duncan P. N. Exon Smith bd75ad4d0c IR: Take uint64_t in DIBuilder::createExpression()
`DIExpression` deals with `uint64_t`, so it doesn't make sense that
`createExpression()` is created from `int64_t`.  Switch to `uint64_t` to
unify them.

I've temporarily left in the `int64_t` version, which forwards to the
`uint64_t` version.  I'll delete it once I've updated the callers.

llvm-svn: 228619
2015-02-09 22:13:27 +00:00
Akira Hatanaka 8d3cb829ce Fix a bug in DemoteRegToStack where a reload instruction was inserted into the
wrong basic block.

This would happen when the result of an invoke was used by a phi instruction
in the invoke's normal destination block. An instruction to reload the invoke's
value would get inserted before the critical edge was split and a new basic
block (which is the correct insertion point for the reload) was created. This
commit fixes the bug by splitting the critical edge before all the reload
instructions are inserted.

Also, hoist up the code which computes the insertion point to the only place
that need that computation.

rdar://problem/15978721

llvm-svn: 228566
2015-02-09 06:38:23 +00:00
Bjorn Steinbrink 5ec7522771 Correctly combine alias.scope metadata by a union instead of intersecting
Summary:
The alias.scope metadata represents sets of things an instruction might
alias with. When generically combining the metadata from two
instructions the result must be the union of the original sets, because
the new instruction might alias with anything any of the original
instructions aliased with.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7490

llvm-svn: 228525
2015-02-08 17:07:14 +00:00
Hans Wennborg 8b4dbdf15d LowerSwitch: Use ConstantInt for CaseRange::{Low,High}
Case values are always ConstantInt. This allows us to remove
a bunch of casts. NFC.

llvm-svn: 228312
2015-02-05 16:58:10 +00:00
Hans Wennborg 8c82fbcb73 LowerSwitch: remove default args from CaseRange ctor; NFC
llvm-svn: 228311
2015-02-05 16:50:27 +00:00
Duncan P. N. Exon Smith 920df5c1bb Utils: Resolve cycles under distinct MDNodes
Track unresolved nodes under distinct `MDNode`s during `MapMetadata()`,
and resolve them at the end.  Previously, these cycles wouldn't get
resolved.

llvm-svn: 228180
2015-02-04 19:44:34 +00:00
Jingyue Wu 49a766e468 Resurrect the assertion removed by r227717
Summary: MSVC can compile "LoopID->getOperand(0) == LoopID" when LoopID is MDNode*.

Test Plan: no regression

Reviewers: mkuper

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7327

llvm-svn: 227853
2015-02-02 20:41:11 +00:00
Michael Kuperstein a691f3e921 Removed assert that doesn't typecheck and breaks debug MSVC build.
llvm-svn: 227717
2015-02-01 08:46:20 +00:00
Jingyue Wu 0220df0dfd [NVPTX] Emit .pragma "nounroll" for loops marked with nounroll
Summary:
CUDA driver can unroll loops when jit-compiling PTX. To prevent CUDA
driver from unrolling a loop marked with llvm.loop.unroll.disable is not
unrolled by CUDA driver, we need to emit .pragma "nounroll" at the
header of that loop.

This patch also extracts getting unroll metadata from loop ID metadata
into a shared helper function.

Test Plan: test/CodeGen/NVPTX/nounroll.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7041

llvm-svn: 227703
2015-02-01 02:27:45 +00:00
Adrian Prantl 133e102b8f Remove a redundant dyn_cast.
llvm-svn: 227605
2015-01-30 19:42:59 +00:00
Adrian Prantl 3e2659eb92 Inliner: Use replaceDbgDeclareForAlloca() instead of splicing the
instruction and generalize it to optionally dereference the variable.
Follow-up to r227544.

llvm-svn: 227604
2015-01-30 19:37:48 +00:00
Adrian Prantl 4d365250ec Fix PR22386. The inliner moves static allocas to the entry basic block
so we need to move the dbg.declare intrinsics that describe them, too.

llvm-svn: 227544
2015-01-30 01:55:25 +00:00
Philip Reames 9198b33b48 Teach SplitBlockPredecessors how to handle landingpad blocks.
Patch by: Igor Laevsky <igor@azulsystems.com>

"Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among it's users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors."

Differential Revision: http://reviews.llvm.org/D7157

llvm-svn: 227390
2015-01-28 23:06:47 +00:00
Chandler Carruth b81dfa6378 [LPM] Stop using the string based preservation API. It is an
abomination.

For starters, this API is incredibly slow. In order to lookup the name
of a pass it must take a memory fence to acquire a pointer to the
managed static pass registry, and then potentially acquire locks while
it consults this registry for information about what passes exist by
that name. This stops the world of LLVMs in your process no matter
how little they cared about the result.

To make this more joyful, you'll note that we are preserving many passes
which *do not exist* any more, or are not even analyses which one might
wish to have be preserved. This means we do all the work only to say
"nope" with no error to the user.

String-based APIs are a *bad idea*. String-based APIs that cannot
produce any meaningful error are an even worse idea. =/

I have a patch that simply removes this API completely, but I'm hesitant
to commit it as I don't really want to perniciously break out-of-tree
users of the old pass manager. I'd rather they just have to migrate to
the new one at some point. If others disagree and would like me to kill
it with fire, just say the word. =]

llvm-svn: 227294
2015-01-28 04:57:56 +00:00
Saleem Abdulrasool c44d71b8df SymbolRewriter: allow rewriting with comdats
COMDATs must be identically named to the symbol.  When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs.  This corrects the behaviour and
adds test cases for this.

llvm-svn: 227261
2015-01-27 22:57:39 +00:00
Saleem Abdulrasool 9769b18cba SymbolRewriter: prevent unnecessary rewrite
The rewrite for the pattern based rewrite is unnecessary if the existing name
matches the pattern.

llvm-svn: 227260
2015-01-27 22:57:35 +00:00
Ahmed Bougacha 1ac9356524 [SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.
This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).

The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.

The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification.  This was introduced in a
refactoring (r225640) to match the original behavior.

However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.

For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW.  When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
  stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
  and simplifies the first stpcpy to a memcpy.  We now have
  two memcpys.

llvm-svn: 227250
2015-01-27 21:52:16 +00:00
Hans Wennborg b64cb271dc SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
The range check would get optimized away later, but we might as well not emit
them in the first place.

http://reviews.llvm.org/D6471

llvm-svn: 227126
2015-01-26 19:52:34 +00:00
Hans Wennborg 6800008f04 SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations and
allows for more efficient lowering. Both the SDag switch lowering and
LowerSwitch can exploit unreachable defaults.

Also make TurnSwitchRangeICmp handle switches with unreachable default.
This is kind of separate change, but it cannot be tested without the change
above, and I don't want to land the change above without this since that would
regress other tests.

Differential Revision: http://reviews.llvm.org/D6471

llvm-svn: 227125
2015-01-26 19:52:32 +00:00
Hans Wennborg 90b827cae2 Make ConstantFoldTerminator() handle switches with unreachable default.
Tested by Transforms/SimplifyCFG/switch-to-br.ll's @unreachable function.

Differential Revision: http://reviews.llvm.org/D6471

llvm-svn: 227124
2015-01-26 19:52:24 +00:00
Chandler Carruth 72793727cc [PM] Move the LowerExpectIntrinsic pass to the Scalar library.
It was already in the Scalar header and referenced extensively as being
in this library, the source file was just in the utils directory for
some reason. No actual functionality changed. I noticed as it didn't
make sense to add a pass header to the utils headers.

llvm-svn: 226991
2015-01-24 10:18:47 +00:00
Hans Wennborg ae9c971a2f LowerSwitch: replace unreachable default with popular case destination
SimplifyCFG currently does this transformation, but I'm planning to remove that
to allow other passes, such as this one, to exploit the unreachable default.

This patch takes care to keep track of what case values are unreachable even
after the transformation, allowing for more efficient lowering.

Differential Revision: http://reviews.llvm.org/D6697

llvm-svn: 226934
2015-01-23 20:43:51 +00:00
Reid Kleckner f12b33454f Revert "Don't remove a landing pad if the invoke requires a table entry."
This reverts commit r176827.

Björn Steinbrink pointed out that this didn't actually fix the bug
(PR15555) it was attempting to fix.

With this reverted, we can now remove landingpad cleanups that
immediately resume unwinding, converting the invoke to a call.

llvm-svn: 226850
2015-01-22 19:29:46 +00:00
David Blaikie df706288fb DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined calls being coalesced
When two calls from the same MDLocation are inlined they currently get
treated as one inlined function call (creating difficulty debugging,
duplicate variables, etc).

Clang worked around this by including column information on inline calls
which doesn't address LTO inlining or calls to the same function from
the same line and column (such as through a macro). It also didn't
address ctor and member function calls.

By making the inlinedAt locations distinct, every call site has an
explicitly distinct location that cannot be coalesced with any other
call.

This can produce linearly (2x in the worst case where every call is
inlined and the call instruction has a non-call instruction at the same
location) more debug locations. Any increase beyond that are in cases
where the Clang workaround was insufficient and the new scheme is
creating necessary distinct nodes that were being erroneously coalesced
previously.

After this change to LLVM the incomplete workarounds in Clang. That
should reduce the number of debug locations (in a build without column
info, the default on Darwin, not the default on Linux) by not creating
pseudo-distinct locations for every call to an inline function.

(oh, and I made the inlined-at chain rebuilding iterative instead of
recursive because I was having trouble wrapping my head around it the
way it was - open to discussion on the right design for that function
(including going back to a recursive solution))

llvm-svn: 226736
2015-01-21 22:57:29 +00:00
Chandler Carruth 9280382ac6 [PM] Replace an abuse of inheritance to override a single function with
a more direct approach: a type-erased glorified function pointer. Now we
can pass a function pointer into this for the easy case and we can even
pass a lambda into it in the interesting case in the instruction
combiner.

I'll be using this shortly to simplify the interfaces to InstCombiner,
but this helps pave the way and seems like a better design for the
libcall simplifier utility.

llvm-svn: 226640
2015-01-21 02:11:59 +00:00
Duncan P. N. Exon Smith 03e0583a2d IR: Move MDNode clone() methods from ValueMapper to MDNode, NFC
Now that the clone methods used by `MapMetadata()` don't do any
remapping (and return a temporary), they make more sense as member
functions on `MDNode` (and subclasses).

llvm-svn: 226541
2015-01-20 02:56:57 +00:00
Chandler Carruth 10f28f26fd [PM] Replace the Pass argument in MergeBasicBlockIntoOnlyPred with
a DominatorTree argument as that is the analysis that it wants to
update.

This removes the last non-loop utility function in Utils/ which accepts
a raw Pass argument.

llvm-svn: 226537
2015-01-20 01:37:09 +00:00
Duncan P. N. Exon Smith fed199a758 IR: Introduce GenericDwarfNode
As part of PR22235, introduce `DwarfNode` and `GenericDwarfNode`.  The
former is a metadata node with a DWARF tag.  The latter matches our
current (generic) schema of a header with string (and stringified
integer) data and an arbitrary number of operands.

This doesn't move it into place yet; that change will require a large
number of testcase updates.

llvm-svn: 226529
2015-01-20 00:01:43 +00:00
Duncan P. N. Exon Smith 2bc00f4a38 IR: Merge UniquableMDNode back into MDNode, NFC
As pointed out in r226501, the distinction between `MDNode` and
`UniquableMDNode` is confusing.  When we need subclasses of `MDNode`
that don't use all its functionality it might make sense to break it
apart again, but until then this makes the code clearer.

llvm-svn: 226520
2015-01-19 23:13:14 +00:00
Duncan P. N. Exon Smith 6dc22bf27b Utils: Simplify MapMetadata(), NFC
Extract out the operand remapping loops, which are now very similar.

llvm-svn: 226515
2015-01-19 22:44:32 +00:00
Duncan P. N. Exon Smith 9fa10658ce Skip upcast, NFC
llvm-svn: 226514
2015-01-19 22:41:14 +00:00
Duncan P. N. Exon Smith c862be860d Fix whitespace, NFC
llvm-svn: 226512
2015-01-19 22:40:25 +00:00
Duncan P. N. Exon Smith 0dcffe2cdc Utils: Simplify MapMetadata(), NFC
Take advantage of the new ability of temporary nodes to mutate to
distinct and uniqued nodes to greatly simplify the `MapMetadata()`
helper functions.

llvm-svn: 226511
2015-01-19 22:39:07 +00:00
Duncan P. N. Exon Smith 422e5c7acc Cleanup whitespace, NFC
llvm-svn: 226507
2015-01-19 22:16:01 +00:00
Duncan P. N. Exon Smith 7d82313bcd IR: Return unique_ptr from MDNode::getTemporary()
Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to
return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and
clean up call sites.  (For now, `DIBuilder` call sites just call
`release()` immediately.)

There's an accompanying change in each of clang and polly to use the new
API.

llvm-svn: 226504
2015-01-19 21:30:18 +00:00
Duncan P. N. Exon Smith 946fdcc50c IR: Remove MDNodeFwdDecl
Remove `MDNodeFwdDecl` (as promised in r226481).  Aside from API
changes, there's no real functionality change here.
`MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`,
which returns a tuple with `isTemporary()` equal to true.

The main point is that we can now add temporaries of other `MDNode`
subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the
first place because I didn't recognize this need, and thought they were
only needed to handle forward references).

A few things left out of (or highlighted by) this commit:

  - I've had to remove the (few) uses of `std::unique_ptr<>` to deal
    with temporaries, since the destructor is no longer public.
    `getTemporary()` should probably return the equivalent of
    `std::unique_ptr<T, MDNode::deleteTemporary>`.
  - `MDLocation::getTemporary()` doesn't exist yet (worse, it actually
    does exist, but does the wrong thing: `MDNode::getTemporary()` is
    inherited and returns an `MDTuple`).
  - `MDNode` now only has one subclass, `UniquableMDNode`, and the
    distinction between them is actually somewhat confusing.

I'll fix those up next.

llvm-svn: 226501
2015-01-19 20:36:39 +00:00
Duncan P. N. Exon Smith de03a8b38d IR: Add isUniqued() and isTemporary()
Change `MDNode::isDistinct()` to only apply to 'distinct' nodes (not
temporaries), and introduce `MDNode::isUniqued()` and
`MDNode::isTemporary()` for the other two possibilities.

llvm-svn: 226482
2015-01-19 18:45:35 +00:00
Chandler Carruth d450056c78 [PM] Replace the Pass argument to SplitEdge with specific analyses used
and updated.

This may appear to remove handling for things like alias analysis when
splitting critical edges here, but in fact no callers of SplitEdge
relied on this. Similarly, all of them wanted to preserve LCSSA if there
was any update of the loop info. That makes the interface much simpler.

With this, all of BasicBlockUtils.h is free of Pass arguments and
prepared for the new pass manager. This is tho majority of utilities
that relied on pass arguments.

llvm-svn: 226459
2015-01-19 12:36:53 +00:00
Chandler Carruth 37df2cfbf8 [PM] Remove the Pass argument from all of the critical edge splitting
APIs and replace it and numerous booleans with an option struct.

The critical edge splitting API has a really large surface of flags and
so it seems worth burning a small option struct / builder. This struct
can be constructed with the various preserved analyses and then flags
can be flipped in a builder style.

The various users are now responsible for directly passing along their
analysis information. This should be enough for the critical edge
splitting to work cleanly with the new pass manager as well.

This API is still pretty crufty and could be cleaned up a lot, but I've
focused on this change just threading an option struct rather than
a pass through the API.

llvm-svn: 226456
2015-01-19 12:09:11 +00:00
Chandler Carruth ad34d91343 [PM] Relax asserts and always try to reconstruct loop simplify form when
we can while splitting critical edges.

The only code which called this and didn't require simplified loops to
be preserved is polly, and the code behaves correctly there anyways.
Without this change, it becomes really hard to share this code with the
new pass manager where things like preserving loop simplify form don't
make any sense.

If anyone discovers this code behaving incorrectly, what it *should* be
testing for is whether the loops it needs to be in simplified form are
in fact in that form. It should always be trying to preserve that form
when it exists.

llvm-svn: 226443
2015-01-19 10:23:00 +00:00
Chandler Carruth 0eae112009 [PM] Lift the analyses into the interface for
SplitLandingPadPredecessors and remove the Pass argument from its
interface.

Another step to the utilities being usable with both old and new pass
managers.

llvm-svn: 226426
2015-01-19 03:03:39 +00:00
Chandler Carruth b5797b659f [PM] Pull the analyses used for another utility routine into its API
rather than relying on the pass object.

This one is a bit annoying, but will pay off. First, supporting this one
will make the next one much easier, and for utilities like LoopSimplify,
this is moving them (slowly) closer to not having to pass the pass
object around throughout their APIs.

llvm-svn: 226396
2015-01-18 09:21:15 +00:00
Chandler Carruth 32c52c7e04 [PM] Sink the specific analyses preserved by SplitBlock into its
interface, removing Pass from its interface.

This also makes those analyses optional so that passes which don't even
preserve these (or use them) can skip the logic entirely.

llvm-svn: 226394
2015-01-18 02:39:37 +00:00
Chandler Carruth b5c115357c [PM] Replace another Pass argument with specific analyses that are
optionally updated by MergeBlockIntoPredecessors.

No functionality changed, just refactoring to clear the way for the new
pass manager.

llvm-svn: 226392
2015-01-18 02:11:23 +00:00
Chandler Carruth 5eee895ccf [PM] Lift the actual analyses used into the inferface rather than
accepting a Pass and querying it for analyses.

This is necessary to allow the utilities to work both with the old and
new pass managers, and I also think this makes the interface much more
clear and helps the reader know what analyses the utility can actually
handle. I plan to repeat this process iteratively to clean up all the
pass utilities.

llvm-svn: 226386
2015-01-18 01:45:07 +00:00
Chandler Carruth 691addc25f [PM] Now that LoopInfo isn't in the Pass type hierarchy, it is much
cleaner to derive from the generic base.

Thise removes a ton of boiler plate code and somewhat strange and
pointless indirections. It also remove a bunch of the previously needed
friend declarations. To fully remove these, I also lifted the verify
logic into the generic LoopInfoBase, which seems good anyways -- it is
generic and useful logic even for the machine side.

llvm-svn: 226385
2015-01-18 01:25:51 +00:00
Chandler Carruth 24fd029a60 [PM] Remove a dead field.
This was dead even before I refactored how we initialized it, but my
refactoring made it trivially dead and it is now caught by a Clang
warning. This fixes the warning and should clean up the -Werror bot
failures (sorry!).

llvm-svn: 226376
2015-01-17 14:31:35 +00:00
Chandler Carruth 4f8f307c77 [PM] Split the LoopInfo object apart from the legacy pass, creating
a LoopInfoWrapperPass to wire the object up to the legacy pass manager.

This switches all the clients of LoopInfo over and paves the way to port
LoopInfo to the new pass manager. No functionality change is intended
with this iteration.

llvm-svn: 226373
2015-01-17 14:16:18 +00:00
Chandler Carruth b98f63dbdb [PM] Separate the TargetLibraryInfo object from the immutable pass.
The pass is really just a means of accessing a cached instance of the
TargetLibraryInfo object, and this way we can re-use that object for the
new pass manager as its result.

Lots of delta, but nothing interesting happening here. This is the
common pattern that is developing to allow analyses to live in both the
old and new pass manager -- a wrapper pass in the old pass manager
emulates the separation intrinsic to the new pass manager between the
result and pass for analyses.

llvm-svn: 226157
2015-01-15 10:41:28 +00:00
David Majnemer f0982d0ac6 SimplifyIndVar: Remove unused variable
OtherOperandIdx is not used anymore, remove it to silence warnings.

llvm-svn: 226138
2015-01-15 07:11:23 +00:00
NAKAMURA Takumi 24ebfcb619 Update libdeps since TLI was moved from Target to Analysis in r226078.
llvm-svn: 226126
2015-01-15 05:21:00 +00:00
Chandler Carruth 62d4215baa [PM] Move TargetLibraryInfo into the Analysis library.
While the term "Target" is in the name, it doesn't really have to do
with the LLVM Target library -- this isn't an abstraction which LLVM
targets generally need to implement or extend. It has much more to do
with modeling the various runtime libraries on different OSes and with
different runtime environments. The "target" in this sense is the more
general sense of a target of cross compilation.

This is in preparation for porting this analysis to the new pass
manager.

No functionality changed, and updates inbound for Clang and Polly.

llvm-svn: 226078
2015-01-15 02:16:27 +00:00
Sanjoy Das 8c252bde36 Fix PR22222
The bug was introduced in r225282. r225282 assumed that sub X, Y is
the same as add X, -Y. This is not correct if we are going to upgrade
the sub to sub nuw. This change fixes the issue by making the
optimization ignore sub instructions.

Differential Revision: http://reviews.llvm.org/D6979

llvm-svn: 226075
2015-01-15 01:46:09 +00:00
Duncan P. N. Exon Smith e65b0663e6 Remove trailing slash from r225924
llvm-svn: 225929
2015-01-14 01:42:43 +00:00
Duncan P. N. Exon Smith e54cd9a6f3 Utils: Remove unreachable break, NFC
llvm-svn: 225924
2015-01-14 01:31:34 +00:00
Duncan P. N. Exon Smith a5a0f5766a Utils: Handle remapping distinct MDLocations
Part of PR21433.

llvm-svn: 225921
2015-01-14 01:29:32 +00:00
Duncan P. N. Exon Smith b84840c04e Utils: Thread distinct-ness through the cloneMD*() functions, NFC
The new logic isn't actually reachable yet, so no functionality change.

llvm-svn: 225918
2015-01-14 01:24:38 +00:00
Duncan P. N. Exon Smith 7c69c1ebda Utils: Extract cloneMDNode(), NFC
llvm-svn: 225917
2015-01-14 01:22:47 +00:00
Duncan P. N. Exon Smith b6515d6a71 Utils: Move cloneMD*() up, NFC
llvm-svn: 225915
2015-01-14 01:21:24 +00:00
Duncan P. N. Exon Smith 47d82981d6 Utils: Add mapping for uniqued MDLocations
Still doesn't handle distinct ones.  Part of PR21433.

llvm-svn: 225914
2015-01-14 01:20:27 +00:00
Duncan P. N. Exon Smith 4766e01250 Utils: Extract cloneMDTuple(), NFC
llvm-svn: 225912
2015-01-14 01:12:14 +00:00
Duncan P. N. Exon Smith fb9d128ab1 Utils: Extract shouldRemapUniquedNode(), NFC
llvm-svn: 225911
2015-01-14 01:08:47 +00:00
Duncan P. N. Exon Smith 637e765907 Utils: Simplify code, NFC
llvm-svn: 225906
2015-01-14 01:07:03 +00:00
Duncan P. N. Exon Smith b557989a40 Utils: Extract mapUniquedNode(), NFC
llvm-svn: 225905
2015-01-14 01:06:21 +00:00
Duncan P. N. Exon Smith 8725ca8c60 Utils: MDNode => UniquableMDNode, NFC
Although this makes the `cast<>` assert more often, the
`assert(Node->isResolved())` on the following line would assert in all
those cases.  So, no functionality change here.

llvm-svn: 225903
2015-01-14 01:05:17 +00:00
Duncan P. N. Exon Smith 14cc94c1c6 Utils: Separate out mapDistinctNode(), NFC
llvm-svn: 225902
2015-01-14 01:03:05 +00:00
Duncan P. N. Exon Smith 3956a85e6e Utils: Use helper function directly, NFC
llvm-svn: 225901
2015-01-14 01:02:17 +00:00
Duncan P. N. Exon Smith 077affdbb9 Utils: Extract helper function, NFC
llvm-svn: 225897
2015-01-14 01:01:19 +00:00
Duncan P. N. Exon Smith 34651ee2f6 Utils: Use MDTuple::get() directly, NFC
Working towards supporting `MDLocation` in `MapMetadata()`.

llvm-svn: 225896
2015-01-14 00:59:57 +00:00
Ahmed Bougacha 71d7b18e3d [SimplifyLibCalls] Don't try to simplify indirect calls.
It turns out, all callsites of the simplifier are guarded by a check for
CallInst::getCalledFunction (i.e., to make sure the callee is direct).

This check wasn't done when trying to further optimize a simplified fortified
libcall, introduced by a refactoring in r225640.

Fix that, add a testcase, and document the requirement.

llvm-svn: 225895
2015-01-14 00:55:05 +00:00
Ramkumar Ramachandra 181233b2b7 fix {typo, build failure} in r225760
llvm-svn: 225762
2015-01-13 04:17:47 +00:00
Ramkumar Ramachandra 40c3e03e27 Standardize {pred,succ,use,user}_empty()
The functions {pred,succ,use,user}_{begin,end} exist, but many users
have to check *_begin() with *_end() by hand to determine if the
BasicBlock or User is empty. Fix this with a standard *_empty(),
demonstrating a few usecases.

llvm-svn: 225760
2015-01-13 03:46:47 +00:00
Duncan P. N. Exon Smith 118632dbf6 IR: Split GenericMDNode into MDTuple and UniquableMDNode
Split `GenericMDNode` into two classes (with more descriptive names).

  - `UniquableMDNode` will be a common subclass for `MDNode`s that are
    sometimes uniqued like constants, and sometimes 'distinct'.

    This class gets the (short-lived) RAUW support and related API.

  - `MDTuple` is the basic tuple that has always been returned by
    `MDNode::get()`.  This is as opposed to more specific nodes to be
    added soon, which have additional fields, custom assembly syntax,
    and extra semantics.

    This class gets the hash-related logic, since other sublcasses of
    `UniquableMDNode` may need to hash based on other fields.

To keep this diff from getting too big, I've added casts to `MDTuple`
that won't really scale as new subclasses of `UniquableMDNode` are
added, but I'll clean those up incrementally.

(No functionality change intended.)

llvm-svn: 225682
2015-01-12 20:09:34 +00:00
Ahmed Bougacha e03bef7543 [SimplifyLibCalls] Factor out fortified libcall handling.
This lets us remove CGP duplicate.

Differential Revision: http://reviews.llvm.org/D6541

llvm-svn: 225640
2015-01-12 17:22:43 +00:00
Ahmed Bougacha 6722f5e5b3 [SimplifyLibCalls] Factor out str/mem libcall optimizations.
Put them in a separate function, so we can reuse them to further
simplify fortified libcalls as well.

Differential Revision: http://reviews.llvm.org/D6540

llvm-svn: 225639
2015-01-12 17:20:06 +00:00
Ahmed Bougacha b7d8afb6c5 [SimplifyLibCalls] Factor out signature checks for fortifiable libcalls.
The checks are the same for fortified counterparts to the libcalls, so
we might as well do them in a single place.

Differential Revision: http://reviews.llvm.org/D6539

llvm-svn: 225638
2015-01-12 17:18:19 +00:00
Hans Wennborg dcc6e5bc03 SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210)
The previous code assumed that such instructions could not have any uses
outside CaseDest, with the motivation that the instruction could not
dominate CommonDest because CommonDest has phi nodes in it. That simply
isn't true; e.g., CommonDest could have an edge back to itself.

llvm-svn: 225552
2015-01-09 22:13:31 +00:00
Duncan P. N. Exon Smith 953e1a48f0 Utils: Keep distinct MDNodes distinct in MapMetadata()
Create new copies of distinct `MDNode`s instead of following the
uniquing `MDNode` logic.

Just like self-references (or other cycles), `MapMetadata()` creates a
new node.  In practice most calls use `RF_NoModuleLevelChanges`, in
which case nothing is duplicated anyway.

Part of PR22111.

llvm-svn: 225476
2015-01-08 22:42:30 +00:00
Sanjoy Das 7c0ce26614 This patch teaches IndVarSimplify to add nuw and nsw to certain kinds
of operations that provably don't overflow. For example, we can prove
%civ.inc below does not sign-overflow. With this change,
IndVarSimplify changes %civ.inc to an add nsw.

  define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) {
   entry:
    %length = load i32* %length_ptr, !range !0
    %len.sub.1 = sub i32 %length, 1
    %upper = icmp slt i32 %init, %len.sub.1
    br i1 %upper, label %loop, label %exit
  
   loop:
    %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ]
    %civ.inc = add i32 %civ, 1
    %cmp = icmp slt i32 %civ.inc, %length
    br i1 %cmp, label %latch, label %break
  
   latch:
    store i32 0, i32* %array
    %check = icmp slt i32 %civ.inc, %len.sub.1
    br i1 %check, label %loop, label %break
  
   break:
    ret i32 %civ.inc
  
   exit:
    ret i32 42
  }

Differential Revision: http://reviews.llvm.org/D6748

llvm-svn: 225282
2015-01-06 19:02:56 +00:00
Saleem Abdulrasool 150a1dc5c2 SymbolRewriter: use iplist::splice
The swap implementation for iplist is currently unsupported.  Simply splice the
old list into place, which achieves the same purpose.  This is needed in order
to thread the -frewrite-map-file frontend option correctly.  NFC.

llvm-svn: 225186
2015-01-05 17:56:32 +00:00
Saleem Abdulrasool d37ce30888 SymbolRewriter: 80-column
Wrap a couple of lines.  NFC.

llvm-svn: 225185
2015-01-05 17:56:29 +00:00
Chandler Carruth 66b3130cda [PM] Split the AssumptionTracker immutable pass into two separate APIs:
a cache of assumptions for a single function, and an immutable pass that
manages those caches.

The motivation for this change is two fold. Immutable analyses are
really hacks around the current pass manager design and don't exist in
the new design. This is usually OK, but it requires that the core logic
of an immutable pass be reasonably partitioned off from the pass logic.
This change does precisely that. As a consequence it also paves the way
for the *many* utility functions that deal in the assumptions to live in
both pass manager worlds by creating an separate non-pass object with
its own independent API that they all rely on. Now, the only bits of the
system that deal with the actual pass mechanics are those that actually
need to deal with the pass mechanics.

Once this separation is made, several simplifications become pretty
obvious in the assumption cache itself. Rather than using a set and
callback value handles, it can just be a vector of weak value handles.
The callers can easily skip the handles that are null, and eventually we
can wrap all of this up behind a filter iterator.

For now, this adds boiler plate to the various passes, but this kind of
boiler plate will end up making it possible to port these passes to the
new pass manager, and so it will end up factored away pretty reasonably.

llvm-svn: 225131
2015-01-04 12:03:27 +00:00
Michael Liao 5313da3263 [SimplifyCFG] Revise common code sinking
- Fix the case where more than 1 common instructions derived from the same
  operand cannot be sunk. When a pair of value has more than 1 derived values
  in both branches, only 1 derived value could be sunk.
- Replace BB1 -> (BB2, PN) map with joint value map, i.e.
  map of (BB1, BB2) -> PN, which is more accurate to track common ops.

llvm-svn: 224757
2014-12-23 08:26:55 +00:00
Michael Kuperstein 0bf33ffde4 Remove a bad cast in CloneModule()
A cast that was introduced in r209007 was accidentally left in after the changes made to GlobalAlias rules in r210062. This crashes if the aliasee is a now-leggal ConstantExpr.

llvm-svn: 224756
2014-12-23 08:23:45 +00:00
Bruno Cardoso Lopes bad65c3b70 [LCSSA] Handle PHI insertion in disjoint loops
Take two disjoint Loops L1 and L2.

LoopSimplify fails to simplify some loops (e.g. when indirect branches
are involved). In such situations, it can happen that an exit for L1 is
the header of L2. Thus, when we create PHIs in one of such exits we are
also inserting PHIs in L2 header.

This could break LCSSA form for L2 because these inserted PHIs can also
have uses in L2 exits, which are never handled in the current
implementation. Provide a fix for this corner case and test that we
don't assert/crash on that.

Differential Revision: http://reviews.llvm.org/D6624

rdar://problem/19166231

llvm-svn: 224740
2014-12-22 22:35:46 +00:00
Duncan P. N. Exon Smith 46d7af5729 Rename MapValue(Metadata*) to MapMetadata()
Instead of reusing the name `MapValue()` when mapping `Metadata`, use
`MapMetadata()`.  The old name doesn't make much sense after the
`Metadata`/`Value` split.

llvm-svn: 224566
2014-12-19 06:06:18 +00:00
Michael Kuperstein fffb6996c9 The inliner needs to fix up debug information for llvm.dbg.declare, not only for llvm.dbg.value.
Patch by Amjad Aboud

Differential Revision: http://reviews.llvm.org/D6525

llvm-svn: 224015
2014-12-11 12:41:10 +00:00
Kaelyn Takata 22324f378a Rename static functiom "map" to be more descriptive and to avoid
potential confusion with the std::map type.

llvm-svn: 223853
2014-12-09 23:32:46 +00:00
Frederic Riss 35f0a9aeba Remove unneeded curly braces.
llvm-svn: 223809
2014-12-09 18:57:39 +00:00
Frederic Riss ff58fd207e Reorder the code to avoid inserting at the beginning of a vector.
As per dblaikie suggestion, thanks\!

llvm-svn: 223808
2014-12-09 18:57:34 +00:00
Duncan P. N. Exon Smith 5bf8fef580 IR: Split Metadata from Value
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532.  Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.

I have a follow-up patch prepared for `clang`.  If this breaks other
sub-projects, I apologize in advance :(.  Help me compile it on Darwin
I'll try to fix it.  FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.

This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.

Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes:
    `MDNode`, `MDString`, and `ValueAsMetadata`.  It is distinct from
    the `Value` class hierarchy.  It is typeless -- i.e., instances do
    *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
    replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.

    If you're referring solely to resolved `MDNode`s -- post graph
    construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for
    `replaceAllUsesWith()`.

    As long as an `MDNode` is pointing at a forward declaration -- the
    result of `MDNode::getTemporary()` -- it maintains a side map of its
    uses and can RAUW itself.  Once the forward declarations are fully
    resolved RAUW support is dropped on the ground.  This means that
    uniquing collisions on changing operands cause nodes to become
    "distinct".  (This already happened fairly commonly, whenever an
    operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles,
    you need to call `MDNode::resolveCycles()` on each node (or on a
    top-level node that somehow references all of the nodes).  Also,
    don't do that.  Metadata cycles (and the RAUW machinery needed to
    construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called
    `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).

    As a side effect, accessing an operand of an `MDNode` that is known
    to be, e.g., `ConstantInt`, takes three steps: first, cast from
    `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
    third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
    metadata schema owners transition away from using `Constant`s when
    the type isn't important (and they don't care about referring to
    `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst`
    namespace that matches semantics with the old code, in order to
    avoid adding the error-prone three-step equivalent to every call
    site.  If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to
    metadata through a bridge called `MetadataAsValue`.  This is a
    subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a
    `LocalAsMetadata`, which is a bridged form of non-`Constant` values
    like `Argument` and `Instruction`.  It can also refer to any other
    `Metadata` subclass.

(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)

llvm-svn: 223802
2014-12-09 18:38:53 +00:00
Frederic Riss 7c78db5065 Correctly handle complex locations expressions in replaceDbgDeclareForAlloca()
replaceDbgDeclareForAlloca() replaces an alloca by a value storing the
address of what was the alloca. If there is a dbg.declare corresponding
to that alloca, we need to lower it to a dbg.value describing the additional
dereference operation to be performed to get to the underlying variable.
 This is done by adding a DW_OP_deref to the complex location part of the
location description. This deref was added to the end of the operation list,
which is wrong. The expression applies to what is described by the
dbg.{declare,value}, and as we are changing this, we need to apply the
DW_OP_deref as the first operation in the list.

Part of the fix for rdar://19162268.

llvm-svn: 223799
2014-12-09 17:55:48 +00:00
Juergen Ributzka 194350a936 Revert "Move function to obtain branch weights into the BranchInst class. NFC."
This reverts commit r223784 and copies the 'ExtractBranchMetadata' to CodeGenPrepare.

llvm-svn: 223795
2014-12-09 17:32:12 +00:00
Juergen Ributzka e2aa3aa38a Move function to obtain branch weights into the BranchInst class. NFC.
Make this function available to other parts of LLVM.

llvm-svn: 223784
2014-12-09 16:36:06 +00:00
Duncan P. N. Exon Smith b236211c4c Utils: Style cleanups, NFC
llvm-svn: 223556
2014-12-06 00:48:17 +00:00
Duncan P. N. Exon Smith b13f7d2e36 Utils: Avoid RAUW on metadata in CloneFunction()
llvm-svn: 223555
2014-12-06 00:48:13 +00:00
Matthias Braun 395a82f6cc correct spelling, NFC
llvm-svn: 223274
2014-12-03 22:10:39 +00:00
Matthias Braun d34e4d2354 [SimplifyLibCalls] Improve double->float shrinking to consider constants
This allows cases like float x; fmin(1.0, x); to be optimized to fminf(1.0f, x);

rdar://19049359

Differential Revision: http://reviews.llvm.org/D6496

llvm-svn: 223270
2014-12-03 21:46:33 +00:00
Matthias Braun 892c923c46 [SimplifyLibCalls] Enable double to float shrinking for copysign
rdar://19049359

Differential Revision: http://reviews.llvm.org/D6495

llvm-svn: 223269
2014-12-03 21:46:29 +00:00
Bruno Cardoso Lopes 15520db9ad [SwitchLowering] Handle destinations on multiple phi instructions
Follow up from r222926. Also handle multiple destinations from merged
cases on multiple and subsequent phi instructions.

rdar://problem/19106978

llvm-svn: 223135
2014-12-02 18:31:53 +00:00
Hans Wennborg 5bef5b522b Revert r223049, r223050 and r223051 while investigating test failures.
I didn't foresee affecting the Clang test suite :/

llvm-svn: 223054
2014-12-01 17:36:43 +00:00
Hans Wennborg 269ebb612e SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
They would get optimized away later, but we might as well not emit them.

llvm-svn: 223051
2014-12-01 17:08:38 +00:00
Hans Wennborg 5a1e5c05d8 SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations, and
SDag lowering is now prepared to handle them efficiently.

For example, branches to the unreachable destination will be optimized away,
such as in the case of range checks for switch lookup tables.

On 64-bit Linux, this reduces the size of a clang bootstrap by 80 kB (and
Chromium by 30 kB).

llvm-svn: 223050
2014-12-01 17:08:35 +00:00
Bruno Cardoso Lopes bc7ba2c766 [SwitchLowering] Handle multiple destinations on condensed case stmts
Switch cases statements with sequential values that branch to the same
destination BB may often be handled together in a single new source BB.
In this scenario we need to remove remaining incoming values from PHI
instructions in the destination BB, as to match the number of source
branches.

Differential Revision: http://reviews.llvm.org/D6415

rdar://problem/19040894

llvm-svn: 222926
2014-11-28 19:47:33 +00:00
Erik Eckstein 0d86c7623f reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
Fixed missing dominance check.
Original commit message:

This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
   if (idx < tablesize)
      r = table[idx]; // table does not contain default_value
   else
      r = default_value;
   if (r != default_value)
      ...
Is optimized to:
   cond = idx < tablesize;
   if (cond)
      r = table[idx];
   else
      r = default_value;
   if (cond)
      ...
Jump threading will then eliminate the second if(cond).

llvm-svn: 222891
2014-11-27 15:13:14 +00:00
Erik Eckstein 2190cd9ffa Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible."
It is breaking the clang bootstrag.

llvm-svn: 222877
2014-11-27 10:59:08 +00:00
Erik Eckstein e73e308ab9 Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
    if (idx < tablesize)
       r = table[idx]; // table does not contain default_value
    else
       r = default_value;
    if (r != default_value)
       ...
Is optimized to:
    cond = idx < tablesize;
    if (cond)
       r = table[idx];
    else
       r = default_value;
    if (cond)
       ...
\endcode
Jump threading will then eliminate the second if(cond).

llvm-svn: 222872
2014-11-27 08:33:51 +00:00
Mehdi Amini ffd0100618 SimplifyCFG: Refactor GatherConstantCompares() result in a struct
Code seems cleaner and easier to understand this way

This is basically r222416, after fixes for MSVC lack of standard 
support, and a few cleaning (got rid of a warning).
Thanks Nakamura Takumi and Nico Weber for the MSVC fixes.

llvm-svn: 222472
2014-11-20 22:40:25 +00:00
Michael Zolotukhin 0dcae71449 Fix a trip-count overflow issue in LoopUnroll.
Currently LoopUnroll generates a prologue loop before the main loop
body to execute first N%UnrollFactor iterations. Also, this loop is
used if trip-count can overflow - it's determined by a runtime check.

However, we've been mistakenly optimizing this loop to a linear code for
UnrollFactor = 2, not taking into account that it also serves as a safe
version of the loop if its trip-count overflows.

llvm-svn: 222451
2014-11-20 20:19:55 +00:00
Timur Iskhodzhanov 71526a3eda Revert r222416, r222422, r222426: the former revision had problems and fixing them introduced bugs
llvm-svn: 222428
2014-11-20 12:36:43 +00:00
Timur Iskhodzhanov a0bffc0c11 Fix a typo
llvm-svn: 222426
2014-11-20 11:48:58 +00:00
NAKAMURA Takumi 5a83192570 SimplifyCFG.cpp: Tweak to let msc17 compliant.
- Use LLVM_DELETED_FUNCTION.
  - Don't use member initializers.
  - Don't use initializer list.

llvm-svn: 222422
2014-11-20 08:59:02 +00:00
Mehdi Amini 65253e76ed SimplifyCFG: Refactor GatherConstantCompares() result in a struct
Code seems cleaner and easier to understand this way

llvm-svn: 222416
2014-11-20 06:51:02 +00:00
Nico Weber 06839a536f Try to fix MSVS build after r222384. No intended behavior change.
llvm-svn: 222386
2014-11-19 21:16:11 +00:00
Mehdi Amini 9a25cb8806 SimplifyCFG: turn recursive GatherConstantCompares into iterative
A long sequence of || or && could lead to a stack explosion.

llvm-svn: 222384
2014-11-19 20:09:11 +00:00
David Blaikie 70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
Kostya Serebryany e5ea424a77 Introduce llvm::SplitAllCriticalEdges
Summary:
move the code from BreakCriticalEdges::runOnFunction()
into a separate utility function llvm::SplitAllCriticalEdges()
so that it can be used independently.
No functionality change intended.

Test Plan: check-llvm

Reviewers: nlewycky

Reviewed By: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6313

llvm-svn: 222288
2014-11-19 00:17:31 +00:00
Hans Wennborg a6a11a969a SimplifyCFG: Range'ify some for-loops. No functional change.
llvm-svn: 222215
2014-11-18 02:37:11 +00:00
Juergen Ributzka c9591e9bdb [SimplifyCFG] Make the value type of the hole check bitmask a power-of-2.
When converting a switch to a lookup table we might have to generate a bitmaks
to encode and check for holes in the original switch statement.

The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.

To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.

This fixes rdar://problem/18984639.

llvm-svn: 222168
2014-11-17 19:39:56 +00:00
Erik Eckstein 105374fe5e Optimize switch lookup tables with linear mapping.
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:

int f1(int x) {
  switch (x) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
  }
  return 0;
}

generates:

define i32 @f1(i32 %x) #0 {
entry:
  %0 = icmp ult i32 %x, 4
  br i1 %0, label %switch.lookup, label %return

switch.lookup:
  %switch.offset = add i32 %x, 10
  ret i32 %switch.offset

return:
  ret i32 0
}

llvm-svn: 222121
2014-11-17 09:13:57 +00:00
David Blaikie 711cd9c53c Remove redundant virtual on overriden functions.
llvm-svn: 222023
2014-11-14 19:06:36 +00:00
Reid Kleckner 971c3ea67b Use nullptr instead of NULL for variadic sentinels
Windows defines NULL to 0, which when used as an argument to a variadic
function, is not a null pointer constant. As a result, Clang's
-Wsentinel fires on this code. Using '0' would be wrong on most 64-bit
platforms, but both MSVC and Clang make it work on Windows. Sidestep the
issue with nullptr.

llvm-svn: 221940
2014-11-13 22:55:19 +00:00
Ahmed Bougacha 55a333d89b Add fortified (__*_chk) library functions to TLI (NFC)
One of them (__memcpy_chk) was already there, the others were checked
by comparing function names.
Note that the fortified libfuncs are now part of TLI, but are always
available, because they aren't generated, only optimized into the
non-checking versions.

Differential Revision: http://reviews.llvm.org/D6179

llvm-svn: 221817
2014-11-12 21:23:34 +00:00
Sanjay Patel 7777b50eaf remove function names from comments; NFC
llvm-svn: 221798
2014-11-12 18:07:42 +00:00
Duncan P. N. Exon Smith de36e8040f Revert "IR: MDNode => Value"
Instead, we're going to separate metadata from the Value hierarchy.  See
PR21532.

This reverts commit r221375.
This reverts commit r221373.
This reverts commit r221359.
This reverts commit r221167.
This reverts commit r221027.
This reverts commit r221024.
This reverts commit r221023.
This reverts commit r220995.
This reverts commit r220994.

llvm-svn: 221711
2014-11-11 21:30:22 +00:00
Juergen Ributzka d441725d3d [SwitchLowering] Fix the "fixPhis" function.
Switch statements may have more than one incoming edge into the same BB if they
all have the same value. When the switch statement is converted these incoming
edges are now coming from multiple BBs. Updating all incoming values to be from
a single BB is incorrect and would generate invalid LLVM IR.

The fix is to only update the first occurrence of an incoming value. Switch
lowering will perform subsequent calls to this helper function for each incoming
edge with a new basic block - updating all edges in the process.

This fixes rdar://problem/18916275.

llvm-svn: 221627
2014-11-10 21:05:27 +00:00
Vasileios Kalintiris ccde2a9a1e Fix extra semicolon warning. NFC.
llvm-svn: 221613
2014-11-10 17:37:53 +00:00
Saleem Abdulrasool d2c5d7f6da Transforms: address some late comments
We already use the llvm namespace.  Remove the unnecessary prefix.  Use the
StringRef::equals method to compare with C strings rather than instantiating
std::strings.

Addresses late review comments from David Majnemer.

llvm-svn: 221564
2014-11-08 00:00:50 +00:00
Saleem Abdulrasool 92b13aac04 Transforms: sort source files in build
Sort target sources.  NFC.

llvm-svn: 221563
2014-11-08 00:00:47 +00:00
Saleem Abdulrasool 89c5ad4cda Transforms: use typedef rather than using aliases
Visual Studio 2012 apparently does not support using alias declarations.  Use
the more traditional typedef approach.  This should let the Windows buildbots
pass.  NFC.

llvm-svn: 221554
2014-11-07 22:09:52 +00:00
Saleem Abdulrasool 5898e09057 Transform: add SymbolRewriter pass
This introduces the symbol rewriter. This is an IR->IR transformation that is
implemented as a CodeGenPrepare pass. This allows for the transparent
adjustment of the symbols during compilation.

It provides a clean, simple, elegant solution for symbol inter-positioning. This
technique is often used, such as in the various sanitizers and performance
analysis.

The control of this is via a custom YAML syntax map file that indicates source
to destination mapping, so as to avoid having the compiler to know the exact
details of the source to destination transformations.

llvm-svn: 221548
2014-11-07 21:32:08 +00:00
Michael Ilseman a7202bdbed Fix heap-use-after-free bug in expandSDiv when the operands are
constants, as discovered by ASAN.

Patch by Mehdi Amini!

llvm-svn: 221401
2014-11-05 21:28:24 +00:00
Mark Heffernan 2d393ea6ef Revert earlier change removing setPreservesCFG from instcombine (r221223) and
change LoopSimplifyPass to be !isCFGOnly.  The motivation for the earlier patch
(r221223) was that LoopSimplify is not preserved by instcombine though
setPreservesCFG indicates that it is.  This change fixes the issue
by making setPreservesCFG no longer imply LoopSimplifyPass, and is therefore less
invasive.

llvm-svn: 221311
2014-11-04 23:02:09 +00:00
Reid Kleckner dd3f3edafa Revert "Transforms: reapply SVN r219899"
This reverts commit r220811 and r220839. It made an incorrect change to
musttail handling.

llvm-svn: 221226
2014-11-04 02:02:14 +00:00
Duncan P. N. Exon Smith 3d5a02f677 IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()
Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of
`MDNode` to one of `Value`.  Part of PR21433.

llvm-svn: 221167
2014-11-03 18:13:57 +00:00
Duncan P. N. Exon Smith 4abd1a0808 IR: MDNode => Value: Instruction::getAllMetadata()
Change `Instruction::getAllMetadata()` to modify a vector of `Value`
instead of `MDNode` and update call sites.  This is part of PR21433.

llvm-svn: 221027
2014-11-01 00:26:42 +00:00
Duncan P. N. Exon Smith 3872d0084c IR: MDNode => Value: Instruction::getMetadata()
Change `Instruction::getMetadata()` to return `Value` as part of
PR21433.

Update most callers to use `Instruction::getMDNode()`, which wraps the
result in a `cast_or_null<MDNode>`.

llvm-svn: 221024
2014-11-01 00:10:31 +00:00
Saleem Abdulrasool d178ada55e Transforms: reapply SVN r219899
This restores the commit from SVN r219899 with an additional change to ensure
that the CodeGen is correct for the case that was identified as being incorrect
(originally PR7272).

In the case that during inlining we need to synthesize a value on the stack
(i.e. for passing a value byval), then any function involving that alloca must
be stripped of its tailness as the restriction that it does not access the
parent's stack no longer holds.  Unfortunately, a single alloca can cause a
rippling effect through out the inlining as the value may be aliased or may be
mutated through an escaped external call.  As such, we simply track if an alloca
has been introduced in the frame during inlining, and strip any tail calls.

llvm-svn: 220811
2014-10-28 18:27:37 +00:00
NAKAMURA Takumi 335a7bcf1e Untabify and whitespace cleanups.
llvm-svn: 220771
2014-10-28 11:53:30 +00:00
Sanjay Patel 848309da7c Handle sqrt() shrinking in SimplifyLibCalls like any other call
This patch removes a chunk of special case logic for folding 
(float)sqrt((double)x) -> sqrtf(x)
in InstCombineCasts and handles it in the mainstream path of SimplifyLibCalls.

No functional change intended, but I loosened the restriction on the existing
sqrt testcases to allow for this optimization even without unsafe-fp-math because
that's the existing behavior.

I also added a missing test case for not shrinking the llvm.sqrt.f64 intrinsic
in case the result is used as a double.

Differential Revision: http://reviews.llvm.org/D5919

llvm-svn: 220514
2014-10-23 21:52:45 +00:00
Philip Reames d92c2a7592 Preserving 'nonnull' metadata in SimplifyCFG
When we hoist two loads above an if, we can preserve the nonnull metadata.  We could also do the same for sinking them, but we appear to not handle metadata at all in that case.

Thanks to Hal for the review.

Differential Revision: http://reviews.llvm.org/D5910

llvm-svn: 220392
2014-10-22 16:37:13 +00:00
Sanjay Patel a92fa44740 Shrinkify libcalls: use float versions of double libm functions with fast-math (bug 17850)
When a call to a double-precision libm function has fast-math semantics 
(via function attribute for now because there is no IR-level FMF on calls), 
we can avoid fpext/fptrunc operations and use the float version of the call
if the input and output are both float.

We already do this optimization using a command-line option; this patch just
adds the ability for fast-math to use the existing functionality.

I moved the cl::opt from InstructionCombining into SimplifyLibCalls because
it's only ever used internally to that class.

Modified the existing test cases to use the unsafe-fp-math attribute rather
than repeating all tests.

This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850

Differential Revision: http://reviews.llvm.org/D5893

llvm-svn: 220390
2014-10-22 15:29:23 +00:00
Philip Reames d7c21364a9 Teach combineMetadata how to merge 'nonnull' metadata.
combineMetadata is used when merging two instructions into one.  This change teaches it how to merge 'nonnull' - i.e. only preserve it on the new instruction if it's set on both sources.  This isn't actually used yet since I haven't adjusted any of the call sites to pass in nonnull as a 'known metadata'.  

llvm-svn: 220325
2014-10-21 21:02:19 +00:00
Paul Robinson f60e0a160f Do not attribute static allocas to the call site's DebugLoc.
When functions are inlined, instructions without debug information are
attributed to the call site's DebugLoc. After inlining, inlined static
allocas are moved to the caller's entry block, adjacent to the caller's
original static alloca instructions. By retaining the call site's
DebugLoc, these instructions could cause instructions that were
subsequently inserted at the entry block to pick up the same DebugLoc.

Patch by Wolfgang Pieb!

llvm-svn: 220255
2014-10-21 01:00:55 +00:00
Sanjay Patel c699a6117b fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)
If a square root call has an FP multiplication argument that can be reassociated,
then we can hoist a repeated factor out of the square root call and into a fabs().

In the simplest case, this:

   y = sqrt(x * x);

becomes this:

   y = fabs(x);

This patch relies on an earlier optimization in instcombine or reassociate to put the
multiplication tree into a canonical form, so we don't have to search over
every permutation of the multiplication tree.

Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to
use function-level attributes to do this optimization. This needs to be fixed
for both the intrinsics and in the backend.

Differential Revision: http://reviews.llvm.org/D5787

llvm-svn: 219944
2014-10-16 18:48:17 +00:00
Hal Finkel 68dc3c7ab2 Preserve non-byval pointer alignment attributes using @llvm.assume when inlining
For pointer-typed function arguments, enhanced alignment can be asserted using
the 'align' attribute. When inlining, if this enhanced alignment information is
not otherwise available, preserve it using @llvm.assume-based alignment
assumptions.

llvm-svn: 219876
2014-10-15 23:44:41 +00:00
Sanjay Patel 0ca42bb5a8 Optimize away fabs() calls when input is squared (known positive).
Eliminate library calls and intrinsic calls to fabs when the input 
is a squared value.

Note that no unsafe-math / fast-math assumptions are needed for
this optimization.

Differential Revision: http://reviews.llvm.org/D5777

llvm-svn: 219717
2014-10-14 20:43:11 +00:00
Marcello Maggioni 5bbe3df63f Switch to select optimization for two-case switches
This is the same optimization of r219233 with modifications to support PHIs with multiple incoming edges from the same block
and a test to check that this condition is handled.

llvm-svn: 219656
2014-10-14 01:58:26 +00:00
Joerg Sonnenberger 5ca10d0edb Revert r219223, it creates invalid PHI nodes.
llvm-svn: 219587
2014-10-12 17:16:04 +00:00
Arnold Schwaighofer d7d010eb2a SimplifyCFG: Don't convert phis into selects if we could remove undef behavior
instead

We used to transform this:

  define void @test6(i1 %cond, i8* %ptr) {
  entry:
    br i1 %cond, label %bb1, label %bb2

  bb1:
    br label %bb2

  bb2:
    %ptr.2 = phi i8* [ %ptr, %entry ], [ null, %bb1 ]
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

into this:

  define void @test6(i1 %cond, i8* %ptr) {
    %ptr.2 = select i1 %cond, i8* null, i8* %ptr
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

because the simplifycfg transformation into selects would happen to happen
before the simplifycfg transformation that removes unreachable control flow
(We have 'unreachable control flow' due to the store to null which is undefined
behavior).

The existing transformation that removes unreachable control flow in simplifycfg
is:

  /// If BB has an incoming value that will always trigger undefined behavior
  /// (eg. null pointer dereference), remove the branch leading here.
  static bool removeUndefIntroducingPredecessor(BasicBlock *BB)

Now we generate:

  define void @test6(i1 %cond, i8* %ptr) {
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

I did not see any impact on the test-suite + externals.

rdar://18596215

llvm-svn: 219462
2014-10-10 01:27:02 +00:00
Duncan P. N. Exon Smith c46cfcbbc6 LoopUnroll: Create sub-loops in LoopInfo
`LoopUnrollPass` says that it preserves `LoopInfo` -- make it so.  In
particular, tell `LoopInfo` about copies of inner loops when unrolling
the outer loop.

Conservatively, also tell `ScalarEvolution` to forget about the original
versions of these loops, since their inputs may have changed.

Fixes PR20987.

llvm-svn: 219241
2014-10-07 21:19:00 +00:00
Duncan P. N. Exon Smith 9b4d37e8f5 LoopUnroll: Only check for ScalarEvolution analysis once, NFC
A follow-up commit will add use to a tight loop.  We might as well just
find it once anyway.

llvm-svn: 219239
2014-10-07 21:12:44 +00:00
Marcello Maggioni 963bc87dbd Two case switch to select optimization
This optimization tries to convert switch instructions that are used to select a value with only 2 unique cases + default block
to a select or a couple of selects (depending if the default block is reachable or not).

The typical case this optimization wants to be able to optimize is this one:

Example:
switch (a) {
  case 10:                %0 = icmp eq i32 %a, 10
    return 10;            %1 = select i1 %0, i32 10, i32 4
  case 20:        ---->   %2 = icmp eq i32 %a, 20
    return 2;             %3 = select i1 %2, i32 2, i32 %1
  default:
    return 4;
}

It also sets the base for further optimizations that are planned and being reviewed.

llvm-svn: 219223
2014-10-07 18:16:44 +00:00
Duncan P. N. Exon Smith e5d7d9797b LoopUnroll: Change code order of changes to new basic blocks
Add new basic blocks to `LoopInfo` earlier.  No functionality change
intended (simplifies upcoming bugfix patch).

llvm-svn: 219150
2014-10-06 22:05:02 +00:00
Duncan P. N. Exon Smith 0bbf5418c6 Sink comment, NFC
llvm-svn: 219149
2014-10-06 22:04:59 +00:00
Duncan P. N. Exon Smith 611afb229c DIBuilder: Encapsulate DIExpression's element type
`DIExpression`'s elements are 64-bit integers that are stored as
`ConstantInt`.  The accessors already encapsulate the storage.  This
commit updates the `DIBuilder` API to also encapsulate that.

llvm-svn: 218797
2014-10-01 20:26:08 +00:00
Adrian Prantl 87b7eb9d0f Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
2014-10-01 18:55:02 +00:00
Adrian Prantl b458dc2eee Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

llvm-svn: 218782
2014-10-01 18:10:54 +00:00
Adrian Prantl 25a7174e7a Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

llvm-svn: 218778
2014-10-01 17:55:39 +00:00
Tom Stellard 0a4e9a3b25 C API: Add LLVMCloneModule()
llvm-svn: 218775
2014-10-01 17:14:57 +00:00
Jingyue Wu fc0296704c [SimplifyCFG] threshold for folding branches with common destination
Summary:
This patch adds a threshold that controls the number of bonus instructions
allowed for folding branches with common destination. The original code allows
at most one bonus instruction. With this patch, users can customize the
threshold to allow multiple bonus instructions. The default threshold is still
1, so that the code behaves the same as before when users do not specify this
threshold.

The motivation of this change is that tuning this threshold significantly (up
to 25%) improves the performance of some CUDA programs in our internal code
base. In general, branch instructions are very expensive for GPU programs.
Therefore, it is sometimes worth trading more arithmetic computation for a more
straightened control flow. Here's a reduced example:

  __global__ void foo(int a, int b, int c, int d, int e, int n,
                      const int *input, int *output) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i];
    *output = sum;
  }

The select statement in the loop body translates to two branch instructions "if
((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination.
With the default threshold, SimplifyCFG is unable to fold them, because
computing the condition of the second branch "(i | c) ^ d > e" requires two
bonus instructions. With the threshold increased, SimplifyCFG can fold the two
branches so that the loop body contains only one branch, making the code
conceptually look like:

  sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i];

Increasing the threshold significantly improves the performance of this
particular example. In the configuration where both conditions are guaranteed
to be true, increasing the threshold from 1 to 2 improves the performance by
18.24%. Even in the configuration where the first condition is false and the
second condition is true, which favors shortcuts, increasing the threshold from
1 to 2 still improves the performance by 4.35%.

We are still looking for a good threshold and maybe a better cost model than
just counting the number of bonus instructions. However, according to the above
numbers, we think it is at least worth adding a threshold to enable more
experiments and tuning. Let me know what you think. Thanks!

Test Plan: Added one test case to check the threshold is in effect

Reviewers: nadav, eliben, meheff, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D5529

llvm-svn: 218711
2014-09-30 22:23:38 +00:00
Kevin Qin fc02e3c363 Use a loop to simplify the runtime unrolling prologue.
Runtime unrolling will create a prologue to execute the extra
iterations which is can't divided by the unroll factor. It
generates an if-then-else sequence to jump into a factor -1
times unrolled loop body, like

    extraiters = tripcount % loopfactor
    if (extraiters == 0) jump Loop:
    if (extraiters == loopfactor) jump L1
    if (extraiters == loopfactor-1) jump L2
    ...
    L1:  LoopBody;
    L2:  LoopBody;
    ...
    if tripcount < loopfactor jump End
    Loop:
    ...
    End:

It means if the unroll factor is 4, the loop body will be 7
times unrolled, 3 are in loop prologue, and 4 are in the loop.
This commit is to use a loop to execute the extra iterations
in prologue, like

        extraiters = tripcount % loopfactor
        if (extraiters == 0) jump Loop:
        else jump Prol
 Prol:  LoopBody;
        extraiters -= 1                 // Omitted if unroll factor is 2.
        if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
        if (tripcount < loopfactor) jump End
 Loop:
 ...
 End:

Then when unroll factor is 4, the loop body will be copied by
only 5 times, 1 in the prologue loop, 4 in the original loop.
And if the unroll factor is 2, new loop won't be created, just
as the original solution.

llvm-svn: 218604
2014-09-29 11:15:00 +00:00
Reid Kleckner 78927e884b GlobalOpt: Preserve comdats of unoptimized initializers
Rather than slurping in and splatting out the whole ctor list, preserve
the existing array entries without trying to understand them.  Only
remove the entries that we know we can optimize away.  This way we don't
need to wire through priority and comdats or anything else we might add.

Fixes a linker issue where the .init_array or .ctors entry would point
to discarded initialization code if the comdat group from the TU with
the faulty global_ctors entry was dropped.

llvm-svn: 218337
2014-09-23 22:33:01 +00:00
Chris Bieneman cf93cbb7a4 Fixing a build error.
llvm-svn: 217983
2014-09-17 21:06:59 +00:00
Chris Bieneman ad070d0588 Refactoring SimplifyLibCalls to remove static initializers and generally cleaning up the code.
Summary: This eliminates ~200 lines of code mostly file scoped struct definitions that were unnecessary.

Reviewers: chandlerc, resistor

Reviewed By: resistor

Subscribers: morisset, resistor, llvm-commits

Differential Revision: http://reviews.llvm.org/D5364

llvm-svn: 217982
2014-09-17 20:55:46 +00:00
Jingyue Wu b67140b812 Remove dead code in SimplifyCFG
Summary: UsedByBranch is always true according to how BonusInst is defined.

Test Plan:
Passes check-all, and also verified 

if (BonusInst && !UsedByBranch) {
  ...
}

is never entered during check-all.

Reviewers: resistor, nadav, jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, eliben, meheff

Differential Revision: http://reviews.llvm.org/D5324

llvm-svn: 217824
2014-09-15 20:48:13 +00:00
Benjamin Kramer 0bd147da17 Simplify code. No functionality change.
llvm-svn: 217726
2014-09-13 12:38:49 +00:00
Hal Finkel 60db05896a Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.

As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.

The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.

Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.

This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).

llvm-svn: 217342
2014-09-07 18:57:58 +00:00
Hal Finkel 74c2f355d2 Add an Assumption-Tracking Pass
This adds an immutable pass, AssumptionTracker, which keeps a cache of
@llvm.assume call instructions within a module. It uses callback value handles
to keep stale functions and intrinsics out of the map, and it relies on any
code that creates new @llvm.assume calls to notify it of the new instructions.
The benefit is that code needing to find @llvm.assume intrinsics can do so
directly, without scanning the function, thus allowing the cost of @llvm.assume
handling to be negligible when none are present.

The current design is intended to be lightweight. We don't keep track of
anything until we need a list of assumptions in some function. The first time
this happens, we scan the function. After that, we add/remove @llvm.assume
calls from the cache in response to registration calls and ValueHandle
callbacks.

There are no new direct test cases for this pass, but because it calls it
validation function upon module finalization, we'll pick up detectable
inconsistencies from the other tests that touch @llvm.assume calls.

This pass will be used by follow-up commits that make use of @llvm.assume.

llvm-svn: 217334
2014-09-07 12:44:26 +00:00
James Molloy 6b95d8ed36 Enable noalias metadata by default and swap the order of the SLP and Loop vectorizers by default.
After some time maturing, hopefully the flags themselves will be removed.

llvm-svn: 217144
2014-09-04 13:23:08 +00:00
Hal Finkel 0c083024f0 Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata
This feeds AA through the IFI structure into the inliner so that
AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which
functions only access their arguments (instead of just hard-coding some
knowledge of memory intrinsics). Most of the information is only available from
BasicAA; this is important for preserving alias scoping information for
target-specific intrinsics when doing the noalias parameter attribute to
metadata conversion.

llvm-svn: 216866
2014-09-01 09:01:39 +00:00
Hal Finkel cbb85f249e Fix AddAliasScopeMetadata again - alias.scope must be a complete description
I thought that I had fixed this problem in r216818, but I did not do a very
good job. The underlying issue is that when we add alias.scope metadata we are
asserting that this metadata completely describes the aliasing relationships
within the current aliasing scope domain, and so in the context of translating
noalias argument attributes, the pointers must all be based on noalias
arguments (as underlying objects) and have no other kind of underlying object.
In r216818 excluding appropriate accesses from getting alias.scope metadata is
done by looking for underlying objects that are not identified function-local
objects -- but that's wrong because allocas, etc. are also function-local
objects and we need to explicitly check that all underlying objects are the
noalias arguments for which we're adding metadata aliasing scopes.

This fixes the underlying-object check for adding alias.scope metadata, and
does some refactoring of the related capture-checking eligibility logic (and
adds more comments; hopefully making everything a bit clearer).

Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the
feature is still disabled by default).

llvm-svn: 216863
2014-09-01 04:26:40 +00:00
Hal Finkel a3708df41a Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointers
The previous implementation of AddAliasScopeMetadata, which adds noalias
metadata to preserve noalias parameter attribute information when inlining had
a flaw: it would add alias.scope metadata to accesses which might have been
derived from pointers other than noalias function parameters. This was
incorrect because even some access known not to alias with all noalias function
parameters could easily alias with an access derived from some other pointer.
Instead, when deriving from some unknown pointer, we cannot add alias.scope
metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4.
Furthermore, we cannot add alias.scope to functions unless we know they
access only argument-derived pointers (currently, we know this only for
memory intrinsics).

Also, we fix a theoretical problem with using the NoCapture attribute to skip
the capture check. This is incorrect (as explained in the comment added), but
would not matter in any code generated by Clang because we get only inferred
nocapture attributes in Clang-generated IR.

This functionality is not yet enabled by default.

llvm-svn: 216818
2014-08-30 12:48:33 +00:00
Hal Finkel 2d3d6da44b Fix a typo in AddAliasScopeMetadata
llvm-svn: 216741
2014-08-29 16:33:41 +00:00
Craig Topper e1d1294853 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
Bruno Cardoso Lopes e2a1fa35df Remove dangling initializers in GlobalDCE
GlobalDCE deletes global vars and updates their initializers to nullptr
while leaving underlying constants to be cleaned up later by its uses.
The clean up may never happen, fix this by forcing it every time it's
safe to destroy constants.

Final patch by Rafael Espindola
http://reviews.llvm.org/D4931

<rdar://problem/17523868>

llvm-svn: 216390
2014-08-25 17:51:14 +00:00
Craig Topper 4627679cec Use range based for loops to avoid needing to re-mention SmallPtrSet size.
llvm-svn: 216351
2014-08-24 23:23:06 +00:00
David Blaikie 2f3f76fdb1 Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks.
Somewhat unnoticed in the original implementation of discriminators, but
it could cause instructions to end up in new, small,
DW_TAG_lexical_blocks due to the use of DILexicalBlock to track
discriminator changes.

Instead, use DILexicalBlockFile which we already use to track file
changes without introducing new scopes, so it works well to track
discriminator changes in the same way.

llvm-svn: 216239
2014-08-21 22:45:21 +00:00
Craig Topper 71b7b68b74 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 216158
2014-08-21 05:55:13 +00:00
Craig Topper 6230691c91 Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.

llvm-svn: 215870
2014-08-18 00:24:38 +00:00
Craig Topper 5229cfd163 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 215868
2014-08-17 23:47:00 +00:00
Rafael Espindola ea46c32f81 Introduce a helper to combine instruction metadata.
Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use
it.

Patch by Björn Steinbrink!

llvm-svn: 215723
2014-08-15 15:46:38 +00:00
Hal Finkel 61c386126b Copy noalias metadata from call sites to inlined instructions
When a call site with noalias metadata is inlined, that metadata can be
propagated directly to the inlined instructions (only those that might access
memory because it is not useful on the others). Prior to inlining, the noalias
metadata could express that a call would not alias with some other memory
access, which implies that no instruction within that called function would
alias. By propagating the metadata to the inlined instructions, we preserve
that knowledge.

This should complete the enhancements requested in PR20500.

llvm-svn: 215676
2014-08-14 21:09:37 +00:00
Hal Finkel d2dee16c27 Add noalias metadata for general calls (not just memory intrinsics) during inlining
When preserving noalias function parameter attributes by adding noalias
metadata in the inliner, we should do this for general function calls (not just
memory intrinsics). The logic is very similar to what already existed (except
that we want to add this metadata even for functions taking no relevant
parameters). This metadata can be used by ModRef queries in the caller after
inlining.

This addresses the first part of PR20500. Adding noalias metadata during
inlining is still turned off by default.

llvm-svn: 215657
2014-08-14 16:44:03 +00:00
Jan Vesely 0cd3ec6cfa utils: Fix segfault in flattencfg
v2: continue iterating through the rest of the bb
    use for loop

v3: initialize FlattenCFG pass in ScalarOps
    add test

v4: split off initializing flattencfg to a separate patch
    add comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215574
2014-08-13 20:31:53 +00:00
Reid Kleckner e31acf239a Move helper for getting a terminating musttail call to BasicBlock
No functional change.  To be used in future commits that need to look
for such instructions.

Reviewed By: rafael

Differential Revision: http://reviews.llvm.org/D4504

llvm-svn: 215413
2014-08-12 00:05:15 +00:00
Manman Ren 062f58d550 [SimplifyCFG] fix accessing deleted PHINodes in switch-to-table conversion.
When we have a covered lookup table, make sure we don't delete PHINodes that
are cached in PHIs.

rdar://17887153

llvm-svn: 214642
2014-08-02 23:41:54 +00:00
Rafael Espindola d07cf400ab SimplifyCFG: Avoid miscompilations due to removed lifetime intrinsics.
The lifetime intrinsics need some work in order to make it clear which
optimizations are or are not valid.

For now dropping this optimization avoids a miscompilation.

Patch by Björn Steinbrink.

llvm-svn: 214336
2014-07-30 21:04:00 +00:00
Hal Finkel 930469107d Add @llvm.assume, lowering, and some basic properties
This is the first commit in a series that add an @llvm.assume intrinsic which
can be used to provide the optimizer with a condition it may assume to be true
(when the control flow would hit the intrinsic call). Some basic properties are added here:

 - llvm.invariant(true) is dead.
 - llvm.invariant(false) is unreachable (this directly corresponds to the
   documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef).

The intrinsic is tagged as writing arbitrarily, in order to maintain control
dependencies. BasicAA has been updated, however, to return NoModRef for any
particular location-based query so that we don't unnecessarily block code
motion.

llvm-svn: 213973
2014-07-25 21:13:35 +00:00
Hal Finkel ff0bcb60c9 Convert noalias parameter attributes into noalias metadata during inlining
This functionality is currently turned off by default.

Part of the motivation for introducing scoped-noalias metadata is to enable the
preservation of noalias parameter attribute information after inlining.
Sometimes this can be inferred from the code in the caller after inlining, but
often we simply lose valuable information.

The overall process if fairly simple:
 1. Create a new unqiue scope domain.
 2. For each (used) noalias parameter, create a new alias scope.
 3. For each pointer, collect the underlying objects. Add a noalias scope for
    each noalias parameter from which we're not derived (and has not been
    captured prior to that point).
 4. Add an alias.scope for each noalias parameter from which we might be
    derived (or has been captured before that point).

Note that the capture checks apply only if one of the underlying objects is not
an identified function-local object.

llvm-svn: 213949
2014-07-25 15:50:08 +00:00
Manman Ren 4d189fb9a6 Feedback from Hans on r213815. No functionaility change.
llvm-svn: 213895
2014-07-24 21:13:20 +00:00
Hal Finkel 9414665a3b Add scoped-noalias metadata
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
  1. To preserve noalias function attribute information when inlining
  2. To provide the ability to model block-scope C99 restrict pointers

Neither of these two abilities are added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.

What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA
nodes:

!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

Loads and stores can be tagged with an alias-analysis scope, and also, with a
noalias tag for a specific scope:

... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.

Note that is the first element of the scope metadata is a string, then it can
be combined accross functions and translation units. The string can be replaced
by a self-reference to create globally unqiue scope identifiers.

[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]

Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functons have the metadata imply that a1 does not alias with b2.

llvm-svn: 213864
2014-07-24 14:25:39 +00:00
Aaron Ballman 99e0ea0aa8 Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended.
llvm-svn: 213863
2014-07-24 14:24:59 +00:00
Manman Ren edc60376ed SimplifyCFG: fix a bug in switch to table conversion
We use gep to access the global array "switch.table", and the table index
should be treated as unsigned. When the highest bit is 1, this commit
zero-extends the index to an integer type with larger size.

For a switch on i2, we used to generate:
%switch.tableidx = sub i2 %0, -2
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx

It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate
%switch.tableidx = sub i2 %0, -2
%switch.tableidx.zext = zext i2 %switch.tableidx to i3
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext

rdar://17735071

llvm-svn: 213815
2014-07-23 23:13:23 +00:00
Duncan P. N. Exon Smith 6c99015fe2 Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges."
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build.  I'll reply on the list in a moment.

llvm-svn: 213562
2014-07-21 17:06:51 +00:00
Manuel Jacob d11beffef4 [C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges.
Summary: This patch introduces two new iterator ranges and updates existing code to use it.  No functional change intended.

Test Plan: All tests (make check-all) still pass.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4481

llvm-svn: 213474
2014-07-20 09:10:11 +00:00
Peter Collingbourne 818f5c4837 Give SplitBlockAndInsertIfThen the ability to update a domtree.
llvm-svn: 213045
2014-07-15 04:40:27 +00:00
Owen Anderson a8d1c3e74e Fix an issue with the MergeBasicBlockIntoOnlyPred() helper function where it did
not properly handle the case where the predecessor block was the entry block to
the function.  The only in-tree client of this is JumpThreading, which worked
around the issue in its own code.  This patch moves the solution into the helper
so that JumpThreading (and other clients) do not have to replicate the same fix
everywhere.

llvm-svn: 212875
2014-07-12 07:12:47 +00:00
Marcello Maggioni 78035b11ec Fixup PHIs in LowerSwitch when a Leaf node is not emitted.
This commit fixes bug http://llvm.org/bugs/show_bug.cgi?id=20103.

Thanks to Qwertyuiop for the report and the proposed fix.

llvm-svn: 212802
2014-07-11 10:34:36 +00:00
Mark Heffernan 675d401a26 Partially fix PR20058: reduce compile time for loop unrolling with very high count by reducing calls to SE->forgetLoop
llvm-svn: 212782
2014-07-10 23:30:06 +00:00
Hal Finkel a995f92627 Feeding isSafeToSpeculativelyExecute its DataLayout pointer
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferencable loads, and will also be important if/when we add a
dereferencable pointer attribute.

This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.

llvm-svn: 212720
2014-07-10 14:41:31 +00:00
Alexey Samsonov b7dd329f2f Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into
LLVMSupport library. Implement two users of this class:
  * DFSanABIList in DFSan instrumentation pass.
  * SanitizerBlacklist in Clang CodeGen library.
The latter will be modified to use actual source-level information from frontend
(source file names) instead of unstable LLVM IR things (LLVM Module identifier).

Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.

No functionality change.

llvm-svn: 212643
2014-07-09 19:40:08 +00:00
Benjamin Kramer cccdadca45 Fix some Twine locals.
Two of those are use after frees. Found by clang-tidy, fixed by me.

llvm-svn: 212537
2014-07-08 14:55:06 +00:00
Sanjay Patel a932da8f35 Fix for PR17073 ( http://llvm.org/pr17073 ), simplifycfg illegally hoists an operation in a phi node that can trap.
This patch adds to an existing loop over phi nodes in SimplifyCondBranchToCondBranch() to check for trapping ops and bails out of the optimization if we find one of those.

The test cases verify that trapping ops are not hoisted and non-trapping ops are still optimized as expected.

llvm-svn: 212490
2014-07-07 21:19:00 +00:00
Sanjay Patel 0a2ada7b98 fixed some typos in comments
llvm-svn: 212423
2014-07-06 23:10:24 +00:00
Rafael Espindola adf21f2a56 Update the MemoryBuffer API to use ErrorOr.
llvm-svn: 212405
2014-07-06 17:43:13 +00:00
Marcello Maggioni 89c05ad165 Minor stylistic fix in SimplifyCFG (test commit)
llvm-svn: 212259
2014-07-03 08:29:06 +00:00
David Blaikie 644d2eee59 DebugInfo: Preserve debug location information when transforming a call into an invoke during inlining.
This both improves basic debug info quality, but also fixes a larger
hole whenever we inline a call/invoke without a location (debug info for
the entire inlining is lost and other badness that the debug info
emission code is currently working around but shouldn't have to).

llvm-svn: 212065
2014-06-30 20:30:39 +00:00
Alp Toker e69170a110 Revert "Introduce a string_ostream string builder facilty"
Temporarily back out commits r211749, r211752 and r211754.

llvm-svn: 211814
2014-06-26 22:52:05 +00:00
Hans Wennborg b03ebfb77e Don't build switch tables for dllimport and TLS variables in GEPs
This is a follow-up to r211331, which failed to notice that we were
returning early from ValidLookupTableConstant for GEPs.

llvm-svn: 211753
2014-06-26 00:30:52 +00:00
Alp Toker 614717388c Introduce a string_ostream string builder facilty
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.

small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.

This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.

The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.

llvm-svn: 211749
2014-06-26 00:00:48 +00:00
Benjamin Kramer 0bf086f80f LoopUnrollRuntime: Check for overflow in the trip count calculation.
Fixes PR19823.

llvm-svn: 211436
2014-06-21 13:46:25 +00:00
Hans Wennborg 4dc895164a Don't build switch lookup tables for dllimport or TLS variables
We would previously put dllimport variables in switch lookup tables, which
doesn't work because the address cannot be used in a constant initializer.
This is basically the same problem that we have in PR19955.

Putting TLS variables in switch tables also desn't work, because the
address of such a variable is not constant.

Differential Revision: http://reviews.llvm.org/D4220

llvm-svn: 211331
2014-06-20 00:38:12 +00:00
Jim Grosbach fff5663d48 LowerSwitch: track bounding range for the condition tree.
When LowerSwitch transforms a switch instruction into a tree of ifs it
is actually performing a binary search into the various case ranges, to
see if the current value falls into one cases range of values.

So, if we have a program with something like this:

switch (a) {
case 0:
  do0();
  break;
case 1:
  do1();
  break;
case 2:
  do2();
  break;
default:
  break;
}

the code produced is something like this:

  if (a < 1) {
    if (a == 0) {
      do0();
    }
  } else {
    if (a < 2) {
      if (a == 1) {
        do1();
      }
    } else {
      if (a == 2) {
        do2();
      }
    }
  }

This code is inefficient because the check (a == 1) to execute do1() is
not needed.

The reason is that because we already checked that (a >= 1) initially by
checking that also  (a < 2) we basically already inferred that (a == 1)
without the need of an extra basic block spawned to check if actually (a
== 1).

The patch addresses this problem by keeping track of already
checked bounds in the LowerSwitch algorithm, so that when the time
arrives to produce a Leaf Block that checks the equality with the case
value / range the algorithm can decide if that block is really needed
depending on the already checked bounds .

For example, the above with "a = 1" would work like this:

the bounds start as LB: NONE , UB: NONE
as (a < 1) is emitted the bounds for the else path become LB: 1 UB:
NONE. This happens because by failing the test (a < 1) we know that the
value "a" cannot be smaller than 1 if we enter the else branch.
After the emitting the check (a < 2) the bounds in the if branch become
LB: 1 UB: 1. This is because by checking that "a" is smaller than 2 then
the upper bound becomes 2 - 1 = 1.

When it is time to emit the leaf block for "case 1:" we notice that 1
can be squeezed exactly in between the LB and UB, which means that if we
arrived to that block there is no need to emit a block that checks if (a
== 1).

Patch by: Marcello Maggioni <hayarms@gmail.com>

llvm-svn: 211038
2014-06-16 16:55:20 +00:00
Rafael Espindola db4ed0bdab Remove 'using std::errro_code' from lib.
llvm-svn: 210871
2014-06-13 02:24:39 +00:00
Rafael Espindola 3acea39853 Don't use 'using std::error_code' in include/llvm.
This should make sure that most new uses use the std prefix.

llvm-svn: 210835
2014-06-12 21:46:39 +00:00
Rafael Espindola a6e9c3e43a Remove system_error.h.
This is a minimal change to remove the header. I will remove the occurrences
of "using std::error_code" in a followup patch.

llvm-svn: 210803
2014-06-12 17:38:55 +00:00
Evgeniy Stepanov 2be29929be Fix line numbers for code inlined from __nodebug__ functions.
Instructions from __nodebug__ functions don't have file:line
information even when inlined into no-nodebug functions. As a result,
intrinsics (SSE and other) from <*intrin.h> clang headers _never_
have file:line information.

With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.

Fixes PR19001.

llvm-svn: 210459
2014-06-09 09:09:19 +00:00
Rafael Espindola 64c1e18033 Allow alias to point to an arbitrary ConstantExpr.
This  patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.

This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like

@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
                                 i32 ptrtoint (i32* @bar to i32)) to i32*)

An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).

Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just uses "const char*" for that.

llvm-svn: 210062
2014-06-03 02:41:57 +00:00
Matt Arsenault c8fc08c31b Make bitcast, extractelement, and insertelement considered cheap for speculation.
This helps more branches into selects. On R600,
vectors are cheap and anything that helps
remove branches is very good.

llvm-svn: 209914
2014-05-30 18:34:43 +00:00
Dinesh Dwivedi d266cb1a0b LCSSA should be performed on the outermost affected loop while unrolling loop.
During loop-unroll, loop exits from the current loop may end up in in different
outer loop. This requires to re-form LCSSA recursively for one level down from
the outer most loop where loop exits are landed during unroll. This fixes PR18861.

Differential Revision: http://reviews.llvm.org/D2976

llvm-svn: 209796
2014-05-29 06:47:23 +00:00
Diego Novillo 7f8af8bf91 Add support for missed and analysis optimization remarks.
Summary:
This adds two new diagnostics: -pass-remarks-missed and
-pass-remarks-analysis. They take the same values as -pass-remarks but
are intended to be triggered in different contexts.

-pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed,
which passes call when they tried to apply a transformation but
couldn't.

-pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis,
which passes call when they want to inform the user about analysis
results.

The patch also:

1- Adds support in the inliner for the two new remarks and a
   test case.

2- Moves emitOptimizationRemark* functions to the llvm namespace.

3- Adds an LLVMContext argument instead of making them member functions
   of LLVMContext.

Reviewers: qcolombet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3682

llvm-svn: 209442
2014-05-22 14:19:46 +00:00
Eric Christopher a5ec92556f Revert "Patch for function cloning to inline all blocks whose address is taken"
as it was causing build failures in ruby.

This reverts commit r207713.

llvm-svn: 209135
2014-05-19 16:04:10 +00:00
Rafael Espindola f1bedd3747 Use create methods since msvc doesn't handle delegating constructors.
llvm-svn: 209076
2014-05-17 21:29:57 +00:00
Rafael Espindola 8370565820 Reduce abuse of default values in the GlobalAlias constructor.
This is in preparation for adding an optional offset.

llvm-svn: 209073
2014-05-17 19:57:46 +00:00
Reid Kleckner fceb76f5f9 Add comdat key field to llvm.global_ctors and llvm.global_dtors
This allows us to put dynamic initializers for weak data into the same
comdat group as the data being initialized.  This is necessary for MSVC
ABI compatibility.  Once we have comdats for guard variables, we can use
the combination to help GlobalOpt fire more often for weak data with
guarded initialization on other platforms.

Reviewers: nlewycky

Differential Revision: http://reviews.llvm.org/D3499

llvm-svn: 209015
2014-05-16 20:39:27 +00:00
Rafael Espindola 6b238633b7 Fix most of PR10367.
This patch changes the design of GlobalAlias so that it doesn't take a
ConstantExpr anymore. It now points directly to a GlobalObject, but its type is
independent of the aliasee type.

To avoid changing all alias related tests in this patches, I kept the common
syntax

@foo = alias i32* @bar

to mean the same as now. The cases that used to use cast now use the more
general syntax

@foo = alias i16, i32* @bar.

Note that GlobalAlias now behaves a bit more like GlobalVariable. We
know that its type is always a pointer, so we omit the '*'.

For the bitcode, a nice surprise is that we were writing both identical types
already, so the format change is minimal. Auto upgrade is handled by looking
through the casts and no new fields are needed for now. New bitcode will
simply have different types for Alias and Aliasee.

One last interesting point in the patch is that replaceAllUsesWith becomes
smart enough to avoid putting a ConstantExpr in the aliasee. This seems better
than checking and updating every caller.

A followup patch will delete getAliasedGlobal now that it is redundant. Another
patch will add support for an explicit offset.

llvm-svn: 209007
2014-05-16 19:35:39 +00:00
Rafael Espindola 4fe0094fd1 Change the GlobalAlias constructor to look a bit more like GlobalVariable.
This is part of the fix for pr10367. A GlobalAlias always has a pointer type,
so just have the constructor build the type.

llvm-svn: 208983
2014-05-16 13:34:04 +00:00
Reid Kleckner 900d46ff39 Don't insert lifetime.end markers between a musttail call and ret
The allocas going out of scope are immediately killed by the return
instruction.

This is a resend of r208912, which was committed accidentally.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3792

llvm-svn: 208920
2014-05-15 21:10:46 +00:00
Reid Kleckner b16564109b Revert "Don't insert lifetime.end markers between a musttail call and ret"
This reverts commit r208912.

It was committed accidentally without review.

llvm-svn: 208914
2014-05-15 20:41:05 +00:00
Reid Kleckner 6af21245eb Remove unused variable in inliner
We have to iterate over all the calls that were inlined to find out if
any were musttail.

Sink another variable down to where its used.

llvm-svn: 208913
2014-05-15 20:39:42 +00:00
Reid Kleckner 26ab7ead45 Don't insert lifetime.end markers between a musttail call and ret
The allocas going out of scope are immediately killed by the return
instruction.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3630

llvm-svn: 208912
2014-05-15 20:39:13 +00:00
Reid Kleckner f0915aa0e6 Teach the inliner how to preserve musttail invariants
The interesting case is what happens when you inline a musttail call
through a musttail call site.  In this case, we can't break perfect
forwarding or allow any stack growth.

Instead of merging control flow from the inlined return instruction
after a musttail call into the body of the caller, leave the inlined
return instruction in the caller so that the musttail call stays in the
tail position.

More work is required in http://reviews.llvm.org/D3630 to handle the
case where the inlined function has dynamic allocas or byval arguments.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3491

llvm-svn: 208910
2014-05-15 20:11:28 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Rafael Espindola 99e05cf163 Split GlobalValue into GlobalValue and GlobalObject.
This allows code to statically accept a Function or a GlobalVariable, but
not an alias. This is already a cleanup by itself IMHO, but the main
reason for it is that it gives a lot more confidence that the refactoring to fix
the design of GlobalAlias is correct. That will be a followup patch.

llvm-svn: 208716
2014-05-13 18:45:48 +00:00
Louis Gerbarg 1f54b82164 Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost
Since ExtractValue is not included in ComputeSpeculationCost CFGs containing
ExtractValueInsts cannot be simplified. In particular this interacts with
InstCombineCompare's tendency to insert add.with.overflow intrinsics for
certain idiomatic math operations, preventing optimization.

This patch adds ExtractValue to the ComputeSpeculationCost. Test case included

rdar://14853450

llvm-svn: 208434
2014-05-09 17:02:46 +00:00
Rafael Espindola fc13db477a Use auto and clang-format this snippet.
llvm-svn: 208421
2014-05-09 16:01:06 +00:00
Richard Smith c167d656e7 Re-commit r208025, reverted in r208030, with a fix for a conformance issue
which GCC detects and Clang does not!

llvm-svn: 208033
2014-05-06 01:44:26 +00:00
Richard Smith 09bf116939 Revert r208025, which made buildbots unhappy for unknown reasons.
llvm-svn: 208030
2014-05-06 01:26:00 +00:00
Richard Smith 6cf1d744d8 Add llvm::function_ref (and a couple of uses of it), representing a type-erased reference to a callable object.
llvm-svn: 208025
2014-05-06 01:01:29 +00:00
Nico Weber 4b2acde21a Teach GlobalDCE how to remove empty global_ctor entries.
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.

GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.

Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.

llvm-svn: 207856
2014-05-02 18:35:25 +00:00
Nick Lewycky 718ada97bc Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 times in a bootstrap of clang.
llvm-svn: 207828
2014-05-02 04:11:45 +00:00
Gerolf Hoflehner 3282af13d4 Patch for function cloning to inline all blocks whose address is taken
Not all address taken blocks get inlined. The reason is
that a blocks new address is known only when it is cloned. But e.g.
a branch instruction in a different block could need that address earlier
while it gets cloned. The solution is to collect the set of all
blocks that can potentially get inlined and compute a new block address
up front. Then clone and cleanup.

rdar://16427209

llvm-svn: 207713
2014-04-30 22:05:02 +00:00
Diego Novillo 34fc8a7c4c Add optimization remarks to the loop unroller and vectorizer.
Summary:
This calls emitOptimizationRemark from the loop unroller and vectorizer
at the point where they make a positive transformation. For the
vectorizer, it reports vectorization and interleave factors. For the
loop unroller, it reports all the different supported types of
unrolling.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3456

llvm-svn: 207528
2014-04-29 14:27:31 +00:00
Michael Zolotukhin a93fec040a Fix a typo in comment
llvm-svn: 207499
2014-04-29 07:35:33 +00:00
Chandler Carruth 5bdf72cef6 Fix rampant quadratic behavior in UpdatePHINodes. The operation of
mapping from a basic block to an incoming value, either for removal or
just lookup, is linear in the number of predecessors, and we were doing
this for every entry in the 'Preds' list which is in many cases almost
all of them!

Unfortunately, the fixes are quite ugly. PHI nodes just don't make this
operation easy. The efficient way to fix this is to have a clever
'remove_if' operation on PHI nodes that lets us do a single pass over
all the incoming values of the original PHI node, extracting the ones we
care about. Then we could quickly construct the new phi node from this
list. This would remove the remaining underlying quadratic movement of
unrelated incoming values and the need for silly backwards looping to
"minimize" how often we hit the quadratic case.

This is the last obvious fix for PR19499. It shaves another 20% off the
compile time for me, and while UpdatePHINodes remains in the profile,
most of the time is now stemming from the well known inefficiencies of
LVI and jump threading.

llvm-svn: 207409
2014-04-28 10:37:30 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Gerolf Hoflehner af7a87d2e3 RecursivelyDeleteTriviallyDeadInstructions() could remove
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

Repaired r207302.

llvm-svn: 207309
2014-04-26 05:58:11 +00:00
Gerolf Hoflehner 1da7cbd584 Restore CloneFunction.cpp which got accidently
overwritten by previous backout of r207303

llvm-svn: 207308
2014-04-26 05:43:41 +00:00
Gerolf Hoflehner c46e9b0423 Revert commit r207302 since build failures
have been reported.

llvm-svn: 207303
2014-04-26 02:03:17 +00:00
Gerolf Hoflehner 34210108b3 RecursivelyDeleteTriviallyDeadInstructions() could remove
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

llvm-svn: 207302
2014-04-26 01:19:16 +00:00
Adrian Prantl 232897feaa Unbreak the gdb buildbot by not lowering dbg.declare intrinsics for arrays.
llvm-svn: 207284
2014-04-25 23:00:25 +00:00
Adrian Prantl 32da88923a This reapplies r207235 with an additional bugfixes caught by the msan
buildbot - do not insert debug intrinsics before phi nodes.

Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207269
2014-04-25 20:49:25 +00:00
Adrian Prantl d2d9b76e48 Revert "This reapplies r207130 with an additional testcase+and a missing check for"
This reverts commit 207235 to investigate msan buildbot breakage.

llvm-svn: 207250
2014-04-25 18:18:09 +00:00
Adrian Prantl 0840a22452 Reapply r207135 without modifications.
Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.

rdar://problem/14874886, rdar://problem/16679936

llvm-svn: 207236
2014-04-25 17:01:04 +00:00
Adrian Prantl f5834a4b49 This reapplies r207130 with an additional testcase+and a missing check for
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207235
2014-04-25 17:01:00 +00:00
Craig Topper f40110f4d8 [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Adrian Prantl 6e5de2ea06 Revert "This reapplies r207130 with an additional testcase+and a missing check for"
Typo in testcase.

llvm-svn: 207166
2014-04-25 00:42:50 +00:00