Commit Graph

3409 Commits

Author SHA1 Message Date
Simon Atanasyan e12bef7ea7 ValueMapper: Fix unused var warning. NFC
llvm-svn: 266529
2016-04-16 11:49:40 +00:00
Duncan P. N. Exon Smith a77d073305 ValueMapper: Stop memoizing ConstantAsMetadata
Stop memoizing ConstantAsMetadata in ValueMapper::mapMetadata.  Now we
have to recompute it, but these metadata aren't particularly common, and
it restricts the lifetime of the Metadata map unnecessarily.

(The motivation is that I have a patch which uses a single Metadata map
for the lifetime of IRMover.  Mehdi profiled r266446 with the patch
applied and we saw a pretty big speedup in lib/Linker.)

llvm-svn: 266513
2016-04-16 03:39:44 +00:00
Duncan P. N. Exon Smith 39423b0294 Reapply "ValueMapper: Eliminate cross-file co-recursion, NFC"
This reverts commit r266507, reapplying r266503 (and r266505
"ValueMapper: Use API from r266503 in unit tests, NFC") completely
unchanged.

I reverted because of a bot failure here:
  http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16810/

However, looking more closely, the failure was from a host-compiler
crash (clang 3.7.1) when building:
  lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/DwarfAccelTable.cpp.o

I didn't modify that file, or anything it includes, with that commit.

The next build (which hadn't picked up my revert) got past it:
  http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16811/

I think this was just unfortunate timing.  I suppose the bot must be
flakey.

llvm-svn: 266510
2016-04-16 02:29:55 +00:00
Duncan P. N. Exon Smith 6fe1ff260b Revert "ValueMapper: Eliminate cross-file co-recursion, NFC"
This reverts commit r266503, in case it's the root cause of this bot
failure:

  http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16810

I'm also reverting r266505 -- "ValueMapper: Use API from r266503 in unit
tests, NFC" -- since it's in the way.

llvm-svn: 266507
2016-04-16 02:05:33 +00:00
Duncan P. N. Exon Smith f0d73f95c1 ValueMapper: Eliminate cross-file co-recursion, NFC
Eliminate co-recursion of Mapper::mapValue through
ValueMaterializer::materializeInitFor, through a major redesign of the
ValueMapper.cpp interface.

  - Expose a ValueMapper class that controls the entry points to the
    mapping algorithms.
  - Change IRLinker to use ValueMapper directly, rather than
    llvm::RemapInstruction, llvm::MapValue, etc.
  - Use (e.g.) ValueMapper::scheduleMapGlobalInit to add mapping work to
    a worklist in ValueMapper instead of recursing.

There were two fairly major complications.

Firstly, IRLinker::linkAppendingVarProto incorporates an on-the-fly IR
ugprade that I had to split apart.  Long-term, this upgrade should be
done in the bitcode reader (and we should only accept the "new" form),
but for now I've just made it work and added a FIXME.  The hold-op is
that we need to deprecate C API that relies on this.

Secondly, IRLinker has special logic to correctly implement aliases with
comdats, and uses two ValueToValueMapTy instances and two
ValueMaterializers.  I supported this by allowing clients to register an
alternate mapping context, whose MCID can be passed in when scheduling
new work.

While out of scope for this commit, it should now be straightforward to
remove recursion from Mapper::mapValue.

llvm-svn: 266503
2016-04-16 01:29:08 +00:00
Duncan P. N. Exon Smith db6861e7dd ValueMapper: Hide Mapper::VM behind an accessor, NFC
Change Mapper::VM to a pointer and add a `getVM()` accessor for it.
While this has no functionality change, it minimizes the diff on an
upcoming patch that allows switching between instances of
ValueToValueMapTy on a single Mapper instance.

llvm-svn: 266490
2016-04-15 23:18:43 +00:00
Adrian Prantl 75819aedf6 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

llvm-svn: 266446
2016-04-15 15:57:41 +00:00
Sanjay Patel f11ab05bdb [SimplifyCFG] propagate branch metadata when creating select (PR27344)
This is almost identical to:
http://reviews.llvm.org/rL264527

This doesn't solve PR27344; it just allows the profile weights to survive. 
To solve the bug, we need to use the profile weights in the backend.

llvm-svn: 266442
2016-04-15 15:32:12 +00:00
Dehao Chen 34cc676732 Fix null pointer access for discriminator assignment.
Summary: This fixes the buildbot failure.

Reviewers: dnovillo, davidxl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19129

llvm-svn: 266360
2016-04-14 19:46:38 +00:00
Dehao Chen 46f8fbbb1b Update discriminator assignment algorithm to handle nested call correctly.
Summary: Add discriminator for nested call correctly.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19127

llvm-svn: 266354
2016-04-14 18:37:18 +00:00
Davide Italiano 96d2a1c603 [ValueMapper] Range-loopify to improve readability. NFC.
llvm-svn: 266350
2016-04-14 18:07:32 +00:00
Duncan P. N. Exon Smith 11f60fd65a ValueMapper: Resolve cycles on the new nodes
Fix a major bug from r265456.  Although it's now much rarer, ValueMapper
sometimes has to duplicate cycles.  The
might-transitively-reference-a-temporary counts don't decrement on their
own when there are cycles, and you need to call MDNode::resolveCycles to
fix it.

r265456 was checking the input nodes to see if they were unresolved.
This is useless; they should never be unresolved.  Instead we should
check the output nodes and resolve cycles on them.

llvm-svn: 266258
2016-04-13 22:54:01 +00:00
David L Kreitzer 752c1448fe Simplify strlen to a subtraction for certain cases.
Patch by Li Huang (li1.huang@intel.com)

Differential Revision: http://reviews.llvm.org/D18230

llvm-svn: 266200
2016-04-13 14:31:06 +00:00
Mehdi Amini 818f67add5 Fix mismatch on returned type between header and implementation for createNameAnonFunctionPass()
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266151
2016-04-12 23:25:11 +00:00
Mehdi Amini d5faa267c4 Add a pass to name anonymous/nameless function
Summary:
For correct handling of alias to nameless
function, we need to be able to refer them through a GUID in the summary.
Here we name them using a hash of the non-private global names in the module.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18883

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266132
2016-04-12 21:35:28 +00:00
Mehdi Amini ae280e54a9 ThinLTO renaming: use module hash instead of position in the summary
This is more robust to changes in the link ordering.

Differential Revision: http://reviews.llvm.org/D18946

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266018
2016-04-11 23:26:46 +00:00
Hans Wennborg e9134897f4 Fix a couple of redundant conditional expressions (PR27283, PR28282)
llvm-svn: 265987
2016-04-11 20:35:01 +00:00
Matthew Simpson 53207a99f9 [LoopUtils, LV] Fix PR27246 (first-order recurrences)
This patch ensures that when we detect first-order recurrences, we reject a phi
node if its previous value is also a phi node. During vectorization the initial
and previous values of the recurrence are shuffled together to create the value
for the current iteration. However, phi nodes are not widened like other
instructions. This fixes PR27246.

Differential Revision: http://reviews.llvm.org/D18971

llvm-svn: 265983
2016-04-11 19:48:18 +00:00
Sanjoy Das f9d88e650b This reverts commit r265913 and r265912
See PR27315

r265913: "[IndVars] Eliminate op.with.overflow when possible"

r265912: "[SCEV] See through op.with.overflow intrinsics"
llvm-svn: 265950
2016-04-11 15:26:18 +00:00
Sanjoy Das a07ad647ee [IndVars] Eliminate op.with.overflow when possible
Summary:
If we can prove that an op.with.overflow intrinsic does not overflow, we
can get rid of the intrinsic, and replace it with non-wrapping
arithmetic.

Reviewers: atrick, regehr

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18685

llvm-svn: 265913
2016-04-10 22:50:31 +00:00
Sanjoy Das dd77e1e6a5 Maintain calling convention when inling calls to llvm.deoptimize
The behavior here was buggy -- we'd forget the calling convention after
inlining a callsite calling llvm.deoptimize.

llvm-svn: 265867
2016-04-09 00:22:59 +00:00
Evgeny Stupachenko 8788048403 test commit
llvm-svn: 265840
2016-04-08 20:20:38 +00:00
Duncan P. N. Exon Smith bb2c3e199e ValueMapper: Extract llvm::RemapFunction from IRMover.cpp, NFC
Strip out the remapping parts of IRLinker::linkFunctionBody and put them
in ValueMapper.cpp under the name Mapper::remapFunction (with a
top-level entry-point llvm::RemapFunction).

This is a nice cleanup on its own since it puts the remapping code
together and shares a single Mapper context for the entire
IRLinker::linkFunctionBody Call.  Besides that, this will make it easier
to break the co-recursion between IRMover.cpp and ValueMapper.cpp in
follow ups.

llvm-svn: 265835
2016-04-08 19:26:32 +00:00
Duncan P. N. Exon Smith adcebdf2d1 ValueMapper: Always use Mapper::mapValue from remapInstruction, NFCI
Use Mapper::mapValue instead of llvm::MapValue from
Mapper::remapInstruction when mapping an incoming block for a PHINode
(follow-up to r265832).  This will implicitly pass along the
Materializer argument, but when this code was added in r133513 there was
no Materializer argument.  I suspect this call to MapValue was just
missed in r182776 since it's not observable (basic blocks can't be
materialized, and they don't reference other values).

llvm-svn: 265833
2016-04-08 19:17:13 +00:00
Duncan P. N. Exon Smith a574e7a7a4 ValueMapper: Roll RemapInstruction into Mapper, NFC
Add Mapper::remapInstruction, move the guts of llvm::RemapInstruction
into it, and use the same Mapper for most of the calls to MapValue and
MapMetadata.  There should be no functionality change here.

I left off the call to MapValue that wasn't passing in a Materializer
argument (for basic blocks of PHINodes).  It shouldn't change
functionality either, but I'm suspicious enough to commit separately.

llvm-svn: 265832
2016-04-08 19:09:34 +00:00
Duncan P. N. Exon Smith 69341e6abc ValueMapper: Don't memoize metadata when RF_NoModuleLevelChanges
Prevent the Metadata side-table in ValueMap from growing unnecessarily
when RF_NoModuleLevelChanges.  As a drive-by, make ValueMap::hasMD,
which apparently had no users until I used it here for testing, actually
compile.

llvm-svn: 265828
2016-04-08 18:49:36 +00:00
Duncan P. N. Exon Smith e05ff7c1a7 ValueMapper: Stop memoizing MDStrings
Stop adding MDString to the Metadata section of the ValueMap in
MapMetadata.  It blows up the size of the map for no benefit, since we
can always return quickly anyway.

There is a potential follow-up that I don't think I'll push on right
away, but maybe someone else is interested:  stop checking for a
pre-mapped MDString, and move the `isa<MDString>()` checks in
Mapper::mapSimpleMetadata and MDNodeMapper::getMappedOp in front of the
`VM.getMappedMD()` calls.  While this would preclude explicitly
remapping MDStrings it would probably be a little faster.

llvm-svn: 265827
2016-04-08 18:47:02 +00:00
Duncan P. N. Exon Smith 4ec55f8ab6 Reapply "ValueMapper: Treat LocalAsMetadata more like function-local Values"
This reverts commit r265765, reapplying r265759 after changing a call from
LocalAsMetadata::get to ValueAsMetadata::get (and adding a unit test).  When a
local value is mapped to a constant (like "i32 %a" => "i32 7"), the new debug
intrinsic operand may no longer be pointing at a local.

    http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/19020/

The previous coommit message follows:

--

This is a partial re-commit -- maybe more of a re-implementation -- of
r265631 (reverted in r265637).

This makes RF_IgnoreMissingLocals behave (almost) consistently between
the Value and the Metadata hierarchy.  In particular:

  - MapValue returns nullptr or "metadata !{}" for missing locals in
    MetadataAsValue/LocalAsMetadata bridging paris, depending on
    the RF_IgnoreMissingLocals flag.

  - MapValue doesn't memoize LocalAsMetadata-related results.

  - MapMetadata no longer deals with LocalAsMetadata or
    RF_IgnoreMissingLocals at all.  (This wasn't in r265631 at all, but
    I realized during testing it would make the patch simpler with no
    loss of generality.)

r265631 went too far, making both functions universally ignore
RF_IgnoreMissingLocals.  This broke building (e.g.) compiler-rt.
Reassociate (and possibly other passes) don't currently maintain
dominates-use invariants for metadata operands, resulting in IR like
this:

    define void @foo(i32 %arg) {
      call void @llvm.some.intrinsic(metadata i32 %x)
      %x = add i32 1, i32 %arg
    }

If the inliner chooses to inline @foo into another function, then
RemapInstruction will call `MapValue(metadata i32 %x)` and assert that
the return is not nullptr.

I've filed PR27273 to add a Verifier check and fix the underlying
problem in the optimization passes.

As a workaround, return `!{}` instead of nullptr for unmapped
LocalAsMetadata when RF_IgnoreMissingLocals is unset.  Otherwise, match
the behaviour of r265631.

Original commit message:

    ValueMapper: Make LocalAsMetadata match function-local Values

    Start treating LocalAsMetadata similarly to function-local members of
    the Value hierarchy in MapValue and MapMetadata.

      - Don't memoize them.
      - Return nullptr if they are missing.

    This also cleans up ConstantAsMetadata to stop listening to the
    RF_IgnoreMissingLocals flag.

llvm-svn: 265768
2016-04-08 03:13:22 +00:00
Duncan P. N. Exon Smith 805873148a Revert "ValueMapper: Treat LocalAsMetadata more like function-local Values"
This reverts commit r265759, since even this limited version breaks some
bots:
  http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/3311
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/17696

This also reverts r265761 "ValueMapper: Unduplicate
RF_NoModuleLevelChanges check, NFC", since I had trouble separating it
from r265759.

llvm-svn: 265765
2016-04-08 00:56:21 +00:00
Sanjoy Das 5ce3272833 Don't IPO over functions that can be de-refined
Summary:
Fixes PR26774.

If you're aware of the issue, feel free to skip the "Motivation"
section and jump directly to "This patch".

Motivation:

I define "refinement" as discarding behaviors from a program that the
optimizer has license to discard.  So transforming:

```
void f(unsigned x) {
  unsigned t = 5 / x;
  (void)t;
}
```

to

```
void f(unsigned x) { }
```

is refinement, since the behavior went from "if x == 0 then undefined
else nothing" to "nothing" (the optimizer has license to discard
undefined behavior).

Refinement is a fundamental aspect of many mid-level optimizations done
by LLVM.  For instance, transforming `x == (x + 1)` to `false` also
involves refinement since the expression's value went from "if x is
`undef` then { `true` or `false` } else { `false` }" to "`false`" (by
definition, the optimizer has license to fold `undef` to any non-`undef`
value).

Unfortunately, refinement implies that the optimizer cannot assume
that the implementation of a function it can see has all of the
behavior an unoptimized or a differently optimized version of the same
function can have.  This is a problem for functions with comdat
linkage, where a function can be replaced by an unoptimized or a
differently optimized version of the same source level function.

For instance, FunctionAttrs cannot assume a comdat function is
actually `readnone` even if it does not have any loads or stores in
it; since there may have been loads and stores in the "original
function" that were refined out in the currently visible variant, and
at the link step the linker may in fact choose an implementation with
a load or a store.  As an example, consider a function that does two
atomic loads from the same memory location, and writes to memory only
if the two values are not equal.  The optimizer is allowed to refine
this function by first CSE'ing the two loads, and the folding the
comparision to always report that the two values are equal.  Such a
refined variant will look like it is `readonly`.  However, the
unoptimized version of the function can still write to memory (since
the two loads //can// result in different values), and selecting the
unoptimized version at link time will retroactively invalidate
transforms we may have done under the assumption that the function
does not write to memory.

Note: this is not just a problem with atomics or with linking
differently optimized object files.  See PR26774 for more realistic
examples that involved neither.

This patch:

This change introduces a new set of linkage types, predicated as
`GlobalValue::mayBeDerefined` that returns true if the linkage type
allows a function to be replaced by a differently optimized variant at
link time.  It then changes a set of IPO passes to bail out if they see
such a function.

Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18634

llvm-svn: 265762
2016-04-08 00:48:30 +00:00
Duncan P. N. Exon Smith 0fa8aca0e6 ValueMapper: Unduplicate RF_NoModuleLevelChanges check, NFC
llvm-svn: 265761
2016-04-08 00:41:10 +00:00
Duncan P. N. Exon Smith 267185ec92 ValueMapper: Treat LocalAsMetadata more like function-local Values
This is a partial re-commit -- maybe more of a re-implementation -- of
r265631 (reverted in r265637).

This makes RF_IgnoreMissingLocals behave (almost) consistently between
the Value and the Metadata hierarchy.  In particular:

  - MapValue returns nullptr or "metadata !{}" for missing locals in
    MetadataAsValue/LocalAsMetadata bridging paris, depending on
    the RF_IgnoreMissingLocals flag.

  - MapValue doesn't memoize LocalAsMetadata-related results.

  - MapMetadata no longer deals with LocalAsMetadata or
    RF_IgnoreMissingLocals at all.  (This wasn't in r265631 at all, but
    I realized during testing it would make the patch simpler with no
    loss of generality.)

r265631 went too far, making both functions universally ignore
RF_IgnoreMissingLocals.  This broke building (e.g.) compiler-rt.
Reassociate (and possibly other passes) don't currently maintain
dominates-use invariants for metadata operands, resulting in IR like
this:

    define void @foo(i32 %arg) {
      call void @llvm.some.intrinsic(metadata i32 %x)
      %x = add i32 1, i32 %arg
    }

If the inliner chooses to inline @foo into another function, then
RemapInstruction will call `MapValue(metadata i32 %x)` and assert that
the return is not nullptr.

I've filed PR27273 to add a Verifier check and fix the underlying
problem in the optimization passes.

As a workaround, return `!{}` instead of nullptr for unmapped
LocalAsMetadata when RF_IgnoreMissingLocals is unset.  Otherwise, match
the behaviour of r265631.

Original commit message:

    ValueMapper: Make LocalAsMetadata match function-local Values

    Start treating LocalAsMetadata similarly to function-local members of
    the Value hierarchy in MapValue and MapMetadata.

      - Don't memoize them.
      - Return nullptr if they are missing.

    This also cleans up ConstantAsMetadata to stop listening to the
    RF_IgnoreMissingLocals flag.

llvm-svn: 265759
2016-04-08 00:33:44 +00:00
Duncan P. N. Exon Smith 45601e867d Revert "ValueMapper: Make LocalAsMetadata match function-local Values"
This reverts commit r265631, since it caused bot failures:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/3256
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/7272

Looks like something is depending on the old behaviour.  I'll try to
track it down and recommit.

llvm-svn: 265637
2016-04-07 02:10:50 +00:00
Duncan P. N. Exon Smith fdccad925c ValueMapper: Allow RF_IgnoreMissingLocals and RF_NullMapMissingGlobalValues
Remove the assertion that disallowed the combination, since
RF_IgnoreMissingLocals should have no effect on globals.  As it happens,
RF_NullMapMissingGlobalValues asserted in MapValue(Constant*,...), so I
also changed a cast to a cast_or_null to get my test passing.

llvm-svn: 265633
2016-04-07 01:22:45 +00:00
Duncan P. N. Exon Smith c1e4070708 ValueMapper: Make LocalAsMetadata match function-local Values
Start treating LocalAsMetadata similarly to function-local members of
the Value hierarchy in MapValue and MapMetadata.

  - Don't memoize them.
  - Return nullptr if they are missing.

This also cleans up ConstantAsMetadata to stop listening to the
RF_IgnoreMissingLocals flag.

llvm-svn: 265631
2016-04-07 01:08:39 +00:00
Duncan P. N. Exon Smith da68cbc4ad IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC
Clarify what this RemapFlag actually means.

  - Change the flag name to match its intended behaviour.
  - Clearly document that it's not supposed to affect globals.
  - Add a host of FIXMEs to indicate how to fix the behaviour to match
    the intent of the flag.

RF_IgnoreMissingLocals should only affect the behaviour of
RemapInstruction for function-local operands; namely, for operands of
type Argument, Instruction, and BasicBlock.  Currently, it is *only*
passed into RemapInstruction calls (and the transitive MapValue calls
that it makes).

When I split Metadata from Value I didn't understand the flag, and I
used it in a bunch of places for "global" metadata.

This commit doesn't have any functionality change, but prepares to
cleanup MapMetadata and MapValue.

llvm-svn: 265628
2016-04-07 00:26:43 +00:00
Michael Zolotukhin 56ad4048ae Follow-up for r265605: don't mutate vector we're iterating.
llvm-svn: 265625
2016-04-07 00:09:42 +00:00
Michael Zolotukhin 97567e141e [LoopUnroll] Fix the way we update DT after complete unrolling.
Updating dominators for exit-blocks of the unrolled loops is not enough,
as shown in PR27157. The proper way is to update dominators for all
dominance-children of original loop blocks.

llvm-svn: 265605
2016-04-06 21:47:12 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Duncan P. N. Exon Smith 6f2e37429a ValueMapper: Fix delayed blockaddress handling after r265273
r265273 added Mapper::mapBlockAddress, which delays mapping a
blockaddress value until the function has a body.  The condition was
backwards, and should be checking Function::empty instead of
GlobalValue::isDeclaration.

llvm-svn: 265508
2016-04-06 02:25:12 +00:00
Duncan P. N. Exon Smith 818e5f38d2 Try harder to appease MSVC after r265456
r265465 wasn't good enough.  I need to spell out all the moves.

llvm-svn: 265470
2016-04-05 21:25:33 +00:00
Duncan P. N. Exon Smith 1de3c7e790 IR: Introduce ConstantAggregate, NFC
Add a common parent class for ConstantArray, ConstantVector, and
ConstantStruct called ConstantAggregate.  These are the aggregate
subclasses of Constant that take operands.

This is mainly a cleanup, adding common `isa` target and removing
duplicated code.  However, it also simplifies caching which constants
point transitively at `GlobalValue` (a possible future direction).

llvm-svn: 265466
2016-04-05 21:10:45 +00:00
Duncan P. N. Exon Smith f880d35b80 Try to appease MSVC after r265456
I can't remember if adding `= default` will make MSVC happy, or if I
have to spell this out.  Let's try the cleaner version first.

llvm-svn: 265465
2016-04-05 21:07:01 +00:00
Duncan P. N. Exon Smith ea7df770ae ValueMapper: Rewrite Mapper::mapMetadata without recursion
This commit completely rewrites Mapper::mapMetadata (the implementation
of llvm::MapMetadata) using an iterative algorithm.  The guts of the new
algorithm are in MDNodeMapper::map, the entry function in a new class.

Previously, Mapper::mapMetadata performed a recursive exploration of the
graph with eager "just in case there's a reason" malloc traffic.

The new algorithm has these benefits:

  - New nodes and temporaries are not created eagerly.
  - Uniquing cycles are not duplicated (see new unit test).
  - No recursion.

Given a node to map, it does this:

 1. Use a worklist to perform a post-order traversal of the transitively
    referenced unmapped nodes.

 2. Track which nodes will change operands, and which will have new
    addresses in the mapped scheme.  Propagate the changes through the
    POT until fixed point, to pick up uniquing cycles that need to
    change.

 3. Map all the distinct nodes without touching their operands.  If
    RF_MoveDistinctMetadata, they get mapped to themselves; otherwise,
    they get mapped to clones.

 4. Map the uniqued nodes (bottom-up), lazily creating temporaries for
    forward references as needed.

 5. Remap the operands of the distinct nodes.

Mehdi helped me out by profiling this with -flto=thin.  On his workload
(importing/etc. for opt.cpp), MapMetadata sped up by 15%, contributed
about 50% less to persistent memory, and made about 100x fewer calls to
malloc.  The speedup is less than I'd hoped.  The profile mainly blames
DenseMap lookups; perhaps there's a way to reduce them (e.g., by
disallowing remapping of MDString).

It would be nice to break the strange remaining recursion on the Value
side: MapValue => materializeInitFor => RemapInstruction => MapValue.  I
think we could do this by having materializeInitFor return a worklist of
things to be remapped.

llvm-svn: 265456
2016-04-05 20:23:21 +00:00
David L Kreitzer 188de5ae69 Adds the ability to use an epilog remainder loop during loop unrolling and makes
this the default behavior.

Patch by Evgeny Stupachenko (evstupac@gmail.com).

Differential Revision: http://reviews.llvm.org/D18158

llvm-svn: 265388
2016-04-05 12:19:35 +00:00
Duncan P. N. Exon Smith 8e65f8ddfd ValueMapper: Remove old FIXMEs; almost NFC
Remove a few old FIXMEs from the original commit of the Metadata/Value
split in r223802.  These are commented out assertions to the effect that
calls between mapValue and mapMetadata never return nullptr.

(The only behaviour change is that Mapper::mapSimpleMetadata memoizes
the nullptr return.)

When I originally rewrote the mapping code, I thought we could be
stricter in the new metadata hierarchy and never return nullptr when
RF_NullMapMissingGlobalValues was off.  It's still not entirely clear to
me why these assertions failed (a few months ago, I had a theory that I
forgot to write down, but that's helping no one).

Understood or not, I no longer see how these commented-out assertions
would be useful.  I'm relegating them to the annals of source control
before making significant changes to ValueMapper.cpp.

llvm-svn: 265282
2016-04-04 04:59:56 +00:00
Duncan P. N. Exon Smith 756e1c3db4 ValueMapper: Disallow metadata mapping recursion through mapValue
This adds an assertion to maintain the property from r265273.  When
Mapper::mapSimpleMetadata calls Mapper::mapValue, it should not find its
way back to mapMetadataImpl.  This guarantees that mapSimpleMetadata is
not involved in any recursion.

Since Mapper::mapValue calls out to arbitrary materializers, we need to
save a bit on the ValueMap to make this assertion effective.

There should be no functionality change here.  This co-recursion should
already have been impossible.

llvm-svn: 265276
2016-04-03 20:54:51 +00:00
Duncan P. N. Exon Smith a997856b3d Work around MSVC failure from r265273
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19726

llvm-svn: 265275
2016-04-03 20:42:21 +00:00
Duncan P. N. Exon Smith c6065e3a25 ValueMapper: Avoid recursion in mapSimplifiedMetadata, NFC
The main change is to delay materializing GlobalValue initializers from
Mapper::mapValue until Mapper::~Mapper.  This effectively removes all
recursion from mapSimplifiedMetadata, as promised in r265270.
mapSimplifiedMetadata calls mapValue for ConstantAsMetadata nodes to
find the mapped constant, and now it shouldn't be possible for mapValue
to indirectly re-invoke mapMetadata.  I'll add an assertion to that
effect in a follow-up (separated so that the assertion can easily be
reverted independently, if it comes to that).

This a step toward a broader goal: converting Mapper::mapMetadataImpl
from a recursive to an iterative algorithm.

When a BlockAddress points at a BasicBlock inside an unmaterialized
function body, we need to delay it until the function body is
materialized in Mapper::~Mapper.  This commit creates a temporary
BasicBlock and returns a new BlockAddress, then RAUWs the BasicBlock
once it is known.  This situation should be extremely rare since a
BlockAddress is usually used from within the function it's referencing
(and BlockAddress itself is rare).

There should be no observable functionality change.

llvm-svn: 265273
2016-04-03 20:17:45 +00:00
Duncan P. N. Exon Smith ae8bd4bd11 ValueMapper: Split out mapSimpleMetadata, NFC
Split out a helper for mapping metadata without operands.  This is any
metadata that is not an MDNode, and any MDNode where the answer is known
without looking at operands.

Through some weird twists, this function is co-recursive:

    mapSimpleMetadata
    => MapValue
    => materializeInitFor
    => linkFunctionBody
    => RemapInstructions
    => MapMetadata
    => mapSimpleMetadata

I plan to break the recursion in a follow-up.

llvm-svn: 265270
2016-04-03 19:31:01 +00:00
Duncan P. N. Exon Smith 829dc87a68 ValueMapper: Introduce Mapper helper class, NFC
Remove a bunch of boilerplate from ValueMapper.cpp by using a new
file-local class called Mapper.

llvm-svn: 265268
2016-04-03 19:06:24 +00:00
Davide Italiano d4f5a059e0 [SimplifyLibCalls] Garbage collect dead code.
We already skip optimizations if the return value
of printf() is used, so CI->use_empty() is always
true.

Differential Revision:  http://reviews.llvm.org/D18656

llvm-svn: 265253
2016-04-03 01:46:52 +00:00
Duncan P. N. Exon Smith 4b520e5ef6 Linker: Remove IRMover::isMetadataUnneeded indirection; almost NFC
Instead of checking live during MapMetadata whether a subprogram is
needed, seed the ValueMap with `nullptr` up-front.

There is a small hypothetical functionality change.  Previously, calling
MapMetadataOp on a node whose "scope:" chain led to an unneeded
subprogram would return nullptr.  However, if that were ever called,
then the subprogram would be needed; a situation that the IRMover is
supposed to avoid a priori!

Besides cleaning up the code a little, this restores a nice property:
MapMetadataOp returns the same as MapMetadata.

llvm-svn: 265229
2016-04-02 17:12:00 +00:00
Duncan P. N. Exon Smith da4a56d1ab ValueMapper: Add support for seeding metadata with nullptr
Support seeding a ValueMap with nullptr for Metadata entries, a
situation I didn't consider in the Metadata/Value split.

I added a ValueMapper::getMappedMD accessor that returns an
Optional<Metadata*> with the mapped (possibly null) metadata.  IRMover
needs to use this to avoid modifying the map when it's checking for
unneeded subprograms.  I updated a call from bugpoint since I find the
new code clearer.

llvm-svn: 265228
2016-04-02 17:04:38 +00:00
Mehdi Amini 89038a1071 Fix "warning: variabl 'XX’ set but not used" in release build (variable used in assertion, NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265220
2016-04-02 05:34:19 +00:00
Sanjoy Das f83ab6de56 Don't insert stackrestore on deoptimizing returns
They're not necessary (since the stack pointer is trivially restored on
return), and the way LLVM inserts the stackrestore calls breaks the
IR (we get a stackrestore between the deoptimize call and the return).

llvm-svn: 265101
2016-04-01 02:51:30 +00:00
Sanjoy Das 18b92968ea Don't insert lifetime end markers on deoptimizing returns
They're not necessary (since the lifetime of the alloca is trivially
over due to the return), and the way LLVM inserts the lifetime.end
markers breaks the IR (we get a lifetime end marker between the
deoptimize call and the return).

llvm-svn: 265100
2016-04-01 02:51:26 +00:00
Evgeniy Stepanov f74f091ea6 Preserve blockaddress use edges in the module splitter.
"blockaddress" can not apply to an external function. All
blockaddress constant uses must belong to the same module as the
definition of the target function.

llvm-svn: 265061
2016-03-31 21:55:11 +00:00
Evgeniy Stepanov a614ab7b71 Preserve extern_weak linkage in CloneModule.
Only force "extern" linkage if the function used to be a definition
in the source module. Declarations keep their original linkage.

llvm-svn: 265043
2016-03-31 20:21:31 +00:00
Sanjoy Das 021de058df Introduce a @llvm.experimental.guard intrinsic
Summary:
As discussed on llvm-dev[1].

This change adds the basic boilerplate code around having this intrinsic
in LLVM:

 - Changes in Intrinsics.td, and the IR Verifier
 - A lowering pass to lower @llvm.experimental.guard to normal
   control flow
 - Inliner support

[1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html

Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18527

llvm-svn: 264976
2016-03-31 00:18:46 +00:00
Peter Collingbourne 2bc252acd5 Cloning: Reduce complexity of debug info cloning and fix correctness issue.
Commit r260791 contained an error in that it would introduce a cross-module
reference in the old module. It also introduced O(N^2) complexity in the
module cloner by requiring the entire module to be visited for each function.
Fix both of these problems by avoiding use of the CloneDebugInfoMetadata
function (which is only designed to do intra-module cloning) and cloning
function-attached metadata in the same way that we clone all other metadata.

Differential Revision: http://reviews.llvm.org/D18583

llvm-svn: 264935
2016-03-30 22:05:13 +00:00
Nirav Dave 8dd66e5753 Remove HasFnAttribute guards to getFnAttribute calls
These checks are redundant and can be removed

Reviewers: hans

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D18564

llvm-svn: 264872
2016-03-30 15:41:12 +00:00
George Burgess IV 49cad7d70b [MemorySSA] Make the visitor more careful with calls.
Prior to this patch, the MemorySSA caching visitor would cache all
calls that it visited. When paired with phi optimization, this can be
problematic. Consider:

define void @foo() {
  ; 1 = MemoryDef(liveOnEntry)
  call void @clobberFunction()
  br i1 undef, label %if.end, label %if.then

if.then:
  ; MemoryUse(??)
  call void @readOnlyFunction()
  ; 2 = MemoryDef(1)
  call void @clobberFunction()
  br label %if.end

if.end:
  ; 3 = MemoryPhi(...)
  ; MemoryUse(?)
  call void @readOnlyFunction()
  ret void
}

When optimizing MemoryUse(?), we visit defs 1 and 2, so we note to
cache them later. We ultimately end up not being able to optimize
passed the Phi, so we set MemoryUse(?) to point to the Phi. We then
cache the clobbering call for def 1 to be the Phi.

This commit changes this behavior so that we wipe out any calls
added to VisistedCalls while visiting the defs of a phi we couldn't
optimize.

Aside: With this patch, we now can bootstrap clang/LLVM without a
single MemorySSA verifier failure. Woohoo. :)

llvm-svn: 264820
2016-03-30 03:12:08 +00:00
George Burgess IV 82ee942a8c [MemorySSA] Change how the walker views/walks visited phis.
This patch teaches the caching MemorySSA walker a few things:

1. Not to walk Phis we've walked before. It seems that we tried to do
   this before, but it didn't work so well in cases like:

define void @foo() {
  %1 = alloca i8
  %2 = alloca i8
  br label %begin

begin:
  ; 3 = MemoryPhi({%0,liveOnEntry},{%end,2})
  ; 1 = MemoryDef(3)
  store i8 0, i8* %2
  br label %end

end:
  ; MemoryUse(?)
  load i8, i8* %1
  ; 2 = MemoryDef(1)
  store i8 0, i8* %2
  br label %begin
}

Because we wouldn't put Phis in Q.Visited until we tried to visit them.
So, when trying to optimize MemoryUse(?):
  - We would visit 3 above
    - ...Which would make us put {%0,liveOnEntry} in Q.Visited
    - ...Which would make us visit {%0,liveOnEntry}
    - ...Which would make us put {%end,2} in Q.Visited
    - ...Which would make us visit {%end,2}
      - ...Which would make us visit 3
        - ...Which would realize we've already visited everything in 3
        - ...Which would make us conservatively return 3.

In the added test-case, (@looped_visitedonlyonce) this behavior would
cause us to give incorrect results. Specifically, we'd visit 4 twice
in the same query, but on the second visit, we'd skip while.cond because
it had been visited, visit if.then/if.then2, and cache "1" as the
clobbering def on the way back.

2. If we try to walk the defs of a {Phi,MemLoc} and see it has been
   visited before, just hand back the Phi we're trying to optimize.

I promise this isn't as terrible as it seems. :)

We now insert {Phi,MemLoc} pairs just before walking the Phi's upward
defs. So, we check the cache for the {Phi,MemLoc} pair before checking
if we've already walked the Phi.

The {Phi,MemLoc} pair is (almost?) always guaranteed to have a cache
entry if we've already fully walked it, because we cache as we go.

So, if the {Phi,MemLoc} pair isn't in cache, either:
 (a) we must be in the process of visiting it (in which case, we can't
     give a better answer in a cache-as-we-go DFS walker)

 (b) we visited it, but didn't cache it on the way back (...which seems
     to require `ModifyingAccess` to not dominate `StartingAccess`,
     so I'm 99% sure that would be an error. If it's not an error, I
     haven't been able to get it to happen locally, so I suspect it's
     rare.)

- - - - -

As a consequence of this change, we no longer skip upward defs of phis,
so we can kill the `VisitedOnlyOne` check. This gives us better accuracy
than we had before, at the cost of potentially doing a bit more work
when we have a loop.

llvm-svn: 264814
2016-03-30 00:26:26 +00:00
Teresa Johnson b703c77b03 [ThinLTO] Remove post-pass metadata linking support
Since we have moved to a model where functions are imported in bulk from
each source module after making summary-based importing decisions, there
is no longer a need to link metadata as a postpass, and all users have
been removed.

This essentially reverts r255909 and follow-on fixes.

llvm-svn: 264763
2016-03-29 18:24:19 +00:00
Hyojin Sung 4673f10568 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
    
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.

llvm-svn: 264697
2016-03-29 04:08:57 +00:00
Evgeniy Stepanov f575b2687c Remove personality for declarations in CloneModule.
Personality is copied as part of copyFunctionAttributes, but it is
invalid on a declaration. Remove the personality attribute it the
function body is not cloned.

Also add a verifier run over output modules in the llvm-split tool.

llvm-svn: 264667
2016-03-28 21:37:02 +00:00
Reid Kleckner ba85781f58 Revert "[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops"
This reverts commit r264596.

It does not compile.

llvm-svn: 264604
2016-03-28 18:07:40 +00:00
Hyojin Sung 0ada5b0d14 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being 
applied (e.g., transforming nested loops into simplified forms and loop vectorization).

The patch augments the existing PHI node-based check by adding a pre-test if the BB actually 
belongs to a set of loop headers and not eliminating it if yes. 

llvm-svn: 264596
2016-03-28 17:22:25 +00:00
Davide Italiano 6db1dcbf6b [SimplifyLibCalls] Transform printf("%s", "a") -> putchar('a').
llvm-svn: 264588
2016-03-28 15:54:01 +00:00
Sanjay Patel 796db35f62 [SimplifyCFG] propagate branch metadata when creating select (PR26636)
llvm-svn: 264527
2016-03-26 23:30:50 +00:00
Sanjoy Das d4c783335b [RS4GC] Lower calls to @llvm.experimental.deoptimize
This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize``
to gc.statepoints wrapping ``__llvm_deoptimize``, and changes
``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize``
as a non GC leaf function.

I've had to hard code the ``"__llvm_deoptimize"`` name in
RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only
during codegen.  This isn't without precedent in the codebase, so I'm
not overtly concerned.

llvm-svn: 264456
2016-03-25 20:12:13 +00:00
David L Kreitzer 8d441eb936 Enable non-power-of-2 #pragma unroll counts.
Patch by Evgeny Stupachenko.

Differential Revision: http://reviews.llvm.org/D18202

llvm-svn: 264407
2016-03-25 14:24:52 +00:00
George Burgess IV 0e4898685f Fix bugs in the MemorySSA walker.
There are a few bugs in the walker that this patch addresses.
Primarily:
- Caching can break when we have multiple BBs without phis
- We weren't optimizing some phis properly
- Because of how the DFS iterator works, there were times where we
  wouldn't cache any results of our DFS

I left the test cases with FIXMEs in, because I'm not sure how much
effort it will take to get those to work (read: We'll probably
ultimately have to end up redoing the walker, or we'll have to come up
with some creative caching tricks), and more test coverage = better.

Differential Revision: http://reviews.llvm.org/D18065

llvm-svn: 264180
2016-03-23 18:31:55 +00:00
Davide Italiano 1a911e204d [ModuleUtils] Use range-based loop. NFC.
llvm-svn: 264122
2016-03-23 00:43:35 +00:00
Adam Nemet 8b47e0d0ea [LoopVersioning] Relax an assert for LCSSA PHIs
When you have multiple LCSSA (single-operand) PHIs that are converted
into two-operand PHIs due to versioning, only assert that the PHI
currently being converted has a single operand.  I.e. we don't want to
check PHIs that were converted earlier in the loop.

Fixes PR27023.

Thanks to Karl-Johan Karlsson for the minimized testcase!

llvm-svn: 264081
2016-03-22 18:38:15 +00:00
George Burgess IV 3887a41725 [MemorySSA] Consider def-only BBs for live-in calculations.
If we have a BB with only MemoryDefs, live-in calculations will ignore
it. This means we get results like this:

define void @foo(i8* %p) {
  ; 1 = MemoryDef(liveOnEntry)
  store i8 0, i8* %p
  br i1 undef, label %if.then, label %if.end

if.then:
  ; 2 = MemoryDef(1)
  store i8 1, i8* %p
  br label %if.end

if.end:
  ; 3 = MemoryDef(1)
  store i8 2, i8* %p
  ret void
}

...When there should be a MemoryPhi in the `if.end` BB.

This patch fixes that behavior.

llvm-svn: 263991
2016-03-21 21:25:39 +00:00
David Majnemer abae6b588b [SimplifyLibCalls] Only consider sinpi/cospi functions within the same function
The sinpi/cospi can be replaced with sincospi to remove unnecessary
computations.  However, we need to make sure that the calls are within
the same function!

This fixes PR26993.

llvm-svn: 263875
2016-03-19 04:53:02 +00:00
Mehdi Amini 8d05185a26 Rework linkInModule(), making it oblivious to ThinLTO
Summary:
ThinLTO is relying on linkInModule to import selected function.
However a lot of "magic" was hidden in linkInModule and the IRMover,
who would rename and promote global variables on the fly.

This is moving to an approach where the steps are decoupled and the
client is reponsible to specify the list of globals to import.
As a consequence some test are changed because they were relying on
the previous behavior which was importing the definition of *every*
single global without control on the client side.
Now the burden is on the client to decide if a global has to be imported
or not.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18122

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263863
2016-03-19 00:40:31 +00:00
Sanjoy Das 74af78e3b0 [IndVars] Make the fix for PR26973 more obvious; NFCI
llvm-svn: 263828
2016-03-18 20:37:11 +00:00
Sanjoy Das 60fb899f28 [IndVars] Pass the right loop to isLoopInvariantPredicate
The loop on IVOperand's incoming values assumes IVOperand to be an
induction variable on the loop over which `S Pred X` is invariant;
otherwise loop invariant incoming values to IVOperand are not guaranteed
to dominate the comparision.

This fixes PR26973.

llvm-svn: 263827
2016-03-18 20:37:07 +00:00
Adam Nemet b0c4eae073 [LoopVectorize] Annotate versioned loop with noalias metadata
Summary:
Use the new LoopVersioning facility (D16712) to add noalias metadata in
the vector loop if we versioned with memchecks.  This can enable some
optimization opportunities further down the pipeline (see the included
test or the benchmark improvement quoted in D16712).

The test also covers the bug I had in the initial version in D16712.

The vectorizer did not previously use LoopVersioning.  The reason is
that the vectorizer performs its transformations in single shot.  It
creates an empty single-block vector loop that it then populates with
the widened, if-converted instructions.  Thus creating an intermediate
versioned scalar loop seems wasteful.

So this patch (rather than bringing in LoopVersioning fully) adds a
special interface to LoopVersioning to allow the vectorizer to add
no-alias annotation while still performing its own versioning.

As the vectorizer propagates metadata from the instructions in the
original loop to the vector instructions we also check the pointer in
the original instruction and see if LoopVersioning can add no-alias
metadata based on the issued memchecks.

Reviewers: hfinkel, nadav, mzolotukhin

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17191

llvm-svn: 263744
2016-03-17 20:32:37 +00:00
Adam Nemet 5eccf07df3 [LoopVersioning] Annotate versioned loop with noalias metadata
Summary:
If we decide to version a loop to benefit a transformation, it makes
sense to record the now non-aliasing accesses in the newly versioned
loop.  This allows non-aliasing information to be used by subsequent
passes.

One example is 456.hmmer in SPECint2006 where after loop distribution,
we vectorize one of the newly distributed loops.  To vectorize we
version this loop to fully disambiguate may-aliasing accesses.  If we
add the noalias markers, we can use the same information in a later DSE
pass to eliminate some dead stores which amounts to ~25% of the
instructions of this hot memory-pipeline-bound loop.  The overall
performance improves by 18% on our ARM64.

The scoped noalias annotation is added in LoopVersioning.  The patch
then enables this for loop distribution.  A follow-on patch will enable
it for the vectorizer.  Eventually this should be run by default when
versioning the loop but first I'd like to get some feedback whether my
understanding and application of scoped noalias metadata is correct.

Essentially my approach was to have a separate alias domain for each
versioning of the loop.  For example, if we first version in loop
distribution and then in vectorization of the distributed loops, we have
a different set of memchecks for each versioning.  By keeping the scopes
in different domains they can conveniently be defined independently
since different alias domains don't affect each other.

As written, I also have a separate domain for each loop.  This is not
necessary and we could save some metadata here by using the same domain
across the different loops.  I don't think it's a big deal either way.

Probably the best is to review the tests first to see if I mapped this
problem correctly to scoped noalias markers.  I have plenty of comments
in the tests.

Note that the interface is prepared for the vectorizer which needs the
annotateInstWithNoAlias API.  The vectorizer does not use LoopVersioning
so we need a way to pass in the versioned instructions.  This is also
why the maps have to become part of the object state.

Also currently, we only have an AA-aware DSE after the vectorizer if we
also run the LTO pipeline.  Depending how widely this triggers we may
want to schedule a DSE toward the end of the regular pass pipeline.

Reviewers: hfinkel, nadav, ashutosh.nema

Subscribers: mssimpso, aemerson, llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16712

llvm-svn: 263743
2016-03-17 20:32:32 +00:00
Sanjay Patel 9e23fedaf0 propagate 'unpredictable' metadata on select instructions
This is similar to D18133 where we allowed profile weights on select instructions. 
This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects.

A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at:
http://reviews.llvm.org/rL263648

Differential Revision: http://reviews.llvm.org/D18220

llvm-svn: 263716
2016-03-17 15:30:52 +00:00
Adam Nemet fdb20595a1 [LV] Preserve LoopInfo when store predication is used
This was a latent bug that got exposed by the change to add LoopSimplify
as a dependence to LoopLoadElimination.  Since LoopInfo was corrupted
after LV, LoopSimplify mis-compiled nbench in the test-suite (more
details in the PR).

The problem was that when we create the blocks for predicated stores we
didn't add those to any loops.

The original testcase for store predication provides coverage for this
assuming we verify LI on the way out of LV.

Fixes PR26952.

llvm-svn: 263565
2016-03-15 18:06:20 +00:00
Eric Christopher 257338ff0f Use some braces to format this a little better.
llvm-svn: 263527
2016-03-15 03:01:31 +00:00
Eric Christopher ee00abe5e6 Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest
parentheses around '&&' within '||' [-Werror=parentheses].

llvm-svn: 263525
2016-03-15 02:19:06 +00:00
Teresa Johnson 26ab5772b0 [ThinLTO] Renaming of function index to module summary index (NFC)
(Resubmitting after fixing missing file issue)

With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263513
2016-03-15 00:04:37 +00:00
Justin Lebar 6827de19b2 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

llvm-svn: 263509
2016-03-14 23:15:34 +00:00
Teresa Johnson cec0cae313 Revert "[ThinLTO] Renaming of function index to module summary index (NFC)"
This reverts commit r263490. Missed a file.

llvm-svn: 263493
2016-03-14 21:18:10 +00:00
Teresa Johnson 892920b358 [ThinLTO] Renaming of function index to module summary index (NFC)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263490
2016-03-14 21:05:56 +00:00
Sanjay Patel ee52b6e77d allow branch weight metadata on select instructions (PR26636)
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636

This doesn't accomplish anything on its own. It's the first step towards preserving 
and using branch weights with selects.

The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to 
determine how to correctly propagate the weights.

Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.

The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.

Differential Revision: http://reviews.llvm.org/D18133

llvm-svn: 263482
2016-03-14 20:18:59 +00:00
Mehdi Amini ba9fba81d6 Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Sanjay Patel 5658b58936 remove unnecessary cast; NFC
llvm-svn: 263343
2016-03-12 18:17:41 +00:00
Sanjay Patel 2e0027706a fix formatting; NFC
llvm-svn: 263342
2016-03-12 18:05:53 +00:00
Sanjay Patel 5781d840dd use range loops; NFCI
llvm-svn: 263341
2016-03-12 16:52:17 +00:00
Eric Christopher 35abd051c0 Temporarily revert:
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date:   Fri Mar 11 17:15:50 2016 +0000

    Remove PreserveNames template parameter from IRBuilder

    Summary:
    Following r263086, we are now relying on a flag on the Context to
    discard Value names in release builds.

    Reviewers: chandlerc

    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: http://reviews.llvm.org/D18023

    From: Mehdi Amini <mehdi.amini@apple.com>

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
    91177308-0d34-0410-b5e6-96231b3b80d8

until we can figure out what to do about clang and Release build testing.

This reverts commit 263258.

llvm-svn: 263321
2016-03-12 01:47:22 +00:00
George Burgess IV b42b762bca [MemorySSA] Make a return type reflect reality. NFC.
llvm-svn: 263286
2016-03-11 19:34:03 +00:00
Sanjoy Das b51325dbdb Introduce @llvm.experimental.deoptimize
Summary:
This intrinsic, together with deoptimization operand bundles, allow
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.

In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality.  In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.

Note: this change does not address how `@llvm.experimental_deoptimize`
is lowered.  That will be done in a later change.

Reviewers: chandlerc, rnk, atrick, reames

Subscribers: llvm-commits, kmod, mjacob, maksfb, mcrosier, JosephTremoulet

Differential Revision: http://reviews.llvm.org/D17732

llvm-svn: 263281
2016-03-11 19:08:34 +00:00
Mehdi Amini 99eab3dd06 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
Pete Cooper adebb9379a Remove llvm::getDISubprogram in favor of Function::getSubprogram
llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself.

Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.

Ideally this should be NFC, but in reality its possible that a function:

has no !dbg (in which case there's likely a bug somewhere in an opt pass), or
that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will

Reviewed by Duncan Exon Smith.

Differential Revision: http://reviews.llvm.org/D18074

llvm-svn: 263184
2016-03-11 02:14:16 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Mehdi Amini bd04e8fed6 FunctionIndex is not optional for renameModuleForThinLTO(), make it a reference (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 262976
2016-03-09 01:37:14 +00:00
Sanjay Patel eaf06851d0 rangify, fix function names; NFCI
llvm-svn: 262940
2016-03-08 17:12:32 +00:00
Sanjay Patel 5b8d741632 don't repeat function names in documentation comments; NFC
llvm-svn: 262937
2016-03-08 16:26:39 +00:00
Easwaran Raman b1bd398ceb Revert revisions 262636, 262643, 262679, and 262682.
llvm-svn: 262883
2016-03-08 00:36:35 +00:00
Easwaran Raman 3b7a8246c9 Fix a use-after-free bug introduced in r262636
llvm-svn: 262679
2016-03-04 00:44:01 +00:00
Easwaran Raman 3035719c86 Infrastructure for PGO enhancements in inliner
This patch provides the following infrastructure for PGO enhancements in inliner:

Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.

Differential Revision: http://reviews.llvm.org/D16381

llvm-svn: 262636
2016-03-03 18:26:33 +00:00
Matthew Simpson b840a6d6f4 [LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

llvm-svn: 262624
2016-03-03 16:12:01 +00:00
Daniel Berlin 6412002d24 Really fix ASAN leak/etc issues with MemorySSA unittests
llvm-svn: 262519
2016-03-02 21:16:28 +00:00
Daniel Berlin 989e601b26 Revert "Fix ASAN detected errors in code and test" (it was not meant to be committed yet)
This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95.

llvm-svn: 262512
2016-03-02 20:36:22 +00:00
Daniel Berlin 27ed1c2eb0 Fix ASAN detected errors in code and test
llvm-svn: 262511
2016-03-02 20:27:29 +00:00
George Burgess IV e0e6e48b29 Attempt to fix ASAN failure in a MemorySSA test.
llvm-svn: 262452
2016-03-02 02:35:04 +00:00
Daniel Berlin 83fc77b4c0 Add the beginnings of an update API for preserving MemorySSA
Summary:
This adds the beginning of an update API to preserve MemorySSA.  In particular,
this patch adds a way to remove memory SSA accesses when instructions are
deleted.

It also adds relevant unit testing infrastructure for MemorySSA's API.

(There is an actual user of this API, i will make that diff dependent on this one.  In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P)

Reviewers: hfinkel, reames, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17157

llvm-svn: 262362
2016-03-01 18:46:54 +00:00
Dehao Chen 939993ff2f Move discriminator assignment to the right place.
Summary: Now discriminator is assigned per-function instead of per-module.

Reviewers: davidxl, dnovillo

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D17664

llvm-svn: 262240
2016-02-29 18:59:48 +00:00
David Majnemer ee0cbbbe69 [SimplifyCFG] Use a more elegant solution than r261731
The cleanupret instruction has an invariant that it's 'from' operand be
a cleanuppad.  This invariant was violated when we removed a dead block
which removed a cleanuppad leaving behind a cleanupret with an undef
'from' operand.

This was solved in r261731 by staving off the removal of the dead block
to a later pass.

However, it occured to me that we do not need to do this.
Instead, we can simply avoid processing the cleanupret if it has an
undef 'from' operand because we know that it will be removed soon.

llvm-svn: 261754
2016-02-24 17:30:48 +00:00
David Majnemer ec72e37220 [SimplifyCFG] Do not blindly remove unreachable blocks
DeleteDeadBlock was called indiscriminately, leading to cleanuprets with
undef cleanuppad references.

Instead, try to drain the BB of most of it's instructions if it is
unreachable.  We can then remove the BB if it solely consists of a
terminator (and maybe some phis).

llvm-svn: 261731
2016-02-24 10:02:16 +00:00
David Majnemer 223538f764 [WinEH] Don't inline an 'unwinds to caller' cleanupret into funclets which locally unwind
It is problematic if the inlinee has a cleanupret which unwinds to
caller and we inline it into a call site which doesn't unwind.

If the funclet unwinds anywhere other than to the caller,
then we will give the funclet two unwind destinations.
This will result in a verifier failure.

Seeing as how the caller wasn't an invoke (which would locally unwind)
and that the funclet cannot unwind to caller, we must conclude that an
'unwind to caller' cleanupret is dynamically unreachable.

This fixes PR26698.

Differential Revision: http://reviews.llvm.org/D17536

llvm-svn: 261656
2016-02-23 17:11:04 +00:00
Michael Zolotukhin 792a885537 Follow up for r261597: Add the * to the auto.
llvm-svn: 261600
2016-02-23 00:57:48 +00:00
Michael Zolotukhin 4fdf974e3e Follow-up for r261595: use range loop.
llvm-svn: 261597
2016-02-23 00:48:44 +00:00
Michael Zolotukhin de19ed1eb1 [LoopUnroll] Avoid unnecessary DT recomputation.
Summary:
When we completely unroll a loop, it's pretty easy to update DT in-place and
thus avoid rebuilding it. DT recalculation is one of the most time-consuming
tasks in loop-unroll, so avoiding it at least in case of full unroll should be
beneficial.

On some extreme (but still real-world) tests this patch improves compile time by
~2x.

Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc

Subscribers: joker.eph, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D17473

llvm-svn: 261595
2016-02-23 00:30:50 +00:00
Michael Zolotukhin d734bea8ba [LoopUnrolling] Fix a bug introduced in r259869 (PR26688).
The issue was that we only required LCSSA rebuilding if the immediate
parent-loop had values used outside of it. The fix is to enaable the
same logic for all outer loops, not only immediate parent.

llvm-svn: 261575
2016-02-22 21:21:45 +00:00
Benjamin Kramer 451f54cf62 Fix some abuse of auto flagged by clang's -Wrange-loop-analysis.
llvm-svn: 261524
2016-02-22 13:11:58 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Duncan P. N. Exon Smith ec6f7fed54 TransformUtils: Avoid getNodePtrUnchecked() in integer division, NFC
Stop relying on `getNodePtrUnchecked()` being useful on invalid
iterators.  This function is documented to be for internal use only, and
the pointer type will eventually have to change to remove UB from
ilist_iterator.  Instead, check the iterator before it has been
invalidated.

llvm-svn: 261497
2016-02-21 20:14:29 +00:00
Benjamin Kramer 7d537ae747 [SimplifyCFG] Use pointer identity to simplify predicate.
No functional change intended.

llvm-svn: 261427
2016-02-20 10:40:42 +00:00
David Majnemer 1efa23ddab [SimplifyCFG] Merge together cleanuppads
Cleanuppads may be merged together if one is the only predecessor of the
other in which case a simple transform can be performed: replace the
a cleanupret with a branch and remove an unnecessary cleanuppad.

Differential Revision: http://reviews.llvm.org/D17459

llvm-svn: 261390
2016-02-20 01:07:45 +00:00
Matthew Simpson 29c997c1a1 [LV] Vectorize first-order recurrences
This patch enables the vectorization of first-order recurrences. A first-order
recurrence is a non-reduction recurrence relation in which the value of the
recurrence in the current loop iteration equals a value defined in the previous
iteration. The load PRE of the GVN pass often creates these recurrences by
hoisting loads from within loops.

In this patch, we add a new recurrence kind for first-order phi nodes and
attempt to vectorize them if possible. Vectorization is performed by shuffling
the values for the current and previous iterations. The vectorization cost
estimate is updated to account for the added shuffle instruction.

Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D16197

llvm-svn: 261346
2016-02-19 17:56:08 +00:00
Chandler Carruth 31088a9d58 [LPM] Factor all of the loop analysis usage updates into a common helper
routine.

We were getting this wrong in small ways and generally being very
inconsistent about it across loop passes. Instead, let's have a common
place where we do this. One minor downside is that this will require
some analyses like SCEV in more places than they are strictly needed.
However, this seems benign as these analyses are complete no-ops, and
without this consistency we can in many cases end up with the legacy
pass manager scheduling deciding to split up a loop pass pipeline in
order to run the function analysis half-way through. It is very, very
annoying to fix these without just being very pedantic across the board.

The only loop passes I've not updated here are ones that use
AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer.
They seemed less relevant.

With this patch, almost all of the problems in PR24804 around loop pass
pipelines are fixed. The one remaining issue is that we run simplify-cfg
and instcombine in the middle of the loop pass pipeline. We've recently
added some loop variants of these passes that would seem substantially
cleaner to use, but this at least gets us much closer to the previous
state. Notably, the seven loop pass managers is down to three.

I've not updated the loop passes using LoopAccessAnalysis because that
analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't
clear that those transforms want to support those forms anyways. They
all run late anyways, so this is harmless. Similarly, LSR is left alone
because it already carefully manages its forms and doesn't need to get
fused into a single loop pass manager with a bunch of other loop passes.

LoopReroll didn't use loop simplified form previously, and I've updated
the test case to match the trivially different output.

Finally, I've also factored all the pass initialization for the passes
that use this technique as well, so that should be done regularly and
reliably.

Thanks to James for the help reviewing and thinking about this stuff,
and Ben for help thinking about it as well!

Differential Revision: http://reviews.llvm.org/D17435

llvm-svn: 261316
2016-02-19 10:45:18 +00:00
Chandler Carruth ac07270828 [AA] Preserve the AA results wrapper pass as well as BasicAA in a few
more places to prevent gratuitous re-"runs" of these passes.

The passes themselves don't do any work when run, but we keep spending
time scheduling and running these needlessly when we really don't need
to do so.

This is the first patch towards fixing the really horrible loop pass
pipeline fragmentation pointed out by Sanjoy in PR24804.

llvm-svn: 261302
2016-02-19 03:12:14 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Adrian Prantl a5b2a64980 Debug Info: Teach LdStHasDebugValue() (Local.cpp) about DIExpressions.
This function is used to check whether a dbg.value intrinsic has already
been inserted, but without comparing the DIExpression, it would erroneously
fire on split aggregates and only the first scalar would survive.

Found via http://reviews.llvm.org/D16867.
<rdar://problem/24456528>

llvm-svn: 261145
2016-02-17 20:02:25 +00:00
Junmo Park 6ebdc14cf1 [SCEVExpander] Make findExistingExpansion smarter
Summary:
Extending findExistingExpansion can use existing value in ExprValueMap.
This patch gives 0.3~0.5% performance improvements on 
benchmarks(test-suite, spec2000, spec2006, commercial benchmark)
   
Reviewers: mzolotukhin, sanjoy, zzheng

Differential Revision: http://reviews.llvm.org/D15559

llvm-svn: 260938
2016-02-16 06:46:58 +00:00
Keno Fischer 7c7c3e3591 [Cloning] Clone every Function's Debug Info
Summary:
Export the CloneDebugInfoMetadata utility, which clones all debug info
associated with a function into the first module. Also use this function
in CloneModule on each function we clone (the CloneFunction entrypoint
already does this).

Without this, cloning a module will lead to DI quality regressions,
especially since r252219 reversed the Function <-> DISubprogram edge
(before we could get lucky and have this edge preserved if the
DISubprogram itself was, e.g. due to location metadata).

This was verified to fix missing debug information in julia and
a unittest to verify the new behavior is included.

Patch by Yichao Yu! Thanks!

Reviewers: loladiro, pcc
Differential Revision: http://reviews.llvm.org/D17165

llvm-svn: 260791
2016-02-13 02:04:29 +00:00
Justin Lebar 6086c6a387 Fix typo in comment.
llvm-svn: 260731
2016-02-12 21:01:37 +00:00
Justin Lebar db63949e8d [SimplifyCFG] Don't fold conditional branches that contain calls to convergent functions.
Summary:
Performing this optimization duplicates the call to the convergent
function and adds new control-flow dependencies, which is a no-no.

Reviewers: jingyue

Subscribers: broune, hfinkel, tra, resistor, joker.eph, arsenm, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17128

llvm-svn: 260730
2016-02-12 21:01:36 +00:00
Evgeniy Stepanov ba6ca87ffb [msan] Put msan constructor in a comdat.
MSan adds a constructor to each translation unit that calls
__msan_init, and does nothing else. The idea is to run __msan_init
before any instrumented code. This results in multiple constructors
and multiple .init_array entries in the final binary, one per
translation unit. This is absolutely unnecessary; one would be
enough.

This change moves the constructors to a comdat group in order to drop
the extra ones.

llvm-svn: 260632
2016-02-12 00:37:52 +00:00
Teresa Johnson 488a800a4c [ThinLTO] Move global processing from Linker to TransformUtils (NFC)
Summary:
As discussed on IRC, move the ThinLTOGlobalProcessing code out of
the linker, and into TransformUtils. The name of the class is changed
to FunctionImportGlobalProcessing.

Reviewers: joker.eph, rafael

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D17081

llvm-svn: 260395
2016-02-10 18:11:31 +00:00
Daniel Berlin f6c9ae9c6d Rename a member variable to be more accurate with how it is used
llvm-svn: 260389
2016-02-10 17:41:25 +00:00
Daniel Berlin 932b4cbf5d Constify two functions, make them accessible to unit tests
llvm-svn: 260387
2016-02-10 17:39:43 +00:00
Sanjay Patel 4d36bbaf19 rangify; NFC
llvm-svn: 260151
2016-02-08 21:32:43 +00:00
Sanjay Patel e08381a529 fix typos; NFC
llvm-svn: 260130
2016-02-08 19:27:33 +00:00
Silviu Baranga ea63a7f512 [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory
sanitizer issue. The PredicatedScalarEvolution's copy constructor
wasn't copying the Generation value, and was leaving it un-initialized.

Original commit message:

[SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection

Summary:
This change adds no wrap SCEV predicates with:
  - support for runtime checking
  - support for expression rewriting:
      (sext ({x,+,y}) -> {sext(x),+,sext(y)}
      (zext ({x,+,y}) -> {zext(x),+,sext(y)}

Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.

We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.

Reviewers: mzolotukhin, sanjoy, anemet

Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel

Differential Revision: http://reviews.llvm.org/D15412

llvm-svn: 260112
2016-02-08 17:02:45 +00:00
Silviu Baranga 41b4973329 Revert r260086 and r260085. They have broken the memory
sanitizer bots.

llvm-svn: 260087
2016-02-08 11:56:15 +00:00
Silviu Baranga 70a98bb9e8 [LoopVersioning] Don't assert when there are no memchecks
We shouldn't assert when there are no memchecks, since we
can have SCEV checks. There is already an assert covering
the case where there are no SCEV checks or memchecks.

This also changes the LAA pointer wrapping versioning test
to use the loop versioning pass (this was how I managed to
trigger the assert in the loop versioning pass).

llvm-svn: 260086
2016-02-08 11:15:29 +00:00
Daniel Berlin 905a646c24 Don't use module context here. It's unnecessary and makes it harder to write unittests
llvm-svn: 260015
2016-02-07 02:03:39 +00:00
Daniel Berlin 1b51a2957d Compute live-in for MemorySSA
llvm-svn: 260014
2016-02-07 01:52:19 +00:00
Daniel Berlin 7898ca658e Only insert into definingblocks once per block
llvm-svn: 260013
2016-02-07 01:52:15 +00:00
Michael Zolotukhin 73957179d3 [LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible.
In r255133 (reapplied r253126) we started to avoid redundant
recomputation of LCSSA after loop-unrolling. This patch moves one step
further in this direction - now we can avoid it for much wider range of
loops, as we start to look at IR and try to figure out if the
transformation actually breaks LCSSA phis or makes it necessary to
insert new ones.

Differential Revision: http://reviews.llvm.org/D16838

llvm-svn: 259869
2016-02-05 02:17:36 +00:00
Gerolf Hoflehner 2432bd0ddd [SimplifyCFG] Fix for "endless" loop after dead code removal (Alternative to
D16251)

Summary:
This is a simpler fix to the problem than the dominator approach in
http://reviews.llvm.org/D16251. It adds only values into the gather() while loop
that have been seen before.

The actual endless loop is in the constant compare gather() routine in
Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the
queue:
%.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i

Here is what happens at the IR level:

for.cond.i:                                       ; preds = %if.end6.i,
%if.end.i54
%ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ]
%ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<<
%cmp2.i = icmp ult i32 %ix.0.i, %11
br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit

if.end6.i:                                        ; preds = %for.body.i
%cmp10.i = icmp ugt i32 %conv.i, %add9.i
%.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<<

When if.end.i54 gets eliminated which removes the definition of ret.0.off0.i.
The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i
(Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i).
And
now there is use of .ret.0.off0.i before a definition which triggers the
“endless” loop in gather():

while(!DFT.empty()) {

    V = DFT.pop_back_val();   // V is .ret.0.off0.i

    if (Instruction *I = dyn_cast<Instruction>(V)) {
      // If it is a || (or && depending on isEQ), process the operands.
      if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) {
        DFT.push_back(I->getOperand(1));  // This is now .ret.0.off0.i also
        DFT.push_back(I->getOperand(0));

        continue; // “endless loop” for .ret.0.off0.i
      }

Reviewers: reames, ahatanak

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16839

llvm-svn: 259730
2016-02-03 23:54:25 +00:00
Peter Collingbourne 83cc981c49 Add #include "llvm/Support/raw_ostream.h" to fix Windows build.
llvm-svn: 259623
2016-02-03 03:16:37 +00:00
Peter Collingbourne 9f7ec14009 Transforms: Move GlobalOpt's Evaluator to Utils where it can be reused.
llvm-svn: 259621
2016-02-03 02:51:00 +00:00
Adam Nemet d52ed84160 [LoopVersioning] Expose loop versioning as a pass too
Summary:
LoopVersioning is a transform utility that transform passes can use to
run-time disambiguate may-aliasing accesses. I'd like to also expose as
pass to allow it to be unit-tested.

I am planning to add support for non-aliasing annotation in
LoopVersioning and I'd like to be able to write tests directly using
this pass.

(After that feature is done, the pass could also be used to look for
optimization opportunities that are hidden behind incomplete alias
information at compile time.)

The pass drives LoopVersioning in its default way which is to fully
disambiguate may-aliasing accesses no matter how many checks are
required.

Reviewers: hfinkel, ashutosh.nema, sbaranga

Subscribers: zzheng, mssimpso, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D16612

llvm-svn: 259610
2016-02-03 00:06:10 +00:00
George Burgess IV 60adac46f2 Attempt #2 to unbreak r259595.
llvm-svn: 259602
2016-02-02 23:26:01 +00:00
George Burgess IV b5a229f779 Attempt to fix builds broken by r259595.
llvm-svn: 259599
2016-02-02 23:15:26 +00:00
George Burgess IV e1100f533f This patch adds MemorySSA to LLVM.
Please see include/llvm/Transforms/Utils/MemorySSA.h for a description
of MemorySSA, and what it does.

Differential Revision: http://reviews.llvm.org/D7864

llvm-svn: 259595
2016-02-02 22:46:49 +00:00
Eugene Zelenko ecefe5a81f Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D16793

llvm-svn: 259539
2016-02-02 18:20:45 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Sergei Larin 427f570ce1 [SplitModule] In split module utility we should never separate alias with its aliasee.
Summary: When splitting module with preserving locals, we currently do not handle case of global alias being separated with its aliasee.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16585

llvm-svn: 259075
2016-01-28 18:59:28 +00:00
Junmo Park 502ff66967 Minor code formatting cleanup. NFC.
llvm-svn: 259010
2016-01-28 01:23:18 +00:00
Sanjay Patel 5264cc772c [SimplifyCFG] limit recursion depth when speculating instructions (PR26308)
This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=26308

With the switch to using the TTI cost model in:
http://reviews.llvm.org/rL228826
...it became possible to hit a zero-cost cycle of instructions (gep -> phi -> gep...), 
so we need a cap for the recursion in DominatesMergePoint().

A recursion depth parameter was already added for a different reason in:
http://reviews.llvm.org/rL255660
...so we can just set a limit for it.

I pulled "10" out of the air and made it an independent parameter that we can play with.
It might be higher than it needs to be given the currently low default value of 
PHINodeFoldingThreshold (2). That's the starting cost value that we enter the recursion
with, and most instructions have cost set to TCC_Basic (1), so I don't think we're going
to speculate more than 2 instructions with the current parameters.

As noted in the review and the TODO comment, we can do better than just limiting recursion
depth.

Differential Revision: http://reviews.llvm.org/D16637

llvm-svn: 258971
2016-01-27 19:22:45 +00:00
Benjamin Kramer 820f7548a1 Make some headers self-contained, remove unused includes that violate layering.
llvm-svn: 258937
2016-01-27 16:05:37 +00:00
David Majnemer fccf5c6e01 Revert "Revert "[SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)""
This reverts commit r258903 which reverted r255660.  r258903 was an
accidental commit and should not have been committed.

llvm-svn: 258905
2016-01-27 02:59:41 +00:00
David Majnemer c761afd1d1 [SimplifyCFG] Don't mistake icmp of and for a tree of comparisons
SimplifyCFG tries to turn complex branch conditions into a switch.
Some of it's logic attempts to reason about bitwise arithmetic produced
by InstCombine.  InstCombine can turn things like (X == 2) || (X == 3)
into (X & 1) == 2 and so SimplifyCFG tries to detect when this occurs so
that it can produce a switch instruction.

However, the legality checking was not sufficient to determine whether
or not this had occured.  Correctly check this case by requiring that
the right-hand side of the comparison be a power of two.

This fixes PR26323.

llvm-svn: 258904
2016-01-27 02:43:28 +00:00
David Majnemer 47de2140f7 Revert "[SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)"
This reverts commit r255660.

llvm-svn: 258903
2016-01-27 02:43:22 +00:00
Chris Bieneman e49730d4ba Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Eugene Zelenko 6ac3f739ca Fix Clang-tidy modernize-use-nullptr and modernize-use-override warnings; other minor fixes.
Differential revision: reviews.llvm.org/D16568

llvm-svn: 258831
2016-01-26 18:48:36 +00:00
Sanjay Patel 980b280f50 [LibCallSimplifier] fold memset(malloc(x), 0, x) --> calloc(1, x)
This is a step towards solving PR25892:
https://llvm.org/bugs/show_bug.cgi?id=25892

It won't handle the reported case. As noted by the 'TODO' comments in the patch, 
we need to relax the hasOneUse() constraint and also match patterns that include
memset_chk() and the llvm.memset() intrinsic in addition to memset().

Differential Revision: http://reviews.llvm.org/D16337

llvm-svn: 258816
2016-01-26 16:17:24 +00:00
David Majnemer c67668ff21 [LoopSimplify] Reuse changeToUnreachable
Use existing functionality provided in changeToUnreachable instead of
reinventing it in LoopSimplify.

No functionality change is intended.

llvm-svn: 258663
2016-01-24 19:32:52 +00:00
David Majnemer 88542a0a69 [SCCP] Remove duplicate code
SCCP has code identical to changeToUnreachable's behavior, switch it
over to just call changeToUnreachable.

No functionality change intended.

llvm-svn: 258654
2016-01-24 06:26:47 +00:00
David Majnemer 35c46d3e0b [InstCombine, SCCP] Consolidate code used to remove instructions
InstCombine and SCCP both want to remove dead code in a very particular
way but using identical means to do so.  Share the code between the two.

No functionality change is intended.

llvm-svn: 258653
2016-01-24 05:26:18 +00:00
Sanjay Patel 577472141c move function definitions so we don't need separate declarations ; NFCI
llvm-svn: 258455
2016-01-21 23:38:43 +00:00
Sanjay Patel 9beec21fcf [LibCallSimplifier] refactor FP function signature checks ; NFCI
Use the helper function added in r258428.

The check should really be hoisted to the caller of all of these
optimize* functions, but that's another step.

llvm-svn: 258446
2016-01-21 22:58:01 +00:00
Sanjay Patel 042aed90ab avoid variable shadowing; NFC
llvm-svn: 258445
2016-01-21 22:41:16 +00:00
Sanjay Patel 0e603fc3a7 remove unnecessary variable; NFC
llvm-svn: 258444
2016-01-21 22:31:18 +00:00
David L Kreitzer 4d7257dfa1 Fix for two constant propagation problems in GVN with the assume intrinsic
instruction.

Patch by Yuanrui Zhang.

Differential Revision: http://reviews.llvm.org/D16100

llvm-svn: 258435
2016-01-21 21:32:35 +00:00
Sanjay Patel fcc7c1a0ba [LibCallSimplifier] don't get fooled by a fake fmin()
This is similar to the bug/fix:
https://llvm.org/bugs/show_bug.cgi?id=26211
http://reviews.llvm.org/rL258325

The fmin() test case reveals another bug caused by sloppy
code duplication. It will crash without this patch because
fp128 is a valid floating-point type, but we would think
that we had matched a function that used doubles.

The new helper function can be used to replace similar
checks that are used in several other places in this file.

llvm-svn: 258428
2016-01-21 20:19:54 +00:00
Sanjay Patel 4e971da272 make helper functions static; NFCI
llvm-svn: 258416
2016-01-21 18:01:57 +00:00
Manuel Jacob e902459c4b Change ConstantFoldInstOperands to take Instruction instead of opcode and type. NFC.
Summary:
The previous form, taking opcode and type, is moved to an internal
helper and the new form, taking an instruction, is a wrapper around this
helper.

Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16383

llvm-svn: 258391
2016-01-21 06:33:22 +00:00
Sanjay Patel bd2dc67142 [LibCallSimplifier] don't get fooled by a fake sqrt()
The test case will crash without this patch because the subsequent call to
hasUnsafeAlgebra() assumes that the call instruction is an FPMathOperator
(ie, returns an FP type).

This part of the function signature check was omitted for the sqrt() case, 
but seems to be in place for all other transforms.

Before:
http://reviews.llvm.org/rL257400
...we would have needlessly continued execution in optimizeSqrt(), but the
bug was harmless because we'd eventually fail some other check and return
without damage.

This should fix:
https://llvm.org/bugs/show_bug.cgi?id=26211

Differential Revision: http://reviews.llvm.org/D16198

llvm-svn: 258325
2016-01-20 17:41:14 +00:00
Joseph Tremoulet b41632bf0f [Inliner/WinEH] Honor implicit nounwinds
Summary:
Funclet EH tables require that a given funclet have only one unwind
destination for exceptional exits.  The verifier will therefore reject
e.g. two cleanuprets with different unwind dests for the same cleanup, or
two invokes exiting the same funclet but to different unwind dests.
Because catchswitch has no 'nounwind' variant, and because IR producers
are not *required* to annotate calls which will not unwind as 'nounwind',
it is legal to nest a call or an "unwind to caller" catchswitch within a
funclet pad that has an unwind destination other than caller; it is
undefined behavior for such a call or catchswitch to unwind.

Normally when inlining an invoke, calls in the inlined sequence are
rewritten to invokes that unwind to the callsite invoke's unwind
destination, and "unwind to caller" catchswitches in the inlined sequence
are rewritten to unwind to the callsite invoke's unwind destination.
However, if such a call or "unwind to caller" catchswitch is located in a
callee funclet that has another exceptional exit with an unwind
destination within the callee, applying the normal transformation would
give that callee funclet multiple unwind destinations for its exceptional
exits.  There would be no way for EH table generation to determine which
is the "true" exit, and the verifier would reject the function
accordingly.

Add logic to the inliner to detect these cases and leave such calls and
"unwind to caller" catchswitches as calls and "unwind to caller"
catchswitches in the inlined sequence.

This fixes PR26147.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: alexcrichton, llvm-commits

Differential Revision: http://reviews.llvm.org/D16319

llvm-svn: 258273
2016-01-20 02:15:15 +00:00
Sanjay Patel d4af297df1 getParent()->getParent() == getModule() ; NFC
llvm-svn: 258176
2016-01-19 19:58:49 +00:00
Sanjay Patel d3112a5bcc function names start with a lowercase letter; NFC
Note: There are no uses of these functions outside of
SimplifyLibCalls, so they could be static functions in
that file.

llvm-svn: 258172
2016-01-19 19:46:10 +00:00
Sanjay Patel b50325e276 fix formatting; NFC
llvm-svn: 258167
2016-01-19 19:17:47 +00:00
Sanjay Patel 4e86036733 don't repeat documentation comments in implementation file; NFC
llvm-svn: 258166
2016-01-19 19:16:10 +00:00
Sanjay Patel d1f4f03f5e [LibCallSimplifier] use instruction-level fast-math-flags to shrink calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

llvm-svn: 258158
2016-01-19 18:38:52 +00:00
Sanjay Patel 81a63cd11f [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, [small integer]) calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

As with D15937, the intent of the patch is to preserve the current behavior of the transform
except that we use the pow call's 'fast' attribute as a trigger rather than a function-level
attribute.

The TODO comment notes a potential follow-on patch that would propagate FMF to the new
instructions.

Differential Revision: http://reviews.llvm.org/D16122

llvm-svn: 258153
2016-01-19 18:15:12 +00:00
Tobias Edler von Koch 3f4f6f3ed6 Add a change accidentally left out from r258100
Also remove an executable bit introduced by r258083.

llvm-svn: 258101
2016-01-18 23:35:24 +00:00
Sergei Larin d19d4d30d8 Add to the split module utility an SCC based method which allows not to globalize any local variables.
Summary:
    Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
    This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
    Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
    
    Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
    
    Subscribers: llvm-commits, joker.eph
    
    Differential Revision: http://reviews.llvm.org/D16124

llvm-svn: 258083
2016-01-18 21:07:13 +00:00
Manuel Jacob 5f6eaac611 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260

llvm-svn: 257999
2016-01-16 20:30:46 +00:00
Peter Collingbourne f0f5e87083 Introduce sanstats tool and llvm::CreateSanitizerStatReport function.
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.

Differential Revision: http://reviews.llvm.org/D16174

llvm-svn: 257970
2016-01-16 00:31:11 +00:00
James Y Knight ac03dca412 Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)

Differential Revision: http://reviews.llvm.org/D16145

llvm-svn: 257902
2016-01-15 16:33:06 +00:00
James Molloy f01488e2bc [InstCombine] Rewrite bswap/bitreverse handling completely.
There are several requirements that ended up with this design;
  1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
  2. Bitreversals and byteswaps are very related in their matching logic.
  3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
  4. Bswaps are best matched early in InstCombine.

The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.

We can then extend the matching logic in one place only.

llvm-svn: 257875
2016-01-15 09:20:19 +00:00
Rui Ueyama da00f2fdf4 Update to use new name alignTo().
llvm-svn: 257804
2016-01-14 21:06:47 +00:00
Keno Fischer 1dd319f3b6 [Utils] Fix incorrect dbg.declare store conversion
Summary: The dbg.declare -> dbg.value conversion did not check which operand of
the store instruction the alloca was passed to. As a result code that stored the
address of an alloca, rather than storing to the alloca, would still trigger
the conversion routine, leading to the insertion of an incorrect dbg.value
intrinsic.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16169

llvm-svn: 257787
2016-01-14 19:12:27 +00:00
James Y Knight 582f556251 Revert "Stop increasing alignment of externally-visible globals on ELF platforms."
This reverts commit r257719, due to PR26144.

llvm-svn: 257775
2016-01-14 16:33:21 +00:00
Joseph Tremoulet bba70e4424 [OperandBundles] Copy DebugLoc with calls/invokes
Summary:
The overloads of CallInst::Create and InvokeInst::Create that are used to
adjust operand bundles purport to create a new instruction "identical in
every way except [for] the operand bundles", so copy the DebugLoc along
with everything else.


Reviewers: sanjoy, majnemer

Subscribers: majnemer, dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16157

llvm-svn: 257745
2016-01-14 06:21:42 +00:00
James Y Knight 9de6d7becc Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

llvm-svn: 257719
2016-01-13 23:59:19 +00:00
Sanjay Patel 42c73555b0 hasNUses(0) == use_empty() ; NFCI
Also, improve variable name and remove unnecessary braces.

llvm-svn: 257687
2016-01-13 22:16:48 +00:00
Sanjay Patel e01dcab39d rangify; NFCI
llvm-svn: 257677
2016-01-13 21:39:26 +00:00
Keno Fischer 9aae445e09 [Utils] Insert DW_OP_bit_piece when only describing part of the variable
Summary: The dbg.declare -> dbg.value conversion looks through any zext/sext
to find a value to describe the variable (in the expectation that those
zext/sext instruction will go away later). However, those values do not
cover the entire variable and thus need a DW_OP_bit_piece.

Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16061

llvm-svn: 257534
2016-01-12 22:46:09 +00:00
Sanjay Patel 53ba88dbb0 [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, 0.5) calls
Also, propagate the FMF to the newly created sqrt() call.

llvm-svn: 257503
2016-01-12 19:06:35 +00:00
Sanjay Patel a252815bc1 function names start with a lower case letter ; NFC
llvm-svn: 257496
2016-01-12 18:03:37 +00:00
Sanjay Patel 6002e78a06 [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(exp(x)) calls
See also:
http://reviews.llvm.org/rL255555
http://reviews.llvm.org/rL256871
http://reviews.llvm.org/rL256964
http://reviews.llvm.org/rL257400
http://reviews.llvm.org/rL257404
http://reviews.llvm.org/rL257414

llvm-svn: 257491
2016-01-12 17:30:37 +00:00
Sanjay Patel e896ede7f1 [LibCallSimplifier] use instruction-level fast-math-flags to transform log calls
Also, add tests to verify that we're checking 'fast' on both calls of each transform pair,
tighten the CHECK lines, and give the tests more meaningful names.

This is a continuation of:
http://reviews.llvm.org/rL255555
http://reviews.llvm.org/rL256871
http://reviews.llvm.org/rL256964
http://reviews.llvm.org/rL257400
http://reviews.llvm.org/rL257404

llvm-svn: 257414
2016-01-11 23:31:48 +00:00
Sanjay Patel 6c1ddbb7b6 [LibCallSimplifier] don't allow sqrt transform unless all ops are unsafe
Fix the FIXME added with:
http://reviews.llvm.org/rL257400

llvm-svn: 257404
2016-01-11 22:50:36 +00:00
Sanjay Patel 9f67dadea2 more space; NFC
llvm-svn: 257401
2016-01-11 22:35:39 +00:00
Sanjay Patel 683f29735f [LibCallSimplifier] use instruction-level fast-math-flags to transform sqrt calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

The intent of the patch is to preserve the current behavior of the transform except
that we use the sqrt instruction's 'fast' attribute as a trigger rather than the
function-level attribute.

But this raises a bug noted by the new FIXME comment.

In order to do this transform:
sqrt((x * x) * y) ---> fabs(x) * sqrt(y)

...we need all of the sqrt, the first fmul, and the second fmul to be 'fast'. 
If any of those ops is strict, we should bail out.

Differential Revision: http://reviews.llvm.org/D15937

llvm-svn: 257400
2016-01-11 22:34:19 +00:00
Teresa Johnson b43257d594 Split resolveCycles(bool AllowTemps) into two interfaces and document
Address review feedback from r255909.

Move body of resolveCycles(bool AllowTemps) to
resolveRecursivelyImpl(bool AllowTemps). Revert resolveCycles back
to asserting on temps, and add new resolveNonTemporaries interface
to invoke the new implementation with AllowTemps=true. Document
the differences between these interfaces, specifically the effect
on RAUW support and uniquing. Call appropriate interface from
ValueMapper.

llvm-svn: 257389
2016-01-11 21:37:41 +00:00
Chen Li 509ff21300 Code refactoring for commit r257278.
llvm-svn: 257366
2016-01-11 19:20:53 +00:00
David Majnemer d9833ea579 [JumpThreading] Don't forget to report that the IR changed
JumpThreading's runOnFunction is supposed to return true if it made any
changes.  JumpThreading has a call to removeUnreachableBlocks which may
result in changes to the IR but runOnFunction didn't appropriate account
for this possibility, leading to badness.

While we are here, make sure to call LazyValueInfo::eraseBlock in
removeUnreachableBlocks;  JumpThreading preserves LVI.

This fixes PR26096.

llvm-svn: 257279
2016-01-10 07:13:04 +00:00
Chen Li c375450e3f Fix a control flow problem in commit rL257277.
llvm-svn: 257278
2016-01-10 06:13:32 +00:00
Chen Li 1689c2f54b [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
Summary:
This is a fix of D13718. D13718 was committed but then reverted because of the following bug:
https://llvm.org/bugs/show_bug.cgi?id=25299

This patch fixes the issue shown in the bug.

Reviewers: majnemer, reames

Subscribers: jevinskie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14308

llvm-svn: 257277
2016-01-10 05:48:01 +00:00
Justin Bogner e9fb228d59 LoopInfo: Simplify ownership of Loop objects
It's strange that LoopInfo mostly owns the Loop objects, but that it
defers deleting them to the loop pass manager. Instead, change the
oddly named "updateUnloop" to "markAsRemoved" and have it queue the
Loop object for deletion. We can't delete the Loop immediately when we
remove it, since we need its pointer identity still, so we'll mark the
object as "invalid" so that clients can see what's going on.

llvm-svn: 257191
2016-01-08 19:08:53 +00:00
Easwaran Raman 7f18729039 Remove CloningDirector and associated code
With the removal of the old landing pad code in r249918, CloningDirector is not
 used anywhere else. NFCI.

llvm-svn: 257185
2016-01-08 18:23:17 +00:00
Sanjay Patel c2d6461a4a [LibCallSimplifier] less indenting; NFCI
llvm-svn: 256973
2016-01-06 20:52:21 +00:00
Chen Li 78bde83003 [SplitLandingPadPredecessors] Create a PHINode for the original landingpad only if it has some uses
Summary: This patch adds a check in SplitLandingPadPredecessors to see if the original landingpad instruction has any uses. If not, we don't need to create a PHINode for it in the joint block since it's gonna be a dead code anyway. The motivation for this patch is that we found a bug that SplitLandingPadPredecessors created a PHINode of token type landingpad, which failed the verifier since PHINode can not be token type. However, the created PHINode will never be used in our code pattern. This patch will workaround this bug, and we might add supports in SplitLandingPadPredecessors to handle token type landingpad with uses in the future.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15835

llvm-svn: 256972
2016-01-06 20:32:05 +00:00
Sanjay Patel cddcd7256c [LibCallSimplifier] use instruction-level fast-math-flags for tan/atan transform
llvm-svn: 256964
2016-01-06 19:23:35 +00:00
David Majnemer b70e23c390 [SimplifyLibCalls] Teach SimplifyLibCalls about operand bundles
If we replace one call-site with another, be sure to move over any
operand bundles that lingered on the old call-site.

This fixes PR26036.

llvm-svn: 256912
2016-01-06 05:01:34 +00:00
Sanjay Patel c7ddb7fcdb A (B + C) = A B + A C ; NFCI
llvm-svn: 256884
2016-01-06 00:32:15 +00:00
Manuel Jacob 3eedd11329 [Statepoints] Check for the "gc-leaf-function" attribute on call sites as well.
Reviewers: sanjoy, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15900

llvm-svn: 256875
2016-01-05 23:59:08 +00:00
Sanjay Patel 29095ea1b0 [LibCallSimplfier] use instruction-level fast-math-flags for fmin/fmax transforms
llvm-svn: 256871
2016-01-05 20:46:19 +00:00
David Majnemer 59eb733af1 [SimplifyCFG] Further improve our ability to remove redundant catchpads
In r256814, we managed to remove catchpads which were trivially redudant
because they were the same SSA value.  We can do better using the same
algorithm but with a smarter datastructure by hashing the SSA values
within the catchpad and comparing them structurally.

llvm-svn: 256815
2016-01-05 07:42:17 +00:00
David Majnemer 2fa8651a8f [SimplifyCFG] Remove redundant catchpads
Remove duplicate catchpad handlers from a catchswitch.

llvm-svn: 256814
2016-01-05 06:27:50 +00:00
Joseph Tremoulet 0d808888c1 [WinEH] Simplify unreachable catchpads
Summary:
At least for CoreCLR, a catchpad which immediately executes an
`unreachable` instruction indicates that the exception can never have a
matching type, and so such catchpads can be removed, and so can their
catchswitches if the catchswitch becomes empty.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15846

llvm-svn: 256809
2016-01-05 02:37:41 +00:00
Eric Christopher 49a7d6c473 Clarify that the bypassSlowDivision optimization operates on a single BB [v2]
Update some comments to be more explicit.

Change bypassSlowDivision and the functions it calls so that they take
BasicBlock*s and Instruction*s, rather than Function::iterator&s and
BasicBlock::iterator&s.

Change the APIs so that the caller is responsible for updating the
iterator, rather than the callee. This makes control flow much easier
to follow.

Patch by Justin Lebar!

llvm-svn: 256789
2016-01-04 23:18:58 +00:00
Sanjay Patel bee05caa6b [LibCallSimplifier] propagate FMF when shrinking binary calls
llvm-svn: 256682
2015-12-31 23:40:59 +00:00
Sanjay Patel aa23114cb4 [LibCallSimplifier] propagate FMF when shrinking unary calls
llvm-svn: 256679
2015-12-31 21:52:31 +00:00
Sanjay Patel 96475cbd22 Variable names start with an upper case letter; NFC
llvm-svn: 256676
2015-12-31 16:16:58 +00:00
Sanjay Patel d707db97a9 fix formatting; NFC
llvm-svn: 256675
2015-12-31 16:10:49 +00:00
Teresa Johnson 96f7f81aa3 [ThinLTO] Rename variables used in metadata linking (NFC)
As suggested in review for r255909, rename MDMaterialized to AllowTemps,
and identify the name of the boolean flag being set in calls to
saveMetadataList.

llvm-svn: 256653
2015-12-30 21:13:55 +00:00
Craig Topper 582d8ecf6a [Transforms] Use asserts instead of ifs around llvm_unreachable. NFC
llvm-svn: 256405
2015-12-25 02:04:17 +00:00
Sanjoy Das ab0626e35f Nonnull elements in OperandBundleCallSites are not all Instructions
`CloneAndPruneIntoFromInst` sometimes RAUW's dead instructions with
`undef` before erasing them (to avoid deleting instructions that still
have uses).  This changes the `WeakVH` in `OperandBundleCallSites` to
hold an `undef`, and we need to guard for this situation in eventuality
in `llvm::InlineFunction`.

llvm-svn: 256110
2015-12-19 22:40:28 +00:00
Keno Fischer 00cbf9a69a Clean up the processing of dbg.value in various places
Summary:
First up is instcombine, where in the dbg.declare -> dbg.value conversion,
the llvm.dbg.value needs to be called on the actual loaded value, rather
than the address (since the whole point of this transformation is to be
able to get rid of the alloca). Further, now that that's cleaned up, we
can remove a hack in the backend, that would add an implicit OP_deref if
the argument to dbg.value was an alloca. This stems from before the
existence of DIExpression and is no longer necessary since the deref can
be expressed explicitly.

Now, in order to make sure that the tests pass with this change, we need to
correct the printing of DEBUG_VALUE comments to take into account the
expression, which wasn't taken into account before.

Unfortunately, for both these changes, there were a number of incorrect
test cases (mostly the wrong number of DW_OP_derefs, but also a couple
where the test itself was broken more badly). aprantl and I have gone
through and adjusted these test case in order to make them pass with
these fixes and in some cases to make sure they're actually testing
what they are meant to test.

Reviewers: aprantl

Subscribers: dsanders

Differential Revision: http://reviews.llvm.org/D14186

llvm-svn: 256077
2015-12-19 02:02:44 +00:00
Andrew Kaylor 123048d26a [WinEH] Update LCSSA to handle catchswitch with handlers inside and outside a loop
Differential Revision: http://reviews.llvm.org/D15630

llvm-svn: 256005
2015-12-18 18:12:35 +00:00
Teresa Johnson 0e7c82cb69 [ThinLTO/LTO] Don't link in unneeded metadata
Summary:
Third patch split out from http://reviews.llvm.org/D14752.

Only map in needed DISubroutine metadata (imported or otherwise linked
in functions and other DISubroutine referenced by inlined instructions).
This is supported for ThinLTO, LTO and llvm-link --only-needed, with
associated tests for each one.

Depends on D14838.

Reviewers: dexonsmith, joker.eph

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D14843

llvm-svn: 256003
2015-12-18 17:51:37 +00:00
Teresa Johnson e5a6191732 [ThinLTO] Metadata linking for imported functions
Summary:
Second patch split out from http://reviews.llvm.org/D14752.

Maps metadata as a post-pass from each module when importing complete,
suturing up final metadata to the temporary metadata left on the
imported instructions.

This entails saving the mapping from bitcode value id to temporary
metadata in the importing pass, and from bitcode value id to final
metadata during the metadata linking postpass.

Depends on D14825.

Reviewers: dexonsmith, joker.eph

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D14838

llvm-svn: 255909
2015-12-17 17:14:09 +00:00
Justin Bogner 883a3ea67f LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC
As of r255720, the loop pass manager will DTRT when passes update the
loop info for removed loops, so they no longer need to reach into
LPPassManager APIs to do this kind of transformation. This change very
nearly removes the need for the LPPassManager to even be passed into
loop passes - the only remaining pass that uses the LPM argument is
LoopUnswitch.

llvm-svn: 255797
2015-12-16 18:40:20 +00:00
James Molloy 3d21dcf3ed [SimplifyCFG] Don't create unnecessary PHIs
In conditional store merging, we were creating PHIs when we didn't
need to. If the value to be predicated isn't defined in the block
we're predicating, then it doesn't need a PHI at all (because we only
deal with triangles and diamonds, any value not in the predicated BB
must dominate the predicated BB).

This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite.

Now with a fix (and fixed tests) for the conformance issue seen in Chromium.

llvm-svn: 255767
2015-12-16 14:12:44 +00:00
David Majnemer 3bb88c0210 [WinEH] Use operand bundles to describe call sites
SimplifyCFG allows tail merging with code which terminates in
unreachable which, in turn, makes it possible for an invoke to end up in
a funclet which it was not originally part of.

Using operand bundles on invokes allows us to determine whether or not
an invoke was part of a funclet in the source program.

Furthermore, it allows us to unambiguously answer questions about the
legality of inlining into call sites which the personality may have
trouble with.

Differential Revision: http://reviews.llvm.org/D15517

llvm-svn: 255674
2015-12-15 21:27:27 +00:00
Justin Bogner 843fb204b7 LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC
A large number of loop utility functions take a `Pass *` and reach
into it to find out which analyses to preserve. There are a number of
problems with this:

- The APIs have access to pretty well any Pass state they want, so
  it's hard to tell what they may or may not do.

- Other APIs have copied these and pass around a `Pass *` even though
  they don't even use it. Some of these just hand a nullptr to the API
  since the callers don't even have a pass available.

- Passes in the new pass manager don't work like the current ones, so
  the APIs can't be used as is there.

Instead, we should explicitly thread the analysis results that we
actually care about through these APIs. This is both simpler and more
reusable.

llvm-svn: 255669
2015-12-15 19:40:57 +00:00
Sanjay Patel 38a022623a [SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)
This is the last general step to allow more IR-level speculation with a safety harness in place in CodeGenPrepare.

The intent is to restore the behavior enabled by:
http://reviews.llvm.org/rL228826

but prevent bad performance such as:
https://llvm.org/bugs/show_bug.cgi?id=24818

Earlier patches in this sequence:
D12882 (disable SimplifyCFG speculation for expensive instructions)
D13297 (have CGP despeculate expensive ops)
D14630 (have CGP despeculate special versions of cttz/ctlz)

As shown in the test cases, we only have two instructions currently affected: ctz for some x86 and fdiv generally. 
Allowing exactly one expensive instruction is a bit of a hack, but it lines up with what is currently implemented
in CGP. If we make the despeculation more general in CGP, we can make the speculation here more liberal.

A follow-up patch will adjust the cost for sqrt and possibly other typically expensive math intrinsics (currently
everything is cheap by default). GPU targets would likely want to override those expensive default costs (just as
they probably should already override the cost of div/rem) because just about any math is cheaper than control-flow
on those targets.

Differential Revision: http://reviews.llvm.org/D15213

llvm-svn: 255660
2015-12-15 17:38:29 +00:00
Reid Kleckner db9a91e324 Revert "Don't create unnecessary PHIs"
This reverts commit r255489.

It causes test failures in Chromium and does not appear to respect the
AlternativeV parameter.

llvm-svn: 255562
2015-12-14 22:36:57 +00:00
David Majnemer bbfc7219ef [IR] Remove terminatepad
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function.  This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.

Depends on D15478.

Differential Revision: http://reviews.llvm.org/D15479

llvm-svn: 255522
2015-12-14 18:34:23 +00:00
Sanjay Patel af674fbfd9 getParent() ^ 3 == getModule() ; NFCI
llvm-svn: 255511
2015-12-14 17:24:23 +00:00
James Molloy 2b1e101e99 Don't create unnecessary PHIs
In conditional store merging, we were creating PHIs when we didn't
need to. If the value to be predicated isn't defined in the block
we're predicating, then it doesn't need a PHI at all (because we only
deal with triangles and diamonds, any value not in the predicated BB
must dominate the predicated BB).

This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite.

llvm-svn: 255489
2015-12-14 10:57:01 +00:00
David Majnemer 8a1c45d6e8 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

llvm-svn: 255422
2015-12-12 05:38:55 +00:00
James Molloy 1bb6ea5e2d [Mem2Reg] Respect optnone
Mem2Reg shouldn't be optimizing a function that is marked
optnone. There is a test checking this that fails when mem2reg is
explicitly added to the standard pass pipeline.

llvm-svn: 255336
2015-12-11 13:36:59 +00:00
Sanjoy Das ccd14566e2 Add arg_begin() and arg_end() to CallInst and InvokeInst; NFCI
- This simplifies the CallSite class, arg_begin / arg_end are now
   simple wrapper getters.

 - In several places, we were creating CallSite instances solely to call
   arg_begin and arg_end.  With this change, that's no longer required.

llvm-svn: 255226
2015-12-10 06:39:02 +00:00
Sanjoy Das 9abfb0b429 Use WeakVH to keep track of calls with operand bundles in CloneCodeInfo
`CloneAndPruneIntoFromInst` can DCE instructions after cloning them into
the new function, and so an AssertingVH is too strong.  This change
switches CloneCodeInfo to use a std::vector<WeakVH>.

llvm-svn: 255148
2015-12-09 20:33:52 +00:00
Sanjoy Das 1f8fd88873 Delete trailing whitespace; NFC
llvm-svn: 255147
2015-12-09 20:33:45 +00:00
Michael Zolotukhin 78760ee73d Revert "Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible.""
The bug in IndVarSimplify was fixed in r254976, r254977, so I'm
reapplying the original patch for avoiding redundant LCSSA recomputation.

This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466.

llvm-svn: 255133
2015-12-09 18:20:28 +00:00
Silviu Baranga 9cd9a7e310 Re-commit r255115, with the PredicatedScalarEvolution class moved to
ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform
and Analysis modules:

[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions

Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.

This also solves a problem involving the result of SCEV expression rewritting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
  P1: {a,+,b} has nsw
  P2: b = 1.

Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, and therefore
avoiding this issue.

The SCEVPredicatedLayer maintains a cache to remember the results of previous
SCEV rewritting results. This also has the benefit of reducing the overall number
of expression rewrites.

Reviewers: mzolotukhin, anemet

Subscribers: jmolloy, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14296

llvm-svn: 255122
2015-12-09 16:06:28 +00:00
Silviu Baranga ad1ccb357b Revert r255115 until we figure out how to fix the bot failures.
llvm-svn: 255117
2015-12-09 15:25:28 +00:00
Silviu Baranga 41eb682501 [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.

This also solves a problem involving the result of SCEV expression rewritting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
  P1: {a,+,b} has nsw
  P2: b = 1.

Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, and therefore
avoiding this issue.

The SCEVPredicatedLayer maintains a cache to remember the results of previous
SCEV rewritting results. This also has the benefit of reducing the overall number
of expression rewrites.

Reviewers: mzolotukhin, anemet

Subscribers: jmolloy, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14296

llvm-svn: 255115
2015-12-09 15:03:52 +00:00
Rafael Espindola cab951dd46 Return a std::unique_ptr from CloneModule. NFC.
llvm-svn: 255078
2015-12-08 23:57:17 +00:00
Sanjoy Das 8a954a0553 [OperandBundles] Fix a transform in simplifycfg
Reviewers: pcc, majnemer, reames

Subscribers: reames, llvm-commits

Differential Revision: http://reviews.llvm.org/D15345

llvm-svn: 255062
2015-12-08 22:26:08 +00:00
Sanjoy Das 8da1f95916 [OperandBundles] Remove unncessary constructor
The StringRef constructor is unnecessary (since we're converting to
std::string anyway), and having it requires an explicit call to
StringRef's or std::string's constructor.

llvm-svn: 255000
2015-12-08 03:50:32 +00:00
Rafael Espindola b6d56a7655 Create llvm.global_ctors in the new format.
llvm-svn: 254878
2015-12-06 16:18:25 +00:00
Weiming Zhao 8213072a45 [SimplifyLibCalls] Optimization for pow(x, n) where n is some constant
Summary:
    In order to avoid calling pow function we generate repeated fmul when n is a
    positive or negative whole number.
    
    For each exponent we pre-compute Addition Chains in order to minimize the no.
    of fmuls.
    Refer: http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html
    
    We pre-compute addition chains for exponents upto 32 (which results in a max of
    7 fmuls).

    For eg:
    4 = 2+2
    5 = 2+3
    6 = 3+3 and so on
    
    Hence,
    pow(x, 4.0) ==> y = fmul x, x
                    x = fmul y, y
                    ret x

    For negative exponents, we simply compute the reciprocal of the final result.
    
    Note: This transformation is only enabled under fast-math.
    
    Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: weimingz, majnemer, escha, davide, scanon, joerg

Subscribers: probinson, escha, llvm-commits

Differential Revision: http://reviews.llvm.org/D13994

llvm-svn: 254776
2015-12-04 22:00:47 +00:00
David Majnemer 70497c696a Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Rafael Espindola baa3bf8f76 Bring r254336 back:
The difference is that now we don't error on out-of-comdat access to
internal global values. We copy them instead. This seems to match the
expectation of COFF linkers (see pr25686).

Original message:

    Start deciding earlier what to link.

    A traditional linker is roughly split in symbol resolution and
"copying
    stuff".

    The two tasks are badly mixed in lib/Linker.

    This starts splitting them apart.

    With this patch there are no direct call to linkGlobalValueBody or
    linkGlobalValueProto. Everything is linked via WapValue.

    This also includes a few fixes:
    * A GV goes undefined if the comdat is dropped (comdat11.ll).
    * We error if an internal GV goes undefined (comdat13.ll).
    * We don't link an unused comdat.

    The first two match the behavior of an ELF linker. The second one is
    equivalent to running globaldce on the input.

llvm-svn: 254418
2015-12-01 15:19:48 +00:00
Evgeniy Stepanov 42f3b12274 [safestack] Protect byval function arguments.
Detect unsafe byval function arguments and move them to the unsafe
stack.

llvm-svn: 254353
2015-12-01 00:40:05 +00:00
Rafael Espindola e9841a6bb5 This reverts commit r254336 and r254344.
They broke a bot and I am debugging why.

llvm-svn: 254347
2015-11-30 23:54:19 +00:00
Rafael Espindola c109200c53 Start deciding earlier what to link.
A traditional linker is roughly split in symbol resolution and "copying
stuff".

The two tasks are badly mixed in lib/Linker.

This starts splitting them apart.

With this patch there are no direct call to linkGlobalValueBody or
linkGlobalValueProto. Everything is linked via WapValue.

This also includes a few fixes:
* A GV goes undefined if the comdat is dropped (comdat11.ll).
* We error if an internal GV goes undefined (comdat13.ll).
* We don't link an unused comdat.

The first two match the behavior of an ELF linker. The second one is
equivalent to running globaldce on the input.

llvm-svn: 254336
2015-11-30 22:01:43 +00:00
Davide Italiano 1aeed6a955 [SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math.
llvm-svn: 254317
2015-11-30 19:36:35 +00:00
Davide Italiano 0b14f29285 [SimplifyLibCalls] Don't crash if the function doesn't have a name.
llvm-svn: 254265
2015-11-29 21:58:56 +00:00
Davide Italiano e2db58cfb8 [SimplifyLibCalls] Cross out implemented transformations.
llvm-svn: 254264
2015-11-29 21:00:43 +00:00
Davide Italiano b8b7133c94 [SimplifyLibCalls] Tranform log(pow(x, y)) -> y*log(x).
This one is enabled only under -ffast-math. There are cases where the
difference between the value computed and the correct value is huge
even for ffast-math, e.g. as Steven pointed out:

x = -1, y = -4
log(pow(-1), 4) = 0
4*log(-1) = NaN

I checked what GCC does and apparently they do the same optimization
(which result in the dramatic difference). Future work might try to
make this (slightly) less worse.

Differential Revision:	http://reviews.llvm.org/D14400

llvm-svn: 254263
2015-11-29 20:58:04 +00:00
Davide Italiano da3beebad1 [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie!
llvm-svn: 254239
2015-11-28 22:27:48 +00:00
Benjamin Kramer 89766e5b1d [SimplifyLibCalls] Fix inverted condition that lead to an uninitialized memory read below.
Found by msan!

llvm-svn: 254238
2015-11-28 21:43:12 +00:00
Rafael Espindola 19b52383c5 Simplify the linking of recursive data.
Now the ValueMapper has two callbacks. The first one maps the
declaration. The ValueMapper records the mapping and then materializes
the body/initializer.

llvm-svn: 254209
2015-11-27 20:28:19 +00:00
Davide Italiano ac0953a2e6 [SimplifyLibCalls] Use range-based loop. NFC.
llvm-svn: 254193
2015-11-27 08:05:40 +00:00
Benjamin Kramer fb419e71f4 [SimplifyLibCalls] Don't depend on a called function having a name, it might be an indirect call.
Fixes the crasher in PR25651 and related crashers using the same pattern.

llvm-svn: 254145
2015-11-26 09:51:17 +00:00
Sanjoy Das c521c7bea5 [OperandBundles] Extract duplicated code into a helper function, NFC
llvm-svn: 254047
2015-11-25 00:42:24 +00:00
Weiming Zhao 45d4cb9a14 [Utils] Put includes in correct order. NFC.
Summary:
    Followed the guidelines in:
    http://llvm.org/docs/CodingStandards.html#include-style
    
    However, I noticed that uppercase named headers come before lowercase ones
    throughout the codebase. So kept them as is.
    
    Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: majnemer, davide, jmolloy, atrick

Subscribers: sanjoy

Differential Revision: http://reviews.llvm.org/D14939

llvm-svn: 254005
2015-11-24 18:57:06 +00:00
Weiming Zhao 8d5c08f591 [SimplifyLibCalls] Removed some TODOs which are already implemented. NFC.
Summary:
D14302 implements tan(atan(x)) -> x
D14045 implements pow(exp(x), y) -> exp(x*y)

Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: majnemer, davide

Differential Revision: http://reviews.llvm.org/D14882

llvm-svn: 253768
2015-11-21 06:10:20 +00:00
Dehao Chen 014fb55711 Fix the debug build breakage that getDiscriminator is called by mistake.
llvm-svn: 253597
2015-11-19 20:29:27 +00:00
Michael Zolotukhin 6c11c04db3 Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible."
The change exposed a bug in IndVarSimplify (PR25578), which led to a
failure (PR25538). When the bug is fixed, this patch can be reapplied.

The tests are kept in tree, as they're useful anyway, and will not break
with this revert.

llvm-svn: 253596
2015-11-19 20:28:32 +00:00
Dehao Chen 23e2278e27 Reimplement discriminator assignment algorithm.
Summary: The new algorithm is more efficient (O(n), n is number of basic blocks). And it is guaranteed to cover all cases of multiple BB mapped to same line.

Reviewers: dblaikie, davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14738

llvm-svn: 253594
2015-11-19 19:53:05 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Davide Italiano c5cedd195a [SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math.
Differential Revision:	http://reviews.llvm.org/D14466

llvm-svn: 253521
2015-11-18 23:21:32 +00:00
Davide Italiano 455ea11d13 [BuildLibCalls] EmitStrNLen() is dead code. Garbage collect.
llvm-svn: 253514
2015-11-18 22:29:38 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Igor Laevsky 7310c68e85 Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)"
Failing clang test is now fixed by the r253458.

llvm-svn: 253459
2015-11-18 14:50:18 +00:00
Sanjoy Das f79d3449c5 [OperandBundles] Tighten OperandBundleDef's interface; NFC
llvm-svn: 253446
2015-11-18 08:30:07 +00:00
Sanjoy Das 2d16145acf Teach the inliner to track deoptimization state
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites.  The operation is described in more detail
in the LangRef changes.

Reviewers: reames, majnemer, chandlerc, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14552

llvm-svn: 253438
2015-11-18 06:23:38 +00:00
Michael Zolotukhin 927bdba29d [PR25538]: Fix a failure caused by r253126.
In r253126 we stopped to recompute LCSSA after loop unrolling in all
cases, except the unrolling is full and at least one of the loop exits
is outside the parent loop. In other cases the transformation should not
break LCSSA, but it turned out, that we also call SimplifyLoop on the
parent loop, which might break LCSSA by itself. This fix just triggers
LCSSA recomputation in this case as well.

I'm committing it without a test case for now, but I'll try to invent
one. It's a bit tricky because in an isolated test LoopSimplify would
be scheduled before LoopUnroll, and thus will change the test and hide
the problem.

llvm-svn: 253253
2015-11-16 21:17:26 +00:00
Davide Italiano ed5cc95d22 [SimplifyLibCalls] Generalize a comment. This doesn't apply only to sqrt.
llvm-svn: 253224
2015-11-16 16:54:28 +00:00
Pavel Labath 978060ce2f Don't generate discriminators for calls to debug intrinsics
Summary:
This fails a check in Verifier.cpp, which checks for location matches between the declared
variable and the !dbg attachments.

Reviewers: dnovillo, dblaikie, danielcdh

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14657

llvm-svn: 253194
2015-11-16 10:40:38 +00:00
Keno Fischer 2ac0c27001 Also map the personality function in CloneFunctionInto
Summary: The Old personality function gets copied over, but the
Materializer didn't  have a chance to inspect it (e.g. to fix up
references to the correct module for the target function).
Also add a verifier check that makes sure the personality routine
is in the same module as the function whose personality it is.

Reviewers: majnemer

Subscribers: jevinskie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14474

llvm-svn: 253183
2015-11-16 05:13:30 +00:00
Teresa Johnson 83d03ddbf6 Fix mapping of unmaterialized global values during metadata linking
Summary:
The patch to move metadata linking after global value linking didn't
correctly map unmaterialized global values to null as desired. They
were in fact mapped to the source copy. It largely worked by accident
since most module linker clients destroyed the source module which
caused the source GVs to be replaced by null, but caused a failure with
LTO linking on Windows:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312869.html

The problem is that a null return value from materializeValueFor is
handled by mapping the value to self. This is the desired behavior when
materializeValueFor is passed a non-GlobalValue. The problem is how to
distinguish that case from the case where we really do want to map to
null.

This patch addresses this by passing in a new flag to the value mapper
indicating that unmapped global values should be mapped to null. Other
Value types are handled as before.

Note that the documented behavior of asserting on unmapped values when
the flag RF_IgnoreMissingValues isn't set is currently disabled with
FIXME notes due to bootstrap failures. I modified these disabled asserts
so when they are eventually enabled again it won't assert for the
unmapped values when the new RF_NullMapMissingGlobalValues flag is set.

I also considered using a callback into the value materializer, but a
flag seemed cleaner given that there are already existing flags.
I also considered modifying materializeValueFor to return the input
value when we want to map to source and then treat a null return
to mean map to null. However, there are other value materializer
subclasses that implement materializeValueFor, and they would all need
to be audited and the return values possibly changed, which seemed
error-prone.

Reviewers: dexonsmith, joker.eph

Subscribers: pcc, llvm-commits

Differential Revision: http://reviews.llvm.org/D14682

llvm-svn: 253170
2015-11-15 14:50:14 +00:00
Michael Zolotukhin 8ef44f93ca Don't recompute LCSSA after loop-unrolling when possible.
Summary:
Currently we always recompute LCSSA for outer loops after unrolling an
inner loop. That leads to compile time problem when we have big loop
nests, and we can solve it by avoiding unnecessary work. For instance,
if w eonly do partial unrolling, we don't break LCSSA, so we don't need
to rebuild it. Also, if all exits from the inner loop are inside the
enclosing loop, then complete unrolling won't break LCSSA either.

I replaced unconditional LCSSA recomputation with conditional recomputation +
unconditional assert and added several tests, which were failing when I
experimented with it.

Soon I plan to follow up with a similar patch for recalculation of dominators
tree.

Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14526

llvm-svn: 253126
2015-11-14 05:51:41 +00:00
Davide Italiano b883b01a8e [SimplifyLibCalls] Make a function shorter. NFC.
llvm-svn: 252970
2015-11-12 23:39:00 +00:00
Diego Novillo 0354a9f67b SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminators
The discriminators pass relied on the presence of llvm.dbg.cu to decide
whether to add discriminators, but this fails in the case where debug
info is only enabled partially when -fprofile-sample-use is active.

The reason llvm.dbg.cu is not present in these cases is to prevent
codegen from emitting debug info (as it is only used for the sample
profile pass).

This changes the discriminators pass to also emit discriminators even
when debug info is not being emitted.

llvm-svn: 252763
2015-11-11 17:54:37 +00:00
Renato Golin 0e77d72b0a Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00
Igor Laevsky 01c3692a10 Strip metadata when speculatively hoisting instructions
This is fix for PR24059.

When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.

This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.

Differential Revision: http://reviews.llvm.org/D14398

llvm-svn: 252604
2015-11-10 14:10:31 +00:00
Dehao Chen 3656e3064b Add discriminators for call instructions that are from the same line and same basic block.
Summary: Call instructions that are from the same line and same basic block needs to have separate discriminators to distinguish between different callsites.

Reviewers: davidxl, dnovillo, dblaikie

Subscribers: dblaikie, probinson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14464

llvm-svn: 252492
2015-11-09 17:30:38 +00:00
Silviu Baranga 2910a4f6b1 Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates
Summary:
LAA currently generates a set of SCEV predicates that must be checked by users.
In the case of Loop Distribute/Loop Load Elimination, no such predicates could have
been emitted, since we don't allow stride versioning. However, in the future there
could be SCEV predicates that will need to be checked.

This change adds support for SCEV predicate versioning in the Loop Distribute, Loop
Load Eliminate and the loop versioning infrastructure.

Reviewers: anemet

Subscribers: mssimpso, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14240

llvm-svn: 252467
2015-11-09 13:26:09 +00:00
Duncan P. N. Exon Smith 83c4b68720 ADT: Remove last implicit ilist iterator conversions, NFC
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress.  This removes them.

I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).

llvm-svn: 252371
2015-11-07 00:01:16 +00:00
Davide Italiano d9f87b4642 [SimplifyLibCalls] Don't hardcode the function name.
llvm-svn: 252342
2015-11-06 21:05:07 +00:00
Sanjoy Das 55ea67cea7 [ValueTracking] Add parameters to isImpliedCondition; NFC
Summary:
This change makes the `isImpliedCondition` interface similar to the rest
of the functions in ValueTracking (in that it takes a DataLayout,
AssumptionCache etc.).  This is an NFC, intended to make a later diff
less noisy.

Depends on D14369

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14391

llvm-svn: 252333
2015-11-06 19:01:08 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Davide Italiano a345877ce8 [SimplifyLibCalls] Use hasFloatVersion(). NFCI.
llvm-svn: 252186
2015-11-05 19:18:23 +00:00
James Molloy 9e959ac397 [SimplifyCFG] Tweak heuristic for merging conditional stores
We were correctly skipping dbginfo intrinsics and terminators, but the initial bailout wasn't, causing it to bail out on almost any block.

llvm-svn: 252152
2015-11-05 08:40:19 +00:00
Davide Italiano 51507d2ad8 [SimplifyLibCalls] New transformation: tan(atan(x)) -> x
This is enabled only under -ffast-math.
So, instead of emitting:
  4007b0:       50                      push   %rax
  4007b1:       e8 8a fd ff ff          callq  400540 <atanf@plt>
  4007b6:       58                      pop    %rax
  4007b7:       e9 94 fd ff ff          jmpq   400550 <tanf@plt>
  4007bc:       0f 1f 40 00             nopl   0x0(%rax)

for:
float mytan(float x) {
  return tanf(atanf(x));
}
we emit a single retq.

Differential Revision:	 http://reviews.llvm.org/D14302

llvm-svn: 252098
2015-11-04 23:36:56 +00:00
James Molloy 4de84ddec9 [SimplifyCFG] Merge conditional stores
We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code:

  if (c & flag1)
    *a = x;
  if (c & flag2)
    *a = y;
  ...

There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to.

It is, however, legal to merge the stores together and do the store once:

  tmp = undef;
  if (c & flag1)
    tmp = x;
  if (c & flag2)
    tmp = y;
  if (c & flag1 || c & flag2)
    *a = tmp;

The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too.

This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure.

The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate.

This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite.

llvm-svn: 252051
2015-11-04 15:28:04 +00:00
Davide Italiano c8a7913f23 [SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)
This one is enabled only under -ffast-math (due to rounding/overflows)
but allows us to emit shorter code.

Before (on FreeBSD x86-64):
4007f0:       50                      push   %rax
4007f1:       f2 0f 11 0c 24          movsd  %xmm1,(%rsp)
4007f6:       e8 75 fd ff ff          callq  400570 <exp2@plt>
4007fb:       f2 0f 10 0c 24          movsd  (%rsp),%xmm1
400800:       58                      pop    %rax
400801:       e9 7a fd ff ff          jmpq   400580 <pow@plt>
400806:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
40080d:       00 00 00

After:
4007b0:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
4007b4:       e9 87 fd ff ff          jmpq   400540 <exp2@plt>
4007b9:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Differential Revision:	http://reviews.llvm.org/D14045

llvm-svn: 251976
2015-11-03 20:32:23 +00:00
Davide Italiano b7487e6b8d [SimplifyLibCalls] Remove variables that are not used. NFC.
llvm-svn: 251852
2015-11-02 23:07:14 +00:00
Davide Italiano e84d4da234 [SimplifyLibCalls] Merge two if statements. NFC.
llvm-svn: 251845
2015-11-02 22:33:26 +00:00
Artur Pilipenko 5c5011d503 Preserve load alignment and dereferenceable metadata during some transformations
Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D13953

llvm-svn: 251809
2015-11-02 17:53:51 +00:00
Davide Italiano 5cdf915191 Simplify a check. NFC.
llvm-svn: 251757
2015-11-01 00:09:16 +00:00
Davide Italiano 396f3eeafb [SimplifyLibCalls] Factor out other common code.
llvm-svn: 251754
2015-10-31 23:17:45 +00:00
Davide Italiano 3817486c47 [SimplifyLibCalls] Remove dead code.
llvm-svn: 251737
2015-10-31 08:28:10 +00:00
Dehao Chen 49359bf3d7 Recommit r251680 (also need to update clang test)
Update the discriminator assignment algorithm

* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.

original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }

; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3

llvm-svn: 251689
2015-10-30 05:07:15 +00:00
Dehao Chen 4d84b9321e Revert r251680:
Update the discriminator assignment algorithm

* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.

original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }

; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3

llvm-svn: 251685
2015-10-30 04:29:05 +00:00
Dehao Chen 9a5d2b18e0 Update the discriminator assignment algorithm
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.

original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }

; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3

llvm-svn: 251680
2015-10-30 02:38:29 +00:00
Dehao Chen 7ddf7865b4 clang-format lib/Transforms/Utils/AddDiscriminators.cpp
llvm-svn: 251656
2015-10-29 21:25:33 +00:00
Philip Reames 846e3e41ed [SimplifyCFG] Constant fold a branch implied by it's incoming edge
The most common use case is when eliminating redundant range checks in an example like the following:
c = a[i+1] + a[i];

Note that all the smarts of the transform (the implication engine) is already in ValueTracking and is tested directly through InstructionSimplify.

Differential Revision: http://reviews.llvm.org/D13040

llvm-svn: 251596
2015-10-29 03:11:49 +00:00
Davide Italiano a904e520c2 [SimplifyLibCalls] Factor out common unsafe-math checks.
llvm-svn: 251595
2015-10-29 02:58:44 +00:00
David Majnemer 492937095f [SimplifyCFG] Don't DCE catchret because the successor is unreachable
CatchReturnInst has side-effects: it runs a destructor.  This destructor
could conceivably run forever/call exit/etc. and should not be removed.

llvm-svn: 251461
2015-10-27 22:43:56 +00:00
Davide Italiano c692688cbd [SimplifyLibCalls] Use range-based loop. No functional change.
llvm-svn: 251383
2015-10-27 04:17:51 +00:00
David Blaikie 94c83370b5 Move the canonical header to the top of its matching cpp file as per coding convention
This ensures that the header will be verified to be standalone (and
avoid mistakes like the one fixed in r251178)

llvm-svn: 251326
2015-10-26 18:40:56 +00:00
Sanjoy Das 15c4c4604f [LCSSA] Unbreak build, don't reuse L; NFC
The build broke in r251248.

llvm-svn: 251251
2015-10-25 19:27:17 +00:00
Sanjoy Das 331521c688 [LCSSA] Use range for loops; NFC
llvm-svn: 251248
2015-10-25 19:08:32 +00:00
Chen Li 7009cd3554 Revert rL251061 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
llvm-svn: 251149
2015-10-23 21:13:01 +00:00
Sanjoy Das 0a1bee8a80 [Inliner] Don't inline through callsites with operand bundles
Summary:
This change teaches the LLVM inliner to not inline through callsites
with unknown operand bundles.  Currently all operand bundles are
"unknown" operand bundles but in the near future we will add support for
inlining through some select kinds of operand bundles.

Reviewers: reames, chandlerc, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14001

llvm-svn: 251141
2015-10-23 20:09:55 +00:00
Chen Li c6e28782d8 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction.  

Reviewers: hfinkel, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13718

llvm-svn: 251061
2015-10-22 20:48:38 +00:00
David Majnemer dc3b67b4ca [SimplifyCFG] Don't use-after-free an SSA value
SimplifyTerminatorOnSelect didn't consider the possibility that the
condition might be related to one of PHI nodes.

This fixes PR25267.

llvm-svn: 250922
2015-10-21 18:22:24 +00:00
Philip Reames a956cc7f08 Revert 250343 and 250344
Turns out this approach is buggy.  In discussion about follow on work, Sanjoy pointed out that we could be subject to circular logic problems.  

Consider:
 if (i u< L) leave()
 if ((i + 1) u< L) leave()
 print(a[i] + a[i+1]) 

If we know that L is less than UINT_MAX, we could possible prove (in a control dependent way) that i + 1 does not overflow.  This gives us:
 if (i u< L) leave()
 if ((i +nuw 1) u< L) leave()
 print(a[i] + a[i+1]) 

If we now do the transform this patch proposed, we end up with:
 if ((i +nuw 1) u< L) leave_appropriately()
 print(a[i] + a[i+1]) 

That would be a miscompile when i==-1.  The problem here is that the control dependent nuw bits got used to prove something about the first condition.  That's obviously invalid.

This won't happen today, but since I plan to enhance LVI/CVP with exactly that transform at some point in the not too distant future...

llvm-svn: 250430
2015-10-15 16:51:00 +00:00
Philip Reames b42db21de8 [SimplifyCFG] Speculatively flatten CFG based on profiling metadata
If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path.

The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences.

Differential Revision: http://reviews.llvm.org/D13070

llvm-svn: 250343
2015-10-14 22:46:19 +00:00
David Majnemer eba62796cb [InlineFunction] Correctly inline TerminatePadInst
We forgot to append the terminatepad's arguments which resulted in us
treating the old terminatepad as an argument to the new terminatepad
causing us to crash immediately.  Instead, add the old terminatepad's
arguments to the new terminatepad.

This fixes PR25155.

llvm-svn: 250234
2015-10-13 22:08:17 +00:00
Duncan P. N. Exon Smith 5b4c837c58 TransformUtils: Remove implicit ilist iterator conversions, NFC
Continuing the work from last week to remove implicit ilist iterator
conversions.  First related commit was probably r249767, with some more
motivation in r249925.  This edition gets LLVMTransformUtils compiling
without the implicit conversions.

No functional change intended.

llvm-svn: 250142
2015-10-13 02:39:05 +00:00
Oliver Stannard 939724cd02 GlobalOpt does not treat externally_initialized globals correctly
GlobalOpt currently merges stores into the initialisers of internal,
externally_initialized globals, but should not do so as the value of the global
may change between the initialiser and any code in the module being run.

llvm-svn: 250035
2015-10-12 13:20:52 +00:00
Sanjoy Das c21a05a3a4 [PlaceSafeopints] Extract out `callsGCLeafFunction`, NFC
Summary:
This will be used in a later change to RewriteStatepointsForGC.

Reviewers: reames, swaroop.sridhar

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13490

llvm-svn: 249777
2015-10-08 23:18:30 +00:00
Sanjoy Das 0015e5a088 [IndVars] Preserve LCSSA in `eliminateIdentitySCEV`
Summary:
After r249211, SCEV can see through some LCSSA phis.  Add a
`replacementPreservesLCSSAForm` check before replacing uses of these phi
nodes with a simplified use of the induction variable to avoid breaking
LCSSA.

Fixes 25047.

Depends on D13460.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13461

llvm-svn: 249575
2015-10-07 17:38:31 +00:00
Hans Wennborg 083ca9bb32 Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D13321

llvm-svn: 249482
2015-10-06 23:24:35 +00:00
Sanjoy Das 5c8bead46d [IndVars] Don't break dominance in `eliminateIdentitySCEV`
Summary:
After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and
Y are related in the dominator tree, even if X is an operand to Y (I've
included a toy example in comments, and a real example as a test case).

This commit changes `SimplifyIndVar` to require a `DominatorTree`.  I
don't think this is a problem because `ScalarEvolution` requires it
anyway.

Fixes PR25051.

Depends on D13459.

Reviewers: atrick, hfinkel

Subscribers: joker.eph, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13460

llvm-svn: 249471
2015-10-06 21:44:49 +00:00
Sanjoy Das 088bb0ea9f [IndVars] Extract out eliminateIdentitySCEV, NFC
Summary:
Reflow a comment while at it.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13459

llvm-svn: 249470
2015-10-06 21:44:39 +00:00
Piotr Padlewski dc9b2cfc50 inariant.group handling in GVN
The most important part required to make clang
devirtualization works ( ͡°͜ʖ ͡°).
The code is able to find non local dependencies, but unfortunatelly
because the caller can only handle local dependencies, I had to add
some restrictions to look for dependencies only in the same BB.

http://reviews.llvm.org/D12992

llvm-svn: 249196
2015-10-02 22:12:22 +00:00
Bruno Cardoso Lopes b491a2d641 [SimplifyLibCalls] Fix instruction misplacement in string/memory libcall optimization
When trying to optimize fortified library functions use the right
location to insert new instructions in order to preserve correct
def-use order.

This fixes an issue where a misplaced instruction definition would
happen to be *after* one of its use after a RAUW, forming invalid IR.
This behavior was introduced by r227250.

Differential Revision: http://reviews.llvm.org/D13301

rdar://problem/22802369

llvm-svn: 249092
2015-10-01 22:43:53 +00:00
Evgeniy Stepanov f608111d1b Fix debug info with SafeStack.
llvm-svn: 248933
2015-09-30 19:55:43 +00:00
Evgeniy Stepanov d8b86f7cdc Move dbg.declare intrinsics when merging and replacing allocas.
Place new and update dbg.declare calls immediately after the
corresponding alloca.

Current code in replaceDbgDeclareForAlloca puts the new dbg.declare
at the end of the basic block. LLVM codegen has problems emitting
debug info in a situation when dbg.declare appears after all uses of
the variable. This usually kinda works for inlining and ASan (two
users of this function) but not for SafeStack (see the pending change
in http://reviews.llvm.org/D13178).

llvm-svn: 248769
2015-09-29 00:30:19 +00:00
Fiona Glaser f74cc40e34 Improve performance of SimplifyInstructionsInBlock
1. Use a worklist, not a recursive approach, to avoid needless
   revisitation and being repeatedly forced to jump back to the
   start of the BB if a handle is invalidated.

2. Only insert operands to the worklist if they become unused
   after a dead instruction is removed, so we don’t have to
   visit them again in most cases.

3. Use a SmallSetVector to track the worklist.

4. Instead of pre-initting the SmallSetVector like in
   DeadCodeEliminationPass, only put things into the worklist
   if they have to be revisited after the first run-through.
   This minimizes how much the actual SmallSetVector gets used,
   which saves a lot of time.

llvm-svn: 248727
2015-09-28 18:56:07 +00:00
Joseph Tremoulet 09af67aba5 [EH] Create removeUnwindEdge utility
Summary:
Factor the code that rewrites invokes to calls and rewrites WinEH
terminators to their "unwind to caller" equivalents into a helper in
Utils/Local, and use it in the three places I'm aware of that need to do
this.


Reviewers: andrew.w.kaylor, majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13152

llvm-svn: 248677
2015-09-27 01:47:46 +00:00
Michael Zolotukhin d56ee06d1f [Unroll] When completely unrolling the loop, replace conditinal branches with unconditional.
Nothing is expected to change, except we do less redundant work in
clean-up.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12951

llvm-svn: 248444
2015-09-23 23:12:43 +00:00
Vedant Kumar ff08e926ba [Inline] Use AssumptionCache from the right Function
This changes the behavior of AddAligntmentAssumptions to match its
comment. I.e, prove the asserted alignment in the context of the caller,
not the callee.

Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko
who also saw a fix for the issue.

rdar://22521387

Differential Revision: http://reviews.llvm.org/D12997

llvm-svn: 248390
2015-09-23 15:49:08 +00:00
Sanjoy Das 2aacc0ecca [SCEV] Introduce ScalarEvolution::getOne and getZero.
Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.

I've refactored the call sites I could find easily to use getZero /
getOne.

Reviewers: hfinkel, majnemer, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12947

llvm-svn: 248362
2015-09-23 01:59:04 +00:00
James Molloy 50a4c27f97 [LoopUtils,LV] Propagate fast-math flags on generated FCmp instructions
We're currently losing any fast-math flags when synthesizing fcmps for
min/max reductions. In LV, make sure we copy over the scalar inst's
flags. In LoopUtils, we know we only ever match patterns with
hasUnsafeAlgebra, so apply that to any synthesized ops.

llvm-svn: 248201
2015-09-21 19:41:19 +00:00
Sanjay Patel 815adacd22 don't repeat function names in comments; NFC
llvm-svn: 247813
2015-09-16 16:21:08 +00:00