Commit Graph

1162 Commits

Author SHA1 Message Date
Shawn Landden 8b142bcc3f [SimplifyCFG] reverting preliminary Switch patches again
This reverts 363226 and 363227, both NFC intended

I swear I fixed the test case that is failing, and ran
the tests, but I will look into it again.

llvm-svn: 363229
2019-06-13 05:26:17 +00:00
Shawn Landden 636220e83c [SimpligyCFG] NFC intended, remove GCD that was only used for powers of two
and replace with an equilivent countTrailingZeros.

GCD is much more expensive than this, with repeated division.

This depends on D60823

Differential Revision: https://reviews.llvm.org/D61151

llvm-svn: 363227
2019-06-13 05:01:44 +00:00
David L. Jones c73fadaa84 Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...'
We have observed some failures with internal builds with this revision.

- Performance regressions:
  - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings).
  - Benchmarks for Abseil's SwissTable.
- Correctness:
  - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP).

hwennborg has already been notified, and is aware of reproducer failures.

llvm-svn: 363220
2019-06-13 02:04:45 +00:00
Hans Wennborg d936e40575 Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
This was reverted in r360086 as it was supected of causing mysterious test
failures internally. However, it was never concluded that this patch was the
root cause.

> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936

llvm-svn: 361811
2019-05-28 12:19:38 +00:00
Shawn Landden 343578759e [SimplifyCFG] back out all SwitchInst commits
They caused the sanitizer builds to fail.

My suspicion is the change the countLeadingZeros().

llvm-svn: 361736
2019-05-26 18:15:51 +00:00
Shawn Landden fa91ab85d9 [SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger
llvm-svn: 361728
2019-05-26 13:55:52 +00:00
Shawn Landden 30111c786f [SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize
Rather than gating on "isSwitchDense" (resulting in necessesarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.

This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.

Be careful to not generate worse code, by introducing a
SubThreshold heuristic.

Instead of just sorting by signed, generalize the finding of the
best base.

And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32768 and 32769, of which
32769 would be interpreted as -32768, and now the code thinks
the table is size 65536.

(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)

And the reason test4 and test5 did not trigger was documented wrong:
it was because they were not considered sufficiently "dense".

Also, fix generation of invalid LLVM-IR: shl by bit-width.

llvm-svn: 361727
2019-05-26 13:55:14 +00:00
Shawn Landden 444eaaf1cc [SimpligyCFG] NFC, remove GCD that was only used for powers of two
and replace with an equilivent countTrailingZeros.

GCD is much more expensive than this, with repeated division.

This depends on D60823

llvm-svn: 361726
2019-05-26 13:54:04 +00:00
Shawn Landden b7cc093db2 [Support] make countLeadingZeros() and countTrailingZeros() return unsigned
This matches countLeadingOnes() and countTrailingOnes(), and
APInt's countLeadingZeros() and countTrailingZeros().

(as well as __builtin_clzll())

llvm-svn: 361724
2019-05-26 13:49:58 +00:00
David Bolvansky 0290a77aa8 [SimplifyCFG] Added condition assumption for unreachable blocks
Summary: PR41688

Reviewers: spatel, efriedma, craig.topper, hfinkel, reames

Reviewed By: hfinkel

Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61409

llvm-svn: 361707
2019-05-25 22:34:27 +00:00
Alina Sbirlea f31eba6494 [MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.

Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60833

llvm-svn: 360270
2019-05-08 17:05:36 +00:00
Orlando Cazalet-Hyams 0d05177337 Test commit access
llvm-svn: 360125
2019-05-07 09:30:55 +00:00
Jordan Rupprecht 8f14e7cacf Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
This reverts r357452 (git commit 21eb771dcb).

This was causing strange optimization-related test failures on an internal test. Will followup with more details offline.

llvm-svn: 360086
2019-05-06 21:55:05 +00:00
Hans Wennborg 21eb771dcb Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)
The original commit caused false positives from AddressSanitizer's
use-after-scope checks, which have now been fixed in r358478.

> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936

llvm-svn: 358483
2019-04-16 12:13:25 +00:00
David L. Jones 8b8a02175a Revert r357452 - 'SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)'
This revision causes tests to fail under ASAN. Since the cause of the failures
is not clear (could be ASAN, could be a Clang bug, could be a bug in this
revision), the safest course of action seems to be to revert while investigating.

llvm-svn: 357667
2019-04-04 02:27:57 +00:00
Hans Wennborg b669fea42f SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)
The code was previously checking that candidates for sinking had exactly
one use or were a store instruction (which can't have uses). This meant
we could sink call instructions only if they had a use.

That limitation seemed a bit arbitrary, so this patch changes it to
"instruction has zero or one use" which seems more natural and removes
the need to special-case stores.

Differential revision: https://reviews.llvm.org/D59936

llvm-svn: 357452
2019-04-02 08:01:38 +00:00
Jeremy Morse 90ede5f4bf [SimplifyCFG] Retain debug info when threading jumps with critical edges
Fixes bug 38023: https://bugs.llvm.org/show_bug.cgi?id=38023

The SimplifyCFG pass will perform jump threading in some cases where
doing so is trivial and would simplify the CFG. When folding a series
of blocks with redundant conditional branches into an unconditional "critical
edge" block, it does not keep the debug location associated with the previous
conditional branch.

This patch fixes the bug described by copying the debug info from the
old conditional branch to the new unconditional branch instruction, and
adds a regression test for the SimplifyCFG pass that covers this case.

Patch by Stephen Tozer!

Differential Revision: https://reviews.llvm.org/D59206

llvm-svn: 355833
2019-03-11 16:23:59 +00:00
Max Kazantsev 20b9189975 [NFC] Rename DontDeleteUselessPHIs --> KeepOneInputPHIs
llvm-svn: 353801
2019-02-12 07:09:29 +00:00
Craig Topper 784929d045 Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

llvm-svn: 353563
2019-02-08 20:48:56 +00:00
James Y Knight 14359ef1b6 [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Jeremy Morse f216da7ee0 [DebugInfo] Remove un-necessary logic from HoistThenElseCodeToIf
Following PR39807, the way in which SimplifyCFG hoists common code on
branch paths was fixed in r347782. However this left extra code hanging
around HoistThenElseCodeToIf that wasn't necessary and needlessly
complicated matters -- we no longer need to look up through the 'if'
basic block to find a location for hoisted 'select' insts, we can instead
use the location chosen by applyMergedLocation.

This patch deletes that extra logic, and updates a regression test to
reflect the new logic (selects get the merged location, not a previous
insts location).

Differential Revision: https://reviews.llvm.org/D55272

llvm-svn: 351058
2019-01-14 12:13:12 +00:00
Michael Kruse 978ba61536 Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.
The current llvm.mem.parallel_loop_access metadata has a problem in that
it uses LoopIDs. LoopID unfortunately is not loop identifier. It is
neither unique (there's even a regression test assigning the some LoopID
to multiple loops; can otherwise happen if passes such as LoopVersioning
make copies of entire loops) nor persistent (every time a property is
removed/added from a LoopID's MDNode, it will also receive a new LoopID;
this happens e.g. when calling Loop::setLoopAlreadyUnrolled()).
Since most loop transformation passes change the loop attributes (even
if it just to mark that a loop should not be processed again as
llvm.loop.isvectorized does, for the versioned and unversioned loop),
the parallel access information is lost for any subsequent pass.

This patch unlinks LoopIDs and parallel accesses.
llvm.mem.parallel_loop_access metadata on instruction is replaced by
llvm.access.group metadata. llvm.access.group points to a distinct
MDNode with no operands (avoiding the problem to ever need to add/remove
operands), called "access group". Alternatively, it can point to a list
of access groups. The LoopID then has an attribute
llvm.loop.parallel_accesses with all the access groups that are parallel
(no dependencies carries by this loop).

This intentionally avoid any kind of "ID". Loops that are clones/have
their attributes modifies retain the llvm.loop.parallel_accesses
attribute. Access instructions that a cloned point to the same access
group. It is not necessary for each access to have it's own "ID" MDNode,
but those memory access instructions with the same behavior can be
grouped together.

The behavior of llvm.mem.parallel_loop_access is not changed by this
patch, but should be considered deprecated.

Differential Revision: https://reviews.llvm.org/D52116

llvm-svn: 349725
2018-12-20 04:58:07 +00:00
Sanjay Patel 7d82d37854 [ValueTracking] add helper function for testing implied condition; NFCI
We were duplicating code around the existing isImpliedCondition() that
checks for a predecessor block/dominating condition, so make that a
wrapper call.

llvm-svn: 348088
2018-12-02 13:26:03 +00:00
Jeremy Morse 9b4cfa55b1 [DebugInfo] Give inlinable calls DILocs (PR39807)
In PR39807 we incorrectly handle circumstances where calls are common'd
from conditional blocks into the parent BB. Calls that can be inlined
must always have DebugLocs, however we strip them during commoning, which
the IR verifier asserts on.

Fix this by using applyMergedLocation: it will perform the same DebugLoc
stripping of conditional Locs, but will also generate an unknown location
DebugLoc that satisfies the requirement for inlinable calls to always have
locations.

Some of the prior logic for selecting a DebugLoc is now likely redundant;
I'll generate a follow-up to remove it (involves editing more regression
tests).

Differential Revision: https://reviews.llvm.org/D54997

llvm-svn: 347782
2018-11-28 17:58:45 +00:00
Vedant Kumar 4de31bba51 [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock
Add methods to BasicBlock which make it easier to efficiently check
whether a block has N (or more) predecessors.

This can be more efficient than using pred_size(), which is a linear
time operation.

We might consider adding similar methods for successors. I haven't done
so in this patch because succ_size() is already O(1).

With this patch applied, I measured a 0.065% compile-time reduction in
user time for running `opt -O3` on the sqlite3 amalgamation (30 trials).
The change in mergeStoreIntoSuccessor alone saves 45 million linked list
iterations in a stage2 Release build of llc.

See llvm.org/PR39702 for a harder but more general way of achieving
similar results.

Differential Revision: https://reviews.llvm.org/D54686

llvm-svn: 347256
2018-11-19 19:54:27 +00:00
Carlos Alberto Enciso fa9cf89734 [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53390

llvm-svn: 346481
2018-11-09 09:42:10 +00:00
Matthias Braun 9fd397b423 ADT/STLExtras: Introduce llvm::empty; NFC
This is modeled after C++17 std::empty().

Differential Revision: https://reviews.llvm.org/D53909

llvm-svn: 345679
2018-10-31 00:23:23 +00:00
Carlos Alberto Enciso 9a24e1a7cd [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
When SimplifyCFG changes the PHI node into a select instruction, the debug line records becomes ambiguous. It causes the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53287

llvm-svn: 345250
2018-10-25 09:58:59 +00:00
Chandler Carruth edb12a838a [TI removal] Make variables declared as `TerminatorInst` and initialized
by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).

llvm-svn: 344502
2018-10-15 10:04:59 +00:00
Carlos Alberto Enciso c0952c8a08 Revert "[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG."
This reverts commit r344120.

It was causing buildbot failures.

llvm-svn: 344135
2018-10-10 12:09:34 +00:00
Carlos Alberto Enciso e7a347e5f8 [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
When SimplifyCFG changes the PHI node into a select instruction, the debug line records becomes ambiguous. It causes the debugger to display unreachable source lines. 

Differential Revision: https://reviews.llvm.org/D52887

llvm-svn: 344120
2018-10-10 08:29:55 +00:00
Craig Topper 029d1ef6eb [SimplifyCFG] Pass AggressiveInsts to DominatesMergePoint by reference. Remove null check.
Summary:
At some point in the past the recursion in DominatesMergePoint used to pass null for AggressiveInsts as part of the recursion. It no longer does this. So there is no way for AggressiveInsts to be null.

This passes it by reference and removes the null check to make this explicit.

Reviewers: efriedma, reames

Reviewed By: efriedma

Subscribers: xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D52575

llvm-svn: 343828
2018-10-04 23:40:31 +00:00
Craig Topper 1d15f7b02b [SimplifyCFG] Change recursive calls to llvm::SimplifyCFG to instead use an outer while loop to revisit.
Summary:
The llvm::SimplifyCFG function creates a SimplifyCFGOpt object and calls run on it. There were numerous places reached from this run function that called back out llvm::SimplifyCFG which would create another SimplifyCFGOpt object. This is an inefficient use of stack space at minimum. We are also not passing along the LoopHeaders pointer passed into the outer llvm::SimplifyCFG call. So if its not null we lose it on the first recursion and get nullptr from there on.

This patch adds an outer loop around the main BasicBlock simplifying code and adds a flag to the SimplifyCFGOpt class that can be set by to request another iteration. I don't think we can iterate based just on the change flag alone since some of the simplifications delete a basic block entirely leaving nothing to iterate on.

Reviewers: bogner, eli.friedman, reames

Reviewed By: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52760

llvm-svn: 343816
2018-10-04 21:11:52 +00:00
Craig Topper d616d33a96 [SimplifyCFG] Use Value::hasNUses instead of 'getNumUses() =='. NFCI
getNumUses is linear in the number of uses. Since we're looking for a specific use count, we can use hasNUses which will stop as soon as it determines there are more than N uses instead of walking all of them.

llvm-svn: 343550
2018-10-01 23:09:52 +00:00
Craig Topper 90c0a0621c [SimplifyCFG] Update comments that refer to CondBB to say ThenBB instead. NFC
There is no variable in this function named CondBB, but there is one named ThenBB and I believe the comments are all refering to it.

llvm-svn: 343548
2018-10-01 22:56:11 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Carlos Alberto Enciso ba4e437c6a [DebugInfo][Dexter] Speculated BB presents illegal variable value to debugger.
When SimplifyCFG changes the PHI node into a select instruction, the debug information becomes ambiguous. It causes the debugger to display wrong variable value. 

Differential Revision: https://reviews.llvm.org/D51976

llvm-svn: 342527
2018-09-19 08:16:56 +00:00
David Green 2352b30c96 [SimplifyCFG] Put an alignment on generated switch tables
Previously the alignment on the newly created switch table data was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it to 16
bytes. This causes unnecessary code bloat.

Differential Revision: https://reviews.llvm.org/D51800

llvm-svn: 342039
2018-09-12 09:54:17 +00:00
Martin Storsjo 22dcddf651 Revert "[SimplifyCFG] Common debug handling [NFC]"
This reverts commit r340997.

This change turned out not to be NFC after all, but e.g. causes
clang to crash when building the linux kernel for aarch64.

llvm-svn: 341031
2018-08-30 08:06:50 +00:00
Philip Reames bed556133d [SimplifyCFG] Rename a variable for readibility of a future change [NFC]
llvm-svn: 341004
2018-08-30 00:12:29 +00:00
Philip Reames 6bd16b5850 [SimplifyCFG] Fix a cost modeling oversight in branch commoning
The cost modeling was not accounting for the fact we were duplicating the instruction once per predecessor.  With a default threshold of 1, this meant we were actually creating #pred copies.

Adding to the fun, there is *absolutely no* test coverage for this.  Simply bailing for more than one predecessor passes all checked in tests.

llvm-svn: 341001
2018-08-30 00:03:02 +00:00
Philip Reames 7c57dac955 [SimplifyCFG] Common debug handling [NFC]
llvm-svn: 340997
2018-08-29 23:22:07 +00:00
Chandler Carruth 9ae926b973 [IR] Replace `isa<TerminatorInst>` with `isTerminator()`.
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.

All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.

llvm-svn: 340701
2018-08-26 09:51:22 +00:00
Chandler Carruth 698fbe7b59 [IR] Sink `isExceptional` predicate to `Instruction`, rename it to
`isExceptionalTermiantor` and implement it for opcodes as well following
the common pattern in `Instruction`.

Part of removing `TerminatorInst` from the `Instruction` type hierarchy
to make it easier to share logic and interfaces between instructions
that are both terminators and not terminators.

llvm-svn: 340699
2018-08-26 08:56:42 +00:00
Chandler Carruth 96fc1de77d [IR] Begin removal of TerminatorInst by removing successor manipulation.
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.

The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.

Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.

This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html

There will be a series of much more mechanical changes following this
one to complete this move.

Differential Revision: https://reviews.llvm.org/D47467

llvm-svn: 340698
2018-08-26 08:41:15 +00:00
Florian Hahn 406f1ff1cd [Local] Make DoesKMove required for combineMetadata.
This patch makes the DoesKMove argument non-optional, to force people
to think about it. Most cases where it is false are either code hoisting
or code sinking, where we pick one instruction from a set of
equal instructions among different code paths.

Reviewers: dberlin, nlopes, efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D47475

llvm-svn: 340606
2018-08-24 11:40:04 +00:00
Justin Bogner 6f1740d52f [SimplifyCFG] Replace some uses of bitwise or with logical or
It's clearer to use logical or for boolean values. Thanks to Steven
Zhang for noticing!

llvm-svn: 340153
2018-08-20 06:37:11 +00:00
Chijun Sima e8263f33d9 [SimplifyCFG] Remove pointer from SmallPtrSet before deletion
Summary:
Previously, `eraseFromParent()` calls `delete` which invalidates the value of the pointer. Copying the value of the pointer later is undefined behavior in C++11 and implementation-defined (which may cause a segfault on implementations having strict pointer safety) in C++14.

This patch removes the BasicBlock pointer from related SmallPtrSet before `delete` invalidates it in the SimplifyCFG pass.

Reviewers: kuhar, dmgreen, davide, trentxintong

Reviewed By: kuhar, dmgreen

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50717

llvm-svn: 339773
2018-08-15 13:56:21 +00:00
Manoj Gupta 77eeac3d9e llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.

This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: efriedma, george.burgess.iv

Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47895

llvm-svn: 336613
2018-07-09 22:27:23 +00:00
Jesper Antonsson 514b6b5796 Comment change to verify commit rights. NFC.
Summary: Just a silly one-character correction.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48709

llvm-svn: 335832
2018-06-28 10:55:04 +00:00
Florian Hahn a1cc848399 Use SmallPtrSet explicitly for SmallSets with pointer types (NFC).
Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This
patch replaces such types with SmallPtrSet, because IMO it is slightly
clearer and allows us to get rid of unnecessarily including SmallSet.h

Reviewers: dblaikie, craig.topper

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D47836

llvm-svn: 334492
2018-06-12 11:16:56 +00:00
Craig Topper 61998289f9 Use SmallPtrSet instead of SmallSet in places where we iterate over the set.
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.

These places were found by hiding the begin/end methods in the SmallSet forwarding

llvm-svn: 334343
2018-06-09 05:04:20 +00:00
David Blaikie 31b98d2e99 Move Analysis/Utils/Local.h back to Transforms
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)

llvm-svn: 333954
2018-06-04 21:23:21 +00:00
David Stenberg 0af67e5b65 [SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest()
Summary:
Fix a case where FoldBranchToCommonDest() would bail out from doing CSE
when encountering a debug intrinsic. Handle that by skipping past the
debug intrinsics.

Also, as a minor refactoring, rename checkCSEInPredecessor() to
tryCSEWithPredecessor() to make it a bit more clear that the function
may remove instructions.

Reviewers: fhahn, craig.topper, dblaikie, xbolva00

Reviewed By: fhahn, xbolva00

Subscribers: vsk, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D46635

llvm-svn: 332698
2018-05-18 08:52:15 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Vedant Kumar e0b5f86b30 [STLExtras] Add distance() for ranges, pred_size(), and succ_size()
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.

Differential Revision: https://reviews.llvm.org/D46668

llvm-svn: 332057
2018-05-10 23:01:54 +00:00
Shiva Chen 2c864551df [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is

!DILabel(scope: !1, name: "foo", file: !2, line: 3)

We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is

llvm.dbg.label(metadata !1)

It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.

We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.

Differential Revision: https://reviews.llvm.org/D45024

Patch by Hsiangkai Wang.

llvm-svn: 331841
2018-05-09 02:40:45 +00:00
George Burgess IV f9d26af4ea Range-ify for loop; NFC
llvm-svn: 331582
2018-05-05 04:52:26 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Florian Hahn 3df8844b92 [SimplifyCFG] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC).
This patch updates some code responsible the skip debug info to use
BasicBlock::instructionsWithoutDebug. I think this makes things slightly
simpler and more direct.

Reviewers: aprantl, vsk, hans, danielcdh

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D46252

llvm-svn: 331221
2018-04-30 20:10:53 +00:00
Mandeep Singh Grang 636d94db3b [Transforms] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: kcc, pcc, danielcdh, jmolloy, sanjoy, dberlin, ruiu

Reviewed By: ruiu

Subscribers: ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D45142

llvm-svn: 330059
2018-04-13 19:47:57 +00:00
Craig Topper 7d3aba6687 [SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store.
Summary:
Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors.

This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store.

Reviewers: efriedma, jmolloy, ABataev

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39760

llvm-svn: 329142
2018-04-04 03:47:17 +00:00
Matt Morehouse 236cdaf84c [SimplifyCFG] Create attribute for fuzzing-specific optimizations.
Summary:
When building with libFuzzer, converting control flow to selects or
obscuring the original operands of CMPs reduces the effectiveness of
libFuzzer's heuristics.

This patch provides an attribute to disable or modify certain optimizations
for optimal fuzzing signal.

Provides a less aggressive alternative to https://reviews.llvm.org/D44057.

Reviewers: vitalybuka, davide, arsenm, hfinkel

Reviewed By: vitalybuka

Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44232

llvm-svn: 328214
2018-03-22 17:07:51 +00:00
David Blaikie 2be3922807 Fix a couple of layering violations in Transforms
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.

Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.

Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.

llvm-svn: 328165
2018-03-21 22:34:23 +00:00
Ulrich Weigand f4ceef8d3f [Debug] Retain both copies of debug intrinsics in HoistThenElseCodeToIf
When hoisting common code from the "then" and "else" branches of a condition
to before the "if", the HoistThenElseCodeToIf routine will attempt to merge
the debug location associated with the two original copies of the hoisted
instruction.

This is a problem in the special case where the hoisted instruction is a
debug info intrinsic, since for those the debug location is considered
part of the intrinsic and attempting to modify it may resut in invalid
IR.  This is the underlying cause of PR36410.

This patch fixes the problem by handling debug info intrinsics specially:
instead of hoisting one copy and merging the two locations, the code now
simply hoists both copies, each with its original location intact.  Note
that this is still only done in the case where both original copies are
otherwise (i.e. apart from location metadata) identical.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D44312

llvm-svn: 327622
2018-03-15 12:28:48 +00:00
Ulrich Weigand 019dd2316d Revert "[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf"
This reverts commit r327175 as problems in debug info generation were shown.

llvm-svn: 327176
2018-03-09 22:00:10 +00:00
Ulrich Weigand fa4e63c0d6 [Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf
When hoisting common code from the "then" and "else" branches of a condition
to before the "if", there is no need to require that debug intrinsics match
before moving them (and merging them).  Instead, we can simply always keep
all debug intrinsics from both sides of the "if".

This fixes PR36410, which describes a problem where as a result of the attempt
to merge debug locations for two debug intrinsics we end up with an invalid
intrinsic, where the scope indicated in the !dbg location no longer matches
the scope of the variable tracked by the intrinsic.

In addition, this has the benefit that we no longer throw away information
that is actually still valid, helping to generate better debug data.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D44312

llvm-svn: 327175
2018-03-09 21:37:07 +00:00
Serguei Katkov 66182d6c38 [SimplifyCFG] Re-apply Relax restriction for folding unconditional branches
The commit rL308422 introduces a restriction for folding unconditional
branches. Specifically if empty block with unconditional branch leads to
header of the loop then elimination of this basic block is prohibited.
However it seems this condition is redundantly strict.
If elimination of this basic block does not introduce more back edges
then we can eliminate this block.

The patch implements this relax of restriction.

The test profile/Linux/counter_promo_nest.c in compiler-rt project
is updated to meet this change.

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl	
Reviewed By: pacxx
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42691

llvm-svn: 324572
2018-02-08 07:16:29 +00:00
Serguei Katkov 276b32bb14 Revert [SimplifyCFG] Relax restriction for folding unconditional branches
The patch causes the failure of the test
compiler-rt/test/profile/Linux/counter_promo_nest.c

To unblock buildbot, revert the patch while investigation is in progress.

Differential Revision: https://reviews.llvm.org/D42691

llvm-svn: 324214
2018-02-05 09:05:43 +00:00
Serguei Katkov 6e93980e82 [SimplifyCFG] Relax restriction for folding unconditional branches
The commit rL308422 introduces a restriction for folding unconditional
branches. Specifically if empty block with unconditional branch leads to
header of the loop then elimination of this basic block is prohibited.
However it seems this condition is redundantly strict.
If elimination of this basic block does not introduce more back edges
then we can eliminate this block.

The patch implements this relax of restriction.

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl	
Reviewed By: pacxx
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42691

llvm-svn: 324208
2018-02-05 07:56:43 +00:00
Marcello Maggioni ddccd50313 [NFC] Commit to mention that r322248 is actually made by AndrewScheidecker
llvm-svn: 322249
2018-01-11 02:06:28 +00:00
Marcello Maggioni 7083423f22 [SimplifyCFG] Add cut-off for InitializeUniqueCases.
The function can take a significant amount of time on some
complicated test cases, but for the currently only use of
the function we can stop the initialization much earlier
when we find out we are going to discard the result anyway
in the caller of the function.

Adding configurable cut-off points so that we avoid wasting time.
NFCI.

llvm-svn: 322248
2018-01-11 02:01:16 +00:00
Davide Italiano 86b7949f62 [SimplifyCFG] Return to the pass manager the correct value.
I wanted to commit this with r321603, but I failed to squash
the two commits.

llvm-svn: 321606
2017-12-31 16:54:03 +00:00
Davide Italiano 9f074fe915 [SimplifyCFG] Stop hoisting musttail calls incorrectly.
PR35774.

llvm-svn: 321603
2017-12-31 16:47:16 +00:00
Benjamin Kramer c7fc81e659 Use phi ranges to simplify code. No functionality change intended.
llvm-svn: 321585
2017-12-30 15:27:33 +00:00
Guozhi Wei 29697c13bc Revert r321377, it causes regression to https://reviews.llvm.org/P8055.
llvm-svn: 321528
2017-12-28 17:02:34 +00:00
Guozhi Wei 33250340f4 [SimplifyCFG] Don't do if-conversion if there is a long dependence chain
If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor.

This patch checks for the long dependence chain, and give up if-conversion if find one.

Differential Revision: https://reviews.llvm.org/D39352

llvm-svn: 321377
2017-12-22 18:54:04 +00:00
Michael Zolotukhin ad371e0caa [SimplifyCFG] Avoid quadratic on a predecessors number behavior in instruction sinking.
If a block has N predecessors, then the current algorithm will try to
sink common code to this block N times (whenever we visit a
predecessor). Every attempt to sink the common code includes going
through all predecessors, so the complexity of the algorithm becomes
O(N^2).
With this patch we try to sink common code only when we visit the block
itself. With this, the complexity goes down to O(N).
As a side effect, the moment the code is sunk is slightly different than
before (the order of simplifications has been changed), that's why I had
to adjust two tests (note that neither of the tests is supposed to test
SimplifyCFG):
* test/CodeGen/AArch64/arm64-jumptable.ll - changes in this test mimic
the changes that previous implementation of SimplifyCFG would do.
* test/CodeGen/ARM/avoid-cpsr-rmw.ll - in this test I disabled common
code sinking by a command line flag.

llvm-svn: 321236
2017-12-21 01:22:13 +00:00
Sanjay Patel 0ab0c1a201 [SimplifyCFG] don't sink common insts too soon (PR34603)
This should solve:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run.
It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the
sinking transform later in the optimization pipeline.

Differential Revision: https://reviews.llvm.org/D38566

llvm-svn: 320749
2017-12-14 22:05:20 +00:00
Mikael Holmen 0a3e98062f Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by JesperAntonsson.

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

llvm-svn: 319768
2017-12-05 14:14:00 +00:00
Mikael Holmen 9c13c8b6ec Revert r319537: Bail out of a SimplifyCFG switch table opt at undef values.
Broke build bots so reverting.

llvm-svn: 319539
2017-12-01 13:11:39 +00:00
Mikael Holmen 9f047795fb Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by: JesperAntonsson

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

llvm-svn: 319537
2017-12-01 12:30:49 +00:00
Davide Italiano acf6065183 [SimplifyCFG] Use auto * when the type is obvious. NFCI.
llvm-svn: 317923
2017-11-10 20:46:21 +00:00
Easwaran Raman 0a0913def2 Add a wrapper function to set branch weights metadata.
Summary:
This wrapper checks if there is at least one non-zero weight before
setting the metadata.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39872

llvm-svn: 317845
2017-11-09 22:52:20 +00:00
Craig Topper 12463779d3 [SimplifyCFG] When merging conditional stores, don't count the store we're merging against the PHINodeFoldingThreshold
Merging conditional stores tries to check to see if the code is if convertible after the store is moved. But the store hasn't been moved yet so its being counted against the threshold.

The patch adds 1 to the threshold comparison to make sure we don't count the store. I've adjusted a test to use a lower threshold to ensure we still do that conversion with the lower threshold.

Differential Revision: https://reviews.llvm.org/D39570

llvm-svn: 317368
2017-11-03 21:08:13 +00:00
Bjorn Pettersson e73b85d1ab [SimplifyCFG] Discard speculated dbg intrinsics
Summary:
SpeculativelyExecuteBB can flatten the CFG by doing
speculative execution followed by a select instruction.
When the speculatively executed BB contained dbg intrinsics
the result could be a little bit weird, since those dbg
intrinsics were inserted before the select in the flattened
CFG. So when single stepping in the debugger, printing the
value of the variable referenced in the dbg intrinsic, it
could happen that it looked like the variable had values
that never actually were assigned to the variable.

This patch simply discards all dbg intrinsics that were found
in the speculatively executed BB.

Reviewers: aprantl, chandlerc, craig.topper

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39494

llvm-svn: 317198
2017-11-02 11:55:14 +00:00
Craig Topper 7c7fcabd3f [SimplifyCFG] Use a more generic name for the selects created by SpeculativelyExecuteBB to prevent long names from being created
Currently the selects are created with the names of their inputs concatenated together. It's possible to get cases that chain these selects together resulting in long names due to multiple levels of concatenation. Our internal branch of llvm managed to generate names over 100000 characters in length on a particular test due to an extreme compounding of the names.

This patch changes the name to a generic name that is not dependent on its inputs.

Differential Revision: https://reviews.llvm.org/D39440

llvm-svn: 317024
2017-10-31 19:03:51 +00:00
Eugene Zelenko 5adb96cc92 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316630
2017-10-26 00:55:39 +00:00
Sanjay Patel b80daf0b48 [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early. 
In fact, it's questionable whether this transform belongs in SimplifyCFG 
at all. I'll look at moving this to codegen as a follow-up step.

llvm-svn: 316298
2017-10-22 19:10:07 +00:00
Sanjay Patel 24226504a7 [SimplifyCFG] try harder to forward switch condition to phi (PR34471)
The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen:

  int switcher(int x) {
      switch(x) {
      case 17: return 17;
      case 19: return 19;
      case 42: return 42;
      default: break;
      }
      return 0;
    }

  int comparator(int x) {
    if (x == 17) return 17;
    if (x == 19) return 19;
    if (x == 42) return 42;
    return 0;
  }

For the first example, we use a bit-test optimization to avoid a series of compare-and-branch:
https://godbolt.org/g/BivDsw

Differential Revision: https://reviews.llvm.org/D39011

llvm-svn: 316293
2017-10-22 16:51:03 +00:00
Vitaly Buka 524c0a639d Fix signed overflow detected by ubsan
This overflow does not affect algorithm, so just suppress it.

llvm-svn: 316018
2017-10-17 18:33:15 +00:00
Sanjay Patel 30f30d37fb [SimplifyCFG] use range-for-loops, tidy; NFCI
There seems to be something missing here as shown in PR34471:
https://bugs.llvm.org/show_bug.cgi?id=34471 

llvm-svn: 315855
2017-10-15 14:43:39 +00:00
Sanjay Patel 4c33d5213b [SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI
This is a follow-up to https://reviews.llvm.org/D38138. 

I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping 
any options by using default param values.

llvm-svn: 314930
2017-10-04 20:26:25 +00:00
Dehao Chen f464627f28 Update getMergedLocation to check the instruction type and merge properly.
Summary: If the merged instruction is call instruction, we need to set the scope to the closes common scope between 2 locations, otherwise it will cause trouble when the call is getting inlined.

Reviewers: dblaikie, aprantl

Reviewed By: dblaikie, aprantl

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37877

llvm-svn: 314694
2017-10-02 18:13:14 +00:00
Sanjay Patel 0f9b4773c1 [SimplifyCFG] add a struct to house optional folds (PR34603)
This was intended to be no-functional-change, but it's not - there's a test diff.

So I thought I should stop here and post it as-is to see if this looks like what was expected 
based on the discussion in PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603

Notes:
 1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried 
    through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'. 
    The parameter isn't passed down, so we pick up the default value from the function signature 
    after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the 
    'SimplifyCFG' calls.

 2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops. 
    This would theoretically allow us to differentiate the transforms controlled by those params 
    independently.

 3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too. 
    I just stopped here to minimize the diffs.

 4. Similarly, I stopped short of messing with the pass manager layer. I have another question that 
    could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG 
    set to true no matter where in the pipeline it's creating SimplifyCFG passes?

    // Create an early function pass manager to cleanup the output of the
    // frontend.
    EarlyFPM.addPass(SimplifyCFGPass());

    -->

    /// \brief Construct a pass with the default thresholds
    /// and switch optimizations.
    SimplifyCFGPass::SimplifyCFGPass()
       : BonusInstThreshold(UserBonusInstThreshold),
         LateSimplifyCFG(true) {}   <-- switches get converted to lookup tables and loops may not be in canonical form

    If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG' 
    setting via recursion was masking this bug.

Differential Revision: https://reviews.llvm.org/D38138

llvm-svn: 314308
2017-09-27 14:54:16 +00:00
Sanjay Patel 73811a152a [SimplifyCFG] don't create a no-op subtract
I noticed this inefficiency while investigating PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603

This fix will likely push another bug (we don't maintain state of 'LateSimplifyCFG') 
into hiding, but I'll try to clean that up with a follow-up patch anyway.

llvm-svn: 313829
2017-09-20 22:31:35 +00:00
Sanjay Patel ca14697c2b [SimplifyCFG] fix typos/formatting; NFC
llvm-svn: 313671
2017-09-19 20:58:14 +00:00
Alexey Bataev 978e2e4760 [SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores.
Summary:
If SimplifyCFG pass is able to merge conditional stores into single one,
it loses the alignment. This may lead to incorrect codegen. Patch
sets the alignment of the new instruction if it is set in the original
one.

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36841

llvm-svn: 312030
2017-08-29 20:06:24 +00:00
Dehao Chen 191b24d3d2 revert r310985 which breaks for the following case:
struct string {
  ~string();
};
void f2();
void f1(int) { f2(); }
void run(int c) {
  string body;
  while (true) {
    if (c)
      f1(c);
    else
      f1(c);
  }
}

Will recommit once the issue is fixed.

llvm-svn: 311864
2017-08-27 22:22:39 +00:00
Dehao Chen 84d412035a Merge debug info when hoist then-else code to if.
Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached.

Reviewers: aprantl, probinson, dblaikie, echristo, loladiro

Reviewed By: aprantl

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D36778

llvm-svn: 310985
2017-08-16 01:55:26 +00:00
Craig Topper 36af40c115 [SimplifyCFG] Fix typo in comment. NFC
llvm-svn: 309785
2017-08-02 02:34:16 +00:00
Chad Rosier dfd1de687d [Value Tracking] Default argument to true and rename accordingly. NFC.
IMHO this is a bit more readable.

llvm-svn: 309739
2017-08-01 20:18:54 +00:00
Sumanth Gundapaneni 8d50a50e98 [SimplifyCFG] Make the no-jump-tables attribute also disable switch lookup tables
Differential Revision: https://reviews.llvm.org/D35579

llvm-svn: 309444
2017-07-28 22:25:40 +00:00
Balaram Makam b05a55787a [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary:
When simplifying unconditional branches from empty blocks, we pre-test if the
BB belongs to a set of loop headers and keep the block to prevent passes from
destroying canonical loop structure. However, the current algorithm fails if
the destination of the branch is a loop header. Especially when such a loop's
latch block is folded into loop header it results in additional backedges and
LoopSimplify turns it into a nested loop which prevent later optimizations
from being applied (e.g., loop  unrolling and loop interleaving).

This patch augments the existing algorithm by further checking if the
destination of the branch belongs to a set of loop headers and defer
eliminating it if yes to LateSimplifyCFG.

Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl

Reviewed By: efriedma

Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35411

llvm-svn: 308422
2017-07-19 08:53:34 +00:00
Craig Topper dfd01ea9ed [SimplifyCFG] Move a portion of an if statement that should already be implied to an assert
Summary: In this code we got to Dom by following the predecessor link of BB. So it stands to reason that BB should also show up as a successor of Dom's terminator right? There isn't a way to have the CFG connect in only one direction is there?

Reviewers: jmolloy, davide, mcrosier

Reviewed By: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D35025

llvm-svn: 307276
2017-07-06 16:29:43 +00:00
Sumanth Gundapaneni 5372f0a73e [SimplifyCFG] Update the name of switch generated lookup table.
This patch appends the name of the function to the switch generated lookup
table. This will ease the visual debugging in identifying the function the table
is generated from.

Differential Revision: https://reviews.llvm.org/D34817

llvm-svn: 306867
2017-06-30 20:00:01 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
James Molloy a929063233 [GVNSink] GVNSink pass
This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts.

This pass attempts to sink instructions into successors, reducing static
instruction count and enabling if-conversion.
We use a variant of global value numbering to decide what can be sunk.
Consider:

[ %a1 = add i32 %b, 1  ]   [ %c1 = add i32 %d, 1  ]
[ %a2 = xor i32 %a1, 1 ]   [ %c2 = xor i32 %c1, 1 ]
                 \           /
           [ %e = phi i32 %a2, %c2 ]
           [ add i32 %e, 4         ]

GVN would number %a1 and %c1 differently because they compute different
results - the VN of an instruction is a function of its opcode and the
transitive closure of its operands. This is the key property for hoisting
and CSE.

What we want when sinking however is for a numbering that is a function of
the *uses* of an instruction, which allows us to answer the question "if I
replace %a1 with %c1, will it contribute in an equivalent way to all
successive instructions?". The (new) PostValueTable class in GVN provides this
mapping.

This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG.

Differential revision: https://reviews.llvm.org/D24805

llvm-svn: 303850
2017-05-25 12:51:11 +00:00
Craig Topper 8205a1a9b6 [ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object.
This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits.

Differential Revision: https://reviews.llvm.org/D33431

llvm-svn: 303773
2017-05-24 16:53:07 +00:00
Craig Topper e777fed152 [SimplifyCFG] Prevent a few APInt copies on method calls that return const reference. NFCI
llvm-svn: 303523
2017-05-22 00:49:35 +00:00
Reid Kleckner 96ab8726a3 [IR] De-virtualize ~Value to save a vptr
Summary:
Implements PR889

Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:

https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing

This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal.  However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue.  I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.

I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.

Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv

Reviewed By: chandlerc

Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D31261

llvm-svn: 303362
2017-05-18 17:24:10 +00:00
Craig Topper 7e3e7afca8 [ConstantRange][SimplifyCFG] Add a helper method to allow SimplifyCFG to determine if a ConstantRange has more than 8 elements without requiring an allocation if the ConstantRange is 64-bits wide.
Previously SimplifyCFG used getSetSize which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64-bits wide, this requires returning a 65-bit APInt. APInt's can only store 64-bits without a memory allocation so this is inefficient.

The new method takes the 8 as an input and tells if the range contains more than that many elements without requiring any wider math.

llvm-svn: 302385
2017-05-07 22:22:11 +00:00
Daniel Berlin 4d0fe64ae3 Kill off the old SimplifyInstruction API by converting remaining users.
llvm-svn: 301673
2017-04-28 19:55:38 +00:00
Craig Topper b45eabcf82 [ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.

Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.

I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.

Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.

Differential Revision: https://reviews.llvm.org/D32376

llvm-svn: 301432
2017-04-26 16:39:58 +00:00
Craig Topper f3dbd17d0a [APInt] Use isSubsetOf, intersects, and bit counting methods to reduce temporary APInts
This patch uses various APInt methods to reduce temporary APInt creation.

This should be all of the unrelated cleanups that got buried in D32376(creating a KnownBits struct) as well as some pointed out by Simon during the review of that. Plus a few improvements to use counting instead of masking.

I've left out any places where we do something like (KnownZero & KnownOne) != 0 as I plan to add a helper method to KnownBits to ask that question and didn't want to thrash that code an additional time.

Differential Revision: https://reviews.llvm.org/D32495

llvm-svn: 301338
2017-04-25 17:46:30 +00:00
Mandeep Singh Grang 799a2edb3d [SimplifyCFG] Fix for non-determinism in codegen
Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718

Reviewers: majnemer, chenli, davide

Reviewed By: davide

Subscribers: davide, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D26726

llvm-svn: 301222
2017-04-24 19:20:45 +00:00
Craig Topper 7af078847c [SimplifyCFG] Fix the determination of PostBB in conditional store merging to handle the targets on the second branch being commuted
Currently we choose PostBB as the single successor of QFB, but its possible that QTB's single successor is QFB which would make QFB the correct choice.

Differential Revision: https://reviews.llvm.org/D32323

llvm-svn: 300992
2017-04-21 15:53:42 +00:00
Craig Topper c228068d90 [SimplifyCFG] Use hasNUses instead of comparing getNumUses to a constant."
The use list is a linked list so getNumUses requires a linear scan through the whole list. hasNUses will stop scanning at N and see if that is the end.

llvm-svn: 300505
2017-04-17 22:13:00 +00:00
Chandler Carruth 927d8e610a [IR] Redesign the case iterator in SwitchInst to actually be an iterator
and to expose a handle to represent the actual case rather than having
the iterator return a reference to itself.

All of this allows the iterator to be used with common STL facilities,
standard algorithms, etc.

Doing this exposed some missing facilities in the iterator facade that
I've fixed and required some work to the actual iterator to fully
support the necessary API.

Differential Revision: https://reviews.llvm.org/D31548

llvm-svn: 300032
2017-04-12 07:27:28 +00:00
Joerg Sonnenberger fa7367428a Split the SimplifyCFG pass into two variants.
The first variant contains all current transformations except
transforming switches into lookup tables. The second variant
contains all current transformations.

The switch-to-lookup-table conversion results in code that is more
difficult to analyze and optimize by other passes. Most importantly,
it can inhibit Dead Code Elimination. As such it is often beneficial to
only apply this transformation very late. A common example is inlining,
which can often result in range restrictions for the switch expression.

Changes in execution time according to LNT:
SingleSource/Benchmarks/Misc/fp-convert +3.03%
MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20%
MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43%
and a couple of smaller changes. For perimeter it also results 2.6%
a smaller binary.

Differential Revision: https://reviews.llvm.org/D30333

llvm-svn: 298799
2017-03-26 06:44:08 +00:00
Chandler Carruth 0d256c0f5d [IR] Make SwitchInst::CaseIt almost a normal iterator.
This moves it to the iterator facade utilities giving it full random
access semantics, etc. It can also now be used with standard algorithms
like std::all_of and std::any_of and range adaptors like llvm::reverse.

Also make the semantics of iterating match what every other iterator
uses and forbid decrementing past the begin iterator. This was used as
a hacky way to work around iterator invalidation. However, every
instance trying to do this failed to actually avoid touching invalid
iterators despite the clear documentation that the removed and all
subsequent iterators become invalid including the end iterator. So I've
added a return of the next iterator to removeCase and rewritten the
loops that were doing this to correctly follow the iterator pattern of
either incremneting or removing and assigning fresh values to the
iterator and the end.

In one case we were trying to go backwards to make this cleaner but it
doesn't actually work. I've made that code match the code we use
everywhere else to remove cases as we iterate. This changes the order of
cases in one test output and I moved that test to CHECK-DAG so it
wouldn't care -- the order isn't semantically meaningful anyways.

llvm-svn: 298791
2017-03-26 02:49:23 +00:00
Benjamin Kramer 46f5e2c47b Make GCC happy again.
llvm-svn: 298702
2017-03-24 14:15:35 +00:00
Aditya Kumar 24f6ad51bb Fix: Refactor SimplifyCFG:canSinkInstructions [NFC]
Differential Revision: https://reviews.llvm.org/D30116

llvm-svn: 297955
2017-03-16 14:09:18 +00:00
Eric Liu 8c7d28b2f1 Revert "Refactor SimplifyCFG:canSinkInstructions [NFC]"
This reverts commit r297839, which breaks Transforms/SimplifyCFG/sink-common-code.ll

llvm-svn: 297845
2017-03-15 15:29:42 +00:00
Aditya Kumar ee55bf3e34 Refactor SimplifyCFG:canSinkInstructions [NFC]
llvm-svn: 297839
2017-03-15 14:26:45 +00:00
Craig Topper b9dbd4d596 [SimplifyCFG] Use APInt::operator| instead of APInt::Or. NFC
I'm looking to improve operator| to support rvalue references and may remove APInt::Or.

llvm-svn: 296982
2017-03-05 01:08:19 +00:00
Peter Collingbourne 0609acc10d SimplifyCFG: Register cloned assume intrinsics with assumption cache when creating critical edge.
Differential Revision: https://reviews.llvm.org/D29976

llvm-svn: 295145
2017-02-15 03:01:11 +00:00
Taewook Oh 505a25aec5 [InstCombine] Merge DebugLoc when speculatively hoisting store instruction
Summary: Along with https://reviews.llvm.org/D27804, debug locations need to be merged when hoisting store instructions as well. Not sure if just dropping debug locations would make more sense for this case, but as the branch instruction will have at least different discriminator with the hoisted store instruction, I think there will be no difference in practice.

Reviewers: aprantl, andreadb, danielcdh

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29062

llvm-svn: 293372
2017-01-28 07:05:43 +00:00
Akira Hatanaka 4ec7b20ef6 [SimplifyCFG] Do not sink and merge inline-asm instructions.
Conservatively disable sinking and merging inline-asm instructions as doing so
can potentially create arguments that cannot satisfy the inline-asm constraints.

For example, SimplifyCFG used to do the following transformation:

(before)
if.then:
  %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 8)
  br label %if.end
if.else:
  %1 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 6)
  br label %if.end

(after)
  %.sink = select i1 %tobool, i32 6, i32 8
  %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 %.sink)

This would result in a crash in the backend since only immediate integer operands
are permitted for constraint "n".

rdar://problem/30110806

Differential Revision: https://reviews.llvm.org/D29111

llvm-svn: 293025
2017-01-25 06:21:51 +00:00
Amaury Sechet 2fec7e4f44 Tweak ASCII art in Simplify CFG. NFC
llvm-svn: 292792
2017-01-23 15:13:01 +00:00
Robert Lougher b0124c1eb8 [DebugInfo] Remove redundant check in SimplifyCFG; NFC.
llvm-svn: 291813
2017-01-12 21:11:09 +00:00
Robert Lougher 6717a6fe54 [DebugInfo] DILocation variable declaration should be const; NFC.
llvm-svn: 291787
2017-01-12 18:33:49 +00:00
Robert Lougher 5bf0416f45 Reapply "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
This reapplies r289828 (reverted in r289833 as it broke the address sanitizer).  The
debugloc is now only set when the instruction is not a call, as this causes the
verifier to assert (the inliner requires an inlinable callsite to have a debug loc
if the caller and callee have debug info).

Original commit message:

Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Original review: https://reviews.llvm.org/D27590

llvm-svn: 290973
2017-01-04 17:40:32 +00:00
Daniel Jasper aec2fa352f Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Michael Kuperstein 3ca147ea3d Preserve loop metadata when folding branches to a common destination.
Differential Revision: https://reviews.llvm.org/D27830

llvm-svn: 289992
2016-12-16 21:23:59 +00:00
Andrea Di Biagio f20c57eca9 [SimplifyCFG] Merge debug locations when hoisting an instruction from a then/else branch. NFC.
Now that a new API to merge debug locations has been committed at r289661 (see
review D26256 for more details), we can use it to "improve" the code added by
revision r280995.

Instead of nulling the debugloc of a commoned instruction, we use the 'merged'
debug location. At the moment, this is just a no functional change since
function `DILocation::getMergedLocation()` is just a stub and would always
return a null location.

Differential Revision: https://reviews.llvm.org/D27804

llvm-svn: 289862
2016-12-15 20:01:26 +00:00
Robert Lougher 6ea759a83e Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
Reverting as it is causing buildbot failures (address sanitizer).

llvm-svn: 289833
2016-12-15 16:59:13 +00:00
Robert Lougher cf17674211 [SimplifyCFG] In sinkLastInstruction correctly set debugloc of "common" inst
Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Differential Revision: https://reviews.llvm.org/D27590

llvm-svn: 289828
2016-12-15 16:17:53 +00:00
Hal Finkel 3ca4a6bcf1 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Peter Collingbourne ab85225be4 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Eugene Zelenko a3fe70d233 Fix some Clang-tidy and Include What You Use warnings; other minor fixes (NFC).
This preparation to remove SetVector.h dependency on SmallSet.h.

llvm-svn: 288256
2016-11-30 17:48:10 +00:00
Dehao Chen 018a3afa99 Ignore debug info when making optimization decisions in SimplifyCFG.
Summary: Debug info should *not* affect code generation. This patch properly handles debug info to make sure the generated code are the same with or without debug info.

Reviewers: davidxl, mzolotukhin, jmolloy

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D25286

llvm-svn: 284415
2016-10-17 19:28:44 +00:00
Oliver Stannard fe4432b105 [SimplifyCFG] Don't lower complex ConstantExprs to lookup tables
Not all ConstantExprs can be represented by a global variable, for example most
pointer arithmetic other than addition of a constant, so we can't convert these
values from switch statements to lookup tables.

Differential Revision: https://reviews.llvm.org/D25550

llvm-svn: 284379
2016-10-17 12:00:24 +00:00
Benjamin Kramer d8b079708d [SimplifyCFG] Use the error checking provided by getPrevNode.
BasicBlock::size is O(insts), making this loop O(blocks*insts), which
can be really slow on generated code. getPrevNode already checks if
we're at the beginning of the block and returns nullptr if so, just use
that instead. No functionality change intended.

llvm-svn: 284303
2016-10-15 13:15:05 +00:00
Sanjoy Das bc357e8fa3 [SimplifyCFG] Don't create PHI nodes for constant bundle operands
Summary:
Constant bundle operands may need to retain their constant-ness for
correctness.  I'll admit that this is slightly odd, but it looks like
SimplifyCFG already does this for things like @llvm.frameaddress and
@llvm.stackmap, so I suppose adding one more case is not a big deal.

It is possible to add a mechanism to denote bundle operands that need to
remain constants, but that's probably too complicated for the time
being.

Reviewers: jmolloy

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D25502

llvm-svn: 284028
2016-10-12 18:15:33 +00:00
Oliver Stannard 4df1cc0b00 [ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI
With the ROPI and RWPI relocation models we can't always have pointers
to global data or functions in constant data, so don't try to convert switches
into lookup tables if any value in the lookup table would require a relocation.
We can still safely emit lookup tables of other values, such as simple
constants.

Differential Revision: https://reviews.llvm.org/D24462

llvm-svn: 283530
2016-10-07 08:48:24 +00:00
David Majnemer 8c03c1bade [SimplifyCFG] Correctly test for unconditional branches in GetCaseResults
GetCaseResults assumed that a terminator with one successor was an
unconditional branch.  This is not necessarily the case, it could be a
cleanupret.

Strengthen the check by querying whether or not the terminator is
exceptional.

llvm-svn: 283517
2016-10-07 01:38:35 +00:00
James Molloy 0efb96a8ee [SimplifyCFG] Update (AND) IR flags when CSE'ing instructions
We were updating metadata but not IR flags. Because we pick an arbitrary instruction to be the CSE candidate, it comes down to luck (50% or less chance) if this results in broken codegen or not, which is why PR30373 which is actually not the fault of the commit it was bisected down to.

Fixes PR30373.

llvm-svn: 281889
2016-09-19 08:23:08 +00:00
James Molloy 104370ab37 [SimplifyCFG] Be even more conservative in SinkThenElseCodeToEnd
This should *actually* fix PR30244. This cranks up the workaround for PR30188 so that we never sink loads or stores of allocas.

The idea is that these should be removed by SROA/Mem2Reg, and any movement of them may well confuse SROA or just cause unwanted code churn. It's not ideal that the midend should be crippled like this, but that unwanted churn can really cause significant regressions in important workloads (tsan).

llvm-svn: 281162
2016-09-11 09:00:03 +00:00
James Molloy 18d96e8fa5 [SimplifyCFG] Harden up the profitability heuristic for block splitting during sinking
Exposed by PR30244, we will split a block currently if we think we can sink at least one instruction. However this isn't right - the reason we split predecessors is so that we can sink instructions that otherwise couldn't be sunk because it isn't safe to do so - stores, for example.

So, change the heuristic to only split if it thinks it can sink at least one non-speculatable instruction.

Should fix PR30244.

llvm-svn: 281160
2016-09-11 08:07:30 +00:00
Dehao Chen 87823f8e4d Remove debug info when hoisting instruction from then/else branch.
Summary: The hoisted instruction is executed speculatively. It could affect the debugging experience as user would see gdb go into code that may not be expected to execute. It will also affect sample profile accuracy by assigning incorrect frequency to source within then/else branch.

Reviewers: davidxl, dblaikie, chandlerc, kcc, echristo

Subscribers: mehdi_amini, probinson, eric_niebler, andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D24164

llvm-svn: 280995
2016-09-08 21:53:33 +00:00
Peter Collingbourne 8f1dd5c41e IR: Remove Value::intersectOptionalDataWith, replace all calls with calls to Instruction::andIRFlags.
The two functions are functionally equivalent.

Differential Revision: https://reviews.llvm.org/D22830

llvm-svn: 280884
2016-09-07 23:39:04 +00:00
Hal Finkel ac5803ba91 [SimplifyCFG] Don't try to create metadata-valued PHIs
We can't create metadata-valued PHIs; don't try to do so when sinking.

I created a test case for this using the @llvm.type.test intrinsic, because it
takes a metadata parameter and does not have severe side effects (thus
SimplifyCFG is willing to otherwise sink it).

Previously, running the test case would crash with:

  Invalid use of metadata!
    %.sink = select i1 %flag, metadata <...>, metadata <0x4e45dc0>
  LLVM ERROR: Broken function found, compilation aborted!

llvm-svn: 280866
2016-09-07 21:38:22 +00:00
James Molloy 6c009c1c85 [SimplifyCFG] Followup fix to r280790
In failure cases it's not guaranteed that the PHI we're inspecting is actually in the successor block! In this case we need to bail out early, and never query getIncomingValueForBlock() as that will cause an assert.

llvm-svn: 280794
2016-09-07 09:01:22 +00:00
James Molloy ec905a62ae [SimplifyCFG] Update workaround for PR30188 to also include loads
I should have realised this the first time around, but if we're avoiding sinking stores where the operands come from allocas so they don't create selects, we also have to do the same for loads because SROA will be just as defective looking at loads of selected addresses as stores.

Fixes PR30188 (again).

llvm-svn: 280792
2016-09-07 08:40:20 +00:00
James Molloy bf1837d9c9 [SimplifyCFG] Check PHI uses more accurately
PR30292 showed a case where our PHI checking wasn't correct. We were checking that all values were used by the same PHI before deciding to sink, but we weren't checking that the incoming values for that PHI were what we expected. As a result, we had to bail out after block splitting which caused us to never reach a steady state in SimplifyCFG.

Fixes PR30292.

llvm-svn: 280790
2016-09-07 08:15:54 +00:00
James Molloy f3cf2a494b [SimplifyCFG] Add a workaround to fix PR30188
We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

llvm-svn: 280470
2016-09-02 07:29:00 +00:00
James Molloy 88cad7e5cf [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

llvm-svn: 280364
2016-09-01 12:58:13 +00:00
James Molloy eec6df3193 [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink.

This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example:

    %a = load i32* %b          %d = load i32* %b
    %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans.

llvm-svn: 280351
2016-09-01 10:44:35 +00:00
James Molloy 21744689b9 [SimplifyCFG] Fix nondeterministic iteration order
We iterate over the result from SafeToMergeTerminators, so make it a SmallSetVector instead of a SmallPtrSet.

Should fix stage3 convergence builds.

llvm-svn: 280342
2016-09-01 09:01:34 +00:00
James Molloy e656642295 [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases
A very important case is not handled here: multiple arcs to a single block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs.

llvm-svn: 280338
2016-09-01 07:45:25 +00:00
James Molloy 3c1137c639 Revert "[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases"
This reverts commit r280218. This *also* causes buildbot errors. Sigh. Not a successful day all around!

llvm-svn: 280239
2016-08-31 13:32:28 +00:00
James Molloy cacfc16109 Revert "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd"
This reverts commit r280216 - it caused buildbot failures.

llvm-svn: 280234
2016-08-31 13:16:52 +00:00
James Molloy 76c9d423a7 Revert "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"
This reverts commit r280217. r280216 caused buildbot failures - backing out the entire chain.

llvm-svn: 280233
2016-08-31 13:16:45 +00:00
James Molloy 06a45483a1 Revert "[SimplifyCFG] Add a workaround to fix PR30188"
This reverts commit r280219. r280216 caused buildbot failures - backing out the entire chain.

llvm-svn: 280232
2016-08-31 13:16:36 +00:00
James Molloy 8a66a39cbf Revert "[SimplifyCFG] Fix bootstrap failure after r280220"
This reverts commit r280228. r280216 caused buildbot failures - backing out the entire sequence.

llvm-svn: 280231
2016-08-31 13:16:30 +00:00
James Molloy b7efa6c227 [SimplifyCFG] Fix bootstrap failure after r280220
We check that a sinking candidate is used by only one PHI node during our legality checks. However for instructions that are used by other sinking candidates our heuristic is less conservative. This can result in a candidate actually being illegal when we come to sink it because of how we sunk a predecessor. Do the used-by-only-one-PHI checks again during sinking to ensure we don't crash.

llvm-svn: 280228
2016-08-31 12:33:48 +00:00
James Molloy 171fdac7ce [SimplifyCFG] Add a workaround to fix PR30188
We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

llvm-svn: 280219
2016-08-31 10:46:45 +00:00
James Molloy 8e69b032e5 [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases
A very important case is not handled here: multiple arcs to a single block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs.

llvm-svn: 280218
2016-08-31 10:46:39 +00:00
James Molloy c53b40b509 [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

llvm-svn: 280217
2016-08-31 10:46:33 +00:00
James Molloy 55bd04cd20 [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink.

This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example:

    %a = load i32* %b          %d = load i32* %b
    %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans.

llvm-svn: 280216
2016-08-31 10:46:23 +00:00
James Molloy 923e98c232 [SimplifyCFG] Tail-merge calls with sideeffects
This was deliberately disabled during my rewrite of SinkIfThenToEnd to keep behaviour
at least vaguely consistent with the previous version and keep it as close to NFC as
I could.

There's no real reason not to merge sideeffect calls though, so let's do it! Small fixup
along the way to ensure we don't create indirect calls.

Should fix PR28964.

llvm-svn: 280215
2016-08-31 10:46:16 +00:00
James Molloy d13b1239e4 [SimplifyCFG] Properly CSE metadata in SinkThenElseCodeToEnd
This was missing, meaning the metadata in sunk instructions was potentially bogus and could cause miscompiles.

llvm-svn: 280072
2016-08-30 10:56:08 +00:00
David Majnemer e8fd5f9ffd [SimplifyCFG] Hoisting invalidates metadata
We forgot to remove optimization metadata when performing hosting during
FoldTwoEntryPHINode.

This fixes PR29163.

llvm-svn: 279980
2016-08-29 17:14:08 +00:00
James Molloy 5bf2114265 [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
[Recommitting now an unrelated assertion in SROA is sorted out]

The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 5, i32 7
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables.

This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup.

llvm-svn: 279460
2016-08-22 19:07:15 +00:00
James Molloy 475f4a763f Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"
This reverts commit r279443. It caused buildbot failures.

llvm-svn: 279447
2016-08-22 18:13:12 +00:00
James Molloy 353052698a [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 5, i32 7
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables.

This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup.

llvm-svn: 279443
2016-08-22 17:40:23 +00:00
Reid Kleckner 98a48afa5d Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"
This reverts commit r279229. It breaks intrinsic function calls in
diamonds.

llvm-svn: 279313
2016-08-19 20:22:39 +00:00
James Molloy 11a1936b70 [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 5, i32 7
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

llvm-svn: 279229
2016-08-19 10:10:27 +00:00
Duncan P. N. Exon Smith 0a12729f99 SimplifyCFG: Avoid dereferencing end()
When comparing a User* to a BasicBlock::iterator in
passingValueIsAlwaysUndefined, don't dereference the iterator in case it
is end().

llvm-svn: 278872
2016-08-16 23:57:56 +00:00
Reid Kleckner 70a600b8bb Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"
This reverts commit r278660.

It causes downstream assertion failure in InstCombine on shuffle
instructions. Comes up in __mm_swizzle_epi32.

llvm-svn: 278672
2016-08-15 15:42:31 +00:00
James Molloy 9a3c82f5cf [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 5, i32 7
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

llvm-svn: 278660
2016-08-15 08:04:56 +00:00
David L Kreitzer 9667417a1a Fixed typo.
llvm-svn: 278565
2016-08-12 21:06:53 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Benjamin Kramer aa160c22f7 [SimplifyCFG] Make range reduction code deterministic.
This generated IR based on the order of evaluation, which is different
between GCC and Clang. With that in mind you get bootstrap miscompares
if you compare a Clang built with GCC-built Clang vs. Clang built with
Clang-built Clang. Diagnosing that made my head hurt.

This also reverts commit r277337, which "fixed" the test case.

llvm-svn: 277820
2016-08-05 14:55:02 +00:00
Junmo Park db8f6eebee Minor code cleanups. NFC.
llvm-svn: 277415
2016-08-02 04:38:27 +00:00
James Molloy bade86cedc [SimplifyCFG] Fix nasty RAUW bug from r277325
Using RAUW was wrong here; if we have a switch transform such as:
  18 -> 6 then
  6 -> 0

If we use RAUW, while performing the second transform the  *transformed* 6
from the first will be also replaced, so we end up with:
  18 -> 0
  6 -> 0

Found by clang stage2 bootstrap; testcase added.

llvm-svn: 277332
2016-08-01 09:34:48 +00:00
James Molloy b2e436de42 [SimplifyCFG] Range reduce switches
If a switch is sparse and all the cases (once sorted) are in arithmetic progression, we can extract the common factor out of the switch and create a dense switch. For example:

    switch (i) {
    case 5: ...
    case 9: ...
    case 13: ...
    case 17: ...
    }

can become:

    if ( (i - 5) % 4 ) goto default;
    switch ((i - 5) / 4) {
    case 0: ...
    case 1: ...
    case 2: ...
    case 3: ...
    }

or even better:

   switch ( ROTR(i - 5, 2) {
   case 0: ...
   case 1: ...
   case 2: ...
   case 3: ...
   }

The division and remainder operations could be costly so we only do this if the factor is a power of two, and emit a right-rotate instead of a divide/remainder sequence. Dense switches can be lowered significantly better than sparse switches and can even be transformed into lookup tables.

llvm-svn: 277325
2016-08-01 07:45:11 +00:00
Benjamin Kramer 135f735af1 Apply clang-tidy's modernize-loop-convert to most of lib/Transforms.
Only minor manual fixes. No functionality change intended.

llvm-svn: 273808
2016-06-26 12:28:59 +00:00
David Majnemer 1fea77c6fc [SimplifyCFG] Replace calls to null/undef with unreachable
Calling null is undefined behavior, a call to undef can be trivially
treated as a call to null.

llvm-svn: 273776
2016-06-25 07:37:27 +00:00
David Majnemer b8da3a2bb2 Reinstate r273711
r273711 was reverted by r273743.  The inliner needs to know about any
call sites in the inlined function.  These were obscured if we replaced
a call to undef with an undef but kept the call around.

This fixes PR28298.

llvm-svn: 273753
2016-06-25 00:04:10 +00:00
Nico Weber ae2ef4ccd4 Revert r273711, it caused PR28298.
llvm-svn: 273743
2016-06-24 22:52:39 +00:00
David Majnemer 3b3e954ea2 SimplifyInstruction does not imply DCE
We cannot remove an instruction with no uses just because
SimplifyInstruction succeeds.  It may have side effects.

llvm-svn: 273711
2016-06-24 19:34:46 +00:00
David Majnemer d770877328 Switch more loops to be range-based
This makes the code a little more concise, no functional change is
intended.

llvm-svn: 273644
2016-06-24 04:05:21 +00:00
Chuang-Yu Cheng 68f7f1cf00 Teaching SimplifyCFG to recognize the Or-Mask trick that InstCombine uses to
reduce the number of comparisons.

Specifically, InstCombine can turn:
  (i == 5334 || i == 5335)
into:
  ((i | 1) == 5335)

SimplifyCFG was already able to detect the pattern:
  (i == 5334 || i == 5335)
to:
  ((i & -2) == 5334)

This patch supersedes D21315 and resolves PR27555
(https://llvm.org/bugs/show_bug.cgi?id=27555).

Thanks to David and Chandler for the suggestions!

Author: Thomas Jablin (tjablin)
Reviewers: majnemer chandlerc halfdan cycheng

http://reviews.llvm.org/D21397

llvm-svn: 273639
2016-06-24 01:59:00 +00:00
Chuang-Yu Cheng 5078f94690 Use m_APInt in SimplifyCFG
Switch from m_Constant to m_APInt per David's request. NFC.

Author: Thomas Jablin (tjablin)
Reviewers: majnemer cycheng

http://reviews.llvm.org/D21440

llvm-svn: 272977
2016-06-17 00:04:39 +00:00
Chuang-Yu Cheng dbe00d51b4 SimplifyCFG is able to detect the pattern:
(i == 5334 || i == 5335)
to:
    ((i & -2) == 5334)

This transformation has some incorrect side conditions. Specifically, the
transformation is only applied when the right-hand side constant (5334 in
the example) is a power of two not equal and not equal to the negated mask.
These side conditions were added in r258904 to fix PR26323. The correct side
condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334].

It's a little bit hard to see why these transformations are correct and what
the side conditions ought to be. Here is a CVC3 program to verify them for
64-bit values:
    ONE  : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63);
    x    : BITVECTOR(64);
    y    : BITVECTOR(64);
    z    : BITVECTOR(64);
    mask : BITVECTOR(64) = BVSHL(ONE, z);
    QUERY( (y & ~mask = y) =>
           ((x & ~mask = y) <=> (x = y OR x = (y |  mask)))
    );

Please note that each pattern must be a dual implication (<--> or iff). One
directional implication can create spurious matches. If the implication is
only one-way, an unsatisfiable condition on the left side can imply a
satisfiable condition on the right side. Dual implication ensures that
satisfiable conditions are transformed to other satisfiable conditions and
unsatisfiable conditions are transformed to other unsatisfiable conditions.

Here is a concrete example of a unsatisfiable condition on the left
implying a satisfiable condition on the right:
    mask = (1 << z)
    (x & ~mask) == y --> (x == y || x == (y | mask))

Substituting y = 3, z = 0 yields:
    (x & -2) == 3 --> (x == 3 || x == 2)

The version of this code before r258904 had no side-conditions and
incorrectly justified itself in comments through one-directional
implication.

Thanks to Chandler for the suggestion!

Author: Thomas Jablin (tjablin)
Reviewers: chandlerc majnemer hfinkel cycheng

http://reviews.llvm.org/D21417

llvm-svn: 272873
2016-06-16 04:44:25 +00:00
Peter Collingbourne 96efdd6107 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

llvm-svn: 272709
2016-06-14 21:01:22 +00:00
David Majnemer 2482e1c017 [SimplifyCFG] Don't kill empty cleanuppads with multiple uses
A basic block could contain:
  %cp = cleanuppad []
  cleanupret from %cp unwind to caller

This basic block is empty and is thus a candidate for removal.  However,
there can be other uses of %cp outside of this basic block.  This is
only possible in unreachable blocks.

Make our transform more correct by checking that the pad has a single
user before removing the BB.

This fixes PR28005.

llvm-svn: 271816
2016-06-04 23:50:03 +00:00
David Majnemer 9f92f4c497 [SimplifyCFG] Remove cleanuppads which are empty except for calls to lifetime.end
A cleanuppad is not cheap, they turn into many instructions and result
in additional spills and fills.  It is not worth keeping a cleanuppad
around if all it does is hold a lifetime.end instruction.

N.B.  We first try to merge the cleanuppad with another cleanuppad to
avoid dropping the lifetime and debug info markers.

llvm-svn: 270314
2016-05-21 05:12:32 +00:00
Sanjay Patel 75892a1543 [SimplifyCFG] eliminate switch cases based on known range of switch condition
This was noted in PR24766:
https://llvm.org/bugs/show_bug.cgi?id=24766#c2

We may not know whether the sign bit(s) are zero or one, but we can still
optimize based on knowing that the sign bit is repeated.

Differential Revision: http://reviews.llvm.org/D20275

llvm-svn: 270222
2016-05-20 14:53:09 +00:00
Dehao Chen f16376b505 Follow-up patch of http://reviews.llvm.org/D19948 to handle missing profiles when simplifying CFG.
Summary: Set default branch weight to 1:1 if one of the branch has profile missing when simplifying CFG.

Reviewers: spatel, davidxl

Subscribers: danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D20307

llvm-svn: 269995
2016-05-18 22:41:03 +00:00
Dehao Chen f6c0083b55 clang-format SimplifyCFG.cpp.
llvm-svn: 269974
2016-05-18 19:44:21 +00:00
Sanjay Patel 5d5134f676 use range-loops; NFCI
llvm-svn: 269471
2016-05-13 20:24:53 +00:00
Dehao Chen b76e5d948a Propagate branch metadata when some branch probability is missing.
Summary: In sample profile, some branches may have profile missing due to profile inaccuracy. We want existing branch probability still valid after propagation.

Reviewers: hfinkel, davidxl, spatel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19948

llvm-svn: 269137
2016-05-10 23:07:19 +00:00
Sanjay Patel 1cb6241a89 [SimplifyCFG] propagate branch metadata when creating select (retry r268550 / r268751 with possible fix)
Retrying r268550/r268751 which were reverted at r268577/r268765 due a memory sanitizer failure.
I have not been able to reproduce that failure, but I've taken another guess at fixing
the problem in this version of the patch and will watch for another failure.

Original commit message:
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268767
2016-05-06 18:07:46 +00:00
Sanjay Patel 84a0bf64a8 revert r268751 - caused same failures on msan bot
llvm-svn: 268765
2016-05-06 17:51:37 +00:00
Sanjay Patel 6609510c32 [SimplifyCFG] propagate branch metadata when creating select (retry r268550 with possible fix)
Retrying r268550 which was reverted at r268577 due a memory sanitizer failure.
I have not been able to reproduce that failure, but I've taken a guess at fixing
the problem in this version of the patch and will watch for another failure.

Original commit message:
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268751
2016-05-06 17:07:47 +00:00
Chad Rosier 4ab37c0037 [SimplifyCFG] Prefer a simplification based on a dominating condition.
Rather than merge two branches with a common destination.
Differential Revision: http://reviews.llvm.org/D19743

llvm-svn: 268735
2016-05-06 14:25:14 +00:00
Vitaly Buka fdcea9d78a Revert "[SimplifyCFG] propagate branch metadata when creating select"
MemorySanitizer: use-of-uninitialized-value
0x4910e47 in count /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:159:12
0x4910e47 in countLeadingZeros<unsigned long> /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:183
0x4910e47 in FitWeights /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:855
0x4910e47 in SimplifyCondBranchToCondBranch /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:2895

This reverts commit 609f4dd4bf3bc735c8c047a4d4b0a8e9e4d202e2.

llvm-svn: 268577
2016-05-04 23:59:33 +00:00
Sanjay Patel 7e8c285814 [SimplifyCFG] propagate branch metadata when creating select
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268550
2016-05-04 20:48:24 +00:00
Hans Wennborg 0c3518e84b [SimplifyCFG] isSafeToSpeculateStore now ignores debug info
This patch fixes PR27615.

@llvm.dbg.value instructions no longer count towards the maximum number of
instructions to look back at in the instruction list when searching for a
store instruction. This should make the output consistent between debug and
non-debug build.

Patch by Henric Karlsson <henric.karlsson@ericsson.com>!

Differential Revision: http://reviews.llvm.org/D19912

llvm-svn: 268512
2016-05-04 15:40:57 +00:00
Reid Kleckner bca59d2a43 Revert "[SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics"
This reverts commit r268254.

This change causes assertion failures while building Chromium. Reduced
test case coming soon.

llvm-svn: 268288
2016-05-02 19:43:22 +00:00
Hans Wennborg b7599329fc [SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics
Make it possible that TryToSimplifyUncondBranchFromEmptyBlock merges empty
basic block including lifetime intrinsics as well as phi nodes and
unconditional branch into its successor or predecessor(s).

If successor of empty block has single predecessor, all contents including
lifetime intrinsics are sinked into the successor. Otherwise, they are
hoisted into its predecessor(s) and then merged into the predecessor(s).

Patch by Josh Yoon <josh.yoon@samsung.com>!

Differential Revision: http://reviews.llvm.org/D19257

llvm-svn: 268254
2016-05-02 17:22:54 +00:00
Sanjay Patel facf45a82f [SimplifyCFG] propagate branch metadata when creating select
There's no existing test for this path, and I don't know how to expose
it in a regression test, but I'm assuming there's some reason this
path exists. 

llvm-svn: 267813
2016-04-27 23:14:12 +00:00
Sanjay Patel 29dea0d230 [SimplifyCFG] propagate branch metadata when creating select
llvm-svn: 267624
2016-04-26 23:15:48 +00:00
Hal Finkel e4c0c1679b [SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.

llvm-svn: 267515
2016-04-26 02:06:06 +00:00
Chad Rosier e2cbd13e56 [ValueTracking] Improve isImpliedCondition when the dominating cond is false.
llvm-svn: 267430
2016-04-25 17:23:36 +00:00
Sanjay Patel dc88bd6e1f replace duplicated static functions for profile metadata access with BranchInst member function; NFCI
llvm-svn: 267295
2016-04-23 20:01:22 +00:00
Chad Rosier 41dd31f0b0 [ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC.
Phabricator Revision: http://reviews.llvm.org/D19277

llvm-svn: 266904
2016-04-20 19:15:26 +00:00
Chad Rosier b7dfbb40a3 [ValueTracking] Improve isImpliedCondition for conditions with matching operands.
This patch improves SimplifyCFG to catch cases like:

  if (a < b) {
    if (a > b) <- known to be false
      unreachable;
  }

Phabricator Revision: http://reviews.llvm.org/D18905

llvm-svn: 266767
2016-04-19 17:19:14 +00:00
Sanjay Patel f11ab05bdb [SimplifyCFG] propagate branch metadata when creating select (PR27344)
This is almost identical to:
http://reviews.llvm.org/rL264527

This doesn't solve PR27344; it just allows the profile weights to survive. 
To solve the bug, we need to use the profile weights in the backend.

llvm-svn: 266442
2016-04-15 15:32:12 +00:00
Hans Wennborg e9134897f4 Fix a couple of redundant conditional expressions (PR27283, PR28282)
llvm-svn: 265987
2016-04-11 20:35:01 +00:00
Duncan P. N. Exon Smith da68cbc4ad IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC
Clarify what this RemapFlag actually means.

  - Change the flag name to match its intended behaviour.
  - Clearly document that it's not supposed to affect globals.
  - Add a host of FIXMEs to indicate how to fix the behaviour to match
    the intent of the flag.

RF_IgnoreMissingLocals should only affect the behaviour of
RemapInstruction for function-local operands; namely, for operands of
type Argument, Instruction, and BasicBlock.  Currently, it is *only*
passed into RemapInstruction calls (and the transitive MapValue calls
that it makes).

When I split Metadata from Value I didn't understand the flag, and I
used it in a bunch of places for "global" metadata.

This commit doesn't have any functionality change, but prepares to
cleanup MapMetadata and MapValue.

llvm-svn: 265628
2016-04-07 00:26:43 +00:00
Hyojin Sung 4673f10568 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
    
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.

llvm-svn: 264697
2016-03-29 04:08:57 +00:00
Reid Kleckner ba85781f58 Revert "[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops"
This reverts commit r264596.

It does not compile.

llvm-svn: 264604
2016-03-28 18:07:40 +00:00
Hyojin Sung 0ada5b0d14 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being 
applied (e.g., transforming nested loops into simplified forms and loop vectorization).

The patch augments the existing PHI node-based check by adding a pre-test if the BB actually 
belongs to a set of loop headers and not eliminating it if yes. 

llvm-svn: 264596
2016-03-28 17:22:25 +00:00
Sanjay Patel 796db35f62 [SimplifyCFG] propagate branch metadata when creating select (PR26636)
llvm-svn: 264527
2016-03-26 23:30:50 +00:00
Sanjay Patel 9e23fedaf0 propagate 'unpredictable' metadata on select instructions
This is similar to D18133 where we allowed profile weights on select instructions. 
This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects.

A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at:
http://reviews.llvm.org/rL263648

Differential Revision: http://reviews.llvm.org/D18220

llvm-svn: 263716
2016-03-17 15:30:52 +00:00
Sanjay Patel ee52b6e77d allow branch weight metadata on select instructions (PR26636)
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636

This doesn't accomplish anything on its own. It's the first step towards preserving 
and using branch weights with selects.

The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to 
determine how to correctly propagate the weights.

Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.

The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.

Differential Revision: http://reviews.llvm.org/D18133

llvm-svn: 263482
2016-03-14 20:18:59 +00:00
Mehdi Amini ba9fba81d6 Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Sanjay Patel 5658b58936 remove unnecessary cast; NFC
llvm-svn: 263343
2016-03-12 18:17:41 +00:00
Sanjay Patel 2e0027706a fix formatting; NFC
llvm-svn: 263342
2016-03-12 18:05:53 +00:00
Sanjay Patel 5781d840dd use range loops; NFCI
llvm-svn: 263341
2016-03-12 16:52:17 +00:00
Eric Christopher 35abd051c0 Temporarily revert:
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date:   Fri Mar 11 17:15:50 2016 +0000

    Remove PreserveNames template parameter from IRBuilder

    Summary:
    Following r263086, we are now relying on a flag on the Context to
    discard Value names in release builds.

    Reviewers: chandlerc

    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: http://reviews.llvm.org/D18023

    From: Mehdi Amini <mehdi.amini@apple.com>

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
    91177308-0d34-0410-b5e6-96231b3b80d8

until we can figure out what to do about clang and Release build testing.

This reverts commit 263258.

llvm-svn: 263321
2016-03-12 01:47:22 +00:00
Mehdi Amini 99eab3dd06 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
David Majnemer ee0cbbbe69 [SimplifyCFG] Use a more elegant solution than r261731
The cleanupret instruction has an invariant that it's 'from' operand be
a cleanuppad.  This invariant was violated when we removed a dead block
which removed a cleanuppad leaving behind a cleanupret with an undef
'from' operand.

This was solved in r261731 by staving off the removal of the dead block
to a later pass.

However, it occured to me that we do not need to do this.
Instead, we can simply avoid processing the cleanupret if it has an
undef 'from' operand because we know that it will be removed soon.

llvm-svn: 261754
2016-02-24 17:30:48 +00:00
David Majnemer ec72e37220 [SimplifyCFG] Do not blindly remove unreachable blocks
DeleteDeadBlock was called indiscriminately, leading to cleanuprets with
undef cleanuppad references.

Instead, try to drain the BB of most of it's instructions if it is
unreachable.  We can then remove the BB if it solely consists of a
terminator (and maybe some phis).

llvm-svn: 261731
2016-02-24 10:02:16 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Benjamin Kramer 7d537ae747 [SimplifyCFG] Use pointer identity to simplify predicate.
No functional change intended.

llvm-svn: 261427
2016-02-20 10:40:42 +00:00
David Majnemer 1efa23ddab [SimplifyCFG] Merge together cleanuppads
Cleanuppads may be merged together if one is the only predecessor of the
other in which case a simple transform can be performed: replace the
a cleanupret with a branch and remove an unnecessary cleanuppad.

Differential Revision: http://reviews.llvm.org/D17459

llvm-svn: 261390
2016-02-20 01:07:45 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Justin Lebar db63949e8d [SimplifyCFG] Don't fold conditional branches that contain calls to convergent functions.
Summary:
Performing this optimization duplicates the call to the convergent
function and adds new control-flow dependencies, which is a no-no.

Reviewers: jingyue

Subscribers: broune, hfinkel, tra, resistor, joker.eph, arsenm, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17128

llvm-svn: 260730
2016-02-12 21:01:36 +00:00
Gerolf Hoflehner 2432bd0ddd [SimplifyCFG] Fix for "endless" loop after dead code removal (Alternative to
D16251)

Summary:
This is a simpler fix to the problem than the dominator approach in
http://reviews.llvm.org/D16251. It adds only values into the gather() while loop
that have been seen before.

The actual endless loop is in the constant compare gather() routine in
Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the
queue:
%.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i

Here is what happens at the IR level:

for.cond.i:                                       ; preds = %if.end6.i,
%if.end.i54
%ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ]
%ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<<
%cmp2.i = icmp ult i32 %ix.0.i, %11
br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit

if.end6.i:                                        ; preds = %for.body.i
%cmp10.i = icmp ugt i32 %conv.i, %add9.i
%.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<<

When if.end.i54 gets eliminated which removes the definition of ret.0.off0.i.
The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i
(Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i).
And
now there is use of .ret.0.off0.i before a definition which triggers the
“endless” loop in gather():

while(!DFT.empty()) {

    V = DFT.pop_back_val();   // V is .ret.0.off0.i

    if (Instruction *I = dyn_cast<Instruction>(V)) {
      // If it is a || (or && depending on isEQ), process the operands.
      if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) {
        DFT.push_back(I->getOperand(1));  // This is now .ret.0.off0.i also
        DFT.push_back(I->getOperand(0));

        continue; // “endless loop” for .ret.0.off0.i
      }

Reviewers: reames, ahatanak

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16839

llvm-svn: 259730
2016-02-03 23:54:25 +00:00
Sanjay Patel 5264cc772c [SimplifyCFG] limit recursion depth when speculating instructions (PR26308)
This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=26308

With the switch to using the TTI cost model in:
http://reviews.llvm.org/rL228826
...it became possible to hit a zero-cost cycle of instructions (gep -> phi -> gep...), 
so we need a cap for the recursion in DominatesMergePoint().

A recursion depth parameter was already added for a different reason in:
http://reviews.llvm.org/rL255660
...so we can just set a limit for it.

I pulled "10" out of the air and made it an independent parameter that we can play with.
It might be higher than it needs to be given the currently low default value of 
PHINodeFoldingThreshold (2). That's the starting cost value that we enter the recursion
with, and most instructions have cost set to TCC_Basic (1), so I don't think we're going
to speculate more than 2 instructions with the current parameters.

As noted in the review and the TODO comment, we can do better than just limiting recursion
depth.

Differential Revision: http://reviews.llvm.org/D16637

llvm-svn: 258971
2016-01-27 19:22:45 +00:00
David Majnemer fccf5c6e01 Revert "Revert "[SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)""
This reverts commit r258903 which reverted r255660.  r258903 was an
accidental commit and should not have been committed.

llvm-svn: 258905
2016-01-27 02:59:41 +00:00
David Majnemer c761afd1d1 [SimplifyCFG] Don't mistake icmp of and for a tree of comparisons
SimplifyCFG tries to turn complex branch conditions into a switch.
Some of it's logic attempts to reason about bitwise arithmetic produced
by InstCombine.  InstCombine can turn things like (X == 2) || (X == 3)
into (X & 1) == 2 and so SimplifyCFG tries to detect when this occurs so
that it can produce a switch instruction.

However, the legality checking was not sufficient to determine whether
or not this had occured.  Correctly check this case by requiring that
the right-hand side of the comparison be a power of two.

This fixes PR26323.

llvm-svn: 258904
2016-01-27 02:43:28 +00:00
David Majnemer 47de2140f7 Revert "[SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)"
This reverts commit r255660.

llvm-svn: 258903
2016-01-27 02:43:22 +00:00
Manuel Jacob e902459c4b Change ConstantFoldInstOperands to take Instruction instead of opcode and type. NFC.
Summary:
The previous form, taking opcode and type, is moved to an internal
helper and the new form, taking an instruction, is a wrapper around this
helper.

Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16383

llvm-svn: 258391
2016-01-21 06:33:22 +00:00
Chen Li 509ff21300 Code refactoring for commit r257278.
llvm-svn: 257366
2016-01-11 19:20:53 +00:00
Chen Li c375450e3f Fix a control flow problem in commit rL257277.
llvm-svn: 257278
2016-01-10 06:13:32 +00:00
Chen Li 1689c2f54b [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
Summary:
This is a fix of D13718. D13718 was committed but then reverted because of the following bug:
https://llvm.org/bugs/show_bug.cgi?id=25299

This patch fixes the issue shown in the bug.

Reviewers: majnemer, reames

Subscribers: jevinskie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14308

llvm-svn: 257277
2016-01-10 05:48:01 +00:00
Joseph Tremoulet 0d808888c1 [WinEH] Simplify unreachable catchpads
Summary:
At least for CoreCLR, a catchpad which immediately executes an
`unreachable` instruction indicates that the exception can never have a
matching type, and so such catchpads can be removed, and so can their
catchswitches if the catchswitch becomes empty.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15846

llvm-svn: 256809
2016-01-05 02:37:41 +00:00
James Molloy 3d21dcf3ed [SimplifyCFG] Don't create unnecessary PHIs
In conditional store merging, we were creating PHIs when we didn't
need to. If the value to be predicated isn't defined in the block
we're predicating, then it doesn't need a PHI at all (because we only
deal with triangles and diamonds, any value not in the predicated BB
must dominate the predicated BB).

This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite.

Now with a fix (and fixed tests) for the conformance issue seen in Chromium.

llvm-svn: 255767
2015-12-16 14:12:44 +00:00
Sanjay Patel 38a022623a [SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)
This is the last general step to allow more IR-level speculation with a safety harness in place in CodeGenPrepare.

The intent is to restore the behavior enabled by:
http://reviews.llvm.org/rL228826

but prevent bad performance such as:
https://llvm.org/bugs/show_bug.cgi?id=24818

Earlier patches in this sequence:
D12882 (disable SimplifyCFG speculation for expensive instructions)
D13297 (have CGP despeculate expensive ops)
D14630 (have CGP despeculate special versions of cttz/ctlz)

As shown in the test cases, we only have two instructions currently affected: ctz for some x86 and fdiv generally. 
Allowing exactly one expensive instruction is a bit of a hack, but it lines up with what is currently implemented
in CGP. If we make the despeculation more general in CGP, we can make the speculation here more liberal.

A follow-up patch will adjust the cost for sqrt and possibly other typically expensive math intrinsics (currently
everything is cheap by default). GPU targets would likely want to override those expensive default costs (just as
they probably should already override the cost of div/rem) because just about any math is cheaper than control-flow
on those targets.

Differential Revision: http://reviews.llvm.org/D15213

llvm-svn: 255660
2015-12-15 17:38:29 +00:00
Reid Kleckner db9a91e324 Revert "Don't create unnecessary PHIs"
This reverts commit r255489.

It causes test failures in Chromium and does not appear to respect the
AlternativeV parameter.

llvm-svn: 255562
2015-12-14 22:36:57 +00:00
David Majnemer bbfc7219ef [IR] Remove terminatepad
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function.  This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.

Depends on D15478.

Differential Revision: http://reviews.llvm.org/D15479

llvm-svn: 255522
2015-12-14 18:34:23 +00:00
James Molloy 2b1e101e99 Don't create unnecessary PHIs
In conditional store merging, we were creating PHIs when we didn't
need to. If the value to be predicated isn't defined in the block
we're predicating, then it doesn't need a PHI at all (because we only
deal with triangles and diamonds, any value not in the predicated BB
must dominate the predicated BB).

This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite.

llvm-svn: 255489
2015-12-14 10:57:01 +00:00
David Majnemer 8a1c45d6e8 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

llvm-svn: 255422
2015-12-12 05:38:55 +00:00
Weiming Zhao 45d4cb9a14 [Utils] Put includes in correct order. NFC.
Summary:
    Followed the guidelines in:
    http://llvm.org/docs/CodingStandards.html#include-style
    
    However, I noticed that uppercase named headers come before lowercase ones
    throughout the codebase. So kept them as is.
    
    Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: majnemer, davide, jmolloy, atrick

Subscribers: sanjoy

Differential Revision: http://reviews.llvm.org/D14939

llvm-svn: 254005
2015-11-24 18:57:06 +00:00
Igor Laevsky 7310c68e85 Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)"
Failing clang test is now fixed by the r253458.

llvm-svn: 253459
2015-11-18 14:50:18 +00:00
Renato Golin 0e77d72b0a Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00
Igor Laevsky 01c3692a10 Strip metadata when speculatively hoisting instructions
This is fix for PR24059.

When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.

This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.

Differential Revision: http://reviews.llvm.org/D14398

llvm-svn: 252604
2015-11-10 14:10:31 +00:00
Duncan P. N. Exon Smith 83c4b68720 ADT: Remove last implicit ilist iterator conversions, NFC
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress.  This removes them.

I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).

llvm-svn: 252371
2015-11-07 00:01:16 +00:00
Sanjoy Das 55ea67cea7 [ValueTracking] Add parameters to isImpliedCondition; NFC
Summary:
This change makes the `isImpliedCondition` interface similar to the rest
of the functions in ValueTracking (in that it takes a DataLayout,
AssumptionCache etc.).  This is an NFC, intended to make a later diff
less noisy.

Depends on D14369

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14391

llvm-svn: 252333
2015-11-06 19:01:08 +00:00
James Molloy 9e959ac397 [SimplifyCFG] Tweak heuristic for merging conditional stores
We were correctly skipping dbginfo intrinsics and terminators, but the initial bailout wasn't, causing it to bail out on almost any block.

llvm-svn: 252152
2015-11-05 08:40:19 +00:00
James Molloy 4de84ddec9 [SimplifyCFG] Merge conditional stores
We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code:

  if (c & flag1)
    *a = x;
  if (c & flag2)
    *a = y;
  ...

There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to.

It is, however, legal to merge the stores together and do the store once:

  tmp = undef;
  if (c & flag1)
    tmp = x;
  if (c & flag2)
    tmp = y;
  if (c & flag1 || c & flag2)
    *a = tmp;

The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too.

This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure.

The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate.

This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite.

llvm-svn: 252051
2015-11-04 15:28:04 +00:00
Artur Pilipenko 5c5011d503 Preserve load alignment and dereferenceable metadata during some transformations
Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D13953

llvm-svn: 251809
2015-11-02 17:53:51 +00:00
Philip Reames 846e3e41ed [SimplifyCFG] Constant fold a branch implied by it's incoming edge
The most common use case is when eliminating redundant range checks in an example like the following:
c = a[i+1] + a[i];

Note that all the smarts of the transform (the implication engine) is already in ValueTracking and is tested directly through InstructionSimplify.

Differential Revision: http://reviews.llvm.org/D13040

llvm-svn: 251596
2015-10-29 03:11:49 +00:00
David Majnemer 492937095f [SimplifyCFG] Don't DCE catchret because the successor is unreachable
CatchReturnInst has side-effects: it runs a destructor.  This destructor
could conceivably run forever/call exit/etc. and should not be removed.

llvm-svn: 251461
2015-10-27 22:43:56 +00:00
Chen Li 7009cd3554 Revert rL251061 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
llvm-svn: 251149
2015-10-23 21:13:01 +00:00
Chen Li c6e28782d8 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction.  

Reviewers: hfinkel, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13718

llvm-svn: 251061
2015-10-22 20:48:38 +00:00
David Majnemer dc3b67b4ca [SimplifyCFG] Don't use-after-free an SSA value
SimplifyTerminatorOnSelect didn't consider the possibility that the
condition might be related to one of PHI nodes.

This fixes PR25267.

llvm-svn: 250922
2015-10-21 18:22:24 +00:00
Philip Reames a956cc7f08 Revert 250343 and 250344
Turns out this approach is buggy.  In discussion about follow on work, Sanjoy pointed out that we could be subject to circular logic problems.  

Consider:
 if (i u< L) leave()
 if ((i + 1) u< L) leave()
 print(a[i] + a[i+1]) 

If we know that L is less than UINT_MAX, we could possible prove (in a control dependent way) that i + 1 does not overflow.  This gives us:
 if (i u< L) leave()
 if ((i +nuw 1) u< L) leave()
 print(a[i] + a[i+1]) 

If we now do the transform this patch proposed, we end up with:
 if ((i +nuw 1) u< L) leave_appropriately()
 print(a[i] + a[i+1]) 

That would be a miscompile when i==-1.  The problem here is that the control dependent nuw bits got used to prove something about the first condition.  That's obviously invalid.

This won't happen today, but since I plan to enhance LVI/CVP with exactly that transform at some point in the not too distant future...

llvm-svn: 250430
2015-10-15 16:51:00 +00:00
Philip Reames b42db21de8 [SimplifyCFG] Speculatively flatten CFG based on profiling metadata
If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path.

The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences.

Differential Revision: http://reviews.llvm.org/D13070

llvm-svn: 250343
2015-10-14 22:46:19 +00:00
Duncan P. N. Exon Smith 5b4c837c58 TransformUtils: Remove implicit ilist iterator conversions, NFC
Continuing the work from last week to remove implicit ilist iterator
conversions.  First related commit was probably r249767, with some more
motivation in r249925.  This edition gets LLVMTransformUtils compiling
without the implicit conversions.

No functional change intended.

llvm-svn: 250142
2015-10-13 02:39:05 +00:00
Piotr Padlewski dc9b2cfc50 inariant.group handling in GVN
The most important part required to make clang
devirtualization works ( ͡°͜ʖ ͡°).
The code is able to find non local dependencies, but unfortunatelly
because the caller can only handle local dependencies, I had to add
some restrictions to look for dependencies only in the same BB.

http://reviews.llvm.org/D12992

llvm-svn: 249196
2015-10-02 22:12:22 +00:00
Joseph Tremoulet 09af67aba5 [EH] Create removeUnwindEdge utility
Summary:
Factor the code that rewrites invokes to calls and rewrites WinEH
terminators to their "unwind to caller" equivalents into a helper in
Utils/Local, and use it in the three places I'm aware of that need to do
this.


Reviewers: andrew.w.kaylor, majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13152

llvm-svn: 248677
2015-09-27 01:47:46 +00:00
Sanjay Patel f9b776350f more space; NFC
llvm-svn: 247699
2015-09-15 15:24:42 +00:00
Filipe Cabecinhas 48b090a31f Remove gcc warning when comparing an unsigned var for >= 0
llvm-svn: 247352
2015-09-10 22:34:39 +00:00
Philip Reames 053701399d [SimplifyCFG] Use known bits to eliminate dead switch defaults
This is a follow up to http://reviews.llvm.org/D11995 implementing the suggestion by Hans.

If we know some of the bits of the value being switched on, we know that the maximum number of unique cases covers the unknown bits. This allows to eliminate switch defaults for large integers (i32) when most bits in the value are known.

Note that I had to make the transform contingent on not having any dead cases. This is conservatively correct with the old code, but required for the new code since we might have a dead case which varies one of the known bits. Counting that towards our number of covering cases would be bad.  If we do have dead cases, we'll eliminate them first, then revisit the possibly dead default.

Differential Revision: http://reviews.llvm.org/D12497

llvm-svn: 247309
2015-09-10 17:44:47 +00:00
Sanjay Patel 9361d35525 80-cols; NFC
llvm-svn: 247295
2015-09-10 16:31:19 +00:00
Sanjay Patel f4b34b76d4 use range-based for loop; NFCI
llvm-svn: 247294
2015-09-10 16:25:38 +00:00
Sanjay Patel 5e7bd91891 use range-based for loop; NFCI
llvm-svn: 247293
2015-09-10 16:15:21 +00:00
Sanjay Patel 59661459f1 fix typo; NFC
llvm-svn: 247287
2015-09-10 15:14:34 +00:00
Craig Topper 02a55d701d Fix build warning.
llvm-svn: 246908
2015-09-05 04:49:44 +00:00
Andrew Kaylor 2a9a6d8c38 Fix build warning
llvm-svn: 246903
2015-09-05 01:00:51 +00:00
Andrew Kaylor a212aba680 Fix build warning
llvm-svn: 246899
2015-09-04 23:58:32 +00:00
Andrew Kaylor 50e4e86c26 [WinEH] Teach SimplfyCFG to eliminate empty cleanup pads.
Differential Revision: http://reviews.llvm.org/D12434

llvm-svn: 246896
2015-09-04 23:39:40 +00:00
Philip Reames 98a2dabc08 [SimplifyCFG] Prune code from a provably unreachable switch default
As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine.

Note: Duplicate case values are disallowed by the LangRef and verifier.

Differential Revision: http://reviews.llvm.org/D11995

llvm-svn: 246125
2015-08-26 23:56:46 +00:00
Adrian Prantl cbdfdb74d3 Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata()
and make it always preserve debug locations, since all callers wanted this
behavior anyway.

This is addressing a post-commit review feedback for r245589.

NFC (inside the LLVM tree).

llvm-svn: 245622
2015-08-20 22:00:30 +00:00
Adrian Prantl baf90fc265 Fix a bug that caused SimplifyCFG to drop DebugLocs.
Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all
metadata in KnownSet, but the condition for DebugLocs was inverted.

Most users of dropUnknownMetadata() actually worked around this by not
adding LLVMContext::MD_dbg to their list of KnowIDs.
This is now made explicit.

llvm-svn: 245589
2015-08-20 18:24:02 +00:00
David Majnemer ba275f9947 Replace some calls to isa<LandingPadInst> with isEHPad()
No functionality change is intended.

llvm-svn: 245487
2015-08-19 19:54:02 +00:00
Pete Cooper ebcd748927 Convert a bunch of loops to foreach. NFC.
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst.  This commit changes a bunch
of eligible loops to use it.

llvm-svn: 244260
2015-08-06 20:22:46 +00:00
Craig Topper e3dcce9700 De-constify pointers to Type since they can't be modified. NFC
This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842
2015-08-01 22:20:21 +00:00
Sanjay Patel 09159b8f47 don't repeat function names in comments; NFC
llvm-svn: 240591
2015-06-24 20:40:57 +00:00
Sanjay Patel adb110c372 fix typos; NFC
llvm-svn: 240585
2015-06-24 20:07:50 +00:00
Alexander Kornienko f00654e31b Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Alexander Kornienko 70bc5f1398 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
Eric Christopher 572e03a396 Fix "the the" in comments.
llvm-svn: 240112
2015-06-19 01:53:21 +00:00
Hans Wennborg 86ac630585 SimplifyCFG: Correctly handle switch lookup tables which fully cover the input type and use bit tests to check for holes
When using bit tests for hole checks, we call AddPredecessorToBlock to give the
phi node a value from the bit test block. This would break if we've
previously called removePredecessor on the default destination because the
switch is fully covered.

Test case by Mark Lacey.

llvm-svn: 235771
2015-04-24 20:57:56 +00:00
Mark Lacey 274f48b5a8 Fix typo.
llvm-svn: 234706
2015-04-12 18:18:51 +00:00
David Blaikie aa41cd57e0 [opaque pointer type] More GEP IRBuilder API migrations...
llvm-svn: 234058
2015-04-03 21:33:42 +00:00
Philip Reames 2b969d7010 Merge empty landing pads in SimplifyCFG
This patch tries to merge duplicate landing pads when they branch to a common shared target.

Given IR that looks like this:
lpad1:
  %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
         cleanup
  br label %shared_resume
lpad2:
  %exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
          cleanup
  br label %shared_resume
shared_resume:
  call void @fn()
  ret void
}

We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.

Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.

Differential Revision: http://reviews.llvm.org/D8297

llvm-svn: 233125
2015-03-24 22:28:45 +00:00
Sanjoy Das 7182d36f66 [ConstantRange] Split makeICmpRegion in two.
Summary:
This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
`makeSatisfyingICmpRegion` with slightly different contracts.  The first
one is useful for determining what values some expression //may// take,
given that a certain `icmp` evaluates to true.  The second one is useful
for determining what values are guaranteed to //satisfy// a given
`icmp`.

Reviewers: nlewycky

Reviewed By: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8345

llvm-svn: 232575
2015-03-18 00:41:24 +00:00
Mehdi Amini a28d91d81b DataLayout is mandatory, update the API to reflect it with references.
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.

This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.

I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.

I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.

Test Plan:

Reviewers: echristo

Subscribers: llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
2015-03-10 02:37:25 +00:00
Chad Rosier 543900539f Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity.
This patch adds the isProfitableToHoist API.  For AArch64, we want to prevent a
fmul from being hoisted in cases where it is more profitable to form a
fmsub/fmadd.

Phabricator Review: http://reviews.llvm.org/D7299
Patch by Lawrence Hu <lawrence@codeaurora.org>

llvm-svn: 230241
2015-02-23 19:15:16 +00:00
Benjamin Kramer ea68a944a1 Demote vectors to arrays. No functionality change.
llvm-svn: 229861
2015-02-19 15:26:17 +00:00
Aaron Ballman f9a1897c72 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229340
2015-02-15 22:54:22 +00:00
James Molloy 1b6207e6eb [SimplifyCFG] Be more aggressive
Up the phi node folding threshold from a cheap "1" to a meagre "2".

Update tests for extra added selects and slight code churn.

llvm-svn: 229099
2015-02-13 10:48:30 +00:00
James Molloy 7c336576a5 [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

llvm-svn: 228826
2015-02-11 12:15:41 +00:00
Hans Wennborg b64cb271dc SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
The range check would get optimized away later, but we might as well not emit
them in the first place.

http://reviews.llvm.org/D6471

llvm-svn: 227126
2015-01-26 19:52:34 +00:00
Hans Wennborg 6800008f04 SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations and
allows for more efficient lowering. Both the SDag switch lowering and
LowerSwitch can exploit unreachable defaults.

Also make TurnSwitchRangeICmp handle switches with unreachable default.
This is kind of separate change, but it cannot be tested without the change
above, and I don't want to land the change above without this since that would
regress other tests.

Differential Revision: http://reviews.llvm.org/D6471

llvm-svn: 227125
2015-01-26 19:52:32 +00:00
Reid Kleckner f12b33454f Revert "Don't remove a landing pad if the invoke requires a table entry."
This reverts commit r176827.

Björn Steinbrink pointed out that this didn't actually fix the bug
(PR15555) it was attempting to fix.

With this reverted, we can now remove landingpad cleanups that
immediately resume unwinding, converting the invoke to a call.

llvm-svn: 226850
2015-01-22 19:29:46 +00:00
Ramkumar Ramachandra 181233b2b7 fix {typo, build failure} in r225760
llvm-svn: 225762
2015-01-13 04:17:47 +00:00
Ramkumar Ramachandra 40c3e03e27 Standardize {pred,succ,use,user}_empty()
The functions {pred,succ,use,user}_{begin,end} exist, but many users
have to check *_begin() with *_end() by hand to determine if the
BasicBlock or User is empty. Fix this with a standard *_empty(),
demonstrating a few usecases.

llvm-svn: 225760
2015-01-13 03:46:47 +00:00
Hans Wennborg dcc6e5bc03 SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210)
The previous code assumed that such instructions could not have any uses
outside CaseDest, with the motivation that the instruction could not
dominate CommonDest because CommonDest has phi nodes in it. That simply
isn't true; e.g., CommonDest could have an edge back to itself.

llvm-svn: 225552
2015-01-09 22:13:31 +00:00
Chandler Carruth 66b3130cda [PM] Split the AssumptionTracker immutable pass into two separate APIs:
a cache of assumptions for a single function, and an immutable pass that
manages those caches.

The motivation for this change is two fold. Immutable analyses are
really hacks around the current pass manager design and don't exist in
the new design. This is usually OK, but it requires that the core logic
of an immutable pass be reasonably partitioned off from the pass logic.
This change does precisely that. As a consequence it also paves the way
for the *many* utility functions that deal in the assumptions to live in
both pass manager worlds by creating an separate non-pass object with
its own independent API that they all rely on. Now, the only bits of the
system that deal with the actual pass mechanics are those that actually
need to deal with the pass mechanics.

Once this separation is made, several simplifications become pretty
obvious in the assumption cache itself. Rather than using a set and
callback value handles, it can just be a vector of weak value handles.
The callers can easily skip the handles that are null, and eventually we
can wrap all of this up behind a filter iterator.

For now, this adds boiler plate to the various passes, but this kind of
boiler plate will end up making it possible to port these passes to the
new pass manager, and so it will end up factored away pretty reasonably.

llvm-svn: 225131
2015-01-04 12:03:27 +00:00
Michael Liao 5313da3263 [SimplifyCFG] Revise common code sinking
- Fix the case where more than 1 common instructions derived from the same
  operand cannot be sunk. When a pair of value has more than 1 derived values
  in both branches, only 1 derived value could be sunk.
- Replace BB1 -> (BB2, PN) map with joint value map, i.e.
  map of (BB1, BB2) -> PN, which is more accurate to track common ops.

llvm-svn: 224757
2014-12-23 08:26:55 +00:00
Duncan P. N. Exon Smith 5bf8fef580 IR: Split Metadata from Value
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532.  Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.

I have a follow-up patch prepared for `clang`.  If this breaks other
sub-projects, I apologize in advance :(.  Help me compile it on Darwin
I'll try to fix it.  FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.

This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.

Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes:
    `MDNode`, `MDString`, and `ValueAsMetadata`.  It is distinct from
    the `Value` class hierarchy.  It is typeless -- i.e., instances do
    *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
    replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.

    If you're referring solely to resolved `MDNode`s -- post graph
    construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for
    `replaceAllUsesWith()`.

    As long as an `MDNode` is pointing at a forward declaration -- the
    result of `MDNode::getTemporary()` -- it maintains a side map of its
    uses and can RAUW itself.  Once the forward declarations are fully
    resolved RAUW support is dropped on the ground.  This means that
    uniquing collisions on changing operands cause nodes to become
    "distinct".  (This already happened fairly commonly, whenever an
    operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles,
    you need to call `MDNode::resolveCycles()` on each node (or on a
    top-level node that somehow references all of the nodes).  Also,
    don't do that.  Metadata cycles (and the RAUW machinery needed to
    construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called
    `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).

    As a side effect, accessing an operand of an `MDNode` that is known
    to be, e.g., `ConstantInt`, takes three steps: first, cast from
    `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
    third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
    metadata schema owners transition away from using `Constant`s when
    the type isn't important (and they don't care about referring to
    `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst`
    namespace that matches semantics with the old code, in order to
    avoid adding the error-prone three-step equivalent to every call
    site.  If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to
    metadata through a bridge called `MetadataAsValue`.  This is a
    subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a
    `LocalAsMetadata`, which is a bridged form of non-`Constant` values
    like `Argument` and `Instruction`.  It can also refer to any other
    `Metadata` subclass.

(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)

llvm-svn: 223802
2014-12-09 18:38:53 +00:00
Juergen Ributzka 194350a936 Revert "Move function to obtain branch weights into the BranchInst class. NFC."
This reverts commit r223784 and copies the 'ExtractBranchMetadata' to CodeGenPrepare.

llvm-svn: 223795
2014-12-09 17:32:12 +00:00
Juergen Ributzka e2aa3aa38a Move function to obtain branch weights into the BranchInst class. NFC.
Make this function available to other parts of LLVM.

llvm-svn: 223784
2014-12-09 16:36:06 +00:00
Hans Wennborg 5bef5b522b Revert r223049, r223050 and r223051 while investigating test failures.
I didn't foresee affecting the Clang test suite :/

llvm-svn: 223054
2014-12-01 17:36:43 +00:00
Hans Wennborg 269ebb612e SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
They would get optimized away later, but we might as well not emit them.

llvm-svn: 223051
2014-12-01 17:08:38 +00:00
Hans Wennborg 5a1e5c05d8 SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations, and
SDag lowering is now prepared to handle them efficiently.

For example, branches to the unreachable destination will be optimized away,
such as in the case of range checks for switch lookup tables.

On 64-bit Linux, this reduces the size of a clang bootstrap by 80 kB (and
Chromium by 30 kB).

llvm-svn: 223050
2014-12-01 17:08:35 +00:00
Erik Eckstein 0d86c7623f reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
Fixed missing dominance check.
Original commit message:

This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
   if (idx < tablesize)
      r = table[idx]; // table does not contain default_value
   else
      r = default_value;
   if (r != default_value)
      ...
Is optimized to:
   cond = idx < tablesize;
   if (cond)
      r = table[idx];
   else
      r = default_value;
   if (cond)
      ...
Jump threading will then eliminate the second if(cond).

llvm-svn: 222891
2014-11-27 15:13:14 +00:00
Erik Eckstein 2190cd9ffa Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible."
It is breaking the clang bootstrag.

llvm-svn: 222877
2014-11-27 10:59:08 +00:00
Erik Eckstein e73e308ab9 Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
    if (idx < tablesize)
       r = table[idx]; // table does not contain default_value
    else
       r = default_value;
    if (r != default_value)
       ...
Is optimized to:
    cond = idx < tablesize;
    if (cond)
       r = table[idx];
    else
       r = default_value;
    if (cond)
       ...
\endcode
Jump threading will then eliminate the second if(cond).

llvm-svn: 222872
2014-11-27 08:33:51 +00:00
Mehdi Amini ffd0100618 SimplifyCFG: Refactor GatherConstantCompares() result in a struct
Code seems cleaner and easier to understand this way

This is basically r222416, after fixes for MSVC lack of standard 
support, and a few cleaning (got rid of a warning).
Thanks Nakamura Takumi and Nico Weber for the MSVC fixes.

llvm-svn: 222472
2014-11-20 22:40:25 +00:00
Timur Iskhodzhanov 71526a3eda Revert r222416, r222422, r222426: the former revision had problems and fixing them introduced bugs
llvm-svn: 222428
2014-11-20 12:36:43 +00:00
Timur Iskhodzhanov a0bffc0c11 Fix a typo
llvm-svn: 222426
2014-11-20 11:48:58 +00:00
NAKAMURA Takumi 5a83192570 SimplifyCFG.cpp: Tweak to let msc17 compliant.
- Use LLVM_DELETED_FUNCTION.
  - Don't use member initializers.
  - Don't use initializer list.

llvm-svn: 222422
2014-11-20 08:59:02 +00:00
Mehdi Amini 65253e76ed SimplifyCFG: Refactor GatherConstantCompares() result in a struct
Code seems cleaner and easier to understand this way

llvm-svn: 222416
2014-11-20 06:51:02 +00:00
Nico Weber 06839a536f Try to fix MSVS build after r222384. No intended behavior change.
llvm-svn: 222386
2014-11-19 21:16:11 +00:00
Mehdi Amini 9a25cb8806 SimplifyCFG: turn recursive GatherConstantCompares into iterative
A long sequence of || or && could lead to a stack explosion.

llvm-svn: 222384
2014-11-19 20:09:11 +00:00
David Blaikie 70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
Hans Wennborg a6a11a969a SimplifyCFG: Range'ify some for-loops. No functional change.
llvm-svn: 222215
2014-11-18 02:37:11 +00:00
Juergen Ributzka c9591e9bdb [SimplifyCFG] Make the value type of the hole check bitmask a power-of-2.
When converting a switch to a lookup table we might have to generate a bitmaks
to encode and check for holes in the original switch statement.

The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.

To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.

This fixes rdar://problem/18984639.

llvm-svn: 222168
2014-11-17 19:39:56 +00:00
Erik Eckstein 105374fe5e Optimize switch lookup tables with linear mapping.
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:

int f1(int x) {
  switch (x) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
  }
  return 0;
}

generates:

define i32 @f1(i32 %x) #0 {
entry:
  %0 = icmp ult i32 %x, 4
  br i1 %0, label %switch.lookup, label %return

switch.lookup:
  %switch.offset = add i32 %x, 10
  ret i32 %switch.offset

return:
  ret i32 0
}

llvm-svn: 222121
2014-11-17 09:13:57 +00:00
Duncan P. N. Exon Smith de36e8040f Revert "IR: MDNode => Value"
Instead, we're going to separate metadata from the Value hierarchy.  See
PR21532.

This reverts commit r221375.
This reverts commit r221373.
This reverts commit r221359.
This reverts commit r221167.
This reverts commit r221027.
This reverts commit r221024.
This reverts commit r221023.
This reverts commit r220995.
This reverts commit r220994.

llvm-svn: 221711
2014-11-11 21:30:22 +00:00
Duncan P. N. Exon Smith 3872d0084c IR: MDNode => Value: Instruction::getMetadata()
Change `Instruction::getMetadata()` to return `Value` as part of
PR21433.

Update most callers to use `Instruction::getMDNode()`, which wraps the
result in a `cast_or_null<MDNode>`.

llvm-svn: 221024
2014-11-01 00:10:31 +00:00
Philip Reames d92c2a7592 Preserving 'nonnull' metadata in SimplifyCFG
When we hoist two loads above an if, we can preserve the nonnull metadata.  We could also do the same for sinking them, but we appear to not handle metadata at all in that case.

Thanks to Hal for the review.

Differential Revision: http://reviews.llvm.org/D5910

llvm-svn: 220392
2014-10-22 16:37:13 +00:00
Marcello Maggioni 5bbe3df63f Switch to select optimization for two-case switches
This is the same optimization of r219233 with modifications to support PHIs with multiple incoming edges from the same block
and a test to check that this condition is handled.

llvm-svn: 219656
2014-10-14 01:58:26 +00:00
Joerg Sonnenberger 5ca10d0edb Revert r219223, it creates invalid PHI nodes.
llvm-svn: 219587
2014-10-12 17:16:04 +00:00
Arnold Schwaighofer d7d010eb2a SimplifyCFG: Don't convert phis into selects if we could remove undef behavior
instead

We used to transform this:

  define void @test6(i1 %cond, i8* %ptr) {
  entry:
    br i1 %cond, label %bb1, label %bb2

  bb1:
    br label %bb2

  bb2:
    %ptr.2 = phi i8* [ %ptr, %entry ], [ null, %bb1 ]
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

into this:

  define void @test6(i1 %cond, i8* %ptr) {
    %ptr.2 = select i1 %cond, i8* null, i8* %ptr
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

because the simplifycfg transformation into selects would happen to happen
before the simplifycfg transformation that removes unreachable control flow
(We have 'unreachable control flow' due to the store to null which is undefined
behavior).

The existing transformation that removes unreachable control flow in simplifycfg
is:

  /// If BB has an incoming value that will always trigger undefined behavior
  /// (eg. null pointer dereference), remove the branch leading here.
  static bool removeUndefIntroducingPredecessor(BasicBlock *BB)

Now we generate:

  define void @test6(i1 %cond, i8* %ptr) {
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

I did not see any impact on the test-suite + externals.

rdar://18596215

llvm-svn: 219462
2014-10-10 01:27:02 +00:00
Marcello Maggioni 963bc87dbd Two case switch to select optimization
This optimization tries to convert switch instructions that are used to select a value with only 2 unique cases + default block
to a select or a couple of selects (depending if the default block is reachable or not).

The typical case this optimization wants to be able to optimize is this one:

Example:
switch (a) {
  case 10:                %0 = icmp eq i32 %a, 10
    return 10;            %1 = select i1 %0, i32 10, i32 4
  case 20:        ---->   %2 = icmp eq i32 %a, 20
    return 2;             %3 = select i1 %2, i32 2, i32 %1
  default:
    return 4;
}

It also sets the base for further optimizations that are planned and being reviewed.

llvm-svn: 219223
2014-10-07 18:16:44 +00:00
Jingyue Wu fc0296704c [SimplifyCFG] threshold for folding branches with common destination
Summary:
This patch adds a threshold that controls the number of bonus instructions
allowed for folding branches with common destination. The original code allows
at most one bonus instruction. With this patch, users can customize the
threshold to allow multiple bonus instructions. The default threshold is still
1, so that the code behaves the same as before when users do not specify this
threshold.

The motivation of this change is that tuning this threshold significantly (up
to 25%) improves the performance of some CUDA programs in our internal code
base. In general, branch instructions are very expensive for GPU programs.
Therefore, it is sometimes worth trading more arithmetic computation for a more
straightened control flow. Here's a reduced example:

  __global__ void foo(int a, int b, int c, int d, int e, int n,
                      const int *input, int *output) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i];
    *output = sum;
  }

The select statement in the loop body translates to two branch instructions "if
((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination.
With the default threshold, SimplifyCFG is unable to fold them, because
computing the condition of the second branch "(i | c) ^ d > e" requires two
bonus instructions. With the threshold increased, SimplifyCFG can fold the two
branches so that the loop body contains only one branch, making the code
conceptually look like:

  sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i];

Increasing the threshold significantly improves the performance of this
particular example. In the configuration where both conditions are guaranteed
to be true, increasing the threshold from 1 to 2 improves the performance by
18.24%. Even in the configuration where the first condition is false and the
second condition is true, which favors shortcuts, increasing the threshold from
1 to 2 still improves the performance by 4.35%.

We are still looking for a good threshold and maybe a better cost model than
just counting the number of bonus instructions. However, according to the above
numbers, we think it is at least worth adding a threshold to enable more
experiments and tuning. Let me know what you think. Thanks!

Test Plan: Added one test case to check the threshold is in effect

Reviewers: nadav, eliben, meheff, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D5529

llvm-svn: 218711
2014-09-30 22:23:38 +00:00
Jingyue Wu b67140b812 Remove dead code in SimplifyCFG
Summary: UsedByBranch is always true according to how BonusInst is defined.

Test Plan:
Passes check-all, and also verified 

if (BonusInst && !UsedByBranch) {
  ...
}

is never entered during check-all.

Reviewers: resistor, nadav, jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, eliben, meheff

Differential Revision: http://reviews.llvm.org/D5324

llvm-svn: 217824
2014-09-15 20:48:13 +00:00
Hal Finkel 60db05896a Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.

As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.

The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.

Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.

This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).

llvm-svn: 217342
2014-09-07 18:57:58 +00:00