Commit Graph

1012 Commits

Author SHA1 Message Date
chenglin.bi a43c0974f0 [SimplifyCFG] Add tests for simpilfycfg, switch to lookup table with i2 types; NFC 2022-10-15 02:25:27 +08:00
Arthur Eubanks e23aee7175 [test] Update some legacy PM tests 2022-09-30 11:31:02 -07:00
Mingming Liu ac28efa6c1 [SimplifyCFG][TranformUtils]Do not simplify away a trivial basic block if both this block and at least one of its predecessors are loop latches.
- Before this patch, loop metadata (if exists) will override the metadata of each predecessor; if the predecessor block already has loop metadata, the orignal loop metadata won't be preserved and could cause missed loop transformations (see 'test2' in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll).

To illustrate how inner-loop metadata might be dropped before this patch:

CFG Before

      entry
        |
        v
 ---> while.cond   ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md1)
 |       |  |______|
 |       v
 |    while.cond.exit (md2)
 |       |
 |_______|

CFG After

       entry
         |
         v
 ---> while.cond.rewrite  ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md2)
 |_______|  |______|

Basically, when 'while.cond.exit' is folded into 'while.cond', 'md2' overrides 'md1' and 'md1' is dropped from the CFG.

Differential Revision: https://reviews.llvm.org/D134152
2022-09-28 10:48:14 -07:00
Mingming Liu 34db7c64df [NFC] Use opaqueptr in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll
Use opaqueptr for test case
llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll.

- Adjust variable number accordingly since bitcast between different pointer
  types are not necessary.

Differential Revision: https://reviews.llvm.org/D134159
2022-09-19 09:01:11 -07:00
Nikita Popov dd61726d5b Revert "[SimplifyCFG] accumulate bonus insts cost"
This reverts commit e5581df60a.

This causes major compile-time regressions, about 2-3% end-to-end
on CTMark.
2022-09-19 14:46:43 +02:00
Mingming Liu 7392b45162 [NFC][SimplifyCFG]Precommit test case to show inner-loop metadata may not be preserved
- There is an outer while-loop and an inner for-loop in the test case.
  Inner-loop has `llvm.loop.unroll.enable` metadata that is not
  preserved. This happens around [1], when the loop metadata of outer loop
  overrides the inner loop metadata directly, without looking at whether inner-loop
  itself has loop metadata.

 [1] ab755e6562/llvm/lib/Transforms/Utils/Local.cpp (L1146)

Differential Revision: https://reviews.llvm.org/D134014
2022-09-18 22:48:09 -07:00
Yaxun (Sam) Liu e5581df60a [SimplifyCFG] accumulate bonus insts cost
SimplifyCFG folds

bool foo() {
  if (cond1) return false;
  if (cond2) return false;
  return true;
}

as

bool foo() {
  if (cond1 | cond2) return false
  return true;
}

'cond2' is called 'bonus insts' in branch folding since they introduce overhead
since the original CFG could do early exit but the folded CFG always executes
them. SimplifyCFG calculates the costs of 'bonus insts' of a folding a BB into
its predecessor BB which shares the destination. If it is below bonus-inst-threshold,
SimplifyCFG will fold that BB into its predecessor and cond2 will always be executed.

When SimplifyCFG calculates the cost of 'bonus insts', it only consider 'bonus' insts
in the current BB to be considered for folding. This causes issue for unrolled loops
which share destinations, e.g.

bool foo(int *a) {
  for (int i = 0; i < 32; i++)
    if (a[i] > 0) return false;
  return true;
}

After unrolling, it becomes

bool foo(int *a) {
  if(a[0]>0) return false
  if(a[1]>0) return false;
  //...
  if(a[31]>0) return false;
  return true;
}

SimplifyCFG will merge each BB with its predecessor BB,
and ends up with 32 'bonus insts' which are always executed, which
is much slower than the original CFG.

The root cause is that SimplifyCFG does not consider the
accumulated cost of 'bonus insts' which are folded from
different BB's.

This patch fixes that by introducing a ValueMap to track
costs of 'bonus insts' coming from different BB's into
the same BB, and cuts off if the accumulated cost
exceeds a threshold.

Reviewed by: Artem Belevich, Florian Hahn, Nikita Popov, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D132408
2022-09-18 20:21:14 -04:00
Yaxun (Sam) Liu 97b5736975 [SimplifyCFG] add a test for branch folding multiple BB
Reviewed by: Florian Hahn

Differential Revision: https://reviews.llvm.org/D132910
2022-09-17 21:17:55 -04:00
Arthur Eubanks 5a33d1f0b9 [SimplifyCFG] Don't hoist allocas
D129370 started hoisting allocas across stacksave/stackrestore
boundaries which is wrong.

Reviewed By: chill, rnk

Differential Revision: https://reviews.llvm.org/D133730
2022-09-13 09:23:39 -07:00
Arthur Eubanks 724664c56b [test][SimplifyCFG] Precommit test with hoisting inallocas 2022-09-12 15:07:52 -07:00
Simon Pilgrim e80b9a8f37 [SimplifyCFG][X86] Regenerate speculate-cttz-ctlz.ll
There's no difference between generic/bmi/lzcnt targets atm
2022-09-12 15:16:44 +01:00
Momchil Velikov 078899cd64 [SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions
SimplifyCFG does some common code hoisting, which is limited
to hoisting a sequence of identical instruction in identical
order and stops at the first non-identical instruction.

This patch allows hoisting instruction pairs over
same-length sequences of non-matching instructions. The
linear asymptotic complexity of the algorithm stays the
same, there's an extra parameter
`simplifycfg-hoist-common-skip-limit` serving to limit
compilation time and/or the size of the hoisted live ranges.

The patch improves SPECv6/525.x264_r by about 10%.

Reviewed By: nikic, dmgreen

Differential Revision: https://reviews.llvm.org/D129370
2022-09-05 15:13:46 +01:00
Nikita Popov ab6876a40d reland: [Local] Allow creating callbr with duplicate successors
Since D129288, callbr is allowed to have duplicate successors. This patch removes a limitation which prevents optimizations from actually producing such callbrs.

This is probably the riskiest of all the recent callbr changes, because code with incorrect assumptions might be lurking somewhere. I fixed the one case I encountered ahead of time in 8201e3ef5c.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D129997

Originally landed as
commit 08860f525a ("[Local] Allow creating callbr with duplicate successors")

Reverted in
commit 1cf6b93df1 ("Revert "[Local] Allow creating callbr with duplicate successors"")
2022-08-31 13:23:00 -07:00
Dmitry Makogon 9142f67ef2 [SimplifyCFG] Don't widen cond br if false branch has successors
Fixes https://github.com/llvm/llvm-project/issues/57221.

This limits the tryWidenCondBranchToCondBranch transform making it
work only if the false block of widenable condition branch
has no successors.

If that block has successors, then SimplifyCondBranchToCondBranch
may undo the transform done by tryWidenCondBranchToCondBranch, which
would lead to infinite cycle of transformation and eventually
an assert failing.

Differential Revision: https://reviews.llvm.org/D132356
2022-08-26 15:23:37 +07:00
Dmitry Makogon 56b213090f [NFC] Remove undef from xfailed SimplifyCFG test
The test fails not because of undef, so replacing with
normal condition.
2022-08-23 14:53:05 +07:00
Jameson Nash 3a8d7fe201 [SimplifyCFG] teach simplifycfg not to introduce ptrtoint for NI pointers
SimplifyCFG expects to be able to cast both sides to an int, if either side can be case to an int, but this is not desirable or legal, in general, per D104547.

Spotted in https://github.com/JuliaLang/julia/issues/45702

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128670
2022-08-15 15:11:48 -04:00
Nikita Popov 7314ad7a06 Revert "[SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions"
This reverts commit 7b0f6378e2.

As commented on the review, this patch has a correctness issue
regarding the modelling of memory effects.
2022-08-01 09:20:56 +02:00
Momchil Velikov 7b0f6378e2 [SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions
SimplifyCFG does some common code hoisting, which is limited to hoisting a
sequence of identical instruction in identical order and stops at the first
non-identical instruction.

This patch allows hoisting instruction pairs over same-length sequences of
non-matching instructions. The linear asymptotic complexity of the algorithm
stays the same, there's an extra parameter `simplifycfg-hoist-common-skip-limit`
serving to limit compilation time and/or the size of the hoisted live ranges.

The patch improves SPECv6/525.x264_r by about 10%.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D129370
2022-08-01 07:55:14 +01:00
Nick Desaulniers 1cf6b93df1 Revert "[Local] Allow creating callbr with duplicate successors"
This reverts commit 08860f525a.

Crashes during PPC64LE linux kernel builds as reported by @nathanchance.
https://reviews.llvm.org/D129997#3663632
2022-07-19 15:03:27 -07:00
Nikita Popov 08860f525a [Local] Allow creating callbr with duplicate successors
Since D129288, callbr is allowed to have duplicate successors. This
patch removes a limitation which prevents optimizations from actually
producing such callbrs.

Differential Revision: https://reviews.llvm.org/D129997
2022-07-19 14:28:22 +02:00
Nikita Popov 8201e3ef5c [BasicBlockUtils] Don't drop callbr with unique successor
As callbr is now allowed to have duplicate destinations, we can
have a callbr with a unique successor. Make sure it doesn't get
dropped, as we still need to preserve the side-effect.
2022-07-18 12:26:29 +02:00
David Kreitzer c720b6fddd Clarify the behavior of the llvm.vector.insert/extract intrinsics when the index
is out of range. Both intrinsics return a poison value.

Consequently, mark the intrinsics speculatable.
Differential Revision: https://reviews.llvm.org/D129656
2022-07-15 07:56:44 -07:00
Nikita Popov 2a721374ae [IR] Don't use blockaddresses as callbr arguments
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:

    ; Before:
    %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
    to label %asm.fallthrough [label %foo]
    ; After:
    %res = callbr i8* asm "", "=r,r,!i"(i8* %x)
    to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:

* Allow unrolling/peeling/rotation of callbr, or any other
  clone-based optimizations
  (https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
  (https://github.com/llvm/llvm-project/issues/45248)

This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.

Differential Revision: https://reviews.llvm.org/D129288
2022-07-15 10:18:17 +02:00
Alexander Shaposhnikov c916840539 [SimplifyCFG] Improve SwitchToLookupTable optimization
Try to use the original value as an index (in the lookup table)
in more cases (to avoid one subtraction and shorten the dependency chain)
(https://github.com/llvm/llvm-project/issues/56189).

Test plan:
1/ ninja check-all
2/ bootstrapped LLVM + Clang pass tests

Differential revision: https://reviews.llvm.org/D128897
2022-07-13 23:21:45 +00:00
Yuanfang Chen fcb7d76d65 [coroutine] add nomerge function attribute to `llvm.coro.save`
It is illegal to merge two `llvm.coro.save` calls unless their
`llvm.coro.suspend` users are also merged. Marks it "nomerge" for
the moment.

This reverts D129025.

Alternative to D129025, which affects other token type users like WinEH.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D129530
2022-07-12 10:39:38 -07:00
Nikita Popov 40a4078e14 [BasicBlockUtils] Allow splitting predecessors with callbr terminators
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.

The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.

I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.

Differential Revision: https://reviews.llvm.org/D129205
2022-07-07 09:13:25 +02:00
Nikita Popov 20962c1240 [SimplifyCFG] Don't split predecessors of callbr terminator
This addresses the assertion failure reported in
https://reviews.llvm.org/D124159#3631240.

I believe that this limitation in SplitBlockPredecessors is not
actually necessary (because unlike with indirectbr, callbr is
restricted in a way that does allow updating successors), but for
now fix the assertion failure the same way we do everywhere else,
by also skipping callbr.
2022-07-06 15:38:53 +02:00
Nikita Popov 11950efe06 [ConstExpr] Remove div/rem constant expressions
D128820 stopped creating div/rem constant expressions by default;
this patch removes support for them entirely.

The getUDiv(), getExactUDiv(), getSDiv(), getExactSDiv(), getURem()
and getSRem() on ConstantExpr are removed, and ConstantExpr::get()
now only accepts binary operators for which
ConstantExpr::isSupportedBinOp() returns true. Uses of these methods
may be replaced either by corresponding IRBuilder methods, or
ConstantFoldBinaryOpOperands (if a constant result is required).

On the C API side, LLVMConstUDiv, LLVMConstExactUDiv, LLVMConstSDiv,
LLVMConstExactSDiv, LLVMConstURem and LLVMConstSRem are removed and
corresponding LLVMBuild methods should be used.

Importantly, this also means that constant expressions can no longer
trap! This patch still keeps the canTrap() method to minimize diff --
I plan to drop it in a separate NFC patch.

Differential Revision: https://reviews.llvm.org/D129148
2022-07-06 10:11:34 +02:00
Yuanfang Chen b170d856a3 [SimplifyCFG] Skip hoisting common instructions that return token type
By LangRef, hoisting token-returning instructions obsures the origin
so it should be skipped. Found this issue while investigating a
CoroSplit pass crash.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129025
2022-07-05 11:21:57 -07:00
Nikita Popov a4772cbaf0 Revert "[SimplifyCFG] Thread branches on same condition in more cases (PR54980)"
This reverts commit 4e545bdb35.

The newly added test is the third infinite combine loop caused by
this change. In this case, it's a combination of the branch to
common dest and jump threading folds that keeps peeling off loop
iterations.

The core problem here is that we ideally would not thread over
loop backedges, both because it is potentially non-profitable
(it may break canonical loop structure) and because it may result
in these kinds of loops. Unfortunately, due to the lack of a
dominator tree in SimplifyCFG, there is no good way to prevent
this. While we have LoopHeaders, this is an optional structure and
we don't do a good job of keeping it up to date. It would be fine
for a profitability check, but is not suitable for a correctness
check.

So for now I'm just giving up here, as I don't see a good way to
robustly prevent infinite combine loops.

Fixes https://github.com/llvm/llvm-project/issues/56203.
2022-07-05 16:57:46 +02:00
Nikita Popov dc969061c6 [SimplifyCFG] Thread all predecessors with same value at once
If there are multiple predecessors that have the same condition
value (and thus same "real destination"), these were previously
handled by copying the threaded block for each predecessor.
Instead, we can reuse one block for all of them. This makes the
behavior of SimplifyCFG's jump threading match that of the
actual JumpThreading pass.

This also avoids the infinite combine loop reported in:
https://reviews.llvm.org/D124159#3624387
2022-07-05 14:33:53 +02:00
Nikita Popov 02ab87f543 [SimplifyCFG] Add additional jump threading test (NFC)
A case where multiple predecessors can be threaded over the same
edge, with a phi node in the threaded block.
2022-07-05 13:57:58 +02:00
Nikita Popov 2b089e9ae0 [SimplifyCFG] Try to merge edge block when threading (PR55765)
When threading, we always create a new block for the threaded edge
(even if the edge is not critical), which will later get folded back
into the predecessor if possible. Depending on precise processing
order, this separate block may break the detection of trivial
cycles in the threading code, which normally avoids infinite
threading of loops. Explicitly merge the created edge block into
the predecessor to avoid this.

Fixes https://github.com/llvm/llvm-project/issues/55765.

Differential Revision: https://reviews.llvm.org/D127216
2022-06-20 10:29:33 +02:00
Nikita Popov daf897d559 [IR] Check for SignedMin/-1 division in canTrap() (PR56038)
In addition to division by zero, signed division also traps for
SignedMin / -1. This was handled in isSafeToSpeculativelyExecute(),
but not in Constant::canTrap().
2022-06-17 11:25:11 +02:00
Samuel Eubanks bf02ed240d Prevent crash when TurnSwitchRangeIntoICmp receives default unreachable destination
TurnSwitchRangeIntoICmp crashes when given a switch with a default
destination of unreachable
Addresses issue #53208
https://github.com/llvm/llvm-project/issues/53208

Differential revision: https://reviews.llvm.org/D127712
2022-06-16 16:11:24 +02:00
Nikita Popov 571c713144 [SimplifyCFG] Handle trapping aggregates (PR49839)
Handle the fact that not only constant expressions, but also
constant aggregates containing expressions can trap.

This still doesn't fix the original C reproducer, probably due to
more issues remaining in other passes.
2022-06-13 14:56:49 +02:00
Nikita Popov abefed6f97 [SimplifyCFG] Add test for PR49839 (NFC) 2022-06-13 14:35:09 +02:00
Florian Hahn b8d728a098
[SimplifyCFG,EarlyCSE] Update 2 tests to not branch on undef (NFC). 2022-06-12 18:03:26 +01:00
Nikita Popov 41d5033eb1 [IR] Enable opaque pointers by default
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:

* If IR that contains *neither* explicit ptr nor %T* types is passed
  to tools, we will now use opaque pointer mode, unless
  -opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
  It is possible to opt-out by calling setOpaquePointers(false) on
  LLVMContext.

A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.

Differential Revision: https://reviews.llvm.org/D126689
2022-06-02 09:40:56 +02:00
Nikita Popov 2e101cca69 [Local] Don't remove invoke of non-willreturn function
The code was only checking for memory side-effects, but not for
divergence side-effects. Replace this with a generic check.
2022-05-30 15:37:46 +02:00
Nikita Popov 1f1de06165 [SimplifyCFG] Add test for invoke of nounwind non-willreturn function (NFC)
Test both the case with and without willreturn attribute.
2022-05-30 15:36:33 +02:00
Florian Hahn a80081763c
[SimplifyCFG] Avoid shifting by a too large exponent.
TI->getBitWidth can be > 64 and in those cases the shift will be UB due
to the exponent being too large.

To fix this, cap the shift at 63. I think this should work out fine,
because TableSize is itself a 64 bit type and the maximum table size
must fit in the type. Also, if we would underestimate the size here, at
most we get an extra ZExt.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124608
2022-04-29 15:19:06 +01:00
Nikita Popov 884e9a877b [SimplifyCFG] Replace condition value when threading
Replace the condition value with the known constant value on the
threaded edge. This happens implicitly with phi threading because
we replace with the incoming value, but not for non-phi threading.
2022-04-29 09:50:27 +02:00
Nikita Popov 4e545bdb35 [SimplifyCFG] Thread branches on same condition in more cases (PR54980)
SimplifyCFG implements basic jump threading, if a branch is
performed on a phi node with constant operands. However,
InstCombine canonicalizes such phis to the condition value of a
previous branch, if possible. SimplifyCFG does support this as
well, but only in the very limited case where the same condition
is used in a direct predecessor -- notably, this does not include
the common diamond pattern (i.e. two consecutive if/elses on the
same condition).

This patch extends the code to look back a limited number of
blocks to find a branch on the same value, rather than only
looking at the direct predecessor.

Fixes https://github.com/llvm/llvm-project/issues/54980.

Differential Revision: https://reviews.llvm.org/D124159
2022-04-29 09:44:05 +02:00
Nikita Popov d727505e40 [SimplifyCFG] Remove one-use limitation in FoldCondBranchOnPHI()
BlockIsSimpleEnoughToThreadThrough() already checks that the phi
(and all other instructions) are not used outside the block, so
this one-use check is not necessary for legality. I also don't
see any reason why it would be necessary for profitability (in
fact, those extra uses will be replaced with constants, which
should be generally profitable).
2022-04-20 15:56:20 +02:00
Nikita Popov 37b1515b0a [SimplifyCFG] Add additional threading tests (NFC) 2022-04-20 15:40:44 +02:00
chenglin.bi 00871e2f4f [SimplifyCFG] Try to fold switch with single result value and power-of-2 cases to mask+select
When switch with 2^n cases go to one result, check if the 2^n cases can be covered by n bit masks.
If yes we can use "and condition, ~mask" to simplify the switch

case 0 2 4 6 -> and condition, -7
https://alive2.llvm.org/ce/z/jjH_0N

case 0 2 8 10 -> and condition, -11
https://alive2.llvm.org/ce/z/K7E-2V

case 2 4 8 12 -> and (sub condition, 2), -11
https://alive2.llvm.org/ce/z/CrxbYg

Fix one case of https://github.com/llvm/llvm-project/issues/39957

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D122485
2022-04-15 00:10:00 +08:00
Sanjay Patel d038135e19 [SimplifyCFG] add more tests for switch to select transform; NFC
Also, make test names more descriptive -
additional coverage for D122485
2022-04-13 17:14:45 -04:00
chenglin.bi fd0641b58c [SimplifyCFG] add tests for switch to select; NFC
Baseline tests for D122485 (issue #39957)
2022-04-13 22:35:25 +08:00
chenglin.bi dfc98d0c9c Revert "[SimplifyCFG] add tests for switch to select; NFC"
This reverts commit e2d77a160c.
2022-04-13 22:35:25 +08:00