Commit Graph

1180 Commits

Author SHA1 Message Date
Roman Lebedev c8ba2b67a0
[SimplifyCFG] 'merge compatible invokes': fully support indirect invokes
As long as *all* the invokes in the set are indirect,
we can merge them, but don't merge direct invokes into the set,
even though it would be legal to do.
2022-02-08 21:29:38 +03:00
Roman Lebedev 414b47645d
[SimplifyCFG] 'merge compatible invokes': don't create trivial PHI's with all-identical incoming values 2022-02-08 21:29:38 +03:00
Roman Lebedev 42ca7cc889
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ uses
If the original invokes had uses, the uses must have been in PHI's,
but that immediately results in the incoming values being incompatible.
But we'll replace uses of the original invokes with the use of the
merged invoke, so as long as the incoming values become compatible
after that, we can merge.
2022-02-08 17:49:38 +03:00
Roman Lebedev 9986d60224
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ PHIs but no uses
As long as the incoming values for all the invokes in the set
are identical, we can merge the invokes.
2022-02-08 17:49:38 +03:00
Roman Lebedev 8411560fd0
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ no uses, no PHI's
Even if the invokes have normal destination, iff it's the same block,
we can merge them. For now, require that there are no PHI nodes,
and the returned values of invokes aren't used.
2022-02-08 17:49:38 +03:00
Roman Lebedev 55cd727c9a
[SimplifyCFG] 'merge compatible invokes': allow PHI nodes in landing pads
... iff the incoming values for the invokes-to-be-merged
are compatible (identical).
2022-02-04 20:26:44 +03:00
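A minimal sketch of the "identical incoming values" rule described in the commit above. This is not the LLVM implementation: PHI nodes are modeled as a map from predecessor block id to an incoming value name, and `PhiModel` and `incomingValuesAreCompatible` are hypothetical names chosen for illustration. A set of predecessors counts as compatible if, in every PHI, all of them supply the same value.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a PHI node: predecessor block id -> incoming value name.
using PhiModel = std::map<int, std::string>;

// Returns true if, for every PHI, all listed predecessors supply the same value.
bool incomingValuesAreCompatible(const std::vector<PhiModel> &phis,
                                 const std::vector<int> &preds) {
  for (const PhiModel &phi : phis) {
    const std::string *first = nullptr;
    for (int pred : preds) {
      auto it = phi.find(pred);
      if (it == phi.end())
        return false; // predecessor has no entry: treat as incompatible
      if (!first)
        first = &it->second; // remember the value from the first predecessor
      else if (*first != it->second)
        return false; // two predecessors disagree on the incoming value
    }
  }
  return true;
}
```

With no PHI nodes at all the check trivially succeeds, which matches the simpler case where the block has no PHIs.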
Roman Lebedev 0d384e9228
[NFC][SimplifyCFG] Extract `IncomingValuesAreCompatible()` out of `SafeToMergeTerminators()` 2022-02-04 20:26:44 +03:00
Roman Lebedev 36df803dfd
[SimplifyCFG] Merge compatible `invoke`s of a `landingpad`
While nowadays SimplifyCFG knows how to hoist code from then-else blocks,
sink code from unconditional predecessors, and even promote the latter
by tail-merging `ret`/`resume` function terminators, that isn't everything.

While i (& others) have been trying to deal with merging/sinking `unreachable`,
the more impactful remaining problem is apparently merging the `throw`
calls.

If we start at the `landingpad`, all the predecessors are unwind edges of `invoke`s,
and in some cases some of the `invoke`s are mergeable.
```
/// This is a weird mix of hoisting and sinking. Visually, it goes from:
///          [...]        [...]
///            |            |
///        [invoke0]    [invoke1]
///           / \          / \
///     [cont0] [landingpad] [cont1]
/// to:
///      [...] [...]
///          \ /
///       [invoke]
///          / \
///     [cont] [landingpad]
```

This simplifies the IR/CFG, at the cost of debug info and extra PHI nodes.

Note that we don't require *all* the `invoke`s of the `landingpad`
to be mergeable; they can form more than a single set, and we handle that gracefully.

For now, i completely disallowed normal destinations, PHI nodes, and indirect invokes,
but those can be supported.

Out of all the CTMark projects, only 7zip is C++, so there isn't much impact:
https://llvm-compile-time-tracker.com/compare.php?from=ba8eb31bd9542828f6424e15a3014f80f14522c8&to=722fc871c84f14157d45c2159bc9c8c7e2825785&stat=size-total
... but there it currently causes size-total decrease.

Differential Revision: https://reviews.llvm.org/D117805
2022-02-04 17:04:21 +03:00
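The grouping of `invoke`s into mergeable sets can be sketched in standalone C++. This is a simplified model, not the SimplifyCFG code: `InvokeModel`, `compatible`, and `groupInvokes` are hypothetical names, and the compatibility rule is an assumption that only mirrors the restrictions listed in the commit message (no indirect invokes, no real normal destination, same callee).

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an `invoke` instruction.
struct InvokeModel {
  std::string callee;         // name of the directly called function
  bool isIndirect;            // true if the callee is not known statically
  bool normalDestUnreachable; // true if the normal destination is `unreachable`
};

// Assumed rule for this sketch: both invokes are direct, neither has a real
// normal destination, and both call the same function.
bool compatible(const InvokeModel &a, const InvokeModel &b) {
  if (a.isIndirect || b.isIndirect)
    return false;
  if (!a.normalDestUnreachable || !b.normalDestUnreachable)
    return false;
  return a.callee == b.callee;
}

// Greedily partition invoke indices into sets. Each invoke joins the first set
// whose representative it is compatible with, otherwise it starts a new set.
std::vector<std::vector<int>> groupInvokes(const std::vector<InvokeModel> &invokes) {
  std::vector<std::vector<int>> sets;
  for (int i = 0; i < static_cast<int>(invokes.size()); ++i) {
    bool placed = false;
    for (std::vector<int> &set : sets) {
      if (compatible(invokes[set.front()], invokes[i])) {
        set.push_back(i);
        placed = true;
        break;
      }
    }
    if (!placed)
      sets.push_back({i});
  }
  return sets;
}
```

Only sets with two or more members are candidates for merging; singleton sets are left alone, which is how the "not all invokes need to be mergeable" case falls out of the model.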
Roman Lebedev ee4ba9f3a1
Revert "[SimplifyCFG] Start redesigning `FoldTwoEntryPHINode()`."
Unfortunately, it seems we really do need to take the long route;
start from the "merge" block, find (all the) "dispatch" blocks,
and deal with each "dispatch" block separately, instead of simply
starting from each "dispatch" block like it would logically make sense,
otherwise we run into a number of other missing folds around
`switch` formation, missing sinking/hoisting and phase ordering.

This reverts commit 85628ce75b.
This reverts commit c5fff90953.
This reverts commit 34a98e1046.
This reverts commit 1e353f0922.
2022-02-03 12:32:50 +03:00
Fangrui Song 85628ce75b [SimplifyCFG] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-02-02 15:11:22 -08:00
Roman Lebedev c5fff90953
[NFC][SimplifyCFG] Merge `FoldTwoEntryPHINode()` into its only callee 2022-02-02 17:53:56 +03:00
Roman Lebedev 34a98e1046
[NFC][SimplifyCFG] `FoldTwoEntryPHINode()`: s/BB/MergeBB/ 2022-02-02 17:53:56 +03:00
Roman Lebedev 1e353f0922
[SimplifyCFG] Start redesigning `FoldTwoEntryPHINode()`.
The current `FoldTwoEntryPHINode()` is not quite designed correctly.
It starts from the merge point, and then tries to detect
the 'divergence' point.

Because of that, it is limited to the simple two-predecessor case,
where the PHI completely goes away, but that is rather pessimistic,
and it doesn't make much sense from the cost model side of things.

For example if there is some other unrelated predecessor of
the merge point, we could split the merge point so that
the then/else blocks first branch to an empty block
and then to the merge point, and then we'd be able to speculate
the then/else code.

But if we'd instead simply start at the divergence point,
and look for the merge point, then we'll just natively support this case.

There's also the fact that `SpeculativelyExecuteBB()` already does
just that, but only if there is a single block to speculate,
and with a much more restrictive cost model.
But that also means we have code duplication.

Now, sadly, while this is as much NFCI as possible,
there is just no way to cleanly migrate to
the proper implementation. The results *are* going to be different
somewhat because of various phase ordering effects and SimplifyCFG
block iteration strategy.
2022-02-02 17:53:56 +03:00
pvellien 4e1c207726 [SimplifyCFG] Fix assertion failure when reusing table switch comparison
After D116332, some icmps no longer fold with the target-independent
constant folder. The SimplifyCFG code assumed that the comparison
would always fold, which is not guaranteed. Explicitly check that the
result is either true or false.

Differential Revision: https://reviews.llvm.org/D117184
2022-01-18 09:30:54 +01:00
Roman Lebedev 82c8aca934
[SimplifyCFG] Be more aggressive when sinking into block followed by unreachable
I strongly believe we need some variant of this.

The main problem is e.g. that the glibc's assert has 4 parameters,
but the profitability check is only okay with one extra phi node,
so D116692 doesn't even trigger on most of the expected cases.

While that restriction probably makes sense in normal code, if we
are about to run off of a cliff (into an `unreachable`), this
successor block is unlikely, so the cost to set up these PHI nodes
should not be on the hot path, and shouldn't matter performance-wise.

Likewise, we don't sink if there are unconditional predecessors
UNLESS we'd sink at least one non-speculatable instruction,
which is a performance workaround, but if we are about to run into
`unreachable`, it shouldn't matter.

Note that we only allow the case where there are at
most unconditional branches on the way to the unreachable block.

Differential Revision: https://reviews.llvm.org/D117045
2022-01-13 23:30:31 +03:00
Craig Topper cbcbbd6ac8 [ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC
This function returns an upper bound on the number of bits needed
to represent the signed value. Use "Max" to match similar functions
in KnownBits like countMaxActiveBits.

Rename APInt::getMinSignedBits->getSignificantBits. Keeping the old
name around to keep this patch size down. Will do a bulk rename as
follow up.

Rename KnownBits::countMaxSignedBits->countMaxSignificantBits.

Reviewed By: lebedev.ri, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D116522
2022-01-03 11:33:30 -08:00
Craig Topper 14849fe554 [SimplifyCFG] Make use of ComputeMinSignedBits and KnownBits::getBitWidth. NFC 2022-01-03 10:08:14 -08:00
Kazu Hirata 26bd534a79 [llvm] Use none_of instead of !any_of (NFC) 2021-12-17 13:48:57 -08:00
Bjorn Pettersson 297fb66484 Use a deterministic order when updating the DominatorTree
This solves a problem with non-deterministic output from opt due
to not performing dominator tree updates in a deterministic order.

The problem that was analysed indicated that JumpThreading was using
the DomTreeUpdater via llvm::MergeBasicBlockIntoOnlyPred. When
preparing the list of updates to send to DomTreeUpdater::applyUpdates
we iterated over a SmallPtrSet, which didn't give a well-defined
order of updates to perform.

The added domtree-updates.ll test case is an example that would
result in non-deterministic printouts of the domtree. Semantically
those domtrees are equivalent, but it shows that when we
use the domtree iterator, the order in which nodes are visited depends
on the order in which dominator tree updates are performed.

Since some passes (at least EarlyCSE) iterate over nodes in the
dominator tree in a similar fashion to the domtree printer, the
order in which transforms are applied by such passes, transitively,
also depends on the order in which dominator tree updates are
performed. Taking EarlyCSE as an example, the end result could be
different depending on the order in which the transforms are applied.

Reviewed By: nikic, kuhar

Differential Revision: https://reviews.llvm.org/D110292
2021-11-29 13:14:50 +01:00
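The underlying issue is that iterating a pointer-keyed set visits elements in an order that depends on pointer values. One common way to get a well-defined order is to deduplicate while preserving first-seen order. The sketch below uses only the C++ standard library rather than LLVM's ADT containers, and `uniqueInInsertionOrder` is a hypothetical helper name; it is not claimed to be the exact change made by the commit.

```cpp
#include <unordered_set>
#include <vector>

// Returns the distinct elements of `items`, in the order they first appear.
// The set is used only for membership tests and is never iterated, so its
// internal (unspecified) ordering cannot leak into the result.
template <typename T>
std::vector<T> uniqueInInsertionOrder(const std::vector<T> &items) {
  std::unordered_set<T> seen;
  std::vector<T> result;
  for (const T &item : items) {
    if (seen.insert(item).second) // true only for the first occurrence
      result.push_back(item);
  }
  return result;
}
```

Building the list of updates this way means two runs over the same input produce the same sequence, regardless of where the objects happen to be allocated.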
Jun Ma 07333810ca Revert "Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""""
This reverts commit c93f93b2e3.
2021-11-24 10:26:37 +08:00
Florian Hahn 2ead34716a
[SimplifyCFG] Add early bailout if Use is not in same BB.
Without this patch, passingValueIsAlwaysUndefined will iterate over all
instructions from I to the end of the basic block, even if the use is
outside the block.

This patch adds an early bail out, if the use instruction is outside I's
BB. This can greatly reduce compile-time in cases where very large basic
blocks are involved, with a large number of PHI nodes and incoming
values.

Note that the refactoring makes the handling of the case where I is a
phi and Use is in a PHI more explicit as well: for phi nodes, we can also
directly bail out. In the existing code, we would iterate until we reach
the end and return false.

Based on an earlier patch by Matt Wala.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D113293
2021-11-09 12:57:03 +00:00
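The early bailout can be illustrated with a small model. This is only a sketch of the control flow, not `passingValueIsAlwaysUndefined` itself (which performs further checks): instructions are plain integer ids, and `useFollowsInSameBlock`, `blocks`, and `parentOf` are hypothetical names.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical model: instructions are ids; parentOf[id] is the owning block,
// and blocks[b] lists the ids of block b in program order.
bool useFollowsInSameBlock(const std::vector<std::vector<int>> &blocks,
                           const std::vector<int> &parentOf, int inst, int use) {
  // Early bailout: a use in another block can never be found by scanning
  // inst's block, so skip the (possibly very long) scan entirely.
  if (parentOf[inst] != parentOf[use])
    return false;

  const std::vector<int> &bb = blocks[parentOf[inst]];
  auto it = std::find(bb.begin(), bb.end(), inst);
  if (it == bb.end())
    return false; // inconsistent model: inst is not in its parent block

  // Scan only the instructions strictly after inst.
  return std::find(std::next(it), bb.end(), use) != bb.end();
}
```

Without the first check, a use in a different block would cost a full scan to the end of the block before returning the same `false`.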
Jun Ma c93f93b2e3 Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""""
This reverts commit 3a998c06a8.
2021-11-01 15:31:59 +08:00
Kazu Hirata c714da2ceb [Transforms] Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC) 2021-10-31 07:57:32 -07:00
Max Kazantsev 9bbfe0f72c [NFC] Remove obsolete simplifyOnceImpl function
The function simplifyOnce only calls simplifyOnceImpl and does nothing else.
Having this separate helper makes no sense. Removing it.

Patch by Dmitry Bakunevich!

Differential Revision: https://reviews.llvm.org/D112517
Reviewed By: mkazantsev
2021-10-26 13:51:42 +07:00
Nikita Popov 1848525842 [CodeMetrics] Don't require speculatability for ephemeral values
As discussed in D112016, our current requirement of speculatability
for ephemeral is overly strict: What we really care about is that
the instruction will be DCEd once the assume is dropped. For that
it is sufficient that the instruction is side-effect free and not
a terminator.

In particular, this allows non-dereferenceable loads to be ephemeral
values.

Differential Revision: https://reviews.llvm.org/D112179
2021-10-21 20:30:01 +02:00
Hongtao Yu 098a0d8fbc [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3.
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.

Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by avoiding pseudo probe from clobbering memory SSA
- Unblocked induction variable simplification
- Allow empty loop deletion by treating probe intrinsic isDroppable
- Some refactoring.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110847
2021-10-12 09:44:12 -07:00
Jun Ma 3a998c06a8 Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""
This reverts commit 8ba2adcf9e.
2021-09-27 20:39:05 +08:00
Arthur Eubanks e7249e4acf [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest
When determining whether to fold branches to a common destination by
merging two blocks, SimplifyCFG will count the number of instructions to
be moved into the first basic block. However, there's no reason to count
free instructions like bitcasts and other similar instructions.

This resolves missed branch foldings with -fstrict-vtable-pointers in
llvm-test-suite's lambda benchmark.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D108837
2021-09-22 09:52:37 -07:00
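A minimal sketch of the cost counting idea, under stated assumptions: `OpKind`, `isFree`, `countBonusInstructions`, and `shouldFold` are hypothetical names, and treating only bitcasts as free is a simplification made for this example. In LLVM the notion of a free instruction comes from the target cost model, which is not modeled here.

```cpp
#include <vector>

// Hypothetical instruction kinds for the model.
enum class OpKind { Bitcast, Add, Load, Compare };

// Assumption for this sketch: only bitcasts are free.
bool isFree(OpKind op) { return op == OpKind::Bitcast; }

// Counts the instructions that would be moved into the first block,
// skipping the ones considered free.
unsigned countBonusInstructions(const std::vector<OpKind> &insts) {
  unsigned count = 0;
  for (OpKind op : insts) {
    if (!isFree(op))
      ++count;
  }
  return count;
}

// Folding is allowed while the non-free instruction count stays within the
// threshold.
bool shouldFold(const std::vector<OpKind> &insts, unsigned threshold) {
  return countBonusInstructions(insts) <= threshold;
}
```

A block consisting of two bitcasts and one add counts as a single bonus instruction in this model, so it passes a threshold of 1 where a naive count of 3 would not.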
Max Kazantsev 073b254cff [SimplifyCFG] Redirect switch cases that lead to UB into an unreachable block
When following a case of a switch instruction is guaranteed to lead to
UB, we can safely break these edges and redirect those cases into a newly
created unreachable block. As a result, the CFG becomes simpler and we can
remove some of the PHI inputs to make further analyses easier.

Patch by Dmitry Bakunevich!

Differential Revision: https://reviews.llvm.org/D109428
Reviewed By: lebedev.ri
2021-09-21 10:45:19 +07:00
Nikita Popov 0fc624f029 [IR] Return AAMDNodes from Instruction::getMetadata() (NFC)
getMetadata() currently uses a weird API where it populates a
structure passed to it, and optionally merges into it. Instead,
we can return the AAMDNodes and provide a separate merge() API.
This makes usages more compact.

Differential Revision: https://reviews.llvm.org/D109852
2021-09-16 21:06:57 +02:00
Arthur Eubanks d49cb5b303 [SimplifyCFG] Add bonus when seeing vector ops to branch fold to common dest
This makes some tests in vector-reductions-logical.ll more stable when
applying D108837.

The cost of branching is higher when vector ops are involved due to
potential SLP transformations.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D108935
2021-09-16 10:50:36 -07:00
Kazu Hirata 24c8eaec94 [Transforms] Use make_early_inc_range (NFC) 2021-09-15 19:55:24 -07:00
Owen Anderson 68079ef0eb Teach SimplifyCFG to fold switches into lookup tables in more cases.
In particular, it couldn't handle cases where lookup table constant
expressions involved bitcasts. This does not seem to come up
frequently in C++, but comes up reasonably often in Rust via
`#[derive(Debug)]`.

Originally reported by pcwalton.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D109565
2021-09-15 22:07:08 +00:00
Roman Lebedev 909cba9699
[SimplifyCFG] performBranchToCommonDestFolding(): require block-closed SSA form for bonus instructions (PR51125)
I can't seem to wrap my head around the proper fix here,
we should be fine without this requirement, iff we can form this form,
but the naive attempt (https://reviews.llvm.org/D106317) has failed.
So just to unblock the release, put up a restriction.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51125
2021-09-09 12:28:09 +03:00
Jun Ma 8ba2adcf9e Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""
Differential Revision: https://reviews.llvm.org/D106056
2021-09-09 16:53:33 +08:00
Max Kazantsev 29d054bf12 [SimplifyCFG] Preserve knowledge about guarding condition by adding assume
This improvement adds "assume" after a branch is removed based on UB in a successor block.

Consider the following example:

```
pred:
  x = ...
  cond = x > 10
  br cond, bb, other.succ

bb:
  phi [nullptr, pred], ... // other possible preds
  load(phi) // UB if we came from pred

other.succ:
  // here we know that x <= 10, but this knowledge is lost
  // after the branch is turned to unconditional unless we
  // preserve it with assume.
```

If we remove the branch based on knowledge about UB in a successor block,
then the fact that x <= 10 in other.succ might be lost if this condition is
not inferable from any dominating condition. To preserve this knowledge, we
can add an assume intrinsic with the (possibly inverted) branch condition.

Patch by Dmitry Bakunevich!

Differential Revision: https://reviews.llvm.org/D109054
Reviewed By: lebedev.ri
2021-09-08 14:05:17 +07:00
Roman Lebedev 5d4f37e895
[NFCI][SimplifyCFG] Rewrite `createUnreachableSwitchDefault()`
The only thing that function should do, as per its semantics,
is to ensure that the switch's default is a block consisting only of
an `unreachable` terminator.

So let's just create such a block and update switch's default
to point to it. There should be no need for all this weird dance
around predecessors/successors.
2021-08-20 13:28:08 +03:00
Sanjay Patel ec54e275f5 Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."
This reverts commit 9934a5b2ed.
This patch may cause miscompiles because it missed a constraint
as shown in the examples from:
https://llvm.org/PR51531
2021-08-19 08:43:51 -04:00
Jun Ma 9934a5b2ed [CVP] processSwitch: Remove default case when switch cover all possible values.
Differential Revision: https://reviews.llvm.org/D106056
2021-08-18 10:23:13 +08:00
Roman Lebedev 2eb554a9fe
Revert "Reland [SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125)"
This is still wrong, as failing bots suggest.

This reverts commit 3d9beefc7d.
2021-08-16 11:07:42 +03:00
Roman Lebedev 3d9beefc7d
Reland [SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125)
... with test change this time.

LLVM IR SSA form is "implicit" in `@pr51125`. While that is valid LLVM IR,
and does not require any PHI nodes, it completely breaks the further logic
in `CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses()`
that updates the live-out uses of the bonus instructions.

What i believe we need to do, is to first make the SSA form explicit,
by inserting tautological PHI nodes, and rewriting the offending uses.

```
$ /builddirs/llvm-project/build-Clang12/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -load-pass-plugin /repositories/alive2/build-Clang-release/tv/tv.so -tv -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=10 -tv -o /dev/null /tmp/test.ll

----------------------------------------
@global_pr51125 = global 4 bytes, align 4

define i32 @pr51125() {
%entry:
  br label %L

%L:
  %ld = load i32, * @global_pr51125, align 4
  %iszero = icmp eq i32 %ld, 0
  br i1 %iszero, label %exit, label %L2

%L2:
  store i32 4294967295, * @global_pr51125, align 4
  %cmp = icmp eq i32 %ld, 4294967295
  br i1 %cmp, label %L, label %exit

%exit:
  %r = phi i32 [ %ld, %L2 ], [ %ld, %L ]
  ret i32 %r
}
=>
@global_pr51125 = global 4 bytes, align 4

define i32 @pr51125() {
%entry:
  %ld.old = load i32, * @global_pr51125, align 4
  %iszero.old = icmp eq i32 %ld.old, 0
  br i1 %iszero.old, label %exit, label %L2

%L2:
  %ld2 = phi i32 [ %ld.old, %entry ], [ %ld, %L2 ]
  store i32 4294967295, * @global_pr51125, align 4
  %cmp = icmp ne i32 %ld2, 4294967295
  %ld = load i32, * @global_pr51125, align 4
  %iszero = icmp eq i32 %ld, 0
  %or.cond = select i1 %cmp, i1 1, i1 %iszero
  br i1 %or.cond, label %exit, label %L2

%exit:
  %ld1 = phi i32 [ poison, %L2 ], [ %ld.old, %entry ]
  %r = phi i32 [ %ld2, %L2 ], [ %ld.old, %entry ]
  ret i32 %r
}
Transformation seems to be correct!

```

Fixes https://bugs.llvm.org/show_bug.cgi?id=51125

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D106317
2021-08-15 19:16:04 +03:00
Roman Lebedev 60dd0121c9
Revert "[SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125)"
Forgot to stage the test change.

This reverts commit 78af5cb213.
2021-08-15 19:15:09 +03:00
Roman Lebedev 78af5cb213
[SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125)
LLVM IR SSA form is "implicit" in `@pr51125`. While that is valid LLVM IR,
and does not require any PHI nodes, it completely breaks the further logic
in `CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses()`
that updates the live-out uses of the bonus instructions.

What i believe we need to do, is to first make the SSA form explicit,
by inserting tautological PHI nodes, and rewriting the offending uses.

```
$ /builddirs/llvm-project/build-Clang12/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -load-pass-plugin /repositories/alive2/build-Clang-release/tv/tv.so -tv -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=10 -tv -o /dev/null /tmp/test.ll

----------------------------------------
@global_pr51125 = global 4 bytes, align 4

define i32 @pr51125() {
%entry:
  br label %L

%L:
  %ld = load i32, * @global_pr51125, align 4
  %iszero = icmp eq i32 %ld, 0
  br i1 %iszero, label %exit, label %L2

%L2:
  store i32 4294967295, * @global_pr51125, align 4
  %cmp = icmp eq i32 %ld, 4294967295
  br i1 %cmp, label %L, label %exit

%exit:
  %r = phi i32 [ %ld, %L2 ], [ %ld, %L ]
  ret i32 %r
}
=>
@global_pr51125 = global 4 bytes, align 4

define i32 @pr51125() {
%entry:
  %ld.old = load i32, * @global_pr51125, align 4
  %iszero.old = icmp eq i32 %ld.old, 0
  br i1 %iszero.old, label %exit, label %L2

%L2:
  %ld2 = phi i32 [ %ld.old, %entry ], [ %ld, %L2 ]
  store i32 4294967295, * @global_pr51125, align 4
  %cmp = icmp ne i32 %ld2, 4294967295
  %ld = load i32, * @global_pr51125, align 4
  %iszero = icmp eq i32 %ld, 0
  %or.cond = select i1 %cmp, i1 1, i1 %iszero
  br i1 %or.cond, label %exit, label %L2

%exit:
  %ld1 = phi i32 [ poison, %L2 ], [ %ld.old, %entry ]
  %r = phi i32 [ %ld2, %L2 ], [ %ld.old, %entry ]
  ret i32 %r
}
Transformation seems to be correct!

```

Fixes https://bugs.llvm.org/show_bug.cgi?id=51125

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D106317
2021-08-15 19:02:34 +03:00
Roman Lebedev c46546bd52
Reland "[NFCI][SimplifyCFG] simplifyCondBranch(): assert that branch is non-tautological""
The commit originally unearthed a problem, reported as
https://reviews.llvm.org/rGf30a7dff8a5b32919951dcbf92e4a9d56c4679ff#1019890
Now that the problem has been fixed, and the assertion no longer fires,
let's see if there are other cases it fires on.

This reverts commit 5c8c24d2de,
relanding commit f30a7dff8a.
2021-08-13 15:45:03 +03:00
Roman Lebedev 2702fb1148
[SimplifyCFG] Restart if `removeUndefIntroducingPredecessor()` made changes
It might have changed the condition of a branch into a constant,
so we should restart and constant-fold terminator,
instead of continuing with the tautological "conditional" branch.
This fixes the issue reported at https://reviews.llvm.org/rGf30a7dff8a5b32919951dcbf92e4a9d56c4679ff
2021-08-13 15:45:03 +03:00
Roman Lebedev 5c8c24d2de
Revert "[NFCI][SimplifyCFG] simplifyCondBranch(): assert that branch is non-tautological"
The assertion does not hold on a provided reproducer.
Reverting until after fixing the problem.

This reverts commit f30a7dff8a.
2021-08-13 13:16:22 +03:00
Roman Lebedev f30a7dff8a
[NFCI][SimplifyCFG] simplifyCondBranch(): assert that branch is non-tautological
We really shouldn't deal with a conditional branch that can be trivially
constant-folded into an unconditional branch.

Indeed, barring failure to trigger BB reprocessing, that should be true,
so let's assert as much, and hope the assertion never fires.
If it does, we have a bug to fix.
2021-08-12 20:03:09 +03:00
Roman Lebedev 628f63d3d5
[SimplifyCFG] If FoldTwoEntryPHINode() changed things, restart
Mainly, i want to add an assertion that `SimplifyCFGOpt::simplifyCondBranch()`
doesn't get asked to deal with tautological conditional branches.
With that assertion added, it fires on existing tests,
and this change is what prevents it from firing.
2021-08-12 20:03:09 +03:00
Carl Ritson a1783b54e8 [SimplifyCFG] Remove recursion from FoldCondBranchOnPHI. NFCI.
Avoid stack overflow errors on systems with small stack sizes
by removing recursion in FoldCondBranchOnPHI.

This is a simple change as the recursion was only iteratively
calling the function again on the same arguments.
Ideally this would be compiled to a tail call, but there is
no guarantee.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107803
2021-08-10 19:14:31 +09:00
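The shape of this change can be sketched as follows. The example is a generic illustration, not the `FoldCondBranchOnPHI` code: `foldStep`, `foldRecursive`, and `foldIterative` are hypothetical names, and the "work" is just decrementing a counter. The point is that a function which only re-invokes itself on the same arguments after making progress can be rewritten as a loop with identical results and constant stack usage.

```cpp
// Hypothetical single step: makes one unit of progress and reports whether it
// changed anything (and so whether another attempt is worthwhile).
bool foldStep(int &state) {
  if (state > 0) {
    --state;
    return true;
  }
  return false;
}

// Recursive form: stack depth grows with the number of successful steps.
bool foldRecursive(int &state) {
  if (!foldStep(state))
    return false;
  foldRecursive(state); // retry on the same arguments
  return true;
}

// Iterative form: same result, constant stack depth.
bool foldIterative(int &state) {
  bool changed = false;
  while (foldStep(state))
    changed = true;
  return changed;
}
```

Both forms return whether anything changed and leave `state` in the same final condition; only the iterative one is independent of whether the compiler performs tail-call elimination.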
Momchil Velikov f171149e0d [SimplifyCFG] Speculate a store preceded by a local non-escaping load
In SimplifyCFG we may simplify the CFG by speculatively executing
certain stores, when they are preceded by a store to the same
location.  This patch allows such speculation also when the stores are
similarly preceded by a load.

In order for this transformation to be correct we need to ensure that
the memory location is writable and the store in the new location does
not introduce a data race.

Local objects (created by an `alloca` instruction) are always
writable, so once we are past a read from a location it is valid to
also write to that same location.

Seeing just a load does not guarantee absence of a data race (unlike
if we see a store) - the load may still be part of a race, just not
causing undefined behaviour
(cf. https://llvm.org/docs/Atomics.html#optimization-outside-atomic).

In the original program, a data race might have been prevented by the
condition, but once we move the store outside the condition, we must
be sure a data race wasn't possible anyway, no matter what the
condition evaluates to.

One way to be sure that a local object is never concurrently
read/written is to check that its address never escapes the function.

Hence this transformation is restricted to local, non-escaping
objects.

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D107281
2021-08-05 15:54:42 +01:00
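The legality condition described in this commit can be summarized as a small predicate. This is a deliberately simplified sketch: `PriorAccess`, `ObjectModel`, and `maySpeculateStore` are hypothetical names, and the real transformation has additional requirements that are not modeled here.

```cpp
// What precedes the conditional store, on the same memory location.
enum class PriorAccess { None, Store, Load };

// Hypothetical facts about the object being stored to.
struct ObjectModel {
  bool isLocalAlloca;  // created by an `alloca` in this function
  bool addressEscapes; // address may be visible outside the function
};

bool maySpeculateStore(PriorAccess prior, const ObjectModel &obj) {
  switch (prior) {
  case PriorAccess::Store:
    // A preceding store already shows the location is writable and that an
    // extra store introduces no new data race.
    return true;
  case PriorAccess::Load:
    // A load alone proves neither property, so require a local object whose
    // address never escapes: always writable, never accessed concurrently.
    return obj.isLocalAlloca && !obj.addressEscapes;
  case PriorAccess::None:
    return false;
  }
  return false;
}
```

The asymmetry between the `Store` and `Load` cases is the core of the commit message's argument: a load may itself be part of a benign race, so it cannot vouch for the safety of a new unconditional store.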