Commit Graph

835 Commits

Author SHA1 Message Date
Florian Hahn af523514c4
[SimplifyCFG] Skip dbg intrinsics when checking for branch-only BBs.
Debug intrinsics are free to hoist and should be skipped when looking
for terminator-only blocks. As a consequence, we have to delegate to the
main hoisting loop to hoist any dbg intrinsics instead of jumping to the
terminator case directly.

This fixes PR49982.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D100640
2021-04-17 15:17:50 +01:00
Florian Hahn 31b5c2b1d2
[SimplifyCFG] Regenerate CHECK lines and add test for PR49982. 2021-04-16 12:05:54 +01:00
Florian Hahn 467b1f1cd2
[SimplifyCFG] Allow hoisting terminators only with HoistCommonInsts=false.
As a side-effect of the change to default HoistCommonInsts to false
early in the pipeline, we fail to convert conditional branch & phis to
selects early on, which prevents vectorization for loops that contain
conditional branches that effectively are selects (or if the loop gets
vectorized, it will get vectorized very inefficiently).

This patch updates SimplifyCFG to perform hoisting if the only
instruction in both BBs is an equal branch. In this case, the only
additional instructions are selects for phis, which should be cheap.

Even though we perform hoisting, the benefits of this kind of hoisting
should by far outweigh the negatives.

For example, the loop in the code below will not get vectorized on
AArch64 with the current default, but will with the patch. This is a
fundamental pattern we should definitely vectorize. Besides that, I
think the select variants should be easier to use for reasoning across
other passes as well.

https://clang.godbolt.org/z/sbjd8Wshx

```
double clamp(double v) {
  if (v < 0.0)
    return 0.0;
  if (v > 6.0)
    return 6.0;
  return v;
}

void loop(double* X, double *Y) {
  for (unsigned i = 0; i < 20000; i++) {
    X[i] = clamp(Y[i]);
  }
}
```

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D100329
2021-04-13 10:33:35 +01:00
Florian Hahn 59334755e4
[SimplifyCFG] Add test requiring only hoisting a branch. 2021-04-12 22:29:34 +01:00
Nikita Popov 9bad7de9a3 [SimplifyCFG] Handle two equal cases in switch to select
When converting a switch with two cases and a default into a
select, also handle the denegerate case where two cases have the
same value.

Generate this case directly as

  %or = or i1 %cmp1, %cmp2
  %res = select i1 %or, i32 %val, i32 %default

rather than

  %sel1 = select i1 %cmp1, i32 %val, i32 %default
  %res = select i1 %cmp2, i32 %val, i32 %sel1

as InstCombine is going to canonicalize to the former anyway.
2021-04-04 17:27:28 +02:00
Nikita Popov 7ca168dd5a [SimplifyCFG] Add switch-to-select test with two equal cases (NFC)
We handle the case where we have two cases and a default all having
different values, but not the case where two cases happen to have
the same one.

The PhaseOrdering test is a particularly bad example where this
showed up.
2021-04-04 17:14:59 +02:00
Nikita Popov cb4443994e [SimplifyCFG] Make test more robust (NFC)
These are supposed to test creation of a switch, so make sure
there is some actual code in the branches. Otherwise this could
be turned into a select instead.
2021-04-04 17:14:59 +02:00
Roman Lebedev b5822026dd
[SimplifyCFG] 'Fold branch to common dest': don't overestimate the cost
`FoldBranchToCommonDest()` has a certain budget (`-bonus-inst-threshold=`)
for bonus instruction duplication. And currently it calculates the cost
as-if it will actually duplicate into each predecessor.

But ignoring the budget, it won't always duplicate into each predecessor,
there are some correctness and profitability checks.
So when calculating the cost, we should first check into which blocks
will we *actually* duplicate, and only then use that block count
to do budgeting.
2021-03-23 18:30:26 +03:00
Roman Lebedev a866f72eb2
[NFC][SimplifyCFG] 'Fold branch to common dest': add test for cost overestimation
We should not count the cost of duplication into predecessors into which
we won't ultimately duplicate.
2021-03-23 18:30:26 +03:00
Roman Lebedev 514bc01ca3
[SimplifyCFG] FoldBranchToCommonDest(): properly handle same-block external uses (PR49510/PR49689)
We clone bonus instructions to the end of the predecessor block,
and then use `SSAUpdater::RewriteUseAfterInsertions()`.
But that only deals with the cases where the use-to-be-rewritten
are either in different block from the def, or come after the def.

But in some loop cases, the external use may be in the beginning of
predecessor block, before the newly cloned bonus instruction.
`SSAUpdater::RewriteUseAfterInsertions()` does not deal with that.
Notably, the external use can't happen to be both in the same block
and *after* the newly-cloned instruction, because of the fold preconditions.

To properly handle these cases, when the use is in the same block,
we should instead use `SSAUpdater::RewriteUse()`.
TBN, they do the same thing for PHI users.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49510
Likely Fixes https://bugs.llvm.org/show_bug.cgi?id=49689
2021-03-23 17:37:28 +03:00
Sanjay Patel 1bf8f9e228 [SimplifyCFG] use profile metadata to refine merging branch conditions
2nd try (original: 27ae17a6b0) with fix/test for crash. We must make
sure that TTI is available before trying to use it because it is not
required (might be another bug).

Original commit message:

This is one step towards solving:
https://llvm.org/PR49336

In that example, we disregard the recommended usage of builtin_expect,
so an expensive (unpredictable) branch is folded into another branch
that is guarding it.
Here, we read the profile metadata to see if the 1st (predecessor)
condition is likely to cause execution to bypass the 2nd (successor)
condition before merging conditions by using logic ops.

Differential Revision: https://reviews.llvm.org/D98898
2021-03-23 10:19:37 -04:00
Juneyoung Lee 960a767368 Reland "[InstCombine] Add simplification of two logical and/ors"
This relands 07c3b97e18 (D96945) which was reverted by
commit f49354838e.
The two-stage compilation successfully tests passes on my machine.
2021-03-23 16:24:50 +09:00
Juneyoung Lee 5c2e50b5d2 Reland "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe"
This relands commit 99108c791d (D95026) which was
reverted by 8d5a981a13 because the underlying
problem (https://llvm.org/pr49495) is fixed.
2021-03-23 09:19:53 +09:00
Sanjay Patel 95f7f7c21b Revert "[SimplifyCFG] use profile metadata to refine merging branch conditions"
This reverts commit 27ae17a6b0.
There are bot failures that end with:
 #4 0x00007fff7ae3c9b8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #5 0x00007fff84e504d8 (linux-vdso64.so.1+0x4d8)
 #6 0x00007fff7c419a5c llvm::TargetTransformInfo::getPredictableBranchThreshold() const (/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMAnalysis.so.13git+0x479a5c)

...but not sure how to trigger that yet.
2021-03-22 17:48:06 -04:00
Sanjay Patel 27ae17a6b0 [SimplifyCFG] use profile metadata to refine merging branch conditions
This is one step towards solving:
https://llvm.org/PR49336

In that example, we disregard the recommended usage of builtin_expect,
so an expensive (unpredictable) branch is folded into another branch
that is guarding it.
Here, we read the profile metadata to see if the 1st (predecessor)
condition is likely to cause execution to bypass the 2nd (successor)
condition before merging conditions by using logic ops.

Differential Revision: https://reviews.llvm.org/D98898
2021-03-22 16:49:21 -04:00
Sanjay Patel c21016715f [SimplifyCFG] adjust test branchweights; NFC
This will check the boundary conditions of the
revised change proposed in D98898.
2021-03-22 15:55:34 -04:00
Philip Reames 854de7c4d0 [tests] Refresh a bunch of autogen test to adjust for format changes 2021-03-22 10:41:39 -07:00
Sanjay Patel 92068d6c31 [SimplifyCFG] add tests for branch cond merging with prof metadata; NFC
See PR49336.
2021-03-18 15:39:50 -04:00
Sanjay Patel bd197ed0a5 [SimplifyCFG] avoid sinking insts within an infinite-loop
The test is reduced from a C source example in:
https://llvm.org/PR49541

It's possible that the test could be reduced further or
the predicate generalized further, but it seems to require
a few ingredients (including the "late" SimplifyCFG options
on the RUN line) to fall into the infinite-loop trap.
2021-03-12 08:04:57 -05:00
Juneyoung Lee f49354838e Revert "[InstCombine] Add simplification of two logical and/ors"
This reverts commit 07c3b97e18 due to a reported failure in two-stage build.
2021-03-10 05:48:31 +09:00
Mehdi Amini 8d5a981a13 Revert "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe"
This reverts commit 99108c791d.
Clang is miscompiling LLVM with this change, a stage-2 build hits
multiple failures.

As a repro, I built clang in a stage1 directory and used it this way:

cmake -G Ninja ../llvm \
  -DCMAKE_CXX_COMPILER=`pwd`/../build-stage1/bin/clang++ \
  -DCMAKE_C_COMPILER=`pwd`/../build-stage1/bin/clang \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX;AMDGPU" \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DLLVM_BUILD_EXAMPLES=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=On
ninja check-mlir
2021-03-08 00:15:47 +00:00
Juneyoung Lee 07c3b97e18 [InstCombine] Add simplification of two logical and/ors
This is a patch that adds folding of two logical and/ors that share one variable:

a && (a && b) -> a && b
a && (a & b)  -> a && b
...

This is towards removing the poison-unsafe select optimization (D93065 has more context).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96945
2021-03-08 02:38:43 +09:00
Juneyoung Lee 99108c791d [SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe
This patch makes FoldBranchToCommonDest merge branch conditions into `select i1` rather than `and/or i1` when it is called by SimplifyCFG.
It is known that merging conditions into and/or is poison-unsafe, and this is towards making things *more* correct by removing possible miscompilations.
Currently, InstCombine simply consumes these selects into and/or of i1 (which is also unsafe), so the visible effect would be very small. The unsafe select -> and/or transformation will be removed in the future.
There has been efforts for updating optimizations to support the select form as well, and they are linked to D93065.

The safe transformation is fired when it is called by SimplifyCFG only. This is done by setting the new `PoisonSafe` argument as true.
Another place that calls FoldBranchToCommonDest is LoopSimplify. `PoisonSafe` flag is set to false in this case because enabling it has a nontrivial impact in performance because SCEV is more conservative with select form and InductiveRangeCheckElimination isn't aware of select form of and/or i1.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95026
2021-03-08 01:38:03 +09:00
Sanjay Patel 356cdabd3a [SimplifyCFG] avoid illegal phi with both poison and undef
In the example based on:
https://llvm.org/PR49218
...we are crashing because poison is a subclass of undef, so we merge blocks and create:

PHI node has multiple entries for the same basic block with different incoming values!
  %k3 = phi i64 [ poison, %entry ], [ %k3, %g ], [ undef, %entry ]

If both poison and undef values are incoming, we soften the poison values to undef.

Differential Revision: https://reviews.llvm.org/D97495
2021-02-27 09:10:32 -05:00
Juneyoung Lee 56d228a14e [SimplifyCFG] Update passingValueIsAlwaysUndefined to check more attributes
This is a simple patch to update SimplifyCFG's passingValueIsAlwaysUndefined to inspect more attributes.

A new function `CallBase::isPassingUndefUB` checks attributes that imply noundef.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97244
2021-02-24 10:40:50 +09:00
Juneyoung Lee edf2e96742 [SimplifyCFG] Minor tweaks to the added tests (NFC) 2021-02-23 17:32:28 +09:00
Juneyoung Lee 28be9af0f8 [SimplifyCFG] Add tests for D97244 (NFC) 2021-02-23 17:15:17 +09:00
Nikita Popov 5c62d66131 [SimplifyCFG] Regenerate test checks (NFC) 2021-01-23 21:24:54 +01:00
Roman Lebedev 1742203844
[SimplifyCFG] FoldBranchToCommonDest(): re-lift restrictions on liveout uses of bonus instructions
I have previously tried doing that in
b33fbbaa34 / d38205144f,
but eventually it was pointed out that the approach taken there
was just broken wrt how the uses of bonus instructions are updated
to account for the fact that they should now use either bonus instruction
or the cloned bonus instruction. In particluar, all that manual handling
of PHI nodes in successors was just wrong.

But, the fix is actually much much simpler than my initial approach:
just tell SSAUpdate about both instances of bonus instruction,
and let it deal with all the PHI handling.

Alive2 confirms that the reproducers from the original bugs (@pr48450*)
are now handled correctly.

This effectively reverts commit 59560e8589,
effectively relanding b33fbbaa34.
2021-01-23 01:29:05 +03:00
Roman Lebedev e838750005
[NFC][SimplifyCFG] fold-branch-to-common-dest.ll: reduce complexity of @pr48450* test
We don't need that many iterations there,
having less iterations helps alive2 verify it.
2021-01-23 01:29:04 +03:00
Roman Lebedev 9bd8bcf993
[NFC][SimplifyCFG] PerformBranchToCommonDestFolding(): fix instruction name preservation
NewBonusInst just took name from BonusInst, so BonusInst has no name,
so BonusInst.getName() makes no sense.
So we need to ask NewBonusInst for the name.
2021-01-23 01:29:03 +03:00
Roman Lebedev d1a6f92fd5
[InstCombine] Fold `(~x) | y` --> `~(x & (~y))` iff it is free to do so
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.

Note that we could position this transformation as just hoisting
of the `not` (still, iff y is freely negatible), but the test changes
show a number of regressions, so let's not do that.
2021-01-22 17:23:54 +03:00
Roman Lebedev 0895b836d7
[SimplifyCFG] FoldBranchToCommonDest(): don't deal with unconditional branches
The case where BB ends with an unconditional branch,
and has a single predecessor w/ conditional branch
to BB and a single successor of BB is exactly the pattern
SpeculativelyExecuteBB() transform deals with.
(and in this case they both allow speculating only a single instruction)

Well, or FoldTwoEntryPHINode(), if the final block
has only those two predecessors.

Here, in FoldBranchToCommonDest(), only a weird subset of that
transform is supported, and it's glued on the side in a weird way.
  In particular, it took me a bit to understand that the Cond
isn't actually a branch condition in that case, but just the value
we allow to speculate (otherwise it reads as a miscompile to me).
  Additionally, this only supports for the speculated instruction
to be an ICmp.

So let's just unclutter FoldBranchToCommonDest(), and leave
this transform up to SpeculativelyExecuteBB(). As far as i can tell,
this shouldn't really impact optimization potential, but if it does,
improving SpeculativelyExecuteBB() will be more beneficial anyways.

Notably, this only affects a single test,
but EarlyCSE should have run beforehand in the pipeline,
and then FoldTwoEntryPHINode() would have caught it.

This reverts commit rL158392 / commit d33f4efbfd.
2021-01-22 17:22:49 +03:00
Arthur Eubanks 6699029b67 [NewPM][opt] Run the "default" AA pipeline by default
We tend to assume that the AA pipeline is by default the default AA
pipeline and it's confusing when it's empty instead.

PR48779

Initially reverted due to BasicAA running analyses in an unspecified
order (multiple function calls as parameters), fixed by fetching
analyses before the call to construct BasicAA.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D95117
2021-01-21 21:08:54 -08:00
Arthur Eubanks ba9b4ea4ee Revert "[NewPM][opt] Run the "default" AA pipeline by default"
This reverts commit be611431cd.

Other/new-pm-lto-defaults.ll failing
2021-01-21 20:16:34 -08:00
Arthur Eubanks be611431cd [NewPM][opt] Run the "default" AA pipeline by default
We tend to assume that the AA pipeline is by default the default AA
pipeline and it's confusing when it's empty instead.

PR48779

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D95117
2021-01-21 19:46:38 -08:00
Juneyoung Lee 2e74a27756 [SimplifyCFG] Reapply update_test_checks.py (NFC) 2021-01-20 12:41:30 +09:00
Juneyoung Lee 395c737d9f [SimplifyCFG] Update SimplifyBranchOnICmpChain to recognize select form of and/or
This patch teaches SimplifyCFG::SimplifyBranchOnICmpChain to understand select form of
(x == C1 || x == C2 || ...) / (x != C1 && x != C2 && ...) and optimize them into switch if possible.
D93065 has more context about the transition, including links to the list of optimizations being updated.

Differential Revision: https://reviews.llvm.org/D93943
2021-01-19 08:53:40 +09:00
Nikita Popov 4229b87ed3 [ValueTracking] Fix isSafeToSpeculativelyExecute for sdiv (PR48778)
The != -1 check does not work correctly for all bitwidths. Use
isAllOnesValue() instead.
2021-01-17 20:06:17 +01:00
Nikita Popov 1cc477f030 [SimplifyCFG] Add test for PR48778 (NFC)
The sdiv is incorrectly speculated.
2021-01-17 20:06:17 +01:00
Dávid Bolvanský a1500105ee [SimplifyCFG] Optimize CFG when null is passed to a function with nonnull argument
Example:

```
__attribute__((nonnull,noinline)) char * pinc(char *p)  {
  return ++p;
}

char * foo(bool b, char *a) {
       return pinc(b ? 0 : a);

}
```

optimize to

```
char * foo(bool b, char *a) {
       return pinc(a);

}
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94180
2021-01-15 23:53:43 +01:00
Roman Lebedev a14c36fe27
[SimplifyCFG] switchToSelect(): don't forget to insert DomTree edge iff needed
DestBB might or might not already be a successor of SelectBB,
and it wasn't we need to ensure that we record the fact in DomTree.

The testcase used to crash in lazy domtree updater mode + non-per-function
domtree validity checks disabled.
2021-01-15 23:35:57 +03:00
Roman Lebedev 61ec228030
[NFC][SimplifyCFG] Add testcase showing that we fail to preserve DomTree in switchToSelect() 2021-01-15 23:35:55 +03:00
Florian Hahn d98fc62ae6 [SimplifyCFG] Keep !dgb metadata of moved instruction, if they match.
Currently SimplifyCFG drops the debug locations of 'bonus' instructions.
Such instructions are moved before the first branch. The reason for the
current behavior is that this could lead to surprising debug stepping,
if the block that's folded is dead.

In case the first branch and the instructions to be folded have the same
debug location, this shouldn't be an issue and we can keep the debug
location.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D93662
2021-01-09 19:15:16 +00:00
Roman Lebedev 6984781df9
[NFC][SimplifyCFG] Add a test with an undef cond branch to identical destinations 2021-01-08 02:15:26 +03:00
Roman Lebedev f8875c313c
[NFC][SimplifyCFG] Add test with an unreachable block with two identical successors 2021-01-08 02:15:25 +03:00
Roman Lebedev 8b9a0e6f7e
[NFC][SimlifyCFG] Add some indirectbr-of-blockaddress tests 2021-01-08 02:15:25 +03:00
Roman Lebedev 087be536fe
[NFC][SimplifyCFG] Add a test with cond br on constant w/ identical destinations 2021-01-08 02:15:24 +03:00
Roman Lebedev 6be1fd6b20
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): drop reachable errneous assert
I have added it in d15d81c because it *seemed* correct, was holding
for all the tests so far, and was validating the fix added in the same
commit, but as David Major is pointing out (with a reproducer),
the assertion isn't really correct after all. So remove it.

Note that the d15d81c still fine.
2021-01-07 18:05:04 +03:00
Roman Lebedev a14945c1db
[SimplifyCFG] SimplifyEqualityComparisonWithOnlyPredecessor(): really don't delete DomTree edges multiple times 2021-01-06 01:52:39 +03:00