In addition to division by zero, signed division also traps for
SignedMin / -1. This was handled in isSafeToSpeculativelyExecute(),
but not in Constant::canTrap().
Handle the fact that not only constant expressions, but also
constant aggregates containing expressions can trap.
This still doesn't fix the original C reproducer, probably due to
more issues remaining in other passes.
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:
* If IR that contains *neither* explicit ptr nor %T* types is passed
to tools, we will now use opaque pointer mode, unless
-opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
It is possible to opt-out by calling setOpaquePointers(false) on
LLVMContext.
A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.
Differential Revision: https://reviews.llvm.org/D126689
TI->getBitWidth can be > 64 and in those cases the shift will be UB due
to the exponent being too large.
To fix this, cap the shift at 63. I think this should work out fine,
because TableSize is itself a 64 bit type and the maximum table size
must fit in the type. Also, if we would underestimate the size here, at
most we get an extra ZExt.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D124608
Replace the condition value with the known constant value on the
threaded edge. This happens implicitly with phi threading because
we replace with the incoming value, but not for non-phi threading.
SimplifyCFG implements basic jump threading, if a branch is
performed on a phi node with constant operands. However,
InstCombine canonicalizes such phis to the condition value of a
previous branch, if possible. SimplifyCFG does support this as
well, but only in the very limited case where the same condition
is used in a direct predecessor -- notably, this does not include
the common diamond pattern (i.e. two consecutive if/elses on the
same condition).
This patch extends the code to look back a limited number of
blocks to find a branch on the same value, rather than only
looking at the direct predecessor.
Fixes https://github.com/llvm/llvm-project/issues/54980.
Differential Revision: https://reviews.llvm.org/D124159
BlockIsSimpleEnoughToThreadThrough() already checks that the phi
(and all other instructions) are not used outside the block, so
this one-use check is not necessary for legality. I also don't
see any reason why it would be necessary for profitability (in
fact, those extra uses will be replaced with constants, which
should be generally profitable).
According to LangRef, an access scope must have zero operands and
be distinct. The access group may either be a single access scope
or a list of access scopes.
LoopInfo may assert if this is not the case.
As long as *all* the invokes in the set are indirect,
we can merge them, but don't merge direct invokes into the set,
even though it would be legal to do.
If the original invokes had uses, the uses must have been in PHI's,
but that immediately results in the incoming values being incompatible.
But we'll replace uses of the original invokes with the use of the
merged invoke, so as long as the incoming values become compatible
after that, we can merge.
Even if the invokes have normal destination, iff it's the same block,
we can merge them. For now, require that there are no PHI nodes,
and the returned values of invokes aren't used.
As per LangRef's definition of `noreturn` attribute:
```
noreturn
This function attribute indicates that the function never returns
normally, hence through a return instruction.
This produces undefined behavior at runtime if the function
ever does dynamically return. nnotated functions may still
raise an exception, i.a., nounwind is not implied.
```
So if we `invoke` a `noreturn` function, and the normal destination
of an invoke is not an `unreachable`, point it at the new `unreachable`
block.
The change/fix from the original commit is that we now actually create
the new block, and don't just repurpose the original block,
because said normal destination block could have other users.
This reverts commit db1176ce66,
relanding commit 598833c987.
As per LangRef's definition of `noreturn` attribute:
```
noreturn
This function attribute indicates that the function never returns
normally, hence through a return instruction.
This produces undefined behavior at runtime if the function
ever does dynamically return. nnotated functions may still
raise an exception, i.a., nounwind is not implied.
```
While nowadays SimplifyCFG knows how to hoist code from then-else blocks,
sink code from unconditional predecessors, and even promote the latter
by tail-merging `ret`/`resume` function terminators, that isn't everything.
While i (& others) have been trying to deal with merging/sinking `unreachable`,
apparently perhaps the more impactful remaining problem is merging the `throw`
calls.
If we start at the `landingpad`, all the predecessors are unwind edges of `invoke`s,
and in some cases some of the `invoke`s are mergeable.
```
/// This is a weird mix of hoisting and sinking. Visually, it goes from:
/// [...] [...]
/// | |
/// [invoke0] [invoke1]
/// / \ / \
/// [cont0] [landingpad] [cont1]
/// to:
/// [...] [...]
/// \ /
/// [invoke]
/// / \
/// [cont] [landingpad]
```
This simplifies the IR/CFG, at the cost of debug info and extra PHI nodes.
Note that we don't require for *all* the `invokes` of the `landingpad`
to be mergeable, they can form more than a single set, we gracefully handle that.
For now, i completely disallowed normal destination, PHI nodes and indirect invokes
but that can be supported.
Out of all the CTMark projects, only 7zip is C++, so there isn't much impact:
https://llvm-compile-time-tracker.com/compare.php?from=ba8eb31bd9542828f6424e15a3014f80f14522c8&to=722fc871c84f14157d45c2159bc9c8c7e2825785&stat=size-total
... but there it currently causes size-total decrease.
Differential Revision: https://reviews.llvm.org/D117805
Unfortunately, it seems we really do need to take the long route;
start from the "merge" block, find (all the) "dispatch" blocks,
and deal with each "dispatch" block separately, instead of simply
starting from each "dispatch" block like it would logically make sense,
otherwise we run into a number of other missing folds around
`switch` formation, missing sinking/hoisting and phase ordering.
This reverts commit 85628ce75b.
This reverts commit c5fff90953.
This reverts commit 34a98e1046.
This reverts commit 1e353f0922.
The current `FoldTwoEntryPHINode()` is not quite designed correctly.
It starts from the merge point, and then tries to detect
the 'divergence' point.
Because of that, it is limited to the simple two-predecessor case,
where the PHI completely goes away. but that is rather pessimistic,
and it doesn't make much sense from the costmodel side of things.
For example if there is some other unrelated predecessor of
the merge point, we could split the merge point so that
the then/else blocks first branch to an empty block
and then to the merge point, and then we'd be able to speculate
the then/else code.
But if we'd instead simply start at the divergence point,
and look for the merge point, then we'll just natively support this case.
There's also the fact that `SpeculativelyExecuteBB()` already does
just that, but only if there is a single block to speculate,
and with a much more restrictive cost model.
But that also means we have code duplication.
Now, sadly, while this is as much NFCI as possible,
there is just no way to cleanly migrate to
the proper implementation. The results *are* going to be different
somewhat because of various phase ordering effects and SimplifyCFG
block iteration strategy.
After D116332, some icmps no longer fold with the target-independent
constant folder. The SimplifyCFG code assumed that the comparison
would always fold, which is not guaranteed. Explicitly check that the
result is either true or false.
Differential Revision: https://reviews.llvm.org/D117184
I strongly believe we need some variant of this.
The main problem is e.g. that the glibc's assert has 4 parameters,
but the profitability check is only okay with one extra phi node,
so D116692 doesn't even trigger on most of the expected cases.
While that restriction probably makes sense in normal code, if we
are about to run off of a cliff (into an `unreachable`), this
successor block is unlikely so the cost to setup these PHI nodes
should not be on the hotpath, and shouldn't matter performance-wise.
Likewise, we don't sink if there are unconditional predecessors
UNLESS we'd sink at least one non-speculatable instruction,
which is a performance workaround, but if we are about to run into
`unreachable`, it shouldn't matter.
Note that we only allow the case where there are at
most unconditiona branches on the way to the unreachable block.
Differential Revision: https://reviews.llvm.org/D117045