This is now the same as isIntAttrKind(), so use that instead, as
it does not require manual maintenance. The naming is also more
accurate in that both int and type attributes have an argument,
but this method was only targeting int attributes.
I initially wanted to tighten the AttrBuilder assertion, but we
have some in-tree uses that would violate it.
Rules:
1. SCEVUnknown is a pointer if and only if the LLVM IR value is a
pointer.
2. SCEVPtrToInt is never a pointer.
3. If any other SCEV expression has no pointer operands, the result is
an integer.
4. If a SCEVAddExpr has exactly one pointer operand, the result is a
pointer.
5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other
pointer operands, the result is a pointer.
6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a
pointer.
7. Otherwise, the SCEV expression is invalid.
I'm not sure how useful rule 6 is in practice. If we exclude it, we can
guarantee that ScalarEvolution::getPointerBase always returns a
SCEVUnknown, which might be a helpful property. Anyway, I'll leave that
for a followup.
This is basically mop-up at this point; all the changes with significant
functional effects have landed. Some of the remaining changes could be
split off, but I don't see much point.
Differential Revision: https://reviews.llvm.org/D105510
This reverts commit 52aeacfbf5.
There isn't full agreement on a path forward yet, but there is agreement that
this shouldn't land as-is. See discussion on https://reviews.llvm.org/D105338
Also reverts unreviewed "[clang] Improve `-Wnull-dereference` diag to be more in-line with reality"
This reverts commit f4877c78c0.
And all the related changes to tests:
This reverts commit 9a0152799f.
This reverts commit 3f7c9cc274.
This reverts commit 329f8197ef.
This reverts commit aa9f58cc2c.
This reverts commit 2df37d5ddd.
This reverts commit a72a441812.
Currently InstructionSimplify.cpp knows how to simplify floating point
instructions that have a NaN operand. It does not know how to handle the
matching constrained FP intrinsic.
This patch teaches it how to simplify so long as the exception handling
is not "fpexcept.strict".
Differential Revision: https://reviews.llvm.org/D103169
This reverts commit 4e413e1621,
which landed almost 10 months ago under premise that the original behavior
didn't match reality and was breaking users, even though it was correct as per
the LangRef. But the LangRef change still hasn't appeared, which might suggest
that the affected parties aren't really worried about this problem.
Please refer to discussion in:
* https://reviews.llvm.org/D87399 (`Revert "[InstCombine] erase instructions leading up to unreachable"`)
* https://reviews.llvm.org/D53184 (`[LangRef] Clarify semantics of volatile operations.`)
* https://reviews.llvm.org/D87149 (`[InstCombine] erase instructions leading up to unreachable`)
clang has `-Wnull-dereference` which will diagnose the obvious cases
of null dereference, it was adjusted in f4877c78c0,
but it will only catch the cases where the pointer is a null literal,
it will not catch the cases where an arbitrary store is expected to trap.
Differential Revision: https://reviews.llvm.org/D105338
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
This adds support for opaque pointers to expandAddToGEP() by always
generating an i8 GEP for opaque pointers. After looking at some other
cases (constexpr GEP folding, SROA GEP generation), I've come around
to the idea that we should use i8 GEPs for opaque pointers, because
the alternative would be to guess a GEP type from surrounding code,
which will not be reliable. Ultimately, i8 GEPs is where we want to
end up anyway, and opaque pointers just make that the natural choice.
There are a couple of other places in SCEVExpander that check pointer
element types, I plan to update those when I run across usable test
coverage that doesn't assert elsewhere.
Differential Revision: https://reviews.llvm.org/D105398
This replaces the current ad-hoc implementation,
by syncing the code from InstCombine's implementation in `InstCombinerImpl::visitUnreachableInst()`,
with one exception that here in SimplifyCFG we are allowed to remove EH instructions.
Effectively, this now allows SimplifyCFG to remove calls (iff they won't throw and will return),
arithmetic/logic operations, etc.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D105374
D74751 added `ClearDSOLocalOnDeclarations` and dropped dso_local for
isDeclarationForLinker `GlobalValue`s. It missed a case for imported
declarations (`doImportAsDefinition` is false while `isPerformingImport` is
true). This can lead to a linker error for a default visibility symbol in
`ld.lld -shared`.
When `ClearDSOLocalOnDeclarations` is true, we check
`isPerformingImport() && !doImportAsDefinition(&GV)` along with
`GV.isDeclarationForLinker()`. The new condition checks an imported declaration.
This patch fixes a `LLVMPolly.so` link error using a trunk clang -DLLVM_ENABLE_LTO=Thin.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D104986
Mimics similar change for InstCombine:
ce192ced2b / D104602
All these uses are in blocks that aren't reachable from function's entry,
and said blocks are removed by SimplifyCFG itself,
so we can't really test this change.
Somewhat related to D105338.
While it is up for discussion whether or not volatile store traps,
so far there has been no complaints that volatile load/cmpxchg/atomicrmw also may trap.
And even if simplifycfg currently concervatively believes that to be the case,
instcombine does not: https://godbolt.org/z/5vhv4K5b8
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D105343
This fixes a bug at LibCallSimplifier::optimizeMemChr which does the following transformation:
```
// memchr("\r\n", C, 2) != nullptr -> (1 << C & ((1 << '\r') | (1 << '\n')))
// != 0
// after bounds check.
```
As written above, a bounds check on C (whether it is less than integer bitwidth) is done before doing `1 << C` otherwise 1 << C will overflow.
If the bounds check is false, the result of (1 << C & ...) must not be used at all, otherwise the result of shift (which is poison) will contaminate the whole results.
A correct way to encode this is `select i1 (bounds check), (1 << C & ...), false` because select does not allow the unused operand to contaminate the result.
However, this optimization was introducing `and (bounds check), (1 << C & ...)` which cannot do that.
The bug was found from compilation of this C++ code: https://reviews.llvm.org/rG2fd3037ac615#1007197
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D104901
- When emitting libcalls, do not only pass the calling convention from the
function prototype but also the attributes.
- Do not pass attributes from e.g. libc memcpy to llvm.memcpy.
Review: Reid Kleckner, Eli Friedman, Arthur Eubanks
Differential Revision: https://reviews.llvm.org/D103992
This patch enables the salvaging of debug values that may be calculated
from more than one SSA value, such as with binary operators that do not
use a constant argument. The actual functionality for this behaviour is
added in a previous commit (c7270567), but with the ability to actually
emit the resulting debug values switched off.
The reason for this is that the prior patch has been reverted several
times due to issues discovered downstream, some time after the actual
landing of the patch. The patch in question is rather large and touches
several widely used header files, and all issues discovered are more
related to the handling of variadic debug values as a whole rather than
the details of the patch itself. Therefore, to minimize the build time
impact and risk of conflicts involved in any potential future
revert/reapply of that patch, this significantly smaller patch (that
touches no header files) will instead be used as the capstone to enable
variadic debug value salvaging.
The review linked to this patch is mostly implemented by the previous
commit, c7270567, but also contains the changes in this patch.
Differential Revision: https://reviews.llvm.org/D91722
This is a partial reapply of the original commit and the followup commit
that were previously reverted; this reapply also includes a small fix
for a potential source of non-determinism, but also has a small change
to turn off variadic debug value salvaging, to ensure that any future
revert/reapply steps to disable and renable this feature do not risk
causing conflicts.
Differential Revision: https://reviews.llvm.org/D91722
This reverts commit 386b66b2fc.
Use poison instead of undef for cases dealing with unreachable
code. This still leaves the more interesting case of "load from
uninitialized memory" as undef.
This problem is exposed by D104598, after it tail-merges `ret` in
`@test_inline_constraint_S_label`, the verifier would start complaining
`invalid operand for inline asm constraint 'S'`.
Essentially, taking address of a block is mismodelled in IR.
It should probably be an explicit instruction, a first one in block,
that isn't identical to any other instruction of the same type,
so that it can't be hoisted.
Currently, UnrollLoop() is passed an AllowRuntime flag and decides
itself whether runtime unrolling should be used or not. This patch
pushes the decision into the caller and allows us to eliminate the
ULO.TripCount and ULO.TripMultiple parameters.
Differential Revision: https://reviews.llvm.org/D104487
As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.
I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Ciner. Certain benchmark such as 602.gcc has a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D104477
Remove dependence on ULO.TripCount/ULO.TripMultiple from ORE and
debug code. For debug code, print information about all exits.
For optimization remarks, only include the unroll count and the
type of unroll (complete, partial or runtime), but omit detailed
information about exit folding, now that more than one exit may
be folded.
Differential Revision: https://reviews.llvm.org/D104482
Fold all exits based on known trip count/multiple information from
SCEV. Previously only the latch exit or the single exit were folded.
This doesn't yet eliminate ULO.TripCount and ULO.TripMultiple
entirely: They're still used to a) decide whether runtime unrolling
should be performed and b) for ORE remarks. However, the core
unrolling logic is independent of them now.
Differential Revision: https://reviews.llvm.org/D104203
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.
The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.
One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
We create flag variable "__llvm_fs_discriminator__" in the binary
to indicate that FSAFDO hierarchical discriminators are used.
This variable might be GC'ed by the linker since it is not explicitly
reference. I initially added the var to the use list in pass
MIRFSDiscriminator but it did not work. It turned out the used global
list is collected in lowering (before MIR pass) and then emitted in
the end of pass pipeline.
Here I add the variable to the use list in IR level's AddDiscriminators
pass. The machine level code is still keep in the case IR's
AddDiscriminators is not invoked. If this is the case, this just use
-Wl,--export-dynamic-symbol=__llvm_fs_discriminator__
to force the emit.
Differential Revision: https://reviews.llvm.org/D103988
This commit mostly just replaces bad uses of `NDEBUG` with uses of
`LLVM_ENABLE_ABI_BREAKING_CHANGES` - the safe way to include ABI
breaking changes (normally extra struct elements in headers).
Differential Revision: https://reviews.llvm.org/D104216
The problematic code pattern in the test is based on:
https://llvm.org/PR50638
If the IfCond is itself the phi that we are trying to remove,
then the loop around line 2835 can end up with something like:
%cmp = select i1 %cmp, i1 false, i1 true
That can then lead to a use-after-free and assert (although
I'm still not seeing that locally in my release + asserts build).
I think this can only happen with unreachable code.
Differential Revision: https://reviews.llvm.org/D104063
We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.).
Differential Revision: https://reviews.llvm.org/D104029
This adds a function specialization pass to LLVM. Constant parameters
like function pointers and constant globals are propagated to the callee by
specializing the function.
This is a first version with a number of limitations:
- The pass is off by default, so needs to be enabled on the command line,
- It does not handle specialization of recursive functions,
- It does not yet handle constants and constant ranges,
- Only 1 argument per function is specialised,
- The cost-model could be further looked into, and perhaps related,
- We are not yet caching analysis results.
This is based on earlier work by Matthew Simpson (D36432) and Vinay Madhusudan.
More recently this was also discussed on the list, see:
https://lists.llvm.org/pipermail/llvm-dev/2021-March/149380.html.
The motivation for this work is that function specialisation often comes up as
a reason for performance differences of generated code between LLVM and GCC,
which has this enabled by default from optimisation level -O3 and up. And while
this certainly helps a few cpu benchmark cases, this also triggers in real
world codes and is thus a generally useful transformation to have in LLVM.
Function specialisation has great potential to increase compile-times and
code-size. The summary from some investigations with this patch is:
- Compile-time increases for short compile jobs is high relatively, but the
increase in absolute numbers still low.
- For longer compile-jobs, the extra compile time is around 1%, and very much
in line with GCC.
- It is difficult to blame one thing for compile-time increases: it looks like
everywhere a little bit more time is spent processing more functions and
instructions.
- But the function specialisation pass itself is not very expensive; it doesn't
show up very high in the profile of the optimisation passes.
The goal of this work is to reach parity with GCC which means that eventually
we would like to get this enabled by default. But first we would like to address
some of the limitations before that.
Differential Revision: https://reviews.llvm.org/D93838
Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.
> This reapplies c0f3dfb9, which was reverted following the discovery of
> crashes on linux kernel and chromium builds - these issues have since
> been fixed, allowing this patch to re-land.
This reverts commit 36ec97f76a.
The change caused non-determinism in the compiler, see comments on the code
review at https://reviews.llvm.org/D91722.
Reverting to unbreak people's builds until that can be addressed.
This also reverts the follow-up "[DebugInfo] Limit the number of values
that may be referenced by a dbg.value" in
a0bd6105d8.
Needs to be discussed more.
This reverts commit 255a5c1baa6020c009934b4fa342f9f6dbbcc46
This reverts commit df2056ff3730316f376f29d9986c9913b95ceb1
This reverts commit faff79b7ca144e505da6bc74aa2b2f7cffbbf23
This reverts commit d2a9020785c6e02afebc876aa2778fa64c5cafd
Unrolling with more iterations than MaxTripCount is pointless, as
those iterations can never be executed. As such, we clamp ULO.Count
to MaxTripCount if it is known. This means we no longer need to
consider iterations after MaxTripCount for exit folding, and the
CompletelyUnroll flag becomes independent of ULO.TripCount.
Differential Revision: https://reviews.llvm.org/D103748
We might want to use it when creating SCEV proper in createSCEV(),
now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`,
which might have caused us to loose some optimization potential.
Loop peeling is currently performed as part of UnrollLoop().
Outside test scenarios, it is always performed with an unroll
count of 1. This means that unrolling doesn't actually do anything
apart from performing post-unroll simplification.
When testing, it's currently possible to specify both an explicit
peel count and an explicit unroll count. This doesn't perform any
sensible operation and may result in miscompiles, see
https://bugs.llvm.org/show_bug.cgi?id=45939.
This patch moves peeling from UnrollLoop() into tryToUnrollLoop(),
so that peeling does not also perform a susequent unroll. We only
run the post-unroll simplifications. Specifying both an explicit
peel count and unroll count is forbidden.
In the future, we may want to support both (non-PGO) peeling a
loop and unrolling it, but this needs to be done by first performing
the peel and then recalculating unrolling heuristics on a now
possibly analyzable loop.
Differential Revision: https://reviews.llvm.org/D103362
When SimplifyIndVars infers IR nowrap flags from SCEV, this may
happen in two ways: Either nowrap flags were already present in
SCEV and just get transferred to IR. Or zero/sign extension of
addrecs infers additional nowrap flags, and those get transferred
to IR. In the latter case, calling forgetValue() ensures that the
newly inferred nowrap flags get propagated to any other SCEV
expressions based on the addrec. However, the invalidation can
also have a major compile-time effect in some cases. For
https://bugs.llvm.org/show_bug.cgi?id=50384 with n=512 compile-
time drops from 7.1s to 0.8s without this invalidation. At the
same time, removing the invalidation doesn't affect any codegen
in test-suite.
Differential Revision: https://reviews.llvm.org/D103424