Commit Graph

6089 Commits

Author SHA1 Message Date
Nikita Popov 5ba73c924d [BuildLibCalls] Mark calloc as inaccessiblememonly
Now that DSE handles inaccessiblememonly calloc, mark it as such,
as we do with other memory allocation functions.
2022-01-19 12:55:09 +01:00
spupyrev 13d1364a34 A better profi rebalancer
This is an extension of **profi** post-processing step that rebalances counts
in CFGs that have basic blocks w/o probes (aka "unknown" blocks). Specifically,
the new version finds many more "unknown" subgraphs and marks more "unknown"
basic blocks as hot (which prevents unwanted optimization passes).

I see up to 0.5% perf on some (large) binaries, e.g., clang-10 and gcc-8.

The algorithm is still linear and yields no build time overhead.
2022-01-18 12:14:24 -08:00
Jan Svoboda 5f4ae56457 [llvm] Remove uses of `std::vector<bool>`
LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.

This patch does just that for llvm.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D117121
2022-01-18 18:20:45 +01:00
pvellien 4e1c207726 [SimplifyCFG] Fix assertion failure when reusing table switch comparison
After D116332, some icmps no longer fold with the target-independent
constant folder. The SimplifyCFG code assumed that the comparison
would always fold, which is not guaranteed. Explicitly check that the
result is either true or false.

Differential Revision: https://reviews.llvm.org/D117184
2022-01-18 09:30:54 +01:00
Stephen Tozer 32417b3203 [DebugInfo] ValueMapper impl for DIArgList respects IgnoreMissingLocals
This patch fixes an issue in which SSA value reference within a
DIArgList would be unnecessarily dropped by llvm-link, even when
invoking on a single file (which should be a no-op). The reason for the
difference is that the ValueMapper does not refer to the
RF_IgnoreMissingLocals flag for LocalAsMetadata contained within a
DIArgList; this flag is used for direct LocalAsMetadata uses to preserve
SSA references even when the ValueMapper does not have an explicit
mapping for the referenced SSA value, which appears to always be the
case when using llvm-link in this manner.

Differential Revision: https://reviews.llvm.org/D114355
2022-01-17 17:17:32 +00:00
Nikita Popov c63a3175c2 [AttrBuilder] Remove ctor accepting AttributeList and Index
Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should exact the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.
2022-01-15 22:39:31 +01:00
Nikita Popov d1675e4944 [AttrBuilder] Remove empty() / td_empty() methods
The empty() method is a footgun: It only checks whether there are
non-string attributes, which is not at all obvious from its name,
and of dubious usefulness. td_empty() is entirely unused.

Drop these methods in favor of hasAttributes(), which checks
whether there are any attributes, regardless of whether these are
string or enum attributes.
2022-01-15 17:57:18 +01:00
Florian Hahn e00158ed5c
[LoopUtils] Use InstSimplifyFolder in addRuntimeChecks.
Use the InstSimplifyFolder introduced earlier to perform initial
simplification during runtime check construction.
2022-01-15 15:21:16 +00:00
Roman Lebedev 82c8aca934
[SimplifyCFG] Be more aggressive when sinking into block followed by unreachable
I strongly believe we need some variant of this.

The main problem is e.g. that the glibc's assert has 4 parameters,
but the profitability check is only okay with one extra phi node,
so D116692 doesn't even trigger on most of the expected cases.

While that restriction probably makes sense in normal code, if we
are about to run off of a cliff (into an `unreachable`), this
successor block is unlikely so the cost to setup these PHI nodes
should not be on the hotpath, and shouldn't matter performance-wise.

Likewise, we don't sink if there are unconditional predecessors
UNLESS we'd sink at least one non-speculatable instruction,
which is a performance workaround, but if we are about to run into
`unreachable`, it shouldn't matter.

Note that we only allow the case where there are at
most unconditiona branches on the way to the unreachable block.

Differential Revision: https://reviews.llvm.org/D117045
2022-01-13 23:30:31 +03:00
Florian Hahn e3275cfa94
[BuildLibCalls] Add nounwind,willreturn to memset_pattern{4,8,16}.
Similar to memset, memset_pattern{4,8,16} all will return and do not
unwind. Use fallthrough to include all attributes also set for memset.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D114904
2022-01-12 10:32:53 +00:00
Nikita Popov 47a47733f0 [GlobalStatus] Remove unused HasNonInstructionUser member (NFC)
This hasn't been used in a long time.
2022-01-12 09:40:54 +01:00
Nikita Popov 94d6263391 [GlobalStatus] Look through non-constexpr casts
analyzeGlobal() looks through non-constexpr cast instructions when
looking for users. However, this particular place only strips the
casts again if they are constexprs. We should be looking through all
casts here.
2022-01-11 16:02:35 +01:00
Florian Hahn 2d67a86b7c
[SCEVExpander] Use IntToPtr for temporary instruction.
Use PtrToInt instead Add when creating temporary instructions. The add
might get folded away with more sophisticated folding.
2022-01-11 09:40:21 +00:00
Philip Reames 5265ac72c6 [MemoryBuiltin] Add an API for checking if an unused allocation can be removed [NFC]
Not all allocation functions are removable if unused.  An example of a non-removable allocation would be a direct call to the replaceable global allocation function in C++.  An example of a removable one - at least according to historical practice - would be malloc.
2022-01-10 15:43:39 -08:00
Roman Lebedev 82fb4f4b22
[SCEV] Sequential/in-order `UMin` expression
As discussed in https://github.com/llvm/llvm-project/issues/53020 / https://reviews.llvm.org/D116692,
SCEV is forbidden from reasoning about 'backedge taken count'
if the branch condition is a poison-safe logical operation,
which is conservatively correct, but is severely limiting.

Instead, we should have a way to express those
poison blocking properties in SCEV expressions.

The proposed semantics is:
```
Sequential/in-order min/max SCEV expressions are non-commutative variants
of commutative min/max SCEV expressions. If none of their operands
are poison, then they are functionally equivalent, otherwise,
if the operand that represents the saturation point* of given expression,
comes before the first poison operand, then the whole expression is not poison,
but is said saturation point.
```
* saturation point - the maximal/minimal possible integer value for the given type

The lowering is straight-forward:
```
compare each operand to the saturation point,
perform sequential in-order logical-or (poison-safe!) ordered reduction
over those checks, and if reduction returned true then return
saturation point else return the naive min/max reduction over the operands
```
https://alive2.llvm.org/ce/z/Q7jxvH (2 ops)
https://alive2.llvm.org/ce/z/QCRrhk (3 ops)
Note that we don't need to check the last operand: https://alive2.llvm.org/ce/z/abvHQS
Note that this is not commutative: https://alive2.llvm.org/ce/z/FK9e97

That allows us to handle the patterns in question.

Reviewed By: nikic, reames

Differential Revision: https://reviews.llvm.org/D116766
2022-01-10 20:51:26 +03:00
Serge Guelton d2cc6c2d0c Use a sorted array instead of a map to store AttrBuilder string attributes
Using and std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599
2022-01-10 14:49:53 +01:00
Florian Hahn aecad5828e
[SCEVExpander] Only create trunc when needed.
9345ab3a45 updated generateOverflowCheck to skip creating checks that
always evaluate to false. This in turn means that we only need to
create TruncTripCount if it is actually used.

Sink the TruncTripCount creating into ComputeEndCheck, so it is only
created when there's an actual check.
2022-01-10 11:31:27 +00:00
Florian Hahn ad1b8772cf
[SCEVExpander] Only create multiplication if needed.
9345ab3a45 updated generateOverflowCheck to skip creating checks that
always evaluate to false. This in turn means that we only need to
compute |Step| * Trip count  if the result of the multiplication is
actually used.

Sink the multiplication into ComputeEndCheck, so it is only created
when there's an actual check.
2022-01-10 08:49:25 +00:00
Florian Hahn 1ce01b7dfe
[SCEVExpander] Simplify cleanup, skip sorting by dominance.
There is no need to sort inserted instructions by dominance, as the
deletion loop still requires RAUW with undef before deleting. Removing
instructions in reverse insertion order should still insure that the
number of uselist updates is kept to a minimum.
2022-01-09 18:38:41 +00:00
Florian Hahn 7f1bf68d7d
[SCEVExpander] Only check overflow if it is needed.
9345ab3a45 updated generateOverflowCheck to skip creating checks that
always evaluate to false. This in turn means that we only need to check
for overflows if the result of the multiplication is actually used.

Sink the Or for the overflow check into ComputeEndCheck, so it is only
created when there's an actual check.
2022-01-09 12:55:41 +00:00
Florian Hahn 9345ab3a45
[SCEVExpander] Skip creating <u 0 check, which is always false.
Unsigned compares of the form <u 0 are always false. Do not create such
a redundant check in generateOverflowCheck.

The patch introduces a new lambda to create the check, so we can
exit early conveniently and skip creating some instructions feeding the
check.

I am planning to sink a few additional instructions as follow-ups, but I
would prefer to do this separately, to keep the changes and diff
smaller.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D116811
2022-01-08 10:31:04 +00:00
Florian Hahn f395a4f8d5
[SCEVExpand] Only create required predicate checks.
Currently generateOverflowCheck always creates code for Step being
negative and positive, followed by a select at the end depending on
Step's sign.

This patch updates the code to only create either the checks for step
being positive or negative, if the sign is known.

Follow-up to D116696.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D116747
2022-01-07 14:49:02 +00:00
Nikita Popov 348bc76e35 [LibCalls] Infer same attrs for reallocf() as realloc()
reallocf() is the same as realloc() but frees the input pointer
on failure as well. We can infer the same attributes.

Also combine some cases that infer the same attributes and are
logically related.
2022-01-07 09:51:15 +01:00
Kazu Hirata 2aed08131d [llvm] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-07 00:39:14 -08:00
Nikita Popov c8189da201 [ModuleUtils] Remove dead arg from filterDeadComdatFunctions() (NFC)
The module argument is no longer used.
2022-01-07 09:12:16 +01:00
Philip Reames 916b35e783 [unroll] Strengthen verification of analysis updates under expensive asserts
I am suspecting a bug around updates of loop info for unreachable exits, but don't have a test case.  Running this locally on make check didn't reveal anything, we'll see if the expensive checks bots find it.
2022-01-06 08:51:50 -08:00
Florian Hahn 86d113a8b8
[SCEVExpand] Do not create redundant 'or false' for pred expansion.
This patch updates SCEVExpander::expandUnionPredicate to not create
redundant 'or false, x' instructions. While those are trivially
foldable, they can be easily avoided and hinder code that checks the
size/cost of the generated checks before further folds.

I am planning on look into a few other similar improvements to code
generated by SCEVExpander.

I remember a while ago @lebedev.ri working on doing some trivial folds
like that in IRBuilder itself, but there where concerns that such
changes may subtly break existing code.

Reviewed By: reames, lebedev.ri

Differential Revision: https://reviews.llvm.org/D116696
2022-01-06 11:52:19 +00:00
Nikita Popov 32808cfb24 [IR] Track users of comdats
Track all GlobalObjects that reference a given comdat, which allows
determining whether a function in a comdat is dead without scanning
the whole module.

In particular, this makes filterDeadComdatFunctions() have complexity
O(#DeadFunctions) rather than O(#SymbolsInModule), which addresses
half of the compile-time issue exposed by D115545.

Differential Revision: https://reviews.llvm.org/D115864
2022-01-06 09:13:58 +01:00
Sanjay Patel e2165e0968 [InstCombine] remove trunc user restriction for match of bswap
This does not appear to cause any problems, and it
fixes #50910

Extra tests with a trunc user were added with:
3a239379
...but they don't match either way, so there's an
opportunity to improve the matching further.
2022-01-05 13:04:11 -05:00
Philip Reames c16fd6a376 Rename doesNotReadMemory to onlyWritesMemory globally [NFC]
The naming has come up as a source of confusion in several recent reviews.  onlyWritesMemory is consist with onlyReadsMemory which we use for the corresponding readonly case as well.
2022-01-05 08:52:55 -08:00
Nikita Popov 6e474d3308 [GlobalOpt][Evaluator] Fix off by one error in bounds check (PR53002)
We should bail out if the index is >= the size, not > the size.

Fixes https://github.com/llvm/llvm-project/issues/53002.
2022-01-05 14:06:02 +01:00
Benjamin Kramer 5f0a349738 Revert "Revert "[InferAttrs] Add writeonly to all the math functions""
This reverts commit 29b6e967f3. The bug it
found in PartiallyInlineLibCalls was fixed in
c8ffc73350.
2022-01-05 12:16:35 +01:00
Sjoerd Meijer e550dfa4a6 Silence a few unused variable warnings. NFC. 2022-01-05 09:15:07 +00:00
Martin Storsjö 29b6e967f3 Revert "[InferAttrs] Add writeonly to all the math functions"
This reverts commit ea75be3d9d and
1eb5b6e850.

That commit caused crashes with compilation e.g. like this
(not fixed by the follow-up commit):

$ cat sqrt.c
float a;
b() { sqrt(a); }
$ clang -target x86_64-linux-gnu -c -O2 sqrt.c
Attributes 'readnone and writeonly' are incompatible!
  %sqrtf = tail call float @sqrtf(float %0) #1
in function b
fatal error: error in backend: Broken function found, compilation aborted!
2022-01-05 11:12:19 +02:00
Nikita Popov 787f86e68c [GlobalOpt][Evaluator] Don't create bitcast for same type (PR52994)
isBitOrNoopPointerCastable() returns true if the types are the
same, but it's not actually possible to create a bitcast for all
such types. The assumption seems to be that the user will omit
creating the cast in that case, as it is unnecessary.

Fixes https://github.com/llvm/llvm-project/issues/52994.
2022-01-05 09:17:07 +01:00
Fangrui Song 1eb5b6e850 [InferAttrs] If readonly is already set, set readnone instead of writeonly
D116426 may lead to an assertion failure `Attributes 'readonly and writeonly' are incompatible!` if the builtin function already has `readonly`.
2022-01-04 18:59:35 -08:00
Benjamin Kramer ea75be3d9d [InferAttrs] Add writeonly to all the math functions
All of these functions would be `readnone`, but can't be on platforms
where they can set `errno`. A `writeonly` function with no pointer
arguments can only write (but never read) global state.

Writeonly theoretically allows these calls to be CSE'd (a writeonly call
with the same arguments will always result in the same global stores) or
hoisted out of loops, but that's not implemented currently.

There are a few functions in this list that could be `readnone` instead
of `writeonly`, if someone is interested.

Differential Revision: https://reviews.llvm.org/D116426
2022-01-04 16:58:05 +01:00
Nikita Popov bbeaf2aac6 [GlobalOpt][Evaluator] Rewrite global ctor evaluation (fixes PR51879)
Global ctor evaluation currently models memory as a map from Constant*
to Constant*. For this to be correct, it is required that there is
only a single Constant* referencing a given memory location. The
Evaluator tries to ensure this by imposing certain limitations that
could result in ambiguities (by limiting types, casts and GEP formats),
but ultimately still fails, as can be seen in PR51879. The approach
is fundamentally fragile and will get more so with opaque pointers.

My original thought was to instead store memory for each global as an
offset => value representation. However, we also need to make sure
that we can actually rematerialize the modified global initializer
into a Constant in the end, which may not be possible if we allow
arbitrary writes.

What this patch does instead is to represent globals as a MutableValue,
which is either a Constant* or a MutableAggregate*. The mutable
aggregate exists to allow efficient mutation of individual aggregate
elements, as mutating an element on a Constant would require interning
a new constant. When a write to the Constant* is made, it is converted
into a MutableAggregate* as needed.

I believe this should make the evaluator more robust, compatible
with opaque pointers, and a bit simpler as well.

Fixes https://github.com/llvm/llvm-project/issues/51221.

Differential Revision: https://reviews.llvm.org/D115530
2022-01-04 09:30:54 +01:00
Philip Reames 7203140748 Revert "[unroll] Prune all but first copy of invariant exit"
This reverts commit 9bd22595ba.

Seeing some bot failures which look plausibly connected.  Revert while investigating/waiting for bots to stablize.

e.g. https://lab.llvm.org/buildbot#builders/36/builds/15933
2022-01-03 11:57:35 -08:00
Craig Topper cbcbbd6ac8 [ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC
This function returns an upper bound on the number of bits needed
to represent the signed value. Use "Max" to match similar functions
in KnownBits like countMaxActiveBits.

Rename APInt::getMinSignedBits->getSignificantBits. Keeping the old
name around to keep this patch size down. Will do a bulk rename as
follow up.

Rename KnownBits::countMaxSignedBits->countMaxSignificantBits.

Reviewed By: lebedev.ri, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D116522
2022-01-03 11:33:30 -08:00
Craig Topper 14849fe554 [SimplifyCFG] Make use of ComputeMinSignedBits and KnownBits::getBitWidth. NFC 2022-01-03 10:08:14 -08:00
Philip Reames 9bd22595ba [unroll] Prune all but first copy of invariant exit
If we have an exit which is controlled by a loop invariant condition and which dominates the latch, we know only the copy in the first unrolled iteration can be taken. All other copies are dead.

The change itself is pretty straight forward, but let me add two points of context:
* I'd have expected other transform passes to catch this after unrolling, but I'm seeing multiple examples where we get to the end of O2/O3 without simplifying.
* I'd like to do a stronger change which did CSE during unroll and accounted for invariant expressions (as defined by SCEV instead of trivial ones from LoopInfo), but that doesn't fit cleanly into the current code structure.

Differential Revision: https://reviews.llvm.org/D116496
2022-01-03 09:55:19 -08:00
Nikita Popov 730414b341 [CodeExtractor] Remove unnecessary explicit attribute handling (NFC)
The nounwind and uwtable attributes will get handled as part of
the loop below as well, there is no need to special-case them here.
2022-01-03 14:23:25 +01:00
Nikita Popov 587495ffa1 [CodeExtractor] Separate function from param/ret attributes (NFC)
This list is confusing because it conflates functions attributes
(which are either extractable or not) and other attribute kinds,
which are simply irrelevant for this code.
2022-01-03 14:07:12 +01:00
Benjamin Kramer 4683ce2cd8 [InferAttrs] Give strnlen the same attributes as strlen
This moves the only string function out of the big list of math funcs.
And let's us CSE strnlen calls.
2021-12-30 20:43:43 +01:00
Kazu Hirata e7774f499b Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
2021-12-26 14:26:44 -08:00
Djordje Todorovic 93615b88f5 [Debugify] Use WeakWH map collected before Pass when checking loc drop
This fixes a typo/bug when checking for pointer reuse when testing
DI location preservation in the Debugify original mode (when
checking -g generated Debug Info).

Differential Revision: https://reviews.llvm.org/D115621
2021-12-21 15:54:09 +01:00
Sami Tolvanen 5dc8aaac39 [llvm][IR] Add no_cfi constant
With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces
function references with CFI jump table references, which is a problem
for low-level code that needs the address of the actual function body.

For example, in the Linux kernel, the code that sets up interrupt
handlers needs to take the address of the interrupt handler function
instead of the CFI jump table, as the jump table may not even be mapped
into memory when an interrupt is triggered.

This change adds the no_cfi constant type, which wraps function
references in a value that LowerTypeTestsModule::replaceCfiUses does not
replace.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Reviewed By: nickdesaulniers, pcc

Differential Revision: https://reviews.llvm.org/D108478
2021-12-20 12:55:32 -08:00
Kazu Hirata 26bd534a79 [llvm] Use none_of instead of \!any_of (NFC) 2021-12-17 13:48:57 -08:00
Kazu Hirata 2b7be47b22 [llvm] Strip redundant lambda (NFC) 2021-12-17 10:51:40 -08:00