Summary:
Updated CallPromotionUtils and impacted sites. Parameters that are
expected to be non-null, and return values that are guranteed non-null,
were replaced with CallBase references rather than pointers.
Left FIXME in places where more changes are facilitated by CallBase, but
aren't CallSites: Instruction* parameters or return values, for example,
where the contract that they are actually CallBase values.
Reviewers: davidxl, dblaikie, wmi
Reviewed By: dblaikie
Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77930
Summary:
The inline history is associated with a call site. There are two locations
we fetch inline history. In one, we fetch it together with the call
site. In the other, we initialize it under certain conditions, use it
later under same conditions (different if check), and otherwise is
uninitialized. Although currently there is no uninitialized use, the
code is more challenging to maintain correctly, than if the value were
always initialized.
Changed to the upfront initialization pattern already present in this
file.
Reviewers: davidxl, dblaikie
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77877
Summary:
Function names: camel case, lower case first letter.
Variable names: start with upper letter. For iterators that were 'i',
renamed with a descriptive name, as 'I' is 'Instruction&'.
Lambda captures simplification.
Opportunistic boolean return simplification.
Reviewers: davidxl, dblaikie
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77837
Summary:
*Almost* all uses are replaced. Left FIXMEs for the two sites that
require refactoring outside of Inliner, to scope this patch.
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77817
Attributor.cpp became quite big and we need to start provide structure.
The Attributor code is now in Attributor.cpp and the classes derived
from AbstractAttribute are in AttributorAttributes.cpp. Minor changes
were required but no intended functional changes.
We also minimized includes as part of this.
Reviewed By: baziotis
Differential Revision: https://reviews.llvm.org/D76873
dso_local leads to direct access even if the definition is not within this compilation unit (it is
still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a
STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link.
If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no
direct access will be generated.
The current behavior is benign, because -fpic does not assume dso_local
(clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal).
If we do that for -fno-semantic-interposition (D73865), there will be an
R_X86_64_PC32 linker error without this patch.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D74751
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.
In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.
In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)
Differential Revision: https://reviews.llvm.org/D75661
The new and old pass managers (PassManagerBuilder.cpp and
PassBuilder.cpp) are exposed to an `extern` declaration of
`attributor-disable` option which will guard the addition of the
attributor passes to the pass pipelines.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D76871
Query AAValueSimplify on pointers in memory accessing instructions to take
advantage of the constant propagation (or any other value simplification) of such values.
Summary:
A recent change in the instruction simplifier enables a call to a function that just returns one of its parameter to be simplified as simply loading the parameter. This exposes a bug in the inliner where double inlining may be involved which in turn may cause compiler ICE when an already-inlined callsite is reused for further inlining.
To put it simply, in the following-like C program, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner internal data structure to register the first() callsite for the call edge representing the third() call, therefore incurs a double inlining when both call edges are considered an inline candidate. I'm making a fix to break the inliner from reusing a callsite for new call edges.
```
void top()
{
int t = first();
second(t);
}
void second(int t)
{
t = third(t);
fourth(t);
}
void third(int t)
{
return t;
}
```
The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up. We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e, which tells function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of ininlined callsites is a bit different, but in theory the issue could happen there too.
Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can probably be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification.
Reviewers: wenlei, davidxl, tejohnson
Reviewed By: wenlei, davidxl
Subscribers: eraman, nikic, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76248
There was a TODO in genericValueTraversal to provide the context
instruction and due to the lack of it users that wanted one just used
something available. Unfortunately, using a fixed instruction is wrong
in the presence of PHIs so we need to update the context instruction
properly.
Reviewed By: uenoku
Differential Revision: https://reviews.llvm.org/D76870
This cannot be triggered right now, as far as I know, but it doesn't
make sense to deduce a constant range on arguments of declarations.
Exposed during testing of AAValueSimplify extensions.
Use DL & ABI information for better alignment deduction, e.g., if a type
is accessed and the ABI specifies an alignment requirement for such an
access we can use it. This is based on a patch by @lebedev.ri and
inspired by getBaseAlign in Loads.cpp.
Depends on D76673.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D76674
If we have a must-tail call the callee and caller need to have matching
ABIs. Part of that is alignment which we might modify when we deduce
alignment of arguments of either. Since we would need to keep them in
sync, which is not as simple, we simply avoid deducing alignment for
arguments of the must-tail caller or callee.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D76673
We create a lot of AbstractAttributes and they live as long as
the Attributor does. It seems reasonable to allocate them via a
BumpPtrAllocator owned by the Attributor.
Reviewed By: baziotis
Differential Revision: https://reviews.llvm.org/D76589
Make the attributor pass aware of aligned_alloc for converting heap
allocations to stack ones.
Depends on D76971.
Differential Revision: https://reviews.llvm.org/D76974
Compbinary format uses MD5 to represent strings in name table. That gives smaller profile without the need of compression/decompression when writing/reading the profile. The patch adds the support in extbinary format. It is off by default but user can choose to enable it.
Note the feature of using MD5 in name table can bring very small chance of name conflict leading to profile mismatch. Besides, profile using the feature won't have the profile remapping support.
Differential Revision: https://reviews.llvm.org/D76255
Minor update/fixes to comments for the Attributor pass, and dyn_cast -> cast.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D76972
This patch integrates operand bundle llvm.assumes [0] with the
Attributor. Most IRAttributes will now look at uses of the associated
value and if there are llvm.assume operand bundle uses with the right
tag we will check if they are in the must-be-executed-context (around
the context instruction). Droppable users, which is currently only
llvm::assume, are handled special in some places now as well.
[0] http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html
Reviewed By: uenoku
Differential Revision: https://reviews.llvm.org/D74888
PR35760 shows an example program which, when compiled with `clang -O0`
or gcc at any optimization level, prints '0'. However, llvm transforms
the program in a way that causes it to print '1'.
Fix the issue by having `AllUsesOfValueWillTrapIfNull` return false when
analyzing a load from a global which is used by an `icmp`. This special
case was untested [0] so this is just deleting dead code.
An alternative fix might be to change the GlobalStatus analysis for the
global to report "Stored" instead of "StoredOnce". However, "StoredOnce"
is appropriate when only one value other than the initializer is stored
to the global.
[0]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/Transforms/IPO/GlobalOpt.cpp.html#L662
Differential Revision: https://reviews.llvm.org/D76645
Validation of the found runtime library functions declarations types
(return and argument types) with the expected types.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D76058
We special cased must-tail calls all over the place because they cannot
be modified as other calls can be. However, we already centralized the
modification API so we can centralize the handling as well. This
simplifies the code and allows to remove must-tail calls completely.
D75801 removed the last and only user of this option, so we can
drop it now. The original idea behind this was to only run expensive
transforms under -O3, but apart from the one known bits transform,
this has never really taken off. I believe nowadays the recommendation
is to put expensive transforms in AggressiveInstCombine instead,
though that isn't terribly popular either :)
Differential Revision: https://reviews.llvm.org/D76540
The existence of the class is more confusing than helpful, I think; the
commonality is mostly just "GEP is legal", which can be queried using
APIs on GetElementPtrInst.
Differential Revision: https://reviews.llvm.org/D75660
This patch add mayContainUnboundedCycle helper function which checks whether a function has any cycle which we don't know if it is bounded or not.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.
Reviewed By: jdoerfert, uenoku, baziotis
Differential Revision: https://reviews.llvm.org/D74691
Resolution for below fixme:
(ii) Check whether the value is captured in the scope using AANoCapture.
FIXME: This is conservative though, it is better to look at CFG and
check only uses possibly executed before this callsite.
Propagates caller argument's noalias attribute to callee.
Reviewed by: jdoerfert, uenoku
Reviewers: jdoerfert, sstefan1, uenoku
Subscribers: uenoku, sstefan1, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D71617
This patch introduces the propagation of known information based on path exploration.
For example,
```
int u(int c, int *p){
if(c) {
return *p;
} else {
return *p + 1;
}
}
```
An argument `p` is dereferenced whatever c's value is.
For an instruction `CtxI`, we accumulate branch instructions in the must-be-executed-context of `CtxI` and then, we take the conjunction of the successors' known state.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D65593
It is possible that an instruction to be changed to unreachable is
in the same block with a terminator that can be constant-folded.
In this case, as of now, the instruction will be changed to
unreachable before the terminator is folded. But, then the
whole BB becomes invalidated and so when we go ahead to fold
the terminator, we trap.
Change the order of these two.
Differential Revision: https://reviews.llvm.org/D75780
If we infer the dso_local flag for -fpic, dso_local should be dropped
when we convert a GlobalVariable a declaration. dso_local causes the
generation of direct access (e.g. R_X86_64_PC32). Such relocations referencing
STB_GLOBAL STV_DEFAULT objects are not allowed in a -shared link.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D74749
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.
This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.
Differential Revision: https://reviews.llvm.org/D75543
The initial placement of vector-combine in the opt pipeline revealed phase ordering bugs:
https://bugs.llvm.org/show_bug.cgi?id=45015https://bugs.llvm.org/show_bug.cgi?id=42022
This patch contains a few independent changes:
1. Move the pass up in the pipeline, so it happens just after loop-vectorization.
This is only to keep vectorization passes together in the pipeline at the moment.
I don't have evidence of interaction between these yet.
2. Add an -early-cse pass after -vector-combine to clean up redundant ops. This was
partly proposed as far back as rL219644 (which is why it's effectively being moved
in the old PM code). This is important because the subsequent -instcombine doesn't
work as well without EarlyCSE. With the CSE, -instcombine is able to squash
shuffles together in 1 of the tests (because those are simple "select" shuffles).
3. Remove the -vector-combine pass that was running after SLP. We may want to do that
eventually, but I don't have a test case to support it yet.
Differential Revision: https://reviews.llvm.org/D75145
This reverts commit 80d0a137a5, and the
follow on fix in 873c0d0786. It is
causing test failures after a multi-stage clang bootstrap. See
discussion on D73242 and D75201.
Summary:
Fixes an issue that cropped up after the changes in D73242 to delay
the lowering of type tests. LTT couldn't handle any type tests with
non-string type id (which happens for local vtables, which we try to
promote during the compile step but cannot always when there are no
exported symbols).
We can simply treat the same as having an Unknown resolution, which
delays their lowering, still allowing such type tests to be used in
subsequent optimization (e.g. planned usage during ICP). The final
lowering which simply removes these handles them fine.
Beefed up an existing ThinLTO test for such unpromoted type ids so that
the internal vtable isn't removed before lower type tests, which hides
the problem.
Reviewers: evgeny777, pcc
Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, aganea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75201
This aims to fix a missed inlining case.
If there's a virtual call in the callee on an alloca (stack allocated object) in
the caller, and the callee is inlined into the caller, the post-inline cleanup
would devirtualize the virtual call, but if the next iteration of
DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is
based on a heuristic to determine whether to reiterate, we may miss inlining the
devirtualized call.
This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.
This is a second commit after a revert
https://reviews.llvm.org/rG4569b3a86f8a4b1b8ad28fe2321f936f9d7ffd43 and a fix
https://reviews.llvm.org/rG41e06ae7ba91.
Differential Revision: https://reviews.llvm.org/D69591