This patch makes getEdgeValueLocal more precise when a freeze instruction is
given, by adding support for freeze into constantFoldUser
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D84629
This adds a common API for compute constant ranges of intrinsics.
The intention here is that
a) we can reuse the same code across different passes that handle
constant ranges, i.e. this can be reused in SCCP
b) we only have to add knowledge about supported intrinsics to
ConstantRange, not any consumers.
Differential Revision: https://reviews.llvm.org/D84587
This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual.
Differential Revision: https://reviews.llvm.org/D83709
InstCombine may convert conditions like (x < C) && (y < C) into
(x | y) < C (for some C). This patch teaches LVI to recognize that
in this case, it can infer either x < C or y < C along the edge.
This fixes the issue reported at
https://github.com/rust-lang/rust/issues/73827.
Differential Revision: https://reviews.llvm.org/D82715
This prevents us from creating temporary PoisoningVHs and
AssertingVHs while performing hashmap lookups. As such, it only
matters in assertion-enabled builds.
When LVI is performing assume intersections, it also checks for
llvm.experimental.guard intrinsics. To avoid unnecessary block
scans, it first checks whether this intrinsic is declared in the
module at all. I've noticed that we end up spending quite a lot
of time looking up that function again and again...
Avoid this by only looking it up once when LazyValueInfo is
constructed. This of course assumes that we don't introduce new
guard intrinsics (which is the case for all existing uses of LVI --
and even if it weren't, it would not introduce miscompiles, just
potentially lose optimization power.)
Differential Revision: https://reviews.llvm.org/D81796
This was reverted due to a reported memory usage increase. However,
a test case was never provided, and I wasn't able to reproduce it
myself.
Relative to the original patch, I have moved the block cache
structure behind a unique_ptr, to avoid storing a huge structure
inside a DenseMap.
---
Variant on D70103 to fix https://bugs.llvm.org/show_bug.cgi?id=43909.
The caching is switched to always use a BB to cache entry map, which
then contains per-value caches. A separate set contains value handles
with a deletion callback. This allows us to properly invalidate
overdefined values.
A possible alternative would be to always cache by value first and
have per-BB maps/sets in the each cache entry. In that case we could
use a ValueMap and would avoid the separate value handle set. I went
with the BB indexing at the top level to make it easier to integrate
D69914, but possibly that's not the right choice.
Differential Revision: https://reviews.llvm.org/D70376
After D76797 the dominator tree is no longer used in LVI, so we
can remove it as a pass dependency, and also get rid of the
dominator tree enabling/disabling logic in JumpThreading.
Apart from cleaning up the code, this also clarifies LVI
cache consistency, in that the LVI cache can no longer
depend on whether the DT was or wasn't enabled due to
pending DT updates at any given time.
Differential Revision: https://reviews.llvm.org/D76985
LVI and its consumers currently have quite a bit of complexity
related to dominator tree management. However, it doesn't look
like it is actually needed...
The only use of the dominator tree is inside isValidAssumeForContext().
However, due to the way LVI queries work, it is not needed:
If we query a value for some block, we will first get the edge values
from all predecessor blocks, which also includes an intersection with
assumptions that apply to the terminator of the predecessor. As such,
we will already have processed all assumptions from predecessor blocks
(this is actually stronger than what isValidAssumeForContext() does
with a DT, because this is capable of combining non-dominating
assumptions). The only additional assumptions we need to take into
account are those in the block being queried. And we don't need a
dominator tree for that.
This patch only removes the use of DT, I will drop the machinery
around it in a followup.
Differential Revision: https://reviews.llvm.org/D76797
Currently an unknown/undef value is marked as overdefined when merged
with an empty range. An empty range can occur in unreachable/dead code.
When merging the new unknown state (= no value known yet) with an empty
range, there still isn't any information about the value yet and we can
stay in unknown.
This gives a few nice improvements on the number of instructions removed
by IPSCCP:
Same hash: 170 (filtered out)
Remaining: 67
Metric: sccp.IPNumInstRemoved
Program base patch diff
test-suite...rks/FreeBench/mason/mason.test 3.00 6.00 100.0%
test-suite...nchmarks/McCat/18-imp/imp.test 3.00 5.00 66.7%
test-suite...C/CFP2000/179.art/179.art.test 2.00 3.00 50.0%
test-suite...ijndael/security-rijndael.test 2.00 3.00 50.0%
test-suite...ks/Prolangs-C/agrep/agrep.test 40.00 58.00 45.0%
test-suite...ce/Applications/Burg/burg.test 26.00 37.00 42.3%
test-suite...cCat/03-testtrie/testtrie.test 3.00 4.00 33.3%
test-suite...Source/Benchmarks/sim/sim.test 29.00 36.00 24.1%
test-suite.../Applications/spiff/spiff.test 9.00 11.00 22.2%
test-suite...s/FreeBench/neural/neural.test 5.00 6.00 20.0%
test-suite...pplications/treecc/treecc.test 66.00 79.00 19.7%
test-suite...langs-C/football/football.test 85.00 101.00 18.8%
test-suite...ce/Benchmarks/PAQ8p/paq8p.test 90.00 105.00 16.7%
test-suite...oxyApps-C++/miniFE/miniFE.test 37.00 43.00 16.2%
test-suite...rks/FreeBench/pifft/pifft.test 26.00 30.00 15.4%
test-suite...lications/sqlite3/sqlite3.test 481.00 548.00 13.9%
test-suite...marks/7zip/7zip-benchmark.test 4875.00 5522.00 13.3%
test-suite.../CINT2000/176.gcc/176.gcc.test 1117.00 1197.00 7.2%
test-suite...0.perlbench/400.perlbench.test 1618.00 1732.00 7.0%
Reviewers: efriedma, nikic, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78667
As suggested on D76788, this switches the LVI implementation to
return Optional<ValueLatticeElement> from various methods, instead
of passing in a ValueLatticeElement reference and returning a boolean.
Differential Revision: https://reviews.llvm.org/D78383
This patch combines the "has" and "get" parts of the cache access.
getCachedValueInfo() now both sets the BBLV return argument, and
returns whether the value was found.
Additionally, the management of the work stack is now integrated
into getBlockValue(). If the value is not cached yet, we try to
push to the stack (and return false, indicating that we need to
solve first), or return overdefined in case of a cycle.
These changes a) avoid a duplicate cache lookup for has & get and
b) ensure that the logic is uniform everywhere. For this reason
this change is also not quite NFC, because previously overdefined
values from the cache, and overdefined values from a cycle received
different treatment when it came to assumption intersection.
Differential Revision: https://reviews.llvm.org/D76788
This patch updates ValueLattice to distinguish between ranges that are
guaranteed to not include undef and ranges that may include undef.
A constant range guaranteed to not contain undef can be used to simplify
instructions to arbitrary values. A constant range that may contain
undef can only be used to simplify to a constant. If the value can be
undef, it might take a value outside the range. For example, consider
the snipped below
define i32 @f(i32 %a, i1 %c) {
br i1 %c, label %true, label %false
true:
%a.255 = and i32 %a, 255
br label %exit
false:
br label %exit
exit:
%p = phi i32 [ %a.255, %true ], [ undef, %false ]
%f.1 = icmp eq i32 %p, 300
call void @use(i1 %f.1)
%res = and i32 %p, 255
ret i32 %res
}
In the exit block, %p would be a constant range [0, 256) including undef as
%p could be undef. We can use the range information to replace %f.1 with
false because we remove the compare, effectively forcing the use of the
constant to be != 300. We cannot replace %res with %p however, because
if %a would be undef %cond may be true but the second use might not be
< 256.
Currently LazyValueInfo uses the new behavior just when simplifying AND
instructions and does not distinguish between constant ranges with and
without undef otherwise. I think we should address the remaining issues
in LVI incrementally.
Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D76931
This patch adds a new undef lattice state, which is used to represent
UndefValue constants or instructions producing undef.
The main difference to the unknown state is that merging undef values
with constants (or single element constant ranges) produces the
constant/constant range, assuming all uses of the merge result will be
replaced by the found constant.
Contrary, merging non-single element ranges with undef needs to go to
overdefined. Using unknown for UndefValues currently causes mis-compiles
in CVP/LVI (PR44949) and will become problematic once we use
ValueLatticeElement for SCCP.
Reviewers: efriedma, reames, davide, nikic
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D75120
This is a rebase of the change over D70376, which fixes an LVI cache
invalidation issue that also affected this patch.
-----
Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.
The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.
This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this change not fully NFC, because we can now fold
terminator icmps based on the dereferencability information in the
same block. This is the reason why I changed one JumpThreading test
(it would optimize the condition away without the change).
Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.
I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.
Differential Revision: https://reviews.llvm.org/D69914
Variant on D70103. The caching is switched to always use a BB to
cache entry map, which then contains per-value caches. A separate
set contains value handles with a deletion callback. This allows us
to properly invalidate overdefined values.
A possible alternative would be to always cache by value first and
have per-BB maps/sets in the each cache entry. In that case we could
use a ValueMap and would avoid the separate value handle set. I went
with the BB indexing at the top level to make it easier to integrate
D69914, but possibly that's not the right choice.
Differential Revision: https://reviews.llvm.org/D70376
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Fix cache invalidation by not guarding the dereferenced pointer cache
erasure by SeenBlocks. SeenBlocks is only populated when actually
caching a value in the block, which doesn't necessarily have to happen
just because dereferenced pointers were calculated.
-----
Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.
The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.
This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this change not fully NFC, because we can now fold
terminator icmps based on the dereferencability information in the
same block. This is the reason why I changed one JumpThreading test
(it would optimize the condition away without the change).
Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.
I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.
Differential Revision: https://reviews.llvm.org/D69914
This reverts commit 15bc4dc9a8.
clang-cmake-x86_64-sde-avx512-linux buildbot reported quite a few
compile-time regressions in test-suite, will investigate.
Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.
The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.
This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this change not fully NFC, because we can now fold
terminator icmps based on the dereferencability information in the
same block. This is the reason why I changed one JumpThreading test
(it would optimize the condition away without the change).
Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.
I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.
Differential Revision: https://reviews.llvm.org/D69914
This addresses the issue mentioned on D19867. When we simplify
with.overflow instructions in CVP, we leave behind extractvalue
of insertvalue sequences that LVI no longer understands. This
means that we can not simplify any instructions based on the
with.overflow anymore (until some over pass like InstCombine
cleans them up).
This patch extends LVI extractvalue handling by calling
SimplifyExtractValueInst (which doesn't do anything more than
constant folding + looking through insertvalue) and using the block
value of the simplification.
A possible alternative would be to do something similar to
SimplifyIndVars, where we instead directly try to replace
extractvalue users of the with.overflow. This would need some
additional structural changes to CVP, as it's currently not legal
to remove anything but the current instruction -- we'd have to
introduce a worklist with instructions scheduled for deletion or similar.
Differential Revision: https://reviews.llvm.org/D67035
llvm-svn: 371306
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.
This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.
Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.
There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.
Reviewers: chandlerc, hfinkel
Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66428
llvm-svn: 371284
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
This patch allows current users of Value::stripPointerCasts() to force
the result of the function to have the same representation as the value
it was called on. This is useful in various cases, e.g., (non-)null
checks.
In this patch only a single call site was adjusted to fix an existing
misuse that would cause nonnull where they may be wrong. Uses in
attribute deduction and other areas, e.g., D60047, are to be expected.
For a discussion on this topic, please see [0].
[0] http://lists.llvm.org/pipermail/llvm-dev/2018-December/128423.html
Reviewers: hfinkel, arsenm, reames
Subscribers: wdng, hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61607
llvm-svn: 362545
The underlying ConstantRange functionality has been added in D60952,
D61207 and D61238, this just exposes it for LVI.
I'm switching the code from using a whitelist to a blacklist, as
we're down to one unsupported operation here (xor) and writing it
this way seems more obvious :)
Differential Revision: https://reviews.llvm.org/D62822
llvm-svn: 362519
Adds support for the uadd.sat family of intrinsics in LVI, based on
ConstantRange methods from D60946.
Differential Revision: https://reviews.llvm.org/D62447
llvm-svn: 361703
In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0)
as the range of op(%x, %y). This is mainly useful in conjunction with
D60650: If the result of the operation is extracted in a branch guarded
against overflow, then the value of %x will be appropriately constrained
and the result range of the operation will be calculated taking that
into account.
Differential Revision: https://reviews.llvm.org/D60656
llvm-svn: 361693
Based on ConstantRange support added in D61084, we can now handle
abs and nabs select pattern flavors in LVI.
Differential Revision: https://reviews.llvm.org/D61794
llvm-svn: 360700
I got confused on the terminology, and the change in D60598 was not
correct. I was thinking of "exact" in terms of the result being
non-approximate. However, the relevant distinction here is whether
the result is
* Largest range such that:
Forall Y in Other: Forall X in Result: X BinOp Y does not wrap.
(makeGuaranteedNoWrapRegion)
* Smallest range such that:
Forall Y in Other: Forall X not in Result: X BinOp Y wraps.
(A hypothetical makeAllowedNoWrapRegion)
* Both. (makeExactNoWrapRegion)
I'm adding a separate makeExactNoWrapRegion method accepting a
single APInt (same as makeExactICmpRegion) and using it in the
places where the guarantee is relevant.
Differential Revision: https://reviews.llvm.org/D60960
llvm-svn: 359402
This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.
After thinking about this more, this isn't right, the range is not exact
in the same sense as makeExactICmpRegion(). This needs a separate
function.
llvm-svn: 358876
Following D60632 makeGuaranteedNoWrapRegion() always returns an
exact nowrap region. Rename the function accordingly. This is in
line with the naming of makeExactICmpRegion().
llvm-svn: 358875
If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1)
then we can constrain the value of %x inside the branch based on
makeGuaranteedNoWrapRegion(). We do this by extending the edge-value
handling in LVI. This allows CVP to then fold comparisons against %x,
as illustrated in the tests.
Differential Revision: https://reviews.llvm.org/D60650
llvm-svn: 358597
This adds ConstantRange::getFull(BitWidth) and
ConstantRange::getEmpty(BitWidth) named constructors as more readable
alternatives to the current ConstantRange(BitWidth, /* full */ false)
and similar. Additionally private getFull() and getEmpty() member
functions are added which return a full/empty range with the same bit
width -- these are commonly needed inside ConstantRange.cpp.
The IsFullSet argument in the ConstantRange(BitWidth, IsFullSet)
constructor is now mandatory for the few usages that still make use of it.
Differential Revision: https://reviews.llvm.org/D59716
llvm-svn: 356852
Deopt operands are generally intended to record information about a site in code with minimal perturbation of the surrounding code. Idiomatically, they also tend to appear down rare paths. Putting these together, we have an obvious case for extending CVP w/deopt operand constant folding. Arguably, we should be doing this for all operands on all instructions, but that's definitely a much larger and risky change.
Differential Revision: https://reviews.llvm.org/D55678
llvm-svn: 351774
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
LVI was symbolically executing binary operators only when the RHS was
constant, missing the case where we have a ConstantRange for the RHS,
but not an actual constant. Tested using check-all and by
bootstrapping. Compile time is not impacted measurably.
Differential Revision: https://reviews.llvm.org/D19859
llvm-svn: 347379
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
Summary:
It is possible for LVI to encounter instructions that are not in valid
SSA form and reference themselves. One example is the following:
%tmp4 = and i1 %tmp4, undef
Before this patch LVI would recurse until running out of stack memory
and crashed. This patch marks these self-referential instructions as
Overdefined and aborts analysis on the instruction.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33357
Reviewers: craig.topper, anna, efriedma, dberlin, sebpop, kuhar
Reviewed by: dberlin
Subscribers: uabelho, spatel, a.elovikov, fhahn, eli.friedman, mzolotukhin, spop, evandro, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D34135
llvm-svn: 327432
Summary:
The LazyValueInfo pass caches a copy of the DominatorTree when available.
Whenever there are pending DominatorTree updates within JumpThreading's
DeferredDominance object we cannot use the cached DT for LVI analysis.
This commit adds the new methods enableDT() and disableDT() to LVI.
JumpThreading also sets the appropriate usage model before calling LVI
analysis methods.
Fixes https://bugs.llvm.org/show_bug.cgi?id=36133
Reviewers: sebpop, dberlin, kuhar
Reviewed by: sebpop, kuhar
Subscribers: uabelho, llvm-commits, aprantl, hiraditya, a.elovikov
Differential Revision: https://reviews.llvm.org/D42717
llvm-svn: 325356
Enhance LVI to analyze the ‘ashr’ binary operation. This leverages the infrastructure in ConstantRange for the ashr operation.
Patch by Surya Kumari Jangala!
Differential Revision: https://reviews.llvm.org/D40886
llvm-svn: 320983
Summary:
This allows sharing the lattice value code between LVI and SCCP (D36656).
It also adds a `satisfiesPredicate` function, used by D36656.
Reviewers: davide, sanjoy, efriedma
Reviewed By: sanjoy
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D37591
llvm-svn: 314411
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non null
Differential Revision: https://reviews.llvm.org/D37628
llvm-svn: 312869
Summary:
Avoid checking each operand and calling getValueFromCondition() before calling
constantFoldUser() when the instruction type isn't supported by
constantFoldUser().
This fixes a large compile time regression in an internal build.
Reviewers: sanjoy
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36552
llvm-svn: 310545
Summary:
(This is a second attempt as https://reviews.llvm.org/D34822 was reverted.)
LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges.
But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur.
This patch adds a small logic to handle such a case in getEdgeValueLocal().
This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary.
With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.
Reviewers: sanjoy
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36247
llvm-svn: 309986
This causes assertion failures in (a somewhat old version of) SpiderMonkey.
I have already forwarded reproduction instructions to the patch author.
llvm-svn: 309659
Summary:
LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges.
But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur.
This patch adds a small logic to handle such a case in getEdgeValueLocal().
This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary.
With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.
Reviewers: wmi, dberlin, sanjoy
Reviewed By: sanjoy
Subscribers: davide, davidxl, llvm-commits
Differential Revision: https://reviews.llvm.org/D34822
llvm-svn: 309415
Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compare if the compare is fed by a livein of a basic block. This can be used to to prove the condition can't be met for some predecessor and the jump from that predecessor can be moved to the false path of the condition.
But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare.
This patch adds a new method to LVI to get a ConstantRange on an edge. Then we teach jump threading to detect the add livein feeding a compare and to get the ConstantRange and propagate it.
Differential Revision: https://reviews.llvm.org/D33262
llvm-svn: 306085
Summary: LVI can reason about an AND of icmps on the true dest of a branch. I believe we can do similar for the false dest of ORs. This allows us to get the same answer for the demorganed versions of some of the AND test cases as you can see.
Reviewers: anna, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34431
llvm-svn: 306076
Previously it was non-const reference named Result which would tend to make someone think that it was an outparam when really its an input.
llvm-svn: 305114
Summary:
Unless I'm mistaken, the special handling for EQ/NE should cover everything and there is no reason to fallthrough to the more complex code. For that matter I'm not sure there's any reason to special case EQ/NE other than avoiding creating temporary ConstantRanges.
This patch moves the complex code into an else so we only do it when we are handling a predicate other than EQ/NE.
Reviewers: anna, reames, resistor, Farhana
Reviewed By: anna
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34000
llvm-svn: 305086
Summary:
LVIPrinter pass was previously relying on the LVICache. We now directly call the
the LVI functions which solves the value if the LVI information is not already
available in the cache. This has 2 benefits over the printing of LVI cache:
1. higher coverage (i.e. catches errors) in LVI code when cache value is
invalidated.
2. relies on the core functions, and not dependent on the LVI cache (which may
be scrapped at some point).
It would still catch any cache invalidation errors, since we first go through
the cache.
Reviewers: reames, dberlin, sanjoy
Reviewed by: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32135
llvm-svn: 304819
Summary:
ConstantRange contains two APInts which can allocate memory if their width is larger than 64-bits. So we shouldn't copy it when we can avoid it.
This changes LVILatticeVal::getConstantRange() to return its internal ConstantRange by reference. This allows many places that just need a ConstantRange reference to avoid making a copy.
Several places now capture the return value of getConstantRange() by reference so they can call methods on it that don't need a new object.
Lastly it adds std::move in one place to capture to move a local ConstantRange into an LVILatticeVal.
Reviewers: reames, dberlin, sanjoy, anna
Reviewed By: reames
Subscribers: grandinj, llvm-commits
Differential Revision: https://reviews.llvm.org/D32884
llvm-svn: 302331
and to expose a handle to represent the actual case rather than having
the iterator return a reference to itself.
All of this allows the iterator to be used with common STL facilities,
standard algorithms, etc.
Doing this exposed some missing facilities in the iterator facade that
I've fixed and required some work to the actual iterator to fully
support the necessary API.
Differential Revision: https://reviews.llvm.org/D31548
llvm-svn: 300032
Using AssemblyAnnotationWriter for LVI printer prints
for instructions and basic blocks.
So, we explicitly need to print LVI info for the arguments of the function (these
are values and not instructions).
llvm-svn: 298640
Summary:
Adding a printer pass for printing the LVI cache values after transformations
that use LVI.
This will help us in identifying cases where LVI
invariants are violated, or transforms that leave LVI in an incorrect state.
Right now, I have added two test cases to show that the printer pass is working.
I will be adding more test cases in a later change, once this change is
checked in upstream.
Reviewers: reames, dberlin, sanjoy, apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30790
llvm-svn: 298542
Summary:
LVI is now depth first, which is optimal for iteration strategy in
terms of work per call. However, the way the results get cached means
it can still go very badly N^2 or worse right now. The overdefined
cache is per-block, because LVI wants to try to get different results
for the same name in different blocks (IE solve the problem
PredicateInfo solves). This means even if we discover a value is
overdefined after going very deep, it doesn't cache this information,
causing it to end up trying to rediscover it again and again. The
same is true for values along the way. In practice, overdefined
anywhere should mean overdefined everywhere (this is how, for example,
SCCP works).
Until we get around to reworking the overdefined cache, we need to
limit the worklist size we process. Note that permanently reverting
the DFS strategy exploration seems the wrong strategy (temporarily
seems fine if we really want). BFS is clearly the wrong approach, it
just gets luckier on some testcases. It's also very hard to design
an effective throttle for BFS. For DFS, the throttle is directly related
to the depth of the CFG. So really deep CFGs will get cutoff, smaller
ones will not. As the CFG simplifies, you get better results.
In BFS, the limit is it's related to the fan-out times average block size,
which is harder to reason about or make good choices for.
Bug being filed about the overdefined cache, but it will require major
surgery to fix it (plumbing predicateinfo through CVP or LVI).
Note: I did not make this number configurable because i'm not sure
anyone really needs to tweak this knob. We run CVP 3 times. On the
testcases i have the slow ones happen in the middle, where CVP is
doing cleanup work other things are effective at. Over the course of
3 runs, we don't see to have any real loss of performance.
I haven't gotten a minimized testcase yet, but just imagine in your
head a testcase where, going *up* the CFG, you have branches, one of
which leads 50000 blocks deep, and the other, to something where the
answer is overdefined immediately. BFS would discover the overdefined
faster than DFS, but do more work to do so. In practice, the right
answer is "once DFS discovers overdefined for a value, stop trying to
get more info about that value" (and so, DFS would normally cache the
overdefined results for every value it passed through in those 50k
blocks, and never do that work again. But it don't, because of the
naming problem)
Reviewers: chandlerc, djasper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29715
llvm-svn: 294463
This patch changes the order in which LVI explores previously unexplored paths.
Previously, the code used an BFS strategy where each unexplored input was added to the search queue before any of them were explored. This has the effect of causing all inputs to be explored before returning to re-evaluate the merge point (non-local or phi node). This has the unfortunate property of doing redundant work if one of the inputs to the merge is found to be overdefined (i.e. unanalysable). If any input is overdefined, the result of the merge will be too; regardless of the values of other inputs.
The new code uses a DFS strategy where we re-evaluate the merge after evaluating each input. If we discover an overdefined input, we immediately return without exploring other inputs.
We have reports of large (4-10x) improvements of compile time with this patch and some reports of more precise analysis results as well. See the review discussion for details. The original motivating case was pr10584.
Differential Revision: https://reviews.llvm.org/D28190
llvm-svn: 294264
with it.
This code was dereferencing the PoisoningVH which isn't allowed once it
is poisoned. But the code itself really doesn't need to access the
pointer, it is just doing the safe stuff of clearing out data structures
keyed on the pointer value.
Change the code to use iterators to erase directly from a DenseMap. This
is also substantially more efficient as it avoids lots of hashing and
lookups to do the erasure. DenseMap supports iterating behind the
iteration which is fairly easy to implement.
Sadly, I don't have a test case here. I'm not even close and I don't
know that I ever will be. The issue is that several of the tricky
aspects of fixing this only show up when you cause the stack's
SmallVector to be in *EXACTLY* the right location. I only ever got
a reproduction for those with Clang, and only with *exactly* the right
command line flags. Any adjustment, even to seemingly unrelated flags,
would make partial and half-way solutions magically start to "work". In
good news, all of this was caught with the LLVM test suite. Also, there
is no *specific* code here that is untested, just that the old pattern
of code won't immediately fail on any test case I've managed to
contrive.
llvm-svn: 293160
a lazy-asserting PoisoningVH.
AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.
This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.
The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.
The rest is straight cleanup.
I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.
Differential Revision: https://reviews.llvm.org/D29006
llvm-svn: 292928
become unavailable.
The AssumptionCache is now immutable but it still needs to respond to
DomTree invalidation if it ended up caching one.
This lets us remove one of the explicit invalidates of LVI but the
other one continues to avoid hitting a latent bug.
llvm-svn: 292769
Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.
As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).
Differential Revision: https://reviews.llvm.org/D28459
llvm-svn: 291671