The existing code doesn't expect dummy values (undef, poison, null-derived
constants etc) as arguments of these intrinsics. However, they can be there
in unreached code. Currently we fail trying to find base for them.
Handle these cases separately. Return null as base for them to be consistent
with the handling in the main algorithm in findBaseDefiningValue.
Differential Revision: https://reviews.llvm.org/D129561
Reviewed By: apilipenko
Finding BDV for vector value does not handle freeze instruction.
Adding its handling as it is done for scalar case.
Reviewed By: apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D128254
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:
* If IR that contains *neither* explicit ptr nor %T* types is passed
to tools, we will now use opaque pointer mode, unless
-opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
It is possible to opt-out by calling setOpaquePointers(false) on
LLVMContext.
A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.
Differential Revision: https://reviews.llvm.org/D126689
Don't check whether an input of BDV can be pruned if the input
is the BDV itself. BDV is present in the states map, so in case
the input is the BDV itself, we'd return false. So explicitly check this case.
Differential Revision: https://reviews.llvm.org/D123846
The assertion verifying that a newly computed value matches what is
already cached used stripPointerCasts() to strip bitcasts, however the
values can be not only pointers, but also vectors of pointers. That is
problematic because stripPointerCasts() doesn't handle vectors of
pointers. This patch introduces an ad-hoc utility function to strip all
bitcasts regardless of the value type.
Reviewed By: skatkov, reames
Differential Revision: https://reviews.llvm.org/D119994
This makes the statepoint methods in IRBuilder accept a
FunctionCallee, which carries both the callee and function type.
This is used to add the elementtype attribute to the statepoint call.
RS4GC requires an additional tweak to actually preserve that attribute
-- previously the attributes on the call were completely overwritten.
Differential Revision: https://reviews.llvm.org/D118886
Mostly this fixes cases where !noalias or !alias.scope were passed
a scope rather than a scope list. In some cases I opted to drop
the metadata entirely instead, because it is not really relevant
to the test.
This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103240
This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103238
I don't like landing this change, but it's an acknowledgement of a practical reality. Despite not having well specified semantics for inttoptr and ptrtoint involving non-integral pointer types, they are used in practice. Here's a quick summary of the current pragmatic reality:
* I happen to know that the main external user of non-integral pointers has effectively disabled the verifier rules.
* RS4GC (the lowering pass for abstract GC machine model which is the key motivation for non-integral pointers), even supports them. We just have all the tests using an integral pointer space to let the verifier run.
* Certain idioms (such as alignment checks for alignment N, where any relocation is guaranteed to be N byte aligned) are fine in practice.
* As implemented, inttoptr/ptrtoint are CSEd and are not control dependent. This means that any code which is intending to check a particular bit pattern at site of use must be wrapped in an intrinsic or external function call.
This change allows them in the Verifier, and updates the LangRef to specific them as implementation dependent. This allows us to acknowledge current reality while still leaving ourselves room to punt on figuring out "good" semantics until the future.
This is a modified version of a patch by tolziplohu with a style change, and most importantly, a revised commit message.
inttoptr for a non-integral address space is currently ill defined in the LangRef. Figuring out exactly what the dynamic semantics of such a cast would be is hard, and not yet settled. Despite that, we still need to go ahead and implement something in RS4GC for a couple of reasons.
First, as a simple consistency argument. We're apparently added support for constexpr inttoptrs a while back, and even have tests which exercised them. Having a lack of constant folding trigger a crash during lowering is non-ideal.
Second, and more fundementally, the optimizer is allowed to insert undefined constructs in unreachable code. At the same time, we can't assume that dynamically dead code is always pruned before lowering. As a result, we must assume that inttoptrs can occur (even if completely ill defined) along dead paths. We need the lowering to not crash. The stackmaps produced can be garbage (as the assumption is the code is dynamically dead), but the lowering itself can't crash.
Differential Revision: https://reviews.llvm.org/D103492
There can be a need for some optimizations to get (base, offset)
for any GC pointer. The base can be calculated by generating
needed instructions as it is done by the
RewriteStatepointsForGC::findBasePointer() function. The offset
can be calculated in the same way. Though to not expose the base
calculation and to make the offset calculation as simple as
ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal
outside RS4GC, this patch introduces 2 intrinsics:
@llvm.experimental.gc.get.pointer.base(%derived_ptr)
@llvm.experimental.gc.get.pointer.offset(%derived_ptr)
These intrinsics are inlined by RS4GC along with generation of
statepoint sequences.
With these new intrinsics the GC parseable lowering for atomic
memcpy intrinsics (6ec2c5e402)
could be implemented as a separate pass.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100445
I noticed that rs4gc is not stripping a number of memory aliasing related attributes. We do strip some from call sites, but don't strip the same ones from declarations or parameters.
Why do we need to strip these? Two answers:
Safepoints conceptually read and write to the entire garbage collected heap in the physical model. We need this to preserve ordering of all loads and stores with respect to possible relocation.
We can infer other attributes from these. For instance, readnone can imply both nofree and nosync. Both of which don't hold after physical rewriting.
Note: This exposed a latent issue which was fixed a couple weeks back in 01801d5274.
Differential Revision: https://reviews.llvm.org/D99802
This change fixes a latent bug which was exposed by a change currently in review (https://reviews.llvm.org/D99802#2685032).
The story on this is a bit involved. Without this change, what ended up happening with the pending review was that we'd strip attributes off intrinsics, and then selectiondag would fail to lower the intrinsic. Why? Because the lowering of the intrinsic relies on the presence of the readonly attribute. We don't have a matcher to select the case where there's a glue node needed.
Now, on the surface, this still seems like a codegen bug. However, here it gets fun. I was unable to reproduce this with a standalone test at all, and was pretty much struck until skatkov provided the critical detail. This reproduces only when RS4GC and codegen are run in the same process and context. Why? Because it turns out we can't roundtrip the stripped attribute through serialized IR!
We'll happily print out the missing attribute, but when we parse it back, the auto-upgrade logic has a side effect of blindly overwriting attributes on intrinsics with those specified in Intrinsics.td. This makes it impossible to exercise SelectionDAG from a standalone test case.
At this point, I decided to treat this an RS4GC bug as a) we don't need to strip in this case, and b) I could write a test which shows the correct behavior to ensure this doesn't break again in the future.
As an aside, I'd originally set out to handle libfuncs too - since in theory they might have the same issues - but backed away quickly when I realized how the semantics of builtin, nobuiltin, and no-builtin-x all interacted. I'm utterly convinced that no part of the optimizer handles that correctly, and decided not to open that can of worms here.
The safepoints being inserted exists to free memory, or coordinate with another thread to do so. Thus, we must strip any inferred attributes and reinfer them after the lowering.
I'm not aware of any active miscompiles caused by this, but since I'm working on strengthening inference of both and leveraging them in the optimization decisions, I figured a bit of future proofing was warranted.
meetBDVState utility may sets the base pointer for the conflict state.
At this moment the base for conflict state does not have any meaning but
is used in comparison of BDV states. This comparison is used as an indicator
of progress done on iteration and RS4GC pass uses infinite loop to reach
fixed point.
As a result for added test on each iteration state for some phi nodes is updated
with other base value for conflict state and it indicates as a progress while
for conflict state there is no any progress more possible.
In reality the base value is transferred from one state to another and pass
detects the progress on these states.
The test is very fragile. The traversal order of states and operands of phi nodes
plays important role.
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D99058
A broadcast is a shufflevector where only one input is used. Because of the way we handle constants (undef is a constant), the canonical shuffle sees a meet of (some value) and (nullptr). Given this, every broadcast gets treated as a conflict and a new base pointer computation is added.
The other way to tackle this would be to change constant handling specifically for undefs, but this seems easier.
Differential Revision: https://reviews.llvm.org/D98315
RS4GC needs to rewrite the IR to ensure that every relocated pointer has an associated base pointer. The existing code isn't particularly smart about avoiding duplication of existing IR when it turns out the original pointer we were asked to materialize a base pointer for is itself a base pointer.
This patch adds a stage to the algorithm which prunes nodes proven (with a simple forward dataflow fixed point) to be base pointers from the list of nodes considered for duplication. This does require changing some of the later invariants slightly, that's probably the riskiest part of the change.
Differential Revision: D98122
As a pragmatic tradeoff, the ease of updating the tests outweighs the slightly easier to understand test conditions. Where revevant, debug output was converted to comments to help human understanding.
If we have a value live over a call which is used for deopt at the call, we know that the value must be a base pointer. We can avoid potentially inserting IR to materialize a base for this value.
In it's current form, this is mostly a compile time optimization. Building the base pointer graph (and then optimizing it away again) is a relatively expensive operation. We also sometimes end up with better codegen in practice - due to failures in optimizing away the inserted base pointer propogation - but those are optimization bugs we're fixing concurrently.
The alternative to this would be to extend the base pointer inference with the ability to generally reuse multiple-base input instructions (phis and selects). That's somewhat invasive and complicated, so we're defering it a bit longer.
Differential Revision: https://reviews.llvm.org/D97885
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93793
This commit copies existing tests at llvm/Transforms and replaces
'insertelement undef' in those files with 'insertelement poison'.
(see https://reviews.llvm.org/D93586)
Tests listed using this script:
grep -R -E '^[^;]*insertelement <.*> undef,' . | cut -d":" -f1 | uniq |
wc -l
Tests updated:
file_org=llvm/test/Transforms/$1
file=${file_org%.ll}-inseltpoison.ll
cp $file_org $file
sed -i -E 's/^([^;]*)insertelement <(.*)> undef/\1insertelement <\2> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
echo "$file : should be manually updated"
# I manually updated the script
exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
This change introduces a GC parseable lowering for element atomic
memcpy/memmove intrinsics. This way runtime can provide an
implementation which can take a safepoint during copy operation.
See "GC-parseable element atomic memcpy/memmove" thread on llvm-dev
for the background and details:
https://groups.google.com/g/llvm-dev/c/NnENHzmX-b8/m/3PyN8Y2pCAAJ
Differential Revision: https://reviews.llvm.org/D88861
Summary:
Currently when --passes is used, any passes specified via -foo are
ignored. Explicitly bail out when that happens.
This requires changing some tests. Most were straightforward, but
codegenprepare-produced-address-math.ll is tricky. One of its RUNs runs
CodeGenPrepare. I tried porting CodeGenPrepare to the NPM, but ended up
getting stuck when I needed a TargetMachine. NPM doesn't have support
for MachineFunctions yet. So I just deleted that RUN line, since it was
mass-added in https://reviews.llvm.org/D54848 and is likely not that
useful.
Reviewers: echristo, hans
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82271
Now that we have an operand based form for the GC arguments to a statepoint intrinsic, update RS4GC to use it and update tests to reflect. This is pretty straight forward. I nearly landed without review, but figured a second set of eyes didn't hurt.
Differential Revision: https://reviews.llvm.org/D81121
We've started (D80598) the process of migrating away from the inline operand lists in statepoints to using explicit operand bundles. Update a few tests to reflect the new preference. More to come, these were simply the ones outside any obvious grouping.
Continues from D80598.
The key point of the change is to default to using operand bundles instead of the inline length prefix argument lists for statepoint nodes. An important subtlety to note is that the presence of a bundle has semantic meaning, even if it is empty. As such, we need to make a somewhat deeper change to the interface than is first obvious.
Existing code treats statepoint deopt arguments and the deopt bundle operands differently during inlining. The former is ignored (resulting in caller state being dropped), the later is merged.
We can't preserve the old behaviour for calls with deopt fed to RS4GC and then inlining, but we can avoid the no-deopt case changing. At least in internal testing, that seem to be the important one. (I'd argue the "stop merging after RS4GC" behaviour for the former was always "unexpected", but that the behaviour for non-deopt calls actually make sense.)
Differential Revision: https://reviews.llvm.org/D80674
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.
The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.
Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.
This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.
Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.
Differential Revision: https://reviews.llvm.org/D78403
This is relanding of rGbb308b020522420413c7d3f2989a88f2fc423c56 after
speculatively fixing buildbot lit test failure which was seen on two
bots (I cannot reproduce the lit test failure locally either).
[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived
pointer
Summary:
This is a more general fix to 59029b9eef (D75704).
This patch does the following:
updates isKnownBaseValue to account for base pointer and
derived pointer having differing types.
This inturn allows us to populate the
lattice (States) for such derived pointers.
It also updates all states where the base and derived pointers have
differing types (vector versus scalar) and conservatively marks these
states as conflictcs.
Note that in 59029b9eef, we were just fixing existing lattice values
and that too, only for uses of extractelement.
Reviewers: reames, skatkov, dantrushin
Reviewed By: skatkov
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D76305
Summary:
This is a more general fix to 59029b9eef (D75704).
This patch does the following:
1. updates isKnownBaseValue to account for base pointer and
derived pointer having differing types.
2. This inturn allows us to populate the
lattice (States) for such derived pointers.
3. It also updates all states where the base and derived pointers have
differing types (vector versus scalar) and conservatively marks these
states as conflictcs.
Note that in 59029b9eef, we were just fixing existing lattice values
and that too, only for uses of extractelement.
Reviewers: reames, skatkov, dantrushin
Reviewed By: skatkov
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76305
As mentioned in the comments, extractelement is special
since we actually want a scalar base for that element we extracted from
the vector (i.e. not a vector base).
This same logic should apply to uses of the extractelement such as phis
and selects which have the same BDV as the extractelement.
Howeber, for these uses we conservatively mark the BDV state as
conflict, since setting the EE's new base BDV does not always dominate
these uses.
Added testcase showcases the problem where the BDV identification chokes
on the incorrect cast from vector to scalar for the phi use of
extractelement.
Tests-Run: make check, internal fuzzer testing
Reviewers: reames, skatkov, dantrushin
Reviewed-By: dantrushin
Differential Revision: https://reviews.llvm.org/D75704