I uploaded the old version accidentally instead of the one with these
minor adjustments requested by the reviewers.
Differential Revision: https://reviews.llvm.org/D77855
Summary:
The function in X86MCCodeEmitter has too many parameters to make it look
messy, and some parameters are unnecessary. This is the first patch to
reduce their parameters.
The follwing operations are cheap
```
unsigned Opcode = MI.getOpcode();
const MCInstrDesc &Desc = MCII.get(Opcode);
uint64_t TSFlags = Desc.TSFlags;
```
So if we pass a `MCInst`, we don't need to pass `MCInstrDesc`;
if we pass a `MCInstrDesc`, we don't need to pass `TSFlags`.
Reviewers: craig.topper, MaskRay, pengfei
Reviewed By: craig.topper
Subscribers: annita.zhang, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78180
I plan to use MCSection::getName() in D78138. Having the function in the base class is also convenient for debugging.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D78251
Summary:
Instead of storing a vptr in each FoldingSet instance, form an
equivalent struct and pass it implicitly from FoldingSet into the
various FoldingSetBase methods.
This has three benefits:
* FoldingSet becomes one pointer smaller.
* Under LTO, the "virtual" functions are much easier to inline.
* The element type no longer needs to be complete when instantiating
FoldingSet<T>, only when instantiating an insert / lookup member.
Reviewers: rnk
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78247
Now Reassociate Pass invalidates the analysis results of AAManager and BasicAA,
but it saves GlobalsAA, although it seems that it should preserve them, since
it affects only Unary and Binary operators.
Author: kpolushin (Kirill)
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D77137
Summary:
We can and should remove deleted nodes from their respective SCCs. We
did not do this before and this was a potential problem even though I
couldn't locally trigger an issue. Since the `DeleteNode` would assert
if the node was not in the SCC, we know we only remove nodes from their
SCC and only once (when run on all the Attributor tests).
Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku
Subscribers: hiraditya, bollu, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77855
Summary:
While it is uncommon that the ExternalCallingNode needs to be updated,
it can happen. It is uncommon because most functions listed as callees
have external linkage, modifying them is usually not allowed. That said,
there are also internal functions that have, or better had, their
"address taken" at construction time. We conservatively assume various
uses cause the address "to be taken". Furthermore, the user might have
become dead at some point. As a consequence, transformations, e.g., the
Attributor, might be able to replace a function that is listed
as callee of the ExternalCallingNode.
Since there is no function corresponding to the ExternalCallingNode, we
did just remove the node from the callee list if we replaced it (so
far). Now it would be preferable to replace it if needed and remove it
otherwise. However, removing the node has implications on the CGSCC
iteration. Locally, that caused some other nodes to be never visited
but it is for sure possible other (bad) side effects can occur. As it
seems conservatively safe to keep the new node in the callee list we
will do that for now.
Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku
Subscribers: hiraditya, bollu, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77854
Summary:
The old code did eliminate references from and to functions that were
about to be deleted only just before we deleted them. This can cause
references from other functions that are supposed to be deleted to still
exist, depending on the order. If the functions form a strongly
connected component the problem manifests regardless of the order in
which we try to actually delete the functions.
This patch introduces a two step deletion. First we remove all
references and then we delete the function. Note that this only affects
the old call graph. There should not be any functional changes if no old
style call graph was given.
To test this we delete two strongly connected functions instead of one
in an existing test.
Reviewers: hfinkel
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77975
Changing the underlying YAML support to allow returning a literal
string, rather than a StringRef, would probably be a bigger refactor so
I didn't do that.
This is needed to fix the reason
0a2be46cfd (Modules: Invalidate out-of-date PCMs as they're
discovered) and 5b44a4b07fc1d ([modules] Do not cache invalid state for
modules that we attempted to load.) were reverted.
These patches changed Clang to use `isVolatile` when loading modules.
This had the side effect of not using mmap when loading modules, and
thus greatly increased memory usage.
The reason it wasn't using mmap is because `MemoryBuffer` plays some
games with file size when you request null termination, and it has to
disable these when `isVolatile` is set as the size may change by the
time it's mmapped. Clang by default passes
`RequiresNullTerminator = true`, and `shouldUseMmap` ignored if
`RequiresNullTerminator` was even requested.
This patch adds `RequiresNullTerminator` to the `FileManager` interface
so Clang can use it when loading modules, and changes `shouldUseMmap` to
only take volatility into account if `RequiresNullTerminator` is true.
This is fine as both `mmap` and a `read` loop are vulnerable to
modifying the file while reading, but are immune to the rename Clang
does when replacing a module file.
Differential Revision: https://reviews.llvm.org/D77772
Summary:
The renaming is necessary to make the naming scheme uniform with other
gather/scatter load/stores SVE intrinsics.
The naming of variables and functions have been adapted to make it
explicit whether we are dealing with a scalar offset (which is
unscaled) or an index (which is scaled according to the data type of
the lanes of the vector).
Reviewers: andwar, sdesmalen, rengolin
Reviewed By: andwar
Subscribers: tschuett, hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77839
We have added code to correct the .localentry values on assignments. However, we
never clear the set so presumably it will still contain the (now dangling)
MCSymbol pointers across a call to finish() and reset() in the streamer.
This is based on my speculation that it is the reason we are getting
segmentation faults mentioned in https://bugs.llvm.org/show_bug.cgi?id=45366
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45366
Differential revision: https://reviews.llvm.org/D78196
The "Align" passed into getMachineMemOperand etc. is the alignment of
the MachinePointerInfo, not the alignment of the memory operation.
(getAlign() on a MachineMemOperand automatically reduces the alignment
to account for this.)
We were passing on wrong (overconservative) alignment in a bunch of
places. Fix a bunch of these, mostly in legalization. And while I'm
here, switch to the new Align APIs.
The test changes are all scheduling changes: the biggest effect of
preserving large alignments is that it improves alias analysis, so the
scheduler has more freedom.
(I was originally just trying to do a minor cleanup in
SelectionDAGBuilder, but I accidentally went deeper down the rabbit
hole.)
Differential Revision: https://reviews.llvm.org/D77687
This commit fixes using functions in `IRObjectFile` to load bitcode from
wasm objects by recognizing the file magic for wasm and also inheriting
the default implementation of classifying sections as bitcode.
Patch By: alexcrichton
Differential Revision: https://reviews.llvm.org/D78199
The current strategy LICM uses when sinking for debuginfo is
that of picking the debug location of one of the uses.
This causes stepping to be wrong sometimes, see, e.g. PR45523.
This patch introduces a generalization of getMergedLocation(),
that operates on a vector of locations instead of two, and try
to merge all them together, and use the new API in LICM.
<rdar://problem/61750950>
This moves v32i16/v64i8 to a model consistent with how we
treat integer types with avx1.
This does change the ABI for types vXi16/vXi8 vectors larger than
512 bits to pass in multiple zmms instead of multiple ymms. We'd
already hacked some code to make v64i8/v32i16 pass in zmm.
Cost model is still a bit of a mess. In some place I tried to
match existing behavior. But really we need to account for
splitting and concating costs. Cost model for shuffles is
especially pessimistic.
Differential Revision: https://reviews.llvm.org/D76212
MCExpr has a bunch of free space that is currently going to waste.
Repurpose it as 24 bits of subclass data, which is enough to reduce
the size of all subclasses by 8 bytes. This gives us some respectable
savings for debuginfo builds. Here are the max-rss reductions for the
fat LTO link step:
kc.link 238MiB 231MiB (-2.82%)
sqlite3.link 258MiB 250MiB (-3.27%)
consumer-typeset.link 152MiB 148MiB (-2.51%)
bullet.link 197MiB 192MiB (-2.30%)
tramp3d-v4.link 578MiB 567MiB (-1.92%)
pairlocalalign.link 92MiB 90MiB (-1.98%)
clamscan.link 230MiB 223MiB (-2.81%)
lencod.link 242MiB 235MiB (-2.67%)
SPASS.link 235MiB 230MiB (-2.23%)
7zip-benchmark.link 450MiB 435MiB (-3.25%)
Differential Revision: https://reviews.llvm.org/D77939
-Consistently name the functions as split*
-Add a helper for doing the two extractSubvector calls and determining the size of the split
-Use getSplitDestVTs to get the result type for the split node.
-Move the binary and unary helper to one place in the file near the extractSubvector functions. Left the VSETCC one near LowerVSETCC since that's its only caller.
-Remove the 256/512 wrappers that just had asserts. I don't think they provided a lot of value and now with the routines called split* the call sites are more obvious what they do.
-Make the unary routine support different source and dest types to support D76212.
-Add some weaker asserts into the helpers to make up for losing the very specific asserts from the 256/512 wrappers.
Differential Revision: https://reviews.llvm.org/D78176
Summary:
As a follow up to https://reviews.llvm.org/D29014, add translation
support for freeze.
Introduce a new generic instruction G_FREEZE and translate freeze to it.
Reviewers: dsanders, aqjune, arsenm, aditya_nandakumar, t.p.northover, lebedev.ri, paquette, aemerson
Reviewed By: aqjune, arsenm
Subscribers: fhahn, lebedev.ri, wdng, rovka, hiraditya, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77795
This patch intends to provide relocation support for the expression
contains two unpaired relocatable terms with opposite signs.
Differential Revision: https://reviews.llvm.org/D77424
Summary:
No error or warning is emitted when specific reserved registers are
written to in inline assembly. Therefore, writes to the program counter
or to the frame pointer, for instance, were permitted, which could have
led to undesirable behaviour.
Example:
int foo() {
register int a __asm__("r7"); // r7 = frame-pointer in M-class ARM
__asm__ __volatile__("mov %0, r1" : "=r"(a) : : );
return a;
}
In contrast, GCC issues an error in the same scenario.
This patch detects writes to specific reserved registers in inline
assembly for ARM and emits an error in such case. The detection works
for output and input operands. Clobber operands are not handled here:
they are already covered at a later point in
AsmPrinter::emitInlineAsm(const MachineInstr *MI). The registers
covered are: program counter, frame pointer and base pointer.
This is ARM only. Therefore the implementation of other targets'
counterparts remain open to do.
Reviewers: efriedma
Reviewed By: efriedma
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76848
Summary:
Improve error message in case of conflict between several implicit
format to mention the operand that conflict.
Reviewers: jhenderson, jdenny, probinson, grimar, arichardson, rnk
Reviewed By: jdenny
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77741
To simplify future work on statepoint representation, hide
direct access to statepoint field indices and provide getters
for them. Add getters for couple more statepoint fields.
This also fixes two bugs in MachineVerifier for statepoint:
First, the `break` statement was falling out of `if` statement
scope, thus disabling following checks.
Second, it was incorrectly accessing some fields like CallingConv -
StatepointOpers gives index to their value directly, not to
preceeding field type encoding.
Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D78119
After introducing VPWidenSelectRecipe, the duplicated logic can be
shared.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D77973
Summary:
If we have an (invalid) relocation which relocates bytes which partially
lie outside the range of the relocated section, the getRelocatedValue
would return confusing results. It would first read zero (because that's
what the underlying DataExtractor api does for out-of-bounds reads), and
then relocate that zero anyway.
A more appropriate behavior is to return zero straight away. This is
what this patch does.
Reviewers: dblaikie, jhenderson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78113
We can eliminate MemoryDefs of objects not accessible after the function
returns (e.g. alloca), if there are no reads between the MemoryDef and
any function exits. We can stop traversing paths that completely
overwrite the memory location of the MemoryDef.
This patch was split off D73763.
Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv
Reviewed By: asbirlea, george.burgess.iv
Differential Revision: https://reviews.llvm.org/D77736
This is related to commit 8c11bc0cd0
which introduces the FixIrreducible pass. The warning seems hard to
reproduce locally. The latest attempt ought to work.
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.
This is a useful workaround for a limitation in the structurizer
which, which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D77198
This restores commit 2ada8e2525.
Originally reverted with commit 44e09b59b8.
Handling LoadInst and StoreInst in tryToWiden seems a bit
counter-intuitive, as there is only an assertion for them and in no
case VPWidenRefipes are created for them.
I think it makes sense to move the assertion to handleReplication, where
the non-widened loads and store are handled.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D77972
Summary:
Changing all mnemonic to match assembly instructions to simplify mnemonic
naming rules. This time update all fixed-point arithmetic instructions.
This also corrects smax/smin code generations.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D77856
Fix an assert introduced in 41ed5d856c1: a phi with a single predecessor and a
mask is a valid case which is already supported by the code.
Differential Revision: https://reviews.llvm.org/D78115
This reverts commit 2ada8e2525.
Buildbots produced compilation errors which I was not able to quickly
reproduce locally. Need more time to investigate.
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.
This is a useful workaround for a limitation in the structurizer
which, which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D77198
This is a minor NFC change to make the code more clear. We have the NegatibleCost that
has cheaper, neutral, and expensive. Typically, the smaller one means the less cost.
It is inverse for current implementation, which makes following code not easy to read.
If (CostX > CostY) negate(X)
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D77993
Summary:
This revision adds two utilities currently present in MLIR to LLVM StringExtras:
* convertToSnakeFromCamelCase
Convert a string from a camel case naming scheme, to a snake case scheme
* convertToCamelFromSnakeCase
Convert a string from a snake case naming scheme, to a camel case scheme
Differential Revision: https://reviews.llvm.org/D78167
Summary:
Currently, the internal options -vectorize-loops, -vectorize-slp, and
-interleave-loops do not have much practical effect. This is because
they are used to initialize the corresponding flags in the pass
managers, and those flags are then unconditionally overwritten when
compiling via clang or via LTO from the linkers. The only exception was
-vectorize-loops via opt because of some special hackery there.
While vectorization could still be disabled when compiling via clang,
using -fno-[slp-]vectorize, this meant that there was no way to disable
it when compiling in LTO mode via the linkers. This only affected
ThinLTO, since for regular LTO vectorization is done during the compile
step for scalability reasons. For ThinLTO it is invoked in the LTO
backends. See also the discussion on PR45434.
This patch makes it so the internal options can actually be used to
disable these optimizations. Ultimately, the best long term solution is
to mark the loops with metadata (similar to the approach used to fix
-fno-unroll-loops in D77058), but this enables a shorter term
workaround, and actually makes these internal options useful.
I constant propagated the initial values of these internal flags into
the pass manager flags (for some reasons vectorize-loops and
interleave-loops were initialized to true, while vectorize-slp was
initialized to false). As mentioned above, they are overwritten
unconditionally so this doesn't have any real impact, and these initial
values aren't particularly meaningful.
I then changed the passes to check the internl values and return without
performing the associated optimization when false (I changed the default
of -vectorize-slp to true so the options behave similarly). I was able
to remove the hackery in opt used to get -vectorize-loops=false to work,
as well as a special option there used to disable SLP vectorization.
Finally, I changed thinlto-slp-vectorize-pm.c to:
a) Only test SLP (moved the loop vectorization checking to a new test).
b) Use code that is slp vectorized when it is enabled, and check that
instead of whether the pass is enabled.
c) Test the new behavior of -vectorize-slp.
d) Test both pass managers.
The loop vectorization (and associated interleaving) testing I moved to
a new thinlto-loop-vectorize-pm.c test, with several changes:
a) Changed the flags on the interleaving testing so that it will
actually interleave, and check that.
b) Test the new behavior of -vectorize-loops and -interleave-loops.
c) Test both pass managers.
Reviewers: fhahn, wmi
Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77989
This should make both static and dynamic NewPM plugins work with LTO.
And as a bonus, it makes static linking of OldPM plugins more reliable
for plugins with both an OldPM and NewPM interface.
I only implemented the command-line flag to specify NewPM plugins in
llvm-lto2, to show it works. Support can be added for other tools later.
Differential Revision: https://reviews.llvm.org/D76866
Originally committed as 416fa7720e
Reverted (due to buildbot failure - breaking lldb) in 7a45aeacf3.
I still can't seem to build lldb locally, but Pavel Labath has kindly
provided a potential fix to preserve the old behavior in lldb by
registering a simple recoverable error handler there that prints to the
desired stream in lldb, rather than stderr.
Summary:
This patch fix the following issues in InstCombiner::visitGetElementPtrInst
1. Skip for scalable type if transformation requires fixed size number of
vector element.
2. Skip for scalable type if transformation relies on compile-time known type
alloc size.
3. Use VectorType::getElementCount when scalable property is used to construct
new VectorType.
4. Use TypeSize::getKnownMinSize when minimal size of a scalable type is valid to determine GEP 'inbounds'.
5. Explicitly call TypeSize::getFixedSize to avoid implicit type conversion to uint64_t.
Reviewers: sdesmalen, efriedma, spatel, ctetreau
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78081
GCC emits this new form along with others forms(supported in llvm-dwardump)
and since it's support was missing in llvm-dwarfdump, it was not
able to correctly dump the content a debug_macro section for GCC
generated binaries.
This patch extends llvm-dwarfdump to support this form,
now GCC generated debug_macro section can be correctly dumped
using llvm-dwarfdump.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D78006
FileCheckImpl.h internal header uses type defined in the public
FileCheck.h header but fails to include it. This commit fixes that.
Test Plan: Built it locally successfully.
Summary:
Only first-faulting and non-faulting loads read/update the FFR register
and hence only the corresponding SDNodes should be decorated with
`SDNPOptInGlue` and `SDNPOutGlue`. This patch:
* removes SDNPOptInGlue from regular loads stores (FFR is not read)
* adds SDNPOutGlue to first-faulting and non-faulting loads (FFR is
both read and updated)
Differential Revision: https://reviews.llvm.org/D77724
Summary:
AbstractCallSite::getCallbackUses() does not check that callback callee index from
the callback metadata does not exceed the total number of call arguments. This patch
add such validation check.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78112
The pass was incorrectly reverting back to a "T" when something wrote
to VPR inside a "E" block. This is not the correct behaviour, the
predicate should stay the same.
Differential Revision: https://reviews.llvm.org/D77798
Summary:
Without that we could be silently reading zeroes, as that's the default
DataExtractor behavior. The entire parse would still most likely fail,
but it would do that with a seemingly unrelated/nonsensical error
message.
Reviewers: dblaikie, probinson, jhenderson
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77554
This patch fixes 2 related bugs in ADCE:
- `performDeadCodeElimination` does not report changes if it did ONLY
CFG changes (affects both old and new pass managers);
- When control flow removal is enabled, new pass manager does not
drop CFG analyses.
Both can lead to incorrect loop info after ADCE that does only CFG changes.
Differential Revision: https://reviews.llvm.org/D78103
Reviewed By: Denis Antrushin
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.
Differential revision: https://reviews.llvm.org/D78016
Summary:
Creates the SVEIntrinsicOpts pass. In this patch, the pass tries
to remove unnecessary reinterpret intrinsics which convert to
and from svbool_t (llvm.aarch64.sve.convert.[to|from].svbool)
For example, the reinterprets below are redundant:
%1 = call <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool.nxv4i1(<vscale x 4 x i1> %a)
%2 = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> %1)
The pass also looks for ptest intrinsics and phi instructions where
the operands are being needlessly converted to and from svbool_t.
Reviewers: sdesmalen, andwar, efriedma, cameron.mcinally, c-rhodes, rengolin
Reviewed By: efriedma
Subscribers: mgorny, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76078
The _GLOBAL_OFFSET_TABLE_ in SysVr4 ELF is conventionally the base of the
.got or .got.prel sections. Expressions such as _GLOBAL_OFFSET_TABLE_
- (.L1 +8) are used in assembler code to calculate offsets into the .got.
At present MC outputs a R_ARM_REL32 with respect to the
_GLOBAL_OFFSET_TABLE_ symbol, whereas gas outputs a R_ARM_BASE_PREL
relocation with respect to the _GLOBAL_OFFSET_TABLE_ symbol. While both are
correct the R_ARM_REL32 depends on the value of the _GLOBAL_OFFSET_TABLE_
symbol, wheras te R_ARM_BASE_PREL relocation is idependent of the symbol.
The R_ARM_BASE_PREL is therefore slightly more robust to linker's that may
not follow the conventional placement of _GLOBAL_OFFSET_TABLE_; for example
LLD for some time defined _GLOBAL_OFFSET_TABLE_ to 0.
Differential Revision: https://reviews.llvm.org/D46319
Summary:
Following up on the comments on D77638.
Not undoing rGd6525eff5ebfa0ef1d6cd75cb9b40b1881e7a707 here at the moment, since I don't know how to test mac builds. Please let me know if I should include that here too.
Reviewers: vitalybuka
Reviewed By: vitalybuka
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77889
Summary:
This iz pattern is a special pattern of im pattern. This im pattern
has been supported by https://reviews.llvm.org/D77769, so removing
iz pattern as a continuous patch.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D77770
Ports the existing DAG combines, minus the simplify demanded bits
which seems to have no equivalent now. Without these, this isn't
particularly helpful in most of the IR sample cases.
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.
Differential Revision: https://reviews.llvm.org/D77994
Summary:
In CFGStackify, `fixUnwindMismatches` function fixes unwind destination
mismatches created by `try` marker placement. For example,
```
try
...
call @qux ;; This should throw to the caller!
catch
...
end
```
When `call @qux` is supposed to throw to the caller, it is possible that
it is wrapped inside a `catch` so in case it throws it ends up unwinding
there incorrectly. (Also it is possible `call @qux` is supposed to
unwind to another `catch` within the same function.)
To fix this, we wrap this inner `call @qux` with a nested
`try`-`catch`-`end` sequence, and within the nested `catch` body, branch
to the right destination:
```
block $l0
try
...
try ;; new nested try
call @qux
catch ;; new nested catch
local.set n ;; store exnref to a local
br $l0
end
catch
...
end
end
local.get n ;; retrieve exnref back
rethrow ;; rethrow to the caller
```
The previous algorithm placed the nested `try` right before the `call`.
But it is possible that there are stackified instructions before the
call from which the call takes arguments.
```
try
...
i32.const 5
call @qux ;; This should throw to the caller!
catch
...
end
```
In this case we have to place `try` before those stackified
instructions.
```
block $l0
try
...
try ;; this should go *before* 'i32.const 5'
i32.const 5
call @qux
catch
local.set n
br $l0
end
catch
...
end
end
local.get n
rethrow
```
We correctly handle this in the first normal `try` placement phase
(`placeTryMarker` function), but failed to handle this in this
`fixUnwindMismatches`.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77950
This patch extracts the RTTI part of llvm::ErrorInfo into its own class
(RTTIExtends) so that it can be used in other non-error hierarchies, and makes
it compatible with the existing LLVM RTTI function templates (isa, cast,
dyn_cast, dyn_cast_or_null) by adding the classof method.
Differential Revision: https://reviews.llvm.org/D39111
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: rriddle, sdesmalen, efriedma
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77259
in the same section.
This allows specifying BasicBlock clusters like the following example:
!foo
!!0 1 2
!!4
This places basic blocks 0, 1, and 2 in one section in this order, and
places basic block #4 in a single section of its own.
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.
As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.
Reviewers: dsanders, aprantl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77915
Since 1725f28841, this should check
isFMADLegalForFAddFSub rather than the the plain isOperationLegal.
This would assert in a subset of cases due to an oddity in how FMAD is
selected. We will allow FMA formation pre-legalize, but not FMAD even
in cases where it would be valid.
The current hook requires passing in the root fadd/fsub. However, in
this distributed case, this would be far more complicated to pass in
the relevant operand. AMDGPU doesn't get any value from the node, and
only needs the type and is the only implementor, so I'm not sure why
we have this complexity. Just rename and expand the assert to avoid
the more complicated checks spread through the distribution logic.
The shuffle decoding is used by X86ISelLowering and
MCTargetDesc/X86InstComments. The latter used to be in a
separate InstPrinter library. The Utils library existed to allow
InstPrinter and CodeGen to share the shuffle decoding. Since
X86InstComments now lives in the MCTargetDesc, which CodeGen
already depends on, we can sink the shuffle decoding there as well.
Differential Revision: https://reviews.llvm.org/D77980
Summary:
Fold (shift (shift X, C2), C1) -> (shift X, (C1 + C2)) for logical as
well as arithmetic shifts. This is needed to prevent regressions from
an upcoming funnel shift expansion change.
While we're here, fold (VSRAI -1, C) -> -1 too.
Reviewers: RKSimon, craig.topper
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77300
Improve the chances of folding the writemask into the combined shuffle by scaling a wider shuffle mask to match the root's original type.
This creates a few minor issues with variable shuffles, preventing combines of shuffles because of the more limited support binary shuffle types. In most cases we're probably better off combining the shuffles and losing the writemask fold, but this isn't always going to be true.
Remove SmallPtrSet include, replace with forward declaration and include SmallPtrSet.h in CodeMetrics.cpp directly.
Remove unused llvm::DataLayout/Instruction forward declarations.
Pass from the calling recipe the interleave group itself instead of passing the
group's insertion position and having the function query CM for its interleave
group and making sure that given instruction is the insertion point of.
Differential Revision: https://reviews.llvm.org/D78002
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.
Reviewers: jdoerfert, hfinkel
Reviewed By: jdoerfert
Subscribers: hiraditya, mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77402
Widening a selects depends on whether the condition is loop invariant or
not. Rather than checking during codegen-time, the information can be
recorded at the VPlan construction time.
This was suggested as part of D76992, to reduce the reliance on
accessing the original underlying IR values.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: gilr
Differential Revision: https://reviews.llvm.org/D77869
* Reorganize tests and add coverage
* Improve diagnostic testing
* Make assert() tests more relevant
* Rename tests to macro-* or altmacro-*
This is not NFC because a (previously untested) diagnostic message is changed.
Summary:
StringPool has many caveats and isn't used in the monorepo. I will
propose removing it as a patch separate from this refactoring patch.
Reviewers: rriddle
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77976
Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG
does, leading to accidental SDValue removal before its reference count was
truly zero.
Differential Revision: https://reviews.llvm.org/D76994
Reviewed-By: bjope
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049
Reverted in 3ce77142a6 because the previous patch
broke the expensive-checks bots. The new patch removes the broken check.
Summary:
Updated CallPromotionUtils and impacted sites. Parameters that are
expected to be non-null, and return values that are guranteed non-null,
were replaced with CallBase references rather than pointers.
Left FIXME in places where more changes are facilitated by CallBase, but
aren't CallSites: Instruction* parameters or return values, for example,
where the contract that they are actually CallBase values.
Reviewers: davidxl, dblaikie, wmi
Reviewed By: dblaikie
Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77930
Summary:
StringMapEntry.h can have lower dependencies, than StringMap.h, which
is useful for public headers that want to expose inline methods on
StringMapEntry<> but don't need to expose all of StringMap.h. One
example of this is mlir's Identifier.h, another example is the existing
LLVM StringPool.h.
StringPool also could use a cleanup, I'll deal with that in a follow-on
patch.
Reviewers: rriddle
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77963
As discussed in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c13
...we can avoid the likely slow round-trip from XMM to GPR to XMM
by using the vector versions of the convert instructions.
Based on experimental results from recent Intel/AMD chips, we don't
need to worry about triggering denorm stalls while operating on
garbage data in the high lanes with convert instructions, so this is
expected to always be as good or better perf than the scalar
instruction equivalent. FP exceptions are also not a concern because
strict code should not be using the regular SDAG opcodes.
Differential Revision: https://reviews.llvm.org/D77895
This is similar to the recent move/addition of "scaleShuffleMask" (D76508),
but there are a couple of differences:
1. The existing x86 helper (canWidenShuffleElements) always tries to
divide-by-2, so it gets called iteratively and wouldn't handle the
general case of non-pow-2 length.
2. The existing x86 code handles "SM_SentinelZero" - we don't have
that in IR, but this code should be safe to use with that or other
special (negative) values.
The motivation is to enable shuffle folds in instcombine/vector-combine
that are similar to D76844 and D76727, but in the reverse-bitcast direction.
Those patterns are visible in the tests for D40633.
Differential Revision: https://reviews.llvm.org/D77881
A lot of vectorized code doesn't use masks so we shouldn't penalize them by not doing shuffle combining on avx512 targets.
I've added support for VALIGNQ/VALIGND and 512-bit SHUF128 to prevent some regressions. I also prevented recombining 256-bit SHUF128 to PERM2X128. We may not need to add 256-bit SHUF128 support, but I don't think I found any cases requiring that in my testing.
Differential Revision: https://reviews.llvm.org/D77928
-Drop llvm:: on MVT::i32
-Use getValueType instead of getSimpleValueType for an equality
check just cause its shorter and doesn't matter.
-Don't create a const SDValue & since its cheap to copy.
-Remove explicit case from MVT enum to EVT.
-Add message to assert.
Summary: A count profile may affect tail duplication's heuristic causing a block to be duplicated in only a part of its predecessors. This is not allowed in the Machine Block Placement pass where an assert will go off. I'm removing the assert and making the optimization bail out when such case happens.
Reviewers: wenlei, davidxl, Carrot
Reviewed By: Carrot
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77748
Selection would fail after the post legalize combiner put an illegal
zextload back together.
The base combiner has parameter to only allow legal operations, but
they appear to not be used. I also don't see a nice way to remove a
single entry from all_combines, so just hack around this.
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.
While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
- Move Adapters array to the stack, we know the size precisely
- Parse format string on demand into a SmallVector. In theory this could
lead to parsing it multiple times, but I couldn't find a single instance
of that in LLVM.
- Make more of the implementation details private.
Default visibility for classes is private, so the private: at the top of
various class definitions is redundant.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D77810
This is the same as what was done to the CallLoweringInfo in
TargetLowering.h in r309159.
This is just a step on the way to replacing this with CallBase.
Summary: We allow non-relaxable instructions emitted into relaxable Fragment when we prefix padding branch. So we need to check if the instruction need relaxation before relaxing it. Without this patch, it currently triggers a `report_fatal_error` in `llvm::MCAsmBackend::relaxInstruction` when we prefix padding branch along with `--mc-relax-all`.
Reviewers: LuoYuanke, reames, MaskRay
Reviewed By: MaskRay
Subscribers: MaskRay, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77851
I only left it at the interface to ParseConstraints since that
needs updates to other callers in different files. I'll do that
as a follow up.
Differential Revision: https://reviews.llvm.org/D77892
There was another issue introduced by this commit that the OP
initially missed. Namely, for functions that are free to use
R2 as a callee-saved register, we emit a TOC expression based
on the address of the GEP label without emitting the GEP label.
Since we only emit such expressions for the large code model, this
issue only surfaced there.
I have confirmed that with this fix, the kernel build is successful
with target "all".
This reverts commit 60c642e74b.
This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.
Differential Revision: https://reviews.llvm.org/D77925
This probably isn't ideal - the error was being printed specifically
inline with the dumping that was more legible - but then the error
wasn't reported to stderr and didn't produce a non-zero exit code.
Probably the error message could be improved by adding more context now
that it isn't printed in-situ of the DIE dumping as much.
Revision a1c05fe <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad>
removed bitcast from the list of problematic transformations, however:
%97 = fptrunc ppc_fp128 %2 to double // we need to check ppc_fp128 here to prevent the transformation
%98 = bitcast double %97 to i64 // a1c05fe checks ppc_fp128 at here
%99 = icmp slt i64 %98, 0
%100 = zext i1 %99 to i8
store i8 %100, i8* %7, align 1
so this patch does that. I'm also disabling it in the presence of extend just in case.
I verified separately that the hash of -std::infinity and std::infinity don't match now.
Differential Revision: https://reviews.llvm.org/D77911
Summary:
Don't attempt to analyze the decomposed GEP for scalable type.
GEP index scale is not compile-time constant for scalable type.
Be conservative, return MayAlias.
Explicitly call TypeSize::getFixedSize() to assert on places where
scalable type doesn't make sense.
Add unit tests to check functionality of -basicaa for scalable type.
This patch is needed for D76944.
Reviewers: sdesmalen, efriedma, spatel, bjope, ctetreau
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77828
Summary:
This allows us to test each backend pass under the presence
of debug info using pre-existing tests. The tests should not
fail as a result of this so long as it's true that debug info
does not affect CodeGen.
In practice, a few tests are sensitive to this:
* Tests that check the pass structure (e.g. O0-pipeline.ll)
* Tests that check --debug output. Specifically instruction
dumps containing MMO's (e.g. prelegalizercombiner-extends.ll)
* Tests that contain debugify metadata as mir-strip-debug will
remove it (e.g. fastisel-debugvalue-undef.ll)
* Tests with partial debug info (e.g.
patchable-function-entry-empty.mir had debug info but no
!llvm.dbg.cu)
* Tests that check optimization remarks overly strictly (e.g.
prologue-epilogue-remarks.mir)
* Tests that would inject the pass in an unsafe region (e.g.
seqpairspill.mir would inject between register alloc and
virt reg rewriter)
In all cases, the checks can either be updated or
--debugify-and-strip-all-safe=0 can be used to avoid being
affected by something like llvm-lit -Dllc='llc --debugify-and-strip-all-safe'
I tested this without the lost debug locations verifier to
confirm that AArch64 behaviour is unaffected (with the fixes
in this patch) and with it to confirm it finds the problems
without the additional RUN lines we had before.
Depends on D77886, D77887, D77747
Reviewers: aprantl, vsk, bogner
Subscribers: qcolombet, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77888
Summary:
The inline history is associated with a call site. There are two locations
we fetch inline history. In one, we fetch it together with the call
site. In the other, we initialize it under certain conditions, use it
later under same conditions (different if check), and otherwise is
uninitialized. Although currently there is no uninitialized use, the
code is more challenging to maintain correctly, than if the value were
always initialized.
Changed to the upfront initialization pattern already present in this
file.
Reviewers: davidxl, dblaikie
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77877
Summary:
A few tests start out with debug info and expect it to reach
the output. For these tests we shouldn't strip the debug info
Reviewers: aprantl, vsk, bogner
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77886
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: stoklund, sdesmalen, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77272
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: dexonsmith, sdesmalen, efriedma
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77276
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: dexonsmith, sdesmalen, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77274
Summary:
At the moment, any changes we make to the passes that can be
injected before/after others (e.g. -verify-machineinstrs and
-print-after-all) have to be duplicated in both
TargetPassConfig (for normal execution, -start-before/
-stop-before/etc) and llc (for -run-pass). Unify this pass
injection into addMachinePrePass/addMachinePostPass that both
TargetPassConfig and llc can use.
Reviewers: vsk, aprantl, bogner
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77887
Summary:
Refactor LiveRangeCalc such that it is now split into two classes
The objective is to split all the "register specific" logic away
from LiveRangeCalc.
The two new classes created are:
- LiveRangeCalc - is meant as a generic class to compute and modify
live ranges in a generic way. This class should deal only with
SlotIndices and VNInfo objects.
- LiveIntervalCals - is meant to be equivalent to the old LiveRangeCalc.
It computes the liveness virtual registers tracked by a LiveInterval
object.
With this refactoring LiveRangeCalc can be used to implement tracking of
liveness of LiveRanges that represent other things than just registers.
Subscribers: MatzeB, qcolombet, mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76584