Propagate PC sections metadata to MachineInstr when FastISel is doing
instruction selection.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D130884
Although we only currently have one error produced in this function I am
working on changes right now that add some more. This change makes the
error location more accurate.
Differential Revision: https://reviews.llvm.org/D133016
Warn if `.size` is specified for a function symbol. The size of a
function symbol is determined solely by its content.
I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.
Differential Revision: https://reviews.llvm.org/D132929
This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis.
For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling.
Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU.
Differential Revision: https://reviews.llvm.org/D132520
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.
This is the change which motivated the whole sequence which preceeded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly dependend for correctness on not passing that property through.
I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possible effected must show up in the diff. This let me audit each one closely.
This is part of an ongoing transition to use OperandValueInfo which combines OperandValueKind and OperandValueProperties. This change adds some accessor methods and uses them to simplify backend code. The primary motivation of doing so is removing uses of the parameters so that an upcoming api change is less error prone.
WebAssembly globals are represented as IR globals with the wasm_var
address space (AS1). Prior to this patch, a wasm global load that isn't
lowerable will produce a failure to select, while a wasm global store
will produced incorrect code. This patch ensures we consistently produce
a clear error.
As noted in the test cases, it's conceivable that a frontend or an
optimisation pass could produce similar IR even in the presence of the
semantic restrictions on pointers to Wasm globals in the frontend, which
is a separate problem to address.
Differential Revision: https://reviews.llvm.org/D131387
1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method.
2) This patch also changes a few callsites (VectorCombine.cpp,
SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method.
3) This is a split of D128302.
Differential Revision: https://reviews.llvm.org/D131114
Only Emscripten supports dynamic linking with threads. To use
thread-local storage for other targets, this change defaults to the
`localexec` model.
Differential Revision: https://reviews.llvm.org/D130053
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo. The allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.
Recommitted with some fixes for the leftover MCII variables in release
builds.
Differential Revision: https://reviews.llvm.org/D129506
This reverts commit e2fb8c0f4b as it does
not build for Release builds, and some buildbots are giving more warning
than I saw locally. Reverting to fix those issues.
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo. The allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.
Differential Revision: https://reviews.llvm.org/D129506
(Reapply after revert in e9ce1a5880 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
Previously WebAssemblyCFGStackify, WebAssemblyInstrInfo, and
WebAssemblyPeephole all had equivalent logic for this. Move it into a
common helper in WebAssemblyUtilities.
Use the isExternRefType and isFuncRefType helpers rather than
reimplementing that logic in this function (which acts as a blocker to
work to prototype alternative IR-level representations of reference
types).
This relands 8ccc7e0aa4 (which had compilation errors).
Use the isExternRefType and isFuncRefType helpers rather than
reimplementing that logic in this function (which acts as a blocker to
work to prototype alternative IR-level representations of reference
types).
enabled
The C++20 Coroutines couldn't be compiled to WebAssembly due to an
optimization named symmetric transfer requires the support for musttail
calls but WebAssembly doesn't support it yet.
This patch tries to fix the problem by adding a supportsTailCalls
method to TargetTransformImpl to skip the symmetric transfer when
tail-call feature is not supported.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D128794
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.
The idea is to get support from the compiler to easily write efficient memory function implementations.
This patch could be split in two:
- one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
- and another one for the Clang part providing the instrinsic as a builtin.
Differential Revision: https://reviews.llvm.org/D126903
MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.
Add a clone method to MachineFunctionInfo. This is a subtle variant of
the copy constructor that is required if there are any MIR constructs
that use pointers. Specifically, at minimum fields that reference
MachineBasicBlocks or the MachineFunction need to be adjusted to the
values in the new function.
Refactor the tablegen definitions for relaxed SIMD min/max instructions to use a
shared RelaxedBinary multiclass modeled on the existing SIMDBinary multiclass. A
future commit will add further instruction definitions that use RelaxedBinary.
Also rename the SIMD_RELAXED_CONVERT multiclass to RelaxedConvert to better fit
existing naming conventions.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D127157
With f3b4f99007, the exclusive source of
truth for whether threads are supported is the -matomics flag.
Accordingly, strip TLS flags when -matomic is not specified, even if
bulk-memory is specified and it would theoretically be supportable.
This allows the backend to compile TLS variables when -mbulk-memory is
enabled but threads are not enabled.
Differential Revision: https://reviews.llvm.org/D125730
FixIrreducibleControlFlow pass adds dispatch blocks with a `br_table`
that has multiple predecessors and successors, because it serves as
something like a traffic hub for BBs. As a result of this, there can be
register uses that are not dominated by a def in every path from the
entry block. For example, suppose register %a is defined in BB1 and used
in BB2, and there is a single path from BB1 and BB2:
```
BB1 -> ... -> BB2
```
After FixIrreducibleControlFlow runs, there can be a dispatch block
between these two BBs:
```
BB1 -> ... -> Dispatch -> ... -> BB2
```
And this dispatch block has multiple predecessors, now
there is a path to BB2 that does not first visit BB1, and in that path
%a is not dominated by a def anymore.
To fix this problem, we have been adding `IMPLICIT_DEF`s to all
registers in PrepareForLiveInternals pass, and then remove unnecessary
ones in OptimizeLiveIntervals pass after computing `LiveIntervals`. But
FixIrreducibleControlFlow pass itself ends up violating register use-def
relationship, resulting in invalid code. This was OK so far because
MIR verifier apparently didn't check this in validation. But @arsenm
fixed this and it caught this bug in validation
(https://github.com/llvm/llvm-project/issues/55249).
This CL moves the `IMPLICIT_DEF` adding routine from
PrepareForLiveInternals to FixIrreducibleControlFlow. We only run it
when FixIrreducibleControlFlow changes the code. And then
PrepareForLiveInternals doesn't do anything other than setting
`TracksLiveness` property, which is a prerequisite for running
`LiveIntervals` analysis, which is required by the next pass
OptimizeLiveIntervals.
But in our backend we don't seem to do anything that invalidates this up
until OptimizeLiveIntervals, and I'm not sure why we are calling
`invalidateLiveness` in ReplacePhysRegs pass, because what that pass
does is to replace physical registers with virtual ones 1-to-1. I
deleted the `invalidateLiveness` call there and we don't need to set
that flag explicitly, which obviates all the need for
PrepareForLiveInternals.
(By the way, This 'Liveness' here is different from `LiveIntervals`
analysis. Setting this only means BBs' live-in info is correct, all uses
are dominated by defs, `kill` flag is conservatively correct, which
means if there is a `kill` flag set it should be the last use. See
2a0837aab1/llvm/include/llvm/CodeGen/MachineFunction.h (L125-L134)
for details.)
So this CL removes PrepareForLiveInternals pass altogether. Something
similar to this was attempted by D56091 long ago but that came short of
actually removing the pass, and I couldn't land it because
FixIrreducibleControlFlow violated use-def relationship, which this CL
fixes.
This doesn't change output in any meaningful way. All test changes
except `irreducible-cfg.mir` are register numbering.
Also this will likely to reduce compilation time, because we have been
adding `IMPLICIT_DEF` for all registers every time `-O2` is given, but
now we do that only when there is irreducible control flow, which is
rare.
Fixes https://github.com/llvm/llvm-project/issues/55249.
Reviewed By: dschuff, kripken
Differential Revision: https://reviews.llvm.org/D125515