Without it, the count of renderer functions is inaccurate and, in some
edge cases (like the patterns added in D134354), we can actually
go out of bounds (run out of pre-allocated renderer function spaces
in the GISel state)
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D134861
Summary:
The existing undefined-bitfield-to-operand matching behavior is very
hard to understand, due to the combination of positional and named
matching. This can make it difficult to track down a bug in a target's
instruction definitions.
Over the last decade, folks have tried to work-around this in various
ways, but it's time to finally ditch the positional matching. With
https://reviews.llvm.org/D131003, there are no longer cases that
_require_ positional matching, and it's time to start removing usage
and support for it.
Therefore: add a (default-false) option, and set it to true only in
those targets that require positional matching today. Subsequent
changes will start cleaning up additional in-tree targets.
NOTE TO OUT OF TREE TARGET MAINTAINERS:
If this change breaks your build, you may restore the previous
behavior simply by adding:
let useDeprecatedPositionallyEncodedOperands = 1;
to your target's InstrInfo tablegen definition. However, this is
temporary -- the option will be removed in the future.
If your target does not set 'decodePositionallyEncodedOperands', you
may thus start migrating to named operands. However, if you _do_
currently set that option, I recommend waiting until a subsequent
change lands, which adds decoder support for named sub-operands.
Differential Revision: https://reviews.llvm.org/D134073
These names can then be matched by name against 'bits' fields in a
record, to populate an instruction's encoding.
This does _not_ yet change DecoderEmitter to allow by-name matching of
sub-operands. Unlike the encoder, the decoder already defaulted to not
supporting positional matching, and backends had workarounds in place
for the missing decoding support.
Additionally, use this new capability to allow the ARM and AArch64
backends not to require any positional operand matching.
Differential Revision: https://reviews.llvm.org/D131003
We have namespaces `DXIL` and `dxil`, which is just confusing. This
renames `DXIL` -> `dxil` making everything consistent.
While the LLVM coding standards don't have a clear direction here, I
chose lower case because by my current unscientific count there are
more places where we had the lowercase namespace than the uppercase.
This allows MachineCopyPropagation to eliminate copies of constant registers
such as zero registers. They were previously not being eliminated as the
check for MO.clobbersPhysReg(AvailSrc) would return true for constant
registers such as MIPS $zero.
To avoid having to manually add the zero registers to all CalleeSavedRegs
instantiations in tablegen, I instead added a new isConstant bit to the
Register and set this for MIPS, RISC-V, and AArch64 zero registers.
RegisterInfoEmitter.cpp looks at this flag and adds all constant registers
to the preserved register mask.
This may also benefit other passes but so far I have only seen differences
in MachineCopyPropagation. In the future it might make sense to generate
`isConstantPhysReg()` from this information.
Original source: 8588d8b814
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D131958
In `GIMatchTreeOpcodePartitioner::applyForPartition()`, the loop over
the possible leaves skip a leaf if the instruction does not care
about the instruction.
When processing the referenced operands in the next loop the same
leaves need to be skipped.
Later, when these leaves are added to all partitions, the bit vector
must be resized first before the bit representing the leaf is set.
This fixes a crash in llvm-tblgen.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D134192
CodeGenSchedModels::hasReadOfWrite tries to predicate whether the WriteDef is contained in the list of ValidWrites of someone ProcReadAdvance,
so that WriteID of WriteDef can be compressed and reusable.
It tries to iterate all ProcReadAdvance entry, but not all ProcReadAdvance defs also inherit from SchedRead.
Some ProcReadAdvances are defined by ReadAdvance.So it's not complete to enumerate all ProcReadAdvances if just iterate all SchedReads.
Differential Revision: https://reviews.llvm.org/D132205
The following changes are necessasy to get the generated tree
matcher to compile:
- In CodeExpansions::declare(), the assert() prevents connecting
two instructions. E.g. the match code
(match (MUL $t, $s1, $s2),
(SUB $d, $t, $s3)),
results in two declarations of $t, one for the def and one for
the use. Removing the assertion allows this construct.
If $t is later used, it is one of the operands, which should be
perfectly fine.
- The code emitted in GIMatchTreeVRegDefPartitioner::generatePartitionSelectorCode()
is not compilable:
- The value of NewInstrID should be emitted, not the name
- Both calls involving getOperand() end with one parenthesis too many
- Swaps generated condition for the partition code in the latter function
It also changes the rules i2p_to_p2i, fabs_fabs_fold, and fneg_fneg_fold
to use the tree matcher for a linear match. These rules are tested by:
CodeGen/AArch64/GlobalISel/combine-fabs.mir
CodeGen/AArch64/GlobalISel/combine-fneg.mir
CodeGen/AArch64/GlobalISel/combine-ptrtoint.mir
CodeGen/AMDGPU/GlobalISel/combine-add-nullptr.mir
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D133257
The class priority is expected to be at most 5 bits before it starts
clobbering bits used for other fields. Also clamp the instruction
distance in case we have millions of instructions.
AMDGPU was accidentally overflowing into the global priority bit in
some cases. I think in principal we would have wanted this, but in the
cases I've looked at, it had the counter intuitive effect and
de-prioritized the large register tuple.
Avoid using weird bit hack PPC uses for global priority. The
AllocationPriority field is really 5 bits, and PPC was relying on
overflowing this to 6-bits to forcibly set the global priority
bit. Split this out as a separate flag to avoid having magic behavior
for values above 31.
Currently there isn't a generic way to get a smaller register class
that can be produced from a subregister of a larger class. Replaces a
manually implemented version for AMDGPU. This will be used to improve
subregister support in the allocator.
`llvm` and downstream internal callers no longer use `array_lengthof`, so drop
the include everywhere.
Differential Revision: https://reviews.llvm.org/D133600
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.
Change call sites to use `std::size` instead.
Differential Revision: https://reviews.llvm.org/D133429
Propagate PC sections metadata to MachineInstr when FastISel is doing
instruction selection.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D130884
Allows things like `(G_PTR_ADD (G_PTR_ADD a, b), c)` to be
simplified into a single ADD3 instruction instead of two adds.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D131254
This commit moves the information on whether a register is constant into
the Tablegen files to allow generating the implementaiton of
isConstantPhysReg(). I've marked isConstantPhysReg() as final in this
generated file to ensure that changes are made to tablegen instead of
overriding this function, but if that turns out to be too restrictive,
we can remove the qualifier.
This should be pretty much NFC, but I did notice that e.g. the AMDGPU
generated file also includes the LO16/HI16 registers now.
The new isConstant flag will also be used by D131958 to ensure that
constant registers are marked as call-preserved.
Differential Revision: https://reviews.llvm.org/D131962
If a MachineInstr's operand should be Reg in compiler's output but is
currently FrameIndex, `isCompressibleInst()` will terminate at
`MachineOperandType::getReg()`.
This patch adds `.isReg()` checks to make `isCompressibleInst()` return
false for these MachineInstr, allowing `getInstSizeInBytes()` to return
a value and `EstimateFunctionSizeInBytes()` to work as intended.
See https://reviews.llvm.org/D129999#3694222 for details.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D129999
Add initial support for NonNull attribute.
(https://github.com/llvm/llvm-project/issues/57113)
Test plan:
verify that for
__thread int x;
int main() {
int* y = &x;
return *y;
}
(with this patch) clang -O -fsanitize=null -S -emit-llvm -o -
doesn't emit a null-pointer check
Differential revision: https://reviews.llvm.org/D131872
This fixes some fallout from D131282. Currently, add_tablegen() will add the tablegen target to LLVM_EXPORTS and associates the install with LLVMExports. For non-standalone builds, this means that you end up with mlir-tblgen and clang-tblgen in LLVMExports.
However, these projects should instead be using MLIR_EXPORTS/MLIRTargets and CLANG_EXPORTS/ClangTargets. To fix this, add an extra EXPORT option and make use of get_target_export_arg() to create the correct export argument.
Reviewed By: ashay-github
Differential Revision: https://reviews.llvm.org/D131565
Prior to this patch, the `add_tablegen()` macro in
llvm/cmake/modules/TableGen.cmake added the install rule only if
`project` matched `LLVM` or `MLIR`. This patch adds an optional
`DESTINATION` argument, which, if non-empty, decides whether (and where)
to install the tablegen tool, thus eliminating the need for
project-specific overrides. This patch also updates all other
invocations of the `add_tablegen()` macro.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D131282
When using AdditionalArguments in a GICombinerHelper, the generator
does not put a space between the type and the name.
E.g.
let AdditionalArguments = [GICombinerHelperArg<"bool", "IsSomething">];
ends up as
boolIsSomething) const;
in the generated file. This change adds a space between the type and the name.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D130823
A new helper class DXILOpBuilder is added to create DXIL op function calls.
TableGen backend for DXILOperation will create table for DXIL op function parameter types.
When create DXIL op function, these parameter types will used to create the function type.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D130291
llvm::sort is beneficial even when we use the iterator-based overload,
since it can optionally shuffle the elements (to detect
non-determinism). However llvm::sort is not usable everywhere, for
example, in compiler-rt.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D130406
This change improves ctags generation for tablegen files.
For the following example
```
class A;
class A {
int a;
}
```
Previously, tags were generated only for a forward declaration of class 'A'.
This patch allows generating tags for the forward declarations
and further definition of class 'A'.
Reviewed By: barannikov88
Original patch by: rusyaev-roman (Roman Rusyaev)
Some adjustments by: nhaehnle (Nicolai Hähnle)
Differential Revision: https://reviews.llvm.org/D129935
This patch introduce an automatic generation of the clause parser from the TableGen
information.
New information can be stored directly in the TableGen file:
- The different aliases that a clause support.
- prefix before a value.
- whether a prefix is optional or not.
Makes it easier to add new clauses and also avoid some error (`write` clause incorrect until now).
This patch is updating only the OpenACC part. A patch with a modification of the OpenMP clause parser will follow.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D106968
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo. The allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.
Recommitted with some fixes for the leftover MCII variables in release
builds.
Differential Revision: https://reviews.llvm.org/D129506
This reverts commit e2fb8c0f4b as it does
not build for Release builds, and some buildbots are giving more warning
than I saw locally. Reverting to fix those issues.
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo. The allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.
Differential Revision: https://reviews.llvm.org/D129506
SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults in to the zext behavior.
The tablegen changes are pretty ugly, but partially helps migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.
I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.
This change introduces the HasNoUse builtin predicate in PatFrags that
checks for the absence of use of the first result operand.
GlobalISelEmitter will allow source PatFrags with this predicate to be
matched with destination instructions with empty outs. This predicate is
required for selecting the no-return variant of atomic instructions in
AMDGPU.
Differential Revision: https://reviews.llvm.org/D125212
The previous code had a bug when dealing with matching iPTR against a
set of integer types. It was trying to handle it all in a compact way,
but that implementation couldn't be modified to correct the problem in
a simple way. The code wasn't long, and it was easier to rewrite it.
The actual issue was that non-scalar-integer types were considered when
matching against iPTR. For example {iPTR} intersected with {i32 f32}
was {iPTR} (due to multiple types in the other set), but should be just
{i32}, because i32 is the only integer scalar in the other set.
The `hasType` function may be given a type that has been modified from
its original form (in particular made "simple", due to a predicate).
Make sure that such a type is still recognized as associated with a
register class, if the class contains it under any hw-mode.
This is somewhat optimistic though, since there is no information as
to where that simple type originated from.
This is used by disassemblers: `llvm-mc -disassemble -mattr=` and `llvm-objdump --mattr=`.
The main use case is for llvm-objdump to disassemble all known instructions
(D128030).
In user-facing tools, "all" is intentionally not supported in producers:
integrated assembler (`.arch_extension all`), clang -march (`-march=armv9.3a+all`).
Due to the code structure, `llvm-mc -mattr=+all` `llc -mattr=+all` are not
rejected (they are internal tool). Add `llvm/test/CodeGen/AArch64/mattr-all.ll`
to catch behavior changes.
AArch64SysReg::SysReg::haveFeatures: check `FeatureAll` to print
`AArch64SysReg::SysReg::AltName` for some system registers (e.g. `ERRIDR_EL1, RNDR`).
AArch64.td: add `AssemblerPredicateWithAll` to additionally test `FeatureAll`.
Change all `AssemblerPredicate` (except `UseNegativeImmediates`) to `AssemblerPredicateWithAll`.
utils/TableGen/{DecoderEmitter,SubtargetFeatureInfo}.cpp: support arbitrarily
nested all_of, any_of, and not.
Note: A predicate supports all_of, any_of, and not. For a target (though
currently not for AArch64) an encoding may be disassembled differently with
different target features.
Note: AArch64MCCodeEmitter::computeAvailableFeatures is not available to
the disassembler.
Reviewed By: peter.smith, lenary
Differential Revision: https://reviews.llvm.org/D128029