SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults in to the zext behavior.
The tablegen changes are pretty ugly, but partially helps migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.
I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.
This change introduces the HasNoUse builtin predicate in PatFrags that
checks for the absence of use of the first result operand.
GlobalISelEmitter will allow source PatFrags with this predicate to be
matched with destination instructions with empty outs. This predicate is
required for selecting the no-return variant of atomic instructions in
AMDGPU.
Differential Revision: https://reviews.llvm.org/D125212
The builtin predicate handling has a strange behavior where the code
assumes that a PatFrag is a stack of PatFrags, and each level adds at
most one predicate. I don't think this particularly makes sense,
especially without a diagnostic to ensure you aren't trying to set
multiple at once.
This wasn't followed for address spaces and alignment, which could
potentially fall through to report no builtin predicate was
added. Just switch these to follow the existing convention for now.
Add void casts to mark the variables used, next to the places where
they are used in assert or `LLVM_DEBUG()` expressions.
Differential Revision: https://reviews.llvm.org/D123117
Based on the output of include-what-you-use.
It's an utility directory, so no much impact on other code areas.
clang++ -E -Iinclude -I../llvm/include ../llvm/utils/TableGen/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 4327274
after: 4316190
Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118466
This reverts commit fd4808887e.
This patch causes gcc to issue a lot of warnings like:
warning: base class ‘class llvm::MCParsedAsmOperand’ should be
explicitly initialized in the copy constructor [-Wextra]
Since 65b13610a5, raw_string_ostream has
been unbuffered by default. Based on an audit of llvm/utils/, this
commit removes every call to `raw_string_ostream::flush()` and any call
to `raw_string_ostream::str()` whose result is ignored or that doesn't
help with clarity.
I left behind a few calls to `str()`. In these cases, the underlying
std::string was declared pretty far away and never used again, whereas
stream recently had its last write. The code is easier to read as-is;
the no-op call to `flush()` inside `str()` isn't harmful, and when
https://reviews.llvm.org/D115421 lands it'll be gone anyway.
During rule optimization, identical SameOperandMatchers are hoisted into a common group,
however previously only one operand index was considered.
Commutable patterns can introduce SameOperandMatcher checks where the second index is commuted,
resulting in a different check that cannot be hoisted.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D111506
This patch relands https://reviews.llvm.org/D104454, but fixes some failing
builds on Mac OS which apparently has a different definition for size_t,
that caused 'ambiguous operator overload' for the implicit conversion
of TypeSize to a scalar value.
This reverts commit b732e6c9a8.
To reflect that the size may be scalable, a TypeSize is returned
instead of an unsigned. In places where the result is used,
it currently relies on an implicit cast of TypeSize -> uint64_t,
which asserts that the type is not scalable.
This patch is NFC for fixed-width vectors.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104454
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector
The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
same number of elements, then use LLT::vector(OtherTy.getElementCount())
or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2)
or operator*. That is because there is no reason to specifically restrict
the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
just use fixed_vector. This will need to be fixed up in the future when
modifying the algorithm to also work for scalable vectors, and will need
then need additional tests to confirm the behaviour works the same for
scalable vectors.
* If the test used the '/*Scalable=*/true` flag of LLT::vector, then
this is replaced by LLT::scalable_vector.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104451
This patch aims to add the scalable property to LLT. The rest of the
patch-series changes the interfaces to take/return ElementCount and
TypeSize, which both have the ability to represent the scalable property.
The changes are mostly mechanical and aim to be non-functional changes
for fixed-width vectors.
For scalable vectors some unit tests have been added, but no effort has
been put into making any of the GlobalISel algorithms work with scalable
vectors yet. That will be left as future work.
The work is split into a series of 5 patches to make reviews easier.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D104450
This uses to be how predicates were handled prior to HwMode being
added. When the Predicates were converted to a std::vector it
significantly increased the cost of a compare in GenerateVariants.
Since ListInit's are uniquified by tablegen, we can use a simple
pointer comparison to check for identical lists.
In order to store the HwMode, we now add a separate string to
PatternToMatch. This will be appended separately to the predicate
string in getPredicateCheck. A new getPredicateRecords is added
to allow GlobalISel and getPredicateCheck to both get the sorted
list of Records. GlobalISel was ignoring any HwMode predicates
before and still is.
There is one slight change here, ListInits with different predicate
orders aren't sorted so the filtering in GenerateVariants might
fail to detect two isomorphic patterns with different predicate
orders. This doesn't seem to be happening in tree today.
My hope is this will allow us to remove all the BitVector tracking
in GenerateVariants that was making up for predicates beeing
expensive to compare. There's a decent amount of heap allocations
there on large targets like X86, AMDGPU, and RISCV.
Differential Revision: https://reviews.llvm.org/D100691
When GlobalISelEmitter::emitCxxPredicateFns emitted code for MI
predicates it used "PatFrag" when searching for definitions. With
this patch it will search for all "PatFrags" instead. Since PatFrag
derives from PatFrags the difference is that we now include all
definitions using PatFrags directly as well. Thus making it possible
to use GISelPredicateCode together with a PatFrags definition.
It might be noted that the matcher code was emitted also for PatFrags
in the past. But then one ended up with errors since the custom code
in testMIPredicate_MI was missing.
Differential Revision: https://reviews.llvm.org/D98486
To do this while supporting the existing functionality in SelectionDAG of using
PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis
dependencies to the instruction selector pass.
Then, use the predicate to generate constant pool loads for f32 materialization,
if we're targeting optsize/minsize.
Differential Revision: https://reviews.llvm.org/D97732
Allow different GICustomOperandRenderers to use the same RendererFn.
This avoids the need for targets to define a bunch of identical C++
renderer functions with different names.
Without this fix TableGen would have emitted code that tried to define
the GICR enumeration with duplicate enumerators.
Differential Revision: https://reviews.llvm.org/D96587
Before this patch there was generic mapping from vector_extract
to G_EXTRACT_VECTOR_ELT added in SelectionDAGCompat.td. That
mapping is now replaced by a mapping from extractelt instead.
The reasoning is that vector_extract is marked as deprecated,
so it is assumed that a majority of targets will use extractelt
and not vector_extract (and that the long term solution for all
targets would be to use extractelt).
Targets like AArch64 that still use vector_extract can add an
additional mapping from the deprecated vector_extract as target
specific tablegen definitions. Such a mapping is added for AArch64
in this patch to avoid breaking tests.
When adding the extractelt => G_EXTRACT_VECTOR_ELT mapping we
triggered some new code paths in GlobalISelEmitter, ending up in
an assert when trying to import a pattern containing EXTRACT_SUBREG
for ARM. Therefore this patch also adds a "failedImport" warning
for that situation (instead of hitting the assert).
Differential Revision: https://reviews.llvm.org/D93416
TableGen would pick the largest RC for constraining the operands, which
could potentially be an unallocatable RC. This patch removes selection
of unallocatable RCs.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D93945
The companion RFC (http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html) gives lots of details on the overall strategy, but we summarize it here:
LLVM IR involving vector types is going to be selected using pseudo instructions (only MachineInstr). These pseudo instructions contain dummy operands to represent the vector type being operated and the vector length for the operation.
These two dummy operands, as set by instruction selection, will be used by the custom inserter to prepend every operation with an appropriate vsetvli instruction that ensures the vector architecture is properly configured for the operation. Not in this patch: later passes will remove the redundant vsetvli instructions.
Register classes of tuples of vector registers are used to represent vector register groups (LMUL > 1).
Those pseudos are eventually lowered into the actual instructions when emitting the MCInsts.
About the patch:
Because there is a bit of initial infrastructure required, this is the minimal patch that allows us to select instructions for 3 LLVM IR instructions: load, add and store vectors of integers. LLVM IR operations have "whole-vector" semantics (as in they generate values for all the elements).
Later patches will extend the information represented in TableGen.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Co-Authored-by: Craig Topper <craig.topper@sifive.com>
Differential Revision: https://reviews.llvm.org/D89449
Tablegen seg faulted when parsing a Pat where the destination part has
no output (zero instruction), due to a register class lookup using
nullptr.
Reviewed By: Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D90829
Some of these were found by running clang-format over the generated
code, although that complains about far more issues than I have fixed
here.
Differential Revision: https://reviews.llvm.org/D90937
When nesting INSERT_SUBREG and EXTRACT_SUBREG, GlobalISelEmitter would
fail to find the register class of the nested node. This patch fixes
that for registers with subregs.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D88487
When generating matching tables for GlobalISel, TableGen would output
"::zero_reg" whenever encountering the zero_reg, which in turn would
result in compilation error. This patch fixes that by instead outputting
NoRegister (== 0), which is the same result that TableGen produces when
generating matching tables for ISelDAG.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86215
The "name" of a non-leaf complex pattern (MY_PAT $op1, $op2) is
"MY_PAT:op1:op2" and the ones with same "name" represent same operand.
Add 'same operand check' for this case.
Differential Revision: https://reviews.llvm.org/D87351
Predicates with 'let PredicateCodeUsesOperands = 1' want to examine
matched operands. When we encounter predicate code that uses operands,
analyze its named operand arguments and create a map between argument
index and name. Later, when leaf node with name is encountered, emit
GIM_RecordNamedOperand that will store that operand at its argument
index in operand list. This operand list will be an argument to c++
code of the predicate.
Differential Revision: https://reviews.llvm.org/D87285
When optimizing the table, PointerToAnyOperandMatchers would be
incorrectly reported as identical even though they have different
SizeInBits values. This bug was due to failing to overload the
isIdentical() method, which this patch addresses.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86199
This is to initially handleg immAllOnesV, which should match
G_BUILD_VECTOR or G_BUILD_VECTOR_TRUNC. In the future, it could be
used for other patterns cases that map to multiple G_* instructions,
such as G_ADD and G_PTR_ADD.