Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D112055
During rule optimization, identical SameOperandMatchers are hoisted into a common group,
however previously only one operand index was considered.
Commutable patterns can introduce SameOperandMatcher checks where the second index is commuted,
resulting in a different check that cannot be hoisted.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D111506
This patch fixes invalid syntax of generated code for InstrMapping
that has multiple columns and values.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D111962
This patch is to order the AVX instructions ahead of AVX512 instructions
in the matching table so that the AVX instructions can be matched first.
Thanks Craig and Shengchen for the idea.
Differential Revision: https://reviews.llvm.org/D111538
We are seeing extremely long time in building AMDGPUInstPrinter.cpp
when profile instrumentation is enabled: It takes more than 5 minutes
(compared to ~8 seconds in non-instrument build).
This caused by the huge statements in printInstruction functions. In
profile instrumentation build, we need have extra control flow to
differentiate each case statement. This in turn adds significant
compile time in block placement and branch folding.
Function printInstruction is not likely to benefit from PGO build
as it's rarely executed in a typical compilation. So here I disable
the profile instrumentation for this function.
Differential Revision: https://reviews.llvm.org/D111682
The operand of the second any_of in EnforceSmallerThan should be
B not S like the FP code in the if below.
Unfortunately, fixing that causes an infinite loop in the build
of RISCV. So I've added a workaround for that as well.
Fixes PR44768.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111502
As described on D111049, we're trying to remove the <string> dependency from error handling. In most cases the plan is to use the Twine() variant directly but to reduce introducing additional headers for the generated files, I'm using the const char* variant here instead.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.
Differential Revision: https://reviews.llvm.org/D110807
Tablegen currently expects targets to have at least one
pressure set for every broader register category. AMDGPU's
VGPR or AGPR, for instance, seemed to work correctly without
any pset, though we have forced one for each type to avoid
the assertion in computeRegUnitSets. However, psets can not
be entirely empty. At least one set is mandatory for every
target. This patch bypasses the assertion for the classes
when GeneratePressureSet is zero while ensuring the
RegUnitSets are not empty.
Reviewed By: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D110305
This is also a bug. The VK[1/2/4/8/16]PAIR here should be VK[1/2/4/8/16]Pair which has its
custom PrintMethod and ParserMatchClass. However we don't have any instructions using vvvv
and ModR/M.REG field so this issue is not exposed.
Differential Revision: https://reviews.llvm.org/D109564
Analogous to the TSFlags for machine instructions, this
patch introduces a bit vector for register classes to have
target specific flags that become a tablegened value in
TargetRegisterClass.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D108767
CodeGenMapTable.cpp refers to TableGen as TabelGen in the comments. This appears to be a typo. This patch fixes the typo.
Differential Revision: https://reviews.llvm.org/D76343
Adds MVT::i64x8, a Machine Value Type needed for lowering inline assembly
operands which materialize a sequence of eight general purpose registers.
Differential Revision: https://reviews.llvm.org/D94096
When setting Allocatable on a generated register class check all
superclasses and set Allocatable true if any superclass is
allocatable.
Without this change generated register classes based on an
allocatable class may end up unallocatable due to the topological
inheritance order.
This change primarily effects AMDGPU backend; however, there are
a few changes in MIPs GlobalISel register constraints as a result.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D105967
The maskmovdqu instruction is an odd one: it has a 32-bit and a 64-bit
variant, the former using EDI, the latter RDI, but the use of the
register is implicit. In 64-bit mode, a 0x67 prefix can be used to get
the version using EDI, but there is no way to express this in
assembly in a single instruction, the only way is with an explicit
addr32.
This change adds support for the instruction. When generating assembly
text, that explicit addr32 will be added. When not generating assembly
text, it will be kept as a single instruction and will be emitted with
that 0x67 prefix. When parsing assembly text, it will be re-parsed as
ADDR32 followed by MASKMOVDQU64, which still results in the correct
bytes when converted to machine code.
The same applies to vmaskmovdqu as well.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103427
Continuing from D105763, this allows placing certain properties
about attributes in the TableGen definition. In particular, we
store whether an attribute applies to fn/param/ret (or a combination
thereof). This information is used by the Verifier, as well as the
ForceFunctionAttrs pass. I also plan to use this in LLParser,
which also duplicates info on which attributes are valid where.
This keeps metadata about attributes in one place, and makes it
more likely that it stays in sync, rather than in various
functions spread across the codebase.
Differential Revision: https://reviews.llvm.org/D105780
Followup to D105658 to make AttrBuilder automatically work with
new type attributes. TableGen is tweaked to emit First/LastTypeAttr
markers, based on which we can handle type attributes
programmatically.
Differential Revision: https://reviews.llvm.org/D105763
My use case for this is illustrated in the test case: I want to define
the same instruction twice with different (disjoint) predicates, because
the instruction has different operands on different subtargets. It's
convenient to do this with a multiclass that also defines an alias for
the instruction.
Previously tablegen would complain if this alias was defined twice with
no predicate. One way to fix this would be to add a predicate on each
definition of the alias, matching the predicate on the instruction. But
this (a) is slightly awkward to do in the real world use case I had, and
(b) leads to an inefficient matcher that will do something like this:
if (Mnemonic == "foo_alias") {
if (Features.test(Feature_Subtarget1Bit))
Mnemonic == "foo";
else if (Features.test(Feature_Subtarget2Bit))
Mnemonic == "foo";
return;
}
It would be more efficient to skip the feature tests and return "foo"
unconditionally.
Overall it seems better to allow multiple definitions of the identical
alias with no predicate.
Differential Revision: https://reviews.llvm.org/D105033
This patch relands https://reviews.llvm.org/D104454, but fixes some failing
builds on Mac OS which apparently has a different definition for size_t,
that caused 'ambiguous operator overload' for the implicit conversion
of TypeSize to a scalar value.
This reverts commit b732e6c9a8.
To reflect that the size may be scalable, a TypeSize is returned
instead of an unsigned. In places where the result is used,
it currently relies on an implicit cast of TypeSize -> uint64_t,
which asserts that the type is not scalable.
This patch is NFC for fixed-width vectors.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104454
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector
The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
same number of elements, then use LLT::vector(OtherTy.getElementCount())
or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2)
or operator*. That is because there is no reason to specifically restrict
the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
just use fixed_vector. This will need to be fixed up in the future when
modifying the algorithm to also work for scalable vectors, and will need
then need additional tests to confirm the behaviour works the same for
scalable vectors.
* If the test used the '/*Scalable=*/true` flag of LLT::vector, then
this is replaced by LLT::scalable_vector.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104451
Having type symmetry with these is somewhat necessary when implementing support for 192-bit values.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104621
If an instruction has several operands and a PC-relative one is not the
first of them, the generator may produce the code that does not pass the
'Address' parameter to the printout method. For example, for an Arm
instruction 'LE LR, $imm', it reuses the same code as for other
instructions where the second operand is not PC-relative:
void ARMInstPrinter::printInstruction(...) {
...
case 11:
// BF16VDOTI_VDOTD, BF16VDOTI_VDOTQ, BF16VDOTS_VDOTD, ...
printOperand(MI, 1, STI, O);
O << ", ";
printOperand(MI, 2, STI, O);
break;
...
The patch fixes that by considering 'PCRel' when comparing
'AsmWriterOperand' values.
Differential Revision: https://reviews.llvm.org/D104698
This patch aims to add the scalable property to LLT. The rest of the
patch-series changes the interfaces to take/return ElementCount and
TypeSize, which both have the ability to represent the scalable property.
The changes are mostly mechanical and aim to be non-functional changes
for fixed-width vectors.
For scalable vectors some unit tests have been added, but no effort has
been put into making any of the GlobalISel algorithms work with scalable
vectors yet. That will be left as future work.
The work is split into a series of 5 patches to make reviews easier.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D104450
This patch changes RVV's policy for its supported list of fixed-length
vector types by capping by vector size rather than element count. Now
all 1024-byte vectors (of supported element types) are supported, rather
than all 256-element vectors.
This is a more natural fit for the architecture, and allows us to, for
example, improve the support for vector bitcasts.
This change necessitated the adding of some new simple types to avoid
"regressing" on the number of currently-supported vectors. We round out
the 1024-byte types by adding `v512i8`, `v1024i8`, `v512i16` and
`v512f16`.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103884
These types are (presumably) never used in the generated TableGen files.
The `default` switch case silences any compiler warnings for these
missing types so it's easy to miss.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103883
This gives a nice message about the location of errors in a large
tablegen file, which is much more useful for users
Differential Revision: https://reviews.llvm.org/D102740
When you try to define a new DEBUG_TYPE in a header file, DEBUG_TYPE
definition defined around the #includes in files include it could
result in redefinition warnings even compile errors.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D102594
If a cmpxchg specifies acquire or seq_cst on failure, make sure we
generate code consistent with that ordering even if the success ordering
is not acquire/seq_cst.
At one point, it was ambiguous whether this sort of construct was valid,
but the C++ standad and LLVM now accept arbitrary combinations of
success/failure orderings.
This doesn't address the corresponding issue in AtomicExpand. (This was
reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .)
Fixes https://bugs.llvm.org/show_bug.cgi?id=50512.
Differential Revision: https://reviews.llvm.org/D103284
getVectorNumElements() returns a value for scalable vectors
without any warning so it is effectively getVectorMinNumElements().
By renaming it and making getVectorNumElements() forward to
it, we can insert a check for scalable vectors into getVectorNumElements()
similar to EVT. I didn't do that in this patch because there are still more
fixes needed, but I was able to temporarily do it and passed the RISCV
lit tests with these changes.
The changes to isPow2VectorType and getPow2VectorType are copied from EVT.
The change to TypeInfer::EnforceSameNumElts reduces the size of AArch64's isel table.
We're now considering SameNumElts to require the scalable property to match which
removes some unneeded type checks.
This was motivated by the bug I fixed yesterday in 80b9510806
Reviewed By: frasercrmck, sdesmalen
Differential Revision: https://reviews.llvm.org/D102262
Since calling `PrintFatalError` will automatically add `error: `
prefix in the message printed, there is no need having an extra
`ERROR:` prefix in the argument passed.
Differential Revision: https://reviews.llvm.org/D102151
Reviewed By: Paul-C-Anagnostopoulos
This allows for a much more efficient encoding for small negative
numbers by storing the sign bit first and negating the rest of
the bits. This was already being used for OPC_CheckInteger.
For every in tree target this affects, the table got smaller.
R600GenDAGISel.inc saw the largest reduction of 7K.
I did have to add a new opcode for StringIntegers used for
register class ids and subregister indices since we don't have the
integer value to encode. The enum name is emitted directly into
the table. Previously assumed the enum would expand to a positive
7-bit number. We might be able to just shift that right by 1 and
assume it is a positive 6 bit number, but that will need more
investigation.
After D100691, predicates should be cheap to compare again so
we don't need to filter anymore.
This is mostly just a revert of several patches going back to 2018.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D100695