This uses to be how predicates were handled prior to HwMode being
added. When the Predicates were converted to a std::vector it
significantly increased the cost of a compare in GenerateVariants.
Since ListInit's are uniquified by tablegen, we can use a simple
pointer comparison to check for identical lists.
In order to store the HwMode, we now add a separate string to
PatternToMatch. This will be appended separately to the predicate
string in getPredicateCheck. A new getPredicateRecords is added
to allow GlobalISel and getPredicateCheck to both get the sorted
list of Records. GlobalISel was ignoring any HwMode predicates
before and still is.
There is one slight change here, ListInits with different predicate
orders aren't sorted so the filtering in GenerateVariants might
fail to detect two isomorphic patterns with different predicate
orders. This doesn't seem to be happening in tree today.
My hope is this will allow us to remove all the BitVector tracking
in GenerateVariants that was making up for predicates beeing
expensive to compare. There's a decent amount of heap allocations
there on large targets like X86, AMDGPU, and RISCV.
Differential Revision: https://reviews.llvm.org/D100691
As discussed in D100691 and based on D100889.
I removed the ModeChecks cache which provides little value. Reduced
from three loops to two. Used ArrayRef to pass the Predicate to
AppendPattern to avoid needing to construct a vector for single
mode. Used SmallVector to avoid heap allocation constructing
DefaultCheck for the in tree targets the use it.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D101240
CommandLine.h is indirectly included in ~50% of TUs when building
clang, and VirtualFileSystem.h is large.
(Already remarked by jhenderson on D70769.)
No behavior change.
Differential Revision: https://reviews.llvm.org/D100957
The key here is HwMode indices. They're going to be small numbers,
contiguous, and only a few different values. I don't think we need
to go through the SmallDenseSet hashing.
A BitVector would be even better, but we don't have the upper
bound here.
A large portion of the patterns are duplicated for HwMode on RISCV.
If we expand HwMode first, we need to check nearly twice as many
patterns for variants. HwModes shouldn't affect whether a variant
is valid so we should be able to expand after.
This also reduces the RISCV isel table by 539 bytes due to factoring
working better on this pattern order. Unfortunately it increases
Hexagon table size by ~50 bytes. But I think this is a reasonable
trade.
This was causing GenerateVariants to lose some variants since
HwMode is expanded first. We were mistakenly thinking the HwMode
predicate matched and finding the variant was isomorphic to a
pattern in another HwMode and discarding it.
Found while investigating it if would be better to generate
variants before expanding HwModes to improve RISCV build time.
I noticed an increase in the number of Opc_MorphNodeTo in the table
which indicated that the number of patterns had changed.
hasMode was looking up the map once. Then we'd either call get which
would look up again, or we'd insert into the map which requires
walking the map to find the insertion point.
I believe the hasMode was needed because get has a special case
to look for DefaultMode if the mode being asked for doesn't exist.
We don't want that here so we were using hasMode to make sure we
wouldn't hit that case.
Simplify to a regular operator[] access which will default
construct a SetType if the lookup fails.
This is a followup to D98145: As far as I know, tracking of kill
flags in FastISel is just a compile-time optimization. However,
I'm not actually seeing any compile-time regression when removing
the tracking. This probably used to be more important in the past,
before FastRA was switched to allocate instructions in reverse
order, which means that it discovers kills as a matter of course.
As such, the kill tracking doesn't really seem to serve a purpose
anymore, and just adds additional complexity and potential for
errors. This patch removes it entirely. The primary changes are
dropping the hasTrivialKill() method and removing the kill
arguments from the emitFast methods. The rest is mechanical fixup.
Differential Revision: https://reviews.llvm.org/D98294
That's how it was originally intended but that wasn't possible because
we still needed to support older CMake versions.
The problem here is that the sources in TableGenGlobalISel are meant to
be linked into both llvm-tblgen and TableGenTests (a unit test), but not
be part of LLVM proper. So they shouldn't be an ordinary LLVM component.
Because they are used in llvm-tblgen, they can't draw in the LLVM dylib
dependency, but then we'd have to do the same thing in TableGenTests to
make sure we don't link both a static Support library and another copy
through the LLVM dylib.
With an object library we're just reusing the object files and don't
have to care about dependencies at all.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D74588
I have seen this error quite frequently in our out-of-tree CHERI backends
and the lack of location information sometimes makes it quite difficult
to track down the actual source of the error.
This patch changes the llvm_unreachable() to a PrintFatalError() so that
tablegen prints a stack of source locations.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99468
This patch addresses the removal of register size information done in
commit c8b782c.
Without this change, there is no viable option to get register size
information outside libTarget. We need this information to run
analysis that know the register size from the MC layer, used by
BOLT.
Discussion D50285 and D47199.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D97891
Follow up from D92955 and D83636. This patch makes the base cpp files
OMP.cpp and ACC.cpp normal files and they now include the XXX.inc file
generated by tablegen. This reduces the number of file generated by the
DirectiveEmitter backend and makes it closer to the proposal in D83636.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D93560
This patch reduces the time taken for clang to compile the generated
disassembler for an out-of-tree target with InsnType bigger than 64 bits
from 4m30s to 48s.
D67686 did a similar thing for CodeEmitterGen.
The idea is to tweak the API of the APInt-like InsnType class so that
we don't need so many temporary InsnTypes. This takes advantage of the
rule stated in D52100 that currently "no string of bits extracted
from the encoding may exceeed 64-bits", so we can use uint64_t for some
temporaries.
D52100 goes on to say that "fields are still permitted to exceed 64-bits
so long as they aren't one contiguous string of bits". This patch breaks
that by always using a "uint64_t tmp" in the generated decodeToMCInst,
but it should be easy to fix in FilterChooser::emitBinaryParser by
choosing to use a different type of tmp based on the known total field
width.
Differential Revision: https://reviews.llvm.org/D98046
1. Generate the mapping for clauses between the parser class and the
corresponding clause kind for OpenMP and OpenACC using tablegen.
2. Add a common function to get the OmpObjectList from the OpenMP
clauses to avoid repetition of code.
Reviewed by: Kiranchandramohan @kiranchandramohan , Valentin Clement @clementval
Differential Revision: https://reviews.llvm.org/D98603
When GlobalISelEmitter::emitCxxPredicateFns emitted code for MI
predicates it used "PatFrag" when searching for definitions. With
this patch it will search for all "PatFrags" instead. Since PatFrag
derives from PatFrags the difference is that we now include all
definitions using PatFrags directly as well. Thus making it possible
to use GISelPredicateCode together with a PatFrags definition.
It might be noted that the matcher code was emitted also for PatFrags
in the past. But then one ended up with errors since the custom code
in testMIPredicate_MI was missing.
Differential Revision: https://reviews.llvm.org/D98486
I've left mask registers to a future patch as we'll need
to convert them to full vectors, shuffle, and then truncate.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D97609
Some EVEX instructions should check the predicates when compress to VEX
encoding. For example, avx512vnni instructions. This is because avx512vnni
doesn't mean that avxvnni is supported on the target.
This patch moving the manually added check to .inc that generated by tablegen.
Differential Revision: https://reviews.llvm.org/D98011
- Add a new TableGen backend: CodeBeads
- Add support to generate logical operand information
For the first item, it is currently a workaround of M68k's (complex)
instruction encoding. A typical architecture, especially CISC one like
X86, normally uses `MCInstrDesc::TSFlags` to carry instruction encoding
info. However, at the early days of M68k backend development, we found
it difficult to fit every possible encoding into the 64-bit
`MCInstrDesc::TSFlags`. Therefore CodeBeads was invented to provide
an alternative, arbitrary length container for instruciton encoding
info. However, in the long term we incline not to use a new TG
backend for less common pattern like what we encountered in M68k. A bug
has been created to host to discussion on migrating from CodeBeads to
more concise solution: https://bugs.llvm.org/show_bug.cgi?id=48792
The second item was also served for similar purpose. It created utility
functions that tell you the index of a `MachineOperand` in a
`MachineInst` given a logical operand index. In normal cases a logical
operand is the same as `MachineOperand`, but for operands using complex
addressing mode a logical operand might be consisting of multiple
`MachineOperand`. The TableGen-ed `getLogicalOperandIdx`, for instance,
can give you the mapping between these two concepts. Nevertheless, we
hope to remove this feature in the future if possible. Since it's not
really useful for the targets supported by LLVM now either.
Authors: myhsu, m4yers, glaubitz
Differential Revision: https://reviews.llvm.org/D88385
This fixes an instance of:
warning: cast from 'const unsigned long *' to 'unsigned char *' drops const qualifier [-Wcast-qual]
when compiling the generated MCCodeEmitter for an out-of-tree target
that uses the optional support for instruction widths > 64 bits.
Differential Revision: https://reviews.llvm.org/D97942
This patch adds a pipeline to support in-order CPUs such as ARM
Cortex-A55.
In-order pipeline implements a simplified version of Dispatch,
Scheduler and Execute stages as a single stage. Entry and Retire
stages are common for both in-order and out-of-order pipelines.
Differential Revision: https://reviews.llvm.org/D94928
To do this while supporting the existing functionality in SelectionDAG of using
PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis
dependencies to the instruction selector pass.
Then, use the predicate to generate constant pool loads for f32 materialization,
if we're targeting optsize/minsize.
Differential Revision: https://reviews.llvm.org/D97732
There is a function attribute 'nomerge' in addition to 'noduplicate'
and 'convergent'. Both 'noduplicate' and 'convergent' have corresponding
intrinsic properties. This patch adds an intrinsic property for the
'nomerge' attribute.
Differential Revision: https://reviews.llvm.org/D96364
CheckInteger uses an int64_t encoded using a variable width encoding
that is optimized for encoding a number with a lot of leading zeros.
Negative numbers have no leading zeros so use the largest encoding
requiring 9 bytes.
I believe its most like we want to check for positive and negative
numbers near 0. -1 is quite common due to its use in the 'not'
idiom.
To optimize for this, we can borrow an idea from the bitcode format
and move the sign bit to bit 0 with the magnitude stored in the
upper bits. This will drastically increase the number of leading
zeros for small magnitudes. Then we can run this value through
VBR encoding.
This gives a small reduction in the table size on all in tree
targets except VE where size increased by about 300 bytes due
to intrinsic ids now requiring 3 bytes instead of 2. Since the
intrinsic enum space is shared by all targets this an unfortunate
consquence of where VE is currently located in the range.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D96317
Allow different GICustomOperandRenderers to use the same RendererFn.
This avoids the need for targets to define a bunch of identical C++
renderer functions with different names.
Without this fix TableGen would have emitted code that tried to define
the GICR enumeration with duplicate enumerators.
Differential Revision: https://reviews.llvm.org/D96587
Switch some for loops to just use the begin()/end() implementations
in the InfoByHwMode struct.
Add a method to insert into the map for the one case that was
modifying the map directly.
This attempts to move all tools over to using `add_llvm_library` for
better consistency. After doing this, I noticed it ended up as nearly a
reimplementation of https://reviews.llvm.org/rL342148, which later got
reverted in r342336 (b09a8c9bd9).
With ccache and ninja on a large core machine (40), I haven't run into
build errors, so I'm hopeful it's better now, though it doesn't seem to
be any different / new.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D90970
getSize and setSize both use unsigned. So size_t doesn't
increase range here and might get truncated if passed to
setSize.
Also not sure why EmitVBRValue was returning uint64_t, but used
an unsigned to supply the value.
This attempts to move all tools over to using `add_llvm_library` for
better consistency. After doing this, I noticed it ended up as nearly a
reimplementation of https://reviews.llvm.org/rL342148, which later got
reverted in r342336 (b09a8c9bd9).
With ccache and ninja on a large core machine (40), I haven't run into
build errors, so I'm hopeful it's better now, though it doesn't seem to
be any different / new.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D90970
This ensures that we'll match immediates consistently regardless
of whether we match them as a standalone splat or as part of
another operation.
While I was there I added complexities to the simm5/uimm5 patterns so
we didn't have to assume that the 1 on the non-immediate was lower
than what tablegen inferred.
I had to make a minor tweak to tablegen to fix one place that
didn't expect to see a ComplexPattern that wasn't a "leaf".
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D96199
This patch is a follow up to D94821 to ensure the correct behavior of the
general directive structure checker.
This patch add the generation of the Enter function declaration for clauses in
the TableGen backend.
This helps to ensure each clauses declared in the TableGen file has at least
a basic check.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D95108
This primarily occurs with isel patterns using vnot. This reduces
the number of variants in the isel tables.
We generally canonicalize build_vectors of constants to the RHS. I think
we might fail if there is a bitcast on the build_vector, but that
should be easy to fix if we can find a case. Usually the
bitcast is introduced by type legalization or lowering. It's
likely canonicalization would have already occured.
We already used emplace_back in at least one other place so be
consistent.
AddPatternToMatch already took PTM as an rvalue reference, but
we need to use std::move again to move it into the PatternToMatch
vector.
Use vector::swap instead of copying to a local vector and clearing
the original. We can just swap into the just created local vector
instead which will move the pointers and not the data.
Use std::move in another place to avoid a copy.
This patch allows targets to define multiple cost
values for each register so that the cost model
can be more flexible and better used during the
register allocation as per the target requirements.
For AMDGPU the VGPR allocation will be more efficient
if the register cost can be associated dynamically
based on the calling convention.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D86836
We need at least 252 UniqAttributes now, which will soon overflow.
Actually with downstream backends we can easily use up the last few values.
So bump to uint16_t.
There can be muliple patterns that map to the same compressed
instruction. Reversing those leads to multiple ways to uncompress
an instruction, but its not easily controllable which one will
be chosen by the tablegen backend.
This patch adds a flag to mark patterns that should only be used
for compressing. This allows us to leave one canonical pattern
for uncompressing.
The obvious benefit of this is getting c.mv to uncompress to
the addi patern that is aliased to the mv pseudoinstruction. For
the add/and/or/xor/li patterns it just removes some unreachable
code from the generated code.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94894
The TableGen emitter for directives has two slots for flangClass information and this was mainly
to be able to keep up with the legacy openmp parser at the time. Now that all clauses are encapsulated in
AccClause or OmpClause, these two strings are not necessary anymore and were the the source of couple
of problem while working with the generic structure checker for OpenMP.
This patch remove the flangClassValue string from DirectiveBase.td and use the string flangClass as the
placeholder for the encapsulated class.
Reviewed By: sameeranjoshi
Differential Revision: https://reviews.llvm.org/D94821
When we looked up the map to see if the entry already existed,
this created the new entry for us. So save a reference to it so
we can use it to update the entry instead of looking it up again.
Also remove unnecessary StringRef constructors around string
literals on calls to this function.
Instead forming a std::string and returning it to pass into another
raw_ostream, just pass the raw_ostream as a parameter.
Take StringRef as arguments instead raw_string_ostream references
making the caller responsible for converting to strings. Use
StringRef operations instead of std::string::substr.a
Stop concatenating std::string before streaming into a raw_ostream.
Just stream the pieces.
Remove some new lines from asserts. Remove std::string concatenation
from an assert. assert strings aren't really evaluated like this at
runtime. An assertion failure will just print exactly what's between
the parentheses in the source.
Before this patch there was generic mapping from vector_extract
to G_EXTRACT_VECTOR_ELT added in SelectionDAGCompat.td. That
mapping is now replaced by a mapping from extractelt instead.
The reasoning is that vector_extract is marked as deprecated,
so it is assumed that a majority of targets will use extractelt
and not vector_extract (and that the long term solution for all
targets would be to use extractelt).
Targets like AArch64 that still use vector_extract can add an
additional mapping from the deprecated vector_extract as target
specific tablegen definitions. Such a mapping is added for AArch64
in this patch to avoid breaking tests.
When adding the extractelt => G_EXTRACT_VECTOR_ELT mapping we
triggered some new code paths in GlobalISelEmitter, ending up in
an assert when trying to import a pattern containing EXTRACT_SUBREG
for ARM. Therefore this patch also adds a "failedImport" warning
for that situation (instead of hitting the assert).
Differential Revision: https://reviews.llvm.org/D93416
This reverts commit 8e3e148c
This commit fixes two issues with the original patch:
* The sanitizer build bot reported an uninitialized value. This was caused by normalizeStringIntegral not returning None on failure.
* Some build bots complained about inaccessible keypaths. To mitigate that, "this->" was added back to the keypath to restore the previous behavior.
The TableGen immAllOnesV and immAllZerosV helpers implicitly wrapped the
ISD::isBuildVectorAll(Ones|Zeros) helper functions. This was inhibiting
their use for targets such as RISC-V which use ISD::SPLAT_VECTOR. In
particular, RISC-V had to define its own 'vnot' fragment.
In order to extend the scope of these nodes to include support for
ISD::SPLAT_VECTOR, two new ISD predicate functions have been introduced:
ISD::isConstantSplatVectorAll(Ones|Zeros). These effectively supersede
the older "isBuildVector" predicates, which are now simple wrappers for
the new functions. They pass a defaulted boolean toggle which preserves
the old behaviour. It is hoped that in time all call-sites can be ported
to the "isConstantSplatVector" functions.
While the use of ISD::isBuildVectorAll(Ones|Zeros) has not changed, the
behaviour of the TableGen immAll(Ones|Zeros)V **has**. To test the new
functionality, the custom RISC-V TableGen fragment has been removed and
replaced with the built-in 'vnot'. To test their use as pattern-roots, two
splat patterns have been updated accordingly.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94223
This removes `exnref` type and `br_on_exn` instruction. This is
effectively NFC because most uses of these were already removed in the
previous CLs.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D94041