Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common atomic predicates related to
memory type into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
llvm-svn: 318095
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
(sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>>
to:
(sext:i32 (load:i16 $ptr)<<unindexedload>>)
I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.
Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happening. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.
llvm-svn: 317971
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
Depends on D36618
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37443
Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().
llvm-svn: 315841
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
Summary:
The purpose of this patch is to expose more information about ImmLeaf-like
PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf
could only be used to test int64_t's produced by sign-extending an APInt.
Other tests on immediates had to use the generic PatLeaf and extract the
constant using C++.
With this patch, tablegen will know how to generate predicates for APInt
and APFloat. This will allow it to 'do the right thing' for both SelectionDAG
and GlobalISel which require different methods of extracting the immediate
from the IR.
This is NFC for SelectionDAG since the new code is equivalent to the
previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1
for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new
subclasses will require a significant re-factor of FastISel.
For GlobalISel, it's currently NFC because the relevant code to import the
affected rules is not yet present. This will be added in a later patch.
Depends on D36086
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36534
llvm-svn: 315747
I'm about to commit a patch that makes them necessary for getPredCode() and
it would be strange for getPredCode() and getImmCode() to require different
usage.
llvm-svn: 315733
The assertion tests were using count() instead of testing the find result, resulting in double the number of searches in debug/assert builds.
Instead, call find once (like the release builds do) and assert the result against end().
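A minimal sketch of the difference, assuming a hypothetical std::map-based table (the names Table, lookupTwice and lookupOnce are illustrative and not taken from the patch):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical lookup table standing in for the maps touched by the patch.
using Table = std::map<std::string, int>;

// Before: in assert builds count() performs one search and find() performs
// a second one.
int lookupTwice(const Table &T, const std::string &Key) {
  assert(T.count(Key) && "key not found");
  return T.find(Key)->second;
}

// After: a single find(); the assertion checks the iterator against end().
int lookupOnce(const Table &T, const std::string &Key) {
  auto It = T.find(Key);
  assert(It != T.end() && "key not found");
  return It->second;
}
```

Both functions return the same value; only the number of searches in assert-enabled builds differs.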
llvm-svn: 315151
Avoid unnecessary std::string creations in the TreePredicateFn getters and in CodeGenDAGPatterns::getSDNodeNamed
Differential Revision: https://reviews.llvm.org/D38624
llvm-svn: 315148
Also add operator<< for use with raw_ostream to InfoByHwMode and its
derived classes.
Recommitting r313989 with the fix for unresolved references: explicitly
define the operator<< in namespace llvm.
llvm-svn: 314004
This changes some STL data types to corresponding LLVM
data types that have better performance characteristics.
Differential Revision: https://reviews.llvm.org/D37957
llvm-svn: 313783
Add some member types to MachineValueTypeSet::const_iterator so that
iterator_traits can work with it.
Improve TableGen performance of -gen-dag-isel (motivated by X86 backend)
The introduction of parameterized register classes in r313271 caused the
matcher generation code in TableGen to run much slower, particularly so
in the unoptimized (debug) build. This patch recovers some of the lost
performance.
Summary of changes:
- Cache the set of legal types in TypeInfer::getLegalTypes. The contents
of this set do not change.
- Add LLVM_ATTRIBUTE_ALWAYS_INLINE to several small functions. Normally
this would not be necessary, but in the debug build TableGen is not
optimized, so this helps a little bit.
- Add an early exit from TypeSetByHwMode::operator== for the case when
one or both arguments are "simple", i.e. only have one mode. This
saves some time in GenerateVariants.
- Finally, replace the underlying storage type in TypeSetByHwMode::SetType
with MachineValueTypeSet based on std::array instead of std::set.
This significantly reduces the number of memory allocation calls.
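The last item can be illustrated with a rough sketch of a set stored in a fixed std::array of bit words. This is not the actual MachineValueTypeSet; the class name SmallTypeSet and the capacity of 256 type ids are assumptions made for the example:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a std::array-backed type set: membership for up
// to 256 type ids is kept as bits in a fixed array, so insert/erase/count
// never allocate (unlike a node-based std::set).
class SmallTypeSet {
  static constexpr unsigned Capacity = 256;
  std::array<uint64_t, Capacity / 64> Words{};

public:
  void insert(unsigned Ty) {
    assert(Ty < Capacity && "type id out of range");
    Words[Ty / 64] |= uint64_t(1) << (Ty % 64);
  }
  void erase(unsigned Ty) {
    assert(Ty < Capacity && "type id out of range");
    Words[Ty / 64] &= ~(uint64_t(1) << (Ty % 64));
  }
  bool count(unsigned Ty) const {
    assert(Ty < Capacity && "type id out of range");
    return (Words[Ty / 64] >> (Ty % 64)) & 1;
  }
  // Equality is a plain comparison of the underlying words.
  bool operator==(const SmallTypeSet &RHS) const { return Words == RHS.Words; }
};
```

Because the storage is a fixed-size array, copying and comparing sets touches a few machine words instead of walking heap-allocated tree nodes.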
I've done a number of experiments with the underlying type of InfoByHwMode.
The type is a map, and for targets that do not use the parameterization,
this map has only one entry. The best (unoptimized) performance, somewhat
surprisingly came from std::map, followed closely by std::unordered_map.
DenseMap was the slowest by a large margin.
Various hand-crafted solutions (emulating enough of the map interface
not to make sweeping changes to the users) did not yield any observable
improvements.
llvm-svn: 313660
The introduction of parameterized register classes in r313271 caused the
matcher generation code in TableGen to run much slower, particularly so
in the unoptimized (debug) build. This patch recovers some of the lost
performance.
Summary of changes:
- Cache the set of legal types in TypeInfer::getLegalTypes. The contents
of this set do not change.
- Add LLVM_ATTRIBUTE_ALWAYS_INLINE to several small functions. Normally
this would not be necessary, but in the debug build TableGen is not
optimized, so this helps a little bit.
- Add an early exit from TypeSetByHwMode::operator== for the case when
one or both arguments are "simple", i.e. only have one mode. This
saves some time in GenerateVariants.
- Finally, replace the underlying storage type in TypeSetByHwMode::SetType
with MachineValueTypeSet based on std::array instead of std::set.
This significantly reduces the number of memory allocation calls.
I've done a number of experiments with the underlying type of InfoByHwMode.
The type is a map, and for targets that do not use the parameterization,
this map has only one entry. The best (unoptimized) performance, somewhat
surprisingly came from std::map, followed closely by std::unordered_map.
DenseMap was the slowest by a large margin.
Various hand-crafted solutions (emulating enough of the map interface
not to make sweeping changes to the users) did not yield any observable
improvements.
llvm-svn: 313647
This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.
This affects the way that types and type sets are printed, and the
tests relying on that have been updated.
There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed has changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)
For more information, please refer to the review page.
Differential Revision: https://reviews.llvm.org/D31951
llvm-svn: 313271
Summary:
This patch does a few things that should remove some copies around PatternsToMatch. These were noticed while reviewing code for D34341.
Change constructor to take Dstregs by value and move it into the class. Change one of the callers to add std::move to the argument so that it gets moved.
Make AddPatternToMatch take PatternToMatch by rvalue reference so we can move it into the PatternsToMatch vector. I believe we should have an implicit default move constructor available on PatternToMatch. I chose rvalue reference because both callers call it with temporaries already.
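The two idioms described above can be sketched as follows. Pattern and PatternList are hypothetical stand-ins for PatternToMatch and its owner, not the real classes:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for PatternToMatch.
struct Pattern {
  std::vector<std::string> DstRegs;
  // Take the vector by value and move it into the member: a caller passing
  // an rvalue pays for moves only, a caller passing an lvalue pays one copy.
  explicit Pattern(std::vector<std::string> Regs) : DstRegs(std::move(Regs)) {}
};

// Hypothetical stand-in for the class that owns the PatternsToMatch vector.
struct PatternList {
  std::vector<Pattern> Patterns;
  // Accept only rvalues, then move the argument into the vector.
  void add(Pattern &&P) { Patterns.push_back(std::move(P)); }
};
```

A caller would write `List.add(Pattern(std::move(Regs)));`, so the register list is never copied on its way into the vector.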
Reviewers: RKSimon, aymanmus, spatel
Reviewed By: aymanmus
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34411
llvm-svn: 306251
Currently we don't enforce that ISD::ANY_EXTEND, ZERO_EXTEND, SIGN_EXTEND, TRUNC, FP_ROUND, FP_EXTEND have the same number of elements (including scalar) between their input and output, though we have them documented as such. Up until a few months ago x86 created nodes that violated this rule. That's all been fixed now, and we should enforce the rule going forward.
In order to do this we need to allow SDTCisSameNumEltsAs to support scalar types and not enforce being a vector. If one type is scalar we will force the other type to also be scalar.
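As a rough illustration of the relaxed rule (a check on two already-known types, not the TableGen type-inference code itself), assuming a hypothetical SimpleType that only records whether a type is a vector and how many elements it has:

```cpp
#include <cassert>

// Hypothetical, simplified value type: scalars are modelled as non-vectors
// whose element count is ignored.
struct SimpleType {
  bool IsVector;
  unsigned NumElts; // meaningful only when IsVector is true
};

// Returns true when the two types satisfy a "same number of elements"
// constraint that also accepts scalars: a scalar only matches a scalar,
// and two vectors match when their element counts are equal.
bool hasSameNumElts(const SimpleType &A, const SimpleType &B) {
  if (A.IsVector != B.IsVector)
    return false;
  return !A.IsVector || A.NumElts == B.NumElts;
}
```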
Differential Revision: https://reviews.llvm.org/D30878
llvm-svn: 297648
This adds a basic tablegen backend that analyzes the SelectionDAG
patterns to find simple ones that are eligible for GlobalISel-emission.
That's similar to FastISel, with one notable difference: we're not fed
ISD opcodes, so we need to map the SDNode operators to generic opcodes.
That's done using GINodeEquiv in TargetGlobalISel.td.
Otherwise, this is mostly boilerplate, and lots of filtering of any kind
of "complicated" pattern. On AArch64, this is sufficient to match G_ADD
up to s64 (to ADDWrr/ADDXrr) and G_BR (to B).
Differential Revision: https://reviews.llvm.org/D26878
llvm-svn: 290284
This splits out the intrinsic table such that generic intrinsics come
first and target specific intrinsics are grouped by target. From here
we can find out which target an intrinsic is for or differentiate
between generic and target intrinsics.
The motivation here is to make it easier to move target specific
intrinsic handling out of generic code.
llvm-svn: 275575
Summary: This fixes a variety of typos in docs, code and headers.
Subscribers: jholewinski, sanjoy, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12626
llvm-svn: 247495
This reverts commit r222183.
Broke on the MSVC buildbots due to MSVC not producing default move
operations - I'd fix it immediately but just broke my build system a
bit, so backing out until I have a chance to get everything going again.
llvm-svn: 222187
The next step is to actually use unique_ptr in TreePatternNode's
Children vector. That will be more intrusive, and may not work,
depending on exactly how these things are handled (I have a bad
suspicion things are shared more than they should be, making this more
DAG than tree - but if it's really a tree, unique_ptr should suffice)
llvm-svn: 222183
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
This is useful for cases when stand-alone patterns are preferred to the
patterns included in the instruction definitions. Instead of requiring
that stand-alone patterns set a larger AddedComplexity value, which
can be confusing to new developers, this allows us to reduce the
complexity of the included patterns to achieve the same result.
There will be test cases for this added to the R600 backend in a
future commit.
llvm-svn: 214466
file not in the test/ area). Backing out now so that this test isn't part of
the 3.5 branch.
Original commit message: "TableGen: Allow AddedComplexity values to be negative
[...]"
llvm-svn: 213596
This is useful for cases when stand-alone patterns are preferred to the
patterns included in the instruction definitions. Instead of requiring
that stand-alone patterns set a larger AddedComplexity value, which
can be confusing to new developers, this allows us to reduce the
complexity of the included patterns to achieve the same result.
llvm-svn: 213521
This allows the results of a ComplexPattern check to be distributed to separate
named Operands, instead of the current system where all results must apply (and
match perfectly) with a single Operand.
For example, if "some_addrmode" is a ComplexPattern producing two results, you
can write:
def : Pat<(load (some_addrmode GPR64:$base, imm:$offset)),
(INST GPR64:$base, imm:$offset)>;
This should allow neater instruction definitions in TableGen that don't put all
possible aspects of addressing into a single operand, but are still usable with
relatively simple C++ CodeGen idioms.
llvm-svn: 209206
Unfortunately, it is currently impossible to use a PatFrag as part of an output
pattern (the part of the pattern that has instructions in it) in TableGen.
Looking at the current implementation, this was clearly intended to work (there
is already code in place to expand patterns in the output DAG), but is
currently broken by the baked-in type-checking assumption and the order in which
the pattern fragments are processed (output pattern fragments need to be
processed after the instruction definitions are processed).
Fixing this is fairly simple, but requires some way of differentiating output
patterns from the existing input patterns. The simplest way to handle this
seems to be to create a subclass of PatFrag, and so that's what I've done here.
As a simple example, this allows us to write:
def crnot : OutPatFrag<(ops node:$in),
(CRNOR $in, $in)>;
def : Pat<(not i1:$in),
(crnot $in)>;
which captures the core use case: handling of repeated subexpressions inside
of complicated output patterns.
This will be used by an upcoming commit to the PowerPC backend.
llvm-svn: 202450
A register class can appear as a leaf TreePatternNode with and without a
name:
(COPY_TO_REGCLASS GPR:$src, F8RC)
In a named leaf node like GPR:$src, the register class provides type
information for the named variable represented by the node. The TypeSet
for such a node is the set of value types that the register class can
represent.
In an unnamed leaf node like F8RC above, the register class represents
itself as a kind of immediate. Such a node has the type MVT::i32, and
we'll never create a virtual register representing it.
This change makes it possible to remove the special handling of
COPY_TO_REGCLASS in CodeGenDAGPatterns.cpp.
llvm-svn: 177825
Most places can use PrintFatalError as the unwinding mechanism was not
used for anything other than printing the error. The single exception
was CodeGenDAGPatterns.cpp, where intermediate errors during type
resolution were ignored to simplify incremental platform development.
This use is replaced by an error flag in TreePattern and bailout earlier
in various places if it is set.
llvm-svn: 166712
This is a generally useful utility; there's no reason to have it hidden
in CodeGenDAGPatterns.cpp.
Also, rename it to fit the other comparators in Record.h
Review by Jakob.
llvm-svn: 164189
Manage Inits in a FoldingSet. This provides several benefits:
- Memory for Inits is properly managed
- Duplicate Inits are folded into Flyweights, saving memory
- It enforces const-correctness, protecting against certain classes
of bugs
The above benefits allow Inits to be used in more contexts, which in
turn provides more dynamism to TableGen. This enhanced capability
will be used by the AVX code generator to fold common patterns
together.
llvm-svn: 134907
value constraints on them (when defined as ImmLeaf's). This is particularly important
for X86-64, where almost all reg/imm instructions take a i64immSExt32 immediate operand,
which has a value constraint. Before this patch we ended up iseling the examples into
such amazing code as:
movabsq $7, %rax
imulq %rax, %rdi
movq %rdi, %rax
ret
now we produce:
imulq $7, %rdi, %rax
ret
This dramatically shrinks the generated code at -O0 on x86-64.
llvm-svn: 129691
kind of predicate: one that is specific to imm nodes. The predicate function
specified here just checks an int64_t directly instead of messing around with
SDNode's. The virtue of this is that it means that fastisel and other things
can reason about these predicates.
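For illustration only, a predicate of this kind reduces to a plain function of an int64_t. The sketch below assumes a hypothetical check for "fits in a sign-extended 32-bit immediate"; the function name is not taken from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical immediate-only predicate: true when the 64-bit value is
// representable as a sign-extended 32-bit immediate. It inspects the
// integer directly and never looks at an SDNode.
bool fitsInSExt32(int64_t Imm) {
  return Imm >= INT32_MIN && Imm <= INT32_MAX;
}
```

Since the predicate depends only on the integer value, any selector that can produce an int64_t for the operand can evaluate it.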
llvm-svn: 129675
structure and fix some fixmes. We now have a TreePredicateFn class
that handles all of the decoding of these things. This is an internal
cleanup that has no impact on the code generated by tblgen.
llvm-svn: 129670
This will be used to check patterns referencing a forthcoming
INSERT_SUBVECTOR SDNode. INSERT_SUBVECTOR in turn is very useful for
matching to VINSERTF128 instructions and complements the already
existing EXTRACT_SUBVECTOR SDNode.
llvm-svn: 124145
to maintain a list of types (one for each result of
the node) instead of a single type. There are liberal
hacks added to emulate the old behavior in various
situations, but they can start dissolving now.
llvm-svn: 98999
record* -> instrinfo instead of std::string -> instrinfo.
This speeds up tblgen on cellcpu from 7.28 -> 5.98s with a debug
build (20%).
llvm-svn: 98916
like this:
def : Pat<(add ...),
(FOOINST)>;
when FOOINST only has a single implicit def (e.g. to R1). This will be handled
as if written as (set R1, (FOOINST ...)).
llvm-svn: 98897
changing the primary datastructure from being a
"std::vector<unsigned char>" to being a new TypeSet class
that actually has (gasp) invariants!
This changes more things than I remember, but one major
innovation here is that it enforces that named input
values agree in type with their output values.
This also eliminates code that transparently assumes (in
some cases) that SDNodeXForm input/output types are the
same, because this is wrong in many cases.
This also eliminates a bug which caused a lot of ambiguous
patterns to go undetected, where a register class would
sometimes pick the first possible type, causing an
ambiguous pattern to get arbitrary results.
With all the recent target changes, this causes no
functionality change!
llvm-svn: 98534
ordered correctly. Previously it would get in trouble when
two patterns were too similar and give them nondeterministic ordering.
We force this by using the record ID order as a fallback.
The testsuite diff is due to alpha patterns being ordered
slightly differently, the change is a semantic noop afaict:
< lda $0,-100($16)
---
> subq $16,100,$0
llvm-svn: 97509
node is always guaranteed to have a particular type
instead of hacking in ISD::STORE explicitly. This allows
us to use implied types for a broad range of nodes, even
target specific ones.
llvm-svn: 97355
inferencing. As far as I can tell, these are equivalent to the existing
MVT::fAny, iAny and vAny types, and having both of them makes it harder
to reason about and modify the type inferencing code.
The specific problem in PR4795 occurs when updating a vAny type to be fAny
or iAny, or vice versa. Both iAny and fAny include vector types -- they
intersect with the set of types represented by vAny. When merging them,
choose fAny/iAny to represent the intersection. This is not perfect, since
fAny/iAny also include scalar types, but it is good enough for TableGen's
type inferencing.
llvm-svn: 80423
- This manifested as non-determinism in the .inc output in rare cases (when two
distinct patterns ended up being equivalent, which is rather rare). That
meant the pattern matching was non-deterministic, which could eventually mean
the code generator selected different instructions based on the arch.
- It's probably worth making the DAGISel ensure a total ordering (or force the
user to), but the simple fix here is to totally order the Record* maps based
on a unique ID.
- PR4672, PR4711.
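The simple fix can be sketched as a comparator that orders pointers by a stable ID instead of by address. The Record struct and the names LessByID and namesInOrder are simplified assumptions for the example:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical minimal Record: only a unique, stable ID and a name.
struct Record {
  unsigned ID;
  std::string Name;
};

// Order Record pointers by ID rather than by address, so iteration order
// does not depend on where the allocator happened to place each object.
struct LessByID {
  bool operator()(const Record *LHS, const Record *RHS) const {
    return LHS->ID < RHS->ID;
  }
};

using RecordMap = std::map<const Record *, int, LessByID>;

// Collect the record names in map iteration order.
std::vector<std::string> namesInOrder(const RecordMap &M) {
  std::vector<std::string> Names;
  for (const auto &Entry : M)
    Names.push_back(Entry.first->Name);
  return Names;
}
```

With this comparator the iteration order is the same on every host, whatever addresses the records receive.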
Yay:
--
ddunbar@giles:~$ cat ~/llvm.obj.64/lib/Target/*/*.inc | shasum
d1099ff34b21459a5a3e7021c225c080e6017ece -
ddunbar@giles:~$ cat ~/llvm.obj.ppc/lib/Target/*/*.inc | shasum
d1099ff34b21459a5a3e7021c225c080e6017ece -
--
llvm-svn: 79846
There have been a few times where I've wanted this but ended up leaving the
operand type unconstrained. It is easy to add this now and should help
catch errors in the future.
llvm-svn: 78849
PR2957
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask. A value of -1 represents UNDEF.
In addition to eliminating the creation of illegal BUILD_VECTORS just to
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.
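A small sketch of how such an integer mask can be interpreted. It assumes that mask indices select from the concatenation of the two input vectors; applyShuffle and its UndefValue placeholder are illustrative and not part of the SelectionDAG API:

```cpp
#include <cassert>
#include <vector>

// Apply a shuffle mask to the concatenation of two equally sized inputs.
// Mask element Idx selects V1[Idx] for Idx < N and V2[Idx - N] otherwise;
// -1 marks an undefined lane, filled here with the caller's placeholder.
std::vector<int> applyShuffle(const std::vector<int> &V1,
                              const std::vector<int> &V2,
                              const std::vector<int> &Mask, int UndefValue) {
  assert(V1.size() == V2.size() && "inputs must have the same length");
  const int N = static_cast<int>(V1.size());
  std::vector<int> Result;
  Result.reserve(Mask.size());
  for (int Idx : Mask) {
    assert(Idx >= -1 && Idx < 2 * N && "mask index out of range");
    if (Idx < 0)
      Result.push_back(UndefValue);
    else if (Idx < N)
      Result.push_back(V1[Idx]);
    else
      Result.push_back(V2[Idx - N]);
  }
  return Result;
}
```

For example, with V1 = {10, 11, 12, 13} and V2 = {20, 21, 22, 23}, the mask {0, 4, -1, 7} picks lane 0 of V1, lane 0 of V2, an undefined lane, and lane 3 of V2.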
llvm-svn: 70225
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask. A value of -1 represents UNDEF.
In addition to eliminating the creation of illegal BUILD_VECTORS just to
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.
A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.
llvm-svn: 69952