Summary:
Add an AdditionalEncoding class which can be used to define additional encodings
for a given instruction. This causes the disassembler to add an additional
encoding to its matching tables that map to the specified instruction.
Usage:
def ADD1 : Instruction {
bits<8> Reg;
bits<32> Inst;
let Size = 4;
let Inst{0-7} = Reg;
let Inst{8-14} = 0;
let Inst{15} = 1; // Continuation bit
let Inst{16-31} = 0;
...
}
def : AdditionalEncoding<ADD1> {
bits<8> Reg;
bits<16> Inst; // You can also have bits<32> and it will still be a 16-bit encoding
let Size = 2;
let Inst{0-3} = 0;
let Inst{4-7} = Reg;
let Inst{8-15} = 0;
...
}
with those definitions, llvm-mc will successfully disassemble both of these:
0x01 0x00
0x10 0x80 0x00 0x00
to:
ADD1 r1
Depends on D52366
Reviewers: bogner, charukcs
Reviewed By: bogner
Subscribers: nlguillemot, nhaehnle, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D52369
llvm-svn: 363744
Merging the two bits shrinks the context table from 16384 bytes to 8192 bytes.
Remove the ATTRIBUTE_BITS macro and just create an enum directly. Then fix the ATTR_max define to be 8192 to reflect the table size so we stop hardcoding it separately.
llvm-svn: 363330
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.
For example:
; The LLVM LangRef specifies that the scalar result must equal the
; vector element type, but this is not checked/enforced by LLVM.
declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)
This patch changes that into:
declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)
Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.
Reviewers: RKSimon, arsenm, rnk, greened, aemerson
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D62996
llvm-svn: 363240
Extend the mechanism to overload intrinsic arguments by using either
backward or forward references to the overloadable arguments.
In for example:
def int_something : Intrinsic<[LLVMPointerToElt<0>],
[llvm_anyvector_ty], []>;
LLVMPointerToElt<0> is a forward reference to the overloadable operand
of type 'llvm_anyvector_ty' and would allow intrinsics such as:
declare i32* @llvm.something.v4i32(<4 x i32>);
declare i64* @llvm.something.v2i64(<2 x i64>);
where the result pointer type is deduced from the element type of the
first argument.
If the returned pointer is not a pointer to the element type, LLVM will
give an error:
Intrinsic has incorrect return type!
i64* (<4 x i32>)* @llvm.something.v4i32
Reviewers: RKSimon, arsenm, rnk, greened
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D62995
llvm-svn: 363233
Summary:
Use a set in getReqFeatures() in RISCVCompressInstEmitter instead of a map
because the index we save is not needed.
This also fixes bug 41666.
Reviewers: llvm-commits, apazos, asb, nickdesaulniers
Reviewed By: asb
Subscribers: Jim, nickdesaulniers, rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61412
llvm-svn: 362968
The ISD::STRICT_ nodes used to implement the constrained floating-point
intrinsics are currently never passed to the target back-end, which makes
it impossible to handle them correctly (e.g. mark instructions are depending
on a floating-point status and control register, or mark instructions as
possibly trapping).
This patch allows the target to use setOperationAction to switch the action
on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code
will stop converting the STRICT nodes to regular floating-point nodes, but
instead pass the STRICT nodes to the target using normal SelectionDAG
matching rules.
To avoid having the back-end duplicate all the floating-point instruction
patterns to handle both strict and non-strict variants, we make the MI
codegen explicitly aware of the floating-point exceptions by introducing
two new concepts:
- A new MCID flag "mayRaiseFPException" that the target should set on any
instruction that possibly can raise FP exception according to the
architecture definition.
- A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI
instruction resulting from expansion of any constrained FP intrinsic.
Any MI instruction that is *both* marked as mayRaiseFPException *and*
FPExcept then needs to be considered as raising exceptions by MI-level
codegen (e.g. scheduling).
Setting those two new flags is straightforward. The mayRaiseFPException
flag is simply set via TableGen by marking all relevant instruction
patterns in the .td files.
The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes
in the SelectionDAG, and gets inherited in the MachineSDNode nodes created
from it during instruction selection. The flag is then transfered to an
MIFlag when creating the MI from the MachineSDNode. This is handled just
like fast-math flags like no-nans are handled today.
This patch includes both common code changes required to implement the
new features, and the SystemZ implementation.
Reviewed By: andrew.w.kaylor
Differential Revision: https://reviews.llvm.org/D55506
llvm-svn: 362663
A std::array is implemented as a template with an array inside a struct.
Older versions of clang, like 3.6, require an extra set of curly braces
around std::array initializations to avoid warnings.
The C++ language was changed regarding this by CWG 1270. So more modern
tool chains does not complain even if leaving out one level of braces.
llvm-svn: 362360
Fix the misleadingly indentation introduced in rL362064. This will get rid of
the compiler warning, and it was actually a bug. This change will be used and
tested in D62669.
llvm-svn: 362211
If an assembly instruction has to mention an input operand name twice,
for example the MVE VMOV instruction that accesses two lanes of the
same vector by writing 'vmov r1, r2, q0[3], q0[1]', then the obvious
way to write its AsmString is to include the same operand (here $Qd)
twice. But this causes the AsmMatcher generator to omit that
instruction completely from the match table, on the basis that the
generator isn't clever enough to deal with the duplication.
But you need to have _some_ way of dealing with an instruction like
this - and in this case, where the mnemonic is shared with many other
instructions that the AsmMatcher does handle, it would be very painful
to take it out of the AsmMatcher system completely.
A nicer way is to add a custom AsmMatchConverter routine, and let that
deal with the problem if the autogenerated converter can't. But that
doesn't work, because TableGen leaves the instruction out of its table
_even_ if you provide a custom converter.
Solution: this change, which makes TableGen relax the restriction on
duplicated operands in the case where there's a custom converter.
Patch by: Simon Tatham
Differential Revision: https://reviews.llvm.org/D60695
llvm-svn: 362066
This is a new special identifier which you can use as a default in
OperandWithDefaultOps. The idea is that you use it for an input
operand of an instruction that's tied to an output operand, and its
semantics are that (in the default case) the input operand's value is
not used at all.
The detailed effect is that when instruction selection emits the
instruction in the form of a pre-regalloc MachineInstr, it creates an
IMPLICIT_DEF node to use as that input.
If you're creating an MCInst with explicit register names, then the
right handling would be to set the input operand to the same register
as the output one (honouring the tie) and to add the 'undef' flag
indicating that that register is deemed to acquire a new don't-care
definition just before we read it. But I haven't done that in this
commit, because there was no need to - no Tablegen backend seems to
autogenerate default fields in an MCInst.
Patch by: Simon Tatham
Differential Revision: https://reviews.llvm.org/D60696
llvm-svn: 362064
Summary:
This reverts commit r360106.
The revisioin causes llvm-tblgen to hang while generating info for
RISCV.td. The root cause might be in the RISCV.td definition but I don't
know enough about this to investigate further.
Command that starts hangning after r360106:
`llvm-build/bin/llvm-tblgen -I llvm/include -I llvm/tools/clang/include -I llvm/lib/Target/RISCV -gen-instr-info llvm/lib/Target/RISCV/RISCV.td`
Reviewers: sammccall, yan_luo, craig.topper, gribozavr
Reviewed By: gribozavr
Subscribers: PkmX, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61632
llvm-svn: 360136
If you have more than one schedule model in your TableGen target
definitions, then the diagnostic "No schedule information for
instruction 'foo'" is rather unhelpful, because it doesn't tell you
_which_ schedule model is missing the necessary information (or, as it
might be, missing the UnsupportedFeatures definition that would stop
it thinking it needed it).
Extended the message to include the name of the schedule model that
it's complaining about.
Reviewers: nhaehnle, hfinkel, javedabsar, efriedma, javed.absar
Reviewed By: javed.absar
Subscribers: javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60559
llvm-svn: 358389
The composite existed to simplify some other tablegen code and not really in an
important way. Remove the combined field and just calculate the vector size
using two ifs.
llvm-svn: 357972
The instruction's document this as W0 for the VEX encoding. But there's a
footnote mentioning that VEX.W is ignored in 64-bit mode. And the main VEX
encoding description says the VEX.W bit is ignored for instructions that are
equivalent to a legacy SSE instruction that uses REX.W to select a GPR which
would apply here.
By making this match EVEX we can remove a special case of allowing EVEX2VEX to
turn an EVEX.WIG instruction into VEX.W0.
llvm-svn: 357971
Summary:
This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes.
Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon
Reviewed By: RKSimon
Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60228
llvm-svn: 357802
Summary:
This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes.
Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri
Reviewed By: andreadb
Subscribers: hiraditya, lebedev.ri, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60138
llvm-svn: 357801
Summary:
Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models.
This avoids needing an isel pattern for each condition code. And it removes
translation switches for converting between CMOV instructions and condition
codes.
Now the printer, encoder and disassembler take care of converting the immediate.
We use InstAliases to handle the assembly matching. But we print using the
asm string in the instruction definition. The instruction itself is marked
IsCodeGenOnly=1 to hide it from the assembly parser.
This does complicate the scheduler models a little since we can't assign the
A and BE instructions to a separate class now.
I plan to make similar changes for SETcc and Jcc.
Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet
Reviewed By: RKSimon
Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60041
llvm-svn: 357800
We were using the number of Matchables rather than the number of rows in the converter table.
This only matters for a few of the targets where the number of matchables is more than 255, but the number of converters is less than 255. Many of the targets have more than 256 converters. So already required a uint16_t.
llvm-svn: 357527
Summary:
It does not currently make sense to use WebAssembly features in some functions
but not others, so this CL adds an IR pass that takes the union of all used
feature sets and applies it to each function in the module. This allows us to
prevent atomics from being lowered away if some function has opted in to using
them. When atomics is not enabled anywhere, we detect whether there exists any
atomic operations or thread local storage that would be stripped and disallow
linking with objects that contain atomics if and only if atomics or tls are
stripped. When atomics is enabled, mark it as used but do not require it of
other objects in the link. These changes allow libraries that do not use atomics
to be built once and linked into both single-threaded and multithreaded
binaries.
Reviewers: aheejin, sbc100, dschuff
Subscribers: jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59625
llvm-svn: 357226
These were added in r355423.
We only use the autogenerated table to assist with the maintenance of the
manual table. These entries are alreayd in the manual table.
llvm-svn: 356357
AMDGPU would like to use these MVTs.
Differential Revision: https://reviews.llvm.org/D58901
Change-Id: I6125fea810d7cc62a4b4de3d9904255a1233ae4e
llvm-svn: 356351
Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible.
The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b".
The _alt form on the other hand parsed and printed like any other instruction with no specialness.
With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax.
I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off.
I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them.
Differential Revision: https://reviews.llvm.org/D59398
llvm-svn: 356343
This indicates an intrinsic parameter is required to be a constant,
and should not be replaced with a non-constant value.
Add the attribute to all AMDGPU and generic intrinsics that comments
indicate it should apply to. I scanned other target intrinsics, but I
don't see any obvious comments indicating which arguments are intended
to be only immediates.
This breaks one questionable testcase for the autoupgrade. I'm unclear
on whether the autoupgrade is supposed to really handle declarations
which were never valid. The verifier fails because the attributes now
refer to a parameter past the end of the argument list.
llvm-svn: 355981
AMDGPU target run out of Subtarget feature flags hitting the limit of 64.
AssemblerPredicates uses at most uint64_t for their representation.
At the same time CodeGen has exhausted this a long time ago and switched
to a FeatureBitset with the current limit of 192 bits.
This patch completes transition to the bitset for feature bits extending
it to asm matcher and MC code emitter.
Differential Revision: https://reviews.llvm.org/D59002
llvm-svn: 355839
Includes a fix to emit a CheckOpcode for build_vector when immAllZerosV/immAllOnesV is used as a pattern root. This means it can't be used to look through bitcasts when used as a root, but that's probably ok. This extra CheckOpcode will ensure that the first match in the isel table will be a SwitchOpcode which is needed by the caching optimization in the ISel Matcher.
Original commit message:
Previously we had build_vector PatFrags that called ISD::isBuildVectorAllZeros/Ones. Internally the ISD::isBuildVectorAllZeros/Ones look through bitcasts, but we aren't able to take advantage of that in isel. Instead of we have to canonicalize the types of the all zeros/ones build_vectors and insert bitcasts. Then we have to pattern match those exact bitcasts.
By emitting specific matchers for these 2 nodes, we can make isel look through any bitcasts without needing to explicitly match them. We should also be able to remove the canonicalization to vXi32 from lowering, but I've left that for a follow up.
This removes something like 40,000 bytes from the X86 isel table.
Differential Revision: https://reviews.llvm.org/D58595
llvm-svn: 355784
This caused the first matcher in the isel table for many targets to Opc_Scope instead of Opc_SwitchOpcode. This leads to a significant increase in isel match failures.
llvm-svn: 355433
These arrays are both keyed by CPU name and go into the same tablegenerated file. Merge them so we only need to store keys once.
This also removes a weird space saving quirk where we used the ProcDesc.size() to create to build an ArrayRef for ProcSched.
Differential Revision: https://reviews.llvm.org/D58939
llvm-svn: 355431
The description for CPUs was just the CPU name wrapped with "Select the " and " processor". We can just do that directly in the help printer instead of making a separate version in the binary for each CPU.
Also remove the Value field that isn't needed and was always 0.
Differential Revision: https://reviews.llvm.org/D58938
llvm-svn: 355429
Apparently older versions of clang like 3.6 require an extra set of curly braces around std::array initializations. I'm told the C++ language was changed regarding this by CWG 1270.
llvm-svn: 355327
Previously we had build_vector PatFrags that called ISD::isBuildVectorAllZeros/Ones. Internally the ISD::isBuildVectorAllZeros/Ones look through bitcasts, but we aren't able to take advantage of that in isel. Instead of we have to canonicalize the types of the all zeros/ones build_vectors and insert bitcasts. Then we have to pattern match those exact bitcasts.
By emitting specific matchers for these 2 nodes, we can make isel look through any bitcasts without needing to explicitly match them. We should also be able to remove the canonicalization to vXi32 from lowering, but I've left that for a follow up.
This removes something like 40,000 bytes from the X86 isel table.
Differential Revision: https://reviews.llvm.org/D58595
llvm-svn: 355224
Subtarget features are stored in a std::bitset that has been subclassed. There is a special constructor to allow the tablegen files to provide a list of bits to initialize the std::bitset to. This constructor isn't constexpr and std::bitset doesn't support many constexpr operations either. This results in a static global constructor being used to initialize the feature bitsets in these files at startup.
To fix this I've introduced a new FeatureBitArray class that holds three 64-bit values representing the initial bit values and taught tablegen to emit hex constants for them based on the feature enum values. This makes the tablegen files less readable than they were before. I can add the list of features back as a comment if we think that's important.
I've added a method to convert from this class into the std::bitset subclass we had before. I considered making the new FeatureBitArray class just implement the std::bitset interface we need instead, but thought I'd see how others felts about that first.
I've simplified the interfaces to SetImpliedBits and ClearImpliedBits a little minimize the number of times we need to convert to the bitset.
This removes about 27K from my local release+asserts build of llc.
Differential Revision: https://reviews.llvm.org/D58520
llvm-svn: 355167
The previous sort comparator was not deterministic, i.e. in some
situations it would be possible for lhs < rhs && rhs < lhs. This was
discovered by an STL assertion in a Windows debug build of llvm-tblgen.
Differential Revision: https://reviews.llvm.org/D58687
llvm-svn: 354910
The --disassembler-options, or -M, are used to customize
the disassembler and affect its output.
The two implemented options allow selecting register names on ARM:
* With -Mreg-names-raw, the disassembler uses rNN for all registers.
* With -Mreg-names-std it prints sp, lr and pc for r13, r14 and r15,
which is the default behavior of llvm-objdump.
Differential Revision: https://reviews.llvm.org/D57680
llvm-svn: 354870
More or less all the instructions defined in the v8.2a full-fp16
extension are defined as UNPREDICTABLE if you put them in an IT block
(Thumb) or use with any condition other than AL (ARM). LLVM didn't
know that, and was happy to conditionalise them.
In order to force these instructions to count as not predicable, I had
to make a small Tablegen change. The code generation back end mostly
decides if an instruction was predicable by looking for something it
can identify as a predicate operand; there's an isPredicable bit flag
that overrides that check in the positive direction, but nothing that
overrides it in the negative direction.
(I considered the alternative approach of actually removing the
predicate operand from those instructions, but thought that it would
be more painful overall for instructions differing only in data type
to have different shapes of operand list. This way, the only code that
has to notice the difference is the if-converter.)
So I've added an isUnpredicable bit alongside isPredicable, and set
that bit on the right subset of FP16 instructions, and also on the
VSEL, VMAXNM/VMINNM and VRINT[ANPM] families which should be
unpredicable for all data types.
I've included a couple of representative regression tests, both of
which previously caused an fp16 instruction to be conditionalised in
ARM state and (with -arm-no-restrict-it) to be put in an IT block in
Thumb.
Reviewers: SjoerdMeijer, t.p.northover, efriedma
Reviewed By: efriedma
Subscribers: jdoerfert, javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57823
llvm-svn: 354768
OPC_CheckCondCode is always used as operand 2 of a setcc. And its always surrounded by a MoveChild2 and a MoveParent. By having a dedicated opcode for this case we can reduce the number of bytes needed for this pattern from 4 bytes to 2.
This saves ~3000 bytes in the X86 table.
llvm-svn: 354763
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
This class is used for two difference tablegen generated tables. For one of the tables the Value FeatureBitset only has one bit set. For the other usage the Implies field was unused.
This patch changes the Value field to just be an unsigned. For the usage that put a real vector in bitset, we now use the previously unused Implies field and leave the Value field unused instead.
This is good for a 16K reduction in the size of llc on my local build with all targets enabled.
llvm-svn: 354243
Summary:
While working on the GISel Combiner, I noticed I was producing location-less
error messages fairly often and set about fixing this. In the process, I
noticed quite a few places elsewhere in TableGen that also neglected to include
a relevant location.
This patch adds locations to errors that relate to a specific record (or a
field within it) and also have easy access to the relevant location. This is
particularly useful when multiclasses are involved as many of these errors
refer to the full name of a record and it's difficult to guess which substring
is grep-able.
Unfortunately, tablegen currently only supports Record granularity so it's not
currently possible to point at a specific Init so these sometimes point at the
record that caused the error rather than the precise origin of the error.
Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, nhaehnle
Reviewed By: nhaehnle
Subscribers: jdoerfert, nhaehnle, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58077
llvm-svn: 353862
This patch adds a -time-regions option to tablegen that can enable timers
(currently only one) that assess the performance of tablegen itself. This
can be useful for identifying scaling problems with tablegen backends.
This particular timer has allowed me to ignore time that is not attributed
the GISel combiner pass. It's useful by itself but it is particularly
useful in combination with https://reviews.llvm.org/D52954 which causes
this period of time to be annotated within Xcode Instruments which in turn
allows profile samples and recorded allocations attributed to reading
instructions to be filtered out.
llvm-svn: 353763
If we run into a pattern that looks like this:
add
(complex $x, $y)
(complex $x, $z)
We should skip the pattern instead of asserting/doing something unpredictable.
This makes us return an Error in that case, and adds a testcase for skipped
patterns.
Differential Revision: https://reviews.llvm.org/D57980
llvm-svn: 353586
Summary:
There are a few instructions that all map to the same opcode, so
when disassembling, we have to pick one. That was just the first one
before (the except_ref variant in the case of "call"), now it is the
one marked as IsCanonical in tablegen, or failing that, the shortest
name (which is typically the "canonical" one).
Also introduced a canonical "end" instruction for this purpose.
Reviewers: dschuff, tlively
Subscribers: sbc100, jgravelle-google, aheejin, llvm-commits, sunfish
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57713
llvm-svn: 353131
LLVM_ENABLE_DAGISEL_COV can be used to instrument DAGISel tablegen
selection code to show which patterns along with Complex patterns were
used when selecting instructions. Unfortunately this is turned off by
default and was broken but never tested.
This required a simple fix (missing new line) to get it to build again.
llvm-svn: 353091
This patch replaces the existing LLVMVectorSameWidth matcher with LLVMScalarOrSameVectorWidth.
The matching args must be either scalars or vectors with the same number of elements, but in either case the scalar/element type can differ, specified by LLVMScalarOrSameVectorWidth.
I've updated the _overflow intrinsics to demonstrate this - allowing it to return a i1 or <N x i1> overflow result, matching the scalar/vectorwidth of the other (add/sub/mul) result type.
The masked load/store/gather/scatter intrinsics have also been updated to use this, although as we specify the reference type to be llvm_anyvector_ty we guarantee the mask will be <N x i1> so no change in behaviour
Differential Revision: https://reviews.llvm.org/D57090
llvm-svn: 351957
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
Right now we include ${TGT}GenCallingConv.inc once per each instruction
selection method implemented by ${TGT}:
- ${TGT}ISelLowering.cpp
- ${TGT}CallLowering.cpp
- ${TGT}FastISel.cpp
Instead, add a mechanism to tablegen for marking a particular convention
as "External", which causes tablegen to emit into the ::llvm namespace,
instead of as a static helper. This allows us to provide a header to
forward declare it, so we can simply call the function from all the
places it is referenced. Typically the calling convention analyzer is
called indirectly, so it doesn't benefit from inlining.
This saves a bit of final binary size, but mostly just saves object file
size:
before after diff artifact
12852K 12492K -360K X86ISelLowering.cpp.obj
4640K 4280K -360K X86FastISel.cpp.obj
1704K 2092K +388K X86CallingConv.cpp.obj
52448K 52336K -112K llc.exe
I didn't collect before numbers for X86CallLowering.cpp.obj, which is
for GlobalISel, but we should save 360K there as well.
This patch applies the strategy to the X86 backend, but there is no
reason it couldn't be applied to the other backends that implement
multiple ISel strategies, like AArch64.
Reviewers: craig.topper, hfinkel, efriedma
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D56883
llvm-svn: 351616
Summary:
The previously introduced new operand type for br_table didn't have
a disassembler implementation, causing an assert.
Reviewers: dschuff, aheejin
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D56227
llvm-svn: 350366
When you define an instruction alias as a subclass of InstAlias, you
specify all the MC operands for the instruction it expands to, except
for operands that are tied to a previous one, which you leave out in
the expectation that the Tablegen output code will fill them in
automatically.
But the code in Tablegen's AsmWriter backend that skips over a tied
operand was doing it using 'if' instead of 'while', because it wasn't
expecting to find two tied operands in sequence.
So if an instruction updates a pair of registers in place, so that its
MC representation has two input operands tied to the output ones (for
example, Arm's UMLAL instruction), then any alias which wants to
expand to a special case of that instruction is likely to fail to
match, because the indices of subsequent operands will be off by one
in the generated printAliasInstr function.
This patch re-indents some existing code, so it's clearest when
viewed as a diff with whitespace changes ignored.
Reviewers: fhahn, rengolin, sdesmalen, atanasyan, asb, jholewinski, t.p.northover, kparzysz, craig.topper, stoklund
Reviewed By: rengolin
Subscribers: javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D53816
llvm-svn: 349141
One of the GCC based bots is objecting to a vector of const EncodingAndInst's:
In file included from /usr/include/c++/8/vector:64,
from /export/users/atombot/llvm/clang-atom-d525-fedora-rel/llvm/utils/TableGen/CodeGenInstruction.h:22,
from /export/users/atombot/llvm/clang-atom-d525-fedora-rel/llvm/utils/TableGen/FixedLenDecoderEmitter.cpp:15:
/usr/include/c++/8/bits/stl_vector.h: In instantiation of 'class std::vector<const {anonymous}::EncodingAndInst, std::allocator<const {anonymous}::EncodingAndInst> >':
/export/users/atombot/llvm/clang-atom-d525-fedora-rel/llvm/utils/TableGen/FixedLenDecoderEmitter.cpp:375:32: required from here
/usr/include/c++/8/bits/stl_vector.h:351:21: error: static assertion failed: std::vector must have a non-const, non-volatile value_type
static_assert(is_same<typename remove_cv<_Tp>::type, _Tp>::value,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/8/bits/stl_vector.h:354:21: error: static assertion failed: std::vector must have the same value_type as its allocator
static_assert(is_same<typename _Alloc::value_type, _Tp>::value,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
llvm-svn: 349046
Summary:
Separate the concept of an encoding from an instruction. This will enable
the definition of additional encodings for the same instruction which can
be used to support variable length instruction sets in the disassembler
(and potentially assembler but I'm not working towards that right now)
without causing an explosion in the number of Instruction records that
CodeGen then has to pick between.
Reviewers: bogner, charukcs
Reviewed By: bogner
Subscribers: kparzysz, llvm-commits
Differential Revision: https://reviews.llvm.org/D52366
llvm-svn: 349041
Summary:
This fixes support in DAGISelMatcher backend for DAG nodes with multiple
result values. Previously the order of results in selected DAG nodes always
matched the order of results in ISel patterns. After the change the order of
results matches the order of operands in OutOperandList instead.
For example, given this definition from the attached test case:
def INSTR : Instruction {
let OutOperandList = (outs GPR:$r1, GPR:$r0);
let InOperandList = (ins GPR:$t0, GPR:$t1);
let Pattern = [(set i32:$r0, i32:$r1, (udivrem i32:$t0, i32:$t1))];
}
the DAGISelMatcher backend currently produces a matcher that creates INSTR
nodes with the first result `$r0` and the second result `$r1`, contrary to the
order in the OutOperandList. The order of operands in OutOperandList does not
matter at all, which is unexpected (and unfortunate) because the order of
results of a DAG node does matters, perhaps a lot.
With this change, if the order in OutOperandList does not match the order in
Pattern, DAGISelMatcherGen emits CompleteMatch opcodes with the order of
results taken from OutOperandList. Backend writers can use it to express
result reorderings in TableGen.
If the order in OutOperandList matches the order in Pattern, the result of
DAGISelMatcherGen is unaffected.
Patch by Eugene Sharygin
Reviewers: andreadb, bjope, hfinkel, RKSimon, craig.topper
Reviewed By: craig.topper
Subscribers: nhaehnle, craig.topper, llvm-commits
Differential Revision: https://reviews.llvm.org/D55055
llvm-svn: 348326
Currently, variadic operands on an MCInst are assumed to be uses,
because they come after the defs. However, this is not always the case,
for example the Arm/Thumb LDM instructions write to a variable number of
registers.
This adds a property of instruction definitions which can be used to
mark variadic operands as defs. This only affects MCInst, because
MachineInstruction already tracks use/def per operand in each instance
of the instruction, so can already represent this.
This property can then be checked in MCInstrDesc, allowing us to remove
some special cases in ARMAsmParser::isITBlockTerminator.
Differential revision: https://reviews.llvm.org/D54853
llvm-svn: 348114
Simple predicates, such as those defined by `CheckRegOperandSimple` or
`CheckImmOperandSimple`, were not being negated when used with `CheckNot`.
This change fixes this issue by defining the previously declared methods to
handle simple predicates.
Differential revision: https://reviews.llvm.org/D55089
llvm-svn: 348034
Summary:
This simplifies writing predicates for pattern fragments that are
automatically re-associated or commuted.
For example, a followup patch adds patterns for fragments of the form
(add (shl $x, $y), $z) to the AMDGPU backend. Such patterns are
automatically commuted to (add $z, (shl $x, $y)), which makes it basically
impossible to refer to $x, $y, and $z generically in the PredicateCode.
With this change, the PredicateCode can refer to $x, $y, and $z simply
as `Operands[i]`.
Test confirmed that there are no changes to any of the generated files
when building all (non-experimental) targets.
Change-Id: I61c00ace7eed42c1d4edc4c5351174b56b77a79c
Reviewers: arsenm, rampitec, RKSimon, craig.topper, hfinkel, uweigand
Subscribers: wdng, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D51994
llvm-svn: 347992
When tablegen detects that there exist two subregister compositions that
result in the same value for some register, it will emit a warning. This
kind of an overlap in compositions should only happen when it is caused
by a user-defined composition. It can happen, however, that the user-
defined composition is not identically equal to another one, but it does
produce the same value for one or more registers. In such cases suppress
the warning.
This patch is to silence the warning when building the System Z backend
after D50725.
Differential Revision: https://reviews.llvm.org/D50977
llvm-svn: 347894
This patch adds the ability to specify via tablegen which processor resources
are load/store queue resources.
A new tablegen class named MemoryQueue can be optionally used to mark resources
that model load/store queues. Information about the load/store queue is
collected at 'CodeGenSchedule' stage, and analyzed by the 'SubtargetEmitter' to
initialize two new fields in struct MCExtraProcessorInfo named `LoadQueueID` and
`StoreQueueID`. Those two fields are identifiers for buffered resources used to
describe the load queue and the store queue.
Field `BufferSize` is interpreted as the number of entries in the queue, while
the number of units is a throughput indicator (i.e. number of available pickers
for loads/stores).
At construction time, LSUnit in llvm-mca checks for the presence of extra
processor information (i.e. MCExtraProcessorInfo) in the scheduling model. If
that information is available, and fields LoadQueueID and StoreQueueID are set
to a value different than zero (i.e. the invalid processor resource index), then
LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value
declared by the two processor resources.
With this patch, we more accurately track dynamic dispatch stalls caused by the
lack of LS tokens (i.e. load/store queue full). This is also shown by the
differences in two BdVer2 tests. Stalls that were previously classified as
generic SCHEDULER FULL stalls, are not correctly classified either as "load
queue full" or "store queue full".
About the differences in the -scheduler-stats view: those differences are
expected, because entries in the load/store queue are not released at
instruction issue stage. Instead, those are released at instruction executed
stage. This is the main reason why for the modified tests, the load/store
queues gets full before PdEx is full.
Differential Revision: https://reviews.llvm.org/D54957
llvm-svn: 347857
There are quite strong constraints on how you can use the TIED_TO
constraint between MC operands, many of which are currently not
checked until compiler run time.
MachineVerifier enforces that operands can only be tied together in
pairs (no three-way ties), and MachineInstr::tieOperands enforces that
one of the tied operands must be an output operand (def) and the other
must be an input operand (use).
Now we check these at TableGen time, so that if you violate any of
them in a new instruction definition, you find out immediately,
instead of having to wait until you compile something that makes code
generation hit one of those assertions.
Also in this commit, all the error reports in ParseConstraint now
include the name and source location of the def where the problem
happened, so that if you do trigger any of these errors, it's easier
to find the part of your TableGen input where you made the mistake.
The trunk sources already build successfully with this additional
error check, so I think no in-tree target has any of these problems.
Reviewers: fhahn, lhames, nhaehnle, MatzeB
Reviewed By: MatzeB
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53815
llvm-svn: 347743
`llvm-mca` relies on the predicates to be based on `MCSchedPredicate` in order
to resolve the scheduling for variant instructions. Otherwise, it aborts
the building of the instruction model early.
However, the scheduling model emitter in `TableGen` gives up too soon, unless
all processors use only such predicates.
In order to allow more processors to be used with `llvm-mca`, this patch
emits scheduling transitions if any processor uses these predicates. The
transition emitted for the processors using legacy predicates is the one
specified with `NoSchedPred`, which is based on `MCSchedPredicate`.
Preferably, `llvm-mca` should instead assume a reasonable default when a
variant transition is not based on `MCSchedPredicate` for a given processor.
This issue should be revisited in the future.
Differential revision: https://reviews.llvm.org/D54648
llvm-svn: 347504
A call to @llvm.trap can be expected to be cold (i.e. unlikely to be
reached in a normal program execution).
Outlining paths which unconditionally trap is an important memory
saving. As the hot/cold splitting pass (imho) should not treat all
noreturn calls as cold, explicitly mark @llvm.trap cold so that it can
be outlined.
Split out of https://reviews.llvm.org/D54244.
Differential Revision: https://reviews.llvm.org/D54329
llvm-svn: 346885
Summary:
This simplifies the code and moves everything to tablegen for consistency. This
also prepares the ground for adding issue counters.
Reviewers: gchatelet, john.brawn, jsji
Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D54297
llvm-svn: 346489
Summary:
As a bonus, this arguably improves the code by making it simpler.
gcc 8 on Ubuntu 18.10 reports the following:
==39667==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fffffff8ae0 at pc 0x555555dbfc68 bp 0x7fffffff8760 sp 0x7fffffff8750
WRITE of size 8 at 0x7fffffff8ae0 thread T0
#0 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider::_Alloc_hider(char*, std::allocator<char>&&) /usr/include/c++/8/bits/basic_string.h:149
#1 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) /usr/include/c++/8/bits/basic_string.h:542
#2 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) /usr/include/c++/8/bits/basic_string.h:6009
#3 0x555555dbfc67 in searchableFieldType /home/nha/amd/build/san/llvm-src/utils/TableGen/SearchableTableEmitter.cpp:168
(...)
Address 0x7fffffff8ae0 is located in stack of thread T0 at offset 864 in frame
#0 0x555555dbef3f in searchableFieldType /home/nha/amd/build/san/llvm-src/utils/TableGen/SearchableTableEmitter.cpp:148
Reviewers: fhahn, simon_tatham, kparzysz
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53931
llvm-svn: 345749
Before this patch, class PredicateExpander only knew how to expand simple
predicates that performed checks on instruction operands.
In particular, the new scheduling predicate syntax was not rich enough to
express checks like this one:
Foo(MI->getOperand(0).getImm()) == ExpectedVal;
Here, the immediate operand value at index zero is passed in input to function
Foo, and ExpectedVal is compared against the value returned by function Foo.
While this predicate pattern doesn't show up in any X86 model, it shows up in
other upstream targets. So, being able to support those predicates is
fundamental if we want to be able to modernize all the scheduling models
upstream.
With this patch, we allow users to specify if a register/immediate operand value
needs to be passed in input to a function as part of the predicate check. Now,
register/immediate operand checks all derive from base class CheckOperandBase.
This patch also changes where TIIPredicate definitions are expanded by the
instructon info emitter. Before, definitions were expanded in class
XXXGenInstrInfo (where XXX is a target name).
With the introduction of this new syntax, we may want to have TIIPredicates
expanded directly in XXXInstrInfo. That is because functions used by the new
operand predicates may only exist in the derived class (i.e. XXXInstrInfo).
This patch is a non functional change for the existing scheduling models.
In future, we will be able to use this richer syntax to better describe complex
scheduling predicates, and expose them to llvm-mca.
Differential Revision: https://reviews.llvm.org/D53880
llvm-svn: 345714
Summary:
The pfm counters are now in the ExegesisTarget rather than the
MCSchedModel (PR39165).
This also compresses the pfm counter tables (PR37068).
Reviewers: RKSimon, gchatelet
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D52932
llvm-svn: 345243
Summary:
Some targets have very long encodings and uint64_t isn't sufficient. uint128_t
isn't portable so such targets need to use an object instead.
There is one catch with this at the moment, no string of bits extracted
from the encoding may exceeed 64-bits. Fields are still permitted to
exceed 64-bits so long as they aren't one contiguous string of bits. If
this proves to be a problem then we can modify the generation of
fieldFromInstruction() calls to account for it but for now I've added an
assertion for this.
InsnType must either be integral or an APInt-like object that must:
* Have a static const max_size_in_bits equal to the number of bits in the encoding.
* be default-constructible and copy-constructible
* be constructible from a uint64_t (this is the key area the interface deviates
from APInt since this constructor does not take the bit width)
* be constructible from an APInt (this can be private)
* be convertible to uint64_t
* Support the ~, &,, ==, !=, and |= operators with other objects of the same type
* Support shift (<<, >>) with signed and unsigned integers on the RHS
* Support put (<<) to raw_ostream&
Reviewers: bogner, charukcs
Subscribers: nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D52100
llvm-svn: 345056
Summary:
Replace its functionality with a TableGen InstrInfo relational
instruction mapping. Although arguably more complex than the TableGen
backend, the relational mapping is a smaller maintenance burden than a
TableGen backend.
Reviewers: aardappel, aheejin, dschuff
Subscribers: mgorny, sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D53307
llvm-svn: 344962
This patch adds the ability to identify instructions that are "move elimination
candidates". It also allows scheduling models to describe processor register
files that allow move elimination.
A move elimination candidate is an instruction that can be eliminated at
register renaming stage.
Each subtarget can specify which instructions are move elimination candidates
with the help of tablegen class "IsOptimizableRegisterMove" (see
llvm/Target/TargetInstrPredicate.td).
For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated.
The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:
```
def : IsOptimizableRegisterMove<[
InstructionEquivalenceClass<[
// GPR variants.
MOV32rr, MOV64rr,
// MMX variants.
MMX_MOVQ64rr,
// SSE variants.
MOVAPSrr, MOVUPSrr,
MOVAPDrr, MOVUPDrr,
MOVDQArr, MOVDQUrr,
// AVX variants.
VMOVAPSrr, VMOVUPSrr,
VMOVAPDrr, VMOVUPDrr,
VMOVDQArr, VMOVDQUrr
], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```
Definitions of IsOptimizableRegisterMove from processor models of a same
Target are processed by the SubtargetEmitter to auto-generate a target-specific
override for each of the following predicate methods:
```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI)
const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned
CPUID) const;
```
By default, those methods return false (i.e. conservatively assume that there
are no move elimination candidates).
Tablegen class RegisterFile has been extended with the following information:
- The set of register classes that allow move elimination.
- Maxium number of moves that can be eliminated every cycle.
- Whether move elimination is restricted to moves from registers that are
known to be zero.
This patch is structured in three part:
A first part (which is mostly boilerplate) adds the new
'isOptimizableRegisterMove' target hooks, and extends existing register file
descriptors in MC by introducing new fields to describe properties related to
move elimination.
A second part, uses the new tablegen constructs to describe move elimination in
the BtVer2 scheduling model.
A third part, teaches llm-mca how to query the new 'isOptimizableRegisterMove'
hook to mark instructions that are candidates for move elimination. It also
teaches class RegisterFile how to describe constraints on move elimination at
PRF granularity.
llvm-mca tests for btver2 show differences before/after this patch.
Differential Revision: https://reviews.llvm.org/D53134
llvm-svn: 344334
Summary:
The predicate function is added in InlinePatternFragments, no need to
do it here. As a result, all uses of addPredicateFn are located in
InlinePatternFragments.
Test confirmed that there are no changes to generated files when
building all (non-experimental) targets.
Change-Id: I720e42e045ca596eb0aa339fb61adf6fe71034d5
Reviewers: arsenm, rampitec, RKSimon, craig.topper, hfinkel, uweigand
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D51993
llvm-svn: 343977
There are a few leftovers in rL343163 which span two lines. This commit
changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...)
llvm-svn: 343426