Summary:
The loop which emits AssemblerPredicate conditions also links them together by emitting a '&&'.
If the 1st predicate is not an AssemblerPredicate, while the 2nd one is, nothing gets emitted for the 1st one, but we still emit the '&&' because of the 2nd predicate.
This generated code looks like "( && Cond2)" and is invalid.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D8294
llvm-svn: 234312
Specify an allocation order with a register class. This is used by register
allocators with a greedy heuristic. This is usefull as it is sometimes
beneficial to color more constrained classes first.
Differential Revision: http://reviews.llvm.org/D8626
llvm-svn: 233743
per-function subtarget.
Currently, code-gen passes the default or generic subtarget to the constructors
of MCInstPrinter subclasses (see LLVMTargetMachine::addPassesToEmitFile), which
enables some targets (AArch64, ARM, and X86) to change their instprinter's
behavior based on the subtarget feature bits. Since the backend can now use
different subtargets for each function, instprinter has to be changed to use the
per-function subtarget rather than the default subtarget.
This patch takes the first step towards enabling instprinter to change its
behavior based on the per-function subtarget. It adds a bit "PassSubtarget" to
AsmWriter which tells table-gen to pass a reference to MCSubtargetInfo to the
various print methods table-gen auto-generates.
I will follow up with changes to instprinters of AArch64, ARM, and X86.
llvm-svn: 233411
This reverts commit r233055.
It still causes buildbot failures (gcc running out of memory on several platforms, and a self-host failure on arm), although less than the previous time.
llvm-svn: 233068
Previously, subtarget features were a bitfield with the underlying type being uint64_t.
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.
The first time this was committed (r229831), it caused several buildbot failures.
At least some of the ARM ones were due to gcc/binutils issues, and should now be fixed.
Differential Revision: http://reviews.llvm.org/D8542
llvm-svn: 233055
This is needed for AVX512 masked scatter/gather support.
The R600 change is necessary to remove a hack that was working around the lack of multiple results.
llvm-svn: 232798
Some subregisters are only to indicate different access sizes, while not
providing any way to actually divide the register up into multiple
disjunct parts. Avoid tracking subregister liveness in these cases as it
is not beneficial.
Differential Revision: http://reviews.llvm.org/D8429
llvm-svn: 232695
When calculating the lanemask of a register class we have to include the
masks of subregisters supported by any of the class members, not just
the ones supported by all class members.
This fixes problems when coalescing towards a subclass with additional
subregisters available.
The attached testcase works fine as is, but does crash if you enable
subregister liveness on x86 without this change applied.
llvm-svn: 232652
Now that SmallString is a first-class citizen, most SmallString::str()
calls are not required. This patch removes a whole bunch of them, yet
there are lots more.
There are two use cases where str() is really needed:
1) To use one of StringRef member functions which is not available in
SmallString.
2) To convert to std::string, as StringRef implicitly converts while
SmallString do not. We may wish to change this, but it may introduce
ambiguity.
llvm-svn: 232622
clang-cl would warn that this value is not representable in 'int':
enum { FeatureX = 1ULL << 31 };
All MS enums are 'ints' unless otherwise specified, so we have to use an
explicit type. The AMDGPU target just hit 32 features, triggering this
warning.
Now that we have C++11 strong enum types, we can also eliminate the
'const uint64_t' codepath from tablegen and just use 'enum : uint64_t'.
llvm-svn: 231697
This is what all the targets check for and is consistent with the
initialized value of MissingFeatures, which is sometimes assinged
to ErrorInfo.
llvm-svn: 231397
This should help with the AVX512 masked gather changes Elena is working on. This patch is derived from some of the changes Elena made to tablegen, but modified by me to support arbitrary number of results.
llvm-svn: 231357
Accidentally committed a few more of these cleanup changes than
intended. Still breaking these out & tidying them up.
This reverts commit r231135.
llvm-svn: 231136
There doesn't seem to be any need to assert that iterator assignment is
between iterators over the same node - if you want to reuse an iterator
variable to iterate another node, that's perfectly acceptable. Just
don't mix comparisons between iterators into disjoint sequences, as
usual.
llvm-svn: 231135
All of the cases were just appending from random access iterators to a
vector. Using insert/append can grow the vector to the perfect size
directly and moves the growing out of the loop. No intended functionalty
change.
llvm-svn: 230845
The keys of the map are unique by pointer address, so there's no need
to use the llvm::less comparator. This allows us to use DenseMap
instead, which reduces tblgen time by 20% on my stress test.
llvm-svn: 230769
Gather and scatter instructions additionally write to one of the source operands - mask register.
In this case Gather has 2 destination values - the loaded value and the mask.
Till now we did not support code gen pattern for gather - the instruction was generated from
intrinsic only and machine node was hardcoded.
When we introduce the masked_gather node, we need to select instruction automatically,
in the standard way.
I added a flag "hasTwoExplicitDefs" that allows to handle 2 destination operands.
(Some code in the X86InstrFragmentsSIMD.td is commented out, just to split one big
patch in many small patches)
llvm-svn: 230471
Everyone except R600 was manually passing the length of a static array
at each callsite, calculated in a variety of interesting ways. Far
easier to let ArrayRef handle that.
There should be no functional change, but out of tree targets may have
to tweak their calls as with these examples.
llvm-svn: 230118
Previously, subtarget features were a bitfield with the underlying type being uint64_t.
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.
Differential Revision: http://reviews.llvm.org/D7065
llvm-svn: 229831
Gather and Scatter are new introduced intrinsics, comming after recently implemented masked load and store.
This is the first patch for Gather and Scatter intrinsics. It includes only the syntax, parsing and verification.
Gather and Scatter intrinsics allow to perform multiple memory accesses (read/write) in one vector instruction.
The intrinsics are not target specific and will have the following syntax:
Gather:
declare <16 x i32> @llvm.masked.gather.v16i32(<16 x i32*> <vector of ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x i32> <passthru>)
declare <8 x float> @llvm.masked.gather.v8f32(<8 x float*><vector of ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float><passthru>)
Scatter:
declare void @llvm.masked.scatter.v8i32(<8 x i32><vector value to be stored> , <8 x i32*><vector of ptrs> , i32 <alignment>, <8 x i1> <mask>)
declare void @llvm.masked.scatter.v16i32(<16 x i32> <vector value to be stored> , <16 x i32*> <vector of ptrs>, i32 <alignment>, <16 x i1><mask> )
Vector of ptrs - a set of source/destination addresses, to load/store the value.
Mask - switches on/off vector lanes to prevent memory access for switched-off lanes
vector of ptrs, value and mask should have the same vector width.
These are code examples where gather / scatter should be used and will allow function vectorization
;void foo1(int * restrict A, int * restrict B, int * restrict C) {
; for (int i=0; i<SIZE; i++) {
; A[i] = B[C[i]];
; }
;}
;void foo3(int * restrict A, int * restrict B) {
; for (int i=0; i<SIZE; i++) {
; A[B[i]] = i+5;
; }
;}
Tests will come in the following patches, with CodeGen and Vectorizer.
http://reviews.llvm.org/D7433
llvm-svn: 228521
Similar to the C++14 void specializations of these templates, useful as
a stop-gap until LLVM switches to '14.
Example use-cases in tblgen because I saw some functors that looked like
they could be simplified/refactored.
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7324
llvm-svn: 227828
The hot path through this region of code does lots of batch inserts into sets. By storing them as sorted arrays, we can defer the sorting to the end of the batch, which is dramatically more efficient. This reduces tblgen runtime by 25% on my worst-case target.
llvm-svn: 227682
This is a continuation of my prior work to move some of the inner workings for CodeGenRegister to use bit vectors when computing about register units. This is highly beneficial to TableGen runtime on targets with large, dense register files. This patch represents a ~40% runtime reduction over and above my earlier improvement on a stress test of this case.
llvm-svn: 227678
For target descriptions with very large and very dense register files, TableGen
can take an extremely long time to run. This change makes a dent in that (~15%
in my measurements) by accelerating the single hottest operation with better data
structures.
I believe there's still a lot of room to make this even faster with more global
changes that require replacing some of the existing datastructures in this area
with bit vectors, but that's a more involved change and I wanted to get this
simpler improvement in first.
llvm-svn: 227562
derived classes.
Since global data alignment, layout, and mangling is often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be layed out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.
llvm-svn: 227113
Specifically, gc.result benefits from this greatly. Instead of:
gc.result.int.*
gc.result.float.*
gc.result.ptr.*
...
We now have a gc.result.* that can specialize to literally any type.
Differential Revision: http://reviews.llvm.org/D7020
llvm-svn: 226857
This patch was generated by a clang tidy checker that is being open sourced.
The documentation of that checker is the following:
/// The emptiness of a container should be checked using the empty method
/// instead of the size method. It is not guaranteed that size is a
/// constant-time function, and it is generally more efficient and also shows
/// clearer intent to use empty. Furthermore some containers may implement the
/// empty method but not implement the size method. Using empty whenever
/// possible makes it easier to switch to another container in the future.
Patch by Gábor Horváth!
llvm-svn: 226161
This adds support for creating an InstAlias with a negative immediate, i.e.:
def NOT : InstAlias<"not $dst, $src", (XORI GR32:$dst, GR32:$src, -1)>;
by resolving this problem:
RISCVGenAsmMatcher.inc:95:11: error: expected '= constant-expression' or end of enumerator definition
CVT_imm_-1,
^^^^^^^^^^
Patch by Jordy Potman, thanks!
llvm-svn: 226073
These intrinsics allow multiple functions to share a single stack
allocation from one function's call frame. The function with the
allocation may only perform one allocation, and it must be in the entry
block.
Functions accessing the allocation call llvm.recoverframeallocation with
the function whose frame they are accessing and a frame pointer from an
active call frame of that function.
These intrinsics are very difficult to inline correctly, so the
intention is that they be introduced rarely, or at least very late
during EH preparation.
Reviewers: echristo, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D6493
llvm-svn: 225746
This adds two new fields to the RegisterOperand TableGen class:
string OperandNamespace = "MCOI";
string OperandType = "OPERAND_REGISTER";
These fields can be used to specify a target specific operand type,
which will be stored in the OperandType member of the MCOperandInfo
object.
This can be useful for targets that need to store some extra information
about operands that cannot be expressed using the target independent
types. For example, in the R600 backend, there are operands which
can take either registers or immediates and it is convenient to be able
to specify this in the TableGen definitions.
llvm-svn: 225661
Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building.
llvm-svn: 225256
This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used.
Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will lookup the current mode size to select the right instruction.
llvm-svn: 225075
This removes a hardcoded list of instructions in the CodeEmitter. Eventually I intend to remove the predicates on the affected instructions since in any given mode two of them are valid if we supported addr32/addr16 prefixes in the assembler.
llvm-svn: 224809
An instruction alias defined with InstAlias and an optional operand in the
middle of the AsmString field, "..${a} <operands>", would get the final
"}" printed in the instruction disassembly. This wouldn't happen if the optional
operand appeared as the last item in the AsmString which is how the current
backends avoided the problem.
There don't appear to be any tests for this part of Tablegen but it passes the
pre-commit tests. Manually tested the change by enabling the generic alias
printer in the ARM backend and checking the output.
Differential Revision: http://reviews.llvm.org/D6529
llvm-svn: 224348
On X86, the Intel asm parser tries to match all memory operand sizes when
none is explicitly specified. For LEA, which doesn't really have a memory
operand (just a pointer one), this results in multiple successful matches,
one for each memory size. There's no error because it's same opcode, so
really, it's just one match. However, the tablegen'd matcher function
adds opcode/operands to the passed MCInst, and this results in multiple
duplicated operands.
This commit clears the MCInst in the tablegen'd matcher function.
We sometimes clear it when the match failed, so there's no expectation of
keeping the previous content anyway.
Differential Revision: http://reviews.llvm.org/D6670
llvm-svn: 224347
Clang's static analyzer found several potential cases of undefined
behavior, use of un-initialized values, and potentially null pointer
dereferences in tablegen, Support, MC, and ADT. This cleans them up
with specific assertions on the assumptions of the code.
llvm-svn: 224154
Let tablegen compute the combination of subregister lanemasks for all
subregisters in a register/register class. This is preparation for further
work subregister allocation
llvm-svn: 223873
I'm recommiting the codegen part of the patch.
The vectorizer part will be send to review again.
Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)
Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.
http://reviews.llvm.org/D6191
llvm-svn: 223348
This complicates a few algorithms due to not having random access, but
not by a huge degree I don't think (open to debate/design
discussion/etc).
llvm-svn: 223261
This is the second patch in a small series. This patch contains the MachineInstruction and x86-64 backend pieces required to lower Statepoints. It does not include the code to actually generate the STATEPOINT machine instruction and as a result, the entire patch is currently dead code. I will be submitting the SelectionDAG parts within the next 24-48 hours. Since those pieces are by far the most complicated, I wanted to minimize the size of that patch. That patch will include the tests which exercise the functionality in this patch. The entire series can be seen as one combined whole in http://reviews.llvm.org/D5683.
The STATEPOINT psuedo node is generated after all gc values are explicitly spilled to stack slots. The purpose of this node is to wrap an actual call instruction while recording the spill locations of the meta arguments used for garbage collection and other purposes. The STATEPOINT is modeled as modifing all of those locations to prevent backend optimizations from forwarding the value from before the STATEPOINT to after the STATEPOINT. (Doing so would break relocation semantics for collectors which wish to relocate roots.)
The implementation of STATEPOINT is closely modeled on PATCHPOINT. Eventually, much of the code in this patch will be removed. The long term plan is to merge the functionality provided by statepoints and patchpoints. Merging their implementations in the backend is likely to be a good starting point.
Reviewed by: atrick, ributzka
llvm-svn: 223085
Order matters for this container, it seems (using a forward_list and
replacing the original push_backs with emplace_fronts caused test
failures). I didn't look too deeply into why.
(& in retrospect, I might go back & change some of the forward_lists I
introduced to deques anyway - since most don't require removal, deque is
a more memory-friendly data structure (moderate locality while not
invalidating pointers))
llvm-svn: 222950
Seems libstdc++ on some buildbots is lacking std::map::emplace, which is
weird... reverting while I look into it.
This reverts commit r222937.
llvm-svn: 222939
Pointers and references to map elements are never invalidated (except on
removal, which isn't used here) so there's no need for the indirection
unless there's polymorphism at work.
A little const correctness had to be fixed, since the indirection
allowed some benign const violations.
llvm-svn: 222937
This reverts commit r222632 (and follow-up r222636), which caused a host
of LNT failures on an internal bot. I'll respond to the commit on the
list with a reproduction of one of the failures.
Conflicts:
lib/Target/X86/X86TargetTransformInfo.cpp
llvm-svn: 222936
Since the elements were not polymorphic, the unique_ptr was only used to
avoid pointer invalidation on container resizes - might as well skip the
indirection and use a container with suitable invalidation semantics.
llvm-svn: 222931
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)
Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.
http://reviews.llvm.org/D6191
llvm-svn: 222632
Primarily done by using SequenceToOffsetTable to reduce the register pressure set tables and then sizing the indices into the tables appropriately. Size a few other table entries based on content as well. Reduces X86RegisterInfo.o by ~9k.
llvm-svn: 222621
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.
This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...
llvm-svn: 222334
Having two ways to do this doesn't seem terribly helpful and
consistently using the insert version (which we already has) seems like
it'll make the code easier to understand to anyone working with standard
data structures. (I also updated many references to the Entry's
key and value to use first() and second instead of getKey{Data,Length,}
and get/setValue - for similar consistency)
Also removes the GetOrCreateValue functions so there's less surface area
to StringMap to fix/improve/change/accommodate move semantics, etc.
llvm-svn: 222319
StringSet is still a bit dodgy in that it exposes the raw iterator of
the StringMap parent, which exposes the weird detail that StringSet
actually has a 'value'... but anyway, this is useful for a handful of
clients that want to reference the newly inserted/persistent string data
in the StringSet/Map/Entry/thing.
llvm-svn: 222302
This reverts commit r222183.
Broke on the MSVC buildbots due to MSVC not producing default move
operations - I'd fix it immediately but just broke my build system a
bit, so backing out until I have a chance to get everything going again.
llvm-svn: 222187
The next step is to actually use unique_ptr in TreePatternNode's
Children vector. That will be more intrusive, and may not work,
depending on exactly how these things are handled (I have a bad
suspicion things are shared more than they should be, making this more
DAG than tree - but if it's really a tree, unique_ptr should suffice)
llvm-svn: 222183
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.
llvm-svn: 222118
based on instruction complexity
The order that tablegen fast-isel instruction code is generated is
currently based on the text of the predicate (using string
less-than). This patch changes this to instead use the instruction
complexity. Because the complexities are not unique a C++ multimap is
used instead of a map.
This fixes the problem where code with no predicate always comes out
first (the empty string always compares as less than all other
strings) thus making the code with predicates dead code. See the FMUL
code in PPCFastISel.cpp for an example. It also more closely matches
the normal codegen ordering. Some error checking in the tablegen
fast-isel code is fixed as well.
Patch by Bill Seurer.
llvm-svn: 222038
The problem is mostly that variadic output instruction
aren't handled, so it is rejected for having an inconsistent
number of operands, and then the right number of operands
isn't emitted.
llvm-svn: 221117
Summary:
CustomCallingConv is simply a CallingConv that tablegen should not generate the
implementation for. It allows regular CallingConv's to delegate to these custom
functions. This is (currently) necessary for Mips and we cannot use CCCustom
without having to adapt to the different API that CCCustom uses.
This brings us a bit closer to being able to remove
MipsCC::analyzeCallOperands and MipsCC::analyzeFormalArguments in favour of
the common implementation.
No functional change to the targets.
Depends on D3341
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: vmedic, llvm-commits
Differential Revision: http://reviews.llvm.org/D5965
llvm-svn: 221052
FastISel has a fixed set of virtual functions that are overridden by the
tablegen-generated code for each target. These functions are distinguished by
the kinds of operands, e.g., register + immediate = "ri". The FastISel emitter
has been blindly emitting functions with different combinations of operand
kinds, even for combinations that are completely unused by FastISel, e.g.,
"fastEmit_rrr". Change to filter out functions that will be irrelevant for
FastISel and do not bother generating the code for them. Also add explicit
"override" keywords for the virtual functions that are overridden.
llvm-svn: 218838
No functionality change intended.
This implements Elena's idea to put the new additionalOperand outside the
switch to cover all cases
(http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140929/237763.html).
Note only nontrivial change is in MRMSrcMemFrm. This requires an inclusive
interval of [2, 4] because we have prefix-dependent *optional* immediate
operand.
llvm-svn: 218790
Summary:
The N32/N64 ABI's require that structs passed in registers are laid out
such that spilling the register with 'sd' places the struct at the lowest
address. For little endian this is trivial but for big-endian it requires
that structs are shifted into the upper bits of the register.
We also require that structs passed in registers have the 'inreg'
attribute for big-endian N32/N64 to work correctly. This is because the
tablegen-erated calling convention implementation only has access to the
lowered form of struct arguments (one or more integers of up to 64-bits
each) and is unable to determine the original type.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5286
llvm-svn: 218451
parsing (and latent bug in the instruction definitions).
This is effectively a revert of r136287 which tried to address
a specific and narrow case of immediate operands failing to be accepted
by x86 instructions with a pretty heavy hammer: it introduced a new kind
of operand that behaved differently. All of that is removed with this
commit, but the test cases are both preserved and enhanced.
The core problem that r136287 and this commit are trying to handle is
that gas accepts both of the following instructions:
insertps $192, %xmm0, %xmm1
insertps $-64, %xmm0, %xmm1
These will encode to the same byte sequence, with the immediate
occupying an 8-bit entry. The first form was fixed by r136287 but that
broke the prior handling of the second form! =[ Ironically, we would
still emit the second form in some cases and then be unable to
re-assemble the output.
The reason why the first instruction failed to be handled is because
prior to r136287 the operands ere marked 'i32i8imm' which forces them to
be sign-extenable. Clearly, that won't work for 192 in a single byte.
However, making thim zero-extended or "unsigned" doesn't really address
the core issue either because it breaks negative immediates. The correct
fix is to make these operands 'i8imm' reflecting that they can be either
signed or unsigned but must be 8-bit immediates. This patch backs out
r136287 and then changes those places as well as some others to use
'i8imm' rather than one of the extended variants.
Naturally, this broke something else. The custom DAG nodes had to be
updated to have a much more accurate type constraint of an i8 node, and
a bunch of Pat immediates needed to be specified as i8 values.
The fallout didn't end there though. We also then ceased to be able to
match the instruction-specific intrinsics to the instructions so
modified. Digging, this is because they too used i32 rather than i8 in
their signature. So I've also switched those intrinsics to i8 arguments
in line with the instructions.
In order to make the intrinsic adjustments of course, I also had to add
auto upgrading for the intrinsics.
I suspect that the intrinsic argument types may have led everything down
this rabbit hole. Pretty happy with the result.
llvm-svn: 217310
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.
Reviewed by Eric
llvm-svn: 217075
This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour
Reviewed by Andy Trick and Chandler C
llvm-svn: 216919
This patch adds a new property: isInsertSubreg and the related target hooks:
TargetIntrInfo::getInsertSubregInputs and
TargetInstrInfo::getInsertSubregLikeInputs to specify that a target specific
instruction is a (kind of) INSERT_SUBREG.
The approach is similar to r215394.
<rdar://problem/12702965>
llvm-svn: 216139
This patch adds a new property: isExtractSubreg and the related target hooks:
TargetIntrInfo::getExtractSubregInputs and
TargetInstrInfo::getExtractSubregLikeInputs to specify that a target specific
instruction is a (kind of) EXTRACT_SUBREG.
The approach is similar to r215394.
<rdar://problem/12702965>
llvm-svn: 216130
ARM in particular is getting dangerously close to exceeding 32 bits worth of
possible subtarget features. When this happens, various parts of MC start to
fail inexplicably as masks get truncated to "unsigned".
Mostly just refactoring at present, and there's probably no way to test.
llvm-svn: 215887
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
This patch adds a new property: isRegSequence and the related target hooks:
TargetIntrInfo::getRegSequenceInputs and
TargetInstrInfo::getRegSequenceLikeInputs to specify that a target specific
instruction is a (kind of) REG_SEQUENCE.
<rdar://problem/12702965>
llvm-svn: 215394
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reverts commits r215111, 215115, 215116, 215117, 215136.
llvm-svn: 215154
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.
Thanks to Lang Hames for making MCJIT a good replacement!
llvm-svn: 215111
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.
llvm-svn: 214988
This is useful for cases when stand-alone patterns are preferred to the
patterns included in the instruction definitions. Instead of requiring
that stand-alone patterns set a larger AddedComplexity value, which
can be confusing to new developers, the allows us to reduce the
complexity of the included patterns to achieve the same result.
There will be test cases for this added to the R600 backend in a
future commit.
llvm-svn: 214466
This allows assembling the two new instructions, encls and enclu for the
SKX processor model.
Note the diffs are a bigger than what might think, but to fit the new
MRM_CF and MRM_D7 in things in the right places things had to be
renumbered and shuffled down causing a bit more diffs.
rdar://16228228
llvm-svn: 214460
address of the stack guard was being spilled to the stack.
Previously the address of the stack guard would get spilled to the stack if it
was impossible to keep it in a register. This patch introduces a new target
independent node and pseudo instruction which gets expanded post-RA to a
sequence of instructions that load the stack guard value. Register allocator
can now just remat the value when it can't keep it in a register.
<rdar://problem/12475629>
llvm-svn: 213967
file not in the test/ area). Backing out now so that this test isn't part of
the 3.5 branch.
Original commit message: "TableGen: Allow AddedComplexity values to be negative
[...]"
llvm-svn: 213596
This is useful for cases when stand-alone patterns are preferred to the
patterns included in the instruction definitions. Instead of requiring
that stand-alone patterns set a larger AddedComplexity value, which
can be confusing to new developers, the allows us to reduce the
complexity of the included patterns to achieve the same result.
llvm-svn: 213521
It's also possible to just write "= nullptr", but there's some question
of whether that's as readable, so I leave it up to authors to pick which
they prefer for now. If we want to discuss standardizing on one or the
other, we can do that at some point in the future.
llvm-svn: 213438
Speculative fix for a -Wframe-larger-than warning from gcc. Clang will
implicitly promote such constant arrays to globals, so in theory it
won't hit this.
llvm-svn: 213298
There are two parts here. First is to modify tablegen to adjust the encoding
type ENCODING_RM with the scaling factor.
The second is to use the new encoding types to compute the correct
displacement in the decoder.
Fixes <rdar://problem/17608489>
llvm-svn: 213281
Refactoring; no functional changes intended
Removed PostRAScheduler bits from subtargets (X86, ARM).
Added PostRAScheduler bit to MCSchedModel class.
This bit is set by a CPU's scheduling model (if it exists).
Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses.
Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!).
Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling.
Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values.
Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86:
a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget.
b. MIPS overrides the CPU's postRA settings by enabling postRA for everything.
c. PPC overrides the CPU's postRA settings by enabling postRA for everything.
d. X86 is the only target that actually has postRA specified via sched model info.
Differential Revision: http://reviews.llvm.org/D4217
llvm-svn: 213101
Use 0 for the invalid buffer instead of -1/~0 and switch to unsigned
representation to enable more idiomatic usage.
Also introduce a trivial SourceMgr::getMainFileID() instead of hard-coding 0/1
to identify the main file.
llvm-svn: 212398
Add MSBuiltin which is similar in vein to GCCBuiltin. This allows for adding
intrinsics for Microsoft compatibility to individual instructions. This is
needed to permit the creation of ARM specific MSVC extensions.
This is not currently in use, and requires an associated change in clang to
enable use of the intrinsics defined by this new class. This merely sets the
LLVM portion of the infrastructure in place to permit the use of this
functionality. A separate set of changes will enable the new intrinsics.
llvm-svn: 212350
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
llvm-svn: 211749
inverted condition codes (CINC, CINV, CNEG, CSET, and CSETM).
Matching aliases based on "immediate classes", when disassembling,
wasn't previously supported, hence adding MCOperandPredicate
into class Operand, and implementing the support for it
in AsmWriterEmitter.
The parsing for those aliases was already custom, so just adding
the missing condition into AArch64AsmParser::parseCondCode.
llvm-svn: 210528
I saw at least a memory leak or two from inspection (on probably
untested error paths) and r206991, which was the original inspiration
for this change.
I ran this idea by Jim Grosbach a few weeks ago & he was OK with it.
Since it's a basically mechanical patch that seemed sufficient - usual
post-commit review, revert, etc, as needed.
llvm-svn: 210427
This changes ARM64 to use separate operands for each component of an
address, and look for separate '[', '$Rn, ..., ']' tokens when
parsing.
This allows us to do away with quite a bit of special C++ code to
handle monolithic "addressing modes" in the MC components. The more
incremental matching of the assembler operands also allows for better
diagnostics when LLVM is presented with invalid input.
Most of the complexity here is with the register-offset instructions,
which were extremely dodgy beforehand: even when the instruction used
wM, LLVM's model had xM as an operand. We papered over this
discrepancy before, but that approach doesn't work now so I split them
into separate X and W variants.
llvm-svn: 209425
Summary:
The minimal type needs to hold a value of '1ULL << 31' but
getMinimalTypeForRange() is called with a value of '1ULL << 32'.
This patch will also reduce the size of the matcher table when there are 8
or 16 SubtargetFeatures.
Also added a dump of the SubtargetFeatures to the -debug output and corrected getMinimalTypeInRange() to consider 0xffffffffull to be a 32-bit value.
The testcase is that no existing code is broken and that LLVM still successfully
compiles after adding MIPS64r6 CodeGen support.
Reviewers: rafael
Reviewed By: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3787
llvm-svn: 209288
This allows the results of a ComplexPattern check to be distributed to separate
named Operands, instead of the current system where all results must apply (and
match perfectly) with a single Operand.
For example, if "some_addrmode" is a ComplexPattern producing two results, you
can write:
def : Pat<(load (some_addrmode GPR64:$base, imm:$offset)),
(INST GPR64:$base, imm:$offset)>;
This should allow neater instruction definitions in TableGen that don't put all
possible aspects of addressing into a single operand, but are still usable with
relatively simple C++ CodeGen idioms.
llvm-svn: 209206
When multiple aliases overlap, the correct string to print can often be
determined purely by considering the InstAlias declarations in some particular
order. This allows the user to specify that order manually when desired,
without resorting to hacking around with the default lexicographical order on
Record instantiation, which is error-prone and ugly.
I was also mistaken about "add w2, w3, w4" being the same as "add w2, w3, w4,
uxtw". That's only true if Rn is the stack pointer.
llvm-svn: 209199
TableGen has a fairly dubious heuristic to decide whether an alias should be
printed: does the alias have lest operands than the real instruction. This is
bad enough (particularly with no way to override it), but it should at least be
calculated consistently for both strings.
This patch implements that logic: first get the *correct* string for the
variant, in the same way as the Matcher, without guessing; then count the
number of whitespace chars.
There are basically 4 changes this brings about after the previous
commits; all of these appear to be good, so I have changed the tests:
+ ARM64: we print "neg X, Y" instead of "sub X, xzr, Y".
+ ARM64: we skip implicit "uxtx" and "uxtw" modifiers.
+ Sparc: we print "mov A, B" instead of "or %g0, A, B".
+ Sparc: we print "fcmpX A, B" instead of "fcmpX %fcc0, A, B"
llvm-svn: 208969
Previously, TableGen assumed that every aliased operand consumed precisely 1
MachineInstr slot (this was reasonable because until a couple of days ago,
nothing more complicated was eligible for printing).
This allows a couple more ARM64 aliases to print so we can remove the special
code.
On the X86 side, I've gone for explicit AT&T size specifiers as the default, so
turned off a few of the aliases that would have just started printing.
llvm-svn: 208880
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).
This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some
unrolling).
- For AMD cores (because they still currently use the SB scheduling model)
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.
llvm-svn: 208289
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.
This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:
- Header files that need to provide a DEBUG_TYPE for some inline code
can do so by defining the macro before their inline code and undef-ing
it afterward so the macro does not escape.
- We no longer have rampant ODR violations due to including headers with
different DEBUG_TYPE definitions. This may be mostly an academic
violation today, but with modules these types of violations are easy
to check for and potentially very relevant.
Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.
The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.
llvm-svn: 206822
Removes some extra manual dynamic memory allocation/management. It does
get a bit quirky having to make State's members mutable and
pointers/references to const rather than non-const, but that's a
necessary workaround to dealing with the std::set elements.
llvm-svn: 206807
entirely clear whether this should be valid with modules enabled, but the fixed
code is cleaner regardless.
Also fix a TU-local type that accidentally had external linkage.
llvm-svn: 206714
This is like the LLVMMatchType, except the verifier checks that the
second argument is a vector with the same base type and half the
number of elements.
This will be used by the ARM64 backend.
llvm-svn: 205079
These are used in the ARM backends to aid type-checking on patterns involving
intrinsics. By making sure one argument is an extended/truncated version of
another.
However, there's no reason to limit them to just vectors types. For example
AArch64 has the instruction "uqshrn sD, dN, #imm" which would naturally use an
intrinsic taking an i64 and returning an i32.
llvm-svn: 205003
When an instruction's operand list does not have a sufficient number of
operands to match with all of the variables that contribute to its
encoding, instead of asserting inside a call to getSubOperandNumber, produce an
informative error.
llvm-svn: 204542
The "noduplicate" function attribute exists to prevent certain optimizations
from duplicating calls to the function. This is important on platforms where
certain function call duplications are unsafe (for example execution barriers
for CUDA and OpenCL).
This patch makes it possible to specify intrinsics as "noduplicate" and
translates that to the appropriate function attribute.
llvm-svn: 204200
Utilize the previous move of MVT to a separate header for all trivial
cases (that don't need any further restructuring).
Reviewed By: Tim Northover
llvm-svn: 204003
There are currently two schemes for mapping instruction operands to
instruction-format variables for generating the instruction encoders and
decoders for the assembler and disassembler respectively: a) to map by name and
b) to map by position.
In the long run, we'd like to remove the position-based scheme and use only
name-based mapping. Unfortunately, the name-based scheme currently cannot deal
with complex operands (those with suboperands), and so we currently must use
the position-based scheme for those. On the other hand, the position-based
scheme cannot deal with (register) variables that are split into multiple
ranges. An upcoming commit to the PowerPC backend (adding VSX support) will
require this capability. While we could teach the position-based scheme to
handle that, since we'd like to move away from the position-based mapping
generally, it seems silly to teach it new tricks now. What makes more sense is
to allow for partial transitioning: use the name-based mapping when possible,
and only use the position-based scheme when necessary.
Now the problem is that mixing the two sensibly was not possible: the
position-based mapping would map based on position, but would not skip those
variables that were mapped by name. Instead, the two sets of assignments would
overlap. However, I cannot currently change the current behavior, because there
are some backends that rely on it [I think mistakenly, but I'll send a message
to llvmdev about that]. So I've added a new TableGen bit variable:
noNamedPositionallyEncodedOperands, that can be used to cause the
position-based mapping to skip variables mapped by name.
llvm-svn: 203767
"ProcResource def is not included in the ProcResources".
Some of the machine model definitions were not added to the
processor's list used for diagnostics and error checking.
llvm-svn: 203749
The old system was fairly convoluted:
* A temporary label was created.
* A single PROLOG_LABEL was created with it.
* A few MCCFIInstructions were created with the same label.
The semantics were that the cfi instructions were mapped to the PROLOG_LABEL
via the temporary label. The output position was that of the PROLOG_LABEL.
The temporary label itself was used only for doing the mapping.
The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to
one by holding an index into the CFI instructions of this function.
I did consider removing MMI.getFrameInstructions completelly and having
CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non
trivial constructors and destructors and are somewhat big, so the this setup
is probably better.
The net result is that we don't create temporary labels that are never used.
llvm-svn: 203204
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
llvm-svn: 203083
Unfortunately, it is currently impossible to use a PatFrag as part of an output
pattern (the part of the pattern that has instructions in it) in TableGen.
Looking at the current implementation, this was clearly intended to work (there
is already code in place to expand patterns in the output DAG), but is
currently broken by the baked-in type-checking assumption and the order in which
the pattern fragments are processed (output pattern fragments need to be
processed after the instruction definitions are processed).
Fixing this is fairly simple, but requires some way of differentiating output
patterns from the existing input patterns. The simplest way to handle this
seems to be to create a subclass of PatFrag, and so that's what I've done here.
As a simple example, this allows us to write:
def crnot : OutPatFrag<(ops node:$in),
(CRNOR $in, $in)>;
def : Pat<(not i1:$in),
(crnot $in)>;
which captures the core use case: handling of repeated subexpressions inside
of complicated output patterns.
This will be used by an upcoming commit to the PowerPC backend.
llvm-svn: 202450
should not be marked nounwind.
Marking them nounwind caused crashes in the WebKit FTL JIT, because if we enable
sufficient optimizations, LLVM starts eliding compact_unwind sections (or any unwind
data for that matter), making deoptimization via stackmaps impossible.
This changes the stackmap intrinsic to be may-throw, adds a test for exactly the
sympton that WebKit saw, and fixes TableGen to handle un-attributed intrinsics.
Thanks to atrick and philipreames for reviewing this.
llvm-svn: 201826
Original commits messages:
Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code.
Simplify a bunch of code by removing the need for the x86 disassembler table builder to know about extended opcodes. The modrm forms are sufficient to convey the information.
llvm-svn: 201065
r201059 appears to cause a crash in a bootstrapped build of clang. Craig
isn't available to look at it right now, so I'm reverting it while he
investigates.
llvm-svn: 201064
According to the AAPCS, when a CPRC is allocated to the stack, all other
VFP registers should be marked as unavailable.
I have also modified the rules for allocating non-CPRCs to the stack, to make
it more explicit that all GPRs must be made unavailable. I cannot think of a
case where the old version would produce incorrect answers, so there is no test
for this.
llvm-svn: 200970
The addition of IC_OPSIZE_ADSIZE in r198759 wasn't quite complete. It
also turns out to have been unnecessary. The disassembler handles the
AdSize prefix for itself, and doesn't care about the difference between
(e.g.) MOV8ao8 and MOB8ao8_16 definitions. So just let them coexist and
don't worry about it.
llvm-svn: 199654
promotion code, Tablegen will now select FPExt for floating point promotions
(previously it had returned AExt, which is not valid for floating point types).
Any out-of-tree targets that were relying on AExt being returned for FP
promotions will need to update their code check for FPExt instead.
llvm-svn: 199252
To declare or define reserved identifers is undefined behaviour in standard
C++. This needs to be addressed in compiler-rt before it can be used in LLVM.
See the list discussion for details.
This reverts commit r198858.
llvm-svn: 198884
It seems there is no separate instruction class for having AdSize *and*
OpSize bits set, which is required in order to disambiguate between all
these instructions. So add that to the disassembler.
Hm, perhaps we do need an AdSize16 bit after all?
llvm-svn: 198759
A ValueType in a pattern dag is a type cast, and GetNumNodeResults should
handle it (the type cast has only one result).
This comes up, for example, during the type checking of pattern fragments, for
example, AArch64's Neon_combine_2d fragment is:
dag Operands = (ops node:$Rm, node:$Rn);
dag Fragment = (v2f64 (concat_vectors (v1f64 node:$Rm), (v1f64 node:$Rn)));
llvm-svn: 198347
That's what it actually means, and with 16-bit support it's going to be
a little more relevant since in a few corner cases we may actually want
to distinguish between 16-bit and 32-bit mode (for example the bare 'push'
aliases to pushw/pushl etc.)
Patch by David Woodhouse
llvm-svn: 197768
Unfortunately, the PowerPC instruction definitions make heavy use of the
positional operand encoding heuristic to map operands onto bitfield variables
in the instruction definitions. Changing this to use name-based mapping is not
trivial, however, because additional infrastructure needs to be designed to
handle mapping of complex operands (with multiple suboperands) onto multiple
bitfield variables.
In the mean time, this adds support for positionally encoded operands to
FixedLenDecoderEmitter, so that we can generate a disassembler for the PowerPC
backend. To prevent an accidental reliance on this feature, and to prevent an
undesirable interaction with existing disassemblers, a backend must opt-in to
this support by setting the new decodePositionallyEncodedOperands
instruction-set bit to true.
When enabled, this iterates the variables that contribute to the instruction
encoding, just as the encoder does, and emulates the procedure the encoder uses
to map "numbered" operands to variables. The bit range for each variable is
also determined as the encoder determines them. This map is then consulted
during the decoder-generator's loop over operands to decode, allowing the
decoder to understand both position-based and name-based operand-to-variable
mappings.
As noted in the comment on the decodePositionallyEncodedOperands definition,
this support should be removed once it is no longer needed. There should be no
change to existing disassemblers.
llvm-svn: 197691
This is more prep for adding the PowerPC disassembler. FixedLenDecoderEmitter
should recognize PointerLikeRegClass operands as register types, and generate
register-like decoding calls instead of treating them like immediates.
llvm-svn: 197680
The convention used to specify the PowerPC ISA is that bits are numbered in
reverse order (0 is the index of the high bit). To support this "little endian"
encoding convention, CodeEmitterGen will reverse the bit numberings prior to
generating the encoding tables. In order to generate a disassembler,
FixedLenDecoderEmitter needs to do the same.
This moves the bit reversal logic out of CodeEmitterGen and into CodeGenTarget
(where it can be used by both CodeEmitterGen and FixedLenDecoderEmitter). This
is prep work for disassembly support in the PPC backend (which is the only
in-tree user of this little-endian encoding support).
llvm-svn: 197532
Added scalar compare VCMPSS, VCMPSD.
Implemented LowerSELECT for scalar FP operations.
I replaced FSETCCss, FSETCCsd with one node type FSETCCs.
Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1.
llvm-svn: 197384
This patch places class definitions in implementation files into anonymous
namespaces to prevent weak vtables. This eliminates the need of providing an
out-of-line definition to pin the vtable explicitly to the file.
llvm-svn: 195092
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 195064
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
Base *foo = new Child();
delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.
llvm-svn: 194997
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 194865
These used to be referenced by the CGI->AWI map (in AsmWriterEmitter), but
stored in a vector local to EmitPrintInstruction. Move the vector to
AsmWriterEmitter too.
llvm-svn: 193525
The old code skipped one of the sorting criteria if either pattern had
no types. This could lead to cycles of the form X < Y, Y < Z, Z < X.
llvm-svn: 191735
Add VEX_LIG to scalar FMA4 instructions.
Use VEX_LIG in some of the inheriting checks in disassembler table generator.
Make use of VEX_L_W, VEX_L_W_XS, VEX_L_W_XD contexts.
Don't let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from their non-L forms unless VEX_LIG is set.
Let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from all of their non-L or non-W cases.
Increase ranking on VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE so they get chosen over non-L/non-W forms.
llvm-svn: 191649
Ideally, the machinel model is added at the time the instructions are
defined. But many instructions in X86InstrSSE.td still need a model.
Without this workaround the scheduler asserts because x86 already has
itinerary classes for these instructions, indicating they should be
modeled by the scheduler. Since we use the new machine model for other
instructions, it expects a new machine model for these too.
llvm-svn: 191391
Patch by Ana Pazos.
1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.
llvm-svn: 191263
This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.
llvm-svn: 191175
TableGen was sorting the entries in some of its internal data
structures by pointer. This order filtered through to the final
matching table and affected the diagnostics produced on bad assembly
occasionally.
It also turns out STL algorithms are ridiculously easy to misuse on
containers with custom order methods. (No bugs before, or now that I
know of, but plenty in the middle).
This should fix the sanitizer bot, which ends up with weird pointers.
llvm-svn: 190793
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.
The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
ComplexDeprecationPredicate<"MCR">
would mean you would have to define the following function:
bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info)
Which returns 'false' for not deprecated, and 'true' for deprecated
and store the warning message in 'Info'.
The MCTargetAsmParser constructor was chaned to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.
llvm-svn: 190598
Link.exe's command line options are case-insensitive. This patch
adds a new attribute to OptTable to let the option parser to compare
options, ignoring case.
Command lines are generally case-insensitive on Windows. CL.exe is an
exception. So this new attribute should be useful for other commands
running on Windows.
Differential Revision: http://llvm-reviews.chandlerc.com/D1485
llvm-svn: 189416
This field specifies registers that are preserved across function calls,
but that should not be included in the generates SaveList array.
This can be used ot generate regmasks for architectures that save
registers through other means, like SPARC's register windows.
llvm-svn: 189084
Back in the mists of time (2008), it seems TableGen couldn't handle the
patterns necessary to match ARM's CMOV node that we convert select operations
to, so we wrote a lot of fairly hairy C++ to do it for us.
TableGen can deal with it now: there were a few minor differences to CodeGen
(see tests), but nothing obviously worse that I could see, so we should
probably address anything that *does* come up in a localised manner.
llvm-svn: 188995
clang bootstraps intermittently failed for me due a difference in
the MCK_Reg ordering in ARMGenAsmMatcher.inc. E.g. in my latest
run the stage 1 and stage 3 versions were the same but the stage 2
one was different (though still functionally correct). This meant
that the .o comparison failed.
MCK_Regs were assigned by iterating over a std::set< std::set<Record*> >,
and since std::set is sorted lexicographically, the order depended on the
order of the pointer values. This patch replaces the pointer ordering
with LessRecordByID.
llvm-svn: 188164
LLVM's coding standards recommend raw_ostream and MemoryBuffer for
reading and writing text.
This has the side effect of allowing clang to compile more of Support
and TableGen in the Microsoft C++ ABI.
llvm-svn: 187826
This makes option aliases more powerful by enabling them to
pass along arguments to the option they're aliasing.
For example, if we have a joined option "-foo=", we can now
specify a flag option "-bar" to be an alias of that, with the
argument "baz".
This is especially useful for the cl.exe compatible clang driver,
where many options are aliases. For example, this patch enables
us to alias "/Ox" to "-O3" (-O is a joined option), and "/WX" to
"-Werror" (again, -W is a joined option).
Differential Revision: http://llvm-reviews.chandlerc.com/D1245
llvm-svn: 187537
For two intrinsics 'llvm.nvvm.texsurf.handle' and 'llvm.nvvm.texsurf.handle.internal',
TableGen was emitting matching code like:
if (Name.startswith("llvm.nvvm.texsurf.handle")) ...
if (Name.startswith("llvm.nvvm.texsurf.handle.internal")) ...
We can never match "llvm.nvvm.texsurf.handle.internal" here because it will
always be erroneously matched by the first condition.
The fix is to sort the intrinsic names and emit them in reverse order.
llvm-svn: 187119
This removes the need to store the asm variant in each row of the single table that existed before. Shaves ~16K off the size of X86AsmParser.o.
llvm-svn: 187026
algorithm when assigning EnumValues to the synthesized registers.
The current algorithm, LessRecord, uses the StringRef compare_numeric
function. This function compares strings, while handling embedded numbers.
For example, the R600 backend registers are sorted as follows:
T1
T1_W
T1_X
T1_XYZW
T1_Y
T1_Z
T2
T2_W
T2_X
T2_XYZW
T2_Y
T2_Z
In this example, the 'scaling factor' is dEnum/dN = 6 because T0, T1, T2
have an EnumValue offset of 6 from one another. However, in other parts
of the register bank, the scaling factors are different:
dEnum/dN = 5:
KC0_128_W
KC0_128_X
KC0_128_XYZW
KC0_128_Y
KC0_128_Z
KC0_129_W
KC0_129_X
KC0_129_XYZW
KC0_129_Y
KC0_129_Z
The diff lists do not work correctly because different kinds of registers have
different 'scaling factors'. This new algorithm, LessRecordRegister, tries to
enforce a scaling factor of 1. For example, the registers are now sorted as
follows:
T1
T2
T3
...
T0_W
T1_W
T2_W
...
T0_X
T1_X
T2_X
...
KC0_128_W
KC0_129_W
KC0_130_W
...
For the Mips and R600 I see a 19% and 6% reduction in size, respectively. I
did see a few small regressions, but the differences were on the order of a
few bytes (e.g., AArch64 was 16 bytes). I suspect there will be even
greater wins for targets with larger register files.
Patch reviewed by Jakob.
rdar://14006013
llvm-svn: 185094
This patch modifies TableGen to generate a function in
${TARGET}GenInstrInfo.inc called getNamedOperandIdx(), which can be used
to look up indices for operands based on their names.
In order to activate this feature for an instruction, you must set the
UseNamedOperandTable bit.
For example, if you have an instruction like:
def ADD : TargetInstr <(outs GPR:$dst), (ins GPR:$src0, GPR:$src1)>;
You can look up the operand indices using the new function, like this:
Target::getNamedOperandIdx(Target::ADD, Target::OpName::dst) => 0
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src0) => 1
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src1) => 2
The operand names are case sensitive, so $dst and $DST are considered
different operands.
This change is useful for R600 which has instructions with a large number
of operands, many of which model single bit instruction configuration
values. These configuration bits are common across most instructions,
but may have a different operand index depending on the instruction type.
It is useful to have a convenient way to look up the operand indices,
so these bits can be generically set on any instruction.
llvm-svn: 184879
For decoding, keep the current behavior of always decoding these as their REP
versions. In the future, this could be improved to recognize the cases where
these behave as XACQUIRE and XRELEASE and decode them as such.
llvm-svn: 184207
Replace the ill-defined MinLatency and ILPWindow properties with
with straightforward buffer sizes:
MCSchedMode::MicroOpBufferSize
MCProcResourceDesc::BufferSize
These can be used to more precisely model instruction execution if desired.
Disabled some misched tests temporarily. They'll be reenabled in a few commits.
llvm-svn: 184032
The element passed to push_back is not copied before the vector reallocates.
The client needs to copy the element first before passing it to push_back.
No test case, will be tested by follow-up swift scheduler model change (it
segfaults without this change).
llvm-svn: 183459
Don't output data if we are supposed to ignore the record.
Reapply of 183255, I don't think this was causing the tablegen segfault on linux
testers.
llvm-svn: 183311
This fixes some of the ridiculously complex code for optimizing the
machine model tables that are shared among all processors of a given
target. A9 and Swift both use the "special" feature that maps old
itinerary classes to new machine model defs. They map different
overlapping subsets of instructions, which wasn't handled correctly.
llvm-svn: 183302
NOTE: If this broke your out-of-tree backend, in *RegisterInfo.td, change
the instances of SubRegIndex that have a comps template arg to use the
ComposedSubRegIndex class instead.
In TableGen land, this adds Size and Offset attributes to SubRegIndex,
and the ComposedSubRegIndex class, for which the Size and Offset are
computed by TableGen. This also adds an accessor in MCRegisterInfo, and
Size/Offsets for the X86 and ARM subreg indices.
llvm-svn: 183020
The size reduction in the RegDiffLists are rather dramatic. Here are a few
size differences for MCTargetDesc.o files (before and after) in bytes:
R600 - 36160B - 11184B - 69% reduction
ARM - 28480B - 8368B - 71% reduction
Mips - 816B - 576B - 29% reduction
One side effect of dynamically computing the aliases is that the iterator does
not guarantee that the entries are ordered or that duplicates have been removed.
The documentation implies this is a safe assumption and I found no clients that
requires these attributes (i.e., strict ordering and uniqueness).
My local LNT tester results showed no execution-time failures or significant
compile-time regressions (i.e., beyond what I would consider noise) for -O0g,
-O2 and -O3 runs on x86_64 and i386 configurations.
rdar://12906217
llvm-svn: 182783
Currently the fast-isel table generator recognizes registers, register
classes, and immediates for source pattern operands. ValueType
operands are not recognized. This is not a problem for existing
targets with fast-isel support, but will not work for targets like
PowerPC and SPARC that use types in source patterns.
The proposed patch allows ValueType operands and treats them in the
same manner as register classes. There is no convenient way to map
from a ValueType to a register class, but there's no need to do so.
The table generator already requires that all types in the source
pattern be identical, and we know the register class of the output
operand already. So we just assign that register class to any
ValueType operands we encounter.
No functional effect on existing targets. Testing deferred until the
PowerPC target implements fast-isel.
llvm-svn: 182512
This lane mask provides information about which register lanes
completely cover super-registers. See the block comment before
getCoveringLanes().
llvm-svn: 182034
The problem this patch addresses is the handling of register tie
constraints in AsmMatcherEmitter, where one operand is tied to a
sub-operand of another operand. The typical scenario for this to
happen is the tie between the "write-back" register of a pre-inc
instruction, and the base register sub-operand of the memory address
operand of that instruction.
The current AsmMatcherEmitter code attempts to handle tied
operands by emitting the operand as usual first, and emitting
a CVT_Tied node when handling the second (tied) operand. However,
this really only works correctly if the tied operand does not
have sub-operands (and isn't a sub-operand itself). Under those
circumstances, a wrong MC operand list is generated.
In discussions with Jim Grosbach, it turned out that the MC operand
list really ought not to contain tied operands in the first place;
instead, it ought to consist of exactly those operands that are
named in the AsmString. However, getting there requires significant
rework of (some) targets.
This patch fixes the immediate problem, and at the same time makes
one (small) step in the direction of the long-term solution, by
implementing two changes:
1. Restricts the AsmMatcherEmitter handling of tied operands to
apply solely to simple operands (not complex operands or
sub-operand of such).
This means that at least we don't get silently corrupt MC operand
lists as output. However, if we do have tied sub-operands, they
would now no longer be handled at all, except for:
2. If we have an operand that does not occur in the AsmString,
and also isn't handled as tied operand, simply emit a dummy
MC operand (constant 0).
This works as long as target code never attempts to access
MC operands that do no not occur in the AsmString (and are
not tied simple operands), which happens to be the case for
all targets where this situation can occur (ARM and PowerPC).
[ Note that this change means that many of the ARM custom
converters are now superfluous, since the implement the
same "hack" now performed already by common code. ]
Longer term, we ought to fix targets to never access *any*
MC operand that does not occur in the AsmString (including
tied simple operands), and then finally completely remove
all such operands from the MC operand list.
Patch approved by Jim Grosbach.
llvm-svn: 180677
Super-resources and resource groups are two ways of expressing
overlapping sets of processor resources. Now we generate table entries
the same way for both so the scheduler never needs to explicitly check
for super-resources.
llvm-svn: 180162
variant/dialect. Addresses a FIXME in the emitMnemonicAliases function.
Use and test case to come shortly.
rdar://13688439 and part of PR13340.
llvm-svn: 179804
As these two instructions in AVX extension are privileged instructions for
special purpose, it's only expected to be used in inlined assembly.
llvm-svn: 179266
A9 uses itinerary classes, Swift uses RW lists. This tripped some
verification when we're expanding variants. I had to refine the
verification a bit.
llvm-svn: 178357
This syntax is now preferred:
def : Pat<(subc i32:$b, i32:$c), (SUBCCrr $b, $c)>;
There is no reason to repeat the types in the output pattern.
llvm-svn: 177844
This makes it possible to define instruction patterns like this:
def LDri : F3_2<3, 0b000000,
(outs IntRegs:$dst), (ins MEMri:$addr),
"ld [$addr], $dst",
[(set i32:$dst, (load ADDRri:$addr))]>;
~~~
llvm-svn: 177834
Just like register classes, value types can be used in two ways in
patterns:
(sext_inreg i32:$src, i16)
In a named leaf node like i32:$src, the value type simply provides the
type of the node directly. This simplifies type inference a lot compared
to the current practice of specifiying types indirectly with register
classes.
As an unnamed leaf node, like i16 above, the value type represents
itself as an MVT::Other immediate.
llvm-svn: 177828
A register class can appear as a leaf TreePatternNode with and without a
name:
(COPY_TO_REGCLASS GPR:$src, F8RC)
In a named leaf node like GPR:$src, the register class provides type
information for the named variable represented by the node. The TypeSet
for such a node is the set of value types that the register class can
represent.
In an unnamed leaf node like F8RC above, the register class represents
itself as a kind of immediate. Such a node has the type MVT::i32,
we'll never create a virtual register representing it.
This change makes it possible to remove the special handling of
COPY_TO_REGCLASS in CodeGenDAGPatterns.cpp.
llvm-svn: 177825
To use this in conjunction with exuberant ctags to generate a single
combined tags file, run tblgen first and then
$ ctags --append [...]
Since some identifiers have corresponding definitions in C++ code,
it can be useful (if using vim) to also use cscope, and
:set cscopetagorder=1
so that
:tag X
will preferentially select the tablegen symbol, while
:cscope find g X
will always find the C++ symbol.
Patch by Kevin Schoedel!
(a couple small formatting changes courtesy of clang-format)
llvm-svn: 177682
of complex instruction operands (e.g. address modes).
Currently, if a Pat pattern creates an instruction that has a complex
operand (i.e. one that consists of multiple sub-operands at the MI
level), this operand must match a ComplexPattern DAG pattern with the
correct number of output operands.
This commit extends TableGen to alternatively allow match a complex
operands against multiple separate operands at the DAG level.
This allows using Pat patterns to match pre-increment nodes like
pre_store (which must have separate operands at the DAG level) onto
an instruction pattern that uses a multi-operand memory operand,
like the following example on PowerPC (will be committed as a
follow-on patch):
def STWU : DForm_1<37, (outs ptr_rc:$ea_res), (ins GPRC:$rS, memri:$dst),
"stwu $rS, $dst", LdStStoreUpd, []>,
RegConstraint<"$dst.reg = $ea_res">, NoEncode<"$ea_res">;
def : Pat<(pre_store GPRC:$rS, ptr_rc:$ptrreg, iaddroff:$ptroff),
(STWU GPRC:$rS, iaddroff:$ptroff, ptr_rc:$ptrreg)>;
Here, the pair of "ptroff" and "ptrreg" operands is matched onto the
complex operand "dst" of class "memri" in the "STWU" instruction.
Approved by Jakob Stoklund Olesen.
llvm-svn: 177428
Properly handle cases where a group of instructions have different
SchedRW lists with the same itinerary class.
This was supposed to work, but I left in an early break.
llvm-svn: 177317
We always supported a mixture of the old itinerary model and new
per-operand model, but it required a level of indirection to map
itinerary classes to SchedRW lists. This was done for ARM A9.
Now we want to define x86 SchedRW lists, with the goal of removing its
itinerary classes, but still support the itineraries in the mean
time. When I original developed the model, Atom did not have
itineraries, so there was no reason to expect this requirement.
llvm-svn: 177226
Don't require instructions to inherit Sched<...>. Sometimes it is more
convenient to say:
let SchedRW = ... in {
...
}
Which is now possible.
llvm-svn: 177199
This allows abitrary groups of processor resources. Using something in
a subset automatically counts againts the superset. Currently, this
only works if the superset is also a ProcResGroup as opposed to a
SuperUnit.
This allows SandyBridge to be expressed naturally, which will be
checked in shortly.
def SBPort01 : ProcResGroup<[SBPort0, SBPort1]>;
def SBPort15 : ProcResGroup<[SBPort1, SBPort5]>;
def SBPort23 : ProcResGroup<[SBPort2, SBPort3]>;
def SBPort015 : ProcResGroup<[SBPort0, SBPort1, SBPort5]>;
llvm-svn: 177112
Fix the way resources are counted. I'm taking some time to cleanup the
way MachineScheduler handles in-order machine resources. Eventually
we'll need more PPC/Atom test cases in tree.
llvm-svn: 176390
For example, ARM has several instructions with a literal '#0' immediate in the syntax
that's not represented as an actual operand. The asm matcher is expected a token
operand, but the parser will have created an immediate operand. This is currently
handled by dedicated per-instruction C++ munging of the ParsedAsmOperand list, but
will be better handled by this hook.
llvm-svn: 174487
and enables the instruction printer to print aliased
instructions.
Due to usage of RegisterOperands a change in common
code (utils/TableGen/AsmWriterEmitter.cpp) is required
to get the correct register value if it is a RegisterOperand.
Contributer: Vladimir Medic
llvm-svn: 174358
Drive by fix. I noticed some missing logic that might bite future
users. This shouldn't affect the final output on currently modeled
targets.
llvm-svn: 174142
This patch adds support for AArch64 (ARM's 64-bit architecture) to
LLVM in the "experimental" category. Currently, it won't be built
unless requested explicitly.
This initial commit should have support for:
+ Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions
(except the late addition CRC instructions).
+ CodeGen features required for C++03 and C99.
+ Compilation for the "small" memory model: code+static data <
4GB.
+ Absolute and position-independent code.
+ GNU-style (i.e. "__thread") TLS.
+ Debugging information.
The principal omission, currently, is performance tuning.
This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.
Further reviews would be gratefully received.
llvm-svn: 174054
In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the
internals of the AttributeSet to outside users, which isn't goodness.
llvm-svn: 173606
// FIXME: Constraints are hard coded to 'm', but we need an 'r'
// constraint for addressof. This needs to be cleaned up!
Test cases are already in place. Specifically,
clang/test/CodeGen/ms-inline-asm.c t15(), t16(), and t24().
llvm-svn: 172569
The purpose of this patch is to allow PredicateMethods to be set to something
like "isUImm<8>", calling a C++ template method to reduce code duplication. For
this to work, the PredicateMethod must be mangled into a valid C++ identifier
for insertion into an enum.
llvm-svn: 172073
When processing possible aliases, TableGen assumes that if an operand *can* be
an immediate, then it always *will* be. This is incorrect for the AArch64
backend. This patch inserts a check in the generated code to make sure isImm is
true first.
llvm-svn: 171972
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.
I converted some in-order scheduling tests to A2. Hal is working on
more test cases.
llvm-svn: 171946
MC disassembler clients (LLDB) are interested in querying if an
instruction may affect control flow other than by virtue of being
an explicit branch instruction. For example, instructions which
write directly to the PC on some architectures.
llvm-svn: 170610
beyond array bounds.
No test case since I cannot reproduce an ICE with this bug. According
to Carlos -- the bug reporter -- a segfault occurs only when LLVM is
compiled with a specific version of GCC.
llvm-svn: 169783
This is much simpler to reason about, more efficient, and
fixes some corner cases involving implicit super-register defs.
Fixed rdar://12797931.
llvm-svn: 169425
At build-time register pressure was always computed in terms of
register units. But the compile-time API was expressed in terms of
register classes because it was intended for virtual registers (and
physical register units weren't yet used anywhere in codegen).
Now that the codegen uses physreg units consistently, prepare for
tracking register pressure also in terms of live units, not live
registers.
llvm-svn: 169360
When code deletes the context, the AttributeImpls that the AttrListPtr points to
are now invalid. Therefore, instead of keeping a separate managed static for the
AttrListPtrs that's reference counted, move it into the LLVMContext and delete
it when deleting the AttributeImpls.
llvm-svn: 168354
This patch replaces the hard coded GPR pair [R0, R1] of
Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with
even/odd GPRPair reg class.
Similar to the lowering of atomic_64 operation.
llvm-svn: 168207
- Add RTM code generation support throught 3 X86 intrinsics:
xbegin()/xend() to start/end a transaction region, and xabort() to abort a
tranaction region
llvm-svn: 167573
"../llvm-git/utils/TableGen/CodeGenSchedule.cpp", line 1594.12: 1540-0218 (S) The call does not match any parameter list for "operator+".
"../llvm-git/include/llvm/ADT/STLExtras.h", line 130.1: 1540-1283 (I) "template <class _Iterator, class Func> llvm::operator+(mapped_iterator<_Iterator,Func>::difference_type, const mapped_iterator<_Iterator,Func> &)" is not a viable candidate.
Patch by Kai.
llvm-svn: 167311
Explicitly allow composition of null sub-register indices, and handle
that common case in an inlinable stub.
Use a compressed table implementation instead of the previous nested
switches which generated pretty bad code.
llvm-svn: 167190