string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
llvm-svn: 211749
inverted condition codes (CINC, CINV, CNEG, CSET, and CSETM).
Matching aliases based on "immediate classes", when disassembling,
wasn't previously supported, hence adding MCOperandPredicate
into class Operand, and implementing the support for it
in AsmWriterEmitter.
The parsing for those aliases was already custom, so just adding
the missing condition into AArch64AsmParser::parseCondCode.
llvm-svn: 210528
I saw at least a memory leak or two from inspection (on probably
untested error paths) and r206991, which was the original inspiration
for this change.
I ran this idea by Jim Grosbach a few weeks ago & he was OK with it.
Since it's a basically mechanical patch that seemed sufficient - usual
post-commit review, revert, etc, as needed.
llvm-svn: 210427
This changes ARM64 to use separate operands for each component of an
address, and look for separate '[', '$Rn, ..., ']' tokens when
parsing.
This allows us to do away with quite a bit of special C++ code to
handle monolithic "addressing modes" in the MC components. The more
incremental matching of the assembler operands also allows for better
diagnostics when LLVM is presented with invalid input.
Most of the complexity here is with the register-offset instructions,
which were extremely dodgy beforehand: even when the instruction used
wM, LLVM's model had xM as an operand. We papered over this
discrepancy before, but that approach doesn't work now so I split them
into separate X and W variants.
llvm-svn: 209425
Summary:
The minimal type needs to hold a value of '1ULL << 31' but
getMinimalTypeForRange() is called with a value of '1ULL << 32'.
This patch will also reduce the size of the matcher table when there are 8
or 16 SubtargetFeatures.
Also added a dump of the SubtargetFeatures to the -debug output and corrected getMinimalTypeInRange() to consider 0xffffffffull to be a 32-bit value.
The testcase is that no existing code is broken and that LLVM still successfully
compiles after adding MIPS64r6 CodeGen support.
Reviewers: rafael
Reviewed By: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3787
llvm-svn: 209288
This allows the results of a ComplexPattern check to be distributed to separate
named Operands, instead of the current system where all results must apply (and
match perfectly) with a single Operand.
For example, if "some_addrmode" is a ComplexPattern producing two results, you
can write:
def : Pat<(load (some_addrmode GPR64:$base, imm:$offset)),
(INST GPR64:$base, imm:$offset)>;
This should allow neater instruction definitions in TableGen that don't put all
possible aspects of addressing into a single operand, but are still usable with
relatively simple C++ CodeGen idioms.
llvm-svn: 209206
When multiple aliases overlap, the correct string to print can often be
determined purely by considering the InstAlias declarations in some particular
order. This allows the user to specify that order manually when desired,
without resorting to hacking around with the default lexicographical order on
Record instantiation, which is error-prone and ugly.
I was also mistaken about "add w2, w3, w4" being the same as "add w2, w3, w4,
uxtw". That's only true if Rn is the stack pointer.
llvm-svn: 209199
TableGen has a fairly dubious heuristic to decide whether an alias should be
printed: does the alias have lest operands than the real instruction. This is
bad enough (particularly with no way to override it), but it should at least be
calculated consistently for both strings.
This patch implements that logic: first get the *correct* string for the
variant, in the same way as the Matcher, without guessing; then count the
number of whitespace chars.
There are basically 4 changes this brings about after the previous
commits; all of these appear to be good, so I have changed the tests:
+ ARM64: we print "neg X, Y" instead of "sub X, xzr, Y".
+ ARM64: we skip implicit "uxtx" and "uxtw" modifiers.
+ Sparc: we print "mov A, B" instead of "or %g0, A, B".
+ Sparc: we print "fcmpX A, B" instead of "fcmpX %fcc0, A, B"
llvm-svn: 208969
Previously, TableGen assumed that every aliased operand consumed precisely 1
MachineInstr slot (this was reasonable because until a couple of days ago,
nothing more complicated was eligible for printing).
This allows a couple more ARM64 aliases to print so we can remove the special
code.
On the X86 side, I've gone for explicit AT&T size specifiers as the default, so
turned off a few of the aliases that would have just started printing.
llvm-svn: 208880
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).
This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some
unrolling).
- For AMD cores (because they still currently use the SB scheduling model)
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.
llvm-svn: 208289
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.
This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:
- Header files that need to provide a DEBUG_TYPE for some inline code
can do so by defining the macro before their inline code and undef-ing
it afterward so the macro does not escape.
- We no longer have rampant ODR violations due to including headers with
different DEBUG_TYPE definitions. This may be mostly an academic
violation today, but with modules these types of violations are easy
to check for and potentially very relevant.
Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.
The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.
llvm-svn: 206822
Removes some extra manual dynamic memory allocation/management. It does
get a bit quirky having to make State's members mutable and
pointers/references to const rather than non-const, but that's a
necessary workaround to dealing with the std::set elements.
llvm-svn: 206807
entirely clear whether this should be valid with modules enabled, but the fixed
code is cleaner regardless.
Also fix a TU-local type that accidentally had external linkage.
llvm-svn: 206714
This is like the LLVMMatchType, except the verifier checks that the
second argument is a vector with the same base type and half the
number of elements.
This will be used by the ARM64 backend.
llvm-svn: 205079
These are used in the ARM backends to aid type-checking on patterns involving
intrinsics. By making sure one argument is an extended/truncated version of
another.
However, there's no reason to limit them to just vectors types. For example
AArch64 has the instruction "uqshrn sD, dN, #imm" which would naturally use an
intrinsic taking an i64 and returning an i32.
llvm-svn: 205003
When an instruction's operand list does not have a sufficient number of
operands to match with all of the variables that contribute to its
encoding, instead of asserting inside a call to getSubOperandNumber, produce an
informative error.
llvm-svn: 204542
The "noduplicate" function attribute exists to prevent certain optimizations
from duplicating calls to the function. This is important on platforms where
certain function call duplications are unsafe (for example execution barriers
for CUDA and OpenCL).
This patch makes it possible to specify intrinsics as "noduplicate" and
translates that to the appropriate function attribute.
llvm-svn: 204200
Utilize the previous move of MVT to a separate header for all trivial
cases (that don't need any further restructuring).
Reviewed By: Tim Northover
llvm-svn: 204003
There are currently two schemes for mapping instruction operands to
instruction-format variables for generating the instruction encoders and
decoders for the assembler and disassembler respectively: a) to map by name and
b) to map by position.
In the long run, we'd like to remove the position-based scheme and use only
name-based mapping. Unfortunately, the name-based scheme currently cannot deal
with complex operands (those with suboperands), and so we currently must use
the position-based scheme for those. On the other hand, the position-based
scheme cannot deal with (register) variables that are split into multiple
ranges. An upcoming commit to the PowerPC backend (adding VSX support) will
require this capability. While we could teach the position-based scheme to
handle that, since we'd like to move away from the position-based mapping
generally, it seems silly to teach it new tricks now. What makes more sense is
to allow for partial transitioning: use the name-based mapping when possible,
and only use the position-based scheme when necessary.
Now the problem is that mixing the two sensibly was not possible: the
position-based mapping would map based on position, but would not skip those
variables that were mapped by name. Instead, the two sets of assignments would
overlap. However, I cannot currently change the current behavior, because there
are some backends that rely on it [I think mistakenly, but I'll send a message
to llvmdev about that]. So I've added a new TableGen bit variable:
noNamedPositionallyEncodedOperands, that can be used to cause the
position-based mapping to skip variables mapped by name.
llvm-svn: 203767
"ProcResource def is not included in the ProcResources".
Some of the machine model definitions were not added to the
processor's list used for diagnostics and error checking.
llvm-svn: 203749
The old system was fairly convoluted:
* A temporary label was created.
* A single PROLOG_LABEL was created with it.
* A few MCCFIInstructions were created with the same label.
The semantics were that the cfi instructions were mapped to the PROLOG_LABEL
via the temporary label. The output position was that of the PROLOG_LABEL.
The temporary label itself was used only for doing the mapping.
The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to
one by holding an index into the CFI instructions of this function.
I did consider removing MMI.getFrameInstructions completelly and having
CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non
trivial constructors and destructors and are somewhat big, so the this setup
is probably better.
The net result is that we don't create temporary labels that are never used.
llvm-svn: 203204
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
llvm-svn: 203083
Unfortunately, it is currently impossible to use a PatFrag as part of an output
pattern (the part of the pattern that has instructions in it) in TableGen.
Looking at the current implementation, this was clearly intended to work (there
is already code in place to expand patterns in the output DAG), but is
currently broken by the baked-in type-checking assumption and the order in which
the pattern fragments are processed (output pattern fragments need to be
processed after the instruction definitions are processed).
Fixing this is fairly simple, but requires some way of differentiating output
patterns from the existing input patterns. The simplest way to handle this
seems to be to create a subclass of PatFrag, and so that's what I've done here.
As a simple example, this allows us to write:
def crnot : OutPatFrag<(ops node:$in),
(CRNOR $in, $in)>;
def : Pat<(not i1:$in),
(crnot $in)>;
which captures the core use case: handling of repeated subexpressions inside
of complicated output patterns.
This will be used by an upcoming commit to the PowerPC backend.
llvm-svn: 202450
should not be marked nounwind.
Marking them nounwind caused crashes in the WebKit FTL JIT, because if we enable
sufficient optimizations, LLVM starts eliding compact_unwind sections (or any unwind
data for that matter), making deoptimization via stackmaps impossible.
This changes the stackmap intrinsic to be may-throw, adds a test for exactly the
sympton that WebKit saw, and fixes TableGen to handle un-attributed intrinsics.
Thanks to atrick and philipreames for reviewing this.
llvm-svn: 201826
Original commits messages:
Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code.
Simplify a bunch of code by removing the need for the x86 disassembler table builder to know about extended opcodes. The modrm forms are sufficient to convey the information.
llvm-svn: 201065
r201059 appears to cause a crash in a bootstrapped build of clang. Craig
isn't available to look at it right now, so I'm reverting it while he
investigates.
llvm-svn: 201064
According to the AAPCS, when a CPRC is allocated to the stack, all other
VFP registers should be marked as unavailable.
I have also modified the rules for allocating non-CPRCs to the stack, to make
it more explicit that all GPRs must be made unavailable. I cannot think of a
case where the old version would produce incorrect answers, so there is no test
for this.
llvm-svn: 200970
The addition of IC_OPSIZE_ADSIZE in r198759 wasn't quite complete. It
also turns out to have been unnecessary. The disassembler handles the
AdSize prefix for itself, and doesn't care about the difference between
(e.g.) MOV8ao8 and MOB8ao8_16 definitions. So just let them coexist and
don't worry about it.
llvm-svn: 199654
promotion code, Tablegen will now select FPExt for floating point promotions
(previously it had returned AExt, which is not valid for floating point types).
Any out-of-tree targets that were relying on AExt being returned for FP
promotions will need to update their code check for FPExt instead.
llvm-svn: 199252
To declare or define reserved identifers is undefined behaviour in standard
C++. This needs to be addressed in compiler-rt before it can be used in LLVM.
See the list discussion for details.
This reverts commit r198858.
llvm-svn: 198884
It seems there is no separate instruction class for having AdSize *and*
OpSize bits set, which is required in order to disambiguate between all
these instructions. So add that to the disassembler.
Hm, perhaps we do need an AdSize16 bit after all?
llvm-svn: 198759
A ValueType in a pattern dag is a type cast, and GetNumNodeResults should
handle it (the type cast has only one result).
This comes up, for example, during the type checking of pattern fragments, for
example, AArch64's Neon_combine_2d fragment is:
dag Operands = (ops node:$Rm, node:$Rn);
dag Fragment = (v2f64 (concat_vectors (v1f64 node:$Rm), (v1f64 node:$Rn)));
llvm-svn: 198347
That's what it actually means, and with 16-bit support it's going to be
a little more relevant since in a few corner cases we may actually want
to distinguish between 16-bit and 32-bit mode (for example the bare 'push'
aliases to pushw/pushl etc.)
Patch by David Woodhouse
llvm-svn: 197768
Unfortunately, the PowerPC instruction definitions make heavy use of the
positional operand encoding heuristic to map operands onto bitfield variables
in the instruction definitions. Changing this to use name-based mapping is not
trivial, however, because additional infrastructure needs to be designed to
handle mapping of complex operands (with multiple suboperands) onto multiple
bitfield variables.
In the mean time, this adds support for positionally encoded operands to
FixedLenDecoderEmitter, so that we can generate a disassembler for the PowerPC
backend. To prevent an accidental reliance on this feature, and to prevent an
undesirable interaction with existing disassemblers, a backend must opt-in to
this support by setting the new decodePositionallyEncodedOperands
instruction-set bit to true.
When enabled, this iterates the variables that contribute to the instruction
encoding, just as the encoder does, and emulates the procedure the encoder uses
to map "numbered" operands to variables. The bit range for each variable is
also determined as the encoder determines them. This map is then consulted
during the decoder-generator's loop over operands to decode, allowing the
decoder to understand both position-based and name-based operand-to-variable
mappings.
As noted in the comment on the decodePositionallyEncodedOperands definition,
this support should be removed once it is no longer needed. There should be no
change to existing disassemblers.
llvm-svn: 197691
This is more prep for adding the PowerPC disassembler. FixedLenDecoderEmitter
should recognize PointerLikeRegClass operands as register types, and generate
register-like decoding calls instead of treating them like immediates.
llvm-svn: 197680
The convention used to specify the PowerPC ISA is that bits are numbered in
reverse order (0 is the index of the high bit). To support this "little endian"
encoding convention, CodeEmitterGen will reverse the bit numberings prior to
generating the encoding tables. In order to generate a disassembler,
FixedLenDecoderEmitter needs to do the same.
This moves the bit reversal logic out of CodeEmitterGen and into CodeGenTarget
(where it can be used by both CodeEmitterGen and FixedLenDecoderEmitter). This
is prep work for disassembly support in the PPC backend (which is the only
in-tree user of this little-endian encoding support).
llvm-svn: 197532
Added scalar compare VCMPSS, VCMPSD.
Implemented LowerSELECT for scalar FP operations.
I replaced FSETCCss, FSETCCsd with one node type FSETCCs.
Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1.
llvm-svn: 197384
This patch places class definitions in implementation files into anonymous
namespaces to prevent weak vtables. This eliminates the need of providing an
out-of-line definition to pin the vtable explicitly to the file.
llvm-svn: 195092
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 195064
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
Base *foo = new Child();
delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.
llvm-svn: 194997
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 194865
These used to be referenced by the CGI->AWI map (in AsmWriterEmitter), but
stored in a vector local to EmitPrintInstruction. Move the vector to
AsmWriterEmitter too.
llvm-svn: 193525
The old code skipped one of the sorting criteria if either pattern had
no types. This could lead to cycles of the form X < Y, Y < Z, Z < X.
llvm-svn: 191735
Add VEX_LIG to scalar FMA4 instructions.
Use VEX_LIG in some of the inheriting checks in disassembler table generator.
Make use of VEX_L_W, VEX_L_W_XS, VEX_L_W_XD contexts.
Don't let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from their non-L forms unless VEX_LIG is set.
Let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from all of their non-L or non-W cases.
Increase ranking on VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE so they get chosen over non-L/non-W forms.
llvm-svn: 191649
Ideally, the machinel model is added at the time the instructions are
defined. But many instructions in X86InstrSSE.td still need a model.
Without this workaround the scheduler asserts because x86 already has
itinerary classes for these instructions, indicating they should be
modeled by the scheduler. Since we use the new machine model for other
instructions, it expects a new machine model for these too.
llvm-svn: 191391
Patch by Ana Pazos.
1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.
llvm-svn: 191263
This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.
llvm-svn: 191175
TableGen was sorting the entries in some of its internal data
structures by pointer. This order filtered through to the final
matching table and affected the diagnostics produced on bad assembly
occasionally.
It also turns out STL algorithms are ridiculously easy to misuse on
containers with custom order methods. (No bugs before, or now that I
know of, but plenty in the middle).
This should fix the sanitizer bot, which ends up with weird pointers.
llvm-svn: 190793
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.
The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
ComplexDeprecationPredicate<"MCR">
would mean you would have to define the following function:
bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info)
Which returns 'false' for not deprecated, and 'true' for deprecated
and store the warning message in 'Info'.
The MCTargetAsmParser constructor was chaned to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.
llvm-svn: 190598
Link.exe's command line options are case-insensitive. This patch
adds a new attribute to OptTable to let the option parser to compare
options, ignoring case.
Command lines are generally case-insensitive on Windows. CL.exe is an
exception. So this new attribute should be useful for other commands
running on Windows.
Differential Revision: http://llvm-reviews.chandlerc.com/D1485
llvm-svn: 189416
This field specifies registers that are preserved across function calls,
but that should not be included in the generates SaveList array.
This can be used ot generate regmasks for architectures that save
registers through other means, like SPARC's register windows.
llvm-svn: 189084
Back in the mists of time (2008), it seems TableGen couldn't handle the
patterns necessary to match ARM's CMOV node that we convert select operations
to, so we wrote a lot of fairly hairy C++ to do it for us.
TableGen can deal with it now: there were a few minor differences to CodeGen
(see tests), but nothing obviously worse that I could see, so we should
probably address anything that *does* come up in a localised manner.
llvm-svn: 188995
clang bootstraps intermittently failed for me due a difference in
the MCK_Reg ordering in ARMGenAsmMatcher.inc. E.g. in my latest
run the stage 1 and stage 3 versions were the same but the stage 2
one was different (though still functionally correct). This meant
that the .o comparison failed.
MCK_Regs were assigned by iterating over a std::set< std::set<Record*> >,
and since std::set is sorted lexicographically, the order depended on the
order of the pointer values. This patch replaces the pointer ordering
with LessRecordByID.
llvm-svn: 188164
LLVM's coding standards recommend raw_ostream and MemoryBuffer for
reading and writing text.
This has the side effect of allowing clang to compile more of Support
and TableGen in the Microsoft C++ ABI.
llvm-svn: 187826
This makes option aliases more powerful by enabling them to
pass along arguments to the option they're aliasing.
For example, if we have a joined option "-foo=", we can now
specify a flag option "-bar" to be an alias of that, with the
argument "baz".
This is especially useful for the cl.exe compatible clang driver,
where many options are aliases. For example, this patch enables
us to alias "/Ox" to "-O3" (-O is a joined option), and "/WX" to
"-Werror" (again, -W is a joined option).
Differential Revision: http://llvm-reviews.chandlerc.com/D1245
llvm-svn: 187537
For two intrinsics 'llvm.nvvm.texsurf.handle' and 'llvm.nvvm.texsurf.handle.internal',
TableGen was emitting matching code like:
if (Name.startswith("llvm.nvvm.texsurf.handle")) ...
if (Name.startswith("llvm.nvvm.texsurf.handle.internal")) ...
We can never match "llvm.nvvm.texsurf.handle.internal" here because it will
always be erroneously matched by the first condition.
The fix is to sort the intrinsic names and emit them in reverse order.
llvm-svn: 187119
This removes the need to store the asm variant in each row of the single table that existed before. Shaves ~16K off the size of X86AsmParser.o.
llvm-svn: 187026
algorithm when assigning EnumValues to the synthesized registers.
The current algorithm, LessRecord, uses the StringRef compare_numeric
function. This function compares strings, while handling embedded numbers.
For example, the R600 backend registers are sorted as follows:
T1
T1_W
T1_X
T1_XYZW
T1_Y
T1_Z
T2
T2_W
T2_X
T2_XYZW
T2_Y
T2_Z
In this example, the 'scaling factor' is dEnum/dN = 6 because T0, T1, T2
have an EnumValue offset of 6 from one another. However, in other parts
of the register bank, the scaling factors are different:
dEnum/dN = 5:
KC0_128_W
KC0_128_X
KC0_128_XYZW
KC0_128_Y
KC0_128_Z
KC0_129_W
KC0_129_X
KC0_129_XYZW
KC0_129_Y
KC0_129_Z
The diff lists do not work correctly because different kinds of registers have
different 'scaling factors'. This new algorithm, LessRecordRegister, tries to
enforce a scaling factor of 1. For example, the registers are now sorted as
follows:
T1
T2
T3
...
T0_W
T1_W
T2_W
...
T0_X
T1_X
T2_X
...
KC0_128_W
KC0_129_W
KC0_130_W
...
For the Mips and R600 I see a 19% and 6% reduction in size, respectively. I
did see a few small regressions, but the differences were on the order of a
few bytes (e.g., AArch64 was 16 bytes). I suspect there will be even
greater wins for targets with larger register files.
Patch reviewed by Jakob.
rdar://14006013
llvm-svn: 185094
This patch modifies TableGen to generate a function in
${TARGET}GenInstrInfo.inc called getNamedOperandIdx(), which can be used
to look up indices for operands based on their names.
In order to activate this feature for an instruction, you must set the
UseNamedOperandTable bit.
For example, if you have an instruction like:
def ADD : TargetInstr <(outs GPR:$dst), (ins GPR:$src0, GPR:$src1)>;
You can look up the operand indices using the new function, like this:
Target::getNamedOperandIdx(Target::ADD, Target::OpName::dst) => 0
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src0) => 1
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src1) => 2
The operand names are case sensitive, so $dst and $DST are considered
different operands.
This change is useful for R600 which has instructions with a large number
of operands, many of which model single bit instruction configuration
values. These configuration bits are common across most instructions,
but may have a different operand index depending on the instruction type.
It is useful to have a convenient way to look up the operand indices,
so these bits can be generically set on any instruction.
llvm-svn: 184879
For decoding, keep the current behavior of always decoding these as their REP
versions. In the future, this could be improved to recognize the cases where
these behave as XACQUIRE and XRELEASE and decode them as such.
llvm-svn: 184207
Replace the ill-defined MinLatency and ILPWindow properties with
with straightforward buffer sizes:
MCSchedMode::MicroOpBufferSize
MCProcResourceDesc::BufferSize
These can be used to more precisely model instruction execution if desired.
Disabled some misched tests temporarily. They'll be reenabled in a few commits.
llvm-svn: 184032
The element passed to push_back is not copied before the vector reallocates.
The client needs to copy the element first before passing it to push_back.
No test case, will be tested by follow-up swift scheduler model change (it
segfaults without this change).
llvm-svn: 183459
Don't output data if we are supposed to ignore the record.
Reapply of 183255, I don't think this was causing the tablegen segfault on linux
testers.
llvm-svn: 183311
This fixes some of the ridiculously complex code for optimizing the
machine model tables that are shared among all processors of a given
target. A9 and Swift both use the "special" feature that maps old
itinerary classes to new machine model defs. They map different
overlapping subsets of instructions, which wasn't handled correctly.
llvm-svn: 183302
NOTE: If this broke your out-of-tree backend, in *RegisterInfo.td, change
the instances of SubRegIndex that have a comps template arg to use the
ComposedSubRegIndex class instead.
In TableGen land, this adds Size and Offset attributes to SubRegIndex,
and the ComposedSubRegIndex class, for which the Size and Offset are
computed by TableGen. This also adds an accessor in MCRegisterInfo, and
Size/Offsets for the X86 and ARM subreg indices.
llvm-svn: 183020
The size reduction in the RegDiffLists are rather dramatic. Here are a few
size differences for MCTargetDesc.o files (before and after) in bytes:
R600 - 36160B - 11184B - 69% reduction
ARM - 28480B - 8368B - 71% reduction
Mips - 816B - 576B - 29% reduction
One side effect of dynamically computing the aliases is that the iterator does
not guarantee that the entries are ordered or that duplicates have been removed.
The documentation implies this is a safe assumption and I found no clients that
requires these attributes (i.e., strict ordering and uniqueness).
My local LNT tester results showed no execution-time failures or significant
compile-time regressions (i.e., beyond what I would consider noise) for -O0g,
-O2 and -O3 runs on x86_64 and i386 configurations.
rdar://12906217
llvm-svn: 182783
Currently the fast-isel table generator recognizes registers, register
classes, and immediates for source pattern operands. ValueType
operands are not recognized. This is not a problem for existing
targets with fast-isel support, but will not work for targets like
PowerPC and SPARC that use types in source patterns.
The proposed patch allows ValueType operands and treats them in the
same manner as register classes. There is no convenient way to map
from a ValueType to a register class, but there's no need to do so.
The table generator already requires that all types in the source
pattern be identical, and we know the register class of the output
operand already. So we just assign that register class to any
ValueType operands we encounter.
No functional effect on existing targets. Testing deferred until the
PowerPC target implements fast-isel.
llvm-svn: 182512
This lane mask provides information about which register lanes
completely cover super-registers. See the block comment before
getCoveringLanes().
llvm-svn: 182034
The problem this patch addresses is the handling of register tie
constraints in AsmMatcherEmitter, where one operand is tied to a
sub-operand of another operand. The typical scenario for this to
happen is the tie between the "write-back" register of a pre-inc
instruction, and the base register sub-operand of the memory address
operand of that instruction.
The current AsmMatcherEmitter code attempts to handle tied
operands by emitting the operand as usual first, and emitting
a CVT_Tied node when handling the second (tied) operand. However,
this really only works correctly if the tied operand does not
have sub-operands (and isn't a sub-operand itself). Under those
circumstances, a wrong MC operand list is generated.
In discussions with Jim Grosbach, it turned out that the MC operand
list really ought not to contain tied operands in the first place;
instead, it ought to consist of exactly those operands that are
named in the AsmString. However, getting there requires significant
rework of (some) targets.
This patch fixes the immediate problem, and at the same time makes
one (small) step in the direction of the long-term solution, by
implementing two changes:
1. Restricts the AsmMatcherEmitter handling of tied operands to
apply solely to simple operands (not complex operands or
sub-operand of such).
This means that at least we don't get silently corrupt MC operand
lists as output. However, if we do have tied sub-operands, they
would now no longer be handled at all, except for:
2. If we have an operand that does not occur in the AsmString,
and also isn't handled as tied operand, simply emit a dummy
MC operand (constant 0).
This works as long as target code never attempts to access
MC operands that do no not occur in the AsmString (and are
not tied simple operands), which happens to be the case for
all targets where this situation can occur (ARM and PowerPC).
[ Note that this change means that many of the ARM custom
converters are now superfluous, since the implement the
same "hack" now performed already by common code. ]
Longer term, we ought to fix targets to never access *any*
MC operand that does not occur in the AsmString (including
tied simple operands), and then finally completely remove
all such operands from the MC operand list.
Patch approved by Jim Grosbach.
llvm-svn: 180677
Super-resources and resource groups are two ways of expressing
overlapping sets of processor resources. Now we generate table entries
the same way for both so the scheduler never needs to explicitly check
for super-resources.
llvm-svn: 180162
variant/dialect. Addresses a FIXME in the emitMnemonicAliases function.
Use and test case to come shortly.
rdar://13688439 and part of PR13340.
llvm-svn: 179804
As these two instructions in AVX extension are privileged instructions for
special purpose, it's only expected to be used in inlined assembly.
llvm-svn: 179266
A9 uses itinerary classes, Swift uses RW lists. This tripped some
verification when we're expanding variants. I had to refine the
verification a bit.
llvm-svn: 178357
This syntax is now preferred:
def : Pat<(subc i32:$b, i32:$c), (SUBCCrr $b, $c)>;
There is no reason to repeat the types in the output pattern.
llvm-svn: 177844
This makes it possible to define instruction patterns like this:
def LDri : F3_2<3, 0b000000,
(outs IntRegs:$dst), (ins MEMri:$addr),
"ld [$addr], $dst",
[(set i32:$dst, (load ADDRri:$addr))]>;
~~~
llvm-svn: 177834
Just like register classes, value types can be used in two ways in
patterns:
(sext_inreg i32:$src, i16)
In a named leaf node like i32:$src, the value type simply provides the
type of the node directly. This simplifies type inference a lot compared
to the current practice of specifiying types indirectly with register
classes.
As an unnamed leaf node, like i16 above, the value type represents
itself as an MVT::Other immediate.
llvm-svn: 177828
A register class can appear as a leaf TreePatternNode with and without a
name:
(COPY_TO_REGCLASS GPR:$src, F8RC)
In a named leaf node like GPR:$src, the register class provides type
information for the named variable represented by the node. The TypeSet
for such a node is the set of value types that the register class can
represent.
In an unnamed leaf node like F8RC above, the register class represents
itself as a kind of immediate. Such a node has the type MVT::i32,
we'll never create a virtual register representing it.
This change makes it possible to remove the special handling of
COPY_TO_REGCLASS in CodeGenDAGPatterns.cpp.
llvm-svn: 177825
To use this in conjunction with exuberant ctags to generate a single
combined tags file, run tblgen first and then
$ ctags --append [...]
Since some identifiers have corresponding definitions in C++ code,
it can be useful (if using vim) to also use cscope, and
:set cscopetagorder=1
so that
:tag X
will preferentially select the tablegen symbol, while
:cscope find g X
will always find the C++ symbol.
Patch by Kevin Schoedel!
(a couple small formatting changes courtesy of clang-format)
llvm-svn: 177682
of complex instruction operands (e.g. address modes).
Currently, if a Pat pattern creates an instruction that has a complex
operand (i.e. one that consists of multiple sub-operands at the MI
level), this operand must match a ComplexPattern DAG pattern with the
correct number of output operands.
This commit extends TableGen to alternatively allow match a complex
operands against multiple separate operands at the DAG level.
This allows using Pat patterns to match pre-increment nodes like
pre_store (which must have separate operands at the DAG level) onto
an instruction pattern that uses a multi-operand memory operand,
like the following example on PowerPC (will be committed as a
follow-on patch):
def STWU : DForm_1<37, (outs ptr_rc:$ea_res), (ins GPRC:$rS, memri:$dst),
"stwu $rS, $dst", LdStStoreUpd, []>,
RegConstraint<"$dst.reg = $ea_res">, NoEncode<"$ea_res">;
def : Pat<(pre_store GPRC:$rS, ptr_rc:$ptrreg, iaddroff:$ptroff),
(STWU GPRC:$rS, iaddroff:$ptroff, ptr_rc:$ptrreg)>;
Here, the pair of "ptroff" and "ptrreg" operands is matched onto the
complex operand "dst" of class "memri" in the "STWU" instruction.
Approved by Jakob Stoklund Olesen.
llvm-svn: 177428
Properly handle cases where a group of instructions have different
SchedRW lists with the same itinerary class.
This was supposed to work, but I left in an early break.
llvm-svn: 177317
We always supported a mixture of the old itinerary model and new
per-operand model, but it required a level of indirection to map
itinerary classes to SchedRW lists. This was done for ARM A9.
Now we want to define x86 SchedRW lists, with the goal of removing its
itinerary classes, but still support the itineraries in the mean
time. When I original developed the model, Atom did not have
itineraries, so there was no reason to expect this requirement.
llvm-svn: 177226
Don't require instructions to inherit Sched<...>. Sometimes it is more
convenient to say:
let SchedRW = ... in {
...
}
Which is now possible.
llvm-svn: 177199