copy of the BURRList scheduler, but with several parts ripped
out, such as backtracking, online topological sort maintenance
(needed by backtracking), the priority queue, and Sethi-Ullman
number computation and maintenance (needed by the priority
queue). As a result of all this, it generates somewhat lower
quality code, but that's its tradeoff for running about 30%
faster than list-burr in -fast mode in many cases.
This is somewhat experimental. Moving forward, major pieces of
this can be refactored with pieces in common with
ScheduleDAGRRList.cpp.
llvm-svn: 56307
with an earlyclobber operand elsewhere. Propagate
this bit and the earlyclobber bit through SDISel.
Change linear-scan RA not to allocate regs in a way
that conflicts with an earlyclobber. See also comments.
llvm-svn: 56290
ConstantPoolSDNode, using the target's preferred alignment for the
constant type.
In LegalizeDAG, when performing loads from the constant pool, the
ConstantPoolSDNode's alignment is used in the calls to getLoad and
getExtLoad.
This change prevents SelectionDAG::getLoad/getExtLoad from incorrectly
choosing the ABI alignment for constant pool loads when Alignment == 0.
The incorrect alignment is only a performance issue when ABI alignment
does not equal preferred alignment (i.e., on x86 it was generating
MOVUPS instead of MOVAPS for v4f32 constant loads when the default ABI
alignment for 128bit vectors is forced to 1 byte.)
Patch by Paul Redmond!
llvm-svn: 56253
- Add linkage to SymbolSDNode (default to external).
- Change ISD::ExternalSymbol to ISD::Symbol.
- Change ISD::TargetExternalSymbol to ISD::TargetSymbol
These changes pave the way to allowing SymbolSDNodes with non-external linkage.
llvm-svn: 56249
Currently it just holds the calling convention and flags
for isVarArgs and isTailCall.
And it has several utility methods, which eliminate magic
5+2*i and similar index computations in several places.
CallSDNodes are not CSE'd. Teach UpdateNodeOperands to handle
nodes that are not CSE'd gracefully.
llvm-svn: 56183
ConstantFP* instead of APInt and APFloat directly.
This reduces the amount of time to create ConstantSDNode
and ConstantFPSDNode nodes when ConstantInt* and ConstantFP*
respectively are already available, as is the case in
SelectionDAGBuild.cpp. Also, it reduces the amount of time
to legalize constants into constant pools, and the amount of
time to add ConstantFP operands to MachineInstrs, due to
eliminating ConstantInt::get and ConstantFP::get calls.
It increases the amount of work needed to create new constants
in cases where the client doesn't already have a ConstantInt*
or ConstantFP*, such as legalize expanding 64-bit integer constants
to 32-bit constants. And it adds a layer of indirection for the
accessor methods. But these appear to be outweight by the benefits
in most cases.
It will also make it easier to make ConstantSDNode and
ConstantFPNode more consistent with ConstantInt and ConstantFP.
llvm-svn: 56162
form:
powf(10.0f, x);
If this is the case, and also we want limited precision floating-point
calculations, then lower to do the limited-precision stuff.
llvm-svn: 56035
to being off by default. Also, add assertion checks to check that
the various fast-isel-related command-line options are only used
when -fast-isel itself is enabled.
llvm-svn: 56029
before. This is taken care of in the selection DAG pass. In my opinion, this
should be in one place or the other. I.e., it should probably be removed from
the DAG combiner (along with the other arithmetic transformations on constants
that are essentially no-ops).
llvm-svn: 55889
out of ScheduleDAGEmit.cpp and into SelectionDAGISel.cpp. This
allows it to be run exactly once per function, even if multiple
SelectionDAG iterations happen in the entry block, as may happen
with FastISel.
llvm-svn: 55863
but less accurate (non-IEEE) code sequences for
certain math library functions. Add the first of
several such expansions. Don't worry, if you don't
turn it on it won't affect you.
llvm-svn: 55823
HandlePHINodesInSuccessorBlocks that works FastISel-style. This
allows PHI nodes to be updated correctly while using FastISel.
This also involves some code reorganization; ValueMap and
MBBMap are now members of the FastISel class, so they needn't
be passed around explicitly anymore. Also, SelectInstructions
is changed to SelectInstruction, and only does one instruction
at a time.
llvm-svn: 55746
the simple fix, materializing the constant before every use. It might be better to either track domination of uses or
to materialize all constants and the beginning of the function and let remat sort when to do materialization at uses.
llvm-svn: 55703
The first can update the SDNode in an SDValue
while the second is called with SDNode* and
returns a possibly updated SDNode*.
This patch has no intended functional impact,
but helps eliminating ugly temporary SDValues.
llvm-svn: 55608
assignment when selecting the def. This is the naive solution to the problem: insert a copy to the pre-chosen
vreg. Other solutions might be preferable, such as:
1) Passing the dest reg into FastEmit_. However, this would require the higher level code to know about reg classes, which they don't currently.
2) Selecting blocks in reverse postorder. This has some compile time cost for computing the order, and we'd need to measure its impact.
llvm-svn: 55555
its work by putting all nodes in the worklist, requiring a big
dynamic allocation. Now, DAGCombiner just iterates over the AllNodes
list and maintains a worklist for nodes that are newly created or
need to be revisited. This allows the worklist to stay small in most
cases, so it can be a SmallVector.
This has the side effect of making DAGCombine not miss a folding
opportunity in alloca-align-rounding.ll.
llvm-svn: 55498
SelectionDAGLowering instead of being in an anonymous namespace.
This fixes warnings about SelectionDAGLowering having fields
using anonymous namespaces.
llvm-svn: 55497
ATOMIC_LOAD_ADD_{8,16,32,64} instead of ATOMIC_LOAD_ADD.
Increased the Hardcoded Constant OpActionsCapacity to match.
Large but boring; no functional change.
This is to support partial-word atomics on ppc; i8 is
not a valid type there, so by the time we get to lowering, the
ATOMIC_LOAD nodes looks the same whether the type was i8 or i32.
The information can be added to the AtomicSDNode, but that is the
largest SDNode; I don't fully understand the SDNode allocation,
but it is sensitive to the largest node size, so increasing
that must be bad. This is the alternative.
llvm-svn: 55457
works with.
SelectionDAG, FunctionLoweringInfo, and SelectionDAGLowering
objects now get created once per SelectionDAGISel instance, and
can be reused across blocks and across functions. Previously,
they were created and destroyed each time they were needed.
This reorganization simplifies the handling of PHI nodes, and
also SwitchCases, JumpTables, and BitTestBlocks. This
simplification has the side effect of fixing a bug in FastISel
where successor PHI nodes weren't being updated correctly.
This is also a step towards making the transition from FastISel
into and out of SelectionDAG faster, and also making
plain SelectionDAG faster on code with lots of little blocks.
llvm-svn: 55450
of two, and to not need a scratch std::vector. Also, compute the ordering
immediately in the result array, instead of in another scratch std::vector
that is copied to the result array.
llvm-svn: 55421
of two, and to not need a scratch std::vector. Also, use the
SelectionDAG's topological sort in LegalizeDAG instead of having
a separate implementation.
llvm-svn: 55389
was inserted or not. This allows bitcast in fast isel to properly handle the case
where an appropriate reg-to-reg copy is not available.
llvm-svn: 55375
RecyclingAllocator, but this change is needed for the nodes
to actually be recycled. This cuts SelectionDAG's memory
usage high-water-mark in half in some cases.
llvm-svn: 55351
use raw_ostream instead of std::ostream. Among other goodness,
this speeds up llvm-dis of kc++ with a release build from 0.85s
to 0.49s (88% faster).
Other interesting changes:
1) This makes Value::print be non-virtual.
2) AP[S]Int and ConstantRange can no longer print to ostream directly,
use raw_ostream instead.
3) This fixes a bug in raw_os_ostream where it didn't flush itself
when destroyed.
4) This adds a new SDNode::print method, instead of only allowing "dump".
A lot of APIs have both std::ostream and raw_ostream versions, it would
be useful to go through and systematically anihilate the std::ostream
versions.
This passes dejagnu, but there may be minor fallout, plz let me know if
so and I'll fix it.
llvm-svn: 55263
process up to a higher level. This allows FastISel to leverage
more of SelectionDAGISel's infastructure, such as updating Machine
PHI nodes.
Also, implement transitioning from SDISel back to FastISel in
the middle of a block, so it's now possible to go back and
forth. This allows FastISel to hand individual CallInsts and other
complicated things off to SDISel to handle, while handling the rest
of the block itself.
To help support this, reorganize the SelectionDAG class so that it
is allocated once and reused throughout a function, instead of
being completely reallocated for each block.
llvm-svn: 55219
and use it in FastISelEmitter.cpp, and make FastISel
subtarget aware. Among other things, this lets it work
properly on x86 targets that don't have SSE, where it
successfully selects x87 instructions.
llvm-svn: 55156
class hold a MachineRegisterInfo member, and make the
MachineBasicBlock be passed in to SelectInstructions rather
than the FastISel constructor.
llvm-svn: 55076
In particular, Collector was confusing to implementors. Several
thought that this compile-time class was the place to implement
their runtime GC heap. Of course, it doesn't even exist at runtime.
Specifically, the renames are:
Collector -> GCStrategy
CollectorMetadata -> GCFunctionInfo
CollectorModuleMetadata -> GCModuleInfo
CollectorRegistry -> GCRegistry
Function::getCollector -> getGC (setGC, hasGC, clearGC)
Several accessors and nested types have also been renamed to be
consistent. These changes should be obvious.
llvm-svn: 54899
returning an std::string by value, it fills in a SmallString/SmallVector
passed in. This significantly reduces string thrashing in some cases.
More specifically, this:
- Adds an operator<< and a print method for APInt that allows you to
directly send them to an ostream.
- Reimplements APInt::toString to be much simpler and more efficient
algorithmically in addition to not thrashing strings quite as much.
This speeds up llvm-dis on kc++ by 7%, and may also slightly speed up the
asmprinter. This also fixes a bug I introduced into the asmwriter in a
previous patch w.r.t. alias printing.
llvm-svn: 54873
FPROUND_F80_F32, FPROUND_PPCF128_F32,
FPROUND_F80_F64, FPROUND_PPCF128_F64
Support for soften float fp_round operands is added, Mips
needs this to round f64->f32.
Also added support to soften float FABS result, Mips doesn't
support double fabs results while in 'single float only' mode.
llvm-svn: 54484
- Add a basic machine-level dead block eliminator.
These two have to go together, since many other parts of the code generator are unable to handle the unreachable blocks otherwise created.
llvm-svn: 54333
switches use the binary search algorithm) for
environments that don't support it. PPC64 JIT
is such an environment; turn the flag on for that.
llvm-svn: 54248
a new ilist_node class, and remove them. Unlike alist_node,
ilist_node doesn't attempt to manage storage itself, so it avoids
the associated problems, including being opaque in gdb.
Adjust the Recycler class so that it doesn't depend on alist_node.
Also, change it to use explicit Size and Align parameters, allowing
it to work when the largest-sized node doesn't have the greatest
alignment requirement.
Change MachineInstr's MachineMemOperand list from a pool-backed
alist to a std::list for now.
llvm-svn: 54146
Remove the GetResultInst instruction. It is still accepted in LLVM assembly
and bitcode, where it is now auto-upgraded to ExtractValueInst. Also, remove
support for return instructions with multiple values. These are auto-upgraded
to use InsertValueInst instructions.
The IRBuilder still accepts multiple-value returns, and auto-upgrades them
to InsertValueInst instructions.
llvm-svn: 53941
SelectionDAG graph writer to make use of them. Now, nodes with multiple
values are displayed as such, with incoming edges pointing to the
specific value they use.
llvm-svn: 53875
generic SDNode's (nodes with their own constructors
should do sanity checking in the constructor). Add
sanity checks for BUILD_VECTOR and fix all the places
that were producing bogus BUILD_VECTORs, as found by
"make check". My favorite is the BUILD_VECTOR with
only two operands that was being used to build a
vector with four elements!
llvm-svn: 53850
the night realising that it was wrong :) I
think the reason the same type was being used
for the shufflevec of indices as for the actual
indices is so that if one of them needs splitting
then so does the other. After my patch it might
be that the indices need splitting but not the
rest, yet there is no good way of handling that.
I think the right solution is to not have the
shufflevec be an operand at all: just have it
be the list of numbers it actually is, stored
as extra info in the node.
llvm-svn: 53768
PseudoSourceValue values, which never have names. Use getName()
for all other values, because we want to print just a short summary
of the value, not the entire instruction.
llvm-svn: 53738
mask. These are just indices into the shuffled vector
so their type is unrelated to the type of the
shuffled elements (which is what was being used before).
This fixes vec_shuffle-11.ll when using LegalizeTypes.
What seems to have happened is that Dan's recent change
r53687, which corrected the result type of the shuffle,
somehow caused LegalizeTypes to notice that the mask
operand was a BUILD_VECTOR with a legal type but elements
of an illegal type (i64). LegalizeTypes legalized this
by introducing a new BUILD_VECTOR of i32 and bitcasting
it to the old type. But the mask operand is not supposed
to be a bitcast but a straight BUILD_VECTOR of constants,
causing a crash.
llvm-svn: 53729
replacement of multiple values. This is slightly more efficient
than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically
could be optimized even further. However, an important property of this
new function is that it handles the case where the source value set and
destination value set overlap. This makes it feasible for isel to use
SelectNodeTo in many very common cases, which is advantageous because
SelectNodeTo avoids a temporary node and it doesn't require CSEMap
updates for users of values that don't change position.
Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to
handle operand lists more efficiently, and to correctly handle a number
of corner cases to which its new wider use exposes it.
This commit also includes a change to the encoding of post-isel opcodes
in SDNodes; now instead of being sandwiched between the target-independent
pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel
opcodes are now represented as negative values. This makes it possible
to test if an opcode is pre-isel or post-isel without having to know
the size of the current target's post-isel instruction set.
These changes speed up llc overall by 3% and reduce memory usage by 10%
on the InstructionCombining.cpp testcase with -fast and -regalloc=local.
llvm-svn: 53728
In LegalizeDAG the value is zero-extended to
the new type before byte swapping. It doesn't
matter how the extension is done since the new
bits are shifted off anyway after the swap, so
extend by any old rubbish bits. This results
in the final assembler for the testcase being
one line shorter.
llvm-svn: 53604
extending load of a vector. Handle this case when
splitting vector loads. I'm not completely sure
what is supposed to happen, but I think it means
hi should be set to undef. LegalizeDAG does not
consider this case.
llvm-svn: 53555
stores of one-element vectors. Also, neaten the
handling of INSERT_VECTOR_ELT when the inserted
type is larger than the vector element type.
llvm-svn: 53554
are used for passing huge immediates in inline ASM
from the front-end straight down to the ASM writer.
Of course this is a hack, but it is simple, limited
in scope, works in practice, and is what LegalizeDAG
does.
llvm-svn: 53553
SINT_TO_FP libcall plus additional operations:
it might as well be a direct UINT_TO_FP libcall.
So only turn it into an SINT_TO_FP if the target
has special handling for SINT_TO_FP.
llvm-svn: 53461
Lack of these caused a bootstrap failure with Fortran
on x86-64 with LegalizeTypes turned on. While there,
be nice to 16 bit machines and support expansion of
i32 too.
llvm-svn: 53408
makes their special-case checks of use_size() less beneficial,
so remove them. This eliminates all but one use of use_size(),
which is in AssignTopologicalOrder, which uses it only once for
each node, and so can reasonably afford to recompute it, as
this allows the UsesSize field of SDNode to be removed
altogether.
llvm-svn: 53377
class, and store IsVolatile and Alignment in a more compact form.
This makes AtomicSDNode slightly larger, but it shrinks LoadSDNode
and StoreSDNode, which are much more common and are the largest of
the SDNode subclasses. Also, this lets the isVolatile() and
getAlignment() accessors be non-virtual.
llvm-svn: 53361
SINT_TO_FP and UINT_TO_FP. This now produces
the same code as LegalizeDAG (the previous
code was based on a mistaken idea of what
LegalizeDAG did in this case).
llvm-svn: 53288
soft float: experiments show that targets aren't
expecting this for results or for operands. Add
support select/select_cc result soft float and
correct operand soft float for these.
llvm-svn: 53245
MachineMemOperands. The pools are owned by MachineFunctions.
This drastically reduces the number of calls to malloc/free made
during the "Emit" phase of scheduling, as well as later phases
in CodeGen. Combined with other changes, this speeds up the
"instruction selection" phase of CodeGen by 10% in some cases.
llvm-svn: 53212
and reused across SelectionDAGs.
This drastically reduces the number of calls to malloc/free made during
instruction selection, and improves memory locality.
llvm-svn: 53211
simple const SDOperand*, which is what's usually needed.
For AddNodeIDOperands, which is small, just duplicate the function to
accept an SDUse*.
For SelectionDAG::getNode - Add an overload that accepts SDUse* that
copies the operands into a temporary SDOperand array, but also has
special-case checks for 0 through 3 operands to avoid the copy in
the common cases.
llvm-svn: 53183
that fixed problems in EmitStackConvert where the source and target type
have different alignment by creating a stack slot with the max
alignment of source and target type.
llvm-svn: 53150
hook for each way in which a result type can be
legalized (promotion, expansion, softening etc),
just use one: ReplaceNodeResults, which returns
a node with exactly the same result types as the
node passed to it, but presumably with a bunch of
custom code behind the scenes. No change if the
new LegalizeTypes infrastructure is not turned on.
llvm-svn: 53137
SelectionDAG::SelectNodeTo in the instruction selector. This
updates existing nodes in place instead of creating new ones.
Go back to selecting ISD::DBG_LABEL nodes into
TargetInstrInfo::DBG_LABEL nodes instead of leaving them
unselected, now that SelectNodeTo allows us to update them
in place.
llvm-svn: 53057
to be passed the list of value types, and use this
where appropriate. Inappropriate places are where
the value type list is already known and may be
long, in which case the existing method is more
efficient.
llvm-svn: 53035
the need for a flavor operand, and add a new SDNode subclass,
LabelSDNode, for use with them to eliminate the need for a label id
operand.
Change instruction selection to let these label nodes through
unmodified instead of creating copies of them. Teach the MachineInstr
emitter how to emit a MachineInstr directly from an ISD label node.
This avoids the need for allocating SDNodes for the label id and
flavor value, as well as SDNodes for each of the post-isel label,
label id, and label flavor.
llvm-svn: 52943
SelectionDAG::allnodes_size is linear, but that doesn't appear to
outweigh the benefit of reducing heap traffic. If it does become a
problem, we should teach SelectionDAG to keep a count of how many
nodes are live, because there are several other places where that
information would be useful as well.
llvm-svn: 52926
purpose, and give it a custom SDNode subclass so that it doesn't
need to have line number, column number, filename string, and
directory string, all existing as individual SDNodes to be the
operands.
This was the only user of ISD::STRING, StringSDNode, etc., so
remove those and some associated code.
This makes stop-points considerably easier to read in
-view-legalize-dags output, and reduces overhead (creating new
nodes and copying std::strings into them) on code containing
debugging information.
llvm-svn: 52924
SmallVectors. Change the signature of TargetLowering::LowerArguments
to avoid returning a vector by value, and update the two targets
which still use this directly, Sparc and IA64, accordingly.
llvm-svn: 52917
only needs one bit for each register. UsedRegs is a SmallVector
sized at 16, so this eliminates a heap allocation/free for every
call and return processed by Legalize on most targets.
llvm-svn: 52915
it impossible to create a MERGE_VALUES node with
only one result: sometimes it is useful to be able
to create a node with only one result out of one of
the results of a node with more than one result, for
example because the new node will eventually be used
to replace a one-result node using ReplaceAllUsesWith,
cf X86TargetLowering::ExpandFP_TO_SINT. On the other
hand, most users of MERGE_VALUES don't need this and
for them the optimization was valuable. So add a new
utility method getMergeValues for creating MERGE_VALUES
nodes which by default performs the optimization.
Change almost everywhere to use getMergeValues (and
tidy some stuff up at the same time).
llvm-svn: 52893
Move GetConstantStringInfo to lib/Analysis. Remove
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
This unbreaks llvm-gcc bootstrap.
llvm-svn: 52884
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
llvm-svn: 52748
For this it is convenient to permit floats to
be used with EXTRACT_ELEMENT, so I tweaked
things to allow that. I also added libcalls
for ppcf128 to i32 forms of FP_TO_XINT, since
they exist in libgcc and this case can certainly
occur (and does occur in the testsuite) - before
the i64 libcall was being used. Also, the
XINT_TO_FP result seemed to be wrong when
the argument is an i128: the wrong fudge
factor was added (the i32 and i64 cases were
handled directly, but the i128 code fell
through to some generic softening code which
seemed to think it was i64 to f32!). So I
fixed it by adding a fudge factor that I
found in my breakfast cereal.
llvm-svn: 52739
Added abstract class MemSDNode for any Node that have an associated MemOperand
Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and
atomic.lss => atomic.load.sub
llvm-svn: 52706
fixes PR2476; patch by Richard Osborne. The same
problem exists for a bunch of other operators, but
I'm ignoring this because they will be automagically
fixed when the new LegalizeTypes infrastructure lands,
since it already solves this problem centrally.
llvm-svn: 52610