related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
llvm-svn: 66779
alignment of the generated constant pool entry to the
desired alignment of a type. If we don't do this, we end up
trying to do movsd from 4-byte alignment memory. This fixes
450.soplex and 456.hmmer.
llvm-svn: 66641
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
llvm-svn: 66339
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.
This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.
llvm-svn: 66240
so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it
would go through the DAG combiner again. This time it had a value of 31, which
was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong
forever.
Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded
value is an XOR of all ones.
llvm-svn: 65985
arbitrary vector sizes. Add an optional MinSplatBits parameter to specify
a minimum for the splat element size. Update the PPC target to use the
revised interface.
llvm-svn: 65899
extracts + build_vector into a shuffle would fail, because the
type of the new build_vector would not be legal. Try harder to
create a legal build_vector type. Note: this will be totally
irrelevant once vector_shuffle no longer takes a build_vector for
shuffle mask.
New:
_foo:
xorps %xmm0, %xmm0
xorps %xmm1, %xmm1
subps %xmm1, %xmm1
mulps %xmm0, %xmm1
addps %xmm0, %xmm1
movaps %xmm1, 0
Old:
_foo:
xorps %xmm0, %xmm0
movss %xmm0, %xmm1
xorps %xmm2, %xmm2
unpcklps %xmm1, %xmm2
pshufd $80, %xmm1, %xmm1
unpcklps %xmm1, %xmm2
pslldq $16, %xmm2
pshufd $57, %xmm2, %xmm1
subps %xmm0, %xmm1
mulps %xmm0, %xmm1
addps %xmm0, %xmm1
movaps %xmm1, 0
llvm-svn: 65791
overly long ints, e.g. i96, into pieces at PHIs
and the nodes that feed into them; however big-endian
reverses the order of the pieces (for some reason), and
wasn't doing it the same way on both sides, so
the pieces didn't match and runtime failures ensued.
Fixes 188.ammp and sqlite3 on ppc32.
llvm-svn: 65481
them are generic changes.
- Use the "fast" flag that's already being passed into the asm printers instead
of shoving it into the DwarfWriter.
- Instead of calling "MI->getParent()->getParent()" for every MI, set the
machine function when calling "runOnMachineFunction" in the asm printers.
llvm-svn: 65379
a DBG_LABEL or not. We want to fall back to the original way of emitting debug
info when we're in -O0/-fast mode.
- Add plumbing in to pass the "Fast" flag to places that need it.
- XFAIL DebugInfo/deaddebuglabel.ll. This is finding 11 labels instead of 8. I
need to investigate still.
llvm-svn: 65367
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
llvm-svn: 65296
(Note: Eventually, commits like this will be handled via a pre-commit hook that
does this automagically, as well as expand tabs to spaces and look for 80-col
violations.)
llvm-svn: 64827
U include/llvm/CodeGen/DebugLoc.h
U lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
Enable debug location generation at -Os. This goes with the reapplication of the
r63639 patch.
llvm-svn: 64715
one bit set, because the bit may be shifted off the end. Instead,
just check for a constant 1 being shifted. This is still sufficient
to handle all the cases in test/CodeGen/X86/bt.ll. This fixes PR3583.
llvm-svn: 64622
Cleanup some warning.
Remark: when struct/class are declared differently than they are defined, this make problem for VC++ since it seems to mangle class differently that struct. These error are very hard to understand and find. So please, try to keep your definition/declaration in sync.
Only tested with VS2008. hope it does not break anything. feel free to revert.
llvm-svn: 64554
in inline asm as signed (what gcc does). Add partial support
for x86-specific "e" and "Z" constraints, with appropriate
signedness for printing.
llvm-svn: 64400
is determined by whether the node has a Flag operand. However, if the
node does have a Flag operand, it will be glued to its register's
def, so the heuristic would end up spuriously applying to whatever
node is the def.
llvm-svn: 64319
It was transforming (x&y)==y to (x&y)!=0 in the case where
y is variable and known to have at most one bit set (e.g. z&1).
This is not correct; the expressions are not equivalent when y==0.
I believe this patch salvages what can be salvaged, including
all the cases in bt.ll. Dan, please review.
Fixes gcc.c-torture/execute/20040709-[12].c
llvm-svn: 64314
instruction index across each part. Instruction indices are used
to make live range queries, and live ranges can extend beyond
scheduling region boundaries.
Refactor the ScheduleDAGSDNodes class some more so that it
doesn't have to worry about this additional information.
llvm-svn: 64288
scheduling, and generalize is so that preserves state across
scheduling regions. This fixes incorrect live-range information around
terminators and labels, which are effective region boundaries.
In place of looking for terminators to anchor inter-block dependencies,
introduce special entry and exit scheduling units for this purpose.
llvm-svn: 64254
Adjust derived classes to pass UnknownLoc where
a DebugLoc does not make sense. Pick one of
DebugLoc and non-DebugLoc variants to survive
for all such classes.
llvm-svn: 64000
Many targets build placeholder nodes for special operands, e.g.
GlobalBaseReg on X86 and PPC for the PIC base. There's no
sensible way to associate debug info with these. I've left
them built with getNode calls with explicit DebugLoc::getUnknownLoc operands.
I'm not too happy about this but don't see a good improvement;
I considered adding a getPseudoOperand or something, but it
seems to me that'll just make it harder to read.
llvm-svn: 63992
SelectionDAGISel::CreateScheduler, and make it just create the
scheduler. Leave running the scheduler to the higher-level code.
This makes the higher-level code a little more explicit and
easier to follow, and will help enable some future refactoring.
llvm-svn: 63944
target directories themselves. This also means that VMCore no longer
needs to know about every target's list of intrinsics. Future work
will include converting the PowerPC target to this interface as an
example implementation.
llvm-svn: 63765
but when legalizing the operation, we split the vector type and generate a library
call whose type needs to be promoted. For example, X86 with SSE on but MMX off,
a divide v2i64 will be scalarized to 2 calls to a library using i64.
llvm-svn: 63760
support GraphViz, I've been using the foo->dump() facility. This
patch is a minor rewrite to the SelectionDAG dump() stuff to make it a
little more helpful. The existing foo->dump() functionality does not
change; this patch adds foo->dumpr(). All of this is only useful when
running LLVM under a debugger.
llvm-svn: 63736
in any old order. Since analyzing a node analyzes its
operands also, this can mean that when we pop a node
off the list of nodes to be analyzed, it may already
have been analyzed.
llvm-svn: 63632
information. This eliminates the need for the Flags field in MemSDNode,
so this makes LoadSDNode and StoreSDNode smaller. Also, it makes
FoldingSetNodeIDs for loads and stores two AddIntegers smaller.
llvm-svn: 63577
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.
llvm-svn: 63494
returned by getShiftAmountTy may be too small
to hold shift values (it is an i8 on x86-32).
Before and during type legalization, use a large
but legal type for shift amounts: getPointerTy;
afterwards use getShiftAmountTy, fixing up any
shift amounts with a big type during operation
legalization. Thanks to Dan for writing the
original patch (which I shamelessly pillaged).
llvm-svn: 63482
- Modify TableGen to add the DebugLoc when calling getTargetNode.
(The light-weight wrappers are only temporary. The non-DebugLoc version will be
removed once the whole debug info stuff is finished with.)
llvm-svn: 63273
dagcombines that help it match in several more cases. Add
several more cases to test/CodeGen/X86/bt.ll. This doesn't
yet include matching for BT with an immediate operand, it
just covers more register+register cases.
llvm-svn: 63266
new isOperationLegalOrCustom, which does what isOperationLegal
previously did.
Update a bunch of callers to use isOperationLegalOrCustom
instead of isOperationLegal. In some case it wasn't obvious
which behavior is desired; when in doubt I changed then to
isOperationLegalOrCustom as that preserves their previous
behavior.
This is for the second half of PR3376.
llvm-svn: 63212
a uint64_t to verify that the value is in range for the given type,
to help catch accidental overflow. Fix a few places that relied on
getConstant implicitly truncating the value.
llvm-svn: 63128
checking logic. Rather than make the checking more
complicated, I've tweaked some logic to make things
conform to how the checking thought things ought to
be, since this results in a simpler "mental model".
llvm-svn: 63048
tidy up SDUse and related code.
- Replace the operator= member functions with a set method, like
LLVM Use has, and variants setInitial and setNode, which take
care up updating use lists, like LLVM Use's does. This simplifies
code that calls these functions.
- getSDValue() is renamed to get(), as in LLVM Use, though most
places can either use the implicit conversion to SDValue or the
convenience functions instead.
- Fix some more node vs. value terminology issues.
Also, eliminate the one remaining use of SDOperandPtr, and
SDOperandPtr itself.
llvm-svn: 62995
of each use in the SelectionDAG ReplaceAllUses* functions. Thanks
to Chris for spotting this opportunity.
Also, factor out code from all 5 of the ReplaceAllUses* functions
into AddNonLeafNodeToCSEMaps, which is now renamed
AddModifiedNodeToCSEMaps to more accurately reflect its purpose.
llvm-svn: 62964
testcase from PR3376, and in fact is sufficient to completely
avoid the problem in that testcase.
There's an underlying problem though; TLI.isOperationLegal
considers Custom to be Legal, which might be ok in some
cases, but that's what DAGCombiner is using in many places
to test if something is legal when LegalOperations is true.
When DAGCombiner is running after legalize, this isn't
sufficient. I'll address this in a separate commit.
llvm-svn: 62860
to "C ^ 1" is only valid when C is known to be either 0 or 1. Most of the
similar foldings in this function only handle "i1" types, but this one appears
intentionally written to handle larger integer types. If C has an integer
type larger than "i1", this needs to check if the high bits of a boolean
are known to be zero. I also changed the comment to describe this folding as
"C ^ 1" instead of "~C", since that is what the code does and since the latter
would only be valid for "i1" types. The good news is that most LLVM targets
use TargetLowering::ZeroOrOneBooleanContent so this change will not disable
the optimization; the bad news is that I've been unable to come up with a
testcase to demonstrate the problem.
I have also removed a "FIXME" comment for folding "select C, X, 0" to "C & X",
since the code looks correct to me. It could be made more aggressive by not
limiting the type to "i1", but that would then require checking for
TargetLowering::ZeroOrNegativeOneBooleanContent. Similar changes could be
done for the other SELECT foldings, but it was decided to be not worth the
trouble and complexity (see e.g., r44663).
llvm-svn: 62790
Simplify x+0 to x in unsafe-fp-math mode. This avoids a bunch of
redundant work in many cases, because in unsafe-fp-math mode,
ISD::FADD with a constant is considered free to negate, so the
DAGCombiner often negates x+0 to -0-x thinking it's free, when
in reality the end result is -x, which is more expensive than x.
Also, combine x*0 to 0.
This fixes PR3374.
llvm-svn: 62789
special cases after producing the new reduced-width load, because the
new load already has the needed adjustments built into it. This fixes
several bugs due to the special cases, including PR3317.
llvm-svn: 62692
- Ensure that (operation) legalization emits proper FDIV libcall when needed.
- Fix various bugs encountered during llvm-spu-gcc build, along with various
cleanups.
- Start supporting double precision comparisons for remaining libgcc2 build.
Discovered interesting DAGCombiner feature, which is currently solved via
custom lowering (64-bit constants are not legal on CellSPU, but DAGCombiner
insists on inserting one anyway.)
- Update README.
llvm-svn: 62664
SDNode subclasses to keep state that requires non-trivial
destructors, however it was already effectively impossible,
since the destructor isn't actually ever called. There currently
aren't any SDNode subclasses affected by this, and in general
it's desireable to keep SDNode objects light-weight.
This eliminates the last virtual member function in the SDNode
class, so it eliminates the need for a vtable pointer, making
SDNode smaller.
llvm-svn: 62539
uses are added to the From node while it is processing From's
use list, because of automatic local CSE. The fix is to avoid
visiting any new uses.
Fix a few places in the DAGCombiner that assumed that after
a RAUW call, the From node has no users and may be deleted.
This fixes PR3018.
llvm-svn: 62533
and every other instruction in their blocks to keep the terminator
instructions at the end, teach the post-RA scheduler how to operate
on ranges of instructions, and exclude terminators from the range
of instructions that get scheduled.
Also, exclude mid-block labels, such as EH_LABEL instructions, and
schedule code before them separately from code after them. This
fixes problems with the post-RA scheduler moving code past
EH_LABELs.
llvm-svn: 62366
a new toy hazard recognizier heuristic which attempts to direct the
scheduler to avoid clumping large groups of loads or stores too densely.
llvm-svn: 62291
and into the ScheduleDAGInstrs class, so that they don't get
destructed and re-constructed for each block. This fixes a
compile-time hot spot in the post-pass scheduler.
To help facilitate this, tidy and do some minor reorganization
in the scheduler constructor functions.
llvm-svn: 62275
scheduling dependencies. Add assertion checks to help catch
this.
It appears the Mips target defaults to list-td, and it has a
regression test that uses a physreg dependence. Such code was
liable to be miscompiled, and now evokes an assertion failure.
llvm-svn: 62177
aggregate types. Don't increment the current index after reaching
the end of a struct, as it will already be pointing at
one-past-the end. This fixes PR3288.
llvm-svn: 61828
AddPseudoTwoAddrDeps. This lets the scheduling infrastructure
avoid recalculating node heights. In very large testcases this
was a major bottleneck. Thanks to Roman Levenstein for finding
this!
As a side effect, fold-pcmpeqd-0.ll is now scheduled better
and it no longer requires spilling on x86-32.
llvm-svn: 61778
instructions to avoid copies, because TwoAddressInstructionPass
also does this optimization. The scheduler's version didn't
account for live-out values, which resulted in spurious commutes
and missed opportunities.
Now, TwoAddressInstructionPass handles all the opportunities,
instead of just those that the scheduler missed. The result is
usually the same, though there are occasional trivial differences
resulting from the avoidance of spurious commutes.
llvm-svn: 61611
promote from i1 all the way up to the canonical SetCC type.
In order to discover an appropriate type to use, pass
MVT::Other to getSetCCResultType. In order to be able to
do this, change getSetCCResultType to take a type as an
argument, not a value (this is also more logical).
llvm-svn: 61542
This removes all the _8, _16, _32, and _64 opcodes and replaces each
group with an unsuffixed opcode. The MemoryVT field of the AtomicSDNode
is now used to carry the size information. In tablegen, the size-specific
opcodes are replaced by size-independent opcodes that utilize the
ability to compose them with predicates.
This shrinks the per-opcode tables and makes the code that handles
atomics much more concise.
llvm-svn: 61389
temporary workaround for an obscure bug. When node cloning is
used, it is possible that more SUnits will be created, and
if the SUnits std::vector has to reallocate, it will
invalidate all the graph edges.
llvm-svn: 61122
computation code. Also, avoid adding output-depenency edges when both
defs are dead, which frequently happens with EFLAGS defs.
Compute Depth and Height lazily, and always in terms of edge latency
values. For the schedulers that don't care about latency, edge latencies
are set to 1.
Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array.
These are all subsumed by the Depth and Height fields.
llvm-svn: 61073
width register load followed by a truncating
store for the copy, since the load will not place
the value in the lower bits. Probably partial
loads/stores can never happen here, but fix it
anyway.
llvm-svn: 60972
use of illegal integer types: instead, use a stack slot
and copying via integer registers. The existing code
is still used if the bitconvert is to a legal integer
type.
This fires on the PPC testcases 2007-09-08-unaligned.ll
and vec_misaligned.ll. It looks like equivalent code
is generated with these changes, just permuted, but
it's hard to tell.
With these changes, nothing in LegalizeDAG produces
illegal integer types anymore. This is a prerequisite
for removing the LegalizeDAG type legalization code.
While there I noticed that the existing code doesn't
handle trunc store of f64 to f32: it turns this into
an i64 store, which represents a 4 byte stack smash.
I added a FIXME about this. Hopefully someone more
motivated than I am will take care of it.
llvm-svn: 60964
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with a associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.
Similar for SUB and MUL instructions.
llvm-svn: 60915
for promoted integer types, eg: i16 on ppc-32, or
i24 on any platform. Complete support for arbitrary
precision integers would require handling expanded
integer types, eg: i128, but I couldn't be bothered.
llvm-svn: 60834
The Cost field is removed. It was only being used in a very limited way,
to indicate when the scheduler should attempt to protect a live register,
and it isn't really needed to do that. If we ever want the scheduler to
start inserting copies in non-prohibitive situations, we'll have to
rethink some things anyway.
A Latency field is added. Instead of giving each node a single
fixed latency, each edge can have its own latency. This will eventually
be used to model various micro-architecture properties more accurately.
The PointerIntPair class and an internal union are now used, which
reduce the overall size.
llvm-svn: 60806
essential problem was that the DAG can contain
random unused nodes which were never analyzed.
When remapping a value of a node being processed,
such a node may become used and need to be analyzed;
however due to operands being transformed during
analysis the node may morph into a different one.
Users of the morphing node need to be updated, and
this wasn't happening. While there I added a bunch
of documentation and sanity checks, so I (or some
other poor soul) won't have to scratch their head
over this stuff so long trying to remember how it
was all supposed to work next time some obscure
problem pops up! The extra sanity checking exposed
a few places where invariants weren't being preserved,
so those are fixed too. Since some of the sanity
checking is expensive, I added a flag to turn it
on. It is also turned on when building with
ENABLE_EXPENSIVE_CHECKS=1.
llvm-svn: 60797
Fix the shift amount when unrolling a vector shift into scalar shifts.
Fix problem in getShuffleScalarElt where it assumes that the input of
a bit convert must be a vector.
llvm-svn: 60740
and use it in x86 address mode folding. Also, make
getRegForValue return 0 for illegal types even if it has a
ValueMap for them, because Argument values are put in the
ValueMap. This fixes PR3181.
llvm-svn: 60696
1. ppcf128 select is expanded to f64 select's.
2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend.
3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed.
4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map.
5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert.
Duncan, please take a look. Thanks.
llvm-svn: 60443
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
llvm-svn: 60348
multiplies.
Some more cleverness would be nice, though. It would be nice if we
could do this transformation on illegal types. Also, we would
prefer a narrower constant when possible so that we can use a narrower
multiply, which can be cheaper.
llvm-svn: 60283
introduce any new spilling; it just uses unused registers.
Refactor the SUnit topological sort code out of the RRList scheduler and
make use of it to help with the post-pass scheduler.
llvm-svn: 59999
(this doesn't happen that often, since most code
does not use illegal types) then follow it by a
DAG combiner run that is allowed to generate
illegal operations but not illegal types. I didn't
modify the target combiner code to distinguish like
this between illegal operations and illegal types,
so it will not produce illegal operations as well
as not producing illegal types.
llvm-svn: 59960
"It simplifies the type legalization part a bit, and produces better code by
teaching SelectionDAG about the extra bits in an i8 SADDO/UADDO node. In
essence, I spontaneously decided that on x86 this i8 boolean result would be
either 0 or 1, and on other platforms 0/1 or 0/-1, depending on whether the
platform likes it's boolean zero extended or sign extended."
llvm-svn: 59864
g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic'
make[3]: *** [llvm-convert.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2
llvm-svn: 59809
"ISD::ADDO". ISD::ADDO is lowered into a target-independent form that does the
addition and then checks if the result is less than one of the operands. (If it
is, then there was an overflow.)
llvm-svn: 59779
schedulers. This doesn't have much immediate impact because
targets that use these schedulers by default don't yet provide
pipeline information.
This code also didn't have the benefit of register pressure
information. Also, removing it will avoid problems with list-burr
suddenly starting to do latency-oriented scheduling on x86 when we
start providing pipeline data, which would increase spilling.
llvm-svn: 59775
some of the latency computation logic out of the SDNode
ScheduleDAG code into a TargetInstrItineraries helper method
to help with this.
llvm-svn: 59761
the RR scheduler actually does look at latency values, but it
doesn't use a hazard recognizer so it has no way to know when
a no-op is needed, as opposed to just stalling and incrementing
the cycle count.
llvm-svn: 59759
dedicated "fast" scheduler in -fast mode instead, which is
faster. This speeds up llc -fast by a few percent on some
testcases -- the speedup only happens for code not handled by
fast-isel.
llvm-svn: 59700
is currently off by default, and can be enabled with
-disable-post-RA-scheduler=false.
This doesn't have a significant impact on most code yet because it doesn't
yet do anything to address anti-dependencies and it doesn't attempt to
disambiguate memory references. Also, several popular targets
don't have pipeline descriptions yet.
The majority of the changes here are splitting the SelectionDAG-specific
code out of ScheduleDAG, so that ScheduleDAG can be moved to
libLLVMCodeGen.a. The interface between ScheduleDAG-using code and
the rest of the scheduling code is somewhat rough and will evolve.
llvm-svn: 59676
and FP_ROUND. Not sure what these were doing
here - probably they were sometimes (wrongly)
created with integer operands somewhere that
has since been fixed.
llvm-svn: 59548
SCALAR_TO_VECTOR. I didn't add the testcase, because
once llc gets past scalar-to-vector it hits a SPU target
lowering bug and explodes.
llvm-svn: 59530
new CycleBound value. Instead, just update CycleBound on each call.
Also, make ReleasePred and ReleaseSucc methods more consistent accross
the various schedulers.
This also happens to make ScheduleDAGRRList's CycleBound computation
somewhat more interesting, though it still doesn't have any noticeable
effect, because no current targets that use the register-pressure
reduction scheduler provide pipeline models.
llvm-svn: 59475
to carry a SmallVector of flagged nodes, just calculate the flagged nodes
dynamically when they are needed.
The local-liveness change is due to a trivial scheduling change where
the scheduler arbitrary decision differently.
llvm-svn: 59273
special-purpose hook to a new pass. Also, add check to see if any
x87 virtual registers are used, to avoid doing any work in the
common case that no x87 code is needed.
llvm-svn: 59190
that it no longer handles non-power-of-two vectors.
However it previously only handled them sometimes,
depending on obscure numerical relationships between
the index and vector type. For example, for a vector
of length 6, it would succeed if and only if the
index was an even multiple of 6. I consider this
more confusing than useful.
llvm-svn: 59122
before creating the SUnit for the operation that it was unfolded from. This
allows each SUnit to have all of its predecessor SUnits available at the time
it is created. I don't know yet if this will be absolutely required, but it
is a little tidier to do it this way.
llvm-svn: 59083
The CC was changed, but wasn't checked to see if it was legal if the DAG
combiner was being run after legalization. Threw in a couple of checks just to
make sure that it's okay. As far as the PR is concerned, no back-end target
actually exhibited this problem, so there isn't an associated testcase.
llvm-svn: 59035
support targets that support these conversions. Users should avoid using
this node as the current targets don't generating code for it.
llvm-svn: 59001
inform the optimizers that the result must be zero/
sign extended from the smaller type. For example,
if a fp to unsigned i16 is promoted to fp to i32,
then we are allowed to assume that the extra 16 bits
are zero (because the result of fp to i16 is undefined
if the result does not fit in an i16). This is
quite aggressive, but should help the optimizers
produce better code. This requires correcting a
test which thought that fp_to_uint is some kind
of truncation, which it is not: in the testcase
(which does fp to i1), either the fp value converts
to 0 or 1 or the result is undefined, which is
quite different to truncation.
llvm-svn: 58991
This is Chris' patch from the PR, modified to realize that
SETUGT/SETULT occur legitimately with integers, plus
two fixes in LegalizeDAG to pass a valid result type into
LegalizeSetCC. The argument of TLI.getSetCCResultType is
ignored on PPC, but I think I'm following usage elsewhere.
llvm-svn: 58871
the condition for a BRCOND, according to what is
returned by getSetCCResultContents. Since all
targets return the same thing (ZeroOrOneSetCCResult),
this should be harmless! The point is that all over
the place the result of SETCC is fed directly into
BRCOND. On machines for which getSetCCResultContents
returns ZeroOrNegativeOneSetCCResult, this is a
sign-extended boolean. So it seems dangerous to
also feed BRCOND zero-extended booleans in some
circumstances - for example, when promoting the
condition.
llvm-svn: 58861
(e.g. a bitfield test) narrow the load as much as possible.
The has the potential to avoid unnecessary partial-word
load-after-store conflicts, which cause stalls on several targets.
Also a size win on x86 (testb vs testl).
llvm-svn: 58825
LLVM IR code and not in the selection DAG ISel. This is a cleaner solution.
- Fix the heuristic for determining if protectors are necessary. The previous
one wasn't checking the proper type size.
llvm-svn: 58824
- stackprotector_prologue creates a stack object and stores the guard there.
- stackprotector_epilogue reads the stack guard from the stack position created
by stackprotector_prologue.
- The PrologEpilogInserter was changed to make sure that the stack guard is
first on the stack frame.
llvm-svn: 58791
sized integers like i129, and also reduce the number
of assumptions made about how vaarg is implemented.
This still doesn't work correctly for small integers
like (eg) i1 on x86, since x86 passes each of them
(essentially an i8) in a 4 byte stack slot, so the
pointer needs to be advanced by 4 bytes not by 1 byte
as now. But this is no longer a LegalizeTypes problem
(it was also wrong in LT before): it is a bug in the
operation expansion in LegalizeDAG: now LegalizeTypes
turns an i1 vaarg into an i8 vaarg which would work
fine if only the i8 vaarg was turned into correct code
later.
llvm-svn: 58635
type for the shift amount type. Add a check
that shifts and rotates use the type returned
by getShiftAmountTy for the amount. This
exposed some problems in CellSPU and PPC,
which have already been fixed.
llvm-svn: 58455
other day that PPC custom lowering could create
a BUILD_PAIR of two f64 with a result type of...
f64! - already fixed). Fix a place that triggers
the sanity check.
llvm-svn: 58378
is morphed by AnalyzeNewNode into a previously
processed node, and different result values of
that node are remapped to values with different
nodes, then we could end up using wrong values
here [we were assuming that all results remap
to values with the same underlying node]. This
seems theoretically possible, but I don't have
a testcase. The meat of the patch is in the
changes to AnalyzeNewNode/AnalyzeNewValue and
ReplaceNodeWith. While there, I changed names
like RemapNode to RemapValue, since it really
remaps values. To tell the truth, I would be
much happier if we were only remapping nodes
(it would simplify a bunch of logic, and allow
for some cute speedups) but I haven't yet worked
out how to do that.
llvm-svn: 58372
ppcf128 to i32 conversion and expand it into a code
sequence like in LegalizeDAG. This needs custom
ppc lowering of FP_ROUND_INREG, so turn that on and
make it work with LegalizeTypes. Probably PPC should
simply custom lower the original conversion.
llvm-svn: 58329
(and a bunch of other node types). While there, I
added a doNotCSE predicate and used it to reduce code
duplication (some of the duplicated code was wrong...).
This fixes ARM/cse-libcalls.ll when using LegalizeTypes.
llvm-svn: 58249
worklist twice: UpdateNodeOperands could morph
a new node into a node already on the worklist.
We would then recalculate the NodeId for this
existing node and add it to the worklist. The
testcase is ARM/cse-libcalls.ll, the problem
showing up once UpdateNodeOperands is taught to
do CSE for calls.
llvm-svn: 58246
may return i8, which can result in SELECT nodes for
which the type of the condition is i8, but there are
no patterns for select with i8 condition. Tweak the
LegalizeTypes logic to avoid this as much as possible.
This isn't a real fix because it is still perfectly
possible to end up with such select nodes - CellSPU
needs to be fixed IMHO.
llvm-svn: 57968
that is not of type MVT::i1 in SELECT and SETCC nodes.
Relax the LegalizeTypes SELECT condition promotion
sanity checks to allow other condition types than i1.
llvm-svn: 57966
to have a different type to the vector element
type. This should be fairly harmless because in
the past guys like this were being built all over
the place (and were cleaned up when I added this
check). The reason for relaxing this check is
that it helps LegalizeTypes legalize vector
shuffles: the mask is a BUILD_VECTOR that it is
*not always possible* to legalize while keeping it
a BUILD_VECTOR (vector_shuffle requires the mask
to be a BUILD_VECTOR, as opposed to a vector with
the right vector type). With this check it is even
harder to legalize the mask - turning the check off
means that LegalizeTypes manages to legalize almost
all vector shuffles encountered in practice. The
correct solution is to change vector_shuffle to be a
variadic node with the mask built into it as operands.
While waiting for that change, this hack stops the
problem with vector_shuffle from blocking the turning
on of LegalizeTypes.
llvm-svn: 57965
The same one Apple gcc uses, faster. Also gets the
extreme case in gcc.c-torture/execute/ieee/rbug.c
correct which we weren't before; this is not
sufficient to get the test to pass though, there
is another bug.
llvm-svn: 57926
in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.
Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.
llvm-svn: 57885
for strange asm conditions earlier. In this case, we have a
double being passed in an integer reg class. Convert to like
sized integer register so that we allocate the right number
for the class (two i32's for the f64 in this case).
llvm-svn: 57862
result type when the result type is legal but
not the operand type. Add additional support
for EXTRACT_SUBVECTOR and CONCAT_VECTORS,
needed to handle such cases.
llvm-svn: 57840
elements. Otherwise LegalizeTypes will, reasonably
enough, legalize the mask, which may result in it
no longer being a BUILD_VECTOR node (LegalizeDAG
simply ignores the legality or not of vector masks).
llvm-svn: 57782
the previous patch this one actually passes make check.
"Fix PR2356 on PowerPC: if we have an input and output that are tied together
that have different sizes (e.g. i32 and i64) make sure to reserve registers for
the bigger operand."
llvm-svn: 57771
and add a TargetLowering hook for it to use to determine when this
is legal (i.e. not in PIC mode, etc.)
This allows instruction selection to emit folded constant offsets
in more cases, such as the included testcase, eliminating the need
for explicit arithmetic instructions.
This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp
that attempted to achieve the same effect, but wasn't as effective.
Also, fix handling of offsets in GlobalAddressSDNodes in several
places, including changing GlobalAddressSDNode's offset from
int to int64_t.
The Mips, Alpha, Sparc, and CellSPU targets appear to be
unaware of GlobalAddress offsets currently, so set the hook to
false on those targets.
llvm-svn: 57748
shift counts, and patterns that match dynamic shift counts
when the subtract is obscured by a truncate node.
Add DAGCombiner support for recognizing rotate patterns
when the shift counts are defined by truncate nodes.
Fix and simplify the code for commuting shld and shrd
instructions to work even when the given instruction doesn't
have a parent, and when the caller needs a new instruction.
These changes allow LLVM to use the shld, shrd, rol, and ror
instructions on x86 to replace equivalent code using two
shifts and an or in many more cases.
llvm-svn: 57662
i.e. conditions that cannot be checked with a single instruction. For example,
SETONE and SETUEQ on x86.
- Teach legalizer to implement *illegal* setcc as a and / or of a number of
legal setcc nodes. For now, only implement FP conditions. e.g. SETONE is
implemented as SETO & SETNE, SETUEQ is SETUO | SETEQ.
- Move x86 target over.
llvm-svn: 57542
- Move the EH landing-pad code and adjust it so that it works
with FastISel as well as with SDISel.
- Add FastISel support for @llvm.eh.exception and
@llvm.eh.selector.
llvm-svn: 57539
instead of requiring all "short description" strings to begin with
two spaces. This makes these strings less mysterious, and it fixes
some cases where short description strings mistakenly did not
begin with two spaces.
llvm-svn: 57521
than the type an i1 is promoted to (eg: i8). Account
for this. Noticed by Tilmann Scheller on CellSPU; he
will hopefully take care of fixing this in LegalizeDAG
and adding a testcase!
llvm-svn: 56997
`-fno-builtin' flag. Currently, it's used to replace "memset" with "_bzero"
instead of "__bzero" on Darwin10+. This arguably violates the meaning of this
flag, but is currently sufficient. The meaning of this flag should become more
specific over time.
llvm-svn: 56885
Completely eliminate the TopOrder std::vector. Instead, sort
the AllNodes list in place. This also eliminates the need to
call AllNodes.size(), a linear-time operation, before
performing the sort.
Also, eliminate the Sources temporary std::vector, since it
essentially duplicates the sorted result as it is being
built.
This also changes the direction of the topological sort
from bottom-up to top-down. The AllNodes list starts out in
roughly top-down order, so this reduces the amount of
reordering needed. Top-down is also more convenient for
Legalize, and ISel needed only minor adjustments.
llvm-svn: 56867
its size). Adjust various lowering functions to
pass this info through from CallInst. Use it to
implement sseregparm returns on X86. Remove
X86_ssecall calling convention.
llvm-svn: 56677
s/ParamAttr/Attribute/g
s/PAList/AttrList/g
s/FnAttributeWithIndex/AttributeWithIndex/g
s/FnAttr/Attribute/g
This sets the stage
- to implement function notes as function attributes and
- to distinguish between function attributes and return value attributes.
This requires corresponding changes in llvm-gcc and clang.
llvm-svn: 56622
the SelectionDAG and DAGCombiner code. The only functionality change is that now
the DAG combiner is performing the constant folding for these operations instead
of being a no-op.
This is *not* in response to a bug, so there isn't a testcase.
llvm-svn: 56550
RA problem by expanding the live interval of an
earlyclobber def back one slot. Remove
overlap-earlyclobber throughout. Remove
earlyclobber bits and their handling from
live internals.
llvm-svn: 56539
copy of the BURRList scheduler, but with several parts ripped
out, such as backtracking, online topological sort maintenance
(needed by backtracking), the priority queue, and Sethi-Ullman
number computation and maintenance (needed by the priority
queue). As a result of all this, it generates somewhat lower
quality code, but that's its tradeoff for running about 30%
faster than list-burr in -fast mode in many cases.
This is somewhat experimental. Moving forward, major pieces of
this can be refactored with pieces in common with
ScheduleDAGRRList.cpp.
llvm-svn: 56307
with an earlyclobber operand elsewhere. Propagate
this bit and the earlyclobber bit through SDISel.
Change linear-scan RA not to allocate regs in a way
that conflicts with an earlyclobber. See also comments.
llvm-svn: 56290
ConstantPoolSDNode, using the target's preferred alignment for the
constant type.
In LegalizeDAG, when performing loads from the constant pool, the
ConstantPoolSDNode's alignment is used in the calls to getLoad and
getExtLoad.
This change prevents SelectionDAG::getLoad/getExtLoad from incorrectly
choosing the ABI alignment for constant pool loads when Alignment == 0.
The incorrect alignment is only a performance issue when ABI alignment
does not equal preferred alignment (i.e., on x86 it was generating
MOVUPS instead of MOVAPS for v4f32 constant loads when the default ABI
alignment for 128bit vectors is forced to 1 byte.)
Patch by Paul Redmond!
llvm-svn: 56253
- Add linkage to SymbolSDNode (default to external).
- Change ISD::ExternalSymbol to ISD::Symbol.
- Change ISD::TargetExternalSymbol to ISD::TargetSymbol
These changes pave the way to allowing SymbolSDNodes with non-external linkage.
llvm-svn: 56249
Currently it just holds the calling convention and flags
for isVarArgs and isTailCall.
And it has several utility methods, which eliminate magic
5+2*i and similar index computations in several places.
CallSDNodes are not CSE'd. Teach UpdateNodeOperands to handle
nodes that are not CSE'd gracefully.
llvm-svn: 56183
ConstantFP* instead of APInt and APFloat directly.
This reduces the amount of time to create ConstantSDNode
and ConstantFPSDNode nodes when ConstantInt* and ConstantFP*
respectively are already available, as is the case in
SelectionDAGBuild.cpp. Also, it reduces the amount of time
to legalize constants into constant pools, and the amount of
time to add ConstantFP operands to MachineInstrs, due to
eliminating ConstantInt::get and ConstantFP::get calls.
It increases the amount of work needed to create new constants
in cases where the client doesn't already have a ConstantInt*
or ConstantFP*, such as legalize expanding 64-bit integer constants
to 32-bit constants. And it adds a layer of indirection for the
accessor methods. But these appear to be outweight by the benefits
in most cases.
It will also make it easier to make ConstantSDNode and
ConstantFPNode more consistent with ConstantInt and ConstantFP.
llvm-svn: 56162
form:
powf(10.0f, x);
If this is the case, and also we want limited precision floating-point
calculations, then lower to do the limited-precision stuff.
llvm-svn: 56035
to being off by default. Also, add assertion checks to check that
the various fast-isel-related command-line options are only used
when -fast-isel itself is enabled.
llvm-svn: 56029