Roman Levenstein
358e04a185
Use a linked data structure for the uses lists of an SDNode, just like
...
LLVM Value/Use does and MachineRegisterInfo/MachineOperand does.
This allows constant time for all uses list maintenance operations.
The idea was suggested by Chris. Reviewed by Evan and Dan.
Patch is tested and approved by Dan.
On normal use-cases compilation speed is not affected. On very big basic
blocks there are compilation speedups in the range of 15-20% or even better.
llvm-svn: 48822
2008-03-26 12:39:26 +00:00
Bill Wendling
6306183df3
Use the bit size of the operand instead of the hard-coded 32 to generate the
...
mask.
llvm-svn: 48750
2008-03-24 23:16:37 +00:00
Duncan Sands
d97eea372a
Introduce a new node for holding call argument
...
flags. This is needed by the new legalize types
infrastructure which wants to expand the 64 bit
constants previously used to hold the flags on
32 bit machines. There are two functional changes:
(1) in LowerArguments, if a parameter has the zext
attribute set then that is marked in the flags;
before it was being ignored; (2) PPC had some bogus
code for handling two word arguments when using the
ELF 32 ABI, which was hard to convert because of
the bogusness. As suggested by the original author
(Nicolas Geoffray), I've disabled it for the moment.
Tested with "make check" and the Ada ACATS testsuite.
llvm-svn: 48640
2008-03-21 09:14:45 +00:00
Dan Gohman
b9056838d2
Add support for multiple return values for the PPC target by
...
converting call result lowering to use the CallingConvLowering
infastructure.
llvm-svn: 48552
2008-03-19 21:39:28 +00:00
Dale Johannesen
cf87e71053
Make Complex long long/double/long double work
...
in ppc64 mode.
llvm-svn: 48459
2008-03-17 17:11:08 +00:00
Dale Johannesen
92dcf1e0c2
Next round of PPC32 ABI changes. Allow for gcc
...
behavior where a callee thinks a param will be
present in memory, even though the ABI doc says
it doesn't have to be. Handle complex long long
and complex double (4 and 8 return regs).
llvm-svn: 48439
2008-03-17 02:13:43 +00:00
Dale Johannesen
0dfd3f33f7
Implement the real calling convention for ppc32 Altivec:
...
vectors go at the end of the memory area, after all
non-vector parameters.
llvm-svn: 48364
2008-03-14 17:41:26 +00:00
Dale Johannesen
6424bd0b11
Do not promote float params to double in varargs
...
calls here. This was done earlier for params in
the varargs part of the params; any float params
that survive to here are in the non-varargs part,
and must not be promoted.
llvm-svn: 48310
2008-03-12 23:40:01 +00:00
Dale Johannesen
0d98256c05
One more bit of Altivec parameter passing.
...
llvm-svn: 48269
2008-03-12 00:49:20 +00:00
Dale Johannesen
b28456ef18
Implement Altivec passing to varargs functions on ppc.
...
llvm-svn: 48264
2008-03-12 00:22:17 +00:00
Dan Gohman
d6819da453
Generalize ExpandIntToFP to handle the case where the operand is legal
...
and it's the result that requires expansion. This code is a little confusing
because the TargetLoweringInfo tables for [US]INT_TO_FP use the operand type
(the integer type) rather than the result type.
llvm-svn: 48206
2008-03-11 01:59:03 +00:00
Evan Cheng
ae2c56d93e
Default ISD::PREFETCH to expand.
...
llvm-svn: 48169
2008-03-10 19:38:10 +00:00
Scott Michel
a6729e8666
Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's
...
return ValueType can depend its operands' ValueType.
This is a cosmetic change, no functionality impacted.
llvm-svn: 48145
2008-03-10 15:42:14 +00:00
Nicolas Geoffray
b1de7a35f9
Add description of individual bits in CR. This fix PR1765.
...
llvm-svn: 48143
2008-03-10 14:12:10 +00:00
Dale Johannesen
4e622ec86d
Increase ISD::ParamFlags to 64 bits. Increase the ByValSize
...
field to 32 bits, thus enabling correct handling of ByVal
structs bigger than 0x1ffff. Abstract interface a bit.
Fixes gcc.c-torture/execute/pr23135.c and
gcc.c-torture/execute/pr28982b.c in gcc testsuite (were ICE'ing
on ppc32, quietly producing wrong code on x86-32.)
llvm-svn: 48122
2008-03-10 02:17:22 +00:00
Chris Lattner
aed9406bb7
Darwin PPC64 indirect call target goes in X12, not R12. This fixes these
...
two regression tests:
test/CodeGen/PowerPC/2007-10-21-LocalRegAllocAssert.ll
test/CodeGen/PowerPC/2007-10-21-LocalRegAllocAssert2.ll
llvm-svn: 48120
2008-03-09 20:49:33 +00:00
Dale Johannesen
21a8f144a6
More ppc32 byval handling (bug fixes). Things
...
are looking pretty good now.
llvm-svn: 48043
2008-03-08 01:41:42 +00:00
Evan Cheng
95cf661534
Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0|1|2} and prefetchnta instructions.
...
llvm-svn: 48042
2008-03-08 00:58:38 +00:00
Dan Gohman
d2ab5fd713
Add support for calls with i128 return values on ppc64.
...
llvm-svn: 48041
2008-03-08 00:19:12 +00:00
Bill Wendling
968f32ce3d
PPC64 passes arguments of integral type in i64 registers, not i32. Reflect this
...
by promoting smaller integral values (i32 at this point) to i64, then truncating
to get the wanted size.
llvm-svn: 48030
2008-03-07 20:49:02 +00:00
Dan Gohman
8d2ead2e34
Add support for lowering 128-bit shifts on ppc64.
...
llvm-svn: 48029
2008-03-07 20:36:53 +00:00
Dale Johannesen
bfa252d402
Next bits of PPC byval handling. Basically functional
...
but there are bugs.
llvm-svn: 48028
2008-03-07 20:27:40 +00:00
Dale Johannesen
0d235052d1
Next bit of PPC ByVal handling; call-site code seems
...
correct now.
llvm-svn: 47978
2008-03-05 23:31:27 +00:00
Dale Johannesen
85d41a1c44
Move PPC lowering functions into PPCTargetLowering
...
class (cosmetic). First piece of byval implementation;
this doesn't work yet. No functional change.
llvm-svn: 47917
2008-03-04 23:17:14 +00:00
Dale Johannesen
cbde4c2206
Interface of getByValTypeAlignment differed between
...
generic & x86 versions; change generic to follow x86
and improve comments. Add PPC version (not right
for non-Darwin.)
llvm-svn: 47734
2008-02-28 22:31:51 +00:00
Dan Gohman
26854f242e
Don't hard-code the mask size to be 32, which is incorrect on ppc64
...
and was causing aborts with the new APInt changes. This may also be
fixing an obscure ppc64 bug.
llvm-svn: 47692
2008-02-27 21:12:32 +00:00
Dan Gohman
f19609abe8
Convert the last remaining users of the non-APInt form of
...
ComputeMaskedBits to use the APInt form, and remove the
non-APInt form.
llvm-svn: 47654
2008-02-27 01:23:58 +00:00
Andrew Lenharth
fedcf477b5
I cannot find a libgcc function for this builtin. Therefor expanding it to a noop (which is how it use to be treated). If someone who knows the x86 backend better than me could tell me how to get a lock prefix on an instruction, that would be nice to complete x86 support.
...
llvm-svn: 47213
2008-02-16 14:46:26 +00:00
Nate Begeman
53e1b3f9d5
Change how FP immediates are handled.
...
1) ConstantFP is now expand by default
2) ConstantFP is not turned into TargetConstantFP during Legalize
if it is legal.
This allows ConstantFP to be handled like Constant, allowing for
targets that can encode FP immediates as MachineOperands.
As a bonus, fix up Itanium FP constants, which now correctly match,
and match more constants! Hooray.
llvm-svn: 47121
2008-02-14 08:57:00 +00:00
Dan Gohman
e1d9ee66ed
Simplify some logic in ComputeMaskedBits. And change ComputeMaskedBits
...
to pass the mask APInt by value, not by reference.
llvm-svn: 47096
2008-02-13 22:28:48 +00:00
Chris Lattner
f6518cf4ab
don't try to avoid inserting loads when lowering FORMAL_ARGUMENTS.
...
DAGCombine is now quite good at zapifying them.
llvm-svn: 47053
2008-02-13 07:35:30 +00:00
Dan Gohman
f990faf23b
Convert SelectionDAG::ComputeMaskedBits to use APInt instead of uint64_t.
...
Add an overload that supports the uint64_t interface for use by clients
that haven't been updated yet.
llvm-svn: 47039
2008-02-13 00:35:47 +00:00
Dan Gohman
16d4bc3dc0
Follow Chris' suggestion; change the PseudoSourceValue accessors
...
to return pointers instead of references, since this is always what
is needed.
llvm-svn: 46857
2008-02-07 18:41:25 +00:00
Dan Gohman
2d489b5081
Re-apply the memory operand changes, with a fix for the static
...
initializer problem, a minor tweak to the way the
DAGISelEmitter finds load/store nodes, and a renaming of the
new PseudoSourceValue objects.
llvm-svn: 46827
2008-02-06 22:27:42 +00:00
Evan Cheng
2aa360adf8
Unbreak ppc debug support.
...
llvm-svn: 46665
2008-02-02 05:06:29 +00:00
Evan Cheng
27b32b87ed
Revert 46556 and 46585. Dan please fix the PseudoSourceValue problem and re-commit.
...
llvm-svn: 46623
2008-01-31 21:00:00 +00:00
Dan Gohman
9ba4d76816
Rename ISD::FLT_ROUNDS to ISD::FLT_ROUNDS_ to avoid conflicting
...
with the real FLT_ROUNDS (defined in <float.h>).
llvm-svn: 46587
2008-01-31 00:41:03 +00:00
Dan Gohman
3646fdda67
Create a new class, MemOperand, for describing memory references
...
in the backend. Introduce a new SDNode type, MemOperandSDNode, for
holding a MemOperand in the SelectionDAG IR, and add a MemOperand
list to MachineInstr, and code to manage them. Remove the offset
field from SrcValueSDNode; uses of SrcValueSDNode that were using
it are all all using MemOperandSDNode now.
Also, begin updating some getLoad and getStore calls to use the
PseudoSourceValue objects.
Most of this was written by Florian Brander, some
reorganization and updating to TOT by me.
llvm-svn: 46585
2008-01-31 00:25:39 +00:00
Evan Cheng
29cfb67e28
Even though InsertAtEndOfBasicBlock is an ugly hack it still deserves a proper name. Rename it to EmitInstrWithCustomInserter since it does not necessarily insert
...
instruction at the end.
llvm-svn: 46562
2008-01-30 18:18:23 +00:00
Dan Gohman
47a7d6fafe
Factor the addressing mode and the load/store VT out of LoadSDNode
...
and StoreSDNode into their common base class LSBaseSDNode. Member
functions getLoadedVT and getStoredVT are replaced with the common
getMemoryVT to simplify code that will handle both loads and stores.
llvm-svn: 46538
2008-01-30 00:15:11 +00:00
Duncan Sands
95d46ef887
The last pieces needed for loading arbitrary
...
precision integers. This won't actually work
(and most of the code is dead) unless the new
legalization machinery is turned on. While
there, I rationalized the handling of i1, and
removed some bogus (and unused) sextload patterns.
For i1, this could result in microscopically
better code for some architectures (not X86).
It might also result in worse code if annotating
with AssertZExt nodes turns out to be more harmful
than helpful.
llvm-svn: 46280
2008-01-23 20:39:46 +00:00
Dale Johannesen
5c94cb3596
Implement flt_rounds for PowerPC.
...
llvm-svn: 46174
2008-01-18 19:55:37 +00:00
Chris Lattner
87757d38b3
get symbolic information for ppc ldbl nodes.
...
llvm-svn: 46165
2008-01-18 18:51:16 +00:00
Chris Lattner
f5b46f7dad
Fix a latent bug exposed by my truncstore patch. We compiled stfiwx-2.ll to:
...
_test:
fctiwz f0, f1
stfiwx f0, 0, r4
blr
instead of:
_test:
fctiwz f0, f1
stfd f0, -8(r1)
nop
nop
lwz r2, -4(r1)
stb r2, 0(r4)
blr
The former is not correct (stores 4 bytes, not 1).
llvm-svn: 46161
2008-01-18 16:54:56 +00:00
Chris Lattner
1ea55cf816
This commit changes:
...
1. Legalize now always promotes truncstore of i1 to i8.
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
safe.
The later allows us to compile CodeGen/X86/storetrunc-fp.ll to:
_foo:
fldt 20(%esp)
fldt 4(%esp)
faddp %st(1)
movl 36(%esp), %eax
fstps (%eax)
ret
instead of:
_foo:
subl $4, %esp
fldt 24(%esp)
fldt 8(%esp)
faddp %st(1)
fstps (%esp)
movl 40(%esp), %eax
movss (%esp), %xmm0
movss %xmm0, (%eax)
addl $4, %esp
ret
llvm-svn: 46140
2008-01-17 19:59:44 +00:00
Chris Lattner
72733e573b
* Introduce a new SelectionDAG::getIntPtrConstant method
...
and switch various codegen pieces and the X86 backend over
to using it.
* Add some comments to SelectionDAGNodes.h
* Introduce a second argument to FP_ROUND, which indicates
whether the FP_ROUND changes the value of its input. If
not it is safe to xform things like fp_extend(fp_round(x)) -> x.
llvm-svn: 46125
2008-01-17 07:00:52 +00:00
Duncan Sands
53c954fa86
Output sinl for a long double FSIN node, not sin.
...
Likewise fix up a bunch of other libcalls. While
there I remove NEG_F32 and NEG_F64 since they are
not used anywhere. This fixes 9 Ada ACATS failures.
llvm-svn: 45833
2008-01-10 10:28:30 +00:00
Chris Lattner
a10fff51d9
Rename SSARegMap -> MachineRegisterInfo in keeping with the idea
...
that "machine" classes are used to represent the current state of
the code being compiled. Given this expanded name, we can start
moving other stuff into it. For now, move the UsedPhysRegs and
LiveIn/LoveOuts vectors from MachineFunction into it.
Update all the clients to match.
This also reduces some needless #includes, such as MachineModuleInfo
from MachineFunction.
llvm-svn: 45467
2007-12-31 04:13:23 +00:00
Chris Lattner
f3ebc3f3d2
Remove attribution from file headers, per discussion on llvmdev.
...
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Nicolas Geoffray
31a2c3948e
Fix unintented change from last commit
...
llvm-svn: 45282
2007-12-21 12:22:29 +00:00
Nicolas Geoffray
80c741e160
Enable EH for linux/ppc32 targets
...
llvm-svn: 45281
2007-12-21 12:19:44 +00:00
Chris Lattner
f6a8156e4f
implement __builtin_return_addr(0) on ppc.
...
llvm-svn: 44700
2007-12-08 06:59:59 +00:00
Chris Lattner
57ee7c6630
Implement ExpandOperationResult for ppc i64 fp->int, which fixes
...
CodeGen/Generic/fp_to_int.ll among others. Its unclear why this
just started failing...
llvm-svn: 44407
2007-11-28 18:44:47 +00:00
Bill Wendling
f359fed9f9
Unify CALLSEQ_{START,END}. They take 4 parameters: the chain, two stack
...
adjustment fields, and an optional flag. If there is a "dynamic_stackalloc" in
the code, make sure that it's bracketed by CALLSEQ_START and CALLSEQ_END. If
not, then there is the potential for the stack to be changed while the stack's
being used by another instruction (like a call).
This can only result in tears...
llvm-svn: 44037
2007-11-13 00:44:25 +00:00
Dale Johannesen
10f4152471
Disable a couple more things for ppcf128.
...
llvm-svn: 43267
2007-10-23 23:20:14 +00:00
Evan Cheng
bdbed66333
Use ptr type in the immediate field of a BxA instruction so we don't end up selecting 32-bit call instruction for ppc64.
...
llvm-svn: 43228
2007-10-22 19:46:19 +00:00
Chris Lattner
b193576bc6
comment fixes
...
llvm-svn: 43168
2007-10-19 04:08:28 +00:00
Dale Johannesen
10432e5a67
More ppcf128 issues (maybe the last)?
...
llvm-svn: 43160
2007-10-19 00:59:18 +00:00
Chris Lattner
06a4954e6e
Change LowerFP_TO_SINT to create the specific code it needs instead of
...
unconditionally creating an i64 bitcast. With the future legalizer
design, operation legalization can't introduce new nodes with illegal
types.
This fixes the rest of olden on ppc32.
llvm-svn: 43005
2007-10-15 20:14:52 +00:00
Dale Johannesen
2f6b6d6fb0
Fix type mismatch error in PPC Altivec (only causes
...
a problem when asserts are on). From vecLib.
llvm-svn: 42959
2007-10-14 01:58:32 +00:00
Dan Gohman
9013eaff9a
Mark vector pow, ctpop, cttz, and ctlz as Expand on PowerPC.
...
llvm-svn: 42904
2007-10-12 14:08:57 +00:00
Dan Gohman
482732af9d
Set ISD::FPOW to Expand.
...
llvm-svn: 42881
2007-10-11 23:21:31 +00:00
Dale Johannesen
666323eacd
Next PPC long double bits: ppcf128->i32 conversion.
...
Surprisingly complicated.
Adds getTargetNode for 2 outputs, no inputs (missing).
llvm-svn: 42822
2007-10-10 01:01:31 +00:00
Dan Gohman
71f0d7d76e
When we start enabling SMUL_LOHI/UMUL_LOHI or SDIVREM/UDIVREM in
...
target-indepenent lowering, don't use them on PowerPC.
llvm-svn: 42755
2007-10-08 17:28:24 +00:00
Dale Johannesen
f864ac96d8
Next powerpc long double bits. Comparisons work,
...
although not well, and shortening FP converts.
llvm-svn: 42672
2007-10-06 01:24:11 +00:00
Dale Johannesen
c0154c06d6
First round of ppc long double. call/return and
...
basic arithmetic works.
Rename RTLIB long double functions to distinguish
different flavors of long double; the lib functions
have different names, alas.
llvm-svn: 42644
2007-10-05 20:04:43 +00:00
Duncan Sands
86e0119822
Fold the adjust_trampoline intrinsic into
...
init_trampoline. There is now only one
trampoline intrinsic.
llvm-svn: 41841
2007-09-11 14:10:23 +00:00
Owen Anderson
e2f23a3abf
Add lengthof and endof templates that hide a lot of sizeof computations.
...
Patch by Sterling Stein!
llvm-svn: 41758
2007-09-07 04:06:50 +00:00
Dale Johannesen
3cf889f75e
Enhance APFloat to retain bits of NaNs (fixes oggenc).
...
Use APFloat interfaces for more references, mostly
of ConstantFPSDNode.
llvm-svn: 41632
2007-08-31 04:03:46 +00:00
Bill Wendling
10e18dea2a
Use i64 on a PPC64 machine
...
llvm-svn: 41590
2007-08-30 00:59:19 +00:00
Chris Lattner
d8c9cb9182
rename isOperandValidForConstraint to LowerAsmOperandForConstraint,
...
changing the interface to allow for future changes.
llvm-svn: 41384
2007-08-25 00:47:38 +00:00
Evan Cheng
581d2795dc
Vector fneg must be expanded into fsub -0.0, X.
...
llvm-svn: 40586
2007-07-30 07:51:22 +00:00
Duncan Sands
644f917358
Support for trampolines, except for X86 codegen which is
...
still under discussion.
llvm-svn: 40549
2007-07-27 12:58:54 +00:00
Lauro Ramos Venancio
09d73c0305
Assert when TLS is not implemented.
...
llvm-svn: 39737
2007-07-11 17:19:51 +00:00
Dan Gohman
f8f531bf69
Change getCopyToParts and getCopyFromParts to always use target-endian
...
register ordering, for both physical and virtual registers. Update the PPC
target lowering for calls to expect registers for the call result to
already be in target order.
llvm-svn: 38471
2007-07-09 20:59:04 +00:00
Dan Gohman
309d3d51b3
Move ComputeMaskedBits, MaskedValueIsZero, and ComputeNumSignBits from
...
TargetLowering to SelectionDAG so that they have more convenient
access to the current DAG, in preparation for the ValueType routines
being changed from standalone functions to members of SelectionDAG for
the pre-legalize vector type changes.
llvm-svn: 37704
2007-06-22 14:59:07 +00:00
Chris Lattner
7936d91f70
describe an argument, hide it.
...
llvm-svn: 37650
2007-06-19 05:46:06 +00:00
Chris Lattner
944200be45
If a function is vararg, never pass inreg arguments in registers. Thanks to
...
Anton for half of this patch.
llvm-svn: 37641
2007-06-19 00:13:10 +00:00
Dan Gohman
5c4413120f
Rename MVT::getVectorBaseType to MVT::getVectorElementType.
...
llvm-svn: 37579
2007-06-14 22:58:02 +00:00
Dan Gohman
c12dd5207d
Apply this patch:
...
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070514/049845.html
llvm-svn: 37240
2007-05-18 23:21:46 +00:00
Chris Lattner
0b7472da6f
fix some subtle inline asm selection issues
...
llvm-svn: 37067
2007-05-15 01:31:05 +00:00
Chris Lattner
19ccd6226c
Fix a bug in PPCTargetLowering::isLegalAddressingMode, scales other than 0/1/2
...
are always unsupported.
llvm-svn: 35835
2007-04-09 22:10:05 +00:00
Nicolas Geoffray
23710a7da3
Starting implementation of the ELF32 ABI specification of varargs handling.
...
LowerVASTART emits the right code if the subtarget is ELF32, the other intrinsics
(VAARG, VACOPY and VAEND) are not yet implemented.
llvm-svn: 35625
2007-04-03 13:59:52 +00:00
Nicolas Geoffray
b3e99a18ee
The PPC64 ELF ABI is "intended to use the same structure layout and calling convention rules
...
as the 64-bit PowerOpen ABI" (Reference http://www.linux-foundation.org/spec/ELF/ppc64/ ).
Change all ELF tests to ELF32.
llvm-svn: 35624
2007-04-03 12:35:28 +00:00
Nicolas Geoffray
fbfc451ba9
The ELF ABI specifies F1-F8 registers as argument registers for double, not
...
F1-F10. This affects only ELF, not MachO.
llvm-svn: 35622
2007-04-03 10:27:07 +00:00
Chris Lattner
1eb94d973a
implement the new addressing mode description hook.
...
llvm-svn: 35521
2007-03-30 23:15:24 +00:00
Lauro Ramos Venancio
682baf2dda
"The C standards do say that "char" may either be a "signed char" or "unsigned
...
char" and it is up to the compilers implementation or the platform which is
followed."
http://www.arm.linux.org.uk/docs/faqs/signedchar.php
llvm-svn: 35382
2007-03-27 16:33:08 +00:00
Chris Lattner
d685514e2e
switch TargetLowering::getConstraintType to take the entire constraint,
...
not just the first letter. No functionality change.
llvm-svn: 35322
2007-03-25 02:14:49 +00:00
Nicolas Geoffray
7aad92868c
Stack and register alignment of call arguments in the ELF ABI
...
llvm-svn: 35083
2007-03-13 15:02:46 +00:00
Evan Cheng
b9dce9db85
More flexible TargetLowering LSR hooks for testing whether an immediate is a legal target address immediate or scale.
...
llvm-svn: 35074
2007-03-12 23:29:01 +00:00
Chris Lattner
4f2e4e0f92
Switch PPC return lower to use an autogenerated CC description.
...
llvm-svn: 34940
2007-03-06 00:59:59 +00:00
Nicolas Geoffray
75ab9799df
Implemented the frameaddress intrinsic for PPC.
...
llvm-svn: 34787
2007-03-01 13:11:38 +00:00
Nicolas Geoffray
89d81878d2
Differentiate between the MachO and the ELF ABI the CALL instruction.
...
llvm-svn: 34667
2007-02-27 13:01:19 +00:00
Chris Lattner
535bd6d3ba
always lower to RETFLAG, never leave it as just ret.
...
llvm-svn: 34639
2007-02-26 19:44:02 +00:00
Chris Lattner
1ee61ab414
no really, this is the right patch
...
llvm-svn: 34605
2007-02-25 20:01:40 +00:00
Chris Lattner
4d2f5f8740
always promote float varargs to double.
...
llvm-svn: 34604
2007-02-25 19:59:18 +00:00
Chris Lattner
43df5b335c
implement support for the linux/ppc function call ABI. Patch by
...
Nicolas Geoffray!
llvm-svn: 34574
2007-02-25 05:34:32 +00:00
Jim Laskey
e0008e23cf
Simplify lowering and selection of exception ops.
...
llvm-svn: 34488
2007-02-22 14:56:36 +00:00
Jim Laskey
3796abea0f
Support to provide exception and selector registers.
...
llvm-svn: 34482
2007-02-21 22:54:50 +00:00
Chris Lattner
1f7d60262e
Fix ixaddrs as well, allowing ppc64 to compile to:
...
_test2:
li r2, 0
lis r3, 1
std r2, 9024(r3)
blr
instead of:
_test2:
lis r2, 1
li r3, 0
ori r2, r2, 9024
std r3, 0(r2)
blr
This implements CodeGen/PowerPC/LargeAbsoluteAddr.ll:test2
llvm-svn: 34373
2007-02-17 06:57:26 +00:00
Chris Lattner
4a9c0bb147
Compile test/CodeGen/PowerPC/LargeAbsoluteAddr.ll to:
...
_test:
lis r2, 743
li r3, 0
stw r3, 32751(r2)
blr
instead of:
_test:
li r2, 0
stw r2, 32751(48693248)
blr
Implement support for ppc64 as well, allowing it to produce better code.
llvm-svn: 34371
2007-02-17 06:44:03 +00:00
Nate Begeman
eda5997cc8
Finish off bug 680, allowing targets to custom lower frame and return
...
address nodes.
llvm-svn: 33636
2007-01-29 22:58:52 +00:00
Anton Korobeynikov
037c867b54
Propagate changes from my local tree. This patch includes:
...
1. New parameter attribute called 'inreg'. It has meaning "place this
parameter in registers, if possible". This is some generalization of
gcc's regparm(n) attribute. It's currently used only in X86-32 backend.
2. Completely rewritten CC handling/lowering code inside X86 backend.
Merged stdcall + c CCs and fastcall + fast CC.
3. Dropped CSRET CC. We cannot add struct return variant for each
target-specific CC (e.g. stdcall + csretcc and so on).
4. Instead of CSRET CC introduced 'sret' parameter attribute. Setting in
on first attribute has meaning 'This is hidden pointer to structure
return. Handle it gently'.
5. Fixed small bug in llvm-extract + add new feature to
FunctionExtraction pass, which relinks all internal-linkaged callees
from deleted function to external linkage. This will allow further
linking everything together.
NOTEs: 1. Documentation will be updated soon.
2. llvm-upgrade should be improved to translate csret => sret.
Before this, there will be some unexpected test fails.
llvm-svn: 33597
2007-01-28 13:31:35 +00:00
Jim Laskey
f9e5445ed4
Make LABEL a builtin opcode.
...
llvm-svn: 33537
2007-01-26 14:34:52 +00:00
Evan Cheng
8bc7ddb84d
setSetCCIsExpensive is gone.
...
llvm-svn: 32941
2007-01-05 23:42:53 +00:00
Jim Laskey
f07cc990dd
Provide support for FP_TO_UINT.
...
llvm-svn: 32599
2006-12-15 14:32:57 +00:00
Chris Lattner
f4646a7e54
Another step forward in PPC64 JIT support: we now no-longer need stubs
...
emitted for external globals in PPC64-JIT-PIC mode (which is good because
we didn't handle them before!).
This also fixes a bug handling the picbase delta, which we would get wrong
in some cases.
llvm-svn: 32451
2006-12-11 23:22:45 +00:00
Jim Laskey
2b136a73bd
Missing opcode.
...
llvm-svn: 32439
2006-12-11 18:45:56 +00:00
Anton Korobeynikov
3b7c257cae
Cleaned setjmp/longjmp lowering interfaces. Now we're producing right
...
code (both asm & cbe) for Mingw32 target.
Removed autoconf checks for underscored versions of setjmp/longjmp.
llvm-svn: 32415
2006-12-10 23:12:42 +00:00
Chris Lattner
d8e7451dc3
Fix i64 uint_to_fp on ppc64
...
llvm-svn: 32297
2006-12-07 01:24:16 +00:00
Jim Laskey
e4f4d048dd
Restoration of the stack pointer after a deallocation of a alloca was not
...
updating the SP link.
llvm-svn: 32202
2006-12-04 22:04:42 +00:00
Jim Laskey
1b0bc794e6
1. In ppc64 mode we need only use one GPR.
...
2. Float values need to be promoted to double when they are vararg.
llvm-svn: 32074
2006-12-01 16:30:47 +00:00
Chris Lattner
09ed0ff2ac
Fix the CodeGen/PowerPC/vec_constants.ll regression.
...
llvm-svn: 32057
2006-12-01 01:45:39 +00:00
Chris Lattner
2f648fc55d
Fix bug codegen'ing FP constant vectors with integer splats. Make sure the
...
created intrinsics have the right integer types. This fixes
PowerPC/2006-11-29-AltivecFPSplat.ll
llvm-svn: 32024
2006-11-29 19:58:49 +00:00
Jim Laskey
152671f0bf
Offset for load of 32-bit arg in 64-bit world was incorrect.
...
llvm-svn: 32019
2006-11-29 13:37:09 +00:00
Jim Laskey
40182179b6
Remove debug code.
...
llvm-svn: 31970
2006-11-28 18:27:02 +00:00
Jim Laskey
f4e2e009d9
32-bit int space was not accounted for properly in lowerCall.
...
llvm-svn: 31966
2006-11-28 14:53:52 +00:00
Evan Cheng
20350c4025
Change MachineInstr ctor's to take a TargetInstrDescriptor reference instead
...
of opcode and number of operands.
llvm-svn: 31947
2006-11-27 23:37:22 +00:00
Chris Lattner
2cca385fbb
on ppc64, float arguments take 8-byte stack slots not 4-byte stack slots.
...
Also, valist should create a pointer RC reg class value, not a GPRC value.
llvm-svn: 31840
2006-11-18 01:57:19 +00:00
Chris Lattner
be9377a1e3
convert PPC::BCC to use the 'pred' operand instead of separate predicate
...
value and CR reg #. This requires swapping the order of these everywhere
that touches BCC and requires us to write custom matching logic for
PPCcondbranch :(
llvm-svn: 31835
2006-11-17 22:37:34 +00:00
Chris Lattner
e0263794f4
rename PPC::COND_BRANCH to PPC::BCC
...
llvm-svn: 31834
2006-11-17 22:14:47 +00:00
Chris Lattner
8c6a41ea12
start using PPC predicates more consistently.
...
llvm-svn: 31833
2006-11-17 22:10:59 +00:00
Jim Laskey
48850c10c0
This is a general clean up of the PowerPC ABI. Address several problems and
...
bugs including making sure that the TOS links back to the previous frame,
that the maximum call frame size is not included twice when using frame
pointers, no longer growing the frame on calls, double storing of SP and
a cleaner/faster dynamic alloca.
llvm-svn: 31792
2006-11-16 22:43:37 +00:00
Chris Lattner
474b5b7c95
fix ldu/stu jit encoding. Swith 64-bit preinc load instrs to use memri
...
addrmodes.
llvm-svn: 31757
2006-11-15 19:55:13 +00:00
Chris Lattner
97ff46b3cc
lower "X = seteq Y, Z" to '(shr (ctlz (xor Y, Z)), 5)' instead of
...
'(shr (ctlz (sub Y, Z)), 5)'.
The use of xor better exposes the operation to bit-twiddling logic in the
dag combiner. For example, this:
typedef struct {
unsigned prefix : 4;
unsigned code : 4;
unsigned unsigned_p : 4;
} tree_common;
int foo(tree_common *a, tree_common *b) {
return a->code == b->code;
}
Now compiles to:
_foo:
lwz r2, 0(r4)
lwz r3, 0(r3)
xor r2, r3, r2
rlwinm r2, r2, 28, 28, 31
cntlzw r2, r2
srwi r3, r2, 5
blr
instead of:
_foo:
lbz r2, 3(r4)
lbz r3, 3(r3)
srwi r2, r2, 4
srwi r3, r3, 4
subf r2, r2, r3
cntlzw r2, r2
srwi r3, r2, 5
blr
saving a cycle.
llvm-svn: 31725
2006-11-14 05:28:08 +00:00
Chris Lattner
683712583a
minor tweaks, reject vector preinc.
...
llvm-svn: 31717
2006-11-14 01:38:31 +00:00
Chris Lattner
b314b155ed
ppc64 doesn't have lwau, don't attempt to form it.
...
llvm-svn: 31656
2006-11-11 00:08:42 +00:00
Chris Lattner
c9fa36d706
implement preinc support for r+i loads on ppc64
...
llvm-svn: 31654
2006-11-10 23:58:45 +00:00
Chris Lattner
ce6455489a
add an initial cut at preinc loads for ppc32. This is broken for ppc64
...
(because the 64-bit reg target versions aren't implemented yet), doesn't
support r+r addr modes, and doesn't handle stores, but it works otherwise. :)
This is disabled unless -enable-ppc-preinc is passed to llc for now.
llvm-svn: 31621
2006-11-10 02:08:47 +00:00
Evan Cheng
36a8fbf771
PPC supports i32 / i64 pre-inc load / store.
...
llvm-svn: 31599
2006-11-09 19:11:50 +00:00
Evan Cheng
b15000736c
Rename ISD::MemOpAddrMode to ISD::MemIndexedMode
...
llvm-svn: 31595
2006-11-09 17:55:04 +00:00
Chris Lattner
a801fcedd3
Refactor all the addressing mode selection stuff into the isel lowering
...
class, where it can be used for preinc formation.
llvm-svn: 31536
2006-11-08 02:15:41 +00:00
Reid Spencer
de46e48420
For PR786:
...
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.
llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Chris Lattner
584a11ae22
Implement the getRegForInlineAsmConstraint method for PPC. With recent
...
sdisel changes, this eliminates a ton of copies around common inline asms.
For example:
int test2(int Y, int X) {
asm("foo %0, %1" : "=r"(X): "r"(X));
return X;
}
now compiles to:
_test2:
foo r3, r4
blr
instead of:
_test2:
mr r2, r4
foo r2, r2
mr r3, r2
blr
GCC produces:
_test2:
foo r4, r4
mr r3,r4
blr
llvm-svn: 31367
2006-11-02 01:44:04 +00:00
Chris Lattner
8c6949e5b2
Change the prototype for TargetLowering::isOperandValidForConstraint
...
llvm-svn: 31318
2006-10-31 19:40:43 +00:00
Evan Cheng
0d41d19427
All targets expand BR_JT for now.
...
llvm-svn: 31294
2006-10-30 08:02:39 +00:00
Chris Lattner
454436dcc5
set the ppc64 stack pointer right, dynamic alloca now works for ppc64
...
llvm-svn: 31028
2006-10-18 01:20:43 +00:00
Chris Lattner
ab4df83426
Expand alloca for ppc64
...
llvm-svn: 31027
2006-10-18 01:18:48 +00:00
Evan Cheng
ab51cf2e78
Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode.
...
llvm-svn: 30945
2006-10-13 21:14:26 +00:00
Evan Cheng
e71fe34d75
Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
...
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Evan Cheng
df9ac47e5e
Make use of getStore().
...
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Evan Cheng
5d9fd977d3
Combine ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD into ISD::LOADX. Add an
...
extra operand to LOADX to specify the exact value extension type.
llvm-svn: 30714
2006-10-04 00:56:09 +00:00
Chris Lattner
601b86513d
Legalize is no longer limited to cleverness with just constant shift amounts.
...
Allow it to be clever when possible and fall back to the gross code when needed.
This allows us to compile:
long long foo1(long long X, int C) {
return X << (C|32);
}
long long foo2(long long X, int C) {
return X << (C&~32);
}
to:
_foo1:
rlwinm r2, r5, 0, 27, 31
slw r3, r4, r2
li r4, 0
blr
.globl _foo2
.align 4
_foo2:
rlwinm r2, r5, 0, 27, 25
subfic r5, r2, 32
slw r3, r3, r2
srw r5, r4, r5
or r3, r3, r5
slw r4, r4, r2
blr
instead of:
_foo1:
ori r2, r5, 32
subfic r5, r2, 32
addi r6, r2, -32
srw r5, r4, r5
slw r3, r3, r2
slw r6, r4, r6
or r3, r3, r5
slw r4, r4, r2
or r3, r3, r6
blr
.globl _foo2
.align 4
_foo2:
rlwinm r2, r5, 0, 27, 25
subfic r5, r2, 32
addi r6, r2, -32
srw r5, r4, r5
slw r3, r3, r2
slw r6, r4, r6
or r3, r3, r5
slw r4, r4, r2
or r3, r3, r6
blr
llvm-svn: 30507
2006-09-20 03:47:40 +00:00
Chris Lattner
3c48ea54ee
Fold the PPCISD shifts when presented with 0 inputs. This occurs for code
...
like:
long long test(long long X, int Y) {
return 1ULL << Y;
}
long long test2(long long X, int Y) {
return -1LL << Y;
}
which we used to compile to:
_test:
li r2, 1
subfic r3, r5, 32
li r4, 0
addi r6, r5, -32
srw r3, r2, r3
slw r4, r4, r5
slw r6, r2, r6
or r3, r4, r3
slw r4, r2, r5
or r3, r3, r6
blr
_test2:
li r2, -1
subfic r3, r5, 32
addi r6, r5, -32
srw r3, r2, r3
slw r4, r2, r5
slw r2, r2, r6
or r3, r4, r3
or r3, r3, r2
blr
Now we produce:
_test:
li r2, 1
addi r3, r5, -32
subfic r4, r5, 32
slw r3, r2, r3
srw r4, r2, r4
or r3, r4, r3
slw r4, r2, r5
blr
_test2:
li r2, -1
subfic r3, r5, 32
addi r6, r5, -32
srw r3, r2, r3
slw r4, r2, r5
slw r2, r2, r6
or r3, r4, r3
or r3, r3, r2
blr
llvm-svn: 30479
2006-09-19 05:22:59 +00:00
Evan Cheng
9a083a4121
Reflects MachineConstantPoolEntry changes.
...
llvm-svn: 30279
2006-09-12 21:04:05 +00:00
Reid Spencer
e7141c8be6
For PR387:
...
Close out this long standing bug by removing the remaining overloaded
virtual functions in LLVM. The -Woverloaded-virtual option is now turned on.
llvm-svn: 29934
2006-08-28 01:02:49 +00:00
Chris Lattner
095e4ad2ea
Fix a bug in a recent refactoring that broke a bunch of stuff.
...
llvm-svn: 29649
2006-08-12 07:20:05 +00:00
Chris Lattner
ed728e8dc9
Eliminate use of getNode that takes a vector.
...
llvm-svn: 29614
2006-08-11 17:38:39 +00:00
Chris Lattner
d66f14e846
Convert vectors to fixed sized arrays and smallvectors. Eliminate use of getNode that takes a vector.
...
llvm-svn: 29609
2006-08-11 17:18:05 +00:00
Chris Lattner
66f1fbaaad
Fix miscompilation of float vector returns. Compile code to this:
...
_func:
vsldoi v2, v3, v2, 12
vsldoi v2, v2, v2, 4
blr
instead of:
_func:
vsldoi v2, v3, v2, 12
vsldoi v2, v2, v2, 4
*** vor f1, v2, v2
blr
llvm-svn: 29607
2006-08-11 16:47:32 +00:00
Chris Lattner
8298265042
Fix some ppc64 issues with vector code.
...
llvm-svn: 29384
2006-07-28 16:45:47 +00:00
Chris Lattner
9e56e5c003
Rename RelocModel::PIC to PIC_, to avoid conflicts with -DPIC.
...
llvm-svn: 29307
2006-07-26 21:12:04 +00:00
Chris Lattner
a7976d329e
Implement Regression/CodeGen/PowerPC/bswap-load-store.ll by folding bswaps
...
into i16/i32 load/stores.
llvm-svn: 29089
2006-07-10 20:56:58 +00:00
Chris Lattner
8aed3cc46b
Implement 64-bit select, bswap, etc.
...
llvm-svn: 28935
2006-06-27 20:14:52 +00:00
Chris Lattner
a07410c95b
PPC doesn't have bit converts to/from i64
...
llvm-svn: 28932
2006-06-27 18:40:08 +00:00
Chris Lattner
d48ce27532
Implement 64-bit undef, sub, shl/shr, srem/urem
...
llvm-svn: 28929
2006-06-27 18:18:41 +00:00
Chris Lattner
cb5a84f446
Use i32 for shift amounts instead of i64. This gets bisort working.
...
llvm-svn: 28927
2006-06-27 17:34:57 +00:00
Chris Lattner
97b3da1519
Implement a bunch of 64-bit cleanliness work. With this, treeadd builds (but
...
doesn't work right).
llvm-svn: 28921
2006-06-27 00:04:13 +00:00
Chris Lattner
ec78cade34
Improve PPC64 calling convention support
...
llvm-svn: 28919
2006-06-26 22:48:35 +00:00
Chris Lattner
dc38e6f322
Correct returns of 64-bit values, though they seemed to work before...
...
llvm-svn: 28892
2006-06-21 00:34:03 +00:00
Chris Lattner
a5190ae7a9
fix some assumptions that pointers can only be 32-bits. With this, we can
...
now compile:
static unsigned long X;
void test1() {
X = 0;
}
into:
_test1:
lis r2, ha16(_X)
li r3, 0
stw r3, lo16(_X)(r2)
blr
Totally amazing :)
llvm-svn: 28839
2006-06-16 21:01:35 +00:00
Chris Lattner
a35f306740
Rename some subtarget features. A CPU now can *have* 64-bit instructions,
...
can in 32-bit mode we can choose to optionally *use* 64-bit registers.
llvm-svn: 28824
2006-06-16 17:34:12 +00:00
Evan Cheng
94bb93f8f7
Type of extract_element index operand should be iPTR.
...
llvm-svn: 28797
2006-06-15 08:18:06 +00:00
Chris Lattner
006b2c6ab9
Fix a problem exposed by the local allocator. CALL instructions are not marked
...
as using incoming argument registers, so the local allocator would clobber them
between their set and use. To fix this, we give the call instructions a variable
number of uses in the CALL MachineInstr itself, so live variables understands
the live ranges of these register arguments.
llvm-svn: 28744
2006-06-10 01:14:28 +00:00
Chris Lattner
b9342afa56
Always reserve space for 8 spilled GPRs. GCC apparently assumes that this
...
space will be available, even if the callee isn't varargs.
llvm-svn: 28571
2006-05-30 21:21:04 +00:00
Evan Cheng
a3add0fea8
Change RET node to include signness information of the return values. i.e.
...
RET chain, value1, sign1, value2, sign2, ...
llvm-svn: 28510
2006-05-26 23:10:12 +00:00
Evan Cheng
c2cd473d9b
CALL node change (arg / sign pairs instead of just arguments).
...
llvm-svn: 28462
2006-05-25 00:57:32 +00:00
Chris Lattner
aa2372562e
Patches to make the LLVM sources more -pedantic clean. Patch provided
...
by Anton Korobeynikov! This is a step towards closing PR786.
llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Chris Lattner
33165c246c
Fix CodeGen/Generic/vector.ll:test_div with altivec.
...
llvm-svn: 28445
2006-05-24 00:15:25 +00:00
Chris Lattner
b56d22c2f6
Handle SETO* like we handle SET*, restoring behavior after Evan's setcc
...
change. This fixes PowerPC/fnegsel.ll.
llvm-svn: 28443
2006-05-24 00:06:44 +00:00
Chris Lattner
eb755fc1b3
Make PPC call lowering more aggressive, making the isel matching code simple
...
enough to be autogenerated.
llvm-svn: 28354
2006-05-17 19:00:46 +00:00
Chris Lattner
b1e9e37c58
Switch PPC over to a call-selection model where the lowering code creates
...
the copyto/fromregs instead of making the PPCISD::CALL selection code create
them. This vastly simplifies the selection code, and moves the ABI handling
parts into one place.
llvm-svn: 28346
2006-05-17 06:01:33 +00:00
Chris Lattner
b7552a88d6
3 changes, 2 of which are cleanup one of which changes codegen:
...
1. Rearrange code a bit so that the special case doesn't require indenting lots
of code.
2. Add comments describing PPC calling convention.
3. Only round up to 56-bytes of stack space for an outgoing call if the callee
is varargs. This saves a bit of stack space.
llvm-svn: 28342
2006-05-17 00:15:40 +00:00
Chris Lattner
f058f5aef1
implement passing/returning vector regs to calls, at least non-varargs calls.
...
llvm-svn: 28341
2006-05-16 23:54:25 +00:00
Chris Lattner
aa40ec1b32
Instead of implementing LowerCallTo directly, let the default impl produce an
...
ISD::CALL node, then custom lower that. This means that we only have to handle
LEGAL call operands/results, not every possible type. This allows us to
simplify the call code, shrinking it by about 1/3.
llvm-svn: 28339
2006-05-16 22:56:08 +00:00
Chris Lattner
26e2fcd8b1
Simplify the argument counting logic by only incrementing the index.
...
llvm-svn: 28335
2006-05-16 18:58:15 +00:00
Chris Lattner
76c47b50e7
Simplify the dead argument handling code.
...
llvm-svn: 28334
2006-05-16 18:54:32 +00:00
Chris Lattner
318f0d2122
Vector args passed in registers don't reserve stack space.
...
llvm-svn: 28333
2006-05-16 18:51:52 +00:00
Chris Lattner
4302e8fb67
Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument
...
handling. This makes the lower argument code significantly simpler (we
only need to handle legal argument types).
Incidentally, this also implements support for vector argument registers,
so long as they are not on the stack.
llvm-svn: 28331
2006-05-16 18:18:50 +00:00
Chris Lattner
d2ca9abf57
Fit in 80 cols
...
llvm-svn: 28311
2006-05-16 04:20:24 +00:00
Chris Lattner
ae48a894b1
Remove dead var, fix bad override.
...
llvm-svn: 28264
2006-05-12 21:09:57 +00:00
Chris Lattner
84b49d51be
Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
...
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Nate Begeman
4ca2ea5b43
JumpTable support! What this represents is working asm and jit support for
...
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Chris Lattner
518834c67e
Fix a crash on:
...
void foo2(vector float *A, vector float *B) {
vector float C = (vector float)vec_cmpeq(*A, *B);
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
*A = C;
}
llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Chris Lattner
1e174c87c3
pretty print node name
...
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner
9754d142a4
Implement an important entry from README_ALTIVEC:
...
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's. Instead, just branch on CR6 directly. :)
For example, for:
void foo2(vector float *A, vector float *B) {
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
}
We now generate:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
instead of:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
cmpwi cr0, r3, 0
beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
This implements CodeGen/PowerPC/vec_br_cmp.ll.
llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner
96d50487c9
Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
...
even/odd halves. Thanks to Nate telling me what's what.
llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner
d6d82aa889
Implement v16i8 multiply with this code:
...
vmuloub v5, v3, v2
vmuleub v2, v3, v2
vperm v2, v2, v5, v4
This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are
6.79x faster than before.
Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.
Remove the 'integer multiplies' todo from the README file.
llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Chris Lattner
7e439874cb
Lower v8i16 multiply into this code:
...
li r5, lo16(LCPI1_0)
lis r6, ha16(LCPI1_0)
lvx v4, r6, r5
vmulouh v5, v3, v2
vmuleuh v2, v3, v2
vperm v2, v2, v5, v4
where v4 is:
LCPI1_0: ; <16 x ubyte>
.byte 2
.byte 3
.byte 18
.byte 19
.byte 6
.byte 7
.byte 22
.byte 23
.byte 10
.byte 11
.byte 26
.byte 27
.byte 14
.byte 15
.byte 30
.byte 31
This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.
llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner
a2cae1bb10
Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
...
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.
This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements
PowerPC/vec_mul.ll
llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Chris Lattner
e54133cfba
Make sure to check splats of every constant we can, handle splat(31) by
...
being a bit more clever, add support for odd splats from -31 to -17.
llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Chris Lattner
264c908e3a
Teach the ppc backend to use rol and vsldoi to generate splatted constants.
...
This implements vec_constants.ll:test_vsldoi and test_rol
llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner
1b3806ace5
Make some code more general, adding support for constant formation of several
...
new patterns.
llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner
f8dd76df5b
Learn how to make odd splatted constants in range [17,29]. This implements
...
PowerPC/vec_constants.ll:test_29.
llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner
2a099c04c1
Pull some code out into a helper function.
...
Effeciently codegen even splats in the range [-32,30].
This allows us to codegen <30,30,30,30> as:
vspltisw v0, 15
vadduwm v2, v0, v0
instead of as a cp load.
llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner
071ad01ceb
Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
...
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll
llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner
06a21ba96b
Implement a TODO: have the legalizer canonicalize a bunch of operations to
...
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.
llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner
fa5aa396c2
Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
...
Remove some done items from the todo list.
llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner
24acbe46c0
Fix a crash when faced with a shuffle vector that has an undef in its mask.
...
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner
559c8ba466
Allow undef in a shuffle mask
...
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner
4211ca9108
Move the rest of the PPCTargetLowering::LowerOperation cases out into
...
separate functions, for simplicity and code clarity.
llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner
19e9055eb5
Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
...
functions, which makes the code much cleaner :)
llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner
883fb053bd
Force non-darwin targets to use a static relo model. This fixes PR734,
...
tested by CodeGen/Generic/vector.ll
llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner
147e50e1c5
Add a new way to match vector constants, which make it easier to bang bits of
...
different types.
Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1. This compiles:
typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
*P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
*P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
*P3 = vec_abs((vector float)*P3);
}
to:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
vspltisw v0, -1
vslw v0, v0, v0
lvx v1, 0, r3
vand v1, v1, v0
stvx v1, 0, r3
lvx v1, 0, r4
vandc v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vandc v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
instead of (with two constant pool entries):
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
li r6, lo16(LCPI1_0)
lis r7, ha16(LCPI1_0)
li r8, lo16(LCPI1_1)
lis r9, ha16(LCPI1_1)
lvx v0, r7, r6
lvx v1, 0, r3
vand v0, v1, v0
stvx v0, 0, r3
lvx v0, r9, r8
lvx v1, 0, r4
vand v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vand v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
GCC produces (with 2 cp entries):
_test:
mfspr r0,256
stw r0,-4(r1)
oris r0,r0,0xc00c
mtspr 256,r0
lis r2,ha16(LC0)
lis r9,ha16(LC1)
la r2,lo16(LC0)(r2)
lvx v0,0,r3
lvx v1,0,r5
la r9,lo16(LC1)(r9)
lwz r12,-4(r1)
lvx v12,0,r2
lvx v13,0,r9
vand v0,v0,v12
stvx v0,0,r3
vspltisw v0,-1
vslw v12,v0,v0
vandc v1,v1,v12
stvx v1,0,r5
lvx v0,0,r4
vand v0,v0,v13
stvx v0,0,r4
mtspr 256,r12
blr
llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner
74cf9ff761
Rename get_VSPLI_elt -> get_VSPLTI_elt
...
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively. This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI
llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner
e318a7574e
Ensure that zero vectors are always v4i32, which forces them to CSE with
...
each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll
llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Chris Lattner
e4db08a2f1
Vector function results go into V2 according to GCC. The darwin ABI doc
...
doesn't say where they go :-/
llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner
92533cfb4a
Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
...
No functionality change.
llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner
3a68f3c3ca
properly mark vector selects as expanded to select_cc
...
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner
0a3d1bbca4
Add VRRC select support
...
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Chris Lattner
d9e80f4516
Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vsplish instead of a
...
constant pool load.
llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner
d71a1f946d
Change the interface to the predicate that determines if vsplti* can be used.
...
No functionality changes.
llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Chris Lattner
466841ddc7
Make sure to return the result in the right type.
...
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner
a4bbfaed5c
Match vpku[hw]um(x,x).
...
Convert vsldoi(x,x) to work the same way other (x,x) cases work.
llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner
f38e033270
Add support for matching vmrg(x,x) patterns
...
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner
d1dcb52093
Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
...
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner
1d33819194
Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
...
lower it and LLVM to have one fewer intrinsic. This implements
CodeGen/PowerPC/vec_shuffle.ll
llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner
e8b83b4206
Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
...
vperm with a perm mask lvx'd from the constant pool.
llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner
39cc717c65
Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
...
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng
2cf4232ced
Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
...
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Chris Lattner
4a744e5c9d
Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
...
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Chris Lattner
95c7adc7cb
Ask legalize to promote all vector shuffles to be v16i8 instead of having to
...
handle all 4 PPC vector types. This simplifies the matching code and allows
us to eliminate a bunch of patterns. This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.
llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner
447a7968af
Revert accidentally committed hunks.
...
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner
533aed9a35
Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
...
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Chris Lattner
c5287c0ece
Inform the dag combiner that the predicate compares only return a low bit.
...
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
9b2d6e7886
Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
...
"vspltisb v0, 8" instead of a constant pool load.
llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner
baa73e0d91
Rearrange code a bit
...
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner
754b41c84b
Add, sub and shuffle are legal for all vector types
...
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Chris Lattner
829a061abf
note to self: *save* file, then check it in
...
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner
d4058a59d4
Implement an item from the readme, folding vcmp/vcmp. instructions with
...
identical instructions into a single instruction. For example, for:
void test(vector float *x, vector float *y, int *P) {
int v = vec_any_out(*x, *y);
*x = (vector float)vec_cmpb(*x, *y);
*P = v;
}
we now generate:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v0, v1, v0
mfcr r4, 2
stvx v0, 0, r3
rlwinm r3, r4, 27, 31, 31
xori r3, r3, 1
stw r3, 0(r5)
mtspr 256, r2
blr
instead of:
_test:
mfspr r2, 256
oris r6, r2, 57344
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v2, v1, v0
mfcr r4, 2
*** vcmpbfp v0, v1, v0
rlwinm r4, r4, 27, 31, 31
stvx v0, 0, r3
xori r3, r4, 1
stw r3, 0(r5)
mtspr 256, r2
blr
Testcase here: CodeGen/PowerPC/vcmp-fold.ll
llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner
d7495ae7e9
Lower vector compares to VCMP nodes, just like we lower vector comparison
...
predicates to VCMPo nodes.
llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner
bca5fbe914
Mark INSERT_VECTOR_ELT as expand
...
llvm-svn: 27276
2006-03-31 01:48:55 +00:00
Nate Begeman
1b3928765d
Add a few more altivec intrinsics
...
llvm-svn: 27215
2006-03-28 04:15:58 +00:00
Chris Lattner
cb5ec07cc3
Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
...
same thing and we have a dag node for the former.
llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner
e55d171ccd
Tblgen doesn't like multiple SDNode<> definitions that map to the sameenum value. Split them into separate enums.
...
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Nate Begeman
ed728c1291
SelectionDAGISel can now natively handle Switch instructions, in the same
...
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks. The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.
This functionality is currently only enabled on x86, but should be safe for
every target. In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.
llvm-svn: 27156
2006-03-27 01:32:24 +00:00
Chris Lattner
6961fc76bb
Codegen vector predicate compares.
...
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng
b1ddc988af
Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
...
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Chris Lattner
1cb91b3cd9
Add some basic patterns for other datatypes
...
llvm-svn: 27116
2006-03-25 07:39:07 +00:00
Chris Lattner
2771e2c960
Codegen things like:
...
<int -1, int -1, int -1, int -1>
and
<int 65537, int 65537, int 65537, int 65537>
Using things like:
vspltisb v0, -1
and:
vspltish v0, 1
instead of using constant pool loads.
This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.
llvm-svn: 27106
2006-03-25 06:12:06 +00:00
Chris Lattner
a90b7141ed
Disable the i32->float G5 optimization. It is unsafe, as documented in the
...
comment.
This fixes 177.mesa, and McCat/09-vor with the td scheduler.
llvm-svn: 27060
2006-03-24 07:53:47 +00:00
Chris Lattner
ab882abce8
add support for using vxor to build zero vectors. This implements
...
Regression/CodeGen/PowerPC/vec_zero.ll
llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner
4a66d69433
When possible, custom lower 32-bit SINT_TO_FP to this:
...
_foo2:
extsw r2, r3
std r2, -8(r1)
lfd f0, -8(r1)
fcfid f0, f0
frsp f1, f0
blr
instead of this:
_foo2:
lis r2, ha16(LCPI2_0)
lis r4, 17200
xoris r3, r3, 32768
stw r3, -4(r1)
stw r4, -8(r1)
lfs f0, lo16(LCPI2_0)(r2)
lfd f1, -8(r1)
fsub f0, f1, f0
frsp f1, f0
blr
This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).
llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner
00f4683bf6
These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will.
...
llvm-svn: 26930
2006-03-21 20:51:05 +00:00
Chris Lattner
6d74b09da7
remove dead variable
...
llvm-svn: 26907
2006-03-20 22:37:23 +00:00
Chris Lattner
a1bc294f0c
Fix a couple of bugs in permute/splat generate, thanks to Nate for actually
...
figuring these out! :)
llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner
a9a1313386
Add support for generating vspltw, instead of a vperm instruction with a
...
constant pool load. This generates significantly nicer code for splats.
When tblgen gets bugfixed, we can remove the custom selection code.
llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner
a8fbb6dd3d
Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate.
...
llvm-svn: 26897
2006-03-20 06:37:44 +00:00
Chris Lattner
ffc475689b
fix duplicate definition errors
...
llvm-svn: 26896
2006-03-20 06:33:01 +00:00
Chris Lattner
a8713b1ee6
Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
...
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*
llvm-svn: 26887
2006-03-20 01:53:53 +00:00