Chris Lattner
5f0edfb849
Legalize FORMAL_ARGUMENTS nodes correctly, we don't want to legalize them once
...
for each argument.
llvm-svn: 28313
2006-05-16 05:49:56 +00:00
Evan Cheng
99f2f79e2f
Fixing 2006-05-01-SchedCausingSpills.ll; some clean up
...
llvm-svn: 28279
2006-05-13 08:22:24 +00:00
Evan Cheng
d1915cfa6f
Revert an un-intended change
...
llvm-svn: 28278
2006-05-13 05:53:47 +00:00
Chris Lattner
69a0ce6261
Merge identical code.
...
llvm-svn: 28274
2006-05-13 02:11:14 +00:00
Chris Lattner
53cdb2f2b0
Remove dead vars
...
llvm-svn: 28255
2006-05-12 18:06:45 +00:00
Chris Lattner
da076e41ab
remove dead vars
...
llvm-svn: 28254
2006-05-12 18:04:28 +00:00
Chris Lattner
afe72481f6
Comment out dead variables
...
llvm-svn: 28252
2006-05-12 17:57:54 +00:00
Chris Lattner
8c02c3f41a
Compile:
...
%tmp152 = setgt uint %tmp144, %tmp149 ; <bool> [#uses=1]
%tmp159 = setlt uint %tmp144, %tmp149 ; <bool> [#uses=1]
%bothcond2 = or bool %tmp152, %tmp159 ; <bool> [#uses=1]
To setne, not setune, which causes an assertion fault.
llvm-svn: 28244
2006-05-12 17:03:46 +00:00
Owen Anderson
8c2c1e90c4
Refactor a bunch of includes so that TargetMachine.h doesn't have to include
...
TargetData.h. This should make recompiles a bit faster with my current
TargetData tinkering.
llvm-svn: 28238
2006-05-12 06:33:49 +00:00
Evan Cheng
095c9d9b7f
Duh. That could take a long time.
...
llvm-svn: 28235
2006-05-12 06:05:18 +00:00
Chris Lattner
66adee93aa
Two simplifications for token factor nodes: simplify tf(x,x) -> x.
...
simplify tf(x,y,y,z) -> tf(x,y,z).
llvm-svn: 28233
2006-05-12 05:01:37 +00:00
Evan Cheng
afed73eebe
Add capability to scheduler to commute nodes for profit.
...
If a two-address code whose first operand has uses below, it should be commuted
when possible.
llvm-svn: 28230
2006-05-12 01:58:24 +00:00
Evan Cheng
d38c22bdd3
Refactor scheduler code. Move register-reduction list scheduler to a
...
separate file. Added an initial implementation of top-down register pressure
reduction list scheduler.
llvm-svn: 28226
2006-05-11 23:55:42 +00:00
Evan Cheng
9665ba053f
Templatify RegReductionPriorityQueue
...
llvm-svn: 28212
2006-05-10 06:16:44 +00:00
Nate Begeman
1a225d23ae
Fix PR773
...
llvm-svn: 28207
2006-05-09 18:20:51 +00:00
Evan Cheng
7d693898ee
Add pseudo dependency to force a def&use operand to be scheduled last (unless
...
the distance between the def and another use is much longer). This is under
option control for now "-sched-lower-defnuse".
llvm-svn: 28201
2006-05-09 07:13:34 +00:00
Evan Cheng
2c74848af1
Debugging info
...
llvm-svn: 28200
2006-05-09 06:55:15 +00:00
Chris Lattner
446e1ef26a
Make the case I just checked in stronger. Now we compile this:
...
short test2(short X, short x) {
int Y = (short)(X+x);
return Y >> 1;
}
to:
_test2:
add r2, r3, r4
extsh r2, r2
srawi r3, r2, 1
blr
instead of:
_test2:
add r2, r3, r4
extsh r2, r2
srwi r2, r2, 1
extsh r3, r2
blr
llvm-svn: 28175
2006-05-08 21:18:59 +00:00
Chris Lattner
29062da0ac
Implement and_sext.ll:test3, generating:
...
_test4:
srawi r3, r3, 16
blr
instead of:
_test4:
srwi r2, r3, 16
extsh r3, r2
blr
for:
short test4(unsigned X) {
return (X >> 16);
}
llvm-svn: 28174
2006-05-08 20:59:41 +00:00
Chris Lattner
2935d8190c
Compile this:
...
short test4(unsigned X) {
return (X >> 16);
}
to:
_test4:
movl 4(%esp), %eax
sarl $16, %eax
ret
instead of:
_test4:
movl $-65536, %eax
andl 4(%esp), %eax
sarl $16, %eax
ret
llvm-svn: 28171
2006-05-08 20:51:54 +00:00
Chris Lattner
78da6792e7
Fold shifts with undef operands.
...
llvm-svn: 28167
2006-05-08 17:29:49 +00:00
Nate Begeman
d7a19102d1
Make emission of jump tables a bit less conservative; they are now required
...
to be only 31.25% dense, rather than 75% dense.
llvm-svn: 28165
2006-05-08 16:51:36 +00:00
Nate Begeman
e5ce5bb6da
Fix PR772
...
llvm-svn: 28161
2006-05-08 01:35:01 +00:00
Chris Lattner
7e7bcf3a54
Simplify some code, add a couple minor missed folds
...
llvm-svn: 28152
2006-05-06 23:06:26 +00:00
Chris Lattner
751817c54f
constant fold sign_extend_inreg
...
llvm-svn: 28151
2006-05-06 23:05:41 +00:00
Chris Lattner
2a4d7b845b
remove cases handled elsewhere
...
llvm-svn: 28150
2006-05-06 22:43:44 +00:00
Chris Lattner
1ecb2a2dac
Use the new TargetLowering::ComputeNumSignBits method to eliminate
...
sign_extend_inreg operations. Though ComputeNumSignBits is still rudimentary,
this is enough to compile this:
short test(short X, short x) {
int Y = X+x;
return (Y >> 1);
}
short test2(short X, short x) {
int Y = (short)(X+x);
return Y >> 1;
}
into:
_test:
add r2, r3, r4
srawi r3, r2, 1
blr
_test2:
add r2, r3, r4
extsh r2, r2
srawi r3, r2, 1
blr
instead of:
_test:
add r2, r3, r4
srawi r2, r2, 1
extsh r3, r2
blr
_test2:
add r2, r3, r4
extsh r2, r2
srawi r2, r2, 1
extsh r3, r2
blr
llvm-svn: 28146
2006-05-06 09:30:03 +00:00
Chris Lattner
21cd99024a
When inserting casts, be careful of where we put them. We cannot insert
...
a cast immediately before a PHI node.
This fixes Regression/CodeGen/Generic/2006-05-06-GEP-Cast-Sink-Crash.ll
llvm-svn: 28143
2006-05-06 09:10:37 +00:00
Chris Lattner
907e392dba
Fold trunc(any_ext). This gives stuff like:
...
27,28c27
< movzwl %di, %edi
< movl %edi, %ebx
---
> movw %di, %bx
llvm-svn: 28137
2006-05-05 22:56:26 +00:00
Chris Lattner
57f8c5a387
Shrink shifts when possible.
...
llvm-svn: 28136
2006-05-05 22:53:17 +00:00
Chris Lattner
3d26577396
Fold (fpext (load x)) -> (extload x)
...
llvm-svn: 28130
2006-05-05 21:34:35 +00:00
Chris Lattner
3e3f2c63c3
More aggressively sink GEP offsets into loops. For example, before we
...
generated:
movl 8(%esp), %eax
movl %eax, %edx
addl $4316, %edx
cmpb $1, %cl
ja LBB1_2 #cond_false
LBB1_1: #cond_true
movl L_QuantizationTables720$non_lazy_ptr, %ecx
movl %ecx, (%edx)
movl L_QNOtoQuantTableShift720$non_lazy_ptr, %edx
movl %edx, 4460(%eax)
ret
...
Now we generate:
movl 8(%esp), %eax
cmpb $1, %cl
ja LBB1_2 #cond_false
LBB1_1: #cond_true
movl L_QuantizationTables720$non_lazy_ptr, %ecx
movl %ecx, 4316(%eax)
movl L_QNOtoQuantTableShift720$non_lazy_ptr, %ecx
movl %ecx, 4460(%eax)
ret
... which uses one fewer register.
llvm-svn: 28129
2006-05-05 21:17:49 +00:00
Chris Lattner
25a5283a86
Fold some common code.
...
llvm-svn: 28124
2006-05-05 06:32:04 +00:00
Chris Lattner
002ee91457
Implement:
...
// fold (and (sext x), (sext y)) -> (sext (and x, y))
// fold (or (sext x), (sext y)) -> (sext (or x, y))
// fold (xor (sext x), (sext y)) -> (sext (xor x, y))
// fold (and (aext x), (aext y)) -> (aext (and x, y))
// fold (or (aext x), (aext y)) -> (aext (or x, y))
// fold (xor (aext x), (aext y)) -> (aext (xor x, y))
llvm-svn: 28123
2006-05-05 06:31:05 +00:00
Chris Lattner
5ac4293606
Pull and through and/or/xor. This compiles some bitfield code to:
...
mov EAX, DWORD PTR [ESP + 4]
mov ECX, DWORD PTR [EAX]
mov EDX, ECX
add EDX, EDX
or EDX, ECX
and EDX, -2147483648
and ECX, 2147483647
or EDX, ECX
mov DWORD PTR [EAX], EDX
ret
instead of:
sub ESP, 4
mov DWORD PTR [ESP], ESI
mov EAX, DWORD PTR [ESP + 8]
mov ECX, DWORD PTR [EAX]
mov EDX, ECX
add EDX, EDX
mov ESI, ECX
and ESI, -2147483648
and EDX, -2147483648
or EDX, ESI
and ECX, 2147483647
or EDX, ECX
mov DWORD PTR [EAX], EDX
mov ESI, DWORD PTR [ESP]
add ESP, 4
ret
llvm-svn: 28122
2006-05-05 06:10:43 +00:00
Chris Lattner
812646aa0c
Implement a variety of simplifications for ANY_EXTEND.
...
llvm-svn: 28121
2006-05-05 05:58:59 +00:00
Chris Lattner
8d6fc20181
Factor some code, add these transformations:
...
// fold (and (trunc x), (trunc y)) -> (trunc (and x, y))
// fold (or (trunc x), (trunc y)) -> (trunc (or x, y))
// fold (xor (trunc x), (trunc y)) -> (trunc (xor x, y))
llvm-svn: 28120
2006-05-05 05:51:50 +00:00
Jeff Cohen
78a7f0e05e
Fix VC++ compilation error.
...
llvm-svn: 28117
2006-05-05 01:47:05 +00:00
Chris Lattner
7a3ecf7993
Sink noop copies into the basic block that uses them. This reduces the number
...
of cross-block live ranges, and allows the bb-at-a-time selector to always
coallesce these away, at isel time.
This reduces the load on the coallescer and register allocator. For example
on a codec on X86, we went from:
1643 asm-printer - Number of machine instrs printed
419 liveintervals - Number of loads/stores folded into instructions
1144 liveintervals - Number of identity moves eliminated after coalescing
1022 liveintervals - Number of interval joins performed
282 liveintervals - Number of intervals after coalescing
1304 liveintervals - Number of original intervals
86 regalloc - Number of times we had to backtrack
1.90232 regalloc - Ratio of intervals processed over total intervals
40 spiller - Number of values reused
182 spiller - Number of loads added
121 spiller - Number of stores added
132 spiller - Number of register spills
6 twoaddressinstruction - Number of instructions commuted to coalesce
360 twoaddressinstruction - Number of two-address instructions
to:
1636 asm-printer - Number of machine instrs printed
403 liveintervals - Number of loads/stores folded into instructions
1155 liveintervals - Number of identity moves eliminated after coalescing
1033 liveintervals - Number of interval joins performed
279 liveintervals - Number of intervals after coalescing
1312 liveintervals - Number of original intervals
76 regalloc - Number of times we had to backtrack
1.88998 regalloc - Ratio of intervals processed over total intervals
1 spiller - Number of copies elided
41 spiller - Number of values reused
191 spiller - Number of loads added
114 spiller - Number of stores added
128 spiller - Number of register spills
4 twoaddressinstruction - Number of instructions commuted to coalesce
356 twoaddressinstruction - Number of two-address instructions
On this testcase, this change provides a modest reduction in spill code,
regalloc iterations, and total instructions emitted. It increases the number
of register coallesces.
llvm-svn: 28115
2006-05-05 01:04:50 +00:00
Evan Cheng
9add880566
Initial support for register pressure aware scheduling. The register reduction
...
scheduler can go into a "vertical mode" (i.e. traversing up the two-address
chain, etc.) when the register pressure is low.
This does seem to reduce the number of spills in the cases I've looked at. But
with x86, it's no guarantee the performance of the code improves.
It can be turned on with -sched-vertically option.
llvm-svn: 28108
2006-05-04 19:16:39 +00:00
Chris Lattner
469647bf38
Remove and simplify some more machineinstr/machineoperand stuff.
...
llvm-svn: 28105
2006-05-04 18:16:01 +00:00
Chris Lattner
10b71c0d08
Rename MO_VirtualRegister -> MO_Register. Clean up immediate handling.
...
llvm-svn: 28104
2006-05-04 18:05:43 +00:00
Chris Lattner
940cc978ef
Remove a bunch more SparcV9 specific stuff
...
llvm-svn: 28093
2006-05-04 01:15:02 +00:00
Nate Begeman
df4883971e
Finish up the initial jump table implementation by allowing jump tables to
...
not be 100% dense. Increase the minimum threshold for the number of cases
in a switch statement from 4 to 6 in order to create a jump table.
llvm-svn: 28079
2006-05-03 03:48:02 +00:00
Evan Cheng
ffef8b9412
Bottom up register pressure reduction work: clean up some hacks and enhanced
...
the heuristic to further reduce spills for several test cases. (Note, it may
not necessarily translate to runtime win!)
llvm-svn: 28076
2006-05-03 02:10:45 +00:00
Owen Anderson
20a631fde7
Refactor TargetMachine, pushing handling of TargetData into the target-specific subclasses. This has one caller-visible change: getTargetData() now returns a pointer instead of a reference.
...
This fixes PR 759.
llvm-svn: 28074
2006-05-03 01:29:57 +00:00
Evan Cheng
0d084fb9ca
Dis-favor stores more
...
llvm-svn: 28035
2006-05-01 09:20:44 +00:00
Evan Cheng
24e795496d
Bottom up register-pressure reduction scheduler now pushes store operations
...
up the schedule. This helps code that looks like this:
loads ...
computations (first set) ...
stores (first set) ...
loads
computations (seccond set) ...
stores (seccond set) ...
Without this change, the stores and computations are more likely to
interleave:
loads ...
loads ...
computations (first set) ...
computations (second set) ...
computations (first set) ...
stores (first set) ...
computations (second set) ...
stores (stores set) ...
This can increase the number of spills if we are unlucky.
llvm-svn: 28033
2006-05-01 09:14:40 +00:00
Evan Cheng
10ff7b27ce
Didn't mean ScheduleDAGList.cpp to make the last checkin.
...
llvm-svn: 28030
2006-05-01 08:56:34 +00:00
Evan Cheng
a656242690
Remove temp. option -spiller-check-liveout, it didn't cause any failure nor performance regressions.
...
llvm-svn: 28029
2006-05-01 08:54:57 +00:00
Chris Lattner
2b48a94413
Remove a bogus transformation. This fixes SingleSource/UnitTests/2006-01-23-InitializedBitField.c
...
with some changes I have to the new CFE.
llvm-svn: 28022
2006-04-28 23:33:20 +00:00
Evan Cheng
c5e8ce8b8c
Remove the temporary option: -no-isel-fold-inflight
...
llvm-svn: 28012
2006-04-28 18:54:11 +00:00
Evan Cheng
d43c5c6046
TargetLowering::LowerArguments should return a VBIT_CONVERT of
...
FORMAL_ARGUMENTS SDOperand in the return result vector.
llvm-svn: 28009
2006-04-28 05:25:15 +00:00
Evan Cheng
51ab4498e7
Added a temporary option -no-isel-fold-inflight to control whether a "inflight"
...
node can be folded.
llvm-svn: 28003
2006-04-28 02:09:19 +00:00
Evan Cheng
3784f3c57c
Insert a VBIT_CONVERT between a FORMAL_ARGUMENT node and its vector uses
...
(VAND, VADD, etc.). Legalizer will assert otherwise.
llvm-svn: 27991
2006-04-27 08:29:42 +00:00
Chris Lattner
393d96a56c
Fix Regression/CodeGen/Generic/2006-04-26-SetCCAnd.ll and
...
PR748.
llvm-svn: 27987
2006-04-27 05:01:07 +00:00
Evan Cheng
9618df1190
Don't forget return void.
...
llvm-svn: 27974
2006-04-25 23:03:35 +00:00
Nate Begeman
866b4b4d45
Fix the updating of the machine CFG when a PHI node was in a successor of
...
the jump table's range check block. This re-enables 100% dense jump tables
by default on PPC & x86
llvm-svn: 27952
2006-04-23 06:26:20 +00:00
Nate Begeman
ecb1dafd3d
Turn of jump tables for a bit, there are still some issues to work out with
...
updating the machine CFG.
llvm-svn: 27949
2006-04-22 23:51:56 +00:00
Nate Begeman
4ca2ea5b43
JumpTable support! What this represents is working asm and jit support for
...
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Chris Lattner
b21d3bfd1f
The BFS scheduler is apparently nondeterminstic (causes many llvmgcc bootstrap
...
miscompares). Switch RISC targets to use the list-td scheduler, which isn't.
llvm-svn: 27933
2006-04-21 17:16:16 +00:00
Chris Lattner
662e940f73
Fix a couple more memory issues
...
llvm-svn: 27930
2006-04-21 15:32:26 +00:00
Chris Lattner
cc47ab3305
Fix a really subtle and obnoxious memory bug that caused issues with an
...
llvm-gcc4 boostrap. Whenever a node is deleted by the dag combiner, it
*must* be returned by the visit function, or the dag combiner will not
know that the node has been processed (and will, e.g., send it to the
target dag combine xforms).
llvm-svn: 27922
2006-04-20 23:55:59 +00:00
Evan Cheng
a320abc494
Turn a VAND into a VECTOR_SHUFFLE is applicable.
...
DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. vector clear elements,
into a vector shuffle with a zero vector. It only does so when TLI tells it
the xform is profitable.
llvm-svn: 27874
2006-04-20 08:56:16 +00:00
Chris Lattner
bc1b262725
Implement folding of a bunch of binops with undef
...
llvm-svn: 27863
2006-04-20 05:39:12 +00:00
Chris Lattner
73eb58e1a2
Simplify some code
...
llvm-svn: 27846
2006-04-19 23:17:50 +00:00
Chris Lattner
916ae0775e
Fix handling of calls in functions that use vectors. This fixes a crash on
...
the code in GCC PR26546.
llvm-svn: 27780
2006-04-17 22:10:08 +00:00
Chris Lattner
326870b40b
Codegen insertelement with constant insertion points as scalar_to_vector
...
and a shuffle. For this:
void %test2(<4 x float>* %F, float %f) {
%tmp = load <4 x float>* %F ; <<4 x float>> [#uses=2]
%tmp3 = add <4 x float> %tmp, %tmp ; <<4 x float>> [#uses=1]
%tmp2 = insertelement <4 x float> %tmp3, float %f, uint 2 ; <<4 x float>> [#uses=2]
%tmp6 = add <4 x float> %tmp2, %tmp2 ; <<4 x float>> [#uses=1]
store <4 x float> %tmp6, <4 x float>* %F
ret void
}
we now get this on X86 (which will get better):
_test2:
movl 4(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, %xmm1
shufps $3, %xmm1, %xmm1
movaps %xmm0, %xmm2
shufps $1, %xmm2, %xmm2
unpcklps %xmm1, %xmm2
movss 8(%esp), %xmm1
unpcklps %xmm1, %xmm0
unpcklps %xmm2, %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
ret
instead of:
_test2:
subl $28, %esp
movl 32(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%esp)
movss 36(%esp), %xmm0
movss %xmm0, 8(%esp)
movaps (%esp), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
addl $28, %esp
ret
llvm-svn: 27765
2006-04-17 19:21:01 +00:00
Chris Lattner
91226e5799
Add support for promoting stores from one legal type to another, allowing us
...
to write one pattern for vector stores instead of 4.
llvm-svn: 27730
2006-04-16 01:36:45 +00:00
Chris Lattner
7e7ad593cc
Make these predicates return true for bit_convert(buildvector)'s as well as
...
buildvectors.
llvm-svn: 27723
2006-04-15 23:38:00 +00:00
Chris Lattner
086e986e94
Make this assertion better
...
llvm-svn: 27695
2006-04-14 06:08:35 +00:00
Evan Cheng
119266ea92
Promote vector AND, OR, and XOR
...
llvm-svn: 27632
2006-04-12 21:20:24 +00:00
Evan Cheng
be8a8933e6
Vector type promotion for ISD::LOAD and ISD::SELECT
...
llvm-svn: 27606
2006-04-12 16:33:18 +00:00
Chris Lattner
d3b504ae10
Implement support for the formal_arguments node. To get this, targets shouldcustom legalize it and remove their XXXTargetLowering::LowerArguments overload
...
llvm-svn: 27604
2006-04-12 16:20:43 +00:00
Chris Lattner
417b96b6dd
Don't memoize vloads in the load map! Don't memoize them anywhere here, let
...
getNode do it. This fixes CodeGen/Generic/2006-04-11-vecload.ll
llvm-svn: 27602
2006-04-12 03:25:41 +00:00
Evan Cheng
7256b0ae05
Only get Tmp2 for cases where number of operands is > 1. Fixed return void.
...
llvm-svn: 27586
2006-04-11 06:33:39 +00:00
Chris Lattner
6cf3bbbe17
add some todos
...
llvm-svn: 27580
2006-04-11 02:00:08 +00:00
Chris Lattner
2eb22eef7d
Add basic support for legalizing returns of vectors
...
llvm-svn: 27578
2006-04-11 01:31:51 +00:00
Evan Cheng
cb73b8d419
Missing break
...
llvm-svn: 27559
2006-04-10 18:54:36 +00:00
Chris Lattner
02274a5265
Add code generator support for VSELECT
...
llvm-svn: 27542
2006-04-08 22:22:57 +00:00
Chris Lattner
e1401e3610
Canonicalize vvector_shuffle(x,x) -> vvector_shuffle(x,undef) to enable patterns
...
to match again :)
llvm-svn: 27533
2006-04-08 05:34:25 +00:00
Chris Lattner
098c01e94e
Codegen shufflevector as VVECTOR_SHUFFLE
...
llvm-svn: 27529
2006-04-08 04:15:24 +00:00
Chris Lattner
101ea66813
add a sanity check: LegalizeOp should return a value that is the same type
...
as its input.
llvm-svn: 27528
2006-04-08 04:13:17 +00:00
Evan Cheng
78e3d565af
INSERT_VECTOR_ELT lowering bug:
...
store vector to $esp
store element to $esp + sizeof(VT) * index
load vector from $esp
The bug is VT is the type of the vector element, not the type of the vector!
llvm-svn: 27517
2006-04-08 01:46:37 +00:00
Chris Lattner
aa3185f12e
Stub out shufflevector
...
llvm-svn: 27514
2006-04-08 01:19:25 +00:00
Evan Cheng
613996c55e
1. If both vector operands of a vector_shuffle are undef, turn it into an undef.
...
2. A shuffle mask element can also be an undef.
llvm-svn: 27472
2006-04-06 23:20:43 +00:00
Chris Lattner
4a2413a590
Make a vector live across blocks have the correct Vec type. This fixes
...
CodeGen/X86/2006-04-04-CrossBlockCrash.ll
llvm-svn: 27436
2006-04-05 06:54:42 +00:00
Evan Cheng
9fa8959dce
Exapnd a VECTOR_SHUFFLE to a BUILD_VECTOR if target asks for it to be expanded
...
or custom lowering fails.
llvm-svn: 27432
2006-04-05 06:07:11 +00:00
Chris Lattner
4ea52cac01
Do not create ZEXTLOAD's unless we are before legalize or the operation is
...
legal.
llvm-svn: 27402
2006-04-04 17:39:18 +00:00
Chris Lattner
6be79823e7
* Add supprot for SCALAR_TO_VECTOR operations where the input needs to be
...
promoted/expanded (e.g. SCALAR_TO_VECTOR from i8/i16 on PPC).
* Add support for targets to request that VECTOR_SHUFFLE nodes be promoted
to a canonical type, for example, we only want v16i8 shuffles on PPC.
* Move isShuffleLegal out of TLI into Legalize.
* Teach isShuffleLegal to allow shuffles that need to be promoted.
llvm-svn: 27399
2006-04-04 17:23:26 +00:00
Chris Lattner
a9e77d14c7
Constant fold bitconvert(undef)
...
llvm-svn: 27391
2006-04-04 01:02:22 +00:00
Chris Lattner
e1e3adf802
Add a missing check, this fixes UnitTests/Vector/sumarray.c
...
llvm-svn: 27375
2006-04-03 17:29:28 +00:00
Chris Lattner
04c00fc844
Add a missing check, which broke a bunch of vector tests.
...
llvm-svn: 27374
2006-04-03 17:21:50 +00:00
Andrew Lenharth
94f012f606
back this out
...
llvm-svn: 27367
2006-04-03 03:16:50 +00:00
Andrew Lenharth
015eaf5f33
This should be a win of every arch
...
llvm-svn: 27364
2006-04-02 21:42:45 +00:00
Chris Lattner
4993249a04
Add a little dag combine to compile this:
...
int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
entry:
%tmp1 = load <4 x float>* %in ; <<4 x float>> [#uses=1]
%tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 ) ; <int> [#uses=1]
%tmp = seteq int %tmp, 0 ; <bool> [#uses=1]
%tmp3 = cast bool %tmp to int ; <int> [#uses=1]
ret int %tmp3
}
into this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
mtspr 256, r2
blr
instead of this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
xori r3, r3, 1
cntlzw r3, r3
srwi r3, r3, 5
mtspr 256, r2
blr
llvm-svn: 27356
2006-04-02 06:11:11 +00:00
Chris Lattner
42a5fca47e
Implement promotion for EXTRACT_VECTOR_ELT, allowing v16i8 multiplies to work with PowerPC.
...
llvm-svn: 27349
2006-04-02 05:06:04 +00:00
Chris Lattner
87f080949b
Implement the Expand action for binary vector operations to break the binop
...
into elements and operate on each piece. This allows generic vector integer
multiplies to work on PPC, though the generated code is horrible.
llvm-svn: 27347
2006-04-02 03:57:31 +00:00
Chris Lattner
a9c59156be
Intrinsics that just load from memory can be treated like loads: they don't
...
have to serialize against each other. This allows us to schedule lvx's
across each other, for example.
llvm-svn: 27346
2006-04-02 03:41:14 +00:00
Chris Lattner
0442a18758
Constant fold all of the vector binops. This allows us to compile this:
...
"vector unsigned char mergeLowHigh = (vector unsigned char)
( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"
aka:
void %test2(<16 x sbyte>* %P) {
store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
ret void
}
into this:
_test2:
mfspr r2, 256
oris r4, r2, 32768
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
lvx v0, r5, r4
stvx v0, 0, r3
mtspr 256, r2
blr
instead of this:
_test2:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
vspltisb v0, 8
lvx v1, r5, r4
vxor v0, v1, v0
stvx v0, 0, r3
mtspr 256, r2
blr
... which occurs here:
http://developer.apple.com/hardware/ve/calcspeed.html
llvm-svn: 27343
2006-04-02 03:25:57 +00:00