Commit Graph

5623 Commits

Author SHA1 Message Date
Chris Lattner a2cae1bb10 Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.

This speeds up v4i32 multiplies 4.1x (measured) on a G5.  This implements
PowerPC/vec_mul.ll

llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Evan Cheng 0ef233509b Another entry
llvm-svn: 27786
2006-04-18 01:22:57 +00:00
Evan Cheng e008bd3d27 Another entry.
llvm-svn: 27784
2006-04-18 00:21:01 +00:00
Evan Cheng 5421206c4b Use movss to insert_vector_elt(v, s, 0).
llvm-svn: 27782
2006-04-17 22:45:49 +00:00
Evan Cheng 6e5e205841 Use two pinsrw to insert an element into v4i32 / v4f32 vector.
llvm-svn: 27779
2006-04-17 22:04:06 +00:00
Chris Lattner 63a5cdc423 remove done item
llvm-svn: 27778
2006-04-17 21:52:03 +00:00
Chris Lattner 6bd68ae81e Don't diddle VRSAVE if no registers need to be added/removed from it. This
allows us to codegen functions as:

_test_rol:
        vspltisw v2, -12
        vrlw v2, v2, v2
        blr

instead of:

_test_rol:
        mfvrsave r2, 256
        mr r3, r2
        mtvrsave r3
        vspltisw v2, -12
        vrlw v2, v2, v2
        mtvrsave r2
        blr

Testcase here: CodeGen/PowerPC/vec_vrsave.ll

llvm-svn: 27777
2006-04-17 21:48:13 +00:00
Evan Cheng 22c06f054b Encoding bug
llvm-svn: 27773
2006-04-17 21:33:57 +00:00
Chris Lattner 72d7c27069 Vectors that are known live-in and live-out are clearly already marked in
the vrsave register for the caller.  This allows us to codegen a function as:

_test_rol:
        mfspr r2, 256
        mr r3, r2
        mtspr 256, r3
        vspltisw v2, -12
        vrlw v2, v2, v2
        mtspr 256, r2
        blr

instead of:

_test_rol:
        mfspr r2, 256
        oris r3, r2, 40960
        mtspr 256, r3
        vspltisw v0, -12
        vrlw v2, v0, v0
        mtspr 256, r2
        blr

llvm-svn: 27772
2006-04-17 21:22:06 +00:00
Chris Lattner 14c4972b6d Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this:
vspltisw v2, -12
        vrlw v2, v2, v2

instead of:

        vspltisw v0, -12
        vrlw v2, v0, v0

when a function is returning a value.

llvm-svn: 27771
2006-04-17 21:19:12 +00:00
Chris Lattner 6df094b4ab Move some knowledge about registers out of the code emitter into the register info.
llvm-svn: 27770
2006-04-17 21:07:20 +00:00
Chris Lattner 0f28d48da2 Use a small table instead of macros to do this conversion.
llvm-svn: 27769
2006-04-17 20:59:25 +00:00
Evan Cheng 5022b3426e Implement v8i16, v16i8 splat using unpckl + pshufd.
llvm-svn: 27768
2006-04-17 20:43:08 +00:00
Chris Lattner c070c621ac implement returns of a vector, testcase here: CodeGen/X86/vec_return.ll
llvm-svn: 27767
2006-04-17 20:32:50 +00:00
Chris Lattner e54133cfba Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, add support for odd splats from -31 to -17.

llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Evan Cheng bf0d13c54f Incorrect foldMemoryOperand entries
llvm-svn: 27763
2006-04-17 18:06:12 +00:00
Evan Cheng 5112b5c544 Errors in patterns preventing load folding
llvm-svn: 27762
2006-04-17 18:05:01 +00:00
Jeff Cohen e3955a05e4 Add checks for __OpenBSD__.
llvm-svn: 27761
2006-04-17 17:55:41 +00:00
Chris Lattner 264c908e3a Teach the ppc backend to use rol and vsldoi to generate splatted constants.
This implements vec_constants.ll:test_vsldoi and test_rol

llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner 26fb8d9393 add a note
llvm-svn: 27758
2006-04-17 17:29:41 +00:00
Evan Cheng b3b41c4f3d FP SETOLT, SETOLT, SETUGE, SETUGT conditions were implemented incorrectly
llvm-svn: 27755
2006-04-17 07:24:10 +00:00
Chris Lattner 1b3806ace5 Make some code more general, adding support for constant formation of several
new patterns.

llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner f8dd76df5b Learn how to make odd splatted constants in range [17,29]. This implements
PowerPC/vec_constants.ll:test_29.

llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner 2a099c04c1 Pull some code out into a helper function.
Effeciently codegen even splats in the range [-32,30].

This allows us to codegen <30,30,30,30> as:

        vspltisw v0, 15
        vadduwm v2, v0, v0

instead of as a cp load.

llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner 071ad01ceb Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such.  This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll

llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner 85bfa3c2bc Regenerate with adjusted costs
llvm-svn: 27746
2006-04-17 05:26:20 +00:00
Chris Lattner aac2a200cd Regenerate with correct offset
llvm-svn: 27744
2006-04-17 05:08:46 +00:00
Chris Lattner 311b1a6e23 Increase the opcodes by one each to disambiguate COPY from VMRGHW.
llvm-svn: 27742
2006-04-17 00:47:48 +00:00
Chris Lattner 07a3d01a91 Check in a table, generated by llvm-PerfectShuffle, of optimal shuffles
of various 4-element vectors.

llvm-svn: 27739
2006-04-17 00:37:02 +00:00
Evan Cheng 20712deecb movduprm, movshduprm bugs
llvm-svn: 27734
2006-04-16 18:11:28 +00:00
Evan Cheng 3064f9aaa6 Encoding bugs
llvm-svn: 27733
2006-04-16 07:02:22 +00:00
Evan Cheng 685ddd8152 Can't fold loads into alias vector SSE ops used for scalar operation. The load
address has to be 16-byte aligned but the values aren't spilled to 128-bit
locations.

llvm-svn: 27732
2006-04-16 06:58:19 +00:00
Chris Lattner 06a21ba96b Implement a TODO: have the legalizer canonicalize a bunch of operations to
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.

llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner fa5aa396c2 Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
Remove some done items from the todo list.

llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner 24acbe46c0 Fix a crash when faced with a shuffle vector that has an undef in its mask.
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner 873202fabd Add patterns for matching vnots with bit converted inputs. Most of these will
go away when I start using evan's binop type canonicalizer

llvm-svn: 27725
2006-04-15 23:45:24 +00:00
Chris Lattner 41df12ff4c Add a new vnot_conv predicate for matching vnot's where the allones vector is
bitconverted from some other type.

llvm-svn: 27724
2006-04-15 23:39:14 +00:00
Evan Cheng 8f1d801389 More encoding bugs
llvm-svn: 27722
2006-04-15 06:10:09 +00:00
Evan Cheng 91944e8699 pslldrm, psrawrm, etc. encoding bug
llvm-svn: 27721
2006-04-15 05:59:08 +00:00
Evan Cheng 1220b31a31 hsubp{s|d} encoding bug
llvm-svn: 27720
2006-04-15 05:52:42 +00:00
Evan Cheng 6222cf2a36 Silly bug
llvm-svn: 27719
2006-04-15 05:37:34 +00:00
Evan Cheng 65bb720a8b Do not use movs{h|l}dup for a shuffle with a single non-undef node.
llvm-svn: 27718
2006-04-15 03:13:24 +00:00
Evan Cheng 0ba896c75b Added SSE (and other) entries to foldMemoryOperand().
llvm-svn: 27716
2006-04-14 23:33:27 +00:00
Evan Cheng 00a5b3d9d3 Some clean up
llvm-svn: 27715
2006-04-14 23:32:40 +00:00
Chris Lattner 559c8ba466 Allow undef in a shuffle mask
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Evan Cheng 5d247f81c1 Last few SSE3 intrinsics.
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Evan Cheng 3bd605397b Misc. SSE2 intrinsics: clflush, lfench, mfence
llvm-svn: 27699
2006-04-14 07:43:12 +00:00
Evan Cheng e349d01acf We were not adjusting the frame size to ensure proper alignment when alloca /
vla are present in the function. This causes a crash when a leaf function
allocates space on the stack used to store / load with 128-bit SSE
instructions.

llvm-svn: 27698
2006-04-14 07:26:43 +00:00
Evan Cheng 8d76f3922b New entry
llvm-svn: 27697
2006-04-14 07:24:04 +00:00
Chris Lattner 4211ca9108 Move the rest of the PPCTargetLowering::LowerOperation cases out into
separate functions, for simplicity and code clarity.

llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner 19e9055eb5 Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
functions, which makes the code much cleaner :)

llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Evan Cheng eb0063a34f pcmpeq* and pcmpgt* intrinsics.
llvm-svn: 27685
2006-04-14 01:39:53 +00:00
Evan Cheng 16287444ff psll*, psrl*, and psra* intrinsics.
llvm-svn: 27684
2006-04-14 00:14:05 +00:00
Reid Spencer 64f6c11c59 Remove the .cvsignore file so this directory can be pruned.
llvm-svn: 27683
2006-04-13 22:00:10 +00:00
Reid Spencer 497ecf6840 Remove .cvsignore so that this directory can be pruned.
llvm-svn: 27682
2006-04-13 21:59:03 +00:00
Evan Cheng a84319719c Doh. PANDrm, etc. are not commutable.
llvm-svn: 27668
2006-04-13 18:11:28 +00:00
Chris Lattner 883fb053bd Force non-darwin targets to use a static relo model. This fixes PR734,
tested by CodeGen/Generic/vector.ll

llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner 5879efe0c8 add a note, move an altivec todo to the altivec list.
llvm-svn: 27654
2006-04-13 16:48:00 +00:00
Reid Spencer 9857229aba Add the README files to the distribution.
llvm-svn: 27651
2006-04-13 06:39:24 +00:00
Evan Cheng ed3996743f psad, pmax, pmin intrinsics.
llvm-svn: 27647
2006-04-13 06:11:45 +00:00
Evan Cheng 58dad55959 Various SSE2 packed integer intrinsics: pmulhuw, pavgw, etc.
llvm-svn: 27645
2006-04-13 05:24:54 +00:00
Evan Cheng e4f97ccf7f X86 SSE2 supports v8i16 multiplication
llvm-svn: 27644
2006-04-13 05:10:25 +00:00
Evan Cheng d2eb662415 Update
llvm-svn: 27643
2006-04-13 05:09:45 +00:00
Evan Cheng b3fe00bdc6 padds{b|w}, paddus{b|w}, psubs{b|w}, psubus{b|w} intrinsics.
llvm-svn: 27639
2006-04-13 00:43:35 +00:00
Evan Cheng 0aab735a1a Naming inconsistency.
llvm-svn: 27638
2006-04-13 00:00:23 +00:00
Evan Cheng c88afc36a9 SSE / SSE2 conversion intrinsics.
llvm-svn: 27637
2006-04-12 23:42:44 +00:00
Evan Cheng 92232307d0 All "integer" logical ops (pand, por, pxor) are now promoted to v2i64.
Clean up and fix various logical ops issues.

llvm-svn: 27633
2006-04-12 21:21:57 +00:00
Chris Lattner 147e50e1c5 Add a new way to match vector constants, which make it easier to bang bits of
different types.

Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1.  This compiles:

typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
  *P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
  *P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
  *P3 = vec_abs((vector float)*P3);
}

to:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        vspltisw v0, -1
        vslw v0, v0, v0
        lvx v1, 0, r3
        vand v1, v1, v0
        stvx v1, 0, r3
        lvx v1, 0, r4
        vandc v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vandc v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

instead of (with two constant pool entries):

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        li r6, lo16(LCPI1_0)
        lis r7, ha16(LCPI1_0)
        li r8, lo16(LCPI1_1)
        lis r9, ha16(LCPI1_1)
        lvx v0, r7, r6
        lvx v1, 0, r3
        vand v0, v1, v0
        stvx v0, 0, r3
        lvx v0, r9, r8
        lvx v1, 0, r4
        vand v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vand v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

GCC produces (with 2 cp entries):

_test:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc00c
        mtspr 256,r0
        lis r2,ha16(LC0)
        lis r9,ha16(LC1)
        la r2,lo16(LC0)(r2)
        lvx v0,0,r3
        lvx v1,0,r5
        la r9,lo16(LC1)(r9)
        lwz r12,-4(r1)
        lvx v12,0,r2
        lvx v13,0,r9
        vand v0,v0,v12
        stvx v0,0,r3
        vspltisw v0,-1
        vslw v12,v0,v0
        vandc v1,v1,v12
        stvx v1,0,r5
        lvx v0,0,r4
        vand v0,v0,v13
        stvx v0,0,r4
        mtspr 256,r12
        blr

llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner 74cf9ff761 Rename get_VSPLI_elt -> get_VSPLTI_elt
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively.  This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI

llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Evan Cheng e2157c6e41 Promote v4i32, v8i16, v16i8 load to v2i64 load.
llvm-svn: 27612
2006-04-12 17:12:36 +00:00
Chris Lattner e318a7574e Ensure that zero vectors are always v4i32, which forces them to CSE with
each other.  This implements CodeGen/PowerPC/vxor-canonicalize.ll

llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Evan Cheng 29be057d92 Various SSE2 conversion intrinsics
llvm-svn: 27603
2006-04-12 05:20:24 +00:00
Evan Cheng 70c74a3ced Added __builtin_ia32_storelv4si, __builtin_ia32_movqv4si,
__builtin_ia32_loadlv4si, __builtin_ia32_loaddqu, __builtin_ia32_storedqu.

llvm-svn: 27599
2006-04-11 22:28:25 +00:00
Nate Begeman f19bcd5177 Fix SingleSource/UnitTests/Vector/sumarray-dbl
llvm-svn: 27594
2006-04-11 19:44:43 +00:00
Nate Begeman 1bb132099f Fix PR727, correctly handling large stack aligments on ppc
llvm-svn: 27593
2006-04-11 19:29:21 +00:00
Chris Lattner aaa04230bd we have a shuffle instr, add an example.
llvm-svn: 27592
2006-04-11 18:47:03 +00:00
Evan Cheng 6b60357f4a gcc lower SSE prefetch into generic prefetch intrinsic. Need to add support
later.

llvm-svn: 27591
2006-04-11 18:04:57 +00:00
Evan Cheng 6ea715af28 Misc. intrinsics.
llvm-svn: 27590
2006-04-11 17:35:57 +00:00
Jim Laskey 02b3b72bfc Suppress debug label when not debug.
llvm-svn: 27588
2006-04-11 08:11:53 +00:00
Evan Cheng 09a956271a movnt* and maskmovdqu intrinsics
llvm-svn: 27587
2006-04-11 06:57:30 +00:00
Chris Lattner e4db08a2f1 Vector function results go into V2 according to GCC. The darwin ABI doc
doesn't say where they go :-/

llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner 92533cfb4a Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
No functionality change.

llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Evan Cheng 12ba3e23d0 Added support for _mm_move_ss and _mm_move_sd.
llvm-svn: 27575
2006-04-11 00:19:04 +00:00
Jim Laskey dca2655daa Use existing information.
llvm-svn: 27574
2006-04-10 23:09:19 +00:00
Evan Cheng f8ac02283c Remove some bogus patterns; clean up.
llvm-svn: 27569
2006-04-10 22:35:16 +00:00
Chris Lattner d99f57c1e1 add a note
llvm-svn: 27567
2006-04-10 21:51:03 +00:00
Evan Cheng 051de9a82b Remove an entry that is now done.
llvm-svn: 27565
2006-04-10 21:42:57 +00:00
Evan Cheng 76112c3cb8 Added some missing shuffle patterns.
llvm-svn: 27564
2006-04-10 21:42:19 +00:00
Evan Cheng 664fcba5fa Correct an entry
llvm-svn: 27563
2006-04-10 21:41:39 +00:00
Evan Cheng 395fa3d2a6 movups / movupd
llvm-svn: 27562
2006-04-10 21:11:06 +00:00
Evan Cheng 617a6a812e Conditional move of vector types.
llvm-svn: 27556
2006-04-10 07:23:14 +00:00
Evan Cheng 014849e121 New entries
llvm-svn: 27555
2006-04-10 07:22:03 +00:00
Evan Cheng c9ed8e4c1a Use movaps to do VR128 reg-to-reg copies for now. It's shorter and available for SSE1.
llvm-svn: 27554
2006-04-10 07:21:31 +00:00
Chris Lattner 3a68f3c3ca properly mark vector selects as expanded to select_cc
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner 0a3d1bbca4 Add VRRC select support
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Nate Begeman 3f9c17906f Disable switch lowering for targets based on the selection dag isel,
letting the code generator handle them directly.

llvm-svn: 27539
2006-04-08 19:46:55 +00:00
Chris Lattner d9e80f4516 Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vsplish instead of a
constant pool load.

llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner d71a1f946d Change the interface to the predicate that determines if vsplti* can be used.
No functionality changes.

llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Reid Spencer cf905223c5 Initialize SDOperand values because the gcc 4.0.2 compiler complains about
them.

llvm-svn: 27534
2006-04-08 05:38:03 +00:00
Evan Cheng 0df9c9f57d ldmxcsr and stmxcsr.
llvm-svn: 27506
2006-04-08 00:47:44 +00:00
Evan Cheng ac847268c5 Code clean up.
llvm-svn: 27501
2006-04-07 21:53:05 +00:00
Evan Cheng aa18a52545 Added patterns for MOVHPSmr and MOVLPSmr.
llvm-svn: 27497
2006-04-07 21:20:58 +00:00
Evan Cheng 748e573ce5 Keep track of an Mac OS X / x86 ABI bug.
llvm-svn: 27496
2006-04-07 21:19:53 +00:00
Jim Laskey c0d6518f27 Make sure that debug labels are defined within the same section and after the
entry point of a function.

llvm-svn: 27494
2006-04-07 20:44:42 +00:00
Jim Laskey 2d7298c362 Foundation for call frame information.
llvm-svn: 27491
2006-04-07 16:34:46 +00:00
Evan Cheng d8e1a01be6 A MOVPS2SSmr, i.e. _mm_store_ss, encoding bug.
Also MOVPDI2DIrr.

llvm-svn: 27476
2006-04-06 23:53:29 +00:00
Evan Cheng c995b45f67 - movlp{s|d} and movhp{s|d} support.
- Normalize shuffle nodes so result vector lower half elements come from the
  first vector, the rest come from the second vector. (Except for the
  exceptions :-).
- Other minor fixes.

llvm-svn: 27474
2006-04-06 23:23:56 +00:00
Evan Cheng acf8b3c828 New entries.
llvm-svn: 27473
2006-04-06 23:21:24 +00:00
Andrew Lenharth 1596a1b276 This may be overconservative, but it lets the new cfe compile
llvm-svn: 27471
2006-04-06 23:18:45 +00:00
Chris Lattner e61cfad815 Add an item
llvm-svn: 27470
2006-04-06 23:16:19 +00:00
Chris Lattner 466841ddc7 Make sure to return the result in the right type.
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner a4bbfaed5c Match vpku[hw]um(x,x).
Convert vsldoi(x,x) to work the same way other (x,x) cases work.

llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner f38e033270 Add support for matching vmrg(x,x) patterns
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Andrew Lenharth cee782d514 fix some linking problems with the new gcc
llvm-svn: 27460
2006-04-06 21:26:32 +00:00
Chris Lattner d1dcb52093 Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner a4c727f1cc remove two done items
llvm-svn: 27453
2006-04-06 19:19:38 +00:00
Chris Lattner 1d33819194 Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
lower it and LLVM to have one fewer intrinsic.  This implements
CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner e8b83b4206 Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
vperm with a perm mask lvx'd from the constant pool.

llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Evan Cheng 695e45c252 POR encoded as PAND, yikes.
llvm-svn: 27446
2006-04-06 01:49:20 +00:00
Evan Cheng dddb688a40 An entry about comi / ucomi intrinsics.
llvm-svn: 27445
2006-04-05 23:46:04 +00:00
Evan Cheng 780382946e Support for comi / ucomi intrinsics.
llvm-svn: 27444
2006-04-05 23:38:46 +00:00
Chris Lattner c94d932447 Add all of the data stream intrinsics and instructions. woo
llvm-svn: 27442
2006-04-05 22:27:14 +00:00
Chris Lattner 39dc64c955 Fix a typo
llvm-svn: 27440
2006-04-05 20:15:25 +00:00
Chris Lattner 39cc717c65 Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng f3b52c84ea Handle canonical form of e.g.
vector_shuffle v1, v1, <0, 4, 1, 5, 2, 6, 3, 7>

This is turned into
vector_shuffle v1, <undef>, <0, 0, 1, 1, 2, 2, 3, 3>
by dag combiner.

It would match a {p}unpckl on x86.

llvm-svn: 27437
2006-04-05 07:20:06 +00:00
Evan Cheng 6d196db40d Bogus assert
llvm-svn: 27434
2006-04-05 06:11:20 +00:00
Evan Cheng 2cf4232ced Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Evan Cheng 59a6355e82 Handle v8i16 shuffle that must be broken into a pair of pshufhw / pshuflw.
llvm-svn: 27427
2006-04-05 01:47:37 +00:00
Chris Lattner 2f8e2b2895 add vsl
llvm-svn: 27425
2006-04-05 01:16:22 +00:00
Chris Lattner 575352ac20 add vmladduhm
llvm-svn: 27423
2006-04-05 00:49:48 +00:00
Chris Lattner 5a528e565b Add m[tf]vscr instructions.
llvm-svn: 27421
2006-04-05 00:03:57 +00:00
Chris Lattner 0c82447c66 add a note
llvm-svn: 27419
2006-04-04 23:45:11 +00:00
Chris Lattner 281bb5da1d Add missing byte merges.
llvm-svn: 27418
2006-04-04 23:43:56 +00:00
Chris Lattner fc50ae521c Add FP -> Int Conversions
llvm-svn: 27417
2006-04-04 23:25:02 +00:00
Chris Lattner 96338b6a21 add average intrinsics
llvm-svn: 27416
2006-04-04 23:14:00 +00:00
Chris Lattner 4464383a17 add a note
llvm-svn: 27414
2006-04-04 22:43:55 +00:00
Chris Lattner 4a744e5c9d Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Evan Cheng 011c23d9d3 Added pslldq and psrldq.
llvm-svn: 27412
2006-04-04 21:49:39 +00:00
Evan Cheng 8f3b6b8d8a Minor fixes + naming changes.
llvm-svn: 27410
2006-04-04 19:12:30 +00:00
Evan Cheng 802b35c339 PSHUF* encoding bugs.
llvm-svn: 27405
2006-04-04 18:40:36 +00:00
Chris Lattner 95c7adc7cb Ask legalize to promote all vector shuffles to be v16i8 instead of having to
handle all 4 PPC vector types.   This simplifies the matching code and allows
us to eliminate a bunch of patterns.  This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.

llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Evan Cheng e91e3bd874 cmpps / cmppd encoding bug
llvm-svn: 27393
2006-04-04 03:04:07 +00:00
Evan Cheng dd2eb27d6d Compact some intrinsic definitions.
llvm-svn: 27388
2006-04-04 00:10:53 +00:00
Chris Lattner b1e6d84544 Plug in the byte and short splats
llvm-svn: 27387
2006-04-04 00:05:13 +00:00
Chris Lattner 447a7968af Revert accidentally committed hunks.
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner 533aed9a35 Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Evan Cheng 0ef83c83e1 Some SSE1 intrinsics: min, max, sqrt, etc.
llvm-svn: 27384
2006-04-03 23:49:17 +00:00
Chris Lattner bf0016f2d4 revert previous patch
llvm-svn: 27383
2006-04-03 23:14:49 +00:00
Evan Cheng b64827e662 Use movlpd to: store lower f64 extracted from v2f64.
Use movhpd to: store upper f64 extracted from v2f64.

llvm-svn: 27382
2006-04-03 22:30:54 +00:00
Chris Lattner 5400727595 Force use of a frame-pointer if there is anything on the stack that is aligned
more than the OS keeps the stack aligned.

llvm-svn: 27381
2006-04-03 22:03:29 +00:00
Evan Cheng ebf1006d16 - More efficient extract_vector_elt with shuffle and movss, movsd, movd, etc.
- Some bug fixes and naming inconsistency fixes.

llvm-svn: 27377
2006-04-03 20:53:28 +00:00
Chris Lattner 78c788b450 Align vectors to the size in bytes, not bits.
llvm-svn: 27376
2006-04-03 19:28:50 +00:00
Chris Lattner 9ccd61c893 Add the full set of min/max instructions
llvm-svn: 27372
2006-04-03 15:58:28 +00:00
Andrew Lenharth df7abf8b74 support x * (c1 + c2) where c1 and c2 are pow2s. special case for c2 == 4
llvm-svn: 27370
2006-04-03 04:19:17 +00:00
Andrew Lenharth 4e2c073a33 mul by const conversion sequences. more coming soon
llvm-svn: 27368
2006-04-03 03:18:59 +00:00
Andrew Lenharth 444bdb069a This makes McCat/12-IOtest go 8x faster or so
llvm-svn: 27363
2006-04-02 21:08:39 +00:00
Andrew Lenharth 01bd5523a3 This will be needed soon
llvm-svn: 27362
2006-04-02 20:13:57 +00:00
Chris Lattner acf1fc8a28 add a note
llvm-svn: 27360
2006-04-02 07:20:00 +00:00
Chris Lattner c5287c0ece Inform the dag combiner that the predicate compares only return a low bit.
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner 6c1321ca3f relax assertion
llvm-svn: 27358
2006-04-02 06:19:46 +00:00
Chris Lattner e6025525fb Allow targets to compute masked bits for intrinsics.
llvm-svn: 27357
2006-04-02 06:15:09 +00:00
Chris Lattner 80fdc1eb6b Remove done item
llvm-svn: 27351
2006-04-02 05:28:54 +00:00
Chris Lattner b80f114707 add a note
llvm-svn: 27348
2006-04-02 03:59:11 +00:00
Chris Lattner 7a29cf3c7f New note
llvm-svn: 27337
2006-04-02 01:47:20 +00:00
Chris Lattner 9b2d6e7886 Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
"vspltisb v0, 8" instead of a constant pool load.

llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner dc72c17798 Implement vnot using VNOR instead of using 'vspltisb v0, -1' and vxor
llvm-svn: 27331
2006-04-01 22:41:47 +00:00
Chris Lattner 0baebb11bf ADd a note
llvm-svn: 27324
2006-04-01 04:08:29 +00:00
Chris Lattner ff77dc0a08 Shrinkify some more intrinsic definitions.
llvm-svn: 27322
2006-03-31 22:41:56 +00:00
Evan Cheng dc1161cf53 An entry about packed type alignments.
llvm-svn: 27321
2006-03-31 22:35:14 +00:00
Chris Lattner 20d3f3726f Pull operand asm string into base class, shrinkifying intrinsic definitions.
No functionality change.

llvm-svn: 27320
2006-03-31 22:34:05 +00:00
Evan Cheng a11d834b8c TargetData.cpp::getTypeInfo() was returning alignment of element type as the
alignment of a packed type. This is obviously wrong. Added a workaround that
returns the size of the packed type as its alignment. The correct fix would
be to return a target dependent alignment value provided via TargetLowering
(or some other interface).

llvm-svn: 27319
2006-03-31 22:33:42 +00:00
Chris Lattner 110fc74b97 Fix 80 column violations :)
llvm-svn: 27315
2006-03-31 21:57:36 +00:00
Evan Cheng 5fd7c69473 Use a X86 target specific node X86ISD::PINSRW instead of a mal-formed
INSERT_VECTOR_ELT to insert a 16-bit value in a 128-bit vector.

llvm-svn: 27314
2006-03-31 21:55:24 +00:00
Evan Cheng 747e29ef0b Added support for SSE3 horizontal ops: haddp{s|d} and hsub{s|d}.
llvm-svn: 27310
2006-03-31 21:29:33 +00:00
Chris Lattner a4150f751d fix a pasto
llvm-svn: 27308
2006-03-31 21:19:06 +00:00
Chris Lattner e7fd4b0274 Add vperm support for all datatypes
llvm-svn: 27307
2006-03-31 20:00:35 +00:00
Chris Lattner baa73e0d91 Rearrange code a bit
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner 754b41c84b Add, sub and shuffle are legal for all vector types
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Evan Cheng cbffa4656b Add support to use pextrw and pinsrw to extract and insert a word element
from a 128-bit vector.

llvm-svn: 27304
2006-03-31 19:22:53 +00:00
Evan Cheng 3296f297d5 Add vector_extract and vector_insert nodes.
llvm-svn: 27303
2006-03-31 19:21:16 +00:00
Chris Lattner 40ff17dc22 add a note
llvm-svn: 27302
2006-03-31 19:00:22 +00:00
Chris Lattner 829a061abf note to self: *save* file, then check it in
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner d4058a59d4 Implement an item from the readme, folding vcmp/vcmp. instructions with
identical instructions into a single instruction.  For example, for:

void test(vector float *x, vector float *y, int *P) {
  int v = vec_any_out(*x, *y);
  *x = (vector float)vec_cmpb(*x, *y);
  *P = v;
}

we now generate:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        lvx v0, 0, r4
        lvx v1, 0, r3
        vcmpbfp. v0, v1, v0
        mfcr r4, 2
        stvx v0, 0, r3
        rlwinm r3, r4, 27, 31, 31
        xori r3, r3, 1
        stw r3, 0(r5)
        mtspr 256, r2
        blr

instead of:

_test:
        mfspr r2, 256
        oris r6, r2, 57344
        mtspr 256, r6
        lvx v0, 0, r4
        lvx v1, 0, r3
        vcmpbfp. v2, v1, v0
        mfcr r4, 2
***     vcmpbfp v0, v1, v0
        rlwinm r4, r4, 27, 31, 31
        stvx v0, 0, r3
        xori r3, r4, 1
        stw r3, 0(r5)
        mtspr 256, r2
        blr

Testcase here: CodeGen/PowerPC/vcmp-fold.ll

llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner 070181c927 compactify some more instruction definitions
llvm-svn: 27288
2006-03-31 05:38:32 +00:00
Chris Lattner 45c709388a Compactify comparisons.
llvm-svn: 27287
2006-03-31 05:32:57 +00:00
Chris Lattner d7495ae7e9 Lower vector compares to VCMP nodes, just like we lower vector comparison
predicates to VCMPo nodes.

llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner e5a6c4f8b7 These are done
llvm-svn: 27284
2006-03-31 04:53:21 +00:00
Chris Lattner 051f7861b8 Was returning the wrong type.
llvm-svn: 27277
2006-03-31 01:50:09 +00:00
Chris Lattner bca5fbe914 Mark INSERT_VECTOR_ELT as expand
llvm-svn: 27276
2006-03-31 01:48:55 +00:00
Evan Cheng 1b0d294de0 Expand all INSERT_VECTOR_ELT (obviously bad) for now.
llvm-svn: 27275
2006-03-31 01:30:39 +00:00
Chris Lattner f144dac7b7 Modify the TargetLowering::getPackedTypeBreakdown method to also return the
unpromoted element type.

llvm-svn: 27273
2006-03-31 00:46:36 +00:00
Evan Cheng d9d0bbb5ac Typo
llvm-svn: 27272
2006-03-31 00:33:57 +00:00
Evan Cheng 99d7205fba Ok for vector_shuffle mask to contain undef elements.
llvm-svn: 27271
2006-03-31 00:30:29 +00:00
Chris Lattner 549fb167eb Implement TargetLowering::getPackedTypeBreakdown
llvm-svn: 27270
2006-03-31 00:28:56 +00:00
Chris Lattner c4e3eadf21 Add the rest of the vmul instructions and the vmulsum* instructions.
llvm-svn: 27268
2006-03-30 23:39:06 +00:00
Chris Lattner a23158f1ca Use a new tblgen feature to significantly shrinkify instruction definitions that
directly correspond to intrinsics.

llvm-svn: 27266
2006-03-30 23:21:27 +00:00
Chris Lattner 551d3a11d3 Add a bunch of new instructions for intrinsics.
llvm-svn: 27265
2006-03-30 23:07:36 +00:00
Evan Cheng 7e2ff11a42 Make sure all possible shuffles are matched.
Use pshufd, pshuhw, and pshulw to shuffle v4f32 if shufps doesn't match.
Use shufps to shuffle v4f32 if pshufd, pshuhw, and pshulw don't match.

llvm-svn: 27259
2006-03-30 19:54:57 +00:00
Evan Cheng dd487d865b More logical ops patterns
llvm-svn: 27257
2006-03-30 07:33:32 +00:00
Evan Cheng c58ef7deeb Add support for _mm_cmp{cc}_ss and _mm_cmp{cc}_ps intrinsics
llvm-svn: 27256
2006-03-30 06:21:22 +00:00
Evan Cheng 593310016d Add 128-bit pmovmskb intrinsic support.
llvm-svn: 27255
2006-03-30 00:33:26 +00:00
Evan Cheng c5cf9bba05 Change SSE pack operation definitions to fit what the intrinsics expected.
For example, packsswb actually creates a v16i8 from a pair of v8i16. But since
the intrinsic specification forces the output type to match the operands.

llvm-svn: 27254
2006-03-29 23:53:14 +00:00
Evan Cheng b7fedffc78 - Added some SSE2 128-bit packed integer ops.
- Added SSE2 128-bit integer pack with signed saturation ops.
- Added pshufhw and pshuflw ops.

llvm-svn: 27252
2006-03-29 23:07:14 +00:00
Evan Cheng acc336475e Need to special case splat after all. Make the second operand of splat
vector_shuffle undef.

llvm-svn: 27250
2006-03-29 19:02:40 +00:00
Evan Cheng 3cf95747c7 Floating point logical operation patterns should match bit_convert. Or else
integer vector logical operations would match andp{s|d} instead of pand.

llvm-svn: 27248
2006-03-29 18:47:40 +00:00
Evan Cheng 500ec16578 - More shuffle related bug fixes.
- Whenever possible use ops of the right packed types for vector shuffles /
  splats.

llvm-svn: 27246
2006-03-29 03:04:49 +00:00
Evan Cheng 3a1c4e75de Another entry about shuffles.
llvm-svn: 27245
2006-03-29 03:03:46 +00:00
Evan Cheng da59b0d2a8 - Only use pshufd for v4i32 vector shuffles.
- Other shuffle related fixes.

llvm-svn: 27244
2006-03-29 01:30:51 +00:00
Chris Lattner 7d6f4f14b4 add a note
llvm-svn: 27243
2006-03-29 00:24:13 +00:00
Evan Cheng 38b34296d0 Added aliases to scalar SSE instructions, e.g. addss, to match x86 intrinsics.
The source operands type are v4sf with upper bits passes through.
Added matching code for these.

llvm-svn: 27240
2006-03-28 23:51:43 +00:00
Evan Cheng 8160fd3d42 Fixing buggy code.
llvm-svn: 27239
2006-03-28 23:41:33 +00:00
Chris Lattner 66e1410858 add a note
llvm-svn: 27227
2006-03-28 18:56:23 +00:00
Jim Laskey d1aa1638c6 Expose base register for DwarfWriter. Refactor code accordingly.
llvm-svn: 27225
2006-03-28 13:48:33 +00:00
Jim Laskey 457e54efc1 Added missing paren on behalf of Ramana Radhakrishnan.
llvm-svn: 27223
2006-03-28 10:17:11 +00:00
Evan Cheng 21e5476deb Missed X86::isUNPCKHMask
llvm-svn: 27222
2006-03-28 08:27:15 +00:00
Evan Cheng be2d9a0e99 movlps and movlpd should be modeled as two address code.
llvm-svn: 27221
2006-03-28 07:01:28 +00:00
Evan Cheng dc57ae0711 Update
llvm-svn: 27220
2006-03-28 06:55:45 +00:00
Evan Cheng 4e7374ff8a Typo
llvm-svn: 27219
2006-03-28 06:53:49 +00:00
Evan Cheng 1a194a5264 * Prefer using operation of matching types. e.g unpcklpd rather than movlhps.
* Bug fixes.

llvm-svn: 27218
2006-03-28 06:50:32 +00:00
Nate Begeman af8c373e77 Fix a couple typos
llvm-svn: 27216
2006-03-28 04:18:18 +00:00
Nate Begeman 1b3928765d Add a few more altivec intrinsics
llvm-svn: 27215
2006-03-28 04:15:58 +00:00
Evan Cheng 08b473c619 Added a couple of entries about movhps and movlhps.
llvm-svn: 27212
2006-03-28 02:49:12 +00:00
Evan Cheng 3765fadef6 All unpack cases are now being handled.
llvm-svn: 27211
2006-03-28 02:44:05 +00:00
Evan Cheng 2bc3280659 - Clean up / consoladate various shuffle masks.
- Some misc. bug fixes.
- Use MOVHPDrm to load from m64 to upper half of a XMM register.

llvm-svn: 27210
2006-03-28 02:43:26 +00:00
Chris Lattner 3710fca2b8 implement a bunch more intrinsics.
llvm-svn: 27209
2006-03-28 02:29:37 +00:00
Chris Lattner cb5ec07cc3 Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
same thing and we have a dag node for the former.

llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner e55d171ccd Tblgen doesn't like multiple SDNode<> definitions that map to the sameenum value. Split them into separate enums.
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Evan Cheng 5df75889db Model unpack lower and interleave as vector_shuffle so we can lower the
intrinsics as such.

llvm-svn: 27200
2006-03-28 00:39:58 +00:00
Jim Laskey fa53b276d0 Translate llvm target registers to dwarf register numbers properly.
llvm-svn: 27180
2006-03-27 20:18:45 +00:00
Chris Lattner 018e17c8de unbreak the build
llvm-svn: 27174
2006-03-27 16:52:45 +00:00
Chris Lattner 939c9ab88f Add a bunch of notes from my journey thus far.
llvm-svn: 27170
2006-03-27 07:41:00 +00:00
Chris Lattner 22ec3e7b7e Split out altivec notes into their own README
llvm-svn: 27168
2006-03-27 07:04:16 +00:00
Evan Cheng 9b9cc4fb39 Use pcmpeq to generate vector of all ones.
llvm-svn: 27167
2006-03-27 07:00:16 +00:00
Evan Cheng a74792fa9d Changed isBuildVectorAllOnesInteger to isBuildVectorAllOnes.
llvm-svn: 27166
2006-03-27 06:59:32 +00:00
Chris Lattner 1738c293b5 Fix the JIT encoding of VSEL
llvm-svn: 27160
2006-03-27 03:34:17 +00:00
Chris Lattner df59d5314c Fix the JIT encoding of VSPLTI*
llvm-svn: 27159
2006-03-27 03:28:57 +00:00
Nate Begeman ed728c1291 SelectionDAGISel can now natively handle Switch instructions, in the same
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks.  The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.

This functionality is currently only enabled on x86, but should be safe for
every target.  In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.

llvm-svn: 27156
2006-03-27 01:32:24 +00:00
Chris Lattner 65473e20d8 add vsel
llvm-svn: 27153
2006-03-26 22:38:43 +00:00
Nate Begeman 68cc9d4540 Readme note
llvm-svn: 27152
2006-03-26 19:19:27 +00:00
Chris Lattner 6961fc76bb Codegen vector predicate compares.
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng ed6184aef2 Remove X86:isZeroVector, use ISD::isBuildVectorAllZeros instead; some fixes / cleanups
llvm-svn: 27150
2006-03-26 09:53:12 +00:00
Evan Cheng b1ddc988af Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Evan Cheng 5562f2092f Add immAllZerosV helper
llvm-svn: 27148
2006-03-26 09:51:39 +00:00
Chris Lattner 793cbcb4fd Add all of the altivec comparison instructions. Add patterns for the
non-predicate altivec compare intrinsics.

llvm-svn: 27143
2006-03-26 04:57:17 +00:00
Chris Lattner c6c88b2ea1 Add and 8/16-bit adds, add all integer subtracts, add saturating subtract
intrinsics.

llvm-svn: 27142
2006-03-26 02:39:02 +00:00
Chris Lattner 53e07decd7 implement the vsldoi intrinsic.
llvm-svn: 27139
2006-03-26 00:41:48 +00:00
Chris Lattner 5c0c762443 fix the pattern for vandc, it's NOT vnand
llvm-svn: 27136
2006-03-25 23:10:40 +00:00
Chris Lattner e8c1d04051 add patterns for VANDC/VNOR, implementing
CodeGen/PowerPC/eqv-andc-orc-nor.ll:VNOR/VANDC

llvm-svn: 27135
2006-03-25 23:05:29 +00:00
Chris Lattner 3de9286e09 add a vnot helper node for matching 'not' on vectors
llvm-svn: 27132
2006-03-25 23:00:08 +00:00
Chris Lattner b3617beb52 Add some logical operations
llvm-svn: 27127
2006-03-25 22:16:05 +00:00
Evan Cheng 3e4d38eea5 Added missing (any_extend (load ...)) patterns.
llvm-svn: 27120
2006-03-25 09:45:48 +00:00
Evan Cheng 2bc0941e2a Build arbitrary vector with more than 2 distinct scalar elements with a
series of unpack and interleave ops.

llvm-svn: 27119
2006-03-25 09:37:23 +00:00
Chris Lattner 1b4bb22f8a implement a bunch of intrinsics
llvm-svn: 27118
2006-03-25 08:01:02 +00:00
Chris Lattner 2a85fa1f79 Move all Altivec stuff out into a new PPCInstrAltivec.td file.
Add a bunch of patterns for different datatypes, e.g. bit_convert, undef and
zero vector support.

llvm-svn: 27117
2006-03-25 07:51:43 +00:00
Chris Lattner 1cb91b3cd9 Add some basic patterns for other datatypes
llvm-svn: 27116
2006-03-25 07:39:07 +00:00
Chris Lattner 3a66a75108 add all supported formats to the vector register file
llvm-svn: 27115
2006-03-25 07:36:56 +00:00
Chris Lattner f653cdd3f9 Add support for __builtin_altivec_vnmsubfp /vmaddfp
llvm-svn: 27112
2006-03-25 07:05:55 +00:00
Chris Lattner 5d70a7c4a5 #include Intrinsics.h into all dag isels
llvm-svn: 27109
2006-03-25 06:47:10 +00:00
Chris Lattner 2771e2c960 Codegen things like:
<int -1, int -1, int -1, int -1>
and
 <int 65537, int 65537, int 65537, int 65537>

Using things like:
  vspltisb v0, -1
and:
  vspltish v0, 1

instead of using constant pool loads.

This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.

llvm-svn: 27106
2006-03-25 06:12:06 +00:00
Evan Cheng 79e500ec74 Added SSE cachebility ops
llvm-svn: 27103
2006-03-25 06:03:26 +00:00
Evan Cheng 1aaa7280cd Instruction encoding bug
llvm-svn: 27102
2006-03-25 06:00:03 +00:00
Chris Lattner 9dc2d17ae6 Add new intrinsic node definitions for tblgen use
llvm-svn: 27100
2006-03-25 02:29:35 +00:00
Evan Cheng 6f7d31ea50 Added 128-bit packed integer subtraction.
llvm-svn: 27096
2006-03-25 01:33:37 +00:00
Evan Cheng 8e481df625 Added CVTTPS2PI.
llvm-svn: 27095
2006-03-25 01:31:59 +00:00
Evan Cheng 980c4d5b46 Added CVTSS2SI.
llvm-svn: 27094
2006-03-25 01:00:18 +00:00
Evan Cheng e7ee6a5e32 Support for scalar to vector with zero extension.
llvm-svn: 27091
2006-03-24 23:15:12 +00:00
Jim Laskey bb84eae239 D'oh - should be even numbered.
llvm-svn: 27088
2006-03-24 22:48:02 +00:00
Evan Cheng 2f0277bf48 Added LDMXCSR
llvm-svn: 27087
2006-03-24 22:28:37 +00:00
Chris Lattner 97599f1211 plug the intrinsics into the patterns for movmsk*
llvm-svn: 27083
2006-03-24 21:49:18 +00:00
Jim Laskey f0729b4067 Add dwarf register numbering to register data.
llvm-svn: 27081
2006-03-24 21:15:58 +00:00
Jim Laskey 3b338d5566 Add support for dwarf register numbering.
llvm-svn: 27080
2006-03-24 21:13:21 +00:00
Chris Lattner 9f9b6116e1 add another note
llvm-svn: 27077
2006-03-24 20:04:27 +00:00
Chris Lattner 0affd76182 add a note
llvm-svn: 27076
2006-03-24 19:59:17 +00:00
Chris Lattner c6b13e21cc Shuffle some includes around
llvm-svn: 27073
2006-03-24 18:52:35 +00:00
Chris Lattner 58a9622957 expose intrinsic info to the targets.
llvm-svn: 27070
2006-03-24 18:44:11 +00:00
Chris Lattner d589dd1352 Fix a bad JIT encoding of VPERM. Why is VPERM D,A,B,C but vfmadd is D,A,C,B ??
llvm-svn: 27069
2006-03-24 18:24:43 +00:00
Chris Lattner f2286d5917 Like the comment says, prefer to use the implicit add done by [r+r] addressing
modes than emitting an explicit add and using a base of r0.  This implements
Regression/CodeGen/PowerPC/mem-rr-addr-mode.ll

llvm-svn: 27068
2006-03-24 17:58:06 +00:00
Jim Laskey 864e444749 Clean up some commentary.
llvm-svn: 27064
2006-03-24 10:00:56 +00:00
Chris Lattner a90b7141ed Disable the i32->float G5 optimization. It is unsafe, as documented in the
comment.

This fixes 177.mesa, and McCat/09-vor with the td scheduler.

llvm-svn: 27060
2006-03-24 07:53:47 +00:00
Chris Lattner ab882abce8 add support for using vxor to build zero vectors. This implements
Regression/CodeGen/PowerPC/vec_zero.ll

llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Evan Cheng 082c8785ef Handle BUILD_VECTOR with all zero elements.
llvm-svn: 27056
2006-03-24 07:29:27 +00:00
Chris Lattner f5efddf80b Gabor points out that we can't spell. :)
llvm-svn: 27049
2006-03-24 07:12:19 +00:00
Evan Cheng a91d8a5b43 All v2f64 shuffle cases can be handled.
llvm-svn: 27044
2006-03-24 06:40:32 +00:00
Evan Cheng 2595a687da More efficient v2f64 shuffle using movlhps, movhlps, unpckhpd, and unpcklpd.
llvm-svn: 27040
2006-03-24 02:58:06 +00:00
Evan Cheng 6afb3c2de7 A new entry
llvm-svn: 27039
2006-03-24 02:57:03 +00:00
Reid Spencer f9c3dcfdc1 Ignore the burg output files.
llvm-svn: 27033
2006-03-24 02:21:35 +00:00
Evan Cheng d27fb3e85e Handle more shuffle cases with SHUFP* instructions.
llvm-svn: 27024
2006-03-24 01:18:28 +00:00
Evan Cheng 4b5b4e373b Typo
llvm-svn: 27008
2006-03-23 23:24:51 +00:00
Chris Lattner cbcfe46556 add a note
llvm-svn: 27000
2006-03-23 21:28:44 +00:00
Evan Cheng f842ea57bb Typo
llvm-svn: 26997
2006-03-23 20:26:04 +00:00
Chris Lattner 81137629e0 Add PPC vector bit-convert support
llvm-svn: 26995
2006-03-23 19:54:27 +00:00
Jim Laskey 3c43609f1f Add support to locate local variables in frames (early version.)
llvm-svn: 26994
2006-03-23 18:12:57 +00:00
Jim Laskey cf0166fbeb Change interface to DwarfWriter.
llvm-svn: 26991
2006-03-23 18:09:44 +00:00
Jim Laskey 267d39d128 Modify how CBE handles #lines.
llvm-svn: 26990
2006-03-23 18:08:29 +00:00
Chris Lattner ce0206e119 Fix the encodings of these new instructions, hopefully fixing the JIT
failures from last night

llvm-svn: 26981
2006-03-23 16:13:50 +00:00
Evan Cheng 82ed4a42f9 Following icc's lead: use movdqa to load / store 128-bit integer vectors
llvm-svn: 26980
2006-03-23 07:44:07 +00:00
Chris Lattner 6f95ab7abb Eliminate IntrinsicLowering from TargetMachine.
Make the CBE and V9 backends create their own, since they're the only ones that use it.

llvm-svn: 26974
2006-03-23 05:43:16 +00:00
Chris Lattner 811dd8d009 remove always-null IntrinsicLowering argument.
llvm-svn: 26971
2006-03-23 05:28:02 +00:00
Evan Cheng 7055878170 Add v4i32 <-> v4f32 bitconvert patterns.
llvm-svn: 26969
2006-03-23 02:36:37 +00:00
Evan Cheng b9b0550dc6 Add 128-bit integer vector load and add (for testing).
llvm-svn: 26967
2006-03-23 01:57:24 +00:00
Nate Begeman fb6e02931c Add support for 8 bit immediates with 16/32 bit cmp instructions
llvm-svn: 26966
2006-03-23 01:29:48 +00:00
Evan Cheng 021bb7c956 Added a ValueType operand to isShuffleMaskLegal(). For now, x86 will not do
64-bit vector shuffle.

llvm-svn: 26964
2006-03-22 22:07:06 +00:00
Evan Cheng ed794cd27b SHUFP* are two address code.
llvm-svn: 26959
2006-03-22 20:08:18 +00:00
Evan Cheng bc04722860 Some clean up.
llvm-svn: 26957
2006-03-22 19:22:18 +00:00
Evan Cheng d4e1557941 - Supposely movlhps is faster / better than unpcklpd.
- Don't forget pshufd is only available with sse2.

llvm-svn: 26956
2006-03-22 19:16:21 +00:00
Evan Cheng 68ad48bd1a - Implement X86ISelLowering::isShuffleMaskLegal(). We currently only support
splat and PSHUFD cases.
- Clean up shuffle / splat matching code.

llvm-svn: 26954
2006-03-22 18:59:22 +00:00
Evan Cheng 8fdbdf20cd - VECTOR_SHUFFLE of v4i32 / v4f32 with undef second vector always matches
PSHUFD. We can make permutes entries which point to the undef pointing
  anything we want.
- Change some names to appease Chris.

llvm-svn: 26951
2006-03-22 08:01:21 +00:00
Chris Lattner e24cf9dfa1 add a note
llvm-svn: 26950
2006-03-22 07:33:46 +00:00
Evan Cheng 3617caf526 Fix PSHUF* and SHUF* jit code emission problems
llvm-svn: 26949
2006-03-22 07:10:28 +00:00
Chris Lattner eccf46950c This has been implemented. Tweak it into another note
llvm-svn: 26944
2006-03-22 05:33:23 +00:00
Chris Lattner 4a66d69433 When possible, custom lower 32-bit SINT_TO_FP to this:
_foo2:
        extsw r2, r3
        std r2, -8(r1)
        lfd f0, -8(r1)
        fcfid f0, f0
        frsp f1, f0
        blr

instead of this:

_foo2:
        lis r2, ha16(LCPI2_0)
        lis r4, 17200
        xoris r3, r3, 32768
        stw r3, -4(r1)
        stw r4, -8(r1)
        lfs f0, lo16(LCPI2_0)(r2)
        lfd f1, -8(r1)
        fsub f0, f1, f0
        frsp f1, f0
        blr

This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).

llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner 77373d1bea Add support for "ri" addressing modes where the immediate is a 14-bit field
which is shifted left two bits before use.  Instructions like STD use this
addressing mode.

llvm-svn: 26942
2006-03-22 05:26:03 +00:00
Chris Lattner f5e36c8bc0 fix a warning
llvm-svn: 26941
2006-03-22 04:18:34 +00:00
Evan Cheng d097e67544 Some splat and shuffle support.
llvm-svn: 26940
2006-03-22 02:53:00 +00:00
Evan Cheng b1d3c64d1f Add a couple more pseudo instructions.
llvm-svn: 26939
2006-03-22 02:52:03 +00:00
Chris Lattner 4e7371758f Fix the JIT encoding of the VAForm_1 instructions, including vmaddfp
llvm-svn: 26935
2006-03-22 01:44:36 +00:00
Evan Cheng baea59c61c Didn't mean to check this in. No MMX support yet.
llvm-svn: 26933
2006-03-21 23:04:23 +00:00
Evan Cheng d5e905d762 - Use movaps to store 128-bit vector integers.
- Each scalar to vector v8i16 and v16i8 is a any_extend followed by a movd.

llvm-svn: 26932
2006-03-21 23:01:21 +00:00
Chris Lattner 00f4683bf6 These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will.
llvm-svn: 26930
2006-03-21 20:51:05 +00:00
Chris Lattner 3a2ae6ad3c Don't emit pseudo instructions!
llvm-svn: 26926
2006-03-21 20:19:37 +00:00
Nate Begeman 013127981a Update readme
llvm-svn: 26924
2006-03-21 18:58:20 +00:00
Chris Lattner 139eac5b71 Print absolute memory references like this:
lwz r2, 8(0)
instead of this:
       lwz r2, 8(r0)

This fixes the llc/llc-beta failures on PPC last night.

llvm-svn: 26922
2006-03-21 17:21:13 +00:00
Evan Cheng 2d819f5fa4 Combine 2 entries
llvm-svn: 26921
2006-03-21 07:18:26 +00:00
Evan Cheng aeebc96099 Add a note about x86 register coallescing
llvm-svn: 26920
2006-03-21 07:12:57 +00:00
Evan Cheng 1208d9179a - Remove scalar to vector pseudo ops. They are just wrong.
- Handle FR32 to VR128:v4f32 and FR64 to VR128:v2f64 with aliases of MOVAPS
and MOVAPD. Mark them as move instructions and *hope* they will be deleted.

llvm-svn: 26919
2006-03-21 07:09:35 +00:00
Chris Lattner bda7310ef7 With Evan's latest tblgen patch, this code is obsolete, thanks Evan!
llvm-svn: 26917
2006-03-21 06:37:40 +00:00
Chris Lattner d2132f87d7 When codegen'ing vector MUL using VFMADD, *add* the 0, don't *mul* the 0.
llvm-svn: 26913
2006-03-21 00:51:38 +00:00
Chris Lattner f194834161 minor note
llvm-svn: 26912
2006-03-21 00:47:09 +00:00
Evan Cheng e4d1416239 x86 ISD::SCALAR_TO_VECTOR support.
llvm-svn: 26911
2006-03-21 00:33:35 +00:00
Evan Cheng fb872b41c0 Junk unused vector register classes.
llvm-svn: 26910
2006-03-21 00:30:59 +00:00
Chris Lattner c8b16d00b9 Handle constant addresses more efficiently, folding the low bits into the
disp field of the load/store if possible.  This compiles
CodeGen/PowerPC/load-constant-addr.ll to:

_test:
        lis r2, 2838
        lfs f1, 26848(r2)
        blr

instead of:

_test:
        lis r2, 2838
        ori r2, r2, 26848
        lfs f1, 0(r2)
        blr

llvm-svn: 26908
2006-03-20 22:38:22 +00:00
Chris Lattner 6d74b09da7 remove dead variable
llvm-svn: 26907
2006-03-20 22:37:23 +00:00
Chris Lattner a1bc294f0c Fix a couple of bugs in permute/splat generate, thanks to Nate for actually
figuring these out! :)

llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner eda030da04 reenable this hack, the tblgen version isn't quite ready
llvm-svn: 26902
2006-03-20 17:54:43 +00:00
Chris Lattner f96d523b8f Fix the pattern for VADDUWM, add i32 splat
llvm-svn: 26901
2006-03-20 17:51:58 +00:00
Evan Cheng 89f3cff0f5 Use tblgen'd VECTOR_SHUFFLE selection code.
llvm-svn: 26900
2006-03-20 08:14:16 +00:00
Chris Lattner a9a1313386 Add support for generating vspltw, instead of a vperm instruction with a
constant pool load.  This generates significantly nicer code for splats.

When tblgen gets bugfixed, we can remove the custom selection code.

llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner a8fbb6dd3d Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate.
llvm-svn: 26897
2006-03-20 06:37:44 +00:00
Chris Lattner ffc475689b fix duplicate definition errors
llvm-svn: 26896
2006-03-20 06:33:01 +00:00
Chris Lattner 80b6bd2746 Add a build_vector node
llvm-svn: 26895
2006-03-20 06:18:01 +00:00
Chris Lattner 382f356bd9 Check in some intermediate code that adds a skeleton for matching vsplt*
instructions

llvm-svn: 26894
2006-03-20 06:15:45 +00:00
Evan Cheng e6448448c2 Move a few things around.
llvm-svn: 26893
2006-03-20 06:04:52 +00:00
Chris Lattner e4e1ac37ba add vector_shuffle
llvm-svn: 26891
2006-03-20 05:40:45 +00:00
Chris Lattner 93d99f9928 fix typo
llvm-svn: 26889
2006-03-20 05:05:55 +00:00
Chris Lattner 366b2514fa add vsplat instructions, fix sched description for vperm
llvm-svn: 26888
2006-03-20 04:47:33 +00:00
Chris Lattner a8713b1ee6 Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*

llvm-svn: 26887
2006-03-20 01:53:53 +00:00
Chris Lattner 0a8b4eaee9 Claim to have v16i8 for perm masks
llvm-svn: 26886
2006-03-20 01:53:02 +00:00
Chris Lattner e7a058de7d add the vperm instruction
llvm-svn: 26883
2006-03-20 01:00:56 +00:00
Chris Lattner d16f6fdd49 add a note with a testcase
llvm-svn: 26877
2006-03-19 22:27:41 +00:00
Chris Lattner 169e6238ad Add a note about the MUL -> FMADD vector bug.
llvm-svn: 26874
2006-03-19 22:08:08 +00:00
Evan Cheng f7c2e3628b Vector undef's
llvm-svn: 26870
2006-03-19 09:38:54 +00:00
Chris Lattner 7e9440a4fc Custom lower SCALAR_TO_VECTOR into lve*x.
llvm-svn: 26868
2006-03-19 06:55:52 +00:00
Chris Lattner b1ee9c7e24 PPC doesn't have SCALAR_TO_VECTOR
llvm-svn: 26865
2006-03-19 06:17:19 +00:00
Chris Lattner 5b595af956 add support for vector undef
llvm-svn: 26863
2006-03-19 06:10:09 +00:00
Evan Cheng 0a03f789c2 Remind us of exit value substitution
llvm-svn: 26862
2006-03-19 06:09:23 +00:00
Evan Cheng 5111c81a3c Turning on LSR by default
llvm-svn: 26861
2006-03-19 06:08:49 +00:00
Evan Cheng 66a9c0dea7 Remember which tests are hurt by LSR.
llvm-svn: 26860
2006-03-19 06:08:11 +00:00
Chris Lattner 0c9eb670bb minor fixes
llvm-svn: 26857
2006-03-19 05:43:01 +00:00
Chris Lattner ea6468758d notes
llvm-svn: 26856
2006-03-19 05:33:30 +00:00
Chris Lattner 431c90c9fa we don't use lmw/stmw. When we want them they are easy enough to add
llvm-svn: 26853
2006-03-19 04:33:37 +00:00
Chris Lattner f7b6e7212f rename these nodes
llvm-svn: 26848
2006-03-19 01:13:28 +00:00
Evan Cheng 9bf978dc20 Use the generic vector register classes VR64 / VR128 rather than V4F32,
V8I16, etc.

llvm-svn: 26838
2006-03-18 01:23:20 +00:00
Nate Begeman 21f87d0e4c Fix subfic to match subc by default instead of sub so that it is correctly
cost-modeled as producing a flag.  This fixes the test I just added for neg

llvm-svn: 26835
2006-03-17 22:41:37 +00:00
Evan Cheng b09a56f3a4 Darwin should use _setjmp/_longjmp instead of setjmp/longjmp.
llvm-svn: 26833
2006-03-17 20:31:41 +00:00
Evan Cheng 4f674921d6 Move some pattern fragments to the right files.
llvm-svn: 26831
2006-03-17 19:55:52 +00:00
Chris Lattner 388fc4d9fb Disable x86 fastcc from passing args in registers
llvm-svn: 26824
2006-03-17 17:27:47 +00:00
Chris Lattner 43798850f9 Parameterize the number of integer arguments to pass in registers
llvm-svn: 26818
2006-03-17 05:10:20 +00:00
Evan Cheng bfc2e97383 Also fold MOV8r0, MOV16r0, MOV32r0 + store to MOV8mi, MOV16mi, and MOV32mi.
llvm-svn: 26817
2006-03-17 02:36:22 +00:00
Evan Cheng aca7915b70 Add some missing entries to X86RegisterInfo::foldMemoryOperand(). e.g.
ADD32ri8.

llvm-svn: 26816
2006-03-17 02:25:01 +00:00
Evan Cheng 27750f3287 - Nuke 16-bit SBB instructions. We'll never use them.
- Nuke a bogus comment.

llvm-svn: 26815
2006-03-17 02:24:04 +00:00
Nate Begeman bb01d4f272 Remove BRTWOWAY*
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.

llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner 8bf1c59e7f remove dead variable
llvm-svn: 26813
2006-03-16 23:52:08 +00:00
Evan Cheng c11fcceec5 A new entry.
llvm-svn: 26810
2006-03-16 22:44:22 +00:00
Nate Begeman fb0e36fa56 Notes on how to kill the eeevil brtwoway, and make ppc branch selector
more target independant, generate better code, and be less conservative.

llvm-svn: 26809
2006-03-16 22:37:48 +00:00
Chris Lattner 1e6dfa4c1f Strangely, calls clobber call-clobbered vector regs. Whodathoughtit?
llvm-svn: 26808
2006-03-16 22:35:59 +00:00
Chris Lattner 325bb46315 add a note
llvm-svn: 26807
2006-03-16 22:25:55 +00:00
Chris Lattner 91400bd413 teach the ppc backend how to spill/reload vector regs
llvm-svn: 26806
2006-03-16 22:24:02 +00:00
Chris Lattner 6e90062416 add callee saved vector regs
llvm-svn: 26805
2006-03-16 22:07:06 +00:00
Evan Cheng f75555feb9 Bug fix: condition inverted.
llvm-svn: 26804
2006-03-16 22:02:48 +00:00
Evan Cheng 20931a798e Added a way for TargetLowering to specify what values can be used as the
scale component of the target addressing mode.

llvm-svn: 26802
2006-03-16 21:47:42 +00:00
Chris Lattner 0b27047a6c in functions that use a lot of callee saved regs, this can be more than
5 instructions away.

llvm-svn: 26801
2006-03-16 21:31:45 +00:00
Chris Lattner fd9f3e8ed3 Add support for copying registers. still needed: spilling and reloading them
llvm-svn: 26800
2006-03-16 20:03:58 +00:00
Chris Lattner ad74844bfa set TransformToType correctly for vector types.
llvm-svn: 26797
2006-03-16 19:50:01 +00:00
Nate Begeman 32e73f9881 Another case we could do better on.
llvm-svn: 26795
2006-03-16 18:50:44 +00:00
Chris Lattner 1678a6c477 Save/restore VRSAVE once per function, not once per block.
llvm-svn: 26793
2006-03-16 18:25:23 +00:00
Chris Lattner 4b41e40621 add support for the bitconvert node
llvm-svn: 26789
2006-03-16 01:29:53 +00:00
Nate Begeman 2e1fde7c5c Update scheduling info for vrsave instruction
llvm-svn: 26776
2006-03-15 05:25:05 +00:00
Chris Lattner 5271a1f9b5 add a note
llvm-svn: 26762
2006-03-14 19:31:24 +00:00
Chris Lattner ab1ed2aa96 Fix an off by one error that caused PPC LLC failures last night.
llvm-svn: 26758
2006-03-14 17:56:49 +00:00
Chris Lattner 30402be175 transformation implemented
llvm-svn: 26754
2006-03-14 06:57:34 +00:00
Evan Cheng 0f9d6534f5 PPC LSR pass should use target lowering hooks.
llvm-svn: 26743
2006-03-13 23:56:51 +00:00
Evan Cheng 2dd2c652b2 Added getTargetLowering() to TargetMachine. Refactored targets to support this.
llvm-svn: 26742
2006-03-13 23:20:37 +00:00
Evan Cheng 60f495100a Update
llvm-svn: 26741
2006-03-13 23:19:10 +00:00
Evan Cheng af598d2461 Add LSR hooks.
llvm-svn: 26740
2006-03-13 23:18:16 +00:00
Chris Lattner 2b8eb375d7 Handle builtins that directly correspond to GCC builtins.
llvm-svn: 26737
2006-03-13 23:09:05 +00:00
Chris Lattner 02e2c18c9c For functions that use vector registers, save VRSAVE, mark used
registers, and update it on entry to each function, then restore it on exit.

This compiles:

void func(vfloat *a, vfloat *b, vfloat *c) {
        *a = *b * *c + *c;
}

to this:

_func:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        lvx v0, 0, r5
        lvx v1, 0, r4
        vmaddfp v0, v1, v0, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr

GCC produces this (which has additional stack accesses):

_func:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc000
        mtspr 256,r0
        lvx v0,0,r5
        lvx v1,0,r4
        lwz r12,-4(r1)
        vmaddfp v0,v0,v1,v0
        stvx v0,0,r3
        mtspr 256,r12
        blr

llvm-svn: 26733
2006-03-13 21:52:10 +00:00
Jim Laskey acb6e34277 Handle the removal of the debug chain.
llvm-svn: 26729
2006-03-13 13:07:37 +00:00
Chris Lattner fe4c7fb7ae remove two implemented items
llvm-svn: 26728
2006-03-13 06:52:22 +00:00
Chris Lattner 3d761b6211 I can't convince myself that this is safe, remove the recursive call.
llvm-svn: 26725
2006-03-13 06:42:16 +00:00
Chris Lattner ec9d0bc3ec Fix a couple of bugs that broke the alpha tester build
llvm-svn: 26722
2006-03-13 05:23:59 +00:00
Chris Lattner 4fbb612685 Handle cracked instructions in dispatch group formation.
llvm-svn: 26721
2006-03-13 05:20:04 +00:00
Chris Lattner 7579cfb1a0 Mark instructions that are cracked by the PPC970 decoder as such.
llvm-svn: 26720
2006-03-13 05:15:10 +00:00
Chris Lattner 51348c5f27 Several big changes:
1. Use flags on the instructions in the .td file to indicate the PPC970 unit
   type instead of a table in the .cpp file.  Much cleaner.
2. Change the hazard recognizer to build d-groups according to the actual
   algorithm used, not my flawed understanding of it.
3. Model "must be in the first slot" and "must be the only instr in a group"
   accurately.

llvm-svn: 26719
2006-03-12 09:13:49 +00:00
Chris Lattner d03132a409 blr is a branch too
llvm-svn: 26710
2006-03-11 21:49:49 +00:00
Chris Lattner 4e56b686f1 add an example
llvm-svn: 26709
2006-03-11 20:20:40 +00:00
Chris Lattner 003f633036 add a note
llvm-svn: 26708
2006-03-11 20:17:08 +00:00
Chris Lattner c2447e8b59 teach the JIT to encode vector registers
llvm-svn: 26697
2006-03-10 20:19:50 +00:00
Evan Cheng 306c13a8fb Add option -enable-x86-lsr to enable x86 loop strength reduction pass.
llvm-svn: 26665
2006-03-09 21:51:28 +00:00
Chris Lattner f136299635 add a note
llvm-svn: 26661
2006-03-09 20:13:21 +00:00
Andrew Lenharth 43e569c95f these are copies too
llvm-svn: 26653
2006-03-09 18:18:51 +00:00
Chris Lattner 7e7dccd3ab remove some now-dead code
llvm-svn: 26652
2006-03-09 18:07:49 +00:00
Andrew Lenharth 70236fc12f fcopysign for mixed mode
llvm-svn: 26651
2006-03-09 17:56:33 +00:00
Andrew Lenharth ebfd94fa1d relax fcopysign
llvm-svn: 26649
2006-03-09 17:47:22 +00:00
Andrew Lenharth 4a87e7d9a3 alpha and llvm have different oppinions on which arg is the sign bit
llvm-svn: 26647
2006-03-09 17:41:50 +00:00
Andrew Lenharth 16b96d2cb4 Alpha Scheduling classes
llvm-svn: 26643
2006-03-09 17:16:45 +00:00
Andrew Lenharth ed7a293b44 fcopysign and get rid of dsnode cruft. custom PA runtimes make this better in some senses
llvm-svn: 26641
2006-03-09 14:58:25 +00:00
Andrew Lenharth b8a06a7c6c fcopysign support
llvm-svn: 26640
2006-03-09 14:57:36 +00:00
Chris Lattner e363fdf318 Add support for 'special' llvm globals like debug info and static ctors/dtors.
llvm-svn: 26628
2006-03-09 06:14:35 +00:00
Chris Lattner 920e661e50 a couple of miscellaneous things.
llvm-svn: 26625
2006-03-09 01:39:46 +00:00
Jim Laskey 8f0a95f664 Add #line support for CBE.
llvm-svn: 26621
2006-03-08 19:31:15 +00:00
Duraid Madina 5005b01c20 doo de doo
llvm-svn: 26614
2006-03-08 06:18:46 +00:00
Chris Lattner 543832d39d Change the interface for getting a target HazardRecognizer to be more clean.
llvm-svn: 26608
2006-03-08 04:25:59 +00:00
Chris Lattner a8dd636192 add a note
llvm-svn: 26605
2006-03-08 00:25:47 +00:00
Evan Cheng 70b25efa57 X86ISD::REP_STOS and X86ISD::REP_MOVS now produces a flag.
llvm-svn: 26604
2006-03-07 23:34:23 +00:00
Evan Cheng adc7093fc1 Use rep/stosl; and Count 0x3; rep/stosb for memset with 4 byte aligned dest.
and variable value.
Similarly for memcpy.

llvm-svn: 26603
2006-03-07 23:29:39 +00:00
Chris Lattner 207291fd1a Two things:
1. Don't emit debug info, or other llvm.metadata to the .cbe.c file.
2. Mark static ctors/dtors as such, so that bugpoint works on C++ code
   compiled with the new CFE.

llvm-svn: 26602
2006-03-07 22:58:23 +00:00
Jim Laskey 313570fb17 Use "llvm.metadata" section for debug globals. Filter out these globals in the
asm printer.

llvm-svn: 26599
2006-03-07 22:00:35 +00:00
Chris Lattner 907e13c742 add another missing store.
llvm-svn: 26595
2006-03-07 16:26:48 +00:00
Chris Lattner 8c73d80b08 add a couple more load/store instrs, add a newline to the end of file.
llvm-svn: 26594
2006-03-07 16:19:46 +00:00
Nate Begeman 3e3219cc0a This kinda sorta implements "things that have to lead a dispatch group".
llvm-svn: 26591
2006-03-07 08:30:27 +00:00
Chris Lattner 675567f77c add some new instructions to the classifier. With this, we correctly insert
a nop into Freebench/neural, which speeds it up from 136->129s (~5.4%).

llvm-svn: 26590
2006-03-07 07:14:55 +00:00
Chris Lattner 05ad128dca add some comments that describe what we model
llvm-svn: 26588
2006-03-07 06:44:19 +00:00
Chris Lattner 2cab13573c Implement a very very simple hazard recognizer for LSU rejects and ctr set/read
flushes

llvm-svn: 26587
2006-03-07 06:32:48 +00:00
Chris Lattner 883cefc656 add a note
llvm-svn: 26585
2006-03-07 04:42:59 +00:00
Chris Lattner bccb0e07f0 add a note
llvm-svn: 26583
2006-03-07 02:46:26 +00:00
Evan Cheng a4a4ceb478 - Emit subsections_via_symbols for Darwin.
- Conditionalize Dwarf debugging output (Darwin only for now).

llvm-svn: 26582
2006-03-07 02:23:26 +00:00
Evan Cheng 30d7b70b73 Enable Dwarf debugging info.
llvm-svn: 26581
2006-03-07 02:02:57 +00:00
Chris Lattner ea79d9fd73 implement TII::insertNoop
llvm-svn: 26562
2006-03-05 23:49:55 +00:00
Chris Lattner 5032c32d30 add a note
llvm-svn: 26549
2006-03-05 20:00:08 +00:00
Chris Lattner c726a5c31f Do not fold (add (shl x, c1), (shl c2, c1)) -> (shl (add x, c2), c1),
we want to canonicalize the other way.

llvm-svn: 26547
2006-03-05 19:52:57 +00:00
Chris Lattner 9c7f50376a Copysign needs to be expanded everywhere. Note that Alpha and IA64 should
implement copysign as a native op if they have it.

llvm-svn: 26541
2006-03-05 05:08:37 +00:00
Chris Lattner c2dd7aae71 add a note for something evan noticed
llvm-svn: 26539
2006-03-05 01:15:18 +00:00
Chris Lattner 8d8b4cf63d Implemented.
llvm-svn: 26536
2006-03-04 23:33:44 +00:00
Chris Lattner c9a318d8fa Add a note
llvm-svn: 26523
2006-03-04 08:44:51 +00:00
Evan Cheng c66fd44541 Add an entry
llvm-svn: 26520
2006-03-04 07:49:50 +00:00
Evan Cheng 6dc73297c3 MEMSET / MEMCPY lowering bugs: we can't issue a single WORD / DWORD version of
rep/stos and rep/mov if the count is not a constant. We could do
  rep/stosl; and $count, 3; rep/stosb
For now, I will lower them to memset / memcpy calls. We will revisit this after
a little bit experiment.

Also need to take care of the trailing bytes even if the count is a constant.
Since the max. number of trailing bytes are 3, we will simply issue loads /
stores.

llvm-svn: 26517
2006-03-04 02:48:56 +00:00
Chris Lattner e43e5c0697 add a note
llvm-svn: 26513
2006-03-04 01:19:34 +00:00
Evan Cheng 084a102b17 Typo
llvm-svn: 26512
2006-03-04 01:12:00 +00:00
Evan Cheng a7fb285c60 Number of NodeTypes now exceeds 128.
llvm-svn: 26503
2006-03-03 06:58:59 +00:00
Chris Lattner b203355298 Split the valuetypes out of Target.td into ValueTypes.td
llvm-svn: 26490
2006-03-03 01:55:26 +00:00
Chris Lattner ad3c974a77 remove the read/write port/io intrinsics.
llvm-svn: 26479
2006-03-03 00:19:58 +00:00
Chris Lattner 9067500e2e add a note
llvm-svn: 26472
2006-03-02 22:34:38 +00:00
Chris Lattner 60a60f4b1e Implement CodeGen/PowerPC/or-addressing-mode.ll, which is also PR668.
llvm-svn: 26450
2006-03-01 07:14:48 +00:00
Chris Lattner 3cb349a068 add a note
llvm-svn: 26448
2006-03-01 06:36:20 +00:00
Chris Lattner 27f5345b1f Compile this:
void foo(float a, int *b) { *b = a; }

to this:

_foo:
        fctiwz f0, f1
        stfiwx f0, 0, r4
        blr

instead of this:

_foo:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        stw r2, 0(r4)
        blr

This implements CodeGen/PowerPC/stfiwx.ll, and also incidentally does the
right thing for GCC bugzilla 26505.

llvm-svn: 26447
2006-03-01 05:50:56 +00:00
Chris Lattner f418435819 Use a target-specific dag-combine to implement CodeGen/PowerPC/fp-int-fp.ll.
llvm-svn: 26445
2006-03-01 04:57:39 +00:00
Chris Lattner 4a2eeea671 Add interfaces for targets to provide target-specific dag combiner optimizations.
llvm-svn: 26442
2006-03-01 04:52:55 +00:00
Evan Cheng 1926427351 Vector op lowering.
llvm-svn: 26438
2006-03-01 01:11:20 +00:00
Evan Cheng 91c574b642 New type v2f32.
llvm-svn: 26435
2006-03-01 01:06:22 +00:00
Evan Cheng 0e69f45b07 Another entry.
llvm-svn: 26430
2006-02-28 23:38:49 +00:00
Evan Cheng 990c3602bd Don't match x << 1 to LEAL. It's better to emit x + x.
llvm-svn: 26429
2006-02-28 21:13:57 +00:00
Chris Lattner b9f35f06bc Add a subtarget feature for the stfiwx instruction. I know the G5 has it,
but I don't know what other PPC impls do.  If someone could update the proc
table, I would appreciate it :)

llvm-svn: 26421
2006-02-28 07:08:22 +00:00
Chris Lattner 872810da6c remove implemented item
llvm-svn: 26418
2006-02-28 06:36:04 +00:00
Nate Begeman f918ed2e33 readme updates
llvm-svn: 26405
2006-02-27 22:08:36 +00:00
Chris Lattner ec185f7843 Don't print constant initializers, they may span lines now.
llvm-svn: 26403
2006-02-27 20:09:23 +00:00
Jim Laskey 8f2c1021b4 Removed dependency on how operands are printed (want multi-line.)
llvm-svn: 26399
2006-02-27 10:29:04 +00:00
Chris Lattner ab8164042a Implement bit propagation through sub nodes, this (re)implements
PowerPC/div-2.ll

llvm-svn: 26392
2006-02-27 01:00:42 +00:00
Chris Lattner a60751dd43 Check RHS simplification before LHS simplification to avoid infinitely looping
on PowerPC/small-arguments.ll

llvm-svn: 26389
2006-02-27 00:36:27 +00:00
Chris Lattner 27220f8958 Just like we use the RHS of an AND to simplify the LHS, use the LHS to
simplify the RHS.  This allows for the elimination of many thousands of
ands from multisource, and compiles CodeGen/PowerPC/and-elim.ll:test2
into this:

_test2:
        srwi r2, r3, 1
        xori r3, r2, 40961
        blr

instead of this:

_test2:
        rlwinm r2, r3, 31, 17, 31
        xori r2, r2, 40961
        rlwinm r3, r2, 0, 16, 31
        blr

llvm-svn: 26388
2006-02-27 00:22:28 +00:00
Chris Lattner 118ddba929 Add a bunch of missed cases. Perhaps the most significant of which is that
assertzext produces zero bits.

llvm-svn: 26386
2006-02-26 23:36:02 +00:00
Evan Cheng 877ab55e06 ConstantPoolIndex is now the displacement portion of the address (rather
than base).

llvm-svn: 26382
2006-02-26 09:12:34 +00:00
Evan Cheng 75b8783aaf Fixed ConstantPoolIndex operand asm print bug. This fixed 2005-07-17-INT-To-FP
and 2005-05-12-Int64ToFP.

llvm-svn: 26380
2006-02-26 08:28:12 +00:00
Evan Cheng 77d86ff8fc * Cleaned up addressing mode matching code.
* Cleaned up and tweaked LEA cost analysis code. Removed some hacks.
* Handle ADD $X, c to MOV32ri $X+c. These patterns cannot be autogen'd and
  they need to be matched before LEA.

llvm-svn: 26376
2006-02-25 10:09:08 +00:00
Evan Cheng 1c557bfeb5 Updates.
llvm-svn: 26375
2006-02-25 10:04:07 +00:00
Evan Cheng 1fac3b3360 * Allow mul, shl nodes to be codegen'd as LEA (if appropriate).
* Add patterns to handle GlobalAddress, ConstantPool, etc.
  MOV32ri to materialize these nodes in registers.
  ADD32ri to handle %reg + GA, etc.
  MOV32mi to handle store GA, etc. to memory.

llvm-svn: 26374
2006-02-25 10:02:21 +00:00
Evan Cheng e4a8b74e4f ConstantPoolIndex is now the displacement field of addressing mode.
llvm-svn: 26373
2006-02-25 09:56:50 +00:00
Evan Cheng 994700101e Added a common about the need for X86ISD::Wrapper.
llvm-svn: 26372
2006-02-25 09:55:19 +00:00
Evan Cheng ed169db8a5 Added an offset field to ConstantPoolSDNode.
llvm-svn: 26371
2006-02-25 09:54:52 +00:00
Evan Cheng 42d5ac557c Fix an obvious bug exposed when we are doing
ADD X, 4
==>
MOV32ri $X+4, ...

llvm-svn: 26366
2006-02-25 01:37:02 +00:00
Chris Lattner 7674d90fa1 Add memory printing support for PPC. Input memory operands now work with
inline asms! :)

llvm-svn: 26365
2006-02-24 20:27:40 +00:00
Chris Lattner a1ec1ddd59 Implement selection of inline asm memory operands
llvm-svn: 26348
2006-02-24 02:13:12 +00:00
Chris Lattner 2a9e1e3e74 Recognize memory operand codes
llvm-svn: 26345
2006-02-24 01:10:46 +00:00
Evan Cheng 0ed48fe601 PPC JIT relocation model should be DynamicNoPIC.
llvm-svn: 26338
2006-02-23 22:18:07 +00:00
Evan Cheng e0ed6ec13f - Clean up the lowering and selection code of ConstantPool, GlobalAddress,
and ExternalSymbol.
- Use C++ code (rather than tblgen'd selection code) to match the above
  mentioned leaf nodes. Do not mutate and nodes and do not record the
  selection in CodeGenMap. These nodes should be safe to duplicate. This is
  a performance win.

llvm-svn: 26335
2006-02-23 20:41:18 +00:00
Chris Lattner 1bad2546d0 Implement the PPC inline asm "L" modifier. This allows us to compile:
long long test(long long X) {
  __asm__("foo %0 %L0 %1 %L1" : "=r"(X): "r"(X));
  return X;
}

to:
        foo r2 r3 r2 r3

llvm-svn: 26333
2006-02-23 19:31:10 +00:00
Chris Lattner 16f08f53b1 "." isn't enough to get a private label on linux, use ".L".
llvm-svn: 26327
2006-02-23 05:25:02 +00:00
Chris Lattner 2bacf981bf add a small and simple case.
llvm-svn: 26326
2006-02-23 05:17:43 +00:00
Evan Cheng f4448cee66 A couple of new entries.
llvm-svn: 26325
2006-02-23 02:50:21 +00:00
Evan Cheng 1f342c2884 PIC related bug fixes.
1. Various asm printer bug.
2. Lowering bug. Now TargetGlobalAddress is wrapped in X86ISD::TGAWrapper.

llvm-svn: 26324
2006-02-23 02:43:52 +00:00
Evan Cheng 7eabbfd618 X86 codegen tweak to use lea in another case:
Suppose base == %eax and it has multiple uses, then instead of
  movl %eax, %ecx
  addl $8, %ecx
use
  leal 8(%eax), %ecx.

llvm-svn: 26323
2006-02-23 00:13:58 +00:00
Evan Cheng 7714a59d91 Missing .globl for weak / link-once .text symbols.
llvm-svn: 26321
2006-02-22 23:59:57 +00:00
Chris Lattner 2e124af406 Don't return registers from register classes that aren't legal.
llvm-svn: 26317
2006-02-22 23:00:51 +00:00
Evan Cheng 73136dfecc - Added option -relocation-model to set relocation model. Valid values include static, pic,
dynamic-no-pic, and default.
PPC and x86 default is dynamic-no-pic for Darwin, pic for others.
- Removed options -enable-pic and -ppc-static.

llvm-svn: 26315
2006-02-22 20:19:42 +00:00
Jim Laskey 2fa33a989d Coordinate activities with llvm-gcc4 and dwarf.
llvm-svn: 26314
2006-02-22 19:02:11 +00:00
Evan Cheng 9e252e3bcf Added MMX, SSE1, and SSE2 vector instructions and some simple patterns.
Fixed some existing bugs (wrong predicates, prefixes) at the same time.

llvm-svn: 26310
2006-02-22 02:26:30 +00:00
Chris Lattner 7ad77dfc2a split register class handling from explicit physreg handling.
llvm-svn: 26308
2006-02-22 00:56:39 +00:00
Chris Lattner 7bb4696dc3 Updates to match change of getRegForInlineAsmConstraint prototype
llvm-svn: 26305
2006-02-21 23:11:00 +00:00
Evan Cheng d58478161f One more round of reorg so sabre doesn't freak out. :-)
llvm-svn: 26303
2006-02-21 20:00:20 +00:00
Evan Cheng 6fc1162855 A big more cleaning up.
llvm-svn: 26302
2006-02-21 19:30:30 +00:00
Evan Cheng 8711b6bff3 Moving things to their proper places.
llvm-svn: 26301
2006-02-21 19:26:52 +00:00