Commit Graph

694 Commits

Author SHA1 Message Date
Craig Topper fe82eb6bcd Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to target-independent ISD nodes and use the existing patterns for those.
llvm-svn: 171237
2012-12-29 18:18:20 +00:00
Craig Topper 6b27251a76 Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min instructions. Lower them to target specific nodes and use those patterns instead. This also allows them to be commuted if UnsafeFPMath is enabled.
llvm-svn: 171227
2012-12-29 16:44:25 +00:00
Craig Topper 81d1e596bb Remove alignment from a bunch more VEX encoded operations in the folding tables.
llvm-svn: 171082
2012-12-26 02:44:47 +00:00
Craig Topper b2922164f0 Remove alignment from folding table for VMOVUPD as an unaligned instruction it shouldn't require alignment...
llvm-svn: 171081
2012-12-26 02:14:19 +00:00
Craig Topper d09a9af9b6 Remove alignment requirements from (V)EXTRACTPS. This instruction does 32-bit stores which aren't required to be aligned on SSE or AVX.
llvm-svn: 171080
2012-12-26 01:47:12 +00:00
Craig Topper caef1c5d86 Remove alignment requirement from VCVTSS2SD in folding tables. Reverting r171049. This instruction doesn't require alignment.
llvm-svn: 171078
2012-12-26 00:35:47 +00:00
Nadav Rotem 00410ae625 VCVTSS2SD requires a strict alignment. Thanks Elena.
llvm-svn: 171049
2012-12-25 03:29:18 +00:00
Nadav Rotem dc0ad92b64 Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned.
When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding
tables and removes the alignment restrictions from VEX-encoded instructions.

llvm-svn: 171024
2012-12-24 09:40:33 +00:00
Nadav Rotem d5aae980cb In some cases, due to scheduling constraints we copy the EFLAGS.
The only way to read the eflags is using push and pop. If we don't
adjust the stack then we run over the first frame index. This is
not something that we want to do, so we have to make sure that
our machine function does not copy the flags. If it does then
we have to emit the prolog that adjusts the stack.

rdar://12896831

llvm-svn: 170961
2012-12-21 23:48:49 +00:00
Benjamin Kramer 4669d18893 X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics
This is very mechanical, no functionality change. Preparation for PR14667.

llvm-svn: 170898
2012-12-21 14:04:55 +00:00
Jakob Stoklund Olesen b159b5ff0d Remove the explicit MachineInstrBuilder(MI) constructor.
Use the version that also takes an MF reference instead.

It would technically be possible to extract an MF reference from the MI
as MI->getParent()->getParent(), but that would not work for MIs that
are not inserted into any basic block.

Given the reasonably small number of places this constructor was used at
all, I preferred the compile time check to a run time assertion.

llvm-svn: 170588
2012-12-19 21:31:56 +00:00
Bill Wendling 3d7b0b8ac7 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
llvm-svn: 170502
2012-12-19 07:18:57 +00:00
Craig Topper f3ff6ae066 Simplify BMI ANDN matching to use patterns instead of a DAG combine. Also add ANDN to isDefConvertible.
llvm-svn: 170305
2012-12-17 05:12:30 +00:00
Craig Topper f924a58af1 Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt.
llvm-svn: 170304
2012-12-17 05:02:29 +00:00
Craig Topper 5b08cf7736 Remove store forms of DEC/INC from isDefConvertible. Since they are stores they don't have a register def.
llvm-svn: 170303
2012-12-17 04:55:07 +00:00
Craig Topper 922f10aec4 Mark MOVDQ(A/U)rm as ReMaterializable. Mark all MOVDQ(A/U) instructions as neverHasSideEffects.
llvm-svn: 169477
2012-12-06 06:49:16 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Jakob Stoklund Olesen 9de596e650 Remove all references to TargetInstrInfoImpl.
This class has been merged into its super-class TargetInstrInfo.

llvm-svn: 168760
2012-11-28 02:35:17 +00:00
Manman Ren 5b4628201f X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions
when the destination register is wider than the memory load.

These load instructions load from m32 or m64 and set the upper bits to zero,
while the folded instructions may accept m128.

rdar://12721174

llvm-svn: 168710
2012-11-27 18:09:26 +00:00
Craig Topper 3b530ea605 Remove alignments from folding tables for scalar FMA4 instructions.
llvm-svn: 167366
2012-11-04 04:40:08 +00:00
Craig Topper 8cd3b07a51 Add scalar forms of FMA4 VFNMSUB/VFNMADD to folding tables. Patch from Cameron McInally.
llvm-svn: 167106
2012-10-31 04:59:46 +00:00
Bill Wendling c9b22d735a Create enums for the different attributes.
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.

llvm-svn: 165488
2012-10-09 07:45:08 +00:00
Craig Topper 9384902ef1 Move expansion of SETB_C(8/16/32/64)r from MCInstLower to ExpandPostRAPseudos and mark them as pseudos in the td file.
llvm-svn: 165302
2012-10-05 06:05:15 +00:00
Sylvestre Ledru 91ce36c986 Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
llvm-svn: 164768
2012-09-27 10:14:43 +00:00
Sylvestre Ledru 721cffd53a Fix a typo 'iff' => 'if'
llvm-svn: 164767
2012-09-27 09:59:43 +00:00
Bill Wendling 863bab689a Remove the `hasFnAttr' method from Function.
The hasFnAttr method has been replaced by querying the Attributes explicitly. No
intended functionality change.

llvm-svn: 164725
2012-09-26 21:48:26 +00:00
Michael Liao 2b425e1e24 Add SARX/SHRX/SHLX code generation support
llvm-svn: 164675
2012-09-26 08:26:25 +00:00
Michael Liao 2de86af22d Add RORX code generation support
llvm-svn: 164674
2012-09-26 08:24:51 +00:00
Michael Liao f9f7b5518a Add MULX code generation support
llvm-svn: 164673
2012-09-26 08:22:37 +00:00
Michael Liao 3237662b65 Re-work X86 code generation of atomic ops with spin-loop
- Rewrite/merge pseudo-atomic instruction emitters to address the
  following issue:
  * Reduce one unnecessary load in spin-loop

    previously the spin-loop looks like

        thisMBB:
        newMBB:
          ld  t1 = [bitinstr.addr]
          op  t2 = t1, [bitinstr.val]
          not t3 = t2  (if Invert)
          mov EAX = t1
          lcs dest = [bitinstr.addr], t3  [EAX is implicit]
          bz  newMBB
          fallthrough -->nextMBB

    the 'ld' at the beginning of newMBB should be lift out of the loop
    as lcs (or CMPXCHG on x86) will load the current memory value into
    EAX. This loop is refined as:

        thisMBB:
          EAX = LOAD [MI.addr]
        mainMBB:
          t1 = OP [MI.val], EAX
          LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
          JNE mainMBB
        sinkMBB:

  * Remove immopc as, so far, all pseudo-atomic instructions has
    all-register form only, there is no immedidate operand.

  * Remove unnecessary attributes/modifiers in pseudo-atomic instruction
    td

  * Fix issues in PR13458

- Add comprehensive tests on atomic ops on various data types.
  NOTE: Some of them are turned off due to missing functionality.

- Revise tests due to the new spin-loop generated.

llvm-svn: 164281
2012-09-20 03:06:15 +00:00
Jan Wen Voung 4ce1d7b4f1 Add some cases to x86 OptimizeCompare to handle DEC and INC, too.
While we are setting the earlier def to true, also make it live.

llvm-svn: 164056
2012-09-17 22:04:23 +00:00
Craig Topper 908e685102 Mark FMA4 instructions as commutable and add them to the folding tables.
llvm-svn: 163035
2012-08-31 23:10:34 +00:00
Craig Topper 7573c8f081 Add selection of RegOp2MemOpTable3 to canFoldMemoryOperand
llvm-svn: 163029
2012-08-31 22:12:16 +00:00
Craig Topper 72f51c3986 Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos.
llvm-svn: 162740
2012-08-28 07:30:47 +00:00
Craig Topper bd509eea4a Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo.
llvm-svn: 162738
2012-08-28 07:05:28 +00:00
Jakob Stoklund Olesen 7030427623 Preserve operand flags in convertToThreeAddress() by copying operands.
No test case, this is a generalization of r160260.

llvm-svn: 162485
2012-08-23 22:36:31 +00:00
Craig Topper f911597494 Use a switch statement instead of a bunch of if-else checks and pull out the common function call.
llvm-svn: 162428
2012-08-23 04:57:36 +00:00
Craig Topper bab0c76674 Fix up indentation and remove a couple else's after returns.
llvm-svn: 162270
2012-08-21 08:29:51 +00:00
Craig Topper bfcfdeb563 Use uint16_t for tables of opcodes.
llvm-svn: 162267
2012-08-21 08:23:21 +00:00
Craig Topper a0cabf19f8 Fix up indentation. No functional change.
llvm-svn: 162264
2012-08-21 08:17:07 +00:00
Craig Topper 4bc3e5a1bf Add a couple llvm_unreachables. Add a message to several others.
llvm-svn: 162263
2012-08-21 08:16:16 +00:00
Craig Topper 653e759046 Replace a break with llvm_unreachable in the default case of a nested switch. Condense code a bit. No functional change.
llvm-svn: 162261
2012-08-21 07:32:16 +00:00
Craig Topper b58eec4eaf Remove FMA3 intrinsic instructions in favor of patterns.
llvm-svn: 162194
2012-08-20 06:21:25 +00:00
Manman Ren 959acb106b X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.

PR13576

llvm-svn: 161769
2012-08-13 18:29:41 +00:00
Manman Ren 1be131ba27 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled
by peephole now.

rdar://11873276

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen 3b9a442841 Don't scan physreg use-def chains looking for a PIC base.
We can't rematerialize a PIC base after register allocation anyway, and
scanning physreg use-def chains is very expensive in a function with
many calls.

<rdar://problem/12047515>

llvm-svn: 161461
2012-08-08 00:40:47 +00:00
Manman Ren 5759d01230 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Elena Demikhovsky 3cb3b0045c Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Manman Ren f87dd7c01b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Manman Ren 0fa3ab88ba X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Manman Ren 32367c063b X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.

llvm-svn: 160912
2012-07-28 03:15:46 +00:00
Manman Ren d0a4ee8427 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129

llvm-svn: 160454
2012-07-18 21:40:01 +00:00
Nadav Rotem 4968e45b9f Fix a bug in the 3-address conversion of LEA when one of the operands is an
undef virtual register. The problem is that ProcessImplicitDefs removes the
definition of the register and marks all uses as undef. If we lose the undef
marker then we get a register which has no def, is not marked as undef. The
live interval analysis does not collect information for these virtual
registers and we crash in later passes.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160260
2012-07-16 10:52:25 +00:00
Nadav Rotem ee3552f88d Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.

PR12782.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160230
2012-07-15 12:26:30 +00:00
Benjamin Kramer abbfe69356 Make helper functions static.
llvm-svn: 160173
2012-07-13 13:25:15 +00:00
Manman Ren 1553ce0e81 X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair.
When Movr0 is between sub and cmp, we move Movr0 before sub if it enables
removal of Cmp.

llvm-svn: 160066
2012-07-11 19:35:12 +00:00
Manman Ren 5f6fa428fa X86: implement functions to analyze & synthesize CMOV|SET|Jcc
getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond

No functional change intended.
If we want to update the condition code of CMOV|SET|Jcc, we first analyze the
opcode to get the condition code, then update the condition code, finally
synthesize the new opcode form the new condition code.

llvm-svn: 159955
2012-07-09 18:57:12 +00:00
Manman Ren bb36074047 X86: Fix optimizeCompare to correctly check safe condition.
It is safe if EFLAGS is killed or re-defined.
When we are done with the basic block, check whether EFLAGS is live-out.
Do not optimize away cmp if EFLAGS is live-out.

llvm-svn: 159888
2012-07-07 03:34:46 +00:00
Manman Ren c965673707 X86: peephole optimization to remove cmp instruction
For each Cmp, we check whether there is an earlier Sub which make Cmp
redundant. We handle the case where SUB operates on the same source operands as
Cmp, including the case where the two source operands are swapped.

llvm-svn: 159838
2012-07-06 17:36:20 +00:00
Jakob Stoklund Olesen 49e4d4b3ef Add early if-conversion support to X86.
Implement the TII hooks needed by EarlyIfConversion to create cmov
instructions and estimate their latency.

Early if-conversion is still not enabled by default.

llvm-svn: 159695
2012-07-04 00:09:58 +00:00
Craig Topper b6eb513c68 Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types.
llvm-svn: 159126
2012-06-25 06:16:00 +00:00
Craig Topper fd5e6e7db1 Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
llvm-svn: 159109
2012-06-24 07:07:16 +00:00
Craig Topper b925230fb1 Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
llvm-svn: 159108
2012-06-24 06:55:37 +00:00
Craig Topper f48ec7a708 Fix build failures from r159106.
llvm-svn: 159107
2012-06-24 06:08:31 +00:00
Craig Topper 3cee08ce7d Remove intrinsic specific instructions for CVTPD2DQ. Replace with patterns.
llvm-svn: 159105
2012-06-24 05:33:24 +00:00
Craig Topper a899cc15f1 Remove intrinsic specific instructions for (V)CVTDQ2PS. Use a Pat instead instead.
llvm-svn: 159090
2012-06-23 22:33:14 +00:00
Craig Topper 1cac50bc5e Compress flags in X86 op folding to reduce space in static tables.
llvm-svn: 159073
2012-06-23 08:01:18 +00:00
Craig Topper 431f1e7192 Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with intrinsic patterns. Mem forms omitted because the load size is only 64-bits.
llvm-svn: 159070
2012-06-23 04:23:36 +00:00
Craig Topper 11913052d6 Move AVX version of convert instructions that write to GPRs to the Op1 table.
llvm-svn: 158497
2012-06-15 07:02:58 +00:00
Pete Cooper 8bbce768d8 Move X86::VCVTTSD2SIrr from the 2 operand to 1 operand MemRegOp table.
Can someone with more knowledge of this please look at other entries
to see if others need moved.

llvm-svn: 158474
2012-06-14 22:12:58 +00:00
Manman Ren 9c9641812c Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Craig Topper c6ac4cefcc Add intrinsic forms for FMA instructions to opcode folding tables.
llvm-svn: 157917
2012-06-04 07:46:16 +00:00
Craig Topper 3cb143016d Add VFMADDSUB and VFMSUBADD FMA instructions to folding tables. Also add 213 forms of scalar FMA instructions.
llvm-svn: 157914
2012-06-04 07:08:21 +00:00
Manman Ren 5097e4f38a Revert r157831
llvm-svn: 157896
2012-06-03 03:14:24 +00:00
Manman Ren 879ca9d47d X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.

llvm-svn: 157831
2012-06-01 19:49:33 +00:00
Hans Wennborg 789acfb63d Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

llvm-svn: 157818
2012-06-01 16:27:21 +00:00
Craig Topper 2e127b5274 Add VFNSUB* instructions to folding table.
llvm-svn: 157802
2012-06-01 05:48:39 +00:00
Manman Ren 9bccb64e56 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
2012-05-31 17:20:29 +00:00
Elena Demikhovsky 602f3a26d6 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.

llvm-svn: 157737
2012-05-31 09:20:20 +00:00
Jakob Stoklund Olesen 38dcd598f9 Make the global base reg GR32_NOSP.
It can sometimes be used in addressing modes that don't support %ESP.

llvm-svn: 157165
2012-05-20 18:43:00 +00:00
Jakob Stoklund Olesen 3c52f0281f Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).

So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.

Patch by Yiannis Tsiouris!

llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Craig Topper abadc660e0 Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155186
2012-04-20 06:31:50 +00:00
Elena Demikhovsky 779a72b49e Added VPERM optimization for AVX2 shuffles
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
Craig Topper b25fda95f6 Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations.
llvm-svn: 152997
2012-03-17 18:46:09 +00:00
Craig Topper 2dac962864 Use uint16_t to store opcodes in static tables in X86 backend.
llvm-svn: 152391
2012-03-09 07:45:21 +00:00
Craig Topper 760b134ffa Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.
llvm-svn: 151134
2012-02-22 05:59:10 +00:00
Jia Liu b22310fda6 Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Jakob Stoklund Olesen 97e3115dc2 Use the same CALL instructions for Windows as for everything else.
The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.

llvm-svn: 150708
2012-02-16 17:56:02 +00:00
Jakob Stoklund Olesen 4519fd0b21 Handle register masks when searching for EFLAGS clobbers.
Calls clobber the flags, but when using register masks there is no
EFLAGS<imp-def> operand.

llvm-svn: 150117
2012-02-09 00:17:22 +00:00
Craig Topper 7834900950 Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns.
llvm-svn: 148933
2012-01-25 06:43:11 +00:00
Craig Topper ce4f9c5668 Custom lower phadd and phsub intrinsics to target specific nodes. Remove the patterns that are no longer necessary.
llvm-svn: 148927
2012-01-25 05:37:32 +00:00
David Blaikie 46a9f016c5 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Craig Topper a875b7ccc7 Folding table additions and fixes for AVX.
llvm-svn: 148467
2012-01-19 08:50:38 +00:00
Craig Topper d78429f850 Add a bunch of AVX instructions to the folding tables. Also fixed the alignment on 256-bit AVX2 instructions.
llvm-svn: 148194
2012-01-14 18:14:53 +00:00
Craig Topper e52d86a740 Convert SHUFPD with the same register for both sources to PSHUFD if it would prevent a register copy. Similar to SHUFPS, but requires the mask to be converted.
llvm-svn: 148112
2012-01-13 09:21:41 +00:00
Craig Topper cb7e13d7c0 Make X86 instruction selection use 256-bit VPXOR for build_vector of all ones if AVX2 is enabled. This gives the ExeDepsFix pass a chance to choose FP vs int as appropriate. Also use v8i32 as the type for getZeroVector if AVX2 is enabled. This is consistent with SSE2 using prefering v4i32.
llvm-svn: 148108
2012-01-13 08:12:35 +00:00
Craig Topper a4c5a47b97 Use 8i32 constant pool entry for converting AVX2_SETALLONES. Possibly fixes PR11750.
llvm-svn: 148101
2012-01-13 06:12:41 +00:00
Evan Cheng 7fae11b231 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.

llvm-svn: 146542
2011-12-14 02:11:42 +00:00
Benjamin Kramer 2dc5dec41d X86: Split (v)rounds[sd] into a normal and an intrinsic version.
llvm-svn: 146256
2011-12-09 15:43:55 +00:00