Commit Graph

17428 Commits

Author SHA1 Message Date
Johnny Chen 221a014ea3 It used to be that t_addrmode_s4 was used for both:
o A8.6.195 STR (register) -- Encoding T1
o A8.6.193 STR (immediate, Thumb) -- Encoding T1

It has been changed so that now they use different addressing modes
and thus different MC representation (Operand Infos).  Modify the
disassembler to reflect the change, and add relevant tests.

llvm-svn: 127833
2011-03-17 22:04:05 +00:00
Richard Osborne 6120962d7d Add XCore intrinsic for setpsc.
llvm-svn: 127821
2011-03-17 18:42:05 +00:00
Cameron Zwarich 2ef0c69df1 Move more logic into getTypeForExtArgOrReturn.
llvm-svn: 127809
2011-03-17 14:53:37 +00:00
Cameron Zwarich 34e7b3f77e Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn().
llvm-svn: 127807
2011-03-17 14:21:56 +00:00
Nick Lewycky 881e1871dd Add "swi" which is an obsolete mnemonic for "svc".
llvm-svn: 127788
2011-03-17 01:46:14 +00:00
Eli Friedman e8f2be0c10 A couple new README entries.
llvm-svn: 127786
2011-03-17 01:22:09 +00:00
Cameron Zwarich ac106273d4 The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.

llvm-svn: 127766
2011-03-16 22:20:18 +00:00
Richard Osborne c871eff3f5 Add XCore intrinsics for setclk, setrdy.
llvm-svn: 127761
2011-03-16 21:56:00 +00:00
Richard Osborne d4346f2388 Add checkevent intrinsic to check if any resources owned by the current thread
can event.

llvm-svn: 127741
2011-03-16 18:34:00 +00:00
Johnny Chen a4c3154fca There were two issues fixed:
1. The ARM Darwin *r9 call instructions were pseudo-ized recently.
   Modify the ARMDisassemblerCore.cpp file to accomodate the change.

2. The disassembler was unnecessarily adding 8 to the sign-extended imm24:

   imm32 = SignExtend(imm24:'00', 32); // A8.6.23 BL, BLX (immediate)
                                       // Encoding A1

   It has no business doing such.  Removed the offending logic.

Add test cases to arm-tests.txt.

llvm-svn: 127707
2011-03-15 22:27:33 +00:00
Bill Wendling 865f8b592a The VTBL (and VTBX) instructions are rather permissive concerning the masks they
accept. If a value in the mask is out of range, it uses the value 0, for VTBL,
or leaves the value unchanged, for VTBX.

llvm-svn: 127700
2011-03-15 21:15:20 +00:00
Bill Wendling ebecb33307 Some minor cleanups based on feedback.
llvm-svn: 127694
2011-03-15 20:47:26 +00:00
Evan Cheng 42401d6af2 Do not form thumb2 ldrd / strd if the offset is by multiple of 4. rdar://9133587
llvm-svn: 127683
2011-03-15 18:41:52 +00:00
Richard Osborne 024932fc77 Don't indent cases in a switch, no functionality change.
llvm-svn: 127681
2011-03-15 15:55:30 +00:00
Richard Osborne 5f1a26ea39 On the XCore the scavenging slot should be closest to the SP.
llvm-svn: 127680
2011-03-15 15:10:11 +00:00
Richard Osborne 3a68eb150b Add XCore intrinsics for getps, setps, setsr and clrsr.
llvm-svn: 127678
2011-03-15 13:45:47 +00:00
Justin Holewinski 94751fbf32 PTX: Set PTX 2.0 as the minimum supported version
- Remove PTX 1.4 code generation
- Change type of intrinsics to .v4.i32 instead of .v4.i16
- Add and/or/xor integer instructions

llvm-svn: 127677
2011-03-15 13:24:15 +00:00
Duncan Sands 7921ac0975 Avoid a compiler warning about reg possibly being used uninitialized
when building with assertions disabled.

llvm-svn: 127675
2011-03-15 08:41:24 +00:00
Sean Callanan b60b0bc47e Enabled disassembler support for AVX instructions
in the instruction tables and fixed a few bugs that
were causing decode conflicts.  Rudimentary tests
are coming up in the next patch.

llvm-svn: 127646
2011-03-15 01:28:15 +00:00
Sean Callanan c3fd523731 X86 table-generator and disassembler support for the AVX
instruction set.  This code adds support for the VEX prefix
and for the YMM registers accessible on AVX-enabled
architectures.  Instruction table support that enables AVX
instructions for the disassembler is in an upcoming patch.

llvm-svn: 127644
2011-03-15 01:23:15 +00:00
Johnny Chen 7a2873dfbe Fixed an ARM disassembler bug where it does not handle STRi12 correctly because an extra
register operand was erroneously added.  Remove an incorrect assert which triggers the bug.

rdar://problem/9131529

llvm-svn: 127642
2011-03-15 01:13:17 +00:00
Jim Grosbach 3af6fe66b9 Clean up ARM tail calls a bit. They're pseudo-instructions for normal branches.
Also more cleanly separate the ARM vs. Thumb functionality. Previously, the
encoding would be incorrect for some Thumb instructions (the indirect calls).

llvm-svn: 127637
2011-03-15 00:30:40 +00:00
Bill Wendling e1fd78f2bc Generate a VTBL instruction instead of a series of loads and stores when we
can. As Nate pointed out, VTBL isn't super performant, but it *has* to be better
than this:

_shuf:
@ BB#0:       @ %entry
  push        {r4, r7, lr}
  add         r7, sp, #4
  sub         sp, #12
  mov         r4, sp
  bic         r4, r4, #7
  mov         sp, r4
  mov         r2, sp
  vmov        d16, r0, r1
  orr         r0, r2, #6
  orr         r3, r2, #7
  vst1.8      {d16[0]}, [r3]
  vst1.8      {d16[5]}, [r0]
  subs        r4, r7, #4
  orr         r0, r2, #5
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #4
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #3
  vst1.8      {d16[0]}, [r0]
  orr         r0, r2, #2
  vst1.8      {d16[2]}, [r0]
  orr         r0, r2, #1
  vst1.8      {d16[1]}, [r0]
  vst1.8      {d16[3]}, [r2]
  vldr.64     d16, [sp]
  vmov        r0, r1, d16
  mov         sp, r4
  pop         {r4, r7, pc}

The "illegal" testcase in vext.ll is no longer illegal.
<rdar://problem/9078775>

llvm-svn: 127630
2011-03-14 23:02:38 +00:00
Jim Grosbach c5efcbad71 Remove some dead patterns.
llvm-svn: 127601
2011-03-14 18:34:35 +00:00
Evan Cheng 383ecd873b Indentation.
llvm-svn: 127595
2011-03-14 18:02:30 +00:00
Justin Holewinski fbc8d301bf PTX: Emit global arrays with proper sizes
- Emit all arrays as type .b8 and proper sizes in bytes to conform
  to the output of nvcc

llvm-svn: 127584
2011-03-14 15:40:11 +00:00
Justin Holewinski 8509380f83 PTX: Add support for sqrt/sin/cos intrinsics
llvm-svn: 127578
2011-03-14 14:09:33 +00:00
Che-Liang Chiou a19f075974 ptx: add set.p instruction and related changes to predicate execution
llvm-svn: 127577
2011-03-14 11:26:01 +00:00
Che-Liang Chiou 58bae0e957 ptx: add basic support of predicate execution
llvm-svn: 127569
2011-03-13 17:26:00 +00:00
Eric Christopher 174d872702 Sometimes isPredicable lies to us and tells us we don't need the operands.
Go ahead and add them on when we might want to use them and let
later passes remove them.

Fixes rdar://9118569

llvm-svn: 127518
2011-03-12 01:09:29 +00:00
Jim Grosbach 965fe994c2 Add FIXME.
llvm-svn: 127516
2011-03-12 00:51:00 +00:00
Jim Grosbach 3f2096eafe Pseudo-ize the ARM Darwin *r9 call instruction definitions. They're the same
actual instruction as the non-Darwin defs, but have different call-clobber
semantics and so need separate patterns. They don't need to duplicate the
encoding information, however.

llvm-svn: 127515
2011-03-12 00:45:26 +00:00
Jim Grosbach b7c6e8f575 Add a FIXME.
llvm-svn: 127511
2011-03-11 23:25:21 +00:00
Jim Grosbach f026d9ed53 Pseudo-ize the ARM 'B' instruction.
llvm-svn: 127510
2011-03-11 23:24:15 +00:00
Jim Grosbach 2fee5327aa Remove dead code. These ARM instruction definitions no longer exist.
llvm-svn: 127509
2011-03-11 23:15:02 +00:00
Jim Grosbach bb0547d9c4 Pseudo-ize VMOVDcc and VMOVScc.
llvm-svn: 127506
2011-03-11 23:09:50 +00:00
Jim Grosbach 9f2b3b569b 80 columns
llvm-svn: 127505
2011-03-11 23:00:16 +00:00
Jim Grosbach 6d371ce37e Properly pseudo-ize the ARM LDMIA_RET instruction. This has the nice side-
effect that we get proper instruction printing using the "pop" mnemonic for it.

llvm-svn: 127502
2011-03-11 22:51:41 +00:00
Jim Grosbach 59eea670f8 ARM VDUPfd and VDUPfq can just be patterns. The instruction is the same
as for VDUP32d and VDUP32q, respectively.

llvm-svn: 127489
2011-03-11 20:44:08 +00:00
Jim Grosbach c77dea7f55 ARM VDUPLNfq and VDUPLNfd definitions can just be Pat<>s for VDUPLN32q
and VDUPLN32d, respectively.

llvm-svn: 127486
2011-03-11 20:31:17 +00:00
Jim Grosbach 24fe5e36ea ARM VREV64df and VREV64qf can just be patterns. The instruction is the same
as for VREV64d32 and VREV64q32, respectively.

llvm-svn: 127485
2011-03-11 20:18:05 +00:00
Jim Grosbach 0b5119315b This FIXME has been fixed.
llvm-svn: 127483
2011-03-11 20:07:37 +00:00
Jim Grosbach fa56bca781 Properly pseudo-ize ARM MVNCCi.
llvm-svn: 127482
2011-03-11 19:55:55 +00:00
Jim Grosbach f541bfd7d4 Fix MOVCCi32imm to be have ARM-mode Requires and a proper size (8 bytes, was 4).
llvm-svn: 127469
2011-03-11 18:00:42 +00:00
Chris Lattner 05a23b1e61 silence a conditional assignment -Wuninitialized warning.
llvm-svn: 127453
2011-03-11 02:12:51 +00:00
Jim Grosbach d025498271 Properly pseudo-ize ARM MOVCCi and MOVCCi16.
llvm-svn: 127442
2011-03-11 01:09:28 +00:00
Eric Christopher cf56a5034f Change the x86 32-bit scheduler to register pressure and fix up the
corresponding testcases back to the previous versions.

Fixes some performance regressions only seen on 32-bit.

llvm-svn: 127441
2011-03-11 01:05:58 +00:00
Jim Grosbach 62a7b473af Properly pseudo-ize MOVCCr and MOVCCs.
llvm-svn: 127434
2011-03-10 23:56:09 +00:00
Jim Grosbach e5ccac85d3 DMB can just be a pat referencing MCR.
llvm-svn: 127423
2011-03-10 19:27:17 +00:00
Jim Grosbach b75c0db9d2 Reorganize a bit. No functional change, just moving patterns up.
llvm-svn: 127422
2011-03-10 19:21:08 +00:00
Jim Grosbach e175682781 Pseudo-instructions are codegenonly by definition.
llvm-svn: 127420
2011-03-10 19:06:39 +00:00
Justin Holewinski 72ff7e4fa9 PTX: Add preliminary support for floating-point divide and multiply-and-add
llvm-svn: 127410
2011-03-10 16:57:18 +00:00
Che-Liang Chiou 6e9fb0d056 ptx: add the rest of special registers of ISA version 2.0
llvm-svn: 127397
2011-03-10 04:05:57 +00:00
Stuart Hastings d17ae4e939 Revert 127359; it broke lencod.
llvm-svn: 127382
2011-03-10 00:25:53 +00:00
Evan Cheng b4c6a34415 Re-commit 127368 and 127371. They are exonerated.
llvm-svn: 127380
2011-03-10 00:16:32 +00:00
Evan Cheng d4b3f8e009 Revert 127368 and 127371 for now.
llvm-svn: 127376
2011-03-09 23:53:17 +00:00
Evan Cheng ca9a936332 Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more
flexible.

If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.

llvm-svn: 127368
2011-03-09 22:47:38 +00:00
Benjamin Kramer 801c9afd94 Fix a pasto that broke all x86_64-elf targets.
llvm-svn: 127365
2011-03-09 22:07:13 +00:00
Stuart Hastings 9955e2f912 X86 byval copies no longer always_inline. <rdar://problem/8706628>
llvm-svn: 127359
2011-03-09 21:10:30 +00:00
Johnny Chen 9363d41f14 LLVM combines the offset mode of A8.6.199 A1 & A2 into STRBT.
The insufficient encoding information of the combined instruction confuses the decoder wrt
UQADD16.  Add extra logic to recover from that.

Fixed an assert reported by Sean Callanan

llvm-svn: 127354
2011-03-09 20:01:14 +00:00
Bruno Cardoso Lopes 048ffabe78 Improve varags handling, with testcases. Patch by Sasa Stankovic
llvm-svn: 127349
2011-03-09 19:22:22 +00:00
Jan Sjödin 6348dc0566 Add createELFObjectTargetWriter method to TargetAsmBackend, which enables construction of non-standard ELFObjectWriters that can be used in MCJIT.
llvm-svn: 127346
2011-03-09 18:44:41 +00:00
NAKAMURA Takumi 58d1f93b03 Target/X86: Tweak va_arg for Win64 not to miss taking va_start when number of fixed args > 4.
llvm-svn: 127328
2011-03-09 11:33:15 +00:00
Bill Wendling 5e57137e87 * Correct encoding for VSRI.
* Add tests for VSRI and VSLI.

llvm-svn: 127297
2011-03-09 00:33:17 +00:00
Bill Wendling a7f303de71 Correct the encoding for VRSRA and VSRA instructions.
llvm-svn: 127294
2011-03-09 00:00:35 +00:00
Bill Wendling e313f16ad9 * Fix VRSHR and VSHR to have the correct encoding for the immediate.
* Update the NEON shift instruction test to expect what 'as' produces.

llvm-svn: 127293
2011-03-08 23:48:09 +00:00
Benjamin Kramer 679cfb54ec X86: Fix the (saddo/ssub x, 1) -> incl/decl selection to check the right operand for 1.
Found by inspection.

llvm-svn: 127247
2011-03-08 15:20:20 +00:00
Justin Holewinski 42e9aaa4b1 PTX: Add intrinsic support for ntid, ctaid, and nctaid registers
llvm-svn: 127246
2011-03-08 14:10:18 +00:00
Eric Christopher eb19e9e9fc Turn on list-ilp scheduling by default on x86 and x86-64, fix up
testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.

Performance results:

roughly neutral on SPEC
some micro benchmarks in the llvm suite are up between 100 and 150%, only
a pair of regressions that are due to be investigated

john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES

Small compile time impact.

llvm-svn: 127208
2011-03-08 02:42:25 +00:00
Bob Wilson 45acbd03db Fix a compiler crash where a Glue value had multiple uses. Radar 9049552.
llvm-svn: 127198
2011-03-08 01:17:20 +00:00
Bob Wilson 70bd363517 Fix comment typos.
llvm-svn: 127197
2011-03-08 01:17:16 +00:00
Bill Wendling 77ad1dc56d Rename the narrow shift right immediate operands to "shr_imm*" operands. Also
expand the testing of the narrowing shift right instructions.

No functionality change.

llvm-svn: 127193
2011-03-07 23:38:41 +00:00
Cameron Zwarich df61694417 Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo.
llvm-svn: 127175
2011-03-07 21:56:36 +00:00
Anton Korobeynikov 692f633df9 ARM assembler stuff is crazy: for .setfp positive values of offset corresponds to "add" instruction, not to "sub" as in .pad case
llvm-svn: 127106
2011-03-05 18:44:00 +00:00
Anton Korobeynikov 9e66cbb366 In Thumb1 mode the constant might be materialized via the load from constpool. Emit unwinding information in case when this load from constpool is used to change the stack pointer in the prologue.
llvm-svn: 127105
2011-03-05 18:43:55 +00:00
Anton Korobeynikov a8d177b2d4 Implement frame unwinding information emission for Thumb1. Not finished yet because there is no way given the constpool index to examine the actual entry: the reason is clones inserted by constant island pass, which are not tracked at all! The only connection is done during asmprinting time via magic label names which is really gross and needs to be eventually fixed.
llvm-svn: 127104
2011-03-05 18:43:50 +00:00
Anton Korobeynikov 51537f1c7f Add unwind information emission for thumb stuff
llvm-svn: 127103
2011-03-05 18:43:43 +00:00
Anton Korobeynikov acca7adf16 Handle MI flags inside Thumb2SizeReduction pass.
llvm-svn: 127102
2011-03-05 18:43:38 +00:00
Anton Korobeynikov e7410dd0d5 Preliminary support for ARM frame save directives emission via MI flags.
This is just very first approximation how the stuff should be done
(e.g. ARM-only for now). More to follow.

llvm-svn: 127101
2011-03-05 18:43:32 +00:00
Anton Korobeynikov a7ec2dcefd Some first rudimentary support for ARM EHABI: print exception table in "text mode".
llvm-svn: 127099
2011-03-05 18:43:15 +00:00
Bob Wilson 00d09428fe Remove unused conditional negate operations.
llvm-svn: 127090
2011-03-05 16:54:31 +00:00
Che-Liang Chiou 369ea3fdb4 ptx: add basic intrinsic support
llvm-svn: 127084
2011-03-05 14:17:37 +00:00
Andrew Trick 641e2d4f8c Increased the register pressure limit on x86_64 from 8 to 12
regs. This is the only change in this checkin that may affects the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.

Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It is currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.

Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.

llvm-svn: 127067
2011-03-05 08:00:22 +00:00
Andrew Trick 27c079e1b0 whitespace
llvm-svn: 127065
2011-03-05 06:31:54 +00:00
Bill Wendling 88842e4574 Initialize variable.
llvm-svn: 127038
2011-03-04 21:38:47 +00:00
Bruno Cardoso Lopes 434248a62c Improve div/rem node handling on mips. Patch by Akira Hatanaka
llvm-svn: 127034
2011-03-04 21:03:24 +00:00
Bruno Cardoso Lopes a744ef3f90 Expands register/immediate pairs when the immediate is too large to fit in 16-bit field. Patch by Akira Hatanaka
llvm-svn: 127032
2011-03-04 20:48:08 +00:00
Bruno Cardoso Lopes 8887d6593f Rewrite and simplify o32 vaarg passing, no functional changes. Patch by Sasa Stankovic
llvm-svn: 127029
2011-03-04 20:27:44 +00:00
Bruno Cardoso Lopes f8198e4311 Lowers block address. Currently asserts when relocation model is not PIC. Patch by Akira Hatanaka
llvm-svn: 127027
2011-03-04 20:01:52 +00:00
Bruno Cardoso Lopes 328e2ce043 Fix an old copy-n-paste
llvm-svn: 127020
2011-03-04 19:20:24 +00:00
Devang Patel a0d73fd65e Disable ARMGlobalMerge on darwin. The debugger is not yet able to extract individual variable's info from merged global.
llvm-svn: 127019
2011-03-04 19:11:05 +00:00
Bruno Cardoso Lopes 22b69db8dd Expands FCOS and FSIN nodes when type is f64.
llvm-svn: 127017
2011-03-04 18:54:14 +00:00
Bruno Cardoso Lopes db93ddb41b Fixes addc pattern when immediate cannot be represented with 16-bit. Patch by Akira Hatanaka
llvm-svn: 127005
2011-03-04 17:59:18 +00:00
Bruno Cardoso Lopes ed874eff93 Remove (hopefully) all trailing whitespaces from the mips backend. Patch by Hatanaka, Akira
llvm-svn: 127003
2011-03-04 17:51:39 +00:00
Kalle Raiskila a1d947dd14 Allow vector shifts (shl,lshr,ashr) on SPU.
There was a previous implementation with patterns that would 
have matched e.g. 
	shl <v4i32> <i32>,
but this is not valid LLVM IR so they never were selected.

llvm-svn: 126998
2011-03-04 13:19:18 +00:00
Kalle Raiskila 3531e9b0d9 Allow load from constant on SPU.
A 'load <4 x i32>* null' crashes llc before this fix.

llvm-svn: 126995
2011-03-04 12:00:11 +00:00
Eli Friedman f63614a982 PR9377: Handle x86 str with register operand in a way consistent with gas.
llvm-svn: 126970
2011-03-04 00:10:17 +00:00
Bob Wilson f5d23beff7 PR8053: Fix encoding of S bit in some ARM instructions.
Patch by Zonr Chang!

llvm-svn: 126967
2011-03-03 23:07:15 +00:00
Richard Osborne af52c52569 Optimize fprintf -> iprintf if there are no floating point arguments
and siprintf is available on the target.

llvm-svn: 126940
2011-03-03 14:20:22 +00:00
Justin Holewinski 8e9a126a6c PTX: Fix Emacs renaming a symbol
llvm-svn: 126938
2011-03-03 14:09:40 +00:00
Richard Osborne 2dfb888392 Optimize sprintf -> siprintf if there are no floating point arguments
and siprintf is available on the target.

llvm-svn: 126937
2011-03-03 14:09:28 +00:00
Justin Holewinski 969dfbcff6 PTX: Fix a couple of lint violations
llvm-svn: 126936
2011-03-03 13:34:29 +00:00
Richard Osborne 815de536e5 Optimize printf -> iprintf if there are no floating point arguments
and iprintf is available on the target. Currently iprintf is only
marked as being available on the XCore.

llvm-svn: 126935
2011-03-03 13:17:51 +00:00
Tilmann Scheller 3bc0bcf3ad Use X86_thiscall calling convention for Win64 as well.
llvm-svn: 126934
2011-03-03 07:49:07 +00:00
Bob Wilson ab8881accd Add a readme entry for the redundant movw issue for pr9370.
llvm-svn: 126930
2011-03-03 06:39:09 +00:00
Bob Wilson ec84568904 pr9367: Add missing predicated BLX instructions.
Patch by Jyun-Yan You, with some minor adjustments and a testcase from me.

llvm-svn: 126915
2011-03-03 01:41:01 +00:00
Kevin Enderby b8b6041734 Fixes an assertion failure while disassembling ARM rsbs reg/reg form.
Patch by Ted Kremenek!

llvm-svn: 126895
2011-03-02 23:08:33 +00:00
Renato Golin e84af17b6e Fixing a bug when printing fpu text to object file. Patch by Mans Rullgard.
llvm-svn: 126882
2011-03-02 21:20:09 +00:00
Tilmann Scheller a3769f8021 Add Win64 thiscall calling convention.
llvm-svn: 126862
2011-03-02 19:29:22 +00:00
David Greene dd567b214b [AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement
missing patterns for them.

      Add a SIMD test subdirectory to hold tests for SIMD instruction
      selection correctness and quality.
'

llvm-svn: 126845
2011-03-02 17:23:43 +00:00
Che-Liang Chiou 7ed32cc51b ptx: fix lint and compiler warnings
llvm-svn: 126838
2011-03-02 07:58:46 +00:00
Che-Liang Chiou 59515dc703 Add 64-bit addressing to PTX backend
- Add '64bit' sub-target option.
- Select 32-bit/64-bit loads/stores based on '64bit' option.
- Fix function parameter order.

Patch by Justin Holewinski

llvm-svn: 126837
2011-03-02 07:36:48 +00:00
Che-Liang Chiou 65b1476031 Extend initial support for primitive types in PTX backend
- Allow i16, i32, i64, float, and double types, using the native .u16,
  .u32, .u64, .f32, and .f64 PTX types.
- Allow loading/storing of all primitive types.
- Allow primitive types to be passed as parameters.
- Allow selection of PTX Version and Shader Model as sub-target attributes.
- Merge integer/floating-point test cases for load/store.
- Use .u32 instead of .s32 to conform to output from NVidia nvcc compiler.

Patch by Justin Holewinski

llvm-svn: 126824
2011-03-02 03:20:28 +00:00
Duncan Sands c76ae9c8e0 Add datalayout information for the IEEE quad precision fp128 type.
llvm-svn: 126780
2011-03-01 20:56:50 +00:00
Bill Wendling 3b1459b810 Narrow right shifts need to encode their immediates differently from a normal
shift.

   16-bit: imm6<5:3> = '001', 8 - <imm> is encded in imm6<2:0>
   32-bit: imm6<5:4> = '01',16 - <imm> is encded in imm6<3:0>
   64-bit: imm6<5> = '1', 32 - <imm> is encded in imm6<4:0>

llvm-svn: 126723
2011-03-01 01:00:59 +00:00
Chris Lattner 0c6cb46ac1 add a note
llvm-svn: 126719
2011-03-01 00:24:51 +00:00
Renato Golin ec0fc7d842 Fix .fpu printing in ARM assembly, regarding bug http://llvm.org/bugs/show_bug.cgi?id=8931
llvm-svn: 126689
2011-02-28 22:04:27 +00:00
Kevin Enderby 63b0d108a2 Add missing whitespace in the formatting.
llvm-svn: 126687
2011-02-28 21:45:12 +00:00
Chris Lattner c93d207e8c fix a signed comparison warning.
llvm-svn: 126682
2011-02-28 20:50:35 +00:00
David Greene 20a1cbefad [AVX] Add decode support for VUNPCKLPS/D instructions, both 128-bit
and 256-bit forms.  Because the number of elements in a vector
      does not determine the vector type (4 elements could be v4f32 or
      v4f64), pass the full type of the vector to decode routines.

llvm-svn: 126664
2011-02-28 19:06:56 +00:00
Kevin Enderby 58775fea6f Fix the arm's disassembler for blx that was building an MCInst without the
needed two predicate operands before the imm operand.

llvm-svn: 126662
2011-02-28 18:46:31 +00:00
Evan Cheng 6e3d443646 Fix a typo which cause dag combine crash. rdar://9059537.
llvm-svn: 126661
2011-02-28 18:45:27 +00:00
Stuart Hastings 67c5c3e939 Support for byval parameters on ARM. Will be enabled by a forthcoming
patch to the front-end.  Radar 7662569.

llvm-svn: 126655
2011-02-28 17:17:53 +00:00
Kalle Raiskila 612b85e58c Add branch hinting for SPU.
The implemented algorithm is overly simplistic (just speculate all branches are
taken)- this is work in progress.

llvm-svn: 126651
2011-02-28 14:08:24 +00:00
Che-Liang Chiou 75a800d3bf Add preliminary support for .f32 in the PTX backend.
- Add appropriate TableGen patterns for fadd, fsub, fmul.
- Add .f32 as the PTX type for the LLVM float type.
- Allow parameters, return values, and global variable declarations
  to accept the float type.
- Add appropriate test cases.

Patch by Justin Holewinski

llvm-svn: 126636
2011-02-28 06:34:09 +00:00
Benjamin Kramer 25bddae404 Silence enum conversion warnings.
llvm-svn: 126578
2011-02-27 18:13:53 +00:00
NAKAMURA Takumi d4e5003a3f Target/X86: Always emit "push/pop GPRs" in prologue/epilogue and emit "spill/reload frames" for XMMs.
It improves Win64's prologue/epilogue but it would not affect ia32 and amd64 (lack of nonvolatile XMMs).

llvm-svn: 126568
2011-02-27 08:47:19 +00:00
Benjamin Kramer 26691d9660 Add some DAGCombines for (adde 0, 0, glue), which are useful to optimize legalized code for large integer arithmetic.
1. Inform users of ADDEs with two 0 operands that it never sets carry
2. Fold other ADDs or ADDCs into the ADDE if possible

It would be neat if we could do the same thing for SETCC+ADD eventually, but we can't do that in target independent code.

llvm-svn: 126557
2011-02-26 22:48:07 +00:00
Owen Anderson b2c80da4ae Allow targets to specify a the type of the RHS of a shift parameterized on the type of the LHS.
llvm-svn: 126518
2011-02-25 21:41:48 +00:00
Cameron Zwarich fcf51fd298 Roll out r126425 and r126450 to see if it fixes the failures on the buildbots.
llvm-svn: 126488
2011-02-25 16:30:32 +00:00
Bob Wilson e3ecd5fb9b Add patterns to use post-increment addressing for Neon VST1-lane instructions.
llvm-svn: 126477
2011-02-25 06:42:42 +00:00
Evan Cheng a921dc5860 Fix typo.
llvm-svn: 126467
2011-02-25 01:29:29 +00:00
Evan Cheng 70d29634a9 Each prologue may have multiple vpush instructions to store callee-saved
D registers since the vpush list may not have gaps. Make sure the stack
adjustment instruction isn't moved between them. Ditto for vpop in
epilogues.

Sorry, can't reduce a small test case.
rdar://9043312

llvm-svn: 126457
2011-02-25 00:24:46 +00:00
Chris Lattner 0152b7bc7c remove command line option debugging hook.
llvm-svn: 126441
2011-02-24 21:53:03 +00:00
Devang Patel b037383a35 Enable DebugInfo support for COFF object files.
Patch by Nathan Jeffords!

llvm-svn: 126425
2011-02-24 21:04:00 +00:00
Richard Osborne 42f52e737e Add XCore intrinsic for eeu instruction.
llvm-svn: 126384
2011-02-24 13:39:18 +00:00
Evan Cheng 3923466e82 Fix bug in X86 folding / unfolding table. Int_CMPSDrm and Int_CMPSSrm memory
operands starts at index 2, not 1.
rdar://9045024
PR9305

llvm-svn: 126359
2011-02-24 02:36:52 +00:00
Richard Osborne bfa5cc0e08 Add XCore intrinsic for clre instruction.
llvm-svn: 126322
2011-02-23 18:52:05 +00:00
Richard Osborne 4995b05f56 Add llvm.xcore.waitevent intrinsic. The effect of this intrinsic is to enable
events on the thread and wait until a resource is ready to event. The vector
of the resource that is ready is returned.

llvm-svn: 126320
2011-02-23 18:35:59 +00:00
Richard Osborne 2c610aa3ed Add XCore intrinsic for the setv instruction.
llvm-svn: 126315
2011-02-23 16:46:37 +00:00
Richard Osborne 12377e0947 Fix format for setc instruction.
llvm-svn: 126314
2011-02-23 15:20:16 +00:00
Richard Osborne aab96995f6 Add XCore intrinsic for settw instruction.
llvm-svn: 126313
2011-02-23 14:45:03 +00:00
Evan Cheng 97e6428014 Change VFPNeonA8 definition to make the code easier to read.
llvm-svn: 126298
2011-02-23 02:35:33 +00:00
Evan Cheng d6b641e5bc More fcopysign correctness and performance fix.
The previous codegen for the slow path (when values are in VFP / NEON
registers) was incorrect if the source is NaN.

The new codegen uses NEON vbsl instruction to copy the sign bit. e.g.
        vmov.i32        d1, #0x80000000
        vbsl    d1, d2, d0
If NEON is not available, it uses integer instructions to copy the sign bit.
rdar://9034702

llvm-svn: 126295
2011-02-23 02:24:55 +00:00
David Greene 9a6040dc86 [AVX] General VUNPCKL codegen support.
llvm-svn: 126264
2011-02-22 23:31:46 +00:00
Joerg Sonnenberger b7e635dcad Use the same (%dx) hack for in[bwl] as for out[bwl].
llvm-svn: 126244
2011-02-22 20:40:09 +00:00
Evan Cheng 04ad35b53f VFP single precision arith instructions can go down to NEON pipeline, but on Cortex-A8 only.
llvm-svn: 126238
2011-02-22 19:53:14 +00:00
Roman Divacky e8a93fe8f0 Stack alignment is 16 bytes on FreeBSD/i386 too.
llvm-svn: 126226
2011-02-22 17:30:05 +00:00
Evan Cheng 666cf56668 Guard against de-referencing MBB.end().
llvm-svn: 126192
2011-02-22 07:07:59 +00:00
Evan Cheng 2ce663031f available_externally (hidden or not) GVs are always accessed via stubs. rdar://9027648.
llvm-svn: 126191
2011-02-22 06:58:34 +00:00