Commit Graph

21958 Commits

Author SHA1 Message Date
Nadav Rotem bbd40f67d8 Do not optimize the used bits of the x86 vselect condition operand, when the condition operand is a vector of 1-bit predicates.
This may happen on MIC devices.

llvm-svn: 158168
2012-06-07 20:53:48 +00:00
Andrew Trick a5d24ca453 Continue factoring computeOperandLatency. Use it for ARM hasHighOperandLatency.
llvm-svn: 158164
2012-06-07 19:42:04 +00:00
Andrew Trick 5b1cadf9f7 ARM getOperandLatency rewrite.
Match expectations of the new latency API. Cleanup and make the logic consistent.

llvm-svn: 158163
2012-06-07 19:42:00 +00:00
Andrew Trick 3564bdfa61 ARM getOperandLatency should return -1 for unknown, consistent with API
llvm-svn: 158162
2012-06-07 19:41:58 +00:00
Andrew Trick fb1a74c2b2 Fix ARM getInstrLatency logic to work with the current API.
llvm-svn: 158161
2012-06-07 19:41:55 +00:00
Manman Ren 746e4859d0 PR13046: we can't replace usage of SUB with CMP in the lowering phase.
It will cause assertion failure later on.

llvm-svn: 158160
2012-06-07 19:27:33 +00:00
Rafael Espindola 55d1145bd5 Use a base register instead of an index register with the local dynamic model.
Fixes pr13048.

llvm-svn: 158158
2012-06-07 18:39:19 +00:00
Manman Ren ae02c5a93e X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 158126
2012-06-07 00:42:47 +00:00
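
The rewrite described in r158126 above relies on CMP setting the same flags as SUB while discarding the difference. A minimal sketch of that equivalence follows, assuming a simplified 64-bit model that tracks only the sign and zero flags; the function names are hypothetical and are not part of LLVM.

```python
MASK64 = (1 << 64) - 1  # model registers as 64-bit unsigned values


def sub_flags(a, b):
    """Model a 64-bit SUB: return (result, sign_flag, zero_flag)."""
    result = (a - b) & MASK64
    sign = (result >> 63) & 1
    zero = int(result == 0)
    return result, sign, zero


def cmp_flags(a, b):
    """Model CMP: the same flags as SUB, but the difference is discarded."""
    _, sign, zero = sub_flags(a, b)
    return sign, zero


def select_with_sub(rdi, rsi):
    # movq %rdi, %rax ; subq %rsi, %rax ; cmovsq %rsi, %rdi ; movq %rdi, %rax
    _unused_difference, sign, _ = sub_flags(rdi, rsi)
    if sign:
        rdi = rsi
    return rdi  # final %rax


def select_with_cmp(rdi, rsi):
    # cmpq %rsi, %rdi ; cmovsq %rsi, %rdi ; movq %rdi, %rax
    sign, _ = cmp_flags(rdi, rsi)
    if sign:
        rdi = rsi
    return rdi  # final %rax
```

Because the difference written by `subq` is overwritten before anything reads it, both sequences leave the same value in `%rax` in this model.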
Manman Ren 9c9641812c Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Benjamin Kramer 009b1c1cf1 Round 2 of dead private variable removal.
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those inaccessible fields.
- gtest.

llvm-svn: 158096
2012-06-06 19:47:08 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Chad Rosier 5d6f01ad77 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.
rdar://11496434

llvm-svn: 158087
2012-06-06 17:37:40 +00:00
Richard Barton f1ef87ddbb Correct decoder for T1 conditional B encoding
llvm-svn: 158055
2012-06-06 09:12:53 +00:00
Craig Topper bf2409e8aa Mark several instructions SSE2 instead of SSE3 as they should be.
llvm-svn: 158049
2012-06-06 06:45:27 +00:00
Andrew Trick 4544606c71 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

llvm-svn: 158021
2012-06-05 21:11:27 +00:00
Yuan Lin 572a3a2cce Fix header file include order in NVPTX backend NV_CONTRIB
llvm-svn: 158013
2012-06-05 19:06:13 +00:00
Roman Divacky c856653fb3 PPC32 uses R2 as the TLS register. Fix the copy and paste.
llvm-svn: 158004
2012-06-05 17:14:17 +00:00
Andrew Trick 39a99140c7 X86 itinerary properties.
llvm-svn: 157981
2012-06-05 03:44:46 +00:00
Andrew Trick b2680c718f ARM itinerary properties.
llvm-svn: 157980
2012-06-05 03:44:43 +00:00
Andrew Trick 73d7736b17 misched: Added MultiIssueItineraries.
This allows a subtarget to explicitly specify the issue width and
other properties without providing pipeline stage details for every
instruction.

llvm-svn: 157979
2012-06-05 03:44:40 +00:00
Andrew Trick 515f131786 whitespace
llvm-svn: 157976
2012-06-05 03:44:29 +00:00
Joel Jones 7f2ac7a2c8 Revert commit r157966
llvm-svn: 157972
2012-06-05 00:47:21 +00:00
Joel Jones d08534f82e This change handles another case for generating the bic instruction
when a compile-time constant is known.  This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits.

<rdar://problem/11481151>

llvm-svn: 157966
2012-06-04 23:38:57 +00:00
Akira Hatanaka 6734685f21 Fix a bug in MipsTargetLowering::LowerLOAD. A shift-right-logical node is
inserted after the shift-left-logical node.

llvm-svn: 157937
2012-06-04 17:46:29 +00:00
Roman Divacky e3f15c98d1 Implement local-exec TLS on PowerPC.
llvm-svn: 157935
2012-06-04 17:36:38 +00:00
Hans Wennborg 245917b536 MIPS TLS: use the model selected by TargetMachine::getTLSModel().
This was mostly done already in r156162, but I missed one place.

llvm-svn: 157929
2012-06-04 14:02:08 +00:00
Hans Wennborg 09610f3e09 Better comments for TLS-related X86 MachineOperand flags.
llvm-svn: 157920
2012-06-04 09:55:36 +00:00
Craig Topper c6ac4cefcc Add intrinsic forms for FMA instructions to opcode folding tables.
llvm-svn: 157917
2012-06-04 07:46:16 +00:00
Craig Topper 3cb143016d Add VFMADDSUB and VFMSUBADD FMA instructions to folding tables. Also add 213 forms of scalar FMA instructions.
llvm-svn: 157914
2012-06-04 07:08:21 +00:00
Hal Finkel 1de9bf01e4 Fix a copy-and-paste duplication error in the PPC 440 and A2 schedules (no functionality change).
llvm-svn: 157912
2012-06-04 02:39:52 +00:00
Hal Finkel 595817eebe Enable generating PPC pre-increment (r+imm) instructions by default.
It seems that this no longer causes test suite failures on PPC64 (after r157159),
and often gives a performance benefit, so it can be enabled by default.

llvm-svn: 157911
2012-06-04 02:21:00 +00:00
Craig Topper 79dbb0c6e4 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
llvm-svn: 157903
2012-06-03 18:58:46 +00:00
Craig Topper fd53b80219 Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
llvm-svn: 157898
2012-06-03 07:26:46 +00:00
Manman Ren 5097e4f38a Revert r157831
llvm-svn: 157896
2012-06-03 03:14:24 +00:00
Craig Topper 29eafea292 Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
llvm-svn: 157895
2012-06-03 01:40:43 +00:00
Craig Topper badd755a0e Add neverHasSideEffects and mayLoad to FMA3 instructions.
llvm-svn: 157894
2012-06-03 00:30:49 +00:00
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Chris Lattner 58268c23ac remove an unused variable.
llvm-svn: 157872
2012-06-02 01:03:42 +00:00
Akira Hatanaka 23327b30ef Remove code which is no longer needed in MipsAsmPrinter and MipsMCInstLower.
llvm-svn: 157867
2012-06-02 00:05:11 +00:00
Akira Hatanaka 019e592f75 Set operation actions for load/store nodes in the Mips backend.
llvm-svn: 157866
2012-06-02 00:04:42 +00:00
Akira Hatanaka f11571d90d Add definitions of 32/64-bit unaligned load/store instructions for Mips.
llvm-svn: 157865
2012-06-02 00:04:19 +00:00
Akira Hatanaka 8f1db778a4 Define functions MipsTargetLowering::LowerLOAD and LowerSTORE which
custom-lower unaligned load and store nodes.

llvm-svn: 157864
2012-06-02 00:03:49 +00:00
Akira Hatanaka b9ebf8d644 Define Mips specific unaligned load/store nodes.
llvm-svn: 157863
2012-06-02 00:03:12 +00:00
Akira Hatanaka 4e76bf8282 Expand unaligned i16 loads/stores for the Mips backend.
This is the first of a series of patches which make changes to the backend to
emit unaligned load/store instructions (lwl,lwr,swl,swr) during instruction
selection.

llvm-svn: 157862
2012-06-02 00:02:45 +00:00
Akira Hatanaka 56bf023a6d In MipsMCInstLower::LowerSymbolOperand, get offset from symbol if
the MachineOperand type has a valid offset. 

llvm-svn: 157861
2012-06-02 00:02:11 +00:00
Jakob Stoklund Olesen 54038d796c Switch all register list clients to the new MC*Iterator interface.
No functional change intended.

Sorry for the churn. The iterator classes are supposed to help avoid
giant commits like this one in the future. The TableGen-produced
register lists are getting quite large, and it may be necessary to
change the table representation.

This makes it possible to do so without changing all clients (again).

llvm-svn: 157854
2012-06-01 23:28:30 +00:00
Chad Rosier f319324082 [arm-fast-isel] Fix handling of the frameaddress intrinsic. If depth is 0
then DestReg is undefined.

llvm-svn: 157840
2012-06-01 21:12:31 +00:00
Jakob Stoklund Olesen 92a0083944 Switch some getAliasSet clients to MCRegAliasIterator.
MCRegAliasIterator can optionally visit the register itself, allowing
for simpler code.

llvm-svn: 157837
2012-06-01 20:36:54 +00:00
Manman Ren 879ca9d47d X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.

llvm-svn: 157831
2012-06-01 19:49:33 +00:00
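
The "bge L1 or ble L1" alternative in r157831 above reflects that a `cmp` whose operands are in the opposite order from the `sub` needs its branch condition swapped once the `cmp` is removed. A minimal sketch of that bookkeeping follows, assuming signed conditions evaluated on unbounded integers (so overflow is ignored); the table and function names are hypothetical, not LLVM's.

```python
# Condition to use when the compared operands are exchanged.
SWAPPED = {"ge": "le", "le": "ge", "gt": "lt", "lt": "gt", "eq": "eq", "ne": "ne"}


def holds(cond, a, b):
    """Evaluate a signed condition as if the flags came from comparing a with b."""
    return {
        "ge": a >= b,
        "le": a <= b,
        "gt": a > b,
        "lt": a < b,
        "eq": a == b,
        "ne": a != b,
    }[cond]


def branch_after_cmp_removal(cond, cmp_operands, sub_operands):
    """Return the branch condition to use once the cmp is deleted.

    Returns None when the cmp does not compare the same two registers
    as the sub, in which case the cmp has to stay.
    """
    if cmp_operands == sub_operands:
        return cond
    if cmp_operands == sub_operands[::-1]:
        return SWAPPED[cond]
    return None
```

For example, `branch_after_cmp_removal("ge", ("r3", "r1"), ("r1", "r3"))` yields `"le"`, matching the `ble L1` form in the commit message.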
Manman Ren e873552091 ARM: properly handle alignment for struct byval.
Factor out the expansion code into a function.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 157830
2012-06-01 19:33:18 +00:00
Hans Wennborg 789acfb63d Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

llvm-svn: 157818
2012-06-01 16:27:21 +00:00
Craig Topper 1d4d62d76c Enable automatic detection of FMA3 support to allow intrinsics to be used.
llvm-svn: 157805
2012-06-01 06:10:14 +00:00
Craig Topper 00649d5111 Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enable intrinsic use at least, though.
llvm-svn: 157804
2012-06-01 06:07:48 +00:00
Craig Topper 2e127b5274 Add VFNSUB* instructions to folding table.
llvm-svn: 157802
2012-06-01 05:48:39 +00:00
Craig Topper 9eadcfdf2a Remove a trailing space and fix a comment.
llvm-svn: 157801
2012-06-01 05:34:01 +00:00
Craig Topper df09da8355 Tidy up. Remove trailing spaces and fix the worst of the 80 column violations.
llvm-svn: 157799
2012-06-01 05:24:29 +00:00
Manman Ren 9f9111651e ARM: support struct byval in llvm
We handle struct byval by inserting a pseudo op, which will be expanded to a
loop at ExpandISelPseudos.
A separate patch for clang will be submitted to enable struct byval.

rdar://9877866

llvm-svn: 157793
2012-06-01 02:44:42 +00:00
Chad Rosier 526772de29 Put the shiny new MCSubRegIterator to work.
llvm-svn: 157783
2012-06-01 00:02:08 +00:00
Jakob Stoklund Olesen 4f203ea34b Add support for return value promotion in X86 calling conventions.
Patch by Yiannis Tsiouris!

llvm-svn: 157757
2012-05-31 17:28:20 +00:00
Manman Ren 9bccb64e56 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
2012-05-31 17:20:29 +00:00
Jakob Stoklund Olesen fa9d7db17b Add a PrintRegUnit helper similar to PrintReg.
Reg-units are named after their root registers, and most units have a
single root, so they simply print as 'AL', 'XMM0', etc. The rare dual
root reg-units print as FPSCR~FPSCR_NZCV, FP0~ST7, ...

The printing piggybacks on the existing register name tables, so no
extra const data space is required.

llvm-svn: 157754
2012-05-31 17:18:29 +00:00
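
The naming rule stated in r157754 above (a reg-unit prints as its root register, and a dual-root unit prints as both roots joined by `~`) can be sketched as follows. This is only an illustration of the rule as described; `reg_unit_name` is a hypothetical helper, not the LLVM `PrintRegUnit` implementation.

```python
def reg_unit_name(roots):
    """Format a register unit from the names of its root registers.

    Assumes, per the commit message, that a unit has one root or
    (rarely) two.
    """
    if not 1 <= len(roots) <= 2:
        raise ValueError("a register unit is expected to have one or two roots")
    return "~".join(roots)
```

With this sketch, `reg_unit_name(["AL"])` gives `AL` and `reg_unit_name(["FP0", "ST7"])` gives `FP0~ST7`.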
Joel Jones 585bc82489 Fix typos
llvm-svn: 157752
2012-05-31 17:11:25 +00:00
Benjamin Kramer a0396e4583 X86: Rename the CLMUL target feature to PCLMUL.
It was renamed in gcc/gas a while ago and causes all kinds of
confusion because it was named differently in llvm and clang.

llvm-svn: 157745
2012-05-31 14:34:17 +00:00
Elena Demikhovsky 602f3a26d6 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for CodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.

llvm-svn: 157737
2012-05-31 09:20:20 +00:00
Craig Topper c1ac05dad5 Add intrinsic for pclmulqdq instruction.
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Akira Hatanaka bff8e31d3c Cleanup and factoring of mips16 tablegen classes. Make register classes
CPU16RegsRegClass and CPURARegRegClass available. Add definition of mips16
jalr instruction.

Patch by Reed Kotler.

llvm-svn: 157730
2012-05-31 02:59:44 +00:00
Jakob Stoklund Olesen 5541f6026e Avoid depending on list orders and register numbering.
This code is covered by test/CodeGen/ARM/arm-modifier.ll.

llvm-svn: 157720
2012-05-30 23:00:43 +00:00
Jakob Stoklund Olesen 0b97dbcf1a Extract some pointer hacking to a function.
Switch to MCSuperRegIterator while we're there.

llvm-svn: 157717
2012-05-30 22:40:03 +00:00
Eric Christopher f481ab3877 Add support for the mips inline asm 'm' output modifier.
Patch by Jack Carter.

llvm-svn: 157709
2012-05-30 19:05:19 +00:00
Jakob Stoklund Olesen ad8103dc7b Fix some uses of getSubRegisters() to use getSubReg() instead.
It is better to address sub-registers directly by name instead of
relying on their position in the sub-register list.

llvm-svn: 157703
2012-05-30 18:40:49 +00:00
Chris Lattner 1622a99e58 it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more.
llvm-svn: 157699
2012-05-30 18:08:02 +00:00
Chris Lattner 04d722a68d Extend the (abi-irrelevant) return convention to be able to return more than two values in
integer registers.  This is already supported by the fastcc convention, but it doesn't
hurt to support it in the standard conventions as well.

In cases where we can cheat at the calling convention, this allows us to avoid returning
things through memory in more cases.

llvm-svn: 157698
2012-05-30 17:50:14 +00:00
Chad Rosier 820d248c4d [arm-fast-isel] Add support for the llvm.frameaddress() intrinsic.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 157696
2012-05-30 17:23:22 +00:00
Benjamin Kramer f1e0b6cdf7 Port support for SSE4a extrq/insertq to the old jit code emitter.
llvm-svn: 157685
2012-05-30 09:13:55 +00:00
Benjamin Kramer ef479ea854 Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Nicolas Geoffray 312b28ce9d Update CPPBackend to new API for AttrListPtr::get.
llvm-svn: 157624
2012-05-29 15:07:18 +00:00
Stepan Dyatkovskiy 58107dd547 ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping.
llvm-svn: 157612
2012-05-29 12:26:47 +00:00
Akira Hatanaka 5cec9007bb Fix predicate HasStandardEncoding in MipsInstrInfo.td per suggestion of
Benjamin Kramer.

llvm-svn: 157504
2012-05-25 22:15:15 +00:00
Akira Hatanaka 03968fac4f Delete MipsExpandPseudo.cpp.
llvm-svn: 157496
2012-05-25 20:54:48 +00:00
Akira Hatanaka d0ac2c93d3 Move the code in MipsExpandPseudo to MipsInstrInfo::expandPostRAPseudo.
Delete MipsExpandPseudo.

llvm-svn: 157495
2012-05-25 20:52:52 +00:00
Akira Hatanaka f4554485cb Remove the code that expands MIPS' .cpload directive.
llvm-svn: 157494
2012-05-25 20:46:52 +00:00
Akira Hatanaka 5de59266cd Remove the code that emits MIPS' .cprestore directive.
llvm-svn: 157493
2012-05-25 20:42:55 +00:00
Akira Hatanaka 4d9b017ef2 Remove pseudo instructions that are no longer used.
llvm-svn: 157492
2012-05-25 20:37:40 +00:00
Justin Holewinski aa58397b3c Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

llvm-svn: 157479
2012-05-25 16:35:28 +00:00
Eli Friedman 315a0c79f3 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
Jakob Stoklund Olesen ff7fd4543f Shrink.
llvm-svn: 157433
2012-05-24 22:17:44 +00:00
Justin Holewinski 907f7606f2 Remove the PTX back-end and all of its artifacts (triple, etc.)
This back-end was deprecated in favor of the NVPTX back-end.

NV_CONTRIB

llvm-svn: 157417
2012-05-24 21:38:21 +00:00
Akira Hatanaka a649cc75b3 Turn on mips16 pseudo op when compiling for mips16.
Expand test case for this.

Patch by Reed Kotler.

llvm-svn: 157410
2012-05-24 18:37:43 +00:00
Akira Hatanaka df98a7a34d Enable Mips16 compiler to compile a null program.
First code from the Mips16 compiler. Includes trivial test program.

Patch by Reed Kotler.

llvm-svn: 157408
2012-05-24 18:32:33 +00:00
Craig Topper bdf39a46a3 Convert assert(0) to llvm_unreachable.
llvm-svn: 157380
2012-05-24 07:02:50 +00:00
Craig Topper 273b0d7be5 Use uint16_t to store registers in static tables. Matches other tables.
llvm-svn: 157375
2012-05-24 06:09:56 +00:00
Craig Topper be064d0136 Use uint16_t to store register number in static tables to match other tables.
llvm-svn: 157374
2012-05-24 05:55:47 +00:00
Craig Topper 01736f866a Make some opcode tables static and const. Allows code to avoid making copies to pass the tables around.
llvm-svn: 157373
2012-05-24 05:17:00 +00:00
Craig Topper e4260f911b Mark a couple arrays as static and const. Use array_lengthof instead of sizeof/sizeof.
llvm-svn: 157369
2012-05-24 04:22:05 +00:00
Craig Topper 42b96d1b74 Mark a static array as const.
llvm-svn: 157368
2012-05-24 04:11:15 +00:00
Craig Topper 2fbd130a79 Mark a static table as const. Shrink opcode size in static tables to uint16_t. Simplify loop iterating over one of those tables. No functional change intended.
llvm-svn: 157367
2012-05-24 03:59:11 +00:00
Chad Rosier 20b79dc40e Tidy up naming for consistency and other cleanup. No functional change intended.
llvm-svn: 157358
2012-05-23 23:45:10 +00:00
Chad Rosier 223faf719c [arm-fast-isel] Add support for non-global callee.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 157336
2012-05-23 18:38:57 +00:00
Craig Topper a4fd6d655a Tidy up spacing.
llvm-svn: 157313
2012-05-23 05:44:51 +00:00
Craig Topper 9fc5c814fa Fix indentation of wrapped line for readability. No functional change.
llvm-svn: 157309
2012-05-23 03:59:53 +00:00
NAKAMURA Takumi 70c1aa0bb5 ARMDisassembler.cpp: Fix utf8 char in comments.
llvm-svn: 157292
2012-05-22 21:47:02 +00:00
Craig Topper 53b4b73be9 Fix constant used for pshufb mask when lowering v16i8 shuffles. Bug introduced in r157043. Fixes PR12908.
llvm-svn: 157236
2012-05-22 06:09:38 +00:00
Akira Hatanaka cdf4fd8267 This patch adds a predicate to existing mips32 and mips64 so that those
instruction encodings can be excluded during mips16 processing.

This revision fixes the issue raised by Jim Grosbach.

bool hasStandardEncoding() const { return !inMips16Mode(); }

When micromips is added it will be

bool hasStandardEncoding() const { return !inMips16Mode() && !inMicroMipsMode(); }

No additional testing is needed other than to assure that there is no regression
from this patch.

Patch by Reed Kotler.

llvm-svn: 157234
2012-05-22 03:10:09 +00:00
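
The predicate quoted in r157234 above can be modelled in a few lines. This is a sketch under the assumption that the two modes are independent boolean subtarget properties; the class and attribute names are hypothetical and only mirror the C++ one-liners in the commit message.

```python
from dataclasses import dataclass


@dataclass
class SubtargetModel:
    """Hypothetical stand-in for the Mips subtarget's mode flags."""

    in_mips16_mode: bool = False
    in_micromips_mode: bool = False

    def has_standard_encoding(self):
        # Standard mips32/mips64 encodings apply only when neither
        # compressed instruction set is active.
        return not self.in_mips16_mode and not self.in_micromips_mode
```

Instruction definitions guarded by such a predicate would then be skipped whenever `has_standard_encoding()` is false.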
Jim Grosbach 2597f83889 ARM: .end_data_region mismatch in Thumb2.
32-bit offset jump tables just use real branch instructions and so aren't
marked as data regions. We were still emitting the .end_data_region
marker though, which assert()ed.

rdar://11499158

llvm-svn: 157221
2012-05-21 23:34:42 +00:00
Jim Grosbach 19a7bcedb1 Thumb2: RSB source register should be rGPR not GPRnopc.
t2RSB defined the operand correctly, but tRSBS didn't.

llvm-svn: 157200
2012-05-21 17:57:17 +00:00
Craig Topper e88f2fd4f7 Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
llvm-svn: 157175
2012-05-21 06:40:16 +00:00
Jakob Stoklund Olesen 38dcd598f9 Make the global base reg GR32_NOSP.
It can sometimes be used in addressing modes that don't support %ESP.

llvm-svn: 157165
2012-05-20 18:43:00 +00:00
Hal Finkel 601f555eee Add a missing PPC 64-bit stwu pattern.
This seems to fix the remaining compile-time failures on PPC64 when
compiling with -enable-ppc-preinc.

llvm-svn: 157159
2012-05-20 17:11:24 +00:00
Jakob Stoklund Olesen 691ae3388f Use the right register class for LDRrs.
llvm-svn: 157152
2012-05-20 06:38:47 +00:00
Jakob Stoklund Olesen 4fd0e4f415 Transfer memory operands to the right instruction.
They need to go on the PICLDR as the verifier points out.

llvm-svn: 157151
2012-05-20 06:38:42 +00:00
Hal Finkel 66b0c93553 Add a FIXME about access to negative stack-pointer offsets on PPC32.
The current code will generate a prologue which starts with something like:
        mflr 0
        stw 31, -4(1)
        stw 0, 4(1)
        stwu 1, -16(1)

But under the PPC32 SVR4 ABI, access to negative offsets from R1 is not allowed.

This was pointed out by Peter Bergner.

llvm-svn: 157133
2012-05-19 21:52:55 +00:00
Nadav Rotem c93e91da27 On Haswell, prefer storing YMM registers using a single instruction.
llvm-svn: 157129
2012-05-19 20:30:08 +00:00
Nadav Rotem 900c7cb7ce Add support for additional in-reg vbroadcast patterns
llvm-svn: 157127
2012-05-19 19:57:37 +00:00
Craig Topper 1964b6d39d Tidy up some spacing and inconsistent use of pre/post increment. No functional change intended.
llvm-svn: 157122
2012-05-19 19:14:18 +00:00
Stepan Dyatkovskiy 79a0d80d51 Ordinary PR1255 patch: DifferenceEngine and CPPBackend adapted to the new SwitchInst methods.
llvm-svn: 157112
2012-05-19 13:14:30 +00:00
Craig Topper 6166178573 Copy some AVX support from MCJIT to JIT. Maybe will fix PR12748.
llvm-svn: 157109
2012-05-19 08:28:17 +00:00
Eric Christopher bc5d24999c Add support for the 'd' mips inline asm output modifier.
Patch by Jack Carter.

llvm-svn: 157093
2012-05-19 00:51:56 +00:00
Jim Grosbach 4b63d2ae1d Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

llvm-svn: 157062
2012-05-18 19:12:01 +00:00
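
The directive grammar given in r157062 above is small enough to check with a few lines of code. The following is a minimal sketch, assuming the region type is a single optional token (the commit message calls it "an optional region type"); `parse_directive` and its return shape are hypothetical and unrelated to the actual LLVM assembly parser.

```python
import re

# Region types listed in the grammar.
REGION_TYPES = ("jt8", "jt16", "jt32", "jta32")

_DATA_REGION = re.compile(r"^\s*\.data_region(?:\s+(\w+))?\s*$")
_END_DATA_REGION = re.compile(r"^\s*\.end_data_region\s*$")


def parse_directive(line):
    """Classify one line of assembly text.

    Returns ("data_region", region_type_or_None),
    ("end_data_region", None), or None for any other line.
    Raises ValueError for an unknown region type.
    """
    match = _DATA_REGION.match(line)
    if match:
        region = match.group(1)
        if region is not None and region not in REGION_TYPES:
            raise ValueError(f"unknown region type: {region}")
        return ("data_region", region)
    if _END_DATA_REGION.match(line):
        return ("end_data_region", None)
    return None
```

For instance, `parse_directive(".data_region jt16")` returns `("data_region", "jt16")`, while a line such as `nop` returns `None`.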
Eric Christopher 9ca26cfb5f Add support for the mips 'x' inline asm modifier.
Patch by Jack Carter.

llvm-svn: 157057
2012-05-18 17:39:35 +00:00
Craig Topper 0cf4038c59 Simplify code a bit. No functional change intended.
llvm-svn: 157044
2012-05-18 07:07:36 +00:00
Craig Topper 92db928ee9 Simplify handling of v16i8 shuffles and fix a missed optimization.
llvm-svn: 157043
2012-05-18 06:42:06 +00:00
Kevin Enderby f1b225d0e0 Fix the encoding of the armv7m (MClass) for MSR APSR writes which was missing
the 0b10 mask encoding bits.  Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM has deprecated its use.  Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions.  Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025

llvm-svn: 157019
2012-05-17 22:18:01 +00:00
Tim Northover af501a29d3 Remove incorrect pattern for ARM SMML instruction.
Patch by Meador Inge.

llvm-svn: 156989
2012-05-17 13:12:13 +00:00
Akira Hatanaka 0faaebf27c This patch adds the register class for MIPS16 as well as the ability for
llc to recognize MIPS16 as a MIPS ASE extension. -mips16 will mean the
mips16 ASE for mips32 by default.

As part of adding this we discovered some small changes that
need to be made to MipsInstrInfo::storeRegToStackSlot and
MipsInstrInfo::loadRegFromStackSlot. We were using some "==" equality tests
where in fact we should have been using Mips::<regclass>.hasSubClassEq instead,
per suggestion of Jakob Stoklund Olesen.

Patch by Reed Kotler.

llvm-svn: 156958
2012-05-16 22:19:56 +00:00
Benjamin Kramer 7faf84f125 Hexagon: Remove unused command line option.
llvm-svn: 156917
2012-05-16 15:03:55 +00:00
Evan Cheng 58a95f0c8a Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
llvm-svn: 156896
2012-05-16 01:54:27 +00:00
Jim Grosbach c3b0427921 Allow MCCodeEmitter access to the target MCRegisterInfo.
Add the MCRegisterInfo to the factories and constructors.

Patch by Tom Stellard <Tom.Stellard@amd.com>.

llvm-svn: 156828
2012-05-15 17:35:52 +00:00
Akira Hatanaka cf434ee4c1 Temporarily disable anti-dependence breaking for Mips until bug 12829 is
resolved.

llvm-svn: 156801
2012-05-15 03:14:52 +00:00
Bill Wendling 8b5c0e4af2 Remove extraneous ';'.
llvm-svn: 156791
2012-05-15 00:41:56 +00:00
Akira Hatanaka 4773e67e0b Add a command line option to skip the delay slot filler pass entirely for Mips.
The purpose of this option is to silence error messages issued by machine
verifier passes and enable them to run to the end. If this option is not
provided, -verify-machineinstrs complains when it discovers there is a
non-terminator instruction (an instruction that is in a delay slot) after the
first terminator in a basic block.

llvm-svn: 156790
2012-05-14 23:59:17 +00:00
David Blaikie 81a84bd841 Fix use of uninitialized variable.
Found by GCC's maybe-uninitialized.

llvm-svn: 156780
2012-05-14 21:48:19 +00:00
Brendon Cahoon f6b687e5d1 Revert 156634 upon request until code improvement changes are made.
llvm-svn: 156775
2012-05-14 19:35:42 +00:00
Dan Gohman 164fe18cfe Rename @llvm.debugger to @llvm.debugtrap.
llvm-svn: 156774
2012-05-14 18:58:10 +00:00
Benjamin Kramer 0b03cbd416 Hexagon: Initialize TBB to 0.
Found by valgrind.

llvm-svn: 156744
2012-05-13 15:13:22 +00:00
Sirish Pande 8bb9745a5e Make sure new value jump is enabled for Hexagon V5 as well.
llvm-svn: 156700
2012-05-12 05:54:15 +00:00
Sirish Pande 4bd20c50eb Support for Hexagon feature, New Value Jump.
llvm-svn: 156698
2012-05-12 05:10:30 +00:00
Akira Hatanaka a6c3fd8317 Remove MipsEmitGPRestore.cpp.
llvm-svn: 156696
2012-05-12 03:24:03 +00:00
Akira Hatanaka 3ecc5273c1 Delete all functions that are no longer needed in MipsFunctionInfo, including
the ones that get or set the frame index for the $gp save slot. 

Remove the piece of code in MipsFunctionInfo::getGlobalBaseReg() which returns
GP. This function should always return a virtual register.

llvm-svn: 156695
2012-05-12 03:22:13 +00:00
Akira Hatanaka 2e31e036b6 Stop reserving register $gp. Do not call isGPFI to check whether a frame object
is the $gp save slot.

llvm-svn: 156694
2012-05-12 03:21:18 +00:00
Akira Hatanaka 0fb87feb39 Do not add the pass which restores $gp after every function call.
llvm-svn: 156693
2012-05-12 03:19:51 +00:00
Akira Hatanaka f542ebd958 Make the following changes in MipsISelLowering.cpp:
- Stop creating stack frame objects needed for saving $gp.
- Insert a node that copies the global pointer register to register $gp
  before the call node. This will ensure $gp is valid at the entry of the
  called function.

llvm-svn: 156692
2012-05-12 03:19:04 +00:00
Akira Hatanaka c980f8453a Make the following changes in MipsFrameLowering.cpp:
- Stop emitting instructions needed to initialize the global pointer register.
- Stop emitting .cprestore directive.
- Do not take into account the $gp save slot when computing stack size.

llvm-svn: 156691
2012-05-12 03:18:00 +00:00
Akira Hatanaka 8f3573034b Make the following changes in MipsAsmPrinter.cpp:
- Remove code which lowers pseudo SETGP01.
- Fix LowerSETGP01. The first two of the three instructions that are emitted to
  initialize the global pointer register now use register $2.
- Stop emitting .cpload directive.

llvm-svn: 156689
2012-05-12 00:48:43 +00:00
Akira Hatanaka d918f77ba3 Insert instructions into the entry basic block which initialize the global
pointer register.

This is the first of the series of patches which clean up the way global pointer
register is used. The patches will make the following improvements:

- Make $gp an allocatable temporary register rather than reserving it.
- Use a virtual register as the global pointer register and let the register
  allocator decide which register to assign to it or whether spill/reloads are
  needed.
- Make sure $gp is valid at the entry of a called function, which is necessary
  for functions using lazy binding.
- Remove the need for emitting .cprestore and .cpload directives.

llvm-svn: 156671
2012-05-12 00:17:17 +00:00
Akira Hatanaka 0661b81bca Do not replace operands of pseudo instructions with register $zero.
llvm-svn: 156663
2012-05-11 23:22:18 +00:00
Chad Rosier aa9cb9df59 [fast-isel] Add support for selecting @llvm.trap().
llvm-svn: 156646
2012-05-11 21:33:49 +00:00
Brendon Cahoon 5edcf8822d Updated instruction table due to added intrinsics.
llvm-svn: 156644
2012-05-11 21:10:16 +00:00
Sirish Pande 95d0117bb3 Remove warnings from HexagonVLIWPacketizer.
llvm-svn: 156636
2012-05-11 20:00:34 +00:00
Brendon Cahoon 31f8723ef3 Hexagon constant extender support.
Patch by Jyotsna Verma.

llvm-svn: 156634
2012-05-11 19:56:59 +00:00
Chad Rosier 06e34d9220 Typo.
llvm-svn: 156633
2012-05-11 19:43:29 +00:00
Chad Rosier 3268692aa8 [fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup.
llvm-svn: 156632
2012-05-11 19:40:25 +00:00
Sirish Pande 83ccb6ce08 Hexagon V5 intrinsics support.
llvm-svn: 156631
2012-05-11 19:39:13 +00:00
Chad Rosier 90f9afe659 [fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg
retval.  Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796

llvm-svn: 156628
2012-05-11 18:51:55 +00:00
Chad Rosier 519b12f927 [fast-isel] Rather than assert (or segfault in a non-asserts build), fall back
to selection DAG isel if we're unable to handle a non-double multi-reg retval.
rdar://11430407
PR12796

llvm-svn: 156622
2012-05-11 17:41:06 +00:00
Chad Rosier 466d3d8faa The return type is an unsigned, not a bool.
llvm-svn: 156621
2012-05-11 16:41:38 +00:00
Manman Ren 0d5ec28ccc Add space before an open parenthesis in control flow statements.
llvm-svn: 156620
2012-05-11 15:36:46 +00:00
Preston Gurd 09de6ae399 Added X86 Atom latencies to X86InstrMMX.td.
llvm-svn: 156615
2012-05-11 14:27:12 +00:00
Hans Wennborg f9d0e44b82 Implement initial-exec TLS model for 32-bit PIC x86
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).

llvm-svn: 156611
2012-05-11 10:11:01 +00:00
Silviu Baranga ddc67a7655 Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions.
llvm-svn: 156609
2012-05-11 09:28:27 +00:00
Silviu Baranga 5a719f9b9a Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handling the #-0 immediate.
llvm-svn: 156608
2012-05-11 09:10:54 +00:00
Akira Hatanaka e37614438f Fix a misleading comment.
llvm-svn: 156603
2012-05-11 01:45:15 +00:00
Manman Ren dc8ad0058f ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156599
2012-05-11 01:30:47 +00:00
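A minimal Python sketch of why the cmp-removal peephole in r156599 is sound, assuming a simplified model of the ARM N, Z, C and V flags and reading `sub r1, r3` as computing the difference r1 - r3. The helper names (`sub_flags`, `ge`, `le`) are hypothetical and are not LLVM code. `cmp r1, r3` performs the same subtraction purely for its flags, so `subs` yields identical flags as a by-product; when the compare has its operands swapped, the branch condition is swapped instead (`bge` becomes `ble`).

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def sub_flags(a, b):
    """Return (result, N, Z, C, V) for the 32-bit subtraction a - b.

    This models both `subs rd, a, b` and `cmp a, b`; cmp simply
    discards the result.
    """
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)  # ARM carry after subtraction means "no borrow"
    v = ((a ^ b) & (a ^ result)) >> 31  # signed overflow
    return result, n, z, c, v


def ge(n, v):
    """Condition code GE: signed greater than or equal."""
    return n == v


def le(n, z, v):
    """Condition code LE: signed less than or equal."""
    return z == 1 or n != v
```

In this model, `ge` applied to the flags of `cmp r3, r1` agrees with `le` applied to the flags of `subs r1, r3`, which is the condition swap shown in the commit message.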
Dan Gohman dfab443ae8 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.

llvm-svn: 156593
2012-05-11 00:19:32 +00:00
Preston Gurd 4fe10a5d9a Added X86 Atom latencies for instructions in X86InstrInfo.td.
llvm-svn: 156579
2012-05-10 21:58:35 +00:00
Eric Christopher ed51b9ec0b Add support for the 'X' inline asm operand modifier.
Patch by Jack Carter.

llvm-svn: 156577
2012-05-10 21:48:22 +00:00
Sirish Pande fc8118bf41 Hexagon V5 Support - V5 td file.
llvm-svn: 156569
2012-05-10 20:24:28 +00:00
Sirish Pande 69295b8963 Hexagon V5 FP Support.
llvm-svn: 156568
2012-05-10 20:20:25 +00:00
Manman Ren b555b382bd Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.

llvm-svn: 156556
2012-05-10 18:49:43 +00:00
Manman Ren c860887b2d ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156550
2012-05-10 16:48:21 +00:00
Nadav Rotem 1a65397017 Fix merge-typo and cleanup
llvm-svn: 156541
2012-05-10 12:50:02 +00:00
Nadav Rotem 15946e50c1 AVX2: Add an additional broadcast idiom.
llvm-svn: 156540
2012-05-10 12:39:13 +00:00
Nadav Rotem b86a3fb8d0 Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users.

Fix PR11900.

llvm-svn: 156539
2012-05-10 12:22:05 +00:00
Roman Divacky e07cc042f6 Mark .opd @progbits, thus avoiding a warning from asm.
llvm-svn: 156494
2012-05-09 18:24:23 +00:00
Akira Hatanaka ca41d13bbd Add another peephole pattern for conditional moves.
llvm-svn: 156460
2012-05-09 02:29:29 +00:00
Jakob Stoklund Olesen 7e21d617ef Use ptr_rc_tailcall instead of GR32_TC.
The getPointerRegClass() hook will return GR32_TC, or whatever is
appropriate for the current function.

Patch by Yiannis Tsiouris!

llvm-svn: 156459
2012-05-09 01:50:09 +00:00
Akira Hatanaka 05b9dad1e6 Make register FP allocatable if the compiled function does not have dynamic
allocas.

llvm-svn: 156458
2012-05-09 01:38:13 +00:00
Akira Hatanaka 0a8ab718cb Expand 64-bit shifts if target ABI is O32.
llvm-svn: 156457
2012-05-09 00:55:21 +00:00
Richard Trieu edf46e6b6e Remove unused variable to silence compiler warning.
llvm-svn: 156456
2012-05-09 00:30:21 +00:00
Jakob Stoklund Olesen 10191fd44f Use a shared function for a common operation.
llvm-svn: 156441
2012-05-08 23:27:30 +00:00
Eric Christopher d666bb0dd8 Remove excess semi-colons to quiet warnings.
llvm-svn: 156416
2012-05-08 20:45:04 +00:00
Sirish Pande 1c9f7dbc10 Update load/store instruction patterns in Hexagon V4.
llvm-svn: 156411
2012-05-08 19:50:20 +00:00
Akira Hatanaka c515bfb9e7 Define mips16 instruction formats.
Patch by Reed Kotler.

llvm-svn: 156408
2012-05-08 19:08:58 +00:00
Jakob Stoklund Olesen 276ae14023 s/CSR_Ghc/CSR_NoRegs/
Share the CalleeSavedRegs defs between all calling conventions having no
callee-saved registers.

Patch by Yiannis Tsiouris!

llvm-svn: 156382
2012-05-08 15:07:29 +00:00
Craig Topper 7daf897678 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Jakob Stoklund Olesen 3c52f0281f Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).

So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.

Patch by Yiannis Tsiouris!

llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Jakob Stoklund Olesen c4b3a7a1d7 Fix bug in TRI::getCommonSuperRegClass().
Test cases for this code are coming. It is not used for anything yet.

llvm-svn: 156327
2012-05-07 21:59:31 +00:00
Jakob Stoklund Olesen 65a6dafc8d Add TRI::getCommonSuperRegClass().
This function is a generalization of getMatchingSuperRegClass() to the
symmetric case where both sides are using a sub-register index. It will
find a super-register class and sub-register indexes that make this
diagram commute:

                                   PreA
                       SuperRC  ---------->  RCA

                          |                   |
                          |                   |
                     PreB |                   | SubA
                          |                   |
                          |                   |
                          V                   V

                         RCB    ----------> SubRC
                                   SubB

This can be used to coalesce copies like:

  %vreg1:sub16 = COPY %vreg2:sub16; GR64:%vreg1, GR32: %vreg2

llvm-svn: 156317
2012-05-07 19:14:58 +00:00
Chad Rosier d8287fec17 Fix a regression from r147481. This combine should only happen if there is a
single use.
rdar://11360370

llvm-svn: 156316
2012-05-07 18:47:44 +00:00
Manman Ren ef4e0479ec X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax, %eax

In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;

rdar: 10961709
llvm-svn: 156312
2012-05-07 18:06:23 +00:00
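The equivalence behind r156312 can be checked with a small Python model of the two instruction sequences. This is only a sketch under the assumption of 32-bit registers; the function names are hypothetical. It relies on two x86 facts: `negl` sets the carry flag exactly when its operand is non-zero, and `sbbl %eax, %eax` computes `eax - eax - CF`, which is 0 or all ones.

```python
MASK = 0xFFFFFFFF


def sbb_self(cf):
    """Model `sbbl %eax, %eax`: eax - eax - CF, i.e. 0 or 0xFFFFFFFF."""
    return (0 - cf) & MASK


def old_sequence(x):
    """cmpl $1, %edi ; sbbl %eax, %eax ; notl %eax"""
    cf = int((x & MASK) < 1)  # borrow from x - 1 happens only when x == 0
    return ~sbb_self(cf) & MASK


def new_sequence(x):
    """negl %edi ; sbbl %eax, %eax"""
    cf = int((x & MASK) != 0)  # negl sets CF unless the operand is zero
    return sbb_self(cf)
```

Both functions return 0 for a zero input and 0xFFFFFFFF (that is, -1) otherwise, matching `-(x != 0)`.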
Eric Christopher 0d8c15d20f Add support for the 'x' constraint.
Patch by Jack Carter.

llvm-svn: 156295
2012-05-07 06:25:19 +00:00
Eric Christopher 9c492e6ebf Add support for the 'l' constraint.
Patch by Jack Carter.

llvm-svn: 156294
2012-05-07 06:25:15 +00:00
Eric Christopher e3c494de82 Add support for the 'c' constraint.
Patch by Jack Carter.

llvm-svn: 156293
2012-05-07 06:25:10 +00:00
Eric Christopher c18ae4a3b1 Add support for the 'P' constraint.
Patch by Jack Carter.

llvm-svn: 156292
2012-05-07 06:25:02 +00:00
Craig Topper dbb98b4917 Fix some issues in the f16c instructions.
llvm-svn: 156287
2012-05-07 06:00:15 +00:00
Eric Christopher 470578a91b Add support for the 'O' constraint.
Patch by Jack Carter.

llvm-svn: 156285
2012-05-07 05:46:48 +00:00
Eric Christopher e07aa430b8 Add support for the 'N' inline asm constraint.
Patch by Jack Carter.

llvm-svn: 156284
2012-05-07 05:46:43 +00:00
Eric Christopher 1109b3406d Add support for the 'L' inline asm constraint.
Patch by Jack Carter.

llvm-svn: 156283
2012-05-07 05:46:37 +00:00
Eric Christopher 3ff88a05b7 Add support for the inline asm constraint 'K'.
llvm-svn: 156282
2012-05-07 05:46:29 +00:00
Craig Topper d4e1894ec1 Add SSE4A MOVNTSS/MOVNTSD instructions.
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Eric Christopher 7201e1b4b9 Support the 'J' constraint.
Patch by Jack Carter.

llvm-svn: 156280
2012-05-07 03:13:42 +00:00
Eric Christopher 1d6c89eea1 Add support for the 'I' inline asm constraint. Also add tests
from the previous 2 patches.

Patch by Jack Carter.

llvm-svn: 156279
2012-05-07 03:13:32 +00:00
Eric Christopher 58daf04681 Allow 64 bit integer values in gpr registers if arch and abi are 64 bit.
Patch by Jack Carter.

llvm-svn: 156278
2012-05-07 03:13:22 +00:00
Eric Christopher cfcd77b0bc When using inline asm constraints representing
non-floating point general registers allow 8 and 16-bit
elements.

Patch by Jack Carter.

llvm-svn: 156277
2012-05-07 03:13:16 +00:00
Craig Topper 00a1e6d48b Use MVT instead of EVT as the argument to all the shuffle decode functions. Simplify some of the decode functions.
llvm-svn: 156268
2012-05-06 19:46:21 +00:00
Craig Topper 804be3b546 Add VPERMQ/VPERMPD to the list of target specific shuffles that can be looked through for DAG combine purposes.
llvm-svn: 156266
2012-05-06 18:54:26 +00:00
Craig Topper 54bdb350e2 Add shuffle decode support for VPERMQ/VPERMPD.
llvm-svn: 156265
2012-05-06 18:44:02 +00:00
Jim Grosbach 7ce129268e Nuke a few dead remnants of the CBE.
llvm-svn: 156241
2012-05-05 17:45:12 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Benjamin Kramer a25a61b9e8 NVPTX: Initialize the UseF32FTZ flag.
llvm-svn: 156232
2012-05-05 11:22:02 +00:00
Eric Christopher de9e92ed9b Typo.
llvm-svn: 156226
2012-05-05 01:16:06 +00:00
David Blaikie 891d0a3d20 Fix warnings in release build.
This fixes a couple of Clang warnings in release builds of LLVM:

* Missing return in ISelLowering
* Unused variable in NVPTXutil.cpp

llvm-svn: 156216
2012-05-04 22:34:16 +00:00
Kevin Enderby cabbae653e Tweak to the fix in r156212, as with the change in removing the shift the
SignExtend32<22>(Val<<1) also needs to change to SignExtend32<21>(Val) .

llvm-svn: 156213
2012-05-04 22:09:52 +00:00
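To illustrate the relationship between the two expressions mentioned in r156213, here is a small Python stand-in for a sign-extension helper. The name `sign_extend32` mirrors the `SignExtend32<B>` spelling in the commit message, but the body below is an assumption written for illustration, not the LLVM implementation. Sign-extending the shifted value from 22 bits gives exactly twice the result of sign-extending the unshifted value from 21 bits, so dropping the shift requires dropping one bit of width as well.

```python
def sign_extend32(value, bits):
    """Sign-extend the low `bits` bits of `value` to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    # Flipping the sign bit and subtracting it back maps the upper half
    # of the range onto negative numbers.
    return (value ^ sign) - sign


def shifted_form(val):
    """The form used before the shift was removed."""
    return sign_extend32(val << 1, 22)


def unshifted_form(val):
    """The form used once the displacement is no longer shifted."""
    return sign_extend32(val, 21)
```

For any 21-bit field value, `shifted_form(val)` equals `2 * unshifted_form(val)`.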
Kevin Enderby 8ce1ada1be Fix a bug in the ARM disassembler for wide branch conditional instructions
where the symbolic operand's displacement was incorrectly shifted left by 1.
rdar://11387046

llvm-svn: 156212
2012-05-04 22:02:27 +00:00
Chandler Carruth cd3464ee22 Fix a Clang warning in the new NVPTX backend:
In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53:
../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
  default: assert(0 && "Unknown condition code");
  ^
1 warning generated.

The prevailing pattern in LLVM is to not use a default label, and instead to
use llvm_unreachable to denote that the switch in fact covers all return paths
from the function.

llvm-svn: 156209
2012-05-04 21:35:49 +00:00
Justin Holewinski ae556d3ef7 This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
The new target machines are:

nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX

The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.

NV_CONTRIB

llvm-svn: 156196
2012-05-04 20:18:50 +00:00
Sebastian Pop 2420e8b7d5 Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions.
llvm-svn: 156195
2012-05-04 19:53:56 +00:00
Preston Gurd d6c440cd4c Adds Intel Atom scheduling latencies to X86InstrSystem.td.
llvm-svn: 156194
2012-05-04 19:26:37 +00:00
Matt Beaumont-Gay e82ab6baa7 Pacify GCC's -Wreturn-type
llvm-svn: 156189
2012-05-04 18:34:27 +00:00
Hans Wennborg aea412008e Make ARM and Mips use TargetMachine::getTLSModel()
This moves the logic for selecting a TLS model to a single place,
instead of the previous three (ARM, Mips, and X86 which already
uses this function).

llvm-svn: 156162
2012-05-04 09:40:39 +00:00
Craig Topper bdd2e34b1f Fix some loops to match coding standards. No functional change intended.
llvm-svn: 156159
2012-05-04 06:39:13 +00:00
Craig Topper d4d3237bb8 Fix up some spacing. No functional change.
llvm-svn: 156158
2012-05-04 06:18:33 +00:00
Craig Topper e2ae413746 Simplify broadcast lowering code. No functional change intended.
llvm-svn: 156157
2012-05-04 05:49:51 +00:00
Craig Topper 42f2182366 Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
llvm-svn: 156156
2012-05-04 04:44:49 +00:00
Craig Topper 59063c0a3d Simplify shuffle narrowing code a bit. No functional change intended.
llvm-svn: 156154
2012-05-04 04:08:44 +00:00
Jakob Stoklund Olesen 796e5272ab Remove the SubRegClasses field from RegisterClass descriptions.
This information is now computed by TableGen.

llvm-svn: 156152
2012-05-04 03:30:34 +00:00
Jakob Stoklund Olesen 34a8f13e5f Initialize SparcInstrInfo before SparcTargetLowering.
The TargetLowering construction needs to use a valid TargetRegisterInfo
instance.

llvm-svn: 156146
2012-05-04 02:16:39 +00:00
Jakob Stoklund Olesen 57c7050675 Add a SuperRegClassIterator class.
This iterator class provides a more abstract interface to the (Idx,
Mask) lists of super-registers for a register class. The layout of the
tables shouldn't be exposed to clients.

llvm-svn: 156144
2012-05-04 01:48:29 +00:00
Jakob Stoklund Olesen 2f460ae3b4 Use a shared implementation of getMatchingSuperRegClass().
TargetRegisterClass now gives access to the necessary tables.

llvm-svn: 156122
2012-05-03 22:49:04 +00:00
Kevin Enderby 914223010c Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits
for the assembler and disassembler, which were not being set/read correctly
for offsets greater than 22 bits in some cases.

Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!

llvm-svn: 156118
2012-05-03 22:41:56 +00:00
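As background for r156118, the following Python sketch shows how a Thumb-2 `bl` branch offset is split into the S, J1, J2, imm10 and imm11 fields, following the relation I1 = NOT(J1 XOR S), I2 = NOT(J2 XOR S) from the ARM architecture manual. The function names are hypothetical and this is not the code in ARMAsmBackend.cpp; it only models the field arithmetic. Offsets that need more than 22 bits are the ones where J1 and J2 stop being plain "all ones" filler and start carrying information.

```python
def encode_bl_offset(offset):
    """Split a signed, even byte offset into (S, J1, J2, imm10, imm11)."""
    assert offset % 2 == 0 and -(1 << 24) <= offset < (1 << 24)
    imm = (offset >> 1) & 0xFFFFFF  # 24-bit field S:I1:I2:imm10:imm11
    s = (imm >> 23) & 1
    i1 = (imm >> 22) & 1
    i2 = (imm >> 21) & 1
    imm10 = (imm >> 11) & 0x3FF
    imm11 = imm & 0x7FF
    j1 = ~(i1 ^ s) & 1  # inverse of I1 = NOT(J1 XOR S)
    j2 = ~(i2 ^ s) & 1  # inverse of I2 = NOT(J2 XOR S)
    return s, j1, j2, imm10, imm11


def decode_bl_offset(s, j1, j2, imm10, imm11):
    """Rebuild the signed byte offset from the encoded fields."""
    i1 = ~(j1 ^ s) & 1
    i2 = ~(j2 ^ s) & 1
    imm = (s << 23) | (i1 << 22) | (i2 << 21) | (imm10 << 11) | imm11
    value = imm << 1  # 25-bit value, bit 24 is the sign
    return value - (1 << 25) if s else value
```

For small positive offsets both J bits come out as 1; an offset of `1 << 22` is the first positive one that clears J2.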
Sirish Pande f8e5e3c072 Support for target dependent Hexagon VLIW packetizer.
This patch creates and optimizes packets as per Hexagon ISA rules.

llvm-svn: 156109
2012-05-03 21:52:53 +00:00
Silviu Baranga 9560af848c Fixed disassembler for vstm/vldm ARM VFP instructions.
llvm-svn: 156077
2012-05-03 16:38:40 +00:00
Sirish Pande c92c31674e Extensions of Hexagon V4 instructions.
This adds new instructions for Hexagon V4 architecture.

llvm-svn: 156071
2012-05-03 16:18:50 +00:00
Craig Topper 242183834a Use 'unsigned' instead of 'int' in a few places dealing with counts of vector elements.
llvm-svn: 156060
2012-05-03 07:26:59 +00:00
Craig Topper 315a5cc789 Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982.
llvm-svn: 156059
2012-05-03 07:12:59 +00:00
Andrew Trick 32aea358e1 Added TargetRegisterInfo::getAllocatableClass.
This ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.

This ensures that targets define register classes as intended.

llvm-svn: 156046
2012-05-03 01:14:37 +00:00
Preston Gurd 926afd7401 For Intel Atom, use ILP scheduling always, instead of ILP for 64 bit
and Hybrid for 32 bit, since benchmarks show ILP scheduling is better
most of the time.

llvm-svn: 156028
2012-05-02 22:02:02 +00:00
Preston Gurd c0b976c42a Change the Intel Atom detection code to recognize
Lincroft and Medfield.

llvm-svn: 156025
2012-05-02 21:38:46 +00:00
Jim Grosbach 28b0b7279e ARM: Add missing two-operand VBIC aliases.
llvm-svn: 156019
2012-05-02 21:11:56 +00:00
Preston Gurd fa3f6cb830 This patch continues the work of adding instruction latencies for X86 Atom,
by providing the latencies for the instructions in X86InstrFPStack.td.

llvm-svn: 155996
2012-05-02 16:03:35 +00:00
Manman Ren f02efc8731 Revert r155853
The commit is intended to fix rdar://10961709.
But it is the root cause of PR12720.
Revert it for now.

llvm-svn: 155992
2012-05-02 15:24:32 +00:00
Richard Barton 0fc56890ba Disallow YIELD and other allocated nop hints in pre-ARMv6 architectures.
llvm-svn: 155983
2012-05-02 09:43:18 +00:00
Craig Topper c73bc39c22 Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter.
llvm-svn: 155982
2012-05-02 08:03:44 +00:00
Jakub Staszak 6126401c83 Remove unneeded break.
llvm-svn: 155959
2012-05-01 23:08:16 +00:00
Jakub Staszak 339380286b Remove trailing spaces.
llvm-svn: 155956
2012-05-01 23:04:38 +00:00
Jim Grosbach 1d20efb837 ARM: Add a few missing add->sub aliases w/ 'w' suffix.
Aliases for adding a negative immediate when using an explicit 'w'
suffix. E.g.,
        adds.w r2, #-16
        adds.w r2, r2, #-16
        addw r2, #-16
        addw r2, #-16
        addw r2, r2, #-16

rdar://11330769

llvm-svn: 155946
2012-05-01 21:17:34 +00:00
Jim Grosbach 70bed4faaf ARM: allow vanilla expressions for movw/movt.
Expressions for movw/movt don't always have an :upper16: or :lower16:
on them and that's ok. When they don't, it's just a plain [0-65535]
immediate result, effectively the same as a :lower16: variant kind.

rdar://10550147

llvm-svn: 155941
2012-05-01 20:43:21 +00:00
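A short Python sketch of the `:lower16:` and `:upper16:` operators referred to in r155941, assuming the usual meaning of selecting the low and high halfwords of a 32-bit value. The helper names are hypothetical. It also shows why a plain immediate that already fits in 16 bits behaves like the `:lower16:` variant: masking leaves it unchanged.

```python
def lower16(value):
    """Model the :lower16: operator: bits 0..15 of a 32-bit value."""
    return value & 0xFFFF


def upper16(value):
    """Model the :upper16: operator: bits 16..31 of a 32-bit value."""
    return (value >> 16) & 0xFFFF


def movw_movt(value):
    """Model `movw rd, #:lower16:value` followed by `movt rd, #:upper16:value`."""
    rd = lower16(value)                           # movw zero-extends its immediate
    rd = (rd & 0xFFFF) | (upper16(value) << 16)   # movt replaces only the top half
    return rd


def plain_immediate(value):
    """A vanilla expression: accepted as-is when it fits in 16 bits."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    return value
```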
Preston Gurd 5ae5278ca1 This patch marks the X86 floating point stack registers ST0-ST7 as reserved
in order to avoid assertion failures in the register scavenger. The assertion
failures were “Bad machine code: Using an undefined physical register” and
“Bad machine code: MBB exits via unconditional fall-through but its successor
differs from its CFG successor!”.

llvm-svn: 155930
2012-05-01 19:50:22 +00:00
Manman Ren 425a55c1ce X86: optimization for max-like struct
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0

FROM
movl    %edi, %ecx
subl    %esi, %ecx
cmpl    %edi, %esi
movl    $0, %eax
cmovll  %ecx, %eax
TO
xorl    %eax, %eax
subl    %esi, %edi
cmovll  %eax, %edi
movl    %edi, %eax

rdar: 10734411
llvm-svn: 155919
2012-05-01 17:16:15 +00:00
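The rewrite in r155919 can be sanity-checked with a Python model of the optimized sequence, assuming 32-bit registers with `a` in `%edi` and `b` in `%esi`. The function names are hypothetical. The point is that `subl` already produces the sign and overflow flags that the separate `cmpl` used to compute, so `cmovll` can test them directly.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def reference(a, b):
    """(a > b) ? (a - b) : 0 with wrapping 32-bit subtraction."""
    return (a - b) & MASK if to_signed(a) > to_signed(b) else 0


def optimized(a, b):
    """xorl %eax,%eax ; subl %esi,%edi ; cmovll %eax,%edi ; movl %edi,%eax"""
    a &= MASK
    b &= MASK
    eax = 0
    edi = (a - b) & MASK
    sf = edi >> 31
    of = ((a ^ b) & (a ^ edi)) >> 31
    if sf != of:  # condition 'l': signed a < b
        edi = eax
    return edi
```

The `a == b` case needs no special handling because the difference is already zero.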
Alexey Samsonov c4b3ad8195 X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment
llvm-svn: 155917
2012-05-01 15:16:06 +00:00
Benjamin Kramer cb3e98cf44 Move MipsDisassembler classes into an anonymous namespace.
llvm-svn: 155915
2012-05-01 14:34:24 +00:00
Benjamin Kramer 512c1dce8f Value-initialize global to avoid global construction.
llvm-svn: 155909
2012-05-01 10:48:02 +00:00
Bill Wendling b12f16e75f Change the PassManager from a reference to a pointer.
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468

llvm-svn: 155902
2012-05-01 08:27:43 +00:00
Craig Topper 05eb6e096a Allow BMI, AES, F16C, POPCNT, FMA3, and CLMUL to be detected on AMD processors.
llvm-svn: 155899
2012-05-01 07:10:32 +00:00
Craig Topper bae0e9ea1d Make XOP and FMA4 require SSE4A to match GCC behavior. Use this to simplify Bulldozer feature list.
llvm-svn: 155897
2012-05-01 06:54:48 +00:00
Craig Topper d32ebcc36b Attempt to handle MRMInitReg in emitVEXOpcodePrefix. Hopefully fixes PR12711.
llvm-svn: 155896
2012-05-01 06:34:01 +00:00
Craig Topper 43518cc55f Make XOP imply AVX as it's needed to legalize the registers types.
llvm-svn: 155891
2012-05-01 05:41:41 +00:00
Craig Topper c0cef32b83 Remove HasSSE2 from AES and CLMUL predicates. It's now implied by the HasAES and HasCLMUL predicates.
llvm-svn: 155890
2012-05-01 05:35:02 +00:00
Craig Topper 29dd148a71 Make CLMUL and AES imply SSE2 since it's needed to legalize the type.
llvm-svn: 155888
2012-05-01 05:28:32 +00:00
Craig Topper 0eacda5f69 Enable AVX and FMA4 for AMD Bulldozer processors.
llvm-svn: 155885
2012-05-01 05:18:13 +00:00
Manman Ren 4f4d5c8fc8 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax, %eax

llvm-svn: 155853
2012-04-30 22:51:25 +00:00
Jim Grosbach e78031a9f3 ARM: Diagnostics for out of range fixups.
Replace some assert() calls w/ actual diagnostics. In a perfect world,
there'd be range checks on these values long before things ever reached
this code. For now, though, issuing a better-late-than-never diagnostic
is still a big improvement over assert().

rdar://11347287

llvm-svn: 155851
2012-04-30 22:30:43 +00:00
Jakob Stoklund Olesen 8503ba984f Fix address calculation error from r155744.
This was exposed by SingleSource/UnitTests/Vector/constpool.c.

The computed size of a basic block isn't always a multiple of its known
alignment, and that can introduce extra alignment padding after the
block.

<rdar://problem/11347135>

llvm-svn: 155845
2012-04-30 20:19:00 +00:00
Chad Rosier d427d51c2b Tidy up. No functional change intended.
llvm-svn: 155832
2012-04-30 17:47:15 +00:00
Derek Schuff b051adf263 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention in
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

(this time, actually commit what was reviewed!)

llvm-svn: 155825
2012-04-30 16:57:15 +00:00
Bob Wilson 9245c93656 Don't introduce illegal types when creating vmull operations. <rdar://11324364>
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal.  Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.

llvm-svn: 155824
2012-04-30 16:53:34 +00:00
Craig Topper 55b3990837 No need to normalize index before calling Extract128BitVector
llvm-svn: 155811
2012-04-30 05:17:10 +00:00
Pete Cooper f76b5fe5ab Copied all the VEX prefix encoding code from X86MCCodeEmitter to the x86 JIT emitter. Needs some major refactoring as these two code emitters are almost identical
llvm-svn: 155810
2012-04-30 03:56:44 +00:00
Jakub Staszak da03f3ba64 Remove unneeded casts. No functionality change.
llvm-svn: 155800
2012-04-29 20:52:53 +00:00
Craig Topper 3b94fa63d6 Simplify code a bit. No functional change intended.
llvm-svn: 155798
2012-04-29 20:22:05 +00:00
Kalle Raiskila 4c5f83ea19 Update the documentation of CellSPU, in case it gets removed in 3.1.
llvm-svn: 155797
2012-04-29 20:00:55 +00:00
Jakob Stoklund Olesen ae7521d1e4 Fix a problem with blocks that need to be split twice.
The code could search past the end of the basic block when there was
already a constant pool entry after the block.

Test case with giant basic block in SingleSource/UnitTests/Vector/constpool.c

llvm-svn: 155753
2012-04-28 06:21:38 +00:00
Jim Grosbach c6f32b3295 ARM: Thumb add(sp plus register) asm constraints.
Make sure when parsing the Thumb1 sp+register ADD instruction that
the source and destination operands match. In thumb2, just use the
wide encoding if they don't. In Thumb1, issue a diagnostic.

rdar://11219154

llvm-svn: 155748
2012-04-27 23:51:36 +00:00
Jim Grosbach 9d8f6f3d9d ARM: Tweak tADDrSP definition for consistent operand order.
Make the operand order of the instruction match that of the asm syntax.

llvm-svn: 155747
2012-04-27 23:51:33 +00:00
Derek Schuff a99b168145 Revert r155745
llvm-svn: 155746
2012-04-27 23:37:41 +00:00
Derek Schuff bbf8b83e90 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention in
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

llvm-svn: 155745
2012-04-27 23:27:17 +00:00
Jakob Stoklund Olesen 5f0d1b462c Track worst case alignment padding more accurately.
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:

  RoundUpToAlignment(Offset + UnknownPadding)

This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the worst case alignment padding
was already included.

This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.

The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.

<rdar://problem/11339352>

llvm-svn: 155744
2012-04-27 22:58:38 +00:00
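The double counting described in r155744 can be reproduced with a toy Python model. This is a simplified sketch, not the ARMConstantIslandPass code: the function names, the 4-byte alignment and the 2 bytes of unknown padding are all assumptions chosen for illustration. It compares the old estimate, which adds worst-case padding and then rounds up, with the new one, which adds worst-case padding only.

```python
def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def old_estimate(offset, unknown_padding, align):
    """Old scheme: worst-case padding plus rounding up."""
    return round_up(offset + unknown_padding, align)


def new_estimate(offset, unknown_padding):
    """New scheme: worst-case padding only; the result may be unaligned."""
    return offset + unknown_padding


def distance(user_addr, block_addr):
    """Estimated distance from a user instruction to an aligned block."""
    return block_addr - user_addr


# A user instruction at offset 4 refers to an aligned block at raw offset 6.
# Shrinking an earlier instruction by 2 bytes moves both back by 2.
before_old = distance(4, old_estimate(6, 2, 4))
after_old = distance(2, old_estimate(4, 2, 4))
before_new = distance(4, new_estimate(6, 2))
after_new = distance(2, new_estimate(4, 2))
```

In this model the old estimate grows from 4 to 6 bytes after the shrink, while the new estimate stays at 4, which is the property the commit message relies on.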
Craig Topper 0fa6c7e593 Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements.
llvm-svn: 155742
2012-04-27 22:54:43 +00:00
Chad Rosier 32c2178ef3 Add x86-specific DAG combine to simplify:
x == -y --> x+y == 0
x != -y --> x+y != 0

On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je    .LBB0_2
to
   addl    %esi, %edi
   je    .L4

This case is correctly handled for ARM with "cmn".

Patch by Manman Ren.
rdar://11245199
PR12545

llvm-svn: 155739
2012-04-27 22:33:25 +00:00
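The DAG combine in r155739 rests on a two's-complement identity that a few lines of Python can confirm. This is a sketch assuming 32-bit wrapping arithmetic; the function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def eq_neg(x, y):
    """x == -y as computed by `negl ; cmpl ; je` on 32-bit registers."""
    return (x & MASK) == ((-y) & MASK)


def add_is_zero(x, y):
    """x + y == 0 as computed by `addl ; je`, which tests the zero flag."""
    return ((x + y) & MASK) == 0
```

The identity holds even for 0x80000000, whose negation wraps back to itself.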
Craig Topper 42cd8d2c00 Tidy up spacing.
llvm-svn: 155733
2012-04-27 21:05:09 +00:00
Lang Hames ea001225c1 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.

llvm-svn: 155724
2012-04-27 18:51:24 +00:00
Richard Barton 82f95ea2ad Fix ARM assembly parsing for upper case condition codes on IT instructions.
llvm-svn: 155720
2012-04-27 17:34:01 +00:00
Benjamin Kramer 913da4b261 X86: Don't emit conditional floating point moves when targeting pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.

Fixes PR6679. Patch by Christoph Erhardt!

llvm-svn: 155704
2012-04-27 12:07:43 +00:00
Richard Barton f435b09eaf Refactor IT handling not to store the bottom bit of the condition code in the mask operand in the MCInst.
llvm-svn: 155700
2012-04-27 08:42:59 +00:00
Evan Cheng 1ec87ee096 Implement a bastardized ABI.
llvm-svn: 155686
2012-04-27 02:11:10 +00:00
Evan Cheng f52003de56 - thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't support 32-bit Thumb2
instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541

llvm-svn: 155685
2012-04-27 01:27:19 +00:00
Jim Grosbach 3d6c629e26 ARM: Thumb ldr(literal) base address alignment is 32-bits.
The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.

rdar://11314619

llvm-svn: 155659
2012-04-26 20:48:12 +00:00
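A small Python sketch of the address computation described in r155659, assuming the architectural rule that PC reads as the instruction address plus 4 in Thumb state. The function name is hypothetical. Because the base is Align(PC, 4), two 16-bit instructions that share a word see the same base address.

```python
def thumb_ldr_literal_address(instr_addr, imm):
    """Effective address of a Thumb `ldr rt, [pc, #imm]` located at instr_addr."""
    pc = instr_addr + 4  # value of PC as seen by the instruction in Thumb state
    base = pc & ~3       # Align(PC, 4): round down to a word boundary
    return base + imm
```

For example, loads at 0x1000 and 0x1002 with the same immediate resolve to the same literal address.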
Preston Gurd 81290f4be5 Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when llvm auto detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655
2012-04-26 19:52:27 +00:00
Tim Northover 3de97b7a86 Use VLD1 in NEON extending-load patterns instead of VLDR.
On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.

llvm-svn: 155630
2012-04-26 08:46:29 +00:00
Tim Northover 6699a60b0e Test commit.
llvm-svn: 155626
2012-04-26 08:24:07 +00:00
Craig Topper 08ccfbe57b Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.
llvm-svn: 155618
2012-04-26 06:40:15 +00:00
Evan Cheng 9f7ad310b5 If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.

rdar://11318438

llvm-svn: 155601
2012-04-26 01:13:36 +00:00
Richard Barton ba5b0cc82e Unify internal representation of ARM instructions with a register right-shifted by #32. These are stored as shifts by #0 in the MCInst and correctly marshalled when transforming from or to assembly representation.
llvm-svn: 155565
2012-04-25 18:00:18 +00:00
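The marshalling described in r155565 amounts to a pair of inverse mappings, sketched here in Python. The function names are hypothetical and the valid ranges are assumptions based on the ARM convention that an immediate right shift (LSR or ASR) by 32 is encoded as a shift amount of 0; this is not the LLVM code.

```python
def to_stored_shift(asm_amount):
    """Assembly right-shift amount (1..32) -> internally stored amount."""
    assert 1 <= asm_amount <= 32
    return 0 if asm_amount == 32 else asm_amount


def to_asm_shift(stored_amount):
    """Internally stored amount (0..31) -> assembly right-shift amount."""
    assert 0 <= stored_amount <= 31
    return 32 if stored_amount == 0 else stored_amount
```

Keeping a single internal representation means the parser and the printer only need to agree on these two conversions.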
Craig Topper 3ec7c2aa84 Add ifdef around getSubtargetFeatureName in tablegen output file so that only targets that want the function get it. This prevents other targets from getting an unused function warning.
llvm-svn: 155538
2012-04-25 06:56:34 +00:00
Craig Topper 5ff6dc34b9 Use vector_shuffles instead of target specific unpack nodes for AVX ZERO_EXTEND/ANY_EXTEND combine. These will be converted to target specific nodes during lowering. This is more consistent with other code.
llvm-svn: 155537
2012-04-25 06:39:39 +00:00
Akira Hatanaka 2020e27d6d Do not use $gp as a dedicated global register if the target ABI is not O32.
llvm-svn: 155522
2012-04-25 01:24:52 +00:00
Jim Grosbach 5117ef7453 ARM: improved assembler diagnostics for missing CPU features.
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.

rdar://11257547

llvm-svn: 155499
2012-04-24 22:40:08 +00:00
Jim Grosbach 1e75fc1fe1 ARM: Nuke remnant bogus code.
r154362 was supposed to delete this bit, but obviously didn't.

rdar://11305594

llvm-svn: 155465
2012-04-24 18:39:47 +00:00
Nadav Rotem 810734b7f4 AVX: Add additional vbroadcast replacement sequences for integers.
Remove the v2f64 patterns because it does not match any vbroadcast
instruction.

llvm-svn: 155461
2012-04-24 18:09:59 +00:00
Nadav Rotem 7b7b99c74a AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8
immediate. We can't use it here because the shuffle code does not check that
the lower part of the word is identical to the upper part.

llvm-svn: 155440
2012-04-24 11:27:53 +00:00
Richard Barton e9600009e9 Refactor Thumb ITState handling in ARM Disassembler to more efficiently use its vector
llvm-svn: 155439
2012-04-24 11:13:20 +00:00
Nadav Rotem aa3ff8da00 AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions
using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern new users are added to the load node, which prevent the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).

llvm-svn: 155437
2012-04-24 11:07:03 +00:00
Craig Topper 0b65c40821 Remove dangling spaces. Fix some other formatting.
llvm-svn: 155429
2012-04-24 06:36:35 +00:00
Craig Topper 6f2a535de2 Simplify code a bit and make it compile better. Remove unused parameters.
llvm-svn: 155428
2012-04-24 06:02:29 +00:00
Jim Grosbach 671ad2a572 Tidy up. 80 columns, whitespace, et al.
llvm-svn: 155399
2012-04-23 22:04:10 +00:00
Nadav Rotem 3f8acfc3c4 Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics).
llvm-svn: 155397
2012-04-23 21:53:37 +00:00
Preston Gurd 9a0914753a This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395
2012-04-23 21:39:35 +00:00
Jim Grosbach 41e94d79be ARM: VSLI two-operand assembly aliases are tblgen'erated.
llvm-svn: 155393
2012-04-23 21:22:04 +00:00
Jim Grosbach 3dada484c3 ARM: tblgen'erate VSRA/VRSRA/VSRI assembly two-operand aliases.
llvm-svn: 155392
2012-04-23 21:00:49 +00:00
Jim Grosbach e5012fbad3 ARM: vqdmulh two-operand aliases are tblgen'erated now.
llvm-svn: 155387
2012-04-23 20:37:20 +00:00
Chandler Carruth 3c3bb55a85 Revert r155365, r155366, and r155367. All three of these have regression
test suite failures. The failures occur at each stage, and only get
worse, so I'm reverting all of them.

Please resubmit these patches, one at a time, after verifying that the
regression test suite passes. Never submit a patch without running the
regression test suite.

llvm-svn: 155372
2012-04-23 18:25:57 +00:00
Sirish Pande a3f8ba2439 Hexagon V5 (floating point) support.
llvm-svn: 155367
2012-04-23 17:49:40 +00:00
Sirish Pande 2c7bf00fba Support for Hexagon architectural feature, new value jump.
llvm-svn: 155366
2012-04-23 17:49:28 +00:00
Sirish Pande 6cd2251598 Support for Hexagon VLIW Packetizer.
llvm-svn: 155365
2012-04-23 17:49:20 +00:00
Craig Topper 153bb34a3c Use MVT instead of EVT through all of LowerVECTOR_SHUFFLEtoBlend and not just the switch. Saves a little bit of binary size.
llvm-svn: 155339
2012-04-23 07:36:33 +00:00
Craig Topper 0a2c809d09 Make getZeroVector and getOnesVector more alike as far as how they detect 128-bit versus 256-bit vectors. Be explicit about both sizes and use llvm_unreachable. Similar changes to getLegalSplat.
llvm-svn: 155337
2012-04-23 07:24:41 +00:00
Craig Topper 2bbe8bcf4e Tidy up by removing some 'else' after 'return'
llvm-svn: 155336
2012-04-23 06:57:04 +00:00
Craig Topper 5c51eeecfc Tidy up spacing in LowerVECTOR_SHUFFLEtoBlend. Remove code that checks if shuffle operand has a different type than the shuffle result since it can never happen.
llvm-svn: 155333
2012-04-23 06:38:28 +00:00
Craig Topper a52f0d09b6 Add a couple llvm_unreachables.
llvm-svn: 155332
2012-04-23 03:42:40 +00:00
Craig Topper 984dc015ae Remove some tab characters.
llvm-svn: 155331
2012-04-23 03:28:34 +00:00
Craig Topper ea428fd79c Remove some 'else' after 'return'. No functional change.
llvm-svn: 155330
2012-04-23 03:26:18 +00:00
Craig Topper bf7d5666f0 Make Extract128BitVector and Insert128BitVector take an unsigned instead of a ConstantNode SDValue. getConstant was almost always called just before only to have the functions take it apart and build a new ConstantSDNode.
llvm-svn: 155325
2012-04-22 20:55:18 +00:00
Craig Topper 2d474d6d92 Convert getNode(UNDEF) to getUNDEF.
llvm-svn: 155321
2012-04-22 19:29:34 +00:00
Craig Topper 860ed0d20a Make calls to getVectorShuffle more consistent. Use shuffle VT for calls to getUNDEF instead of requerying. Use &Mask[0] instead of Mask.data().
llvm-svn: 155320
2012-04-22 19:17:57 +00:00
Craig Topper 43397c0900 Tidy up. 80 columns and argument alignment.
llvm-svn: 155319
2012-04-22 18:51:37 +00:00
Craig Topper ad56a744f1 Simplify code by converting multiple places that were manually concatenating 128-bit vectors to use either CONCAT_VECTORS or a helper function. CONCAT_VECTORS will itself be lowered to the same pattern as before. The helper function is needed for concats of BUILD_VECTORs since getNode(CONCAT_VECTORS) will just return a large BUILD_VECTOR and we may be trying to lower large BUILD_VECTORS when this occurs.
llvm-svn: 155318
2012-04-22 18:15:59 +00:00
Benjamin Kramer 8877d68db7 ARM: Initialize the HasRAS bit.
Found by valgrind.

llvm-svn: 155313
2012-04-22 11:52:41 +00:00
Elena Demikhovsky 8d7e56c409 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Bill Wendling f9774c3253 Remove some potential warnings about variables used uninitialized.
llvm-svn: 155307
2012-04-22 07:23:04 +00:00
Craig Topper 6eadae8e60 Make some fixed arrays const. Use array_lengthof in a couple places instead of a hardcoded number.
llvm-svn: 155294
2012-04-21 18:58:38 +00:00
Craig Topper 2568bf3089 Tidy up. 80 columns and some other spacing issues.
llvm-svn: 155291
2012-04-21 18:13:35 +00:00
NAKAMURA Takumi e30303fa86 llvm/lib/Target: [PR12611] Add "llvm/Support/raw_ostream.h" for Debug build on MSVC.
Thanks to Andy Gibbs for reporting the issue.

llvm-svn: 155287
2012-04-21 15:31:45 +00:00
NAKAMURA Takumi 54eed760da HexagonISelLowering.cpp: Reorder #includes.
llvm-svn: 155286
2012-04-21 15:31:36 +00:00
NAKAMURA Takumi df3d5ea990 HexagonInstPrinter.cpp: Suppress -Wunused-variable warnings with -Asserts.
llvm-svn: 155281
2012-04-21 11:24:55 +00:00
Jim Grosbach c931d451cd ARM: tblgen'erate more NEON two-operand aliases.
VMUL and VEXT.

llvm-svn: 155258
2012-04-20 23:46:33 +00:00
Jim Grosbach b4e849b924 ARM: tblgen'erate more NEON two-operand aliases.
llvm-svn: 155254
2012-04-20 23:30:14 +00:00
Jim Grosbach 2937df45a8 ARM: Update NEON assembly two-operand aliases.
Use the new TwoOperandAliasConstraint to handle lots of the two-operand aliases
for NEON instructions. There's still more to go, but this is a good chunk of
them.

llvm-svn: 155210
2012-04-20 18:12:54 +00:00
Gabor Greif c8a9abe9df effectively back out my last change (r155190)
llvm-svn: 155195
2012-04-20 11:41:38 +00:00
Gabor Greif 9eccbe9c82 fix obviously bogus (IMO) operand index of the load in asserts
(load only has one operand) and smuggle in some whitespace changes too

NB: I am obviously testing the water here, and believe that the unguarded
    cast is still wrong, but why is the getZExtValue of the load's operand
    tested against zero here? Any review is appreciated.
llvm-svn: 155190
2012-04-20 08:58:49 +00:00
Craig Topper c7242e054d Convert more uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155188
2012-04-20 07:30:17 +00:00
Craig Topper abadc660e0 Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155186
2012-04-20 06:31:50 +00:00
Jim Grosbach 9cc324d31a ARM some VFP tblgen'erated two-operand aliases.
llvm-svn: 155178
2012-04-20 00:15:00 +00:00
Jim Grosbach 6b46134862 ARM let TableGen handle a few two-operand aliases.
No need for these explicit aliases anymore. Nuke 'em.

llvm-svn: 155173
2012-04-19 23:59:26 +00:00
Gabor Greif 180c4445cf zap tabs
llvm-svn: 155128
2012-04-19 15:16:31 +00:00
Kevin Enderby ec4bd31206 Fixed the llvm-mc X86 disassembler so the 'C' API gets jumps properly
symbolicated.  These have an operand type of TYPE_RELv which was not handled
as isBranch in translateImmediate() in X86Disassembler.cpp.  rdar://11268426

llvm-svn: 155074
2012-04-18 23:12:11 +00:00
Chandler Carruth b415bf98f0 This reverts a long string of commits to the Hexagon backend. These
commits have had several major issues pointed out in review, and those
issues are not being addressed in a timely fashion. Furthermore, this
was all committed leading up to the v3.1 branch, and we don't need piles
of code with outstanding issues in the branch.

It is possible that not all of these commits were necessary to revert to
get us back to a green state, but I'm going to let the Hexagon
maintainer sort that out. They can recommit, in order, after addressing
the feedback.

Reverted commits, with some notes:

Primary commit r154616: HexagonPacketizer
  - There are lots of review comments here. This is the primary reason
    for reverting. In particular, it introduced a large number of warnings
    due to a bad construct in tablegen.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154622: CMake fixes
    - r154660: Fix numerous build warnings in release builds.
  - Please don't resubmit this until the three commits above are
    included, and the issues in review addressed.

Primary commit r154695: Pass to replace transfer/copy ...
  - Reverted to minimize merge conflicts. I'm not aware of specific
    issues with this patch.

Primary commit r154696: New Value Jump.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154703: Remove iostream usage
    - r154758: Fix CMake builds
    - r154759: Fix build warnings in release builds
  - Please incorporate these fixes and review feedback before
    resubmitting.

Primary commit r154829: Hexagon V5 (floating point) support.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154841: Remove unused variable (fixing build warnings)

There are also accompanying Clang commits that will be reverted for
consistency.

llvm-svn: 155047
2012-04-18 21:31:19 +00:00
Akira Hatanaka fc1d00bbd6 Mark instruction classes ArithLogicR, ArithLogicI and LoadUpper as isRematerializable.
llvm-svn: 155031
2012-04-18 18:52:10 +00:00
Akira Hatanaka 4167bb9346 Delete blank line.
llvm-svn: 155030
2012-04-18 18:47:17 +00:00
Silviu Baranga ca45af9a75 Added support for disassembling unpredictable swp/swpb ARM instructions.
llvm-svn: 155004
2012-04-18 14:18:57 +00:00
Silviu Baranga d5c6a63a50 Fix the behavior of the disassembler when decoding unpredictable mrs instructions on ARM. Now the disassembler emits warnings instead of errors.
llvm-svn: 155002
2012-04-18 14:09:07 +00:00
Silviu Baranga 41f1fcd80e Added support for unpredictable mcrr/mcrr2/mrrc/mrrc2 ARM instructions in the disassembler. Since the unpredictability conditions are complex, C++ code was added to handle them.
llvm-svn: 155001
2012-04-18 13:12:50 +00:00
Silviu Baranga a2944116dc Fixed decoding for the ARM cdp2 instruction. The restriction on the coprocessor number was removed for this instruction.
llvm-svn: 155000
2012-04-18 13:02:55 +00:00
Silviu Baranga 9da1918c84 Add support for unpredictable cases of the cmp, tst, teq and cmnz ARM instructions in the disassembler.
llvm-svn: 154999
2012-04-18 12:48:43 +00:00
Craig Topper d3c9e404ba Remove AVX vpermil intrinsics. I removed their uses from clang headers and builtins a while back.
llvm-svn: 154985
2012-04-18 05:24:00 +00:00
Joe Groff a81bcbb9bb fix pr12559: mark unavailable win32 math libcalls
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint

llvm-svn: 154960
2012-04-17 23:05:54 +00:00
Chad Rosier 41675546eb Typo.
llvm-svn: 154953
2012-04-17 21:48:36 +00:00
Akira Hatanaka 236e14017f Delete latter half of CMakeLists.txt.
llvm-svn: 154936
2012-04-17 18:18:09 +00:00
Akira Hatanaka 71928e681b Add disassembler to MIPS.
Patch by Vladimir Medic. 

llvm-svn: 154935
2012-04-17 18:03:21 +00:00
Jay Foad 08a0598cd4 Remove unused CCIfSubtarget.
llvm-svn: 154921
2012-04-17 11:29:05 +00:00
James Molloy a9bcf20d22 Fix bad EXTRACT_SUBREG in instruction selection for extending-loads on NEON.
llvm-svn: 154915
2012-04-17 08:18:00 +00:00
Craig Topper 354103d8ca Don't decode vperm2i128 or vperm2f128 into a shuffle if bit 3 or 7 of the immediate is set.
llvm-svn: 154907
2012-04-17 05:54:54 +00:00
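
Note on r154907: the rule in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit; `decodeVPerm2x128` is a hypothetical name. It assumes the documented immediate layout of vperm2f128/vperm2i128: bits [1:0] and [5:4] pick the 128-bit source lane for the low and high halves of the result, while bit 3 and bit 7 zero the corresponding half, which a plain shuffle mask cannot express.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical helper (not LLVM's decoder). Returns the lane selector for
// the low and high 128-bit halves of the result:
//   0 = first source, low lane    1 = first source, high lane
//   2 = second source, low lane   3 = second source, high lane
// Returns nothing when bit 3 or bit 7 is set, because those bits zero a
// half of the result instead of selecting a lane.
std::optional<std::array<unsigned, 2>> decodeVPerm2x128(std::uint8_t imm) {
  if (imm & 0x88)  // bit 3 (0x08) or bit 7 (0x80) requests zeroing
    return std::nullopt;
  return std::array<unsigned, 2>{imm & 0x3u, (imm >> 4) & 0x3u};
}
```

For example, an immediate of 0x31 selects the high lane of the first source for the low half and the high lane of the second source for the high half, while 0x08 or 0x80 is rejected.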
Kevin Enderby 29ae538647 Fix ARM disassembly of VLD2 (single 2-element structure to all lanes)
instructions with writebacks. And add a test case for all opcodes handled by
DecodeVLD2DupInstruction() in ARMDisassembler.cpp.

llvm-svn: 154884
2012-04-17 00:49:27 +00:00
Jim Grosbach 2bf5f73977 ARM two-operand forms for vhadd and vhsub instructions.
rdar://11252521

llvm-svn: 154875
2012-04-16 23:00:25 +00:00
Preston Gurd 5333e2e5ce Temporarily turn off anti-dependency checking
during Post RA scheduling in X86,
until the X86 target is changed to properly set up
post RA liveness.

llvm-svn: 154874
2012-04-16 22:52:28 +00:00
Jim Grosbach 003607f474 ARM handle :lower16: and :upper16: after a '#' prefix.
rdar://11252521

llvm-svn: 154862
2012-04-16 21:18:46 +00:00
Richard Smith 12da79b859 Fix incorrect atomics codegen introduced in r154705, and extend test to catch it.
llvm-svn: 154845
2012-04-16 18:43:53 +00:00
David Blaikie e67cdc07a5 Remove unused variable
llvm-svn: 154841
2012-04-16 18:10:13 +00:00
Jim Grosbach 6068d0014a ARM assembly two-operand forms for VRSHL.
rdar://11252521

llvm-svn: 154840
2012-04-16 18:03:16 +00:00
Akira Hatanaka 3e9d81f47c Do not add offset in applyFixup. This has already been accounted for in Value.
llvm-svn: 154838
2012-04-16 18:00:19 +00:00
Jim Grosbach cd1c000a9f ARM two-operand aliases for VRHADD instructions.
rdar://11252521

llvm-svn: 154832
2012-04-16 17:14:11 +00:00
Sirish Pande 96e8ee17e0 Hexagon V5 (Floating Point) Support.
llvm-svn: 154829
2012-04-16 17:05:06 +00:00
Craig Topper 4badeb3f0d Replace vpermd/vpermps intrinsic patterns with custom lowering to target specific nodes.
llvm-svn: 154801
2012-04-16 07:13:00 +00:00
Craig Topper 26d7a94981 Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps.
llvm-svn: 154798
2012-04-16 06:43:40 +00:00
Craig Topper c0075aa7ff Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second.
llvm-svn: 154797
2012-04-16 06:26:15 +00:00
Craig Topper b86fa404d3 Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper b04fe34030 Fix SDTypeProfile for vpermps. The mask operand should be v8i32.
llvm-svn: 154781
2012-04-16 00:12:20 +00:00
Craig Topper 1f8c9eb925 Spacing fixes and 80 column fixes. Use 0 instead of 0x80 for undef indices in vpermps/vpermd. Hardware only looks at lower 3-bits.
llvm-svn: 154780
2012-04-15 23:48:57 +00:00
Craig Topper bfc9a5f7d3 Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors.
llvm-svn: 154778
2012-04-15 22:43:31 +00:00
Nadav Rotem 42bcd04ee3 Fix PR12529. The Vxx family of instructions are only supported by AVX.
Use non-vex instructions for SSE4.

llvm-svn: 154770
2012-04-15 19:36:44 +00:00
Benjamin Kramer 673824b4a1 Wire up support for diagnostic ranges in the ARMAsmParser.
As an example, attach range info to the "invalid instruction" message:

$ clang -arch arm -c asm.c
asm.c:2:11: error: invalid instruction
  __asm__("foo r0");
          ^
<inline asm>:1:2: note: instantiated into assembly here
        foo r0
        ^~~

llvm-svn: 154765
2012-04-15 17:04:27 +00:00
Elena Demikhovsky 779a72b49e Added VPERM optimization for AVX2 shuffles
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
NAKAMURA Takumi 67de410135 HexagonCopyToCombine.cpp: Silence two warnings, -Wunused-variable, with -Asserts.
llvm-svn: 154759
2012-04-15 05:33:43 +00:00
NAKAMURA Takumi 355eebf4cf Target/Hexagon: Tweak to fix msvc build.
llvm-svn: 154758
2012-04-15 05:09:09 +00:00
Richard Smith 3e8f1f6aea Fix X86 codegen for 'atomicrmw nand' to generate *x = ~(*x & y), not *x = ~*x & y.
llvm-svn: 154705
2012-04-13 22:47:00 +00:00
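
Note on r154705: the difference between the two expressions in this message is operator precedence, since `~` binds tighter than `&`. The sketch below models the intended value computation with a compare-exchange loop. It is only an illustration of the semantics in portable C++, not the X86 code the backend emits, and the function names are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Intended semantics of 'atomicrmw nand': the stored value is ~(old & y).
constexpr std::uint32_t nandCorrect(std::uint32_t old, std::uint32_t y) {
  return ~(old & y);
}

// The miscompiled form: ~old & y, i.e. (~old) & y.
constexpr std::uint32_t nandBuggy(std::uint32_t old, std::uint32_t y) {
  return ~old & y;
}

// Hypothetical portable model of the operation: retry until the nand of
// the current value is stored, then return the previous value.
std::uint32_t atomicNand(std::atomic<std::uint32_t> &x, std::uint32_t y) {
  std::uint32_t old = x.load();
  // On failure compare_exchange_weak reloads 'old', so the new value is
  // recomputed from the latest contents on every iteration.
  while (!x.compare_exchange_weak(old, nandCorrect(old, y))) {
  }
  return old;
}
```

With old = 0xF0F0F0F0 and y = 0xFF00FF00 the correct form stores 0x0FFF0FFF, whereas the buggy form would store 0x0F000F00.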
Sirish Pande f4db4b2cb4 Remove iostream from New Value Jump.
llvm-svn: 154703
2012-04-13 21:01:35 +00:00
Sirish Pande 0e6e36d1d0 Add support for Hexagon Architectural feature, New Value Jump.
llvm-svn: 154696
2012-04-13 20:22:31 +00:00
Sirish Pande a8071a0f88 Pass to replace transfer/copy instructions with combine instructions where possible.
llvm-svn: 154695
2012-04-13 20:22:19 +00:00
Evan Cheng 267a4ada52 On Darwin targets, only use vfma etc. if the source uses the fma() intrinsic explicitly.
llvm-svn: 154689
2012-04-13 18:59:28 +00:00
Kevin Enderby c407cc7a40 For ARM disassembly only print 32 unsigned bits for the address of branch
targets so if the branch target has the high bit set it does not get printed as:
	 beq     0xffffffff8008c404

llvm-svn: 154685
2012-04-13 18:46:37 +00:00
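
Note on r154685: the symptom described here is what happens when a sign-extended 64-bit value is printed in full. A minimal sketch of the fix idea, assuming the target is held in a 64-bit integer, is to truncate to 32 bits before formatting. `formatBranchTarget` is a hypothetical name, not the ARM instruction printer's actual function.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical formatter: keep only the low 32 bits of the branch target
// so a sign-extended address does not print with leading ffffffff.
std::string formatBranchTarget(std::uint64_t target) {
  char buf[16];  // "0x" + up to 8 hex digits + terminator fits easily
  std::snprintf(buf, sizeof buf, "0x%x",
                static_cast<unsigned>(target & 0xffffffffu));
  return buf;
}
```

Given the value from the message, 0xffffffff8008c404, this returns "0x8008c404".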
Craig Topper eb455832b4 Silence various build warnings from Hexagon backend that show up in release builds. Mostly converting 'assert(0)' to 'llvm_unreachable' to silence warnings about missing returns. Also fold some variable declarations into asserts to prevent the variables from being unused in release builds.
llvm-svn: 154660
2012-04-13 06:38:11 +00:00
Kevin Enderby 40d4e47003 Fix a few more places in the ARM disassembler so that branches get
symbolic operands added when using the C disassembler API.

llvm-svn: 154628
2012-04-12 23:13:34 +00:00
Ted Kremenek 967aaa956f Update CMake build.
llvm-svn: 154622
2012-04-12 22:15:23 +00:00
Evandro Menezes 6a6a66e313 Hexagon: fix CMake error.
llvm-svn: 154620
2012-04-12 21:44:58 +00:00
Sirish Pande b486144c12 HexagonPacketizer patch.
llvm-svn: 154616
2012-04-12 21:06:38 +00:00
Evan Cheng 3e869f002c Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106
llvm-svn: 154604
2012-04-12 19:14:21 +00:00
Evandro Menezes 5cee621c88 Hexagon: enable assembler output through the MC layer.
llvm-svn: 154597
2012-04-12 17:55:53 +00:00
Benjamin Kramer df4477c506 Remove README entry obsoleted by register masks.
llvm-svn: 154588
2012-04-12 12:47:29 +00:00
Craig Topper d0271b27cb Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions.
llvm-svn: 154580
2012-04-12 07:23:00 +00:00
Jim Grosbach 4324f426ce ARM 'adr' fixups don't need the interworking addend tweaking.
They reference the PC directly, so things work properly that way.

rdar://11231229

llvm-svn: 154576
2012-04-12 01:19:35 +00:00
Akira Hatanaka 47ad674f67 Emit neg.s or neg.d only if -enable-no-nans-fp-math is supplied by user,
otherwise expand FNEG during legalization.

llvm-svn: 154546
2012-04-11 22:59:08 +00:00
Akira Hatanaka 7f4c9d1429 Emit abs.s or abs.d only if -enable-no-nans-fp-math is supplied by user.
Invalid operation is signaled if the operand of these instructions is NaN.

llvm-svn: 154545
2012-04-11 22:49:04 +00:00
Kevin Enderby 72f18bbcff Fixed a case of ARM disassembly getting an assert on a bad encoding
of a VST instruction.

llvm-svn: 154544
2012-04-11 22:40:17 +00:00
Akira Hatanaka 4f5c8421b3 Fix bugs in lowering of FCOPYSIGN nodes.
- FCOPYSIGN nodes that have operands of different types were not handled.
- Different code was generated depending on the endianness of the target.

Additionally, code is added that emits INS and EXT instructions, if they are
supported by target (they are R2 instructions).

llvm-svn: 154540
2012-04-11 22:13:04 +00:00
Jim Grosbach 6e536de1a1 ARM 'vuzp.32 Dd, Dm' is a pseudo-instruction.
While there is an encoding for it in VUZP, the result of that is undefined,
so we should avoid it. Define the instruction as a pseudo for VTRN.32
instead, as the ARM ARM indicates.

rdar://11222366

llvm-svn: 154511
2012-04-11 17:40:18 +00:00
Jim Grosbach 4640c8169f ARM 'vzip.32 Dd, Dm' is a pseudo-instruction.
While there is an encoding for it in VZIP, the result of that is undefined,
so we should avoid it. Define the instruction as a pseudo for VTRN.32
instead, as the ARM ARM indicates.

rdar://11221911

llvm-svn: 154505
2012-04-11 16:53:25 +00:00
Nadav Rotem 372cf15125 remove unused argument
llvm-svn: 154494
2012-04-11 11:05:21 +00:00
Duncan Sands 264d2e7121 Add a C binding to the Target and TargetMachine classes to allow for emitting
binary and assembly. Patch by Carlo Kok.  Emitting was inspired by but not based
on the D llvm bindings. 

llvm-svn: 154493
2012-04-11 10:25:24 +00:00
Evan Cheng 5efc442290 Add more fused mul+add/sub patterns. rdar://10139676
llvm-svn: 154484
2012-04-11 06:59:47 +00:00
Nadav Rotem 9bc178ac5c Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.

llvm-svn: 154483
2012-04-11 06:40:27 +00:00
Evan Cheng 48346c1cd9 Clean up ARM fused multiply + add/sub support some more: rename some isel
predicates.
Also remove NEON2 since it's not really useful and it is confusing. If
NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it
really mean?

rdar://10139676

llvm-svn: 154480
2012-04-11 05:33:07 +00:00
Evan Cheng 67a09fc397 Match (fneg (fma)) to vfnma. rdar://10139676
llvm-svn: 154469
2012-04-11 01:21:25 +00:00
Charles Davis 74c282b5ef Add retw and lretw instructions. Also, fix Intel syntax parsing for all
ret instructions.

llvm-svn: 154468
2012-04-11 01:10:53 +00:00
Kevin Enderby d2980cd041 Fix ARM disassembly of VLD instructions with writebacks.  And add a test case
for all opcodes handled by DecodeVLDInstruction() in ARMDisassembler.cpp.

llvm-svn: 154459
2012-04-11 00:25:40 +00:00
Jim Grosbach ad66de155b ARM add missing Thumb1 two-operand aliases for shift-by-immediate.
rdar://11222742

llvm-svn: 154457
2012-04-11 00:15:16 +00:00
Evan Cheng aca6c822e6 Fix a number of problems with ARM fused multiply add/subtract instructions.
1. The new instruction itinerary entries are not properly described.
2. The asm parser can't handle vfms and vfnms.
3. There were no assembler, disassembler test cases.
4. HasNEON2 has the wrong assembler predicate.
rdar://10139676

llvm-svn: 154456
2012-04-11 00:13:00 +00:00
Evan Cheng d0007f3c83 Handle llvm.fma.* intrinsics. rdar://10914096
llvm-svn: 154439
2012-04-10 21:40:28 +00:00
Chad Rosier f7345b027a Whitespace.
llvm-svn: 154427
2012-04-10 19:42:07 +00:00
Chad Rosier 235a7a1746 Revert r154396, which looks to be the real culprit behind the bot failures.
llvm-svn: 154426
2012-04-10 19:39:18 +00:00
Eric Christopher 65ada95b84 Temporarily revert this patch to see if it brings the buildbots back.
llvm-svn: 154425
2012-04-10 19:33:16 +00:00
Jim Grosbach df5a244797 ARM fix cc_out operand handling for t2SUBrr instructions.
We were incorrectly conflating some add variants which don't have a
cc_out operand with the mirroring sub encodings, which do. Part of the
awesome non-orthogonality legacy of thumb1. Similarly, handling of
add/sub of an immediate was sometimes incorrectly removing the cc_out
operand for add/sub register variants.

rdar://11216577

llvm-svn: 154411
2012-04-10 17:31:55 +00:00
David Blaikie 2735136655 Remove unused variable.
llvm-svn: 154398
2012-04-10 15:23:13 +00:00
Nadav Rotem f934f91709 Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.

llvm-svn: 154396
2012-04-10 14:33:13 +00:00
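
Note on r154396: because vblend takes its selection as an immediate, a shuffle can only use it when every element stays in its own position and merely chooses between the two sources. The sketch below shows how such a mask could be turned into an immediate. It assumes the usual shuffle-mask convention (index i means element i of the first source, i + N means element i of the second, negative means undef) and that a set bit selects the second source. `blendImmediate` is a hypothetical helper, not the function added by the commit.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: compute a blend immediate from a shuffle mask, or
// return nothing if the shuffle moves any element out of its position.
std::optional<std::uint8_t> blendImmediate(const std::vector<int> &mask) {
  const int n = static_cast<int>(mask.size());
  if (n > 8)  // an 8-bit immediate can describe at most 8 elements
    return std::nullopt;
  unsigned imm = 0;
  for (int i = 0; i < n; ++i) {
    if (mask[i] < 0 || mask[i] == i)
      continue;            // undef, or element i of the first source
    if (mask[i] != i + n)
      return std::nullopt; // element would change position: not a blend
    imm |= 1u << i;        // element i of the second source
  }
  return static_cast<std::uint8_t>(imm);
}
```

For a four-element mask {0, 5, 2, 7} this yields 0x0A, while {1, 0, 2, 3} is rejected because it permutes elements.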
Evan Cheng f8bad08001 Fix a long standing tail call optimization bug. When a libcall is emitted, the
legalizer always uses the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370
2012-04-10 01:51:00 +00:00
Jim Grosbach 8f99bc3aed ARM LDR/LDRT has the same encoding collision as STR/STRT.
Generalized logic of r154141.

llvm-svn: 154362
2012-04-10 00:13:07 +00:00
Chad Rosier e0e38f61a5 When performing a truncating store, it's possible to rearrange the data
in-register, such that we can use a single vector store rather than a
series of scalar stores.

For func_4_8 the generated code

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vmov.u16	r0, d16[3]
	strb	r0, [r2, #3]
	vmov.u16	r0, d16[2]
	strb	r0, [r2, #2]
	vmov.u16	r0, d16[1]
	strb	r0, [r2, #1]
	vmov.u16	r0, d16[0]
	strb	r0, [r2]
	bx	lr

becomes

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vuzp.8	d16, d17
	vst1.32	{d16[0]}, [r2, :32]
	bx	lr

I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.

This

	ldrh	r0, [r0, #4]
	strh	r0, [r1]

becomes

	vldr	d16, [r0]
	vmov.u16	r0, d16[2]
	vmov.32	d16[0], r0
	vuzp.16	d16, d17
	vst1.32	{d16[0]}, [r1, :32]

PR11158
rdar://10703339

llvm-svn: 154340
2012-04-09 20:32:02 +00:00
Chad Rosier 99cbde9e82 Update comments and remove unnecessary isVolatile() check.
llvm-svn: 154336
2012-04-09 19:38:15 +00:00
David Blaikie e6b6fae8ff Fix accidentally constant conditions found by uncommitted improvements to -Wconstant-conversion.
A couple of cases where we were accidentally creating constant conditions by
something like "x == a || b" instead of "x == a || x == b". In one case a
conditional & then unreachable was used - I transformed this into a direct
assert instead.

llvm-svn: 154324
2012-04-09 16:29:35 +00:00
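
Note on r154324: the pattern described here is easy to reproduce. In `x == a || b` the right operand of `||` is just `b`, so the whole condition is true whenever `b` is nonzero, regardless of `x`. The sketch below shows the corrected comparison and the "direct assert" form the message mentions. All names are hypothetical and the code is not taken from the commit.

```cpp
#include <cassert>

// Buggy form (shown only as a comment):  x == a || b
// That parses as (x == a) || (b != 0) and is constant-true for nonzero b.

// Corrected form: compare x against each candidate explicitly.
inline bool matchesEither(unsigned x, unsigned a, unsigned b) {
  return x == a || x == b;
}

// Instead of "if (!cond) unreachable", state the expectation directly.
inline unsigned checkedValue(unsigned x, unsigned a, unsigned b) {
  assert(matchesEither(x, a, b) && "unexpected value");
  return x;
}
```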
Preston Gurd 2eec367227 This patch adds X86 instruction itineraries, which were missed by the
original patch to add itineraries, to X86InstrArithmetic.td.

llvm-svn: 154320
2012-04-09 15:32:22 +00:00
Nadav Rotem fb7e2ae53c Lower some x86 shuffle sequences to the vblend family of instructions.
llvm-svn: 154313
2012-04-09 08:33:21 +00:00
Nadav Rotem b801ca3976 Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.

llvm-svn: 154310
2012-04-09 07:45:58 +00:00
Chandler Carruth 3779ac10b4 Cleanup and relax a restriction on the matching of global offsets into
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
   the match to try to fit RIP into the base register. If it fails, it
   now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
   as it did before. The reason these immediates are safe is because the
   ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
   This is the only case changed by the patch, and the primary place you
   see it is in TLS, either the win64 section offset TLS or Linux
   local-exec TLS model in a PIC compilation. Here the ABI again ensures
   that the immediates fit because we are in small mode, and any other
   operations required due to the PIC relocation model have been handled
   externally to the Wrapper node (extra loads etc are made around the
   wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.

llvm-svn: 154304
2012-04-09 02:13:06 +00:00
Chandler Carruth ede4a8aa2b Teach LLVM about a PIE option which, when enabled on top of PIC, makes
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.

I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, it's just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]

I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.

llvm-svn: 154294
2012-04-08 17:51:45 +00:00
Chandler Carruth 16f0ebcbb5 Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.

llvm-svn: 154292
2012-04-08 17:20:55 +00:00
Nadav Rotem 82609df647 AVX2: Build splat vectors by broadcasting a scalar from the constant pool.
Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandybridge we continue to load the broadcasted value from the constant pool.

llvm-svn: 154284
2012-04-08 12:54:54 +00:00
Craig Topper d024cef233 Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper aa9aab5ad2 Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
llvm-svn: 154268
2012-04-07 21:57:43 +00:00
Bob Wilson 6f9be7e2c6 Fix Thumb __builtin_longjmp with integrated assembler. <rdar://problem/11203543>
The tLDRr instruction with the last register operand set to the zero register
prints in assembly as if no register was specified, and the assembler encodes
it as a tLDRi instruction with a zero immediate.  With the integrated assembler,
that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which
is broken.  Emit the instruction as tLDRi with a zero immediate.  I don't
know if there's a good way to write a testcase for this.  Suggestions welcome.

Opportunities for follow-up work:
1) The asm printer should complain if a non-optional register operand is set
   to the zero register, instead of silently dropping it.
2) The integrated assembler should complain in the same situation, instead of
   silently emitting the operand as "r0".

llvm-svn: 154261
2012-04-07 16:51:59 +00:00
NAKAMURA Takumi b95f64134e Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming.
Cygwin-1.7 supports dw2. Some recent mingw distros support one, too.
I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin.

llvm-svn: 154247
2012-04-07 02:24:20 +00:00
Alexis Hunt 0235f684f0 Output UTF-8-encoded characters as identifier characters into assembly
by default.

This is a behaviour configurable in the MCAsmInfo. I've decided to turn
it on by default in (possibly optimistic) hopes that most assemblers are
reasonably sane. If this proves a problem, switching to default seems
reasonable.

I'm not sure if this is the opportune place to test, but it seemed good
to make sure it was tested somewhere.

llvm-svn: 154235
2012-04-07 00:37:53 +00:00
Jim Grosbach 0c509fa6bf Tidy up. 80 columns.
llvm-svn: 154226
2012-04-06 23:43:50 +00:00
Jakob Stoklund Olesen baa3566091 ARMPat is equivalent to Requires<[IsARM]>.
llvm-svn: 154210
2012-04-06 21:21:59 +00:00
Jakob Stoklund Olesen b4bd3880ba Eliminate iOS-specific tail call instructions.
After register masks were introduced to represent the call clobbers, it
is no longer necessary to have duplicate instructions for iOS.

llvm-svn: 154209
2012-04-06 21:17:42 +00:00
Chandler Carruth 8a102c21e3 There is no portable std::abs overload for int64_t, use the llvm::abs64
which exists for this purpose.

llvm-svn: 154199
2012-04-06 20:10:52 +00:00
Jakob Stoklund Olesen 967b86a0a2 Allow negative immediates in ARM and Thumb2 compares.
ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.

llvm-svn: 154183
2012-04-06 17:45:04 +00:00
Benjamin Kramer 3cacabfb04 Fix narrowing conversion.
llvm-svn: 154171
2012-04-06 13:33:52 +00:00
Craig Topper 447417c932 Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
llvm-svn: 154166
2012-04-06 07:45:23 +00:00
Jakob Stoklund Olesen 6a2e99a46a Deduplicate ARM call-related instructions.
We had special instructions for iOS because r9 is call-clobbered, but
that is represented dynamically by the register mask operands now, so
there is no need for the pseudo-instructions.

llvm-svn: 154144
2012-04-06 00:04:58 +00:00
Jim Grosbach d6a1a1dc2f ARM: Don't form a t2LDRi8 or t2STRi8 with an offset of zero.
The load/store optimizer splits LDRD/STRD into two instructions when the
register pairing doesn't work out. For negative offsets in Thumb2, it uses
t2STRi8 to do that. That's fine, except for the case when the offset is in
the range [-4,-1]. In that case, we'll also form a second t2STRi8 with
the original offset plus 4, resulting in a t2STRi8 with a non-negative
offset, which ends up as if it were an STRT, which is completely bogus.
Similarly for loads.

No testcase, unfortunately, as any I've been able to construct is both large
and extremely fragile.

rdar://11193937

llvm-svn: 154141
2012-04-05 23:51:24 +00:00
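The splitting problem described in r154141 can be sketched in a few lines. This is only an illustration, not the ARMLoadStoreOptimizer code: `pick_store_opcode` and `split_strd` are hypothetical names, and the sketch assumes that t2STRi8 is the form used for negative 8-bit offsets while t2STRi12 takes non-negative 12-bit offsets.

```python
def pick_store_opcode(offset):
    """Choose a Thumb2 store opcode from the sign of this store's own offset."""
    # Assumption: the i8 form is only formed for negative offsets,
    # and the i12 form covers 0..4095.
    return "t2STRi8" if offset < 0 else "t2STRi12"


def split_strd(offset):
    """Split one STRD at `offset` into two word stores at offset and offset + 4."""
    # The opcode is chosen per store, so the second store can switch
    # to the i12 form once its offset becomes non-negative.
    return [(pick_store_opcode(off), off) for off in (offset, offset + 4)]


print(split_strd(-4))
print(split_strd(-8))
```

Choosing the opcode once from the original offset and reusing it for both halves is what would produce a t2STRi8 with a non-negative offset for starting offsets in [-4,-1].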
Jim Grosbach 930f2f66e7 ARM assembly aliases for add negative immediates using sub.
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Thumb1 aliases for adding a negative immediate to the stack pointer,
also.

rdar://11192734

llvm-svn: 154123
2012-04-05 20:57:13 +00:00
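The alias in r154123 amounts to a simple rewrite of the mnemonic and immediate. A minimal sketch, assuming a hypothetical helper `canonicalize_add` that is not part of the LLVM assembler and that ignores whether the resulting immediate is actually encodable:

```python
def canonicalize_add(mnemonic, imm):
    """Rewrite 'add rX, #-N' as 'sub rX, #N'; leave everything else alone."""
    if mnemonic == "add" and imm < 0:
        return "sub", -imm
    return mnemonic, imm


print(canonicalize_add("add", -1024))
```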
Silviu Baranga af3c79f0ac Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions.
llvm-svn: 154101
2012-04-05 16:19:29 +00:00
Silviu Baranga d365397daa Added support for handling unpredictable arithmetic instructions on ARM.
llvm-svn: 154100
2012-04-05 16:13:15 +00:00
Jim Grosbach 15c6884a4b ARM assembly aliases for two-operand V[R]SHR instructions.
rdar://11189467

llvm-svn: 154087
2012-04-05 07:23:53 +00:00
Jim Grosbach 3d00eecc53 ARM assembly parsing for 'msr' plain 'cpsr' operand.
Plain 'cpsr' is an alias for 'cpsr_fc'.

rdar://11153753

llvm-svn: 154080
2012-04-05 03:17:53 +00:00
Akira Hatanaka 121342fcc2 Reapply 154038 without the failing test.
llvm-svn: 154062
2012-04-04 22:16:36 +00:00
Owen Anderson 4743c6e159 Revert r154038. It was causing make check failures.
llvm-svn: 154054
2012-04-04 21:18:58 +00:00
Akira Hatanaka 9705c865d9 Fix LowerGlobalAddress to produce instructions with the correct relocation
types for N32 ABI. Add new test case and update existing ones.

llvm-svn: 154038
2012-04-04 19:02:38 +00:00
Akira Hatanaka 591ecdd7c1 Fix LowerJumpTable to produce instructions with the correct relocation
types for N32 ABI. Test case will be updated after the patch that fixes
TargetLowering::getPICJumpTableRelocBase is checked in.

llvm-svn: 154036
2012-04-04 18:31:32 +00:00
Akira Hatanaka b3a2b8c199 Fix LowerConstantPool to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154034
2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen 0a5b72f0e4 Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

llvm-svn: 154033
2012-04-04 18:23:42 +00:00
Akira Hatanaka aeff24e424 Fix LowerBlockAddress to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154031
2012-04-04 18:22:53 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Dylan Noblesmith 7a3973d3e0 ARMDisassembler: drop bogus dependency on ARMCodeGen
And indirectly, a dependency on most of the core LLVM optimization
libraries.

llvm-svn: 153957
2012-04-03 15:48:14 +00:00
Anton Korobeynikov 325e92668b Make the PPCCompilationCallbackC function static, so there is no need to issue the call via the
PLT when LLVM is built as a shared library. This mirrors the approach taken by the X86 backend.

llvm-svn: 153938
2012-04-03 06:59:28 +00:00
Craig Topper 7629d63bc4 Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Akira Hatanaka d19f025374 Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler.
llvm-svn: 153926
2012-04-03 03:01:13 +00:00
Akira Hatanaka 55059262aa Revert r153924. There were buildbot failures.
llvm-svn: 153925
2012-04-03 02:51:09 +00:00
Akira Hatanaka e2498d014b MIPS disassembler support.
Patch by Vladimir Medic.

llvm-svn: 153924
2012-04-03 02:20:58 +00:00
Akira Hatanaka b1f68f9696 Initial 64 bit direct object support.
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.

The test case checks for both big and little endian flavors.

Patch by Jack Carter.

llvm-svn: 153889
2012-04-02 19:25:22 +00:00
Hal Finkel 7591afa235 The binutils for the IBM BG/P are too old to support CFI.
llvm-svn: 153886
2012-04-02 19:09:04 +00:00
Roman Divacky b9663ccd6b Implement the SVR4 byval alignment for aggregates. Fixing a FIXME.
llvm-svn: 153876
2012-04-02 15:49:30 +00:00
Benjamin Kramer 1c0541b031 Move getOpcodeName from the various target InstPrinters into the superclass MCInstPrinter.
All implementations used the same code.

llvm-svn: 153866
2012-04-02 08:32:38 +00:00
Craig Topper dab9e35ad0 Remove getInstructionName from MCInstPrinter implementations in favor of using the instruction name table from MCInstrInfo. Reduces static data in the InstPrinter implementations.
llvm-svn: 153863
2012-04-02 07:01:04 +00:00
Craig Topper 54bfde79db Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo.
llvm-svn: 153860
2012-04-02 06:09:36 +00:00
Hal Finkel 3ecfa7b277 Fix some 80-col. violations I introduced with the A2 PPC64 core.
llvm-svn: 153852
2012-04-01 21:20:14 +00:00
Hal Finkel 322e41a914 Enable prefetch generation on PPC64.
llvm-svn: 153851
2012-04-01 20:08:17 +00:00
Hal Finkel 9032344c15 Add LdStSTD* itin. for the PPC64 A2 core.
llvm-svn: 153850
2012-04-01 20:08:08 +00:00
Nadav Rotem b078350872 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
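Items 1 and 2 of r153848 rely on xor/and/or being purely bitwise, so they commute with reinterpreting or permuting the lanes. The sketch below checks this for xor on one pair of 4 x i8 vectors; the helper names are illustrative, and a little-endian bitcast is assumed.

```python
import struct


def bitcast_v4i8_to_i32(lanes):
    """Reinterpret four byte lanes as one little-endian 32-bit integer."""
    return struct.unpack("<I", bytes(lanes))[0]


def xor_lanes(a, b):
    """Lane-wise xor of two equally sized vectors."""
    return [x ^ y for x, y in zip(a, b)]


def shuffle(v, mask):
    """Single-source shuffle (swizzle): pick lanes of v by index."""
    return [v[i] for i in mask]


a = [0x12, 0x34, 0x56, 0x78]
b = [0xFF, 0x0F, 0xF0, 0x00]
mask = [3, 2, 1, 0]

# xor(bitcast(A), bitcast(B)) == bitcast(xor(A, B))
bitcast_commutes = (
    bitcast_v4i8_to_i32(a) ^ bitcast_v4i8_to_i32(b)
    == bitcast_v4i8_to_i32(xor_lanes(a, b))
)

# xor(shuff(A), shuff(B)) == shuff(xor(A, B))
shuffle_commutes = (
    xor_lanes(shuffle(a, mask), shuffle(b, mask))
    == shuffle(xor_lanes(a, b), mask)
)

print(bitcast_commutes, shuffle_commutes)
```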
Hal Finkel 88ed4e3b15 Set the default PPC node scheduling preference to ILP (for the embedded cores).
The 440 and A2 cores have detailed itineraries, and this allows them to be
fully used to maximize throughput.

llvm-svn: 153845
2012-04-01 19:23:08 +00:00
Hal Finkel b9845f5758 Add ppc440 itin. entries for LdStSTD*
llvm-svn: 153844
2012-04-01 19:23:04 +00:00
Hal Finkel ec5a1e3669 Use full anti-dep. breaking with post-ra sched. on the embedded ppc cores.
Post-RA scheduling gives a significant performance improvement on
the embedded cores, so turn it on. Using full anti-dep. breaking is
important for FP-intensive blocks, so turn it on (just on the
embedded cores for now; this should also be good on the 970s because
post-ra scheduling is all that we have for now, but that should have
more testing first).

llvm-svn: 153843
2012-04-01 19:22:57 +00:00
Hal Finkel 9f9f8929ee Add instruction itinerary for the PPC64 A2 core.
This adds a full itinerary for IBM's PPC64 A2 embedded core. These
cores form the basis for the CPUs in the new IBM BG/Q supercomputer.

llvm-svn: 153842
2012-04-01 19:22:40 +00:00
Hal Finkel 59607e63cb Split the LdStGeneral PPC itin. class into LdStLoad and LdStStore.
Loads and stores can have different pipeline behavior, especially on
embedded chips. This change allows those differences to be expressed.
Except for the 440 scheduler, there are no functionality changes.
On the 440, the latency adjustment is only by one cycle, and so this
probably does not affect much. Nevertheless, it will make a larger
difference in the future and this removes a FIXME from the 440 itin.

llvm-svn: 153821
2012-04-01 04:44:16 +00:00
Hal Finkel 51861b4855 Fix dynamic linking on PPC64.
Dynamic linking on PPC64 has had problems since we had to move the top-down
hazard-detection logic post-ra. For dynamic linking to work there needs to be
a nop placed after every call. It turns out that it is really hard to guarantee
that nothing will be placed in between the call (bl) and the nop during post-ra
scheduling. Previous attempts at fixing this by placing logic inside the
hazard detector only partially worked.

This is now fixed in a different way: call+nop codegen-only instructions. As far
as CodeGen is concerned the pair is now a single instruction and cannot be split.
This solution works much better than previous attempts.

The scoreboard hazard detector is also renamed to be more generic, since there is currently
no cpu-specific logic in it.

llvm-svn: 153816
2012-03-31 14:45:15 +00:00
Akira Hatanaka 8f4e3a0088 Select static relocation model if it is jitting.
llvm-svn: 153795
2012-03-31 02:38:36 +00:00
Jakob Stoklund Olesen d915503486 Add a 2 byte safety margin in offset computations.
ARMConstantIslandPass still has bugs where jump table compression can
cause constant pool entries to go out of range.

Add a safety margin of 2 bytes when placing constant islands, but use
the real max displacement for verification.

<rdar://problem/11156595>

llvm-svn: 153789
2012-03-31 00:06:44 +00:00
Jakob Stoklund Olesen 24bb3d59d7 Add more debugging output to ARMConstantIslandPass.
llvm-svn: 153788
2012-03-31 00:06:42 +00:00
Benjamin Kramer 682de39f2d Rip out emission of the regIsInRegClass function for the asm printer.
It's slow, bloated and completely redundant with MCRegisterClass::contains.

llvm-svn: 153782
2012-03-30 23:13:40 +00:00
Jim Grosbach 913cc3072d ARM fix encoding fixup resolution for ldrd and friends.
The 8-bit payload is not contiguous in the opcode. Move the upper nibble
over 4 bits into the correct place.

rdar://11158641

llvm-svn: 153780
2012-03-30 21:54:22 +00:00
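The non-contiguous payload mentioned in r153780 can be illustrated as follows. This is a sketch assuming the ARM-mode LDRD immediate layout, where the high nibble of the 8-bit offset sits in bits [11:8] and the low nibble in bits [3:0]; `place_split_imm8` is a hypothetical name, and the add/subtract bit is left out.

```python
def place_split_imm8(value):
    """Spread an 8-bit offset into the imm4H (bits 11:8) and imm4L (bits 3:0) fields."""
    if not 0 <= value <= 0xFF:
        raise ValueError("offset does not fit in 8 bits")
    # Move the upper nibble over by 4 bits; the lower nibble stays in place.
    return ((value & 0xF0) << 4) | (value & 0x0F)


print(hex(place_split_imm8(0xAB)))
```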
Jim Grosbach fdaab531b7 ARM assembler should prefer non-aliased encoding of cmp.
When an immediate is both a valid [t2_]so_imm and a [t2_]so_imm_neg,
we want to use the non-negated form to make sure we prefer the normal
encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'.

llvm-svn: 153770
2012-03-30 19:59:02 +00:00
Jim Grosbach daa04130ed ARM encoding for VSWP got the second operand incorrect.
Make the non-tied register operand names line up with what the base
class encoding handler expects.

rdar://11157236

llvm-svn: 153766
2012-03-30 18:53:01 +00:00
Jim Grosbach 74005ae691 ARM can only use narrow encoding for low regs.
llvm-svn: 153765
2012-03-30 18:39:43 +00:00
Jim Grosbach def5e34812 ARM integrated assembler should fix encoding choice for add/sub imm.
For 'adds r2, r2, #56' outside of an IT block, the 16-bit encoding T2
can be used for this syntax. Prefer the narrow encoding when possible.

rdar://11156277

llvm-svn: 153759
2012-03-30 17:20:40 +00:00
Jim Grosbach 199ab90946 ARM assembly parsing needs to be paranoid about negative immediates.
Make sure to treat immediates as unsigned when doing relative comparisons.

rdar://11153621

llvm-svn: 153753
2012-03-30 16:31:31 +00:00
Benjamin Kramer 88d31b3f0c Add a note about a missed cmov -> sbb opportunity.
llvm-svn: 153741
2012-03-30 13:02:58 +00:00
James Molloy fb5cd6085f Ensure conditional BL instructions for ARM are given the fixup fixup_arm_condbranch.
Patch by Tim Northover!

llvm-svn: 153737
2012-03-30 09:15:32 +00:00
Evan Cheng a40d40602c ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249
llvm-svn: 153717
2012-03-30 01:24:39 +00:00
Jakob Stoklund Olesen d8af9a5ee1 Invalidate liveness in ARMConstantIslandPass.
This pass splits basic blocks to insert constant islands, and it
doesn't recompute the live-in lists. No later passes depend on accurate
liveness information.

This fixes PR12410 where the machine code verifier was complaining.

llvm-svn: 153700
2012-03-29 23:14:26 +00:00
Jakob Stoklund Olesen 2f2897372a Prefer even-odd D-register pairs.
We are sometimes allocating from the DPair register class which
contains odd-even pairs in addition to the Q registers.

Place the Q registers first in the DPair allocation order as they can be
copied with a single instruction. The odd-even pairs should only be
allocated as a last resort.

llvm-svn: 153699
2012-03-29 22:54:32 +00:00
Lang Hames 591cdaf2ee Try using vmov.i32 to materialize FP32 constants that can't be materialized by
vmov.f32.

llvm-svn: 153696
2012-03-29 21:56:11 +00:00
Jim Grosbach 0b0298302c ARM assembly 'cmp lr, #0' should not encode using 'cmn'.
The CMP->CMN alias was matching for an immediate of zero when it
should only match for negative values.

rdar://11129224

llvm-svn: 153689
2012-03-29 21:19:52 +00:00
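Why zero is special for the CMP->CMN alias in r153689 can be checked with a small flag model. The sketch below follows the AddWithCarry pseudocode of the ARM Architecture Reference Manual as I understand it (CMP adds the complemented immediate with carry-in 1, CMN adds the immediate with carry-in 0). The Python function names are illustrative only and do not appear in LLVM.

```python
MASK = 0xFFFFFFFF


def sint(x):
    """Interpret a 32-bit pattern as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def add_with_carry(x, y, carry_in):
    """Return the (N, Z, C, V) flags of x + y + carry_in on 32-bit values."""
    unsigned_sum = x + y + carry_in
    signed_sum = sint(x) + sint(y) + carry_in
    result = unsigned_sum & MASK
    n = result >> 31
    z = int(result == 0)
    c = int(result != unsigned_sum)
    v = int(sint(result) != signed_sum)
    return (n, z, c, v)


def cmp_flags(rn, imm):
    """Flags of 'cmp rn, #imm': rn + NOT(imm) + 1."""
    return add_with_carry(rn & MASK, ~imm & MASK, 1)


def cmn_flags(rn, imm):
    """Flags of 'cmn rn, #imm': rn + imm + 0."""
    return add_with_carry(rn & MASK, imm & MASK, 0)


# A small negative immediate gives the same flags either way ...
print(cmp_flags(5, -3), cmn_flags(5, 3))
# ... but for zero the carry flag differs.
print(cmp_flags(5, 0), cmn_flags(5, 0))
```

In this model `cmp rn, #0` always sets the carry flag while `cmn rn, #0` always clears it, so the alias is only safe for strictly negative immediates.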
Jakob Stoklund Olesen caa6bd273f Handle register copies for the new ARM register classes.
ARM recently gained DPair, DTriple, and DQuad register classes.
Update copyPhysReg() to handle copies in these register classes.

No test case, it is difficult to make the register allocator emit the
odd copies reliably. The missing DPair copy caused a failure on
partialsums in the nightly test suite.

<rdar://problem/11147997>

llvm-svn: 153686
2012-03-29 21:10:40 +00:00
Lang Hames 5569ce7d56 Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode.
llvm-svn: 153680
2012-03-29 19:54:28 +00:00
Akira Hatanaka 0603ad8c65 Expand FREM.
llvm-svn: 153671
2012-03-29 18:43:11 +00:00
Benjamin Kramer 8619c37b5b Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds.
llvm-svn: 153643
2012-03-29 12:37:26 +00:00
Craig Topper a0a603e582 Only allow symbolic names for (v)cmpss/sd/ps/pd encodings 8-31 to be used with 'v' version of instructions.
llvm-svn: 153636
2012-03-29 07:11:23 +00:00
Joel Jones 68d59e8a90 For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153635
2012-03-29 05:45:48 +00:00
Joel Jones b474099e63 Reverted to revision 153616 to unblock build
llvm-svn: 153623
2012-03-29 01:20:56 +00:00
Joel Jones b88c81fe0f For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153617
2012-03-29 00:37:47 +00:00
Jakob Stoklund Olesen c3e80cc885 Enable machine code verification in the entire code generator.
Some targets still mess up the liveness information, but that isn't
verified after MRI->invalidateLiveness().

The verifier can still check other useful things like register classes
and CFG, so it should be enabled after all passes.

llvm-svn: 153615
2012-03-28 23:54:28 +00:00
Jakob Stoklund Olesen b6a7a89289 Don't kill the base register when expanding strd.
When an strd instruction doesn't get the registers it wants, it can be
expanded into two str instructions. Make sure the first str doesn't kill
the base register in the case where the base and data registers are
identical:

  t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg
  t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg

<rdar://problem/11101911>

llvm-svn: 153611
2012-03-28 23:07:03 +00:00
Jakob Stoklund Olesen cdee326ab6 Preserve implicit defs in ARMLoadStoreOptimizer.
When a number of sub-register VLDRS instructions are combined into a
VLDM, preserve any super-register implicit defs. This is required to
keep the register scavenger and machine code verifier happy.

Enable machine code verification after ARMLoadStoreOptimizer.
ARM/2012-01-26-CopyPropKills.ll was failing because of this.

llvm-svn: 153610
2012-03-28 22:50:56 +00:00
Jakob Stoklund Olesen 9e512120b7 Spill DPair registers, not just QPR.
The arm_neon intrinsics can create virtual registers from the DPair
register class which allows both even-odd and odd-even D-register pairs.

This fixes PR12389.

llvm-svn: 153603
2012-03-28 21:20:32 +00:00
Jakob Stoklund Olesen 8cb97523c6 Revert r153516: "Invalidate liveness in Thumb2ITBlockPass."
Revert r153519: "ARMLoadStoreOptimizer invalidates register liveness."

These patches caused miscompilations in povray by turning off branch
folding's updating of live-in lists.

It turns out that the late scheduler depends on the live-in lists, even
if it doesn't need correct kill flags.

<rdar://problem/11139228>

llvm-svn: 153593
2012-03-28 20:11:44 +00:00
Benjamin Kramer 20b32d2da6 Add another note about a missed compare with nsw arithmetic instcombine.
llvm-svn: 153574
2012-03-28 10:50:18 +00:00
Richard Barton 7ce39497b4 Fixup VST1.32 with writeback instruction. Also re-factor non-writeback version.
llvm-svn: 153573
2012-03-28 10:18:11 +00:00
Akira Hatanaka 2c67006cdd Turn off post-RA scheduler by default.
llvm-svn: 153557
2012-03-28 00:52:23 +00:00
Akira Hatanaka 047473e293 Turn on post register allocation scheduler.
llvm-svn: 153554
2012-03-28 00:24:17 +00:00
Akira Hatanaka 5ba593f509 Sort relocation entries before they are written out to a file. MIPS ABI
imposes a constraint that GOT16 referring to a local symbol or HI16 has to be
followed immediately by a matching LO16 relocation.

llvm-svn: 153553
2012-03-28 00:23:33 +00:00
Akira Hatanaka 34ee3ff83d Emit all directives except for ".cprestore" during asm printing rather than emit
them as machine instructions. Directives ".set noat" and ".set at" are now
emitted only at the beginning and end of a function except in the case where
they are emitted to enclose .cpload with an immediate operand that doesn't fit
in 16-bit field or unaligned load/stores.

Also, make the following changes:
- Remove function isUnalignedLoadStore and use a switch-case statement to
  determine whether an instruction is an unaligned load or store.

- Define helper function CreateMCInst which generates an instance of an MCInst
  from an opcode and a list of operands.

llvm-svn: 153552
2012-03-28 00:22:50 +00:00
Akira Hatanaka 1518a5fa9c Mark flag neverHasSideEffects of pattern-less instructions that do not have
any side effects.

llvm-svn: 153551
2012-03-28 00:21:37 +00:00
Benjamin Kramer 2735c01906 Add a note about a cute little fabs optimization.
llvm-svn: 153543
2012-03-27 22:42:42 +00:00
Benjamin Kramer f0901459b9 Add two missed instcombines related to compares with nsw arithmetic.
llvm-svn: 153542
2012-03-27 22:03:19 +00:00
Akira Hatanaka 52656d1047 Remove trailing white space.
llvm-svn: 153536
2012-03-27 20:35:51 +00:00
Akira Hatanaka a25fe22198 Add member EmitNOAT and its setter and getter functions to class MipsFunctionInfo.
If EmitNOAT is true, directives ".set noat" and ".set at" are emitted at the
beginning and end of a function. 

llvm-svn: 153528
2012-03-27 19:08:42 +00:00
Jakob Stoklund Olesen 4acbcb3171 ARMLoadStoreOptimizer invalidates register liveness.
This pass tries to update kill flags, but there are still many bugs.
Passes after the load/store optimizer don't need accurate liveness, so
don't even try.

<rdar://problem/11101911>

llvm-svn: 153519
2012-03-27 17:33:52 +00:00
Jakob Stoklund Olesen 14459cdc49 Invalidate liveness in Thumb2ITBlockPass.
llvm-svn: 153516
2012-03-27 17:06:06 +00:00
Craig Topper 1fcf5bcae1 Prune some includes
llvm-svn: 153502
2012-03-27 07:54:11 +00:00
Craig Topper f6e7e12f75 Remove unnecessary llvm:: qualifications
llvm-svn: 153500
2012-03-27 07:21:54 +00:00
Akira Hatanaka 8a7633c74e Pass the llvm IR pointer value and offset to the constructor of
MachinePointerInfo when getStore is called to create a node that stores an
argument passed in register to the stack. Without this change, the post RA 
scheduler will fail to discover the dependencies between the store
instructions and the instructions that load from a structure passed by value.

The link to the related discussion is here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-March/048055.html

llvm-svn: 153499
2012-03-27 03:13:56 +00:00
Akira Hatanaka 769f69f9b6 Fix bug in LowerConstantPool.
llvm-svn: 153498
2012-03-27 02:55:31 +00:00
Akira Hatanaka 2a36c9f4a8 Add T9 to the list of live-in registers of the entry basic block.
llvm-svn: 153497
2012-03-27 02:46:25 +00:00
Akira Hatanaka fe384a2c84 Retrieve and add the offset of a symbol in applyFixup rather than retrieve and
set it in MipsMCCodeEmitter::getMachineOpValue. Assert in getMachineOpValue if
MachineOperand MO is of an unexpected type. 

llvm-svn: 153494
2012-03-27 02:33:05 +00:00
Akira Hatanaka a06bc1c6e3 Define function MipsGetSymAndOffset which returns a fixup's symbol and the
offset applied to it.

llvm-svn: 153493
2012-03-27 02:04:18 +00:00
Akira Hatanaka da72819725 Rewrite computation of Value in adjustFixupValue so that the upper 48 bits are
cleared. No functionality change.

llvm-svn: 153491
2012-03-27 01:50:08 +00:00
Akira Hatanaka ba5100c117 Reserve hardware registers.
llvm-svn: 153486
2012-03-27 00:40:56 +00:00
Evan Cheng a2b48d985b ARM has a peephole optimization which looks for a def / use pair. The def
produces a 32-bit immediate which is consumed by the use. It tries to
fold the immediate by breaking it into two parts and folding them into the
immediate fields of two uses, e.g.
       movw    r2, #40885
       movt    r3, #46540
       add     r0, r0, r3
=>
       add.w   r0, r0, #3019898880
       add.w   r0, r0, #30146560
;
However, this transformation is incorrect if the user produces a flag. e.g.
       movw    r2, #40885
       movt    r3, #46540
       adds    r0, r0, r3
=>
       add.w   r0, r0, #3019898880
       adds.w  r0, r0, #30146560
Note the adds.w may not set the carry flag even if the original sequence
would.

rdar://11116189

llvm-svn: 153484
2012-03-26 23:31:00 +00:00
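The carry problem described in r153484 can be reproduced with plain 32-bit arithmetic. The sketch below reuses the two immediates from the commit message (3019898880 + 30146560 = 0xB5CC0000); the starting value of r0 and the helper names are assumptions chosen for illustration.

```python
MASK = 0xFFFFFFFF


def add(a, imm):
    """32-bit add that does not touch the flags."""
    return ((a & MASK) + (imm & MASK)) & MASK


def adds(a, imm):
    """32-bit add that also returns the carry flag."""
    total = (a & MASK) + (imm & MASK)
    return total & MASK, int(total > MASK)


r0 = 0x50000000  # hypothetical input value

# Original sequence: one flag-setting add of the full immediate.
full_result, full_carry = adds(r0, 0xB5CC0000)

# Folded sequence: the first add wraps silently, only the second sets flags.
tmp = add(r0, 3019898880)
split_result, split_carry = adds(tmp, 30146560)

print(hex(full_result), full_carry)
print(hex(split_result), split_carry)
```

Both sequences leave the same value in r0, but the carry produced by the first half of the split add is lost, so the final flags differ.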
Craig Topper 6e80c28017 Prune some includes and forward declarations.
llvm-svn: 153429
2012-03-26 06:58:25 +00:00
Craig Topper 5fa0caafc0 Prune includes and replace uses of ARMRegisterInfo.h with ARMBaseRegisterInfo.h
llvm-svn: 153422
2012-03-26 00:45:15 +00:00
Craig Topper 07720d8dcd Replace uses of ARMBaseInstrInfo and ARMTargetMachine with the Base versions.
llvm-svn: 153421
2012-03-25 23:49:58 +00:00
Craig Topper d4a964cd70 Prune some includes and forward declarations.
llvm-svn: 153415
2012-03-25 18:10:17 +00:00
Hal Finkel e44eb28807 Fix small-integer VAARG on SVR4 ABI PPC64.
The PPC64 SVR4 ABI requires integer stack arguments, and thus the var. args., that
are smaller than 64 bits be zero extended to 64 bits.

llvm-svn: 153373
2012-03-24 03:53:55 +00:00
Justin Holewinski a84577dcff PTX: Fix predicate logic bug
Code such as:

%vreg100 = setcc %vreg10, -1, SETNE
brcond %vreg10, %tgt

was being incorrectly morphed into

%vreg100 = and %vreg10, 1
brcond %vreg10, %tgt

where the 'and' instruction could be eliminated since
such logic is on 1-bit types in the PTX back-end, leaving
us with just:

brcond %vreg10, %tgt

which essentially gives us inverted branch conditions.

llvm-svn: 153364
2012-03-24 01:23:20 +00:00
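The inversion described in r153364 follows from how -1 looks in a 1-bit type. A minimal sketch, assuming predicates are modeled as the integers 0 and 1 and using illustrative function names that are not part of the PTX back-end:

```python
def setne_minus_one_i1(p):
    """'setcc p, -1, SETNE' on a 1-bit value: -1 truncated to one bit is 1."""
    minus_one = 1
    return int((p & 1) != minus_one)


def and_one_i1(p):
    """'and p, 1' on a 1-bit value, which is just p itself."""
    return p & 1


for p in (0, 1):
    print(p, setne_minus_one_i1(p), and_one_i1(p))
```

On 1-bit values the compare is a logical NOT while the `and` is the identity, so replacing one with the other flips the branch condition.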
Jim Grosbach 190e7b6e18 ARM tidy up ARMConstantIsland.cpp.
No functional change, just tidy up the code and nomenclature a bit.

llvm-svn: 153347
2012-03-23 23:07:03 +00:00
Benjamin Kramer b0640db80e Include cstdio in a few places that depended on getting it transitively through StringExtras.h
llvm-svn: 153328
2012-03-23 11:35:30 +00:00
Benjamin Kramer cbf108eda6 Move ftostr into its last user (cppbackend) and simplify it a bit.
New code should use raw_ostream.

llvm-svn: 153326
2012-03-23 11:26:29 +00:00
Eric Christopher 64a232343a Remove the C backend.
llvm-svn: 153307
2012-03-23 05:50:46 +00:00
Silviu Baranga 4afd7d2316 Added soft fail checks for the disassembler when decoding some corner cases of the STRD, STRH, LDRD, LDRH, LDRSH and LDRSB instructions on ARM.
llvm-svn: 153252
2012-03-22 14:14:49 +00:00
Silviu Baranga d213f2111a Added soft fail cases for the disassembler when decoding LDRSBT, LDRHT or LDRSHT instruction on ARM
llvm-svn: 153251
2012-03-22 13:24:43 +00:00
Silviu Baranga a6ea32afdd Added soft fail cases for the disassembler when decoding MUL instructions on ARM.
llvm-svn: 153250
2012-03-22 13:14:39 +00:00
Craig Topper 7da2aa24c2 Remove some unnecessary forward declarations.
llvm-svn: 153245
2012-03-22 06:52:14 +00:00
Hal Finkel 76eb187c0f PPC::DBG_VALUE must use Reg+Imm frame-index elimination even for large offsets. Fixes PR12203.
I don't have a small test case yet, but I'll try to construct one.

llvm-svn: 153240
2012-03-22 05:28:19 +00:00