Commit Graph

21958 Commits

Author SHA1 Message Date
NAKAMURA Takumi 70c1aa0bb5 ARMDisassembler.cpp: Fix utf8 char in comments.
llvm-svn: 157292
2012-05-22 21:47:02 +00:00
Craig Topper 53b4b73be9 Fix constant used for pshufb mask when lowering v16i8 shuffles. Bug introduced in r157043. Fixes PR12908.
llvm-svn: 157236
2012-05-22 06:09:38 +00:00
Akira Hatanaka cdf4fd8267 This patch adds a predicate to existing mips32 and mips64 so that those
instruction encodings can be excluded during mips16 processing.

This revision fixes the issue raised by Jim Grosbach.

bool hasStandardEncoding() const { return !inMips16Mode(); }

When micromips is added it will be

bool StandardEncoding() const { return !inMips16Mode()&&  !inMicroMipsMode(); }

No additional testing is needed other than to assure that there is no regression
from this patch.

Patch by Reed Kotler.

llvm-svn: 157234
2012-05-22 03:10:09 +00:00
Jim Grosbach 2597f83889 ARM: .end_data_region mismatch in Thumb2.
32-bit offset jump tables just use real branch instructions and so aren't
marked as data regions. We were still emitting the .end_data_region
marker though, which assert()ed.

rdar://11499158

llvm-svn: 157221
2012-05-21 23:34:42 +00:00
Jim Grosbach 19a7bcedb1 Thumb2: RSB source register should be rGRP not GPRnopc.
t2RSB defined the operand correctly, but tRSBS didn't.

llvm-svn: 157200
2012-05-21 17:57:17 +00:00
Craig Topper e88f2fd4f7 Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
llvm-svn: 157175
2012-05-21 06:40:16 +00:00
Jakob Stoklund Olesen 38dcd598f9 Make the global base reg GR32_NOSP.
It can sometimes be used in addressing modes that don't support %ESP.

llvm-svn: 157165
2012-05-20 18:43:00 +00:00
Hal Finkel 601f555eee Add a missing PPC 64-bit stwu pattern.
This seems to fix the remaining compile-time failures on PPC64 when
compiling with -enable-ppc-preinc.

llvm-svn: 157159
2012-05-20 17:11:24 +00:00
Jakob Stoklund Olesen 691ae3388f Use the right register class for LDRrs.
llvm-svn: 157152
2012-05-20 06:38:47 +00:00
Jakob Stoklund Olesen 4fd0e4f415 Transfer memory operands to the right instruction.
They need to go on the PICLDR as the verifier points out.

llvm-svn: 157151
2012-05-20 06:38:42 +00:00
Hal Finkel 66b0c93553 Add a FIXME about access to negative stack-pointer offsets on PPC32.
The current code will generate a prologue which starts with something like:
        mflr 0
        stw 31, -4(1)
        stw 0, 4(1)
        stwu 1, -16(1)

But under the PPC32 SVR4 ABI, access to negative offsets from R1 is not allowed.

This was pointed out by Peter Bergner.

llvm-svn: 157133
2012-05-19 21:52:55 +00:00
Nadav Rotem c93e91da27 On Haswell, perfer storing YMM registers using a single instruction.
llvm-svn: 157129
2012-05-19 20:30:08 +00:00
Nadav Rotem 900c7cb7ce Add support for additional in-reg vbroadcast patterns
llvm-svn: 157127
2012-05-19 19:57:37 +00:00
Craig Topper 1964b6d39d Tidy up some spacing and inconsistent use of pre/post increment. No functional change intended.
llvm-svn: 157122
2012-05-19 19:14:18 +00:00
Stepan Dyatkovskiy 79a0d80d51 Ordinary PR1255 patch: DifferenceEngine and CPPBackend adopted to the new SwitchInst methods.
llvm-svn: 157112
2012-05-19 13:14:30 +00:00
Craig Topper 6166178573 Copy some AVX support from MCJIT to JIT. Maybe will fix PR12748.
llvm-svn: 157109
2012-05-19 08:28:17 +00:00
Eric Christopher bc5d24999c Add support for the 'd' mips inline asm output modifier.
Patch by Jack Carter.

llvm-svn: 157093
2012-05-19 00:51:56 +00:00
Jim Grosbach 4b63d2ae1d Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

llvm-svn: 157062
2012-05-18 19:12:01 +00:00
Eric Christopher 9ca26cfb5f Add support for the mips 'x' inline asm modifier.
Patch by Jack Carter.

llvm-svn: 157057
2012-05-18 17:39:35 +00:00
Craig Topper 0cf4038c59 Simplify code a bit. No functional change intended.
llvm-svn: 157044
2012-05-18 07:07:36 +00:00
Craig Topper 92db928ee9 Simplify handling of v16i8 shuffles and fix a missed optimization.
llvm-svn: 157043
2012-05-18 06:42:06 +00:00
Kevin Enderby f1b225d0e0 Fix the encoding of the armv7m (MClass) for MSR APSR writes which was missing
the 0b10 mask encoding bits.  Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM as deprecated it use.  Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions.  Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025

llvm-svn: 157019
2012-05-17 22:18:01 +00:00
Tim Northover af501a29d3 Remove incorrect pattern for ARM SMML instruction.
Patch by Meador Inge.

llvm-svn: 156989
2012-05-17 13:12:13 +00:00
Akira Hatanaka 0faaebf27c This patch adds the register class for MIPS16 as well as the ability for
llc to recognize MIPS16 as a MIPS ASE extension. -mips16 will mean the
mips16 ASE for mips32 by default.

As part of fixing of adding this we discovered some small changes that
need to be made to MipsInstrInfo::storeRegToStackSLot and
MipsInstrInfo::loadRegFromStackSlot. We were using some "==" equality tests
where in fact we should have been using Mips::<regclas>.hasSubClassEQ instead,
per suggestion of Jakob Stoklund Olesen.

Patch by Reed Kotler.

llvm-svn: 156958
2012-05-16 22:19:56 +00:00
Benjamin Kramer 7faf84f125 Hexagon: Remove unused command line option.
llvm-svn: 156917
2012-05-16 15:03:55 +00:00
Evan Cheng 58a95f0c8a Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
llvm-svn: 156896
2012-05-16 01:54:27 +00:00
Jim Grosbach c3b0427921 Allow MCCodeEmitter access to the target MCRegisterInfo.
Add the MCRegisterInfo to the factories and constructors.

Patch by Tom Stellard <Tom.Stellard@amd.com>.

llvm-svn: 156828
2012-05-15 17:35:52 +00:00
Akira Hatanaka cf434ee4c1 Temporarily disable anti-dependence breaking for Mips until bug 12829 is
resolved.

llvm-svn: 156801
2012-05-15 03:14:52 +00:00
Bill Wendling 8b5c0e4af2 Remove extraneous ';'.
llvm-svn: 156791
2012-05-15 00:41:56 +00:00
Akira Hatanaka 4773e67e0b Add a command line option to skip the delay slot filler pass entirely for Mips.
The purpose of this option is to silence error messages issued by machine
verifier passes and enable them to run to the end. If this option is not
provided, -verify-machineinstrs complains when it discovers there is a
non-terminator instruction (an instruction that is in a delay slot) after the
first terminator in a basic block.

llvm-svn: 156790
2012-05-14 23:59:17 +00:00
David Blaikie 81a84bd841 Fix use of uninitialized variable.
Found by GCC's maybe-uninitialized.

llvm-svn: 156780
2012-05-14 21:48:19 +00:00
Brendon Cahoon f6b687e5d1 Revert 156634 upon request until code improvement changes are made.
llvm-svn: 156775
2012-05-14 19:35:42 +00:00
Dan Gohman 164fe18cfe Rename @llvm.debugger to @llvm.debugtrap.
llvm-svn: 156774
2012-05-14 18:58:10 +00:00
Benjamin Kramer 0b03cbd416 Hexagon: Initialize TBB to 0.
Found by valgrind.

llvm-svn: 156744
2012-05-13 15:13:22 +00:00
Sirish Pande 8bb9745a5e Make sure new value jump is enabled for Hexagon V5 as well.
llvm-svn: 156700
2012-05-12 05:54:15 +00:00
Sirish Pande 4bd20c50eb Support for Hexagon feature, New Value Jump.
llvm-svn: 156698
2012-05-12 05:10:30 +00:00
Akira Hatanaka a6c3fd8317 Remove MipsEmitGPRestore.cpp.
llvm-svn: 156696
2012-05-12 03:24:03 +00:00
Akira Hatanaka 3ecc5273c1 Delete all functions that are no longer needed in MipsFunctionInfo, including
the ones that get or set the frame index for the $gp save slot. 

Remove the piece of code in MipsFunctionInfo::getGlobalBaseReg() which returns
GP. This function should always return a virtual register.

llvm-svn: 156695
2012-05-12 03:22:13 +00:00
Akira Hatanaka 2e31e036b6 Stop reserving register $gp. Do not call isGPFI to check whether a frame object
is the $gp save slot.

llvm-svn: 156694
2012-05-12 03:21:18 +00:00
Akira Hatanaka 0fb87feb39 Do not add the pass which restores $gp after every function call.
llvm-svn: 156693
2012-05-12 03:19:51 +00:00
Akira Hatanaka f542ebd958 Make the following changes in MipsISelLowering.cpp:
- Stop creating stack frame objects needed for saving $gp.
- Insert a node that copies the global pointer register to register $gp
  before the call node. This will ensure $gp is valid at the entry of the
  called function.

llvm-svn: 156692
2012-05-12 03:19:04 +00:00
Akira Hatanaka c980f8453a Make the following changes in MipsFrameLowering.cpp:
- Stop emitting instructions needed to initialize the global pointer register.
- Stop emitting .cprestore directive.
- Do not take into account the $gp save slot when computing stack size.

llvm-svn: 156691
2012-05-12 03:18:00 +00:00
Akira Hatanaka 8f3573034b Make the following changes in MipsAsmPrinter.cpp:
- Remove code which lowers pseudo SETGP01.
- Fix LowerSETGP01. The first two of the three instructions that are emitted to
  initialize the global pointer register now use register $2.
- Stop emitting .cpload directive.

llvm-svn: 156689
2012-05-12 00:48:43 +00:00
Akira Hatanaka d918f77ba3 Insert instructions to the entry basic block which initializes the global
pointer register. 


This is the first of the series of patches which clean up the way global pointer
register is used. The patches will make the following improvements:

- Make $gp an allocatable temporary register rather than reserving it.
- Use a virtual register as the global pointer register and let the register
  allocator decide which register to assign to it or whether spill/reloads are
  needed.
- Make sure $gp is valid at the entry of a called function, which is necessary
  for functions using lazy binding.
- Remove the need for emitting .cprestore and .cpload directives.

llvm-svn: 156671
2012-05-12 00:17:17 +00:00
Akira Hatanaka 0661b81bca Do not replace operands of pseudo instructions with register $zero.
llvm-svn: 156663
2012-05-11 23:22:18 +00:00
Chad Rosier aa9cb9df59 [fast-isel] Add support for selecting @llvm.trap().
llvm-svn: 156646
2012-05-11 21:33:49 +00:00
Brendon Cahoon 5edcf8822d Updated instruction table due to addded intrinsics.
llvm-svn: 156644
2012-05-11 21:10:16 +00:00
Sirish Pande 95d0117bb3 Remove warnings from HexagonVLIWPacketizer.
llvm-svn: 156636
2012-05-11 20:00:34 +00:00
Brendon Cahoon 31f8723ef3 Hexagon constant extender support.
Patch by Jyotsna Verma.

llvm-svn: 156634
2012-05-11 19:56:59 +00:00
Chad Rosier 06e34d9220 Typo.
llvm-svn: 156633
2012-05-11 19:43:29 +00:00
Chad Rosier 3268692aa8 [fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup.
llvm-svn: 156632
2012-05-11 19:40:25 +00:00
Sirish Pande 83ccb6ce08 Hexagon V5 intrinsics support.
llvm-svn: 156631
2012-05-11 19:39:13 +00:00
Chad Rosier 90f9afe659 [fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg
retval.  Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796

llvm-svn: 156628
2012-05-11 18:51:55 +00:00
Chad Rosier 519b12f927 [fast-isel] Rather then assert (or segfault in a non-asserts build), fall back
to selection DAG isel if we're unable to handle a non-double multi-reg retval.
rdar://11430407
PR12796

llvm-svn: 156622
2012-05-11 17:41:06 +00:00
Chad Rosier 466d3d8faa The return type is an unsigned, not a bool.
llvm-svn: 156621
2012-05-11 16:41:38 +00:00
Manman Ren 0d5ec28ccc Add space before an open parenthesis in control flow statements.
llvm-svn: 156620
2012-05-11 15:36:46 +00:00
Preston Gurd 09de6ae399 Added X86 Atom latencies to X86InstrMMX.td.
llvm-svn: 156615
2012-05-11 14:27:12 +00:00
Hans Wennborg f9d0e44b82 Implement initial-exec TLS model for 32-bit PIC x86
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).

llvm-svn: 156611
2012-05-11 10:11:01 +00:00
Silviu Baranga ddc67a7655 Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions.
llvm-svn: 156609
2012-05-11 09:28:27 +00:00
Silviu Baranga 5a719f9b9a Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handeling the #-0 immediate.
llvm-svn: 156608
2012-05-11 09:10:54 +00:00
Akira Hatanaka e37614438f Fix a misleading comment.
llvm-svn: 156603
2012-05-11 01:45:15 +00:00
Manman Ren dc8ad0058f ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156599
2012-05-11 01:30:47 +00:00
Dan Gohman dfab443ae8 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.

llvm-svn: 156593
2012-05-11 00:19:32 +00:00
Preston Gurd 4fe10a5d9a Added X86 Atom latencies for instructions in X86InstrInfo.td.
llvm-svn: 156579
2012-05-10 21:58:35 +00:00
Eric Christopher ed51b9ec0b Add support for the 'X' inline asm operand modifier.
Patch by Jack Carter.

llvm-svn: 156577
2012-05-10 21:48:22 +00:00
Sirish Pande fc8118bf41 Hexagon V5 Support - V5 td file.
llvm-svn: 156569
2012-05-10 20:24:28 +00:00
Sirish Pande 69295b8963 Hexagon V5 FP Support.
llvm-svn: 156568
2012-05-10 20:20:25 +00:00
Manman Ren b555b382bd Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.

llvm-svn: 156556
2012-05-10 18:49:43 +00:00
Manman Ren c860887b2d ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156550
2012-05-10 16:48:21 +00:00
Nadav Rotem 1a65397017 Fix merge-typo and cleanup
llvm-svn: 156541
2012-05-10 12:50:02 +00:00
Nadav Rotem 15946e50c1 AVX2: Add an additional broadcast idiom.
llvm-svn: 156540
2012-05-10 12:39:13 +00:00
Nadav Rotem b86a3fb8d0 Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users.

Fix PR11900.

llvm-svn: 156539
2012-05-10 12:22:05 +00:00
Roman Divacky e07cc042f6 Mark .opd @progbits, thus avoiding a warning from asm.
llvm-svn: 156494
2012-05-09 18:24:23 +00:00
Akira Hatanaka ca41d13bbd Add another peephole pattern for conditional moves.
llvm-svn: 156460
2012-05-09 02:29:29 +00:00
Jakob Stoklund Olesen 7e21d617ef Use ptr_rc_tailcall instead of GR32_TC.
The getPointerRegClass() hook will return GR32_TC, or whatever is
appropriate for the current function.

Patch by Yiannis Tsiouris!

llvm-svn: 156459
2012-05-09 01:50:09 +00:00
Akira Hatanaka 05b9dad1e6 Make register FP allocatable if the compiled function does not have dynamic
allocas.

llvm-svn: 156458
2012-05-09 01:38:13 +00:00
Akira Hatanaka 0a8ab718cb Expand 64-bit shifts if target ABI is O32.
llvm-svn: 156457
2012-05-09 00:55:21 +00:00
Richard Trieu edf46e6b6e Remove unused variable to silence compiler warning.
llvm-svn: 156456
2012-05-09 00:30:21 +00:00
Jakob Stoklund Olesen 10191fd44f Use a shared function for a common operation.
llvm-svn: 156441
2012-05-08 23:27:30 +00:00
Eric Christopher d666bb0dd8 Remove excess semi-colons to quiet warnings.
llvm-svn: 156416
2012-05-08 20:45:04 +00:00
Sirish Pande 1c9f7dbc10 Update load/store instruction patterns in Hexagon V4.
llvm-svn: 156411
2012-05-08 19:50:20 +00:00
Akira Hatanaka c515bfb9e7 Define mips16 instruction formats.
Patch by Reed Kotler.

llvm-svn: 156408
2012-05-08 19:08:58 +00:00
Jakob Stoklund Olesen 276ae14023 s/CSR_Ghc/CSR_NoRegs/
Share the CalleeSavedRegs defs between all calling conventions having no
callee-saved registers.

Patch by Yiannis Tsiouris!

llvm-svn: 156382
2012-05-08 15:07:29 +00:00
Craig Topper 7daf897678 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Jakob Stoklund Olesen 3c52f0281f Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).

So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.

Patch by Yiannis Tsiouris!

llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Jakob Stoklund Olesen c4b3a7a1d7 Fix bug in TRI::getCommonSuperRegClass().
Test cases for this code are coming. It is not used for anything yet.

llvm-svn: 156327
2012-05-07 21:59:31 +00:00
Jakob Stoklund Olesen 65a6dafc8d Add TRI::getCommonSuperRegClass().
This function is a generalization of getMatchingSuperRegClass() to the
symmetric case where both sides are using a sub-register index. It will
find a super-register class and sub-register indexes that make this
diagram commute:

                                   PreA
                       SuperRC  ---------->  RCA

                          |                   |
                          |                   |
                     PreB |                   | SubA
                          |                   |
                          |                   |
                          V                   V

                         RCB    ----------> SubRC
                                   SubB

This can be used to coalesce copies like:

  %vreg1:sub16 = COPY %vreg2:sub16; GR64:%vreg1, GR32: %vreg2

llvm-svn: 156317
2012-05-07 19:14:58 +00:00
Chad Rosier d8287fec17 Fix a regression from r147481. This combine should only happen if there is a
single use.
rdar://11360370

llvm-svn: 156316
2012-05-07 18:47:44 +00:00
Manman Ren ef4e0479ec X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;

rdar: 10961709
llvm-svn: 156312
2012-05-07 18:06:23 +00:00
Eric Christopher 0d8c15d20f Add support for the 'x' constraint.
Patch by Jack Carter.

llvm-svn: 156295
2012-05-07 06:25:19 +00:00
Eric Christopher 9c492e6ebf Add support for the 'l' constraint.
Patch by Jack Carter.

llvm-svn: 156294
2012-05-07 06:25:15 +00:00
Eric Christopher e3c494de82 Add support for the 'c' constraint.
Patch by Jack Carter.

llvm-svn: 156293
2012-05-07 06:25:10 +00:00
Eric Christopher c18ae4a3b1 Add support for the 'P' constraint.
Patch by Jack Carter.

llvm-svn: 156292
2012-05-07 06:25:02 +00:00
Craig Topper dbb98b4917 Fix some issues in the f16c instructions.
llvm-svn: 156287
2012-05-07 06:00:15 +00:00
Eric Christopher 470578a91b Add support for the 'O' constraint.
Patch by Jack Carter.

llvm-svn: 156285
2012-05-07 05:46:48 +00:00
Eric Christopher e07aa430b8 Add support for the 'N' inline asm constraint.
Patch by Jack Carter.

llvm-svn: 156284
2012-05-07 05:46:43 +00:00
Eric Christopher 1109b3406d Add support for the 'L' inline asm constraint.
Patch by Jack Carter.

llvm-svn: 156283
2012-05-07 05:46:37 +00:00
Eric Christopher 3ff88a05b7 Add support for the inline asm constraint 'K'.
llvm-svn: 156282
2012-05-07 05:46:29 +00:00
Craig Topper d4e1894ec1 Add SSE4A MOVNTSS/MOVNTSD instructions.
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Eric Christopher 7201e1b4b9 Support the 'J' constraint.
Patch by Jack Carter.

llvm-svn: 156280
2012-05-07 03:13:42 +00:00
Eric Christopher 1d6c89eea1 Add support for the 'I' inline asm constraint. Also add tests
from the previous 2 patches.

Patch by Jack Carter.

llvm-svn: 156279
2012-05-07 03:13:32 +00:00
Eric Christopher 58daf04681 Allow 64 bit integer values in gpu registers if arch and abi are 64 bit.
Patch by Jack Carter.

llvm-svn: 156278
2012-05-07 03:13:22 +00:00
Eric Christopher cfcd77b0bc When using inline asm constraints representing
non-floating point general registers allow 8 and 16-bit
elements.

Patch by Jack Carter.

llvm-svn: 156277
2012-05-07 03:13:16 +00:00
Craig Topper 00a1e6d48b Use MVT instead of EVT as the argument to all the shuffle decode functions. Simplify some of the decode functions.
llvm-svn: 156268
2012-05-06 19:46:21 +00:00
Craig Topper 804be3b546 Add VPERMQ/VPERMPD to the list of target specific shuffles that can be looked through for DAG combine purposes.
llvm-svn: 156266
2012-05-06 18:54:26 +00:00
Craig Topper 54bdb350e2 Add shuffle decode support for VPERMQ/VPERMPD.
llvm-svn: 156265
2012-05-06 18:44:02 +00:00
Jim Grosbach 7ce129268e Nuke a few dead remnants of the CBE.
llvm-svn: 156241
2012-05-05 17:45:12 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Benjamin Kramer a25a61b9e8 NVPTX: Initialize the UseF32FTZ flag.
llvm-svn: 156232
2012-05-05 11:22:02 +00:00
Eric Christopher de9e92ed9b Typo.
llvm-svn: 156226
2012-05-05 01:16:06 +00:00
David Blaikie 891d0a3d20 Fix warnings in release build.
This fixes a couple of Clang warnings in release builds of LLVM:

* Missing return in ISelLowering
* Unused variable in NVPTXutil.cpp

llvm-svn: 156216
2012-05-04 22:34:16 +00:00
Kevin Enderby cabbae653e Tweak to the fix in r156212, as with the change in removing the shift the
SignExtend32<22>(Val<<1) also needs to change to SignExtend32<21>(Val) .

llvm-svn: 156213
2012-05-04 22:09:52 +00:00
Kevin Enderby 8ce1ada1be Fix a bug in the ARM disassembler for wide branch conditional instructions
where the symbolic operand's displacement was incorrectly shifted left by 1.
rdar://11387046

llvm-svn: 156212
2012-05-04 22:02:27 +00:00
Chandler Carruth cd3464ee22 Fix a Clang warning in the new NVPTX backend:
In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53:
../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
  default: assert(0 && "Unknown condition code");
  ^
1 warning generated.

The prevailing pattern in LLVM is to not use a default label, and instead to
use llvm_unreachable to denote that the switch in fact covers all return paths
from the function.

llvm-svn: 156209
2012-05-04 21:35:49 +00:00
Justin Holewinski ae556d3ef7 This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
The new target machines are:

nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX

The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.

NV_CONTRIB

llvm-svn: 156196
2012-05-04 20:18:50 +00:00
Sebastian Pop 2420e8b7d5 Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions.
llvm-svn: 156195
2012-05-04 19:53:56 +00:00
Preston Gurd d6c440cd4c Adds Intel Atom scheduling latencies to X86InstrSystem.td.
llvm-svn: 156194
2012-05-04 19:26:37 +00:00
Matt Beaumont-Gay e82ab6baa7 Pacify GCC's -Wreturn-type
llvm-svn: 156189
2012-05-04 18:34:27 +00:00
Hans Wennborg aea412008e Make ARM and Mips use TargetMachine::getTLSModel()
This moves the logic for selecting a TLS model to a single place,
instead of the previous three (ARM, Mips, and X86 which already
uses this function).

llvm-svn: 156162
2012-05-04 09:40:39 +00:00
Craig Topper bdd2e34b1f Fix some loops to match coding standards. No functional change intended.
llvm-svn: 156159
2012-05-04 06:39:13 +00:00
Craig Topper d4d3237bb8 Fix up some spacing. No functional change.
llvm-svn: 156158
2012-05-04 06:18:33 +00:00
Craig Topper e2ae413746 Simplify broadcast lowering code. No functional change intended.
llvm-svn: 156157
2012-05-04 05:49:51 +00:00
Craig Topper 42f2182366 Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
llvm-svn: 156156
2012-05-04 04:44:49 +00:00
Craig Topper 59063c0a3d Simplify shuffle narrowing code a bit. No functional change intended.
llvm-svn: 156154
2012-05-04 04:08:44 +00:00
Jakob Stoklund Olesen 796e5272ab Remove the SubRegClasses field from RegisterClass descriptions.
This information in now computed by TableGen.

llvm-svn: 156152
2012-05-04 03:30:34 +00:00
Jakob Stoklund Olesen 34a8f13e5f Initialize SparcInstrInfo before SparcTargetLowering.
The TargetLowering construction needs to use a valid TargetRegisterInfo
instance.

llvm-svn: 156146
2012-05-04 02:16:39 +00:00
Jakob Stoklund Olesen 57c7050675 Add a SuperRegClassIterator class.
This iterator class provides a more abstract interface to the (Idx,
Mask) lists of super-registers for a register class. The layout of the
tables shouldn't be exposed to clients.

llvm-svn: 156144
2012-05-04 01:48:29 +00:00
Jakob Stoklund Olesen 2f460ae3b4 Use a shared implementation of getMatchingSuperRegClass().
TargetRegisterClass now gives access to the necessary tables.

llvm-svn: 156122
2012-05-03 22:49:04 +00:00
Kevin Enderby 914223010c Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits
for the assembler and disassembler.  Which were not being set/read correctly
for offsets greater than 22 bits in some cases.

Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!

llvm-svn: 156118
2012-05-03 22:41:56 +00:00
Sirish Pande f8e5e3c072 Support for target dependent Hexagon VLIW packetizer.
This patch creates and optimizes packets as per Hexagon ISA rules.

llvm-svn: 156109
2012-05-03 21:52:53 +00:00
Silviu Baranga 9560af848c Fixed disassembler for vstm/vldm ARM VFP instructions.
llvm-svn: 156077
2012-05-03 16:38:40 +00:00
Sirish Pande c92c31674e Extensions of Hexagon V4 instructions.
This adds new instructions for Hexagon V4 architecture.

llvm-svn: 156071
2012-05-03 16:18:50 +00:00
Craig Topper 242183834a Use 'unsigned' instead of 'int' in a few places dealing with counts of vector elements.
llvm-svn: 156060
2012-05-03 07:26:59 +00:00
Craig Topper 315a5cc789 Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982.
llvm-svn: 156059
2012-05-03 07:12:59 +00:00
Andrew Trick 32aea358e1 Added TargetRegisterInfo::getAllocatableClass.
The ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.

This ensures that targets define register classes as intended.

llvm-svn: 156046
2012-05-03 01:14:37 +00:00
Preston Gurd 926afd7401 For Intel Atom, use ILP scheduling always, instead of ILP for 64 bit
and Hybrid for 32 bit, since benchmarks show ILP scheduling is better
most of the time.

llvm-svn: 156028
2012-05-02 22:02:02 +00:00
Preston Gurd c0b976c42a Change the Intel Atom detection code to recognize
Lincroft and Medfield.

llvm-svn: 156025
2012-05-02 21:38:46 +00:00
Jim Grosbach 28b0b7279e ARM: Add missing two-operand VBIC aliases.
llvm-svn: 156019
2012-05-02 21:11:56 +00:00
Preston Gurd fa3f6cb830 This patch continues the work of adding instruction latencies for X86 Atom,
by providing the latencies for the instructions in X86InstrFPStack.td.

llvm-svn: 155996
2012-05-02 16:03:35 +00:00
Manman Ren f02efc8731 Revert r155853
The commit is intended to fix rdar://10961709.
But it is the root cause of PR12720.
Revert it for now.

llvm-svn: 155992
2012-05-02 15:24:32 +00:00
Richard Barton 0fc56890ba Disallow YIELD and other allocated nop hints in pre-ARMv6 architectures.
llvm-svn: 155983
2012-05-02 09:43:18 +00:00
Craig Topper c73bc39c22 Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter.
llvm-svn: 155982
2012-05-02 08:03:44 +00:00
Jakub Staszak 6126401c83 Remove unneeded break.
llvm-svn: 155959
2012-05-01 23:08:16 +00:00
Jakub Staszak 339380286b Remove trailing spaces.
llvm-svn: 155956
2012-05-01 23:04:38 +00:00
Jim Grosbach 1d20efb837 ARM: Add a few missing add->sub aliases w/ 'w' suffix.
Aliases for adding a negative immediate when using an explicit 'w'
suffix. E.g.,
        adds.w r2, #-16
        adds.w r2, r2, #-16
        addw r2, #-16
        addw r2, #-16
        addw r2, r2, #-16

rdar://11330769

llvm-svn: 155946
2012-05-01 21:17:34 +00:00
Jim Grosbach 70bed4faaf ARM: allow vanilla expressions for movw/movt.
Expressions for movw/movt don't always have an :upper16: or :lower16:
on them and that's ok. When they don't, it's just a plain [0-65536]
immediate result, effectively the same as a :lower16: variant kind.

rdar://10550147

llvm-svn: 155941
2012-05-01 20:43:21 +00:00
Preston Gurd 5ae5278ca1 This patch marks the X86 floating point stack registers ST0-ST7 as reserved
in order to avoid assertion failures in the register scavenger. The assertion
failures were “Bad machine code: Using an undefined physical register” and
“Bad machine code: MBB exits via unconditional fall-through but its successor
differs from its CFG successor!”.

llvm-svn: 155930
2012-05-01 19:50:22 +00:00
Manman Ren 425a55c1ce X86: optimization for max-like struct
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0

FROM
movl    %edi, %ecx
subl    %esi, %ecx
cmpl    %edi, %esi
movl    $0, %eax
cmovll  %ecx, %eax
TO
xorl    %eax, %eax
subl    %esi, %edi
cmovll  %eax, %edi
movl    %edi, %eax

rdar: 10734411
llvm-svn: 155919
2012-05-01 17:16:15 +00:00
Alexey Samsonov c4b3ad8195 X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment
llvm-svn: 155917
2012-05-01 15:16:06 +00:00
Benjamin Kramer cb3e98cf44 Move MipsDisassembler classes into an anonymous namespace.
llvm-svn: 155915
2012-05-01 14:34:24 +00:00
Benjamin Kramer 512c1dce8f Value-initialize global to avoid global construction.
llvm-svn: 155909
2012-05-01 10:48:02 +00:00
Bill Wendling b12f16e75f Change the PassManager from a reference to a pointer.
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468

llvm-svn: 155902
2012-05-01 08:27:43 +00:00
Craig Topper 05eb6e096a Allow BMI, AES, F16C, POPCNT, FMA3, and CLMUL to be detected on AMD processors.
llvm-svn: 155899
2012-05-01 07:10:32 +00:00
Craig Topper bae0e9ea1d Make XOP and FMA4 require SSE4A to match GCC behavior. Use this to simplify Bulldozer feature list.
llvm-svn: 155897
2012-05-01 06:54:48 +00:00
Craig Topper d32ebcc36b Attempt to handle MRMInitReg in emitVEXOpcodePrefix. Hopefully fixes PR12711.
llvm-svn: 155896
2012-05-01 06:34:01 +00:00
Craig Topper 43518cc55f Make XOP imply AVX as its needed to legalize the registers types.
llvm-svn: 155891
2012-05-01 05:41:41 +00:00
Craig Topper c0cef32b83 Remove HasSSE2 from AES and CLMUL predicates. It's now implied by the HasAES and HasCLMUL predicates.
llvm-svn: 155890
2012-05-01 05:35:02 +00:00
Craig Topper 29dd148a71 Make CLMUL and AES imply SSE2 since its needed to legalize the type.
llvm-svn: 155888
2012-05-01 05:28:32 +00:00
Craig Topper 0eacda5f69 Enable AVX and FMA4 for AMD Bulldozer processors.
llvm-svn: 155885
2012-05-01 05:18:13 +00:00
Manman Ren 4f4d5c8fc8 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

llvm-svn: 155853
2012-04-30 22:51:25 +00:00
Jim Grosbach e78031a9f3 ARM: Diagnostics for out of range fixups.
Replace some assert() calls w/ actual diagnostics. In a perfect world,
there'd be range checks on these values long before things ever reached
this code. For now, though, issuing a better-late-than-never diagnostic
is still a big improvement over assert().

rdar://11347287

llvm-svn: 155851
2012-04-30 22:30:43 +00:00
Jakob Stoklund Olesen 8503ba984f Fix address calculation error from r155744.
This was exposed by SingleSource/UnitTests/Vector/constpool.c.

The computed size of a basic block isn't always a multiple of its known
alignment, and that can introduce extra alignment padding after the
block.

<rdar://problem/11347135>

llvm-svn: 155845
2012-04-30 20:19:00 +00:00
Chad Rosier d427d51c2b Tidy up. No functional change intended.
llvm-svn: 155832
2012-04-30 17:47:15 +00:00
Derek Schuff b051adf263 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

(this time, actually commit what was reviewed!)

llvm-svn: 155825
2012-04-30 16:57:15 +00:00
Bob Wilson 9245c93656 Don't introduce illegal types when creating vmull operations. <rdar://11324364>
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal.  Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.

llvm-svn: 155824
2012-04-30 16:53:34 +00:00
Craig Topper 55b3990837 No need to normalize index before calling Extract128BitVector
llvm-svn: 155811
2012-04-30 05:17:10 +00:00
Pete Cooper f76b5fe5ab Copied all the VEX prefix encoding code from X86MCCodeEmitter to the x86 JIT emitter. Needs some major refactoring as these two code emitters are almost identical
llvm-svn: 155810
2012-04-30 03:56:44 +00:00
Jakub Staszak da03f3ba64 Remove unneeded casts. No functionality change.
llvm-svn: 155800
2012-04-29 20:52:53 +00:00
Craig Topper 3b94fa63d6 Simplify code a bit. No functional change intended.
llvm-svn: 155798
2012-04-29 20:22:05 +00:00
Kalle Raiskila 4c5f83ea19 Update the documentation of CellSPU, in case it gets removed in 3.1.
llvm-svn: 155797
2012-04-29 20:00:55 +00:00
Jakob Stoklund Olesen ae7521d1e4 Fix a problem with blocks that need to be split twice.
The code could search past the end of the basic block when there was
already a constant pool entry after the block.

Test case with giant basic block in SingleSource/UnitTests/Vector/constpool.c

llvm-svn: 155753
2012-04-28 06:21:38 +00:00
Jim Grosbach c6f32b3295 ARM: Thumb add(sp plus register) asm constraints.
Make sure when parsing the Thumb1 sp+register ADD instruction that
the source and destination operands match. In thumb2, just use the
wide encoding if they don't. In Thumb1, issue a diagnostic.

rdar://11219154

llvm-svn: 155748
2012-04-27 23:51:36 +00:00
Jim Grosbach 9d8f6f3d9d ARM: Tweak tADDrSP definition for consistent operand order.
Make the operand order of the instruction match that of the asm syntax.

llvm-svn: 155747
2012-04-27 23:51:33 +00:00
Derek Schuff a99b168145 Revert r155745
llvm-svn: 155746
2012-04-27 23:37:41 +00:00
Derek Schuff bbf8b83e90 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

llvm-svn: 155745
2012-04-27 23:27:17 +00:00
Jakob Stoklund Olesen 5f0d1b462c Track worst case alignment padding more accurately.
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:

  RoundUpToAlignment(Offset + UnknownPadding)

This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the woorst case alignment padding
wss already included.

This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.

The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.

<rdar://problem/11339352>

llvm-svn: 155744
2012-04-27 22:58:38 +00:00
Craig Topper 0fa6c7e593 Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements.
llvm-svn: 155742
2012-04-27 22:54:43 +00:00
Chad Rosier 32c2178ef3 Add x86-specific DAG combine to simplify:
x == -y --> x+y == 0
 x != -y --> x+y != 0

On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je    .LBB0_2
to
   addl    %esi, %edi
   je    .L4

This case is correctly handled for ARM with "cmn".

Patch by Manman Ren.
rdar://11245199
PR12545

llvm-svn: 155739
2012-04-27 22:33:25 +00:00
Craig Topper 42cd8d2c00 Tidy up spacing.
llvm-svn: 155733
2012-04-27 21:05:09 +00:00
Lang Hames ea001225c1 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.

llvm-svn: 155724
2012-04-27 18:51:24 +00:00
Richard Barton 82f95ea2ad Fix ARM assembly parsing for upper case condition codes on IT instructions.
llvm-svn: 155720
2012-04-27 17:34:01 +00:00
Benjamin Kramer 913da4b261 X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.

Fixes PR6679. Patch by Christoph Erhardt!

llvm-svn: 155704
2012-04-27 12:07:43 +00:00
Richard Barton f435b09eaf Refactor IT handling not to store the bottom bit of the condition code in the mask operand in the MCInst.
llvm-svn: 155700
2012-04-27 08:42:59 +00:00
Evan Cheng 1ec87ee096 Implement a bastardized ABI.
llvm-svn: 155686
2012-04-27 02:11:10 +00:00
Evan Cheng f52003de56 - thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't suppport 32-bit Thumb2
instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541

llvm-svn: 155685
2012-04-27 01:27:19 +00:00
Jim Grosbach 3d6c629e26 ARM: Thumb ldr(literal) base address alignment is 32-bits.
The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.

rdar://11314619

llvm-svn: 155659
2012-04-26 20:48:12 +00:00
Preston Gurd 81290f4be5 Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when llvm auto detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655
2012-04-26 19:52:27 +00:00
Tim Northover 3de97b7a86 Use VLD1 in NEON extenting-load patterns instead of VLDR.
On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.

llvm-svn: 155630
2012-04-26 08:46:29 +00:00
Tim Northover 6699a60b0e Test commit.
llvm-svn: 155626
2012-04-26 08:24:07 +00:00
Craig Topper 08ccfbe57b Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.
llvm-svn: 155618
2012-04-26 06:40:15 +00:00
Evan Cheng 9f7ad310b5 If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.

rdar://11318438

llvm-svn: 155601
2012-04-26 01:13:36 +00:00
Richard Barton ba5b0cc82e Unify internal representation of ARM instructions with a register right-shifted by #32. These are stored as shifts by #0 in the MCInst and correctly marshalled when transforming from or to assembly representation.
llvm-svn: 155565
2012-04-25 18:00:18 +00:00
Craig Topper 3ec7c2aa84 Add ifdef around getSubtargetFeatureName in tablegen output file so that only targets that want the function get it. This prevents other targets from getting an unused function warning.
llvm-svn: 155538
2012-04-25 06:56:34 +00:00
Craig Topper 5ff6dc34b9 Use vector_shuffles instead of target specific unpack nodes for AVX ZERO_EXTEND/ANY_EXTEND combine. These will be converted to target specific nodes during lowering. This is more consistent with other code.
llvm-svn: 155537
2012-04-25 06:39:39 +00:00
Akira Hatanaka 2020e27d6d Do not use $gp as a dedicated global register if the target ABI is not O32.
llvm-svn: 155522
2012-04-25 01:24:52 +00:00
Jim Grosbach 5117ef7453 ARM: improved assembler diagnostics for missing CPU features.
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.

rdar://11257547

llvm-svn: 155499
2012-04-24 22:40:08 +00:00
Jim Grosbach 1e75fc1fe1 ARM: Nuke remnant bogus code.
r154362 was supposed to delete this bit, but obviously didn't.

rdar://11305594

llvm-svn: 155465
2012-04-24 18:39:47 +00:00
Nadav Rotem 810734b7f4 AVX: Add additional vbroadcast replacement sequences for integers.
Remove the v2f64 patterns because it does not match any vbroadcast
instruction.

llvm-svn: 155461
2012-04-24 18:09:59 +00:00
Nadav Rotem 7b7b99c74a AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8
immediate. We can't use it here because the shuffle code does not check that
the lower part of the word is identical to the upper part.

llvm-svn: 155440
2012-04-24 11:27:53 +00:00
Richard Barton e9600009e9 Refactor Thumb ITState handling in ARM Disassembler to more efficiently use its vector
llvm-svn: 155439
2012-04-24 11:13:20 +00:00
Nadav Rotem aa3ff8da00 AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions
using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern new users are added to the load node, which prevent the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).

llvm-svn: 155437
2012-04-24 11:07:03 +00:00
Craig Topper 0b65c40821 Remove dangling spaces. Fix some other formatting.
llvm-svn: 155429
2012-04-24 06:36:35 +00:00
Craig Topper 6f2a535de2 Simplify code a bit and make it compile better. Remove unused parameters.
llvm-svn: 155428
2012-04-24 06:02:29 +00:00
Jim Grosbach 671ad2a572 Tidy up. 80 columns, whitespace, et. al.
llvm-svn: 155399
2012-04-23 22:04:10 +00:00
Nadav Rotem 3f8acfc3c4 Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics).
llvm-svn: 155397
2012-04-23 21:53:37 +00:00
Preston Gurd 9a0914753a This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395
2012-04-23 21:39:35 +00:00
Jim Grosbach 41e94d79be ARM: VSLI two-operand assmebly aliases are tblgen'erated.
llvm-svn: 155393
2012-04-23 21:22:04 +00:00
Jim Grosbach 3dada484c3 ARM: tblgen'erate VSRA/VRSRA/VSRI assembly two-operand aliases.
llvm-svn: 155392
2012-04-23 21:00:49 +00:00
Jim Grosbach e5012fbad3 ARM: vqdmulh two-operand aliases are tblgen'erated now.
llvm-svn: 155387
2012-04-23 20:37:20 +00:00
Chandler Carruth 3c3bb55a85 Revert r155365, r155366, and r155367. All three of these have regression
test suite failures. The failures occur at each stage, and only get
worse, so I'm reverting all of them.

Please resubmit these patches, one at a time, after verifying that the
regression test suite passes. Never submit a patch without running the
regression test suite.

llvm-svn: 155372
2012-04-23 18:25:57 +00:00
Sirish Pande a3f8ba2439 Hexagon V5 (floating point) support.
llvm-svn: 155367
2012-04-23 17:49:40 +00:00
Sirish Pande 2c7bf00fba Support for Hexagon architectural feature, new value jump.
llvm-svn: 155366
2012-04-23 17:49:28 +00:00
Sirish Pande 6cd2251598 Support for Hexagon VLIW Packetizer.
llvm-svn: 155365
2012-04-23 17:49:20 +00:00
Craig Topper 153bb34a3c Use MVT instead of EVT through all of LowerVECTOR_SHUFFLEtoBlend and not just the switch. Saves a little bit of binary size.
llvm-svn: 155339
2012-04-23 07:36:33 +00:00
Craig Topper 0a2c809d09 Make getZeroVector and getOnesVector more alike as far as how they detect 128-bit versus 256-bit vectors. Be explicit about both sizes and use llvm_unreachable. Similar changes to getLegalSplat.
llvm-svn: 155337
2012-04-23 07:24:41 +00:00
Craig Topper 2bbe8bcf4e Tidy up by removing some 'else' after 'return'
llvm-svn: 155336
2012-04-23 06:57:04 +00:00
Craig Topper 5c51eeecfc Tidy up spacing in LowerVECTOR_SHUFFLEtoBlend. Remove code that checks if shuffle operand has a different type than the the shuffle result since it can never happen.
llvm-svn: 155333
2012-04-23 06:38:28 +00:00
Craig Topper a52f0d09b6 Add a couple llvm_unreachables.
llvm-svn: 155332
2012-04-23 03:42:40 +00:00
Craig Topper 984dc015ae Remove some tab characers.
llvm-svn: 155331
2012-04-23 03:28:34 +00:00
Craig Topper ea428fd79c Remove some 'else' after 'return'. No functional change.
llvm-svn: 155330
2012-04-23 03:26:18 +00:00
Craig Topper bf7d5666f0 Make Extract128BitVector and Insert128BitVector take an unsigned instead of an ConstantNode SDValue. getConstant was almost always called just before only to have the functions take it apart and build a new ConstantSDNode.
llvm-svn: 155325
2012-04-22 20:55:18 +00:00
Craig Topper 2d474d6d92 Convert getNode(UNDEF) to getUNDEF.
llvm-svn: 155321
2012-04-22 19:29:34 +00:00
Craig Topper 860ed0d20a Make calls to getVectorShuffle more consistent. Use shuffle VT for calls to getUNDEF instead of requerying. Use &Mask[0] instead of Mask.data().
llvm-svn: 155320
2012-04-22 19:17:57 +00:00
Craig Topper 43397c0900 Tidy up. 80 columns and argument alignment.
llvm-svn: 155319
2012-04-22 18:51:37 +00:00
Craig Topper ad56a744f1 Simplify code by converting multiple places that were manually concatenating 128-bit vectors to use either CONCAT_VECTORS or a helper function. CONCAT_VECTORS will itself be lowered to the same pattern as before. The helper function is needed for concats of BUILD_VECTORs since getNode(CONCAT_VECTORS) will just return a large BUILD_VECTOR and we may be trying to lower large BUILD_VECTORS when this occurs.
llvm-svn: 155318
2012-04-22 18:15:59 +00:00
Benjamin Kramer 8877d68db7 ARM: Initialize the HasRAS bit.
Found by valgrind.

llvm-svn: 155313
2012-04-22 11:52:41 +00:00
Elena Demikhovsky 8d7e56c409 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Bill Wendling f9774c3253 Remove some potential warnings about variables used uninitialized.
llvm-svn: 155307
2012-04-22 07:23:04 +00:00
Craig Topper 6eadae8e60 Make some fixed arrays const. Use array_lengthof in a couple places instead of a hardcoded number.
llvm-svn: 155294
2012-04-21 18:58:38 +00:00
Craig Topper 2568bf3089 Tidy up. 80 columns and some other spacing issues.
llvm-svn: 155291
2012-04-21 18:13:35 +00:00
NAKAMURA Takumi e30303fa86 llvm/lib/Target: [PR12611] Add "llvm/Support/raw_ostream.h" for Debug build on MSVC.
Thanks to Andy Gibbs, to report the issue.

llvm-svn: 155287
2012-04-21 15:31:45 +00:00
NAKAMURA Takumi 54eed760da HexagonISelLowering.cpp: Reorder #includes.
llvm-svn: 155286
2012-04-21 15:31:36 +00:00
NAKAMURA Takumi df3d5ea990 HexagonInstPrinter.cpp: Suppress -Wunused-variable warnings with -Asserts.
llvm-svn: 155281
2012-04-21 11:24:55 +00:00
Jim Grosbach c931d451cd ARM: tblgen'erate more NEON two-operand aliases.
VMUL and VEXT.

llvm-svn: 155258
2012-04-20 23:46:33 +00:00
Jim Grosbach b4e849b924 ARM: tblgen'erate more NEON two-operand aliases.
llvm-svn: 155254
2012-04-20 23:30:14 +00:00
Jim Grosbach 2937df45a8 ARM: Update NEON assembly two-operand aliases.
Use the new TwoOperandAliasConstraint to handle lots of the two-operand aliases
for NEON instructions. There's still more to go, but this is a good chunk of
them.

llvm-svn: 155210
2012-04-20 18:12:54 +00:00
Gabor Greif c8a9abe9df effectively back out my last change (r155190)
llvm-svn: 155195
2012-04-20 11:41:38 +00:00
Gabor Greif 9eccbe9c82 fix obviously bogus (IMO) operand index of the load in asserts
(load only has one operand) and smuggle in some whitespace changes too

NB: I am obviously testing the water here, and believe that the unguarded
    cast is still wrong, but why is the getZExtValue of the load's operand
    tested against zero here? Any review is appreciated.
llvm-svn: 155190
2012-04-20 08:58:49 +00:00
Craig Topper c7242e054d Convert more uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155188
2012-04-20 07:30:17 +00:00
Craig Topper abadc660e0 Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155186
2012-04-20 06:31:50 +00:00
Jim Grosbach 9cc324d31a ARM some VFP tblgen'erated two-operand aliases.
llvm-svn: 155178
2012-04-20 00:15:00 +00:00
Jim Grosbach 6b46134862 ARM let TableGen handle a few two-operand aliases.
No need for these explicit aliases anymore. Nuke 'em.

llvm-svn: 155173
2012-04-19 23:59:26 +00:00
Gabor Greif 180c4445cf zap tabs
llvm-svn: 155128
2012-04-19 15:16:31 +00:00
Kevin Enderby ec4bd31206 Fixed the llvm-mv X86 disassembler so the 'C' API gets jumps properly
symbolicated.  These have and operand type of TYPE_RELv which was not handled
as isBranch in translateImmediate() in X86Disassembler.cpp.  rdar://11268426 

llvm-svn: 155074
2012-04-18 23:12:11 +00:00
Chandler Carruth b415bf98f0 This reverts a long string of commits to the Hexagon backend. These
commits have had several major issues pointed out in review, and those
issues are not being addressed in a timely fashion. Furthermore, this
was all committed leading up to the v3.1 branch, and we don't need piles
of code with outstanding issues in the branch.

It is possible that not all of these commits were necessary to revert to
get us back to a green state, but I'm going to let the Hexagon
maintainer sort that out. They can recommit, in order, after addressing
the feedback.

Reverted commits, with some notes:

Primary commit r154616: HexagonPacketizer
  - There are lots of review comments here. This is the primary reason
    for reverting. In particular, it introduced large amount of warnings
    due to a bad construct in tablegen.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154622: CMake fixes
    - r154660: Fix numerous build warnings in release builds.
  - Please don't resubmit this until the three commits above are
    included, and the issues in review addressed.

Primary commit r154695: Pass to replace transfer/copy ...
  - Reverted to minimize merge conflicts. I'm not aware of specific
    issues with this patch.

Primary commit r154703: New Value Jump.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154703: Remove iostream usage
    - r154758: Fix CMake builds
    - r154759: Fix build warnings in release builds
  - Please incorporate these fixes and and review feedback before
    resubmitting.

Primary commit r154829: Hexagon V5 (floating point) support.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154841: Remove unused variable (fixing build warnings)

There are also accompanying Clang commits that will be reverted for
consistency.

llvm-svn: 155047
2012-04-18 21:31:19 +00:00
Akira Hatanaka fc1d00bbd6 Mark instruction classes ArithLogicR, ArithLogicI and LoadUpper as isRematerializable.
llvm-svn: 155031
2012-04-18 18:52:10 +00:00
Akira Hatanaka 4167bb9346 Delete blank line.
llvm-svn: 155030
2012-04-18 18:47:17 +00:00
Silviu Baranga ca45af9a75 Added support for disassembling unpredictable swp/swpb ARM instructions.
llvm-svn: 155004
2012-04-18 14:18:57 +00:00
Silviu Baranga d5c6a63a50 Fix the bahavior of the disassembler when decoding unpredictable mrs instructions on ARM. Now the diasassembler emmits warnings instead of errors.
llvm-svn: 155002
2012-04-18 14:09:07 +00:00
Silviu Baranga 41f1fcd80e Added support for unpredictable mcrr/mcrr2/mrrc/mrrc2 ARM instruction in the disassembler. Since the upredicability conditions are complex, C++ code was added to handle them.
llvm-svn: 155001
2012-04-18 13:12:50 +00:00
Silviu Baranga a2944116dc Fixed decoding for the ARM cdp2 instruction. The restriction on the coprocessor number was removed for this instruction.
llvm-svn: 155000
2012-04-18 13:02:55 +00:00
Silviu Baranga 9da1918c84 Add suport for unpredicatble cases of the cmp, tst, teq and cmnz ARM instructions in the disassembler.
llvm-svn: 154999
2012-04-18 12:48:43 +00:00
Craig Topper d3c9e404ba Remove AVX vpermil intrinsics. I removed their uses from clang headers and builtins a while back.
llvm-svn: 154985
2012-04-18 05:24:00 +00:00
Joe Groff a81bcbb9bb fix pr12559: mark unavailable win32 math libcalls
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint

llvm-svn: 154960
2012-04-17 23:05:54 +00:00
Chad Rosier 41675546eb Typo.
llvm-svn: 154953
2012-04-17 21:48:36 +00:00
Akira Hatanaka 236e14017f Delete latter half of CMakeLists.txt.
llvm-svn: 154936
2012-04-17 18:18:09 +00:00
Akira Hatanaka 71928e681b Add disassembler to MIPS.
Patch by Vladimir Medic. 

llvm-svn: 154935
2012-04-17 18:03:21 +00:00
Jay Foad 08a0598cd4 Remove unused CCIfSubtarget.
llvm-svn: 154921
2012-04-17 11:29:05 +00:00
James Molloy a9bcf20d22 Fix bad EXTRACT_SUBREG in instruction selection for extending-loads on NEON.
llvm-svn: 154915
2012-04-17 08:18:00 +00:00
Craig Topper 354103d8ca Don't decode vperm2i128 or vperm2f128 into a shuffle if bit 3 or 7 of the immediate is set.
llvm-svn: 154907
2012-04-17 05:54:54 +00:00
Kevin Enderby 29ae538647 Fix ARM disassembly of VLD2 (single 2-element structure to all lanes)
instructions with writebacks. And add test a case for all opcodes handed by
DecodeVLD2DupInstruction() in ARMDisassembler.cpp .

llvm-svn: 154884
2012-04-17 00:49:27 +00:00
Jim Grosbach 2bf5f73977 ARM two-operand forms for vhadd and vhsub instructions.
rdar://11252521

llvm-svn: 154875
2012-04-16 23:00:25 +00:00
Preston Gurd 5333e2e5ce Temporarily turn off anti-dependency checking
during Post RA scheduling in X86,
until the X86 target is changed to properly set up
post RA liveness.

llvm-svn: 154874
2012-04-16 22:52:28 +00:00
Jim Grosbach 003607f474 ARM handle :lower16: and :upper16: after a '#' prefix.
rdar://11252521

llvm-svn: 154862
2012-04-16 21:18:46 +00:00
Richard Smith 12da79b859 Fix incorrect atomics codegen introduced in r154705, and extend test to catch it.
llvm-svn: 154845
2012-04-16 18:43:53 +00:00
David Blaikie e67cdc07a5 Remove unused variable
llvm-svn: 154841
2012-04-16 18:10:13 +00:00
Jim Grosbach 6068d0014a ARM assembly two-operand forms for VRSHL.
rdar://11252521

llvm-svn: 154840
2012-04-16 18:03:16 +00:00
Akira Hatanaka 3e9d81f47c Do not add offset in applyFixup. This has already been accounted for in Value.
llvm-svn: 154838
2012-04-16 18:00:19 +00:00
Jim Grosbach cd1c000a9f ARM two-operand aliases for VRHADD instructions.
rdar://11252521

llvm-svn: 154832
2012-04-16 17:14:11 +00:00
Sirish Pande 96e8ee17e0 Hexagon V5 (Floating Point) Support.
llvm-svn: 154829
2012-04-16 17:05:06 +00:00
Craig Topper 4badeb3f0d Replace vpermd/vpermps intrinic patterns with custom lowering to target specific nodes.
llvm-svn: 154801
2012-04-16 07:13:00 +00:00
Craig Topper 26d7a94981 Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps.
llvm-svn: 154798
2012-04-16 06:43:40 +00:00
Craig Topper c0075aa7ff Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second.
llvm-svn: 154797
2012-04-16 06:26:15 +00:00
Craig Topper b86fa404d3 Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper b04fe34030 Fix SDTypeProfile for vpermps. The mask operand should be v8i32.
llvm-svn: 154781
2012-04-16 00:12:20 +00:00
Craig Topper 1f8c9eb925 Spacing fixes and 80 column fixes. Use 0 instead of 0x80 for undef indices in vpermps/vpermd. Hardware only looks at lower 3-bits.
llvm-svn: 154780
2012-04-15 23:48:57 +00:00
Craig Topper bfc9a5f7d3 Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors.
llvm-svn: 154778
2012-04-15 22:43:31 +00:00
Nadav Rotem 42bcd04ee3 Fix PR12529. The Vxx family of instructions are only supported by AVX.
Use non-vex instructions for SSE4.

llvm-svn: 154770
2012-04-15 19:36:44 +00:00
Benjamin Kramer 673824b4a1 Wire up support for diagnostic ranges in the ARMAsmParser.
As an example, attach range info to the "invalid instruction" message:

$ clang -arch arm -c asm.c
asm.c:2:11: error: invalid instruction
  __asm__("foo r0");
          ^
<inline asm>:1:2: note: instantiated into assembly here
        foo r0
        ^~~

llvm-svn: 154765
2012-04-15 17:04:27 +00:00
Elena Demikhovsky 779a72b49e Added VPERM optimization for AVX2 shuffles
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
NAKAMURA Takumi 67de410135 HexagonCopyToCombine.cpp: Silence two warnings, -Wunused-variable, with -Asserts.
llvm-svn: 154759
2012-04-15 05:33:43 +00:00
NAKAMURA Takumi 355eebf4cf Target/Hexagon: Tweak to fix msvc build.
llvm-svn: 154758
2012-04-15 05:09:09 +00:00
Richard Smith 3e8f1f6aea Fix X86 codegen for 'atomicrmw nand' to generate *x = ~(*x & y), not *x = ~*x & y.
llvm-svn: 154705
2012-04-13 22:47:00 +00:00
Sirish Pande f4db4b2cb4 Remove iostream from New Value Jump.
llvm-svn: 154703
2012-04-13 21:01:35 +00:00
Sirish Pande 0e6e36d1d0 Add support for Hexagon Architectural feature, New Value Jump.
llvm-svn: 154696
2012-04-13 20:22:31 +00:00
Sirish Pande a8071a0f88 Pass to replace tranfer/copy instructions into combine instruction where possible.
llvm-svn: 154695
2012-04-13 20:22:19 +00:00
Evan Cheng 267a4ada52 On Darwin targets, only use vfma etc. if the source use fma() intrinsic explicitly.
llvm-svn: 154689
2012-04-13 18:59:28 +00:00
Kevin Enderby c407cc7a40 For ARM disassembly only print 32 unsigned bits for the address of branch
targets so if the branch target has the high bit set it does not get printed as:
	 beq     0xffffffff8008c404

llvm-svn: 154685
2012-04-13 18:46:37 +00:00
Craig Topper eb455832b4 Silence various build warnings from Hexagon backend that show up in release builds. Mostly converting 'assert(0)' to 'llvm_unreachable' to silence warnings about missing returns. Also fold some variable declarations into asserts to prevent the variables from being unused in release builds.
llvm-svn: 154660
2012-04-13 06:38:11 +00:00
Kevin Enderby 40d4e47003 Fix a few more places in the ARM disassembler so that branches get
symbolic operands added when using the C disassembler API.

llvm-svn: 154628
2012-04-12 23:13:34 +00:00
Ted Kremenek 967aaa956f Update CMake build.
llvm-svn: 154622
2012-04-12 22:15:23 +00:00
Evandro Menezes 6a6a66e313 Hexagon: fix CMake error.
llvm-svn: 154620
2012-04-12 21:44:58 +00:00
Sirish Pande b486144c12 HexagonPacketizer patch.
llvm-svn: 154616
2012-04-12 21:06:38 +00:00
Evan Cheng 3e869f002c Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106
llvm-svn: 154604
2012-04-12 19:14:21 +00:00
Evandro Menezes 5cee621c88 Hexagon: enable assembler output through the MC layer.
llvm-svn: 154597
2012-04-12 17:55:53 +00:00
Benjamin Kramer df4477c506 Remove README entry obsoleted by register masks.
llvm-svn: 154588
2012-04-12 12:47:29 +00:00
Craig Topper d0271b27cb Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions.
llvm-svn: 154580
2012-04-12 07:23:00 +00:00
Jim Grosbach 4324f426ce ARM 'adr' fixups don't need the interworking addend tweaking.
They reference the PC directly, so things work properly that way.

rdar://11231229

llvm-svn: 154576
2012-04-12 01:19:35 +00:00
Akira Hatanaka 47ad674f67 Emit neg.s or neg.d only if -enable-no-nans-fp-math is supplied by user,
otherwise expand FNEG during legalization.

llvm-svn: 154546
2012-04-11 22:59:08 +00:00
Akira Hatanaka 7f4c9d1429 Emit abs.s or abs.d only if -enable-no-nans-fp-math is supplied by user.
Invalid operation is signaled if the operand of these instructions is NaN.

llvm-svn: 154545
2012-04-11 22:49:04 +00:00
Kevin Enderby 72f18bbcff Fixed a case of ARM disassembly getting an assert on a bad encoding
of a VST instruction.

llvm-svn: 154544
2012-04-11 22:40:17 +00:00
Akira Hatanaka 4f5c8421b3 Fix bugs in lowering of FCOPYSIGN nodes.
- FCOPYSIGN nodes that have operands of different types were not handled.
- Different code was generated depending on the endianness of the target.

Additionally, code is added that emits INS and EXT instructions, if they are
supported by target (they are R2 instructions).

llvm-svn: 154540
2012-04-11 22:13:04 +00:00
Jim Grosbach 6e536de1a1 ARM 'vuzp.32 Dd, Dm' is a pseudo-instruction.
While there is an encoding for it in VUZP, the result of that is undefined,
so we should avoid it. Define the instruction as a pseudo for VTRN.32
instead, as the ARM ARM indicates.

rdar://11222366

llvm-svn: 154511
2012-04-11 17:40:18 +00:00
Jim Grosbach 4640c8169f ARM 'vzip.32 Dd, Dm' is a pseudo-instruction.
While there is an encoding for it in VZIP, the result of that is undefined,
so we should avoid it. Define the instruction as a pseudo for VTRN.32
instead, as the ARM ARM indicates.

rdar://11221911

llvm-svn: 154505
2012-04-11 16:53:25 +00:00
Nadav Rotem 372cf15125 remove unused argument
llvm-svn: 154494
2012-04-11 11:05:21 +00:00
Duncan Sands 264d2e7121 Add a C binding to the Target and TargetMachine classes to allow for emitting
binary and assembly. Patch by Carlo Kok.  Emitting was inspired by but not based
on the D llvm bindings. 

llvm-svn: 154493
2012-04-11 10:25:24 +00:00
Evan Cheng 5efc442290 Add more fused mul+add/sub patterns. rdar://10139676
llvm-svn: 154484
2012-04-11 06:59:47 +00:00
Nadav Rotem 9bc178ac5c Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.

llvm-svn: 154483
2012-04-11 06:40:27 +00:00
Evan Cheng 48346c1cd9 Clean up ARM fused multiply + add/sub support some more: rename some isel
predicates.
Also remove NEON2 since it's not really useful and it is confusing. If
NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it
really mean?

rdar://10139676

llvm-svn: 154480
2012-04-11 05:33:07 +00:00
Evan Cheng 67a09fc397 Match (fneg (fma) to vfnma. rdar://10139676
llvm-svn: 154469
2012-04-11 01:21:25 +00:00
Charles Davis 74c282b5ef Add retw and lretw instructions. Also, fix Intel syntax parsing for all
ret instructions.

llvm-svn: 154468
2012-04-11 01:10:53 +00:00
Kevin Enderby d2980cd041 Fix ARM disassembly of VLD instructions with writebacks.  And add test a case
for all opcodes handed by DecodeVLDInstruction() in ARMDisassembler.cpp .

llvm-svn: 154459
2012-04-11 00:25:40 +00:00
Jim Grosbach ad66de155b ARM add missing Thumb1 two-operand aliases for shift-by-immediate.
rdar://11222742

llvm-svn: 154457
2012-04-11 00:15:16 +00:00
Evan Cheng aca6c822e6 Fix a number of problems with ARM fused multiply add/subtract instructions.
1. The new instruction itinerary entries are not properly described.
2. The asm parser can't handle vfms and vfnms.
3. There were no assembler, disassembler test cases.
4. HasNEON2 has the wrong assembler predicate.
rdar://10139676

llvm-svn: 154456
2012-04-11 00:13:00 +00:00
Evan Cheng d0007f3c83 Handle llvm.fma.* intrinsics. rdar://10914096
llvm-svn: 154439
2012-04-10 21:40:28 +00:00
Chad Rosier f7345b027a Whitespace.
llvm-svn: 154427
2012-04-10 19:42:07 +00:00
Chad Rosier 235a7a1746 Revert r154396, which looks to be the real culprit behind the bot failures.
llvm-svn: 154426
2012-04-10 19:39:18 +00:00
Eric Christopher 65ada95b84 Temporarily revert this patch to see if it brings the buildbots back.
llvm-svn: 154425
2012-04-10 19:33:16 +00:00
Jim Grosbach df5a244797 ARM fix cc_out operand handling for t2SUBrr instructions.
We were incorrectly conflating some add variants which don't have a
cc_out operand with the mirroring sub encodings, which do. Part of the
awesome non-orthogonality legacy of thumb1. Similarly, handling of
add/sub of an immediate was sometimes incorrectly removing the cc_out
operand for add/sub register variants.

rdar://11216577

llvm-svn: 154411
2012-04-10 17:31:55 +00:00
David Blaikie 2735136655 Remove unused variable.
llvm-svn: 154398
2012-04-10 15:23:13 +00:00
Nadav Rotem f934f91709 Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.

llvm-svn: 154396
2012-04-10 14:33:13 +00:00
Evan Cheng f8bad08001 Fix a long standing tail call optimization bug. When a libcall is emitted
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370
2012-04-10 01:51:00 +00:00
Jim Grosbach 8f99bc3aed ARM LDR/LDRT has the same encoding collision as STR/STRT.
Generalized logic of r154141.

llvm-svn: 154362
2012-04-10 00:13:07 +00:00
Chad Rosier e0e38f61a5 When performing a truncating store, it's possible to rearrange the data
in-register, such that we can use a single vector store rather then a 
series of scalar stores.

For func_4_8 the generated code

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vmov.u16	r0, d16[3]
	strb	r0, [r2, #3]
	vmov.u16	r0, d16[2]
	strb	r0, [r2, #2]
	vmov.u16	r0, d16[1]
	strb	r0, [r2, #1]
	vmov.u16	r0, d16[0]
	strb	r0, [r2]
	bx	lr

becomes

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vuzp.8	d16, d17
	vst1.32	{d16[0]}, [r2, :32]
	bx	lr

I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.

This

	ldrh	r0, [r0, #4]
	strh	r0, [r1]

becomes

	vldr	d16, [r0]
	vmov.u16	r0, d16[2]
	vmov.32	d16[0], r0
	vuzp.16	d16, d17
	vst1.32	{d16[0]}, [r1, :32]

PR11158
rdar://10703339

llvm-svn: 154340
2012-04-09 20:32:02 +00:00
Chad Rosier 99cbde9e82 Update comments and remove unnecessary isVolatile() check.
llvm-svn: 154336
2012-04-09 19:38:15 +00:00
David Blaikie e6b6fae8ff Fix accidentally constant conditions found by uncommitted improvements to -Wconstant-conversion.
A couple of cases where we were accidentally creating constant conditions by
something like "x == a || b" instead of "x == a || x == b". In one case a
conditional & then unreachable was used - I transformed this into a direct
assert instead.

llvm-svn: 154324
2012-04-09 16:29:35 +00:00
Preston Gurd 2eec367227 This patch adds X86 instruction itineraries, which were missed by the
original patch to add itineraries, to X86InstrArithmetc.td.  

llvm-svn: 154320
2012-04-09 15:32:22 +00:00
Nadav Rotem fb7e2ae53c Lower some x86 shuffle sequences to the vblend family of instructions.
llvm-svn: 154313
2012-04-09 08:33:21 +00:00
Nadav Rotem b801ca3976 Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.

llvm-svn: 154310
2012-04-09 07:45:58 +00:00
Chandler Carruth 3779ac10b4 Cleanup and relax a restriction on the matching of global offsets into
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
   the match to try to fit RIP into the base register. If it fails, it
   now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
   as it did before. The reason these immediates are safe is because the
   ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
   This is the only case changed by the patch, and the primary place you
   see it is in TLS, either the win64 section offset TLS or Linux
   local-exec TLS model in a PIC compilation. Here the ABI again ensures
   that the immediates fit because we are in small mode, and any other
   operations required due to the PIC relocation model have been handled
   externally to the Wrapper node (extra loads etc are made around the
   wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.

llvm-svn: 154304
2012-04-09 02:13:06 +00:00
Chandler Carruth ede4a8aa2b Teach LLVM about a PIE option which, when enabled on top of PIC, makes
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.

I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, its just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]

I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.

llvm-svn: 154294
2012-04-08 17:51:45 +00:00
Chandler Carruth 16f0ebcbb5 Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.

llvm-svn: 154292
2012-04-08 17:20:55 +00:00
Nadav Rotem 82609df647 AVX2: Build splat vectors by broadcasting a scalar from the constant pool.
Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandybridge we continue to load the broadcasted value from the constant pool.

llvm-svn: 154284
2012-04-08 12:54:54 +00:00
Craig Topper d024cef233 Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper aa9aab5ad2 Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
llvm-svn: 154268
2012-04-07 21:57:43 +00:00
Bob Wilson 6f9be7e2c6 Fix Thumb __builtin_longjmp with integrated assembler. <rdar://problem/11203543>
The tLDRr instruction with the last register operand set to the zero register
prints in assembly as if no register was specified, and the assembler encodes
it as a tLDRi instruction with a zero immediate.  With the integrated assembler,
that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which
is broken.  Emit the instruction as tLDRi with a zero immediate.  I don't
know if there's a good way to write a testcase for this.  Suggestions welcome.

Opportunities for follow-up work:
1) The asm printer should complain if a non-optional register operand is set
   to the zero register, instead of silently dropping it.
2) The integrated assembler should complain in the same situation, instead of
   silently emitting the operand as "r0".

llvm-svn: 154261
2012-04-07 16:51:59 +00:00
NAKAMURA Takumi b95f64134e Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming.
Cygwin-1.7 supports dw2. Some recent mingw distros support one, too.
I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin.

llvm-svn: 154247
2012-04-07 02:24:20 +00:00
Alexis Hunt 0235f684f0 Output UTF-8-encoded characters as identifier characters into assembly
by default.

This is a behaviour configurable in the MCAsmInfo. I've decided to turn
it on by default in (possibly optimistic) hopes that most assemblers are
reasonably sane. If this proves a problem, switching to default seems
reasonable.

I'm not sure if this is the opportune place to test, but it seemed good
to make sure it was tested somewhere.

llvm-svn: 154235
2012-04-07 00:37:53 +00:00
Jim Grosbach 0c509fa6bf Tidy up. 80 columns.
llvm-svn: 154226
2012-04-06 23:43:50 +00:00
Jakob Stoklund Olesen baa3566091 ARMPat is equivalent to Requires<[IsARM]>.
llvm-svn: 154210
2012-04-06 21:21:59 +00:00
Jakob Stoklund Olesen b4bd3880ba Eliminate iOS-specific tail call instructions.
After register masks were introdruced to represent the call clobbers, it
is no longer necessary to have duplicate instruction for iOS.

llvm-svn: 154209
2012-04-06 21:17:42 +00:00
Chandler Carruth 8a102c21e3 There is no portable std::abs overload for int64_t, use the llvm::abs64
which exists for this purpose.

llvm-svn: 154199
2012-04-06 20:10:52 +00:00
Jakob Stoklund Olesen 967b86a0a2 Allow negative immediates in ARM and Thumb2 compares.
ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.

llvm-svn: 154183
2012-04-06 17:45:04 +00:00
Benjamin Kramer 3cacabfb04 Fix narrowing conversion.
llvm-svn: 154171
2012-04-06 13:33:52 +00:00
Craig Topper 447417c932 Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
llvm-svn: 154166
2012-04-06 07:45:23 +00:00
Jakob Stoklund Olesen 6a2e99a46a Deduplicate ARM call-related instructions.
We had special instructions for iOS because r9 is call-clobbered, but
that is represented dynamically by the register mask operands now, so
there is no need for the pseudo-instructions.

llvm-svn: 154144
2012-04-06 00:04:58 +00:00
Jim Grosbach d6a1a1dc2f ARM: Don't form a t2LDRi8 or t2STRi8 with an offset of zero.
The load/store optimizer splits LDRD/STRD into two instructions when the
register pairing doesn't work out. For negative offsets in Thumb2, it uses
t2STRi8 to do that. That's fine, except for the case when the offset is in
the range [-4,-1]. In that case, we'll also form a second t2STRi8 with
the original offset plus 4, resulting in a t2STRi8 with a non-negative
offset, which ends up as if it were an STRT, which is completely bogus.
Similarly for loads.

No testcase, unfortunately, as any I've been able to construct is both large
and extremely fragile.

rdar://11193937

llvm-svn: 154141
2012-04-05 23:51:24 +00:00
Jim Grosbach 930f2f66e7 ARM assembly aliases for add negative immediates using sub.
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Thumb1 aliases for adding a negative immediate to the stack pointer,
also.

rdar://11192734

llvm-svn: 154123
2012-04-05 20:57:13 +00:00
Silviu Baranga af3c79f0ac Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions.
llvm-svn: 154101
2012-04-05 16:19:29 +00:00
Silviu Baranga d365397daa Added support for handling unpredictable arithmetic instructions on ARM.
llvm-svn: 154100
2012-04-05 16:13:15 +00:00
Jim Grosbach 15c6884a4b ARM assembly aliases for two-operand V[R]SHR instructions.
rdar://11189467

llvm-svn: 154087
2012-04-05 07:23:53 +00:00
Jim Grosbach 3d00eecc53 ARM assembly parsing for 'msr' plain 'cpsr' operand.
Plain 'cpsr' is an alias for 'cpsr_fc'.

rdar://11153753

llvm-svn: 154080
2012-04-05 03:17:53 +00:00
Akira Hatanaka 121342fcc2 Reapply 154038 without the failing test.
llvm-svn: 154062
2012-04-04 22:16:36 +00:00
Owen Anderson 4743c6e159 Revert r154038. It was causing make check failures.
llvm-svn: 154054
2012-04-04 21:18:58 +00:00
Akira Hatanaka 9705c865d9 Fix LowerGlobalAddress to produce instructions with the correct relocation
types for N32 ABI. Add new test case and update existing ones.

llvm-svn: 154038
2012-04-04 19:02:38 +00:00
Akira Hatanaka 591ecdd7c1 Fix LowerJumpTable to produce instructions with the correct relocation
types for N32 ABI. Test case will be updated after the patch that fixes
TargetLowering::getPICJumpTableRelocBase is checked in.

llvm-svn: 154036
2012-04-04 18:31:32 +00:00
Akira Hatanaka b3a2b8c199 Fix LowerConstantPool to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154034
2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen 0a5b72f0e4 Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

llvm-svn: 154033
2012-04-04 18:23:42 +00:00
Akira Hatanaka aeff24e424 Fix LowerBlockAddress to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154031
2012-04-04 18:22:53 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Dylan Noblesmith 7a3973d3e0 ARMDisassembler: drop bogus dependency on ARMCodeGen
And indirectly, a dependency on most of the core LLVM optimization
libraries.

llvm-svn: 153957
2012-04-03 15:48:14 +00:00
Anton Korobeynikov 325e92668b Make PPCCompilationCallbackC function to be static, so there will be no need to issue call via
PLT when LLVM is built as shared library. This mimics the X86 backend towards the approach.

llvm-svn: 153938
2012-04-03 06:59:28 +00:00
Craig Topper 7629d63bc4 Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Akira Hatanaka d19f025374 Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler.
llvm-svn: 153926
2012-04-03 03:01:13 +00:00
Akira Hatanaka 55059262aa Revert r153924. There were buildbot failures.
llvm-svn: 153925
2012-04-03 02:51:09 +00:00
Akira Hatanaka e2498d014b MIPS disassembler support.
Patch by Vladimir Medic.

llvm-svn: 153924
2012-04-03 02:20:58 +00:00
Akira Hatanaka b1f68f9696 Initial 64 bit direct object support.
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.

The test case checks for both big and little endian flavors.

Patch by Jack Carter.

llvm-svn: 153889
2012-04-02 19:25:22 +00:00
Hal Finkel 7591afa235 The binutils for the IBM BG/P are too old to support CFI.
llvm-svn: 153886
2012-04-02 19:09:04 +00:00
Roman Divacky b9663ccd6b Implement the SVR4 byval alignment for aggregates. Fixing a FIXME.
llvm-svn: 153876
2012-04-02 15:49:30 +00:00
Benjamin Kramer 1c0541b031 Move getOpcodeName from the various target InstPrinters into the superclass MCInstPrinter.
All implementations used the same code.

llvm-svn: 153866
2012-04-02 08:32:38 +00:00
Craig Topper dab9e35ad0 Remove getInstructionName from MCInstPrinter implementations in favor of using the instruction name table from MCInstrInfo. Reduces static data in the InstPrinter implementations.
llvm-svn: 153863
2012-04-02 07:01:04 +00:00
Craig Topper 54bfde79db Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo.
llvm-svn: 153860
2012-04-02 06:09:36 +00:00
Hal Finkel 3ecfa7b277 Fix some 80-col. violations I introduced with the A2 PPC64 core.
llvm-svn: 153852
2012-04-01 21:20:14 +00:00
Hal Finkel 322e41a914 Enable prefetch generation on PPC64.
llvm-svn: 153851
2012-04-01 20:08:17 +00:00
Hal Finkel 9032344c15 Add LdStSTD* itin. for the PPC64 A2 core.
llvm-svn: 153850
2012-04-01 20:08:08 +00:00
Nadav Rotem b078350872 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Hal Finkel 88ed4e3b15 Set the default PPC node scheduling preference to ILP (for the embedded cores).
The 440 and A2 cores have detailed itineraries, and this allows them to be
fully used to maximize throughput.

llvm-svn: 153845
2012-04-01 19:23:08 +00:00
Hal Finkel b9845f5758 Add ppc440 itin. entries for LdStSTD*
llvm-svn: 153844
2012-04-01 19:23:04 +00:00
Hal Finkel ec5a1e3669 Use full anti-dep. breaking with post-ra sched. on the embedded ppc cores.
Post-RA scheduling gives a significant performance improvement on
the embedded cores, so turn it on. Using full anti-dep. breaking is
important for FP-intensive blocks, so turn it on (just on the
embedded cores for now; this should also be good on the 970s because
post-ra scheduling is all that we have for now, but that should have
more testing first).

llvm-svn: 153843
2012-04-01 19:22:57 +00:00
Hal Finkel 9f9f8929ee Add instruction itinerary for the PPC64 A2 core.
This adds a full itinerary for IBM's PPC64 A2 embedded core. These
cores form the basis for the CPUs in the new IBM BG/Q supercomputer.

llvm-svn: 153842
2012-04-01 19:22:40 +00:00
Hal Finkel 59607e63cb Split the LdStGeneral PPC itin. class into LdStLoad and LdStStore.
Loads and stores can have different pipeline behavior, especially on
embedded chips. This change allows those differences to be expressed.
Except for the 440 scheduler, there are no functionality changes.
On the 440, the latency adjustment is only by one cycle, and so this
probably does not affect much. Nevertheless, it will make a larger
difference in the future and this removes a FIXME from the 440 itin.

llvm-svn: 153821
2012-04-01 04:44:16 +00:00
Hal Finkel 51861b4855 Fix dynamic linking on PPC64.
Dynamic linking on PPC64 has had problems since we had to move the top-down
hazard-detection logic post-ra. For dynamic linking to work there needs to be
a nop placed after every call. It turns out that it is really hard to guarantee
that nothing will be placed in between the call (bl) and the nop during post-ra
scheduling. Previous attempts at fixing this by placing logic inside the
hazard detector only partially worked.

This is now fixed in a different way: call+nop codegen-only instructions. As far
as CodeGen is concerned the pair is now a single instruction and cannot be split.
This solution works much better than previous attempts.

The scoreboard hazard detector is also renamed to be more generic, there is currently
no cpu-specific logic in it.

llvm-svn: 153816
2012-03-31 14:45:15 +00:00
Akira Hatanaka 8f4e3a0088 Select static relocation model if it is jitting.
llvm-svn: 153795
2012-03-31 02:38:36 +00:00
Jakob Stoklund Olesen d915503486 Add a 2 byte safety margin in offset computations.
ARMConstantIslandPass still has bugs where jump table compression can
cause constant pool entries to go out of range.

Add a safety margin of 2 bytes when placing constant islands, but use
the real max displacement for verification.

<rdar://problem/11156595>

llvm-svn: 153789
2012-03-31 00:06:44 +00:00
Jakob Stoklund Olesen 24bb3d59d7 Add more debugging output to ARMConstantIslandPass.
llvm-svn: 153788
2012-03-31 00:06:42 +00:00
Benjamin Kramer 682de39f2d Rip out emission of the regIsInRegClass function for the asm printer.
It's slow, bloated and completely redundant with MCRegisterClass::contains.

llvm-svn: 153782
2012-03-30 23:13:40 +00:00
Jim Grosbach 913cc3072d ARM fix encoding fixup resolution for ldrd and friends.
The 8-bit payload is not contiguous in the opcode. Move the upper nibble
over 4 bits into the correct place.

rdar://11158641

llvm-svn: 153780
2012-03-30 21:54:22 +00:00
Jim Grosbach fdaab531b7 ARM assembler should prefer non-aliases encoding of cmp.
When an immediate is both a value [t2_]so_imm and a [t2_]so_imm_neg,
we want to use the non-negated form to make sure we prefer the normal
encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'.

llvm-svn: 153770
2012-03-30 19:59:02 +00:00
Jim Grosbach daa04130ed ARM encoding for VSWP got the second operand incorrect.
Make the non-tied register operand names line up with what the base
class encoding handler expects.

rdar://11157236

llvm-svn: 153766
2012-03-30 18:53:01 +00:00
Jim Grosbach 74005ae691 ARM can only use narrow encoding for low regs.
llvm-svn: 153765
2012-03-30 18:39:43 +00:00
Jim Grosbach def5e34812 ARM integrated assembler should encoding choice for add/sub imm.
For 'adds r2, r2, #56' outside of an IT block, the 16-bit encoding T2
can be used for this syntax. Prefer the narrow encoding when possible.

rdar://11156277

llvm-svn: 153759
2012-03-30 17:20:40 +00:00
Jim Grosbach 199ab90946 ARM assembly parsing needs to be paranoid about negative immediates.
Make sure to treat immediates as unsigned when doing relative comparisons.

rdar://11153621

llvm-svn: 153753
2012-03-30 16:31:31 +00:00
Benjamin Kramer 88d31b3f0c Add a note about a missed cmov -> sbb opportunity.
llvm-svn: 153741
2012-03-30 13:02:58 +00:00
James Molloy fb5cd6085f Ensure conditional BL instructions for ARM are given the fixup fixup_arm_condbranch.
Patch by Tim Northover!

llvm-svn: 153737
2012-03-30 09:15:32 +00:00
Evan Cheng a40d40602c ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249
llvm-svn: 153717
2012-03-30 01:24:39 +00:00
Jakob Stoklund Olesen d8af9a5ee1 Invalidate liveness in ARMConstantIslandPass.
This pass splits basic blocks to insert constant islands, and it
doesn't recompute the live-in lists. No later passes depend on accurate
liveness information.

This fixes PR12410 where the machine code verifier was complaining.

llvm-svn: 153700
2012-03-29 23:14:26 +00:00
Jakob Stoklund Olesen 2f2897372a Prefer even-odd D-register pairs.
We are sometimes allocatinog from the DPair register class which
contains odd-even pairs in addition to the Q registers.

Place the Q registers first in the DPair allocation order as they can be
copied with a single instruction. The odd-even pairs should only be
allocated as a last resort.

llvm-svn: 153699
2012-03-29 22:54:32 +00:00
Lang Hames 591cdaf2ee Try using vmov.i32 to materialize FP32 constants that can't be materialized by
vmov.f32.

llvm-svn: 153696
2012-03-29 21:56:11 +00:00
Jim Grosbach 0b0298302c ARM assembly 'cmp lr, #0' should not encode using 'cmn'.
The CMP->CMN alias was matching for an immediate of zero when it
should only match for negative values.

rdar://11129224

llvm-svn: 153689
2012-03-29 21:19:52 +00:00
Jakob Stoklund Olesen caa6bd273f Handle register copies for the new ARM register classes.
ARM recently gained DPair, DTriple, and DQuad register classes.
Update copyPhysReg() to handle copies in these register classes.

No test case, it is difficult to make the register allocator emit the
odd copies reliably. The missing DPair copy caused a failure on
partialsums in the nightly test suite.

<rdar://problem/11147997>

llvm-svn: 153686
2012-03-29 21:10:40 +00:00
Lang Hames 5569ce7d56 Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode.
llvm-svn: 153680
2012-03-29 19:54:28 +00:00
Akira Hatanaka 0603ad8c65 Expand FREM.
llvm-svn: 153671
2012-03-29 18:43:11 +00:00
Benjamin Kramer 8619c37b5b Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds.
llvm-svn: 153643
2012-03-29 12:37:26 +00:00
Craig Topper a0a603e582 Only allow symbolic names for (v)cmpss/sd/ps/pd encodings 8-31 to be used with 'v' version of instructions.
llvm-svn: 153636
2012-03-29 07:11:23 +00:00
Joel Jones 68d59e8a90 For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153635
2012-03-29 05:45:48 +00:00
Joel Jones b474099e63 Reverted to revision 153616 to unblock build
llvm-svn: 153623
2012-03-29 01:20:56 +00:00
Joel Jones b88c81fe0f For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153617
2012-03-29 00:37:47 +00:00
Jakob Stoklund Olesen c3e80cc885 Enable machine code verification in the entire code generator.
Some targets still mess up the liveness information, but that isn't
verified after MRI->invalidateLiveness().

The verifier can still check other useful things like register classes
and CFG, so it should be enabled after all passes.

llvm-svn: 153615
2012-03-28 23:54:28 +00:00
Jakob Stoklund Olesen b6a7a89289 Don't kill the base register when expanding strd.
When an strd instruction doesn't get the registers it wants, it can be
expanded into two str instructions. Make sure the first str doesn't kill
the base register in the case where the base and data registers are
identical:

  t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg
  t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg

<rdar://problem/11101911>

llvm-svn: 153611
2012-03-28 23:07:03 +00:00
Jakob Stoklund Olesen cdee326ab6 Preserve implicit defs in ARMLoadStoreOptimizer.
When a number of sub-register VLRDS instructions are combined into a
VLDM, preserve any super-register implicit defs. This is required to
keep the register scavenger and machine code verifier happy.

Enable machine code verification after ARMLoadStoreOptimizer.
ARM/2012-01-26-CopyPropKills.ll was failing because of this.

llvm-svn: 153610
2012-03-28 22:50:56 +00:00
Jakob Stoklund Olesen 9e512120b7 Spill DPair registers, not just QPR.
The arm_neon intrinsics can create virtual registers from the DPair
register class which allows both even-odd and odd-even D-register pairs.

This fixes PR12389.

llvm-svn: 153603
2012-03-28 21:20:32 +00:00
Jakob Stoklund Olesen 8cb97523c6 Revert r153516: "Invalidate liveness in Thumb2ITBlockPass."
Revert r153519: "ARMLoadStoreOptimizer invalidates register liveness."

These patches caused miscompilations in povray by turning off branch
folding's updating of live-in lists.

It turns out the the late scheduler depends on the live-in lists, even
if it doesn't need correct kill flags.

<rdar://problem/11139228>

llvm-svn: 153593
2012-03-28 20:11:44 +00:00
Benjamin Kramer 20b32d2da6 Add another note about a missed compare with nsw arithmetic instcombine.
llvm-svn: 153574
2012-03-28 10:50:18 +00:00
Richard Barton 7ce39497b4 Fixup VST1.32 with writeback instruction. Also re-factor non-writeback version.
llvm-svn: 153573
2012-03-28 10:18:11 +00:00
Akira Hatanaka 2c67006cdd Turn off post-RA scheduler by default.
llvm-svn: 153557
2012-03-28 00:52:23 +00:00
Akira Hatanaka 047473e293 Turn on post register allocation scheduler.
llvm-svn: 153554
2012-03-28 00:24:17 +00:00
Akira Hatanaka 5ba593f509 Sort relocation entries before they are written out to a file. MIPS ABI
imposes a constraint that GOT16 referring to a local symbol or HI16 has to be
followed immediately by a matching LO16 relocation.

llvm-svn: 153553
2012-03-28 00:23:33 +00:00
Akira Hatanaka 34ee3ff83d Emit all directives except for ".cprestore" during asm printing rather than emit
them as machine instructions. Directives ".set noat" and ".set at" are now
emitted only at the beginning and end of a function except in the case where
they are emitted to enclose .cpload with an immediate operand that doesn't fit
in 16-bit field or unaligned load/stores.

Also, make the following changes:
- Remove function isUnalignedLoadStore and use a switch-case statement to
  determine whether an instruction is an unaligned load or store.

- Define helper function CreateMCInst which generates an instance of an MCInst
  from an opcode and a list of operands.

llvm-svn: 153552
2012-03-28 00:22:50 +00:00
Akira Hatanaka 1518a5fa9c Mark flag neverHasSideEffects of pattern-less instructions that do not have
any side effects.

llvm-svn: 153551
2012-03-28 00:21:37 +00:00
Benjamin Kramer 2735c01906 Add a note about a cute little fabs optimization.
llvm-svn: 153543
2012-03-27 22:42:42 +00:00
Benjamin Kramer f0901459b9 Add two missed instcombines related to compares with nsw arithmetic.
llvm-svn: 153542
2012-03-27 22:03:19 +00:00
Akira Hatanaka 52656d1047 Remove trailing white space.
llvm-svn: 153536
2012-03-27 20:35:51 +00:00
Akira Hatanaka a25fe22198 Add member EmitNOAT and its setter and getter functions to class MipsFunctionInfo.
If EmitNOAT is true, directives ".set noat" and ".set at" are emitted at the
beginning and end of a function. 

llvm-svn: 153528
2012-03-27 19:08:42 +00:00
Jakob Stoklund Olesen 4acbcb3171 ARMLoadStoreOptimizer invalidates register liveness.
This pass tries to update kill flags, but there are still many bugs.
Passes after the load/store optimizer don't need accurate liveness, so
don't even try.

<rdar://problem/11101911>

llvm-svn: 153519
2012-03-27 17:33:52 +00:00
Jakob Stoklund Olesen 14459cdc49 Invalidate liveness in Thumb2ITBlockPass.
llvm-svn: 153516
2012-03-27 17:06:06 +00:00
Craig Topper 1fcf5bcae1 Prune some includes
llvm-svn: 153502
2012-03-27 07:54:11 +00:00
Craig Topper f6e7e12f75 Remove unnecessary llvm:: qualifications
llvm-svn: 153500
2012-03-27 07:21:54 +00:00
Akira Hatanaka 8a7633c74e Pass the llvm IR pointer value and offset to the constructor of
MachinePointerInfo when getStore is called to create a node that stores an
argument passed in register to the stack. Without this change, the post RA 
scheduler will fail to discover the dependencies between the stores
instructions and the instructions that load from a structure passed by value. 

The link to the related discussion is here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-March/048055.html

llvm-svn: 153499
2012-03-27 03:13:56 +00:00
Akira Hatanaka 769f69f9b6 Fix bug in LowerConstantPool.
llvm-svn: 153498
2012-03-27 02:55:31 +00:00
Akira Hatanaka 2a36c9f4a8 Add T9 to the list of live-in registers of the entry basic block.
llvm-svn: 153497
2012-03-27 02:46:25 +00:00
Akira Hatanaka fe384a2c84 Retrieve and add the offset of a symbol in applyFixup rather than retrieve and
set it in MipsMCCodeEmitter::getMachineOpValue. Assert in getMachineOpValue if
MachineOperand MO is of an unexpected type. 

llvm-svn: 153494
2012-03-27 02:33:05 +00:00
Akira Hatanaka a06bc1c6e3 Define function MipsGetSymAndOffset which returns a fixup's symbol and the
offset applied to it.

llvm-svn: 153493
2012-03-27 02:04:18 +00:00
Akira Hatanaka da72819725 Rewrite computation of Value in adjustFixupValue so that the upper 48-bits are
cleared. No functionality change.

llvm-svn: 153491
2012-03-27 01:50:08 +00:00
Akira Hatanaka ba5100c117 Reserve hardware registers.
llvm-svn: 153486
2012-03-27 00:40:56 +00:00
Evan Cheng a2b48d985b ARM has a peephole optimization which looks for a def / use pair. The def
produces a 32-bit immediate which is consumed by the use. It tries to 
fold the immediate by breaking it into two parts and fold them into the
immmediate fields of two uses. e.g
       movw    r2, #40885
       movt    r3, #46540
       add     r0, r0, r3
=>
       add.w   r0, r0, #3019898880
       add.w   r0, r0, #30146560
;
However, this transformation is incorrect if the user produces a flag. e.g.
       movw    r2, #40885
       movt    r3, #46540
       adds    r0, r0, r3
=>
       add.w   r0, r0, #3019898880
       adds.w  r0, r0, #30146560
Note the adds.w may not set the carry flag even if the original sequence
would.

rdar://11116189

llvm-svn: 153484
2012-03-26 23:31:00 +00:00
Craig Topper 6e80c28017 Prune some includes and forward declarations.
llvm-svn: 153429
2012-03-26 06:58:25 +00:00
Craig Topper 5fa0caafc0 Prune includes and replace uses of ARMRegisterInfo.h with ARMBaeRegisterInfo.h
llvm-svn: 153422
2012-03-26 00:45:15 +00:00
Craig Topper 07720d8dcd Replace uses of ARMBaseInstrInfo and ARMTargetMachine with the Base versions.
llvm-svn: 153421
2012-03-25 23:49:58 +00:00
Craig Topper d4a964cd70 Prune some includes and forward declarations.
llvm-svn: 153415
2012-03-25 18:10:17 +00:00
Hal Finkel e44eb28807 Fix small-integer VAARG on SVR4 ABI PPC64.
The PPC64 SVR4 ABI requires integer stack arguments, and thus the var. args., that
are smaller than 64 bits be zero extended to 64 bits.

llvm-svn: 153373
2012-03-24 03:53:55 +00:00
Justin Holewinski a84577dcff PTX: Fix predicate logic bug
Code such as:

%vreg100 = setcc %vreg10, -1, SETNE
brcond %vreg10, %tgt

was being incorrectly morphed into

%vreg100 = and %vreg10, 1
brcond %vreg10, %tgt

where the 'and' instruction could be eliminated since
such logic is on 1-bit types in the PTX back-end, leaving
us with just:

brcond %vreg10, %tgt

which essentially gives us inverted branch conditions.

llvm-svn: 153364
2012-03-24 01:23:20 +00:00
Jim Grosbach 190e7b6e18 ARM tidy up ARMConstantIsland.cpp.
No functional change, just tidy up the code and nomenclature a bit.

llvm-svn: 153347
2012-03-23 23:07:03 +00:00
Benjamin Kramer b0640db80e Include cstdio in a few place that depended on getting it transitively through StringExtras.h
llvm-svn: 153328
2012-03-23 11:35:30 +00:00
Benjamin Kramer cbf108eda6 Move ftostr into its last user (cppbackend) and simplify it a bit.
New code should use raw_ostream.

llvm-svn: 153326
2012-03-23 11:26:29 +00:00
Eric Christopher 64a232343a Remove the C backend.
llvm-svn: 153307
2012-03-23 05:50:46 +00:00
Silviu Baranga 4afd7d2316 Added soft fail checks for the disassembler when decoding some corner cases of the STRD, STRH, LDRD, LDRH, LDRSH and LDRSB instructions on ARM.
llvm-svn: 153252
2012-03-22 14:14:49 +00:00
Silviu Baranga d213f2111a Added soft fail cases for the disassembler when decoding LDRSBT, LDRHT or LDRSHT instruction on ARM
llvm-svn: 153251
2012-03-22 13:24:43 +00:00
Silviu Baranga a6ea32afdd Added soft fail cases for the disassembler when decoding MUL instructions on ARM.
llvm-svn: 153250
2012-03-22 13:14:39 +00:00
Craig Topper 7da2aa24c2 Remove some unnecessary forward declarations.
llvm-svn: 153245
2012-03-22 06:52:14 +00:00
Hal Finkel 76eb187c0f PPC::DBG_VALUE must use Reg+Imm frame-index elimination even for large offsets. Fixes PR12203.
I don't have a small test case yet, but I'll try to construct one.

llvm-svn: 153240
2012-03-22 05:28:19 +00:00
Kevin Enderby 7e7d5eefb2 Fix ARM disassembly of VST1 and VST2 instructions with writeback. And add test
case for all opcodes handed by DecodeVSTInstruction() in ARMDisassembler.cpp .

llvm-svn: 153218
2012-03-21 20:54:32 +00:00
Joerg Sonnenberger a29b5bd2a8 Put Is64BitMemOperand into !defined(NDEBUG) for now.
llvm-svn: 153185
2012-03-21 14:09:26 +00:00
Benjamin Kramer bc1066734f Use a signed value for this enum to avoid spuriuos warnings from gcc.
llvm-svn: 153184
2012-03-21 13:48:11 +00:00
Joerg Sonnenberger 5463e66768 Fix generation of the address size override prefix. Add assertions for
the invalid cases. At least 16bit operand in 64bit mode is currently not
rejected in the parser.

llvm-svn: 153166
2012-03-21 05:48:07 +00:00
Craig Topper 344e0128ba Add typecast to silence -Wswitch warning introduced by r153153.
llvm-svn: 153155
2012-03-21 02:28:53 +00:00
Craig Topper 9cfc69c779 Spacing fixes and using 'unsigned' instead of 'int' to index to select shuffle elements for consistency with other shuffle code in X86 backend.
llvm-svn: 153154
2012-03-21 02:14:01 +00:00
Akira Hatanaka 0137dfe42a Incremental big endian patch by Jack Carter.
These changes allow us to compile big endian from the command line for 32 bit
Mips targets. This patch will result in code and data actually being produced
in the correct endianess.

llvm-svn: 153153
2012-03-21 00:52:01 +00:00
Chad Rosier 4106917355 [avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmobdqu to
vextractf128 with 128-bit mem dest.

Combines

	vextractf128 $0, %ymm0, %xmm0
	vmovaps %xmm0, (%rdi)

to

    vextractf128 $0, %ymm0, (%rdi)

rdar://11082570

llvm-svn: 153139
2012-03-20 21:43:40 +00:00
Evan Cheng 759b1d169f Change conditional instructions definitions, e.g. ANDCC, ARMPseudoExpand and t2PseudoExpand.
llvm-svn: 153135
2012-03-20 21:28:05 +00:00
Matt Beaumont-Gay dc873d5e6a remove unused variable
llvm-svn: 153116
2012-03-20 19:52:05 +00:00
Chad Rosier 0158ae2e5b [avx] Add the AddedComplexity to the VINSERTI128 avx2 patterns to give
precedence over the VINSERTF128 avx1 patterns.

llvm-svn: 153114
2012-03-20 19:45:07 +00:00
Bob Wilson b60f8f875c Require a base pointer for stack realignment when SP may vary dynamically.
ARMBaseRegisterInfo::canRealignStack was checking for variable-sized objects
but not for stack adjustments around calls.  Use hasReservedCallFrame() to
check for both.  The hasBasePointer function was already correctly checking
both conditions, so the effect of this was that a base pointer would be used
without checking whether the base pointer register could be reserved. I don't
have a small testcase for this.

<rdar://problem/11075906>

llvm-svn: 153110
2012-03-20 19:28:25 +00:00
Bob Wilson ca690320fb Remove some redundant checks.
ARMFrameLowering::hasReservedCallFrame is already checking for variable
sized objects, so there's no point in checking it twice.

llvm-svn: 153109
2012-03-20 19:28:22 +00:00
Chad Rosier 93d5427c69 Whitespace.
llvm-svn: 153105
2012-03-20 18:38:33 +00:00
Chad Rosier 5a6011267a [avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove
whitespace from test case.  No functional change intended.

llvm-svn: 153103
2012-03-20 18:24:55 +00:00
Kevin Enderby 816ca27ef6 Fix assembling ARM vst2 instructions with double-spaced registers.
llvm-svn: 153099
2012-03-20 17:41:51 +00:00
Jim Grosbach 997614f597 ARM non-scattered MachO relocations for movw/movt.
Needed when building -mdynamic-no-pic code.

rdar://10459256

llvm-svn: 153097
2012-03-20 17:25:45 +00:00
Chad Rosier 07a4cb9382 [avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads.
This results in things such as

	vmovups	16(%rdi), %xmm0
	vinsertf128	$1, %xmm0, %ymm0, %ymm0

to be combined to

    vinsertf128	$1, 16(%rdi), %ymm0, %ymm0

rdar://11076953

llvm-svn: 153092
2012-03-20 17:08:51 +00:00
Silviu Baranga 32a49333ec The ARM instructions that have an unpredictable behavior when the pc register operand is given now fail with soft fail. Modified the regression tests to reflect this.
llvm-svn: 153089
2012-03-20 15:54:56 +00:00
Richard Barton 7caea33dfa Test Commit - add a newline
llvm-svn: 153083
2012-03-20 10:50:35 +00:00
Craig Topper b34d96c614 Remove code that prevented lowering shuffles if they are used by load and themselves used by a extract_vector_elt. This was done to allow the DAG combiner to collapse to a single element load. Unfortunately, sometimes the extract_vector_elt would disappear before DAG combine could do the transformation leaving a vector_shuffle that isel couldn't handle. New code lets the shuffle be converted to a target specific node, but then adds a combine routine that can convert target specific nodes back to vector_shuffles if the folding criteria are met.
llvm-svn: 153080
2012-03-20 07:17:59 +00:00
Craig Topper cbc96a6e90 Factor out target shuffle mask decoding from getShuffleScalarElt and use a SmallVector of int instead of unsigned for shuffle mask in decode functions. Preparation for another change.
llvm-svn: 153079
2012-03-20 06:42:26 +00:00
Chris Lattner 8c3c65067d fix a build failure with libc++
llvm-svn: 153063
2012-03-19 23:31:01 +00:00
Jim Grosbach c4aa60ffe9 ARM branch relaxation for unconditional t1 branches.
rdar://11059157

llvm-svn: 153055
2012-03-19 21:32:32 +00:00
Jim Grosbach 67e76babd3 ARM assembly, accept optional '#' on lane index number.
rdar://11057160

llvm-svn: 153053
2012-03-19 20:39:53 +00:00
Anton Korobeynikov 3edd854d64 Perform mul combine when multiplying wiht negative constants.
Patch by Weiming Zhao!
This fixes PR12212

llvm-svn: 153049
2012-03-19 19:19:50 +00:00
Preston Gurd 48ccc4df0b This patch adds X86 instruction itineraries for non-pseudo opcodes in
X86InstrCompiler.td.
 
It also adds –mcpu-generic to the legalize-shift-64.ll test so the test
will pass if run on an Intel Atom CPU, which would otherwise
produce an instruction schedule which differs from that which the test expects.

llvm-svn: 153033
2012-03-19 14:10:12 +00:00
Benjamin Kramer 57003a6768 Add a note for -ffast-math optimization of vector norm.
llvm-svn: 153031
2012-03-19 00:43:34 +00:00
Craig Topper 129f9ef669 isCommutedMOVLMask should only look at 128-bit vectors to match isMOVLMask.
llvm-svn: 153027
2012-03-18 22:50:10 +00:00
Craig Topper b25fda95f6 Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations.
llvm-svn: 152997
2012-03-17 18:46:09 +00:00
Craig Topper f3f6650f8f Fix some copy and paste remnants of Cell and SPU in Hexagon files.
llvm-svn: 152981
2012-03-17 09:39:20 +00:00
Craig Topper bc3168b128 Fix typo in file header.
llvm-svn: 152980
2012-03-17 09:28:37 +00:00
Craig Topper b545408116 Pass TargetOptions to HexagonTargetMachine constructor by reference to match other targets and the base class.
llvm-svn: 152979
2012-03-17 09:24:09 +00:00
Craig Topper 188ed9d56e Reorder includes to match coding standards. Fix an issue or two exposed by that.
llvm-svn: 152978
2012-03-17 07:33:42 +00:00
Bill Wendling 23f8c4a50c Check if we can handle the arguments of a call (and therefore the call) in
fast-isel before emitting code. If the program bails after code was emitted,
then it could lead to the stack being adjusted more than once (two
CALLSEQ_BEGINs emitted) but being adjuste back only once after the call. This
leads to general badness and gnashing of teeth.
<rdar://problem/11050630>

llvm-svn: 152959
2012-03-16 23:11:07 +00:00
Jim Grosbach c40b0f72bb ARM fix silly typo in optional operand alias.
rdar://11065671

llvm-svn: 152954
2012-03-16 22:18:29 +00:00
Jim Grosbach db7db7d3a3 ARM divided syntax fmrx/fmxr mnemonics.
llvm-svn: 152946
2012-03-16 21:06:13 +00:00
Jim Grosbach 905686a82a ARM ldm/stm register lists can be out of order.
It's not a good style idea, as the registers will be laid down in memory in
numerical order, not the order they're in the list, but it's legal. vldm/vstm
are stricter.

rdar://11064740

llvm-svn: 152943
2012-03-16 20:48:38 +00:00
Jim Grosbach 7cb9a13b02 ARM optional operand on MRC/MCR assembly instructions.
rdar://11058464

llvm-svn: 152883
2012-03-16 00:45:58 +00:00
Jim Grosbach 24d90e2ddc ARM vmrs system registers mvfr0 and mvfr1 handling.
rdar://11058464

llvm-svn: 152881
2012-03-16 00:27:18 +00:00
Jim Grosbach 6d9766b355 Remove inadvertant commit.
llvm-svn: 152870
2012-03-15 23:00:30 +00:00
Chad Rosier 26d05887d9 [fast-isel] Address Eli's comments for r152847. Specifically, add a test case
and still allow immediate encoding, just not with cmn.
rdar://11038907

llvm-svn: 152869
2012-03-15 22:54:20 +00:00
Chad Rosier 01cecbffd6 [fast-isel] Don't try to encode LONG_MIN using cmn instructions.
rdar://11038907

llvm-svn: 152847
2012-03-15 21:40:23 +00:00
Jim Grosbach d28888dd77 ARM case-insensitive checking for APSR_nzcv.
rdar://11056591

llvm-svn: 152846
2012-03-15 21:34:14 +00:00
Jim Grosbach d74560b170 ARM aliases for pre-unified syntax fcmpz[sd] mnemonics.
rdar://11056647

llvm-svn: 152834
2012-03-15 20:48:18 +00:00
Lang Hames c35ee8b54a Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on
register allocation by allowing all 32 D-registers to be used. Patch by Cameron
Zwarich.

llvm-svn: 152824
2012-03-15 18:49:02 +00:00
Kristof Beyls 327d2f9da5 Fix VCVT decoding (between floating-point and fixed-point, Floating-point). Patch by Richard Barton.
llvm-svn: 152814
2012-03-15 17:50:29 +00:00
Chad Rosier b9b73170e3 [avx] Add patterns for VINSERTF128rm.
This results in things such as

	vmovaps	-96(%rbx), %xmm1
	vinsertf128	$1, %xmm1, %ymm0, %ymm0

to be combined to
         
	vinsertf128	$1, -96(%rbx), %ymm0, %ymm0

rdar://10643481

llvm-svn: 152762
2012-03-15 00:45:30 +00:00
Kevin Enderby 1ef22f33d0 Change the X86 assembler to not require a segment register on string
instruction's destination operand like it does for the source operand.
Also fix a typo in the comment for X86AsmParser::isSrcOp().

llvm-svn: 152654
2012-03-13 19:47:55 +00:00
Kevin Enderby fb3110b5d2 Added a missing error check for X86 assembly with mismatched base and index
registers not both being 64-bit or both being 32-bit registers.

llvm-svn: 152580
2012-03-12 21:32:09 +00:00
Bob Wilson 274d6f1777 Switch to unified syntax for VFP instructions in inline assembly.
<rdar://problem/11024696>

llvm-svn: 152548
2012-03-12 06:15:36 +00:00
Benjamin Kramer f6978230b8 Remove global map. This code isn't even hot.
llvm-svn: 152544
2012-03-11 18:12:04 +00:00
Craig Topper bef78fc2ee Convert more static tables of registers used by calling convention to uint16_t to reduce space.
llvm-svn: 152538
2012-03-11 07:57:25 +00:00
Craig Topper ca658c2264 Use uint16_t to store registers and opcode in static tables in the target specific backends.
llvm-svn: 152537
2012-03-11 07:16:55 +00:00
Craig Topper 41bd30e027 Remove unused functions getArgRegs and getNumArgRegs.
llvm-svn: 152535
2012-03-11 06:46:40 +00:00
Stepan Dyatkovskiy 97b02fc1b3 llvm::SwitchInst
Renamed methods caseBegin, caseEnd and caseDefault with case_begin, case_end, and case_default.
Added some notes relative to case iterators.

llvm-svn: 152532
2012-03-11 06:09:17 +00:00
Kay Tiong Khoo 57c8e7f364 *fix typo in comment; test of commit access
llvm-svn: 152507
2012-03-10 21:29:49 +00:00
Benjamin Kramer adfc73d68f C files in llvm still have to be C89 compliant, remove C++-style comments.
llvm-svn: 152495
2012-03-10 15:10:06 +00:00
Bill Wendling ebb10df441 Fix disasm of iret, sysexit, and sysret when displayed with Intel syntax.
Patch by Kay Tiong Khoo!

llvm-svn: 152487
2012-03-10 07:37:27 +00:00
Akira Hatanaka da00aa80b6 Do not custom lower i64 nodes if i64 is not a legal type. Move lines that set
operation action of nodes.

llvm-svn: 152452
2012-03-10 00:03:50 +00:00
Akira Hatanaka b7f78592e2 Lower SETCC nodes during legalization. Previously, it was lowered in DAG combine pass.
llvm-svn: 152450
2012-03-09 23:46:03 +00:00
Akira Hatanaka 3e914578c5 Remove unused header files.
llvm-svn: 152447
2012-03-09 23:28:30 +00:00
Kevin Enderby deed5aaa41 Add the missing call to Error when a bad X86 scale expression is parsed.
llvm-svn: 152443
2012-03-09 22:24:10 +00:00
Kevin Enderby 014e1cde5f Fix the x86 disassembler to at least print the lock prefix if it is the first
prefix.  Added a FIXME to remind us this still does not work when it is not the
first prefix.

llvm-svn: 152414
2012-03-09 17:52:49 +00:00
Craig Topper 2dac962864 Use uint16_t to store opcodes in static tables in X86 backend.
llvm-svn: 152391
2012-03-09 07:45:21 +00:00
Ahmed Charles 1662013a43 Fix undefined behavior in the Mips backend.
llvm-svn: 152390
2012-03-09 06:36:45 +00:00
Chad Rosier a281afc676 Fix a regression from r147481.
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.

Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078

llvm-svn: 152366
2012-03-09 02:00:48 +00:00
Craig Topper 5a4bcc749a Use uint16_t to store instruction implicit uses and defs. Reduces static data.
llvm-svn: 152301
2012-03-08 08:22:45 +00:00
Stepan Dyatkovskiy 5b648afb4d Taken into account Duncan's comments for r149481 dated by 2nd Feb 2012:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html

Implemented CaseIterator and it solves almost all described issues: we don't need to mix operand/case/successor indexing anymore. Base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".

ConstCaseIt is just a read-only iterator.
CaseIt is read-write iterator; it allows to change case successor and case value.

Usage of iterator allows totally remove resolveXXXX methods. All indexing convertions done automatically inside the iterator's getters.

Main way of iterator usage looks like this:
SwitchInst *SI = ... // intialize it somehow

for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
  BasicBlock *BB = i.getCaseSuccessor();
  ConstantInt *V = i.getCaseValue();
  // Do something.
}

If you want to convert case number to TerminatorInst successor index, just use getSuccessorIndex iterator's method.
If you want initialize iterator from TerminatorInst successor index, use CaseIt::fromSuccessorIndex(...) method.

There are also related changes in llvm-clients: klee and clang.

llvm-svn: 152297
2012-03-08 07:06:20 +00:00
Akira Hatanaka 5e152182a4 Invoke setTargetDAGCombine for SELECT.
llvm-svn: 152290
2012-03-08 03:26:37 +00:00
Akira Hatanaka 7dd7c08419 Swap the operands of a select node if the false (the second) operand is 0.
For example, this pattern 
(select (setcc lhs, rhs, cc), true, 0)
is transformed to this one:
(select (setcc lhs, rhs, inverse(cc)), 0, true)

This enables MipsDAGToDAGISel::ReplaceUsesWithZeroReg (added in r152280) to
replace 0 with $zero.

llvm-svn: 152285
2012-03-08 02:14:24 +00:00
Akira Hatanaka 956dd2261e Set minimum function alignment to 3 if target is Mips64.
llvm-svn: 152282
2012-03-08 01:59:33 +00:00
Akira Hatanaka 0b2fa914f0 This patch eliminates redundant instructions that produce 0.
For example, the first instruction in the code below can be eliminated if the
use of $vr0 is replaced with $zero: 

addiu $vr0, $zero, 0
add $vr2, $vr1, $vr0

add $vr2, $vr1, $zero

llvm-svn: 152280
2012-03-08 01:51:59 +00:00
Jim Grosbach 11e8c0d6b5 ARM don't use MCRelaxAll, as it's not safe on ARM.
The ARM code generator makes aggressive assumptions about the encodings
being selected for branches which MCRelaxAll invalidates.

rdar://11006355

llvm-svn: 152268
2012-03-08 00:07:52 +00:00
Chad Rosier 377f1f2d39 [fast-isel] ARMEmitCmp generates FMSTAT, which transfers the floating-point
condition flags to CPSR.  This allows us to simplify SelectCmp.
Patch by Zonr Chang <zonr.xchg@gmail.com>.

llvm-svn: 152243
2012-03-07 20:59:26 +00:00
Jim Grosbach eadd8ee49c ARM pre-v6 assembly parsing for umull/smull.
llvm-svn: 152188
2012-03-07 01:09:17 +00:00
Jim Grosbach 8db462042c ARM pre-v6 alias for 'nop' to 'mov r0, r0'
llvm-svn: 152185
2012-03-07 00:52:41 +00:00
Jim Grosbach eed9992b26 Tidy up. Remove dead code that slipped into previous commit.
llvm-svn: 152184
2012-03-07 00:52:39 +00:00
Jim Grosbach ed428bc1ce ARM more NEON VLD/VST composite physical register refactoring.
Register pair, all lanes subscripting.

llvm-svn: 152157
2012-03-06 23:10:38 +00:00
Jim Grosbach 13a292cc74 ARM refactor more NEON VLD/VST instructions to use composite physregs
Register pair VLD1/VLD2 all-lanes instructions. Kill off more of the
pseudos as a result.

llvm-svn: 152150
2012-03-06 22:01:44 +00:00
Eli Friedman de850676e0 Fix the operand ordering on aliases for shld and shrd. PR12173, part 2.
llvm-svn: 152136
2012-03-06 19:58:46 +00:00
Jim Grosbach 63ee881cd6 Tidy up. Kill some dead code.
llvm-svn: 152131
2012-03-06 18:59:19 +00:00
Jakob Stoklund Olesen 579e701fd9 Allow the same types in DPair as in QPR.
llvm-svn: 152129
2012-03-06 18:44:11 +00:00
Kevin Enderby 520eb3ba8a Fix a bug in the ARM disassembly of the neon VLD2 all lanes instruction.
llvm-svn: 152127
2012-03-06 18:33:12 +00:00
Roman Divacky ef21be2cda Convert PowerPC to register mask operands.
llvm-svn: 152122
2012-03-06 16:41:49 +00:00
Jakob Stoklund Olesen d9b427ee65 Add <imp-def> operands when reloading into physregs.
When an instruction only writes sub-registers, it is still necessary to
add an <imp-def> operand for the super-register.  When reloading into a
virtual register, rewriting will add the operand, but when loading
directly into a virtual register, the <imp-def> operand is still
necessary.

llvm-svn: 152095
2012-03-06 02:48:17 +00:00
Lang Hames 718cfbe05a Split fpscr into two registers: FPSCR and FPSCR_NZCV.
The fpscr register contains both flags (set by FP operations/comparisons) and
control bits. The control bits (FPSCR) should be reserved, since they're always
available and needn't be defined before use. The flag bits (FPSCR_NZCV) should
like to be unreserved so they can be hoisted by MachineCSE. This fixes PR12165.

llvm-svn: 152076
2012-03-06 00:19:55 +00:00
Jim Grosbach 8dc347fc27 ARM vpush/vpop assembler mnemonics accept an optional size suffix.
rdar://10988114

llvm-svn: 152068
2012-03-05 23:16:31 +00:00
Jim Grosbach e5307f9019 ARM Refactor VLD/VST spaced pair instructions.
Use the new composite physical registers.

llvm-svn: 152063
2012-03-05 21:43:40 +00:00
Jim Grosbach c71bf4739a ARM Remove a bit of dead code.
llvm-svn: 152061
2012-03-05 21:09:58 +00:00
Jim Grosbach c988e0c521 ARM refactor away a bunch of VLD/VST pseudo instructions.
With the new composite physical registers to represent arbitrary pairs
of DPR registers, we don't need the pseudo-registers anymore. Get rid of
a bunch of them that use DPR register pairs and just use the real
instructions directly instead.

llvm-svn: 152045
2012-03-05 19:33:30 +00:00
Jim Grosbach fd93a59557 Make MCRegisterInfo available to the the MCInstPrinter.
Used to allow context sensitive printing of super-register or sub-register
references.

llvm-svn: 152043
2012-03-05 19:33:20 +00:00
Chad Rosier 9424aa1c51 Address Evan's comments for r151877.
Specifically, remove the magic number when checking to see if the copy has a 
glue operand and simplify the checking logic.

rdar://10930395

llvm-svn: 152041
2012-03-05 19:27:12 +00:00
Sebastian Pop 957a6583f1 updated patch for the ARM fused multiply add/sub
In this update:
- I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
- I kept setting .fpu=neon-vfpv4 code attribute because that is what the
assembler understands.

Patch by Ana Pazos <apazos@codeaurora.org>

llvm-svn: 152036
2012-03-05 17:39:52 +00:00
Craig Topper 4b02a29eba Convert more GenRegisterInfo tables from unsigned to uint16_t to reduce static data size.
llvm-svn: 152016
2012-03-05 05:37:41 +00:00
Eli Friedman a5a6d6aa8f Make aliases for shld and shrd match gas. PR12173.
llvm-svn: 152014
2012-03-05 04:31:54 +00:00
Jakob Stoklund Olesen f729ceae04 Use <def,undef> operands when spilling NEON bundles.
MachineOperands that define part of a virtual register must have an
<undef> flag if they are not intended as read-modify-write operands.

The old trick of adding an <imp-def> operand doesn't work any longer.

Fixes PR12177.

llvm-svn: 152008
2012-03-04 18:40:30 +00:00
Craig Topper 1d32658877 Use uint16_t to store register overlaps to reduce static data.
llvm-svn: 152001
2012-03-04 10:43:23 +00:00
Craig Topper b35eacb0f0 Use uint16_t instead of unsigned to store registers in reg classes. Reduces static data size.
llvm-svn: 151998
2012-03-04 10:16:38 +00:00
Craig Topper 420525ce3b Use uint16_t to store registers in callee saved register tables to reduce size of static data.
llvm-svn: 151996
2012-03-04 03:33:22 +00:00
Craig Topper 6dedbae429 Use uint8_t instead of enums to store values in X86 disassembler table. Shaves 150k off the size of X86DisassemblerDecoder.o
llvm-svn: 151995
2012-03-04 02:16:41 +00:00
Chad Rosier f5e086f18e Prevent obscure and incorrect tail-call optimization.
In this instance we are generating the tail-call during legalizeDAG.  The 2nd
floor call can't be a tail call because it clobbers %xmm1, which is defined by
the first floor call.  The first floor call can't be a tail-call because it's
not in the tail position.  The only reasonable way I could think to fix this
in a target-independent manner was to check for glue logic on the copy reg.

rdar://10930395

llvm-svn: 151877
2012-03-02 02:50:46 +00:00
Evan Cheng d12af5dc69 Neuter the optimization I implemented with r107852 and r108258 which turn some
floating point equality comparisons into integer ones with -ffast-math. The
issue is the optimization causes +0.0 != -0.0.

Now the optimization is only done when one side is known to be 0.0. The other
side's sign bit is masked off for the comparison.

rdar://10964603

llvm-svn: 151861
2012-03-01 23:27:13 +00:00
Jakob Stoklund Olesen 693225f04a Handle regmasks in Thumb1RegisterInfo::saveScavengerRegister().
This function could have r12 live across a function call when compiling
thumb1 code.

The test case for this is not included because it is very long. It must
provoke emergency spilling near a function call. The behavior is
provoked by MultiSource/Applications/JM/lencod, and it triggers an
assertion in the scavenger.

<rdar://problem/10963642>

llvm-svn: 151855
2012-03-01 22:57:32 +00:00
Jim Grosbach 6990e5f08c ARM use the right opcode for FP<->Integer move in fast-isel.
rdar://10965031

llvm-svn: 151850
2012-03-01 22:47:09 +00:00
Michael J. Spencer 35145f830a Minimal changes for LLVM to compile under VS11.
llvm-svn: 151849
2012-03-01 22:42:52 +00:00
Akira Hatanaka 5350c24509 Changes for migrating to using register mask operands.
llvm-svn: 151847
2012-03-01 22:27:29 +00:00
Kevin Enderby f0269b4270 Change ARMInstPrinter::printPredicateOperand() so it will not abort if it
runs into the undefined 15 condition code value.

llvm-svn: 151844
2012-03-01 22:13:02 +00:00
Akira Hatanaka 6bbe1f0d10 Fix bugs which were introduced when support for base+index floating point loads
and stores was added.

- SelectAddr should return false if Parent is an unaligned f32 load or store.
- Only aligned load and store nodes should be matched to select reg+imm
  floating point instructions.
- MIPS does not have support for f64 unaligned load or store instructions.

llvm-svn: 151843
2012-03-01 22:12:30 +00:00
Benjamin Kramer e39d7ac396 Make TargetRegisterClasses non-virtual by making the only virtual function a function pointer.
This allows us to make TRC non-polymorphic and value-initializable, eliminating a huge static
initializer and a ton of cruft from the generated code.

Shrinks ARMBaseRegisterInfo.o by ~100k.

llvm-svn: 151806
2012-03-01 13:37:55 +00:00
Benjamin Kramer acd78d5092 Emit the "is an intrinsic overloaded" table as a bitfield.
llvm-svn: 151792
2012-03-01 02:16:57 +00:00
Akira Hatanaka 1ee768db53 Pass endian information to constructors. Define separate functions to create
objects for big endian and little endian targets.

Patch by Jack Carter.

llvm-svn: 151788
2012-03-01 01:53:15 +00:00
Kevin Enderby b119c08af3 Added annotations for x86 pc relative loads to llvm's 'C' disassembler.
So with darwin's otool(1) an x86_64 hello world .o file will print:
leaq	L_.str(%rip), %rax ## literal pool for: Hello world

llvm-svn: 151769
2012-02-29 22:58:34 +00:00
Andrew Trick 6eb6528b98 Intel Atom instruction itineraries for mov sign extension and mov zero extension.
Patch by Tyler Nowicki!

llvm-svn: 151743
2012-02-29 19:44:41 +00:00
Derek Schuff 56b662ce0f Make MemoryObject accessor members const again
llvm-svn: 151687
2012-02-29 01:09:06 +00:00
Jim Grosbach 617f84ddbd ARM implement TargetInstrInfo::getNoopForMachoTarget()
Without this hook, functions w/ a completely empty body (including no
epilogue) will cause an MCEmitter assertion failure.

For example,
define internal fastcc void @empty_function() {
  unreachable
}

rdar://10947471

llvm-svn: 151673
2012-02-28 23:53:30 +00:00
Jim Grosbach a0ec8896ac ARM vbit/vbif/vbsl assembly optional size suffix.
These instructions accept but do not require a size suffix.

rdar://10947225

llvm-svn: 151646
2012-02-28 19:11:07 +00:00
Evan Cheng 65f9d19c4f Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call.
llvm-svn: 151645
2012-02-28 18:51:51 +00:00
Roman Divacky 34d4b9682b Properly MCize the section switch, removing a FIXME.
llvm-svn: 151639
2012-02-28 18:15:25 +00:00
Daniel Dunbar ee7b899343 Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part.
llvm-svn: 151630
2012-02-28 15:36:07 +00:00
Jia Liu f54f60f3ce remove blanks, and some code format
llvm-svn: 151625
2012-02-28 07:46:26 +00:00
Evan Cheng 87c7b09d8d Some ARM implementaions, e.g. A-series, does return stack prediction. That is,
the processor keeps a return addresses stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.

Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt RAS and causes 100% return misprediction so LLVM should use a
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.

rdar://8979299

llvm-svn: 151623
2012-02-28 06:42:03 +00:00
Akira Hatanaka b2b980e628 Add comments.
llvm-svn: 151615
2012-02-28 03:18:43 +00:00
Akira Hatanaka b8a8e0c262 Do not reserve $gp as a dedicated global base register if the target ABI is not O32.
llvm-svn: 151614
2012-02-28 03:17:38 +00:00
Akira Hatanaka 330d901ce3 Add support for floating point base register + offset register addressing mode
load and store instructions.

llvm-svn: 151611
2012-02-28 02:55:02 +00:00
Jakob Stoklund Olesen 92c15b2b2c Enable ARM base pointer when calling functions with large arguments.
When an outgoing call takes more than 2k of arguments on the stack, we
don't allocate that call frame in the prolog, but adjust the stack
pointer immediately before the call instead.

This causes problems with the emergency spill slot because PEI can't
track stack pointer adjustments on the second pass, and if the outgoing
arguments are too big, SP can't be used to reach the emergency spill
slot at all.

Work around these problems by ensuring there is a base or frame pointer
that can be used to access the emergency spill slot.

<rdar://problem/10917166>

llvm-svn: 151604
2012-02-28 01:15:01 +00:00
Preston Gurd a49ef92a76 This patch adds instruction latencies for the SSE instructions
to the instruction scheduler for the Intel Atom.

llvm-svn: 151590
2012-02-27 23:35:03 +00:00
Evandro Menezes cf95bb758c Delete incorrect reference to inexistent Hexagon architecture manuals.
llvm-svn: 151582
2012-02-27 23:00:52 +00:00
Jim Grosbach 7b811d30d9 ARM BL/BLX instruction fixups should use relocations.
We on the linker to resolve calls to the appropriate BL/BLX instruction
to make interworking function correctly. It uses the symbol in the
relocation to do that, so we need to be careful about being too clever.

To enable this for ARM mode, split the BL/BLX fixup kind off from the
unconditional-branch fixups.

rdar://10927209

llvm-svn: 151571
2012-02-27 21:36:23 +00:00
Roman Divacky 8fe40cd659 Reapply r151278 with fixes.
MCize function entry label emission on PowerPC64 properly.

llvm-svn: 151547
2012-02-27 20:20:47 +00:00
Chad Rosier a72393a3f9 Add q suffix aliases for the fistp and fisttp mnemonics.
rdar://10921670
PR11935

llvm-svn: 151543
2012-02-27 19:43:12 +00:00
Akira Hatanaka b260f206d2 Remove unnecessary template parameters.
llvm-svn: 151540
2012-02-27 19:17:53 +00:00
Akira Hatanaka 3c5cab4730 Fix instruction predicates that were not set correctly.
llvm-svn: 151538
2012-02-27 19:09:08 +00:00
Kevin Enderby 1489b523c3 Fix the symbolic operand added for the C disassmbler API for the ARM bl
thumb instruction.  The PC adjustment is +4 in Thumb mode and +8 in ARM mode.

llvm-svn: 151530
2012-02-27 18:15:15 +00:00
Craig Topper 317640dfd0 Remove HexagonGenIntrinsics.inc from Hexagon cmake file. It does not appear in the Makefile and the output it produces isn't used. The Hexagon intrinsics are all in the global Intrinsics.gen.
llvm-svn: 151514
2012-02-27 02:59:43 +00:00
Jia Liu f6de2daf13 delete useless comment&blank
llvm-svn: 151512
2012-02-27 02:21:34 +00:00
Craig Topper 6491c8020e X86 disassembler support for jcxz, jecxz, and jrcxz. Fixes PR11643. Patch by Kay Tiong Khoo.
llvm-svn: 151510
2012-02-27 01:54:29 +00:00
Hal Finkel a1d6afeddf Default TargetData alignment information for 128-bit floating-point types.
llvm-svn: 151473
2012-02-26 04:13:31 +00:00
Hal Finkel 6fd2b434bd Revert r151278, breaks static linking.
Reverting this because it breaks static linking on ppc64. Specifically, it may be linkonce_odr functions that are the problem.
With this patch, if you link statically, calls to some functions end up calling their descriptor addresses instead
of calling to their entry points. This causes the execution to fail with SIGILL (b/c the descriptor address just
has some pointers, not code).

llvm-svn: 151433
2012-02-25 03:40:11 +00:00
NAKAMURA Takumi bdf94879df Target/X86: Fix assertion failures and warnings caused by r151382 _ftol2 lowering for i386-*-win32 targets. Patch by Joe Groff.
[Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems:
Clang raised a warning, and X86 LowerOperation would assert out for
fptoui f64 to i32 because it improperly lowered to an illegal
BUILD_PAIR. Here's a patch that addresses these issues. Let me know if
any other changes are necessary. Thanks.

llvm-svn: 151432
2012-02-25 03:37:25 +00:00
Akira Hatanaka 60f7a8e710 Add definitions of floating point multiply add/sub and negative multiply
add/sub instructions.

llvm-svn: 151415
2012-02-25 00:21:52 +00:00
Akira Hatanaka b049aef2d1 Add an option to use a virtual register as the global base register instead of
reserving a physical register ($gp or $28) for that purpose.

This will completely eliminate loads that restore the value of $gp after every
function call, if the register allocator assigns a callee-saved register, or
eliminate unnecessary loads if it assigns a temporary register. 

example:

.cpload $25       // set $gp.
...
.cprestore 16     // store $gp to stack slot 16($sp).
...
jalr $25          // function call. clobbers $gp.
lw $gp, 16($sp)   // not emitted if callee-saved reg is chosen.
...
lw $2, 4($gp)
...
jalr $25          // function call.
lw $gp, 16($sp)   // not emitted if $gp is not live after this instruction.
...

llvm-svn: 151402
2012-02-24 22:34:47 +00:00
Benjamin Kramer 9fceb90175 Remove unused cl::opt, make another opt static.
llvm-svn: 151398
2012-02-24 22:09:25 +00:00
Jim Grosbach 09b602d85c Thumb2 asm aliases for wide bitwise w/ immediate instructions.
llvm-svn: 151384
2012-02-24 19:06:05 +00:00
Michael J. Spencer 248d65e78b Add WIN_FTOL_* psudo-instructions to model the unique calling convention
used by the Win32 _ftol2 runtime function. Patch by Joe Groff!

llvm-svn: 151382
2012-02-24 19:01:22 +00:00
Hal Finkel a3e6ed2161 X11/X2 loads around indirect calls on ppc64 should not be deleted.
llvm-svn: 151374
2012-02-24 17:54:01 +00:00
Richard Osborne de47462012 Remove dead code.
Patch by Ahmed Charles

llvm-svn: 151360
2012-02-24 11:49:08 +00:00
Pete Cooper 682c76b7d4 Turn avx insert intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove duplicate patterns for selecting the intrinsics
llvm-svn: 151342
2012-02-24 03:51:49 +00:00
Jia Liu 683f8fffcf comment fix
llvm-svn: 151341
2012-02-24 02:17:26 +00:00
Jia Liu 74aa025d68 some comment fix
llvm-svn: 151340
2012-02-24 02:15:57 +00:00
Jia Liu 13830229fd comment fix
llvm-svn: 151339
2012-02-24 02:15:21 +00:00
Jia Liu 9d2d2adc25 replace a balnk with -
llvm-svn: 151337
2012-02-24 02:05:28 +00:00
Jia Liu 19b0c8244b 80 columns of Mips InstPrinter Makefile
llvm-svn: 151332
2012-02-24 01:47:01 +00:00
Jakob Stoklund Olesen fa7a53746c Switch ARM target to register masks.
I'll let the buildbots determine the compile time improvements from this
change, but 464.h264ref has 5% faster codegen at -O2.

This patch does cause some assembly changes.  Branch folding can make
different decisions about calls with dead return values.
CriticalAntiDepBreaker may choose different registers because its
liveness tracking is affected.  MachineCopyPropagation may sometimes
leave a dead copy behind.

llvm-svn: 151331
2012-02-24 01:19:29 +00:00
Jim Grosbach 3a21e2c33e Make sure the regs are low regs for tMUL size reduction.
llvm-svn: 151318
2012-02-24 00:53:11 +00:00
Jim Grosbach c01104dfbf Thumb2 size reduction fix for tied operands of tMUL.
The tied source operand of tMUL is the second source operand, not the
first like every other two-address thumb instruction. Special case it
in the size reduction pass to make sure we create the tMUL instruction
properly.

llvm-svn: 151315
2012-02-24 00:33:36 +00:00
Dan Gohman d4a77c4682 When emitting a cmp with 0 for a lowered select, mask out the high
bits of the value carying the boolean condition, as their contents
are undefined. This fixes rdar://10887484.

llvm-svn: 151310
2012-02-24 00:09:36 +00:00
Roman Divacky a2d3608f78 MCize function entry label emission on PowerPC64 properly.
llvm-svn: 151278
2012-02-23 20:28:39 +00:00
Kevin Enderby 6fbcd8d439 Updated the llvm-mc disassembler C API to support for the X86 target.
rdar://10873652

As part of this I updated the llvm-mc disassembler C API to always call the
SymbolLookUp call back even if there is no getOpInfo call back.  If there is a
getOpInfo call back that is tried first and then if that gets no information
then the  SymbolLookUp is called.  I also made the code more robust by
memset(3)'ing to zero the LLVMOpInfo1 struct before then setting
SymbolicOp.Value before for the call to getOpInfo.  And also don't use any
values from the  LLVMOpInfo1 struct if getOpInfo returns 0.  And also don't
use any of the ReferenceType or ReferenceName values from SymbolLookUp if it
returns NULL. rdar://10873563 and rdar://10873683

For the X86 target also fixed bugs so the annotations get printed. 

Also fixed a few places in the ARM target that was not producing symbolic
operands for some instructions.  rdar://10878166

llvm-svn: 151267
2012-02-23 18:18:17 +00:00
Brendon Cahoon d5d166d4d4 Fix the numbering of some of the registers and reclassify a couple of them.
Also, some basic clean up.  Patch by Evandro Menezes.

llvm-svn: 151266
2012-02-23 18:17:17 +00:00
Duncan Sands a354d58f8d Remove unused variable.
llvm-svn: 151251
2012-02-23 11:01:22 +00:00
Evan Cheng f258a15bdf Canonicalize (srl (bswap x), 16) to (rotr (bswap x), 16) if the high 16 bits
of x are zero. This optimizes rev + lsr 16 to rev16.

rdar://10750814

llvm-svn: 151230
2012-02-23 02:58:19 +00:00
Evan Cheng e87681cf34 Optimize a couple of common patterns involving conditional moves where the false
value is zero. Instead of a cmov + op, issue an conditional op instead. e.g.
    cmp   r9, r4
    mov   r4, #0
    moveq r4, #1 
    orr   lr, lr, r4

should be:
    cmp   r9, r4
    orreq lr, lr, #1

That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend
this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y).

It's possible to extend this to ADD and SUB but I don't think they are common.

rdar://8659097

llvm-svn: 151224
2012-02-23 01:19:06 +00:00
Hal Finkel ad4d9f5848 Allow the use of an alternate symbol for calculating a function's size.
The standard function epilog includes a .size directive, but ppc64 uses
an alternate local symbol to tag the actual start of each function.

Until recently, binutils accepted the .size directive as:
 .size	test1, .Ltmp0-test1
however, using this directive with recent binutils will result in the error:
 .size expression for XXX does not evaluate to a constant
so we must use the label which actually tags the start of the function.

llvm-svn: 151200
2012-02-22 21:11:47 +00:00
Michael J. Spencer 8b98bf2d6b Properly emit _fltused with FastISel. Refactor to share code with SDAG.
Patch by Joe Groff!

llvm-svn: 151183
2012-02-22 19:06:13 +00:00
Chad Rosier 5dfe6dab25 Remove extra semi-colons.
llvm-svn: 151169
2012-02-22 17:25:00 +00:00
Sirish Pande 2a783d5b94 Efficient pattern for store truncate. Patch by Evandro Menezes.
llvm-svn: 151166
2012-02-22 16:45:10 +00:00
Craig Topper cc830f8cda Declare register classes as const. Fix a couple pointers to register classes that weren't already const.
llvm-svn: 151138
2012-02-22 07:28:11 +00:00
Craig Topper 760b134ffa Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.
llvm-svn: 151134
2012-02-22 05:59:10 +00:00
Aaron Ballman e67173e718 Adding support for Microsoft's thiscall calling convention. LLVM side of the patch.
llvm-svn: 151123
2012-02-22 03:04:40 +00:00
Jakob Stoklund Olesen 5f37f1c39d Clarify ARM calling conventions.
llvm-svn: 151113
2012-02-22 01:07:19 +00:00
Akira Hatanaka a7721f6b4d Use a function in MathExtras to do sign extension.
llvm-svn: 151107
2012-02-22 00:16:54 +00:00
Jakob Stoklund Olesen 6909faaf35 Calls don't really change the stack pointer.
Even if a call instruction has %SP<imp-def> operands, it doesn't change
the value of the stack pointer.

llvm-svn: 151104
2012-02-21 23:47:43 +00:00
Evan Cheng 0460ae8d80 Proper support for a bastardized darwin-eabi hybird ABI.
llvm-svn: 151083
2012-02-21 20:46:00 +00:00
Andrew Trick da84e64683 Clear virtual registers after they are no longer referenced.
Passes after RegAlloc should be able to rely on MRI->getNumVirtRegs() == 0.
This makes sharing code for pre/postRA passes more robust.
Now, to check if a pass is running before the RA pipeline begins, use MRI->isSSA().
To check if a pass is running after the RA pipeline ends, use !MRI->getNumVirtRegs().

PEI resets virtual regs when it's done scavenging.

PTX will either have to provide its own PEI pass or assign physregs.

llvm-svn: 151032
2012-02-21 04:51:23 +00:00
James Molloy 547d4c0662 Improve generated code for extending loads and some trunc stores on ARM.
Teach TargetSelectionDAG about lengthening loads for vector types and set v4i8 as legal. Allow FP_TO_UINT for v4i16 from v4i32.

llvm-svn: 150956
2012-02-20 09:24:05 +00:00
Ahmed Charles 636a3d618c Remove dead code. Improve llvm_unreachable text. Simplify some control flow.
llvm-svn: 150918
2012-02-19 11:37:01 +00:00
Craig Topper de121a1000 Remove some unneeded includes and fix ordering in X86ISelLowering.cpp. Remove unneeded 'using namespace'.
llvm-svn: 150916
2012-02-19 07:15:48 +00:00
Craig Topper 65a4ceea1e Unify all shuffle mask checking functions take a mask and VT instead of VectorShuffleSDNode.
llvm-svn: 150913
2012-02-19 05:41:45 +00:00
Craig Topper 3e5c04e432 Make a bunch of X86ISelLowering shuffle functions static now that they are no longer needed by isel.
llvm-svn: 150908
2012-02-19 02:53:47 +00:00
Jia Liu 608dc6e257 comment fix ARM.h
llvm-svn: 150904
2012-02-19 02:04:03 +00:00
Jia Liu e1d619691b some comment fix for X86 and ARM
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Craig Topper 66a3597a4a Add vmfunc instruction to X86 assembler and disassembler.
llvm-svn: 150899
2012-02-19 01:39:49 +00:00
Jia Liu b22310fda6 Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Craig Topper 57d3aaed78 Add X86InstrSVM.td that I forgot to add in r150873.
llvm-svn: 150874
2012-02-18 08:34:12 +00:00
Craig Topper ed7aa46366 Add X86 assembler and disassembler support for AMD SVM instructions. Original patch by Kay Tiong Khoo. Few tweaks by me for code density and to reduce replication.
llvm-svn: 150873
2012-02-18 08:19:49 +00:00
Jakob Stoklund Olesen 4fad5b2b9e Handle regmask operands in ARMInstrInfo.
llvm-svn: 150833
2012-02-17 19:23:15 +00:00
Jakob Stoklund Olesen 96732a438d Fix ARMBaseInstrInfo::getInstrLatency for calls.
Calls always clobber CPSR.

llvm-svn: 150831
2012-02-17 19:07:59 +00:00
Jia Liu 9f6101191b remove Emacs-tag form .cpp files in Mips Backend, and fix some typo.
llvm-svn: 150805
2012-02-17 08:55:11 +00:00
Craig Topper ba172d2d59 Remove the last of the old vector_shuffle patterns from X86 isel.
llvm-svn: 150795
2012-02-17 07:02:34 +00:00
Akira Hatanaka d608bac682 Do not promote i32 arguments to i64. This was causing unnecessary sign extension
instructions to be emitted.

llvm-svn: 150782
2012-02-17 02:20:26 +00:00
Jia Liu dd6c1cd4e8 add Emacs tag and fix some comment error in file headers
llvm-svn: 150775
2012-02-17 01:23:50 +00:00
Chad Rosier fcd29ae390 [fast-isel] Add support for returning non-legal types with no sign- or zero-
entend flag.

llvm-svn: 150774
2012-02-17 01:21:28 +00:00
Lang Hames 5bade3dc6e Re-enable 150652 and 150654 - Make FPSCR non-reserved, and make MachineCSE bail on reserved registers. This *should* be safe as of r150786.
llvm-svn: 150769
2012-02-17 00:27:16 +00:00
Akira Hatanaka b4d2ccf2ab Remove comment.
llvm-svn: 150739
2012-02-16 22:52:29 +00:00
Chad Rosier a0d3c75015 Remove unnecessary assignment to temporary, ResultReg.
llvm-svn: 150737
2012-02-16 22:45:33 +00:00
Jakob Stoklund Olesen bc6ba479b6 Remove the YMM_HI_6_15 hack.
Call clobbers are now represented with register mask operands.  The
regmask can easily represent the fact that xmm6 is call-preserved while
ymm6 isn't.  This is automatically computed by TableGen from the
CalleeSavedRegs containing xmm6.

llvm-svn: 150709
2012-02-16 17:56:06 +00:00
Jakob Stoklund Olesen 97e3115dc2 Use the same CALL instructions for Windows as for everything else.
The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.

llvm-svn: 150708
2012-02-16 17:56:02 +00:00
Akira Hatanaka 4705b0cc1c Remove trailing whitespace. Add newline.
llvm-svn: 150706
2012-02-16 17:48:20 +00:00
Lang Hames 55a2a96153 Oop - r150653 + r150654 broke one of my test cases. Backing out for now...
llvm-svn: 150655
2012-02-16 02:32:10 +00:00