Commit Graph

23654 Commits

Author SHA1 Message Date
Jack Carter 9c1a027fe8 This patch that sets the EmitAlias flag in td files
and enables the instruction printer to print aliased 
instructions. 

Due to usage of RegisterOperands a change in common 
code (utils/TableGen/AsmWriterEmitter.cpp) is required 
to get the correct register value if it is a RegisterOperand.

Contributer: Vladimir Medic
 
llvm-svn: 174358
2013-02-05 08:32:10 +00:00
Jack Carter 10be6aef15 This patch changes a static_cast to dyn_cast
for MipsELFStreamer objects.

Contributer: Jack Carter
 
llvm-svn: 174354
2013-02-05 07:47:41 +00:00
Jyotsna Verma 7ab68fbd1d Hexagon: Add V4 combine instructions and some more Def Pats for V2.
llvm-svn: 174331
2013-02-04 15:52:56 +00:00
Benjamin Kramer c35d526489 Disable a couple more vector splat optimizations on PPC.
I didn't see those because the test case used "not grep". FileCheck the test and
XFAIL it, preserving the old optimization, so this can be fixed eventually.

llvm-svn: 174330
2013-02-04 15:52:32 +00:00
Tim Northover 24937c12eb Fix some abuses of StringRef
We were taking a StringRef to a temporary result, which can go horribly wrong.

llvm-svn: 174328
2013-02-04 15:44:38 +00:00
Benjamin Kramer 2c9da989c2 X86: Open up some opportunities for constant folding by postponing shift lowering.
Fixes PR15141.

llvm-svn: 174327
2013-02-04 15:19:33 +00:00
Benjamin Kramer 0611298446 X86: Simplify code. No functionality change.
llvm-svn: 174326
2013-02-04 15:19:25 +00:00
Benjamin Kramer 548ffa274a SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors.
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp

The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.

Fix some trivially foldable X86 tests too.

llvm-svn: 174325
2013-02-04 15:19:18 +00:00
Tim Northover aefc30f2a4 Give explicit suffix to integer constant over 32-bits.
llvm-svn: 174324
2013-02-04 14:14:58 +00:00
Evgeniy Stepanov 1f5a71492d More MSan/ASan annotations.
This change lets us bootstrap LLVM/Clang under ASan and MSan. It contains
fixes for 2 issues:

- X86JIT reads return address from stack, which MSan does not know is
  initialized.
- bugpoint tests run binaries with RLIMIT_AS. This does not work with certain
  Sanitizers.

We are no longer including config.h in Compiler.h with this change.

llvm-svn: 174306
2013-02-04 07:03:24 +00:00
Arnold Schwaighofer 98f1012f9b ARM cost model: Penalize insertelement into D subregisters
Swift has a renaming dependency if we load into D subregisters. We don't have a
way of distinguishing between insertelement operations of values from loads and
other values. Therefore, we are pessimistic for now (The performance problem
showed up in example 14 of gcc-loops).

radar://13096933

llvm-svn: 174300
2013-02-04 02:52:05 +00:00
NAKAMURA Takumi 80159432de PPCDarwinAsmPrinter::EmitStartOfAsmFile(): Add checking range in CPUDirectives[].
llvm-svn: 174298
2013-02-04 00:47:38 +00:00
NAKAMURA Takumi 3d591ae0b9 PPCDarwinAsmPrinter::EmitStartOfAsmFile(): Add possible elements in CPUDirectives[].
llvm-svn: 174297
2013-02-04 00:47:33 +00:00
Reed Kotler f8933f83f0 Start static relocation implementation for mips16.
This checkin makes hello world work. 

llvm-svn: 174264
2013-02-02 04:07:35 +00:00
Bill Schmidt cc99a2f61d Add notes about future PowerPC features
llvm-svn: 174232
2013-02-01 23:10:09 +00:00
Bill Schmidt 52742c25ae LLVM enablement for some older PowerPC CPUs
llvm-svn: 174230
2013-02-01 22:59:51 +00:00
David Sehr 8114a7a651 Two changes relevant to LEA and x32:
1) allows the use of RIP-relative addressing in 32-bit LEA instructions under
   x86-64 (ILP32 and LP64)
2) separates the size of address registers in 64-bit LEA instructions from
   control by ILP32/LP64.

llvm-svn: 174208
2013-02-01 19:28:09 +00:00
Jyotsna Verma 2ceafa6684 Replace LDriu*[bhdw]_indexed_V4 instructions with "def Pats".
llvm-svn: 174193
2013-02-01 16:36:16 +00:00
Jyotsna Verma d6eda1c227 Add appropriate TSFlags to the instructions that must be always extended.
llvm-svn: 174186
2013-02-01 15:54:43 +00:00
Tim Northover 111b6cb37b Remove currently unused register decoder from AArch64.
This should fix a warning when building this backend.

llvm-svn: 174177
2013-02-01 14:55:05 +00:00
Chandler Carruth e5d8d0d64b Switch the code added in r173885 to use the new, shiny RTTI
infrastructure on MCStreamer to test for whether there is an
MCELFStreamer object available.

This is just a cleanup on the AsmPrinter side of things, moving ad-hoc
tests of random APIs to a direct type query. But the AsmParser
completely broken. There were no tests, it just blindly cast its
streamer to an MCELFStreamer and started manipulating it.

I don't have a test case -- this actually failed on LLVM's own
regression test suite. Unfortunately the failure only appears when the
stars, compilers, and runtime align to misbehave when we read a pointer
to a formatted_raw_ostream as-if it were an MCAssembler. =/

UBSan would catch this immediately.

Many thanks to Matt for doing about 80% of the debugging work here in
GDB, Jim for helping to explain how exactly to fix this, and others for
putting up with the hair pulling that ensued during debugging it.

llvm-svn: 174118
2013-01-31 23:43:14 +00:00
Chandler Carruth de093ef8d6 Give the MCStreamer class hierarchy LLVM RTTI facilities for use with
isa<> and dyn_cast<>. In several places, code is already hacking around
the absence of this, and there seem to be several interfaces that might
be lifted and/or devirtualized using this.

This change was based on a discussion with Jim Grosbach about how best
to handle testing for specific MCStreamer subclasses. He said that this
was the correct end state, and everything else was too hacky so
I decided to just make it so.

No functionality should be changed here, this is just threading the kind
through all the constructors and setting up the classof overloads.

llvm-svn: 174113
2013-01-31 23:29:57 +00:00
NAKAMURA Takumi e1137a2058 Update AMDGPURegisterInfo::eliminateFrameIndex() corresponding to r174083.
llvm-svn: 174106
2013-01-31 22:55:51 +00:00
Tom Stellard 4926921bd4 R600: Fold clamp, neg, abs
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174099
2013-01-31 22:11:54 +00:00
Tom Stellard dd04c83a4d R600: Consider bitcast when folding const_address node.
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174098
2013-01-31 22:11:53 +00:00
Tom Stellard af1bce7d1d R600: Make store_dummy intrinsic more general by passing export type
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174097
2013-01-31 22:11:46 +00:00
Chad Rosier 3145deac19 Remove unused variable, which should have been removed with r174083.
llvm-svn: 174094
2013-01-31 21:23:44 +00:00
Tim Northover c6d39314b2 Update AArch64 backend to changed eliminateFrameIndex interface.
llvm-svn: 174086
2013-01-31 20:46:53 +00:00
Chad Rosier df782d2225 [PEI] Pass the frame index operand number to the eliminateFrameIndex function.
Each target implementation was needlessly recomputing the index.
Part of rdar://13076458

llvm-svn: 174083
2013-01-31 20:02:54 +00:00
Tim Northover e0e3aefdd3 Add AArch64 as an experimental target.
This patch adds support for AArch64 (ARM's 64-bit architecture) to
LLVM in the "experimental" category. Currently, it won't be built
unless requested explicitly.

This initial commit should have support for:
    + Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions
      (except the late addition CRC instructions).
    + CodeGen features required for C++03 and C99.
    + Compilation for the "small" memory model: code+static data <
      4GB.
    + Absolute and position-independent code.
    + GNU-style (i.e. "__thread") TLS.
    + Debugging information.

The principal omission, currently, is performance tuning.

This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.

Further reviews would be gratefully received.

llvm-svn: 174054
2013-01-31 12:12:40 +00:00
Eric Christopher 258c867c0b Whitespace.
llvm-svn: 174009
2013-01-31 00:50:48 +00:00
Eric Christopher 4e3e94c13d Check and allow floating point registers to select the size of the
register for inline asm. This conforms to how gcc allows for effective
casting of inputs into gprs (fprs is already handled).

llvm-svn: 174008
2013-01-31 00:50:46 +00:00
Hal Finkel e1df90958d PPC QPX requires a 32-byte aligned stack
On systems which support the QPX vector instructions, the stack must be
32-byte aligned.

llvm-svn: 173993
2013-01-30 23:43:27 +00:00
Evan Cheng d2ca4e2ed9 Restrict sin/cos optimization to 64-bit only for now. 32-bit is a bit messy and less critical.
llvm-svn: 173987
2013-01-30 22:56:35 +00:00
Hal Finkel b3fc509b23 Initialize hasQPX in PPCSubtarget
This should have gone in with r173973.

llvm-svn: 173984
2013-01-30 22:43:44 +00:00
Hal Finkel efb305e54c Add definitions for the PPC a2q core marked as having QPX available
This is the first commit of a large series which will add support for the
QPX vector instruction set to the PowerPC backend. This instruction set is
used on the IBM Blue Gene/Q supercomputers.

llvm-svn: 173973
2013-01-30 21:17:42 +00:00
Eli Bendersky 2e2ce49e59 Add a special ARM trap encoding for NaCl.
More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html

Patch by JF Bastien

llvm-svn: 173943
2013-01-30 16:30:19 +00:00
Logan Chien a436e4c7e4 Add missing header and test cases for r173939.
llvm-svn: 173941
2013-01-30 15:48:50 +00:00
Logan Chien 2bcc42c730 Override virtual function for ARM EH directives.
llvm-svn: 173939
2013-01-30 15:39:04 +00:00
David Blaikie 24f44ac53a Removing initializer for the field removed in r173887
llvm-svn: 173888
2013-01-30 03:04:07 +00:00
David Blaikie b7fa813373 Remove unused variable (introduced in r173884) to clear clang -Werror build
llvm-svn: 173887
2013-01-30 02:56:02 +00:00
Jack Carter b01a90ce40 Forgot to add new file to CMakeLists
llvm-svn: 173886
2013-01-30 02:32:36 +00:00
Jack Carter 718da0b53b This patch implements runtime ARM specific
setting of ELF header e_flags.

Contributer: Jack Carter
 
llvm-svn: 173885
2013-01-30 02:24:33 +00:00
Jack Carter 7f378104b6 This patch implements runtime Mips specific
setting of ELF header e_flags.

Contributer: Jack Carter
 
llvm-svn: 173884
2013-01-30 02:16:36 +00:00
Jack Carter 1bd90ff6cc This patch reworks how llvm targets set
and update ELF header e_flags.

Currently gathering information such as symbol, 
section and data is done by collecting it in an 
MCAssembler object. From MCAssembler and MCAsmLayout 
objects ELFObjectWriter::WriteObject() forms and 
streams out the ELF object file.

This patch just adds a few members to the MCAssember 
class to store and access the e_flag settings. It 
allows for runtime additions to the e_flag by 
assembler directives. The standalone assembler can 
get to MCAssembler from getParser().getStreamer().getAssembler().

This patch is the generic infrastructure and will be
followed by patches for ARM and Mips for their target 
specific use.

Contributer: Jack Carter
 
llvm-svn: 173882
2013-01-30 02:09:52 +00:00
Akira Hatanaka c0b020690b [mips] Lower EH_RETURN.
Patch by Sasa Stankovic.

llvm-svn: 173862
2013-01-30 00:26:49 +00:00
Renato Golin 5e9d55eca0 Adding simple cast cost to ARM
Changing ARMBaseTargetMachine to return ARMTargetLowering intead of
the generic one (similar to x86 code).

Tests showing which instructions were added to cast when necessary
or cost zero when not. Downcast to 16 bits are not lowered in NEON,
so costs are not there yet.

llvm-svn: 173849
2013-01-29 23:31:38 +00:00
Jyotsna Verma b16a9cb132 Use multiclass for post-increment store instructions.
llvm-svn: 173816
2013-01-29 18:42:41 +00:00
Jyotsna Verma a609b1c89d Add constant extender support for MInst type instructions.
llvm-svn: 173813
2013-01-29 18:18:50 +00:00
Evan Cheng 27e41c9f70 Remove dead code.
llvm-svn: 173812
2013-01-29 18:08:22 +00:00
NAKAMURA Takumi 978b5a0e02 R600/AMDILPeepholeOptimizer.cpp: Tweak std::make_pair to satisfy C++11.
llvm-svn: 173807
2013-01-29 16:31:56 +00:00
Hans Wennborg 5deecd9043 Fix typo in X86BaseInfo.h that I introduced in r157818.
llvm-svn: 173798
2013-01-29 14:05:57 +00:00
Tim Northover a0edd3ee66 Fix 64-bit atomic operations in Thumb mode.
The ARM and Thumb variants of LDREXD and STREXD have different constraints and
take different operands. Previously the code expanding atomic operations didn't
take this into account and asserted in Thumb mode.

llvm-svn: 173780
2013-01-29 09:06:13 +00:00
Craig Topper c048154b9b Merge SSE and AVX shuffle instructions in the comment printer.
llvm-svn: 173777
2013-01-29 07:54:31 +00:00
Evan Cheng 0e88c7d897 Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755
2013-01-29 02:32:37 +00:00
Hal Finkel 7f9e8d3eaa Add isBGQ method to PPCSubtarget
This function will be used in future commits.

llvm-svn: 173729
2013-01-29 00:22:47 +00:00
Craig Topper 5c683972bc Fix 256-bit PALIGNR comment decoding to understand that it works on independent 256-bit lanes.
llvm-svn: 173674
2013-01-28 07:41:18 +00:00
Craig Topper 71d99ffe4a Add missing break in 256-bit palignr comment printing. No test case yet because the comment itself is still wrong.
llvm-svn: 173669
2013-01-28 07:19:11 +00:00
Craig Topper 8fb09f0abb Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction.
llvm-svn: 173667
2013-01-28 06:48:25 +00:00
Craig Topper b3ede5e3b1 Remove addToNoHelperNeeded function that was left unused after r173649. Fixes a -Wunused warning.
llvm-svn: 173664
2013-01-28 06:09:24 +00:00
Reed Kotler 97f8e2fa8f Make some code a little simpler.
llvm-svn: 173649
2013-01-28 02:46:49 +00:00
Richard Osborne 038d24f90c [XCore] Add missing l2rus instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173634
2013-01-27 22:28:30 +00:00
Richard Osborne f2ecd40929 [XCore] Add missing l2r instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173629
2013-01-27 21:26:02 +00:00
Richard Osborne 7fe8f63544 [XCore] Add missing 1r instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173624
2013-01-27 20:46:21 +00:00
Richard Osborne 8f56317287 [XCore] Add missing 0r instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173623
2013-01-27 20:42:57 +00:00
Bill Wendling cc1fc9465b Convert the CPP backend to use the AttributeSet instead of AttributeWithIndex.
Further removal of the introspective AttributeWithIndex thing. Also fix the #includes.

llvm-svn: 173599
2013-01-27 01:22:51 +00:00
Benjamin Kramer 6a93596538 X86: Decode PALIGN operands so I don't have to do it in my head.
llvm-svn: 173572
2013-01-26 13:31:37 +00:00
Benjamin Kramer 99c68dd964 X86: Do splat promotion later, so the optimizer can chew on it first.
This catches many cases where we can emit a more efficient shuffle for a
specific mask or when the mask contains undefs. Once the splat is lowered to
unpacks we can't do that anymore.

There is a possibility of moving the promotion after pshufb matching, but I'm
not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so
I avoided that for now.

llvm-svn: 173569
2013-01-26 11:44:21 +00:00
Reed Kotler 233cee2b5b fix use of std::std. it's ordered set.
llvm-svn: 173563
2013-01-26 06:58:35 +00:00
Dmitri Gribenko c451bdf9ff Remove unused variables, silences -Wunused-variable
llvm-svn: 173526
2013-01-25 23:17:21 +00:00
Bill Wendling 57625a4966 Remove some introspection functions.
The 'getSlot' function and its ilk allow introspection into the AttributeSet
class. However, that class should be opaque. Allow access through accessor
methods instead.

llvm-svn: 173522
2013-01-25 23:09:36 +00:00
Hal Finkel 4e5ca9e578 Initial implementation of PPCTargetTransformInfo
This provides a place to add customized operation cost information and
control some other target-specific IR-level transformations.

The only non-trivial logic in this checkin assigns a higher cost to
unaligned loads and stores (covered by the included test case).

llvm-svn: 173520
2013-01-25 23:05:59 +00:00
Eli Bendersky 597fc1233a In this patch, we teach X86_64TargetMachine that it has a ILP32
(defined by the x32 ABI) mode, in which case its pointers are 32-bits
in size. This knowledge is also added to X86RegisterInfo that now
returns the appropriate registers in getPointerRegClass.

There are many outcomes to this change. In order to keep the patches
separate and manageable, we start by focusing on some simple testable
cases. The patch adds a test with passing a pointer to a function -
focusing on the difference between the two data models for x86-64.
Another test is added for handling of 'sret' arguments (and
functionality is added in X86ISelLowering to make it work).

A note on naming: the "x32 ABI" document refers to the AMD64
architecture (in LLVM it's distinguished by being is64Bits() in the
x86 subtarget) with two variations: the LP64 (default) data model, and
the ILP32 data model. This patch adds predicates to the subtarget
which are consistent with this naming scheme.

llvm-svn: 173503
2013-01-25 22:07:43 +00:00
Richard Osborne 6b86eec819 Add instruction encodings / disassembly support for l4r instructions.
llvm-svn: 173501
2013-01-25 21:55:32 +00:00
Bill Wendling 8649283e75 Use the new 'getSlotIndex' method to retrieve the attribute's slot index.
llvm-svn: 173499
2013-01-25 21:46:52 +00:00
Richard Osborne a520a7dcf3 Use the correct format in the STW / SETPSC instruction names.
llvm-svn: 173494
2013-01-25 21:25:12 +00:00
Richard Osborne 9a228a13c6 Fix order of operands for crc8_l4r
The order in which operands appear in the encoded instruction is different
to order in which they appear in assembly. This changes the XCore backend to
use the instruction encoding order.

llvm-svn: 173493
2013-01-25 21:20:28 +00:00
Richard Osborne a19fa86a70 Add instruction encodings / disassembly support for l5r instructions.
llvm-svn: 173479
2013-01-25 20:20:07 +00:00
Richard Osborne 8ae02d3cef Fix order of operands for l5r instructions.
With this change the operands order matches the order in which the operands
are encoded in the instruction.

llvm-svn: 173477
2013-01-25 20:16:00 +00:00
Richard Osborne ea023fcde1 Use correct mnemonic / instruction name for ldivu.
llvm-svn: 173476
2013-01-25 20:11:26 +00:00
Hal Finkel 53f4ba6ce3 More cleanup of PPC register definitions.
Uses the new !add TableGen operator to do more cleanup of the
PPC register definitions.

llvm-svn: 173446
2013-01-25 14:49:10 +00:00
Silviu Baranga 3eb45a03af Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the sutraction of the higher 32 bit parts gives a 0 result, we need to do the store operation.
llvm-svn: 173437
2013-01-25 10:39:49 +00:00
Andrew Trick e2c3f5c982 MIsched: Improve the interface to SchedDFS analysis (subtrees).
Allow the strategy to select SchedDFS. Allow the results of SchedDFS
to affect initialization of the scheduler state.

llvm-svn: 173425
2013-01-25 06:33:57 +00:00
Jack Carter 07c818d2da This patch implements parsing the .word
directive for the Mips assembler.

Contributer: Vladimir Medic
 
llvm-svn: 173407
2013-01-25 01:31:34 +00:00
Akira Hatanaka 28aed9ca85 [mips] Set flag neverHasSideEffects flag on some of the floating point instructions.
llvm-svn: 173401
2013-01-25 00:20:39 +00:00
Renato Golin d4c392e6ff Moving Cost Tables up to share with other targets
llvm-svn: 173382
2013-01-24 23:01:00 +00:00
Hal Finkel 41176f43c4 Start cleanup of PPC register definitions using foreach loops.
No functionality change intended.

This captures the first two cases GPR32/64. For the others, we need
an addition operator (if we have one, I've not yet found it).

Based on a suggestion made by Tom Stellard in the AArch64 review!

llvm-svn: 173366
2013-01-24 20:43:18 +00:00
NAKAMURA Takumi bf8f207519 MipsISelLowering.cpp: Fill unreachable paths to fix warnings. [-Wsometimes-uninitialized]
FIXME: Could they, unreachable(s), be removed?
FIXME: I could prefer the coding standards...
llvm-svn: 173325
2013-01-24 06:08:06 +00:00
NAKAMURA Takumi f25b7c6816 MipsISelLowering.cpp: Fix a warning, take two. [-Wunused-variable]
...and fix a typo, s/#ifdef/#ifndef/

llvm-svn: 173324
2013-01-24 05:54:23 +00:00
NAKAMURA Takumi c77d028bfb MipsISelLowering.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 173323
2013-01-24 05:47:29 +00:00
Reed Kotler a2d76bce1f The next phase of Mips16 hard float implementation.
Allow Mips16 routines to call Mips32 routines that have abi requirements
that either arguments or return values are passed in floating point 
registers. This handles only the pic case. We have not done non pic
for Mips16 yet in any form.

The libm functions are Mips32, so with this addition we have a complete
Mips16 hard float implementation.

We still are not able to complete mix Mip16 and Mips32 with hard float.
That will be the next phase which will have several steps. For Mips32
to freely call Mips16 some stub functions must be created.

llvm-svn: 173320
2013-01-24 04:24:02 +00:00
Tom Stellard 6f1b8657f9 R600: Add a llvm.R600.store.swizzle intrinsics
This intrinsic is translated to ALLOC_EXPORT_WORD1_SWIZ, hence its
name. It is used to store vs/fs outputs

Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173297
2013-01-23 21:39:49 +00:00
Tom Stellard d8ac91d436 R600: Simplify stream outputs intrinsic
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173296
2013-01-23 21:39:47 +00:00
Richard Osborne 54e311821f Add instruction encodings / disassembly support for l6r instructions.
llvm-svn: 173288
2013-01-23 20:08:11 +00:00
Eli Bendersky f759526983 Fix powerpc test failure - forgot to initialize stack slot size for PPCLinuxMCAsmInfo
llvm-svn: 173275
2013-01-23 17:12:15 +00:00
Eli Bendersky 32aab2216d Clean up assignment of CalleeSaveStackSlotSize: get rid of the default and explicitly set this in every target that needs to change it from the default.
llvm-svn: 173270
2013-01-23 16:22:04 +00:00
Benjamin Kramer c4231cc9b3 NVPTX: Stop leaking memory by using a managed constant instead of a new Argument.
This is still an egregious hack since we don't have a nice interface for this
kind of thing but should help the valgrind leak check buildbot to become green.

llvm-svn: 173267
2013-01-23 15:21:44 +00:00
Bill Wendling d154e283f2 Add the IR attribute 'sspstrong'.
SSPStrong applies a heuristic to insert stack protectors in these situations:

* A Protector is required for functions which contain an array, regardless of
  type or length.

* A Protector is required for functions which contain a structure/union which
  contains an array, regardless of type or length.  Note, there is no limit to
  the depth of nesting.

* A protector is required when the address of a local variable (i.e., stack
  based variable) is exposed. (E.g., such as through a local whose address is
  taken as part of the RHS of an assignment or a local whose address is taken as
  part of a function argument.)

This patch implements the SSPString attribute to be equivalent to
SSPRequired. This will change in a subsequent patch.

llvm-svn: 173230
2013-01-23 06:41:41 +00:00
Tom Stellard 365366f9ef R600: rework handling of the constants
Remove Cxxx registers, add new special register - "ALU_CONST" and new
operand for each alu src - "sel". ALU_CONST is used to designate that the
new operand contains the value to override src.sel, src.kc_bank, src.chan
for constants in the driver.

Patch by: Vadim Girlin

Vincent Lejeune:
  - Use pointers for constants
  - Fold CONST_ADDRESS when possible

Tom Stellard:
  - Give CONSTANT_BUFFER_0 its own address space
  - Use integer types for constant loads

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173222
2013-01-23 02:09:06 +00:00
Tom Stellard ff62c35da0 R600: Add a CONST_ADDRESS node to model constant buf read
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173221
2013-01-23 02:09:03 +00:00
Tom Stellard ab28e9a30a R600: Factorise VTX_WORD0 and VTX_WORD1 in tblgen def
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 173220
2013-01-23 02:09:01 +00:00
Richard Osborne 1a06479f46 Add instruction encodings / disassembly support for u10 / lu10 instructions.
llvm-svn: 173204
2013-01-22 22:55:04 +00:00
Michael Liao 3dffc5e2b7 Fix an issue of pseudo atomic instruction DAG schedule
- Add list of physical registers clobbered in pseudo atomic insts
  Physical registers are clobbered when pseudo atomic instructions are
  expanded. Add them in clobber list to prevent DAG scheduler to
  mis-schedule them after these insns are declared side-effect free.
- Add test case from Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 173200
2013-01-22 21:47:38 +00:00
Akira Hatanaka 88c0ec826c [mips] Implement MipsRegisterInfo::getRegPressureLimit.
llvm-svn: 173197
2013-01-22 21:34:25 +00:00
Akira Hatanaka f7d16d0563 [mips] Clean up code in MipsTargetLowering::LowerCall. No functional change
intended

llvm-svn: 173189
2013-01-22 20:05:56 +00:00
Benjamin Kramer fee7d21ae7 X86: Make sure we account for the FMA4 register immediate value, otherwise rip-rel relocations will be off by one byte.
PR15040.

llvm-svn: 173176
2013-01-22 18:05:59 +00:00
Eli Bendersky 0893e1079d Initial patch for x32 ABI support.
Add the x32 environment kind to the triple, and separate the concept of
pointer size and callee save stack slot size, since they're not equal
on x32.

llvm-svn: 173175
2013-01-22 18:02:49 +00:00
Tim Northover 29178a348a Make APFloat constructor require explicit semantics.
Previously we tried to infer it from the bit width size, with an added
IsIEEE argument for the PPC/IEEE 128-bit case, which had a default
value. This default value allowed bugs to creep in, where it was
inappropriate.

llvm-svn: 173138
2013-01-22 09:46:31 +00:00
Richard Osborne 5d477751df Fix some incorrectly named u10 / lu10 instructions.
llvm-svn: 173090
2013-01-21 21:12:30 +00:00
Richard Osborne 38cff3ea7f Remove unused multiclass.
llvm-svn: 173087
2013-01-21 20:50:54 +00:00
Richard Osborne 9d3ec06ef8 Add instruction encodings / disassembly support for u6 / lu6 instructions.
llvm-svn: 173086
2013-01-21 20:44:17 +00:00
Richard Osborne 6e58c6d86d Add instruction encoding / disassembly support for ru6 / lru6 instructions.
llvm-svn: 173085
2013-01-21 20:42:16 +00:00
Richard Osborne 0d68e21ca7 Use correct format for the LDAWCP instruction (u6).
llvm-svn: 173083
2013-01-21 20:32:54 +00:00
Tom Stellard c9b903138d R600/SI: Use unnormalized coordinates for sampling with the RECT target.
Patch by: Michel Dänzer

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 173053
2013-01-21 15:40:48 +00:00
Tom Stellard 14421a793f R600/SI: Take target parameter for sample intrinsics.
Patch by: Michel Dänzer

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 173052
2013-01-21 15:40:47 +00:00
Tom Stellard 74dda0da31 R600/SI: Derive all sample intrinsics from a single class.
Patch by: Michel Dänzer

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 173051
2013-01-21 15:40:46 +00:00
NAKAMURA Takumi c96fb1bd36 R600/SILowerControlFlow.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 173040
2013-01-21 14:06:48 +00:00
Craig Topper 66163a35ee Use <0 checks in place of ==-1 because it results in simpler code.
llvm-svn: 173010
2013-01-21 07:25:16 +00:00
Craig Topper 9b29486f42 Use MVT instead of EVT in LowerVECTOR_SHUFFLEtoBlend.
llvm-svn: 173009
2013-01-21 07:19:54 +00:00
Craig Topper 32c5406dcf Remove trailing whitespace.
llvm-svn: 173008
2013-01-21 06:57:59 +00:00
Craig Topper 5c84c25bf4 Fix some 80 column violations.
llvm-svn: 173006
2013-01-21 06:21:54 +00:00
Craig Topper 2cd375896a Make helper method static.
llvm-svn: 173005
2013-01-21 06:13:28 +00:00
Craig Topper cf93977920 Convert more EVT's to MVT's in the lowering methods.
llvm-svn: 172995
2013-01-20 21:50:27 +00:00
Craig Topper e65a08be64 Capitalize lowerTRUNCATE so that it matches the other lower functions in this file despite it not matching coding standards.
llvm-svn: 172994
2013-01-20 21:34:37 +00:00
Renato Golin e1fb059327 Revert CostTable algorithm, will re-write
llvm-svn: 172992
2013-01-20 20:57:20 +00:00
Richard Osborne 4e69724869 Add instruction encodings / disassembly support for l2rus instructions.
llvm-svn: 172987
2013-01-20 18:51:15 +00:00
Richard Osborne 9fbf57b26c Add instruction encodings / disassembly support for l3r instructions.
llvm-svn: 172986
2013-01-20 18:37:49 +00:00
Richard Osborne f063fcee7a Add instruction encodings / disassembler support for 2rus instructions.
llvm-svn: 172985
2013-01-20 17:22:43 +00:00
Richard Osborne 3fb7395233 Add instruction encodings / disassembly support 3r instructions.
It is not possible to distinguish 3r instructions from 2r / rus instructions
using only the fixed bits. Therefore if an instruction doesn't match the
2r / rus format try to decode it as a 3r instruction before returning Fail.

llvm-svn: 172984
2013-01-20 17:18:47 +00:00
Craig Topper ce61fdf0a3 Make LowerVSETCC a static function and use MVT instead of EVT.
llvm-svn: 172969
2013-01-20 09:02:22 +00:00
Nadav Rotem 9450fcfff1 Revert 172708.
The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends.
This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical.
Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume
that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model.

llvm-svn: 172968
2013-01-20 08:35:56 +00:00
Craig Topper 9976974cc6 Make some helper methods static.
llvm-svn: 172936
2013-01-20 00:50:58 +00:00
Craig Topper 4ac87da529 Remove DebugLoc argument from static function. It can easily be obtained from the SVOp passed in.
llvm-svn: 172935
2013-01-20 00:43:42 +00:00
Craig Topper 3da6507c41 Use MVT instead of EVT in more instruction lowering code.
llvm-svn: 172933
2013-01-20 00:38:18 +00:00
Craig Topper 53c7fbabbf Use MVT instead of EVT in more of the shuffle lowering code.
llvm-svn: 172930
2013-01-19 23:36:09 +00:00
Craig Topper bb772d27a7 Capitalize LowerVectorIntExtend to be consistent with all the other lower functions in this file.
llvm-svn: 172927
2013-01-19 23:14:09 +00:00
Nadav Rotem 7b3120b9ae On Sandybridge split unaligned 256bit stores into two xmm-sized stores.
llvm-svn: 172894
2013-01-19 08:38:41 +00:00
Craig Topper 84b01120bc Use MVT instead of EVT when computing shuffle immediates since they can only be for legal types. Keeps compiler from generating unneeded checks and handling for extended types.
llvm-svn: 172893
2013-01-19 08:27:45 +00:00
Chandler Carruth 1fe21fc0b5 Sort all of the includes. Several files got checked in with mis-sorted
includes.

llvm-svn: 172891
2013-01-19 08:03:47 +00:00
Jack Carter 7ab15fafe3 This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.
Formatting fixes. Mostly long lines and 
blank spaces at end of lines.

Contributer: Jack Carter
 
llvm-svn: 172882
2013-01-19 02:00:40 +00:00
Nadav Rotem 7431211214 On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction.
llvm-svn: 172868
2013-01-18 23:10:30 +00:00
Jack Carter c1b17ed2e1 This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.
Support for Mips register information sections.

Mips ELF object files have a section that is dedicated
to register use info. Some of this information such as
the assumed Global Pointer value is used by the linker
in relocation resolution.

The register info file is .reginfo in o32 and .MIPS.options
in 64 and n32 abi files.

This patch contains the changes needed to create the sections,
but leaves the actual register accounting for a future patch.


Contributer: Jack Carter
 
llvm-svn: 172847
2013-01-18 21:20:38 +00:00
Tom Stellard c4cabef782 R600: Proper insert S_WAITCNT instructions
Some instructions like memory reads/writes are executed
asynchronously, so we need to insert S_WAITCNT instructions
to block before accessing their results. Previously we have
just inserted S_WAITCNT instructions after each async
instruction, this patch fixes this and adds a prober
insertion pass.

Patch by: Christian König

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 172846
2013-01-18 21:15:53 +00:00
Tom Stellard be8ebeebf7 R600: Optimize and cleanup KILL on SI
We shouldn't insert KILL optimization if we don't have a
kill instruction at all.

Patch by: Christian König

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 172845
2013-01-18 21:15:50 +00:00
Jack Carter 86c2c564ff This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.

Removal of redundant code and formatting fixes.

Contributers: Jack Carter/Vladimir Medic
 
llvm-svn: 172842
2013-01-18 20:15:06 +00:00
Craig Topper 1cb8aa581b Calculate vector element size more directly for VINSERTF128/VEXTRACTF128 immediate handling. Also use MVT since this only called on legal types during pattern matching.
llvm-svn: 172797
2013-01-18 08:41:28 +00:00
Craig Topper e938138daf Minor formatting fix. No functional change.
llvm-svn: 172795
2013-01-18 07:27:20 +00:00
Craig Topper 908f7d14b5 Spelling fix: extened->extended. Trailing whitespace in same function.
llvm-svn: 172793
2013-01-18 06:50:59 +00:00
Craig Topper 01fcf2e2f2 Make more use of is128BitVector/is256BitVector in place of getSizeInBits() == 128/256.
llvm-svn: 172792
2013-01-18 06:44:29 +00:00
Chad Rosier 1e8f053bd1 [ms-inline asm] Make the error message more generic now that we support the
'SIZE' and 'LENGTH' operators.

llvm-svn: 172773
2013-01-18 00:50:59 +00:00
Bill Schmidt dee1ef8f53 This patch fixes PR13626 by providing i128 support in the return
calling convention.  128-bit integers are now properly returned
in GPR3 and GPR4 on PowerPC.

llvm-svn: 172745
2013-01-17 19:34:57 +00:00
Chad Rosier d0ed73acb4 [ms-inline asm] Add support for the 'SIZE' and 'LENGTH' operators.
Part of rdar://12576868

llvm-svn: 172743
2013-01-17 19:21:48 +00:00
Jyotsna Verma 9b60c1d171 Add indexed load/store instructions for offset validation check.
This patch fixes bug 14902 - http://llvm.org/bugs/show_bug.cgi?id=14902

llvm-svn: 172737
2013-01-17 18:42:37 +00:00
Bill Schmidt 6b2940b01e This patch fixes the PPC calling convention to handle returns of
_Complex float and _Complex long double, by simply increasing the
number of floating point registers available for return values.

The test case verifies that the correct registers are loaded.

llvm-svn: 172733
2013-01-17 17:45:19 +00:00
Elena Demikhovsky f6a30e05d5 Optimization for the following SIGN_EXTEND pairs:
v8i8  -> v8i64, 
v8i8  -> v8i32, 
v4i8  -> v4i64, 
v4i16 -> v4i64 
for AVX and AVX2.

Bug 14865.

llvm-svn: 172708
2013-01-17 09:59:53 +00:00
Craig Topper c7e6feee42 Combine AVX and SSE forms of MOVSS and MOVSD into the same multiclasses so they get instantiated together.
llvm-svn: 172704
2013-01-17 06:59:42 +00:00
Jakob Stoklund Olesen 213a2f8b3f Provide a place for targets to insert ILP optimization passes.
Move the early if-conversion pass into this group.

ILP optimizations usually need to find the right balance between
register pressure and ILP using the MachineTraceMetrics analysis to
identify critical paths and estimate other costs. Such passes should run
together so they can share dominator tree and loop info analyses.

Besides if-conversion, future passes to run here here could include
expression height reduction and ARM's MLxExpansion pass.

llvm-svn: 172687
2013-01-17 00:58:38 +00:00
Jack Carter 2a74a87b71 This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.

The Mips RDHWR (Read Hardware Register) instruction was not 
tested for assembler or dissassembler consumption. This patch
adds that functionality.

Contributer: Vladimir Medic
 
llvm-svn: 172685
2013-01-17 00:28:20 +00:00
Renato Golin f104c4c4ca Change CostTable model to be global to all targets
Moving the X86CostTable to a common place, so that other back-ends
can share the code. Also simplifying it a bit and commoning up
tables with one and two types on operations.

llvm-svn: 172658
2013-01-16 21:29:55 +00:00
Jack Carter 5619f91bf7 reverting 172579
llvm-svn: 172594
2013-01-16 01:29:10 +00:00
Jack Carter e0c1e1a47e Akira,
Hope you are feeling better.

The Mips RDHWR (Read Hardware Register) instruction was not 
tested for assembler or dissassembler consumption. This patch
adds that functionality.

Contributer: Vladimir Medic
 
llvm-svn: 172579
2013-01-16 00:07:45 +00:00
Jack Carter f238510c43 This patch fixes a Mips specific bug where
we need to generate a N64 compound relocation
R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE.

The bug was exposed by the SingleSourcetest case 
DuffsDevice.c.

Contributer: Jack Carter
llvm-svn: 172496
2013-01-15 01:08:02 +00:00
Chad Rosier 5c118fd2ec [ms-inline asm] Extend support for parsing Intel bracketed memory operands that
have an arbitrary ordering of the base register, index register and displacement.
rdar://12527141

llvm-svn: 172484
2013-01-14 22:31:35 +00:00
Dmitri Gribenko f24e57f227 Improve r172468: const_cast is not needed here
llvm-svn: 172483
2013-01-14 22:18:18 +00:00
Dmitri Gribenko 2e1df0e354 Improve r172471: avoid all those extra casts on the lines nearby
llvm-svn: 172481
2013-01-14 22:08:37 +00:00
Quentin Colombet 77ca8b83a9 Follow up of commit r172472.
Refactor the big if/else sequence into one string switch for ARM subtype selection.

llvm-svn: 172475
2013-01-14 21:34:09 +00:00
Quentin Colombet 1a71168624 Complete the existing support of ARM v6m, v7m, and v7em, i.e., respectively cortex-m0, cortex-m3, and cortex-m4 on the backend side.
Adds new subtype values for the MachO format and use them when the related triple are set.

llvm-svn: 172472
2013-01-14 21:07:43 +00:00
David Greene cf7ae6c2fd Fix Casting
Fix a casting-away-const compiler warning.

llvm-svn: 172471
2013-01-14 21:04:47 +00:00
David Greene c311561708 Fix Another Cast
Properly cast code to eliminate cast-away-const errors.

llvm-svn: 172468
2013-01-14 21:04:42 +00:00
Craig Topper 0d2c29e807 Simplify nested strconcats in X86 td files since strconcat can take more than 2 arguments.
llvm-svn: 172379
2013-01-14 07:46:34 +00:00
Craig Topper 4c69a05d2d Create a single multiclass for SSE and AVX version of MOVL/MOVH. Prevents needing to specify everything twice. No functional change intended
llvm-svn: 172378
2013-01-14 07:26:58 +00:00
Nick Lewycky f41a80efd0 Fix typo in comment.
llvm-svn: 172364
2013-01-13 19:03:55 +00:00
Dmitri Gribenko 226fea5bd6 Remove redundant 'llvm::' qualifications
llvm-svn: 172358
2013-01-13 16:01:15 +00:00
Benjamin Kramer bcd14a0f26 X86: Add patterns for X86ISD::VSEXT in registers.
Those can occur when something between the sextload and the store is on the same
chain and blocks isel. Fixes PR14887.

llvm-svn: 172353
2013-01-13 11:37:04 +00:00
NAKAMURA Takumi de45c3a485 MipsDisassembler.cpp: Prune DecodeHWRegs64RegisterClass() to suppress a warning. [-Wunused-function]
llvm-svn: 172319
2013-01-12 15:37:00 +00:00
NAKAMURA Takumi 956c123ab6 MipsAsmParser: Try to unbreak tests to add extra check.
llvm-svn: 172315
2013-01-12 15:19:10 +00:00
Jack Carter 873c724b4a This patch tackles the problem of parsing Mips
register names in the standalone assembler llvm-mc.

Registers such as $A1 can represent either a 32 or
64 bit register based on the instruction using it.
In addition, based on the abi, $T0 can represent different
32 bit registers.


The problem is resolved by the Mips specific AsmParser 
td definitions changing to work together. Many cases of
RegisterClass parameters are now RegisterOperand.


Contributer: Vladimir Medic
llvm-svn: 172284
2013-01-12 01:03:14 +00:00
Preston Gurd 99c6990457 Update patch for the pad short functions pass for Intel Atom (only).
Adds a check for -Oz, changes the code to not re-visit BBs,
and skips over DBG_VALUE instrs.

Patch by Andy Zhang.

llvm-svn: 172258
2013-01-11 22:06:56 +00:00
NAKAMURA Takumi 7f25427686 X86AsmParser.cpp: Fix up r172148, to add initializer in another CreateMem().
llvm-svn: 172157
2013-01-11 01:13:54 +00:00
Jakub Staszak ab3d878f35 Remove heavy and unused #inclues from X86TargetObjectFile.cpp.
llvm-svn: 172151
2013-01-10 23:43:56 +00:00
Chad Rosier 8c2a9c744e [ms-inline asm] Make sure we set a default value for AddressOf. Follow on to
r172121.

llvm-svn: 172148
2013-01-10 23:39:07 +00:00
Chad Rosier a4bc9437a2 [ms-inline asm] Add support for calling functions from inline assembly.
Part of rdar://12991541

llvm-svn: 172121
2013-01-10 22:10:27 +00:00
Joel Jones 5459754d33 Fix description of ARMOperand
llvm-svn: 172011
2013-01-09 22:34:16 +00:00
Nadav Rotem b1791a75cd ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor.
llvm-svn: 172010
2013-01-09 22:29:00 +00:00
Adhemerval Zanella 1ae2248e14 PowerPC: EH adjustments
This patch adjust the r171506 to make all DWARF enconding pc-relative
for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT
(since the eh_frame will not generate PIC-relative relocation) and also
adds the emission of stubs created by the TTypeEncoding.

llvm-svn: 171979
2013-01-09 17:08:15 +00:00
Nadav Rotem 977e0be4a0 Efficient lowering of vector sdiv when the divisor is a splatted power of two constant.
PR 14848. The lowered sequence is based on the existing sequence the target-independent
DAG Combiner creates for the scalar case.

Patch by Zvi Rackover.

llvm-svn: 171953
2013-01-09 05:14:33 +00:00
Eric Christopher bf7bc4966c Last in the series of removing unnecessary '0' arguments for
address space. Reordered the EmitULEB128IntValue arguments to
make this easier.

llvm-svn: 171949
2013-01-09 03:52:05 +00:00
Andrew Trick 9f0b95f260 MIsched: add an ILP window property to machine model.
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.

I converted some in-order scheduling tests to A2. Hal is working on
more test cases.

llvm-svn: 171946
2013-01-09 03:36:49 +00:00
Eric Christopher e3ab3d0e2c These functions have default arguments of 0 for the last arg. Use
them.

llvm-svn: 171933
2013-01-09 01:57:54 +00:00
Nadav Rotem b696c36fcd Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM.
llvm-svn: 171928
2013-01-09 01:15:42 +00:00
Jack Carter c3dd91c4d7 This patch produces the correct addend value for
an R_MIPS_GPREL16 relocation.


Contributer: Jack Carter
llvm-svn: 171882
2013-01-08 19:01:28 +00:00
Jack Carter 9e28cd3fad This patch produces the correct pointer size
value in the 64 bit .eh_frame section.

It doesn't however allow exception handling to work
yet since it depends on the correct relocation model
being set in the ELF header flags.


Contributer: Jack Carter
llvm-svn: 171881
2013-01-08 18:53:20 +00:00
Preston Gurd a01daace88 Pad Short Functions for Intel Atom
The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.

llvm-svn: 171879
2013-01-08 18:27:24 +00:00
Eli Bendersky 4d9ada036c Renamed MCInstFragment to MCRelaxableFragment and added some comments.
No change in functionality.

llvm-svn: 171822
2013-01-08 00:22:56 +00:00
Jim Grosbach 9dbf3ee9d0 ARM: Copy-paste error.
llvm-svn: 171790
2013-01-07 21:24:35 +00:00
Jim Grosbach 553eb75663 ARM: Fix a few copy-paste errors.
s/X86/ARM/

llvm-svn: 171789
2013-01-07 21:12:13 +00:00
Bill Schmidt 9b1e3e25dc This patch addresses bug 14678 by fixing two problems in medium code model
code generation.  Variables addressed through a GlobalAlias were not being
handled, and variables with available_externally linkage were treated
incorrectly.  The patch contains two new tests to verify the correct code
generation for these cases.

llvm-svn: 171778
2013-01-07 19:29:18 +00:00
Jordan Rose e8f1eaea8a Change SMRange to be half-open (exclusive end) instead of closed (inclusive)
This is necessary not only for representing empty ranges, but for handling
multibyte characters in the input. (If the end pointer in a range refers to
a multibyte character, should it point to the beginning or the end of the
character in a char array?) Some of the code in the asm parsers was already
assuming this anyway.

llvm-svn: 171765
2013-01-07 19:00:49 +00:00
NAKAMURA Takumi 458a8277cc R600/SIISelLowering.cpp: Suppress a warning. [-Wunused-variable]
llvm-svn: 171728
2013-01-07 11:14:44 +00:00
Tim Northover 2883da3b51 Add LICENSE.TXT covering contributions made by ARM.
Absent a Contributor's License Agreement (CLA) with an LLVM legal entity and as
reviewed and agreed with Chris Lattner, add a patent license covering future
contributions from ARM until there is a CLA. This is to make explicit ARM's
grant of patent rights to recipients of LLVM containing ARM-contributed
material.

llvm-svn: 171721
2013-01-07 10:04:49 +00:00
Craig Topper ae65212a4b Remove more unnecessary # operators with nothing to paste proceeding them.
llvm-svn: 171702
2013-01-07 06:14:20 +00:00
Craig Topper a8c5ec09c7 Remove # from the beginning and end of def names. The # is a paste operator and should only be used with something to paste on either side.
llvm-svn: 171697
2013-01-07 05:45:56 +00:00
Craig Topper 25cdf92b34 Remove # from the beginning and end of def names.
llvm-svn: 171696
2013-01-07 05:26:58 +00:00
Craig Topper bd62d64cbf Remove unnecessary # tokens at the beginning and end of defm names.
llvm-svn: 171694
2013-01-07 05:04:39 +00:00
Chandler Carruth 2109f47d97 Fix the enumerator names for ShuffleKind to match tho coding standards,
and make its comments doxygen comments.

llvm-svn: 171688
2013-01-07 03:20:02 +00:00
Chandler Carruth 50a36cd148 Make the popcnt support enums and methods have more clear names and
follow the conding conventions regarding enumerating a set of "kinds" of
things.

llvm-svn: 171687
2013-01-07 03:16:03 +00:00
Chandler Carruth d3e73556d6 Move TargetTransformInfo to live under the Analysis library. This no
longer would violate any dependency layering and it is in fact an
analysis. =]

llvm-svn: 171686
2013-01-07 03:08:10 +00:00
Chandler Carruth 664e354de7 Switch TargetTransformInfo from an immutable analysis pass that requires
a TargetMachine to construct (and thus isn't always available), to an
analysis group that supports layered implementations much like
AliasAnalysis does. This is a pretty massive change, with a few parts
that I was unable to easily separate (sorry), so I'll walk through it.

The first step of this conversion was to make TargetTransformInfo an
analysis group, and to sink the nonce implementations in
ScalarTargetTransformInfo and VectorTargetTranformInfo into
a NoTargetTransformInfo pass. This allows other passes to add a hard
requirement on TTI, and assume they will always get at least on
implementation.

The TargetTransformInfo analysis group leverages the delegation chaining
trick that AliasAnalysis uses, where the base class for the analysis
group delegates to the previous analysis *pass*, allowing all but tho
NoFoo analysis passes to only implement the parts of the interfaces they
support. It also introduces a new trick where each pass in the group
retains a pointer to the top-most pass that has been initialized. This
allows passes to implement one API in terms of another API and benefit
when some other pass above them in the stack has more precise results
for the second API.

The second step of this conversion is to create a pass that implements
the TargetTransformInfo analysis using the target-independent
abstractions in the code generator. This replaces the
ScalarTargetTransformImpl and VectorTargetTransformImpl classes in
lib/Target with a single pass in lib/CodeGen called
BasicTargetTransformInfo. This class actually provides most of the TTI
functionality, basing it upon the TargetLowering abstraction and other
information in the target independent code generator.

The third step of the conversion adds support to all TargetMachines to
register custom analysis passes. This allows building those passes with
access to TargetLowering or other target-specific classes, and it also
allows each target to customize the set of analysis passes desired in
the pass manager. The baseline LLVMTargetMachine implements this
interface to add the BasicTTI pass to the pass manager, and all of the
tools that want to support target-aware TTI passes call this routine on
whatever target machine they end up with to add the appropriate passes.

The fourth step of the conversion created target-specific TTI analysis
passes for the X86 and ARM backends. These passes contain the custom
logic that was previously in their extensions of the
ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces.
I separated them into their own file, as now all of the interface bits
are private and they just expose a function to create the pass itself.
Then I extended these target machines to set up a custom set of analysis
passes, first adding BasicTTI as a fallback, and then adding their
customized TTI implementations.

The fourth step required logic that was shared between the target
independent layer and the specific targets to move to a different
interface, as they no longer derive from each other. As a consequence,
a helper functions were added to TargetLowering representing the common
logic needed both in the target implementation and the codegen
implementation of the TTI pass. While technically this is the only
change that could have been committed separately, it would have been
a nightmare to extract.

The final step of the conversion was just to delete all the old
boilerplate. This got rid of the ScalarTargetTransformInfo and
VectorTargetTransformInfo classes, all of the support in all of the
targets for producing instances of them, and all of the support in the
tools for manually constructing a pass based around them.

Now that TTI is a relatively normal analysis group, two things become
straightforward. First, we can sink it into lib/Analysis which is a more
natural layer for it to live. Second, clients of this interface can
depend on it *always* being available which will simplify their code and
behavior. These (and other) simplifications will follow in subsequent
commits, this one is clearly big enough.

Finally, I'm very aware that much of the comments and documentation
needs to be updated. As soon as I had this working, and plausibly well
commented, I wanted to get it committed and in front of the build bots.
I'll be doing a few passes over documentation later if it sticks.

Commits to update DragonEgg and Clang will be made presently.

llvm-svn: 171681
2013-01-07 01:37:14 +00:00
Craig Topper 4f1c7256f9 Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior.
cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or not suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix.

Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.

llvm-svn: 171668
2013-01-06 20:39:29 +00:00
Evan Cheng 3fb03e23a4 Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch.
llvm-svn: 171665
2013-01-06 19:00:15 +00:00
Chandler Carruth 539edf4ee0 Convert the TargetTransformInfo from an immutable pass with dynamic
interfaces which could be extracted from it, and must be provided on
construction, to a chained analysis group.

The end goal here is that TTI works much like AA -- there is a baseline
"no-op" and target independent pass which is in the group, and each
target can expose a target-specific pass in the group. These passes will
naturally chain allowing each target-specific pass to delegate to the
generic pass as needed.

In particular, this will allow a much simpler interface for passes that
would like to use TTI -- they can have a hard dependency on TTI and it
will just be satisfied by the stub implementation when that is all that
is available.

This patch is a WIP however. In particular, the "stub" pass is actually
the one and only pass, and everything there is implemented by delegating
to the target-provided interfaces. As a consequence the tools still have
to explicitly construct the pass. Switching targets to provide custom
passes and sinking the stub behavior into the NoTTI pass is the next
step.

llvm-svn: 171621
2013-01-05 11:43:11 +00:00
Craig Topper 92a70b1e65 Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.
llvm-svn: 171608
2013-01-05 07:39:25 +00:00
Nadav Rotem 478b6a47ec Revert revision 171524. Original message:
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603
2013-01-05 05:42:48 +00:00
Chandler Carruth 4a7c311008 Refactor the ScalarTargetTransformInfo API for querying about the
legality of an address mode to not use a struct of four values and
instead to accept them as parameters. I'd love to have named parameters
here as most callers only care about one or two of these, but the
defaults aren't terribly scary to write out.

That said, there is no real impact of this as the passes aren't yet
using STTI for this and are still relying upon TargetLowering.

llvm-svn: 171595
2013-01-05 03:36:17 +00:00
Akira Hatanaka d35a263076 [mips] Fix data layout string. Add 64 to the list of native integer widths
and add stack alignment information.

llvm-svn: 171587
2013-01-05 02:00:56 +00:00
Jakub Staszak 43fafaf496 Move 'break' to the right place to prevent fallthru. There is no test-case
because conditions in the next case prevented from doing anything nasty.

llvm-svn: 171549
2013-01-04 23:01:26 +00:00
Preston Gurd e36b685a94 The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524
2013-01-04 20:54:54 +00:00
Akira Hatanaka b13b33359b [mips] MipsTargetLowering::getSetCCResultType should return a vector type if
vectors are being compared.

llvm-svn: 171517
2013-01-04 20:06:01 +00:00
Akira Hatanaka e067e5a13f [mips] 80 columns.
llvm-svn: 171515
2013-01-04 19:38:05 +00:00
Akira Hatanaka f412e7501a [mips] Reorder template parameters. Remove class shift_rotate_imm32 and
shift_rotate_imm64.

llvm-svn: 171513
2013-01-04 19:25:46 +00:00
Akira Hatanaka a7a9fa1c16 [mips] Refactor conditional move instructions.
llvm-svn: 171511
2013-01-04 19:16:38 +00:00
Akira Hatanaka e36e2f6876 [mips] Refactor instructions which move data from or to coprocessors.
llvm-svn: 171510
2013-01-04 19:13:49 +00:00
Adhemerval Zanella 9b0b781395 PowerPC: Fix eh_frame relocation for PIC
This patch fixes the PPC eh_frame definitions for the personality and 
frame unwinding for PIC objects. It makes PIC build correctly creates
relative relocations in the '.rela.eh_frame' segments and thus avoiding
a text relocation that generates a DT_TEXTREL segments in link phase.

llvm-svn: 171506
2013-01-04 19:08:13 +00:00
Nadav Rotem e6bb35435d Change the default number of registers to prevent unrolling on targets that dont have this hook.
llvm-svn: 171489
2013-01-04 18:40:39 +00:00
Nadav Rotem e1d5c4b8b9 LoopVectorizer:
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.

llvm-svn: 171469
2013-01-04 17:48:25 +00:00
Nadav Rotem c616a5408a Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message:
Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.

llvm-svn: 171468
2013-01-04 17:35:21 +00:00
Elena Demikhovsky 5f2f06d2d9 Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.

llvm-svn: 171467
2013-01-03 08:48:33 +00:00
Michael Gottesman 820aac1c78 Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks."
This reverts commit r171461 since it breaks the following tests:

Clang :: Analysis/outofbound-notwork.c
Clang :: Analysis/string-fail.c
Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp
Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp
Clang :: CXX/temp/temp.param/p14.cpp
Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp
Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c
Clang :: CodeGen/blocks-2.c
Clang :: CodeGen/libcalls-d.c
Clang :: CodeGen/libcalls-ld.c
Clang :: CodeGenCXX/conversion-function.cpp
Clang :: CodeGenCXX/debug-info-limit-type.cpp
Clang :: CodeGenCXX/inheriting-constructor.cpp
Clang :: FixIt/fixit-errors.c
Clang :: FixIt/fixit-pmem.cpp
Clang :: Modules/namespaces.cpp
Clang :: PCH/changed-files.c
Clang :: PCH/pr4489.c
Clang :: PCH/source-manager-stack.c
Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp
Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp
Clang :: SemaTemplate/instantiate-function-1.mm

llvm-svn: 171466
2013-01-03 08:18:30 +00:00
Craig Topper 7c27cc9fd0 Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.
llvm-svn: 171461
2013-01-03 06:40:20 +00:00
Hal Finkel 95de3f3018 Add a subtype parameter to VTTI::getShuffleCost
In order to cost subvector insertion and extraction, we need to know
the type of the subvector being extracted.

No functionality change.

llvm-svn: 171453
2013-01-03 02:34:09 +00:00
Kevin Enderby 726e0ea6eb Adds missing aliases for fcom and fcomp instructions without arguments.
Patch by Michael M Kuperstein!

llvm-svn: 171414
2013-01-02 21:20:15 +00:00
Nadav Rotem 761937a757 AVX: Fix a bug in WidenMaskArithmetic.
llvm-svn: 171398
2013-01-02 17:41:03 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth be81023d74 Resort the #include lines in include/... and lib/... with the
utils/sort_includes.py script.

Most of these are updating the new R600 target and fixing up a few
regressions that have creeped in since the last time I sorted the
includes.

llvm-svn: 171362
2013-01-02 10:22:59 +00:00
Craig Topper 9791afb182 Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP.
llvm-svn: 171356
2013-01-02 08:00:39 +00:00
Craig Topper 4bc5c4e152 Merge SSE and AVX instruction definitions for PSHUFD/PSHUFHW/PSHUFLW.
llvm-svn: 171355
2013-01-02 07:27:49 +00:00
Rafael Espindola db1a84c84a Revert 171351. It broke MC/X86/x86-32-avx.s.
llvm-svn: 171352
2013-01-02 01:35:11 +00:00
Craig Topper 86d0cdb82f Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP.
llvm-svn: 171351
2013-01-01 20:53:20 +00:00
Craig Topper 12ed9cd6ae Remove unused argument from a multiclass.
llvm-svn: 171340
2013-01-01 03:42:44 +00:00
Craig Topper 2edafc059d Merge intrinsic instruction definitions for SSE and AVX versions of RCPPS and RSQRTPS.
llvm-svn: 171339
2013-01-01 03:30:21 +00:00
Craig Topper d04dbec6c9 Remove 2 unused multiclasses.
llvm-svn: 171338
2013-01-01 02:02:45 +00:00
Craig Topper 7cc4f322cf Merge AVX/SSE instruction definitions for SQRTPS/PD, RSQRTPS, RCPPS. No funcitonal change intended.
llvm-svn: 171337
2013-01-01 00:11:07 +00:00
Craig Topper c2521cd309 Use packed instead of scalar itineraries for SSE1/2 SQRTPS/PD, RCPPS, and RSQRTPS. VEX-encoded forms already use packed.
llvm-svn: 171336
2012-12-31 23:49:05 +00:00
Bill Wendling 6e95ae803a Remove the getAttributesAtIndex and getNumAttrs methods in favor of using the getAttrSomewhere predicate. This prevents the uses of 'Attribute' as a collection of attributes.
llvm-svn: 171271
2012-12-31 00:49:59 +00:00
Nuno Lopes b6ad98224a convert a bunch of callers from DataLayout::getIndexedOffset() to GEP::accumulateConstantOffset().
The later API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares).
There are a few more places left with duplicated code, which I'll remove soon.

llvm-svn: 171259
2012-12-30 16:25:48 +00:00
Bill Wendling 749a43d874 Use the predicate methods off of AttributeSet instead of Attribute.
llvm-svn: 171257
2012-12-30 13:50:49 +00:00
Bill Wendling 74dba875e2 Remove the Function::getRetAttributes method in favor of using the AttributeSet accessor method.
llvm-svn: 171256
2012-12-30 13:01:51 +00:00
Bill Wendling 94dcaf8e2b Remove Function::getParamAttributes and use the AttributeSet accessor methods instead.
llvm-svn: 171255
2012-12-30 12:45:13 +00:00
Bill Wendling 698e84fc4f Remove the Function::getFnAttributes method in favor of using the AttributeSet
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.

llvm-svn: 171253
2012-12-30 10:32:01 +00:00
Bill Wendling 6190254e0f s/hasAttribute/contains/g to be more consistent with other method names.
llvm-svn: 171252
2012-12-30 09:17:46 +00:00
Craig Topper fe82eb6bcd Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to target-independent ISD nodes and use the existing patterns for those.
llvm-svn: 171237
2012-12-29 18:18:20 +00:00
Craig Topper f4a9c6e21b Merge similar functionality using a nested switch.
llvm-svn: 171229
2012-12-29 17:19:06 +00:00
Craig Topper 6b27251a76 Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min instructions. Lower them to target specific nodes and use those patterns instead. This also allows them to be commuted if UnsafeFPMath is enabled.
llvm-svn: 171227
2012-12-29 16:44:25 +00:00
Jakub Staszak 215f94143c Simplify code, no functionality change.
llvm-svn: 171226
2012-12-29 15:57:26 +00:00
Jakub Staszak afe8109fce Delete executive bit on ./lib/Target/Hexagon/HexagonAsmPrinter.h.
llvm-svn: 171225
2012-12-29 15:23:06 +00:00
Nadav Rotem 9785f519b4 CostModel: initial checkin for code that estimates the cost of special shuffles.
llvm-svn: 171180
2012-12-28 08:19:03 +00:00
Nadav Rotem c982a2dc25 wrap 80-col lines.
llvm-svn: 171179
2012-12-28 07:28:43 +00:00
Nadav Rotem 3da9ac72fa AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend.
llvm-svn: 171178
2012-12-28 05:45:24 +00:00
Nadav Rotem 68441914a5 Reverse the 'if' condition and reduce the indentation.
llvm-svn: 171172
2012-12-27 23:08:05 +00:00
Craig Topper ab2e6842cc Merge basic_sse12_fp_binop_p_int and basic_sse12_fp_binop_p_y_int multiclasses.
llvm-svn: 171171
2012-12-27 22:53:47 +00:00
Nadav Rotem 3b34190100 AVX/AVX2: Move the SEXT lowering code from a target specific DAGco to a lowering function.
llvm-svn: 171170
2012-12-27 22:47:16 +00:00
Craig Topper e2eec3c52b Merge basic_sse12_fp_binop_p and basic_sse12_fp_binop_p_y multiclasses.
llvm-svn: 171166
2012-12-27 18:51:50 +00:00
Nadav Rotem 2a054b4475 On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized
register. In most cases we actually compare or select YMM-sized registers
and mixing the two types creates horrible code. This commit optimizes
some of the transition sequences.

PR14657.

llvm-svn: 171148
2012-12-27 08:15:45 +00:00
Nadav Rotem 8e5d80eba3 AVX/AVX2: Move the code that lowers vector-trunc from a DAGCo-hook to custom lowering hook.
The vector truncs were scalarized during LegalizeVectorOps, later vectorized again by some DAGCombine optimization
and finally, lowered by a dagcombing optimization. Now, they are properly lowered during LegalizeVectorOps.
No new testcase because the original testcases still work.

llvm-svn: 171146
2012-12-27 07:45:10 +00:00
Craig Topper 757f3fc394 Add hasSideEffects=0 to some forms of ROUND, RCP, and RSQRT.
llvm-svn: 171143
2012-12-27 07:16:08 +00:00
Craig Topper 09ce4b9efe Move single letter 'P' prefix out of multiclass now that tablegen allows defm to start with #NAME. This makes instruction names more searchable again.
llvm-svn: 171141
2012-12-27 06:34:54 +00:00
Craig Topper 396cb795bc Add hasSideEffects=0 to some shift and rotate instructions. None of which are currently used by code generation.
llvm-svn: 171137
2012-12-27 03:35:44 +00:00
Craig Topper c7910828e4 Mark the divide instructions as hasSideEffects=0.
llvm-svn: 171136
2012-12-27 03:01:18 +00:00
Craig Topper 5b807aaa38 Add hasSideEffects=0 to CMP*rr_REV.
llvm-svn: 171130
2012-12-27 02:08:46 +00:00
Craig Topper 89e8607755 Add mayLoad, mayStore, and hasSideEffects tags to BT/BTS/BTR/BTC instructions. Shouldn't change any functionality since they don't have patterns to select them.
llvm-svn: 171128
2012-12-27 02:01:33 +00:00
Craig Topper c557343956 Fix operands and encoding form for ARPL instruction. Register form had and reversed. Memory form writes memory, but was marked as MRMSrcMem.
llvm-svn: 171123
2012-12-26 23:27:57 +00:00
Craig Topper d47a70de9f Add hasSideEffects=0 to some atomic instructions.
llvm-svn: 171122
2012-12-26 23:08:12 +00:00
Craig Topper af2372087b Mark the AL/AX/EAX forms of the basic arithmetic operations has never having side effects.
llvm-svn: 171121
2012-12-26 22:19:23 +00:00
Craig Topper 1b8c0750ee Mark all the _REV instructions as not having side effects. They aren't really emitted by the backend, but it reduces the number of instructions in the output files with unmodelled side effects to make auditing easier.
llvm-svn: 171118
2012-12-26 21:30:22 +00:00
Craig Topper 18f2675e9b Remove a special conditional setting of neverHasSideEffects if the instruction didn't have a pattern. This was leftover from when tablegen used to complain if things were already inferred from patterns.
llvm-svn: 171117
2012-12-26 21:04:30 +00:00
Craig Topper 24f316e4db Merge still more SSE/AVX instruction definitions.
llvm-svn: 171103
2012-12-26 07:54:43 +00:00
Craig Topper af629e2700 Merge more SSE/AVX instruction definitions.
llvm-svn: 171102
2012-12-26 07:20:35 +00:00
Craig Topper 65fe30450d Fix 80 column violation.
llvm-svn: 171097
2012-12-26 06:15:53 +00:00
Craig Topper f4d0fe8fcd Fix class name in comment.
llvm-svn: 171096
2012-12-26 06:15:09 +00:00
Craig Topper 59747c4dbd Merge SSE/AVX PCMPEQ/PCMPGT instruction definitions.
llvm-svn: 171095
2012-12-26 06:14:15 +00:00
Craig Topper 8a48677586 Remove 'v' from mnemonic to fix asm matching failures.
llvm-svn: 171093
2012-12-26 06:02:15 +00:00
Craig Topper b4ef0fa3a1 Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for a bunch of SSE2 integer arithmetic instructions.
llvm-svn: 171092
2012-12-26 05:49:15 +00:00
Nadav Rotem 5267bb71b8 Reformat the docs.
llvm-svn: 171091
2012-12-26 04:59:20 +00:00
Craig Topper a2594dd5f0 Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for PAND/POR/PXOR/PANDN
llvm-svn: 171087
2012-12-26 04:36:03 +00:00
Craig Topper 97730a0d6a Merge an AVX/SSE 256-bit and 128-bit multiclass.
llvm-svn: 171086
2012-12-26 03:56:47 +00:00
Craig Topper 8b59746390 Mark VANDNPD/VANDNPDS as not commutable.
llvm-svn: 171085
2012-12-26 03:48:10 +00:00
Craig Topper 81d1e596bb Remove alignment from a bunch more VEX encoded operations in the folding tables.
llvm-svn: 171082
2012-12-26 02:44:47 +00:00
Craig Topper b2922164f0 Remove alignment from folding table for VMOVUPD as an unaligned instruction it shouldn't require alignment...
llvm-svn: 171081
2012-12-26 02:14:19 +00:00
Craig Topper d09a9af9b6 Remove alignment requirements from (V)EXTRACTPS. This instruction does 32-bit stores which aren't required to be aligned on SSE or AVX.
llvm-svn: 171080
2012-12-26 01:47:12 +00:00
Craig Topper caef1c5d86 Remove alignment requirement from VCVTSS2SD in folding tables. Reverting r171049. This instruction doesn't require alignment.
llvm-svn: 171078
2012-12-26 00:35:47 +00:00
Hal Finkel 1b5ff08d43 Expand PPC64 atomic load and store
Use of store or load with the atomic specifier on 64-bit types would
cause instruction-selection failures. As with the 32-bit case, these
can use the default expansion in terms of cmp-and-swap.

llvm-svn: 171072
2012-12-25 17:22:53 +00:00
Benjamin Kramer 81b5a8fd2e X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity.
llvm-svn: 171064
2012-12-25 13:09:08 +00:00
Benjamin Kramer df4af41b9b X86: Custom lower <2 x i64> eq and ne when SSE41 is not available.
pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack.
Small speedup on loop-vectorized viterbi (-march=core2).

llvm-svn: 171063
2012-12-25 12:54:19 +00:00
Nadav Rotem 00410ae625 VCVTSS2SD requires a strict alignment. Thanks Elena.
llvm-svn: 171049
2012-12-25 03:29:18 +00:00
Nick Lewycky 521e0d59f3 Quiet gcc's -Wparenthesis warning. No functionality change.
llvm-svn: 171044
2012-12-24 19:58:45 +00:00
Benjamin Kramer 9d46110ff1 Use a std::string rather than a dynamically allocated char* buffer.
This affords us to use std::string's allocation routines and use the destructor
for the memory management. Switching to that also means that we can use
operator==(const std::string&, const char *) to perform the string comparison
rather than resorting to libc functionality (i.e. strcmp).

Patch by Saleem Abdulrasool!

Differential Revision: http://llvm-reviews.chandlerc.com/D230

llvm-svn: 171042
2012-12-24 19:23:30 +00:00
Nadav Rotem 3ee6b10dd4 CostModel: We have API for checking the costs of known shuffles. This patch adds
support for the insert-subvector and extract-subvector kinds.

llvm-svn: 171027
2012-12-24 10:04:03 +00:00
Nadav Rotem dc0ad92b64 Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned.
When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding
tables and removes the alignment restrictions from VEX-encoded instructions.

llvm-svn: 171024
2012-12-24 09:40:33 +00:00
Nadav Rotem 7e1599e100 Change the codegen Cost Model API for shuffeles. This patch removes the API for broadcast and adds a more general API that accepts an enum of known shuffles.
llvm-svn: 171022
2012-12-24 08:57:47 +00:00
Nadav Rotem cf9999d9d5 CostModel: Change the default target-independent implementation for finding
the cost of arithmetic functions. We now assume that the cost of arithmetic
operations that are marked as Legal or Promote is low, but ops that are
marked as custom are higher.

llvm-svn: 171002
2012-12-23 17:31:23 +00:00
Nadav Rotem b15c69a725 whitespace
llvm-svn: 170997
2012-12-23 07:33:44 +00:00
Nadav Rotem 1bef5a0509 Rename a function.
llvm-svn: 170996
2012-12-23 07:30:09 +00:00
Nadav Rotem 2cade68025 Loop Vectorizer: Update the cost model of scatter/gather operations and make
them more expensive.

llvm-svn: 170995
2012-12-23 07:23:55 +00:00
Benjamin Kramer 76268ac682 X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available.
pmuludq is slow, but it turns out that all the unpacking and packing of the
scalarized mul is even slower. 10% speedup on loop-vectorized paq8p.

llvm-svn: 170985
2012-12-22 16:07:56 +00:00
Benjamin Kramer b2f0a2bd4b X86: Emit vector sext as shuffle + sra if vpmovsx is not available.
Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better
than scalarized loads. Fixes PR14590.

llvm-svn: 170984
2012-12-22 11:34:28 +00:00
Nadav Rotem d5aae980cb In some cases, due to scheduling constraints we copy the EFLAGS.
The only way to read the eflags is using push and pop. If we don't
adjust the stack then we run over the first frame index. This is
not something that we want to do, so we have to make sure that
our machine function does not copy the flags. If it does then
we have to emit the prolog that adjusts the stack.

rdar://12896831

llvm-svn: 170961
2012-12-21 23:48:49 +00:00
Akira Hatanaka 6ac2fc4976 [mips] Refactor subword-swap, EXT/INS, load-effective-address and read-hardware
instructions.

llvm-svn: 170956
2012-12-21 23:21:32 +00:00
Akira Hatanaka beea8a34c3 [mips] Refactor SYNC and multiply/divide instructions.
llvm-svn: 170955
2012-12-21 23:17:36 +00:00
Akira Hatanaka 31ddec5887 [mips] Refactor BAL instructions.
llvm-svn: 170954
2012-12-21 23:15:59 +00:00
Akira Hatanaka d6b694f036 [mips] Fix encoding of BAL instruction. Also, fix assembler test case which
was not catching the error.

llvm-svn: 170953
2012-12-21 23:13:59 +00:00
Akira Hatanaka a158042a56 [mips] Refactor jump, jump register, jump-and-link and nop instructions.
llvm-svn: 170952
2012-12-21 23:03:50 +00:00
Akira Hatanaka e1826d7464 [mips] Refactor load/store left/right and load-link and store-conditional
instructions.

llvm-svn: 170950
2012-12-21 23:01:24 +00:00
Akira Hatanaka d9bf8424e5 [mips] Refactor load/store instructions.
llvm-svn: 170948
2012-12-21 22:58:55 +00:00
Akira Hatanaka b59b047fbe [mips] Remove unnecessary isPseudo parameter.
llvm-svn: 170947
2012-12-21 22:57:26 +00:00
Akira Hatanaka e738efc95b [mips] Refactor LUI instruction.
llvm-svn: 170944
2012-12-21 22:46:07 +00:00
Akira Hatanaka 895e1cb2aa [mips] Refactor count leading zero or one instructions.
llvm-svn: 170942
2012-12-21 22:43:58 +00:00
Akira Hatanaka 4f4c4aa05e [mips] Refactor sign-extension-in-register instructions.
llvm-svn: 170940
2012-12-21 22:41:52 +00:00
Akira Hatanaka b14c6e4e5f [mips] Refactor instructions which copy from and to HI/LO registers.
llvm-svn: 170939
2012-12-21 22:39:17 +00:00
Akira Hatanaka 9e89195dce [mips] Refactor logical NOR instructions.
llvm-svn: 170937
2012-12-21 22:35:47 +00:00
Akira Hatanaka ac10697207 [mips] Move instruction definitions in MipsInstrInfo.td.
llvm-svn: 170936
2012-12-21 22:33:43 +00:00
Tom Stellard 09ef8425e9 R600: Coding style - remove empty spaces from the beginning of functions
No functionality change.

llvm-svn: 170923
2012-12-21 20:12:02 +00:00
Tom Stellard 41398026e7 R600: Fix MAX_UINT definition
Patch by: Vadim Girlin

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 170922
2012-12-21 20:12:01 +00:00
Tom Stellard 4fa7ac29f1 R600: Add SHADOWCUBE to TEX_SHADOW pattern
Patch by: Vadim Girlin

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 170921
2012-12-21 20:11:59 +00:00
Benjamin Kramer 5521b94b07 Cleanup compiler warnings on discarding type qualifiers in casts. Switch to C++ style casts.
Patch by Saleem Abdulrasool!

Differential Revision: http://llvm-reviews.chandlerc.com/D204

llvm-svn: 170917
2012-12-21 19:09:53 +00:00
Benjamin Kramer 82d1c371e2 X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization.
Part of PR14667.

llvm-svn: 170908
2012-12-21 17:46:58 +00:00
Roman Divacky a229186a82 Remove duplicate includes.
llvm-svn: 170902
2012-12-21 17:06:44 +00:00
Tom Stellard a8b0351720 R600: Expand vec4 INT <-> FP conversions
llvm-svn: 170901
2012-12-21 16:33:24 +00:00
Benjamin Kramer 4669d18893 X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics
This is very mechanical, no functionality change. Preparation for PR14667.

llvm-svn: 170898
2012-12-21 14:04:55 +00:00
Nadav Rotem eacbb731d3 Add a missing "virtual" keyword.
llvm-svn: 170842
2012-12-21 05:02:12 +00:00
Quentin Colombet b1b66e7a25 Add ARM cortex-r5 subtarget.
llvm-svn: 170840
2012-12-21 04:35:05 +00:00
Nadav Rotem 6d4fdd6d2c Improve the X86 cost model for loads and stores.
llvm-svn: 170830
2012-12-21 01:33:59 +00:00
Nadav Rotem a4b53f20a3 BB-Vectorizer: Check the cost of the store pointer type
and not the return type, which is void. A number of test
cases fail after adding the assertion in TTImpl.

llvm-svn: 170828
2012-12-21 01:24:36 +00:00
Reed Kotler 9bff1ead0e Call llvm_unreachable instead of assert.
llvm-svn: 170822
2012-12-21 00:44:59 +00:00
Jakob Stoklund Olesen 33f5d1492d Add an MF argument to MI::copyImplicitOps().
This function is often used to decorate dangling instructions, so a
context reference is required to allocate memory for the operands.

Also add a corresponding MachineInstrBuilder method.

llvm-svn: 170797
2012-12-20 22:54:02 +00:00
Jakob Stoklund Olesen 2ea203694d MachineInstrBuilderize ARM.
llvm-svn: 170795
2012-12-20 22:53:55 +00:00
Jakob Stoklund Olesen 4255c96aed MachineInstrBuilderize NVPTX.
llvm-svn: 170794
2012-12-20 22:53:53 +00:00
Bob Wilson 7bba4f8957 Revert "Adding support for llvm.arm.neon.vaddl[su].* and"
This reverts r170694.  The operations can be represented in IR without
adding any new intrinsics.

llvm-svn: 170765
2012-12-20 21:09:38 +00:00
Evan Cheng ddc0cb6dc5 On some ARM cpus, flags setting movs with shifter operand, i.e. lsl, lsr, asr,
are more expensive than the non-flag setting variant. Teach thumb2 size
reduction pass to avoid generating them unless we are optimizing for size.

rdar://12892707

llvm-svn: 170728
2012-12-20 19:59:30 +00:00
Roman Divacky ff95a1dc12 Remove MCTargetAsmLexer and its derived classes now that edis,
its only user, is gone.

llvm-svn: 170699
2012-12-20 14:43:30 +00:00
Renato Golin 6b2ea4a48f Adding support for llvm.arm.neon.vaddl[su].* and
llvm.arm.neon.vsub[su].* intrinsics.

Patch by Pete Couperus <pjcoup@gmail.com>

llvm-svn: 170694
2012-12-20 13:52:11 +00:00
Reed Kotler d11acc7dc0 Implement cfi_def_cfa_offset. "Make check" test case for this comming in the
next few days but it's already tested a lot from test-suite and works fine.
This patch completes almost 100% pass of test-suite for mips 16.

llvm-svn: 170674
2012-12-20 06:59:37 +00:00
Reed Kotler 8965d24a2a There is one more patch to finish large frames. Make sure we assert
on code that has large frames which will not yet compile correctly.

llvm-svn: 170673
2012-12-20 06:57:00 +00:00
Jyotsna Verma 56605448f2 Add constant extender support to GP-relative load/store instructions.
llvm-svn: 170672
2012-12-20 06:52:46 +00:00
Jyotsna Verma bf75aaf53e Add TSFlags to ALU32 type instructions for constant-extender/Relationship maps.
llvm-svn: 170671
2012-12-20 06:45:39 +00:00
Reed Kotler 7bff8f1d7a set register class properly for mips16 here
llvm-svn: 170669
2012-12-20 06:06:35 +00:00
Rafael Espindola fb8ac2df09 Undefine PPC harder.
This was causing a build failure while trying to build on ppc ubuntu 12.10 with
cmake.

llvm-svn: 170668
2012-12-20 05:13:09 +00:00
Reed Kotler 92fc33bc97 This assert is overly restrictive and does not work for mips16.
llvm-svn: 170667
2012-12-20 05:09:15 +00:00
Reed Kotler fd633229f7 Turn on register scavenger for Mips 16
We use an unused Mips 32 register for the emergency slot
instead of using the stack.

llvm-svn: 170665
2012-12-20 04:44:58 +00:00
Akira Hatanaka e7f1acc7c0 [mips] Refactor SLT (set on less than) instructions. Separate encoding
information from the rest. 

llvm-svn: 170664
2012-12-20 04:27:52 +00:00
Akira Hatanaka bbd197e9c4 [mips] Refactor unconditional branch instruction. Separate encoding information
from the rest. 

llvm-svn: 170663
2012-12-20 04:22:39 +00:00
Akira Hatanaka b1527b7505 [mips] Remove asm string parameter from pseudo instructions. Add InstrItinClass
parameter.

llvm-svn: 170661
2012-12-20 04:20:09 +00:00
Akira Hatanaka 14f9ce0f83 [mips] Delete definition of CPRESTORE instruction.
llvm-svn: 170660
2012-12-20 04:15:30 +00:00
Akira Hatanaka c0ea0bb99b [mips] Refactor conditional branch instructions with one register operand.
Separate encoding information from the rest.

llvm-svn: 170659
2012-12-20 04:13:23 +00:00
Akira Hatanaka f71ffd29d9 [mips] Refactor conditional branch instructions with two register operands.
Separate encoding information from the rest.

llvm-svn: 170657
2012-12-20 04:10:13 +00:00
Reed Kotler d019dbf75e fix most of remaining issues with large frames.
these patches are tested a lot by test-suite but
make check tests are forthcoming once the next
few patches that complete this are committed.
with the next few patches the pass rate for mips16 is
near 100%

llvm-svn: 170656
2012-12-20 04:07:42 +00:00
Akira Hatanaka f423672117 [mips] Use "or $r0, $r1, $zero" instead of "addu $r0, $zero, $r1" to copy
physical register $r1 to $r0.

GNU disassembler recognizes an "or" instruction as a "move", and this change
makes the disassembled code easier to read.

Original patch by Reed Kotler.

llvm-svn: 170655
2012-12-20 04:06:06 +00:00
Richard Smith 15b1e3727b Fix use-before-construction of X86TargetLowering.
llvm-svn: 170654
2012-12-20 04:04:17 +00:00
Akira Hatanaka 7d75f9e3d3 [mips] Change the order of template parameters. Move the default parameters to
the end. 

llvm-svn: 170651
2012-12-20 03:52:08 +00:00
Akira Hatanaka 244f9e874c [mips] Refactor shift instructions with register operands. Separate encoding
information from the rest.

llvm-svn: 170650
2012-12-20 03:48:24 +00:00
Akira Hatanaka 7f96ad325f [mips] Refactor shift immediate instructions. Separate encoding information
from the rest.

llvm-svn: 170649
2012-12-20 03:44:41 +00:00
Akira Hatanaka ab1b715bf2 [mips] Refactor arithmetic and logic instructions with immediate operands.
Separate encoding information from the rest.

llvm-svn: 170648
2012-12-20 03:40:03 +00:00
Akira Hatanaka 1b37c4af01 [mips] Refactor arithmetic and logic instructions. Separate encoding
information from the rest.

llvm-svn: 170647
2012-12-20 03:34:05 +00:00
Akira Hatanaka 73495897b1 [mips] Delete ArithOverflowR and ArithOverflow and use ArithLogicR and
ArithLogicI as the instruction base classes.

llvm-svn: 170642
2012-12-20 03:00:16 +00:00
NAKAMURA Takumi 2a0b40f584 Target/R600: Update MIB according to r170588.
llvm-svn: 170620
2012-12-20 00:22:11 +00:00
Jim Grosbach 6df94846ec MC: Add MCInstrDesc::mayAffectControlFlow() method.
MC disassembler clients (LLDB) are interested in querying if an
instruction may affect control flow other than by virtue of being
an explicit branch instruction. For example, instructions which
write directly to the PC on some architectures.

llvm-svn: 170610
2012-12-19 23:38:53 +00:00
Tom Stellard 1c315d5411 R600: Remove unecessary VREG alignment.
Unlike SGPRs VGPRs doesn't need to be aligned.

Patch by: Christian König

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 170593
2012-12-19 22:10:34 +00:00
Tom Stellard e7b907d85c R600: control flow optimization
Branch if we have enough instructions so that it makes sense.
Also remove branches if they don't make sense.

Patch by: Christian König

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 170592
2012-12-19 22:10:33 +00:00
Tom Stellard f8794354b2 R600: New control flow for SI v2
This patch replaces the control flow handling with a new
pass which structurize the graph before transforming it to
machine instruction. This has a couple of different advantages
and currently fixes 20 piglit tests without a single regression.

It is now a general purpose transformation that could be not
only be used for SI/R6xx, but also for other hardware
implementations that use a form of structurized control flow.

v2: further cleanup, fixes and documentation

Patch by: Christian König

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 170591
2012-12-19 22:10:31 +00:00
Jakob Stoklund Olesen b159b5ff0d Remove the explicit MachineInstrBuilder(MI) constructor.
Use the version that also takes an MF reference instead.

It would technically be possible to extract an MF reference from the MI
as MI->getParent()->getParent(), but that would not work for MIs that
are not inserted into any basic block.

Given the reasonably small number of places this constructor was used at
all, I preferred the compile time check to a run time assertion.

llvm-svn: 170588
2012-12-19 21:31:56 +00:00
Evan Cheng eae6d2ccea LLVM sdisel normalize bit extraction of the form:
((x & 0xff00) >> 8) << 2
to
 (x >> 6) & 0x3fc

This is general goodness since it folds a left shift into the mask. However,
the trailing zeros in the mask prevents the ARM backend from using the bit
extraction instructions. And worse since the mask materialization may require
an addition instruction. This comes up fairly frequently when the result of 
the bit twiddling is used as memory address. e.g.

 = ptr[(x & 0xFF0000) >> 16]

We want to generate:
  ubfx   r3, r1, #16, #8
  ldr.w  r3, [r0, r3, lsl #2]

vs.
  mov.w  r9, #1020
  and.w  r2, r9, r1, lsr #14
  ldr    r2, [r0, r2]

Add a late ARM specific isel optimization to
ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the
'base + offset' address computation; change the mask to one which doesn't have
trailing zeros and enable the use of ubfx.

Note the optimization has to be done late since it's target specific and we
don't want to change the DAG normalization. It's also fairly restrictive
as shifter operands are not always free. It's only done for lsh 1 / 2. It's
known to be free on some cpus and they are most common for address
computation.

This is a slight win for blowfish, rijndael, etc.

rdar://12870177

llvm-svn: 170581
2012-12-19 20:16:09 +00:00
Roman Divacky e3d323052f Remove edis - the enhanced disassembler. Fixes PR14654.
llvm-svn: 170578
2012-12-19 19:55:47 +00:00
Paul Redmond 5917f4c715 Transform (x&C)>V into (x&C)!=0 where possible
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.

Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.

Patch by Kevin Schoedel

llvm-svn: 170576
2012-12-19 19:47:13 +00:00
Benjamin Kramer c5071466d4 PowerPC: Expand VSELECT nodes.
There's probably a better expansion for those nodes than the default for
altivec, but this is better than crashing. VSELECTs occur in loop vectorizer
output.

llvm-svn: 170551
2012-12-19 15:49:14 +00:00
Patrik Hagglund e09cac9a67 Change TargetLowering::getTypeForExtArgOrReturn to take and return
MVTs, instead of EVTs.

llvm-svn: 170537
2012-12-19 12:02:25 +00:00
Patrik Hagglund bad545ccba Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of
EVTs.

llvm-svn: 170535
2012-12-19 11:48:16 +00:00
Patrik Hagglund f9eb168ef4 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.

llvm-svn: 170532
2012-12-19 11:30:36 +00:00
NAKAMURA Takumi 89209462fe X86ISelLowering.cpp: Fix warnings. [-Wlogical-op-parentheses]
llvm-svn: 170523
2012-12-19 10:12:48 +00:00
Elena Demikhovsky 14a4af0e66 Optimized load + SIGN_EXTEND patterns in the X86 backend.
llvm-svn: 170506
2012-12-19 07:50:20 +00:00
Bill Wendling 3d7b0b8ac7 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
llvm-svn: 170502
2012-12-19 07:18:57 +00:00
Reed Kotler 3aad762d1d Add some missing Defs and Uses.
llvm-svn: 170493
2012-12-19 04:06:15 +00:00
Jakub Staszak 338863a546 Reverse order of checking SSE level when calculating compare cost, so we check
AVX2 before AVX.

llvm-svn: 170464
2012-12-18 22:57:56 +00:00
Quentin Colombet 23b404d5ad Disable ARM partial flag dependency optimization at -Oz
To not over constrain the scheduler for ARM in thumb mode, some optimizations  for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency).

Disables this check when code size is the priority, i.e., code is compiled with -Oz.

llvm-svn: 170462
2012-12-18 22:47:16 +00:00
Eli Bendersky 39e7c6e370 Get rid of the pesky -Woverloaded-virtual warning. No change in functionality.
llvm-svn: 170438
2012-12-18 18:21:29 +00:00
Jakob Stoklund Olesen 41bbf9c256 Repair bundles that were broken by removing and reinserting the first
instruction.

This isn't strictly necessary at the moment because Thumb2SizeReduction
also copies all MI flags from the old instruction to the new. However, a
future patch will make that kind of direct flag tampering illegal.

llvm-svn: 170395
2012-12-18 00:46:39 +00:00
Jakob Stoklund Olesen 43b1e13386 Extract a method, no functional change intended.
Sadly, this costs us a perfectly good opportunity to use 'goto'.

llvm-svn: 170385
2012-12-18 00:13:11 +00:00
Chad Rosier 150d35bc1d [arm fast-isel] Minor cleanup. No functional change intended.
llvm-svn: 170379
2012-12-17 22:35:29 +00:00
Chad Rosier 62a144f099 [arm fast-isel] Fast-isel only handles simple VTs, so make sure the necessary
checks are in place.  Some minor cleanup as well.

llvm-svn: 170360
2012-12-17 19:59:43 +00:00
Richard Osborne 459e35c261 Add instruction encodings / disassembly support for l2r instructions.
llvm-svn: 170345
2012-12-17 16:28:02 +00:00
Tom Stellard 5a6879466a R600: enable S_*N2_* instructions
They seem to work fine.

Patch by: Christian König

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 170343
2012-12-17 15:14:56 +00:00
Tom Stellard 9e90b5895d R600: BB operand support for SI
Patch by: Christian König

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 170342
2012-12-17 15:14:54 +00:00
Tom Stellard 16a17c6d3e R600: remove nonsense setPrefLoopAlignment
The Align parameter is a power of two, so 16 results in 64K
alignment. Additional to that even 16 byte alignment doesn't
make any sense, so just remove it.

Patch by: Christian König

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
llvm-svn: 170341
2012-12-17 15:14:53 +00:00
Patrik Hagglund c494d24a68 Revert/correct some FastISel changes in r170104 (EVT->MVT for
TargetLowering::getRegClassFor).

Some isSimple() guards were missing, or getSimpleVT() were hoisted too
far, resulting in asserts on valid LLVM assembly input.

llvm-svn: 170336
2012-12-17 14:30:06 +00:00
Richard Osborne 51bf1b269a Add instruction encodings for PEEK and ENDIN.
Previously these were marked with the wrong format.

llvm-svn: 170334
2012-12-17 14:23:54 +00:00
Richard Osborne c104bf2769 Fix parameter name in prototypes in XCoreDisassembler.
llvm-svn: 170332
2012-12-17 13:55:49 +00:00
Richard Osborne 041071c558 Add instruction encodings / disassembly support for rus instructions.
llvm-svn: 170330
2012-12-17 13:50:04 +00:00
Richard Osborne e405e58639 Add instruction encodings for ZEXT and SEXT.
Previously these were marked with the wrong format.

llvm-svn: 170327
2012-12-17 13:20:37 +00:00
Richard Osborne 3a0d5cc314 Add instruction encodings / disassembly support for 2r instructions.
llvm-svn: 170323
2012-12-17 12:29:31 +00:00
Richard Osborne 016967e4ff Add instruction encodings / disassembly support for 0r instructions.
llvm-svn: 170322
2012-12-17 12:26:29 +00:00
Richard Osborne 1cc2b68ad6 Simplify assertion in XCoreInstPrinter.
llvm-svn: 170321
2012-12-17 12:13:46 +00:00
Richard Osborne 4e1e14bccd Update comments to match recommended doxygen style.
llvm-svn: 170320
2012-12-17 12:13:41 +00:00
Richard Osborne eb31fa483e Remove unnecessary include.
llvm-svn: 170319
2012-12-17 12:13:32 +00:00
Craig Topper 354ed773b8 Remove EFLAGS from the BLSI/BLSMSK/BLSR patterns. The nodes created by DAG combine don't contain an EFLAGS def.
llvm-svn: 170308
2012-12-17 06:13:48 +00:00
Craig Topper f3ff6ae066 Simplify BMI ANDN matching to use patterns instead of a DAG combine. Also add ANDN to isDefConvertible.
llvm-svn: 170305
2012-12-17 05:12:30 +00:00
Craig Topper f924a58af1 Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt.
llvm-svn: 170304
2012-12-17 05:02:29 +00:00
Craig Topper 5b08cf7736 Remove store forms of DEC/INC from isDefConvertible. Since they are stores they don't have a register def.
llvm-svn: 170303
2012-12-17 04:55:07 +00:00
Richard Osborne 1b5562ad8e Add instruction encodings and disassembly for 1r instructions.
llvm-svn: 170293
2012-12-16 17:37:34 +00:00
Richard Osborne e31735a52b Add XCore disassembler.
Currently there is no instruction encoding info and
XCoreDisassembler::getInstruction() always returns Fail. I intend to add
instruction encodings and tests in follow on commits.

llvm-svn: 170292
2012-12-16 17:29:14 +00:00
Richard Osborne 872f51e301 Remove invalid instruction encodings.
llvm-svn: 170291
2012-12-16 16:46:31 +00:00
Richard Osborne e298556706 Mark anything deriving from PseudoInstXCore as a pseudo instruction.
llvm-svn: 170290
2012-12-16 16:46:28 +00:00
Richard Osborne f12cb9ef27 Set instruction size correctly in XCoreInstrFormats.td
llvm-svn: 170289
2012-12-16 16:46:24 +00:00
Richard Osborne 3c31e21837 Change XCoreAsmPrinter to lower MachineInstrs to MCInsts before emission.
This change adds XCoreMCInstLower to do the lowering to MCInst and
XCoreInstPrinter to print the MCInsts.

llvm-svn: 170288
2012-12-16 16:20:48 +00:00
Richard Osborne b1de9f7e07 Replace ${:comment} with the comment symbol.
llvm-svn: 170286
2012-12-16 15:59:02 +00:00
Reed Kotler aee4d5d194 This patch is needed to make c++ exceptions work for mips16.
Mips16 is really a processor decoding mode (ala thumb 1) and in the same
program, mips16 and mips32 functions can exist and can call each other.

If a jal type instruction encounters an address with the lower bit set, then
the processor switches to mips16 mode (if it is not already in it). If the
lower bit is not set, then it switches to mips32 mode.

The linker knows which functions are mips16 and which are mips32.
When relocation is performed on code labels, this lower order bit is
set if the code label is a mips16 code label.

In general this works just fine, however when creating exception handling
tables and dwarf, there are cases where you don't want this lower order
bit added in.

This has been traditionally distinguished in gas assembly source by using a
different syntax for the label.

lab1:      ; this will cause the lower order bit to be added
lab2=.     ; this will not cause the lower order bit to be added

In some cases, it does not matter because in dwarf and debug tables
the difference of two labels is used and in that case the lower order
bits subtract each other out.

To fix this, I have added to mcstreamer the notion of a debuglabel.
The default is for label and debug label to be the same. So calling
EmitLabel and EmitDebugLabel produce the same result.

For various reasons, there is only one set of labels that needs to be
modified for the mips exceptions to work. These are the "$eh_func_beginXXX" 
labels.

Mips overrides the debug label suffix from ":" to "=." .

This initial patch fixes exceptions. More changes most likely
will be needed to DwarfCFException to make all of this work
for actual debugging. These changes will be to emit debug labels in some
places where a simple label is emitted now.

Some historical discussion on this from gcc can be found at:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html
http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html 

llvm-svn: 170279
2012-12-16 04:00:45 +00:00
Benjamin Kramer b16ccde7a4 X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.

llvm-svn: 170273
2012-12-15 16:47:44 +00:00
Chandler Carruth 7a28f95419 Make '-mtune=x86_64' assume fast unaligned memory accesses.
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.

Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs. Reviewed by Dan Gohman.

llvm-svn: 170269
2012-12-15 09:01:13 +00:00
Reed Kotler 5fdeb21249 This code implements most of mips16 hardfloat as it is done by gcc.
In this case, essentially it is soft float with different library routines.
The next step will be to make this fully interoperational with mips32 floating
point and that requires creating stubs for functions with signatures that
contain floating point types.

I have a more sophisticated design for mips16 hardfloat which I hope to
implement at a later time that directly does floating point without the need
for function calls.

The mips16 encoding has no floating point instructions so one needs to
switch to mips32 mode to execute floating point instructions.

llvm-svn: 170259
2012-12-15 00:20:05 +00:00
Kevin Enderby 06aa3eb8ce Make sure the alternate PC+imm syntax of LDR instruction with a small
immediate generates the narrow version.  Needed when doing round-trip
assemble/disassemble testing using the alternate syntax that specifies
'pc' directly.

llvm-svn: 170255
2012-12-14 23:04:25 +00:00
Nadav Rotem 8487537bdb TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them.
llvm-svn: 170245
2012-12-14 21:20:37 +00:00
Bill Schmidt a4f898448c This patch removes some nondeterminism from direct object file output
for TLS dynamic models on 64-bit PowerPC ELF.  The default sort routine
for relocations only sorts on the r_offset field; but with TLS, there
can be two relocations with the same r_offset.  For PowerPC, this patch
sorts secondarily on descending r_type, which matches the behavior
expected by the linker.

llvm-svn: 170237
2012-12-14 20:28:38 +00:00
Bill Schmidt 9f0b4ec0f5 This patch improves the 64-bit PowerPC InitialExec TLS support by providing
for a wider range of GOT entries that can hold thread-relative offsets.
This matches the behavior of GCC, which was not documented in the PPC64 TLS
ABI.  The ABI will be updated with the new code sequence.

Former sequence:

  ld 9,x@got@tprel(2)
  add 9,9,x@tls

New sequence:

  addis 9,2,x@got@tprel@ha
  ld 9,x@got@tprel@l(9)
  add 9,9,x@tls

Note that a linker optimization exists to transform the new sequence into
the shorter sequence when appropriate, by replacing the addis with a nop
and modifying the base register and relocation type of the ld.

llvm-svn: 170209
2012-12-14 17:02:38 +00:00
Shuxin Yang 97e07bf211 Remove two popcount patterns which we are already able to recognize.
llvm-svn: 170158
2012-12-13 23:16:19 +00:00
Bill Schmidt 9ed4dbcb75 This is another cleanup patch for 64-bit PowerPC TLS processing. I had
some hackery in place that hid my poor use of TblGen, which I've now sorted
out and cleaned up.  No change in observable behavior, so no new test cases.

llvm-svn: 170149
2012-12-13 20:57:10 +00:00
Tom Stellard 6975d35979 Fix warnings with -DNDEBUG
Patch by: NAKAMURA Takumi

llvm-svn: 170142
2012-12-13 19:38:52 +00:00
Bill Schmidt 732eb91f05 This is just a clean-up patch that simplifies the initial-exec TLS logic by
avoiding use of machine operand flags.  No change in observable behavior, so
no new test cases.

llvm-svn: 170141
2012-12-13 18:45:54 +00:00
Patrik Hagglund 5e6c361bc0 Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.

This is the first, in a series of patches.

This is the second attempt. In the first attempt (r169837), a few
getSimpleVT() were hoisted too far, detected by bootstrap failures.

llvm-svn: 170104
2012-12-13 06:34:11 +00:00
Akira Hatanaka cf9a61b6ee [mips] Do not copy GOT address to register $gp if the function being called has
internal linkage.

llvm-svn: 170092
2012-12-13 03:17:29 +00:00
Eric Christopher 80882db88f Add a way of printing out an arbitrary label name for a section
given the section.

llvm-svn: 170087
2012-12-13 03:00:35 +00:00
Akira Hatanaka b2cc8a756f [mips] Delete all floating point instruction classes that are no longer used.
No functionality change.

llvm-svn: 170084
2012-12-13 02:05:02 +00:00
Akira Hatanaka 6262bbf819 [mips] Modify definitions of floating point conditional move instructions.
No functionality change.

llvm-svn: 170080
2012-12-13 01:41:15 +00:00
Akira Hatanaka 79e1cdb00b [mips] Modify definitions of floating point comparison instructions.
No functionality change.

llvm-svn: 170077
2012-12-13 01:34:09 +00:00
Akira Hatanaka fd9163b74c [mips] Modify definitions of floating point branch instructions.
No functionality change.

llvm-svn: 170076
2012-12-13 01:32:36 +00:00
Akira Hatanaka cd3dfd238e [mips] Modify definitions of floating point indexed load and store instructions.
No functionality change.

llvm-svn: 170075
2012-12-13 01:30:49 +00:00
Akira Hatanaka b0d4acbc65 [mips] Modify definitions of floating point multiply-add/sub instructions.
No functionality change.

llvm-svn: 170073
2012-12-13 01:27:48 +00:00
Akira Hatanaka 92994f4846 [mips] Modify definitions of floating point load and store instructions.
No functionality change.

llvm-svn: 170072
2012-12-13 01:24:00 +00:00
Akira Hatanaka 2b75dde5fa [mips] Modify definitions of move from/to coprocessor instructions.
No functionality change.

llvm-svn: 170071
2012-12-13 01:16:49 +00:00
Akira Hatanaka dea8f61ae0 [mips] Modify definitions of two register operand floating point instructions.
No functionality change.

llvm-svn: 170069
2012-12-13 01:14:07 +00:00
Akira Hatanaka 29b513871a [mips] Modify definitions of three register operand floating point instructions
and separate encoding information from the rest.

llvm-svn: 170066
2012-12-13 01:07:37 +00:00
Jakob Stoklund Olesen 436eea9833 Avoid setIsInsideBundle in Target/R600.
This function is going to be removed.

llvm-svn: 170064
2012-12-13 00:59:38 +00:00
Akira Hatanaka 84693d5606 [mips] Move classes that do not belong in MipsInstrFormats.td into
MipsInstrFPU.td.
 

llvm-svn: 170061
2012-12-13 00:49:23 +00:00
Akira Hatanaka db49b39200 [mips] Set isCommutable flag in a more explicit way.
llvm-svn: 170060
2012-12-13 00:46:23 +00:00
Akira Hatanaka 193e1f738a [mips] Remove fmt from the parameter list of classes FMADDSUB and FNMADDSUB.
llvm-svn: 170057
2012-12-13 00:38:59 +00:00
Akira Hatanaka caaf4dd516 [mips] Remove single-precision floating point instruction from multiclass
FFR2P_M.
 

llvm-svn: 170055
2012-12-13 00:35:54 +00:00
Akira Hatanaka 02ec5516f8 [mips] Move class IsCommutable into MipsInstrInfo.td.
llvm-svn: 170054
2012-12-13 00:32:01 +00:00
Akira Hatanaka e986a59ad9 [mips] Remove single-precision floating point instructions from multiclasses
FFR1_W_M and FFR1P_M. The new instruction definitions have one-to-one
correspondence with the instructions in the ISA manual.
 

llvm-svn: 170053
2012-12-13 00:29:29 +00:00
Eli Bendersky b2022f3a5a Fix a bogus comment
llvm-svn: 170052
2012-12-13 00:24:56 +00:00
Akira Hatanaka 7bc144c366 [mips] Fix a memory leak bug report by NAKAMURA Takumi.
llvm-svn: 170012
2012-12-12 20:09:58 +00:00
Bill Schmidt 24b8dd6eb7 This patch implements local-dynamic TLS model support for the 64-bit
PowerPC target.  This is the last of the four models, so we now have 
full TLS support.

This is mostly a straightforward extension of the general dynamic model.
I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the
register copy following ADDI_TLSLD_L; otherwise everything above the
ADDIS_DTPREL_HA appeared dead and was removed.

As before, there are new test cases to test the assembly generation, and
the relocations output during integrated assembly.  The expected code
gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll.

There are a couple of things I think can be done more efficiently in the
overall TLS code, so there will likely be a clean-up patch forthcoming;
but for now I want to be sure the functionality is in place.

Bill

llvm-svn: 170003
2012-12-12 19:29:35 +00:00
Logan Chien 4dd14fb5eb Add ARM NONE and PREL31 relocation types.
Add R_ARM_NONE and R_ARM_PREL31 relocation types
to MCExpr.  Both of them will be used while
generating .ARM.extab and .ARM.exidx sections.

llvm-svn: 169965
2012-12-12 07:14:46 +00:00
NAKAMURA Takumi 85292a1338 [CMake] Fixup R600.
llvm-svn: 169962
2012-12-12 03:34:26 +00:00
Evan Cheng 962711ee71 Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I
mention the inline memcpy / memset expansion code is a mess?

This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.

llvm-svn: 169959
2012-12-12 02:34:41 +00:00
Evan Cheng c3d1aca657 - Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term.
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.

llvm-svn: 169954
2012-12-12 01:32:07 +00:00
Evan Cheng 04e5518783 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.

llvm-svn: 169944
2012-12-12 00:42:09 +00:00
Jim Grosbach 647c702780 Trim unneeded header #include.
llvm-svn: 169933
2012-12-11 23:39:51 +00:00
Jim Grosbach 0ddedcc560 ARM: Remove old testing option.
Pre-regalloc frame allocation and referencing has been on by default
for ages. No need for the testing option that disables it.

llvm-svn: 169931
2012-12-11 23:31:12 +00:00
Jim Grosbach 1197889c44 ARM: Remove old testing options.
Base pointer referencing has been enabled for ages.

llvm-svn: 169930
2012-12-11 23:31:10 +00:00
Evan Cheng eb54240dc2 Replace TargetLowering::isIntImmLegal() with
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.

llvm-svn: 169929
2012-12-11 23:26:14 +00:00
Tom Stellard 75aadc2813 Add R600 backend
A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX

llvm-svn: 169915
2012-12-11 21:25:42 +00:00
Bill Schmidt c56f1d34bc This patch implements the general dynamic TLS model for 64-bit PowerPC.
Given a thread-local symbol x with global-dynamic access, the generated
code to obtain x's address is:

     Instruction                            Relocation            Symbol
  addis ra,r2,x@got@tlsgd@ha           R_PPC64_GOT_TLSGD16_HA       x
  addi  r3,ra,x@got@tlsgd@l            R_PPC64_GOT_TLSGD16_L        x
  bl __tls_get_addr(x@tlsgd)           R_PPC64_TLSGD                x
                                       R_PPC64_REL24           __tls_get_addr
  nop
  <use address in r3>

The implementation borrows from the medium code model work for introducing
special forms of ADDIS and ADDI into the DAG representation.  This is made
slightly more complicated by having to introduce a call to the external
function __tls_get_addr.  Using the full call machinery is overkill and,
more importantly, makes it difficult to add a special relocation.  So I've
introduced another opcode GET_TLS_ADDR to represent the function call, and
surrounded it with register copies to set up the parameter and return value.

Most of the code is pretty straightforward.  I ran into one peculiarity
when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like
BL8_NOP_ELF except that it takes another parameter to represent the symbol
("x" above) that requires a relocation on the call.  Something in the 
TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated
identically during the emit phase, so this second operand was never
visited to generate relocations.  This is the reason for the slightly
messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding().

Two new tests are included to demonstrate correct external assembly and
correct generation of relocations using the integrated assembler.

Comments welcome!

Thanks,
Bill

llvm-svn: 169910
2012-12-11 20:30:11 +00:00
Patrik Hagglund e98b7a0389 Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
llvm-svn: 169854
2012-12-11 11:14:33 +00:00
Patrik Hagglund ad432a8e70 Change TargetLowering::getTypeForExtArgOrReturn to take and return
MVTs, instead of EVTs.

Accordingly, add bitsLT (and similar) to MVT.

llvm-svn: 169850
2012-12-11 10:20:51 +00:00
Patrik Hagglund 03e9628cfa Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of
EVTs.

llvm-svn: 169848
2012-12-11 10:09:23 +00:00
Patrik Hagglund 8d2e7cf561 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.

llvm-svn: 169845
2012-12-11 09:57:18 +00:00
Patrik Hagglund 3708e548f8 Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.

This is the first, in a series of patches.

llvm-svn: 169837
2012-12-11 09:10:33 +00:00
NAKAMURA Takumi 99feb75cb8 [CMake] Remove dependencies to intrinsics_gen I introduced in r169724.
llvm-svn: 169819
2012-12-11 05:53:54 +00:00
Jyotsna Verma 92e71918b1 Use multiclass for new-value store instructions with MEMri operand.
llvm-svn: 169814
2012-12-11 05:12:25 +00:00
Evan Cheng c2bd620fac Stylistic tweak.
llvm-svn: 169811
2012-12-11 02:31:57 +00:00
Chad Rosier df42cf39ab Fall back to the selection dag isel to select tail calls.
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles.  Testing with the external/internal nightly
test suite reveals no change in compile time performance.  Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures.  All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.

In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally.  While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.

Part of rdar://12553082

llvm-svn: 169796
2012-12-11 00:18:02 +00:00
Evan Cheng 79e2ca90bc Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078

llvm-svn: 169791
2012-12-10 23:21:26 +00:00
Akira Hatanaka 5d6faed1f0 [mips] Set HWEncoding field of registers. Use delete function
getMipsRegisterNumbering and use MCRegisterInfo::getEncodingValue instead.

llvm-svn: 169760
2012-12-10 20:04:40 +00:00
Chandler Carruth 867c7bff9a Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses."
Accidental commit... git svn betrayed me. Sorry for the noise.

llvm-svn: 169741
2012-12-10 18:23:52 +00:00
Chandler Carruth 7eaa45c738 Make '-mtune=x86_64' assume fast unaligned memory accesses.
Summary:
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.

Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs.

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D195

llvm-svn: 169740
2012-12-10 18:22:42 +00:00
Chandler Carruth 17f25c4e0d Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A.
Thanks to the PaX folks for noticing in review! We need some tests here,
any sugestions welcome...

llvm-svn: 169739
2012-12-10 18:22:40 +00:00
Chandler Carruth 0f58558101 Address a FIXME and update the fast unaligned memory feature for newer
Intel chips.

The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.

No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.

llvm-svn: 169730
2012-12-10 09:18:44 +00:00
NAKAMURA Takumi 6b819c5fb1 [CMake] Update dependencies to intrinsics_gen corresponding to r169711.
llvm-svn: 169724
2012-12-10 05:27:15 +00:00
Paul Redmond 2adb13c100 LoopVectorize: support vectorizing intrinsic calls
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

Reviewed by: Nadav

llvm-svn: 169711
2012-12-09 20:42:17 +00:00
Shuxin Yang 95de7c37e2 - Re-enable population count loop idiom recognization
- fix a bug which cause sigfault.
- add two testing cases which was causing crash

llvm-svn: 169687
2012-12-09 03:12:46 +00:00
Chandler Carruth 91e47532fe Revert the patches adding a popcount loop idiom recognition pass.
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.

This reverts the commits:
  r169671: Fix a logic error.
  r169604: Move the popcnt tests to an X86 subdirectory.
  r168931: Initial commit adding the pass.

llvm-svn: 169683
2012-12-08 22:18:29 +00:00
Benjamin Kramer f242d8c31b Simplify code. Sort includes. No functionality change.
llvm-svn: 169676
2012-12-08 10:45:24 +00:00
Chandler Carruth 1d94e932bc Fix a use-after-free bug found by ASan. You can't assign a temporary
std::string to a StringRef. Moreover, the method being called accepts
a Twine to simplify these patterns.

Fixes this ASan failure:
==6312== ERROR: AddressSanitizer: heap-use-after-free on address 0x7fd558b1af58 at pc 0xcb7529 bp 0x7fffff572080 sp 0x7fffff572078
READ of size 1 at 0x7fd558b1af58 thread T0
    #0 0xcb7528 .../llvm/include/llvm/ADT/StringRef.h:192 llvm::StringRef::operator[]()
    #1 0x1d53c0a .../llvm/include/llvm/ADT/StringExtras.h:128 llvm::HashString()
    #2 0x1d53878 .../llvm/lib/Support/StringMap.cpp:64 llvm::StringMapImpl::LookupBucketFor()
    #3 0x1b6872f .../llvm/include/llvm/ADT/StringMap.h:352 llvm::StringMap<>::GetOrCreateValue<>()
    #4 0x1b61836 .../llvm/lib/MC/MCContext.cpp:109 llvm::MCContext::GetOrCreateSymbol()
    #5 0xe9fd47 .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:154 (anonymous namespace)::ARMELFStreamer::EmitMappingSymbol()
    #6 0xea01dd .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:133 (anonymous namespace)::ARMELFStreamer::EmitDataMappingSymbol()
    #7 0xe9f78b .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:91 (anonymous namespace)::ARMELFStreamer::EmitBytes()
    #8 0x1b15d82 .../llvm/lib/MC/MCStreamer.cpp:89 llvm::MCStreamer::EmitIntValue()
    #9 0xcc0f9b .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:713 llvm::ARMAsmPrinter::emitAttributes()
    #10 0xcc0d44 .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:632 llvm::ARMAsmPrinter::EmitStartOfAsmFile()
    #11 0x14692ad .../llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:162 llvm::AsmPrinter::doInitialization()
    #12 0x1bc4677 .../llvm/lib/VMCore/PassManager.cpp:1561 llvm::FPPassManager::doInitialization()
    #13 0x1bc4990 .../llvm/lib/VMCore/PassManager.cpp:1595 llvm::MPPassManager::runOnModule()
    #14 0x1bc55e5 .../llvm/lib/VMCore/PassManager.cpp:1705 llvm::PassManagerImpl::run()
    #15 0x1bc5878 .../llvm/lib/VMCore/PassManager.cpp:1740 llvm::PassManager::run()
    #16 0xc3954d .../llvm/tools/llc/llc.cpp:378 compileModule()
    #17 0xc38001 .../llvm/tools/llc/llc.cpp:194 main
    #18 0x7fd557d6a11c __libc_start_main
0x7fd558b1af58 is located 24 bytes inside of 29-byte region [0x7fd558b1af40,0x7fd558b1af5d)
freed by thread T0 here:
    #0 0xc337da .../llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:56 operator delete()
    #1 0x1ee9cef .../libstdc++-v3/include/bits/basic_string.h:535 std::string::~string()
    #2 0xea01dd .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:133 (anonymous namespace)::ARMELFStreamer::EmitDataMappingSymbol()
    #3 0xe9f78b .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:91 (anonymous namespace)::ARMELFStreamer::EmitBytes()
    #4 0x1b15d82 .../llvm/lib/MC/MCStreamer.cpp:89 llvm::MCStreamer::EmitIntValue()
    #5 0xcc0f9b .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:713 llvm::ARMAsmPrinter::emitAttributes()
    #6 0xcc0d44 .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:632 llvm::ARMAsmPrinter::EmitStartOfAsmFile()
    #7 0x14692ad .../llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:162 llvm::AsmPrinter::doInitialization()
    #8 0x1bc4677 .../llvm/lib/VMCore/PassManager.cpp:1561 llvm::FPPassManager::doInitialization()
    #9 0x1bc4990 .../llvm/lib/VMCore/PassManager.cpp:1595 llvm::MPPassManager::runOnModule()
    #10 0x1bc55e5 .../llvm/lib/VMCore/PassManager.cpp:1705 llvm::PassManagerImpl::run()
    #11 0x1bc5878 .../llvm/lib/VMCore/PassManager.cpp:1740 llvm::PassManager::run()
    #12 0xc3954d .../llvm/tools/llc/llc.cpp:378 compileModule()
    #13 0xc38001 .../llvm/tools/llc/llc.cpp:194 main
    #14 0x7fd557d6a11c __libc_start_main

llvm-svn: 169668
2012-12-08 03:10:14 +00:00
Bill Wendling e94d843e43 s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future.
llvm-svn: 169651
2012-12-07 23:16:57 +00:00
Nadav Rotem ad0b5fbe8c When we use the BLEND instruction that uses the MSB as a mask, we can remove
the VSRI instruction before it since it does not affect the MSB.

Thanks Craig Topper for suggesting this.

llvm-svn: 169638
2012-12-07 21:43:11 +00:00
Matthew Curtis 7a93811e8b In hexagon convertToHardwareLoop, don't deref end() iterator
In particular, check if MachineBasicBlock::iterator is end() before
using it to call getDebugLoc();

See also this thread on llvm-commits:
   http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html

llvm-svn: 169634
2012-12-07 21:03:15 +00:00
Nadav Rotem 481e50efe0 X86: Prefer using VPSHUFD over VPERMIL because it has better throughput.
llvm-svn: 169624
2012-12-07 19:01:13 +00:00
Tim Northover 5cc3dc86bb Added Mapping Symbols for ARM ELF
Before this patch, when you objdump an LLVM-compiled file, objdump tried to
decode data-in-code sections as if they were code.  This patch adds the missing
Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D).

Patch based on work by Greg Fitzgerald.

llvm-svn: 169609
2012-12-07 16:50:23 +00:00
Jakob Stoklund Olesen 97030e0c0e Use the new MIBundleBuilder class in the Mips target.
This is the preferred way of creating bundled machine instructions.

llvm-svn: 169585
2012-12-07 04:23:40 +00:00
Akira Hatanaka efdce0fb09 [mips] Delete nodes and instructions for dynamic alloca that are no longer in
use.

llvm-svn: 169580
2012-12-07 03:10:18 +00:00
Akira Hatanaka 97e179f9e4 [mips] Shorten predicate name.
llvm-svn: 169579
2012-12-07 03:06:09 +00:00
Akira Hatanaka c5dc055922 [mips] Delete unused sub-target features.
llvm-svn: 169578
2012-12-07 03:04:05 +00:00
Akira Hatanaka 02a346d11f [mips] Remove unnecessary predicates.
llvm-svn: 169577
2012-12-07 03:01:24 +00:00
Matt Beaumont-Gay 4a04c92001 Add a 'using' declaration to suppress GCC's -Woverloaded-virtual while we
decide what pattern we want to follow in the future.

llvm-svn: 169561
2012-12-06 23:15:36 +00:00
Evan Cheng 9ec512d768 Replace r169459 with something safer. Rather than having computeMaskedBits to
understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).

rdar://12771555

llvm-svn: 169536
2012-12-06 19:13:27 +00:00
Jakub Staszak 40ee5674cd Remove unneeded function, since PR8156 was fixed over a year ago.
llvm-svn: 169534
2012-12-06 19:05:46 +00:00
Jakub Staszak 65ca2fb9e6 Simplify code.
llvm-svn: 169521
2012-12-06 18:22:59 +00:00
Craig Topper 216bcd522b Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions.
llvm-svn: 169482
2012-12-06 07:31:16 +00:00
Craig Topper 922f10aec4 Mark MOVDQ(A/U)rm as ReMaterializable. Mark all MOVDQ(A/U) instructions as neverHasSideEffects.
llvm-svn: 169477
2012-12-06 06:49:16 +00:00
Chad Rosier 9f5c68af4c [arm fast-isel] Make the fast-isel implementation of memcpy respect alignment.
rdar://12821569

llvm-svn: 169460
2012-12-06 01:34:31 +00:00
Evan Cheng 5213139f48 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555

llvm-svn: 169459
2012-12-06 01:28:01 +00:00
Jyotsna Verma d3746e6895 Define new-value store instructions with base+immediate addressing mode
using multiclass.

llvm-svn: 169432
2012-12-05 22:02:56 +00:00
Nadav Rotem 0a471ea66c Cost Model: change the default cost of control flow instructions (br / ret / ...) to zero.
llvm-svn: 169423
2012-12-05 21:21:26 +00:00
David Sehr 05176cad21 Correct ARM NOP encoding
The encoding of NOP in ARMAsmBackend.cpp is missing a trailing zero, which
causes the emission of a coprocessor instruction rather than "mov r0, r0"
as indicated in the comment.  The test also checks for the wrong encoding.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121203/157919.html

llvm-svn: 169420
2012-12-05 21:01:27 +00:00
Justin Holewinski fb711156ae [NVPTX] Fix crash with unnamed struct arguments
Patch by Eric Holk

llvm-svn: 169418
2012-12-05 20:50:28 +00:00
Jyotsna Verma 90295156d8 Use multiclass to define store instructions with base+immediate offset
addressing mode and immediate stored value.

llvm-svn: 169408
2012-12-05 19:32:03 +00:00
Matthew Curtis cd8c881c9f Fix misplaced closing brace.
llvm-svn: 169404
2012-12-05 19:00:34 +00:00
Kevin Enderby 168ffb36a5 Added a option to the disassembler to print immediates as hex.
This is for the lldb team so most of but not all of the values are
to be printed as hex with this option.  Some small values like the
scale in an X86 address were requested to printed in decimal
without the leading 0x.

There may be some tweaks need to places that may still be in
decimal that they want in hex.  Specially for arm.  I made my best
guess.  Any tweaks from here should be simple.

I also did the best I know now with help from the C++ gurus
creating the cleanest formatImm() utility function and containing
the changes.  But if someone has a better idea to make something
cleaner I'm all ears and game for changing the implementation.

rdar://8109283

llvm-svn: 169393
2012-12-05 18:13:19 +00:00
Elena Demikhovsky cd3c1c4a16 Simplified BLEND pattern matching for shuffles.
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.

llvm-svn: 169366
2012-12-05 09:24:57 +00:00
Evan Cheng d31802c1f6 Add x86 isel lowering logic to form bit test with inverted condition. e.g.
x ^ -1.

Patch by David Majnemer.
rdar://12755626

llvm-svn: 169339
2012-12-05 00:10:38 +00:00
Matt Beaumont-Gay 50f61b662f Appease GCC's -Wparentheses.
(TIL that Clang's -Wparentheses ignores 'x || y && "foo"' on purpose. Neat.)

llvm-svn: 169337
2012-12-04 23:54:02 +00:00
Evan Cheng b4eae1361c ARM custom lower ctpop for vector types. Patch by Pete Couperus.
llvm-svn: 169325
2012-12-04 22:41:50 +00:00
Jyotsna Verma 4da904c8f8 Define store instructions with base+register offset addressing mode
using multiclass.

llvm-svn: 169314
2012-12-04 21:58:25 +00:00
Eli Bendersky abe546368b Make NaCl naming consistent. The triple OSType is called NaCl and is represented
textually as NativeClient. Also added a link to the native client project for
readers unfamiliar with it.

A Clang patch will follow shortly.

llvm-svn: 169291
2012-12-04 18:37:26 +00:00
Jyotsna Verma dfd779e108 Add patterns to define 'combine', 'tstbit', 'ct0/cl0' (count trailing/leading zeros)
instructions.

llvm-svn: 169287
2012-12-04 18:05:01 +00:00
Jyotsna Verma 22d61dd4ce Add constant extender support to ALU32 instructions for V2.
llvm-svn: 169284
2012-12-04 17:12:00 +00:00
Bill Schmidt ca4a0c9dbd This patch introduces initial-exec model support for thread-local storage
on 64-bit PowerPC ELF.

The patch includes code to handle external assembly and MC output with the
integrated assembler.  It intentionally does not support the "old" JIT.

For the initial-exec TLS model, the ABI requires the following to calculate
the address of external thread-local variable x:

 Code sequence            Relocation                  Symbol
  ld 9,x@got@tprel(2)      R_PPC64_GOT_TPREL16_DS      x
  add 9,9,x@tls            R_PPC64_TLS                 x

The register 9 is arbitrary here.  The linker will replace x@got@tprel
with the offset relative to the thread pointer to the generated GOT
entry for symbol x.  It will replace x@tls with the thread-pointer
register (13).

The two test cases verify correct assembly output and relocation output
as just described.

PowerPC-specific selection node variants are added for the two
instructions above:  LD_GOT_TPREL and ADD_TLS.  These are inserted
when an initial-exec global variable is encountered by
PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to
machine instructions LDgotTPREL and ADD8TLS.  LDgotTPREL is a pseudo
that uses the same LDrs support added for medium code model's LDtocL,
with a different relocation type.

The rest of the processing is straightforward.

llvm-svn: 169281
2012-12-04 16:18:08 +00:00
Chandler Carruth 802d755533 Sort includes for all of the .h files under the 'lib' tree. These were
missed in the first pass because the script didn't yet handle include
guards.

Note that the script is now able to handle all of these headers without
manual edits. =]

llvm-svn: 169224
2012-12-04 07:12:27 +00:00
Jyotsna Verma 5929cfc534 Move all operand definitions into HexagonOperands.td
llvm-svn: 169213
2012-12-04 05:00:31 +00:00
Jyotsna Verma efe4f559b1 Move generic Hexagon subtarget information into Hexagon.td
llvm-svn: 169212
2012-12-04 04:29:16 +00:00
Jakob Stoklund Olesen a32d85b39d Remove the old TRI::ResolveRegAllocHint() and getRawAllocationOrder() hooks.
These functions have been replaced by TRI::getRegAllocationHints() which
provides the same capabilities.

llvm-svn: 169192
2012-12-04 00:46:13 +00:00
Akira Hatanaka 4c128509a5 Classic JIT is still being supported by MIPS, along with MCJIT.
This change adds endian-awareness to MipsJITInfo and emitWordLE in
MipsCodeEmitter has become emitWord now to support both endianness.

Patch by Petar Jovanovic.

llvm-svn: 169177
2012-12-03 23:11:12 +00:00
Akira Hatanaka 60c2837e8d Functions in MipsCodeEmitter.cpp that expand unaligned loads/stores are dead
code. Removing it.

Patch by Petar Jovanovic.

llvm-svn: 169174
2012-12-03 22:51:22 +00:00
Jakob Stoklund Olesen 742f201e30 Implement ARMBaseRegisterInfo::getRegAllocationHints().
This provides the same functionality as getRawAllocationOrder() for the
even/odd hints, but without the many constant register arrays.

llvm-svn: 169169
2012-12-03 22:35:35 +00:00
Jyotsna Verma 6f3bd03e50 Define store instructions with base+immediate offset addressing mode
using multiclass.

llvm-svn: 169168
2012-12-03 22:26:28 +00:00
Jyotsna Verma 4d8686cc42 Define load instructions with base+immediate offset addressing mode
using multiclass.

llvm-svn: 169153
2012-12-03 21:13:13 +00:00
Jyotsna Verma c86b3e1b26 Define unsigned const-ext predicates.
llvm-svn: 169149
2012-12-03 20:39:45 +00:00
Jyotsna Verma 6aba56e9d4 Removing unnecessary 'else' statement from the predicates defined in HexagonOperards.td.
llvm-svn: 169148
2012-12-03 20:14:38 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Jyotsna Verma 014dfe4de0 Define signed const-ext predicates.
llvm-svn: 169117
2012-12-03 06:54:50 +00:00
Sebastian Pop a204f72237 Codegen failure for vmull with small vectors
Codegen was failing with an assertion because of unexpected vector
operands when legalizing the selection DAG for a MUL instruction.

The asserting code was legalizing multiplies for vectors of size 128
bits. It uses a custom lowering to try and detect cases where it can
use a VMULL instruction instead of a VMOVL + VMUL.  The code was
looking for input operands to the MUL that had been sign or zero
extended. If it found the extended operands it would drop the
sign/zero extension and use the original vector size as input to a
VMULL instruction.

The code assumed that the original input vector was 64 bits so that
after dropping the extension it would fit directly into a D register
and could be used as an operand of a VMULL instruction. The input
code that trigger the failure used a vector of <4 x i8> that was
sign extended to <4 x i32>. It was not safe to drop the sign
extension in this case because the original vector is only 32 bits
wide. The fix is to insert a sign extension for the vector to reach
the required 64 bit size. In this particular example, the vector would
need to be sign extented to a <4 x i16>.

llvm-svn: 169024
2012-11-30 19:08:04 +00:00
Jyotsna Verma a77c054e85 Use multiclass for the load instructions with MEMri operand.
llvm-svn: 169018
2012-11-30 17:31:52 +00:00
Adhemerval Zanella 812410f2d1 This patch fixes the Altivec addend construction for the fused multiply-add
instruction (vmaddfp) to conform with IEEE to ensure the sign of a zero
result when resulting product is -0.0.

The -0.0 vector addend to vmaddfp is generated by a creating a vector
with full bits sets and then shifting each elements by 31-bits to the
left, resulting in a vector of 0x80000000 (or -0.0 as float).

The 'buildvec_canonicalize.ll' was adjusted to reflect this change and
the 'vec_mul.ll' was complemented with the float vector multiplication
test.

llvm-svn: 168998
2012-11-30 13:05:44 +00:00
Chandler Carruth f12e3a67db Switch LLVM_USE_RVALUE_REFERENCES to LLVM_HAS_RVALUE_REFERENCES.
Rationale:
1) This was the name in the comment block. ;]
2) It matches Clang's __has_feature naming convention.
3) It matches other compiler-feature-test conventions.

Sorry for the noise. =]

I've also switch the comment block to use a \brief tag and not duplicate
the name.

llvm-svn: 168996
2012-11-30 11:45:22 +00:00
Jyotsna Verma b950ea61fc Use multiclass for the store instructions with MEMri operand.
llvm-svn: 168983
2012-11-30 06:10:22 +00:00
Jyotsna Verma ede608cce0 Use multiclass for the load instructions with 'base + register offset'
addressing mode.

llvm-svn: 168976
2012-11-30 04:19:09 +00:00
Kevin Enderby 136d6746c5 Fixed the arm disassembly of invalid BFI instructions to not build a bad MCInst
which would then cause an assert when printed.  rdar://11437956

llvm-svn: 168960
2012-11-29 23:47:11 +00:00
Quentin Colombet 13cd521b24 Add cortex-a5 subtarget to the supported ARM architectures
llvm-svn: 168933
2012-11-29 19:48:01 +00:00
Shuxin Yang abcc370423 rdar://12100355 (part 1)
This revision attempts to recognize following population-count pattern:

 while(a) { c++; ... ; a &= a - 1; ... },
  where <c> and <a>could be used multiple times in the loop body.

 TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to a efficent 
instruction sequence, which need to be improved in the following commits.

Reviewed by Nadav, really appreciate!

llvm-svn: 168931
2012-11-29 19:38:54 +00:00
Jyotsna Verma e95559fc16 Use multiclass for 'transfer' instructions.
llvm-svn: 168929
2012-11-29 19:35:44 +00:00
Silviu Baranga 93aefa5f2c Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend.
llvm-svn: 168886
2012-11-29 14:41:25 +00:00
Justin Holewinski bc45119b44 Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors
For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>.

llvm-svn: 168882
2012-11-29 14:26:24 +00:00
Elena Demikhovsky eace43bff7 I changed hasAVX() to hasFp256() and hasAVX2() to hasInt256() in X86IselLowering.cpp.
The logic was not changed, only names.

llvm-svn: 168875
2012-11-29 12:44:59 +00:00
Jyotsna Verma 519b3856dd Define signed const-ext immediate operands and their predicates.
llvm-svn: 168810
2012-11-28 20:58:14 +00:00
Benjamin Kramer b1996da782 ARM: Implement CanLowerReturn so large vectors get expanded into sret.
Fixes 14337.

llvm-svn: 168809
2012-11-28 20:55:10 +00:00
Ulrich Weigand 9e159fd7a0 Fix initial frame state on powerpc64.
The createPPCMCAsmInfo routine used PPC::R1 as the initial frame
pointer register, but on PPC64 the 32-bit R1 register does not
have a corresponding DWARF number, causing invalid CIE initial
frame state to be emitted.  Fix by using PPC::X1 instead.

llvm-svn: 168799
2012-11-28 18:21:03 +00:00
Jakob Stoklund Olesen 9de596e650 Remove all references to TargetInstrInfoImpl.
This class has been merged into its super-class TargetInstrInfo.

llvm-svn: 168760
2012-11-28 02:35:17 +00:00
Jakob Stoklund Olesen fcf14e8436 Move Target{Instr,Register}Info.cpp into lib/CodeGen.
The Target library is not allowed to depend on the large CodeGen
library, but the TRI and TII classes provide abstract interfaces that
require both caller and callee to link to CodeGen.

The implementation files for these classes provide default
implementations of some of the hooks. These methods may need to
reference CodeGen, so they belong in that library.

We already have a number of methods implemented in the
TargetInstrInfoImpl sub-class because of that. I will merge that class
into the parent next.

llvm-svn: 168758
2012-11-28 02:35:09 +00:00
Bill Schmidt e0a68a562b This patch makes medium code model the default for 64-bit PowerPC ELF.
When the CodeGenInfo is to be created for the PPC64 target machine,
a default code-model selection is converted to CodeModel::Medium
provided we are not targeting the Darwin OS.  Defaults for Darwin
are unaffected.

llvm-svn: 168747
2012-11-27 23:36:26 +00:00
Chad Rosier b4ac423ed4 [arm fast-isel] Appease the machine verifier by using the proper register
classes.  The vast majority of the remaining issues are due to uses of
invalid registers, which are defined by getRegForValue().  Those will be
a little more challenging to cleanup.
rdar://12719844

llvm-svn: 168735
2012-11-27 22:29:43 +00:00
Chad Rosier 0c00758065 [arm fast-isel] Appease the machine verifier by using the proper register
classes.
rdar://12719844

llvm-svn: 168733
2012-11-27 22:12:11 +00:00
Chad Rosier 2ec7db0968 [arm fast-isel] Appease the machine verifier by using the proper register
classes.  Also a bit of cleanup.
rdar://12719844

llvm-svn: 168728
2012-11-27 21:46:46 +00:00
Manman Ren 5b4628201f X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions
when the destination register is wider than the memory load.

These load instructions load from m32 or m64 and set the upper bits to zero,
while the folded instructions may accept m128.

rdar://12721174

llvm-svn: 168710
2012-11-27 18:09:26 +00:00
Bill Schmidt 34627e3434 This patch implements medium code model support for 64-bit PowerPC.
The default for 64-bit PowerPC is small code model, in which TOC entries
must be addressable using a 16-bit offset from the TOC pointer.  Additionally,
only TOC entries are addressed via the TOC pointer.

With medium code model, TOC entries and data sections can all be addressed
via the TOC pointer using a 32-bit offset.  Cooperation with the linker
allows 16-bit offsets to be used when these are sufficient, reducing the
number of extra instructions that need to be executed.  Medium code model
also does not generate explicit TOC entries in ".section toc" for variables
that are wholly internal to the compilation unit.

Consider a load of an external 4-byte integer.  With small code model, the
compiler generates:

	ld 3, .LC1@toc(2)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc ei[TC],ei

With medium model, it instead generates:

	addis 3, 2, .LC1@toc@ha
	ld 3, .LC1@toc@l(3)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc ei[TC],ei

Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the
32-bit offset of ei's TOC entry from the TOC base pointer.  Similarly,
.LC1@toc@l is a relocation requesting the lower 16 bits.  Note that if
the linker determines that ei's TOC entry is within a 16-bit offset of
the TOC base pointer, it will replace the "addis" with a "nop", and
replace the "ld" with the identical "ld" instruction from the small
code model example.

Consider next a load of a function-scope static integer.  For small code
model, the compiler generates:

	ld 3, .LC1@toc(2)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc test_fn_static.si[TC],test_fn_static.si
	.type	test_fn_static.si,@object
	.local	test_fn_static.si
	.comm	test_fn_static.si,4,4

For medium code model, the compiler generates:

	addis 3, 2, test_fn_static.si@toc@ha
	addi 3, 3, test_fn_static.si@toc@l
	lwz 4, 0(3)

	.type	test_fn_static.si,@object
	.local	test_fn_static.si
	.comm	test_fn_static.si,4,4

Again, the linker may replace the "addis" with a "nop", calculating only
a 16-bit offset when this is sufficient.

Note that it would be more efficient for the compiler to generate:

	addis 3, 2, test_fn_static.si@toc@ha
        lwz 4, test_fn_static.si@toc@l(3)

The current patch does not perform this optimization yet.  This will be
addressed as a peephole optimization in a later patch.

For the moment, the default code model for 64-bit PowerPC will remain the
small code model.  We plan to eventually change the default to medium code
model, which matches current upstream GCC behavior.  Note that the different
code models are ABI-compatible, so code compiled with different models will
be linked and execute correctly.

I've tested the regression suite and the application/benchmark test suite in
two ways:  Once with the patch as submitted here, and once with additional
logic to force medium code model as the default.  The tests all compile
cleanly, with one exception.  The mandel-2 application test fails due to an
unrelated ABI compatibility with passing complex numbers.  It just so happens
that small code model was incredibly lucky, in that temporary values in 
floating-point registers held the expected values needed by the external
library routine that was called incorrectly.  My current thought is to correct
the ABI problems with _Complex before making medium code model the default,
to avoid introducing this "regression."

Here are a few comments on how the patch works, since the selection code
can be difficult to follow:

The existing logic for small code model defines three pseudo-instructions:
LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for
constant pool addresses.  These are expanded by SelectCodeCommon().  The
pseudo-instruction approach doesn't work for medium code model, because
we need to generate two instructions when we match the same pattern.
Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY
node for medium code model, and generates an ADDIStocHA followed by either
a LDtocL or an ADDItocL.  These new node types correspond naturally to
the sequences described above.

The addis/ld sequence is generated for the following cases:
 * Jump table addresses
 * Function addresses
 * External global variables
 * Tentative definitions of global variables (common linkage)

The addis/addi sequence is generated for the following cases:
 * Constant pool entries
 * File-scope static global variables
 * Function-scope static variables

Expanding to the two-instruction sequences at select time exposes the
instructions to subsequent optimization, particularly scheduling.

The rest of the processing occurs at assembly time, in
PPCAsmPrinter::EmitInstruction.  Each of the instructions is converted to
a "real" PowerPC instruction.  When a TOC entry needs to be created, this
is done here in the same manner as for the existing LDtoc, LDtocJTI, and
LDtocCPT pseudo-instructions (I factored out a new routine to handle this).

I had originally thought that if a TOC entry was needed for LDtocL or
ADDItocL, it would already have been generated for the previous ADDIStocHA.
However, at higher optimization levels, the ADDIStocHA may appear in a 
different block, which may be assembled textually following the block
containing the LDtocL or ADDItocL.  So it is necessary to include the
possibility of creating a new TOC entry for those two instructions.

Note that for LDtocL, we generate a new form of LD called LDrs.  This
allows specifying the @toc@l relocation for the offset field of the LD
instruction (i.e., the offset is replaced by a SymbolLo relocation).
When the peephole optimization described above is added, we will need
to do similar things for all immediate-form load and store operations.

The seven "mcm-n.ll" test cases are kept separate because otherwise the
intermingling of various TOC entries and so forth makes the tests fragile
and hard to understand.

The above assumes use of an external assembler.  For use of the
integrated assembler, new relocations are added and used by
PPCELFObjectWriter.  Testing is done with "mcm-obj.ll", which tests for
proper generation of the various relocations for the same sequences
tested with the external assembler.

llvm-svn: 168708
2012-11-27 17:35:46 +00:00