Commit Graph

28582 Commits

Author SHA1 Message Date
Daniel Sanders e296a0fce5 [mips][msa] Fix vector insertions where the index is variable
Summary:
This isn't supported directly so we rotate the vector by the desired number of
elements, insert to element zero, then rotate back.

The i64 case generates rather poor code on MIPS32. There is an obvious
optimisation to be made in future (do both insert.w's inside a shared 
rotate/unrotate sequence) but for now it's sufficient to select valid code
instead of aborting.

Depends on D3536

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3537

llvm-svn: 207640
2014-04-30 12:09:32 +00:00
Tim Northover f9941a9dc6 ARM64: accept ELF-relocated load/store insts without a #.
E.g. we print "ldr x0, [x0, :lo12:symbol]" so we need to accept that syntax
too.

llvm-svn: 207639
2014-04-30 12:00:20 +00:00
Tim Northover 36c93db37a ARM64: remove duplication by templating InstPrinter methods
No functional change, so no tests.

llvm-svn: 207638
2014-04-30 11:43:36 +00:00
Matheus Almeida 525bc4f708 [mips] Add support for .cpload.
Summary:
This directive is used for setting up $gp in the beginning of a function.
It expands to three instructions if PIC is enabled:
lui   $gp, %hi(_gp_disp)
addui $gp, $gp, %lo(_gp_disp)
addu  $gp, $gp, $reg

_gp_disp is a special symbol that the linker sets to the distance between
the lui instruction and the context pointer (_gp).

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3480

llvm-svn: 207637
2014-04-30 11:28:42 +00:00
Tim Northover 970c4a8d35 ARM64: use hex immediates for movz/movk instructions
Since these are mostly used in "lsl #16", "lsl #32", "lsl #48" combinations to
piece together an immediate in 16-bit chunks, hex is probably the most
appropriate format.

llvm-svn: 207635
2014-04-30 11:19:40 +00:00
Tim Northover 4b2f8a990e ARM64: hexify printing various immediate operands
This is mostly aimed at the NEON logical operations and MOVI/MVNI (since they
accept weird shifts which are more naturally understandable in hex notation).

Also changes BRK/HINT etc, which is probably a neutral change, but easier than
the alternative.

llvm-svn: 207634
2014-04-30 11:19:28 +00:00
Tim Northover cfd6e66544 ARM64: print canonical syntax for add/sub (imm) instructions.
Since these instructions only accept a 12-bit immediate, possibly shifted left
by 12, the canonical syntax used by the architecture reference manual is "#N {,
lsl #12 }". We should accept an immediate that has already been shifted, (e.g.

Also, print a comment giving the full addend since it can be helpful.

llvm-svn: 207633
2014-04-30 11:19:15 +00:00
James Molloy 54f3485dba [ARM64] Simplify if condition.
v2f32 and v4f32 were missed out of these conditions, so this is also
a bugfix.

llvm-svn: 207628
2014-04-30 10:15:50 +00:00
James Molloy b5efbcfbe5 [ARM64] Fix stupid copy-pasto in ARM64MCAsmInfo.cpp - aarch64_be -> arm64_be
llvm-svn: 207627
2014-04-30 10:15:46 +00:00
Tim Northover 41cec5c3cb ARM64: make sure FastISel uses a GPR64 source in 64-bit extensions.
llvm-svn: 207620
2014-04-30 09:32:01 +00:00
Craig Topper 2d2aa0ca1f Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I introduced most of these recently.
llvm-svn: 207616
2014-04-30 07:17:30 +00:00
Saleem Abdulrasool 25947c318b ARM: support stack probe emission for Windows on ARM
This introduces the stack lowering emission of the stack probe function for
Windows on ARM. The stack on Windows on ARM is a dynamically paged stack where
any page allocation which crosses a page boundary of the following guard page
will cause a page fault. This page fault must be handled by the kernel to
ensure that the page is faulted in. If this does not occur and a write access
any memory beyond that, the page fault will go unserviced, resulting in an
abnormal program termination.

The watermark for the stack probe appears to be at 4080 bytes (for
accommodating the stack guard canaries and stack alignment) when SSP is
enabled.  Otherwise, the stack probe is emitted on the page size boundary of
4096 bytes.

llvm-svn: 207615
2014-04-30 07:05:07 +00:00
Saleem Abdulrasool 0aca1c30c6 ARM: print COFF function header for Windows on ARM
Emit the COFF header when printing out the function.  This is important as the
header contains two important pieces of information: the storage class for the
symbol and the symbol type information.  This bit of information is required for
the linker to correctly identify the type of symbol that it is dealing with.

llvm-svn: 207613
2014-04-30 06:14:25 +00:00
Craig Topper ee7b0f3956 De-virtualize or remove some methods that have no overrides nor override anything. In some cases remove all together if there are no callers either.
llvm-svn: 207610
2014-04-30 05:53:27 +00:00
Saleem Abdulrasool ef550a6d01 ARM: move llvm_unreachable use
When building with -Werror=covered-switch-default (as on the buildbots), the
build would fail since all cases are covered by the switch.  Move the
llvm_unreachable to the end of the function as an annotation.

llvm-svn: 207609
2014-04-30 05:12:41 +00:00
Saleem Abdulrasool f8222631a5 ARM: partially handle 32-bit relocations for WoA
IMAGE_REL_ARM_MOV32T relocations require that the movw/movt pair-wise
relocation is not split up and reordered. When expanding the mov32imm
pseudo-instruction, create a bundle if the machine operand is referencing an
address.  This helps ensure that the relocatable address load is not reordered
by subsequent passes.

Unfortunately, this only partially handles the case as the Constant Island Pass
occurs after the instructions are unbundled and does not properly handle
bundles.  That is a more fundamental issue with the pass itself and beyond the
scope of this change.

llvm-svn: 207608
2014-04-30 04:54:58 +00:00
Reid Kleckner fb69308568 Implement X86 code generation for musttail
Currently, musttail codegen is relying on sibcall optimization, and
reporting a fatal error if fails.  Sibcall optimization fails when stack
arguments need to be modified, which is insufficient for musttail.

The logic for moving arguments in memory safely is already implemented
for GuaranteedTailCallOpt.  This change merely arranges for musttail
calls to use it.

No functional change for GuaranteedTailCallOpt.

Reviewers: espindola

Differential Revision: http://reviews.llvm.org/D3493

llvm-svn: 207598
2014-04-29 23:55:41 +00:00
Benjamin Kramer d59664f4f7 raw_ostream: Forward declare OpenFlags and include FileSystem.h only where necessary.
llvm-svn: 207593
2014-04-29 23:26:49 +00:00
Tom Stellard 93f9f4950c R600: Remove duplicate setting of SELECT expansion.
It's already set in AMDGPUISelLowering for all GPUs

Patch By: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207592
2014-04-29 23:12:55 +00:00
Tom Stellard 919bb6b83f R600/SI: Custom lower SI_IF and SI_ELSE to avoid machine verifier errors
SI_IF and SI_ELSE are terminators which also produce a value.  For
these instructions ISel always inserts a COPY to move their value
to another basic block.  This COPY ends up between SI_(IF|ELSE)
and the S_BRANCH* instruction at the end of the block.

This breaks MachineBasicBlock::getFirstTerminator() and also the
machine verifier which assumes that terminators are grouped together at
the end of blocks.

To solve this we coalesce the copy away right after ISel to make sure
there are no instructions in between terminators at the end of blocks.

llvm-svn: 207591
2014-04-29 23:12:53 +00:00
Tom Stellard 58ac7440e6 R600/SI: Only select SALU instructions in the entry or exit block
SALU instructions ignore control flow, so it is not always safe to use
them within branches.  This is a partial solution to this problem
until we can come up with something better.

llvm-svn: 207590
2014-04-29 23:12:48 +00:00
Tom Stellard 676f571999 R600: optimize the UDIVREM 64 algorithm
This is a squash of several optimization commits:
 - calculate DIV_Lo and DIV_Hi separately
 - use BFE_U32 if we are operating on 32bit values
 - use precomputed constants instead of shifting in UDVIREM
 - skip the first 32 iterations of udivrem

v2: Check whether BFE is supported before using it

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207589
2014-04-29 23:12:46 +00:00
Tom Stellard bcd318fc76 R600: Implement iterative algorithm for udivrem
Initial implementation, rather slow

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207588
2014-04-29 23:12:45 +00:00
Tom Stellard 5f3378879f R600: Change UDIV/UREM to UDIVREM when legalizing types
When legalizing ops, with UDIV/UREM set to expand, they automatically
expand to UDIVREM (if legal or custom).
We need to do this manually for legalize types.

v2:
  SI should be set to Expand because the type is legal, and it is
    automatically lowered to UDIVREM if UDIVREM is Legal/Custom
  R600 should set to UDIV/UREM to Custom because it needs to lower them
    during type legalization

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207587
2014-04-29 23:12:43 +00:00
Tom Stellard df780303ef R600: remove unused variable
Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207586
2014-04-29 23:12:38 +00:00
Reed Kotler 67077b3032 Add Simple return instruction to Mips fast-isel
Reviewers: dsanders

Reviewed by: dsanders

Differential Revision: http://reviews.llvm.org/D3430

llvm-svn: 207565
2014-04-29 17:57:50 +00:00
Daniel Sanders 690e4d493e [mips] Remove two more redundant 'let Predicates = [HasStdEnc]' statements that were missed
Summary:
The InstSE class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3548

llvm-svn: 207558
2014-04-29 17:04:30 +00:00
Daniel Sanders 5682f63b46 [mips] Remove more redundant 'let Predicates = [HasStdEnc]' statements
Summary:
The InstSE class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3547

llvm-svn: 207551
2014-04-29 16:37:01 +00:00
Daniel Sanders f562582d15 [mips] Remove redundant 'let Predicates = [HasStdEnc]' statements
Summary:
The MipsPat class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3546

llvm-svn: 207548
2014-04-29 16:24:10 +00:00
Joerg Sonnenberger dd18d5b0f6 Parse and create GOT_PREL relocations.
llvm-svn: 207526
2014-04-29 13:42:02 +00:00
Daniel Sanders b3268e71e2 [mips][msa] Fix element extraction where the index is variable.
Summary:
This isn't supported directly so we splat the vector element and extract
the most convenient copy.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3530

llvm-svn: 207524
2014-04-29 13:31:37 +00:00
Rafael Espindola b60c829a2a Centralize the handling of the thumb bit.
This patch centralizes the handling of the thumb bit around
MCStreamer::isThumbFunc and makes isThumbFunc handle aliases.

This fixes a corner case, but the main advantage is having just one
way to check if a MCSymbol is thumb or not. This should still be
refactored to be ARM only, but at least now it is just one predicate
that has to be refactored instead of 3 (isThumbFunc,
ELF_Other_ThumbFunc, and SF_ThumbFunc).

llvm-svn: 207522
2014-04-29 12:46:50 +00:00
Tim Northover 9e7782dcf3 X86: emit hidden stubs into a proper non_lazy_symbol_pointer section.
rdar://problem/16660411

llvm-svn: 207518
2014-04-29 10:06:10 +00:00
Tim Northover 2372301bcf ARM: emit hidden stubs into a proper non_lazy_symbol_pointer section.
rdar://problem/16660411

llvm-svn: 207517
2014-04-29 10:06:05 +00:00
Benjamin Kramer e1ab3f062e AArch64: Mark vector long multiplication as expand.
There are no patterns for this. This was already fixed for ARM64 but I forgot
to apply it to AArch64 too.

llvm-svn: 207515
2014-04-29 09:37:54 +00:00
Elena Demikhovsky 299cf511c4 AVX-512: optimized a shuffle pattern to VINSERTI64x4.
Added intrinsics for VPERMT2PS/PD/D/Q instructions.

llvm-svn: 207513
2014-04-29 09:09:15 +00:00
Craig Topper 9d74a5a5f1 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves.
llvm-svn: 207511
2014-04-29 07:58:41 +00:00
Craig Topper e06fc4f0ca [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. AArch64 edition
llvm-svn: 207510
2014-04-29 07:58:34 +00:00
Craig Topper f85b7fc197 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. ARM64 edition
llvm-svn: 207509
2014-04-29 07:58:25 +00:00
Craig Topper 906c2cd2e6 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Hexagon edition
llvm-svn: 207508
2014-04-29 07:58:16 +00:00
Craig Topper 6f9e59ea55 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. MSP430 edition
llvm-svn: 207507
2014-04-29 07:58:09 +00:00
Craig Topper 56c590af3b [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Mips edition
llvm-svn: 207506
2014-04-29 07:58:02 +00:00
Craig Topper 2865c986d1 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. NVPTX edition
llvm-svn: 207505
2014-04-29 07:57:44 +00:00
Craig Topper 0d3fa92514 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition
llvm-svn: 207504
2014-04-29 07:57:37 +00:00
Craig Topper 5656db4a8b [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. R600 edition
llvm-svn: 207503
2014-04-29 07:57:24 +00:00
Craig Topper b0c941bebd [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Sparc edition
llvm-svn: 207502
2014-04-29 07:57:13 +00:00
Craig Topper 60879a3c76 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. XCore edition
llvm-svn: 207501
2014-04-29 07:57:00 +00:00
Hao Liu 6db3410071 [ARM64]Fix a bug about incorrect operand order in an EXT instruction, which is introduced by r207485.
llvm-svn: 207500
2014-04-29 07:51:19 +00:00
Hao Liu cf37110920 [ARM64]Fix a bug when lowering shuffle vector to an EXT instruction.
E.g. Mask like <-1, -1, 1, ...> will generate incorrect EXT index.

llvm-svn: 207485
2014-04-29 01:50:36 +00:00
Eric Christopher 612bb69bf7 None of these targets actually define their own CFI_INSTRUCTION
opcode so there's no reason to use the target namespace for it
rather than TargetOpcode.

llvm-svn: 207475
2014-04-29 00:16:46 +00:00
Eric Christopher 40af450562 80-column fixups.
llvm-svn: 207474
2014-04-29 00:16:42 +00:00
Eric Christopher d17374919b 80-column, tab characters, comment fixups.
llvm-svn: 207473
2014-04-29 00:16:40 +00:00
Eric Christopher 4237bf10f3 Fix 80-columns, tab characters, and comments.
llvm-svn: 207472
2014-04-29 00:16:33 +00:00
Quentin Colombet 50efe87e5b [X86] Add more details in the comments of X86TargetLowering::getScalingFactorCost.
llvm-svn: 207432
2014-04-28 18:39:57 +00:00
Chad Rosier 0def8e2652 [ARM64] Fix an issue where we were always assuming a copy was coming from a D subregister.
llvm-svn: 207423
2014-04-28 16:21:50 +00:00
Tim Northover 6ad1f5c817 ARM: stop passing unused values up the TableGen hierarchy.
It's bad enough that I have to look up 5 different levels of TableGen class
definitions to work out what bits go where in a simple NEON instruction anyway,
without having to keep track of umpteen unused parameters.

llvm-svn: 207420
2014-04-28 13:53:00 +00:00
Patrik Hagglund 319983810a Fix gcc -Wsign-compare warning in X86DisassemblerTables.cpp.
X86_MAX_OPERANDS is changed to unsigned.

Also, add range-based for loops for affected loops. This in turn
needed an ArrayRef instead of a pointer-to-array in
InternalInstruction.

llvm-svn: 207413
2014-04-28 12:12:27 +00:00
Tim Northover 7b839f833d ARM64: diagnose use of v16-v31 in certain indexed NEON instructions.
Someone couldn't bear to have a completely orthogonal set of floating-point
registers, so we've got some instructions that only accept v0-v15 (coming in
ARMv9, V128_prime: you're allowed v2, v3, v5, v7, ...).

Anyway, we were permitting even the out of range registers during assembly
(CodeGen handled it correctly). This adds a diagnostic.

llvm-svn: 207412
2014-04-28 11:27:43 +00:00
Hao Liu 9a342778b9 [ARM64]Fix a bug cannot select UQSHL/SQSHL with constant i64 shift amount.
llvm-svn: 207399
2014-04-28 07:34:27 +00:00
Craig Topper 8c0b4d0791 Convert more SelectionDAG functions to use ArrayRef.
llvm-svn: 207397
2014-04-28 05:57:50 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Rafael Espindola 466d66358d Add emitThumbSet to the arm target streamer.
This fixes the asm printer implementation and lets the parser be unaware of
what .thumb_set is.

llvm-svn: 207381
2014-04-27 20:23:58 +00:00
Craig Topper 131de82adb Convert SelectionDAG::MorphNodeTo to use ArrayRef.
llvm-svn: 207378
2014-04-27 19:21:16 +00:00
Craig Topper 481fb2879f Convert SelectionDAG::SelectNodeTo to use ArrayRef.
llvm-svn: 207377
2014-04-27 19:21:11 +00:00
Craig Topper dd5e16dd34 Convert one last signature of getNode to take an ArrayRef of SDUse.
llvm-svn: 207376
2014-04-27 19:21:06 +00:00
Craig Topper 64941d9786 Convert SelectionDAG::getMergeValues to use ArrayRef.
llvm-svn: 207374
2014-04-27 19:20:57 +00:00
Benjamin Kramer ce4b3fee72 X86TTI: Adjust sdiv cost now that we can lower it on plain SSE2.
Includes a fix for a horrible typo that caused all SDIV costs to be
slightly off :)

llvm-svn: 207371
2014-04-27 18:47:54 +00:00
Benjamin Kramer 3693e77cb4 X86: If SSE4.1 is missing lower SMUL_LOHI of v4i32 to pmuludq and fix up the high parts.
This is more expensive than pmuldq but still cheaper than scalarizing the whole thing.

llvm-svn: 207370
2014-04-27 18:47:41 +00:00
Rafael Espindola 4c6f61302e Avoid using MCSymbolData on the asm streamer.
Only the object streamers need to track if a symbol should be marked thumb or
not. This ports the ELF case. The COFF case is not ported since it is currently
not working for some other reason (I will report a bug).

llvm-svn: 207366
2014-04-27 17:10:46 +00:00
Saleem Abdulrasool 0ea5d091c7 ARM: MSVC does not support = default
Explicitly "implement" the destructor as MSVC does not support defaulted methods
yet.

llvm-svn: 207350
2014-04-27 05:28:10 +00:00
Saleem Abdulrasool 84b952b677 Add WoA object file emission support
Introduce support for WoA PE/COFF object file emission from LLVM.  Add the new
target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM
specific behaviour of PE/COFF object emission.  ARM exception information is not
yet emitted and is a TODO item.

The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific
relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer.
The MC layer needs to be updated to deal with the relocation adjustments.
Branch relocations are adjusted by 4 bytes (unlikely their ELF counterparts).

Minor tweaks to switch multiple conditional checks into equivalent switch
statements.  The ObjectFileInfo is updated to relax the object file setup for
Windows COFF.  Move the architecture checks into an assertion.  Windows COFF is
currently only supported on x86, x86_64, and ARM (thumb).  Rather than
defaulting to ELF, we will refuse to generate an object file.  This is better
though as you do not get an (arbitrary) object file which is different from the
request.

llvm-svn: 207345
2014-04-27 03:48:22 +00:00
Saleem Abdulrasool a8b1f7204b MC: create X86WinCOFFStreamer for target specific behaviour
This introduces a target specific streamer, X86WinCOFFStreamer, which handles
the target specific behaviour (e.g. WinEH).  This is mostly to ensure that
differences between ARM and X86 remain disjoint and do not accidentally cross
boundaries.  This is the final staging change for enabling object emission for
Windows on ARM.

llvm-svn: 207344
2014-04-27 03:48:12 +00:00
Saleem Abdulrasool 6d6fee9cbc ARM: Support SingleParameterDotFile on WoA
Currently, the integrated assembler is the only choice for assembling Windows on
ARM binaries.  IAS supports the .file <filename> directive which emits the file
symbol into the resulting object binary.  Mark the GNU COFF information to
indicate support for this feature.

llvm-svn: 207341
2014-04-27 03:47:57 +00:00
Craig Topper 59f626d9d5 Replace std::vector with SmallVector for some small, known size vectors.
llvm-svn: 207330
2014-04-26 19:29:47 +00:00
Craig Topper 206fcd450a Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size.
llvm-svn: 207329
2014-04-26 19:29:41 +00:00
Craig Topper 48d114bed1 Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
llvm-svn: 207327
2014-04-26 18:35:24 +00:00
Benjamin Kramer c2ad8f3ef1 Print X86ISD::PMULDQ nodes properly in debug output.
llvm-svn: 207322
2014-04-26 16:26:41 +00:00
Benjamin Kramer 7c3722724b X86TTI: i16/i32 vector div with a constant (splat) divisor are reasonably cheap now.
Turn vectorization back on.

llvm-svn: 207320
2014-04-26 14:53:05 +00:00
Benjamin Kramer 6d2dff61f9 X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available.
llvm-svn: 207318
2014-04-26 14:12:19 +00:00
Benjamin Kramer c9827ab103 X86: Add patterns for MULHU/MULHS of v8i16 and v16i16.
This gets us pretty code for divs of i16 vectors. Turn the existing
intrinsics into the corresponding nodes.

llvm-svn: 207317
2014-04-26 13:01:03 +00:00
Benjamin Kramer ad0168702a Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors.
llvm-svn: 207316
2014-04-26 13:00:53 +00:00
Benjamin Kramer 4dae598bc8 DAGCombiner: Turn divs of vector splats into vectorized multiplications.
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.

I had to disable the mulhi nodes on ARM because there aren't any patterns
for it. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.

llvm-svn: 207315
2014-04-26 12:06:28 +00:00
Benjamin Kramer 29139d5cb5 X86: Custom lower v4i32 UMUL_LOHI into 2 pmuludqs.
Test will follow soon.

llvm-svn: 207314
2014-04-26 12:06:11 +00:00
Michael Zolotukhin 1a97a7bcbf Revert r206749 till a final decision about the intrinsics is made.
llvm-svn: 207313
2014-04-26 09:56:41 +00:00
Quentin Colombet ea18933d97 [X86] Implement TargetLowering::getScalingFactorCost hook.
Scaling factors are not free on X86 because every "complex" addressing mode
breaks the related instruction into 2 allocations instead of 1.

<rdar://problem/16730541>

llvm-svn: 207301
2014-04-26 01:11:26 +00:00
Filipe Cabecinhas 363b570d2a Optimization for certain shufflevector by using insertps.
Summary:
If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower
certain shufflevectors to an insertps instruction:
When most of the shufflevector result's elements come from one vector (and
keep their index), and one element comes from another vector or a memory
operand.

Added tests for insertps optimizations on shufflevector.
Added support and tests for v4i32 vector optimization.

Reviewers: nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3475

llvm-svn: 207291
2014-04-25 23:51:17 +00:00
Matt Arsenault de1c3410c3 R600: Fix function name printing in LowerCall
v2: Check both ExternalSymbol and GlobalAddress

Patch by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 207282
2014-04-25 22:22:01 +00:00
Reed Kotler 5c7f91e42f enable fast isel tablegen files for Mips
Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3498

llvm-svn: 207256
2014-04-25 18:36:38 +00:00
Duncan P. N. Exon Smith d2b2facb07 SCC: Change clients to use const, NFC
It's fishy to be changing the `std::vector<>` owned by the iterator, and
no one actual does it, so I'm going to remove the ability in a
subsequent commit.  First, update the users.

<rdar://problem/14292693>

llvm-svn: 207252
2014-04-25 18:24:50 +00:00
Reed Kotler c041669927 Make sure that DSUB does not duplicate the pattern of DSUBU
Test Plan:
Run test suite to make sure there is no regression.
https://dmz-portal.mips.com/bb/builders/LLVM%20with%2064bit%20and%20delay%20slot%20optimizer%20and%20direct%20object%20emitter/builds/626

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3497

llvm-svn: 207247
2014-04-25 18:05:00 +00:00
Saleem Abdulrasool 99f0d458c3 ARM: remove @llvm.arm.sevl
This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic
which provides a generic, extensible manner for adding hint instructions.  This
functionality can now be represented as @llvm.arm.hint(i32 5).

llvm-svn: 207246
2014-04-25 17:51:25 +00:00
Saleem Abdulrasool 7e7c2f9ca6 ARM: provide a new generic hint intrinsic
Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into
the instruction stream. This is particularly useful for generating IR from a
compiler where the user may inject an intrinsic (e.g. __yield). These are then
pattern substituted into the correct instruction which already existed.

llvm-svn: 207242
2014-04-25 17:24:24 +00:00
Tilmann Scheller 2c65bbddd8 [ARM64] When compiling for ELF in PIC mode, local symbols shouldn't go through the GOT
There's no need for local symbols to go through the GOT, in fact it seems GNU ld is not even emitting GOT entries for local symbols and will error out when trying to resolve a GOT relocation for a local symbol.

This bug triggers when bootstrapping clang on AArch64 Linux with -fPIC and the ARM64 backend. The AArch64 backend is not affected.

With this commit it's now possible to bootstrap clang on AArch64 Linux with the ARM64 backend (-fPIC, -O3).

llvm-svn: 207226
2014-04-25 13:43:18 +00:00
Jiangning Liu 533b560bc6 [ARM64] Handle fp128 for parameter passing on stack
llvm-svn: 207222
2014-04-25 12:07:03 +00:00
Tim Northover eb7354fd3b ARM64: fix assertion in ISelDAGToDAG
Also an unused variable, so double bonus!

This should deal with PR19548.

llvm-svn: 207221
2014-04-25 10:48:47 +00:00
Bradley Smith 672df15122 [ARM64] Print preferred aliases for SFBM/UBFM in InstPrinter
llvm-svn: 207219
2014-04-25 10:25:29 +00:00
Kevin Qin 022d395c9c [ARM64] Add RUN lines for "–target arm64 –mattr=-fp-armv8" on AArch64 no-fp test.
This patch is a supplement of implementing predicate of FP, enabling aarch64 backend
no-fp tests on arm64 target for verification. During this, one bug is exposed and
fixed by this patch.

llvm-svn: 207215
2014-04-25 09:44:20 +00:00
Kevin Qin 0e7b07704e [ARM64] Support crc predicate on ARM64.
According to the specification, CRC is an optional extension of the
architecture.

llvm-svn: 207214
2014-04-25 09:25:42 +00:00
Saleem Abdulrasool d4cae62fda X86: convert object streamer selection to a switch
Change the object streamer selection to a switch from a series of if conditions.
Rather than defaulting to ELF, require that an ELF format is requested.  The
Windows/!ELF is maintained as MachO would have been selected first and will
still provide a MachO format.  Add an assertion that if COFF is requested that
the target platform is Windows as only WinCOFF object emission is currently
supported.

llvm-svn: 207200
2014-04-25 06:29:36 +00:00
Craig Topper 062a2baef0 [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Benjamin Kramer 76f753e9a9 X86: Don't transform shifts into ands when the sign bit is tested.
Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode.

llvm-svn: 207145
2014-04-24 20:51:37 +00:00
Reid Kleckner 5772b77789 Add 'musttail' marker to call instructions
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur.  It also comes with convervative IR
verification rules that ensure that tail call optimization is possible.

Reviewers: nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D3240

llvm-svn: 207143
2014-04-24 20:14:34 +00:00
Andrea Di Biagio d1ab866868 [X86] Add support for Read Time Stamp Counter x86 builtin intrinsics.
This patch:
- Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and
   'int_x86_rdtscp') as GCCBuiltin intrinsics;
- Teaches the backend how to lower the two new builtins;
- Introduces a common function to lower READCYCLECOUNTER dag nodes
  and the two new rdtsc/rdtscp intrinsics;
- Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll'
  correctly verifies that both READCYCLECOUNTER and the two new intrinsics
  work fine for both 64bit and 32bit Subtargets.

llvm-svn: 207127
2014-04-24 17:18:27 +00:00
Matt Arsenault 1018c897f6 R600/SI: Use address space in allowsUnalignedMemoryAccesses
llvm-svn: 207126
2014-04-24 17:08:26 +00:00
David Blaikie 908f4d4bf5 Spread some const around for non-mutating uses of MCSymbolData.
I discovered this const-hole while attempting to coalesnce the Symbol
and SymbolMap data structures. There's some pending issues with that,
but I figured this change was easy to flush early.

llvm-svn: 207124
2014-04-24 16:59:40 +00:00
Matheus Almeida 583a13cf36 [mips] Remove non-ascii character.
llvm-svn: 207123
2014-04-24 16:31:10 +00:00
Tim Northover 6331d4b975 AArch64: print NEON lists with a space.
This matches ARM64 behaviour, which I think is clearer. It also puts all the
churn from that difference into one easily ignored commit.

llvm-svn: 207116
2014-04-24 14:06:20 +00:00
Evgeniy Stepanov f4a36999ad [asan] Use MCInstrInfo in inline asm instrumentation.
Patch by Yuri Gorshenin.

llvm-svn: 207115
2014-04-24 13:29:34 +00:00
Tim Northover d702d6ac6f AArch64/ARM64: allow negative addends, at least on ELF.
llvm-svn: 207111
2014-04-24 12:56:38 +00:00
Tim Northover 624928134f ARM64: support relocated "TBZ/TBNZ" instructions.
llvm-svn: 207110
2014-04-24 12:56:34 +00:00
Tim Northover 0815a43e7c AArch64/ARM64: support relocated ADR instruction
llvm-svn: 207109
2014-04-24 12:56:30 +00:00
Tim Northover 597ccb200c AArch64/ARM64: add support for :abs_gN_s: MOVZ modifiers
We only need assembly support, so it's fairly easy.

llvm-svn: 207108
2014-04-24 12:56:27 +00:00
Tim Northover 49153037d4 ARM64: shut up warning about variable only used in assert.
llvm-svn: 207106
2014-04-24 12:22:12 +00:00
Tim Northover 79ec019261 AArch64/ARM64: disentangle the "B.CC" and "LDR lit" operands
These can have different relocations in ELF. In particular both:

    b.eq global
    ldr x0, global

are valid, giving different relocations. The only possible way to distinguish
them is via a different fixup, so the operands had to be separated throughout
the backend.

llvm-svn: 207105
2014-04-24 12:12:10 +00:00
Tim Northover eb6611e727 AArch64/ARM64: implement BFI optimisation
ARM64 was not producing pure BFI instructions for bitfield insertion
operations, unlike AArch64. The approach had to be a little different (in
ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but
hopefully this gives it similar power.

This should address PR19424.

llvm-svn: 207102
2014-04-24 12:11:53 +00:00
Evgeniy Stepanov b6c47a5bd2 [asan] Fix instrumentation of x86 intel syntax inline assembly.
Patch by Yuri Gorshenin.

llvm-svn: 207092
2014-04-24 09:56:15 +00:00
Benjamin Kramer f4575db2fd X86: Emit test instead of constant shift + compare if the shift result is unused.
This allows us to compile
  return (mask & 0x8 ? a : b);
into
  testb $8, %dil
  cmovnel %edx, %esi
instead of
  andl  $8, %edi
  shrl  $3, %edi
  cmovnel %edx, %esi

which we formed previously because dag combiner canonicalizes setcc of and into shift.

llvm-svn: 207088
2014-04-24 08:15:31 +00:00
Stepan Dyatkovskiy 00dcc0f53c Fix for PR18921, "vmov" part.
Added support for bytes replication feature, so it could be GAS compatible.

E.g. instructions below:
"vmov.i32 d0, 0xffffffff"
"vmvn.i32 d0, 0xabababab"
"vmov.i32 d0, 0xabababab"
"vmov.i16 d0, 0xabab"
are incorrect, but we could deal with such cases.

For first one we should emit:
"vmov.i8 d0, 0xff"
For second one ("vmvn"):
"vmov.i8 d0, 0x54"
For last two instructions it should emit:
"vmov.i8 d0, 0xab"

P.S.: In ARMAsmParser.cpp I have also fixed few nearby style issues in old code.
Just for keeping method bodies in harmony with themselves.

llvm-svn: 207080
2014-04-24 06:03:01 +00:00
Quentin Colombet ef86b4067c [ARM64] Fix the information we give to the peephole optimizer for comparison.
ANDS does not use the same encoding scheme as other xxxS instructions (e.g.,
ADDS). Take that into account to avoid wrong peephole optimization.

<rdar://problem/16693089>

llvm-svn: 207020
2014-04-23 20:43:38 +00:00
Quentin Colombet 04f7b74c39 [X86] Fix missing/wrong scheduling model found by code inspection.
llvm-svn: 207014
2014-04-23 19:30:26 +00:00
NAKAMURA Takumi d5696915d4 X86AsmParser.cpp: Fix memory leak at replacing movsd to movsl.
llvm-svn: 206991
2014-04-23 14:51:35 +00:00
Evgeniy Stepanov 0a951b775e Create MCTargetOptions.
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.

Patch by Yuri Gorshenin.

llvm-svn: 206971
2014-04-23 11:16:03 +00:00
James Molloy 029de8b769 [ARM64] Fix formatting.
llvm-svn: 206967
2014-04-23 10:50:32 +00:00
James Molloy 650cb57067 [ARM64] Add a big endian version of the ARM64 target machine, and update all users.
This completes the porting of r202024 (cpirker "Add AArch64 big endian Target (aarch64_be)") to ARM64.

llvm-svn: 206965
2014-04-23 10:26:40 +00:00
Alexey Volkov 9511327db8 Fixing typos in commit r206957
Differential Revision: http://reviews.llvm.org/D3451

llvm-svn: 206960
2014-04-23 10:20:31 +00:00
Alexey Volkov 0e55a99c0f [X86] Silvermont new scheduler model
This model is not final and work is still in progress.
However there are substantial improvements on integer tests mainly because of better RAL with new scheduler.

Differential Revision: http://reviews.llvm.org/D3451

llvm-svn: 206957
2014-04-23 08:57:09 +00:00
Elena Demikhovsky 8ac0bf96f0 X86Disassembler - fixed a bug in immediate print
llvm-svn: 206953
2014-04-23 07:21:04 +00:00
Kevin Qin a4ee178762 [ARM64] Enable feature predicates for NEON / FP / CRYPTO.
AArch64 has feature predicates for NEON, FP and CRYPTO instructions.
This allows the compiler to generate code without using FP, NEON
or CRYPTO instructions.

llvm-svn: 206949
2014-04-23 06:22:48 +00:00
Kevin Enderby 96918bc406 Fix the assembler to print a better relocatable expression error
diagnostic that includes location information.

Currently if one has this assembly:

	.quad (0x1234 + (4 * SOME_VALUE))

where SOME_VALUE is undefined ones gets the less than
useful error message with no location information:

% clang -c x.s
clang -cc1as: fatal error: error in backend: expected relocatable expression

With this fix one now gets a more useful error message
with location information:

% clang -c x.s 
x.s:5:8: error: expected relocatable expression
 .quad (0x1234 + (4 * SOME_VALUE))
       ^

To do this I plumbed the SMLoc through the MCObjectStreamer
EmitValue() and EmitValueImpl() interfaces so it could be used
when creating the MCFixup.

rdar://12391022

llvm-svn: 206906
2014-04-22 17:27:29 +00:00
Matt Arsenault 16353871c3 R600: Emit error instead of unreachable on function call
llvm-svn: 206904
2014-04-22 16:42:00 +00:00
Tom Stellard 8d6d449756 R600/SI: Reorganize SIInstructions.td
llvm-svn: 206902
2014-04-22 16:33:57 +00:00
Elena Demikhovsky acc5c9e83e AVX-512: store and truncstore for i1 values
llvm-svn: 206897
2014-04-22 14:13:10 +00:00
Tim Northover a962398a3f AArch64/ARM64: make use of ANDS and BICS instructions for comparisons.
llvm-svn: 206888
2014-04-22 12:45:42 +00:00
Lang Hames 64f6ebb8a9 [X86] Require HasBMI2 for the new BZHI tablegen patterns.
Evidently tablegen doesn't infer this from the HasBMI2 predicate on the BZHI
instructions. This should fix the recent bot failures.

llvm-svn: 206885
2014-04-22 12:04:53 +00:00
Robert Khasanov 189e7fdcfb [AVX512] Implemented integer conversions up/down with masking.
Added encoding tests.

llvm-svn: 206884
2014-04-22 11:36:19 +00:00
Lang Hames 70fa72d340 [X86] Remove Tablegen def of X86bzhi SDNode: It's not needed as of r206879.
llvm-svn: 206880
2014-04-22 10:50:46 +00:00
Lang Hames 3067ab2344 [X86] Use tablegen instead of DAG combines to match BZHI instructions, as
suggested by Ben Kramer in review of r206738.

Thanks again Ben!

llvm-svn: 206879
2014-04-22 10:41:56 +00:00
Matheus Almeida 2852af8a00 [mips] Clang-format MipsAsmParser.
No functional changes.

llvm-svn: 206878
2014-04-22 10:15:54 +00:00
Tim Northover 00b4ee848f AArch64/ARM64: add patterns for scalar_to_vector/extract pairs
llvm-svn: 206876
2014-04-22 10:10:18 +00:00
Tim Northover 978d25f391 ARM: disable emission of __XYZvfp in soft-float environment.
The point of these calls is to allow Thumb-1 code to make use of the VFP unit
to perform its operations. This is not desirable with -msoft-float, since most
of the reasons you'd want that apply equally to the runtime library.

rdar://problem/13766161

llvm-svn: 206874
2014-04-22 10:10:09 +00:00
Lang Hames f6f42cac3f [X86] Don't use BZHI for short masks (>=32 bits). Thanks to Ben Kramer for the
review.

llvm-svn: 206869
2014-04-22 07:40:34 +00:00
Matt Arsenault a3c8cde77b R600: Change how vector truncating stores are packed.
Don't introduce new operations on an illegal sub 32-bit type.
Do the operations on a 32-bit value, and then use a truncating store.

llvm-svn: 206864
2014-04-22 04:11:14 +00:00
Matt Arsenault 5dbd5db518 R600: Make sign_extend_inreg legal.
Don't know why I didn't just do this in the first place.

llvm-svn: 206862
2014-04-22 03:49:30 +00:00
Jiangning Liu 87486e0bac [AArch64] Enable global merge pass.
llvm-svn: 206861
2014-04-22 03:33:26 +00:00
Chandler Carruth 84e68b2994 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Target/...
edition.

llvm-svn: 206842
2014-04-22 02:41:26 +00:00
Chandler Carruth ff55593c40 [cleanup] Fix two headers where we included a standard library header
after including the generated code from tablegen.

llvm-svn: 206841
2014-04-22 02:28:45 +00:00
Chandler Carruth b5e481ac91 [cleanup] Fix another place where we were including the tablegen'ed code
of a '.inc' file before including actual headers. In this case we had
both duplicated a header's include and were including a standard header.

llvm-svn: 206840
2014-04-22 02:25:17 +00:00
Chandler Carruth d174b72a28 [cleanup] Lift using directives, DEBUG_TYPE definitions, and even some
system headers above the includes of generated '.inc' files that
actually contain code. In a few targets this was already done pretty
consistently, but it wasn't done *really* consistently anywhere. It is
strictly cleaner IMO and necessary in a bunch of places where the
DEBUG_TYPE is referenced from the generated code. Consistency with the
necessary places trumps. Hopefully the build bots are OK with the
movement of intrin.h...

llvm-svn: 206838
2014-04-22 02:03:14 +00:00
Chandler Carruth e96dd8975f [Modules] Make Support/Debug.h modular. This requires it to not change
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.

This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:

- Header files that need to provide a DEBUG_TYPE for some inline code
  can do so by defining the macro before their inline code and undef-ing
  it afterward so the macro does not escape.

- We no longer have rampant ODR violations due to including headers with
  different DEBUG_TYPE definitions. This may be mostly an academic
  violation today, but with modules these types of violations are easy
  to check for and potentially very relevant.

Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.

The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.

llvm-svn: 206822
2014-04-21 22:55:11 +00:00
Jim Grosbach 9446534025 ARM64: Refactor away a few redundant helpers.
The comment claimed that the register class information wasn't available
in the assembly parser, but that's not really true. It's just annoying to
get to. Replace the helper functions with references to the auto-generated
information.

llvm-svn: 206802
2014-04-21 22:13:57 +00:00
Jim Grosbach 9515c52294 ARM64: Improve diagnostics for malformed reg+reg addressing mode.
Make sure only general purpose registers are valid for offset regs and
that 32-bit regs are only valid for sxtw and uxtw extends.

llvm-svn: 206799
2014-04-21 21:45:57 +00:00
Jim Grosbach ac901086e5 Move helper functions earlier in the file.
No functional change.

llvm-svn: 206798
2014-04-21 21:45:53 +00:00
Jim Grosbach 9d205d42f3 ARM64: Extended addressing mode source reg is 64-bit.
The canonical form for the extended addressing mode (e.g.,
"[x1, w2, uxtw #3]" is for the MCInst to have the second register be the
full 64-bit GPR64 register class. The instruction printer cleans up
the output for display to show the 32-bit register instead, per the
specification.

This simplifies 205893 now that the aliasing is handled in the printer
in 206495 so that the codegen path and the disassembler path give the
same MCInst form.

llvm-svn: 206797
2014-04-21 21:45:44 +00:00
Rafael Espindola 6c76d1d7df Handle _GLOBAL_OFFSET_TABLE_ in 64 bit mode.
With this MC is able to handle _GLOBAL_OFFSET_TABLE_ in 64 bit mode, which is
needed for medium and large code models.

This fixes pr19470.

llvm-svn: 206793
2014-04-21 21:15:45 +00:00
Rafael Espindola 83752535ea clang-format this function.
No functionality change, it will just make the next patch easier to read.

llvm-svn: 206792
2014-04-21 21:00:58 +00:00
David Blaikie 422b93dcf1 Use unique_ptr to manage objects owned by the ScheduleDAGMI.
llvm-svn: 206784
2014-04-21 20:32:32 +00:00
Filipe Cabecinhas 20352216fb Rename X86insrtps to the proper instruction name.
Summary:
The INSERTPS pattern fragment was called insrtps (mising 'e'), which
would make it harder to grep for the patterns related to this instruction.
Renaming it to use the proper instruction name.

Reviewers: nadav

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3443

llvm-svn: 206779
2014-04-21 20:07:29 +00:00
Chandler Carruth a4a2066482 [Modules] Consolidate the DEBUG_TYPE defines in NVPTX to the top of the
cpp file rather than in the header and then again in the cpp file.

llvm-svn: 206778
2014-04-21 19:53:55 +00:00
Yi Jiang d069f6393a ARM64: Combine shifts and uses from different basic block to bit-extract instruction
llvm-svn: 206774
2014-04-21 19:34:27 +00:00
NAKAMURA Takumi 62774f3524 Appease autoconf build since X86Disassembler.c has been disappeared in r206717.
It can be reverted a few days later, after X86Disassembler.d is updated not to contain "X86Disassembler.c".

llvm-svn: 206758
2014-04-21 14:59:11 +00:00
Michael Zolotukhin f2ba994bf6 Reapply r206732. This time without optimization of branches.
llvm-svn: 206749
2014-04-21 12:01:33 +00:00
Benjamin Kramer d2da720ead [C++11] Replace OwningPtr with std::unique_ptr in places where it doesn't break the API.
No functionality change.

llvm-svn: 206740
2014-04-21 09:34:48 +00:00
Lang Hames 5aa6ee80b6 [X86] ISEL (and X, <constant mask>) to BZHI when BMI2 is available.
Generating BZHI in the variable mask case, i.e. (and X, (sub (shl 1, N), 1)),
was already supported, but we were missing the constant-mask case. This patch
fixes that.

<rdar://problem/15480077>

llvm-svn: 206738
2014-04-21 08:18:53 +00:00
Chandler Carruth a2533a7bef Revert r206732 which is causing llc to crash on most of the build bots.
Original commit message:
  Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN,
  safe.srem.iN, safe.urem.iN (iN = i8, i61, i32, or i64).

llvm-svn: 206735
2014-04-21 07:11:15 +00:00
Michael Zolotukhin 137a84616c Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN,
safe.urem.iN (iN = i8, i16, i32, or i64).

llvm-svn: 206732
2014-04-21 05:33:09 +00:00
Richard Smith 5d50610306 C++ has a bool type! (And C's had one too, for 15 years...)
llvm-svn: 206723
2014-04-20 22:15:37 +00:00
Richard Smith 6a6967eeaf More C++ification.
llvm-svn: 206722
2014-04-20 22:10:16 +00:00
Richard Smith 3c3410f139 Remove some more C junk from these files.
llvm-svn: 206721
2014-04-20 21:56:02 +00:00
Richard Smith ac15f1cda3 Don't provide two different definitions of ModRMDecision, OpcodeDecision, and ContextDecision in different source files (depending on #define magic).
llvm-svn: 206720
2014-04-20 21:52:16 +00:00
Richard Smith 82b47d5660 Don't define llvm::X86Disassembler::InstructionSpecifier in different ways in
different source files.

llvm-svn: 206719
2014-04-20 21:35:26 +00:00
Richard Smith 555134215b Maybe if I touch this file the buildbots will actually rerun configure like they need to...
llvm-svn: 206718
2014-04-20 21:28:33 +00:00
Richard Smith 89ee75d786 What year is it! This file has no reason to be written in C, and has doubly no
reason to expose a global symbol 'decodeInstruction' nor to pollute the global
scope with a bunch of external linkage entities (some of which conflict with
others elsewhere in LLVM).

This is just the initial transition to C++; more cleanups to follow.

llvm-svn: 206717
2014-04-20 21:07:34 +00:00
Alp Toker 9844434151 Remove some empty statements
Cleanup only.

llvm-svn: 206710
2014-04-19 23:56:35 +00:00
Yaron Keren d7ba46b287 Patch by Vadim Chugunov
Win64 stack unwinder gets confused when execution flow "falls through" after
a call to 'noreturn' function. This fixes the "missing epilogue" problem by 
emitting a trap instruction for IR 'unreachable' on x86_x64-pc-windows.

A secondary use for it would be for anyone wanting to make double-sure that
'noreturn' functions, indeed, do not return.

llvm-svn: 206684
2014-04-19 13:47:43 +00:00
Kevin Enderby b7e51f6af5 Change the ARM assembler to require a :lower16: or :upper16 on non-constant
expressions for mov instructions instead of silently truncating by default.

For the ARM assembler, we want to avoid misleadingly allowing something
like "mov r0, <symbol>" especially when we turn it into a movw and the
expression <symbol> does not have a :lower16: or :upper16" as part of the
expression.  We don't want the behavior of silently truncating, which can be
unexpected and lead to bugs that are difficult to find since this is an easy
mistake to make.

This does change the previous behavior of llvm but actually matches an
older gnu assembler that would not allow this but print less useful errors
of like “invalid constant (0x927c0) after fixup” and “unsupported relocation on
symbol foo”.  The error for llvm is "immediate expression for mov requires
:lower16: or :upper16" with correct location information on the operand
as shown in the added test cases.

rdar://12342160

llvm-svn: 206669
2014-04-18 23:06:39 +00:00
Chad Rosier 9149acb053 [ARM64] Ports the Cortex-A53 Machine Model description from AArch64.
Summary:
This port includes the rudimentary latencies that were provided for
the Cortex-A53 Machine Model in the AArch64 backend. It also changes
the SchedAlias for COPY in the Cyclone model to an explicit
WriteRes mapping to avoid conflicts in other subtargets.

Differential Revision: http://reviews.llvm.org/D3427
Patch by Dave Estes <cestes@codeaurora.org>!

llvm-svn: 206652
2014-04-18 21:22:04 +00:00
Adam Nemet ee7a3e38c9 [X86] Improve buildFromShuffleMostly for AVX
For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors,
both the BUILD_VECTOR and its operands may need to be legalized in multiple
steps.  Consider:

(v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0,) Constant<1>),
                     (extract_vector_elt %vreg0, Constant<2>),
                     (extract_vector_elt %vreg0, Constant<3>),
                     (extract_vector_elt %vreg0, Constant<4>),
                     (extract_vector_elt %vreg0, Constant<5>),
                     (extract_vector_elt %vreg0, Constant<6>),
                     (extract_vector_elt %vreg0, Constant<7>),
                     %vreg1))

a. We can't build a 256-bit vector efficiently so, we need to split it into
two 128-bit vecs and combine them with VINSERTX128.

b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) needs to be
split into a VEXTRACTX128 and a further extract_vector_elt from the
resulting 128-bit vector.

c. The extract_vector_elt from b. is lowered into a shuffle to the first
element and a movss.

Depending on the order in which we legalize the BUILD_VECTOR and its
operands[1], buildFromShuffleMostly may be faced with:

(v4f32 (BUILD_VECTOR (extract_vector_elt
                      (vector_shuffle<1,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                      Constant<0>),
                     (extract_vector_elt
                      (vector_shuffle<2,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                      Constant<0>),
                     (extract_vector_elt
                      (vector_shuffle<3,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                      Constant<0>),
                     %vreg1))

In order to figure out the underlying vector and their identity we need to see
through the shuffles.

[1] Note that the order in which operations and their operands are legalized is
only guaranteed in the first iteration of LegalizeDAG.

Fixes <rdar://problem/16296956>

llvm-svn: 206634
2014-04-18 19:44:16 +00:00
Tim Northover 37d9a9cebf ARM64: disable generation of .loh directives outside MachO.
Part of PR19455.

llvm-svn: 206611
2014-04-18 14:54:46 +00:00
Tim Northover be1d1b6681 ARM64: don't emit .subsections_via_symbols on ELF.
Part of PR19455.

llvm-svn: 206610
2014-04-18 14:54:41 +00:00
Tim Northover be3941cc79 ARM64: add extra NEG pattern.
llvm-svn: 206609
2014-04-18 14:54:35 +00:00
Tim Northover e3028832d1 AArch64/ARM64: add non-scalar lowering for more FCVT operations.
llvm-svn: 206591
2014-04-18 13:16:42 +00:00
Tim Northover 01f315a556 AArch64/ARM64: improve spotting of EXT instructions from VECTOR_SHUFFLE.
We couldn't cope if the first mask element was UNDEF before, which
isn't ideal.

llvm-svn: 206588
2014-04-18 12:50:58 +00:00
Benjamin Kramer e6c821ef4c X86: Pattern match scalar loads + vcvtph2ps into just vcvtph2ps.
vcvtph2ps only reads the lower 64 bits of the address passed to the
intrinsic.

llvm-svn: 206579
2014-04-18 10:45:33 +00:00
Tim Northover a2c4c71c12 AArch64/ARM64: spot a greater variety of concat_vector operations.
Code mostly copied from AArch64, just tidied up a trifle and plumbed
into the ARM64 way of doing things.

This also enables the AArch64 tests which inspired the previous
untested commits.

llvm-svn: 206574
2014-04-18 09:31:27 +00:00
Tim Northover 848bb3ced5 ARM64: implement cunning optimisation from AArch64
A vector extract followed by a dup can become a single instruction even if the
types don't match. AArch64 handled this in ISelLowering, but a few reasonably
simple patterns can take care of it in TableGen, so that's where I've put it.

llvm-svn: 206573
2014-04-18 09:31:20 +00:00
Tim Northover 5ec51a8981 ARM64: spot a vector_shuffle that maps to INS and expand.
Tests will be coming very shortly when all the optimisations needed to
support AArch64's neon-copy.ll file are committed.

llvm-svn: 206572
2014-04-18 09:31:15 +00:00
Tim Northover 46d98ea8de ARM64: nick some AArch64 patterns for extract/insert -> INS.
Tests will be committed shortly when all optimisations needed to
support AArch64's neon-copy.ll file are supported.

llvm-svn: 206571
2014-04-18 09:31:11 +00:00
Tim Northover 8b2fa3dfef AArch64/ARM64: emit all vector FP comparisons as such.
ARM64 was scalarizing some vector comparisons which don't quite map to
AArch64's compare and mask instructions. AArch64's approach of sacrificing a
little efficiency to emulate them with the limited set available was better, so
I ported it across.

More "inspired by" than copy/paste since the backend's internal expectations
were a bit different, but the tests were invaluable.

llvm-svn: 206570
2014-04-18 09:31:07 +00:00
Tim Northover 0a44e66bb8 AArch64/ARM64: port BSL logic from AArch64 & enable test.
I enhanced it a little in the process. The decision shouldn't really be beased
on whether a BUILD_VECTOR is a splat: any set of constants will do the job
provided they're related in the correct way.

Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's
best to check for all 4 possibilities rather than assuming it'll be the RHS.

llvm-svn: 206569
2014-04-18 09:31:01 +00:00
Tim Northover 547a4ae6fa AArch64/ARM64: copy byval implementation from AArch64.
It's not actually used to handle C or C++ ABI rules on ARM64, but could well be
emitted by other language front-ends, so it's as well to have a sensible
implementation.

llvm-svn: 206568
2014-04-18 09:30:52 +00:00
Jiangning Liu ad874fca28 This commit allows vectorized loops to be unrolled by a factor of 2 for AArch64.
A new test case is also added for ARM64.

Patched by Z.Zheng

llvm-svn: 206563
2014-04-18 07:57:54 +00:00
Matt Arsenault 209a7b92b5 R600: Minor cleanups.
Fix indentation, better line wrapping, unused includes.

llvm-svn: 206562
2014-04-18 07:40:20 +00:00
Jiangning Liu 40d81e10c5 This is one of the optimizations ported from ARM64 to AArch64 to address the performance gap between these two back ends. The test case newly added for AArch64 already exists in ARM64.
Patched by Z.Zheng

llvm-svn: 206559
2014-04-18 05:58:09 +00:00
Matt Arsenault 78b8670aac R600/SI: Try to use scalar BFE.
Use scalar BFE with constant shift and offset when possible.
This is complicated by the fact that the scalar version packs
the two operands of the vector version into one.

llvm-svn: 206558
2014-04-18 05:19:26 +00:00
Jiangning Liu e56c30614f This commit enables unaligned memory accesses of vector types on AArch64 back end. This should boost vectorized code performance.
Patched by Z. Zheng

llvm-svn: 206557
2014-04-18 03:58:38 +00:00
Matt Arsenault 27cc958dff R600/SI: Match sign_extend_inreg to s_sext_i32_i8 and s_sext_i32_i16
llvm-svn: 206547
2014-04-18 01:53:18 +00:00
Tom Stellard 1aa6cb4d88 R600/SI: Use SReg_64 instead of VSrc_64 when selecting BUILD_PAIR
llvm-svn: 206541
2014-04-18 00:36:21 +00:00
Jim Grosbach 6bfe18a365 [ARM64,C++11] Range'ify another loop.
llvm-svn: 206539
2014-04-17 23:41:57 +00:00
Reed Kotler 720c5ca4ea Start pushing changes for Mips Fast-Isel
llvm-svn: 206505
2014-04-17 22:15:34 +00:00
Tom Stellard aeeea8a864 R600: Add comment clariying use of sext for result of MUL_U24
llvm-svn: 206501
2014-04-17 21:00:13 +00:00
Tom Stellard 868fd92e54 R600/SI: Stop using i128 as the resource descriptor type
Having i128 as a legal type complicates the legalization phase.  v4i32
is already a legal type, so we will use that instead.

This fixes several piglit tests.

llvm-svn: 206500
2014-04-17 21:00:11 +00:00
Tom Stellard 334b29c7f6 R600/SI: Change default register class for i32 to SReg_32
SIFixSGPRCopies is smart enough to handle this now.

llvm-svn: 206499
2014-04-17 21:00:09 +00:00
Tom Stellard 4f3b04de21 R600/SI: Teach SIInstrInfo::moveToVALU() how to handle PHI instructions
llvm-svn: 206498
2014-04-17 21:00:07 +00:00
Tom Stellard e1a244502c R600/SI: Legalize operands after changing dst reg in FixSGPRCopies
Otherwise we may not legalize some illegal REG_SEQUENCE instructions.

llvm-svn: 206497
2014-04-17 21:00:01 +00:00
Louis Gerbarg 153e695ee2 Improve ARM64 vector creation
This patch improves the performance of vector creation in caseiswhere where
several of the lanes in the vector are a constant floating point value. It
also includes new patterns to fold together some of the instructions when the
value is 0.0f. Test cases included.

rdar://16349427

llvm-svn: 206496
2014-04-17 20:51:50 +00:00
Jim Grosbach 0fba6d98fc ARM64: [su]xtw use W regs as inputs, not X regs.
Update the SXT[BHW]/UXTW instruction aliases and the shifted reg addressing
mode handling.

PR19455 and rdar://16650642

llvm-svn: 206495
2014-04-17 20:47:31 +00:00
Tim Northover 11a6082e33 ARM64: switch to IR-based atomic operations.
Goodbye code!

(Game: spot the bug fixed by the change).

llvm-svn: 206490
2014-04-17 20:00:33 +00:00
Tim Northover 0129f298c4 ARM64: add acquire/release versions of the existing atomic intrinsics.
These will be needed to support IR-level lowering of atomic
operations.

llvm-svn: 206489
2014-04-17 20:00:24 +00:00
Tim Northover 037f26f212 Atomics: promote ARM's IR-based atomics pass to CodeGen.
Still only 32-bit ARM using it at this stage, but the promotion allows
direct testing via opt and is a reasonably self-contained patch on the
way to switching ARM64.

At this point, other targets should be able to make use of it without
too much difficulty if they want. (See ARM64 commit coming soon for an
example).

llvm-svn: 206485
2014-04-17 18:22:47 +00:00
Matt Arsenault a90d22fad5 R600/SI: f64 frint is legal on CI
llvm-svn: 206475
2014-04-17 17:06:37 +00:00
Chad Rosier c4eb4f8827 [AArch64] Implement the getCSRFirstUseCost API, mirroring that in ARM64.
llvm-svn: 206473
2014-04-17 16:19:54 +00:00
Craig Topper 0a9bf4c0c5 [X86] Add disassembler support for the 0x0f 0x7f form of movq %mm, %mm.
llvm-svn: 206447
2014-04-17 06:33:45 +00:00
Matt Arsenault 51df0c1965 R600/SI: Fix zext from i1 to i64
llvm-svn: 206437
2014-04-17 02:03:08 +00:00
Adam Nemet 287f989dde [ARM64] Fix "Cannot select" for vector ctpop
The commit of r205855:

Author: Arnold Schwaighofer <aschwaighofer@apple.com>
Date:   Wed Apr 9 14:20:47 2014 +0000

    SLPVectorizer: Only vectorize intrinsics whose operands are widened equally

    The vectorizer only knows how to vectorize intrinics by widening all operands by
    the same factor.

    Patch by Tyler Nowicki!

exposed a backend bug causing a regression (Cannot select ctpop).

The commit msg is a bit confusing because the patch actually changes the
behavior for the loop-vectorizer as well.  As things got refactored into a
helper ctpop got snuck in to the trivially-vectorizable helper which is now
used by both vectorizers.  In other words, we started seeing vector-ctpops in
the backend.

This change makes ctpop LegalizeAction::Expand for the types not supported by
the byte-only CNT instruction.  We may be able to custom-lower these later to
a single CNT but this is to fix the compiler crash first.

Fixes <rdar://problem/16578951>

llvm-svn: 206433
2014-04-17 01:01:37 +00:00
Aaron Ballman 5f1378c2a4 Replacing a non-ASCII character in a comment with an ASCII character. Fixes a C4819 warning in MSVC.
llvm-svn: 206403
2014-04-16 17:09:20 +00:00
Matheus Almeida 483d7e9349 [mips] Use TwoOperandAliasConstraint for shift instructions.
This enables TableGen to generate an additional two operand
matcher for our shift_rotate_imm and shift_rotate_reg class of instructions.

The tests were also updated so that they include now encoding information
for all affected instructions.

llvm-svn: 206398
2014-04-16 16:28:59 +00:00
Matheus Almeida 0051f2dc78 [mips] Add initial support for NaN2008 in the back-end.
This is so that EF_MIPS_NAN2008 is set if we are using IEEE 754-2008
NaN encoding (-mnan=2008). This patch also adds support for parsing
'.nan legacy' and '.nan 2008' assembly directives. The handling of
these directives should match GAS' behaviour i.e., the last directive
in use sets the ELF header bit (EF_MIPS_NAN2008).

Differential Revision: http://reviews.llvm.org/D3346

llvm-svn: 206396
2014-04-16 15:48:55 +00:00
Tim Northover ef7b34d403 ARM64: silence sign-comparison warning.
llvm-svn: 206393
2014-04-16 15:28:06 +00:00
Tim Northover 3e69958b6b AArch64/ARM64: produce correct relocation for conditional branches.
llvm-svn: 206391
2014-04-16 15:27:52 +00:00
Daniel Sanders 82cd99a126 [mips] Indentation
llvm-svn: 206389
2014-04-16 14:38:27 +00:00
Daniel Sanders 16fa1db637 [mips] Fix emission of '.option pic0' for MIPS-IV.
Summary: This was a case of incorrect usage of hasMips64() vs isABI_N64()

Reviewers: matheusalmeida, dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3398

llvm-svn: 206388
2014-04-16 13:58:57 +00:00
Daniel Sanders a024fb0e04 [mips] Correct r206370 to account for non-Linux targets using the small data section.
This should fix the ninja-x64-msvc-RA-centos6 builder.

I suspect the check in MipsSubtarget.cpp is incorrect and is really trying to
check for a bare-metal target rather and anything other than linux. I'll
investigate this.

llvm-svn: 206385
2014-04-16 12:29:08 +00:00
Tim Northover 3ec1de7767 AArch64/ARM64: port across stub handling for ELF C++ exceptions.
The most important part here is that we should actuall emit the stubs we refer
to in the exception table, but as a side issue this uses more sensible & GCC
compatible representations for some of the bits of information.

llvm-svn: 206380
2014-04-16 11:52:55 +00:00
Tim Northover 18f68f6d1a ARM64: use 32-bit moves for constants where possible.
If we know that a particular 64-bit constant has all high bits zero, then we
can rely on the fact that 32-bit ARM64 instructions automatically zero out the
high bits of an x-register. This gives the expansion logic less constraints to
satisfy and so sometimes allows it to pick better sequences.

Came up while porting test/CodeGen/AArch64/movw-consts.ll: this will allow a
32-bit MOVN to be used in @test8 soon.

llvm-svn: 206379
2014-04-16 11:52:51 +00:00
Tim Northover 9cfb57dafa ARM64: use the integrated assembler on ELF.
llvm-svn: 206378
2014-04-16 11:52:40 +00:00
Matheus Almeida dc7e48e084 [mips] Emit '.set nomicromips' before a function's entry label
if not in micromips mode.

The test (elf_st_other.ll) was renamed as the name and description didn't
make sense as the test wasn't checking any symbol table entry.

Differential Revision: http://reviews.llvm.org/D3346

llvm-svn: 206377
2014-04-16 11:46:59 +00:00
Aaron Ballman 58ce7f24cd Fixing a compile error in debug versions of MSVC. It seems that the range-based for loop is confused by the DEBUG macro expansion unless a compound statement is used.
llvm-svn: 206376
2014-04-16 11:15:57 +00:00
Daniel Sanders 11c0c067c2 [mips] Correct callee saved list for the N32 ABI and enable test
Summary: Depends on D3339

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3340

llvm-svn: 206371
2014-04-16 10:23:37 +00:00
Tim Northover 97c5b6fe4f ARM64: mark x7 as used when an i128 gets shunted onto the stack.
The second half of a split i128 was ending up in x7, which is not a good thing.

This is another part of PR19432.

llvm-svn: 206366
2014-04-16 09:03:25 +00:00
Craig Topper abb4ac7f87 Convert SelectionDAG::getVTList to use ArrayRef
llvm-svn: 206357
2014-04-16 06:10:51 +00:00
Saleem Abdulrasool 0d3d6c45ef Target: whitespace
llvm-svn: 206353
2014-04-16 04:15:25 +00:00
Matt Arsenault 4e46665a80 R600: Expand sign extension of vectors.
Setting vector types to expand will result in scalarization on pre SI hw,
as those gpus don't have vector shifts either.
Expand also i32 vectors, this helps llvm make the correct decision
about scalarizing the vector ops.

v2: move setOperation() calls to R600ISelLowering.cpp.
    cleanup the SI code to make it obvious that this patch does is nop for SI

Patch by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 206348
2014-04-16 01:41:30 +00:00
Jim Grosbach 36c6a50512 [ARM64,C++11] Tidy up branch relaxation a bit w/ c++11.
No functional change.

llvm-svn: 206344
2014-04-16 00:42:46 +00:00
Jim Grosbach 01fc5887ad ARM64: Nuke some dead code.
Missed in previous commit.

llvm-svn: 206343
2014-04-16 00:42:43 +00:00
Jim Grosbach 80633094f8 [ARM64,C++11] Clean up the ARM64 LOH collection pass.
Range'ify a bunch of loops, mainly. As a result, we have a variety
of objects via reference rather than by pointer, so propogate that
through the various helper functions where it makes sense.

llvm-svn: 206337
2014-04-15 22:57:02 +00:00
Matt Arsenault e500e32939 R600/SI: Print code size along with used registers
llvm-svn: 206336
2014-04-15 22:40:47 +00:00
Matt Arsenault 4d7d38333b R600/SI: Print more immediates in hex format
Print in decimal for inline immediates, and hex otherwise. Use hex
always for offsets in addressing offsets.

This approximately matches what the shader compiler does.

llvm-svn: 206335
2014-04-15 22:32:49 +00:00
Matt Arsenault fcf86c5417 R600/SI: Cleanup parsing of register names.
Try to figure out the class and number of subregisters.

llvm-svn: 206334
2014-04-15 22:32:42 +00:00
Matt Arsenault 470acd81a8 R600/SI: Fix loads of i1
llvm-svn: 206330
2014-04-15 22:28:39 +00:00
Andrea Di Biagio aac2eac4c2 [X86] Improve the lowering of packed shifts by constant build_vector.
This patch teaches the backend how to efficiently lower logical and
arithmetic packed shifts on both SSE and AVX/AVX2 machines.

When possible, instead of scalarizing a vector shift, the backend should try
to expand the shift into a sequence of two packed shifts by immedate count
followed by a MOVSS/MOVSD.

Example
  (v4i32 (srl A, (build_vector < X, Y, Y, Y>)))

Can be rewritten as:
  (v4i32 (MOVSS (srl A, <Y,Y,Y,Y>), (srl A, <X,X,X,X>)))

[with X and Y ConstantInt]

The advantage is that the two new shifts from the example would be lowered into
X86ISD::VSRLI nodes. This is always cheaper than scalarizing the vector into
four scalar shifts plus four pairs of vector insert/extract.

llvm-svn: 206316
2014-04-15 19:30:48 +00:00
Quentin Colombet 72dad56c53 [ARM64] Set default CPU to generic instead of cyclone.
llvm-svn: 206313
2014-04-15 19:08:46 +00:00
NAKAMURA Takumi e1f3583b96 MipsAsmParser.cpp: Fix vg_leak in MipsOperand::CreateMem(). Mem.Base is managed by k_Memory itself.
llvm-svn: 206293
2014-04-15 14:13:21 +00:00
NAKAMURA Takumi bd524ef129 MipsAsmParser::ParseRegister(): Be responsible to delete an Operand on a temporary Operands.
llvm-svn: 206292
2014-04-15 14:06:27 +00:00
Tim Northover ebb3123a5f AArch64/ARM64: add missing pattern for extending load.
llvm-svn: 206290
2014-04-15 14:00:19 +00:00
Tim Northover cbcb7a37f7 AArch64/ARM64: only mangle MOVZ/MOVN during encoding when needed
Sometimes we need emit the bits that would actually be a MOVN when producing a
relocated MOVZ instruction (don't ask). But not always, a check which ARM64 got
wrong until now.

llvm-svn: 206289
2014-04-15 14:00:15 +00:00
Tim Northover 6e27b8ded5 AArch64/ARM64: add support for large code-model jump tables.
I've left the MachO CodeGen as it is, there's a reasonable chance it should use
the GOT like ConstPools, but I'm not certain.

llvm-svn: 206288
2014-04-15 14:00:11 +00:00
Tim Northover 221b583951 AArch64/ARM64: add patterns for various commutations of FNMADD.
llvm-svn: 206287
2014-04-15 14:00:06 +00:00
Tim Northover b37cff1ae2 AArch64/ARM64: add half as a storage type on ARM64.
This brings it into line with the AArch64 behaviour and should open the way for
certain OpenCL features.

llvm-svn: 206286
2014-04-15 14:00:03 +00:00
Tim Northover 80a70a265a AArch64/ARM64: copy patterns for fixed-point conversions
Code is mostly copied directly across, with a slight extension of the
ISelDAGToDAG function so that it can cope with the floating-point constants
being behind a litpool.

llvm-svn: 206285
2014-04-15 13:59:57 +00:00
Tim Northover f70577b1cd ARM64: add constraints to various FastISel operations
llvm-svn: 206284
2014-04-15 13:59:53 +00:00
Tim Northover 2f553f326a FastISel: constrain the RegClass of operands when emitting instructions.
ARM64 suffered multiple -verify-machineinstr failures (principally over the
xsp/xzr issue) because FastISel was completely ignoring which subset of the
general-purpose registers each instruction required.

More fixes are coming in ARM64 specific FastISel, but this should cover the
generic problems.

llvm-svn: 206283
2014-04-15 13:59:49 +00:00
Tim Northover 20603726ce AArch64/ARM64: add dp tests from AArch64
llvm-svn: 206281
2014-04-15 13:59:40 +00:00
NAKAMURA Takumi 6091e1aed5 ARM64AsmParser.cpp: Fix vg_leak in MC/ARM64/fp-encoding.s.
llvm-svn: 206279
2014-04-15 13:22:11 +00:00
Stepan Dyatkovskiy 95cdac43af Optional hash symbol feature support for ARM64
http://reviews.llvm.org/D3328

llvm-svn: 206276
2014-04-15 11:43:09 +00:00
Vladimir Medic 16d671a413 Current definition of subtract with immediate instruction aliases uses CodeGenOnly defined instructions and post matcher expansion methods to emit real instructions add with immediate. However, they can directly alias add with immediate instruction and remove unnecessary definitions and code in MipsAsmParser.cpp. This patch makes no change in functionality, just removes unnecessary definitions and code.
llvm-svn: 206272
2014-04-15 10:14:49 +00:00
NAKAMURA Takumi df72764599 X86JITInfo: [x86] Rework r206240, X86CompilationCallback_SSE() should be called for SSE-enabled code generator, even if LLVM is not built with -msse.
llvm-svn: 206261
2014-04-15 08:28:23 +00:00
Nick Lewycky aad475b324 Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead.
llvm-svn: 206255
2014-04-15 07:22:52 +00:00
Lang Hames a1bc0f5662 [MC] Require an MCContext when constructing an MCDisassembler.
This patch re-introduces the MCContext member that was removed from
MCDisassembler in r206063, and requires that an MCContext be passed in at
MCDisassembler construction time. (Previously the MCContext member had been
initialized in an ad-hoc fashion after construction). The MCCContext member
can be used by MCDisassembler sub-classes to construct constant or
target-specific MCExprs.

This patch updates disassemblers for in-tree targets, and provides the
MCRegisterInfo instance that some disassemblers were using through the
MCContext (previously those backends were constructing their own
MCRegisterInfo instances).

llvm-svn: 206241
2014-04-15 04:40:56 +00:00
NAKAMURA Takumi 33ec29ace9 X86JITInfo: [x86] Use X86CompilationCallback_SSE() along;
*not* Subtarget->hasSSE1()
  *but* __SSE__, the flag that LLVM libraries are compiled

The callback calls internal LLVM JIT libraries. It may be built with -msse (or above).

FIXME: JIT may use "host" instead of "generic" by default.
llvm-svn: 206240
2014-04-15 04:12:21 +00:00
Jim Grosbach 2c6ff0cbb4 [ARM64,C++11]: Range'ify the dead-register-definition pass.
Range-based for loops. No functional change intended.

llvm-svn: 206239
2014-04-15 02:14:09 +00:00
Quentin Colombet f9b61e6afd [ARM64][MC] Set the default CPU string to generic.
llvm-svn: 206228
2014-04-15 00:28:39 +00:00
Jim Grosbach a344b6c314 X86: Nuke one more CPU autodetect blurb.
Missed one in r206094. This brings MC and TargetMachine back into sync.

llvm-svn: 206220
2014-04-14 22:23:30 +00:00
David Blaikie 9027abae53 Change argument order and add explanatory comment to r206130
Changes requested in code review by Eric Christopher of r206130.

llvm-svn: 206219
2014-04-14 22:23:06 +00:00
Eric Christopher b45b4814f6 Use FrameSetup on frame instructions for the Mips port.
I can't seem to get a testcase to show a difference here, but it's
part of the unconditional-br.ll line table weirdness.

llvm-svn: 206218
2014-04-14 22:21:22 +00:00
Quentin Colombet 4097c8959c [ARM64][MC] Set the default CPU to cyclone when initilizating the MC layer.
This matches that ARM64Subtarget does for now.

This is related to <rdar://problem/16573920>

llvm-svn: 206211
2014-04-14 21:25:53 +00:00
Louis Gerbarg cfc05450e5 Fix for codegen bug that could cause illegal cmn instruction generation
In rare cases the dead definition elimination pass code can cause illegal cmn
instructions when it replaces dead registers on instructions that use
unmaterialized frame indexes. This patch disables the dead definition
optimization for instructions which include frame index operands.

rdar://16438284

llvm-svn: 206208
2014-04-14 21:05:05 +00:00
Louis Gerbarg 6d2e3c638f Add a flag to disable the ARM64DeadRegisterDefinitionsPass
This patch adds a -arm64-dead-def-elimination flag so that it is possible to
disable dead definition elimination. Includes test case.

llvm-svn: 206207
2014-04-14 21:05:02 +00:00
James Molloy d60571bad7 [ARM64] Port over missing subtarget features, and CPU definitions from AArch64.
llvm-svn: 206198
2014-04-14 17:38:00 +00:00
Daniel Sanders 863c35a358 [mips] Fix fcopysign for MIPS-IV and add the test.
Summary:
This was another incorrect use of hasMips64() vs isGP64bit().

Depends on D3344

Reviewers: matheusalmeida, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3347

llvm-svn: 206187
2014-04-14 16:24:12 +00:00
Daniel Sanders 3d84935d28 [mips] Fix more incorrect uses of HasMips64 and isMips64()
Summary:
- Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64
- ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's

Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

I've added additional testcases to cover as much of the codegen changes
affecting MIPS-IV as I can. Where I've been unable to find an existing
MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering
ISD::GlobalAddress and similar), I at least agree that MIPS-IV should
behave like MIPS64. Further testcases that are fixed by this patch will follow
in my next commit. The testcases from that commit that fail for MIPS-IV without
this patch are:
    LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
    LLVM :: CodeGen/Mips/cmov.ll
    LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll
    LLVM :: CodeGen/Mips/largeimmprinting.ll
    LLVM :: CodeGen/Mips/longbranch.ll
    LLVM :: CodeGen/Mips/mips64-f128.ll
    LLVM :: CodeGen/Mips/mips64directive.ll
    LLVM :: CodeGen/Mips/mips64ext.ll
    LLVM :: CodeGen/Mips/mips64fpldst.ll
    LLVM :: CodeGen/Mips/mips64intldst.ll
    LLVM :: CodeGen/Mips/mips64load-store-left-right.ll
    LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll

Reviewers: dsanders

Reviewed By: dsanders

CC: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3343

llvm-svn: 206183
2014-04-14 15:44:42 +00:00
Tim Northover cb9c3cfb58 ARM64: remove buggy REV16 pattern.
The 32-bit pattern is still valid: 0123 -> 3210 -> 1032.

llvm-svn: 206172
2014-04-14 12:59:52 +00:00
Tim Northover b6abe806c7 AArch64/ARM64: enable directcond.ll test on ARM64.
Code change is because optimizeCompareInstr didn't know how to pull the
condition code out of FCSEL instructions.

llvm-svn: 206171
2014-04-14 12:51:06 +00:00
Tim Northover 0d7bd4f444 ARM64: add patterns for csXYZ with reversed operands.
AArch64 tests for this, and it's obviously a good idea. Have to invert the
condition code, of course.

llvm-svn: 206170
2014-04-14 12:51:02 +00:00
Tim Northover 2f48303436 ARM64: add support for AArch64's addsub_ext.ll
There was one definite issue in ARM64 (the off-by-1 check for whether
a shift could be folded in) and one difference that is probably
correct: ARM64 didn't fold nodes with multiple uses into the
arithmetic operations unless optimising for code size.

llvm-svn: 206168
2014-04-14 12:50:50 +00:00
Tim Northover 23b1f08282 ARM64: optimise (cmp x, (sub 0, y)) to (cmn x, y).
This transformation is only valid when being used for an EQ or NE
comparison since the flags change otherwise.

llvm-svn: 206167
2014-04-14 12:50:47 +00:00
Richard Osborne da16ff47cd [XCore] Don't create invalid MKMSK instructions inside loadImmediate().
Summary:
Previously loadImmediate() would produce MKMSK instructions with invalid
immediate values such as mkmsk r0, 9. Fix this by checking the mask size
is valid.

Reviewers: robertlytton

Reviewed By: robertlytton

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3289

llvm-svn: 206163
2014-04-14 12:30:35 +00:00
Hal Finkel 0192cbac66 [PowerPC] [Constant Hoisting] Enable constant hoisting on PPC
Implements the various TTI functions to enable constant hoisting on PPC. The
only significant test-suite change is this:

MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup
(which essentially reverses the slowdown from r206120).

llvm-svn: 206141
2014-04-13 23:02:40 +00:00
Hal Finkel d9963c75da [PowerPC] Fix rlwimi isel when mask is not constant
We had been using the known-zero values of the operand of the or to construct
the mask for an rlwimi; this is not quite correct, but fine when the mask is
constant. When the mask is constant, then the known zeros of the operand must
be a superset of the zeros in the mask. However, when the mask is not a
constant, then there might be bits in the operand that are not known to be zero
that, at runtime, might be zero in the mask. Therefore, we check that any bits
not known to be zero *are* known to be one in the mask. Otherwise, we can't
fold the mask with the or and shift.

This was revealed as a miscompile of
MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with
constant hoisting.

llvm-svn: 206136
2014-04-13 17:10:58 +00:00
David Blaikie 269e0fb2e4 Fix instruction debug info location during legalization
I found this from a particular GDB test suite case of inlining
(something similar is provided as a test case) but came across a few
other related cases (other callers of the same functions, and one other
instance of the same coding mistake in a separate function).

I'm not sure what the best way to test this is (let alone to cover the
other cases I discovered), so hopefully this sufficies - open to ideas.

llvm-svn: 206130
2014-04-13 06:39:55 +00:00
Lang Hames 0563ca1be8 [X86] unique_ptr'ify one of X86GenericDisassembler's members.
llvm-svn: 206127
2014-04-13 04:09:16 +00:00
Hal Finkel 34974ed503 [PowerPC] Implement some additional TLI callbacks
Add implementations of:
  bool isLegalICmpImmediate(int64_t Imm) const
  bool isLegalAddImmediate(int64_t Imm) const
  bool isTruncateFree(Type *Ty1, Type *Ty2) const
  bool isTruncateFree(EVT VT1, EVT VT2) const
  bool shouldConvertConstantLoadToIntImm(const APInt &Imm, Type *Ty) const

Unfortunately, this regresses counter-register-based loop formation because
some of the loops now end up in forms were SE cannot compute loop counts.
However, nevertheless, the test-suite results favor committing:

SingleSource/Benchmarks/BenchmarkGame/puzzle: 26% speedup
MultiSource/Benchmarks/FreeBench/analyzer/analyzer: 21% speedup
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan: 20% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv: 19% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv: 15% speedup
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2: 2% speedup

MultiSource/Benchmarks/VersaBench/bmm/bmm: 26% slowdown

llvm-svn: 206120
2014-04-12 21:52:38 +00:00
Benjamin Kramer 44a53da346 Spell the specialization namespace correctly.
Not sure why clang didn't diagnose this (GCC does).

llvm-svn: 206117
2014-04-12 18:45:24 +00:00
Benjamin Kramer 30120c0626 Make helper static and place random global into the llvm namespace.
llvm-svn: 206116
2014-04-12 18:39:57 +00:00
Benjamin Kramer 502b9e1d7f Retire llvm::array_endof in favor of non-member std::end.
While there make array_lengthof constexpr if we have support for it.

llvm-svn: 206112
2014-04-12 16:15:53 +00:00
Juergen Ributzka cf03068d91 [ARM64] Never hoist the shift value of a shift instruction.
There is no need to check if we want to hoist the immediate value of an
shift instruction. Simply return TCC_Free right away.

llvm-svn: 206101
2014-04-12 02:53:51 +00:00
Juergen Ributzka 6e17aa45a3 [ARM64] Fix the cost model for cheap large constants.
Originally the cost model would give up for large constants and just return the
maximum cost. This is not what we want for constant hoisting, because some of
these constants are large in bitwidth, but are still cheap to materialize.

This commit fixes the cost model to either return TCC_Free if the cost cannot be
determined, or accurately calculate the cost even for large constants
(bitwidth > 128).

This fixes <rdar://problem/16591573>.

llvm-svn: 206100
2014-04-12 02:36:28 +00:00
Jim Grosbach 48551fbdba X86: Remove TargetMachine CPU auto-detection.
This logic is properly in the realm of whatever is creating the
TargetMachine. This makes plain 'llc foo.ll' consistent across
heterogenous machines.

llvm-svn: 206094
2014-04-12 01:34:29 +00:00
Chad Rosier 4ec124bc3e [AArch64] Implement the isLegalAddressingMode and getScalingFactorCost APIs.
llvm-svn: 206089
2014-04-12 00:14:23 +00:00
Louis Gerbarg b9a0551862 Add ARM64 CLS patterns
This patch adds patterns to generate the cls instruction ARM64. Includes tests
for 64 bit and 32 bit operands.

rdar://15611957

llvm-svn: 206079
2014-04-11 22:27:58 +00:00
Matt Arsenault e1f030ca66 R600: Check if a sextload should be used for parameter loads.
Through some oddity where truncate (sextload x) isn't folded into
an anyextload for vectors, the sextload remains if the
vector isn't immediately scalarized. This keeps the expected
zextload instructions in the kernel-args test when small type
vectors aren't scalarized.

llvm-svn: 206070
2014-04-11 20:59:54 +00:00
Lang Hames 95400e22f9 Remove redundant symbolization support from MCDisassembler interface.
MCDisassembler has an MCSymbolizer member that is meant to take care of
symbolizing during disassembly, but it also has several methods that enable the
disassembler to do symbolization internally (i.e. without an attached symbolizer
object). There is no need for this duplication, but ARM64 had been making use of
it. This patch moves the ARM64 symbolization logic out of ARM64Disassembler and
into an ARM64ExternalSymbolizer class, and removes the duplicated MCSymbolizer
functionality from the MCDisassembler interface. Symbolization will now be
done exclusively through MCSymbolizers.

There should be no impact on disassembly for any platform, but this allows us to
tidy up the MCDisassembler interface and simplify the process of (and invariants
related to) disassembler setup.

llvm-svn: 206063
2014-04-11 20:07:58 +00:00
Matt Arsenault 0cb92e133f R600/SI: Refactor SOPC classes slightly.
Better match what is done for VOPC to eventually
prefer selecting these.

llvm-svn: 206048
2014-04-11 19:25:18 +00:00
Matt Arsenault 9ec3cf2c8a Move ExtractVectorElements to SelectionDAG.
This seems generally useful, and makes sense to
go along with SplitVector.

llvm-svn: 206041
2014-04-11 17:47:30 +00:00
David Blaikie ceec2bdaa5 Implement depth_first and inverse_depth_first range factory functions.
Also updated as many loops as I could find using df_begin/idf_begin -
strangely I found no uses of idf_begin. Is that just used out of tree?

Also a few places couldn't use df_begin because either they used the
member functions of the depth first iterators or had specific ordering
constraints (I added a comment in the latter case).

Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
where you needed iterator_range<idf_iterator<T>>)

llvm-svn: 206016
2014-04-11 01:50:01 +00:00
Jim Grosbach f77265bfee [ARM64,C++11] Range'ify use-lists iterators in address type promotion.
llvm-svn: 206013
2014-04-11 01:13:10 +00:00
Jim Grosbach 8838d793b7 [ARM64,C++11]: Range'ify use-list iterators in DAGToDAG.
llvm-svn: 206007
2014-04-11 00:27:22 +00:00
Jim Grosbach d3249d0923 [ARM64,C++11]: More range-based loop simplification.
llvm-svn: 206006
2014-04-11 00:27:19 +00:00
Reid Kleckner 9c6582129a Move the segmented stack switch to a function attribute
This removes the -segmented-stacks command line flag in favor of a
per-function "split-stack" attribute.

Patch by Luqman Aden and Alex Crichton!

llvm-svn: 205997
2014-04-10 22:58:43 +00:00
Jim Grosbach 577e921344 [ARM64,C++11]: Range'ify loops in InstrInfo.
llvm-svn: 205992
2014-04-10 22:00:18 +00:00
Jim Grosbach 8a0c50e5a9 [ARM64,C++11]: Range'ify loops in the conditional-compare pass.
llvm-svn: 205988
2014-04-10 21:49:24 +00:00
Kevin Enderby 488f20b64e For the ARM integrated assembler add checking of the
alignments on vld/vst instructions.  And report errors for
alignments that are not supported.

While this is a large diff and an big test case, the changes
are very straight forward.  But pretty much had to touch
all vld/vst instructions changing the addrmode to one of the
new ones that where added will do the proper checking for
the specific instruction.

FYI, re-committing this with a tweak so MemoryOp's default
constructor is trivial and will work with MSVC 2012. Thanks
to Reid Kleckner and Jim Grosbach for help with the tweak.

rdar://11312406

llvm-svn: 205986
2014-04-10 20:18:58 +00:00
Daniel Sanders 77277ea501 [mips] NotMips64 predicate is really a test for 32-bit GPR's.
Summary:
Similarly, the HasMips64 on the 64-bit move InstAlias is a test for 64-bit
GPR's.

No functional change.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3263

llvm-svn: 205968
2014-04-10 15:00:28 +00:00
Daniel Sanders ca275d2a14 [mips] Switch the MIPS-III and MIPS-IV assembler tests to use -mcpu=mips4.
Summary:
It is now the smallest superset for these ISA's.

FeatureMips4 now contains FeatureFPIdx since [ls][dw]xc1 were added in MIPS-IV.
Made the FPIdx feature bit lowercase so that it can be used in the -mattr option.

Depends on D3274

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3275

llvm-svn: 205964
2014-04-10 13:16:49 +00:00
NAKAMURA Takumi 12fbced6e8 ARM64/*/LLVMBuild.txt: Prune redundant deps.
llvm-svn: 205963
2014-04-10 12:46:13 +00:00
NAKAMURA Takumi 554c287262 LLVMBuild.txt: Add missing dependencies.
llvm-svn: 205962
2014-04-10 11:16:47 +00:00
NAKAMURA Takumi 98905d3f85 LLVMBuild.txt: Reformat.
llvm-svn: 205961
2014-04-10 11:16:17 +00:00
NAKAMURA Takumi d8570e5bc2 Fix abuse of StringRef on ARM64SysReg::MRSMapper::toString(Val, Valid).
FIXME: Could we use SmallString here?
llvm-svn: 205950
2014-04-10 03:05:59 +00:00
Saleem Abdulrasool c5e0099ffc ARM64: add an explicit cast to silence a silly warning
GCC 4.8 complains with:
  warning: enumeral and non-enumeral type in conditional expression

Although this is silly and harmless in this case, add an explicit cast to
silence the warning.

llvm-svn: 205949
2014-04-10 02:48:10 +00:00
Juergen Ributzka 48c8c07d0a [ARM64] Fix immediate cost calculation for types larger than i64.
The immediate cost calculation code was hitting an assertion in the included
test case, because APInt was still internally 128-bits. Truncating it to 64-bits
fixed the issue.

Fixes <rdar://problem/16572521>.

llvm-svn: 205947
2014-04-10 01:36:59 +00:00
Reid Kleckner 2d4a69e9c9 Revert "For the ARM integrated assembler add checking of the alignments on vld/vst instructions. And report errors for alignments that are not supported."
It doesn't build with MSVC 2012, because MSVC doesn't allow union
members that have non-trivial default constructors.  This change added
'SMLoc AlignmentLoc' to MemoryOp, which made MemoryOp's default ctor
non-trivial.

This reverts commit r205930.

llvm-svn: 205944
2014-04-10 00:52:14 +00:00
Jim Grosbach e4fef71981 Add support for load folding of avx1 logical instructions
AVX supports logical operations using an operand from memory. Unfortunately
because integer operations were not added until AVX2 the AVX1 logical
operation's types were preventing the isel from folding the loads. In a limited
number of cases the peephole optimizer would fold the loads, but most were
missed. This patch adds explicit patterns with appropriate casts in order for
these loads to be folded.

The included test cases run on reduced examples and disable the peephole
optimizer to ensure the folds are being pattern matched.

Patch by Louis Gerbarg <lgg@apple.com>

rdar://16355124

llvm-svn: 205938
2014-04-09 23:39:25 +00:00
Kevin Enderby c296ecd96c For the ARM integrated assembler add checking of the
alignments on vld/vst instructions.  And report errors for
alignments that are not supported.

While this is a large diff and an big test case, the changes
are very straight forward.  But pretty much had to touch
all vld/vst instructions changing the addrmode to one of the
new ones that where added will do the proper checking for
the specific instruction.

rdar://11312406

llvm-svn: 205930
2014-04-09 21:32:59 +00:00
Chad Rosier 5f8d6a6c15 [AArch64] Implement the isZExtFree APIs.
llvm-svn: 205926
2014-04-09 20:51:21 +00:00
Chad Rosier 9ce19fb65c [AArch64] Implement the isTruncateFree API.
In AArch64 i64 to i32 truncate operation is a subregister access.

This allows more opportunities for LSR optmization to eliminate
variables of different types (i32 and i64).

llvm-svn: 205925
2014-04-09 20:43:40 +00:00
Bob Wilson ae89ddedff Simple fix for build failures resulting from r205867.
llvm-svn: 205918
2014-04-09 18:34:45 +00:00
Justin Holewinski 30d56a7b86 [NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces
This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later).

llvm-svn: 205907
2014-04-09 15:39:15 +00:00
Justin Holewinski 9d852a8e08 [NVPTX] Add support for addrspacecast in global variable initializers, including emitting generic() when casting to address space 0.
llvm-svn: 205906
2014-04-09 15:39:11 +00:00
Justin Holewinski 5959695a16 [NVPTX] Add query support for read-write images and managed variables
This also fixes a bug in the annotation cache where the cache will not be cleared between modules if multiple modules are compiled in the same process.

llvm-svn: 205905
2014-04-09 15:38:52 +00:00
Alp Toker 16f98b255d Fix some doc and comment typos
llvm-svn: 205899
2014-04-09 14:47:27 +00:00
Bradley Smith 246b0b617d [ARM64] Change SYS without a register to an alias to make disassembling more consistant.
llvm-svn: 205898
2014-04-09 14:44:58 +00:00
Bradley Smith 2cef19a2e6 [ARM64] Correctly disassemble ISB operand as ISB not DBarrier.
llvm-svn: 205897
2014-04-09 14:44:54 +00:00
Bradley Smith 239120cada [ARM64] Properly support both apple and standard syntax for FMOV
llvm-svn: 205896
2014-04-09 14:44:49 +00:00
Bradley Smith a2308f47d3 [ARM64] Flag setting logical/add/sub immediate instructions don't use SP.
llvm-svn: 205895
2014-04-09 14:44:44 +00:00
Bradley Smith f280e91849 [ARM64] Conditional branches must always print their condition code, even AL.
llvm-svn: 205894
2014-04-09 14:44:39 +00:00
Bradley Smith a19b7e83dc [ARM64] Fix disassembly logic for extended loads/stores with 32-bit registers.
llvm-svn: 205893
2014-04-09 14:44:36 +00:00
Bradley Smith a0d7a9a12f [ARM64] When printing a pre-indexed address with #0, the ', #0' is not optional.
llvm-svn: 205892
2014-04-09 14:44:31 +00:00
Bradley Smith 70c6acbbfd [ARM64] Add missing shifted register MVN alias to ORN
llvm-svn: 205891
2014-04-09 14:44:26 +00:00
Bradley Smith 403bbf95c0 [ARM64] SXTW/UXTW are only valid aliases for 32-bit operations.
llvm-svn: 205890
2014-04-09 14:44:22 +00:00
Bradley Smith 779238a216 [ARM64] Fix canonicalisation of MOVs. MOV is too complex to be modelled by a dumb alias.
llvm-svn: 205889
2014-04-09 14:44:18 +00:00
Bradley Smith f823079acd [ARM64] Fixup ADR/ADRP parsing such that they accept immediates and all labels types
llvm-svn: 205888
2014-04-09 14:44:12 +00:00
Bradley Smith af2710c96f [ARM64] Ensure sp is decoded as SP, not XZR in LD1 instructions.
llvm-svn: 205887
2014-04-09 14:44:07 +00:00
Bradley Smith a0dce246ed [ARM64] Tighten up the special casing in emitting arithmetic extends. UXTW should only be translated when the instruction uses WSP, not SP. Vice versa for UXTX and 64-bit instructions.
llvm-svn: 205886
2014-04-09 14:44:03 +00:00
Bradley Smith 3971d3dc75 [ARM64] Rename LR to the UAL-compliant 'X30'.
llvm-svn: 205885
2014-04-09 14:43:59 +00:00
Bradley Smith 6f1aa59c31 [ARM64] Rename FP to the UAL-compliant 'X29'.
llvm-svn: 205884
2014-04-09 14:43:50 +00:00
Bradley Smith 5511f08055 [ARM64] Add a PostEncoderMethod to FCMP - the Rm field should canonically be zero but should be decoded/disassembled with any value.
llvm-svn: 205883
2014-04-09 14:43:40 +00:00
Bradley Smith eb4ca04db2 [ARM64] SCVTF and FCVTZS/U are undefined if scale<5> == 0.
llvm-svn: 205882
2014-04-09 14:43:35 +00:00
Bradley Smith db7b9b17eb [ARM64] EXT and EXTR instructions on v8i8 and W regs respectively must have the top bit of their immediate clear.
llvm-svn: 205881
2014-04-09 14:43:31 +00:00
Bradley Smith 60e7667886 [ARM64] Scaled fixed-point FCVTZSs should also have bit 29 set to zero.
llvm-svn: 205880
2014-04-09 14:43:27 +00:00
Bradley Smith 7525b47208 [ARM64] UBFM/BFM is undefined on w registers when imms<5> or immr<5> is 1.
llvm-svn: 205879
2014-04-09 14:43:24 +00:00
Bradley Smith 0243aa33fa [ARM64] Floating point to fixed point scaled conversions are only available on fcvtzs and fcvtzu.
llvm-svn: 205878
2014-04-09 14:43:20 +00:00
Bradley Smith 8f906a3c5f [ARM64] Port over the PostEncoderMethod fix for SMULH/UMULH from AArch64.
llvm-svn: 205877
2014-04-09 14:43:15 +00:00
Bradley Smith 9f29b726d5 [ARM64] Add missing tlbi operands and error for extra/missing register on tlbi aliases.
llvm-svn: 205876
2014-04-09 14:43:11 +00:00
Bradley Smith e8b4166acc [ARM64] Rework system register parsing to overcome SPSel clash in MSR variants.
llvm-svn: 205875
2014-04-09 14:43:06 +00:00
Bradley Smith bc35b1f138 [ARM64] Port over the PostEncoderMethod from AArch64 for exclusive loads and stores, so the unused register fields are set to all-ones canonically but are recognised with any value.
llvm-svn: 205874
2014-04-09 14:43:01 +00:00
Bradley Smith 4925be9b56 [ARM64] Use PStateMapper to ensure that MSRcpsr operands are validated during disassembly.
llvm-svn: 205873
2014-04-09 14:42:56 +00:00
Bradley Smith 3339427e2a [ARM64] Remove PrefetchOp and use ARM64PRFM instead.
llvm-svn: 205872
2014-04-09 14:42:53 +00:00
Bradley Smith 16478c4ccf [ARM64] Add WZR to isGPR32Register, since every use needs to check for this anyway.
llvm-svn: 205871
2014-04-09 14:42:49 +00:00
Bradley Smith 3db2a85853 [ARM64] Remove ARM64SYS.
llvm-svn: 205870
2014-04-09 14:42:45 +00:00
Bradley Smith fb90df563f [ARM64] Move CPSRField and DBarrier operands over to AArch64-style disassembly and assembly. This removes the last users of namespace ARM64SYS.
llvm-svn: 205869
2014-04-09 14:42:42 +00:00
Bradley Smith 08c391c156 [ARM64] Switch the decoder, disassembler, instprinter and asmparser over to using AArch64-style system registers, and fix up test failures discovered in the process.
llvm-svn: 205868
2014-04-09 14:42:36 +00:00
Bradley Smith 2ba17a4a17 [ARM64] Move ARM64BaseInfo.{cpp,h} into a Utils/ subdirectory, a la AArch64. These files are required in the decoder, disassembler and parser, and a layering violation was imminent.
llvm-svn: 205867
2014-04-09 14:42:27 +00:00
Bradley Smith ceeb04df60 [ARM64] Copy the named immediate operand mapping logic and enums from AArch64. AArch64's named immediate mapping and parsing is much more advanced than ARM64's. No functionality change - they're currently living side by side while I switch uses over.
llvm-svn: 205866
2014-04-09 14:42:16 +00:00
Bradley Smith 8c0b88c987 [ARM64] Shifted register ALU ops are reserved if sf=0 and imm6<5>=1, and also (for add/sub only) if shift=11.
llvm-svn: 205865
2014-04-09 14:42:11 +00:00
Bradley Smith 527bf86e56 [ARM64] Add support for NV condition code (exists only for valid assembly/disassembly, equivilant to AL)
llvm-svn: 205864
2014-04-09 14:42:07 +00:00
Bradley Smith 6d7af17a3f [ARM64] Add missing 1Q -> 1q vector kind alias
llvm-svn: 205863
2014-04-09 14:42:01 +00:00
Bradley Smith 7d253f29a4 [ARM64] Add parsing for vector lists such as {v0.8b-v3.8b}
llvm-svn: 205862
2014-04-09 14:41:58 +00:00
Bradley Smith 664aa67153 [ARM64] Correctly alias LSL to UXTW for 32bit instruction variants, rather than UXTX
llvm-svn: 205861
2014-04-09 14:41:53 +00:00
Bradley Smith 35cadc58c9 [ARM64] STRHro and STRBro were not being decoded at all.
llvm-svn: 205860
2014-04-09 14:41:49 +00:00
Bradley Smith 87c60e00d5 [ARM64] MOVK with sf=0 and hw<1>=1 is unallocated. Shift amount for ADD/SUB instructions is unallocated if shift > 4.
llvm-svn: 205859
2014-04-09 14:41:45 +00:00
Bradley Smith cd91e5cd0c [ARM64] Register-offset loads and stores with the 'option' field equal to 00x or 10x are undefined.
llvm-svn: 205858
2014-04-09 14:41:38 +00:00
Elena Demikhovsky cf0b9bafc3 AVX-512: insert element to mask vector; store i1 data
Implemented INSERT_VECTOR_ELT operation for v16i1 and v8i1 vectors;
Implemented "store" for i1 type

llvm-svn: 205850
2014-04-09 12:37:50 +00:00
Daniel Sanders b282f1fec5 Re-commit: [mips] abs.[ds], and neg.[ds] should be allowed regardless of -enable-no-nans-fp-math
Summary:
They behave in accordance with the Has2008 and ABS2008 configuration bits of the processor which are used to select between the 1985 and 2008 versions of IEEE 754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid operation exceptions when given NaN), in 2008 mode they are non-arithmetic (i.e. they are copies).

nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because the ISA spec does not explicitly state that they obey Has2008 and ABS2008.

Fixed the issue with the previous version of this patch (r205628). A pre-existing 'let Predicate =' statement was removing some predicates that were necessary for FP64 to behave correctly.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3274

llvm-svn: 205844
2014-04-09 09:56:43 +00:00
Matt Arsenault 2c33562cd6 R600/SI: Match not instruction.
llvm-svn: 205837
2014-04-09 07:16:16 +00:00
Tim Northover b36d428d27 ARM64: scalarize v1i64 mul operation
This is the second part of fixing PR19367.

llvm-svn: 205836
2014-04-09 07:07:02 +00:00
Tim Northover b430cf6681 ARM64: add pattern for <1 x i64> custom not node.
This should fix PR19367.

llvm-svn: 205835
2014-04-09 06:55:39 +00:00
Saleem Abdulrasool fedfc2a63f ARM MC: 80 column
llvm-svn: 205833
2014-04-09 06:18:26 +00:00
Saleem Abdulrasool 5019a978ae ARM MC: sort source files in CMakeLists
llvm-svn: 205832
2014-04-09 06:18:23 +00:00
Craig Topper 8d399f87af [C++11] Replace some comparisons with 'nullptr' with simple boolean checks to reduce verbosity.
llvm-svn: 205829
2014-04-09 04:20:00 +00:00
Juergen Ributzka c11e8b67bb [Constant Hoisting][ARM64] Enable constant hoisting for ARM64.
This implements the target-hooks for ARM64 to enable constant hoisting.

This fixes <rdar://problem/14774662> and <rdar://problem/16381500>.

llvm-svn: 205791
2014-04-08 20:39:59 +00:00
Hal Finkel a775e51274 [PowerPC] Don't return false from PPC::isVSLDOIShuffleMask
PPC::isVSLDOIShuffleMask should return -1, not false, when the shuffle
predicate should be false.

Noticed by inspection; no test case (yet).

llvm-svn: 205787
2014-04-08 19:00:27 +00:00
Kevin Enderby d88fec3d3a Fix the ARM VLD3 (single 3-element structure to all lanes)
size 16 double-spaced registers instruction printing.

This:
	vld3.16 {d0[], d2[], d4[]}, [r4]!

was being printed as:

	vld3.16	{d0[], d1[], d2[]}, [r4]!

rdar://16531387

llvm-svn: 205779
2014-04-08 18:00:52 +00:00
NAKAMURA Takumi 35289340de X86MCAsmInfoGNUCOFF: Set PointerSize as 8 for targeting x64. It caused DW_LNE_set_address was misemitted on x64.
FIXME: I haven't investigate whether CalleeSaveStackSlotSize should be 8.
llvm-svn: 205772
2014-04-08 15:28:50 +00:00
Tim Northover 33d07468bc ARM64: fix fmsub patterns which assumed accum operand was first
Confusingly, the NEON fmla instructions put the accumulator first but the
scalar versions put it at the end (like the fma lib function & LLVM's
intrinsic).

This should fix PR19345, assuming there's only one issue.

llvm-svn: 205758
2014-04-08 12:23:51 +00:00
Elena Demikhovsky 3dcfbdfa54 AVX-512: Added fp_to_uint and uint_to_fp patterns.
llvm-svn: 205754
2014-04-08 07:24:02 +00:00
David Majnemer c9d2625586 X86: Split the relocation selection up
Before, we would have conditional operators where one side of the
operator would be of type RelocationTypeAMD64 and the other is of type
RelocationTypeI386.  GCC would noisly warn with -Wenum-compare
diagnostic.

Instead, refactor the code so it is more like the X86 ELF object writer.

llvm-svn: 205752
2014-04-08 02:15:13 +00:00
Jim Grosbach e75c048ab9 Tidy up comments a bit.
Punctuation, grammar, formatting, etc..

llvm-svn: 205749
2014-04-07 23:47:23 +00:00
Jim Grosbach 75010e7712 ARM64: Range based for loop in ARM64PromoteConstant pass
llvm-svn: 205748
2014-04-07 23:47:21 +00:00
Jim Grosbach 64a28e70c8 ARM64: Clean up file header comment a bit.
llvm-svn: 205747
2014-04-07 23:14:38 +00:00
Reed Kotler 735da8e015 Reverting commit r205628 due to mips64 issues.
llvm-svn: 205741
2014-04-07 22:11:40 +00:00
Tom Stellard 204e61bbdf R600/SI: Handle INSERT_SUBREG in SIFixSGPRCopies
llvm-svn: 205732
2014-04-07 19:45:45 +00:00
Tom Stellard 50122a5890 R600: Match 24-bit arithmetic patterns in a Target DAGCombine
Moving these patterns from TableGen files to PerformDAGCombine()
should allow us to generate better code by eliminating unnecessary
shifts and extensions earlier.

This also fixes a bug where the MAD pattern was calling
SimplifyDemandedBits with a 24-bit mask on the first operand
even when the full pattern wasn't being matched.  This occasionally
resulted in some instructions being incorrectly deleted from the
program.

v2:
  - Fix bug with 64-bit mul

llvm-svn: 205731
2014-04-07 19:45:41 +00:00
Tom Stellard 3cbe014027 R600: Replace dyn_cast + assert with cast
llvm-svn: 205730
2014-04-07 19:31:13 +00:00
Matt Arsenault 4be76e99fe Use std::swap
llvm-svn: 205723
2014-04-07 16:44:26 +00:00
Matt Arsenault 7939acd7fa Use .data() instead of &x[0]
llvm-svn: 205722
2014-04-07 16:44:24 +00:00
David Blaikie 2f7711242a MachineInstr: introduce explicit_operands and implicit_operands ranges
Makes iteration over implicit and explicit machine operands more
explicit (har har). Insipired by code review discussion for r205565.

llvm-svn: 205680
2014-04-05 22:42:04 +00:00
Saleem Abdulrasool dd979e6457 ARM: consolidate MachO checks for ARM asm parser
This consolidates the duplicated MachO checks in the directive parsing for
various directives that are unsupported for Mach-O.  The error message change is
unimportant as this restores the behaviour to that prior to the addition of the
new directive handling.  Furthermore, use a more direct check for MachO
targeting rather than an indirect feature check of the assembler.

Also simplify the test execution command to avoid temporary files.  Further more,
perform the check in both object and assembly emission.

Whether all non-applicable directives are handled is another question.  .fnstart
is marked as being unsupported, however, the complementary .fnend is not.  The
additional unwinding directives are also still honoured.  This change does not
change that, though, it would be good to validate and mark them as being
unsupported if they are unsupported for the MachO emission.

llvm-svn: 205678
2014-04-05 22:09:51 +00:00
Hal Finkel 41e9b1d559 [PowerPC] Remove unused TM member variable to unbreak build
Fix "error: private field 'TM' is not used [-Werror,-Wunused-private-field]"

llvm-svn: 205660
2014-04-05 00:16:28 +00:00
Hal Finkel de0b413ec0 [PowerPC] Adjust load/store costs in PPCTTI
This provides more realistic costs for the insert/extractelement instructions
(which are load/store pairs), accounts for the cheap unaligned Altivec load
sequence, and for unaligned VSX load/stores.

Bad news:
MultiSource/Applications/sgefa/sgefa - 35% slowdown (this will require more investigation)
SingleSource/Benchmarks/McGill/queens - 20% slowdown (we no longer vectorize this, but it was a constant store that was scalarized)
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 - 2% slowdown

Good news:
SingleSource/Benchmarks/Shootout/ary3 - 54% speedup
SingleSource/Benchmarks/Shootout-C++/ary - 40% speedup
MultiSource/Benchmarks/Ptrdist/ks/ks - 35% speedup
MultiSource/Benchmarks/FreeBench/neural/neural - 30% speedup
MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt - 20% speedup

Unfortunately, estimating the costs of the stack-based scalarization sequences
is hard, and adjusting these costs is like a game of whac-a-mole :( I'll
revisit this again after we have better codegen for vector extloads and
truncstores and unaligned load/stores.

llvm-svn: 205658
2014-04-04 23:51:18 +00:00
Hal Finkel b1308d525c [PowerPC] PPCTTI Cleanup
Remove the declaration of an unimplemented function.

llvm-svn: 205657
2014-04-04 23:51:11 +00:00
Matt Arsenault cf6f688a40 Add DAG parameter to ComputeNumSignBitsForTargetNode
This way, you can check the number of sign bits in the
operands. The depth parameter it already has is pretty useless
without this.

llvm-svn: 205649
2014-04-04 20:13:13 +00:00
Matt Arsenault 5e1e4316c4 Fix tabs
llvm-svn: 205648
2014-04-04 20:13:08 +00:00
Kai Nacke 6da86e8529 [mips] Add Octeon cnMips instructions seqi/snei and v3mulu/vmm0/vmulu.
This patch adds the Octeon cnMips instructions seqi/snei and v3mulu/vmm0/vmulu.
It is only for the assembler. Test case is included.

Reviewed by: Daniel.Sanders@imgtec.com

llvm-svn: 205631
2014-04-04 16:21:59 +00:00
Hal Finkel fbf7e2a1a1 [PowerPC] Add a full condition code register to make the "cc" clobber work
gcc inline asm supports specifying "cc" as a clobber of all condition
registers. Add just enough modeling of the full register to make this work.
Fixed PR19326.

llvm-svn: 205630
2014-04-04 15:15:57 +00:00
Daniel Sanders d4341a0ad7 [mips] abs.[ds], and neg.[ds] should be allowed regardless of -enable-no-nans-fp-math
Summary:
They behave in accordance with the Has2008 and ABS2008 configuration bits of the
processor which are used to select between the 1985 and 2008 versions of IEEE
754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid
operation exceptions when given NaN), in 2008 mode they are non-arithmetic
(i.e. they are copies).

nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because
the ISA spec does not explicitly state that they obey Has2008 and ABS2008.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3274

llvm-svn: 205628
2014-04-04 14:52:54 +00:00
Tim Northover 07a8ff4892 ARM64: handle v1i1 types arising from setcc properly.
There were several overlapping problems here, and this solution is
closely inspired by the one adopted in AArch64 in r201381.

Firstly, scalarisation of v1i1 setcc operations simply fails if the
input types are legal. This is fixed in LegalizeVectorTypes.cpp this
time, and allows AArch64 code to be simplified slightly.

Second, vselect with such a setcc feeding into it ends up in
ScalarizeVectorOperand, where it's not handled. I experimented with an
implementation, but found that whatever DAG came out was rather
horrific. I think Hao's DAG combine approach is a good one for
quality, though there are edge cases it won't catch (to be fixed
separately).

Should fix PR19335.

llvm-svn: 205625
2014-04-04 14:49:21 +00:00
Stepan Dyatkovskiy 3f1fa3d545 Fix for PR18921 (LDRD/STRD part)::
Removed "GNU Assembler extension (compatibility)" definitions from ARMInstrInfo.td
Fixed ARMAsmParser::ParseInstruction GNU compatability branch, so it also works for thumb mode from now.
Added new tests.

llvm-svn: 205622
2014-04-04 10:17:56 +00:00
Tim Northover 85d6a16c46 ARM64: use regalloc-friendly COPY_TO_REGCLASS for bitcasts
The previous patterns directly inserted FMOV or INS instructions into
the DAG for scalar_to_vector & bitconvert patterns. This is horribly
inefficient and can generated lots more GPR <-> FPR register traffic
than necessary.

It's much better to emit instructions the register allocator
understands so it can coalesce the copies when appropriate.

It led to at least one ISelLowering hack to avoid the problems, which
was incorrect for v1i64 (FPR64 has no dsub). It can now be removed
entirely.

This should also fix PR19331.

llvm-svn: 205616
2014-04-04 09:03:09 +00:00
Tim Northover 1e4f2c5e5f ARM64: add 128-bit MLA operations to the custom selection code.
Without this change, the llvm_unreachable kicked in. The code pattern
being spotted is rather non-canonical for 128-bit MLAs, but it can
happen and there's no point in generating sub-optimal code for it just
because it looks odd.

Should fix PR19332.

llvm-svn: 205615
2014-04-04 09:03:02 +00:00
Stepan Dyatkovskiy a09bd2379c Fixed register class in STRD instruction for Thumb2 mode.
llvm-svn: 205612
2014-04-04 08:14:13 +00:00
Craig Topper 840beec2d0 Make consistent use of MCPhysReg instead of uint16_t throughout the tree.
llvm-svn: 205610
2014-04-04 05:16:06 +00:00
Jim Grosbach 537f3ed838 ARM: Range based for-loop over block predecessors.
No functional change.

llvm-svn: 205604
2014-04-04 02:11:03 +00:00
Jim Grosbach f92e8f5a8b ARM: Use range-based for loops in frame lowering.
No functional change.

llvm-svn: 205602
2014-04-04 02:10:55 +00:00
Quentin Colombet 9c816f39ad Revert r205599, the commit was not intended to have so many changes
llvm-svn: 205600
2014-04-04 02:02:49 +00:00
Quentin Colombet 7ee4e79dec [RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chance
recoloring cut-offs are hit.

This is related to PR18747.

Patch by MAYUR PANDEY <mayur.p@samsung.com>

llvm-svn: 205599
2014-04-04 01:58:57 +00:00
Saleem Abdulrasool a7a8a3e3ee MIPS: remove vim swap file
llvm-svn: 205595
2014-04-04 01:19:54 +00:00
Jim Grosbach b8bd4a5e2a Tidy up. Space before ':' in range-based for loops.
llvm-svn: 205585
2014-04-03 23:43:26 +00:00
Jim Grosbach bb1af943bb Tidy up. 80 columns.
llvm-svn: 205584
2014-04-03 23:43:22 +00:00
Jim Grosbach 1a59711505 Tidy up. Trailing whitespace.
llvm-svn: 205583
2014-04-03 23:43:18 +00:00
Jim Grosbach e04eb1dc12 Fix typo.
llvm-svn: 205582
2014-04-03 23:43:12 +00:00
Eli Bendersky bbef172f19 Optimize away unnecessary address casts.
Removes unnecessary casts from non-generic address spaces to the generic address
space for certain code patterns.

Patch by Jingyue Wu.

llvm-svn: 205571
2014-04-03 21:18:25 +00:00
Lang Hames cb74fa696b [ARM64] Teach the ARM64DeadRegisterDefinition pass to respect implicit-defs.
When rematerializing through truncates, the coalescer may produce instructions
with dead defs, but live implicit-defs of subregs:
E.g.
  %X1<def,dead> = MOVi64imm 2, %W1<imp-def>; %X1:GPR64, %W1:GPR32

These instructions are live, and their definitions should not be rewritten.

Fixes <rdar://problem/16492408>

llvm-svn: 205565
2014-04-03 20:51:08 +00:00
Tom Stellard a0150cb6a9 R600: Correct opcode for BFE_INT
Acording to AMD documentation, the correct opcode for
BFE_INT is 0x5, not 0x4

Fixes Arithm/Absdiff.Mat/3 OpenCV test

Patch by: Bruno Jiménez

llvm-svn: 205562
2014-04-03 20:19:29 +00:00
Tom Stellard 7ed0b5235a R600/SI: Lower 64-bit immediates using REG_SEQUENCE
llvm-svn: 205561
2014-04-03 20:19:27 +00:00
Tim Northover 01b4aa9437 ARM: tell LLVM about zext properties of ldrexb/ldrexh
Implementing this via ComputeMaskedBits has two advantages:
  + It actually works. DAGISel doesn't deal with the chains properly
    in the previous pattern-based solution, so they never trigger.
  + The information can be used in other DAG combines, as well as the
    trivial "get rid of truncs". For example if the trunc is in a
    different basic block.

rdar://problem/16227836

llvm-svn: 205540
2014-04-03 15:10:35 +00:00
Daniel Sanders 442f1a12f1 [mips] Implement ehb, ssnop, and pause in assembler
Summary: Add negative tests for pause

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3246

llvm-svn: 205537
2014-04-03 13:21:51 +00:00
Tim Northover 70450c59a4 ARM: skip cmpxchg failure barrier if ordering is monotonic.
The terminal barrier of a cmpxchg expansion will be either Acquire or
SequentiallyConsistent. In either case it can be skipped if the
operation has Monotonic requirements on failure.

rdar://problem/15996804

llvm-svn: 205535
2014-04-03 13:06:54 +00:00
Zoran Jovanovic cabf0f41e0 Implementation of 16-bit microMIPS instructions MFHI and MFLO.
Differential Revision: http://llvm-reviews.chandlerc.com/D3141

llvm-svn: 205532
2014-04-03 12:47:34 +00:00
Daniel Sanders f7b32291ad [mips] Add initial (experimental) MIPS-IV support.
Summary:
Adds the 'mips4' processor and a simple test of the ELF e_flags.

Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

I made one small change to the testcase so that it uses
mips64-unknown-linux instead of mips4-unknown-linux.

This patch indirectly adds FeatureCondMov to FeatureMips64. This is ok
because it's supposed to be there anyway and it turns out that
FeatureCondMov is not a predicate of any instructions at the moment
(this is a bug that hasn't been noticed because there are no targets
without the conditional move instructions yet).

CC: theraven

Differential Revision: http://llvm-reviews.chandlerc.com/D3244

llvm-svn: 205530
2014-04-03 12:13:36 +00:00
Zoran Jovanovic 842f20ef0b MicroMIPS specific little endian fixup data byte ordering.
Differential Revision: http://llvm-reviews.chandlerc.com/D3245

llvm-svn: 205528
2014-04-03 12:01:01 +00:00
Tim Northover c882eb0723 ARM: expand atomic ldrex/strex loops in IR
The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded
at MachineInstr emission time had grown to be extremely large and
involved, to account for the subtly different code needed for the
various flavours (8/16/32/64 bit, cmpxchg/add/minmax).

Moving this transformation into the IR clears up the code
substantially, and makes future optimisations much easier:

1. an atomicrmw followed by using the *new* value can be more
   efficient. As an IR pass, simple CSE could handle this
   efficiently.
2. Making use of cmpxchg success/failure orderings only has to be done
   in one (simpler) place.
3. The common "cmpxchg; did we store?" idiom can be exposed to
   optimisation.

I intend to gradually improve this situation within the ARM backend
and make sure there are no hidden issues before moving the code out
into CodeGen to be shared with (at least ARM64/AArch64, though I think
PPC & Mips could benefit too).

llvm-svn: 205525
2014-04-03 11:44:58 +00:00
Stepan Dyatkovskiy 6207a4dadc PR19320:
The trouble as in ARMAsmParser, in ParseInstruction method. It assumes that ARM::R12 + 1 == ARM::SP.
It is wrong, since ARM::<Register> codes are generated by tablegen and actually could be any random numbers.

llvm-svn: 205524
2014-04-03 11:29:15 +00:00
Silviu Baranga a3106e6847 [ARM] When generating a vpaddl node the input lane type is not always the type of the
add operation since extract_vector_elt can perform an extend operation. Get the input lane
type from the vector on which we're performing the vpaddl operation on and extend or
truncate it to the output type of the original add node.

llvm-svn: 205523
2014-04-03 10:44:27 +00:00
Sasa Stankovic 06c4780311 [mips] Extend MipsMCExpr class to handle %higher(sym1 - sym2 + const) and
%highest(sym1 - sym2 + const) relocations. Remove "ABS_" from VK_Mips_HI
and VK_Mips_LO enums in MipsMCExpr, to be consistent with VK_Mips_HIGHER
and VK_Mips_HIGHEST.

This change also deletes test file test/MC/Mips/higher_highest.ll and moves
its CHECK's to the new test file test/MC/Mips/higher-highest-addressing.s.
The deleted file tests that R_MIPS_HIGHER and R_MIPS_HIGHEST relocations are
emitted in the .o file. Since it uses -force-mips-long-branch option, it was
created when MipsLongBranch's implementation was emitting R_MIPS_HIGHER and
R_MIPS_HIGHEST relocations in the .o file. It was disabled when MipsLongBranch
started to directly calculate offsets.

Differential Revision: http://llvm-reviews.chandlerc.com/D3230

llvm-svn: 205522
2014-04-03 10:37:45 +00:00
Tim Northover 2ad88d3aab ARM64: always use i64 for the RHS of shift operations
Switching between i32 and i64 based on the LHS type is a good idea in
theory, but pre-legalisation uses i64 regardless of our choice,
leading to potential ISel errors.

Should fix PR19294.

llvm-svn: 205519
2014-04-03 09:26:16 +00:00
Oliver Stannard 92e0fc0484 ARM: Use __STACK_LIMIT symbol for segmented stacks
We cannot use STACK_LIMIT, as it is not reserved for the compiler
by the C spec.

llvm-svn: 205516
2014-04-03 08:45:16 +00:00
Tim Northover c7c6a93704 ARM64: don't generate __sincos_stret calls unless on MachO
This should fix PR19314.

llvm-svn: 205514
2014-04-03 07:06:13 +00:00
Lang Hames c59a2d0529 [X86] As per suggestion from Craig Topper and Hal Finkel, override
TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather
than abusing commuteInstruction.

Thanks very much for the suggestion guys!

llvm-svn: 205489
2014-04-02 23:57:49 +00:00
Hal Finkel f823380a44 [PowerPC] Make PPCTTI::getMemoryOpCost call BasicTTI::getMemoryOpCost
PPCTTI::getMemoryOpCost will now make use of BasicTTI::getMemoryOpCost to
calculate the base cost of the memory access, and then adjust on top of that.
There is no functionality change from this modification, but it will become
important so that PPCTTI can take advantage of scalarization information for which
BasicTTI::getMemoryOpCost will account in the near future.

llvm-svn: 205476
2014-04-02 22:43:49 +00:00
Lang Hames c2c751312e [X86] Make the VFMA*231 variants commutable and relax the alignment restrictions
on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load
from unaligned memory.

Testcase to follow, along with related patch.

<rdar://problem/16478629>

llvm-svn: 205472
2014-04-02 22:06:16 +00:00
Juergen Ributzka 27435b3b8a Add comments and test case for [X86TTI] Make constant base pointers for GetElementPtr opaque (r204739).
llvm-svn: 205468
2014-04-02 21:45:36 +00:00
Saleem Abdulrasool cd1308296e ARM: update subtarget information for Windows on ARM
Update the subtarget information for Windows on ARM.  This enables using the MC
layer to target Windows on ARM.

llvm-svn: 205459
2014-04-02 20:32:05 +00:00
Jim Grosbach 2a2459f365 Make a few more range-based loops use explicit types.
No functional change.

llvm-svn: 205458
2014-04-02 20:21:22 +00:00
Tom Stellard 36a031870b TargetLibraryInfo: Disable memcpy and memset on R600
There are no implementations of these for R600.

llvm-svn: 205455
2014-04-02 19:53:29 +00:00
Jim Grosbach 36c4953348 Simplify resolveFrameIndex() signature.
Just pass a MachineInstr reference rather than an MBB iterator.
Creating a MachineInstr& is the first thing every implementation did
anyway.

llvm-svn: 205453
2014-04-02 19:28:18 +00:00
Jim Grosbach 4a1a9ce5e6 ARM: cortex-m0 doesn't support unaligned memory access.
Unlike other v6+ processors, cortex-m0 never supports unaligned accesses.
From the v6m ARM ARM:

"A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned
access occurs."

rdar://16491560

llvm-svn: 205452
2014-04-02 19:28:13 +00:00
Jim Grosbach df1e05bb8a Make some range based loop types more explicit.
No functional change, but more readable code.

llvm-svn: 205451
2014-04-02 19:28:08 +00:00
Kai Nacke 13673ac704 [mips] Add more Octeon cnMips instructions
Adds the instructions ext/ext32/cins/cins32.
It also changes pop/dpop to accept the two operand version and
adds a simple pattern to generate baddu.
Tests for the two operand versions (including baddu/dmul/dpop/pop)
and the code generation pattern for baddu are included.

Reviewed by: Daniel.Sanders@imgtec.com

llvm-svn: 205449
2014-04-02 18:40:43 +00:00
Jim Grosbach 20b0790df7 [C++11,ARM64] Range based for and explicit 'override' in STP cleanup.
No functional change intended.

llvm-svn: 205446
2014-04-02 18:00:59 +00:00
Jim Grosbach 05abd709f3 [C++11,ARM64] Range based for loops in constant promotion.
No functional change intended.

llvm-svn: 205445
2014-04-02 18:00:56 +00:00
Jim Grosbach 7dc9edeaa5 [C++11,ARM64] Range based for loops in load/store pair optimizer.
No functional change intended.

llvm-svn: 205444
2014-04-02 18:00:53 +00:00
Jim Grosbach 020e657790 [C++11,ARM64] Range based for loops in target lowering.
No functional change intended.

llvm-svn: 205443
2014-04-02 18:00:51 +00:00
Jim Grosbach 91f1f47751 [C++11,ARM64] Range based for loops in frame lowering.
No functional change intended.

llvm-svn: 205442
2014-04-02 18:00:49 +00:00
Jim Grosbach f39d752b03 [C++11,ARM64] Range based for loops in pseudo expansion.
No functional change intended.

llvm-svn: 205441
2014-04-02 18:00:46 +00:00
Jim Grosbach 673825ebac [C++11,ARM64] Range based for loops for LOH
No functional change intended.

llvm-svn: 205440
2014-04-02 18:00:44 +00:00
Jim Grosbach 2539c3d07a [C++11,ARM64] Range based for loops TLS cleanup.
No functional change intended.

llvm-svn: 205439
2014-04-02 18:00:41 +00:00
Jim Grosbach 0d0c5a614a [C++11,ARM64] Range based for loops in branch relaxation.
No functional change intended.

llvm-svn: 205438
2014-04-02 18:00:39 +00:00
Jim Grosbach 1c762ca9bd [C++11,ARM64] Range based for loops in address type promotion.
No functional change intended.

llvm-svn: 205437
2014-04-02 18:00:36 +00:00
Quentin Colombet 7bf9d8cd13 [ARM64][CollectLOH] Remove the link to the radar from the comments.
llvm-svn: 205435
2014-04-02 16:40:49 +00:00
Oliver Stannard b14c625111 ARM: Add support for segmented stacks
Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.

llvm-svn: 205430
2014-04-02 16:10:33 +00:00
Tim Northover 6d69168ffd ARM64: use GOT for weak symbols & PIC.
Weak symbols cannot use the small code model's usual ADRP sequences since the
instruction simply may not be able to encode a value of 0.

This redirects them to use the GOT, which hopefully linkers are able to cope
with even in the static relocation model.

llvm-svn: 205426
2014-04-02 14:39:11 +00:00
Tim Northover 0d80f70530 ARM64: fix lowering of fp128 fptosi/fptoui
We were creating libcall nodes that returned an MVT::f128, when these
particular operations actually return an int of some stripe.

llvm-svn: 205425
2014-04-02 14:39:07 +00:00
Tim Northover ebd37ab382 ARM64: make sure first argument to INSERT_SUBVECTOR has right type.
Again, coalescing and other optimisations swiftly made the MachineInstrs
consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was
produced.

llvm-svn: 205423
2014-04-02 14:38:58 +00:00
Tim Northover 5e3a484e3b ARM64: convert fp16 narrowing ISel to pseudo-instruction
The previous attempt was fine with optimisations, but was actually rather
cavalier with its types. When compiled at -O0, it produced invalid COPY
MachineInstrs.

llvm-svn: 205422
2014-04-02 14:38:54 +00:00
Job Noorman f7da105f39 Mark FPB as a reserved register when needed.
llvm-svn: 205421
2014-04-02 13:13:56 +00:00
Renato Golin d93295ea56 Remove duplicated DMB instructions
ARM specific optimiztion, finding places in ARM machine code where 2 dmbs
follow one another, and eliminating one of them.

Patch by Reinoud Elhorst.

llvm-svn: 205409
2014-04-02 09:03:43 +00:00
Yaron Keren 2895496852 Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU()
and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the
four Windows environments in Triple.h.

Suggestion by Saleem Abdulrasool!

llvm-svn: 205393
2014-04-02 04:27:51 +00:00
Quentin Colombet 3c2b13b258 [ARM64][CollectLOH] Add some comments to explain how the LOHs
framework works (for the compiler part), since the design
document is not available.

llvm-svn: 205379
2014-04-02 01:02:28 +00:00
Hal Finkel 9e0baa6d3a [PowerPC] Add some missing VSX bitcast patterns
llvm-svn: 205352
2014-04-01 19:24:27 +00:00
Yaron Keren 48d68d439a If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 and
Environment == Triple::MSVC so it will never be MinGW or Cygwin.

llvm-svn: 205349
2014-04-01 18:52:55 +00:00
Hal Finkel 2eed29f3c8 Implement X86TTI::getUnrollingPreferences
This provides an initial implementation of getUnrollingPreferences for x86.
getUnrollingPreferences is used by the generic (concatenation) unroller, which
is distinct from the unrolling done by the loop vectorizer. Many modern x86
cores have some kind of uop cache and loop-stream detector (LSD) used to
efficiently dispatch small loops, and taking full advantage of this requires
unrolling small loops (small here means 10s of uops).

These caches also have limits on the number of taken branches in the loop, and
so we also cap the loop unrolling factor based on the maximum "depth" of the
loop. This is currently calculated with a partial DFS traversal (partial
because it will stop early if the path length grows too much). This is still an
approximation, and one that is both conservative (because it does not account
for branches eliminated via block placement) and optimistic (because it is only
recording the maximum depth over minimum paths). Nevertheless, because the
loops that fit in these uop caches are so small, it is not clear how much the
details matter.

The original set of patches posted for review produced the following test-suite
performance results (from the TSVC benchmark) at that time:
  ControlLoops-dbl - 13% speedup
  ControlLoops-flt - 15% speedup
  Reductions-dbl - 7.5% speedup

llvm-svn: 205348
2014-04-01 18:50:34 +00:00
Kai Nacke af47f60f83 [mips] Add Octeon cnMips instructions mtmX and mtpX
Adds the Octeon cnMips instructions "load multiplier register MPLx" and "load product register Px".
Includes tests.

Reviews by: Daniel.Sanders@imgtec.com

llvm-svn: 205343
2014-04-01 18:35:26 +00:00
Reid Kleckner 101102711d Support segmented stacks on Win64
Identical to Win32 method except the GS segment register is used for TLS
instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14.

llvm-svn: 205342
2014-04-01 18:34:21 +00:00
Yaron Keren 136fe7db46 isTargetWindows() renamed to isTargetKnownWindowsMSVC()
to reflect its current functionality.

Based on Takumi NAKAMURA suggestion.

llvm-svn: 205338
2014-04-01 18:15:34 +00:00
Christian Pirker dc9ff75554 ARM: rename ARMle/ARMbe with ARMLE/ARMBE, and Thumble/Thumbbe with ThumbLE/ThumbBE
llvm-svn: 205317
2014-04-01 15:19:30 +00:00
Tim Northover 0feb91ef15 ARM: teach LLVM that Cortex-A7 is very similar to A8.
llvm-svn: 205314
2014-04-01 14:10:07 +00:00
Aaron Ballman 8bf5a548ea Attempting to fix r205124, which had failed asserts when built with MSVC.
Suggestion from Yaron Keren.

llvm-svn: 205313
2014-04-01 13:56:35 +00:00
Tim Northover 1351030801 ARM: add cyclone CPU with ZeroCycleZeroing feature.
The Cyclone CPU is similar to swift for most LLVM purposes, but does have two
preferred instructions for zeroing a VFP register. This teaches LLVM about
them.

llvm-svn: 205309
2014-04-01 13:22:02 +00:00
Daniel Sanders 21bce30fdc [mips] Renamed ParseAnyRegisterWithoutDollar to MatchAnyRegisterWithoutDollar
This is for consistency with other functions. The Parse* functions consume
tokens and the Match* functions don't.

No functional change.

llvm-svn: 205305
2014-04-01 12:35:23 +00:00
Aaron Ballman 0947bb20d8 Fixing an MSVC warning about widening the result of a 32-bit shift implicitly. No functional change intended.
llvm-svn: 205304
2014-04-01 12:24:25 +00:00
Tim Northover 4f1dd58e2e ARM64: add intrinsic for pmull (p64 x p64 = p128) operations.
llvm-svn: 205302
2014-04-01 12:22:37 +00:00
Aaron Ballman d1726ee8fa Fixing warnings in the MSVC build. No functional changes intended.
llvm-svn: 205301
2014-04-01 12:22:20 +00:00
Daniel Sanders ffd8436d6c [mips] Extend ParseJumpTarget to support the full symbol expression syntax.
Summary:
This should fix the issues the D3222 caused in lld. Testcase is based on
the one that failed in the buildbot.

Depends on D3233

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3234

llvm-svn: 205298
2014-04-01 10:41:48 +00:00
Daniel Sanders 315386c083 [mips] Use AsmLexer::peekTok() to resolve the conflict between $reg and $sym
Summary:
Parsing registers no longer consume the $ token before it's confirmed whether it really has a register or not, therefore it's no longer impossible to match symbols if registers were tried first.

Depends on D3232

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3233

llvm-svn: 205297
2014-04-01 10:40:14 +00:00
Daniel Sanders 0993457891 [mips] Hoist Parser.Lex() calls out of MatchAnyRegisterNameWithoutDollar()
Summary:
No functional change

Depends on D3222

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3232

llvm-svn: 205295
2014-04-01 10:37:46 +00:00
Tim Northover ff179ba3d3 ARM64: add patterns for more lane-wise ld1/st1 operations.
llvm-svn: 205294
2014-04-01 10:37:09 +00:00
Tim Northover d8d613b979 ARM64: fix bug in ld3r (1d) SelectionDAG.
llvm-svn: 205293
2014-04-01 10:37:03 +00:00
Daniel Sanders b50ccf8e26 [mips] Rewrite MipsAsmParser and MipsOperand.
Summary:
Highlights:
- Registers are resolved much later (by the render method).
  Prior to that point, GPR32's/GPR64's are GPR's regardless of register
  size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register
  size or FR mode. Numeric registers can be anything.
- All registers are parsed the same way everywhere (even when handling
  symbol aliasing)
  - One consequence is that all registers can be specified numerically
    almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing
    but that can be easily resolved.
- Removes the need for the hasConsumedDollar hack
- Parenthesis and Bracket suffixes are handled generically
- Micromips instructions are parsed directly instead of going through the
  standard encodings first.
- rdhwr accepts all 32 registers, and the following instructions that previously
  xfailed now work:
    ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d,
    c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1
- Diagnostics involving registers point at the correct character (the $)
- There's only one kind of immediate in MipsOperand. LSA immediates are handled
  by the predicate and renderer.

Lowlights:
- Hardcoded '$zero' in the div patterns is handled with a hack.
  MipsOperand::isReg() will return true for a k_RegisterIndex token
  with Index == 0 and getReg() will return ZERO for this case. Note that it
  doesn't return ZERO_64 on isGP64() targets.
- I haven't cleaned up all of the now-unused functions.
  Some more of the generic parser could be removed too (integers and relocs
  for example).
- insve.df needed a custom decoder to handle the implicit fourth operand that
  was needed to make it parse correctly. The difficulty was that the matcher
  expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this.

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3222

llvm-svn: 205292
2014-04-01 10:35:28 +00:00
Alexey Volkov 1328b28dc6 [x86] Do not convert to cmp32 for Atom arch by Sergey Okunev
Differential Revision: http://llvm-reviews.chandlerc.com/D2824

llvm-svn: 205288
2014-04-01 08:13:07 +00:00
Adam Nemet 10c4ce2584 [X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as well
Pretty obvious follow-on to r205159 to also handle conversion from double
besides float.

Fixes <rdar://problem/16373208>

llvm-svn: 205253
2014-03-31 21:54:48 +00:00
Matt Arsenault d6c4326786 R600/SI: Remove leftover pattern splitting 64-bit ors.
It's now matched to the scalar 64-bit or and split later if
necessary.'

llvm-svn: 205252
2014-03-31 21:46:46 +00:00
Manman Ren 63efd8e7e6 Register allocator: set CSRFirstUseCost to 5 for ARM64.
A value of 5 means if we have a split or spill option that has a really
low cost (1 << 14 is the entry frequency), we will choose to spill
or split the really cold path before using a callee-saved register.

This gives us the performance benefit on SPECInt2k and is also conservative.

rdar://16162005

llvm-svn: 205248
2014-03-31 21:06:36 +00:00
Matt Arsenault f751d6272d Change shouldSplitVectorElementType to better match the description.
Pass the entire vector type, and not just the element.

llvm-svn: 205247
2014-03-31 20:54:58 +00:00
Matt Arsenault d7bdcc46a6 R600/SI: Implement shouldConvertConstantLoadToIntImm
llvm-svn: 205244
2014-03-31 19:54:27 +00:00
Matt Arsenault 378bf9c68b R600: Compute masked bits for min and max
llvm-svn: 205242
2014-03-31 19:35:33 +00:00
Rafael Espindola ee1c342ef9 Don't relocate with sections if there might be a paired relocation.
llvm-svn: 205240
2014-03-31 19:00:23 +00:00
Daniel Sanders e34a120285 Revert: [mips] Rewrite MipsAsmParser and MipsOperand.' due to buildbot errors in lld tests.
It's currently unable to parse 'sym + imm' without surrounding parenthesis.

llvm-svn: 205237
2014-03-31 18:51:43 +00:00
Matt Arsenault 4c53717787 R600: Add BFE, BFI, and BFM intrinsics to help with writing tests.
llvm-svn: 205236
2014-03-31 18:21:18 +00:00
Matt Arsenault b34583661b R600: Add target nodes for BFM and BFI
llvm-svn: 205235
2014-03-31 18:21:13 +00:00
Saleem Abdulrasool 2070088bef ARM: fix typo
llvm-svn: 205233
2014-03-31 18:09:10 +00:00
Hal Finkel b4240ca0f4 [PowerPC] Don't ever expand BUILD_VECTOR of v2i64 with shuffles
If we have two unique values for a v2i64 build vector, this will always result
in two vector loads if we expand using shuffles. Only one is necessary.

llvm-svn: 205231
2014-03-31 17:48:16 +00:00
Daniel Sanders 0c648ba5be [mips] Rewrite MipsAsmParser and MipsOperand.
Summary:
Highlights:
- Registers are resolved much later (by the render method).
  Prior to that point, GPR32's/GPR64's are GPR's regardless of register
  size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register
  size or FR mode. Numeric registers can be anything.
- All registers are parsed the same way everywhere (even when handling
  symbol aliasing)
  - One consequence is that all registers can be specified numerically
    almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing
    but that can be easily resolved.
- Removes the need for the hasConsumedDollar hack
- Parenthesis and Bracket suffixes are handled generically
- Micromips instructions are parsed directly instead of going through the
  standard encodings first.
- rdhwr accepts all 32 registers, and the following instructions that previously
  xfailed now work:
    ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d,
    c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1
- Diagnostics involving registers point at the correct character (the $)
- There's only one kind of immediate in MipsOperand. LSA immediates are handled
  by the predicate and renderer.

Lowlights:
- Hardcoded '$zero' in the div patterns is handled with a hack.
  MipsOperand::isReg() will return true for a k_RegisterIndex token
  with Index == 0 and getReg() will return ZERO for this case. Note that it
  doesn't return ZERO_64 on isGP64() targets.
- I haven't cleaned up all of the now-unused functions.
  Some more of the generic parser could be removed too (integers and relocs
  for example).
- insve.df needed a custom decoder to handle the implicit fourth operand that
  was needed to make it parse correctly. The difficulty was that the matcher
  expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this.

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3222

llvm-svn: 205229
2014-03-31 17:43:46 +00:00
Hal Finkel 4c8f634f23 [PowerPC] Correct P7 dispatch unit allocation for vector instructions
llvm-svn: 205222
2014-03-31 17:02:10 +00:00
Eli Bendersky 6a0ccfb585 PR19099 - revert r203483
Now that r205212 was committed, r203483 is no longer necessary; it was a
temporary workaround that only handled a small number of the problematic cases.

llvm-svn: 205216
2014-03-31 16:11:57 +00:00
Christian Pirker 6d301b8ff2 ARM: change parameter names of the ELFARMAsmBackend constructor
I removed the underscore at the beginning of the parameter name,
because of a comment from Tim.

llvm-svn: 205215
2014-03-31 16:06:39 +00:00
Robert Khasanov ed0b2e9733 Test commit.
llvm-svn: 205214
2014-03-31 16:01:38 +00:00
Daniel Sanders d69adeb8e7 [mips] Fix use of uninitialized value reported by the sanitizer-x86_64-linux-bootstrap buildbot
llvm-svn: 205213
2014-03-31 15:58:58 +00:00
Eli Bendersky 264cd4672d Fix for PR19099 - NVPTX produces invalid symbol names.
This is a more thorough fix for the issue than r203483. An IR pass will run
before NVPTX codegen to make sure there are no invalid symbol names that can't
be consumed by the ptxas assembler.

llvm-svn: 205212
2014-03-31 15:56:26 +00:00
Tim Northover 5081cd0f81 ARM64: add extra patterns for scalar shifts
llvm-svn: 205209
2014-03-31 15:46:46 +00:00
Tim Northover e7834c3bbc ARM64: add extra scalar neg pattern & tests.
llvm-svn: 205208
2014-03-31 15:46:42 +00:00
Tim Northover 4468670345 ARM64: add patterns for scalar sqdmlal & sqdmlsl.
llvm-svn: 205207
2014-03-31 15:46:38 +00:00