Commit Graph

35772 Commits

Author SHA1 Message Date
Chih-Hung Hsieh 7993e18e80 [X86] Part 2 to fix x86-64 fp128 calling convention.
Part 1 was submitted in http://reviews.llvm.org/D15134.
Changes in this part:
* X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class.
* X86CallingConv.td: Pass f128 values in XMM registers or on stack.
* X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td:
  Add instruction selection patterns for f128.
* X86ISelLowering.cpp:
  When target has MMX registers, configure MVT::f128 in FR128RegClass,
  with TypeSoftenFloat action, and custom actions for some opcodes.
  Add missed cases of MVT::f128 in places that handle f32, f64, or vector types.
  Add TODO comment to support f128 type in inline assembly code.
* SelectionDAGBuilder.cpp:
  Fix infinite loop when f128 type can have
  VT == TLI.getTypeToTransformTo(Ctx, VT).
* Add unit tests for x86-64 fp128 type.

Differential Revision: http://reviews.llvm.org/D11438

llvm-svn: 255558
2015-12-14 22:08:36 +00:00
Dan Gohman 87b4aa8914 [WebAssembly] Add an assert to sanity-check dead flags.
The WebAssemblyStoreResults pass runs before LiveVariables, so it doesn't
expect to have to keep dead flags up to date; check this with an assert.

llvm-svn: 255551
2015-12-14 21:53:54 +00:00
Krzysztof Parzyszek 5e6f2bd0cb [Hexagon] Add "const" to function parameters in HexagonInstrInfo
llvm-svn: 255544
2015-12-14 21:32:25 +00:00
Krzysztof Parzyszek dac7102874 [Packetizer] Add AliasAnalysis as a parameter to the packetizer
This will make the depedence graph more accurate if an alias analysis
is provided. If nullptr is specified in its place, the behavior will
remain as it is currently.

llvm-svn: 255540
2015-12-14 20:35:13 +00:00
Yaron Keren 45ea8fa1f4 Save several std::string constructions using llvm::Twine.
llvm-svn: 255535
2015-12-14 19:28:40 +00:00
Krzysztof Parzyszek d44a1fd506 Add "const" to function arguments in DFAPacketizer
llvm-svn: 255526
2015-12-14 18:54:44 +00:00
Petar Jovanovic 280f7101e8 [Power PC] llvm soft float support for ppc32
This is the second in a set of patches for soft float support for ppc32,
it enables soft float operations.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D13700

llvm-svn: 255516
2015-12-14 17:57:33 +00:00
Matt Arsenault d079285e05 AMDGPU: Use generic bitreverse intrinsic
Also fix bug in vector legalization for bitreverse.

llvm-svn: 255512
2015-12-14 17:25:38 +00:00
Geoff Berry 8f5acb1bd1 Remove dead function AArch64TargetLowering::getFunctionAlignment. NFC.
Reviewers: t.p.northover, jmolloy, mcrosier

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D15458

llvm-svn: 255509
2015-12-14 17:01:10 +00:00
Matt Arsenault 52a52a564b AMDGPU: Fix splitting vector loads with existing offsets
If the original MMO had an offset, it was dropped.
Also use the correct alignment after adding the new offset.

llvm-svn: 255508
2015-12-14 16:59:40 +00:00
Krzysztof Parzyszek 759a7d0ed7 [Hexagon] Subtarget features/default CPU corrections
llvm-svn: 255501
2015-12-14 15:03:54 +00:00
Chad Rosier bc9d4f9947 [PPC] Early exit loop. NFC.
llvm-svn: 255497
2015-12-14 14:44:06 +00:00
Michael Zuckerman 02ecd43c63 [X86][inline asm] support even directive
The .even directive aligns content to an evan-numbered address.

In at&t syntax .even 
In Microsoft syntax even (without the dot).

Differential Revision: http://reviews.llvm.org/D15413

llvm-svn: 255462
2015-12-13 17:07:23 +00:00
Simon Pilgrim 3e0c022aed Fix line endings
llvm-svn: 255459
2015-12-13 12:49:48 +00:00
Cong Hou c106989fd5 Normalize MBB's successors' probabilities in several locations.
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximate one in MachineBlockPlacement
pass with some instrumented code (not in this patch).


Differential revision: http://reviews.llvm.org/D15259

llvm-svn: 255455
2015-12-13 09:26:17 +00:00
Saleem Abdulrasool 778c268594 ARM: only emit EABI attributes on EABI targets
EABI attributes should only be emitted on EABI targets.  This prevents the
emission of the optimization goals EABI attribute on Windows ARM.

llvm-svn: 255448
2015-12-13 05:27:45 +00:00
Simon Pilgrim 052191dd82 [X86][AVX512] Added support for VMOVQ shuffle comments
llvm-svn: 255442
2015-12-12 21:46:23 +00:00
David Majnemer 8a1c45d6e8 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

llvm-svn: 255422
2015-12-12 05:38:55 +00:00
Hal Finkel 98347d3f2c [PowerPC] OutStreamer cleanup in PPCAsmPrinter
We don't need to pass OutStreamer as a parameter to LowerSTACKMAP and
LowerPATCHPOINT. It is a member variable of PPCAsmPrinter, and thus, is already
available. NFC.

llvm-svn: 255418
2015-12-12 01:47:08 +00:00
Chen Li 1b26b9ec9d [X86ISelLowering] Add additional support for multiplication-to-shift conversion.
Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planing to add another patch to support negative cases after this.

Reviewers: craig.topper, RKSimon

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14603

llvm-svn: 255415
2015-12-12 01:04:15 +00:00
Hal Finkel 65539e3c94 [PowerPC] Add Branch Hints for Highly-Biased Branches
This branch adds hints for highly biased branches on the PPC architecture. Even
in absence of profiling information, LLVM will mark code reaching unreachable
terminators and other exceptional control flow constructs as highly unlikely to
be reached.

Patch by Tom Jablin!

llvm-svn: 255398
2015-12-12 00:32:00 +00:00
Derek Schuff 8f55497264 [WebAssembly] Update test expectations
Many tests are now passing due to eliminateFrameIndex implementation and
the list needs to be re-triaged because it unblocks other failures, and
some previous failures are different. However I'm about to churn it more
by implementing more lowering, so will wait on that.

llvm-svn: 255396
2015-12-12 00:18:40 +00:00
Chen Li 02ef2e1385 Revert rL255391: [X86ISelLowering] Add additional support for multiplication-to-shift conversion.
because it broke buildbot.

llvm-svn: 255395
2015-12-12 00:08:37 +00:00
Derek Schuff 9769debf88 [WebAssembly] Implement prolog/epilog insertion and FrameIndex elimination
Summary:
Use the SP32 physical register as the base for FrameIndex
lowering. Update it and the __stack_pointer global var in the prolog and
epilog. Extend the mapping of virtual registers to wasm locals to
include the physical registers.

Rather than modify the target-independent PrologEpilogInserter (which
asserts that there are no virtual registers left) include a
slightly-modified copy for Wasm that does not have this assertion and
only clears the virtual registers if scavenging was needed (which of
course it isn't for wasm).

Differential Revision: http://reviews.llvm.org/D15344

llvm-svn: 255392
2015-12-11 23:49:46 +00:00
Chen Li e8f9387e0c [X86ISelLowering] Add additional support for multiplication-to-shift conversion.
Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planing to add another patch to support negative cases after this.

Reviewers: craig.topper, RKSimon

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14603

llvm-svn: 255391
2015-12-11 23:39:32 +00:00
Hal Finkel cd8664c3c2 Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics
After much discussion, ending here:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html

it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.

r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation

llvm-svn: 255387
2015-12-11 23:11:52 +00:00
Matthias Braun 60d69e2865 CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()
computeRegisterLiveness() was broken in that it reported dead for a
register even if a subregister was alive. I assume this was because the
results of analayzePhysRegs() are hard to understand with respect to
subregisters.

This commit: Changes the results of analyzePhysRegs (=struct
PhysRegInfo) to be clearly understandable, also renames the fields to
avoid silent breakage of third-party code (and improve the grammar).

Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling
it and removing workarounds for the bug.

This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033

Differential Revision: http://reviews.llvm.org/D15320

llvm-svn: 255362
2015-12-11 19:42:09 +00:00
Matt Arsenault fbd9bbfda3 Start replacing vector_extract/vector_insert with extractelt/insertelt
These are redundant pairs of nodes defined for
INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT.
insertelement/extractelement are slightly closer to the corresponding
C++ node name, and has stricter type checking so prefer it.

Update targets to only use these nodes where it is trivial to do so.
AArch64, ARM, and Mips all have various type errors on simple replacement,
so they will need work to fix.

Example from AArch64:

def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
          (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;

Which is trying to do sext_inreg i8, i8.

llvm-svn: 255359
2015-12-11 19:20:16 +00:00
Derek Schuff 5a14306323 [WebAssembly] Fix ADJCALLSTACKDOWN/UP use/defs
Summary:
ADJCALLSTACK{DOWN,UP} (aka CALLSEQ_{START,END}) MIs are supposed to use
and def the stack pointer. Since they do not, all the nodes are being
eliminated by DeadMachineInstructionElim, so they aren't in the IR when
PrologEpilogInserter/eliminateCallFramePseudo needs them.

This change fixes that, but since RegStackify will not stackify across
them (and it runs early, before PEI), change LowerCall to only emit them
when the call frame size is > 0. That makes the current code work the
same way and makes code handled by D15344 also work the same way. We can
expand the condition beyond NumBytes > 0 in the future if needed.

Reviewers: sunfish, jfb

Subscribers: jfb, dschuff, llvm-commits

Differential Revision: http://reviews.llvm.org/D15459

llvm-svn: 255356
2015-12-11 18:55:34 +00:00
Hans Wennborg a8e6b3ecb7 Fix build after r255319.
llvm-svn: 255322
2015-12-11 00:58:32 +00:00
Kyle Butt 1452b76f1f [PPC]: Peephole optimize small accesss to aligned globals.
Access to aligned globals gives us a chance to peephole optimize nonzero
offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't
overflow the available displacement. For example:
        addis 3, 2, b4v@toc@ha
        addi 4, 3, b4v@toc@l
        lbz 5, b4v@toc@l(3) ; This is the result of the current peephole
        lbz 6, 1(4)         ; optimizer
        lbz 7, 2(4)
        lbz 8, 3(4)
If b4v is 4-byte aligned, we can skip using register 4 because we know
that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:
        addis 3, 2, b4v@toc@ha
        lbz 4, b4v@toc@l(3)
        lbz 5, b4v@toc@l+1(3)
        lbz 6, b4v@toc@l+2(3)
        lbz 7, b4v@toc@l+3(3)
Saving a register and an addition.
Larger alignments allow larger structures/arrays to be optimized.

llvm-svn: 255319
2015-12-11 00:47:36 +00:00
Cong Hou 59898d8c68 [X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1.
Previously in the conversion cost table there are no entries for integer-integer
conversions on SSE2. This will result in imprecise costs for certain vectorized
operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers
are counted from the result of running llc on the new test case in this patch.


Differential revision: http://reviews.llvm.org/D15132

llvm-svn: 255315
2015-12-11 00:31:39 +00:00
Kyle Butt 28b01a51b3 PPC: Teach FMA mutate to respect register classes.
This was causing bad code gen and assembly that won't assemble, as
mixed altivec and vsx code would end up with a vsx high register
assigned to an altivec instruction, which won't work. Constraining the
classes allows the optimization to proceed.

llvm-svn: 255299
2015-12-10 21:28:40 +00:00
Pirama Arumuga Nainar 1317d5f311 Fix fptosi, fptoui from f16 vectors to i8, i16 vectors
Summary:
Convert f16 vectors to corresponding f32 vectors before doing the
conversion to int.

Add tests for v4f16, v8f16.

Reviewers: ab, jmolloy

Subscribers: llvm-commits, srhines

Differential Revision: http://reviews.llvm.org/D14936

llvm-svn: 255263
2015-12-10 17:16:49 +00:00
Dan Gohman b949b9c01b [WebAssembly] Make WebAssemblyStoreResults only return true when it has a change.
llvm-svn: 255253
2015-12-10 14:17:36 +00:00
Dan Gohman a87629d6d7 [WebAssembly] Fix WebAssemblyPeephole to set Changed to true when making changes.
llvm-svn: 255252
2015-12-10 14:16:34 +00:00
Dan Gohman acc0941bd1 [WebAssembly] Declare that WebAssemblyPeephole does not modify the CFG.
llvm-svn: 255251
2015-12-10 14:12:04 +00:00
Dan Gohman 6d63f96749 [WebAssembly] Remove an unneeded getAnalysisUsage override.
llvm-svn: 255250
2015-12-10 14:10:04 +00:00
Nemanja Ivanovic ac8d01add0 Bitcasts between FP and INT values using direct moves
This patch corresponds to review:
http://reviews.llvm.org/D15286

LLVM IR frequently contains bitcast operations between floating point and
integer values of the same width. Doing this through memory operations is
quite expensive on PPC. This patch allows the use of direct register moves
between FPRs and GPRs for lowering bitcasts.

llvm-svn: 255246
2015-12-10 13:35:28 +00:00
Jonas Paulsson e451eeff5c [PostRA scheduling] Allow a target to do scheduling when it wants post RA.
SystemZ needs to do its scheduling after branch relaxation, which can
only happen after block placement, and therefore the standard
PostRAScheduler point in the pass sequence is too early.

TargetMachine::targetSchedulesPostRAScheduling() is a new method that
signals on returning true that target will insert the final scheduling
pass on its own.

Reviewed by Hal Finkel

llvm-svn: 255234
2015-12-10 09:10:07 +00:00
Craig Topper 8e44b9a4d1 [X86] Fix a couple cases were bitwise and logical operations were being mixed. NFC
llvm-svn: 255224
2015-12-10 06:09:41 +00:00
Dan Gohman f170ba08af [WebAssembly] Implement mixed-type ISD::FCOPYSIGN.
ISD::FCOPYSIGN permits its operands to have differing types, and DAGCombiner
uses this. Add some def : Pat rules to expand this out into an explicit
conversion and a normal copysign operation.

llvm-svn: 255220
2015-12-10 04:55:31 +00:00
Dan Gohman 9341c1d4b3 [WebAssembly] Implement fma.
It is lowered to a libcall for now, but this is expected to change in the future.

llvm-svn: 255219
2015-12-10 04:52:33 +00:00
Tom Stellard c2d654322b AMDGPU/SI: Fix warning introduced by r255204
llvm-svn: 255205
2015-12-10 03:10:46 +00:00
Tom Stellard c93fc11f36 AMDGPU/SI: Emit constant arrays in the .text section
Summary:
This allows us to remove the END_OF_TEXT_LABEL hack we had been using
and simplifies the fixups used to compute the address of constant
arrays.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15257

llvm-svn: 255204
2015-12-10 02:13:01 +00:00
Tom Stellard b3c3bda512 AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints
Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs.

Reviewers: arsenm, echristo

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15342

llvm-svn: 255203
2015-12-10 02:12:53 +00:00
Dan Gohman 60bddf17c5 [WebAssembly] Fix legalization of f32->f64 EXTLOAD.
llvm-svn: 255202
2015-12-10 02:07:53 +00:00
Derek Schuff 6fd28dfe5d [WebAssembly] Update known test failures
We can now select sign_extend_inreg

llvm-svn: 255197
2015-12-10 01:09:40 +00:00
Dan Gohman a5603b835b [WebAssembly] Also legalize sign_extend_inreg of i32->i64.
llvm-svn: 255191
2015-12-10 01:00:19 +00:00
Derek Schuff 71d0eae609 [WebAssembly] Update test failure expectations
llvm-svn: 255190
2015-12-10 00:56:18 +00:00
Dan Gohman a8483755d3 [WebAssembly] Fix legalization of shift operators with illegal types.
llvm-svn: 255181
2015-12-10 00:26:26 +00:00
Dan Gohman 7935fa3d1b [WebAssembly] Fix copy+pastos.
llvm-svn: 255180
2015-12-10 00:22:40 +00:00
Dan Gohman df00a9ebc2 [WebAssembly] Implement anyext.
llvm-svn: 255179
2015-12-10 00:17:35 +00:00
Quentin Colombet 5d2f7cfd44 [X86] Enable shrink-wrapping by default, but keep it disabled for stack frames
without a frame pointer when unwind may happen.
This is a workaround for a bug in the way we emit the CFI directives for
frameless unwind information. See PR25614.

llvm-svn: 255175
2015-12-09 23:08:18 +00:00
Dan Gohman 1cf96c0c34 [WebAssembly] Reintroduce ARGUMENT moving logic
Reinteroduce the code for moving ARGUMENTS back to the top of the basic block.
While the ARGUMENTS physical register prevents sinking and scheduling from
moving them, it does not appear to be sufficient to prevent SelectionDAG from
moving them down in the initial schedule. This patch introduces a patch that
moves them back to the top immediately after SelectionDAG runs.

This is still hopefully a temporary solution. http://reviews.llvm.org/D14750 is
one alternative, though the review has not been favorable, and proposed
alternatives are longer-term and have other downsides.

This fixes the main outstanding -verify-machineinstrs failures, so it adds
-verify-machineinstrs to several tests.

Differential Revision: http://reviews.llvm.org/D15377

llvm-svn: 255125
2015-12-09 16:23:59 +00:00
Tim Northover d91d635b36 ARM: don't use a deleted node as the BaseReg in complex pattern.
We mutated the DAG, which invalidated the node we were trying to use
as a base register. Sometimes we got away with it, but other times the
node really did get deleted before it was finished with.

Should fix PR25733

llvm-svn: 255120
2015-12-09 15:54:50 +00:00
JF Bastien 88f8014e8e WebAssembly: add missing failure to the list.
llvm-svn: 255119
2015-12-09 15:52:57 +00:00
Oliver Stannard 86f729296a [AArch64] Fix FP16 vector instructions that should only accept low registers
llvm-svn: 255113
2015-12-09 14:32:11 +00:00
Daniel Sanders 3c7223133d [mips][ias] Range check uimm10 operands
Summary:

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D15229

llvm-svn: 255112
2015-12-09 13:48:05 +00:00
JF Bastien c2b30484ae WebAssembly: add known failures
The bots are now running the torture tests properly. Bin all failures from the GCC C torture tests so that we can tackle failures and make the tree go red on regressions.

llvm-svn: 255111
2015-12-09 13:29:32 +00:00
Vasileios Kalintiris ddf7e6885a [mips] Use multiclass patterns for f32/f64 comparisons and i32 selects.
Summary:
Although the multiclass for i32 selects might seem redundant as it has
only one instantiation, we will use it to replace the correspondent
patterns in Mips64r6InstrInfo.td in follow-up commits.

Reviewers: dsanders

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14612

llvm-svn: 255110
2015-12-09 13:24:22 +00:00
Zlatko Buljan 48f1f39bfe Revert r254897 "[mips][microMIPS] Implement LH, LHE, LHU and LHUE instructions"
Commited patch was intended to implement LH, LHE, LHU and LHUE instructions.
After commit test-suite failed with error message in the form of:
fatal error: error in backend: Cannot select: t124: i32,ch = load<LD2[%d](tbaa=<0x94acc48>), sext from i16> t0, t2, undef:i32
For that reason I decided to revert commit r254897 and make new patch which besides implementation and standard regression tests will also have dedicated tests (CodeGen) for the above error. 

llvm-svn: 255109
2015-12-09 13:07:45 +00:00
Ahmed Bougacha 97564c3a1b [AArch64][ARM] Don't base interleaved op legality on type alloc size.
Otherwise, we think that most types that look like they'd fit in a
legal vector type are legal (so, basically, *any* vector type with a
size between 33 and 128 bits, I think, since we use pow2 alignment;
e.g., v2i25, v3f32, ...).

DataLayout::getTypeAllocSize rounds up based on alignment.
When checking for target intrinsic legality, that's not what we want:
if rounding makes a difference, the type isn't legal, and the
target intrinsics shouldn't be used, as they are always assumed legal.

One could make the argument that alloc size is ultimately the most
relevant here, since we're dealing with LD/ST intrinsics. That's only
true if we did legalize them though; that's a problem for another day.

Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits.
Type::getSizeInBits can't be used because that'd gratuitously break
pointer vector support.

Some of these uses are currently fine, because we only hit them when
the type is already known legal (e.g., r114454). Update them for
consistency. It's faster to avoid the rounding anyway!

llvm-svn: 255089
2015-12-09 01:19:50 +00:00
Vyacheslav Klochkov a3cd08b05c X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes.
Reviewer: Simon Pilgrim.
Differential Revision: http://reviews.llvm.org/D15317

llvm-svn: 255080
2015-12-09 00:12:13 +00:00
Pirama Arumuga Nainar e6ccd7b66a Define selection for v4f16, v8f16 scalar_to_vector
Summary:
This fixes failure when trying to select
    insertelement <4 x half> undef, half %a, i64 0
which gets transformed to a scalar_to_vector node.

The accompanying v4 and v8 tests fail instruction selection without this
patch.

Reviewers: ab, jmolloy

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D15322

llvm-svn: 255072
2015-12-08 23:07:06 +00:00
Simon Pilgrim 323e00d9c7 [X86][AVX] Fold loads + splats into broadcast instructions
On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector.

This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element.

Fix for PR23022

Differential Revision: http://reviews.llvm.org/D15310

llvm-svn: 255061
2015-12-08 22:17:11 +00:00
Artyom Skrobov 0a37b80bcb Fix ARMv4T (Thumb1) epilogue generation
Summary:
Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986;
so we need the special fixup in the epilogue.

Reviewers: jroelofs, qcolombet

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D15126

llvm-svn: 255047
2015-12-08 19:59:01 +00:00
Tim Northover 614e8ff855 X86: produce more friendly errors during MachO relocation handling
llvm-svn: 255036
2015-12-08 18:31:35 +00:00
Renato Golin 412ee3d45d [ARM] Allowing SP/PC for AND/BIC mod_imm_not
AND/BIC instructions do accept SP/PC, so the register class should be
more generic (rGPR -> GPR) to cope with that case. Adding more tests.

llvm-svn: 255034
2015-12-08 18:10:58 +00:00
Ron Lieberman e6540e244a [Hexagon] Add NewValueJump support for C4_cmpneq, C4_cmplte, C4_cmplteu
llvm-svn: 255027
2015-12-08 16:28:32 +00:00
Daniel Sanders 106d2d4693 [mips][ias] Range check uimm8 operands
Summary:

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D15226

llvm-svn: 255018
2015-12-08 14:42:10 +00:00
Daniel Sanders 59d092f883 [mips][ias] Range check uimm6 operands and fix a bug this revealed.
Summary:
We don't check the size operand on ext/dext*/ins/dins* yet because the
permitted range depends on the pos argument and we can't check that using
this mechanism.

The bug was that dextu/dinsu accepted 0..31 in the pos operand instead of 32..63.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D15190

llvm-svn: 255015
2015-12-08 13:49:19 +00:00
Oliver Stannard e4c3d21ea6 [AArch64] Add ARMv8.2-A FP16 vector instructions
ARMv8.2-A adds 16-bit floating point versions of all existing SIMD
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

Note that VFP without SIMD is not a valid combination for any version of
ARMv8-A, but I have ensured that these instructions all depend on both
FeatureNEON and FeatureFullFP16 for consistency.

The ".2h" vector type specifier is now legal (for the scalar pairwise
reduction instructions), so some unrelated tests have been modified as
different error messages are emitted. This is not a problem as the
invalid operands are still caught.

llvm-svn: 255010
2015-12-08 12:16:10 +00:00
Dan Gohman 31448f16b6 [WebAssembly] Fix a typo in a comment.
llvm-svn: 254999
2015-12-08 03:43:03 +00:00
Dan Gohman fd98ea89d9 [WebAssembly] Remove an unneeded static_cast.
llvm-svn: 254998
2015-12-08 03:42:50 +00:00
Dan Gohman 7f970765ea [WebAssembly] Fix an emacs syntax highlighting comment.
llvm-svn: 254997
2015-12-08 03:36:00 +00:00
Dan Gohman ad664b3bda [WebAssembly] Convert a file-level comment to doxygen style.
llvm-svn: 254996
2015-12-08 03:33:51 +00:00
Dan Gohman d70e5907cd [WebAssembly] Assert MRI.isSSA() in passes that depend on SSA form.
llvm-svn: 254995
2015-12-08 03:30:42 +00:00
Dan Gohman a8551b4d7e [WebAssembly] Trim some unneeded #includes.
llvm-svn: 254994
2015-12-08 03:25:35 +00:00
Dan Gohman 4a84b7322f [WebAssembly] Remove the override of haveFastSqrt.
The default implementation in BasicTTI already checks TLI and does
the right thing.

llvm-svn: 254993
2015-12-08 03:22:33 +00:00
Manman Ren cb8470b4b5 [CXX TLS calling convention] Add support for AArch64.
rdar://9001553

llvm-svn: 254978
2015-12-08 00:14:38 +00:00
Kit Barton a1c712fae5 [PPC64] Convert bool literals to i32
Convert i1 values to i32 values if they should be allocated in GPRs instead of CRs.

Phabricator: http://reviews.llvm.org/D14064
llvm-svn: 254942
2015-12-07 20:50:29 +00:00
Sanjay Patel a6bdd70f4b don't repeat function names in comments; NFC
llvm-svn: 254930
2015-12-07 19:31:34 +00:00
Sanjay Patel e4b9f507cf fix 'the the '; NFC
llvm-svn: 254928
2015-12-07 19:21:39 +00:00
Sanjay Patel f9bdb872bd remove redundant check: optForSize() includes a check for the minsize attribute; NFCI
llvm-svn: 254925
2015-12-07 19:13:40 +00:00
Elena Demikhovsky 291fe0159f VX-512: Fixed a bug in FP logic operation lowering
FP logic instructions are supported in DQ extension on AVX-512 target.
I use integer operations instead.
Added tests.
I also enabled FABS in this patch in order to check ANDPS.
The operations are FOR, FXOR, FAND, FANDN.
The instructions, that supported for 512-bit vector under DQ are:
VORPS/PD, VXORPS/PD, VANDPS/PD, FANDNPS/PD.

Differential Revision: http://reviews.llvm.org/D15110

llvm-svn: 254913
2015-12-07 14:33:34 +00:00
Artyom Skrobov e9b3fb8603 [ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM.
Summary: This reverts r254234, and adds a simple fix for the annoying case of use-after-free.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D15236

llvm-svn: 254912
2015-12-07 14:22:39 +00:00
Elena Demikhovsky 33e61eceb4 AVX-512: Fixed masked load / store instruction selection for KNL.
Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store.

This intrinsic comes with all legal types:
<8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru),
but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only.

All data operands should be widened to 512-bit vector.
The mask operand should be widened to v16i1 with zeroes.

Differential Revision: http://reviews.llvm.org/D15265

llvm-svn: 254909
2015-12-07 13:39:24 +00:00
Igor Breger 3ab6f17530 AVX-512: implement kunpck intrinsics.
Differential Revision: http://reviews.llvm.org/D14821

llvm-svn: 254908
2015-12-07 13:25:18 +00:00
Marina Yatsina 497d44a081 [X86] Adding support for FWORD type for MS inline asm
Adding support for FWORD type for MS inline asm.

Differential Revision: http://reviews.llvm.org/D15268

llvm-svn: 254904
2015-12-07 13:09:20 +00:00
Bradley Smith d5a1f47a63 [ARM] Flag vcvt{t,b} with an f16 type specifier as part of the FP16 extension
Additionally correct the Cortex-R7 definition to allow the FP16 feature.

llvm-svn: 254900
2015-12-07 10:54:36 +00:00
Zlatko Buljan 1a01c15027 [mips][microMIPS] Implement LH, LHE, LHU and LHUE instructions
Differential Revision: http://reviews.llvm.org/D9824

llvm-svn: 254897
2015-12-07 08:29:31 +00:00
Dan Gohman 5e0886beb7 [WebAssembly] Factor out a TypeToString function, since we need it in multiple places.
llvm-svn: 254884
2015-12-06 19:42:29 +00:00
Dan Gohman 770f0d0a40 [WebAssembly] Make tableswitch's 'default' operand explicit. NFC.
llvm-svn: 254883
2015-12-06 19:34:57 +00:00
Dan Gohman a4b710a74f [WebAssembly] Enable folding of offsets into global variable addresses.
llvm-svn: 254882
2015-12-06 19:33:32 +00:00
Dan Gohman 753abf8de5 [WebAssembly] Add some more ideas to README.txt.
llvm-svn: 254880
2015-12-06 19:29:54 +00:00
Marina Yatsina 1d1aa0b0a8 [X86] Add support for loopz, loopnz for Intel syntax
According to x86 spec, loopz and loopnz should be supported for Intel syntax, where loopz is equivalent to loope and loopnz is equivalent to loopne.

Differential Revision: http://reviews.llvm.org/D15148

llvm-svn: 254877
2015-12-06 15:31:47 +00:00
Asaf Badouh 41ecf460fa [X86][AVX512] add vmovss/sd missing encoding
Differential Revision: http://reviews.llvm.org/D14701

llvm-svn: 254875
2015-12-06 13:26:56 +00:00
Michael Kuperstein 77ce9d3b1a [X86] Always generate precise CFA adjustments.
This removes the code path that generate "synchronous" (only correct at call site) CFA.
We will probably want to re-introduce it once we are capable of emitting different
.eh_frame and .debug_frame sections.

Differential Revision: http://reviews.llvm.org/D14948

llvm-svn: 254874
2015-12-06 13:06:20 +00:00
Igor Breger 076dfe5c12 AVX512: support AVX512BW Intrinsic in 32bit mode.
Differential Revision: http://reviews.llvm.org/D15076

llvm-svn: 254873
2015-12-06 11:35:18 +00:00
Craig Topper 15576e1c8f Use make_range to reduce mentions of iterator type. NFC
llvm-svn: 254872
2015-12-06 05:08:07 +00:00
Dan Gohman d85c3b1fbc [WebAssembly] Don't perform the returned-argument optimization on constants.
llvm-svn: 254866
2015-12-05 22:12:39 +00:00
Dan Gohman 3bb55e98e2 [WebAssembly] Replace the fake JUMP_TABLE instruction with a def : Pat. NFC.
llvm-svn: 254864
2015-12-05 20:46:53 +00:00
Dan Gohman e2a7a8278f [WebAssembly] Implement direct calls to external symbols.
llvm-svn: 254863
2015-12-05 20:41:36 +00:00
Dan Gohman 284384b640 [WebAssembly] Support inline asm constraints of type i16 and similar.
llvm-svn: 254861
2015-12-05 20:03:44 +00:00
Dan Gohman 905bef5cf9 [WebAssembly] Update a stale comment. NFC.
llvm-svn: 254859
2015-12-05 19:43:19 +00:00
JF Bastien f05f6fd1e2 WebAssembly: improve readme, add placeholder for tests.
llvm-svn: 254857
2015-12-05 19:36:33 +00:00
Dan Gohman 7615e46919 [WebAssembly] Move useAA() out of line to make it more convenient to experiment with.
llvm-svn: 254856
2015-12-05 19:27:18 +00:00
Dan Gohman b0921ca9e1 [WebAssembly] Call TargetPassConfig base class functions in overriding functions.
llvm-svn: 254855
2015-12-05 19:24:17 +00:00
Dan Gohman ebb23545de [WebAssembly] Expand frem as a floating point library function.
llvm-svn: 254854
2015-12-05 19:15:57 +00:00
Craig Topper 5c32279bee [Hexagon] Don't call getNumImplicitDefs and then iterate over the count. getNumImplicitDefs contains a loop so its better to just loop over the null terminated implicit def list. NFC
llvm-svn: 254852
2015-12-05 17:34:07 +00:00
Simon Pilgrim 4ba5969224 [X86][ADX] Added memory folding patterns and stack folding tests
llvm-svn: 254844
2015-12-05 07:27:50 +00:00
Craig Topper e5e035a3a8 Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef.
llvm-svn: 254843
2015-12-05 07:13:35 +00:00
Simon Pilgrim 5a64d98303 [X86][FMA4] Explicitly set the domain of FMA4 float/double scalar instructions
Both were defaulting to the float domain - now matches the packed instructions.

llvm-svn: 254841
2015-12-05 07:07:42 +00:00
Dan Gohman f0b165a7f8 [WebAssembly] Implement ReverseBranchCondition, and re-enable MachineBlockPlacement
This patch introduces a codegen-only instruction currently named br_unless,
which makes it convenient to implement ReverseBranchCondition and re-enable
the MachineBlockPlacement pass. Then in a late pass, it lowers br_unless
back into br_if.

Differential Revision: http://reviews.llvm.org/D14995

llvm-svn: 254826
2015-12-05 03:03:35 +00:00
Dan Gohman 4da4abd87f [WebAssembly] Fix scheduling dependencies in register-stackified code
Add physical register defs to instructions used from stackified
instructions to prevent them from being scheduled into the middle of
a stack sequence. This is a conservative measure which may be loosened
in the future.

Differential Revision: http://reviews.llvm.org/D15252

llvm-svn: 254811
2015-12-05 00:51:40 +00:00
Derek Schuff 9d77952332 [WebAssembly] Support constant offsets on loads and stores
This is just prototype for load/store for i32 types. I'll add them to
the rest of the types if we like this direction.

Differential Revision: http://reviews.llvm.org/D15197

llvm-svn: 254807
2015-12-05 00:26:39 +00:00
Philip Reames 7c6692de16 [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
When the notion of target specific memory intrinsics was introduced to EarlyCSE, the commit confused the notions of volatile and simple memory access.  Since I'm about to start working on this area, cleanup the naming so that patches aren't horribly confusing.  Note that the actual implementation was always bailing if the load or store wasn't simple.  

Reminder:
- "volatile" - C++ volatile, can't remove any memory operations, but in principal unordered
- "ordered" - imposes ordering constraints on other nearby memory operations
- "atomic" - can't be split or sheared.  In LLVM terms, all "ordered" operations are also atomic so the predicate "isAtomic" is often used.
- "simple" - a load which is none of the above.  These are normal loads and what most of the optimizer works with.

llvm-svn: 254805
2015-12-05 00:18:33 +00:00
Hans Wennborg fbf2822e6d Add FeatureLAHFSAHF to amdfam10 as well.
llvm-svn: 254801
2015-12-04 23:32:19 +00:00
Dan Gohman 35bfb24c28 [WebAssembly] Initial varargs support.
Full varargs support will depend on prologue/epilogue support, but this patch
gets us started with most of the basic infrastructure.

Differential Revision: http://reviews.llvm.org/D15231

llvm-svn: 254799
2015-12-04 23:22:35 +00:00
Hans Wennborg 5000ce8a63 X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supported
These instructions are not supported by all CPUs in 64-bit mode. Emitting them
causes Chromium to crash on start-up for users with such chips.

(GCC puts these instructions behind -msahf on 64-bit for the same reason.)

This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets
and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering
from before r244503 when the instructions are not available.

Differential Revision: http://reviews.llvm.org/D15240

llvm-svn: 254793
2015-12-04 23:00:33 +00:00
Chad Rosier f3491496dc [AArch64] Expand vector SDIVREM/UDIVREM operations.
http://reviews.llvm.org/D15214
Patch by Ana Pazos <apazos@codeaurora.org>!

llvm-svn: 254773
2015-12-04 21:38:44 +00:00
Dan Gohman 1ce2b1afd6 [WebAssembly] Add several more calling conventions to the supported list.
llvm-svn: 254741
2015-12-04 18:27:03 +00:00
Sanjay Patel 1640c54593 fix formatting; NFC
llvm-svn: 254739
2015-12-04 17:51:55 +00:00
Manman Ren 19c7bbe3b7 [CXX TLS calling convention] Add CXX TLS calling convention.
This commit adds a new target-independent calling convention for C++ TLS
access functions. It aims to minimize overhead in the caller by perserving as
many registers as possible.

The target-specific implementation for X86-64 is defined as following:
  Arguments are passed as for the default C calling convention
  The same applies for the return value(s)
  The callee preserves all GPRs - except RAX and RDI

The access function makes C-style TLS function calls in the entry and exit
block, C-style TLS functions save a lot more registers than normal calls.
The added calling convention ties into the existing implementation of the
C-style TLS functions, so we can't simply use existing calling conventions
such as preserve_mostcc.

rdar://9001553

llvm-svn: 254737
2015-12-04 17:40:13 +00:00
Dan Gohman 541841e365 [WebAssembly] Give names to the callseq begin and end instructions.
llvm-svn: 254730
2015-12-04 17:19:44 +00:00
Dan Gohman a3f5ce5f1b [WebAssembly] clang-format CallingConvSupported. NFC.
llvm-svn: 254729
2015-12-04 17:18:32 +00:00
Dan Gohman 85dbdda1ed [WebAssembly] Factor out the list of supported calling conventions.
llvm-svn: 254728
2015-12-04 17:16:07 +00:00
Dan Gohman 2d822e73fa [WebAssembly] Check for more unsupported ABI flags.
llvm-svn: 254727
2015-12-04 17:12:52 +00:00
Dan Gohman cb7940f9f5 [WebAssembly] Use SelectionDAG::getUNDEF. NFC.
llvm-svn: 254726
2015-12-04 17:09:42 +00:00
Krzysztof Parzyszek f1b3e5e52e [Hexagon] Simplify LowerCONCAT_VECTORS, handle different types better
llvm-svn: 254724
2015-12-04 16:18:15 +00:00
Colin LeMahieu 4c606e66a7 [Hexagon] Using multiply instead of shift on signed number which can be UB
llvm-svn: 254719
2015-12-04 15:48:45 +00:00
Jonas Paulsson 7fa69cd5dd [SystemZ] Bugfix: Don't add CC twice to new three-address instruction.
Since BuildMI() automatically adds the implicit operands for a new instruction,
adding the old instructions CC operand resulted in that there were two CC imp-def
operands, where only one was marked as dead. This caused buildSchedGraph() to
miss dependencies on the CC reg.

Review by Ulrich Weigand

llvm-svn: 254714
2015-12-04 12:48:51 +00:00
Alexey Bataev 7cf324772f LEA code size optimization pass (Part 1): Remove redundant address recalculations, by Andrey Turetsky
Add new x86 pass which replaces address calculations in load or store instructions with def register of existing LEA (must be in the same basic block), if the LEA calculates address that differs only by a displacement. Works only with -Os or -Oz.
Differential Revision: http://reviews.llvm.org/D13294

llvm-svn: 254712
2015-12-04 10:53:15 +00:00
Quentin Colombet 901f036353 [ARM] When a bitcast is about to be turned into a VMOVDRR, try to combine it
with its source instead of forcing the values on GPRs.

This improves the lowering of vector code when such bitcasts happen in the
middle of vector computations.

rdar://problem/23691584 

llvm-svn: 254684
2015-12-04 01:53:14 +00:00
JF Bastien 580b6572b5 X86InstrInfo::copyPhysReg: workaround reg liveness
Summary:
computeRegisterLiveness and analyzePhysReg are currently getting
confused about liveness in some cases, breaking copyPhysReg's
calculation of whether AX is dead in some cases. Work around this issue
temporarily by assuming that AX is always live.

See detail in: https://llvm.org/bugs/show_bug.cgi?id=25033#c7
And associated bugs PR24535 PR25033 PR24991 PR24992 PR25201.

This workaround makes the code correct but slightly inefficient, but it
seems to confuse the machine instr verifier which now things EAX was
undefined in some cases where it's being conservatively saved /
restored.

Reviewers: majnemer, sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15198

llvm-svn: 254680
2015-12-04 01:18:17 +00:00
Dan Gohman 391a98afd5 [WebAssembly] Fix dominance check for PHIs in the StoreResult pass
When a block has no terminator instructions, getFirstTerminator() returns
end(), which can't be used in dominance checks. Check dominance for phi
operands separately.

Also, remove some bits from WebAssemblyRegStackify.cpp that were causing
trouble on the same testcase; they were left behind from an earlier
experiment.

Differential Revision: http://reviews.llvm.org/D15210

llvm-svn: 254662
2015-12-03 23:07:03 +00:00
Chih-Hung Hsieh ed7d81e5d4 [X86] Part 1 to fix x86-64 fp128 calling convention.
Almost all these changes are conditioned and only apply to the new
x86-64 f128 type configuration, which will be enabled in a follow up
patch. They are required together to make new f128 work. If there is
any error, we should fix or revert them as a whole.
These changes should have no impact to current configurations.

* Relax type legalization checks to accept new f128 type configuration,
  whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has
  TLI.isTypeLegal true.
* Relax GetSoftenedFloat to return in some cases f128 type SDValue,
  which is TLI.isTypeLegal but not "softened" to i128 node.
* Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration,
  to generate optimized bitwise operators for libm functions.
* Enhance related Lower* functions to handle f128 type.
* Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions
  to keep new f128 type in register, and convert f128 operators to library calls.
* Fix Combiner, Emitter, Legalizer routines that did not handle f128 type.
* Add ExpandConstant to handle i128 constants, ExpandNode
  to handle ISD::Constant node.
* Add one more parameter to getCommonSubClass and firstCommonClass,
  to guarantee that returned common sub class will contain the specified
  simple value type.
  This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp.
* Fix infinite loop in getTypeLegalizationCost when f128 is the value type.
* Fix printOperand to handle null operand.
* Enhance ISD::BITCAST node to handle f128 constant.
* Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes.
* Enhance X86AsmPrinter to emit f128 values in comments.

Differential Revision: http://reviews.llvm.org/D15134

llvm-svn: 254653
2015-12-03 22:02:40 +00:00
Colin LeMahieu 15ca65c253 [Hexagon] Adding shuffling resources for HVX instructions and tests for instruction encodings.
llvm-svn: 254652
2015-12-03 21:44:28 +00:00
Reid Kleckner 93fc520339 [X86] Put no-op ADJCALLSTACK markers around all dynamic lowerings
Summary:
These ADJCALLSTACK markers don't generate code, but they keep dynamic
alloca code that calls chkstk out of the prologue.

This slightly pessimizes inalloca calls by preventing some register copy
coalescing, but I can live with that.

Reviewers: qcolombet

Subscribers: hans, llvm-commits

Differential Revision: http://reviews.llvm.org/D15200

llvm-svn: 254645
2015-12-03 20:46:59 +00:00
Krzysztof Parzyszek 7709aa0e07 [Hexagon] Remove variable unused in NDEBUG build
llvm-svn: 254623
2015-12-03 17:53:34 +00:00
Matthias Braun 0d4505c067 AArch64FastISel: Use cbz/cbnz to branch on i1
In the case of a conditional branch without a preceding cmp we used to emit
a "and; cmp; b.eq/b.ne" sequence, use tbz/tbnz instead.

Differential Revision: http://reviews.llvm.org/D15122

llvm-svn: 254621
2015-12-03 17:19:58 +00:00
Krzysztof Parzyszek c168c0165c [Hexagon] Implement CONCAT_VECTORS for HVX using V6_vcombine
llvm-svn: 254617
2015-12-03 16:47:20 +00:00
Colin LeMahieu 7c572b2125 [Hexagon] NFC Using canonicalizePacket to compound/duplex/pad packets rather than doing it separately. This also ensures the integrated assembler path matches the assembly parser path.
llvm-svn: 254616
2015-12-03 16:37:21 +00:00
Krzysztof Parzyszek 25ddd2c9e8 [Hexagon] Fix instruction descriptor flags for memory access size
llvm-svn: 254613
2015-12-03 15:41:33 +00:00
Marina Yatsina 4b1aea0802 [X86] MS inline asm: produce error when encountering "<type> ptr <reg name>"
Currently "<type> ptr <reg name>" treated as <reg name> in MS inline asm, ignoring the "<type> ptr" completely and possibly ignoring the intention of the user.
Fixed llvm to produce an error when encountering "<type> ptr <reg name>" operands.

For example: andpd xmm1,xmmword ptr xmm1 --> andpd xmm1, xmm1 
though andpd has 2 possible matching formats - andpd xmm, xmm/m128

Patch by: ziv.izhar@intel.com
Differential Revision: http://reviews.llvm.org/D14607

llvm-svn: 254607
2015-12-03 12:17:03 +00:00
Marina Yatsina 90d9ffa7d6 [X86] Add support for fcomip, fucomip for Intel syntax
According to x86 spec, fcomip and fucomip should be supported for Intel syntax.

Differential Revision: http://reviews.llvm.org/D15104

llvm-svn: 254595
2015-12-03 08:55:33 +00:00
Tom Stellard 9760f03757 AMDGPU/SI: Emit constant arrays in the .hsrodata_readonly_agent section
Summary: This is done only when targeting HSA.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13807

llvm-svn: 254587
2015-12-03 03:34:32 +00:00
Joerg Sonnenberger 48eb197434 Add a TODO item that the nop handling before FP conditional branches is
not enough for SPARCv7.

llvm-svn: 254580
2015-12-03 02:35:24 +00:00
Derek Schuff 5268aaf7b6 [WebAssembly] Add a test for wasm-store-results pass
Differential Revision: http://reviews.llvm.org/D15167

llvm-svn: 254570
2015-12-03 00:50:30 +00:00
Dan Gohman ac132e9305 [WebAssembly] Assert that byval and nest are not used for return types.
llvm-svn: 254567
2015-12-02 23:40:03 +00:00
Krzysztof Parzyszek 8d8b229de9 [Hexagon] Improve lowering of instructions to the MC layer
- Add extenders when necessary.
- Handle some basic relocations.

This should fix the failure in tools/clang/test/CodeGenCXX/crash.cpp

llvm-svn: 254564
2015-12-02 23:08:29 +00:00
David Majnemer 70497c696a Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Alexey Samsonov 44ff204fad Fixup for r254547: use format_hex() to simplify code.
llvm-svn: 254560
2015-12-02 22:59:22 +00:00
Alexey Samsonov 39b7d65d82 [PowerPC] Remove wild call to RegScavenger::initRegState().
This call should in fact be made by RegScavenger::enterBasicBlock()
called below. The first call does nothing except for triggering UB,
indicated by UBSan (passing nullptr to memset()).

llvm-svn: 254548
2015-12-02 21:25:28 +00:00
Alexey Samsonov bcfabaa05b [Hexagon] Remove std::hex in favor of format().
std::hex is not used anywhere in LLVM code base except for this place,
and it has a known undefined behavior (at least in libstdc++ 4.9.3):
https://llvm.org/bugs/show_bug.cgi?id=18156, which fires in UBSan
bootstrap of LLVM.

llvm-svn: 254547
2015-12-02 21:13:43 +00:00
Tom Stellard 00f2f91af4 AMDGPU/SI: Correctly emit agent global segment variables when targeting HSA
Differential Revision: http://reviews.llvm.org/D14508

llvm-svn: 254540
2015-12-02 19:47:57 +00:00
Krzysztof Parzyszek de25ecfa62 [Hexagon] Remove TFRI_V4 instruction, use existing A2_tfrsi instead
llvm-svn: 254539
2015-12-02 19:44:35 +00:00
Kyle Butt 015f4fc854 Test Commit: iteratee
Remove whitespace from blank lines. NFC

llvm-svn: 254531
2015-12-02 18:53:33 +00:00
Tom Stellard e928533dae AMDGPU: Fix msan test failure
llvm-svn: 254527
2015-12-02 18:35:23 +00:00
Tim Northover f520eff782 AArch64: use ldxp/stxp pair to implement 128-bit atomic loads.
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.

llvm-svn: 254524
2015-12-02 18:12:57 +00:00
Dan Gohman 53d1399792 [WebAssembly] Fix comments to say "LIFO" instead of "FIFO" when describing a stack.
llvm-svn: 254523
2015-12-02 18:08:49 +00:00
Tom Stellard e3b5aeaf83 AMDGPU/SI: Don't emit group segment global variables
Summary: Only global or readonly segment variables should appear in object files.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15111

llvm-svn: 254519
2015-12-02 17:00:42 +00:00
Michael Zuckerman 15152a5c41 By intel spec
|9B DD /7| FSTSW m2byte| Valid Valid Store FPU status word at m2byteafter checking for pending unmasked floating-point exceptions.|
|9B DF E0| FSTSW AX| Valid Valid Store FPU status word in AX register after checking for pending unmasked floating-point exceptions.|
|DD /7 |FNSTSW *m2byte| Valid Valid Store FPU status word at m2bytewithout checking for pending unmasked floating-point exceptions.|
|DF E0 |FNSTSW *AX| Valid Valid Store FPU status word in AX register without checking for pending unmasked floating-point exceptions|

m2byte is word register, and therefor instruction operand need to be change from f32mem to i16mem.

Differential Revision: http://reviews.llvm.org/D14953

llvm-svn: 254512
2015-12-02 14:34:34 +00:00
Christof Douma 8b5dc2c94e [AArch64]: Add support for Cortex-A35
Adds support for the new Cortex-A35 ARMv8-A core.

llvm-svn: 254503
2015-12-02 11:53:44 +00:00
Nemanja Ivanovic 74e31bc929 Patch to fix a crash in the PowerPC back end due to ISD::ROTL and ISD::ROTR
not being expanded. Test case included.

llvm-svn: 254501
2015-12-02 10:36:24 +00:00
Hrvoje Varga 672b0f5582 [mips][microMIPS] Implement PREPEND, RADDU.W.QB, RDDSP, REPL.PH, REPL.QB, REPLV.PH, REPLV.QB and MTHLIP instructions
Differential Revision: http://reviews.llvm.org/D14527

llvm-svn: 254496
2015-12-02 09:31:24 +00:00
Simon Pilgrim 3fc3454a0c [X86][FMA] Optimize FNEG(FMUL) Patterns
On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0)

Fix for PR24366

Differential Revision: http://reviews.llvm.org/D14909

llvm-svn: 254495
2015-12-02 09:07:55 +00:00
Elena Demikhovsky a1a40cce9f AVX-512: Updated cost of FP/SINT/UINT conversion operations
I checked and updated the cost of AVX-512 conversion operations. Added cost of conversion operations in DQ mode.
Conversion of illegal types that requires vector split is not calculated right now (like for other X86 targets).

Differential Revision: http://reviews.llvm.org/D15074

llvm-svn: 254494
2015-12-02 08:59:47 +00:00
Asaf Badouh 2489f350c0 [X86][AVX512] add comi with Sae
add builtin_ia32_vcomisd and builtin_ia32_vcomisd

Differential Revision: http://reviews.llvm.org/D14331

llvm-svn: 254493
2015-12-02 08:17:51 +00:00
Craig Topper f419a1f69a [X86] Change getZeroVector to take an MVT instead of EVT. One minor change needed to only try to perform 256-it shuffle combines on legal vector types.
llvm-svn: 254490
2015-12-02 06:39:19 +00:00
Craig Topper 6164297f46 [X86] Fix weird identation. NFC
llvm-svn: 254487
2015-12-02 05:24:38 +00:00
Quentin Colombet bbdebefff6 [X86] Fix a think-o when checking if the eflags needs to be preserved.
llvm-svn: 254480
2015-12-02 02:07:00 +00:00
Quentin Colombet f1e91c8bf1 [X86] Make sure the prologue does not clobber EFLAGS when it lives accross it.
This is a superset of the fix done in r254448.

This fixes PR25607.

llvm-svn: 254478
2015-12-02 01:22:54 +00:00
Tim Northover f3be9d5c0b AArch64: fix 128-bit shifts
We mustn't introduce a shift of exactly 64-bits for any inputs, since that's an
UNDEF value (and worse, it's not what you want with the natural Arch64
implementation).

The generated code is pretty horrific, but I couldn't come up with an obviously
better alternative (if the amount is constant EXTR could help). Turns out
128-bit shifts are just nasty.

rdar://22491037

llvm-svn: 254475
2015-12-02 00:33:54 +00:00
Matt Arsenault 592d068198 AMDGPU: Error on addrspacecasts that aren't actually implemented
llvm-svn: 254469
2015-12-01 23:04:05 +00:00
Matt Arsenault f9bfeafd00 AMDGPU: Implement isNoopAddrSpaceCast
llvm-svn: 254468
2015-12-01 23:04:00 +00:00
Matthias Braun b258d794dd ARM: Change ArchCheck field to uint64_t
The values in this field are compared against getAvailableFeatures()
which returns an uint64_t. This was causing problems in an internal
branch.

llvm-svn: 254462
2015-12-01 21:48:52 +00:00
Matt Arsenault 3b15967008 AMDGPU: Disallow flat_scr in SI assembler
llvm-svn: 254459
2015-12-01 20:31:08 +00:00
Matt Arsenault 856d1928a8 AMDGPU: Optimize VOP2 operand legalization
Don't use commuteInstruction, and don't commute if
doing so will not improve legality. Skip the more
complex checks for literal operands and constant bus restrictions,
which are not a concern for VOP2 instructions because src1
does not accept SGPRs or constants and few implicitly
read vcc.

This gets called quite a few times and the
attempts at commuting are a significant fraction
of the time spent in SIFixSGPRCopies, so it's
somewhat worthwhile to optimize. With this patch and others
leading up to it, this reduces the compile time of SIFixSGPRCopies
on some of the LuxMark 2 kernels from ~8ms to ~5ms on my system.

llvm-svn: 254452
2015-12-01 19:57:17 +00:00
Quentin Colombet 9cb01aa30a [X86] Make sure the prologue does not clobber EFLAGS when it lives accross it.
This fixes PR25629.

llvm-svn: 254448
2015-12-01 19:49:31 +00:00
Artyom Skrobov 5d1f2524a0 Fix Thumb1 epilogue generation
Summary:
This had been broken for a very long time, but nobody noticed until
D14357 enabled shrink-wrapping by default.

Reviewers: jroelofs, qcolombet

Subscribers: tyomitch, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14986

llvm-svn: 254444
2015-12-01 19:25:11 +00:00
Weiming Zhao 56ab51870c [AArch64] Fix a corner case in BitFeild select
Summary:
When not useful bits, BitWidth becomes 0 and APInt will not be happy.

See https://llvm.org/bugs/show_bug.cgi?id=25571

We can just mark the operand as IMPLICIT_DEF is none bits of it is used.

Reviewers: t.p.northover, jmolloy

Subscribers: gberry, jmolloy, mgrang, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14803

llvm-svn: 254440
2015-12-01 19:17:49 +00:00
Matt Arsenault e830f5427b AMDGPU: Report extractelement as free in cost model
The cost for scalarized operations is computed as N * (scalar operation
cost + 1 extractelement + 1 insertelement). This partially fixes
inflating the cost of scalarized operations since every operation is
scalarized and free. I don't think we want any cost asociated with
scalarization, but for now insertelement is still counted. I'm not sure
if we should pretend that insertelement is also free, or add a way
to compute a custom scalarization cost.

llvm-svn: 254438
2015-12-01 19:08:39 +00:00
Tom Stellard 38b7cbe3e0 AMDGPU/SI: Remove REGISTER_STORE/REGISTER_LOAD code which is now dead
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15050

llvm-svn: 254427
2015-12-01 17:45:22 +00:00
Tom Stellard ff63c25753 AMDGPU: Use the default strings for data emission directives
Summary:
This makes the assembly output look nicer and there is no reason to
have custom strings for these.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D14671

llvm-svn: 254426
2015-12-01 17:45:17 +00:00
Sanjay Patel 60216f6943 [x86] add a convenience method to check for FMA capability; NFCI
llvm-svn: 254425
2015-12-01 17:27:55 +00:00
Elena Demikhovsky 0d0692d854 AVX-512: fixed asm string of vsqrtss
(vvsqrtss was generated before)

llvm-svn: 254411
2015-12-01 12:43:46 +00:00
Hrvoje Varga e51b0e13f3 [mips][microMIPS] Implement RECIP.fmt, RINT.fmt, ROUND.L.fmt, ROUND.W.fmt, SEL.fmt, SELEQZ.fmt, SELNEQZ.fmt and CLASS.fmt
Differential Revision: http://reviews.llvm.org/D13885

llvm-svn: 254405
2015-12-01 11:59:21 +00:00
Yury Gribov d7dbb66eb8 Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics.
The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset
from native stack pointer to the address of the most recent dynamic alloca on
the caller's stack. These intrinsics are intendend for use in combination with
@llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic
alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning
routines.

Patch by Max Ostapenko.

Differential Revision: http://reviews.llvm.org/D14983

llvm-svn: 254404
2015-12-01 11:40:55 +00:00
Oliver Stannard a34e47066e [AArch64] Add ARMv8.2-A Statistical Profiling Extension
The Statistical Profiling Extension is an optional extension to
ARMv8.2-A. Since it is an optional extension, I have added the
FeatureSPE subtarget feature to control it. The assembler-visible parts
of this extension are the new "psb csync" instruction, which is
equivalent to "hint #17", and a number of system registers.

Differential Revision: http://reviews.llvm.org/D15021

llvm-svn: 254401
2015-12-01 10:48:51 +00:00
Oliver Stannard 4667071574 [ARM] Add ARMv8.2-A to TargetParser
Add ARMv8.2-A to TargetParser, so that it can be used by the clang
command-line options and the .arch directive.

Most testing of this will be done in clang, checking that the
command-line options that this enables work.

Differential Revision: http://reviews.llvm.org/D15037

llvm-svn: 254400
2015-12-01 10:33:56 +00:00
Oliver Stannard 8addbf4350 [ARM] Add subtarget features for ARMv8.2-A
This adds subtarget features for ARMv8.2-A, which builds on (and
requires the features from) ARMv8.1-A. Most assembler-visible features
of ARMv8.2-A are system instructions, and are all required parts of the
architecture, so just depend on the HasV8_2aOps subtarget feature.
There is also one large, optional feature, which adds 16-bit floating
point versions of all existing floating-point instructions (VFP and
SIMD), this is represented by the FeatureFullFP16 subtarget feature.

Differential Revision: http://reviews.llvm.org/D15036

llvm-svn: 254399
2015-12-01 10:23:06 +00:00
Craig Topper c458c7c6c9 [X86] Fix patterns for memory forms of FP FSUBR and FDIVR. They need to have memory on the left hand side of the fsub/fdiv operations in their patterns.
Not sure how to test this. I noticed by inspection in the isel tables where the same pattern tried to produce DIV and DIVR or SUB and SUBR.

llvm-svn: 254388
2015-12-01 06:13:16 +00:00
Craig Topper 271f9ded44 [X86] Use range-based for loops. NFC
llvm-svn: 254387
2015-12-01 06:13:15 +00:00
Craig Topper ba894c3c0d [X86] Use array_lengthof instead of calculating manually. Also change index types to size_t to match.
llvm-svn: 254386
2015-12-01 06:13:13 +00:00
Craig Topper ddc76f2bed [Hexagon] Use std::begin() and std::end() instead of doing the same manually. NFC
llvm-svn: 254385
2015-12-01 06:13:10 +00:00
Craig Topper d824f5f0d9 [Hexagon] Use array_lengthof and const correct and type correct the array and array size. NFC
llvm-svn: 254384
2015-12-01 06:13:08 +00:00
Craig Topper 6261e1b94d Use array_lengthof instead of manually calculating it. NFC
llvm-svn: 254383
2015-12-01 06:13:06 +00:00
Craig Topper 3da000c07f [Hexagon] Use ArrayRef to avoid needing to calculate an array size. Interestingly the original code may have had a bug because it was passing the byte size of a uint16_t array instead of the number of entries.
llvm-svn: 254382
2015-12-01 06:13:04 +00:00
Craig Topper 8072081b63 [ARM] Use range-based for loops to avoid the need for calculating an array size that I would have otherwise cconverted to array_lengthof. NFC
llvm-svn: 254381
2015-12-01 06:13:01 +00:00
Cong Hou d97c100dc4 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
(This is the second attempt to submit this patch. The first caused two assertion
 failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687)

The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973

llvm-svn: 254377
2015-12-01 05:29:22 +00:00
Hans Wennborg 1dbaf67537 Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces."
and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction."

Asserts were firing in Chromium builds. See PR25687.

llvm-svn: 254366
2015-12-01 03:49:42 +00:00
Matt Arsenault 456fdfcdc2 Squelch unused variable warning in SIRegisterInfo.cpp.
Patch by Justin Lebar

llvm-svn: 254362
2015-12-01 02:14:33 +00:00
Cong Hou fa1917c673 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973

llvm-svn: 254348
2015-12-01 00:02:51 +00:00
Simon Pilgrim db26b3ddfa [X86][FMA4] Prefer FMA4 to FMA
We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUS bdver2/bdver3/bdver4).

This patch flips this so FMA4 is preferred; this is for several reasons:

1 - FMA4 is non-destructive reducing the need for mov instructions.
2 - Its more straighforward to commute and fold inputs (although the recent work on FMA has reduced this difference).
3 - All supported targets have FMA4 performance equal or better to FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions.

Its looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs.

Differential Revision: http://reviews.llvm.org/D14997

llvm-svn: 254339
2015-11-30 22:22:06 +00:00
Matt Arsenault ada6cf1b22 AMDGPU: Fix unused function
llvm-svn: 254333
2015-11-30 21:32:10 +00:00
Matt Arsenault 41003af292 AMDGPU: Error if too many user SGPRs used
llvm-svn: 254332
2015-11-30 21:16:07 +00:00
Matt Arsenault 26f8f3db39 AMDGPU: Rework how private buffer passed for HSA
If we know we have stack objects, we reserve the registers
that the private buffer resource and wave offset are passed
and use them directly.

If not, reserve the last 5 SGPRs just in case we need to spill.
After register allocation, try to pick the next available registers
instead of the last SGPRs, and then insert copies from the inputs
to the reserved registers in the progloue.

This also only selectively enables all of the input registers
which are really required instead of always enabling them.

llvm-svn: 254331
2015-11-30 21:16:03 +00:00
Matt Arsenault ac234b604d AMDGPU: Rename enums to be consistent with HSA code object terminology
llvm-svn: 254330
2015-11-30 21:15:57 +00:00
Matt Arsenault 0e3d38937e AMDGPU: Remove SIPrepareScratchRegs
It does not work because of emergency stack slots.
This pass was supposed to eliminate dummy registers for the
spill instructions, but the register scavenger can introduce
more during PrologEpilogInserter, so some would end up
left behind if they were needed.

The potential for spilling the scratch resource descriptor
and offset register makes doing something like this
overly complicated. Reserve registers to use for the resource
descriptor and use them directly in eliminateFrameIndex.

Also removes creating another scratch resource descriptor
when directly selecting scratch MUBUF instructions.

The choice of which registers are reserved is temporary.
For now it attempts to pick the next available registers
after the user and system SGPRs.

llvm-svn: 254329
2015-11-30 21:15:53 +00:00
Matt Arsenault ff6da2fe89 AMDGPU: Use assert zext for workgroup sizes
llvm-svn: 254328
2015-11-30 21:15:45 +00:00
Quentin Colombet cdad10f333 [ARM] For old thumb ISA like v4t, we cannot use PC directly in pop.
Fix the epilogue emission to account for that.

llvm-svn: 254325
2015-11-30 20:37:58 +00:00
David Majnemer bf4119faf6 [X86] Add RIP to GR64_TCW64
The MachineVerifier wants to check that the register operands of an
instruction belong to the instruction's register class.  RIP-relative
control flow instructions violated this by referencing RIP.  While this
was fixed for SysV, it was never fixed for Win64.

llvm-svn: 254315
2015-11-30 19:04:19 +00:00
Kit Barton f4ce2f3a9e Enable shrink wrapping for PPC64
Re-enable shrink wrapping for PPC64 Little Endian.

One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-thru blocks (blocks with no terminator) correctly.

Tested with all LLVM test, clang tests, and the self-hosting build, with no problems found.

PHabricator: http://reviews.llvm.org/D14778
llvm-svn: 254314
2015-11-30 18:59:41 +00:00
Dan Gohman 96029f7880 [WebAssembly] Fix a few minor compiler warnings. NFC.
llvm-svn: 254311
2015-11-30 18:42:08 +00:00
Sanjay Patel 239be1fb0d fix formatting; NFC
llvm-svn: 254310
2015-11-30 17:52:02 +00:00
Colin LeMahieu e6241798c9 [Hexagon] NFC Reordering headers.
llvm-svn: 254307
2015-11-30 17:32:34 +00:00
Matt Arsenault ea03cf2fa1 AMDGPU: Don't reserve SCRATCH_PTR input register
This hasn't been doing anything since using relocations was added.

llvm-svn: 254304
2015-11-30 15:46:47 +00:00
Aaron Ballman 33c95f08b0 Silencing a 32-bit to 64-bit implicit conversion warning; NFC.
llvm-svn: 254302
2015-11-30 14:52:33 +00:00
Hrvoje Varga c03957f049 [mips][microMIPS] Implement LBUX, LHX, LWX, MAQ_S[A].W.PHL, MAQ_S[A].W.PHR, MFHI, MFLO, MTHI and MTLO instructions
Differential Revision: http://reviews.llvm.org/D14436

llvm-svn: 254297
2015-11-30 12:58:39 +00:00
Zoran Jovanovic a887b36167 [mips][microMIPS] Fix issue with offset operand of BALC and BC instructions
Value of offset operand for microMIPS BALC and BC instructions is currently shifted 2 bits, but it should be 1 bit.
Differential Revision: http://reviews.llvm.org/D14770

llvm-svn: 254296
2015-11-30 12:56:18 +00:00
Zlatko Buljan 56f3b0e410 [mips][microMIPS] Implement PRECR.QB.PH, PRECR_SRA[_R].PH.W, PRECRQ.PH.W, PRECRQ.QB.PH, PRECRQU_S.QB.PH and PRECRQ_RS.PH.W instructions
Differential Revision: http://reviews.llvm.org/D14605

llvm-svn: 254291
2015-11-30 08:37:38 +00:00
Craig Topper 27e2912fa8 Revert r254279 "[X86] Use ArrayRef. NFC". It seems to have upset an MSVC build bot.
llvm-svn: 254280
2015-11-30 02:28:19 +00:00
Craig Topper b84f39865f [X86] Use ArrayRef. NFC
llvm-svn: 254279
2015-11-30 02:08:05 +00:00
Craig Topper aad5f11e5f [AVX512] The vpermi2 instructions require an integer vector for the index vector. This is reflected correctly in the intrinsics, but was not refelected in the isel patterns.
For the floating point types, this requires adding a bitcast to the index vector when its passed through to the output.

llvm-svn: 254277
2015-11-30 00:13:24 +00:00
Craig Topper fbde7aa13a [X86] Remove duplicate entries from intrinsics tables and add asserts to verify there are no others.
llvm-svn: 254274
2015-11-29 23:18:32 +00:00
Dan Gohman 9551a44d0c [WebAssembly] Delete an obsolete TODO comment.
llvm-svn: 254272
2015-11-29 23:09:41 +00:00
Dan Gohman 174b2d83ee [WebAssembly] Set several MCInstrDesc flags.
llvm-svn: 254271
2015-11-29 22:59:19 +00:00
Craig Topper ecae476e4c [X86] int_x86_avx2_permps and X86ISD::VPERMV should take an integer vector for its shuffle indices.
llvm-svn: 254269
2015-11-29 22:53:22 +00:00
Dan Gohman 5237b3991d [WebAssembly] Delete unused functions. NFC.
llvm-svn: 254268
2015-11-29 22:48:57 +00:00
Dan Gohman 7a6b9825ce [WebAssembly] Minor clang-format and selected clang-tidy cleanups. NFC.
llvm-svn: 254267
2015-11-29 22:32:02 +00:00
Simon Pilgrim 88aa627c0b [X86][SSE] Added support for lowering to ADDSUBPS/ADDSUBPD with commuted inputs
We could already recognise shuffle(FSUB, FADD) -> ADDSUB, this allow us to recognise shuffle(FADD, FSUB) -> ADDSUB by commuting the shuffle mask prior to matching.

llvm-svn: 254259
2015-11-29 16:41:04 +00:00
Igor Breger e293e83f5d AVX512:Implemented encoding for the vmovq.s instruction.
Differential Revision: http://reviews.llvm.org/D14810

llvm-svn: 254248
2015-11-29 07:41:26 +00:00
Renato Golin 5dbc8a5283 Revert "[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM."
This reverts commit r254201 and r254202, as it broke test-suite,
self-hosting and sanitizer tests on ARM buildbots.

llvm-svn: 254234
2015-11-28 17:23:46 +00:00
Jonas Paulsson f12b925bb1 [Stack realignment] Handling of aligned allocas.
This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.

It would be good to group aligned allocas into a single big alloca as
an optimization, but this is yet todo.

SystemZ benefits from this, due to its stack frame layout.

New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.

Review and help from Ulrich Weigand and Hal Finkel.

llvm-svn: 254227
2015-11-28 11:02:32 +00:00
Artyom Skrobov f01a59f9fb Follow-up fix for r254201
llvm-svn: 254202
2015-11-27 16:20:34 +00:00
Artyom Skrobov b955b90509 [ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM.
Summary:
Since this build attribute corresponds to a whole module, and
different functions in a module may differ in the optimizations
enabled for them, this attribute is emitted after all functions,
and only in the case that the optimization goals for all
functions match.

Reviewers: logan, hans

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14934

llvm-svn: 254201
2015-11-27 15:30:51 +00:00
Oliver Stannard b25914e03f [AArch64] Add ARMv8.2-A FP16 scalar instructions
ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

Most of these instructions are the same as the 32- and 64-bit versions,
but with the type field (bits 23-22) set to 0b11. Previously the top bit
of the size field was always 0, so the instruction classes only provided
a 1-bit size field, which I have widened to 2 bits.

Differential Revision: http://reviews.llvm.org/D15014

llvm-svn: 254198
2015-11-27 13:04:48 +00:00
Craig Topper e38c57a4b8 [X86] Pair a NoVLX with HasAVX512 to match the others and remove a unique predicate check in the isel tables. NFC
llvm-svn: 254191
2015-11-27 05:44:02 +00:00
Craig Topper a47576f297 [X86] Now that X86VPermt2 is used in all the avx512_perm_t_sizes just hardcode it into the patterns instead of passing as an argument. NFC
llvm-svn: 254177
2015-11-26 20:21:29 +00:00
Craig Topper 05858f52fe [X86] Merge X86VPermt2Fp and X86VPermt2Int back together by weakening them just enough. The SDTCisSameSizeAs introduced in r254138 helps here.
llvm-svn: 254176
2015-11-26 20:02:01 +00:00
Craig Topper 0009656335 [X86] Split ISD node for Vfpclass and Vfpclasss so that we can write strong type constraints for each that don't cause ambiguous isel.
llvm-svn: 254172
2015-11-26 19:41:34 +00:00
Craig Topper ff2f14731a [X86] Revert part of r254167 to recover bots.
llvm-svn: 254169
2015-11-26 19:13:05 +00:00
Krzysztof Parzyszek 08ff8883fd [Hexagon] Lowering of V60/HVX vector types
llvm-svn: 254168
2015-11-26 18:38:27 +00:00
Craig Topper 9d1deb4b72 [X86] Strengthen more type constraints to reduce isel table size.
llvm-svn: 254167
2015-11-26 18:31:19 +00:00
Krzysztof Parzyszek 4eb6d4d1f2 [Hexagon] Hexagon V60 HVX intrinsic defintions
Author: Ron Lieberman <ronl@codeaurora.org>
llvm-svn: 254165
2015-11-26 16:54:33 +00:00
Daniel Sanders daa4b6fbd9 [mips][ias] Range check uimm5 operands and fix several bugs this revealed.
Summary:
The bugs were:
* append, prepend, and balign were not tested
* balign takes a uimm2 not a uimm5.
* drotr32 was correctly implemented with a uimm5 but the tests expected
  '52' to be valid.
* li/la were implemented with a uimm5 instead of simm32. simm32 isn't
  completely correct either but I'll fix that when I get to simm32.

A notable omission are some of the shift instructions. Several of these
have been implemented using a single uimm6 instruction (rather than two
uimm5 instructions and a CodeGen-only uimm6 pseudo). These will be updated
in the uimm6 patch.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14712

llvm-svn: 254164
2015-11-26 16:35:41 +00:00
Oliver Stannard 64c167db7a [AArch64] Add ARMv8.2-A new AT instruction variants
ARMv8.2-A adds new variants of the "at" (address translate) system
instruction, which take the PSTATE.PAN bit (added in ARMv8.1-A). These
are a required part of ARMv8.2-A, so no additional subtarget features
are required.

Differential Revision: http://reviews.llvm.org/D15018

llvm-svn: 254159
2015-11-26 15:34:44 +00:00
Martell Malone d12292480a ARM: address WOA unsigned division overflow crash
Building on r253865 the crash is not limited to signed overflows.

Disable custom handling of unsigned 32-bit and 64-bit integer divide.
Add test cases for both 32-bit and 64-bit unsigned integer overflow.

llvm-svn: 254158
2015-11-26 15:34:03 +00:00
Oliver Stannard 911ea20f07 [AArch64] Add ARMv8.2-A UAO PSTATE bit
ARMv8.2-A adds a new PSTATE bit, PSTATE.UAO, which allows the LDTR/STTR
instructions to behave the same as LDR/STR with respect to execute-only
pages at higher privilege levels. New variants of the MSR/MRS
instructions are added to allow reading and writing this bit. It is a
required part of ARMv8.2-A, so no additional subtarget features are
required.

Differential Revision: http://reviews.llvm.org/D15020

llvm-svn: 254157
2015-11-26 15:32:30 +00:00
Oliver Stannard 1a81cc9f43 [AArch64] Add ARMv8.2-A persistent memory instruction
ARMv8.2-A adds the "dc cvap" instruction, which is a system instruction
that cleans caches to the point of persistence (for systems that have
persistent memory). It is a required part of ARMv8.2-A, so no additional
subtarget features are required.

Differential Revision: http://reviews.llvm.org/D15016

llvm-svn: 254156
2015-11-26 15:28:47 +00:00
Oliver Stannard 48b43741d0 [AArch64] Add ARMv8.2-A ID_A64MMFR2_EL1 register
ARMv8.2-A adds a new ID register, ID_A64MMFR2_EL1, which behaves in the
same way as ID_A64MMFR0_EL1 and ID_A64MMFR1_EL1. It is a required part
of ARMv8.2-A, so no additional subtarget features are required.

Differential Revision: http://reviews.llvm.org/D15017

llvm-svn: 254155
2015-11-26 15:26:10 +00:00
Oliver Stannard 7cc0c4e675 [AArch64] Add subtarget features for ARMv8.2-A
This adds subtarget features for ARMv8.2-A, which builds on (and
requires the features from) ARMv8.1-A. Most assembler-visible features
of ARMv8.2-A are system instructions, and are all required parts of the
architecture, so just depend on the HasV8_2aOps subtarget feature. There
is also one large, optional feature, which adds 16-bit floating point
versions of all existing floating-point instructions (VFP and SIMD),
this is represented by the FeatureFullFP16 subtarget feature.

Differential Revision: http://reviews.llvm.org/D15013

llvm-svn: 254154
2015-11-26 15:23:32 +00:00
Craig Topper a3ac738725 [X86] Strengthen more type constraints to reduce isel table size.
llvm-svn: 254142
2015-11-26 07:58:20 +00:00
Vyacheslav Klochkov ed865dfcc5 X86-FMA3: Improved/enabled the memory folding optimization for scalar loads
generated for _mm_losd_s{s,d}() intrinsics and used in scalar FMAs generated 
for FMA intrinsics _mm_f{madd,msub,nmadd,nmsub}_s{s,d}().

Reviewer: David Kreitzer
Differential Revision: http://reviews.llvm.org/D14762

llvm-svn: 254140
2015-11-26 07:45:30 +00:00
Craig Topper 4c175cdc8e [X86] Strengthen the type constraints on X86psadbw and X86dbpsadbw to reduce some of the type checks in the isel matching tables.
llvm-svn: 254139
2015-11-26 07:02:21 +00:00
Krzysztof Parzyszek 195dc8d0db [Hexagon] HVX vector register classes and more isel patterns
llvm-svn: 254132
2015-11-26 04:33:11 +00:00
Tom Stellard 48f29f21ee AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsic
Summary:
This returns a pointer to the dispatch packet, which can be used to load
information about the kernel dispach.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D14898

llvm-svn: 254116
2015-11-26 00:43:29 +00:00
Dan Gohman a774d719a0 [WebAssembly] Fix inline asm support for i64 operands.
llvm-svn: 254106
2015-11-25 22:28:50 +00:00
Dan Gohman d9b4218831 [WebAssembly] Fold setne and seteq comparisons into selects.
llvm-svn: 254104
2015-11-25 22:13:48 +00:00
Krzysztof Parzyszek 70a134d29f [Hexagon] Treat transfers of FP immediates are pseudo instructions
This is a temporary fix to address ICE on 2005-10-21-longlonggtu.ll.
The proper fix will be to use A2_tfrsi, but it will need more work to
teach all users of A2_tfrsi to also expect a floating-point operand.

llvm-svn: 254099
2015-11-25 21:40:03 +00:00
Dan Gohman 5941bde03c [WebAssembly] Add some comments. NFC.
llvm-svn: 254096
2015-11-25 21:32:06 +00:00
Marek Olsak 7ed6b2f414 AMDGPU/SI: select S_ABS_I32 when possible (v2)
v2: added more tests, moved the SALU->VALU conversion to a separate function

It looks like it's not possible to get subregisters in the S_ABS lowering
code, and I don't feel like guessing without testing what the correct code
would look like.

llvm-svn: 254095
2015-11-25 21:22:45 +00:00
Dan Gohman 80e34e0a18 [WebAssembly] Fix WebAssembly register numbering for registers added late.
If virtual registers are created late, mappings to WebAssembly
registers need to be added explicitly. This patch adds a function
to do so and teaches WebAssemblyPeephole to use it. This fixes
an out-of-bounds access on the WARegs vector.

llvm-svn: 254094
2015-11-25 21:13:02 +00:00
Matt Arsenault 49affb8462 AMDGPU: Check feature attributes in SIMachineFunctionInfo
llvm-svn: 254091
2015-11-25 20:55:12 +00:00
Krzysztof Parzyszek 207c13f254 Add hexagonv55 and hexagonv60 as recognized CPUs, make v60 the default
llvm-svn: 254089
2015-11-25 20:30:59 +00:00
Matt Arsenault 61001bbc03 AMDGPU: Make v2i64/v2f64 legal types.
They can be loaded and stored, so count them as legal. This is
mostly to fix a number of common cases for load/store merging.

llvm-svn: 254086
2015-11-25 19:58:34 +00:00
Artyom Skrobov 314ee04268 Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.

Reviewers: MatzeB, andreadb, tstellarAMD

Subscribers: arsenm, jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D14945

llvm-svn: 254085
2015-11-25 19:41:11 +00:00
Dan Gohman fb3e0594e4 [WebAssembly] Use a physical register to describe ARGUMENT liveness.
Instead of trying to move ARGUMENT instructions back up to the top after
they've been scheduled or sunk down, use a fake physical register to
create a liveness constraint that prevents ARGUMENT instructions from
moving down in the first place. This is still not entirely ideal, however
it is more robust than letting them move and moving them back.

llvm-svn: 254084
2015-11-25 19:36:19 +00:00
Dan Gohman 9c54d3b4c6 [WebAssembly] Clean up several FIXME comments.
llvm-svn: 254079
2015-11-25 18:13:18 +00:00
Dan Gohman 81719f8555 [WebAssembly] Support for register stackifying with load and store instructions.
llvm-svn: 254076
2015-11-25 16:55:01 +00:00
Dan Gohman 2c8fe6a428 [WebAssembly] Codegen support for ISD::ExternalSymbol
llvm-svn: 254075
2015-11-25 16:44:29 +00:00
Dan Gohman fd4a88c376 [WebAssembly] Add 'final' to some classes. NFC.
llvm-svn: 254073
2015-11-25 16:29:24 +00:00
Dan Gohman 04c0401f28 [WebAssembly] Whitespace consistency. NFC.
llvm-svn: 254071
2015-11-25 16:26:14 +00:00
Sanjay Patel 25150784ae fix typo; NFC
llvm-svn: 254069
2015-11-25 15:33:36 +00:00
Hal Finkel 005f840959 [PowerPC] Don't generate mfocrf on the e500mc
The e500mc does not actually support the mfocrf instruction; update the
processor definitions to reflect that fact.

Patch by Tom Rix (with some test-case cleanup by me).

llvm-svn: 254064
2015-11-25 10:14:31 +00:00
Elena Demikhovsky f07df9fcac AVX-512: Fixed a bug in VPERMT2* intrinsic.
It was wrong order of operands (from intrinsic to DAG node).
I added more strict type specification for instruction selection.

Differential Revision: http://reviews.llvm.org/D14942

llvm-svn: 254059
2015-11-25 08:17:56 +00:00
Hans Wennborg e412b71f95 Revert r253528: "[X86] Enable shrink-wrapping by default."
This caused PR25607 and also caused Chromium to crash on start-up.

(Also had to update test/CodeGen/X86/avx-splat.ll, which was committed
after shrink wrapping was enabled.)

llvm-svn: 254044
2015-11-25 00:05:13 +00:00
Kaelyn Takata d0955312d9 Fix an asan error where NumElements > 32 for at least one case in
test/CodeGen/X86/avg.ll.

llvm-svn: 254043
2015-11-25 00:03:29 +00:00
Simon Pilgrim 1b4fecb098 [X86][FMA] Optimize FNEG(FMA) Patterns
X86 needs to use its own FMA opcodes, preventing the standard FNEG(FMA) pattern table recognition method used by other platforms. This patch adds support for lowering FNEG(FMA(X,Y,Z)) into a single suitably negated FMA instruction.

Fix for PR24364

Differential Revision: http://reviews.llvm.org/D14906

llvm-svn: 254016
2015-11-24 20:31:46 +00:00
Cong Hou db6220f84d [X86] Fix several issues related to X86's psadbw instruction.
This patch fixes the following issues:

1. Fix the return type of X86psadbw: it should not be the same type of inputs.
   For vNi8 inputs the output should be vMi64, where M = N/8.
2. Fix the return type of int_x86_avx512_psad_bw_512 accordingly.
3. Fix the definiton of PSADBW, VPSADBW, and VPSADBWY accordingly.
4. Adjust the return type when building a DAG node of X86ISD::PSADBW type.
5. Update related tests.


Differential revision: http://reviews.llvm.org/D14897

llvm-svn: 254010
2015-11-24 19:51:26 +00:00
Sanjay Patel a0d354541d [x86] remove duplicate movq instruction defs (PR25554)
We had duplicated definitions for the same hardware '[v]movq' instructions. For example with SSE:

  def MOVZQI2PQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src),
                     "mov{d|q}\t{$src, $dst|$dst, $src}", // X86-64 only
                     [(set VR128:$dst, (v2i64 (X86vzmovl (v2i64 (scalar_to_vector GR64:$src)))))],
                     IIC_SSE_MOVDQ>;

  def MOV64toPQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src),
                     "mov{d|q}\t{$src, $dst|$dst, $src}",
                     [(set VR128:$dst, (v2i64 (scalar_to_vector GR64:$src)))],
                     IIC_SSE_MOVDQ>, Sched<[WriteMove]>;

As shown in the test case and PR25554:
https://llvm.org/bugs/show_bug.cgi?id=25554

This causes us to miss reusing an operand because later passes don't know these 'movq' are the same instruction.
This patch deletes one pair of these defs.
Sadly, this won't fix the original test case in the bug report. Something else is still broken.

Differential Revision: http://reviews.llvm.org/D14941

llvm-svn: 253988
2015-11-24 15:44:35 +00:00
Krzysztof Parzyszek aa93575b7e [Hexagon] Add missing include of <cctype>
Lack thereof breaks Windows builds due to the use of std::isspace
in HexagonInstrInfo.cpp.

llvm-svn: 253987
2015-11-24 15:11:13 +00:00
Krzysztof Parzyszek b9a1c3a32c [Hexagon] Bring HexagonInstrInfo up to date
llvm-svn: 253986
2015-11-24 14:55:26 +00:00
Matt Arsenault ff05da806c AMDGPU: Split LDS vector loads
If properly aligned this could allow using ds_read_b64.

llvm-svn: 253975
2015-11-24 12:18:54 +00:00
Matt Arsenault 4d801cd357 AMDGPU: Split x8 and x16 vector loads instead of scalarize
The one regression in the builtin tests is in the read2 test which now
(again) has many extra copies, but this should be solved once the pass
is replaced with a DAG combine.

llvm-svn: 253974
2015-11-24 12:05:03 +00:00
Cong Hou 1938f2eb98 Let SelectionDAG start to use probability-based interface to add successors.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes.
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights.
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This the second patch above. In this patch SelectionDAG starts to use
probability-based interfaces in MBB to add successors but other MC passes are
still using weight-based interfaces. Therefore, we need to maintain correct
weight list in MBB even when probability-based interfaces are used. This is
done by updating weight list in probability-based interfaces by treating the
numerator of probabilities as weights. This change affects many test cases
that check successor weight values. I will update those test cases once this
patch looks good to you.


Differential revision: http://reviews.llvm.org/D14361

llvm-svn: 253965
2015-11-24 08:51:23 +00:00
Cong Hou bed60d35ed [X86][SSE] Detect AVG pattern during instruction combine for SSE2/AVX2/AVX512BW.
This patch detects the AVG pattern in vectorized code, which is simply
c = (a + b + 1) / 2, where a, b, and c have the same type which are vectors of
either unsigned i8 or unsigned i16. In the IR, i8/i16 will be promoted to
i32 before any arithmetic operations. The following IR shows such an example:

%1 = zext <N x i8> %a to <N x i32>
%2 = zext <N x i8> %b to <N x i32>
%3 = add nuw nsw <N x i32> %1, <i32 1 x N>
%4 = add nuw nsw <N x i32> %3, %2
%5 = lshr <N x i32> %N, <i32 1 x N>
%6 = trunc <N x i32> %5 to <N x i8>

and with this patch it will be converted to a X86ISD::AVG instruction.

The pattern recognition is done when combining instructions just before type
legalization during instruction selection. We do it here because after type
legalization, it is much more difficult to do pattern recognition based
on many instructions that are doing type conversions. Therefore, for
target-specific instructions (like X86ISD::AVG), we need to take care of type
legalization by ourselves. However, as X86ISD::AVG behaves similarly to
ISD::ADD, I am wondering if there is a way to legalize operands and result
types of X86ISD::AVG together with ISD::ADD. It seems that the current design
doesn't support this idea.

Tests are added for SSE2, AVX2, and AVX512BW and both i8 and i16 types of
variant vector sizes.


Differential revision: http://reviews.llvm.org/D14761

llvm-svn: 253952
2015-11-24 05:44:19 +00:00
Dan Gohman 192dddc595 [WebAssembly] Don't print the types of memory_size and grow_memory
This matches the current spec, for now.

llvm-svn: 253931
2015-11-23 22:37:29 +00:00
Andy Ayers 9f7501896e findDeadCallerSavedReg needs to pay attention to calling convention
Caller saved regs differ between SysV and Win64. Use the tail call available set to scavenge from.

Refactor register info to create new helper to get at tail call GPRs. Added a new test case for windows. Fixed up a number of X64 tests since now RCX is preferred over RDX on SysV.

Differential Revision: http://reviews.llvm.org/D14878

llvm-svn: 253927
2015-11-23 22:17:44 +00:00
Dan Gohman 2f16f25391 [WebAssembly] Don't special-case call operand order.
With the '=' suffix now indicating which operands are output operands, it's
no longer as important to distinguish between a call's inputs and its outputs
using operand ordering, so we can go back to printing them in the normal order.

llvm-svn: 253925
2015-11-23 22:04:06 +00:00
Dan Gohman 700515fa92 [WebAssembly] Suffix output operands with '='.
This distinguishes input operands from output operands. This is something of
a syntactic experiment to see whether the mild amount of clutter this adds is
outweighed by the extra information it conveys to the reader.

llvm-svn: 253922
2015-11-23 21:55:57 +00:00
Dan Gohman 7054ac1b8b [WebAssembly] Model the return value of store instructions in wasm.
llvm-svn: 253916
2015-11-23 21:16:35 +00:00
Dan Gohman aa0a4bd05b [WebAssembly] Don't use set_local instructions explicitly.
The current approach to using get_local and set_local is to use them
implicitly, as register uses and defs. Introduce new copy instructions
which are themselves no-ops except for the get_local and set_local
that they imply, so that we use get_local and set_local consistently.

llvm-svn: 253905
2015-11-23 19:30:43 +00:00
Dan Gohman f6857223c9 [WebAssembly] Always print loop end labels
WebAssembly is currently using labels to end scopes, so for example a
loop scope looks like this:

BB0_0:
  loop BB0_1
  ...
BB0_1:

with BB0_0 being the label of the first block not in the loop. This
requires that the label be printed even when it's only reachable via
fallthrough. To arrange this, insert a no-op LOOP_END instruction in
such cases at the end of the loop.

llvm-svn: 253901
2015-11-23 19:12:37 +00:00
Dan Gohman e425c32224 [WebAssembly] Remove incomplete MCCodeEmitter bits.
These are parts of a separate patch that I accidentally included in r253878.

llvm-svn: 253892
2015-11-23 18:00:04 +00:00
Dan Gohman 53828fd777 [WebAssembly] Emit .param, .result, and .local through MC.
This eliminates one of the main remaining uses of EmitRawText.

llvm-svn: 253878
2015-11-23 16:50:18 +00:00
Dan Gohman 3280793234 [WebAssembly] Use dominator information to improve BLOCK placement
Always starting blocks at the top of their containing loops works, but creates
unnecessarily deep nesting because it makes all blocks in a loop overlap.
Refine the BLOCK placement algorithm to start blocks at nearest common
dominating points instead, which significantly shrinks them and reduces
overlapping.

llvm-svn: 253876
2015-11-23 16:19:56 +00:00
Daniel Sanders 2b561336d9 [mips] .ent and .end should also set the type and size of the symbol respectively.
Reviewers: vkalintiris

Subscribers: llvm-commits, seanbruno, emaste, vkalintiris, dsanders

Differential Revision: http://reviews.llvm.org/D14221

llvm-svn: 253875
2015-11-23 16:08:03 +00:00
Krzysztof Parzyszek 29d23f9f4c [Hexagon] Update instruction formats
llvm-svn: 253867
2015-11-23 14:09:26 +00:00
Martell Malone a6b867eb0d ARM: address WoA division overflow crash
Disable custom handling of signed 32-bit and 64-bit integer divide.
Add test cases for both 32-bit and 64-bit integer overflow crashes.

llvm-svn: 253865
2015-11-23 13:11:39 +00:00
Craig Topper 2241dfd2dc [Mips] Remove an unnecessary wrapping of a predicate with std::ptr_fun. NFC
llvm-svn: 253855
2015-11-23 07:19:06 +00:00
Elena Demikhovsky 0fd11526e2 AVX-512: Optimized INSERT_SUBVECTOR for i1 vector types
ISERT_SUBVECTOR for i1 vectors may be done with shifts, when we insert into the lower part, or into the upper part, on into all-zero vector.
CONCAT_VECTORS uses ISERT_SUBVECTOR.

Differential Revision: http://reviews.llvm.org/D14815

llvm-svn: 253819
2015-11-22 13:57:38 +00:00
Sanjay Patel 8066d906f1 fix formatting; NFC
llvm-svn: 253802
2015-11-22 00:03:16 +00:00
Krzysztof Parzyszek b46557292c Hexagon V60/HVX DFA scheduler support
Extended DFA tablegen to:
  - added "-debug-only dfa-emitter" support to llvm-tblgen

  - defined CVI_PIPE* resources for the V60 vector coprocessor

  - allow specification of multiple required resources
    - supports ANDs of ORs
    - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
           (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)

  - added support for combo resources
    - allows specifying ORs of ANDs
    - e.g. [CVI_XLSHF, CVI_MPY01] means:
           (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)

  - increased DFA input size from 32-bit to 64-bit
    - allows for a maximum of 4 AND'ed terms of 16 resources

  - supported expressions now include:

    expression     => term [AND term] [AND term] [AND term]
    term           => resource [OR resource]*
    resource       => one_resource | combo_resource
    combo_resource => (one_resource [AND one_resource]*)

Author: Dan Palermo <dpalermo@codeaurora.org>

kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.

Reapply the previous patch, this time without circular dependencies.

llvm-svn: 253793
2015-11-21 20:00:45 +00:00
Krzysztof Parzyszek 4ca21fc1aa Revert r253790: it breaks all builds for some reason.
llvm-svn: 253791
2015-11-21 17:38:33 +00:00
Krzysztof Parzyszek 220a9bc018 Hexagon V60/HVX DFA scheduler support
Extended DFA tablegen to:
  - added "-debug-only dfa-emitter" support to llvm-tblgen

  - defined CVI_PIPE* resources for the V60 vector coprocessor

  - allow specification of multiple required resources
    - supports ANDs of ORs
    - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
           (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)

  - added support for combo resources
    - allows specifying ORs of ANDs
    - e.g. [CVI_XLSHF, CVI_MPY01] means:
           (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)

  - increased DFA input size from 32-bit to 64-bit
    - allows for a maximum of 4 AND'ed terms of 16 resources

  - supported expressions now include:

    expression     => term [AND term] [AND term] [AND term]
    term           => resource [OR resource]*
    resource       => one_resource | combo_resource
    combo_resource => (one_resource [AND one_resource]*)

Author: Dan Palermo <dpalermo@codeaurora.org>

kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.

llvm-svn: 253790
2015-11-21 17:23:52 +00:00
Simon Pilgrim d5a154424b [X86][AVX512] Added AVX512 VMOVLHPS/VMOVHLPS shuffle decode comments.
llvm-svn: 253777
2015-11-21 13:04:42 +00:00
Simon Pilgrim 96cbce61b2 [X86][SSE] Legal XMM Register Class ordering for SSE1
It turns out we have a number of places that just grab the first type attached to a register class for various reasons. This is fine unless for some reason that type isn't legal on the current target, such as for SSE1 which doesn't support v16i8/v8i16/v4i32/v2i64 - all of which were included before 4f32 in the class.

Given that this is such a rare situation I've just re-ordered the types and placed the float types first.

Fix for PR16133

Differential Revision: http://reviews.llvm.org/D14787

llvm-svn: 253773
2015-11-21 12:38:34 +00:00
Matthias Braun 5a1857b6eb ARMLoadStoreOptimizer: Cleanup isMemoryOp(); NFC
llvm-svn: 253757
2015-11-21 02:09:49 +00:00
Vinicius Tinti 67cf33d9ab Test commit
llvm-svn: 253737
2015-11-20 23:20:12 +00:00
Eric Christopher 25bf4a8617 Power8 and later support fusing addis/addi and addis/ld instruction
pairs that use the same register to execute as a single instruction.
No Functional Change

Patch by Kyle Butt!

llvm-svn: 253724
2015-11-20 22:38:20 +00:00
Jun Bum Lim 80ec0d3f5a [AArch64]Merge narrow zero stores to a wider store
This change merges adjacent zero stores into a wider single store.
For example :
  strh wzr, [x0]
  strh wzr, [x0, #2]
becomes
  str wzr, [x0]

This will fix PR25410.

llvm-svn: 253711
2015-11-20 21:14:07 +00:00
Eric Christopher c180836722 Weak non-function symbols were being accessed directly, which is
incorrect, as the chosen representative of the weak symbol may not live
with the code in question. Always indirect the access through the TOC
instead.

Patch by Kyle Butt!

llvm-svn: 253708
2015-11-20 20:51:31 +00:00
Krzysztof Parzyszek 6c5ca95814 [Hexagon] Fix the return value from HexagonGenInsert::runOnMachineFunction
llvm-svn: 253705
2015-11-20 20:46:23 +00:00
Artyom Skrobov 91f339ab3f Handle ARMv6-J as an alias, instead of fake architecture
Summary:
This follows D14577 to treat ARMv6-J as an alias for ARMv6,
instead of an architecture in its own right.

The functional change is that the default CPU when targeting ARMv6-J
changes from arm1136j-s to arm1136jf-s, which is currently used as
the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs.

The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't
affect code generation, attributes, optimizations, or anything else,
apart from selecting the default CPU.

Reviewers: rengolin, logan, compnerd

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14755

llvm-svn: 253675
2015-11-20 16:46:09 +00:00
Tilmann Scheller 4cd1d51a4d [Hexagon] Remove redundant assignment.
Identified by the Clang static analyzer.

llvm-svn: 253664
2015-11-20 13:27:30 +00:00
Daniel Sanders b700203c8b Partially revert r253662: some unrelated work was accidentally committed with it.
Sorry.

llvm-svn: 253663
2015-11-20 13:16:35 +00:00
Daniel Sanders be9db3c00a Revert the revert 253497 and 253539 - These commits aren't the cause of the clang-cmake-mips failures.
Sorry for the noise.

llvm-svn: 253662
2015-11-20 13:13:53 +00:00
Tilmann Scheller bfd7ce01ea [Hexagon] Remove redundant local variable.
Identified by the Clang static analyzer.

llvm-svn: 253660
2015-11-20 12:10:17 +00:00
Hrvoje Varga b65518c15c [mips][microMIPS] Implement MUL[_S].PH, MULEQ_S.W.PHL, MULEQ_S.W.PHR, MULEU_S.PH.QBL, MULEU_S.PH.QBR, MULQ_RS.PH, MULQ_RS.W, MULQ_S.PH and MULQ_S.W instructions
Differential Revision: http://reviews.llvm.org/D14280

llvm-svn: 253651
2015-11-20 07:14:52 +00:00
Dan Gohman d9625276a7 [WebAssembly] Remove the AsmPrinter code for printing physical registers.
WebAssembly does not have physical registers, so even if LLVM uses physical
registers like SP, they'll need to be lowered to virtual registers before
AsmPrinter time.

llvm-svn: 253644
2015-11-20 03:13:31 +00:00
Dan Gohman dfa81d8e22 [WebAssembly] Add a few open tasks to the target README.txt.
llvm-svn: 253643
2015-11-20 03:08:27 +00:00
Dan Gohman bb7ce8e408 [WebAssembly] Rename SWITCH to TABLESWITCH to match the current wording in the spec.
llvm-svn: 253642
2015-11-20 03:02:49 +00:00
Dan Gohman 2dfc3b8be5 [WebAssembly] Remove done items from the README.txt.
llvm-svn: 253640
2015-11-20 02:51:38 +00:00
Dan Gohman 7bafa0eaef [WebAssembly] Add asserts that the expression stack is used in stack order.
llvm-svn: 253638
2015-11-20 02:33:24 +00:00
Dan Gohman b0992dafb3 [WebAssemby] Enforce FIFO ordering for instructions using stackified registers.
llvm-svn: 253634
2015-11-20 02:19:12 +00:00
Eric Christopher eb027124af Split the argument unscheduling loop in the WebAssembly register
coloring pass. Turn the logic into "look for an insert point and
then move things past the insert point".

No functional change intended.

llvm-svn: 253626
2015-11-20 00:34:54 +00:00
Eric Christopher 8c3dbcab1d Fix a [-Werror,-Wcovered-switch-default] warning by removing the
unnecessary default case.

llvm-svn: 253621
2015-11-19 23:45:42 +00:00
Dan Gohman 3192ddfeba [WebAssembly] Implement isCheapToSpeculateCtlz and isCheapToSpeculateCttz.
This unbreaks test/CodeGen/WebAssembly/i32.ll and
test/CodeGen/WebAssembly/i64.ll after r224899.

llvm-svn: 253617
2015-11-19 23:04:59 +00:00
Simon Pilgrim a9912617c8 [X86][SSE4A] Fix issue with EXTRQI shuffles not starting at the correct start index.
Found during stress testing.

llvm-svn: 253611
2015-11-19 22:13:56 +00:00
Reid Kleckner ebee6129cd Fix UMRs in Mips disassembler on invalid instruction streams
The Insn and Size local variables were used without initialization.

llvm-svn: 253607
2015-11-19 21:51:55 +00:00
Simon Pilgrim ae0140d6ec [X86] Use existing MachineInstrBuilder::addDisp to create offseted pointer. NFC.
Minor code duplication tidyup to D13988

llvm-svn: 253606
2015-11-19 21:50:57 +00:00
Jun Bum Lim c12c2790e1 [AArch64] Refactoring aarch64-ldst-opt. NCF.
Summary :
 * Rename isSmallTypeLdMerge() to isNarrowLoad().
 * Rename NumSmallTypeMerged to NumNarrowTypePromoted.
 * Use Subtarget defined as a member variable.

llvm-svn: 253587
2015-11-19 18:41:27 +00:00
Jun Bum Lim 4c35ccac91 [AArch64]Extend merging narrow loads into a wider load
This change extends r251438 to handle more narrow load promotions
including byte type, unscaled, and signed. For example, this change will
convert :
  ldursh w1, [x0, #-2]
  ldurh  w2, [x0, #-4]
into
  ldur  w2, [x0, #-4]
  asr   w1, w2, #16
  and   w2, w2, #0xffff

llvm-svn: 253577
2015-11-19 17:21:41 +00:00
Hans Wennborg dcc2500452 X86: More efficient legalization of wide integer compares
In particular, this makes the code for 64-bit compares on 32-bit targets
much more efficient.

Example:

  define i32 @test_slt(i64 %a, i64 %b) {
  entry:
    %cmp = icmp slt i64 %a, %b
    br i1 %cmp, label %bb1, label %bb2
  bb1:
    ret i32 1
  bb2:
    ret i32 2
  }

Before this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          setae   %al
          cmpl    16(%esp), %ecx
          setge   %cl
          je      .LBB2_2
          movb    %cl, %al
  .LBB2_2:
          testb   %al, %al
          jne     .LBB2_4
          movl    $1, %eax
          retl
  .LBB2_4:
          movl    $2, %eax
          retl

After this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          sbbl    16(%esp), %ecx
          jge     .LBB1_2
          movl    $1, %eax
          retl
  .LBB1_2:
          movl    $2, %eax
          retl

Differential Revision: http://reviews.llvm.org/D14496

llvm-svn: 253572
2015-11-19 16:35:08 +00:00
Zoran Jovanovic 00f998b440 [mips] Expansion of ROL and ROR macros
Author: obucina

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D10611

llvm-svn: 253564
2015-11-19 14:15:03 +00:00
Elena Demikhovsky 7c2c9fd243 AVX-512: Fixed COPY_TO_REGCLASS for mask registers
Copying one mask register to another under BW should be done with kmovq instruction, otherwise we can loose some bits.
Copying 8 bits under DQ may be done with kmovb.

Differential Revision: http://reviews.llvm.org/D14812

llvm-svn: 253563
2015-11-19 13:13:00 +00:00
Simon Pilgrim 846b64e17a [X86][AVX] Fix lowering of X86ISD::VZEXT_MOVL for 128-bit -> 256-bit extension
The lowering patterns for X86ISD::VZEXT_MOVL for 128-bit to 256-bit vectors were just copying the lower xmm instead of actually masking off the first scalar using a blend.

Fix for PR25320.

Differential Revision: http://reviews.llvm.org/D14151

llvm-svn: 253561
2015-11-19 12:18:37 +00:00
Alexey Bataev b7b82bf33e Alternative to long nops for X86 CPUs, by Andrey Turetsky
Make X86AsmBackend generate smarter nops instead of a bunch of 0x90 for code alignment for CPUs which don't support long nop instructions.
Differential Revision: http://reviews.llvm.org/D14178

llvm-svn: 253557
2015-11-19 11:44:35 +00:00
Igor Breger 1f78296869 AVX512: Implemented encoding, intrinsics and DAG lowering for VMOVDDUP instructions.
Differential Revision: http://reviews.llvm.org/D14702

llvm-svn: 253548
2015-11-19 08:26:56 +00:00
Igor Breger 4424aaa28e AVX512: Implemented encoding for the vmovss.s and vmovsd.s instructions.
Differential Revision: http://reviews.llvm.org/D14771

llvm-svn: 253547
2015-11-19 07:58:33 +00:00
Igor Breger 81b79de54c AVX512: Implemented encoding for the follow instructions.
vmovapd.s, vmovaps.s, vmovdqa32.s, vmovdqa64.s, vmovdqu16.s, vmovdqu32.s, vmovdqu64.s, vmovdqu8.s, vmovupd.s, vmovups.s

Differential Revision: http://reviews.llvm.org/D14768

llvm-svn: 253546
2015-11-19 07:43:43 +00:00
Elena Demikhovsky 1ca72e1846 Pointers in Masked Load, Store, Gather, Scatter intrinsics
The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.

Differential Revision: http://reviews.llvm.org/D14150

llvm-svn: 253544
2015-11-19 07:17:16 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Quentin Colombet 46d5c71135 [X86] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14156

rdar://problem/21118279

llvm-svn: 253528
2015-11-19 00:38:00 +00:00
Quentin Colombet f6645cce91 [AArch64] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14360

rdar://problem/20820748

llvm-svn: 253520
2015-11-18 23:12:20 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Tim Northover 747ae9a7de ARM: make sure backend is consistent about exception handling method.
It turns out we decide whether to use SjLj exceptions or some alternative in
two separate places in the backend, and they disagreed with each other. This
led to inconsistent code and is generally a terrible idea.

So make them consistent and add an assert that they *do* match (unfortunately
MCAsmInfo isn't available in opt, so it can't be used to initialise the CodeGen
version directly).

llvm-svn: 253502
2015-11-18 21:10:39 +00:00
Matthew Simpson 343af07aa9 [Aarch64] Add cost for missing extensions.
This patch adds a cost estimate for some missing sign and zero extensions. The
costs were determined by counting the number of shift instructions generated
without context for each new extension.

Differential Revision: http://reviews.llvm.org/D14730

llvm-svn: 253482
2015-11-18 18:03:06 +00:00
Dan Gohman 94ef41ff1d [WebAssembly] Add more whitespace characters to prettify the assembly output.
llvm-svn: 253472
2015-11-18 17:05:35 +00:00
Dan Gohman 1f29c68042 [WebAssembly] Add some spaces to the assembly output to vertically align operands.
llvm-svn: 253468
2015-11-18 16:25:38 +00:00
Dan Gohman 4ba4816b97 [WebAssembly] Enable register coloring and register stackifying.
This also takes the push/pop syntax another step forward, introducing stack
slot numbers to make it easier to see how expressions are connected. For
example, the value pushed in $push7 is popped in $pop7.

And, this begins an experiment with making get_local and set_local implicit
when an operation directly uses or defines a register. This greatly reduces
clutter. If this experiment succeeds, it may make sense to do this for
const instructions as well.

And, this introduces more special code for ARGUMENTS; hopefully this code
will soon be obviated by proper support for live-in virtual registers.

llvm-svn: 253465
2015-11-18 16:12:01 +00:00
Asaf Badouh 0d957b8b09 [X86][AVX512CD] add mask broadcast intrinsics
Differential Revision: http://reviews.llvm.org/D14573

llvm-svn: 253450
2015-11-18 09:42:45 +00:00
Igor Breger 5574730454 AVX512: Implemented encoding for vpextrw.s instruction.
Differential Revision: http://reviews.llvm.org/D14766

llvm-svn: 253447
2015-11-18 08:46:16 +00:00
Hrvoje Varga 78409019d9 [mips][microMIPS] Implement DPS.W.PH, DPSQ_S.W.PH, DPSQ_SA.L.W, DPSQX_S.W.PH, DPSQX_SA.W.PH, DPSU.H.QBL, DPSU.H.QBR and DPSX.W.PH instructions
Differential Revision: http://reviews.llvm.org/D14058

llvm-svn: 253443
2015-11-18 07:41:35 +00:00
Craig Topper 66059c9f4d Replace dyn_cast with isa in places that weren't using the returned value for more than a boolean check. NFC.
llvm-svn: 253441
2015-11-18 07:07:59 +00:00
Rafael Espindola 449711cb36 Stop producing .data.rel sections.
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.

It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.

There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
  concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
  linker since it knowns where the dynamic relocations are.

llvm-svn: 253436
2015-11-18 06:02:15 +00:00
Quentin Colombet 8cb95b8e51 [ARM] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14357

rdar://problem/21942589

llvm-svn: 253411
2015-11-18 00:40:54 +00:00
Simon Pilgrim 2da4178737 [X86][AVX512] Added AVX512 SHUFP*/VPERMILP* shuffle decode comments.
llvm-svn: 253396
2015-11-17 23:29:49 +00:00
Simon Pilgrim 8483df6e24 [X86][AVX512] Added support for AVX512 UNPCK shuffle decode comments.
llvm-svn: 253391
2015-11-17 22:35:45 +00:00
Reid Kleckner c20276d0b2 [WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction
Summary:
Now that there is a one-to-one mapping from MachineFunction to
WinEHFuncInfo, we don't need to use a DenseMap to select the right
WinEHFuncInfo for the current funclet.

The main challenge here is that X86WinEHStatePass is an IR pass that
doesn't have access to the MachineFunction. I gave it its own
WinEHFuncInfo object that it uses to calculate state numbers, which it
then throws away. As long as nobody creates or removes EH pads between
this pass and SDAG construction, we will get the same state numbers.

The other thing X86WinEHStatePass does is to mark the EH registration
node. Instead of communicating which alloca was the registration through
WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic.  This
intrinsic generates no code and simply marks the alloca in use.

Reviewers: JCTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14668

llvm-svn: 253378
2015-11-17 21:10:25 +00:00
Charlie Turner 7968b981bf [ARM] Don't pessimize i32 vselect.
The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.

I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.

From my benchmarks, I saw these improvements in A57 (T32)
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%

I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change

Differential Revision: http://reviews.llvm.org/D14743

llvm-svn: 253349
2015-11-17 17:25:15 +00:00
Ahmed Bougacha 88ddeae8bd [AArch64] Promote f16 SELECT_CC CC operands when op is legal.
SELECT_CC has the nasty property of having operands with unrelated
types. So if you do something like:

  f32 = select_cc f16, f16, f32, f32, cc

You'd only look for the action for <select_cc, f32>, but never f16.
If the types are all legal, but the op isn't (as for f16 on AArch64,
or for f128 on x86_64/AArch64?), then you get into trouble.
For f128, we have softenSetCCOperands to handle this case.

Similarly, for f16, we can directly promote the CC operands.

llvm-svn: 253344
2015-11-17 16:45:40 +00:00
Bradley Smith 982a8888b8 [ARM] Default to ARMv4t in favour of adding Other to ARMArch
llvm-svn: 253335
2015-11-17 13:38:29 +00:00
Charlie Turner b4613c6973 [ARM] Match VABDL from log2 shuffles.
Differential Revision: http://reviews.llvm.org/D14664

llvm-svn: 253334
2015-11-17 13:21:35 +00:00
Zlatko Buljan 72a7f9c1f5 [mips][microMIPS] Implement EXTP, EXTPDP, EXTPDPV, EXTPV, EXTR[_RS].W, EXTR_S.H, EXTRV[_RS].W and EXTRV_S.H instructions
Differential Revision: http://reviews.llvm.org/D14174

llvm-svn: 253332
2015-11-17 12:54:15 +00:00
Bradley Smith 4320205484 [ARM] Properly initialize ARMArch in the ARM subtarget
llvm-svn: 253331
2015-11-17 11:57:33 +00:00
Zlatko Buljan 246b21f66a [mips][microMIPS] Implement SUBQ[_S].PH, SUBQ_S.W, SUBQH[_R].PH, SUBQH[_R].W, SUBU[_S].PH, SUBU[_S].QB and SUBUH[_R].QB instructions
Differential Revision: http://reviews.llvm.org/D14114

llvm-svn: 253329
2015-11-17 10:11:22 +00:00
Oliver Stannard 9be59af3ab [Assembler] Make fatal assembler errors non-fatal
Currently, if the assembler encounters an error after parsing (such as an
out-of-range fixup), it reports this as a fatal error, and so stops after the
first error. However, for most of these there is an obvious way to recover
after emitting the error, such as emitting the fixup with a value of zero. This
means that we can report on all of the errors in a file, not just the first
one. MCContext::reportError records the fact that an error was encountered, so
we won't actually emit an object file with the incorrect contents.

Differential Revision: http://reviews.llvm.org/D14717

llvm-svn: 253328
2015-11-17 10:00:43 +00:00
Zlatko Buljan 3e0588d033 [mips][microMIPS] Implement PRECEQ.W.PHL, PRECEQ.W.PHR, PRECEQU.PH.QBL, PRECEQU.PH.QBLA, PRECEQU.PH.QBR, PRECEQU.PH.QBRA, PRECEU.PH.QBL, PRECEU.PH.QBLA, PRECEU.PH.QBR and PRECEU.PH.QBRA instructions
Differential Revision: http://reviews.llvm.org/D14279

llvm-svn: 253326
2015-11-17 09:43:29 +00:00
Jay Foad b64f0a5a1a Fix typos in comments.
llvm-svn: 253324
2015-11-17 08:54:53 +00:00
Rafael Espindola 65e4902156 Drop prelink support.
The way prelink used to work was

* The compiler decides if a given section only has relocations that
are know to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.

There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. If could sort
by that if so desired.
* There are newer ways of speeding up DSO (gnu hash for example).
* Even if we want to implement this again in the compiler, the previous
  implementation is pretty broken. It talks about relocations that are
  "resolved by the static linker". If they are resolved, there are none
  left for the prelinker. What one needs to track is if an expression
  will require only dynamic relocations that point to the same DSO.

At this point it looks like the prelinker is an historical curiosity.
For example, fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)

This patch removes support for it. That is, it stops printing the
".local" sections.

llvm-svn: 253280
2015-11-17 00:51:23 +00:00
Derek Schuff 71e8169ea8 [WebAssembly] Fix printing of global operands
This was regressed in r252656 which wasn't quite NFC. Instead of using a
custom instruction as before, use a pattern to select CONST_I32 for the
global addrs.

Differential Revision: http://reviews.llvm.org/D14587

llvm-svn: 253276
2015-11-17 00:20:44 +00:00
Simon Pilgrim 13d3a20ad7 [X86][SSE] Merged BLEND shuffle decode comments. NFC.
Now that we can recognise different vector sizes.

llvm-svn: 253268
2015-11-16 23:03:18 +00:00
Simon Pilgrim b9ada27052 [X86][SSE] Merged ALIGNR/SLLDQ/SRLDQ shuffle decode comments. NFC.
Now that we can recognise different vector sizes - will make future AVX512 additions easier.

llvm-svn: 253266
2015-11-16 22:54:41 +00:00
Simon Pilgrim 5883a73f18 [X86][SSE] Merged SHUF/PERM shuffle decode comments. NFC.
Now that we can recognise different vector sizes - will make future AVX512 additions easier.

llvm-svn: 253260
2015-11-16 22:39:27 +00:00
Simon Pilgrim 66e43ee289 [X86][SSE] Merged UNPCK shuffle decode comments. NFC.
Now that we can recognise different vector sizes - will make future AVX512 additions easier.

llvm-svn: 253258
2015-11-16 22:21:10 +00:00
Derek Schuff 46e3316888 [WebAssembly] Fix function return type printing
Summary:
Previously return type information for a function was derived from
return dag nodes. But this didn't work for dags with != return node. So
instead compute it directly from the LLVM function as is done for imports.

Differential Revision: http://reviews.llvm.org/D14593

llvm-svn: 253251
2015-11-16 21:12:41 +00:00
Derek Schuff 4ed4778419 [WebAssembly] Reverse the order of operands for br_if
Summary: This is to match the new version in the spec

Reviewers: sunfish

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: http://reviews.llvm.org/D14519

llvm-svn: 253249
2015-11-16 21:04:51 +00:00
Kit Barton 9c432ae111 Find available scratch register to use in function prologue and epilogue as part of shrink wrapping.
Phabricator: http://reviews.llvm.org/D13955
llvm-svn: 253247
2015-11-16 20:22:15 +00:00
Reid Kleckner c397b26790 [WinEH] Don't let UnwindHelp alias the return address
On top of that, don't bother allocating and initializing UnwindHelp if
we don't have any funclets. Currently we always use RBP as our frame
pointer when funclets are present, so this change makes it impossible to
come here without any fixed stack objects.

Fixes PR25533.

llvm-svn: 253245
2015-11-16 18:47:25 +00:00
Reid Kleckner 4255b04e7b Use the subtarget reference that we already have
llvm-svn: 253244
2015-11-16 18:47:12 +00:00
Vasileios Kalintiris 88faf6d697 [mips] Disable code generation through FastISel for MIPS32R6.
Reviewers: dsanders

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14708

llvm-svn: 253225
2015-11-16 17:05:01 +00:00
Petr Pavlu a770379524 [ARM] Prevent use of a value pointed by end() iterator when placing a jump table
Function ARMConstantIslands::doInitialJumpTablePlacement() iterates over all
basic blocks in a machine function. It calls `MI = MBB.getLastNonDebugInstr()`
to get the last instruction in each block and then uses MI->getOpcode() to
decide what to do. If getLastNonDebugInstr() returns MBB.end() (for example,
when the block does not contain any instructions) then calling getOpcode() on
this value is incorrect. Avoid this problem by checking the result of
getLastNonDebugInstr().

Differential Revision: http://reviews.llvm.org/D14694

llvm-svn: 253222
2015-11-16 16:41:13 +00:00
Oliver Stannard 9327a7575b [ARM,AArch64] Store source location of asm constant pool entries
Storing the source location of the expression that created a constant pool
entry allows us to emit better error messages if we later discover that the
expression cannot be represented by a relocation.

Differential Revision: http://reviews.llvm.org/D14646

llvm-svn: 253220
2015-11-16 16:25:47 +00:00
Oliver Stannard 09be060606 [ARM,AArch64] Store source location for values in assembly files
The MCValue class can store a SMLoc to allow better error messages to be
emitted if an error is detected after parsing. The ARM and AArch64 assembly
parsers were not setting this, so error messages did not have source
information.

Differential Revision: http://reviews.llvm.org/D14645

llvm-svn: 253219
2015-11-16 16:22:47 +00:00
Dan Gohman 1462faad35 [WebAssembly] Prototype passes for register coloring and register stackifying.
These passes are not yet enabled by default.

llvm-svn: 253217
2015-11-16 16:18:28 +00:00
Artyom Skrobov f187a65f99 Handle ARMv6KZ naming
Summary:
* ARMv6KZ is the "canonical" name, given in the ARMARM
* ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM
* ARMv6ZK is a popular misspelling, which we should support as an alias.

The patch corrects the handling of the names.

Functional changes:
* ARMv6Z no longer treated as an architecture in its own right
* ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias
* arm1176jz-s and arm1176jzf-s recognized as ARMv6ZK, instead of ARMv6K
* default ARMv6K CPU changed to arm1176j-s

Reviewers: rengolin, logan, compnerd

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14568

llvm-svn: 253206
2015-11-16 14:05:32 +00:00
Bradley Smith 323fee105d [ARM] Introduce subtarget features per ARM architecture.
This allows for accurate architecture targeting as well as removing
duplicate information (hardcoded feature strings) from MCTargetDesc.

llvm-svn: 253196
2015-11-16 11:10:19 +00:00
James Molloy 2018091e87 Properly check if a CMPZ node is in fact comparing against zero
This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless.

Caught by Oliver Stannard using csmith; regression test added.

llvm-svn: 253195
2015-11-16 10:49:25 +00:00
Oliver Stannard db9081bf89 [AArch64] ldr= pseudo-instruction silently ignored if register invalid
The AArch64 assembler was silently ignoring instructions like this:
  ldr foo, =bar

AArch64AsmParser::parseOperand was returning true as the parse failed, but was
not calling AArch64AsmParser::Error to report this to the user, so the
instruction was ignored without printing an error message.

Differential Revision: http://reviews.llvm.org/D14651

llvm-svn: 253193
2015-11-16 10:25:19 +00:00
Igor Breger 24cab0fa06 AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions.
Differential Revision: http://reviews.llvm.org/D14322

llvm-svn: 253185
2015-11-16 07:22:00 +00:00
Dan Gohman 1031d4a8c3 [WebAssembly] Use tabs instead of spaces in assembly output.
This seems to be the most popular convention among the other backends.

llvm-svn: 253172
2015-11-15 15:34:19 +00:00
Simon Pilgrim cbba348ae7 [X86][SSE] Tidyup with implicit SDValue bool check. NFC.
llvm-svn: 253171
2015-11-15 14:57:07 +00:00
Igor Breger 3ff8ef9eb7 Revert r253160.
It broke layering violation. Reproducible with BUILD_SHARED_LIBS=ON.

llvm-svn: 253163
2015-11-15 12:19:11 +00:00
Igor Breger aa40ddd3ba AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions.
Differential Revision: http://reviews.llvm.org/D14322

llvm-svn: 253160
2015-11-15 07:23:13 +00:00
Dan Gohman 5219ecf068 [WebAssembly] Minor code simplification. NFC.
llvm-svn: 253150
2015-11-14 23:28:15 +00:00
Dan Gohman 8ad045c1d1 [WebAssembly] Support signext, zeroext, and several other function attributes.
llvm-svn: 253148
2015-11-14 23:15:41 +00:00
Akira Hatanaka b11ef0897c Reduce the size of MCRelaxableFragment.
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.

This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
 
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.

rdar://problem/21736951

Differential Revision: http://reviews.llvm.org/D14346

llvm-svn: 253127
2015-11-14 06:35:56 +00:00
Akira Hatanaka bd9fc28444 [MCTargetAsmParser] Move the member varialbes that reference
MCSubtargetInfo in the subclasses into MCTargetAsmParser and define a
member function getSTI.

This is done in preparation for making changes to shrink the size of
MCRelaxableFragment. (see http://reviews.llvm.org/D14346).

llvm-svn: 253124
2015-11-14 05:20:05 +00:00
Eric Christopher 57a6e1321f Add MMX to the 3dnow enum and propagate changes around. This makes
it somewhat more consistent with how the feature is used.

llvm-svn: 253122
2015-11-14 03:04:00 +00:00
Justin Bogner fff708db92 AArch64: Default AArch64Subtarget::ReserveX18 to true on darwin
Darwin reserves x18, so it's never ABI compliant to generate code that
uses it. Set the default value based on the OS part of the triple
rather than forcing front-ends to set the +reserve-x18 target feature
in order to build correct code for Darwin.

This will make r243310 redundant, so I'll revert that shortly.

llvm-svn: 253102
2015-11-13 23:05:46 +00:00
Colin LeMahieu 655489433c [Hexagon] Fixing memory leak during relaxation by allocating MCInst in MCContext.
llvm-svn: 253090
2015-11-13 21:45:50 +00:00
Reid Kleckner 75b4be9a11 [WinEH] Fix ESP management with 32-bit __CxxFrameHandler3
The C++ EH personality automatically restores ESP from the C++ EH
registration node after a catchret. I mistakenly thought it was like
SEH, which does not restore ESP.

It makes sense for C++ EH to differ from SEH here because SEH does not
use funclets for catches, and does not allow catching inside of finally.
C++ EH may need to unwind through multiple catch funclets and eventually
catchret to some outer funclet. Therefore, the runtime has to keep track
of which ESP to use with catchret, rather than having the compiler
reload it manually.

llvm-svn: 253084
2015-11-13 21:27:00 +00:00
Dan Gohman dd0071f440 [WebAssembly] Rename the Const instructions to be upper-case too.
llvm-svn: 253072
2015-11-13 20:27:45 +00:00
Dan Gohman f433324290 [WebAssembly] Rename memory intrinsics to be upper-case, following convention. NFC.
llvm-svn: 253070
2015-11-13 20:19:11 +00:00
Cong Hou ef4074bac2 [X86][SSE] Combine UNPCKL with vector_shuffle into UNPCKH to save one instruction for sext from v16i8 to v16i16 and v8i16 to v8i32.
This patch is enabling combining UNPCKL with vector_shuffle that moves the upper
half of a vector into the lower half, into a UNPCKH instruction. For example:

t2: v16i8 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t1, undef:v16i8
t3: v16i8 = X86ISD::UNPCKL undef:v16i8, t2

will be combined to:

t3: v16i8 = X86ISD::UNPCKH undef:v16i8, t1


Differential revision: http://reviews.llvm.org/D14399

llvm-svn: 253067
2015-11-13 19:47:43 +00:00
Reid Kleckner 94b57065c6 [WinEH] Make UnwindHelp a fixed stack object allocated after XMM CSRs
Now the offset of UnwindHelp in our EH tables and the offset that we
store to in the prologue agree.

llvm-svn: 253059
2015-11-13 19:06:01 +00:00
Colin LeMahieu f0af6e5243 [Hexagon] Factoring bundle creation in to a utility function.
llvm-svn: 253056
2015-11-13 17:42:46 +00:00
Tom Stellard afd6e2f3c3 AMDGPU: Add stony support
Patch by: Alex Deucher

llvm-svn: 253053
2015-11-13 17:06:32 +00:00
James Molloy b564098c62 [ARM] Replace ARMISD::RBIT with ISD::BITREVERSE
ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering.

llvm-svn: 253047
2015-11-13 16:05:22 +00:00
Zlatko Buljan 32fb5c40d2 [mips][microMIPS] Implement SHRA[_R].PH, SHRAV[_R].PH, SHRAV[_R].QB, SHRAV_R.W, SHRA_R.W, SHRL.PH, SHRL.QB, SHRLV.PH and SHRLV.QB instructions
Differential Revision: http://reviews.llvm.org/D14010

llvm-svn: 253041
2015-11-13 13:14:25 +00:00
Ulrich Weigand 19d24d2699 [SystemZ] Simplify boolean conditional return statements
Use clang-tidy to simplify conditonal return statements.

Author: LegalizeAdulthood
Differential Revision: http://reviews.llvm.org/D9986

llvm-svn: 253038
2015-11-13 13:00:27 +00:00
Colin LeMahieu b3c97271e3 [Hexagon] Fixing leak in padEndloop by allocating in MCContext.
llvm-svn: 253019
2015-11-13 07:58:06 +00:00
Dan Gohman f19ed56288 [WebAssembly] Inline asm support.
llvm-svn: 252997
2015-11-13 01:42:29 +00:00
Colin LeMahieu 8bb168b160 [Hexagon] Adding relaxation functionality to backend and test.
llvm-svn: 252989
2015-11-13 01:12:25 +00:00
Dan Gohman bc58a7bad0 [WebAssembly] Un-mangle the conversion instruction names.
This arranges the types in the LLVM instruction names in the same order that
they appear in the WebAssembly opcode names, and eliminates
double-underscores.

llvm-svn: 252988
2015-11-13 00:50:04 +00:00
Dan Gohman 231244c304 [WebAssembly] Rename BR_IF_ to BR_IF
With MC-based instruction printing, we no longer need instruction names to
mangle in hints about how they should be printed.

llvm-svn: 252987
2015-11-13 00:46:31 +00:00
Dan Gohman c9dd057e3c [WebAssembly] Remove unneeded TODO items. NFC.
llvm-svn: 252985
2015-11-13 00:41:25 +00:00
Dan Gohman b1daa3aec7 [WebAssembly] Tidy up and update a TODO item. NFC.
llvm-svn: 252984
2015-11-13 00:40:37 +00:00
Joseph Tremoulet 149c433bcc [WinEH] Find root frame correctly in CLR funclets
Summary:
The value that the CoreCLR personality passes to a funclet for the
establisher frame may be the root function's frame or may be the parent
funclet's (mostly empty) frame in the case of nested funclets.  Each
funclet stores a pointer to the root frame in its own (mostly empty)
frame, as does the root function itself.  All frames allocate this slot at
the same offset, measured from the post-prolog stack pointer, so that the
same sequence can accept any ancestor as an establisher frame parameter
value, and so that a single offset can be reported to the GC, which also
looks at this slot.

This change allocate the slot when processing function entry, and records
its frame index on the WinEHFuncInfo object, then inserts the code to
set/copy it during prolog emission.


Reviewers: majnemer, AndyAyers, pgavlin, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14614

llvm-svn: 252983
2015-11-13 00:39:23 +00:00
Dan Gohman 058fce5435 [WebAssembly] Introduce a new pseudo-operand for unused expression results.
llvm-svn: 252975
2015-11-13 00:21:05 +00:00
Vyacheslav Klochkov cbc56baae6 X86-FMA3: Implemented commute transformations FMA*_Int instructions.
It made it possible to apply the memory folding optimization for the 2nd
operand of FMA*_Int instructions.

Reviewer: Quentin Colombet
Differential Revision: http://reviews.llvm.org/D14550

llvm-svn: 252973
2015-11-13 00:07:35 +00:00
Tom Stellard 0967c91e0c Revert "Remove unnecessary call to getAllocatableRegClass"
This reverts commit r252565.

This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:

Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"

This reverts commit r252674.

llvm-svn: 252956
2015-11-12 21:43:25 +00:00
Vyacheslav Klochkov 1ff9cbdfc0 My first/test commit. Removed a trailing whitespace.
llvm-svn: 252940
2015-11-12 20:11:57 +00:00
Benjamin Kramer 7c576d8bcf [Hexagon] Allocate MCInst in the MCContext to avoid leaking it.
Found by leaksanitizer.

llvm-svn: 252931
2015-11-12 19:30:40 +00:00
David Blaikie b0311c590d Roll an expression into an assert to fix -Wunused-variable in a -Asserts build
llvm-svn: 252925
2015-11-12 19:07:43 +00:00
Dan Gohman cf4748f180 [WebAssembly] Reapply r252858, with svn add for the new file.
Switch to MC for instruction printing.

This encompasses several changes which are all interconnected:
 - Use the MC framework for printing almost all instructions.
 - AsmStrings are now live.
 - This introduces an indirection between LLVM vregs and WebAssembly registers,
   and a new pass, WebAssemblyRegNumbering, for computing a basic the mapping.
   This addresses some basic issues with argument registers and unused registers.
 - The way ARGUMENT instructions are handled no longer generates redundant
   get_local+set_local for every argument.

This also changes the assembly syntax somewhat; most notably, MC's printing
does not use sigils on label names, so those are no longer present, and
push/pop now have a sigil to keep them unambiguous.

The usage of set_local/get_local/$push/$pop will continue to evolve
significantly. This patch is just one step of a larger change.

llvm-svn: 252910
2015-11-12 17:04:33 +00:00
Michael Zuckerman fd3fe9e45a [x86] translating "fp" (floating point) instructions from {fadd,fdiv,fmul,fsub,fsubr,fdivr} to {faddp,fdivp,fmulp,fsubp,fsubrp,fdivrp}
LLVM Missing the following instructions: fadd\fdiv\fmul\fsub\fsubr\fdivr.
GAS and MS supporting this instruction and lowering them in to a faddp\fdivp\fmulp\fsubp\fsubrp\fdivrp instructions.

Differential Revision: http://reviews.llvm.org/D14217

llvm-svn: 252908
2015-11-12 16:58:51 +00:00
Artyom Skrobov 2c2f378f8a Cull non-standard variants of ARM architectures (NFC)
Summary:
This patch changes ARMV5, ARMV5E, ARMV6SM, ARMV6HL, ARMV7, ARMV7L,
ARMV7HL, ARMV7EM to be treated as aliases for the corresponding
standard architectures, instead of as actual architectures.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14577

llvm-svn: 252903
2015-11-12 15:51:41 +00:00
Hans Wennborg 7384a2de02 Revert r252858: "[WebAssembly] Switch to MC for instruction printing."
It broke the CMake build:

"Cannot find source file: WebAssemblyRegNumbering.cpp"

llvm-svn: 252897
2015-11-12 14:37:56 +00:00
Vasileios Kalintiris 48e0256ed6 Re-apply "[mips] Use correct frame register for DWARF info when dynamically realigning the stack.""
r252219 reversed the direction of subprogram -> function edge. Fixed the
IR to account for this.

llvm-svn: 252895
2015-11-12 14:11:43 +00:00
James Molloy 8e99e97f2a [ARM] CMOV->BFI combining: handle both senses of CMPZ
I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was.

If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases.

llvm-svn: 252891
2015-11-12 13:49:17 +00:00
Renato Golin 93064025bd Revert "[ARM] Enable shrink-wrapping by default."
This reverts commit r252825, as it broke ASAN on ARM. Investigating...

llvm-svn: 252889
2015-11-12 13:34:50 +00:00
Daniel Sanders 9f6ad49740 Implement .reloc (constant offset only) with support for R_MIPS_NONE and R_MIPS_32.
Summary:
Support for R_MIPS_NONE allows us to parse MIPS16's usage of .reloc.
R_MIPS_32 was included to be able to better test the directive.

Targets can add their relocations by overriding MCAsmBackend::getFixupKind().

Subscribers: grosbach, rafael, majnemer, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D13659

llvm-svn: 252888
2015-11-12 13:33:00 +00:00
Zlatko Buljan 797c2aec6b [mips][microMIPS] Implement LWM16, SB16, SH16, SW16, SWSP and SWM16 instructions
Differential Revision: http://reviews.llvm.org/D11406

llvm-svn: 252885
2015-11-12 13:21:33 +00:00
Vasileios Kalintiris d38860610d Revert "[mips] Use correct frame register for DWARF info when dynamically realigning the stack."
This reverts commit r252882. LLParser complains for invalid field 'function'
in DISubprogram.

llvm-svn: 252884
2015-11-12 13:19:11 +00:00
Vasileios Kalintiris 352eb55baf [mips] Use correct frame register for DWARF info when dynamically realigning the stack.
Summary:
This patch overrides TargetFrameLowering::getFrameIndexReference() in order to
specify the correct register when the function needs dynamic stack realignment.
The values returned from this function are used in order to create DW_AT_locations
for DWARF info. These locations would use the wrong registers as it's been
reported in PR25028.

Reviewers: dsanders

Subscribers: dean, llvm-commits

Differential Revision: http://reviews.llvm.org/D13511

llvm-svn: 252882
2015-11-12 13:04:16 +00:00
Dylan McKay c498ba3a3e Add AVR backend skeleton
This adds part of the target info code, and adds modifications to
the build scripts so that AVR is recognized a supported, experimental
backend.

It does not include any AVR-specific code, just the bare sources required
for a backend to exist.

From D14039.

llvm-svn: 252865
2015-11-12 09:26:44 +00:00
Dan Gohman 9dd55a8065 [WebAssembly] Switch to MC for instruction printing.
This encompasses several changes which are all interconnected:
 - Use the MC framework for printing almost all instructions.
 - AsmStrings are now live.
 - This introduces an indirection between LLVM vregs and WebAssembly registers,
   and a new pass, WebAssemblyRegNumbering, for computing a basic the mapping.
   This addresses some basic issues with argument registers and unused registers.
 - The way ARGUMENT instructions are handled no longer generates redundant
   get_local+set_local for every argument.

This also changes the assembly syntax somewhat; most notably, MC's printing
use sigils on label names, so those are no longer present, and push/pop now
have a sigil to keep them unambiguous.

The usage of set_local/get_local/$push/$pop will continue to evolve
significantly. This patch is just one step of a larger change.

llvm-svn: 252858
2015-11-12 06:10:03 +00:00
Manman Ren 3f2b9c18e2 [TLS on Darwin] use a different mask for tls calls on x86-64.
Calls involved in thread-local variable lookup save more registers
than normal calls.

rdar://problem/23073171

llvm-svn: 252837
2015-11-12 00:54:04 +00:00
Quentin Colombet 10f9813528 [ARM] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14357

rdar://problem/21942589

llvm-svn: 252825
2015-11-11 23:31:46 +00:00
Joseph Tremoulet 9f467353a5 [WinEH] Only generate UnwindHelp slot for MSVCXX
Summary: Other personalities don't use this special frame slot.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14580

llvm-svn: 252778
2015-11-11 19:21:09 +00:00
Sanjay Patel f740129198 [MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any MIPS32
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:
  jr  $ra
  clz  $2, $4

cttz:
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  jr  $ra
  subu  $2, $2, $1

Instead of:

ctlz:
  beqz  $4, $BB0_2
  addiu  $2, $zero, 32
  clz  $2, $4
$BB0_2:
  jr  $ra
  nop

cttz:
  beqz  $4, $BB1_2
  addiu  $2, $zero, 32
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  subu  $2, $2, $1
$BB1_2:
  jr  $ra
  nop

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14500

llvm-svn: 252755
2015-11-11 17:24:56 +00:00
Diego Novillo 0767ae5896 Properly fix unused variable in disable-assert builds.
I missed the side-effects of ParseBFI in my previous attempt (r252748).
Thanks dblaikie for the suggestion of adding a void use of the unused
variable instead.

llvm-svn: 252751
2015-11-11 16:39:22 +00:00
Diego Novillo 29f88a2460 Remove unused variable in disable-assert builds. NFC.
llvm-svn: 252748
2015-11-11 16:14:52 +00:00
Douglas Katzman a14039764b Visibly fail if attempting to encode register AH,BH,CH,DH in a REX-prefixed instruction.
Differential Revision: http://reviews.llvm.org/D13316
Fixes PR25003

llvm-svn: 252743
2015-11-11 15:51:16 +00:00
James Molloy ce12c92f66 [ARM] Combine BFIs together
If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.

llvm-svn: 252740
2015-11-11 15:40:40 +00:00