llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrea Di Biagio	84e22b9096	[X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when decoding a PSHUFB mask. The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes where the mask node is a load from constant pool, and the constant pool node is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching it how to also look through X86ISD::WrapperRIP. This helps function combineX86ShufflesRecusively to combine more shuffle sequences containing PSHUFB nodes if we are in RIPRel PIC mode. Before this change, llc (with -relocation-model=pic -march=x86-64) was unable to decode a pshufb where the mask was loaded from a constant pool. For example, the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its operand, so instead of generating a single 'movaps' the backend always generated a sub-optimal 'movdqa + pshufb' sequence. Added test x86-fold-pshufb.ll. llvm-svn: 236863	2015-05-08 15:11:07 +00:00
Jozef Kolek	8abad7bacc	[mips][microMIPSr6] Implement ALUIPC and AUIPC instructions This patch implements ALUIPC and AUIPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8441 llvm-svn: 236858	2015-05-08 14:25:11 +00:00
Jozef Kolek	9ce6e0a926	[mips][microMIPSr6] Implement ADDIUPC and LWPC instructions This patch implements ADDIUPC and LWPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8415 llvm-svn: 236852	2015-05-08 13:52:04 +00:00
Denis Protivensky	159a49e5d6	Fix gcc warning of different enum and non-enum types in ternary Make '0' literal explicitly unsigned with '0u'. This appeared after r236775. llvm-svn: 236838	2015-05-08 12:21:03 +00:00
Toma Tabacu	8b3345ba7c	[mips] Only use FGR_{32,64} in TableGen descriptions. NFC. Summary: Instead of explicitly adding the IsFP64bit and NotFP64bit predicates through AdditionalRequires. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9566 llvm-svn: 236835	2015-05-08 12:15:04 +00:00
Vasileios Kalintiris	42544d6472	[mips] Emit the .insn directive for empty basic blocks. Summary: In microMIPS, labels need to know whether they are on code or data. This is indicated with STO_MIPS_MICROMIPS and can be inferred by being followed by instructions. For empty basic blocks, we can ensure this by emitting the .insn directive after the label. Also, this fixes some failures in our out-of-tree microMIPS buildbots, for the exception handling regression tests under: SingleSource/Regression/C++/EH Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9530 llvm-svn: 236815	2015-05-08 09:10:15 +00:00
Eric Christopher	54966ebc54	InMips16HardFloat was only being set conditional on whether or not IsSoftFloat was set so remove it from here simplifying the accessor. llvm-svn: 236795	2015-05-07 23:10:23 +00:00
Eric Christopher	e8ae3e3acd	Rename the MIPS routine abiUsesSoftFloat -> useSoftFloat to match some incoming changes and the general scheme used by features (use/has). llvm-svn: 236794	2015-05-07 23:10:21 +00:00
Matthias Braun	f45afee3dc	Fix typo. llvm-svn: 236785	2015-05-07 22:16:10 +00:00
Matthias Braun	d04893fa36	Change getTargetNodeName() to produce compiler warnings for missing cases, fix them llvm-svn: 236775	2015-05-07 21:33:59 +00:00
Pete Cooper	f52123b454	[AArch64] Fix sext/zext folding in address arithmetic. We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there. Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use. llvm-svn: 236764	2015-05-07 19:21:36 +00:00
Nemanja Ivanovic	f3c94b1e3c	Add VSX Scalar loads and stores to the PPC back end This patch corresponds to review: http://reviews.llvm.org/D9440 It adds a new register class to the PPC back end to contain single precision values in VSX registers. Additionally, it adds scalar loads and stores for VSX registers. llvm-svn: 236755	2015-05-07 18:24:05 +00:00
Jozef Kolek	cf98462818	[mips][microMIPSr6] Implement JIALC and JIC instructions This patch implements JIALC and JIC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8389 llvm-svn: 236748	2015-05-07 17:12:23 +00:00
Matt Arsenault	585b566278	R600: Fix comment that mentions AMDIL llvm-svn: 236745	2015-05-07 17:02:32 +00:00
Sanjay Patel	5b373cacf2	Use intrinsic pattern to make a simpler match This is a follow-on to r236740 where I took Andrea's advice in D9504 to remove a redundant pattern...except that I removed the wrong pattern! AFAICT, there is no change in the final code produced because subsequent passes would clean up the extra instructions created by the more complicated pattern. llvm-svn: 236743	2015-05-07 16:51:12 +00:00
Sanjay Patel	a9f6d3505d	[x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507) Finish the job that was abandoned in D6958 following the refactoring in http://reviews.llvm.org/rL230221: 1. Uncomment the intrinsic def for the AVX r_Int instruction. 2. Add missing r_Int entries to the load folding tables; there are already tests that check these in "test/Codegen/X86/fold-load-unops.ll", so I haven't added any more in this patch. 3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ). So instead of this: movaps %xmm0, %xmm1 rcpss %xmm1, %xmm1 movss %xmm1, %xmm0 We should now get: rcpss %xmm0, %xmm0 And instead of this: vsqrtss %xmm0, %xmm0, %xmm1 vblendps $1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3] We should now get: vsqrtss %xmm0, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D9504 llvm-svn: 236740	2015-05-07 15:48:53 +00:00
Simon Atanasyan	fee03b1be8	[MIPS] Move MIPS ABI flags structure constants to a separate header http://reviews.llvm.org/D9517 The separate header file allows the MIPS ABI flags structure constants to be reused in other LLVM tools such as llvm-readobj. No functional changes. llvm-svn: 236732	2015-05-07 14:57:04 +00:00
Elena Demikhovsky	29792e9a80	AVX-512: Added all forms of FP compare instructions for KNL and SKX. Added intrinsics for the instructions. CC parameter of the intrinsics was changed from i8 to i32 according to the spec. By Igor Breger (igor.breger@intel.com) llvm-svn: 236714	2015-05-07 11:24:42 +00:00
Toma Tabacu	506cfd0b2b	[mips] Add the SoftFloat MipsSubtarget feature. Summary: This will enable the IAS to reject floating point instructions if soft-float is enabled. Reviewers: dsanders, echristo Reviewed By: dsanders Subscribers: jfb, llvm-commits, mpf Differential Revision: http://reviews.llvm.org/D9053 llvm-svn: 236713	2015-05-07 10:29:52 +00:00
Sanjoy Das	2e0d29fb09	[X86MCInst] Move LowerSTATEPOINT to inside X86AsmPrinter. NFC. llvm-svn: 236676	2015-05-06 23:53:26 +00:00
Sanjoy Das	80876d5db3	[X86MCInst] Clean up LowerSTATEPOINT: variable names. NFC. llvm-svn: 236675	2015-05-06 23:53:24 +00:00
Pete Cooper	d31583ddfb	[x86] Fix register class of folded load index reg. When folding a load into another instruction, we need to fix the class of the index register. Otherwise, it could be something like GR64 rather than GR64_NOSP and would fail the machine verifier. llvm-svn: 236644	2015-05-06 21:37:19 +00:00
Wei Mi	062c74484d	[X86] Disable loop unrolling in loop vectorization pass when VF is 1. The patch disables unrolling in the loop vectorization pass when VF==1 on the x86 architecture by setting MaxInterleaveFactor to 1. Unrolling in the loop vectorization pass may introduce the cost of an overflow check, a memory boundary check and extra prologue/epilogue code when the regular unroller unrolls the loop another time. Disabling it when VF==1 removes the unnecessary cost on x86. The same can be done for other platforms after verifying that interleaving/memory bound checking is not perf critical on those platforms. Differential Revision: http://reviews.llvm.org/D9515 llvm-svn: 236613	2015-05-06 17:12:25 +00:00
Pete Cooper	d927c6eaf8	[ARM] Fast-Isel was incorrectly selecting <2 x double> adds. With NEON enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add. However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it." This commit disables SelectBinaryFPOp for any vector types. There's already a FIXME to try to handle NEON. Doing so would require fixing this conditional, which isn't safe for vectors: 'VT == MVT::f64 || VT == MVT::i64' llvm-svn: 236609	2015-05-06 16:39:17 +00:00
Bill Schmidt	5fe2e25f7c	[PPC64LE] Adjust vector splats during VSX swap optimization The initial code drop for VSX swap optimization permitted the optimization only when all operations in a web of related computation are lane-insensitive. For some lane-sensitive operations, we can still permit the optimization provided that we make adjustments to those operations. This patch adds special handling for vector splats so that their presence doesn't kill the optimization. Vector splats are lane-sensitive since they identify by number a vector element to be used as the source of a splat. When swap optimizations take place, the desired vector element will move to the opposite doubleword of the quadword vector. We thus replace the index I by (I + N/2) % N, where N is the number of elements in the vector. A new test case is added to test that swap optimization succeeds when vector splats are present, and that the proper input element is used as the source of the splat. An ancillary change removes SH_BUILDVEC as one of the kinds of special handling that may be required by VSX swap optimization. From experience with GCC, I had expected to need some modifications for vector build operations, but I did not find that to be the case. llvm-svn: 236606	2015-05-06 15:40:46 +00:00
NAKAMURA Takumi	d7c0be9c42	Revert r236546, "propagate IR-level fast-math-flags to DAG nodes (NFC)" It caused undefined behavior. llvm-svn: 236600	2015-05-06 14:03:12 +00:00
Artyom Skrobov	3f8eae92a4	[ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too llvm-svn: 236590	2015-05-06 11:44:10 +00:00
Ahmed Bougacha	e8d0c4ccea	[ARM][FastISel] Use TST #1 instead of CMP #0 for select. Since r234249, i1 are sext instead of zext; because of that, doing "CMP rN, #0; IT EQ/NE" isn't correct anymore. "TST #1" is the conservatively correct alternative - the tradeoff being that it doesn't have a 16-bit encoding -, so use that instead. llvm-svn: 236569	2015-05-06 04:14:02 +00:00
Pete Cooper	d0dae3e577	[X86 fast-isel] Constrain the index reg class to not include SP. The index reg on instructions with complex address modes is a GPR64_NOSP. Constrain it to appease the machine verifier. llvm-svn: 236557	2015-05-05 23:41:53 +00:00
Sanjay Patel	801caff64d	propagate IR-level fast-math-flags to DAG nodes (NFC) This patch adds the minimum plumbing necessary to use IR-level fast-math-flags (FMF) in the backend without actually using them for anything yet. This is a follow-on to: http://reviews.llvm.org/rL235997 ...which split the existing nsw / nuw / exact flags and FMF into their own struct. There are 2 structural changes here: 1. The main diff is that we're preparing to extend the optimization flags to affect more than just binary SDNodes. Eg, IR intrinsics ( https://llvm.org/bugs/show_bug.cgi?id=21290 ) or non-binop nodes that don't even exist in IR such as FMA, FNEG, etc. 2. The other change is that we're actually copying the FP fast-math-flags from the IR instructions to SDNodes. Differential Revision: http://reviews.llvm.org/D8900 llvm-svn: 236546	2015-05-05 21:40:38 +00:00
Sanjay Patel	fbca70d767	use range-based for-loop; NFC llvm-svn: 236544	2015-05-05 21:20:52 +00:00
Peter Collingbourne	85a0e23bc8	Thumb2SizeReduction: Check the correct set of registers for LDMIA. The register set for LDMIA begins at offset 3, not 4. We were previously missing the short encoding of this instruction in the case where the base register was the first register in the register set. Also clean up some dead code: - The isARMLowRegister check is redundant with what VerifyLowRegs does; replace with an assert. - Remove handling of LDMDB instruction, which has no short encoding (and does not appear in ReduceTable). Differential Revision: http://reviews.llvm.org/D9485 llvm-svn: 236535	2015-05-05 20:07:10 +00:00
Ulrich Weigand	c1708b2618	[SystemZ] Add vector intrinsics This adds intrinsics to allow access to all of the z13 vector instructions. Note that instructions whose semantics can be described by standard LLVM IR do not get any intrinsics. For each instruction whose semantics cannot (fully) be described, we define an LLVM IR target-specific intrinsic that directly maps to this instruction. For instructions that also set the condition code, the LLVM IR intrinsic returns the post-instruction CC value as a second result. Instruction selection will attempt to detect code that compares that CC value against constants and use the condition code directly instead. Based on a patch by Richard Sandiford. llvm-svn: 236527	2015-05-05 19:31:09 +00:00
Ulrich Weigand	5211f9ff4d	[SystemZ] Mark v1i128 and v1f128 as unsupported The ABI specifies that <1 x i128> and <1 x fp128> are supposed to be passed in vector registers. We do not yet support those types, and some infrastructure is missing before we can do so. In order to prevent accidentally generating code violating the ABI, this patch adds checks to detect those types and error out if user code attempts to use them. llvm-svn: 236526	2015-05-05 19:30:05 +00:00
Ulrich Weigand	cd2a1b5341	[SystemZ] Handle sub-128 vectors The ABI allows sub-128 vectors to be passed and returned in registers, with the vector occupying the upper part of a register. We therefore want to legalize those types by widening the vector rather than promoting the elements. The patch includes some simple tests for sub-128 vectors and also tests that we can recognize various pack sequences, some of which use sub-128 vectors as temporary results. One of these forms is based on the pack sequences generated by llvmpipe when no intrinsics are used. Signed unpacks are recognized as BUILD_VECTORs whose elements are individually sign-extended. Unsigned unpacks can have the equivalent form with zero extension, but they also occur as shuffles in which some elements are zero. Based on a patch by Richard Sandiford. llvm-svn: 236525	2015-05-05 19:29:21 +00:00
Ulrich Weigand	49506d78e7	[SystemZ] Add CodeGen support for scalar f64 ops in vector registers The z13 vector facility includes some instructions that operate only on the high f64 in a v2f64, effectively extending the FP register set from 16 to 32 registers. It's still better to use the old instructions if the operands happen to fit though, since the older instructions have a shorter encoding. Based on a patch by Richard Sandiford. llvm-svn: 236524	2015-05-05 19:28:34 +00:00
Ulrich Weigand	80b3af7ab3	[SystemZ] Add CodeGen support for v4f32 The architecture doesn't really have any native v4f32 operations except v4f32->v2f64 and v2f64->v4f32 conversions, with only half of the v4f32 elements being used. Even so, using vector registers for <4 x float> and scalarising individual operations is much better than generating completely scalar code, since there's much less register pressure. It's also more efficient to do v4f32 comparisons by extending to 2 v2f64s, comparing those, then packing the result. This particularly helps with llvmpipe. Based on a patch by Richard Sandiford. llvm-svn: 236523	2015-05-05 19:27:45 +00:00
Ulrich Weigand	cd808237b2	[SystemZ] Add CodeGen support for v2f64 This adds ABI and CodeGen support for the v2f64 type, which is natively supported by z13 instructions. Based on a patch by Richard Sandiford. llvm-svn: 236522	2015-05-05 19:26:48 +00:00
Ulrich Weigand	ce4c109585	[SystemZ] Add CodeGen support for integer vector types This is the first of a series of patches to add CodeGen support exploiting the instructions of the z13 vector facility. This patch adds support for the native integer vector types (v16i8, v8i16, v4i32, v2i64). When the vector facility is present, we default to the new vector ABI. This is characterized by two major differences: - Vector types are passed/returned in vector registers (except for unnamed arguments of a variable-argument list function). - Vector types are at most 8-byte aligned. The reason for the choice of 8-byte vector alignment is that the hardware is able to efficiently load vectors at 8-byte alignment, and the ABI only guarantees 8-byte alignment of the stack pointer, so requiring any higher alignment for vectors would require dynamic stack re-alignment code. However, for compatibility with old code that may use vector types, when not using the vector facility, the old alignment rules (vector types are naturally aligned) remain in use. These alignment rules are not only implemented at the C language level (implemented in clang), but also at the LLVM IR level. This is done by selecting a different DataLayout string depending on whether the vector ABI is in effect or not. Based on a patch by Richard Sandiford. llvm-svn: 236521	2015-05-05 19:25:42 +00:00
Ulrich Weigand	a8b04e1cbc	[SystemZ] Add z13 vector facility and MC support This patch adds support for the z13 processor type and its vector facility, and adds MC support for all new instructions provided by that facility. Apart from defining the new instructions, the main changes are: - Adding VR128, VR64 and VR32 register classes. - Making FP64 a subclass of VR64 and FP32 a subclass of VR32. - Adding a D(V,B) addressing mode for scatter/gather operations - Adding 1-, 2-, and 3-bit immediate operands for some 4-bit fields. Until now all immediate operands have been the same width as the underlying field (hence the assert->return change in decode[SU]ImmOperand). In addition, sys::getHostCPUName is extended to detect running natively on a z13 machine. Based on a patch by Richard Sandiford. llvm-svn: 236520	2015-05-05 19:23:40 +00:00
Reid Kleckner	0738a9c02e	Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236360. This change exposed a bug in WinEHPrepare by opting win32 code into EH preparation. We already knew that WinEHPrepare has bugs, and is the status quo for x64, so I don't think that's a reason to hold off on this change. I disabled exceptions in the sanitizer tests in r236505 and an earlier revision. llvm-svn: 236508	2015-05-05 17:44:16 +00:00
Quentin Colombet	61b305edfd	[ShrinkWrap] Add (a simplified version of) shrink-wrapping. This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exit blocks. As an example and to avoid introducing regressions, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. Context Currently we insert the prologue and epilogue of the method/function in the entry and exit blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. Motivating example Let us consider the following function that performs a call in only one branch of an if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives for readability): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call. 
Proposed Solution This patch introduces a new machine pass that performs the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore points into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need a hack to properly avoid frequently executed points. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overridden on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX methods to cope with basic blocks that are not necessarily the entry block. Design Decisions 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save points and restore points; the impacted components would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore points instead of one pointer. - PEI: Should loop over the save points and restore points. Anyhow, at least for this first iteration, I do not believe it is worthwhile to support the complex cases. We should revisit that when we have motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507	2015-05-05 17:38:16 +00:00
Kit Barton	d4eb73c00e	This patch adds ABI support for v1i128 data type. It adds v1i128 to the appropriate register classes and checks parameter passing and return values. This is related to http://reviews.llvm.org/D9081, which will add instructions that exploit the v1i128 datatype. Phabricator review: http://reviews.llvm.org/D9475 llvm-svn: 236503	2015-05-05 16:10:44 +00:00
Daniel Sanders	eda60d217b	[mips] Generate code for insert/extract operations when using the N64 ABI and MSA. Summary: When using the N64 ABI, element-indices use the i64 type instead of i32. In many cases, we can use iPTR to account for this but additional patterns and pseudo's are also required. This fixes most (but not quite all) failures in the test-suite when using N64 and MSA together. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9342 llvm-svn: 236494	2015-05-05 10:32:24 +00:00
Daniel Sanders	4160c802d9	[mips][msa] Test basic operations for the N32 ABI too. Summary: This required adding instruction aliases for dneg. N64 will be enabled shortly but requires additional bugfixes. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9341 llvm-svn: 236489	2015-05-05 08:48:35 +00:00
Reid Kleckner	9dad227b85	[X86] Fix assertion while DAG combining offsets and ExternalSymbols ExternalSymbol nodes do not contain offsets, unlike GlobalValue nodes. llvm-svn: 236471	2015-05-04 23:22:36 +00:00
Pete Cooper	4dddbcfbb1	[ARM] IT block insertion needs to update kill flags When forming an IT block from the first MOV here: %R2<def> = t2MOVr %R0, pred:1, pred:%CPSR, opt:%noreg %R3<def> = tMOVr %R0<kill>, pred:14, pred:%noreg the move into R3 is moved out of the IT block so that later instructions on the same predicate can be inside this block, and we can share the IT instruction. However, when moving the R3 copy out of the IT block, we need to clear its kill flags for anything in use at this point in time, i.e., R0 here. This appeases the machine verifier which thought that R0 wasn't defined when used. I have a test case, but it's extremely register allocator specific. It would be too fragile to commit a test which depends on the register allocator here. llvm-svn: 236468	2015-05-04 22:44:47 +00:00
Reid Kleckner	b61f06c9c2	Fix -Wmicrosoft warning by making enum unsigned llvm-svn: 236436	2015-05-04 18:21:35 +00:00
Ulrich Weigand	9ac2f9b2d8	[SystemZ] Reclassify f32 subregs of f64 registers At the moment, all subregs defined by the SystemZ target can be modified independently of the wider register. E.g. writing to a GR32 does not change the upper 32 bits of the GR64. Writing to an FP32 does not change the lower 32 bits of the FP64. However, the upcoming support for the vector extension redefines FP64 as one half of a V128. Floating-point operations leave the other half of a V128 in an unpredictable state, so it's no longer the case that writing to an FP32 leaves the bits of the underlying register (the V128) alone. I'd prefer to have separate subreg_ names for this situation, so that it's obvious at a glance whether we're talking about a subreg that leaves the other parts of the register alone. No behavioral change intended. Patch originally by Richard Sandiford. llvm-svn: 236433	2015-05-04 17:41:22 +00:00
Ulrich Weigand	1f698b003c	[SystemZ] Clean up AsmParser isMem() handling We know what MemoryKind an operand has at the time we construct it, so we might as well just record it in an unused part of the structure. This makes it easier to add scatter/gather addresses later. No behavioral change intended. Patch originally by Richard Sandiford. llvm-svn: 236432	2015-05-04 17:40:53 +00:00
Ulrich Weigand	1c6f07d616	[SystemZ] Fix getTargetNodeName It seems SystemZTargetLowering::getTargetNodeName got out of sync with some recent changes to the SystemZISD opcode list. Add back all the missing opcodes (and re-sort to the same order as SystemZISelLowering.h). llvm-svn: 236430	2015-05-04 17:39:40 +00:00
Tom Stellard	b81f4aa952	R600/SI: Code cleanup This is a follow-up to r236004 llvm-svn: 236427	2015-05-04 16:45:08 +00:00
Elena Demikhovsky	60eb9db7bb	AVX-512: added calling convention for i1 vectors in 32-bit mode. Fixed some bugs in extend/truncate for AVX-512 target. Removed VBROADCASTM (masked broadcast) node, since it is not used any more. llvm-svn: 236420	2015-05-04 12:40:50 +00:00
Elena Demikhovsky	52266388f8	AVX-512: added integer "add" and "sub" instructions with saturation for SKX with intrinsics and tests by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236418	2015-05-04 12:35:55 +00:00
Elena Demikhovsky	2557a22be7	AVX-512: Added VPACK* instructions forms for KNL and SKX and their intrinsics by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236414	2015-05-04 09:14:02 +00:00
Elena Demikhovsky	1b60ed7069	Masked gather and scatter intrinsics - enabled codegen for KNL. llvm-svn: 236394	2015-05-03 07:12:25 +00:00
Simon Pilgrim	d5e20306cc	[SSE2] Minor tidyup of v16i8 SHL lowering. NFC. Removed code that was replicating v8i16 'shift + mask' implementation that is done more nicely by making use of LowerScalarImmediateShift llvm-svn: 236388	2015-05-02 14:42:43 +00:00
Reid Kleckner	83d89fa546	Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236359. Things are still broken despite testing. :( llvm-svn: 236360	2015-05-01 22:50:14 +00:00
Reid Kleckner	51476acd77	Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236340. llvm-svn: 236359	2015-05-01 22:40:25 +00:00
Quentin Colombet	0de2346859	[AArch64][FastISel] Variant of the logical instructions that use two input registers cannot write on SP. rdar://problem/20748715 llvm-svn: 236352	2015-05-01 21:34:57 +00:00
Colin LeMahieu	6efd273a61	[Hexagon] Removing variable unused in release. llvm-svn: 236351	2015-05-01 21:30:22 +00:00
Colin LeMahieu	b662565475	[Hexagon] Adding expression MC emission and removing XFAIL from test that hits this code path. llvm-svn: 236348	2015-05-01 21:14:21 +00:00
Quentin Colombet	9df2fa261b	[AArch64][FastISel] Fix the setting of kill flags for MUL -> UMULH sequences. rdar://problem/20748715 llvm-svn: 236346	2015-05-01 20:57:11 +00:00
Reid Kleckner	2747d3d55a	Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236339, it breaks the win32 clang-cl self-host. llvm-svn: 236340	2015-05-01 20:14:04 +00:00
Reid Kleckner	4856fc61b4	[WinEH] Add an EH registration and state insertion pass for 32-bit x86 This pass is responsible for constructing the EH registration object that gets linked into fs:00, which is all it does in this change. In the future, it will also insert stores to update the EH state number. I considered keeping this functionality in WinEHPrepare, but it's pretty separable and X86 specific. It has conceptually very little to do with the task of WinEHPrepare, which is currently outlining. WinEHPrepare is also in theory useful on ARM, but this logic is pretty x86 specific. Reviewers: andrew.w.kaylor, majnemer Differential Revision: http://reviews.llvm.org/D9422 llvm-svn: 236339	2015-05-01 20:04:54 +00:00
Pete Cooper	f68d5038e6	[ARM] Transfer the internal flag in thumb2 size reduction. Converting from t2LDRs to tLDRr caused the shift argument to drop the internal flag. This would then throw machine verifier errors. Unfortunately I'm having trouble reducing a test case. I'm going to keep trying, but so far it's a scary combination of machine sinking, an 'and i1', loads feeding loads, and a bunch of code which shouldn't change IT block formation, but does. It's not useful to commit a test in that state as we have no way of knowing if it even hits this code reliably in future. rdar://problem/20752113 llvm-svn: 236333	2015-05-01 18:57:32 +00:00
Peter Collingbourne	d27d3a151f	ARM: Align functions containing Thumb-2 jump tables to 4 bytes. Functions with jump tables need an alignment of 4 because they use the ADR instruction, which aligns the PC to 4 bytes before adding an offset. Differential Revision: http://reviews.llvm.org/D9424 llvm-svn: 236327	2015-05-01 18:05:59 +00:00
James Y Knight	35e04e84fa	[Sparc] Repair fixups in little endian mode. Differential Revision: http://reviews.llvm.org/D9434 llvm-svn: 236324	2015-05-01 17:13:02 +00:00
Toma Tabacu	00e9867988	[mips] [IAS] Fix error messages for using LI with 64-bit immediates. Summary: LI should never accept immediates larger than 32 bits. The additional Is32BitImm boolean also paves the way for unifying the functionality that LA and LI have in common. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9289 llvm-svn: 236313	2015-05-01 12:19:27 +00:00
Toma Tabacu	a2861db834	[mips] [IAS] Slightly improve shift instruction generation in expandLoadImm. Summary: Generate one DSLL32 of 0 instead of two consecutive DSLL of 16. In order to do this I had to change createLShiftOri's template argument from a bool to an unsigned. This also gave me the opportunity to rewrite the mips64-expansions.s test, as it was testing the same cases multiple times and skipping over other cases. It was also somewhat unreadable, as the CHECK lines were grouped in a huge block of text at the beginning of the file. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8974 llvm-svn: 236311	2015-05-01 10:26:47 +00:00
Tom Stellard	aa798340c3	R600/SI: Add VCC as an implicit def of SI_KILL When SI_KILL has a register operand, its lowered form writes to vcc. llvm-svn: 236307	2015-05-01 03:44:09 +00:00
Tom Stellard	0b7feb1cb7	R600/SI: Fix verifier errors from the SIAnnotateControlFlow pass This pass was generating 'Instruction does not dominate all uses!' errors for programs which had loops with a condition variable that depended on the result of a phi instruction from outside of the loop. The pass was inserting new phi nodes outside of the loop which used values defined inside the loop. http://bugs.freedesktop.org/show_bug.cgi?id=90056 llvm-svn: 236306	2015-05-01 03:44:08 +00:00
Pete Cooper	2127b00cd5	[ARM] optimizeSelect should clear kill flags. If we move an instruction from one block down to a MOVC and predicate it, then the original instruction could be moved into a loop. In this case, it's invalid for any kill flags to remain on there. Fails with -verify-machineinstrs. rdar://problem/20752113 llvm-svn: 236290	2015-04-30 23:57:47 +00:00
Quentin Colombet	329fa890ba	[AArch64] Fix bad register class constraint in fast-isel for TST instruction. rdar://problem/20748715 llvm-svn: 236273	2015-04-30 22:27:20 +00:00
Pete Cooper	5111881cfc	Don't always apply kill flag in thumb2 ABS pseudo expansion. The expansion for t2ABS was always setting the kill flag on the rsb instruction. It should instead only be set on rsb if it was set on the original ABS instruction. rdar://problem/20752113 llvm-svn: 236272	2015-04-30 22:15:59 +00:00
Reid Kleckner	60d5232be2	[X86] Use 4 byte preferred aggregate alignment on Win32 This helps reduce the frequency of stack realignment prologues in 32-bit X86 Windows code. Before this change and the corresponding clang change, we would take the max of the type preferred alignment and the explicit alignment on the alloca. If you don't override aggregate alignment in datalayout, you get a default of 8. This dates back to 2007 / r34356, and changing it seems prohibitively difficult at this point. llvm-svn: 236270	2015-04-30 22:11:59 +00:00
Matt Arsenault	d42e017ee4	Mips: Remove dead declaration llvm-svn: 236250	2015-04-30 19:35:43 +00:00
Quentin Colombet	0a905042cd	[ARM] Do not generate invalid encoding for stack adjust, even if this is just temporary. Because of that: 1. The machine verifier was complaining on such code. 2. The generated code worked just because the thumb reduction size pass fixed the opcode. rdar://problem/20749824 llvm-svn: 236247	2015-04-30 18:52:49 +00:00
Tim Northover	03b99f66d7	AArch64: add BFC alias for the BFI/BFM instructions. Unlike 32-bit ARM, AArch64 can use wzr/xzr to implement this without the need for a separate instruction. rdar://18679590 llvm-svn: 236245	2015-04-30 18:28:58 +00:00
Jan Vesely	808fff585b	Reinstate revisions r234755, r234759, r234760 changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO Add location to getConstant Drop comment about the ops being turned into expand llvm-svn: 236240	2015-04-30 17:15:56 +00:00
Daniel Jasper	232778a7a0	Silence unused warning in non-assert builds. llvm-svn: 236213	2015-04-30 09:01:21 +00:00
Elena Demikhovsky	e1eda8a9e6	Masked gather and scatter - added DAGCombine visitors and AVX-512 instruction selection patterns. All other patches, including tests will follow. http://reviews.llvm.org/D7665 llvm-svn: 236211	2015-04-30 08:38:48 +00:00
Simon Pilgrim	ecf5875bd5	[SSE] Fix for MUL v16i8 on pre-SSE41 targets (PR23369). Sign extension of i8 to i16 was placing the unpacked bytes in the lower byte instead of the upper byte. llvm-svn: 236209	2015-04-30 08:23:16 +00:00
Pete Cooper	46361a1ea1	Change x86 CMOVE_F to read its source, not write it. This was breaking sqlite with the machine verifier because operand 0 was a def according to tablegen, but didn't have the 'isDef' flag set. Looking at the ISA, it's clear that this operand is a source as writing to st(0) is implicit. So move the operand to the correct place in the td file. rdar://problem/20751584 llvm-svn: 236183	2015-04-29 23:51:33 +00:00
Douglas Katzman	9160e78ac8	[Sparc] Really add sparcel architecture support. Mostly copy-and-paste from Sparc v8 architecture. Differential Revision: http://reviews.llvm.org/D8741 llvm-svn: 236146	2015-04-29 20:30:57 +00:00
Manman Ren	0e20822887	[AArch64] Refactor out codes that depend on specific CS save sequence. No functionality change. llvm-svn: 236143	2015-04-29 20:03:38 +00:00
Tim Northover	5211715360	ARM: mark branch-like instructions with correct flags. There's probably no way to test BXJ, but if the compiler ever did emit it during CodeGen it would have to be a block terminator so "isBranch" is appropriate. BLX is more tricky. Clearly a call, but it affects surprisingly little. rdar://18719544 llvm-svn: 236140	2015-04-29 19:16:38 +00:00
Douglas Katzman	9cb88b73c6	Make Sparc assembler accept parenthesized constant expressions. Differential Revision: http://reviews.llvm.org/D9087 llvm-svn: 236137	2015-04-29 18:48:29 +00:00
Zoran Jovanovic	387ce30685	[mips][microMIPSr6] Implement MUL, MUH, MULU and MUHU instructions Differential Revision: http://reviews.llvm.org/D8894 llvm-svn: 236131	2015-04-29 17:23:22 +00:00
Reid Kleckner	c695471365	[X86] Avoid mangling frameescape labels x86 Windows uses the '_' prefix for all global symbols, and this was mistakenly being applied to frameescape labels, which are not externally visible global symbols. They use the private global prefix 'L'. The right way to fix this is probably to stop masquerading this label as an ExternalSymbol and create a new SDNode type. These labels are not "external", and we know they will be resolved by assembly time. Having a custom SDNode type would allow us to do better X86 address mode matching, so it's probably worth doing eventually. llvm-svn: 236123	2015-04-29 16:46:01 +00:00
Duncan P. N. Exon Smith	a9308c49ef	IR: Give 'DI' prefix to debug info metadata Finish off PR23080 by renaming the debug info IR constructs from `MD` to `DI`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the previous commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to this commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120	2015-04-29 16:38:44 +00:00
Zoran Jovanovic	cca29e8f6e	[mips][microMIPSr6] Implement SUB and SUBU instructions Differential Revision: http://reviews.llvm.org/D8764 llvm-svn: 236118	2015-04-29 16:22:46 +00:00
Zoran Jovanovic	5f34d44354	[mips][microMIPSr6] Implement ADD, ADDU and ADDIU instructions Differential Revision: http://reviews.llvm.org/D8704 llvm-svn: 236111	2015-04-29 15:11:07 +00:00
James Y Knight	c09bdfa4cb	Sparc: Prefer reg+reg address encoding when only one register used. Reg+%g0 is preferred to Reg+imm0 by the manual, and is what GCC produces. Furthermore, reg+imm is invalid for the (not yet supported) "alternate address space" instructions. Differential Revision: http://reviews.llvm.org/D8753 llvm-svn: 236107	2015-04-29 14:54:44 +00:00
Vasileios Kalintiris	1249e74648	Mips fast-isel - handle functions which return i8 or i16. Summary: Allow Mips fast-isel to handle functions which return i8/i16 signed/unsigned. Test Plan: Make check tests are forthcoming. Already passes test-suite at O0/O2 for Mips 32 r1/r2 Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6765 llvm-svn: 236103	2015-04-29 14:17:14 +00:00
Daniel Sanders	301f937765	[mips] Correct 128-bit shifts on 64-bit targets. Summary: The existing code was correct for 32-bit GPR's but not 64-bit GPR's. It now accounts for both cases. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits, mohit.bhakkad, sagar Differential Revision: http://reviews.llvm.org/D9337 llvm-svn: 236099	2015-04-29 12:28:58 +00:00
Toma Tabacu	79588100d7	[mips] [IAS] Inline assemble-time shifting out of createLShiftOri. NFC. Summary: Do the assemble-time shifts from createLShiftOri at the source, which groups all the shifting together, closer to the main logic path, and store the results in concisely-named variables to improve code clarity. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8973 llvm-svn: 236096	2015-04-29 10:19:56 +00:00
Elena Demikhovsky	a9f20495a2	fixed 80-chars; NFC llvm-svn: 236093	2015-04-29 08:49:57 +00:00
Eric Christopher	0ba41a6841	Reuse a lookup in an assert. llvm-svn: 236054	2015-04-28 22:38:35 +00:00
Tim Northover	e18d662201	ARM: fix peephole optimisation of TST We were trying to look through COPY instructions, but only to the next instruction in a BB and incorrectly anyway. The cases where that would actually be a good idea are rare enough (and not even tested!) that it's not worth trying to get right. rdar://20721342 llvm-svn: 236050	2015-04-28 22:03:55 +00:00
James Y Knight	e8da8096ec	Sparc: Add alternate aliases for conditional branch instructions. llvm-svn: 236042	2015-04-28 21:27:31 +00:00
Alexei Starovoitov	659ece9ddb	[bpf] fix build Patch by Brenden Blanco. llvm-svn: 236030	2015-04-28 20:38:56 +00:00
Sanjay Patel	f75ee4dc07	[x86] remove RCPPS and RSQRTPS intrinsic instruction definitions We don't need codegen-only intrinsic instructions for the vector forms of these instructions. This makes the reciprocal estimate instruction lowering identical to how we handle normal square roots: (V)SQRTPS / (V)SQRTPD. No existing regression tests fail with this patch. Differential Revision: http://reviews.llvm.org/D9301 llvm-svn: 236013	2015-04-28 18:48:45 +00:00
Eric Christopher	35a8a62125	Add a fixme to resetTargetOptions to explain why it needs to go away. llvm-svn: 236009	2015-04-28 18:09:05 +00:00
Eric Christopher	f4bf3779d8	Fix a [-Werror,-Winconsistent-missing-override] problem in the NVPTX overrides. llvm-svn: 236007	2015-04-28 18:06:27 +00:00
Tom Stellard	96301d2455	R600: Fix up for AsmPrinter's OutStreamer being a unique_ptr Fixes a crash with basically any OpenGL application using the radeonsi driver. Patch by: Michel Dänzer Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90176 Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 236004	2015-04-28 17:37:03 +00:00
Tom Stellard	0a0fa03d5a	R600/SI: Add a lower case alias for subtarget feature: +DumpCode llc converts all feature strings to lower case, while the LLVM C API does not, so we need a lower case alias in order to test this with llc. llvm-svn: 236003	2015-04-28 17:37:00 +00:00
Justin Holewinski	3d2a976197	[NVPTX] Handle addrspacecast constant expressions in aggregate initializers We need to track if an AddrSpaceCast expression was seen when generating an MCExpr for a ConstantExpr. This change introduces a custom lowerConstant method to the NVPTX asm printer that will create NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode the information that a given symbol needs to be casted to a generic address. llvm-svn: 236000	2015-04-28 17:18:30 +00:00
Sanjay Patel	ba55804ea3	move IR-level optimization flags into their own struct This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900). In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment, we should eventually share this data between the IR passes and the backend. We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to use the new struct. The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a data member to the subclass. This means we don't have to repeat all of the get/set methods per flag, but we're potentially adding size to all nodes of this subclass type. In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at least one free byte to play with due to struct alignment. Differential Revision: http://reviews.llvm.org/D9325 llvm-svn: 235997	2015-04-28 16:39:12 +00:00
Elena Demikhovsky	1f7b3644d3	Fixed crash of variable shift inst on AVX2 https://llvm.org/bugs/show_bug.cgi?id=22955 llvm-svn: 235993	2015-04-28 14:46:35 +00:00
Toma Tabacu	7dea2e3982	[mips] [IAS] Do not generate redundant ORi in createLShiftOri. Summary: If the immediate is 0, the ORi is pointless. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8969 llvm-svn: 235990	2015-04-28 14:06:35 +00:00
Sergey Dmitrouk	842a51bad8	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989	2015-04-28 14:05:47 +00:00
Daniel Jasper	48e93f7181	Revert "[DebugInfo] Add debug locations to constant SD nodes" This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987	2015-04-28 13:38:35 +00:00
Toma Tabacu	6114565269	[mips] [IAS] Rename the createShiftOr function to createLShiftOri. NFC. Summary: The new name is more accurate with regard to the functionality. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8968 llvm-svn: 235984	2015-04-28 13:16:06 +00:00
Toma Tabacu	137d90ab88	[mips] [IAS] Store the expandLoadImm destination register in a variable. NFC. Summary: This removes multiple calls to getReg() and saves us column space in the source file. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8924 llvm-svn: 235978	2015-04-28 12:04:53 +00:00
Sergey Dmitrouk	adb4c69d5c	[DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977	2015-04-28 11:56:37 +00:00
Elena Demikhovsky	ae51853924	AVX-512: Added "pandn" intrinsics set by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 235971	2015-04-28 08:12:42 +00:00
Ahmed Bougacha	190528703f	[MC] Use LShr for constant evaluation of ">>" on ELF/arm64--darwin. This matches other assemblers and is less unexpected (e.g. PR23227). On ELF, I tried binutils gas v2.24 and nasm 2.10.09, and they both agree on LShr. On COFF, I couldn't get my hands on an assembler yet, so don't change the behavior. For now, don't change it on non-AArch64 Darwin either, as the other assembler is gas v1.38, which does an AShr. llvm-svn: 235963	2015-04-28 01:37:11 +00:00
Matthias Braun	eec4efcca5	Cleanup, remove unused return value llvm-svn: 235952	2015-04-28 00:37:05 +00:00
Sanjay Patel	ca5ad5fb6d	remove obsolete pattern matches for scalar SSE ops The blendi pattern should always replace the insertps pattern after: http://reviews.llvm.org/rL232850 http://reviews.llvm.org/rL235124 llvm-svn: 235930	2015-04-27 22:23:17 +00:00
Ahmed Bougacha	c004c60c0a	[AArch64] Also combine vector selects fed by non-i1 SETCCs. After legalization, scalar SETCC has an i32 result type on AArch64. The i1 requirement seems too conservative, replace it with an assert. This also means that we now can run after legalization. That should also be fine, since the ops legalizer runs again after each combine, and all types created all have the same sizes as the (legal) inputs. Exposed by r235917; while there, robustize its tests (bsl also uses the register it defines). llvm-svn: 235922	2015-04-27 21:43:12 +00:00
Ahmed Bougacha	89bba61c84	[AArch64] Don't assert when combining (v3f32 select (setcc f64)). When the setcc has f64 operands, we can't build a vector setcc mask to feed a vselect, because f64 doesn't divide v3f32 evenly. Just bail out when that happens. llvm-svn: 235917	2015-04-27 21:01:20 +00:00
Bill Schmidt	e71db85bed	Silence unused variable errors for no-asserts builds llvm-svn: 235913	2015-04-27 20:22:35 +00:00
Bill Schmidt	fe723b9a6d	[PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations This patch adds a new SSA MI pass that runs on little-endian PPC64 code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors without alignment constraints are accomplished for little-endian using lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional xxswapd instructions hurts performance in comparison with big-endian code, but they are necessary in the general case to support correct semantics. However, the general case does not apply to most vector code. Many vector instructions are lane-insensitive; they do not "care" which lanes the parallel computations are performed within, provided that the resulting data is stored into the correct locations. Thus this pass looks for computations that perform only lane-insensitive operations, and removes the unnecessary swaps from loads and stores in such computations. Future improvements will allow computations using certain lane-sensitive operations to also be optimized in this manner, by modifying the lane-sensitive operations to account for the permuted order of the lanes. However, this patch only adds the infrastructure to permit this; no lane-sensitive operations are optimized at this time. This code is heavily exercised by the various vectorizing applications in the projects/test-suite tree. For the time being, I have only added one simple test case to demonstrate what the pass is doing. Although it is quite simple, it provides coverage for much of the code, including the special case handling of copies and subreg-to-reg operations feeding the swaps. I plan to add additional tests in the future as I fill in more of the "special handling" code. Two existing tests were affected, because they expected the swaps to be present, but they are now removed. llvm-svn: 235910	2015-04-27 19:57:34 +00:00
Sanjay Patel	8fd573e87f	fix 80-cols; NFC llvm-svn: 235902	2015-04-27 17:45:44 +00:00
Sanjay Patel	912315811e	fix typos; NFC llvm-svn: 235896	2015-04-27 17:03:31 +00:00
Toma Tabacu	bda745f532	[mips] Correct bytes to bits in 2 comments. NFC. llvm-svn: 235891	2015-04-27 15:21:38 +00:00
Elena Demikhovsky	a480ef5494	AVX-512: added calling conventions for i1 vectors. Fixed bug: https://llvm.org/bugs/show_bug.cgi?id=20724 llvm-svn: 235889	2015-04-27 15:11:19 +00:00
Brendon Cahoon	55bdeb7bc7	[Hexagon] Use constant extenders to fix up hardware loops Use a loop instruction with a constant extender for a hardware loop instruction that is too far away from the start of the loop. This is cheaper than changing the SA register value. Differential Revision: http://reviews.llvm.org/D9262 llvm-svn: 235882	2015-04-27 14:16:43 +00:00
Toma Tabacu	d9d344b485	[mips] [IAS] Improve warning for using AT with .set noat. Summary: Changed the warning message to show the current value of $at, similar to what clang does for typedef's, and renamed warnIfAssemblerTemporary to a more descriptive name. I also changed the type of variables which store registers from int to unsigned, updated the relevant test and tried to make the related comments clearer. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8479 llvm-svn: 235881	2015-04-27 14:05:04 +00:00
Vasileios Kalintiris	7a6b18783f	Reapply "[mips][FastISel] Implement shift ops for Mips fast-isel." This reapplies r235194, which was reverted in r235495 because it was causing a failure in our out-of-tree buildbots for MIPS. With the sign-extension patch in r235718, this patch doesn't cause any problem any more. llvm-svn: 235878	2015-04-27 13:28:05 +00:00
Toma Tabacu	b19cf2082f	[mips] [IAS] Rename getATRegNum and setATReg to {g,s}etATRegIndex. NFC. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8480 llvm-svn: 235877	2015-04-27 13:12:59 +00:00
Elena Demikhovsky	d1084c5b3f	AVX-512: Extend/Truncate operations for SKX, SETCC for bit-vectors llvm-svn: 235875	2015-04-27 12:57:59 +00:00
Simon Pilgrim	4f683c264a	[X86][SSE] Add v16i8/v32i8 multiplication support Patch to allow int8 vectors to be multiplied on the SSE unit instead of being scalarized. The patch sign extends the i8 lanes to i16, uses the SSE2 pmullw multiplication instruction, then packs the lower byte from each result. Differential Revision: http://reviews.llvm.org/D9115 llvm-svn: 235837	2015-04-27 07:55:46 +00:00
Alexei Starovoitov	f26c748b1b	[bpf] fix build and remove a compiler warning in Release mode Patch by Brenden Blanco. llvm-svn: 235814	2015-04-26 01:58:08 +00:00
Benjamin Kramer	a44b37e676	[ARM] Simplify code. NFC. llvm-svn: 235803	2015-04-25 17:25:13 +00:00
Benjamin Kramer	6246069c89	[hexagon] Use range-based for loops. No functionality change intended. llvm-svn: 235802	2015-04-25 14:46:53 +00:00
Benjamin Kramer	a37c809ce5	[hexagon] Remove setHexLibcallName, it leaks memory. Just spell out the full names, it's not that much more code. No functional change intended. llvm-svn: 235801	2015-04-25 14:46:46 +00:00
Lang Hames	9ff69c8f4d	[AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr. AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a reference for this is crufty. llvm-svn: 235752	2015-04-24 19:11:51 +00:00
Vasileios Kalintiris	1202f36b10	[mips][FastISel] Specify which types we handle for integer extension. Summary: Perform integer extension only when the destination type is one of i8, i16 & i32 and when the source type is i1, i8 or i16. For other combinations we fall back to SelectionDAG. This fixes the test MultiSource/Benchmarks/7zip that was failing in our out-of-tree MIPS buildbots. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9243 llvm-svn: 235718	2015-04-24 13:48:19 +00:00
Jingyue Wu	72fca6c89b	Resurrect r235688 We should skip vector types which are not SCEVable. test/CodeGen/NVPTX/sched2.ll passes llvm-svn: 235695	2015-04-24 04:22:39 +00:00
Jingyue Wu	62af99b0db	Revert r235688 Seems breaking builds llvm-svn: 235690	2015-04-24 03:26:11 +00:00
Jingyue Wu	312fd0242d	[NVPTX] Emits "generic()" depending on the original address space Summary: Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()" on aggregates of addrspacecasts. Test Plan: addrspacecast-gvar.ll Reviewers: eliben, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9130 llvm-svn: 235689	2015-04-24 02:57:30 +00:00
Jingyue Wu	3daace5295	[NVPTX] enable NaryReassociate in NVPTX Summary: We run NaryReassociate right after SLSR because SLSR enables many opportunities for NaryReassociate. For example, in nary-slsr.ll foo((a + b) + c); foo((a + b * 2) + c); foo((a + b * 3) + c); // 2 muls and 6 adds after SLSR: ab = a + b; foo(ab + c); ab2 = ab + b; foo(ab2 + c); ab3 = ab2 + b; foo(ab3 + c); // 6 adds after NaryReassociate: abc = (a + b) + c; foo(abc); ab2c = abc + b; foo(ab2c); ab3c = ab2c + b; foo(ab3c); // 4 adds Test Plan: nary-slsr.ll Reviewers: jholewinski, eliben Reviewed By: eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9066 llvm-svn: 235688	2015-04-24 02:54:06 +00:00
Matt Arsenault	5e10016f03	R600/SI: Fix verifier error when producing v_madmk_f32 Copy the kill flags when swapping the operands. llvm-svn: 235687	2015-04-24 01:57:58 +00:00
Matthias Braun	e1a67412cf	R600/RegisterCoalescer: Enable more rematerialization/add missing testcase This enables the rematerialization of some R600 MOV instructions in the RegisterCoalescer and adds a testcase for r235668. llvm-svn: 235675	2015-04-24 00:25:50 +00:00
Matt Arsenault	a48b866068	R600/SI: Special case v_mov_b32 as really rematerializable This should be fixed to properly understand all rematerializable instructions while ignoring implicit reads of exec. llvm-svn: 235671	2015-04-23 23:34:48 +00:00
Hal Finkel	4dc8fcc224	[PowerPC] Support register name prefixes for vector registers Match binutils by supporting the optional register name prefix for new vector registers ("vs" for VSX registers and "q" for QPX registers). llvm-svn: 235665	2015-04-23 23:16:22 +00:00
Hal Finkel	d86e90abdd	[PowerPC] Use sync inst alias when printing So long as the choice between printing msync and sync is not ambiguous, we can print 'sync 0' as just 'sync'. llvm-svn: 235663	2015-04-23 23:05:08 +00:00
Tom Stellard	ff5cf0e1fd	R600: Correctly lower CONCAT_VECTOR nodes with more than 2 operands llvm-svn: 235662	2015-04-23 22:59:24 +00:00
Hal Finkel	fefcfffe68	[PowerPC] Add asm/disasm support for dcbt with hint Add assembler/disassembler support for dcbt/dcbtst (and aliases) with the hint field specified (non-zero). Unfortunately, the syntax for this instruction is special in that it differs for server vs. embedded cores: dcbt ra, rb, th [server] dcbt th, ra, rb [embedded] where th can be omitted when it is 0. dcbtst is the same. Thus we need to play games in the parser and the printer to flip the operands around on the embedded cores. We'll use the server syntax as the default (binutils currently uses the embedded form by default, but IBM is changing that). We also stop marking dcbtst as having unmodeled side effects (this is not necessary, it is just a hint like dcbt -- noticed by inspection, so no separate test case). llvm-svn: 235657	2015-04-23 22:47:57 +00:00
Krzysztof Parzyszek	ed75e7aece	Unbreak build llvm-svn: 235646	2015-04-23 20:57:39 +00:00
Krzysztof Parzyszek	27ba19a177	[Hexagon] Minor cleanup in HexagonFrameLowering llvm-svn: 235645	2015-04-23 20:42:20 +00:00
Tom Stellard	8b0182af2f	R600/SI: Fix indirect addressing with a negative constant offset When the base register index of the vector plus the constant offset was less than zero, we were passing the wrong base register to the indirect addressing instruction. In this case, we need to set the base register to v0 and then add the computed (negative) index to m0. llvm-svn: 235641	2015-04-23 20:32:01 +00:00
Peter Collingbourne	167668f8c8	Thumb2: When applying branch optimizations, visit branches in reverse order. The order in which branches appear in ImmBranches is approximately their order within the function body. By visiting later branches first, we reduce the distance between earlier forward branches and their targets, making it more likely that the cbn?z optimization, which can only apply to forward branches, will succeed for those earlier branches. Differential Revision: http://reviews.llvm.org/D9185 llvm-svn: 235640	2015-04-23 20:31:35 +00:00
Peter Collingbourne	cfee5b04bc	ARM: When re-creating a branch via InsertBranch, preserve CPSR flags. In particular, this preserves the kill flag, which allows the Thumb2 cbn?z optimization to be applied in cases where a branch has been re-created after the live variables analysis pass, e.g. by the machine block placement pass. This appears to be low risk; a number of other targets seem to already be doing something similar, e.g. AArch64, PowerPC. Differential Revision: http://reviews.llvm.org/D9184 llvm-svn: 235639	2015-04-23 20:31:32 +00:00
Peter Collingbourne	6529523151	Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero. This allows the constant island pass to lower these branches to cbn?z instructions, resulting in a shorter instruction sequence. Differential Revision: http://reviews.llvm.org/D9183 llvm-svn: 235638	2015-04-23 20:31:30 +00:00
Peter Collingbourne	78f1ecc59c	ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets. This makes it more likely that we can use the 16-bit push and pop instructions on Thumb-2, saving around 4 bytes per function. Differential Revision: http://reviews.llvm.org/D9165 llvm-svn: 235637	2015-04-23 20:31:26 +00:00
Peter Collingbourne	1213918bf4	ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools. This appears to have been introduced back in r76698 as part of an unrelated change. I can find no official ARM documentation stating that Thumb-2 functions require 4-byte alignment; in fact, ARM documentation appears to contradict this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions."). Also remove code that sets alignment for ARM functions, which is redundant with code in the MachineFunction constructor, and remove the hidden -arm-align-constant-islands flag, which has been enabled by default since r146739 (Dec 2011) and has probably received sufficient testing by now. Differential Revision: http://reviews.llvm.org/D9138 llvm-svn: 235636	2015-04-23 20:31:22 +00:00
Krzysztof Parzyszek	e568967986	[Hexagon] Fix compiler warnings in release build Patch by Aditya Nandakumar. llvm-svn: 235635	2015-04-23 20:26:21 +00:00
Jingyue Wu	3286ec1484	[NVPTX] run SeparateConstOffsetFromGEP before SLSR Summary: We pick this order because SeparateConstOffsetFromGEP may create more opportunities for SLSR. Test Plan: reassociate-geps-and-slsr.ll no performance regression on internal benchmarks Reviewers: meheff Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D9230 llvm-svn: 235632	2015-04-23 20:00:04 +00:00
Tom Stellard	d1f0f0268c	R600/SI: Add assembler support for all CI and VI VOP1 instructions llvm-svn: 235629	2015-04-23 19:33:54 +00:00
Tom Stellard	4b3e755480	R600/SI: v_mov_fed_b32 does not exist on VI llvm-svn: 235628	2015-04-23 19:33:52 +00:00
Tom Stellard	21cce29041	R600/SI: Use a better error message for unsupported instructions in the assembler llvm-svn: 235627	2015-04-23 19:33:51 +00:00
Tom Stellard	7130ef49cb	R600/SI: Improve AsmParser support for forced e64 encoding We can now force e64 encoding even when the operands would be legal for e32 encoding. llvm-svn: 235626	2015-04-23 19:33:48 +00:00
Hal Finkel	7c5cb066d0	[PowerPC] Enable printing instructions using aliases TableGen had been nicely generating code to print a number of instructions using shorter aliases (and PowerPC has plenty of short mnemonics), but we were not calling it. For some of the aliases we support in the parser, TableGen can't infer the "inverse" alias relationship, so there is still more to do. Thus, after some hours of updating test cases... llvm-svn: 235616	2015-04-23 18:30:38 +00:00
Pirama Arumuga Nainar	745615ca00	[AArch64] Add nvcast patterns for v4f16 and v8f16 Summary: Constant stores of f16 vectors can create NvCast nodes from various operand types to v4f16 or v8f16 depending on patterns in the stored constants. This patch adds nvcast rules with v4f16 and v8f16 values. AArchISelLowering::LowerBUILD_VECTOR has the details on which constant patterns generate the nvcast nodes. Reviewers: jmolloy, srhines, ab Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9201 llvm-svn: 235610	2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar	b18815354d	[AArch64] Handle vec4, vec8, vec16 *itofp for half Summary: Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32, v8i8, v8i16 inputs to allow promotion of v4f16 results. Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32, and i64 vectors. Only missing tests are for v16i8 and v16i16 as the shift operations are too complicated to write a proper check sequence. The conversions from v4i64 to v4f16 do not depend on this patch - v4i64 is split and the conversion gets handled while lowering v2i64. I am adding a test here for completeness. Reviewers: aemerson, rengolin, ab, jmolloy, srhines Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9166 llvm-svn: 235609	2015-04-23 17:16:27 +00:00
Hans Wennborg	0867b151c9	Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608	2015-04-23 16:45:24 +00:00
Krzysztof Parzyszek	876a19d855	[Hexagon] Shrink-wrap stack frame (Hexagon-specific) llvm-svn: 235603	2015-04-23 16:05:39 +00:00
Toma Tabacu	7fc89d2141	[mips] [IAS] Move NOP emission after pseudo-instruction expansion. NFC. As suggested in the review for http://reviews.llvm.org/D8537. llvm-svn: 235601	2015-04-23 14:48:38 +00:00
Aaron Ballman	0be238cebd	Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range. llvm-svn: 235597	2015-04-23 13:41:59 +00:00
Hans Wennborg	15823d49b6	Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) This is a re-commit of r235101, which also fixes the problems with the previous patch: - Switches with only a default case and non-fallthrough were handled incorrectly - The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here. > This is a major rewrite of the SelectionDAG switch lowering. The previous code > would lower switches as a binary tree, discovering clusters of cases > suitable for lowering by jump tables or bit tests as it went along. To increase > the likelihood of finding jump tables, the binary tree pivot was selected to > maximize case density on both sides of the pivot. > > By not selecting the pivot in the middle, the binary trees would not always > be balanced, leading to performance problems in the generated code. > > This patch rewrites the lowering to search for clusters of cases > suitable for jump tables or bit tests first, and then builds the binary > tree around those clusters. This way, the binary tree will always be balanced. > > This has the added benefit of decoupling the different aspects of the lowering: > tree building and jump table or bit tests finding are now easier to tweak > separately. > > For example, this will enable us to balance the tree based on profile info > in the future. > > The algorithm for finding jump tables is quadratic, whereas the previous algorithm > was O(n log n) for common cases, and quadratic only in the worst-case. This > doesn't seem to be a major problem in practice, e.g. compiling a file consisting > of a 10k-case switch was only 30% slower, and such large switches should be rare > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this > does turn out to be a problem, we could limit the search space of the algorithm. > > This commit also disables all optimizations during switch lowering in -O0. 
> > Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235560	2015-04-22 23:14:56 +00:00
Krzysztof Parzyszek	952d951418	[Hexagon] Some cleanup of instruction selection code llvm-svn: 235552	2015-04-22 21:17:00 +00:00
Krzysztof Parzyszek	cd97c985c7	[Hexagon] Use A2_tfrsi for constant pool and jump table addresses llvm-svn: 235535	2015-04-22 18:25:53 +00:00
Pete Cooper	037b700b7f	[AArch64] Use MachineRegisterInfo instead of LiveIntervals to calculate liveness. NFC. The CondOpt pass currently uses LiveIntervals to set the dead flag on a def. This patch uses MachineRegisterInfo::use_empty instead as that is equivalent to the def being dead. This removes an instance of LiveIntervals in the pass manager pipeline and saves 3.8% of compile time on llc compiled for AArch64. Reviewed by Chad Rosier and Zhaoshi. llvm-svn: 235532	2015-04-22 18:05:13 +00:00
Krzysztof Parzyszek	05902163b6	[Hexagon] Consider constant-extended offsets to be valid llvm-svn: 235529	2015-04-22 17:51:26 +00:00
Krzysztof Parzyszek	9ee04e401a	Fix Windows build break: use LLVM_FUNCTION_NAME instead of __func__. llvm-svn: 235525	2015-04-22 17:19:44 +00:00
Matt Arsenault	deaef8e24b	R600: Fix always inline pass breaking noinline functions No test since calls are not actually supported yet. llvm-svn: 235524	2015-04-22 17:10:44 +00:00
Krzysztof Parzyszek	4fa2a9f7fd	[Hexagon] Overhaul of stack object allocation - Use static allocation for aligned stack objects. - Simplify dynamic stack object allocation. - Simplify elimination of frame-indices. llvm-svn: 235521	2015-04-22 16:43:53 +00:00
Sanjay Patel	cab567873f	[x86] Add store-folded memop patterns for vcvtps2ph Differential Revision: http://reviews.llvm.org/D7296 llvm-svn: 235517	2015-04-22 16:11:19 +00:00
Krzysztof Parzyszek	6bbcb31fda	[Hexagon] Treat CFI as solo instructions llvm-svn: 235516	2015-04-22 15:47:35 +00:00
Krzysztof Parzyszek	badf3a6356	[Hexagon] Implement HexagonInstPrinter::printRegName llvm-svn: 235514	2015-04-22 15:38:17 +00:00
Andrea Di Biagio	6cd2f42fac	[X86][AVX] Fix failure due to a missing ISel pattern to select VBROADCAST nodes (PR23259). This fixes a regression introduced at revision 218263. On AVX, if we optimize for size, a splat build_vector of a load is lowered into a VBROADCAST node. This is done even if the value type of the splat build_vector node is v2i64. Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast where the scalar element comes from a loadi64/loadf64. However, revision 218263 forgot to add an extra fallback pattern for the case where we have a X86VBroadcast of a loadi64 with multiple uses. This patch adds the missing tablegen pattern in X86InstrSSE.td. This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern to select v2i64 X86ISD::BROADCAST nodes. llvm-svn: 235509	2015-04-22 14:53:39 +00:00
Zoran Jovanovic	b59a541926	[mips][microMIPSr6] Implement mips32 to microMIPSr6 mapping support Differential Revision: http://reviews.llvm.org/D8661 llvm-svn: 235505	2015-04-22 13:27:34 +00:00
Vasileios Kalintiris	e7508c9fc7	Revert "[mips][FastISel] Implement shift ops for Mips fast-isel." This reverts commit r235194. It was causing a failure in FastISel buildbots due to sign-extension issues. llvm-svn: 235495	2015-04-22 10:08:46 +00:00
James Molloy	cd2334e86e	[AArch64] Disable complex GEP optimization by default. Enough concerns were raised that this optimization is pessimising some code patterns. The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher. llvm-svn: 235491	2015-04-22 09:11:38 +00:00
Lang Hames	65613a634a	[patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and the X86 backend. The code generated for symbolic targets is identical to the code generated for constant targets, except that a relocation is emitted to fix up the actual target address at link-time. This allows IR and object files containing patchpoints to be cached across JIT-invocations where the target address may change. llvm-svn: 235483	2015-04-22 06:02:31 +00:00
Sanjay Patel	fe1365ac50	[x86] allow 64-bit extracted vector element integer stores on a 32-bit system With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system even though 64-bit integers are not legal types. So instead of producing this: pshufd $229, %xmm0, %xmm1 ## xmm1 = xmm0[1,1,2,3] movd %xmm0, (%eax) movd %xmm1, 4(%eax) We can do: movq %xmm0, (%eax) This is a fix for the problem noted in D7296. Differential Revision: http://reviews.llvm.org/D9134 llvm-svn: 235460	2015-04-22 00:24:30 +00:00
Krzysztof Parzyszek	499bc5faa1	[Hexagon] Patterns for frame index with offset for isel llvm-svn: 235418	2015-04-21 21:28:03 +00:00
Jingyue Wu	66a161f05e	[NVPTX] do not run DCE after SLSR and SeparateConstOffsetFromGEP Summary: With D9096 and D9101, there's no need to run DCE after SLSR and SeparateConstOffsetFromGEP. Test Plan: no regression Reviewers: jholewinski, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9172 llvm-svn: 235415	2015-04-21 20:47:15 +00:00
Matthias Braun	9e9e8b3230	X86: Match for X86ISD nodes in LowerBUILD_VECTOR instead of BUILD_VECTORCombine There doesn't seem to be a reason to perform this target ISD node matching in a DAGCombine, moving it to lowering fixes PR23296. Differential Revision: http://reviews.llvm.org/D9137 llvm-svn: 235394	2015-04-21 17:21:36 +00:00
Elena Demikhovsky	0e6d6d54ce	AVX-512: Added VPMOVx2M instructions for SKX, fixed encoding of VPMOVM2x. llvm-svn: 235385	2015-04-21 14:38:31 +00:00
Elena Demikhovsky	431b81e41f	AVX-512: Added VPTESTM and VPTESTNM instructions for SKX llvm-svn: 235383	2015-04-21 13:13:46 +00:00
Toma Tabacu	11e14a9467	[mips] [IAS] Implement the .asciiz directive. Summary: This directive is exactly the same as .asciz, except it's only used by MIPS. It is used to store null terminated strings in object files. Reviewers: rafael, dsanders, echristo Reviewed By: dsanders, echristo Subscribers: echristo, llvm-commits Differential Revision: http://reviews.llvm.org/D7530 llvm-svn: 235382	2015-04-21 11:50:52 +00:00
Jozef Kolek	8e086cedfa	[mips][microMIPSr6] Implement CACHE and PREF instructions Implement CACHE and PREF instructions using mapping. Differential Revision: http://reviews.llvm.org/D8893 llvm-svn: 235379	2015-04-21 11:17:25 +00:00
Vasileios Kalintiris	41b0100dea	[mips] Cleanup old floating-point flag conditions definitions. NFC. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D7947 llvm-svn: 235377	2015-04-21 10:53:57 +00:00
Vasileios Kalintiris	32177d6bec	[mips] Optimize code generation for 64-bit variable shift instructions. Summary: The 64-bit version of the variable shift instructions uses the shift_rotate_reg class which uses a GPR32Opnd to specify the variable shift amount. With this patch we avoid the generation of a redundant SLL instruction for the variable shift instructions in 64-bit targets. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7413 llvm-svn: 235376	2015-04-21 10:49:03 +00:00
Elena Demikhovsky	50b88ddb87	AVX-512: Added logical and arithmetic instructions for SKX by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 235375	2015-04-21 10:27:40 +00:00
Simon Pilgrim	398ce22b86	[X86][SSE] Provide execution domains for scalar floating point operations This is an updated version of Chandler's patch D7402 that got accepted but never committed, and has bit-rotted a bit since. I've updated the execution domain declarations to match the approach of the packed templates and also added some extra scalar unary tests. Differential Revision: http://reviews.llvm.org/D9095 llvm-svn: 235372	2015-04-21 08:40:22 +00:00
Matthias Braun	b6b5aaad98	X86: Do not select X86 custom vector nodes if operand types don't match X86ISD::ADDSUB, X86ISD::(F)HADD, X86ISD::(F)HSUB should not be selected if the operand types do not match the result type because vector type legalization cannot deal with this for custom nodes. Testcase X86ISD::ADDSUB is attached. I could not create a testcase for the FHADD/FHSUB cases because of: https://llvm.org/bugs/show_bug.cgi?id=23296 Differential Revision: http://reviews.llvm.org/D9120 llvm-svn: 235367	2015-04-21 01:13:41 +00:00
Pirama Arumuga Nainar	34056dea1b	[MIPS] OperationAction for FP_TO_FP16, FP16_TO_FP Summary: Set operation action for FP16 conversion opcodes, so the Op legalizer can choose the gnu_* libcalls for Mips. Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to prevent (fpext (load )) and (store (fptrunc)) from getting combined into unsupported operations. Added test cases to test that these operations are handled correctly for f16 scalars and vectors. This patch depends on http://reviews.llvm.org/D8755. Reviewers: srhines Subscribers: llvm-commits, ab Differential Revision: http://reviews.llvm.org/D8804 llvm-svn: 235341	2015-04-20 20:15:36 +00:00
Jozef Kolek	207d248eba	[mips][microMIPSr6] Implement BITSWAP instruction Implement BITSWAP instruction using mapping. Differential Revision: http://reviews.llvm.org/D8857 llvm-svn: 235321	2015-04-20 18:14:59 +00:00
Vladimir Sukharev	bad1d1dc02	[AArch64] LORID_EL1 register must be treated as read-only Patch by: John Brawn Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9105 llvm-svn: 235314	2015-04-20 16:54:37 +00:00
Bill Schmidt	6779075c44	[PowerPC] Flow oversized lines for r235309 llvm-svn: 235310	2015-04-20 15:58:46 +00:00
Bill Schmidt	1962f709c7	[PowerPC] Add future work for vector insert/extract to README_ALTIVEC.txt llvm-svn: 235309	2015-04-20 15:54:26 +00:00
Jozef Kolek	676d60125c	[mips][microMIPSr6] Implement disassembler support Implement disassembler support for microMIPS32r6. Differential Revision: http://reviews.llvm.org/D8490 llvm-svn: 235307	2015-04-20 14:40:38 +00:00
Jozef Kolek	5de4a6c0af	[mips][microMIPSr6] Implement BALC and BC instructions This patch implements BALC and BC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8388 llvm-svn: 235302	2015-04-20 13:04:14 +00:00
Jozef Kolek	6ca13eaf82	[mips][microMIPSr6] Implement initial mapping support Differential Revision: http://reviews.llvm.org/D8387 llvm-svn: 235298	2015-04-20 12:42:08 +00:00
Jozef Kolek	c22555d977	[mips][microMIPSr6] Implement initial subtarget support Differential Revision: http://reviews.llvm.org/D8386 llvm-svn: 235296	2015-04-20 12:23:06 +00:00
Andrea Di Biagio	98c367093d	[X86][FastIsel] Fix assertion failure when selecting int-to-double conversion (PR23273). This fixes a regression introduced at revision 231243. The target-independent selection algorithm in FastISel knows how to select a SINT_TO_FP if the target is SSE but not AVX. That is because on X86, the tablegen'd 'fastEmit' functions know how to select CVTSI2SSrr and CVTSI2SDrr. Method X86FastISel::X86SelectSIToFP was therefore working under the wrong assumption that the target was AVX. That assumption was incorrect since we can have a target that is neither AVX nor SSE. So, rather than asserting for the presence of AVX, we should have had an early exit from 'X86SelectSIToFP' if the target was not AVX. This patch fixes the issue replacing the invalid assertion with an early exit. Thanks to Dimitry Andric for reporting this problem and for providing a small reproducible testcase. Added test pr23273.ll. llvm-svn: 235295	2015-04-20 11:56:59 +00:00
Simon Pilgrim	749953eebb	[X86][SSE] Fix for getScalarValueForVectorElement to detect scalar sources requiring truncation. The fix ensures that scalar sources inserted into a vector are the correct bit size. Integer scalar sources from BUILD_VECTOR and SCALAR_TO_VECTOR nodes may require truncation that this function doesn't currently support. llvm-svn: 235281	2015-04-19 22:16:49 +00:00
Craig Topper	43d413b698	Remove unnecessary include and probably a layering violation. llvm-svn: 235262	2015-04-19 00:57:33 +00:00
Ahmed Bougacha	e14a4d487e	[AArch64] Don't force MVT::Untyped when selecting LD1LANEpost. The result is either an Untyped reg sequence, on ldN with N > 1, or just the type of the input vector, on ld1. Don't force Untyped. Instead, just use the type of the reg sequence. This mirrors the behavior of createTuple, which feeds the LD1*_POST. The narrow code path wasn't actually covered by tests, because V64 insert_vector_elt are widened to V128 before the LD1LANEpost combine has the chance to run, usually. The only case where it does run on V64 vectors is if the vector ops legalizer ran. So, tickle the code with a ctpop. Fixes PR23265. llvm-svn: 235243	2015-04-17 23:43:33 +00:00
Ahmed Bougacha	2448ef5f33	[AArch64] Avoid vector->load dependency cycles when creating LD1post. They would break the SelectionDAG. Note that the opposite load->vector dependency is already obvious in: (LD1post vec, ..) llvm-svn: 235224	2015-04-17 21:02:30 +00:00
Vasileios Kalintiris	816ea84e7a	[mips][FastISel] Implement FastMaterializeAlloca in Mips fast-isel. Summary: Implement the method FastMaterializeAlloca in Mips fast-isel Based on a patch by Reed Kotler. Test Plan: Passes test-suite at O0/O2 for mips32 r1/r2 fastalloca.ll Reviewers: dsanders, rkotler Subscribers: rfuhler, llvm-commits Differential Revision: http://reviews.llvm.org/D6742 llvm-svn: 235213	2015-04-17 17:29:58 +00:00
Sanjay Patel	2161c49a4e	[X86, AVX] add an exedepfix entry for vmovq == vmovlps == vmovlpd This is the AVX extension of r235014: http://llvm.org/viewvc/llvm-project?view=revision&revision=235014 Review: http://reviews.llvm.org/D8691 llvm-svn: 235210	2015-04-17 17:02:37 +00:00
Vasileios Kalintiris	a4035e6284	[mips][FastISel] Implement shift ops for Mips fast-isel. Summary: Add shift operators implementation to fast-isel for Mips. These are shift ops for non legal forms, i.e. i8 and i16. Based on a patch by Reed Kotler. Test Plan: Reviewers: dsanders Subscribers: echristo, rfuhler, llvm-commits Differential Revision: http://reviews.llvm.org/D6726 llvm-svn: 235194	2015-04-17 14:29:21 +00:00
Rafael Espindola	7f4e07befc	Move AliasedSymbol to MachObjectWriter. It was only used by MachO. Part of pr19627. llvm-svn: 235185	2015-04-17 12:28:43 +00:00
Vasileios Kalintiris	bb60cfb5c4	[mips] Teach the delay slot filler to remove needless KILL instructions. Summary: Previously, the presence of KILL instructions would block valid candidates from filling a specific delay slot. With the elimination of the KILL instructions, in the appropriate range, we are able to fill more slots and keep the information from future def/use analysis consistent. Reviewers: dsanders Reviewed By: dsanders Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D7724 llvm-svn: 235183	2015-04-17 12:01:02 +00:00
Benjamin Kramer	97fbdd5a39	[mc] Clean up emission of byte sequences No functional change intended. llvm-svn: 235178	2015-04-17 11:12:43 +00:00
Daniel Sanders	81eb66c992	[mips] Move ABI-dependent register selections to MipsABIInfo. NFC. Summary: For example, a common idiom was 'isN64 ? Mips::SP_64 : Mips::SP'. This has been moved to MipsABIInfo and replaced with 'ABI.GetStackPtr()'. There are others that should also be moved. This patch sticks to the ones that are obviously non-functional. The others have minor mistakes that need fixing at the same time, mostly involving checks for 64-bit GPR's instead of checks for 64-bit pointers. Reviewers: tomatabacu Reviewed By: tomatabacu Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8972 llvm-svn: 235173	2015-04-17 09:50:21 +00:00
Ahmed Bougacha	941420d9ea	[AArch64] Don't assert on f16 in DUP PerfectShuffle generator. Found by code inspection, but breaking i16 at least breaks other tests. They aren't checking this in particular though, so also add some explicit tests for the already working types. llvm-svn: 235148	2015-04-16 23:57:07 +00:00
Pete Cooper	19d704d13c	Disable AArch64 fast-isel on big-endian call vector returns. A big-endian vector return needs a byte-swap which we aren't doing right now. For now just bail on these cases to get correctness back. llvm-svn: 235133	2015-04-16 21:19:36 +00:00
Vladimir Sukharev	6334cf3d69	[AArch64] Add v8.1a "Virtualization Host Extensions" Reviewers: t.p.northover Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8500 Patch by: Tom Coxon llvm-svn: 235107	2015-04-16 15:38:58 +00:00
Vladimir Sukharev	d49cb8fdd7	[AArch64] Add v8.1a "Limited Ordering Regions" extension Reviewers: t.p.northover Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8499 Patch by: Tom Coxon llvm-svn: 235105	2015-04-16 15:30:43 +00:00
Vladimir Sukharev	251ce0c2db	[AArch64] Add v8.1a "Privileged Access Never" extension Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8498 llvm-svn: 235104	2015-04-16 15:20:51 +00:00
Vladimir Sukharev	a11db3eb88	[AArch64] Handle Cyclone-specific register in common way Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8584 Patch by: Tom Coxon llvm-svn: 235102	2015-04-16 15:01:20 +00:00
Vladimir Sukharev	950b606a2b	[AArch64] Follow-up to: Refactor AArch64NamedImmMapper to become dependent on subtarget features Fixed compilation with clang on some buildbots with "-Werror -Wmissing-field-initializers" Related to: http://reviews.llvm.org/rL235089 llvm-svn: 235099	2015-04-16 14:36:13 +00:00
Toma Tabacu	2cc44f50a5	[mips] [IAS] Preserve microMIPS label marking for objects when assigning. Summary: Previously, this was only happening for functions, but because of .insn, objects can also be marked now. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8007 llvm-svn: 235095	2015-04-16 13:37:32 +00:00
Benjamin Kramer	727b505161	[Mips] Use unique_ptr to manage ownership. Required some tweaking of ValueMap to accommodate a move-only value type. No functional change intended. llvm-svn: 235091	2015-04-16 12:43:33 +00:00
Benjamin Kramer	90a84a33f6	Make it obvious that we're iterating over a range of pointers. Found by -Wrange-loop-analysis. llvm-svn: 235090	2015-04-16 12:43:07 +00:00
Vladimir Sukharev	a98f6897a2	[AArch64] Refactor AArch64NamedImmMapper to become dependent on subtarget features. In order to introduce v8.1a-specific entities, Mappers should be aware of SubtargetFeatures available. This patch introduces refactoring, that will then allow to easily introduce: - v8.1-specific "pan" PState for PStateMapper (PAN extension) - v8.1-specific sysregs for SysRegMapper (LOR,VHE extensions) Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8496 Patch by Tom Coxon llvm-svn: 235089	2015-04-16 12:15:27 +00:00
James Molloy	f8aa57aa3b	[AArch64] Fix invalid use of references to BuildMI. This was found in GCC PR65773 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773). We shouldn't be taking a reference to the temporary that BuildMI returns, we must copy it. llvm-svn: 235088	2015-04-16 11:37:40 +00:00
Vladimir Sukharev	0e0f8d2c1f	[ARM] Add v8.1a "Privileged Access Never" extension Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8504 llvm-svn: 235087	2015-04-16 11:34:25 +00:00
Toma Tabacu	9ca5096f59	[mips] [IAS] Add support for the .insn directive. Summary: This assembler directive marks the current label as an instruction label in microMIPS and MIPS16. This initial implementation works only for microMIPS. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8006 llvm-svn: 235084	2015-04-16 09:53:47 +00:00
Duncan P. N. Exon Smith	b273d06b63	DebugInfo: Gut DIScope, DIEnumerator and DISubrange The only class that still has API left is `DIDescriptor` itself. llvm-svn: 235067	2015-04-16 01:37:00 +00:00
Duncan P. N. Exon Smith	35ef22cf53	DebugInfo: Gut DICompileUnit and DIFile Continuing gutting `DIDescriptor` subclasses; this edition, `DICompileUnit` and `DIFile`. In the name of PR23080. llvm-svn: 235055	2015-04-15 23:19:27 +00:00
Charlie Turner	6f13d0ca84	Fix BXJ is undefined in AArch32. BXJ was incorrectly said to be unsupported in ARMv8-A. It is not supported in the A64 instruction set, but it is supported in the T32 and A32 instruction sets, because it's listed as an instruction in the ARM ARM section F7.1.28. Using SP as an operand to BXJ changed from UNPREDICTABLE to PREDICTABLE in v8-A. This patch reflects that update as well. This was found by MCHammer. llvm-svn: 235024	2015-04-15 17:28:23 +00:00
Sanjay Patel	c03d93baa0	[X86] add an exedepfix entry for movq == movlps == movlpd This is a 1-line patch (with a TODO for AVX because that will affect even more regression tests) that lets us substitute the appropriate 64-bit store for the float/double/int domains. It's not clear to me exactly what the difference is between the 0xD6 (MOVPQI2QImr) and 0x7E (MOVSDto64mr) opcodes, but this is apparently the right choice. Differential Revision: http://reviews.llvm.org/D8691 llvm-svn: 235014	2015-04-15 15:47:51 +00:00
Sanjay Patel	7024b8121a	[x86] Implement combineRepeatedFPDivisors Set the transform bar at 2 divisions because the fastest current x86 FP divider circuit is in SandyBridge / Haswell at 10 cycle latency (best case) relative to a 5 cycle multiplier. So that's the worst case for this transform (no latency win), but multiplies are obviously pipelined while divisions are not, so there's still a big throughput win which we would expect to show up in typical FP code. These are the sequences I'm comparing: divss %xmm2, %xmm0 mulss %xmm1, %xmm0 divss %xmm2, %xmm0 Becomes: movss LCPI0_0(%rip), %xmm3 ## xmm3 = mem[0],zero,zero,zero divss %xmm2, %xmm3 mulss %xmm3, %xmm0 mulss %xmm1, %xmm0 mulss %xmm3, %xmm0 [Ignore for the moment that we don't optimize the chain of 3 multiplies into 2 independent fmuls followed by 1 dependent fmul...this is the DAG version of: https://llvm.org/bugs/show_bug.cgi?id=21768 ...if we fix that, then the transform becomes even more profitable on all targets.] Differential Revision: http://reviews.llvm.org/D8941 llvm-svn: 235012	2015-04-15 15:22:55 +00:00
Daniel Sanders	93ea6ab136	[msp430] Only support the 'm' inline assembly memory constraint. NFC. Summary: MSP430 doesn't seem to have any additional constraints. Therefore remove the target hook. No functional change intended. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8208 llvm-svn: 235003	2015-04-15 12:51:28 +00:00
Toma Tabacu	89a712b0be	[mips] [IAS] Refactor the function which checks for the availability of AT. NFC. Summary: Refactor MipsAsmParser::getATReg to return an internal register number instead of a register index. Also change all the int's to unsigned, seeing as the current AT register index is stored as an unsigned in MipsAssemblerOptions. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8478 llvm-svn: 234996	2015-04-15 10:48:56 +00:00
Alexei Starovoitov	1a8102e020	[bpf] fix build fix build due to refactoring in DIL/MDL and raw_pwrite_stream llvm-svn: 234971	2015-04-15 02:48:57 +00:00
Richard Trieu	6b1aa5f5e1	Change range-based for-loops to be -Wrange-loop-analysis clean. No functionality change. llvm-svn: 234963	2015-04-15 01:21:15 +00:00
Rafael Espindola	5560a4cfbd	Use raw_pwrite_stream in the object writer/streamer. The ELF object writer will take advantage of that in the next commit. llvm-svn: 234950	2015-04-14 22:14:34 +00:00
Ed Maste	8ed40ce56d	Correct 'teh' and other typos / repeated words. Patch by Eitan Adler. Differential Revision: http://reviews.llvm.org/D8514 llvm-svn: 234939	2015-04-14 20:52:58 +00:00
Alexander Kornienko	fb37cfa346	Refactor: Simplify boolean expressions in ARM target Simplify boolean expressions using `true` and `false` with `clang-tidy` http://reviews.llvm.org/D8524 Patch by Richard Thomson! llvm-svn: 234901	2015-04-14 15:32:58 +00:00
Bradley Smith	b913653b91	[AArch64] Allow non-standard INS/DUP encodings The ARMv8 ARMARM states that for these instructions in A64 state: "Unspecified bits in "imm5" are ignored but should be set to zero by an assembler.", (imm4 for INS). Make the disassembler accept any encoding with these ignored bits set to 1. llvm-svn: 234896	2015-04-14 15:07:26 +00:00
Tom Stellard	d4a1950500	R600/SI: Fix verifier error caused by SIAnnotateControlFlow This pass will always try to insert llvm.SI.ifbreak intrinsics in the same block that its conditional value is computed in. This is a problem when conditions for breaks or continue are computed outside of the loop, because the llvm.SI.ifbreak intrinsic ends up being inserted outside of the loop. This patch fixes this problem by inserting the llvm.SI.ifbreak intrinsics in the loop header when the condition is computed outside the loop. llvm-svn: 234891	2015-04-14 14:36:45 +00:00

33113 Commits