llvm-project

Commit Graph

Author	SHA1	Message	Date
Derek Schuff	9769debf88	[WebAssembly] Implement prolog/epilog insertion and FrameIndex elimination Summary: Use the SP32 physical register as the base for FrameIndex lowering. Update it and the __stack_pointer global var in the prolog and epilog. Extend the mapping of virtual registers to wasm locals to include the physical registers. Rather than modify the target-independent PrologEpilogInserter (which asserts that there are no virtual registers left) include a slightly-modified copy for Wasm that does not have this assertion and only clears the virtual registers if scavenging was needed (which of course it isn't for wasm). Differential Revision: http://reviews.llvm.org/D15344 llvm-svn: 255392	2015-12-11 23:49:46 +00:00
Chen Li	e8f9387e0c	[X86ISelLowering] Add additional support for multiplication-to-shift conversion. Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM already supports this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this. Reviewers: craig.topper, RKSimon Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14603 llvm-svn: 255391	2015-12-11 23:39:32 +00:00
Hal Finkel	cd8664c3c2	Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics After much discussion, ending here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work. r248483 - Codegen: Fix llvm.*absdiff semantic. r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation llvm-svn: 255387	2015-12-11 23:11:52 +00:00
Matthias Braun	60d69e2865	CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness() computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analyzePhysRegs() are hard to understand with respect to subregisters. This commit: Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, also renames the fields to avoid silent breakage of third-party code (and improve the grammar). Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling it and removing workarounds for the bug. This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033 Differential Revision: http://reviews.llvm.org/D15320 llvm-svn: 255362	2015-12-11 19:42:09 +00:00
Matt Arsenault	fbd9bbfda3	Start replacing vector_extract/vector_insert with extractelt/insertelt These are redundant pairs of nodes defined for INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT. insertelement/extractelement are slightly closer to the corresponding C++ node name, and has stricter type checking so prefer it. Update targets to only use these nodes where it is trivial to do so. AArch64, ARM, and Mips all have various type errors on simple replacement, so they will need work to fix. Example from AArch64: def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8), (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>; Which is trying to do sext_inreg i8, i8. llvm-svn: 255359	2015-12-11 19:20:16 +00:00
Derek Schuff	5a14306323	[WebAssembly] Fix ADJCALLSTACKDOWN/UP use/defs Summary: ADJCALLSTACK{DOWN,UP} (aka CALLSEQ_{START,END}) MIs are supposed to use and def the stack pointer. Since they do not, all the nodes are being eliminated by DeadMachineInstructionElim, so they aren't in the IR when PrologEpilogInserter/eliminateCallFramePseudo needs them. This change fixes that, but since RegStackify will not stackify across them (and it runs early, before PEI), change LowerCall to only emit them when the call frame size is > 0. That makes the current code work the same way and makes code handled by D15344 also work the same way. We can expand the condition beyond NumBytes > 0 in the future if needed. Reviewers: sunfish, jfb Subscribers: jfb, dschuff, llvm-commits Differential Revision: http://reviews.llvm.org/D15459 llvm-svn: 255356	2015-12-11 18:55:34 +00:00
Hans Wennborg	a8e6b3ecb7	Fix build after r255319. llvm-svn: 255322	2015-12-11 00:58:32 +00:00
Kyle Butt	1452b76f1f	[PPC]: Peephole optimize small accesses to aligned globals. Access to aligned globals gives us a chance to peephole optimize nonzero offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't overflow the available displacement. For example: addis 3, 2, b4v@toc@ha addi 4, 3, b4v@toc@l lbz 5, b4v@toc@l(3) ; This is the result of the current peephole lbz 6, 1(4) ; optimizer lbz 7, 2(4) lbz 8, 3(4) If b4v is 4-byte aligned, we can skip using register 4 because we know that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate: addis 3, 2, b4v@toc@ha lbz 4, b4v@toc@l(3) lbz 5, b4v@toc@l+1(3) lbz 6, b4v@toc@l+2(3) lbz 7, b4v@toc@l+3(3) Saving a register and an addition. Larger alignments allow larger structures/arrays to be optimized. llvm-svn: 255319	2015-12-11 00:47:36 +00:00
Cong Hou	59898d8c68	[X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1. Previously in the conversion cost table there are no entries for integer-integer conversions on SSE2. This will result in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers are counted from the result of running llc on the new test case in this patch. Differential revision: http://reviews.llvm.org/D15132 llvm-svn: 255315	2015-12-11 00:31:39 +00:00
Kyle Butt	28b01a51b3	PPC: Teach FMA mutate to respect register classes. This was causing bad code gen and assembly that won't assemble, as mixed altivec and vsx code would end up with a vsx high register assigned to an altivec instruction, which won't work. Constraining the classes allows the optimization to proceed. llvm-svn: 255299	2015-12-10 21:28:40 +00:00
Pirama Arumuga Nainar	1317d5f311	Fix fptosi, fptoui from f16 vectors to i8, i16 vectors Summary: Convert f16 vectors to corresponding f32 vectors before doing the conversion to int. Add tests for v4f16, v8f16. Reviewers: ab, jmolloy Subscribers: llvm-commits, srhines Differential Revision: http://reviews.llvm.org/D14936 llvm-svn: 255263	2015-12-10 17:16:49 +00:00
Dan Gohman	b949b9c01b	[WebAssembly] Make WebAssemblyStoreResults only return true when it has a change. llvm-svn: 255253	2015-12-10 14:17:36 +00:00
Dan Gohman	a87629d6d7	[WebAssembly] Fix WebAssemblyPeephole to set Changed to true when making changes. llvm-svn: 255252	2015-12-10 14:16:34 +00:00
Dan Gohman	acc0941bd1	[WebAssembly] Declare that WebAssemblyPeephole does not modify the CFG. llvm-svn: 255251	2015-12-10 14:12:04 +00:00
Dan Gohman	6d63f96749	[WebAssembly] Remove an unneeded getAnalysisUsage override. llvm-svn: 255250	2015-12-10 14:10:04 +00:00
Nemanja Ivanovic	ac8d01add0	Bitcasts between FP and INT values using direct moves This patch corresponds to review: http://reviews.llvm.org/D15286 LLVM IR frequently contains bitcast operations between floating point and integer values of the same width. Doing this through memory operations is quite expensive on PPC. This patch allows the use of direct register moves between FPRs and GPRs for lowering bitcasts. llvm-svn: 255246	2015-12-10 13:35:28 +00:00
Jonas Paulsson	e451eeff5c	[PostRA scheduling] Allow a target to do scheduling when it wants post RA. SystemZ needs to do its scheduling after branch relaxation, which can only happen after block placement, and therefore the standard PostRAScheduler point in the pass sequence is too early. TargetMachine::targetSchedulesPostRAScheduling() is a new method that signals on returning true that target will insert the final scheduling pass on its own. Reviewed by Hal Finkel llvm-svn: 255234	2015-12-10 09:10:07 +00:00
Craig Topper	8e44b9a4d1	[X86] Fix a couple cases where bitwise and logical operations were being mixed. NFC llvm-svn: 255224	2015-12-10 06:09:41 +00:00
Dan Gohman	f170ba08af	[WebAssembly] Implement mixed-type ISD::FCOPYSIGN. ISD::FCOPYSIGN permits its operands to have differing types, and DAGCombiner uses this. Add some def : Pat rules to expand this out into an explicit conversion and a normal copysign operation. llvm-svn: 255220	2015-12-10 04:55:31 +00:00
Dan Gohman	9341c1d4b3	[WebAssembly] Implement fma. It is lowered to a libcall for now, but this is expected to change in the future. llvm-svn: 255219	2015-12-10 04:52:33 +00:00
Tom Stellard	c2d654322b	AMDGPU/SI: Fix warning introduced by r255204 llvm-svn: 255205	2015-12-10 03:10:46 +00:00
Tom Stellard	c93fc11f36	AMDGPU/SI: Emit constant arrays in the .text section Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204	2015-12-10 02:13:01 +00:00
Tom Stellard	b3c3bda512	AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs. Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15342 llvm-svn: 255203	2015-12-10 02:12:53 +00:00
Dan Gohman	60bddf17c5	[WebAssembly] Fix legalization of f32->f64 EXTLOAD. llvm-svn: 255202	2015-12-10 02:07:53 +00:00
Derek Schuff	6fd28dfe5d	[WebAssembly] Update known test failures We can now select sign_extend_inreg llvm-svn: 255197	2015-12-10 01:09:40 +00:00
Dan Gohman	a5603b835b	[WebAssembly] Also legalize sign_extend_inreg of i32->i64. llvm-svn: 255191	2015-12-10 01:00:19 +00:00
Derek Schuff	71d0eae609	[WebAssembly] Update test failure expectations llvm-svn: 255190	2015-12-10 00:56:18 +00:00
Dan Gohman	a8483755d3	[WebAssembly] Fix legalization of shift operators with illegal types. llvm-svn: 255181	2015-12-10 00:26:26 +00:00
Dan Gohman	7935fa3d1b	[WebAssembly] Fix copy+pastos. llvm-svn: 255180	2015-12-10 00:22:40 +00:00
Dan Gohman	df00a9ebc2	[WebAssembly] Implement anyext. llvm-svn: 255179	2015-12-10 00:17:35 +00:00
Quentin Colombet	5d2f7cfd44	[X86] Enable shrink-wrapping by default, but keep it disabled for stack frames without a frame pointer when unwind may happen. This is a workaround for a bug in the way we emit the CFI directives for frameless unwind information. See PR25614. llvm-svn: 255175	2015-12-09 23:08:18 +00:00
Dan Gohman	1cf96c0c34	[WebAssembly] Reintroduce ARGUMENT moving logic Reintroduce the code for moving ARGUMENTS back to the top of the basic block. While the ARGUMENTS physical register prevents sinking and scheduling from moving them, it does not appear to be sufficient to prevent SelectionDAG from moving them down in the initial schedule. This patch introduces a pass that moves them back to the top immediately after SelectionDAG runs. This is still hopefully a temporary solution. http://reviews.llvm.org/D14750 is one alternative, though the review has not been favorable, and proposed alternatives are longer-term and have other downsides. This fixes the main outstanding -verify-machineinstrs failures, so it adds -verify-machineinstrs to several tests. Differential Revision: http://reviews.llvm.org/D15377 llvm-svn: 255125	2015-12-09 16:23:59 +00:00
Tim Northover	d91d635b36	ARM: don't use a deleted node as the BaseReg in complex pattern. We mutated the DAG, which invalidated the node we were trying to use as a base register. Sometimes we got away with it, but other times the node really did get deleted before it was finished with. Should fix PR25733 llvm-svn: 255120	2015-12-09 15:54:50 +00:00
JF Bastien	88f8014e8e	WebAssembly: add missing failure to the list. llvm-svn: 255119	2015-12-09 15:52:57 +00:00
Oliver Stannard	86f729296a	[AArch64] Fix FP16 vector instructions that should only accept low registers llvm-svn: 255113	2015-12-09 14:32:11 +00:00
Daniel Sanders	3c7223133d	[mips][ias] Range check uimm10 operands Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15229 llvm-svn: 255112	2015-12-09 13:48:05 +00:00
JF Bastien	c2b30484ae	WebAssembly: add known failures The bots are now running the torture tests properly. Bin all failures from the GCC C torture tests so that we can tackle failures and make the tree go red on regressions. llvm-svn: 255111	2015-12-09 13:29:32 +00:00
Vasileios Kalintiris	ddf7e6885a	[mips] Use multiclass patterns for f32/f64 comparisons and i32 selects. Summary: Although the multiclass for i32 selects might seem redundant as it has only one instantiation, we will use it to replace the correspondent patterns in Mips64r6InstrInfo.td in follow-up commits. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14612 llvm-svn: 255110	2015-12-09 13:24:22 +00:00
Zlatko Buljan	48f1f39bfe	Revert r254897 "[mips][microMIPS] Implement LH, LHE, LHU and LHUE instructions" The committed patch was intended to implement the LH, LHE, LHU and LHUE instructions. After the commit, the test-suite failed with an error message of the form: fatal error: error in backend: Cannot select: t124: i32,ch = load<LD2[%d](tbaa=<0x94acc48>), sext from i16> t0, t2, undef:i32 For that reason I decided to revert commit r254897 and prepare a new patch which, besides the implementation and standard regression tests, will also have dedicated tests (CodeGen) for the above error. llvm-svn: 255109	2015-12-09 13:07:45 +00:00
Ahmed Bougacha	97564c3a1b	[AArch64][ARM] Don't base interleaved op legality on type alloc size. Otherwise, we think that most types that look like they'd fit in a legal vector type are legal (so, basically, any vector type with a size between 33 and 128 bits, I think, since we use pow2 alignment; e.g., v2i25, v3f32, ...). DataLayout::getTypeAllocSize rounds up based on alignment. When checking for target intrinsic legality, that's not what we want: if rounding makes a difference, the type isn't legal, and the target intrinsics shouldn't be used, as they are always assumed legal. One could make the argument that alloc size is ultimately the most relevant here, since we're dealing with LD/ST intrinsics. That's only true if we did legalize them though; that's a problem for another day. Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits. Type::getSizeInBits can't be used because that'd gratuitously break pointer vector support. Some of these uses are currently fine, because we only hit them when the type is already known legal (e.g., r114454). Update them for consistency. It's faster to avoid the rounding anyway! llvm-svn: 255089	2015-12-09 01:19:50 +00:00
Vyacheslav Klochkov	a3cd08b05c	X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes. Reviewer: Simon Pilgrim. Differential Revision: http://reviews.llvm.org/D15317 llvm-svn: 255080	2015-12-09 00:12:13 +00:00
Pirama Arumuga Nainar	e6ccd7b66a	Define selection for v4f16, v8f16 scalar_to_vector Summary: This fixes failure when trying to select insertelement <4 x half> undef, half %a, i64 0 which gets transformed to a scalar_to_vector node. The accompanying v4 and v8 tests fail instruction selection without this patch. Reviewers: ab, jmolloy Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D15322 llvm-svn: 255072	2015-12-08 23:07:06 +00:00
Simon Pilgrim	323e00d9c7	[X86][AVX] Fold loads + splats into broadcast instructions On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector. This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element. Fix for PR23022 Differential Revision: http://reviews.llvm.org/D15310 llvm-svn: 255061	2015-12-08 22:17:11 +00:00
Artyom Skrobov	0a37b80bcb	Fix ARMv4T (Thumb1) epilogue generation Summary: Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986; so we need the special fixup in the epilogue. Reviewers: jroelofs, qcolombet Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15126 llvm-svn: 255047	2015-12-08 19:59:01 +00:00
Tim Northover	614e8ff855	X86: produce more friendly errors during MachO relocation handling llvm-svn: 255036	2015-12-08 18:31:35 +00:00
Renato Golin	412ee3d45d	[ARM] Allowing SP/PC for AND/BIC mod_imm_not AND/BIC instructions do accept SP/PC, so the register class should be more generic (rGPR -> GPR) to cope with that case. Adding more tests. llvm-svn: 255034	2015-12-08 18:10:58 +00:00
Ron Lieberman	e6540e244a	[Hexagon] Add NewValueJump support for C4_cmpneq, C4_cmplte, C4_cmplteu llvm-svn: 255027	2015-12-08 16:28:32 +00:00
Daniel Sanders	106d2d4693	[mips][ias] Range check uimm8 operands Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D15226 llvm-svn: 255018	2015-12-08 14:42:10 +00:00
Daniel Sanders	59d092f883	[mips][ias] Range check uimm6 operands and fix a bug this revealed. Summary: We don't check the size operand on ext/dext/ins/dins yet because the permitted range depends on the pos argument and we can't check that using this mechanism. The bug was that dextu/dinsu accepted 0..31 in the pos operand instead of 32..63. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D15190 llvm-svn: 255015	2015-12-08 13:49:19 +00:00
Oliver Stannard	e4c3d21ea6	[AArch64] Add ARMv8.2-A FP16 vector instructions ARMv8.2-A adds 16-bit floating point versions of all existing SIMD floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. Note that VFP without SIMD is not a valid combination for any version of ARMv8-A, but I have ensured that these instructions all depend on both FeatureNEON and FeatureFullFP16 for consistency. The ".2h" vector type specifier is now legal (for the scalar pairwise reduction instructions), so some unrelated tests have been modified as different error messages are emitted. This is not a problem as the invalid operands are still caught. llvm-svn: 255010	2015-12-08 12:16:10 +00:00
Dan Gohman	31448f16b6	[WebAssembly] Fix a typo in a comment. llvm-svn: 254999	2015-12-08 03:43:03 +00:00
Dan Gohman	fd98ea89d9	[WebAssembly] Remove an unneeded static_cast. llvm-svn: 254998	2015-12-08 03:42:50 +00:00
Dan Gohman	7f970765ea	[WebAssembly] Fix an emacs syntax highlighting comment. llvm-svn: 254997	2015-12-08 03:36:00 +00:00
Dan Gohman	ad664b3bda	[WebAssembly] Convert a file-level comment to doxygen style. llvm-svn: 254996	2015-12-08 03:33:51 +00:00
Dan Gohman	d70e5907cd	[WebAssembly] Assert MRI.isSSA() in passes that depend on SSA form. llvm-svn: 254995	2015-12-08 03:30:42 +00:00
Dan Gohman	a8551b4d7e	[WebAssembly] Trim some unneeded #includes. llvm-svn: 254994	2015-12-08 03:25:35 +00:00
Dan Gohman	4a84b7322f	[WebAssembly] Remove the override of haveFastSqrt. The default implementation in BasicTTI already checks TLI and does the right thing. llvm-svn: 254993	2015-12-08 03:22:33 +00:00
Manman Ren	cb8470b4b5	[CXX TLS calling convention] Add support for AArch64. rdar://9001553 llvm-svn: 254978	2015-12-08 00:14:38 +00:00
Kit Barton	a1c712fae5	[PPC64] Convert bool literals to i32 Convert i1 values to i32 values if they should be allocated in GPRs instead of CRs. Phabricator: http://reviews.llvm.org/D14064 llvm-svn: 254942	2015-12-07 20:50:29 +00:00
Sanjay Patel	a6bdd70f4b	don't repeat function names in comments; NFC llvm-svn: 254930	2015-12-07 19:31:34 +00:00
Sanjay Patel	e4b9f507cf	fix 'the the '; NFC llvm-svn: 254928	2015-12-07 19:21:39 +00:00
Sanjay Patel	f9bdb872bd	remove redundant check: optForSize() includes a check for the minsize attribute; NFCI llvm-svn: 254925	2015-12-07 19:13:40 +00:00
Elena Demikhovsky	291fe0159f	AVX-512: Fixed a bug in FP logic operation lowering FP logic instructions are supported only in the DQ extension on AVX-512 targets. Without it, I use integer operations instead. Added tests. I also enabled FABS in this patch in order to check ANDPS. The operations are FOR, FXOR, FAND, FANDN. The instructions that are supported for 512-bit vectors under DQ are: VORPS/PD, VXORPS/PD, VANDPS/PD, VANDNPS/PD. Differential Revision: http://reviews.llvm.org/D15110 llvm-svn: 254913	2015-12-07 14:33:34 +00:00
Artyom Skrobov	e9b3fb8603	[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM. Summary: This reverts r254234, and adds a simple fix for the annoying case of use-after-free. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15236 llvm-svn: 254912	2015-12-07 14:22:39 +00:00
Elena Demikhovsky	33e61eceb4	AVX-512: Fixed masked load / store instruction selection for KNL. Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store. This intrinsic comes with all legal types: <8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru), but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only. All data operands should be widened to 512-bit vector. The mask operand should be widened to v16i1 with zeroes. Differential Revision: http://reviews.llvm.org/D15265 llvm-svn: 254909	2015-12-07 13:39:24 +00:00
Igor Breger	3ab6f17530	AVX-512: implement kunpck intrinsics. Differential Revision: http://reviews.llvm.org/D14821 llvm-svn: 254908	2015-12-07 13:25:18 +00:00
Marina Yatsina	497d44a081	[X86] Adding support for FWORD type for MS inline asm Adding support for FWORD type for MS inline asm. Differential Revision: http://reviews.llvm.org/D15268 llvm-svn: 254904	2015-12-07 13:09:20 +00:00
Bradley Smith	d5a1f47a63	[ARM] Flag vcvt{t,b} with an f16 type specifier as part of the FP16 extension Additionally correct the Cortex-R7 definition to allow the FP16 feature. llvm-svn: 254900	2015-12-07 10:54:36 +00:00
Zlatko Buljan	1a01c15027	[mips][microMIPS] Implement LH, LHE, LHU and LHUE instructions Differential Revision: http://reviews.llvm.org/D9824 llvm-svn: 254897	2015-12-07 08:29:31 +00:00
Dan Gohman	5e0886beb7	[WebAssembly] Factor out a TypeToString function, since we need it in multiple places. llvm-svn: 254884	2015-12-06 19:42:29 +00:00
Dan Gohman	770f0d0a40	[WebAssembly] Make tableswitch's 'default' operand explicit. NFC. llvm-svn: 254883	2015-12-06 19:34:57 +00:00
Dan Gohman	a4b710a74f	[WebAssembly] Enable folding of offsets into global variable addresses. llvm-svn: 254882	2015-12-06 19:33:32 +00:00
Dan Gohman	753abf8de5	[WebAssembly] Add some more ideas to README.txt. llvm-svn: 254880	2015-12-06 19:29:54 +00:00
Marina Yatsina	1d1aa0b0a8	[X86] Add support for loopz, loopnz for Intel syntax According to x86 spec, loopz and loopnz should be supported for Intel syntax, where loopz is equivalent to loope and loopnz is equivalent to loopne. Differential Revision: http://reviews.llvm.org/D15148 llvm-svn: 254877	2015-12-06 15:31:47 +00:00
Asaf Badouh	41ecf460fa	[X86][AVX512] add vmovss/sd missing encoding Differential Revision: http://reviews.llvm.org/D14701 llvm-svn: 254875	2015-12-06 13:26:56 +00:00
Michael Kuperstein	77ce9d3b1a	[X86] Always generate precise CFA adjustments. This removes the code path that generate "synchronous" (only correct at call site) CFA. We will probably want to re-introduce it once we are capable of emitting different .eh_frame and .debug_frame sections. Differential Revision: http://reviews.llvm.org/D14948 llvm-svn: 254874	2015-12-06 13:06:20 +00:00
Igor Breger	076dfe5c12	AVX512: support AVX512BW Intrinsic in 32bit mode. Differential Revision: http://reviews.llvm.org/D15076 llvm-svn: 254873	2015-12-06 11:35:18 +00:00
Craig Topper	15576e1c8f	Use make_range to reduce mentions of iterator type. NFC llvm-svn: 254872	2015-12-06 05:08:07 +00:00
Dan Gohman	d85c3b1fbc	[WebAssembly] Don't perform the returned-argument optimization on constants. llvm-svn: 254866	2015-12-05 22:12:39 +00:00
Dan Gohman	3bb55e98e2	[WebAssembly] Replace the fake JUMP_TABLE instruction with a def : Pat. NFC. llvm-svn: 254864	2015-12-05 20:46:53 +00:00
Dan Gohman	e2a7a8278f	[WebAssembly] Implement direct calls to external symbols. llvm-svn: 254863	2015-12-05 20:41:36 +00:00
Dan Gohman	284384b640	[WebAssembly] Support inline asm constraints of type i16 and similar. llvm-svn: 254861	2015-12-05 20:03:44 +00:00
Dan Gohman	905bef5cf9	[WebAssembly] Update a stale comment. NFC. llvm-svn: 254859	2015-12-05 19:43:19 +00:00
JF Bastien	f05f6fd1e2	WebAssembly: improve readme, add placeholder for tests. llvm-svn: 254857	2015-12-05 19:36:33 +00:00
Dan Gohman	7615e46919	[WebAssembly] Move useAA() out of line to make it more convenient to experiment with. llvm-svn: 254856	2015-12-05 19:27:18 +00:00
Dan Gohman	b0921ca9e1	[WebAssembly] Call TargetPassConfig base class functions in overriding functions. llvm-svn: 254855	2015-12-05 19:24:17 +00:00
Dan Gohman	ebb23545de	[WebAssembly] Expand frem as a floating point library function. llvm-svn: 254854	2015-12-05 19:15:57 +00:00
Craig Topper	5c32279bee	[Hexagon] Don't call getNumImplicitDefs and then iterate over the count. getNumImplicitDefs contains a loop so it's better to just loop over the null terminated implicit def list. NFC llvm-svn: 254852	2015-12-05 17:34:07 +00:00
Simon Pilgrim	4ba5969224	[X86][ADX] Added memory folding patterns and stack folding tests llvm-svn: 254844	2015-12-05 07:27:50 +00:00
Craig Topper	e5e035a3a8	Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef. llvm-svn: 254843	2015-12-05 07:13:35 +00:00
Simon Pilgrim	5a64d98303	[X86][FMA4] Explicitly set the domain of FMA4 float/double scalar instructions Both were defaulting to the float domain - now matches the packed instructions. llvm-svn: 254841	2015-12-05 07:07:42 +00:00
Dan Gohman	f0b165a7f8	[WebAssembly] Implement ReverseBranchCondition, and re-enable MachineBlockPlacement This patch introduces a codegen-only instruction currently named br_unless, which makes it convenient to implement ReverseBranchCondition and re-enable the MachineBlockPlacement pass. Then in a late pass, it lowers br_unless back into br_if. Differential Revision: http://reviews.llvm.org/D14995 llvm-svn: 254826	2015-12-05 03:03:35 +00:00
Dan Gohman	4da4abd87f	[WebAssembly] Fix scheduling dependencies in register-stackified code Add physical register defs to instructions used from stackified instructions to prevent them from being scheduled into the middle of a stack sequence. This is a conservative measure which may be loosened in the future. Differential Revision: http://reviews.llvm.org/D15252 llvm-svn: 254811	2015-12-05 00:51:40 +00:00
Derek Schuff	9d77952332	[WebAssembly] Support constant offsets on loads and stores This is just prototype for load/store for i32 types. I'll add them to the rest of the types if we like this direction. Differential Revision: http://reviews.llvm.org/D15197 llvm-svn: 254807	2015-12-05 00:26:39 +00:00
Philip Reames	7c6692de16	[EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC) When the notion of target specific memory intrinsics was introduced to EarlyCSE, the commit confused the notions of volatile and simple memory access. Since I'm about to start working on this area, cleanup the naming so that patches aren't horribly confusing. Note that the actual implementation was always bailing if the load or store wasn't simple. Reminder: - "volatile" - C++ volatile, can't remove any memory operations, but in principle unordered - "ordered" - imposes ordering constraints on other nearby memory operations - "atomic" - can't be split or sheared. In LLVM terms, all "ordered" operations are also atomic so the predicate "isAtomic" is often used. - "simple" - a load which is none of the above. These are normal loads and what most of the optimizer works with. llvm-svn: 254805	2015-12-05 00:18:33 +00:00
Hans Wennborg	fbf2822e6d	Add FeatureLAHFSAHF to amdfam10 as well. llvm-svn: 254801	2015-12-04 23:32:19 +00:00
Dan Gohman	35bfb24c28	[WebAssembly] Initial varargs support. Full varargs support will depend on prologue/epilogue support, but this patch gets us started with most of the basic infrastructure. Differential Revision: http://reviews.llvm.org/D15231 llvm-svn: 254799	2015-12-04 23:22:35 +00:00
Hans Wennborg	5000ce8a63	X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supported These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 llvm-svn: 254793	2015-12-04 23:00:33 +00:00
Chad Rosier	f3491496dc	[AArch64] Expand vector SDIVREM/UDIVREM operations. http://reviews.llvm.org/D15214 Patch by Ana Pazos <apazos@codeaurora.org>! llvm-svn: 254773	2015-12-04 21:38:44 +00:00
Dan Gohman	1ce2b1afd6	[WebAssembly] Add several more calling conventions to the supported list. llvm-svn: 254741	2015-12-04 18:27:03 +00:00
Sanjay Patel	1640c54593	fix formatting; NFC llvm-svn: 254739	2015-12-04 17:51:55 +00:00
Manman Ren	19c7bbe3b7	[CXX TLS calling convention] Add CXX TLS calling convention. This commit adds a new target-independent calling convention for C++ TLS access functions. It aims to minimize overhead in the caller by preserving as many registers as possible. The target-specific implementation for X86-64 is defined as follows: Arguments are passed as for the default C calling convention. The same applies for the return value(s). The callee preserves all GPRs except RAX and RDI. The access function makes C-style TLS function calls in the entry and exit block; C-style TLS functions save a lot more registers than normal calls. The added calling convention ties into the existing implementation of the C-style TLS functions, so we can't simply use existing calling conventions such as preserve_mostcc. rdar://9001553 llvm-svn: 254737	2015-12-04 17:40:13 +00:00
Dan Gohman	541841e365	[WebAssembly] Give names to the callseq begin and end instructions. llvm-svn: 254730	2015-12-04 17:19:44 +00:00
Dan Gohman	a3f5ce5f1b	[WebAssembly] clang-format CallingConvSupported. NFC. llvm-svn: 254729	2015-12-04 17:18:32 +00:00
Dan Gohman	85dbdda1ed	[WebAssembly] Factor out the list of supported calling conventions. llvm-svn: 254728	2015-12-04 17:16:07 +00:00
Dan Gohman	2d822e73fa	[WebAssembly] Check for more unsupported ABI flags. llvm-svn: 254727	2015-12-04 17:12:52 +00:00
Dan Gohman	cb7940f9f5	[WebAssembly] Use SelectionDAG::getUNDEF. NFC. llvm-svn: 254726	2015-12-04 17:09:42 +00:00
Krzysztof Parzyszek	f1b3e5e52e	[Hexagon] Simplify LowerCONCAT_VECTORS, handle different types better llvm-svn: 254724	2015-12-04 16:18:15 +00:00
Colin LeMahieu	4c606e66a7	[Hexagon] Using multiply instead of shift on signed number which can be UB llvm-svn: 254719	2015-12-04 15:48:45 +00:00
Jonas Paulsson	7fa69cd5dd	[SystemZ] Bugfix: Don't add CC twice to new three-address instruction. Since BuildMI() automatically adds the implicit operands for a new instruction, adding the old instruction's CC operand resulted in two CC imp-def operands, where only one was marked as dead. This caused buildSchedGraph() to miss dependencies on the CC reg. Review by Ulrich Weigand llvm-svn: 254714	2015-12-04 12:48:51 +00:00
Alexey Bataev	7cf324772f	LEA code size optimization pass (Part 1): Remove redundant address recalculations, by Andrey Turetsky Add new x86 pass which replaces address calculations in load or store instructions with def register of existing LEA (must be in the same basic block), if the LEA calculates address that differs only by a displacement. Works only with -Os or -Oz. Differential Revision: http://reviews.llvm.org/D13294 llvm-svn: 254712	2015-12-04 10:53:15 +00:00
Quentin Colombet	901f036353	[ARM] When a bitcast is about to be turned into a VMOVDRR, try to combine it with its source instead of forcing the values on GPRs. This improves the lowering of vector code when such bitcasts happen in the middle of vector computations. rdar://problem/23691584 llvm-svn: 254684	2015-12-04 01:53:14 +00:00
JF Bastien	580b6572b5	X86InstrInfo::copyPhysReg: workaround reg liveness Summary: computeRegisterLiveness and analyzePhysReg are currently getting confused about liveness in some cases, breaking copyPhysReg's calculation of whether AX is dead in some cases. Work around this issue temporarily by assuming that AX is always live. See detail in: https://llvm.org/bugs/show_bug.cgi?id=25033#c7 And associated bugs PR24535 PR25033 PR24991 PR24992 PR25201. This workaround makes the code correct but slightly inefficient, but it seems to confuse the machine instr verifier which now thinks EAX was undefined in some cases where it's being conservatively saved / restored. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15198 llvm-svn: 254680	2015-12-04 01:18:17 +00:00
Dan Gohman	391a98afd5	[WebAssembly] Fix dominance check for PHIs in the StoreResult pass When a block has no terminator instructions, getFirstTerminator() returns end(), which can't be used in dominance checks. Check dominance for phi operands separately. Also, remove some bits from WebAssemblyRegStackify.cpp that were causing trouble on the same testcase; they were left behind from an earlier experiment. Differential Revision: http://reviews.llvm.org/D15210 llvm-svn: 254662	2015-12-03 23:07:03 +00:00
Chih-Hung Hsieh	ed7d81e5d4	[X86] Part 1 to fix x86-64 fp128 calling convention. Almost all these changes are conditioned and only apply to the new x86-64 f128 type configuration, which will be enabled in a follow up patch. They are required together to make new f128 work. If there is any error, we should fix or revert them as a whole. These changes should have no impact to current configurations. * Relax type legalization checks to accept new f128 type configuration, whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has TLI.isTypeLegal true. * Relax GetSoftenedFloat to return in some cases f128 type SDValue, which is TLI.isTypeLegal but not "softened" to i128 node. * Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration, to generate optimized bitwise operators for libm functions. * Enhance related Lower* functions to handle f128 type. * Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions to keep new f128 type in register, and convert f128 operators to library calls. * Fix Combiner, Emitter, Legalizer routines that did not handle f128 type. * Add ExpandConstant to handle i128 constants, ExpandNode to handle ISD::Constant node. * Add one more parameter to getCommonSubClass and firstCommonClass, to guarantee that returned common sub class will contain the specified simple value type. This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp. * Fix infinite loop in getTypeLegalizationCost when f128 is the value type. * Fix printOperand to handle null operand. * Enhance ISD::BITCAST node to handle f128 constant. * Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes. * Enhance X86AsmPrinter to emit f128 values in comments. Differential Revision: http://reviews.llvm.org/D15134 llvm-svn: 254653	2015-12-03 22:02:40 +00:00
Colin LeMahieu	15ca65c253	[Hexagon] Adding shuffling resources for HVX instructions and tests for instruction encodings. llvm-svn: 254652	2015-12-03 21:44:28 +00:00
Reid Kleckner	93fc520339	[X86] Put no-op ADJCALLSTACK markers around all dynamic lowerings Summary: These ADJCALLSTACK markers don't generate code, but they keep dynamic alloca code that calls chkstk out of the prologue. This slightly pessimizes inalloca calls by preventing some register copy coalescing, but I can live with that. Reviewers: qcolombet Subscribers: hans, llvm-commits Differential Revision: http://reviews.llvm.org/D15200 llvm-svn: 254645	2015-12-03 20:46:59 +00:00
Krzysztof Parzyszek	7709aa0e07	[Hexagon] Remove variable unused in NDEBUG build llvm-svn: 254623	2015-12-03 17:53:34 +00:00
Matthias Braun	0d4505c067	AArch64FastISel: Use cbz/cbnz to branch on i1 In the case of a conditional branch without a preceding cmp we used to emit a "and; cmp; b.eq/b.ne" sequence, use tbz/tbnz instead. Differential Revision: http://reviews.llvm.org/D15122 llvm-svn: 254621	2015-12-03 17:19:58 +00:00
Krzysztof Parzyszek	c168c0165c	[Hexagon] Implement CONCAT_VECTORS for HVX using V6_vcombine llvm-svn: 254617	2015-12-03 16:47:20 +00:00
Colin LeMahieu	7c572b2125	[Hexagon] NFC Using canonicalizePacket to compound/duplex/pad packets rather than doing it separately. This also ensures the integrated assembler path matches the assembly parser path. llvm-svn: 254616	2015-12-03 16:37:21 +00:00
Krzysztof Parzyszek	25ddd2c9e8	[Hexagon] Fix instruction descriptor flags for memory access size llvm-svn: 254613	2015-12-03 15:41:33 +00:00
Marina Yatsina	4b1aea0802	[X86] MS inline asm: produce error when encountering "<type> ptr <reg name>" Currently "<type> ptr <reg name>" is treated as <reg name> in MS inline asm, ignoring the "<type> ptr" completely and possibly ignoring the intention of the user. Fixed llvm to produce an error when encountering "<type> ptr <reg name>" operands. For example: andpd xmm1,xmmword ptr xmm1 --> andpd xmm1, xmm1 though andpd has 2 possible matching formats - andpd xmm, xmm/m128 Patch by: ziv.izhar@intel.com Differential Revision: http://reviews.llvm.org/D14607 llvm-svn: 254607	2015-12-03 12:17:03 +00:00
Marina Yatsina	90d9ffa7d6	[X86] Add support for fcomip, fucomip for Intel syntax According to x86 spec, fcomip and fucomip should be supported for Intel syntax. Differential Revision: http://reviews.llvm.org/D15104 llvm-svn: 254595	2015-12-03 08:55:33 +00:00
Tom Stellard	9760f03757	AMDGPU/SI: Emit constant arrays in the .hsrodata_readonly_agent section Summary: This is done only when targeting HSA. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13807 llvm-svn: 254587	2015-12-03 03:34:32 +00:00
Joerg Sonnenberger	48eb197434	Add a TODO item that the nop handling before FP conditional branches is not enough for SPARCv7. llvm-svn: 254580	2015-12-03 02:35:24 +00:00
Derek Schuff	5268aaf7b6	[WebAssembly] Add a test for wasm-store-results pass Differential Revision: http://reviews.llvm.org/D15167 llvm-svn: 254570	2015-12-03 00:50:30 +00:00
Dan Gohman	ac132e9305	[WebAssembly] Assert that byval and nest are not used for return types. llvm-svn: 254567	2015-12-02 23:40:03 +00:00
Krzysztof Parzyszek	8d8b229de9	[Hexagon] Improve lowering of instructions to the MC layer - Add extenders when necessary. - Handle some basic relocations. This should fix the failure in tools/clang/test/CodeGenCXX/crash.cpp llvm-svn: 254564	2015-12-02 23:08:29 +00:00
David Majnemer	70497c696a	Move EH-specific helper functions to a more appropriate place No functionality change is intended. llvm-svn: 254562	2015-12-02 23:06:39 +00:00
Alexey Samsonov	44ff204fad	Fixup for r254547: use format_hex() to simplify code. llvm-svn: 254560	2015-12-02 22:59:22 +00:00
Alexey Samsonov	39b7d65d82	[PowerPC] Remove wild call to RegScavenger::initRegState(). This call should in fact be made by RegScavenger::enterBasicBlock() called below. The first call does nothing except for triggering UB, indicated by UBSan (passing nullptr to memset()). llvm-svn: 254548	2015-12-02 21:25:28 +00:00
Alexey Samsonov	bcfabaa05b	[Hexagon] Remove std::hex in favor of format(). std::hex is not used anywhere in LLVM code base except for this place, and it has a known undefined behavior (at least in libstdc++ 4.9.3): https://llvm.org/bugs/show_bug.cgi?id=18156, which fires in UBSan bootstrap of LLVM. llvm-svn: 254547	2015-12-02 21:13:43 +00:00
Tom Stellard	00f2f91af4	AMDGPU/SI: Correctly emit agent global segment variables when targeting HSA Differential Revision: http://reviews.llvm.org/D14508 llvm-svn: 254540	2015-12-02 19:47:57 +00:00
Krzysztof Parzyszek	de25ecfa62	[Hexagon] Remove TFRI_V4 instruction, use existing A2_tfrsi instead llvm-svn: 254539	2015-12-02 19:44:35 +00:00
Kyle Butt	015f4fc854	Test Commit: iteratee Remove whitespace from blank lines. NFC llvm-svn: 254531	2015-12-02 18:53:33 +00:00
Tom Stellard	e928533dae	AMDGPU: Fix msan test failure llvm-svn: 254527	2015-12-02 18:35:23 +00:00
Tim Northover	f520eff782	AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic if there has been a corresponding successful stxp. It's less clear for AArch32, so I'm leaving that alone for now. llvm-svn: 254524	2015-12-02 18:12:57 +00:00
Dan Gohman	53d1399792	[WebAssembly] Fix comments to say "LIFO" instead of "FIFO" when describing a stack. llvm-svn: 254523	2015-12-02 18:08:49 +00:00
Tom Stellard	e3b5aeaf83	AMDGPU/SI: Don't emit group segment global variables Summary: Only global or readonly segment variables should appear in object files. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15111 llvm-svn: 254519	2015-12-02 17:00:42 +00:00
Michael Zuckerman	15152a5c41	By Intel spec: 9B DD /7 (FSTSW m2byte) stores the FPU status word at m2byte after checking for pending unmasked floating-point exceptions; 9B DF E0 (FSTSW AX) stores the FPU status word in the AX register after checking for pending unmasked floating-point exceptions; DD /7 (FNSTSW m2byte) stores the FPU status word at m2byte without checking for pending unmasked floating-point exceptions; DF E0 (FNSTSW AX) stores the FPU status word in the AX register without checking for pending unmasked floating-point exceptions. Each encoding is listed as Valid in both mode columns. m2byte is a word (2-byte) memory operand, and therefore the instruction operand needs to be changed from f32mem to i16mem. Differential Revision: http://reviews.llvm.org/D14953 llvm-svn: 254512	2015-12-02 14:34:34 +00:00
Christof Douma	8b5dc2c94e	[AArch64]: Add support for Cortex-A35 Adds support for the new Cortex-A35 ARMv8-A core. llvm-svn: 254503	2015-12-02 11:53:44 +00:00
Nemanja Ivanovic	74e31bc929	Patch to fix a crash in the PowerPC back end due to ISD::ROTL and ISD::ROTR not being expanded. Test case included. llvm-svn: 254501	2015-12-02 10:36:24 +00:00
Hrvoje Varga	672b0f5582	[mips][microMIPS] Implement PREPEND, RADDU.W.QB, RDDSP, REPL.PH, REPL.QB, REPLV.PH, REPLV.QB and MTHLIP instructions Differential Revision: http://reviews.llvm.org/D14527 llvm-svn: 254496	2015-12-02 09:31:24 +00:00
Simon Pilgrim	3fc3454a0c	[X86][FMA] Optimize FNEG(FMUL) Patterns On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0) Fix for PR24366 Differential Revision: http://reviews.llvm.org/D14909 llvm-svn: 254495	2015-12-02 09:07:55 +00:00
Elena Demikhovsky	a1a40cce9f	AVX-512: Updated cost of FP/SINT/UINT conversion operations I checked and updated the cost of AVX-512 conversion operations. Added cost of conversion operations in DQ mode. Conversion of illegal types that requires vector split is not calculated right now (like for other X86 targets). Differential Revision: http://reviews.llvm.org/D15074 llvm-svn: 254494	2015-12-02 08:59:47 +00:00
Asaf Badouh	2489f350c0	[X86][AVX512] add comi with Sae add builtin_ia32_vcomisd and builtin_ia32_vcomisd Differential Revision: http://reviews.llvm.org/D14331 llvm-svn: 254493	2015-12-02 08:17:51 +00:00
Craig Topper	f419a1f69a	[X86] Change getZeroVector to take an MVT instead of EVT. One minor change needed to only try to perform 256-bit shuffle combines on legal vector types. llvm-svn: 254490	2015-12-02 06:39:19 +00:00
Craig Topper	6164297f46	[X86] Fix weird indentation. NFC llvm-svn: 254487	2015-12-02 05:24:38 +00:00
Quentin Colombet	bbdebefff6	[X86] Fix a think-o when checking if the eflags needs to be preserved. llvm-svn: 254480	2015-12-02 02:07:00 +00:00

35449 Commits