llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	a4b1b6ea05	LegalizeDAG: Don't replace vector load with integer unless legal On AMDGPU we want to be able to promote i64/f64 loads to v2i32. If the access is unaligned, this would conclude that since i64 is legal, it would convert it back to i64 and there is an endless legalization loop. Extract the logic for scalarizing the load into a new TargetLowering function, where this can also replace the custom function AMDGPU has for this. llvm-svn: 264927	2016-03-30 21:15:10 +00:00
Nirav Dave	2aab7f4358	Add support for no-jump-tables Add function soft attribute to the generation of Jump Tables in CodeGen as initial step towards clang support of gcc's no-jump-table support Reviewers: hans, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18321 llvm-svn: 264756	2016-03-29 17:46:23 +00:00
Manman Ren	f46262e0b7	Swift Calling Convention: add swiftself attribute. Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754	2016-03-29 17:37:21 +00:00
Kyle Butt	5e241b11ed	[Codegen] Decrease minimum jump table density. Minimum density for both optsize and non optsize are now options -sparse-jump-table-density (default 10) for non optsize functions -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues llvm-svn: 264689	2016-03-29 00:23:41 +00:00
Junmo Park	a26e93bcec	Minor code cleanup. NFC. llvm-svn: 264505	2016-03-26 06:04:55 +00:00
Nirav Dave	fa250cad37	Prevent construction of cycle in DAG store merge When merging stores in DAGCombiner, add check to ensure that no dependenices exist that would cause the construction of a cycle in our DAG. This may happen if one store has a data dependence on another instruction (e.g. a load) which itself has a (chain) dependence on another store being merged. These stores cannot be merged safely and doing so results in a cycle that is discovered in LegalizeDAG. This test is only done in cases where Antialias analysis is used (UseAA) as non-AA store merge candidates will be merged logically after all loads which have been checked to not alias. Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight Subscribers: llvm-commits, tberghammer, danalbert, srhines Differential Revision: http://reviews.llvm.org/D18336 llvm-svn: 264461	2016-03-25 21:06:30 +00:00
Manman Ren	9dd8c14674	CXX TLS: collect return blocks after SelectAllBasicBlocks. It is incorrect to get the corresponding MBB for a ReturnInst before SelectAllBasicBlocks since SelectAllBasicBlocks can change the correspondence between a ReturnInst and the MBB it is in. PR27062 llvm-svn: 264358	2016-03-24 23:21:29 +00:00
Sanjoy Das	fd3eaa8c5c	Reduce code duplication by extracting out a helper function; NFC llvm-svn: 264355	2016-03-24 22:51:49 +00:00
Sanjoy Das	731c67fed2	Lower varargs correctly in deopt bundle lowering Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg. llvm-svn: 264354	2016-03-24 22:37:52 +00:00
Sanjoy Das	df9ae70f49	Add lowering support for llvm.experimental.deoptimize Summary: Only adds support for "naked" calls to llvm.experimental.deoptimize. Support for round-tripping through RewriteStatepointsForGC will come as a separate patch (should be simpler than this one). Reviewers: reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18429 llvm-svn: 264329	2016-03-24 20:23:29 +00:00
Sanjoy Das	c0c59fe14e	[Statepoints] Fix yet another issue around gc pointer uniqueing Given that StatepointLowering now uniques derived pointers before putting them in the per-statepoint spill map, we may end up with missing entries for derived pointers when we visit a gc.relocate on a pointer that was de-duplicated away. Fix this by keeping two maps, one mapping gc pointers to their de-duplicated values, and one mapping a de-duplicated value to the slot it is spilled in. llvm-svn: 264320	2016-03-24 18:57:39 +00:00
Sanjoy Das	42f91a9959	Minor cosmestic changes (NFC) - Reflow comments - Rename function llvm-svn: 264319	2016-03-24 18:57:31 +00:00
Tim Northover	4498eff9bb	CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS. If the operation's type has been promoted during type legalization, we need to account for the fact that the high bits of the comparison operand are likely unspecified. The LHS is usually zero-extended, but MIPS sign extends it, so we have to be slightly careful. Patch by Simon Dardis. llvm-svn: 264296	2016-03-24 15:38:38 +00:00
Pirama Arumuga Nainar	dc45aef2d8	Remove unsafe AssertZext after promoting result of FP_TO_FP16 Summary: Some target lowerings of FP_TO_FP16, for instance ARM's vcvtb.f16.f32 instruction, do not guarantee that the top 16 bits are zeroed out. Remove the unsafe AssertZext and add tests to exercise this. Reviewers: jmolloy, sbaranga, kristof.beyls, aadg Subscribers: llvm-commits, srhines, aemerson Differential Revision: http://reviews.llvm.org/D18426 llvm-svn: 264285	2016-03-24 14:06:03 +00:00
Justin Bogner	c35c10593b	SelectionDAG: Remove a tautological dyn_cast. NFC Index is already a StoreSDNode, so this dyn_cast doesn't do anything. llvm-svn: 264177	2016-03-23 18:15:33 +00:00
Sanjoy Das	a5b2972977	Remove stale comment llvm-svn: 264131	2016-03-23 02:28:35 +00:00
Sanjoy Das	ac53dc7520	[StatepointLowering] Don't do two DenseMap lookups; nfci llvm-svn: 264130	2016-03-23 02:24:15 +00:00
Sanjoy Das	7edbef316b	[StatepointLowering] Minor NFC cleanups - Use auto - Name variables in LLVM style - Use llvm::find instead of std::find - Blank lines between declarations llvm-svn: 264129	2016-03-23 02:24:13 +00:00
Sanjoy Das	4cd746ebe0	[StatepointLowering] Minor nfc refactoring Now that StatepointLoweringInfo represents base pointers, derived pointers and gc relocates as SmallVectors and not ArrayRefs, we no longer need to allocate "backing storage" on stack in LowerStatepoint. So elide the backing storage, and inline the trivial body of getIncomingStatepointGCValues. llvm-svn: 264128	2016-03-23 02:24:10 +00:00
Sanjoy Das	e58ca59cf4	[StatepointLowering] Schedule gc relocates before uniqueing them Otherwise we can see an "unexpected" gc.relocate that we uniqued away. llvm-svn: 264127	2016-03-23 02:24:07 +00:00
Simon Pilgrim	c6f5fe3d69	[SelectionDAG] Ensure constant folded legalized vector element types are compatible with the BUILD_VECTOR type Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected. llvm-svn: 264085	2016-03-22 19:59:53 +00:00
Tim Northover	b49a8a9dbb	CodeGen: check return types match when emitting tail call to builtin. We were just completely ignoring the types when determining whether we could safely emit a libcall as a tail call. This is clearly wrong. Theoretically, we could dig deeper looking for incidental matches (much like the generic code in Analysis.cpp does), but it's probably not worth it for the few libcalls that exist. llvm-svn: 264084	2016-03-22 19:14:38 +00:00
Sanjoy Das	eb5037cadc	Allow lowering call sites with both funclets and deopt state Lowering funclets is a no-op, so we can just go ahead and lower the deopt state. llvm-svn: 264078	2016-03-22 18:10:39 +00:00
Sanjoy Das	6b535630a1	Add a hasOperandBundlesOtherThan helper, and use it; NFC llvm-svn: 264072	2016-03-22 17:51:25 +00:00
Simon Pilgrim	25fb4177fb	[X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062	2016-03-22 16:22:08 +00:00
Sanjoy Das	38bfc22161	Add "first class" lowering for deopt operand bundles Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry0. This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015	2016-03-22 00:59:13 +00:00
Silviu Baranga	46030585b3	[DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext\|any_ext\|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935	2016-03-21 11:43:46 +00:00
Manman Ren	2828c57b6f	[CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86. Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855	2016-03-18 23:38:49 +00:00
Sanjoy Das	3a02019fbc	[SelectionDAG] Remove visitStatepoint; NFC This way we have a single entry point into StatepointLowering. The method was a direct dispatch to LowerStatepoint anyway. llvm-svn: 263682	2016-03-17 00:47:14 +00:00
Sanjoy Das	43e33d61c6	Fix indentation; NFC llvm-svn: 263672	2016-03-16 23:11:21 +00:00
Sanjoy Das	70697ff74d	Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC Summary: This is a step towards implementing "direct" lowering of calls and invokes with deopt operand bundles into STATEPOINT nodes (as opposed to having them mandatorily pass through RewriteStatepointsForGC, which is the case today). This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint` helper function that is able to lower a "statepoint like thing", and uses it to lower `gc.statepoint` calls. This is an NFC now, but in a later change we will use `LowerAsStatepoint` to directly lower calls and invokes with operand bundles without going through an intermediate `gc.statepoint` IR representation. FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add support for lowering non gc.statepoints, right now it is fairly tightly coupled with an IR level `gc.statepoint`. Reviewers: reames, pgavlin, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18106 llvm-svn: 263671	2016-03-16 23:08:00 +00:00
James Y Knight	f44fc5219f	Tweak some atomics functions in preparation for larger changes; NFC. - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665	2016-03-16 22:12:04 +00:00
Sanjoy Das	19c6159833	[SelectionDAG] Extract out populateCallLoweringInfo; NFC SelectionDAGBuilder::populateCallLoweringInfo is now used instead of SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo interface is more composable in face of design changes like http://reviews.llvm.org/D18106 llvm-svn: 263663	2016-03-16 20:49:31 +00:00
Simon Pilgrim	b5a20f0fec	Removed trailing whitespace llvm-svn: 263650	2016-03-16 18:37:44 +00:00
Sanjoy Das	c11460e051	[StatepointLowering] Move an assertion; NFCI Instead of running an explicit loop over `gc.relocate` calls hanging off of a `gc.statepoint`, assert the validity of the type of the value being relocated in `visitRelocate`. llvm-svn: 263516	2016-03-15 01:16:31 +00:00
Eric Christopher	da8b3f1914	Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware" as it seems to be causing crashes during code generation in halide. PR forthcoming. This reverts commit r263303. llvm-svn: 263512	2016-03-14 23:59:57 +00:00
Sanjay Patel	7506852709	[DAG] use !isUndef() ; NFCI llvm-svn: 263453	2016-03-14 18:09:43 +00:00
Sanjay Patel	5719584129	[DAG] use isUndef() ; NFCI llvm-svn: 263448	2016-03-14 17:28:46 +00:00
Sanjoy Das	ecf96c9516	Make gc relocates more strongly typed; NFC Don't use a `Value ` where we can use a stronger `GCRelocateInst ` type. llvm-svn: 263327	2016-03-12 02:54:27 +00:00
Simon Pilgrim	33d57c7547	[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 263303	2016-03-11 22:18:05 +00:00
Simon Pilgrim	61eb49e437	[X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion). Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 263159	2016-03-10 20:40:26 +00:00
Tom Stellard	9f2e00de7b	SelectionDAG: Fix a crash on inline asm when output register supports multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022	2016-03-09 16:02:52 +00:00
Hans Wennborg	e00b6e7249	Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935	2016-03-08 16:21:41 +00:00
Justin Bogner	671febc0f7	Re-apply "SelectionDAG: Store SDNode operands in an ArrayRecycler" This re-applies r262886 with a fix for 32 bit platforms that have 8 byte pointer alignment, effectively reverting r262892. Original Message: Currently some SDNode operands are malloc'd, some are stored inline in subclasses of SDNode, and some are thrown into a BumpPtrAllocator. This scheme is complex, inconsistent, and makes refactoring SDNodes fairly difficult. Instead, we can allocate all of the operands using an ArrayRecycler that wraps a BumpPtrAllocator. This keeps the cache locality when iterating operands, improves locality when iterating SDNodes without looking at operands, and vastly simplifies the ownership semantics. It also means we stop overallocating SDNodes by 2-3x and will make it simpler to fix the rampant undefined behaviour we have in how we mutate SDNodes from one kind to another (See llvm.org/pr26808). This is NFC other than the changes in memory behaviour, and I ran some LNT tests to make sure this didn't hurt compile time. Not many tests changed: there were a couple of 1-2% regressions reported, but there were more improvements (of up to 4%) than regressions. llvm-svn: 262902	2016-03-08 03:14:29 +00:00
Justin Bogner	7e6f09c28f	Revert "SelectionDAG: Store SDNode operands in an ArrayRecycler" Looks like the largest SDNode is different between 32 and 64 bit now, so this is breaking 32 bit bots. Reverting while I figure out a fix. This reverts r262886. llvm-svn: 262892	2016-03-08 01:07:03 +00:00
Justin Bogner	6543a9385f	SelectionDAG: Store SDNode operands in an ArrayRecycler Currently some SDNode operands are malloc'd, some are stored inline in subclasses of SDNode, and some are thrown into a BumpPtrAllocator. This scheme is complex, inconsistent, and makes refactoring SDNodes fairly difficult. Instead, we can allocate all of the operands using an ArrayRecycler that wraps a BumpPtrAllocator. This keeps the cache locality when iterating operands, improves locality when iterating SDNodes without looking at operands, and vastly simplifies the ownership semantics. It also means we stop overallocating SDNodes by 2-3x and will make it simpler to fix the rampant undefined behaviour we have in how we mutate SDNodes from one kind to another (See llvm.org/pr26808). This is NFC other than the changes in memory behaviour, and I ran some LNT tests to make sure this didn't hurt compile time. Not many tests changed: there were a couple of 1-2% regressions reported, but there were more improvements (of up to 4%) than regressions. llvm-svn: 262886	2016-03-08 00:39:51 +00:00
Matt Arsenault	ceb2c06cbd	DAGCombiner: Check legality before creating extract_vector_elt Problem not hit by any in tree target. llvm-svn: 262852	2016-03-07 21:10:09 +00:00
Craig Topper	267bdb2094	[CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC llvm-svn: 262815	2016-03-07 07:29:12 +00:00
Michael Kuperstein	b89f0fa2a2	[DAGCombine] Fix divrem combine not to assume div/rem type is simple. The divrem combine assumed the type of the div/rem is simple, which isn't necessarily true. This probably worked fine until r250825, since it only saw legal types, but now breaks when it runs as a pre-type-legalization combine. This fixes PR26835. Differential Revision: http://reviews.llvm.org/D17878 llvm-svn: 262746	2016-03-04 21:23:29 +00:00
Renato Golin	175c6d6d95	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262738	2016-03-04 19:19:36 +00:00
Simon Pilgrim	91dd0a796c	[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 262599	2016-03-03 09:43:28 +00:00
Renato Golin	3d78271eac	Revert "[ARM] Merging 64-bit divmod lib calls into one" This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594	2016-03-03 08:57:44 +00:00
David Majnemer	1ef654024f	[X86] Don't give catch objects a displacement of zero Catch objects with a displacement of zero do not initialize a catch object. The displacement is relative to %rsp at the end of the function's prologue for x86_64 targets. If we place an object at the top-of-stack, we will end up wit a displacement of zero resulting in our catch object remaining uninitialized. Address this by creating our catch objects as fixed objects. We will ensure that the UnwindHelp object is created after the catch objects so that no catch object will have a displacement of zero. Differential Revision: http://reviews.llvm.org/D17823 llvm-svn: 262546	2016-03-03 00:01:25 +00:00
Renato Golin	93e42d9934	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262507	2016-03-02 19:35:45 +00:00
Justin Bogner	b2ecee9c31	SelectionDAG: Use correctly sized allocation functions for SDNodes The placement new calls here were all calling the allocation function in RecyclingAllocator/Recycler for SDNode, instead of the function for the specific subclass we were constructing. Since this particular allocator always overallocates it more or less worked, but would hide what we're actually doing from any memory tools. Also, if you tried to change this allocator so something like a BumpPtrAllocator or MallocAllocator, the compiler would crash horribly all the time. Part of llvm.org/PR26808. llvm-svn: 262500	2016-03-02 19:01:11 +00:00
Matt Arsenault	7d0a77b979	DAGCombiner: Make sure an integer is being truncated llvm-svn: 262446	2016-03-02 01:36:51 +00:00
Matt Arsenault	b36d462fac	DAGCombiner: Turn truncate of a bitcasted vector to an extract On AMDGPU where operations i64 operations are often bitcasted to v2i32 and back, this pattern shows up regularly where it breaks some expected combines on i64, such as load width reducing. This fixes some test failures in a future commit when i64 loads are changed to promote. llvm-svn: 262397	2016-03-01 21:31:53 +00:00
Vasileios Kalintiris	36901dd1c3	Revert "[mips] Promote the result of SETCC nodes to GPR width." This reverts commit r262316. It seems that my change breaks an out-of-tree chromium buildbot, so I'm reverting this in order to investigate the situation further. llvm-svn: 262387	2016-03-01 20:25:43 +00:00
Justin Lebar	b5ca00a58d	[NVPTX] Use different, convergent MIs for convergent calls. Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level. Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all. Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls. Reviewers: jholewinski Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17423 llvm-svn: 262373	2016-03-01 19:24:03 +00:00
Matt Arsenault	03dac8d8e4	DAGCombiner: Turn extract of bitcasted integer into truncate This reduces the number of bitcast nodes and generally cleans up the DAG when bitcasting between integers and vectors everywhere. llvm-svn: 262358	2016-03-01 18:01:37 +00:00
Vasileios Kalintiris	3a8f7f9e31	[mips] Promote the result of SETCC nodes to GPR width. Summary: This patch modifies the existing comparison, branch, conditional-move and select patterns, and adds new ones where needed. Also, the updated SLT{u,i,iu} set of instructions generate a GPR width result. The majority of the code changes in the Mips back-end fix the wrong assumption that the result of SETCC nodes always produce an i32 value. The changes in the common code path account for the fact that in 64-bit MIPS targets, i1 is promoted to i32 instead of i64. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D10970 llvm-svn: 262316	2016-03-01 10:08:01 +00:00
Matt Arsenault	a67c4916cf	LegalizeDAG: Use correct ptr type when expanding unaligned load/store This fixes regressions exposed in existing AMDGPU tests in a future commit when all loads are custom lowered. llvm-svn: 262299	2016-03-01 05:13:35 +00:00
Matt Arsenault	982224cfb8	DAGCombiner: Don't unnecessarily swap operands in ReassociateOps In the case where op = add, y = base_ptr, and x = offset, this transform: (op y, (op x, c1)) -> (op (op x, y), c1) breaks the canonical form of add by putting the base pointer in the second operand and the offset in the first. This fix is important for the R600 target, because for some address spaces the base pointer and the offset are stored in separate register classes. The old pattern caused the ISel code for matching addressing modes to put the base pointer and offset in the wrong register classes, which required no-trivial code transformations to fix. llvm-svn: 262148	2016-02-27 19:57:45 +00:00
Matt Arsenault	360d244d5b	DAGCombiner: Relax sqrt NaN folding check This is OK for +0 since compares to +/-0 give the same result. llvm-svn: 262125	2016-02-27 09:38:05 +00:00
Cong Hou	e0eb8bfe37	Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64. llvm-svn: 262091	2016-02-26 23:25:30 +00:00
Cong Hou	4ce0280a41	Detecte vector reduction operations just before instruction selection. (This is the second attemp to commit this patch, after fixing pr26652 & pr26653). This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stay unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261804	2016-02-24 23:40:36 +00:00
Artur Pilipenko	31bcca47d3	NFC. Move isDereferenceable to Loads.h/cpp This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736	2016-02-24 12:49:04 +00:00
Matt Arsenault	0c6bd7b0d3	SelectionDAG: Use correct addrspace when lowering memcpy This was causing assertions later from using the wrong pointer size with LDS operations. getOptimalMemOpType should also have address space arguments later. This avoids assertions in existing tests exposed by a future commit. llvm-svn: 261580	2016-02-22 22:01:42 +00:00
Duncan P. N. Exon Smith	e9bc579c37	ADT: Remove == and != comparisons between ilist iterators and pointers I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498	2016-02-21 20:39:50 +00:00
Simon Pilgrim	c5199aae82	[DAGCombiner] Use getBitcast helper when possible. NFCI. llvm-svn: 261437	2016-02-20 15:05:29 +00:00
Sanjoy Das	ffb7bd11f7	[StatepointLowering] Minor non-semantic cleanups Use auto, bring file up to coding standards etc. llvm-svn: 261358	2016-02-19 19:37:07 +00:00
Sanjoy Das	f6fee29ceb	[StatepointLowering] Update StatepointMaxSlotsRequired correctly Now that we don't always add an element to AllocatedStackSlots if we don't find a pre-existing unallocated stack slot, bumping StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead bump the statistic near the push_back, to Builder.FuncInfo.StatepointStackSlots.size(). llvm-svn: 261348	2016-02-19 18:15:56 +00:00
Sanjoy Das	e8019df552	[StatepointLowering] Fix a mistake in rL261336 The check on MFI->getObjectSize() has to be on the FrameIndex, not on the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests I added in rL261336 didn't catch this. llvm-svn: 261347	2016-02-19 18:15:53 +00:00
Sanjoy Das	171313c69a	[StatepointLowering] Change AllocatedStackSlots to use SmallBitVector NFCI. They key motivation here is that I'd like to use SmallBitVector::all() in a later change. Also, using a bit vector here seemed better in general. The only interesting change here is that in the failure case of allocateStackSlot, we no longer (the equivalent of) push_back(true) to AllocatedStackSlots. As far as I can tell, this is fine, since we'd never re-use those slots in the same StatepointLoweringState instance. Technically there was no need to change the operator[] type accesses to set() and test(), but I thought it'd be nice to make it obvious that we're using something other than a std::vector like thing. llvm-svn: 261337	2016-02-19 17:15:26 +00:00
Sanjoy Das	d2db73ba59	[StatepointLowering] Fix bug in allocateStackSlot allocateStackSlot did not consider the size of the value to be spilled before deciding to re-use a spill slot. This was originally okay (since originally we'd only ever spill pointers), but it became not okay when we changed our scheme to directly spill vectors of pointers. While this change fixes the bug pointed out, it has two performance caveats: - It matches spill slot and spillee size exactly, while in theory we can spill, e.g., an 8 byte pointer into a 16 byte slot. This is slightly complicated to fix since in the stackmaps section, we report the size of the spill slot as the size of the "indirect value"; and if they're no longer equivalent, we'll have to keep track of the (indirect) value size separately from the stack slot size. - It will "spuriously run out" of reusable slots, since we now have an second check in the search loop in addition to the availablity check (e.g. you had two free scalar slots, and you first ask for a vector slot followed by a scalar slot). I'll fix this in a later commit. llvm-svn: 261336	2016-02-19 17:15:22 +00:00
Sanjoy Das	7b2e91fb59	[StatepointLowering] Clean up allocateStackSlot This removes the unusual loop structure in allocateStackSlot in favor of something more straightforward. I've also removed the cautionary comment in the function, which I suspect is historical cruft now, and confuses more than it enlightens. llvm-svn: 261335	2016-02-19 17:15:17 +00:00
Matthias Braun	848e79c578	LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs llvm-svn: 261306	2016-02-19 04:44:19 +00:00
Richard Trieu	7a08381403	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Nico Weber	e6154ffbe0	Revert r261070, it caused PR26652 / PR26653. llvm-svn: 261127	2016-02-17 18:47:29 +00:00
Cong Hou	bbd4e3b400	Detecte vector reduction operations just before instruction selection. This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stay unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261070	2016-02-17 06:37:04 +00:00
Ahmed Bougacha	93cff7fb82	[CodeGen] Document and use getConstant's splat-building feature. NFC. Differential Revision: http://reviews.llvm.org/D17229 llvm-svn: 260901	2016-02-15 18:07:29 +00:00
Pirama Arumuga Nainar	7476bc89e9	Don't combine fp_round (fp_round x) if f80 to f16 is generated Summary: This patch skips DAG combine of fp_round (fp_round x) if it results in an fp_round from f80 to f16. fp_round from f80 to f16 always generates an expensive (and as yet, unimplemented) libcall to __truncxfhf2. This prevents selection of native f16 conversion instructions from f32 or f64. Moreover, the first (value-preserving) fp_round from f80 to either f32 or f64 may become a NOP in platforms like x86. Reviewers: ab Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D17221 llvm-svn: 260769	2016-02-13 00:08:05 +00:00
Sanjay Patel	e5df1dfb14	[SelectionDAG] change getConstant() to use the input SDLoc when building splat vectors The code change is simple enough: instead of attaching an anonymous SDLoc to splatted vector constants, use the scalar constant's existing SDLoc since that is what is passed into getConstant() as a param. But this changes instruction scheduling, so I'll explain why that happens. The motivation for this patch starts near: http://reviews.llvm.org/rL258833 ...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'. But when I made that change locally, several x86 codegen tests wiggled. It turns out that the lack of SDLoc consistency in getConstant() changes the way ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG scheduler algorithms use IROrder for tie-breaking. Differential Revision: http://reviews.llvm.org/D16972 llvm-svn: 260582	2016-02-11 20:21:24 +00:00
Ahmed Bougacha	f8dfb47c02	[CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. llvm-svn: 260316	2016-02-09 22:54:12 +00:00
Sanjay Patel	73200f72de	[SelectionDAG] make getMemBasePlusOffset() accessible; NFCI I reinvented this functionality in http://reviews.llvm.org/D16828 because it was hidden away as a static function. The changes in x86 are not based on a complete audit. I suspect there are other possible uses there, and there are almost certainly more potential users in other targets. llvm-svn: 260295	2016-02-09 21:42:04 +00:00
Hans Wennborg	850ec6ca18	[X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532) This matches GCC and MSVC's behaviour, and saves on code size. We were already not extending i1 return values on x86_64 after r127766. This takes that patch further by applying it to x86 target as well, and also for i8 and i16. The ABI docs have been unclear about the required behaviour here. The new i386 psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return vales do not need to be extended beyond 8 bits. The x86_64 ABI doc is being updated to say the same [2]. Differential Revision: http://reviews.llvm.org/D16907 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ llvm-svn: 260133	2016-02-08 19:34:30 +00:00
Matt Arsenault	2bba779272	SelectionDAG: Lower some range metadata to AssertZext If a range has a lower bound of 0, add an AssertZext from the nearest floor power of two. This allows operations with some workitem intrinsics with known maximum ranges to use fast 24-bit multiplies. llvm-svn: 260109	2016-02-08 16:28:19 +00:00
Sanjoy Das	86d7d83f2a	[StatepointLower] Use None instead of Optional<int>() llvm-svn: 259956	2016-02-05 23:40:04 +00:00
Petar Jovanovic	23e44f5e39	[Power PC] softening long double type This patch implements softening of long double type (ppcf128) on ppc32 architecture and enables operations for this type for soft float. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D15811 llvm-svn: 259791	2016-02-04 14:43:50 +00:00
Sanjay Patel	e9fa3363b4	rangify; NFCI llvm-svn: 259722	2016-02-03 22:44:14 +00:00
Tim Shen	f99f0d5a7e	[SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior This patch consists of two parts: a performance fix in DAGCombiner.cpp and a correctness fix in SelectionDAG.cpp. The test case tests the bug that's uncovered by the performance fix, and fixed by the correctness fix. The performance fix keeps the containers required by the hasPredecessorHelper (which is a lazy DFS) and reuse them. Since hasPredecessorHelper is called in a loop, the overall efficiency reduced from O(n^2) to O(n), where n is the number of SDNodes. The correctness fix keeps iterating the neighbor list even if it's time to early return. It will return after finishing adding all neighbors to Worklist, so that no neighbors are discarded due to the original early return. llvm-svn: 259691	2016-02-03 20:58:55 +00:00
Eugene Zelenko	ecefe5a81f	Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes. Differential revision: http://reviews.llvm.org/D16793 llvm-svn: 259539	2016-02-02 18:20:45 +00:00
Balaram Makam	92431703d7	AArch64: Implement missed conditional compare sequences. Summary: This is an extension to the existing implementation of r242436 which restricts to only select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leafs. This will additionaly handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees. Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16291 llvm-svn: 259387	2016-02-01 19:13:07 +00:00
Tim Shen	3b428cb764	[SelectionDAG] Eliminate exponential behavior in WalkChainUsers llvm-svn: 259315	2016-01-31 03:59:34 +00:00
Matthias Braun	b30f2f5141	Avoid overly large SmallPtrSet/SmallSet These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. llvm-svn: 259283	2016-01-30 01:24:31 +00:00
Yaron Keren	eb2a25467e	Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240	2016-01-29 20:50:44 +00:00
David Majnemer	4543ff09a2	[X86] Don't transform X << 1 to X + X during type legalization While legalizing a 64-bit shift left by 1, the following occurs: We split the shift operand in half: a high half and a low half. We then create an ADDC with the low half and a ADDE with the high half + the carry bit from the ADDC. This is problematic if X is any_ext'd because the high half computation is now undef + undef + carry bit and there is no way to ensure that the two undef values had the same bitwise representation. This results in the lowest bit in the high half turning into garbage. Instead, do not try to turn shifts into arithmetic during type legalization. This fixes PR26350. llvm-svn: 259065	2016-01-28 18:20:05 +00:00
Junmo Park	b3327b7007	[DAGCombiner] Don't add volatile or indexed stores to ChainedStores Summary: findBetterNeighborChains does not handle volatile or indexed stores. However, it did not check when adding stores to ChainedStores. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D16463 llvm-svn: 259024	2016-01-28 06:23:33 +00:00
Benjamin Kramer	f9172fd4ac	Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/ It's a SelectionDAG thing, not a Target thing. llvm-svn: 258939	2016-01-27 16:32:26 +00:00
Chris Bieneman	e49730d4ba	Remove autoconf support Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861	2016-01-26 21:29:08 +00:00

1 2 3 4 5 ...

7543 Commits