llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Dardis	212cccb2f4	Reland "[SelectionDAG] Enable target specific vector scalarization of calls and returns" By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. The previous version of this patch had a "conditional move or jump depends on uninitialized value". Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 305083	2017-06-09 14:37:08 +00:00
Nirav Dave	43a4d8122f	Prevent RemoveDeadNodes from deleting an already deleted node. This protects against assertion errors like PR32659, which occur when a replacement deletes a node after it has been added to the list argument of RemoveDeadNodes. The specific failure from PR32659 does not currently happen, but it is still potentially possible. The underlying cause is that the callers of the changed function build up a list of nodes to delete after having moved their uses, and it is possible that a move of a later node will cause a previously deleted node to be deleted again. Reviewers: bkramer, spatel, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33731 llvm-svn: 305070	2017-06-09 12:57:35 +00:00
Eugene Zelenko	6ac7a34816	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 304954	2017-06-07 23:53:32 +00:00
Nirav Dave	772ea3ae1a	[DAG] Improve Store Merge candidate pruning. NFC. When merging stores whose values are the results of loads, only consider stores whose values come from loads from the same base. This fixes much of the longer compile times in PR33330. llvm-svn: 304934	2017-06-07 18:51:56 +00:00
Simon Pilgrim	be8866f691	[DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering. This will allow commutation of target-specific DAG nodes in future patches Differential Revision: https://reviews.llvm.org/D33882 llvm-svn: 304911	2017-06-07 14:05:04 +00:00
Sanjay Patel	2726ea2ac0	[DAG] remove duplicated code for isOnlyUsedInZeroEqualityComparison(); NFCI llvm-svn: 304822	2017-06-06 19:40:09 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Mandeep Singh Grang	5e1697ef28	[llvm] Remove double semicolons Reviewers: craig.topper, arsenm, mehdi_amini Reviewed By: mehdi_amini Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33924 llvm-svn: 304767	2017-06-06 05:08:36 +00:00
Davide Italiano	fb4d5c095b	[SelectionDAG] Update the dominator after splitting critical edges. Running `llc -verify-dom-info` on the attached testcase results in a crash in the verifier, due to a stale dominator tree. i.e. DominatorTree is not up to date! Computed: =============================-------------------------------- Inorder Dominator Tree: [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7} [2] %lor.lhs.false.i61.i.i.i {1,2} [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6} [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5} Actual: =============================-------------------------------- Inorder Dominator Tree: [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9} [2] %lor.lhs.false.i61.i.i.i {1,2} [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8} [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5} [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7} This is because in `SelectionDAGIsel` we split critical edges without updating the corresponding dominator for the function (and we claim in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved). We could either stop preserving the domtree in `getAnalysisUsage` or tell `splitCriticalEdge()` to update it. As the second option is easy to implement, that's the one I chose. Differential Revision: https://reviews.llvm.org/D33800 llvm-svn: 304742	2017-06-05 22:16:41 +00:00
Sanjay Patel	6350de76fa	[DAGCombine] Fix unchecked calls to DAGCombiner::ExtPromoteOperand Other calls to DAGCombiner::PromoteOperand check the result, but here it could cause an assertion in getNode. Falling back to any extend in this case instead of failing outright seems correct to me. No test case because: The failure was triggered by an out of tree backend. In order to trigger it, a backend would need to overload TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked illegal. In tree, only X86 overloads and sometimes returns true for MVT::i16 yet it marks setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16 , Legal);. Patch by Jacob Young! Differential Revision: https://reviews.llvm.org/D33633 llvm-svn: 304723	2017-06-05 17:01:10 +00:00
Galina Kistanova	bd79f73f02	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304635	2017-06-03 05:11:14 +00:00
Eugene Zelenko	c85638b29d	[CodeGen] Fix Windows builds which treat warnings as errors, broken in r304621. llvm-svn: 304627	2017-06-03 01:04:06 +00:00
Eugene Zelenko	167595ab51	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 304621	2017-06-03 00:22:41 +00:00
Philip Reames	b70cecd60a	[Statepoint] Be consistent about using deopt naming [NFCI] We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed. llvm-svn: 304607	2017-06-02 23:03:26 +00:00
Sanjay Patel	cdb5dad4cc	[TargetLowering] fix formatting; NFC llvm-svn: 304569	2017-06-02 17:35:02 +00:00
Amaury Sechet	437f7060fe	nits in TargetLowering.cpp . NFC llvm-svn: 304532	2017-06-02 09:18:18 +00:00
Max Kazantsev	4d8748a987	[SelectionDAG] Get rid of recursion in findNonImmUse The recursive implementation of findNonImmUse may overflow stack on extremely long use chains. This patch replaces it with an equivalent iterative implementation. Reviewed By: bogner Differential Revision: https://reviews.llvm.org/D33775 llvm-svn: 304522	2017-06-02 07:11:00 +00:00
Nirav Dave	4952871630	[SDAG] Fix CombineTo ordering in visitZERO_EXTEND and visitSIGN_EXTEND Reorder CombineTo Calls to prevent references to stale/deleted SDNodes which caused undue assertions. Reviewers: dbabokin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D31625 llvm-svn: 304460	2017-06-01 19:33:50 +00:00
Matt Arsenault	b083570532	DAG: Remove pointless type check These are only integer operations. llvm-svn: 304417	2017-06-01 14:49:46 +00:00
Amaury Sechet	c84cc230b3	Only generate addcarry node when it is legal. Summary: This is a problem uncovered by stage2 testing. ADDCARRY ends up being generated on targets that do not support it. The patch that introduced the problem has other patches layered on top of it, so we want to fix the issue rather than revert it to avoid creating a lot of churn. A regression test will be added shortly, but this is committed as is in order to get the build back to green promptly. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33770 llvm-svn: 304409	2017-06-01 12:03:16 +00:00
Amaury Sechet	251ea8a4f8	Do not legalize large setcc with setcce, introduce setcccarry and do it with usubo/setcccarry. Summary: This is a continuation of the work started in D29872 . Passing the carry down as a value rather than as a glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unnecessary and we can start the removal process. This patch only introduces the optimizations strictly required to get the same level of optimization as was available before, nothing more. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33374 llvm-svn: 304404	2017-06-01 11:14:17 +00:00
Amaury Sechet	9c5d1e966b	[DAGCombine] Refactor common addcarry pattern. Summary: This pattern is not very useful per se, but it exposes optimizations for other patterns that wouldn't kick in otherwise. It's very common and worth optimizing for. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32756 llvm-svn: 304402	2017-06-01 10:48:04 +00:00
Amaury Sechet	2e43cb6d03	[DAGCombine] (add/uaddo X, Carry) -> (addcarry X, 0, Carry) Summary: This enables further transforms. Depends on D32916 Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32925 llvm-svn: 304401	2017-06-01 10:42:39 +00:00
Nirav Dave	3424373f30	[ScheduleDAG] Deal with already scheduled loads in ScheduleDAG. Summary: If we attempt to unfold an SUnit in ScheduleDAG that results in finding an already scheduled load, we should abort the unfold as it will not improve scheduling. This fixes PR32610. Reviewers: jmolloy, sunfish, bogner, spatel Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D32911 llvm-svn: 304321	2017-05-31 18:43:17 +00:00
Nirav Dave	7c70fddba6	[DAG] Avoid use of stale store. Correct references to alignment of store which may be deleted in a previous iteration of merge. Instead use first store that would be merged. Corrects pr33172's use-after-poison caught by ASan. Reviewers: spatel, hfinkel, RKSimon Reviewed By: RKSimon Subscribers: thegameg, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33686 llvm-svn: 304299	2017-05-31 13:36:17 +00:00
Craig Topper	5fd588be34	[SelectionDAG] Remove special case for ISD::FPOWI from the strict FP intrinsic handling. This code was compensating for FPOWI defaulting to Legal and many targets not changing it to Expand. This was fixed in r304215 to default to Expand so this special handling should no longer be necessary. llvm-svn: 304221	2017-05-30 17:12:18 +00:00
Craig Topper	f6d4dc5b4a	[SelectionDAG] Set ISD::FPOWI to Expand by default Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215	2017-05-30 15:27:55 +00:00
Sanjay Patel	51152a3727	[DAGCombiner] fix load narrowing transform to exclude loads with extension The extending load possibility was missed in: https://reviews.llvm.org/rL304072 We might want to handle this cases as a follow-up, but bailing out for now to avoid miscompiling. llvm-svn: 304153	2017-05-29 13:24:58 +00:00
Sanjay Patel	33f4a97287	[DAGCombiner] use narrow load to avoid vector extract If we have (extract_subvector(load wide vector)) with no other users, that can just be (load narrow vector). This is intentionally conservative. Follow-ups may loosen the one-use constraint to account for the extract cost or just remove the one-use check. The memop chain updating is based on code that already exists multiple times in x86 lowering, so that should be pulled into a helper function as a follow-up. Background: this is a potential improvement noticed via regressions caused by making x86's peekThroughBitcasts() not loop on consecutive bitcasts (see comments in D33137). Differential Revision: https://reviews.llvm.org/D33578 llvm-svn: 304072	2017-05-27 14:07:03 +00:00
Benjamin Kramer	debb3c35e0	Make helper functions static. NFC. llvm-svn: 304029	2017-05-26 20:09:00 +00:00
Sanjay Patel	ec13ebf2c8	[DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790) In the best case: extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN ...we kill all of the extract/concat and just have narrow binops remaining. If only one of the binop operands is amenable, this transform is still worthwhile because we kill some of the extract/concat. Optional bitcasting makes the code more complicated, but there doesn't seem to be a way to avoid that. The TODO about extending to more than bitwise logic is there because we really will regress several x86 tests including madd, psad, and even a plain integer-multiply-by-2 or shift-left-by-1. I don't think there's anything fundamentally wrong with this patch that would cause those regressions; those folds are just missing or brittle. If we extend to more binops, I found that this patch will fire on at least one non-x86 regression test. There's an ARM NEON test in test/CodeGen/ARM/coalesce-subregs.ll with a pattern like: t5: v2f32 = vector_shuffle<0,3> t2, t4 t6: v1i64 = bitcast t5 t8: v1i64 = BUILD_VECTOR Constant:i64<0> t9: v2i64 = concat_vectors t6, t8 t10: v4f32 = bitcast t9 t12: v4f32 = fmul t11, t10 t13: v2i64 = bitcast t12 t16: v1i64 = extract_subvector t13, Constant:i32<0> There was no functional change in the codegen from this transform from what I could see though. For the x86 test changes: 1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case, but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops, there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op. SSE/AVX2/AVX512 are not affected which is expected because only AVX1 has the extract/concat ops to match the pattern. 2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract. Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026 3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT. 4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction count by one in each case by eliminating two insert/extract while adding one narrower logic op. https://bugs.llvm.org/show_bug.cgi?id=32790 Differential Revision: https://reviews.llvm.org/D33137 llvm-svn: 303997	2017-05-26 15:33:18 +00:00
Nirav Dave	689709c928	[DAG] Move legal type checks in store merge to be checked only on non-legal cases. NFC. llvm-svn: 303994	2017-05-26 14:37:27 +00:00
John Brawn	9009d2905d	[ARM] Fix lowering of misaligned memcpy/memset Currently getOptimalMemOpType returns i32 for large enough sizes without checking for alignment, leading to poor code generation when misaligned accesses aren't permitted as we generate a word store then later split it up into byte stores. This means we inadvertently go over the MaxStoresPerMemcpy limit and for memset we splat the memset value into a word then immediately split it up again. Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type to use, but also fix a bug there where it wasn't correctly checking if misaligned memory accesses are allowed. Differential Revision: https://reviews.llvm.org/D33442 llvm-svn: 303990	2017-05-26 13:59:12 +00:00
Andrew Kaylor	f466001eef	Add constrained intrinsics for some libm-equivalent operations Differential revision: https://reviews.llvm.org/D32319 llvm-svn: 303922	2017-05-25 21:31:00 +00:00
Adrian Prantl	f062192632	Fix SelectionDAGBuilder::getDbgValue to not expect DW_OP_deref on FI vars This fixes an oversight in r300522, which changed alloca dbg.values to no longer emit a DW_OP_deref. The array.ll testcase was regenerated from source. Fixes PR33166: https://bugs.llvm.org/show_bug.cgi?id=33166 llvm-svn: 303897	2017-05-25 18:54:10 +00:00
Nirav Dave	7a8717d216	[DAG] Prevent crashes when merging constant stores with high-bit set. NFC. llvm-svn: 303802	2017-05-24 19:56:39 +00:00
Tim Northover	8c605c0eda	Revert LLVM changes for "Sema: allow imaginary constants via GNU extension if UDL overloads not present." The changes accidentally crept into a Clang commit I was making. llvm-svn: 303697	2017-05-23 21:53:11 +00:00
Tim Northover	6b5eceac2e	Sema: allow imaginary constants via GNU extension if UDL overloads not present. C++14 added user-defined literal support for complex numbers so that you can write something like "complex<double> val = 2i". However, there is an existing GNU extension supporting this syntax and interpreting the result as a _Complex type. This changes parsing so that such literals are interpreted in terms of C++14's operators if an overload is present but otherwise falls back to the original GNU extension. llvm-svn: 303694	2017-05-23 21:41:49 +00:00
Nirav Dave	6c910c0dd8	[DAG] Add AddressSpace parameter to canMergeStoresTo. NFC. llvm-svn: 303673	2017-05-23 18:53:02 +00:00
Nirav Dave	3b4f7cc0b3	[DAG] Add canMergeStoresTo predicate checks. NFCI. Propagate canMergeStoresTo checks to missing cases in StoreMerge. llvm-svn: 303668	2017-05-23 18:33:09 +00:00
Craig Topper	7e0aeeb884	[KnownBits] Use !hasConflict() in asserts in place of Zero & One == 0 or similar. NFC llvm-svn: 303614	2017-05-23 07:18:37 +00:00
Nirav Dave	e00da22ef3	[DAG] Rework store merge to loop on load candidates. NFCI. Continue to consider remaining candidate merges until all possible merges have been considered. llvm-svn: 303560	2017-05-22 15:33:47 +00:00
Matthias Braun	50ec0b5dce	SimplifyLibCalls: Optimize wcslen Refactor the strlen optimization code to work for both strlen and wcslen. This especially helps with programs in the wild where people pass L"string"s to const std::wstring& function parameters and the wstring constructor gets inlined. This also fixes a lingering API problem/bug in getConstantStringInfo() where zeroinitializers would always give you an empty string (without a length) back regardless of the actual length of the initializer, which did not work well in the TrimAtNul==false case, causing the PR mentioned below. Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG memcpy lowering and may lead to some cases for out-of-bounds zeroinitializer accesses not getting optimized anymore. So some code with UB may produce out of bound memory reads now instead of just producing zeros. The refactoring "accidentally" fixes http://llvm.org/PR32124 Differential Revision: https://reviews.llvm.org/D32839 llvm-svn: 303461	2017-05-19 22:37:09 +00:00
Amaury Sechet	77cfb4a85f	[DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X) Summary: While this makes some case better and some case worse - so it's unclear if it is a worthy combine just by itself - this is a useful canonicalisation. As per discussion in D32756 . Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32916 llvm-svn: 303441	2017-05-19 18:20:44 +00:00
Craig Topper	8a950275f7	[Statistics] Add a method to atomically update a statistic that contains a maximum Summary: There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways: MaxNumFoo = std::max(MaxNumFoo, NumFoo); or MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo; The first version reads from MaxNumFoo one time and unconditionally writes to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes. This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again. This spawned from an audit I'm trying to do of all places we use the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ Reviewers: dberlin, chandlerc, hfinkel, dblaikie Reviewed By: chandlerc Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D33301 llvm-svn: 303318	2017-05-18 00:51:39 +00:00
Nirav Dave	da8f221273	Elide stores which are overwritten without being observed. Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This causes small improvements when dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 llvm-svn: 303198	2017-05-16 19:43:56 +00:00
Nirav Dave	cfd357a61a	[DAG] Prune deleted nodes in TokenFactor Fix visitTokenFactor to correctly remove deleted nodes. NFC. llvm-svn: 303181	2017-05-16 15:49:02 +00:00
Peter Collingbourne	6f0ecca3b5	IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134	2017-05-16 00:39:01 +00:00
Simon Pilgrim	754c1618ec	[SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts in ComputeNumSignBits llvm-svn: 302997	2017-05-13 22:10:58 +00:00
Simon Pilgrim	7666afd042	[SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits llvm-svn: 302993	2017-05-13 19:57:10 +00:00
Craig Topper	9fe357971c	[ValueTracking] Remove const_casts on several calls to computeKnownBits and ComputeSignBit. NFC llvm-svn: 302991	2017-05-13 17:22:16 +00:00
Tim Shen	10c64e6aea	[PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC. Summary: Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc., because those are not defined for b > sizeof(a) * 8, even after some of the combiners run. However, PPCISD::SHL defines that behavior (as the instructions themselves). Move the combination to the backend. The tests in shift_mask.ll still pass. Reviewers: echristo, hfinkel, efriedma, iteratee Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D33076 llvm-svn: 302937	2017-05-12 19:25:37 +00:00
Craig Topper	8df66c602a	[KnownBits] Add bit counting methods to KnownBits struct and use them where possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925	2017-05-12 17:20:30 +00:00
Simon Pilgrim	eabf6fc4b5	[DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI. llvm-svn: 302907	2017-05-12 15:26:50 +00:00
Simon Pilgrim	f01c301f72	[DAGCombine] Use SelectionDAG::getZExtOrTrunc helper. NFCI. llvm-svn: 302897	2017-05-12 13:22:12 +00:00
Simon Pilgrim	a6ed1b2f12	Use SDValue::getOperand() helper. NFCI. llvm-svn: 302896	2017-05-12 13:20:24 +00:00
Vadzim Dambrouski	38e30197c3	[MSP430] Generate EABI-compliant libcalls Updates the MSP430 target to generate EABI-compatible libcall names. As a byproduct, adjusts the hardware multiplier options available in the MSP430 target, adds support for promotion of the ISD::MUL operation for 8-bit integers, and correctly marks R11 as used by call instructions. Patch by Andrew Wygle. Differential Revision: https://reviews.llvm.org/D32676 llvm-svn: 302820	2017-05-11 19:56:14 +00:00
Simon Pilgrim	6faddcbd07	[DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI. llvm-svn: 302808	2017-05-11 16:40:44 +00:00
Simon Pilgrim	a4a13a0da0	Strip trailing whitespace. NFCI. llvm-svn: 302784	2017-05-11 10:03:05 +00:00
David L. Jones	bbd97d273b	Revert "[SDAG] Relax conditions under stores of loaded values can be merged" This reverts r302712. The change fails with ASAN enabled: ERROR: AddressSanitizer: use-after-poison on address ... at ... READ of size 2 at ... thread T0 #0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42 #1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3 #2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17 #3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7 Reviewers: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33081 llvm-svn: 302746	2017-05-10 23:56:21 +00:00
Nirav Dave	a38c049fc5	[SDAG] Relax conditions under stores of loaded values can be merged Summary: Allow consecutive stores whose values come from consecutive loads to be merged in the presence of other uses of the loads. Previously this was disallowed as in general the merged load cannot be shared with the other uses. Merging N stores into 1 may cause as many as N redundant loads. However in the context of caching this should have negligible effect on memory pressure and reduce instruction count, making it almost always a win. Fixes PR32086. Reviewers: spatel, jyknight, andreadb, hfinkel, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30471 llvm-svn: 302712	2017-05-10 19:53:41 +00:00
Amaury Sechet	197685c6d8	Small refactoring in DAGCombine. NFC llvm-svn: 302699	2017-05-10 17:58:28 +00:00
Simon Pilgrim	cd4d913336	[DAGCombiner] Dropped explicit (sra 0, x) -> 0 and (sra -1, x) -> 0 folds. These are both handled (and tested) by the earlier ComputeNumSignBits == EltSizeInBits fold. llvm-svn: 302651	2017-05-10 13:06:26 +00:00
Simon Pilgrim	c29af824bf	[DAGCombiner] Add vector support to fold (shl/srl 0, x) -> 0 llvm-svn: 302641	2017-05-10 12:34:27 +00:00
Ahmed Bougacha	604526fe87	[CodeGen] Don't require AA in SDAGISel at -O0. Before r247167, the pass manager builder controlled which AA implementations were used, exporting them all in the AliasAnalysis analysis group. Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA implementations if made available in the pass pipeline. But regardless, SDAGISel is required at O0, and really doesn't need to be doing fancy optimizations based on useful AA results. Don't require AA at CodeGenOpt::None, and only use it otherwise. This does have a functional impact (and one testcase is pessimized because we can't reuse a load). But I think that's desirable no matter what. Note that this alone doesn't result in less DT computations: TwoAddress was previously able to reuse the DT we computed for SDAG. That will be fixed separately. Differential Revision: https://reviews.llvm.org/D32766 llvm-svn: 302611	2017-05-10 00:39:30 +00:00
Zvi Rackover	b483e28c77	DAGCombine: Combine shuffles of splat-shuffles Summary: Reapply r299047, but this time handle correctly splat-masks with undef elements. Reviewers: spatel, RKSimon, eli.friedman, andreadb Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31961 llvm-svn: 302583	2017-05-09 20:25:38 +00:00
Reid Kleckner	3a363fff7e	Re-land "Use the frame index side table for byval and inalloca arguments" This re-lands r302483. It was not the cause of PR32977. llvm-svn: 302544	2017-05-09 16:02:20 +00:00
Reid Kleckner	84075fddff	Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare" This re-lands commit r302461. It was not the cause of PR32977. llvm-svn: 302543	2017-05-09 16:01:47 +00:00
Serge Pavlov	d526b13e61	Add extra operand to CALLSEQ_START to keep frame part set up previously Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527	2017-05-09 13:35:13 +00:00
Amara Emerson	cf9daa33a7	Introduce experimental generic intrinsics for horizontal vector reductions. - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514	2017-05-09 10:43:25 +00:00
Reid Kleckner	41bb94233b	Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare" This reverts commit r302461. It appears to be causing failures compiling gtest with debug info on the Linux sanitizer bot. I was unable to reproduce the failure locally, however. llvm-svn: 302504	2017-05-09 01:57:44 +00:00
Reid Kleckner	9f29914d40	Revert "Use the frame index side table for byval and inalloca arguments" This reverts r302483 and its follow-up fix. llvm-svn: 302493	2017-05-09 01:14:39 +00:00
Reid Kleckner	45efcf0c96	Use the frame index side table for byval and inalloca arguments Summary: For inalloca functions, this is a very common code pattern: %argpack = type <{ i32, i32, i32 }> define void @f(%argpack* inalloca %args) { entry: %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0 %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1 %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2 tail call void @llvm.dbg.declare(metadata i32* %a, ... "a") tail call void @llvm.dbg.declare(metadata i32* %c, ... "b") tail call void @llvm.dbg.declare(metadata i32* %b, ... "c") Even though these GEPs can be simplified to a constant offset from EBP or RSP, we don't do that at -O0, and each GEP is computed into a register. Registers used to compute argument addresses are typically spilled and clobbered very quickly after the initial computation, so live debug variable tracking loses information very quickly if we use DBG_VALUE instructions. This change moves processing of dbg.declare between argument lowering and basic block isel, so that we can ask if an argument has a frame index or not. If the argument lives in a register as is the case for byval arguments on some targets, then we don't put it in the side table and during ISel we emit DBG_VALUE instructions. Reviewers: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32980 llvm-svn: 302483	2017-05-08 23:20:27 +00:00
Reid Kleckner	bf828eedb4	Don't add DBG_VALUE instructions for static allocas in dbg.declare Summary: An llvm.dbg.declare of a static alloca is always added to the MachineFunction dbg variable map, so these values are entirely redundant. They survive all the way through codegen to be ignored by DWARF emission. Effectively revert r113967 Two bugpoint-reduced test cases from 2012 broke as a result of this change. Despite my best efforts, I haven't been able to rewrite the test case using dbg.value. I'm not too concerned about the lost coverage because these were reduced from the test-suite, which we still run. Reviewers: aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32920 llvm-svn: 302461	2017-05-08 19:58:15 +00:00
Dean Michael Berris	9bcaed867a	[XRay] Custom event logging intrinsic This patch introduces an LLVM intrinsic and a target opcode for custom event logging in XRay. Initially, its use case will be to allow users of XRay to log some type of string ("poor man's printf"). The target opcode compiles to a noop sled large enough to enable calling through to a runtime-determined relative function call. At runtime, when XRay is enabled, the sled is replaced by compiler-rt with a trampoline to the logic for creating the custom log entries. Future patches will implement the compiler-rt parts and clang-side support for emitting the IR corresponding to this intrinsic. Reviewers: timshen, dberris Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D27503 llvm-svn: 302405	2017-05-08 05:45:21 +00:00
Simon Pilgrim	2c15447f99	[DAGCombiner] If ISD::ABS is legal/custom, use it directly instead of canonicalizing first. Remove an extra canonicalization step if ISD::ABS is going to be used anyway. Updated x86 abs combine to check that we are lowering from both canonicalizations. llvm-svn: 302337	2017-05-06 13:44:42 +00:00
Reid Kleckner	ac1a97b32f	Simplify dbg.value handling in SDISel with early returns No functional change other than improving dbgs logging accuracy on constant dbg values. Previously we would add things like "i32 42" as debug values, and then log that we were dropping the debug info, which is silly. Delete some dead code that was checking for static allocas. This remained after r207165, but served no purpose. Currently, static alloca dbg.values are always sent through the DanglingDebugInfoMap, and are usually made valid the first time the alloca is used. llvm-svn: 302267	2017-05-05 18:30:34 +00:00
Craig Topper	f0aeee01c3	[KnownBits] Add wrapper methods for setting and clearing all bits in the underlying APInts in KnownBits. This adds routines for resetting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262	2017-05-05 17:36:09 +00:00
Chad Rosier	84a238dd62	[DAGCombine] Transform (fadd A, (fmul B, -2.0)) -> (fsub A, (fadd B, B)). Differential Revision: http://reviews.llvm.org/D32596 llvm-svn: 302153	2017-05-04 14:14:44 +00:00
Krzysztof Parzyszek	41b6e14dc5	Refactoring with range-based for, NFC Patch by Wei-Ren Chen. Differential Revision: https://reviews.llvm.org/D32682 llvm-svn: 302148	2017-05-04 13:35:17 +00:00
Craig Topper	d4d09fd73d	[SelectionDAG] Improve known bits support for CTPOP. This is based on the same concept from ValueTracking's version of computeKnownBits. llvm-svn: 302110	2017-05-04 04:33:27 +00:00
Craig Topper	d938fd1397	[KnownBits] Add zext, sext, and trunc methods to KnownBits This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible. Differential Revision: https://reviews.llvm.org/D32784 llvm-svn: 302088	2017-05-03 22:07:25 +00:00
Sanjay Patel	e1cf61c69f	[TargetLowering] use isSubsetOf in SimplifyDemandedBits; NFCI This is the DAG equivalent of https://reviews.llvm.org/D32255 , which will hopefully be committed again. The functionality (preferring a 'not' op) is already here in the DAG, so this is just intended to be a clean-up and performance improvement. llvm-svn: 302087	2017-05-03 21:55:34 +00:00
Amaury Sechet	666c705953	[DAGCombine] (addcarry (add|uaddo X, Y), 0, Carry) -> (addcarry X, Y, Carry) Summary: Do the transform when the carry isn't used. It's a pattern exposed when legalizing large integers. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32755 llvm-svn: 302047	2017-05-03 16:28:10 +00:00
Tim Shen	e59d06fe78	[PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction Summary: This is the corresponding llvm change to D28037 to ensure no performance regression. Reviewers: bogner, kbarton, hfinkel, iteratee, echristo Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28329 llvm-svn: 301990	2017-05-03 00:07:02 +00:00
Amaury Sechet	106a7eab84	[DAGCombine] (uaddo X, (addcarry Y, 0, Carry)) -> (addcarry X, Y, Carry) Summary: This is a common pattern that arises when legalizing large integer operations. Only do it when Y + 1 cannot overflow, as this would change the carry behavior of uaddo. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32687 llvm-svn: 301922	2017-05-02 14:15:48 +00:00
Amaury Sechet	153911f71d	[DAGCombine] (add X, (addcarry Y, 0, Carry)) -> (addcarry X, Y, Carry) Summary: Common pattern when legalizing large integer operations. Similar to D32687, when the carry isn't used. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Differential Revision: https://reviews.llvm.org/D32738 llvm-svn: 301919	2017-05-02 13:34:25 +00:00
Simon Pilgrim	89ad89cc73	[SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31088) PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats. This patch adds support for extension/truncation for both integer and float types. Differential Revision: https://reviews.llvm.org/D32391 llvm-svn: 301910	2017-05-02 10:33:08 +00:00
Simon Pilgrim	8deb87a6c0	[DAGCombiner] Improve MatchBswapHword logic (PR31357) The existing code only looks at half of the tree when matching bswap + rol patterns ending in an OR tree (as opposed to a cascade). Patch originally introduced by Jim Lewis. Submitted on behalf of Dinar Temirbulatov. Differential Revision: https://reviews.llvm.org/D32039 llvm-svn: 301907	2017-05-02 10:16:19 +00:00
Craig Topper	6b1b630a98	[SelectionDAG] Use known ones to provide a better bound for the known zeros for CTTZ/CTLZ operations. This is the SelectionDAG version of D32521. If we know where at least one 1 is located in the input to these intrinsics, we can place an upper bound on the number of bits needed to represent the count and thus increase the number of known zeros in the output. I think we can also refine this further for CTTZ_UNDEF/CTLZ_UNDEF by assuming that the answer will never be BitWidth. I've left this out for now because it caused other test failures across multiple targets, usually because of turning ADD into OR based on this new information. I'll fix CTPOP in a future patch. Differential Revision: https://reviews.llvm.org/D32692 llvm-svn: 301806	2017-05-01 16:08:06 +00:00
Amara Emerson	d28f0cd448	Generalize the specialized flag-carrying SDNodes by moving flags into SDNode. This removes BinaryWithFlagsSDNode, and flags are now all passed by value. Differential Revision: https://reviews.llvm.org/D32527 llvm-svn: 301803	2017-05-01 15:17:51 +00:00
Sanjay Patel	ad13826aea	[DAGCombiner] shrink/widen a vselect to match its condition operand size (PR14657) We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check legality of the select that we want to produce. A few things to note: 1. We can't wait until after legalization and do this generically because (at least in the x86 tests from PR14657), we'll have PACKSS and bitcasts in the pattern. 2. This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but that requires a closer look to make sure we don't end up worse. 3. There's a 'vblendv' opportunity that we're missing that results in andn/and/or in some cases. That should be fixed next. 4. I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple legal vector sizes, but if there are other targets like that, we should add more tests. 5. There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests): despite IR that is terrible for the target, this patch allows us to generate the optimal loop code because something post-ISEL is hoisting the splat extends above the vector loops. Differential Revision: https://reviews.llvm.org/D32620 llvm-svn: 301781	2017-04-30 22:44:51 +00:00
Amaury Sechet	8ac81f3924	Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry Summary: As per discussion on how to get better codegen on large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allows for more flexibility. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29872 llvm-svn: 301775	2017-04-30 19:24:09 +00:00
Craig Topper	778f57b4f1	[APInt] Replace calls to setBits with more specific calls to setBitsFrom and setLowBits where possible. llvm-svn: 301768	2017-04-30 07:44:58 +00:00
Craig Topper	ca48af3c87	[KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit. Reviewers: RKSimon, spatel, davide Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32651 llvm-svn: 301747	2017-04-29 16:43:11 +00:00
Reid Kleckner	859f8b544a	Make getParamAlignment use argument numbers The method is called "get Param Alignment", and is only used for return values exactly once, so it should take argument indices, not attribute indices. Avoids confusing code like: IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError); Alignment = CS->getParamAlignment(ArgIdx + 1); Add getRetAlignment to handle the one case in Value.cpp that wants the return value alignment. This is a potentially breaking change for out-of-tree backends that do their own call lowering. llvm-svn: 301682	2017-04-28 20:34:27 +00:00
Matthias Braun	744c215e29	TargetLowering: Add finalizeLowering() function; NFC Adds a new method finalizeLowering to TargetLoweringBase. This is in preparation for an upcoming commit. This function is meant for target specific adjustments to MachineFrameInfo or register reservations. Move the freezeRegisters() and the hasCopyImplyingStackAdjustment() handling into the new function to prove the concept. As an added bonus, GlobalISel no longer misses the hasCopyImplyingStackAdjustment() handling with this. Differential Revision: https://reviews.llvm.org/D32621 llvm-svn: 301679	2017-04-28 20:25:05 +00:00
Reid Kleckner	6652a52e2b	Use Argument::hasAttribute and AttributeList::ReturnIndex more This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666	2017-04-28 18:37:16 +00:00
Craig Topper	24db6b800f	[APInt] Add clearSignBit method. Use it and setSignBit in a few places. NFCI llvm-svn: 301656	2017-04-28 16:58:05 +00:00
Jun Bum Lim	919f9e8d65	[InlineCost] Improve the cost heuristic for Switch Summary: The motivation example is like below which has 13 cases but only 2 distinct targets ``` lor.lhs.false2: ; preds = %if.then switch i32 %Status, label %if.then27 [ i32 -7012, label %if.end35 i32 -10008, label %if.end35 i32 -10016, label %if.end35 i32 15000, label %if.end35 i32 14013, label %if.end35 i32 10114, label %if.end35 i32 10107, label %if.end35 i32 10105, label %if.end35 i32 10013, label %if.end35 i32 10011, label %if.end35 i32 7008, label %if.end35 i32 7007, label %if.end35 i32 5002, label %if.end35 ] ``` which is compiled into a balanced binary tree like this on AArch64 (similar on X86) ``` .LBB853_9: // %lor.lhs.false2 mov w8, #10012 cmp w19, w8 b.gt .LBB853_14 // BB#10: // %lor.lhs.false2 mov w8, #5001 cmp w19, w8 b.gt .LBB853_18 // BB#11: // %lor.lhs.false2 mov w8, #-10016 cmp w19, w8 b.eq .LBB853_23 // BB#12: // %lor.lhs.false2 mov w8, #-10008 cmp w19, w8 b.eq .LBB853_23 // BB#13: // %lor.lhs.false2 mov w8, #-7012 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_14: // %lor.lhs.false2 mov w8, #14012 cmp w19, w8 b.gt .LBB853_21 // BB#15: // %lor.lhs.false2 mov w8, #-10105 add w8, w19, w8 cmp w8, #9 // =9 b.hi .LBB853_17 // BB#16: // %lor.lhs.false2 orr w9, wzr, #0x1 lsl w8, w9, w8 mov w9, #517 and w8, w8, w9 cbnz w8, .LBB853_23 .LBB853_17: // %lor.lhs.false2 mov w8, #10013 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_18: // %lor.lhs.false2 mov w8, #-7007 add w8, w19, w8 cmp w8, #2 // =2 b.lo .LBB853_23 // BB#19: // %lor.lhs.false2 mov w8, #5002 cmp w19, w8 b.eq .LBB853_23 // BB#20: // %lor.lhs.false2 mov w8, #10011 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_21: // %lor.lhs.false2 mov w8, #14013 cmp w19, w8 b.eq .LBB853_23 // BB#22: // %lor.lhs.false2 mov w8, #15000 cmp w19, w8 b.ne .LBB853_3 ``` However, the inline cost model estimates the cost to be linear with the number of distinct targets and the cost of the above switch is just 2 InstrCosts. 
The function containing this switch is then inlined about 900 times. This change uses the general way of switch lowering for the inline heuristic. It estimates the number of case clusters with the suitability check for a jump table or bit test. Considering the binary search tree built for the clusters, this change modifies the model to be linear with the size of the balanced binary tree. The model is off by default for now: -inline-generic-switch-cost=false This change was originally proposed by Haicheng in D29870. Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier Reviewed By: hans Subscribers: joerg, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D31085 llvm-svn: 301649	2017-04-28 16:04:03 +00:00

8347 Commits