llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	718af2f817	Revert r333268: [IPSCCP] Use PredicateInfo to propagate facts from... Reverting this to see if this is causing the failures of the clang-with-thin-lto-ubuntu bot. [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 333323	2018-05-25 23:32:02 +00:00
Florian Hahn	b4a70b9f47	[IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 333268	2018-05-25 11:12:33 +00:00
Chandler Carruth	e6c30fdda7	Restore the LoopInstSimplify pass, reverting r327329 that removed it. The plan had always been to move towards using this rather than so much in-pass simplification within the loop pipeline, but we never got around to it.... until only a couple months after it was removed due to disuse. =/ This commit is just a pure revert of the removal. I will add tests and do some basic cleanup in follow-up commits. Then I'll wire it into the loop pass pipeline. Differential Revision: https://reviews.llvm.org/D47353 llvm-svn: 333250	2018-05-25 01:32:36 +00:00
Jun Bum Lim	dfbe6fa832	[LICM] Preserve DT and LoopInfo specifically Summary: In LICM, CFG could be changed in splitPredecessorsOfLoopExit(), which update only DT and LoopInfo. Therefore, we should preserve only DT and LoopInfo specifically, instead of all analyses that depend on the CFG (setPreservesCFG()). This change should fix PR37323. Reviewers: uabelho, davide, dberlin, Ka-Ka Reviewed By: dberlin Subscribers: mzolotukhin, bjope, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D46775 llvm-svn: 333198	2018-05-24 15:58:34 +00:00
Chad Rosier	274d72faad	[InstCombine] Combine XOR and AES instructions on ARM/ARM64. The ARM/ARM64 AESE and AESD instructions have a builtin XOR as the first step in the instruction. Therefore, if the AES key is zero and the AES data was previously XORed, it can be combined into a single instruction. Differential Revision: https://reviews.llvm.org/D47239 Patch by Michael Brase! llvm-svn: 333193	2018-05-24 15:26:42 +00:00
Andrei Elovikov	d34b765cb2	[NFC][VPlan] Wrap PlainCFGBuilder with an anonymous namespace. Summary: It's internal to the VPlanHCFGBuilder and should not be visible outside of its translation unit. Reviewers: dcaballe, fhahn Reviewed By: fhahn Subscribers: rengolin, bollu, tschuett, llvm-commits, rkruppe Differential Revision: https://reviews.llvm.org/D47312 llvm-svn: 333187	2018-05-24 14:31:00 +00:00
Karl-Johan Karlsson	478232d52f	[NaryReassociate] Detect deleted instr with WeakVH Summary: If NaryReassociate succeeds it will, when replacing the old instruction with the new instruction, also recursively delete trivially dead instructions from the old instruction. However, if the input to the NaryReassociate pass contains dead code it is not safe to recursively delete trivially dead instructions as it might lead to deleting the newly created instruction. This patch will fix the problem by using WeakVH to detect this rare case, when the newly created instruction is dead, and it will then restart the basic block iteration from the beginning. This fixes pr37539 Reviewers: tra, meheff, grosser, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47139 llvm-svn: 333155	2018-05-24 06:09:02 +00:00
Changpeng Fang	5f9154618e	StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly Summary: StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get the order. The only problem with it is that sometimes backedges for outer loops will be visited before backedges for inner loops. To solve this problem, a loop depth based approach has been used to make sure all blocks in this loop has been visited before moving on to outer loop. However, we found a problem for a SubRegion which is a loop itself: --> BB1 --> BB2 --> BB3 --> In this case, BB2 is a SubRegion (loop), and thus its loopdepth is different than that of BB1 and BB3. This fact will lead BB2 to be placed in the wrong order. In this work, we treat the SubRegion as a special case and use its exit block to determine the loop and its depth to guard the sorting. Reviewers: arsenm, jlebar Differential Revision: https://reviews.llvm.org/D46912 llvm-svn: 333111	2018-05-23 18:34:48 +00:00
Roman Lebedev	6b6c553bb8	[InstCombine] Fold unfolded masked merge pattern with variable mask! Summary: Finally fixes [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]]. Now that the backend is all done, we can finally fold it! The canonical unfolded masked merge pattern is `(x & m) | (y & ~m)` There is a second, equivalent variant: `(x | ~m) & (y | m)` Only one of them (the or-of-and's i think) is canonical. And if the mask is not a constant, we should fold it to: `((x ^ y) & m) ^ y` https://rise4fun.com/Alive/ndQw Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: nicholas, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D46814 llvm-svn: 333106	2018-05-23 17:47:52 +00:00
Jakub Kuderski	ef33edd9b5	[Dominators] Add PDT constructor from Function Summary: This patch adds a PDT constructor from Function and lets codes previously using a local class to do this use PostDominatorTree class directly. Reviewers: davide, kuhar, grosser, dberlin Reviewed By: kuhar Author: NutshellySima Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46709 llvm-svn: 333102	2018-05-23 17:29:21 +00:00
Craig Topper	3b768e8602	[InstCombine] Negate ABS/NABS patterns by swapping the select operands to remove the negation Differential Revision: https://reviews.llvm.org/D47236 llvm-svn: 333101	2018-05-23 17:29:03 +00:00
Nicola Zaghen	03d0b91f43	Remove DEBUG macro. Now that the LLVM_DEBUG() macro landed on the various sub-projects the DEBUG macro can be removed. Also change the new uses of DEBUG to LLVM_DEBUG. Differential Revision: https://reviews.llvm.org/D46952 llvm-svn: 333091	2018-05-23 15:09:29 +00:00
Max Kazantsev	d99f3bacb4	[LoopUnswitch] Fix SCEV invalidation in unswitching Loop unswitching makes substantial changes to a loop that can also affect cached SCEV info in its outer loops as well, but it only cares to invalidate SCEV cache for the innermost loop in case of full unswitching and does not invalidate anything at all in case of trivial unswitching. As a result, we may end up with incorrect data in cache. Differential Revision: https://reviews.llvm.org/D46045 Reviewed By: mzolotukhin llvm-svn: 333072	2018-05-23 10:09:53 +00:00
Sanjay Patel	4b96935bd7	[InstCombine] use nsw negation for abs libcalls Also, produce the canonical IR abs (s<0) to be more efficient. This is the libcall equivalent of the clang builtin change from: rL333038 Pasting from that commit message: The stdlib functions are defined in section 7.20.6.1 of the C standard with: "If the result cannot be represented, the behavior is undefined." That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would be UB/poison. llvm-svn: 333042	2018-05-22 23:29:40 +00:00
David Bolvansky	1f343fa0e0	[InstCombine] Remove calloc transformations Summary: Previous patch does not care if a value is changed between calloc and strlen. This needs to be removed from InstCombine and maybe moved to DSE later after some rework. Reviewers: efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47218 llvm-svn: 333022	2018-05-22 20:27:36 +00:00
Florian Hahn	a6e63f176c	[NewGVN] Fix handling of assumes This patch fixes two bugs: * test1: Previously assume(a >= 5) concluded that a == 5. That's only valid for assume(a == 5)... * test2: If operands were swapped, additional users were added to the wrong cmp operand. This resulted in an "unsettled iteration" assertion failure. Patch by Nikita Popov Differential Revision: https://reviews.llvm.org/D46974 llvm-svn: 333007	2018-05-22 17:38:22 +00:00
David Bolvansky	41f4b64ee1	[InstCombine] Calloc-ed strings optimizations Summary: Example cases: strlen(calloc(...)) -> 0 Reviewers: efriedma, bkramer Reviewed By: bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47059 llvm-svn: 332990	2018-05-22 15:41:23 +00:00
Karl-Johan Karlsson	11d68a619e	[LowerSwitch] Fixed faulty PHI node update Summary: When lowerswitch merges several cases into a new default block it's not updating the PHI nodes accordingly. The code that updates the PHI nodes for the default edge only updates the first entry and does not remove the remaining ones, to make sure the number of entries matches the number of predecessors. This is easily fixed by replacing the code that updates the PHI node with the already existing utility function for updating PHI nodes. Reviewers: hans, reames, arsenm Reviewed By: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D47055 llvm-svn: 332960	2018-05-22 08:46:48 +00:00
Bjorn Pettersson	fecef6be9e	[LoopVersioning] Don't modify the list that we iterate over in addPHINodes Summary: In LoopVersioning::addPHINodes we need to iterate over all users for a value "Inst", and if the user is outside of the VersionedLoop we should replace the use of "Inst" by using the value "PN" instead. Replacing the use of "Inst" for a user of "Inst" also means that Inst->users() is modified. So it is not safe to do the replace while iterating over Inst->users() as we used to do. This patch splits the task into two steps. First we iterate over Inst->users() to find all users that should be updated. Those users are saved into a local data structure on the stack. And then, in the second step, we do the actual updates. This time iterating over the local data structure. Reviewers: mzolotukhin, anemet Reviewed By: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47134 llvm-svn: 332958	2018-05-22 08:33:02 +00:00
Stanislav Mekhanoshin	0e132dca53	[AMDGPU] Optimize old value of v_mov_b32_dpp We can eliminate old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf. This is an alternative implementation working with the intrinsic in InstCombine. Original review for past-ISel optimization: D46570. Differential Revision: https://reviews.llvm.org/D46596 llvm-svn: 332956	2018-05-22 08:04:33 +00:00
Diego Caballero	1bd5f2261d	Fix warning from r332654 with LLVM_ATTRIBUTE_USED r332654 tried to fix an unused function warning with a void cast. This approach worked for clang and gcc but not for MSVC. This commit replaces the void cast with the LLVM_ATTRIBUTE_USED approach. llvm-svn: 332910	2018-05-21 22:12:38 +00:00
Sanjay Patel	b8346e3f07	[InstCombine] remove fptrunc (select) code; NFCI This pattern is handled within commonCastTransforms(), so the code here is dead AFAICT. llvm-svn: 332887	2018-05-21 20:39:35 +00:00
Craig Topper	f14e62c9a5	[EarlyCSE] Improve EarlyCSE of some absolute value cases. Change matchSelectPattern to return X and -X for ABS/NABS in a well defined order. Adjust EarlyCSE to account for this. Ensure the SPF result is some kind of min/max and not abs/nabs in one place in InstCombine that made me nervous. Previously we returned the two operands of the compare part of the abs pattern. The RHS is always going to be a 0, 1 or -1 constant. This isn't a very meaningful thing to return for anyone. There's also some freedom in the abs pattern as to what happens when the value is equal to 0. This freedom led to early cse failing to match when different constants were used in otherwise equivalent operations. By returning the input and its negation in a defined order we can ensure an exact match. This also makes sure both patterns use the exact same subtract instruction for the negation. I believe CSE should eventually make this happen and properly merge the nsw/nuw flags. But I'm not familiar with CSE and what order it does things in so it seemed like it might be good to really enforce that they were the same. Differential Revision: https://reviews.llvm.org/D47037 llvm-svn: 332865	2018-05-21 18:42:42 +00:00
Diego Caballero	168d04d544	[VPlan] Reland r332654 and silence unused func warning r332654 was reverted due to an unused function warning in release build. This commit includes the same code with the warning silenced. Differential Revision: https://reviews.llvm.org/D44338 llvm-svn: 332860	2018-05-21 18:14:23 +00:00
Alexey Bataev	7c9ad0db3d	[InstCombine] Fix PR37526: MinMax patterns produce an infinite loop. Summary: This patch fixes PR37526 by simplifying the newly generated LoadInst instructions. If the pointer address is a bitcast from the pointer to the NewType, we can just remove this extra bitcast instead of creating the new one. This fixes the PR37526 + may speed up the whole compilation process. Reviewers: spatel, RKSimon, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47144 llvm-svn: 332855	2018-05-21 17:46:34 +00:00
Nico Weber	e4a12cfa2f	revert r332610, it breaks cfi, see D46326 llvm-svn: 332838	2018-05-21 11:44:39 +00:00
David Green	8ceab61c75	[CVP] Require DomTree for new Pass Manager We were previously using a DT in CVP through SimplifyQuery, but not requiring it in the new pass manager. Hence it would crash if DT was not already available. This now gets DT directly and plumbs it through to where it is used (instead of using it through SQ). llvm-svn: 332836	2018-05-21 11:06:28 +00:00
Eric Christopher	563d0b9cb9	Fix up a few grammar issues. llvm-svn: 332835	2018-05-21 10:27:36 +00:00
Craig Topper	e4c045b7df	[X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. Someday maybe we'll use selects for all intrinsics. llvm-svn: 332824	2018-05-20 23:34:04 +00:00
Sanjay Patel	a003c728a5	[InstCombine] choose 1 form of abs and nabs as canonical We already do this for min/max (see the blob above the diff), so we should do the same for abs/nabs. A sign-bit check (<s 0) is used as a predicate for other IR transforms and it's likely the best for codegen. This might solve the motivating cases for D47037 and D47041, but I think those patches still make sense. We can't guarantee this canonicalization if the icmp has more than one use. Differential Revision: https://reviews.llvm.org/D47076 llvm-svn: 332819	2018-05-20 14:23:23 +00:00
Max Kazantsev	c0b268f90c	[IRCE] Fix miscompile with range checks against negative values In the patch rL329547, we have lifted the over-restrictive limitation on collected range checks, allowing to work with range checks with the end of their range not being provably non-negative. However it appeared that the non-negativity of this value was assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative. The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK) and the second time with `X = IRC.getEnd()`, where we may now see the problem if the end is actually a negative value. In this case, we may sometimes miscompile. This patch is the conservative fix of the miscompile problem. Rather than rejecting non-provably non-negative `getEnd()` values, we will check it for non-negativity in runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X` is non-negative and is equal to `0` if `X` is negative. If we multiply `Begin, End` of safe iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original `[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked correctly) and the empty range `[0, 0)` in case if `IRC.getEnd()` was negative. So we in fact prohibit execution of the main loop if at least one of range checks was made against a negative value (and we figured it out in runtime). It is still better than what we have before (non-negativity had to be proved in compile time) and prevents us from miscompile, however it is sometimes too restrictive for unsigned range checks against a negative value (which in fact can be eliminated). Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly, this limitation can be lifted, too. Differential Revision: https://reviews.llvm.org/D46860 Reviewed By: samparker llvm-svn: 332809	2018-05-19 13:06:37 +00:00
Benjamin Kramer	a76b64ff80	[MergeICmps] Don't crash when memcmp is not available Fixes clang crashing with -fno-builtin, PR37527. llvm-svn: 332808	2018-05-19 12:51:59 +00:00
Yaxun Liu	ea988f1fd9	Fix evaluator for non-zero alloca addr space The evaluator goes through BB and creates global vars as temporary values to evaluate results of LLVM instructions. It creates undef for alloca, however it assumes alloca in addr space 0. If the next instruction is addrspace cast to 0, then we get an invalid cast instruction. This patch lets the temp global var have an address space matching alloca addr space, so that the evaluation can be done. Differential Revision: https://reviews.llvm.org/D47081 llvm-svn: 332794	2018-05-19 02:58:16 +00:00
Piotr Padlewski	a26a08cb52	Constant fold launder of null and undef Summary: This might be useful because clang will add some barriers for pointer comparisons. Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc, kuhar Subscribers: davide, amharc, llvm-commits Differential Revision: https://reviews.llvm.org/D32423 llvm-svn: 332786	2018-05-18 23:52:57 +00:00
Craig Topper	0198b73769	[InstCombine] Qualify a select pattern based transform to restrict to only min/max and ignore abs/nabs. llvm-svn: 332770	2018-05-18 21:21:56 +00:00
Evgeniy Stepanov	28f330fd6f	[msan] Don't check divisor shadow in fdiv. Summary: Floating point division by zero or even undef does not have undefined behavior and may occur due to optimizations. Fixes https://bugs.llvm.org/show_bug.cgi?id=37523. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47085 llvm-svn: 332761	2018-05-18 20:19:53 +00:00
Galina Kistanova	083ea389d6	Reverted r332654 as it has broken some buildbots and left unfixed for a long time. The introduced problem is: llvm.src/lib/Transforms/Vectorize/VPlanVerifier.cpp:29:13: error: unused function 'hasDuplicates' [-Werror,-Wunused-function] static bool hasDuplicates(const SmallVectorImpl<VPBlockBase *> &VPBlockVec) { ^ llvm-svn: 332747	2018-05-18 18:14:06 +00:00
David Stenberg	0af67e5b65	[SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest() Summary: Fix a case where FoldBranchToCommonDest() would bail out from doing CSE when encountering a debug intrinsic. Handle that by skipping past the debug intrinsics. Also, as a minor refactoring, rename checkCSEInPredecessor() to tryCSEWithPredecessor() to make it a bit more clear that the function may remove instructions. Reviewers: fhahn, craig.topper, dblaikie, xbolva00 Reviewed By: fhahn, xbolva00 Subscribers: vsk, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D46635 llvm-svn: 332698	2018-05-18 08:52:15 +00:00
Walter Lee	cdbb207bd1	[asan] Add instrumentation support for Myriad 1. Define Myriad-specific ASan constants. 2. Add code to generate an outer loop that checks that the address is in DRAM range, and strip the cache bit from the address. The former is required because Myriad has no memory protection, and it is up to the instrumentation to range-check before using it to index into the shadow memory. 3. Do not add an unreachable instruction after the error reporting function; on Myriad such function may return if the run-time has not been initialized. 4. Add a test. Differential Revision: https://reviews.llvm.org/D46451 llvm-svn: 332692	2018-05-18 04:10:38 +00:00
Heejin Ahn	b4be38fcdd	[WebAssembly] Add Wasm personality and isScopedEHPersonality() Summary: - Add wasm personality function - Re-categorize the existing `isFuncletEHPersonality()` function into two different functions: `isFuncletEHPersonality()` and `isScopedEHPersonality()`. This becomes necessary as wasm EH uses scoped EH instructions (catchswitch, catchpad/ret, and cleanuppad/ret) but not outlined funclets. - Changed some callsites of `isFuncletEHPersonality()` to `isScopedEHPersonality()` if they are related to scoped EH IR-level stuff. Reviewers: majnemer, dschuff, rnk Subscribers: jfb, sbc100, jgravelle-google, eraman, JDevlieghere, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45559 llvm-svn: 332667	2018-05-17 20:52:03 +00:00
Diego Caballero	f58ad3129c	[LV][VPlan] Build plain CFG with simple VPInstructions for outer loops. Patch #3 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). Expected to be NFC for the current inner loop vectorization path. It introduces the basic algorithm to build the VPlan plain CFG (single-level CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization path using VPInstructions. It includes: - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now). - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG. - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan. Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl. Differential Revision: https://reviews.llvm.org/D44338 llvm-svn: 332654	2018-05-17 19:24:47 +00:00
Xinliang David Li	bc471c39ee	Add a limit for phi folding instcombine Differential Revision: http://reviews.llvm.org/D47023 llvm-svn: 332653	2018-05-17 19:24:03 +00:00
Craig Topper	bd332588bd	[InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs pattern to the sub in the select version. According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT_MIN, but by preserving the flags we can make doing this to INT_MIN UB. The nuw flag is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far. Differential Revision: https://reviews.llvm.org/D46988 llvm-svn: 332623	2018-05-17 16:29:52 +00:00
Dmitry Mikulin	3c6b4e35bd	In thin and full LTO + CFI, direct function calls may go through jump table entries to reach the target. Since these calls don't require type checks, we can short-circuit them to their real targets. Differential Revision: https://reviews.llvm.org/D46326 llvm-svn: 332610	2018-05-17 14:29:07 +00:00
Bjorn Pettersson	81a76a388a	[SROA] Handle PHI with multiple duplicate predecessors Summary: The verifier accepts PHI nodes with multiple entries for the same basic block, as long as the value is the same. As seen in PR37203, SROA did not handle such PHI nodes properly when speculating loads over the PHI, since it inserted multiple loads in the predecessor block and changed the PHI into having multiple entries for the same basic block, but with different values. This patch teaches SROA to reuse the same speculated load for each PHI duplicate entry in such situations. Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203 Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma Reviewed By: efriedma Subscribers: dberlin, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D46426 llvm-svn: 332577	2018-05-17 07:21:41 +00:00
Hiroshi Inoue	f5c0e6c285	[SROA] pr37267: fix assertion failure in integer widening The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad). This patch adds explicit checks for this case in isIntegerWideningViableForSlice. Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition. Differential Revision: https://reviews.llvm.org/D46750 llvm-svn: 332575	2018-05-17 06:32:17 +00:00
Vedant Kumar	5a0872c2b7	[STLExtras] Add size() for ranges, and remove distance() r332057 introduced distance() for ranges. Based on post-commit feedback, this renames distance() to size(). The new size() is also only enabled when the operation is O(1). Differential Revision: https://reviews.llvm.org/D46976 llvm-svn: 332551	2018-05-16 23:20:42 +00:00
Benjamin Kramer	8ac15bf4dc	[InstCombine] Fix the signature of fgets_unlocked. It returns a pointer, not an int. This miscompiles all code that uses the return value of fgets. llvm-svn: 332531	2018-05-16 21:45:39 +00:00
Sanjay Patel	2eb3512090	[InstCombine] allow more binop (shuffle X), C transforms The canonicalization was restricted to shuffle masks with a 1-to-1 mapping to the constant vector, but that disqualifies the common splat pattern. This is part of solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 llvm-svn: 332479	2018-05-16 15:15:22 +00:00
David Bolvansky	ca22d427b9	[SimplifyLibcalls] Replace locked IO with unlocked IO Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed. Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja Reviewed By: rja Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 332452	2018-05-16 11:39:52 +00:00
David Green	cdee1d957e	[LoopUnroll] Split out simplify code after Unroll into a new function. NFC So that it can be shared with other passes that may end up doing the same thing. Differential Revision: https://reviews.llvm.org/D45874 llvm-svn: 332450	2018-05-16 10:41:58 +00:00
Shoaib Meenai	074728a2a9	[ObjCARC] Prevent code motion into a catchswitch A catchswitch must be the only non-phi instruction in its basic block; attempting to move a retain or release into a catchswitch basic block will result in invalid IR. Explicitly mark a CFG hazard in this case to prevent the code motion. Differential Revision: https://reviews.llvm.org/D46482 llvm-svn: 332430	2018-05-16 04:52:18 +00:00
Evgeny Stupachenko	bff9302c3d	Fix LSR compile time hang. Summary: Limit number of reassociations in GenerateReassociationsImpl. Reviewers: qcolombet, mkazantsev Differential Revision: https://reviews.llvm.org/D46039 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 332426	2018-05-16 02:48:50 +00:00
Sanjay Patel	919882638e	[InstCombine] fix binop (shuffle X), C --> shuffle (binop X, C') to check uses llvm-svn: 332407	2018-05-15 22:00:37 +00:00
Marek Olsak	3c5fd145c5	StructurizeCFG: fix inverting conditions Author: Samuel Pitoiset Without this patch, it appears to me that we are selecting the wrong operand when inverting conditions. In the attached test, it will select %tmp3 instead of %tmp4. To fix it, just use 'A' as everywhere. This fixes a regression introduced by "[PatternMatch] define m_Not using m_Xor and cst_pred_ty" https://reviews.llvm.org/D46351 llvm-svn: 332403	2018-05-15 21:41:55 +00:00
Evgeniy Stepanov	091fed94ae	[msan] Instrument masked.store, masked.load intrinsics. Summary: Instrument masked store/load intrinsics. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46785 llvm-svn: 332402	2018-05-15 21:28:25 +00:00
Sanjay Patel	3c569f0de0	[InstCombine] clean up code for binop-shuffle transforms; NFCI llvm-svn: 332399	2018-05-15 21:23:58 +00:00
Sanjay Patel	3c35290c58	[InstCombine] fix binop-of-shuffles to check uses llvm-svn: 332375	2018-05-15 17:14:23 +00:00
whitequark	8f0ab258bd	[MergeFunctions] Fix merging of small weak functions When two interposable functions are merged, we cannot replace uses and have to emit calls to a common internal function. However, writeThunk() will not actually emit a thunk if the function is too small. This leaves us in a broken state where mergeTwoFunctions already rewired the functions, but writeThunk doesn't do anything. This patch changes the implementation so that: * writeThunk() does just that. * The direct replacement of calls is moved into mergeTwoFunctions() into the non-interposable case only. * isThunkProfitable() is extracted and will be called for the non-interposable case always, and in the interposable case only if uses are still left after replacement. This issue has been introduced in https://reviews.llvm.org/D34806, where the code for checking thunk profitability has been moved. Differential Revision: https://reviews.llvm.org/D46804 Reviewed By: whitequark llvm-svn: 332342	2018-05-15 11:31:07 +00:00
Max Kazantsev	9b90373c8b	[NFC] Add const to method signature llvm-svn: 332317	2018-05-15 01:21:56 +00:00
Keno Fischer	de577af8c0	[InstCombine] fix crash due to ignored addrspacecast Summary: Part of the InstCombine code for simplifying GEPs looks through addrspacecasts. However, this was done by updating a variable also used by the next transformation, for marking GEPs as inbounds. This led to replacing a GEP with a similar instruction in a different addrspace, which caused an assertion failure in RAUW. This caused julia issue https://github.com/JuliaLang/julia/issues/27055 Patch by Jeff Bezanson <jeff@juliacomputing.com> Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D46722 llvm-svn: 332302	2018-05-14 22:05:01 +00:00
Sanjay Patel	bf55e6dee1	[AggressiveInstCombine] avoid crashing on unsimplified code (PR37446) This bug: https://bugs.llvm.org/show_bug.cgi?id=37446 ...raises another question: why do we run aggressive-instcombine before regular instcombine? llvm-svn: 332243	2018-05-14 13:43:32 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Nicola Zaghen	617d4a8199	Test commit access. Remove trailing whitespace. llvm-svn: 332220	2018-05-14 08:24:29 +00:00
Craig Topper	0e71c6d5ca	[X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use uitofp+insertelement instead. llvm-svn: 332206	2018-05-14 00:06:49 +00:00
Craig Topper	911025b1cd	[X86] Extend instcombine folds for pclmuldq intrinsics to the 256 and 512 bit version. llvm-svn: 332202	2018-05-13 21:56:32 +00:00
Craig Topper	85906cf041	[X86] Remove and autoupgrade masked vpermd/vpermps intrinsics. llvm-svn: 332198	2018-05-13 18:03:59 +00:00
Craig Topper	df3a9cedff	[X86] Remove and autoupgrade legacy cvtss2sd intrinsics. llvm-svn: 332187	2018-05-13 00:29:40 +00:00
Craig Topper	38ad7ddabc	[X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what clang has used for a very long time. llvm-svn: 332186	2018-05-12 23:14:39 +00:00
Michael Zolotukhin	a41660df7e	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." Stage3/stage4 bootstrap miscompares should be fixed by a non-determinism fix in IDF (r332167). This reverts commit r330446. llvm-svn: 332168	2018-05-12 01:52:36 +00:00
Sergey Dmitriev	69c9cd277d	[CodeExtractor] Allow extracting blocks with exception handling This is a CodeExtractor improvement which adds support for extracting blocks which have exception handling constructs if that is legal to do. CodeExtractor performs validation checks to ensure that extraction is legal when it finds invoke instructions or EH pads (landingpad, catchswitch, or cleanuppad) in blocks to be extracted. I have also added an option to allow extraction of blocks with alloca instructions, but no validation is done for allocas. CodeExtractor caller has to validate it himself before allowing alloca instructions to be extracted. By default allocas are still not allowed in extraction blocks. Differential Revision: https://reviews.llvm.org/D45904 llvm-svn: 332151	2018-05-11 22:49:49 +00:00
Craig Topper	a17d627abb	[X86] Remove and autoupgrade a bunch of FMA intrinsics that are no longer used by clang. llvm-svn: 332146	2018-05-11 21:59:34 +00:00
Artem Belevich	c2cd5d5ce0	[Split GEP] handle trunc() in separate-const-offset-from-gep pass. Let separate-const-offset-from-gep pass handle trunc() when it calculates constant offset relative to base. The pass itself may insert trunc() instructions when it canonicalises array indices to pointer-size integers and needs to handle trunc() in order to evaluate the offset. Differential Revision: https://reviews.llvm.org/D46732 llvm-svn: 332142	2018-05-11 21:13:19 +00:00
Daniel Neilson	f6651d4d94	[InstCombine] Handle atomic memset in the same way as regular memset Summary: This change adds handling of the atomic memset intrinsic to the code path that simplifies the regular memset. In practice this means that we will now also expand a small constant-length atomic memset into a single unordered atomic store. Reviewers: apilipenko, skatkov, mkazantsev, anna, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D46660 llvm-svn: 332132	2018-05-11 20:04:50 +00:00
David Bolvansky	cd93c4ef1a	[InstCombine] snprintf optimizations Reviewers: spatel, efriedma, majnemer, rja, bkramer Reviewed By: rja, bkramer Subscribers: mstorsjo, rja, llvm-commits Differential Revision: https://reviews.llvm.org/D46285 llvm-svn: 332110	2018-05-11 17:50:49 +00:00
Davide Italiano	6e1f7bf316	[Reassociate] Prevent infinite loops when processing PHIs. Phi nodes can reside in live blocks but one of their incoming arguments can come from a dead block. Dead blocks and reassociate don't play nice together. In fact, reassociate performs an RPO as a first step to avoid processing dead blocks. The reason why Reassociate might not fixpoint when examining dead blocks is that the following: %xor0 = xor i16 %xor1, undef %xor1 = xor i16 %xor0, undef is perfectly valid LLVM IR (if it appears in a dead block), so the worklist algorithm keeps pushing the two instructions for reexamination. Note that this is not Reassociate's fault, at least not entirely. It's llvm that has a weird definition of dominance. Fixes PR37390. llvm-svn: 332100	2018-05-11 15:45:36 +00:00
Daniel Neilson	8f30ec65b0	[InstCombine] Unify handling of atomic memtransfer with non-atomic memtransfer Summary: This change reworks the handling of atomic memcpy within the instcombine pass. Previously, a constant length atomic memcpy would be lowered into loads & stores as long as no more than 16 load/store pairs are created. This is quite different from the lowering done for a non-atomic memcpy, which only ever lowers into a single load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are expanded to load/stores in later passes, such as SelectionDAG lowering. In this change the behaviour for atomic memcpy is unified with non-atomic memcpy; atomic memcpy is now treated in the same way as non-atomic memcpy has always been. We leave it to later passes to lower longer-length atomic memcpy calls. Due to the structure of the pass's handling of memtransfer intrinsics, this change also gives us handling of atomic memmove that we did not previously have. Reviewers: apilipenko, skatkov, mkazantsev, anna, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D46658 llvm-svn: 332093	2018-05-11 14:30:02 +00:00
Brian Gesiak	c651113439	[Coroutines] PR34897: Fix incorrect elisions Summary: https://bugs.llvm.org/show_bug.cgi?id=34897 demonstrates an incorrect coroutine frame allocation elision in the coro-elide pass. The elision is performed on the basis that the SSA variables from all llvm.coro.begin are directly referenced in subsequent llvm.coro.destroy instructions. However, this ignores the fact that the function may exit through paths that do not run these destroy instructions. In the sample program from PR34897, for example, the llvm.coro.destroy instruction is only executed in exception handling code. When the coroutine function exits normally, llvm.coro.destroy is not called. Eliding the allocation in this case causes a subsequent reference to the coroutine handle from outside of the function to access freed memory. To fix the issue, when finding an llvm.coro.destroy for each llvm.coro.begin, only consider llvm.coro.destroy that are executed along non-exceptional paths. Test Plan: 1. Download the sample program from https://bugs.llvm.org/show_bug.cgi?id=34897, compile it with `clang++ -fcoroutines-ts -stdlib=libc++ -std=c++1z -O2`, and run it. It should print `"run1\ncheck1\nrun2\ncheck2"` and then exit successfully. 2. Compile https://godbolt.org/g/mCKfnr and confirm it is still optimized to a single instruction, 'return 1190'. 3. `check-llvm` Reviewers: rsmith, GorNishanov, eric_niebler Reviewed By: GorNishanov Subscribers: andrewrk, lewissbaker, EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D43242 llvm-svn: 332077	2018-05-11 03:12:28 +00:00
Kostya Serebryany	a2759327fd	[sanitizer-coverage] don't instrument a function if its entry block ends with 'unreachable' llvm-svn: 332072	2018-05-11 01:09:39 +00:00
Kamil Rytarowski	02c432a72b	Register NetBSD/i386 in AddressSanitizer.cpp Summary: Ship kNetBSD_ShadowOffset32 set to 1ULL << 30. This is prepared for the amd64 kernel runtime. Sponsored by <The NetBSD Foundation> Reviewers: vitalybuka, joerg, kcc Reviewed By: vitalybuka Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46724 llvm-svn: 332069	2018-05-11 00:58:01 +00:00
Wei Mi	0c2f6be662	[SampleFDO] Don't treat warm callsite with inline instance in the profile as cold We found current sampleFDO had a performance issue when triaging a regression. For a callsite with inline instance in the profile, even if hot callsite inliner cannot inline it, it may still execute enough times and should not be treated as cold in regular inliner later. However, currently if such callsite is not inlined by hot callsite inliner, and the BB where the callsite locates doesn't get samples from other instructions inside of it, the callsite will have no profile metadata annotated. In regular inliner cost analysis, if the callsite has no profile annotated and its caller has profile information, it will be treated as cold. The fix changes the isCallsiteHot check and chooses to compare CallsiteTotalSamples with hot cutoff value computed by ProfileSummaryInfo. Differential Revision: https://reviews.llvm.org/D45377 llvm-svn: 332058	2018-05-10 23:02:27 +00:00
Vedant Kumar	e0b5f86b30	[STLExtras] Add distance() for ranges, pred_size(), and succ_size() This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write. Differential Revision: https://reviews.llvm.org/D46668 llvm-svn: 332057	2018-05-10 23:01:54 +00:00
Craig Topper	ea78a261de	[InstCombine] Replace an 'if' that should always be true with an assert. The bitwidth of the operation should always be wider than the result width of the truncate since we don't recurse through any width changing operations. llvm-svn: 332055	2018-05-10 22:45:28 +00:00
Martin Storsjo	86e6742c17	Revert "[InstCombine] snprintf optimizations" This reverts commit SVN r331889, which could trigger failed assertions for cases where the snprintf function is declared with a vaguely differing signature (e.g. being defined as static inline), see PR37408. llvm-svn: 332043	2018-05-10 21:23:36 +00:00
Sanjay Patel	c7bb14301a	[InstCombine] add folds for minnum(-a, -b) --> -maxnum(a, b) This is similar to what we do for integer min/max with 'not' ops (rL321882). This should fix: https://bugs.llvm.org/show_bug.cgi?id=37404 https://bugs.llvm.org/show_bug.cgi?id=37405 llvm-svn: 332031	2018-05-10 20:03:13 +00:00
Omer Paparo Bivas	fbb83deef7	[InstCombine] Moving overflow computation logic from InstCombine to ValueTracking; NFC Differential Revision: https://reviews.llvm.org/D46704 Change-Id: Ifabcbe431a2169743b3cc310f2a34fd706f13f02 llvm-svn: 332026	2018-05-10 19:46:19 +00:00
Chandler Carruth	baf045fb28	[PM/LoopUnswitch] Avoid pointlessly creating an exit block set. This code can just test whether blocks are in the loop, which we already have a dedicated set tracking in the loop itself. llvm-svn: 332004	2018-05-10 17:33:20 +00:00
Daniel Neilson	71fa1b904a	[DSE] Teach the pass about partial overwrite of atomic memory intrinsics Summary: This change teaches DSE that the atomic memory intrinsics can be overwritten partially in the same way as the non-atomic forms. Specifically, that the atomic memcpy & memset can be shortened at the end and that the atomic memset can be shortened at the beginning, if they are partially overwritten by later stores. Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith, spatel, filcab, sanjoy Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45584 llvm-svn: 331991	2018-05-10 15:12:49 +00:00
whitequark	68403564df	[PR37339] Fix assertion in FunctionComparator::cmpInlineAsm Fixes bug https://bugs.llvm.org/show_bug.cgi?id=37339. InlineAsm is only uniqued if the FunctionTypes are exactly the same, while cmpTypes() for example considers all pointer types in the default address space to be the same. For this reason the end of cmpInlineAsm() can be reached. This patch replaces the unreachable assertion with a check that the function types are not identical. Differential Revision: https://reviews.llvm.org/D46495 Reviewers: jfb llvm-svn: 331990	2018-05-10 15:05:47 +00:00
Benjamin Kramer	456f473ea8	[InstCombine] Only propagate known leading zeros from udiv input to output. Put in a conservatively correct estimate for now. Avoids miscompiling clang in FDO mode. This is really tricky to trigger in reality as basically all interesting cases will be folded away by computeKnownBits earlier, I was unable to find a reasonably small test case. llvm-svn: 331975	2018-05-10 11:45:18 +00:00
Craig Topper	553d451e95	[InstCombine] Reorder an if condition to put a cheap check in front of a computeKnownBits call. NFC llvm-svn: 331948	2018-05-10 00:53:25 +00:00
Craig Topper	333efc951a	[InstCombine] Use APInt::getBitsSetFrom to shorten a line and fix an 80 columns violation. NFC Fix a similar line in the same function. llvm-svn: 331947	2018-05-10 00:53:22 +00:00
Philip Reames	913a779df2	[InstCombine] fix a signedness warning which broke -Werror builds llvm-svn: 331944	2018-05-10 00:05:29 +00:00
Sanjay Patel	ac3951a735	[AggressiveInstCombine] convert a chain of 'and-shift' bits into masked compare This is a follow-up to D45986. As suggested there, we should match the "all-bits-set" pattern in addition to "any-bits-set". This was a little more complicated than I thought it would be initially because the "and 1" instruction can be anywhere in the chain. Hopefully, the code comments make that logic understandable, but if you see a way to simplify or improve that, it's most appreciated. This transforms patterns that emerge from bitfield tests as seen in PR37098: https://bugs.llvm.org/show_bug.cgi?id=37098 I think it would also help reduce the large test from: D46336 D46595 but we need something to reassociate that case to the forms we're expecting here first. Differential Revision: https://reviews.llvm.org/D46649 llvm-svn: 331937	2018-05-09 23:08:15 +00:00
Philip Reames	79e917d117	[InstCombine] Widen guards with conditions between The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed. Differential Revision: https://reviews.llvm.org/D46203 llvm-svn: 331935	2018-05-09 22:56:32 +00:00
Benjamin Kramer	0d2fc1a501	[InstCombine] Teach SimplifyDemandedBits that udiv doesn't demand low dividend bits that are zero in the divisor This is safe as long as the udiv is not exact. The pattern is not common in C++ code, but comes up all the time in code generated by XLA's GPU backend. Differential Revision: https://reviews.llvm.org/D46647 llvm-svn: 331933	2018-05-09 22:27:34 +00:00
David Bolvansky	9b5e6e8288	[InstCombine] snprintf optimizations Reviewers: spatel, efriedma, majnemer, rja, bkramer Reviewed By: rja, bkramer Subscribers: rja, llvm-commits Differential Revision: https://reviews.llvm.org/D46285 llvm-svn: 331889	2018-05-09 16:09:31 +00:00
Krzysztof Parzyszek	ea4c1bb772	[LV] Change MaxVectorSize bound to 256 in assertion, NFC otherwise It's possible to have a vector of 256 bytes in HVX code on Hexagon (vector pair in 128-byte mode). llvm-svn: 331885	2018-05-09 15:18:12 +00:00
Benjamin Kramer	ccb0fbe9a0	Revert "[InstCombine] snprintf optimizations" This reverts commit r331849. It miscompiles snprintf(buf, sizeof(buf), "%s", "any constant string"); into memcpy(buf, "%s", sizeof("any constant string")); llvm-svn: 331866	2018-05-09 11:38:57 +00:00
Bjorn Pettersson	9f953cdd7c	[MergedLoadStoreMotion] Fix a debug invariant bug in mergeStores Summary: MergedLoadStoreMotion::mergeStores is using some heuristics to limit the number of stores that it tries to sink (see MagicCompileTimeControl in MergedLoadStoreMotion.cpp). The heuristic involves counting the number of instructions in one of the basic blocks that is part of the transformation. We now ignore dbg intrinsics when counting instructions for the MagicCompileTimeControl heuristic. This is to make sure that the number of stores that are sunk doesn't depend on the amount of debug information (if -g is used or not). Reviewers: Gerolf, davide, majnemer Reviewed By: davide Subscribers: dberlin, bjope, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D46600 llvm-svn: 331852	2018-05-09 06:52:12 +00:00
David Bolvansky	44a37f04b2	[InstCombine] snprintf optimizations Reviewers: spatel, efriedma, majnemer, rja Reviewed By: rja Subscribers: rja, llvm-commits Differential Revision: https://reviews.llvm.org/D46285 llvm-svn: 331849	2018-05-09 06:34:20 +00:00
Shiva Chen	2c864551df	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label. In order to set breakpoints on labels and list source code around labels, we need to collect debug information for labels, i.e., the label name, the function the label belongs to, the line number in the file, and the address where the label is located. To keep this information in LLVM IR and to allow the backend to generate debug information correctly, we create a new kind of metadata for labels, DILabel. The format of DILabel is !DILabel(scope: !1, name: "foo", file: !2, line: 3) We hope to keep as much debug information as possible even when the code is optimized. So, we create a new kind of intrinsic for label metadata to avoid the metadata being eliminated with the basic block. The intrinsic will persist as long as we keep it from being optimized out. The format of the intrinsic is llvm.dbg.label(metadata !1) It has only one argument, that is the DILabel metadata. The intrinsic will follow the label immediately. The backend can get the label metadata through the intrinsic's parameter. We also create DIBuilder API for labels to be used by the frontend. The frontend can use createLabel() to allocate DILabel objects, and use insertLabel() to insert the llvm.dbg.label intrinsic in LLVM IR. Differential Revision: https://reviews.llvm.org/D45024 Patch by Hsiangkai Wang. llvm-svn: 331841	2018-05-09 02:40:45 +00:00
Heejin Ahn	bf7716952a	Support a funclet operand bundle in LowerInvoke Summary: The current LowerInvoke pass cannot handle invoke instructions with a funclet bundle operand. The order of operands for an invoke instruction is {call arguments, callee, funclet operand (if any), normal dest, unwind dest}. The current code assumes there is no funclet operand and incorrectly includes a funclet operand into call arguments. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46242 llvm-svn: 331832	2018-05-09 00:53:50 +00:00
Davide Italiano	48283ba3a1	[SimplifyCFG] Fix a crash when folding PHIs. We enter MergeBlockIntoPredecessor with a block looking like this: for.inc.us-lcssa: ; preds = %cond.end %k.1.lcssa.ph = phi i32 [ %conv15, %cond.end ] %t.3.lcssa.ph = phi i32 [ %k.1.lcssa.ph, %cond.end ] br label %for.inc, !dbg !66 [note the first arg of the PHI being a PHI]. FoldSingleEntryPHINodes gets rid of both PHIs (calling eraseFromParent). But right before we call the function, we push into IncomingValues the only argument of the PHIs, and shortly after we try to iterate over something which has been invalidated before :( The fix is to not try to remove PHIs which have an incoming value coming from the same BB we're looking at. Fixes PR37300 and rdar://problem/39910460 Differential Revision: https://reviews.llvm.org/D46568 llvm-svn: 331824	2018-05-08 23:28:15 +00:00
Hideki Saito	d722d61402	[LV] Fix for PR37248, Broadcast codegen incorrectly assumed vector loop body is single basic block Summary: Broadcast code generation emitted instructions in the pre-header, while the instruction they depend on is in the vector loop body. This resulted in an IL verification error: value used before defined. Reviewers: rengolin, fhahn, hfinkel Reviewed By: rengolin, fhahn Subscribers: dcaballe, Ka-Ka, llvm-commits Differential Revision: https://reviews.llvm.org/D46302 llvm-svn: 331799	2018-05-08 18:57:34 +00:00
Bjorn Pettersson	51cebc98f3	[LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions Summary: In formLCSSAForInstructions we speculatively add new PHI nodes that sometimes end up without having any uses. It has been discovered that sometimes an added PHI node can appear as being unused in one iteration of the Worklist, although it can end up being used by a PHI node added in a later iteration. We now check, a second time, that the PHI node still is unused before we remove it. This avoids an assert about "Trying to remove a phi with uses." for the added test case. Reviewers: davide, mzolotukhin, mattd, dberlin Reviewed By: mzolotukhin, dberlin Subscribers: dberlin, mzolotukhin, davide, bjope, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D46422 llvm-svn: 331741	2018-05-08 06:59:47 +00:00
Teresa Johnson	59da890c96	[NewPM] Emit inliner NoDefinition missed optimization remark Summary: Makes this consistent with the old PM. Reviewers: eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D46526 llvm-svn: 331709	2018-05-08 01:45:46 +00:00
Dmitry Mikulin	738bac77c1	Remove explicit setting of the CFI jumptable section name, it does not appear to be needed: jump table sections are created with .cfi.jumptable suffix. With this change each jump table is placed in a separate section, which allows the linker to re-order them. Differential Revision: https://reviews.llvm.org/D46537 llvm-svn: 331680	2018-05-07 21:30:15 +00:00
Fangrui Song	862eebb6d6	Simplify LLVM_ATTRIBUTE_USED call sites. llvm-svn: 331599	2018-05-05 20:14:38 +00:00
George Burgess IV	f9d26af4ea	Range-ify for loop; NFC llvm-svn: 331582	2018-05-05 04:52:26 +00:00
Craig Topper	781aa181ab	Fix a bunch of places where operator-> was used directly on the return from dyn_cast. Inspired by r331508, I did a grep and found these. Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa. llvm-svn: 331577	2018-05-05 01:57:00 +00:00
Peter Collingbourne	e04ecc88de	LowerTypeTests: Fix non-determinism in code that handles icall branch funnels. This was exposed by enabling expensive checks, which causes llvm::sort to sort randomly. Differential Revision: https://reviews.llvm.org/D45901 llvm-svn: 331573	2018-05-05 00:51:55 +00:00
Philip Reames	5b39acd111	[LICM] Compute a must execute property for the prefix of the header as we go Computing this property within the existing walk ensures that the cost is linear with the size of the block. If we did this from within isGuaranteedToExecute, it would be quadratic without some very fancy caching. This allows us to reliably catch a hoistable instruction within a header which may throw at some point after our hoistable instruction. It doesn't do anything for non-header cases, but given how common single block loops are, this seems very worthwhile. llvm-svn: 331557	2018-05-04 21:35:00 +00:00
Shoaib Meenai	57fadab1cb	[ObjCARC] Account for catchswitch in bitcast insertion A catchswitch is both a pad and a terminator, meaning it must be the only non-phi instruction in its basic block. When we're inserting a bitcast in the incoming basic block for a phi, if that incoming block is a catchswitch, we should go up the dominator tree to find a valid insertion point rather than attempting to insert before the catchswitch (which would result in invalid IR). Differential Revision: https://reviews.llvm.org/D46412 llvm-svn: 331548	2018-05-04 19:03:11 +00:00
Craig Topper	ded8ee07e9	[LoopIdiomRecognize] Don't create an IRBuilder just to call getTrue/getFalse. We can call the methods in ConstantInt directly. We just need a context. llvm-svn: 331542	2018-05-04 17:39:08 +00:00
Max Kazantsev	786032c1b7	[IRCE] Fix misuse of dyn_cast which leads to UB llvm-svn: 331508	2018-05-04 07:34:35 +00:00
Craig Topper	9510f70636	[LoopIdiomRecognize] Replace more unchecked dyn_casts with cast. Two of these are immediately dereferenced on the next line. The other two are passed immediately to the IRBuilder constructor which can't handle a nullptr. llvm-svn: 331500	2018-05-04 01:04:28 +00:00
Craig Topper	cafae62ec9	[LoopIdiomRecognize] Use a regular array instead of a SmallVector and explicit ArrayRef. llvm-svn: 331499	2018-05-04 01:04:26 +00:00
Craig Topper	8304231508	[LoopIdiomRecognize] Turn two unchecked dyn_casts into regular casts. These are casts on users of a PHINode to Instruction. I think since PHINode is an Instruction any users would also be Instructions. At least a cast will give us an assertion if it's wrong. llvm-svn: 331498	2018-05-04 01:04:24 +00:00
Sanjay Patel	e7b6654711	[InstCombine] refine select-of-constants to bitwise ops Add logic for the special case when a cmp+select can clearly be reduced to just a bitwise logic instruction, and remove an over-reaching chunk of general purpose bit magic. The primary goal is to remove cases where we are not improving the IR instruction count when doing these select transforms, and in all cases here that is true. In the motivating 3-way compare tests, there are further improvements because we can combine/propagate select values (not sure if that belongs in instcombine, but it's there for now). DAGCombiner has folds to turn some of these selects into bit magic, so there should be no difference in the end result in those cases. Not all constant combinations are handled there yet, however, so it is possible that some targets will see more cmov/csel codegen with this change in IR canonicalization. Ideally, we'll go further to not turn selects into multiple logic/math ops in instcombine, and we'll canonicalize to selects. But we should make sure that this step does not result in regressions first (and if it does, we should fix those in the backend). The general direction for this change was discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html Alive proofs for the new bit magic: https://rise4fun.com/Alive/XG7 Differential Revision: https://reviews.llvm.org/D46086 llvm-svn: 331486	2018-05-03 21:58:44 +00:00
Piotr Padlewski	c77ab8ef2f	perform DSE through launder.invariant.group Summary: Alias Analysis knows that llvm.launder.invariant.group returns a pointer that mustalias its argument, but this information wasn't used, therefore we didn't DSE through launder.invariant.group Reviewers: chandlerc, dberlin, bogner, hfinkel, efriedma Reviewed By: dberlin Subscribers: amharc, llvm-commits, nlewycky, rsmith Differential Revision: https://reviews.llvm.org/D31581 llvm-svn: 331449	2018-05-03 11:03:53 +00:00
Craig Topper	856fd68690	[LoopIdiomRecognize] When looking for 'x & (x -1)' for popcnt, make sure the left hand side of the 'and' matches the left hand side of the 'subtract' llvm-svn: 331437	2018-05-03 05:48:49 +00:00
Craig Topper	8ef2abdbc4	[LoopIdiomRecognize] Remove unnecessary cast from BinaryOperator to Instruction. NFC BinaryOperator is a sub class of Instruction. We don't need an explicit cast back to Instruction. llvm-svn: 331432	2018-05-03 05:00:18 +00:00
Shoaib Meenai	a07295f977	[ObjCARC] Convert an if to an early continue. NFC This reduces nesting and makes the logic slightly easier to follow. Differential Revision: https://reviews.llvm.org/D46371 llvm-svn: 331422	2018-05-03 01:20:36 +00:00
Chandler Carruth	e74c354d12	[gcov] Switch to an explicit if clunky array to satisfy some compilers on various build bots that are unhappy with using makeArrayRef with an initializer list. llvm-svn: 331418	2018-05-03 00:11:03 +00:00
Chandler Carruth	71c3a3fac5	[GCOV] Emit the writeout function as nested loops of global data. Summary: Prior to this change, LLVM would in some cases emit massive writeout functions with many 10s of 1000s of function calls in straight-line code. This is a very wasteful way to represent what are fundamentally loops and creates a number of scalability issues. Among other things, register allocating these calls is extremely expensive. While D46127 makes this less severe, we'll still run into scaling issues with this eventually. If not in the compile time, just from the code size. Now the pass builds up global data structures modeling the inputs to these functions, and simply loops over the data structures calling the relevant functions with those values. This ensures that the code size is a fixed and only data size grows with larger amounts of coverage data. A trivial change to IRBuilder is included to make it easier to build the constants that make up the global data. Reviewers: wmi, echristo Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D46357 llvm-svn: 331407	2018-05-02 22:24:39 +00:00
Daniel Sanders	8d0d1aa229	[reassociate] Fix excessive revisits when processing long chains of reassociatable instructions. Summary: Some of our internal testing detected a major compile time regression which I've tracked down to: r278938 - Revert "Reassociate: Reprocess RedoInsts after each inst". It appears that processing long chains of reassociatable instructions causes non-linear (potentially exponential) growth in the number of times an instruction is revisited. For example, the included test revisits instructions 220 times in a 20-instruction test. It appears that r278938 reversed the order instructions were visited and that this is preventing scheduled revisits from being cancelled as a result of visiting the instructions naturally during normal processing. However, simply reversing the order also harmed the generated code. Upon closer inspection, it was discovered that revisits occurred in the opposite order to the first pass (Thanks to escha for spotting that). This patch makes the revisit order consistent with the first pass which allows more revisits to be cancelled. This does appear to have a small impact on the generated code in few cases but it significantly reduces compile-time. After this patch, our internal test that was most affected by the regression dropped from ~2 million revisits to ~4k resulting in Reassociate having 0.46% of the runtime it had before (99.54% improvement). Here's the summaries reported by lnt for the LLVM test-suite with --benchmarking-only: \| metric \| geomean before patch \| geomean after patch \| delta \| \| ----- \| ----- \| ----- \| ----- \| \| compile time \| 0.1956 \| 0.1261 \| -35.54% \| \| execution time \| 0.3240 \| 0.3237 \| - \| \| code size \| 7365.4459 \| 7365.6079 \| - \| The results have a few wins and losses on compile-time, mostly in the +/- 2.5% range. There was one outlier though: \| Performance Regressions - compile_time \| Δ \| Previous \| Current \| \| MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk \| 9.82% \| 2.0473 \| 2.2483 \| Reviewers: javed.absar, dberlin Reviewed By: dberlin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45734 llvm-svn: 331381	2018-05-02 17:59:16 +00:00
Simon Pilgrim	f53ee8e640	Fix '32-bit shift implicitly converted to 64 bits' warning by using APInt::setBit instead. llvm-svn: 331359	2018-05-02 14:22:30 +00:00
Florian Hahn	5912c667b0	[LoopInterchange] Update some loops to use range-based for loops (NFC). llvm-svn: 331342	2018-05-02 10:53:04 +00:00
Sanjay Patel	d2025a2e31	[AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare and (or (lshr X, C), ...), 1 --> (X & C') != 0 I initially thought about implementing the minimal pattern in instcombine as mentioned here: https://bugs.llvm.org/show_bug.cgi?id=37098#c6 ...but we need to do better to catch the more general sequence from the motivating test (more than 2 bits in the compare). And a test-suite run with statistics showed that this pattern only happened 2 times currently. It would potentially happen more often if reassociation worked better (D45842), but it's probably still not too frequent? This is small enough that I didn't see a need to create a whole new class/file within AggressiveInstCombine. There are likely other relatively small matchers like what was discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). We could potentially also consolidate matchers for ctpop, bswap, etc under here. Differential Revision: https://reviews.llvm.org/D45986 llvm-svn: 331311	2018-05-01 21:02:09 +00:00
Adrian Prantl	4dfcc4a788	Remove @brief commands from doxygen comments, too. This is a follow-up to r331272. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done https://reviews.llvm.org/D46290 llvm-svn: 331275	2018-05-01 16:10:38 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Daniel Neilson	9e4bbe801a	[LV] Preserve inbounds on created GEPs Summary: This is a fix for PR23997. The loop vectorizer is not preserving the inbounds property of GEPs that it creates. This is inhibiting some optimizations. This patch preserves the inbounds property in the case where a load/store is being fed by an inbounds GEP. Reviewers: mkuper, javed.absar, hsaito Reviewed By: hsaito Subscribers: dcaballe, hsaito, llvm-commits Differential Revision: https://reviews.llvm.org/D46191 llvm-svn: 331269	2018-05-01 15:35:08 +00:00
Wei Mi	eec5ba9fae	Fix the issue that ComputeValueKnownInPredecessors only handles the case when phi is on lhs of a comparison op. For the following testcase, L1: %t0 = add i32 %m, 7 %t3 = icmp eq i32* %t2, null br i1 %t3, label %L3, label %L2 L2: %t4 = load i32, i32* %t2, align 4 br label %L3 L3: %t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ] %t6 = icmp eq i32 %t0, %t5 br i1 %t6, label %L4, label %L5 We know if we go through the path L1 --> L3, %t6 should always be true. However currently, if the rhs of the eq comparison is phi, JumpThreading fails to evaluate %t6 to true. And we know that Instcombine cannot guarantee always canonicalizing phi to the left hand side of the comparison operation according to the operand priority comparison mechanism in instcombine. The patch handles the case when rhs of the comparison op is a phi. Differential Revision: https://reviews.llvm.org/D46275 llvm-svn: 331266	2018-05-01 14:47:24 +00:00
Omer Paparo Bivas	82ef8e19ef	[InstCombine] Adjusting bswap pattern matching to hold for And/Shift mixed case Differential Revision: https://reviews.llvm.org/D45731 Change-Id: I85d4226504e954933c41598327c91b2d08192a9d llvm-svn: 331257	2018-05-01 12:25:46 +00:00
Chandler Carruth	2c85a23123	[PM/LoopUnswitch] Remove the last manual domtree update code from loop unswitch and replace it with the amazingly simple update API code. This addresses piles of FIXMEs around the update logic here and makes everything substantially simpler. llvm-svn: 331247	2018-05-01 09:54:39 +00:00
Chandler Carruth	44aab925fd	[PM/LoopUnswitch] Add back a successor set that was removed based on code review. It turns out this is necessary, and I read the comment on the API correctly the first time. ;] The `applyUpdates` routine requires that updates are "balanced". This is in order to cleanly handle cycles like inserting, removing, and then re-inserting the same edge. This precludes inserting the same edge multiple times in a row as handling that would cause the insertion logic to become ordered instead of unordered (which is what the API provides). It happens that in this specific case nothing (other than an assert and contract violation) goes wrong because we're never inserting and removing the same edge. The implementation happens to do the right thing to eliminate redundant insertions in that case. But the requirement is there and there is an assert to catch it. Somehow, after the code review I never did another asserts-clang build testing loop-unswitch for a long time. As a consequence, I didn't notice this despite a bunch of testing going on, but it shows up immediately with an asserts build of clang itself. llvm-svn: 331246	2018-05-01 09:42:09 +00:00
Florian Hahn	3df8844b92	[SimplifyCFG] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC). This patch updates some code responsible for skipping debug info to use BasicBlock::instructionsWithoutDebug. I think this makes things slightly simpler and more direct. Reviewers: aprantl, vsk, hans, danielcdh Reviewed By: hans Differential Revision: https://reviews.llvm.org/D46252 llvm-svn: 331221	2018-04-30 20:10:53 +00:00
Florian Hahn	8fe04ad3f7	[LoopSimplify] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC). This patch updates some code responsible for skipping debug info to use BasicBlock::instructionsWithoutDebug. I think this makes things slightly simpler and more direct. Reviewers: aprantl, vsk, chandlerc Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D46253 llvm-svn: 331217	2018-04-30 19:19:36 +00:00
Roman Lebedev	aa4faec114	[InstCombine] Unfold masked merge with constant mask Summary: As discussed in D45733, we want to do this in InstCombine. https://rise4fun.com/Alive/LGk Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: chandlerc, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D45867 llvm-svn: 331205	2018-04-30 17:59:33 +00:00
Davide Italiano	bd3bf1660b	[SLPVectorizer] Debug info shouldn't impact spill cost computation. <rdar://problem/39794738> (Also, PR32761). Differential Revision: https://reviews.llvm.org/D46199 llvm-svn: 331199	2018-04-30 16:57:33 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Florian Hahn	deb01ea126	[LV] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC). This patch updates some code responsible for skipping debug info to use BasicBlock::instructionsWithoutDebug. I think this makes things slightly simpler and more direct. Reviewers: mkuper, rengolin, dcaballe, aprantl, vsk Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D46254 llvm-svn: 331174	2018-04-30 13:28:08 +00:00
Hideki Saito	f2ec16ccc2	[NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file Summary: This is a follow up to D45420 (included here since it is still under review and this change is dependent on that) and D45072 (committed). Actual change for this patch is LoopVectorize* and cmakefile. All others are all from D45420. LoopVectorizationLegality is an analysis and thus really belongs to Analysis tree. It is modular enough and it is reusable enough ---- we can further improve those aspects once uses outside of LV picks up. Hopefully, this will make it easier for people familiar with vectorization theory, but not necessarily LV itself to contribute, by lowering the volume of code they should deal with. We probably should start adding some code in LV to check its own capability (i.e., vectorization is legal but LV is not ready to handle it) and then bail out. Reviewers: rengolin, fhahn, hfinkel, mkuper, aemerson, mssimpso, dcaballe, sguggill Reviewed By: rengolin, dcaballe Subscribers: egarcia, rogfer01, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D45552 llvm-svn: 331139	2018-04-29 07:26:18 +00:00
Roman Lebedev	136867931a	[InstCombine] Canonicalize variable mask in masked merge Summary: Masked merge has a pattern of: `((x ^ y) & M) ^ y`. But, there is no difference between `((x ^ y) & M) ^ y` and `((x ^ y) & ~M) ^ x`, We should canonicalize the pattern to non-inverted mask. https://rise4fun.com/Alive/Yol Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45664 llvm-svn: 331112	2018-04-28 15:45:07 +00:00
Philip Reames	502d4481d4	[LoopGuardWidening] Make PostDomTree optional The effect of doing so is not disrupting the LoopPassManager when mixing this pass with other loop passes. This should help locality of access substantially and avoids the cost of computing PostDom. The assumption here is that the full GuardWidening (which does use PostDom) is run as a canonicalization before loop opts and that this version is just catching cases exposed by other loop passes. (i.e. LoopPredication, IndVarSimplify, LoopUnswitch, etc..) llvm-svn: 331094	2018-04-27 23:15:56 +00:00
Adrian Prantl	210a29de7b	Fix a bug in GlobalOpt's handling of DIExpressions. This patch adds support for fragment expressions TryToShrinkGlobalToBoolean() which were previously just dropped. Thanks to Reid Kleckner for providing me a reproducer! llvm-svn: 331086	2018-04-27 21:41:36 +00:00
Roman Lebedev	6959b8e76f	[PatternMatch] Stabilize the matching order of commutative matchers Summary: Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the `LHS` and `RHS` matchers: 1. match `RHS` matcher to the `first` operand of binary operator, 2. and then match `LHS` matcher to the `second` operand of binary operator. This works ok. But it complicates writing of commutative matchers, where one would like to match (`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side. This is additionally complicated by the fact that `m_Specific()` stores the `Value *`, not `Value **`, so it won't work at all out of the box. The last problem is trivially solved by adding a new `m_c_Specific()` that stores the `Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing one because i guess all the current users are ok with existing behavior, and this additional pointer indirection may have performance drawbacks. Also, i'm storing pointer, not reference, because for some mysterious-to-me reason it did not work with the reference. The first one appears trivial, too. Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ operands: 1. match ~~`RHS`~~ `LHS` matcher to the ~~`first`~~ `second` operand of binary operator, 2. and then match ~~`LHS`~~ `RHS` matcher to the ~~`second`~~ `first` operand of binary operator. Surprisingly, `$ ninja check-llvm` still passes with this. But i expect the bots will disagree.. The motivational unittest is included. I'd like to use this in D45664. Reviewers: spatel, craig.topper, arsenm, RKSimon Reviewed By: craig.topper Subscribers: xbolva00, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D45828 llvm-svn: 331085	2018-04-27 21:23:20 +00:00
Philip Reames	5a6482450a	[LICM] Reduce nesting with an early return [NFC] llvm-svn: 331080	2018-04-27 20:58:30 +00:00
Daniel Neilson	a19ee7d7b6	[LV] Common duplicate vector load/store address calculation (NFC) Summary: Commoning some obviously copy/paste code in InnerLoopVectorizer::vectorizeMemoryInstruction llvm-svn: 331076	2018-04-27 20:29:18 +00:00
Philip Reames	de5a1da2d2	[GuardWidening] Add some clarifying comments about heuristics [NFC] llvm-svn: 331061	2018-04-27 17:41:37 +00:00
Philip Reames	9258e9d190	[LoopGuardWidening] Split out a loop pass version of GuardWidening The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post dom, but it gives a good step in the right direction. The motivation is purely to reduce compile time since we can now preserve locality during the loop walk. This patch only includes a legacy pass. A follow up will add a new style pass as well. llvm-svn: 331060	2018-04-27 17:29:10 +00:00
Florian Hahn	f3fea0f11f	[LoopInterchange] Allow some loops with PHI nodes in the exit block. We currently support LCSSA PHI nodes in the outer loop exit, if their incoming values do not come from the outer loop latch or if the outer loop latch has a single predecessor. In that case, the outer loop latch will be executed only if the inner loop gets executed. If we have multiple predecessors for the outer loop latch, it may be executed even if the inner loop does not get executed. This is a first step to support the case described in https://bugs.llvm.org/show_bug.cgi?id=30472 Reviewers: efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43237 llvm-svn: 331037	2018-04-27 13:52:51 +00:00
Matt Morehouse	1ae1febfde	Revert "[SimplifyLibcalls] Replace locked IO with unlocked IO" This reverts r331002 due to sanitizer bot breakage. llvm-svn: 331011	2018-04-27 01:48:09 +00:00
Eli Friedman	e06539456c	[LowerTypeTests] Mark .cfi.jumptable nounwind. It doesn't unwind, and the wrong marking leads to the creation of an .eh_frame section when it isn't necessary. Differential Revision: https://reviews.llvm.org/D46082 llvm-svn: 331008	2018-04-27 00:32:24 +00:00
David Bolvansky	2c9cc9c731	[SimplifyLibcalls] Replace locked IO with unlocked IO Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed. Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 331002	2018-04-26 22:31:43 +00:00
Sanjoy Das	6f1937b10f	[InstCombine] Simplify Add with remainder expressions as operands. Summary: Simplify integer add expression X % C0 + (( X / C0 ) % C1) * C0 to X % (C0 * C1). This is a common pattern seen in code generated by the XLA GPU backend. Add test cases for this new optimization. Patch by Bixia Zheng! Reviewers: sanjoy Reviewed By: sanjoy Subscribers: efriedma, craig.topper, lebedev.ri, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D45976 llvm-svn: 330992	2018-04-26 20:52:28 +00:00
Vlad Tsyrklevich	b768d235a9	Revert "Enable EliminateAvailableExternally pass for -O1" This reverts commit r330961 because it breaks a handful of clang tests. llvm-svn: 330964	2018-04-26 17:54:53 +00:00
Vlad Tsyrklevich	42c5a9c29a	Enable EliminateAvailableExternally pass for -O1 Summary: Follow-up to D43690, the EliminateAvailableExternally pass currently runs under -O0 and -O2 and up. Under -O1 we would still want to drop available_externally symbols to reduce space without inlining having run. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: mehdi_amini, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D46093 llvm-svn: 330961	2018-04-26 17:33:24 +00:00
Florian Hahn	fd2bc11248	[LoopInterchange] Ignore debug intrinsics during legality checks. Reviewers: aprantl, mcrosier, karthikthecool Reviewed By: aprantl Subscribers: mattd, vsk, #debug-info, llvm-commits Differential Revision: https://reviews.llvm.org/D45379 llvm-svn: 330931	2018-04-26 10:26:17 +00:00
David Bolvansky	cb8ca5f37c	[SimplifyLibcalls] Atoi, strtol replacements Reviewers: spatel, lebedev.ri, xbolva00, efriedma Reviewed By: xbolva00, efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D45418 llvm-svn: 330860	2018-04-25 18:58:53 +00:00
Taewook Oh	923c216da5	[ICP] Do not attempt type matching for variable length arguments. Summary: When performing indirect call promotion, current implementation inspects "all" parameters of the callsite and attempts to match with the formal argument type of the callee function. However, it is not possible to find the type for variable length arguments, and the compiler crashes when it attempts to match the type for variable length argument. It seems that the bug is introduced with D40658. Prior to that, the type matching is performed only for the parameters whose ID is less than callee->getFunctionNumParams(). The attached test case will crash without the patch. Reviewers: mssimpso, davidxl, davide Reviewed By: mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46026 llvm-svn: 330844	2018-04-25 17:19:21 +00:00
Sanjay Patel	807ddee1bf	[InstCombine] clean up foldSelectICmpAnd(); NFC As discussed in D45862, we want to delete parts of this code because it can create more instructions than it removes. But we also want to preserve some folds that are winners, so tidy up what's here to make splitting the good from bad a bit easier. llvm-svn: 330841	2018-04-25 16:34:01 +00:00
Florian Hahn	1da30c659d	[LoopInterchange] Use getExitBlock()/getExitingBlock instead of manual impl. This also means we have to check if the latch is the exiting block now, as `transform` expects the latches to be the exiting blocks too. https://bugs.llvm.org/show_bug.cgi?id=36586 Reviewers: efriedma, davide, karthikthecool Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45279 llvm-svn: 330806	2018-04-25 09:35:54 +00:00
Bjorn Pettersson	bec2a7c4eb	[DebugInfo] Invalidate debug info in ReassociatePass::RewriteExprTree Summary: When Reassociate is rewriting an expression tree it may reuse old binary expression nodes, for new expressions. Whenever an expression node is reused, but with a non-trivial change in the result, we need to invalidate any debug info that is associated with the node. If for example rewriting x = mul a, b y = mul c, x into x = mul c, b y = mul a, x we still get the same result for 'y', but 'x' is a new expression. All debug info referring to 'x' must be invalidated (marked as optimized out) since we no longer calculate the expected value. As a side-effect this patch avoids (at least some) problems where reassociate could end up creating IR with debug-use before def. Earlier the dbg.value nodes were left untouched in the IR, while the reused binary nodes were sunk to just before the root node of the rewritten expression tree. See PR27273 for more info about such problems. Reviewers: dblaikie, aprantl, dexonsmith Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45975 llvm-svn: 330804	2018-04-25 09:23:56 +00:00
David Bolvansky	3ea50f9fef	Merging r46043: ------------------------------------------------------------------------ llvm-svn: 330799	2018-04-25 04:33:36 +00:00
Geoff Berry	2af5f3c1e5	[DivRemPairs] Fix non-determinism in use list order. Summary: Use a MapVector instead of a DenseMap for RemMap since it is iterated over and the order of iteration can affect the order that new instructions are created. This can in turn affect the use list order of div/rem input values if multiple new instructions are created that share any input values. Reviewers: spatel Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45858 llvm-svn: 330792	2018-04-25 02:17:56 +00:00
Chandler Carruth	69e68f8468	[PM/LoopUnswitch] Begin teaching SimpleLoopUnswitch to use the new update API for dominators rather than doing manual, hacky updates. This is just the first step, but in some ways the most important as it moves the non-trivial unswitching to update the domtree rather than fully recalculating it each time. Subsequent patches should remove the custom update logic used by the trivial unswitch and replace it with uses of the update API. This also fixes a number of bugs I was seeing when testing non-trivial unswitch due to it querying the quasi-correct dominator tree. Now the tree is 100% correct and safe to query. That said, there are still more bugs I can see with non-trivial unswitch just running over the test suite, so more bugfix patches are needed as well. Thanks to both Sanjoy and Fedor for reviews and testing! Differential Revision: https://reviews.llvm.org/D45943 llvm-svn: 330787	2018-04-25 00:18:07 +00:00
Diego Caballero	60f2776b2f	[LV][VPlan] Detect outer loops for explicit vectorization. Patch #2 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). This patch introduces the basic infrastructure to detect, legality check and process outer loops annotated with hints for explicit vectorization. All these changes are protected under the feature flag -enable-vplan-native-path. This should make this patch NFC for the existing inner loop vectorizer. Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso. Differential Revision: https://reviews.llvm.org/D42447 llvm-svn: 330739	2018-04-24 17:04:17 +00:00
Florian Hahn	ceee788947	[LoopInterchange] Make isProfitableForVectorization slightly more conservative. After D43236, we started interchanging loops with empty dependence matrices. In isProfitableForVectorization, we try to determine if interchanging makes the loop dependences more friendly to the vectorizer. If there are no dependences, we should not interchange, based on that heuristic. Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource Reviewed By: mcrosier Differential Revision: https://reviews.llvm.org/D45208 llvm-svn: 330738	2018-04-24 16:55:32 +00:00
David Blaikie	ba47dd16c5	Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h) llvm-svn: 330726	2018-04-24 15:40:07 +00:00
Benjamin Kramer	f85f5da3b2	[LoadStoreVectorize] Ignore interleaved invariant loads. The memory location an invariant load is using can never be clobbered by any store, so it's safe to move the load ahead of the store. Differential Revision: https://reviews.llvm.org/D46011 llvm-svn: 330725	2018-04-24 15:28:47 +00:00
Chandler Carruth	43acdb35bc	[PM/LoopUnswitch] Fix a bug in the loop block set formation of the new loop unswitch. This code incorrectly added the header to the loop block set early. As a consequence we would incorrectly conclude that a nested loop body had already been visited when the header of the outer loop was the preheader of the nested loop. In retrospect, adding the header eagerly doesn't really make sense. It seems nicer to let the cycle be formed naturally. This will catch crazy bugs in the CFG reconstruction where we can't correctly form the cycle earlier rather than later, and makes the rest of the logic just fall out. I've also added various asserts that make these issues much easier to debug. llvm-svn: 330707	2018-04-24 10:33:08 +00:00
Max Kazantsev	c54e67d6b9	[NFC] Remove recently added SE verification because it may be false-positive llvm-svn: 330699	2018-04-24 09:11:01 +00:00
Max Kazantsev	30dee7874d	[NFC] Use forgetTopmostLoop instead of logic duplication llvm-svn: 330683	2018-04-24 04:33:04 +00:00
Chandler Carruth	0ace148ca6	[PM/LoopUnswitch] Remove another over-aggressive assert. This code path can very clearly be called in a context where we have baselined all the cloned blocks to a particular loop and are trying to handle nested subloops. There is no harm in this, so just relax the assert. I've added a test case that will make sure we actually exercise this code path. llvm-svn: 330680	2018-04-24 03:27:00 +00:00
Max Kazantsev	5a0a40b8cb	[NFC] Add clarification comment llvm-svn: 330677	2018-04-24 02:08:05 +00:00
David Blaikie	a27771b62f	InstCombine: Fix layering by not including Scalar.h in InstCombine (notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be included by InstCombine which doesn't/shouldn't need to depend on ScalarOpts) llvm-svn: 330669	2018-04-24 00:48:59 +00:00
Craig Topper	1bcb258ba3	[AggressiveInstCombine] Add aggressive inst combiner to the LLVM C API. I just tried to copy what was done for regular InstCombine. Hopefully I didn't miss anything. llvm-svn: 330668	2018-04-24 00:39:29 +00:00
Alex Shlyapnikov	909fb12f0c	[HWASan] Use dynamic shadow memory on Android only (LLVM) There're issues with IFUNC support on other platforms. Differential Revision: https://reviews.llvm.org/D45840 llvm-svn: 330665	2018-04-24 00:16:54 +00:00
Craig Topper	d4eb2073b7	[AggressiveInstCombine] Add library initializer routine for AggressiveInstCombine library. Use it in bugpoint and llvm-opt-fuzzer to match regular InstCombine. This should make aggressive instcombine usable with these tools. llvm-svn: 330663	2018-04-24 00:05:21 +00:00
Florian Hahn	7441818560	[LoopInterchange] Do not change LI for BBs in child loops. If a loop with child loops becomes our new inner loop after interchanging, we only need to update LoopInfo for the blocks defined in the old outer loop. BBs in child loops will stay there. Reviewers: efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45970 llvm-svn: 330653	2018-04-23 21:38:19 +00:00
Xin Tong	8edff27923	[CallSiteSplit] Make sure we remove nonnull if the parameter turns out to be a constant. Summary: We do not need the nonnull attribute if we know an argument is going to be constant. Reviewers: junbuml, davide, fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45608 llvm-svn: 330641	2018-04-23 20:09:08 +00:00
Bjorn Pettersson	8e484dc531	[MemCpyOpt] Skip optimizing basic blocks not reachable from entry Summary: Skip basic blocks not reachable from the entry node in MemCpyOptPass::iterateOnFunction. Code that is unreachable may have properties that do not exist for reachable code (an instruction in a basic block can for example be dominated by a later instruction in the same basic block, for example if there is a single block loop). MemCpyOptPass::processStore is only safe to use for reachable basic blocks, since it may iterate past the basic block beginning when used for unreachable blocks. By simply skipping to optimize unreachable basic blocks we can avoid asserts such as "Assertion `!NodePtr->isKnownSentinel()' failed." in MemCpyOptPass::processStore. The problem was detected by fuzz tests. Reviewers: eli.friedman, dneilson, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D45889 llvm-svn: 330635	2018-04-23 19:55:04 +00:00
Daniel Neilson	cc45e923c5	[DSE] Teach the pass that atomic memory intrinsics are stores. Summary: This change teaches DSE that the atomic memory intrinsics are stores that can be eliminated, and can allow other stores to be eliminated. This change specifically does not teach DSE that these intrinsics can be partially eliminated (i.e. length reduced, and dest/src changed); that will be handled in another change. Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith Reviewed By: efriedma Subscribers: dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D45535 llvm-svn: 330629	2018-04-23 19:06:49 +00:00
Alex Shlyapnikov	a2b4f9b4d4	[HWASan] Switch back to fixed shadow mapping for x86-64 For now switch back to fixed shadow mapping for x86-64 due to the issues with IFUNC linking on older binutils. More details will be added to https://bugs.chromium.org/p/chromium/issues/detail?id=835864 Differential Revision: https://reviews.llvm.org/D45840 llvm-svn: 330623	2018-04-23 18:14:39 +00:00
Max Kazantsev	91f481665e	[LoopRotate] Fix incorrect SCEV invalidation in loop rotation LoopRotate only invalidates innermost loops while the changes that it makes may also affect any of its parents. With patch rL329047, SCEV becomes much smarter about calculation of exit counts for outer loops, so we cannot assume that they are not affected. Differential Revision: https://reviews.llvm.org/D45945 llvm-svn: 330582	2018-04-23 12:33:31 +00:00
Max Kazantsev	acda4c0f18	[LoopUnroll] Fix potentially incorrect SCEV invalidation in UnrollRuntime Current runtime unrolling invalidates parent loop saying that it might have changed after the inner loop has changed, but it doesn't bother to do the same to its parents. With patch rL329047, SCEV becomes much smarter about calculation of exit counts for outer loops. We might need to invalidate not only the immediate parent, but also any of its parents as well. There is no clear evidence that there is some miscompile happening because of this (at least I don't have such test), but the common sense says that the current code is wrong. Differential Revision: https://reviews.llvm.org/D45940 Reviewed By: chandlerc llvm-svn: 330577	2018-04-23 10:39:38 +00:00
Max Kazantsev	b1137c42fa	[LoopSimplify] Fix incorrect SCEV invalidation In the function `simplifyOneLoop` we optimistically assume that changes in the inner loop only affect this very loop and have no impact on its parents. In fact, after rL329047 has been merged, we can now calculate exit counts for outer loops which may depend on inner loops. Thus, we need to invalidate all parents when we do something to a loop. There is evidence of incorrect behavior of `simplifyOneLoop`: when we insert `SE->verify()` check at the end of this function, it fails on a bunch of existing tests, in particular: LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll LLVM :: Transforms/LoopUnroll/peel-loop.ll LLVM :: Transforms/LoopUnroll/peel-loop2.ll Note that previously we have fixed issues of this variety, see rL328483. This patch makes this function invalidate the outermost loop properly. Differential Revision: https://reviews.llvm.org/D45937 Reviewed By: chandlerc llvm-svn: 330576	2018-04-23 10:32:37 +00:00
Chandler Carruth	bf7190a154	[PM/LoopUnswitch] Remove a buggy assert in the new loop unswitch. The condition this was asserting doesn't actually hold. I've added comments to explain why, removed the assert, and added a fun test case reduced out of 403.gcc. llvm-svn: 330564	2018-04-23 06:58:36 +00:00
Chandler Carruth	b525424118	[PM/LoopUnswitch] Fix comment typo. NFC. llvm-svn: 330560	2018-04-23 00:48:42 +00:00
Sanjay Patel	30be665e82	[PatternMatch] allow undef elements when matching a vector zero This is the last step in getting constant pattern matchers to allow undef elements in constant vectors. I'm adding a dedicated m_ZeroInt() function and building m_Zero() from that. In most cases, calling code can be updated to use m_ZeroInt() directly when there's no need to match pointers, but I'm leaving that efficiency optimization as a follow-up step because it's not always clear when that's ok. There are just enough icmp folds in InstSimplify that can be used for integer or pointer types, that we probably still want a generic m_Zero() for those cases. Otherwise, we could eliminate it (and possibly add a m_NullPtr() as an alias for isa<ConstantPointerNull>()). We're conservatively returning a full zero vector (zeroinitializer) in InstSimplify/InstCombine on some of these folds (see diffs in InstSimplify), but I'm not sure if that's actually necessary in all cases. We may be able to propagate an undef lane instead. One test where this happens is marked with 'TODO'. llvm-svn: 330550	2018-04-22 17:07:44 +00:00
Shoaib Meenai	106df7dd20	[ObjCARC] Take BlockColors by const reference. NFC llvm-svn: 330489	2018-04-20 22:14:45 +00:00
Shoaib Meenai	d64b83266b	[ObjCARC] Account for funclet token in storeStrong transform When creating a call to storeStrong in ObjCARCContract, ensure the call gets the correct funclet token, otherwise WinEHPrepare will turn the call (and all subsequent instructions) into unreachable. We already have logic to do this for the ARC autorelease elision marker; factor that out into a common function that's used for both. These are the only two places in this transform that create call instructions. Differential Revision: https://reviews.llvm.org/D45857 llvm-svn: 330487	2018-04-20 22:11:03 +00:00
Alex Shlyapnikov	99cf54baa6	[HWASan] Introduce non-zero based and dynamic shadow memory (LLVM). Summary: Support the dynamic shadow memory offset (the default case for user space now) and static non-zero shadow memory offset (-hwasan-mapping-offset option). Keeping the latter case around for functionality and performance comparison tests (and mostly for -hwasan-mapping-offset=0 case). The implementation is stripped down ASan one, picking only the relevant parts in the following assumptions: shadow scale is fixed, the shadow memory is dynamic, it is accessed via ifunc global, shadow memory address rematerialization is suppressed. Keep zero-based shadow memory for kernel (-hwasan-kernel option) and calls instrumented case (-hwasan-instrument-with-calls option), which essentially means that the generated code is not changed in these cases. Reviewers: eugenis Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D45840 llvm-svn: 330475	2018-04-20 20:04:04 +00:00
Sean Fertile	18f17333dd	[PartialInlining] Fix Crash from holding a reference to a destructed ORE. The callback used to create an ORE for the legacy PI pass caches the allocated object in a unique_ptr in the runOnModule function, and returns a reference to that object. Under certain circumstances we can end up holding onto that reference after the ORE's destruction. Rather than allowing the new and legacy passes to create ORE object in different ways, create the ORE at the point of use. Differential Revision: https://reviews.llvm.org/D43219 llvm-svn: 330473	2018-04-20 19:56:26 +00:00
Michael Zolotukhin	e268304122	Revert r330431. There are still stage3/stage4 miscompares :( llvm-svn: 330446	2018-04-20 16:57:10 +00:00
Florian Hahn	773872fd67	[NewGVN] Split OpPHI detection and creation. It also adds a check making sure PHIs for operands are all in the same block. Patch by Daniel Berlin <dberlin@dberlin.org> Reviewers: dberlin, davide Differential Revision: https://reviews.llvm.org/D43865 llvm-svn: 330444	2018-04-20 16:37:13 +00:00
Michael Zolotukhin	a2c9af0209	Revert "Revert r330403 and r330413." Reapply the patches with a fix. Thanks Ilya and Hans for the reproducer! This reverts commit r330416. The issue was that removing predecessors invalidated uses that we stored for rewrite. The fix is to finish manipulating with CFG before we select uses for rewrite. llvm-svn: 330431	2018-04-20 13:34:32 +00:00
Ilya Biryukov	afe822bd6d	Revert r330403 and r330413. Revert r330413: "[SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites." Revert r330403 "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time." r330403 commit seems to crash clang during our integrate while doing PGO build with the following stacktrace: #2 llvm::SSAUpdaterBulk::RewriteAllUses(llvm::DominatorTree, llvm::SmallVectorImpl<llvm::PHINode>) #3 llvm::JumpThreadingPass::ThreadEdge(llvm::BasicBlock, llvm::SmallVectorImpl<llvm::BasicBlock> const&, llvm::BasicBlock) #4 llvm::JumpThreadingPass::ProcessThreadableEdges(llvm::Value, llvm::BasicBlock, llvm::jumpthreading::ConstantPreference, llvm::Instruction) #5 llvm::JumpThreadingPass::ProcessBlock(llvm::BasicBlock) The crash happens while compiling 'lib/Analysis/CallGraph.cpp'. r330413 is reverted due to conflicting changes. llvm-svn: 330416	2018-04-20 10:52:54 +00:00
Michael Zolotukhin	9dea079315	[SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites. llvm-svn: 330413	2018-04-20 10:31:06 +00:00
Michael Zolotukhin	79e4f7fadb	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. Hopefully, changing set to vector removes nondeterminism detected by some bots, or the new assert will catch something. This reverts commit r330180. llvm-svn: 330403	2018-04-20 08:01:08 +00:00
Michael Zolotukhin	26339b445a	[SSAUpdaterBulk] Add an assert. llvm-svn: 330402	2018-04-20 07:59:57 +00:00
Michael Zolotukhin	0df1d48ca9	[SSAUpdaterBulk] Add * and & to auto. llvm-svn: 330400	2018-04-20 07:58:54 +00:00
Michael Zolotukhin	bc843211fd	[SSAUpdaterBulk] Use PredCache in ComputeLiveInBlocks. llvm-svn: 330399	2018-04-20 07:57:24 +00:00
Michael Zolotukhin	79cb54b2d9	[SSAUpdaterBulk] Use SmallVector instead of SmallPtrSet for uses. llvm-svn: 330398	2018-04-20 07:56:00 +00:00
Vlad Tsyrklevich	230b256783	LowerTypeTests: Propagate symver directives Summary: This change fixes https://crbug.com/834474, a build failure caused by LowerTypeTests not preserving .symver symbol versioning directives for exported functions. Emit symver information to ThinLTO summary data and then propagate symver directives for exported functions to the merged module. Emitting symver information to the summaries increases the size of intermediate build artifacts for a Chromium build by less than 0.2%. Reviewers: pcc Reviewed By: pcc Subscribers: tejohnson, mehdi_amini, eraman, llvm-commits, eugenis, kcc Differential Revision: https://reviews.llvm.org/D45798 llvm-svn: 330387	2018-04-20 01:36:48 +00:00
Jin Lin	585f2699cf	Refine the loop rotation's API Summary: The following changes address two issues. 1) The existing loop rotation pass contains both loop latch simplification and loop rotation. So one flag RotationOnly is added to be passed to the loop rotation pass. 2) The threshold value is initialized with MAX_UINT since the loop rotation utility should not have threshold limit. Reviewers: dmgreen, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45582 llvm-svn: 330362	2018-04-19 20:29:43 +00:00
Chandler Carruth	32e62f9c5b	[PM/LoopUnswitch] Detect irreducible control flow within loops and skip unswitching non-trivial edges. Summary: This fixes the bug pointed out in review with non-trivial unswitching. This also provides a basis that should make it pretty easy to finish fleshing out a routine to scan an entire function body for irreducible control flow, but this patch remains minimal for disabling loop unswitch. Reviewers: sanjoy, fedor.sergeev Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45754 llvm-svn: 330357	2018-04-19 18:44:25 +00:00
Sanjay Patel	a201787fd7	[Reassociate] fix formatting; NFC llvm-svn: 330348	2018-04-19 17:56:36 +00:00
Florian Hahn	b789165e6b	[NewGVN] Add ops as dependency if we cannot find a leader for ValueOp. If those operands change, we might find a leader for ValueOp, which could enable new phi-of-op creation. This fixes a case where we missed creating a phi-of-ops node. With D43865 and this patch, bootstrapping clang/llvm works with -enable-newgvn, whereas without it, the "value changed after iteration" assertion is triggered. Reviewers: dberlin, davide Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D42180 llvm-svn: 330334	2018-04-19 15:05:47 +00:00
Sanjay Patel	b2ab3f28d5	[SimplifyLibcalls] Realloc(null, N) -> Malloc(N) Patch by Dávid Bolvanský! Differential Revision: https://reviews.llvm.org/D45413 llvm-svn: 330259	2018-04-18 14:21:31 +00:00
Sam Parker	3c19051bf0	[IRCE] Only check for NSW on equality predicates After investigation discussed in D45439, it would seem that the nsw flag restriction is unnecessary in most cases. So the IsInductionVar lambda has been removed, the functionality extracted, and now only require nsw when using eq/ne predicates. Differential Revision: https://reviews.llvm.org/D45617 llvm-svn: 330256	2018-04-18 13:50:28 +00:00
Florian Hahn	ac27758895	[LoopUnroll] Only peel if a predicate becomes known in the loop body. If a predicate does not become known after peeling, peeling is unlikely to be beneficial. Reviewers: mcrosier, efriedma, mkazantsev, junbuml Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D44983 llvm-svn: 330250	2018-04-18 12:29:24 +00:00
Bjorn Pettersson	bc4f19b6bd	[DebugInfo] Sink related dbg users when sinking in InstCombine Summary: When sinking an instruction in InstCombine we now also sink the DbgInfoIntrinsics that are using the sunken value. Example) When sinking the load in this input bb.X: %0 = load i64, i64* %start, align 4, !dbg !31 tail call void @llvm.dbg.value(metadata i64 %0, ...) br i1 %cond, label %for.end, label %for.body.lr.ph for.body.lr.ph: br label %for.body we now also move the dbg.value, like this bb.X: br i1 %cond, label %for.end, label %for.body.lr.ph for.body.lr.ph: %0 = load i64, i64* %start, align 4, !dbg !31 tail call void @llvm.dbg.value(metadata i64 %0, ...) br label %for.body In the past we haven't moved the dbg.value so we got bb.X: tail call void @llvm.dbg.value(metadata i64 %0, ...) br i1 %cond, label %for.end, label %for.body.lr.ph for.body.lr.ph: %0 = load i64, i64* %start, align 4, !dbg !31 br label %for.body So in the past we got a debug-use before the def of %0. And that dbg.value was also on the path jumping to %for.end, for which %0 never was defined. CodeGenPrepare normally comes to the rescue later (when not moving the dbg.value), since it moves dbg.value intrinsics quite brutally, without really analysing if it is correct to move the intrinsic (see PR31878). So at the moment this patch isn't expected to have much impact, besides that it is moving the dbg.value already in opt, making the IR look more sane directly. This can be seen as a preparation to (hopefully) make it possible to turn off CodeGenPrepare::placeDbgValues later as a solution to PR31878. I also adjusted test/DebugInfo/X86/sdagsplit-1.ll to make the IR in the test case up-to-date with this behavior in InstCombine. Reviewers: rnk, vsk, aprantl Reviewed By: vsk, aprantl Subscribers: mattd, JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45425 llvm-svn: 330243	2018-04-18 08:08:04 +00:00
Sanjay Patel	aea15131db	[InstCombine] peek through bitcasted vector/array pointer GEP operand The bitcast may be interfering with other combines or vectorization as shown in PR16739: https://bugs.llvm.org/show_bug.cgi?id=16739 Most pointer-related optimizations are probably able to look through this bitcast, but removing the bitcast shrinks the IR, so it's at least a size savings. Differential Revision: https://reviews.llvm.org/D44833 llvm-svn: 330237	2018-04-18 00:36:40 +00:00
Vedant Kumar	b0585893cc	[Mem2Reg] Create merged debug locations for inserted phis Track the debug locations of the incoming values to newly-created phis, and apply merged debug locations to the phis. A merged location will be on line 0, but will have the correct scope set. This improves crash reporting when an inlined instruction with a merged location triggers a machine exception. A debugger will be able to narrow down the crash to the correct inlined scope, instead of simply pointing to the outer scope of the caller. Taken together with a change that allows generating merged line-0 locations for instructions which aren't calls, this results in a 0.5% increase in the uncompressed size of the .debug_line section of a stage2+Release build of clang (-O3 -g). rdar://33858697 Differential Revision: https://reviews.llvm.org/D45397 llvm-svn: 330227	2018-04-17 22:03:08 +00:00
Vedant Kumar	4b29172d09	[Mem2Reg] Make RenamePassData a struct, NFC llvm-svn: 330226	2018-04-17 22:03:07 +00:00
Stanislav Mekhanoshin	0bee630814	LoadStoreVectorizer crashes due to unsized type When we skip bitcasts while looking for GEP in LoadStoreVectorizer we should also verify that the type is sized, otherwise we assert. Differential Revision: https://reviews.llvm.org/D45709 llvm-svn: 330221	2018-04-17 21:40:04 +00:00
Michael Zolotukhin	21458fdc55	Revert "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again." This reverts r330175. There are still stage3/stage4 miscompares. llvm-svn: 330180	2018-04-17 07:31:27 +00:00
Michael Zolotukhin	a6e7bd7001	[SSAUpdaterBulk] Add debug logging. llvm-svn: 330176	2018-04-17 04:45:40 +00:00
Michael Zolotukhin	3f5fd1b129	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again. One more, hopefully the last, bug is fixed: when forming UsesToRewrite we should ignore phi operands coming from edges that we want to delete. This reverts r329910. llvm-svn: 330175	2018-04-17 04:45:22 +00:00
Haicheng Wu	f7466f3164	[SLP] Use getExtractWithExtendCost() to compute the scalar cost of extractelement/ext pair We use getExtractWithExtendCost to calculate the cost of extractelement and s\|zext together when computing the extract cost after vectorization, but we calculate the cost of extractelement and s\|zext separately when computing the scalar cost which is larger than it should be. Differential Revision: https://reviews.llvm.org/D45469 llvm-svn: 330143	2018-04-16 18:09:49 +00:00
Sanjay Patel	f4c4fc77cd	[InstCombine] simplify code in SimplifyAssociativeOrCommutative; NFCI llvm-svn: 330137	2018-04-16 17:15:13 +00:00
Sanjay Patel	d93b8a0740	[InstCombine] simplify getBinOpsForFactorization(); NFC llvm-svn: 330129	2018-04-16 15:19:24 +00:00
Sanjay Patel	1170daa277	[InstCombine] simplify fneg+fadd folds; NFC Two cleanups: 1. As noted in D45453, we had tests that don't need FMF that were misplaced in the 'fast-math.ll' test file. 2. This removes the final uses of dyn_castFNegVal, so that can be deleted. We use 'match' now. llvm-svn: 330126	2018-04-16 14:13:57 +00:00
Sanjay Patel	77e990d887	[InstCombine] fix formatting; NFC llvm-svn: 330124	2018-04-16 13:21:15 +00:00
Roman Lebedev	f84bfb2147	[InstCombine] Simplify 'xor' to 'or' if no common bits are set. Summary: In order to get the whole fold as specified in [[ https://bugs.llvm.org/show_bug.cgi?id=6773 \| PR6773 ]], let's first handle the simple straight-forward things. Let's start with the `and` -> `or` simplification. The one obvious thing missing here: the constant mask is not handled. I have an idea how to handle it, but it will require some thinking, and is not strictly required here, so i've left that for later. https://rise4fun.com/Alive/Pkmg Reviewers: spatel, craig.topper, eli.friedman, jingyue Reviewed By: spatel Subscribers: llvm-commits Was reviewed as part of https://reviews.llvm.org/D45631 llvm-svn: 330103	2018-04-15 18:59:44 +00:00
Roman Lebedev	25cbb62d18	[NFC] ConstantOffsetExtractor::CanTraceInto(): add FIXME: no tests As suggested in https://reviews.llvm.org/D45631#1068338, looking at haveNoCommonBitsSet() users, and trying to show the change effect elsewhere. llvm-svn: 330100	2018-04-15 18:59:27 +00:00
Sanjay Patel	34ea6cdfab	[InstCombine] simplify more code for distributive property; NFCI Also, fix capitalization to current style. Follow-up to: rL330096 llvm-svn: 330097	2018-04-15 16:20:58 +00:00
Sanjay Patel	f1aa0d7af2	[InstCombine] simplify code for distributive property; NFCI llvm-svn: 330096	2018-04-15 15:39:57 +00:00
Warren Ristow	8b2f27ce3a	[InstCombine] Enable Add/Sub simplifications with only 'reassoc' FMF These simplifications were previously enabled only with isFast(), but that is more restrictive than required. Since r317488, FMF has 'reassoc' to control these cases at a finer level. llvm-svn: 330089	2018-04-14 19:18:28 +00:00
Hiroshi Inoue	ae17900997	[NFC] fix trivial typos in document and comments "not not" -> "not" etc llvm-svn: 330083	2018-04-14 08:59:00 +00:00
Roman Tereshin	dab10b5468	[DebugInfo][OPT] NFC follow-up on "Fixing a couple of DI duplication bugs of CloneModule" llvm-svn: 330070	2018-04-13 21:23:11 +00:00
Roman Tereshin	d769eb36ab	[DebugInfo][OPT] Fixing a couple of DI duplication bugs of CloneModule As demonstrated by the regression tests added in this patch, the following cases are valid cases: 1. A Function with no DISubprogram attached, but various debug info related to its instructions, coming, for instance, from an inlined function, also defined somewhere else in the same module; 2. ... or coming exclusively from the functions inlined and eliminated from the module entirely. The ValueMap shared between CloneFunctionInto calls within CloneModule needs to contain identity mappings for all of the DISubprogram's to prevent them from being duplicated by MapMetadata / RemapInstruction calls, this is achieved via DebugInfoFinder collecting all the DISubprogram's. However, CloneFunctionInto was missing calls into DebugInfoFinder for functions w/o DISubprogram's attached, but still referring DISubprogram's from within (case 1). This patch fixes that. The fix above, however, exposes another issue: if a module contains a DISubprogram referenced only indirectly from other debug info metadata, but not attached to any Function defined within the module (case 2), cloning such a module causes a DICompileUnit duplication: it will be moved in indirectly via a DISubprogram by DebugInfoFinder first (because of the first bug fix described above), without being self-mapped within the shared ValueMap, and then will be copied during named metadata cloning. So this patch makes sure DebugInfoFinder visits DICompileUnit's referenced from DISubprogram's as it goes w/o re-processing llvm.dbg.cu list over and over again for every function cloned, and makes sure that CloneFunctionInto self-maps DICompileUnit's referenced from the entire function, not just its own DISubprogram attached that may also be missing. The most convenient way of testing CloneModule I found is to rely on CloneModule call from `opt -run-twice`, instead of writing tedious unit tests. 
That feature has a couple of properties that makes it hard to use for this purpose though: 1. CloneModule doesn't copy source filename, making `opt -run-twice` report it as a difference. 2. `opt -run-twice` does the second run on the original module, not its clone, making the result of cloning completely invisible in opt's actual output with and without `-run-twice` both, which directly contradicts `opt -run-twice`s own error message. This patch fixes this as well. Reviewed By: aprantl Reviewers: loladiro, GorNishanov, espindola, echristo, dexonsmith Subscribers: vsk, debug-info, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45593 llvm-svn: 330069	2018-04-13 21:22:24 +00:00
Krzysztof Parzyszek	dfed941eec	[LV] Introduce TTI::getMinimumVF The function getMinimumVF(ElemWidth) will return the minimum VF for a vector with elements of size ElemWidth bits. This value will only apply to targets for which TTI::shouldMaximizeVectorBandwidth returns true. The value of 0 indicates that there is no minimum VF. Differential Revision: https://reviews.llvm.org/D45271 llvm-svn: 330062	2018-04-13 20:16:32 +00:00
Mandeep Singh Grang	636d94db3b	[Transforms] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: kcc, pcc, danielcdh, jmolloy, sanjoy, dberlin, ruiu Reviewed By: ruiu Subscribers: ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D45142 llvm-svn: 330059	2018-04-13 19:47:57 +00:00
Andrey Konovalov	1ba9d9c6ca	hwasan: add -fsanitize=kernel-hwaddress flag This patch adds -fsanitize=kernel-hwaddress flag, that essentially enables -hwasan-kernel=1 -hwasan-recover=1 -hwasan-match-all-tag=0xff. Differential Revision: https://reviews.llvm.org/D45046 llvm-svn: 330044	2018-04-13 18:05:21 +00:00
Roman Lebedev	c00659328a	[InstCombine]: foldSelectICmpAndAnd(): and is commutative Summary: The fold added in D45108 did not account for the fact that the and instruction is commutative, and if the mask is a variable, the mask variable and the fold variable may be swapped. I have noticed this by accident when looking into [[ https://bugs.llvm.org/show_bug.cgi?id=6773 \| PR6773 ]] This extends/generalizes that fold, so it is handled too. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45539 llvm-svn: 330001	2018-04-13 09:57:57 +00:00
Craig Topper	254ed028a4	[X86] Remove the pmuldq/pmuludq intrinsics and replace with native IR. This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics. We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well. llvm-svn: 329990	2018-04-13 06:07:18 +00:00
Xin Tong	d83c883d29	[CallSiteSplit] Fix comment. NFC llvm-svn: 329987	2018-04-13 04:35:38 +00:00
Eli Friedman	e1938cbc87	Don't call skipModule for CFI lowering passes. opt-bisect shouldn't skip these passes; they lower intrinsics which no other pass can handle. llvm-svn: 329961	2018-04-12 22:04:11 +00:00
Benjamin Kramer	b4ba3988bb	Revert "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time." This reverts commit r329865. Causes stage2/stage3 miscompare. llvm-svn: 329910	2018-04-12 13:52:02 +00:00
Sam Parker	9737535943	[IRCE] isKnownNonNegative helper function Created a helper function to query for non negative SCEVs. Uses the SGE predicate to catch constants that could be interpreted as negative. Differential Revision: https://reviews.llvm.org/D45481 llvm-svn: 329907	2018-04-12 12:49:40 +00:00
Hiroshi Inoue	bcadfee2ad	[NFC] fix trivial typos in documents and comments "is is" -> "is", "if if" -> "if", "or or" -> "or" llvm-svn: 329878	2018-04-12 05:53:20 +00:00
George Burgess IV	48ee59b6f0	[DeadArgElim] Remove allocsize attributes on callsites We're already removing allocsize attributes from Functions that we remove args from, since removing arguments from a function may make the allocsize attribute incorrect. It appears we forgot to also remove them from callsites. Without this, I get verifier errors on `@Test2`. It probably wouldn't be too hard to make DAE properly update allocsize attributes instead of dropping them, but I can't think of a scenario where that'd be useful in practice. llvm-svn: 329868	2018-04-12 02:06:01 +00:00
Michael Zolotukhin	815f453f76	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. This reapplies commit r329644. llvm-svn: 329865	2018-04-11 23:37:53 +00:00
Michael Zolotukhin	4fbb93003b	[SSAUpdaterBulk] Fix linux bootstrap/sanitizer failures: explicitly specify order of evaluation. The standard says that the order of evaluation of an expression s[x] = foo() is unspecified. In our case, we first create an empty entry in the map, then call foo(), then store its return value to the created entry. The problem is that foo uses the map as a cache, so if it finds that there is an entry in the map, it stops computation. This change explicitly sets the order, thus fixing this heisenbug. llvm-svn: 329864	2018-04-11 23:37:37 +00:00
Sanjay Patel	ff98682c9c	[InstCombine] limit X - (cast(-Y) --> X + cast(Y) with hasOneUse() llvm-svn: 329821	2018-04-11 15:57:18 +00:00
Artur Gainullin	d928201ac5	Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max. Bitwise 'not' of the min/max could be eliminated in the pattern: %notx = xor i32 %x, -1 %cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y %smax = select i1 %cmp1, i32 %notx, i32 %y %res = xor i32 %smax, -1 https://rise4fun.com/Alive/lCN Reviewers: spatel Reviewed by: spatel Subscribers: a.elovikov, llvm-commits Differential Revision: https://reviews.llvm.org/D45317 llvm-svn: 329791	2018-04-11 10:29:37 +00:00
Sriraman Tallam	182f2df7c5	Simplification of libcall like printf->puts must check for RtLibUseGOT metadata. With -fno-plt, for example, calls to printf when getting converted to puts still use the PLT. This patch checks for the metadata "RtLibUseGOT" and annotates the declaration with the right attributes. Differential Revision: https://reviews.llvm.org/D45180 llvm-svn: 329768	2018-04-10 23:32:36 +00:00
Sanjay Patel	3b6d46761f	[CVP] simplify phi with constant incoming values that match common variable edge values This is based on an example that was recently posted on llvm-dev: void *propagate_null(void* b, int* g) { if (!b) { return 0; } (*g)++; return b; } https://godbolt.org/g/xYk3qG The original code or constant propagation in other passes has obscured the fact that the phi can be removed completely. Differential Revision: https://reviews.llvm.org/D45448 llvm-svn: 329755	2018-04-10 20:42:39 +00:00
Michael Zolotukhin	d6beefd5d3	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. This reverts r329661. Bots are still unhappy. llvm-svn: 329666	2018-04-10 03:40:29 +00:00
Michael Zolotukhin	8a13f6d4a7	Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."" This reapplies commit r329644. llvm-svn: 329661	2018-04-10 02:16:45 +00:00
Michael Zolotukhin	aa7868594e	[SSAUpdaterBulk] Handle CFG with unreachable from entry blocks. llvm-svn: 329660	2018-04-10 02:16:29 +00:00
Michael Zolotukhin	0274632ee6	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." This reverts commit r329644. llvm-svn: 329650	2018-04-10 00:42:43 +00:00
Hideki Saito	d829973794	Fix for the buildbot failure. Now-unused private field TTI deleted. llvm-svn: 329649	2018-04-10 00:38:36 +00:00
Hideki Saito	dfa932b049	[NFC][LV] Move InterleaveInfo from Legal to CostModel Summary: Another clean up, following D43208. Interleaved memory access analysis/optimization has nothing to do with vectorization legality. It doesn't really belong there. On the other hand, cost model certainly has to know about it. In principle, vectorization should proceed like Legality ==> Optimization ==> CostModel ==> CodeGen, and this change just does that, by moving the interleaved access analysis/decision out of Legal, and running it just before the CostModel object is created. After this, I can move LoopVectorizationLegality and Hints/Requirements classes into their own header file, making it shareable within Transform tree. I have the patch already but I don't want to mix with this change. Eventual goal is to move to Analysis tree, but I first need to move RecurrenceDescriptor/InductionDescriptor from Transform/Util/LoopUtil.* to Analysis. Reviewers: rengolin, hfinkel, mkuper, dcaballe, sguggill, fhahn, aemerson Reviewed By: rengolin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45072 llvm-svn: 329645	2018-04-09 23:45:40 +00:00
Michael Zolotukhin	c6d2d65f37	[PR16756] Use SSAUpdaterBulk in JumpThreading. Summary: SSAUpdater is a bottleneck in JumpThreading, and this patch improves the situation by using SSAUpdaterBulk instead. Compile time impact: no noticeable changes on CTMark, a big improvement on the test from PR16756. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329644	2018-04-09 23:37:37 +00:00
Michael Zolotukhin	52b064f3d3	[PR16756] Add SSAUpdaterBulk. Summary: SSAUpdater is a bottleneck in a number of passes, and one of the reasons is that it performs a lot of unnecessary computations (DT/IDF) over and over again. This patch adds a new SSAUpdaterBulk that uses existing DT and avoids recomputing IDF when possible. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329643	2018-04-09 23:37:20 +00:00
Simon Pilgrim	23c2182c2b	Support generic expansion of ordered vector reduction (PR36732) Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order. This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1] Differential Revision: https://reviews.llvm.org/D45366 llvm-svn: 329585	2018-04-09 15:44:20 +00:00
Xin Tong	fdad23bc36	[MergeICmp] Update debug msg.NFC llvm-svn: 329572	2018-04-09 14:29:13 +00:00
Xin Tong	0efadbbcde	[MergeICmp] Split blocks that do other work. Summary: We do not try to move the instructions and split the block till we know the blocks can be split, i.e. BCE-cmp-insts can be separated from non-BCE-cmp-insts. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44443 llvm-svn: 329564	2018-04-09 13:14:06 +00:00
Max Kazantsev	8624a4786a	[IRCE] Relax restriction on collected range checks In IRCE, we have a very old legacy check that works when we collect comparisons that we treat as range checks. It ensures that the value against which the indvar is compared is loop invariant and is also positive. This latter condition remained there since the times when IRCE was only able to handle signed latch comparison. As the optimization evolved, it now learned how to intersect signed or unsigned ranges, and this logic has no reliance on the fact that the right border of each range should be positive. The old implementation of this non-negativity check was also naive enough and just looked into ranges (while most of other IRCE logic tries to use power of SCEV implications), so this check did not allow to deal with the most simple case that looks like follows: int size; // not known non-negative int length; //known non-negative; i = 0; if (size != 0) { do { range_check(i < size); range_check(i < length); ++i; } while (i < size) } In this case, even if from some dominating conditions IRCE could parse loop structure, it could only remove the range check against `length` and simply ignored the check against `size`. In this patch we remove this obsolete check. It will allow IRCE to pick comparison against `size` as a potential range check and then let Range Intersection logic decide whether it is OK to eliminate it or not. Differential Revision: https://reviews.llvm.org/D45362 Reviewed By: samparker llvm-svn: 329547	2018-04-09 06:01:22 +00:00
Hiroshi Inoue	9ff2380ea6	[NFC] fix trivial typos in comments and error message "is is" -> "is", "are are" -> "are" llvm-svn: 329546	2018-04-09 04:37:53 +00:00
Xin Tong	99c4e2f364	[LIR] Reorder header. NFC llvm-svn: 329530	2018-04-08 13:19:53 +00:00
Sanjay Patel	2a24958923	[InstCombine] simplify code that propagates FMF; NFC llvm-svn: 329503	2018-04-07 14:14:23 +00:00
Roman Lebedev	41922f1a6d	[InstCombine] Get rid of select of bittest (PR36950 / PR17564) Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 \| PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 \| PR17564 ]], D45065, D45107 https://godbolt.org/g/iAYRup Alive proof: https://rise4fun.com/Alive/uiH Testing: `ninja check-llvm` Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45108 llvm-svn: 329492	2018-04-07 10:37:24 +00:00
Nico Weber	b64da22db7	Remove trailing space in build file. llvm-svn: 329479	2018-04-07 03:30:28 +00:00
Vitaly Buka	9cb59b92cc	Fix warning by cl::opt<int> -> cl::opt<unsigned> llvm-svn: 329461	2018-04-06 21:41:17 +00:00
Vitaly Buka	66f53d71f7	Runtime flag to control branch funnel threshold Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45193 llvm-svn: 329459	2018-04-06 21:32:36 +00:00
Geoff Berry	5bf4a5eafa	[EarlyCSE] Add debug counter for debugging mis-optimizations. NFC. Reviewers: reames, spatel, davide, dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45162 llvm-svn: 329443	2018-04-06 18:47:33 +00:00
Sanjay Patel	a9ca709011	[InstCombine] limit nsz: -(X - Y) --> Y - X to hasOneUse() As noted in the post-commit discussion for r329350, we shouldn't generally assume that fsub is the same cost as fneg. llvm-svn: 329429	2018-04-06 17:24:08 +00:00
Simon Pilgrim	a74f4ae404	Strip trailing whitespace. NFCI. llvm-svn: 329421	2018-04-06 17:01:54 +00:00
Mircea Trofin	aa3fea6cb0	[GlobalOpt] Fix support for casts in ctors. Summary: Fixing an issue where initializations of globals where constructors use casts were silently translated to 0-initialization. Reviewers: davidxl, evgeny777 Reviewed By: evgeny777 Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45198 llvm-svn: 329409	2018-04-06 15:54:47 +00:00
Chad Rosier	45735b8e40	[LoopUnroll] Make LoopPeeling respect the AllowPeeling preference. The SimpleLoopUnrollPass isn't supposed to perform loop peeling. Differential Revision: https://reviews.llvm.org/D45334 llvm-svn: 329395	2018-04-06 13:57:21 +00:00
Hans Wennborg	b230c763a4	EntryExitInstrumenter: Handle musttail calls Inserting instrumentation between a musttail call and ret instruction would create invalid IR. Instead, treat musttail calls as function exits. llvm-svn: 329385	2018-04-06 10:14:09 +00:00
Max Kazantsev	832563a782	[NFC] Add missing end of line symbols llvm-svn: 329383	2018-04-06 09:47:06 +00:00
Sanjay Patel	04683de82f	[InstCombine] FP: Z - (X - Y) --> Z + (Y - X) This restores what was lost with rL73243 but without re-introducing the bug that was present in the old code. Note that we already have these transforms if the ops are marked 'fast' (and I assume that's happening somewhere in the code added with rL170471), but we clearly don't need all of 'fast' for these transforms. llvm-svn: 329362	2018-04-05 23:21:15 +00:00
Sanjay Patel	03e2526728	[InstCombine] nsz: -(X - Y) --> Y - X This restores part of the fold that was removed with rL73243 (PR4374). llvm-svn: 329350	2018-04-05 21:37:17 +00:00
Daniel Neilson	367c2aea4e	[InstCombine] Properly change GEP type when reassociating loop invariant GEP chains Summary: This is a fix to PR37005. Essentially, rL328539 ([InstCombine] reassociate loop invariant GEP chains to enable LICM) contains a bug whereby it will convert: %src = getelementptr inbounds i8, i8* %base, <2 x i64> %val %res = getelementptr inbounds i8, <2 x i8*> %src, i64 %val2 into: %src = getelementptr inbounds i8, i8* %base, i64 %val2 %res = getelementptr inbounds i8, <2 x i8*> %src, <2 x i64> %val By swapping the index operands if the GEPs are in a loop, and %val is loop variant while %val2 is loop invariant. This fix recreates new GEP instructions if the index operand swap would result in the type of %src changing from vector to scalar, or vice versa. Reviewers: sebpop, spatel Reviewed By: sebpop Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45287 llvm-svn: 329331	2018-04-05 18:51:45 +00:00
Sanjay Patel	deaf4f354e	[InstCombine] use pattern matchers for fsub --> fadd folds This allows folding for vectors with undef elements. llvm-svn: 329316	2018-04-05 17:06:45 +00:00
Sanjay Patel	236442e063	[InstCombine] cleanup; NFC llvm-svn: 329282	2018-04-05 13:24:26 +00:00
Florian Hahn	6e0043365b	[LoopInterchange] Add stats counter for number of interchanged loops. Reviewers: samparker, karthikthecool, blitz.opensource Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D45209 llvm-svn: 329269	2018-04-05 10:39:23 +00:00
Florian Hahn	831a757728	[LoopInterchange] Preserve LoopInfo after interchanging. LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to move the BBs of the interchanged loops to the right place. Reviewers: davide, efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45278 llvm-svn: 329264	2018-04-05 09:48:45 +00:00
Taewook Oh	e0db533feb	[CallSiteSplitting] Do not perform callsite splitting inside landing pad Summary: If the callsite is inside a landing pad, do not perform callsite splitting. Callsite splitting uses the utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with the assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumption is not true because SplitCriticalEdge returns nullptr if the destination block is a landing pad. This invalid assumption results in an assertion failure. A fundamental solution might be fixing llvm::SplitEdge to not rely on the invalid assumption. However, it would involve a lot of work because the current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting not attempt splitting if the callsite is in a landing pad. The attached test case will crash with an assertion failure without the fix. Reviewers: fhahn, junbuml, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45130 llvm-svn: 329250	2018-04-05 04:16:23 +00:00
Evgeniy Stepanov	1f1a7a719d	hwasan: add -hwasan-match-all-tag flag Sometimes instead of storing addresses as is, the kernel stores the address of a page and an offset within that page, and then computes the actual address when it needs to make an access. Because of this the pointer tag gets lost (gets set to 0xff). The solution is to ignore all accesses tagged with 0xff. This patch adds a -hwasan-match-all-tag flag to hwasan, which allows ignoring accesses through pointers with a particular pointer tag value for validity. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D44827 llvm-svn: 329228	2018-04-04 20:44:59 +00:00
Benjamin Kramer	1fc0da4849	Make helpers static. NFC. llvm-svn: 329170	2018-04-04 11:45:11 +00:00
Nicolai Haehnle	eb7311ffb1	StructurizeCFG: Test for branch divergence correctly Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass. Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Differential Revision: https://reviews.llvm.org/D43743 llvm-svn: 329165	2018-04-04 10:58:15 +00:00
Craig Topper	7d3aba6687	[SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store. Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors. This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store. Reviewers: efriedma, jmolloy, ABataev Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39760 llvm-svn: 329142	2018-04-04 03:47:17 +00:00
Ikhlas Ajbar	1376d934ed	[Hexagon] peel loops with runtime small trip counts Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329129	2018-04-03 22:55:09 +00:00
Sanjay Patel	81b3b10a95	[InstCombine] allow more fmul folds with 'reassoc' The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121	2018-04-03 22:19:19 +00:00
Vlad Tsyrklevich	07cf78cdad	Fix bad copy-and-paste in r329108 llvm-svn: 329118	2018-04-03 21:40:27 +00:00
Gor Nishanov	d4712715dd	[coroutines] Respect alloca alignment requirements when building coroutine frame Summary: If an alloca needs to be stored in the coroutine frame and has a specified alignment that does not match the natural alignment of the alloca type, insert appropriate padding into the coroutine frame to make sure that it gets the requested alignment. For example, for a packed type (whose natural alignment is 1) with an alloca alignment of 8, we may need to insert a padding field with the required number of bytes to make sure it is properly aligned. ``` %PackedStruct = type <{ i64 }> ... %data = alloca %PackedStruct, align 8 ``` If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame: ``` %f.Frame = type { ..., i16, [6 x i8], %PackedStruct } ``` Reviewers: rnk, lewissbaker, modocache Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D45221 llvm-svn: 329112	2018-04-03 20:54:20 +00:00
Florian Hahn	9467ccf447	[LoopInterchange] Add remark for calls preventing interchanging. It also updates test/Transforms/LoopInterchange/call-instructions.ll to use accesses where we can prove dependence after D35430. Reviewers: sebpop, karthikthecool, blitz.opensource Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45206 llvm-svn: 329111	2018-04-03 20:54:04 +00:00
Vlad Tsyrklevich	d17f61ea3b	Add the ShadowCallStack attribute Summary: Introduce the ShadowCallStack function attribute. It's added to functions compiled with -fsanitize=shadow-call-stack in order to mark functions to be instrumented by a ShadowCallStack pass to be submitted in a separate change. Reviewers: pcc, kcc, kubamracek Reviewed By: pcc, kcc Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44800 llvm-svn: 329108	2018-04-03 20:10:40 +00:00
Alexey Bataev	d5b1f7892f	[SLP] Fixed formatting, NFC. llvm-svn: 329091	2018-04-03 17:48:14 +00:00
Daniel Neilson	901acfab0c	[InstCombine] Fold compare of int constant against a splatted vector of ints Summary: Folding patterns like: %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %ext = extractelement <4 x i8> %insvec, i32 0 %cond = icmp eq i32 %ext, 0 Combined with existing rules, this allows us to fold patterns like: %insvec = insertelement <4 x i8> undef, i8 %val, i32 0 %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %cond = icmp eq i8 %val, 0 When we construct a splat vector via a shuffle and bitcast the vector into an integer type for comparison against an integer constant, we can simplify the comparison to compare the splatted value against the integer constant. Reviewers: spatel, anna, mkazantsev Reviewed By: spatel Subscribers: efriedma, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D44997 llvm-svn: 329087	2018-04-03 17:26:20 +00:00
Alexey Bataev	428e9d9d87	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085	2018-04-03 17:14:47 +00:00
Alexey Bataev	df989c54cf	Recommit "[SLP] Fix issues with debug output in the SLP vectorizer." The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used errs() rather than dbgs(). llvm-svn: 329082	2018-04-03 16:40:33 +00:00
Benjamin Kramer	2fc3b18922	Revert "[SLP] Fix PR36481: vectorize reassociated instructions." This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071	2018-04-03 14:40:33 +00:00
Alexander Potapenko	ac70668cff	MSan: introduce the conservative assembly handling mode. The default assembly handling mode may introduce false positives in the cases when MSan doesn't understand that the assembly call initializes the memory pointed to by one of its arguments. We introduce the conservative mode, which initializes the first |sizeof(type)| bytes for every |type*| pointer passed into the assembly statement. llvm-svn: 329054	2018-04-03 09:50:06 +00:00
Chandler Carruth	597bfd8448	[SLP] Fix issues with debug output in the SLP vectorizer. The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used `errs()` rather than `dbgs()`. llvm-svn: 329046	2018-04-03 05:27:28 +00:00
Ikhlas Ajbar	b7322e8ac7	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042	2018-04-03 03:39:43 +00:00
Haicheng Wu	7f0daaeb86	[SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 llvm-svn: 329035	2018-04-03 00:05:10 +00:00
Brian Gesiak	64521bed0d	[Coroutines] Avoid assert splitting hidden coros Summary: When attempting to split a coroutine with 'hidden' visibility (for example, a C++ coroutine that is inlined when compiled with the option '-fvisibility-inlines-hidden'), LLVM would hit an assertion in include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility". The issue is that the visibility is copied from the source of the function split in the `CloneFunctionInto` function, but the linkage is not. To fix, create the new function first with external linkage, then copy the linkage from the original function after `CloneFunctionInto` is called. Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to `setDSOLocal` can be removed in CoroSplit.cpp. Test Plan: check-llvm Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk Reviewed By: rnk Subscribers: llvm-commits, eric_niebler Differential Revision: https://reviews.llvm.org/D44185 llvm-svn: 329033	2018-04-02 23:39:40 +00:00
Reid Kleckner	298ffc609b	[InstCombine] Don't strip function type casts from musttail calls Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them. Reviewers: efriedma, majnemer, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45186 llvm-svn: 329027	2018-04-02 22:49:44 +00:00
Reid Kleckner	a9e9918ee4	Treat inlining a notail call as a regular, non-tail call Otherwise, we end up inlining a musttail call into a non-tail position, which breaks verifier invariants. Fixes PR31014 llvm-svn: 329015	2018-04-02 21:23:16 +00:00
Sanjay Patel	cbb0450540	[InstCombine] add folds for icmp + sub (PR36969) (A - B) >u A --> A <u B C <u (C - D) --> C <u D https://rise4fun.com/Alive/e7j Name: ugt %sub = sub i8 %x, %y %cmp = icmp ugt i8 %sub, %x => %cmp = icmp ult i8 %x, %y Name: ult %sub = sub i8 %x, %y %cmp = icmp ult i8 %x, %sub => %cmp = icmp ult i8 %x, %y This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969 llvm-svn: 329011	2018-04-02 20:37:40 +00:00
Rong Xu	5a8d4c3357	[DeadArgumentElim] Clone function level metadatas Some Function level metadatas, such as function entry count, are not cloned in DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim after internalization. This patch clones the metadatas in the original function to the new function. Differential Revision: https://reviews.llvm.org/D44127 llvm-svn: 328991	2018-04-02 17:27:38 +00:00
Gor Nishanov	b0316d96ae	[coroutines] Add support for llvm.coro.noop intrinsics Summary: A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing. To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed. Reviewers: EricWF, modocache, rnk, lewissbaker Reviewed By: modocache Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45114 llvm-svn: 328986	2018-04-02 16:55:12 +00:00
Alexey Bataev	3decaf4275	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 328980	2018-04-02 14:51:37 +00:00
Teresa Johnson	974706ebf7	[ThinLTO] Add an import cutoff for debugging/triaging Summary: Adds -import-cutoff=N which will stop importing during the thin link after N imports. Default is -1 (no limit). Reviewers: wmi Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45127 llvm-svn: 328934	2018-04-01 15:54:40 +00:00
David Green	f80ebc8d21	[LoopRotate] Rotate loops with loop exiting latches If a loop has a loop exiting latch, it can be profitable to rotate the loop if it leads to the simplification of a phi node. Perform rotation in these cases even if loop rotate itself didn't simplify the loop to get there. Differential Revision: https://reviews.llvm.org/D44199 llvm-svn: 328933	2018-04-01 12:48:24 +00:00
Fangrui Song	956ee79795	Fix a bunch of typos. NFC llvm-svn: 328907	2018-03-30 22:22:31 +00:00
Peter Collingbourne	d03bf12c1b	DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan. This change resolves bug 36314. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D44784 llvm-svn: 328890	2018-03-30 18:37:55 +00:00
Krzysztof Parzyszek	fce30c2ba3	Revert "peel loops with runtime small trip counts" This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875	2018-03-30 16:55:44 +00:00
Ikhlas Ajbar	66c8ba5a50	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854	2018-03-30 03:05:34 +00:00
David Blaikie	f423062aff	Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation llvm-svn: 328844	2018-03-29 22:42:08 +00:00
David Blaikie	7883340331	Remove unused header to fix layering. llvm-svn: 328842	2018-03-29 22:35:59 +00:00
David Blaikie	4778bb88ef	Remove unused headers to fix layering llvm-svn: 328840	2018-03-29 22:31:39 +00:00
David Blaikie	c90289b5d3	llvm-c: Split Utils out of Scalar.h To fix layering (so that Scalar.h, a libScalarOpts header, isn't included from Utils - which libScalarOpts depends on). llvm-svn: 328839	2018-03-29 22:31:38 +00:00
Evgeniy Stepanov	50635dab26	Add msan custom mapping options. Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan. As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html Patch by vit9696(at)avp.su. Differential Revision: https://reviews.llvm.org/D44926 llvm-svn: 328830	2018-03-29 21:18:17 +00:00
Philip Reames	5c14ed89f6	[NFC][LICM] Rearrange checks to have the cheap bail out first llvm-svn: 328822	2018-03-29 20:32:15 +00:00
Haicheng Wu	c7cc87922e	[JumpThreading] Don't select an edge that we know we can't thread In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above-mentioned change prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not threaded, which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798	2018-03-29 16:01:26 +00:00
David Green	b0aa36f9c2	[LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface The existing LoopRotation.cpp is implemented as one of the loop passes instead of being a utility. The user cannot easily perform the loop rotation selectively (or on demand) under different optimization levels. For example, the loop rotation is needed as part of the logic to convert a loop into a loop with bottom test for a transformation. If the loop rotation is simply added as a loop pass before the transformation, the pass is skipped if it is compiled at -O0 or if it is explicitly disabled by the user, causing the compiler to generate incorrect code. Furthermore, as a loop pass it will rotate all loops instead of just the relevant loops. We provide a utility interface for the loop rotation so that the loop rotation can be called on demand. The changeset is as follows: - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main implementation of class LoopRotate into this file. - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the interface LoopRotation(...). - Original LoopRotation.cpp is changed to use the utility function LoopRotation in LoopRotationUtils.cpp. This is done in the same way the community did for the mem-to-reg implementation. Patch by Jin Lin! Differential Revision: https://reviews.llvm.org/D44595 llvm-svn: 328766	2018-03-29 08:48:15 +00:00
Benjamin Kramer	6b995a4a7e	[Transforms] Make sure to include the c binding header when defining c binding functions Otherwise the definitions can't see the extern C declarations and get name mangled, making it impossible for users to call them. This breaks the Go bindings. llvm-svn: 328765	2018-03-29 07:56:53 +00:00
David Blaikie	8ad9a97310	Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737	2018-03-28 22:28:50 +00:00
David Blaikie	eb8cc04ea2	Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back llvm-svn: 328720	2018-03-28 18:03:25 +00:00
David Blaikie	a373d18eb7	Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717	2018-03-28 17:44:36 +00:00
Alexander Potapenko	4e7ad0805e	[MSan] Introduce ActualFnStart. NFC This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to prepend a special basic block containing tool-specific calls to each function. Because we still want to instrument the original entry block, we'll need to store it in ActualFnStart. For MSan this will still be F.getEntryBlock(), whereas for KMSAN it'll contain the second BB. llvm-svn: 328697	2018-03-28 11:35:09 +00:00
Alexander Potapenko	e1d5877847	[MSan] Add an isStore argument to getShadowOriginPtr(). NFC This is a step towards the upcoming KMSAN implementation patch. The isStore argument is to be used by getShadowOriginPtrKernel(), it is ignored by getShadowOriginPtrUserspace(). Depending on whether a memory access is a load or a store, KMSAN instruments it with different functions, __msan_metadata_ptr_for_load_X() and __msan_metadata_ptr_for_store_X(). Those functions may return different values for a single address, which is necessary in the case the runtime library decides to ignore particular accesses. llvm-svn: 328692	2018-03-28 10:17:17 +00:00
Xin Tong	0272cb077f	80-line wrap. NFC llvm-svn: 328660	2018-03-27 19:43:02 +00:00
Rong Xu	662f38b16f	[PGO] Fix branch probability remarks assert Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653	2018-03-27 18:55:56 +00:00
Krzysztof Parzyszek	5d93fdfa89	[LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632	2018-03-27 16:14:11 +00:00
Max Kazantsev	b1ad66ff12	[LoopUnroll][NFC] Remove redundant canPeel check We check `canPeel` twice: when evaluating the number of iterations to be peeled and within the method `peelLoop` that performs peeling. This method is only executed if the calculated peel count is positive. Thus, the check in `peelLoop` can never fail. This patch replaces this check with an assert. Differential Revision: https://reviews.llvm.org/D44919 Reviewed By: fhahn llvm-svn: 328615	2018-03-27 09:40:51 +00:00
Sam Parker	90b7f4f72c	[IRCE] Enable decreasing loops of non-const bound As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613	2018-03-27 08:24:53 +00:00
Sanjay Patel	0e3167cb30	[InstCombine] improve code comment; NFC llvm-svn: 328560	2018-03-26 17:52:02 +00:00
Sebastian Pop	d870aea03e	[InstCombine] reassociate loop invariant GEP chains to enable LICM This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib. do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1 In this example %idx.ext1 is a loop invariant. It will be moved above the use of loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sinked down in the GEP chain. This patch will produce the following output: do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1 %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext llvm-svn: 328539	2018-03-26 16:19:31 +00:00
Sanjay Patel	4fd4fd610c	[InstCombine] distribute fmul over fadd/fsub This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected. We still need to loosen the FMF check. llvm-svn: 328502	2018-03-26 15:03:57 +00:00
Sanjay Patel	2455fef497	[InstCombine] check uses before creating instructions for fmul distribution As the tests show, we could create extra instructions without any obvious benefit. llvm-svn: 328498	2018-03-26 14:25:43 +00:00
Krzysztof Parzyszek	0b377e0ae9	[LSR] Allow giving priority to post-incrementing addressing modes Implement TTI interface for targets to indicate that the LSR should give priority to post-incrementing addressing modes. Combination of patches by Sebastian Pop and Brendon Cahoon. Differential Revision: https://reviews.llvm.org/D44758 llvm-svn: 328490	2018-03-26 13:10:09 +00:00
Max Kazantsev	a55749312b	[LoopUnroll] Fix dangling pointers in SCEV Current logic of loop SCEV invalidation in Loop Unroller implicitly relies on the fact that exit count of outer loops cannot rely on exiting blocks of inner loops, which is true in the current implementation of backedge taken count calculation but is wrong in general. As a result, when we only forget the loop that we have just unrolled, we may still have cached data for its outer loops (in particular, exit counts) which keeps references on blocks of inner loop that could have been changed or even deleted. The attached test demonstrates a situation when after unrolling of innermost loop the outermost loop contains a dangling pointer on non-existent block. The problem shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV smarter about exit count calculation. I am not sure if the bug exists without this patch, it appears that now it is accidentally correct just because in practice exact backedge taken count for outer loops with complex control flow inside is never calculated. But when SCEV learns to do so, this problem shows up. This patch replaces existing logic of SCEV loop invalidation with a correct one, which happens to be invalidation of outermost loop (which also leads to invalidation of all loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers on removed blocks, or just outdated information that has changed after unrolling. Differential Revision: https://reviews.llvm.org/D44818 Reviewed By: samparker llvm-svn: 328483	2018-03-26 11:31:46 +00:00
Benjamin Kramer	8840f644b4	[DeadArgElim] Strip allocsize attributes when deleting an argument. Since allocsize refers to the argument number it gets invalidated when an argument is removed and the numbers shift. llvm-svn: 328481	2018-03-26 09:44:24 +00:00
Sam Parker	53a423a417	[IRCE] Enable increasing loops of variable bounds CanBeMin is currently used which will report true for any unknown values, but often a check is performed outside the loop which covers this situation: for (int i = 0; i < N; ++i) ... if (N > 0) for (int i = 0; i < N; ++i) ... So I've added 'LoopGuardedAgainstMin' which reports whether N is greater than the minimum value which then allows loops with a variable loop count to be optimised. I've also moved the increasing bound checking into its own function and replaced SumCanReachMax with another isLoopEntryGuardedByCond function. llvm-svn: 328480	2018-03-26 09:29:42 +00:00
Sanjay Patel	93e64dd9a1	[PatternMatch] allow undef elements when matching vector FP +0.0 This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461	2018-03-25 21:16:33 +00:00
Sanjay Patel	841aac04d4	[InstCombine] peek through more icmp of FP cast + bitcast This is an extension of rL328426 as noted in D44367. llvm-svn: 328448	2018-03-25 14:01:42 +00:00
Sanjay Patel	745a9c62c2	[InstCombine] peek through FP casts for sign-bit compares (PR36682) This pattern came up in PR36682: https://bugs.llvm.org/show_bug.cgi?id=36682 https://godbolt.org/g/LhuD9A Equality checks are planned as a follow-up enhancement. Differential Revision: https://reviews.llvm.org/D44367 llvm-svn: 328426	2018-03-24 15:45:02 +00:00
Sanjay Patel	286074e8a1	[InstCombine] fix formatting; NFC llvm-svn: 328425	2018-03-24 15:41:59 +00:00
David Blaikie	53f51c1df8	Remove unused header from EntryExitInstrumenter Fixes layering, since Transforms/Utils doesn't depend on CodeGen, so shouldn't include headers from it. llvm-svn: 328399	2018-03-24 00:06:14 +00:00
Philip Reames	6a1f3446b5	[GuardWidening] Group code by class [NFC] llvm-svn: 328387	2018-03-23 23:41:47 +00:00
David Blaikie	4fe1fe1418	Fix Layering, move instrumentation transform headers into Instrumentation subdirectory llvm-svn: 328379	2018-03-23 22:11:06 +00:00
Fedor Sergeev	6660fd0f95	[PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass Summary: This was motivated by absence of PruneEH functionality in new PM. It was decided that a proper way to do PruneEH is to add NoUnwind inference into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top. This change generalizes attribute handling implemented for (a removal of) Convergent attribute, by introducing a generic builder-like class AttributeInferer. It registers all the attribute inference requests, storing per-attribute predicates into a vector, and then goes through an SCC Node, scanning all the instructions for not breaking attribute assumptions. The main idea is that as soon as all the instructions from all the functions of SCC Node conform to attribute assumptions then we are free to infer the attribute as set for all the functions of SCC Node. It handles two distinct cases of attributes: - those that might break due to derefinement of the function code for these attributes we are allowed to apply inference only if all the functions are "exact definitions". Example - NoUnwind. - those that do not care about derefinement for these attributes we are allowed to apply inference as soon as we see any function definition. Example - removal of Convergent attribute. Also in this commit: * Converted all the FunctionAttrs tests to use FileCheck and added new-PM invocations to them * FunctionAttrs/convergent.ll test demonstrates a difference in behavior between new and old PM implementations. Marked with FIXME. * PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg combo as intended * some of "other" tests were updated since function-attrs now infers 'nounwind' even for old PM pipeline * -disable-nounwind-inference hidden option added as a possible workaround for a supposedly rare case when nounwind being inferred by default presents a problem Reviewers: chandlerc, jlebar Reviewed By: jlebar Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D44415 llvm-svn: 328377	2018-03-23 21:46:16 +00:00
Sanjay Patel	32381d7c7e	[InstCombine] simplify code for FP intrinsic shrinking; NFCI llvm-svn: 328372	2018-03-23 21:18:12 +00:00
Alex Shlyapnikov	83e7841419	[HWASan] Port HWASan to Linux x86-64 (LLVM) Summary: Porting HWASan to Linux x86-64, first of the three patches, LLVM part. The approach is similar to ARM case, trap signal is used to communicate memory tag check failure. int3 instruction is used to generate a signal, access parameters are stored in nop [eax + offset] instruction immediately following the int3 one. One notable difference is that x86-64 has to untag the pointer before use due to the lack of feature comparable to ARM's TBI (Top Byte Ignore). Reviewers: eugenis Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44699 llvm-svn: 328342	2018-03-23 17:57:54 +00:00
Andrew Kaylor	a237866faf	Fix a block copying problem in LICM Differential Revision: https://reviews.llvm.org/D44817 llvm-svn: 328336	2018-03-23 17:36:18 +00:00
Sanjay Patel	713ca3d36a	[InstCombine] reduce code duplication; NFC llvm-svn: 328323	2018-03-23 15:07:35 +00:00
Sanjay Patel	6de89ce3f7	[InstCombine] improve variable name; NFC llvm-svn: 328322	2018-03-23 14:48:31 +00:00
Matthew Simpson	6c289a1c74	[SLP] Stop counting cost of gather sequences with multiple uses When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model. Differential Revision: https://reviews.llvm.org/D44742 llvm-svn: 328316	2018-03-23 14:18:27 +00:00
Florian Hahn	f73c3ece7f	Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. Reverted for now, due to it causing verifier failures. llvm-svn: 328312	2018-03-23 12:49:39 +00:00
Florian Hahn	b1feec087e	[IPSCCP] Use constant range information for comparisons of parameters. For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 328307	2018-03-23 11:56:00 +00:00
Florian Hahn	52436a587e	[LoopUnroll] Simplify induction variables after peeling too. Loop peeling also has an impact on the induction variables, so we should benefit from induction variable simplification after peeling too. Reviewers: sanjoy, bogner, mzolotukhin, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43878 llvm-svn: 328301	2018-03-23 10:38:12 +00:00
David Blaikie	301627f875	Move SampleProfile.h into IPO along with the rest of the IPO pass headers llvm-svn: 328262	2018-03-22 22:42:44 +00:00
David Blaikie	376294c23a	Finish moving the IPSCCP pass from Scalar to IPO - moving the registration llvm-svn: 328259	2018-03-22 22:07:53 +00:00
David Blaikie	3bbf5af0ac	Fix layering between SCCP and IPO SCCP Transforms/Scalar/SCCP.cpp implemented both the Scalar and IPO SCCP, but this meant Transforms/Scalar including Transforms/IPO headers, creating a circular dependency. (IPO depends on Scalar already) - so move the IPO SCCP shims out into IPO and the basic library implementation accessible from Scalar/SCCP.h to be used from the IPO/SCCP.cpp implementation. llvm-svn: 328250	2018-03-22 21:41:29 +00:00
David Blaikie	2965a01e98	Move the initialization of the Meta Renamer pass over to IPO along with the rest of it that was moved in r328209 llvm-svn: 328234	2018-03-22 19:36:54 +00:00
Daniel Neilson	710d7b9945	[InstCombineCalls] Update deprecated API usage (NFC) Summary: Just updating a call to MemSetInst::getAlignment() to MemSetInst::getDestAlignment(). The former has been deprecated. llvm-svn: 328227	2018-03-22 18:36:15 +00:00
Matt Morehouse	236cdaf84c	[SimplifyCFG] Create attribute for fuzzing-specific optimizations. Summary: When building with libFuzzer, converting control flow to selects or obscuring the original operands of CMPs reduces the effectiveness of libFuzzer's heuristics. This patch provides an attribute to disable or modify certain optimizations for optimal fuzzing signal. Provides a less aggressive alternative to https://reviews.llvm.org/D44057. Reviewers: vitalybuka, davide, arsenm, hfinkel Reviewed By: vitalybuka Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44232 llvm-svn: 328214	2018-03-22 17:07:51 +00:00
Anna Thomas	9b1176b0ef	[LoopPredication] Add profitability check based on BPI Summary: LoopPredication is not profitable when the loop is known to always exit through some block other than the latch block. A coarse grained latch check can cause loop predication to predicate the loop, and unconditionally deoptimize. However, without predicating the loop, the guard may never fail within the loop during the dynamic execution because the non-latch loop termination condition exits the loop before the latch condition causes the loop to exit. We teach LP about this using BranchProfileInfo pass. Reviewers: apilipenko, skatkov, mkazantsev, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44667 llvm-svn: 328210	2018-03-22 16:03:59 +00:00
David Blaikie	0368417595	Move MetaRenamer from Transforms/Utils to Transforms/IPO since it implements part of IPO.h llvm-svn: 328209	2018-03-22 15:57:47 +00:00
Florian Hahn	9bc0bc4b9b	[CallSiteSplitting] Preserve DominatorTreeAnalysis. The dominator tree analysis can be preserved easily. Some other kinds of analysis can probably be preserved too. Reviewers: junbuml, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43173 llvm-svn: 328206	2018-03-22 15:23:33 +00:00
Sanjay Patel	94c91b78e7	[InstCombine] add folds for xor-of-icmp signbit tests (PR36682) This is a retry of r328119 which was reverted at r328145 because it could crash by trying to combine icmps with different operand types. This version has a check for that and additional tests. Original commit message: This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=36682 There's also a leftover improvement from the long-ago-closed: https://bugs.llvm.org/show_bug.cgi?id=5438 https://rise4fun.com/Alive/dC1 llvm-svn: 328197	2018-03-22 14:08:16 +00:00
Florian Hahn	3bb822e7d6	[CloneFunction] Preserve DT in DuplicateInstructionsInSplitBetween. DuplicateInstructionsInSplitBetween can preserve the DT by passing through DT to SplitEdge. Reviewers: sanjoy, junbuml, anna, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D44629 llvm-svn: 328189	2018-03-22 11:38:53 +00:00
David Blaikie	2be3922807	Fix a couple of layering violations in Transforms Remove #include of Transforms/Scalar.h from Transforms/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165	2018-03-21 22:34:23 +00:00
Reid Kleckner	762331be07	Revert r328119 "[InstCombine] add folds for xor-of-icmp signbit tests (PR36682)" This asserts when compiling safe_numerics_unittest.cpp in Chromium with MSan. llvm-svn: 328145	2018-03-21 20:35:36 +00:00
Sanjay Patel	778032f39d	[InstCombine] add folds for xor-of-icmp signbit tests (PR36682) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=36682 There's also a leftover improvement from the long-ago-closed: https://bugs.llvm.org/show_bug.cgi?id=5438 https://rise4fun.com/Alive/dC1 llvm-svn: 328119	2018-03-21 17:17:13 +00:00
Daniel Neilson	6f1eb58e92	[MemCpyOpt] Update to new API for memory intrinsic alignment Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the MemCpyOpt pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. We also add a few tests to fill gaps in the testing of this pass. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 328097	2018-03-21 14:14:55 +00:00
Justin Lebar	038cbc5c13	Re-re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for causing performance regressions. Re-landing because we've determined that these regressions were noise. Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 328096	2018-03-21 14:08:21 +00:00
Philip Reames	23aed5ef6f	[MustExecute] Move isGuaranteedToExecute and related routines to Analysis Next step is to actually merge the implementations and get both implementations tested through the new printer. llvm-svn: 328055	2018-03-20 22:45:23 +00:00
Shoaib Meenai	3f689c8632	[ObjCARC] Add funclet token to ARC marker The inline assembly generated for the ARC autorelease elision marker must have a funclet token if it's emitted inside a funclet, otherwise the inline assembly (and all subsequent code in the funclet) will be marked unreachable by WinEHPrepare. Note that this only applies for the non-O0 case, since at O0, clang emits the autorelease elision marker itself rather than deferring to the backend. The fix for clang is handled in a separate change. Differential Revision: https://reviews.llvm.org/D44641 llvm-svn: 328042	2018-03-20 20:45:41 +00:00
Xin Tong	a713ebea24	[MergeICmps] Break eagerly out of loop llvm-svn: 327972	2018-03-20 12:03:25 +00:00
Xin Tong	bdbd97ed9a	[MergeICmp] Fix a bug in entry block shuffled to middle of the chain Summary: Fix a bug in entry block shuffled to middle of the chain. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44642 llvm-svn: 327971	2018-03-20 11:57:54 +00:00
Andrei Elovikov	8b8253fdc7	[LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast. Summary: It turned out to be error-prone to expect the callers to handle that - better to leave the decision to this routine and make the required data to be explicitly passed to the function. This handles the case that was missed in the r322473 and fixes the assert mentioned in PR36524. Reviewers: dorit, mssimpso, Ayal, dcaballe Reviewed By: dcaballe Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits Differential Revision: https://reviews.llvm.org/D43812 llvm-svn: 327960	2018-03-20 09:04:39 +00:00
Sanjay Patel	0ce3086777	[InstCombine] canonicalize fcmp+select to fabs This is complicated by -0.0 and nan. This is based on the DAG patterns as shown in D44091. I'm hoping that we can just remove those DAG folds and always rely on IR canonicalization to handle the matching to fabs. We would still need to delete the broken code from DAGCombiner to fix PR36600: https://bugs.llvm.org/show_bug.cgi?id=36600 Differential Revision: https://reviews.llvm.org/D44550 llvm-svn: 327858	2018-03-19 15:14:30 +00:00
Alexander Potapenko	fa0217276a	[MSan] fix the types of RegSaveAreaPtrPtr and OverflowArgAreaPtrPtr Despite their names, RegSaveAreaPtrPtr and OverflowArgAreaPtrPtr used to be i8* instead of i8**. This is important, because these pointers are dereferenced twice (first in CreateLoad(), then in getShadowOriginPtr()), but for some reason MSan allowed this - most certainly because it was possible to optimize getShadowOriginPtr() away at compile time. Differential revision: https://reviews.llvm.org/D44520 llvm-svn: 327830	2018-03-19 10:08:04 +00:00
Alexander Potapenko	014ff63f24	[MSan] Don't create zero offsets in getShadowPtrForArgument(). NFC For MSan instrumentation with MS.ParamTLS and MS.ParamOriginTLS being TLS variables, the CreateAdd() with ArgOffset==0 is a no-op, because the compiler is able to fold the addition of 0. But for KMSAN, which receives ParamTLS and ParamOriginTLS from a call to the runtime library, this introduces a stray instruction which complicates reading/testing the IR. Differential revision: https://reviews.llvm.org/D44514 llvm-svn: 327829	2018-03-19 10:03:47 +00:00
Alexander Potapenko	e0bafb4359	[MSan] Introduce insertWarningFn(). NFC This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to use a different warning function, __msan_warning_32(uptr origin), so we'd better create the warning calls in one place. Differential Revision: https://reviews.llvm.org/D44513 llvm-svn: 327828	2018-03-19 09:59:44 +00:00
Anastasis Grammenos	3a589103a4	[LICM] Salvage DI from dying Instructions LICM deletes trivially dead instructions which it won't attempt to sink. Attempt to salvage debug values which reference these instructions. llvm-svn: 327800	2018-03-18 15:59:19 +00:00
Roman Lebedev	e6da3063a5	[InstCombine] peek through unsigned FP casts for zero-equality compares (PR36682) Summary: This pattern came up in PR36682 / D44390 https://bugs.llvm.org/show_bug.cgi?id=36682 https://reviews.llvm.org/D44390 https://godbolt.org/g/oKvT5H See also D44416 Reviewers: spatel, majnemer, efriedma, arsenm Reviewed By: spatel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D44424 llvm-svn: 327799	2018-03-18 15:53:02 +00:00
Sanjay Patel	63b1028953	[InstCombine] add nnan requirement for sqrt(x) * sqrt(y) -> sqrt(x*y) This is similar to D43765. llvm-svn: 327797	2018-03-18 14:32:54 +00:00
Oren Ben Simhon	fdd72fd522	[X86] Added support for nocf_check attribute for indirect Branch Tracking X86 Supports Indirect Branch Tracking (IBT) as part of Control-Flow Enforcement Technology (CET). IBT instruments ENDBR instructions used to specify valid targets of indirect call / jmp. The `nocf_check` attribute has two roles in the context of X86 IBT technology: 1. Appertains to a function - do not add ENDBR instruction at the beginning of the function. 2. Appertains to a function pointer - do not track the target function of this pointer by adding nocf_check prefix to the indirect-call instruction. This patch implements `nocf_check` context for Indirect Branch Tracking. It also auto generates `nocf_check` prefixes before indirect branches to jump tables that are guarded by range checks. Differential Revision: https://reviews.llvm.org/D41879 llvm-svn: 327767	2018-03-17 13:29:46 +00:00
Craig Topper	71d69b2ea5	[CorrelatedValuePropagation] Use SelectInst::getCondition/getTrueValue/getFalseValue instead of getOperand for readability. NFC llvm-svn: 327728	2018-03-16 18:18:47 +00:00
Philip Reames	8a106272e8	[LICM/mustexec] Extend first iteration must execute logic to fcmps This builds on the work from https://reviews.llvm.org/D44287. It turned out supporting fcmp was much easier than I realized, so let's do that now. As an aside, our -O3 handling of floating point IVs leaves a lot to be desired. We do convert the float IV to an integer IV, but do so late enough that many other optimizations are missed (e.g. we don't vectorize). Differential Revision: https://reviews.llvm.org/D44542 llvm-svn: 327722	2018-03-16 16:33:49 +00:00
Brian M. Rzycki	f65ddc5fa2	[JumpThreading] Track unreachable BBs to avoid processing JumpThreading iterates over F until the IR quiesces. Transforming unreachable BBs increases compile time and it is also possible to never stabilize causing JumpThreading to hang. An older attempt at fixing this problem was D3991 where removeUnreachableBlocks(F) was called before JumpThreading began. This has a few drawbacks: * expensive - the routine attempts to fix up the IR to identify additional BBs that can be removed along with unreachable BBs. * aggressive - does not identify and preserve the shape of the IR. At a minimum it does not preserve loop hierarchies. * invasive - altering reachable blocks it may disrupt IR shapes that could have otherwise been JumpThreaded. This patch avoids removeUnreachableBlocks(F) and instead tracks unreachable BBs in a SmallPtrSet using DominatorTree to validate the initial state of all BBs. We then rely on subsequent passes to identify and remove these unreachable blocks from F. Reviewers: dberlin, sebpop, kuhar, dinesh.d Reviewed by: sebpop, kuhar Subscribers: hiraditya, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D44177 llvm-svn: 327713	2018-03-16 15:13:47 +00:00
Florian Hahn	fc97b6173f	[LoopUnroll] Peel off iterations if it makes conditions true/false. If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations. This improves codegen for PR34364. Reviewers: mkuper, mkazantsev, efriedma Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43876 llvm-svn: 327671	2018-03-15 21:34:43 +00:00
Philip Reames	a21d5f1e18	[LICM] Ignore exits provably not taken on first iteration when computing must execute It is common to have conditional exits within a loop which are known not to be taken on some iterations, but not necessarily all. This patch extends our reasoning around guaranteed to execute (used when establishing whether it's safe to dereference a location from the preheader) to handle the case where an exit is known not to be taken on the first iteration and the instruction of interest is known to be taken on the first iteration. This case comes up in two major ways: * If we have a range check which we've been unable to eliminate, we frequently know that it doesn't fail on the first iteration. * Pass ordering. We may have a check which will be eliminated through some sequence of other passes, but depending on the exact pass sequence we might never actually do so or we might miss other optimizations from passes run before the check is finally eliminated. The initial version (here) is implemented via InstSimplify. At the moment, it catches a few cases, but misses a lot too. I added test cases for missing cases in InstSimplify which I'll follow up on separately. Longer term, we should probably wire SCEV through to here to get much smarter loop aware simplification of the first iteration predicate. Differential Revision: https://reviews.llvm.org/D44287 llvm-svn: 327664	2018-03-15 21:04:28 +00:00
Diego Caballero	cae4994a58	[LV] Test commit. Removing white space. This is just to check that I have commit access privilege. llvm-svn: 327656	2018-03-15 19:34:27 +00:00
Philip Reames	422024a1b7	[EarlyCSE] Don't hide earlier invariant.scopes If we've already established an invariant scope with an earlier generation, we don't want to hide it in the scoped hash table with one with a later generation. I noticed this when working on the invariant-load handling, but it also applies to the invariant.start case as well. Without this change, my previous patch for invariant-load regresses some cases, so I'm pushing this without waiting for review. This is why you don't make last minute tweaks to patches to catch "obvious cases" after it's already been reviewed. Bad Philip! llvm-svn: 327655	2018-03-15 18:12:27 +00:00
Philip Reames	ca587fe0b4	[EarlyCSE] Reuse invariant scopes for invariant load This is a follow up to https://reviews.llvm.org/D43716 which rewrites the invariant load handling using the new infrastructure. It's slightly more powerful, but only in somewhat minor ways for the moment. It's not clear that DSE of stores to invariant locations is actually interesting since why would your IR have such a construct to start with? Note: The submitted version is slightly different than the reviewed one. I realized the scope could start for an invariant load which was proven redundant and removed. Added a test case to illustrate that as well. Differential Revision: https://reviews.llvm.org/D44497 llvm-svn: 327646	2018-03-15 17:29:32 +00:00
Ulrich Weigand	f4ceef8d3f	[Debug] Retain both copies of debug intrinsics in HoistThenElseCodeToIf When hoisting common code from the "then" and "else" branches of a condition to before the "if", the HoistThenElseCodeToIf routine will attempt to merge the debug location associated with the two original copies of the hoisted instruction. This is a problem in the special case where the hoisted instruction is a debug info intrinsic, since for those the debug location is considered part of the intrinsic and attempting to modify it may result in invalid IR. This is the underlying cause of PR36410. This patch fixes the problem by handling debug info intrinsics specially: instead of hoisting one copy and merging the two locations, the code now simply hoists both copies, each with its original location intact. Note that this is still only done in the case where both original copies are otherwise (i.e. apart from location metadata) identical. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D44312 llvm-svn: 327622	2018-03-15 12:28:48 +00:00

20356 Commits