llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Renouf	5a0794327a	[StructurizeCFG] Enable -structurizecfg-relaxed-uniform-regions by default D62198 introduced an option to relax the checks for hasOnlyUniformBranches. This commit turns the option on by default, for better code generation in some cases in AMDGPU. Differential Revision: https://reviews.llvm.org/D63198 Change-Id: I9cbff002a1e74d3b7eb96b4192dc8129936d537d llvm-svn: 368042	2019-08-06 14:30:19 +00:00
Hideki Saito	ec818d7fb3	[LV][NFC] Share the LV illegality reporting with LoopVectorize. Reviewers: hsaito, fhahn, rengolin Reviewed By: rengolin Patch by psamolysov, thanks! Differential Revision: https://reviews.llvm.org/D62997 llvm-svn: 367980	2019-08-06 06:08:48 +00:00
Johannes Doerfert	21fe0a314e	[Attributor][NFC] Outline common pattern into helper method This helper will also allow to also place logic to determine if an abstract attribute is necessary in the first place. llvm-svn: 367966	2019-08-06 00:55:11 +00:00
Johannes Doerfert	d0f6400978	[Attributor] Provide a generic interface to check live instructions Summary: Similar to `Attributor::checkForAllCallSites`, we now provide such functionality for instructions of a certain opcode through `Attributor::checkForAllInstructions` and the convenient wrapper `Attributor::checkForAllCallLikeInstructions`. This cleans up code, avoids duplication, and simplifies the usage of liveness information. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65731 llvm-svn: 367961	2019-08-06 00:32:43 +00:00
Johannes Doerfert	eccdf08577	[Attributor] Introduce the IRAttribute helper struct Summary: Certain properties, e.g., an AttrKind, are not shared among all abstract attributes. This patch extracts the functionality into a helper struct. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65712 llvm-svn: 367953	2019-08-05 23:35:12 +00:00
Johannes Doerfert	fb69f7688a	[Attributor] Make abstract attributes stateless To remove boilerplate, mostly passing through values to the AbstractAttriubute base class, we extract the state into an IRPosition helper. There is no function change intended but the IRPosition struct will provide more functionality down the line. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65711 llvm-svn: 367952	2019-08-05 23:32:31 +00:00
Johannes Doerfert	2402062557	[Attributor] Use proper ID for attribute lookup Summary: The new scheme is similar to the pass manager and dyn_cast scheme where we identify classes by the address of a static member. This is better than the old scheme in which we had to "invent" new Attributor enums if there was no corresponding one. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65710 llvm-svn: 367951	2019-08-05 23:30:01 +00:00
Johannes Doerfert	007153e9d4	[Attributor][NFCI] Avoid duplication of the InformationCache reference Summary: Instead of storing the reference to the InformationCache we now pass it whenever it might be needed. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65709 llvm-svn: 367950	2019-08-05 23:26:06 +00:00
Johannes Doerfert	e83f303938	[Attributor] Deduce the "no-return" attribute for functions A function is "no-return" if we never reach a return instruction, either because there are none or the ones that exist are dead. Test have been adjusted: - either noreturn was added, or - noreturn was avoided by modifying the code. The new noreturn_{sync,async} test make sure we do handle invoke instructions with a noreturn (and potentially nowunwind) callee correctly, even in the presence of potential asynchronous exceptions. llvm-svn: 367948	2019-08-05 23:22:05 +00:00
Johannes Doerfert	3d7bbc6f9c	[Attributor][Fix] Do not remove instructions during manifestation When we remove instructions cached references could still be live. This patch avoids removing invoke instructions that are replaced by calls and instead keeps them around but in a dead block. llvm-svn: 367933	2019-08-05 21:35:02 +00:00
Johannes Doerfert	924d2138fc	[Attributor][Fix] Keep invokes if handlers catch asynchronous exceptions Similar to other places where we transform invokes to calls we need to be careful if the handler (=personality) can catch asynchronous exceptions as they are not modeled as part of nounwind. This is tested with D59978. llvm-svn: 367931	2019-08-05 21:34:45 +00:00
Sanjay Patel	5dbb90bfe1	[InstCombine] combine mul+shl separated by zext This appears to slightly help patterns similar to what's shown in PR42874: https://bugs.llvm.org/show_bug.cgi?id=42874 ...but not in the way requested. That fix will require some later IR and/or backend pass to decompose multiply/shifts into something more optimal per target. Those transforms already exist in some basic forms, but probably need enhancing to catch more cases. https://rise4fun.com/Alive/Qzv2 llvm-svn: 367891	2019-08-05 16:59:58 +00:00
Sanjay Patel	1a29823b9c	[InstCombine] add extra use constraint for shl-zext fold As the test shows, we can end up with more instructions than we started with if we don't include the extra-use check. llvm-svn: 367880	2019-08-05 16:04:07 +00:00
Guillaume Chatelet	65e4b47aad	[LLVM][Alignment] Introduce Alignment Type in DataLayout Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65521 Make getFunctionPtrAlign() return MaybeAlign llvm-svn: 367817	2019-08-05 09:00:43 +00:00
Fangrui Song	d9b948b6eb	Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC F_{None,Text,Append} are kept for compatibility since r334221. llvm-svn: 367800	2019-08-05 05:43:48 +00:00
Johannes Doerfert	305b961f64	[Attributor][NFC] Create some attributes earlier llvm-svn: 367793	2019-08-04 18:40:01 +00:00
Johannes Doerfert	6471bb6f18	[Attributor][NFC] Improve debug output llvm-svn: 367792	2019-08-04 18:39:28 +00:00
Johannes Doerfert	4361da24ac	[Attributor][Fix] Resolve various liveness issues Summary: This contains various fixes: - Explicitly determine and return the next noreturn instruction. - If an invoke calls a noreturn function which is not nounwind we keep the unwind destination live. This also means we require an invoke. Though we can still add the unreachable to the normal destination block. - Check if the return instructions are dead after we look for calls to avoid triggering an optimistic fixpoint in the presence of assumed liveness information. - Make the interface work with "const" pointers. - Some simplifications While additional tests are included, full coverage is achieved only with D59978. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65701 llvm-svn: 367791	2019-08-04 18:38:53 +00:00
Johannes Doerfert	d1c3793563	[Attributor][NFC] Simplify common pattern wrt. fixpoints When a fixpoint is indicated the change status is known due to the fixpoint kind. This simplifies a common code pattern by making the connection explicit. llvm-svn: 367790	2019-08-04 18:37:38 +00:00
Johannes Doerfert	b6acee5c7b	[Attributor][NFC] Invalid DerefState is at fixpoint Summary: If the DerefBytesState (and thereby the DerefState) is invalid, we reached a fixpoint for the whole DerefState as we will not manifest/provide information then. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65586 llvm-svn: 367789	2019-08-04 17:55:15 +00:00
Yonghong Song	44b16bd4a5	[Transforms] Do not drop !preserve.access.index metadata Currently, when a GVN or CSE optimization happens, the llvm.preserve.access.index metadata is dropped. This caused a problem for BPF AbstructMemberOffset phase as it relies on the metadata (debuginfo types). This patch added proper hooks in lib/Transforms to preserve !preserve.access.index metadata. A test case is added to ensure metadata is preserved under CSE. Differential Revision: https://reviews.llvm.org/D65700 llvm-svn: 367769	2019-08-03 23:41:26 +00:00
Stefan Stipanovic	7849e41635	[Attributor][NFC] run clang-format on Attributor.cpp llvm-svn: 367757	2019-08-03 15:27:41 +00:00
Hideto Ueno	96bb347205	[Attributor] Fix dereferenceable callsite argument initialization llvm-svn: 367748	2019-08-03 04:10:50 +00:00
Stefan Stipanovic	d021617bf7	[Attributor] Using liveness in other attributes. Modifying other AbstractAttributes to use Liveness AA and skip dead instructions. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D65243 llvm-svn: 367725	2019-08-02 21:31:22 +00:00
Peter Collingbourne	196931a7dd	hwasan: Remove unused field CurModuleUniqueId. NFCI. llvm-svn: 367717	2019-08-02 20:14:58 +00:00
Alina Sbirlea	5545e6963f	[SimplifyCFG] Cleanup redundant conditions [NFC]. Summary: Since the for loop iterates over BB's predecessors, the branch conditions found must have BB as one of the successors. For an unconditional branch the successor must be BB, added `assert`. For a conditional branch, one of the two successors must be BB, simplify `else if` to `else` and `assert`. Sink common instructions outside the if/else block. Reviewers: sanjoy.google Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65596 llvm-svn: 367699	2019-08-02 18:06:54 +00:00
Sanjay Patel	9ce5f41851	[InstCombine] fold cmp+select using select operand equivalence As discussed in PR42696: https://bugs.llvm.org/show_bug.cgi?id=42696 ...but won't help that case yet. We have an odd situation where a select operand equivalence fold was implemented in InstSimplify when it could have been done more generally in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand. Here's an example: https://rise4fun.com/Alive/Xplr %cmp = icmp eq i32 %x, 2147483647 %add = add nsw i32 %x, 1 %sel = select i1 %cmp, i32 -2147483648, i32 %add => %sel = add i32 %x, 1 I've left the InstSimplify code in place for now, but my guess is that we'd prefer to remove that as a follow-up to save on code duplication and compile-time. Differential Revision: https://reviews.llvm.org/D65576 llvm-svn: 367695	2019-08-02 17:39:32 +00:00
Teresa Johnson	d2df54e6a5	[ThinLTO] Implement index-based WPD This patch adds support to the WholeProgramDevirt pass to perform index-based WPD, which is invoked from ThinLTO during the thin link. The ThinLTO backend (WPD import phase) behaves the same regardless of whether the WPD decisions were made with the index-based or (the existing) IR-based analysis. Depends on D54815. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D55153 llvm-svn: 367679	2019-08-02 13:10:52 +00:00
Serguei Katkov	de67affd00	[Loop Peeling] Introduce an option for profile based peeling disabling. This patch adds an ability to disable profile based peeling causing the peeling of all iterations and as a result prohibits further unroll/peeling attempts on that loop. The motivation to get an ability to separate peeling usage in pipeline where in the first part we peel only separate iterations if needed and later in pipeline we apply the full peeling which will prohibit further peeling. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D64983 llvm-svn: 367668	2019-08-02 09:32:52 +00:00
Serguei Katkov	bbdcc82111	[Loop Peeling] Do not close further unroll/peel if profile based peeling was not used. Current peeling cost model can decide to peel off not all iterations but only some of them to eliminate conditions on phi. At the same time if any peeling happens the door for further unroll/peel optimizations on that loop closes because the part of the code thinks that if peeling happened it is profile based peeling and all iterations are peeled off. To resolve this inconsistency the patch provides the flag which states whether the full peeling basing on profile is enabled or not and peeling cost model is able to modify this field like it does not PeelCount. In a separate patch I will introduce an option to allow/disallow peeling basing on profile. To avoid infinite loop peeling the patch tracks the total number of peeled iteration through llvm.loop.peeled.count loop metadata. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D64972 llvm-svn: 367647	2019-08-02 04:29:23 +00:00
Stanislav Mekhanoshin	6fe00a21f2	Handle casts changing pointer size in the vectorizer Added code to truncate or shrink offsets so that we can continue base pointer search if size has changed along the way. Differential Revision: https://reviews.llvm.org/D65612 llvm-svn: 367646	2019-08-02 04:03:37 +00:00
Stanislav Mekhanoshin	eee9312a85	Relax load store vectorizer pointer strip checks The previous change to fix crash in the vectorizer introduced performance regressions. The condition to preserve pointer address space during the search is too tight, we only need to match the size. Differential Revision: https://reviews.llvm.org/D65600 llvm-svn: 367624	2019-08-01 22:18:56 +00:00
Sjoerd Meijer	e0dfce0723	Follow up of rL367592, fix the build Some buildbots complained about: error: default label in switch which covers all enumeration values llvm-svn: 367603	2019-08-01 18:54:29 +00:00
Alina Sbirlea	3af2a69575	[SimplifyCFG] Mark missed Changed to true. Summary: DominatorTree is invalid after SimplifyCFG because of a missed `Changed = true` when simplifying a branch condition and removing an edge. Resolves PR42272. Reviewers: zhizhouy, manojgupta Subscribers: jlebar, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65490 llvm-svn: 367596	2019-08-01 18:37:34 +00:00
Alina Sbirlea	172838df6b	[MemorySSA] Set LoopSimplify to preserve MemorySSA in the NPM, if analysis exists. Summary: LoopSimplify is preserved in the legacy pass manager, but not in the new pass manager. Update LoopSimplify to preserve MemorySSA conditionally when the analysis is available (same behavior as the legacy pass manager). Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65418 llvm-svn: 367594	2019-08-01 18:28:28 +00:00
Sjoerd Meijer	20b198ec5e	[LV] Tail-Loop Folding This allows folding of the scalar epilogue loop (the tail) into the main vectorised loop body when the loop is annotated with a "vector predicate" metadata hint. To fold the tail, instructions need to be predicated (masked), enabling/disabling lanes for the remainder iterations. Differential Revision: https://reviews.llvm.org/D65197 llvm-svn: 367592	2019-08-01 18:21:44 +00:00
Johannes Doerfert	da4d811707	[Attributor][FIX] Indicate a missing update change User of AAReturnedValues need to know if HasOverdefinedReturnedCalls changed from false to true as it will impact the result of the return value traversal (calls are not ignored anymore). This will be tested with the tests in D59978. llvm-svn: 367581	2019-08-01 16:21:54 +00:00
Roman Lebedev	081e990d08	[IR] Value: add replaceUsesWithIf() utility Summary: While there is always a `Value::replaceAllUsesWith()`, sometimes the replacement needs to be conditional. I have only cleaned a few cases where `replaceUsesWithIf()` could be used, to both add test coverage, and show that it is actually useful. Reviewers: jdoerfert, spatel, RKSimon, craig.topper Reviewed By: jdoerfert Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65528 llvm-svn: 367548	2019-08-01 12:32:08 +00:00
Roman Lebedev	0efeaa8162	[IR] SelectInst: add swapValues() utility Summary: Sometimes we need to swap true-val and false-val of a `SelectInst`. Having a function for that is nicer than hand-writing it each time. Reviewers: spatel, RKSimon, craig.topper, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65520 llvm-svn: 367547	2019-08-01 12:31:35 +00:00
Philip Reames	79c27c9464	Fix a release-only build warning triggered by rL367485 llvm-svn: 367499	2019-08-01 01:16:08 +00:00
Philip Reames	f8e7b53657	[IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work) This is a prepatory patch for future work on support exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes. The test differences are caused by cases wherewhere getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop. llvm-svn: 367485	2019-07-31 21:15:21 +00:00
Sanjay Patel	435cdecdf7	[InstCombine] canonicalize fneg before fmul/fdiv Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it easier to implement the transforms (and possibly other fneg transforms) in 1 place because we can always start the pattern match from fneg (either the legacy binop or the new unop). There's a secondary practical benefit seen in PR21914 and PR42681: https://bugs.llvm.org/show_bug.cgi?id=21914 https://bugs.llvm.org/show_bug.cgi?id=42681 ...hoisting fneg rather than sinking seems to play nicer with LICM in IR (although this change may expose analysis holes in the other direction). 1. The instcombine test changes show the expected neutral IR diffs from reversing the order. 2. The reassociation tests show that we were missing an optimization opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says that all of these transforms are allowed (regardless of binop/unop fneg version) because: "For all other operations [besides copy/abs/negate/copysign], this standard does not specify the sign bit of a NaN result." In all of these transforms, we always have some other binop (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a potential intermediate NaN operand. (If that interpretation is wrong, then we must already have a bug in the existing transforms?) 3. The clang tests shouldn't exist as-is, but that's effectively a revert of rL367149 (the test broke with an extension of the pre-existing fneg canonicalization in rL367146). Differential Revision: https://reviews.llvm.org/D65399 llvm-svn: 367447	2019-07-31 16:53:22 +00:00
Stanislav Mekhanoshin	ba1e845c21	[AMDGPU] Fix for vectorizer crash with pointers of different size When vectorizer strips pointers it can eventually end up with pointers of two different sizes, then SCEV will crash. Differential Revision: https://reviews.llvm.org/D65480 llvm-svn: 367443	2019-07-31 16:33:11 +00:00
Florian Hahn	fa42f42858	[IPSCCP] Move callsite check to the beginning of the loop. We have some code marks instructions with struct operands as overdefined, but if the instruction is a call to a function with tracked arguments, this breaks the assumption that the lattice values of all call sites are not overdefined and will be replaced by a constant. This also re-adds the assertion from D65222, with additionally skipping non-callsite uses. This patch should address the cases reported in which the assertion fired. Fixes PR42738. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65439 llvm-svn: 367430	2019-07-31 12:57:04 +00:00
Roman Lebedev	5e4e6b1fb1	[DivRemPairs] Fixup DNDEBUG build - variable is only used in assertion llvm-svn: 367423	2019-07-31 12:26:37 +00:00
Roman Lebedev	a686c60c45	[DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673) Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. The original patch was committed in rL367288 but was reverted in rL367289 because it exposed pre-existing RAUW issues in internal data structures of the pass; those now have been addressed in a previous patch. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367419	2019-07-31 12:06:51 +00:00
Roman Lebedev	5f616901f5	[DivRemPairs] Avoid RAUW pitfalls (PR42823) Summary: `DivRemPairs` internally creates two maps: * {sign, divident, divisor} -> div instruction * {sign, divident, divisor} -> rem instruction Then it iterates over rem map, and looks if there is an entry in div map with the same key. Then depending on some internal logic it may RAUW rem instruction with something else. But if that rem instruction is an input to other div/rem, then it was used as a key in these maps, so the old value (used in key) is now dandling, because RAUW didn't update those maps. And we can't even RAUW map keys in general, there's `ValueMap`, but we don't have a single `Value` as key... The bug was discovered via D65298, and the test there exists. Now, i'm not sure how to expose this issue in trunk. The bug is clearly there if i change the map keys to be `AssertingVH`/`PoisoningVH`, but i guess this didn't miscompiled anything thus far? I really don't think this is benin without that patch. The fix is actually rather straight-forward - instead of trying to somehow shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new `ValueMap` with key being a struct of `Value`s, we can just have an intermediate data structure - a vector, each entry containing matching `Div, Rem` pair, and pre-filling it before doing any modifications. This way we won't need to query map after doing RAUW, so no bug is possible. Reviewers: spatel, bogner, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, hans, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65451 llvm-svn: 367417	2019-07-31 12:06:38 +00:00
Florian Hahn	189efe295b	Recommit "[GVN] Preserve loop related analysis/canonical forms." This fixes some pipeline tests. This reverts commit `d0b6f42936`. llvm-svn: 367401	2019-07-31 09:27:54 +00:00
Florian Hahn	d0b6f42936	Revert [GVN] Preserve loop related analysis/canonical forms. This reverts r367332 (git commit `2d7227ec3a`) llvm-svn: 367335	2019-07-30 17:04:58 +00:00
Florian Hahn	2d7227ec3a	[GVN] Preserve loop related analysis/canonical forms. LoopInfo can be easily preserved by passing it to the functions that modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor. SplitCriticalEdge also preserves LoopSimplify and LCSSA form when when passing in LoopInfo. The test case shows that we preserve LoopSimplify and LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some reason. Also I am not sure if it is possible to preserve those in the new pass manager, as they aren't analysis passes. Reviewers: reames, hfinkel, davide, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D65137 llvm-svn: 367332	2019-07-30 16:43:39 +00:00
Kit Barton	de0b633999	[LoopFusion] Extend use of OptimizationRemarkEmitter Summary: This patch extends the use of the OptimizationRemarkEmitter to provide information about loops that are not fused, and loops that are not eligible for fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops that are not eligible for fusion and the OptimizationRemarkMissed to identify loops that cannot be fused. It also reuses the statistics to provide the messages used in the OptimizationRemarks. This provides common message strings between the optimization remarks and the statistics. I would like feedback on this approach, in general. If people are OK with this, I will flesh out additional remarks in subsequent commits. Subscribers: hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63844 llvm-svn: 367327	2019-07-30 15:58:43 +00:00
Roman Lebedev	be612ea471	[InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two Summary: I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling. While we did happen to handle this fold in most of the cases, the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two), or first turn `x s% -y` to `x u% y`; that does handle most of the cases. But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`, and thus we end up being stuck with `(x s% INT_MIN) == 0`. There is no such restriction for the more general fold: https://rise4fun.com/Alive/IIeS To be noted, the fold does not enforce that `y` is a constant, so it may indeed increase instruction count. This is consistent with what `x u% y`->`x & (y-1)` already does. I think it makes sense, it's at most one (simple) extra instruction, while `rem`ainder is really much more un-simple (and likely very costly). Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65046 llvm-svn: 367322	2019-07-30 15:28:22 +00:00
Roman Lebedev	8e0cf076ac	Revert "[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)" test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke: Only PHI nodes may reference their own value! %sub33 = srem i32 %sub33, %ranks_in_i This reverts commit r367288. llvm-svn: 367289	2019-07-30 07:44:58 +00:00
Roman Lebedev	c75cdd056f	[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673) Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367288	2019-07-30 07:10:00 +00:00
Peter Collingbourne	dd9682196b	ThinLTOBitcodeWriter: Include globals associated with type metadata globals in the merged module. Globals that are associated with globals with type metadata need to appear in the merged module because they will reference the global's section directly. Differential Revision: https://reviews.llvm.org/D65312 llvm-svn: 367242	2019-07-29 17:22:40 +00:00
Sanjay Patel	e9ee7b47d4	[InstCombine] fold fadd+fneg with fdiv/fmul betweena The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367227	2019-07-29 13:50:25 +00:00
Sanjay Patel	5483f4225e	[InstCombine] reduce code for fadd with fneg operand; NFC llvm-svn: 367224	2019-07-29 13:20:46 +00:00
Sanjay Patel	99c57c6daf	[InstCombine] fold fsub+fneg with fdiv/fmul between The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367194	2019-07-28 17:10:06 +00:00
Hideto Ueno	e7bea9b73a	[Attributor] Deduce "align" attribute Summary: Deduce "align" attribute in attributor. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64152 llvm-svn: 367187	2019-07-28 07:04:01 +00:00
Sanjay Patel	02b9e45a7e	[InstSimplify] remove quadratic time looping (PR42771) The test case from: https://bugs.llvm.org/show_bug.cgi?id=42771 ...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is seemingly done just to avoid invalidating the instruction iterator. We can instead delay instruction deletion until we reach the end of the block (or we could delay until we reach the end of all blocks). There's a test diff here for a degenerate case with llvm.assume that is not meaningful in itself, but serves to verify this change in logic. This change probably doesn't result in much overall compile-time improvement because we call '-instsimplify' as a standalone pass only once in the standard -O2 opt pipeline currently. Differential Revision: https://reviews.llvm.org/D65336 llvm-svn: 367173	2019-07-27 14:05:51 +00:00
Florian Hahn	d89f6cb299	Revert [IPSCCP] Add assertion to surface cases where we zap returns with overdefined users. This reverts r366998 (git commit `5354c83ece`) This breaks a linux kernel build and we have reproducer to investigate. llvm-svn: 367160	2019-07-26 22:14:08 +00:00
Wei Mi	55a68a2400	[JumpThreading] Stop searching predecessor when the current bb is in a unreachable loop. updatePredecessorProfileMetadata in jumpthreading tries to find the first dominating predecessor block for a PHI value by searching upwards the predecessor block chain. But jumpthreading may see some temporary IR state which contains unreachable bb not being cleaned up. If an unreachable loop happens to be on the predecessor block chain, keeping chasing the predecessor block will run into an infinite loop. The patch fixes it. Differential Revision: https://reviews.llvm.org/D65310 llvm-svn: 367154	2019-07-26 20:59:22 +00:00
Sanjay Patel	a9ab31558c	[InstCombine] canonicalize negated operand of fdiv This is a transform that we use with fmul, so use it for fdiv too for consistency. llvm-svn: 367146	2019-07-26 19:56:59 +00:00
Sanjay Patel	c229cfeb7a	[InstCombine] remove flop from lerp patterns (Y * (1.0 - Z)) + (X * Z) --> Y - (Y * Z) + (X * Z) --> Y + Z * (X - Y) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=42716 Factoring eliminates an instruction, so that should be a good canonicalization. The potential conversion to FMA would be handled by the backend based on target capabilities. Differential Revision: https://reviews.llvm.org/D65305 llvm-svn: 367101	2019-07-26 11:19:18 +00:00
Serguei Katkov	7f8c809592	[Loop Utils] Extend the scope of addStringMetadataToLoop. To avoid duplicates in loop metadata, if the string to add is already there, just update the value. Reviewers: reames, Ashutosh Reviewed By: reames Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D65265 llvm-svn: 367087	2019-07-26 07:04:34 +00:00
Serguei Katkov	3c3a76527e	[Loop Utils] Move utilty addStringMetadataToLoop to LoopUtils.cpp. NFC. Just move the utility function to LoopUtils.cpp to re-use it in loop peeling. Reviewers: reames, Ashutosh Reviewed By: reames Subscribers: hiraditya, asbirlea, llvm-commits Differential Revision: https://reviews.llvm.org/D65264 llvm-svn: 367085	2019-07-26 06:10:08 +00:00
Leonard Chan	007f674c6a	Reland the "[NewPM] Port Sancov" patch from rL365838. No functional changes were made to the patch since then. -------- [NewPM] Port Sancov This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty. Changes: - Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over functions and ModuleSanitizerCoverage for passing over modules. - ModuleSanitizerCoverage exists for adding 2 module level calls to initialization functions but only if there's a function that was instrumented by sancov. - Added legacy and new PM wrapper classes that own instances of the 2 new classes. - Update llvm tests and add clang tests. llvm-svn: 367053	2019-07-25 20:53:15 +00:00
Florian Hahn	c74808b914	[PredicateInfo] Replace pointer comparisons with deterministic compares. Currently there are a few pointer comparisons in ValueDFS_Compare, which can cause non-deterministic ordering when materializing values. There are 2 cases this patch fixes: 1. Order defs before uses used to compare pointers, which guarantees defs before uses, but causes non-deterministic ordering between 2 uses or 2 defs, depending on the allocation order. By converting the pointers to booleans, we can circumvent that problem. 2. comparePHIRelated was comparing the basic block pointers of edges, which also results in a non-deterministic order and is also not really meaningful for ordering. By ordering by their destination DFS numbers we guarantee a deterministic order. For the example below, we can end up with 2 different uselist orderings, when running `opt -mem2reg -ipsccp` hundreds of times. Because the non-determinism is caused by allocation ordering, we cannot reproduce it with ipsccp alone. declare i32 @hoge() local_unnamed_addr #0 define dso_local i32 @ham(i8* %arg, i8* %arg1) #0 { bb: %tmp = alloca i32 %tmp2 = alloca i32, align 4 br label %bb19 bb4: ; preds = %bb20 br label %bb6 bb6: ; preds = %bb4 %tmp7 = call i32 @hoge() store i32 %tmp7, i32* %tmp %tmp8 = load i32, i32* %tmp %tmp9 = icmp eq i32 %tmp8, 912730082 %tmp10 = load i32, i32* %tmp br i1 %tmp9, label %bb11, label %bb16 bb11: ; preds = %bb6 unreachable bb13: ; preds = %bb20 br label %bb14 bb14: ; preds = %bb13 %tmp15 = load i32, i32* %tmp br label %bb16 bb16: ; preds = %bb14, %bb6 %tmp17 = phi i32 [ %tmp10, %bb6 ], [ 0, %bb14 ] br label %bb19 bb18: ; preds = %bb20 unreachable bb19: ; preds = %bb16, %bb br label %bb20 bb20: ; preds = %bb19 indirectbr i8* null, [label %bb4, label %bb13, label %bb18] } Reviewers: davide, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64866 llvm-svn: 367049	2019-07-25 20:48:13 +00:00
Serguei Katkov	cde00c02e1	[Loop Peeling] Fix idom detection algorithm. We'd like to determine the idom of exit block after peeling one iteration. Let Exit is exit block. Let ExitingSet - is a set of predecessors of Exit block. They are exiting blocks. Let Latch' and ExitingSet' are copies after a peeling. We'd like to find an idom'(Exit) - idom of Exit after peeling. It is an evident that idom'(Exit) will be the nearest common dominator of ExitingSet and ExitingSet'. idom(Exit) is a nearest common dominator of ExitingSet. idom(Exit)' is a nearest common dominator of ExitingSet'. Taking into account that we have a single Latch, Latch' will dominate Header and idom(Exit). So the idom'(Exit) is nearest common dominator of idom(Exit)' and Latch'. All these basic blocks are in the same loop, so what we find is (nearest common dominator of idom(Exit) and Latch)'. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D65292 llvm-svn: 367044	2019-07-25 19:31:50 +00:00
Sanjay Patel	b456310902	[SimplifyCFG] avoid crashing after simplifying a switch (PR42737) Later code in TryToSimplifyUncondBranchFromEmptyBlock() assumes that we have cleaned up unreachable blocks, but that was not happening with this switch transform. llvm-svn: 367037	2019-07-25 17:01:12 +00:00
JF Bastien	dbc0a5df8d	Allow prefetching from non-zero address spaces Summary: This is useful for targets which have prefetch instructions for non-default address spaces. <rdar://problem/42662136> Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65254 llvm-svn: 367032	2019-07-25 16:11:57 +00:00
Vlad Tsyrklevich	5d5a58317c	Revert "[InstCombine] try to narrow a truncated load" This reverts commit `bc4a63fd3c`, this is a speculative revert to fix a number of sanitizer bots (like sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2 compiler crashes, presumably due to a miscompile. llvm-svn: 367029	2019-07-25 15:37:57 +00:00
Florian Hahn	c0d0e3bda8	[PredicateInfo] Use SmallVector instead of SmallPtrSet. We do not need the SmallPtrSet to avoid adding duplicates to OpsToRename, because we already keep a ValueInfo mapping. If we see an op for the first time, Infos will be empty and we can also add it to OpsToRename. We process operands by visiting BBs depth-first and then iterate over all instructions & users, so the order should be deterministic. Therefore we can skip one round of sorting, which we purely needed for guaranteeing a deterministic order when iterating over the SmallPtrSet. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64816 llvm-svn: 367028	2019-07-25 15:35:10 +00:00
Sanjay Patel	38a0200868	[Utils] remove duplicated documentation comments; NFC http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments llvm-svn: 367015	2019-07-25 13:11:21 +00:00
Sanjay Patel	bc4a63fd3c	[InstCombine] try to narrow a truncated load trunc (load X) --> load (bitcast X to narrow type) We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated load pattern can interfere with other instcombine transforms, so I'd like to allow the fold sooner. Example: https://bugs.llvm.org/show_bug.cgi?id=16739 ...in that report, we have bitcasts bracketing these ops, so those could get eliminated too. We've generally ruled out widening of loads early in IR ( LoadCombine - http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but that reasoning may not apply to narrowing if we can preserve information such as the dereferenceable range. Differential Revision: https://reviews.llvm.org/D64432 llvm-svn: 367011	2019-07-25 12:14:27 +00:00
Florian Hahn	5354c83ece	[IPSCCP] Add assertion to surface cases where we zap returns with overdefined users. We should only zap returns in functions, where all live users have a replace-able value (are not overdefined). Unused return values should be undefined. This should make it easier to detect bugs like in PR42738. Alternatively we could bail out of zapping the function returns, but I think it would be better to address those divergences between function and call-site values where they are actually caused. Reviewers: davide, efriedma Reviewed By: davide, efriedma Differential Revision: https://reviews.llvm.org/D65222 llvm-svn: 366998	2019-07-25 09:37:09 +00:00
Sjoerd Meijer	5c606cef79	[LV] Scalar Epilogue Lowering. NFC. This refactors boolean 'OptForSize' that was passed around in a lot of places. It controlled folding of the tail loop, the scalar epilogue, into the main loop but code-size reasons may not be the only reason to do this. Thus, this is a first step to generalise the concept of tail-loop folding, and hence OptForSize has been renamed and is using an enum ScalarEpilogueStatus that holds the status how the epilogue should be lowered. This will be followed up by D65197, that picks up the predicate loop hint and performs the tail-loop folding. Differential Revision: https://reviews.llvm.org/D64916 llvm-svn: 366993	2019-07-25 08:06:02 +00:00
Chen Zheng	a2d74d3d90	[PowerPC] exclude more icmps in LSR which is converted in later hardware loop pass Differential Revision: https://reviews.llvm.org/D64795 llvm-svn: 366976	2019-07-25 01:22:08 +00:00
Evandro Menezes	5cd5f9b65d	[InstCombine] Swap order of checks to improve compile time (NFC) llvm-svn: 366962	2019-07-24 23:31:04 +00:00
Sanjay Patel	86e9f9dc26	[Transforms] move copying of load metadata to helper function; NFC There's another proposed load combine that can make use of this code in D64432. llvm-svn: 366949	2019-07-24 22:11:11 +00:00
Craig Topper	e9abc8177a	[InstCombine] Teach foldOrOfICmps to allow icmp eq MIN_INT/MAX to be part of a range comparision. Similar for foldAndOfICmps We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow it to merge with icmp ugt X, C. Similar for the other constants. We can do simliar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there. Fixes PR42691. Differential Revision: https://reviews.llvm.org/D65017 llvm-svn: 366945	2019-07-24 20:57:29 +00:00
David Bolvansky	d2904ccf88	Let CorrelatedValuePropagation preserve LazyValueInfo Summary: This patch makes CorrelatedValuePropagation preserve LazyValueInfo by adding LazyValueInfo::eraseValue & calling it whenever an instruction is erased. Passes `make check` , test-suite, and SPECrate 2017. Patch by aqjune (Juneyoung Lee) Reviewers: reames, mzolotukhin Reviewed By: reames Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59349 llvm-svn: 366942	2019-07-24 20:27:32 +00:00
Petr Hosek	8b161bacf4	[SafeStack] Insert the deref before remaining elements This is a follow up to D64971. While we need to insert the deref after the offset, it needs to come before the remaining elements in the original expression since the deref needs to happen before the LLVM fragment if present. Differential Revision: https://reviews.llvm.org/D65172 llvm-svn: 366865	2019-07-24 00:16:23 +00:00
Philip Reames	ea5c94b497	[IndVars] Fix a subtle bug in optimizeLoopExits The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts. This could cause two separate bugs: 1) We might exit the loop early, and leave optimizations undone. This is what triggered the assertion failure in the reported test case. 2) We might optimize one exit, then exit without indicating a change. This could result in an analysis invalidaton bug if no other transform is done by the rest of indvars. Note that the pointer exit counts are a really fragile concept. They show up only when we have a pointer IV w/o a datalayout to provide their size. It's really questionable to me whether the complexity implied is worth it. llvm-svn: 366829	2019-07-23 17:45:11 +00:00
Simon Pilgrim	5d4bb8628c	[SLPVectorizer] Revert local change that got accidently got committed in rL366799 This wasn't part of D63281 llvm-svn: 366807	2019-07-23 13:42:01 +00:00
Simon Pilgrim	743d45ee25	[TargetLowering] Add SimplifyMultipleUseDemandedBits This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demanded all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured. The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use. We do see a couple of regressions that need to be addressed: AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better). X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG. The code owners have confirmed its ok for these cases to fixed up in future patches. Differential Revision: https://reviews.llvm.org/D63281 llvm-svn: 366799	2019-07-23 12:39:08 +00:00
Simon Pilgrim	87adcf8c47	[SLPVectorizer] Remove null-pointer test. NFCI. cast<CallInst> shouldn't return null and we dereference the pointer in a lot of other places, causing both MSVC + cppcheck to warn about dereferenced null pointers llvm-svn: 366793	2019-07-23 10:51:43 +00:00
Hideto Ueno	9f5d80d79c	[Attributor][NFC] Re-run clang-format on the Attributor.cpp llvm-svn: 366789	2019-07-23 08:29:22 +00:00
Hideto Ueno	19c07afe17	[Attributor] Deduce "dereferenceable" attribute Summary: Deduce dereferenceable attribute in Attributor. These will be added in a later patch. * dereferenceable(_or_null)_globally (D61652) * Deduction based on load instruction (similar to D64258) Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64876 llvm-svn: 366788	2019-07-23 08:16:17 +00:00
Robert Widmann	fcf3c55a8c	[LLVM-C] Improve Bindings to The Internalize Pass Summary: Adds a binding to the internalize pass that allows the caller to pass a function pointer that acts as the visibility-preservation predicate. Previously, one could only pass an unsigned value (not LLVMBool?) that directed the pass to consider "main" or not. Reviewers: whitequark, deadalnix, harlanhaskins Reviewed By: whitequark, harlanhaskins Subscribers: kren1, hiraditya, llvm-commits, harlanhaskins Tags: #llvm Differential Revision: https://reviews.llvm.org/D62456 llvm-svn: 366777	2019-07-23 04:56:44 +00:00
Richard Trieu	3a52c3857f	Inline function call into assert to fix unused variable warning. llvm-svn: 366774	2019-07-23 03:10:06 +00:00
Stefan Stipanovic	6058b86373	Fixing build error from commit `95cbc3d` [Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64162 llvm-svn: 366769	2019-07-22 23:58:23 +00:00
Stefan Stipanovic	5a9ba27c71	Revert "Fixing build error from commit 9285295." This reverts commit `95cbc3da88`. llvm-svn: 366759	2019-07-22 22:55:05 +00:00
Stefan Stipanovic	95cbc3da88	Fixing build error from commit `9285295`. [Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D64162 llvm-svn: 366753	2019-07-22 22:10:59 +00:00
Roman Lebedev	3a94765bfc	[NFC][PatternMatch] Refactor code into a proper "matcher for any integral constant" Having it as a proper matcher is better for reusability elsewhere (in a follow-up patch.) llvm-svn: 366752	2019-07-22 22:09:24 +00:00
Eric Christopher	77dc6d2479	Temporarily Revert "[Attributor] Liveness analysis." as it's breaking the build. This reverts commit `9285295f75`. llvm-svn: 366737	2019-07-22 21:04:23 +00:00
Stefan Stipanovic	9285295f75	[Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D64162 llvm-svn: 366736	2019-07-22 20:54:30 +00:00
Stefan Stipanovic	69ebb02001	[Attributor] NoAlias on return values. Porting function return value attribute noalias to attributor. This will be followed with a patch for callsite and function argumets. Reviewers: jdoerfert Subscribers: lebedev.ri, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D63067 llvm-svn: 366728	2019-07-22 19:36:27 +00:00
Petr Hosek	f6cd6ffbc9	[SafeStack] Insert the deref after the offset While debugging code that uses SafeStack, we've noticed that LLVM produces an invalid DWARF. Concretely, in the following example: int main(int argc, char* argv[]) { std::string value = ""; printf("%s\n", value.c_str()); return 0; } DWARF would describe the value variable as being located at: DW_OP_breg14 R14+0, DW_OP_deref, DW_OP_constu 0x20, DW_OP_minus The assembly to get this variable is: leaq -32(%r14), %rbx The order of operations in the DWARF symbols is incorrect in this case. Specifically, the deref is incorrect; this appears to be incorrectly re-inserted in repalceOneDbgValueForAlloca. With this change which inserts the deref after the offset instead of before it, LLVM produces correct DWARF: DW_OP_breg14 R14-32 Differential Revision: https://reviews.llvm.org/D64971 llvm-svn: 366726	2019-07-22 18:52:42 +00:00
Peter Collingbourne	ef5cfc2dae	WholeProgramDevirt: Teach the pass to respect the global's alignment. The bytes inserted before an overaligned global need to be padded according to the alignment set on the original global in order for the initializer to meet the global's alignment requirements. The previous implementation that padded to the pointer width happened to be correct for vtables on most platforms but may do the wrong thing if the vtable has a larger alignment. This issue is visible with a prototype implementation of HWASAN for globals, which will overalign all globals including vtables to 16 bytes. There is also no padding requirement for the bytes inserted after the global because they are never read from nor are they significant for alignment purposes, so stop inserting padding there. Differential Revision: https://reviews.llvm.org/D65031 llvm-svn: 366725	2019-07-22 18:50:45 +00:00
Peter Collingbourne	c3b8661df5	LowerTypeTests: Teach the pass to respect global alignments. We were previously ignoring alignment entirely when combining globals together in this pass. There are two main things that we need to do here: add additional padding before each global to meet the alignment requirements, and set the combined global's alignment to the maximum of all of the original globals' alignments. Since we now need to calculate layout as we go anyway, use the calculated layout to produce GlobalLayout instead of using StructLayout. Differential Revision: https://reviews.llvm.org/D65033 llvm-svn: 366722	2019-07-22 18:47:03 +00:00
Simon Pilgrim	3ebd2fe91a	[SLPVectorizer] Fix some MSVC/cppcheck uninitialized variable warnings. NFCI. llvm-svn: 366712	2019-07-22 17:57:36 +00:00
Christudasan Devadasan	006cf8c03d	Added address-space mangling for stack related intrinsics Modified the following 3 intrinsics: int_addressofreturnaddress, int_frameaddress & int_sponentry. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64561 llvm-svn: 366679	2019-07-22 12:42:48 +00:00
Serguei Katkov	c6c31da867	[Loop Peeling] Fix the handling of branch weights of peeled off branches. Current algorithm to update branch weights of latch block and its copies is based on the assumption that number of peeling iterations is approximately equal to trip count. However it is not correct. According to profitability check in one case we can decide to peel in case it helps to reduce the number of phi nodes. In this case the number of peeled iteration can be less then estimated trip count. This patch introduces another way to set the branch weights to peeled of branches. Let F is a weight of the edge from latch to header. Let E is a weight of the edge from latch to exit. F/(F+E) is a probability to go to loop and E/(F+E) is a probability to go to exit. Then, Estimated TripCount = F / E. For I-th (counting from 0) peeled off iteration we set the the weights for the peeled latch as (TC - I, 1). It gives us reasonable distribution, The probability to go to exit 1/(TC-I) increases. At the same time the estimated trip count of remaining loop reduces by I. As a result after peeling off N iteration the weights will be (F - N * E, E) and trip count of loop becomes F / E - N or TC - N. The idea is taken from the review of the patch D63918 proposed by Philip. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D64235 llvm-svn: 366665	2019-07-22 05:15:34 +00:00
Craig Topper	e6cd20ba53	[InstCombine] Update comment I missed in r366649. NFC llvm-svn: 366658	2019-07-21 16:15:03 +00:00
Craig Topper	1d149d08d3	[InstCombine] Remove insertRangeTest code that handles the equality case. For equality, the function called getTrue/getFalse with the VT of the comparison input. But getTrue/getFalse need the boolean VT. So if this code ever executed, it would assert. I believe these cases are removed by InstSimplify so we don't get here. So this patch just fixes up an assert to exclude the equality possibility and removes the broken code. llvm-svn: 366649	2019-07-21 06:43:38 +00:00
Craig Topper	8fabdfe9fc	[InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI AddOne/SubOne create new Constant objects. That seems heavy for comparing ConstantInts which wrap APInts. Just do the math on on the APInts and compare them. llvm-svn: 366648	2019-07-21 05:26:05 +00:00
Florian Hahn	0a7faa4e3d	[Local] Zap blockaddress without users in ConstantFoldTerminator. If the blockaddress is not destoryed, the destination block will still be marked as having its address taken, limiting further transformations. I think there are other places where the dead blockaddress constants are kept around, I'll look into that as follow up. Reviewers: craig.topper, brzycki, davide Reviewed By: brzycki, davide Differential Revision: https://reviews.llvm.org/D64936 llvm-svn: 366633	2019-07-20 12:25:47 +00:00
Hubert Tong	2711e16b35	[sanitizers] Use covering ObjectFormatType switches Summary: This patch removes the `default` case from some switches on `llvm::Triple::ObjectFormatType`, and cases for the missing enumerators (`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added. For `UnknownObjectFormat`, the effect of the action for the `default` case is maintained; otherwise, where `llvm_unreachable` is called, `report_fatal_error` is used instead. Where the `default` case returns a default value, `report_fatal_error` is used for XCOFF as a placeholder. For `Wasm`, the effect of the action for the `default` case in maintained. The code is structured to avoid strongly implying that the `Wasm` case is present for any reason other than to make the switch cover all `ObjectFormatType` enumerator values. Reviewers: sfertile, jasonliu, daltenty Reviewed By: sfertile Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64222 llvm-svn: 366544	2019-07-19 08:46:18 +00:00
Serguei Katkov	bde33af85a	[Loop Peeling] Enable peeling of multiple exits by default. Enable loop peeling with multiple exits where all non-latch exits ends up with deopt by default. Reviewers: reames, fhahn Reviewed By: reames Subscribers: xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D64619 llvm-svn: 366542	2019-07-19 08:35:45 +00:00
Roman Lebedev	f2eb403144	[InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) Normally, the inner pattern is sign-extend, but for our purposes it's no different to other patterns: alive proofs: f: https://rise4fun.com/Alive/7U3 For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64524 llvm-svn: 366540	2019-07-19 08:26:58 +00:00
Roman Lebedev	441c9d6ca8	[InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: e: https://rise4fun.com/Alive/0FT For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64521 llvm-svn: 366539	2019-07-19 08:26:47 +00:00
Roman Lebedev	3c212ce305	[InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: d: https://rise4fun.com/Alive/I5Y For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64519 llvm-svn: 366538	2019-07-19 08:26:37 +00:00
Roman Lebedev	2ebe57386d	[InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: c: https://rise4fun.com/Alive/RgJh For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64517 llvm-svn: 366537	2019-07-19 08:26:25 +00:00
Roman Lebedev	4422a1657c	[InstCombine] Dropping redundant masking before left-shift [1/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: b. `(x & (~(-1 << maskNbits))) << shiftNbits` All these patterns can be simplified to just: `x << ShiftShAmt` iff: b. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: b: https://rise4fun.com/Alive/y8M For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64514 llvm-svn: 366536	2019-07-19 08:26:13 +00:00
Roman Lebedev	a5f0824eb5	[InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: a. `(x & ((1 << MaskShAmt) - 1)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: a. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: a: https://rise4fun.com/Alive/wi9 Indeed, not all of these patterns are canonical. But since this fold will only produce a single instruction i'm really interested in handling even uncanonical patterns, since i have this general kind of pattern in hotpaths, and it is not totally outlandish for bit-twiddling code. For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic, huihuiz, xbolva00 Reviewed By: xbolva00 Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64512 llvm-svn: 366535	2019-07-19 08:25:43 +00:00
Serguei Katkov	0ffa833d54	[LoopInfo] Use early return in branch weight update functions. NFC. llvm-svn: 366411	2019-07-18 07:36:20 +00:00
Peter Collingbourne	3b82b92c6b	hwasan: Initialize the pass only once. This will let us instrument globals during initialization. This required making the new PM pass a module pass, which should still provide access to analyses via the ModuleAnalysisManager. Differential Revision: https://reviews.llvm.org/D64843 llvm-svn: 366379	2019-07-17 21:45:19 +00:00
Hideto Ueno	4a09a73fb0	[Attributor][NFC] Remove unnecessary debug output llvm-svn: 366373	2019-07-17 21:11:02 +00:00
Hideto Ueno	11d3710c1c	[Attributor] Deduce "willreturn" function attribute Summary: Deduce the "willreturn" attribute for functions. For now, intrinsics are not willreturn. More annotation will be done in another patch. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63046 llvm-svn: 366335	2019-07-17 15:15:43 +00:00
Philip Reames	6e1c3bb181	[IndVars] Speculative fix for an assertion failure seen in bots I don't have an IR sample which is actually failing, but the issue described in the comment is theoretically possible, and should be guarded against even if there's a different root cause for the bot failures. llvm-svn: 366241	2019-07-16 18:23:49 +00:00
Amara Emerson	228a7b4f2a	[ADCE] Fix non-deterministic behaviour due to iterating over a pointer set. Original patch by Yann Laigle-Chapuy Differential Revision: https://reviews.llvm.org/D64785 llvm-svn: 366215	2019-07-16 15:23:10 +00:00
Rui Ueyama	49a3ad21d6	Fix parameter name comments using clang-tidy. NFC. This patch applies clang-tidy's bugprone-argument-comment tool to LLVM, clang and lld source trees. Here is how I created this patch: $ git clone https://github.com/llvm/llvm-project.git $ cd llvm-project $ mkdir build $ cd build $ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \ -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \ -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \ -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm $ ninja $ parallel clang-tidy -checks='-,bugprone-argument-comment' \ -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \ ::: ../llvm/lib//.{cpp,h} ../clang/lib/*/.{cpp,h} ../lld/*/.{cpp,h} llvm-svn: 366177	2019-07-16 04:46:31 +00:00
Peter Collingbourne	e5c4b468f0	hwasan: Pad arrays with non-1 size correctly. Spotted by eugenis. Differential Revision: https://reviews.llvm.org/D64783 llvm-svn: 366171	2019-07-16 03:25:50 +00:00
Eric Christopher	93dfb93ad6	Temporarily Revert "[SLP] Recommit: Look-ahead operand reordering heuristic." As there are some reported miscompiles with AVX512 and performance regressions in Eigen. Verified with the original committer and testcases will be forthcoming. This reverts commit r364964. llvm-svn: 366154	2019-07-15 23:36:02 +00:00
Leonard Chan	bb147aabc6	Revert "[NewPM] Port Sancov" This reverts commit `5652f35817`. llvm-svn: 366153	2019-07-15 23:18:31 +00:00
Evgeniy Stepanov	c5e7f56249	ARM MTE stack sanitizer. Add "memtag" sanitizer that detects and mitigates stack memory issues using armv8.5 Memory Tagging Extension. It is similar in principle to HWASan, which is a software implementation of the same idea, but there are enough differencies to warrant a new sanitizer type IMHO. It is also expected to have very different performance properties. The new sanitizer does not have a runtime library (it may grow one later, along with a "debugging" mode). Similar to SafeStack and StackProtector, the instrumentation pass (in a follow up change) will be inserted in all cases, but will only affect functions marked with the new sanitize_memtag attribute. Reviewers: pcc, hctim, vitalybuka, ostannard Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64169 llvm-svn: 366123	2019-07-15 20:02:23 +00:00
Johannes Doerfert	3dcd7996f1	[FunctionAttrs] Remove readonly and writeonly assertion There are scenarios where mutually recursive functions may cause the SCC to contain both read only and write only functions. This removes an assertion when adding read attributes which caused a crash with a the provided test case, and instead just doesn't add the attributes. Patch by Luke Lau <luke.lau@intel.com> Differential Revision: https://reviews.llvm.org/D60761 llvm-svn: 366090	2019-07-15 17:31:26 +00:00
Serguei Katkov	d021ad9fbe	[Loop Peeling] Fix the bug with IDom setting for exit loops It is possible that loop exit has two predecessors in a loop body. In this case after the peeling the iDom of the exit should be a clone of iDom of original exit but no a clone of a block coming to this exit. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D64618 llvm-svn: 366050	2019-07-15 09:13:11 +00:00
Florian Hahn	1d554b7441	[LoopVectorize] Pass unfiltered list of arguments to getIntrinsicInstCost. We do not compute the scalarization overhead in getVectorIntrinsicCost and TTI::getIntrinsicInstrCost requires the full arguments list. llvm-svn: 366049	2019-07-15 08:48:47 +00:00
Serguei Katkov	3ed93b4673	[Loop Peeling] Enable peeling for loops with multiple exits This CL enables peeling of the loop with multiple exits where one exit should be from latch and others are basic blocks with call to deopt. The peeling is enabled under the flag which is false by default. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames Subscribers: xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D63923 llvm-svn: 366048	2019-07-15 08:26:45 +00:00
Hideto Ueno	54869ec907	[Attributor] Deduce "nonnull" attribute Summary: Porting nonnull attribute to attributor. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63604 llvm-svn: 366043	2019-07-15 06:49:04 +00:00
Serguei Katkov	45c43e7d04	[LoopUtils] Extend the scope of getLoopEstimatedTripCount With this patch the getLoopEstimatedTripCount function will accept also the loops where there are more than one exit but all exits except latch block should ends up with a call to deopt. This side exits should not impact the estimated trip count. Reviewers: reames, mkuper, danielcdh Reviewed By: reames Subscribers: fhahn, lebedev.ri, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64553 llvm-svn: 366042	2019-07-15 06:42:39 +00:00
Serguei Katkov	f1ee04c42a	[LoopInfo] Introduce getUniqueNonLatchExitBlocks utility function Extract the code from LoopUnrollRuntime into utility function to re-use it in D63923. Reviewers: reames, mkuper Reviewed By: reames Subscribers: fhahn, hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D64548 llvm-svn: 366040	2019-07-15 05:51:10 +00:00
Florian Hahn	9428d95ce7	[LV] Exclude loop-invariant inputs from scalar cost computation. Loop invariant operands do not need to be scalarized, as we are using the values outside the loop. We should ignore them when computing the scalarization overhead. Fixes PR41294 Reviewers: hsaito, rengolin, dcaballe, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D59995 llvm-svn: 366030	2019-07-14 20:12:36 +00:00
Johannes Doerfert	c7a1db3298	[Attributor][NFC] Run clang-format on the attributor files (.h/.cpp) The Attributor files are kept formatted with clang-format, we should try to keep this state. llvm-svn: 365984	2019-07-13 01:09:27 +00:00
Johannes Doerfert	0a7f4cdce9	[Attributor] Only return attributes with a valid state Attributor::getAAFor will now only return AbstractAttributes with a valid AbstractState. This simplifies call sites as they only need to check if the returned pointer is non-null. It also reduces the potential for accidental misuse. llvm-svn: 365983	2019-07-13 01:09:21 +00:00
Alina Sbirlea	db101864bd	[MemorySSA] Use SetVector to avoid nondeterminism. Summary: Use a SetVector for DeadBlockSet. Resolves PR42574. Reviewers: george.burgess.iv, uabelho, dblaikie Subscribers: jlebar, Prazek, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64601 llvm-svn: 365970	2019-07-12 22:30:30 +00:00
David Bolvansky	9f0d718c66	[InstCombine] Disable fold from D64285 for non-integer types llvm-svn: 365959	2019-07-12 21:14:21 +00:00
Sterling Augustine	6d75a9e873	The variable "Latch" is only used in an assert, which makes builds that use "-DNDEBUG" fail with unused variable messages. Summary: Move the logic into the assert itself. Subscribers: hiraditya, sanjoy, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64654 llvm-svn: 365943	2019-07-12 18:51:08 +00:00
Stefan Stipanovic	cb5ecae1f6	Addition to rL365925, removing remaining virtuals llvm-svn: 365938	2019-07-12 18:34:06 +00:00
Leonard Chan	223573c8ba	Remove unused methods in Sancov. llvm-svn: 365931	2019-07-12 18:09:09 +00:00
Stefan Stipanovic	15e86f707b	[Attributor] Removing unnecessary `virtual` keywords. Some function in the Attributor framework are unnecessarily marked virtual. This patch removes virtual keyword Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64637 llvm-svn: 365925	2019-07-12 17:42:14 +00:00
Hideto Ueno	65bbaf9ece	[Attributor] Deduce "nofree" function attribute Summary: Deduce "nofree" function attribute. A more concise description of "nofree" is on D49165. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: homerdin, hfinkel, lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62687 llvm-svn: 365924	2019-07-12 17:38:51 +00:00
Philip Reames	34495b5533	[IndVars] Use exit count reasoning to discharge obviously untaken exits Continue in the spirit of D63618, and use exit count reasoning to prove away loop exits which can not be taken since the backedge taken count of the loop as a whole is provably less than the minimal BE count required to take this particular loop exit. As demonstrated in the newly added tests, this triggers in a number of cases where IndVars was previously unable to discharge obviously redundant exit tests. And some not so obvious ones. Differential Revision: https://reviews.llvm.org/D63733 llvm-svn: 365920	2019-07-12 17:05:35 +00:00
Fangrui Song	b251cc0d91	Delete dead stores llvm-svn: 365903	2019-07-12 14:58:15 +00:00
David Bolvansky	af1b3185f5	[InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y)) Summary: (select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y)) -> ashr (X, Y)) (select (icmp slt x, 1), ashr (X, Y), lshr (X, Y)) -> ashr (X, Y)) Fixes PR41173 Alive proof by @lebedev.ri (thanks) Name: PR41173 %cmp = icmp slt i32 %x, 1 %shr = lshr i32 %x, %y %shr1 = ashr i32 %x, %y %retval.0 = select i1 %cmp, i32 %shr1, i32 %shr => %retval.0 = ashr i32 %x, %y Optimization: PR41173 Done: 1 Optimization is correct! Reviewers: lebedev.ri, spatel Reviewed By: lebedev.ri Subscribers: nikic, craig.topper, llvm-commits, lebedev.ri Tags: #llvm Differential Revision: https://reviews.llvm.org/D64285 llvm-svn: 365893	2019-07-12 11:31:16 +00:00
Evandro Menezes	b21692672e	[InstCombine] Reorder pow() transformations (NFC) Move the transformation from `powf(x, itofp(y))` to `powi(x, y)` to the group of transformations related to the exponent. llvm-svn: 365851	2019-07-12 00:33:49 +00:00
Leonard Chan	5652f35817	[NewPM] Port Sancov This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty. Changes: - Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over functions and ModuleSanitizerCoverage for passing over modules. - ModuleSanitizerCoverage exists for adding 2 module level calls to initialization functions but only if there's a function that was instrumented by sancov. - Added legacy and new PM wrapper classes that own instances of the 2 new classes. - Update llvm tests and add clang tests. Differential Revision: https://reviews.llvm.org/D62888 llvm-svn: 365838	2019-07-11 22:35:40 +00:00
Stefan Stipanovic	0626367202	[Attributor] Deduce "nosync" function attribute. Introduce and deduce "nosync" function attribute to indicate that a function does not synchronize with another thread in a way that other thread might free memory. Reviewers: jdoerfert, jfb, nhaehnle, arsenm Subscribers: wdng, hfinkel, nhaenhle, mehdi_amini, steven_wu, dexonsmith, arsenm, uenoku, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D62766 llvm-svn: 365830	2019-07-11 21:37:40 +00:00
Sanjay Patel	3487791fea	[InstCombine] don't move FP negation out of a constant expression -(X * ConstExpr) becomes X * (-ConstExpr), so don't reverse that and infinite loop. llvm-svn: 365774	2019-07-11 13:44:29 +00:00
Tim Northover	27658ed512	OpaquePtr: use load instruction directly for type. NFC. llvm-svn: 365768	2019-07-11 13:12:08 +00:00
David Bolvansky	e23be09e66	[InstCombine] Reorder recently added/improved pow transformations Changed cases are now faster with exp2. llvm-svn: 365758	2019-07-11 10:55:04 +00:00
Vitaly Buka	d03bd1db59	NFC: Pass DataLayout into isBytewiseValue Summary: We will need to handle IntToPtr which I will submit in a separate patch as it's not going to be NFC. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63940 llvm-svn: 365709	2019-07-10 22:53:52 +00:00
Alina Sbirlea	58a37754bb	[LoopRotate + MemorySSA] Keep an <instruction-cloned instruction> map. Summary: The map kept in loop rotate is used for instruction remapping, in order to simplify the clones of instructions. Thus, if an instruction can be simplified, its simplified value is placed in the map, even when the clone is added to the IR. MemorySSA in contrast needs to know about that clone, so it can add an access for it. To resolve this: keep a different map for MemorySSA. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63680 llvm-svn: 365672	2019-07-10 17:36:56 +00:00
Vedant Kumar	5eb6ba060a	[CodeExtractor] Fix sinking of allocas with multiple bitcast uses (PR42451) An alloca which can be sunk into the extraction region may have more than one bitcast use. Move these uses along with the alloca to prevent use-before-def. Testing: check-llvm, stage2 build of clang Fixes llvm.org/PR42451. Differential Revision: https://reviews.llvm.org/D64463 llvm-svn: 365660	2019-07-10 16:32:20 +00:00
Vedant Kumar	f65f302cc7	[CodeExtractor] Simplify findAllocas, NFC Split getLifetimeMarkers out into its own method and have it return a struct. Differential Revision: https://reviews.llvm.org/D64467 llvm-svn: 365659	2019-07-10 16:32:16 +00:00
Roman Lebedev	c5f92bd67b	[PatternMatch] Generalize m_SpecificInt_ULT() to take ICmpInst::Predicate As discussed in the original review, this may be useful, so let's just do it. llvm-svn: 365652	2019-07-10 16:07:35 +00:00
David Bolvansky	0735cc1954	[InstCombine] pow(C,x) -> exp2(log2(C)x) Summary: Transform pow(C,x) To exp2(log2(C)x) if C > 0, C != inf, C != NaN (and C is not power of 2, since we have some fold for such case already). log(C) is folded by the compiler and exp2 is much faster to compute than pow. Reviewers: spatel, efriedma, evandro Reviewed By: evandro Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64099 llvm-svn: 365637	2019-07-10 14:43:27 +00:00
Serguei Katkov	d000f8b69f	[SimpleLoopUnswitch] Don't consider unswitching `switch` insructions with one unique successor Only instructions with two or more unique successors should be considered for unswitching. Patch Author: Daniil Suchkov. Reviewers: reames, asbirlea, skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64404 llvm-svn: 365611	2019-07-10 10:25:22 +00:00
Nikita Popov	5ca39e828c	[SLP] Optimize getSpillCost(); NFCI For a given set of live values, the spill cost will always be the same for each call. Compute the cost once and multiply it by the number of calls. (I'm not sure this spill cost modeling makes sense if there are multiple calls, as the spill cost will likely be shared across calls in that case. But that's how it currently works.) llvm-svn: 365552	2019-07-09 20:24:44 +00:00
Peter Collingbourne	1366262b74	hwasan: Improve precision of checks using short granule tags. A short granule is a granule of size between 1 and `TG-1` bytes. The size of a short granule is stored at the location in shadow memory where the granule's tag is normally stored, while the granule's actual tag is stored in the last byte of the granule. This means that in order to verify that a pointer tag matches a memory tag, HWASAN must check for two possibilities: * the pointer tag is equal to the memory tag in shadow memory, or * the shadow memory tag is actually a short granule size, the value being loaded is in bounds of the granule and the pointer tag is equal to the last byte of the granule. Pointer tags between 1 to `TG-1` are possible and are as likely as any other tag. This means that these tags in memory have two interpretations: the full tag interpretation (where the pointer tag is between 1 and `TG-1` and the last byte of the granule is ordinary data) and the short tag interpretation (where the pointer tag is stored in the granule). When HWASAN detects an error near a memory tag between 1 and `TG-1`, it will show both the memory tag and the last byte of the granule. Currently, it is up to the user to disambiguate the two possibilities. Because this functionality obsoletes the right aligned heap feature of the HWASAN memory allocator (and because we can no longer easily test it), the feature is removed. Also update the documentation to cover both short granule tags and outlined checks. Differential Revision: https://reviews.llvm.org/D63908 llvm-svn: 365551	2019-07-09 20:22:36 +00:00
Philip Reames	a6548d0437	[PoisonChecking] Flesh out complete todo list for full coverage Note: I don't actually plan to implement all of the cases at the moment, I'm just documenting them for completeness. There's a couple of cases left which are practically useful for me in debugging loop transforms, and I'll probably stop there for the moment. llvm-svn: 365550	2019-07-09 19:59:39 +00:00
Philip Reames	3dbd7e98d8	[PoisonCheker] Support for out of bounds operands on shifts + insert/extractelement These are sources of poison which don't come from flags, but are clearly documented in the LangRef. Left off support for scalable vectors for the moment, but should be easy to add if anyone is interested. llvm-svn: 365543	2019-07-09 19:26:12 +00:00
Philip Reames	3b38b92541	[PoisonChecking] Add validation rules for "exact" on sdiv/udiv As directly stated in the LangRef, no ambiguity here... llvm-svn: 365538	2019-07-09 18:56:41 +00:00
Philip Reames	f47a313e71	Add a transform pass to make the executable semantics of poison explicit in the IR Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language. The target audience for this tool is developers working on or targetting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program. At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there's a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs involved UB being triggered by integer induction variables with nsw/nuw flags would be reported by the current code. (See comment in PoisonChecking.cpp for full explanation and context) Differential Revision: https://reviews.llvm.org/D64215 llvm-svn: 365536	2019-07-09 18:49:29 +00:00
Tim Northover	60afa49abe	OpaquePtr: add Type parameter to Loads analysis API. This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job. Most callers had an obvious memory operation handy to provide this type, but a SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering. llvm-svn: 365468	2019-07-09 11:35:35 +00:00
Serguei Katkov	77bb3a486f	[Loop Peeling] Add support for peeling of loops with multiple exits This patch modifies the loop peeling transformation so that it does not expect that there is only one loop exit from latch. It modifies only transformation. Update of branch weights remains only for exit from latch. The motivation is that in follow-up patch I plan to enable loop peeling for loops with multiple exits but only if other exits then from latch one goes to block with call to deopt. For now this patch is NFC. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames, fhahn Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D63921 llvm-svn: 365441	2019-07-09 06:07:25 +00:00
Serguei Katkov	c6caddb73d	[LoopInfo] Update getExitEdges to accept vector of pairs for non const BasicBlock D63921 requires getExitEdges fills a vector of Edge pairs where BasicBlocks are not constant. The rest Loop API mostly returns non-const BasicBlocks, so to be more consistent with other Loop API getExitEdges is modified to return non-const BasicBlocks as well. This is an alternative solution to D64060. Reviewers: reames, fhahn Reviewed By: reames, fhahn Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64309 llvm-svn: 365437	2019-07-09 04:20:43 +00:00
Philip Reames	0e344e9dc5	[LoopPred] Stylistic improvement to recently added NE/EQ normalization [NFC] llvm-svn: 365425	2019-07-09 02:03:31 +00:00
Philip Reames	5a637cbdc7	[LoopPred] Extend LFTR normalization to the inverse EQ case A while back, I added support for NE latches formed by LFTR. I didn't think that quite through, as LFTR will also produce the inverse EQ form for some loops and I hadn't handled that. This change just adds handling for that case as well. llvm-svn: 365419	2019-07-09 01:27:45 +00:00
Johannes Doerfert	accd3e8747	[Attributor] Deduce the "returned" argument attribute Deduce the "returned" argument attribute by collecting all potentially returned values. Not only the unique return value, if any, can be used by subsequent attributes but also the set of all potentially returned values as well as the mapping from returned values to return instructions that they originate from (see AAReturnedValues::checkForallReturnedValues). Change in statistics (-stats) for LLVM-TS + Spec2006, totaling ~19% more "returned" arguments. ADDED: attributor NumAttributesManifested n/a -> 637 ADDED: attributor NumAttributesValidFixpoint n/a -> 25545 ADDED: attributor NumFnArgumentReturned n/a -> 637 ADDED: attributor NumFnKnownReturns n/a -> 25545 ADDED: attributor NumFnUniqueReturned n/a -> 14118 CHANGED: deadargelim NumRetValsEliminated 470 -> 449 ( -4.468%) REMOVED: functionattrs NumReturned 535 -> n/a CHANGED: indvars NumElimIdentity 138 -> 164 ( +18.841%) Reviewers: homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames, efriedma, chandlerc Subscribers: hiraditya, bollu, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D59919 llvm-svn: 365407	2019-07-08 23:27:20 +00:00
Sanjay Patel	3dee113ebc	[InstCombine] fold insertelement into splat of same scalar Forming the canonical splat shuffle improves analysis and may allow follow-on transforms (although some possibilities are missing as shown in the test diffs). The backend generically turns these patterns into build_vector, so there should be no codegen regressions. All targets are expected to be able to lower splats efficiently. llvm-svn: 365379	2019-07-08 19:48:52 +00:00
Whitney Tsang	7d8f30e6b2	Keep the order of the basic blocks in the cloned loop as the original loop Summary: Do the cloning in two steps, first allocate all the new loops, then clone the basic blocks in the same order as the original loop. Reviewer: Meinersbur, fhahn, kbarton, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, hiraditya, llvm-commits Tag: https://reviews.llvm.org/D64224 Differential Revision: llvm-svn: 365366	2019-07-08 18:30:35 +00:00
Brian Homerding	498687bff2	Add, and infer, a nofree function attribute Removing dead code leftover from refactor. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365345	2019-07-08 16:33:32 +00:00
Sanjay Patel	0b59103a73	[InstCombine] canonicalize insert+splat to/from element 0 of vector We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue() and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts to that form for better matching. The backend generically turns these patterns into build_vector, so there should be no codegen difference. llvm-svn: 365342	2019-07-08 16:26:48 +00:00
Brian Homerding	b4b21d807e	Add, and infer, a nofree function attribute This patch adds a function attribute, nofree, to indicate that a function does not, directly or indirectly, call a memory-deallocation function (e.g., free, C++'s operator delete). Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365336	2019-07-08 15:57:56 +00:00
Cameron McInally	771769be90	[Float2Int] Add support for unary FNeg to Float2Int Differential Revision: https://reviews.llvm.org/D63941 llvm-svn: 365324	2019-07-08 14:46:07 +00:00
Philip Reames	9e62c86408	[IRBuilder] Introduce helpers for and/or of multiple values at once We had versions of this code scattered around, so consolidate into one location. Not strictly NFC since the order of intermediate results may change in some places, but since these operations are associatives, should not change results. llvm-svn: 365259	2019-07-06 03:46:18 +00:00
Eugene Leviant	3aef35288b	[ThinLTO] Attempt to recommit r365188 after alignment fix llvm-svn: 365215	2019-07-05 15:25:05 +00:00
Eugene Leviant	e91f86f0ac	Reverted r365188 due to alignment problems on i686-android llvm-svn: 365206	2019-07-05 13:26:05 +00:00
Eugene Leviant	820cc01d1e	[ThinLTO] Attempt to recommit r365040 after caching fix It's possible that some function can load and store the same variable using the same constant expression: store %Derived* @foo, %Derived bitcast (%Base @bar to %Derived*) %42 = load %Derived, %Derived bitcast (%Base @bar to %Derived**) The bitcast expression was mistakenly cached while processing loads, and never examined later when processing store. This caused @bar to be mistakenly treated as read-only variable. See load-store-caching.ll. llvm-svn: 365188	2019-07-05 12:00:10 +00:00
Sanjay Patel	75b5edf6a1	[InstCombine] allow undef elements when forming splat from chain of insertelements We allow forming a splat (broadcast) shuffle, but we were conservatively limiting that to cases where all elements of the vector are specified. It should be safe from a codegen perspective to allow undefined lanes of the vector because the expansion of a splat shuffle would become the chain of inserts again. Forming splat shuffles can reduce IR and help enable further IR transforms. Motivating bugs: https://bugs.llvm.org/show_bug.cgi?id=42174 https://bugs.llvm.org/show_bug.cgi?id=16739 Differential Revision: https://reviews.llvm.org/D63848 llvm-svn: 365147	2019-07-04 16:45:34 +00:00
Serguei Katkov	6d8813a391	[LoopPeel] Some small comment update. NFC. Follow-up change of comment after https://reviews.llvm.org/D63917 is landed. llvm-svn: 365107	2019-07-04 05:10:14 +00:00
Chen Zheng	469f30abab	[PowerPC] Hardware Loop branch instruction's condition may not be icmp. This fixes pr42492. Differential Revision: https://reviews.llvm.org/D64124 llvm-svn: 365104	2019-07-04 01:51:47 +00:00
Reid Kleckner	f7e52fbdb5	Revert [ThinLTO] Optimize writeonly globals out This reverts r365040 (git commit `5cacb91475`) Speculatively reverting, since this appears to have broken check-lld on Linux. Partial analysis in https://crbug.com/981168. llvm-svn: 365097	2019-07-04 00:03:30 +00:00
Eli Friedman	41ee3977c4	[JumpThreading] Fix threading with unusual PHI nodes. If the block being cloned contains a PHI node, in general, we need to clone that PHI node, even though it's trivial. If the operand of the PHI is an instruction in the block being cloned, the correct value for the operand doesn't exist until SSAUpdater constructs it. We usually don't hit this issue because we try to avoid threading across loop headers, but it's possible to hit this in some cases involving irreducible CFGs. I added a flag to allow threading across loop headers to make the testcase easier to understand. Thanks to Brian Rzycki for reducing the testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=42085. Differential Revision: https://reviews.llvm.org/D63913 llvm-svn: 365094	2019-07-03 23:12:39 +00:00
Philip Reames	ea06d63c35	[LFTR] Use SCEVExpander for the pointer limit case instead of manual IR gen As noted in the test change, this is not trivially NFC, but all of the changes in output are cases where the SCEVExpander form is more canonical/optimal than the hand generation. llvm-svn: 365075	2019-07-03 20:03:46 +00:00
Philip Reames	14f1543425	[LFTR] Remove a stray variable shadow of the same value [NFC] llvm-svn: 365072	2019-07-03 19:08:43 +00:00
Philip Reames	e7a258c6d9	[LFTR] Style and comment changes to clarify the narrow vs wide bitwidth evaluation behavior [NFC] llvm-svn: 365071	2019-07-03 19:03:37 +00:00
Philip Reames	abc8f344d6	[LFTR] Sink the decision not use truncate scheme for constants into genLoopLimit [NFC] We might as well just evaluate the constants using SCEV, and having the cases grouped makes the logic slightly easier to read anyway. llvm-svn: 365070	2019-07-03 18:41:03 +00:00
Philip Reames	4c80281c96	[LFTR] Remove falsely generalized (dead) code [NFC] llvm-svn: 365067	2019-07-03 18:24:06 +00:00
Philip Reames	83cca94194	[LFTR] Hoist extend expressions outside of loops w/o waiting for LICM The motivation for this is two fold: 1) Make the output (and thus tests) a bit more readable to a human trying to understand the result of the transform 2) Reduce spurious diffs in a potential future change to restructure all of this logic to use SCEVExpander (which hoists by default) llvm-svn: 365066	2019-07-03 18:18:36 +00:00
Eugene Leviant	5cacb91475	[ThinLTO] Optimize writeonly globals out Differential revision: https://reviews.llvm.org/D63444 llvm-svn: 365040	2019-07-03 14:14:52 +00:00
Roman Lebedev	9f0c83902d	[InstCombine] Y - ~X --> X + Y + 1 fold (PR42457) Summary: I think we'd want this new variant, because we obviously have better handling for `add` as compared to `sub`/`not`. https://rise4fun.com/Alive/WMn Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]] Reviewers: spatel, nikic, huihuiz, efriedma Reviewed By: spatel Subscribers: RKSimon, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63992 llvm-svn: 365011	2019-07-03 09:41:50 +00:00
Alexander Potapenko	f82672873a	MSan: handle callbr instructions Summary: Handling callbr is very similar to handling an inline assembly call: MSan must checks the instruction's inputs. callbr doesn't (yet) have outputs, so there's nothing to unpoison, and conservative assembly handling doesn't apply either. Fixes PR42479. Reviewers: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64072 llvm-svn: 365008	2019-07-03 09:28:50 +00:00
Serguei Katkov	c22e772a28	[LoopPeel] Re-factor llvm::peelLoop method. NFC. Extract code dealing with branch weights in separate functions. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames, fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D63917 llvm-svn: 365002	2019-07-03 05:59:23 +00:00
Chen Zheng	dfdccbb26b	[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop. Differential Revision: https://reviews.llvm.org/D63477 llvm-svn: 364993	2019-07-03 01:49:03 +00:00
David Bolvansky	10ee3ac396	[NFC] Strenghten isInteger condition for rL364940 llvm-svn: 364969	2019-07-02 21:16:34 +00:00
Vasileios Porpodas	cf47ff5ffb	[SLP] Recommit: Look-ahead operand reordering heuristic. Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364964	2019-07-02 20:20:28 +00:00
Teresa Johnson	a700436323	[ThinLTO] Add summary entries for index-based WPD Summary: If LTOUnit splitting is disabled, the module summary analysis computes the summary information necessary to perform single implementation devirtualization during the thin link with the index and no IR. The information collected from the regular LTO IR in the current hybrid WPD algorithm is summarized, including: 1) For vtable definitions, record the function pointers and their offset within the vtable initializer (subsumes the information collected from IR by tryFindVirtualCallTargets). 2) A record for each type metadata summarizing the vtable definitions decorated with that metadata (subsumes the TypeIdentiferMap collected from IR). Also added are the necessary bitcode records, and the corresponding assembly support. The follow-on index-based WPD patch is D55153. Depends on D53890. Reviewers: pcc Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54815 llvm-svn: 364960	2019-07-02 19:38:02 +00:00
David Bolvansky	cb1a5a705c	[SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n) Summary: Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190 Reviewers: spatel, nikic, efriedma Reviewed By: efriedma Subscribers: efriedma, nikic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63038 llvm-svn: 364940	2019-07-02 15:58:45 +00:00
Serge Guelton	4137aeb4bf	Provide basic Full LTO extension points Differential Revision: https://reviews.llvm.org/D61738 llvm-svn: 364937	2019-07-02 15:52:39 +00:00
Roman Lebedev	0bde7c6527	[InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484) I was actually wondering if there was some nicer way than m_Value()+cast, but apparently what i was really "subconsciously" thinking about was correctness issue. hasNoUnsignedWrap()/hasNoUnsignedWrap() exist for Instruction, not for BinaryOperator, so let's just use m_Instruction(), thus both avoiding a cast, and a crash. Fixes https://bugs.llvm.org/show_bug.cgi?id=42484, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587 llvm-svn: 364915	2019-07-02 12:54:48 +00:00
Reid Kleckner	d72163947a	[PGO] Update ICP pass for recent byval type changes Fixes verifier errors encountered in PR42413. Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv Differential Revision: https://reviews.llvm.org/D63842 llvm-svn: 364861	2019-07-01 22:43:39 +00:00
Sanjay Patel	ddc1b40f26	[InstCombine] reduce more checks for power-of-2-or-zero using ctpop Extends the transform from: rL364341 ...to include another (more common?) pattern that tests whether a value is a power-of-2 (including or excluding zero). llvm-svn: 364856	2019-07-01 22:00:00 +00:00
Jordan Rupprecht	a7972dc04a	Revert [SLP] Look-ahead operand reordering heuristic. This reverts r364478 (git commit `574cb0eb3a`) The patch is causing compilation timeouts. llvm-svn: 364846	2019-07-01 21:10:43 +00:00
Roman Lebedev	04d3d3bbff	[InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459) Summary: To be noted, this pattern is not unhandled by instcombine per-se, it is somehow does end up being folded when one runs opt -O3, but not if it's just -instcombine. Regardless, that fold is indirect, depends on some other folds, and is thus blind when there are extra uses. This does address the regression being exposed in D63992. https://godbolt.org/z/7DGltU https://rise4fun.com/Alive/EPO0 Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 \| PR42459 ]] Reviewers: spatel, nikic, huihuiz Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63993 llvm-svn: 364792	2019-07-01 15:55:24 +00:00
Roman Lebedev	72b8d41ce8	[InstCombine] Shift amount reassociation in bittest (PR42399) Summary: Given pattern: `icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0` we should move shifts to the same hand of 'and', i.e. rewrite as `icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)` It might be tempting to not restrict this to situations where we know we'd fold two shifts together, but i'm not sure what rules should there be to avoid endless combine loops. We pick the same shift that was originally used to shift the variable we picked to shift: https://rise4fun.com/Alive/6x1v Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 \| PR42399]]. Reviewers: spatel, nikic, RKSimon Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63829 llvm-svn: 364791	2019-07-01 15:55:15 +00:00
Roman Lebedev	f55818e3a7	[InstCombine] Omit 'urem' where possible This was added in D63390 / rL364286 to backend, but it makes sense to also handle it in middle-end. https://rise4fun.com/Alive/Zsln llvm-svn: 364738	2019-07-01 09:41:43 +00:00
Yevgeny Rouban	d4097b4a93	[SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst Differential Revision: https://reviews.llvm.org/D60606 llvm-svn: 364734	2019-07-01 08:43:53 +00:00
Sanjay Patel	706b48251f	[InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics This is the opposite direction of D62158 (we have to choose 1 form or the other). Now that we have FMF on the select, this becomes more palatable. And the benefits of having a single IR instruction for this operation (less chances of missing folds based on extra uses, etc) overcome my previous comments about the potential advantage of larger pattern matching/analysis. Differential Revision: https://reviews.llvm.org/D62414 llvm-svn: 364721	2019-06-30 13:40:31 +00:00
Fangrui Song	78ee2fbf98	Cleanup: llvm::bsearch -> llvm::partition_point after r364719 llvm-svn: 364720	2019-06-30 11:19:56 +00:00
Nikita Popov	8023c84433	[LFTR] Rephrase getLoopTest into "based-on" check; NFCI What we want to know here is whether we're already using this value for the loop condition, so make the query about that. We can extend this to a more general "based-on" relationship, rather than a direct icmp use later. llvm-svn: 364715	2019-06-29 15:12:59 +00:00
Sanjay Patel	77dc1e8568	[InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum This transform came up in D62414, but we should deal with it first. We have LLVM intrinsics that correspond exactly to libm calls (unlike most libm calls, these libm calls never set errno). This holds without any fast-math-flags, so we should always canonicalize to those intrinsics directly for better optimization. Currently, we convert to fcmp+select only when we have FMF (nnan) because fcmp+select does not preserve the semantics of the call in the general case. Differential Revision: https://reviews.llvm.org/D63214 llvm-svn: 364714	2019-06-29 14:28:54 +00:00
Nikita Popov	61a8b62b4c	[LFTR] Remove unnecessary latch check; NFCI The whole indvars pass works on loops in simplified form, so there is always a unique latch. Convert the condition into an assertion in needsLFTR (though we also assert this in later LFTR functions). Additionally update the comment on getLoopTest() now that we are dealing with multiple exits. llvm-svn: 364713	2019-06-29 12:41:02 +00:00
Roman Lebedev	e3a94ba4a9	[InstCombine] Shift amount reassociation (PR42391) Summary: Given pattern: `(x shiftopcode Q) shiftopcode K` we should rewrite it as `x shiftopcode (Q+K)` iff `(Q+K) u< bitwidth(x)` This is valid for any shift, but they must be identical. * https://rise4fun.com/Alive/9E2 * exact on both lshr => exact https://rise4fun.com/Alive/plHk * exact on both ashr => exact https://rise4fun.com/Alive/QDAA * nuw on both shl => nuw https://rise4fun.com/Alive/5Uk * nsw on both shl => nsw https://rise4fun.com/Alive/0plg Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 \| PR42391]]. Reviewers: spatel, nikic, RKSimon Reviewed By: nikic Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63812 llvm-svn: 364712	2019-06-29 11:51:50 +00:00
Nikita Popov	2d756c4feb	[LFTR] Fix post-inc pointer IV with truncated exit count (PR41998) Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we have a truncated exit count we'll truncate the IV when comparing against the limit, in which case exit count overflow in post-inc form doesn't matter. However, for pointer IVs we don't do that, so we have to be careful about incrementing the IV in the wide type. I'm fixing this by removing the IVCount variable (which was ExitCount or ExitCount+1) and replacing it with a UsePostInc flag, and then moving the actual limit adjustment to the individual cases (which are: pointer IV where we add to the wide type, integer IV where we add to the narrow type, and constant integer IV where we add to the wide type). Differential Revision: https://reviews.llvm.org/D63686 llvm-svn: 364709	2019-06-29 09:24:12 +00:00
Philip Reames	1504b6ee7e	[IndVars] Remove a bit of manual constant folding [NFC] SCEV is more than capable of folding (add x, trunc(0)) to x. llvm-svn: 364693	2019-06-29 00:19:31 +00:00
Cameron McInally	30e5cf1d8f	[NewGVN] Add unary FNeg support to NewGVN pass Differential Revision: https://reviews.llvm.org/D63933 llvm-svn: 364680	2019-06-28 20:09:32 +00:00
Cameron McInally	ab4b2364e5	[GVNSink] Add unary FNeg support to GVNSink pass Differential Revision: https://reviews.llvm.org/D63900 llvm-svn: 364678	2019-06-28 19:57:31 +00:00
Peter Collingbourne	7108df964a	hwasan: Remove the old frame descriptor mechanism. Differential Revision: https://reviews.llvm.org/D63470 llvm-svn: 364665	2019-06-28 17:53:26 +00:00
Peter Collingbourne	5378afc02a	hwasan: Use llvm.read_register intrinsic to read the PC on aarch64 instead of taking the function's address. This shaves an instruction (and a GOT entry in PIC code) off prologues of functions with stack variables. Differential Revision: https://reviews.llvm.org/D63472 llvm-svn: 364608	2019-06-27 23:24:07 +00:00
Cameron McInally	6e62a796d5	[GVN] Add support for unary FNeg to GVN pass Differential Revision: https://reviews.llvm.org/D63896 llvm-svn: 364592	2019-06-27 21:05:02 +00:00
Johannes Doerfert	3b77583e95	[Attr] Add "willreturn" function attribute This patch introduces a new function attribute, willreturn, to indicate that a call of this function will either exhibit undefined behavior or comes back and continues execution at a point in the existing call stack that includes the current invocation. This attribute guarantees that the function does not have any endless loops, endless recursion, or terminating functions like abort or exit. Patch by Hideto Ueno (@uenoku) Reviewers: jdoerfert Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62801 llvm-svn: 364555	2019-06-27 15:51:40 +00:00
Tim Northover	22c96a966b	IR: compare type attributes deeply when looking into functions. FunctionComparator attempts to produce a stable comparison of two Function instances by looking at all available properties. Since ByVal attributes now contain a Type pointer, they are not trivially ordered and FunctionComparator should use its own Type comparison logic to sort them. llvm-svn: 364523	2019-06-27 11:44:45 +00:00
Stefan Stipanovic	5360589b7d	[Attributor] Deducing existing nounwind attribute. Adding nounwind deduction in new attributor framework. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D63379 llvm-svn: 364521	2019-06-27 11:27:54 +00:00
Gerolf Hoflehner	e311a4d5c4	[SCCP] Fix non-deterministic uselists of return values (DenseMap -> MapVector) llvm-svn: 364482	2019-06-26 21:44:37 +00:00
Vasileios Porpodas	574cb0eb3a	[SLP] Look-ahead operand reordering heuristic. Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364478	2019-06-26 21:25:24 +00:00
Guanzhong Chen	9aad997a5a	[WebAssembly] Implement Address Sanitizer for Emscripten Summary: This diff enables address sanitizer on Emscripten. On Emscripten, real memory starts at the value passed to --global-base. All memory before this is used as shadow memory, and thus the shadow mapping function is simply dividing by 8. Reviewers: tlively, aheejin, sbc100 Reviewed By: sbc100 Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63742 llvm-svn: 364468	2019-06-26 20:16:14 +00:00
Philip Reames	03b2e2d986	[IndVars] Kill a redundant bit of debug output llvm-svn: 364449	2019-06-26 17:19:09 +00:00
Sanjay Patel	71ad22707c	[InstCombine] simplify code for inserts -> splat; NFC llvm-svn: 364441	2019-06-26 15:52:59 +00:00
Clement Courbet	2851248fa1	Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline." Breaks sanitizers: libFuzzer :: cxxstring.test libFuzzer :: memcmp.test libFuzzer :: recommended-dictionary.test libFuzzer :: strcmp.test libFuzzer :: value-profile-mem.test libFuzzer :: value-profile-strcmp.test llvm-svn: 364416	2019-06-26 12:13:13 +00:00
Clement Courbet	7b3a5f0e6d	[ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline. This allows later passes (in particular InstCombine) to optimize more cases. One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0. llvm-svn: 364412	2019-06-26 11:50:18 +00:00
Florian Hahn	4c11b5268c	[LoopUnroll] Add support for loops with exiting headers and uncond latches. This patch generalizes the UnrollLoop utility to support loops that exit from the header instead of the latch. Usually, LoopRotate would take care of must of those cases, but in some cases (e.g. -Oz), LoopRotate does not kick in. Codesize impact looks relatively neutral on ARM64 with -Oz + LTO. Program master patch diff External/S.../CFP2006/447.dealII/447.dealII 629060.00 627676.00 -0.2% External/SPEC/CINT2000/176.gcc/176.gcc 1245916.00 1244932.00 -0.1% MultiSourc...Prolangs-C/simulator/simulator 86100.00 86156.00 0.1% MultiSourc...arks/Rodinia/backprop/backprop 66212.00 66252.00 0.1% MultiSourc...chmarks/Prolangs-C++/life/life 67276.00 67312.00 0.1% MultiSourc...s/Prolangs-C/compiler/compiler 69824.00 69788.00 -0.1% MultiSourc...Prolangs-C/assembler/assembler 86672.00 86696.00 0.0% Reviewers: efriedma, vsk, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D61962 llvm-svn: 364398	2019-06-26 09:16:57 +00:00
Huihui Zhang	b90cb57b63	[InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0. Simplify 'shl' inequality test into 'and' equality test. This pattern happens in the middle-end while simplifying bitfield access, Exposed in https://reviews.llvm.org/D63505 https://rise4fun.com/Alive/6uz Reviewers: lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: spatel, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63675 llvm-svn: 364348	2019-06-25 20:44:52 +00:00
Philip Reames	c42a357178	[LFTR] Adjust debug output to include extensions (if any) llvm-svn: 364346	2019-06-25 20:14:08 +00:00
Sanjay Patel	fcfa056ceb	[InstCombine] reduce checks for power-of-2-or-zero using ctpop This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero. This is matching the isPowerOf2() patterns used in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 But there's at least 1 instcombine follow-up needed to match the alternate form: (v & (v - 1)) == 0; We should have all of the backend expansions handled with: rL364319 (x86-specific changes still needed for optimal code based on subtarget) And the larger patterns to exclude zero as a power-of-2 are joining with this change after: rL364153 ( D63660 ) rL364246 Differential Revision: https://reviews.llvm.org/D63777 llvm-svn: 364341	2019-06-25 18:51:44 +00:00
Whitney Tsang	7c1deeff4a	Expand cloneLoopWithPreheader() to support cloning loop nest Summary: cloneLoopWithPreheader() currently only support innermost loop, and assert otherwise. Reviewers: Meinersbur, fhahn, kbarton Reviewed By: Meinersbur Subscribers: hiraditya, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63446 llvm-svn: 364310	2019-06-25 13:23:13 +00:00
Clement Courbet	3bc5ad551a	[ExpandMemCmp] Move all options to TargetTransformInfo. Split off from D60318. llvm-svn: 364281	2019-06-25 08:04:13 +00:00
Huihui Zhang	4626613ffe	[InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier. Summary: To generate simplified IR, make sure fold (X & ~C) ==/!= 0 --> X u</u>= C+1 is scheduled before fold ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. https://rise4fun.com/Alive/7ZN Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63505 llvm-svn: 364255	2019-06-25 00:09:10 +00:00
Sanjay Patel	2675b0c8ab	[InstCombine] squash is-not-power-of-2 using ctpop This is the Demorgan'd 'not' of the pattern handled in: D63660 / rL364153 This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is not a power-of-2 using ctpop(X) > 1, so combining that with an is-zero check of the input is the same as testing if not exactly 1 bit is set: (X == 0) \|\| (ctpop(X) u> 1) --> ctpop(X) != 1 llvm-svn: 364246	2019-06-24 22:35:26 +00:00
Vasileios Porpodas	3081f78776	[SLP] NFC: Fixed typo in comment llvm-svn: 364237	2019-06-24 21:40:48 +00:00
Matt Arsenault	8025842599	InstCombine: Preserve nuw when reassociating nuw ops [3/3] Alive says this is OK. llvm-svn: 364235	2019-06-24 21:37:03 +00:00
Matt Arsenault	5d82ecd5d9	InstCombine: Preserve nuw when reassociating nuw ops [2/3] Alive says this is OK. llvm-svn: 364234	2019-06-24 21:37:02 +00:00
Matt Arsenault	5a89ba7343	InstCombine: Preserve nuw when reassociating nuw ops [1/3] Alive says this is OK. llvm-svn: 364233	2019-06-24 21:36:59 +00:00
Nikita Popov	f1ffc4305d	[CVP] Reenable nowrap flag inference Inference of nowrap flags in CVP has been disabled, because it triggered a bug in LFTR (https://bugs.llvm.org/show_bug.cgi?id=31181). This issue has been fixed in D60935, so we should be able to reenable nowrap flag inference now. Differential Revision: https://reviews.llvm.org/D62776 llvm-svn: 364228	2019-06-24 20:13:13 +00:00
Cameron McInally	fe3f15cf90	[SLP] Support unary FNeg vectorization Differential Revision: https://reviews.llvm.org/D63609 llvm-svn: 364219	2019-06-24 19:24:23 +00:00
Sanjay Patel	89efefb170	[InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X Prefer the more exact intrinsic to remove a use of the input value and possibly make further transforms easier (we will still need to match patterns with funnel-shift of wider types as pieces of bswap, especially if we want to canonicalize to funnel-shift with constant shift amount). Discussed in D46760. llvm-svn: 364187	2019-06-24 15:20:49 +00:00
Simon Pilgrim	b617b0808d	[InstCombine] SliceUpIllegalIntegerPHI - bail on out of range shifts trunc(lshr) handling - if the shift is out of range (undefined) then bail like we do for non-constant shifts. Fixes OSS Fuzz #15217 llvm-svn: 364181	2019-06-24 13:13:36 +00:00
Sanjoy Das	e2291f5af9	Fix typo in comment; NFC llvm-svn: 364159	2019-06-23 19:22:13 +00:00
Philip Reames	d22a2a9a72	[IndVars] Remove dead instructions after folding trivial loop exit In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration). This extends that to eliminate the dead comparison left around. llvm-svn: 364155	2019-06-23 17:06:57 +00:00
Sanjay Patel	13a5ae58fc	[InstCombine] squash is-power-of-2 that uses ctpop This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is power-of-2-or-0 using ctpop(X) < 2, so combining that with a non-zero check of the input is the same as testing if exactly 1 bit is set: (X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1 Differential Revision: https://reviews.llvm.org/D63660 llvm-svn: 364153	2019-06-23 14:22:37 +00:00
Philip Reames	8deb84c8ef	Exploit a zero LoopExit count to eliminate loop exits This turned out to be surprisingly effective. I was originally doing this just for completeness sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than it's isKnownPredicate reasoning. Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here. Differential Revision: https://reviews.llvm.org/D63618 llvm-svn: 364135	2019-06-22 17:54:25 +00:00
Nikita Popov	8c8e40f763	[NewGVN] Fix copy/paste mistake in cast llvm-svn: 364130	2019-06-22 10:20:13 +00:00
Nikita Popov	e96fda726e	[NewGVN] Remove dead SwitchEdges variable; NFC llvm-svn: 364129	2019-06-22 10:20:07 +00:00
Reid Kleckner	592a193285	Revert [SLP] Look-ahead operand reordering heuristic. This reverts r364084 (git commit `5698921be2`) It caused crashes while compiling a file in Chrome. Reduction forthcoming. llvm-svn: 364111	2019-06-21 23:10:25 +00:00
Julian Lettner	19c4d660f4	[ASan] Use dynamic shadow on 32-bit iOS and simulators The VM layout on iOS is not stable between releases. On 64-bit iOS and its derivatives we use a dynamic shadow offset that enables ASan to search for a valid location for the shadow heap on process launch rather than hardcode it. This commit extends that approach for 32-bit iOS plus derivatives and their simulators. rdar://50645192 rdar://51200372 rdar://51767702 Reviewed By: delcypher Differential Revision: https://reviews.llvm.org/D63586 llvm-svn: 364105	2019-06-21 21:01:39 +00:00
Simon Pilgrim	5698921be2	[SLP] Look-ahead operand reordering heuristic. This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364084	2019-06-21 17:57:01 +00:00
David Bolvansky	dbcdad51ff	[InstCombine] (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1 Summary: ``` %a = sub i32 31, %x %r = shl i32 1, %a => %d = shl i32 1, 31 %r = lshr i32 %d, %x Done: 1 Optimization is correct! ``` https://rise4fun.com/Alive/btZm Reviewers: spatel, lebedev.ri, nikic Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63652 llvm-svn: 364073	2019-06-21 16:25:32 +00:00
David Bolvansky	4b28478389	[InstCombine] cttz(abs(x)) -> cttz(x) Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D63546 llvm-svn: 364064	2019-06-21 15:26:22 +00:00
Sanjay Patel	ddb9093684	[GVNSink] prevent crashing on mismatched instructions (PR42346) Patch based on suggestion by James Molloy (@jmolloy) in: https://bugs.llvm.org/show_bug.cgi?id=42346 llvm-svn: 364062	2019-06-21 15:17:24 +00:00
Jay Foad	d9d3c91b48	[Scalarizer] Propagate IR flags Summary: The motivation for this was to propagate fast-math flags like nnan and ninf on vector floating point operations to the corresponding scalar operations to take advantage of follow-on optimizations. But I think the same argument applies to all of our IR flags: if they apply to the vector operation then they also apply to all the individual scalar operations, and they might enable follow-on optimizations. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63593 llvm-svn: 364051	2019-06-21 14:10:18 +00:00
Fangrui Song	dc8de6037c	Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC llvm-svn: 364006	2019-06-21 05:40:31 +00:00
Cameron McInally	1c0bd6dd2c	[Reassociate] Remove bogus assert reported in PR42349. Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete. The bogus assert with introduced in D63445. llvm-svn: 363998	2019-06-20 23:03:55 +00:00
Rainer Orth	6fde832b82	[profile] Solaris ld supports __start___llvm_prof_data etc. labels Currently, many profiling tests on Solaris FAIL like Command Output (stderr): -- Undefined first referenced symbol in file __llvm_profile_register_names_function /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o __llvm_profile_register_function /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o Solaris 11.4 ld supports the non-standard GNU ld extension of adding __start_SECNAME and __stop_SECNAME labels to sections whose names are valid as C identifiers. Given that we already use Solaris 11.4-only features like ld -z gnu-version-script-compat and fully working .preinit_array support in compiler-rt, we don't need to worry about older versions of Solaris ld. The patch documents that support (although the comment in lib/Transforms/Instrumentation/InstrProfiling.cpp (needsRuntimeRegistrationOfSectionRange) is quite cryptic what it's actually about), and adapts the affected testcase not to expect the alternativeq __llvm_profile_register_functions and __llvm_profile_init. It fixes all affected tests. Tested on amd64-pc-solaris2.11. Differential Revision: https://reviews.llvm.org/D41111 llvm-svn: 363984	2019-06-20 21:27:06 +00:00
Alina Sbirlea	d0b11698cd	[LICM & MSSA] Limit unsafe sinking and hoisting. Summary: The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch). If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink. Given: ``` for i load a[i] store a[i] ``` The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it. In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs. Caught by PR42294. This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63582 llvm-svn: 363982	2019-06-20 21:09:09 +00:00
Sanjay Patel	273d97e6bf	[InstCombine] fix typo in comment; NFC llvm-svn: 363974	2019-06-20 20:23:32 +00:00
Philip Reames	a7fd8a806f	[LFTR] Fix a (latent?) bug related to nested loops I can't actually come up with a test case this triggers on without an out of tree change, but in theory, it's a bug in the recently added multiple exit LFTR support. The root issue is that an exiting block common to two loops can (in theory) have computable exit counts for both loops. Rewriting the exit of an inner loop in terms of the outer loops IV would cause the inner loop to either a) run forever, or b) terminate on the first iteration. In practice, we appear to get lucky and not have the exit count computable for the outer loop, except when it's trivially zero. Given we bail on zero exit counts, we don't appear to ever trigger this. But I can't come up with a reason we can't compute an exit count for the outer loop on the common exiting block, so this may very well be triggering in some cases. llvm-svn: 363964	2019-06-20 18:45:06 +00:00
Sanjay Patel	63311bfb83	[InstCombine] canonicalize check for power-of-2 The form that compares against 0 is better because: 1. It removes a use of the input value. 2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2 3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS). This is a root cause for PR42314, but probably doesn't completely answer the codegen request: https://bugs.llvm.org/show_bug.cgi?id=42314 Alive proof: https://rise4fun.com/Alive/9kG Name: is power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp eq i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp eq i32 %a2, 0 Name: is not power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp ne i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp ne i32 %a2, 0 llvm-svn: 363956	2019-06-20 17:41:15 +00:00
David Bolvansky	01511192b2	[InstCombine] cttz(-x) -> cttz(x) Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63534 llvm-svn: 363951	2019-06-20 17:04:14 +00:00
Philip Reames	eda1ba65ca	LFTR for multiple exit loops Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop. This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.) I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred. I do want to note that I added one bit after the review. When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one. This was theoretically possible with a single exit, but the zero case covered it in practice. With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached. Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done. The solution seemed obvious, so I didn't bother with another round of review. Differential Revision: https://reviews.llvm.org/D62625 llvm-svn: 363883	2019-06-19 21:58:25 +00:00
Philip Reames	ce53e2226c	[LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC] (Resumbit of r363292 which was reverted along w/an earlier patch) llvm-svn: 363877	2019-06-19 20:45:57 +00:00
Philip Reames	f8104f01e6	[LFTR] Rename variable to minimize confusion [NFC] (Recommit of r363293 which was reverted when a dependent patch was.) As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here. llvm-svn: 363875	2019-06-19 20:41:28 +00:00
Huihui Zhang	670778c762	[InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier Summary: To generate simplified IR, make sure fold ``` (X & signbit) ==/!= 0) -> X s>=/s< 0; ``` is scheduled before fold ``` ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. ``` https://rise4fun.com/Alive/fbdh Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63026 llvm-svn: 363845	2019-06-19 17:31:39 +00:00
Cameron McInally	7aa898e61e	[DFSan] Add UnaryOperator visitor to DataFlowSanitizer Differential Revision: https://reviews.llvm.org/D62815 llvm-svn: 363814	2019-06-19 15:11:41 +00:00
Cameron McInally	a027cf4764	[Reassociate] Handle unary FNeg in the Reassociate pass Differential Revision: https://reviews.llvm.org/D63445 llvm-svn: 363813	2019-06-19 14:59:14 +00:00
Orlando Cazalet-Hyams	1251cac62a	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 > llvm-svn: 363046 llvm-svn: 363786	2019-06-19 10:50:47 +00:00
Michael Liao	4f7f70e262	Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas [SROA] Enhance SROA to handle `addrspacecast`ed allocas - Fix typo in original change - Add additional handling to ensure all return pointers are properly casted. Summary: - After `addrspacecast` is allowed to be eliminated in SROA, the adjusting of storage pointer (from `alloca) needs to handle the potential different address spaces between the storage pointer (from alloca) and the pointer being used. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63501 llvm-svn: 363743	2019-06-18 21:41:13 +00:00
Gor Nishanov	3fcad775c0	[coroutines] Add missing pass dependency. Summary: CoroSplit depends on CallGraphWrapperPass, but it was not explicitly adding it as a pass dependency. This missing dependency can trigger errors / assertions / crashes in PMTopLevelManager::schedulePass() under certain configurations. Author: ben-clayton Reviewers: GorNishanov Reviewed By: GorNishanov Subscribers: capn, EricWF, modocache, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63144 llvm-svn: 363727	2019-06-18 19:49:48 +00:00
Jordan Rupprecht	33e85ad956	Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas This reverts r363711 (git commit `76a149ef81`) This causes stage2 build failures, e.g.: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio llvm-svn: 363718	2019-06-18 18:40:04 +00:00
Michael Liao	76a149ef81	[SROA] Enhance SROA to handle `addrspacecast`ed allocas Summary: - After `addrspacecast` is allowed to be eliminated in SROA, the adjusting of storage pointer (from `alloca) needs to handle the potential different address spaces between the storage pointer (from alloca) and the pointer being used. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63501 llvm-svn: 363711	2019-06-18 17:58:49 +00:00
Yevgeny Rouban	69daf4a72d	[SimplifyCFG] NFC, prof branch_weighs handling is simplified Using the new SwitchInstProfUpdateWrapper this patch simplifies 3 places of prof branch_weights handling. Differential Revision: https://reviews.llvm.org/D62123 llvm-svn: 363652	2019-06-18 06:50:52 +00:00
Peter Collingbourne	d57f7cc15e	hwasan: Use bits [3..11) of the ring buffer entry address as the base stack tag. This saves roughly 32 bytes of instructions per function with stack objects and causes us to preserve enough information that we can recover the original tags of all stack variables. Now that stack tags are deterministic, we no longer need to pass -hwasan-generate-tags-with-calls during check-hwasan. This also means that the new stack tag generation mechanism is exercised by check-hwasan. Differential Revision: https://reviews.llvm.org/D63360 llvm-svn: 363636	2019-06-17 23:39:51 +00:00
Peter Collingbourne	fb9ce100d1	hwasan: Add a tag_offset DWARF attribute to instrumented stack variables. The goal is to improve hwasan's error reporting for stack use-after-return by recording enough information to allow the specific variable that was accessed to be identified based on the pointer's tag. Currently we record the PC and lower bits of SP for each stack frame we create (which will eventually be enough to derive the base tag used by the stack frame) but that's not enough to determine the specific tag for each variable, which is the stack frame's base tag XOR a value (the "tag offset") that is unique for each variable in a function. In IR, the tag offset is most naturally represented as part of a location expression on the llvm.dbg.declare instruction. However, the presence of the tag offset in the variable's actual location expression is likely to confuse debuggers which won't know about tag offsets, and moreover the tag offset is not required for a debugger to determine the location of the variable on the stack, so at the DWARF level it is represented as an attribute so that it will be ignored by debuggers that don't know about it. Differential Revision: https://reviews.llvm.org/D63119 llvm-svn: 363635	2019-06-17 23:39:41 +00:00
Philip Reames	44475363e8	Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges This patch really contains two pieces: Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi. Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses. Differential Revision: https://reviews.llvm.org/D63224 llvm-svn: 363619	2019-06-17 21:06:17 +00:00
Philip Reames	fe8bd96ebd	Fix a bug w/inbounds invalidation in LFTR (recommit) Recommit r363289 with a bug fix for crash identified in pr42279. Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case. Test case previously committed in r363601. Original commit comment follows: This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363613	2019-06-17 20:32:22 +00:00
Joseph Tremoulet	daa1ae6142	[EarlyCSE] Fix hashing of self-compares Summary: Update compare normalization in SimpleValue hashing to break ties (when the same value is being compared to itself) by switching to the swapped predicate if it has a lower numerical value. This brings the hashing in line with isEqual, which already recognizes the self-compares with swapped predicates as equal. Fixes PR 42280. Reviewers: spatel, efriedma, nikic, fhahn, uabelho Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63349 llvm-svn: 363598	2019-06-17 19:11:28 +00:00
Matt Arsenault	6d741f29ec	AMDGPU: Fold readlane/readfirstlane calls llvm-svn: 363587	2019-06-17 17:52:35 +00:00
Warren Ristow	6452bdd29b	[LV] Suppress vectorization in some nontemporal cases When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the aligment of the original scaler memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type DataType, unsigned Alignment) const; bool isLegalNTLoad(Type DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 llvm-svn: 363581	2019-06-17 17:20:08 +00:00
Whitney Tsang	15b7f5b72d	PHINode: introduce setIncomingValueForBlock() function, and use it. Summary: There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue() but no function to replace incoming value for a specified BasicBlock* predecessor. Clearly, there are a lot of places that could use that functionality. Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn Reviewed By: Meinersbur, fhahn Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63338 llvm-svn: 363566	2019-06-17 14:38:56 +00:00
Matt Arsenault	1df203d78e	InferAddressSpaces: Fix cloning original addrspacecast If an addrspacecast needed to be inserted again, this was creating a clone of the original cast for each user. Just use the original, which also saves losing the value name. llvm-svn: 363562	2019-06-17 14:13:29 +00:00
Bjorn Pettersson	83773b77a5	[LV] Deny irregular types in interleavedAccessCanBeWidened Summary: Avoid that loop vectorizer creates loads/stores of vectors with "irregular" types when interleaving. An example of an irregular type is x86_fp80 that is 80 bits, but that may have an allocation size that is 96 bits. So an array of x86_fp80 is not bitcast compatible with a vector of the same type. Not sure if interleavedAccessCanBeWidened is the best place for this check, but it solves the problem seen in the added test case. And it is the same kind of check that already exists in memoryInstructionCanBeWidened. Reviewers: fhahn, Ayal, craig.topper Reviewed By: fhahn Subscribers: hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63386 llvm-svn: 363547	2019-06-17 12:02:24 +00:00
Hans Wennborg	a9e5d2f35d	Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" Third time's the charm. This was reverted in r363220 due to being suspected of an internal benchmark regression and a test failure, none of which turned out to be caused by this. llvm-svn: 363529	2019-06-17 07:47:28 +00:00
Yevgeny Rouban	ee62c40eae	[SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata if unreachable switch cases are removed. This patch fixes this bug by making use of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122). A new test is created. Differential Revision: https://reviews.llvm.org/D62186 llvm-svn: 363527	2019-06-17 05:55:12 +00:00
Nikita Popov	9145562b48	[SimplifyIndVar] Simplify non-overflowing saturating add/sub If we can detect that saturating math that depends on an IV cannot overflow, replace it with simple math. This is similar to the CVP optimization from D62703, just based on a different underlying analysis (SCEV vs LVI) that catches different cases. Differential Revision: https://reviews.llvm.org/D62792 llvm-svn: 363489	2019-06-15 08:48:52 +00:00
Akira Hatanaka	a704a8f28c	[ObjC][ARC] Delete ObjC runtime calls on global variables annotated with 'objc_arc_inert' Those calls are no-ops, so they can be safely deleted. rdar://problem/49839633 Differential Revision: https://reviews.llvm.org/D62433 llvm-svn: 363468	2019-06-14 22:06:32 +00:00
Matt Arsenault	282dac717e	SROA: Allow eliminating addrspacecasted allocas There is a circular dependency between SROA and InferAddressSpaces today that requires running both multiple times in order to be able to eliminate all simple allocas and addrspacecasts. InferAddressSpaces can't remove addrspacecasts when written to memory, and SROA helps move pointers out of memory. This should avoid inserting new commuting addrspacecasts with GEPs, since there are unresolved questions about pointer wrapping between different address spaces. For now, don't replace volatile operations that don't match the alloca addrspace, as it would change the address space of the access. It may be still OK to insert an addrspacecast from the new alloca, but be more conservative for now. llvm-svn: 363462	2019-06-14 21:38:31 +00:00
Florian Hahn	dcdd12b68c	Revert Fix a bug w/inbounds invalidation in LFTR Reverting because it breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363289 (git commit `eb88badff9`) llvm-svn: 363427	2019-06-14 17:23:09 +00:00
Florian Hahn	a19809045c	Revert [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC] Reverting because it depends on r363289, which breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363292 (git commit `42a3fc133d`) llvm-svn: 363426	2019-06-14 17:22:56 +00:00
Florian Hahn	e1b4b1b46e	Revert [LFTR] Rename variable to minimize confusion [NFC] Reverting because it depends on r363289, which breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363293 (git commit `c37be29634`) llvm-svn: 363425	2019-06-14 17:22:49 +00:00
Shawn Landden	f2e60fc4e8	[SimpligyCFG] NFC intended, remove GCD that was only used for powers of two and replace with an equilivent countTrailingZeros. GCD is much more expensive than this, with repeated division. This depends on D60823 Differential Revision: https://reviews.llvm.org/D61151 llvm-svn: 363422	2019-06-14 16:56:49 +00:00
Johannes Doerfert	282d34ee78	[Attributor] Disable the Attributor by default and fix a comment llvm-svn: 363408	2019-06-14 14:53:41 +00:00
Matt Arsenault	492d71cc99	AMDGPU: Fold readlane intrinsics of constants I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet. llvm-svn: 363406	2019-06-14 14:51:26 +00:00
Stanislav Mekhanoshin	68a2fef9ae	[AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32 Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339	2019-06-13 23:47:36 +00:00
Philip Reames	c37be29634	[LFTR] Rename variable to minimize confusion [NFC] As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here. llvm-svn: 363293	2019-06-13 18:40:15 +00:00
Philip Reames	42a3fc133d	[LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC] llvm-svn: 363292	2019-06-13 18:32:55 +00:00
Philip Reames	eb88badff9	Fix a bug w/inbounds invalidation in LFTR This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363289	2019-06-13 18:23:13 +00:00
Leonard Chan	09f56b51ec	[clang][NewPM] Fix broken -O0 test from missing assumptions Add an AssumptionCache callback to the InlineFuntionInfo used for the AlwaysInlinerPass to match codegen of the AlwaysInlinerLegacyPass to generate llvm.assume. This fixes CodeGen/builtin-movdir.c when new PM is enabled by default. Differential Revision: https://reviews.llvm.org/D63170 llvm-svn: 363287	2019-06-13 18:18:40 +00:00
Joseph Tremoulet	3bc6e2a7aa	[EarlyCSE] Ensure equal keys have the same hash value Summary: The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation. This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`. It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion. Reviewers: spatel, nikic Reviewed By: spatel, nikic Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62644 llvm-svn: 363274	2019-06-13 15:24:11 +00:00
Sander de Smalen	7957fc6547	[IntrinsicEmitter] Extend argument overloading with forward references. Extend the mechanism to overload intrinsic arguments by using either backward or forward references to the overloadable arguments. In for example: def int_something : Intrinsic<[LLVMPointerToElt<0>], [llvm_anyvector_ty], []>; LLVMPointerToElt<0> is a forward reference to the overloadable operand of type 'llvm_anyvector_ty' and would allow intrinsics such as: declare i32* @llvm.something.v4i32(<4 x i32>); declare i64* @llvm.something.v2i64(<2 x i64>); where the result pointer type is deduced from the element type of the first argument. If the returned pointer is not a pointer to the element type, LLVM will give an error: Intrinsic has incorrect return type! i64* (<4 x i32>)* @llvm.something.v4i32 Reviewers: RKSimon, arsenm, rnk, greened Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D62995 llvm-svn: 363233	2019-06-13 08:19:33 +00:00
Shawn Landden	8b142bcc3f	[SimplifyCFG] reverting preliminary Switch patches again This reverts 363226 and 363227, both NFC intended I swear I fixed the test case that is failing, and ran the tests, but I will look into it again. llvm-svn: 363229	2019-06-13 05:26:17 +00:00
Shawn Landden	636220e83c	[SimpligyCFG] NFC intended, remove GCD that was only used for powers of two and replace with an equilivent countTrailingZeros. GCD is much more expensive than this, with repeated division. This depends on D60823 Differential Revision: https://reviews.llvm.org/D61151 llvm-svn: 363227	2019-06-13 05:01:44 +00:00
David L. Jones	c73fadaa84	Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...' We have observed some failures with internal builds with this revision. - Performance regressions: - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings). - Benchmarks for Abseil's SwissTable. - Correctness: - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP). hwennborg has already been notified, and is aware of reproducer failures. llvm-svn: 363220	2019-06-13 02:04:45 +00:00
Philip Reames	ae2581cef3	[IndVars] Extend diagnostic -replexitval flag w/ability to bypass hard use hueristic Note: This does mean that "always" is now more powerful than it was. llvm-svn: 363196	2019-06-12 19:52:05 +00:00
Matt Arsenault	aa6bdf9dcd	LoopVersioning: Respect convergent This changes the standalone pass only. Arguably the utility class itself should assert there are no convergent calls. However, a target pass with additional context may still be able to version a loop if all of the dynamic conditions are sufficiently uniform. llvm-svn: 363165	2019-06-12 14:05:58 +00:00
Matt Arsenault	86325be3d7	LoopLoadElim: Respect convergent llvm-svn: 363162	2019-06-12 13:50:47 +00:00
Matt Arsenault	2466ba97bc	LoopDistribute/LAA: Respect convergent This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks. Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed. https://reviews.llvm.org/D62607 llvm-svn: 363160	2019-06-12 13:34:19 +00:00
Orlando Cazalet-Hyams	a947156396	Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion" This reverts commit `1a0f7a2077`. See phabricator thread for D60831. llvm-svn: 363132	2019-06-12 08:34:51 +00:00
Philip Reames	082cd30327	Generalize icmp matching in IndVars' eliminateTrunc We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases. Differential Revision: https://reviews.llvm.org/D63112 llvm-svn: 363108	2019-06-11 22:43:25 +00:00
Alina Sbirlea	3cef1f7d64	Only passes that preserve MemorySSA must mark it as preserved. Summary: The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as preserved, because it's being called in a lot of passes that do not preserve MemorySSA. Instead, mark the MemorySSA analysis as preserved by each pass that does preserve it. These changes only affect the new pass mananger. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62536 llvm-svn: 363091	2019-06-11 18:27:49 +00:00
Cameron McInally	08200d6d26	[InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ Differential Revision: https://reviews.llvm.org/D62612 llvm-svn: 363082	2019-06-11 16:21:21 +00:00
Cameron McInally	796de11331	[InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080	2019-06-11 15:45:41 +00:00
Orlando Cazalet-Hyams	1a0f7a2077	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 363046	2019-06-11 10:37:20 +00:00
Sander de Smalen	cbeb563cfb	Change semantics of fadd/fmul vector reductions. This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones: llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul and adds functionality to auto-upgrade existing LLVM IR and bitcode. Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D60261 llvm-svn: 363035	2019-06-11 08:22:10 +00:00
Rong Xu	7ea131c20c	[PGO] Fix the buildbot failure in r362995 Fixed one unused variable warning. llvm-svn: 363004	2019-06-10 23:20:04 +00:00
Rong Xu	e44fa83c37	[PGO] Handle cases of non-instrument BBs As shown in PR41279, some basic blocks (such as catchswitch) cannot be instrumented. This patch filters out these BBs in PGO instrumentation. It also sets the profile count to the fail-to-instrument edge, so that we can propagate the counts in the CFG. Differential Revision: https://reviews.llvm.org/D62700 llvm-svn: 362995	2019-06-10 22:36:27 +00:00
Philip Reames	a9633d5f0b	[LFTR] Use recomputed BE count This was discussed as part of D62880. The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion. This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable. For the nestedIV example from elim-extend.ll, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) Note that before is an i32 type, and the after is an i64. Truncating the i64 produces the i32. llvm-svn: 362975	2019-06-10 19:18:53 +00:00
Philip Reames	5d84ccb230	Prepare for multi-exit LFTR [NFC] This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. The actual multi-exit LFTR patch is in D62625 for context. Specifically, it turns out the existing code uses the backedge taken count from before a IV is widened. Oddly, we can end up with a different (more expensive, but semantically equivelent) BE count for the loop when requerying after widening. For the nestedIV example from elim-extend, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) This is the only test in tree which seems sensitive to this difference. The actual result of using the wider BETC on this example is that we actually produce slightly better code. :) In review, we decided to accept that test change. This patch is structured to preserve the old behavior, but a separate change will immediate follow with the behavior change. (I wanted it separate for problem attribution purposes.) Differential Revision: https://reviews.llvm.org/D62880 llvm-svn: 362971	2019-06-10 17:51:13 +00:00
Sanjay Patel	9650c95b7e	[InstCombine] allow unordered preds when canonicalizing to fabs() We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling. Similar to: rL362937 llvm-svn: 362954	2019-06-10 15:39:00 +00:00
Sanjay Patel	85de9634e6	[InstCombine] fix bug in canonicalization to fabs() Forgot to translate the predicate clauses in rL362943. llvm-svn: 362945	2019-06-10 14:57:45 +00:00
Sanjay Patel	8b6d9f60ed	[InstCombine] change canonicalization to fabs() to use FMF on fsub Similar to rL362909: This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fsub because they have the same operand. llvm-svn: 362943	2019-06-10 14:46:36 +00:00
Sanjay Patel	8cd8c5784b	[InstCombine] allow unordered preds when canonicalizing to fabs() PR42179: https://bugs.llvm.org/show_bug.cgi?id=42179 llvm-svn: 362937	2019-06-10 14:14:51 +00:00
Vivek Pandya	11cb15f8ed	Do not derive no-recurse attribute if function does not have exact definition. This is fix for https://bugs.llvm.org/show_bug.cgi?id=41336 Reviewers: jdoerfert Reviewed by: jdoerfert Differential Revision: https://reviews.llvm.org/D63045 llvm-svn: 362918	2019-06-10 04:16:04 +00:00
Roman Lebedev	d669758d84	[InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles A precondition 'x != 0' was forgotten by me: https://rise4fun.com/Alive/JFNP https://rise4fun.com/Alive/jHvL These 4 folds with non-constants could be re-enabled, but for now let's go for the simplest solution. https://bugs.llvm.org/show_bug.cgi?id=42198 llvm-svn: 362911	2019-06-09 16:30:42 +00:00
Sanjay Patel	87cd16a86e	[InstCombine] change canonicalization to fabs() to use FMF on fneg This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fneg (fsub) because they have the same operand. This works around the most glaring FMF logical inconsistency cited in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 llvm-svn: 362909	2019-06-09 16:22:01 +00:00
Keno Fischer	eb4a561fa3	[GVN] non-functional code movement Summary: Move some code around, in preparation for later fixes to the non-integral addrspace handling (D59661) Patch By Jameson Nash <jameson@juliacomputing.com> Reviewed By: reames, loladiro Differential Revision: https://reviews.llvm.org/D59729 llvm-svn: 362853	2019-06-07 23:08:38 +00:00
Alina Sbirlea	eaea538d18	[DomTreeUpdater] Add all insert before all delete updates to reduce compile time. Summary: The cleanup in D62751 introduced a compile-time regression due to the way DT updates are performed. Add all insert edges then all delete edges in DTU to match the previous compile time. Compile time on the test provided by @mstorsjo before and after this patch on my machine: 113.046s vs 35.649s Repro: clang -target x86_64-w64-mingw32 -c -O3 glew-preproc.c; on https://martin.st/temp/glew-preproc.c. Reviewers: kuhar, NutshellySima, mstorsjo Subscribers: jlebar, mstorsjo, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62981 llvm-svn: 362839	2019-06-07 20:43:55 +00:00
Stefan Stipanovic	128e8e8fb9	test-commit llvm-svn: 362802	2019-06-07 14:18:02 +00:00
Fangrui Song	19189993c9	[LV] Fix -Wunused-function after r362736 llvm-svn: 362762	2019-06-07 01:48:26 +00:00
Renato Golin	9e97caf594	[LV] Wrap LV illegality reporting in a function. NFC. A function for loop vectorization illegality reporting has been introduced: void LoopVectorizationLegality::reportVectorizationFailure( const StringRef DebugMsg, const StringRef OREMsg, const StringRef ORETag, Instruction * const I) const; The function prints a debug message when the debug for the compilation unit is enabled as well as invokes the optimization report emitter to generate a message with a specified tag. The function doesn't cover any complicated logic when a custom lambda should be passed to the emitter, only generating a message with a tag is supported. The function always prints the instruction `I` after the debug message whenever the instruction is specified, otherwise the debug message ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.' Patch by Pavel Samolysov <samolisov@gmail.com> llvm-svn: 362736	2019-06-06 19:15:52 +00:00
Philip Reames	101915cfda	[LoopPred] Fix a bug in unconditional latch bailout introduced in r362284 This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops. llvm-svn: 362727	2019-06-06 18:02:36 +00:00
Cameron McInally	c72fbe5dc1	[MSAN] Add unary FNeg visitor to the MemorySanitizer Differential Revision: https://reviews.llvm.org/D62909 llvm-svn: 362664	2019-06-05 22:37:05 +00:00
Mircea Trofin	e3eeacd70a	[CallSite removal] Refactoring llvm::InlineFunction APIs Summary: This change only unifies the API previous API pair accepting CallInst and InvokeInst, thus making it easier to refactor inliner pass ode to CallBase. The implementation of the unified API still relies on the CallSite implementation. Reviewers: eraman, chandlerc, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62283 llvm-svn: 362656	2019-06-05 21:28:13 +00:00
Sanjay Patel	ac111e526d	[InstCombine] simplify code for bitcast of insertelement; NFC llvm-svn: 362655	2019-06-05 21:26:52 +00:00
Matt Arsenault	663d762c9a	NewGVN: Handle addrspacecast The AllConstant check needs to be moved out of the if/else if chain to avoid a test regression. The "there is no SimplifyZExt" comment puzzles me, since there is SimplifyCastInst. Additionally, the Simplify* calls seem to not see the operand as constant, so this needs to be tried if the simplify failed. llvm-svn: 362653	2019-06-05 21:15:52 +00:00
Tim Northover	8d7f118ab2	InstCombine: correctly change byval type attribute alongside call args. When the byval attribute has a type, it must match the pointee type of any parameter; but InstCombine was not updating the attribute when folding casts of various kinds away. llvm-svn: 362643	2019-06-05 20:38:17 +00:00
Dinar Temirbulatov	15c657d13d	[SLP] Fix regression in broadcasts caused by operand reordering patch D59973. This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 . The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found. Please see the lit test for some examples. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D62427 llvm-svn: 362613	2019-06-05 15:26:28 +00:00
Sanjay Patel	ad62a3a299	[LoopUtils][SLPVectorizer] clean up management of fast-math-flags Instead of passing around fast-math-flags as a parameter, we can set those using an IRBuilder guard object. This is no-functional-change-intended. The motivation is to eventually fix the vectorizers to use and set the correct fast-math-flags for reductions. Examples of that not behaving as expected are: https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast') https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0) D61802 (should be able to reduce with IR-level FMF) Differential Revision: https://reviews.llvm.org/D62272 llvm-svn: 362612	2019-06-05 14:58:04 +00:00
Simon Pilgrim	db134aaec2	[IPO] Disabled 'default only' switch statements to fix MSVC warnings. @jdoerfert Looks like these are placeholders for incoming abstract attributes patches so I've just commented the code out, even though this is usually frowned upon. llvm-svn: 362592	2019-06-05 10:04:05 +00:00

... 5 6 7 8 9 ...

22423 Commits