llvm-project

Commit Graph

Author	SHA1	Message	Date
Max Kazantsev	84286ce5dd	[IRCE][NFC] Rename fields of InductiveRangeCheck Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively to make naming of similar entities for Ranges and Range Checks more consistent. Differential Revision: https://reviews.llvm.org/D39414 llvm-svn: 316979	2017-10-31 06:19:05 +00:00
Max Kazantsev	21e7b53490	[NFC] Get rid of variables used in assert only llvm-svn: 316977	2017-10-31 05:33:58 +00:00
Philip Reames	59bf1e0548	[IndVarSimplify] Simplify code using preheader assumption As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly; it just assumed SCEV didn't analyze such loops. Given SCEV has comments to the contrary, that seems a bit suspect. More importantly, the pass actually requires loopsimplify form which ensures a loop-preheader is available. Remove the excessive generality and shorten the code greatly. Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single entry loops. See the added test case. llvm-svn: 316976	2017-10-31 05:16:46 +00:00
Max Kazantsev	488ec975bb	Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors" This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited from hoisting across such instructions because there is no guarantee that a load placed after such an instruction is still valid before it. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 316975	2017-10-31 05:07:56 +00:00
Philip Reames	39a8dbff87	[SimplifyIndVar] Extract out invariant expression handling Previously, the code returned early from the function when it couldn't find a free expansion; it should be returning from the transform. I don't have a test case, noticed this via inspection. As a follow up, I'm going to revisit the logic in the extract function. I think that essentially the whole helper routine can be replaced with SCEVExpander, but I wanted to do that in a series of separate commits. llvm-svn: 316974	2017-10-31 04:19:06 +00:00
Philip Reames	5552f503d5	Undo accidental commit These files shouldn't have been submitted in 316967 llvm-svn: 316968	2017-10-31 00:04:09 +00:00
Philip Reames	9c3cbeea39	[CGP] Fix crash on i96 bit multiply Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725 If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area. There's a bunch of obviously wrong code in the same function. I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like. llvm-svn: 316967	2017-10-30 23:59:51 +00:00
Yaxun Liu	d23f23d81c	InferAddressSpaces: Fix bug about replacing addrspacecast InferAddressSpaces assumes the pointee type of addrspacecast is the same as the operand, which is not always true and causes invalid IR. This bug causes a build failure in HCC. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39432 llvm-svn: 316957	2017-10-30 21:19:41 +00:00
Davide Italiano	834b45129b	[NewGVN] Stop assuming PHI args ordering when looking at phi-of-ops. It's not guaranteed. There's a bug open to sort them in predecessor order, but it won't happen anytime soon. In the meanwhile, passes will have to do an O(#preds) scan. Such is life. llvm-svn: 316953	2017-10-30 20:20:16 +00:00
Daniel Neilson	f9c7d29c77	Create instruction classes for identifying any atomicity of memory intrinsic. (NFC) Summary: For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html This patch fleshes out the instruction class hierarchy with respect to atomic and non-atomic memory intrinsics. With this change, the relevant part of the class hierarchy becomes: IntrinsicInst -> MemIntrinsicBase (methods-only class) -> MemIntrinsic (non-atomic intrinsics) -> MemSetInst -> MemTransferInst -> MemCpyInst -> MemMoveInst -> AtomicMemIntrinsic (atomic intrinsics) -> AtomicMemSetInst -> AtomicMemTransferInst -> AtomicMemCpyInst -> AtomicMemMoveInst -> AnyMemIntrinsic (both atomicities) -> AnyMemSetInst -> AnyMemTransferInst -> AnyMemCpyInst -> AnyMemMoveInst This involves some class renaming: ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst A script for doing this renaming in downstream trees is included below. An example of where the Any* classes should be used in LLVM is when reasoning about the effects of an instruction (ex: aliasing). --- Script for renaming AtomicMem* classes: PREFIXES="[<,([:space:]]" CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst" SUFFIXES="[;)>,[:space:]]" REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})" REGEX2="visitElementUnorderedAtomic(${CLASSES})" FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq ) SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g" SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g" for f in $FILES; do echo "Processing: $f" sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f done Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev Reviewed By: sanjoy Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38419 llvm-svn: 316950	2017-10-30 19:51:48 +00:00
Mandeep Singh Grang	f83268bd9e	[GVNHoist] Fix non-deterministic sort order of PHIs for identical instructions Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245. Reviewers: hiraditya, spop, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39410 llvm-svn: 316949	2017-10-30 19:42:41 +00:00
Clement Courbet	b2c3eb8cf1	[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2). - Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For example, this is not the case for x86(32bit)+sse2. Fixes PR34887. llvm-svn: 316905	2017-10-30 14:19:33 +00:00
Florian Hahn	d0208b4b1c	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addresses some additional nits. Original commit message: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change: source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316891	2017-10-30 10:07:42 +00:00
Max Kazantsev	390fc57771	[IRCE][NFC] Store Length as SCEV in RangeCheck instead of Value llvm-svn: 316889	2017-10-30 09:35:16 +00:00
Florian Hahn	d18443edad	Revert r316887 to fix buildbot failures. llvm-svn: 316888	2017-10-30 09:21:50 +00:00
Florian Hahn	925d3e4a98	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addresses some additional nits. Original commit message: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change: source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316887	2017-10-30 09:04:18 +00:00
Max Kazantsev	1d7c0439b9	[GVN][NFC] Mark instruction for deletion instead of immediate erasing in LoadPRE It is done to uniformly handle instructions removal. Differential Revision: https://reviews.llvm.org/D39369 llvm-svn: 316884	2017-10-30 04:48:34 +00:00
Sanjay Patel	b049173157	[SimplifyCFG] use pass options and remove the latesimplifycfg pass This is no-functional-change-intended. This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired). The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager. Differential Revision: https://reviews.llvm.org/D38631 llvm-svn: 316835	2017-10-28 18:43:07 +00:00
Craig Topper	49687104d6	[PartialInlineLibCalls] Teach PartialInlineLibCalls to honor nobuiltin, properly check the function signature, and check TLI::has Summary: We shouldn't do this transformation if the function is marked nobuiltin. We were only checking that the return type is floating point; we really should be checking the argument types and argument count as well. This can be accomplished by using the other version of getLibFunc that takes the Function and not just the name. We should also be checking TLI::has since sqrtf is a macro on Windows. Fixes PR32559. Reviewers: hfinkel, spatel, davide, efriedma Reviewed By: davide, efriedma Subscribers: efriedma, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D39381 llvm-svn: 316819	2017-10-28 00:36:58 +00:00
Artur Pilipenko	8aadc643cf	[LoopPredication] Handle the case when the guard and the latch IV have different offsets This is a follow up change for D37569. Currently the transformation is limited to the case when: * The loop has a single latch with the condition of the form: ++i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. * The step of the IV used in the latch condition is 1. * The IV of the latch condition is the same as the post increment IV of the guard condition. * The guard condition is of the form i u< guardLimit. This patch enables the transform in the case when the latch is latchStart + i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. And the guard is guardStart + i u< guardLimit Reviewed By: anna Differential Revision: https://reviews.llvm.org/D39097 llvm-svn: 316768	2017-10-27 14:46:17 +00:00
Max Kazantsev	665907c3c2	[GVN][NFC] Refactor loop iteration with foreach llvm-svn: 316748	2017-10-27 08:19:35 +00:00
Eugene Zelenko	57bd5a0274	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316724	2017-10-27 01:09:08 +00:00
Philip Reames	29dd40b38e	[SimplifyIndVars] Shorten code by using SCEV helper [NFC] llvm-svn: 316709	2017-10-26 22:02:16 +00:00
Dehao Chen	ed2d5402cb	Do not add discriminator encoding for debug intrinsics. Summary: There are certain requirements for debug location of debug intrinsics, e.g. the scope of the DILocalVariable should be the same as the scope of its debug location. As a result, we should not add discriminator encoding for debug intrinsics. Reviewers: dblaikie, aprantl Reviewed By: aprantl Subscribers: JDevlieghere, aprantl, bjope, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D39343 llvm-svn: 316703	2017-10-26 21:20:52 +00:00
Philip Reames	21cc2fa3f6	[LICM] Restructure implicit exit handling to be more clear [NFCI] When going to explain this to someone else, I got tripped up by the complicated meaning of IsKnownNonEscapingObject in load-store promotion. Extract a helper routine and clarify naming/scopes to make this a bit more obvious. llvm-svn: 316699	2017-10-26 21:00:15 +00:00
Balaram Makam	9ee942f481	Reapply r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred. Summary: This reverts r316612 to reapply r316582. The buildbot failure was unrelated to this commit. Reviewers: Subscribers: llvm-svn: 316669	2017-10-26 15:04:53 +00:00
Bjorn Pettersson	86db068e39	[LSV] Avoid adding vectors of pointers as candidates Summary: We no longer add vectors of pointers as candidates for load/store vectorization. It does not seem to work anyway, but without this patch we can end up in asserts when trying to create casts between an integer type and the pointer of vectors type. The test case I've added used to assert like this when trying to cast between i64 and <2 x i16>: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>) Reviewers: arsenm Reviewed By: arsenm Subscribers: nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D39296 llvm-svn: 316665	2017-10-26 13:59:15 +00:00
Bjorn Pettersson	22a2282da1	[LSV] Skip all non-byte sizes, not only less than eight bits Summary: The code comments indicate that no effort has been spent on handling load/stores when the size isn't a multiple of the byte size correctly. However, the code only avoided types smaller than 8 bits. So for example a load of an i28 could still be considered as a candidate for vectorization. This patch adjusts the code to behave according to the code comment. The test case used to hit the following assert when trying to use "cast" an i32 to i28 using CreateBitOrPointerCast: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>*) Reviewers: arsenm Reviewed By: arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39295 llvm-svn: 316663	2017-10-26 13:42:55 +00:00
Eugene Zelenko	5c2aecef78	[Transforms] Revert r316630 changes in Scalar/MergeICmps.cpp to fix broken build bots (NFC). llvm-svn: 316634	2017-10-26 01:25:14 +00:00
Eugene Zelenko	5adb96cc92	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316630	2017-10-26 00:55:39 +00:00
Matthew Simpson	99f57933ba	Attempt to unbreak the expensive-checks-win bot llvm-svn: 316625	2017-10-25 22:46:34 +00:00
Balaram Makam	52252fe20d	Revert r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred. Summary: This reverts commit r316582. It looks like this commit broke tests on one buildbot: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/5719 . . . Failing Tests (1): LLVM :: Transforms/CalledValuePropagation/simple-arguments.ll Reviewers: Subscribers: llvm-svn: 316612	2017-10-25 21:32:54 +00:00
Balaram Makam	925ddf1a93	[Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred. Summary: For some irreducible CFG the domtree nodes might be dead, do not update domtree for dead nodes. Reviewers: kuhar, dberlin, hfinkel Reviewed By: kuhar Subscribers: llvm-commits, mcrosier Differential Revision: https://reviews.llvm.org/D38960 llvm-svn: 316582	2017-10-25 14:55:48 +00:00
Matthew Simpson	cb58558c2f	Add CalledValuePropagation pass This patch adds a new pass for attaching !callees metadata to indirect call sites. The pass propagates values to call sites by performing an IPSCCP-like analysis using the generic sparse propagation solver. For indirect call sites having a small set of possible callees, the attached metadata indicates what those callees are. The metadata can be used to facilitate optimizations like intersecting the function attributes of the possible callees, refining the call graph, performing indirect call promotion, etc. Differential Revision: https://reviews.llvm.org/D37355 llvm-svn: 316576	2017-10-25 13:40:08 +00:00
Max Kazantsev	9ac7021a25	[IRCE] Fix intersection between signed and unsigned ranges IRCE for unsigned latch conditions was temporarily disabled by rL314881. The motivating example contained an unsigned latch condition and a signed range check. One of the safe iteration ranges was `[1, SINT_MAX + 1]`. Its right border was incorrectly interpreted as a negative value in the `IntersectRange` function, which led to a miscompile in which we deleted a range check without inserting a postloop where it was needed. This patch brings back IRCE for unsigned latch conditions. Now we treat range intersection more carefully. If the latch condition was unsigned, we only try to consider a range check for deletion if: 1. The range check is also unsigned, or 2. The safe iteration range of the range check lies within `[0, SINT_MAX]`. The same is done for a signed latch. Values from `[0, SINT_MAX]` are unambiguous: these values are non-negative under any interpretation, and all values of a range intersected with such a range are also non-negative. We also use signed/unsigned min/max functions for range intersection depending on the type of the latch condition. Differential Revision: https://reviews.llvm.org/D38581 llvm-svn: 316552	2017-10-25 06:47:39 +00:00
Max Kazantsev	4332a943bc	[IRCE] Smarter detection of empty ranges using SCEV This patch replaces the naive emptiness check for SCEV ranges, which looks like `Begin == End`, with a SCEV check. The range is guaranteed to be empty if `Begin >= End`. We should filter such ranges out and not try to perform IRCE for them. For example, we can get such a range when intersecting ranges `[A, B)` and `[C, D)` where `A < B < C < D`. The resulting range is `[max(A, C), min(B, D)) = [C, B)`. This range is empty, but its `Begin` does not match its `End`. Performing IRCE for an empty range is basically safe but unprofitable because we never actually get into the main loop where the range checks are supposed to be eliminated. This patch uses SCEV mechanisms to treat loops with proved `Begin >= End` as empty. Differential Revision: https://reviews.llvm.org/D39082 llvm-svn: 316550	2017-10-25 06:10:02 +00:00
Eugene Zelenko	7f0f9bc5ab	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316503	2017-10-24 21:24:53 +00:00
Artem Belevich	cb8f6328dc	[NVPTX] allow address space inference for volatile loads/stores. If a particular target supports volatile memory access operations, we can avoid AS casting to generic AS. Currently it's only enabled in NVPTX for loads and stores that access global & shared AS. Differential Revision: https://reviews.llvm.org/D39026 llvm-svn: 316495	2017-10-24 20:31:44 +00:00
Adrian Prantl	d20442d383	Delete unused instantiations of DIBuilder. NFC llvm-svn: 316494	2017-10-24 20:26:17 +00:00
Marek Olsak	ce76ea0394	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 llvm-svn: 316427	2017-10-24 10:27:13 +00:00
Marek Olsak	2114fc3bcb	AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D38543 llvm-svn: 316426	2017-10-24 10:26:59 +00:00
Saleem Abdulrasool	619b3269fd	ObjCARC: do not increment past the end of the BB The `BasicBlock::getFirstInsertionPt` call may return `std::end` for the BB. Dereferencing the end iterator results in an assertion failure "(!NodePtr->isKnownSentinel()), function operator*". Ensure that the returned iterator is valid before dereferencing it. If the end is returned, move one position backward to get a valid insertion point. llvm-svn: 316401	2017-10-24 00:09:10 +00:00
Mandeep Singh Grang	9ed81c66ce	[GVNSink] Fix failing GVNSink tests in the reverse iteration bot Summary: The elts of ActivePreds which is defined as a SmallPtrSet are copied into Blocks using std::copy. This makes the resultant order of Blocks non-deterministic. We cannot simply sort Blocks as they need to match the corresponding Values. So a better approach is to define ActivePreds as SmallSetVector. This fixes the following failures in http://lab.llvm.org:8011/builders/reverse-iteration: LLVM :: Transforms/GVNSink/indirect-call.ll LLVM :: Transforms/GVNSink/sink-common-code.ll LLVM :: Transforms/GVNSink/struct.ll Reviewers: dberlin, jmolloy, bkramer, efriedma Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39025 llvm-svn: 316369	2017-10-23 19:56:52 +00:00
Sanjay Patel	b80daf0b48	[SimplifyCFG] delay switch condition forwarding to -latesimplifycfg As discussed in D39011: https://reviews.llvm.org/D39011 ...replacing constants with a variable is inverting the transform done by other IR passes, so we definitely don't want to do this early. In fact, it's questionable whether this transform belongs in SimplifyCFG at all. I'll look at moving this to codegen as a follow-up step. llvm-svn: 316298	2017-10-22 19:10:07 +00:00
Sanjay Patel	24226504a7	[SimplifyCFG] try harder to forward switch condition to phi (PR34471) The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen: int switcher(int x) { switch(x) { case 17: return 17; case 19: return 19; case 42: return 42; default: break; } return 0; } int comparator(int x) { if (x == 17) return 17; if (x == 19) return 19; if (x == 42) return 42; return 0; } For the first example, we use a bit-test optimization to avoid a series of compare-and-branch: https://godbolt.org/g/BivDsw Differential Revision: https://reviews.llvm.org/D39011 llvm-svn: 316293	2017-10-22 16:51:03 +00:00
David Green	907b60fbba	[LoopInterchange] Fix phi node ordering miscompile. The way that splitInnerLoopHeader splits blocks requires that the induction PHI will be the first PHI in the inner loop header. This makes sure that is actually the case when there are both IV and reduction phis. Differential Revision: https://reviews.llvm.org/D38682 llvm-svn: 316261	2017-10-21 13:58:37 +00:00
Eugene Zelenko	fce435764e	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316253	2017-10-21 00:57:46 +00:00
Eugene Zelenko	99241d75c1	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316241	2017-10-20 21:47:29 +00:00
Eugene Zelenko	bff0ef0324	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316190	2017-10-19 22:07:16 +00:00
Eugene Zelenko	f27d161bf0	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316187	2017-10-19 21:21:30 +00:00
Simon Pilgrim	0444e4fcd4	Fix MSVC signed/unsigned comparison warning llvm-svn: 316161	2017-10-19 15:00:31 +00:00
Max Kazantsev	3612d4b4f9	[NFC][IRCE] Filter out empty ranges early llvm-svn: 316146	2017-10-19 05:33:28 +00:00
whitequark	a99ecf1bbb	[MergeFunctions] Don't blindly RAUW a GlobalValue with a ConstantExpr. MergeFunctions uses (through FunctionComparator) a map of GlobalValues to identifiers because it needs to compare functions and globals do not have an inherent total order. Thus, FunctionComparator (through GlobalNumberState) has a ValueMap<GlobalValue *>. r315852 added a RAUW on globals that may have been previously encountered by the FunctionComparator, which would replace a GlobalValue * key with a ConstantExpr *, which is illegal. This commit adjusts that code path to remove the function being replaced from the ValueMap as well. llvm-svn: 316145	2017-10-19 04:47:48 +00:00
Chandler Carruth	3f0e056df4	[PM] Refactor the bounds checking pass to remove a method only called in one place. llvm-svn: 316135	2017-10-18 22:42:36 +00:00
Sanjoy Das	2f27456c82	Revert "[ScalarEvolution] Handling for ICmp occurring in the evolution chain." This reverts commit r316054. There was some confusion over the review process: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html llvm-svn: 316129	2017-10-18 22:00:57 +00:00
Eugene Zelenko	306d29977d	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316128	2017-10-18 21:46:47 +00:00
Jatin Bhateja	1fc49627e4	[ScalarEvolution] Handling for ICmp occurring in the evolution chain. Summary: If a compare instruction is the same as or the inverse of the compare in the branch of the loop latch, then return a constant evolution node. Currently the scope of evaluation is limited to SCEV computation for PHI nodes. This shall facilitate computations of loop exit counts in cases where a compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 316054	2017-10-18 01:36:16 +00:00
Michael Zolotukhin	c4fcc189d2	[GlobalDCE] Use DenseMap instead of unordered_multimap for GVDependencies. Summary: std::unordered_multimap happens to be very slow when the number of elements grows large. On one of our internal applications we observed a 17x compile time improvement from changing it to DenseMap. Reviewers: mehdi_amini, serge-sans-paille, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38916 llvm-svn: 316045	2017-10-17 23:47:06 +00:00
Eugene Zelenko	6cadde7f40	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316034	2017-10-17 21:27:42 +00:00
Vitaly Buka	524c0a639d	Fix signed overflow detected by ubsan This overflow does not affect algorithm, so just suppress it. llvm-svn: 316018	2017-10-17 18:33:15 +00:00
Philip Reames	6a7bbfb2e2	Revert 315440 on behalf of mkazantsev This patch reverts rL315440 because of the bug described at https://bugs.llvm.org/show_bug.cgi?id=34937 The fix for the bug is on review as D38944, but not yet ready. Given this is a regression, reverting until a fix is ready is called for. Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason. I did the build and test to ensure the revert worked as expected on his behalf. llvm-svn: 315974	2017-10-17 06:21:07 +00:00
Craig Topper	91259e2681	[JumpThreading] Move two PredValueInfoTy vectors to a scope closer to their usage. NFCI llvm-svn: 315941	2017-10-16 21:54:13 +00:00
Eugene Zelenko	dd40f5e7c1	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315940	2017-10-16 21:34:24 +00:00
Akira Hatanaka	e8c1a54c07	[ObjCARC] Do not move a release that has the clang.imprecise_release tag above PHI instructions. ARC optimizer has an optimization that moves a call to an ObjC runtime function above a phi instruction when the phi has a null operand and is an argument passed to the function call. This optimization should not kick in when the runtime function is an objc_release that releases an object with precise lifetime semantics. rdar://problem/34959669 llvm-svn: 315914	2017-10-16 16:46:59 +00:00
Sanjay Patel	42135beac8	[InstCombine] don't unnecessarily generate a constant; NFCI llvm-svn: 315910	2017-10-16 14:47:24 +00:00
NAKAMURA Takumi	414151a47e	Revert rL315894, "SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)" llvm-svn: 315896	2017-10-16 09:50:01 +00:00
Nikolai Bozhenov	0e7ebbccc7	Move folding of icmp with zero after checking for min/max idioms. Summary: The following transformation for cmp instruction: icmp smin(x, PositiveValue), 0 -> icmp x, 0 should only be done after checking for min/max to prevent infinite looping caused by a reverse canonicalization. That is why this transformation was moved to a place after the mentioned check. Reviewers: spatel, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38934 Patch by: Artur Gainullin <artur.gainullin@intel.com> llvm-svn: 315895	2017-10-16 09:19:21 +00:00
NAKAMURA Takumi	4543affa98	SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586) llvm-svn: 315894	2017-10-16 09:15:23 +00:00
Sanjay Patel	934738a3da	revert r314984: revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) Recommitting r314698. The bug exposed by this change should be fixed with: https://reviews.llvm.org/rL315579 llvm-svn: 315857	2017-10-15 15:39:15 +00:00
Sanjay Patel	30f30d37fb	[SimplifyCFG] use range-for-loops, tidy; NFCI There seems to be something missing here as shown in PR34471: https://bugs.llvm.org/show_bug.cgi?id=34471 llvm-svn: 315855	2017-10-15 14:43:39 +00:00
Aaron Ballman	615eb47035	Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people. Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1 llvm-svn: 315854	2017-10-15 14:32:27 +00:00
whitequark	ae12efab20	[MergeFunctions] Merge small functions if possible without a thunk. This can result in significant code size savings in some cases, e.g. an interrupt table all filled with the same assembly stub in a certain Cortex-M BSP results in code blowup by a factor of 2.5. Differential Revision: https://reviews.llvm.org/D34806 llvm-svn: 315853	2017-10-15 12:29:09 +00:00
whitequark	b2ce9ffede	[MergeFunctions] Replace all uses of unnamed_addr functions. This reduces code size for constructs like vtables or interrupt tables that refer to functions in global initializers. Differential Revision: https://reviews.llvm.org/D34805 llvm-svn: 315852	2017-10-15 12:29:01 +00:00
Hongbin Zheng	73f650435b	[LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC. This avoids code duplication and allows us to add the disable unroll metadata elsewhere. Differential Revision: https://reviews.llvm.org/D38928 llvm-svn: 315850	2017-10-15 07:31:02 +00:00
Sanjay Patel	b869f76d85	[InstCombine] use m_Neg() to reduce code; NFCI llvm-svn: 315762	2017-10-13 21:28:50 +00:00
Eugene Zelenko	3b87939604	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315760	2017-10-13 21:17:07 +00:00
Peter Collingbourne	868783e855	LowerTypeTests: Give imported symbols a type with size 0 so that they are not assumed not to alias. It is possible for both a base and a derived class to be satisfied with a unique vtable. If a program contains casts of the same pointer to both of those types, the CFI checks will be lowered to this (with ThinLTO): if (p != &__typeid_base_global_addr) trap(); if (p != &__typeid_derived_global_addr) trap(); The optimizer may then use the first condition combined with the assumption that __typeid_base_global_addr and __typeid_derived_global_addr may not alias to optimize away the second comparison, resulting in an unconditional trap. This patch fixes the bug by giving imported globals the type [0 x i8]*, which prevents the optimizer from assuming that they do not alias. Differential Revision: https://reviews.llvm.org/D38873 llvm-svn: 315753	2017-10-13 21:02:16 +00:00
Sanjay Patel	f0242de143	[InstCombine] move code to remove repeated constant check; NFCI Also, consolidate tests for this fold in one place. llvm-svn: 315745	2017-10-13 20:29:11 +00:00
Sanjay Patel	28b3aa3663	[InstCombine] recycle adds for better efficiency Also, clean up unnecessary matcher capture variable initializations. llvm-svn: 315743	2017-10-13 20:12:21 +00:00
Sanjay Patel	2118952162	[InstCombine] use local var to reduce code duplication; NFCI llvm-svn: 315728	2017-10-13 18:32:53 +00:00
Matthew Simpson	2284937bbc	[IPSCCP] Move common functions to ValueLatticeUtils (NFC) This patch moves some common utility functions out of IPSCCP and makes them available globally. The functions determine if interprocedural data-flow analyses can propagate information through function returns, arguments, and global variables. Differential Revision: https://reviews.llvm.org/D37638 llvm-svn: 315719	2017-10-13 17:53:44 +00:00
Sanjay Patel	c419c9f640	[InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing instructions llvm-svn: 315718	2017-10-13 17:47:25 +00:00
Sanjay Patel	76ed9eab29	[InstCombine] use AddOne helper to reduce code; NFC llvm-svn: 315709	2017-10-13 17:00:47 +00:00
Sanjay Patel	8d810fee43	[InstCombine] rearrange code to remove repeated constant check; NFCI llvm-svn: 315703	2017-10-13 16:43:58 +00:00
Sanjay Patel	2150651ac3	[InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types The backend should be prepared for this transform after: https://reviews.llvm.org/rL311731 llvm-svn: 315701	2017-10-13 16:29:38 +00:00
Daniel Neilson	fa14ebd138	[RS4GC] Look through vector bitcasts when looking for base pointer Summary: In RS4GC it is possible that a base pointer is contained in a vector that has undergone a bitcast from one element-pointertype to another. We teach RS4GC how to look through bitcasts of vector types when looking for a base pointer. Reviewers: anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38849 llvm-svn: 315694	2017-10-13 15:59:13 +00:00
Daniel Jasper	3344a21236	Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode" Significantly reduces performance (~30%) of gipfeli (https://github.com/google/gipfeli) I have not yet managed to reproduce this regression with the open-source version of the benchmark on github, but will work with others to get a reproducer to you later today. llvm-svn: 315680	2017-10-13 14:04:21 +00:00
Marco Castelluccio	0dcf64ad20	Disable gcov instrumentation of functions using funclet-based exception handling Summary: This patch fixes the crash from https://bugs.llvm.org/show_bug.cgi?id=34659 and https://bugs.llvm.org/show_bug.cgi?id=34833. Reviewers: rnk, majnemer Reviewed By: rnk, majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D38223 llvm-svn: 315677	2017-10-13 13:49:15 +00:00
Eugene Zelenko	5323550e9a	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315640	2017-10-12 23:30:03 +00:00
Anna Thomas	61aec18d46	[CVP] Process binary operations even when def is local Summary: This patch adds processing of binary operations when the def of operands are in the same block (i.e. local processing). Earlier we bailed out in such cases (the bail out was introduced in rL252032) because LVI at that time was more precise about context at the end of basic blocks, which implied local def and use analysis didn't benefit CVP. Since then we've added support for LVI in presence of assumes and guards. The test cases added show how local def processing in CVP helps adding more information to the ashr, sdiv, srem and add operators. Note: processCmp which suffers from the same problem will be handled in a later patch. Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38766 llvm-svn: 315634	2017-10-12 22:39:52 +00:00
Artur Pilipenko	ead69ee4bd	[LoopPredication] Check whether the loop is already guarded by the first iteration check condition llvm-svn: 315623	2017-10-12 21:21:17 +00:00
Bruno Cardoso Lopes	993d2e67d8	Revert "Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."" This reverts commit r315593: still affects two bots: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5308 http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21751/ llvm-svn: 315618	2017-10-12 20:52:34 +00:00
Artur Pilipenko	b4527e1ce2	[LoopPredication] Support ule, sle latch predicates This is a follow up for the loop predication change 313981 to support ule, sle latch predicates. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D38177 llvm-svn: 315616	2017-10-12 20:40:27 +00:00
Bruno Cardoso Lopes	326fdcbff8	Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP." This is r315288 & r315294, which were reverted due to stage2 bot failures. Summary: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315593	2017-10-12 16:54:11 +00:00
Don Hinton	3e0199f7eb	[dump] Remove NDEBUG from test to enable dump methods [NFC] Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590	2017-10-12 16:16:06 +00:00
Hongbin Zheng	d36f2030e2	[SimplifyIndVar] Replace IVUsers with loop invariant whenever possible Differential Revision: https://reviews.llvm.org/D38415 llvm-svn: 315551	2017-10-12 02:54:11 +00:00
Zachary Turner	41a9ee98f9	Revert "[ADT] Make Twine's copy constructor private." This reverts commit 4e4ee1c507e2707bb3c208e1e1b6551c3015cbf5. This is failing due to some code that isn't built on MSVC so I didn't catch. Not immediately obvious how to fix this at first glance, so I'm reverting for now. llvm-svn: 315536	2017-10-11 23:54:34 +00:00
Zachary Turner	337462b365	[ADT] Make Twine's copy constructor private. There's a lot of misuse of Twine scattered around LLVM. This ranges in severity from benign (returning a Twine from a function by value that is just a string literal) to pretty sketchy (storing a Twine by value in a class). While there are some uses for copying Twines, most of the very compelling ones are confined to the Twine class implementation itself, and other uses are either dubious or easily worked around. This patch makes Twine's copy constructor private, and fixes up all callsites. Differential Revision: https://reviews.llvm.org/D38767 llvm-svn: 315530	2017-10-11 23:33:06 +00:00
Eugene Zelenko	6f1ae631f7	[Transforms] Revert r315516 changes in PredicateInfo to fix Windows build bots (NFC). llvm-svn: 315519	2017-10-11 21:56:44 +00:00
Eugene Zelenko	286d5897d6	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315516	2017-10-11 21:41:43 +00:00
Vivek Pandya	9590658fb8	[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure parameterized emit() calls Summary: This is a non-functional change to adopt the new emit() API added in r313691. Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38285 llvm-svn: 315476	2017-10-11 17:12:59 +00:00
Max Kazantsev	fecaff1bd9	[NFC] Fix variables used only for assert in GVN llvm-svn: 315448	2017-10-11 10:31:49 +00:00
Max Kazantsev	3b81809e06	[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440	2017-10-11 08:10:43 +00:00
Max Kazantsev	0c8dd052b8	[LICM] Disallow sinking of unordered atomic loads into loops Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this prohibits any transformation that > transforms a single load into multiple loads, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438	2017-10-11 07:26:45 +00:00
Max Kazantsev	25d8655dc2	[IRCE] Do not process empty safe ranges IRCE should not apply when the safe iteration range is proved to be empty. In this case we do the unneeded work of creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds a test for this situation and also slightly modifies one of the existing tests where it used to happen. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437	2017-10-11 06:53:07 +00:00
Davide Italiano	e2138fe41b	[GVN] Don't replace constants with constants. This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429	2017-10-11 04:21:51 +00:00
Eugene Zelenko	e9ea08a097	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315383	2017-10-10 22:49:55 +00:00
Dehao Chen	3f56a05ae5	Use the first instruction's count to estimate the function's entry frequency. Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D38478 llvm-svn: 315369	2017-10-10 21:13:50 +00:00
Bruno Cardoso Lopes	57304923ca	Revert "[SCCP] Propagate integer range info for parameters in IPSCCP." This reverts commit r315288. This is part of fixing segfault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315329	2017-10-10 16:37:57 +00:00
Bruno Cardoso Lopes	122c4b3c8c	Revert "[SCCP] Fix mem-sanitizer failure introduced by r315288." This reverts commit r315294. Part of fixing seg fault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315328	2017-10-10 16:37:51 +00:00
Florian Hahn	7d2375df30	[SCCP] Fix mem-sanitizer failure introduced by r315288. llvm-svn: 315294	2017-10-10 10:33:45 +00:00
Florian Hahn	22a44bca40	[SCCP] Propagate integer range info for parameters in IPSCCP. Summary: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315288	2017-10-10 09:32:38 +00:00
Clement Courbet	e2e8a5c496	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed stability issues) This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc. llvm-svn: 315281	2017-10-10 08:00:45 +00:00
Xinliang David Li	4cdc9dab0a	Re-enable r314928 Eliminate inttype phi with inttoptr/ptrtoint. This version fixes a bug in finding the matching phi -- the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added. llvm-svn: 315272	2017-10-10 05:07:54 +00:00
Adam Nemet	0965da2055	Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249	2017-10-09 23:19:02 +00:00
Sanjay Patel	ce36b03b03	[InstCombine] fix formatting; NFC llvm-svn: 315223	2017-10-09 17:54:46 +00:00
Sanjay Patel	72d339abb7	[InstCombine] use correct type when propagating constant condition in simplifyDivRemOfSelectWithZeroOp (PR34856) llvm-svn: 315130	2017-10-06 23:43:06 +00:00
Sanjay Patel	ae2e3a44d2	[InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments, simplify code; NFCI There's at least one bug here - this code can fail with vector types (PR34856). It's also being called for FREM; I'm still trying to understand how that is valid. llvm-svn: 315127	2017-10-06 23:20:16 +00:00
Reid Kleckner	b6b210e61f	Revert "Roll forward r314928" This appears to be miscompiling Clang, as shown on two Windows bootstrap bots: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611 http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870 Nothing else is in the blame list. Both emit errors on this valid code in the Windows ucrt headers: C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char *' and 'int') _Ptr = (char*)_Ptr + _ALLOCA_S_MARKER_SIZE; ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~ I am attempting to reproduce this now. This reverts r315044 llvm-svn: 315108	2017-10-06 21:17:51 +00:00
Dehao Chen	9bd60429e2	Directly return promoted direct call instead of relying on stripPointerCast. Summary: stripPointerCast is not reliably returning the value that's being type-casted. Instead it may look further at function attributes to further propagate the value. Instead of relying on stripPointerCast, the more reliable solution is to directly use the pointer to the promoted direct call. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38603 llvm-svn: 315077	2017-10-06 17:04:55 +00:00
Clement Courbet	d12c189e2e	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Still a few stability issues on windows. This reverts commit 67e3db9bc121ba244e20337aabc7cf341a62b545. llvm-svn: 315058	2017-10-06 13:02:24 +00:00
Clement Courbet	4e1bae8136	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed unit tests by making comparisons stable) This reverts commit 1b2d359ce256fd6737da4e93833346a0bd6d7583. llvm-svn: 315056	2017-10-06 12:12:35 +00:00
Xinliang David Li	bcd36f7c5a	Roll forward r314928 Fixed ThinLTO bootstrap failure : track new bitcast per incomingVal. Added new tests. llvm-svn: 315044	2017-10-06 05:15:25 +00:00
Davide Italiano	c74ea93b8c	[PM] Retire disable unit-at-a-time switch. This is a vestige from the GCC-3 days, which disables IPO passes when set. I don't think anybody actually uses it as there are several IPO passes which still run with this flag set and nobody complained/noticed. This reduces the delta between current and new pass manager and allows us to easily review the difference when we decide to flip the switch (or audit which passes should run, FWIW). llvm-svn: 315043	2017-10-06 04:39:40 +00:00
Jakub Kuderski	cbe9fae99d	[CodeExtractor] Fix multiple bugs under certain shape of extracted region Summary: If the extracted region has multiple exported data flows toward the same BB which is not included in the region, correct restore instructions and PHI nodes won't be generated inside the exitStub. The solution is simply to put the restore instructions right after the definition of output values instead of putting them in exitStub. Unittest for this bug is included. Author: myhsu Reviewers: chandlerc, davide, lattner, silvas, davidxl, wmi, kuhar Subscribers: dberlin, kuhar, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D37902 llvm-svn: 315041	2017-10-06 03:37:06 +00:00
Daniel Berlin	08dd582ea0	NewGVN: Factor out duplicate parts of OpIsSafeForPHIOfOps llvm-svn: 315040	2017-10-06 01:33:06 +00:00
Peter Collingbourne	715bcfe0c9	ModuleUtils: Stop using comdat members to generate unique module ids. It is possible for two modules to define the same set of external symbols without causing a duplicate symbol error at link time, as long as each of the symbols is a comdat member. So we cannot use them as part of a unique id for the module. Differential Revision: https://reviews.llvm.org/D38602 llvm-svn: 315026	2017-10-05 21:54:53 +00:00
Sanjay Patel	7ac2db6a48	[InstCombine] improve folds for icmp gt/lt (shr X, C1), C2 We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C' This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: ADD: %1 = udiv i4 %x, 2 IC: Old = %c = icmp ugt i4 %1, 1 New = <badref> = icmp uge i4 %x, 4 IC: ADD: %c = icmp uge i4 %x, 4 IC: ERASE %2 = icmp ugt i4 %1, 1 IC: Visiting: %c = icmp uge i4 %x, 4 IC: Old = %c = icmp uge i4 %x, 4 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %2 = icmp uge i4 %x, 4 IC: Visiting: %c = icmp ugt i4 %x, 3 IC: DCE: %1 = udiv i4 %x, 2 IC: ERASE %1 = udiv i4 %x, 2 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: ret i1 %c When we could go directly to canonical icmp form: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: Old = %c = icmp ugt i4 %s, 1 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %1 = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: %c = icmp ugt i4 %x, 3 ...but then I noticed that the folds were incomplete too: https://godbolt.org/g/aB2hLE Here are attempts to prove the logic with Alive: https://rise4fun.com/Alive/92o Name: lshr_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) Name: ashr_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_ugt Pre: (((C2+1) << C1) u>> C1) == (C2+1) %sh = lshr i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, ((C2+1) << C1) - 1 Name: ashr_sgt Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1) %sh = ashr i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, ((C2+1) << C1) - 1 Name: ashr_exact_sgt Pre: ((C2 << C1) >> C1) == C2 %sh = 
ashr exact i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, (C2 << C1) Name: ashr_exact_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_exact_ugt Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, (C2 << C1) Name: lshr_exact_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) We did something similar for 'shl' in D28406. Differential Revision: https://reviews.llvm.org/D38514 llvm-svn: 315021	2017-10-05 21:11:49 +00:00
Dehao Chen	16f01fb1db	Annotate VP prof on indirect call if it is ICPed in the profiled binary. Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D38477 llvm-svn: 315016	2017-10-05 20:15:29 +00:00
Davide Italiano	c8708e59e8	[PassManager] Improve the interaction between -O2 and ThinLTO. Run GDCE slightly later so that we don't have to repeat it twice when preparing for Thin. Thanks to Mehdi for the suggestion. llvm-svn: 314999	2017-10-05 18:23:25 +00:00
Davide Italiano	ff829cea8b	[PassManager] Run global optimizations after the inliner. The inliner performs some kind of dead code elimination as it goes, but there are cases that are not really caught by it. We might at some point consider teaching the inliner about them, but it is OK for now to run GlobalOpt + GlobalDCE in tandem as their benefits generally outweigh the cost, making the whole pipeline faster. This fixes PR34652. Differential Revision: https://reviews.llvm.org/D38154 llvm-svn: 314997	2017-10-05 18:06:37 +00:00
Ayal Zaks	c9e0f886e5	[LV] Fix PR34743 - handle casts that sink after interleaved loads When ignoring a load that participates in an interleaved group, make sure to move a cast that needs to sink after it. Testcase derived from reproducer of PR34743. Differential Revision: https://reviews.llvm.org/D38338 llvm-svn: 314986	2017-10-05 15:45:14 +00:00
Clement Courbet	922e5bc698	Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.""" broken test on windows This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca. llvm-svn: 314985	2017-10-05 14:42:06 +00:00
Sanjay Patel	f11b5b4f87	revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) There is a bot failure that appears to be related to this change: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117 ...so reverting to confirm that and attempting to keep the bot green while investigating. llvm-svn: 314984	2017-10-05 14:26:15 +00:00
Ayal Zaks	fc3f7a4f0c	[LV] Fix PR34711 - widen instruction ranges when sinking casts Instead of trying to keep LastWidenRecipe updated after creating each recipe, have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check if it's a VPWidenRecipe when attempting to extend its range. This ensures that such extensions, optimized to maintain the original instruction order, do so only when the instructions are to maintain their relative order. The latter does not always hold, e.g., when a cast needs to sink to unravel first order recurrence (r306884). Testcase derived from reproducer of PR34711. Differential Revision: https://reviews.llvm.org/D38339 llvm-svn: 314981	2017-10-05 12:41:49 +00:00
Clement Courbet	4cafbb9b5e	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."" llvm-svn: 314980	2017-10-05 12:39:57 +00:00
Clement Courbet	6603fc0e7b	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Breaks clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa. llvm-svn: 314972	2017-10-05 08:03:39 +00:00
Craig Topper	17b0c78447	[InstCombine] Fix a vector splat handling bug in transformZExtICmp. We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us. This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasn't been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first. Fixes PR34841. llvm-svn: 314971	2017-10-05 07:59:11 +00:00
Clement Courbet	902eef32eb	[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion. Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later. Reviewers: dberlin, spatel Subscribers: davide, nemanjai Differential Revision: https://reviews.llvm.org/D38232 llvm-svn: 314970	2017-10-05 07:49:09 +00:00
Xinliang David Li	04ab11a08a	Revert r314928 to investigate thinLTO bootstrap failure llvm-svn: 314961	2017-10-05 01:40:13 +00:00
Craig Topper	7a93092399	[InstCombine] Improve support for ashr in foldICmpAndShift We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users. Differential Revision: https://reviews.llvm.org/D38521 llvm-svn: 314945	2017-10-04 23:06:13 +00:00
Hans Wennborg	899809d531	Fix a -Wparentheses warning. NFC. llvm-svn: 314936	2017-10-04 21:14:07 +00:00
Marcello Maggioni	df3e71e037	[LoopDeletion] Move deleteDeadLoop to LoopUtils. NFC llvm-svn: 314934	2017-10-04 20:42:46 +00:00
Sanjay Patel	4c33d5213b	[SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI This is a follow-up to https://reviews.llvm.org/D38138. I fixed the capitalization of some functions because we're changing those lines anyway and that helped verify that we weren't accidentally dropping any options by using default param values. llvm-svn: 314930	2017-10-04 20:26:25 +00:00
Xinliang David Li	7a73757358	Recommit r314561 after fixing msan build failure (trial 2) Incoming val defined by terminator instruction which also requires bitcasts can not be handled. llvm-svn: 314928	2017-10-04 20:17:55 +00:00
Jun Bum Lim	d40e03c2d8	Recommit : Use the basic cost if a GEP is not used as addressing mode Recommitting r314517 with the fix for handling ConstantExpr. Original commit message: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if an user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. llvm-svn: 314923	2017-10-04 18:33:52 +00:00
Clement Courbet	98eaa88357	[NFC] clang-format lib/Transforms/Scalar/MergeICmps.cpp llvm-svn: 314906	2017-10-04 15:13:52 +00:00
Max Kazantsev	8aacef6cae	[IRCE] Temporarily disable unsigned latch conditions by default We have found some corner cases connected to range intersection where IRCE does the wrong thing when the latch condition is unsigned. The fix for that will come as a follow-up. This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed. The unsigned latch conditions were introduced to IRCE by rL310027. Differential Revision: https://reviews.llvm.org/D38529 llvm-svn: 314881	2017-10-04 06:53:22 +00:00
Craig Topper	df63b96811	[InstCombine] Use isSignBitCheck to simplify an if statement. Directly create new sign bit compares instead of manipulating the constant. NFCI Since we no longer had the direct constant compares, manipulating the constant seemed less clear. llvm-svn: 314830	2017-10-03 19:14:23 +00:00
Hans Wennborg	9a9048e19f	Revert r314806 "[SLP] Vectorize jumbled memory loads." All the buildbots are red, e.g. http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/ > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' of > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh > > Reviewed By: Ayal > > Subscribers: hans, mzolotukhin > > Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314824	2017-10-03 18:32:29 +00:00
Dehao Chen	ea523ddb1b	Revert the change that accidentally went in r314806. llvm-svn: 314807	2017-10-03 15:50:42 +00:00
Mohammad Shahid	1d5422f27f	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: hans, mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314806	2017-10-03 15:28:48 +00:00
Craig Topper	8ed1aa91bd	[InstCombine] Change a bunch of methods to take APInts by reference instead of pointer. This allows us to remove a bunch of dereferences and only have a few dereferences at the call sites. llvm-svn: 314762	2017-10-03 05:31:07 +00:00
Craig Topper	664c4d0190	[InstCombine] Replace an equality compare of two APInt pointers with a compare of the APInts themselves. Apparently this works by virtue of the fact that the pointers are pointers to the APInts stored inside of the ConstantInt objects. But I really don't think we should be relying on that. llvm-svn: 314761	2017-10-03 04:55:04 +00:00
Davide Italiano	c48d1c8519	[PassManager] Retire cl::opt that have been set for a while. NFCI. llvm-svn: 314740	2017-10-02 23:39:20 +00:00
Sanjay Patel	6e17c00a88	[InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) llvm-svn: 314698	2017-10-02 18:26:44 +00:00
Dehao Chen	f464627f28	Update getMergedLocation to check the instruction type and merge properly. Summary: If the merged instruction is a call instruction, we need to set the scope to the closest common scope between the 2 locations, otherwise it will cause trouble when the call is getting inlined. Reviewers: dblaikie, aprantl Reviewed By: dblaikie, aprantl Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D37877 llvm-svn: 314694	2017-10-02 18:13:14 +00:00
Craig Topper	6e025a3ecc	[InstCombine] Use APInt for all the math in foldICmpDivConstant Summary: This currently uses ConstantExpr to do its math, but as noted in a TODO it can all be done directly on APInt. Reviewers: spatel, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38440 llvm-svn: 314640	2017-10-01 23:53:54 +00:00
Daniel Jasper	3c9c60c727	Revert r314579: "Recommi r314561 after fixing over-debug assertion". And follow-up r314585. Leads to segfaults. I'll forward reproduction instructions to the patch author. Also, for a recommit, still add the original patch description. Otherwise, it becomes really tedious to find out what a patch actually does. The fact that it is a recommit with a fix is somewhat secondary. llvm-svn: 314622	2017-10-01 09:53:53 +00:00
Dehao Chen	d26dae0d34	Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases. Summary: In SamplePGO ThinLTO compile phase, we will not invoke ICP as it may introduce confusion to the 2nd annotation. This patch extracted that logic and makes it clearer before profile annotation. In the mean time, we need to make function importing process both inlined callsites as well as not promoted indirect callsites. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D38094 llvm-svn: 314619	2017-10-01 05:24:51 +00:00
Daniel Berlin	d36c27bedb	NewGVN: Fix PR 34473, by not using ExactlyEqualsExpression for finding phi of ops users. llvm-svn: 314612	2017-09-30 23:51:55 +00:00
Daniel Berlin	c1305af09b	NewGVN: Evaluate phi of ops expressions before creating phi node llvm-svn: 314611	2017-09-30 23:51:54 +00:00
Daniel Berlin	9b926e90d3	NewGVN: Allow dependent PHI of ops llvm-svn: 314610	2017-09-30 23:51:53 +00:00
Daniel Berlin	de6958ee85	NewGVN: Make OpIsSafeForPhiOfOps non-recursive llvm-svn: 314609	2017-09-30 23:51:04 +00:00
Dehao Chen	4f5d830343	Refactor the SamplePGO profile annotation logic to extract inlineCallInstruction. (NFC) llvm-svn: 314601	2017-09-30 20:46:15 +00:00
Daniel Jasper	0a51ec29c9	Revert r314435: "[JumpThreading] Preserve DT and LVI across the pass" Causes a segfault on a buildbot (and in our internal bootstrapping of Clang). See Eli's response on the commit thread. llvm-svn: 314589	2017-09-30 11:57:19 +00:00
Xinliang David Li	b8aac3ac19	Fix buildbot failure -- tighten type check for matching phi llvm-svn: 314585	2017-09-30 05:27:46 +00:00
Xinliang David Li	3409d9c07f	Recommi r314561 after fixing over-debug assertion llvm-svn: 314579	2017-09-30 00:46:32 +00:00
Xinliang David Li	455dec098b	Revert 314561 due to debug build assertion failure llvm-svn: 314563	2017-09-29 22:30:34 +00:00
Xinliang David Li	5b9d96825b	Eliminate PHI (int typed) which has only one use by intptr This patch will eliminate redundant intptr/ptrtoint that pessimizes analyses such as SCEV, AA and will make optimization passes such as auto-vectorization more powerful. Differential revision: http://reviews.llvm.org/D37832 llvm-svn: 314561	2017-09-29 22:10:15 +00:00
Alex Shlyapnikov	e76aa3b0b2	Revert "Use the basic cost if a GEP is not used as addressing mode" This reverts commit r314517. This commit crashes sanitizer bots, for example: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167 Stack snippet: ... /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0 llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const, llvm::ArrayRef<llvm::Value const>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0 llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const, llvm::ArrayRef<llvm::Value const>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0 ... llvm-svn: 314560	2017-09-29 22:04:45 +00:00
Matthew Simpson	f4bb480b62	[LV] Use correct insertion point when type shrinking reductions When type shrinking reductions, we should insert the truncations and extends at the end of the loop latch block. Previously, these instructions were inserted at the end of the loop header block. The difference is only a problem for loops with predicated instructions (e.g., conditional stores and instructions that may divide by zero). For these instructions, we create new basic blocks inside the vectorized loop, which cause the loop header and latch to no longer be the same block. This should fix PR34687. Reference: https://bugs.llvm.org/show_bug.cgi?id=34687 llvm-svn: 314542	2017-09-29 18:07:39 +00:00
Hongbin Zheng	c8abdf5f25	[SimplifyIndVar] Do not fail when we constant fold an IV user to ConstantPointerNull The type of a SCEVConstant may not match the corresponding LLVM Value. In this case, we skip the constant folding for now. TODO: Replace ConstantInt Zero by ConstantPointerNull llvm-svn: 314531	2017-09-29 16:32:12 +00:00
Jun Bum Lim	0e16a59e83	Use the basic cost if a GEP is not used as addressing mode Summary: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if a user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng Reviewed By: hfinkel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38085 llvm-svn: 314517	2017-09-29 14:50:16 +00:00
Sanjoy Das	0ac5ba5ade	Revert "[BypassSlowDivision] Improve our handling of divisions by constants" This reverts commit r314253. It causes a miscompile on P100 in an internal benchmark. Reverting while I investigate. llvm-svn: 314482	2017-09-29 00:54:16 +00:00
Evandro Menezes	3701df55c6	[JumpThreading] Preserve DT and LVI across the pass JumpThreading now preserves dominance and lazy value information across the entire pass. The pass manager is also informed of this preservation with the goal of DT and LVI being recalculated fewer times overall during compilation. This change prepares JumpThreading for enhanced opportunities; particularly those across loop boundaries. Patch by: Brian Rzycki <b.rzycki@samsung.com>, Sebastian Pop <s.pop@samsung.com> Differential revision: https://reviews.llvm.org/D37528 llvm-svn: 314435	2017-09-28 17:24:40 +00:00
Benjamin Kramer	c965b30e54	[LoopUnroll] Fix use after poison. llvm-svn: 314418	2017-09-28 14:47:39 +00:00
Sanjoy Das	def1729dc4	Use a BumpPtrAllocator for Loop objects Summary: And now that we no longer have to explicitly free() the Loop instances, we can (with more ease) use the destructor of LoopBase to do what LoopBase::clear() was doing. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38201 llvm-svn: 314375	2017-09-28 02:45:42 +00:00
Craig Topper	0cd25942f7	Revert r314017 '[InstCombine] Simplify check for RHS being a splat constant in foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt.' This reverts r314017 and similar code added in later commits. It seems to not work for pointer compares and is causing a bot failure for the last several days. llvm-svn: 314360	2017-09-27 22:57:18 +00:00
Rui Ueyama	0dbb0f107e	Fix -Wunused-variable for Release build. llvm-svn: 314353	2017-09-27 22:03:15 +00:00
Sanjoy Das	4f3ebd537c	Return the LoopUnrollResult from tryToUnrollLoop; NFC I will use this in a later change. llvm-svn: 314352	2017-09-27 21:45:22 +00:00
Sanjoy Das	8e8c1bc490	LoopDeletion: use return value instead of passing in LPMUpdater; NFC I will use this refactoring in a later patch. llvm-svn: 314351	2017-09-27 21:45:21 +00:00
Sanjoy Das	3567d3d2ec	Rename LoopUnrollStatus to LoopUnrollResult; NFC A "Result" suffix is more appropriate here llvm-svn: 314350	2017-09-27 21:45:19 +00:00
Alexey Bataev	022cc6c41e	[SLP] Fix crash on propagate IR flags for undef operands of min/max reductions. If both operands of the newly created SelectInst are Undefs the resulting operation is also Undef, not SelectInst. It may cause crashes when trying to propagate IR flags because function expects exactly SelectInst instruction, nothing else. llvm-svn: 314323	2017-09-27 17:42:49 +00:00
Chad Rosier	d8b4b06f5d	[InstCombine] Gating select arithmetic optimization. These changes facilitate positive behavior for arithmetic based select expressions that match its translation criteria, keeping code size gated to neutral or improved scenarios. Patch by Michael Berg <michael_c_berg@apple.com>! Differential Revision: https://reviews.llvm.org/D38263 llvm-svn: 314320	2017-09-27 17:16:51 +00:00
Sanjay Patel	fee80d5e65	[SLP] fix typos/formatting; NFC llvm-svn: 314315	2017-09-27 16:32:56 +00:00
Sanjay Patel	0f9b4773c1	[SimplifyCFG] add a struct to house optional folds (PR34603) This was intended to be no-functional-change, but it's not - there's a test diff. So I thought I should stop here and post it as-is to see if this looks like what was expected based on the discussion in PR34603: https://bugs.llvm.org/show_bug.cgi?id=34603 Notes: 1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'. The parameter isn't passed down, so we pick up the default value from the function signature after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the 'SimplifyCFG' calls. 2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops. This would theoretically allow us to differentiate the transforms controlled by those params independently. 3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too. I just stopped here to minimize the diffs. 4. Similarly, I stopped short of messing with the pass manager layer. I have another question that could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG set to true no matter where in the pipeline it's creating SimplifyCFG passes? // Create an early function pass manager to cleanup the output of the // frontend. EarlyFPM.addPass(SimplifyCFGPass()); --> /// \brief Construct a pass with the default thresholds /// and switch optimizations. SimplifyCFGPass::SimplifyCFGPass() : BonusInstThreshold(UserBonusInstThreshold), LateSimplifyCFG(true) {} <-- switches get converted to lookup tables and loops may not be in canonical form If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG' setting via recursion was masking this bug. Differential Revision: https://reviews.llvm.org/D38138 llvm-svn: 314308	2017-09-27 14:54:16 +00:00
Hongbin Zheng	d1b7b2efba	[SimplifyIndVar] Constant fold IV users This patch tries to transform cases like: for (unsigned i = 0; i < N; i += 2) { bool c0 = (i & 0x1) == 0; bool c1 = ((i + 1) & 0x1) == 1; } To for (unsigned i = 0; i < N; i += 2) { bool c0 = true; bool c1 = true; } This commit also updates test/Transforms/IndVarSimplify/replace-srem-by-urem.ll to prevent constant folding. Differential Revision: https://reviews.llvm.org/D38272 llvm-svn: 314266	2017-09-27 03:11:46 +00:00
Sanjoy Das	eda7a86d42	[BypassSlowDivision] Improve our handling of divisions by constants Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow. This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 llvm-svn: 314253	2017-09-26 21:54:27 +00:00
Craig Topper	8bf622174d	[InstCombine] Remove one use restriction on the shift for calls to foldICmpAndShift. If this transformation succeeds, we're going to remove our dependency on the shift by rewriting the and. So it doesn't matter how many uses the shift has. This distributes the one use check to other transforms in foldICmpAndConstConst that do need it. Differential Revision: https://reviews.llvm.org/D38206 llvm-svn: 314233	2017-09-26 18:47:25 +00:00
Sanjay Patel	1d04b5bacf	[DSE] Merge stores when the later store only writes to memory locations the early store also wrote to (2nd try) This is a 2nd attempt at: https://reviews.llvm.org/rL310055 ...which was reverted at rL310123 because of PR34074: https://bugs.llvm.org/show_bug.cgi?id=34074 In this version, we break out of the inner loop after we successfully merge and kill a pair of stores. In the earlier rev, we were continuing instead, which meant we could process the invalid info from a now dead store. Original commit message (authored by Filipe Cabecinhas): This fixes PR31777. If both stores' values are ConstantInt, we merge the two stores (shifting the smaller store appropriately) and replace the earlier (and larger) store with an updated constant. In the future we should also support vectors of integers. And maybe float/double if we can. Differential Revision: https://reviews.llvm.org/D30703 llvm-svn: 314206	2017-09-26 13:54:28 +00:00
Sylvestre Ledru	e7d4cd639b	Don't move llvm.localescape outside the entry block in the GCOV profiling pass Summary: This fixes https://bugs.llvm.org/show_bug.cgi?id=34714. Patch by Marco Castelluccio Reviewers: rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38224 llvm-svn: 314201	2017-09-26 11:56:43 +00:00
Matthias Braun	cc603ee3d5	TargetLibraryInfo: Stop guessing wchar_t size Usually the frontend communicates the size of wchar_t via metadata and we can optimize wcslen (and possibly other calls in the future). In cases without the wchar_size metadata we would previously try to guess the correct size based on the target triple; however this is fragile to keep up to date and may miss users manually changing the size via flags. Better be safe and stop guessing and optimizing if the frontend didn't communicate the size. Differential Revision: https://reviews.llvm.org/D38106 llvm-svn: 314185	2017-09-26 02:36:57 +00:00
Vlad Tsyrklevich	998b220e97	Add section headers to SpecialCaseLists Summary: Sanitizer blacklist entries currently apply to all sanitizers--there is no way to specify that an entry should only apply to a specific sanitizer. This is important for Control Flow Integrity since there are several different CFI modes that can be enabled at once. For maximum security, CFI blacklist entries should be scoped to only the specific CFI mode(s) that entry applies to. Adding section headers to SpecialCaseLists allows users to specify more information about list entries, like sanitizer names or other metadata, like so: [section1] fun:fun1 [section2|section3] fun:fun23 The section headers are regular expressions. For backwards compatibility, blacklist entries entered before a section header are put into the '[*]' section so that blacklists without sections retain the same behavior. SpecialCaseList has been modified to also accept a section name when matching against the blacklist. It has also been modified so the follow-up change to clang can define a derived class that allows matching sections by SectionMask instead of by string. Reviewers: pcc, kcc, eugenis, vsk Reviewed By: eugenis, vsk Subscribers: vitalybuka, llvm-commits Differential Revision: https://reviews.llvm.org/D37924 llvm-svn: 314170	2017-09-25 22:11:11 +00:00
Craig Topper	30dc9797e9	[InstCombine] Move an optimization from foldICmpAndConstConst to foldICmpUsingKnownBits All this optimization cares about is knowing how many low bits of LHS is known to be zero and whether that means that the result is 0 or greater than the RHS constant. It doesn't matter where the zeros in the low bits came from. So we don't need to specifically look for an AND. Instead we can use known bits. Differential Revision: https://reviews.llvm.org/D38195 llvm-svn: 314153	2017-09-25 21:15:00 +00:00
Sanjay Patel	ecb175608f	[InstCombine] remove extract-of-select vector transform (2nd try) The 1st attempt at this: https://reviews.llvm.org/rL314117 was reverted at: https://reviews.llvm.org/rL314118 because of bot fails for clang tests that were checking optimized IR. That should be fixed with: https://reviews.llvm.org/rL314144 ...so try again. Original commit message: The transform to convert an extract-of-a-select-of-vectors was added at: https://reviews.llvm.org/rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT> Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314147	2017-09-25 20:30:53 +00:00
Hongbin Zheng	bbe448abd8	[SimplifyIndvar] Minor change to refine r314125, NFC llvm-svn: 314130	2017-09-25 18:10:36 +00:00
Hongbin Zheng	f0093e45c4	[SimplifyIndvar] Replace the srem used by IV if we can prove both of its operands are non-negative Since now SCEV can handle 'urem', an 'urem' is a better canonical form than an 'srem' because it has well-defined behavior. This is a follow up of D34598 Differential Revision: https://reviews.llvm.org/D38072 llvm-svn: 314125	2017-09-25 17:39:40 +00:00
Sanjay Patel	aa7f750bec	revert r314117 because there are bogus clang tests that depend on the optimizer llvm-svn: 314118	2017-09-25 17:00:04 +00:00
Sanjay Patel	9639897d77	[InstCombine] remove extract-of-select vector transform The transform to convert an extract-of-a-select-of-vectors was added at: rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT> Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314117	2017-09-25 16:41:34 +00:00
Alexey Bataev	ccce7afee8	[SLP] Support for horizontal min/max reduction. Summary: SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Reviewers: spatel, mkuper, hfinkel, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27846 llvm-svn: 314101	2017-09-25 13:34:59 +00:00
Craig Topper	ea927baee2	[InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to equality comparisons when the min/max ranges intersect in a single value. This is the inverse of what we do for SGT/SLT/UGT/ULT. llvm-svn: 314032	2017-09-22 21:47:22 +00:00
Craig Topper	3f364aa908	[InstCombine] Add constant splat handling to one of the ICMP_SLT/SGT cases in foldICmpUsingKnownBits. llvm-svn: 314025	2017-09-22 19:54:15 +00:00
Craig Topper	3edda87c42	[InstCombine] Move the call to isSignBitCheck into getDemandedBitsLHSMask instead of calling it outside and passing its result through a flag. NFCI The result of the isSignBitCheck isn't used anywhere else and this allows us to share the m_APInt call in the likely case that it isn't a sign bit check. llvm-svn: 314018	2017-09-22 18:57:23 +00:00
Craig Topper	5b35b68785	[InstCombine] Simplify check for RHS being a splat constant in foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt. llvm-svn: 314017	2017-09-22 18:57:22 +00:00
Craig Topper	2c9b7d7894	[InstCombine] Make cases for ICMP_UGT/ICMP_ULT use similar formatting since they use similar code. NFC llvm-svn: 314016	2017-09-22 18:57:20 +00:00
Artur Pilipenko	889dc1e3a5	Rework loop predication pass We've found a serious issue with the current implementation of loop predication. The current implementation relies on SCEV and this turned out to be problematic. To fix the problem we had to rework the pass substantially. We have had the reworked implementation in our downstream tree for a while. This is the initial patch of the series of changes to upstream the new implementation. For now the transformation is limited to the following case: * The loop has a single latch with either ult or slt icmp condition. * The step of the IV used in the latch condition is 1. * The IV of the latch condition is the same as the post increment IV of the guard condition. * The guard condition is ult. See the review or the LoopPredication.cpp header for the details about the problem and the new implementation. Reviewed By: sanjoy, mkazantsev Differential Revision: https://reviews.llvm.org/D37569 llvm-svn: 313981	2017-09-22 13:13:57 +00:00
Sanjoy Das	388b012f4e	Rename markAsErased to erase, as pointed out in a previous review; NFC llvm-svn: 313951	2017-09-22 01:47:41 +00:00
Reid Kleckner	0fe506bc5e	Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare" The fix is to avoid invalidating our insertion point in replaceDbgDeclare: Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore); + if (DII == InsertBefore) + InsertBefore = &std::next(InsertBefore->getIterator()); DII->eraseFromParent(); I had to write a unit test for this instead of a lit test because the use list order matters in order to trigger the bug. The reduced C test case for this was: void useit(int); static inline void inlineme() { int x[2]; useit(x); } void f() { inlineme(); inlineme(); } llvm-svn: 313905	2017-09-21 19:52:03 +00:00
Daniel Jasper	7d2f38d600	Revert r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare" .. as well as the two subsequent changes r313826 and r313875. This leads to segfaults in combination with ASAN. Will forward repro instructions to the original author (rnk). llvm-svn: 313876	2017-09-21 12:07:33 +00:00
Mikael Holmen	582e141007	[SROA] Really remove associated dbg.declare when removing dead alloca Summary: There already was code that tried to remove the dbg.declare, but that code was placed after we had called I->replaceAllUsesWith(UndefValue::get(I->getType())); on the alloca, so when we searched for the relevant dbg.declare, we couldn't find it. Now we do the search before we call RAUW so there is a chance to find it. An existing testcase needed update due to this. Two dbg.declare with undef were removed and then suddenly one of the two CHECKS failed. Before this patch we got call void @llvm.dbg.declare(metadata i24* undef, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 call void @llvm.dbg.declare(metadata %struct.prog_src_register* undef, metadata !14, metadata !DIExpression()), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 and with it we get call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 However, the CHECKs in the testcase checked things in a silly order, so they only passed since they found things in the first dbg.declare. Now we changed the order of the checks and the test passes. Reviewers: rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37900 llvm-svn: 313875	2017-09-21 11:14:27 +00:00
Strahinja Petrovic	29202f6dc1	Fixed reverted commit rL312318 This patch contains fix for reverted commit rL312318 which was causing failure due to use of unchecked dyn_cast to CIInit. Patch by: Nikola Prica. llvm-svn: 313870	2017-09-21 10:04:02 +00:00
Serguei Katkov	675e304ef8	Revert "Re-enable "[IRCE] Identify loops with latch comparison against current IV value"" Revert the patch causing the functional failures. The patch owner is notified with test cases which fail. Test case has been provided to Maxim offline. llvm-svn: 313857	2017-09-21 04:50:41 +00:00
Craig Topper	18887bf179	[InstCombine] Teach getDemandedBitsLHSMask to handle constant splat vectors This replaces a ConstantInt dyn_cast with m_APInt Differential Revision: https://reviews.llvm.org/D38100 llvm-svn: 313840	2017-09-20 23:48:58 +00:00
Matt Morehouse	4881a23ca8	[MSan] Disable sanitization for __sanitizer_dtor_callback. Summary: Eliminate unnecessary instrumentation at __sanitizer_dtor_callback call sites. Fixes https://github.com/google/sanitizers/issues/861. Reviewers: eugenis, kcc Reviewed By: eugenis Subscribers: vitalybuka, llvm-commits, cfe-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38063 llvm-svn: 313831	2017-09-20 22:53:08 +00:00
Sanjay Patel	73811a152a	[SimplifyCFG] don't create a no-op subtract I noticed this inefficiency while investigating PR34603: https://bugs.llvm.org/show_bug.cgi?id=34603 This fix will likely push another bug (we don't maintain state of 'LateSimplifyCFG') into hiding, but I'll try to clean that up with a follow-up patch anyway. llvm-svn: 313829	2017-09-20 22:31:35 +00:00
Reid Kleckner	3f547e87b2	[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare Summary: This implements the design discussed on llvm-dev for better tracking of variables that live in memory through optimizations: http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html This is tracked as PR34136 llvm.dbg.addr is intended to be produced and used in almost precisely the same way as llvm.dbg.declare is today, with the exception that it is control-dependent. That means that dbg.addr should always have a position in the instruction stream, and it will allow passes that optimize memory operations on local variables to insert llvm.dbg.value calls to reflect deleted stores. See SourceLevelDebugging.rst for more details. The main drawback to generating DBG_VALUE machine instrs is that they usually cause LLVM to emit a location list for DW_AT_location. The next step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE ranges as not needing a location list, and possibly start setting DW_AT_start_offset for variables whose lifetimes begin mid-scope. Reviewers: aprantl, dblaikie, probinson Subscribers: eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37768 llvm-svn: 313825	2017-09-20 21:52:33 +00:00
Craig Topper	562bf99ee6	[InstCombine] Handle (X & C2) < C1 --> (X & C2) == 0 We already did (X & C2) > C1 --> (X & C2) != 0, if any bit set in (X & C2) will produce a result greater than C1. But there is an equivalent inverse condition with <= C1 (which will be canonicalized to < C1+1) Differential Revision: https://reviews.llvm.org/D38065 llvm-svn: 313819	2017-09-20 21:18:17 +00:00
Craig Topper	a0c897f634	[InstCombine] Use APInt::getActiveBits() to avoid creating an APInt from a trailing zero count to do a comparison. NFCI llvm-svn: 313792	2017-09-20 18:49:29 +00:00
Hans Wennborg	57c3341ada	Revert r313771 "[SLP] Vectorize jumbled memory loads." This broke the buildbots, e.g. http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391 > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Subscribers: mzolotukhin > > Reviewed By: ayal > > Differential Revision: https://reviews.llvm.org/D36130 > > Review comments updated accordingly > > Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 > > Added a TODO for sortLoadAccesses API > > Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 > > Modified the TODO for sortLoadAccesses API > > Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 > > Review comment update for using OpdNum to insert the mask in respective location > > Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce > > Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase > > Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313781	2017-09-20 18:00:03 +00:00
Quentin Colombet	aa103b3d86	[InstCombine] Add select simplifications In these cases, two selects have constant selectable operands for both the true and false components and have the same conditional expression. We then create two arithmetic operations of the same type and feed a final select operation using the result of the true arithmetic for the true operand and the result of the false arithmetic for the false operand and reuse the original conditional expression. The arithmetic operations are naturally folded as a consequence, leaving only the newly formed select to replace the old arithmetic operation. Patch by: Michael Berg <michael_c_berg@apple.com> Differential Revision: https://reviews.llvm.org/D37019 llvm-svn: 313774	2017-09-20 17:32:16 +00:00
Mohammad Shahid	2b281de576	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Subscribers: mzolotukhin Reviewed By: ayal Differential Revision: https://reviews.llvm.org/D36130 Review comments updated accordingly Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 Added a TODO for sortLoadAccesses API Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 Modified the TODO for sortLoadAccesses API Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 Review comment update for using OpdNum to insert the mask in respective location Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313771	2017-09-20 17:19:57 +00:00
Teresa Johnson	f625118ec7	[ThinLTO] Fix dead stripping analysis for SamplePGO Summary: The fix for dead stripping analysis in the case of SamplePGO indirect calls to local functions (r313151) introduced the possibility of an infinite loop. Make sure we check for the value being already live after we update it for SamplePGO indirect call handling. Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D38086 llvm-svn: 313766	2017-09-20 17:09:47 +00:00
Alexander Kornienko	6a140234ed	Revert r313736: "[SLP] Vectorize jumbled memory loads." The revision breaks buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio llvm-svn: 313758	2017-09-20 14:53:07 +00:00
Mohammad Shahid	f8db9bd857	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 Commit after rebase for patch D36130 Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab llvm-svn: 313736	2017-09-20 08:18:28 +00:00
Sanjoy Das	09613b122e	Tighten the invariants around LoopBase::invalidate Summary: With this change: - Methods in LoopBase trip an assert if the receiver has been invalidated - LoopBase::clear frees up the memory held by the LoopBase instance This change also shuffles things around as necessary to work with this stricter invariant. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38055 llvm-svn: 313708	2017-09-20 02:31:57 +00:00
Daniel Berlin	064cb68d18	GVNSink: Make ModelledPHIs constructor linear (and avoid edge case it worries about) by avoiding getIncomingValueForBlock llvm-svn: 313702	2017-09-20 00:07:27 +00:00
Daniel Berlin	dd323297d0	Revert "[GVNSink] Remove dependency on SmallPtrSet iteration order." This reverts commit r312156, because now the op and block arrays are not in the same order :(. llvm-svn: 313701	2017-09-20 00:07:25 +00:00
Daniel Berlin	9632dd7376	NewGVN: Remove unused includes llvm-svn: 313700	2017-09-20 00:07:12 +00:00
Sanjoy Das	76ab23234c	[LoopInfo] Make LoopBase and Loop destructors non-public Summary: See comment for why I think this is a good idea. This change also: - Removes an SCEV test case. The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop. - Updates the loop pass manager test case to deal with a private ~Loop::Loop. - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37996 llvm-svn: 313695	2017-09-19 23:19:00 +00:00
Adam Nemet	15fccf0009	Allow ORE.emit to take a closure to delay building the remark object In the lambda we are now returning the remark by value so we need to preserve its type in the insertion operator. This requires making the insertion operator generic. I've also converted a few cases to use the new API. It seems to work pretty well. See the LoopUnroller for a slightly more interesting case. llvm-svn: 313691	2017-09-19 23:00:55 +00:00
Dehao Chen	62b9c33e1e	Import all inlined indirect call targets for SamplePGO. Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36637 llvm-svn: 313678	2017-09-19 21:18:14 +00:00
Sanjay Patel	ca14697c2b	[SimplifyCFG] fix typos/formatting; NFC llvm-svn: 313671	2017-09-19 20:58:14 +00:00
Dehao Chen	b6e60c8b80	Handle profile mismatch correctly for SamplePGO. Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash. Reviewers: davidxl, tejohnson, eraman Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38018 llvm-svn: 313658	2017-09-19 18:26:54 +00:00
Reid Kleckner	1aa4ea8104	[gcov] Emit errors when opening the notes file fails No time to write a test case, on to the next bug. =P Discovered while investigating PR34659 llvm-svn: 313571	2017-09-18 21:31:48 +00:00
Sanjay Patel	55de1ed8e6	[SLP] clean up for vector store case; NFCI llvm-svn: 313541	2017-09-18 16:20:15 +00:00
Craig Topper	f264fcc704	[X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles. I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates. llvm-svn: 313450	2017-09-16 07:36:14 +00:00
Chandler Carruth	beb22b5437	[SLP] Revert r312791 and other necessary commits, except for TTI and CostModel. The original patch added support for horizontal min/max reductions to the SLP vectorizer. This patch causes LLVM to miscompile fairly simple signed min reductions. I have attached a test program to http://llvm.org/PR34635 that shows the behavior change after this patch. We found this in a test for the open source Eigen library, but also in other code. Unfortunately, the revert is moderately challenging. It required reverting: r313042: [SLP] Test with multiple uses of conditional op and wrong parent. r312853: [SLP] Fix buildbots, NFC. r312793: [SLP] Fix the warning about paths not returning the value, NFC. r312791: [SLP] Support for horizontal min/max reduction. And even then, I had to completely skip reverting the changes to TTI and CostModel because r312832 rewrote so much of this code. Plus, the cost modeling changes aren't implicated in the miscompile, so they should be fine and will just not be used until this gets re-introduced. llvm-svn: 313409	2017-09-15 22:23:27 +00:00
Vivek Pandya	b5ab895e2a	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313390	2017-09-15 20:10:09 +00:00
Vivek Pandya	df8598dcc4	This reverts r313381 llvm-svn: 313387	2017-09-15 19:53:54 +00:00
Vivek Pandya	00d887447b	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313382	2017-09-15 19:30:59 +00:00
Anna Thomas	f34537dff8	[RuntimeUnroll] Add heuristic for unrolling multi-exit loop Add a profitability heuristic to enable runtime unrolling of multi-exit loop: There can be at most two unique exit blocks for the loop and the second exit block should be a deoptimizing block. Also, there can be one other exiting block other than the latch exiting block. The reason for the latter is so that we limit the number of branches in the unrolled code to being at most the unroll factor. Deoptimizing blocks are rarely taken so these additional branches created due to the unrolling are predictable, since one of their targets is the deopt block. Reviewers: apilipenko, reames, evstupac, mkuper Subscribers: llvm-commits Reviewed by: reames Differential Revision: https://reviews.llvm.org/D35380 llvm-svn: 313363	2017-09-15 15:56:05 +00:00
Anna Thomas	512dde77ba	[RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup During runtime unrolling on loops with multiple exits, we update the exit blocks with the correct phi values from both original and remainder loop. In this process, we lookup the VMap for the mapped incoming phi values, but did not update the VMap if a default entry was generated in the VMap during the lookup. This default value is generated when constants or values outside the current loop are looked up. This patch fixes the assertion failure when null entries are present in the VMap because of this lookup. Added a testcase that showcases the problem. llvm-svn: 313358	2017-09-15 13:29:33 +00:00
Ilya Biryukov	d23faa843e	Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." This reverts commit r313348. Reason: it caused buildbot failures. llvm-svn: 313352	2017-09-15 10:15:00 +00:00
Dinar Temirbulatov	e2358b53bc	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { *dst++ = *src++; *dst++ = *src++ + 1; *dst++ = *src++ + 2; *dst++ = *src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 313348	2017-09-15 06:56:39 +00:00
Dinar Temirbulatov	bb891b864c	[SLPVectorizer] Remove duplicated functionality code in initScheduleData function, NFCI. llvm-svn: 313341	2017-09-15 04:31:54 +00:00
Alina Sbirlea	7ed5856a32	Refactor collectChildrenInLoop to LoopUtils [NFC] Summary: Move to LoopUtils method that collects all children of a node inside a loop. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37870 llvm-svn: 313322	2017-09-15 00:04:16 +00:00
Dehao Chen	3a81f84d9a	Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: vitalybuka, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313277	2017-09-14 17:29:56 +00:00
Alon Kom	682cfc1d4c	[LV] Fix maximum legal VF calculation This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237	2017-09-14 07:40:02 +00:00
Vitaly Buka	48624d327a	Revert "Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader." Patch introduced uninitialized value. This reverts commit r313195. llvm-svn: 313230	2017-09-14 05:40:33 +00:00
Peter Collingbourne	cfbd089237	Reland r313157, "ThinLTO: Correctly follow aliasee references when dead stripping." which was reverted in r313222. This reland includes a fix for the LowerTypeTests pass so that it looks past aliases when determining which type identifiers are live. Differential Revision: https://reviews.llvm.org/D37842 llvm-svn: 313229	2017-09-14 05:02:59 +00:00
Dinar Temirbulatov	df0b843875	[SLPVectorizer] Prefer auto over explicit type for VL0, NFCI. llvm-svn: 313228	2017-09-14 04:28:35 +00:00
Hans Wennborg	ae050afeb9	Revert r313157 "ThinLTO: Correctly follow aliasee references when dead stripping." This broke Chromium's CFI build; see crbug.com/765004. > We were previously handling aliases during dead stripping by adding > the aliased global's "original name" GUID to the worklist. This will > lead to incorrect behaviour if the global has local linkage because > the original name GUID will not correspond to the global's GUID in > the summary. > > Because an alias is just another name for the global that it > references, there is no need to mark the referenced global as used, > or to follow references from any other copies of the global. So all > we need to do is to follow references from the aliasee's summary > instead of the alias. > > Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313222	2017-09-14 00:40:14 +00:00
Eugene Zelenko	8002c504cd	[Transforms] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 313198	2017-09-13 21:43:53 +00:00
Dehao Chen	15c86ef970	Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313195	2017-09-13 21:22:55 +00:00
Anna Thomas	19529f75b9	[LV] Avoid computing the register usage for default VF. NFC These are changes to reduce redundant computations when calculating a feasible vectorization factor: 1. early return when target has no vector registers 2. don't compute register usage for the default VF. Suggested during review for D37702. llvm-svn: 313176	2017-09-13 19:35:45 +00:00
Hiroshi Yamauchi	a43913cfaf	Add options to dump PGO counts in text. Summary: Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37776 llvm-svn: 313159	2017-09-13 17:20:38 +00:00
Peter Collingbourne	d067c8ed59	ThinLTO: Correctly follow aliasee references when dead stripping. We were previously handling aliases during dead stripping by adding the aliased global's "original name" GUID to the worklist. This will lead to incorrect behaviour if the global has local linkage because the original name GUID will not correspond to the global's GUID in the summary. Because an alias is just another name for the global that it references, there is no need to mark the referenced global as used, or to follow references from any other copies of the global. So all we need to do is to follow references from the aliasee's summary instead of the alias. Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313157	2017-09-13 17:09:20 +00:00
Teresa Johnson	1958083d35	[ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin link Summary: SamplePGO indirect call profiles record the target as the original GUID for statics. The importer had special handling to map to the normal GUID in that case. The dead global analysis needs the same treatment or inconsistencies arise, resulting in linker unsats due to some dead symbols being exported and kept, leaving in references to other dead symbols that are removed. This can happen when a SamplePGO profile collected by one binary is used for a different binary, so the indirect call profiles may not accurately reflect live targets. Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37783 llvm-svn: 313151	2017-09-13 15:16:38 +00:00
Ayal Zaks	e2a8c0758f	[LV] Fix PR34523 - avoid generating redundant selects When converting a PHI into a series of 'select' instructions to combine the incoming values together according to their edge masks, initialize the first value to the incoming value In0 of the first predecessor, instead of generating a redundant assignment 'select(Cond[0], In0, In0)'. The latter fails when the Cond[0] mask is null, representing a full mask, which can happen only when there's a single incoming value. No functional changes intended nor expected other than surviving null Cond[0]'s. This fix follows D35725, which introduced using null to represent full masks. Differential Revision: https://reviews.llvm.org/D37619 llvm-svn: 313119	2017-09-13 06:28:37 +00:00
Aditya Kumar	dfa8741c96	[GVNHoist] Factor out reachability to search for anticipable instructions quickly Factor out the reachability such that multiple queries to find reachability of values are fast. This is based on finding the ANTIC points in the CFG which do not change during hoisting. The ANTIC points are basically the dominance-frontiers in the inverse graph. So we introduce a data structure (CHI nodes) to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the function as they are the potential hoistable candidates. This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a trivial way. Relevant test cases are added to show the functionality as well as regression fixes from PR32821. Regression from previous GVNHoist: We do not hoist fully redundant expressions because fully redundant expressions are already handled by NewGVN Differential Revision: https://reviews.llvm.org/D35918 Reviewers: dberlin, sebpop, gberry, llvm-svn: 313116	2017-09-13 05:28:03 +00:00
Reid Kleckner	8a1cd91016	[InstCombine] Add a flag to disable LowerDbgDeclare Summary: This should improve optimized debug info for address-taken variables at the cost of inaccurate debug info in some situations. We patched this into clang and deployed this change to Chromium developers, and this significantly improved debuggability of optimized code. The long-term solution to PR34136 seems more and more like it's going to take a while, so I would like to commit this change under a flag so that it can be used as a stop-gap measure. This flag should really help for C++ aggregates like std::string and std::vector, which are typically address-taken, even after inlining, and cannot be SROA-ed. Reviewers: aprantl, dblaikie, probinson, dberlin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36596 llvm-svn: 313108	2017-09-13 01:43:25 +00:00
Dehao Chen	f3ed14d323	Refactor the code to pass down ACT to SampleProfileLoader correctly. Summary: This change passes down ACT to SampleProfileLoader for the new PM. Also remove the default value for SampleProfileLoader class as it is not used. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37773 llvm-svn: 313080	2017-09-12 21:55:55 +00:00
Alina Sbirlea	80b806bf30	Make promoteLoopAccessesToScalars independent of AliasSet [NFC] Summary: The current promoteLoopAccessesToScalars method receives an AliasSet, but the information used is in fact a list of Value, known to must alias. Create the list ahead of time to make this method independent of the AliasSet class. While there is no functionality change, this adds overhead for creating a set of Value, when promotion would normally exit earlier. This is meant to be as a first refactoring step in order to start replacing AliasSetTracker with MemorySSA. And while the end goal is to redesign LICM, the first few steps will focus on adding MemorySSA as an alternative to the AliasSetTracker using most of the existing functionality. Reviewers: mkuper, danielcdh, dberlin Subscribers: sanjoy, chandlerc, gberry, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D35439 llvm-svn: 313075	2017-09-12 21:18:44 +00:00
Anna Thomas	9f1be02fa3	[LV] Clamp the VF to the trip count Summary: When the MaxVectorSize > ConstantTripCount, we should just clamp the vectorization factor to be the ConstantTripCount. This vectorizes loops where the TinyTripCountThreshold >= TripCount < MaxVF. Earlier we were finding the maximum vector width, which could be greater than the trip count itself. The Loop vectorizer does all the work for generating a vectorizable loop, but in the end we would always choose the scalar loop (since the VF > trip count). This allows us to choose the VF keeping in mind the trip count if available. This is a fix on top of rL312472. Reviewers: Ayal, zvi, hfinkel, dneilson Reviewed by: Ayal Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37702 llvm-svn: 313046	2017-09-12 16:32:45 +00:00
Alexey Bataev	a26d3e834d	[SLP] Fix for PHINode during horizontal reduction scanning, NFC. Reduces number of loops during instructions analysis. llvm-svn: 313035	2017-09-12 15:13:50 +00:00
Peter Collingbourne	b9b6025328	LowerTypeTests: Add import/export support for targets without absolute symbol constants. The rationale is the same as for r312967. Differential Revision: https://reviews.llvm.org/D37408 llvm-svn: 312968	2017-09-11 22:49:10 +00:00
Peter Collingbourne	b15a35e604	WholeProgramDevirt: Add import/export support for targets without absolute symbol constants. Not all targets support the use of absolute symbols to export constants. In particular, ARM has a wide variety of constant encodings that cannot currently be relocated by linkers. So instead of exporting the constants using symbols, export them directly in the summary. The values of the constants are left as zeroes on targets that support symbolic exports. This may result in more cache misses when targeting those architectures as a result of arbitrary changes in constant values, but this seems somewhat unavoidable for now. Differential Revision: https://reviews.llvm.org/D37407 llvm-svn: 312967	2017-09-11 22:34:42 +00:00
Uriel Korach	18972237a2	Test commit llvm-svn: 312878	2017-09-10 08:31:22 +00:00
Nuno Lopes	404f106d71	Merge isKnownNonNull into isKnownNonZero It now knows the tricks of both functions. Also, fix a bug that considered allocas of non-zero address space to be always non null Differential Revision: https://reviews.llvm.org/D37628 llvm-svn: 312869	2017-09-09 18:23:11 +00:00
Sanjay Patel	6fd4391ddd	[DivRempairs] add a pass to optimize div/rem pairs (PR31028) This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 llvm-svn: 312862	2017-09-09 13:38:18 +00:00
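The "decomposing remainder" idea mentioned in the commit above can be sketched as follows. This is an illustration of the arithmetic identity only, assuming the decomposition takes the usual form rem = x - (x / y) * y with truncating division (the semantics of LLVM's `sdiv`/`srem`); the helper names `trunc_div` and `rem_from_div` are hypothetical and do not come from the patch.

```python
def trunc_div(x: int, y: int) -> int:
    """Integer division that rounds toward zero, unlike Python's floor-based //."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def rem_from_div(x: int, y: int) -> int:
    """Recover the remainder from an already-computed quotient:
    rem = x - (x / y) * y, so a div/rem pair needs only one division."""
    return x - trunc_div(x, y) * y
```

With truncating division the remainder takes the sign of the dividend, so `rem_from_div(-7, 2)` is `-1` rather than the `1` that Python's `%` operator would give.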
Kostya Serebryany	4192b96313	[sanitizer-coverage] call appendToUsed once per module, not once per function (which is too slow) llvm-svn: 312855	2017-09-09 05:30:13 +00:00
Alexey Bataev	628fbcae4c	[SLP] Fix buildbots, NFC. llvm-svn: 312853	2017-09-09 02:08:45 +00:00
Dinar Temirbulatov	0d31f0af43	[SLPVectorizer] Add struct InstructionsState that holds information about analysis of vector to be vectorized. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D37212 llvm-svn: 312802	2017-09-08 17:08:17 +00:00
Alexey Bataev	bd4a361739	[SLP] Fix the warning about paths not returning the value, NFC. llvm-svn: 312793	2017-09-08 14:32:20 +00:00
Alexey Bataev	6dd29fccb8	[SLP] Support for horizontal min/max reduction. SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Differential revision: https://reviews.llvm.org/D27846 llvm-svn: 312791	2017-09-08 13:49:36 +00:00
Max Kazantsev	d7b0f74c64	Re-enable "[IRCE] Identify loops with latch comparison against current IV value" Re-applying after the found bug was fixed. Differential Revision: https://reviews.llvm.org/D36215 llvm-svn: 312783	2017-09-08 10:15:05 +00:00
Max Kazantsev	57db44838d	diff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp index f72a808..9fa49fd 100644 --- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp +++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp @@ -450,20 +450,10 @@ struct LoopStructure { // equivalent to: // // intN_ty inc = IndVarIncreasing ? 1 : -1; - // pred_ty predicate = IndVarIncreasing - // ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT - // : IsSignedPredicate ? ICMP_SGT : ICMP_UGT; + // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT; // - // - // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt); - // iv = IndVarNext) + // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase) // ... body ... - // - // Here IndVarBase is either current or next value of the induction variable. - // in the former case, IsIndVarNext = false and IndVarBase points to the - // Phi node of the induction variable. Otherwise, IsIndVarNext = true and - // IndVarBase points to IV increment instruction. 
- // Value IndVarBase; Value IndVarStart; @@ -471,13 +461,12 @@ struct LoopStructure { Value LoopExitAt; bool IndVarIncreasing; bool IsSignedPredicate; - bool IsIndVarNext; LoopStructure() : Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr), LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr), IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr), - IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {} + IndVarIncreasing(false), IsSignedPredicate(true) {} template <typename M> LoopStructure map(M Map) const { LoopStructure Result; @@ -493,7 +482,6 @@ struct LoopStructure { Result.LoopExitAt = Map(LoopExitAt); Result.IndVarIncreasing = IndVarIncreasing; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; return Result; } @@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, return false; }; - // `ICI` can either be a comparison against IV or a comparison of IV.next. - // Depending on the interpretation, we calculate the start value differently. + // `ICI` is interpreted as taking the backedge if the next* value of the + // induction variable satisfies some constraint. - // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch - // comparisons happens against the IV before or after its value is - // incremented. Two valid combinations for them are: - // - // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false }, - // 2) { iv.next; true }. - // - // The latch comparison happens against IndVarBase which can be either current - // or next value of the induction variable. 
const SCEVAddRecExpr IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV); bool IsIncreasing = false; bool IsSignedPredicate = true; - bool IsIndVarNext = false; ConstantInt StepCI; if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) { FailureReason = "LHS in icmp not induction variable"; return None; } - const SCEV IndVarStart = nullptr; - // TODO: Currently we only handle comparison against IV, but we can extend - // this analysis to be able to deal with comparison against sext(iv) and such. - if (isa<PHINode>(LeftValue) && - cast<PHINode>(LeftValue)->getParent() == Header) - // The comparison is made against current IV value. - IndVarStart = IndVarBase->getStart(); - else { - // Assume that the comparison is made against next IV value. - const SCEV StartNext = IndVarBase->getStart(); - const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); - IndVarStart = SE.getAddExpr(StartNext, Addend); - IsIndVarNext = true; - } + const SCEV StartNext = IndVarBase->getStart(); + const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); + const SCEV IndVarStart = SE.getAddExpr(StartNext, Addend); const SCEV Step = SE.getSCEV(StepCI); ConstantInt One = ConstantInt::get(IndVarTy, 1); @@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, Result.IndVarIncreasing = IsIncreasing; Result.LoopExitAt = RightValue; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; FailureReason = nullptr; @@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd( BranchToContinuation); NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader); - auto FixupValue = - LS.IsIndVarNext ? 
PN->getIncomingValueForBlock(LS.Latch) : PN; - NewPHI->addIncoming(FixupValue, RRI.ExitSelector); + NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch), + RRI.ExitSelector); RRI.PHIValuesAtPseudoExit.push_back(NewPHI); } @@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop L, LPPassManager &LPM) { } LoopStructure LS = MaybeLoopStructure.getValue(); const SCEVAddRecExpr IndVar = - cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase)); - if (LS.IsIndVarNext) - IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar, - SE.getSCEV(LS.IndVarStep))); + cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep))); Optional<InductiveRangeCheck::Range> SafeIterRange; Instruction ExprInsertPt = Preheader->getTerminator(); diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll deleted file mode 100644 index afea0e6..0000000 --- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll +++ /dev/null @@ -1,182 +0,0 @@ -; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 \| FileCheck %s - -; Check that IRCE is able to deal with loops where the latch comparison is -; done against current value of the IV, not the IV.next. - -; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> - -; SLT condition for increasing loop from 0 to 100. 
-define void @test_01(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_01 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp slt i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp slt i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void 
-} - -; ULT condition for increasing loop from 0 to 100. -define void @test_02(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_02 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp ult i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp ult i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label 
%exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_01, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_03(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_03 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp slt i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_02, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_04(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_04 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp ult i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -!0 = !{i32 0, i32 50} -!1 = !{i64 0, i64 50} llvm-svn: 312775	2017-09-08 04:26:41 +00:00
Peter Collingbourne	88a58cf9e7	WholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat. This is required when targeting COFF, as the comdat name must match one of the names of the symbols in the comdat. Differential Revision: https://reviews.llvm.org/D37550 llvm-svn: 312767	2017-09-08 00:10:53 +00:00
Reid Kleckner	0e8c4bb055	Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic. llvm-svn: 312759	2017-09-07 23:27:44 +00:00
Richard Trieu	c7828ebea4	Revert r312318, r312325, r312424, r312489 r312318 - Debug info for variables whose type is shrinked to bool r312325, r312424, r312489 - Test case for r312318 Revision 312318 introduced a null dereference bug. Details in https://bugs.llvm.org/show_bug.cgi?id=34490 llvm-svn: 312758	2017-09-07 23:20:35 +00:00
Krzysztof Parzyszek	1dc313727e	Disable jump threading into loop headers Consider this type of a loop: for (...) { ... if (...) continue; ... } Normally, the "continue" would branch to the loop control code that checks whether the loop should continue iterating and which contains the (often) unique loop latch branch. In certain cases jump threading can "thread" the inner branch directly to the loop header, creating a second loop latch. Loop canonicalization would then transform this loop into a loop nest. The problem with this is that in such a loop nest neither loop is countable even if the original loop was. This may inhibit subsequent loop optimizations and be detrimental to performance. Differential Revision: https://reviews.llvm.org/D36404 llvm-svn: 312664	2017-09-06 19:36:58 +00:00
Sanjay Patel	6840c5ff75	[ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite(): https://bugs.llvm.org/show_bug.cgi?id=27145 In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno with a constant operand. But while looking at those patterns, I realized we were missing a canonicalization for nonzero constants. Rather than limiting to just folds for constants, we're adding a general value tracking method for this based on an existing DAG helper. By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps() and pick up missing vector folds. Differential Revision: https://reviews.llvm.org/D37427 llvm-svn: 312591	2017-09-05 23:13:13 +00:00
Davide Italiano	32504cf661	[GVNHoist] Move duplicated code to a helper function. NFCI. llvm-svn: 312575	2017-09-05 20:49:41 +00:00
Craig Topper	28d6d962d5	[InstCombine] Move foldSelectICmpAnd helper function earlier in the file to enable reuse in a future patch. llvm-svn: 312518	2017-09-05 05:26:37 +00:00
Craig Topper	4c766a0559	[InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know for sure we're going to use it and avoid an unnecessary call to m_APInt. Instead of creating a Constant and then calling m_APInt with it (which will always return true). Just create an APInt initially, and use that for the checks in isSelect01 function. If it turns out we do need the Constant, create it from the APInt. This is a refactor for a future patch that will do some more checks of the constant values here. llvm-svn: 312517	2017-09-05 05:26:36 +00:00
Daniel Berlin	f9c9455d3f	NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect self-cycles of phi nodes. We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant. llvm-svn: 312510	2017-09-05 02:17:43 +00:00
Daniel Berlin	54a92fcc5d	NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification llvm-svn: 312509	2017-09-05 02:17:42 +00:00
Daniel Berlin	1a58258232	NewGVN: Detect copies through predicateinfo llvm-svn: 312508	2017-09-05 02:17:41 +00:00
Daniel Berlin	4ad7e8d263	NewGVN: Change where check for original instruction in phi of ops leader finding is done. Where we had it before, we would stop looking when we hit the original instruction, but skip it. Now we skip it and keep looking. llvm-svn: 312507	2017-09-05 02:17:40 +00:00
Zvi Rackover	9a087a357a	LoopVectorize: MaxVF should not be larger than the loop trip count Summary: Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count. Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow: 1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks. 2. Compute MaxVF returned 16 on a target supporting AVX512. 3. OptForSize -> choose VF:=MaxVF. 4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop. With this patch step 2. will choose MaxVF=8 based on TripCount. Reviewers: Ayal, dorit, mkuper, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D37425 llvm-svn: 312472	2017-09-04 08:35:13 +00:00
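The trip-count clamp described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function name `clampMaxVF`, the convention that a trip count of 0 means "unknown", and the power-of-two restriction are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): returns true if X is a power of two.
static bool isPowerOfTwo(std::uint32_t X) { return X != 0 && (X & (X - 1)) == 0; }

// Hypothetical sketch: limit the target's maximum vectorization factor by a
// known constant trip count. A TripCount of 0 stands for "unknown" here.
// Restricting the clamp to power-of-two trip counts is an assumption of this
// sketch; the commit message only says MaxVF should not exceed the trip count.
std::uint32_t clampMaxVF(std::uint32_t TargetMaxVF, std::uint32_t TripCount) {
  assert(isPowerOfTwo(TargetMaxVF) && "expected a power-of-two target MaxVF");
  if (TripCount != 0 && TripCount < TargetMaxVF && isPowerOfTwo(TripCount))
    return TripCount; // e.g. TripCount = 8 on an AVX512 target: 16 becomes 8
  return TargetMaxVF;
}
```

With the numbers from the commit message (target MaxVF of 16, trip count of 8), this sketch picks 8, so `TripCount % VF == 0` and no tail loop is needed.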
Sam Parker	7cd826a321	[LoopUnroll][DebugInfo] Don't add metadata to unrolled remainder loop Debug information can be, and was, corrupted when the runtime remainder loop was fully unrolled. This is because a !null node can be created instead of a unique one describing the loop. In this case, the original node gets incorrectly updated with the NewLoopID metadata. In the case when the remainder loop is going to be quickly fully unrolled, there isn't the need to add loop metadata for it anyway. Differential Revision: https://reviews.llvm.org/D37338 llvm-svn: 312471	2017-09-04 08:12:16 +00:00
Sanjay Patel	bc6da4e40f	[InstCombine] replace unnecessary fcmp fold with assert See https://reviews.llvm.org/rL312411 for related InstSimplify tests. llvm-svn: 312421	2017-09-02 18:10:29 +00:00
Sanjay Patel	64fc5daf42	[InstCombine] combine foldAndOfFCmps and foldOrOfFcmps; NFCI In addition to removing chunks of duplicated code, we don't want these to diverge. If there's a fold for one, there should be a fold of the other via DeMorgan's Laws. llvm-svn: 312420	2017-09-02 17:53:33 +00:00
Sanjay Patel	275bb5a14e	[InstCombine] fix misnamed locals and use them to reduce code; NFCI We had these locals: Value Op0RHS = LHS->getOperand(1); Value Op1LHS = RHS->getOperand(0); ...so we confusingly transposed the meaning of left/right and op0/op1. llvm-svn: 312418	2017-09-02 17:17:17 +00:00
Benjamin Kramer	14ddcdfb18	[LoopVectorize] Turn static DenseSet into switch. LLVM transforms this into a bit test which is a lot faster and smaller. llvm-svn: 312417	2017-09-02 16:41:55 +00:00
Sanjay Patel	da6f9b2fee	[InstCombine] remove unnecessary code; NFC llvm-svn: 312416	2017-09-02 16:32:37 +00:00
Sanjay Patel	4c52f765a5	[InstCombine] move related functions next to each other; NFC This makes it easier to see that they're almost duplicates. As with the similar icmp functions, there should be identical folds for both logic ops because those are DeMorganized variants. llvm-svn: 312415	2017-09-02 16:30:27 +00:00
Sanjay Patel	6b139464ca	[InstCombine] use local variable to reduce code duplication; NFCI llvm-svn: 312414	2017-09-02 15:11:55 +00:00
Daniel Berlin	94090dd13b	Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders. Summary: After a discussion with Rekka, I believe this (or a small variant) should fix the remaining phi-of-ops problems. Rekka's algorithm for completeness relies on looking up expressions that should have no leader, and expecting it to fail (i.e. looking up expressions that can't exist in a predecessor, and expecting it to find nothing). Unfortunately, sometimes these expressions can be simplified to constants, but we need the lookup to fail anyway. Additionally, our simplifier outsmarts this by taking these "not quite right" expressions, and simplifying them into other expressions or walking through phis, etc. In the past, we've sometimes been able to find leaders for these expressions, incorrectly. This change causes us not to try to phi-of-ops such expressions. We determine safety by seeing if they depend on a phi node in our block. This is not perfect, we can do a bit better, but this should be a "correctness start" that we can then improve. It also requires a bunch of caching that I'd eventually like to eliminate. The right solution, longer term, to the simplifier issues, is to make the query interface for the instruction simplifier/constant folder have the flags we need, so that we can keep most things going, but turn off the possibly-invalid parts (threading through phis, etc). This is an issue in another wrong code bug as well. Reviewers: davide, mcrosier Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37175 llvm-svn: 312401	2017-09-02 02:18:44 +00:00
Eugene Zelenko	75075efe5e	[Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 312383	2017-09-01 21:37:29 +00:00
Craig Topper	924f20262b	[InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask. This allows some code to be removed from InstSimplify that was doing this functionality. This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appearing with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out. There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary. Differential Revision: https://reviews.llvm.org/D37158 llvm-svn: 312382	2017-09-01 21:27:34 +00:00
Craig Topper	d3b465606a	[InstCombine] Don't require the compare types to be the same in getMaskedTypeForICmpPair. A future patch will make the code look through truncates feeding the compare. So the compares might be different types but the pretruncated types might be the same. This should be safe because we still require the same Value* to be used truncated or not in both compares. So that serves to ensure the types are the same. llvm-svn: 312381	2017-09-01 21:27:31 +00:00
Craig Topper	085c1f4dea	[InstCombine] When converting decomposeBitTestICmp's APInt return to ConstantInt, make sure we use the type from the Value* that was also returned from decomposeBitTestICmp. Previously we used the type from the LHS of the compare, but a future patch will change decomposeBitTestICmp to look through truncates so it will return a pretruncated Value* and the type needs to match that. llvm-svn: 312380	2017-09-01 21:27:29 +00:00
Daniel Berlin	86932104db	NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context. Reviewers: davide, mcrosier Subscribers: sanjoy, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D37174 llvm-svn: 312352	2017-09-01 19:20:18 +00:00
Manoj Gupta	6b54c7e11b	[LoopVectorizer] Use two step casting for float to pointer types. Summary: LoopVectorizer is creating casts between vec<ptr> and vec<float> types on ARM when compiling OpenCV. Since it is illegal to directly cast a floating point type to a pointer type even if the types have the same size, this causes a crash. Fix the crash by using two-step casting: bitcast to integer, then cast integer to pointer/float. Fixes PR33804. Reviewers: mkuper, Ayal, dlj, rengolin, srhines Reviewed By: rengolin Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35498 llvm-svn: 312331	2017-09-01 15:36:00 +00:00
Clement Courbet	bc0c4459c9	[MergeICmps] Fix build of rL312315 on clang-with-thin-lto-windows: MergeICmps.cpp(68,15): error: chosen constructor is explicit in copy-initialization return {}; APInt.h(339,12): note: explicit constructor declared here explicit APInt() : BitWidth(1) { U.VAL = 0; } ^ MergeICmps.cpp(56,9): note: in implicit initialization of field 'Offset' with omitted initializer APInt Offset; ^ llvm-svn: 312326	2017-09-01 11:51:23 +00:00
Clement Courbet	65130e2d8d	Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer Add missing header. This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf. llvm-svn: 312322	2017-09-01 10:56:34 +00:00
Strahinja Petrovic	676fd0b022	Debug info for variables whose type is shrinked to bool This patch provides such debug information for integer variables whose type is shrinked to bool by providing dwarf expression which returns either constant initial value or other value. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D35994 llvm-svn: 312318	2017-09-01 10:05:27 +00:00
Clement Courbet	316212575b	Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer" Break build This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b. llvm-svn: 312317	2017-09-01 09:43:08 +00:00
Clement Courbet	9473c01e96	[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer comparisons into memcmp. Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==(). For now this is disabled by default until: - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete - Benchmarks show that this is always useful. Differential Revision: https://reviews.llvm.org/D33987 llvm-svn: 312315	2017-09-01 09:07:05 +00:00
Eugene Zelenko	fa6434bebb	[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC). llvm-svn: 312289	2017-08-31 21:56:16 +00:00
Akira Hatanaka	13d2beb14d	[ObjCARC] Pass the correct BasicBlock to fix assertion failure. The BasicBlock passed to FindPredecessorRetainWithSafePath should be the parent block of Autorelease. This fixes a crash that occurs in FindDependencies when StartInst is not in StartBB. rdar://problem/33866381 llvm-svn: 312266	2017-08-31 18:27:47 +00:00
Sanjay Patel	e6b48a1b02	[InstCombine] improve demanded vector elements analysis of insertelement Recurse instead of returning on the first found optimization. Also, return early in the caller instead of continuing because that allows another round of simplification before we might potentially lose undef information from a shuffle mask by eliminating the shuffle. As noted in the review, we could probably do better and be more efficient by moving all of demanded elements into a separate pass, but this is yet another quick fix to instcombine. Differential Revision: https://reviews.llvm.org/D37236 llvm-svn: 312248	2017-08-31 15:57:17 +00:00
Dinar Temirbulatov	8870a14e4e	[SLPVectorizer] Move out Entry->NeedToGather check and assert of inner loop as invariant, NFCI. llvm-svn: 312242	2017-08-31 14:10:07 +00:00
Max Kazantsev	0a9c1ef2eb	[IRCE] Identify loops with latch comparison against current IV value Current implementation of parseLoopStructure interprets the latch comparison as a comparison against `iv.next`. If the actual comparison is made against the current value of `iv`, the loop may be rejected, because this misinterpretation leads to incorrect evaluation of the latch start value. This patch teaches IRCE to distinguish this kind of loop and perform the optimization for it. Now we use the `IndVarBase` variable, which can be either the next or the current value of the induction variable (previously we used `IndVarNext`, which was always the value on the next iteration). Differential Revision: https://reviews.llvm.org/D36215 llvm-svn: 312221	2017-08-31 07:04:20 +00:00
Max Kazantsev	a22742be5a	[IRCE][NFC] Rename IndVarNext to IndVarBase Renaming as a preparation step to generalizing IRCE for comparison not only against the next value of an indvar, but also against the current. Differential Revision: https://reviews.llvm.org/D36509 llvm-svn: 312215	2017-08-31 05:58:15 +00:00
Adrian Prantl	504b82d44b	Don't add a fragment expression when GlobalSRA splits up a single-member struct Fixes PR34390. https://bugs.llvm.org/show_bug.cgi?id=34390 llvm-svn: 312196	2017-08-31 00:06:18 +00:00
Matt Morehouse	034126e507	[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. - Only enable on Linux. Reviewers: vitalybuka, kcc, george.karpenkov Reviewed By: kcc Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 312185	2017-08-30 22:49:31 +00:00
Adrian Prantl	b192b545c1	Refactor DIBuilder::createFragmentExpression into a static DIExpression member NFC llvm-svn: 312165	2017-08-30 20:04:17 +00:00
Daniel Berlin	23fec57e6f	NewGVN: Make sure we add the correct user if we swapped the comparison operands llvm-svn: 312162	2017-08-30 19:53:23 +00:00
Daniel Berlin	7ef26daba8	NewGVN: Allow simplification into variables llvm-svn: 312161	2017-08-30 19:52:39 +00:00
Benjamin Kramer	b99d7c9214	[GVNSink] Remove dependency on SmallPtrSet iteration order. Found by LLVM_ENABLE_REVERSE_ITERATION. llvm-svn: 312156	2017-08-30 18:46:37 +00:00
Sanjay Patel	6f7ac7e402	[InstCombine] remove unnecessary vector select fold; NFCI This code is double-dead: 1. We simplify all selects with constant true/false condition in InstSimplify. I've minimized/moved the tests to show that works as expected. 2. All remaining vector selects with a constant condition are canonicalized to shufflevector, so we really can't see this pattern. llvm-svn: 312123	2017-08-30 14:04:57 +00:00
Florian Hahn	b992feee13	[InstCombine] Fold insert sequence if first ins has multiple users. Summary: If the first insertelement instruction has multiple users and inserts at position 0, we can re-use this instruction when folding a chain of insertelement instructions. As we need to generate the first insertelement instruction anyway, this should be a strict improvement. We could get rid of the restriction of inserting at position 0 by creating a different shufflemask, but it is probably worth keeping the first insertelement instruction at position 0, as I think this is easier to do efficiently than at other positions. Reviewers: grosser, mkuper, fpetrogalli, efriedma Reviewed By: fpetrogalli Subscribers: gareevroman, llvm-commits Differential Revision: https://reviews.llvm.org/D37064 llvm-svn: 312110	2017-08-30 10:54:21 +00:00
Mandeep Singh Grang	e3bbb68b0c	[cfi] Fixed non-determinism in codegen due to DenseSet iteration order llvm-svn: 312098	2017-08-30 04:47:21 +00:00
Evgeniy Stepanov	7372b67063	[cfi] Avoid branch veneers in jump tables when possible. Summary: When jumptable encoding does not match target code encoding (arm vs thumb), a veneer is inserted by the linker. We can not avoid this in all cases, because entries within one jumptable must have the same encoding, but we can make it less common by selecting the jumptable encoding to match the majority of its targets. This change only covers FullLTO, and not ThinLTO. Reviewers: pcc Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37171 llvm-svn: 312054	2017-08-29 22:40:19 +00:00
Evgeniy Stepanov	4731ad81c7	[cfi] Build __cfi_check as Thumb when applicable. Summary: Cross-DSO CFI needs all __cfi_check exports to use the same encoding (ARM vs Thumb). Reviewers: pcc Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37243 llvm-svn: 312052	2017-08-29 22:29:15 +00:00
Matt Morehouse	ba2e61b357	Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer" This reverts r312026 due to bot breakage. llvm-svn: 312047	2017-08-29 21:56:56 +00:00
Wei Mi	ebb9327759	[LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition fails to find a loop invariant condition. This is wrong and it will disable loop unswitch for select. The patch fixes the bug. Differential Revision: https://reviews.llvm.org/D36985 llvm-svn: 312045	2017-08-29 21:45:11 +00:00
Benjamin Kramer	154411e0e7	[FunctionImport] Avoid unused variable warnings in Release builds Just skip the entire block in NDEBUG. No functionality change intended. llvm-svn: 312031	2017-08-29 20:24:39 +00:00
Alexey Bataev	978e2e4760	[SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores. Summary: If SimplifyCFG pass is able to merge conditional stores into single one, it loses the alignment. This may lead to incorrect codegen. Patch sets the alignment of the new instruction if it is set in the original one. Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36841 llvm-svn: 312030	2017-08-29 20:06:24 +00:00
Matt Morehouse	2ad8d948b2	[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. - Disable stack depth tracking on Mac. Reviewers: vitalybuka, kcc, george.karpenkov Reviewed By: kcc Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 312026	2017-08-29 19:48:12 +00:00
Craig Topper	4431bfe88c	[InstCombine] Support vector splats in transformZExtICmp This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code. One test didn't optimize properly in vector form due to another bug, so a TODO has been added. Differential Revision: https://reviews.llvm.org/D37253 llvm-svn: 312023	2017-08-29 18:58:13 +00:00
Teresa Johnson	2df7fc7991	[ThinLTO] Clean up stale alias import handling Summary: Remove some code that was no longer needed. The first FIXME is stale since we long ago started using the index to drive importing, rather than doing force importing based on linkage type. And now with r309278, we no longer import any aliases. Reviewers: dblaikie Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D37266 llvm-svn: 312019	2017-08-29 18:15:34 +00:00
Dehao Chen	efd007f6f4	Add null check for promoted direct call Summary: We originally assumed that in pgo-icp, the promoted direct call will never be null after stripping pointer casts. However, stripPointerCasts is so smart that it could possibly return the value of the function call if it knows that the return value is always an argument. In this case, the returned value cannot be cast to Instruction. In this patch, a null check is added to ensure that a null pointer will not be accessed. Reviewers: tejohnson, xur, davidxl, djasper Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D37252 llvm-svn: 312005	2017-08-29 15:28:12 +00:00
Sanjay Patel	674d2c23ea	[Instruction] add moveAfter() convenience function; NFCI As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(), but implemented using moveBefore() for symmetry/efficiency. Differential Revision: https://reviews.llvm.org/D37239 llvm-svn: 312001	2017-08-29 14:07:48 +00:00
Max Kazantsev	bb1d010872	[LSR] Fix Shadow IV in case of integer overflow When LSR processes code like int accumulator = 0; for (int i = 0; i < N; i++) { accumulator += i; use((double) accumulator); } It may decide to replace integer `accumulator` with a double Shadow IV to get rid of casts. The problem with that is that the `accumulator`'s value may overflow. Starting from this moment, the behavior of integer and double accumulators will differ. This patch strengthens the conditions under which the Shadow IV mechanism applies. We only allow it for IVs that are proven to be `AddRec`s with `nsw`/`nuw` flag. Differential Revision: https://reviews.llvm.org/D37209 llvm-svn: 311986	2017-08-29 07:32:20 +00:00
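The divergence described in the commit above can be reproduced outside the compiler with a small sketch. The helper name `shadowIVDiverges` is hypothetical, and the example uses an unsigned 32-bit accumulator so that the wraparound is well defined in C++ (signed overflow would be undefined behavior); it only illustrates why a floating-point shadow copy is unsafe without a no-wrap guarantee, and is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration: run an integer accumulator and a double "shadow"
// accumulator side by side and report whether they ever disagree.
bool shadowIVDiverges(std::uint32_t Start, std::uint32_t Step, int Iterations) {
  assert(Iterations >= 0 && "iteration count must be non-negative");
  std::uint32_t IntAcc = Start;                   // wraps modulo 2^32
  double ShadowAcc = static_cast<double>(Start);  // does not wrap
  for (int I = 0; I < Iterations; ++I) {
    IntAcc += Step;
    ShadowAcc += static_cast<double>(Step);
    // Once the integer wraps, the cast value no longer matches the shadow.
    if (static_cast<double>(IntAcc) != ShadowAcc)
      return true;
  }
  return false;
}
```

Starting near the top of the 32-bit range, the two accumulators disagree as soon as the integer one wraps. A recurrence that provably never wraps (the `nsw`/`nuw` condition named in the commit) keeps them in agreement.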
Craig Topper	5d6ddda92d	[InstCombine] Teach foldSelectICmpAndOr to handle vector splats This was pretty close to working already. While I was here I went ahead and passed the ICmpInst pointer from the caller instead of doing a dyn_cast that can never fail. Differential Revision: https://reviews.llvm.org/D37237 llvm-svn: 311960	2017-08-29 00:13:49 +00:00
Justin Bogner	f1a54a47b0	[sanitizer-coverage] Mark the guard and 8-bit counter arrays as used In r311742 we marked the PCs array as used so it wouldn't be dead stripped, but left the guard and 8-bit counters arrays alone since these are referenced by the coverage instrumentation. This doesn't quite work if we want the indices of the PCs array to match the other arrays though, since elements can still end up being dead and disappear. Instead, we mark all three of these arrays as used so that they'll be consistent with one another. llvm-svn: 311959	2017-08-29 00:11:05 +00:00
Justin Bogner	873a0746f1	[sanitizer-coverage] Return the array from CreatePCArray. NFC Be more consistent with CreateFunctionLocalArrayInSection in the API of CreatePCArray, and assign the member variable in the caller like we do for the guard and 8-bit counter arrays. This also tweaks the order of method declarations to match the order of definitions in the file. llvm-svn: 311955	2017-08-28 23:46:11 +00:00
Justin Bogner	be757de2b6	[sanitizer-coverage] Clean up trailing whitespace. NFC llvm-svn: 311954	2017-08-28 23:38:12 +00:00
Kamil Rytarowski	a9f404f813	Define NetBSD/amd64 ASAN Shadow Offset Summary: Catch up after compiler-rt changes and define kNetBSD_ShadowOffset64 as (1ULL << 46). Sponsored by <The NetBSD Foundation> Reviewers: kcc, joerg, filcab, vitalybuka, eugenis Reviewed By: eugenis Subscribers: llvm-commits, #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D37234 llvm-svn: 311941	2017-08-28 22:13:52 +00:00
Craig Topper	516e39cd38	[InstCombine] Teach select01 helper of foldSelectIntoOp to handle vector splats We were handling some vectors in foldSelectIntoOp, but not if the operand of the bin op was any kind of vector constant. This patch fixes it to treat vector splats the same as scalars. Differential Revision: https://reviews.llvm.org/D37232 llvm-svn: 311940	2017-08-28 22:00:27 +00:00
Davide Italiano	20cb7e887f	[LoopUnroll] Properly update loop structure in case of successful peeling. When peeling kicks in, it updates the loop preheader. Later, a successful full unroll of the loop needs to update a PHI which i-th argument comes from the loop preheader, so it'd better look at the correct block. Fixes PR33437. Differential Revision: https://reviews.llvm.org/D37153 llvm-svn: 311922	2017-08-28 20:29:33 +00:00
Davide Italiano	9a09ae448d	[LoopUnroll] Add a cl::opt to force peeling, for testing purposes. Will be used to test the patch proposed in D37153. llvm-svn: 311915	2017-08-28 19:50:55 +00:00
Taewook Oh	572f45a3c8	Create PHI node for the return value only when the return value has uses. Summary: Currently, a phi node is created in the normal destination to unify the return values from promoted calls and the original indirect call. This patch makes this phi node to be created only when the return value has uses. This patch is necessary to generate valid code, as compiler crashes with the attached test case without this patch. Without this patch, an illegal phi node that has no incoming value from `entry`/`catch` is created in `cleanup` block. I think existing implementation is good as far as there is at least one use of the original indirect call. `insertCallRetPHI` creates a new phi node in the normal destination block only when the original indirect call dominates its use and the normal destination block. Otherwise, `fixupPHINodeForNormalDest` will handle the unification of return values naturally without creating a new phi node. However, if there's no use, `insertCallRetPHI` still creates a new phi node even when the original indirect call does not dominate the normal destination block, because `getCallRetPHINode` returns false. Reviewers: xur, davidxl, danielcdh Reviewed By: xur Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37176 llvm-svn: 311906	2017-08-28 18:57:00 +00:00
Craig Topper	3763f0e00d	[InstCombine] Call hasNoSignedWrap instead of hasNoUnsignedWrap to get the NSW flag when handling Add in SimplifyDemandedUseBits. This is a typo from r311789. This should fix PR34349. llvm-svn: 311902	2017-08-28 18:44:28 +00:00
NAKAMURA Takumi	a1e97a77f5	Untabify. llvm-svn: 311875	2017-08-28 06:47:47 +00:00
Dehao Chen	191b24d3d2	revert r310985 which breaks for the following case: struct string { ~string(); }; void f2(); void f1(int) { f2(); } void run(int c) { string body; while (true) { if (c) f1(c); else f1(c); } } Will recommit once the issue is fixed. llvm-svn: 311864	2017-08-27 22:22:39 +00:00
Ayal Zaks	1f58dda4e4	[LV] Fix PR34248 - recommit D32871 after revert r311304 Original commit r311077 of D32871 was reverted in r311304 due to failures reported in PR34248. This recommit fixes PR34248 by restricting the packing of predicated scalars into vectors only when vectorizing, avoiding doing so when unrolling w/o vectorizing. Added a test derived from the reproducer of PR34248. llvm-svn: 311849	2017-08-27 12:55:46 +00:00
Davide Italiano	9bdccb37d5	[NewGVN] Use `auto` when the type is obvious NFCI. llvm-svn: 311838	2017-08-26 22:31:10 +00:00
Daniel Berlin	de269f4620	NewGVN: Fix PR33204 - We need to add memory users when we bypass memorydefs for loads, not just when we do it for stores. llvm-svn: 311829	2017-08-26 07:37:11 +00:00
Davide Italiano	a872519dbd	[Inliner] Only compute fully inline cost when remarks are enabled. Prior to this change (and after r311371), we computed it unconditionally, causing severe compile time regressions (in some cases, 5 to 10x). llvm-svn: 311804	2017-08-25 22:01:42 +00:00
Matt Morehouse	6ec7595b1e	Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer" This reverts r311801 due to a bot failure. llvm-svn: 311803	2017-08-25 22:01:21 +00:00
Matt Morehouse	f42bd31323	[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. Reviewers: vitalybuka, kcc Reviewed By: kcc Subscribers: cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 311801	2017-08-25 21:18:29 +00:00
Kostya Serebryany	d3e4b7e24a	[sanitizer-coverage] extend fsanitize-coverage=pc-table with flags for every PC llvm-svn: 311794	2017-08-25 19:29:47 +00:00
Craig Topper	35171e5d67	[InstCombine] Don't fall back to only calling computeKnownBits if the upper bit of Add/Sub is demanded. Just create an all 1s demanded mask and continue recursing like normal. The recursive calls should be able to handle an all 1s mask and do the right thing. The only time we should care about knowing whether the upper bit was demanded is when we need to know if we should clear the NSW/NUW flags. Now that we have a consistent path through the code for all cases, use KnownBits::computeForAddSub to compute the known bits at the end since we already have the LHS and RHS. My larger goal here is to move the code that turns add into xor if only 1 bit is demanded and no bits below it are non-zero from InstCombiner::OptAndOp to here. This will allow it to be more general instead of just looking for 'add' and 'and' with constant RHS. Differential Revision: https://reviews.llvm.org/D36486 llvm-svn: 311789	2017-08-25 18:39:40 +00:00
Florian Hahn	cd78345398	[LoopInterchange] Skip zext instructions when looking for induction var. Summary: SimplifyIndVar may introduce zext instructions to widen arguments of the loop exit check. They should not prevent us from splitting the loop at the induction variable, but maybe the check should be more conservative, e.g. making sure it only extends arguments used by a comparison? Reviewers: karthikthecool, mcrosier, mzolotukhin Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34879 llvm-svn: 311783	2017-08-25 16:52:29 +00:00
Amjad Aboud	22178dd33b	[InstCombine] Consider more cases where SimplifyDemandedUseBits does not convert AShr to LShr. There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit. Differential Revision: https://reviews.llvm.org/D36936 llvm-svn: 311773	2017-08-25 11:07:54 +00:00
Gor Nishanov	e29e94cf87	[coroutines] Add support for symmetric control transfer (musttail on coro.resumes followed by a suspend) Summary: Add musttail to any resume instructions that is immediately followed by a suspend (i.e. ret). We do this even in -O0 to support guaranteed tail call for symmetrical coroutine control transfer (C++ Coroutines TS extension). This transformation is done only in the resume part of the coroutine that has identical signature and calling convention as the coro.resume call. Reviewers: GorNishanov Reviewed By: GorNishanov Subscribers: EricWF, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D37125 llvm-svn: 311751	2017-08-25 02:25:10 +00:00
Justin Bogner	ad96ff1228	[sanitizer-coverage] Make sure pc-tables aren't dead stripped Add a reference to the PC array in llvm.used so that linkers that aggressively dead strip (like ld64) don't remove it. llvm-svn: 311742	2017-08-25 01:24:54 +00:00
Xinliang David Li	66531dd10a	[Profile] backward propagate profile info in JumpThreading Take-2 after fixing bugs in the original patch. Differential Revision: http://reviews.llvm.org/D36864 llvm-svn: 311727	2017-08-24 22:54:01 +00:00
Sanjay Patel	bb789381fc	[InstCombine] fix and enhance udiv/urem narrowing There are 3 small independent changes here: 1. Account for multiple uses in the pattern matching: avoid the transform if it increases the instruction count. 2. Add a missing fold for the case where the numerator is the constant: http://rise4fun.com/Alive/E2p 3. Enable all folds for vector types. There's still one more potential change - use "shouldChangeType()" to keep from transforming to an illegal integer type. Differential Revision: https://reviews.llvm.org/D36988 llvm-svn: 311726	2017-08-24 22:54:01 +00:00
Chad Rosier	f98335e0b0	[PartialInlining] Formatting. NFC. llvm-svn: 311702	2017-08-24 21:21:09 +00:00
Chad Rosier	4cb2e82774	[PartialInlining] Type. NFC. llvm-svn: 311699	2017-08-24 20:29:02 +00:00
Sanjay Patel	5d67d8916e	[BypassSlowDivision] move map helper code to header; NFC We can reuse this code with other div/rem transforms as shown in: https://reviews.llvm.org/D31037 https://bugs.llvm.org/show_bug.cgi?id=31028 llvm-svn: 311661	2017-08-24 14:43:33 +00:00
Mikael Holmen	7a99e33b8e	[Reassociate] Do not drop debug location if replacement is missing Summary: When reassociating an expression, do not drop the instruction's original debug location in case the replacement location is missing. The debug location must at least not be dropped for inlinable callsites of debug-info-bearing functions in debug-info-bearing functions. Failing to do so would result in an "inlinable function " "call in a function with debug info must have a !dbg location" error in the verifier. As preserving the original debug location is not expected to result in overly jumpy debug line information, it is preserved for all other cases too. This fixes PR34231: https://bugs.llvm.org/show_bug.cgi?id=34231 Original patch by David Stenberg Reviewers: davide, craig.topper, mcrosier, dblaikie, aprantl Reviewed By: davide, aprantl Subscribers: aprantl Differential Revision: https://reviews.llvm.org/D36865 llvm-svn: 311642	2017-08-24 09:05:00 +00:00
Daniel Berlin	f948603a15	NewGVN: We weren't properly simplifying selects with equal arguments due to a thinko. llvm-svn: 311626	2017-08-24 02:43:17 +00:00
Rong Xu	15848e5977	[PGO] Set edge weights for indirectbr instruction with profile counts Current PGO only annotates the edge weight for branch and switch instructions with profile counts. We should also annotate the indirectbr instruction as all the information is there. This patch enables the annotating for indirectbr instructions. Also uses this annotation in branch probability analysis. Differential Revision: https://reviews.llvm.org/D37074 llvm-svn: 311604	2017-08-23 21:36:02 +00:00
Hans Wennborg	66f6fc0a49	LowerAtomic: Don't skip optnone functions; atomic still need lowering (PR34020) The lowering isn't really an optimization, so optnone shouldn't make a difference. ARM relies on the pass running when using "-mthread-model single", because in that mode, it doesn't run AtomicExpand. See bug for more details. Differential Revision: https://reviews.llvm.org/D37040 llvm-svn: 311565	2017-08-23 15:43:28 +00:00
Gor Nishanov	2f55b958b1	[coroutines] CoroBegin from inner coroutines should be considered for spills Summary: If a coroutine outer calls another coroutine inner and the inner coroutine body is inlined into the outer, coro.begin from the inner coroutine should be considered for spilling if accessed across suspends. Prior to this change, coroutine frame building code was not considering any coro.begins for spilling. With this change, we only ignore coro.begin for the current coroutine, but, any coro.begins that were inlined into the current coroutine are eligible for spills. Fixes PR34267 Reviewers: GorNishanov Subscribers: qcolombet, llvm-commits, EricWF Differential Revision: https://reviews.llvm.org/D37062 llvm-svn: 311556	2017-08-23 14:47:52 +00:00
Chad Rosier	8db41e9dbd	[Reassociate] Don't canonicalize x + (-Constant * y) -> x - (Constant * y).. ..if the resulting subtract will be broken up later. This can cause us to get into an infinite loop. x + (-5.0 * y) -> x - (5.0 * y) ; Canonicalize neg const x - (5.0 * y) -> x + (0 - (5.0 * y)) ; Break up subtract x + (0 - (5.0 * y)) -> x + (-5.0 * y) ; Replace 0-X with X*-1. PR34078 llvm-svn: 311554	2017-08-23 14:10:06 +00:00
Davide Italiano	c78885818a	[InstCombine] Fold branches with irrelevant conditions to a constant. InstCombine folds instructions with irrelevant conditions to undef. This, as Nuno confirmed is a bug. (see https://bugs.llvm.org/show_bug.cgi?id=33409#c1 ) Given the original motivation for the change is that of removing an USE, we now fold to false instead (which reaches the same goal without undesired side effects). Fixes PR33409. Differential Revision: https://reviews.llvm.org/D36975 llvm-svn: 311540	2017-08-23 09:14:37 +00:00
Craig Topper	a85f86225a	[InstCombine] Remove unused argument. NFC llvm-svn: 311529	2017-08-23 05:46:09 +00:00
Craig Topper	a94069fb4c	[InstCombine] Replace a simple matcher with a plain old dyn_cast. NFC llvm-svn: 311528	2017-08-23 05:46:08 +00:00
Craig Topper	524c44f74e	[InstCombine] Remove an unnecessary dyn_cast to Instruction and a switch over two opcodes. Just dyn_cast to the specific instruction classes individually. NFC Change the helper methods to take the more specific class as well. llvm-svn: 311527	2017-08-23 05:46:07 +00:00
Craig Topper	ec4b82571c	[InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast Looks like for 'and' and 'or' we end up performing at least some of the transformations this is blocking in a roundabout way anyway. For 'and sext(cmp1), sext(cmp2)' we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'. With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok. Differential Revision: https://reviews.llvm.org/D36213 llvm-svn: 311508	2017-08-22 23:40:15 +00:00
Peter Collingbourne	001052a067	WholeProgramDevirt: Create bitcast to i8* at each virtual call site. We can't reuse the llvm.assume instruction's bitcast because it may not dominate every user of the vtable pointer. Differential Revision: https://reviews.llvm.org/D36994 llvm-svn: 311491	2017-08-22 21:41:19 +00:00
Matt Morehouse	b1fa8255db	[SanitizerCoverage] Optimize stack-depth instrumentation. Summary: Use the initialexec TLS type and eliminate calls to the TLS wrapper. Fixes the sanitizer-x86_64-linux-fuzzer bot failure. Reviewers: vitalybuka, kcc Reviewed By: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37026 llvm-svn: 311490	2017-08-22 21:28:29 +00:00
Jakub Kuderski	2724d45325	[ADCE][Dominators] Reapply: Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. This reapplies the original patch r311057 that was reverted in r311381. The previous version wasn't using the batch update API for updating dominators, which in very rare cases caused assertion failures. This also fixes PR34258. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311467	2017-08-22 16:30:21 +00:00
Craig Topper	775ffcc8f5	[InstCombine] Move the checks for pointer types in getMaskedTypeForICmpPair earlier in the function I don't think there's any reason to have them scattered about and on all 4 operands. We already have an early check that both compares must be the same type. And within a given compare the LHS and RHS must have the same type. Beyond that I don't think there's any way this function returns anything valid for pointer types. So let's just return early and be done with it. Differential Revision: https://reviews.llvm.org/D36561 llvm-svn: 311383	2017-08-21 21:00:45 +00:00
Sanjoy Das	08a38fe71e	Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators" Summary: This partially reverts commit r311057 since it breaks ADCE. See PR34258. Reviewers: kuhar Subscribers: mcrosier, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D36979 llvm-svn: 311381	2017-08-21 20:39:18 +00:00
Haicheng Wu	0812c5bea3	[InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes. Currently, the inline cost model will bail once the inline cost exceeds the inline threshold in order to avoid unnecessary compile-time. However, when debugging it is useful to compute the full cost, so this command line option is added to override the default behavior. I took over this work from Chad Rosier (mcrosier@codeaurora.org). Differential Revision: https://reviews.llvm.org/D35850 llvm-svn: 311371	2017-08-21 20:00:09 +00:00
Sanjay Patel	82ec872990	[LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try) The 1st try was reverted because it could inf-loop by creating a dead instruction. Fixed that to not happen and added a test case to verify. Original commit message: Try to fold: memcmp(X, C, ConstantLength) == 0 --> load X == *C Without this change, we're unnecessarily checking the alignment of the constant data, so we miss the transform in the first 2 tests in the patch. I noted this shortcoming of LibCallSimplifier in one of the recent CGP memcmp expansion patches. This doesn't help the example in: https://bugs.llvm.org/show_bug.cgi?id=34032#c13 ...directly, but it's worth short-circuiting more of these simple cases since we're already trying to do that. The benefit of transforming to load+cmp is that existing IR analysis/transforms may further simplify that code. For example, if the load of the variable is common to multiple memcmp calls, CSE can remove the duplicate instructions. Differential Revision: https://reviews.llvm.org/D36922 llvm-svn: 311366	2017-08-21 19:13:14 +00:00
Craig Topper	74177e1ed1	[InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely. Differential Revision: https://reviews.llvm.org/D36498 llvm-svn: 311362	2017-08-21 19:02:06 +00:00
Sam Elliott	e963c89d11	Migrate WholeProgramDevirt to new Optimization Remark API Summary: This is an attempt to move WholeProgramDevirt to the new remark API. https://bugs.llvm.org/show_bug.cgi?id=33793 Reviewers: anemet Reviewed By: anemet Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D36943 llvm-svn: 311352	2017-08-21 16:57:21 +00:00
Sam Elliott	e604b563ea	Emit only A Single Opt Remark When Inlining Summary: This updates the Inliner to only add a single Optimization Remark when Inlining, rather than an Analysis Remark and an Optimization Remark. Fixes https://bugs.llvm.org/show_bug.cgi?id=33786 Reviewers: anemet, davidxl, chandlerc Reviewed By: anemet Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D36054 llvm-svn: 311349	2017-08-21 16:45:47 +00:00
Craig Topper	cc255bcd77	[InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions Summary: If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear. Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36944 llvm-svn: 311343	2017-08-21 16:04:11 +00:00
Xinliang David Li	d2838fc4b9	Revert 311208, 311209 llvm-svn: 311341	2017-08-21 16:00:38 +00:00
Sanjay Patel	707f786cc5	revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments We're getting lots of compile-timeout bot failures like: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux llvm-svn: 311340	2017-08-21 15:16:25 +00:00
Sanjay Patel	7756edfa93	[LibCallSimplifier] try harder to fold memcmp with constant arguments Try to fold: memcmp(X, C, ConstantLength) == 0 --> load X == *C Without this change, we're unnecessarily checking the alignment of the constant data, so we miss the transform in the first 2 tests in the patch. I noted this shortcoming of LibCallSimplifier in one of the recent CGP memcmp expansion patches. This doesn't help the example in: https://bugs.llvm.org/show_bug.cgi?id=34032#c13 ...directly, but it's worth short-circuiting more of these simple cases since we're already trying to do that. The benefit of transforming to load+cmp is that existing IR analysis/transforms may further simplify that code. For example, if the load of the variable is common to multiple memcmp calls, CSE can remove the duplicate instructions. Differential Revision: https://reviews.llvm.org/D36922 llvm-svn: 311333	2017-08-21 13:55:49 +00:00
Chandler Carruth	bd6dc14230	Revert r311077: [LV] Using VPlan ... This causes LLVM to assert fail on PPC64 and crash / infloop in other cases. Filed http://llvm.org/PR34248 with reproducer attached. llvm-svn: 311304	2017-08-20 23:17:11 +00:00
Benjamin Kramer	2e5be849cc	[Mem2Reg] Modernize code a bit. No functionality change intended. llvm-svn: 311290	2017-08-20 14:34:44 +00:00
Benjamin Kramer	49a49fe816	Move helper classes into anonymous namespaces. No functionality change intended. llvm-svn: 311288	2017-08-20 13:03:48 +00:00
Aditya Kumar	a525fffd07	[Loop Vectorize] Added a separate metadata Added a separate metadata to indicate when the loop has already been vectorized instead of setting width and count to 1. Patch written by Divya Shanmughan and Aditya Kumar Differential Revision: https://reviews.llvm.org/D36220 llvm-svn: 311281	2017-08-20 10:32:41 +00:00
Sam Elliott	7fe0aaa140	Revert "Emit only A Single Opt Remark When Inlining" Reverting due to clang build failure llvm-svn: 311274	2017-08-20 06:55:10 +00:00
Sam Elliott	785dd75369	Emit only A Single Opt Remark When Inlining Summary: This updates the Inliner to only add a single Optimization Remark when Inlining, rather than an Analysis Remark and an Optimization Remark. Fixes https://bugs.llvm.org/show_bug.cgi?id=33786 Reviewers: anemet, davidxl, chandlerc Reviewed By: anemet Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D36054 llvm-svn: 311273	2017-08-20 06:43:34 +00:00
Teresa Johnson	73305f82e9	[ThinLTO] Fix ThinLTO crash Summary: Follow up to fix in r311023, which fixed the case where the combined index is written to disk. The same samplePGO logic exists for the in-memory index when computing imports, so we need to filter out GlobalVariable summaries there too. Reviewers: davidxl Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D36919 llvm-svn: 311254	2017-08-19 18:04:25 +00:00
Chandler Carruth	4f3aa29a46	[Inliner] Fix a nasty bug when inlining a non-recursive trace of a function into itself. We tried to fix this before in r306495 but that got reverted as the assert was actually hit. This fixes the original bug (which we seem to have lost track of with the revert) by blocking a second remapping when the function being inlined is also the caller and the remapping could succeed but erroneously. The included test case would actually load from an inlined copy of the alloca before this change, failing to load the stored value and miscompiling. Many thanks to Richard Smith for diagnosing a user miscompile to this bug, and to Kyle for the first attempt and initial analysis and David Li for remembering the issue and how to fix it and suggesting the patch. I'm just stitching it together and landing it. =] llvm-svn: 311229	2017-08-19 06:56:11 +00:00
Chandler Carruth	1f8212597d	[SLP] Fix an unused variable warning in non-asserts builds. llvm-svn: 311227	2017-08-19 05:06:23 +00:00
Dinar Temirbulatov	7aff8cfa55	[SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI. llvm-svn: 311223	2017-08-19 03:15:07 +00:00
Dinar Temirbulatov	e3ce1b455e	[SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions. Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36766 llvm-svn: 311221	2017-08-19 02:54:20 +00:00
Xinliang David Li	0d07f9d68a	Fix comment /NFC llvm-svn: 311209	2017-08-18 23:08:50 +00:00
Xinliang David Li	709ffe178e	[Profile] backward propagate profile info in JumpThreading Differential Revision: http://reviews.llvm.org/D36864 llvm-svn: 311208	2017-08-18 23:00:05 +00:00
Max Kazantsev	0aaf8c16ac	[IRCE] Fix buggy behavior in Clamp Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations. In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned predicates, but we did not check this for `S` which can in theory be negative. This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative, and it adds a test where such situation may lead to incorrect conditions calculation. Differential Revision: https://reviews.llvm.org/D36873 llvm-svn: 311205	2017-08-18 22:50:29 +00:00
Ana Pazos	6210f27dfc	[PGO] Fixed assertion due to mismatched memcpy size type. Summary: Memcpy intrinsics have size argument of any integer type, like i32 or i64. Fixed size type along with its value when cloning the intrinsic. Reviewers: davidxl, xur Reviewed By: davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36844 llvm-svn: 311188	2017-08-18 19:17:08 +00:00
Matt Morehouse	5c7fc76983	[SanitizerCoverage] Add stack depth tracing instrumentation. Summary: Augment SanitizerCoverage to insert maximum stack depth tracing for use by libFuzzer. The new instrumentation is enabled by the flag -fsanitize-coverage=stack-depth and is compatible with the existing trace-pc-guard coverage. The user must also declare the following global variable in their code: thread_local uintptr_t __sancov_lowest_stack https://bugs.llvm.org/show_bug.cgi?id=33857 Reviewers: vitalybuka, kcc Reviewed By: vitalybuka Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D36839 llvm-svn: 311186	2017-08-18 18:43:30 +00:00
Jakub Kuderski	e608ef7635	[LoopRotate][Dominators] Use the incremental API to update DomTree Summary: This patch teaches LoopRotate to use the new incremental API to update the DominatorTree. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, davide Subscribers: hiraditya, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D35581 llvm-svn: 311125	2017-08-17 21:48:19 +00:00
Jakub Kuderski	e35a449140	[Dominators] Teach LoopUnswitch to use the incremental API Summary: This patch makes LoopUnswitch use new incremental API for updating dominators. It also updates SplitCriticalEdge, as it is called in LoopUnswitch. There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch. Reviewers: dberlin, davide, sanjoy, grosser, chandlerc Reviewed By: davide, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35528 llvm-svn: 311093	2017-08-17 16:45:35 +00:00
Simon Dardis	b5205c69d2	[dfsan] Add explicit zero extensions for shadow parameters in function wrappers. In the case where dfsan provides a custom wrapper for a function, shadow parameters are added for each parameter of the function. These parameters are i16s. For targets which do not consider this a legal type, the lack of sign extension information would cause LLVM to generate anyexts around their usage with phi variables and calling convention logic. Address this by introducing zero exts for each shadow parameter. Reviewers: pcc, slthakur Differential Revision: https://reviews.llvm.org/D33349 llvm-svn: 311087	2017-08-17 14:14:25 +00:00
Ayal Zaks	6627883369	[LV] Using VPlan to model the vectorized code and drive its transformation VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This patch introduces the VPlan model into LV and uses it to represent the vectorized code and drive the generation of vectorized IR. In this patch VPlan models the vectorized loop body: the vectorized control-flow is represented using VPlan's Hierarchical CFG, with predication refactored from being a post-vectorization-step into a vectorization planning step modeling if-then VPRegionBlocks, and generating code inline with non-predicated code. The vectorized code within each VPBasicBlock is represented as a sequence of Recipes, each responsible for modelling and generating a sequence of IR instructions. To keep the size of this commit manageable the Recipes in this patch are coarse-grained and capture large chunks of LV's code-generation logic. The constructed VPlans are dumped in dot format under -debug. This commit retains current vectorizer output, except for minor instruction reorderings; see associated modifications to lit tests. For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst and its references. Authors: Gil Rapaport and Ayal Zaks Differential Revision: https://reviews.llvm.org/D32871 llvm-svn: 311077	2017-08-17 09:29:59 +00:00
Jakub Kuderski	fd5c5c9144	Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. The patch was originally committed in r311039 and reverted in r311049. This revision fixes the problem with not adding a dependency on the DominatorTreeWrapperPass for the LegacyPassManager. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311057	2017-08-17 01:41:49 +00:00
Amjad Aboud	86111c6696	[InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount) Differential Revision: https://reviews.llvm.org/D36784 llvm-svn: 311050	2017-08-16 22:42:38 +00:00
Jakub Kuderski	cbcffb173c	Revert "[ADCE][Dominators] Teach ADCE to preserve dominators" This reverts commit r311039. The patch caused the `test/Bindings/OCaml/Output/scalar_opts.ml` to fail. llvm-svn: 311049	2017-08-16 22:10:53 +00:00
Craig Topper	882f29630b	[InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors This also uses decomposeBitTestICmp to decode the compare. Differential Revision: https://reviews.llvm.org/D36781 llvm-svn: 311044	2017-08-16 21:52:07 +00:00
Jakub Kuderski	4552e9de9f	[ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311039	2017-08-16 20:50:23 +00:00
Geoff Berry	40549ad1ac	[LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution Summary: Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving ScalarEvolution since they do not alter loop structure and should not alter any SCEV values (though LoopDataPrefetch may introduce new instructions that won't have cached SCEV values yet). This can result in slight code differences, mainly w.r.t. nsw/nuw flags on SCEVs, since these are computed somewhat lazily when a zext/sext instruction is encountered. As a result, passes after the modified passes may see SCEVs with more nsw/nuw flags present. Reviewers: sanjoy, anemet Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36716 llvm-svn: 311032	2017-08-16 19:03:16 +00:00
Hal Finkel	9e54b7093a	[BDCE] Don't check demanded bits on unsized types To clear assumptions that are potentially invalid after trivialization, we need to walk the use/def chain. Normally, the only way to reach an instruction with an unsized type is via an instruction that has side effects (or otherwise will demand its input bits). That would stop the walk. However, if we have a readnone function that returns an unsized type (e.g., void), we must avoid asking for the demanded bits of the function call's return value. A void-returning readnone function is always dead (and so we can stop walking the use/def chain here), but the check is necessary to avoid asserting. Fixes PR34211. llvm-svn: 311014	2017-08-16 16:09:22 +00:00
Dehao Chen	84d412035a	Merge debug info when hoisting then-else code to if. Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached. Reviewers: aprantl, probinson, dblaikie, echristo, loladiro Reviewed By: aprantl Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D36778 llvm-svn: 310985	2017-08-16 01:55:26 +00:00
Craig Topper	0a1a276d91	[InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount We were only allowing ConstantInt before. This patch allows splat of ConstantInt too. Differential Revision: https://reviews.llvm.org/D36763 llvm-svn: 310970	2017-08-15 22:48:41 +00:00
Amjad Aboud	0464c5d958	[InstCombine] Added support for (X >>s C) << C --> X & (-1 << C) Differential Revision: https://reviews.llvm.org/D36743 llvm-svn: 310949	2017-08-15 19:33:14 +00:00
Sanjay Patel	f69b7d5c93	[InstCombine] sink sext after ashr Narrow ops are better for bit-tracking, and in the case of vectors, may enable better codegen. As the trunc test shows, this can allow follow-on simplifications. There's a block of code in visitTrunc that deals with shifted ops with FIXME comments. It may be possible to remove some of that now, but I want to make sure there are no problems with this step first. http://rise4fun.com/Alive/Y3a Name: hoist_ashr_ahead_of_sext_1 %s = sext i8 %x to i32 %r = ashr i32 %s, 3 ; shift value is < than source bit width => %a = ashr i8 %x, 3 %r = sext i8 %a to i32 Name: hoist_ashr_ahead_of_sext_2 %s = sext i8 %x to i32 %r = ashr i32 %s, 8 ; shift value is >= than source bit width => %a = ashr i8 %x, 7 ; so clamp this shift value %r = sext i8 %a to i32 Name: junc_the_trunc %a = sext i16 %v to i32 %s = ashr i32 %a, 18 %t = trunc i32 %s to i16 => %t = ashr i16 %v, 15 llvm-svn: 310942	2017-08-15 18:25:52 +00:00
Jakub Kuderski	638c085d07	[Dominators] Include infinite loops in PostDominatorTree Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG. This patch makes the following assumptions: - A sequence of updates should produce the same tree as recalculating it. - Any sequence of the same updates should lead to the same tree. - Siblings and roots are unordered. The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between a freshly built tree and an updated one doesn't matter much, as there are many correct ways to pick roots in infinite loops, and relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable regions are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call: ``` # functions: 52283 # samples: 337609 # reverse unreachable BBs: 216022 # BBs: 247840796 Percent reverse-unreachable: 0.08716159869015269 % Max(PercRevUnreachable) in a function: 87.58620689655172 % # > 25 % samples: 471 ( 0.1395104988314885 % samples ) ... 
in 145 ( 0.27733680163724345 % functions ) ``` Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940	2017-08-15 18:14:57 +00:00
Rui Ueyama	4a17955030	Fix -Wunused-lambda-capture for Release build. `I` and `this` are used only in assert or DEBUG, so they are unused in Release build. llvm-svn: 310934	2017-08-15 17:39:35 +00:00
Ayal Zaks	25e2800e20	[LV] Minor savings to Sink casts to unravel first order recurrence Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it is not needed. Differential Revision: https://reviews.llvm.org/D36408 llvm-svn: 310910	2017-08-15 08:32:59 +00:00
Dinar Temirbulatov	9e43d6e7b2	[SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra parameter VL0, replace E->Scalars[0] to VL0, NFCI. llvm-svn: 310904	2017-08-15 00:31:49 +00:00
Dehao Chen	45847d3612	Add missing dependency in ICP. (NFC) llvm-svn: 310896	2017-08-14 23:25:21 +00:00
Reid Kleckner	18728822d2	Remove checks for debug info intrinsics in use lists, NFC These haven't done anything since debug info intrinsics stopped appearing in Value use lists in 2014. llvm-svn: 310892	2017-08-14 22:10:54 +00:00
Craig Topper	0aa3a19512	Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889	2017-08-14 21:39:51 +00:00
Andrew Kaylor	53a5fbb45f	Add strictfp attribute to prevent unwanted optimizations of libm calls Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885	2017-08-14 21:15:13 +00:00
Craig Topper	69fa8e0d99	Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873	2017-08-14 19:09:32 +00:00
Craig Topper	2f0b450666	[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869	2017-08-14 18:49:42 +00:00
Dinar Temirbulatov	7b78f5e52d	[SLPVectorizer] Schedule bundle with different opcodes. This change lets us schedule a bundle with different opcodes in it, for example: [ load, add, add, add ] Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36518 llvm-svn: 310847	2017-08-14 15:40:16 +00:00
Sanjay Patel	a1067d9bae	[BDCE] reduce scope of an assert (PR34179) The assert was added with r310779 and is usually correct, but as the test shows, not always. The 'volatile' on the load is needed to expose the faulty path because without it, DemandedBits would return that the load is just dead rather than not demanded, and so we wouldn't hit the bogus assert. Also, since the lambda is just a single-line now, get rid of it and inline the DB.isAllOnesValue() calls. This should fix (prevent execution of a faulty assert): https://bugs.llvm.org/show_bug.cgi?id=34179 llvm-svn: 310842	2017-08-14 15:13:46 +00:00
Sam Parker	718c8a6a2a	[LoopUnroll] Enable option to peel remainder loop On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count; keeping a trip count of 2 reduces the overhead and increases the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance losses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824	2017-08-14 09:25:26 +00:00
Craig Topper	f720099007	[InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants Summary: These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code. Next step is to use m_APInt instead of ConstantInt. Reviewers: spatel, efriedma, davide, majnemer Reviewed By: spatel Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D36439 llvm-svn: 310806	2017-08-14 00:04:21 +00:00
Sanjay Patel	fe346f9f5b	[BDCE] clear poison generators after turning a value into zero (PR33695, PR34037) nsw, nuw, and exact carry implicit assumptions about their operands, so we need to clear those after trivializing a value. We decided there was no danger for llvm.assume or metadata, so there's just a comment about that. This fixes miscompiles as shown in: https://bugs.llvm.org/show_bug.cgi?id=33695 https://bugs.llvm.org/show_bug.cgi?id=34037 Differential Revision: https://reviews.llvm.org/D36592 llvm-svn: 310779	2017-08-12 16:41:08 +00:00
Eli Friedman	51cf2604b6	[OptDiag] Updating Remarks in SampleProfile Updating remark API to newer OptimizationDiagnosticInfo API. This allows remarks to show up in diagnostic yaml file, and enables use of opt-viewer tool. The remarks at L505 and L751 do not display hotness information, most likely due to profile information not being propagated yet. Unsure if this is the desired outcome. Patch by Tarun Rajendran. Differential Revision: https://reviews.llvm.org/D36127 llvm-svn: 310763	2017-08-11 21:12:04 +00:00
Xinliang David Li	24524f314c	Fix typo /NFC llvm-svn: 310737	2017-08-11 17:49:20 +00:00
Craig Topper	9a6110b2d3	[InstCombine] Make (X|C1)^C2 -> X^(C1^C2) iff X&~C1 == 0 work for splat vectors This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2). Differential Revision: https://reviews.llvm.org/D36505 llvm-svn: 310658	2017-08-10 20:35:34 +00:00
Craig Topper	57b4d8646b	[InstCombine] Fix a crash in getSelectCondition if we happen to have two inverse vectors of i1 constants. We used to try to truncate the constant vector to vXi1, but if it's already i1 this would fail. Instead we now use IRBuilder::getZExtOrTrunc which should check the type and only create a trunc if needed. I believe this should trigger constant folding in the IRBuilder and ultimately do the same thing just with the additional type check. llvm-svn: 310639	2017-08-10 17:48:14 +00:00
Craig Topper	cd13ebca5f	[InstCombine] Add a DEBUG_COUNTER to InstCombine to limit how many instructions are visited for debug Sometimes it would be nice to stop InstCombine mid way through its combining to see the current IR. By using a debug counter we can place an upper limit on how many instructions to process. This will also allow skipping the first X combines, but that has the potential to change later combines since earlier canonicalizations might have been skipped. Differential Revision: https://reviews.llvm.org/D36553 llvm-svn: 310638	2017-08-10 17:48:12 +00:00
Craig Topper	9cd976d041	[DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use. This makes it consistent with STATISTIC, which it will often appear near. While there, move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable so the namespace is unnecessary. llvm-svn: 310637	2017-08-10 17:48:11 +00:00
Alexander Potapenko	5241081532	[sanitizer-coverage] Change cmp instrumentation to distinguish const operands This implementation of SanitizerCoverage instrumentation inserts different callbacks depending on constantness of operands: 1. If both operands are non-const, then a usual __sanitizer_cov_trace_cmp[1248] call is inserted. 2. If exactly one operand is const, then a __sanitizer_cov_trace_const_cmp[1248] call is inserted. The first argument of the call is always the constant one. 3. If both operands are const, then no callback is inserted. This separation is useful in fuzzing when tasks like "find one operand of the comparison in input arguments and replace it with the other one" have to be done. The new instrumentation allows us to not waste time on searching for the constant operands in the input. Patch by Victor Chibotaru. llvm-svn: 310600	2017-08-10 15:00:13 +00:00
Chad Rosier	a5508e3119	[NewGVN] Add CL option to control the generation of phi-of-ops (disable by default). Differential Revision: https://reviews.llvm.org/D36478539 llvm-svn: 310594	2017-08-10 14:12:57 +00:00
Sanjay Patel	c50e55d0e6	[InstCombine] narrow rotate left/right patterns to eliminate zext/trunc (PR34046) I couldn't find any smaller folds to help the cases in: https://bugs.llvm.org/show_bug.cgi?id=34046 after: rL310141 The truncated rotate-by-variable patterns elude all of the existing transforms because of multiple uses and knowledge about demanded bits and knownbits that doesn't exist without the whole pattern. So we need an unfortunately large pattern match. But by simplifying this pattern in IR, the backend is already able to generate rolb/rolw/rorb/rorw for x86 using its existing rotate matching logic (although there is a likely extraneous 'and' of the rotate amount). Note that rotate-by-constant doesn't have this problem - smaller folds should already produce the narrow IR ops. Differential Revision: https://reviews.llvm.org/D36395 llvm-svn: 310509	2017-08-09 18:37:41 +00:00
Matt Morehouse	49e5acab33	[asan] Fix instruction emission ordering with dynamic shadow. Summary: Instrumentation to copy byval arguments is now correctly inserted after the dynamic shadow base is loaded. Reviewers: vitalybuka, eugenis Reviewed By: vitalybuka Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36533 llvm-svn: 310503	2017-08-09 17:59:43 +00:00
Jonas Paulsson	6228aeda65	[LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess() isLegalAddressingMode() has recently gained the extra optional Instruction* parameter, and therefore it can now do the job that previously only isFoldableMemAccess() could do. The SystemZ implementation of isLegalAddressingMode() has gained the functionality of checking for offsets, which used to be done with isFoldableMemAccess(). The isFoldableMemAccess() hook has been removed everywhere. Review: Quentin Colombet, Ulrich Weigand https://reviews.llvm.org/D35933 llvm-svn: 310463	2017-08-09 11:28:01 +00:00
Jonas Paulsson	5052771af3	[LoopStrengthReduce] Don't neglect the Fixup.Offset in isAMCompletelyFolded(). In the recursive call to isAMCompletelyFolded(), the passed offset should be the sum of F.BaseOffset and Fixup.Offset. Review: Quentin Colombet. llvm-svn: 310462	2017-08-09 11:27:46 +00:00
Davide Italiano	c163fac184	[GlobalOpt] Switch an explicit loop to llvm::all_of(). NFCI. llvm-svn: 310453	2017-08-09 09:23:29 +00:00
Craig Topper	5706c01c0b	[InstCombine] Use regular dyn_cast instead of a matcher for a simple case. NFC llvm-svn: 310446	2017-08-09 06:17:48 +00:00
Wei Mi	bb9106ac4b	[GVN] Remove stale entries in phitranslate cache when new phi is generated for PRE When a new phi is generated for scalarpre of an expression, the phiTranslate cache will become stale: Before PRE, the candidate expression must not be available in a predecessor block, and phitranslate will cache the information. After PRE, the expression will become available in all predecessor blocks, so the related entries in the phiTranslate cache become stale. The patch will simply remove the stale entries so phiTranslate can be recomputed next time. The stale entries in the phitranslate cache will not affect correctness but will cause missed PRE opportunities for later instructions. Differential Revision: https://reviews.llvm.org/D36124 llvm-svn: 310421	2017-08-08 21:40:14 +00:00
Dehao Chen	34cfcb29aa	Make ICP use PSI to check for hotness. Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted. Reviewers: davidxl, tejohnson, eraman Reviewed By: davidxl Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36341 llvm-svn: 310416	2017-08-08 20:57:33 +00:00
Craig Topper	364359e4fc	[InstCombine] Support pulling left shifts through a subtract with constant LHS We already support pulling through an add with constant RHS. We can do the same for subtract. Differential Revision: https://reviews.llvm.org/D36443 llvm-svn: 310407	2017-08-08 20:14:11 +00:00
Chad Rosier	4d852597f8	[NewGVN] Use a cast instead of a dyn_cast. Differential Revision: https://reviews.llvm.org/D36478 llvm-svn: 310397	2017-08-08 18:41:49 +00:00
Anna Thomas	9b6e12f3dc	[LoopVectorize] Fix assertion failure in Fcmp vectorization Summary: When vectorizing fcmps we can trip on incorrect cast assertion when setting the FastMathFlags after generating the vectorized FCmp. This can happen if the FCmp can be folded to true or false directly. The fix here is to set the FastMathFlag using the FastMathFlagBuilder before creating the FCmp Instruction. This is what's done by other optimizations such as InstCombine. Added a test case which trips on cast assertion without this patch. Reviewers: Ayal, mssimpso, mkuper, gilr Reviewed by: Ayal, mssimpso Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D36244 llvm-svn: 310389	2017-08-08 18:07:44 +00:00
Craig Topper	8e351e9018	[InstCombine] Cast to BinaryOperator earlier in foldSelectIntoOp to simplify the code. We no longer need the explicit operand count check or the later dynamic cast. llvm-svn: 310339	2017-08-08 06:19:24 +00:00
Chandler Carruth	7c888dca46	[PM] Fix new LoopUnroll function pass by invalidating loop analysis results when a loop is completely removed. This is very hard to manifest as a visible bug. You need to arrange for there to be a subsequent allocation of a 'Loop' object which gets the exact same address as the one which the unroll deleted, and you need the LoopAccessAnalysis results to be significant in the way that they're stale. And you need a million other things to align. But when it does, you get a deeply mysterious crash due to actually finding a stale analysis result. This fixes the issue and tests for it by directly checking we successfully invalidate things. I have not been able to get any test case to reliably trigger this. Changes to LLVM itself caused the only test case I ever had to cease to crash. I've looked pretty extensively at less brittle ways of fixing this and they are actually very, very hard to do. This is a somewhat strange and unusual case as we have a pass which is deleting an IR unit, but is not running within that IR unit's pass framework (which is what handles this cleanly for the normal loop unroll). And where there isn't a definitive way to clear all of the stale cache entries. And where the pass is updating the core analysis that provides the IR units! For example, we don't have any of these problems with Function analyses because it is easy to clear out function analyses when the functions themselves may have been deleted -- we clear an entire module's worth! But that is too heavy of a hammer down here in the LoopAnalysisManager layer. A better long-term solution IMO is to require that AnalysisManager's make their keys durable to this kind of thing. 
Specifically, when caching an analysis for one IR unit that is conceptually "owned" by a higher level IR unit, the AnalysisManager should incorporate this into its data structures so that we can reliably clear these results without having to teach each and every pass to do so manually as we do here. But that is a change for another day as it will be a fairly invasive change to the AnalysisManager infrastructure. Until then, this fortunately seems to be quite rare. llvm-svn: 310333	2017-08-08 02:24:20 +00:00
Evgeny Stupachenko	c675290680	Reapply fix PR23384 (part 3 of 3) r304824 (was reverted in r305720). The root cause of reverting was fixed - PR33514. Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 310289	2017-08-07 19:56:34 +00:00
Aaron Ballman	428f0fe910	Removing an unused variable that was missed with the refactoring in r310272; NFC. llvm-svn: 310285	2017-08-07 19:26:17 +00:00
Craig Topper	7091a743b4	[InstCombine] Support (X | C1) & C2 --> (X & C2^(C1&C2)) | (C1&C2) for vector splats Note the original code I deleted incorrectly listed this as (X | C1) & C2 --> (X & C2^(C1&C2)) | C1 Which is only valid if C1 is a subset of C2. This relied on SimplifyDemandedBits to remove any extra bits from C1 before we got to that code. My new implementation avoids relying on that behavior so that it can be naively verified with alive. Differential Revision: https://reviews.llvm.org/D36384 llvm-svn: 310272	2017-08-07 18:10:39 +00:00
Alexey Bataev	9581b42589	[SLP] General improvements of SLP vectorization process. Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does: 1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310260	2017-08-07 15:25:49 +00:00
Alexey Bataev	53d523c9eb	Revert "[SLP] General improvements of SLP vectorization process." This reverts commit r310255. llvm-svn: 310257	2017-08-07 14:51:52 +00:00
Alexey Bataev	faace8f1f1	[SLP] General improvements of SLP vectorization process. Summary: Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does: 1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310255	2017-08-07 14:03:17 +00:00
Vitaly Buka	5d432ec929	[asan] Fix asan dynamic shadow check before copyArgsPassedByValToAllocas llvm-svn: 310242	2017-08-07 07:35:33 +00:00
Vitaly Buka	629047de8e	[asan] Disable checking of arguments passed by value for --asan-force-dynamic-shadow Fails with "Instruction does not dominate all uses!" llvm-svn: 310241	2017-08-07 07:12:34 +00:00
Davide Italiano	b53b075bb1	[Reassociate] Use a range loop for clarity. NFCI. While here, rename `i` to `Rank` as the latter is more self-explanatory (and this code also uses `I` two lines below to identify an Instruction). llvm-svn: 310238	2017-08-07 01:57:21 +00:00
Davide Italiano	a5cdc22e70	[Reassociate] Try to bail out early when canonicalizing. This commit rearranges the checks to avoid calls to getRank() when not needed (e.g. when RHS == LHS). llvm-svn: 310237	2017-08-07 01:49:09 +00:00
Craig Topper	576fb91aef	[InstCombine] Remove shift handling from OptAndOp. Summary: This is all handled by SimplifyDemandedBits. Reviewers: spatel, davide Reviewed By: davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D36382 llvm-svn: 310234	2017-08-06 23:30:49 +00:00
Craig Topper	a1693a2ed3	[InstCombine] Support (X ^ C1) & C2 --> (X & C2) ^ (C1&C2) for vector splats. llvm-svn: 310233	2017-08-06 23:11:49 +00:00
Craig Topper	9cbdbefd0f	[InstCombine] Support '(C - X) ^ signmask -> (C + signmask - X)' and '(X + C) ^ signmask -> (X + C + signmask)' for vector splats. llvm-svn: 310232	2017-08-06 22:17:21 +00:00
Craig Topper	b5bf016015	[InstCombine] Support ~(c-X) --> X+(-c-1) and ~(X-c) --> (-c-1)-X for splat vectors. llvm-svn: 310195	2017-08-06 06:28:41 +00:00
Craig Topper	9ffda5ab86	[InstCombine] Fold (C - X) ^ signmask -> (C + signmask - X). llvm-svn: 310186	2017-08-05 20:00:44 +00:00
Craig Topper	65dd32afbc	[InstCombine] Teach the code that pulls logical operators through constant shifts to handle vector splats too. llvm-svn: 310185	2017-08-05 20:00:42 +00:00
Craig Topper	1bbcab9ca5	[InstCombine] Support vector splats in foldSelectICmpAnd. Unfortunately, it looks like there are some other missed optimizations in the generated code for some of these cases. I'll try to look at some of those next. llvm-svn: 310184	2017-08-05 20:00:41 +00:00
Dinar Temirbulatov	cc2294a4eb	[SLPVectorizer] Add extra parameter to setInsertPointAfterBundle to handle different opcodes, NFCI. Differential Revision: https://reviews.llvm.org/D35769 llvm-svn: 310183	2017-08-05 18:43:52 +00:00
Sanjay Patel	94da1de1ce	[InstCombine] refactor trunc(binop) transforms; NFCI In addition to moving the shift transforms over, we may want to detect too-wide rotate patterns here (PR34046). llvm-svn: 310181	2017-08-05 15:19:18 +00:00
Craig Topper	fc5283092b	[InstCombine] In foldSelectICmpAnd, if we need to truncate from the 'and' type to the 'select' type, do it after shifting right instead of just bailing. Previously we were always trying to emit the zext or truncate before any shift. This meant if the 'and' mask was larger than the size of the truncate we would skip the transformation. Now we shift the result of the 'and' right first, leaving the bit within the range of the truncate. This matches what we are doing in foldSelectICmpAndOr for the same problem. llvm-svn: 310159	2017-08-05 01:45:17 +00:00
Sanjay Patel	e12d734be3	[InstCombine] narrow truncated add/sub/mul with constant Name: narrow_sub %sub = sub i32 C1, %x %r = trunc i32 %sub to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = sub i8 %narrowC, %xn Name: narrow_add %add = add i32 %x, C1 %r = trunc i32 %add to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = add i8 %xn, %narrowC Name: narrow_mul %mul = mul i32 %x, C1 %r = trunc i32 %mul to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = mul i8 %xn, %narrowC http://rise4fun.com/Alive/QpS This doesn't solve PR34046 (failure to recognize rotate): https://bugs.llvm.org/show_bug.cgi?id=34046 ...but it reduces an extra complication in the description examples to a form that we can more easily match. llvm-svn: 310141	2017-08-04 22:30:34 +00:00
Nico Weber	69ca5322d4	Revert r310055, it caused PR34074. llvm-svn: 310123	2017-08-04 20:40:38 +00:00
Evgeny Stupachenko	38197c66a1	Fix PR33514 Summary: The bug was uncovered after fix of PR23384 (part 3 of 3). The patch restricts pointer multiplication in SCEV computation for ICmpZero. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D36170 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 310092	2017-08-04 18:46:13 +00:00
Reid Kleckner	da748f1c3d	[ArgPromotion] Preserve alignment of byval argument in new alloca The frontend may have requested a higher alignment for any reason, and downstream optimizations may already have taken advantage of it. We should keep the same alignment when moving the allocation from the parameter area to the local variable area. Fixes PR34038 llvm-svn: 310071	2017-08-04 17:09:11 +00:00
Benjamin Kramer	bda212a65d	[InstCombine] Fold single-use variable into assert. Avoids unused variable warnings in Release builds. No functional change. llvm-svn: 310064	2017-08-04 16:08:41 +00:00
Craig Topper	760ff6ee87	[InstCombine] Remove the (not (sext)) case from foldBoolSextMaskToSelect and inline the remaining code to match visitOr Summary: The (not (sext)) case is really (xor (sext), -1) which should have been simplified to (sext (xor, 1)) before we got here. So we shouldn't need to handle it. With that taken care of we only need two cases so don't need the swap anymore. This makes us in sync with the equivalent code in visitOr so inline this to match. Reviewers: spatel, eli.friedman, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36240 llvm-svn: 310063	2017-08-04 16:07:20 +00:00
Craig Topper	3b74a68cc7	[InstCombine] Use ConstantInt::getFalse to reduce some code. NFC llvm-svn: 310062	2017-08-04 16:07:18 +00:00
Sanjay Patel	79e7f6b3e3	[InstCombine] narrow lshr with constant Name: narrow_shift Pre: C1 < 8 %zx = zext i8 %x to i32 %l = lshr i32 %zx, C1 => %narrowC = trunc i32 C1 to i8 %ns = lshr i8 %x, %narrowC %l = zext i8 %ns to i32 http://rise4fun.com/Alive/jIV This isn't directly applicable to PR34046 as written, but we need to have more narrowing folds like this to be sure that rotate patterns are recognized. llvm-svn: 310060	2017-08-04 15:42:47 +00:00
Filipe Cabecinhas	fb9d2a8775	[DSE] Merge stores when the later store only writes to memory locations the early store also wrote to. Summary: This fixes PR31777. If both stores' values are ConstantInt, we merge the two stores (shifting the smaller store appropriately) and replace the earlier (and larger) store with an updated constant. In the future we should also support vectors of integers. And maybe float/double if we can. Reviewers: hfinkel, junbuml, jfb, RKSimon, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30703 llvm-svn: 310055	2017-08-04 12:28:36 +00:00
Nikolai Bozhenov	1545eb3408	[InstCombine] Canonicalize clamp of float types to minmax in fast mode. Summary: This commit allows matchSelectPattern to recognize clamp of float arguments in the presence of FMF the same way as already done for integers. This case is a little different though. With integers, given the min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX "automatically". That is not the case for float, because for them only full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care about NaNs. On the other hand, some backends (e.g. X86) have only FMIN/FMAX nodes that do not care about NaNs and the former NAN/NUM nodes are illegal thus selection is not happening. So I decided to do this kind of transformation in IR (InstCombiner) instead of complicating the logic in the backend. Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper Reviewed By: efriedma Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D33186 llvm-svn: 310054	2017-08-04 12:22:17 +00:00
Max Kazantsev	9505470033	Do not declare a variable which is used only in assert. NFC llvm-svn: 310034	2017-08-04 07:41:24 +00:00
Max Kazantsev	2f6ae28152	[IRCE] Handle loops with step different from 1/-1 This patch generalizes IRCE to handle IV steps that are not equal to 1 or -1. Differential Revision: https://reviews.llvm.org/D35539 llvm-svn: 310032	2017-08-04 07:01:04 +00:00
Max Kazantsev	07da1ab23a	[IRCE] Recognize loops with unsigned latch conditions This patch enables recognition of loops with ult/ugt latch conditions. Differential Revision: https://reviews.llvm.org/D35302 llvm-svn: 310027	2017-08-04 05:40:20 +00:00
Craig Topper	c2d3c631ff	[InstCombine] Move the call to foldSelectICmpAnd into foldSelectInstWithICmp. NFCI llvm-svn: 310025	2017-08-04 05:12:37 +00:00
Craig Topper	a86ca08d26	[InstCombine] Remove unnecessary casts. NFC We're calling an overload of getOpcode that already returns Instruction::CastOps. llvm-svn: 310024	2017-08-04 05:12:35 +00:00
Victor Leschuk	56b03d0dd6	Un-revert r310014: false revert, it wasn't the cause of build break llvm-svn: 310021	2017-08-04 04:51:15 +00:00
Victor Leschuk	21713ebfb1	Revert r310014 as it breaks build lld-x86_64-darwin13 llvm-svn: 310020	2017-08-04 04:43:54 +00:00
Adrian Prantl	fd8c8e9fe6	Teach GlobalSRA to update the debug info for split-up globals. This is similar to what we are doing in "regular" SROA and creates DW_OP_LLVM_fragment operations to describe the resulting variables. rdar://problem/33654891 llvm-svn: 310014	2017-08-04 01:19:54 +00:00
Teresa Johnson	8482e56920	Use profile summary to disable peeling for huge working sets Summary: Detect when the working set size of a profiled application is huge, by comparing the number of counts required to reach the hot percentile in the profile summary to a large threshold. When the working set size is determined to be huge, disable peeling to avoid bloating the working set further. Note that the selected threshold (15K) is significantly larger than the largest working set value in SPEC cpu2006 (which is gcc at around 11K). Reviewers: davidxl Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36288 llvm-svn: 310005	2017-08-03 23:42:58 +00:00
Davide Italiano	5974c31d91	[NewGVN] Fix the case where we have a phi-of-ops which goes away. Patch by Daniel Berlin, fixes PR33196 (and probably something else). llvm-svn: 309988	2017-08-03 21:17:49 +00:00
Teresa Johnson	9a18a6f08b	Disable loop peeling during full unrolling pass. Summary: Peeling should not occur during the full unrolling invocation early in the pipeline, but rather later with partial and runtime loop unrolling. The later loop unrolling invocation will also eventually utilize profile summary and branch frequency information, which we would like to use to control peeling. And for ThinLTO we want to delay peeling until the backend (post thin link) phase, just as we do for most types of unrolling. Ensure peeling doesn't occur during the full unrolling invocation by adding a parameter to the shared implementation function, similar to the way partial and runtime loop unrolling are disabled. Performance results for ThinLTO suggest this has a neutral to positive effect on some internal benchmarks. Reviewers: chandlerc, davidxl Subscribers: mzolotukhin, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36258 llvm-svn: 309966	2017-08-03 17:52:38 +00:00
Sanjay Patel	7cf745cc94	[NewGVN] fix typos; NFC llvm-svn: 309946	2017-08-03 15:18:27 +00:00
Ewan Crawford	e18490c8be	[Cloning] Move distinct GlobalVariable debug info metadata in CloneModule Duplicating the distinct Subprogram and CU metadata nodes seems like the incorrect thing to do in CloneModule for GlobalVariable debug info. As it results in the scope of the GlobalVariable DI no longer being consistent with the rest of the module, and the new CU is absent from llvm.dbg.cu. Fixed by adding RF_MoveDistinctMDs to MapMetadata flags for GlobalVariables. Current unit test IR after clone: ``` @gv = global i32 1, comdat($comdat), !dbg !0, !type !5 define private void @f() comdat($comdat) personality void ()* @persfn !dbg !14 { !llvm.dbg.cu = !{!10} !0 = !DIGlobalVariableExpression(var: !1) !1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true) !2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5) !3 = !DIFile(filename: "filename.c", directory: "/file/dir/") !4 = !DISubroutineType(types: !5) !5 = !{} !6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8) !7 = !DIFile(filename: "filename.c", directory: "/file/dir") !8 = !{!0} !9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)") !10 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !11) !11 = !{!12} !12 = !DIGlobalVariableExpression(var: !13) !13 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !14, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true) !14 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, 
isOptimized: false, unit: !10, variables: !5) ``` Patched IR after clone: ``` @gv = global i32 1, comdat($comdat), !dbg !0, !type !5 define private void @f() comdat($comdat) personality void ()* @persfn !dbg !2 { !llvm.dbg.cu = !{!6} !0 = !DIGlobalVariableExpression(var: !1) !1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true) !2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5) !3 = !DIFile(filename: "filename.c", directory: "/file/dir/") !4 = !DISubroutineType(types: !5) !5 = !{} !6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8) !7 = !DIFile(filename: "filename.c", directory: "/file/dir") !8 = !{!0} !9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)") ``` Reviewers: aprantl, probinson, dblaikie, echristo, loladiro Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36082 llvm-svn: 309928	2017-08-03 09:23:03 +00:00
Matt Arsenault	e6de494b74	LV: Don't insert runtime ptr checks on divergent targets llvm-svn: 309890	2017-08-02 21:43:08 +00:00
Craig Topper	fc9bf50dee	[InstCombine] Remove unnecessary temporary APInt. NFCI llvm-svn: 309887	2017-08-02 21:05:40 +00:00
Teresa Johnson	ecd901314d	[PM] Split LoopUnrollPass and make partial unroller a function pass Summary: This is largely NFC, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good. Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling. Reviewers: chandlerc Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36157 llvm-svn: 309886	2017-08-02 20:35:29 +00:00
Craig Topper	4068e4eec5	[InstCombine] Remove explicit code for folding (xor(zext(cmp)), 1) and (xor(sext(cmp)), -1) to ext(!cmp). As far as I can tell this should be handled by foldCastedBitwiseLogic which is called later in visitXor. Differential Revision: https://reviews.llvm.org/D36214 llvm-svn: 309882	2017-08-02 20:30:27 +00:00
Craig Topper	ae9b87d10c	[InstCombine] Support sext in foldLogicCastConstant This adds support for sext in foldLogicCastConstant. This is a prerequisite for D36214. Differential Revision: https://reviews.llvm.org/D36234 llvm-svn: 309880	2017-08-02 20:25:56 +00:00
Jakub Kuderski	d869913f3b	[Dominators] Teach LoopDeletion to use the new incremental API Summary: This patch makes LoopDeletion use the incremental DominatorTree API. We modify LoopDeletion to perform the deletion in 5 steps: 1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch. 2. Inform the DomTree about the new edge. 3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable. 4. Inform the DomTree about the deleted edge. 5. Remove the unreachable block from the function. Creating the dummy conditional branch is necessary to perform incremental DomTree update. We should consider using the batch updater when it's ready. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35391 llvm-svn: 309850	2017-08-02 18:17:52 +00:00
Alexey Bataev	36e6096a03	[SLPVectorizer] Generalize interface of functions, NFC. llvm-svn: 309816	2017-08-02 14:38:07 +00:00
Alexey Bataev	fba97e6e21	[SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on extracted elements Summary: Currently most of the time vectors of extractelement instructions are treated as scalars that must be gathered into vectors. But in some cases, like when we have extractelement instructions from a single vector with different constant indices or from 2 vectors of the same size, we can treat these operations as a shuffle of a single vector or a blend of 2 vectors. ``` define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) { %x0 = extractelement <2 x i8> %x, i32 0 %y1 = extractelement <2 x i8> %y, i32 1 %x0x0 = mul i8 %x0, %x0 %y1y1 = mul i8 %y1, %y1 %ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0 %ins2 = insertelement <2 x i8> %ins1, i8 %y1y1, i32 1 ret <2 x i8> %ins2 } ``` can be converted to something like ``` define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) { %1 = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i32> <i32 0, i32 3> %2 = mul <2 x i8> %1, %1 ret <2 x i8> %2 } ``` Currently this type of conversion is considered a high-cost transformation. Reviewers: mzolotukhin, delena, mkuper, hfinkel, RKSimon Subscribers: ashahid, RKSimon, spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D30200 llvm-svn: 309812	2017-08-02 13:25:26 +00:00
Davide Italiano	c2f73b7fae	[NewGVN] Fold single-use variables. NFCI. llvm-svn: 309790	2017-08-02 04:05:49 +00:00
Davide Italiano	b13a3fa41b	[NewGVN] Remove a (now stale) comment. NFCI. llvm-svn: 309789	2017-08-02 03:51:40 +00:00
Craig Topper	36af40c115	[SimplifyCFG] Fix typo in comment. NFC llvm-svn: 309785	2017-08-02 02:34:16 +00:00
Chandler Carruth	95055d8f8b	[PM] Fix a bug where through CGSCC iteration we can get infinite-inlining across multiple runs of the inliner by keeping a tiny history of internal-to-SCC inlining decisions. This is still a bit gross, but I don't yet have any fundamentally better ideas and numerous people are blocked on this to use new PM and ThinLTO together. The core of the idea is to detect when we are about to do an inline that has a chance of re-splitting an SCC which we have split before with a similar inlining step. That is a critical component in the inlining forming a cycle and so far detects all of the various cyclic patterns I can come up with as well as the original real-world test case (which comes from a ThinLTO build of libunwind). I've added some tests that I think really demonstrate what is going on here. They are essentially state machines that march the inliner through various steps of a cycle and check that we stop when the cycle is closed and that we actually did do inlining to form that cycle. A lot of thanks go to Eric Christopher and Sanjoy Das for the help understanding this issue and improving the test cases. The biggest "yuck" here is the layering issue -- the CGSCC pass manager is providing somewhat magical state to the inliner for it to use to make itself converge. This isn't great, but I don't honestly have a lot of better ideas yet and at least seems nicely isolated. I have tested this patch, and it doesn't block any inlining on the entire LLVM test suite and SPEC, so it seems sufficiently narrowly targeted to the issue at hand. We have come up with hypothetical issues that this patch doesn't cover, but so far none of them are practical and we don't have a viable solution yet that covers the hypothetical stuff, so proceeding here in the interim. Definitely an area that we will be back and revisiting in the future. Differential Revision: https://reviews.llvm.org/D36188 llvm-svn: 309784	2017-08-02 02:09:22 +00:00
Chad Rosier	dfd1de687d	[Value Tracking] Default argument to true and rename accordingly. NFC. IMHO this is a bit more readable. llvm-svn: 309739	2017-08-01 20:18:54 +00:00
Craig Topper	4d5050b5ba	[InstCombine] Remove explicit check for impossible condition. Replace with assert Summary: As far as I can tell the earlier call getLimitedValue will guarantee ShiftAmt is saturated to BitWidth-1 preventing it from ever being equal or greater than BitWidth. At one point in the past the getLimitedValue call was only passed BitWidth not BitWidth - 1. This would have allowed the equality case to get here. And in fact this check was initially added as just BitWidth == ShiftAmt, but was changed shortly after to include > which should have never been possible. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36123 llvm-svn: 309690	2017-08-01 15:10:25 +00:00
Max Kazantsev	e4c220e8f2	[IRCE][NFC] Add another assert that AddRecExpr's step is not zero One more assertion of this kind. It is a preparation step for generalizing to the case of stride not equal to +1/-1. llvm-svn: 309663	2017-08-01 06:49:29 +00:00
Max Kazantsev	85da7543f9	[IRCE][NFC] Add assert that AddRecExpr's step is not zero We should never return zero steps, ensure this fact by adding a sanity check when we are analyzing the induction variable. llvm-svn: 309661	2017-08-01 06:27:51 +00:00
Davide Italiano	72c4285bd6	[MetaRenamer] Leave `@main` alone. To the best of my knowledge -metarenamer is used in two cases: 1) obfuscate names, when e.g. they contain information that can't be shared. 2) Improve clarity of the textual IR for testcases. One of the use cases is getting the output of `opt` and passing it to the lli interpreter to run the test. If metarenamer renames @main, lli can't find an entry point. llvm-svn: 309657	2017-08-01 05:14:45 +00:00
Alina Sbirlea	30d8a881e8	Default MemoryLocation passed to getModRefInfo should be None (D35441) llvm-svn: 309645	2017-08-01 00:47:17 +00:00
Kostya Serebryany	a1f12ba17e	[sanitizer-coverage] relax an assertion llvm-svn: 309644	2017-08-01 00:44:05 +00:00
Alina Sbirlea	967e7966fc	Allow None as a MemoryLocation to getModRefInfo Summary: Adding part of the changes in D30369 (needed to make progress): Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA. Original summary from D30369, by dberlin: Currently, we have instructions which affect memory but have no memory location. If you call, for example, MemoryLocation::get on a fence, it asserts. This means things specifically have to avoid that. It also means we end up with a copy of each API, one taking a memory location, one not. This starts to fix that. We add MemoryLocation::getOrNone as a new call, and reimplement the old asserting version in terms of it. We make MemoryLocation optional in the (Instruction, MemoryLocation) version of getModRefInfo, and kill the old one argument version in favor of passing None (it had one caller). Now both can handle fences because you can just use MemoryLocation::getOrNone on an instruction and it will return a correct answer. We use all this to clean up part of MemorySSA that had to handle this difference. Note that literally every actual getModRefInfo interface we have could be made private and replaced with: getModRefInfo(Instruction, Optional<MemoryLocation>) and getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>) and delegating to the right ones, if we wanted to. I have not attempted to do this yet. Reviewers: dberlin, davide, dblaikie Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D35441 llvm-svn: 309641	2017-08-01 00:28:29 +00:00
Sanjay Patel	dac0ab272c	[InstCombine] allow mask hoisting transform for vector types llvm-svn: 309627	2017-07-31 21:01:53 +00:00
Peter Collingbourne	bcd204b478	Update phi nodes in LowerTypeTests control flow simplification D33925 added a control flow simplification for -O2 --lto-O0 builds that manually splits blocks and reassigns conditional branches but does not correctly update phi nodes. If the else case being branched to had incoming phi nodes the control-flow simplification would leave phi nodes in that BB with an unhandled predecessor. Patch by Vlad Tsyrklevich! Differential Revision: https://reviews.llvm.org/D36012 llvm-svn: 309621	2017-07-31 20:43:07 +00:00
Kostya Serebryany	bfc83fa8d7	[sanitizer-coverage] don't instrument available_externally functions llvm-svn: 309611	2017-07-31 20:00:22 +00:00
Kostya Serebryany	bb6f079a45	[sanitizer-coverage] ensure minimal alignment for coverage counters and guards llvm-svn: 309610	2017-07-31 19:49:45 +00:00
Davide Italiano	e4c2782cba	[SLPVectorizer] Unbreak the build with -Werror. GCC was complaining about `&&` within `||` without explicit parentheses. NFCI. llvm-svn: 309606	2017-07-31 19:14:19 +00:00
Craig Topper	317a51e886	[X86][InstCombine] Add some simplifications for BZHI intrinsics This intrinsic clears the upper bits starting at a specified index. If the index is a constant we can do some simplifications. This could be in InstSimplify, but we don't handle any target specific intrinsics there today. Differential Revision: https://reviews.llvm.org/D36069 llvm-svn: 309604	2017-07-31 18:52:15 +00:00
Craig Topper	8324003818	[X86][InstCombine] Add basic simplification support for BEXTR/BEXTRI intrinsics. This patch adds simplification support for the BEXTR/BEXTRI intrinsics to match gcc. This only supports cases that fold to 0 or can be fully constant folded. Theoretically we could support converting to AND if the shift part is unused or to only a shift if the mask doesn't modify any bits after an equivalent shl. gcc doesn't do these transformations either. I put this in InstCombine, but it could be done in InstSimplify. It would be the first target specific intrinsic in InstSimplify. Differential Revision: https://reviews.llvm.org/D36063 llvm-svn: 309603	2017-07-31 18:52:13 +00:00
David Majnemer	91c6330c96	[IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializer We are not allowed to reason about an initializer value without first consulting hasDefinitiveInitializer. llvm-svn: 309594	2017-07-31 17:47:07 +00:00
Florian Hahn	94b8a87c8e	Extend ifdefs to more unused helper functions. This fixes a buildbot failure with -Werror introduced by r309553 llvm-svn: 309572	2017-07-31 16:11:43 +00:00
Alexey Bataev	0ab22bb991	[SLP] Initial rework for min/max horizontal reduction vectorization, NFC. Summary: All getReductionCost() functions are renamed to getArithmeticReductionCost() + added basic infrastructure to handle non-binary reduction operations. Reviewers: spatel, mzolotukhin, Ayal, mkuper, gilr, hfinkel Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D29402 llvm-svn: 309566	2017-07-31 14:36:05 +00:00
Alexey Bataev	3e9b3eb91d	[Cost] Rename getReductionCost() to getArithmeticReductionCost(), NFC. llvm-svn: 309563	2017-07-31 14:19:32 +00:00
Ayal Zaks	e841b214b1	[LV] Avoid redundant operations manipulating masks The Loop Vectorizer generates redundant operations when manipulating masks: AND with true, OR with false, compare equal to true. Instead of relying on a subsequent pass to clean them up, this patch avoids generating them. Use null (no-mask) to represent all-one full masks, instead of a constant all-one vector, following the convention of masked gathers and scatters. Preparing for a follow-up VPlan patch in which these mask manipulating operations are modeled using recipes. Differential Revision: https://reviews.llvm.org/D35725 llvm-svn: 309558	2017-07-31 13:21:42 +00:00
Florian Hahn	6b3216aad8	Guard print() functions only used by dump() functions. Summary: Since r293359, most dump() functions are only defined when `!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions only used by dump() functions are now unused in release builds, generating lots of warnings. This patch only defines some print() functions if they are used. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D35949 llvm-svn: 309553	2017-07-31 10:07:49 +00:00
Florian Hahn	4284049dcc	[LoopInterchange] Do not interchange loops with function calls. Summary: Without any information about the called function, we cannot be sure that it is safe to interchange loops which contain function calls. For example there could be dependences that prevent interchanging between accesses in the called function and the loops. Even functions without any parameters could cause problems, as they could access memory using global pointers. For now, I think it is only safe to interchange loops with calls marked as readnone. With this patch, the LLVM test suite passes with `-O3 -mllvm -enable-loopinterchange` and LoopInterchangeProfitability::isProfitable returning true for all loops. check-llvm and check-clang also pass when bootstrapped in a similar fashion, although only 3 loops got interchanged. Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35489 llvm-svn: 309547	2017-07-31 09:00:52 +00:00
Sam Elliott	67b0e589d0	Migrate PGOMemOptSizeOpt to use new OptimizationRemarkEmitter Pass Summary: Fixes PR33790. This patch still needs a yaml-style test, which I shall write tomorrow Reviewers: anemet Reviewed By: anemet Subscribers: anemet, llvm-commits Differential Revision: https://reviews.llvm.org/D35981 llvm-svn: 309497	2017-07-30 00:35:33 +00:00
Sumanth Gundapaneni	8d50a50e98	[SimplifyCFG] Make the no-jump-tables attribute also disable switch lookup tables Differential Revision: https://reviews.llvm.org/D35579 llvm-svn: 309444	2017-07-28 22:25:40 +00:00
Adrian Prantl	abe04759a6	Remove the obsolete offset parameter from @llvm.dbg.value There is no situation where this rarely-used argument cannot be substituted with a DIExpression and removing it allows us to simplify the DWARF backend. Note that this patch does not yet remove any of the newly dead code. rdar://problem/33580047 Differential Revision: https://reviews.llvm.org/D35951 llvm-svn: 309426	2017-07-28 20:21:02 +00:00
Alexey Bataev	e109655c90	[SLP] Allow vectorization of the instruction from the same basic blocks only, NFC. Summary: After some changes in SLP vectorizer we missed some additional checks to limit the instructions for vectorization. We should not perform analysis of the instructions if the parent of instruction is not the same as the parent of the first instruction in the tree or it was analyzed already. Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D34881 llvm-svn: 309425	2017-07-28 20:11:16 +00:00
Wei Mi	55c05e14af	[GVN] Recommit the patch "Add phi-translate support in scalarpre" Recommit after working around the bug PR31652. Three bugs fixed in previous recommits: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration can not be regarded as available. The third one is an out-of-bound array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 309397	2017-07-28 15:47:25 +00:00
Davide Italiano	75a001ba78	[JumpThreading] Stop falsely preserving LazyValueInfo. JumpThreading claims to preserve LVI, but it doesn't preserve the analyses which LVI holds a reference to (e.g. the Dominator). In the current pass manager infrastructure, after JT runs, the PM frees these analyses (including DominatorTree) but preserves LVI. CorrelatedValuePropagation runs immediately after and queries a corrupted domtree, causing weird miscompiles. This commit disables the preservation of LVI for the time being. Eventually, we should either move LVI to a proper dependency tracking mechanism (i.e. an analysis shouldn't hold references to other analyses and compute them on demand if needed), or we should teach all the passes preserving LVI to preserve the analyses LVI depends on. The new pass manager has a mechanism to invalidate LVI in case one of the analyses it depends on becomes invalid, so this problem shouldn't exist (at least not in this immediate form), but handling of analyses holding references is still a very delicate subject. Fixes PR33917 (and rustc). llvm-svn: 309355	2017-07-28 03:10:43 +00:00
Davide Italiano	01cb947abb	[JumpThreading] Add an option to dump LazyValueInfo after the run. Differential Revision: https://reviews.llvm.org/D35973 llvm-svn: 309353	2017-07-28 02:57:43 +00:00
Dehao Chen	8260d66556	Increase the ImportHotMultiplier to 10.0 Summary: The original 3.0 hot multiplier is too small, and would prevent hot callsites from being inlined. This patch increases the hot multiplier to 10.0 Reviewers: davidxl, tejohnson Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D35969 llvm-svn: 309344	2017-07-28 01:02:34 +00:00
Kostya Serebryany	063b652096	[sanitizer-coverage] rename sanitizer-coverage-create-pc-table into sanitizer-coverage-pc-table and add plumbing for a clang flag llvm-svn: 309337	2017-07-28 00:09:29 +00:00
Kostya Serebryany	b75d002f15	[sanitizer-coverage] add a feature sanitizer-coverage-create-pc-table=1 (works with trace-pc-guard and inline-8bit-counters) that adds a static table of instrumented PCs to be used at run-time llvm-svn: 309335	2017-07-27 23:36:49 +00:00
whitequark	9e8197ac8f	[MergeFunctions] Remove alias support. The alias support was dead code since 2011. It was last touched in r124182, where it was reintroduced after being removed in r110434, and since then it was gated behind a HasGlobalAliases flag that was permanently stuck as `false`. It is also broken. I'm not sure if it bitrotted or was just broken in the first place because it appears to have never been tested, but the following IR results in a crash: define internal i32 @a(i32 %a, i32 %b) unnamed_addr { %c = add i32 %a, %b %d = xor i32 %a, %c ret i32 %c } define internal i32 @b(i32 %a, i32 %b) unnamed_addr { %c = add i32 %a, %b %d = xor i32 %a, %c ret i32 %c } It seems safe to remove buggy untested code that no one cared about for seven years. Differential Revision: https://reviews.llvm.org/D34802 llvm-svn: 309313	2017-07-27 19:36:13 +00:00
Davide Italiano	82c7d3768d	[FunctionImport] Prefer isa<> to dyn_cast<> as the value is not used. This change makes GCC7 happy again. llvm-svn: 309305	2017-07-27 18:38:09 +00:00
Hiroshi Yamauchi	60855214c2	[InstCombine] Simplify pointer difference subtractions (GEP-GEP) where GEPs have other uses and one non-constant index Summary: Pointer difference simplifications currently happen only if input GEPs don't have other uses or their indexes are all constants, to avoid duplicating indexing arithmetic. This patch enables cases with exactly one non-constant index among input GEPs to happen where there is no duplicated arithmetic or code size increase even if input GEPs have other uses. For example, this patch allows "(&A[42][i]-&A[42][0])" --> "i", which didn't happen previously, if the input GEP(s) have other uses. Reviewers: sanjoy, bkramer Reviewed By: sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35499 llvm-svn: 309304	2017-07-27 18:27:11 +00:00
Adam Nemet	0d8b5d6f69	[ICP] Migrate to OptimizationRemarkEmitter This is a module pass so for the old PM, we can't use ORE, the function analysis pass. Instead ORE is created on the fly. A few notes: - isPromotionLegal is folded in the caller since we want to emit the Function in the remark but we can only do that if the symbol table look-up succeeded. - There was good test coverage for remarks in this pass. - promoteIndirectCall uses ORE conditionally since it's also used from SampleProfile which does not use ORE yet. Fixes PR33792. Differential Revision: https://reviews.llvm.org/D35929 llvm-svn: 309294	2017-07-27 16:54:15 +00:00
Daniel Neilson	2574d7cbf6	All libcalls should be considered to be GC-leaf functions. Summary: It is possible for some passes to materialize a call to a libcall (ex: ldexp, exp2, etc), but these passes will not mark the call as a gc-leaf-function. All libcalls are actually gc-leaf-functions, so we change llvm::callsGCLeafFunction() to tell us that available libcalls are equivalent to gc-leaf-function calls. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35840 llvm-svn: 309291	2017-07-27 16:49:39 +00:00
Alexey Bataev	07b96e8e96	[SLP] Outline code for the check that instruction users are part of vectorization tree, NFC. llvm-svn: 309284	2017-07-27 15:48:44 +00:00
David Blaikie	72c0b1cc1b	Fix assert from r309278 llvm-svn: 309281	2017-07-27 15:28:10 +00:00
David Blaikie	2f0cc477ab	ThinLTO: Don't import aliases of any kind (even linkonce_odr) Summary: Until a more advanced version of importing can be implemented for aliases (one that imports an alias as an available_externally definition of the aliasee), skip the narrow subset of cases that was possible but came at a cost: aliases of linkonce_odr functions could be imported because the linkonce_odr function could be safely duplicated from the source module. This came/comes at the cost of not being able to 'home' imported linkonce functions (they had to be emitted linkonce_odr in all the destination modules (even if they weren't used by an alias) rather than as available_externally - causing extra object size). Tangentially, this also was the only reason ThinLTO would emit multiple CUs in to the resulting DWARF - which happens to be a problem for Fission (there's a fix for this in GDB but not released yet, etc). (actually it's not the only reason - but I'm sending a patch to fix the other reason shortly) There's no reason to believe this particularly narrow alias importing was especially/meaningfully important, only that it was /possible/ to implement in this way. When a more general solution is done, it should still satisfy the DWARF concerns above, since the import will still be available_externally, and thus not create extra CUs. Since now all aliases are treated the same, I removed/simplified some test cases since they were testing corner cases where there are no longer any corners. Reviewers: tejohnson, mehdi_amini Differential Revision: https://reviews.llvm.org/D35875 llvm-svn: 309278	2017-07-27 15:09:06 +00:00
Hiroshi Yamauchi	0445e31c88	Fix a comment (test commit). llvm-svn: 309192	2017-07-26 21:54:43 +00:00
Adam Nemet	ea06e6e865	Migrate SimplifyLibCalls to new OptimizationRemarkEmitter Summary: This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter API. In fact, as SimplifyLibCalls is only ever called via InstCombine, (as far as I can tell) the OptimizationRemarkEmitter is added there, and then passed through to SimplifyLibCalls later. I have avoided changing any remark text. This closes PR33787 Patch by Sam Elliott! Reviewers: anemet, davide Reviewed By: anemet Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35608 llvm-svn: 309158	2017-07-26 19:03:18 +00:00
Wei Mi	fc0e245464	Disable loop unswitching for some patterns containing equality comparison with undef. This is a workaround for the bug described in PR31652 and http://lists.llvm.org/pipermail/llvm-dev/2017-July/115497.html. The temporary solution is to add a function EqualityPropUnSafe. In EqualityPropUnSafe, for some simple patterns we can know the equality comparison may contain undef, so we regard such comparisons as unsafe and will not do loop-unswitching for them. We also need to disable the select simplification when one of the select operands is undef and its result feeds into an equality comparison. The patch cannot clear the safety issue caused by the bug, but it can suppress the issue from happening to some extent. Differential Revision: https://reviews.llvm.org/D35811 llvm-svn: 309059	2017-07-25 23:37:17 +00:00
Chandler Carruth	1dc34c6d80	[LIR] Teach LIR to avoid extending the BE count prior to adding one to it when safe. Very often the BE count is the trip count minus one, and the plus one here should fold with that minus one. But because the BE count might in theory be UINT_MAX or some such, adding one before we extend could in some cases wrap to zero and break when we scale things. This patch checks to see if it would be safe to add one because the specific case that would cause this is guarded for prior to entering the preheader. This should handle essentially all of the common loop idioms coming out of C/C++ code once canonicalized by LLVM. Before this patch, both forms of loop in the added test cases ended up subtracting one from the size, extending it, scaling it up by 8 and then adding 8 back onto it. This is really silly, and it turns out made it all the way into generated code very often, so this is a surprisingly important cleanup to do. Many thanks to Sanjoy for showing me how to do this with SCEV. Differential Revision: https://reviews.llvm.org/D35758 llvm-svn: 308968	2017-07-25 10:48:32 +00:00
Kostya Serebryany	c485ca05ac	[sanitizer-coverage] simplify the code, NFC llvm-svn: 308944	2017-07-25 02:07:38 +00:00
Florian Hahn	f66efd6181	[LoopInterchange] Update code to use range-based for loops (NFC). Summary: The remaining non range-based for loops do not iterate over full ranges, so leave them as they are. Reviewers: karthikthecool, blitz.opensource, mcrosier, mkuper, aemerson Reviewed By: aemerson Subscribers: aemerson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35777 llvm-svn: 308872	2017-07-24 11:41:30 +00:00
Xinliang David Li	8e43698cf1	[PGOInstr] Add a debug print llvm-svn: 308785	2017-07-21 21:36:25 +00:00
Haojie Wang	1dec57d5b0	ThinLTO Minimized Bitcode File Size Reduction Summary: Currently the ThinLTO minimized bitcode file only strips the debug info, but there is still a lot of information in the minimized bitcode file that will not be used by the thin linker. In this patch, most of the extra information is stripped to reduce the size of the minimized bitcode file. Now only ModuleVersion, ModuleInfo, ModuleGlobalValueSummary, ModuleHash, Symtab and Strtab are left. Now the minimized bitcode file size is reduced to 15%-30% of the debug info stripped bitcode file size. Reviewers: danielcdh, tejohnson, pcc Reviewed By: pcc Subscribers: mehdi_amini, aprantl, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D35334 llvm-svn: 308760	2017-07-21 17:25:20 +00:00
Anna Thomas	5c07a4c5de	[RuntimeUnroll] NFC: Add a profitability function for multiexit loop Separated out the profitability from the safety analysis for multiexit loop unrolling. Currently, this is an NFC because profitability is true only if the unroll-runtime-multi-exit is set to true (off-by-default). This is to ease adding the profitability heuristic up for review at D35380. llvm-svn: 308753	2017-07-21 16:30:38 +00:00
Dinar Temirbulatov	4403b2b668	[SLPVectorizer] Replace E->Scalars to VL0 at vectorizeTree and move comment, NFCI. llvm-svn: 308750	2017-07-21 16:02:56 +00:00
Dinar Temirbulatov	b2a9a23213	[SLPVectorizer] buildTree_rec replace cast<Instruction>(VL[0]) to VL0, NFCI. llvm-svn: 308745	2017-07-21 15:31:54 +00:00
Dinar Temirbulatov	3206409d91	[SLPVectorizer] Change canReuseExtract function parameter Opcode from unsigned to Value *, NFCI. llvm-svn: 308739	2017-07-21 13:32:36 +00:00
Jonas Paulsson	024e319489	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
Davide Italiano	0c8d26c312	[PGO] Move the PGOInstrumentation pass to new OptRemark API. This fixes PR33791. llvm-svn: 308668	2017-07-20 20:43:05 +00:00
Peter Collingbourne	6f6788b99c	LowerTypeTests: Drop function type metadata only if we're going to replace it. Previously we were (mis)handling jump table members with a prevailing definition in a full LTO module and a non-prevailing definition in a ThinLTO module by dropping type metadata on those functions entirely, which would cause type tests involving such functions to fail. This patch causes us to drop metadata only if we are about to replace it with metadata from cfi.functions. We also want to replace metadata for available_externally functions, which can arise in the opposite scenario (prevailing ThinLTO definition, non-prevailing full LTO definition). The simplest way to handle that is to remove the definition; there's little value in keeping it around at this point (i.e. after most optimization passes have already run) and later code will try to use the function's linkage to create an alias, which would result in invalid IR if the function is available_externally. Fixes PR33832. Differential Revision: https://reviews.llvm.org/D35604 llvm-svn: 308642	2017-07-20 18:02:05 +00:00
David Majnemer	e6bb895ab5	[LICM] Make sinkRegion and hoistRegion non-recursive Large CFGs can cause us to blow up the stack because we would have a recursive step for each basic block in a region. Instead, create a worklist and iterate it. This limits the stack usage to something more manageable. Differential Revision: https://reviews.llvm.org/D35609 llvm-svn: 308582	2017-07-20 03:27:02 +00:00
Davide Italiano	4b8c8eae32	[TRE] Move to the new OptRemark API. Fixes PR33788. Differential Revision: https://reviews.llvm.org/D35570 llvm-svn: 308524	2017-07-19 21:13:22 +00:00
Peter Collingbourne	93fdaca5ac	ThinLTOBitcodeWriter: Do not rewrite intrinsic functions when splitting modules. Changing the type of an intrinsic may invalidate the IR. Differential Revision: https://reviews.llvm.org/D35593 llvm-svn: 308500	2017-07-19 17:54:29 +00:00
Dinar Temirbulatov	a61f4b8957	[LoopUtils] Add an extra parameter OpValue to propagateIRFlags function, If OpValue is non-null, we only consider operations similar to OpValue when intersecting. Differential Revision: https://reviews.llvm.org/D35292 llvm-svn: 308428	2017-07-19 10:02:07 +00:00
Balaram Makam	b05a55787a	[SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure. Summary: When simplifying unconditional branches from empty blocks, we pre-test if the BB belongs to a set of loop headers and keep the block to prevent passes from destroying canonical loop structure. However, the current algorithm fails if the destination of the branch is a loop header. Especially when such a loop's latch block is folded into loop header it results in additional backedges and LoopSimplify turns it into a nested loop which prevent later optimizations from being applied (e.g., loop unrolling and loop interleaving). This patch augments the existing algorithm by further checking if the destination of the branch belongs to a set of loop headers and defer eliminating it if yes to LateSimplifyCFG. Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605 Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl Reviewed By: efriedma Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35411 llvm-svn: 308422	2017-07-19 08:53:34 +00:00
Ayal Zaks	8c452d76ed	[LV] Test once if vector trip count is zero, instead of twice Generate a single test to decide if there are enough iterations to jump to the vectorized loop, or else go to the scalar remainder loop. This test compares the Scalar Trip Count: if STC < VF * UF go to the scalar loop. If requiresScalarEpilogue() holds, at-least one iteration must remain scalar; the rest can be used to form vector iterations. So in this case the test checks instead if (STC - 1) < VF * UF by comparing STC <= VF * UF, and going to the scalar loop if so. Otherwise the vector loop is entered for at-least one vector iteration. This test covers the case where incrementing the backedge-taken count will overflow leading to an incorrect trip count of zero. In this (rare) case we will also avoid the vector loop and jump to the scalar loop. This patch simplifies the existing tests and effectively removes the basic-block originally named "min.iters.checked", leaving the single test in block "vector.ph". Original observation and initial patch by Evgeny Stupachenko. Differential Revision: https://reviews.llvm.org/D34150 llvm-svn: 308421	2017-07-19 05:16:39 +00:00
Chandler Carruth	06a86301a1	[PM/LCG] Follow-up fix to r308088 to handle deletion of library functions. In the prior commit, we provide ordering to the LCG between functions and library function definitions that they might begin to call through transformations. But we still would delete these library functions from the call graph if they became dead during inlining. While this immediately crashed, it also exposed a loss of information. We shouldn't remove definitions of library functions that can still usefully participate in the LCG-powered CGSCC optimization process. If new call edges are formed, we want to have definitions to be called. We can still remove these functions if truly dead using global-dce, etc, but removing them during the CGSCC walk is premature. This fixes a crash in the new PM when optimizing some unusual libraries that end up with "internal" lib functions such as the code in the "R" language's libraries. llvm-svn: 308417	2017-07-19 04:12:25 +00:00
Weiming Zhao	984f1dc338	Fix DebugLoc propagation for unreachable LoadInst Summary: Currently, when GVN creates a load and when InstCombine creates a new store for unreachable Load, the DebugLoc info gets lost. Reviewers: dberlin, davide, aprantl Reviewed By: aprantl Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D34639 llvm-svn: 308404	2017-07-19 01:27:24 +00:00
Vitaly Buka	74443f0778	[asan] Copy arguments passed by value into explicit allocas for ASan Summary: ASan determines the stack layout from alloca instructions. Since arguments marked as "byval" do not have an explicit alloca instruction, ASan does not produce red zones for them. This commit produces an explicit alloca instruction and copies the byval argument into the allocated memory so that red zones are produced. Submitted on behalf of @morehouse (Matt Morehouse) Reviewers: eugenis, vitalybuka Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D34789 llvm-svn: 308387	2017-07-18 22:28:03 +00:00
Davide Italiano	6da7db31df	[TRE] Simplify canTRE() a bit using all_of(). NFCI. This has a ~11 years old FIXME, which may not be true today. We might consider removing this code altogether. llvm-svn: 308319	2017-07-18 15:42:59 +00:00
Alexander Potapenko	9385aaa848	[sancov] Fix PR33732 Coverage hooks that take less-than-64-bit-integers as parameters need the zeroext parameter attribute (http://llvm.org/docs/LangRef.html#paramattrs) to make sure they are properly extended by the x86_64 ABI. llvm-svn: 308296	2017-07-18 11:47:56 +00:00
Max Kazantsev	2c627a97fd	[IRCE] Recognize loops with ne/eq latch conditions In some particular cases eq/ne conditions can be turned into equivalent slt/sgt conditions. This patch teaches parseLoopStructure to handle some of these cases. Differential Revision: https://reviews.llvm.org/D35010 llvm-svn: 308264	2017-07-18 04:53:48 +00:00
Martin Storsjo	2f24e93481	[AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as well Rename the enum value from X86_64_Win64 to plain Win64. The symbol exposed in the textual IR is changed from 'x86_64_win64cc' to 'win64cc', but the numeric value is kept, keeping support for old bitcode. Differential Revision: https://reviews.llvm.org/D34474 llvm-svn: 308208	2017-07-17 20:05:19 +00:00
Teresa Johnson	f9dc3deaa6	Revert "Restore with fix "[ThinLTO] Ensure we always select the same function copy to import"" This reverts commit r308114 (and follow on fixes to test). There is a linking failure in a ThinLTO bot: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rthinlto_build/3663/ (and undefined reference). It seems like it must be a second order effect of the heuristic change I made, and may take some time to try to reproduce locally and track down. Therefore, reverting for now. llvm-svn: 308206	2017-07-17 19:25:38 +00:00
Simon Pilgrim	7ac3efacc0	Remove unnecessary cast. NFCI. llvm-svn: 308166	2017-07-17 09:35:03 +00:00
Davide Italiano	579064e2c1	[InstCombine] Don't violate dominance when replacing instructions. Differential Revision: https://reviews.llvm.org/D35376 llvm-svn: 308144	2017-07-16 18:56:30 +00:00
Craig Topper	2072aca51c	[InstCombine] Move (0 - x) & 1 --> x & 1 to SimplifyDemandedUseBits. This removes a dedicated matcher and allows us to support more than just an AND masking the lower bit. llvm-svn: 308124	2017-07-16 05:37:58 +00:00
Teresa Johnson	a7660b0127	Restore with fix "[ThinLTO] Ensure we always select the same function copy to import" This restores r308078/r308079 with a fix for bot non-determinism (make sure we run llvm-lto in single threaded mode so the debug output doesn't get interleaved). llvm-svn: 308114	2017-07-15 22:58:06 +00:00
Craig Topper	d918d5b36b	[InstCombine] Improve the expansion in SimplifyUsingDistributiveLaws to handle cases where one side doesn't simplify, but the other side resolves to an identity value Summary: If one side simplifies to the identity value for inner opcode, we can replace the value with just the operation that can't be simplified. I've removed a couple now unneeded special cases in visitAnd and visitOr. There are probably other cases I missed. Reviewers: spatel, majnemer, hfinkel, dberlin Reviewed By: spatel Subscribers: grandinj, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D35451 llvm-svn: 308111	2017-07-15 21:49:49 +00:00
Sanjay Patel	3437ee2740	[InstCombine] improve (1 << x) & 1 --> zext(x == 0) folding 1. Add a one-use check to prevent increasing instruction count. 2. Generalize the pattern matching to include vector types. llvm-svn: 308105	2017-07-15 17:26:01 +00:00
Sanjay Patel	55b9f88ecc	[InstCombine] allow (0 - x) & 1 --> x & 1 for vectors llvm-svn: 308098	2017-07-15 15:29:47 +00:00
Sanjay Patel	27339133a7	[InstCombine] remove dead code/tests; NFCI These patterns and tests were added to InstSimplify with: https://reviews.llvm.org/rL303004 llvm-svn: 308096	2017-07-15 15:01:33 +00:00
Chandler Carruth	d78a38ed2e	Revert r308078 (and subsequent tweak in r308079) which introduces a test that appears to exhibit non-determinism and is flaking on the bots pretty consistently. r308078: [ThinLTO] Ensure we always select the same function copy to import r308079: Require asserts in new test that uses debug flag llvm-svn: 308095	2017-07-15 13:50:26 +00:00
Florian Hahn	ad993521ac	[LoopInterchange] Add some optimization remarks. Reviewers: anemet, karthikthecool, blitz.opensource Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35122 llvm-svn: 308094	2017-07-15 13:13:19 +00:00
Dinar Temirbulatov	3c64077c82	[SLPVectorizer] Add an extra parameter to tryScheduleBundle function, NFCI. llvm-svn: 308081	2017-07-15 05:43:54 +00:00
Teresa Johnson	82b4fb1afe	[ThinLTO] Ensure we always select the same function copy to import Summary: Check if the first eligible callee is under the instruction threshold. Checking this on the first eligible callee ensures that we don't end up selecting different callees to import when we invoke this routine with different thresholds due to reaching the callee via paths that are shallower or hotter (when there are multiple copies, i.e. with weak or linkonce linkage). We don't want to leave the decision of which copy to import up to the backend. Reviewers: mehdi_amini Subscribers: inglorion, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35436 llvm-svn: 308078	2017-07-15 04:53:05 +00:00
Geoff Berry	f7d5daa0c0	[EarlyCSE] Handle calls with no MemorySSA info. Summary: When checking for memory dependencies between calls using MemorySSA, handle cases where the calls have no MemoryAccess associated with them because the AA analysis being used has determined that the call does not read/write memory. Fixes PR33756 Reviewers: dberlin, davide Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D35317 llvm-svn: 308051	2017-07-14 20:13:21 +00:00
Haicheng Wu	476adcca6b	[JumpThreading] Add a pattern to TryToUnfoldSelectInCurrBB() Add the following pattern to TryToUnfoldSelectInCurrBB() bb: %p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ... %c = cmp %p, 0 %s = select %c, trueval, falseval The Select in the above pattern will be unfolded and then jump-threaded. The current implementation does not allow CMP in the middle of PHI and Select. Differential Revision: https://reviews.llvm.org/D34762 llvm-svn: 308050	2017-07-14 19:16:47 +00:00
Jakub Kuderski	b292c22c8d	[Dominators] Make IsPostDominator a template parameter Summary: DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere. This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction. Reviewers: dberlin, sanjoy, davide, grosser Reviewed By: dberlin Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35315 llvm-svn: 308040	2017-07-14 18:26:09 +00:00
Sanjay Patel	3f4db3ea97	[InstCombine] convert bitwise (in)equality checks to logical ops (PR32401) As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 we have a backend transform to undo this: https://reviews.llvm.org/rL299542 when it's likely that the xor version leads to better codegen, but we want this form in IR for better analysis and simplification potential. llvm-svn: 308031	2017-07-14 15:09:49 +00:00
Max Kazantsev	f80ffa1a78	[IRCE] Fix corner case with Start = INT_MAX When iterating through loop for (int i = INT_MAX; i > 0; i--) We fail to generate the pre-loop for it. It happens because we use the overflown value in a comparison predicate when identifying whether or not we need it. In old logic, we used SLE predicate against Greatest value which exceeds all seen values of the IV and might be overflown. Now we use the GreatestSeen value of this IV with SLT predicate. Also added a test that ensures that a pre-loop is generated for such loops. Differential Revision: https://reviews.llvm.org/D35347 llvm-svn: 308001	2017-07-14 06:35:03 +00:00
Dinar Temirbulatov	21599fe2de	[SLPVectorizer] Add an extra parameter to alreadyVectorized function, NFCI. llvm-svn: 307996	2017-07-14 03:48:29 +00:00
Simon Pilgrim	f32f4be957	Fix unused variable warning on EXPENSIVE_CHECKS release builds. NFCI. llvm-svn: 307929	2017-07-13 17:10:12 +00:00
Davide Italiano	c3dc055780	Reapply [GlobalOpt] Remove unreachable blocks before optimizing a function. This commit reapplies r307215 now that we found out and fixed the cause of the cfi test failure (in r307871). llvm-svn: 307920	2017-07-13 15:40:59 +00:00
Anna Thomas	ec9b326569	[RuntimeUnrolling] Update DomTree correctly when exit blocks have successors Summary: When we runtime unroll with multiple exit blocks, we also need to update the immediate dominators of the immediate successors of the exit blocks. Reviewers: reames, mkuper, mzolotukhin, apilipenko Reviewed by: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35304 llvm-svn: 307909	2017-07-13 13:21:23 +00:00
Xinliang David Li	f564c6959e	[PGO] Enhance pgo counter promotion This is an incremental change to the promotion feature. There are two problems with the current behavior: 1) loops with multiple exiting blocks are totally disabled 2) a counter update can only be promoted one level up in the loop nest -- which does not help much for short trip count inner loops inside high trip-count outer loops. Due to this limitation, we still saw very large profile count fluctuations from run to run for the affected loops which are usually very hot. This patch adds support for promoting counters iteratively across the loop nest. It also turns on the promotion for loops with multiple exiting blocks (with a limit). For single-threaded applications, the performance impact is flat on average. For instance, dealII improves, but povray regresses. llvm-svn: 307863	2017-07-12 23:27:44 +00:00
Anna Thomas	8e431a9851	[LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loop Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true. llvm-svn: 307843	2017-07-12 20:55:43 +00:00
Sam Clegg	fd5ab25ae1	Remove unneeded use of #undef DEBUG_TYPE. NFC Where it is needed (at the end of headers that define it), be consistent about its use. Also fix a few header guards that I found in the process. Differential Revision: https://reviews.llvm.org/D34916 llvm-svn: 307840	2017-07-12 20:49:21 +00:00
Michael Kuperstein	fdb46b2fb4	[LV] Don't allow outside uses of IVs if the SCEV is predicated on loop conditions. This fixes PR33706. Differential Revision: https://reviews.llvm.org/D35227 llvm-svn: 307837	2017-07-12 19:53:55 +00:00
Jakub Kuderski	b323f4f173	[LoopRotate] Fix DomTree update logic for unreachable nodes. Fix PR33701. Summary: LoopRotate manually updates the DomTree by iterating over all predecessors of a basic block and computing the Nearest Common Dominator. When a predecessor happens to be unreachable, `DT.findNearestCommonDominator` returns nullptr. This patch teaches LoopRotate to handle this case and fixes PR33701 (https://bugs.llvm.org/show_bug.cgi?id=33701). In the future, LoopRotate should be taught to use the new incremental API for updating the DomTree. Reviewers: dberlin, davide, uabelho, grosser Subscribers: efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D35074 llvm-svn: 307828	2017-07-12 18:42:16 +00:00
Peter Collingbourne	cacac6a104	LowerTypeTests: When importing functions skip definitions where the summary contains a decl. This normally indicates mixed CFI + non-CFI compilation, and will result in us treating the function in the same way as a function defined outside of the LTO unit. Part of PR33752. Differential Revision: https://reviews.llvm.org/D35281 llvm-svn: 307744	2017-07-12 00:39:12 +00:00
Davide Italiano	b8ad3eebca	[IPO] Temporarily rollback r307215. [GlobalOpt] Remove unreachable blocks before optimizing a function. While the change is presumably correct, it exposes a latent bug in DI which breaks one of the CFI checks. I'll analyze it further and try to understand what's going on. llvm-svn: 307729	2017-07-11 23:10:17 +00:00
Konstantin Zhuravlyov	bb80d3e1d3	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
Anna Thomas	bafe766f5d	[LoopUnrollRuntime] NFC: Add some debugging trace messages for why loop wasn't unrolled. llvm-svn: 307705	2017-07-11 20:44:37 +00:00
Davide Italiano	ee1c82112e	[NewGVN] Check for congruency of memory accesses. This is fine as nothing in the code relies on leader and memory leader being the same for a given congruency class. Ack'ed by Dan. Fixes PR33720. llvm-svn: 307699	2017-07-11 19:49:12 +00:00
Davide Italiano	67b0e53dc1	[NewGVN] Fix an innocent typo I found while debugging PR33720. llvm-svn: 307694	2017-07-11 19:19:45 +00:00
Davide Italiano	fb4544cd15	[NewGVN] Clarify the function invariants formatting them properly. llvm-svn: 307692	2017-07-11 19:15:36 +00:00
Evgeniy Stepanov	3d5ea713f7	[msan] Only check shadow memory for operands that are sized. Fixes PR33347: https://bugs.llvm.org/show_bug.cgi?id=33347. Differential Revision: https://reviews.llvm.org/D35160 Patch by Matt Morehouse. llvm-svn: 307684	2017-07-11 18:13:52 +00:00
Anna Thomas	5526a33f4f	[LoopUnrollRuntime] Avoid multi-exit nested loop with epilog generation The loop structure for the outer loop does not contain the epilog preheader when we try to unroll inner loop with multiple exits and epilog code is generated. For now, we just bail out in such cases. Added a test case that shows the problem. Without this bailout, we would trip on assert saying LCSSA form is incorrect for outer loop. llvm-svn: 307676	2017-07-11 17:16:33 +00:00
Dinar Temirbulatov	09b6779709	[SLPVectorizer] Revert change in cancelScheduling with referencing to FirstInBundle, NFCI. llvm-svn: 307667	2017-07-11 15:54:50 +00:00
Hiroshi Inoue	0ca79dcf4b	fix typos in comments; NFC llvm-svn: 307626	2017-07-11 06:04:59 +00:00
Chandler Carruth	01f0c8a8c4	[PM/ThinLTO] Fix PR33536, a bug where the ThinLTO bitcode writer was querying for analysis results on a function declaration rather than a definition. The only reason this worked previously is by chance -- because the way we got alias analysis results with the legacy PM, we happened to not compute a dominator tree and so we happened to not hit an assert even though it didn't make any real sense. Now we bail out before trying to compute alias analysis so that we don't hit these asserts. llvm-svn: 307625	2017-07-11 05:39:20 +00:00
Leo Li	93abd7d915	[ConstantHoisting] Remove duplicate logic in constant hoisting Summary: As mentioned in https://reviews.llvm.org/D34576, the checks in `collectConstantCandidates` can be replaced by using `llvm::canReplaceOperandWithVariable`. The only special case is that `collectConstantCandidates` returns false for all `IntrinsicInst`, but it is safe for us to collect constant candidates from `IntrinsicInst`. Reviewers: pirama, efriedma, srhines Reviewed By: efriedma Subscribers: llvm-commits, javed.absar Differential Revision: https://reviews.llvm.org/D34921 llvm-svn: 307587	2017-07-10 20:45:34 +00:00
Davide Italiano	a7a77540ef	[NewGVN] Simplify a lambda a little bit. NFCI. llvm-svn: 307586	2017-07-10 20:45:00 +00:00
Serge Guelton	f6329ec2e9	Fix invalid cast in instcombine UMul/ZExt idiom Fixes https://bugs.llvm.org/show_bug.cgi?id=25454 Do not assume IRBuilder creates Instruction where it can create Value. Do not assume idiom operands are constant, leave generalisation to the IRBuilder. Differential Revision: https://reviews.llvm.org/D35114 llvm-svn: 307554	2017-07-10 16:51:40 +00:00
Anna Thomas	70ffd65ca9	[LoopUnrollRuntime] Remove strict assert about VMap requirement When unrolling under multiple exits, which is under an off-by-default option, the assert that checks for a VMap entry in loop exit values is too strong (the assert says that if the VMap entry did not exist, the value should be a constant). However, values derived from constants or from values outside the loop do not have a VMap entry either. Removed the assert and added a testcase showcasing the property for non-constant values. llvm-svn: 307542	2017-07-10 15:29:38 +00:00
Mikael Holmen	e0ced14449	[ArgumentPromotion] Change use of removed argument in llvm.dbg.value to undef Summary: This solves PR33641. When removing a dead argument we must also handle possibly existing calls to llvm.dbg.value that use the removed argument. Now we change the use of the otherwise dead argument to an undef for some other pass to cleanup later. If the calls are left untouched, they will later on cause errors: "function-local metadata used in wrong function" since the ArgumentPromotion rewrites the code by creating a new function with the wanted signature, but the metadata is not recreated so the new function may then erroneously use metadata from the old function. Reviewers: mstorsjo, rnk, arsenm Reviewed By: rnk Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34874 llvm-svn: 307521	2017-07-10 06:07:24 +00:00
Craig Topper	fde4723ebe	[IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing isIntegerTy(unsigned), but also works for vectors. llvm-svn: 307492	2017-07-09 07:04:03 +00:00
Craig Topper	95d2347ae1	[IR] Make use of Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491	2017-07-09 07:04:00 +00:00
Hiroshi Inoue	713b5ba2de	fix trivial typos; NFC sucessor -> successor llvm-svn: 307488	2017-07-09 05:54:44 +00:00
Chandler Carruth	bd9c29039e	[PM] Finish implementing and fix a chain of bugs uncovered by testing the invalidation propagation logic from an SCC to a Function. I wrote the infrastructure to test this but didn't actually use it in the unit test where it was designed to be used. =[ My bad. Once I actually added it to the test case I discovered that it also hadn't been properly implemented, so I've implemented it. The logic in the FAM proxy for an SCC pass to propagate invalidation follows the same ideas as the FAM proxy for a Module pass, but the implementation is a bit different to reflect the fact that it is forwarding just for an SCC. However, implementing this correctly uncovered a surprising "bug" (it was conservatively correct but relatively very expensive) in how we handle invalidation when splitting one SCC into multiple SCCs. We did an eager invalidation when in reality we should be deferring invalidation for the current SCC to the CGSCC pass manager and just invalidating the newly constructed SCCs. Otherwise we end up invalidating too much too soon. This was exposed by the inliner test case that I've updated. Now, we invalidate just the split off '(test1_f)' SCC when doing the CG update, and then the inliner finishes and invalidates the '(test1_g, test1_h)' SCC's analyses. The first few attempts at fixing this hit still more bugs, but all of those are covered by existing tests. For example, the inliner should also preserve the FAM proxy to avoid unnecessary invalidation, and this is safe because the CG update routines it uses handle any necessary adjustments to the FAM proxy. Finally, the unittests for the CGSCC pass manager needed a bunch of updates where we weren't correctly preserving the FAM proxy because it hadn't been fully implemented and failing to preserve it didn't matter. 
Note that this doesn't yet fix the current crasher due to MemSSA finding a stale dominator tree, but without this the fix to that crasher doesn't really make any sense when testing because it relies on the proxy behavior. llvm-svn: 307487	2017-07-09 03:59:31 +00:00
Craig Topper	e79b3e7d9a	[InstCombine] Speculatively implement a fix for what might be the root cause of PR33721 by making sure that we have integer types before doing select C, -1, 0 -> sext C to int I recently changed m_One and m_AllOnes to use Constant::isOneValue/isAllOnesValue which work on floating point values too. The original implementation looked specifically for ConstantInt scalars and splats. So I'm guessing we are accidentally trying to issue sext/zexts on floating point types now. Hopefully I figure out how to reproduce the failure from the PR soon. llvm-svn: 307486	2017-07-09 03:25:17 +00:00
Max Kazantsev	b9edcbcb1d	Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differential Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477	2017-07-08 17:17:30 +00:00
Craig Topper	bb4069e439	[InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference. This patch makes it a reference everywhere including the InstCombiner class itself so there is no more inconsistency. This is a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do. llvm-svn: 307451	2017-07-07 23:16:26 +00:00
Dehao Chen	64c46574b0	Increase the import-threshold for critical functions. Summary: For iterative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that need to be imported and creates a virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces a "critical" call edge, and assigns a much higher importing threshold to those edges. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D35096 llvm-svn: 307439	2017-07-07 21:01:00 +00:00
Anna Thomas	e3872003d0	[LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog remainder generated With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic is in place to support multiple exit/exiting blocks when prolog remainder is generated. This patch removed the assert that multiple exit blocks unrolling is only supported when epilog remainder is generated. Also, added test runs and checks with PROLOG prefix in runtime-loop-multiple-exits.ll test cases. llvm-svn: 307435	2017-07-07 20:12:32 +00:00
Davide Italiano	4eb210bdeb	[Local] Update the comment for removeUnreachableBlocks. It referenced a wrong function name, and didn't mention what the second argument did. This should be slightly more accurate now. llvm-svn: 307425	2017-07-07 18:54:14 +00:00
Gor Nishanov	8cdf648795	[cloning] Do not duplicate types when cloning functions Summary: This is an addon to the change rl304488 cloning fixes. (Originally rl304226 reverted rl304228 and reapplied rl304488 https://reviews.llvm.org/D33655) rl304488 works great when DILocalVariables that comes from the inlined function has a 'unique-ed' type, but, in the case when the variable type is distinct we will create a second DILocalVariable in the scope of the original function that was inlined. Consider cloning of the following function: ``` define private void @f() !dbg !5 { %1 = alloca i32, !dbg !11 call void @llvm.dbg.declare(metadata i32* %1, metadata !14, metadata !12), !dbg !18 ret void, !dbg !18 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) ; came from an inlined function !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Without this fix, when function 'f' is cloned, we will create another DILocalVariable for "inlined", due to its type being distinct. 
``` define private void @f.1() !dbg !23 { %1 = alloca i32, !dbg !26 call void @llvm.dbg.declare(metadata i32* %1, metadata !28, metadata !12), !dbg !30 ret void, !dbg !30 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ; !28 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !29) ; OOPS second DILocalVariable !29 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Now we have two DILocalVariables for "inlined" within the same scope. This results in an assert in AsmPrinter/DwarfDebug.h:131: void llvm::DbgVariable::addMMIEntry(const llvm::DbgVariable &): Assertion `V.Var == Var && "conflicting variable"' failed. (Full example: See: https://bugs.llvm.org/show_bug.cgi?id=33492) In this change we prevent duplication of types so that when the metadata for a DILocalVariable is cloned it will get uniqued to the same metadata node as the original variable. Reviewers: loladiro, dblaikie, aprantl, echristo Reviewed By: loladiro Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D35106 llvm-svn: 307418	2017-07-07 18:24:20 +00:00
Anna Thomas	734ab3f75c	[LoopUnrollRuntime] NFC: use the precomputed loop exit in ConnectProlog Minor refactoring to use the preexisting loop exit that's already calculated. We do not need to recompute the loop exit in ConnectProlog. Apart from avoiding redundant computation, this is required for supporting multiple loop exits when Prolog remainder loops are generated. llvm-svn: 307417	2017-07-07 18:05:28 +00:00
Yaxun Liu	b909f11a31	[InferAddressSpaces] Fix assertion about null pointer InferAddressSpaces does not check address space in collectFlatAddressExpressions, which causes values with a non-flat address space to be put into Postorder and triggers an assertion in cloneValueWithNewAddressSpace. This patch fixes the assertion in the OpenCL 2.0 conformance test generic_address_space subtest for the amdgcn target. Differential Revision: https://reviews.llvm.org/D34991 llvm-svn: 307349	2017-07-07 02:40:13 +00:00
Sean Fertile	9cd1cdf814	Extend memcpy expansion in Transform/Utils to handle wider operand types. Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the target to provide the operand types through TTI callbacks. The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target. Differential revision: https://reviews.llvm.org/D32536 llvm-svn: 307346	2017-07-07 02:00:06 +00:00
Evgeniy Stepanov	7d3eeaaa96	Revert r307342, r307343. Revert "Copy arguments passed by value into explicit allocas for ASan." Revert "[asan] Add end-to-end tests for overflows of byval arguments." Build failure on lldb-x86_64-ubuntu-14.04-buildserver. Test failure on clang-cmake-aarch64-42vma and sanitizer-x86_64-linux-android. llvm-svn: 307345	2017-07-07 01:31:23 +00:00
Evgeniy Stepanov	2a7a4bc1c9	Copy arguments passed by value into explicit allocas for ASan. ASan determines the stack layout from alloca instructions. Since arguments marked as "byval" do not have an explicit alloca instruction, ASan does not produce red zones for them. This commit produces an explicit alloca instruction and copies the byval argument into the allocated memory so that red zones are produced. Patch by Matt Morehouse. Differential revision: https://reviews.llvm.org/D34789 llvm-svn: 307342	2017-07-07 00:48:25 +00:00
Wei Mi	7586755013	[ConstHoisting] Turn on consthoist-with-block-frequency by default. Using profile information to guide consthoisting is generally helpful for performance, so the patch turns it on by default. No compile time or perf regression were found using spec2000 and spec2006 on x86. Some significant improvement (>20%) was seen on internal benchmarks. Differential Revision: https://reviews.llvm.org/D35063 llvm-svn: 307338	2017-07-07 00:11:05 +00:00
Craig Topper	cb22039bee	[InstCombine] No need to pass DataLayout to helper functions if we're passing the InstCombiner object. We can just ask it for the DataLayout. NFC llvm-svn: 307333	2017-07-06 23:18:43 +00:00
Craig Topper	4853c4304b	[InstCombine] Remove unused arguments from some helper functions. NFC llvm-svn: 307332	2017-07-06 23:18:42 +00:00
Craig Topper	2bb9f0f620	[InstCombine] Change a couple helper functions to only take the IRBuilder as an argument and not the whole InstCombiner object. NFC llvm-svn: 307331	2017-07-06 23:18:41 +00:00
Wei Mi	20526b2725	[ConstHoisting] choose to hoist when frequency is the same. The patch is to adjust the strategy of frequency based consthoisting: Previously when the candidate block has the same frequency with the existing blocks containing a const, it will not hoist the const to the candidate block. For that case, now we change the strategy to hoist the const if only existing blocks have more than one block member. This is helpful for reducing code size. Differential Revision: https://reviews.llvm.org/D35084 llvm-svn: 307328	2017-07-06 22:32:27 +00:00
Davide Italiano	f4891d29f8	[lib/LTO] Add a comment to explain where we set the linkage in the summary. Pointed out by Teresa! llvm-svn: 307305	2017-07-06 20:04:20 +00:00
Davide Italiano	6a5fbe52fa	[LTO] Fix the interaction between linker redefined symbols and ThinLTO This is the same as r304719 but for ThinLTO. The substantial difference is that in this case we don't have whole visibility, just the summary. In the LTO case, when we got the resolution for the input file we could just see if the linker told us whether a symbol was linker redefined (using --wrap or --defsym) and switch the linkage directly for the GV. Here, we have the summary. So, we record that the linkage changed from <whatever it was> to $weakany to prevent IPOs across this symbol boundary and actually just switch the linkage at FunctionImport time. This patch should also fix the lld bits (as all the scaffolding for communicating if a symbol is linker redefined should be there & should be the same), but I'll make sure to add some tests there as well. Fixes PR33192. Differential Revision: https://reviews.llvm.org/D35064 llvm-svn: 307303	2017-07-06 19:58:26 +00:00
Craig Topper	e9bf7ebacf	[InstCombine] Remove include of DIBuilder.h and Dwarf.h as they don't appear to be necessary. llvm-svn: 307295	2017-07-06 18:47:47 +00:00
Leo Li	5499b1b8be	Modify constraints in `llvm::canReplaceOperandWithVariable` Summary: `Instruction::Switch`: only first operand can be set to a non-constant value. `Instruction::InsertValue` both the first and the second operand can be set to a non-constant value. `Instruction::Alloca` return true for non-static allocation. Reviewers: efriedma Reviewed By: efriedma Subscribers: srhines, pirama, llvm-commits Differential Revision: https://reviews.llvm.org/D34905 llvm-svn: 307294	2017-07-06 18:47:05 +00:00
Craig Topper	ca2c87653c	[Constants] Replace calls to ConstantInt::equalsInt(0)/equalsInt(1) with isZero and isOne. NFCI llvm-svn: 307293	2017-07-06 18:39:49 +00:00
Craig Topper	79ab643da8	[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292	2017-07-06 18:39:47 +00:00
Anna Thomas	eb6d5d1950	[LoopUnrollRuntime] Bailout when multiple exiting blocks to the unique latch exit block Currently, we do not support multiple exiting blocks to the latch exit block. However, this bailout wasn't triggered when we had a unique exit block (which is the latch exit), with multiple exiting blocks to that unique exit. Moved the bailout so that it's triggered in both cases and added testcase. llvm-svn: 307291	2017-07-06 18:39:26 +00:00
Craig Topper	47c8f66997	[InstCombine] Remove Builder argument from InstCombiner::tryFactorization. NFC Builder is already a member of the InstCombiner class so we can use it without passing it. llvm-svn: 307290	2017-07-06 18:35:52 +00:00
Craig Topper	dfd01ea9ed	[SimplifyCFG] Move a portion of an if statement that should already be implied to an assert Summary: In this code we got to Dom by following the predecessor link of BB. So it stands to reason that BB should also show up as a successor of Dom's terminator right? There isn't a way to have the CFG connect in only one direction is there? Reviewers: jmolloy, davide, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35025 llvm-svn: 307276	2017-07-06 16:29:43 +00:00
Craig Topper	95e4142f94	[InstCombine] Change helper method to a file local static method. NFC llvm-svn: 307275	2017-07-06 16:24:23 +00:00
Craig Topper	fc42acef92	[InstCombine] Clarify comment to mention other transform that it does. NFC llvm-svn: 307274	2017-07-06 16:24:22 +00:00
Craig Topper	22795de20a	[InstCombine] Add single use checks to SimplifyBSwap to ensure we are really saving instructions Bswap isn't a simple operation so we need to make sure we are really removing a call to it before doing these simplifications. For the case when both LHS and RHS are bswaps I've allowed it to be moved if either LHS or RHS has a single use since that at least allows us to move it later where it might find another bswap to combine with and it decreases the use count on the other side so maybe the other user can be optimized. Differential Revision: https://reviews.llvm.org/D34974 llvm-svn: 307273	2017-07-06 16:24:21 +00:00
Craig Topper	3e1909d797	[InstCombine] Don't create extra ConstantInt objects in foldSelectICmpAnd. NFCI Instead just use APInt objects and only create a ConstantInt at the end if we need it for the Offset. llvm-svn: 307270	2017-07-06 15:58:54 +00:00
Wei Mi	90707394e3	[LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale. When the formulae search space is huge, LSR uses a series of heuristics to keep pruning the search space until the number of possible solutions is within a certain limit. The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs, which picks the register which is used by the most LSRUses and deletes the other formulae which don't use the register. This is an effective way to prune the search space, but quite often not a good way to keep the best solution. We saw cases before where the heuristic pruned the best formula candidate out of the search space. To relieve the problem, we introduce a new heuristic called NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is that, in order to reduce the search space while keeping the best formula, we want to keep as many formulae with different Scale and ScaledReg as possible. That is because the central idea of LSR is to choose a group of loop induction variables and use those induction variables to represent LSRUses. An induction variable candidate is often represented by the Scale and ScaledReg in a formula. If we have more formulae with different ScaledReg and Scale to choose from, we have a better opportunity to find the best solution. That is why we believe pruning the search space by only keeping the best formula with the same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use two criteria to choose the best formula with the same Scale and ScaledReg. The first criterion is to select the formula using fewer non-shared registers, and the second criterion is to select the formula with the lower cost from RateFormula. The patch implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the last resort. Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly testsuite performance is neutral. 
We also tried lsr-exp-narrow and it didn't help on the two improved internal cases we saw. Differential Revision: https://reviews.llvm.org/D34583 llvm-svn: 307269	2017-07-06 15:52:14 +00:00
Max Kazantsev	98838527c6	Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why SaturatedMultiply test fails. llvm-svn: 307249	2017-07-06 10:47:13 +00:00
Max Kazantsev	c8db20b78c	Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244	2017-07-06 09:57:41 +00:00
Frederich Munch	52dfcd18d1	Avoid constructing GlobalExtensions only to find out it is empty. Summary: GlobalExtensions is dereferenced twice, once for iteration and then a check if it is empty. As a ManagedStatic this dereference forces its construction which is unnecessary. Reviewers: efriedma, davide, mehdi_amini Reviewed By: mehdi_amini Subscribers: chapuni, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D33381 llvm-svn: 307229	2017-07-06 00:09:09 +00:00
Davide Italiano	7dd0694f96	[GlobalOpt] Remove unreachable blocks before optimizing a function. LLVM's definition of dominance allows instructions that are cyclic in unreachable blocks, e.g.: %pat = select i1 %condition, @global, i16* %pat because any instruction dominates an instruction in a block that's not reachable from entry. So, remove unreachable blocks from the function, because a) there's no point in analyzing them and b) GlobalOpt should otherwise grow some more complicated logic to break these cycles. Differential Revision: https://reviews.llvm.org/D35028 llvm-svn: 307215	2017-07-05 22:28:28 +00:00
Craig Topper	cc418b656a	[InstCombine] Use CmpInst::Predicate with m_Cmp instead of ICmpInst::Predicate. NFC There isn't really an ICmpInst version so we're just accessing the CmpInst version through inheritance. llvm-svn: 307199	2017-07-05 20:31:00 +00:00
Dinar Temirbulatov	b78adec638	[SLPVectorizer] Add an extra parameter to cancelScheduling function, NFCI. llvm-svn: 307158	2017-07-05 13:53:03 +00:00
David Green	b26a0a460c	[IndVarSimplify] Add AShr exact flags using induction variables ranges. This adds exact flags to AShr/LShr instructions where we can statically prove it is valid using the range of induction variables. This allows further optimisations to remove extra loads. Differential Revision: https://reviews.llvm.org/D34207 llvm-svn: 307157	2017-07-05 13:25:58 +00:00
Max Kazantsev	ebe56283bc	Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars" This patch seems to cause failures of test MathExtras.SaturatingMultiply on multiple buildbots. Reverting until the reason of that is clarified. Differential Revision: https://reviews.llvm.org/rL307126 llvm-svn: 307135	2017-07-05 09:44:41 +00:00
Max Kazantsev	80bc4a5554	[IndVars] Canonicalize comparisons between non-negative values and indvars If there is an IndVar which is known to be non-negative, and there is a value which is also non-negative, then signed and unsigned comparisons between them produce the same result. Both of those can be seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like %c = icmp slt i32 %iv, %b to %c = icmp ult i32 %iv, %b if both %iv and %b are known to be non-negative. Differential Revision: https://reviews.llvm.org/D34979 llvm-svn: 307126	2017-07-05 06:38:49 +00:00
Anna Thomas	ada4ddc0bc	[LoopDeletion] NFC: Add loop being analyzed debug statement llvm-svn: 307096	2017-07-04 17:00:03 +00:00
Anna Thomas	90f69abc8b	[LoopDeletion] NFC: Add debug statements to the optimization We have a DEBUG option for loop deletion, but no related debug messages. Added some debug messages to state why loop deletion failed. llvm-svn: 307078	2017-07-04 14:05:19 +00:00
Craig Topper	0f746c2793	[InstCombine] Add TODOs for a couple things that should maybe be in InstSimplify instead. NFC llvm-svn: 307065	2017-07-04 06:50:48 +00:00
Florian Hahn	4eeff394d3	[LoopInterchange] Add more debug messages to currentLimitations(). Summary: This makes it easier to find out which limitation prevented this pass from doing its work. Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34940 llvm-svn: 307035	2017-07-03 15:32:00 +00:00
Benjamin Kramer	fb620493e1	Revert "[GVN] Recommit the patch "Add phi-translate support in scalarpre"." This reverts commit r306313. This breaks selfhost at -O3 and PR33652. Let me know if you need additional information on reproducing the issue. llvm-svn: 307021	2017-07-03 12:23:10 +00:00
Craig Topper	8036970008	[InstCombine] Add a TODO for a probable missing single use check. NFC Will try to fix it soon, but in case I forget. llvm-svn: 307003	2017-07-03 05:54:16 +00:00
Craig Topper	766ce6e9cf	[InstCombine] Support BITWISE_OP( BSWAP(x), CONSTANT ) -> BSWAP( BITWISE_OP(x, BSWAP(CONSTANT) ) ) for splat vectors. llvm-svn: 307002	2017-07-03 05:54:15 +00:00
Craig Topper	32fce4d647	[InstCombine] Remove support for BITWISE_OP(CONSTANT, BSWAP(x)) -> BSWAP(OP(BSWAP(CONSTANT), x)). Constants were already canonicalized to the right hand side before we got here. llvm-svn: 307000	2017-07-03 05:54:13 +00:00
Craig Topper	1e4643a98e	[InstCombine] Support BITWISE_OP(BSWAP(A),BSWAP(B))->BSWAP(BITWISE_OP(A, B)) for vectors. llvm-svn: 306999	2017-07-03 05:54:13 +00:00
Craig Topper	c6948c25cc	[InstCombine] Remove an if that should have been guaranteed by the caller. Replace with an assert. NFC llvm-svn: 306997	2017-07-03 05:54:11 +00:00
Simon Pilgrim	df2657ac2d	[InstCombine] Use m_BitReverse pattern match helper. NFCI. llvm-svn: 306986	2017-07-02 16:31:16 +00:00
Sanjay Patel	b51e072d35	[InstCombine] fix crash when folding cmp+bswap vector We assumed the constant was a scalar when creating the replacement operand. Also, improve tests for this fold and move the tests for this fold to their own file. I'll move the related and missing tests to this file as a follow-up. llvm-svn: 306985	2017-07-02 16:05:11 +00:00
Sanjay Patel	c3d5cf0bb7	[InstCombine] look through bswap/bitreverse for equality comparisons I noticed this missed bswap optimization in the CGP memcmp() expansion, and then I saw that we don't have the fold in InstCombine. Differential Revision: https://reviews.llvm.org/D34763 llvm-svn: 306980	2017-07-02 14:34:50 +00:00
Hiroshi Inoue	bb703e8960	fix trivial typos; NFC suport -> support llvm-svn: 306968	2017-07-02 03:24:54 +00:00
Craig Topper	f60ab47098	[InstCombine] Fold (a | b) ^ (~a | ~b) --> ~(a ^ b) and (a & b) ^ (~a & ~b) --> ~(a ^ b) Summary: I came across this while thinking about what would happen if one of the operands in this xor pattern was itself an inverted (A & ~B) ^ (~A & B) -> (A^B). The patterns here assume that the (~a | ~b) will be demorganed to ~(a & b) first. Though I wonder if there's a multiple use case that would prevent the demorgan. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34870 llvm-svn: 306967	2017-07-02 01:15:51 +00:00
Davide Italiano	e3f7dda1fb	[CodeExtractor] Remove unneeded and commented out debugging stmts. llvm-svn: 306966	2017-07-02 00:07:18 +00:00
Hiroshi Inoue	ef1c2ba22a	fix trivial typos, NFC llvm-svn: 306952	2017-07-01 07:12:15 +00:00
Davide Italiano	9282f1aece	[Cloner] Re-map simplified cloned instructions. This commit pretty much rolls back the logic added in r306495 as in the testcase provided we simplify an `icmp` looking through a PHI that hasn't been mapped yet. I think instsimplify shouldn't do threading over select/phis or just looking through phis in general, but this is what we have now. Also, add a test to prevent this from happening in case somebody wants to modify this code again. Briefly discussed with Kyle Butt (thanks Kyle!). llvm-svn: 306938	2017-07-01 03:29:33 +00:00
Teresa Johnson	32d95742b8	Recommit "r306541 - Add zero-length check to memcpy/memset load store loop expansion" With fix for use-after-free errors. We can't add the new branch and remove the old one until we are done with the Builder constructed for the block. llvm-svn: 306937	2017-07-01 03:24:10 +00:00
Teresa Johnson	c12306c0ad	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. llvm-svn: 306936	2017-07-01 03:24:09 +00:00
Teresa Johnson	eb4fba9d61	re-commit r306336: Enable vectorizer-maximize-bandwidth by default. Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306935	2017-07-01 03:24:08 +00:00
Teresa Johnson	de56903bde	revert r306336 for breaking ppc test. llvm-svn: 306934	2017-07-01 03:24:07 +00:00
Teresa Johnson	1fbaffeba1	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306933	2017-07-01 03:24:06 +00:00
Dinar Temirbulatov	2fb1075f14	[SLPVectorizer] Add isOdd() helper function, NFCI. llvm-svn: 306887	2017-06-30 21:16:26 +00:00
Craig Topper	bcf511c0da	[InstCombine] Replace an unnecessary use of a matcher with just an isa and a cast. NFC We aren't looking through any levels of IR here so I don't think we need the power of a matcher or the temporary variable it requires. llvm-svn: 306885	2017-06-30 21:09:34 +00:00
Ayal Zaks	2ff59d4350	[LV] Sink casts to unravel first order recurrence Check if a single cast is preventing handling a first-order-recurrence Phi, because the scheduling constraints it imposes on the first-order-recurrence shuffle are infeasible; but they can be made feasible by moving the cast downwards. Record such casts and move them when vectorizing the loop. Differential Revision: https://reviews.llvm.org/D33058 llvm-svn: 306884	2017-06-30 21:05:06 +00:00
Sumanth Gundapaneni	5372f0a73e	[SimplifyCFG] Update the name of switch generated lookup table. This patch appends the name of the function to the switch generated lookup table. This will ease the visual debugging in identifying the function the table is generated from. Differential Revision: https://reviews.llvm.org/D34817 llvm-svn: 306867	2017-06-30 20:00:01 +00:00
Simon Pilgrim	77c3c5f9b8	[InstCombine] Add m_BitReverse pattern match helper. NFCI. llvm-svn: 306860	2017-06-30 18:58:29 +00:00
Anna Thomas	e5e5e59d8b	[RuntimeUnrolling] Add logic for loops with multiple exit blocks Summary: Runtime unrolling is done for loops with a single exit block and a single exiting block (and this exiting block should be the latch block). This patch adds logic to support unrolling in the presence of multiple exit blocks (which also means multiple exiting blocks). Currently this is under an off-by-default option and is supported when epilog code is generated. Support in presence of prolog code will be in a future patch (we just need to add more tests, and update comments). This patch is essentially an implementation patch. I have not added any heuristic (in terms of branches added or code size) to decide when this should be enabled. Reviewers: mkuper, sanjoy, reames, evstupac Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33001 llvm-svn: 306846	2017-06-30 17:57:07 +00:00
Nikolai Bozhenov	bde9b14c6f	Revert of r306525: "Canonicalize clamp of float types to minmax" llvm-svn: 306815	2017-06-30 10:39:09 +00:00
Ayal Zaks	8d26f0a602	[LV] Optimize for size when vectorizing loops with tiny trip count It may be detrimental to vectorize loops with very small trip count, as various costs of the vectorized loop body as well as enclosing overheads including runtime tests and scalar iterations may outweigh the gains of vectorizing. The current cost model measures the cost of the vectorized loop body only, expecting it will amortize other costs, and loops with known or expected very small trip counts are not vectorized at all. This patch allows loops with very small trip counts to be vectorized, but under OptForSize constraints, which ensure the cost of the loop body is dominant, having no runtime guards nor scalar iterations. Patch inspired by D32451. Differential Revision: https://reviews.llvm.org/D34373 llvm-svn: 306803	2017-06-30 08:02:35 +00:00
Craig Topper	880bf82685	[InstCombine] In foldXorToXor, move the commutable matcher from the LHS match to the RHS match. No meaningful change intended. There are two conditions ORed here with similar checks and each contains two matches that must be true for the if to succeed. With the commutable match on the first half of the OR, both ifs basically have the same first part and only the second part distinguishes them. With this change we move the commutable match to the second half and make the first half unique. This caused some tests to change because we now produce a commuted result, but this shouldn't matter in practice. llvm-svn: 306800	2017-06-30 07:37:41 +00:00
Chandler Carruth	3545a9e1f9	Remove the BBVectorize pass. It served us well, helped kick-start much of the vectorization efforts in LLVM, etc. Its time has come and passed. Back in 2014: http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html Time to actually let go and move forward. =] I've updated the release notes both about the removal and the deprecation of the corresponding C API. llvm-svn: 306797	2017-06-30 07:09:08 +00:00
Daniel Jasper	3b704ceba1	Revert "r306541 - Add zero-length check to memcpy/memset load store loop expansion" Segfaults in non-optimized builds. I'll get a stack trace and a reproducer to Teresa. llvm-svn: 306793	2017-06-30 06:37:33 +00:00
Daniel Jasper	5ce1ce742e	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. llvm-svn: 306792	2017-06-30 06:32:21 +00:00
Max Kazantsev	8d0322e612	[SCEV] Use depth limit instead of local cache for SExt and ZExt In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for these methods and add the recursion depth limit instead, as we do for arithmetic. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785	2017-06-30 05:04:09 +00:00
Eric Christopher	a95aac3751	Reduce indenting and clean up comparisons around sign bit. llvm-svn: 306781	2017-06-30 01:57:48 +00:00
Eric Christopher	710c1c8faa	Reduce the complexity of the signbit/branch test functions. llvm-svn: 306779	2017-06-30 01:35:31 +00:00
Dehao Chen	2f31d0d86e	Hook the sample PGO machinery in the new PM Summary: This patch hooks up SampleProfileLoaderPass with the new PM. Reviewers: chandlerc, davidxl, davide, tejohnson Reviewed By: chandlerc, tejohnson Subscribers: tejohnson, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34720 llvm-svn: 306763	2017-06-29 23:33:05 +00:00
Dinar Temirbulatov	f05c73c132	[SLPVectorizer] Moving Entry->NeedToGather check out of inner loop, since it is invariant there. NFCI. llvm-svn: 306749	2017-06-29 21:56:33 +00:00
Sam Clegg	3d65030c45	Remove `inline` keyword from inline `classof` methods The style guide states that the explicit `inline` should not be used with inline methods. classof is a very common inline method with a fair amount of inconsistency: $ git grep classof ./include | grep inline | wc -l 230 $ git grep classof ./include | grep -v inline | wc -l 257 I chose to target this method rather than the larger change since this method is easily cargo-culted (I did it at least once). I considered doing the larger change and removing all occurrences but that would be a much larger change. Differential Revision: https://reviews.llvm.org/D33906 llvm-svn: 306731	2017-06-29 19:35:17 +00:00
Xin Tong	02008c30b5	Remove useless header. NFC llvm-svn: 306712	2017-06-29 17:48:12 +00:00
Leo Li	20fbad9307	[ConstantHoisting] Avoid hoisting constants in GEPs that index into a struct type. Summary: Indices for GEPs that index into a struct type should always be constants. This added more checks in `collectConstantCandidates:` which make sure constants for GEP pointer type are not hoisted. This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538 Reviewers: ributzka, rnk Reviewed By: ributzka Subscribers: efriedma, llvm-commits, srhines, javed.absar, pirama Differential Revision: https://reviews.llvm.org/D34576 llvm-svn: 306704	2017-06-29 17:03:34 +00:00
Daniel Berlin	b7df17ec59	PredicateInfo: Use OrderedInstructions instead of our homemade version. llvm-svn: 306703	2017-06-29 17:01:14 +00:00
Daniel Berlin	b779db7ebc	NewGVN: Remove useless test in addPhiOfOps. llvm-svn: 306702	2017-06-29 17:01:10 +00:00
Daniel Berlin	7c757aee38	Remove unneeded else from OrderedInstructions::dominates. llvm-svn: 306701	2017-06-29 17:01:03 +00:00
Dinar Temirbulatov	7b96266a16	[SLPVectorizer] Introducing getTreeEntry() helper function [NFC] Differential Revision: https://reviews.llvm.org/D34756 llvm-svn: 306655	2017-06-29 08:46:18 +00:00
Craig Topper	798a19ab8e	[InstCombine] In visitXor, use m_Not on the instruction itself instead of looking for all ones in Op1. This is consistent with 3 other not checks before this one. NFCI llvm-svn: 306617	2017-06-29 00:07:08 +00:00
Keno Fischer	a236dae5d1	[InstCombine] Retain TBAA when narrowing memory accesses Summary: As discussed on the mailing list it is legal to propagate TBAA to loads/stores from/to smaller regions of a larger load tagged with TBAA. Do so for (load->extractvalue)=>(gep->load) and similar foldings. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D31954 llvm-svn: 306615	2017-06-28 23:36:40 +00:00
Ayal Zaks	d9bc43ef2a	[LV] Fix PR33613 - retain order of insertelement per part r306381 caused PR33613, by reversing the order in which insertelements were generated per unroll part. This patch fixes PR33613 by retaining this order, placing each set of insertelements per part immediately after the last scalar being packed for this part. Includes a test case derived from PR33613. Reference: https://bugs.llvm.org/show_bug.cgi?id=33613 Differential Revision: https://reviews.llvm.org/D34760 llvm-svn: 306575	2017-06-28 17:59:33 +00:00
Geoff Berry	b0573547f6	[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D34532 llvm-svn: 306564	2017-06-28 17:01:15 +00:00
Sanjay Patel	4e96f19052	[InstCombine] use local variable to reduce code; NFCI llvm-svn: 306560	2017-06-28 16:39:06 +00:00
Geoff Berry	66d9bdbca8	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554	2017-06-28 15:53:17 +00:00
Teresa Johnson	538b8d25f0	Add zero-length check to memcpy/memset load store loop expansion Summary: I was testing using this expansion logic in other cases besides NVPTX, and found some runtime failures due to the lack of a check for a zero length memcpy/memset before the loop. There is already such a check in the memmove expansion code though. Reviewers: hfinkel Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34707 llvm-svn: 306541	2017-06-28 13:07:37 +00:00
Nikolai Bozhenov	b01e6b5a52	[InstCombine] Canonicalize clamp of float types to minmax in fast mode. Summary: This commit allows matchSelectPattern to recognize clamp of float arguments in the presence of FMF the same way as already done for integers. This case is a little different though. With integers, given the min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX "automatically". That is not the case for float, because for them only full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care about NaNs. On the other hand, some backends (e.g. X86) have only FMIN/FMAX nodes that do not care about NaNs, and the former NAN/NUM nodes are illegal, thus selection is not happening. So I decided to do such kind of transformation in IR (InstCombiner) instead of complicating the logic in the backend. Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper Reviewed By: efriedma Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D33186 llvm-svn: 306525	2017-06-28 09:26:20 +00:00
Max Kazantsev	6c466a376e	[IRCE][NFC] Better get SCEV for 1 in calculateSubRanges A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive invocations, and we don't create a ConstantInt if 'true' branch is taken. Differential Revision: https://reviews.llvm.org/D34672 llvm-svn: 306503	2017-06-28 04:57:45 +00:00
Kyle Butt	f73c8a06a9	Inlining: Don't re-map simplified cloned instructions. When simplifying an instruction that has been re-mapped, it should never simplify to an instruction in the original function. In the edge case where we are inlining a function into itself, the existing code led to incorrect behavior. Replace the incorrect code with an assert verifying that we never expect simplification to produce an instruction in the old function, unless the functions are the same. Differential Revision: https://reviews.llvm.org/D33850 llvm-svn: 306495	2017-06-28 01:41:25 +00:00
Peter Collingbourne	92648c25a4	Bitcode: Write the irsymtab to disk. Differential Revision: https://reviews.llvm.org/D33973 llvm-svn: 306487	2017-06-27 23:50:11 +00:00
Geoff Berry	2573a19fe6	[EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE. llvm-svn: 306477	2017-06-27 22:25:02 +00:00
Dehao Chen	920d022519	re-commit r306336: Enable vectorizer-maximize-bandwidth by default. Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306473	2017-06-27 22:05:58 +00:00
Craig Topper	5fe0197622	[InstCombine] Propagate nsw flag when turning mul by pow2 into shift when the constant is a vector splat or the scalar bit width is larger than 64-bits The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64 bits or less. This patch changes it to use m_APInt to remove both of these issues. Differential Revision: https://reviews.llvm.org/D34699 llvm-svn: 306457	2017-06-27 19:57:53 +00:00
Serge Guelton	7bc405aa4c	[CodeExtractor] Prevent extraction of block involving blockaddress BlockAddress are only valid within their function context, which does not interact well with CodeExtractor. Detect this case and prevent it. Differential Revision: https://reviews.llvm.org/D33839 llvm-svn: 306448	2017-06-27 18:57:53 +00:00
Yaxun Liu	7c44f340de	[SROA] Fix APInt size when alloca address space is not 0 SROA assumes alloca address space is 0, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34104 llvm-svn: 306440	2017-06-27 18:26:06 +00:00
Sanjay Patel	7227276d41	[InstCombine] canonicalize icmp predicate feeding select This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. We have this transform for icmp+br, so unless there's some reason that icmp+select should be treated differently, we should do the same thing here. The benefit comes from increasing the chances of creating identical instructions. This is shown in the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE can simplify the identical cmps, and then InstCombine can fold the selects together. The possible regression for the tests in select.ll raises questions about poison/undef: http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html ...but that transform is just as likely to be triggered by this canonicalization as it is to be missed, so we're just pointing out a commutation deficiency in the pattern matching: https://reviews.llvm.org/rL228409 Differential Revision: https://reviews.llvm.org/D34242 llvm-svn: 306435	2017-06-27 17:53:22 +00:00
Dehao Chen	66131665c4	Enable ICP for AutoFDO. Summary: AutoFDO should have ICP enabled. Reviewers: davidxl Reviewed By: davidxl Subscribers: sanjoy, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D34662 llvm-svn: 306429	2017-06-27 17:23:33 +00:00
Anna Thomas	dc935a6eb6	[LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCI Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block (which is proven to be the only exiting block in the loop to be unrolled). llvm-svn: 306410	2017-06-27 14:14:35 +00:00
Ayal Zaks	fc1e210d44	Recommitting 306331. Undoing revert 306338 after fixed bug: add metadata to the load instead of the reverse shuffle added to it, retaining the original ValueMap implementation. llvm-svn: 306381	2017-06-27 08:41:19 +00:00
Chandler Carruth	3f81d8024c	[SROA] Fix PR32902 by more carefully propagating !nonnull metadata. This is based heavily on the work done in D34285. I mostly wanted to do test cleanup for the author to save them some time, but I had a really hard time understanding why it was so hard to write better test cases for these issues. The problem is that because SROA does a second rewrite of the loads and because we don't propagate !nonnull for non-pointer loads, we first introduced invalid !nonnull metadata and then stripped it back off just in time to avoid most ways of this PR manifesting. Moving to the more careful utility only fixes this by changing the predicate to look at the new load's type rather than the target type. However, that does fix the bug, and the utility is much nicer including adding range metadata to model the nonnull property after a conversion to an integer. However, we have bigger problems because we don't actually propagate range metadata, and the utility to do this extracted from instcombine isn't really in good shape to do this currently. It only handles the case of copying range metadata from an integer load to a pointer load. It doesn't even handle the trivial cases of propagating from one integer load to another when they are the same width! This utility will need to be beefed up prior to using in this location to get the metadata to fully survive. And even then, we need to go and teach things to turn the range metadata into an assume the way we do with nonnull so that when we promote an integer we don't lose the information. All of this will require a new test case that looks kind-of like `preserve-nonnull.ll` does here but focuses on range metadata. It will also likely require more testing because it needs to correctly handle changes to the integer width, especially as SROA actively tries to change the integer width!
Last but not least, I'm a little worried about hooking the range metadata up here because the instcombine logic for converting from a range metadata to a nonnull metadata node seems broken in the face of non-zero address spaces where null is not mapped to the integer `0`. So that probably needs to get fixed with test cases both in SROA and in instcombine to cover it. But this does extract the core PR fix from D34285 of preventing the !nonnull metadata from being propagated in a broken state just long enough to feed into promotion and crash value tracking. On D34285 there is some discussion of zero-extend handling because it isn't necessary. First, the new load size covers all of the non-undef (ie, possibly initialized) bits. This may even extend past the original alloca if loading those bits could produce valid data. The only way it's valid for us to zero-extend an integer load in SROA is if the original code had a zero extend or those bits were undef. And we get to assume things like undef never satisfies nonnull, so non undef bits can participate here. No need to special case the zero-extend handling, it just falls out correctly. The original credit goes to Ariel Ben-Yehuda! I'm mostly landing this to save a few rounds of trivial edits fixing style issues and test case formulation. Differential Revision: D34285 llvm-svn: 306379	2017-06-27 08:32:03 +00:00
Mikael Holmen	37b5120a9a	[Reassociate] Make sure EraseInst sets MadeChange Summary: EraseInst didn't report that it made IR changes through MadeChange. It is essential that changes to the IR are reported correctly, since for example ReassociatePass::run() will indicate that all analyses are preserved otherwise. And the CGPassManager determines if the CallGraph is up-to-date based on status from InstructionCombiningPass::runOnFunction(). Reviewers: craig.topper, rnk, davide Reviewed By: rnk, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34616 llvm-svn: 306368	2017-06-27 05:32:13 +00:00
Dehao Chen	8b7effb344	revert r306336 for breaking ppc test. llvm-svn: 306344	2017-06-26 23:05:35 +00:00
Ayal Zaks	3923c0c46b	Reverting 306331. Causes TBAA metadata to be generated on reverse shuffles, investigating. llvm-svn: 306338	2017-06-26 22:26:54 +00:00
Dehao Chen	79655792cc	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306336	2017-06-26 21:41:09 +00:00
Ayal Zaks	e7e15d186b	[LV] Changing the interface of ValueMap, NFC. Instead of providing access to the internal MapStorage holding all Values associated with a given Key, used for setting or resetting them all together, ValueMap keeps its MapStorage internal; its new interface allows getting, setting or resetting a single Value, per part or per part-and-lane. Follows the discussion in https://reviews.llvm.org/D32871. Differential Revision: https://reviews.llvm.org/D34473 llvm-svn: 306331	2017-06-26 21:03:51 +00:00
Wei Mi	71f06420e4	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit fixes three bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of the loop in the last iteration cannot be regarded as available. The third one is an out-of-bound array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. llvm-svn: 306313	2017-06-26 18:16:10 +00:00
Chandler Carruth	2abb65ae11	[InstCombine] Factor the logic for propagating !nonnull and !range metadata out of InstCombine and into helpers. NFC, this just exposes the logic used by InstCombine when propagating metadata from one load instruction to another. The plan is to use this in SROA to address PR32902. If anyone has better ideas about how to factor this or name variables, I'm all ears, but this seemed like a pretty good start and lets us make progress on the PR. This is based on a patch by Ariel Ben-Yehuda (D34285). llvm-svn: 306267	2017-06-26 03:31:31 +00:00
Chandler Carruth	4a000883c7	[LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr. This was reverted in r306252, but I already had the bug fixed and was just trying to form a test case. The original commit factored the logic for forming dedicated exits inside of LoopSimplify into a helper that could be used elsewhere and with an approach that required fewer intermediate data structures. See that commit for full details including the change to the statistic, etc. The code looked fine to me and my reviewers, but in fact didn't handle indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty. If you have code that looks just right, you can end up leaking these predecessors into a subsequent rewrite, and crash deep down when trying to update PHI nodes for predecessors that don't exist. I've added an assert that makes the bug much more obvious, and then changed the code to reliably clear the vector so we don't get this bug again in some other form as the code changes. I've also added a test case that does manage to catch this while also giving some nice positive coverage in the face of indirectbr. The real code that found this came out of what I think is CPython's interpreter loop, but any code with really "creative" interpreter loops mixing indirectbr and other exit paths could manage to tickle the bug. It was hard to reduce the original test case because in addition to having a particular pattern of IR, the whole thing depends on the order of the predecessors which in turn depends on use list order. The test case added here was designed so that in multiple different predecessor orderings it should always end up going down the same path and tripping the same bug. I hope. At least, it tripped it for me without manipulating the use list order which is better than anything bugpoint could do... llvm-svn: 306257	2017-06-25 22:45:31 +00:00
Anna Thomas	e7cb633d29	[LoopDeletion] NFC: Move phi node value setting into prepass Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator, which caused loop deletion tests to hang due to infinite loop. Had reverted it in rL306162. rL306157 commit message: Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306254	2017-06-25 21:13:58 +00:00
Daniel Jasper	4c6cd4ccb7	Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility." This leads to a segfault. Chandler already has a test case and should be able to recommit with a fix soon. llvm-svn: 306252	2017-06-25 17:58:25 +00:00
Sanjay Patel	2f3ead7adc	[InstCombine] add (sext i1 X), 1 --> zext (not X) http://rise4fun.com/Alive/i8Q A narrow bitwise logic op is obviously better than math for value tracking, and zext is better than sext. Typically, the 'not' will be folded into an icmp predicate. The IR difference would even survive through codegen for x86, so we would see worse code: https://godbolt.org/g/C14HMF one_or_zero(int, int): # @one_or_zero(int, int) xorl %eax, %eax cmpl %esi, %edi setle %al retq one_or_zero_alt(int, int): # @one_or_zero_alt(int, int) xorl %ecx, %ecx cmpl %esi, %edi setg %cl movl $1, %eax subl %ecx, %eax retq llvm-svn: 306243	2017-06-25 14:15:28 +00:00
Xinliang David Li	b67530e9b9	[PGO] Implement profile counter register promotion Differential Revision: http://reviews.llvm.org/D34085 llvm-svn: 306231	2017-06-25 00:26:43 +00:00
Hiroshi Inoue	b300824ee7	fix trivial typos in comment, NFC dereferencable -> dereferenceable llvm-svn: 306210	2017-06-24 15:43:33 +00:00
Craig Topper	7b66ffe875	[ValueTracking][InstCombine] Use m_Shr instead of m_CombineOr(m_LShr, m_AShr). NFC llvm-svn: 306205	2017-06-24 06:24:04 +00:00
Craig Topper	72ee6945af	[Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC llvm-svn: 306204	2017-06-24 06:24:01 +00:00
Vitaly Buka	df19ad456e	[InstCombine] Don't replace allocas with smaller globals Summary: InstCombine replaces large allocas with small global constants, causing buffer overflows on valid code; see PR33372. This fix permits this optimization only if the global is dereferenceable for the alloca size. Fixes PR33372 Reviewers: eugenis, majnemer, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34311 llvm-svn: 306194	2017-06-24 01:35:19 +00:00
Anna Thomas	77a2e6b198	Revert "[LoopDeletion] NFC: Move phi node value setting into prepass" This reverts commit r306157. It caused some timeouts in clang tests. Perhaps unreachable loops have far too many phi nodes. Reverting and investigating. llvm-svn: 306162	2017-06-23 21:30:48 +00:00
Anna Thomas	a43b387f27	[LoopDeletion] NFC: Move phi node value setting into prepass Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306157	2017-06-23 20:38:50 +00:00
Craig Topper	68ed55e06a	[CorrelatedValuePropagation] Fix typo in comment sense->since. NFC llvm-svn: 306152	2017-06-23 20:28:40 +00:00
Craig Topper	29cdfe2cd9	[CorrelatedValuePropagation] Remove comment about iterating switch cases in reverse order. This is no longer being done after r298791. NFC llvm-svn: 306151	2017-06-23 20:28:35 +00:00
Anna Thomas	91eed9ac1a	[RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFC The single exit block allowed in runtime unrolling is guaranteed to be the Latch's successor, so rename it as LatchExitBlock. llvm-svn: 306105	2017-06-23 14:28:01 +00:00
Anna Thomas	d67165c93c	[InstCombine] Recognize and simplify three way comparison idioms Summary: Many languages have a three way comparison idiom where comparing two values produces not a boolean, but a tri-state value. Typical values (e.g. as used in the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality, and +1 for greater than. We actually do a great job already of converting three way comparisons into binary comparisons when the result produced has only a single use. Unfortunately, such values can have more than one use, and in that case, our existing optimizations break down. The patch adds a peephole which converts a three-way compare + test idiom into a binary comparison on the original inputs. It focuses on replacing the test on the result of the three way compare and does nothing about removing the three way compare itself. That's left to other optimizations (which do actually kick in commonly.) We currently recognize one idiom on signed integer compare. In the future, we plan to recognize and simplify other comparison idioms on other signed/unsigned datatypes such as floats, vectors etc. This is a resurrection of Philip Reames' original patch: https://reviews.llvm.org/D19452 Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev Reviewed by: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34278 llvm-svn: 306100	2017-06-23 13:41:45 +00:00
Craig Topper	2c20c42cb6	[JumpThreading] Teach jump threading how to analyze (and (cmp A, C1), (cmp A, C2)) after InstCombine has turned it into (cmp (add A, C3), C4) Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compare if the compare is fed by a livein of a basic block. This can be used to prove the condition can't be met for some predecessor and the jump from that predecessor can be moved to the false path of the condition. But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare. This patch adds a new method to LVI to get a ConstantRange on an edge. Then we teach jump threading to detect the add livein feeding a compare and to get the ConstantRange and propagate it. Differential Revision: https://reviews.llvm.org/D33262 llvm-svn: 306085	2017-06-23 05:41:35 +00:00
Craig Topper	7927996140	[JumpThreading] Use some temporary variables to reduce the number of times we call the same methods. NFC A future patch will add even more uses of these variables. llvm-svn: 306084	2017-06-23 05:41:32 +00:00
Chandler Carruth	4ab0f4910a	[LoopSimplify] Factor the logic to form dedicated exits into a utility. I want to use the same logic as LoopSimplify to form dedicated exits in another pass (SimpleLoopUnswitch) so I wanted to factor it out here. I also noticed that there is a pretty significantly more efficient way to implement this than the way the code in LoopSimplify worked. We don't need to actually retain the set of unique exit blocks, we can just rewrite them as we find them and use only a set to deduplicate. This did require changing one part of LoopSimplify to not re-use the unique set of exits, but it only used it to check that there was a single unique exit. That part of the code is about to walk the exiting blocks anyways, so it seemed better to rewrite it to use those exiting blocks to compute this property on-demand. I also had to ditch a statistic, but it doesn't seem terribly valuable. Differential Revision: https://reviews.llvm.org/D34049 llvm-svn: 306081	2017-06-23 04:03:04 +00:00
Eric Christopher	5a7c2f1700	Remove the LoadCombine pass. It was never enabled and is unsupported. Based on discussions with the author on mailing lists. llvm-svn: 306067	2017-06-22 22:58:12 +00:00
Anna Thomas	72c90c87f8	[LoopDeletion] Update exits correctly when multiple duplicate edges from an exiting block Summary: Currently, we incorrectly update exit blocks of loops when there are multiple edges from a single exiting block to the exit block. This can happen when we have switches as the terminator of the exiting blocks. The fix here is to correctly update the phi nodes in the exit block, and remove all incoming values except for one which is from the preheader. Note: Currently, this error can manifest only while deleting non-executed loops. However, it is possible to trigger this error in invariant loops, once we enhance the logic around the exit conditions for the loop check. Reviewers: chandlerc, dberlin, sanjoy, efriedma Reviewed by: efriedma Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34516 llvm-svn: 306048	2017-06-22 20:20:56 +00:00
Craig Topper	dffbbcb3fd	[InstCombine] Teach foldSelectICmpAndOr to recognize (select (icmp slt (trunc (X)), 0), Y, (or Y, C2)) Summary: InstCombine likes to turn (icmp eq (and X, C1), 0) into (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's ability to recognize (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y). This patch tries to recover this. I had to flip around some of the early out checks so that I could create a new And instruction during the compare processing without it possibly never getting used. Reviewers: spatel, majnemer, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34184 llvm-svn: 306029	2017-06-22 16:23:30 +00:00
Craig Topper	0de5e6a729	[InstCombine] Add one use checks to or/and->xnor folding If the components of the and/or had multiple uses, this transform created an additional instruction. This patch makes sure we remove one of the components. Differential Revision: https://reviews.llvm.org/D34498 llvm-svn: 306027	2017-06-22 16:12:02 +00:00
Sanjay Patel	d1e811979c	[InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138) There are 2 parts to this patch made simultaneously to avoid a regression. We're reversing the canonicalization that moves bitwise vector ops before bitcasts. We're moving bitwise vector ops after bitcasts instead. That's the 1st and 3rd hunks of the patch. The motivation is that there's only one fold that currently depends on the existing canonicalization (see next), but there are many folds that would automatically benefit from the new canonicalization. PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these patterns in IR. There's an or(and,andn) pattern that requires an adjustment in order to continue matching to 'select' because the bitcast changes position. This match is unfortunately complicated because it requires 4 logic ops with optional bitcast and sext ops. Test diffs: 1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference - bitcast comes before logic. 2. There are also tests with no diffs in bitcast.ll that verify that we're still doing folds that were enabled by the previous canonicalization. 3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns to look through bitcasts. 4. logical-select.ll contains several tests for the or(and,andn) --> select fold to verify that we are still handling those cases. The lone diff shows the movement of the bitcast from the new canonicalization rule. Differential Revision: https://reviews.llvm.org/D33517 llvm-svn: 306011	2017-06-22 15:46:54 +00:00
Sanjay Patel	e800df8eac	[InstCombine] add peekThroughBitcast() helper; NFC This is an NFC portion of D33517. We have similar helpers in the backend. llvm-svn: 306008	2017-06-22 15:28:01 +00:00
Diana Picus	b512e91515	Revert "Enable vectorizer-maximize-bandwidth by default." This reverts commit r305960 because it broke self-hosting on AArch64. llvm-svn: 305990	2017-06-22 10:00:28 +00:00
Sam Clegg	705f798bff	Mark dump() methods as const. NFC Add const qualifier to any dump() method where adding one was trivial. Differential Revision: https://reviews.llvm.org/D34481 llvm-svn: 305963	2017-06-21 22:19:17 +00:00
Dehao Chen	014db29b89	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 305960	2017-06-21 22:01:32 +00:00
Craig Topper	34caf5396f	[Reassociate] Use early returns in a couple places to reduce indentation and improve readability. NFC llvm-svn: 305946	2017-06-21 19:39:35 +00:00
Craig Topper	99a2e89920	[Reassociate] Const correct a helper function. NFC llvm-svn: 305945	2017-06-21 19:39:33 +00:00
Craig Topper	a074c101e5	[InstCombine] Cleanup using commutable matchers. Make a couple helper methods standalone static functions. Put 'if' around variable declaration instead of after. NFC llvm-svn: 305941	2017-06-21 18:57:00 +00:00
Dehao Chen	50f2aa19e8	Do not inline recursive direct calls in sample loader pass. Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls. Reviewers: iteratee, davidxl Reviewed By: iteratee Subscribers: sanjoy, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D34456 llvm-svn: 305934	2017-06-21 17:57:43 +00:00
Craig Topper	5b173f2bb3	[InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits Summary: I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know. This patch adds range metadata to these intrinsics based on the known bits. Currently the patch punts if we already have range metadata present. Reviewers: spatel, RKSimon, davide, majnemer Reviewed By: RKSimon Subscribers: sanjoy, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D32582 llvm-svn: 305927	2017-06-21 16:32:35 +00:00
Craig Topper	ae86cc725d	[InstCombine] Don't let folding (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) create more instructions than it removes Summary: Previously this folding had no checks to see if it was going to result in fewer instructions. This was pointed out during the review of D34184. This patch adds code to count how many instructions it's going to create vs how many it's going to remove so we can make a proper decision. Reviewers: spatel, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34437 llvm-svn: 305926	2017-06-21 16:07:13 +00:00
Craig Topper	cbac691c4b	[Reassociate] Support xor reassociating for splat vectors Summary: This patch adds support for xors of splat vectors. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34354 llvm-svn: 305925	2017-06-21 16:07:09 +00:00
Davide Italiano	0ec715be1f	[NewGVN] Fix a bug that made the store verifier less effective. We weren't actually checking for duplicated stores, as the condition was always actually false. This was found by Coverity, and I have no clue how to trigger this in real-world code (although I tried for a bit). llvm-svn: 305867	2017-06-20 22:57:40 +00:00
Sanjay Patel	4ccbd58d70	[InstCombine] fix code/test comments for r305792; NFC These diffs were in the last version of the patch in D33342, but I accidentally committed the previous rev. llvm-svn: 305793	2017-06-20 12:45:46 +00:00
Sanjay Patel	adca825dc1	[InstCombine] try to canonicalize xor-of-icmps to and-of-icmps We have a large portfolio of folds for and-of-icmps and or-of-icmps in InstSimplify and InstCombine, but hardly anything for xor-of-icmps. Rather than trying to rethink and translate all of those folds, we can use the truth table definition of xor: X ^ Y --> (X | Y) & !(X & Y) ...to see if we can convert the xor to and/or and then use the existing folds. http://rise4fun.com/Alive/J9v Differential Revision: https://reviews.llvm.org/D33342 llvm-svn: 305792	2017-06-20 12:40:55 +00:00
Vedant Kumar	b5794ca90c	[ProfileData] PR33517: Check for failure of symtab creation With PR33517, it became apparent that symbol table creation can fail when presented with malformed inputs. This patch makes that sort of error detectable, so llvm-cov etc. can fail more gracefully. Specifically, we now check that function names within the symbol table aren't empty. Testing: check-{llvm,clang,profile}, some unit test updates. llvm-svn: 305765	2017-06-20 01:38:56 +00:00
Ana Pazos	f731bde064	[PATCH] [PGO] Fixed cast operation in MemIntrinsicVisitor::instrumentOneMemIntrinsic. Reviewers: xur, efriedma, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34293 llvm-svn: 305737	2017-06-19 20:04:33 +00:00
Taewook Oh	9083547ae3	Improve profile-guided heuristics to use estimated trip count. Summary: Existing heuristic uses the ratio between the function entry frequency and the loop invocation frequency to find cold loops. However, even if the loop executes frequently, if it has a small trip count per each invocation, vectorization is not beneficial. On the other hand, even if the loop invocation frequency is much smaller than the function invocation frequency, if the trip count is high it is still beneficial to vectorize the loop. This patch uses estimated trip count computed from the profile metadata as a primary metric to determine coldness of the loop. If the estimated trip count cannot be computed, it falls back to the original heuristics. Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32451 llvm-svn: 305729	2017-06-19 18:48:58 +00:00
Bjorn Pettersson	475fcd9cd8	[InstCombine] Make sure AddReachableCodeToWorklist sets MadeIRChange Summary: Some optimizations in AddReachableCodeToWorklist did not update the MadeIRChange state. This could happen both when removing trivially dead instructions (DCE) and at constant folds. It is essential that changes to the IR are reported correctly, since for example InstCombinePass::run() will indicate that all analyses are preserved otherwise. And the CGPassManager determines if the CallGraph is up-to-date based on status from InstructionCombiningPass::runOnFunction(). The new test case early_dce_clobbers_callgraph.ll is a reproducer for some asserts that started to trigger after changes in the inliner in r305245. With this patch the test case passes again. Reviewers: sanjoy, craig.topper, dblaikie Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34346 llvm-svn: 305725	2017-06-19 18:00:27 +00:00
Hans Wennborg	ca69fc1cb7	Revert r304824 "Fix PR23384 (part 3 of 3)" This seems to be interacting badly with ASan somehow, causing false reports of heap-buffer overflows: PR33514. > Summary: > The patch makes instruction count the highest priority for > LSR solution for X86 (previously registers had highest priority). > > Reviewers: qcolombet > > Differential Revision: http://reviews.llvm.org/D30562 > > From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 305720	2017-06-19 17:57:15 +00:00
Davide Italiano	daa9c0e403	[NewGVN] Simplify findConditionEquivalence(). NFCI. llvm-svn: 305707	2017-06-19 16:46:15 +00:00
Dinar Temirbulatov	e2c6991c07	Remove brackets, NFC. llvm-svn: 305706	2017-06-19 16:44:07 +00:00
Craig Topper	a7529b68cc	[InstCombine] Cleanup some duplicated one use checks Summary: These 4 patterns have the same one use check repeated twice for each. Once without a cast and one with. But the cast has no effect on what method is called. For the OR case I believe it is always profitable regardless of the number of uses since we'll never increase the instruction count. For the AND case I believe it is profitable if the pair of xors has one use such that we'll get rid of it completely. Or if the C value is something freely invertible, in which case the not doesn't cost anything. Reviewers: spatel, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34308 llvm-svn: 305705	2017-06-19 16:23:49 +00:00
Craig Topper	ef85498e05	[Reassociate] Support some reassociation of vector xors Summary: Currently we don't try to do anything with vector xors. This patch adds support for removing duplicate pairs from a chain of vector xors as it's pretty easy to support. We still don't try to combine the xors with and/ors, but I might try that in a future patch. Reviewers: mcrosier, davide, resistor Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34338 llvm-svn: 305704	2017-06-19 16:23:46 +00:00
Craig Topper	4350734d36	[Reassociate] Make one of the helper methods static because it doesn't use any class variables. NFC llvm-svn: 305703	2017-06-19 16:23:43 +00:00
Anna Thomas	7949f4529a	[JumpThreading][LVI] Invalidate LVI information after blocks are merged Summary: After a single predecessor is merged into a basic block, we need to invalidate the LVI information for the new merged block, when LVI is not provably true for all of the instructions in the new block. The test cases added show the correct LVI information using the LVI printer pass. Reviewers: reames, dberlin, davide, sanjoy Reviewed by: dberlin, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34108 llvm-svn: 305699	2017-06-19 15:23:33 +00:00
Xin Tong	b412831d11	[TRE] Improve code motion in TRE, use AA to tell whether a load can be moved before a call that writes to memory. Summary: use AA to tell whether a load can be moved before a call that writes to memory. Reviewers: dberlin, davide, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D34115 llvm-svn: 305698	2017-06-19 15:21:18 +00:00
Daniel Berlin	36b08b2088	NewGVN: Fix PR 33461, caused by slightly overzealous verification. llvm-svn: 305657	2017-06-19 00:24:00 +00:00
Craig Topper	d96177cf72	[Reassociate] Use APInt::isNullValue() instead of comparing with 0. NFC This should compile to slightly better code. llvm-svn: 305651	2017-06-18 18:15:38 +00:00
Xin Tong	9d2a5b1cf7	Add argmemonly attribute to strlen and wcslen, i.e. they only read memory (string) passed to them. Summary: This allows strlen to be moved out of the loop in case its argument is not modified in the loop in LICM. Reviewers: hfinkel, davide, sanjoy, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34323 llvm-svn: 305641	2017-06-18 03:10:26 +00:00
Sanjoy Das	b70ddd8901	[SROA] Add support for non-integral pointers Summary: C.f. http://llvm.org/docs/LangRef.html#non-integral-pointer-type Reviewers: chandlerc, loladiro Reviewed By: loladiro Subscribers: reames, loladiro, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D32203 llvm-svn: 305639	2017-06-17 20:28:13 +00:00
Xin Tong	025780ba6e	[TRE] Add assertion for folding trivial return block llvm-svn: 305637	2017-06-17 16:55:12 +00:00
Xin Tong	d5b4d0b53a	[TRE] Update comments. NFC llvm-svn: 305636	2017-06-17 16:18:36 +00:00
Wei Mi	c7ba876323	Revert rL305578. There is still some buildbot failure to be fixed. llvm-svn: 305603	2017-06-16 23:14:35 +00:00
Anna Thomas	6bc14c65ad	[InstCombine] Set correct insertion point for selects generated while folding phis Summary: When we fold vector constants that are operands of phi's that feed into select, we need to set the correct insertion point for the new selects that get generated. The correct insertion point is the incoming block for the phi. Such cases can occur with patch r298845, which fixed folding of vector constants, but the new selects could be inserted incorrectly (as the added test case shows). Reviewers: majnemer, spatel, sanjoy Reviewed by: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34162 llvm-svn: 305591	2017-06-16 21:08:37 +00:00
Davide Italiano	ec5b0257bf	[SCCP] Simplify the code a bit. NFCI. llvm-svn: 305583	2017-06-16 20:50:31 +00:00
Davide Italiano	0b1190aa8d	[SCCP] Clarify a comment about unhandled instructions. llvm-svn: 305579	2017-06-16 20:27:17 +00:00
Wei Mi	a2493b6ad9	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit fixes two bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration can not be regarded as available. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 305578	2017-06-16 20:21:01 +00:00
Davide Italiano	d95d871af0	[SCCP] Remove redundant instruction visitors. Whenever we don't know what to do with an instruction, we send it to overdefined anyway. llvm-svn: 305575	2017-06-16 19:43:57 +00:00
Xinliang David Li	c3f8e83253	Fix function name /NFC llvm-svn: 305564	2017-06-16 16:54:13 +00:00
Daniel Neilson	3faabbbe85	[Atomics] Rename and change prototype for atomic memcpy intrinsic Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy that is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558	2017-06-16 14:43:59 +00:00
Craig Topper	da6ea0d3e8	[InstCombine] Fold (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) == (K1 | K2) if K1 and K2 are a 1-bit mask Summary: This is the demorganed version of the case we already handle for the OR of iszero. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34244 llvm-svn: 305548	2017-06-16 05:10:37 +00:00
Craig Topper	4d76da4fc9	[CorrelatedValuePropagation] Remove superfluous semicolon. NFC llvm-svn: 305538	2017-06-16 01:53:20 +00:00
Evgeniy Stepanov	4d4ee93d25	[cfi] CFI-ICall for ThinLTO. Implement ControlFlowIntegrity for indirect function calls in ThinLTO. Design follows the RFC in llvm-dev, see https://groups.google.com/d/msg/llvm-dev/MgUlaphu4Qc/kywu0AqjAQAJ llvm-svn: 305533	2017-06-16 00:18:29 +00:00
Xinliang David Li	eea0ade2eb	[PartialInlining] Code Refactoring This is a NFC code refactoring and interface cleanup. This paves the way to enable outlining-only mode for the partial inliner. llvm-svn: 305530	2017-06-15 23:56:59 +00:00
Craig Topper	2ba991ff2c	[InstCombine] Add two FIXMEs for bad single use checks. NFC llvm-svn: 305510	2017-06-15 21:38:48 +00:00
Teresa Johnson	152277952e	Split PGO memory intrinsic optimization into its own source file Summary: Split the PGOMemOPSizeOpt pass out from IndirectCallPromotion.cpp into its own file. Reviewers: davidxl Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D34248 llvm-svn: 305501	2017-06-15 20:23:57 +00:00
Craig Topper	f2d3e6d3d5	[InstCombine] Make the context instruction parameter of foldOrOfICmps a reference to discourage passing nullptr and to remove the '&' from all of the call sites. NFC llvm-svn: 305493	2017-06-15 19:09:51 +00:00
Craig Topper	6eec9e21a5	[InstCombine] Handle (iszero(A & K1) | iszero(A & K2)) -> (A & (K1 | K2)) != (K1 | K2) when one of the Ands is commuted relative to the other Currently we expect A to be on the same side in both Ands but nothing guarantees that. While there also switch to using matchers for some of the code. Differential Revision: https://reviews.llvm.org/D34230 llvm-svn: 305487	2017-06-15 17:55:20 +00:00
Max Kazantsev	dc80366d52	[ScalarEvolution] Apply Depth limit to getMulExpr This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As a result of this, a chain getAddExpr -> ... -> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463	2017-06-15 11:48:21 +00:00
George Karpenkov	406c113103	Fixing section name for Darwin platforms for sanitizer coverage On Darwin, section names have a 16char length limit. llvm-svn: 305429	2017-06-14 23:40:25 +00:00
Daniel Berlin	6d2db9edb2	PredicateInfo: Don't insert conditional info when a conditional branch jumps to the same target regardless of condition llvm-svn: 305416	2017-06-14 21:19:52 +00:00
Daniel Berlin	51e878e01d	NewGVN: This is wrong by inspection, it will not cause an issue currently due to other limitations, i believe. This also means i can't make a test for it. llvm-svn: 305415	2017-06-14 21:19:28 +00:00
Davide Italiano	0dc4778067	[EarlyCSE] Make PhiToCheck in removeMSSA() a set. This way we end up not looking at PHI args already removed. MemSSA now goes through the updater so we can prune it to avoid having redundant MemoryPHI arguments, but that doesn't quite work for the general case. Discussed with Daniel Berlin, fixes PR33406. llvm-svn: 305409	2017-06-14 19:29:53 +00:00
Frederich Munch	dceb612eeb	Hide dbgs() stream for when built with -fmodules. Summary: Make DebugCounter::print and dump methods to be const correct. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34214 llvm-svn: 305408	2017-06-14 19:16:22 +00:00
Vedant Kumar	9c056c9e1b	[InstrProf] Don't take the address of alwaysinline available_externally functions Doing so breaks compilation of the following C program (under -fprofile-instr-generate): __attribute__((always_inline)) inline int foo() { return 0; } int main() { return foo(); } At link time, we fail because taking the address of an available_externally function creates an undefined external reference, which the TU cannot provide. Emitting the function definition into the object file at all appears to be a violation of the langref: "Globals with 'available_externally' linkage are never emitted into the object file corresponding to the LLVM module." Differential Revision: https://reviews.llvm.org/D34134 llvm-svn: 305327	2017-06-13 22:12:35 +00:00
Teresa Johnson	8015f88525	[PGO] Update VP metadata after memory intrinsic optimization Summary: Leave an updated VP metadata on the fallback memcpy intrinsic after specialization. This can be used for later possible expansion based on the average of the remaining values. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34164 llvm-svn: 305321	2017-06-13 20:44:08 +00:00
Frederich Munch	6391c7e2a1	Revert r305313 & r305303, self-hosting build-bot isn’t liking it. llvm-svn: 305318	2017-06-13 19:05:24 +00:00
Frederich Munch	4c73b40dca	Force RegisterStandardPasses to construct std::function in the IPO library. Summary: Fixes an issue using RegisterStandardPasses from a statically linked object before PassManagerBuilder::addGlobalExtension is called from a dynamic library. Reviewers: efriedma, theraven Reviewed By: efriedma Subscribers: mehdi_amini, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D33515 llvm-svn: 305303	2017-06-13 16:48:41 +00:00
David Blaikie	6d0f39476a	Inliner: Avoid calling shouldInline until it's absolutely necessary This restores the order of evaluation (& conditionalized evaluation) of isTriviallyDeadInstruction, InlineHistoryIncludes, and shouldInline (with the addition of a shouldInline call after isTriviallyDeadInstruction) from before r305245. llvm-svn: 305267	2017-06-13 02:24:09 +00:00
George Burgess IV	f613749382	Fix signed/unsigned comparison warning; NFC llvm-svn: 305262	2017-06-13 01:28:49 +00:00
David Blaikie	ae8c4af4ac	Inliner: Don't remove calls to readnone+nounwind (but not always_inline) functions in the AlwaysInliner llvm-svn: 305245	2017-06-12 23:01:17 +00:00
Anna Thomas	4b027e8f89	[RS4GC] Drop invalid metadata after pointers are relocated Summary: After RS4GC, we should drop metadata that is no longer valid. Such metadata is used by optimizations scheduled after RS4GC, and can cause a miscompile. One such metadata is invariant.load which is used by LICM sinking transform. After rewriting statepoints, the address of a load may be relocated. With invariant.load metadata on a load instruction, LICM sinking assumes the loaded value (from a dereferenceable address) to be invariant, and rematerializes the load operand and the load at the exit block. This transforms the IR to have an unrelocated use of the address after a statepoint, which is incorrect. Other metadata we conservatively remove are related to dereferenceability and noalias metadata. This patch drops such metadata on store and load instructions after rewriting statepoints. Reviewers: reames, sanjoy, apilipenko Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33756 llvm-svn: 305234	2017-06-12 21:26:53 +00:00
Sanjay Patel	2e33bbaff0	[InstCombine] lshr (sext iM X to iN), N-M --> zext (ashr X, min(N-M, M-1)) to iN This is a follow-up to https://reviews.llvm.org/D33879 / https://reviews.llvm.org/rL304939 , and was discussed in https://reviews.llvm.org/D33338. We prefer this form because a narrower shift may be cheaper, and we can more easily fold a zext than a sext. http://rise4fun.com/Alive/slVe Name: shz %s = sext i8 %x to i12 %r = lshr i12 %s, 4 => %a = ashr i8 %x, 4 %r = zext i8 %a to i12 llvm-svn: 305190	2017-06-12 14:23:43 +00:00
Xinliang David Li	7ed6cd32ea	[PartialInlining] Support shrinkwrap life_range markers Differential Revision: http://reviews.llvm.org/D33847 llvm-svn: 305170	2017-06-11 20:46:05 +00:00
Geoff Berry	3cca1da20c	[EarlyCSE] Add option to use MemorySSA for function simplification run of EarlyCSE (off by default). Summary: Use MemorySSA for memory dependency checking in the EarlyCSE pass at the start of the function simplification portion of the pipeline. We rely on the fact that GVNHoist runs just after this pass of EarlyCSE to amortize the MemorySSA construction cost since GVNHoist uses MemorySSA and EarlyCSE preserves it. This is turned off by default. A follow-up change will turn it on to allow for easier reversion in case it breaks something. llvm-svn: 305146	2017-06-10 15:20:03 +00:00
Andrew Kaylor	647025f9e1	[InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin Differential Revision: https://reviews.llvm.org/D33737 llvm-svn: 305132	2017-06-09 23:18:11 +00:00
Yaxun Liu	6455b0dbf3	[SROA] Fix APInt size when load/store have different address space Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in GEPOperator::accumulateConstantOffset. Basically it does not consider the situation that the pointer operand of load or store may be in a non-zero address space and its size may be different from the size of a pointer in address space 0. This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend. Differential Revision: https://reviews.llvm.org/D33298 llvm-svn: 305107	2017-06-09 20:46:29 +00:00
Keno Fischer	5329174cb1	[Sink] Fix predicate in legality check Summary: isSafeToSpeculativelyExecute is the wrong predicate to use here. All that checks for is whether it is safe to hoist a value due to unaligned/un-dereferencable accesses. However, not only are we doing sinking rather than hoisting, our concern is that the location we're loading from may have been modified. Instead forbid sinking any load across a critical edge. Reviewers: majnemer Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D33179 llvm-svn: 305102	2017-06-09 19:31:10 +00:00
Sanjay Patel	70db424601	[SimplifyLibCalls] fix formatting; NFC llvm-svn: 305081	2017-06-09 14:22:03 +00:00
Serguei Katkov	38414b57f9	[IndVars] Add an option to be able to disable LFTR This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization. By default option is off so current behavior is not changed. Reviewers: reames, sanjoy, wmi, andreadb, apilipenko Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33979 llvm-svn: 305055	2017-06-09 06:11:59 +00:00
George Burgess IV	a20352e13e	[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops. If we're shrinking a binary operation, it may be the case that the new operation wraps where the old one didn't. If this happens, the behavior should be well-defined. So, we can't always carry wrapping flags with us when we shrink operations. If we do, we get incorrect optimizations in cases like: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] - 128; } which gets optimized to: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] | 128; } Because: - InstCombine turned `sub i32 %from.i, 128` into `add nuw nsw i32 %from.i, 128`. - LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a vector full of `i8 128`s - InstCombine took advantage of the fact that the newly-shrunken add "couldn't wrap", and changed the `add` to an `or`. InstCombine seems happy to figure out whether we can add nuw/nsw on its own, so I just decided to drop the flags. There are already a number of places in LoopVectorize where we rely on InstCombine to clean up. llvm-svn: 305053	2017-06-09 03:56:15 +00:00
David Blaikie	cb9327b02d	Inliner: Don't touch indirect calls Other comments/implications are that this isn't intended behavior (nor preserved/reimplemented in the new inliner) & complicates fixing the 'inlining' of trivially dead calls without consulting the cost function first. llvm-svn: 305052	2017-06-09 03:29:20 +00:00
Craig Topper	a420562257	[InstCombine] Pass a proper context instruction to all of the calls into InstSimplify Summary: This matches the behavior we already had for compares and makes us consistent everywhere. Reviewers: dberlin, hfinkel, spatel Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33604 llvm-svn: 305049	2017-06-09 03:21:29 +00:00
Evgeniy Stepanov	d02dbf6b1c	[CFI] Remove LinkerSubsectionsViaSymbols. Since D17854 LinkerSubsectionsViaSymbols is unnecessary. It is interfering with ThinLTO implementation of CFI-ICall, where the aliases used on the !LinkerSubsectionsViaSymbols branch are needed to export jump tables to ThinLTO backends. This is the second attempt to land this change after fixing PR33316. llvm-svn: 305031	2017-06-08 23:38:22 +00:00
Craig Topper	2aa4d39f5e	[ExtractGV] Fix the doxygen comment on the constructor and the class to refer to global values instead of functions. While there fix an 80 column violation. NFC llvm-svn: 305030	2017-06-08 23:38:19 +00:00
Peter Collingbourne	e357fbd243	Write summaries for merged modules when splitting modules for ThinLTO. This is to prepare to allow for dead stripping of globals in the merged modules. Differential Revision: https://reviews.llvm.org/D33921 llvm-svn: 305027	2017-06-08 23:01:49 +00:00
Kostya Serebryany	2c2fb8896b	[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308 llvm-svn: 305026	2017-06-08 22:58:19 +00:00
Dehao Chen	e2a428bad7	Do not early-inline recursive calls in sample profile loader. Summary: Early-inlining of recursive calls makes the code size bloat exponentially. We should disable it. Reviewers: davidxl, dnovillo, iteratee Reviewed By: iteratee Subscribers: iteratee, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34017 llvm-svn: 305009	2017-06-08 20:11:57 +00:00
Galina Kistanova	e128958552	Changed a comparison operator for std::stable_sort to implement strict weak ordering. This is a temporary fix which needs additional work, as it triggers a test3 failure. test3 is commented out till then. llvm-svn: 304993	2017-06-08 17:27:40 +00:00
Nirav Dave	62fb8498d3	InferAddressSpaces: Avoid assertion failure with replacing identical cloned constexpr Have cloneConstantExprWithNewAddressSpaces return nullptr when returning initial ConstantExpr. Reviewers: arsenm Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D33995 llvm-svn: 304975	2017-06-08 13:20:55 +00:00
John Brawn	da4a68a1d2	[BPI] Don't assume that strcmp returning >0 is more likely than <0 The zero heuristic assumes that integers are more likely positive than negative, but this also has the effect of assuming that strcmp return values are more likely positive than negative. Given that for nonzero strcmp return values it's the ordering of arguments that determines the sign of the result there's no reason to assume that's true. Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to decide if it's strcmp-like, and if so only assume that nonzero is more likely than zero i.e. strings are more often different than the same. This causes a slight code generation change in the spec2006 benchmark 403.gcc, but with no noticeable performance impact. The intent of this patch is to allow better optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are also some changes that need to be made to if-conversion. Differential Revision: https://reviews.llvm.org/D33934 llvm-svn: 304970	2017-06-08 09:44:40 +00:00
Sanjay Patel	66f7fdb300	[InstCombine] fold lshr (sext X), C1 --> zext (lshr X, C2) This was discussed in D33338. We have larger pattern-matching ending in a truncate that we can reduce or remove by handling these smaller patterns first. Further motivation is that narrower shift ops are easier for value tracking and zext is better than sext. http://rise4fun.com/Alive/rhh Name: boolshift %sext = sext i1 %x to i8 %r = lshr i8 %sext, 7 => %r = zext i1 %x to i8 Name: noboolshift %sext = sext i3 %x to i8 %r = lshr i8 %sext, 7 => %sh = lshr i3 %x, 2 %r = zext i3 %sh to i8 Differential Revision: https://reviews.llvm.org/D33879 llvm-svn: 304939	2017-06-07 20:32:08 +00:00
Xinliang David Li	4f49bee764	Fix builtin_expect lowering bug PR33346 Skip cases when expected value is not constant int. llvm-svn: 304933	2017-06-07 18:32:24 +00:00
Peter Collingbourne	aaae7eed5c	LowerTypeTests: Generate simpler IR for br(llvm.type.test, then, else). This makes it so that the code quality for CFI checks when compiling with -O2 and linking with --lto-O0 is similar to that of the rest of the code. Reduces the size of a chrome binary built with -O2/--lto-O0 by about 750KB. Differential Revision: https://reviews.llvm.org/D33925 llvm-svn: 304921	2017-06-07 15:49:14 +00:00
Craig Topper	73ba1c84be	[InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC These methods are specifically optimized to only count leading zeros without an additional uint64_t compare. llvm-svn: 304876	2017-06-07 07:40:37 +00:00
Craig Topper	29c282eac8	[InstCombine] Fix two asserts that were accidentally checking that an APInt pointer is non-zero instead of checking that the APInt itself is non-zero. I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used. llvm-svn: 304875	2017-06-07 07:40:29 +00:00
Zachary Turner	264b5d9e88	Move Object format code to lib/BinaryFormat. This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864	2017-06-07 03:48:56 +00:00
Evgeny Stupachenko	3b88291581	Fix PR23384 (part 3 of 3) Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304824	2017-06-06 20:04:16 +00:00
Daniel Berlin	eafdd862e5	NewGVN: Fix PR/33187. This is a bug caused by two things: 1. When there is no perfect iteration order, we can't let phi nodes put themselves in terms of things that come later in the iteration order, or we will endlessly cycle (the normal RPO algorithm clears the hashtable to avoid this issue). 2. We are sometimes erasing the wrong expression (causing pessimism) because our equality says loads and stores are the same. We introduce an exact equality function and use it when erasing to make sure we erase only identical expressions, not equivalent ones. llvm-svn: 304807	2017-06-06 17:15:28 +00:00
Anna Thomas	b2a212c070	[Atomics][LoopIdiom] Recognize unordered atomic memcpy Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference between recognizing an unordered atomic memcpy and a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames, anna Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304806	2017-06-06 16:45:25 +00:00
Anna Thomas	7218032019	[IRCE] Canonicalize pre/post loops after the blocks are added into parent loop Summary: We were canonicalizing the pre loop (into loop-simplify form) before the post loop blocks were added into parent loop. This is incorrect when IRCE is done on a subloop. The post-loop blocks are created, but not yet added to the parent loop. So, loop-simplification on the pre-loop incorrectly updates LoopInfo. This patch corrects the ordering so that pre and post loop blocks are added to parent loop (if any), and then the loops are canonicalized to LCSSA and LoopSimplifyForm. Reviewers: reames, sanjoy, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33846 llvm-svn: 304800	2017-06-06 14:54:01 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Xin Tong	9d6f08a8d4	Add a dominance check interface that uses caching for instructions within same basic block. Summary: This problem stems from the fact that instructions are allocated using new in LLVM, i.e. there is no relationship that can be derived by just looking at the pointer value. This interface dispatches to appropriate dominance check given 2 instructions, i.e. in case the instructions are in the same basic block, ordered basicblock (with instruction numbering and caching) is used. Otherwise, dominator tree is used. This is a preparation patch for https://reviews.llvm.org/D32720 Reviewers: dberlin, hfinkel, davide Subscribers: davide, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D33380 llvm-svn: 304764	2017-06-06 02:34:41 +00:00
Evgeny Stupachenko	f2b3b467e5	Fix PR23384 (part 2 of 3) NFC Summary: The patch moves LSR cost comparison to target part. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30561 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304750	2017-06-05 23:37:00 +00:00
Evgeny Stupachenko	4d94e99446	LSR: Calculate instruction cost only if InsnsCost is set to true (NFC) Summary: The patch guards all instruction cost calculations with the InsnsCost (-lsr-insns-cost) option. Currently, even if the option is set to false, we calculate and print (in debug mode) instruction costs. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D33914 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304746	2017-06-05 22:44:18 +00:00
Sven van Haastregt	78819e0fd4	[InstCombine] Fix extractelement use before def This fixes a bug that can cause extractelements with operands that haven't been defined yet to be inserted at a wrong point when optimising insertelements. Patch by Karl Hylen. Differential Revision: https://reviews.llvm.org/D33449 llvm-svn: 304701	2017-06-05 09:18:10 +00:00
Renato Golin	cdf840fd38	Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet." This reverts commit r304630, as it broke ARM/AArch64 bots for 2 days. llvm-svn: 304698	2017-06-05 07:35:52 +00:00
Ayal Zaks	ab32aff838	[LV] Make scalarizeInstruction() non-virtual. NFC. Following the request made in https://reviews.llvm.org/D32871, scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is hereby made non-virtual in InnerLoopVectorizer. Should have been part of r297580 originally. llvm-svn: 304685	2017-06-04 13:29:51 +00:00
Craig Topper	0799ff9e64	[InstCombine] Add support for simplifying ctlz/cttz intrinsics based on known bits. llvm-svn: 304669	2017-06-03 18:50:32 +00:00
Galina Kistanova	e9cacb6ae8	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304638	2017-06-03 05:19:32 +00:00
Galina Kistanova	55344aba7e	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304637	2017-06-03 05:19:10 +00:00
Galina Kistanova	96d51f5bcb	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304636	2017-06-03 05:18:46 +00:00
Kostya Serebryany	f7db346cdf	[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. llvm-svn: 304630	2017-06-03 01:35:47 +00:00
Evgeniy Stepanov	704003ea3d	Revert "[CFI] Remove LinkerSubsectionsViaSymbols." This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin. llvm-svn: 304626	2017-06-03 00:46:27 +00:00
Alexey Bataev	e4e5923ef1	[SLP] Improve comments and naming of functions/variables/members, NFC. Fixed some comments, added an additional description of the algorithms, improved readability of the code. Differential revision: https://reviews.llvm.org/D33320 llvm-svn: 304616	2017-06-03 00:08:21 +00:00
Kostya Serebryany	aed6ba770c	[sanitizer-coverage] refactor the code to make it easier to add more sections in future. NFC llvm-svn: 304610	2017-06-02 23:13:44 +00:00
Alexey Bataev	03ca396b95	Revert "[SLP] Improve comments and naming of functions/variables/members, NFC." This reverts commit 6e311de8b907aa20da9a1a13ab07c3ce2ef4068a. llvm-svn: 304609	2017-06-02 23:09:15 +00:00
Philip Reames	b70cecd60a	[Statepoint] Be consistent about using deopt naming [NFCI] We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed. llvm-svn: 304607	2017-06-02 23:03:26 +00:00
Xinliang David Li	5fdc75aea1	Fix debug build test failure llvm-svn: 304600	2017-06-02 22:38:48 +00:00
Xinliang David Li	0b7d858fa3	[PartialInlining] Minor cost analysis tuning Also added a test option and 2 cost analysis related tests. llvm-svn: 304599	2017-06-02 22:08:04 +00:00
David Blaikie	6aeacaa527	FunctionAttrs: Skip it if the effective SCC (ignoring optnone functions) is empty Minor optimization but mostly simplifies my debugging so I'm not dealing with empty SCCNodeSets while investigating issues in this optimization. llvm-svn: 304597	2017-06-02 21:24:17 +00:00
Alexey Bataev	2c08fde9e5	[SLP] Improve comments and naming of functions/variables/members, NFC. Summary: Fixed some comments, added an additional description of the algorithms, improved readability of the code. Reviewers: anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33320 llvm-svn: 304593	2017-06-02 20:39:27 +00:00
Keno Fischer	514a6a54e7	[SROA] Fix crash due to bad bitcast Summary: As shown in the test case, SROA was crashing when trying to split stores (to the alloca) of loads (from anywhere), because it assumed the pointer operand to the loads and stores had to have the same address space. This isn't the case. Make sure to use the correct pointer type for both the load and the store. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D32593 llvm-svn: 304585	2017-06-02 19:04:17 +00:00
Evgeniy Stepanov	63f056327d	[CFI] Remove LinkerSubsectionsViaSymbols. Since D17854 LinkerSubsectionsViaSymbols is unnecessary. It is interfering with ThinLTO implementation of CFI-ICall, where the aliases used on the !LinkerSubsectionsViaSymbols branch are needed to export jump tables to ThinLTO backends. llvm-svn: 304582	2017-06-02 18:45:14 +00:00
Evgeniy Stepanov	b933ad3a77	Skip CFI for dead functions. Differential Revision: https://reviews.llvm.org/D33805 llvm-svn: 304578	2017-06-02 18:24:23 +00:00
Sanjay Patel	ce241f48c5	[InstCombine] fix icmp with not op and constant to work with splat vector constant llvm-svn: 304562	2017-06-02 16:29:41 +00:00
Sanjay Patel	4dc85eb75a	[InstCombine] improve perf by not creating a known non-canonical instruction Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp an extra time to canonicalize it: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp sgt i8 -43, %x IC: ADD: %cmp = icmp sgt i8 -43, %x IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 -43, %x IC: Mod = %cmp = icmp sgt i8 -43, %x New = %cmp = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp If we create the swapped ICmp directly, we go faster: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp llvm-svn: 304558	2017-06-02 16:11:14 +00:00
Gor Nishanov	053d2d24f7	[coroutines] PR33271: Remove stray coro.save intrinsics during CoroSplit Summary: Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned. Make sure we clean up orphaned coro.saves. The bug manifested with a crash similar to this: ``` llvm_unreachable("Unknown type!"); llvm::MVT::getVT (Ty=0x489518, HandleUnknown=false) llvm::EVT::getEVT llvm::TargetLoweringBase::getValueType llvm::ComputeValueVTs llvm::SelectionDAGBuilder::visitTargetIntrinsic ``` Reviewers: GorNishanov Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33817 llvm-svn: 304518	2017-06-02 02:18:36 +00:00
Xinliang David Li	621e8dcf1f	[Profile] Enhance expect lowering to handle correlated branches builtin_expect applied on && or || expressions was not handled properly before. With this patch, the problem is fixed. Differential Revision: http://reviews.llvm.org/D33164 llvm-svn: 304517	2017-06-02 02:09:31 +00:00
Philip Reames	ae80045deb	[RS4GC] Comment clarification llvm-svn: 304514	2017-06-02 01:52:06 +00:00
Davide Italiano	1dd5558e52	[PM] GVNSink is off by default, fix an obvious typo. llvm-svn: 304497	2017-06-01 23:47:53 +00:00
Xinliang David Li	d6cfba2a02	Fix compiler_rt buildbot failure llvm-svn: 304489	2017-06-01 23:05:11 +00:00
Keno Fischer	fa635d730f	Reapply "[Cloning] Take another pass at properly cloning debug info" This was rL304226, reverted in 304228 due to a clang assertion failure on the build bots. That problem should have been addressed by clang commit rL304470. llvm-svn: 304488	2017-06-01 23:02:12 +00:00
Evgeniy Stepanov	56584bbf16	(NFC) Track global summary liveness in GVFlags. Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of all the DeadSymbols sets. This is refactoring in order to make liveness information available in the RegularLTO pipeline. llvm-svn: 304466	2017-06-01 20:30:06 +00:00
Xinliang David Li	ee8d6acb1f	[Profile] Fix builtin_expect lowering bug The lowerer wrongly assumes the ICMP instruction 1) always has a constant operand; 2) the operand has value 0. It also assumes the expected value can only be one, thus other values other than one will be considered 'zero'. This leads to wrong profile annotation when other integer values are used other than 0, 1 in the comparison or in the expect intrinsic. Also missing is handling of equal predicate. This patch fixes all the above problems. Differential Revision: http://reviews.llvm.org/D33757 llvm-svn: 304453	2017-06-01 19:05:55 +00:00
Xinliang David Li	0a0acbcf78	[PartialInlining] Emit branch info and profile data as remarks This allows us to collect profile statistics to tune static branch prediction. Differential Revision: http://reviews.llvm.org/D33746 llvm-svn: 304452	2017-06-01 18:58:50 +00:00
Mandeep Singh Grang	33a1b73600	[PredicateInfo] Fix non-determinism in codegen uncovered by reverse iterating SmallPtrSet Summary: Sort OpsToRename before iterating to make iteration order deterministic. Thanks to Daniel Berlin for the sorting logic. Reviewers: dberlin, RKSimon, efriedma, davide Reviewed By: dberlin, davide Subscribers: sanjoy, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D33265 llvm-svn: 304447	2017-06-01 18:36:24 +00:00
Tim Shen	6b41141863	[ThinLTO] Migrate ThinLTOBitcodeWriter to the new PM. Summary: Also see D33429 for other ThinLTO + New PM related changes. Reviewers: davide, chandlerc, tejohnson Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D33525 llvm-svn: 304378	2017-06-01 01:02:12 +00:00
Xinliang David Li	32c5e809be	[PartialInlining] Reduce outlining overhead by removing unneeded live-out(s) Differential Revision: http://reviews.llvm.org/D33694 llvm-svn: 304375	2017-06-01 00:12:41 +00:00
Wei Mi	0bd3f41588	Revert rL304050. It may break sanitizer bootstrap. Revert it for now while investigating. llvm-svn: 304350	2017-05-31 21:29:33 +00:00
Reid Kleckner	5fbdd17714	[IR] Add additional addParamAttr/removeParamAttr to AttributeList API Summary: Fairly straightforward patch to fill in some of the holes in the attributes API with respect to accessing parameter/argument attributes. The patch aims to step further towards encapsulating the idx+FirstArgIndex pattern to access these attributes to within the AttributeList. Patch by Daniel Neilson! Reviewers: rnk, chandlerc, pete, javed.absar, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33355 llvm-svn: 304329	2017-05-31 19:23:09 +00:00
Kostya Serebryany	53b34c8443	[sanitizer-coverage] remove stale code (old coverage); llvm part llvm-svn: 304319	2017-05-31 18:27:33 +00:00
Anna Thomas	777bb90bdc	Revert "[Atomics][LoopIdiom] Recognize unordered atomic memcpy" This reverts commit r304310. It caused build failures in polly and mingw due to undefined reference to llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC. llvm-svn: 304315	2017-05-31 17:20:51 +00:00
Zaara Syeda	3a7578c658	[PPC] Inline expansion of memcmp This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313	2017-05-31 17:12:38 +00:00
Anna Thomas	056c009f1b	[Atomics][LoopIdiom] Recognize unordered atomic memcpy Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference between recognizing an unordered atomic memcpy and a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304310	2017-05-31 16:39:52 +00:00
Gor Nishanov	2bc782d8da	[coroutines] Call initializePass in coroutine pass constructors Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=33226 Reviewers: chandlerc, davide, majnemer, dblaikie Reviewed By: chandlerc Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33701 llvm-svn: 304277	2017-05-31 03:12:42 +00:00
Daniel Berlin	be3e7ba45e	NewGVN: Fix PR 33185 by checking whether we need to recursively generate a phi of ops, which we don't currently support. llvm-svn: 304272	2017-05-31 01:47:32 +00:00
Xinliang David Li	74480adafd	[PartialInlining] Shrinkwrap allocas with live range contained in outline region. Differential Revision: http://reviews.llvm.org/D33618 llvm-svn: 304245	2017-05-30 21:22:18 +00:00
Matthew Simpson	646475a9bc	[LV] Reapply r303763 with fix for PR33193 r303763 caused build failures in some out-of-tree tests due to an assertion in TTI. The original patch updated cost estimates for induction variable update instructions marked for scalarization. However, it didn't consider that the incoming value of an induction variable phi node could be a cast instruction. This caused queries for cast instruction costs with a mix of vector and scalar types. This patch includes a fix for cast instructions and the test case from PR33193. The fix was suggested by Jonas Paulsson <paulsson@linux.vnet.ibm.com>. Reference: https://bugs.llvm.org/show_bug.cgi?id=33193 Original Differential Revision: https://reviews.llvm.org/D33457 llvm-svn: 304235	2017-05-30 19:55:57 +00:00
Keno Fischer	3fa5db4c04	Revert "[Cloning] Take another pass at properly cloning debug info" At least one build bot is complaining. Will investigate after lunch. llvm-svn: 304228	2017-05-30 18:56:26 +00:00
Keno Fischer	945dc1d2d1	[Cloning] Take another pass at properly cloning debug info Summary: In rL302576, DISubprograms gained the constraint that !dbg attachments to functions must have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support was adjusted to attempt to enforce this invariant during cloning. However, there were several problems with the implementation. Part of these were fixed in rL304079. However, there was a more fundamental problem with these changes, namely that it bypasses the metadata value map, causing the cloned metadata to be a mix of metadata pointing to the new subprogram (where manual code was added to fix those up) and the old subprogram (where this was not the case). This mismatch could cause a number of different assertion failures in the DWARF emitter. Some of these are given at https://github.com/JuliaLang/julia/issues/22069, but some others have been observed as well. Attempt to rectify this by partially reverting the manual DI metadata fixup, and instead using the standard value map approach. To retain the desired semantics of not duplicating the compilation unit and inlined subprograms, explicitly freeze these in the value map. Reviewers: dblaikie, aprantl, GorNishanov, echristo Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33655 llvm-svn: 304226	2017-05-30 18:28:30 +00:00
Daniel Berlin	2aa5dc1589	NewGVN: Compute hash value of expression on demand and use it in inequality testing. llvm-svn: 304195	2017-05-30 06:58:18 +00:00
Daniel Berlin	c8ed40400c	NewGVN: Fix PR33194, memory corruption by putting temporary instructions in tables sometimes. llvm-svn: 304194	2017-05-30 06:42:29 +00:00
Joerg Sonnenberger	9375a25342	Revert r303763, results in asserts i.e. while building Ruby. llvm-svn: 304179	2017-05-29 22:52:17 +00:00
Hiroshi Inoue	ac9cd3080d	[trivial] fix a typo in comment, NFC llvm-svn: 304139	2017-05-29 08:37:42 +00:00
Gor Nishanov	ffbeb22b6f	Cloning: Fix debug info cloning Summary: I believe https://reviews.llvm.org/rL302576 introduced two bugs: 1) it produces duplicate distinct variables for every dbg.value describing the same variable. To fix the problem I switched from getDistinct() to get() in DebugLoc.cpp: auto reparentVar = [&](DILocalVariable *Var) { return DILocalVariable::getDistinct( 2) It passes NewFunction plain name as a linkagename parameter to Subprogram constructor. Breaks assert in: || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed. #9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3 # (Edit: reproducer added) Here is how https://reviews.llvm.org/rL302576 broke coroutine debug info. Coroutine body of the original function is split into several parts by cloning and removing unneeded code. All parts describe the original function and variables present in the original function. For a simple case, prior to Split, original function has these two blocks: ``` PostSpill: ; preds = %AllocaSpillBB call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !14, metadata !15), !dbg !13 store i32 %x, i32* %x.addr, align 4 ... and sw.epilog: ; preds = %sw.bb %x.addr.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4, !dbg !20 %4 = load i32, i32* %x.addr.reload.addr, align 4, !dbg !20 call void @llvm.dbg.value(metadata i32 %4, i64 0, metadata !14, metadata !15), !dbg !13 !14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11) ``` Note that in two blocks different expressions represent the same original user variable X. Before rL302576, for every cloned function there was exactly one cloned DILocalVariable(name: "x" as in: ``` define i8* @f(i32 %x) #0 !dbg !6 { ... 
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, ... !14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11) define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 { ... !25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, isOptimized: false, unit: !0, variables: !2) !28 = !DILocalVariable(name: "x", arg: 1, scope: !25, file: !7, line: 55, type: !11) ``` After rL302576, for every cloned function there were as many DILocalVariable(name: "x" as there were "call void @llvm.dbg.value" for that variable. This was causing asserts in VerifyDebugInfo and AssemblyPrinter. Example: ``` !27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, !29 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) !39 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) !41 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) ``` Second problem: Prior to rL302576, all clones were described by DISubprogram referring to original function. ``` define i8* @f(i32 %x) #0 !dbg !6 { ... !6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 { ... !25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, ``` After rL302576, DISubprogram for clones is of two minds, plain name refers to the original name, linkageName refers to plain name of the clone. 
``` !27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, ``` I think the assumption in AsmPrinter is that both name and linkageName should refer to the same entity. It asserts here when they are not: ``` \|\| DeclLinkageName.empty()) \|\| LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed. #9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3 ``` After this fix, behavior (with respect to coroutines) reverts to exactly what it was before, making them debuggable again or, even more importantly, compilable with "-g". Reviewers: dblaikie, echristo, aprantl Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33614 llvm-svn: 304079	2017-05-27 19:41:09 +00:00
Gor Nishanov	9c6ac6138d	[coroutines] Define getPassName() for coroutine passes Reviewers: GorNishanov Reviewed By: GorNishanov Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33622 llvm-svn: 304065	2017-05-27 05:54:30 +00:00
Vitaly Buka	a637489ef1	[PartialInlining] Replace delete with unique_ptr in computeCallsiteToProfCountMap Reviewers: davidxl Reviewed By: davidxl Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D33220 llvm-svn: 304064	2017-05-27 05:32:09 +00:00
Wei Mi	5bbb5aafc1	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit is to fix a bug about ExtractValue and InsertValue ops. For those ops, some varargs inside GVN::Expression are not value numbers but raw index numbers. It is wrong to do phi-translate for raw index numbers, and the fix is to stop doing that. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. For example, in the following testcase, current scalarpre cannot recognize that the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 304050	2017-05-27 00:54:19 +00:00
Benjamin Kramer	debb3c35e0	Make helper functions static. NFC. llvm-svn: 304029	2017-05-26 20:09:00 +00:00
Peter Collingbourne	7730b24448	PMB: Run the whole-program-devirt pass during LTO at --lto-O0. The whole-program-devirt pass needs to run at -O0 because only it knows about the llvm.type.checked.load intrinsic: it needs to both lower the intrinsic itself and handle it in the summary. Differential Revision: https://reviews.llvm.org/D33571 llvm-svn: 304019	2017-05-26 18:27:13 +00:00
Craig Topper	d45185f231	[InstCombine] Pass the DominatorTree, AssumptionCache, and context instruction to a few calls to isKnownPositive, isKnownNegative, and isKnownNonZero Every other place in InstCombine that uses these methods in ValueTracking already pass this information. This makes the remaining sites consistent. Differential Revision: https://reviews.llvm.org/D33567 llvm-svn: 304018	2017-05-26 18:23:57 +00:00
Wei Mi	3250ae3f7c	Revert rL303923 since it broke the sanitizer bootstrap build bot. llvm-svn: 303969	2017-05-26 05:42:50 +00:00
Craig Topper	d4039f7283	[InstCombine] Add an InstCombine specific wrapper around isKnownToBeAPowerOfTwo to shorten code. NFC We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo. llvm-svn: 303924	2017-05-25 21:51:12 +00:00
Wei Mi	fd257fa7bf	[GVN] Add phi-translate support in scalarpre. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. For example, in the following testcase, current scalarpre cannot recognize that the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 303923	2017-05-25 21:49:02 +00:00
Daniel Berlin	e67c322260	NewGVN: Fix PR 33119, PR 33129, due to regressed undef handling Fix PR33120 and others by eliminating self-cycles a different way. llvm-svn: 303875	2017-05-25 15:44:20 +00:00
Artur Pilipenko	315eafc339	[InstCombine] Teach isAllocSiteRemovable to look through addrspacecasts Reviewed By: reames Differential Revision: https://reviews.llvm.org/D28565 llvm-svn: 303870	2017-05-25 15:14:48 +00:00
Sanjay Patel	5150612012	[InstCombine] make icmp-mul fold more efficient There's probably a lot more like this (see also comments in D33338 about responsibility), but I suspect we don't usually get a visible manifestation. Given the recent interest in improving InstCombine efficiency, another potential micro-opt that could be repeated several times in this function: morph the existing icmp pred/operands instead of creating a new instruction. llvm-svn: 303860	2017-05-25 14:13:57 +00:00
James Molloy	dc2d64bc35	[GVNSink] Pacify MSVC Don't convert an unsigned to a pointer for a sentinel, use a size_t instead. llvm-svn: 303855	2017-05-25 13:14:10 +00:00
James Molloy	2a237f19f1	[GVNSink] Don't define operator<< in NDEBUG Without debug macros enabled, the raw_ostream operator<< overload is unused. llvm-svn: 303852	2017-05-25 13:11:18 +00:00
James Molloy	a929063233	[GVNSink] GVNSink pass This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for committing but I've uploaded it to gather some initial thoughts. This pass attempts to sink instructions into successors, reducing static instruction count and enabling if-conversion. We use a variant of global value numbering to decide what can be sunk. Consider: [ %a1 = add i32 %b, 1 ] [ %c1 = add i32 %d, 1 ] [ %a2 = xor i32 %a1, 1 ] [ %c2 = xor i32 %c1, 1 ] \ / [ %e = phi i32 %a2, %c2 ] [ add i32 %e, 4 ] GVN would number %a1 and %c1 differently because they compute different results - the VN of an instruction is a function of its opcode and the transitive closure of its operands. This is the key property for hoisting and CSE. What we want when sinking, however, is a numbering that is a function of the uses of an instruction, which allows us to answer the question "if I replace %a1 with %c1, will it contribute in an equivalent way to all successive instructions?". The (new) PostValueTable class in GVN provides this mapping. This pass has already shown some really impressive improvements, especially for codesize, on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG. Differential revision: https://reviews.llvm.org/D24805 llvm-svn: 303850	2017-05-25 12:51:11 +00:00
Chandler Carruth	dd2e275a47	[PM/Unswitch] Fix a bug in the domtree update logic for the new unswitch pass. The original logic only considered direct successors of the hoisted domtree nodes, but that isn't really enough. If there are other basic blocks that are completely within the subtree, their successors could just as easily be impacted by the hoisting. The more I think about it, the more I think the correct update here is to hoist every block on the dominance frontier which has an idom in the chain we hoist across. However, this is subtle enough that I'd definitely appreciate some more eyes on it. Sadly, if this is the correct algorithm, it requires computing a (highly localized) dominance frontier. I've done this in the simplest (IE, least code) way I could come up with, but that may be too naive. Suggestions welcome here, dominance update algorithms are not an area I've studied much, so I don't have strong opinions. In good news, with this patch, turning on simple unswitch passes the LLVM test suite for me with asserts enabled. Differential Revision: https://reviews.llvm.org/D32740 llvm-svn: 303843	2017-05-25 06:33:36 +00:00
Chandler Carruth	29c22d2835	[LegacyPM] Make the 'addLoop' method accept a loop to add rather than having it internally allocate the loop. This is a much more flexible API and necessary in the new loop unswitch to reasonably support both new and old PMs in common code. It also just seems like a cleaner separation of concerns. NFC, this should just be a pure refactoring. Differential Revision: https://reviews.llvm.org/D33528 llvm-svn: 303834	2017-05-25 03:01:31 +00:00
George Karpenkov	a1c532784d	Fix coverage check for full post-dominator basic blocks. Coverage instrumentation which does not instrument full post-dominators and full-dominators may skip valid paths, as the reasoning for skipping blocks may become circular. This patch fixes that, by only skipping full post-dominators with multiple predecessors, as such predecessors by definition can not be full-dominators. llvm-svn: 303827	2017-05-25 01:41:46 +00:00
Gor Nishanov	1fbc01f70f	[coroutines] CoroFrame.cpp conform to coding convention (s/repeat/Repeat) (NFC) llvm-svn: 303826	2017-05-25 01:07:10 +00:00
Gor Nishanov	0ea1863b27	[coroutines] Relocate instructions that may be spilled after coro.begin Summary: Frontend generates store instructions after allocas, for example: ``` define i8* @f(i64 %this) "coroutine.presplit"="1" personality i32 0 { entry: %this.addr = alloca i64 store i64 %this, i64* %this.addr .. %hdl = call i8* @llvm.coro.begin(token %id, i8* %alloc) ``` Such instructions may require spilling into coro.frame, but the coro.frame address is only available after coro.begin, so they need to be moved after coro.begin. The only instructions that should not be moved are the arguments of coro.begin and all of their operands. Reviewers: GorNishanov, majnemer Reviewed By: GorNishanov Subscribers: llvm-commits, EricWF Differential Revision: https://reviews.llvm.org/D33527 llvm-svn: 303825	2017-05-25 00:46:20 +00:00
Gor Nishanov	1f72d75714	[coroutines] Allow rematerialization up to 4 times. Remove incorrect assert Reviewers: majnemer Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33524 llvm-svn: 303819	2017-05-24 23:01:02 +00:00
Sanjay Patel	07b1ba54b5	[InstCombine] use m_APInt to allow icmp-mul-mul vector fold The swapped operands in the first test are a manifestation of an inefficiency for vectors that doesn't exist for scalars because the IRBuilder checks for an all-ones mask for scalars, but not vectors. llvm-svn: 303818	2017-05-24 22:58:17 +00:00
Craig Topper	2f9c6dafe3	[InstCombine] Merge together the SimplifyDemandedUseBits implementations for ZExt and Trunc. NFC While there avoid resizing the DemandedMask twice. Make a copy into a separate variable instead. This potentially removes an allocation on large bit widths. With the use of the zextOrTrunc methods on APInt and KnownBits these can be made almost source identical. The only difference is the zero of the upper bits for ZExt. This is similar to how it's done in computeKnownBits in ValueTracking. llvm-svn: 303791	2017-05-24 18:40:25 +00:00
Teresa Johnson	cd2aa0d2e4	Fix a couple of typos in memory intrinsic optimization output (NFC) s/instrinsic/intrinsic llvm-svn: 303782	2017-05-24 17:55:25 +00:00
Craig Topper	1c660dbea6	[InstCombine] Use fewer bitwise operations to handle Instruction::SExt in SimplifyDemandedUseBits. Other improvements. The current code created a NewBits mask and used it as a mask several times. One of them just before a call to trunc making it unnecessary. A call to getActiveBits can get us the same information for the case. We also ORed with this mask later when we should have just sign extended the known bits. We also called trunc on the guaranteed to be zero KnownZeros/Ones masks entering this code. Creating appropriately sized temporary APInts is probably better. Differential Revision: https://reviews.llvm.org/D32098 llvm-svn: 303779	2017-05-24 17:33:30 +00:00
Craig Topper	8205a1a9b6	[ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnownBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773	2017-05-24 16:53:07 +00:00
Matthew Simpson	d6f179cad6	[LV] Update type in cost model for scalarization For non-uniform instructions marked for scalarization, we should update `VectorTy` when computing instruction costs to reflect the scalar type. In addition to determining instruction costs, this type is also used to signal that all instructions in the loop will be scalarized. This currently affects memory instructions and non-pointer induction variables and their updates. (We also mark GEPs scalar after vectorization, but their cost is computed together with memory instructions.) For scalarized induction updates, this patch also scales the scalar cost by the vectorization factor, corresponding to each induction step. llvm-svn: 303763	2017-05-24 15:26:15 +00:00
Jonas Paulsson	8624b7e1ce	[LoopVectorizer] Let target prefer scalar addressing computations. The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking shows improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744	2017-05-24 13:42:56 +00:00
Davide Italiano	fd9100e056	[NewGVN] Update additionalUsers when we simplify to a value. Otherwise we don't revisit an instruction that could be simplified, and when we verify, we discover there's something that changed, i.e. what we had wasn't a maximal fixpoint. Fixes PR32836. llvm-svn: 303715	2017-05-24 02:30:24 +00:00
George Karpenkov	018472c34a	Revert "Disable coverage opt-out for strong postdominator blocks." This reverts commit 2ed06f05fc10869dd1239cff96fcdea2ee8bf4ef. Buildbots do not like this on Linux. llvm-svn: 303710	2017-05-24 00:29:12 +00:00
Davide Italiano	c4861adad9	[SCCP] Use the `hasAddressTaken()` version defined in `Function`. Instead of using the SCCP homegrown one. We should eventually make the private SCCP version disappear, but that won't be today. PR33143 tracks this issue. Add braces for consistency while here. No functional change intended. llvm-svn: 303706	2017-05-23 23:59:23 +00:00
Davide Italiano	7bf95b964f	[LIR] Use the new `getRecurrenceVar()` helper. NFCI. llvm-svn: 303704	2017-05-23 23:51:54 +00:00
Davide Italiano	4bc91190ea	[LIR] Strengthen the check for recurrence variable in popcnt/CTLZ. Fixes PR33114. Differential Revision: https://reviews.llvm.org/D33420 llvm-svn: 303700	2017-05-23 22:32:56 +00:00
George Karpenkov	9017ca290a	Disable coverage opt-out for strong postdominator blocks. Coverage instrumentation has an optimization not to instrument extra blocks, if the pass is already "accounted for" by a successor/predecessor basic block. However (https://github.com/google/sanitizers/issues/783) this reasoning may become circular, which stops valid paths from having coverage. In the worst case this can cause fuzzing to stop working entirely. This change simplifies logic to something which trivially can not have such circular reasoning, as losing valid paths does not seem like a good trade-off for a ~15% decrease in the # of instrumented basic blocks. llvm-svn: 303698	2017-05-23 21:58:54 +00:00
Sanjay Patel	d3106add77	[InstCombine] allow icmp-xor folds for vectors (PR33138) This fixes the first part of: https://bugs.llvm.org/show_bug.cgi?id=33138 More work is needed for the bitcasted variant. llvm-svn: 303660	2017-05-23 17:29:58 +00:00
Reid Kleckner	8bf67fe98f	[IR] Switch AttributeList to use an array for O(1) access Summary: Before this change, AttributeLists stored a pair of index and AttributeSet. This is memory efficient if most arguments do not have attributes. However, it requires doing a search over the pairs to test an argument or function attribute. Profiling shows that this loop was 0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly tests values for nullability. This was worth about 2.5% of mid-level optimization cycles on the sqlite3 amalgamation. Here are the full perf results: https://reviews.llvm.org/P7995 Here are just the before and after cycle counts: ``` $ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null 13,274,181,184 cycles # 3.047 GHz ( +- 0.28% ) $ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null 12,906,927,263 cycles # 3.043 GHz ( +- 0.51% ) ``` This patch does not change the indices used to query attributes, as requested by reviewers. Tracking whether an index is usable for array indexing is a huge pain that affects many of the internal APIs, so it would be good to come back later and do a cleanup to remove this internal adjustment. Reviewers: pete, chandlerc Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D32819 llvm-svn: 303654	2017-05-23 17:01:48 +00:00

19845 Commits