llvm-project

Commit Graph

Author	SHA1	Message	Date
David Bolvansky	44a37f04b2	[InstCombine] snprintf optimizations Reviewers: spatel, efriedma, majnemer, rja Reviewed By: rja Subscribers: rja, llvm-commits Differential Revision: https://reviews.llvm.org/D46285 llvm-svn: 331849	2018-05-09 06:34:20 +00:00
Shiva Chen	2c864551df	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label. In order to set breakpoints on labels and list source code around labels, we need to collect debug information for labels, i.e., the label name, the function the label belongs to, the line number in the file, and the address where the label is located. To keep this information in LLVM IR and to allow the backend to generate debug information correctly, we create a new kind of metadata for labels, DILabel. The format of DILabel is !DILabel(scope: !1, name: "foo", file: !2, line: 3) We hope to keep as much debug information as possible even when the code is optimized. So, we create a new kind of intrinsic for label metadata to prevent the metadata from being eliminated along with its basic block. The intrinsic will remain as long as we keep it from being optimized out. The format of the intrinsic is llvm.dbg.label(metadata !1) It has only one argument, which is the DILabel metadata. The intrinsic will follow the label immediately. The backend can get the label metadata through the intrinsic's parameter. We also create a DIBuilder API for labels to be used by the frontend. The frontend can use createLabel() to allocate DILabel objects, and use insertLabel() to insert the llvm.dbg.label intrinsic in LLVM IR. Differential Revision: https://reviews.llvm.org/D45024 Patch by Hsiangkai Wang. llvm-svn: 331841	2018-05-09 02:40:45 +00:00
Heejin Ahn	bf7716952a	Support a funclet operand bundle in LowerInvoke Summary: The current LowerInvoke pass cannot handle invoke instructions with a funclet bundle operand. The order of operands for an invoke instruction is {call arguments, callee, funclet operand (if any), normal dest, unwind dest}. The current code assumes there is no funclet operand and incorrectly includes a funclet operand into call arguments. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46242 llvm-svn: 331832	2018-05-09 00:53:50 +00:00
Davide Italiano	48283ba3a1	[SimplifyCFG] Fix a crash when folding PHIs. We enter MergeBlockIntoPredecessor with a block looking like this: for.inc.us-lcssa: ; preds = %cond.end %k.1.lcssa.ph = phi i32 [ %conv15, %cond.end ] %t.3.lcssa.ph = phi i32 [ %k.1.lcssa.ph, %cond.end ] br label %for.inc, !dbg !66 [note the first arg of the PHI being a PHI]. FoldSingleEntryPHINodes gets rid of both PHIs (calling eraseFromParent). But right before we call the function, we push into IncomingValues the only argument of the PHIs, and shortly after we try to iterate over something which has been invalidated before :( The fix is to not try to remove PHIs which have an incoming value coming from the same BB we're looking at. Fixes PR37300 and rdar://problem/39910460 Differential Revision: https://reviews.llvm.org/D46568 llvm-svn: 331824	2018-05-08 23:28:15 +00:00
Hideki Saito	d722d61402	[LV] Fix for PR37248, Broadcast codegen incorrectly assumed vector loop body is single basic block Summary: Broadcast code generation emitted instructions in the pre-header, while the instruction they depend on is in the vector loop body. This resulted in an IL verification error: value used before defined. Reviewers: rengolin, fhahn, hfinkel Reviewed By: rengolin, fhahn Subscribers: dcaballe, Ka-Ka, llvm-commits Differential Revision: https://reviews.llvm.org/D46302 llvm-svn: 331799	2018-05-08 18:57:34 +00:00
Bjorn Pettersson	51cebc98f3	[LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions Summary: In formLCSSAForInstructions we speculatively add new PHI nodes, which sometimes end up without any uses. It has been discovered that sometimes an added PHI node can appear as being unused in one iteration of the Worklist, although it can end up being used by a PHI node added in a later iteration. We now check, a second time, that the PHI node is still unused before we remove it. This avoids an assert about "Trying to remove a phi with uses." for the added test case. Reviewers: davide, mzolotukhin, mattd, dberlin Reviewed By: mzolotukhin, dberlin Subscribers: dberlin, mzolotukhin, davide, bjope, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D46422 llvm-svn: 331741	2018-05-08 06:59:47 +00:00
Teresa Johnson	59da890c96	[NewPM] Emit inliner NoDefinition missed optimization remark Summary: Makes this consistent with the old PM. Reviewers: eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D46526 llvm-svn: 331709	2018-05-08 01:45:46 +00:00
Dmitry Mikulin	738bac77c1	Remove explicit setting of the CFI jumptable section name, it does not appear to be needed: jump table sections are created with .cfi.jumptable suffix. With this change each jump table is placed in a separate section, which allows the linker to re-order them. Differential Revision: https://reviews.llvm.org/D46537 llvm-svn: 331680	2018-05-07 21:30:15 +00:00
Fangrui Song	862eebb6d6	Simplify LLVM_ATTRIBUTE_USED call sites. llvm-svn: 331599	2018-05-05 20:14:38 +00:00
George Burgess IV	f9d26af4ea	Range-ify for loop; NFC llvm-svn: 331582	2018-05-05 04:52:26 +00:00
Craig Topper	781aa181ab	Fix a bunch of places where operator-> was used directly on the return from dyn_cast. Inspired by r331508, I did a grep and found these. Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa. llvm-svn: 331577	2018-05-05 01:57:00 +00:00
Peter Collingbourne	e04ecc88de	LowerTypeTests: Fix non-determinism in code that handles icall branch funnels. This was exposed by enabling expensive checks, which causes llvm::sort to sort randomly. Differential Revision: https://reviews.llvm.org/D45901 llvm-svn: 331573	2018-05-05 00:51:55 +00:00
Philip Reames	5b39acd111	[LICM] Compute a must execute property for the prefix of the header as we go Computing this property within the existing walk ensures that the cost is linear with the size of the block. If we did this from within isGuaranteedToExecute, it would be quadratic without some very fancy caching. This allows us to reliably catch a hoistable instruction within a header which may throw at some point after our hoistable instruction. It doesn't do anything for non-header cases, but given how common single block loops are, this seems very worthwhile. llvm-svn: 331557	2018-05-04 21:35:00 +00:00
Shoaib Meenai	57fadab1cb	[ObjCARC] Account for catchswitch in bitcast insertion A catchswitch is both a pad and a terminator, meaning it must be the only non-phi instruction in its basic block. When we're inserting a bitcast in the incoming basic block for a phi, if that incoming block is a catchswitch, we should go up the dominator tree to find a valid insertion point rather than attempting to insert before the catchswitch (which would result in invalid IR). Differential Revision: https://reviews.llvm.org/D46412 llvm-svn: 331548	2018-05-04 19:03:11 +00:00
Craig Topper	ded8ee07e9	[LoopIdiomRecognize] Don't create an IRBuilder just to call getTrue/getFalse. We can call the methods in ConstantInt directly. We just need a context. llvm-svn: 331542	2018-05-04 17:39:08 +00:00
Max Kazantsev	786032c1b7	[IRCE] Fix misuse of dyn_cast which leads to UB llvm-svn: 331508	2018-05-04 07:34:35 +00:00
Craig Topper	9510f70636	[LoopIdiomRecognize] Replace more unchecked dyn_casts with cast. Two of these are immediately dereferenced on the next line. The other two are passed immediately to the IRBuilder constructor which can't handle a nullptr. llvm-svn: 331500	2018-05-04 01:04:28 +00:00
Craig Topper	cafae62ec9	[LoopIdiomRecognize] Use a regular array instead of a SmallVector and explicit ArrayRef. llvm-svn: 331499	2018-05-04 01:04:26 +00:00
Craig Topper	8304231508	[LoopIdiomRecognize] Turn two unchecked dyn_casts into regular casts. These are casts on users of a PHINode to Instruction. I think since PHINode is an Instruction any users would also be Instructions. At least a cast will give us an assertion if it's wrong. llvm-svn: 331498	2018-05-04 01:04:24 +00:00
Sanjay Patel	e7b6654711	[InstCombine] refine select-of-constants to bitwise ops Add logic for the special case when a cmp+select can clearly be reduced to just a bitwise logic instruction, and remove an over-reaching chunk of general purpose bit magic. The primary goal is to remove cases where we are not improving the IR instruction count when doing these select transforms, and in all cases here that is true. In the motivating 3-way compare tests, there are further improvements because we can combine/propagate select values (not sure if that belongs in instcombine, but it's there for now). DAGCombiner has folds to turn some of these selects into bit magic, so there should be no difference in the end result in those cases. Not all constant combinations are handled there yet, however, so it is possible that some targets will see more cmov/csel codegen with this change in IR canonicalization. Ideally, we'll go further to not turn selects into multiple logic/math ops in instcombine, and we'll canonicalize to selects. But we should make sure that this step does not result in regressions first (and if it does, we should fix those in the backend). The general direction for this change was discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html Alive proofs for the new bit magic: https://rise4fun.com/Alive/XG7 Differential Revision: https://reviews.llvm.org/D46086 llvm-svn: 331486	2018-05-03 21:58:44 +00:00
Piotr Padlewski	c77ab8ef2f	perform DSE through launder.invariant.group Summary: Alias Analysis knows that llvm.launder.invariant.group returns a pointer that must alias its argument, but this information wasn't used, so we didn't perform DSE through launder.invariant.group. Reviewers: chandlerc, dberlin, bogner, hfinkel, efriedma Reviewed By: dberlin Subscribers: amharc, llvm-commits, nlewycky, rsmith Differential Revision: https://reviews.llvm.org/D31581 llvm-svn: 331449	2018-05-03 11:03:53 +00:00
Craig Topper	856fd68690	[LoopIdiomRecognize] When looking for 'x & (x -1)' for popcnt, make sure the left hand side of the 'and' matches the left hand side of the 'subtract' llvm-svn: 331437	2018-05-03 05:48:49 +00:00
Craig Topper	8ef2abdbc4	[LoopIdiomRecognize] Remove unnecessary cast from BinaryOperator to Instruction. NFC BinaryOperator is a sub class of Instruction. We don't need an explicit cast back to Instruction. llvm-svn: 331432	2018-05-03 05:00:18 +00:00
Shoaib Meenai	a07295f977	[ObjCARC] Convert an if to an early continue. NFC This reduces nesting and makes the logic slightly easier to follow. Differential Revision: https://reviews.llvm.org/D46371 llvm-svn: 331422	2018-05-03 01:20:36 +00:00
Chandler Carruth	e74c354d12	[gcov] Switch to an explicit if clunky array to satisfy some compilers on various build bots that are unhappy with using makeArrayRef with an initializer list. llvm-svn: 331418	2018-05-03 00:11:03 +00:00
Chandler Carruth	71c3a3fac5	[GCOV] Emit the writeout function as nested loops of global data. Summary: Prior to this change, LLVM would in some cases emit massive writeout functions with many 10s of 1000s of function calls in straight-line code. This is a very wasteful way to represent what are fundamentally loops and creates a number of scalability issues. Among other things, register allocating these calls is extremely expensive. While D46127 makes this less severe, we'll still run into scaling issues with this eventually. If not in the compile time, just from the code size. Now the pass builds up global data structures modeling the inputs to these functions, and simply loops over the data structures calling the relevant functions with those values. This ensures that the code size is a fixed and only data size grows with larger amounts of coverage data. A trivial change to IRBuilder is included to make it easier to build the constants that make up the global data. Reviewers: wmi, echristo Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D46357 llvm-svn: 331407	2018-05-02 22:24:39 +00:00
Daniel Sanders	8d0d1aa229	[reassociate] Fix excessive revisits when processing long chains of reassociatable instructions. Summary: Some of our internal testing detected a major compile time regression which I've tracked down to: r278938 - Revert "Reassociate: Reprocess RedoInsts after each inst". It appears that processing long chains of reassociatable instructions causes non-linear (potentially exponential) growth in the number of times an instruction is revisited. For example, the included test revisits instructions 220 times in a 20-instruction test. It appears that r278938 reversed the order instructions were visited and that this is preventing scheduled revisits from being cancelled as a result of visiting the instructions naturally during normal processing. However, simply reversing the order also harmed the generated code. Upon closer inspection, it was discovered that revisits occurred in the opposite order to the first pass (Thanks to escha for spotting that). This patch makes the revisit order consistent with the first pass which allows more revisits to be cancelled. This does appear to have a small impact on the generated code in few cases but it significantly reduces compile-time. After this patch, our internal test that was most affected by the regression dropped from ~2 million revisits to ~4k resulting in Reassociate having 0.46% of the runtime it had before (99.54% improvement). Here's the summaries reported by lnt for the LLVM test-suite with --benchmarking-only: \| metric \| geomean before patch \| geomean after patch \| delta \| \| ----- \| ----- \| ----- \| ----- \| \| compile time \| 0.1956 \| 0.1261 \| -35.54% \| \| execution time \| 0.3240 \| 0.3237 \| - \| \| code size \| 7365.4459 \| 7365.6079 \| - \| The results have a few wins and losses on compile-time, mostly in the +/- 2.5% range. There was one outlier though: \| Performance Regressions - compile_time \| Δ \| Previous \| Current \| \| MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk \| 9.82% \| 2.0473 \| 2.2483 \| Reviewers: javed.absar, dberlin Reviewed By: dberlin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45734 llvm-svn: 331381	2018-05-02 17:59:16 +00:00
Simon Pilgrim	f53ee8e640	Fix '32-bit shift implicitly converted to 64 bits' warning by using APInt::setBit instead. llvm-svn: 331359	2018-05-02 14:22:30 +00:00
Florian Hahn	5912c667b0	[LoopInterchange] Update some loops to use range base for loops (NFC). llvm-svn: 331342	2018-05-02 10:53:04 +00:00
Sanjay Patel	d2025a2e31	[AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare and (or (lshr X, C), ...), 1 --> (X & C') != 0 I initially thought about implementing the minimal pattern in instcombine as mentioned here: https://bugs.llvm.org/show_bug.cgi?id=37098#c6 ...but we need to do better to catch the more general sequence from the motivating test (more than 2 bits in the compare). And a test-suite run with statistics showed that this pattern only happened 2 times currently. It would potentially happen more often if reassociation worked better (D45842), but it's probably still not too frequent? This is small enough that I didn't see a need to create a whole new class/file within AggressiveInstCombine. There are likely other relatively small matchers like what was discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). We could potentially also consolidate matchers for ctpop, bswap, etc under here. Differential Revision: https://reviews.llvm.org/D45986 llvm-svn: 331311	2018-05-01 21:02:09 +00:00
Adrian Prantl	4dfcc4a788	Remove @brief commands from doxygen comments, too. This is a follow-up to r331272. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done https://reviews.llvm.org/D46290 llvm-svn: 331275	2018-05-01 16:10:38 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Daniel Neilson	9e4bbe801a	[LV] Preserve inbounds on created GEPs Summary: This is a fix for PR23997. The loop vectorizer is not preserving the inbounds property of GEPs that it creates. This is inhibiting some optimizations. This patch preserves the inbounds property in the case where a load/store is being fed by an inbounds GEP. Reviewers: mkuper, javed.absar, hsaito Reviewed By: hsaito Subscribers: dcaballe, hsaito, llvm-commits Differential Revision: https://reviews.llvm.org/D46191 llvm-svn: 331269	2018-05-01 15:35:08 +00:00
Wei Mi	eec5ba9fae	Fix the issue that ComputeValueKnownInPredecessors only handles the case when phi is on lhs of a comparison op. For the following testcase, L1: %t0 = add i32 %m, 7 %t3 = icmp eq i32* %t2, null br i1 %t3, label %L3, label %L2 L2: %t4 = load i32, i32* %t2, align 4 br label %L3 L3: %t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ] %t6 = icmp eq i32 %t0, %t5 br i1 %t6, label %L4, label %L5 We know if we go through the path L1 --> L3, %t6 should always be true. However currently, if the rhs of the eq comparison is phi, JumpThreading fails to evaluate %t6 to true. And we know that Instcombine cannot guarantee always canonicalizing phi to the left hand side of the comparison operation according to the operand priority comparison mechanism in instcombine. The patch handles the case when rhs of the comparison op is a phi. Differential Revision: https://reviews.llvm.org/D46275 llvm-svn: 331266	2018-05-01 14:47:24 +00:00
Omer Paparo Bivas	82ef8e19ef	[InstCombine] Adjusting bswap pattern matching to hold for And/Shift mixed case Differential Revision: https://reviews.llvm.org/D45731 Change-Id: I85d4226504e954933c41598327c91b2d08192a9d llvm-svn: 331257	2018-05-01 12:25:46 +00:00
Chandler Carruth	2c85a23123	[PM/LoopUnswitch] Remove the last manual domtree update code from loop unswitch and replace it with the amazingly simple update API code. This addresses piles of FIXMEs around the update logic here and makes everything substantially simpler. llvm-svn: 331247	2018-05-01 09:54:39 +00:00
Chandler Carruth	44aab925fd	[PM/LoopUnswitch] Add back a successor set that was removed based on code review. It turns out this is necessary, and I read the comment on the API correctly the first time. ;] The `applyUpdates` routine requires that updates are "balanced". This is in order to cleanly handle cycles like inserting, removing, and then re-inserting the same edge. This precludes inserting the same edge multiple times in a row as handling that would cause the insertion logic to become ordered instead of unordered (which is what the API provides). It happens that in this specific case nothing (other than an assert and contract violation) goes wrong because we're never inserting and removing the same edge. The implementation happens to do the right thing to eliminate redundant insertions in that case. But the requirement is there and there is an assert to catch it. Somehow, after the code review I never did another asserts-clang build testing loop-unswitch for a long time. As a consequence, I didn't notice this despite a bunch of testing going on, but it shows up immediately with an asserts build of clang itself. llvm-svn: 331246	2018-05-01 09:42:09 +00:00
Florian Hahn	3df8844b92	[SimplifyCFG] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC). This patch updates some code responsible for skipping debug info to use BasicBlock::instructionsWithoutDebug. I think this makes things slightly simpler and more direct. Reviewers: aprantl, vsk, hans, danielcdh Reviewed By: hans Differential Revision: https://reviews.llvm.org/D46252 llvm-svn: 331221	2018-04-30 20:10:53 +00:00
Florian Hahn	8fe04ad3f7	[LoopSimplify] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC). This patch updates some code responsible for skipping debug info to use BasicBlock::instructionsWithoutDebug. I think this makes things slightly simpler and more direct. Reviewers: aprantl, vsk, chandlerc Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D46253 llvm-svn: 331217	2018-04-30 19:19:36 +00:00
Roman Lebedev	aa4faec114	[InstCombine] Unfold masked merge with constant mask Summary: As discussed in D45733, we want to do this in InstCombine. https://rise4fun.com/Alive/LGk Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: chandlerc, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D45867 llvm-svn: 331205	2018-04-30 17:59:33 +00:00
Davide Italiano	bd3bf1660b	[SLPVectorizer] Debug info shouldn't impact spill cost computation. <rdar://problem/39794738> (Also, PR32761). Differential Revision: https://reviews.llvm.org/D46199 llvm-svn: 331199	2018-04-30 16:57:33 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff \| diffstat -l \| xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Florian Hahn	deb01ea126	[LV] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC). This patch updates some code responsible for skipping debug info to use BasicBlock::instructionsWithoutDebug. I think this makes things slightly simpler and more direct. Reviewers: mkuper, rengolin, dcaballe, aprantl, vsk Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D46254 llvm-svn: 331174	2018-04-30 13:28:08 +00:00
Hideki Saito	f2ec16ccc2	[NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file Summary: This is a follow up to D45420 (included here since it is still under review and this change is dependent on that) and D45072 (committed). Actual change for this patch is LoopVectorize* and cmakefile. All others are all from D45420. LoopVectorizationLegality is an analysis and thus really belongs to Analysis tree. It is modular enough and it is reusable enough ---- we can further improve those aspects once uses outside of LV picks up. Hopefully, this will make it easier for people familiar with vectorization theory, but not necessarily LV itself to contribute, by lowering the volume of code they should deal with. We probably should start adding some code in LV to check its own capability (i.e., vectorization is legal but LV is not ready to handle it) and then bail out. Reviewers: rengolin, fhahn, hfinkel, mkuper, aemerson, mssimpso, dcaballe, sguggill Reviewed By: rengolin, dcaballe Subscribers: egarcia, rogfer01, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D45552 llvm-svn: 331139	2018-04-29 07:26:18 +00:00
Roman Lebedev	136867931a	[InstCombine] Canonicalize variable mask in masked merge Summary: Masked merge has a pattern of: `((x ^ y) & M) ^ y`. But, there is no difference between `((x ^ y) & M) ^ y` and `((x ^ y) & ~M) ^ x`, We should canonicalize the pattern to non-inverted mask. https://rise4fun.com/Alive/Yol Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45664 llvm-svn: 331112	2018-04-28 15:45:07 +00:00
Philip Reames	502d4481d4	[LoopGuardWidening] Make PostDomTree optional The effect of doing so is not disrupting the LoopPassManager when mixing this pass with other loop passes. This should help locality of access substantially and avoids the cost of computing PostDom. The assumption here is that the full GuardWidening (which does use PostDom) is run as a canonicalization before loop opts and that this version is just catching cases exposed by other loop passes (i.e. LoopPredication, IndVarSimplify, LoopUnswitch, etc.). llvm-svn: 331094	2018-04-27 23:15:56 +00:00
Adrian Prantl	210a29de7b	Fix a bug in GlobalOpt's handling of DIExpressions. This patch adds support for fragment expressions TryToShrinkGlobalToBoolean() which were previously just dropped. Thanks to Reid Kleckner for providing me a reproducer! llvm-svn: 331086	2018-04-27 21:41:36 +00:00
Roman Lebedev	6959b8e76f	[PatternMatch] Stabilize the matching order of commutative matchers Summary: Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the `LHS` and `RHS` matchers: 1. match `RHS` matcher to the `first` operand of binary operator, 2. and then match `LHS` matcher to the `second` operand of binary operator. This works ok. But it complicates writing of commutative matchers, where one would like to match (`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side. This is additionally complicated by the fact that `m_Specific()` stores the `Value *`, not `Value **`, so it won't work at all out of the box. The last problem is trivially solved by adding a new `m_c_Specific()` that stores the `Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing one because i guess all the current users are ok with existing behavior, and this additional pointer indirection may have performance drawbacks. Also, i'm storing pointer, not reference, because for some mysterious-to-me reason it did not work with the reference. The first one appears trivial, too. Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ operands: 1. match ~~`RHS`~~ `LHS` matcher to the ~~`first`~~ `second` operand of binary operator, 2. and then match ~~`LHS`~~ `RHS` matcher to the ~~`second`~~ `first` operand of binary operator. Surprisingly, `$ ninja check-llvm` still passes with this. But i expect the bots will disagree.. The motivational unittest is included. I'd like to use this in D45664. Reviewers: spatel, craig.topper, arsenm, RKSimon Reviewed By: craig.topper Subscribers: xbolva00, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D45828 llvm-svn: 331085	2018-04-27 21:23:20 +00:00
Philip Reames	5a6482450a	[LICM] Reduce nesting with an early return [NFC] llvm-svn: 331080	2018-04-27 20:58:30 +00:00
Daniel Neilson	a19ee7d7b6	[LV] Common duplicate vector load/store address calculation (NFC) Summary: Commoning some obviously copy/paste code in InnerLoopVectorizer::vectorizeMemoryInstruction llvm-svn: 331076	2018-04-27 20:29:18 +00:00
Philip Reames	de5a1da2d2	[GuardWidening] Add some clarifying comments about heuristics [NFC] llvm-svn: 331061	2018-04-27 17:41:37 +00:00
Philip Reames	9258e9d190	[LoopGuardWidening] Split out a loop pass version of GuardWidening The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post dom, but it gives a good step in the right direction. the motivation is purely to reduce compile time since we can now preserve locality during the loop walk. This patch only includes a legacy pass. A follow up will add a new style pass as well. llvm-svn: 331060	2018-04-27 17:29:10 +00:00
Florian Hahn	f3fea0f11f	[LoopInterchange] Allow some loops with PHI nodes in the exit block. We currently support LCSSA PHI nodes in the outer loop exit, if their incoming values do not come from the outer loop latch or if the outer loop latch has a single predecessor. In that case, the outer loop latch will be executed only if the inner loop gets executed. If we have multiple predecessors for the outer loop latch, it may be executed even if the inner loop does not get executed. This is a first step to support the case described in https://bugs.llvm.org/show_bug.cgi?id=30472 Reviewers: efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43237 llvm-svn: 331037	2018-04-27 13:52:51 +00:00
Matt Morehouse	1ae1febfde	Revert "[SimplifyLibcalls] Replace locked IO with unlocked IO" This reverts r331002 due to sanitizer bot breakage. llvm-svn: 331011	2018-04-27 01:48:09 +00:00
Eli Friedman	e06539456c	[LowerTypeTests] Mark .cfi.jumptable nounwind. It doesn't unwind, and the wrong marking leads to the creation of an .eh_frame section when it isn't necessary. Differential Revision: https://reviews.llvm.org/D46082 llvm-svn: 331008	2018-04-27 00:32:24 +00:00
David Bolvansky	2c9cc9c731	[SimplifyLibcalls] Replace locked IO with unlocked IO Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed, Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 331002	2018-04-26 22:31:43 +00:00
Sanjoy Das	6f1937b10f	[InstCombine] Simplify Add with remainder expressions as operands. Summary: Simplify integer add expression X % C0 + (( X / C0 ) % C1) * C0 to X % (C0 * C1). This is a common pattern seen in code generated by the XLA GPU backend. Add test cases for this new optimization. Patch by Bixia Zheng! Reviewers: sanjoy Reviewed By: sanjoy Subscribers: efriedma, craig.topper, lebedev.ri, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D45976 llvm-svn: 330992	2018-04-26 20:52:28 +00:00
Vlad Tsyrklevich	b768d235a9	Revert "Enable EliminateAvailableExternally pass for -O1" This reverts commit r330961 because it breaks a handful of clang tests. llvm-svn: 330964	2018-04-26 17:54:53 +00:00
Vlad Tsyrklevich	42c5a9c29a	Enable EliminateAvailableExternally pass for -O1 Summary: Follow-up to D43690, the EliminateAvailableExternally pass currently runs under -O0 and -O2 and up. Under -O1 we would still want to drop available_externally symbols to reduce space without inlining having run. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: mehdi_amini, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D46093 llvm-svn: 330961	2018-04-26 17:33:24 +00:00
Florian Hahn	fd2bc11248	[LoopInterchange] Ignore debug intrinsics during legality checks. Reviewers: aprantl, mcrosier, karthikthecool Reviewed By: aprantl Subscribers: mattd, vsk, #debug-info, llvm-commits Differential Revision: https://reviews.llvm.org/D45379 llvm-svn: 330931	2018-04-26 10:26:17 +00:00
David Bolvansky	cb8ca5f37c	[SimplifyLibcalls] Atoi, strtol replacements Reviewers: spatel, lebedev.ri, xbolva00, efriedma Reviewed By: xbolva00, efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D45418 llvm-svn: 330860	2018-04-25 18:58:53 +00:00
Taewook Oh	923c216da5	[ICP] Do not attempt type matching for variable length arguments. Summary: When performing indirect call promotion, current implementation inspects "all" parameters of the callsite and attempts to match with the formal argument type of the callee function. However, it is not possible to find the type for variable length arguments, and the compiler crashes when it attempts to match the type for variable length argument. It seems that the bug is introduced with D40658. Prior to that, the type matching is performed only for the parameters whose ID is less than callee->getFunctionNumParams(). The attached test case will crash without the patch. Reviewers: mssimpso, davidxl, davide Reviewed By: mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46026 llvm-svn: 330844	2018-04-25 17:19:21 +00:00
Sanjay Patel	807ddee1bf	[InstCombine] clean up foldSelectICmpAnd(); NFC As discussed in D45862, we want to delete parts of this code because it can create more instructions than it removes. But we also want to preserve some folds that are winners, so tidy up what's here to make splitting the good from bad a bit easier. llvm-svn: 330841	2018-04-25 16:34:01 +00:00
Florian Hahn	1da30c659d	[LoopInterchange] Use getExitBlock()/getExitingBlock instead of manual impl. This also means we have to check if the latch is the exiting block now, as `transform` expects the latches to be the exiting blocks too. https://bugs.llvm.org/show_bug.cgi?id=36586 Reviewers: efriedma, davide, karthikthecool Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45279 llvm-svn: 330806	2018-04-25 09:35:54 +00:00
Bjorn Pettersson	bec2a7c4eb	[DebugInfo] Invalidate debug info in ReassociatePass::RewriteExprTree Summary: When Reassociate is rewriting an expression tree it may reuse old binary expression nodes, for new expressions. Whenever an expression node is reused, but with a non-trivial change in the result, we need to invalidate any debug info that is associated with the node. If for example rewriting x = mul a, b y = mul c, x into x = mul c, b y = mul a, x we still get the same result for 'y', but 'x' is a new expression. All debug info referring to 'x' must be invalidated (marked as optimized out) since we no longer calculate the expected value. As a side-effect this patch avoids (at least some) problems where reassociate could end up creating IR with debug-use before def. Earlier the dbg.value nodes were left untouched in the IR, while the reused binary nodes were sunk to just before the root node of the rewritten expression tree. See PR27273 for more info about such problems. Reviewers: dblaikie, aprantl, dexonsmith Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45975 llvm-svn: 330804	2018-04-25 09:23:56 +00:00
David Bolvansky	3ea50f9fef	Merging r46043: ------------------------------------------------------------------------ llvm-svn: 330799	2018-04-25 04:33:36 +00:00
Geoff Berry	2af5f3c1e5	[DivRemPairs] Fix non-determinism in use list order. Summary: Use a MapVector instead of a DenseMap for RemMap since it is iterated over and the order of iteration can affect the order that new instructions are created. This can in turn affect the use list order of div/rem input values if multiple new instructions are created that share any input values. Reviewers: spatel Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45858 llvm-svn: 330792	2018-04-25 02:17:56 +00:00
Chandler Carruth	69e68f8468	[PM/LoopUnswitch] Begin teaching SimpleLoopUnswitch to use the new update API for dominators rather than doing manual, hacky updates. This is just the first step, but in some ways the most important as it moves the non-trivial unswitching to update the domtree rather than fully recalculating it each time. Subsequent patches should remove the custom update logic used by the trivial unswitch and replace it with uses of the update API. This also fixes a number of bugs I was seeing when testing non-trivial unswitch due to it querying the quasi-correct dominator tree. Now the tree is 100% correct and safe to query. That said, there are still more bugs I can see with non-trivial unswitch just running over the test suite, so more bugfix patches are needed as well. Thanks to both Sanjoy and Fedor for reviews and testing! Differential Revision: https://reviews.llvm.org/D45943 llvm-svn: 330787	2018-04-25 00:18:07 +00:00
Diego Caballero	60f2776b2f	[LV][VPlan] Detect outer loops for explicit vectorization. Patch #2 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). This patch introduces the basic infrastructure to detect, legality check and process outer loops annotated with hints for explicit vectorization. All these changes are protected under the feature flag -enable-vplan-native-path. This should make this patch NFC for the existing inner loop vectorizer. Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso. Differential Revision: https://reviews.llvm.org/D42447 llvm-svn: 330739	2018-04-24 17:04:17 +00:00
Florian Hahn	ceee788947	[LoopInterchange] Make isProfitableForVectorization slightly more conservative. After D43236, we started interchanging loops with empty dependence matrices. In isProfitableForVectorization, we try to determine if interchanging makes the loop dependences more friendly to the vectorizer. If there are no dependences, we should not interchange, based on that heuristic. Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource Reviewed By: mcrosier Differential Revision: https://reviews.llvm.org/D45208 llvm-svn: 330738	2018-04-24 16:55:32 +00:00
David Blaikie	ba47dd16c5	Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h) llvm-svn: 330726	2018-04-24 15:40:07 +00:00
Benjamin Kramer	f85f5da3b2	[LoadStoreVectorize] Ignore interleaved invariant loads. The memory location an invariant load is using can never be clobbered by any store, so it's safe to move the load ahead of the store. Differential Revision: https://reviews.llvm.org/D46011 llvm-svn: 330725	2018-04-24 15:28:47 +00:00
Chandler Carruth	43acdb35bc	[PM/LoopUnswitch] Fix a bug in the loop block set formation of the new loop unswitch. This code incorrectly added the header to the loop block set early. As a consequence we would incorrectly conclude that a nested loop body had already been visited when the header of the outer loop was the preheader of the nested loop. In retrospect, adding the header eagerly doesn't really make sense. It seems nicer to let the cycle be formed naturally. This will catch crazy bugs in the CFG reconstruction where we can't correctly form the cycle earlier rather than later, and makes the rest of the logic just fall out. I've also added various asserts that make these issues much easier to debug. llvm-svn: 330707	2018-04-24 10:33:08 +00:00
Max Kazantsev	c54e67d6b9	[NFC] Remove recently added SE verification because it may be false-positive llvm-svn: 330699	2018-04-24 09:11:01 +00:00
Max Kazantsev	30dee7874d	[NFC] Use forgetTopmostLoop instead of logic duplication llvm-svn: 330683	2018-04-24 04:33:04 +00:00
Chandler Carruth	0ace148ca6	[PM/LoopUnswitch] Remove another over-aggressive assert. This code path can very clearly be called in a context where we have baselined all the cloned blocks to a particular loop and are trying to handle nested subloops. There is no harm in this, so just relax the assert. I've added a test case that will make sure we actually exercise this code path. llvm-svn: 330680	2018-04-24 03:27:00 +00:00
Max Kazantsev	5a0a40b8cb	[NFC] Add clarification comment llvm-svn: 330677	2018-04-24 02:08:05 +00:00
David Blaikie	a27771b62f	InstCombine: Fix layering by not including Scalar.h in InstCombine (notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be included by InstCombine which doesn't/shouldn't need to depend on ScalarOpts) llvm-svn: 330669	2018-04-24 00:48:59 +00:00
Craig Topper	1bcb258ba3	[AggressiveInstCombine] Add aggressive inst combiner to the LLVM C API. I just tried to copy what was done for regular InstCombine. Hopefully I didn't miss anything. llvm-svn: 330668	2018-04-24 00:39:29 +00:00
Alex Shlyapnikov	909fb12f0c	[HWASan] Use dynamic shadow memory on Android only (LLVM) There're issues with IFUNC support on other platforms. Differential Revision: https://reviews.llvm.org/D45840 llvm-svn: 330665	2018-04-24 00:16:54 +00:00
Craig Topper	d4eb2073b7	[AggressiveInstCombine] Add library initializer routine for AggressiveInstCombine library. Use it in bugpoint and llvm-opt-fuzzer to match regular InstCombine. This should make aggressive instcombine usable with these tools. llvm-svn: 330663	2018-04-24 00:05:21 +00:00
Florian Hahn	7441818560	[LoopInterchange] Do not change LI for BBs in child loops. If a loop with child loops becomes our new inner loop after interchanging, we only need to update LoopInfo for the blocks defined in the old outer loop. BBs in child loops will stay there. Reviewers: efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45970 llvm-svn: 330653	2018-04-23 21:38:19 +00:00
Xin Tong	8edff27923	[CallSiteSplit] Make sure we remove nonnull if the parameter turns out to be a constant. Summary: We do not need nonnull attribute if we know an argument is going to be constant. Reviewers: junbuml, davide, fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45608 llvm-svn: 330641	2018-04-23 20:09:08 +00:00
Bjorn Pettersson	8e484dc531	[MemCpyOpt] Skip optimizing basic blocks not reachable from entry Summary: Skip basic blocks not reachable from the entry node in MemCpyOptPass::iterateOnFunction. Code that is unreachable may have properties that do not exist for reachable code (an instruction in a basic block can for example be dominated by a later instruction in the same basic block, for example if there is a single block loop). MemCpyOptPass::processStore is only safe to use for reachable basic blocks, since it may iterate past the basic block beginning when used for unreachable blocks. By simply skipping to optimize unreachable basic blocks we can avoid asserts such as "Assertion `!NodePtr->isKnownSentinel()' failed." in MemCpyOptPass::processStore. The problem was detected by fuzz tests. Reviewers: eli.friedman, dneilson, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D45889 llvm-svn: 330635	2018-04-23 19:55:04 +00:00
Daniel Neilson	cc45e923c5	[DSE] Teach the pass that atomic memory intrinsics are stores. Summary: This change teaches DSE that the atomic memory intrinsics are stores that can be eliminated, and can allow other stores to be eliminated. This change specifically does not teach DSE that these intrinsics can be partially eliminated (i.e. length reduced, and dest/src changed); that will be handled in another change. Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith Reviewed By: efriedma Subscribers: dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D45535 llvm-svn: 330629	2018-04-23 19:06:49 +00:00
Alex Shlyapnikov	a2b4f9b4d4	[HWASan] Switch back to fixed shadow mapping for x86-64 For now switch back to fixed shadow mapping for x86-64 due to the issues with IFUNC linking on older binutils. More details will be added to https://bugs.chromium.org/p/chromium/issues/detail?id=835864 Differential Revision: https://reviews.llvm.org/D45840 llvm-svn: 330623	2018-04-23 18:14:39 +00:00
Max Kazantsev	91f481665e	[LoopRotate] Fix incorrect SCEV invalidation in loop rotation LoopRotate only invalidates innermost loops while the changes that it makes may also affect any of its parents. With patch rL329047, SCEV becomes much smarter about calculation of exit counts for outer loops, so we cannot assume that they are not affected. Differential Revision: https://reviews.llvm.org/D45945 llvm-svn: 330582	2018-04-23 12:33:31 +00:00
Max Kazantsev	acda4c0f18	[LoopUnroll] Fix potentially incorrect SCEV invalidation in UnrollRuntime Current runtime unrolling invalidates parent loop saying that it might have changed after the inner loop has changed, but it doesn't bother to do the same to its parents. With patch rL329047, SCEV becomes much smarter about calculation of exit counts for outer loops. We might need to invalidate not only the immediate parent, but also any of its parents as well. There is no clear evidence that there is some miscompile happening because of this (at least I don't have such test), but the common sense says that the current code is wrong. Differential Revision: https://reviews.llvm.org/D45940 Reviewed By: chandlerc llvm-svn: 330577	2018-04-23 10:39:38 +00:00
Max Kazantsev	b1137c42fa	[LoopSimplify] Fix incorrect SCEV invalidation In the function `simplifyOneLoop` we optimistically assume that changes in the inner loop only affect this very loop and have no impact on its parents. In fact, after rL329047 has been merged, we can now calculate exit counts for outer loops which may depend on inner loops. Thus, we need to invalidate all parents when we do something to a loop. There is evidence of incorrect behavior of `simplifyOneLoop`: when we insert `SE->verify()` check in the end of this function, it fails on a bunch of existing tests, in particular: LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll LLVM :: Transforms/LoopUnroll/peel-loop.ll LLVM :: Transforms/LoopUnroll/peel-loop2.ll Note that previously we have fixed issues of this variety, see rL328483. This patch makes this function invalidate the outermost loop properly. Differential Revision: https://reviews.llvm.org/D45937 Reviewed By: chandlerc llvm-svn: 330576	2018-04-23 10:32:37 +00:00
Chandler Carruth	bf7190a154	[PM/LoopUnswitch] Remove a buggy assert in the new loop unswitch. The condition this was asserting doesn't actually hold. I've added comments to explain why, removed the assert, and added a fun test case reduced out of 403.gcc. llvm-svn: 330564	2018-04-23 06:58:36 +00:00
Chandler Carruth	b525424118	[PM/LoopUnswitch] Fix comment typo. NFC. llvm-svn: 330560	2018-04-23 00:48:42 +00:00
Sanjay Patel	30be665e82	[PatternMatch] allow undef elements when matching a vector zero This is the last step in getting constant pattern matchers to allow undef elements in constant vectors. I'm adding a dedicated m_ZeroInt() function and building m_Zero() from that. In most cases, calling code can be updated to use m_ZeroInt() directly when there's no need to match pointers, but I'm leaving that efficiency optimization as a follow-up step because it's not always clear when that's ok. There are just enough icmp folds in InstSimplify that can be used for integer or pointer types, that we probably still want a generic m_Zero() for those cases. Otherwise, we could eliminate it (and possibly add a m_NullPtr() as an alias for isa<ConstantPointerNull>()). We're conservatively returning a full zero vector (zeroinitializer) in InstSimplify/InstCombine on some of these folds (see diffs in InstSimplify), but I'm not sure if that's actually necessary in all cases. We may be able to propagate an undef lane instead. One test where this happens is marked with 'TODO'. llvm-svn: 330550	2018-04-22 17:07:44 +00:00
Shoaib Meenai	106df7dd20	[ObjCARC] Take BlockColors by const reference. NFC llvm-svn: 330489	2018-04-20 22:14:45 +00:00
Shoaib Meenai	d64b83266b	[ObjCARC] Account for funclet token in storeStrong transform When creating a call to storeStrong in ObjCARCContract, ensure the call gets the correct funclet token, otherwise WinEHPrepare will turn the call (and all subsequent instructions) into unreachable. We already have logic to do this for the ARC autorelease elision marker; factor that out into a common function that's used for both. These are the only two places in this transform that create call instructions. Differential Revision: https://reviews.llvm.org/D45857 llvm-svn: 330487	2018-04-20 22:11:03 +00:00
Alex Shlyapnikov	99cf54baa6	[HWASan] Introduce non-zero based and dynamic shadow memory (LLVM). Summary: Support the dynamic shadow memory offset (the default case for user space now) and static non-zero shadow memory offset (-hwasan-mapping-offset option). Keeping the latter case around for functionality and performance comparison tests (and mostly for -hwasan-mapping-offset=0 case). The implementation is stripped down ASan one, picking only the relevant parts in the following assumptions: shadow scale is fixed, the shadow memory is dynamic, it is accessed via ifunc global, shadow memory address rematerialization is suppressed. Keep zero-based shadow memory for kernel (-hwasan-kernel option) and calls instrumented case (-hwasan-instrument-with-calls option), which essentially means that the generated code is not changed in these cases. Reviewers: eugenis Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D45840 llvm-svn: 330475	2018-04-20 20:04:04 +00:00
Sean Fertile	18f17333dd	[PartialInlining] Fix Crash from holding a reference to a destructed ORE. The callback used to create an ORE for the legacy PI pass caches the allocated object in a unique_ptr in the runOnModule function, and returns a reference to that object. Under certain circumstances we can end up holding onto that reference after the ORE's destruction. Rather than allowing the new and legacy passes to create ORE objects in different ways, create the ORE at the point of use. Differential Revision: https://reviews.llvm.org/D43219 llvm-svn: 330473	2018-04-20 19:56:26 +00:00
Michael Zolotukhin	e268304122	Revert r330431. There are still stage3/stage4 miscompares :( llvm-svn: 330446	2018-04-20 16:57:10 +00:00
Florian Hahn	773872fd67	[NewGVN] Split OpPHI detection and creation. It also adds a check making sure PHIs for operands are all in the same block. Patch by Daniel Berlin <dberlin@dberlin.org> Reviewers: dberlin, davide Differential Revision: https://reviews.llvm.org/D43865 llvm-svn: 330444	2018-04-20 16:37:13 +00:00
Michael Zolotukhin	a2c9af0209	Revert "Revert r330403 and r330413." Reapply the patches with a fix. Thanks Ilya and Hans for the reproducer! This reverts commit r330416. The issue was that removing predecessors invalidated uses that we stored for rewrite. The fix is to finish manipulating with CFG before we select uses for rewrite. llvm-svn: 330431	2018-04-20 13:34:32 +00:00
Ilya Biryukov	afe822bd6d	Revert r330403 and r330413. Revert r330413: "[SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites." Revert r330403 "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time." r330403 commit seems to crash clang during our integrate while doing PGO build with the following stacktrace: #2 llvm::SSAUpdaterBulk::RewriteAllUses(llvm::DominatorTree, llvm::SmallVectorImpl<llvm::PHINode>) #3 llvm::JumpThreadingPass::ThreadEdge(llvm::BasicBlock, llvm::SmallVectorImpl<llvm::BasicBlock> const&, llvm::BasicBlock) #4 llvm::JumpThreadingPass::ProcessThreadableEdges(llvm::Value, llvm::BasicBlock, llvm::jumpthreading::ConstantPreference, llvm::Instruction) #5 llvm::JumpThreadingPass::ProcessBlock(llvm::BasicBlock) The crash happens while compiling 'lib/Analysis/CallGraph.cpp'. r330413 is reverted due to conflicting changes. llvm-svn: 330416	2018-04-20 10:52:54 +00:00
Michael Zolotukhin	9dea079315	[SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites. llvm-svn: 330413	2018-04-20 10:31:06 +00:00
Michael Zolotukhin	79e4f7fadb	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. Hopefully, changing set to vector removes nondeterminism detected by some bots, or the new assert will catch something. This reverts commit r330180. llvm-svn: 330403	2018-04-20 08:01:08 +00:00
Michael Zolotukhin	26339b445a	[SSAUpdaterBulk] Add an assert. llvm-svn: 330402	2018-04-20 07:59:57 +00:00
Michael Zolotukhin	0df1d48ca9	[SSAUpdaterBulk] Add * and & to auto. llvm-svn: 330400	2018-04-20 07:58:54 +00:00
Michael Zolotukhin	bc843211fd	[SSAUpdaterBulk] Use PredCache in ComputeLiveInBlocks. llvm-svn: 330399	2018-04-20 07:57:24 +00:00
Michael Zolotukhin	79cb54b2d9	[SSAUpdaterBulk] Use SmallVector instead of SmallPtrSet for uses. llvm-svn: 330398	2018-04-20 07:56:00 +00:00
Vlad Tsyrklevich	230b256783	LowerTypeTests: Propagate symver directives Summary: This change fixes https://crbug.com/834474, a build failure caused by LowerTypeTests not preserving .symver symbol versioning directives for exported functions. Emit symver information to ThinLTO summary data and then propagate symver directives for exported functions to the merged module. Emitting symver information to the summaries increases the size of intermediate build artifacts for a Chromium build by less than 0.2%. Reviewers: pcc Reviewed By: pcc Subscribers: tejohnson, mehdi_amini, eraman, llvm-commits, eugenis, kcc Differential Revision: https://reviews.llvm.org/D45798 llvm-svn: 330387	2018-04-20 01:36:48 +00:00
Jin Lin	585f2699cf	Refine the loop rotation's API Summary: The following changes address two issues. 1) The existing loop rotation pass contains both loop latch simplification and loop rotation. So one flag RotationOnly is added to be passed to the loop rotation pass. 2) The threshold value is initialized with MAX_UINT since the loop rotation utility should not have threshold limit. Reviewers: dmgreen, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45582 llvm-svn: 330362	2018-04-19 20:29:43 +00:00
Chandler Carruth	32e62f9c5b	[PM/LoopUnswitch] Detect irreducible control flow within loops and skip unswitching non-trivial edges. Summary: This fixes the bug pointed out in review with non-trivial unswitching. This also provides a basis that should make it pretty easy to finish fleshing out a routine to scan an entire function body for irreducible control flow, but this patch remains minimal for disabling loop unswitch. Reviewers: sanjoy, fedor.sergeev Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45754 llvm-svn: 330357	2018-04-19 18:44:25 +00:00
Sanjay Patel	a201787fd7	[Reassociate] fix formatting; NFC llvm-svn: 330348	2018-04-19 17:56:36 +00:00
Florian Hahn	b789165e6b	[NewGVN] Add ops as dependency if we cannot find a leader for ValueOp. If those operands change, we might find a leader for ValueOp, which could enable new phi-of-op creation. This fixes a case where we missed creating a phi-of-ops node. With D43865 and this patch, bootstrapping clang/llvm works with -enable-newgvn, whereas without it, the "value changed after iteration" assertion is triggered. Reviewers: dberlin, davide Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D42180 llvm-svn: 330334	2018-04-19 15:05:47 +00:00
Sanjay Patel	b2ab3f28d5	[SimplifyLibcalls] Realloc(null, N) -> Malloc(N) Patch by Dávid Bolvanský! Differential Revision: https://reviews.llvm.org/D45413 llvm-svn: 330259	2018-04-18 14:21:31 +00:00
Sam Parker	3c19051bf0	[IRCE] Only check for NSW on equality predicates After investigation discussed in D45439, it would seem that the nsw flag restriction is unnecessary in most cases. So the IsInductionVar lambda has been removed, the functionality extracted, and now only require nsw when using eq/ne predicates. Differential Revision: https://reviews.llvm.org/D45617 llvm-svn: 330256	2018-04-18 13:50:28 +00:00
Florian Hahn	ac27758895	[LoopUnroll] Only peel if a predicate becomes known in the loop body. If a predicate does not become known after peeling, peeling is unlikely to be beneficial. Reviewers: mcrosier, efriedma, mkazantsev, junbuml Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D44983 llvm-svn: 330250	2018-04-18 12:29:24 +00:00
Bjorn Pettersson	bc4f19b6bd	[DebugInfo] Sink related dbg users when sinking in InstCombine Summary: When sinking an instruction in InstCombine we now also sink the DbgInfoIntrinsics that are using the sunken value. Example) When sinking the load in this input bb.X: %0 = load i64, i64* %start, align 4, !dbg !31 tail call void @llvm.dbg.value(metadata i64 %0, ...) br i1 %cond, label %for.end, label %for.body.lr.ph for.body.lr.ph: br label %for.body we now also move the dbg.value, like this bb.X: br i1 %cond, label %for.end, label %for.body.lr.ph for.body.lr.ph: %0 = load i64, i64* %start, align 4, !dbg !31 tail call void @llvm.dbg.value(metadata i64 %0, ...) br label %for.body In the past we haven't moved the dbg.value so we got bb.X: tail call void @llvm.dbg.value(metadata i64 %0, ...) br i1 %cond, label %for.end, label %for.body.lr.ph for.body.lr.ph: %0 = load i64, i64* %start, align 4, !dbg !31 br label %for.body So in the past we got a debug-use before the def of %0. And that dbg.value was also on the path jumping to %for.end, for which %0 never was defined. CodeGenPrepare normally comes to rescue later (when not moving the dbg.value), since it moves dbg.value intrinsics quite brutally, without really analysing if it is correct to move the intrinsic (see PR31878). So at the moment this patch isn't expected to have much impact, besides that it is moving the dbg.value already in opt, making the IR look more sane directly. This can be seen as a preparation to (hopefully) make it possible to turn off CodeGenPrepare::placeDbgValues later as a solution to PR31878. I also adjusted test/DebugInfo/X86/sdagsplit-1.ll to make the IR in the test case up-to-date with this behavior in InstCombine. Reviewers: rnk, vsk, aprantl Reviewed By: vsk, aprantl Subscribers: mattd, JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45425 llvm-svn: 330243	2018-04-18 08:08:04 +00:00
Sanjay Patel	aea15131db	[InstCombine] peek through bitcasted vector/array pointer GEP operand The bitcast may be interfering with other combines or vectorization as shown in PR16739: https://bugs.llvm.org/show_bug.cgi?id=16739 Most pointer-related optimizations are probably able to look through this bitcast, but removing the bitcast shrinks the IR, so it's at least a size savings. Differential Revision: https://reviews.llvm.org/D44833 llvm-svn: 330237	2018-04-18 00:36:40 +00:00
Vedant Kumar	b0585893cc	[Mem2Reg] Create merged debug locations for inserted phis Track the debug locations of the incoming values to newly-created phis, and apply merged debug locations to the phis. A merged location will be on line 0, but will have the correct scope set. This improves crash reporting when an inlined instruction with a merged location triggers a machine exception. A debugger will be able to narrow down the crash to the correct inlined scope, instead of simply pointing to the outer scope of the caller. Taken together with a change that allows generating merged line-0 locations for instructions which aren't calls, this results in a 0.5% increase in the uncompressed size of the .debug_line section of a stage2+Release build of clang (-O3 -g). rdar://33858697 Differential Revision: https://reviews.llvm.org/D45397 llvm-svn: 330227	2018-04-17 22:03:08 +00:00
Vedant Kumar	4b29172d09	[Mem2Reg] Make RenamePassData a struct, NFC llvm-svn: 330226	2018-04-17 22:03:07 +00:00
Stanislav Mekhanoshin	0bee630814	LoadStoreVectorizer crashes due to unsized type When we skip bitcasts while looking for GEP in LoadStoreVectorizer we should also verify that the type is sized, otherwise we assert. Differential Revision: https://reviews.llvm.org/D45709 llvm-svn: 330221	2018-04-17 21:40:04 +00:00
Michael Zolotukhin	21458fdc55	Revert "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again." This reverts r330175. There are still stage3/stage4 miscompares. llvm-svn: 330180	2018-04-17 07:31:27 +00:00
Michael Zolotukhin	a6e7bd7001	[SSAUpdaterBulk] Add debug logging. llvm-svn: 330176	2018-04-17 04:45:40 +00:00
Michael Zolotukhin	3f5fd1b129	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again. One more, hopefully the last, bug is fixed: when forming UsesToRewrite we should ignore phi operands coming from edges that we want to delete. This reverts r329910. llvm-svn: 330175	2018-04-17 04:45:22 +00:00
Haicheng Wu	f7466f3164	[SLP] Use getExtractWithExtendCost() to compute the scalar cost of extractelement/ext pair We use getExtractWithExtendCost to calculate the cost of extractelement and s|zext together when computing the extract cost after vectorization, but we calculate the cost of extractelement and s|zext separately when computing the scalar cost which is larger than it should be. Differential Revision: https://reviews.llvm.org/D45469 llvm-svn: 330143	2018-04-16 18:09:49 +00:00
Sanjay Patel	f4c4fc77cd	[InstCombine] simplify code in SimplifyAssociativeOrCommutative; NFCI llvm-svn: 330137	2018-04-16 17:15:13 +00:00
Sanjay Patel	d93b8a0740	[InstCombine] simplify getBinOpsForFactorization(); NFC llvm-svn: 330129	2018-04-16 15:19:24 +00:00
Sanjay Patel	1170daa277	[InstCombine] simplify fneg+fadd folds; NFC Two cleanups: 1. As noted in D45453, we had tests that don't need FMF that were misplaced in the 'fast-math.ll' test file. 2. This removes the final uses of dyn_castFNegVal, so that can be deleted. We use 'match' now. llvm-svn: 330126	2018-04-16 14:13:57 +00:00
Sanjay Patel	77e990d887	[InstCombine] fix formatting; NFC llvm-svn: 330124	2018-04-16 13:21:15 +00:00
Roman Lebedev	f84bfb2147	[InstCombine] Simplify 'xor' to 'or' if no common bits are set. Summary: In order to get the whole fold as specified in [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]], let's first handle the simple straight-forward things. Let's start with the `and` -> `or` simplification. The one obvious thing missing here: the constant mask is not handled. I have an idea how to handle it, but it will require some thinking, and is not strictly required here, so i've left that for later. https://rise4fun.com/Alive/Pkmg Reviewers: spatel, craig.topper, eli.friedman, jingyue Reviewed By: spatel Subscribers: llvm-commits Was reviewed as part of https://reviews.llvm.org/D45631 llvm-svn: 330103	2018-04-15 18:59:44 +00:00
Roman Lebedev	25cbb62d18	[NFC] ConstantOffsetExtractor::CanTraceInto(): add FIXME: no tests As suggested in https://reviews.llvm.org/D45631#1068338, looking at haveNoCommonBitsSet() users, and trying to show the change effect elsewhere. llvm-svn: 330100	2018-04-15 18:59:27 +00:00
Sanjay Patel	34ea6cdfab	[InstCombine] simplify more code for distributive property; NFCI Also, fix capitalization to current style. Follow-up to: rL330096 llvm-svn: 330097	2018-04-15 16:20:58 +00:00
Sanjay Patel	f1aa0d7af2	[InstCombine] simplify code for distributive property; NFCI llvm-svn: 330096	2018-04-15 15:39:57 +00:00
Warren Ristow	8b2f27ce3a	[InstCombine] Enable Add/Sub simplifications with only 'reassoc' FMF These simplifications were previously enabled only with isFast(), but that is more restrictive than required. Since r317488, FMF has 'reassoc' to control these cases at a finer level. llvm-svn: 330089	2018-04-14 19:18:28 +00:00
Hiroshi Inoue	ae17900997	[NFC] fix trivial typos in document and comments "not not" -> "not" etc llvm-svn: 330083	2018-04-14 08:59:00 +00:00
Roman Tereshin	dab10b5468	[DebugInfo][OPT] NFC follow-up on "Fixing a couple of DI duplication bugs of CloneModule" llvm-svn: 330070	2018-04-13 21:23:11 +00:00
Roman Tereshin	d769eb36ab	[DebugInfo][OPT] Fixing a couple of DI duplication bugs of CloneModule As demonstrated by the regression tests added in this patch, the following cases are valid cases: 1. A Function with no DISubprogram attached, but various debug info related to its instructions, coming, for instance, from an inlined function, also defined somewhere else in the same module; 2. ... or coming exclusively from the functions inlined and eliminated from the module entirely. The ValueMap shared between CloneFunctionInto calls within CloneModule needs to contain identity mappings for all of the DISubprogram's to prevent them from being duplicated by MapMetadata / RemapInstruction calls, this is achieved via DebugInfoFinder collecting all the DISubprogram's. However, CloneFunctionInto was missing calls into DebugInfoFinder for functions w/o DISubprogram's attached, but still referring DISubprogram's from within (case 1). This patch fixes that. The fix above, however, exposes another issue: if a module contains a DISubprogram referenced only indirectly from other debug info metadata, but not attached to any Function defined within the module (case 2), cloning such a module causes a DICompileUnit duplication: it will be moved in indirectly via a DISubprogram by DebugInfoFinder first (because of the first bug fix described above), without being self-mapped within the shared ValueMap, and then will be copied during named metadata cloning. So this patch makes sure DebugInfoFinder visits DICompileUnit's referenced from DISubprogram's as it goes w/o re-processing llvm.dbg.cu list over and over again for every function cloned, and makes sure that CloneFunctionInto self-maps DICompileUnit's referenced from the entire function, not just its own DISubprogram attached that may also be missing. The most convenient way of testing CloneModule I found is to rely on CloneModule call from `opt -run-twice`, instead of writing tedious unit tests. That feature has a couple of properties that makes it hard to use for this purpose though: 1. CloneModule doesn't copy source filename, making `opt -run-twice` report it as a difference. 2. `opt -run-twice` does the second run on the original module, not its clone, making the result of cloning completely invisible in opt's actual output with and without `-run-twice` both, which directly contradicts `opt -run-twice`'s own error message. This patch fixes this as well. Reviewed By: aprantl Reviewers: loladiro, GorNishanov, espindola, echristo, dexonsmith Subscribers: vsk, debug-info, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45593 llvm-svn: 330069	2018-04-13 21:22:24 +00:00
Krzysztof Parzyszek	dfed941eec	[LV] Introduce TTI::getMinimumVF The function getMinimumVF(ElemWidth) will return the minimum VF for a vector with elements of size ElemWidth bits. This value will only apply to targets for which TTI::shouldMaximizeVectorBandwidth returns true. The value of 0 indicates that there is no minimum VF. Differential Revision: https://reviews.llvm.org/D45271 llvm-svn: 330062	2018-04-13 20:16:32 +00:00
Mandeep Singh Grang	636d94db3b	[Transforms] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: kcc, pcc, danielcdh, jmolloy, sanjoy, dberlin, ruiu Reviewed By: ruiu Subscribers: ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D45142 llvm-svn: 330059	2018-04-13 19:47:57 +00:00
Andrey Konovalov	1ba9d9c6ca	hwasan: add -fsanitize=kernel-hwaddress flag This patch adds -fsanitize=kernel-hwaddress flag, that essentially enables -hwasan-kernel=1 -hwasan-recover=1 -hwasan-match-all-tag=0xff. Differential Revision: https://reviews.llvm.org/D45046 llvm-svn: 330044	2018-04-13 18:05:21 +00:00
Roman Lebedev	c00659328a	[InstCombine]: foldSelectICmpAndAnd(): and is commutative Summary: The fold added in D45108 did not account for the fact that the and instruction is commutative, and if the mask is a variable, the mask variable and the fold variable may be swapped. I have noticed this by accident when looking into [[ https://bugs.llvm.org/show_bug.cgi?id=6773 \| PR6773 ]] This extends/generalizes that fold, so it is handled too. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45539 llvm-svn: 330001	2018-04-13 09:57:57 +00:00
Craig Topper	254ed028a4	[X86] Remove the pmuldq/pmuludq intrinsics and replace with native IR. This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics. We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well. llvm-svn: 329990	2018-04-13 06:07:18 +00:00
Xin Tong	d83c883d29	[CallSiteSplit] Fix comment. NFC llvm-svn: 329987	2018-04-13 04:35:38 +00:00
Eli Friedman	e1938cbc87	Don't call skipModule for CFI lowering passes. opt-bisect shouldn't skip these passes; they lower intrinsics which no other pass can handle. llvm-svn: 329961	2018-04-12 22:04:11 +00:00
Benjamin Kramer	b4ba3988bb	Revert "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time." This reverts commit r329865. Causes stage2/stage3 miscompare. llvm-svn: 329910	2018-04-12 13:52:02 +00:00
Sam Parker	9737535943	[IRCE] isKnownNonNegative helper function Created a helper function to query for non negative SCEVs. Uses the SGE predicate to catch constants that could be interpreted as negative. Differential Revision: https://reviews.llvm.org/D45481 llvm-svn: 329907	2018-04-12 12:49:40 +00:00
Hiroshi Inoue	bcadfee2ad	[NFC] fix trivial typos in documents and comments "is is" -> "is", "if if" -> "if", "or or" -> "or" llvm-svn: 329878	2018-04-12 05:53:20 +00:00
George Burgess IV	48ee59b6f0	[DeadArgElim] Remove allocsize attributes on callsites We're already removing allocsize attributes from Functions that we remove args from, since removing arguments from a function may make the allocsize attribute incorrect. It appears we forgot to also remove them from callsites. Without this, I get verifier errors on `@Test2`. It probably wouldn't be too hard to make DAE properly update allocsize attributes instead of dropping them, but I can't think of a scenario where that'd be useful in practice. llvm-svn: 329868	2018-04-12 02:06:01 +00:00
Michael Zolotukhin	815f453f76	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. This reapplies commit r329644. llvm-svn: 329865	2018-04-11 23:37:53 +00:00
Michael Zolotukhin	4fbb93003b	[SSAUpdaterBulk] Fix linux bootstrap/sanitizer failures: explicitly specify order of evaluation. The standard says that the order of evaluation of an expression s[x] = foo() is unspecified. In our case, we first create an empty entry in the map, then call foo(), then store its return value to the created entry. The problem is that foo uses the map as a cache, so if it finds that there is an entry in the map, it stops computation. This change explicitly sets the order, thus fixing this heisenbug. llvm-svn: 329864	2018-04-11 23:37:37 +00:00
Sanjay Patel	ff98682c9c	[InstCombine] limit X - (cast(-Y)) --> X + cast(Y) with hasOneUse() llvm-svn: 329821	2018-04-11 15:57:18 +00:00
Artur Gainullin	d928201ac5	Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max. Bitwise 'not' of the min/max could be eliminated in the pattern: %notx = xor i32 %x, -1 %cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y %smax = select i1 %cmp1, i32 %notx, i32 %y %res = xor i32 %smax, -1 https://rise4fun.com/Alive/lCN Reviewers: spatel Reviewed by: spatel Subscribers: a.elovikov, llvm-commits Differential Revision: https://reviews.llvm.org/D45317 llvm-svn: 329791	2018-04-11 10:29:37 +00:00
Sriraman Tallam	182f2df7c5	Simplification of libcall like printf->puts must check for RtLibUseGOT metadata. With -fno-plt, for example, calls to printf when getting converted to puts still use the PLT. This patch checks for the metadata "RtLibUseGOT" and annotates the declaration with the right attributes. Differential Revision: https://reviews.llvm.org/D45180 llvm-svn: 329768	2018-04-10 23:32:36 +00:00
Sanjay Patel	3b6d46761f	[CVP] simplify phi with constant incoming values that match common variable edge values This is based on an example that was recently posted on llvm-dev: void *propagate_null(void* b, int* g) { if (!b) { return 0; } (*g)++; return b; } https://godbolt.org/g/xYk3qG The original code or constant propagation in other passes has obscured the fact that the phi can be removed completely. Differential Revision: https://reviews.llvm.org/D45448 llvm-svn: 329755	2018-04-10 20:42:39 +00:00
Michael Zolotukhin	d6beefd5d3	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. This reverts r329661. Bots are still unhappy. llvm-svn: 329666	2018-04-10 03:40:29 +00:00
Michael Zolotukhin	8a13f6d4a7	Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."" This reapplies commit r329644. llvm-svn: 329661	2018-04-10 02:16:45 +00:00
Michael Zolotukhin	aa7868594e	[SSAUpdaterBulk] Handle CFG with unreachable from entry blocks. llvm-svn: 329660	2018-04-10 02:16:29 +00:00
Michael Zolotukhin	0274632ee6	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." This reverts commit r329644. llvm-svn: 329650	2018-04-10 00:42:43 +00:00
Hideki Saito	d829973794	Fix for the buildbot failure. Now-unused private field TTI deleted. llvm-svn: 329649	2018-04-10 00:38:36 +00:00
Hideki Saito	dfa932b049	[NFC][LV] Move InterleaveInfo from Legal to CostModel Summary: Another clean up, following D43208. Interleaved memory access analysis/optimization has nothing to do with vectorization legality. It doesn't really belong there. On the other hand, cost model certainly has to know about it. In principle, vectorization should proceed like Legality ==> Optimization ==> CostModel ==> CodeGen, and this change just does that, by moving the interleaved access analysis/decision out of Legal, and running it just before the CostModel object is created. After this, I can move LoopVectorizationLegality and Hints/Requirements classes into their own header file, making it shareable within Transform tree. I have the patch already but I don't want to mix it with this change. Eventual goal is to move to Analysis tree, but I first need to move RecurrenceDescriptor/InductionDescriptor from Transform/Util/LoopUtil.* to Analysis. Reviewers: rengolin, hfinkel, mkuper, dcaballe, sguggill, fhahn, aemerson Reviewed By: rengolin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45072 llvm-svn: 329645	2018-04-09 23:45:40 +00:00
Michael Zolotukhin	c6d2d65f37	[PR16756] Use SSAUpdaterBulk in JumpThreading. Summary: SSAUpdater is a bottleneck in JumpThreading, and this patch improves the situation by using SSAUpdaterBulk instead. Compile time impact: no noticeable changes on CTMark, a big improvement on the test from PR16756. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329644	2018-04-09 23:37:37 +00:00
Michael Zolotukhin	52b064f3d3	[PR16756] Add SSAUpdaterBulk. Summary: SSAUpdater is a bottleneck in a number of passes, and one of the reasons is that it performs a lot of unnecessary computations (DT/IDF) over and over again. This patch adds a new SSAUpdaterBulk that uses existing DT and avoids recomputing IDF when possible. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329643	2018-04-09 23:37:20 +00:00
Simon Pilgrim	23c2182c2b	Support generic expansion of ordered vector reduction (PR36732) Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order. This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1] Differential Revision: https://reviews.llvm.org/D45366 llvm-svn: 329585	2018-04-09 15:44:20 +00:00
Xin Tong	fdad23bc36	[MergeICmp] Update debug msg.NFC llvm-svn: 329572	2018-04-09 14:29:13 +00:00
Xin Tong	0efadbbcde	[MergeICmp] Split blocks that do other work. Summary: We do not try to move the instructions and split the block till we know the blocks can be split, i.e. BCE-cmp-insts can be separated from non-BCE-cmp-insts. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44443 llvm-svn: 329564	2018-04-09 13:14:06 +00:00
Max Kazantsev	8624a4786a	[IRCE] Relax restriction on collected range checks In IRCE, we have a very old legacy check that works when we collect comparisons that we treat as range checks. It ensures that the value against which the indvar is compared is loop invariant and is also positive. This latter condition remained there since the times when IRCE was only able to handle signed latch comparison. As the optimization evolved, it now learned how to intersect signed or unsigned ranges, and this logic has no reliance on the fact that the right border of each range should be positive. The old implementation of this non-negativity check was also naive enough and just looked into ranges (while most of other IRCE logic tries to use power of SCEV implications), so this check did not allow to deal with the most simple case that looks like follows: int size; // not known non-negative int length; //known non-negative; i = 0; if (size != 0) { do { range_check(i < size); range_check(i < length); ++i; } while (i < size) } In this case, even if from some dominating conditions IRCE could parse loop structure, it could only remove the range check against `length` and simply ignored the check against `size`. In this patch we remove this obsolete check. It will allow IRCE to pick comparison against `size` as a potential range check and then let Range Intersection logic decide whether it is OK to eliminate it or not. Differential Revision: https://reviews.llvm.org/D45362 Reviewed By: samparker llvm-svn: 329547	2018-04-09 06:01:22 +00:00
Hiroshi Inoue	9ff2380ea6	[NFC] fix trivial typos in comments and error message "is is" -> "is", "are are" -> "are" llvm-svn: 329546	2018-04-09 04:37:53 +00:00
Xin Tong	99c4e2f364	[LIR] Reorder header. NFC llvm-svn: 329530	2018-04-08 13:19:53 +00:00
Sanjay Patel	2a24958923	[InstCombine] simplify code that propagates FMF; NFC llvm-svn: 329503	2018-04-07 14:14:23 +00:00
Roman Lebedev	41922f1a6d	[InstCombine] Get rid of select of bittest (PR36950 / PR17564) Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 \| PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 \| PR17564 ]], D45065, D45107 https://godbolt.org/g/iAYRup Alive proof: https://rise4fun.com/Alive/uiH Testing: `ninja check-llvm` Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45108 llvm-svn: 329492	2018-04-07 10:37:24 +00:00
Nico Weber	b64da22db7	Remove trailing space in build file. llvm-svn: 329479	2018-04-07 03:30:28 +00:00
Vitaly Buka	9cb59b92cc	Fix warning by cl::opt<int> -> cl::opt<unsigned> llvm-svn: 329461	2018-04-06 21:41:17 +00:00
Vitaly Buka	66f53d71f7	Runtime flag to control branch funnel threshold Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45193 llvm-svn: 329459	2018-04-06 21:32:36 +00:00
Geoff Berry	5bf4a5eafa	[EarlyCSE] Add debug counter for debugging mis-optimizations. NFC. Reviewers: reames, spatel, davide, dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45162 llvm-svn: 329443	2018-04-06 18:47:33 +00:00
Sanjay Patel	a9ca709011	[InstCombine] limit nsz: -(X - Y) --> Y - X to hasOneUse() As noted in the post-commit discussion for r329350, we shouldn't generally assume that fsub is the same cost as fneg. llvm-svn: 329429	2018-04-06 17:24:08 +00:00
Simon Pilgrim	a74f4ae404	Strip trailing whitespace. NFCI. llvm-svn: 329421	2018-04-06 17:01:54 +00:00
Mircea Trofin	aa3fea6cb0	[GlobalOpt] Fix support for casts in ctors. Summary: Fixing an issue where initializations of globals where constructors use casts were silently translated to 0-initialization. Reviewers: davidxl, evgeny777 Reviewed By: evgeny777 Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45198 llvm-svn: 329409	2018-04-06 15:54:47 +00:00
Chad Rosier	45735b8e40	[LoopUnroll] Make LoopPeeling respect the AllowPeeling preference. The SimpleLoopUnrollPass isn't supposed to perform loop peeling. Differential Revision: https://reviews.llvm.org/D45334 llvm-svn: 329395	2018-04-06 13:57:21 +00:00
Hans Wennborg	b230c763a4	EntryExitInstrumenter: Handle musttail calls Inserting instrumentation between a musttail call and ret instruction would create invalid IR. Instead, treat musttail calls as function exits. llvm-svn: 329385	2018-04-06 10:14:09 +00:00
Max Kazantsev	832563a782	[NFC] Add missing end of line symbols llvm-svn: 329383	2018-04-06 09:47:06 +00:00
Sanjay Patel	04683de82f	[InstCombine] FP: Z - (X - Y) --> Z + (Y - X) This restores what was lost with rL73243 but without re-introducing the bug that was present in the old code. Note that we already have these transforms if the ops are marked 'fast' (and I assume that's happening somewhere in the code added with rL170471), but we clearly don't need all of 'fast' for these transforms. llvm-svn: 329362	2018-04-05 23:21:15 +00:00
Sanjay Patel	03e2526728	[InstCombine] nsz: -(X - Y) --> Y - X This restores part of the fold that was removed with rL73243 (PR4374). llvm-svn: 329350	2018-04-05 21:37:17 +00:00
Daniel Neilson	367c2aea4e	[InstCombine] Properly change GEP type when reassociating loop invariant GEP chains Summary: This is a fix to PR37005. Essentially, rL328539 ([InstCombine] reassociate loop invariant GEP chains to enable LICM) contains a bug whereby it will convert: %src = getelementptr inbounds i8, i8* %base, <2 x i64> %val %res = getelementptr inbounds i8, <2 x i8*> %src, i64 %val2 into: %src = getelementptr inbounds i8, i8* %base, i64 %val2 %res = getelementptr inbounds i8, <2 x i8*> %src, <2 x i64> %val By swapping the index operands if the GEPs are in a loop, and %val is loop variant while %val2 is loop invariant. This fix recreates new GEP instructions if the index operand swap would result in the type of %src changing from vector to scalar, or vice versa. Reviewers: sebpop, spatel Reviewed By: sebpop Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45287 llvm-svn: 329331	2018-04-05 18:51:45 +00:00
Sanjay Patel	deaf4f354e	[InstCombine] use pattern matchers for fsub --> fadd folds This allows folding for vectors with undef elements. llvm-svn: 329316	2018-04-05 17:06:45 +00:00
Sanjay Patel	236442e063	[InstCombine] cleanup; NFC llvm-svn: 329282	2018-04-05 13:24:26 +00:00
Florian Hahn	6e0043365b	[LoopInterchange] Add stats counter for number of interchanged loops. Reviewers: samparker, karthikthecool, blitz.opensource Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D45209 llvm-svn: 329269	2018-04-05 10:39:23 +00:00
Florian Hahn	831a757728	[LoopInterchange] Preserve LoopInfo after interchanging. LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to move the BBs of the interchanged loops to the right place. Reviewers: davide, efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45278 llvm-svn: 329264	2018-04-05 09:48:45 +00:00
Taewook Oh	e0db533feb	[CallSiteSplitting] Do not perform callsite splitting inside landing pad Summary: If the callsite is inside a landing pad, do not perform callsite splitting. Callsite splitting uses the utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with an assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumption is not true because SplitCriticalEdge returns nullptr if the destination block is a landing pad. This invalid assumption results in an assertion failure. A fundamental solution might be fixing llvm::SplitEdge to not rely on the invalid assumption. However, it'll involve a lot of work because the current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting not attempt splitting if the callsite is in a landing pad. The attached test case will crash with an assertion failure without the fix. Reviewers: fhahn, junbuml, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45130 llvm-svn: 329250	2018-04-05 04:16:23 +00:00
Evgeniy Stepanov	1f1a7a719d	hwasan: add -hwasan-match-all-tag flag Sometimes instead of storing addresses as is, the kernel stores the address of a page and an offset within that page, and then computes the actual address when it needs to make an access. Because of this the pointer tag gets lost (gets set to 0xff). The solution is to ignore all accesses tagged with 0xff. This patch adds a -hwasan-match-all-tag flag to hwasan, which allows to ignore accesses through pointers with a particular pointer tag value for validity. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D44827 llvm-svn: 329228	2018-04-04 20:44:59 +00:00
Benjamin Kramer	1fc0da4849	Make helpers static. NFC. llvm-svn: 329170	2018-04-04 11:45:11 +00:00
Nicolai Haehnle	eb7311ffb1	StructurizeCFG: Test for branch divergence correctly Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass. Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Differential Revision: https://reviews.llvm.org/D43743 llvm-svn: 329165	2018-04-04 10:58:15 +00:00
Craig Topper	7d3aba6687	[SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store. Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors. This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store. Reviewers: efriedma, jmolloy, ABataev Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39760 llvm-svn: 329142	2018-04-04 03:47:17 +00:00
Ikhlas Ajbar	1376d934ed	[Hexagon] peel loops with runtime small trip counts Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329129	2018-04-03 22:55:09 +00:00
Sanjay Patel	81b3b10a95	[InstCombine] allow more fmul folds with 'reassoc' The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121	2018-04-03 22:19:19 +00:00
Vlad Tsyrklevich	07cf78cdad	Fix bad copy-and-paste in r329108 llvm-svn: 329118	2018-04-03 21:40:27 +00:00
Gor Nishanov	d4712715dd	[coroutines] Respect alloca alignment requirements when building coroutine frame Summary: If an alloca needs to be stored in the coroutine frame and it has an alignment specified that does not match the natural alignment of the alloca type, insert appropriate padding into the coroutine frame to make sure that it gets the requested alignment. For example, for a packed type (whose natural alignment is 1) with an alloca alignment of 8, we may need to insert a padding field with the required number of bytes to make sure it is properly aligned. ``` %PackedStruct = type <{ i64 }> ... %data = alloca %PackedStruct, align 8 ``` If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame: ``` %f.Frame = type { ..., i16, [6 x i8], %PackedStruct } ``` Reviewers: rnk, lewissbaker, modocache Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D45221 llvm-svn: 329112	2018-04-03 20:54:20 +00:00
Florian Hahn	9467ccf447	[LoopInterchange] Add remark for calls preventing interchanging. It also updates test/Transforms/LoopInterchange/call-instructions.ll to use accesses where we can prove dependence after D35430. Reviewers: sebpop, karthikthecool, blitz.opensource Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45206 llvm-svn: 329111	2018-04-03 20:54:04 +00:00
Vlad Tsyrklevich	d17f61ea3b	Add the ShadowCallStack attribute Summary: Introduce the ShadowCallStack function attribute. It's added to functions compiled with -fsanitize=shadow-call-stack in order to mark functions to be instrumented by a ShadowCallStack pass to be submitted in a separate change. Reviewers: pcc, kcc, kubamracek Reviewed By: pcc, kcc Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44800 llvm-svn: 329108	2018-04-03 20:10:40 +00:00
Alexey Bataev	d5b1f7892f	[SLP] Fixed formatting, NFC. llvm-svn: 329091	2018-04-03 17:48:14 +00:00
Daniel Neilson	901acfab0c	[InstCombine] Fold compare of int constant against a splatted vector of ints Summary: Folding patterns like: %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %ext = extractelement <4 x i8> %insvec, i32 0 %cond = icmp eq i32 %ext, 0 Combined with existing rules, this allows us to fold patterns like: %insvec = insertelement <4 x i8> undef, i8 %val, i32 0 %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %cond = icmp eq i8 %val, 0 When we construct a splat vector via a shuffle and bitcast the vector into an integer type for comparison against an integer constant, we can simplify the comparison to compare the splatted value against the integer constant. Reviewers: spatel, anna, mkazantsev Reviewed By: spatel Subscribers: efriedma, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D44997 llvm-svn: 329087	2018-04-03 17:26:20 +00:00
Alexey Bataev	428e9d9d87	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085	2018-04-03 17:14:47 +00:00
Alexey Bataev	df989c54cf	Recommit "[SLP] Fix issues with debug output in the SLP vectorizer." The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used errs() rather than dbgs(). llvm-svn: 329082	2018-04-03 16:40:33 +00:00
Benjamin Kramer	2fc3b18922	Revert "[SLP] Fix PR36481: vectorize reassociated instructions." This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071	2018-04-03 14:40:33 +00:00
Alexander Potapenko	ac70668cff	MSan: introduce the conservative assembly handling mode. The default assembly handling mode may introduce false positives in the cases when MSan doesn't understand that the assembly call initializes the memory pointed to by one of its arguments. We introduce the conservative mode, which initializes the first \|sizeof(type)\| bytes for every \|type*\| pointer passed into the assembly statement. llvm-svn: 329054	2018-04-03 09:50:06 +00:00
Chandler Carruth	597bfd8448	[SLP] Fix issues with debug output in the SLP vectorizer. The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used `errs()` rather than `dbgs()`. llvm-svn: 329046	2018-04-03 05:27:28 +00:00
Ikhlas Ajbar	b7322e8ac7	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042	2018-04-03 03:39:43 +00:00
Haicheng Wu	7f0daaeb86	[SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 llvm-svn: 329035	2018-04-03 00:05:10 +00:00
Brian Gesiak	64521bed0d	[Coroutines] Avoid assert splitting hidden coros Summary: When attempting to split a coroutine with 'hidden' visibility (for example, a C++ coroutine that is inlined when compiled with the option '-fvisibility-inlines-hidden'), LLVM would hit an assertion in include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility". The issue is that the visibility is copied from the source of the function split in the `CloneFunctionInto` function, but the linkage is not. To fix, create the new function first with external linkage, then copy the linkage from the original function after `CloneFunctionInto` is called. Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to `setDSOLocal` can be removed in CoroSplit.cpp. Test Plan: check-llvm Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk Reviewed By: rnk Subscribers: llvm-commits, eric_niebler Differential Revision: https://reviews.llvm.org/D44185 llvm-svn: 329033	2018-04-02 23:39:40 +00:00
Reid Kleckner	298ffc609b	[InstCombine] Don't strip function type casts from musttail calls Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them. Reviewers: efriedma, majnemer, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45186 llvm-svn: 329027	2018-04-02 22:49:44 +00:00
Reid Kleckner	a9e9918ee4	Treat inlining a notail call as a regular, non-tail call Otherwise, we end up inlining a musttail call into a non-tail position, which breaks verifier invariants. Fixes PR31014 llvm-svn: 329015	2018-04-02 21:23:16 +00:00
Sanjay Patel	cbb0450540	[InstCombine] add folds for icmp + sub (PR36969) (A - B) >u A --> A <u B C <u (C - D) --> C <u D https://rise4fun.com/Alive/e7j Name: ugt %sub = sub i8 %x, %y %cmp = icmp ugt i8 %sub, %x => %cmp = icmp ult i8 %x, %y Name: ult %sub = sub i8 %x, %y %cmp = icmp ult i8 %x, %sub => %cmp = icmp ult i8 %x, %y This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969 llvm-svn: 329011	2018-04-02 20:37:40 +00:00
Rong Xu	5a8d4c3357	[DeadArgumentElim] Clone function level metadatas Some Function level metadatas, such as function entry count, are not cloned in DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim after internalization. This patch clones the metadatas in the original function to the new function. Differential Revision: https://reviews.llvm.org/D44127 llvm-svn: 328991	2018-04-02 17:27:38 +00:00
Gor Nishanov	b0316d96ae	[coroutines] Add support for llvm.coro.noop intrinsics Summary: A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing. To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed. Reviewers: EricWF, modocache, rnk, lewissbaker Reviewed By: modocache Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45114 llvm-svn: 328986	2018-04-02 16:55:12 +00:00
Alexey Bataev	3decaf4275	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 328980	2018-04-02 14:51:37 +00:00
Teresa Johnson	974706ebf7	[ThinLTO] Add an import cutoff for debugging/triaging Summary: Adds -import-cutoff=N which will stop importing during the thin link after N imports. Default is -1 (no limit). Reviewers: wmi Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45127 llvm-svn: 328934	2018-04-01 15:54:40 +00:00
David Green	f80ebc8d21	[LoopRotate] Rotate loops with loop exiting latches If a loop has a loop exiting latch, it can be profitable to rotate the loop if it leads to the simplification of a phi node. Perform rotation in these cases even if loop rotate itself didn't simplify the loop to get there. Differential Revision: https://reviews.llvm.org/D44199 llvm-svn: 328933	2018-04-01 12:48:24 +00:00
Fangrui Song	956ee79795	Fix a bunch of typos. NFC llvm-svn: 328907	2018-03-30 22:22:31 +00:00
Peter Collingbourne	d03bf12c1b	DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan. This change resolves bug 36314. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D44784 llvm-svn: 328890	2018-03-30 18:37:55 +00:00
Krzysztof Parzyszek	fce30c2ba3	Revert "peel loops with runtime small trip counts" This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875	2018-03-30 16:55:44 +00:00
Ikhlas Ajbar	66c8ba5a50	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854	2018-03-30 03:05:34 +00:00
David Blaikie	f423062aff	Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation llvm-svn: 328844	2018-03-29 22:42:08 +00:00
David Blaikie	7883340331	Remove unused header to fix layering. llvm-svn: 328842	2018-03-29 22:35:59 +00:00
David Blaikie	4778bb88ef	Remove unused headers to fix layering llvm-svn: 328840	2018-03-29 22:31:39 +00:00
David Blaikie	c90289b5d3	llvm-c: Split Utils out of Scalar.h To fix layering (so that Scalar.h, a libScalarOpts header, isn't included from Utils - which libScalarOpts depends on). llvm-svn: 328839	2018-03-29 22:31:38 +00:00
Evgeniy Stepanov	50635dab26	Add msan custom mapping options. Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan. As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html Patch by vit9696(at)avp.su. Differential Revision: https://reviews.llvm.org/D44926 llvm-svn: 328830	2018-03-29 21:18:17 +00:00
Philip Reames	5c14ed89f6	[NFC][LICM] Rearrange checks to have the cheap bail out first llvm-svn: 328822	2018-03-29 20:32:15 +00:00
Haicheng Wu	c7cc87922e	[JumpThreading] Don't select an edge that we know we can't thread In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above mentioned change prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not threaded, which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798	2018-03-29 16:01:26 +00:00
David Green	b0aa36f9c2	[LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface The existing LoopRotation.cpp is implemented as one of loop passes instead of being a utility. The user cannot easily perform the loop rotation selectively (or on demand) under different optimization level. For example, the loop rotation is needed as part of the logic to convert a loop into a loop with bottom test for a transformation. If the loop rotation is simply added as a loop pass before the transformation, the pass is skipped if it is compiled at –O0 or if it is explicitly disabled by the user, causing the compiler to generate incorrect code. Furthermore, as a loop pass it will rotate all loops instead of just the relevant loops. We provide a utility interface for the loop rotation so that the loop rotation can be called on demand. The changeset is as follows: - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main implementation of class LoopRotate into this file. - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the interface LoopRotation(...). - Original LoopRotation.cpp is changed to use the utility function LoopRotation in LoopRotationUtils.cpp. This is done in the same way community did for mem-to-reg implementation. Patch by Jin Lin! Differential Revision: https://reviews.llvm.org/D44595 llvm-svn: 328766	2018-03-29 08:48:15 +00:00
Benjamin Kramer	6b995a4a7e	[Transforms] Make sure to include the c binding header when defining c binding functions Otherwise the definitions can't see the extern C declarations and get name mangled, making it impossible for users to call them. This breaks the Go bindings. llvm-svn: 328765	2018-03-29 07:56:53 +00:00
David Blaikie	8ad9a97310	Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737	2018-03-28 22:28:50 +00:00
David Blaikie	eb8cc04ea2	Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back llvm-svn: 328720	2018-03-28 18:03:25 +00:00
David Blaikie	a373d18eb7	Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717	2018-03-28 17:44:36 +00:00
Alexander Potapenko	4e7ad0805e	[MSan] Introduce ActualFnStart. NFC This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to prepend a special basic block containing tool-specific calls to each function. Because we still want to instrument the original entry block, we'll need to store it in ActualFnStart. For MSan this will still be F.getEntryBlock(), whereas for KMSAN it'll contain the second BB. llvm-svn: 328697	2018-03-28 11:35:09 +00:00
Alexander Potapenko	e1d5877847	[MSan] Add an isStore argument to getShadowOriginPtr(). NFC This is a step towards the upcoming KMSAN implementation patch. The isStore argument is to be used by getShadowOriginPtrKernel(), it is ignored by getShadowOriginPtrUserspace(). Depending on whether a memory access is a load or a store, KMSAN instruments it with different functions, __msan_metadata_ptr_for_load_X() and __msan_metadata_ptr_for_store_X(). Those functions may return different values for a single address, which is necessary in the case the runtime library decides to ignore particular accesses. llvm-svn: 328692	2018-03-28 10:17:17 +00:00
Xin Tong	0272cb077f	80-line wrap. NFC llvm-svn: 328660	2018-03-27 19:43:02 +00:00
Rong Xu	662f38b16f	[PGO] Fix branch probability remarks assert Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653	2018-03-27 18:55:56 +00:00
Krzysztof Parzyszek	5d93fdfa89	[LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632	2018-03-27 16:14:11 +00:00
Max Kazantsev	b1ad66ff12	[LoopUnroll][NFC] Remove redundant canPeel check We check `canPeel` twice: when evaluating the number of iterations to be peeled and within the method `peelLoop` that performs peeling. This method is only executed if the calculated peel count is positive. Thus, the check in `peelLoop` can never fail. This patch replaces this check with an assert. Differential Revision: https://reviews.llvm.org/D44919 Reviewed By: fhahn llvm-svn: 328615	2018-03-27 09:40:51 +00:00
Sam Parker	90b7f4f72c	[IRCE] Enable decreasing loops of non-const bound As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613	2018-03-27 08:24:53 +00:00
Sanjay Patel	0e3167cb30	[InstCombine] improve code comment; NFC llvm-svn: 328560	2018-03-26 17:52:02 +00:00
Sebastian Pop	d870aea03e	[InstCombine] reassociate loop invariant GEP chains to enable LICM This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib. do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1 In this example %idx.ext1 is a loop invariant. It will be moved above the use of loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sinked down in the GEP chain. This patch will produce the following output: do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1 %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext llvm-svn: 328539	2018-03-26 16:19:31 +00:00
Sanjay Patel	4fd4fd610c	[InstCombine] distribute fmul over fadd/fsub This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected. We still need to loosen the FMF check. llvm-svn: 328502	2018-03-26 15:03:57 +00:00
Sanjay Patel	2455fef497	[InstCombine] check uses before creating instructions for fmul distribution As the tests show, we could create extra instructions without any obvious benefit. llvm-svn: 328498	2018-03-26 14:25:43 +00:00
Krzysztof Parzyszek	0b377e0ae9	[LSR] Allow giving priority to post-incrementing addressing modes Implement TTI interface for targets to indicate that the LSR should give priority to post-incrementing addressing modes. Combination of patches by Sebastian Pop and Brendon Cahoon. Differential Revision: https://reviews.llvm.org/D44758 llvm-svn: 328490	2018-03-26 13:10:09 +00:00
Max Kazantsev	a55749312b	[LoopUnroll] Fix dangling pointers in SCEV Current logic of loop SCEV invalidation in Loop Unroller implicitly relies on the fact that exit count of outer loops cannot rely on exiting blocks of inner loops, which is true in current implementation of backedge taken count calculation but is wrong in general. As a result, when we only forget the loop that we have just unrolled, we may still have cached data for its outer loops (in particular, exit counts) which keeps references on blocks of inner loop that could have been changed or even deleted. The attached test demonstrates a situation when after unrolling of innermost loop the outermost loop contains a dangling pointer on nonexistent block. The problem shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV smarter about exit count calculation. I am not sure if the bug exists without this patch, it appears that now it is accidentally correct just because in practice exact backedge taken count for outer loops with complex control flow inside is never calculated. But when SCEV learns to do so, this problem shows up. This patch replaces existing logic of SCEV loop invalidation with a correct one, which happens to be invalidation of outermost loop (which also leads to invalidation of all loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers on removed blocks, or just outdated information that has changed after unrolling. Differential Revision: https://reviews.llvm.org/D44818 Reviewed By: samparker llvm-svn: 328483	2018-03-26 11:31:46 +00:00
Benjamin Kramer	8840f644b4	[DeadArgElim] Strip allocsize attributes when deleting an argument. Since allocsize refers to the argument number it gets invalidated when an argument is removed and the numbers shift. llvm-svn: 328481	2018-03-26 09:44:24 +00:00
Sam Parker	53a423a417	[IRCE] Enable increasing loops of variable bounds CanBeMin is currently used which will report true for any unknown values, but often a check is performed outside the loop which covers this situation: for (int i = 0; i < N; ++i) ... if (N > 0) for (int i = 0; i < N; ++i) ... So I've added 'LoopGuardedAgainstMin' which reports whether N is greater than the minimum value which then allows loops with a variable loop count to be optimised. I've also moved the increasing bound checking into its own function and replaced SumCanReachMax with another isLoopEntryGuardedByCond function. llvm-svn: 328480	2018-03-26 09:29:42 +00:00
Sanjay Patel	93e64dd9a1	[PatternMatch] allow undef elements when matching vector FP +0.0 This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461	2018-03-25 21:16:33 +00:00
Sanjay Patel	841aac04d4	[InstCombine] peek through more icmp of FP cast + bitcast This is an extension of rL328426 as noted in D44367. llvm-svn: 328448	2018-03-25 14:01:42 +00:00
Sanjay Patel	745a9c62c2	[InstCombine] peek through FP casts for sign-bit compares (PR36682) This pattern came up in PR36682: https://bugs.llvm.org/show_bug.cgi?id=36682 https://godbolt.org/g/LhuD9A Equality checks are planned as a follow-up enhancement. Differential Revision: https://reviews.llvm.org/D44367 llvm-svn: 328426	2018-03-24 15:45:02 +00:00
Sanjay Patel	286074e8a1	[InstCombine] fix formatting; NFC llvm-svn: 328425	2018-03-24 15:41:59 +00:00
David Blaikie	53f51c1df8	Remove unused header from EntryExitInstrumenter Fixes layering, since Transforms/Utils doesn't depend on CodeGen, so shouldn't include headers from it. llvm-svn: 328399	2018-03-24 00:06:14 +00:00
Philip Reames	6a1f3446b5	[GuardWidening] Group code by class [NFC] llvm-svn: 328387	2018-03-23 23:41:47 +00:00
David Blaikie	4fe1fe1418	Fix Layering, move instrumentation transform headers into Instrumentation subdirectory llvm-svn: 328379	2018-03-23 22:11:06 +00:00
Fedor Sergeev	6660fd0f95	[PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass Summary: This was motivated by absence of PruneEH functionality in new PM. It was decided that a proper way to do PruneEH is to add NoUnwind inference into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top. This change generalizes attribute handling implemented for (a removal of) Convergent attribute, by introducing a generic builder-like class AttributeInferer It registers all the attribute inference requests, storing per-attribute predicates into a vector, and then goes through an SCC Node, scanning all the instructions for not breaking attribute assumptions. The main idea is that as soon as all the instructions from all the functions of SCC Node conform to attribute assumptions then we are free to infer the attribute as set for all the functions of SCC Node. It handles two distinct cases of attributes: - those that might break due to derefinement of the function code for these attributes we are allowed to apply inference only if all the functions are "exact definitions". Example - NoUnwind. - those that do not care about derefinement for these attributes we are allowed to apply inference as soon as we see any function definition. Example - removal of Convergent attribute. Also in this commit: * Converted all the FunctionAttrs tests to use FileCheck and added new-PM invocations to them * FunctionAttrs/convergent.ll test demonstrates a difference in behavior between new and old PM implementations. Marked with FIXME. * PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg combo as intended * some of "other" tests were updated since function-attrs now infers 'nounwind' even for old PM pipeline * -disable-nounwind-inference hidden option added as a possible workaround for a supposedly rare case when nounwind being inferred by default presents a problem Reviewers: chandlerc, jlebar Reviewed By: jlebar Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D44415 llvm-svn: 328377	2018-03-23 21:46:16 +00:00
Sanjay Patel	32381d7c7e	[InstCombine] simplify code for FP intrinsic shrinking; NFCI llvm-svn: 328372	2018-03-23 21:18:12 +00:00
Alex Shlyapnikov	83e7841419	[HWASan] Port HWASan to Linux x86-64 (LLVM) Summary: Porting HWASan to Linux x86-64, first of the three patches, LLVM part. The approach is similar to ARM case, trap signal is used to communicate memory tag check failure. int3 instruction is used to generate a signal, access parameters are stored in nop [eax + offset] instruction immediately following the int3 one. One notable difference is that x86-64 has to untag the pointer before use due to the lack of feature comparable to ARM's TBI (Top Byte Ignore). Reviewers: eugenis Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44699 llvm-svn: 328342	2018-03-23 17:57:54 +00:00
Andrew Kaylor	a237866faf	Fix a block copying problem in LICM Differential Revision: https://reviews.llvm.org/D44817 llvm-svn: 328336	2018-03-23 17:36:18 +00:00
Sanjay Patel	713ca3d36a	[InstCombine] reduce code duplication; NFC llvm-svn: 328323	2018-03-23 15:07:35 +00:00
Sanjay Patel	6de89ce3f7	[InstCombine] improve variable name; NFC llvm-svn: 328322	2018-03-23 14:48:31 +00:00
Matthew Simpson	6c289a1c74	[SLP] Stop counting cost of gather sequences with multiple uses When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model. Differential Revision: https://reviews.llvm.org/D44742 llvm-svn: 328316	2018-03-23 14:18:27 +00:00
Florian Hahn	f73c3ece7f	Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. Reverted for now, due to it causing verifier failures. llvm-svn: 328312	2018-03-23 12:49:39 +00:00
Florian Hahn	b1feec087e	[IPSCCP] Use constant range information for comparisons of parameters. For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 328307	2018-03-23 11:56:00 +00:00
Florian Hahn	52436a587e	[LoopUnroll] Simplify induction variables after peeling too. Loop peeling also has an impact on the induction variables, so we should benefit from induction variable simplification after peeling too. Reviewers: sanjoy, bogner, mzolotukhin, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43878 llvm-svn: 328301	2018-03-23 10:38:12 +00:00
David Blaikie	301627f875	Move SampleProfile.h into IPO along with the rest of the IPO pass headers llvm-svn: 328262	2018-03-22 22:42:44 +00:00
David Blaikie	376294c23a	Finish moving the IPSCCP pass from Scalar to IPO - moving the registration llvm-svn: 328259	2018-03-22 22:07:53 +00:00
David Blaikie	3bbf5af0ac	Fix layering between SCCP and IPO SCCP Transforms/Scalar/SCCP.cpp implemented both the Scalar and IPO SCCP, but this meant Transforms/Scalar including Transforms/IPO headers, creating a circular dependency. (IPO depends on Scalar already) - so move the IPO SCCP shims out into IPO and the basic library implementation accessible from Scalar/SCCP.h to be used from the IPO/SCCP.cpp implementation. llvm-svn: 328250	2018-03-22 21:41:29 +00:00
David Blaikie	2965a01e98	Move the initialization of the Meta Renamer pass over to IPO along with the rest of it that was moved in r328209 llvm-svn: 328234	2018-03-22 19:36:54 +00:00
Daniel Neilson	710d7b9945	[InstCombineCalls] Update deprecated API usage (NFC) Summary: Just updating a call to MemSetInst::getAlignment() to MemSetInst::getDestAlignment(). The former has been deprecated. llvm-svn: 328227	2018-03-22 18:36:15 +00:00
Matt Morehouse	236cdaf84c	[SimplifyCFG] Create attribute for fuzzing-specific optimizations. Summary: When building with libFuzzer, converting control flow to selects or obscuring the original operands of CMPs reduces the effectiveness of libFuzzer's heuristics. This patch provides an attribute to disable or modify certain optimizations for optimal fuzzing signal. Provides a less aggressive alternative to https://reviews.llvm.org/D44057. Reviewers: vitalybuka, davide, arsenm, hfinkel Reviewed By: vitalybuka Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44232 llvm-svn: 328214	2018-03-22 17:07:51 +00:00
Anna Thomas	9b1176b0ef	[LoopPredication] Add profitability check based on BPI Summary: LoopPredication is not profitable when the loop is known to always exit through some block other than the latch block. A coarse grained latch check can cause loop predication to predicate the loop, and unconditionally deoptimize. However, without predicating the loop, the guard may never fail within the loop during the dynamic execution because the non-latch loop termination condition exits the loop before the latch condition causes the loop to exit. We teach LP about this using BranchProfileInfo pass. Reviewers: apilipenko, skatkov, mkazantsev, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44667 llvm-svn: 328210	2018-03-22 16:03:59 +00:00
David Blaikie	0368417595	Move MetaRenamer from Transforms/Utils to Transforms/IPO since it implements part of IPO.h llvm-svn: 328209	2018-03-22 15:57:47 +00:00
Florian Hahn	9bc0bc4b9b	[CallSiteSplitting] Preserve DominatorTreeAnalysis. The dominator tree analysis can be preserved easily. Some other kinds of analysis can probably be preserved too. Reviewers: junbuml, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43173 llvm-svn: 328206	2018-03-22 15:23:33 +00:00
Sanjay Patel	94c91b78e7	[InstCombine] add folds for xor-of-icmp signbit tests (PR36682) This is a retry of r328119 which was reverted at r328145 because it could crash by trying to combine icmps with different operand types. This version has a check for that and additional tests. Original commit message: This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=36682 There's also a leftover improvement from the long-ago-closed: https://bugs.llvm.org/show_bug.cgi?id=5438 https://rise4fun.com/Alive/dC1 llvm-svn: 328197	2018-03-22 14:08:16 +00:00
Florian Hahn	3bb822e7d6	[CloneFunction] Preserve DT in DuplicateInstructionsInSplitBetween. DuplicateInstructionsInSplitBetween can preserve the DT by passing through DT to SplitEdge. Reviewers: sanjoy, junbuml, anna, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D44629 llvm-svn: 328189	2018-03-22 11:38:53 +00:00
David Blaikie	2be3922807	Fix a couple of layering violations in Transforms Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165	2018-03-21 22:34:23 +00:00
Reid Kleckner	762331be07	Revert r328119 "[InstCombine] add folds for xor-of-icmp signbit tests (PR36682)" This asserts when compiling safe_numerics_unittest.cpp in Chromium with MSan. llvm-svn: 328145	2018-03-21 20:35:36 +00:00
Sanjay Patel	778032f39d	[InstCombine] add folds for xor-of-icmp signbit tests (PR36682) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=36682 There's also a leftover improvement from the long-ago-closed: https://bugs.llvm.org/show_bug.cgi?id=5438 https://rise4fun.com/Alive/dC1 llvm-svn: 328119	2018-03-21 17:17:13 +00:00
Daniel Neilson	6f1eb58e92	[MemCpyOpt] Update to new API for memory intrinsic alignment Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the MemCpyOpt pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. We also add a few tests to fill gaps in the testing of this pass. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 328097	2018-03-21 14:14:55 +00:00
Justin Lebar	038cbc5c13	Re-re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for causing performance regressions. Re-landing because we've determined that these regressions were noise. Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 328096	2018-03-21 14:08:21 +00:00
Philip Reames	23aed5ef6f	[MustExecute] Move isGuaranteedToExecute and related routines to Analysis Next step is to actually merge the implementations and get both implementations tested through the new printer. llvm-svn: 328055	2018-03-20 22:45:23 +00:00
Shoaib Meenai	3f689c8632	[ObjCARC] Add funclet token to ARC marker The inline assembly generated for the ARC autorelease elision marker must have a funclet token if it's emitted inside a funclet, otherwise the inline assembly (and all subsequent code in the funclet) will be marked unreachable by WinEHPrepare. Note that this only applies for the non-O0 case, since at O0, clang emits the autorelease elision marker itself rather than deferring to the backend. The fix for clang is handled in a separate change. Differential Revision: https://reviews.llvm.org/D44641 llvm-svn: 328042	2018-03-20 20:45:41 +00:00
Xin Tong	a713ebea24	[MergeICmps] Break eagerly out of loop llvm-svn: 327972	2018-03-20 12:03:25 +00:00
Xin Tong	bdbd97ed9a	[MergeICmp] Fix a bug in entry block shuffled to middle of the chain Summary: Fix a bug in entry block shuffled to middle of the chain. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44642 llvm-svn: 327971	2018-03-20 11:57:54 +00:00
Andrei Elovikov	8b8253fdc7	[LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast. Summary: It turned out to be error-prone to expect the callers to handle that - better to leave the decision to this routine and make the required data to be explicitly passed to the function. This handles the case that was missed in the r322473 and fixes the assert mentioned in PR36524. Reviewers: dorit, mssimpso, Ayal, dcaballe Reviewed By: dcaballe Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits Differential Revision: https://reviews.llvm.org/D43812 llvm-svn: 327960	2018-03-20 09:04:39 +00:00
Sanjay Patel	0ce3086777	[InstCombine] canonicalize fcmp+select to fabs This is complicated by -0.0 and nan. This is based on the DAG patterns as shown in D44091. I'm hoping that we can just remove those DAG folds and always rely on IR canonicalization to handle the matching to fabs. We would still need to delete the broken code from DAGCombiner to fix PR36600: https://bugs.llvm.org/show_bug.cgi?id=36600 Differential Revision: https://reviews.llvm.org/D44550 llvm-svn: 327858	2018-03-19 15:14:30 +00:00
Alexander Potapenko	fa0217276a	[MSan] fix the types of RegSaveAreaPtrPtr and OverflowArgAreaPtrPtr Despite their names, RegSaveAreaPtrPtr and OverflowArgAreaPtrPtr used to be i8* instead of i8**. This is important, because these pointers are dereferenced twice (first in CreateLoad(), then in getShadowOriginPtr()), but for some reason MSan allowed this - most certainly because it was possible to optimize getShadowOriginPtr() away at compile time. Differential revision: https://reviews.llvm.org/D44520 llvm-svn: 327830	2018-03-19 10:08:04 +00:00
Alexander Potapenko	014ff63f24	[MSan] Don't create zero offsets in getShadowPtrForArgument(). NFC For MSan instrumentation with MS.ParamTLS and MS.ParamOriginTLS being TLS variables, the CreateAdd() with ArgOffset==0 is a no-op, because the compiler is able to fold the addition of 0. But for KMSAN, which receives ParamTLS and ParamOriginTLS from a call to the runtime library, this introduces a stray instruction which complicates reading/testing the IR. Differential revision: https://reviews.llvm.org/D44514 llvm-svn: 327829	2018-03-19 10:03:47 +00:00
Alexander Potapenko	e0bafb4359	[MSan] Introduce insertWarningFn(). NFC This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to use a different warning function, __msan_warning_32(uptr origin), so we'd better create the warning calls in one place. Differential Revision: https://reviews.llvm.org/D44513 llvm-svn: 327828	2018-03-19 09:59:44 +00:00
Anastasis Grammenos	3a589103a4	[LICM] Salvage DI from dying Instructions LICM deletes trivially dead instructions which it won't attempt to sink. Attempt to salvage debug values which reference these instructions. llvm-svn: 327800	2018-03-18 15:59:19 +00:00
Roman Lebedev	e6da3063a5	[InstCombine] peek through unsigned FP casts for zero-equality compares (PR36682) Summary: This pattern came up in PR36682 / D44390 https://bugs.llvm.org/show_bug.cgi?id=36682 https://reviews.llvm.org/D44390 https://godbolt.org/g/oKvT5H See also D44416 Reviewers: spatel, majnemer, efriedma, arsenm Reviewed By: spatel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D44424 llvm-svn: 327799	2018-03-18 15:53:02 +00:00
Sanjay Patel	63b1028953	[InstCombine] add nnan requirement for sqrt(x) * sqrt(y) -> sqrt(x*y) This is similar to D43765. llvm-svn: 327797	2018-03-18 14:32:54 +00:00
Oren Ben Simhon	fdd72fd522	[X86] Added support for nocf_check attribute for indirect Branch Tracking X86 Supports Indirect Branch Tracking (IBT) as part of Control-Flow Enforcement Technology (CET). IBT instruments ENDBR instructions used to specify valid targets of indirect call / jmp. The `nocf_check` attribute has two roles in the context of X86 IBT technology: 1. Appertains to a function - do not add ENDBR instruction at the beginning of the function. 2. Appertains to a function pointer - do not track the target function of this pointer by adding nocf_check prefix to the indirect-call instruction. This patch implements `nocf_check` context for Indirect Branch Tracking. It also auto generates `nocf_check` prefixes before indirect branches to jump tables that are guarded by range checks. Differential Revision: https://reviews.llvm.org/D41879 llvm-svn: 327767	2018-03-17 13:29:46 +00:00
Craig Topper	71d69b2ea5	[CorrelatedValuePropagation] Use SelectInst::getCondition/getTrueValue/getFalseValue instead of getOperand for readability. NFC llvm-svn: 327728	2018-03-16 18:18:47 +00:00
Philip Reames	8a106272e8	[LICM/mustexec] Extend first iteration must execute logic to fcmps This builds on the work from https://reviews.llvm.org/D44287. It turned out supporting fcmp was much easier than I realized, so let's do that now. As an aside, our -O3 handling of floating point IVs leaves a lot to be desired. We do convert the float IV to an integer IV, but do so late enough that many other optimizations are missed (e.g. we don't vectorize). Differential Revision: https://reviews.llvm.org/D44542 llvm-svn: 327722	2018-03-16 16:33:49 +00:00
Brian M. Rzycki	f65ddc5fa2	[JumpThreading] Track unreachable BBs to avoid processing JumpThreading iterates over F until the IR quiesces. Transforming unreachable BBs increases compile time and it is also possible to never stabilize causing JumpThreading to hang. An older attempt at fixing this problem was D3991 where removeUnreachableBlocks(F) was called before JumpThreading began. This has a few drawbacks: * expensive - the routine attempts to fix up the IR to identify additional BBs that can be removed along with unreachable BBs. * aggressive - does not identify and preserve the shape of the IR. At a minimum it does not preserve loop hierarchies. * invasive - altering reachable blocks it may disrupt IR shapes that could have otherwise been JumpThreaded. This patch avoids removeUnreachableBlocks(F) and instead tracks unreachable BBs in a SmallPtrSet using DominatorTree to validate the initial state of all BBs. We then rely on subsequent passes to identify and remove these unreachable blocks from F. Reviewers: dberlin, sebpop, kuhar, dinesh.d Reviewed by: sebpop, kuhar Subscribers: hiraditya, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D44177 llvm-svn: 327713	2018-03-16 15:13:47 +00:00
Florian Hahn	fc97b6173f	[LoopUnroll] Peel off iterations if it makes conditions true/false. If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations. This improves codegen for PR34364. Reviewers: mkuper, mkazantsev, efriedma Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43876 llvm-svn: 327671	2018-03-15 21:34:43 +00:00
Philip Reames	a21d5f1e18	[LICM] Ignore exits provably not taken on first iteration when computing must execute It is common to have conditional exits within a loop which are known not to be taken on some iterations, but not necessarily all. This patch extends our reasoning around guaranteed to execute (used when establishing whether it's safe to dereference a location from the preheader) to handle the case where an exit is known not to be taken on the first iteration and the instruction of interest is known to be taken on the first iteration. This case comes up in two major ways: * If we have a range check which we've been unable to eliminate, we frequently know that it doesn't fail on the first iteration. * Pass ordering. We may have a check which will be eliminated through some sequence of other passes, but depending on the exact pass sequence we might never actually do so or we might miss other optimizations from passes run before the check is finally eliminated. The initial version (here) is implemented via InstSimplify. At the moment, it catches a few cases, but misses a lot too. I added test cases for missing cases in InstSimplify which I'll follow up on separately. Longer term, we should probably wire SCEV through to here to get much smarter loop aware simplification of the first iteration predicate. Differential Revision: https://reviews.llvm.org/D44287 llvm-svn: 327664	2018-03-15 21:04:28 +00:00
Diego Caballero	cae4994a58	[LV] Test commit. Removing white space. This is just to check that I have commit access privilege. llvm-svn: 327656	2018-03-15 19:34:27 +00:00
Philip Reames	422024a1b7	[EarlyCSE] Don't hide earlier invariant.scopes If we've already established an invariant scope with an earlier generation, we don't want to hide it in the scoped hash table with one with a later generation. I noticed this when working on the invariant-load handling, but it also applies to the invariant.start case as well. Without this change, my previous patch for invariant-load regresses some cases, so I'm pushing this without waiting for review. This is why you don't make last minute tweaks to patches to catch "obvious cases" after it's already been reviewed. Bad Philip! llvm-svn: 327655	2018-03-15 18:12:27 +00:00
Philip Reames	ca587fe0b4	[EarlyCSE] Reuse invariant scopes for invariant load This is a follow up to https://reviews.llvm.org/D43716 which rewrites the invariant load handling using the new infrastructure. It's slightly more powerful, but only in somewhat minor ways for the moment. It's not clear that DSE of stores to invariant locations is actually interesting since why would your IR have such a construct to start with? Note: The submitted version is slightly different than the reviewed one. I realized the scope could start for an invariant load which was proven redundant and removed. Added a test case to illustrate that as well. Differential Revision: https://reviews.llvm.org/D44497 llvm-svn: 327646	2018-03-15 17:29:32 +00:00
Ulrich Weigand	f4ceef8d3f	[Debug] Retain both copies of debug intrinsics in HoistThenElseCodeToIf When hoisting common code from the "then" and "else" branches of a condition to before the "if", the HoistThenElseCodeToIf routine will attempt to merge the debug location associated with the two original copies of the hoisted instruction. This is a problem in the special case where the hoisted instruction is a debug info intrinsic, since for those the debug location is considered part of the intrinsic and attempting to modify it may result in invalid IR. This is the underlying cause of PR36410. This patch fixes the problem by handling debug info intrinsics specially: instead of hoisting one copy and merging the two locations, the code now simply hoists both copies, each with its original location intact. Note that this is still only done in the case where both original copies are otherwise (i.e. apart from location metadata) identical. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D44312 llvm-svn: 327622	2018-03-15 12:28:48 +00:00
Fedor Sergeev	194a407bda	[New PM][IRCE] port of Inductive Range Check Elimination pass to the new pass manager There are two nontrivial details here: * Loop structure update interface is quite different with new pass manager, so the code to add new loops was factored out * BranchProbabilityInfo is not a loop analysis, so it can not be just getResult'ed from within the loop pass. It cant even be queried through getCachedResult as LoopCanonicalization sequence (e.g. LoopSimplify) might invalidate BPI results. Complete solution for BPI will likely take some time to discuss and figure out, so for now this was partially solved by making BPI optional in IRCE (skipping a couple of profitability checks if it is absent). Most of the IRCE tests got their corresponding new-pass-manager variant enabled. Only two of them depend on BPI, both marked with TODO, to be turned on when BPI starts being available for loop passes. Reviewers: chandlerc, mkazantsev, sanjoy, asbirlea Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43795 llvm-svn: 327619	2018-03-15 11:01:19 +00:00
Andrei Elovikov	f9b8035f3c	[LoopUnroll] Ignore ephemeral values when checking full unroll profitability. Summary: Before this patch call graph is like this in the LoopUnrollPass: tryToUnrollLoop ApproximateLoopSize collectEphemeralValues /* Use collected ephemeral values */ computeUnrollCount analyzeLoopUnrollCost /* Bail out from the analysis if loop contains CallInst */ This patch moves collection of the ephemeral values to the tryToUnrollLoop function and passes the collected values into both ApproximateLoopsize (as before) and additionally starts using them in analyzeLoopUnrollCost: tryToUnrollLoop collectEphemeralValues ApproximateLoopSize(EphValues) /* Use EphValues */ computeUnrollCount(EphValues) analyzeLoopUnrollCost(EphValues) /* Ignore ephemeral values - they don't contribute to the final cost */ /* Bail out from the analysis if loop contains CallInst */ Reviewers: mzolotukhin, evstupac, sanjoy Reviewed By: evstupac Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43931 llvm-svn: 327617	2018-03-15 09:59:15 +00:00
George Burgess IV	cedfa6da81	Remove unused variable; NFC llvm-svn: 327597	2018-03-15 02:58:36 +00:00
Matt Davis	9407bb5f54	[CleanUp] Remove NumInstructions field from LoopVectorizer's RegisterUsage struct. Summary: This variable is largely going unused, aside from reporting the number of instructions in DEBUG builds. The only use of NumInstructions is in debug output to represent the LoopSize. That value can be misleading as it also includes metadata instructions (e.g., DBG_VALUE) which have no real impact. If we do choose to keep this around, we probably should guard it by a DEBUG macro, as it's not used in production builds. Reviewers: majnemer, congh, rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D44495 llvm-svn: 327589	2018-03-14 23:30:31 +00:00
Philip Reames	0adbb19409	[EarlyCSE] Exploit open ended invariant.start scopes If we have an invariant.start with no corresponding invariant.end, then the memory location becomes invariant indefinitely after the invariant.start. As a result, anything dominated by the start is guaranteed to see the value the memory location had when the invariant.start executed. This patch adds an AvailableInvariants table which tracks the generation a particular memory location became invariant and then uses that information to allow value forwarding that would otherwise be disallowed by potentially aliasing stores. (Reminder: In EarlyCSE everything clobbers everything by default.) This should be compatible with the MemorySSA variant, but design is generational. We can and should add first class support for invariant.start within MemorySSA at a later time. I took a quick look at doing so, but probably need some input from a MemorySSA expert. Differential Revision: https://reviews.llvm.org/D43716 llvm-svn: 327577	2018-03-14 21:35:06 +00:00
Hiroshi Yamauchi	e6a3dc7699	Simplify more cases of logical ops of masked icmps. Summary: For example, ((X & 255) != 0) && ((X & 15) == 8) -> ((X & 15) == 8). ((X & 7) != 0) && ((X & 15) == 8) -> false. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43835 llvm-svn: 327450	2018-03-13 21:13:18 +00:00
Anna Thomas	5ac72f94f3	Test Commit NFC. Updated comment llvm-svn: 327436	2018-03-13 19:38:45 +00:00
Haicheng Wu	aee0af3e23	[SLP] clean some formats llvm-svn: 327433	2018-03-13 18:44:19 +00:00
Rafael Espindola	f5220fb68f	[ThinLTO] Clear dllimport when setting dso_local. This is PR36686. If a user of a library is LTOed with that library we take the opportunity to set dso_local, but we don't clear dllimport, which creates an invalid IR. llvm-svn: 327408	2018-03-13 15:24:51 +00:00
Sanjay Patel	204edeca56	[InstCombine] fix fmul reassociation to avoid creating an extra fdiv This was supposed to be an NFC refactoring that will eventually allow eliminating the isFast() predicate, but there's a rare possibility that we would pessimize the code as shown in the test case because we failed to check 'hasOneUse()' properly. This version also removes an inefficiency of the old code; we would look for: (X * C) * C1 --> X * (C * C1) ...but that pattern is always handled by SimplifyAssociativeOrCommutative(). llvm-svn: 327404	2018-03-13 14:46:32 +00:00
Daniel Neilson	41e781d5f1	[SROA] Take advantage of separate alignments for memcpy source and destination Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the SROA pass to cease using the old getAlignment() & setAlignment() APIs of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. This allows us to enhance visitMemTransferInst to be more aggressive setting the alignment in memcpy calls that it creates, as well as to only change the alignment of a memcpy/memmove argument that it replaces. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: chandlerc, bollu, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D42974 llvm-svn: 327398	2018-03-13 14:25:33 +00:00
Eugene Leviant	6f42a2cd91	[Evaluator] Evaluate load/store with bitcast Differential revision: https://reviews.llvm.org/D43457 llvm-svn: 327381	2018-03-13 10:19:50 +00:00
Clement Courbet	9f0b3170bc	[MergeICmps] Make sure that the comparison only has one use. Summary: Fixes PR36557. Reviewers: trentxintong, spatel Subscribers: mstorsjo, llvm-commits Differential Revision: https://reviews.llvm.org/D44083 llvm-svn: 327372	2018-03-13 07:05:55 +00:00
Vlad Tsyrklevich	aab6000684	Reland r327041: [ThinLTO] Keep available_externally symbols live Summary: This change fixes PR36483. The bug was originally introduced by a change that marked non-prevailing symbols dead. This broke LowerTypeTests handling of available_externally functions, which are non-prevailing. LowerTypeTests uses liveness information to avoid emitting thunks for unused functions. Marking available_externally functions dead is incorrect, the functions are used though the function definitions are not. This change keeps them live, and lets the EliminateAvailableExternally/GlobalDCE passes remove them later instead. (Reland with a suspected fix for a unit test failure I haven't been able to reproduce locally) Reviewers: pcc, tejohnson Reviewed By: tejohnson Subscribers: grimar, mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D43690 llvm-svn: 327360	2018-03-13 05:08:48 +00:00
Saleem Abdulrasool	f159a389df	ObjCARC: address review comments from majnemer I forgot to incorporate these comments into the original revision. This is just code cleanup addressing the feedback, NFC. llvm-svn: 327351	2018-03-12 23:48:20 +00:00
Volkan Keles	4ecdb44a64	BlockExtractor: Don’t delete functions directly Blocks may have function calls, so don’t erase functions directly to avoid erasing a function that has a user. llvm-svn: 327340	2018-03-12 22:28:18 +00:00
Saleem Abdulrasool	8b342680bf	ObjCARC: teach the cloner about funclets In the case that the CallInst that is being moved has an associated operand bundle which is a funclet, the move will construct an invalid instruction. The new site will have a different token and needs to be reassociated with the new instruction. Unfortunately, there is no way to alter the bundle after the construction of the instruction. Replace the call instruction cloning with a custom helper to clone the instruction and reassociate the funclet token. llvm-svn: 327336	2018-03-12 21:46:09 +00:00
Vedant Kumar	3a408538f0	Remove the LoopInstSimplify pass (-loop-instsimplify) LoopInstSimplify is unused and untested. Reading through the commit history the pass also seems to have a high maintenance burden. It would be best to retire the pass for now. It should be easy to recover if we need something similar in the future. Differential Revision: https://reviews.llvm.org/D44053 llvm-svn: 327329	2018-03-12 20:49:42 +00:00
Michael Zolotukhin	a3d8ef0f08	Improve caching scheme in ProvenanceAnalysis. Summary: ProvenanceAnalysis::related(A, B) currently memoizes its results, and on big tests the cache grows too large, and we're spending most of the time growing/looking through DenseMap. This patch reduces the size of the cache by normalizing keys first: we do that by calling GetUnderlyingObjCPtr on the input values. The results of GetUnderlyingObjCPtr are also memoized in a separate cache. The patch doesn't bring noticeable changes to compile time on CTMark, but it significantly helps one of our internal tests. Reviewers: gottesmm Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44270 llvm-svn: 327328	2018-03-12 20:36:25 +00:00
Craig Topper	ee99aa4dd0	[InstCombine] Replace calls to getNumUses with hasNUses or hasNUsesOrMore getNumUses is a linear time operation. It traverses the user linked list to the end and counts as it goes. Since we are only interested in small constant counts, we should use hasNUses or hasNUsesOrMore, which terminate the traversal as soon as they can provide the answer. There are still two other locations in InstCombine, but changing those would force a rebase of D44266 which if accepted would remove them. Differential Revision: https://reviews.llvm.org/D44398 llvm-svn: 327315	2018-03-12 18:46:05 +00:00
Craig Topper	3b4ad9c12d	[CallSiteSplitting] Use !Instruction::use_empty instead of checking for a non-zero return from getNumUses getNumUses is a linear operation. It walks a linked list to get a count. So in this case it's better to just ask if there are any users rather than how many. llvm-svn: 327314	2018-03-12 18:40:59 +00:00
Eugene Leviant	19e238746b	[ThinLTO] Recommit of import global variables This was reverted in r326638 due to link problems and fixed afterwards llvm-svn: 327254	2018-03-12 10:30:50 +00:00
Justin Lebar	24b6640b1b	Back out "Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." This reverts r326908, originally landed as D44102. Reverted for causing performance regressions on x86. (These regressions are not yet understood.) llvm-svn: 327252	2018-03-12 09:26:09 +00:00
Florian Hahn	a7dcfa746e	[PartialInlining] Use isInlineViable to detect constructs preventing inlining. Use isInlineViable to prevent inlining of functions with non-inlinable constructs, in case cost analysis is skipped. Reviewers: efriedma, sfertile, davide, davidxl Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D42846 llvm-svn: 327207	2018-03-10 14:53:44 +00:00
Ulrich Weigand	019dd2316d	Revert "[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf" This reverts commit r327175 as problems in debug info generation were shown. llvm-svn: 327176	2018-03-09 22:00:10 +00:00
Ulrich Weigand	fa4e63c0d6	[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf When hoisting common code from the "then" and "else" branches of a condition to before the "if", there is no need to require that debug intrinsics match before moving them (and merging them). Instead, we can simply always keep all debug intrinsics from both sides of the "if". This fixes PR36410, which describes a problem where as a result of the attempt to merge debug locations for two debug intrinsics we end up with an invalid intrinsic, where the scope indicated in the !dbg location no longer matches the scope of the variable tracked by the intrinsic. In addition, this has the benefit that we no longer throw away information that is actually still valid, helping to generate better debug data. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D44312 llvm-svn: 327175	2018-03-09 21:37:07 +00:00
Renato Golin	038ede2a16	[NFC] Consolidate six getPointerOperand() utility functions into one place There are six separate instances of getPointerOperand() utility. LoopVectorize.cpp has one of them, and I don't want to create a 7th one while I'm trying to move LoopVectorizationLegality into a separate file (eventual objective is to move it to Analysis tree). See http://lists.llvm.org/pipermail/llvm-dev/2018-February/120999.html for llvm-dev discussions Closes D43323. Patch by Hideki Saito <hideki.saito@intel.com>. llvm-svn: 327173	2018-03-09 21:05:58 +00:00
Peter Collingbourne	2974856ad4	Use branch funnels for virtual calls when retpoline mitigation is enabled. The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the branch predictor, and as a result it can lead to a measurable loss of performance. We can reduce the performance impact of retpolined virtual calls by replacing them with a special construct known as a branch funnel, which is an instruction sequence that implements virtual calls to a set of known targets using a binary tree of direct branches. This allows the processor to speculatively execute valid implementations of the virtual function without allowing for speculative execution of calls to arbitrary addresses. This patch extends the whole-program devirtualization pass to replace certain virtual calls with calls to branch funnels, which are represented using a new llvm.icall.jumptable intrinsic. It also extends the LowerTypeTests pass to recognize the new intrinsic, generate code for the branch funnels (x86_64 only for now) and lay out virtual tables as required for each branch funnel. The implementation supports full LTO as well as ThinLTO, and extends the ThinLTO summary format used for whole-program devirtualization to support branch funnels. For more details see RFC: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html Differential Revision: https://reviews.llvm.org/D42453 llvm-svn: 327163	2018-03-09 19:11:44 +00:00
Chad Rosier	95d9ccb2a0	[JumpThreading] Don't restrict cast-traversal to i1 In r263618, JumpThreading learned to look through simple cast instructions, but only if the source of those cast instructions was a phi/cmp i1 (in an effort to limit compile time effects). I think this condition is too restrictive. For switches with limited value range, InstCombine will readily introduce an extra trunc instruction to a smaller integer type (e.g. from i8 to i2), leaving us in the somewhat perverse situation that jump-threading would work before running instcombine, but not after. Since instcombine produces this pattern, I think we need to consider it canonical and support it in JumpThreading. In general, for limiting recursion, I think the existing restriction to phi and cmp nodes should be sufficient to avoid looking through unprofitable chains of instructions. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42262 llvm-svn: 327150	2018-03-09 16:43:46 +00:00
Renato Golin	6b62039bb0	[LV] Fix vectorizer's isUniform() abuse triggers assert in SCEV Fixes PR36311. See more detailed analysis in https://bugs.llvm.org/show_bug.cgi?id=36311. isUniform() information is recomputed after LV started transforming the underlying IR and that triggered an assert in SCEV. From vectorizer's architectural perspective, such information, while still useful in vector code gen, should not be recomputed after the start of transforming the LLVM IR. Instead, we should collect and cache such information during the analysis phase of LV and use the cached info during code gen. From the symptom perspective, this assert as it stands right now is not very useful. Legality already rejected loops that would trigger the assert. As such, commenting out the assert is NFC from vectorizer's functionality perspective. On top of that, just above the assertion, we check for unit-strided load/store or gather scatter. Addresses can't be uniform below that check. From vectorization theory point of view, we don't have to reject all cases of stores to uniform addresses. Eventually, we should support safe/profitable cases. This patch resolves the issue by removing the useless assertion that is invoking LAA's isUniform() that requires up-to-date DomTree ---- once vector code gen starts modifying CFG, we don't have an up-to-date DomTree. Patch by Hideki Saito <hideki.saito@intel.com>. llvm-svn: 327109	2018-03-09 10:31:31 +00:00
Eric Christopher	3caa0fd050	Revert "[ThinLTO] Keep available_externally symbols live" This reverts commit r327041 and the followup attempts at fixing the testcase as they're still failing. llvm-svn: 327094	2018-03-09 01:25:18 +00:00
Adrian Prantl	5b477be72a	LowerDbgDeclare: ignore dbg.declares for allocas with volatile access There is no point in lowering a dbg.declare describing an alloca that has volatile loads or stores as users, since the alloca cannot be elided. Lowering the dbg.declare will result in larger debug info that may also have worse coverage than just describing the alloca. rdar://problem/34496278 llvm-svn: 327092	2018-03-09 00:45:04 +00:00
Philip Reames	fbffd126b8	[NFC] Factor out a helper function for checking if a block has a potential early implicit exit. llvm-svn: 327065	2018-03-08 21:25:30 +00:00
Kuba Mracek	8842da8e07	[asan] Fix a false positive ODR violation due to LTO ConstantMerge pass [llvm part, take 3] This fixes a false positive ODR violation that is reported by ASan when using LTO. In cases, where two constant globals have the same value, LTO will merge them, which breaks ASan's ODR detection. Differential Revision: https://reviews.llvm.org/D43959 llvm-svn: 327061	2018-03-08 21:02:18 +00:00
Kuba Mracek	f0bcbfef5c	Revert r327053. llvm-svn: 327055	2018-03-08 20:13:39 +00:00
Kuba Mracek	584bd10803	[asan] Fix a false positive ODR violation due to LTO ConstantMerge pass [llvm part, take 2] This fixes a false positive ODR violation that is reported by ASan when using LTO. In cases, where two constant globals have the same value, LTO will merge them, which breaks ASan's ODR detection. Differential Revision: https://reviews.llvm.org/D43959 llvm-svn: 327053	2018-03-08 20:05:45 +00:00
Vlad Tsyrklevich	7b66ef1036	[ThinLTO] Keep available_externally symbols live Summary: This change fixes PR36483. The bug was originally introduced by a change that marked non-prevailing symbols dead. This broke LowerTypeTests handling of available_externally functions, which are non-prevailing. LowerTypeTests uses liveness information to avoid emitting thunks for unused functions. Marking available_externally functions dead is incorrect, the functions are used though the function definitions are not. This change keeps them live, and lets the EliminateAvailableExternally/GlobalDCE passes remove them later instead. I've also enabled EliminateAvailableExternally for all optimization levels, I believe it being disabled for O1 was an oversight. Reviewers: pcc, tejohnson Reviewed By: tejohnson Subscribers: grimar, mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D43690 llvm-svn: 327041	2018-03-08 18:48:03 +00:00
Kuba Mracek	e834b22874	Revert r327029 llvm-svn: 327033	2018-03-08 17:32:00 +00:00
Kuba Mracek	0e06d37dba	[asan] Fix a false positive ODR violation due to LTO ConstantMerge pass [llvm part] This fixes a false positive ODR violation that is reported by ASan when using LTO. In cases, where two constant globals have the same value, LTO will merge them, which breaks ASan's ODR detection. Differential Revision: https://reviews.llvm.org/D43959 llvm-svn: 327029	2018-03-08 17:24:06 +00:00
Farhana Aleen	89196642f7	[AMDGPU] Increased vector length for global/constant loads. Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D44179 llvm-svn: 326910	2018-03-07 17:09:18 +00:00
Justin Lebar	eccfbf1bcd	Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for failing an assert in clang bootstrap builds. Re-landing with a fix for handling non-power-of-two inputs (e.g. udiv i24). Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326908	2018-03-07 16:56:49 +00:00
Farhana Aleen	347d12b4ce	Revert "[AMDGPU] Widened vector length for global/constant address space." This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03. llvm-svn: 326907	2018-03-07 16:55:27 +00:00
Farhana Aleen	0d03d0588d	[AMDGPU] Widened vector length for global/constant address space. llvm-svn: 326904	2018-03-07 16:29:05 +00:00
Justin Lebar	eeeb0eb049	Revert rL326898: "Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." Breaks bootstrap builds: clang built with this patch asserts while building MCDwarf.cpp: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. llvm-svn: 326900	2018-03-07 16:05:43 +00:00
Justin Lebar	cb9e89c39b	Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Reviewers: spatel, sanjoy Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326898	2018-03-07 15:11:13 +00:00
Sven van Haastregt	19f531d31e	[LoadStoreVectorizer] Differentiate between <1 x T> and T The LoadStoreVectorizer thought that <1 x T> and T were the same types when merging stores, leading to a crash later. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D44014 llvm-svn: 326884	2018-03-07 10:29:28 +00:00
Evgeny Stupachenko	204ade4102	Add early exit on reassociation of 0 expression. Summary: Before the patch, an attempt to reassociate ((v * 16) * 0) * 1 fell into an infinite loop. Reviewers: pankajchawla Differential Revision: http://reviews.llvm.org/D41467 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 326861	2018-03-07 02:17:08 +00:00
Eugene Zelenko	e2fc88a2fe	[Transforms] Add missing header for InstructionCombining.cpp, in order to export LLVMInitializeInstCombine as extern "C". Fixes PR35947. Patch by Brenton Bostick. Differential revision: https://reviews.llvm.org/D44140 llvm-svn: 326843	2018-03-06 23:06:13 +00:00
Sebastian Pop	bf6e1c26cf	DA: remove uses of GEP, only ask SCEV It's been quite some time the Dependence Analysis (DA) is broken, as it uses the GEP representation to "identify" multi-dimensional arrays. It even wrongly detects multi-dimensional arrays in single nested loops: from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6 ;; for (long int i = 0; i < 50; i++) { ;; A[i][3*i - 6] = i; ;; *B++ = A[i][i]; DA used to detect two subscripts, which makes no sense in the LLVM IR or in C/C++ semantics, as there are no guarantees as in Fortran of subscripts not overlapping into a next array dimension: maximum nesting levels = 1 SrcPtrSCEV = %A DstPtrSCEV = %A using GEPs subscript 0 src = {0,+,1}<nuw><nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} subscript 1 src = {-6,+,3}<nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} Separable = {} Coupled = {1} With the current patch, DA will correctly work on only one dimension: maximum nesting levels = 1 SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body> DstSCEV = {%A,+,404}<%for.body> subscript 0 src = {(-2424 + %A)<nsw>,+,1212}<%for.body> dst = {%A,+,404}<%for.body> class = 1 loops = {1} Separable = {0} Coupled = {} This change removes all uses of GEP from DA, and we now only rely on the SCEV representation. The patch does not turn on -da-delinearize by default, and so the DA analysis will be more conservative in the case of multi-dimensional memory accesses in nested loops. I disabled some interchange tests, as the DA is not able to disambiguate the dependence anymore. To make DA stronger, we may need to compute a bound on the number of iterations based on the access functions and array dimensions. The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to avoid checking for snippets of LLVM IR: this form of checking is very hard to maintain. Instead, we now check for output of the pass that are more meaningful than dozens of lines of LLVM IR. Some tests now require -debug messages and thus only enabled with asserts. Patch written by Sebastian Pop and Aditya Kumar. Differential Revision: https://reviews.llvm.org/D35430 llvm-svn: 326837	2018-03-06 21:55:59 +00:00
Sanjay Patel	1f2f5d18d3	[InstCombine] simplify min/max canonicalization; NFCI llvm-svn: 326828	2018-03-06 19:01:18 +00:00
Sanjay Patel	7ed0bc26ac	[ValueTracking] move helpers for SelectPatterns from InstCombine to ValueTracking Most of the folds based on SelectPatternResult belong in InstSimplify rather than InstCombine, so the helper code should be available to other passes/analysis. llvm-svn: 326812	2018-03-06 16:57:55 +00:00
Florian Hahn	517dc51c48	[CallSiteSplitting] Do not crash when BB's terminator changes. Change doCallSiteSplitting to iterate until we reach the terminator instruction. tryToSplitCallSite can replace BB's terminator in case BB is a successor of itself. Then IE will be invalidated and we also have to check the current terminator. Reviewers: junbuml, davidxl, davide, fhahn Reviewed By: fhahn, junbuml Differential Revision: https://reviews.llvm.org/D43824 llvm-svn: 326793	2018-03-06 14:00:58 +00:00
Florian Hahn	f0a25f7253	[CloneFunction] Support BB == PredBB in DuplicateInstructionsInSplit. In case PredBB == BB and StopAt == BB's terminator, StopAt != &*BI will fail, because BB's terminator instruction gets replaced. By using BB.getTerminator() we get the current terminator which we can use to compare. Reviewers: sanjoy, anna, reames Reviewed By: anna Differential Revision: https://reviews.llvm.org/D43822 llvm-svn: 326779	2018-03-06 13:12:32 +00:00
Xin Tong	8fd561f572	[MergeICmp] Simplify how BCECmpBlock instructions are blacklisted llvm-svn: 326761	2018-03-06 02:24:02 +00:00
Xin Tong	98af9efca5	[MergeICmp] Fix printing. NFC llvm-svn: 326760	2018-03-06 02:04:57 +00:00
Daniel Neilson	82daad31fe	[RewriteStatepoints] Fix stale parse points Summary: RewriteStatepointsForGC collects parse points for further processing. During the collection if a callsite is found in an unreachable block (DominatorTree::isReachableFromEntry()) then all unreachable blocks are removed by removeUnreachableBlocks(). Some of the removed blocks could have been reachable according to DominatorTree::isReachableFromEntry(). In this case the collected parse points became stale and resulted in a crash when accessed. The fix is to unconditionally canonicalize the IR to removeUnreachableBlocks and then collect the parse points. The added test crashes with the old version and passes with this patch. Patch by Yevgeny Rouban! Reviewed by: Anna Differential Revision: https://reviews.llvm.org/D43929 llvm-svn: 326748	2018-03-05 22:27:30 +00:00
Daniel Neilson	bdda115e19	[InstCombine] Don't blow up in foldICmpWithCastAndCast on vector icmp instructions. Summary: Presently, InstCombiner::foldICmpWithCastAndCast() implicitly assumes that it is only invoked with icmp instructions of integer type. If that assumption is broken, and it is called with an icmp of vector type, then it fails (asserts/crashes). This patch addresses the deficiency. It allows it to simplify icmp (ptrtoint x), (ptrtoint/c) of vector type into a compare of the inputs, much as is done when the type is integer. Reviewers: apilipenko, fedor.sergeev, mkazantsev, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44063 llvm-svn: 326730	2018-03-05 18:05:51 +00:00
Craig Topper	8452faceae	[InstCombine] Add constant vector support to getMinimumFPType for visitFPTrunc. This patch teaches getMinimumFPType to support shrinking a vector of ConstantFPs. This should improve our ability to combine vector fptrunc with fp binops. Differential Revision: https://reviews.llvm.org/D43774 llvm-svn: 326729	2018-03-05 18:04:12 +00:00
Florian Hahn	0b7c6422fb	[IPSCCP] Add getCompare which returns either true, false, undef or null. getCompare returns true, false or undef constants if the comparison can be evaluated, or nullptr if it cannot. This is in line with what ConstantExpr::getCompare returns. It also allows us to use ConstantExpr::getCompare for comparing constants. Reviewers: davide, mssimpso, dberlin, anna Reviewed By: davide Differential Revision: https://reviews.llvm.org/D43761 llvm-svn: 326720	2018-03-05 17:33:50 +00:00
Sanjay Patel	53ffabdfcb	[CVP] fix formatting; NFC llvm-svn: 326711	2018-03-05 16:08:34 +00:00
Xin Tong	8345c0e3a5	[MergeICmp] We can discard initial blocks that do other work Summary: We can discard initial blocks that do other work We do not need to limit ourselves to just the first block in the chain. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44029 llvm-svn: 326698	2018-03-05 13:54:47 +00:00
Clement Courbet	34be1b0288	[MergeICmps][NFC] Improve logging. llvm-svn: 326683	2018-03-05 08:21:47 +00:00
Fedor Indutny	364b9c2adb	[CallSiteSplitting] fix use-after-free Iterating through predecessors of `TailBB` while removing their terminators leads to use-after-free, because the predecessor list is changing on each removal. llvm-svn: 326668	2018-03-03 22:34:38 +00:00
Fedor Indutny	f9e09c1dd0	[CallSiteSplitting] properly split musttail calls Summary: `musttail` calls can't be naively split. The split blocks must include not only the call instruction itself, but also (optional) `bitcast` and `return` instructions that follow it. Clone `bitcast` and `ret`, place them into the split blocks, and remove the tail block when done. Reviewers: junbuml, mcrosier, davidxl, davide, fhahn Reviewed By: fhahn Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D43729 llvm-svn: 326666	2018-03-03 21:40:14 +00:00
Sanjay Patel	1a8d5c3d1f	[InstCombine] (~X) - (~Y) --> Y - X llvm-svn: 326660	2018-03-03 17:53:25 +00:00
Chandler Carruth	a4619d9944	[ThinLTO] Revert r325320: Import global variables This caused some links to fail with ThinLTO due to missing symbols as well as causing some binaries to have failures at runtime. We're working with the author to get a test case, but want to get the tree green again. Further, it appears to introduce a data race. While the test usage of threads was disabled in r325361 & r325362, that isn't an acceptable fix. I've reverted both of these as well. This code needs to be thread safe. Test cases for this are already on the original commit thread. llvm-svn: 326638	2018-03-02 23:40:08 +00:00
Vedant Kumar	7fc591f8bb	[AggressiveInstCombine] Use use_empty() instead of !getNumUses(), NFC use_empty() runs in O(1), whereas getNumUses() runs in O(# uses). llvm-svn: 326635	2018-03-02 23:22:49 +00:00
Sanjay Patel	e29375d04c	[InstCombine] rearrange visitFMul; NFCI Put the simplest non-FMF folds first, so it's easier to see what's left to fix/group/add with the FMF folds. llvm-svn: 326632	2018-03-02 23:06:45 +00:00
Vedant Kumar	f69baf64eb	[Utils] Salvage debug info in block simplification In stage2 -O3 builds of llc, this results in small but measurable increases in the number of variables with locations, and in the number of unique source variables overall. (According to llvm-dwarfdump --statistics, there are 123 additional variables with locations, which is just a 0.006% improvement). The size of the .debug_loc section of the llc dsym increases by 0.004%. llvm-svn: 326629	2018-03-02 22:46:48 +00:00
Vedant Kumar	334fa57456	[Utils] Salvage debug info in recursive inst deletion In stage2 -O3 builds of llc, this results in a 0.3% increase in the number of variables with locations, and a 0.2% increase in the number of unique source variables overall. The size of the .debug_loc section of the llc dsym increases by 0.5%. llvm-svn: 326621	2018-03-02 21:36:35 +00:00
Craig Topper	c7461e1aad	[InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again. Instead of returning the smaller FP constant we now return the minimal Type the constant can fit into. We also return the Type of the input to any fp extends. The legality checks are then done on just the size of these Types. If we find something profitable we then emit FPTruncs in front of the smaller binop and assume those FPTruncs will be constant folded or combined with any ConstantFPs or fpextends. Differential Revision: https://reviews.llvm.org/D44038 llvm-svn: 326617	2018-03-02 21:25:18 +00:00
Sanjay Patel	2fd0acf05a	[InstCombine] partly fix FMF for fmul+log2 fold The code was checking that all of the instructions in the sequence are 'fast', but that's not necessary. The final multiply is all that we need to check (tests adjusted). The fmul doesn't need to be fully 'fast' either, but that can be another patch. llvm-svn: 326608	2018-03-02 20:32:46 +00:00
Yaxun Liu	3c42f1c3c9	LoopUnroll: respect pragma unroll when AllowRemainder is disabled Currently when AllowRemainder is disabled, pragma unroll count is not respected even though there is no remainder. This bug causes a loop to be fully unrolled in many cases even though the user specifies an unroll count. Especially it affects OpenCL/CUDA since in many cases a loop contains convergent instructions and currently AllowRemainder is disabled for such loops. Differential Revision: https://reviews.llvm.org/D43826 llvm-svn: 326585	2018-03-02 16:22:32 +00:00
Florian Hahn	515acd64fd	[LV][CFG] Add irreducible CFG detection for outer loops This patch adds support for detecting outer loops with irreducible control flow in LV. Current detection uses SCCs and only works for innermost loops. This patch adds a utility function that works on any CFG, given its RPO traversal and its LoopInfoBase. This function is a generalization of isIrreducibleCFG from lib/CodeGen/ShrinkWrap.cpp. The code in lib/CodeGen/ShrinkWrap.cpp is also updated to use the new generic utility function. Patch by Diego Caballero <diego.caballero@intel.com> Differential Revision: https://reviews.llvm.org/D40874 llvm-svn: 326568	2018-03-02 12:24:25 +00:00
Fedor Indutny	1571b1271e	[ArgumentPromotion] don't break musttail invariant PR36543 Summary: Do not break musttail invariant by promoting arguments of musttail callee or caller. Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk Reviewed By: rnk Subscribers: rnk, llvm-commits Differential Revision: https://reviews.llvm.org/D43926 llvm-svn: 326521	2018-03-02 00:59:27 +00:00
Sanjay Patel	d0cdb2f861	[InstCombine] allow fmul fold with less than 'fast' This is a retry of r326502 with updates to the reassociate test file that I missed the first time. @test15_reassoc in the supposed -reassociate test file (except that it tests 2 other passes too...) shows that there's no clear responsibility for reassociation transforms. Instcombine now gets that case, but only because the constant values are identical. Otherwise, it would still miss that pattern. Reassociate doesn't get that case because it hasn't been updated to use less than 'fast' FMF. llvm-svn: 326513	2018-03-02 00:14:51 +00:00
Sanjay Patel	eb5d046890	revert r326502: [InstCombine] allow fmul fold with less than 'fast' I forgot that I added tests for 'reassoc' to -reassociate, but surprisingly that file calls -instcombine too, so it is affected. I'll update that file and try again. llvm-svn: 326510	2018-03-01 23:39:24 +00:00
Sanjay Patel	7373ae5c9a	[InstCombine] allow fmul fold with less than 'fast' llvm-svn: 326502	2018-03-01 22:53:47 +00:00
Craig Topper	2915bc0046	[SimplifyLibCalls] Update an obviously copy and pasted header comment to match this file. NFC llvm-svn: 326475	2018-03-01 20:05:09 +00:00
Sanjay Patel	f3b1af7aa4	[InstCombine] simplify code for (X*Y) * X => (X*X) * Y ; NFCI llvm-svn: 326444	2018-03-01 15:50:26 +00:00
Benjamin Kramer	d1cf7ff5ab	[SCCP] Fix unused variable warning in release builds. llvm-svn: 326429	2018-03-01 11:31:44 +00:00
Reid Kleckner	3762a089d7	[IPSCCP] do not break musttail invariant (PR36485) Do not replace results of `musttail` calls with a constant if the call itself can't be removed. Do not zap returns of `musttail` callees, if the call site can't be removed and replaced with a constant. Do not zap returns of `musttail`-calling blocks, this breaks invariant too. Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43695 llvm-svn: 326404	2018-03-01 01:19:18 +00:00
Reid Kleckner	cb9611ca67	[DAE] don't remove args of musttail target/caller `musttail` requires identical signatures of caller and callee. Removing arguments breaks `musttail` semantics. PR36441 Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43708 llvm-svn: 326394	2018-03-01 00:09:35 +00:00
Sanjay Patel	eaf5a120ed	[InstCombine] simplify code for X * -1.0 --> -X; NFC I've added random FMF to one of the tests to show those are propagated. llvm-svn: 326377	2018-02-28 22:30:04 +00:00
Jonas Devlieghere	9ca064552a	[GlobalOpt] don't change CC of musttail calle(e|r) When the function has musttail call - its cc is fixed to be equal to the cc of the musttail callee. In such case (and in the case of the musttail callee), GlobalOpt should not change the cc to fastcc as it will break the invariant. This fixes PR36546 Patch by: Fedor Indutny (indutny) Differential revision: https://reviews.llvm.org/D43859 llvm-svn: 326376	2018-02-28 22:28:44 +00:00
Craig Topper	b95298b041	[InstCombine] Split the FP constant code out of lookThroughFPExtensions and use nullptr as a sentinel Currently this code's control flow very much assumes that there are no meaningful checks after determining that it's a ConstantFP. So whenever it wants to stop it just does "return V". But V is also the variable name it uses when it wants to return a new value. So 'return V' appears multiple times with different meanings. This patch just moves all the code into a helper function and returns nullptr when it wants to stop. I've split this from D43774 while I try to figure out how to best handle the vector case there. But this change by itself at least seemed like a readability improvement. Differential Revision: https://reviews.llvm.org/D43833 llvm-svn: 326361	2018-02-28 20:14:34 +00:00
Vedant Kumar	9a041a7522	[InstrProfiling] Emit the runtime hook when no counters are lowered The API verification tool tapi has difficulty processing frameworks which enable code coverage, but which have no code. The profile lowering pass does not emit the runtime hook in this case because no counters are lowered. While the hook is not needed for program correctness (the profile runtime doesn't have to be linked in), it's needed to allow tapi to validate the exported symbol set of instrumented binaries. It was not possible to add a workaround in tapi for empty binaries due to an architectural issue: tapi generates its expected symbol set before it inspects a binary. Changing that model has a higher cost than simply forcing llvm to always emit the runtime hook. rdar://36076904 Differential Revision: https://reviews.llvm.org/D43794 llvm-svn: 326350	2018-02-28 19:00:08 +00:00
Sanjay Patel	b3f4f62698	[InstCombine] move invariant call out of loop; NFC We really shouldn't need a 2-loop here at all, but that's another cleanup. llvm-svn: 326330	2018-02-28 16:50:51 +00:00
Sanjay Patel	8fdd87f929	[InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly vague. The constant check is redundant in some cases, but it allows removing duplication for most of the calls. llvm-svn: 326329	2018-02-28 16:36:24 +00:00
Xin Tong	256869d8bc	Fix typo. NFC llvm-svn: 326319	2018-02-28 12:09:53 +00:00
Xin Tong	8ba674e43b	[MergeICmp] Fix a bug in MergeICmp that can lead to a block being processed more than once. Summary: Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module. The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when the its parent block is processed and second time when the last block is processed. We end up having 2 same BCECmpBlock in the merge queue. And eventually lead to a broken LLVM module. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43825 llvm-svn: 326318	2018-02-28 12:08:00 +00:00
David Green	7c35de124a	[Dominators] Remove verifyDomTree and add some verifying for Post Dom Trees Removes verifyDomTree, using assert(verify()) everywhere instead, and changes verify a little to always run IsSameAsFreshTree first in order to print good output when we find errors. Also adds verifyAnalysis for PostDomTrees, which will allow checking of PostDomTrees in the same way we check DomTrees and MachineDomTrees. Differential Revision: https://reviews.llvm.org/D41298 llvm-svn: 326315	2018-02-28 11:00:08 +00:00
Florian Hahn	1807c516c7	[NewGVN] Update phi-of-ops def block when updating existing ValuePHI. In case we update a ValuePHI node created earlier, we could update it based on a different OpPHI which could be in a different block. We need to update the TempToBlock mapping reflecting the new block, otherwise we would end up placing the new phi node in a wrong block. This problem is exposed by the test case in https://bugs.llvm.org/show_bug.cgi?id=36504. This patch fixes a slightly simpler problem than in the bug report. In the bug's reproducer, the additional problem is that we are re-using a ValuePHI node with too few incoming values for the new OpPHI. If this patch makes sense, I will follow it up with a patch that creates a new PHI node if the existing PHI node has a different number of incoming values. Reviewers: davide, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43770 llvm-svn: 326181	2018-02-27 09:34:51 +00:00
Sanjay Patel	31a90468e1	[InstCombine] allow fdiv folds with less than fully 'fast' ops Note: gcc appears to allow this fold with -freciprocal-math alone, but clang/llvm require more than that with this patch. The wording in the definitions seems fuzzy enough that it could go either way, but we'll err on the conservative side of FMF interpretation. This patch also changes the newly created fmul to have FMF propagated by the last fdiv rather than intersecting the FMF of the fdivs. This matches the behavior of other folds near here. The new fmul is only used to produce an intermediate op for the final fdiv result, so it shouldn't be any stricter than that result. The previous behavior could result in dropping FMF via other folds in instcombine or CSE. Differential Revision: https://reviews.llvm.org/D43398 llvm-svn: 326098	2018-02-26 16:02:45 +00:00
Renato Golin	9d1b2acaaa	[LV] Move isLegalMasked* functions from Legality to CostModel All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability. This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately. Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions. Closes D43208. Patch by: Hideki Saito <hideki.saito@intel.com> llvm-svn: 326079	2018-02-26 11:06:36 +00:00
Florian Hahn	a1822cbabc	[LoopInterchange] Loops with empty dependency matrix are safe. The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077	2018-02-26 10:45:25 +00:00
Adam Nemet	e4e1de60aa	Revert "StructurizeCFG: Test for branch divergence correctly" This reverts commit r325881. Breaks many bots llvm-svn: 326037	2018-02-24 17:29:09 +00:00
Sanjay Patel	2db2769499	[InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFC llvm-svn: 325968	2018-02-23 22:38:10 +00:00
Sanjay Patel	db53d1847b	[InstSimplify] sqrt(X) * sqrt(X) --> X This was misplaced in InstCombine. We can loosen the FMF as a follow-up step. llvm-svn: 325965	2018-02-23 22:20:13 +00:00
Sanjay Patel	d32104e1b2	[InstCombine] allow fmul-sqrt folds with less than full -ffast-math Also, add a Builder method for intrinsics to reduce code duplication for clients. llvm-svn: 325960	2018-02-23 21:16:12 +00:00
Matt Davis	523c656e25	[Debug] Add dbg.value intrinsics for PHIs created during LCSSA. Summary: This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>. Specifically this case: ``` int bar(int x, int y) { return x + y; } int foo(int size) { int val = 0; for (int i = 0; i < size; ++i) { val = bar(val, i); // Both val and i are correct } return val; // <optimized out> } ``` In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this. This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA). Reviewers: mzolotukhin, aprantl, vsk, davide Reviewed By: aprantl, vsk Subscribers: dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 325926	2018-02-23 17:38:27 +00:00
Sanjay Patel	6b9c7a9c83	[InstCombine] refactor fmul with negated op folds; NFCI The existing code was inefficiently looking for 'nsz' variants. That's unnecessary because we canonicalize those to the expected form with -0.0. We may also want to adjust or remove the fold that sinks negation. We don't do that for fdiv (or integer ops?). That should be uniform? It may also lead to missed optimization as in PR21914: https://bugs.llvm.org/show_bug.cgi?id=21914 ...or we just have to fix other passes to avoid that problem. llvm-svn: 325924	2018-02-23 17:14:28 +00:00
Sanjay Patel	4a9116e897	[InstCombine] use FMF-copying functions to reduce code; NFCI llvm-svn: 325923	2018-02-23 17:07:29 +00:00
Nicolai Haehnle	43c1115cd4	StructurizeCFG: Test for branch divergence correctly Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Reviewers: arsenm, rampitec, jlebar Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D40546 llvm-svn: 325881	2018-02-23 10:45:46 +00:00
Bjorn Steinbrink	983d6c3f18	Mark MergedLoadStoreMotion as not preserving MemDep results Summary: MemDep caches results that signify that a dependence is non-local, and there is currently no way to invalidate such cache entries. Unfortunately, when MLSM sinks a store, that can result in a non-local dependence becoming a local one, and then MemDep gives wrong answers. The easiest way out here is to just say that MLSM does indeed not preserve MemDep results. Reviewers: davide, Gerolf Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43177 llvm-svn: 325880	2018-02-23 10:41:57 +00:00
Eric Christopher	675dcf02a8	Update comment for whether or not we can optimize an alias - we're checking the alias and not the aliasee. If the alias can be interposed then we shouldn't do anything. llvm-svn: 325837	2018-02-22 23:12:11 +00:00
Peter Collingbourne	32f5405bff	Fix DataFlowSanitizer instrumentation pass to take parameter position changes into account for custom functions. When DataFlowSanitizer transforms a call to a custom function, the new call has extra parameters. The attributes on parameters must be updated to take the new position of each parameter into account. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D43132 llvm-svn: 325820	2018-02-22 19:09:07 +00:00
Daniel Neilson	20c9207be3	[AlignmentFromAssumptions] Set source and dest alignments of memory intrinsics separately Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: hfinkel, bollu, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D43081 llvm-svn: 325816	2018-02-22 18:55:59 +00:00
Luke Cheeseman	6c1e6bbe0c	[FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisations passes for naked functions - Fix for bug 36078. - Prevent the functionattrs, function-attrs, globalopt and argpromotion passes from changing naked functions. - These passes can perform some alterations to the functions that should not be applied. An example is removing parameters that are seemingly not used because they are only referenced in the inline assembly. Another example is marking the function as fastcc. llvm-svn: 325788	2018-02-22 14:42:08 +00:00
Mircea Trofin	56950974d4	[SampleProf] NFC. Expose reusable functionality in SampleProfile. Summary: Exposing getOffset and findFunctionSamples as members of SampleProfile. They are intimately tied to design choices of the sample profile format - using offsets instead of line numbers, and traversing inlined functions stack, respectively. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43605 llvm-svn: 325747	2018-02-22 06:42:57 +00:00
Vedant Kumar	1ceabcf080	[Utils] Avoid a hash table lookup in salvageDI, NFC According to the current coverage report salvageDebugInfo() is called 5.12 million times during testing and almost always returns early. The early return depends on LocalAsMetadata::getIfExists returning null, which involves a DenseMap lookup in an LLVMContextImpl. We can probably speed this up by simply checking the IsUsedByMD bit in Value. llvm-svn: 325738	2018-02-22 01:29:41 +00:00
Sanjay Patel	5a6f904520	[InstCombine] add and use Create*FMF functions; NFC llvm-svn: 325730	2018-02-21 22:18:55 +00:00
Evgeniy Stepanov	43271b1803	[hwasan] Fix inline instrumentation. This patch changes hwasan inline instrumentation: Fixes address untagging for shadow address calculation (use 0xFF instead of 0x00 for the top byte). Emits brk instruction instead of hlt for the kernel and user space. Use 0x900 instead of 0x100 for brk immediate (0x100 - 0x800 are unavailable in the kernel). Fixes and adds appropriate tests. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D43135 llvm-svn: 325711	2018-02-21 19:52:23 +00:00
Vedant Kumar	56492f9177	[BDCE] Salvage debug info from dying insts This results in 15 additional unique source variables in a stage2 build of FileCheck (at '-Os -g'), with a negligible increase in the size of the .debug_loc section. llvm-svn: 325660	2018-02-21 01:55:33 +00:00
Sanjay Patel	6f716a7c5e	[InstCombine] C / -X --> -C / X We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. This is similar to rL325648. llvm-svn: 325649	2018-02-21 00:01:45 +00:00
Sanjay Patel	d8dd0151fc	[InstCombine] -X / C --> X / -C for FP We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. llvm-svn: 325648	2018-02-20 23:51:16 +00:00
Sanjoy Das	737fa40ffa	[DSE] Don't DSE stores that subsequent memmove calls read from Summary: We used to remove the first memmove in cases like this: memmove(p, p+2, 8); memmove(p, p+2, 8); which is incorrect. Fix this by changing isPossibleSelfRead to what was most likely the intended behavior. Historical note: the buggy code was added in https://reviews.llvm.org/rL120974 to address PR8728. Reviewers: rsmith Subscribers: mcrosier, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D43425 llvm-svn: 325641	2018-02-20 23:19:34 +00:00
Sanjay Patel	7365b44b85	[InstCombine] remove unneeded operand swap: NFCI FMul is commutative, so complexity-based canonicalization should always take care of the swap via SimplifyAssociativeOrCommutative(). llvm-svn: 325628	2018-02-20 21:52:46 +00:00
Sanjay Patel	29b98ae337	[InstCombine] remove unneeded dyn_cast to prevent unused variable warning llvm-svn: 325597	2018-02-20 17:14:53 +00:00
Sanjay Patel	b2d978682b	[InstCombine] remove compound fdiv pattern folds These are fdiv-with-constant-divisor, so they already become reciprocal multiplies. The last gap for vector ops should be closed with rL325590. It's possible that we're missing folds for some edge cases with denormal intermediate constants after deleting these, but there are no tests for those patterns, and it would be better to handle denormals more consistently (and less conservatively) as noted in TODO comments. llvm-svn: 325595	2018-02-20 16:52:17 +00:00
Sanjay Patel	90f4c8ec29	[InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C) llvm-svn: 325590	2018-02-20 16:08:15 +00:00
Sanjay Patel	2816560b2c	[InstCombine] use CreateWithCopiedFlags to reduce code; NFCI Also, move the folds with constants closer to make it easier to follow. llvm-svn: 325541	2018-02-19 23:09:03 +00:00
Brian Gesiak	d1eabb1810	Revert "[mem2reg] Use range loops (NFCI)" This reverts commit r325532. llvm-svn: 325539	2018-02-19 22:48:51 +00:00
Sanjay Patel	1d14779aed	[InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math It's possible that we could allow this either 'arcp' or 'reassoc' alone, but this should be conservatively better than what we have right now. GCC allows this with only -freciprocal-math. The last test is changed to show a case that is expected to fold, but we need D43398. llvm-svn: 325533	2018-02-19 21:46:52 +00:00
Brian Gesiak	49a9d1a4e6	[mem2reg] Use range loops (NFCI) Summary: Several for loops in PromoteMemoryToRegister.cpp leave their increment expression empty, instead incrementing the iterator within the for loop body. I believe this is because these loops were previously implemented as while loops; see https://reviews.llvm.org/rL188327. Incrementing the iterator within the body of the for loop instead of in its increment expression makes it seem like the iterator will be modified or conditionally incremented within the loop, but that is not the case in these loops. Instead, use range loops. Test Plan: `check-llvm` Reviewers: davide, bkramer Reviewed By: davide, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43473 llvm-svn: 325532	2018-02-19 21:44:52 +00:00
Sanjay Patel	e412954953	[InstCombine] refactor fdiv with constant dividend folds; NFC The last fold that used to be here was not necessary. That's a combination of 2 folds (and there's a regression test to show that). The transforms are guarded by isFast(), but that should be loosened. llvm-svn: 325531	2018-02-19 21:17:58 +00:00
Brian Gesiak	58434db098	[Coroutines] Move debug statement before assert Summary: Move a debug statement to above where an assertion is hit, so that the debug statement can be inspected before a stack trace. Test Plan: `check-llvm` llvm-svn: 325529	2018-02-19 20:50:09 +00:00
Charles Saternos	b040fcc693	[ThinLTO] Add GraphTraits for FunctionSummaries Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Third attempt - moved function from lambda to static function due to build failures. llvm-svn: 325506	2018-02-19 15:14:50 +00:00
Ivan A. Kosarev	f03f579d1d	[Transforms] Propagate new-format TBAA tags on simplification of memory-transfer intrinsics With this patch in place, when a new-format TBAA tag is available for a memory-transfer intrinsic call, we prefer propagating that new-format tag. Otherwise, we fallback to the old approach where we try to construct a proper TBAA access tag from 'tbaa.struct' metadata. Differential Revision: https://reviews.llvm.org/D41543 llvm-svn: 325488	2018-02-19 12:10:20 +00:00
Simon Pilgrim	0efed32577	Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). Reverted due to buildbot failures llvm-svn: 325454	2018-02-18 00:01:36 +00:00
Charles Saternos	35878ee7a4	[ThinLTO] Add GraphTraits for FunctionSummaries Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). llvm-svn: 325448	2018-02-17 21:39:24 +00:00
Sanjay Patel	08868e494e	[Constant] add floating-point helpers for normal/finite-nz; NFC ...and delete the equivalent local functions from InstCombine. These might be useful to other InstCombine files or other passes and makes FP queries more similar to integer constant queries. llvm-svn: 325398	2018-02-16 22:32:54 +00:00
Simon Pilgrim	c2ee69035c	Remove useless comment - seems to be a copy+paste typo. NFCI llvm-svn: 325385	2018-02-16 20:41:06 +00:00
Sanjay Patel	91bb775087	[InstCombine] clean up fdiv-with-fdiv folds; NFCI llvm-svn: 325366	2018-02-16 17:52:32 +00:00
Sanjay Patel	e16b0cfba9	[InstCombine] remove redundant debug info setting; NFC The IRBuilder sets debuginfo in Insert(), so this was duplicating what already happened. llvm-svn: 325358	2018-02-16 16:42:04 +00:00
Brian M. Rzycki	f1a7df5ef2	[JumpThreading] PR36133 enable/disable DominatorTree for LVI analysis Summary: The LazyValueInfo pass caches a copy of the DominatorTree when available. Whenever there are pending DominatorTree updates within JumpThreading's DeferredDominance object we cannot use the cached DT for LVI analysis. This commit adds the new methods enableDT() and disableDT() to LVI. JumpThreading also sets the appropriate usage model before calling LVI analysis methods. Fixes https://bugs.llvm.org/show_bug.cgi?id=36133 Reviewers: sebpop, dberlin, kuhar Reviewed by: sebpop, kuhar Subscribers: uabelho, llvm-commits, aprantl, hiraditya, a.elovikov Differential Revision: https://reviews.llvm.org/D42717 llvm-svn: 325356	2018-02-16 16:35:17 +00:00
Sanjay Patel	65da14d6c8	[InstCombine] reduce code duplication; NFC llvm-svn: 325353	2018-02-16 16:13:20 +00:00
Ivan A. Kosarev	53270d0fa6	[Transforms] Propagate TBAA info in SROA Now that we have the new TBAA metadata format that is capable of representing accesses to aggregates, we can propagate TBAA access tags from memory setting and transferring intrinsics to load and store instructions and vice versa. Since SROA produces lots of new loads and stores on optimized builds, this change significantly decreases the share of undecorated memory accesses on such builds. Differential Revision: https://reviews.llvm.org/D41563 llvm-svn: 325329	2018-02-16 10:10:29 +00:00
Eugene Leviant	7331a0bf1c	[ThinLTO] Import global variables Differential revision: https://reviews.llvm.org/D43077 llvm-svn: 325320	2018-02-16 08:11:04 +00:00
Vedant Kumar	616fdb00df	[GVN] Partially revert debug info salvage change (r325063) In r325063, we salvaged debug values from dying instructions in GVN::processBlock() and GVN::performScalarPRE(). The change in performScalarPRE(), while correct, is unhelpful. It introduced a call to salvageDebugInfo() which was immediately followed by a RAUW, meaning it prevented the RAUW from efficiently updating dbg.value intrinsics. This commit reverts the mistake and tightens up the affected test case. llvm-svn: 325308	2018-02-16 01:15:20 +00:00
Vedant Kumar	1df820ecd7	[DCE] Salvage debug info from dead insts This results in small increases in the size of the .debug_loc section and the number of unique source variables in a stage2 build of opt. llvm-svn: 325301	2018-02-15 22:26:18 +00:00
Brian Gesiak	a5e3675bd3	[Coroutines] Don't move stores for allocator args Summary: The behavior described in Coroutines TS `[dcl.fct.def.coroutine]/7` allows coroutine parameters to be passed into allocator functions. The instructions to store values into the alloca'd parameters must not be moved past the frame allocation, otherwise uninitialized values are passed to the allocator. Test Plan: `check-llvm` Reviewers: rsmith, GorNishanov, eric_niebler Reviewed By: GorNishanov Subscribers: compnerd, EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D43000 llvm-svn: 325285	2018-02-15 19:31:45 +00:00
Vedant Kumar	044b588929	[Utils] salvageDI: Add a comment and move a call earlier, NFC llvm-svn: 325280	2018-02-15 19:13:03 +00:00
Sanjay Patel	1e04511e16	[InstCombine] use m_OneUse to reduce code; NFC llvm-svn: 325263	2018-02-15 16:30:10 +00:00
Sanjay Patel	339b4d338d	[InstCombine] allow sin/cos transforms with 'reassoc' The variable name 'AllowReassociate' is a lie at this point because it's set to 'isFast()' which is more than the 'reassoc' FMF after rL317488. In D41286, we showed that this transform may be valid even with strict math by brute force checking every 32-bit float result. There's a potential problem here because we're replacing with a tan() libcall rather than a hypothetical LLVM tan intrinsic. So we might set errno when we should be guaranteed not to do that. But that's independent of this change. llvm-svn: 325247	2018-02-15 15:07:12 +00:00
Sanjay Patel	6a0f667077	[InstCombine] allow X / C -> X * (1.0/C) for vector splat FP constants llvm-svn: 325237	2018-02-15 13:55:52 +00:00
Sanjay Patel	b39bcc0437	[InstCombine] clean up fold for X / C -> X * (1.0/C); NFCI This should work with vector constants too, but it's currently limited to scalar. llvm-svn: 325187	2018-02-14 23:04:17 +00:00
Rafael Espindola	7186753218	Pass a module reference to CloneModule. It can never be null and most callers were already using references or std::unique_ptr. llvm-svn: 325160	2018-02-14 19:50:40 +00:00
Rafael Espindola	6a86e25d90	Pass a reference to a module to the bitcode writer. This simplifies most callers as they are already using references or std::unique_ptr. llvm-svn: 325155	2018-02-14 19:11:32 +00:00
David Green	0d5f9651f2	Move llvm::computeLoopSafetyInfo from LICM.cpp to LoopUtils.cpp. NFC Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h, into the corresponding LoopUtils.cpp, as opposed to LICM where it resides at the moment. This will allow other functions from Transforms/Utils to reference it. llvm-svn: 325151	2018-02-14 18:34:53 +00:00
Craig Topper	1c19cc1745	[InstCombine] Don't fold select(C, Z, binop(select(C, X, Y), W)) -> select(C, Z, binop(Y, W)) if the binop is rem or div. The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe. Fixes PR36362. Differential Revision: https://reviews.llvm.org/D43276 llvm-svn: 325148	2018-02-14 18:08:33 +00:00
Sanjay Patel	5df4d8892f	[InstCombine] simplify isFMulOrFDivWithConstant(); NFCI llvm-svn: 325142	2018-02-14 17:16:33 +00:00
Sanjay Patel	58dab856f7	[InstCombine] replace isa/cast with dyn_cast; NFC llvm-svn: 325141	2018-02-14 16:56:44 +00:00
Sanjay Patel	604cb9e3ed	[InstCombine] refactor folds for mul with negated operands; NFCI This keeps with our current usage of 'match' and is easier to see that the optional NSW only applies in the non-constant operand case. llvm-svn: 325140	2018-02-14 16:50:55 +00:00
Alexey Bataev	7f246e003a	[SLP] Allow vectorization of reversed loads. Summary: Reversed loads are handled as gathering. But we can just reshuffle these values. Patch adds support for vectorization of reversed loads. Reviewers: RKSimon, spatel, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43022 llvm-svn: 325134	2018-02-14 15:29:15 +00:00
Florian Hahn	b4e3bad89b	Recommit r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325126	2018-02-14 13:59:12 +00:00
Florian Hahn	c6296fea3f	[LoopInterchange] Incrementally update the dominator tree. We can use incremental dominator tree updates to avoid re-calculating the dominator tree after interchanging 2 loops. Reviewers: dmgreen, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D43176 llvm-svn: 325122	2018-02-14 13:13:15 +00:00
Petar Jovanovic	1768957c82	[Utils] Salvage the debug info of DCE'ed 'and' instructions Preserve debug info from a dead 'and' instruction with a constant. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D43163 llvm-svn: 325119	2018-02-14 13:10:35 +00:00
Elena Demikhovsky	945b7e5aa6	Adding a width of the GEP index to the Data Layout. Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size. Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions it was decided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102	2018-02-14 06:58:08 +00:00
Vedant Kumar	1d5d31b706	[GVN] Salvage debug info from dead insts This preserves an additional 581 unique source variables in a stage2 build of clang (according to `llvm-dwarfdump --statistics`). It increases the size of the .debug_loc section by 0.1% (or 87139 bytes). Differential Revision: https://reviews.llvm.org/D43255 llvm-svn: 325063	2018-02-13 22:27:17 +00:00
Sanjay Patel	7558d860af	[InstCombine] (lshr X, 31) * Y --> (ashr X, 31) & Y This replaces the bit-tracking based fold that did the same thing, but it only worked for scalars and not directly. There is no evidence in existing regression tests that the greater power of bit-tracking was needed here, but we should be aware of this potential loss of optimization. llvm-svn: 325062	2018-02-13 22:24:37 +00:00
Sanjay Patel	cb8ac00f73	[InstCombine] (bool X) * Y --> X ? Y : 0 This is both a functional improvement for vectors and an efficiency improvement for scalars. The existing code below the new folds does the same thing for scalars, but in an indirect and expensive way. llvm-svn: 325048	2018-02-13 20:41:22 +00:00
Vedant Kumar	35fc103e1e	[DeadStoreElimination] Salvage debug info from dead insts According to `llvm-dwarfdump --statistics` this salvages 43 additional unique source variables in a stage2 build of clang. It increases the size of the .debug_loc section by 0.002% (or 2864 bytes). Differential Revision: https://reviews.llvm.org/D43220 llvm-svn: 325035	2018-02-13 18:15:26 +00:00
Florian Hahn	35d744d388	Revert r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call. Due to memsan not being happy with the array of ValueToValue maps. llvm-svn: 325009	2018-02-13 14:48:39 +00:00
Florian Hahn	348b48ac6b	[CallSiteSplitting] Clear ValueToValue maps. llvm-svn: 325006	2018-02-13 14:17:00 +00:00
Florian Hahn	78bddd4cca	[CallSiteSplitting] Dereference pointer earlier. This should make the sanitizers happy. llvm-svn: 325004	2018-02-13 13:51:51 +00:00
Simon Pilgrim	be0dd72620	[InstCombine] Simplify getLogBase2 case for scalar/splats. NFCI. llvm-svn: 325003	2018-02-13 13:16:26 +00:00
Florian Hahn	b0884b6443	[CallSiteSplitting] Support splitting of blocks with instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325001	2018-02-13 12:00:48 +00:00
Florian Hahn	1f95ef1815	[LoopInterchange] Check number of latch successors before accessing them. In cases where the OuterMostLoopLatchBI only has a single successor, accessing the second successor will fail. This fixes a failure when building the test-suite with loop-interchange enabled. Reviewers: mcrosier, karthikthecool, davide Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D42906 llvm-svn: 324994	2018-02-13 10:02:52 +00:00
Vedant Kumar	388fac5de6	[Utils] Salvage debug info from all no-op casts We already try to salvage debug values from no-op bitcasts and inttoptr instructions: we should handle ptrtoint instructions as well. This saves an additional 24,444 debug values in a stage2 build of clang, and (according to llvm-dwarfdump --statistics) provides an additional 289 unique source variables. llvm-svn: 324982	2018-02-13 03:34:23 +00:00
Vedant Kumar	4011c26cc7	[Utils] Salvage debug info of DCE'ed mul/sdiv/srem instructions Here are the number of additional debug values salvaged in a stage2 build of clang: 63 SALVAGE: MUL 1250 SALVAGE: SDIV (No values were salvaged from `srem` instructions in this experiment, but it's a simple case to handle so we might as well.) llvm-svn: 324976	2018-02-13 01:09:52 +00:00
Vedant Kumar	31ec356a48	[Utils] Salvage debug info of DCE'ed shl/lshr/ashr instructions Here are the number of additional debug values salvaged in a stage2 build of clang: 1912 SALVAGE: ASHR 405 SALVAGE: LSHR 249 SALVAGE: SHL llvm-svn: 324975	2018-02-13 01:09:49 +00:00
Vedant Kumar	47b16c45d7	[Utils] Salvage the debug info of DCE'ed 'sub' instructions This salvages 14 debug values in a stage2 build of clang. llvm-svn: 324974	2018-02-13 01:09:47 +00:00
Vedant Kumar	96b7dc041b	[Utils] Salvage the debug info of DCE'ed 'xor' instructions This salvages 259 debug values in a stage2 build of clang. Differential Revision: https://reviews.llvm.org/D43207 llvm-svn: 324973	2018-02-13 01:09:46 +00:00
Daniel Neilson	2363da9236	[InstCombine] Simplify MemTransferInst's source and dest alignments separately Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the InstCombine pass to cease using the deprecated MemoryIntrinsic::getAlignment() method, and instead we use the separate getSourceAlignment and getDestAlignment APIs to simplify the source and destination alignment attributes separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: majnemer, bollu, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D42871 llvm-svn: 324960	2018-02-12 23:06:55 +00:00
Adam Nemet	031a00c660	Revert "[LSR] Avoid UB overflow when examining reuse opportunities" This reverts commit r324943. Breaking bots, reverting for Gerolf. llvm-svn: 324958	2018-02-12 22:42:13 +00:00
Gerolf Hoflehner	edcd564820	[LSR] Avoid UB overflow when examining reuse opportunities llvm-svn: 324943	2018-02-12 21:49:32 +00:00
Volodymyr Sapsai	2ad768bb13	Revert "[ThinLTO] Add GraphTraits for FunctionSummaries" It caused assertion failure Assertion failed: (!DD.IsLambda && !MergeDD.IsLambda && "faked up lambda definition?"), function MergeDefinitionData, file /Users/buildslave/jenkins/workspace/clang-stage1-configure-RA/llvm/tools/clang/lib/Serialization/ASTReaderDecl.cpp, line 1675. on the second stage build bots. llvm-svn: 324932	2018-02-12 20:43:31 +00:00
Sanjay Patel	4a4f35f324	[InstCombine] X / (X * Y) --> 1.0 / Y This is similar to the instsimplify fold added with D42385 ( rL323716 ) ...but this can't be in instsimplify because we're creating/morphing a different instruction. llvm-svn: 324927	2018-02-12 19:39:21 +00:00
Sanjay Patel	1998cc6a47	[InstCombine] various clean-ups for div transforms; NFC llvm-svn: 324922	2018-02-12 18:38:35 +00:00
Jun Bum Lim	144eb593dd	[LICM] update BlockColors after splitting predecessors Update BlockColors after splitting predecessors. Do not allow splitting EHPad for sinking when the BlockColors is not empty, so we can simply assign predecessor's color to the new block. Fixes PR36184 llvm-svn: 324916	2018-02-12 17:56:55 +00:00
Alexey Bataev	ca2396e673	[SLP] Take user instructions cost into consideration in insertelement vectorization. Summary: For better vectorization result we should take into consideration the cost of the user insertelement instructions when we try to vectorize sequences that build the whole vector. I.e. if we have the following scalar code: ``` <Scalar code> insertelement <ScalarCode>, ... ``` we should consider the cost of the last `insertelement ` instructions as the cost of the scalar code. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D42657 llvm-svn: 324893	2018-02-12 14:54:48 +00:00
Sanjay Patel	39059d2630	[InstCombine] various clean-ups for commonIDivTransforms; NFC llvm-svn: 324891	2018-02-12 14:14:56 +00:00
Florian Hahn	e54a20e094	[LoopInterchange] Simplify splitInnerLoopHeader logic (NFC). We can use SplitBlock for both cases, which makes the code slightly simpler and updates both LoopInfo and the dominator tree. llvm-svn: 324881	2018-02-12 11:10:58 +00:00
Max Kazantsev	b57ca09e43	[NFC] Fix typos llvm-svn: 324867	2018-02-12 05:16:28 +00:00
Charles Saternos	d3e7d19f59	[ThinLTO] Add GraphTraits for FunctionSummaries Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. llvm-svn: 324854	2018-02-11 22:06:20 +00:00
Sanjay Patel	510d647a4d	[InstCombine] X / (X * Y) -> 1 / Y if the multiplication does not overflow The related cases for (X * Y) / X were handled in rL124487. https://rise4fun.com/Alive/6k9 The division in these tests is subsequently eliminated by existing instcombines for 1/X. llvm-svn: 324843	2018-02-11 17:20:32 +00:00
Simon Pilgrim	19495198af	[InstCombine] Add constant vector support for ~(C >> Y) --> ~C >> Y Includes adding m_NonNegative constant pattern matcher llvm-svn: 324825	2018-02-10 21:46:09 +00:00
Mircea Trofin	73b96d6dcf	[LV] Fix analyzeInterleaving when -pass-remarks enabled Summary: If -pass-remarks=loop-vectorize, atomic ops will be seen by analyzeInterleaving(), even though canVectorizeMemory() == false. This is because we are requesting extra analysis instead of bailing out. In such a case, we end up with a Group in both Load- and StoreGroups, and then we'll try to access freed memory when traversing LoadGroups after having had released the Group when iterating over StoreGroups. The fix is to include mayWriteToMemory() when validating that two instructions are the same kind of memory operation. Reviewers: mssimpso, davidxl Reviewed By: davidxl Subscribers: hsaito, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D43064 llvm-svn: 324786	2018-02-10 00:07:45 +00:00
Vedant Kumar	04386d8e3d	[Utils] Salvage debug info from dead 'or' instructions Extend salvageDebugInfo to preserve the debug info from a dead 'or' with a constant. Patch by Ismail Badawi! Differential Revision: https://reviews.llvm.org/D43129 llvm-svn: 324764	2018-02-09 19:19:55 +00:00
Steven Wu	33ba93c2b5	[ThinLTO] Teach ThinLTO about auto hide symbols Summary: For symbols that has linkonce_odr linkage and unnamed_addr, it can be auto hide by linker to avoid weak external symbols. Teach ThinLTO to perform auto hide so it can safely promote linkonce_odr to weak symbols without breaking this nice property. Reviewers: tejohnson, mehdi_amini Reviewed By: tejohnson Subscribers: inglorion, eraman, rnk, pcc, llvm-commits Differential Revision: https://reviews.llvm.org/D43130 llvm-svn: 324757	2018-02-09 18:34:08 +00:00
Simon Pilgrim	9620f4b746	[InstCombine] Add constant vector support for X udiv C, where C >= signbit llvm-svn: 324728	2018-02-09 10:43:59 +00:00
Serguei Katkov	3cb4c34a4e	Rename and move utility function getLatchPredicateForGuard. NFC. Rename getLatchPredicateForGuard to more common name getFlippedStrictnessPredicate and move it to ICmpInst class. llvm-svn: 324717	2018-02-09 07:59:07 +00:00
Evgeniy Stepanov	80ccda2d4b	[hwasan] Fix kernel instrumentation of stack. Summary: Kernel addresses have 0xFF in the most significant byte. A tag can not be pushed there with OR (tag << 56); use AND ((tag << 56) \| 0x00FF..FF) instead. Reviewers: kcc, andreyknvl Subscribers: srhines, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D42941 llvm-svn: 324691	2018-02-09 00:59:10 +00:00
Dmitry Mikulin	5cf73cea9c	[ThinLTO] Skip BlockAddresses while replacing uses in function import. Differential Revision: https://reviews.llvm.org/D43027 llvm-svn: 324658	2018-02-08 22:14:56 +00:00
Daniel Neilson	606cf6f64f	[DSan] Update uses of memory intrinsic get/setAlignment to new API (NFC) Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the DataFlowSanitizer pass to cease using the old get/setAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324654	2018-02-08 21:28:26 +00:00
Daniel Neilson	a98d9d92da	[ASan] Update uses of IRBuilder::CreateMemCpy to new API (NFC) Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AddressSanitizer pass to cease using The old IRBuilder CreateMemCpy single-alignment API in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324653	2018-02-08 21:26:12 +00:00
Daniel Neilson	57b34ce574	[MSan] Update uses of IRBuilder::CreateMemCpy to new API (NFC) Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the MemorySanitizer pass to cease using the old IRBuilder CreateMemCpy single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324642	2018-02-08 19:46:12 +00:00
Simon Pilgrim	a54e8e429b	[InstCombine] visitSRem - use m_Negative(APInt) helper. NFCI. llvm-svn: 324636	2018-02-08 19:00:45 +00:00
Simon Pilgrim	1889f26b94	[InstCombine] Add m_Negative pattern matching Allows us to add non-uniform constant vector support for "X urem C -> X < C ? X : X - C, where C >= signbit." llvm-svn: 324631	2018-02-08 18:36:01 +00:00
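
The fold quoted in the last entry above, "X urem C -> X < C ? X : X - C, where C >= signbit", can be sanity-checked by brute force. The sketch below is not LLVM code: it models 8-bit unsigned values with plain Python integers, and the names `urem_folded` and `check_urem_fold` are hypothetical helpers introduced only for this illustration.

```python
# Illustrative model only: 8-bit unsigned integers, not LLVM's APInt.
BITS = 8
MASK = (1 << BITS) - 1        # largest 8-bit unsigned value (255)
SIGNBIT = 1 << (BITS - 1)     # 0x80, the smallest C for which the fold applies


def urem_folded(x, c):
    """Compare-and-subtract form of x urem c; assumes c >= SIGNBIT."""
    return x if x < c else x - c


def check_urem_fold():
    """Compare the folded form against true unsigned remainder for every
    8-bit x and every 8-bit c with the sign bit set."""
    for c in range(SIGNBIT, MASK + 1):
        for x in range(MASK + 1):
            if x % c != urem_folded(x, c):
                return False
    return True
```

The identity holds because any C with the sign bit set satisfies 2*C > X for every X of the same width, so at most one subtraction of C is ever needed.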


20356 Commits