llvm-project

Commit Graph

Author	SHA1	Message	Date
Vitaly Buka	c559e63798	Handle IntToPtr in isBytewiseValue Summary: This helps with more efficient use of memset for pattern initialization From @pcc prototype for -ftrivial-auto-var-init=pattern optimizations Binary size change on CTMark, (with -fuse-ld=lld -Wl,--icf=all, similar results with default linker options) ``` master patch diff Os 8.238864e+05 8.238864e+05 0.0 O3 1.054797e+06 1.054797e+06 0.0 Os zero 8.292384e+05 8.292384e+05 0.0 O3 zero 1.062626e+06 1.062626e+06 0.0 Os pattern 8.579712e+05 8.338048e+05 -0.030299 O3 pattern 1.090502e+06 1.067574e+06 -0.020481 ``` Zero vs Pattern on master ``` zero pattern diff Os 8.292384e+05 8.579712e+05 0.036578 O3 1.062626e+06 1.090502e+06 0.025124 ``` Zero vs Pattern with the patch ``` zero pattern diff Os 8.292384e+05 8.338048e+05 0.003333 O3 1.062626e+06 1.067574e+06 0.003193 ``` Reviewers: pcc, eugenis Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63967 llvm-svn: 365858	2019-07-12 01:42:03 +00:00
Tim Northover	67828edbbd	OpaquePtr: switch to GlobalValue::getValueType in a few places. NFC. llvm-svn: 365770	2019-07-11 13:13:02 +00:00
Tim Northover	030bb3d363	InstructionSimplify: Simplify InstructionSimplify. NFC. The interface predates CallBase, so both it and implementation were significantly more complicated than they needed to be. There was even some redundancy that could be eliminated. Should also help with OpaquePointers by not trying to derive a function's type from it's PointerType. llvm-svn: 365767	2019-07-11 13:11:44 +00:00
Chen Zheng	627095ec5b	[SCEV] teach SCEV symbolical execution about overflow intrinsics folding. Differential Revision: https://reviews.llvm.org/D64422 llvm-svn: 365726	2019-07-11 02:18:22 +00:00
Johannes Doerfert	3ed286a388	Replace three "strip & accumulate" implementations with a single one This patch replaces the three almost identical "strip & accumulate" implementations for constant pointer offsets with a single one, combining the respective functionalities. The old interfaces are kept for now. Differential Revision: https://reviews.llvm.org/D64468 llvm-svn: 365723	2019-07-11 01:14:48 +00:00
Vitaly Buka	d03bd1db59	NFC: Pass DataLayout into isBytewiseValue Summary: We will need to handle IntToPtr which I will submit in a separate patch as it's not going to be NFC. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63940 llvm-svn: 365709	2019-07-10 22:53:52 +00:00
Jinsong Ji	06fef0b359	Revert "[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()" This reverts commit `d955573065`. llvm-svn: 365520	2019-07-09 17:53:09 +00:00
Chen Zheng	d955573065	[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable() Differential Revision: https://reviews.llvm.org/D64197 llvm-svn: 365497	2019-07-09 14:56:17 +00:00
Tim Northover	60afa49abe	OpaquePtr: add Type parameter to Loads analysis API. This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job. Most callers had an obvious memory operation handy to provide this type, but a SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering. llvm-svn: 365468	2019-07-09 11:35:35 +00:00
Denis Bakhvalov	74be349bcf	[SCEV] Fix for PR42397. SCEVExpander wrongly adds nsw to shl instruction. Change-Id: I76c9f628c092ae3e6e78ebdaf55cec726e25d692 llvm-svn: 365363	2019-07-08 18:03:43 +00:00
Brian Homerding	b4b21d807e	Add, and infer, a nofree function attribute This patch adds a function attribute, nofree, to indicate that a function does not, directly or indirectly, call a memory-deallocation function (e.g., free, C++'s operator delete). Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365336	2019-07-08 15:57:56 +00:00
Eugene Leviant	3aef35288b	[ThinLTO] Attempt to recommit r365188 after alignment fix llvm-svn: 365215	2019-07-05 15:25:05 +00:00
Eugene Leviant	e91f86f0ac	Reverted r365188 due to alignment problems on i686-android llvm-svn: 365206	2019-07-05 13:26:05 +00:00
Eugene Leviant	820cc01d1e	[ThinLTO] Attempt to recommit r365040 after caching fix It's possible that some function can load and store the same variable using the same constant expression: store %Derived* @foo, %Derived bitcast (%Base @bar to %Derived*) %42 = load %Derived, %Derived bitcast (%Base @bar to %Derived**) The bitcast expression was mistakenly cached while processing loads, and never examined later when processing store. This caused @bar to be mistakenly treated as read-only variable. See load-store-caching.ll. llvm-svn: 365188	2019-07-05 12:00:10 +00:00
Reid Kleckner	f7e52fbdb5	Revert [ThinLTO] Optimize writeonly globals out This reverts r365040 (git commit `5cacb91475`) Speculatively reverting, since this appears to have broken check-lld on Linux. Partial analysis in https://crbug.com/981168. llvm-svn: 365097	2019-07-04 00:03:30 +00:00
Evgeniy Stepanov	50dc28b556	Teach ValueTracking that aarch64.irg result aliases its input. Reviewers: javed.absar, olista01 Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64103 llvm-svn: 365079	2019-07-03 20:19:14 +00:00
Philip Reames	39e7a97ad7	[SCEV] Preserve flags on add/muls in getSCEVATScope We haven't changed the set of users, just specialized an operand for those users. Given that, the previous wrap flags must still be correct. Sorry for the lack of test case. Noticed this while working on something else, and haven't figured out to exercise this standalone. llvm-svn: 365053	2019-07-03 16:34:08 +00:00
Eugene Leviant	5cacb91475	[ThinLTO] Optimize writeonly globals out Differential revision: https://reviews.llvm.org/D63444 llvm-svn: 365040	2019-07-03 14:14:52 +00:00
Eugene Leviant	ac407a7b4a	[SCEV][LSR] Prevent using undefined value in binops On some occasions ReuseOrCreateCast may convert previously expanded value to undefined. That value may be passed by SCEVExpander as an argument to InsertBinop making IV chain undefined. Differential revision: https://reviews.llvm.org/D63928 llvm-svn: 365009	2019-07-03 09:36:32 +00:00
Jordan Rupprecht	02647f73d4	Revert [InlineCost] cleanup calculations of Cost and Threshold This reverts r364422 (git commit `1a3dc76186`) The inlining cost calculation is incorrect, leading to stack overflow due to large stack frames from heavy inlining. llvm-svn: 365000	2019-07-03 04:01:51 +00:00
Chen Zheng	dfdccbb26b	[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop. Differential Revision: https://reviews.llvm.org/D63477 llvm-svn: 364993	2019-07-03 01:49:03 +00:00
Teresa Johnson	a700436323	[ThinLTO] Add summary entries for index-based WPD Summary: If LTOUnit splitting is disabled, the module summary analysis computes the summary information necessary to perform single implementation devirtualization during the thin link with the index and no IR. The information collected from the regular LTO IR in the current hybrid WPD algorithm is summarized, including: 1) For vtable definitions, record the function pointers and their offset within the vtable initializer (subsumes the information collected from IR by tryFindVirtualCallTargets). 2) A record for each type metadata summarizing the vtable definitions decorated with that metadata (subsumes the TypeIdentiferMap collected from IR). Also added are the necessary bitcode records, and the corresponding assembly support. The follow-on index-based WPD patch is D55153. Depends on D53890. Reviewers: pcc Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54815 llvm-svn: 364960	2019-07-02 19:38:02 +00:00
Kristof Umann	9353421ecd	[IDF] Generalize IDFCalculator to be used with Clang's CFG I'm currently working on a GSoC project that aims to improve the the bug reports of the analyzer. The main heuristic I plan to use is to explain values that are a control dependency of the bug location better. 01 bool b = messyComputation(); 02 int i = 0; 03 if (b) // control dependency of the bug site, let's explain why we assume val 04 // to be true 05 10 / i; // warn: division by zero Because of this, I'd like to generalize IDFCalculator so that I could use it for Clang's CFG: D62883. In detail: * Rename IDFCalculator to IDFCalculatorBase, make it take a general CFG node type as a template argument rather then strictly BasicBlock (but preserve ForwardIDFCalculator and ReverseIDFCalculator) * Move IDFCalculatorBase from llvm/include/llvm/Analysis to llvm/include/llvm/Support (but leave the BasicBlock variants in llvm/include/llvm/Analysis) * clang-format the file since this patch messes up git blame anyways * Change typedef to using * Add the new type ChildrenGetterTy, and store an instance of it in IDFCalculatorBase. This is important because I'll have to specialize it for Clang's CFG to filter out nullpointer successors, similarly to D62507. Differential Revision: https://reviews.llvm.org/D63389 llvm-svn: 364911	2019-07-02 11:30:12 +00:00
Fangrui Song	78ee2fbf98	Cleanup: llvm::bsearch -> llvm::partition_point after r364719 llvm-svn: 364720	2019-06-30 11:19:56 +00:00
Johannes Doerfert	6ed459fd41	Use "willreturn" in isGuaranteedToTransferExecutionToSuccessor The `willreturn` function attribute guarantees that a function call will come back to the call site if the call is also known not to throw. Therefore, this attribute can be used in `isGuaranteedToTransferExecutionToSuccessor`. Patch by Hideto Ueno (@uenoku) Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63372 llvm-svn: 364580	2019-06-27 19:29:48 +00:00
Philip Reames	1cf9e72cbc	Update -analyze -scalar-evolution output for multiple exit loops w/computable exit values The previous output was next to useless if any exit was not computable. If we have more than one exit, show the exit count for each so that it's easier to see what's going from with SCEV analysis when debugging. llvm-svn: 364579	2019-06-27 19:22:43 +00:00
Fedor Sergeev	1a3dc76186	[InlineCost] cleanup calculations of Cost and Threshold Summary: Doing better separation of Cost and Threshold. Cost counts the abstract complexity of live instructions, while Threshold is an upper bound of complexity that inlining is comfortable to pay. There are two parts: - huge 15K last-call-to-static bonus is no longer subtracted from Cost but rather is now added to Threshold. That makes much more sense, as the cost of inlining (Cost) is not changed by the fact that internal function is called once. It only changes the likelyhood of this inlining being profitable (Threshold). - bonus for calls proved-to-be-inlinable into callee is no longer subtracted from Cost but added to Threshold instead. While calculations are somewhat different, overall InlineResult should stay the same since Cost >= Threshold compares the same. Reviewers: eraman, greened, chandlerc, yrouban, apilipenko Reviewed By: apilipenko Tags: #llvm Differential Revision: https://reviews.llvm.org/D60740 llvm-svn: 364422	2019-06-26 13:24:24 +00:00
Chen Zheng	aa99952896	[HardwareLoops] NFC - move loop with irreducible control flow checking logic to HarewareLoopInfo. llvm-svn: 364415	2019-06-26 12:02:43 +00:00
Chen Zheng	46ce9e4fff	[HardwareLoops] NFC - move loop with irreducible control flow checking logic to isHardwareLoopProfitable() llvm-svn: 364397	2019-06-26 09:12:52 +00:00
Clement Courbet	3bc5ad551a	[ExpandMemCmp] Move all options to TargetTransformInfo. Split off from D60318. llvm-svn: 364281	2019-06-25 08:04:13 +00:00
Bjorn Pettersson	485a421876	[ConstantFolding] Use hasVectorInstrinsicScalarOpd. NFC Summary: Use the hasVectorInstrinsicScalarOpd helper function in ConstantFoldVectorCall. Reviewers: rengolin, RKSimon, dblaikie Reviewed By: rengolin, RKSimon Subscribers: tschuett, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63705 llvm-svn: 364178	2019-06-24 12:07:17 +00:00
Bjorn Pettersson	512b118779	[Scalarizer] Add scalarizer support for smul.fix.sat Summary: Handle smul.fix.sat in the scalarizer. This is done by adding smul.fix.sat to the set of "isTriviallyVectorizable" intrinsics. The addition of smul.fix.sat in isTriviallyVectorizable and hasVectorInstrinsicScalarOpd can also be seen as a preparation to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding. Reviewers: rengolin, RKSimon, dblaikie Reviewed By: rengolin Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63704 llvm-svn: 364177	2019-06-24 12:07:11 +00:00
Fangrui Song	dc8de6037c	Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC llvm-svn: 364006	2019-06-21 05:40:31 +00:00
Sanjay Patel	b342f026a4	[InstSimplify] simplify power-of-2 (single bit set) sequences As discussed in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 Improving the canonicalization for these patterns: rL363956 ...means we should adjust/enhance the related simplification. https://rise4fun.com/Alive/w1cp Name: isPow2 or zero %x = and i32 %xx, 2048 %a = add i32 %x, -1 %r = and i32 %a, %x => %r = i32 0 llvm-svn: 363997	2019-06-20 22:55:28 +00:00
Alina Sbirlea	109d2ea153	[MemorySSA] Cleanup trivial phis. Summary: This is unfortunately needed for correctness, if we are to extend the tolerance of the update API to the way simple loop unswitch is doing cloning. In simple loop unswitch (as opposed to loop unswitch), not all blocks are cloned. This can create unreachable cloned blocks (no predecessor), which are later cleaned up. In MemorySSA, the APIs for supporting these kind of updates (clone + update exit blocks), make certain assumption on the integrity of the CFG. When cloning, if something was not cloned, it's values in MemorySSA default to LiveOnEntry. When updating exit blocks, it is safe to assume that we can first insert phis in the blocks merging two clones, then add additional phis in the IDF of the blocks that received phis. This no longer holds true if one of the clones being merged comes from an unreachable block. We'd conservatively need to add all phis before filling in their incoming definitions. In practice this restriction can be relaxed if we clean up trivial phis after the first round of insertion. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63354 llvm-svn: 363880	2019-06-19 21:33:09 +00:00
Alina Sbirlea	238b8e62b6	[MemorySSA] Use GraphDiff info when computing IDF. Summary: When computing IDF for insert updates, ensure we use the snapshot CFG offered by GraphDiff. Caught by D63389. Reviewers: kuhar, george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits, Szelethus Tags: #llvm Differential Revision: https://reviews.llvm.org/D63443 llvm-svn: 363879	2019-06-19 21:17:31 +00:00
Bjorn Pettersson	16ff5fea87	[ConstantFolding] Add constant folding for smul.fix and smul.fix.sat Summary: This patch teaches ConstantFolding to constant fold both scalar and vector variants of llvm.smul.fix and llvm.smul.fix.sat. As described in the LangRef rounding is unspecified for these instrinsics. If the result cannot be represented exactly the default behavior in ConstantFolding is to round down towards negative infinity. If a target has a preferred rounding that is different some kind of target hook would be needed (same strategy as used by the SelectionDAG legalizer). Reviewers: nikic, leonardchan, RKSimon Reviewed By: leonardchan Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63385 llvm-svn: 363811	2019-06-19 14:28:03 +00:00
Bjorn Pettersson	b81b9a4e7b	[ConstantFolding] Refactor ConstantFoldScalarCall. NFC This patch splits ConstantFoldScalarCall into several functions. Benefits: - Reduces indentation levels and avoids long if-statements. - Makes it easier to add support for > 3 operands. llvm-svn: 363810	2019-06-19 14:27:51 +00:00
Jay Foad	45d19fb470	[ConstantFolding] Fix assertion failure on non-power-of-two vector load. Summary: The test case does an (out of bounds) load from a global constant with type <3 x float>. InstSimplify tried to turn this into an integer load of the whole alloc size of the vector, which is 128 bits due to alignment padding, and then bitcast this to <3 x vector> which failed an assertion due to the type size mismatch. The fix is to do an integer load of the normal size of the vector, with no alignment padding. Reviewers: tpr, arsenm, majnemer, dstuttard Reviewed By: arsenm Subscribers: hfinkel, wdng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63375 llvm-svn: 363784	2019-06-19 10:28:48 +00:00
Chen Zheng	c5b918de58	[NFC] move some hardware loop checking code to a common place for other using. Differential Revision: https://reviews.llvm.org/D63478 llvm-svn: 363758	2019-06-19 01:26:31 +00:00
Amara Emerson	146882242f	[GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra block. Inter-block localization is the same as what currently happens, except now it only runs on the entry block because that's where the problematic constants with long live ranges come from. The second phase is a new intra-block localization phase which attempts to re-sink the already localized instructions further right before one of the multiple uses. One additional change is to also localize G_GLOBAL_VALUE as they're constants too. However, on some targets like arm64 it takes multiple instructions to materialize the value, so some additional heuristics with a TTI hook have been introduced attempt to prevent code size regressions when localizing these. Overall, these changes improve CTMark code size on arm64 by 1.2%. Full code size results: Program baseline new diff ------------------------------------------------------------------------------ test-suite...-typeset/consumer-typeset.test 1249984 1217216 -2.6% test-suite...:: CTMark/ClamAV/clamscan.test 1264928 1232152 -2.6% test-suite :: CTMark/SPASS/SPASS.test 1394092 1361316 -2.4% test-suite...Mark/mafft/pairlocalalign.test 731320 714928 -2.2% test-suite :: CTMark/lencod/lencod.test 1340592 `1324200` -1.2% test-suite :: CTMark/kimwitu++/kc.test 3853512 3820420 -0.9% test-suite :: CTMark/Bullet/bullet.test 3406036 3389652 -0.5% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8017000 8016992 -0.0% test-suite...TMark/7zip/7zip-benchmark.test 2856588 2856588 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 765704 765704 0.0% Geomean difference -1.2% Differential Revision: https://reviews.llvm.org/D63303 llvm-svn: 363632	2019-06-17 23:20:29 +00:00
Philip Reames	44475363e8	Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges This patch really contains two pieces: Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi. Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses. Differential Revision: https://reviews.llvm.org/D63224 llvm-svn: 363619	2019-06-17 21:06:17 +00:00
Alina Sbirlea	7a0098aa6e	[MemorySSA] Don't use template when the clone is a simplified instruction. Summary: LoopRotate doesn't create a faithful clone of an instruction, it may simplify it beforehand. Hence the clone of an instruction that has a MemoryDef associated may not be a definition, but a use or not a memory alternig instruction. Don't rely on the template when the clone may be simplified. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63355 llvm-svn: 363597	2019-06-17 18:58:40 +00:00
Alina Sbirlea	05f77803f4	[MemorySSA] Add all MemoryPhis before filling their values. Summary: Add all MemoryPhis in IDF before filling in their incomign values. Otherwise, a new Phi can be added that needs to become the incoming value of another Phi. Test fails the verification in verifyPrevDefInPhis. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63353 llvm-svn: 363590	2019-06-17 18:16:53 +00:00
Warren Ristow	6452bdd29b	[LV] Suppress vectorization in some nontemporal cases When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the aligment of the original scaler memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type DataType, unsigned Alignment) const; bool isLegalNTLoad(Type DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 llvm-svn: 363581	2019-06-17 17:20:08 +00:00
Sam Parker	60d6fb2a63	[SCEV] Use NoWrapFlags when expanding a simple mul Second functional change following on from rL362687. Pass the NoWrapFlags from the MulExpr to InsertBinop when we're generating a shl or mul. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363540	2019-06-17 10:05:18 +00:00
Roman Lebedev	5a663bd77a	[InstSimplify] Fix addo/subo undef folds (PR42209) Fix folds of addo and subo with an undef operand to be: `@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`, as per LLVM undef rules. Same for commuted variants. Based on the original version of the patch by @nikic. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 \| PR42209 ]] Differential Revision: https://reviews.llvm.org/D63065 llvm-svn: 363522	2019-06-16 20:39:45 +00:00
Nikita Popov	8550fb386a	[SCEV] Use unsigned/signed intersection type in SCEV Based on D59959, this switches SCEV to use unsigned/signed range intersection based on the sign hint. This will prefer non-wrapping ranges in the relevant domain. I've left the one intersection in getRangeForAffineAR() to use the smallest intersection heuristic, as there doesn't seem to be any obvious preference there. Differential Revision: https://reviews.llvm.org/D60035 llvm-svn: 363490	2019-06-15 09:15:52 +00:00
Akira Hatanaka	a704a8f28c	[ObjC][ARC] Delete ObjC runtime calls on global variables annotated with 'objc_arc_inert' Those calls are no-ops, so they can be safely deleted. rdar://problem/49839633 Differential Revision: https://reviews.llvm.org/D62433 llvm-svn: 363468	2019-06-14 22:06:32 +00:00
Matt Arsenault	282dac717e	SROA: Allow eliminating addrspacecasted allocas There is a circular dependency between SROA and InferAddressSpaces today that requires running both multiple times in order to be able to eliminate all simple allocas and addrspacecasts. InferAddressSpaces can't remove addrspacecasts when written to memory, and SROA helps move pointers out of memory. This should avoid inserting new commuting addrspacecasts with GEPs, since there are unresolved questions about pointer wrapping between different address spaces. For now, don't replace volatile operations that don't match the alloca addrspace, as it would change the address space of the access. It may be still OK to insert an addrspacecast from the new alloca, but be more conservative for now. llvm-svn: 363462	2019-06-14 21:38:31 +00:00
Sam Parker	0cf9639a9c	[SCEV] Pass NoWrapFlags when expanding an AddExpr InsertBinop now accepts NoWrapFlags, so pass them through when expanding a simple add expression. This is the first re-commit of the functional changes from rL362687, which was previously reverted. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363364	2019-06-14 09:19:41 +00:00
Nikita Popov	ad81d427ca	[LangRef] Clarify poison semantics I find the current documentation of poison somewhat confusing, mainly because its use of "undefined behavior" doesn't seem to align with our usual interpretation (of immediate UB). Especially the sentence "any instruction that has a dependence on a poison value has undefined behavior" is very confusing. Clarify poison semantics by: * Replacing the introductory paragraph with the standard rationale for having poison values. * Spelling out that instructions depending on poison return poison. * Spelling out how we go from a poison value to immediate undefined behavior and give the two examples we currently use in ValueTracking. * Spelling out that side effects depending on poison are UB. Differential Revision: https://reviews.llvm.org/D63044 llvm-svn: 363320	2019-06-13 19:45:36 +00:00
Philip Reames	038e01dc9a	Add a clarifying comment about branching on poison I recently got this wrong (again), and I'm sure I'm not the only one. Put a comment in the logical place someone would look to "fix" the obvious "missed optimization" which arrises based on the common misunderstanding. Hopefully, this will save others time. :) llvm-svn: 363318	2019-06-13 19:27:56 +00:00
Joseph Tremoulet	3bc6e2a7aa	[EarlyCSE] Ensure equal keys have the same hash value Summary: The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation. This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`. It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion. Reviewers: spatel, nikic Reviewed By: spatel, nikic Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62644 llvm-svn: 363274	2019-06-13 15:24:11 +00:00
Philip Reames	e51c3d8b82	[SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673 SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA. It's not entirely clear this restriction is neccessary, but for the moment it exists. For this reason, we don't analyze single-entry phi inputs. However it is possible that when an this input leaves the loop through LCSSA Phi, it is a provable constant. Missing that results in an order of optimization issue in loop exit value rewriting where we miss some oppurtunities based on order in which we visit sibling loops. This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values. We should probably also add similiar logic directly in the SCEV construction path itself. Patch by: mkazantsev (with revised commit message by me) Differential Revision: https://reviews.llvm.org/D58113 llvm-svn: 363180	2019-06-12 17:21:47 +00:00
Matt Arsenault	2466ba97bc	LoopDistribute/LAA: Respect convergent This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks. Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed. https://reviews.llvm.org/D62607 llvm-svn: 363160	2019-06-12 13:34:19 +00:00
Nico Weber	8bbdea447e	Fix a Wunused-lambda-capture warning. The capture was added in the first commit of https://reviews.llvm.org/D61934 when it was used. In the reland, the use was removed but the capture wasn't removed. llvm-svn: 363155	2019-06-12 12:46:46 +00:00
Sam Parker	61de6a4e9c	[NFC][SCEV] Add NoWrapFlag argument to InsertBinOp 'Use wrap flags in InsertBinop' (rL362687) was reverted due to miscompiles. This patch introduces the previous change to pass no-wrap flags but now only FlagAnyWrap is passed. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363147	2019-06-12 11:53:55 +00:00
Philip Reames	02f0b379f5	Fix a bug in getSCEVAtScope w.r.t. non-canonical loops The issue is that if we have a loop with multiple predecessors outside the loop, the code was expecting to merge them and only return if equal, but instead returned the first one seen. I have no idea if this actually tripped anywhere. I noticed it by accident when reading the code and have no idea how to go about constructing a test case. llvm-svn: 363112	2019-06-11 23:21:24 +00:00
Sanjay Patel	40e3bdf876	[Analysis] add isSplatValue() for vectors in IR We have the related getSplatValue() already in IR (see code just above the proposed addition). But sometimes we only need to know that the value is a splat rather than capture the splatted scalar value. Also, we have an isSplatValue() function already in SDAG. Motivation - recent bugs that would potentially benefit from improved splat analysis in IR: https://bugs.llvm.org/show_bug.cgi?id=37428 https://bugs.llvm.org/show_bug.cgi?id=42174 Differential Revision: https://reviews.llvm.org/D63138 llvm-svn: 363106	2019-06-11 22:25:18 +00:00
Alina Sbirlea	cb4ed8a7bc	[MemorySSA] When applying updates, clean unnecessary Phis. Summary: After applying a set of insert updates, there may be trivial Phis left over. Clean them up. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63033 llvm-svn: 363094	2019-06-11 19:09:34 +00:00
Alina Sbirlea	3cef1f7d64	Only passes that preserve MemorySSA must mark it as preserved. Summary: The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as preserved, because it's being called in a lot of passes that do not preserve MemorySSA. Instead, mark the MemorySSA analysis as preserved by each pass that does preserve it. These changes only affect the new pass mananger. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62536 llvm-svn: 363091	2019-06-11 18:27:49 +00:00
Philip Reames	4bf1c23990	Factor out a helper function for readability and reuse in a future patch [NFC] llvm-svn: 362980	2019-06-10 20:41:27 +00:00
Sanjay Patel	866db10228	[InstSimplify] reduce code duplication for fcmp folds; NFC llvm-svn: 362904	2019-06-09 13:58:46 +00:00
Sanjay Patel	73f5a855b3	[InstSimplify] enhance fcmp fold with never-nan operand This is another step towards correcting our usage of fast-math-flags when applied on an fcmp. In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of the fcmp. But I'm leaving that clause in until we're more confident that we can stop relying on fcmp's FMF. By using the more general "isKnownNeverNaN()", we gain a simplification shown on the tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN). On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction in addition to the FMF on the fcmp. This is a continuation of D62979 / rL362879. llvm-svn: 362903	2019-06-09 13:48:59 +00:00
Ayke van Laethem	f18cf230e4	[CaptureTracking] Don't let comparisons against null escape inbounds pointers Pointers that are in-bounds (either through dereferenceable_or_null or thorough a getelementptr inbounds) cannot be captured with a comparison against null. There is no way to construct a pointer that is still in bounds but also NULL. This helps safe languages that insert null checks before load/store instructions. Without this patch, almost all pointers would be considered captured even for simple loads. With this patch, an icmp with null will not be seen as escaping as long as certain conditions are met. There was a lot of discussion about this patch. See the Phabricator thread for detals. Differential Revision: https://reviews.llvm.org/D60047 llvm-svn: 362900	2019-06-09 10:20:33 +00:00
Sanjay Patel	4329c15f11	[InstSimplify] enhance fcmp fold with never-nan operand This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp. In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of the fcmp. But I'm leaving that clause in until we're more confident that we can stop relying on fcmp's FMF. By using the more general "isKnownNeverNaN()", we gain a simplification shown on the tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN). On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction in addition to the FMF on the fcmp. I'll update the 'ult' case below here as a follow-up assuming no problems here. Differential Revision: https://reviews.llvm.org/D62979 llvm-svn: 362879	2019-06-08 15:12:33 +00:00
Sanjay Patel	e490e4a0e7	[Analysis] simplify code for getSplatValue(); NFC AFAIK, this is only currently called by TTI, but it could be used from instcombine or CGP to help solve problems like: https://bugs.llvm.org/show_bug.cgi?id=37428 https://bugs.llvm.org/show_bug.cgi?id=42174 llvm-svn: 362810	2019-06-07 16:09:54 +00:00
Joerg Sonnenberger	b2e96169b0	[NFC] Don't export helpers of ConstantFoldCall llvm-svn: 362799	2019-06-07 13:28:52 +00:00
Sam Parker	c5ef502ee8	[CodeGen] Generic Hardware Loop Support Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced: - void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed. - i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit. - anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum numbe of elements processed in an iteration of the loop body. Returns the updated number of elements remaining. llvm-svn: 362774	2019-06-07 07:35:30 +00:00
Craig Topper	ca541b20d0	[CFLGraph] Add support for unary fneg instruction. Differential Revision: https://reviews.llvm.org/D62791 llvm-svn: 362737	2019-06-06 19:21:23 +00:00
Craig Topper	6cda33ba36	[InlineCost] Add support for unary fneg. This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost. Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now. Differential Revision: https://reviews.llvm.org/D62699 llvm-svn: 362732	2019-06-06 19:02:18 +00:00
Whitney Tsang	03e8369a72	[DA] Add an option to control delinearization validity checks Summary: Dependence Analysis performs static checks to confirm validity of delinearization. These checks often fail for 64-bit targets due to type conversions and integer wrapping that prevent simplification of the SCEV expressions. These checks would also fail at compile-time if the lower bound of the loops are compile-time unknown. For example: void foo(int n, int m, int a[][m]) { for (int i = 0; i < n; ++i) for (int j = 0; j < m; ++j) { a[i][j] = a[i+1][j-2]; } } opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline -pass-remarks=.* -debug-pass=Arguments -da-permissive-validity-checks=false k3.ll -analyze -da will produce the following by default: da analyze - anti [* *\|<]! but will produce the following expected dependence vector if the validity checks are disabled: da analyze - consistent anti [1 -2]! This revision will introduce a debug option that will leave the validity checks in place by default, but allow them to be turned off. New tests are added for cases where it cannot be proven at compile-time that the individual subscripts stay in-bound with respect to a particular dimension of an array. These tests enable the option to provide user guarantee that the subscripts do not over/under-flow into other dimensions, thereby producing more accurate dependence vectors. For prior discussion on this topic, leading to this change, please see the following thread: http://lists.llvm.org/pipermail/llvm-dev/2019-May/132372.html Reviewers: Meinersbur, jdoerfert, kbarton, dmgreen, fhahn Reviewed By: Meinersbur, jdoerfert, dmgreen Subscribers: fhahn, hiraditya, javed.absar, llvm-commits, Whitney, etiotto Tag: LLVM Differential Revision: https://reviews.llvm.org/D62610 llvm-svn: 362711	2019-06-06 15:12:49 +00:00
Benjamin Kramer	f1249442cf	Revert "[SCEV] Use wrap flags in InsertBinop" This reverts commit r362687. Miscompiles llvm-profdata during selfhost. llvm-svn: 362699	2019-06-06 12:35:46 +00:00
Sam Parker	7cc580f5e9	[SCEV] Use wrap flags in InsertBinop If the given SCEVExpr has no (un)signed flags attached to it, transfer these to the resulting instruction or use them to find an existing instruction. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 362687	2019-06-06 08:56:26 +00:00
Whitney Tsang	2d0896c1cb	[LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, and loop induction variable. Summary: This PR extends the loop object with more utilities to get loop bounds, step, and loop induction variable. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D60565 llvm-svn: 362644	2019-06-05 20:42:47 +00:00
Whitney Tsang	590b1aee60	Revert "Title: [LOOPINFO] Extend Loop object to add utilities to get the loop" This reverts commit `d34797dfc2`. llvm-svn: 362615	2019-06-05 15:32:56 +00:00
Benjamin Kramer	b90b354798	[LoopInfo] Fix unused variable warning. NFC. llvm-svn: 362610	2019-06-05 14:43:58 +00:00
Whitney Tsang	d34797dfc2	Title: [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, and loop induction variable. Summary: This PR extends the loop object with more utilities to get loop bounds, step, and loop induction variable. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D60565 llvm-svn: 362609	2019-06-05 14:34:12 +00:00
Nemanja Ivanovic	fe97754acf	Initial support for IBM MASS vector library This is the LLVM portion of patch https://reviews.llvm.org/D59881. The clang portion is to follow. llvm-svn: 362568	2019-06-05 01:31:43 +00:00
Johannes Doerfert	40107ce753	Introduce Value::stripPointerCastsSameRepresentation This patch allows current users of Value::stripPointerCasts() to force the result of the function to have the same representation as the value it was called on. This is useful in various cases, e.g., (non-)null checks. In this patch only a single call site was adjusted to fix an existing misuse that would cause nonnull where they may be wrong. Uses in attribute deduction and other areas, e.g., D60047, are to be expected. For a discussion on this topic, please see [0]. [0] http://lists.llvm.org/pipermail/llvm-dev/2018-December/128423.html Reviewers: hfinkel, arsenm, reames Subscribers: wdng, hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61607 llvm-svn: 362545	2019-06-04 20:21:46 +00:00
Nikita Popov	df621bdfc8	[LVI][CVP] Add support for urem, srem and sdiv The underlying ConstantRange functionality has been added in D60952, D61207 and D61238, this just exposes it for LVI. I'm switching the code from using a whitelist to a blacklist, as we're down to one unsupported operation here (xor) and writing it this way seems more obvious :) Differential Revision: https://reviews.llvm.org/D62822 llvm-svn: 362519	2019-06-04 16:24:09 +00:00
George Burgess IV	c24a2f4ad9	CFLAA: reflow comments; NFC llvm-svn: 362442	2019-06-03 19:56:22 +00:00
Craig Topper	7a4eabef39	[CFLGraph] Add FAdd to visitConstantExpr. This looks like an oversight as all the other binary operators are present. Accidentally noticed while auditing places that need FNeg handling. No test because as noted in the review it would be contrived and amount to "don't crash" Differential Revision: https://reviews.llvm.org/D62790 llvm-svn: 362441	2019-06-03 19:35:52 +00:00
Craig Topper	7cebf0af40	[InlineCost] Don't add the soft float function call cost for the fneg idiom, fsub -0.0, %x Summary: Fneg can be implemented with an xor rather than a function call so we don't need to add the function call overhead. This was pointed out in D62699 Reviewers: efriedma, cameron.mcinally Reviewed By: efriedma Subscribers: javed.absar, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62747 llvm-svn: 362304	2019-06-01 19:40:07 +00:00
Erik Pilkington	abb2a93c53	[SimplifyLibCalls] Fold more fortified functions into non-fortified variants When the object size argument is -1, no checking can be done, so calling the _chk variant is unnecessary. We already did this for a bunch of these functions. rdar://50797197 Differential revision: https://reviews.llvm.org/D62358 llvm-svn: 362272	2019-05-31 22:41:36 +00:00
Russell Gallop	802c9b59d5	ftime-trace: Trace loop passes These can take a significant amount of time in some builds. Suggested by Andrea Di Biagio. Differential Revision: https://reviews.llvm.org/D62666 llvm-svn: 362219	2019-05-31 10:14:04 +00:00
Craig Topper	b457e430f3	[InstructionSimplify] Add missing implementation of llvm::SimplifyUnOp. NFC There are no callers currently, but the function is declared so we should at least implement it. llvm-svn: 362205	2019-05-31 08:10:23 +00:00
Nikita Popov	332c100562	[ValueTracking][ConstantRange] Distinguish low/high always overflow In order to fold an always overflowing signed saturating add/sub, we need to know in which direction the always overflow occurs. This patch splits up AlwaysOverflows into AlwaysOverflowsLow and AlwaysOverflowsHigh to pass through this information (but it is not used yet). Differential Revision: https://reviews.llvm.org/D62463 llvm-svn: 361858	2019-05-28 18:08:31 +00:00
Craig Topper	ab53c5e5ab	[InlineCost] Fix a couple comments. NFC Replace "unary operator" with "unary instruction" in visitUnaryInstruction since we now have a UnaryOperator class which might needs its own visit function. Fix a copy/paste in visitCastInst that appears to have been copied from visitPtrToInt. llvm-svn: 361794	2019-05-28 07:25:27 +00:00
Craig Topper	50d502826b	[CostModel] Add really basic support for being able to query the cost of the FNeg instruction. Summary: This reuses the getArithmeticInstrCost, but passes dummy values of the second operand flags. The X86 costs are wrong and can be improved in a follow up. I just wanted to stop it from reporting an unknown cost first. Reviewers: RKSimon, spatel, andrew.w.kaylor, cameron.mcinally Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62444 llvm-svn: 361788	2019-05-28 04:09:18 +00:00
Xing Xue	3860aad6e7	[MustExecute] Improve MustExecute to correctly handle loop nest Summary: for.outer: br for.inner for.inner: LI <loop invariant load instruction> for.inner.latch: br for.inner, for.outer.latch for.outer.latch: br for.outer, for.outer.exit LI is a loop invariant load instruction that post dominate for.outer, so LI should be able to move out of the loop nest. However, there is a bug in allLoopPathsLeadToBlock(). Current algorithm of allLoopPathsLeadToBlock() 1. get all the transitive predecessors of the basic block LI belongs to (for.inner) ==> for.outer, for.inner.latch 2. if any successors of any of the predecessors are not for.inner or for.inner's predecessors, then return false 3. return true Although for.inner.latch is for.inner's predecessor, but for.inner dominates for.inner.latch, which means if for.inner.latch is ever executed, for.inner should be as well. It should not return false for cases like this. Author: Whitney (committed by xingxue) Reviewers: kbarton, jdoerfert, Meinersbur, hfinkel, fhahn Reviewed By: jdoerfert Subscribers: hiraditya, jsji, llvm-commits, etiotto, bmahjour Tags: #LLVM Differential Revision: https://reviews.llvm.org/D62418 llvm-svn: 361762	2019-05-27 13:57:28 +00:00
Nikita Popov	d0f13e618f	[ValueTracking] Base computeOverflowForUnsignedMul() on ConstantRange code; NFCI The implementation in ValueTracking and ConstantRange are equally powerful, reuse the one in ConstantRange, which will make this easier to extend. llvm-svn: 361723	2019-05-26 13:22:01 +00:00
Nikita Popov	6bb5041e94	[LVI][CVP] Add support for saturating add/sub Adds support for the uadd.sat family of intrinsics in LVI, based on ConstantRange methods from D60946. Differential Revision: https://reviews.llvm.org/D62447 llvm-svn: 361703	2019-05-25 16:44:14 +00:00
Nikita Popov	024b18aca7	[LVI][CVP] Calculate with.overflow result range In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0) as the range of op(%x, %y). This is mainly useful in conjunction with D60650: If the result of the operation is extracted in a branch guarded against overflow, then the value of %x will be appropriately constrained and the result range of the operation will be calculated taking that into account. Differential Revision: https://reviews.llvm.org/D60656 llvm-svn: 361693	2019-05-25 09:53:45 +00:00
Nikita Popov	17367b0d89	[LVI] Extract helper for binary range calculations; NFC llvm-svn: 361692	2019-05-25 09:53:37 +00:00
Sanjay Patel	8869a98e82	[InstSimplify] fold insertelement-of-extractelement This was partly handled in InstCombine (only the constant index case), so delete that and zap it more generally in InstSimplify. llvm-svn: 361576	2019-05-24 00:13:58 +00:00
Sanjay Patel	e60cb7d1be	[InstSimplify] insertelement V, undef, ? --> V This was part of InstCombine, but it's better placed in InstSimplify. InstCombine also had an unreachable but weaker fold for insertelement with undef index, so that is deleted. llvm-svn: 361559	2019-05-23 21:49:47 +00:00
Kit Barton	987fdfd9a7	Revert [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch. This reverts r361517 (git commit `2049e4dd8f`) llvm-svn: 361553	2019-05-23 20:53:05 +00:00
Kit Barton	2049e4dd8f	[LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch. Summary: This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getGuard --> if (guardcmp) goto loop; else goto afterloop /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Committed on behalf of @Whitney (Whitney Tsang). Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60565 llvm-svn: 361517	2019-05-23 17:56:35 +00:00
Sanjay Patel	63fa690617	[InstSimplify] update stale comment; NFC Missed this diff with rL361118. llvm-svn: 361180	2019-05-20 17:52:18 +00:00
Nick Desaulniers	639b29b1b5	[INLINER] allow inlining of blockaddresses if sole uses are callbrs Summary: It was supposed that Ref LazyCallGraph::Edge's were being inserted by inlining, but that doesn't seem to be the case. Instead, it seems that there was no test for a blockaddress Constant in an instruction that referenced the function that contained the instruction. Ex: ``` define void @f() { %1 = alloca i8, align 8 2: store i8 blockaddress(@f, %2), i8** %1, align 8 ret void } ``` When iterating blockaddresses, do not add the function they refer to back to the worklist if the blockaddress is referring to the contained function (as opposed to an external function). Because blockaddress has sligtly different semantics than GNU C's address of labels, there are 3 cases that can occur with blockaddress, where only 1 can happen in GNU C due to C's scoping rules: * blockaddress is within the function it refers to (possible in GNU C). * blockaddress is within a different function than the one it refers to (not possible in GNU C). * blockaddress is used in to declare a global (not possible in GNU C). The second case is tested in: ``` $ ./llvm/build/unittests/Analysis/AnalysisTests \ --gtest_filter=LazyCallGraphTest.HandleBlockAddress ``` This patch adjusts the iteration of blockaddresses in LazyCallGraph::visitReferences to not revisit the blockaddresses function in the first case. The Linux kernel contains code that's not semantically valid at -O0; specifically code passed to asm goto. It requires that asm goto be inline-able. This patch conservatively does not attempt to handle the more general case of inlining blockaddresses that have non-callbr users (pr/39560). https://bugs.llvm.org/show_bug.cgi?id=39560 https://bugs.llvm.org/show_bug.cgi?id=40722 https://github.com/ClangBuiltLinux/linux/issues/6 https://reviews.llvm.org/rL212077 Reviewers: jyknight, eli.friedman, chandlerc Reviewed By: chandlerc Subscribers: george.burgess.iv, nathanchance, mgorny, craig.topper, mengxu.gatech, void, mehdi_amini, E5ten, chandlerc, efriedma, eraman, hiraditya, haicheng, pirama, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D58260 llvm-svn: 361173	2019-05-20 16:48:09 +00:00
Cameron McInally	2d2a46db8e	[InstSimplify] Teach fsub -0.0, (fneg X) ==> X about unary fneg Differential Revision: https://reviews.llvm.org/D62077 llvm-svn: 361151	2019-05-20 13:13:35 +00:00
Sanjay Patel	9ef99b4b11	[InstSimplify] fold fcmp (maxnum, X, C1), C2 This is the sibling transform for rL360899 (D61691): maxnum(X, GreaterC) == C --> false maxnum(X, GreaterC) <= C --> false maxnum(X, GreaterC) < C --> false maxnum(X, GreaterC) >= C --> true maxnum(X, GreaterC) > C --> true maxnum(X, GreaterC) != C --> true llvm-svn: 361118	2019-05-19 14:26:39 +00:00
Cameron McInally	067e946859	[InstSimplify] Add unary fneg to `fsub 0.0, (fneg X) ==> X` transform Differential Revision: https://reviews.llvm.org/D62013 llvm-svn: 361047	2019-05-17 16:47:00 +00:00
Sanjay Patel	152f81fae8	[InstSimplify] fold fcmp (minnum, X, C1), C2 minnum(X, LesserC) == C --> false minnum(X, LesserC) >= C --> false minnum(X, LesserC) > C --> false minnum(X, LesserC) != C --> true minnum(X, LesserC) <= C --> true minnum(X, LesserC) < C --> true maxnum siblings will follow if there are no problems here. We should be able to perform some other combines when the constants are equal or greater-than too, but that would go in instcombine. We might also generalize this by creating an FP ConstantRange (similar to what we do for integers). Differential Revision: https://reviews.llvm.org/D61691 llvm-svn: 360899	2019-05-16 14:03:10 +00:00
Fangrui Song	3e92df3e39	Add Triple::isPPC64() llvm-svn: 360864	2019-05-16 08:31:22 +00:00
Cameron McInally	0c82d9b5a2	Teach InstSimplify -X + X --> 0.0 about unary FNeg Differential Revision: https://reviews.llvm.org/D61916 llvm-svn: 360777	2019-05-15 14:31:33 +00:00
Nikita Popov	48c4e4fa80	[LVI][CVP] Add support for abs/nabs select pattern flavor Based on ConstantRange support added in D61084, we can now handle abs and nabs select pattern flavors in LVI. Differential Revision: https://reviews.llvm.org/D61794 llvm-svn: 360700	2019-05-14 18:53:47 +00:00
Kit Barton	37b7922daa	Save the induction binary operator in IVDescriptors for non FP induction variables. Summary: Currently InductionBinOps are only saved for FP induction variables, the PR extends it with non FP induction variable, so user of IVDescriptors can query the InductionBinOps for integer induction variables. The changes in hasUnsafeAlgebra() and getUnsafeAlgebraInst() are required for the existing LIT test cases to pass. As described in the comment of the two functions, one of the requirement to return true is it is a FP induction variable. The checks was not needed because InductionBinOp was not set on non FP cases before. https://reviews.llvm.org/D60565 depends on the patch. Committed on behalf of @Whitney (Whitney Tsang). Reviewers: jdoerfert, kbarton, fhahn, hfinkel, dmgreen, Meinersbur Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61329 llvm-svn: 360671	2019-05-14 13:26:36 +00:00
Teresa Johnson	37b80122bd	[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible Summary: We hit undefined references building with ThinLTO when one source file contained explicit instantiations of a template method (weak_odr) but there were also implicit instantiations in another file (linkonce_odr), and the latter was the prevailing copy. In this case the symbol was marked hidden when the prevailing linkonce_odr copy was promoted to weak_odr. It led to unsats when the resulting shared library was linked with other code that contained a reference (expecting to be resolved due to the explicit instantiation). Add a CanAutoHide flag to the GV summary to allow the thin link to identify when all copies are eligible for auto-hiding (because they were all originally linkonce_odr global unnamed addr), and only do the auto-hide in that case. Most of the changes here are due to plumbing the new flag through the bitcode and llvm assembly, and resulting test changes. I augmented the existing auto-hide test to check for this situation. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi Tags: #llvm Differential Revision: https://reviews.llvm.org/D59709 llvm-svn: 360466	2019-05-10 20:08:24 +00:00
Philip Reames	76ea748d2d	Compile time tweak for libcall lookup If we have a large module which is mostly intrinsics, we hammer the lib call lookup path from CodeGenPrepare. Adding a fastpath reduces compile by 15% for one such example. The problem is really more general than intrinsics - a module with lots of non-intrinsics non-libcall calls has the same problem - but we might as well avoid an easy case quickly. llvm-svn: 360391	2019-05-09 23:13:09 +00:00
Warren Ristow	d27b0c6247	[SCEV] Suppress hoisting insertion point of binops when unsafe InsertBinop tries to move insertion-points out of loops for expressions that are loop-invariant. This patch adds a new parameter, IsSafeToHost, to guard that hoisting. This allows callers to suppress that hoisting for unsafe situations, such as divisions that may have a zero denominator. This fixes PR38697. Differential Revision: https://reviews.llvm.org/D55232 llvm-svn: 360280	2019-05-08 18:50:07 +00:00
Alina Sbirlea	f31eba6494	[MemorySSA] Teach LoopSimplify to preserve MemorySSA. Summary: Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available. Do not preserve it in the new pass manager. Update tests. Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc Tags: #llvm Differential Revision: https://reviews.llvm.org/D60833 llvm-svn: 360270	2019-05-08 17:05:36 +00:00
Nikita Popov	9fd02a71a3	Revert "[ValueTracking] Improve isKnowNonZero for Ints" This reverts commit `3b137a4956`. As reported in https://reviews.llvm.org/D60846, this is causing miscompiles. llvm-svn: 360260	2019-05-08 14:50:01 +00:00
Dan Robertson	3b137a4956	[ValueTracking] Improve isKnowNonZero for Ints Improve isKnownNonZero for integers in order to improve cttz optimizations. Differential Revision: https://reviews.llvm.org/D60846 llvm-svn: 360222	2019-05-08 02:25:08 +00:00
Sanjay Patel	e088d03b9c	[ValueTracking] add logic for known-never-nan with minnum/maxnum From the LangRef: "Returns NaN only if both operands are NaN." llvm-svn: 360206	2019-05-07 22:58:31 +00:00
Keno Fischer	a1a4adf4b9	[SCEV] Add explicit representations of umin/smin Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express is as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D50167 llvm-svn: 360159	2019-05-07 15:28:47 +00:00
Cameron McInally	c3167696bc	Add FNeg support to InstructionSimplify Differential Revision: https://reviews.llvm.org/D61573 llvm-svn: 360053	2019-05-06 16:05:10 +00:00
Cameron McInally	1d0c845d9d	Add FNeg IR constant folding support llvm-svn: 359982	2019-05-05 16:07:09 +00:00
Alina Sbirlea	0363c3b8bb	[MemorySSA] Check that block is reachable when adding phis. Summary: Originally the insertDef method was only used when building MemorySSA, and was limiting the number of Phi nodes that it created. Now it's used for updates as well, and it can create additional Phis needed for correctness. Make sure no Phis are created in unreachable blocks (condition met during MSSA build), otherwise the renamePass will find a null DTNode. Resolves PR41640. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61410 llvm-svn: 359845	2019-05-02 23:41:58 +00:00
Alina Sbirlea	151ab4844a	[MemorySSA] Refactor removing multiple trivial phis [NFC]. Summary: Create a method to clean up multiple potentially trivial phis, since we will need this often. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61471 llvm-svn: 359842	2019-05-02 23:12:49 +00:00
Keno Fischer	a3e4b3bd33	[SCEV] Use isKnownViaNonRecursiveReasoning for smax simplification Summary: Commit rL331949: SCEV] Do not use induction in isKnownPredicate for simplification umax changed the codepath for umax from isKnownPredicate to isKnownViaNonRecursiveReasoning to avoid compile time blow up (and as I found out also stack overflows). However, there is an exact copy of the code for umax that was lacking this change. In D50167 I want to unify these codepaths, but to avoid that being a behavior change for the smax case, pull this independent bit out of it. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D61166 llvm-svn: 359693	2019-05-01 15:58:24 +00:00
Keno Fischer	d8f856d265	[LoopInfo] Faster implementation of setLoopID. NFC. Summary: This change was part of D46460. However, in the meantime rL341926 fixed the correctness issue here. What remained was the performance issue in setLoopID where it would iterate through all blocks in the loop and their successors, rather than just the predecessor of the header (the later presumably being much faster). We already have the `getLoopLatches` to compute precisely these basic blocks in an efficient manner, so just use it (as the original commit did for `getLoopID`). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61215 llvm-svn: 359684	2019-05-01 14:39:11 +00:00
Alina Sbirlea	b468320313	[MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: LLVM Differential Revision: https://reviews.llvm.org/D61043 llvm-svn: 359627	2019-04-30 22:43:55 +00:00
Alina Sbirlea	ba48a2c5e8	[AliasAnalysis/NewPassManager] Invalidate AAManager less often. Summary: This is a redo of D60914. The objective is to not invalidate AAManager, which is stateless, unless there is an explicit invalidate in one of the AAResults. To achieve this, this patch adds an API to PAC, to check precisely this: is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61284 llvm-svn: 359622	2019-04-30 22:15:47 +00:00
Fedor Sergeev	eeae45dc77	[NFC][InlineCost] cleanup - comments, overflow handling. Reviewed By: apilipenko Tags: #llvm Differential Revision: https://reviews.llvm.org/D60751 llvm-svn: 359609	2019-04-30 20:44:53 +00:00
Simon Pilgrim	f5e8f222d6	Revert rL359519 : [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 ........ This was causing windows build bot failures llvm-svn: 359555	2019-04-30 12:34:21 +00:00
Sjoerd Meijer	ea31ddb36f	[ARM] Implement TTI::getMemcpyCost This implements TargetTransformInfo method getMemcpyCost, which estimates the number of instructions to which a memcpy instruction expands to. Differential Revision: https://reviews.llvm.org/D59787 llvm-svn: 359547	2019-04-30 10:28:50 +00:00
Alina Sbirlea	9a1edd14a2	[MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 llvm-svn: 359519	2019-04-29 23:53:04 +00:00
Nikita Popov	7a94795b2b	[ConstantRange] Add makeExactNoWrapRegion() I got confused on the terminology, and the change in D60598 was not correct. I was thinking of "exact" in terms of the result being non-approximate. However, the relevant distinction here is whether the result is * Largest range such that: Forall Y in Other: Forall X in Result: X BinOp Y does not wrap. (makeGuaranteedNoWrapRegion) * Smallest range such that: Forall Y in Other: Forall X not in Result: X BinOp Y wraps. (A hypothetical makeAllowedNoWrapRegion) * Both. (makeExactNoWrapRegion) I'm adding a separate makeExactNoWrapRegion method accepting a single APInt (same as makeExactICmpRegion) and using it in the places where the guarantee is relevant. Differential Revision: https://reviews.llvm.org/D60960 llvm-svn: 359402	2019-04-28 15:40:56 +00:00
Philip Reames	88cd69b56f	Consolidate existing utilities for interpreting vector predicate maskes [NFC] llvm-svn: 359163	2019-04-25 02:30:17 +00:00
Xinliang David Li	499c80b890	Add optional arg to profile count getters to filter synthetic profile count. Differential Revision: http://reviews.llvm.org/D61025 llvm-svn: 359131	2019-04-24 19:51:16 +00:00
Bjorn Pettersson	71e8c6f20f	Add "const" in GetUnderlyingObjects. NFC Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const. It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with have those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts. Reviewers: hfinkel, materi, jkorous Reviewed By: jkorous Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61038 llvm-svn: 359072	2019-04-24 06:55:50 +00:00
Alina Sbirlea	b341efce31	Revert [AliasAnalysis] AAResults preserves AAManager. Triggers use-after-free. llvm-svn: 359055	2019-04-24 00:28:29 +00:00
Josh Stone	27924c3a3c	[Lint] Permit aliasing noalias readonly arguments Summary: If two arguments are both readonly, then they have no memory dependency that would violate noalias, even if they do actually overlap. Reviewers: hfinkel, efriedma Reviewed By: efriedma Subscribers: efriedma, hiraditya, llvm-commits, tstellar Tags: #llvm Differential Revision: https://reviews.llvm.org/D60239 llvm-svn: 359047	2019-04-23 23:43:47 +00:00
Alina Sbirlea	4fd1f266b1	[MemorySSA] LCSSA preserves MemorySSA. Summary: Enabling MemorySSA in the old pass manager leads to MemorySSA being run twice due to the fact that LCSSA and LoopSimplify do not preserve MemorySSA. This is the first step to address that: target LCSSA. LCSSA does not make any changes that invalidate MemorySSA, so it preserves it by design. It must preserve AA as well, for this to hold. After this patch, MemorySSA is still run twice in the old pass manager. Step two follows: target LoopSimplify. Subscribers: mehdi_amini, jlebar, Prazek, llvm-commits, george.burgess.iv, chandlerc Tags: #llvm Differential Revision: https://reviews.llvm.org/D60832 llvm-svn: 359032	2019-04-23 20:59:44 +00:00
Alina Sbirlea	a809e8e5e7	[AliasAnalysis] AAResults preserves AAManager. Summary: AAResults should not invalidate AAManager. Update tests. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60914 llvm-svn: 359014	2019-04-23 17:21:18 +00:00
Fangrui Song	efd94c56ba	Use llvm::stable_sort While touching the code, simplify if feasible. llvm-svn: 358996	2019-04-23 14:51:27 +00:00
Fedor Sergeev	652168a99b	[CallSite removal] move InlineCost to CallBase usage Converting InlineCost interface and its internals into CallBase usage. Inliners themselves are still not converted. Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D60636 llvm-svn: 358982	2019-04-23 12:43:27 +00:00
Philip Reames	d8d9b7b20e	[InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombine In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask. llvm-svn: 358913	2019-04-22 19:30:01 +00:00
Nikita Popov	5aacc7a573	Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC" This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445. After thinking about this more, this isn't right, the range is not exact in the same sense as makeExactICmpRegion(). This needs a separate function. llvm-svn: 358876	2019-04-22 09:01:38 +00:00
Nikita Popov	5299e25f50	[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC Following D60632 makeGuaranteedNoWrapRegion() always returns an exact nowrap region. Rename the function accordingly. This is in line with the naming of makeExactICmpRegion(). llvm-svn: 358875	2019-04-22 08:36:05 +00:00
Nikita Popov	dbc3fbafe7	[ConstantRange] Add getNonEmpty() constructor ConstantRanges have an annoying special case: If upper and lower are the same, it can be either an empty or a full set. When constructing constant ranges nearly always a full set is intended, but this still requires an explicit check in many places. This revision adds a getNonEmpty() constructor that disambiguates this case: If upper and lower are the same, a full set is created. Differential Revision: https://reviews.llvm.org/D60947 llvm-svn: 358854	2019-04-21 15:22:54 +00:00
Chandler Carruth	ce3f75df1f	[CallSite removal] Move the legacy PM, call graph, and some inliner code to `CallBase`. This patch focuses on the legacy PM, call graph, and some of inliner and legacy passes interacting with those APIs from `CallSite` to the new `CallBase` class. No interesting changes. Differential Revision: https://reviews.llvm.org/D60412 llvm-svn: 358739	2019-04-19 05:59:42 +00:00
Nicolai Haehnle	523f90a2ba	[SDA] Bug fix: Use IPD outside the loop as divergence bound Summary: The immediate post dominator of the loop header may be part of the divergent loop. Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: mmasten, arsenm, jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59042 llvm-svn: 358681	2019-04-18 16:17:35 +00:00
Philip Reames	b2c9fc02d5	Fix a bug in SCEV's isSafeToExpand around speculation safety isSafeToExpand was making a common, but dangerously wrong, mistake in assuming that if any instruction within a basic block executes, that all instructions within that block must execute. This can be trivially shown to be false by considering the following small example: bb: add x, y <-- InsertionPoint call @throws() udiv x, y <-- SCEV* S br ... It's clearly not legal to expand S above the throwing call, but the previous logic would do so since S dominates (but not properlyDominates) the block containing the InsertionPoint. Since iterating instructions w/in a block is expensive, this change special cases two cases: 1) S is an operand of InsertionPoint, and 2) InsertionPoint is the terminator of it's block. These two together are enough to keep all current optimizations triggering while fixing the latent correctness issue. As best I can tell, this is a silent bug in current ToT. Given that, there's no tests with this change. It was noticed in an upcoming optimization change (D60093), and was reviewed as part of that. That change will include the test which caused me to notice the issue. I'm submitting this seperately so that anyone bisecting a problem gets a clear explanation. llvm-svn: 358680	2019-04-18 16:10:21 +00:00
Nikita Popov	2039581002	[LVI][CVP] Constrain values in with.overflow branches If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1) then we can constrain the value of %x inside the branch based on makeGuaranteedNoWrapRegion(). We do this by extending the edge-value handling in LVI. This allows CVP to then fold comparisons against %x, as illustrated in the tests. Differential Revision: https://reviews.llvm.org/D60650 llvm-svn: 358597	2019-04-17 16:57:42 +00:00
Nikita Popov	79dffc67b5	[IR] Add WithOverflowInst class This adds a WithOverflowInst class with a few helper methods to get the underlying binop, signedness and nowrap type and makes use of it where sensible. There will be two more uses in D60650/D60656. The refactorings are all NFC, though I left some TODOs where things could be improved. In particular we have two places where add/sub are handled but mul isn't. Differential Revision: https://reviews.llvm.org/D60668 llvm-svn: 358512	2019-04-16 18:55:16 +00:00
Alina Sbirlea	f9f073a861	[MemorySSA] Add previous def to cache when found, even if trivial. Summary: When inserting a new Def, MemorySSA may be have non-minimal number of Phis. While inserting, the walk to find the previous definition may cleanup minimal Phis. When the last definition is trivial to obtain, we do not cache it. It is possible while getting the previous definition for a Def to get two different answers: - one that was straight-forward to find when walking the first path (a trivial phi in this case), and - another that follows a cleanup of the trivial phi, it determines it may need additional Phi nodes, it inserts them and returns a new phi in the same position as the former trivial one. While the Phis added for the second path are all redundant, they are not complete (the walk is only done upwards), and they are not properly cleaned up afterwards. A way to fix this problem is to cache the straight-forward answer we got on the first walk. The caching is only kept for the duration of a getPreviousDef call, and for Phis we use TrackingVH, so removing the trivial phi will lead to replacing it with the next dominating phi in the cache. Resolves PR40749. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60634 llvm-svn: 358313	2019-04-12 21:58:52 +00:00
Alina Sbirlea	2312a06c87	[SCEV] Add option to forget everything in SCEV. Summary: Create a method to forget everything in SCEV. Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll. Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch. Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build. Testcase showcasing this cannot be opensourced and is fairly large. The option disabled by default, but it may be desirable to enable by default. Evidence in favor (two difference runs on different days/ToT state): File Before (s) After (s) clang-9.bc 7267.91 6639.14 llvm-as.bc 194.12 194.12 llvm-dis.bc 62.50 62.50 opt.bc 1855.85 1857.53 File Before (s) After (s) clang-9.bc 8588.70 7812.83 llvm-as.bc 196.20 194.78 llvm-dis.bc 61.55 61.97 opt.bc 1739.78 1886.26 Reviewers: sanjoy Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60144 llvm-svn: 358304	2019-04-12 19:16:07 +00:00
Alina Sbirlea	57769382b1	[MemorySSA] Small fix for the clobber limit. Summary: After introducing the limit for clobber walking, `walkToPhiOrClobber` would assert that the limit is at least 1 on entry. The test included triggered that assert. The callsite in `tryOptimizePhi` making the calls to `walkToPhiOrClobber` is structured like this: ``` while (true) { if (getBlockingAccess()) { // calls walkToPhiOrClobber } for (...) { walkToPhiOrClobber(); } } ``` The cleanest fix is to check if the limit was reached inside `walkToPhiOrClobber`, and give an allowence of 1. This approach not make any alias() calls (no calls to instructionClobbersQuery), so the performance condition is enforced. The limit is set back to 0 if not used, as this provides info on the fact that we stopped before reaching a true clobber. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60479 llvm-svn: 358303	2019-04-12 18:48:46 +00:00
Fangrui Song	cecc435250	Use llvm::lower_bound. NFC This reapplies rL358161. That commit inadvertently reverted an exegesis file to an old version. llvm-svn: 358246	2019-04-12 02:02:06 +00:00
Ali Tamur	7822b46188	Revert "Use llvm::lower_bound. NFC" This reverts commit rL358161. This patch have broken the test: llvm/test/tools/llvm-exegesis/X86/uops-CMOV16rm-noreg.s llvm-svn: 358199	2019-04-11 17:35:20 +00:00
Sander de Smalen	4f5d2df48d	[ValueTracking] Change if-else chain into switch in computeKnownBitsFromAssume This is a follow-up patch to D60504 to further improve performance issues in computeKnownBitsFromAssume. The patch is NFC, but may improve compile-time performance if the compiler isn't clever enough to do the optimization itself. llvm-svn: 358163	2019-04-11 13:02:19 +00:00
Fangrui Song	71cce580b9	Use llvm::lower_bound. NFC llvm-svn: 358161	2019-04-11 10:25:41 +00:00
Erik Pilkington	cb5c7bd9eb	Fix a hang when lowering __builtin_dynamic_object_size If the ObjectSizeOffsetEvaluator fails to fold the object size call, then it may litter some unused instructions in the function. When done repeatably in InstCombine, this results in an infinite loop. Fix this by tracking the set of instructions that were inserted, then removing them on failure. rdar://49172227 Differential revision: https://reviews.llvm.org/D60298 llvm-svn: 358146	2019-04-10 23:42:11 +00:00
Sander de Smalen	0e66db5d77	Improve compile-time performance in computeKnownBitsFromAssume. This patch changes the order of pattern matching by first testing a compare instruction's predicate, before doing the pattern match for the whole expression tree. Patch by Paul Walker. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D60504 llvm-svn: 358097	2019-04-10 16:24:48 +00:00
Nikita Popov	4b2323d1a3	[ValueTracking] Use computeConstantRange() for signed sub overflow determination This is the same change as D60420 but for signed sub rather than signed add: Range information is intersected into the known bits result, allows to detect more no/always overflow conditions. Differential Revision: https://reviews.llvm.org/D60469 llvm-svn: 358020	2019-04-09 17:01:49 +00:00
Nikita Popov	10edd2b79d	[ValueTracking] Use computeConstantRange() in signed add overflow determination This is D59386 for the signed add case. The computeConstantRange() result is now intersected into the existing known bits information, allowing to detect additional no-overflow/always-overflow conditions (though the latter isn't used yet). This (finally...) covers the motivating case from D59071. Differential Revision: https://reviews.llvm.org/D60420 llvm-svn: 358014	2019-04-09 16:12:59 +00:00
Nemanja Ivanovic	820b90318f	NFC: Refactor library-specific mappings of scalar maths functions to their vector counterparts This patch factors out mappings of scalar maths functions to their vector counterparts from TargetLibraryInfo.cpp to a separate VecFuncs.def file. Such mappings are currently available for Accelerate framework, and SVML library. This is in support of the follow-up: https://reviews.llvm.org/D59881 Patch by pjeeva01 Differential revision: https://reviews.llvm.org/D60211 llvm-svn: 358001	2019-04-09 13:21:11 +00:00
Nikita Popov	6e9157d588	[ValueTracking] Use ConstantRange methods; NFC Switch part of the computeOverflowForSignedAdd() implementation to use Range.isAllNegative() rather than KnownBits.isNegative() and similar. They do the same thing, but using the ConstantRange methods allows dropping the KnownBits variables more easily in D60420. llvm-svn: 357969	2019-04-09 07:13:09 +00:00
Nikita Popov	7bd7878d22	[ValueTracking] Explicitly specify intersection type; NFC Preparation for D60420. llvm-svn: 357968	2019-04-09 07:13:03 +00:00
Nikita Popov	3db93ac5d6	Reapply [ValueTracking] Support min/max selects in computeConstantRange() Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This fixes an infinite InstCombine loop, with the test case taken from D59378. Relative to the previous iteration, this contains some adjustments for AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen, which ends up constant folding some existing med3 tests after this change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option is added, which allows disabling scalar IR passes (that use InstSimplify) for testing purposes. Differential Revision: https://reviews.llvm.org/D59506 llvm-svn: 357870	2019-04-07 17:22:16 +00:00
Guozhi Wei	36fc9c3107	[LCG] Add aliased functions as LCG roots Current LCG doesn't check aliased functions. So if an internal function has a public alias it will not be added to CG SCC, but it is still reachable from outside through the alias. So this patch adds aliased functions to SCC. Differential Revision: https://reviews.llvm.org/D59898 llvm-svn: 357795	2019-04-05 18:51:08 +00:00
Nick Lewycky	a6ed16c98f	An unreachable block may have a route to a reachable block, don't fast-path return that it can't. A block reachable from the entry block can't have any route to a block that's not reachable from the entry block (if it did, that route would make it reachable from the entry block). That is the intended performance optimization for isPotentiallyReachable. For the case where we ask whether an unreachable from entry block has a route to a reachable from entry block, we can't conclude one way or the other. Fix a bug where we claimed there could be no such route. The fix in rL357425 ironically reintroduced the very bug it was fixing but only when a DominatorTree is provided. This fixes the remaining bug. llvm-svn: 357734	2019-04-04 23:09:40 +00:00
Evandro Menezes	85bd3978ae	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Evandro Menezes	7c711ccf36	[IR] Create new method in `Function` class (NFC) Create method `optForNone()` testing for the function level equivalent of `-O0` and refactor appropriately. Differential revision: https://reviews.llvm.org/D59852 llvm-svn: 357638	2019-04-03 21:27:03 +00:00
Alon Zakai	b4f9991f38	[WebAssembly] Add Emscripten OS definition + small_printf The Emscripten OS provides a definition of __EMSCRIPTEN__, and also that it supports iprintf optimizations. Also define small_printf optimizations, which is a printf with float support but not long double (which in wasm can be useful since long doubles are 128 bit and force linking of float128 emulation code). This part is based on sunfish's https://reviews.llvm.org/D57620 (which can't land yet since the WASI integration isn't ready yet). Differential Revision: https://reviews.llvm.org/D60167 llvm-svn: 357552	2019-04-03 01:08:35 +00:00
Matt Arsenault	03e7492876	InstSimplify: Fold round intrinsics from sitofp/uitofp https://godbolt.org/z/gEMRZb llvm-svn: 357549	2019-04-03 00:25:06 +00:00
Philip Reames	d3d5d76a7b	[WideableCond] Fix a nasty bug in detection of "explicit guards" The code was failing to actually check for the presence of the call to widenable_condition. The whole point of specifying the widenable_condition intrinsic was allowing widening transforms. A normal branch is not widenable. A normal branch leading to a deopt is not widenable (in general). I added a test case via LoopPredication, but GuardWidening has an analogous bug. Those are the only two passes actually using this utility just yet. Noticed while working on LoopPredication for non-widenable branches; POC in D60111. llvm-svn: 357493	2019-04-02 16:51:43 +00:00
Nick Lewycky	c0ebfbe3f3	Add an optional list of blocks to avoid when looking for a path in isPotentiallyReachable. The leads to some ambiguous overloads, so update three callers. Differential Revision: https://reviews.llvm.org/D60085 llvm-svn: 357447	2019-04-02 01:05:48 +00:00
Nick Lewycky	66d7eb9704	Not all blocks are reachable from entry. Don't assume they are. Fixes a bug in isPotentiallyReachable, noticed by inspection. llvm-svn: 357425	2019-04-01 20:03:16 +00:00
Alina Sbirlea	c8d6e0496d	[MemorySSA] Temporary fix assert when reaching 0 limit. llvm-svn: 357327	2019-03-29 22:55:59 +00:00
Sanjoy Das	53a5952a93	Try to fix buildbot error Error is: llvm/lib/Analysis/ScalarEvolution.cpp:3534:10: error: chosen constructor is explicit in copy-initialization return {UniqueSCEVs.FindNodeOrInsertPos(ID, IP), std::move(ID), IP}; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/bin/../lib/gcc/aarch64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here constexpr tuple(_UElements&&... __elements) ^ 1 error generated. llvm-svn: 357324	2019-03-29 22:27:10 +00:00
Sanjoy Das	32fd32bc6f	[SCEV] Check the cache in get{S\|U}MaxExpr before doing any work Summary: This lets us avoid e.g. checking if A >=s B in getSMaxExpr(A, B) if we've already established that (A smax B) is the best we can do. Fixes PR41225. Reviewers: asbirlea Subscribers: mcrosier, jlebar, bixia, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60010 llvm-svn: 357320	2019-03-29 22:00:12 +00:00
Alina Sbirlea	f085cc5aa7	[MemorySSA] Limit clobber walks. Summary: This patch limits all getClobberingMemoryAccess() walks to MaxCheckLimit. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59569 llvm-svn: 357319	2019-03-29 21:56:09 +00:00
Alina Sbirlea	e589067e61	[MemorySSA] Don't optimize incomplete phis. Summary: MemoryPhis cannot be optimized out until they are complete. Resolves PR41254. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59966 llvm-svn: 357315	2019-03-29 21:16:31 +00:00
Mircea Trofin	b27d0fd0bf	[llvm][NFC] Factor out logic for getting incoming & back Loop edges Reviewers: davidxl Reviewed By: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59967 llvm-svn: 357284	2019-03-29 17:39:17 +00:00
Florian Hahn	9b41a7320d	Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Updated to use DenseMap::insert instead of [] operator for insertion, to avoid a crash caused by epoch checks. This reverts commit `2b85de4383`. llvm-svn: 357257	2019-03-29 14:10:24 +00:00
Florian Hahn	2b85de4383	Revert Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Another buildbot failure http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20402 clang-9: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/llvm/include/llvm/ADT/DenseMap.h:1228: llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type* llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::operator->() const [with KeyT = const llvm::Instruction; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<const llvm::Instruction>; Bucket = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; bool IsConst = false; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::pointer = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>]: Assertion `isHandleInSync() && "invalid iterator access!"' failed. 0. Program arguments: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/bin/clang-9 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name ArchiveCommandLine.cpp -mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu skylake-avx512 -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -coverage-notes-file /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip/Output/ArchiveCommandLine.llvm.gcno -resource-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0 -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/include -I ../../../include -D _GNU_SOURCE -D __STDC_LIMIT_MACROS -D NDEBUG -D BREAK_HANDLER -D UNICODE -D _UNICODE -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/C -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/myWindows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/include_windows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP -I . -D _FILE_OFFSET_BITS=64 -D _LARGEFILE_SOURCE -D NDEBUG -D _REENTRANT -D ENV_UNIX -D _7ZIP_LARGE_PAGES -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/backward -internal-isystem /usr/local/include -internal-isystem /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=gnu++98 -fdeprecated-macro -fdebug-compilation-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -ferror-limit 19 -fmessage-length 0 -pthread -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o Output/ArchiveCommandLine.llvm.o -x c++ /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/7zip/UI/Common/ArchiveCommandLine.cpp -faddrsig This reverts r357222 (git commit `64cccfcc72`) llvm-svn: 357227	2019-03-29 00:22:26 +00:00
Florian Hahn	64cccfcc72	Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Recommitting after addressing a buildbot failure. This reverts commit `c87869ebea`. llvm-svn: 357222	2019-03-28 23:11:00 +00:00
Florian Hahn	c87869ebea	Revert [DSE] Preserve basic block ordering using OrderedBasicBlock. This reverts r357208 (git commit `c0bfd37d38`) This causes a buildbot failure: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/16124 FAILED: lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/install/stage2/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/IR -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR -Iinclude -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -flto=thin -O3 -UNDEBUG -fno-exceptions -fno-rtti -MD -MT lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -MF lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o.d -o lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -c /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR/IRBuilder.cpp clang-9: /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/Analysis/OrderedBasicBlock.cpp:38: bool llvm::OrderedBasicBlock::comesBefore(const llvm::Instruction , const llvm::Instruction ): Assertion `!(LastInstFound == BB->end() && NextInstPos != 0) && "Instruction supposed to be in NumberedInsts"' failed. llvm-svn: 357211	2019-03-28 20:36:24 +00:00
Florian Hahn	c0bfd37d38	[DSE] Preserve basic block ordering using OrderedBasicBlock. By extending OrderedBB to allow removing and replacing cached instructions, we can preserve OrderedBBs in DSE easily. This eliminates one source of quadratic compile time in DSE. Fixes PR38829. Reviewers: rnk, efriedma, hfinkel Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59789 llvm-svn: 357208	2019-03-28 20:02:33 +00:00
Florian Hahn	6c3024368c	[MemDepAnalysis] Allow caller to pass in an OrderedBasicBlock. If the caller can preserve the OBB, we can avoid recomputing the order for each getDependency call. Reviewers: efriedma, rnk, hfinkel Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D59788 llvm-svn: 357206	2019-03-28 19:17:31 +00:00
Chandler Carruth	923ff550b9	[NewPM] Fix a nasty bug with analysis invalidation in the new PM. The issue here is that we actually allow CGSCC passes to mutate IR (and therefore invalidate analyses) outside of the current SCC. At a minimum, we need to support mutating parent and ancestor SCCs to support the ArgumentPromotion pass which rewrites all calls to a function. However, the analysis invalidation infrastructure is heavily based around not needing to invalidate the same IR-unit at multiple levels. With Loop passes for example, they don't invalidate other Loops. So we need to customize how we handle CGSCC invalidation. Doing this without gratuitously re-running analyses is even harder. I've avoided most of these by using an out-of-band preserved set to accumulate the cross-SCC invalidation, but it still isn't perfect in the case of re-visiting the same SCC repeatedly but it coming off the worklist. Unclear how important this use case really is, but I wanted to call it out. Another wrinkle is that in order for this to successfully propagate to function analyses, we have to make sure we have a proxy from the SCC to the Function level. That requires pre-creating the necessary proxy. The motivating test case now works cleanly and is added for ArgumentPromotion. Thanks for the review from Philip and Wei! Differential Revision: https://reviews.llvm.org/D59869 llvm-svn: 357137	2019-03-28 00:51:36 +00:00
Hans Wennborg	5519cb2d94	Fix the build with GCC 4.8 after r356783 llvm-svn: 356875	2019-03-25 09:27:42 +00:00
Nikita Popov	977934f00f	[ConstantRange] Add getFull() + getEmpty() named constructors; NFC This adds ConstantRange::getFull(BitWidth) and ConstantRange::getEmpty(BitWidth) named constructors as more readable alternatives to the current ConstantRange(BitWidth, /* full */ false) and similar. Additionally private getFull() and getEmpty() member functions are added which return a full/empty range with the same bit width -- these are commonly needed inside ConstantRange.cpp. The IsFullSet argument in the ConstantRange(BitWidth, IsFullSet) constructor is now mandatory for the few usages that still make use of it. Differential Revision: https://reviews.llvm.org/D59716 llvm-svn: 356852	2019-03-24 09:34:40 +00:00
Nikita Popov	280a6b01c8	[ValueTracking] Avoid redundant known bits calculation in computeOverflowForSignedAdd() We're already computing the known bits of the operands here. If the known bits of the operands can determine the sign bit of the result, we'll already catch this in signedAddMayOverflow(). The only other way (and as the comment already indicates) we'll get new information from computing known bits on the whole add, is if there's an assumption on it. As such, we change the code to only compute known bits from assumptions, instead of computing full known bits on the add (which would unnecessarily recompute the known bits of the operands as well). Differential Revision: https://reviews.llvm.org/D59473 llvm-svn: 356785	2019-03-22 17:51:40 +00:00
Alina Sbirlea	bfc779e491	[AliasAnalysis] Second prototype to cache BasicAA / anyAA state. Summary: Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it. AA changes: - This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode". - All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged. - AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added. MemorySSA changes: - All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults). - At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults. - All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built. - The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA. - All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis. Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59315 llvm-svn: 356783	2019-03-22 17:22:19 +00:00
Bixia Zheng	bdf0230cff	[ConstantFolding] Fix GetConstantFoldFPValue to avoid cast overflow. Summary: In C++, the behavior of casting a double value that is beyond the range of a single precision floating-point to a float value is undefined. This change replaces such a cast with APFloat::convert to convert the value, which is consistent with how we convert a double value to a half value. Reviewers: sanjoy Subscribers: lebedev.ri, sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59500 llvm-svn: 356781	2019-03-22 16:37:37 +00:00
Craig Topper	9f0b17a248	[ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics. This adds support for scalarizing these intrinsics as well the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle. I've omitted handling special cases for constant masks for this first pass. Though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway. Fixes PR40994 and is covers most of PR3666. Might want to implement constant masks to close that. Differential Revision: https://reviews.llvm.org/D59180 llvm-svn: 356687	2019-03-21 17:38:52 +00:00
Nikita Popov	3af5b28f47	[ValueTracking] Use ConstantRange based overflow check for signed sub This is D59450, but for signed sub. This case is not NFC, because the overflow logic in ConstantRange is more powerful than the existing check. This resolves the TODO in the function. I've added two tests to show that this indeed catches more cases than the previous logic, but the main correctness test coverage here is in the existing ConstantRange unit tests. Differential Revision: https://reviews.llvm.org/D59617 llvm-svn: 356685	2019-03-21 17:23:51 +00:00
Fangrui Song	ebfb7852be	[BasicAA] Use DenseMap::try_emplace after D59151. NFC llvm-svn: 356651	2019-03-21 08:47:40 +00:00
Alina Sbirlea	4fdbd822fc	[BasicAA] Reduce no of map seaches [NFCI]. Summary: This is a refactoring patch. - Reduce the number of map searches by reusing the iterator. - Add asserts to check that the entry is in the cache, as this is something BasicAA relies on to avoid infinite recursion. Reviewers: chandlerc, aschwaighofer Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59151 llvm-svn: 356644	2019-03-21 05:02:05 +00:00
George Burgess IV	ae84e9ab49	[MSSA] Delete move ctor; remove dynamic never-moved verification Code archaeology in D59315 revealed that MSSA should never be moved. Rather than trying to check dynamically that this hasn't happened in the verify() functions of Walkers, it's likely best to just delete its move constructor. Since all these verify() functions did is check that MSSA hasn't moved, this allows us to remove these verify functions. I can readd the verification checks if someone's super concerned about us trying to `memcpy` MemorySSA or something somewhere, but I imagine we have other problems if we're trying anything like that... llvm-svn: 356641	2019-03-21 03:11:34 +00:00
Nikita Popov	00b5ecab5d	[ValueTracking] Compute range for abs without nsw This is a small followup to D59511. The code that was moved into computeConstantRange() there is a bit overly conversative: If the abs is not nsw, it does not compute any range. However, abs without nsw still has a well-defined contiguous unsigned range from 0 to SIGNED_MIN. This is a lot less useful than the usual 0 to SIGNED_MAX range, but if we're already here we might as well specify it... Differential Revision: https://reviews.llvm.org/D59563 llvm-svn: 356586	2019-03-20 18:16:02 +00:00
Nikita Popov	208381953b	[ValueTracking] Use computeConstantRange() for unsigned add/sub overflow Improve computeOverflowForUnsignedAdd/Sub in ValueTracking by intersecting the computeConstantRange() result into the ConstantRange created from computeKnownBits(). This allows us to detect some additional never/always overflows conditions that can't be determined from known bits. This revision also adds basic handling for constants to computeConstantRange(). Non-splat vectors will be handled in a followup. The signed case will also be handled in a followup, as it needs some more groundwork. Differential Revision: https://reviews.llvm.org/D59386 llvm-svn: 356489	2019-03-19 17:53:56 +00:00
Simon Pilgrim	a56f2822d0	[SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468	2019-03-19 16:24:55 +00:00
Simon Pilgrim	8ee477a2ab	[InstSimplify] SimplifyICmpInst - icmp eq/ne %X, undef -> undef As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand: When the other operand is constant we fold to undef (handled in ConstantFoldCompareInstruction) When the other operand is non-constant we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst). Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef. The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968. Differential Revision: https://reviews.llvm.org/D59541 llvm-svn: 356456	2019-03-19 14:08:23 +00:00
Nikita Popov	3e9770d2dc	Revert "[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()" This reverts commit `106f0cdefb`. This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests. llvm-svn: 356424	2019-03-18 22:26:27 +00:00
Nikita Popov	106f0cdefb	[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange() Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This was suggested by spatel as an alternative approach to D59378. I've also added the infinite looping test from that revision here. Differential Revision: https://reviews.llvm.org/D59506 llvm-svn: 356415	2019-03-18 21:35:19 +00:00
Nikita Popov	f89343bc47	[ValueTracking][InstSimplify] Move abs handling into computeConstantRange(); NFC This is preparation for D59506. The InstructionSimplify abs handling is moved into computeConstantRange(), which is the general place for such calculations. This is NFC and doesn't affect the existing tests in test/Transforms/InstSimplify/icmp-abs-nabs.ll. Differential Revision: https://reviews.llvm.org/D59511 llvm-svn: 356409	2019-03-18 21:20:03 +00:00
Warren Ristow	ad7d0ded2e	[SCEV] Guard movement of insertion point for loop-invariants This reinstates r347934, along with a tweak to address a problem with PHI node ordering that that commit created (or exposed). (That commit was reverted at r348426, due to the PHI node issue.) Original commit message: r320789 suppressed moving the insertion point of SCEV expressions with dev/rem operations to the loop header in non-loop-invariant situations. This, and similar, hoisting is also unsafe in the loop-invariant case, since there may be a guard against a zero denominator. This is an adjustment to the fix of r320789 to suppress the movement even in the loop-invariant case. This fixes PR30806. Differential Revision: https://reviews.llvm.org/D57428 llvm-svn: 356392	2019-03-18 18:52:35 +00:00
Nikita Popov	322e2dbee1	[ValueTracking] Use ConstantRange overflow check for signed add; NFC This is the same change as rL356290, but for signed add. It replaces the existing ripple logic with the overflow logic in ConstantRange. This is NFC in that it should return NeverOverflow in exactly the same cases as the previous implementation. However, it does make computeOverflowForSignedAdd() more powerful by now also determining AlwaysOverflows conditions. As none of its consumers handle this yet, this has no impact on optimization. Making use of AlwaysOverflows in with.overflow folding will be handled as a followup. Differential Revision: https://reviews.llvm.org/D59450 llvm-svn: 356345	2019-03-17 21:25:26 +00:00
Nikita Popov	ef2d979943	[ConstantRange] Add fromKnownBits() method Following the suggestion in D59450, I'm moving the code for constructing a ConstantRange from KnownBits out of ValueTracking, which also allows us to test this code independently. I'm adding this method to ConstantRange rather than KnownBits (which would have been a bit nicer API wise) to avoid creating a dependency from Support to IR, where ConstantRange lives. Differential Revision: https://reviews.llvm.org/D59475 llvm-svn: 356339	2019-03-17 20:24:02 +00:00
Nikita Popov	614b1bea97	[ValueTracking] Use ConstantRange overflow checks for unsigned add/sub; NFC Use the methods introduced in rL356276 to implement the computeOverflowForUnsigned(Add\|Sub) functions in ValueTracking, by converting the KnownBits into a ConstantRange. This is NFC: The existing KnownBits based implementation uses the same logic as the the ConstantRange based one. This is not the case for the signed equivalents, so I'm only changing unsigned here. This is in preparation for D59386, which will also intersect the computeConstantRange() result into the range determined from KnownBits. llvm-svn: 356290	2019-03-15 18:37:45 +00:00
Teresa Johnson	70ec64cb72	[ThinLTO] Restructure AliasSummary to contain ValueInfo of Aliasee Summary: The AliasSummary previously contained the AliaseeGUID, which was only populated when reading the summary from bitcode. This patch changes it to instead hold the ValueInfo of the aliasee, and always populates it. This enables more efficient access to the ValueInfo (specifically in the recent patch r352438 which needed to perform an index hash table lookup using the aliasee GUID). As noted in the comments in AliasSummary, we no longer technically need to keep a pointer to the corresponding aliasee summary, since it could be obtained by walking the list of summaries on the ValueInfo looking for the summary in the same module. However, I am concerned that this would be inefficient when walking through the index during the thin link for various analyses. That can be reevaluated in the future. By always populating this new field, we can remove the guard and special handling for a 0 aliasee GUID when dumping the dot graph of the summary. An additional improvement in this patch is when reading the summaries from LLVM assembly we now set the AliaseeSummary field to the aliasee summary in that same module, which makes it consistent with the behavior when reading the summary from bitcode. Reviewers: pcc, mehdi_amini Subscribers: inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D57470 llvm-svn: 356268	2019-03-15 15:11:38 +00:00
Sanjay Patel	de1d5d3675	[InstCombine] canonicalize funnel shift constant shift amount to be modulo bitwidth The shift argument is defined to be modulo the bitwidth, so if that argument is a constant, we can always reduce the constant to its minimal form to allow better CSE and other follow-on transforms. We need to be careful to ignore constant expressions here, or we will likely infinite loop. I'm adding a general vector constant query for that case. Differential Revision: https://reviews.llvm.org/D59374 llvm-svn: 356192	2019-03-14 19:22:08 +00:00
Alina Sbirlea	2d7458a351	[MemorySSA] Remove redundant walker assignment [NFC]. Subscribers: llvm-commits llvm-svn: 356189	2019-03-14 18:45:17 +00:00
Teresa Johnson	4ab0a9f0a4	[SCEV] Use depth limit for trunc analysis Summary: This fixes an extremely long compile time caused by recursive analysis of truncs, which were not previously subject to any depth limits unlike some of the other ops. I decided to use the same control used for sext/zext, since the routines analyzing these are sometimes mutually recursive with the trunc analysis. Reviewers: mkazantsev, sanjoy Subscribers: sanjoy, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58994 llvm-svn: 355949	2019-03-12 18:28:05 +00:00
Sjoerd Meijer	31ff647c1d	[TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI. This is addressing the issue that we're not modeling the cost of clib functions in TTI::getIntrinsicCosts and thus we're basically addressing this fixme: // FIXME: This is wrong for libc intrinsics. To enable analysis of clib functions, we not only need an intrinsic ID and formal arguments, but also the actual user of that function so that we can e.g. look at alignment and values of arguments. So, this is the initial plumbing to pass the user of an intrinsinsic on to getCallCosts, which queries getIntrinsicCosts. Differential Revision: https://reviews.llvm.org/D59014 llvm-svn: 355901	2019-03-12 09:48:02 +00:00
Sanjoy Das	3f5ce18658	Reland "Relax constraints for reduction vectorization" Change from original commit: move test (that uses an X86 triple) into the X86 subdirectory. Original description: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355889	2019-03-12 01:31:44 +00:00
Sanjoy Das	2136a5bc49	Revert "Relax constraints for reduction vectorization" This reverts commit r355868. Breaks hexagon. llvm-svn: 355873	2019-03-11 22:37:31 +00:00
Sanjoy Das	93f8cc186a	Relax constraints for reduction vectorization Summary: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355868	2019-03-11 21:36:41 +00:00
Nikita Popov	490975979b	[ValueTracking] Move constant range computation into ValueTracking; NFC InstructionSimplify currently has some code to determine the constant range of integer instructions for some simple cases. It is used to simplify icmps. This change moves the relevant code into ValueTracking as llvm::computeConstantRange(), so it can also be reused for other purposes. In particular this is with the optimization of overflow checks in mind (ref D59071), where constant ranges cover some cases that known bits don't. llvm-svn: 355781	2019-03-09 21:17:42 +00:00
Michael Kruse	65c5821e3f	[RegionPass] Fix forgotten "!". Commit r355068 "Fix IR/Analysis layering issue with OptBisect" uses the template return Gate.isEnabled() && !Gate.shouldRunPass(this, getDescription(...)); for all pass kinds. For the RegionPass, it left out the not operator, causing region passes to be skipped as soon as a pass gate is used. llvm-svn: 355733	2019-03-08 21:03:06 +00:00
George Burgess IV	4ea679f1f4	[CFLAnders] Fix typo in comment; NFC Patch by Enna1! Differential Revision: https://reviews.llvm.org/D58756 llvm-svn: 355715	2019-03-08 19:28:55 +00:00
Clement Courbet	8e16d73346	[SelectionDAG] Allow the user to specify a memeq function. Summary: Right now, when we encounter a string equality check, e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a small compile-time constant, and fall back on calling `memcmp()` else. This is sub-optimal because memcmp has to compute much more than equality. This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms that support `bcmp`. `bcmp` can be made much more efficient than `memcmp` because equality compare is trivially parallel while lexicographic ordering has a chain dependency. Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits Differential Revision: https://reviews.llvm.org/D56593 llvm-svn: 355672	2019-03-08 09:07:45 +00:00
Fangrui Song	9ade843ccb	[IDF] Delete a redundant J-edge test In the DJ-graph based computation of iterated dominance frontiers, SuccNode->getIDom() == Node is one of the tests to check if (Node,Succ) is a J-edge. If it is true, since Node is dominated by Root, SuccLevel = level(Node)+1 > RootLevel which means the next test SuccLevel > RootLevel will also be true. test the check is redundant and can be deleted as it also involves one indirection and provides no speed-up. llvm-svn: 355589	2019-03-07 11:42:59 +00:00
David Green	4511f3fa86	[SCEV] Ensure that isHighCostExpansion takes into account what is being divided A SCEV is not low-cost just because you can divide it by a power of 2. We need to also check what we are dividing to make sure it too is not a high-code expansion. This helps to not expand the exit value of certain loops, helping not to bloat the code. The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116, and looks a lot like the other tests in replace-loop-exit-folds.ll. Differential Revision: https://reviews.llvm.org/D58435 llvm-svn: 355393	2019-03-05 12:12:18 +00:00
Sanjoy Das	719e78631d	PHI nodes are not `FPMathOperator` s Reviewers: chandlerc, arsenm Reviewed By: arsenm Subscribers: wdng, arsenm, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58887 llvm-svn: 355362	2019-03-05 01:15:08 +00:00
Sanjay Patel	2a70703770	[ValueTracking] do not try to peek through bitcasts in computeKnownBitsFromAssume() There are no tests for this case, and I'm not sure how it could ever work, so I'm just removing this option from the matcher. This should fix PR40940: https://bugs.llvm.org/show_bug.cgi?id=40940 llvm-svn: 355292	2019-03-03 18:59:33 +00:00
Fangrui Song	5fa53d1593	[DemandedBits] Remove some redundancy in the work list InputIsKnownDead check is shared by all operands. Compute it once. For non-integer instructions, use Visited.insert(I).second to replace a find() and an insert(). llvm-svn: 355290	2019-03-03 14:50:01 +00:00
Fangrui Song	981f216d1d	[DemandedBits] Optimize a find()+insert pattern with try_emplace and APInt::operator\|= llvm-svn: 355284	2019-03-03 11:12:57 +00:00
Florian Hahn	98f11a7d75	[SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR. In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but MaxBECount are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) \|\| !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259	2019-03-02 02:31:44 +00:00
Florian Hahn	3c7e92b5d6	[SCEV] Remove undef check for SCEVConstant (NFC) The value stored in SCEVConstant is of type ConstantInt*, which can never be UndefValue. So we should never hit that code. Reviewers: mkazantsev, sanjoy Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58851 llvm-svn: 355257	2019-03-02 01:57:28 +00:00
Nikita Popov	ed3ca9272f	[ValueTracking] Known bits support for unsigned saturating add/sub We have two sources of known bits: 1. For adds leading ones of either operand are preserved. For sub leading zeros of LHS and leading ones of RHS become leading zeros in the result. 2. The saturating math is a select between add/sub and an all-ones/ zero value. As such we can carry out the add/sub known bits calculation, and only preseve the known one/zero bits respectively. Differential Revision: https://reviews.llvm.org/D58329 llvm-svn: 355223	2019-03-01 20:07:04 +00:00
Jonas Hahnfeld	e071cd86df	Hide two unused debugging methods, NFCI. GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in Release builds. Hide them behind 'ifndef NDEBUG'. llvm-svn: 355205	2019-03-01 17:15:21 +00:00
Rong Xu	a6ff69f6dd	[PGO] Context sensitive PGO (part 2) Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355131	2019-02-28 19:55:07 +00:00
Nikita Popov	af2b0bef43	[ValueTracking] More accurate unsigned sub overflow detection Second part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a - b overflows iff a < b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and b subject to the known bits constraint. llvm-svn: 355109	2019-02-28 18:04:20 +00:00
Bjorn Pettersson	d30f308a9f	Add support for computing "zext of value" in KnownBits. NFCI Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099	2019-02-28 15:45:29 +00:00
Nikita Popov	6c57395fb4	[ValueTracking] More accurate unsigned add overflow detection Part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a + b overflows iff a > ~b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and ~b subject to the known bits constraint. llvm-svn: 355072	2019-02-28 08:11:20 +00:00
Richard Trieu	b37a70f40e	Fix IR/Analysis layering issue with OptBisect OptBisect is in IR due to LLVMContext using it. However, it uses IR units from Analysis as well. This change moves getDescription functions from OptBisect to their respective IR units. Generating names for IR units will now be up to the callers, keeping the Analysis IR units in Analysis. To prevent unnecessary string generation, isEnabled function is added so that callers know when the description needs to be generated. Differential Revision: https://reviews.llvm.org/D58406 llvm-svn: 355068	2019-02-28 04:00:55 +00:00
Alina Sbirlea	fcfa7c5f92	[MemorySSA] Make insertDef insert corresponding phi nodes. Summary: The original assumption for the insertDef method was that it would not materialize Defs out of no-where, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of no-where". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this usecase does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave corectly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040	2019-02-27 22:20:22 +00:00
Sanjay Patel	9dada83d6c	[InstSimplify] remove zero-shift-guard fold for general funnel shift As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html We can't remove the compare+select in the general case because we are treating funnel shift like a standard instruction (as opposed to a special instruction like select/phi). That means that if one of the operands of the funnel shift is poison, the result is poison regardless of whether we know that the operand is actually unused based on the instruction's particular semantics. The motivating case for this transform is the more specific rotate op (rather than funnel shift), and we are preserving the fold for that case because there is no chance of introducing extra poison when there is no anonymous extra operand to the funnel shift. llvm-svn: 354905	2019-02-26 18:26:56 +00:00
Simon Pilgrim	a066f1f9e6	[Vectorizer] Add vectorization support for fixed smul/umul intrinsics This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790	2019-02-25 15:42:02 +00:00
Jordan Rupprecht	6387fa2715	[NFC] Fix typos: preceeding -> preceding llvm-svn: 354715	2019-02-23 01:28:32 +00:00
Chijun Sima	70e97163e0	[DTU] Refine the interface and logic of applyUpdates Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669	2019-02-22 13:48:38 +00:00
Sanjay Patel	68171e3cd6	[InstSimplify] use any-zero matcher for fcmp folds The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 The previous attempt at this in rL354406 had a logic bug that actually triggered a regression test failure, but I failed to notice it the first time. llvm-svn: 354467	2019-02-20 14:34:00 +00:00
Sanjay Patel	49f97395ab	Revert "[InstSimplify] use any-zero matcher for fcmp folds" This reverts commit `058bb83513`. Forgot to update another test affected by this change. llvm-svn: 354408	2019-02-20 00:20:38 +00:00
Sanjay Patel	058bb83513	[InstSimplify] use any-zero matcher for fcmp folds The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 llvm-svn: 354406	2019-02-20 00:09:50 +00:00
Sam Parker	0b53e8454b	[BPI] Look through bitcasts in calcZeroHeuristic Constant hoisting may have hidden a constant behind a bitcast so that it isn't folded into its users. However, this prevents BPI from calculating some of its heuristics that are based upon constant values. So, I've added a simple helper function to look through these casts. Differential Revision: https://reviews.llvm.org/D58166 llvm-svn: 354119	2019-02-15 11:50:21 +00:00
Nick Desaulniers	6a84cd3b8e	Revert "[INLINER] allow inlining of address taken blocks" This reverts commit `19e95fe611`. llvm-svn: 354082	2019-02-14 23:42:21 +00:00
Nick Desaulniers	19e95fe611	[INLINER] allow inlining of address taken blocks as long as their uses does not contain calls to functions that capture the argument (potentially allowing the blockaddress to "escape" the lifetime of the caller). TODO: - add more tests - fix crash in llvm::updateCGAndAnalysisManagerForFunctionPass when invoking Transforms/Inline/blockaddress.ll llvm-svn: 354079	2019-02-14 23:35:53 +00:00
Max Kazantsev	24383cd7bb	Make widenable condition transparent for MemoryWriteTracking Side effects of widenable condition intrinsic are modelled via InaccessibleMemOnly, and there is no way to say that it isn't really writing any memory. This patch teaches MemoryWriteTracking ignore this intrinsic. llvm-svn: 354021	2019-02-14 11:10:29 +00:00
Max Kazantsev	b3168a400f	Teach isGuaranteedToTransferExecutionToSuccessor about widenable conditions Widenable condition intrinsic is guaranteed to return value, notify the isGuaranteedToTransferExecutionToSuccessor function about it. llvm-svn: 354020	2019-02-14 11:10:21 +00:00
Max Kazantsev	4a1c02987e	[NFC] Simplify code & reduce nest slightly llvm-svn: 353832	2019-02-12 11:31:46 +00:00
Evandro Menezes	f4a369596f	[TargetLibraryInfo] Update run time support for Windows It seems that, since VC19, the `float` C99 math functions are supported for all targets, unlike the C89 ones. According to the discussion at https://reviews.llvm.org/D57625. llvm-svn: 353758	2019-02-11 22:12:01 +00:00
Alina Sbirlea	d77edc00a8	[MemorySSA] Remove verifyClobberSanity. Summary: This verification may fail after certain transformations due to BasicAA's fragility. Added a small explanation and a testcase that triggers the assert in checkClobberSanity (before its removal). Addresses PR40509. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, llvm-commits, Prazek Tags: #llvm Differential Revision: https://reviews.llvm.org/D57973 llvm-svn: 353739	2019-02-11 19:51:21 +00:00
Michael Kruse	77a614a6e1	Refactor setAlreadyUnrolled() and setAlreadyVectorized(). Loop::setAlreadyUnrolled() and LoopVectorizeHints::setLoopAlreadyUnrolled() both add loop metadata that stops the same loop from being transformed multiple times. This patch merges both implementations. In doing so we fix 3 potential issues: * setLoopAlreadyUnrolled() kept the llvm.loop.vectorize/interleave.* metadata even though it will not be used anymore. This already caused problems such as http://llvm.org/PR40546. Change the behavior to the one of setAlreadyUnrolled which deletes this loop metadata. * setAlreadyUnrolled() used to create a new LoopID by calling MDNode::get with nullptr as the first operand, then replacing it by the returned references using replaceOperandWith. It is possible that MDNode::get would instead return an existing node (due to de-duplication) that then gets modified. To avoid, use a fresh TempMDNode that does not get uniqued with anything else before replacing it with replaceOperandWith. * LoopVectorizeHints::matchesHintMetadataName() only compares the suffix of the attribute to set the new value for. That is, when called with "enable", would erase attributes such as "llvm.loop.unroll.enable", "llvm.loop.vectorize.enable" and "llvm.loop.distribute.enable" instead of the one to replace. Fortunately, function was only called with "isvectorized". Differential Revision: https://reviews.llvm.org/D57566 llvm-svn: 353738	2019-02-11 19:45:44 +00:00
Evandro Menezes	4b86c474ff	[TargetLibraryInfo] Update run time support for Windows It seems that the run time for Windows has changed and supports more math functions than it used to, especially on AArch64, ARM, and AMD64. Fixes PR40541. Differential revision: https://reviews.llvm.org/D57625 llvm-svn: 353733	2019-02-11 19:02:28 +00:00
Chandler Carruth	9beadff6a5	Move CFLGraph and the AA summary code over to the new `CallBase` instruction base class rather than the `CallSite` wrapper. llvm-svn: 353676	2019-02-11 09:25:41 +00:00
Chandler Carruth	2d2a4359a2	Remove `CallSite` from the CodeMetrics analysis, moving it to the new `CallBase` and simpler APIs therein. llvm-svn: 353673	2019-02-11 09:03:32 +00:00
Chandler Carruth	dac20a8254	[CallSite removal] Port InstSimplify over to use `CallBase` both in its interface and implementation. Port code with: `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353662	2019-02-11 07:54:10 +00:00
Chandler Carruth	751d95fb9b	[CallSite removal] Migrate ConstantFolding APIs and implementation to `CallBase`. Users have been updated. You can see how to update any out-of-tree usages: pass `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353661	2019-02-11 07:51:44 +00:00
Craig Topper	784929d045	Implementation of asm-goto support in LLVM This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563	2019-02-08 20:48:56 +00:00
Sergey Dmitriev	807960e6ef	[CodeExtractor] Update function's assumption cache after extracting blocks from it Summary: Assumption cache's self-updating mechanism does not correctly handle the case when blocks are extracted from the function by the CodeExtractor. As a result function's assumption cache may have stale references to the llvm.assume calls that were moved to the outlined function. This patch fixes this problem by removing extracted llvm.assume calls from the function’s assumption cache. Reviewers: hfinkel, vsk, fhahn, davidxl, sanjoy Reviewed By: hfinkel, vsk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57215 llvm-svn: 353500	2019-02-08 06:55:18 +00:00
Sam Parker	67756c09f2	[LSR] Generate cross iteration indexes Modify GenerateConstantOffsetsImpl to create offsets that can be used by indexed addressing modes. If formulae can be generated which result in the constant offset being the same size as the recurrence, we can generate a pre-indexed access. This allows the pointer to be updated via the single pre-indexed access so that (hopefully) no add/subs are required to update it for the next iteration. For small cores, this can significantly improve performance DSP-like loops. Differential Revision: https://reviews.llvm.org/D55373 llvm-svn: 353403	2019-02-07 13:32:54 +00:00
Alina Sbirlea	6cba96ed52	[LICM/MSSA] Add promotion to scalars by building an AliasSetTracker with MemorySSA. Summary: Experimentally we found that promotion to scalars carries less benefits than sinking and hoisting in LICM. When using MemorySSA, we build an AliasSetTracker on demand in order to reuse the current infrastructure. We only build it if less than AccessCapForMSSAPromotion exist in the loop, a cap that is by default set to 250. This value ensures there are no runtime regressions, and there are small compile time gains for pathological cases. A much lower value (20) was found to yield a single regression in the llvm-test-suite and much higher benefits for compile times. Conservatively we set the current cap to a high value, but we will explore lowering it when MemorySSA is enabled by default. Reviewers: sanjoy, chandlerc Subscribers: nemanjai, jlebar, Prazek, george.burgess.iv, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D56625 llvm-svn: 353339	2019-02-06 20:25:17 +00:00
Alina Sbirlea	910c6bef3e	[AliasSetTracker] Pass MustAlias to addPointer more often. Summary: Pass the alias info to addPointer when available. Will save an alias() call for must sets when adding a known Must or May alias. [Part of a series of cleanup patches] Reviewers: reames, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D56613 llvm-svn: 353335	2019-02-06 19:55:12 +00:00
Philip Reames	00ae46ba52	[AliasSetTracker] Minor style tweak to avoid a variable w/two distinct live ranges [NFC] llvm-svn: 353267	2019-02-06 03:46:40 +00:00
Richard Trieu	5f436fc57a	Move DomTreeUpdater from IR to Analysis DomTreeUpdater depends on headers from Analysis, but is in IR. This is a layering violation since Analysis depends on IR. Relocate this code from IR to Analysis to fix the layering violation. llvm-svn: 353265	2019-02-06 02:52:52 +00:00
Alina Sbirlea	b9c1bc6d3c	[BasicAA] Cache nonEscapingLocalObjects for alias() calls. Summary: Use a small cache for Values tested by nonEscapingLocalObject(). Since the calls to PointerMayBeCaptured are fairly expensive, this saves a good amount of compile time for anything relying heavily on BasicAA.alias() calls. This uses the same approach as the AliasCache, i.e. the cache is reset after each alias() call. The cache is not used or updated by modRefInfo calls since it's harder to know when to reset the cache. Testcases that show improvements with this patch are too large to include. Example compile time improvement: 7s to 6s. Reviewers: chandlerc, sunfish Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57627 llvm-svn: 353245	2019-02-05 23:52:08 +00:00
Evandro Menezes	e5bb58b115	[TargetLibraryInfo] Regroup run time functions for Windows (NFC) Regroup supported and unsupported functions by precision and C standard. llvm-svn: 353213	2019-02-05 20:24:21 +00:00
Hiroshi Inoue	02a2bb2f54	[NFC] fix trivial typos in comments llvm-svn: 353147	2019-02-05 08:30:48 +00:00
Evandro Menezes	98f356cd74	Revert "[PATCH] [TargetLibraryInfo] Update run time support for Windows" This reverts accidental commit `ff5527718d`. llvm-svn: 353118	2019-02-04 23:34:50 +00:00
Evandro Menezes	ff5527718d	[PATCH] [TargetLibraryInfo] Update run time support for Windows It seems that the run time for Windows has changed and supports more math functions than before. Since LLVM requires at least VS2015, I assume that this is the run time that would be redistributed with programs built with Clang. Thus, I based this update on the header file `math.h` that accompanies it. This patch addresses the PR40541. Unfortunately, I have no access to a Windows development environment to validate it. llvm-svn: 353114	2019-02-04 23:29:41 +00:00
David Callahan	fd3e7a9320	Adjust cardinality of internal inliner thresholds Summary: While compiling openJDK11 (also other workloads), some make files would pass both CFLAGS and LDFLAGS at link step ; resulting in duplicate options on the command line when one is using LTO and trying to influence the inliner. Most of the internal flags are ZeroOrMore, this diff changes the remaining ones. Reviewers: david2050, twoh, modocache Reviewed By: twoh Subscribers: mehdi_amini, dexonsmith, eraman, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57537 Patch by: Abdoul-Kader Keita llvm-svn: 353071	2019-02-04 18:46:25 +00:00
Max Kazantsev	437ee05885	[SCEV] Do not bother creating separate SCEVUnknown for unreachable nodes Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we have a huge amounts of such code, we will be littering SE with these nodes. We could just state that they all are undef and save some memory. Differential Revision: https://reviews.llvm.org/D57567 Reviewed By: sanjoy llvm-svn: 353017	2019-02-04 05:04:19 +00:00
Philip Pfaffe	9438585fe4	[DA][NewPM] Handle transitive dependencies in the new-pm version of DA Summary: The analysis result of DA caches pointers to AA, SCEV, and LI, but it never checks for their invalidation. Fix that. Reviewers: chandlerc, dmgreen, bogner Reviewed By: dmgreen Subscribers: hiraditya, bollu, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D56381 llvm-svn: 352986	2019-02-03 12:25:41 +00:00
Dmitry Venikov	aaa709f2ec	[InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == x, log2(pow(2.0,x)) == x Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D41940 llvm-svn: 352981	2019-02-03 03:48:30 +00:00
Yevgeny Rouban	15b17d0a7c	Provide reason messages for unviable inlining InlineCost's isInlineViable() is changed to return InlineResult instead of bool. This provides messages for failure reasons and allows to get more specific messages for cases where callsites are not viable for inlining. Reviewed By: xbolva00, anemet Differential Revision: https://reviews.llvm.org/D57089 llvm-svn: 352849	2019-02-01 10:44:43 +00:00
Alina Sbirlea	240a90a57e	[MemorySSA] Extend removeMemoryAccess API to optimize MemoryPhis. Summary: EarlyCSE needs to optimize MemoryPhis after an access is removed and has special handling for it. This should be handled by MemorySSA instead. The default remains that MemoryPhis are not optimized after an access is removed. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D57199 llvm-svn: 352787	2019-01-31 20:13:47 +00:00
Yevgeny Rouban	ae29857d64	Test commit. NFCI. llvm-svn: 352738	2019-01-31 08:49:20 +00:00
Max Kazantsev	b37419ef66	[SCEV] Prohibit SCEV transformations for huge SCEVs Currently SCEV attempts to limit transformations so that they do not work with big SCEVs (that may take almost infinite compile time). But for this, it uses heuristics such as recursion depth and number of operands, which do not give us a guarantee that we don't actually have big SCEVs. This situation is still possible, though it is not likely to happen. However, the bug PR33494 showed a bunch of simple corner case tests where we still produce huge SCEVs, even not reaching big recursion depth etc. This patch introduces a concept of 'huge' SCEVs. A SCEV is huge if its expression size (intoduced in D35989) exceeds some threshold value. We prohibit optimizing transformations if any of SCEVs we are dealing with is huge. This gives us a reliable check that we don't spend too much time working with them. As the next step, we can possibly get rid of old limiting mechanisms, such as recursion depth thresholds. Differential Revision: https://reviews.llvm.org/D35990 Reviewed By: reames llvm-svn: 352728	2019-01-31 06:19:25 +00:00
Erik Pilkington	600e9deacf	Add a 'dynamic' parameter to the objectsize intrinsic This is meant to be used with clang's __builtin_dynamic_object_size. When 'true' is passed to this parameter, the intrinsic has the potential to be folded into instructions that will be evaluated at run time. When 'false', the objectsize intrinsic behaviour is unchanged. rdar://32212419 Differential revision: https://reviews.llvm.org/D56761 llvm-svn: 352664	2019-01-30 20:34:35 +00:00
Hiroshi Inoue	c437f310a5	[NFC] fix trivial typos in comments llvm-svn: 352602	2019-01-30 05:26:31 +00:00
Max Kazantsev	23e642248d	[NFC] Use ArrayRef instead of SmallVectorImpl where possible llvm-svn: 352466	2019-01-29 09:39:15 +00:00
Max Kazantsev	468ad52213	[SCEV] Take correct loop in AddRec simplification. PR40420 The code of AddRec simplification is using wrong loop when it creates a new AddRecExpr. It should be using AddRecLoop which we have saved and against which all gate checks are made, and not calling AddRec->getLoop() over and over again because AddRec may change and become an AddRecurrency from outer loop during the transform iterations. Considering this change trivial, commiting for postcommit review. llvm-svn: 352451	2019-01-29 05:37:59 +00:00
Teresa Johnson	2f616e479b	[ThinLTO] Add option to dump per-module summary dot graph Summary: I found that there currently isn't a way to invoke exportToDot from the command line for a per-module summary index, and therefore no testing of that case. Add an internal option and use it to test dumping of per module summary indexes. In particular, I am looking at fixing the limitation that causes the aliasee GUID in the per-module summary to be 0, and want to be able to test that change. Reviewers: evgeny777 Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D57206 llvm-svn: 352441	2019-01-28 23:43:26 +00:00
Alina Sbirlea	8e1d65771a	[AliasSetTracker] Cleanup more comments. [NFCI] llvm-svn: 352416	2019-01-28 19:38:03 +00:00
Alina Sbirlea	d8c829bc22	[AliasSetTracker] Cleanup comments. [NFCI] llvm-svn: 352406	2019-01-28 19:01:32 +00:00
Alina Sbirlea	3d1d95ca55	[AliasSetTracker] Update signature to aliasesPointer [NFCI]. llvm-svn: 352399	2019-01-28 18:30:05 +00:00
Johannes Doerfert	00102c7d95	[ValueTracking] Look through casts when determining non-nullness Bitcast and certain Ptr2Int/Int2Ptr instructions will not alter the value of their operand and can therefore be looked through when we determine non-nullness. Differential Revision: https://reviews.llvm.org/D54956 llvm-svn: 352293	2019-01-26 23:40:35 +00:00
Eli Friedman	525ef0159d	[Analysis] Fix isSafeToLoadUnconditionally handling of volatile. A volatile operation cannot be used to prove an address points to normal memory. (LangRef was recently updated to state it explicitly.) Differential Revision: https://reviews.llvm.org/D57040 llvm-svn: 352109	2019-01-24 21:31:13 +00:00
Simon Pilgrim	0e08b6f017	Move saturated arithmetic intrinsics to other integer intrinsics. NFCI. They were in the floating point group. llvm-svn: 351953	2019-01-23 13:49:10 +00:00
Max Kazantsev	ca47f1f72a	[NFC] Add function to parse widenable conditional branches llvm-svn: 351803	2019-01-22 11:21:32 +00:00
Max Kazantsev	bd374b27cc	[NFC] Add detector for guards expressed as branch by widenable conditions This patch adds a function to detect guards expressed in explicit control flow form as branch by `and` with widenable condition intrinsic call: %wc = call i1 @llvm.experimental.widenable.condition() %guard_cond = and i1, %some_cond, %wc br i1 %guard_cond, label %guarded, label %deopt deopt: <maybe some non-side-effecting instructions> deoptimize() This form can be used as alternative to implicit control flow guard representation expressed by `experimental_guard` intrinsic. Differential Revision: https://reviews.llvm.org/D56074 Reviewed By: reames llvm-svn: 351791	2019-01-22 09:36:22 +00:00
Philip Reames	390c0e2f72	[CVP] Use LVI to constant fold deopt operands Deopt operands are generally intended to record information about a site in code with minimal perturbation of the surrounding code. Idiomatically, they also tend to appear down rare paths. Putting these together, we have an obvious case for extending CVP w/deopt operand constant folding. Arguably, we should be doing this for all operands on all instructions, but that's definitely a much larger and risky change. Differential Revision: https://reviews.llvm.org/D55678 llvm-svn: 351774	2019-01-22 01:34:33 +00:00
Max Kazantsev	85c988388a	[SCEV][NFC] Introduces expression sizes estimation This patch introduces the field `ExpressionSize` in SCEV. This field is calculated only once on SCEV creation, and it represents the complexity of this SCEV from arithmetical point of view (not from the point of the number of actual different SCEV nodes that are used in the expression). Roughly saying, it is the number of operands and operations symbols when we print this SCEV. A formal definition is following: if SCEV `X` has operands `Op1`, `Op2`, ..., `OpN`, then Size(X) = 1 + Size(Op1) + Size(Op2) + ... + Size(OpN). Size of SCEVConstant and SCEVUnknown is one. Expression size may be used as a universal way to limit SCEV transformations for huge SCEVs. Currently, we have a bunch of options that represents various limits (such as recursion depth limit) that may not make any sense from the point of view of a LLVM users who is not familiar with SCEV internals, and all these different options pursue one goal. A more general rule that may potentially allow us to get rid of this redundancy in options is "do not make transformations with SCEVs of huge size". It can apply to all SCEV traversals and transformations that may need to visit a SCEV node more than once, hence they are prone to combinatorial explosions. This patch only introduces SCEV sizes calculation as NFC, its utilization will be introduced in follow-up patches. Differential Revision: https://reviews.llvm.org/D35989 Reviewed By: reames llvm-svn: 351725	2019-01-21 06:19:50 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Teresa Johnson	8d86f1ba47	Revert "[ThinLTO] Add summary entries for index-based WPD" Mistaken commit of something still under review! This reverts commit r351453. llvm-svn: 351455	2019-01-17 16:05:04 +00:00
Teresa Johnson	4fcf3b1621	[ThinLTO] Add summary entries for index-based WPD Summary: If LTOUnit splitting is disabled, the module summary analysis computes the summary information necessary to perform single implementation devirtualization during the thin link with the index and no IR. The information collected from the regular LTO IR in the current hybrid WPD algorithm is summarized, including: 1) For vtable definitions, record the function pointers and their offset within the vtable initializer (subsumes the information collected from IR by tryFindVirtualCallTargets). 2) A record for each type metadata summarizing the vtable definitions decorated with that metadata (subsumes the TypeIdentiferMap collected from IR). Also added are the necessary bitcode records, and the corresponding assembly support. The index-based WPD will be sent as a follow-on. Depends on D53890. Reviewers: pcc Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54815 llvm-svn: 351453	2019-01-17 15:49:03 +00:00
Tom Stellard	3d36e5c3e6	Only promote args when function attributes are compatible Summary: Check to make sure that the caller and the callee have compatible function arguments before promoting arguments. This uses the same TargetTransformInfo queries that are used to determine if attributes are compatible for inlining. The goal here is to avoid breaking ABI when a called function's ABI depends on a target feature that is not enabled in the caller. This is a very conservative fix for PR37358. Ideally we would have a more sophisticated check for ABI compatiblity rather than checking if the attributes are compatible for inlining. Reviewers: echristo, chandlerc, eli.friedman, craig.topper Reviewed By: echristo, chandlerc Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits Differential Revision: https://reviews.llvm.org/D53554 llvm-svn: 351296	2019-01-16 05:15:31 +00:00
Nikita Popov	5f393eb5da	Reapply "[DemandedBits] Use SetVector for Worklist" DemandedBits currently uses a simple vector for the worklist, which means that instructions may be inserted multiple times into it. Especially in combination with the deep lattice, this may cause instructions too be recomputed very often. To avoid this, switch to a SetVector. Reapplying with a smaller number of inline elements in the SmallSetVector, to avoid running into the SmallDenseMap issue described in D56455. Differential Revision: https://reviews.llvm.org/D56362 llvm-svn: 350997	2019-01-12 09:09:15 +00:00
Nikita Popov	9f6e9cf71b	[ConstantFolding] Fold undef for integer intrinsics This fixes https://bugs.llvm.org/show_bug.cgi?id=40110. This implements handling of undef operands for integer intrinsics in ConstantFolding, in particular for the bitcounting intrinsics (ctpop, cttz, ctlz), the with.overflow intrinsics, the saturating math intrinsics and the funnel shift intrinsics. The undef behavior follows what InstSimplify does for the general cas e of non-constant operands. For the bitcount intrinsics (where InstSimplify doesn't do undef handling -- there cannot be a combination of an undef + non-constant operand) I'm using a 0 result if the intrinsic is defined for zero and undef otherwise. Differential Revision: https://reviews.llvm.org/D55950 llvm-svn: 350971	2019-01-11 21:18:00 +00:00
Teresa Johnson	290a839891	[LTO] Record whether LTOUnit splitting is enabled in index Summary: Records in the module summary index whether the bitcode was compiled with the option necessary to enable splitting the LTO unit (e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit). The information is passed down to the ModuleSummaryIndex builder via a new module flag "EnableSplitLTOUnit", which is propagated onto a flag on the summary index. This is then used during the LTO link to check whether all linked summaries were built with the same value of this flag. If not, an error is issued when we detect a situation requiring whole program visibility of the class hierarchy. This is the case when both of the following conditions are met: 1) We are performing LowerTypeTests or Whole Program Devirtualization. 2) There are type tests or type checked loads in the code. Note I have also changed the ThinLTOBitcodeWriter to also gate the module splitting on the value of this flag. Reviewers: pcc Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D53890 llvm-svn: 350948	2019-01-11 18:31:57 +00:00
Alina Sbirlea	e41f4b39e5	[MemorySSA] Disable checkClobberSanity for SkipSelfWalker. Sanity will fail for this, since we're exploring getting a clobber further than the sanity check expects. Ideally we need to teach the sanity check to differentiate between the two walkers based on the SkipSelf bool in the query. llvm-svn: 350895	2019-01-10 21:47:15 +00:00
Easwaran Raman	b45994b843	Refactor synthetic profile count computation. NFC. Summary: Instead of using two separate callbacks to return the entry count and the relative block frequency, use a single callback to return callsite count. This would allow better supporting hybrid mode in the future as the count of callsite need not always be derived from entry count (as in sample PGO). Reviewers: davidxl Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D56464 llvm-svn: 350755	2019-01-09 20:10:27 +00:00
Easwaran Raman	ed279752f0	[Inliner] Assert that the computed inline threshold is non-negative. Reviewers: chandlerc Subscribers: haicheng, llvm-commits Differential Revision: https://reviews.llvm.org/D56409 llvm-svn: 350751	2019-01-09 19:26:17 +00:00
David Callahan	3ef0f4447d	refactor BlockFrequencyInfo::view to take a title parameter Summary: All a non-default title for the debugging this debugging aide Reviewers: twoh, Kader, modocache Reviewed By: twoh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56499 llvm-svn: 350749	2019-01-09 19:12:38 +00:00
Max Kazantsev	4615a505f8	[IPT] Drop cache less eagerly in GVN and LoopSafetyInfo Current strategy of dropping `InstructionPrecedenceTracking` cache is to invalidate the entire basic block whenever we change its contents. In fact, `InstructionPrecedenceTracking` has 2 internal strictures: `OrderedInstructions` that is needed to be invalidated whenever the contents changes, and the map with first special instructions in block. This second map does not need an update if we add/remove a non-special instuction because it cannot affect the contents of this map. This patch changes API of `InstructionPrecedenceTracking` so that it now accounts for reasons under which we invalidate blocks. This should lead to much less recalculations of the map and should save us some compile time because in practice we don't typically add/remove special instructions. Differential Revision: https://reviews.llvm.org/D54462 Reviewed By: efriedma llvm-svn: 350694	2019-01-09 07:28:13 +00:00
Philip Pfaffe	efb5ad1c58	[DA][NewPM] Add a printerpass and port the testsuite The new-pm version of DA is untested. Testing requires a printer, so add that and use it in the existing DA tests. Differential Revision: https://reviews.llvm.org/D56386 llvm-svn: 350624	2019-01-08 14:06:58 +00:00
Alina Sbirlea	12bbb4fe8d	[MemorySSA] Add SkipSelfWalker. Summary: Add implementation of SkipSelfWalker. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D56285 llvm-svn: 350561	2019-01-07 19:38:47 +00:00
Alina Sbirlea	bc8aa24c2f	[MemorySSA] Refactor CachingWalker. Summary: Refactor caching walker to make creating a walker that skips the starting access strightforward. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D55957 llvm-svn: 350558	2019-01-07 19:22:37 +00:00
Alina Sbirlea	f723020456	[MemorySSA] Extend the clobber walker with the option to skip the starting access. Summary: The option enables loop transformations to hoist accesses that do not have clobbers in the loop. If the clobber queries skips the starting access, the result may be outside the loop instead of the header Phi. Adding the walker that uses this option in a separate patch. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D55944 llvm-svn: 350551	2019-01-07 18:40:27 +00:00
Nikita Popov	8dd19ed3ec	Revert "[DemandedBits] Use SetVector for Worklist" This reverts commit r350547. Seeing assertion failures on clang tests. llvm-svn: 350549	2019-01-07 18:15:11 +00:00
Nikita Popov	353d92decb	[DemandedBits] Use SetVector for Worklist DemandedBits currently uses a simple vector for the worklist, which means that instructions may be inserted multiple times into it. Especially in combination with the deep lattice, this may cause instructions too be recomputed very often. To avoid this, switch to a SetVector. Differential Revision: https://reviews.llvm.org/D56362 llvm-svn: 350547	2019-01-07 18:03:36 +00:00
Chandler Carruth	57578aaf96	[CallSite removal] Port `IndirectCallSiteVisitor` to use `CallBase` and update client code. Also rename it to use the more generic term `call` instead of something that could be confused with a praticular type. Differential Revision: https://reviews.llvm.org/D56183 llvm-svn: 350508	2019-01-07 07:15:51 +00:00
Chandler Carruth	363ac68374	[CallSite removal] Migrate all Alias Analysis APIs to use the newly minted `CallBase` class instead of the `CallSite` wrapper. This moves the largest interwoven collection of APIs that traffic in `CallSite`s. While a handful of these could have been migrated with a minorly more shallow migration by converting from a `CallSite` to a `CallBase`, it hardly seemed worth it. Most of the APIs needed to migrate together because of the complex interplay of AA APIs and the fact that converting from a `CallBase` to a `CallSite` isn't free in its current implementation. Out of tree users of these APIs can fairly reliably migrate with some combination of `.getInstruction()` on the `CallSite` instance and casting the resulting pointer. The most generic form will look like `CS` -> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there is a more elegant migration. Hopefully, this migrates enough APIs for users to fully move from `CallSite` to the base class. All of the in-tree users were easily migrated in that fashion. Thanks for the review from Saleem! Differential Revision: https://reviews.llvm.org/D55641 llvm-svn: 350503	2019-01-07 05:42:51 +00:00
Nikita Popov	6658fce4fc	[BDCE] Remove dead uses of arguments In addition to finding dead uses of instructions, also find dead uses of function arguments, and replace them with zero as well. I'm changing the way the known bits are computed here to remove the coupling between the transfer function and the algorithm. It previously relied on the first op being visited first and computing known bits -- unless the first op is not an instruction, in which case they're computed on the second op. I could have adjusted this to check for "instruction or argument", but I think it's better to avoid the repeated calculation with an explicit flag. Differential Revision: https://reviews.llvm.org/D56247 llvm-svn: 350435	2019-01-04 21:21:43 +00:00
Florian Hahn	7902405c42	[ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset GetPointerBaseWithConstantOffset include this code, where ByteOffset and GEPOffset are both of type llvm::APInt : ByteOffset += GEPOffset.getSExtValue(); The problem with this line is that getSExtValue() returns an int64_t, but the += matches an overload for uint64_t. The problem is that the resulting APInt is no longer considered to be signed. That in turn causes assertion failures later on if the relevant pointer type is > 64 bits in width and the GEPOffset was negative. Changing it to ByteOffset += GEPOffset.sextOrTrunc(ByteOffset.getBitWidth()); resolves the issue and explicitly performs the sign-extending or truncation. Additionally, instead of asserting later if the result is > 64 bits, it breaks out of the loop in that case. See also https://reviews.llvm.org/D24729 https://reviews.llvm.org/D24772 This commit must be merged after D38662 in order for the test to pass. Patch by Michael Ferguson <mpfergu@gmail.com>. Reviewers: reames, sanjoy, hfinkel Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D38501 llvm-svn: 350395	2019-01-04 14:53:22 +00:00
Simon Pilgrim	c2aadfaaad	[SLPVectorizer] Flag ADD/SUB SSAT/USAT intrinsics trivially vectorizable (PR40123) Enables SLP vectorization for the SSE2 PADDS/PADDUS/PSUBS/PSUBUS style intrinsics llvm-svn: 350300	2019-01-03 12:18:23 +00:00
Hal Finkel	4f2381440d	[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug) Motivated by the discussion in D38499, this patch updates BasicAA to support arbitrary pointer sizes by switching most remaining non-APInt calculations to use APInt. The size of these APInts is set to the maximum pointer size (maximum over all address spaces described by the data layout string). Most of this translation is straightforward, but this patch contains a fix for a bug that revealed itself during this translation process. In order for test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit pointers, the intermediate calculations must be performed using 64-bit integers. This is because, as noted in the patch, when GetLinearExpression decomposes an expression into C1V+C2, and we then multiply this by Scale, and distribute, to get (C1Scale)V + C2Scale, it can be the case that, even through C1V+C2 does not overflow for relevant values of V, (C2Scale) can overflow. If this happens, later logic will draw invalid conclusions from the (base) offset value. Thus, when initially applying the APInt conversion, because the maximum pointer size in this test is 32 bits, it started failing. Suspicious, I created a 64-bit version of this test (included here), and that failed (miscompiled) on trunk for a similar reason (the multiplication can overflow). After fixing this overflow bug, the first test case (at least) in Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was relying on having 64-bit intermediate values to have BasicAA return an accurate result. In order to fix this problem, and because I believe that it is not uncommon to use i64 indexing expressions in 32-bit code (especially portable code using int64_t), it seems reasonable to always use at least 64-bit integers. In this way, we won't regress our analysis capabilities (and there's a command-line option added, so experimenting with this should be easy). As pointed out by Eli during the review, there are other potential overflow conditions that this patch does not address. Fixing those is left to follow-up work. Patch by me with contributions from Michael Ferguson (mferguson@cray.com). Differential Revision: https://reviews.llvm.org/D38662 llvm-svn: 350220	2019-01-02 16:28:09 +00:00
Nikita Popov	bc9986e9ad	Reapply "[BDCE][DemandedBits] Detect dead uses of undead instructions" This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses those user is also dead. This case is checked separately instead. The previous attempt to land this lead to miscompiles, because cases where uses were initially dead but were later found to be live during further analysis were not always correctly removed from the DeadUses set. This is fixed now and the added test case demanstrates such an instance. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 350188	2019-01-01 10:05:26 +00:00
George Burgess IV	69952979da	[MemoryLocation] Use LocationSize instead of ints; NFC Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. This one sadly isn't super small, but all of the changes here are either to: - libfuncs that are passed a constant size (memcpy, memset, ...) - instructions that store/load a constant size So they have to be precise llvm-svn: 350017	2018-12-23 03:36:44 +00:00
George Burgess IV	8c5413f3f7	[Loads] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. This tries to find literal loads/stores of the given type, so this has to be precise. llvm-svn: 350016	2018-12-23 03:10:56 +00:00
George Burgess IV	1329cf1791	[Lint] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350015	2018-12-23 02:50:08 +00:00
George Burgess IV	685e781d55	[AAEval] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350014	2018-12-23 02:39:58 +00:00
George Burgess IV	640be69249	[Analysis] More LocationSize cleanup; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350008	2018-12-22 18:23:21 +00:00
Vedant Kumar	b264d69de7	[IR] Add Instruction::isLifetimeStartOrEnd, NFC Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964	2018-12-21 21:49:40 +00:00
Reid Kleckner	84a2d29681	[BasicAA] Fix AA bug on dynamic allocas and stackrestore Summary: BasicAA has special logic for unescaped allocas, which normally applies equally well to dynamic and static allocas. However, llvm.stackrestore has the power to end the lifetime of dynamic allocas, without referring to them directly. stackrestore is already marked with the most conservative memory modification attributes, but because the alloca is not escaped, the normal logic produces incorrect results. I think BasicAA needs a special case here to teach it about the relationship between dynamic allocas and stackrestore. Fixes PR40118 Reviewers: gbiv, efriedma, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55969 llvm-svn: 349945	2018-12-21 19:59:03 +00:00
Florian Hahn	ef307b8c26	[LAA] Avoid generating RT checks for known deps preventing vectorization. If we found unsafe dependences other than 'unknown', we already know at compile time that they are unsafe and the runtime checks should always fail. So we can avoid generating them in those cases. This should have no negative impact on performance as the runtime checks that would be created previously should always fail. As a sanity check, I measured the test-suite, spec2k and spec2k6 and there were no regressions. Reviewers: Ayal, anemet, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D55798 llvm-svn: 349794	2018-12-20 18:49:09 +00:00
Clement Courbet	d4bd3eb85d	[NFC] Fix trailing comma after function. lib/Analysis/VectorUtils.cpp:482:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 349732	2018-12-20 09:20:07 +00:00
Michael Kruse	978ba61536	Introduce llvm.loop.parallel_accesses and llvm.access.group metadata. The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not loop identifier. It is neither unique (there's even a regression test assigning the some LoopID to multiple loops; can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it just to mark that a loop should not be processed again as llvm.loop.isvectorized does, for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instruction is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem to ever need to add/remove operands), called "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carries by this loop). This intentionally avoid any kind of "ID". Loops that are clones/have their attributes modifies retain the llvm.loop.parallel_accesses attribute. Access instructions that a cloned point to the same access group. It is not necessary for each access to have it's own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725	2018-12-20 04:58:07 +00:00
Nikita Popov	3817ee7908	Revert "[BDCE][DemandedBits] Detect dead uses of undead instructions" This reverts commit r349674. It causes a failure in test-suite enc-3des.execution_time. llvm-svn: 349684	2018-12-19 22:09:02 +00:00
Nikita Popov	649e125451	[BDCE][DemandedBits] Detect dead uses of undead instructions This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses those user is also dead. This case is checked separately instead. The test case has a couple of cases that are not simplified yet. In particular, we're only looking at uses of instructions right now. I think it would make sense to also extend this to arguments. Furthermore DemandedBits doesn't yet know some of the tricks that InstCombine does for the demanded bits or bitwise or/and/xor in combination with known bits information. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 349674	2018-12-19 19:56:21 +00:00
Sanjay Patel	798c5982a0	[ValueTracking] remove unused parameters from helper functions; NFC llvm-svn: 349641	2018-12-19 16:49:18 +00:00
Florian Hahn	485f2826ba	[LAA] Introduce enum for vectorization safety status (NFC). This patch adds a VectorizationSafetyStatus enum, which will be extended in a follow up patch to distinguish between 'safe with runtime checks' and 'known unsafe' dependences. Reviewers: anemet, anna, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D54892 llvm-svn: 349556	2018-12-18 22:25:11 +00:00
Pete Cooper	be4f571107	Change the objc ARC optimizer to use the new objc.* intrinsics We're moving ARC optimisation and ARC emission in clang away from runtime methods and towards intrinsics. This is the part which actually uses the intrinsics in the ARC optimizer when both analyzing the existing calls and emitting new ones. Differential Revision: https://reviews.llvm.org/D55348 Reviewers: ahatanak llvm-svn: 349534	2018-12-18 20:32:49 +00:00
Artur Pilipenko	2a0146e0fd	[CaptureTracking] Pass MaxUsesToExplore from wrappers to the actual implementation This is a follow up for rL347910. In the original patch I somehow forgot to pass the limit from wrappers to the function which actually does the job. llvm-svn: 349438	2018-12-18 03:32:33 +00:00
Nikita Popov	221f3fc750	[InstSimplify] Simplify saturating add/sub + icmp If a saturating add/sub has one constant operand, then we can determine the possible range of outputs it can produce, and simplify an icmp comparison based on that. The implementation is based on a similar existing mechanism for simplifying binary operator + icmps. Differential Revision: https://reviews.llvm.org/D55735 llvm-svn: 349369	2018-12-17 17:45:18 +00:00
Wei Mi	66c6c5abea	[SampleFDO] handle ProfileSampleAccurate when initializing function entry count ProfileSampleAccurate is used to indicate the profile has exact match to the code to be optimized. Previously ProfileSampleAccurate is handled in ProfileSummaryInfo::isColdCallSite and ProfileSummaryInfo::isColdBlock. A better solution is to initialize function entry count to 0 when ProfileSampleAccurate is true, so we don't have to handle ProfileSampleAccurate in multiple places. Differential Revision: https://reviews.llvm.org/D55660 llvm-svn: 349088	2018-12-13 21:51:42 +00:00
Easwaran Raman	5a7056fa03	[ThinLTO] Compute synthetic function entry count Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076	2018-12-13 19:54:27 +00:00
David L. Jones	54c01ad6a9	Revert r348645 - "[MemCpyOpt] memset->memcpy forwarding with undef tail" This revision caused trucated memsets for structs with padding. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610520.html llvm-svn: 349002	2018-12-13 03:15:11 +00:00
Michael Kruse	7244852557	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944	2018-12-12 17:32:52 +00:00
Wei Mi	7da5a08e1a	[SampleFDO] Extend profile-sample-accurate option to cover isFunctionColdInCallGraph For SampleFDO, when a callsite doesn't appear in the profile, it will not be marked as cold callsite unless the option -profile-sample-accurate is specified. But profile-sample-accurate doesn't cover function isFunctionColdInCallGraph which is used to decide whether a function should be put into text.unlikely section, so even if the user knows the profile is accurate and specifies profile-sample-accurate, those functions not appearing in the sample profile are still not be put into text.unlikely section right now. The patch fixes that. Differential Revision: https://reviews.llvm.org/D55567 llvm-svn: 348940	2018-12-12 17:09:27 +00:00
Nikita Popov	79c994d976	[ConstantFolding] Handle leading zero-size elements in load folding Struct types may have leading zero-size elements like [0 x i32], in which case the "real" element at offset 0 will not necessarily coincide with the 0th element of the aggregate. ConstantFoldLoadThroughBitcast() wants to drill down the element at offset 0, but currently always picks the 0th aggregate element to do so. This patch changes the code to find the first non-zero-size element instead, for the struct case. The motivation behind this change is https://github.com/rust-lang/rust/issues/48627. Rust is fond of emitting [0 x iN] separators between struct elements to enforce alignment, which prevents constant folding in this particular case. The additional tests with [4294967295 x [0 x i32]] check that we don't end up unnecessarily looping over a large number of zero-size elements of a zero-size array. Differential Revision: https://reviews.llvm.org/D55169 llvm-svn: 348895	2018-12-11 20:29:16 +00:00
Fedor Sergeev	a1d95c3fc4	[NewPM] fixing asserts on deleted loop in -print-after-all IR-printing AfterPass instrumentation might be called on a loop that has just been invalidated. We should skip printing it to avoid spurious asserts. Reviewed By: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D54740 llvm-svn: 348887	2018-12-11 19:05:35 +00:00
Nikita Popov	94b8e2ea4e	[MemCpyOpt] memset->memcpy forwarding with undef tail Currently memcpyopt optimizes cases like memset(a, byte, N); memcpy(b, a, M); to memset(a, byte, N); memset(b, byte, M); if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely. This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start. This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844. For the implementation, I'm reusing a bit of code for a similar existing optimization (direct memcpy of undef). I've also added memset support to MemDepAnalysis GetLocation -- Instead, getPointerDependencyFrom could be used, but it seems to make more sense to add this to GetLocation and thus make the computation cachable. Differential Revision: https://reviews.llvm.org/D55120 llvm-svn: 348645	2018-12-07 21:16:58 +00:00
Nikita Popov	110cf05203	Reapply "[DemandedBits][BDCE] Support vectors of integers" DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. Unlike the previous iteration of this patch, getDemandedBits() can now again be called on arbirary (sized) instructions, even if they don't have integer or vector of integer type. (For vector types the size of the returned mask will now be the scalar size in bits though.) The added LoopVectorize test case shows a case which triggered an assertion failure with the previous attempt, because getDemandedBits() was called on a pointer-typed instruction. Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348602	2018-12-07 15:38:13 +00:00
Nikita Popov	14ca9a8355	Revert "[DemandedBits][BDCE] Support vectors of integers" This reverts commit r348549. Causing assertion failures during clang build. llvm-svn: 348558	2018-12-07 00:42:03 +00:00
Nikita Popov	cf65b9207b	[DemandedBits][BDCE] Support vectors of integers DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. The getDemandedBits() method can now only be called on instructions that have integer or vector of integer type. Previously it could be called on any sized instruction (even if it was not particularly useful). The size of the return value is now always the scalar size in bits (while previously it was the type size in bits). Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348549	2018-12-06 23:50:32 +00:00
David L. Jones	5ff7b8a04a	Revert r347934 "[SCEV] Guard movement of insertion point for loop-invariants" This change caused SEGVs in instcombine. (The r347934 change seems to me to be a precipitating cause, not a root cause. Details are on the llvm-commits thread for r347934.) llvm-svn: 348426	2018-12-05 23:13:50 +00:00
Sanjay Patel	3d5bb15a1d	[CmpInstAnalysis] fix function signature for ICmp code to predicate; NFC The old function underspecified the return type, took an unused parameter, and had a misleading name. llvm-svn: 348292	2018-12-04 18:53:27 +00:00
Sanjay Patel	472652ef68	[CmpInstAnalysis] fix formatting; NFC There are potential improvements to the structure of this API raised by D54994, but remove some cosmetic blemishes before making any functional changes. llvm-svn: 348149	2018-12-03 15:48:30 +00:00
Fedor Sergeev	7254d3c51c	Fixing -print-module-scope for legacy SCC passes It appears that print-module-scope was not implemented for legacy SCC passes. Fixed to print a whole module instead of just current SCC. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D54793 llvm-svn: 348144	2018-12-03 14:48:15 +00:00
Nikita Popov	687b92cd9c	[ValueTracking] Support funnel shifts in computeKnownBits() If the shift amount is known, we can determine the known bits of the output based on the known bits of two inputs. This is essentially the same functionality as implemented in D54869, but for ValueTracking rather than InstCombine SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D55140 llvm-svn: 348091	2018-12-02 14:14:11 +00:00
Sanjay Patel	7d82d37854	[ValueTracking] add helper function for testing implied condition; NFCI We were duplicating code around the existing isImpliedCondition() that checks for a predecessor block/dominating condition, so make that a wrapper call. llvm-svn: 348088	2018-12-02 13:26:03 +00:00
Teresa Johnson	5b8ff375c8	[ThinLTO] Allow importing of functions with var args Summary: Follow up to D54270, which allowed importing of var args functions unless they called va_start. As pointed out in the post-commit comments on that patch, the inliner can handle functions that call va_start in certain situations as well. Go ahead and enable importing of all var args functions. Measurements on a large binary show that this increases imports and binary size by an insignificant amount. Reviewers: davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54607 llvm-svn: 348068	2018-12-01 05:11:46 +00:00
Nicolai Haehnle	413f8691ab	LegacyDivergenceAnalysis: fix uninitialized value Change-Id: I014502e431a68f7beddf169f6a3d19dac5dd2c26 llvm-svn: 348051	2018-11-30 23:07:49 +00:00
Nicolai Haehnle	56d0ed2a50	[DA] GPUDivergenceAnalysis for unstructured GPU kernels Summary: This is patch #3 of the new DivergenceAnalysis <https://lists.llvm.org/pipermail/llvm-dev/2018-May/123606.html> The GPUDivergenceAnalysis is intended to eventually supersede the existing LegacyDivergenceAnalysis. The existing LegacyDivergenceAnalysis produces incorrect results on unstructured Control-Flow Graphs: <https://bugs.llvm.org/show_bug.cgi?id=37185> This patch adds the option -use-gpu-divergence-analysis to the LegacyDivergenceAnalysis to turn it into a transparent wrapper for the GPUDivergenceAnalysis. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: jholewinski, jvesely, jfb, llvm-commits, alex-t, sameerds, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D53493 llvm-svn: 348048	2018-11-30 22:55:20 +00:00
Renato Golin	de4b88e5ac	Fix parenthesis warning in IVDescriptors llvm-svn: 347990	2018-11-30 13:54:36 +00:00
Renato Golin	135e72e1b9	Add a new reduction pattern match Adding a new reduction pattern match for vectorizing code similar to TSVC s3111: for (int i = 0; i < N; i++) if (a[i] > b) sum += a[i]; This patch adds support for fadd, fsub and fmull, as well as multiple branches and different (but compatible) instructions (ex. add+sub) in different branches. The difference from the previous patch(https://reviews.llvm.org/D49168) is as follows: - Added check of fast-math property of fp-instruction to the previous patch - Fix/add some pattern for if-reduction.ll Differential Revision: https://reviews.llvm.org/D54464 Patch by Takahiro Miyoshi <takahiro.miyoshi@linaro.org> and Masakazu Ueno <masakazu.ueno@linaro.org> llvm-svn: 347989	2018-11-30 13:40:10 +00:00
Warren Ristow	72d1f3a285	[SCEV] Guard movement of insertion point for loop-invariants r320789 suppressed moving the insertion point of SCEV expressions with dev/rem operations to the loop header in non-loop-invariant situations. This, and similar, hoisting is also unsafe in the loop-invariant case, since there may be a guard against a zero denominator. This is an adjustment to the fix of r320789 to suppress the movement even in the loop-invariant case. This fixes PR30806. Differential Revision: https://reviews.llvm.org/D54713 llvm-svn: 347934	2018-11-30 00:02:54 +00:00
Artur Pilipenko	eba2365f23	Introduce MaxUsesToExplore argument to capture tracking Currently CaptureTracker gives up if it encounters a value with more than 20 uses. The motivation for this cap is to keep it relatively cheap for BasicAliasAnalysis use case, where the results can't be cached. Although, other clients of CaptureTracker might be ok with higher cost. This patch introduces an argument for PointerMayBeCaptured functions to specify the max number of uses to explore. The motivation for this change is a downstream user of CaptureTracker, but I believe upstream clients of CaptureTracker might also benefit from more fine grained cap. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D55042 llvm-svn: 347910	2018-11-29 20:08:12 +00:00
Sanjay Patel	d802270808	[InstSimplify] fold select with implied condition This is an almost direct move of the functionality from InstCombine to InstSimplify. There's no reason not to do this in InstSimplify because we never create a new value with this transform. (There's a question of whether any dominance-based transform belongs in either of these passes, but that's a separate issue.) I've changed 1 of the conditions for the fold (1 of the blocks for the branch must be the block we started with) into an assert because I'm not sure how that could ever be false. We need 1 extra check to make sure that the instruction itself is in a basic block because passes other than InstCombine may be using InstSimplify as an analysis on values that are not wired up yet. The 3-way compare changes show that InstCombine has some kind of phase-ordering hole. Otherwise, we would have already gotten the intended final result that we now show here. llvm-svn: 347896	2018-11-29 18:44:39 +00:00
Artur Pilipenko	8b92c1d142	NFC. Use unsigned type for uses counter in CaptureTracking llvm-svn: 347826	2018-11-29 02:15:35 +00:00
Nikita Popov	cf596a8c26	[ValueTracking] Determine always-overflow condition for unsigned sub Always-overflow was already determined for unsigned addition, but not subtraction. This patch establishes parity. This allows us to perform some additional simplifications for signed saturating subtractions. This change is part of https://reviews.llvm.org/D54534. llvm-svn: 347771	2018-11-28 16:37:04 +00:00
Vitaly Buka	44abeb5c3c	[stack-safety] Update comment llvm-svn: 347626	2018-11-27 01:56:44 +00:00
Vitaly Buka	7792f5f145	[stack-safety] Fix and uncomment assert llvm-svn: 347625	2018-11-27 01:56:35 +00:00
Vitaly Buka	769bff18be	[stack-safety] Fix build on gcc 5.4 llvm-svn: 347624	2018-11-27 01:56:26 +00:00
JF Bastien	3c242438ec	Fix debug build break Comment out an assertion from D54543 which failed with error: no member named 'Range' in '(anonymous namespace)::PassAsArgInfo'. llvm-svn: 347616	2018-11-26 23:48:47 +00:00
Vitaly Buka	42b050673e	[stack-safety] Inter-Procedural Analysis implementation Summary: IPA is implemented as module pass which produce map from Function or Alias to StackSafetyInfo for a single function. From prototype by Evgenii Stepanov and Vlad Tsyrklevich. Reviewers: eugenis, vlad.tsyrklevich, pcc, glider Subscribers: hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D54543 llvm-svn: 347611	2018-11-26 23:05:58 +00:00
Vitaly Buka	b8e6fa6638	[stack-safety] Empty local passes for Stack Safety Global Analysis Reviewers: eugenis, vlad.tsyrklevich Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54541 llvm-svn: 347610	2018-11-26 23:05:48 +00:00
Vitaly Buka	fa98c074b7	[stack-safety] Local analysis implementation Summary: Analysis produces StackSafetyInfo which contains information with how allocas and parameters were used in functions. From prototype by Evgenii Stepanov and Vlad Tsyrklevich. Reviewers: eugenis, vlad.tsyrklevich, pcc, glider Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54504 llvm-svn: 347603	2018-11-26 21:57:59 +00:00
Vitaly Buka	4493fe1c1b	[stack-safety] Empty local passes for Stack Safety Local Analysis Reviewers: eugenis, vlad.tsyrklevich Subscribers: mgorny, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54502 llvm-svn: 347602	2018-11-26 21:57:47 +00:00
Nikita Popov	f94c8f0d1b	[DemandedBits] Add support for funnel shifts Add support for funnel shifts to the DemandedBits analysis. The demanded bits of the first two operands can be determined if the shift amount is constant. The demanded bits of the third operand (shift amount) can be determined if the bitwidth is a power of two. This is basically the same functionality as implemented in D54869 and D54478, but for DemandedBits rather than InstCombine. Differential Revision: https://reviews.llvm.org/D54876 llvm-svn: 347561	2018-11-26 15:36:57 +00:00
Joel Jones	7459398a43	Revert unapproved commit llvm-svn: 347511	2018-11-24 07:26:55 +00:00
Joel Jones	5f533c5fe1	[AArch64] Enable libm vectorized functions via SLEEF This changeset is modeled after Intel's submission for SVML. It enables trigonometry functions vectorization via SLEEF: http://sleef.org/. * A new vectorization library enum is added to TargetLibraryInfo.h: SLEEF. * A new option is added to TargetLibraryInfoImpl - ClVectorLibrary: SLEEF. * A comprehensive test case is included in this changeset. * In a separate changeset (for clang), a new vectorization library argument is added to -fveclib: -fveclib=SLEEF. Trigonometry functions that are vectorized by sleef: acos asin atan atanh cos cosh exp exp2 exp10 lgamma log10 log2 log sin sinh sqrt tan tanh tgamma Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D53927 llvm-svn: 347510	2018-11-24 06:41:39 +00:00
John Regehr	3a1c9d55cc	[LVI] run transfer function for binary operator even when the RHS isn't a constant LVI was symbolically executing binary operators only when the RHS was constant, missing the case where we have a ConstantRange for the RHS, but not an actual constant. Tested using check-all and by bootstrapping. Compile time is not impacted measurably. Differential Revision: https://reviews.llvm.org/D19859 llvm-svn: 347379	2018-11-21 05:24:12 +00:00
Sanjay Patel	14ab9170b8	[InstSimplify] fold funnel shifts with undef operands Splitting these off from the D54666. Patch by: nikic (Nikita Popov) llvm-svn: 347332	2018-11-20 17:34:59 +00:00
Sanjay Patel	eea21da12a	[InstructionSimplify] Add support for saturating add/sub Add support for saturating add/sub in InstructionSimplify. In particular, the following simplifications are supported: sat(X + 0) -> X sat(X + undef) -> -1 sat(X uadd MAX) -> MAX (and commutative variants) sat(X - 0) -> X sat(X - X) -> 0 sat(X - undef) -> 0 sat(undef - X) -> 0 sat(0 usub X) -> 0 sat(X usub MAX) -> 0 Patch by: @nikic (Nikita Popov) Differential Revision: https://reviews.llvm.org/D54532 llvm-svn: 347330	2018-11-20 17:20:26 +00:00
Sanjay Patel	efc3d1dfaa	[ConstantFolding] Add support for saturating add/sub Support saturating add/sub in constant folding, based on the APInt methods introduced in D54332. Patch by: @nikic (Nikita Popov) Differential Revision: https://reviews.llvm.org/D54531 llvm-svn: 347328	2018-11-20 17:05:55 +00:00
Vedant Kumar	4de31bba51	[IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors. This can be more efficient than using pred_size(), which is a linear time operation. We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1). With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc. See llvm.org/PR39702 for a harder but more general way of achieving similar results. Differential Revision: https://reviews.llvm.org/D54686 llvm-svn: 347256	2018-11-19 19:54:27 +00:00
Anna Thomas	5e9215f02b	[LV] Avoid vectorizing unsafe dependencies in uniform address Summary: Currently, when vectorizing stores to uniform addresses, the only instance we prevent vectorization is if there are multiple stores to the same uniform address causing an unsafe dependency. This patch teaches LAA to avoid vectorizing loops that have an unsafe cross-iteration dependency between a load and a store to the same uniform address. Fixes PR39653. Reviewers: Ayal, efriedma Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D54538 llvm-svn: 347220	2018-11-19 15:39:59 +00:00
Fedor Sergeev	3a3d688cc8	[LoopPass] fixing 'Modification' messages in -debug-pass=Executions for loop passes Legacy loop pass manager is issuing "Made Modification" message after each Loop Pass run, however condition for issuing it is accumulated among all the runs. That leads to confusing 'modification' messages as soon as the first modification is done. Changing condition to be "current pass made modifications", similar to how it is being done in all other pass managers. llvm-svn: 347215	2018-11-19 15:10:59 +00:00
Vedant Kumar	e7b789b529	[ProfileSummary] Standardize methods and fix comment Every Analysis pass has a get method that returns a reference of the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument of variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop:getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182	2018-11-19 05:23:16 +00:00
Fangrui Song	7570932977	Use llvm::copy. NFC llvm-svn: 347126	2018-11-17 01:44:25 +00:00
Eugene Leviant	bf46e7410c	[ThinLTO] Internalize readonly globals An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033	2018-11-16 07:08:00 +00:00
Sanjay Patel	e98ec77a95	[InstSimplify] delete shift-of-zero guard ops around funnel shifts This is a problem seen in common rotate idioms as noted in: https://bugs.llvm.org/show_bug.cgi?id=34924 Note that we are not canonicalizing standard IR (shifts and logic) to the intrinsics yet. (Although I've written this before...) I think this is the last step before we enable that transform. Ie, we could regress code by doing that transform without this simplification in place. In PR34924, I questioned whether this is a valid transform for target-independent IR, but I convinced myself this is ok. If we're speculating a funnel shift by turning cmp+br into select, then SimplifyCFG has already determined that the transform is justified. It's possible that SimplifyCFG is not taking into account profile or other metadata, but if that's true, then it's a bug independent of funnel shifts. Also, we do have CGP code to restore a guard like this around an intrinsic if it can't be lowered cheaply. But that isn't necessary for funnel shift because the default expansion in SelectionDAGBuilder includes this same cmp+select. Differential Revision: https://reviews.llvm.org/D54552 llvm-svn: 346960	2018-11-15 14:53:37 +00:00
Teresa Johnson	32dc5b9bf1	[ThinLTO] Update handling of vararg functions to match inliner Summary: Previously we marked all vararg functions as non-inlinable in the function summary, which prevented their importing. However, the corresponding inliner restriction was loosened in r321940/r342675 to only apply to functions calling va_start. Adjust the summary flag computation to match. Reviewers: davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54270 llvm-svn: 346883	2018-11-14 19:30:13 +00:00
Simon Pilgrim	2b166c5044	[TTI] getOperandInfo - a broadcast shuffle means the result is OK_UniformValue llvm-svn: 346868	2018-11-14 15:04:08 +00:00
Alina Sbirlea	b4d088d090	[MemorySSA] Create query after checking if instruction is a fence. The alternative is checking if I is a fence in the Query constructor, so as to not attempt to get a non-existent MemoryLocation. llvm-svn: 346798	2018-11-13 21:12:49 +00:00
Steven Wu	fa43892d6f	Revert "[ThinLTO] Internalize readonly globals" This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2. llvm-svn: 346768	2018-11-13 17:35:04 +00:00
Florian Hahn	86ed347bcd	[VectorUtils] Use namespace for InterleaveGroup template specialization. llvm-svn: 346759	2018-11-13 16:26:34 +00:00
Florian Hahn	a4dc7feeea	[VPlan] VPlan version of InterleavedAccessInfo. This patch turns InterleaveGroup into a template with the instruction type being a template parameter. It also adds a VPInterleavedAccessInfo class, which only contains a mapping from VPInstructions to their respective InterleaveGroup. As we do not have access to scalar evolution in VPlan, we can re-use convert InterleavedAccessInfo to VPInterleavedAccess info. Reviewers: Ayal, mssimpso, hfinkel, dcaballe, rengolin, mkuper, hsaito Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D49489 llvm-svn: 346758	2018-11-13 15:58:18 +00:00
Simon Pilgrim	077a42ca9f	[TTI] Make TargetTransformInfo::getOperandInfo static. NFCI. It has no member dependencies and this makes it easier to reuse in other cost analysis code. llvm-svn: 346755	2018-11-13 13:45:10 +00:00
Sanjay Patel	1456fd7614	[VectorUtils] add funnel-shifts to the list of vectorizable intrinsics This just identifies the intrinsics as candidates for vectorization. It does not mean we will attempt to vectorize under normal conditions (the test file is forcing vectorization). The cost model must be fixed to show that the transform is profitable in general. Allowing vectorization with these intrinsics is required to avoid potential regressions from canonicalizing to the intrinsics from generic IR: https://bugs.llvm.org/show_bug.cgi?id=37417 llvm-svn: 346661	2018-11-12 15:20:14 +00:00
Sanjay Patel	0f4f4806b3	[VectorUtils] reorder list of vectorizable intrinsics; NFC We need to add funnel-shifts to this list, so clean up the random order before it gets worse. llvm-svn: 346660	2018-11-12 15:10:30 +00:00
Max Kazantsev	7d49a3a816	[LICM] Hoist guards from non-header blocks This patch relaxes overconservative checks on whether or not we could write memory before we execute an instruction. This allows us to hoist guards out of loops even if they are not in the header block. Differential Revision: https://reviews.llvm.org/D50891 Reviewed By: fedor.sergeev llvm-svn: 346643	2018-11-12 09:29:58 +00:00
Eugene Leviant	be8d19967a	[ThinLTO] Internalize readonly globals This patch allows internalising globals if all accesses to them (from live functions) are from non-volatile load instructions Differential revision: https://reviews.llvm.org/D49362 llvm-svn: 346584	2018-11-10 08:31:21 +00:00
Simon Pilgrim	26e1c887f5	[TTI] Flip vector types in getShuffleCost SK_ExtractSubvector call For SK_ExtractSubvector, the default 'Ty' type is the source operand type and 'SubTy' is the destination subvector type I got this the wrong way around when I added rL346510 llvm-svn: 346534	2018-11-09 18:30:59 +00:00
Simon Pilgrim	d0c71609c5	[CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368) Add ShuffleVectorInst::isExtractSubvectorMask helper to match shuffle masks. llvm-svn: 346510	2018-11-09 16:28:19 +00:00
Max Kazantsev	65cb9d79a2	[SCEV][NFC] Verify IR in isLoop[Entry,Backedge]GuardedByCond We have a lot of various bugs that are caused by misuse of SCEV (in particular in LV), all of them can simply be described as "we ask SCEV to prove some fact on invalid IR". Some of examples of those are PR36311, PR37221, PR39160. The problem is that these failues manifest differently (what we saw was failure of various asserts across SCEV, but there can also be miscompiles). This patch adds an assert into two SCEV methods that strongly rely on correctness of the IR and are involved in known failues. This will at least allow us to have a clear indication of what was wrong in this case. This patch also fixes a unit test with incorrect IR that fails this verification. Differential Revision: https://reviews.llvm.org/D52930 Reviewed By: fhahn llvm-svn: 346389	2018-11-08 05:07:58 +00:00
Matt Arsenault	8ba740a5a8	Allow subclassing ExternalAA This allows testing AMDGPU alias analysis like any other alias analysis pass. This fixes the existing test pointlessly running opt -O3 when it really just wants to run the one analysis. Before there was no way to test this using -aa-eval with opt, since the default constructed pass is run. The wrapper subclass allows the default constructor to pass the necessary callback. llvm-svn: 346353	2018-11-07 20:26:42 +00:00
James Y Knight	72f76bf230	Add support for llvm.is.constant intrinsic (PR4898) This adds the llvm-side support for post-inlining evaluation of the __builtin_constant_p GCC intrinsic. Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call to a function where canConstantFoldTo returns true, and one of the arguments is a struct. Updated from patch initially by Janusz Sobczak. Differential Revision: https://reviews.llvm.org/D4276 llvm-svn: 346322	2018-11-07 15:24:12 +00:00
Calixte Denizet	8f07efc7c5	Fix unit tests after patch https://reviews.llvm.org/rL346313 Summary: Tests are broken so fix them. Reviewers: marco-c Reviewed By: marco-c Subscribers: sylvestre.ledru, llvm-commits Differential Revision: https://reviews.llvm.org/D54208 llvm-svn: 346318	2018-11-07 14:46:26 +00:00
Calixte Denizet	c3bed1e8e6	[GCOV] Flush counters before to avoid counting the execution before fork twice and for exec functions we must flush before the call Summary: This is replacement for patch in https://reviews.llvm.org/D49460. When we fork, the counters are duplicate as they're and so the values are finally wrong when writing gcda for parent and child. So just before to fork, we flush the counters and so the parent and the child have new counters set to zero. For exec functions, we need to flush before the call to have some data. Reviewers: vsk, davidxl, marco-c Reviewed By: marco-c Subscribers: llvm-commits, sylvestre.ledru, marco-c Differential Revision: https://reviews.llvm.org/D53593 llvm-svn: 346313	2018-11-07 13:49:17 +00:00
Teresa Johnson	cb397461e1	[ThinLTO] Split NotEligibleToImport into legality and inlinability flags Summary: The NotEligibleToImport flag on the GlobalValueSummary was set if it isn't legal to import (e.g. because it references unpromotable locals) and when it can't be inlined (in which case importing is pointless). I split out the inlinable piece into a separate flag on the FunctionSummary (doesn't make sense for aliases or global variables), because in the future we may want to import for reasons other than inlining. Reviewers: davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D53345 llvm-svn: 346261	2018-11-06 19:41:35 +00:00
Eli Friedman	e3a5fc6d80	Disable calls to _finite and other glibc-only functions on Musl. Non-GNU environments don't have __finite_, so treat them as unavailable. Differential Revision: https://reviews.llvm.org/D51282 llvm-svn: 346250	2018-11-06 18:23:32 +00:00
Max Kazantsev	4855b74f8b	[NFC] Turn collectTransitivePredecessors into a static function llvm-svn: 346217	2018-11-06 09:07:03 +00:00
Sanjay Patel	1440107821	[InstSimplify] fold select (fcmp X, Y), X, Y This is NFCI for InstCombine because it calls InstSimplify, so I left the tests for this transform there. As noted in the code comment, we can allow this fold more often by using FMF and/or value tracking. llvm-svn: 346169	2018-11-05 21:51:39 +00:00
David Green	ba9f245b0d	[Inliner] Penalise inlining of calls with loops at Oz We currently seem to underestimate the size of functions with loops in them, both in terms of absolute code size and in the difficulties of dealing with such code. (Calls, for example, can be tail merged to further reduce codesize). At -Oz, we can then increase code size by inlining small loops multiple times. This attempts to penalise functions with loops at -Oz by adding a CallPenalty for each top level loop in the function. It uses LI (and hence DT) to calculate the number of loops. As we are dealing with minsize, the inline threshold is small and functions at this point should be relatively small, making the construction of these cheap. Differential Revision: https://reviews.llvm.org/D52716 llvm-svn: 346134	2018-11-05 14:54:34 +00:00
Sanjay Patel	e7c94ef1de	[ValueTracking] determine sign of 0.0 from select when matching min/max FP In PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 ..we may fail to recognize/simplify fabs() in some cases because we do not canonicalize fcmp with a -0.0 operand. Adding that canonicalization can cause regressions on min/max FP tests, so that's this patch: for the purpose of determining whether something is min/max, let the value returned by the select determine how we treat a 0.0 operand in the fcmp. This patch doesn't actually change the -0.0 to +0.0. It just changes the analysis, so we don't fail to recognize equivalent min/max patterns that only differ in the signbit of 0.0. Differential Revision: https://reviews.llvm.org/D54001 llvm-svn: 346097	2018-11-04 14:28:48 +00:00
Sanjay Patel	cac28b452e	[ValueTracking] peek through 2-input shuffles in ComputeNumSignBits This patch gives the IR ComputeNumSignBits the same functionality as the DAG version (the code is derived from the existing code). This an extension of the single input shuffle analysis added with D53659. Differential Revision: https://reviews.llvm.org/D53987 llvm-svn: 346071	2018-11-03 13:18:55 +00:00
Easwaran Raman	c5e1506ec8	[ProfileSummary] Add options to override hot and cold count thresholds. Summary: The hot and cold count thresholds are derived from the summary, but for debugging purposes it is convenient to provide the actual thresholds. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54040 llvm-svn: 346005	2018-11-02 17:39:31 +00:00
Sanjay Patel	a68096c73e	[ValueTracking] allow non-canonical shuffles when computing signbits This possibility is noted in D53987 for a different case, so we need to adjust the existing code. llvm-svn: 345988	2018-11-02 15:51:47 +00:00
Alina Sbirlea	fd9722fbc6	[AliasSetTracker] Misc cleanup (NFCI) Summary: Remove two redundant checks, add one in the unit test. Remove an unused method. Fix computation of TotalMayAliasSetSize. llvm-svn: 345911	2018-11-01 23:37:51 +00:00
Reid Kleckner	4dc0b1ac60	Fix clang -Wimplicit-fallthrough warnings across llvm, NFC This patch should not introduce any behavior changes. It consists of mostly one of two changes: 1. Replacing fall through comments with the LLVM_FALLTHROUGH macro 2. Inserting 'break' before falling through into a case block consisting of only 'break'. We were already using this warning with GCC, but its warning behaves slightly differently. In this patch, the following differences are relevant: 1. GCC recognizes comments that say "fall through" as annotations, clang doesn't 2. GCC doesn't warn on "case N: foo(); default: break;", clang does 3. GCC doesn't warn when the case contains a switch, but falls through the outer case. I will enable the warning separately in a follow-up patch so that it can be cleanly reverted if necessary. Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu Differential Revision: https://reviews.llvm.org/D53950 llvm-svn: 345882	2018-11-01 19:54:45 +00:00
Sanjay Patel	746ebb4ee8	[InstSimplify] fold icmp based on range of abs/nabs (2nd try) This is retrying the fold from rL345717 (reverted at rL347780) ...with a fix for the miscompile demonstrated by PR39510: https://bugs.llvm.org/show_bug.cgi?id=39510 Original commit message: This is a fix for PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 We managed to get some of these patterns using computeKnownBits in https://reviews.llvm.org/D47041, but that can't be used for nabs(). Instead, put in some range-based logic, so we can fold both abs/nabs with icmp with a constant value. Alive proofs: https://rise4fun.com/Alive/21r Name: abs_nsw_is_positive %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp sgt i32 %abs, -1 => %r = i1 true Name: abs_nsw_is_not_negative %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp slt i32 %abs, 0 => %r = i1 false Name: nabs_is_negative_or_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp slt i32 %nabs, 1 => %r = i1 true Name: nabs_is_not_over_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp sgt i32 %nabs, 0 => %r = i1 false Differential Revision: https://reviews.llvm.org/D53844 llvm-svn: 345832	2018-11-01 14:07:39 +00:00
Max Kazantsev	bb84407f3d	[NFC] Specialize public API of ICFLoopSafetyInfo for insertions and removals llvm-svn: 345822	2018-11-01 10:16:06 +00:00
Max Kazantsev	e0a2613aea	[SCEV] Avoid redundant computations when doing AddRec merge When we calculate a product of 2 AddRecs, we end up making quite massive computations to deduce the operands of resulting AddRec. This process can be optimized by computing all args of intermediate sum and then calling `getAddExpr` once rather than calling `getAddExpr` with intermediate result every time a new argument is computed. Differential Revision: https://reviews.llvm.org/D53189 Reviewed By: rtereshin llvm-svn: 345813	2018-11-01 06:18:27 +00:00
Sanjay Patel	72fe03f93b	revert rL345717 : [InstSimplify] fold icmp based on range of abs/nabs This can miscompile as shown in PR39510: https://bugs.llvm.org/show_bug.cgi?id=39510 llvm-svn: 345780	2018-10-31 21:37:40 +00:00
Sanjay Patel	d4dc30c20d	[InstSimplify] fold 'fcmp nnan ult X, 0.0' when X is not negative This is the inverted case for the transform added with D53874 / rL345725. llvm-svn: 345728	2018-10-31 15:35:46 +00:00
Sanjay Patel	85cba3b6fb	[InstSimplify] fold 'fcmp nnan oge X, 0.0' when X is not negative This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 ...but given the current implementation (no FMF on casts), this is likely the only way to predicate the transform. This is part of solving PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 Differential Revision: https://reviews.llvm.org/D53874 llvm-svn: 345725	2018-10-31 14:57:23 +00:00
Sanjay Patel	2efccd2cf2	[InstSimplify] fold icmp based on range of abs/nabs This is a fix for PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 We managed to get some of these patterns using computeKnownBits in D47041, but that can't be used for nabs(). Instead, put in some range-based logic, so we can fold both abs/nabs with icmp with a constant value. Alive proofs: https://rise4fun.com/Alive/21r Name: abs_nsw_is_positive %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp sgt i32 %abs, -1 => %r = i1 true Name: abs_nsw_is_not_negative %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp slt i32 %abs, 0 => %r = i1 false Name: nabs_is_negative_or_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp slt i32 %nabs, 1 => %r = i1 true Name: nabs_is_not_over_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp sgt i32 %nabs, 0 => %r = i1 false Differential Revision: https://reviews.llvm.org/D53844 llvm-svn: 345717	2018-10-31 13:25:10 +00:00
Dorit Nuzman	34da6dd696	[LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705	2018-10-31 09:57:56 +00:00
Matthias Braun	9fd397b423	ADT/STLExtras: Introduce llvm::empty; NFC This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679	2018-10-31 00:23:23 +00:00
Alina Sbirlea	4f5d337199	[AliasSetTracker] Cleanup addPointer interface. [NFCI] Summary: Attempting to simplify the addPointer interface. Currently there's code decomposing a MemoryLocation into (Ptr, Size, AAMDNodes) only to recreate the MemoryLocation inside the call. Reviewers: reames, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D53836 llvm-svn: 345548	2018-10-29 22:25:59 +00:00
Renato Golin	53bd4f4832	Revert r344172: [LV] Add a new reduction pattern match This patch has caused fast-math issues in the reduction pattern. Will re-work and land again. llvm-svn: 345465	2018-10-27 22:13:43 +00:00
Sanjay Patel	cc9e401e3c	[ValueTracking] peek through shuffles in ComputeNumSignBits (PR37549) The motivating case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The analysis improvement allows us to form a vector 'select' out of bitwise logic (the use of ComputeNumSignBits was added at rL345149). The smaller test shows another InstCombine improvement - we use ComputeNumSignBits to add 'nsw' to shift-left. But the negative test shows an example where we must not add 'nsw' - when the shuffle mask contains undef elements. Differential Revision: https://reviews.llvm.org/D53659 llvm-svn: 345429	2018-10-26 21:05:14 +00:00
Dorit Nuzman	3ec99fe21b	[IAI,LV] Avoid creating a scalar epilogue due to gaps in interleave-groups when optimizing for size LV is careful to respect -Os and not to create a scalar epilog in all cases (runtime tests, trip-counts that require a remainder loop) except for peeling due to gaps in interleave-groups. This patch fixes that; -Os will now have us invalidate such interleave-groups and vectorize without an epilog. The patch also removes a related FIXME comment that is now obsolete, and was also inaccurate: "FIXME: return None if loop requiresScalarEpilog(<MaxVF>), or look for a smaller MaxVF that does not require a scalar epilog." (requiresScalarEpilog() has nothing to do with VF). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53420 llvm-svn: 344883	2018-10-22 06:17:09 +00:00
Thomas Lively	8a91cf1cc5	[LoopVectorize] Loop vectorization for minimum and maximum Summary: Depends on D52766. Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52767 llvm-svn: 344816	2018-10-19 21:11:43 +00:00
Thomas Lively	c339250e12	[InstCombine] InstCombine and InstSimplify for minimum and maximum Summary: Depends on D52765 Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52766 llvm-svn: 344799	2018-10-19 19:01:26 +00:00
Thomas Lively	fa54e56d84	[ConstantFolding] Constant fold minimum and maximum intrinsics Summary: Depends on D52764 Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52765 llvm-svn: 344796	2018-10-19 18:15:32 +00:00
Chandler Carruth	608e6faa06	[TI removal] Switch some newly added code over to use `Instruction` directly. llvm-svn: 344768	2018-10-19 00:22:10 +00:00
Nicolai Haehnle	59041687be	[DA] DivergenceAnalysis for unstructured, reducible CFGs Summary: This is patch 2 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). This patch contains a generic divergence analysis implementation for unstructured, reducible Control-Flow Graphs. It contains two new classes. The `SyncDependenceAnalysis` class lazily computes sync dependences, which relate divergent branches to points of joining divergent control. The `DivergenceAnalysis` class contains the generic divergence analysis implementation. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: sameerds, kristina, nhaehnle, xbolva00, tschuett, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D51491 llvm-svn: 344734	2018-10-18 09:38:44 +00:00
Chandler Carruth	9871662e24	[TI removal] Switch an analysis to just use Instruction. llvm-svn: 344713	2018-10-18 00:36:15 +00:00
Max Kazantsev	e2566b5d87	[NFC] Remove GOTO from SCEV llvm-svn: 344687	2018-10-17 11:16:25 +00:00
Anna Thomas	6f732bfb79	[LV] Teach vectorizer about variant value store into uniform address Summary: Teach vectorizer about vectorizing variant value stores to uniform address. Similar to rL343028, we do not allow vectorization if we have multiple stores to the same uniform address. Cost model already has the change for considering the extract instruction cost for a variant value store. See added test cases for how vectorization is done. The patch also contains changes to the ORE messages. Reviewers: Ayal, mkuper, anemet, hsaito Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D52656 llvm-svn: 344613	2018-10-16 15:46:26 +00:00
Max Kazantsev	5f9acd279e	[NFC] Introduce ICFLoopSafetyInfo This is an alternative implementation of LoopSafetyInfo that uses the implicit control flow tracking to give precise answers on queries "whether or not this block contains throwing instructions". This rules out false-positive answers on LoopSafetyInfo's queries. This patch only introduces the new implementation. It is not currently used in any pass. The enabling patches will go separately, through review. The plan is to completely replace all uses of LoopSafetyInfo with ICFLoopSafetyInfo in the future, but to avoid introducing functional problems, we will do it pass by pass. llvm-svn: 344601	2018-10-16 09:58:09 +00:00
Max Kazantsev	87de55ad01	[NFC] Remove obsolete method headerMayThrow llvm-svn: 344596	2018-10-16 09:11:25 +00:00
Max Kazantsev	9c90ec2fae	[NFC] Make LoopSafetyInfo abstract to allow alternative implementations llvm-svn: 344592	2018-10-16 08:31:05 +00:00
Max Kazantsev	8d56be7070	[NFC] Encapsulate work with BlockColors in LoopSafetyInfo llvm-svn: 344590	2018-10-16 08:07:14 +00:00
Max Kazantsev	6a4f5e2add	[NFC] Move block throw check inside allLoopPathsLeadToBlock llvm-svn: 344588	2018-10-16 07:50:14 +00:00
Max Kazantsev	c8466f937c	[NFC] Turn isGuaranteedToExecute into a method llvm-svn: 344587	2018-10-16 06:34:53 +00:00
Max Kazantsev	fdfd98ceec	[SCEV] Limit AddRec "simplifications" to avoid combinatorial explosions SCEV's transform that turns `{A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>` into a single AddRec of size `2n+1` with complex combinatorial coefficients can easily trigger exponential growth of the SCEV (in case if nothing gets folded and simplified). We tried to restrain this transform using the option `scalar-evolution-max-add-rec-size`, but its default value seems to be insufficiently small: the test attached to this patch with default value of this option `16` has a SCEV of >3M symbols (when printed out). This patch reduces the simplification limit. It is not a cure to combinatorial explosions, but at least it reduces this corner case to something more or less reasonable. Differential Revision: https://reviews.llvm.org/D53282 Reviewed By: sanjoy llvm-svn: 344584	2018-10-16 05:26:21 +00:00
Chandler Carruth	edb12a838a	[TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502	2018-10-15 10:04:59 +00:00
Chandler Carruth	b99a24689b	[TI removal] Remove TerminatorInst as an input parameter from all public LLVM APIs. There weren't very many. We still have the instruction visitor, and APIs with TerminatorInst as a return type or an output parameter. llvm-svn: 344494	2018-10-15 09:17:09 +00:00
Dorit Nuzman	38bbf81ade	recommit 344472 after fixing build failure on ARM and PPC. llvm-svn: 344475	2018-10-14 08:50:06 +00:00
Dorit Nuzman	5118c68cde	revert 344472 due to failures. llvm-svn: 344473	2018-10-14 07:21:20 +00:00
Dorit Nuzman	8174368955	[IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472	2018-10-14 07:06:16 +00:00
Eugene Leviant	eddf6b5df5	[ThinLTO] Don't import GV which contains blockaddress Differential revision: https://reviews.llvm.org/D53139 llvm-svn: 344325	2018-10-12 07:24:02 +00:00
Max Kazantsev	5dbeff3e1c	[NFC] Factor out getOrCreateAddRecExpr method llvm-svn: 344227	2018-10-11 08:46:39 +00:00
George Burgess IV	6ef8002c2c	Replace most users of UnknownSize with LocationSize::unknown(); NFC Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748). This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work. llvm-svn: 344186	2018-10-10 21:28:44 +00:00
Renato Golin	cb19c8e3aa	[LV] Add a new reduction pattern match Adding a new reduction pattern match for vectorizing code similar to TSVC s3111: for (int i = 0; i < N; i++) if (a[i] > b) sum += a[i]; This patch adds support for fadd, fsub and fmull, as well as multiple branches and different (but compatible) instructions (ex. add+sub) in different branches. I have forwarded to trunk, added fsub and fmul functionality and additional tests, but the credit goes to Takahiro, who did most of the actual work. Differential Revision: https://reviews.llvm.org/D49168 Patch by Takahiro Miyoshi <takahiro.miyoshi@linaro.org>. llvm-svn: 344172	2018-10-10 18:49:49 +00:00
George Burgess IV	40dc63e1f0	[Analysis] Make LocationSizes carry an 'imprecise' bit There are places where we need to merge multiple LocationSizes of different sizes into one, and get a sensible result. There are other places where we want to optimize aggressively based on the value of a LocationSizes (e.g. how can a store of four bytes be to an area of storage that's only two bytes large?) This patch makes LocationSize hold an 'imprecise' bit to note whether the LocationSize can be treated as an upper-bound and lower-bound for the size of a location, or just an upper-bound. This concludes the series of patches leading up to this. The most recent of which is r344108. Fixes PR36228. Differential Revision: https://reviews.llvm.org/D44748 llvm-svn: 344114	2018-10-10 06:39:40 +00:00
George Burgess IV	d98d505c0d	[Analysis] Make LocationSize pretty-printing more descriptive This is the third patch in a series intended to make https://reviews.llvm.org/D44748 more easily reviewable. Please see that patch for more context. The second being r344013. The intent is to make the output of printing a LocationSize more precise. The main motivation for this is that we plan to add a bit to distinguish whether a given LocationSize is an upper-bound or is precise; making that information available in pretty-printing is nice. llvm-svn: 344108	2018-10-10 01:35:22 +00:00
Cameron McInally	bea5967e8c	[FPEnv] PatternMatcher support for checking FNEG ignoring signed zeros https://reviews.llvm.org/D52934 llvm-svn: 344084	2018-10-09 21:48:00 +00:00
Chandler Carruth	d46f580986	[CFG Printer] Add support for writing the dot files with a custom prefix. Use this to direct these files to a specific location in the test suite so that we don't write files out to random directories (or fail if the working directory isn't writable). llvm-svn: 344014	2018-10-09 04:30:23 +00:00
George Burgess IV	f96e618017	Make LocationSize a proper Optional type; NFC This is the second in a series of changes intended to make https://reviews.llvm.org/D44748 more easily reviewable. Please see that patch for more context. The first change being r344012. Since I was requested to do all of this with post-commit review, this is about as small as I can make this patch. This patch makes LocationSize into an actual type that wraps a uint64_t; users are required to call getValue() in order to get the size now. If the LocationSize has an Unknown size (e.g. if LocSize == MemoryLocation::UnknownSize), getValue() will assert. This also adds DenseMap specializations for LocationInfo, which required taking two more values from the set of values LocationInfo can represent. Hence, heavy users of multi-exabyte arrays or structs may observe slightly lower-quality code as a result of this change. The intent is for getValue()s to be very close to a corresponding hasValue() (which is often spelled `!= MemoryLocation::UnknownSize`). Sadly, small diff context appears to crop that out sometimes, and the last change in DSE does require a bit of nonlocal reasoning about control-flow. :/ This also removes an assert, since it's now redundant with the assert in getValue(). llvm-svn: 344013	2018-10-09 03:18:56 +00:00
George Burgess IV	fefc42c9bb	Use locals instead of struct fields; NFC This is one of a series of changes intended to make https://reviews.llvm.org/D44748 more easily reviewable. Please see that patch for more context. Since I was requested to do all of this with post-commit review, this is about as small as I can make it (beyond committing changes to these few files separately, but they're incredibly similar in spirit, so...) On its own, this change doesn't make a great deal of sense. I plan on having a follow-up Real Soon Now(TM) to make the bits here make more sense. :) In particular, the next change in this series is meant to make LocationSize an actual type, which you have to call .getValue() on in order to get at the uint64_t inside. Hence, this change refactors code so that: - we only need to call the soon-to-come getValue() once in most cases, and - said call to getValue() happens very closely to a piece of code that checks if the LocationSize has a value (e.g. if it's != UnknownSize). llvm-svn: 344012	2018-10-09 02:14:33 +00:00
Dorit Nuzman	72f6e29980	[IAI,LV] Avoid creating interleave-groups for predicated accesse This patch fixes PR39099. When strided loads are predicated, each of them will form an interleaved-group (with gaps). However, subsequent stages of vectorization (planning and transformation) assume that if a load is part of an Interleave-Group it is not predicated, resulting in wrong code - unmasked wide loads are created. The Interleaving Analysis does take care not to have conditional interleave groups of size > 1, but until we extend the planning and transformation stages to support masked-interleave-groups we should also avoid having them for size == 1. Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D52682 llvm-svn: 343931	2018-10-07 06:57:25 +00:00
Jonas Paulsson	29d80f07ee	[LoopVectorizer] Use TTI.getOperandInfo() Call getOperandInfo() instead of using (near) duplicated code in LoopVectorizationCostModel::getInstructionCost(). This gets the OperandValueKind and OperandValueProperties values for a Value passed as operand to an arithmetic instruction. getOperandInfo() used to be a static method in TargetTransformInfo.cpp, but is now instead a public member. Review: Florian Hahn https://reviews.llvm.org/D52883 llvm-svn: 343852	2018-10-05 14:34:04 +00:00
Vitaly Buka	0509070811	[cxx2a] Fix warning triggered by r343285 llvm-svn: 343369	2018-09-29 02:17:12 +00:00
Thomas Lively	d47b5c7bed	[ValueTracking] Allow select patterns to work on FP vectors Summary: This CL allows constant vectors of floats to be recognized as non-NaN and non-zero in select patterns. This change makes `matchSelectPattern` more powerful generally, but was motivated specifically because I wanted fminnan and fmaxnan to be created for vector versions of the scalar patterns they are created for. Tested with check-all on all targets. A testcase in the WebAssembly backend that tests the non-nan codepath is in an upcoming CL. Reviewers: aheejin, dschuff Subscribers: sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52324 llvm-svn: 343364	2018-09-28 21:36:43 +00:00
Teresa Johnson	f24136f17a	[WPD] Fix incorrect devirtualization after indirect call promotion Summary: Add a dominance check to ensure that the possible devirtualizable call is actually dominated by the type test/checked load intrinsic being analyzed. With PGO, after indirect call promotion is performed during the compile step, followed by inlining, we may have a type test in the promoted and inlined sequence that allows an indirect call in that sequence to be devirtualized. That indirect call (inserted by inlining after promotion) will share the same vtable pointer as the fallback indirect call that cannot be devirtualized. Before this patch the code was incorrectly devirtualizing the fallback indirect call. See the new test and the example described there for more details. Reviewers: pcc, vitalybuka Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D52514 llvm-svn: 343226	2018-09-27 14:55:32 +00:00
Fangrui Song	0cac726a00	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
Anna Thomas	b1e3d45318	[LV][LAA] Vectorize loop invariant values stored into loop invariant address Summary: We are overly conservative in loop vectorizer with respect to stores to loop invariant addresses. More details in https://bugs.llvm.org/show_bug.cgi?id=38546 This is the first part of the fix where we start with vectorizing loop invariant values to loop invariant addresses. This also includes changes to ORE for stores to invariant address. Reviewers: anemet, Ayal, mkuper, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50665 llvm-svn: 343028	2018-09-25 20:57:20 +00:00
Sanjay Patel	3413a66c38	[Analysis] add comment to generalize finding a scalar op from vector; NFC llvm-svn: 342906	2018-09-24 17:18:32 +00:00
Fedor Sergeev	662e5686fe	[New PM][PassInstrumentation] IR printing support for New Pass Manager Implementing -print-before-all/-print-after-all/-filter-print-func support through PassInstrumentation callbacks. - PrintIR routines implement printing callbacks. - StandardInstrumentations class provides a central place to manage all the "standard" in-tree pass instrumentations. Currently it registers PrintIR callbacks. Reviewers: chandlerc, paquette, philip.pfaffe Differential Revision: https://reviews.llvm.org/D50923 llvm-svn: 342896	2018-09-24 16:08:15 +00:00
JF Bastien	73d8e4e531	Merge clang's isRepeatedBytePattern with LLVM's isBytewiseValue Summary: his code was in CGDecl.cpp and really belongs in LLVM's isBytewiseValue. Teach isBytewiseValue the tricks clang's isRepeatedBytePattern had, including merging undef properly, and recursing on more types. clang part of this patch: D51752 Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51751 llvm-svn: 342709	2018-09-21 05:17:42 +00:00
Sameer AbuAsal	77beee4136	[inline Cost] Don't mark functions accessing varargs as non-inlinable Summary: rL323619 marks functions that are calling va_end as not viable for inlining. This patch reverses that since this va_end doesn't need access to the vriadic arguments list that are saved on the stack, only va_start does. Reviewers: efriedma, fhahn Reviewed By: fhahn Subscribers: eraman, haicheng, llvm-commits Differential Revision: https://reviews.llvm.org/D52067 llvm-svn: 342675	2018-09-20 18:39:34 +00:00
Fedor Sergeev	ee8d31c49e	[New PM] Introducing PassInstrumentation framework Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit, to remove direct dependencies on different IRUnits (e.g. Analyses). PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Made getName helper to return std::string (instead of StringRef initially) to fix asan builtbot failures on CGSCC tests. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342664	2018-09-20 17:08:45 +00:00
Eric Christopher	019889374b	Temporarily Revert "[New PM] Introducing PassInstrumentation framework" as it was causing failures in the asan buildbot. This reverts commit r342597. llvm-svn: 342616	2018-09-20 05:16:29 +00:00
Fedor Sergeev	a5f279ea89	[New PM] Introducing PassInstrumentation framework Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit, to remove direct dependencies on different IRUnits (e.g. Analyses). PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342597	2018-09-19 22:42:57 +00:00
Fedor Sergeev	25de3f83be	Revert rL342544: [New PM] Introducing PassInstrumentation framework A bunch of bots fail to compile unittests. Reverting. llvm-svn: 342552	2018-09-19 14:54:48 +00:00
Fedor Sergeev	875c938fec	[New PM] Introducing PassInstrumentation framework Summary: Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit, to remove direct dependencies on different IRUnits (e.g. Analyses). PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342544	2018-09-19 12:25:52 +00:00
Michael Kruse	534c87df82	[Loopinfo] Remove one latch-case in getLoopID. NFC. getLoopID has different control flow for two cases: If there is a single loop latch and for any other number of loop latches (0 and more than one). The latter case should return the same result if there is only a single latch. We can save the preceding redundant search for a latch by handling both cases with the same code. Differential Revision: https://reviews.llvm.org/D52118 llvm-svn: 342406	2018-09-17 18:40:29 +00:00
Matt Arsenault	80ea6dd1d5	Fix vectorization of canonicalize llvm-svn: 342390	2018-09-17 13:24:30 +00:00
Florian Hahn	1086ce2397	[LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC) Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use them for VPlan outside LoopVectorize.cpp Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00 Reviewed By: rengolin, xbolva00 Differential Revision: https://reviews.llvm.org/D49488 llvm-svn: 342027	2018-09-12 08:01:57 +00:00
Ilya Biryukov	1ea75783fa	Remove unused include from IVDescriptors.cpp. This fixes a layering violation: Analysis/IVDescrtors.cpp can't include Transforms/Utils/BasicBlockUtils.h, since TransformUtils depends on Analysis. llvm-svn: 342024	2018-09-12 07:22:46 +00:00
Vikram TV	7e98d69847	Break LoopUtils into an Analysis file. Summary: The InductionDescriptor and RecurrenceDescriptor classes basically analyze the IR to identify the respective IVs. So, it is better to have them in the "Analysis" directory instead of the "Transforms" directory. The rationale for this is to make the Induction and Recurrence descriptor classes available for analysis passes. Currently including them in an analysis pass produces link error (http://lists.llvm.org/pipermail/llvm-dev/2018-July/124456.html). Induction and Recurrence descriptors are moved from Transforms/Utils/LoopUtils.h\|cpp to Analysis/IVDescriptors.h\|cpp. Reviewers: dmgreen, llvm-commits, hfinkel Reviewed By: dmgreen Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D51153 llvm-svn: 342016	2018-09-12 01:59:43 +00:00
Alexandros Lamprineas	96762b37e1	[MemorySSAUpdater] Avoid creating self-referencing MemoryDefs Fix for https://bugs.llvm.org/show_bug.cgi?id=38807, which occurred while compiling SemaTemplateInstantiate.cpp with clang and GVNHoist enabled. In the following example: 1=def(entry) / \ 2=def(1) 4=def(1) 3=def(2) 5=def(4) When removing the MemoryDef 2=def(1) from its basic block, and just before adding it to the end of the parent basic block, we first replace all its uses with the defining memory access: 3=def(2) -> 3=def(1) Then we call insertDef for adding 2=def(1) to the parent basic block, where we replace the uses of 1=def(entry) with 2=def(1). Doing so we create a self reference: 2=def(1) -> 2=def(2) (bad) 3=def(1) -> 3=def(2) (ok) 4=def(1) -> 4=def(2) (ok) Differential Revision: https://reviews.llvm.org/D51801 llvm-svn: 341947	2018-09-11 14:29:59 +00:00
Johannes Doerfert	a58e9214ac	[LoopInfo] Fix Loop::getLoopID() for loops with multiple latches The previous implementation traversed all loop blocks and bailed if one was not a latch block. Since we are only interested in latch blocks, we should only traverse those. llvm-svn: 341926	2018-09-11 11:44:17 +00:00
Max Kazantsev	bf00f03f56	[NFC] Sanitizing asserts for OrderedBasicBlock llvm-svn: 341917	2018-09-11 08:46:19 +00:00
Max Kazantsev	90edc98c58	[NFC] Rename variable llvm-svn: 341901	2018-09-11 05:10:01 +00:00
Peter Collingbourne	c7d281905b	Prevent Constant Folding From Optimizing inrange GEP This patch does the following things: 1. update SymbolicallyEvaluateGEP so that it bails out if it cannot preserve inrange arribute; 2. update llvm/test/Analysis/ConstantFolding/gep.ll to remove UB in it; 3. remove inaccurate comment above ConstantFoldInstOperandsImpl in llvm/lib/Analysis/ConstantFolding.cpp; 4. add a new regression test that makes sure that no optimizations change an inrange GEP in an unexpected way. Patch by Zhaomo Yang! Differential Revision: https://reviews.llvm.org/D51698 llvm-svn: 341888	2018-09-11 01:53:36 +00:00
Alina Sbirlea	7980099adb	API to update MemorySSA for cloned blocks and added CFG edges. Summary: End goal is to update MemorySSA in all loop passes. LoopUnswitch clones all blocks in a loop. SimpleLoopUnswitch clones some blocks. LoopRotate clones some instructions. Some of these loop passes also make CFG changes. This is an API based on what I found needed in LoopUnswitch, SimpleLoopUnswitch, LoopRotate, LoopInstSimplify, LoopSimplifyCFG. Adding dependent patches using this API for context. Reviewers: george.burgess.iv, dberlin Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D45299 llvm-svn: 341855	2018-09-10 20:13:01 +00:00
Philip Reames	5660bd460b	[AST] Visit memtransfer arguments in order The only point to this change is the test diffs. When I remove this code entirely (in favor of the recently added generic handling), I don't want there to be any confusion due to spurious test diffs. As an aside, the fact out tests are AST construction order dependent is not great. I thought about fixing that, but the reasonable schemes I might want (e.g. sort by name) need the test diffs anyways. Philip llvm-svn: 341841	2018-09-10 16:00:27 +00:00
Alina Sbirlea	65f385da63	[MemorySSA] Relax verification of clobbering accesses. llvm-svn: 341733	2018-09-07 23:51:41 +00:00
Philip Reames	cb8b3278e5	[AST] Generalize argument specific aliasing AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated. The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics and only these three intrinsics. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between a instructions and alias sets. Calls happened to be an easy target. The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builts by LICM. Note: The actual removal of the memset/memtransfer specific handling will happen in a follow on NFC patch. It was originally part of this one, but separate for ease of review and rebase. Differential Revision: https://reviews.llvm.org/D50730 llvm-svn: 341713	2018-09-07 21:36:11 +00:00
Alina Sbirlea	f98c2c5e6d	[MemorySSA] Update MemoryPhi wiring for block splitting to consider if identical edges were merged. Summary: Block splitting is done with either identical edges being merged, or not. Only critical edges can be split without merging identical edges based on an option. Teach the memoryssa updater to take this into account: for the same edge between two blocks only move one entry from the Phi in Old to the new Phi in New. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D51563 llvm-svn: 341709	2018-09-07 21:14:48 +00:00
Jessica Paquette	a0aa5b35e7	Output per-function size-info remarks This patch adds per-function size information remarks. Previously, passing -Rpass-analysis=size-info would only give you per-module changes. By adding the ability to do this per-function, it's easier to see which functions contributed the most to size changes. https://reviews.llvm.org/D51467 llvm-svn: 341588	2018-09-06 21:19:54 +00:00
Alina Sbirlea	b23648cfdb	[LoopPassManager] MemorySSA should be preserved when enabled. llvm-svn: 341587	2018-09-06 20:54:24 +00:00
Max Kazantsev	6afebe1d30	[NFC] Simplify inner structure of InstructionPrecedenceTracking Currently it has a set KnownBlocks that marks blocks as having cached answers and a map FirstSpecialInsts that maps these blocks to first special instructions in them. The value in the map is always non-null, and for blocks that are known to have no special instructions the map does not have an instance. This patch removes KnownBlocks as obsolete. Instead, for blocks that are known to have no special instructions, we just put a nullptr value. This makes the code much easier to read. llvm-svn: 341531	2018-09-06 09:29:42 +00:00
Max Kazantsev	8a9e059e5c	Return "[NFC] Add severe validation of InstructionPrecedenceTracking" This validation patch has been reverted as rL341147 because of conserns raised by @reames. This revision returns it as is to raise a discussion and address the concerns. Differential Revision: https://reviews.llvm.org/D51523 Reviewed By: reames llvm-svn: 341526	2018-09-06 08:33:02 +00:00
Jessica Paquette	71e9778006	[NFC] Optionally pass a function to emitInstrCountChangedRemark In basic block, loop, and function passes, we already have a function that we can use to emit optimization remarks. We can use that instead of searching the module for the first suitable function (that is, one that contains at least one basic block.) llvm-svn: 341253	2018-08-31 20:54:37 +00:00
Jessica Paquette	9a23c55920	[NFC] Pass the instruction delta to emitInstrCountChangedRemark Instead of counting the size of the entire module every time we run a pass, pass along a delta instead and use that to emit the remark. This means we only have to use (on average) smaller IR units to calculate instruction counts. E.g, in a BB pass, we only need to look at the delta of the BB instead of the delta of the entire module. 6/6 (This improved compile time for size remarks on sqlite3 + O2 significantly) llvm-svn: 341250	2018-08-31 20:20:57 +00:00
Jessica Paquette	1fc443b887	[NFC] Pre-calculate SCC IR counts in size remarks. Same vein as the previous commits. Pre-calculate the size of the module and use that to decide if we're going to emit a remark. This one comes with a FIXME and TODO. First off, CallGraphSCC and CallGraphNode don't have a getInstructionCount function. So, for now, we do the same thing as in a module pass. Second off, we're not really saving anything here yet, because as before, I need to change emitInstrCountChangedRemark to take in a delta. Keeping the patches small though, so that's coming up next. 5/6 llvm-svn: 341249	2018-08-31 20:20:56 +00:00
Jessica Paquette	872a4c92b2	[NFC] Pre-calculate loop IR counts in size remarks. Another commit reducing compile time in size remarks. Cache the size of the module and loop, and update values based off of deltas instead. Avoid recalculating the size of the whole module whenever possible. 3/6 llvm-svn: 341247	2018-08-31 20:20:54 +00:00
Max Kazantsev	c683d643c9	Revert "[NFC] Add severe validation of InstructionPrecedenceTracking" for discussion llvm-svn: 341147	2018-08-31 00:01:54 +00:00
Andrew Kaylor	d9b6b81d08	Reverting r340807. This patch restores the old behavior of getAllocationDataForFunction in MemoryBuiltins.cpp. llvm-svn: 341091	2018-08-30 18:37:18 +00:00
Nicolai Haehnle	35617ed4cb	[NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes. Patch by: Simon Moll Reviewed By: nhaehnle Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50434 Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071	2018-08-30 14:21:36 +00:00
Max Kazantsev	ec9b386820	[NFC] Add severe validation of InstructionPrecedenceTracking llvm-svn: 341051	2018-08-30 10:26:06 +00:00
Max Kazantsev	d3487bdb61	[NFC] Rename map to make the naming consistent llvm-svn: 341043	2018-08-30 09:24:33 +00:00
Max Kazantsev	d3a4cbe153	[NFC] Move OrderedInstructions and InstructionPrecedenceTracking to Analysis These classes don't make any changes to IR and have no reason to be in Transform/Utils. This patch moves them to Analysis folder. This will allow us reusing these classes in some analyzes, like MustExecute. llvm-svn: 341015	2018-08-30 04:49:03 +00:00
Max Kazantsev	3c284bde3f	Re-enable "[NFC] Unify guards detection" rL340921 has been reverted by rL340923 due to linkage dependency from Transform/Utils to Analysis which is not allowed. In this patch this has been fixed, a new utility function moved to Analysis. Differential Revision: https://reviews.llvm.org/D51152 llvm-svn: 341014	2018-08-30 03:39:16 +00:00
Alina Sbirlea	6edcc9ee86	[MemorySSA] Silence warning. llvm-svn: 340995	2018-08-29 23:20:29 +00:00
Alina Sbirlea	5bce4d5a85	[MemorySSA] Fix checkClobberSanity to skip Start only for Defs and Uses. llvm-svn: 340981	2018-08-29 22:38:51 +00:00
Philip Reames	f562fc8dbf	[LICM] Hoist stores of invariant values to invariant addresses out of loops Teach LICM to hoist stores out of loops when the store writes to a location otherwise unused in the loop, writes a value which is invariant, and is guaranteed to execute if the loop is entered. Worth noting is that this transformation is partially overlapping with the existing promotion transformation. Reasons this is worthwhile anyway include: * For multi-exit loops, this doesn't require duplication of the store. * It kicks in for case where we can't prove we exit through a normal exit (i.e. we may throw), but can prove the store executes before that possible side exit. Differential Revision: https://reviews.llvm.org/D50925 llvm-svn: 340974	2018-08-29 21:49:30 +00:00
Alina Sbirlea	f5403d83e7	[MemorySSA] Add expesive check for validating clobber accesses. Summary: Add validation of clobber accesses as expensive check. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D51327 llvm-svn: 340951	2018-08-29 18:26:04 +00:00
Hans Wennborg	2c390c54f6	Revert r340921 "[NFC] Unify guards detection" This broke the build, see e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv8-lnt/builds/4626/ http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/18647/ http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/5856/ http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/22800/ > We have multiple places in code where we try to identify whether or not > some instruction is a guard. This patch factors out this logic into a separate > utility function which works uniformly in all places. > > Differential Revision: https://reviews.llvm.org/D51152 > Reviewed By: fedor.sergeev llvm-svn: 340923	2018-08-29 12:21:32 +00:00
Max Kazantsev	1dafaa87d9	[NFC] Unify guards detection We have multiple places in code where we try to identify whether or not some instruction is a guard. This patch factors out this logic into a separate utility function which works uniformly in all places. Differential Revision: https://reviews.llvm.org/D51152 Reviewed By: fedor.sergeev llvm-svn: 340921	2018-08-29 11:37:34 +00:00
Fedor Sergeev	43083111a2	[NFC][PassTiming] factor out generic PassTimingInfo Moving PassTimingInfo from legacy pass manager code into a separate header. Making it suitable for both legacy and new pass manager. Adding a test on -time-passes main functionality. llvm-svn: 340872	2018-08-28 21:06:51 +00:00
David Chisnall	5e52cadf89	Fix in getAllocationDataForFunction Summary: Correct to use set like behaviour of AllocType. Should check for subset, not precise value. Reviewers: theraven Reviewed By: theraven Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50959 llvm-svn: 340807	2018-08-28 08:59:06 +00:00
George Burgess IV	6a9aa02ff3	[MemorySSA] Add NDEBUG checks to verifiers; NFC verify*() methods are intended to have no side-effects (unless we detect broken MSSA, in which case they assert()), and all of the other verify methods are wrapped by `#ifndef NDEBUG`. llvm-svn: 340793	2018-08-28 00:32:32 +00:00
Roman Tereshin	02320eee6b	Revert "[SCEV][NFC] Check NoWrap flags before lexicographical comparison of SCEVs" This reverts r319889. Unfortunately, wrapping flags are not a part of SCEV's identity (they do not participate in computing a hash value or in equality comparisons) and in fact they could be assigned after the fact w/o rebuilding a SCEV. Grep for const_cast's to see quite a few of examples, apparently all for AddRec's at the moment. So, if 2 expressions get built in 2 slightly different ways: one with flags set in the beginning, the other with the flags attached later on, we may end up with 2 expressions which are exactly the same but have their operands swapped in one of the commutative N-ary expressions, and at least one of them will have "sorted by complexity" invariant broken. 2 identical SCEV's won't compare equal by pointer comparison as they are supposed to. A real-world reproducer is added as a regression test: the issue described causes 2 identical SCEV expressions to have different order of operands and therefore compare not equal, which in its turn prevents LoadStoreVectorizer from vectorizing a pair of consecutive loads. On a larger example (the source of the test attached, which is a bugpoint) I have seen even weirder behavior: adding a constant to an existing SCEV changes the order of the existing terms, for instance, getAddExpr(1, ((A * B) + (C * D))) returns (1 + (C * D) + (A * B)). Differential Revision: https://reviews.llvm.org/D40645 llvm-svn: 340777	2018-08-27 21:41:37 +00:00
Chandler Carruth	9ae926b973	[IR] Replace `isa<TerminatorInst>` with `isTerminator()`. This is a bit awkward in a handful of places where we didn't even have an instruction and now we have to see if we can build one. But on the whole, this seems like a win and at worst a reasonable cost for removing `TerminatorInst`. All of this is part of the removal of `TerminatorInst` from the `Instruction` type hierarchy. llvm-svn: 340701	2018-08-26 09:51:22 +00:00
Chandler Carruth	96fc1de77d	[IR] Begin removal of TerminatorInst by removing successor manipulation. The core get and set routines move to the `Instruction` class. These routines are only valid to call on instructions which are terminators. The iterator and generic range based access move to `CFG.h` where all the other generic successor and predecessor access lives. While moving the iterator here, simplify it using the iterator utilities LLVM provides and updates coding style as much as reasonable. The APIs remain pointer-heavy when they could better use references, and retain the odd behavior of `operator*` and `operator->` that is common in LLVM iterators. Adjusting this API, if desired, should be a follow-up step. Non-generic range iteration is added for the two instructions where there is an especially easy mechanism and where there was code attempting to use the range accessor from a specific subclass: `indirectbr` and `br`. In both cases, the successors are contiguous operands and can be easily iterated via the operand list. This is the first major patch in removing the `TerminatorInst` type from the IR's instruction type hierarchy. This change was discussed in an RFC here and was pretty clearly positive: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html There will be a series of much more mechanical changes following this one to complete this move. Differential Revision: https://reviews.llvm.org/D47467 llvm-svn: 340698	2018-08-26 08:41:15 +00:00
John Brawn	980da83f84	[PhiValues] Use callback value handles to invalidate deleted values The way that PhiValues is integrated with BasicAA it is possible for a pass which uses BasicAA to pick up an instance of BasicAA that uses PhiValues without intending to, and then delete values from a function in a way that causes PhiValues to return dangling pointers to these deleted values. Fix this by having a set of callback value handles to invalidate values when they're deleted. llvm-svn: 340613	2018-08-24 15:48:30 +00:00
Craig Topper	dfa176e813	[ValueTracking] Fix assert message and add test case for r340546 and PR38677. The bug was already fixed. This just adds a test case for it. llvm-svn: 340556	2018-08-23 17:45:53 +00:00
Craig Topper	15f8692381	[ValueTracking] Fix an assert from r340480. We need to allow ConstantExpr Selects in addition to SelectInst. I'll try to put together a test case, but I wanted to fix the issues being reported. Fixes PR38677 llvm-svn: 340546	2018-08-23 17:15:02 +00:00
Craig Topper	bec15b6516	[ValueTracking] Teach computeNumSignBits to understand min/max clamp patterns with constant/splat values If we have a min/max pair we can do a better job of counting sign bits if we look at them together. This is similar to what is done in the SelectionDAG version of computeNumSignBits for ISD::SMAX/SMIN. Differential Revision: https://reviews.llvm.org/D51112 llvm-svn: 340480	2018-08-22 23:27:50 +00:00
George Burgess IV	5676a5d48c	[MemorySSA] Invalidate optimized Defs upon moving them; NFC We're currently getting this behavior implicitly, since we determine if a Def's optimization is valid based on the ID of its defining access. This is incorrect, though I wouldn't be surprised if this was masked in part by that we're using a WeakVH to track what Defs are optimized to. (Not to mention that we don't move Defs super often, AFAICT). I'll submit a patch to fix this shortly. This also includes a minor refactor to reduce duplication a bit. No test is included, since like said, this already happens to be our behavior. I'll add a test for this with my fix to the other bug mentioned above. llvm-svn: 340461	2018-08-22 22:34:38 +00:00
Philip Reames	8abf4484fe	[AA] Remove a needless variable [NFC] There's no need to track a seperate variable for argmemonly aliasing. This falls out naturally of the modinfo union. Note that we may return earlier than we would have earlier if all arguments are explicitly readnone. The overall result doesn't change, just how we get there. llvm-svn: 340443	2018-08-22 19:50:45 +00:00
Philip Reames	f8681cea87	[AST] Minor whitespace cleanup [NFC] llvm-svn: 340440	2018-08-22 19:30:46 +00:00
George Burgess IV	d61e7071cd	[MemorySSA] Move two simple getters; NFC We're calling these functions quite a bit from outside of MemorySSA.cpp now. Given that they're relatively simple one-liners, I think the style preference is to have them inline. llvm-svn: 340430	2018-08-22 18:02:46 +00:00
Philip Reames	fdd73b5037	[AST] Fix a whitespace typo [NFC] llvm-svn: 340384	2018-08-22 03:36:42 +00:00
Philip Reames	5d90c14b76	[AST] Reorder code to reduce a future patch diff [NFC] llvm-svn: 340383	2018-08-22 03:33:55 +00:00
Philip Reames	825c74c241	[AST] Move a function definition into the cpp [NFC] llvm-svn: 340382	2018-08-22 03:32:52 +00:00
Philip Reames	c3c23e8cf2	[AST] Remove notion of volatile from alias sets [NFCI] Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet. It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.) There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for.. llvm-svn: 340312	2018-08-21 17:59:11 +00:00
Sanjay Patel	f3ae9cc33e	[InstSimplify] use isKnownNeverNaN to fold more fcmp ord/uno Remove duplicate tests from InstCombine that were added with D50582. I left negative tests there to verify that nothing in InstCombine tries to go overboard. If isKnownNeverNaN is improved to handle the FP binops or other cases, we should have coverage under InstSimplify, so we could remove more duplicate tests from InstCombine at that time. llvm-svn: 340279	2018-08-21 14:45:13 +00:00
Max Kazantsev	bfbd4d1fb6	[NFC] Factor out predecessors collection into a separate method It may be reused in a different piece of logic. Differential Revision: https://reviews.llvm.org/D50890 Reviewed By: reames llvm-svn: 340250	2018-08-21 07:15:06 +00:00
Philip Reames	a5a8546ac6	[AST] Mark invariant.starts as being readonly These intrinsics are modelled as writing for control flow purposes, but they don't actually write to any location. Marking these - as we did for guards - allows LICM to hoist loads out of loops containing invariant.starts. Differential Revision: https://reviews.llvm.org/D50861 llvm-svn: 340245	2018-08-21 00:55:35 +00:00
Matt Arsenault	450fcc77a7	ValueTracking: Handle more instructions in isKnownNeverNaN llvm-svn: 340187	2018-08-20 16:51:00 +00:00
Philip Reames	96bc076c3a	[AST] Clarify printing of unknown size locations [NFC] Printing "unknown" is much more clear than an arbitrary large integer llvm-svn: 340108	2018-08-17 23:17:31 +00:00
Alina Sbirlea	0dfe830318	[IDF] Teach Iterated Dominance Frontier to use a snapshot CFG based on a GraphDiff. Summary: Create the ability to compute IDF using a CFG View. For this, we'll need a new DT created using a list of Updates (to be refactored later to a GraphDiff), and the GraphTraits based on the same GraphDiff. Reviewers: kuhar, george.burgess.iv, mzolotukhin Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D50675 llvm-svn: 340052	2018-08-17 17:39:15 +00:00
Florian Hahn	19f9e32f07	[InstrSimplify,NewGVN] Add option to ignore additional instr info when simplifying. NewGVN uses InstructionSimplify for simplifications of leaders of congruence classes. It is not guaranteed that the metadata or other flags/keywords (like nsw or exact) of the leader is available for all members in a congruence class, so we cannot use it for simplification. This patch adds a InstrInfoQuery struct with a boolean field UseInstrInfo (which defaults to true to keep the current behavior as default) and a set of helper methods to get metadata/keywords for a given instruction, if UseInstrInfo is true. The whole thing might need a better name, to avoid confusion with TargetInstrInfo but I am not sure what a better name would be. The current patch threads through InstrInfoQuery to the required places, which is messier then it would need to be, if InstructionSimplify and ValueTracking would share the same Query struct. The reason I added it as a separate struct is that it can be shared between InstructionSimplify and ValueTracking's query objects. Also, some places do not need a full query object, just the InstrInfoQuery. It also updates some interfaces that do not take a Query object, but a set of optional parameters to take an additional boolean UseInstrInfo. See https://bugs.llvm.org/show_bug.cgi?id=37540. Reviewers: dberlin, davide, efriedma, sebpop, hiraditya Reviewed By: hiraditya Differential Revision: https://reviews.llvm.org/D47143 llvm-svn: 340031	2018-08-17 14:39:04 +00:00
Sanjay Patel	411b86081e	[ConstantFolding] add simplifications for funnel shift intrinsics This is another step towards being able to canonicalize to the funnel shift intrinsics in IR (see D49242 for the initial patch). We should not have any loss of simplification power in IR between these and the equivalent IR constructs. Differential Revision: https://reviews.llvm.org/D50848 llvm-svn: 340022	2018-08-17 13:23:44 +00:00
Max Kazantsev	7b78d3920c	[MustExecute] Fix algorithmic bug in isGuaranteedToExecute. PR38514 The description of `isGuaranteedToExecute` does not correspond to its implementation. According to description, it should return `true` if an instruction is executed under the assumption that its loop is entered. However there is a sophisticated alrogithm inside that tries to prove that the instruction is executed if the loop is exited, which is not the same thing for infinite loops. There is an attempt to protect from dealing with infinite loops by prohibiting loops without exit blocks, however an infinite loop can have exit blocks. As result of that, MustExecute can falsely consider some blocks that are never entered as mustexec, and LICM can hoist dangerous instructions out of them basing on this fact. This may introduce UB to programs which did not contain it initially. This patch removes the problematic algorithm and replaced it with a one which tries to prove what is required in description. Differential Revision: https://reviews.llvm.org/D50558 Reviewed By: reames llvm-svn: 339984	2018-08-17 06:19:17 +00:00
Philip Reames	684fa57ef7	[MemLoc] Fix a bug causing any use of invariant.end to crash in LICM The fix is fairly simple, but is says something unpleasant about the usage and testing of invariant.start/end scopes that this went undetected. To put this in perspective, any invariant.end in a loop flowing through LICM crashed. I haven't bothered to figure out just how far back this goes, but it's not caused by any of the recent changes. We're probably talking months if not years. llvm-svn: 339936	2018-08-16 20:48:55 +00:00
Philip Reames	0e2f9b9e30	[LICM][NFC] Restructure pointer invalidation API in terms of MemoryLocation Main value is just simplifying code. I'll further simply the argument handling case in a bit, but that involved a slightly orthogonal change so I went with the mildy ugly intermediate for this patch. Note that the isSized check in the old LICM code was not carried across. It turns out that check was dead. a) no test exercised it, and b) langref and verifier had been updated to disallow unsized types used in loads. llvm-svn: 339930	2018-08-16 20:11:15 +00:00
Max Kazantsev	a7415874c9	[NFC] Add missing const modifier llvm-svn: 339844	2018-08-16 06:28:04 +00:00
Easwaran Raman	aca738b742	[BFI] Use rounding while computing profile counts. Summary: Profile count of a block is computed by multiplying its block frequency by entry count and dividing the result by entry block frequency. Do rounded division in the last step and update test cases appropriately. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50822 llvm-svn: 339835	2018-08-16 00:26:59 +00:00
Alina Sbirlea	cc2e8ccc6f	[MemorySSA] Expose the verify as a debug option. Summary: Expose VerifyMemorySSA as a debug option. If set, passes will call the MSSA->verifyMemorySSA() after calling into the updater's APIs when MemorySSA should be valid. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D50749 llvm-svn: 339795	2018-08-15 17:34:55 +00:00
Max Kazantsev	5a10d127b9	[AliasSetTracker] Do not treat experimental_guard intrinsic as memory writing instruction The `experimental_guard` intrinsic has memory write semantics to model the thread-exiting logic, but does not do any actual writes to memory. Currently, `AliasSetTracker` treats it as a normal memory write. As result, a loop-invariant load cannot be hoisted out of loop because the guard may possibly alias with it. This patch makes `AliasSetTracker` so that it doesn't treat guards as memory writes. Differential Revision: https://reviews.llvm.org/D50497 Reviewed By: reames llvm-svn: 339753	2018-08-15 06:21:02 +00:00
Max Kazantsev	530b8d1c3d	[NFC] Refactoring of LoopSafetyInfo, step 1 Turn structure into class, encapsulate methods, add clarifying comments. Differential Revision: https://reviews.llvm.org/D50693 Reviewed By: reames llvm-svn: 339752	2018-08-15 05:55:43 +00:00
Reid Kleckner	40e7663b1f	[BasicAA] Don't assume tail calls with byval don't alias allocas Summary: Calls marked 'tail' cannot read or write allocas from the current frame because the current frame might be destroyed by the time they run. However, a tail call may use an alloca with byval. Calling with byval copies the contents of the alloca into argument registers or stack slots, so there is no lifetime issue. Tail calls never modify allocas, so we can return just ModRefInfo::Ref. Fixes PR38466, a longstanding bug. Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50679 llvm-svn: 339636	2018-08-14 01:24:35 +00:00
Philip Reames	90bffb3eb9	[AST] Minor formatting cleanup [NFC] llvm-svn: 339627	2018-08-13 22:34:14 +00:00
Philip Reames	0f396696d1	[AST] Cleanup code by using MemoryLocation utility [NFC] Differential Revision: https://reviews.llvm.org/D50588 llvm-svn: 339625	2018-08-13 22:25:16 +00:00
Craig Topper	484b342c68	[X86] Add constant folding for AVX512 versions of scalar floating point to integer conversion intrinsics. Summary: We've supported constant folding for sse versions for many years. This patch adds support for the avx512 versions including unsigned with the default rounding mode. We could probably do more with other roundings modes and SAE in the future. The test cases are largely based on the sse.ll test cases. But I did add some test cases to ensure the unsigned versions don't accept negative values. Also checked the bounds of f64->i32 conversions to make sure unsigned has a larger positive range than signed. Reviewers: RKSimon, spatel, chandlerc Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50553 llvm-svn: 339529	2018-08-12 22:09:54 +00:00
Benjamin Kramer	bae6aab6fb	[InstSimplify] Guard against large shift amounts. These are always UB, but can happen for large integer inputs. Testing it is very fragile as -simplifycfg will nuke the UB top-down. llvm-svn: 339515	2018-08-12 11:43:03 +00:00
George Burgess IV	ff08c80efc	[MemorySSA] "Fix" lifetime intrinsic handling MemorySSA currently creates MemoryAccesses for lifetime intrinsics, and sometimes treats them as clobbers. This may/may not be the best way forward, but while we're doing it, we should consider MayAlias/PartialAlias to be clobbers. The ideal fix here is probably to remove all of this reasoning about lifetimes from MemorySSA + put it into the passes that need to care. But that's a wayyy broader fix that needs some consensus, and we have miscompiles + a release branch today, and this should solve the miscompiles just as well. differential revision is D43269. Landing without an explicit LGTM (and without using the special please-autoclose-this syntax) so we can still use that revision as a place to decide what the right fix here is. llvm-svn: 339411	2018-08-10 05:14:43 +00:00
Matt Arsenault	d54b7f0592	ValueTracking: Start enhancing isKnownNeverNaN llvm-svn: 339399	2018-08-09 22:40:08 +00:00
Sanjay Patel	c6944f795d	[InstSimplify] move minnum/maxnum with Inf folds from instcombine llvm-svn: 339396	2018-08-09 22:20:44 +00:00
Sanjay Patel	9b07347033	[InstSimplify] fold fsub+fadd with common operand llvm-svn: 339176	2018-08-07 20:32:55 +00:00
Sanjay Patel	4364d604c2	[InstSimplify] fold fadd+fsub with common operand llvm-svn: 339174	2018-08-07 20:23:49 +00:00

... 9 10 11 12 13 ...

9116 Commits