llvm-project

Commit Graph

Author	SHA1	Message	Date
Zhi Zhuang	37fb860301	Add support of __builtin_expect_with_probability Add a new builtin-function __builtin_expect_with_probability and intrinsic llvm.expect.with.probability. The interface is __builtin_expect_with_probability(long expr, long expected, double probability). It is mainly the same as __builtin_expect besides one more argument indicating the probability of expression equal to expected value. The probability should be a constant floating-point expression and be in range [0.0, 1.0] inclusive. It is similar to builtin-expect-with-probability function in GCC built-in functions. Differential Revision: https://reviews.llvm.org/D79830	2020-06-22 10:21:28 -07:00
Hiroshi Yamauchi	9e1decf743	[PGO][PGSO] Enable non-cold size opts under partial profile sample PGO. Summary: Similar to D81020. Follow up D78949. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82053	2020-06-22 10:12:48 -07:00
Sanjay Patel	9934cc544c	[VectorCombine] make helper function for shift-shuffle; NFC This will probably be useful for other extract patterns.	2020-06-22 12:23:52 -04:00
Florian Hahn	328c8642e2	[DSE,MSSA] Reorder DSE blocking checks. Currently we stop exploring candidates too early in some cases. In particular, we can continue checking the defining accesses of non-removable MemoryDefs and defs without analyzable write location (read clobbers are already ruled out using MemorySSA at this point).	2020-06-22 17:16:34 +01:00
Sanjay Patel	98c2f4eea5	[VectorCombine] add helper to replace uses and rename The tests are regenerated to show a path that missed renaming, but there should be no functional difference from this patch.	2020-06-22 09:58:49 -04:00
Sanjay Patel	de65b356dc	[VectorCombine] add/use pass-level IRBuilder This saves creating/destroying a builder every time we perform some transform. The tests show instruction ordering diffs resulting from always inserting at the root instruction now, but those should be benign.	2020-06-22 09:01:29 -04:00
Sanjay Patel	cce625f73d	[VectorCombine] improve IR debugging by providing/salvaging value names The tests are regenerated to show the diffs, but there should be no functional change from this patch.	2020-06-22 08:35:47 -04:00
Serguei Katkov	eae0d2e9b2	Revert "[Peeling] Extend the scope of peeling a bit" This reverts commit `29b2c1ca72`. The patch causes a DT verifier failure like: DominatorTree is different than a freshly computed one! Not sure the patch itself is wrong, but reverting to investigate the failure.	2020-06-22 17:48:29 +07:00
Florian Hahn	0e19ff02d8	[DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC).	2020-06-22 10:58:53 +01:00
Serguei Katkov	29b2c1ca72	[Peeling] Extend the scope of peeling a bit Currently we allow peeling of the loops if there is an exiting latch block and all other exits are blocks ending with deopt. What we actually want is that each such exit ends up in a deopt unconditionally; it is not required that the exit itself ends with deopt. Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev Reviewed By: apilipenko Subscribers: hiraditya, zzheng, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D81140	2020-06-22 12:17:44 +07:00
Sanjay Patel	6bdd531af5	[VectorCombine] create class for pass to hold analyses, etc; NFC This doesn't change anything currently, but it would make sense to create a class-level IRBuilder instead of recreating that everywhere. As we expand to more optimizations, we will probably also want to hold things like the DataLayout or other constant refs in here too.	2020-06-21 16:07:33 -04:00
Florian Hahn	40569db7b3	[DSE,MSSA] Move reachability check to main loop. As we traverse the CFG backwards, we could end up reaching unreachable blocks. For unreachable blocks, we won't have computed post order numbers and because DomAccess is reachable, unreachable blocks cannot be on any path from it. This fixes a crash with unreachable blocks.	2020-06-21 16:38:10 +01:00
clfbbn	10b0539772	[Attributor][NFC] Fix indentation Summary: The patch D81022 seems to break the indentation of the `cleanupIR()` function. This patch fixes this problem Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, kuter, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82260	2020-06-21 15:43:32 +08:00
Wenlei He	7c8a6936bf	[Remarks] Add callsite locations to inline remarks Summary: Add call site location info into inline remarks so we can differentiate inline sites. This can be useful for inliner tuning. We can also reconstruct full hierarchical inline tree from parsing such remarks. The message of inline remark is also tweaked so we can differentiate SampleProfileLoader inline from CGSCC inline. Reviewers: wmi, davidxl, hoy Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82213	2020-06-20 23:32:10 -07:00
Eric Christopher	dc20419351	Rename function to more accurately reflect what it does.	2020-06-20 14:37:29 -07:00
Sanjay Patel	741e20f3d6	[VectorCombine] fix assert for type of compare operand As shown in the post-commit comment for D81661 - we need to loosen the type assertion to allow scalarization of a compare for vectors of pointers.	2020-06-20 15:20:17 -04:00
Sanjay Patel	7b201bfcac	[InstCombine] remove unused parameter and add assert; NFC	2020-06-20 11:47:00 -04:00
Sanjay Patel	d84cdb81ed	[InstCombine] fabs(X) / fabs(X) -> X / X Also, consolidate related folds so we don't miss/repeat these.	2020-06-20 10:20:21 -04:00
Eric Christopher	10563e16aa	[Analysis/Transforms/Sanitizers] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:42:26 -07:00
Eric Christopher	858d385578	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:24:57 -07:00
Fangrui Song	2a4317bfb3	[SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D82244	2020-06-19 22:22:47 -07:00
Yevgeny Rouban	6429471e8b	[IR] Convert profile metadata in createCallMatchingInvoke() When an invoke instruction is converted to a call its profile metadata is dropped because it has incompatible format (see commit `16ad6eeb94`). This patch adds an attempt to convert profile data to format of the call instruction. This used to work well before the commit `dcfa78a4cc`. Reviewers: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D82071	2020-06-20 12:10:31 +07:00
Eric Christopher	b6536e549d	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-19 15:12:18 -07:00
Sanjay Patel	216a37bb46	[VectorCombine] refactor extract-extract logic; NFCI	2020-06-19 14:52:27 -04:00
Sanjay Patel	6d864097a2	[VectorCombine] fix crash while transforming constants This is a variation of the proposal in D82049 with an extra test.	2020-06-19 12:30:32 -04:00
Florian Hahn	f9d8e33c32	[SCCP] Turn sext into zext for non-negative ranges. This patch updates SCCP/IPSCCP to use the computed range info to turn sexts into zexts, if the value is known to be non-negative. We already do a similar transform in CorrelatedValuePropagation, but it seems like we can catch a lot of additional cases by doing it in SCCP/IPSCCP as well. The transform is limited to ranges that are known to not include undef. Currently constant ranges from conditions are treated as potentially containing undef, due to PR46144. Once we flip this, the transform will be more effective in practice. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81756	2020-06-19 10:17:55 +01:00
Tyker	b7338fb1a6	[AssumeBundles] add cannonicalisation to the assume builder Summary: this reduces significantly the number of assumes generated without affecting too much the information that is preserved. this improves the compile-time cost of enable-knowledge-retention significantly. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-19 10:32:26 +02:00
Matt Arsenault	b13f6b0fe0	BypassSlowDivision: Fix dropping debug info I don't know anything about debug info, but this seems like more work should be necessary. This constructs a new IRBuilder and reconstructs the original divides rather than moving the original. One problem this has is if a div/rem pair are handled, both end up with the same debugloc. I'm not sure how to fix this, since this uses a cache when it sees the same input operands again, which will have the first instance's location attached.	2020-06-18 17:27:19 -04:00
Christopher Tetreault	8d11ec66b6	[SVE] Remove calls to VectorType::getNumElements from Transforms/Utils Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82057	2020-06-18 13:39:14 -07:00
Sanjay Patel	46a285ad9e	[IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC The predicate can always be used to distinguish between icmp and fcmp, so we don't need to keep repeating this check in the callers.	2020-06-18 15:47:06 -04:00
Davide Italiano	8cdd2a158c	[SimplifyCFG] Update debug location when folding branch to common destination Sometimes a dead block gets folded and the debug information is still retained. This manifests as jumpy stepping in lldb, see the bugzilla PR for an end-to-end C testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=46008 Differential Revision: https://reviews.llvm.org/D82062	2020-06-18 12:33:32 -07:00
serge-sans-paille	4dd332723d	Fix return status of LoopDistribute Move code that may update the IR after precondition, so that if precondition fail, the IR isn't modified. Differential Revision: https://reviews.llvm.org/D81225	2020-06-18 20:13:18 +02:00
Arthur Eubanks	91ef930526	[GlobalOpt] Remove preallocated calls when possible When possible (e.g. internal linkage), strip preallocated attribute off parameters/arguments. This requires removing the "preallocated" operand bundle from the call site, replacing @llvm.call.preallocated.arg() with an alloca and a bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since @llvm.call.preallocated.arg() can be called multiple times with the same arg index, we create an alloca per arg index. We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was and a @llvm.stackrestore() after the preallocated call to prevent the stack from blowing up. This is valid because the argument would normally not exist on the stack after the call before the transformation. This does not currently handle all possible preallocated calls. We will need to figure out where to put @llvm.stackrestore() in the cases where there is no obvious place to put it, for example conditional preallocated calls, invokes. This sort of transformation may need to be moved to somewhere more accessible to accommodate similar transformations (like inlining) in the future. Reviewers: efriedma, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80951	2020-06-18 09:56:13 -07:00
Florian Hahn	1669fddc9f	[Matrix] Use alignment info when lowering loads/stores. This patch updates LowerMatrixIntrinsics to preserve the alignment specified at the original load/stores and the align attribute for the pointer argument of the column.major.load/store intrinsics. We can always use the specified alignment for the load of the first column. For subsequent columns, the alignment may need to be reduced. For ConstantInt strides, compute the offset for the start of the column in bytes and use commonAlignment to get the largest valid alignment. For non-ConstantInt strides, we need to take the common alignment of the initial alignment and the element size in bytes. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D81960	2020-06-18 13:19:31 +01:00
Florian Hahn	d88acd8f7d	[Matrix] Preserve volatile when loading loads/stores. Currently the matrix lowering turns volatile loads/stores into non-volatile ones. This patch updates the lowering to preserve the volatile bit. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D81498	2020-06-18 12:14:19 +01:00
Florian Hahn	6d18c2067e	[Matrix] Update load/store intrinsics. This patch adjusts the load/store matrix intrinsics, formerly known as llvm.matrix.columnwise.load/store, to improve the naming and allow passing of extra information (volatile). The patch performs the following changes: * Rename columnwise.load/store to column.major.load/store. This is more expressive and also more in line with the naming in Clang. * Changes the stride arguments from i32 to i64. The stride can be larger than i32 and this makes things more uniform with the way things are handled in Clang. * A new boolean argument is added to indicate whether the load/store is volatile. The lowering respects that when emitting vector load/store instructions. * MatrixBuilder is updated to require both Alignment and IsVolatile arguments, which are passed through to the generated intrinsic. The alignment is set using the `align` attribute. The changes are grouped together in a single patch, to have a single commit that breaks the compatibility. We probably should be fine with updating the intrinsics, as we did not yet officially support them in the last stable release. If there are any concerns, we can add auto-upgrade rules for the columnwise intrinsics though. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse Reviewed By: anemet, nicolasvasilache Differential Revision: https://reviews.llvm.org/D81472	2020-06-18 09:44:52 +01:00
serge-sans-paille	f9c7e3136e	Correctly report modified status for HWAddressSanitizer Differential Revision: https://reviews.llvm.org/D81238	2020-06-18 10:27:44 +02:00
Mehdi Amini	77b79d79c0	Remove "unused" member ModuleSlice from `struct OpenMPOpt` This is fixing warning from clang: warning: private field 'ModuleSlice' is not used [-Wunused-private-field] SmallPtrSetImpl<Function *> &ModuleSlice; ^ Differential Revision: https://reviews.llvm.org/D82027	2020-06-18 03:02:26 +00:00
Eric Christopher	a8dad30388	Revert "Remove unused class variable ModuleSlice." as it was used in debug only code. This reverts commit `07a1749081`.	2020-06-17 14:45:17 -07:00
Eric Christopher	07a1749081	Remove unused class variable ModuleSlice.	2020-06-17 14:33:29 -07:00
Roman Lebedev	84b4f5a6a6	[InstCombine] Negator: while there, add detection for cycles during negation I don't have any testcases showing it happening, and i haven't succeeded in creating one, but i'm also not positive it can't ever happen, and i recall having something that looked like that in the very beginning of Negator creation. But since we now already have a negation cache, we can now detect such cases practically for free. Let's do so instead of "relying" on stack overflow :D	2020-06-17 22:47:20 +03:00
Roman Lebedev	e3d8cb1e1d	[InstCombine] Negator: cache negation results (PR46362) It is possible that we can try to negate the same value multiple times. For example, PHI nodes may happen to have multiple incoming values (all of which must be the same value) for the same incoming basic block. It may happen that we try to negate such a PHI node, and succeed, and that might result in having now-different incoming values. To avoid that, and in general to reduce the amount of duplicated work we might be doing, let's introduce a cache where we'll track results of negating each value. The added test was previously failing -verify after -instcombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=46362	2020-06-17 22:47:20 +03:00
Roman Lebedev	c4166f3d84	[NFC][InstCombine] Negator: add thin negate() wrapped before visit()	2020-06-17 22:47:20 +03:00
Roman Lebedev	2b85147337	[NFC][InstCombine] Negator: do not include unneeded "llvm/IR/DerivedTypes.h" header	2020-06-17 22:47:19 +03:00
Nick Desaulniers	88c965ba14	BreakCriticalEdges for callbr indirect dests Summary: llvm::SplitEdge was failing an assertion that the BasicBlock only had one successor (for BasicBlocks terminated by CallBrInst, we typically have multiple successors). It was surprising that the earlier call to SplitCriticalEdge did not handle the critical edge (there was an early return). Removing that triggered another assertion relating to creating a BlockAddress for a BasicBlock that did not (yet) have a parent, which is a simple order of operations issue in llvm::SplitCriticalEdge (a freshly constructed BasicBlock must be inserted into a Function's basic block list to have a parent). Thanks to @nathanchance for the report. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018 Reviewers: craig.topper, jyknight, void, fhahn, efriedma Reviewed By: efriedma Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D81607	2020-06-17 11:45:06 -07:00
sstefan1	7cfd267c51	[OpenMPOPT][NFC] Introducing OMPInformationCache. Summary: Introduction of OpenMP-specific information cache based on Attributor's `InformationCache`. This should make it easier to share information between them. Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku Subscribers: yaxunl, hiraditya, guansong, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81798	2020-06-17 16:56:45 +02:00
Simon Pilgrim	a5f1f9c9b8	ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC. Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency. Add implicit header dependency to source files where necessary.	2020-06-17 15:48:23 +01:00
Sjoerd Meijer	c1034d044a	Follow up of rGe345d547a0d5, and attempt to pacify buildbot: "error: 'get' is deprecated: The base class version of get with the scalable argument defaulted to false is deprecated." Changed VectorType::get() -> FixedVectorType::get().	2020-06-17 13:24:09 +01:00
Sjoerd Meijer	e345d547a0	Recommit "[LV] Emit @llvm.get.active.lane.mask for tail-folded loops" Fixed ARM regression test. Please see the original commit message rG47650451738c for details.	2020-06-17 13:12:15 +01:00
David Green	076e08aa45	[LSR] Filter for postinc formulae In more complicated loops we can easily hit the complexity limits of loop strength reduction. If we do and filtering occurs, it's all too easy to remove the wrong formulae for post-inc preferring accesses due to it attempting to maximise register re-use. The patch adds an alternative filtering step when the target is preferring postinc to pick postinc formulae instead, hopefully lowering the complexity to below the limit so that aggressive filtering is not needed. There is also a change in here to stop considering existing addrecs as free under postinc. We should already be modelling them as a reg so don't want it to cause us to get the cost wrong. (I'm not sure that code makes sense in general, but there are X86 tests specifically for it where it seems to be helping so have left it around for the standard non-post-inc case). Differential Revision: https://reviews.llvm.org/D80273	2020-06-17 12:32:04 +01:00
Sam Parker	5bf0858c0b	Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant" I originally reverted the patch because it was causing performance issues, but now I think it's just enabling simplify-cfg to do something that I don't want instead :) Sorry for the noise. This reverts commit `3e39760f8e`.	2020-06-17 11:38:59 +01:00
Hans Wennborg	16ad6eeb94	[IR] Don't copy profile metadata in createCallMatchingInvoke() The invoke instruction can have profile metadata with branch_weights, which does not make sense for a call instruction and will be rejected by the verifier. Differential revision: https://reviews.llvm.org/D81996	2020-06-17 11:18:23 +02:00
serge-sans-paille	1cafd8a5d1	Fix LoopIdiomRecognize pass return status Introduce a helper class to aggregate the cleanup in case of rollback. Differential Revision: https://reviews.llvm.org/D81230	2020-06-17 11:12:03 +02:00
Sjoerd Meijer	d4e183f686	Revert "[LV] Emit @llvm.get.active.mask for tail-folded loops" This reverts commit `4765045173` while I investigate the build bot failures.	2020-06-17 10:09:54 +01:00
Florian Hahn	773353be4e	[SCCP] Move common code to simplify basic block to helper (NFC). Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81755	2020-06-17 10:03:43 +01:00
Sjoerd Meijer	4765045173	[LV] Emit @llvm.get.active.mask for tail-folded loops This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised loops if the intrinsic is supported by the backend, which is checked by querying TargetTransform hook emitGetActiveLaneMask. This intrinsic creates a mask representing active and inactive vector lanes, which is used by the masked load/store instructions that are created for tail-folded loops. The semantics of @llvm.get.active.mask are described here in LangRef: https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics This intrinsic is also used to provide a hint to the backend. That is, the second argument of the intrinsic represents the back-edge taken count of the loop. For MVE, for example, we use that to set up tail-predication, which is a new form of predication in MVE for vector loops that implicitly predicates the last vector loop iteration by implicitly setting active/inactive lanes, i.e. the tail loop is predicated. In order to set up a tail-predicated vector loop, we need to know the number of data elements processed by the vector loop, which corresponds to the tripcount of the scalar loop, which we can now reconstruct using @llvm.get.active.mask. Differential Revision: https://reviews.llvm.org/D79100	2020-06-17 09:53:58 +01:00
Christopher Tetreault	ff628f5f5e	[SVE] Eliminate calls to default-false VectorType::get() from Vectorize Reviewers: efriedma, fhahn, spatel, sdesmalen, kmclaughlin Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81521	2020-06-16 12:50:13 -07:00
Sanjay Patel	ed67f5e7ab	[VectorCombine] scalarize compares with insertelement operand(s) Generalize scalarization (recently enhanced with D80885) to allow compares as well as binops. Similar to binops, we are avoiding scalarization of a loaded value because that could avoid a register transfer in codegen. This requires 1 extra predicate that I am aware of: we do not want to scalarize the condition value of a vector select. That might also invert a transform that we do in instcombine that prefers a vector condition operand for a vector select. I think this is the final step in solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 Differential Revision: https://reviews.llvm.org/D81661	2020-06-16 13:48:10 -04:00
Tyker	d7deef1206	Revert "[AssumeBundles] add cannonicalisation to the assume builder" This reverts commit `90c50cad19`.	2020-06-16 14:34:55 +02:00
Tyker	90c50cad19	[AssumeBundles] add cannonicalisation to the assume builder Summary: this reduces significantly the number of assumes generated without affecting too much the information that is preserved. this improves the compile-time cost of enable-knowledge-retention significantly. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-16 13:12:35 +02:00
sstefan1	e099c7b64a	[NFC][OpenMPOpt] Provide function-specific foreachUse.	2020-06-16 12:33:15 +02:00
Jay Foad	6fdd5a28b7	Revert "[IR] Clean up dead instructions after simplifying a conditional branch" This reverts commit `69bdfb075b`. Reverting to investigate https://bugs.llvm.org/show_bug.cgi?id=46343	2020-06-16 10:32:15 +01:00
Gui Andrade	b0ffa8befe	[MSAN] Pass Origin by parameter to __msan_warning functions Summary: Normally, the Origin is passed over TLS, which seems like it introduces unnecessary overhead. It's in the (extremely) cold path though, so the only overhead is in code size. But with eager-checks, calls to __msan_warning functions are extremely common, so this becomes a useful optimization. This can save ~5% code size. Reviewers: eugenis, vitalybuka Reviewed By: eugenis, vitalybuka Subscribers: hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D81700	2020-06-15 17:49:18 -07:00
Florian Hahn	120c059292	[DSE,MSSA] Port partial store merging. Port partial constant store merging logic to MemorySSA backed DSE. The heavy lifting is done by the existing helper function. It is used in context where we already ensured that the later instruction can eliminate the earlier one, if it is a complete overwrite.	2020-06-15 18:41:46 +01:00
Florian Hahn	71a91b9837	[DSE] Hoist partial store merging code into function (NFC). Hoist the general logic into a new function, because it can be re-used by the MemorySSA backed DSE as well.	2020-06-15 17:44:24 +01:00
Florian Hahn	8c61f13a0f	[DSE,MSSA] Delete instructions after printing it. Also enables a now-passing test case, that exposed a crash caused by the wrong order.	2020-06-15 16:01:36 +01:00
Sam Parker	2596da3174	[CostModel] getCFInstrCost in getUserCost. Have BasicTTI call the base implementation so that both agree on the default behaviour, with the default being a cost of '1'. This has required an X86 specific implementation as it seems to be very reliant on those instructions being free. Changes are also made to AMDGPU so that their implementations distinguish between cost kinds, so that the unrolling isn't affected. PowerPC also has its own implementation to prevent changes to the reg-usage vectorizer test. The cost model test changes now reflect that ret instructions are not generally free. Differential Revision: https://reviews.llvm.org/D79164	2020-06-15 09:28:46 +01:00
Max Kazantsev	60da4369a1	[NFC] Bail early simplifying unconditional branches	2020-06-15 13:59:53 +07:00
Sam Parker	3e39760f8e	Revert "Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant"" This reverts commit `23291b9863`. This caused performance regressions.	2020-06-15 07:46:28 +01:00
Whitney Tsang	5225cd43e8	[LoopUnroll] Allow loops with multiple exiting blocks where loop latch is not necessarily one of them. Summary: Currently LoopUnrollPass already allows loops with multiple exiting blocks, but only when the loop latch is one of the exiting blocks. When the loop latch is not an exiting block, only a single exiting block is supported. When possible, the single loop latch or the single exiting block terminator is optimized to an unconditional branch in the unrolled loop. This patch allows loops with multiple exiting blocks even if the loop latch is not one of them. However, the optimization of exiting block terminator to unconditional branch is not done when there exists more than one exiting block. Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour Reviewed By: efriedma Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D81053	2020-06-14 18:44:18 +00:00
Sanjay Patel	098e48a6a1	[PassManager] restore early-cse to vector cleanup As noted in D80236 - the early-cse pass was included here before: D75145 / rG71a316883d50 But it got moved outside of the "extra" option there, then it got dropped while adjusting -vector-combine: rG6438ea45e053 rG57bb4787d72f So this is restoring the behavior and adding a test to prevent accidental changes again. I don't see an equivalent option for the new pass manager.	2020-06-14 10:04:53 -04:00
Sanjay Patel	b5fb26951a	[InstCombine] reassociate FP diff of sums into sum of diffs (a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] +b[3]) --> (a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3]) This should be the last step in solving PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953 We started emitting reduction intrinsics with: D80867/ rGe50059f6b6b3 So it's a relatively easy pattern match now to re-order those ops. Also, I have not seen any complaints for the switch to intrinsics yet, so I'll propose to remove the "experimental" tag from the intrinsics soon. Differential Revision: https://reviews.llvm.org/D81491	2020-06-14 09:09:03 -04:00
Sanjay Patel	aeb5044801	[InstCombine] allow undef elements when comparing vector constants for min/max bailout This is a hacky, but low-risk fix to avoid the infinite loop in PR46271: https://bugs.llvm.org/show_bug.cgi?id=46271 As discussed there, the problem is that FoldOpIntoSelect() can get into a conflict with a transform that wants to pull a 'not' op through min/max via SimplifyDemandedVectorElts(). We need to relax our matching of min/max to include undefined elements in vector constants to avoid that. Alternatively, we could improve or cripple the demanded elements analysis, but that could create even more problems. The likely better, safer alternative will be to create min/max intrinsics, so we can remove all of the hacks related to min/max matching in instcombine. Differential Revision: https://reviews.llvm.org/D81698	2020-06-14 09:02:47 -04:00
Roman Lebedev	e987ee6318	[NFCI][AggressiveInstCombiner] Add `STATISTIC()`s for transforms	2020-06-13 23:53:16 +03:00
Florian Hahn	97e7147e34	[DSE,MSSA] Fix location order in isOverwrite call. isOverwrite expects the later location as the first argument and the earlier location after it. The adjusted call is intended to check whether CC overwrites DefLoc.	2020-06-13 20:39:00 +01:00
Eric Christopher	b422fe7d62	Temporarily revert "[MemCpyOptimizer] Simplify API of processStore and processMem* functions" as it seems to be causing some internal crashes in AA after email with the author. This reverts commit `f79e6a8847`.	2020-06-12 14:01:27 -07:00
Roman Lebedev	7aeb41b3c8	[NFCI] VectorCombine: add statistic for bitcast(shuf()) -> shuf(bitcast()) xform	2020-06-12 23:10:53 +03:00
Roman Lebedev	55eb714a0e	[NFC] OpenMPOpt: add a statistic for num of parallel regions deleted	2020-06-12 23:10:53 +03:00
Marco Elver	8af7fa07aa	[ASan][NFC] Refactor redzone size calculation Refactor redzone size calculation. This will simplify changing the redzone size calculation in future. Note that AddressSanitizer.cpp violates the latest LLVM style guide in various ways due to capitalized function names. Only code related to the change here was changed to adhere to the style guide. No functional change intended. Reviewed By: andreyknvl Tags: #llvm Differential Revision: https://reviews.llvm.org/D81367	2020-06-12 15:33:00 +02:00
Florian Hahn	4495a6b141	[BreakCritEdges] Add option to opt-out of preserving loop-simplify. This patch adds a new option to CriticalEdgeSplittingOptions to control whether loop-simplify form must be preserved. It is then used by GVN to indicate that loop-simplify form does not have to be preserved. This fixes a crash exposed by `189efe295b`. If the critical edge we are splitting goes from a block inside a loop to a block outside the loop, splitting the edge will create a new exit block. As a result, the new block will branch to the original exit block, which will add a non-loop predecessor, breaking loop-simplify form. To preserve loop-simplify form, the predecessor blocks of the original exit are split, but that does not work for blocks with indirectbr terminators. If preserving loop-simplify form is requested, bail out before making any changes. Reviewers: reames, hfinkel, davide, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81582	2020-06-12 11:47:13 +01:00
Florian Hahn	3a846d4d92	[VPlan] Reject loops without computable backedge taken counts getOrCreateTripCount is used to generate code for the outer loop, but it requires a computable backedge taken counts. Check that in the VPlan native path. Reviewers: Ayal, gilr, rengolin, sguggill Reviewed By: sguggill Differential Revision: https://reviews.llvm.org/D81088	2020-06-12 10:31:18 +01:00
EgorBo	012909dcaf	[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0" Summary: "X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two) However, "X % Y" can also be represented as "X - (X / Y) * Y" so if I rewrite the initial expression: "X - (X / C) * C == 0" it's not currently optimized to "X & C-1 == 0", see godbolt: https://godbolt.org/z/KzuXUj This is my first contribution to LLVM so I hope I didn't mess things up Reviewers: lebedev.ri, spatel Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79369	2020-06-12 10:20:06 +03:00
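The equivalence behind the fold above can be checked with a small sketch. It assumes unsigned 32-bit values and a power-of-two `C`; the function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Remainder spelled out as X - (X / C) * C, compared against zero.
bool remIsZeroViaDivMul(uint32_t X, uint32_t C) {
  assert(C != 0 && "division by zero");
  return X - (X / C) * C == 0;
}

// Mask form; only equivalent when C is a power of two.
bool remIsZeroViaMask(uint32_t X, uint32_t C) {
  assert(C != 0 && (C & (C - 1)) == 0 && "C must be a power of two");
  return (X & (C - 1)) == 0;
}

// Compares both forms for every 16-bit X and every power-of-two C up to 2^15.
bool formsAgree() {
  for (uint32_t Shift = 0; Shift < 16; ++Shift) {
    uint32_t C = 1u << Shift;
    for (uint32_t X = 0; X <= 0xFFFFu; ++X)
      if (remIsZeroViaDivMul(X, C) != remIsZeroViaMask(X, C))
        return false;
  }
  return true;
}
```

Calling `formsAgree()` exercises the two forms over a bounded range only; it is an illustration, not a proof of the InstCombine transform.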
Yevgeny Rouban	707836ed4e	[JumpThreading] Handle zero !prof branch_weights Avoid division by zero in updatePredecessorProfileMetadata(). Reviewers: yamauchi Tags: #llvm Differential Revision: https://reviews.llvm.org/D81499	2020-06-12 11:55:15 +07:00
Alina Sbirlea	519b019a0a	Verify MemorySSA after all updates. Verify after completing all updates. Resolves PR46275.	2020-06-11 18:48:41 -07:00
Sanjay Patel	039ff29ef6	[VectorCombine] remove unused parameters; NFC	2020-06-11 19:15:03 -04:00
Stanislav Mekhanoshin	a98d618f6e	Fixed assertion in SROA if block has no successors BasicBlock::isLegalToHoistInto() asserts if block does not have successors. The case is degenerate but the assertion still needs to be avoided. https://bugs.llvm.org/show_bug.cgi?id=46280 Differential Revision: https://reviews.llvm.org/D81674	2020-06-11 15:15:19 -07:00
serge-sans-paille	bff09876d7	Fix return status of DataFlowSanitizer pass Take into account added functions, global values and attribute change. Differential Revision: https://reviews.llvm.org/D81239	2020-06-11 16:05:17 +02:00
Jay Foad	69bdfb075b	[IR] Clean up dead instructions after simplifying a conditional branch Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when removePredecessor calls PHINode::removeIncomingValue. Differential Revision: https://reviews.llvm.org/D80206	2020-06-11 14:53:01 +01:00
Jay Foad	f45c65aa41	Revert "[IR] Clean up dead instructions after simplifying a conditional branch" This reverts commit `4494e45316`. It caused problems for sanitizer buildbots.	2020-06-11 14:22:16 +01:00
Jay Foad	4494e45316	[IR] Clean up dead instructions after simplifying a conditional branch Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Differential Revision: https://reviews.llvm.org/D80206	2020-06-11 13:28:10 +01:00
Jay Foad	f79e6a8847	[MemCpyOptimizer] Simplify API of processStore and processMem* functions Previously these functions either returned a "changed" flag or a "repeat instruction" flag, and could also modify an iterator to control which instruction would be processed next. Simplify this by always returning a "changed" flag, and handling all of the "repeat instruction" functionality by modifying the iterator. No functional change intended except in this case: // If the source and destination of the memcpy are the same, then zap it. ... where the previous code failed to process the instruction after the zapped memcpy. Differential Revision: https://reviews.llvm.org/D81540	2020-06-11 12:48:09 +01:00
Chris Jackson	4707bc2177	[DebugInfo] Refactor SalvageDebugInfo and SalvageDebugInfoForDbgValues - Simplify the salvaging interface and the algorithm in InstCombine Reviewers: vsk, aprantl, Orlando, jmorse, TWeaver Reviewed by: Orlando Differential Revision: https://reviews.llvm.org/D79863	2020-06-11 11:13:46 +01:00
Craig Topper	94b1404587	[InstCombine] Remove some repeated calls to getOperand. NFCI We had already loaded operand 1 and 2 of the select as TV and FV using the more readable getTrueValue/getFalseValue.	2020-06-10 16:54:50 -07:00
serge-sans-paille	9daccb7a47	Correctly update Changed status for SimplifyCFG Interestingly, this leads to better output in one of the test cases. Differential Revision: https://reviews.llvm.org/D81237	2020-06-10 16:54:15 +02:00
Kuter Dinel	70330edc4d	Reland: [Attributor] Split the Attributor::run() into multiple functions. Summary: This patch splits the Attributor::run() function into multiple functions. Simple Logic changes to make this possible: # Moved iteration count verification earlier. # NumFinalAAs get set a little bit later. Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81022	2020-06-10 13:21:22 +00:00
Marco Elver	d3f89314ff	[KernelAddressSanitizer] Make globals constructors compatible with kernel [v2] [ v1 was reverted by `c6ec352a6b` due to modpost failing; v2 fixes this. More info: https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783 ] This makes -fsanitize=kernel-address emit the correct globals constructors for the kernel. We had to do the following: * Disable generation of constructors that rely on linker features such as dead-global elimination. * Only instrument globals not in explicit sections. The kernel uses sections for special globals, which we should not touch. * Do not instrument globals that are prefixed with "__" nor that are aliased by a symbol that is prefixed with "__". For example, modpost relies on specially named aliases to find globals and checks their contents. Unfortunately, modpost relies on the size stored as ELF debug info, and any padding of globals currently causes the reported size to include the redzone, which throws modpost off. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493 Tested: * With 'clang/test/CodeGen/asan-globals.cpp'. * With test_kasan.ko, we can see: BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan] * allyesconfig, allmodconfig (x86_64) Reviewed By: glider Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81390	2020-06-10 15:08:42 +02:00
sstefan1	3013f2d329	Revert "[Attributor] Split the Attributor::run() into multiple functions." This reverts commit `0ee47cc92f`.	2020-06-10 10:10:49 +00:00
stefan	0ee47cc92f	[Attributor] Split the Attributor::run() into multiple functions. Summary: This patch splits the Attributor::run() function into multiple functions. Simple Logic changes to make this possible: # Moved iteration count verification earlier. # NumFinalAAs get set a little bit later. Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81022	2020-06-10 09:48:58 +00:00
Florian Hahn	67671024c8	[DSE,MSSA] Relax post-dom restriction for objs visible after return. This patch relaxes the post-dominance requirement for accesses to objects visible after the function returns. Instead of requiring the killing def to post-dominate the access to eliminate, the set of 'killing blocks' (= blocks that completely overwrite the original access) is collected. If all paths from the access to eliminate and an exit block go through a killing block, the access can be removed. To check this property, we first get the common post-dominator block for the killing blocks. If this block does not post-dominate the access block, there may be a path from DomAccess to an exit block not involving any killing block. Otherwise we have to check if there is a path from the DomAccess to the common post-dominator, that does not contain a killing block. If there is no such path, we can remove DomAccess. For this check, we start at the common post-dominator and then traverse the CFG backwards. Paths are terminated when we hit a killing block or a block that is not executed between DomAccess and a killing block according to the post-order numbering (if the post order number of a block is greater than the one of DomAccess, the block cannot be in a path starting at DomAccess).
This gives the following improvements on the total number of stores after DSE for MultiSource, SPEC2K, SPEC2006:
Tests: 237
Same hash: 206 (filtered out)
Remaining: 31
Metric: dse.NumRemainingStores
Program                                           base     new100   diff
test-suite...CFP2000/188.ammp/188.ammp.test    3624.00    3544.00  -2.2%
test-suite...ch/g721/g721encode/encode.test     128.00     126.00  -1.6%
test-suite.../Benchmarks/Olden/mst/mst.test      73.00      72.00  -1.4%
test-suite...CFP2006/433.milc/433.milc.test    3202.00    3163.00  -1.2%
test-suite...000/186.crafty/186.crafty.test    5062.00    5010.00  -1.0%
test-suite...-typeset/consumer-typeset.test   40460.00   40248.00  -0.5%
test-suite...Source/Benchmarks/sim/sim.test     642.00     639.00  -0.5%
test-suite...nchmarks/McCat/09-vor/vor.test     642.00     644.00   0.3%
test-suite...lications/sqlite3/sqlite3.test   35664.00   35563.00  -0.3%
test-suite...T2000/300.twolf/300.twolf.test    7202.00    7184.00  -0.2%
test-suite...lications/ClamAV/clamscan.test   19475.00   19444.00  -0.2%
test-suite...INT2000/164.gzip/164.gzip.test    2199.00    2196.00  -0.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test    2380.00    2378.00  -0.1%
test-suite.../Benchmarks/Bullet/bullet.test   39335.00   39309.00  -0.1%
test-suite...:: External/Povray/povray.test   36951.00   36927.00  -0.1%
test-suite...marks/7zip/7zip-benchmark.test   67396.00   67356.00  -0.1%
test-suite...6/464.h264ref/464.h264ref.test   31497.00   31481.00  -0.1%
test-suite...006/453.povray/453.povray.test   51441.00   51416.00  -0.0%
test-suite...T2006/401.bzip2/401.bzip2.test    4450.00    4448.00  -0.0%
test-suite...Applications/kimwitu++/kc.test   23481.00   23471.00  -0.0%
test-suite...chmarks/MallocBench/gs/gs.test    6286.00    6284.00  -0.0%
test-suite.../CINT2000/254.gap/254.gap.test   13719.00   13715.00  -0.0%
test-suite.../Applications/SPASS/SPASS.test   30345.00   30338.00  -0.0%
test-suite...006/450.soplex/450.soplex.test   15018.00   15016.00  -0.0%
test-suite...ications/JM/lencod/lencod.test   27780.00   27777.00  -0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test  105285.00  105276.00  -0.0%
There might be potential to pre-compute some of the information of which blocks are on the path to an exit for each block, but the overall benefit might be comparatively small. On the set of benchmarks, in 15738 of the 20322 times we reach the CFG check, the CFG check is successful. The total number of iterations in the CFG check is 187810, so on average we need less than 10 steps in the check loop. Bumping the threshold in the loop from 50 to 150 gives a few small improvements, but I don't think they warrant such a big bump at the moment. This is all pending further tuning in the future. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: george.burgess.iv Differential Revision: https://reviews.llvm.org/D78932	2020-06-10 10:39:25 +01:00
Vitaly Buka	5a3b380f49	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit `69c5ff4668`. This reverts commit `603d58b5e4`. This reverts commit `ba10bedf56`. This reverts commit `39b3c41b65`.	2020-06-10 02:32:50 -07:00
Whitney Tsang	01e64c9712	[LoopFusion] Update second loop guard non loop successor phis incoming blocks. Summary: The current LoopFusion forgets to update the incoming block of the phis in second loop guard non loop successor from second loop guard block to first loop guard block. A test case is provided to better understand the problem. Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D81421	2020-06-09 21:14:51 +00:00
Christopher Tetreault	765ac39db2	[SVE] Eliminate calls to default-false VectorType::get() from Scalar Reviewers: efriedma, kmclaughlin, sdesmalen, fhahn, bkramer, anna, gchatelet, c-rhodes, david-arm, fpetrogalli Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80336	2020-06-09 14:09:02 -07:00
Simon Pilgrim	5dc4e7c2b9	[VectorCombine] scalarizeBinop - support an all-constant src vector operand scalarizeBinop currently folds vec_bo((inselt VecC0, V0, Index), (inselt VecC1, V1, Index)) -> inselt(vec_bo(VecC0, VecC1), scl_bo(V0,V1), Index) This patch extends this to account for cases where one of the vec_bo operands is already all-constant and performs similar cost checks to determine if the scalar binop with a constant still makes sense: vec_bo((inselt VecC0, V0, Index), VecC1) -> inselt(vec_bo(VecC0, VecC1), scl_bo(V0,extractelt(V1,Index)), Index) Fixes PR42174 Differential Revision: https://reviews.llvm.org/D80885	2020-06-09 19:02:05 +01:00
Simon Pilgrim	8233439fdb	[InstCombine] Ensure allocation alignment mask is within range before applying as an attribute Fixes OSS-Fuzz #23214 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23214	2020-06-09 17:31:55 +01:00
serge-sans-paille	5b08bd0eb4	Fix MemCpyOptimizer return status Differential Revision: https://reviews.llvm.org/D81229	2020-06-09 14:24:33 +02:00
serge-sans-paille	ef1a7f2f01	Update pass status for GCOVProfiling Take fork/exec instrumentation into account. Differential Revision: https://reviews.llvm.org/D81227	2020-06-09 14:23:30 +02:00
Petr Hosek	603d58b5e4	[InstrProfiling] Use !associated metadata for counters, data and values The !associated metadata may be attached to a global object declaration with a single argument that references another global object. This metadata prevents discarding of the global object in linker GC unless the referenced object is also discarded. Furthermore, when a function symbol is discarded by the linker, setting up !associated metadata allows linker to discard counters, data and values associated with that function symbol. This is not possible today because there's no metadata to guide the linker. This approach is also used by other instrumentations like sanitizers. Note that !associated metadata is only supported by ELF, it does not have any effect on non-ELF targets. Differential Revision: https://reviews.llvm.org/D76802	2020-06-08 15:07:43 -07:00
Petr Hosek	ba10bedf56	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit `39b3c41b65` due to a failing associated.ll test.	2020-06-08 14:38:15 -07:00
Stanislav Mekhanoshin	87ff3401eb	Stabilize alloca slices sort in SROA Slice::operator<() has a non-deterministic behavior. If we have identical slices, comparison will depend on the order of operands. Normally that does not result in unstable compilation results because the order in which slices are inserted into the vector is deterministic and llvm::sort() normally behaves as a stable sort, although that is not guaranteed. However, there is test option -sroa-random-shuffle-slices which is used to check exactly this aspect. The vector is first randomly shuffled and then sorted. The same shuffling happens without this option under expensive llvm checks. I have managed to write a test which has hit this problem. There are no fields in the Slice class to resolve the instability. We only have offsets, IsSplittable and Use, but neither Use nor User have anything suitable for predictable comparison. I have switched to stable_sort which has to be sufficient and removed that random shuffle option. Differential Revision: https://reviews.llvm.org/D81310	2020-06-08 14:25:27 -07:00
Petr Hosek	39b3c41b65	[InstrProfiling] Use !associated metadata for counters, data and values The !associated metadata may be attached to a global object declaration with a single argument that references another global object. This metadata prevents discarding of the global object in linker GC unless the referenced object is also discarded. Furthermore, when a function symbol is discarded by the linker, setting up !associated metadata allows linker to discard counters, data and values associated with that function symbol. This is not possible today because there's no metadata to guide the linker. This approach is also used by other instrumentations like sanitizers. Note that !associated metadata is only supported by ELF, it does not have any effect on non-ELF targets. Differential Revision: https://reviews.llvm.org/D76802	2020-06-08 13:35:56 -07:00
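The point about stable sorting in the SROA entry above can be illustrated with a toy example. `ToySlice` and its comparison are hypothetical stand-ins, not the real `Slice` class; the only assumption carried over from the message is that some distinct elements compare as equivalent.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a slice: only Begin and End take part in ordering.
struct ToySlice {
  unsigned Begin;
  unsigned End;
  int Id; // insertion order; deliberately ignored by operator<
};

bool operator<(const ToySlice &L, const ToySlice &R) {
  if (L.Begin != R.Begin)
    return L.Begin < R.Begin;
  return L.End < R.End;
}

// Sorts a copy with std::stable_sort and returns the resulting Id sequence.
// Equivalent slices keep their insertion order, so the result depends only on
// the (deterministic) order in which the slices were inserted.
std::vector<int> sortedIds(std::vector<ToySlice> Slices) {
  assert(!Slices.empty());
  std::stable_sort(Slices.begin(), Slices.end());
  std::vector<int> Ids;
  for (const ToySlice &S : Slices)
    Ids.push_back(S.Id);
  return Ids;
}
```

With plain `std::sort` the relative order of the equivalent elements would be unspecified, which is the instability the message describes.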
Hans Wennborg	fc202c5fec	[PGO] CallPromotion: Don't try to pass sret args to varargs functions It's not allowed by the verifier. Differential revision: https://reviews.llvm.org/D81409	2020-06-08 21:10:27 +02:00
Sanjay Patel	d50366d29f	[InstCombine] improve matching for sext-lshr-trunc patterns, part 2 Similar to rG42f488b63a04 This is intended to preserve the logic of the existing transform, but remove unnecessary restrictions on uses and types. https://rise4fun.com/Alive/oS0 Name: narrow input Pre: C1 <= width(C1) - 24 %B = sext i8 %A %C = lshr %B, C1 %r = trunc %C to i24 => %s = ashr i8 %A, trunc(umin(C1, 7)) %r = sext i8 %s to i24 Name: wide input Pre: C1 <= width(C1) - 24 %B = sext i24 %A %C = lshr %B, C1 %r = trunc %C to i8 => %s = ashr i24 %A, trunc(umin(C1, 23)) %r = trunc i24 %s to i8	2020-06-08 14:41:50 -04:00
Chris Jackson	c6c65164af	[DebugInfo] Reduce SalvageDebugInfo() functions - Now all SalvageDebugInfo() calls will mark undef if the salvage attempt fails. Reviewed by: vsk, Orlando Differential Revision: https://reviews.llvm.org/D78369	2020-06-08 19:28:18 +01:00
Hiroshi Yamauchi	b5632f4083	[PGO][PGSO] Enable non-cold code size opts under non-partial-profile sample PGO. Summary: Following up D78949. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81020	2020-06-08 10:02:00 -07:00
Sanjay Patel	42f488b63a	[InstCombine] improve matching for sext-lshr-trunc patterns This is intended to preserve the logic of the existing transform, but remove unnecessary restrictions on uses and types. https://rise4fun.com/Alive/pYfR Pre: C1 <= width(C1) - 8 %B = sext i8 %A %C = lshr %B, C1 %r = trunc %C to i8 => %r = ashr i8 %A, trunc(umin(C1, 7))	2020-06-08 11:55:30 -04:00
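The Alive precondition and rewrite in the entry above can be spot-checked exhaustively for one concrete width. This sketch assumes a 32-bit intermediate type, so the precondition becomes `C1 <= 24`; the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Original pattern: trunc(lshr(sext i8 A to i32, C1)) to i8.
uint8_t viaSextLshrTrunc(int8_t A, unsigned C1) {
  assert(C1 <= 24 && "precondition: C1 <= width(C1) - 8");
  uint32_t Wide = static_cast<uint32_t>(static_cast<int32_t>(A)); // sext
  return static_cast<uint8_t>(Wide >> C1);                        // lshr, trunc
}

// Replacement: ashr i8 A, umin(C1, 7), written without relying on signed >>.
uint8_t viaNarrowAshr(int8_t A, unsigned C1) {
  unsigned Sh = std::min(C1, 7u);
  uint8_t Bits = static_cast<uint8_t>(A);
  uint8_t Shifted = static_cast<uint8_t>(Bits >> Sh);
  if (Bits & 0x80) // replicate the sign bit into the vacated high bits
    Shifted |= static_cast<uint8_t>(0xFFu << (8 - Sh));
  return Shifted;
}

// Compares both forms for every i8 input and every shift amount 0..24.
bool transformsAgree() {
  for (int A = -128; A <= 127; ++A)
    for (unsigned C1 = 0; C1 <= 24; ++C1)
      if (viaSextLshrTrunc(static_cast<int8_t>(A), C1) !=
          viaNarrowAshr(static_cast<int8_t>(A), C1))
        return false;
  return true;
}
```

This only covers the i8/i32 instance; the linked Alive proof is what covers the general case.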
Sanjay Patel	af7587d755	[InstCombine] reduce code duplication in visitTrunc(); NFC	2020-06-08 11:15:44 -04:00
Marco Elver	c6ec352a6b	Revert "[KernelAddressSanitizer] Make globals constructors compatible with kernel" This reverts commit `866ee2353f`. Building the kernel results in modpost failures due to modpost relying on debug info and inspecting kernel modules' globals: https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783	2020-06-08 10:34:03 +02:00
Benjamin Kramer	3badd17b69	SmallPtrSet::find -> SmallPtrSet::count The latter is more readable and more efficient. While there clean up some double lookups. NFCI.	2020-06-07 22:38:08 +02:00
Fangrui Song	e3200dab60	[gcov] Support .gcno/.gcda in gcov 8, 9 or 10 compatible formats	2020-06-07 11:27:49 -07:00
AK	96458fc510	Add cl::ZeroOrMore to get around build system issues It is quite common to get multiple instances of optimization flags while building. The following optimizations do not have cl::ZeroOrMore which causes errors during the build. Reviewers: alexbdv,spop Differential Revision: https://reviews.llvm.org/D81187	2020-06-07 10:15:18 -07:00
Sanjay Patel	2552f65183	[InstCombine] fold mask op into casted shift (PR46013) https://rise4fun.com/Alive/Qply8 Pre: C2 == (-1 u>> zext(C1)) %a = ashr %x, C1 %s = sext %a to i16 %r = and i16 %s, C2 => %s2 = sext %x to i16 %r = lshr i16 %s2, zext(C1) https://bugs.llvm.org/show_bug.cgi?id=46013	2020-06-07 09:33:18 -04:00
Simon Pilgrim	1c2d2c88b4	AlignmentFromAssumptions.h - reduce includes to forward declarations. NFC.	2020-06-07 13:51:48 +01:00
Fangrui Song	693ff89f47	[gcov] Delete unneeded code	2020-06-06 20:36:46 -07:00
Fangrui Song	cdd683b516	[gcov] Support big-endian .gcno and simplify version handling in .gcda	2020-06-06 11:01:47 -07:00
Simon Pilgrim	f14d4c9c54	EHPersonalities.h - reduce Triple.h include to forward declaration. NFC. Move implicit include dependencies down to source files.	2020-06-06 15:48:31 +01:00
dfukalov	c94d32a6b3	[AMDGPU] Increase max iterations count to analyze complete unroll Summary: In some cases inner loops may not get boosts so try to analyze them deeper. Reviewers: rampitec, mzolotukhin Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81204	2020-06-06 16:32:45 +03:00
Simon Pilgrim	5006e551d3	LoopAnalysisManager.h - reduce includes to forward declarations. NFC. Move implicit include dependencies down to header/source files.	2020-06-06 14:06:46 +01:00
Nikita Popov	ff1210edb6	[NewGVN] Remove alignment from LoadExpression (NFC) The alignment is not actually used.	2020-06-06 11:49:20 +02:00
Nikita Popov	a4953db530	[InstCombine] Remove unnecessary MaybeAlign use (NFC) Alloca align is required now.	2020-06-06 11:44:01 +02:00
Richard Smith	f39e12a06b	PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz. Summary: This transformation is correct for a builtin call to 'free(p)', but not for 'operator delete(p)'. There is no guarantee that a user replacement 'operator delete' has no effect when called on a null pointer. However, the principle behind the transformation is correct, and can be applied more broadly: a 'delete p' expression is permitted to unconditionally call 'operator delete(p)'. So do that in Clang under -Oz where possible. We do this whether or not 'p' has trivial destruction, since the destruction might turn out to be trivial after inlining, and even for a class-specific (but non-virtual, non-destroying, non-array) 'operator delete'. Reviewers: davide, dnsampaio, rjmccall Reviewed By: dnsampaio Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79378	2020-06-05 17:13:43 -07:00
Nikita Popov	bff94a8e2b	[LoopIdiomRecognize] Remove unnecessary MaybeAlign use (NFC) Loads and stores always have an alignment now.	2020-06-05 23:11:57 +02:00
Stanislav Mekhanoshin	1e9a0a4e04	SROA: Remove pointer from visited along with instruction If an instruction is erased we also need to remove it from Visited set. There is a very small chance that another newly created instruction will be created with the same pointer value in place of an erased one. Differential Revision: https://reviews.llvm.org/D80958	2020-06-05 12:47:23 -07:00
Marco Elver	866ee2353f	[KernelAddressSanitizer] Make globals constructors compatible with kernel Summary: This makes -fsanitize=kernel-address emit the correct globals constructors for the kernel. We had to do the following: - Disable generation of constructors that rely on linker features such as dead-global elimination. - Only emit constructors for globals not in explicit sections. The kernel uses sections for special globals, which we should not touch. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493 Tested: 1. With 'clang/test/CodeGen/asan-globals.cpp'. 2. With test_kasan.ko, we can see: BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan] Reviewers: glider, andreyknvl Reviewed By: glider Subscribers: cfe-commits, nickdesaulniers, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D80805	2020-06-05 20:20:46 +02:00
Arthur Eubanks	8133e289b6	Add ASan metadata globals to @llvm.compiler.used under COFF Summary: This matches ELF. This makes the number of ASan failures under the new pass manager on Windows go from 18 to 1. Under the old pass manager, the ASan module pass was one of the very last things run, so these globals didn't get removed due to GlobalOpt. But with the NPM the ASan module pass that adds these globals is run much earlier in the pipeline and GlobalOpt ends up removing them. Reviewers: vitalybuka, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81175	2020-06-05 09:04:52 -07:00
serge-sans-paille	977d27d881	[SCCP] Report changes after removing stores to constant global Differential Revision: https://reviews.llvm.org/D81228	2020-06-05 16:09:07 +02:00
serge-sans-paille	8405f6bcd4	Correctly report modified status for DivRemPairs Differential Revision: https://reviews.llvm.org/D81231	2020-06-05 16:06:31 +02:00
serge-sans-paille	424510095d	Correctly report modified status for DSE Differential Revision: https://reviews.llvm.org/D81233	2020-06-05 15:59:42 +02:00
serge-sans-paille	f987cceb13	Correctly report modified status for TailRecursionElimination Differential Revision: https://reviews.llvm.org/D81232	2020-06-05 15:58:20 +02:00
serge-sans-paille	1086d777be	Correctly report modified status for ObjCARCContract Differential Revision: https://reviews.llvm.org/D81226	2020-06-05 15:56:57 +02:00
serge-sans-paille	80f1ec7008	Correctly report modified status for ObjCARCOpt Differential Revision: https://reviews.llvm.org/D81234	2020-06-05 15:56:57 +02:00
Max Kazantsev	23291b9863	Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant" This reverts commit `c4b5a66e44`. Returning along with Clang test fix	2020-06-05 20:48:29 +07:00
serge-sans-paille	2e5940cf29	Correctly report modified status for LoopSimplify Differential Revision: https://reviews.llvm.org/D81235	2020-06-05 15:46:28 +02:00
serge-sans-paille	2fc085e0e5	Fix return status of AddressSanitizer pass Differential Revision: https://reviews.llvm.org/D81240	2020-06-05 15:44:51 +02:00
Simon Pilgrim	06fd973c85	TargetLibraryInfo.h - reduce Triple.h include to forward declaration. NFC. Move implicit include dependencies down to source files.	2020-06-05 14:35:30 +01:00
Kadir Cetinkaya	c4b5a66e44	Revert "[InstCombine] Simplify compare of Phi with constant inputs against a constant" This reverts commit `16b7eb6dd1`. Breaks build bots, see http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/29888 for an example.	2020-06-05 13:02:35 +02:00
Max Kazantsev	16b7eb6dd1	[InstCombine] Simplify compare of Phi with constant inputs against a constant We can simplify ``` icmp <pred> phi(C1, C2, ...), C ``` with ``` phi(icmp(C1, C), icmp(C2, C), ...) ``` provided that all comparisons of constants are constants themselves. Differential Revision: https://reviews.llvm.org/D81151 Reviewed By: lebedev.ri	2020-06-05 17:02:47 +07:00
Simon Pilgrim	44d86982d2	MemorySSAUpdater.h - reduce unnecessary includes to forward declarations. NFC. Remove unnecessary MemoryAccess forward declaration as its already included from MemorySSA.h Move implicit include dependencies down to source files.	2020-06-05 10:45:59 +01:00
Max Kazantsev	80cb25cbd5	Revert "[InstCombine][NFC] Factor out constant check" This reverts commit `9bdb918890`. This refactoring proved to not be useful.	2020-06-05 12:00:44 +07:00
Petr Hosek	d76e62fdb7	[AddressSanitizer] Don't use weak linkage for __{start,stop}_asan_globals It should not be necessary to use weak linkage for these. Doing so implies interposability and thus PIC generates indirections and dynamic relocations, which are unnecessary and suboptimal. Aside from this, ASan instrumentation never introduces GOT indirection relocations where there were none before--only new absolute relocs in RELRO sections for metadata, which are less problematic for special linkage situations that take pains to avoid GOT generation. Patch By: mcgrathr Differential Revision: https://reviews.llvm.org/D80605	2020-06-04 20:18:35 -07:00
Philip Reames	3d40c75189	[Statepoint] Switch RS4GC to using gc-live bundle form Now that we have an operand based form for the GC arguments to a statepoint intrinsic, update RS4GC to use it and update tests to reflect. This is pretty straightforward. I nearly landed without review, but figured a second set of eyes didn't hurt. Differential Revision: https://reviews.llvm.org/D81121	2020-06-04 15:49:11 -07:00
Petr Hosek	b16ed493dd	[Fuchsia] Rely on linker switch rather than dead code ref for profile runtime Follow the model used on Linux, where the clang driver passes the linker a -u switch to force the profile runtime to be linked in, rather than having every TU emit a dead function with a reference. Differential Revision: https://reviews.llvm.org/D79835	2020-06-04 15:47:05 -07:00
Petr Hosek	e1ab90001a	Revert "[Fuchsia] Rely on linker switch rather than dead code ref for profile runtime" This reverts commit `d510542174` since it broke several bots.	2020-06-04 15:44:10 -07:00
Craig Topper	3ad8fbd205	[Reassociate] Teach ConvertShiftToMul to preserve nsw flag if the shift amount is not bitwidth - 1. Multiply and shl have different signed overflow behavior in some cases. But it looks like we should be ok as long as the shift amount is less than bitwidth - 1. Alive2: http://volta.cs.utah.edu:8080/z/MM4WZP Differential Revision: https://reviews.llvm.org/D81189	2020-06-04 14:51:34 -07:00
Sanjay Patel	192cb71836	[InstCombine] avoid crashing on select-shuffle detection As mentioned in the post-commit comments of D81013 - the mask check API has to assume the shuffle is not length-changing, but we have not ruled that out in this code. Use the ShuffleVectorInst call instead.	2020-06-04 17:27:14 -04:00
Petr Hosek	d510542174	[Fuchsia] Rely on linker switch rather than dead code ref for profile runtime Follow the model used on Linux, where the clang driver passes the linker a -u switch to force the profile runtime to be linked in, rather than having every TU emit a dead function with a reference. Patch By: mcgrathr Differential Revision: https://reviews.llvm.org/D79835	2020-06-04 14:25:19 -07:00
Sanjay Patel	8a96c1f627	[InstCombine] move vector select ahead of select-shuffle select Cond, (shuf_sel X, Y), X --> shuf_sel X, (select Cond, Y, X) A select of a select-shuffle ("blend" in x86 lingo) can be reversed so that the select is done first. This is a more limited version of what I was trying in D80658, but it enables existing demanded bits transforms to catch some of the motivating cases. The tricky bit in that seems to be that by moving the shuffle later, we can always guarantee that poison is correctly inhibited by the shuffle mask in the final value. Alive2 checks for the basic tests: http://volta.cs.utah.edu:8080/z/Qqd3RK http://volta.cs.utah.edu:8080/z/S4wchM http://volta.cs.utah.edu:8080/z/wf9zPL http://volta.cs.utah.edu:8080/z/wJeEGk Differential Revision: https://reviews.llvm.org/D81013	2020-06-04 14:29:13 -04:00
Layton Kifer	7381fcdf62	[TRE] Allow accumulator elimination when base case returns non-constant Remove the requirement, that when performing accumulator elimination, all other cases must return the same dynamic constant. We can do this by initializing the accumulator with the identity value of the accumulation operation, and inserting an additional operation before any return. Differential Revision: https://reviews.llvm.org/D80844	2020-06-04 10:34:42 -07:00
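The idea in the TRE entry above (start the accumulator at the identity of the accumulation operation and apply one more operation before the return) can be sketched on a hand-written example. Both functions are hypothetical illustrations of the before and after shapes, not output of the pass.

```cpp
#include <cassert>
#include <cstdint>

// Before: the base case returns a non-constant value (Base), and the
// recursive call is followed by a multiply, so it is not a plain tail call.
uint64_t scaledProductRecursive(uint64_t N, uint64_t Base) {
  if (N == 0)
    return Base;
  return N * scaledProductRecursive(N - 1, Base);
}

// After: the recursion becomes a loop. The accumulator starts at 1, the
// identity for multiplication, and one extra multiply is inserted before
// the return to fold in the base-case value.
uint64_t scaledProductLoop(uint64_t N, uint64_t Base) {
  uint64_t Acc = 1;
  while (N != 0) {
    Acc *= N;
    --N;
  }
  return Acc * Base;
}
```

Because unsigned multiplication is associative and commutative (modulo 2^64), both functions return the same value for any inputs.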
Huihui Zhang	bd43f78c76	[LSR][SCEVExpander] Avoid blind cast 'Factor' to SCEVConstant in FactorOutConstant. Summary: In SCEVExpander FactorOutConstant(), when GEP indexing into/over scalable vector, it is legal for the 'Factor' in a MulExpr to be the size of a scalable vector instead of a compile-time constant. Current upstream crashes with the attached test. Reviewers: efriedma, sdesmalen, sanjoy.google, mkazantsev Reviewed By: efriedma Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80973	2020-06-04 10:33:39 -07:00
Max Kazantsev	9bdb918890	[InstCombine][NFC] Factor out constant check We plan to add more transforms here. Besides, this check should be done in the beginning just from function's name.	2020-06-04 18:54:23 +07:00
Yevgeny Rouban	dcfa78a4cc	Extend InvokeInst !prof branch_weights metadata to unwind branches Allow InvokeInst to have the second optional prof branch weight for its unwind branch. InvokeInst is a terminator with two successors. It might have its unwind branch taken many times. If so the BranchProbabilityInfo unwind branch heuristic can be inaccurate. This patch allows a higher accuracy calculated with both branch weights set. Changes: - A new section about InvokeInst is added to the BranchWeightMetadata page. It states the old information that was missing from the doc and adds new information about the second branch weight. - Verifier is changed to allow either 1 or 2 branch weights for InvokeInst. - A new test is written for BranchProbabilityInfo to demonstrate the main improvement of the simple fix in calcMetadataWeights(). - Several new testcases are created for Inliner. Those check that both weights are accounted for invoke instruction weight calculation. - PGOUseFunc::setBranchWeights() is fixed to be applicable to InvokeInst. Reviewers: davidxl, reames, xur, yamauchi Tags: #llvm Differential Revision: https://reviews.llvm.org/D80618	2020-06-04 15:37:15 +07:00
Yevgeny Rouban	417bcb8827	[Instruction] Remove setProfWeight() Remove the function Instruction::setProfWeight() and make use of Instruction::copyMetadata(.., {LLVMContext::MD_prof}). This is correct for all use cases of setProfWeight() as it is applied to CallBase instructions only. This change results in prof metadata copied intact even if the source has "VP". The old pair of calls extractProfTotalWeight() + setProfWeight() resulted in setting branch_weights if the source had "VP" data. Reviewers: yamauchi, davidxl Tags: #llvm Differential Revision: https://reviews.llvm.org/D80987	2020-06-04 15:10:55 +07:00
Arnold Schwaighofer	2e4c5d1c48	CoroSplit: Fix coroutine splitting for retcon and retcon.once Summary: For retcon and retcon.once coroutines we assume that all uses of spills can be sunk past coro.begin. This simplifies handling of instructions that escape the address of an alloca. The current implementation would have issues if the address of the alloca is escaped before coro.begin. (It also has issues with casts before and uses of those casts after the coro.begin instruction) %alloca_addr = alloca ... %escape = ptrtoint %alloca_addr coro.begin store %escape to %alloca_addr rdar://60272809 Subscribers: hiraditya, modocache, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81023	2020-06-03 12:10:58 -07:00
Florian Hahn	211596c94e	[VPlan] Support extracting lanes for defs managed in VPTransformState. Currently extracting a lane for a VPValue def is not supported, if it is managed directly by VPTransformState (e.g. because it is created by a VPInstruction or an external VPValue def). For now, simply extract the requested lane. In the future, we should also cache the extracted scalar values, similar to LV. Reviewers: Ayal, rengolin, gilr, SjoerdMeijer Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D80787	2020-06-03 12:14:16 +01:00
Kazu Hirata	f355c7fc2f	[JumpThreading] Simplify FindMostPopularDest (NFC) Summary: This patch simplifies FindMostPopularDest without changing the functionality. Given a list of jump threading destinations, the function finds the most popular destination. To ensure determinism when there are multiple destinations with the highest popularity, the function picks the first one in the successor list with the highest popularity. Without this patch: - The function populates DestPopularity -- a histogram mapping destinations to their respective occurrence counts. - Then we iterate over DestPopularity, looking for the highest popularity while building a vector of destinations with the highest popularity. - Finally, we iterate the successor list, looking for the destination with the highest popularity. With this patch: - We implement DestPopularity with MapVector instead of DenseMap. We populate the map with popularity 0 for all successors in the order they appear in the successor list. - We build the histogram in the same way as before. - We simply use std::max_element on DestPopularity to find the most popular destination. The use of MapVector ensures determinism. Reviewers: wmi, efriedma Reviewed By: wmi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81030	2020-06-02 18:43:31 -07:00
Wei Mi	7a6c89427c	[SampleFDO] Add use-sample-profile function attribute. When sampleFDO is enabled, people may expect they can use -fno-profile-sample-use to opt out of using the sample profile for a certain file. That could be either for debugging or for performance tuning. However, when thinlto is enabled, if a function in file A compiled with -fno-profile-sample-use is imported to another file B compiled with -fprofile-sample-use, the inlined copy of the function in file B may still get its profile annotated. The inconsistency may even introduce a profile unused warning because if the target is not compiled with an explicit debug information flag, the function in file A won't have its debug information enabled (debug information will be enabled implicitly only when -fprofile-sample-use is used). After it is imported into file B which is compiled with -fprofile-sample-use, profile annotation for the outline copy of the function will fail because the function has no debug information, and that will trigger the profile unused warning. We add a new attribute use-sample-profile to control whether a function will use its sample profile, for both its outline and inline copies. That will make the behavior of -fno-profile-sample-use consistent. Differential Revision: https://reviews.llvm.org/D79959	2020-06-02 17:23:17 -07:00
Hiroshi Yamauchi	089759b96d	[PGO] Enable memcmp/bcmp size value profiling. Summary: Following up D79751. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80578	2020-06-02 10:27:11 -07:00
Florian Hahn	b446ec56a2	[LV] Make sure the MaxVF is a power-of-2 by rounding down. LV currently only supports power of 2 vectorization factors, which has been made explicit with the assertion added in `840450549c`. However, if the widest type is not a power-of-2 the computed MaxVF won't be a power-of-2 either. This patch updates computeFeasibleMaxVF to ensure the returned value is a power-of-2 by rounding down to the nearest power-of-2. Fixes PR46139. Reviewers: Ayal, gilr, rengolin Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D80870	2020-06-02 10:40:49 +01:00
Denis Antrushin	3c626c714c	[EarlyCSE] Common gc.relocate calls. gc.relocate intrinsic is special in that its second and third operands are not real values, but indices into relocate's parent statepoint list of GC pointers. To be CSE'd, they need special handling in `isEqual()` and `getHashCode()`. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D80445	2020-06-02 12:25:43 +03:00
Mircea Trofin	999ea25a9e	[llvm][NFC] Cache FAM in InlineAdvisor Summary: This simplifies the interface by storing the function analysis manager with the InlineAdvisor, and, thus, not requiring it be passed each time we inquire for an advice. Reviewers: davidxl, asbirlea Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80405	2020-06-01 13:02:34 -07:00
Sanjay Patel	26ebe936f3	[InstCombine] fix use of base VectorType; NFC SimplifyDemandedVectorElts() bails out on ScalableVectorType anyway, but we can exit faster with the external check. Move this to a helper function because there are likely other vector folds that we can try here.	2020-06-01 14:28:31 -04:00
Hiroshi Yamauchi	6c27c61d32	[PGO] Improve the working set size heuristics under the partial sample PGO. Summary: The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize) under the partial sample PGO may not be accurate because the profile is partial and the number of hot profile counters in the ProfileSummary may not reflect the actual working set size of the program being compiled. To improve this, the (approximated) ratio of the number of profile counters of the program being compiled to the number of profile counters in the partial sample profile is computed (which is called the partial profile ratio) and the working set size of the profile is scaled by this ratio to reflect the working set size of the program being compiled and used for the working set size heuristics. The partial profile ratio is approximated based on the number of the basic blocks in the program and the NumCounts field in the ProfileSummary and computed through the thin LTO indexing. This means that there is the limitation that the scaled working set size is available to the thin LTO post link passes only. Reviewers: davidxl Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79831	2020-06-01 10:29:23 -07:00
Stanislav Mekhanoshin	745c6c8458	Process gep (phi ptr1, ptr2) in SROA Differential Revision: https://reviews.llvm.org/D79218	2020-06-01 08:41:05 -07:00
Sanjay Patel	dd54432a0f	[InstNamer] use 'i' for Instructions, not 'tmp' As discussed in https://bugs.llvm.org/show_bug.cgi?id=45951 and D80584, the name 'tmp' is almost always a bad choice, but we have a legacy of regression tests with that name because it was baked into utils/update_test_checks.py. This change makes -instnamer more consistent (already using "arg" and "bb", the common LLVM shorthand). And it avoids the conflict in telling users of the FileCheck script to run "-instnamer" to create a better regression test and having that cause a warn/fail in update_test_checks.py.	2020-06-01 11:11:14 -04:00
Ehud Katz	8a84158e5b	[StructurizeCFG] Fix an incorrect comment, NFC.	2020-06-01 17:42:09 +03:00
Ehud Katz	85c3088049	[StructurizeCFG] Fix region nodes ordering This is a reimplementation of the `orderNodes` function, as the old implementation didn't take into account all cases. The new implementation uses SCCs instead of Loops to take account of irreducible loops. Fix PR41509 Differential Revision: https://reviews.llvm.org/D79037	2020-06-01 12:50:35 +03:00
Whitney Tsang	7873376bb3	[LoopUnroll] Fix build failure for allyesconfig. Differential Revision: https://reviews.llvm.org/D80477.	2020-05-30 18:32:47 +00:00
zoecarver	065bf124fd	[DSE] Remove noop stores in MSSA. Adds a simple fast-path check for the pattern: v = load ptr store v to ptr I took the tests from the bugzilla post; I can add more if needed (but I think these should be sufficient). Refs: https://bugs.llvm.org/show_bug.cgi?id=45795 Differential Revision: https://reviews.llvm.org/D79391	2020-05-30 09:57:30 -07:00
Christopher Tetreault	e6cf402e83	[SVE] Eliminate calls to default-false VectorType::get() from AggressiveInstCombine Reviewers: efriedma, aymanmus, c-rhodes, david-arm Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80332	2020-05-29 15:49:33 -07:00
Valery N Dmitriev	a45688a72c	[SLP] Apply external to vectorizable tree users cost adjustment for relevant aggregate build instructions only (UserCost). Users are detected with findBuildAggregate routine and the trick is that following SLP vectorization may end up vectorizing entire list with smaller chunks. Cost adjustment then is applied for individual chunks and these adjustments obviously have to be smaller than the entire aggregate build cost. Differential Revision: https://reviews.llvm.org/D80773	2020-05-29 15:37:41 -07:00
Christopher Tetreault	8f8029b458	[SVE] Eliminate calls to default-false VectorType::get() from InstCombine Reviewers: efriedma, david-arm, fpetrogalli, spatel Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80334	2020-05-29 15:31:31 -07:00
Christopher Tetreault	e4d2037a5c	[SVE] Eliminate calls to default-false VectorType::get() from Instrumentation Reviewers: efriedma, fpetrogalli, kmclaughlin Reviewed By: fpetrogalli Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80335	2020-05-29 15:22:58 -07:00
Christopher Tetreault	c8f1aca316	[SVE] Eliminate calls to default-false VectorType::get() from Utils Reviewers: efriedma, c-rhodes, sdesmalen, xbolva00 Reviewed By: c-rhodes Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80337	2020-05-29 15:01:18 -07:00
Stanislav Mekhanoshin	af852d6f36	Revert "Process gep (phi ptr1, ptr2) in SROA" This reverts commit `f66a43c11a`.	2020-05-29 13:51:03 -07:00
Stanislav Mekhanoshin	f66a43c11a	Process gep (phi ptr1, ptr2) in SROA Differential Revision: https://reviews.llvm.org/D79218	2020-05-29 13:05:51 -07:00
Christopher Tetreault	d2befc6633	[SVE] Eliminate calls to default-false VectorType::get() from Vectorize Reviewers: efriedma, c-rhodes, david-arm, fhahn Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80339	2020-05-29 11:31:24 -07:00
Ehud Katz	c710bb44a6	[Local] Prevent `invertCondition` from creating a redundant instruction Prevent `invertCondition` from creating the inversion instruction, in case the given value is an argument which has already been inverted. Note that this approach has already been taken in case the given value is an instruction (and not an argument). Differential Revision: https://reviews.llvm.org/D80399	2020-05-29 21:08:22 +03:00
Paul Robinson	8c2d2d971b	Preserve DbgLoc when DeadArgumentElimination rewrites a 'ret'. Fixes PR46002.	2020-05-29 10:00:33 -07:00
Florian Hahn	01f999ae88	[SCCP] Switch to widen at PHIs, stores and call edges. Currently SCCP does not widen PHIs, stores or along call edges (arguments/return values), but on operations that directly extend ranges (like binary operators). This means PHIs, stores and call edges are not pessimized by widening currently, while binary operators are. The main reason for widening operators initially was that opting-out for certain operations was more straight-forward in the initial implementation (and it did not matter too much, as range support initially was only implemented for a very limited set of operations). During the discussion in D78391, it was suggested to consider flipping widening to PHIs, stores and along call edges. After adding support for tracking the number of range extensions in ValueLattice, limiting the number of range extensions per value is straight forward. This patch introduces a MaxWidenSteps option to the MergeOptions, limiting the number of range extensions per value. For PHIs, it seems natural to allow an extension for each (active) incoming value plus 1. For the other cases, an arbitrary limit of 10 has been chosen initially. It would potentially make sense to set it depending on the users of a function/global, but that still needs investigating. This potentially leads to more state-changes and longer compile-times. 
The results look quite promising (MultiSource, SPEC): Same hash: 179 (filtered out) Remaining: 58 Metric: sccp.IPNumInstRemoved Program base widen-phi diff test-suite...ks/Prolangs-C/agrep/agrep.test 58.00 82.00 41.4% test-suite...marks/SciMark2-C/scimark2.test 32.00 43.00 34.4% test-suite...rks/FreeBench/mason/mason.test 6.00 8.00 33.3% test-suite...langs-C/football/football.test 104.00 128.00 23.1% test-suite...cations/hexxagon/hexxagon.test 36.00 42.00 16.7% test-suite...CFP2000/177.mesa/177.mesa.test 214.00 249.00 16.4% test-suite...ngs-C/assembler/assembler.test 14.00 16.00 14.3% test-suite...arks/VersaBench/dbms/dbms.test 10.00 11.00 10.0% test-suite...oxyApps-C++/miniFE/miniFE.test 43.00 47.00 9.3% test-suite...ications/JM/ldecod/ldecod.test 179.00 195.00 8.9% test-suite...CFP2006/433.milc/433.milc.test 249.00 265.00 6.4% test-suite.../CINT2000/175.vpr/175.vpr.test 98.00 104.00 6.1% test-suite...peg2/mpeg2dec/mpeg2decode.test 70.00 74.00 5.7% test-suite...CFP2000/188.ammp/188.ammp.test 71.00 75.00 5.6% test-suite...ce/Benchmarks/PAQ8p/paq8p.test 111.00 117.00 5.4% test-suite...ce/Applications/Burg/burg.test 41.00 43.00 4.9% test-suite...000/197.parser/197.parser.test 66.00 69.00 4.5% test-suite...tions/lambda-0.1.3/lambda.test 23.00 24.00 4.3% test-suite...urce/Applications/lua/lua.test 301.00 313.00 4.0% test-suite...TimberWolfMC/timberwolfmc.test 76.00 79.00 3.9% test-suite...lications/ClamAV/clamscan.test 991.00 1030.00 3.9% test-suite...plications/d/make_dparser.test 53.00 55.00 3.8% test-suite...fice-ispell/office-ispell.test 83.00 86.00 3.6% test-suite...lications/obsequi/Obsequi.test 28.00 29.00 3.6% test-suite.../Prolangs-C/bison/mybison.test 56.00 58.00 3.6% test-suite.../CINT2000/254.gap/254.gap.test 170.00 176.00 3.5% test-suite.../Applications/lemon/lemon.test 30.00 31.00 3.3% test-suite.../CINT2000/176.gcc/176.gcc.test 1202.00 1240.00 3.2% test-suite...pplications/treecc/treecc.test 79.00 81.00 2.5% test-suite...chmarks/MallocBench/gs/gs.test 
357.00 366.00 2.5% test-suite...eeBench/analyzer/analyzer.test 103.00 105.00 1.9% test-suite...T2006/445.gobmk/445.gobmk.test 1697.00 1724.00 1.6% test-suite...006/453.povray/453.povray.test 1812.00 1839.00 1.5% test-suite.../Benchmarks/Bullet/bullet.test 337.00 342.00 1.5% test-suite.../CINT2000/252.eon/252.eon.test 426.00 432.00 1.4% test-suite...T2000/300.twolf/300.twolf.test 214.00 217.00 1.4% test-suite...pplications/oggenc/oggenc.test 244.00 247.00 1.2% test-suite.../CINT2006/403.gcc/403.gcc.test 4008.00 4055.00 1.2% test-suite...T2006/456.hmmer/456.hmmer.test 175.00 177.00 1.1% test-suite...nal/skidmarks10/skidmarks.test 430.00 434.00 0.9% test-suite.../Applications/sgefa/sgefa.test 115.00 116.00 0.9% test-suite...006/447.dealII/447.dealII.test 1082.00 1091.00 0.8% test-suite...6/482.sphinx3/482.sphinx3.test 141.00 142.00 0.7% test-suite...ocBench/espresso/espresso.test 152.00 153.00 0.7% test-suite...3.xalancbmk/483.xalancbmk.test 4003.00 4025.00 0.5% test-suite...lications/sqlite3/sqlite3.test 548.00 551.00 0.5% test-suite...marks/7zip/7zip-benchmark.test 5522.00 5551.00 0.5% test-suite...nsumer-lame/consumer-lame.test 208.00 209.00 0.5% test-suite...:: External/Povray/povray.test 1556.00 1563.00 0.4% test-suite...000/186.crafty/186.crafty.test 298.00 299.00 0.3% test-suite.../Applications/SPASS/SPASS.test 2019.00 2025.00 0.3% test-suite...ications/JM/lencod/lencod.test 8427.00 8449.00 0.3% test-suite...6/464.h264ref/464.h264ref.test 6797.00 6813.00 0.2% test-suite...6/471.omnetpp/471.omnetpp.test 431.00 430.00 -0.2% test-suite...006/450.soplex/450.soplex.test 446.00 447.00 0.2% test-suite...0.perlbench/400.perlbench.test 1729.00 1727.00 -0.1% test-suite...000/255.vortex/255.vortex.test 3815.00 3819.00 0.1% Reviewers: efriedma, nikic, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79036	2020-05-29 11:59:17 +01:00
David Sherwood	f254f1d94e	[SVE] Remove getNumElements() warnings in InstCombiner::visitBitCast Whilst trying to compile this test to assembly: CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret.c I discovered some warnings were firing in InstCombiner::visitBitCast due to calls to getNumElements() for scalable vector types. These calls only really made sense for fixed width vectors so I have fixed up the code appropriately. Differential Revision: https://reviews.llvm.org/D80559	2020-05-29 08:00:08 +01:00
Whitney Tsang	4e74541a92	[LoopUnroll] Fix not-rotated.ll by adding back a limitation that was unintentionally removed in https://reviews.llvm.org/D80477	2020-05-29 03:05:58 +00:00
Whitney Tsang	1bc73b02d6	[LoopUnroll] Support loops with exiting block that is neither header nor latch. Summary: Remove the limitation in LoopUnrollPass that exiting block must be either header or latch. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto, fhahn, efriedma Reviewed By: etiotto, fhahn, efriedma Subscribers: efriedma, lkail, xbolva00, hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80477	2020-05-29 01:18:38 +00:00
Philip Reames	a0d2fd4a1f	[Statepoint] Sink actual_args and gc_args to GCStatepointInst [NFC] These are the two operand sets which are expected to survive more than another week or so. Instead of bothering to update the deopt and gc-transition operands, we'll just wait until those are removed and delete the code. For those following along, this is likely to be the last (major) change in this sequence for about a week. I want to wait until all of this has been merged downstream to ensure I haven't introduced any bugs (and migrate some downstream code to the new interfaces). Once that's done, we should be able to delete Statepoint/ImmutableStatepoint without too much work.	2020-05-28 13:51:59 -07:00
aartbik	f719e7d9e7	[llvm] [MatrixIntrinsics] Add row-major support for llvm.matrix.transpose Summary: Only column-major was supported so far. This adds row-major support as well. Note that we probably also want very efficient SIMD implementations for the various target platforms. Bug: https://bugs.llvm.org/show_bug.cgi?id=46085 Reviewers: nicolasvasilache, reidtatge, bkramer, fhahn, ftynse, andydavis1, craig.topper, dcaballe, mehdi_amini, anemet Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80673	2020-05-28 12:13:32 -07:00
Whitney Tsang	47ffc81830	Revert "[LoopUnroll] Support loops with exiting block that is neither header nor" This reverts commit `2810582265`. Revert until http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/7334 is resolved.	2020-05-28 19:10:27 +00:00
Whitney Tsang	2810582265	[LoopUnroll] Support loops with exiting block that is neither header nor latch. Summary: Remove the limitation in LoopUnrollPass that exiting block must be either header or latch. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto, fhahn, efriedma Reviewed By: etiotto, fhahn, efriedma Subscribers: efriedma, lkail, xbolva00, hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80477	2020-05-28 18:27:09 +00:00
Philip Reames	587fa99cfd	Default to generating statepoints with deopt and gc-transition bundles if needed Continues from D80598. The key point of the change is to default to using operand bundles instead of the inline length prefix argument lists for statepoint nodes. An important subtlety to note is that the presence of a bundle has semantic meaning, even if it is empty. As such, we need to make a somewhat deeper change to the interface than is first obvious. Existing code treats statepoint deopt arguments and the deopt bundle operands differently during inlining. The former is ignored (resulting in caller state being dropped), the latter is merged. We can't preserve the old behaviour for calls with deopt fed to RS4GC and then inlining, but we can avoid the no-deopt case changing. At least in internal testing, that seems to be the important one. (I'd argue the "stop merging after RS4GC" behaviour for the former was always "unexpected", but that the behaviour for non-deopt calls actually makes sense.) Differential Revision: https://reviews.llvm.org/D80674	2020-05-28 10:14:23 -07:00
Hiroshi Yamauchi	f0c2cfe4d0	[PGO] Guard the memcmp/bcmp size value profiling instrumentation behind flag. Summary: Follow up D79751 and put the instrumentation / value collection side (in addition to the optimization side) behind the flag as well. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80646	2020-05-28 10:07:04 -07:00
Sidharth Baveja	15b6730f07	Create utility function to Merge Adjacent Basic Blocks Summary: The following code from /llvm/lib/Transforms/Utils/LoopUnrollAndJam.cpp can be used by other transformations: while (!MergeBlocks.empty()) { BasicBlock *BB = *MergeBlocks.begin(); BranchInst *Term = dyn_cast<BranchInst>(BB->getTerminator()); if (Term && Term->isUnconditional() && L->contains(Term->getSuccessor(0))) { BasicBlock *Dest = Term->getSuccessor(0); BasicBlock *Fold = Dest->getUniquePredecessor(); if (MergeBlockIntoPredecessor(Dest, &DTU, LI)) { // Don't remove BB and add Fold as they are the same BB assert(Fold == BB); (void)Fold; MergeBlocks.erase(Dest); } else MergeBlocks.erase(BB); } else MergeBlocks.erase(BB); } Hence it should be separated into its own utility function. Authored By: sidbav Reviewer: Whitney, Meinersbur, asbirlea, dmgreen, etiotto Reviewed By: asbirlea Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80583	2020-05-28 16:44:37 +00:00
Matt Arsenault	d6671ee90c	InferAddressSpaces: Handle ptrmask intrinsic This one is slightly odd since it counts as an address expression, which previously could never fail. Allow the existing TTI hook to return the value to use, and re-use it to handle ptrmask. Handles the no-op addrspacecasts for AMDGPU. We could probably do something better based on analysis of the mask value based on the address space, but leave that for now.	2020-05-28 10:04:02 -04:00
Kazu Hirata	c4990a03c6	[JumpThreading] Use emplace_back instead of push_back (NFC) Summary: This patch replaces push_back with emplace_back where appropriate. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80688	2020-05-27 22:31:23 -07:00
Philip Reames	87bea912c2	[Statepoint] Replace uses of isX functions with idiomatic isa<X> Now that all of the statepoint related routines have classes with isa support, let's clean up. I'm leaving the (dead) utilities in tree for a few days so that I can do the same cleanup downstream without breakage.	2020-05-27 18:32:28 -07:00
Layton Kifer	2bf3fe9b6d	[TRE] Allow elimination when the returned value is non-constant Currently we can only eliminate call return pairs that either return the result of the call or a dynamic constant. This patch removes that limitation. Differential Revision: https://reviews.llvm.org/D79660	2020-05-27 16:55:03 -07:00
Mircea Trofin	fa3b587196	[llvm][NFC] Simplify ProfileSummaryInfo state transitions ProfileSummaryInfo is updated seldom, as a result of very specific triggers. This patch clearly demarcates state updates from read-only uses. This, arguably, improves readability and maintainability.	2020-05-27 11:58:37 -07:00
Rithik Sharma	eadf295956	[CodeMoverUtils] Use dominator tree level to decide the direction of code motion Summary: Currently isSafeToMoveBefore uses DFS numbering for determining the relative position of instruction and insert point which is not always correct. This PR proposes the use of Dominator Tree depth for the same. If a node is at a higher level than the insert point then it is safe to say that we want to move in the forward direction. Authored By: RithikSharma Reviewer: Whitney, nikic, bmahjour, etiotto, fhahn Reviewed By: Whitney Subscribers: fhahn, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80084	2020-05-27 18:02:06 +00:00
David Green	70d4a20299	[UnJ] Update LI for inner nested loops This makes sure to correctly register the loop info of the children of unroll and jammed loops. It re-uses some code from the unroller for registering subloops. Differential Revision: https://reviews.llvm.org/D80619	2020-05-27 14:36:38 +01:00
Florian Hahn	9b507b2127	[LAA] We only need pointer checks if there are non-zero checks (NFC). If it turns out that we can do runtime checks, but there are no runtime-checks to generate, set RtCheck.Need to false. This can happen if we can prove statically that the pointers passed in to canCheckPtrAtRT do not alias. This should not change any results, but allows us to skip some work and assert that runtime checks are generated, if LAA indicates that runtime checks are required. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79969 Note: This is a recommit of `259abfc7cb`, with some suggested renaming.	2020-05-27 12:47:36 +01:00
Florian Hahn	2d0389821e	Revert "[LAA] We only need pointer checks if there are non-zero checks (NFC)." This reverts commit `259abfc7cb`. Reverting this, as I missed a case where we return without setting RtCheck.Need.	2020-05-27 12:39:45 +01:00
Florian Hahn	259abfc7cb	[LAA] We only need pointer checks if there are non-zero checks (NFC). If it turns out that we can do runtime checks, but there are no runtime-checks to generate, set RtCheck.Need to false. This can happen if we can prove statically that the pointers passed in to canCheckPtrAtRT do not alias. This should not change any results, but allows us to skip some work and assert that runtime checks are generated, if LAA indicates that runtime checks are required. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79969	2020-05-27 12:37:20 +01:00
Daniil Suchkov	706b22e3e4	[SimpleLoopUnswitch] Drop uses of instructions before block deletion Currently if instructions defined in a block are used in unreachable blocks and SimpleLoopUnswitch attempts deleting the block, it triggers assertion "Uses remain when a value is destroyed!". This patch fixes it by replacing all uses of instructions from BB with undefs before BB deletion. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D80551	2020-05-27 18:25:18 +07:00
Simon Pilgrim	35963f6d85	VPlanValue.h - reduce unnecessary includes to forward declarations. NFC.	2020-05-27 11:26:14 +01:00
Djordje Todorovic	65030821d4	[NFC][Debugify] Format the CheckModuleDebugify output This fixes the output of the check-debugify option. Without the patch an example of running the option: $ opt -check-debugify test.ll -S -o testDebugify.ll CheckModuleDebugifySkipping module without debugify metadata After the patch: $ opt -check-debugify test.ll -S -o testDebugify.ll CheckModuleDebugify: Skipping module without debugify metadata Differential Revision: https://reviews.llvm.org/D80553	2020-05-27 10:32:40 +02:00
Florian Hahn	5cf90d6cf1	[LoopUnroll] Simplify latch/header block handling (NFC). I think the current code dealing with connecting the unrolled iterations is a bit more complicated than necessary. To connect the unrolled iterations, we have to update the unrolled latch blocks to branch to the header of the next unrolled iteration. We need to do this regardless of whether the latch is exiting or not. Additionally, we try to turn the conditional branch in the exiting block to an unconditional one. This is an optimization only; alternatively we could leave the conditional branches in place and rely on other passes to simplify the conditions. Logically, this is a separate step from connecting the latches to the headers, but it is convenient to fold them into the same loop, if the latch is also exiting. For headers (or other non-latch exiting blocks), this is done separately. Hopefully the patch with additional comments makes things a bit clearer. Reviewers: efriedma, dmgreen, hfinkel, Whitney Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D80544	2020-05-26 21:54:12 +01:00
Stanislav Mekhanoshin	42725aeed8	Process gep (select ptr1, ptr2) in SROA Differential Revision: https://reviews.llvm.org/D79217	2020-05-26 12:56:02 -07:00
Sanjay Patel	1a2bffaf8b	[InstCombine] reassociate sub+add to increase adds and throughput The -reassociate pass tends to transform this kind of pattern into something that is worse for vectorization and codegen. See PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953 Follows-up the FP version of the same transform: rGa0ce2338a083	2020-05-26 14:49:17 -04:00
Hiroshi Yamauchi	106ec64fbc	[PGO] Add memcmp/bcmp size value profiling. Summary: This adds support for memcmp/bcmp to the existing memcpy/memset value profiling. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79751	2020-05-26 10:28:04 -07:00
Sanjay Patel	a0ce2338a0	[InstCombine] reassociate fsub+fadd with FMF to increase adds and throughput The -reassociate pass tends to transform this kind of pattern into something that is worse for vectorization and codegen. See PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953	2020-05-26 13:17:15 -04:00
Serge Pavlov	4d20e31f73	[FPEnv] Intrinsic llvm.roundeven This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven, and performs rounding to the nearest integer value, rounding halfway cases to even. The intrinsic represents the missed case of IEEE-754 rounding operations and now llvm provides full support of the rounding operations defined by the standard. Differential Revision: https://reviews.llvm.org/D75670	2020-05-26 19:24:58 +07:00
Yi Kong	c1c9eb0ab7	[Transforms] Check validity of profile reader before invoking it Although an invalid sampling profile would fail the compilation anyway, this avoids crashing the compiler.	2020-05-26 20:11:24 +08:00
Sam Parker	871556a494	[CostModel] Unify Intrinsic Costs. Recommitting most of the remaining changes from `259eb619ff`, but excluding the call to getUserCost from getInstructionThroughput. Though there's still no test changes, I doubt that this is an NFC... With the two getIntrinsicInstrCosts folded into one, now fold in the scalar/code-size orientated getIntrinsicCost. The remaining scalar intrinsics were memcpy, cttz and ctlz which now have special handling in the BasicTTI implementation. This had required a change in the AMDGPU backend for fabs as it should always be 'free'. I've also changed the X86 backend to return the BaseT implementation when the CostKind isn't RecipThroughput. Differential Revision: https://reviews.llvm.org/D80012	2020-05-26 09:48:26 +01:00
Florian Hahn	179c80117c	[LoopUnroll] Remove dead NextBlocks argument (NFC).	2020-05-25 22:09:11 +01:00
Marek Kurdej	bc93c2d72e	[Transforms] Fix typos. NFC	2020-05-25 22:34:08 +02:00
Whitney Tsang	5d6c5b463c	[LoopUtils] Use llvm::find Summary: Fixes this build error: llvm/lib/Transforms/Utils/LoopUtils.cpp:679:26: error: no matching function for call to 'find' Loop::iterator I = find(ParentLoop->begin(), ParentLoop->end(), L); ^~~~ Authored By: orivej Reviewer: Whitney Reviewed By: Whitney Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80473	2020-05-25 13:34:56 +00:00
Simon Pilgrim	8b4ecafee6	InstructionSimplify.h - remove unnecessary includes. NFC. Remove unused User.h include. Replace SetVector.h with forward declaration. Sort the forward declarations + remove FastMathFlags (defined in Operator.h). Fix implicit SetVector.h dependency in LowerConstantIntrinsics.cpp.	2020-05-25 13:45:03 +01:00
Ayal Zaks	840450549c	[LV] Clamp MaxVF to power of 2. If a loop has a constant trip count known to be a multiple of MaxVF (times user UF), LV infers that no tail will be generated for any chosen VF. This relies on the chosen VF's being powers of 2 bound by MaxVF, and assumes MaxVF is a power of 2. Make sure the latter holds, in particular when MaxVF is set by a memory dependence distance which may not be a power of 2. Differential Revision: https://reviews.llvm.org/D80491	2020-05-25 11:24:33 +03:00
Sanjay Patel	57bb4787d7	[Pass Manager] remove EarlyCSE as clean-up for VectorCombine EarlyCSE was added with D75145, but the motivating test is not regressed by removing the extra pass now. That might be because VectorCombine altered the way it processes instructions, or it might be from (re)moving VectorCombine in the pipeline. The extra round of EarlyCSE appears to cost approximately 0.26% in compile-time as discussed in D80236, so we need some evidence to justify its inclusion here, but we do not have that (yet). I suspect that between SLP and VectorCombine, we are creating patterns that InstCombine and/or codegen are not prepared for, but we will need to reduce those examples and include them as PhaseOrdering and/or test-suite benchmarks.	2020-05-24 12:36:21 -04:00
Florian Hahn	0deab8a54f	[LV] Either get invariant condition OR vector condition. Currently we unconditionally get the first lane of the condition operand, even if we later use the full vector condition. This can result in some unnecessary instructions being generated. Suggested as follow-up in D80219.	2020-05-24 17:16:42 +01:00
Sanjay Patel	c048a02b5b	[InstCombine] fold FP trunc into exact itofp Similar to D79116 and rGbfd512160fe0 - if the 1st cast is exact, then we can go directly to the destination type because there is no double-rounding.	2020-05-24 09:30:19 -04:00
Sanjay Patel	7eed772a27	[PatternMatch] abbreviate vector inst matchers; NFC Readability is not reduced with these opcodes/match lines, so reduce odds of awkward wrapping from 80-col limit.	2020-05-24 09:19:47 -04:00
Florian Hahn	15224408f0	[VPlan] Use VPUser for VPWidenSelectRecipe operands (NFC). VPWidenSelectRecipe already contains a VPUser, but it is not used. This patch updates the code related to VPWidenSelectRecipe to use VPUser for its operands. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D80219	2020-05-24 13:58:08 +01:00
Matt Arsenault	cdd006eec9	SimplifyCFG: Clean up optforfuzzing implementation This should function as any other SimplifyCFGOption rather than having the transform check and specially consider the attribute itself.	2020-05-23 13:49:50 -04:00
Matt Arsenault	27fe841aa6	AMDGPU: Refine rcp/rsq intrinsic folding for modern FP rules We have to assume undef could be an snan, which would need quieting so returning qnan is safer than undef. Also consider strictfp, and don't care if the result rounded.	2020-05-23 13:28:36 -04:00
Michal Paszkowski	335de55fa3	Revert "Added a new IRCanonicalizer pass." This reverts commit `14d358537f`.	2020-05-23 13:51:43 +02:00
Michal Paszkowski	14d358537f	Added a new IRCanonicalizer pass. Summary: Added a new IRCanonicalizer pass which aims to transform LLVM modules into a canonical form by reordering and renaming instructions while preserving the same semantics. The canonicalizer makes it easier to spot semantic differences when diffing two modules which have undergone different passes. Presentation: https://www.youtube.com/watch?v=c9WMijSOEUg Reviewed by: plotfi Differential Revision: https://reviews.llvm.org/D66029	2020-05-23 12:45:53 +02:00
Craig Topper	7392820f98	[Align] Remove operations on MaybeAlign that asserted that it had a defined value. If the caller needs to be responsible for making sure the MaybeAlign has a value, then we should just make the caller convert it to an Align with operator*. I explicitly deleted the relational comparison operators that were being inherited from Optional. It's unclear what the meaning of two MaybeAligns where one is defined and the other isn't should be. So make the caller responsible for defining the behavior. I left the ==/!= operators from Optional. But now that exposed a weird quirk that ==/!= between Align and MaybeAlign required the MaybeAlign to be defined. But now we use the operator== from Optional that takes an Optional and the Value. Differential Revision: https://reviews.llvm.org/D80455	2020-05-22 21:54:28 -07:00
Sanjay Patel	024098ae53	[VectorCombine] set preserve alias analysis As noted in D80236, moving the pass in the pipeline exposed this shortcoming. Extra work to recalculate the alias results showed up as a compile-time slowdown.	2020-05-22 16:25:16 -04:00
Sanjay Patel	6438ea45e0	[VectorCombine] position pass after SLP in the optimization pipeline rather than before There are 2 known problem patterns shown in the test diffs here: vector horizontal ops (an x86 specialization) and vector reductions. SLP has greater ability to match and fold those than vector-combine, so let SLP have first chance at that. This is a quick fix while we continue to improve vector-combine and possibly canonicalize to reduction intrinsics. In the longer term, we should improve matching of these patterns because if they were created in the "bad" forms shown here, then we would miss optimizing them. I'm not sure what is happening with alias analysis on the addsub test. The old pass manager now shows an extra line for that, and we see an improvement that comes from SLP vectorizing a store. I don't know what's missing with the new pass manager to make that happen. Strangely, I can't reproduce the behavior if I compile from C++ with clang and invoke the new PM with "-fexperimental-new-pass-manager". Differential Revision: https://reviews.llvm.org/D80236	2020-05-22 12:22:44 -04:00
Sanjay Patel	2f7c24fe30	[InstCombine] (A + B) + B --> A + (B << 1) This eliminates a use of 'B', so it can enable follow-on transforms as well as improve analysis/codegen. The PhaseOrdering test was added for D61726, and that shows the limits of instcombine vs. real reassociation. We would need to run some form of CSE to collapse that further. The intermediate variable naming here is intentional because there's a test at llvm/test/Bitcode/value-with-long-name.ll that would break with the usual nameless value. I'm not sure how to improve that test to be more robust. The naming may also be helpful to debug regressions if this change exposes weaknesses in the reassociation pass for example.	2020-05-22 11:46:59 -04:00
Anh Tuyen Tran	13bf6039c9	Title: [LV] Handle Fold-Tail of loops with vectorization factor equal to 1 Summary: When handling loops whose VF is 1, fold-tail vectorization sets the backedge taken count of the original loop with a vector of a single element. This causes a type mismatch during instruction generation. The purpose of this patch is to address the case of VF==1. Reviewer: Ayal (Ayal Zaks), bmahjour (Bardia Mahjour), fhahn (Florian Hahn), gilr (Gil Rapaport), rengolin (Renato Golin) Reviewed By: Ayal (Ayal Zaks), bmahjour (Bardia Mahjour), fhahn (Florian Hahn) Subscribers: Ayal (Ayal Zaks), rkruppe (Hanna Kruppe), bmahjour (Bardia Mahjour), rogfer01 (Roger Ferrer Ibanez), vkmr (Vineet Kumar), bollu (Siddharth Bhat), hiraditya (Aditya Kumar), llvm-commits (Mailing List llvm-commits) Tag: LLVM Differential Revision: https://reviews.llvm.org/D79976	2020-05-22 13:30:56 +00:00
Sanjay Patel	21f7cf4057	[SLP] fix verification check for valid IR This is a fix for PR45965 - https://bugs.llvm.org/show_bug.cgi?id=45965 - which was left out of D80106 because of a test failure. SLP does its own mini-CSE after potentially creating redundant instructions, so we need to wait for that to complete before running the verifier. Otherwise, we will see a test failure for test/Transforms/SLPVectorizer/X86/crash_vectorizeTree.ll (not changed here) because a phi temporarily has identical but different incoming values for the same incoming block. A related, but independent, test that would have been altered here was fixed with: rG880df55 The test was escaping verification in SLP without this change because we were not running verifyFunction() unless SLP actually changed the IR. Differential Revision: https://reviews.llvm.org/D80401	2020-05-22 09:15:27 -04:00
Matt Arsenault	88c20fa3d2	InstCombine: Add constant folding/simplify for amdgcn.ldexp intrinsic This really belongs in InstructionSimplify since it doesn't introduce new instructions. Put it in instcombine to avoid increasing the number of passes considering target intrinsics. I also noticed that we seem to now be interpreting strictfp attributes on call sites, so try to handle that.	2020-05-22 08:21:38 -04:00
Roman Lebedev	cd921accf9	[NFC] InstCombineNegator: use auto where type is obvious from the cast	2020-05-22 11:14:54 +03:00
Max Kazantsev	403810557b	[InstCombine] Sink pure instructions down to return and unreachable blocks If the only user of `Instr` is in a return or unreachable block, we can sink `Instr` to the `User` safely (unless it reads/writes memory). Return or unreachable blocks are guaranteed to execute zero or one time, and `Instr` always dominates `User`, so they either will be executed together (execution of `User` always implies execution of `Instr`) or not executed at all. Differential Revision: https://reviews.llvm.org/D80120 Reviewed By: asbirlea, jdoerfert	2020-05-22 14:33:42 +07:00
Dinar Temirbulatov	df3b95bc0a	[SLP][NFC] PR45269 getVectorElementSize() is slow The algorithm inside getVectorElementSize() has almost O(x^2) complexity; for example, compiling MultiSource/Applications/ClamAV/shared_sha256.c, with 1k instructions inside the sha256_transform() function, resulted in almost 800k iterations. The following change uses a map to reduce the algorithm to linear complexity. Differential Revision: https://reviews.llvm.org/D80241	2020-05-21 17:26:50 +02:00
Sam Parker	259eb619ff	Revert "[CostModel] Unify Intrinsic Costs." This reverts commit `de71def3f5`. This is causing some very large changes, so I'm first going to break this patch down and re-commit in parts.	2020-05-21 12:50:24 +01:00
Ehud Katz	111ddc57d3	[FlattenCFG] Fix `MergeIfRegion` in case then-path is empty In case the then-path of an if-region is empty, then merging with the else-path should be handled with the inverse of the condition (leading to that path). Fix PR37662 Differential Revision: https://reviews.llvm.org/D78881	2020-05-21 14:06:44 +03:00
Roman Lebedev	b2df961231	[IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835) Summary: Currently, `rewriteLoopExitValues()`'s logic is roughly as following: > Loop over each incoming value in each PHI node. > Query whether the SCEV for that incoming value is high-cost. > Expand the SCEV. > Perform sanity check (`isValidRewrite()`, D51582) > Record the info > Afterwards, see if we can drop the loop given replacements. > Maybe perform replacements. The problem is that we interleave SCEV cost checking and expansion. This is A Problem, because `isHighCostExpansion()` takes special care to not bill for the expansions that were already expanded, and we can reuse. While it makes sense in general - if we know that we will expand some SCEV, all the other SCEV's costs should account for that, which might cause some of them to become non-high-cost too, and cause chain reaction. But that isn't what we are doing here. We expand all SCEV's, unconditionally. So every next SCEV's cost will be affected by the already-performed expansions for previous SCEV's. Even if we are not planning on keeping some of the expansions we performed. Worse yet, this current "bonus" depends on the exact PHI node incoming value processing order. This is completely wrong. As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have a PHI node with two(!) identical high-cost incoming values for the same basic blocks, we would decide first time around that it is high-cost, expand it, and immediately decide that it is not high-cost because we have an expansion that we could reuse (because we expanded it right before, temporarily), and replace the second incoming value but not the first one; thus resulting in a broken PHI. What we instead should do for now, is not perform any expansions until after we've queried all the costs. Later, in particular after `isValidRewrite()` is an assertion (D51582) we could improve upon that, but in a more coherent fashion. 
See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 \| PR45835 ]] Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma Reviewed By: dmajor, mkazantsev Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor Tags: #llvm Differential Revision: https://reviews.llvm.org/D79787	2020-05-21 13:05:55 +03:00
Benjamin Kramer	5b0d1f04bf	Fix a layering violation by not depending from Transforms/Utils on Transforms/Scalar. NFC.	2020-05-21 09:51:58 +02:00
Sam Parker	de71def3f5	[CostModel] Unify Intrinsic Costs. With the two getIntrinsicInstrCosts folded into one, now fold in the scalar/code-size orientated getIntrinsicCost. This involved sinking cost of the TTIImpl into the base implementation, as it performs no target checks. The opcodes remaining were memcpy, cttz and ctlz which now have special handling in the BasicTTI implementation. getInstructionThroughput can now directly return the result of getUserCost. This required a change in the AMDGPU backend for fabs, as it is always 'free'. I've also changed the X86 backend to return '1' for any intrinsic when the CostKind isn't RecipThroughput. Though this is intended to be a non-functional change, there are many paths being combined here so I would be very surprised if this didn't have an effect. Differential Revision: https://reviews.llvm.org/D80012	2020-05-21 07:38:25 +01:00
Yevgeny Rouban	8138487468	[BranchProbabilityInfo] Set edge probabilities at once and fix calcMetadataWeights() Hide the method that allows setting the probability for a particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting an individual edge probability is error-prone. Moreover, it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities. A related bug is fixed in BranchProbabilityInfo::calcMetadataWeights(). Changing unreachable branch probabilities to raw(1) and distributing the rest (oldProbability - raw(1)) over the reachable branches could introduce total probability inaccuracy bigger than 1/numOfBranches. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396	2020-05-21 12:52:37 +07:00
Juneyoung Lee	d9a4a24413	Add CanonicalizeFreezeInLoops pass Summary: If an induction variable is frozen and used, SCEV yields imprecise results because it doesn't say anything about frozen variables. For this reason, a performance degradation happened after https://reviews.llvm.org/D76483 was merged, causing SCEV to yield imprecise results and preventing LSR from optimizing a loop. The suggested solution here is to add a pass which canonicalizes frozen variables inside a loop. To be specific, it pushes freezes out of the loop by freezing the initial value and step values instead & dropping nsw/nuw flags from instructions used by freeze. This solution was also mentioned at https://reviews.llvm.org/D70623 . Reviewers: spatel, efriedma, lebedev.ri, fhahn, jdoerfert Reviewed By: fhahn Subscribers: nikic, mgorny, hiraditya, javed.absar, llvm-commits, sanwou01, nlopes Tags: #llvm Differential Revision: https://reviews.llvm.org/D77523	2020-05-21 09:29:29 +09:00
Eli Friedman	f26bdb539e	Make Value::getPointerAlignment() return an Align, not a MaybeAlign. If we don't know anything about the alignment of a pointer, Align(1) is still correct: all pointers are at least 1-byte aligned. Included in this patch is a bugfix for an issue discovered during this cleanup: pointers with "dereferenceable" attributes/metadata were assumed to be aligned according to the type of the pointer. This wasn't intentional, as far as I can tell, so Loads.cpp was fixed to stop making this assumption. Frontends may need to be updated. I updated clang's handling of C++ references, and added a release note for this. Differential Revision: https://reviews.llvm.org/D80072	2020-05-20 16:37:20 -07:00
Roman Lebedev	55430f53f3	[InstCombine] `insertelement` is negatible if both sources are negatible ---------------------------------------- define <2 x i4> @negate_insertelement(<2 x i4> %src, i4 %a, i32 %x, <2 x i4> %b) { %0: %t0 = sub <2 x i4> { 0, 0 }, %src %t1 = sub i4 0, %a %t2 = insertelement <2 x i4> %t0, i4 %t1, i32 %x %t3 = sub <2 x i4> %b, %t2 ret <2 x i4> %t3 } => define <2 x i4> @negate_insertelement(<2 x i4> %src, i4 %a, i32 %x, <2 x i4> %b) { %0: %t2.neg = insertelement <2 x i4> %src, i4 %a, i32 %x %t3 = add <2 x i4> %t2.neg, %b ret <2 x i4> %t3 } Transformation seems to be correct!	2020-05-20 21:44:31 +03:00
Roman Lebedev	ebed96fdbf	[InstCombine] Negator: `extractelement` is negatible if src is negatible ---------------------------------------- define i4 @negate_extractelement(<2 x i4> %x, i32 %y, i4 %z) { %0: %t0 = sub <2 x i4> { 0, 0 }, %x call void @use_v2i4(<2 x i4> %t0) %t1 = extractelement <2 x i4> %t0, i32 %y %t2 = sub i4 %z, %t1 ret i4 %t2 } => define i4 @negate_extractelement(<2 x i4> %x, i32 %y, i4 %z) { %0: %t0 = sub <2 x i4> { 0, 0 }, %x call void @use_v2i4(<2 x i4> %t0) %t1.neg = extractelement <2 x i4> %x, i32 %y %t2 = add i4 %t1.neg, %z ret i4 %t2 } Transformation seems to be correct!	2020-05-20 21:44:31 +03:00
Arthur Eubanks	8a88755610	Reland [X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer to the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). 
Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 11:25:44 -07:00
Arthur Eubanks	b8cbff51d3	Revert "[X86] Codegen for preallocated" This reverts commit `810567dc69`. Some tests are unexpectedly passing	2020-05-20 10:04:55 -07:00
Arthur Eubanks	810567dc69	[X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer to the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). 
Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 09:20:38 -07:00
Sam Parker	8cc911fa5b	[NFCI][CostModel] Refactor getIntrinsicInstrCost Combine the two API calls into one by introducing a structure to hold the relevant data. This has the added benefit of moving the boiler plate code for arguments and flags, into the constructors. This is intended to be a non-functional change, but the complicated web of logic involved here makes it very hard to guarantee. Differential Revision: https://reviews.llvm.org/D79941	2020-05-20 11:59:08 +01:00
Florian Hahn	bcbd26bfe6	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. This patch was originally committed as `b8a3c34eee`, but broke the modules build, as LoopAccessAnalysis was using the Expander. The code-gen part of LAA was moved to lib/Transforms recently, so this patch can be landed again. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-05-20 10:53:40 +01:00
Benjamin Kramer	350dadaa8a	Give helpers internal linkage. NFC.	2020-05-19 22:16:37 +02:00
Nikita Popov	5fae613a4f	[LVI] Don't require DominatorTree in LVI (NFC) After D76797 the dominator tree is no longer used in LVI, so we can remove it as a pass dependency, and also get rid of the dominator tree enabling/disabling logic in JumpThreading. Apart from cleaning up the code, this also clarifies LVI cache consistency, in that the LVI cache can no longer depend on whether the DT was or wasn't enabled due to pending DT updates at any given time. Differential Revision: https://reviews.llvm.org/D76985	2020-05-19 20:21:46 +02:00
Florian Hahn	7cefd1b4cd	[LV] Remove duplicated return stmt (NFC).	2020-05-19 17:20:50 +01:00
Jay Foad	9bc989a48d	[InstCombine] Remove hasNoInfs check for pow(C,y) -> exp2(log2(C)*y) We already check hasNoNaNs and that x is finite and strictly positive. That only leaves the following special cases (taken from the Linux man page for pow): If x is +1, the result is 1.0 (even if y is a NaN). If the absolute value of x is less than 1, and y is negative infinity, the result is positive infinity. If the absolute value of x is greater than 1, and y is negative infinity, the result is +0. If the absolute value of x is less than 1, and y is positive infinity, the result is +0. If the absolute value of x is greater than 1, and y is positive infinity, the result is positive infinity. The first case is handled elsewhere, and this transformation preserves all the others, so there is no need to limit it to hasNoInfs. Differential Revision: https://reviews.llvm.org/D79409	2020-05-19 17:06:05 +01:00
Florian Hahn	cff9399f6b	[VPlan] Fix comment for User in VPWidenSelectRecipe (NFC). The comment was referring the arguments of the call, but the recipe widens a select.	2020-05-19 15:31:39 +01:00
Florian Hahn	f828d75b46	[VPlan] Add & use VPValue operands for VPReplicateRecipe (NFC). This patch adds VPValue version of the instruction operands to VPReplicateRecipe and uses them during code-generation. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D80114	2020-05-19 15:12:17 +01:00
Florian Hahn	66ad107452	[VPlan] Remove unique_ptr from VPBranchOnRecipeMask (NFC). We can remove a dynamic memory allocation, by checking the number of operands: no operands = all true, 1 operand = mask. Reviewers: Ayal, gilr, rengolin Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D80110	2020-05-19 15:01:37 +01:00
Sameer Sahasrabuddhe	6c84884366	[LoopSimplify] don't separate nested loops with convergent calls Summary: When a loop has multiple backedges, loop simplification attempts to separate them out into nested loops. This results in incorrect control flow in the presence of some functions like a GPU barrier. This change skips the transformation when such "convergent" function calls are present in the loop body. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D80078	2020-05-19 09:22:39 +05:30
Eli Friedman	27b4e6931d	[NFC] Replace MaybeAlign with Align in TargetTransformInfo.	2020-05-18 19:25:49 -07:00
Ayal Zaks	682e739638	[LV] Fix FoldTail under user VF and UF LV considers an internally computed MaxVF to decide if a constant trip-count is a multiple of any subsequently chosen VF, and conclude that no scalar remainder iterations (tail) will be left for Fold Tail to handle. If an external VF is provided via -force-vector-width, it must be considered instead of the internal MaxVF. If an external UF is provided via -force-vector-interleave, it too must be considered in addition to MaxVF or user VF. Fixes PR45679. Differential Revision: https://reviews.llvm.org/D80085	2020-05-19 01:32:25 +03:00
Craig Topper	c9f63297e2	Fix several places that were calling verifyFunction or verifyModule without checking the return value. verifyFunction/verifyModule don't assert or error internally. They also don't print anything if you don't pass a raw_ostream to them. So the caller needs to check the result and ideally pass a stream to get the messages. Otherwise they're just really expensive no-ops. I've filed PR45965 for another instance in SLPVectorizer that causes a lit test failure. Differential Revision: https://reviews.llvm.org/D80106	2020-05-18 13:28:46 -07:00
Nikita Popov	47a0e9f49b	[Sanitizers] Use getParamByValType() (NFC) Instead of fetching the pointer element type.	2020-05-18 22:06:18 +02:00
Volkan Keles	63081dc6f6	LoadStoreVectorizer: Match nested adds to prove vectorization is safe If both OpA and OpB are adds with NSW/NUW and with the same LHS operand, we can guarantee that the transformation is safe if we can prove that OpA won't overflow when IdxDiff is added to the RHS of OpA. Review: https://reviews.llvm.org/D79817	2020-05-18 12:13:01 -07:00
Nikita Popov	736db2f710	[Loads] Require Align in isSafeToLoadUnconditionally() (NFC) Now that load/store have required alignment, accept Align here. This also avoids uses of getPointerElementType(), which is incompatible with opaque pointers.	2020-05-18 20:50:35 +02:00
Mircea Trofin	691980ebb4	[llvm][NFC] Fixed non-compliant style in InlineAdvisor.h Changed OnPass{Entry\|Exit} -> onPass{Entry\|Exit} Also fixed a small typo in a comment.	2020-05-18 10:26:45 -07:00
Vedant Kumar	623b254244	[Local] Do not ignore zexts in salvageDebugInfo, PR45923 Summary: When salvaging a dead zext instruction, append a convert operation to the DIExpressions of the debug uses of the instruction, to prevent the salvaged value from being sign-extended. I confirmed that lldb prints out the correct unsigned result for "f" in the example from PR45923 with this change applied. rdar://63246143 Reviewers: aprantl, jmorse, chrisjackson, davide Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80034	2020-05-18 09:52:02 -07:00
Max Kazantsev	e47c101e35	[InstCombine][NFC] Simplify check in sinking We just need to check that the only predecessor of user parent is BB, we don't need to iterate through BB's successors for it.	2020-05-18 18:10:40 +07:00
Craig Topper	5f65faef2c	ValueMapper does not preserve inline assembly dialect when remapping the type Bug report: https://bugs.llvm.org/show_bug.cgi?id=45291 Patch by Tomasz Miąsko Differential Revision: https://reviews.llvm.org/D80066	2020-05-17 14:57:50 -07:00
Nikita Popov	52e98f620c	[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC) Now that load/store alignment is required, we no longer need most of them. Also switch the getLoadStoreAlignment() helper to return Align instead of MaybeAlign.	2020-05-17 22:19:15 +02:00
Roman Lebedev	fde8eb00e1	[InstCombine] visitMaskedMerge(): when unfolding, sanitize undef constants (PR45955) We can't leave undef vector element constants as-is, it is a miscompile, so we need to sanitize them. We have two vectors (C and ~C): * We can't replace undef with 0 in both of them * We can't replace undef with 0 in only one of them * We could replace undef with -1 in both of them * We could replace undef with -1 in only one(!) of them * We could replace undef with -1 in one and 0 in another one of them. Therefore, it seems best to go with the last option, since otherwise we'd lose knowledge that C and ~C have no common bits set, which seems more important than preserving partial undef knowledge. Fixes https://bugs.llvm.org/show_bug.cgi?id=45955	2020-05-17 22:53:03 +03:00
Sanjay Patel	bfd512160f	[InstCombine] improve analysis of FP->int->FP to eliminate fpextend This was originally in D79116. Converting from a narrow-enough FP source value to integer and back to FP guarantees that the conversion to FP is exact because of UB/poison-on-overflow. This was suggested in PR36617: https://bugs.llvm.org/show_bug.cgi?id=36617#c19	2020-05-17 09:06:57 -04:00
Eli Friedman	4f04db4b54	AllocaInst should store Align instead of MaybeAlign. Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine. Differential Revision: https://reviews.llvm.org/D80044	2020-05-16 14:53:16 -07:00
Sanjay Patel	81e9ede3a2	[VectorCombine] forward walk through instructions to improve chaining of transforms This is split off from D79799 - where I was proposing to fully iterate over a function until there are no more transforms. I suspect we are still going to want to do something like that eventually. But we can achieve the same gains much more efficiently on the current set of regression tests just by reversing the order that we visit the instructions. This may also reduce the motivation for D79078, but we are still not getting the optimal pattern for a reduction.	2020-05-16 13:08:01 -04:00
Nikita Popov	604f44977b	[InstCombine] Clean up alignment handling (NFC) Now that load/store alignment is required, we can simplify code in some places.	2020-05-16 18:47:29 +02:00
Mircea Trofin	08e2386dee	Revert "Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs"" This reverts commit `454de99a6f`. The problem was that one of the ctor arguments of CallAnalyzer was left to be const std::function<>&. A function_ref was passed for it, and then the ctor stored the value in a function_ref field. So a std::function<> would be created as a temporary, and not survive past the ctor invocation, while the field would. Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild Original Differential Revision: https://reviews.llvm.org/D79917	2020-05-15 12:29:16 -07:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Scott Linder	03c44c7584	[NFC] Deduplicate comment in PromoteMemoryToRegister.cpp This has been duplicated since before `2372a193ba`, but that commit has it appearing twice in the space of 10 lines of the same function body. It could also be hoisted up to the point just after where the last special-case is considered, but I want to keep the intent of the original authors. Committed as obvious without a review.	2020-05-15 15:18:07 -04:00
Nikita Popov	f89f7da999	[IR] Convert null-pointer-is-valid into an enum attribute The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it from a string into an enum attribute. In the future, this attribute may be replaced with data layout properties. Differential Revision: https://reviews.llvm.org/D78862	2020-05-15 19:41:07 +02:00
Anna Thomas	7cc3769adb	[VectorUtils] Expose vector-function-abi-variant mangling as a utility. Summary: This change exposes the vector name mangling with LLVM ISA (used as part of vector-function-abi-variant) as a utility. This can then be used by front-ends that add this attribute. Note that all parameters passed in to the function will be mangled with the "v" token to identify that they are of vector type. So, it is the responsibility of the caller to confirm that all parameters in the vectorized variant are of vector type. Added unit test to show vector name mangling. Reviewed-By: fpetrogalli, simoll Differential Revision: https://reviews.llvm.org/D79867	2020-05-15 11:42:20 -04:00
Dmitry Vyukov	151ed6aa38	[TSAN] Add option to allow instrumenting reads of reads-before-writes Add -tsan-instrument-read-before-write which allows instrumenting reads of reads-before-writes. This is required for KCSAN [1], where under certain configurations plain writes behave differently (e.g. aligned writes up to word size may be treated as atomic). In order to avoid missing potential data races due to plain RMW operations ("x++" etc.), we will require instrumenting reads of reads-before-writes. [1] https://github.com/google/ktsan/wiki/KCSAN Author: melver (Marco Elver) Reviewed-in: https://reviews.llvm.org/D79983	2020-05-15 16:08:44 +02:00
Mircea Trofin	454de99a6f	Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs" This reverts commit `767db5be67`.	2020-05-14 22:32:44 -07:00
Mircea Trofin	767db5be67	[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs Summary: Replacing uses of std::function pointers or refs, or Optional, to function_ref, since the usage pattern allows that. If the function is optional, using a default parameter value (nullptr). This led to a few parameter reshuffles, to push all optionals to the end of the parameter list. Reviewers: davidxl, dblaikie Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79917	2020-05-14 22:13:53 -07:00
Davide Italiano	da52aa2c33	[LICM] When promoting loads to the preheader, drop the location. It's really almost going to be misleading, see the example in https://bugs.llvm.org/show_bug.cgi?id=45820 Maybe at some point we can do something fancier, but at least this will fix a bug where we step on dead code while debugging.	2020-05-14 17:05:23 -07:00
Eli Friedman	accc6b5545	LoadInst should store Align, not MaybeAlign. The fact that loads and stores can have the alignment missing is a constant source of confusion: code that usually works can break down in rare cases. So fix the LoadInst API so the alignment is never missing. To reduce the number of changes required to make this work, IRBuilder and certain LoadInst constructors will grab the module's datalayout and compute the alignment automatically. This is the same alignment instcombine would eventually apply anyway; we're just doing it earlier. There's a minor risk that the way we're retrieving the datalayout could break out-of-tree code, but I don't think that's likely. This is the last in a series of patches, so most of the necessary changes have already been merged. Differential Revision: https://reviews.llvm.org/D77454	2020-05-14 13:19:21 -07:00
Anna Thomas	eb282be9f8	[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer This is a relanding of rGbb308b020522420413c7d3f2989a88f2fc423c56 after speculatively fixing buildbot lit test failure which was seen on two bots (I cannot reproduce the lit test failure locally either). [RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer Summary: This is a more general fix to `59029b9eef` (D75704). This patch does the following: updates isKnownBaseValue to account for base pointer and derived pointer having differing types. This in turn allows us to populate the lattice (States) for such derived pointers. It also updates all states where the base and derived pointers have differing types (vector versus scalar) and conservatively marks these states as conflicts. Note that in `59029b9eef`, we were just fixing existing lattice values and that too, only for uses of extractelement. Reviewers: reames, skatkov, dantrushin Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D76305	2020-05-14 11:17:45 -04:00
Ehud Katz	c6c265527d	Revert "[StructurizeCFG] Fix region nodes ordering" This reverts commit `897d8ee5cd`, due to causing an infinite loop when encountering a loop with a sub-region with an inner loop.	2020-05-14 17:56:39 +03:00
Anna Thomas	f20c62741e	Revert "[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer" This reverts commit `bb308b0205`. Failing a testcase.	2020-05-14 10:16:25 -04:00
Anna Thomas	bb308b0205	[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer Summary: This is a more general fix to `59029b9eef` (D75704). This patch does the following: 1. updates isKnownBaseValue to account for base pointer and derived pointer having differing types. 2. This in turn allows us to populate the lattice (States) for such derived pointers. 3. It also updates all states where the base and derived pointers have differing types (vector versus scalar) and conservatively marks these states as conflicts. Note that in `59029b9eef`, we were just fixing existing lattice values and that too, only for uses of extractelement. Reviewers: reames, skatkov, dantrushin Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76305	2020-05-14 10:03:30 -04:00
Florian Hahn	4c8285c750	[VPlan] Move emission of \\l\"+\n to dumpBasicBlock (NFC). The patch standardizes printing of VPRecipes a bit, by hoisting out the common emission of \\l\"+\n. It simplifies the code and is also a first step towards untangling printing from DOT format output, with the goal of making the DOT output optional and to provide a more concise debug output if DOT output is disabled. Reviewers: gilr, Ayal, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D78883	2020-05-14 13:07:59 +01:00
Omar Ahmed	425333c23b	[Attributor] Improve the alignment of the loads This patch introduces an improvement in the Alignment of the loads generated in createReplacementValues() by querying AAAlign attribute for the best Alignment for the base. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76550	2020-05-13 18:24:05 -05:00
Kuter Dinel	e57807769b	[Attributor] Use AAValueConstantRange to infer dereferenceability. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76208	2020-05-13 16:44:15 -05:00
Eric Christopher	d6e3e55c40	Remove unused Debugging variable.	2020-05-13 14:37:26 -07:00
Mircea Trofin	d6695e1876	[llvm] Add interface to drive inlining decision using ML model Summary: This change introduces InliningAdvisor (and related APIs), the interface that abstracts decision making away from the inlining pass. We will use this interface to delegate decision making to a trained ML model, subsequently (see referenced RFC). RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html Reviewers: davidxl, eraman, dblaikie Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79042	2020-05-13 13:27:29 -07:00
Alina Sbirlea	bd541b217f	[NewPassManager] Add assertions when getting stateful cached analysis. Summary: Analyses that are stateful should not be retrieved through a proxy from an outer IR unit, as these analyses are only invalidated at the end of the inner IR unit manager. This patch disallows getting the outer manager and provides an API to get a cached analysis through the proxy. If the analysis is not stateless, the call to getCachedResult will assert. Reviewers: chandlerc Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72893	2020-05-13 12:38:38 -07:00
Alina Sbirlea	db04ff4b6b	[SimpleLoopUnswitch] Add non-empty unreachable block check to exit cases removed. Summary: Update check to include the check for unreachable. Basic blocks ending in unreachable are special cased, as these blocks may be already unswitched. Before this patch this check is only done for the default destination. The condition for the exit cases and the default case must be the same, because we should never leave edges from the switch instruction to a basic block that we are unswitching. In PR45355 we still have a remaining edge (that we're attempting to remove from the DT) because it's the default edge to an unreachable-terminated block where we unswitch a case edge to that block. Resolves PR45355. Reviewers: chandlerc Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78279	2020-05-13 12:38:37 -07:00
Eli Friedman	fcfb3170a7	[SROA] Clean up some uses of MaybeAlign in SROA. Use Align instead of using MaybeAlign; all the operations in question have known alignment. For getSliceAlign() in particular, in the cases where we used to return None, it would be converted back to an Align by IRBuilder, so there's no functional change there. Split off from D77454. Differential Revision: https://reviews.llvm.org/D79205	2020-05-13 11:23:29 -07:00
Huber, Joseph	4d4ea9ac59	OpenMPOpt Remarks Support Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D79359	2020-05-13 12:20:40 -05:00
Reid Kleckner	1370757dd0	Revert "[BrachProbablityInfo] Set edge probabilities at once. NFC." This reverts commit `eef95f2746`. The new assertion about branch probability sums does not hold.	2020-05-13 08:23:09 -07:00
Pierre-vh	2668775f66	[LSR][ARM] Add new TTI hook to mark some LSR chains as profitable This patch adds a new TTI hook to allow targets to tell LSR that a chain including some instruction is already profitable and should not be optimized. This patch also adds an implementation of this TTI hook for ARM so LSR doesn't optimize chains that include the VCTP intrinsic. Differential Revision: https://reviews.llvm.org/D79418	2020-05-13 14:18:28 +01:00
Sjoerd Meijer	9529597cf4	Recommit #2 : "[LV] Induction Variable does not remain scalar under tail-folding." This was reverted because of a miscompilation. At closer inspection, the problem was actually visible in a changed llvm regression test too. This one-line follow up fix/recommit will splat the IV, which is what we are trying to avoid if unnecessary in general, if tail-folding is requested even if all users are scalar instructions after vectorisation. Because with tail-folding, the splat IV will be used by the predicate of the masked loads/stores instructions. The previous version omitted this, which caused the miscompilation. The original commit message was: If tail-folding of the scalar remainder loop is applied, the primary induction variable is splat to a vector and used by the masked load/store vector instructions, thus the IV does not remain scalar. Because we now mark that the IV does not remain scalar for these cases, we don't emit the vector IV if it is not used. Thus, the vectoriser produces less dead code. Thanks to Ayal Zaks for the direction how to fix this.	2020-05-13 13:50:09 +01:00
Ehud Katz	897d8ee5cd	[StructurizeCFG] Fix region nodes ordering This is a reimplementation of the `orderNodes` function, as the old implementation didn't take into account all cases. Fix PR41509 Differential Revision: https://reviews.llvm.org/D79037	2020-05-13 15:33:36 +03:00
Yevgeny Rouban	eef95f2746	[BrachProbablityInfo] Set edge probabilities at once. NFC. Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probability is error-prone. Moreover, it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396	2020-05-13 13:55:36 +07:00
KAWASHIMA Takahiro	272bc25bc1	[LoopReroll] Fix rerolling loop with use outside the loop Fixes PR41696 The loop-reroll pass generates an invalid IR (or its assertion fails in debug build) if values of the base instruction and other root instructions (terms used in the loop-reroll pass) are used outside the loop block. See IRs written in PR41696 as examples. The current implementation of the loop-reroll pass can reroll only loops that don't have values that are used outside the loop, except reduced values (the last values of reduction chains). This is described in the comment of the `LoopReroll::reroll` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600 This is checked in the `LoopReroll::DAGRootTracker::validate` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393 However, the base instruction and other root instructions skip this check in the validation loop. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229 Moving the check in front of the skip is the logically simplest fix. However, inserting the check in an earlier stage is better in terms of compilation time of unrerollable loops. This fix inserts the check for the base instruction into the function to validate possible base/root instructions. Check for other root instructions is unnecessary because they don't match any base instructions if they have uses outside the loop. Differential Revision: https://reviews.llvm.org/D79549	2020-05-13 13:03:03 +09:00
Johannes Doerfert	af48351cc8	[Attributor][FIX] Stabilize the state of AAReturnedValues each update For AAReturnedValues we treated new and existing information differently in the updateImpl. Only the latter was properly analyzed and categorized. The former was thought to be analyzed in the subsequent update. Since the Attributor does not support "self-updates" we need to make sure the state is "stable" after each updateImpl invocation. That is, if the surrounding information does not change, the state is valid. Now we make sure all return values have been handled and properly categorized each iteration. We might not update again if we have not requested a non-fix attribute so we cannot "wait" for the next update to analyze a new return value. Bug reported by @sdmitriev.	2020-05-12 21:00:30 -05:00
Zequan Wu	cb22ab7403	Add nomerge function attribute to suppress tail merge optimization in simplifyCFG We want to add a way to avoid merging identical calls so as to keep the separate debug-information for those calls. There is also an asan usecase where having this attribute would be beneficial to avoid alternative work-arounds. Here is the link to the feature request: https://bugs.llvm.org/show_bug.cgi?id=42783. `nomerge` is different from `noinline`. `noinline` prevents a function from being inlined at call sites, but `nomerge` prevents multiple identical calls from being merged into one. This patch adds `nomerge` to disable the optimization at the IR level. A followup patch will be needed to let the backend understand `nomerge` and avoid tail merging in the backend. Reviewed By: asbirlea, rnk Differential Revision: https://reviews.llvm.org/D78659	2020-05-12 16:49:20 -07:00
Sergey Dmitriev	32f5ee830b	[Attributor] Fixup block addresses after rewriting function signature Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79801	2020-05-12 13:53:04 -07:00
Juneyoung Lee	e5f602d82c	[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc. Summary: This patch makes propagatesPoison be more accurate by returning true on more bin ops/unary ops/casts/etc. The changed test in ScalarEvolution/nsw.ll was introduced by `a19edc4d15` . IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has no-overflow flags even if the loop isn't in the wanted form. It becomes more accurate with this patch, so think this is okay. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy Reviewed By: spatel, nikic Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D78615	2020-05-13 02:51:42 +09:00
Fangrui Song	b56b1e67e3	[gcov] Default coverage version to '408' and delete CC1 option -coverage-exit-block-before-body gcov 4.8 (r189778) moved the exit block from the last to the second. The .gcda format is compatible with 4.7 but decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings, and print wrong `" returned %s` for branch statistics (-b). * decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues. Also, rename "return block" to "exit block" because the latter is the appropriate term.	2020-05-12 09:14:03 -07:00
Eric Christopher	a42e53cccf	Fix typos encountered while working on pass pipeline for O1.	2020-05-12 00:45:15 -07:00
Johannes Doerfert	8d94d3c3b4	[Attributor][FIX] Disallow function signature rewrite for casted calls We will now ensure the return type of called function is the type of all call sites we are going to rewrite. This avoids a problem partially fixed by D79680. The part that was not covered is a use of this "weird" casted call site (see `@func3` in `misc_crash.ll`). misc_crash.ll checks are auto-generated now.	2020-05-11 15:32:47 -05:00
Johannes Doerfert	c115a78f0d	[Attributor] Make AAIsDead dependences optional to prevent top state We should never give up on AAIsDead as it guards other AAs from unreachable code (in which SSA properties are meaningless). We did however use required dependences on some queries in AAIsDead which caused us to invalidate AAIsDead if the queried AA got invalidated. We now use optional dependences instead. The bug that exposed this is added to the liveness.ll test and other test changes show the impact. Bug report by @sdmitriev.	2020-05-11 15:32:47 -05:00
Johannes Doerfert	c86fd3333d	[Attributor] Force update of "newly live" abstract attributes During an update of AAIsDead, new instructions become live. If we query information from them, the result is often just the initial state, e.g., for call site `noreturn` and `nounwind`. We will now trigger an update for cached attributes during the AAIsDead update, though other AAs might later use the same API.	2020-05-11 15:32:47 -05:00
Sanjay Patel	5f730b645d	[VectorCombine] account for extra uses in scalarization cost Follow-up to D79452. Mimics the extra use cost formula for the inverse transform with extracts.	2020-05-11 15:20:57 -04:00
Mircea Trofin	48fa355ed4	[llvm][NFC] Move inlining decision-related APIs in InliningAdvisor. Summary: Factoring out in preparation to https://reviews.llvm.org/D79042 Reviewers: dblaikie, davidxl Subscribers: mgorny, eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79613	2020-05-11 09:00:59 -07:00
Sergey Dmitriev	3df40007e6	[Attributor] Fix for a crash on RAUW when rewriting function signature Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: uenoku Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79680	2020-05-11 08:06:19 -07:00
Tyker	78d85c2091	[AssumeBundles] fix crashes Summary: This patch fixes crashes/asserts found in the test-suite. The AssumptionCache cannot be assumed to have all assumes, contrary to what I thought. Prevent generation of information for terminators, because this can create broken IR in transformations where we insert the new terminator before removing the old one. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79458	2020-05-11 11:52:21 +02:00
OCHyams	da100de0a6	[NFC][DwarfDebug] Add test for variables with a single location which don't span their entire scope. The previous commit (`6d1c40c171`) is an older version of the test. Reviewed By: aprantl, vsk Differential Revision: https://reviews.llvm.org/D79573	2020-05-11 11:49:11 +02:00
Xun Li	44e5aaf911	Remove an unused Module param Summary: In D65848 the function getFuncNameInModule was refactored to no longer use module. This diff removes the parameter and renames the function to avoid confusion. Reviewers: wenlei, wmi, davidxl Reviewed By: wenlei Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79310	2020-05-10 22:09:55 -07:00
Johannes Doerfert	3a8740bdd5	[Attributor] Merge the query set into AbstractAttribute The old QuerriedAAs contained two vectors, one for required one for optional dependences (=queries). We now use a single vector and encode the kind directly in the pointer. This reduces memory consumption and makes the connection between abstract attributes and their dependences clearer. No functional change is intended, changes in the test are due to different order in the query map. Neither the order before nor now is in any way special. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 543734 (329735/s) temporary memory allocations: 105895 (64217/s) peak heap memory consumption: 19.19MB peak RSS (including heaptrack overhead): 102.26MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 513292 (341511/s) temporary memory allocations: 106028 (70544/s) peak heap memory consumption: 13.35MB peak RSS (including heaptrack overhead): 95.64MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: -30442 (208506/s) temporary memory allocations: 133 (-910/s) peak heap memory consumption: -5.84MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` --- Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D78729	2020-05-10 22:27:00 -05:00
Johannes Doerfert	5e06b2514a	[Attributor][FIX] Carefully handle/ignore/forget `argmemonly` When we have an existing `argmemonly` or `inaccessiblemem_or_argmemonly` we used to "know" that information. However, interprocedural constant propagation can invalidate these attributes. We now ignore and remove these attributes for internal functions (which may be affected by IP constant propagation), if we are deriving new attributes for the function.	2020-05-10 19:06:11 -05:00
Johannes Doerfert	713ee3aa77	[Attributor] Use "simplify to constant" in genericValueTraversal As we replace values with constants interprocedurally, we also need to do this "look-through" step during the generic value traversal or we would derive properties from replaced values. While this is often not problematic, it is when we use the "kind" of a value for reasoning, e.g., accesses to arguments allow `argmemonly`.	2020-05-10 19:06:11 -05:00
Johannes Doerfert	513ac6e9b0	[Attributor] Ignore illegal accesses to `null` When we categorize a pointer value we bailed at `null` before. If we know `null` is not a valid memory location we can ignore it as there won't be an access at all.	2020-05-10 19:06:10 -05:00
Johannes Doerfert	31c03b9223	[Attributor] Use existing helpers to determine IR facts We now use getPointerDereferenceableBytes to determine `nonnull` and `dereferenceable` facts from the IR. We also use getPointerAlignment in AAAlign for the same reason. The latter can interfere with callbacks so we do restrict it to non-function-pointers for now.	2020-05-10 19:06:10 -05:00
Johannes Doerfert	a9ee8b492c	[Attributor][NFC] Clang format Attributor*.cpp	2020-05-10 19:06:10 -05:00
Fangrui Song	25544ce2df	[gcov] Default coverage version to '407' and delete CC1 option -coverage-cfg-checksum Defaulting to -Xclang -coverage-version='407' makes .gcno/.gcda compatible with gcov [4.7,8) In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum. We can infer the information from the version. With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7. We don't need other -Xclang -coverage* options. There may be a mismatching version warning, though. (Note, GCC r173147 "split checksum into cfg checksum and line checksum" made gcov 4.7 incompatible with previous versions.)	2020-05-10 16:14:07 -07:00
Fangrui Song	13a633b438	[gcov] Delete CC1 option -coverage-no-function-names-in-data rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION (this might be part of the reasons the header says "We emit files in a corrupt version of GCOV's "gcda" file format"). rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data to work around the issue. (However, the description is wrong. libgcov never writes function names, even before GCC 4.2). In reality, the linker command line has to look like: clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7 either produce wrong results (for one gcov-4.9 program, I see "No executable lines") or segfault (gcov-7). (gcov-8 uses an incompatible format.) This patch deletes -coverage-no-function-names-in-data and the related function names support from libclang_rt.profile	2020-05-10 12:37:44 -07:00
Tyker	5957e058e4	[AssumeBundles] Remove non-determinism from assume builder Summary: The assume builder was non-deterministic when working on unnamed values. This patch fixes this. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78616	2020-05-10 21:18:33 +02:00
Tyker	821a0f23d8	[AssumeBundles] Prevent generation of some redundant assumes Summary: With this patch, the assume salvageKnowledge will not generate an assume if all knowledge is already available in an assume with a valid context. The assume builder can also, in some cases, update an existing assume with better information. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78014	2020-05-10 19:23:59 +02:00
Florian Hahn	8528186b9b	[LAA] Move runtime-check generation to Transforms/Utils/loopUtils (NFC) Currently LAA's uses of ScalarEvolutionExpander blocks moving the expander from Analysis to Transforms. Conceptually the expander does not fit into Analysis (it is only used for code generation) and runtime-check generation also seems to be better suited as a transformation utility. Reviewers: Ayal, anemet Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78460	2020-05-10 17:39:26 +01:00
Sanjay Patel	856cc60bc1	[InstCombine] canonicalize bitcast after insertelement into undef We have a transform in the opposite direction only for the x86 MMX type. Other types are not handled either way before this patch. The motivating case from PR45748: https://bugs.llvm.org/show_bug.cgi?id=45748 ...is the last test diff. In that example, we are triggering an existing bitcast transform, so we reduce the number of casts, and that should give us the ideal x86 codegen. Differential Revision: https://reviews.llvm.org/D79171	2020-05-10 11:37:47 -04:00
Simon Pilgrim	bab44a698e	[InstCombine] matchOrConcat - match BITREVERSE Fold or(zext(bitreverse(x)),shl(zext(bitreverse(y)),bw/2) -> bitreverse(or(zext(x),shl(zext(y),bw/2)) Practically this is the same as the BSWAP pattern so we might as well handle it.	2020-05-10 16:00:29 +01:00
Florian Hahn	96c63f544f	Recommit "[LAA] Remove one addRuntimeChecks function (NFC)." The failing assertion has been fixed and the problematic test case has been added. This reverts the revert commit `fc44617f28`.	2020-05-10 15:19:57 +01:00
Florian Hahn	fc44617f28	Revert "[LAA] Remove one addRuntimeChecks function (NFC)." This reverts commit `c28114c8ff`. This causes some bots to fail: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/30596/steps/build%20android%2Faarch64/logs/stdio	2020-05-10 13:28:00 +01:00
Florian Hahn	c28114c8ff	[LAA] Remove one addRuntimeChecks function (NFC). In order to reduce the API surface area (preparation for D78460), remove a addRuntimeChecks() function and do the additional check in the single caller. Reviewers: Ayal, anemet Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79679	2020-05-10 12:48:55 +01:00
Sanjay Patel	a62533c29f	[InstCombine] fold fpext into exact integer-to-FP cast We can combine a floating-point extension cast with a conversion from integer if we know the earlier cast is exact. This is an optimization suggested in PR36617: https://bugs.llvm.org/show_bug.cgi?id=36617#c19 However, this patch does not change the example suggested there. This patch only uses the existing analysis to handle cases where the integer source value magnitude is narrower than the intermediate FP mantissa (guarantees that the conversion to FP is exact). Follow-up patches to the analysis function can enable more cases. Differential Revision: https://reviews.llvm.org/D79116	2020-05-10 07:04:54 -04:00
Arthur Eubanks	73a9b7dee0	Add missing pass initialization Summary: This was preventing MemorySanitizerLegacyPass from appearing in --print-after-all. Reviewers: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79661	2020-05-09 21:31:52 -07:00
Jinsong Ji	a72b9dfd45	[sanitizer] Enable whitelist/blacklist in new PM https://reviews.llvm.org/D63616 added `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang. However, it was done only for the legacy pass manager. This patch enables it for the new pass manager as well. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D79653	2020-05-10 02:34:29 +00:00
Matt Arsenault	16295d521e	InstCombine: Broaden copy-constant-to-alloca optimization Consider any constant memory type, not just global constants. AMDGPU kernel parameters are effectively global constants, but appear as either reads from an intrinsic derived pointer or function argument.	2020-05-09 16:00:27 -04:00
Evgenii Stepanov	68a9308a0b	[hwasan] Allow -hwasan-globals flag to appear more than once.	2020-05-08 16:35:48 -07:00
Layton Kifer	23cbea9a04	[TRE][NFC] Refactor shared state into member variables. Separate functions that require shared state into a class to avoid needing to pass them through multiple functions just to be available where needed. The main motivation for this is that we would like to remove the limitation that accumulator values be dynamic constant, which would require additional shared state between call eliminations in the same function, compounding this issue. Differential Revision: https://reviews.llvm.org/D79299	2020-05-08 14:36:02 -07:00
Sanjay Patel	0d2a0b44c8	[VectorCombine] scalarize binop of inserted elements into vector constants As with the extractelement patterns that are currently in vector-combine, there are going to be several possible variations on this theme. This should be the clearest, simplest example. Scalarization is the right direction for target-independent canonicalization, and InstCombine has some of those folds already, but it doesn't do this. I proposed a similar transform in D50992. Here in vector-combine, we can check the cost model to be sure it's profitable, so there should be less risk. Differential Revision: https://reviews.llvm.org/D79452	2020-05-08 16:31:12 -04:00
Sanjay Patel	46d6f76be3	[InstCombine] fix typo in comment; NFC	2020-05-08 15:43:14 -04:00
zoecarver	f65f566aeb	Re-commit: Mark values as trivially dead when their only use is a start or end lifetime intrinsic. Summary: If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well. Currently, this only works for allocas, globals, and arguments. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79355	2020-05-08 12:24:10 -07:00
Sanjay Patel	5cf17034e5	[InstCombine] add helper for known exact cast to FP; NFC As suggested in D79116 - there's shared logic between the existing code and potential new folds. This could go in ValueTracking if it seems generally useful.	2020-05-08 15:22:36 -04:00
Ricky Zhou	b38d77f185	[SimplifyCFG] Remap rewritten debug intrinsic operands. FoldBranchToCommonDest clones instructions to a different basic block, but handles debug intrinsics in a separate path. Previously, when cloning debug intrinsics, their operands were not updated to reference the correct cloned values. As a result, we would emit debug.value intrinsics with broken operand references which are discarded in later passes. This leads to incorrect debuginfo that reports incorrect values for variables. Fix this by remapping debug intrinsic operands when cloning them. Fixes https://bugs.llvm.org/show_bug.cgi?id=45667. Differential Revision: https://reviews.llvm.org/D79602	2020-05-08 11:10:25 -07:00
Sanjay Patel	ff9045dc9c	[InstCombine] clean up foldItoFPtoI; NFC Mostly cosmetic improvements to variable names and logic to ease refactoring suggested in D79116.	2020-05-08 12:13:42 -04:00
Sanjay Patel	09d70e0588	[InstCombine] simplify code for FP to integer casts; NFCI FoldIToFPtoI() returns immediately if the operand is not an opposite cast instruction, so the extra checks in the callers are redundant.	2020-05-08 10:14:03 -04:00
Benjamin Kramer	f936457f80	Revert "Recommit "[LV] Induction Variable does not remain scalar under tail-folding."" This reverts commit `ae45b4dbe7`. It causes miscompilations, test case on the mailing list.	2020-05-08 14:49:10 +02:00
Diego Caballero	f5224d437e	[LoopFusion] Remove unreachable blocks from DT and LI after fusion This patch removes FC0.ExitBlock and FC1GuardBlock from DT and LI after fusion of guarded loops. They become unreachable and LI verification failed when they happened to be inside another loop. Reviewed By: kbarton Differential Revision: https://reviews.llvm.org/D78679	2020-05-07 16:44:40 -07:00
Johannes Doerfert	edf0391491	[Attributor][FIX] Record dependences for assumed dead abstract attributes In a recent patch we introduced a problem with abstract attributes that were assumed dead at some point. Since `Attributor::updateAA` was introduced in `95e0d28b71`, we did not remember the dependence on the liveness AA when an abstract attribute was assumed dead and therefore not updated. Explicit reproducer added in liveness.ll. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 509242 (345483/s) temporary memory allocations: 98666 (66937/s) peak heap memory consumption: 18.60MB peak RSS (including heaptrack overhead): 103.29MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 529332 (355494/s) temporary memory allocations: 102107 (68574/s) peak heap memory consumption: 19.40MB peak RSS (including heaptrack overhead): 102.79MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: 20090 (1339333/s) temporary memory allocations: 3441 (229400/s) peak heap memory consumption: 801.45KB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-07 17:00:50 -05:00
Johannes Doerfert	675334daef	[Attributor] Mark dependence as optional	2020-05-07 17:00:50 -05:00
Alina Sbirlea	6227f021ad	[SimpleLoopUnswitch] Update DefaultExit condition to check unreachable is not empty. Summary: Update the check for the default exit block to not only check that the terminator is not unreachable, but also check that unreachable block has only the unreachable instruction. Reviewers: chandlerc Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78277	2020-05-07 13:48:30 -07:00
Huihui Zhang	1ec0cc0f02	[InstCombine][SVE] Fix visitExtractElementInst for scalable type. Summary: This patch fixes the following issues with visitExtractElementInst: 1. Restrict VectorUtils::findScalarElement to fixed-length vector. For scalable type, the number of elements in shuffle mask is unknown at compile-time. 2. Fix out-of-range calculation for fixed-length vector. 3. Skip scalable type when analysis relies on fixed number of elements. 4. Add unit tests to check functionality of extractelement for scalable type. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78267	2020-05-07 13:03:52 -07:00
Huihui Zhang	08c9c13749	[InstCombine][SVE] Fix visitInsertElementInst for scalable type. Summary: This patch fixes the following issues in visitInsertElementInst: 1. Bail out for scalable type when analysis requires fixed size number of vector elements. 2. Use cast<FixedVectorType> to get vector number of elements. This ensures an assertion on scalable vector type. 3. For scalable type, avoid folding a chain of insertelement into splat: insertelt(insertelt(insertelt(insertelt X, %k, 0), %k, 1), %k, 2) ... -> shufflevector(insertelt(X, %k, 0), undef, zero) The length of scalable vector is unknown at compile-time, therefore we don't know if given insertelement sequence is valid for splat. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: sdesmalen, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78895	2020-05-07 12:44:52 -07:00
Sanjay Patel	02051c7f3a	[SLP] add another bailout for load-combine patterns (2nd try) The original patch (rG86dfbc676ebe) exposed an existing bug: we could wrongly cast a constant expression to BinaryOperator because the pattern matching allows that. This adds a check for that case, and there's a reduced test case to verify no crashing. Original commit message: This builds on the or-reduction bailout that was added with D67841. We still do not have IR-level load combining, although that could be a target-specific enhancement for -vector-combiner. The heuristic is narrowly defined to catch the motivating case from PR39538: https://bugs.llvm.org/show_bug.cgi?id=39538 ...while preserving existing functionality. That is, there's an unmodified test of pure load/zext/store that is not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll. That's the reason for the logic difference to require the 'or' instructions. The chances that vectorization would actually help a memory-bound sequence like that seem small, but it looks nicer with: vpmovzxwd (%rsi), %xmm0 vmovdqu %xmm0, (%rdi) rather than: movzwl (%rsi), %eax movl %eax, (%rdi) ... 
In the motivating test, we avoid creating a vector mess that is unrecoverable in the backend, and SDAG forms the expected bswap instructions after load combining: movzbl (%rdi), %eax vmovd %eax, %xmm0 movzbl 1(%rdi), %eax vmovd %eax, %xmm1 movzbl 2(%rdi), %eax vpinsrb $4, 4(%rdi), %xmm0, %xmm0 vpinsrb $8, 8(%rdi), %xmm0, %xmm0 vpinsrb $12, 12(%rdi), %xmm0, %xmm0 vmovd %eax, %xmm2 movzbl 3(%rdi), %eax vpinsrb $1, 5(%rdi), %xmm1, %xmm1 vpinsrb $2, 9(%rdi), %xmm1, %xmm1 vpinsrb $3, 13(%rdi), %xmm1, %xmm1 vpslld $24, %xmm0, %xmm0 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpslld $16, %xmm1, %xmm1 vpor %xmm0, %xmm1, %xmm0 vpinsrb $1, 6(%rdi), %xmm2, %xmm1 vmovd %eax, %xmm2 vpinsrb $2, 10(%rdi), %xmm1, %xmm1 vpinsrb $3, 14(%rdi), %xmm1, %xmm1 vpinsrb $1, 7(%rdi), %xmm2, %xmm2 vpinsrb $2, 11(%rdi), %xmm2, %xmm2 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpinsrb $3, 15(%rdi), %xmm2, %xmm2 vpslld $8, %xmm1, %xmm1 vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero vpor %xmm2, %xmm1, %xmm1 vpor %xmm1, %xmm0, %xmm0 vmovdqu %xmm0, (%rsi) movl (%rdi), %eax movl 4(%rdi), %ecx movl 8(%rdi), %edx movbel %eax, (%rsi) movbel %ecx, 4(%rsi) movl 12(%rdi), %ecx movbel %edx, 8(%rsi) movbel %ecx, 12(%rsi) Differential Revision: https://reviews.llvm.org/D78997	2020-05-07 15:04:37 -04:00
Christopher Tetreault	b6c6bab9a5	[SVE] Fix incorrect usage of getNumElements() in InstCombineCalls Summary: Remove incorrect usage of getNumElements() from visitCallInst(). The number of elements was being used to construct a DemandedElts bitfield. This operation does not make sense for scalable vectors. Cast to FixedVectorType Identified by test case Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_mla.c Reviewers: rengolin, efriedma, sdesmalen, c-rhodes, david-arm Reviewed By: david-arm Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79524	2020-05-07 08:46:51 -07:00
Hans Wennborg	c54c6ee1a7	Revert "[SLP] add another bailout for load-combine patterns" It caused asserts building Chromium, see discussion on https://reviews.llvm.org/D78997 This reverts commit `86dfbc676e`.	2020-05-07 16:31:52 +02:00
Sjoerd Meijer	3bbc71d6c9	[LV] Fix typo in variable name. NFC.	2020-05-07 13:53:44 +01:00
Calixte Denizet	bec223a9bc	[profile] Don't crash when forking in several threads Summary: When forking in several threads, the counters were written out using the same global static variables (see GCDAProfiling.c), which leads to crashes. So when there is a fork, the counters are reset in the child process and they will be dumped at exit using the interprocess file locking. When there is an exec, the counters are written out and in case of failure they're reset. Reviewers: jfb, vsk, marco-c, serge-sans-paille Reviewed By: marco-c, serge-sans-paille Subscribers: llvm-commits, serge-sans-paille, dmajor, cfe-commits, hiraditya, dexonsmith, #sanitizers, marco-c, sylvestre.ledru Tags: #sanitizers, #clang, #llvm Differential Revision: https://reviews.llvm.org/D78477	2020-05-07 14:13:11 +02:00
Sjoerd Meijer	ae45b4dbe7	Recommit "[LV] Induction Variable does not remain scalar under tail-folding." With 3 llvm regr tests fixed/updated that I had missed.	2020-05-07 11:52:20 +01:00
Yevgeny Rouban	b921543c49	SplitIndirectBrCriticalEdges: Fix Branch Probability update When splitting critical edges for indirect branches, the SplitIndirectBrCriticalEdges() function may break branch probabilities if the target basic block happens to have an unset probability for any of its successors. That is because in such cases the getEdgeProbability(Target) function returns probability 1/NumOfSuccessors and it is called after Target was split (thus Target has a single successor). As a result the corresponding successor of the split block gets probability 100% but 1/NumOfSuccessors is expected (or better be left unset). Reviewers: yamauchi Differential Revision: https://reviews.llvm.org/D78806	2020-05-07 15:31:44 +07:00
Sjoerd Meijer	20d67ffeae	Revert "[LV] Induction Variable does not remain scalar under tail-folding." This reverts commit `617aa64c84`. while I investigate buildbot failures.	2020-05-07 09:29:56 +01:00
Sjoerd Meijer	617aa64c84	[LV] Induction Variable does not remain scalar under tail-folding. If tail-folding of the scalar remainder loop is applied, the primary induction variable is splat to a vector and used by the masked load/store vector instructions, thus the IV does not remain scalar. Because we now mark that the IV does not remain scalar for these cases, we don't emit the vector IV if it is not used. Thus, the vectoriser produces less dead code. Thanks to Ayal Zaks for the direction how to fix this. Differential Revision: https://reviews.llvm.org/D78911	2020-05-07 09:15:23 +01:00
Whitney Tsang	0a52401ad6	[LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest. Summary: As discussed in https://reviews.llvm.org/D73129. Example Before unroll and jam: for A for B for C D E After unroll and jam (currently): for A A' for B for C D B' for C' D' E E' After unroll and jam (Ideal): for A A' for B B' for C C' D D' E E' This is the first patch to change unroll and jam to work in the ideal way. This patch changes the safety checks needed to make sure it is safe to unroll and jam in the ideal way. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: Meinersbur Subscribers: fhahn, hiraditya, zzheng, llvm-commits, anhtuyen, prithayan Tag: LLVM Differential Revision: https://reviews.llvm.org/D76132	2020-05-06 21:47:44 +00:00
zoecarver	1998e796e9	Revert "Mark values as trivially dead when their only use is a start or end lifetime intrinsic." This reverts commit `95aa28cc8f`.	2020-05-06 11:07:22 -07:00
zoecarver	95aa28cc8f	Mark values as trivially dead when their only use is a start or end lifetime intrinsic. Summary: If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well. Currently, this only works for allocas, globals, and arguments. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79355	2020-05-06 10:58:08 -07:00
Sanjay Patel	2058c98715	[InstCombine] limit bitcast+insertelement transform to x86 MMX type This is unusual for the general case because we are replacing 1 instruction with 2. Splitting from a potential conflicting transform in D79171	2020-05-06 13:12:36 -04:00
Matt Arsenault	59bc99a08a	InstCombine: Fix return after else	2020-05-06 11:53:26 -04:00
Benjamin Kramer	d5ea89f891	Quiet some -Wdocumentation warnings.	2020-05-06 11:23:13 +02:00
Vitaly Buka	04bd2c37ca	[local-bounds] Ignore volatile operations Summary: -fsanitize=local-bounds is very similar to ``object-size`` and should also ignore volatile pointers. https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#volatile Reviewers: chandlerc, rsmith Reviewed By: rsmith Subscribers: cfe-commits, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D78607	2020-05-05 23:08:08 -07:00
Johannes Doerfert	f014972446	[Attributor][NFC] Cleanup some AAMemoryLocation code This is the first step to resolve a TODO in AAMemoryLocation and to fix a bug we have when handling `byval` arguments of `readnone` call sites. No functional change intended.	2020-05-05 23:15:33 -05:00
Johannes Doerfert	0cc9c02255	[Attributor][NFC] Minor code cleanups to minimize follow up diffs	2020-05-05 23:14:23 -05:00
Johannes Doerfert	094137a6c6	[Attributor][NFC] Avoid dependences on known information	2020-05-05 23:14:23 -05:00
Christopher Tetreault	855e02e799	[SVE] Fix invalid usage of getNumElements() in InstCombineMulDivRem Summary: getLogBase2 tries to iterate over the number of vector elements. Since the number of elements of a scalable vector is unknown at compile time, we must return null if the input type is scalable. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, fpetrogalli, kmclaughlin, spatel Reviewed By: efriedma, fpetrogalli Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79197	2020-05-05 15:19:01 -07:00
Kazu Hirata	e8984fe65b	[Inlining] Teach shouldBeDeferred to take the total cost into account Summary: This patch teaches shouldBeDeferred to take into account the total cost of inlining. Suppose we have a call hierarchy {A1,A2,A3,...}->B->C. (Each of A1, A2, A3, ... calls B, which in turn calls C.) Without this patch, shouldBeDeferred essentially returns true if TotalSecondaryCost < IC.getCost() where TotalSecondaryCost is the total cost of inlining B into As. This means that if B is a small wrapper function, for example, it would get inlined into all of As. In turn, C gets inlined into all of As. In other words, shouldBeDeferred ignores the cost of inlining C into each of As. This patch adds an option, inline-deferral-scale, to replace the expression above with: TotalCost < Allowance where - TotalCost is TotalSecondaryCost + IC.getCost() * # of As, and - Allowance is IC.getCost() * Scale For now, the new option defaults to -1, disabling the new scheme. Reviewers: davidxl Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79138	2020-05-05 11:02:06 -07:00
Sanjay Patel	86dfbc676e	[SLP] add another bailout for load-combine patterns This builds on the or-reduction bailout that was added with D67841. We still do not have IR-level load combining, although that could be a target-specific enhancement for -vector-combiner. The heuristic is narrowly defined to catch the motivating case from PR39538: https://bugs.llvm.org/show_bug.cgi?id=39538 ...while preserving existing functionality. That is, there's an unmodified test of pure load/zext/store that is not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll. That's the reason for the logic difference to require the 'or' instructions. The chances that vectorization would actually help a memory-bound sequence like that seem small, but it looks nicer with: vpmovzxwd (%rsi), %xmm0 vmovdqu %xmm0, (%rdi) rather than: movzwl (%rsi), %eax movl %eax, (%rdi) ... In the motivating test, we avoid creating a vector mess that is unrecoverable in the backend, and SDAG forms the expected bswap instructions after load combining: movzbl (%rdi), %eax vmovd %eax, %xmm0 movzbl 1(%rdi), %eax vmovd %eax, %xmm1 movzbl 2(%rdi), %eax vpinsrb $4, 4(%rdi), %xmm0, %xmm0 vpinsrb $8, 8(%rdi), %xmm0, %xmm0 vpinsrb $12, 12(%rdi), %xmm0, %xmm0 vmovd %eax, %xmm2 movzbl 3(%rdi), %eax vpinsrb $1, 5(%rdi), %xmm1, %xmm1 vpinsrb $2, 9(%rdi), %xmm1, %xmm1 vpinsrb $3, 13(%rdi), %xmm1, %xmm1 vpslld $24, %xmm0, %xmm0 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpslld $16, %xmm1, %xmm1 vpor %xmm0, %xmm1, %xmm0 vpinsrb $1, 6(%rdi), %xmm2, %xmm1 vmovd %eax, %xmm2 vpinsrb $2, 10(%rdi), %xmm1, %xmm1 vpinsrb $3, 14(%rdi), %xmm1, %xmm1 vpinsrb $1, 7(%rdi), %xmm2, %xmm2 vpinsrb $2, 11(%rdi), %xmm2, %xmm2 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpinsrb $3, 15(%rdi), %xmm2, %xmm2 vpslld $8, %xmm1, %xmm1 vpmovzxbd %xmm2, %xmm2 # xmm2 = 
xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero vpor %xmm2, %xmm1, %xmm1 vpor %xmm1, %xmm0, %xmm0 vmovdqu %xmm0, (%rsi) movl (%rdi), %eax movl 4(%rdi), %ecx movl 8(%rdi), %edx movbel %eax, (%rsi) movbel %ecx, 4(%rsi) movl 12(%rdi), %ecx movbel %edx, 8(%rsi) movbel %ecx, 12(%rsi) Differential Revision: https://reviews.llvm.org/D78997	2020-05-05 12:44:38 -04:00
Simon Pilgrim	4e3c005554	[TTI] getScalarizationOverhead - use explicit VectorType operand getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions). Followup to D78357 Reviewed By: @samparker Differential Revision: https://reviews.llvm.org/D79341	2020-05-05 16:59:23 +01:00
Arthur Eubanks	d056c0c71f	Remove unnecessary check for inalloca in IPConstantPropagation Summary: This was added in https://reviews.llvm.org/D2449, but I'm not sure it's necessary since an inalloca value is never a Constant (should be an AllocaInst). Reviewers: hans, rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79350	2020-05-05 08:26:11 -07:00
Jay Foad	22829ab5fa	[InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y) We check that C is finite and strictly positive, but there's no need to check that it's normal too. exp2 should be just as accurate on denormals as pow is. Differential Revision: https://reviews.llvm.org/D79413	2020-05-05 16:25:48 +01:00
David Green	146d44c251	[LSR] Don't require register reuse under postinc LSR has some logic that tries to aggressively reuse registers in formula. This can lead to sub-optimal decisions in complex loops where the backend is trying to use shouldFavorPostInc. This disables the re-use in those situations. Differential Revision: https://reviews.llvm.org/D79301	2020-05-05 16:04:50 +01:00
Jay Foad	fa2783d79a	[InstCombine] Remove hasOneUse check for pow(C,x) -> exp2(log2(C)*x) I don't think there's any good reason not to do this transformation when the pow has multiple uses. Differential Revision: https://reviews.llvm.org/D79407	2020-05-05 14:46:08 +01:00
Simon Pilgrim	5c91aa6603	[InstCombine] Fold or(zext(bswap(x)),shl(zext(bswap(y)),bw/2)) -> bswap(or(zext(x),shl(zext(y), bw/2)) This adds a general combine that can be used to fold: or(zext(OP(x)), shl(zext(OP(y)),bw/2)) --> OP(or(zext(x), shl(zext(y),bw/2))) Allowing us to widen 'concat-able' style or+zext patterns - I've just set this up for BSWAP but we could use this for other similar ops (BITREVERSE for instance). We already do something similar for bitop(bswap(x),bswap(y)) --> bswap(bitop(x,y)) Fixes PR45715 Reviewed By: @lebedev.ri Differential Revision: https://reviews.llvm.org/D79041	2020-05-05 12:30:10 +01:00
Sam Parker	40574fefe9	[NFC][CostModel] Add TargetCostKind to relevant APIs Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs. RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html Differential Revision: https://reviews.llvm.org/D79002	2020-05-05 10:35:54 +01:00
Pratyai Mazumder	08032e7192	[SanitizerCoverage] Replace the unconditional store with a load, then a conditional store. Reviewers: vitalybuka, kcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79392	2020-05-05 02:25:05 -07:00
Sergey Dmitriev	f637334df9	[CallGraphUpdater] Removed references to callees when deleting function Summary: Otherwise we can get unaccounted references to call graph nodes. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79382	2020-05-04 18:59:47 -07:00
Zola Bridges	8d8fda49c9	[llvm][dfsan][NFC] Factor out fcn initialization Summary: Moving these function initializations into separate functions makes it easier to read the runOnModule function. There is also precedent in the sanitizer code: asan has a function ModuleAddressSanitizer::initializeCallbacks(Module &M). I thought it made sense to break the initializations into two sets. One for the compiler runtime functions and one for the event callbacks. Tested with: check-all Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D79307	2020-05-04 10:01:40 -07:00
Simon Pilgrim	940061438e	[InstCombine] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476) This patch adds support for discarding integer absolutes (abs + nabs variants) from self-multiplications. ABS Alive2: http://volta.cs.utah.edu:8080/z/rwcc8W NABS Alive2: http://volta.cs.utah.edu:8080/z/jZXUwQ This is an InstCombine version of D79304 - I'm not sure yet if we'll need that after this. Reviewed By: @lebedev.ri and @xbolva00 Differential Revision: https://reviews.llvm.org/D79319	2020-05-04 15:21:52 +01:00
Jay Foad	e737847b8f	[SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func optimizePow does not create any new calls to pow, so it should work regardless of whether the pow library function is available. This allows it to optimize the llvm.pow intrinsic on targets with no math library. Based on a patch by Tim Renouf. Differential Revision: https://reviews.llvm.org/D68231	2020-05-04 10:54:07 +01:00
Florian Hahn	935685f420	[SCCP] Re-use pushToWorkList in pushToWorkListMsg (NFC). There's no need to duplicate the logic to push to the different work-lists.	2020-05-04 10:19:39 +01:00
Johannes Doerfert	14cb0bdf2b	[Attributor][NFC] Replace the nested AAMap with a key pair No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 512375 (362871/s) temporary memory allocations: 98746 (69933/s) peak heap memory consumption: 22.54MB peak RSS (including heaptrack overhead): 106.78MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 509833 (338534/s) temporary memory allocations: 98902 (65671/s) peak heap memory consumption: 18.71MB peak RSS (including heaptrack overhead): 103.00MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: -2542 (-27042/s) temporary memory allocations: 156 (1659/s) peak heap memory consumption: -3.83MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-03 22:10:47 -05:00
Johannes Doerfert	95e0d28b71	[Attributor] Remember only necessary dependences Before we eagerly put dependences into the QueryMap as soon as we encountered them (via `Attributor::getAAFor<>` or `Attributor::recordDependence`). Now we will wait to see if the dependence is useful, that is if the target is not already in a fixpoint state at the end of the update. If so, there is no need to record the dependence at all. Due to the abstraction via `Attributor::updateAA` we will now also treat the very first update (during attribute creation) as we do subsequent updates. Finally this resolves the problematic usage of QueriedNonFixAA. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 554675 (389245/s) temporary memory allocations: 101574 (71280/s) peak heap memory consumption: 28.46MB peak RSS (including heaptrack overhead): 116.26MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 512465 (345559/s) temporary memory allocations: 98832 (66643/s) peak heap memory consumption: 22.54MB peak RSS (including heaptrack overhead): 106.58MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: -42210 (-727758/s) temporary memory allocations: -2742 (-47275/s) peak heap memory consumption: -5.92MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-03 22:01:51 -05:00
Johannes Doerfert	231026a508	[Attributor] Initialize "value attributes" w/ must-be-executed-context info Attributes that only depend on the value (=bit pattern) can be initialized from uses in the must-be-executed-context (MBEC). We did use `AAComposeTwoGenericDeduction` and `AAFromMustBeExecutedContext` before to do this for some positions of these attributes but not for all. This was fairly complicated and also problematic as we did run it in every `updateImpl` call even though we only use known information. The new implementation removes `AAComposeTwoGenericDeduction`* and `AAFromMustBeExecutedContext` in favor of a simple interface `AddInformation::fromMBEContext(...)` which we call from the `initialize` methods of the "value attribute" `Impl` classes, e.g. `AANonNullImpl::initialize`. There can be two types of test changes: 1) Artifacts where we miss some information that was known before a global fixpoint was reached and therefore available in an update but not at the beginning. 2) Deduction for values we did not derive via the MBEC before or which were not found as the `AAFromMustBeExecutedContext::updateImpl` was never invoked. * An improved version of AAComposeTwoGenericDeduction can be found in D78718. Once we find a new use case that implementation will be able to handle "generic" AAs better. 
--- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 468428 (328952/s) temporary memory allocations: 77480 (54410/s) peak heap memory consumption: 32.71MB peak RSS (including heaptrack overhead): 122.46MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 554720 (351310/s) temporary memory allocations: 101650 (64376/s) peak heap memory consumption: 28.46MB peak RSS (including heaptrack overhead): 116.75MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: 86292 (556722/s) temporary memory allocations: 24170 (155935/s) peak heap memory consumption: -4.25MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D78719	2020-05-03 21:41:22 -05:00
Johannes Doerfert	87f1e93945	[Attributor][NFC] Use reference instead of pointer	2020-05-03 21:38:06 -05:00
Johannes Doerfert	2f97b8b891	[Attributor][NFC] Proactively ask for `nocapture` on call site arguments This minimizes test noise later on and is in line with other attributes we derive proactively.	2020-05-03 21:38:06 -05:00
Sergey Dmitriev	0f70f73308	[Attributor] Bitcast constant to the returned value type if it has different type Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79277	2020-05-03 11:46:13 -07:00
Hongtao Yu	911e06f5eb	[ICP] Handling must tail calls in indirect call promotion Per the IR convention, a musttail call must precede a ret with an optional bitcast. This was violated by the indirect call promotion optimization which could result in IR like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2202: ; preds = %605, %2201, %2199 ret void, !dbg !229485 This is being fixed in this change where the return statement goes together with the promoted indirect call. The code generated is like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 ret void, !dbg !229485 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 ret void, !dbg !229485 Differential Revision: https://reviews.llvm.org/D79258	2020-05-03 10:42:22 -07:00
Mircea Trofin	bec4ab95a4	[llvm][NFC] Inliner: factor cost and reporting out of inlining process Summary: This factors cost and reporting out of the inlining workflow, thus making it easier to reuse when driving inlining from the upcoming InliningAdvisor. Depends on: D79215 Reviewers: davidxl, echristo Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79275	2020-05-03 10:38:28 -07:00
Florian Hahn	bbdfcf8f69	[VPlan] Remove unused & undefined print method (NFC).	2020-05-03 18:36:20 +01:00
Johannes Doerfert	8228153f87	[Attributor][NFC] Encode IRPositions in the bits of a single pointer This reduces memory consumption for IRPositions by eliminating the vtable pointer and the `KindOrArgNo` integer. Since each abstract attribute has an associated IRPosition, the 12-16 bytes we save add up quickly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 469545 (260135/s) temporary memory allocations: 77137 (42735/s) peak heap memory consumption: 30.50MB peak RSS (including heaptrack overhead): 119.50MB total memory leaked: 269.07KB ``` After: ``` calls to allocation functions: 468999 (274108/s) temporary memory allocations: 77002 (45004/s) peak heap memory consumption: 28.83MB peak RSS (including heaptrack overhead): 118.05MB total memory leaked: 269.07KB ``` Difference: ``` calls to allocation functions: -546 (5808/s) temporary memory allocations: -135 (1436/s) peak heap memory consumption: -1.67MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` --- CTMark 15 runs Metric: compile_time Program lhs rhs diff test-suite...:: CTMark/sqlite3/sqlite3.test 25.07 24.09 -3.9% test-suite...Mark/mafft/pairlocalalign.test 14.58 14.14 -3.0% test-suite...-typeset/consumer-typeset.test 21.78 21.58 -0.9% test-suite :: CTMark/SPASS/SPASS.test 21.95 22.03 0.4% test-suite :: CTMark/lencod/lencod.test 25.43 25.50 0.3% test-suite...ark/tramp3d-v4/tramp3d-v4.test 23.88 23.83 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 60.24 60.11 -0.2% test-suite :: CTMark/kimwitu++/kc.test 15.69 15.69 -0.0% test-suite...:: CTMark/ClamAV/clamscan.test 25.43 25.42 -0.0% test-suite :: CTMark/Bullet/bullet.test 37.63 37.62 -0.0% Geomean difference -0.8% --- Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D78722	2020-05-03 12:15:19 -05:00
Johannes Doerfert	6bf16ee4c5	[Attributor][NFC] Let AbstractAttribute be an IRPosition Since every AbstractAttribute so far, and for the foreseeable future, corresponds to a single IRPosition we can simplify the class structure. We already did this for IRAttribute but there is no reason to stop there.	2020-05-03 12:13:40 -05:00
Mircea Trofin	667f558c3f	[llvm][NFC] Inliner.cpp shouldInline post-commit feedback Discussion is in https://reviews.llvm.org/D79215	2020-05-03 09:31:31 -07:00
Sanjay Patel	682f0b366b	[InstCombine] use select-of-constants with set/clear bit mask patterns Cond ? (X & ~C) : (X | C) --> (X & ~C) | (Cond ? 0 : C) Cond ? (X | C) : (X & ~C) --> (X & ~C) | (Cond ? C : 0) The select-of-constants form results in better codegen. There's an existing test diff that shows a transform that results in an extra IR instruction, but that's an existing problem. This is motivated by code seen in LLVM itself - see PR37581: https://bugs.llvm.org/show_bug.cgi?id=37581 define i8 @src(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %or = or i8 %x, %C %cond = select i1 %b, i8 %or, i8 %and ret i8 %cond } define i8 @tgt(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %mul = select i1 %b, i8 %C, i8 0 %or = or i8 %mul, %and ret i8 %or } http://volta.cs.utah.edu:8080/z/Vt2WVm Differential Revision: https://reviews.llvm.org/D78880	2020-05-03 09:44:43 -04:00
Nikita Popov	b7e2358220	Remove getNumUses() comparisons (NFC) getNumUses() scans the full use list. Don't use it if we only want to check if there's zero or one uses.	2020-05-02 11:05:19 +02:00
Nikita Popov	60e9ee16b4	[MergeFuncs] Don't merge shufflevectors with different masks When the shufflevector mask operand was converted into special instruction data, the FunctionComparator was not updated to account for this. As such, MergeFuncs will happily merge shufflevectors with different masks. This fixes https://bugs.llvm.org/show_bug.cgi?id=45773. Differential Revision: https://reviews.llvm.org/D79261	2020-05-02 10:21:14 +02:00
Mircea Trofin	3dbc612cf2	[llvm][NFC] Rename variable as per https://reviews.llvm.org/D79215 Operator error - performed the rename and didn't save.	2020-05-01 16:30:41 -07:00
Mircea Trofin	e1c4a7cb16	[llvm][NFC] Inliner: simplify inlining decision logic Summary: shouldInline makes a decision based on the InlineCost of a call site, as well as an evaluation on whether the site should be deferred. This means it's possible for the decision to be not to inline, even for an InlineCost that would otherwise allow it. Both uses of shouldInline performed the exact same logic after calling it. In addition, the decision on whether to inline or not was communicated through two values of the Option<InlineCost> return value: None, or an InlineCost evaluating to false. Simplified by: - encapsulating the decision in the return object. The bool it evaluates to communicates unambiguously the decision. The InlineCost is also available. - encapsulated the common post-shouldInline code into shouldInline. Reviewers: davidxl, echristo, eraman Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79215	2020-05-01 16:18:59 -07:00
Christopher Tetreault	beeabe382d	[SVE] Fix invalid usage of VectorType::getNumElements() in InstCombine Summary: Make foldVectorBinop return null if the instruction type is a scalable vector. It is unclear what, if any, of this function works with scalable vectors. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, david-arm, fpetrogalli, spatel Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79196	2020-05-01 10:56:29 -07:00
Sanjay Patel	7fa150203f	[InstCombine] fix miscompile from multi-use cttz/ctlz transform PR45762: https://bugs.llvm.org/show_bug.cgi?id=45762	2020-05-01 13:52:24 -04:00
Florian Hahn	d911c17596	[SCCP] Get a copy of the state of CopyOf once. This fixes potential reference invalidations, when no lattice value is assigned for CopyOf. As the state of CopyOf won't change while in handleCallResult, we can get a copy once and use that. Should fix PR45749.	2020-05-01 14:46:35 +01:00
Benjamin Kramer	7a5a1e9460	[IR] AttributeList::getContext has a single user, remove it.	2020-05-01 14:18:29 +02:00
Florian Hahn	19ab53f1e2	[LoopVersioning] Update setAliasChecks to take ArrayRef argument (NFC). This cleanup was suggested as part of D78458.	2020-04-30 22:17:12 +01:00
Nikita Popov	b74c6d2c9d	[InlineFunction] Disable emission of alignment assumptions by default In D74183 clang started emitting alignment for sret parameters unconditionally. This caused a 1.5% compile-time regression on tramp3d-v4. The reason is that we now generate many instances of IR like %ptrint = ptrtoint %class.GuardLayers* %guards_m to i64 %maskedptr = and i64 %ptrint, 3 %maskcond = icmp eq i64 %maskedptr, 0 tail call void @llvm.assume(i1 %maskcond) to preserve the alignment information during inlining. Based on IR analysis, these assumptions also regress optimization. The attached phase ordering test case illustrates two issues: One is instruction count based optimization heuristics, which are affected by the four additional instructions of the assumption. The other is blocking of SROA due to ptrtoint casts (PR45763). We already encountered the same problem in Rust, where we (unlike Clang) generally prefer to emit alignment information absolutely everywhere it is available. We were only able to do this after hardcoding -preserve-alignment-assumptions-during-inlining=false, because we were seeing significant optimization and compile-time regressions otherwise. This patch disables -preserve-alignment-assumptions-during-inlining by default, because we should not be punishing people for adding more alignment annotations. Once the assume bundle work shakes out and we can represent (and use) alignment assumptions using assume bundles, it should be possible to re-enable this with reduced overhead. Differential Revision: https://reviews.llvm.org/D76886	2020-04-30 23:12:54 +02:00
Arthur Eubanks	a90948fd6e	[NFC] Rename ByValOrInalloca to PassPointeeByValue Summary: In preparation for preallocated. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79152	2020-04-30 09:42:13 -07:00
Jann Horn	a22685885d	[AddressSanitizer] Instrument byval call arguments Summary: In the LLVM IR, "call" instructions read memory for each byval operand. For example: ``` $ cat blah.c struct foo { void *a, *b, *c; }; struct bar { struct foo foo; }; void func1(const struct foo); void func2(struct bar *bar) { func1(bar->foo); } $ [...]/bin/clang -S -flto -c blah.c -O2 ; cat blah.s [...] define dso_local void @func2(%struct.bar* %bar) local_unnamed_addr #0 { entry: %foo = getelementptr inbounds %struct.bar, %struct.bar* %bar, i64 0, i32 0 tail call void @func1(%struct.foo* byval(%struct.foo) align 8 %foo) #2 ret void } [...] $ [...]/bin/clang -S -c blah.c -O2 ; cat blah.s [...] func2: # @func2 [...] subq $24, %rsp [...] movq 16(%rdi), %rax movq %rax, 16(%rsp) movups (%rdi), %xmm0 movups %xmm0, (%rsp) callq func1 addq $24, %rsp [...] retq ``` Let ASAN instrument these hidden memory accesses. This is patch 4/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77619	2020-04-30 17:09:13 +02:00
Jann Horn	cfe36e4c6a	[AddressSanitizer] Refactor: Permit >1 interesting operands per instruction Summary: Refactor getInterestingMemoryOperands() so that information about the pointer operand is returned through an array of structures instead of passing each piece of information separately by-value. This is in preparation for returning information about multiple pointer operands from a single instruction. A side effect is that, instead of repeatedly generating the same information through isInterestingMemoryAccess(), it is now simply collected once and then passed around; that's probably more efficient. HWAddressSanitizer has a bunch of copypasted code from AddressSanitizer, so these changes have to be duplicated. This is patch 3/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments [glider: renamed llvm::InterestingMemoryOperand::Type to OpType to fix GCC compilation] Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77618	2020-04-30 17:09:13 +02:00
Jann Horn	223a95fdf0	[AddressSanitizer] Split out memory intrinsic handling Summary: In both AddressSanitizer and HWAddressSanitizer, we first collect instructions whose operands should be instrumented and memory intrinsics, then instrument them. Both during collection and when inserting instrumentation, they are handled separately. Collect them separately and instrument them separately. This is a bit more straightforward, and prepares for collecting operands instead of instructions in a future patch. This is patch 2/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77617	2020-04-30 17:09:13 +02:00
Jann Horn	e29996c9a2	[AddressSanitizer] Refactor ClDebug{Min,Max} handling Summary: A following commit will split the loop over ToInstrument into two. To avoid having to duplicate the condition for suppressing instrumentation sites based on ClDebug{Min,Max}, refactor it out into a new function. While we're at it, we can also avoid the indirection through NumInstrumented for setting FunctionModified. This is patch 1/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77616	2020-04-30 17:09:13 +02:00
Alexander Potapenko	7e7754df32	Revert an accidental commit of four AddressSanitizer refactor CLs I couldn't make arc land the changes properly, for some reason they all got squashed. Reverting them now to land cleanly. Summary: This reverts commit `cfb5f89b62`. Reviewers: kcc, thejh Subscribers:	2020-04-30 16:15:43 +02:00
Jann Horn	cfb5f89b62	[AddressSanitizer] Refactor ClDebug{Min,Max} handling Summary: A following commit will split the loop over ToInstrument into two. To avoid having to duplicate the condition for suppressing instrumentation sites based on ClDebug{Min,Max}, refactor it out into a new function. While we're at it, we can also avoid the indirection through NumInstrumented for setting FunctionModified. This is patch 1/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: jfb, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77616	2020-04-30 15:30:46 +02:00
David Spickett	3929429347	[globalopt] Don't emit DWARF fragments for members of a struct that cover the whole struct This can happen when the rest of the members are zero length. Following the same pattern applied to the SROA pass in: `d7f6f1636d` Fixes: https://bugs.llvm.org/show_bug.cgi?id=45335 Differential Revision: https://reviews.llvm.org/D78720	2020-04-30 11:36:55 +01:00
Evgeniy Brevnov	3acf62f3ad	[BPI][NFC] IRCE should request BPI through analysis manager. Summary: There is no need to create BPI explicitly. It should be requested through AM in a normal way. Reviewers: skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79080	2020-04-30 16:04:06 +07:00
Evgeniy Brevnov	3e68a66704	[BPI][NFC] Reuse post dominator tree from analysis manager when available Summary: Currently BPI unconditionally creates post dominator tree each time. While this is not incorrect we can save compile time by reusing existing post dominator tree (when it's valid) provided by analysis manager. Reviewers: skatkov, taewookoh, yrouban Reviewed By: skatkov Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78987	2020-04-30 11:31:03 +07:00
Mircea Trofin	3ab319b295	[llvm][NFC] Use CallBase explicitly instead of Instruction in FunctionComparator Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79098	2020-04-29 15:37:46 -07:00
Mircea Trofin	2c7ff270d2	[llvm][NFC] Inliner: rename call site variables. Summary: Renamed 'CS' to 'CB', and, in one case, to a more specific name to avoid naming collision with outer scope (a maintainability/readability reason, not correctness) Also updated comments. Reviewers: davidxl, dblaikie, jdoerfert Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79101	2020-04-29 15:36:29 -07:00
Anh Tuyen Tran	c7878ad231	[VFDatabase] Scalar functions are vector functions with VF=1 Summary: Return scalar function when VF==1. Add the trivial mapping scalar --> scalar when VF==1 to prevent false positives for the "isVectorizable" query. Author: masoud.ataei (Masoud Ataei) Reviewers: Whitney (Whitney Tsang), fhahn (Florian Hahn), pjeeva01 (Jeeva P.), fpetrogalli (Francesco Petrogalli), rengolin (Renato Golin) Reviewed By: fpetrogalli (Francesco Petrogalli) Subscribers: hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D78054	2020-04-29 17:20:37 +00:00
Mircea Trofin	4632b7292a	[llvm][NFC] Removed addressed fixme; formatting. Removed already-addressed fixme, and updated formatting of a few lines that were triggering Harbormaster.	2020-04-29 09:06:01 -07:00
Hiroshi Yamauchi	1831986826	[PGO][PGSO] Prep for enabling non-cold code size opts under non-partial-profile sample PGO. Summary: - Distinguish between partial-profile and non-partial-profile sample PGO. - Add a flag for partial-profile sample PGO. - Tune the sample PGO cutoff. - No default behavior change (yet). Reviewers: davidxl Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78949	2020-04-29 08:57:47 -07:00
Mircea Trofin	e61247c0a8	[llvm][NFC] Change parameter type to more specific CallBase in IndirectCallPromotion Reviewers: dblaikie, craig.topper, wmi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79047	2020-04-29 08:42:32 -07:00
Simon Pilgrim	090cae8491	[TTI] Add DemandedElts to getScalarizationOverhead The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing more than should be expected. This is particularly noticeable on pre-SSE4 targets where the availability of legal INSERT_VECTOR_ELT ops is more limited. This patch does 2 things: 1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern. 2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs. This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing. A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well. Reviewed By: @craig.topper Differential Revision: https://reviews.llvm.org/D78216	2020-04-29 12:00:38 +01:00
Florian Hahn	e89379856a	Recommit "[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC)." The crash that caused the original revert has been fixed in `a3c964a278`. I also added a reduced version of the crash reproducer. This reverts the revert commit `2107af9ccf`.	2020-04-29 11:40:39 +01:00
Florian Hahn	616657b39c	[LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC). This allows forward declarations of PointerCheck, which in turn reduce the number of times LoopAccessAnalysis needs to be included. Ultimately this helps with moving runtime check generation to Transforms/Utils/LoopUtils.h, without having to include it there. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78458	2020-04-28 21:47:31 +01:00
Mircea Trofin	8a7cf11f92	[llvm][NFC] Refactor APIs operating on CallBase Summary: Refactored the parameter and return type where they are too generally typed as Instruction. Reviewers: dblaikie, wmi, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79027	2020-04-28 13:23:47 -07:00
David Blaikie	95e570725a	OpenMPOpt::RuntimeFunctionInfo::UsesMap: Use unique_ptr for values to simplify memory management	2020-04-28 12:26:53 -07:00
David Blaikie	3c89256d71	Attributor::ArgumentReplacementMap: Use unique_ptr to simplify memory management	2020-04-28 12:26:52 -07:00
Roman Lebedev	a0004358a8	[InstCombine] Negator: 'or' with no common bits set is just 'add' In `InstCombiner::visitAdd()`, we have ``` // A+B --> A|B iff A and B have no bits set in common. if (haveNoCommonBitsSet(LHS, RHS, DL, &AC, &I, &DT)) return BinaryOperator::CreateOr(LHS, RHS); ``` so we should handle such `or`'s here, too.	2020-04-28 19:16:32 +03:00
Sam Parker	e9c9329aa4	[TTI] Add TargetCostKind argument to getUserCost There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API. getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary. Differential Revision: https://reviews.llvm.org/D78635	2020-04-28 08:57:45 +01:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, to remove use of getElementType on a pointer when we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
Mircea Trofin	cb56e9b923	[llvm][NFC] Use CallBase instead of Instruction in ProfileSummaryInfo Summary: getProfileCount requires the parameter be a valid CallBase, and its uses reflect that. Reviewers: dblaikie, craig.topper, wmi Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78940	2020-04-27 20:47:52 -07:00
Arthur Eubanks	3b0450acec	Add IR constructs for preallocated (inalloca replacement) Add llvm.call.preallocated.{setup,arg} intrinsics. Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup. Add "preallocated" parameter attribute, which is like byval but without the copy. Verifier changes for these IR constructs. See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74651	2020-04-27 16:15:50 -07:00
Sanjay Patel	21acc0612a	[SLP] refactor load-combine logic; NFC We may want to identify sequences that are not reductions, but still qualify as load-combines in the back-end, so make most of the body a helper function.	2020-04-27 16:02:37 -04:00
Sameer Sahasrabuddhe	8488763682	[NFC] UnifyLoopExits: correctly skip expensive checks	2020-04-27 15:10:35 +05:30
Ayal Zaks	a3c964a278	[LV] Fix recording of BranchTakenCount for FoldTail When folding tail, branch taken count is computed during initial VPlan execution and recorded to be used by the compare computing the loop's mask. This recording should directly set the State, instead of reusing Value2VPValue mapping which serves original Values present prior to vectorization. The branch taken count may be a constant Value, which may be used elsewhere in the loop; trying to employ Value2VPValue for both leads to the issue reported in https://reviews.llvm.org/D76992#inline-721028 Differential Revision: https://reviews.llvm.org/D78847	2020-04-26 20:13:10 +03:00
Florian Hahn	2f3e86b318	[DSE,MSSA] Continue checking more remaining candidates with dbgcnt. After changing the candidate iteration strategy, we should continue with the next candidate, rather than breaking out of the loop.	2020-04-26 16:59:32 +01:00
Florian Hahn	7d57d22baa	[SCCP] Support ranges for loads and stores. Integer ranges can be used for loaded/stored values. Note that widening can be disabled for loads/stores, as we only rely on instructions that cause continued increases to ranges to be widened (like binary operators). Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78433	2020-04-26 13:16:47 +01:00
Simon Pilgrim	a3982491db	[Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h. Differential Revision: https://reviews.llvm.org/D78815	2020-04-26 12:58:20 +01:00
Nikita Popov	164845cd92	[GVN] Reduce expression size (NFC) Reduce size of GVN::Expression by reordering fields to reduce padding.	2020-04-26 09:43:35 +02:00
Sergei Trofimovich	09684b08d3	llvm: IPO: handle IRMover error handling, bug #45636 Summary: Missing error mangling is noticed in https://bugs.llvm.org/show_bug.cgi?id=45636 where inconsistent profiling input caused llvm/lld to crash as: ``` Program aborted due to an unhandled Error: linking module flags 'ProfileSummary': IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o' ``` The change does not change the fact that LLVM crashes but changes error output to say what was incorrect: ``` LLVM ERROR: Function Import: link error: linking module flags 'ProfileSummary': IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o' ``` Actual crash has yet to be fixed. Reviewers: lattner Reviewed By: lattner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78676	2020-04-25 19:16:01 +01:00
Sergey Dmitriev	67aed1469b	[Attributor] Do not set 'returned' attribute for arguments that cannot be bitcasted to function result Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78828	2020-04-25 09:49:40 -07:00
Sanjay Patel	4abab5c5ca	[InstCombine] generalize canonicalization of masked equality comparisons (X | MaskC) == C --> (X & ~MaskC) == C ^ MaskC (X | MaskC) != C --> (X & ~MaskC) != C ^ MaskC We have more analysis for 'and' patterns and already lean this way in the existing code, so this should be neutral or better in IR. If this does not do as well in codegen, the problem already exists and we should fix that based on target costs/heuristics. http://volta.cs.utah.edu:8080/z/oP3ecL define void @src(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) { %or = or i8 %x, %OrC %eq = icmp eq i8 %or, %C store i1 %eq, i1* %p0 %ne = icmp ne i8 %or, %C store i1 %ne, i1* %p1 ret void } define void @tgt(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) { %NotOrC = xor i8 %OrC, -1 %a = and i8 %x, %NotOrC %NewC = xor i8 %C, %OrC %eq = icmp eq i8 %a, %NewC store i1 %eq, i1* %p0 %ne = icmp ne i8 %a, %NewC store i1 %ne, i1* %p1 ret void }	2020-04-25 11:31:57 -04:00
Florian Hahn	46a04940e8	[DSE] Add stat for remaining stores after DSE. Using the existing NumFastStores statistic can be misleading when comparing the impact of DSE patches. For example, consider the case where a store gets removed from a function before it is inlined into another function. A less powerful DSE might only remove the store from functions it has been inlined into, which will result in more stores being removed, but no difference in the actual number of stores after DSE. The new stat provides the absolute number of stores surviving after DSE. Reviewers: dmgreen, bryant, asbirlea, jfb Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D78830	2020-04-25 16:12:55 +01:00
Tyker	e5f8a77c19	[AssumeBundles] Refactor assume builder Summary: Refactor the assume builder for the next patch. The assume builder now generates only one assume per attribute kind and per value they are on. To do this it takes the highest. This is desirable because currently, for all attributes, the highest value is the most valuable. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78013	2020-04-25 13:43:52 +02:00
Benjamin Kramer	1d42764df7	Give helpers internal linkage. NFC.	2020-04-25 11:50:52 +02:00
Ehud Katz	64249f177e	[CodeExtractor] Fix extraction of a value used only by intrinsics outside of region We should only skip `lifetime` and `dbg` intrinsics when searching for users. Other intrinsics are legit users that can't be ignored. Without this fix, the testcase would result in an invalid IR. `memcpy` will have a reference to the, now, external value (local to the extracted loop function). Fix PR42194 Differential Revision: https://reviews.llvm.org/D78749	2020-04-25 11:44:47 +03:00
Craig Topper	2c24051bac	[CallSite removal] Rename CallSite.h to AbstractCallSite.h. NFC The CallSite and ImmutableCallSite were removed in a previous commit. So rename the file to match the remaining class and the name of the cpp that implements it.	2020-04-24 22:12:25 -07:00
Tyker	97ecd91e20	[NFC] Refactor SimplifyCFG to make propagating information easier. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77742	2020-04-24 22:22:20 +02:00
Michael Liao	495bb8feb9	Fix `-Wparentheses` warnings. NFC.	2020-04-24 15:04:01 -04:00
Tyker	42431da895	[AssumeBundles] Use assume bundles in isKnownNonZero Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1 Reviewed By: jdoerfert Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76149	2020-04-24 20:41:51 +02:00
Florian Hahn	e1235831c4	[DSE,MSSA] Improve debug output (NFC). This patch slightly improves the formatting of the debug output, adds a few missing outputs and makes some existing outputs more consistent with the rest.	2020-04-24 17:50:08 +01:00
Florian Hahn	44ce588670	[DSE,MSSA] Skip checking write clobber for DomAccess (NFC). There is no need to check if the starting access is a write clobber and all of its uses have already been checked.	2020-04-24 17:16:22 +01:00
Sanjay Patel	e4175ff525	[InstCombine] intersect FMF when reassociating FP min/max intrinsics As discussed in PR45478: https://bugs.llvm.org/show_bug.cgi?id=45478 ...propagating FMF from the outer (second) call is not correct, so intersect them instead. I suspect we could do better (see TODO comment), but mismatched FMF is probably too rare to care about. Differential Revision: https://reviews.llvm.org/D78631	2020-04-24 12:14:03 -04:00
Simon Pilgrim	27ad103a3a	ARCRuntimeEntryPoints.h - remove unnecessary includes. NFC.	2020-04-24 14:32:45 +01:00
Max Kazantsev	9cd4debd5a	[LoopVectorize] Preserve CFG analyses if CFG wasn't modified One of the transforms the loop vectorizer makes is LCSSA formation. In some cases it is the only transform it makes. We should not drop CFG analyses if only LCSSA was formed and no actual CFG changes were made. We should think of expanding this logic to other passes as well, and maybe make it a part of PM framework. Reviewed By: Florian Hahn Differential Revision: https://reviews.llvm.org/D78360	2020-04-24 17:22:24 +07:00
Johannes Doerfert	1dfc473177	Revert "[Attributor][NFC] Encode IRPositions in the bits of a single pointer" A dependent patch has been reverted [0]. Until it goes back in this one has to stay out. [0] `ebdb893994` This reverts commit `d254b50b2b`.	2020-04-24 02:53:51 -05:00
Johannes Doerfert	d254b50b2b	[Attributor][NFC] Encode IRPositions in the bits of a single pointer This reduces memory consumption for IRPositions by eliminating the vtable pointer and the `KindOrArgNo` integer. Since each abstract attribute has an associated IRPosition, the 12-16 bytes we save add up quickly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 469545 (260135/s) temporary memory allocations: 77137 (42735/s) peak heap memory consumption: 30.50MB peak RSS (including heaptrack overhead): 119.50MB total memory leaked: 269.07KB ``` After: ``` calls to allocation functions: 468999 (274108/s) temporary memory allocations: 77002 (45004/s) peak heap memory consumption: 28.83MB peak RSS (including heaptrack overhead): 118.05MB total memory leaked: 269.07KB ``` Difference: ``` calls to allocation functions: -546 (5808/s) temporary memory allocations: -135 (1436/s) peak heap memory consumption: -1.67MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` --- CTMark 15 runs Metric: compile_time Program lhs rhs diff test-suite...:: CTMark/sqlite3/sqlite3.test 25.07 24.09 -3.9% test-suite...Mark/mafft/pairlocalalign.test 14.58 14.14 -3.0% test-suite...-typeset/consumer-typeset.test 21.78 21.58 -0.9% test-suite :: CTMark/SPASS/SPASS.test 21.95 22.03 0.4% test-suite :: CTMark/lencod/lencod.test 25.43 25.50 0.3% test-suite...ark/tramp3d-v4/tramp3d-v4.test 23.88 23.83 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 60.24 60.11 -0.2% test-suite :: CTMark/kimwitu++/kc.test 15.69 15.69 -0.0% test-suite...:: CTMark/ClamAV/clamscan.test 25.43 25.42 -0.0% test-suite :: CTMark/Bullet/bullet.test 37.63 37.62 -0.0% Geomean difference -0.8% --- Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D78722	2020-04-24 01:58:47 -05:00
Mircea Trofin	b8960b5d81	[llvm][NFC][CallSite] Remove remaining {Immutable}CallSite uses Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78789	2020-04-23 22:19:39 -07:00
Mehdi Amini	2107af9ccf	Revert "[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC)." This reverts commit `9245c7ac13`. This is triggering a segfault in XLA downstream, we'll follow-up with a reproducer, it is likely influenced by TTI/TLI settings or other options as a simple `opt -loop-vectorize` invocation on the IR before the crash does not reproduce immediately.	2020-04-24 05:07:32 +00:00
Mircea Trofin	2059a6e3ef	[llvm][NFC][CallSite] Remove ImmutableCallSite from a few locations Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78783	2020-04-23 21:18:44 -07:00
Craig Topper	cbe77ca9bd	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-23 21:01:48 -07:00
Craig Topper	81c5e83f7d	[CallSite removal][Transform] Replace CallSite with CallBase in Utils. NFC Differential Revision: https://reviews.llvm.org/D78780	2020-04-23 20:49:33 -07:00
Roman Lebedev	5a159ed2a8	[InstCombine] Negator: don't negate multi-use `sub` While we can do that, and it doesn't increase instruction count, if the old `sub` sticks around then the transform is not only an unlikely win, but a likely regression, since we have likely now extended the live range and use count of both of the `sub` operands, as opposed to just the result of `sub`. As Kostya Serebryany notes in post-commit review in https://reviews.llvm.org/D68408#1998112 this indeed can degrade final assembly, increase register pressure, and cause spilling. This isn't what we want here, so at least for now let's guard it with a use check.	2020-04-23 23:59:15 +03:00
Christopher Tetreault	7ca56c90bd	[SVE] Remove calls to isScalable from Transforms Reviewers: efriedma, chandlerc, reames, aprantl, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77756	2020-04-23 13:50:07 -07:00
Mircea Trofin	ceb7f308b8	[llvm][NFC][CallSite] Removed CallSite from few implementation details Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78724	2020-04-23 10:36:36 -07:00
Mircea Trofin	cea6f4d5f8	[llvm][NFC][CallSite] Remove CallSite from TypeMetadataUtils & related Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78666	2020-04-23 08:23:16 -07:00
Sanjay Patel	62da6ecea2	[InstCombine] substitute equivalent constant to reduce logic-of-icmps (X == C) && (Y Pred1 X) --> (X == C) && (Y Pred1 C) (X != C) || (Y Pred1 X) --> (X != C) || (Y Pred1 C) This cooperates/overlaps with D78430, but it is a more general transform that gets us most of the expected simplifications and several other improvements. http://volta.cs.utah.edu:8080/z/5gxjjc PR45618: https://bugs.llvm.org/show_bug.cgi?id=45618 Differential Revision: https://reviews.llvm.org/D78582	2020-04-23 10:19:16 -04:00
Simon Pilgrim	7a8b1096be	[ObjCARC] Remove unused forward declarations. NFC.	2020-04-23 13:52:49 +01:00
Simon Pilgrim	b108a457e1	[VPlan] Remove unused forward declarations. NFC. Move VPlan.h include from VPlanVerifier.h down to VPlanVerifier.cpp	2020-04-23 12:34:20 +01:00
Serguei Katkov	c0d2bbb1d4	[CaptureTracking] Replace hardcoded constant to option. NFC. The motivation is to be able to play with the option and change if it is required. Reviewers: fedor.sergeev, apilipenko, rnk, jdoerfert Reviewed By: fedor.sergeev Subscribers: hiraditya, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D78624	2020-04-23 18:23:35 +07:00
Florian Hahn	9245c7ac13	[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC). This patch adds VPValue version of the instruction operands to VPWidenRecipe and uses them during code-generation. Similar to D76373 this reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Reviewers: rengolin, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D76992	2020-04-23 12:16:46 +01:00
Craig Topper	25807452ac	[ArgumentPromotion] Remove unnecessary getScalarType() before casting to PointerType. NFC I don't believe this pass deals with vectors of pointers. I think this getScalarType() was added during a mechanical opaque pointer change of the interface to GetElementPtrInst::getIndexedType.	2020-04-22 22:51:41 -07:00
Vedant Kumar	2fa656cdfd	[Debugify] Do not require named metadata to be present when stripping This allows -mir-strip-debug to be run without -debugify having run before.	2020-04-22 17:03:39 -07:00
Vedant Kumar	2a5675f11d	[MachineDebugify] Insert synthetic DBG_VALUE instructions Summary: Teach MachineDebugify how to insert DBG_VALUE instructions. This can help find bugs causing CodeGen differences when debug info is present. DBG_VALUE instructions are only emitted when -debugify-level is set to locations+variables. There is essentially no attempt made to match up DBG_VALUE register operands with the local variables they ought to correspond to. I'm not sure how to improve the situation. In some cases (MachineMemOperand?) it's possible to find the IR instruction a MachineInstr corresponds to, but in general this seems to call for "undoing" the work done by ISel. Reviewers: dsanders, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78135	2020-04-22 17:03:39 -07:00
Juneyoung Lee	aca335955c	[ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 08:08:53 +09:00
Juneyoung Lee	5ceef26350	Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison" This reverts commit `80faa8c3af`.	2020-04-23 08:07:09 +09:00
Juneyoung Lee	80faa8c3af	RFC: [ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 07:57:12 +09:00
Florian Hahn	352b612a71	[SCCP] Drop unnecessary early exit for ExtractValueInst. visitExtractValueInst uses mergeInValue, so it already can handle constant ranges. Initially the early exit was using isOverdefined to keep things as NFC during the initial move to ValueLatticeElement. As the function already supports constant ranges, it can just use ValueState[&I].isOverdefined. Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78393	2020-04-22 22:07:59 +01:00
Craig Topper	be04aba6fc	[CallSite removal][ValueTracking] Use CallBase instead of ImmutableCallSite for getIntrinsicForCallSite. NFC Differential Revision: https://reviews.llvm.org/D78613	2020-04-22 12:06:58 -07:00
Christopher Tetreault	2dea3f1298	[SVE] Add new VectorType subclasses Summary: Introduce new types for fixed width and scalable vectors. Does not remove getNumElements yet so as to not break code during transition period. Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr Reviewed By: sdesmalen Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm, #lldb Differential Revision: https://reviews.llvm.org/D77587	2020-04-22 08:59:01 -07:00
Mircea Trofin	1b6b05a250	[llvm][NFC][CallSite] Remove CallSite from a few trivial locations Summary: Implementation details and internal (to module) APIs. Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78610	2020-04-22 08:39:21 -07:00
Dmitry Vyukov	5a2c31116f	[TSAN] Add optional support for distinguishing volatiles Add support to optionally emit different instrumentation for accesses to volatile variables. While the default TSAN runtime likely will never require this feature, other runtimes for different environments that have subtly different memory models or assumptions may require distinguishing volatiles. One such environment are OS kernels, where volatile is still used in various places for various reasons, and often declare volatile to be "safe enough" even in multi-threaded contexts. One such example is the Linux kernel, which implements various synchronization primitives using volatile (READ_ONCE(), WRITE_ONCE()). Here the Kernel Concurrency Sanitizer (KCSAN) [1], is a runtime that uses TSAN instrumentation but otherwise implements a very different approach to race detection from TSAN. While in the Linux kernel it is generally discouraged to use volatiles explicitly, the topic will likely come up again, and we will eventually need to distinguish volatile accesses [2]. The other use-case is ignoring data races on specially marked variables in the kernel, for example bit-flags (here we may hide 'volatile' behind a different name such as 'no_data_race'). [1] https://github.com/google/ktsan/wiki/KCSAN [2] https://lkml.kernel.org/r/CANpmjNOfXNE-Zh3MNP=-gmnhvKbsfUfTtWkyg_=VqTxS4nnptQ@mail.gmail.com Author: melver (Marco Elver) Reviewed-in: https://reviews.llvm.org/D78554	2020-04-22 17:27:09 +02:00
Roman Lebedev	67266d879c	[InstCombine] Negator: shufflevector is negatible All these folds are correct as per alive-tv	2020-04-22 15:14:23 +03:00
Craig Topper	05a11974ae	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-22 00:07:13 -07:00
Johannes Doerfert	ca59ff5af9	[Attributor] Replace AccessKind2Accesses map with an "array map" The number of different access location kinds we track is relatively small (8 so far). With this patch we replace the DenseMap that mapped from index (0-7) to the access set pointer with an array of access set pointers. This reduces memory consumption. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 472499 (215654/s) temporary memory allocations: 77794 (35506/s) peak heap memory consumption: 35.28MB peak RSS (including heaptrack overhead): 125.46MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 472270 (308673/s) temporary memory allocations: 77578 (50704/s) peak heap memory consumption: 32.70MB peak RSS (including heaptrack overhead): 121.78MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -229 (346/s) temporary memory allocations: -216 (326/s) peak heap memory consumption: -2.58MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` ---	2020-04-22 01:35:27 -05:00
Johannes Doerfert	f20ff4b17d	[Attributor] Run IRPosition::verify only with EXPENSIVE_CHECKS	2020-04-22 01:35:12 -05:00
Sameer Sahasrabuddhe	5a7a6382bc	FixIrreducible: don't crash when moving a child loop Summary: When an irreducible SCC is converted into a new natural loop, existing loops included in that SCC now become children of the new loop. The logic that moves these loops from the parent loop to the new loop invoked undefined behaviour when it modified the container that it was iterating over. Fixed this by first extracting all the loops that are to be removed from the parent. Fixes bug 45623. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D78544	2020-04-22 07:47:30 +05:30
Mircea Trofin	9ee02aef62	[llvm][NFC][CallSite] Remove CallSite from FunctionAttrs Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78584	2020-04-21 16:16:00 -07:00
Johannes Doerfert	46b7ed0e6f	[Attributor] Remove dependence edges eagerly If we have a dependence between an abstract attribute A to an abstract attribute B such that changes in A should trigger an update of B, we do not need to keep the dependence around once the update was triggered. If the dependence is still required the update will reinsert it into the dependence map, if it is not we avoid triggering B in the future. This replaces the "recompute interval" mechanism we used before to prune stale dependences. Number of required iterations is generally down, compile time for the module pass (not really the CGSCC pass) is down quite a bit. There is one test change which looks like an artifact in the undefined behavior AA that needs to be looked at.	2020-04-21 15:22:10 -05:00
Johannes Doerfert	ea439bbcbb	[Attributor][NFC] Track the number of created AAs in the statistics	2020-04-21 15:22:10 -05:00
Johannes Doerfert	c5794f77eb	[Attributor][PM] Introduce `-attributor-enable={none,cgscc,module,all}` The old command line option `-attributor-disable` was too coarse grained as we want to measure the effects of the module or cgscc pass without the other as well. Since `none` is the default there is no real functional change. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D78571	2020-04-21 15:22:10 -05:00
Michael Liao	163bd9d858	Fix `-Wpedantic` warnings. NFC.	2020-04-21 16:09:17 -04:00
Michael Liao	21529355e1	Fix `-Wparentheses` warnings. NFC.	2020-04-21 15:02:59 -04:00
Roman Lebedev	352fef3f11	[InstCombine] Negator - sink sinkable negations Summary: As we have discussed previously (e.g. in D63992 / D64090 / [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]), `sub` instruction can almost be considered non-canonical. While we do convert `sub %x, C` -> `add %x, -C`, we sparsely do that for non-constants. But we should. Here, I propose to interpret `sub %x, %y` as `add (sub 0, %y), %x` IFF the negation can be sunk into the `%y`. This has some potential to cause endless combine loops (either around PHI's, or if there are some opposite transforms). For the former there's the `-instcombine-negator-max-depth` option to mitigate it, should this expose any such issues. For the latter, if there are still any such opposing folds, we'd need to remove the colliding fold. In any case, reproducers welcomed! Reviewers: spatel, nikic, efriedma, xbolva00 Reviewed By: spatel Subscribers: xbolva00, mgorny, hiraditya, reames, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68408	2020-04-21 22:00:23 +03:00
Benjamin Kramer	9a08c30705	Bit-pack some pairs. No functionality change intended.	2020-04-21 20:40:20 +02:00
Fangrui Song	cca545ce46	[CallSite] Fix build breakage after D78538	2020-04-21 11:33:40 -07:00
Mircea Trofin	d702325af6	[llvm][NFC][CallSite] Remove CallSite from DeadArgumentElimination Summary: Also capitalized some induction variables, to match coding style. Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78538	2020-04-21 10:48:38 -07:00
Simon Pilgrim	d9af50efbc	[Transforms] getOrEnforceKnownAlignment - fix MSVC result of 32-bit shift implicitly converted to 64 bits warning. NFCI We don't overflow here so we can use a U64 shift directly.	2020-04-21 18:32:12 +01:00
Johannes Doerfert	177c065e50	[Attributor] Use a pointer value type for the OpcodeInstMap This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 490390 (320725/s) temporary memory allocations: 84601 (55330/s) peak heap memory consumption: 41.70MB peak RSS (including heaptrack overhead): 131.18MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 489359 (301144/s) temporary memory allocations: 82983 (51066/s) peak heap memory consumption: 36.76MB peak RSS (including heaptrack overhead): 126.48MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -1031 (-10739/s) temporary memory allocations: -1618 (-16854/s) peak heap memory consumption: -4.94MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` ---	2020-04-21 11:20:09 -05:00
Johannes Doerfert	99662c22cd	[Attributor] Use a pointer value type for the QueryMap This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 596180 (374484/s) temporary memory allocations: 84979 (53378/s) peak heap memory consumption: 52.14MB peak RSS (including heaptrack overhead): 139.79MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 489200 (303285/s) temporary memory allocations: 83406 (51708/s) peak heap memory consumption: 41.70MB peak RSS (including heaptrack overhead): 131.76MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -106980 (-5094285/s) temporary memory allocations: -1573 (-74904/s) peak heap memory consumption: -10.44MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` ---	2020-04-21 11:20:04 -05:00
Johannes Doerfert	1f570e019d	[Attributor] Use a pointer value type for the access kind -> accesses map This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 616219 (381559/s) temporary memory allocations: 83294 (51575/s) peak heap memory consumption: 72.15MB peak RSS (including heaptrack overhead): 160.04MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 595004 (357145/s) temporary memory allocations: 83840 (50324/s) peak heap memory consumption: 52.14MB peak RSS (including heaptrack overhead): 138.32MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -21215 (-415980/s) temporary memory allocations: 546 (10705/s) peak heap memory consumption: -20.01MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` ---	2020-04-21 11:20:02 -05:00
Johannes Doerfert	40f3baeb20	[Attributor] Pass the Attributor to the AbstractAttribute constructors AbstractAttribute::initialize is used to initialize the deduction and the object, but we do not always call it. To make sure we have the option to initialize the object even if initialize is not called, we now pass the Attributor to the AbstractAttribute constructors.	2020-04-21 11:20:02 -05:00
Johannes Doerfert	91a6c88349	[Attributor] Use a pointer value type for the AAMap This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 613353 (376521/s) temporary memory allocations: 83636 (51341/s) peak heap memory consumption: 75.64MB peak RSS (including heaptrack overhead): 162.97MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 616575 (349929/s) temporary memory allocations: 83650 (47474/s) peak heap memory consumption: 72.15MB peak RSS (including heaptrack overhead): 159.81MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: 3222 (24225/s) temporary memory allocations: 14 (105/s) peak heap memory consumption: -3.49MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` ---	2020-04-21 11:19:58 -05:00
Sanjay Patel	978166f209	[InstCombine] improve types/names for logic-of-icmp helper function; NFC	2020-04-21 10:16:45 -04:00
Florian Hahn	647c9e72e4	[VPlan] Make various tryTo* helpers private and mark as const (NFC). The individual tryTo* helpers do not need to be public. Also, the builder contained two consecutive public: sections, which is not necessary. Moved the remaining public methods after the constructor. Also make some of the tryTo* helpers const. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed by: gilr Differential Revision: https://reviews.llvm.org/D78288	2020-04-21 14:49:02 +01:00
Sanjay Patel	ba72389269	[InstCombine] improve types/names for logic-of-icmp helper functions; NFC	2020-04-21 09:18:22 -04:00
Craig Topper	6235951ec0	[CallSite removal][Instrumentation] Use CallBase instead of CallSite in AddressSanitizer/DataFlowSanitizer/MemorySanitizer. NFC Differential Revision: https://reviews.llvm.org/D78524	2020-04-20 22:39:14 -07:00
Max Kazantsev	a116f0fa86	[LICM][NFC] Reorder checks to speed up things slightly Side effect check is made faster than potentially heavy other checks.	2020-04-21 11:34:44 +07:00
Craig Topper	68b2e507e4	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 21:31:44 -07:00
Johannes Doerfert	dc3b5b00fe	[OpenMPOpt] Make the combination of `ident_t` deterministic Before we kept the first applicable `ident_t` during deduplication of runtime calls. The problem is that "first" is dependent on the iteration order of a DenseMap. Since the proper solution, which is to combine the information from all `ident_t`, should be deterministic on its own, we will not try to make the iteration order deterministic. Instead, we will create a fresh `ident_t` if there is not a unique existing `ident_t*` to pick.	2020-04-20 23:27:08 -05:00
Johannes Doerfert	8855fec37e	[OpenMPOpt] Use a pointer value type in map The value type was a set before which can easily lead to excessive memory usage and copying. We use a pointer to a vector instead now.	2020-04-20 23:27:08 -05:00
Johannes Doerfert	ee17263adc	[OpenMPOpt] Make the SCC a vector to ensure deterministic results	2020-04-20 23:27:08 -05:00
Mircea Trofin	c2d86e1f30	[llvm][NFC][CallSite] Remove CallSite from ArgumentPromotion Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78528	2020-04-20 19:33:42 -07:00
Johannes Doerfert	87aa362985	[Attributor] Use the BumpPtrAllocator in InformationCache as well We now also use the BumpPtrAllocator from the Attributor in the InformationCache. The lifetime of objects in either is pretty much the same and it should result in consistently good performance regardless of the allocator. Doing so requires calling more constructors manually but so far that does not seem to be problematic or messy. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 615359 (368257/s) temporary memory allocations: 83315 (49859/s) peak heap memory consumption: 75.64MB peak RSS (including heaptrack overhead): 163.43MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 613042 (359555/s) temporary memory allocations: 83322 (48869/s) peak heap memory consumption: 75.64MB peak RSS (including heaptrack overhead): 162.92MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -2317 (-68147/s) temporary memory allocations: 7 (205/s) peak heap memory consumption: 2.23KB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` ---	2020-04-20 21:12:41 -05:00
Mircea Trofin	15cd1e36e4	[llvm][NFC][CallSite] Remove CallSite from CoroEarly Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78523	2020-04-20 18:15:25 -07:00
Sriraman Tallam	365b60fc93	New pass to make internal linkage symbol names unique. With clang option -funique-internal-linkage-symbols, symbols with internal linkage get names with the module hash appended. Differential Revision: https://reviews.llvm.org/D78243	2020-04-20 15:05:22 -07:00
Craig Topper	fcc9d70260	Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign." This is breaking the clang build. This reverts commit `897409fb56`.	2020-04-20 13:25:06 -07:00
Craig Topper	897409fb56	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 13:08:05 -07:00
Nikita Popov	54d01cbc15	[IPT] Don't use OrderedInstructions (NFC) Use Instruction::comesBefore() instead of OrderedInstructions inside InstructionPrecedenceTracking. This also removes the dominator tree dependency. Differential Revision: https://reviews.llvm.org/D78461	2020-04-20 18:25:31 +02:00
Bjorn Pettersson	a8a31fdd80	[Scalarizer] Fix a non-deterministic scatter order problem Summary: The indexing operator in Scatterer may result in building new instructions. When using multiple such operators in a function argument list, the order in which we build instructions depends on argument evaluation order (which is unspecified in C++). This patch avoids such problems by expanding the components using the [] operator prior to the function call. The problem was seen when comparing output while building LLVM with different compilers (clang vs gcc). Reviewers: foad, cameron.mcinally, uabelho Reviewed By: foad Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78455	2020-04-20 16:05:33 +02:00
Florian Hahn	fa284e136e	[VPlan] Clean up tryToCreate(Widen)Recipe. (NFC) This patch includes some clean-ups to tryToCreateRecipe, suggested in D77973. It includes: * Renaming tryToCreateRecipe to tryToCreateWidenRecipe. * Move VPBB insertion logic to caller of tryToCreateWidenRecipe. * Hoists instruction checks to tryToCreateWidenRecipe, making it clearer which instructions are handled by which recipe, simplifying the checks by using early exits. * Split up handling of induction PHIs and truncates using inductions. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78287	2020-04-20 10:06:35 +01:00
Florian Hahn	4331b3812a	[PredicateInfo] Use new Instruction::comesBefore instead of OI (NFC). The recently added Instruction::comesBefore can be used instead of OrderedInstructions. Reviewers: rnk, nikic, efriedma Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D78452	2020-04-20 09:22:21 +01:00
Sam Parker	e3056ae9a0	[NFC][TTI] Explicit use of VectorType The API for shuffles and reductions uses generic Type parameters, instead of VectorType, and so assertions and casts are used a lot. This patch makes those types explicit, which means that the clients can't be lazy, but results in less ambiguity, and that can only be a good thing. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45562 Differential Revision: https://reviews.llvm.org/D78357	2020-04-20 09:16:52 +01:00
Craig Topper	53ee8fbc23	[CallSite removal][SCCP] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78470	2020-04-20 00:16:09 -07:00
Craig Topper	4cf6d4ab48	[CallSite removal][CalledValuePropagation] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78467	2020-04-19 22:05:40 -07:00
Florian Hahn	a7aaadc135	[TTI] Clean up includes (NFC). Remove some unnecessary includes, replace some with forward declarations. This also exposed a few places that were missing some includes.	2020-04-19 20:11:59 +01:00
Florian Hahn	32af48cdcf	[IVDescriptors] Clean up includes. Some includes are not required and forward declarations can be used instead. This also exposed a few places that were not directly including required files.	2020-04-19 20:07:47 +01:00
Florian Hahn	7a87e8f90b	[LoopUtils] Clean up includes, use forward decls if appropriate (NFC). Most of the includes in LoopUtils.h are not required in the header and they can be replaced by forward declarations. Unfortunately includes of TargetTransformInfo.h and IVDescriptors.h pull in a bunch of additional things, but there is no easy way to get rid of them at the moment I think.	2020-04-19 19:44:29 +01:00
Sanjay Patel	bef6e67e95	[VectorCombine] transform bitcasted shuffle to wider elements bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC' This is the widen shuffle elements enhancement to D76727. It builds on the analysis and simplifications in D77881 and rG6a7e958a423e. The phase ordering tests show that we can simplify inverse shuffles across a binop in both directions (widen/narrow or narrow/widen) now. There's another potential transform visible in some of the remaining TODOs - move a bitcasted operand of a shuffle after the shuffle. Differential Revision: https://reviews.llvm.org/D78371	2020-04-19 08:24:38 -04:00
Benjamin Kramer	ff54d1c897	Remove remaining callers of CreateShuffleVector with unsigned indices and mark it as deprecated No functionality change intended.	2020-04-19 11:48:28 +02:00
Florian Hahn	6ba0695c60	[ValueLattice] Add struct for merge options. This makes it easier to extend the merge options in the future and also reduces the risk of accidentally setting a wrong option. Reviewers: efriedma, nikic, reames, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78368	2020-04-19 09:03:16 +01:00
Ayal Zaks	8e0c5f7200	[LV] Mark first-order recurrences as allowed exits First-order recurrences require special treatment when they are live-out; such treatment is provided by fixFirstOrderRecurrence(), so they should be included in AllowedExit set. (Should probably have been included originally in D16197.) Fixes PR45526: AllowedExit set is used by prepareToFoldTailByMasking() to check whether the treatment for live-outs also holds when folding the tail, which is not (yet) the case for first-order recurrences. Differential Revision: https://reviews.llvm.org/D78210	2020-04-18 23:54:21 +03:00
Craig Topper	7fde990694	Recommit "[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI" With a tweak to avoid a linker error for passing MaxAlignmentExponent by reference to std::min.	2020-04-18 13:51:57 -07:00
Nikita Popov	a42fd18d0f	[PredicateInfo] Factor out PredicateInfoBuilder (NFC) When running IPSCCP on a module with many small functions, memory usage is dominated by PredicateInfo, which is a huge structure (partially due to some unfortunate nested SmallVector use). However, most of it is actually only temporary state needed to build predicate info, and does not need to be retained after initial construction. This patch factors out the predicate building logic and state into a separate PredicateInfoBuilder, with the extra bonus that it does not need to live in the header anymore. Differential Revision: https://reviews.llvm.org/D78326	2020-04-18 22:34:38 +02:00
Craig Topper	44d63b7528	Revert "[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI" This reverts commit `e00cfe254d`. Seems to be causing a linker error on the build bots.	2020-04-18 13:23:29 -07:00
Craig Topper	e00cfe254d	[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI We previously clamped the trailing zero count to 31 bits. And then clamped the final alignment to MaximumAlignment which is 1 << 29. This patch simplifies this to just clamp the trailing zero to 29 using MaxAlignmentExponent. I was looking into changing this function to use Align/MaybeAlign and noticed this. Differential Revision: https://reviews.llvm.org/D78418	2020-04-18 12:52:47 -07:00
Florian Hahn	46853b95ca	[SCCP] Drop unused early exit from visitStoreInst (NFC). There are no lattice values associated with store instructions directly. They will never get marked as overdefined.	2020-04-18 19:44:54 +01:00
Florian Hahn	034e8d58a8	[SCCP] Drop unused early exit from visitReturnInst (NFC). There are no lattice values associated with return instructions directly. They will never get marked as overdefined.	2020-04-18 13:52:41 +01:00
Florian Hahn	4ee45ab60f	[LV] Invalidate cost model decisions along with interleave groups. Cost-modeling decisions are tied to the compute interleave groups (widening decisions, scalar and uniform values). When invalidating the interleave groups, those decisions also need to be invalidated. Otherwise there is a mis-match during VPlan construction. VPWidenMemoryRecipes created initially are left around w/o converting them into VPInterleave recipes. Such a conversion indeed should not take place, and these gather/scatter recipes may in fact be right. The crux is leaving around obsolete CM_Interleave (and dependent) markings of instructions along with their costs, instead of recalculating decisions, costs, and recipes. Alternatively to forcing a complete recompute later on, we could try to selectively invalidate the decisions connected to the interleave groups. But we would likely need to run the uniform/scalar value detection parts again anyways and the extra complexity is probably not worth it. Fixes PR45572. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78298	2020-04-18 10:23:49 +01:00
Mircea Trofin	41ad8b7388	[llvm][NFC][CallSite] Remove CallSite from Evaluator. Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78395	2020-04-17 19:11:17 -07:00
Anna Thomas	fd5e069d23	Fix buildbot failure due to obsolete CallSite usage Fix buildbot failures due to `ef49b1d97e` (which was a revert of a previous change).	2020-04-17 17:46:19 -04:00
Anna Thomas	ef49b1d97e	Revert "[InlineFunction] Update metadata on loads that are return values" This reverts commit `1d0f757904` because of https://bugs.llvm.org/show_bug.cgi?id=45590. Needs investigation.	2020-04-17 17:23:00 -04:00
Craig Topper	5f6d93c7d3	[CallSite removal][Attributor] Replaces use of CallSite with CallBase. NFC Differential Revision: https://reviews.llvm.org/D78343	2020-04-17 10:44:31 -07:00
Craig Topper	0feaba683e	[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC There are also some adjustments to use MaybeAlign in here due to CallBase::getParamAlignment() being deprecated. It would be a little cleaner if getOrEnforceKnownAlignment was migrated to Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78345	2020-04-17 10:32:45 -07:00
Craig Topper	8c94d616e1	Revert "[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC" There were extra changes that weren't supposed to be in there This reverts commit `b91f78db37`.	2020-04-17 10:11:22 -07:00
Craig Topper	b91f78db37	[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC There are also some adjustments to use MaybeAlign in here due to CallBase::getParamAlignment() being deprecated. It would be cleaner if getOrEnforceKnownAlignment was migrated to Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78345	2020-04-17 10:07:20 -07:00
Florian Hahn	c245d3e033	[ValueLattice] Steal bits from Tag to track range extensions (NFC). Users of ValueLatticeElement currently have to ensure constant ranges are not extended indefinitely. For example, in SCCP, mergeIn goes to overdefined if a constantrange value is repeatedly merged with larger constantranges. This is a simple form of widening. In some cases, this leads to an unnecessary loss of information and things can be improved by allowing a small number of extensions in the hope that a fixed point is reached after a small number of steps. To make better decisions about widening, it is helpful to keep track of the number of range extensions. That state is tied directly to a concrete ValueLatticeElement and some unused bits in the class can be used. The current patch preserves the existing behavior by default: CheckWiden defaults to false and if CheckWiden is true, a single change to the range is allowed. Follow-up patches will slightly increase the threshold for widening. Reviewers: efriedma, davide, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78145	2020-04-17 15:38:23 +01:00
Benjamin Kramer	c5e7c2691d	Remove accidental include. Thank you clangd.	2020-04-17 16:36:30 +02:00
Benjamin Kramer	b639091c02	Change users of CreateShuffleVector to pass the masks as int instead of Constants No functionality change intended.	2020-04-17 16:34:29 +02:00
Benjamin Kramer	166467e822	[VectorUtils] Create shufflevector masks as int vectors instead of Constants No functionality change intended.	2020-04-17 15:28:00 +02:00
Max Kazantsev	72c13446ce	[NFC] Add missing 'const' notion to LCSSA-related functions These functions don't really do any changes to loop info or dominator tree. We should state this explicitly using 'const'.	2020-04-17 17:49:34 +07:00
Simon Pilgrim	fa7f328a15	[cmake] LLVMVectorize - add include/llvm/Transforms/Vectorize header path MSVC projects were missing the llvm/Transforms/Vectorize/* headers	2020-04-17 11:06:26 +01:00
Craig Topper	5034df8600	[SampleProfile] Use CallBase in function arguments and data structures to reduce the number of explicit casts. NFCI Removing CallSite left us with a bunch of explicit casts from Instruction to CallBase. This moves the casts earlier so that function arguments and data structure types are CallBase so we don't have to cast when we use them. Differential Revision: https://reviews.llvm.org/D78246	2020-04-16 22:10:34 -07:00
Craig Topper	798b262c3c	[CallSite removal][IPO] Change implementation of AbstractCallSite to store a CallBase* instead of CallSite. NFCI. CallSite will likely be removed soon, but AbstractCallSite serves a different purpose and won't be going away. This patch switches it to internally store a CallBase* instead of a CallSite. The only interface changes are the removal of the getCallSite method and getCallBackUses now takes a CallBase&. These methods had only a few callers that were easy enough to update without needing a compatibility shim. In the future once the other CallSites are gone, the CallSite.h header should be renamed to AbstractCallSite.h Differential Revision: https://reviews.llvm.org/D78322	2020-04-16 16:24:45 -07:00
Bob Haarman	cc5c58889e	[WPD] Avoid noalias assumptions in unique return value optimization Summary: Changes the type of the @__typeid_.*_unique_member imports we generate for unique return value optimization from i8 to [0 x i8]. This prevents assuming that these imports do not alias, such as when two unique return values occur in the same vtable. Fixes PR45393. Reviewers: tejohnson, pcc Reviewed By: pcc Subscribers: aganea, hiraditya, rnk, george.burgess.iv, dblaikie, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77421	2020-04-16 14:49:51 -07:00
Roman Lebedev	b1fbf438f6	[OpenMPOpt] deduplicateRuntimeCalls(): avoid traditional map lookup pitfall Summary: This roughly halves time spent in that pass, while unsurprisingly significantly reducing total memory usage. This makes sense because most functions won't use any openmp functions.. old ``` 0.2329 ( 0.5%) 0.0409 ( 0.9%) 0.2738 ( 0.5%) 0.2736 ( 0.5%) OpenMP specific optimizations ``` ``` total runtime: 63.32s. bytes allocated in total (ignoring deallocations): 8.34GB (131.70MB/s) calls to allocation functions: 14526259 (229410/s) temporary memory allocations: 3335760 (52680/s) peak heap memory consumption: 324.36MB peak RSS (including heaptrack overhead): 5.39GB total memory leaked: 289.93MB ``` new ``` 0.1457 ( 0.3%) 0.0276 ( 0.6%) 0.1732 ( 0.3%) 0.1731 ( 0.3%) OpenMP specific optimizations ``` ``` total runtime: 55.01s. bytes allocated in total (ignoring deallocations): 6.70GB (121.89MB/s) calls to allocation functions: 14268205 (259398/s) temporary memory allocations: 3225355 (58637/s) peak heap memory consumption: 324.09MB peak RSS (including heaptrack overhead): 5.39GB total memory leaked: 289.87MB ``` diff ``` total runtime: -8.31s. bytes allocated in total (ignoring deallocations): -1.63GB (196.58MB/s) calls to allocation functions: -258054 (31034/s) temporary memory allocations: -110405 (13277/s) peak heap memory consumption: -262.36KB peak RSS (including heaptrack overhead): 0B total memory leaked: -61.45KB ``` Reviewers: jdoerfert, hfinkel Reviewed By: jdoerfert Subscribers: yaxunl, hiraditya, guansong, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78299	2020-04-16 19:54:02 +03:00
Bjorn Pettersson	fdf9bad573	[Float2Int] Stop passing around a reference to the class member Roots. NFC The Float2IntPass got a class member called Roots, but Roots was also passed around to member functions as a reference. This patch simply removes those references.	2020-04-16 15:24:13 +02:00
Johannes Doerfert	c4d3188adb	[Attributor][NFC] Reduce indentation for call site attribute seeding Also added a TODO to remind us that indirect calls could be optimized as well.	2020-04-16 02:32:31 -05:00
Johannes Doerfert	0741dec27b	[Attributor][FIX] Handle droppable uses when replacing values Since we use the fact that some uses are droppable in the Attributor we need to handle them explicitly when we replace uses. As an example, an assumed dead value can have live droppable users. In those we cannot replace the value simply by an undef. Instead, we either drop the uses (via `dropDroppableUses`) or keep them as they are. In this patch we do both, depending on the situation. For values that are dead but not necessarily removed we keep droppable uses around because they contain information we might be able to use later. For values that are removed we drop droppable uses explicitly to avoid replacement with undef.	2020-04-16 00:56:08 -05:00
Johannes Doerfert	ea7f17ee38	[InstCombine] Simplify calls with casted `returned` attribute The handling of the `returned` attribute in D75815 did miss the case where the argument is (bit)casted to a different type. This is explicitly allowed by the language reference and exposed by the Attributor. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D77977	2020-04-16 00:56:00 -05:00
Johannes Doerfert	253d6be0f6	[Attributor][FIX] Properly check for accesses to globals The check if globals were accessed was not always working because two bits are set for NO_GLOBAL_MEM. The new check works also if only one kind of globals (internal/external) is accessed.	2020-04-16 00:55:34 -05:00
Johannes Doerfert	ad9c284cc3	[Attributor][NFC] Run the verifier only on functions and under EXPENSIVE_CHECKS Running the verifier is expensive so we want to avoid it even in runs that enable assertions. As we move closer to enabling the Attributor this code will be executed by some buildbots but not cause overhead for most people.	2020-04-16 00:55:33 -05:00
Craig Topper	8e1408695c	[CallSite removal][TargetLibraryInfo] Replace ImmutableCallSite with CallBase in one of the getLibFunc signatures. NFC Differential Revision: https://reviews.llvm.org/D78083	2020-04-15 22:43:41 -07:00
Mircea Trofin	4213bc761a	[llvm][NFC][CallSite] Removed CallSite from some implementation details. Reviewers: craig.topper, dblaikie Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78256	2020-04-15 22:27:05 -07:00
Johannes Doerfert	898bbc252a	[Attributor] Lazily collect function information Before, we eagerly analyzed all the functions to collect information about them, e.g. what instructions may read/write memory. This had multiple drawbacks: - In CGSCC-mode we can end up looking at a callee which is not in the SCC but for which we need an initialized cache. - We end up looking at functions that we deem dead and never need to analyze in the first place. - We have an implicit dependence which is easy to break. This patch moves the function analysis into the information cache and makes it lazy. There is no real functional change expected except due to the first reason above.	2020-04-15 22:26:38 -05:00
Johannes Doerfert	8c4057e3a3	[Attributor] Replace call graph call sites after function replacement The CallGraphUpdater allows to directly alter call site information and we should do so. This might appease the windows buildbot that crashes during the SCC traversal.	2020-04-15 22:24:09 -05:00
Johannes Doerfert	df675890b7	[CallGraphUpdater][NFC] Minor updates to D77855 I uploaded the old version accidentally instead of the one with these minor adjustments requested by the reviewers. Differential Revision: https://reviews.llvm.org/D77855	2020-04-15 21:26:35 -05:00
Alina Sbirlea	edccc35e8f	[Reassociate] Preserve AAManager and BasicAA analyses. Now Reassociate Pass invalidates the analysis results of AAManager and BasicAA, but it saves GlobalsAA, although it seems that it should preserve them, since it affects only Unary and Binary operators. Author: kpolushin (Kirill) Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D77137	2020-04-15 16:58:03 -07:00
Johannes Doerfert	937025757c	[CallGraphUpdater] Remove nodes from their SCC (old PM) Summary: We can and should remove deleted nodes from their respective SCCs. We did not do this before and this was a potential problem even though I couldn't locally trigger an issue. Since the `DeleteNode` would assert if the node was not in the SCC, we know we only remove nodes from their SCC and only once (when run on all the Attributor tests). Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku Subscribers: hiraditya, bollu, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77855	2020-04-15 18:38:50 -05:00
Johannes Doerfert	1b34b84ddd	[CallGraphUpdater] Update the ExternalCallingNode for node replacements Summary: While it is uncommon that the ExternalCallingNode needs to be updated, it can happen. It is uncommon because most functions listed as callees have external linkage, modifying them is usually not allowed. That said, there are also internal functions that have, or better had, their "address taken" at construction time. We conservatively assume various uses cause the address "to be taken". Furthermore, the user might have become dead at some point. As a consequence, transformations, e.g., the Attributor, might be able to replace a function that is listed as callee of the ExternalCallingNode. Since there is no function corresponding to the ExternalCallingNode, we did just remove the node from the callee list if we replaced it (so far). Now it would be preferable to replace it if needed and remove it otherwise. However, removing the node has implications on the CGSCC iteration. Locally, that caused some other nodes to be never visited but it is for sure possible other (bad) side effects can occur. As it seems conservatively safe to keep the new node in the callee list we will do that for now. Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku Subscribers: hiraditya, bollu, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77854	2020-04-15 18:38:50 -05:00
Johannes Doerfert	7ec8d79385	[CallGraphUpdater] Properly remove strongly connected components (oldPM) Summary: The old code did eliminate references from and to functions that were about to be deleted only just before we deleted them. This can cause references from other functions that are supposed to be deleted to still exist, depending on the order. If the functions form a strongly connected component the problem manifests regardless of the order in which we try to actually delete the functions. This patch introduces a two step deletion. First we remove all references and then we delete the function. Note that this only affects the old call graph. There should not be any functional changes if no old style call graph was given. To test this we delete two strongly connected functions instead of one in an existing test. Reviewers: hfinkel Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77975	2020-04-15 18:38:49 -05:00
Craig Topper	240725666a	[CallSite removal][CallSiteSplitting] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78240	2020-04-15 15:38:02 -07:00
Craig Topper	fbb804983d	[CallSite removal][CloneFunction] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78236	2020-04-15 15:38:02 -07:00
Philip Reames	80c46c53bd	[PoisonChecking] Further clarify file scope comment, and update to match naming now used in code	2020-04-15 14:48:53 -07:00
Philip Reames	463513e959	[NFC] Adjust style and clarify comments in PoisonChecking	2020-04-15 14:48:53 -07:00
Philip Reames	75ca7127bc	[NFC] Use new canCreatePoison to make code intent more clear in PoisonChecking	2020-04-15 14:48:53 -07:00
Craig Topper	592d8e7d75	[CallSite removal][SimpleLoopUnswitch] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78227	2020-04-15 13:25:02 -07:00
Craig Topper	7b6ff8bf1f	[CallSite removal][SampleProfile] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78219	2020-04-15 12:47:17 -07:00
Davide Italiano	5f87415efc	[LICM] Try to merge debug locations when sinking. The current strategy LICM uses when sinking for debuginfo is that of picking the debug location of one of the uses. This causes stepping to be wrong sometimes, see, e.g. PR45523. This patch introduces a generalization of getMergedLocation() that operates on a vector of locations instead of two and tries to merge them all together, and uses the new API in LICM. <rdar://problem/61750950>	2020-04-15 12:29:34 -07:00
Craig Topper	a0d92248ea	[CallSite removal][PruneEH] Use CallBase instead of CallSite. NFC Reviewers: mtrofin, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78182	2020-04-15 10:11:41 -07:00
Sanjay Patel	01bcc3e937	[InstCombine] prevent infinite loop with sub/abs of constant expression PR45539: https://bugs.llvm.org/show_bug.cgi?id=45539	2020-04-15 09:19:16 -04:00
Benjamin Kramer	cc035d475f	Upgrade users of 'new ShuffleVectorInst' to pass indices as an int array No functionality change intended.	2020-04-15 14:29:43 +02:00
Florian Hahn	3f7f06888b	[VPlan] Branches are not widened by VPWidenRecipe, assert (NFC).	2020-04-15 12:03:45 +01:00
Benjamin Kramer	6f64daca8f	Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints No functionality change intended.	2020-04-15 12:51:38 +02:00
Florian Hahn	5b4b3e0b6e	[VPlan] Move widening check for non-memory/non-calls to function (NFC). After introducing VPWidenSelectRecipe, the duplicated logic can be shared. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77973	2020-04-15 11:48:37 +01:00
Florian Hahn	cf9ee49b4d	[DSE] Lift post-dominance for objs not accessible in caller. We can eliminate MemoryDefs of objects not accessible after the function returns (e.g. alloca), if there are no reads between the MemoryDef and any function exits. We can stop traversing paths that completely overwrite the memory location of the MemoryDef. This patch was split off D73763. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: asbirlea, george.burgess.iv Differential Revision: https://reviews.llvm.org/D77736	2020-04-15 11:37:14 +01:00
Sameer Sahasrabuddhe	7bb9f500e2	fix warning: specialization of template in different namespace This is related to commit `8c11bc0cd0` which introduces the FixIrreducible pass. The warning seems hard to reproduce locally. The latest attempt ought to work.	2020-04-15 15:57:53 +05:30
Sameer Sahasrabuddhe	8c11bc0cd0	Introduce fix-irreducible pass An irreducible SCC is one which has multiple "header" blocks, i.e., blocks with control-flow edges incident from outside the SCC. This pass converts an irreducible SCC into a natural loop by introducing a single new header block and redirecting all the edges on the original headers to this new block. This is a useful workaround for a limitation in the structurizer, which produces incorrect control flow in the presence of irreducible regions. The AMDGPU backend provides an option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D77198 This restores commit `2ada8e2525`. Originally reverted with commit `44e09b59b8`.	2020-04-15 15:05:51 +05:30
Florian Hahn	79d185c792	[VPlan] Move Load/Store checks out of tryToWiden (NFC). Handling LoadInst and StoreInst in tryToWiden seems a bit counter-intuitive, as there is only an assertion for them and in no case VPWidenRecipes are created for them. I think it makes sense to move the assertion to handleReplication, where the non-widened loads and stores are handled. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77972	2020-04-15 10:18:42 +01:00
Gil Rapaport	b747d72c19	[LV] Fix PR45525: Incorrect assert in blend recipe Fix an assert introduced in 41ed5d856c1: a phi with a single predecessor and a mask is a valid case which is already supported by the code. Differential Revision: https://reviews.llvm.org/D78115	2020-04-15 10:39:07 +03:00
Sameer Sahasrabuddhe	44e09b59b8	Revert "Introduce fix-irreducible pass" This reverts commit `2ada8e2525`. Buildbots produced compilation errors which I was not able to quickly reproduce locally. Need more time to investigate.	2020-04-15 12:19:50 +05:30
Sameer Sahasrabuddhe	2ada8e2525	Introduce fix-irreducible pass An irreducible SCC is one which has multiple "header" blocks, i.e., blocks with control-flow edges incident from outside the SCC. This pass converts an irreducible SCC into a natural loop by introducing a single new header block and redirecting all the edges on the original headers to this new block. This is a useful workaround for a limitation in the structurizer, which produces incorrect control flow in the presence of irreducible regions. The AMDGPU backend provides an option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D77198	2020-04-15 11:29:19 +05:30
Teresa Johnson	33ffb62e23	Allow disabling of vectorization using internal options Summary: Currently, the internal options -vectorize-loops, -vectorize-slp, and -interleave-loops do not have much practical effect. This is because they are used to initialize the corresponding flags in the pass managers, and those flags are then unconditionally overwritten when compiling via clang or via LTO from the linkers. The only exception was -vectorize-loops via opt because of some special hackery there. While vectorization could still be disabled when compiling via clang, using -fno-[slp-]vectorize, this meant that there was no way to disable it when compiling in LTO mode via the linkers. This only affected ThinLTO, since for regular LTO vectorization is done during the compile step for scalability reasons. For ThinLTO it is invoked in the LTO backends. See also the discussion on PR45434. This patch makes it so the internal options can actually be used to disable these optimizations. Ultimately, the best long term solution is to mark the loops with metadata (similar to the approach used to fix -fno-unroll-loops in D77058), but this enables a shorter term workaround, and actually makes these internal options useful. I constant propagated the initial values of these internal flags into the pass manager flags (for some reasons vectorize-loops and interleave-loops were initialized to true, while vectorize-slp was initialized to false). As mentioned above, they are overwritten unconditionally so this doesn't have any real impact, and these initial values aren't particularly meaningful. I then changed the passes to check the internal values and return without performing the associated optimization when false (I changed the default of -vectorize-slp to true so the options behave similarly). I was able to remove the hackery in opt used to get -vectorize-loops=false to work, as well as a special option there used to disable SLP vectorization. 
Finally, I changed thinlto-slp-vectorize-pm.c to: a) Only test SLP (moved the loop vectorization checking to a new test). b) Use code that is slp vectorized when it is enabled, and check that instead of whether the pass is enabled. c) Test the new behavior of -vectorize-slp. d) Test both pass managers. The loop vectorization (and associated interleaving) testing I moved to a new thinlto-loop-vectorize-pm.c test, with several changes: a) Changed the flags on the interleaving testing so that it will actually interleave, and check that. b) Test the new behavior of -vectorize-loops and -interleave-loops. c) Test both pass managers. Reviewers: fhahn, wmi Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77989	2020-04-14 18:09:10 -07:00
Mircea Trofin	447e2c3067	[llvm][NFC][CallSite] Remove Implementation uses of CallSite Reviewers: dblaikie, davidxl, craig.topper Subscribers: arsenm, dschuff, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78142	2020-04-14 14:49:47 -07:00
Christopher Tetreault	8226d599ff	[SVE] Remove calls to getBitWidth from Transforms Reviewers: efriedma, sdesmalen, spatel, eugenis, chandlerc Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77896	2020-04-14 14:31:42 -07:00
Huihui Zhang	5c1d1a62e3	[InstCombine][SVE] Fix visitGetElementPtrInst for scalable type. Summary: This patch fixes the following issues in InstCombiner::visitGetElementPtrInst 1. Skip for scalable type if transformation requires fixed size number of vector element. 2. Skip for scalable type if transformation relies on compile-time known type alloc size. 3. Use VectorType::getElementCount when scalable property is used to construct new VectorType. 4. Use TypeSize::getKnownMinSize when minimal size of a scalable type is valid to determine GEP 'inbounds'. 5. Explicitly call TypeSize::getFixedSize to avoid implicit type conversion to uint64_t. Reviewers: sdesmalen, efriedma, spatel, ctetreau Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78081	2020-04-14 12:38:32 -07:00
Sanjay Patel	6a7e958a42	[InstCombine] try to reduce more shuffles with bitcasted operand This is the widen mask element sibling to D76844. shuf (bitcast X), undef, Mask --> bitcast X' http://volta.cs.utah.edu:8080/z/4dt3V8	2020-04-14 15:03:59 -04:00
Benjamin Kramer	7bf166665e	[FunctionAttrs] Don't copy all the nodes where a reference is fine.	2020-04-14 17:18:23 +02:00
Max Kazantsev	f8a42bca28	[ADCE] Fix incorrect reporting of CFG changes This patch fixes 2 related bugs in ADCE: - `performDeadCodeElimination` does not report changes if it did ONLY CFG changes (affects both old and new pass managers); - When control flow removal is enabled, new pass manager does not drop CFG analyses. Both can lead to incorrect loop info after ADCE that does only CFG changes. Differential Revision: https://reviews.llvm.org/D78103 Reviewed By: Denis Antrushin	2020-04-14 20:26:13 +07:00
Aaron Puchert	e833e58300	[ValueLattice] Remove unused DataLayout parameter of mergeIn, NFC Reviewed By: fhahn, echristo Differential Revision: https://reviews.llvm.org/D78061	2020-04-14 13:32:53 +02:00
Georgii Rymar	1647ff6e27	[ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers. It can be used to avoid passing the begin and end of a range. This makes the code shorter and it is consistent with other wrappers we already have. Differential revision: https://reviews.llvm.org/D78016	2020-04-14 14:11:02 +03:00
Florian Hahn	38609fa9e4	Recommit "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This includes a fix reported with simplifications in the presence of NaN. This reverts the revert commit `06408451bf`.	2020-04-14 11:48:52 +01:00
Tyker	3bdfa966ec	[AssumeBundles] preserve knowledge in DCE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77403	2020-04-14 12:48:15 +02:00
Tyker	086de7673e	[AssumeBundles] preserve knowledge in DSE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77404	2020-04-14 12:48:15 +02:00
Tyker	de4dc275f5	[AssumeBundles] preserve information in NewGVN Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77406	2020-04-14 12:48:14 +02:00
Tyker	c35194b800	[AssumeBundles] preserve information in LICM Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77407	2020-04-14 12:48:14 +02:00
Tyker	1d2b76a8fc	[AssumeBundles] adapt GVN to assume bundles Summary: prevent GVN from removing assume bundles; make GVN preserve information from removed instructions Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77405	2020-04-14 12:48:14 +02:00
Pratyai Mazumder	0c61e91100	[SanitizerCoverage] The section name for inline-bool-flag was too long for darwin builds, so shortening it. Summary: Following up on the comments on D77638. Not undoing rGd6525eff5ebfa0ef1d6cd75cb9b40b1881e7a707 here at the moment, since I don't know how to test mac builds. Please let me know if I should include that here too. Reviewers: vitalybuka Reviewed By: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77889	2020-04-14 02:06:33 -07:00
Mircea Trofin	4aae4e3f48	[llvm][NFC] CallSite removal from inliner-related files Summary: This removes CallSite from inliner files. Some dependencies were thus affected. Reviewers: dblaikie, davidxl, craig.topper Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77991	2020-04-13 21:28:58 -07:00
Mehdi Amini	384ca190ae	Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object" This reverts commit `10df1563d6`. Some buildbots are broken.	2020-04-14 00:27:08 +00:00
Mehdi Amini	10df1563d6	Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object ModuleSummaryAnalysis is the only file in libAnalysis that brings a dependency on the CodeGen layer from libAnalysis, moving it breaks this dependency. Differential Revision: https://reviews.llvm.org/D77994	2020-04-13 23:12:11 +00:00
Benjamin Kramer	f1542efd97	[CHR] Clean up some code and reduce copying. NFCI.	2020-04-13 23:11:20 +02:00
Christopher Tetreault	3297e9b7c3	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: rriddle, sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77259	2020-04-13 12:29:43 -07:00
Benjamin Kramer	ec228d722c	[InstCombine] Use SmallBitVector for conveniently checking if all bits are set	2020-04-13 20:37:15 +02:00
Vedant Kumar	4831f4b7bd	[InstCombine] Fix debug variance issue in tryToMoveFreeBeforeNullTest Fix an issue where the presence of debug info could disable an optimization in tryToMoveFreeBeforeNullTest.	2020-04-13 10:55:17 -07:00
Vedant Kumar	122a6bfb07	[Debugify] Strip added metadata in the -debugify-each pipeline Summary: Share logic to strip debugify metadata between the IR and MIR level debugify passes. This makes it simpler to hunt for bugs by diffing IR with vs. without -debugify-each turned on. As a drive-by, fix an issue causing CallGraphNodes to become invalid when a dead llvm.dbg.value prototype is deleted. Reviewers: dsanders, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77915	2020-04-13 10:55:17 -07:00
Gil Rapaport	41ed5d856c	[LV] Clean up vectorizeInterleaveGroup (NFCI) Pass the interleave group itself from the calling recipe, instead of passing the group's insertion position and having the function query CM for the interleave group and check that the given instruction is its insertion point. Differential Revision: https://reviews.llvm.org/D78002	2020-04-13 13:15:06 +03:00
Tyker	813f438baa	[AssumeBundles] adapt Assumption cache to assume bundles Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge. Reviewers: jdoerfert, hfinkel Reviewed By: jdoerfert Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77402	2020-04-13 12:04:51 +02:00
Benjamin Kramer	06408451bf	Revert "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This reverts commit `1a02aaeaa4`. Crashes on the following test case: $ cat crash.ll source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\00\00\C0\7F\09\85\08?\ED\C94\FE~\EB/\F3\90\CF\BA\C1" @1 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\A3\A0\0FA\00\00\C0\7F\00\00\C0\7F\00\00\00\00\02\9AA\00" define void @IgammaSpecialValues.448() { entry: br label %fusion.26.loop_header.dim.0 fusion.26.loop_header.dim.0: ; preds = %fusion.26.loop_header.dim.0, %entry %fusion.26.invar_address.dim.0.0 = phi i64 [ 0, %entry ], [ %invar.inc17, %fusion.26.loop_header.dim.0 ] %0 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @0 to [6 x float]*), i64 0, i64 %fusion.26.invar_address.dim.0.0 %1 = load float, float* %0 %2 = fmul float %1, 0.000000e+00 %3 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @1 to [6 x float]*), i64 0, i64 %fusion.26.invar_address.dim.0.0 %4 = load float, float* %3 %5 = fneg float %4 %6 = fadd float %2, %5 %invar.inc17 = add nuw nsw i64 %fusion.26.invar_address.dim.0.0, 1 br label %fusion.26.loop_header.dim.0 } $ opt -ipsccp -S < crash.ll opt: llvm/include/llvm/Analysis/ValueLattice.h:251: bool llvm::ValueLatticeElement::markConstant(llvm::Constant *, bool): Assertion `getConstant() == V && "Marking constant with different value"' failed.	2020-04-13 11:23:26 +02:00
Florian Hahn	18138e0252	[VPlan] Introduce VPWidenSelectRecipe (NFC). Widening a select depends on whether the condition is loop invariant or not. Rather than checking during codegen-time, the information can be recorded at the VPlan construction time. This was suggested as part of D76992, to reduce the reliance on accessing the original underlying IR values. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77869	2020-04-13 08:35:28 +01:00
Eli Friedman	cfb844265a	[GlobalOpt] Explicitly set alignment of bool load/store operations.	2020-04-12 16:03:12 -07:00
Huihui Zhang	4bde7c5986	[NFC] Use VectorType::isScalable to align with ongoing VectorType refactor.	2020-04-12 15:39:13 -07:00
Mircea Trofin	d2f1cd5d97	[llvm][NFC] Refactor uses of CallSite to CallBase - call promotion Summary: Updated CallPromotionUtils and impacted sites. Parameters that are expected to be non-null, and return values that are guaranteed non-null, were replaced with CallBase references rather than pointers. Left FIXME in places where more changes are facilitated by CallBase, but aren't CallSites: Instruction* parameters or return values, for example, where the contract is that they are actually CallBase values. Reviewers: davidxl, dblaikie, wmi Reviewed By: dblaikie Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77930	2020-04-12 08:27:29 -07:00
Florian Hahn	ae1e353a25	[VPlan] Turn classes with all public members into structs (NFC). struct should be used when all members are public: https://llvm.org/docs/CodingStandards.html#use-of-class-and-struct-keywords Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77865	2020-04-12 11:03:39 +01:00
Sanjay Patel	1318ddbc14	[VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC As proposed in D77881, we'll have the related widening operation, so this name becomes too vague. While here, change the function signature to take an 'int' rather than 'size_t' for the scaling factor, add an assert for overflow of 32-bits, and improve the documentation comments.	2020-04-11 10:05:49 -04:00
Benjamin Kramer	e590bd6b92	[argpromote] Use formatv to simplify code. NFCI.	2020-04-11 14:54:32 +02:00
Florian Hahn	719846c469	[VPlan] Drop redundant private: at beginning of class defs (NFC). Default visibility for classes is private, so the private: at the top of various class definitions is redundant. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77810	2020-04-11 13:27:10 +01:00
Huihui Zhang	6e7eeb44b3	[GVN] Fix VNCoercion for Scalable Vector. Summary: For VNCoercion, skip scalable vector when analysis relies on fixed size, otherwise call TypeSize::getFixedSize() explicitly. Add unit tests to check functionality of GVN load elimination for scalable type. Reviewers: sdesmalen, efriedma, spatel, fhahn, reames, apazos, ctetreau Reviewed By: efriedma Subscribers: bjope, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76944	2020-04-10 17:49:07 -07:00
Eric Christopher	45dca04395	Exclude bitcast and ext/trunc signbit optimization on ppc_fp128 Revision `a1c05fe` <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad> removed bitcast from the list of problematic transformations, however: %97 = fptrunc ppc_fp128 %2 to double // we need to check ppc_fp128 here to prevent the transformation %98 = bitcast double %97 to i64 // `a1c05fe` checks ppc_fp128 at here %99 = icmp slt i64 %98, 0 %100 = zext i1 %99 to i8 store i8 %100, i8* %7, align 1 so this patch does that. I'm also disabling it in the presence of extend just in case. I verified separately that the hash of -std::infinity and std::infinity don't match now. Differential Revision: https://reviews.llvm.org/D77911	2020-04-10 17:07:55 -07:00
Mircea Trofin	da9bcdaad9	[llvm][NFC] Inliner.cpp: ensure InlineHistory ID is always initialized; Summary: The inline history is associated with a call site. There are two locations we fetch inline history. In one, we fetch it together with the call site. In the other, we initialize it under certain conditions, use it later under same conditions (different if check), and otherwise is uninitialized. Although currently there is no uninitialized use, the code is more challenging to maintain correctly, than if the value were always initialized. Changed to the upfront initialization pattern already present in this file. Reviewers: davidxl, dblaikie Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77877	2020-04-10 15:28:53 -07:00
Matt Morehouse	bef187c750	Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang Summary: This commit adds two command-line options to clang. These options let the user decide which functions will receive SanitizerCoverage instrumentation. This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing. Patch by Yannis Juglaret of DGA-MI, Rennes, France libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy. Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions. The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. 
The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`. Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists: ``` # (a) src:* fun:* # (b) src:SRC/* fun:* # (c) src:SRC/src/woff2_dec.cc fun:* ``` Running the built fuzzers shows how many instrumentation points the compiler adds: the fuzzer will output "XXX PCs". Whitelist (a) is the instrument-everything whitelist; it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points. For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s. We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. 
This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction. The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise. Reviewers: kcc, morehouse, vitalybuka Reviewed By: morehouse, vitalybuka Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D63616	2020-04-10 10:44:03 -07:00
Mircea Trofin	f62335b534	[llvm][NFC] Style fixes in Inliner.cpp Summary: Function names: camel case, lower case first letter. Variable names: start with upper letter. For iterators that were 'i', renamed with a descriptive name, as 'I' is 'Instruction&'. Lambda captures simplification. Opportunistic boolean return simplification. Reviewers: davidxl, dblaikie Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77837	2020-04-10 08:04:39 -07:00
Ilya Leoshkevich	3bc439bdff	[MSan] Add instrumentation for SystemZ Summary: This patch establishes memory layout and adds instrumentation. It does not add runtime support and does not enable MSan, which will be done separately. Memory layout is based on PPC64, with the exception that XorMask is not used - low and high memory addresses are chosen in a way that applying AndMask to low and high memory produces non-overlapping results. VarArgHelper is based on AMD64. It might be tempting to share some code between the two implementations, but we need to keep in mind that all the ABI similarities are coincidental, and therefore any such sharing might backfire. copyRegSaveArea() indiscriminately copies the entire register save area shadow, however, fragments thereof not filled by the corresponding visitCallSite() invocation contain irrelevant data. Whether or not this can lead to practical problems is unclear, hence a simple TODO comment. Note that the behavior of the related copyOverflowArea() is correct: it copies only the vararg-related fragment of the overflow area shadow. VarArgHelper test is based on the AArch64 one. s390x ABI requires that arguments are zero-extended to 64 bits. This is particularly important for __msan_maybe_warning_() and __msan_maybe_store_origin_() shadow and origin arguments, since non zeroed upper parts thereof confuse these functions. Therefore, add ZExt attribute to the corresponding parameters. Add ZExt attribute checks to msan-basic.ll. Since with -msan-instrumentation-with-call-threshold=0 instrumentation looks quite different, introduce the new CHECK-CALLS check prefix. Reviewers: eugenis, vitalybuka, uweigand, jonpa Reviewed By: eugenis Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, stefansf, Andreas-Krebbel Tags: #llvm Differential Revision: https://reviews.llvm.org/D76624	2020-04-10 16:53:49 +02:00
Christopher Tetreault	3bebf02861	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77262	2020-04-10 07:47:19 -07:00
Florian Hahn	1a02aaeaa4	[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef. For non-integer constants/expressions and overdefined, I think we can just use SimplifyBinOp to do common folds. By just passing a context with the DL, SimplifyBinOp should not try to get additional information from looking at definitions. For overdefined values, it should be enough to just pass the original operand. Note: The comment before the `if (isconstant(V1State)...` was wrong originally: isConstant() also matches integer ranges with a single element. It is correct now. Reviewers: efriedma, davide, mssimpso, aartbik Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76459	2020-04-10 11:02:57 +01:00
John McCall	8423a6f363	Rename OptimalLayout to OptimizedStructLayout at Chris's request.	2020-04-10 00:14:20 -04:00
Max Kazantsev	4e87823026	[LoopLoadElim] Fix crash by always checking simplify form Loop simplify form should always be checked because logic of propagateStoredValueToLoadUsers relies on it (in particular, it requires preheader). Reviewed By: Fedor Sergeev, Florian Hahn Differential Revision: https://reviews.llvm.org/D77775	2020-04-10 09:23:28 +07:00
Mircea Trofin	655aa1ae4a	[llvm][NFC] Replace CallSite with CallBase in Inliner Summary: Almost all uses are replaced. Left FIXMEs for the two sites that require refactoring outside of Inliner, to scope this patch. Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77817	2020-04-09 15:01:58 -07:00
Christopher Tetreault	19cc9b9ded	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: efriedma, sdesmalen, rriddle Reviewed By: sdesmalen Subscribers: hiraditya, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77261	2020-04-09 14:59:14 -07:00
Christopher Tetreault	00a1032412	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: rriddle, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77260	2020-04-09 13:35:41 -07:00
Zequan Wu	eccfa35d53	Fix lifetime call in landingpad blocking Simplifycfg pass Fix lifetime call in landingpad blocks simplifycfg from removing the landingpad. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D77188	2020-04-09 13:07:32 -07:00
Gil Rapaport	e2a1867880	[LV] Add VPValue operands to VPBlendRecipe (NFCI) InnerLoopVectorizer's code called during VPlan execution still relies on original IR's def-use relations to decide which vector code to generate, limiting VPlan transformations ability to modify def-use relations and still have ILV generate the vector code. This commit introduces VPValues for VPBlendRecipe to use as the values to blend. The recipe is generated with VPValues wrapping the phi's incoming values of the scalar phi. This reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Differential Revision: https://reviews.llvm.org/D77539	2020-04-09 18:48:33 +03:00
Ayal Zaks	1678489234	[LV] FoldTail w/o Primary Induction Introduce a new VPWidenCanonicalIVRecipe to generate a canonical vector induction for use in fold-tail-with-masking, if a primary induction is absent. The canonical scalar IV having start = 0 and step = VF*UF, created during code-gen to control the vector loop, is widened into a canonical vector IV having start = {<Part*VF, Part*VF+1, ..., Part*VF+VF-1> for 0 <= Part < UF} and step = <VF*UF, VF*UF, ..., VF*UF>. Differential Revision: https://reviews.llvm.org/D77635	2020-04-09 17:45:23 +03:00
Sanjay Patel	812970edda	[InstCombine] replace undef in vector constant for safe shift transform (PR45447) As noted in PR45447, we have a vector-constant-with-undef-element transform bug: https://bugs.llvm.org/show_bug.cgi?id=45447 We replace undefs with a safe constant (0 or -1) based on the (non-)negative predicate constraint. So this is correct: http://volta.cs.utah.edu:8080/z/WZE36H ...but this is not: http://volta.cs.utah.edu:8080/z/boj8gJ Previously, we were relying on getSafeVectorConstantForBinop() in the related fold (D76800). But that's making an assumption about what qualifies as "safe", and that assumption may not always hold. Differential Revision: https://reviews.llvm.org/D77739	2020-04-09 08:00:46 -04:00
Anton Bikineev	9e1ccec8d5	tsan: don't instrument __attribute__((naked)) functions Naked functions are required to not have compiler generated prologues/epilogues, hence no instrumentation is needed for them. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45400 Differential Revision: https://reviews.llvm.org/D77477	2020-04-09 13:47:47 +02:00
Florian Hahn	a7efe06af0	[LV] Assert no DbgInfoIntrinsic calls are passed to widening (NFC). When building a VPlan, BasicBlock::instructionsWithoutDebug() is used to iterate over the instructions in a block. This means that no recipes should be created for debug info intrinsics already and we can turn the early exit into an assertion. Reviewers: Ayal, gilr, rengolin, aprantl Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D77636	2020-04-09 11:37:32 +01:00
Florian Hahn	9997ee23ed	[VPlan] Add & use VPValue operands for VPWidenCallRecipe (NFC). This patch adds VPValue versions for the arguments of the call to VPWidenCallRecipe and uses them during code-generation. Similar to D76373 this reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77655	2020-04-09 10:23:26 +01:00
Jay Foad	c63aed890e	[KnownBits] Move AND, OR and XOR logic into KnownBits Summary: There are at least three clients for KnownBits calculations: ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the common logic should be moved out of these clients and into KnownBits itself. This patch does this for AND, OR and XOR calculations by implementing and using appropriate operator overloads KnownBits::operator& etc. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74060	2020-04-09 10:10:37 +01:00
Serge Pavlov	c7ff5b38f2	[FPEnv] Use single enum to represent rounding mode Now the compiler defines 5 sets of constants to represent rounding mode. These are: 1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes defined by IEEE-754 and is used in `APFloat` implementation. 2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754 rounding modes and a special value for dynamic rounding mode. It is used in clang frontend. 3. `llvm::fp::RoundingMode`. Defines the same values as `clang::LangOptions::FPRoundingModeKind` but in different order. It is used to specify rounding mode in IR and in functions that operate on IR. 4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7). Besides constants for rounding mode it also uses a special value to indicate error. It is convenient to use in intrinsic functions, as it represents platform-independent representation for rounding mode. In this role it is used in some pending patches. 5. Values like `FE_DOWNWARD` and others, which specify rounding mode in library calls `fesetround` and `fegetround`. Often they represent bits of some control register, so they are target-dependent. The same names (not values) and a special name `FE_DYNAMIC` are used in `#pragma STDC FENV_ROUND`. The first 4 sets of constants are target independent and could have the same numerical representation. It would simplify conversion between the representations. Also now `clang::LangOptions::FPRoundingModeKind` and `llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding direction `roundTiesToAway`, although it is supported natively on some targets. This change defines all the rounding mode types via one `llvm::RoundingMode`, which also contains rounding mode for IEEE rounding direction `roundTiesToAway`. Differential Revision: https://reviews.llvm.org/D77379	2020-04-09 13:26:47 +07:00
Pratyai Mazumder	e8d1c6529b	[SanitizerCoverage] sancov/inline-bool-flag instrumentation. Summary: New SanitizerCoverage feature `inline-bool-flag` which inserts an atomic store of `1` to a boolean (which is an 8bit integer in practice) flag on every instrumented edge. Implementation-wise it's very similar to `inline-8bit-counters` features. So, much of wiring and test just follows the same pattern. Reviewers: kcc, vitalybuka Reviewed By: vitalybuka Subscribers: llvm-commits, hiraditya, jfb, cfe-commits, #sanitizers Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D77244	2020-04-08 22:43:52 -07:00
Vitaly Buka	8b1a6c0a57	[NFC][SanitizerCoverage] Simplify alignment calculation This reverts commit e42f2a0cd8b8007c816d0e63f5000c444e29105e.	2020-04-08 22:43:52 -07:00
Johannes Doerfert	cb0ecc5c33	[CallGraphUpdater] Remove dead constants before replacing a function Dead constants might be left when a function is replaced, we can gracefully handle this case and avoid complexity for the users who would see an assertion otherwise.	2020-04-08 22:52:46 -05:00
Craig Topper	f3d3cec648	[InstCombine] Avoid a call to deprecated version of CreateCall. Passing a Value * to CreateCall has to call getPointerElementType to find the type of the pointer. In this case we can rely on the fact that Intrinsic::getDeclaration returns a Function * and use that version of CreateCall.	2020-04-08 17:41:16 -07:00
Johannes Doerfert	0985554b70	[Attributor][NFC] Split AbstractAttributes out of Attributor.cpp Attributor.cpp became quite big and we need to start providing structure. The Attributor code is now in Attributor.cpp and the classes derived from AbstractAttribute are in AttributorAttributes.cpp. Minor changes were required but no intended functional changes. We also minimized includes as part of this. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76873	2020-04-08 19:02:14 -05:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Florian Hahn	bbbec71609	[DSE.MSSA] Only use callCapturesBefore for calls. callCapturesBefore always returns ModRef , if UseInst isn't a call. As we only call it if we already know Mod is set, this only destroys the Must bit for non-calls.	2020-04-08 15:12:33 +01:00
Florian Hahn	a6353fdf3b	[DSE,MSSA] Hoist getMemoryAccess call (NFC).	2020-04-08 15:10:05 +01:00
Sanjay Patel	a1c05fe20f	[InstCombine] exclude bitcast of ppc_fp128 in icmp signbit fold Based on the post-commit comments for rG0f56bbc, there might be a problem with this transform: (bitcast (fpext/fptrunc X)) to iX) < 0 --> (bitcast X to iY) < 0 ...and the ppc_fp128 data type, so conservatively bypass if we are bitcasting a ppc_fp128. We might be able to account for endian or other differences to enable this for PowerPC again if that is useful. Differential Revision: https://reviews.llvm.org/D77642	2020-04-08 08:56:19 -04:00
Max Kazantsev	7adb9e06fd	[LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM	2020-04-08 17:32:03 +07:00
Kazu Hirata	91eb442fde	[JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl Summary: ComputeValueKnownInPredecessorsImpl is the main folding mechanism in JumpThreading.cpp. To avoid potential infinite recursion while chasing use-def chains, it uses: DenseSet<std::pair<Value *, BasicBlock *>> &RecursionSet to keep track of Value-BB pairs that we've processed. Now, when ComputeValueKnownInPredecessorsImpl recursively calls itself, it always passes BB as is, so the second element is always BB. This patch simplifies the function by dropping "BasicBlock *" from RecursionSet. Reviewers: wmi, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77699	2020-04-07 18:37:36 -07:00
Eli Friedman	565b56a72c	[NFC] Clean up uses of LoadInst constructor.	2020-04-07 16:28:53 -07:00
Daniel Sanders	1adeeabb79	Add MIR-level debugify with only locations support for now Summary: Re-used the IR-level debugify for the most part. The MIR-level code then adds locations to the MachineInstrs afterwards based on the LLVM-IR debug info. It's worth mentioning that the resulting locations make little sense as the range of line numbers used in a Function at the MIR level exceeds that of the equivalent IR level function. As such, MachineInstrs can appear to originate from outside the subprogram scope (and from other subprogram scopes). However, it doesn't seem worth worrying about as the source is imaginary anyway. There's a few high level goals this pass works towards: * We should be able to debugify our .ll/.mir in the lit tests without changing the checks and still pass them. I.e. Debug info should not change codegen. Combining this with a strip-debug pass should enable this. The main issue I ran into without the strip-debug pass was instructions with MMO's and checks on both the instruction and the MMO as the debug-location is between them. I currently have a simple hack in the MIRPrinter to resolve that but the more general solution is a proper strip-debug pass. * We should be able to test that GlobalISel does not lose debug info. I recently found that the legalizer can be unexpectedly lossy in seemingly simple cases (e.g. expanding one instr into many). I have a verifier (will be posted separately) that can be integrated with passes that use the observer interface and will catch location loss (it does not verify correctness, just that there's zero lossage). It is a little conservative as the line-0 locations that arise from conflicts do not track the conflicting locations but it can still catch a fair bit. Depends on D77439, D77438 Reviewers: aprantl, bogner, vsk Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77446	2020-04-07 16:25:13 -07:00
Fangrui Song	d2ef8c1f2c	[ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker() dso_local leads to direct access even if the definition is not within this compilation unit (it is still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link. If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no direct access will be generated. The current behavior is benign, because -fpic does not assume dso_local (clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal). If we do that for -fno-semantic-interposition (D73865), there will be an R_X86_64_PC32 linker error without this patch. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74751	2020-04-07 15:46:01 -07:00
Florian Hahn	6aabb109be	[SCCP] Use ranges for predicate info conditions. This patch updates the code that deals with conditions from predicate info to make use of constant ranges. For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges: 1. The range of the original value. 2. The range imposed by the linked condition. 1. is known, 2. can be determined using makeAllowedICmpRegion. The intersection of those ranges is the range for the copy. With this patch, we get a nice increase in the number of instructions eliminated by both SCCP and IPSCCP for some benchmarks: For MultiSource, SPEC2000 & SPEC2006: Tests: 237 Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.NumInstRemoved Program base patch diff test-suite...Source/Benchmarks/sim/sim.test 10.00 71.00 610.0% test-suite...CFP2000/177.mesa/177.mesa.test 361.00 1626.00 350.4% test-suite...encode/alacconvert-encode.test 141.00 602.00 327.0% test-suite...decode/alacconvert-decode.test 141.00 602.00 327.0% test-suite...CI_Purple/SMG2000/smg2000.test 1639.00 4093.00 149.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 75.00 163.00 117.3% test-suite...T2006/401.bzip2/401.bzip2.test 358.00 513.00 43.3% test-suite...rks/FreeBench/pifft/pifft.test 11.00 15.00 36.4% test-suite...langs-C/unix-tbl/unix-tbl.test 4.00 5.00 25.0% test-suite...lications/sqlite3/sqlite3.test 541.00 667.00 23.3% test-suite.../CINT2000/254.gap/254.gap.test 243.00 299.00 23.0% test-suite...ks/Prolangs-C/agrep/agrep.test 25.00 29.00 16.0% test-suite...marks/7zip/7zip-benchmark.test 1135.00 1304.00 14.9% test-suite...lications/ClamAV/clamscan.test 1105.00 1268.00 14.8% test-suite...urce/Applications/lua/lua.test 398.00 436.00 9.5% Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...C/CFP2000/179.art/179.art.test 1.00 3.00 200.0% test-suite...006/447.dealII/447.dealII.test 429.00 1056.00 146.2% test-suite...nch/fourinarow/fourinarow.test 3.00 7.00 133.3% test-suite...CI_Purple/SMG2000/smg2000.test 818.00 1748.00 113.7% 
test-suite...ks/McCat/04-bisect/bisect.test 3.00 5.00 66.7% test-suite...CFP2000/177.mesa/177.mesa.test 165.00 255.00 54.5% test-suite...ediabench/gsm/toast/toast.test 18.00 27.00 50.0% test-suite...telecomm-gsm/telecomm-gsm.test 18.00 27.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 24.00 35.00 45.8% test-suite...TimberWolfMC/timberwolfmc.test 43.00 62.00 44.2% test-suite...encode/alacconvert-encode.test 46.00 66.00 43.5% test-suite...decode/alacconvert-decode.test 46.00 66.00 43.5% test-suite...langs-C/unix-tbl/unix-tbl.test 12.00 17.00 41.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 31.00 41.00 32.3% test-suite.../CINT2000/254.gap/254.gap.test 117.00 154.00 31.6% Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76611	2020-04-07 11:09:18 +01:00
Jun Ma	46bff786bc	[Coroutines] Remove alignment check in shouldBeMustTail Differential Revision: https://reviews.llvm.org/D77362	2020-04-07 09:07:34 +08:00
Eli Friedman	3f13ee8a00	[NFC] Modernize misc. uses of Align/MaybeAlign APIs. Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.	2020-04-06 17:53:04 -07:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type hierarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Vedant Kumar	5f185a8999	[AddressSanitizer] Fix for wrong argument values appearing in backtraces Summary: In some cases, ASan may insert instrumentation before function arguments have been stored into their allocas. This causes two issues: 1) The argument value must be spilled until it can be stored into the reserved alloca, wasting a stack slot. 2) Until the store occurs in a later basic block, the debug location will point to the wrong frame offset, and backtraces will show an uninitialized value. The proposed solution is to move instructions which initialize allocas for arguments up into the entry block, before the position where ASan starts inserting its instrumentation. For the motivating test case, before the patch we see: ``` \| 0033: movq %rdi, 0x68(%rbx) \| \| DW_TAG_formal_parameter \| \| ... \| \| DW_AT_name ("a") \| \| 00d1: movq 0x68(%rbx), %rsi \| \| DW_AT_location (RBX+0x90) \| \| 00d5: movq %rsi, 0x90(%rbx) \| \| ^ not correct ... \| ``` and after the patch we see: ``` \| 002f: movq %rdi, 0x70(%rbx) \| \| DW_TAG_formal_parameter \| \| \| \| DW_AT_name ("a") \| \| \| \| DW_AT_location (RBX+0x70) \| ``` rdar://61122691 Reviewers: aprantl, eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77182	2020-04-06 15:59:25 -07:00
Daniel Sanders	15f7bc7857	Add option to limit Debugify to locations (omitting variables) Summary: It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE around. In particular, because DEBUG_VALUE has the potential to change CodeGen behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally don't. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77438	2020-04-06 15:04:55 -07:00
Kirill Naumov	3f995ce8b5	[CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo The patch introduces the system to distinctively store the information needed for the Control Flow Graph as well as the instrumentary needed for the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo. The patch is a part of sequence of three patches, related to graphs Heat Coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D76820	2020-04-06 17:42:54 +00:00
Florian Hahn	7aba6a0333	[LV] Fix value that could be read uninitialized. This should fix http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/18569	2020-04-06 17:54:50 +01:00
Florian Hahn	90be3c24a7	[VPlan] Introduce new VPWidenCallRecipe (NFC). This patch moves calls to their own recipe, to simplify the transition to VPUser for operands of VPWidenRecipe, as discussed in D76992. Subsequently additional information can be added to the recipe rather than computing it during the execute step. Reviewers: rengolin, Ayal, gilr, hsaito Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77467	2020-04-06 16:07:37 +01:00
Guillaume Chatelet	808286342a	[Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set. Summary: In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined. This allows us to work backward here and to infer that `getNewAlignment` does not need to return `0` in case of failure. Returning `1` also works since it needs to be greater than the Load/Store alignment, which is at least `1`. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77538	2020-04-06 14:54:57 +00:00
Florian Hahn	6babae74c7	[Matrix] Update load/storeMatrix to take indices as Value* (NFC). This allows using the functions to be used with loop dependent indices.	2020-04-06 14:48:48 +01:00
Guillaume Chatelet	ff858d7781	[Alignment][NFC] Add DebugStr and operator* Summary: This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately) Differences from D77394: - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)` - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll) - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum) Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77537	2020-04-06 12:09:45 +00:00
Florian Hahn	39f2d9aa81	[Matrix] Add option to use row-major matrix layout as default. This patch adds a -matrix-default-layout option which can be used to set the default matrix layout to row-major or column-major (default). The initial patch updates codegen for loads, stores, binary operators and matrix multiply. Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76325	2020-04-06 10:00:56 +01:00
Florian Hahn	d1fed7081d	[Matrix] Add initial tiling for load/multiply/store chains. This patch adds initial fusion for load/multiply/store chains of matrix operations. The patch contains roughly two parts: 1. Code generation for a fused load/multiply/store chain (LowerMatrixMultiplyFused). First, we ensure that both loads of the multiply operands do not alias the store. If they do, we create new non-aliasing copies of the operands. Note that this may introduce new basic block. Finally we process TileSize x TileSize blocks. That is: load tiles from the input operands, multiply and store them. 2. Identify fusion candidates & matrix instructions. As a first step, collect all instructions with shape info and fusion candidates (currently @llvm.matrix.multiply calls). Next, try to fuse candidates and collect instructions eliminated by fusion. Finally iterate over all matrix instructions, skip the ones eliminated by fusion and lower the rest as usual. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75566	2020-04-06 09:28:15 +01:00
Guillaume Chatelet	6000478f39	Revert "[Alignment][NFC] Add DebugStr and operator*" This reverts commit `1e34ab98fc`.	2020-04-06 07:55:25 +00:00
Guillaume Chatelet	1e34ab98fc	[Alignment][NFC] Add DebugStr and operator* Summary: Also updates files to use them. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77394	2020-04-06 07:12:46 +00:00
Tarindu Jayatilaka	b43b59fcc0	Expose `attributor-disable` to the new and old pass managers The new and old pass managers (PassManagerBuilder.cpp and PassBuilder.cpp) are exposed to an `extern` declaration of `attributor-disable` option which will guard the addition of the attributor passes to the pass pipelines. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76871	2020-04-05 22:29:34 -05:00
Anna Thomas	1d0f757904	[InlineFunction] Update metadata on loads that are return values This patch builds upon D76140 by updating metadata on pointer typed loads in inlined functions, when the load is the return value, and the callsite contains return attributes which can be updated as metadata on the load. Added test cases show this for nonnull, dereferenceable, dereferenceable_or_null Reviewed-By: jdoerfert Differential Revision: https://reviews.llvm.org/D76792	2020-04-05 14:50:10 -04:00
Sanjay Patel	538a8f0227	[InstCombine] convert bitcast-shuffle to vector trunc As discussed in D76983, that patch can turn a chain of insert/extract with scalar trunc ops into bitcast+extract and existing instcombine vector transforms end up creating a shuffle out of that (see the PhaseOrdering test for an example). Currently, that process requires at least this sequence: -instcombine -early-cse -instcombine. Before D76983, the sequence of insert/extract would reach the SLP vectorizer and become a vector trunc there. Based on a small sampling of public targets/types, converting the shuffle to a trunc is better for codegen in most cases (and a regression of that form is the reason this was noticed). The trunc is clearly better for IR-level analysis as well. This means that we can induce "spontaneous vectorization" without invoking any explicit vectorizer passes (at least a vector cast op may be created out of scalar casts), but that seems to be the right choice given that we started with a chain of insert/extract, and the backend would expand back to that chain if a target does not support the op. Differential Revision: https://reviews.llvm.org/D77299	2020-04-05 09:48:02 -04:00
Sanjay Patel	4036a0af24	[InstCombine] enhance freelyNegateValue() by handling 'not' This patch extends D77230. If we have a 'not' instruction inside a negated expression, we can ignore extra uses of that op because the negation has a one-to-one replacement: negate becomes increment. Alive2 examples of the test cases: http://volta.cs.utah.edu:8080/z/T5-u9P http://volta.cs.utah.edu:8080/z/eT89L6 Differential Revision: https://reviews.llvm.org/D77459	2020-04-05 09:16:19 -04:00
Stefanos Baziotis	f3dd3a66d3	[Attributor] AAUndefinedBehavior: Use AAValueSimplify in memory accessing instructions. Query AAValueSimplify on pointers in memory accessing instructions to take advantage of the constant propagation (or any other value simplification) of such values.	2020-04-05 02:46:26 +03:00
Florian Hahn	a2b18c5a08	[LV] Simplify tryToWiden as recipes are not re-used (NFC). After `49d00824bb`, VPWidenRecipe only stores a single instruction. tryToWiden can simply return the widen recipe, like other helpers in VPRecipeBuilder.	2020-04-04 18:30:50 +01:00
Nikita Popov	4ede730096	[InstCombine] Don't limit uses in eraseInstFromFunction() eraseInstFromFunction() adds the operands of the erased instructions, as those might now be dead as well. However, this is limited to instructions with less than 8 operands. This check doesn't make a lot of sense to me. As the instruction gets removed afterwards, I don't see a potential for anything overly pathological happening here (as we can only add those operands to the worklist once). The impact on CTMark is in the noise. We also have the same code in instruction sinking and don't limit the operand count there. Differential Revision: https://reviews.llvm.org/D77325	2020-04-04 18:37:30 +02:00
Luofan Chen	eec6d87626	[Attributor] Deduce attributes for non-exact functions This patch is based on D63312 and D63319. For now we create shallow wrappers for all functions that are IPO amendable. See also [this github issue](https://github.com/llvm/llvm-project/issues/172). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76404	2020-04-04 11:34:58 -05:00
Nikita Popov	6896d559f3	[VNCoercion] Use IRBuilderBase; NFC And remove include from header.	2020-04-04 12:44:50 +02:00
Nikita Popov	ebd5a1b049	[Reassociate] Use IRBuilderBase; NFC And remove now unnecessary IRBuilder.h include in header.	2020-04-04 12:34:16 +02:00
Nikita Popov	1055e9e3c8	[IVDescriptors] Remove IRBuilder.h include; NFC IVDescriptors.h itself does not reference IRBuilder at all. Move the include into transformation passes that do.	2020-04-04 12:07:57 +02:00
Sanjay Patel	ce97ce3a5d	[VectorCombine] try to form a better extractelement Extracting to the same index that we are going to insert back into allows forming select ("blend") shuffles and enables further transforms. Admittedly, this is a quick-fix for a more general problem that I'm hoping to solve by adding transforms for patterns that start with an insertelement. But this might resolve some regressions known to be caused by the extract-extract transform (although I have not gotten more details on those yet). In the motivating case from PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 The combination of subsequent instcombine and codegen transforms gets us this improvement: vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm4 vmovshdup %xmm1, %xmm3 ## xmm3 = xmm1[1,1,3,3] vaddps %xmm0, %xmm2, %xmm0 vaddps %xmm1, %xmm3, %xmm1 vshufps $200, %xmm4, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm4[0,3] vinsertps $177, %xmm1, %xmm0, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm1[2] --> vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm1 vaddps %xmm0, %xmm2, %xmm0 vshufps $200, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,3] Differential Revision: https://reviews.llvm.org/D76623	2020-04-03 13:55:13 -04:00
Roman Lebedev	7d572ef2dd	Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)" As discussed in post-commit review in https://reviews.llvm.org/D73501 if the goal of this is to help vectorizer, then we should actually be teaching vectorizer to do this, because right now this rewrite is still budget-limited, which isn't what we'd want. Additionally, while the rest of the patch series was universally profitable, this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171) exposing cost-modeling issues on ARM. So let's just back this particular patch out. Once there's an undo transform, this could be considered for reintegration. This reverts commit `44edc6fd2c`.	2020-04-03 20:15:04 +03:00
Matt Arsenault	57a55313c3	InstCombine: Reduce minnum/maxnum if inputs are casted	2020-04-03 11:57:25 -04:00
Guillaume Chatelet	1a584a8d50	[Alignment][NFC] Remove unused private functions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77297	2020-04-03 09:16:20 +00:00
OCHyams	9b56cc9361	[DebugInfo] Salvage debug info when sinking loop invariant instructions Reviewed By: vsk, aprantl, djtodoro Differential Revision: https://reviews.llvm.org/D77318	2020-04-03 09:19:26 +01:00
Hongtao Yu	88da019977	Fix a bug in the inliner that causes subsequent double inlining Summary: A recent change in the instruction simplifier enables a call to a function that just returns one of its parameters to be simplified as simply loading the parameter. This exposes a bug in the inliner where double inlining may be involved which in turn may cause compiler ICE when an already-inlined callsite is reused for further inlining. To put it simply, in a C program like the following, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner internal data structure to register the first() callsite for the call edge representing the third() call, therefore incurs a double inlining when both call edges are considered an inline candidate. I'm making a fix to break the inliner from reusing a callsite for new call edges. ``` void top() { int t = first(); second(t); } void second(int t) { t = third(t); fourth(t); } int third(int t) { return t; } ``` The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up.
We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e, which tells function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of inlined callsites is a bit different, but in theory the issue could happen there too. Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can probably be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification. Reviewers: wenlei, davidxl, tejohnson Reviewed By: wenlei, davidxl Subscribers: eraman, nikic, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76248	2020-04-02 21:08:05 -07:00
Jun Ma	9c6f32a0ff	[Coroutines] Simplify implementation using removePredecessor Differential Revision: https://reviews.llvm.org/D77035	2020-04-03 09:20:07 +08:00
Anna Thomas	bf7a16a768	[InlineFunction] Update valid return attributes at callsite within callee body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate valid attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. Also, this is valid only for attributes which are a property of a callsite and not those that are not dependent on the ABI, or a property of the call itself. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-04-02 14:13:12 -04:00
Sanjay Patel	f4448063cc	[InstCombine] try to reduce shuffle with bitcasted operand shuf (bitcast X), undef, Mask --> bitcast X' The 'inverse shuffles' test (shuf_bitcast_operand) is a pattern in the motivating examples from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 (see also D76727) We can deal with this class of patterns in generic instcombine because we are not creating any new shuffles, just a bitcast. Alive2 proof: http://volta.cs.utah.edu:8080/z/mwDUZf Differential Revision: https://reviews.llvm.org/D76844	2020-04-02 13:44:50 -04:00
Sanjay Patel	b6050ca181	[VectorCombine] transform bitcasted shuffle to narrower elements bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC' We do not attempt this in InstCombine because we do not want to change types and create new shuffle ops that are potentially not lowered as well as the original code. Here, we can check the cost model to see if it is worthwhile. I've aggressively enabled this transform even if the types are the same size and/or equal cost because moving the bitcast allows InstCombine to make further simplifications. In the motivating cases from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 ...this is enough to let instcombine and the backend eliminate the redundant shuffles, but we probably want to extend VectorCombine to handle the inverse pattern (shuffle-of-bitcast) to get that simplification directly in IR. Differential Revision: https://reviews.llvm.org/D76727	2020-04-02 13:30:22 -04:00
Sanjay Patel	12fcbcecff	[InstCombine] add tests for cmyk benchmark; NFC These are versions of a function that regressed with: rGf2fbdf76d8d0 That particular problem occurs with an instcombine-simplifycfg-instcombine sequence, but we can show that it exists within instcombine only with other variations of the pattern.	2020-04-02 13:00:46 -04:00
Benjamin Kramer	de8831934a	[LoopDataPrefetch] Remove unused include that's a layering violation	2020-04-02 17:46:10 +02:00
Benjamin Kramer	dffc503187	Revert "[SimplifyLibCalls] Erase replaced instructions" This reverts commit `2a77544ad5`. This introduces a use-after-free in Transforms/InstCombine/sincospi.ll. Found by asan.	2020-04-02 17:30:47 +02:00
Sanjay Patel	1008435f3d	Revert "[InstCombine] do not exclude min/max from icmp with casted operand fold" This reverts commit `f2fbdf76d8`. As noted in the post-commit thread: https://reviews.llvm.org/rGf2fbdf76d8d0 ...this can obscure a min/max pattern where the components have extra uses. We can show that the problem is independent of this change with a slightly modified source example, so this revert just delays/reduces the need to fix the real problem. We need to improve our analysis of negation or -- more generally -- subtraction using patches like D77230 or D68408.	2020-04-02 09:15:23 -04:00
Tyker	c00cb76274	[NFC] Split Knowledge retention and place it more appropriately Summary: Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils allows Queries and Transform/Utils to use Analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77171	2020-04-02 15:01:41 +02:00
Jonas Paulsson	36d4421f50	[LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop. This patch adds - New arguments to getMinPrefetchStride() to let the target decide on a per-loop basis if software prefetching should be done even with a stride within the limit of the hw prefetcher. - New TTI hook enableWritePrefetching() to let a target do write prefetching by default (defaults to false). - In LoopDataPrefetch: - A search through the whole loop to gather information before emitting any prefetches. This way the target can get information via new arguments to getMinPrefetchStride() and emit prefetches more selectively. Collected information includes: Does the loop have a call, how many memory accesses, how many of them are strided, how many prefetches will cover them. This is NFC to before as long as the target does not change its definition of getMinPrefetchStride(). - If a previous access to the same exact address was 'read', and the current one is 'write', make it a 'write' prefetch. - If two accesses that are covered by the same prefetch do not dominate each other, put the prefetch in a block that dominates both of them. - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop. - A SystemZ implementation of getMinPrefetchStride(). Review: Ulrich Weigand, Michael Kruse Differential Revision: https://reviews.llvm.org/D70228	2020-04-02 14:57:46 +02:00
Florian Hahn	a63b5c9e53	[CallSiteSplitting] Simplify isPredicateOnPHI & continue checking PHIs. As pointed out by @thakis, currently CallSiteSplitting bails out after checking the first PHI node. We should check all PHI nodes, until we find one where call site splitting is beneficial. This patch also slightly simplifies the code using BasicBlock::phis(). Reviewers: davidxl, junbuml, thakis Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D77089	2020-04-02 10:11:27 +01:00
Johannes Doerfert	bcd8009369	[Attributor] Use the proper context instruction in genericValueTraversal There was a TODO in genericValueTraversal to provide the context instruction and due to the lack of it users that wanted one just used something available. Unfortunately, using a fixed instruction is wrong in the presence of PHIs so we need to update the context instruction properly. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D76870	2020-04-01 22:20:47 -05:00
Johannes Doerfert	ac96c8fd85	[Attributor][FIX] Do not compute ranges for arguments of declarations This cannot be triggered right now, as far as I know, but it doesn't make sense to deduce a constant range on arguments of declarations. Exposed during testing of AAValueSimplify extensions.	2020-04-01 22:05:30 -05:00
Johannes Doerfert	54d6a608bf	[Attributor][NFC] Predetermine the module It could happen that we delete the first function in the SCC in the future so we should be careful accessing `Functions` after the manifest stage.	2020-04-01 21:56:17 -05:00
Johannes Doerfert	9e19693994	[Attributor] Derive better alignment for accessed pointers Use DL & ABI information for better alignment deduction, e.g., if a type is accessed and the ABI specifies an alignment requirement for such an access we can use it. This is based on a patch by @lebedev.ri and inspired by getBaseAlign in Loads.cpp. Depends on D76673. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D76674	2020-04-01 21:49:57 -05:00
Johannes Doerfert	b1c788d051	[Attributor][FIX] Prevent alignment breakage wrt. must-tail calls If we have a must-tail call the callee and caller need to have matching ABIs. Part of that is alignment which we might modify when we deduce alignment of arguments of either. Since we would need to keep them in sync, which is not as simple, we simply avoid deducing alignment for arguments of the must-tail caller or callee. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D76673	2020-04-01 21:40:07 -05:00
Johannes Doerfert	41f2a57d0b	[Attributor][NFC] Use a BumpPtrAllocator to allocate `AbstractAttribute`s We create a lot of AbstractAttributes and they live as long as the Attributor does. It seems reasonable to allocate them via a BumpPtrAllocator owned by the Attributor. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76589	2020-04-01 20:53:28 -05:00
Sanjay Patel	3d90048791	[InstCombine] enhance freelyNegateValue() by handling xor Negation is equivalent to bitwise-not + 1, so try to convert more subtracts into adds using this relationship: 0 - (A ^ C) => ((A ^ C) ^ -1) + 1 => A ^ ~C + 1 I doubt this will recover the regression noted in rGf2fbdf76d8d0, but seems like we're going to need to improve here and/or revive D68408? Alive2 proofs: http://volta.cs.utah.edu:8080/z/Re5tMU http://volta.cs.utah.edu:8080/z/An-uns Differential Revision: https://reviews.llvm.org/D77230	2020-04-01 15:05:13 -04:00
Jonathan Roelofs	1148f004fa	Fix PR45371: SeparateConstOffsetFromGEP clean up bookkeeping find() was altering the UserChain, even in cases where it subsequently discovered that the resulting constant was a 0. This confuses rebuildWithoutConstOffset() when it attempts to walk the chain later, since it is expected that the chain itself be a path down the use-def edges of an expression.	2020-04-01 12:38:15 -06:00
Nikita Popov	50a3e8738a	Revert "[InstCombine] Erase old instruction when replacing extractelements" This reverts commit `d40368fdb5`. llvm-clang-x86_64-expensive-checks-debian failure looks related.	2020-04-01 20:10:11 +02:00
Nikita Popov	2a77544ad5	[SimplifyLibCalls] Erase replaced instructions After RAUWing an instruction, also erase it. This makes sure we don't perform extra InstCombine iterations to clean up the garbage.	2020-04-01 20:00:10 +02:00
Uday Bondhugula	6ee11c3b0f	[NewGVN] Make NewGVN aware of aligned_alloc Make the New GVN pass aware of aligned_alloc. Depends on D76975. Differential Revision: https://reviews.llvm.org/D76976	2020-04-01 23:26:51 +05:30
Uday Bondhugula	4cf70af94f	[GVN] Make GVN aware of aligned_alloc Make the GVN pass aware of aligned_alloc. Depends on D76974. Differential Revision: https://reviews.llvm.org/D76975	2020-04-01 23:26:50 +05:30
Uday Bondhugula	c4499e3333	[Attributor] Make attributor aware of aligned_alloc for heap to stack conversion Make the attributor pass aware of aligned_alloc for converting heap allocations to stack ones. Depends on D76971. Differential Revision: https://reviews.llvm.org/D76974	2020-04-01 23:26:50 +05:30
Nikita Popov	d40368fdb5	[InstCombine] Erase old instruction when replacing extractelements As we are not returning the result of replaceInstUsesWith(), so we need to clean up ourselves. NFC apart from worklist order.	2020-04-01 19:55:28 +02:00
Nikita Popov	4b35c816ef	[InstCombine] Use replaceOperand() in div transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-04-01 19:55:00 +02:00
Benjamin Kramer	66b9f5f7f0	[GVNSink] Simplify code. NFC.	2020-04-01 13:13:00 +02:00
Cullen Rhodes	84aa6cf1a9	[Transforms][SROA] Promote allocas with mem2reg for scalable types Summary: Aggregate types containing scalable vectors aren't supported and as far as I can tell this pass is mostly concerned with optimisations on aggregate types, so the majority of this pass isn't very useful for scalable vectors. This patch modifies SROA such that mem2reg is run on allocas with scalable types that are promotable, but nothing else such as slicing is done. The use of TypeSize in this pass has also been updated to be explicitly fixed size. When invoking the following methods in DataLayout: * getTypeSizeInBits * getTypeStoreSize * getTypeStoreSizeInBits * getTypeAllocSize we now called getFixedSize on the resultant TypeSize. This is quite an extensive change with around 50 calls to these functions, and also the first change of this kind (being explicit about fixed vs scalable size) as far as I'm aware, so feedback welcome. A test is included containing IR with scalable vectors that this pass is able to optimise. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76720	2020-04-01 10:34:11 +00:00
Eli Friedman	ba4764c2cc	Fix leak in GVNSink introduced in D72467.	2020-03-31 16:21:27 -07:00
Evgenii Stepanov	f9471b0010	Fix MSan false positive due to select folding. Summary: Select folding in JumpThreading can create a conditional branch on a code path that did not have one in the original program. This is not a valid transformation in sanitize_memory functions. Note that JumpThreading does select folding in 3 different places. Two of them seem safe - they apply to a select instruction in a BB that ends with an unconditional branch to another BB, which (in turn) ends with a conditional branch or a switch with the same condition. Fixes PR45220. Reviewers: glider, dvyukov, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76332	2020-03-31 15:25:42 -07:00
Anna Thomas	58a05675da	Revert "[InlineFunction] Handle return attributes on call within inlined body" This reverts commit `28518d9ae3`. There is a failure in MsgPackReader.cpp when built with clang. It complains that "signext and zeroext" are incompatible. Investigating offline whether it is in fact UB in the MsgPackReader code.	2020-03-31 16:16:34 -04:00
Nikita Popov	b7fe795e5b	[InstCombine] Use replaceOperand() in some select transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-03-31 22:10:55 +02:00
Eli Friedman	1ee6ec2bf3	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Nikita Popov	c538c57d6d	[InstCombine] Use replaceOperand() in descaling To make sure the old operand gets DCEd. NFC apart from worklist order.	2020-03-31 22:05:53 +02:00
Nikita Popov	19df7fa892	[InstCombine] Erase old alloca in cast of alloca transform As we don't return the replaceInstUsesWith() result, we are responsible for erasing the instruction. NFC apart from worklist order.	2020-03-31 21:57:39 +02:00
Nikita Popov	87357808b8	[InstCombine] Use replaceOperand() in non zero phi transform To make sure the old operand gets DCEd. NFC apart from worklist order changes.	2020-03-31 21:54:21 +02:00
Nikita Popov	f3d4166368	[InstCombine] Report change in non zero phi transform We need to inform InstCombine (and transitively the pass manager) that we changed an instruction.	2020-03-31 21:52:40 +02:00
Anna Thomas	28518d9ae3	[InlineFunction] Handle return attributes on call within inlined body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate those attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. See added test cases. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-03-31 14:35:40 -04:00
Uday Bondhugula	dc817b2dea	[InstCombine] Deduce attributes for aligned_alloc in InstCombine Make InstCombine aware of the aligned_alloc library function. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Depends on D76970. Differential Revision: https://reviews.llvm.org/D76971	2020-03-31 23:17:28 +05:30

25091 Commits