llvm-project

Commit Graph

Author	SHA1	Message	Date
Philip Reames	104fa367ee	[TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC] This change completes the process of replacing OperandValueKind and OperandValueProperties, which were previously passed independently in this API, with a single container class which contains both. This is the change which motivated the whole sequence which preceded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through. I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possibly affected must show up in the diff. This let me audit each one closely.	2022-08-22 15:16:39 -07:00
Philip Reames	27d3321c4f	[TTI] Use OperandValueInfo in getMemoryOpCost client api [nfc] This removes the last use of OperandValueKind from the client side API, and (once this is fully plumbed through TTI implementation) allows use of the same properties in store costing as in arithmetic costing.	2022-08-22 11:26:31 -07:00
Philip Reames	274f86e7a6	[TTI] Remove OperandValueKind/Properties from getArithmeticInstrCost interface [nfc] This completes the client side transition to the OperandValueInfo version of this routine. Backend TTI implementations still use the prior versions for now.	2022-08-22 11:06:32 -07:00
Philip Reames	c42a5f1cc2	[TTI] Migrate getOperandInfo to OperandValueInfo [nfc] This is part of merging OperandValueKind and OperandValueProperties.	2022-08-22 10:19:02 -07:00
Philip Reames	5cd427106d	[TTI] Start process of merging OperandValueKind and OperandValueProperties [nfc] OperandValueKind and OperandValueProperties both provide facts about the operands of an instruction for purposes of cost modeling. We've discussed merging them several times; before I plumb through more flags, let's go ahead and do so. This change only adds the client side interface for getArithmeticInstrCost and makes a couple of minor changes in client code to prove that it works. Target TTI implementations still use the split flags. I'm deliberately splitting what could be one big change into a series of smaller ones so that I can lean on the compiler to catch errors along the way.	2022-08-22 09:48:15 -07:00
Max Kazantsev	e587199a50	[SCEV] Prove condition invariance via context, try 2 The initial implementation had requirements on positive/negative range crossings that were too weak. Not crossing zero with nuw is not enough for two reasons: - If ArLHS has negative step, it may turn from positive to negative without crossing 0 boundary from left to right (and crossing right to left doesn't count for unsigned); - If ArLHS crosses SINT_MAX boundary, it still turns from positive to negative; In fact we require that ArLHS always stays non-negative or negative, which can be enforced by the following set of preconditions: - both nuw and nsw; - positive step (looks liftable); Because of positive step, boundary crossing is only possible from left part to the right part. And because of no-wrap flags, it is guaranteed to never happen.	2022-08-22 14:31:19 +07:00
Ting Wang	d2d77e050b	[PowerPC][Coroutines] Add tail-call check with call information for coroutines Fixes #56679. Reviewed By: ChuanqiXu, shchenz Differential Revision: https://reviews.llvm.org/D131953	2022-08-21 22:20:40 -04:00
Simon Pilgrim	5263155d5b	[CostModel] Add CostKind argument to getShuffleCost Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future. Differential Revision: https://reviews.llvm.org/D132287	2022-08-21 10:54:51 +01:00
Kazu Hirata	258531b7ac	Remove redundant initialization of Optional (NFC)	2022-08-20 21:18:28 -07:00
Sanjay Patel	2981a94902	[EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0), part 2 Follow-up to `7f1262a322`. That patch avoided removing the call, but it still allowed the constant-folded result. This makes the behavior consistent with 1-arg libm folding: if the call potentially raises an exception, then we just bail out. It seems likely that there are other corner-cases like this, but the tests are incomplete, so we have lived with these discrepancies for a long time. This was untested before the constant folding was expanded in D127964.	2022-08-20 10:16:06 -04:00
Philip Reames	b0a2c48e9f	[tti] Consolidate getOperandInfo without OperandValueProperties copies [nfc]	2022-08-19 16:22:22 -07:00
Sanjay Patel	7f1262a322	[EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0) These may raise an error (set errno) as discussed in the post-commit comments for D127964, so we can't fold away the call and potentially alter that behavior.	2022-08-19 12:27:29 -04:00
Alexey Bataev	d53e245951	[COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC. Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to better estimate cost with immediate values. Part of D126885.	2022-08-19 07:33:00 -07:00
Max Kazantsev	f798c042f4	Revert "[SCEV] Prove condition invariance via context" This reverts commit `a3d1fb3b59`. Reverting until investigation of https://github.com/llvm/llvm-project/issues/57247 has concluded.	2022-08-19 21:02:06 +07:00
Michael Maitland	f29401fcdf	[LoopVectorize][LoopAccessAnalysis] add newline to debug message A debug message in `LoopAccessAnalysis` did not have a newline in it, causing printed debug messages to be formatted incorrectly. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132172	2022-08-18 13:44:05 -07:00
Florian Hahn	b8709a9d03	[LV] Support fixed order recurrences. If the incoming previous value of a fixed-order recurrence is a phi in the header, go through incoming values from the latch until we find a non-phi value. Use this as the new Previous, all uses in the header will be dominated by the original phi, but need to be moved after the non-phi previous value. At the moment, fixed-order recurrences are modeled as a chain of first-order recurrences. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D119661	2022-08-18 19:15:52 +01:00
Simon Pilgrim	fdec50182d	[CostModel] Replace getUserCost with getInstructionCost * Replace getUserCost with getInstructionCost, covering all cost kinds. * Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks. Original Patch by @samparker (Sam Parker) Differential Revision: https://reviews.llvm.org/D79483	2022-08-18 11:55:23 +01:00
Simon Pilgrim	b994f87184	[Analysis] CostModel.cpp - merge isa<IntrinsicInst> and dyn_cast<IntrinsicInst> checks Pulled out of D79483	2022-08-18 10:43:29 +01:00
Simon Pilgrim	1d522a39f7	[TTI] Remove getInstructionThroughput cost helper. Pulled out of D79483 - we can just as easily use getUserCost directly	2022-08-17 11:41:47 +01:00
Zain Jaffal	f61f99a105	[instcombine] Optimise for zero initialisation of product given fast flags are enabled Currently, clang ignores the 0 initialisation in finite math For example: ``` double f_prod = 0; double arr[1000]; for (size_t i = 0; i < 1000; i++) { f_prod *= arr[i]; } ``` Clang will ignore that `f_prod` is set to zero and it will generate assembly to iterate over the loop. Reviewed By: fhahn, spatel Differential Revision: https://reviews.llvm.org/D131672	2022-08-17 11:12:15 +01:00
Graham Hunter	70d35443dc	[LAA] Handle forked pointers with add/sub instructions Handle cases where a forked pointer has an add or sub instruction before reaching a select. Reviewed By: fhahn Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D130278	2022-08-17 09:51:13 +01:00
Simon Pilgrim	08d153d806	[ValueTracking] computeKnownBits - attempt to use a branch condition feeding a phi to improve known bits range (PR38280) If computeKnownBits encounters a phi node, and we fail to determine any known bits through direct analysis, see if the incoming value is part of a branch condition feeding the phi. Handle cases where icmp(IncomingValue PRED Constant) is driving a branch instruction feeding that phi node - at the moment this only handles EQ/ULT/ULE predicate cases as they are the most straightforward to handle and most likely for branch-loop 'max upper bound' cases - we can extend this if/when necessary. I investigated a more general icmp(LHS PRED RHS) KnownBits system, but the hard limits we put on value tracking depth through phi nodes meant that we were mainly catching constants anyhow. Fixes the pointless vectorization in PR38280 / Issue #37628 (excessive unrolling still needs handling though) Differential Revision: https://reviews.llvm.org/D131838	2022-08-16 16:54:44 +01:00
Max Kazantsev	ebabd6bf18	Return "[SCEV] Use context to strengthen flags of BinOps" This reverts commit `354fa0b480`. Returning as is. The patch was reverted due to a miscompile, but this patch is not causing it. This patch made it possible to infer some nuw flags in code guarded by a `false` condition, and then someone else managed to propagate the flag from dead code outside. Returning the patch to be able to reproduce the issue.	2022-08-16 14:12:36 +07:00
Craig Topper	ef8c34e954	[InstSimplify] sle on i1 also encodes implication We already support SGE, so the same logic should hold for SLE with the LHS and RHS swapped. I didn't see this in the wild. Just happened to walk past this code and thought it was odd that it was asymmetric in what condition codes it handled. Reviewed By: spatel, reames Differential Revision: https://reviews.llvm.org/D131805	2022-08-15 08:27:23 -07:00
Max Kazantsev	354fa0b480	Revert "[SCEV] Use context to strengthen flags of BinOps" This reverts commit `34ae308c73`. Our internal testing found a miscompile. Not sure if it's caused by this patch or it revealed something else. Reverting while investigating.	2022-08-15 18:51:59 +07:00
Wolfgang Pieb	7ddfb4dfeb	[Inlining] Introduce the function attribute "inline-max-stacksize" The value of the attribute is a size in bytes. It has the effect of suppressing inlining of functions whose stacksizes exceed the given value. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D129904	2022-08-12 11:07:18 -07:00
Max Kazantsev	a3d1fb3b59	[SCEV] Prove condition invariance via context Contextual knowledge may be used to prove invariance of some conditions. For example, in this case: ``` ; %len >= 0 guard(%iv = {start,+,1}<nuw> <s %len) guard(%iv = {start,+,1}<nuw> <u %len) ``` the 2nd check always fails if `start` is negative and always passes otherwise. It looks like there are more opportunities of this kind that are still to be implemented in the future. Differential Revision: https://reviews.llvm.org/D129753 Reviewed By: apilipenko	2022-08-12 14:23:35 +07:00
Mircea Trofin	3486b1b736	[mlgo][nfc] regalloc test model generator: prep for TFLite Casting operator to make TFLite happy. Reviewed By: yundiqian Differential Revision: https://reviews.llvm.org/D131584	2022-08-11 15:53:23 -07:00
Fangrui Song	57f334d817	[Support] Remove Log2 workaround for Android API level < 18 The function added by D9467 is unneeded. https://github.com/android/ndk/wiki/Changelog-r24 shows that the NDK has moved forward to at least a minimum target API of 19. Reviewed By: srhines Differential Revision: https://reviews.llvm.org/D131656	2022-08-11 17:39:41 +00:00
Kevin P. Neal	de64d0076e	[FPEnv][InstSimplify] Fix formatting error. My most recent change for D131607 had a formatting error that I didn't notice until after I committed it. Let me fix it now so changes to this file will be back-to-back from me.	2022-08-11 12:10:05 -04:00
Kevin P. Neal	7bdb010d7c	[FPEnv][InstSimplify] 0.0 - -X ==> X Another ticket split out of D107285, this extends the optimization of 0.0 - -X to just X when using constrained intrinsics and the optimization is allowed. If the negation of X is done with fsub then the match fails because of the lack of IR Matcher support for constrained intrinsics. While I'm here, remove some TODO notices since the work is no longer planned. Differential Revision: https://reviews.llvm.org/D131607	2022-08-11 11:35:33 -04:00
Martin Sebor	0dcfe7aa35	[InstCombine] Tighten up known library function signature tests (PR #56463) Replace a switch statement used to validate arguments to known library functions with a more consistent table-driven approach and tighten it up.	2022-08-10 14:15:46 -06:00
Simon Pilgrim	77d33f4c1b	[Analysis] Remove unused CostModelAnalysis::getInstructionCost helper. NFCI. Everything now uses TTI costs calls directly	2022-08-10 17:21:46 +01:00
Mohammed Nurul Hoque	30abc1a6a1	[ConstantFolding] Eliminate atan and atan2 calls From the opengroup specifications, atan2 may fail if the result underflows and atan may fail if the argument is subnormal, but we assume that does not happen and eliminate the calls if we can constant fold the result at compile-time. Differential Revision: https://reviews.llvm.org/D127964	2022-08-10 11:01:50 -04:00
Dinar Temirbulatov	cab6cd6834	[AArch64][LoopVectorize] Introduce trip count minimal value threshold to ignore tail-folding. After D121595 was committed, I noticed regressions associated with the vectorisation of small trip count loops by tail folding with scalable vectors. As a solution for those issues I propose to introduce the minimal trip count threshold value. Differential Revision: https://reviews.llvm.org/D130755	2022-08-09 22:10:17 +01:00
yundiqian	3edd8978c3	fix mlgo regalloc test model generation for tflite To move from TF C API to TFLite, we found that the argmax op in TFLite does not work for int64 inputs, so cast the int64 inputs to int32 inputs to make TFLite argmax op work Differential Revision: https://reviews.llvm.org/D131462	2022-08-09 12:36:28 -07:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
Kazu Hirata	e20d210eef	[llvm] Qualify auto (NFC) Identified with readability-qualified-auto.	2022-08-07 23:55:27 -07:00
Sanjay Patel	74b5e797d5	[InstSimplify] fold scalable vectors with over-shift splat constant to poison Fixes #56968	2022-08-07 16:26:05 -04:00
Sanjay Patel	8148c28fad	[ConstFolding] fix overzealous assert when converting FP half Fixes #56981	2022-08-07 13:34:51 -04:00
Kazu Hirata	a2d4501718	[llvm] Fix comment typos (NFC)	2022-08-07 00:16:14 -07:00
Kazu Hirata	c8e6ebd74e	Use value instead of getValue (NFC)	2022-08-06 11:21:39 -07:00
Vitaly Buka	8d2901d537	[NFC][Inliner] Add Load/Store handler This is an additional signal which may benefit sanitizers. Reviewed By: kda Differential Revision: https://reviews.llvm.org/D131129	2022-08-05 13:42:17 -07:00
Sanjay Patel	b63fc26d33	[InstSimplify] make uses of isImpliedCondition more efficient (NFCI) As suggested in the post-commit comments for `019d76196f`, this makes the usage symmetric with the 'and' patterns and should be more efficient.	2022-08-05 12:06:47 -04:00
Sanjay Patel	019d76196f	[InstSimplify] use isImpliedCondition() instead of semi-duplicated code We get a couple of improvements from recognizing swapped operand patterns that were not handled by the replicated code. This should also enable simplifying larger patterns as seen in issue #56653 and issue #56654, but that requires enhancements to isImpliedCondition() itself.	2022-08-05 10:59:09 -04:00
David Green	b2de84633a	[ConstProp] Don't fall through for poison constants on vctp and active_lane_mask. Given a poison constant as input, the dyn_cast to a ConstantInt would fail so we would fall through to the generic code that attempts to fold each element of the input vectors. The inputs to these intrinsics are not vectors though, leading to a compile time crash. Instead bail out properly for poison values by returning nullptr. This doesn't try to define what poison means for these intrinsics. Fixes #56945	2022-08-05 11:19:36 +01:00
Fangrui Song	7d6017fd31	[TTI] Change new getVectorInstrCost overload to use const reference after D131114 A const reference is preferred over a non-null const pointer. `Type *` is kept as is to match the other overload. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D131197	2022-08-04 15:16:51 -07:00
Sanjay Patel	8e7acb670b	[ValueTracking] improve readability in isImpliedCond helper functions; NFC This matches the caller code naming scheme and avoids the potentially confusing transition from left/right to A/B.	2022-08-04 17:43:31 -04:00
Sanjay Patel	657bfa364f	[ValueTracking] reduce code in isImpliedCondICmps; NFC This copies the implementation of the subsequent match with constants.	2022-08-04 17:03:42 -04:00
Mingming Liu	bc8f2f3649	[AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation. 1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method. 2) This patch also changes a few callsites (VectorCombine.cpp, SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method. 3) This is a split of D128302. Differential Revision: https://reviews.llvm.org/D131114	2022-08-04 12:58:25 -07:00
Arthur Eubanks	203296d642	[BoundsChecking] Fix merging of sizes BoundsChecking uses ObjectSizeOffsetEvaluator to keep track of the underlying size/offset of pointers in allocations. However, ObjectSizeOffsetVisitor (something ObjectSizeOffsetEvaluator uses to check for constant sizes/offsets) doesn't quite treat sizes and offsets the same way as BoundsChecking. BoundsChecking wants to know the size of the underlying allocation and the current pointer's offset within it, but ObjectSizeOffsetVisitor only cares about the size from the pointer to the end of the underlying allocation. This only comes up when merging two size/offset pairs. Add a new mode to ObjectSizeOffsetVisitor which cares about the underlying size/offset rather than the size from the current pointer to the end of the allocation. Fixes a false positive with -fsanitize=bounds. Reviewed By: vitalybuka, asbirlea Differential Revision: https://reviews.llvm.org/D131001	2022-08-03 17:21:19 -07:00
Vitaly Buka	a2aa6809a8	[NFC][Inliner] Add cl::opt<int> to tune InstrCost The plan is tune this for sanitizers. Differential Revision: https://reviews.llvm.org/D131123	2022-08-03 17:14:10 -07:00
Congzhe Cao	76be554931	[DependenceAnalysis][PR56275] Normalize negative dependence analysis results This patch is the first of the two-patch series (D130188, D130179) that resolve PR56275 (https://github.com/llvm/llvm-project/issues/56275) which is a missed opportunity, where a perfectly valid case for loop interchange failed interchange legality. If the distance/direction vector produced by dependence analysis (DA) is negative, it needs to be normalized (reversed). This patch provides helper functions `isDirectionNegative()` and `normalize()` in DA that do the normalization, and clients can query DA to do normalization if needed. A pass option `<normalized-results>` is added to DependenceAnalysisPrinterPass, and we leverage it to update DA test cases to ensure test coverage. The test cases added in `Banerjee.ll` show that negative vectors are normalized with `print<da><normalized-results>`. Reviewed By: bmahjour, Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D130188	2022-08-03 19:59:00 -04:00
Mircea Trofin	0cb9746a7d	[nfc][mlgo] Separate logger and training-mode model evaluator This just shuffles implementations and declarations around. Now the logger and the TF C API-based model evaluator are separate. Differential Revision: https://reviews.llvm.org/D131116	2022-08-03 16:20:28 -07:00
Vitaly Buka	26dd42705c	[NFC][Inliner] Simplify clamping in addCost	2022-08-03 14:54:37 -07:00
Vitaly Buka	e056e74dda	[NFC][inline] Add const to an argument	2022-08-03 13:20:47 -07:00
Johannes Reifferscheid	3e9e43b48e	Fix compiler error: init-statements in if/switch. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D131061	2022-08-03 11:36:41 +02:00
Johannes Reifferscheid	7ae5d00afa	Fix a stack overflow in ScalarEvolution. Unfortunately, this overflow is extremely hard to reproduce reliably (in fact, I was unable to do so). The issue is that: - getOperandsToCreate sometimes skips creating an SCEV for the LHS - then, createSCEV is called for the BinaryOp - ... which calls getNoWrapFlagsFromUB - ... which under certain circumstances calls isSCEVExprNeverPoison - ... which under certain circumstances requires the SCEVs of all operands For certain deep dependency trees, this causes a stack overflow. Reviewed By: bkramer, fhahn Differential Revision: https://reviews.llvm.org/D129745	2022-08-03 11:08:01 +02:00
Nikita Popov	b128e057c1	[AA] Make ModRefInfo a bitmask enum (NFC) Mark ModRefInfo as a bitmask enum, which allows using normal & and \| operators on it. This supersedes various functions like unionModRef() and intersectModRef(). I think this makes the code cleaner than going through helper functions... Differential Revision: https://reviews.llvm.org/D130870	2022-08-03 10:05:55 +02:00
Max Kazantsev	34ae308c73	[SCEV] Use context to strengthen flags of BinOps Sometimes SCEV cannot infer nuw/nsw from something as simple as ``` len in [0, MAX_INT] ... iv = phi(0, iv.next) guard(iv <s len) guard(iv <u len) iv.next = iv + 1 ``` just because flag strengthening only relies on definition and does not use local facts. This patch adds support for the simplest case: inference of flags of `add(x, constant)` if we can contextually prove that `x <= max_int - constant`. In case it has a negative CT impact, we can add an option to switch it off. I wouldn't expect that though. Differential Revision: https://reviews.llvm.org/D129643 Reviewed By: apilipenko	2022-08-03 14:08:57 +07:00
Paul Kirth	d434e40f39	[llvm][NFC] Refactor code to use ProfDataUtils In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860	2022-08-03 00:09:45 +00:00
David Sherwood	4ef9cb6c17	[AArch64][LoopVectorize] Disable tail-folding for SVE when loop has interleaved accesses If we have interleave groups in the loop we want to vectorise then we should fall back on normal vectorisation with a scalar epilogue. In such cases when tail-folding is enabled we'll almost certainly go on to create vplans with very high costs for all vector VFs and fall back on VF=1 anyway. This is likely to be worse than if we'd just used an unpredicated vector loop in the first place. Once the vectoriser has proper support for analysing all the costs for each combination of VF and vectorisation style, then we should be able to remove this. Added an extra test here: Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll Differential Revision: https://reviews.llvm.org/D128342	2022-08-02 09:52:33 +01:00
jacquesguan	e38af7ba95	[LV] Refactor getExtendedAddReductionCost to support extended reductions other than Add. Currently, the API getExtendedAddReductionCost is used to determine the cost of an extended Add reduction with an optional Mul. For Arm, this covers the relevant cases. But other targets, for example RISCV, support other kinds of extended reduction, such as FAdd. This patch makes the following changes: 1. Split getExtendedAddReductionCost into 2 new APIs: getExtendedReductionCost, which handles the extended reduction with an additional Opcode input; and getMulAccReductionCost, which handles the MLA cases of getExtendedAddReductionCost. 2. Refactor getReductionPatternCost, adding a constraint to make sure that getMulAccReductionCost only handles the reduction of Add + Mul. Differential Revision: https://reviews.llvm.org/D130868	2022-08-02 16:02:38 +08:00
Nikita Popov	4ec22ba9c8	[GlobalsAA] Remove unnecessary AAResultBase fallback (NFC) This is unnecessary, as AA result chaining is implemented at a higher level now.	2022-08-01 08:34:58 +02:00
Nikita Popov	5b1d10bda6	[AA] Drop setModAndRef() function (NFC) Without the "must" state, this function is pointless, because we can just directly create a ModRef instead.	2022-08-01 07:55:39 +02:00
Nikita Popov	34683c3e35	[MSSA] Fix expensive checks build	2022-08-01 07:28:52 +02:00
Nikita Popov	f96ea53e89	[AA] Do not track Must in ModRefInfo getModRefInfo() queries currently track whether the result is a MustAlias on a best-effort basis. The only user of this functionality is the optimized memory access type in MemorySSA -- which in turn has no users. Given that this functionality has not found a user since it was introduced five years ago (in D38862), I think we should drop it again. The context is that I'm working to separate FunctionModRefBehavior to track mod/ref for different location kinds (like argmem or inaccessiblemem) separately, and the fact that ModRefInfo also has an unrelated Must flag makes this quite awkward, especially as this means that NoModRef is not a zero value. If we want to retain the functionality, I would probably split getModRefInfo() results into a part that just contains the ModRef information, and a separate part containing a (best-effort) AliasResult. Differential Revision: https://reviews.llvm.org/D130713	2022-08-01 07:14:31 +02:00
Kazu Hirata	bf6021709a	Use drop_begin (NFC)	2022-07-31 15:17:09 -07:00
Sanjay Patel	02b3a35892	[InstSimplify] fold FP rounding intrinsic with rounded operand issue #56775 I rearranged the Thumb2 codegen test to avoid simplifying the chain of rounding instructions. I'm assuming the intent of the test is to verify lowering of each of those intrinsics.	2022-07-31 10:00:27 -04:00
Kazu Hirata	60db8d9b4e	Use nullptr instead of 0 (NFC) Identified with modernize-use-nullptr.	2022-07-30 10:35:48 -07:00
Nuno Lopes	d4b4747de5	ConstantFolding: fold OOB accesses to poison instead of undef	2022-07-30 15:20:32 +01:00
Nuno Lopes	fffabd5348	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-07-30 13:55:56 +01:00
Florian Hahn	214e2d8fe5	[SCEV] Avoid repeated proveNoSignedWrapViaInduction calls. At the moment, proveNoSignedWrapViaInduction may be called for the same AddRec a large number of times via getSignExtendExpr. This can have a severe compile-time impact for very loop-heavy code. If proveNoSignedWrapViaInduction failed to prove NSW the first time, it is unlikely to succeed on subsequent tries and the cost doesn't seem to be justified. This is the signed version of `8daa338297` / D130648. This can drastically improve compile-time in some excessive cases and also has a slightly positive compile-time impact on CTMark: NewPM-O3: -0.06% NewPM-ReleaseThinLTO: -0.04% NewPM-ReleaseLTO-g: -0.04% https://llvm-compile-time-tracker.com/compare.php?from=8daa338297d533db4d1ae8d3770613eb25c29688&to=aed126a196e7a5a9803543d9b4d6bdb233d0009c&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D130694	2022-07-29 09:15:03 +01:00
Liqiang Tao	d52e775b05	[llvm][ModuleInliner] Add inline cost priority for module inliner This patch introduces the inline cost priority into the module inliner, which uses the same computation as InlineCost. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D130012	2022-07-28 22:44:03 +08:00
Liqiang Tao	c113594378	Revert "[llvm][ModuleInliner] Add inline cost priority for module inliner" This reverts commit `bb7f62bbbd`.	2022-07-28 22:36:28 +08:00
Liqiang Tao	bb7f62bbbd	[llvm][ModuleInliner] Add inline cost priority for module inliner This patch introduces the inline cost priority into the module inliner, which uses the same computation as InlineCost. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D130012	2022-07-28 21:28:07 +08:00
Florian Hahn	8daa338297	[SCEV] Avoid repeated proveNoUnsignedWrapViaInduction calls. At the moment, proveNoUnsignedWrapViaInduction may be called for the same AddRec a large number of times via getZeroExtendExpr. This can have a severe compile-time impact for very loop-heavy code. On one particular workload, LSR takes ~51s without this patch, almost exclusively in proveNoUnsignedWrapViaInduction. With this patch, the time in LSR drops to ~0.4s. If proveNoUnsignedWrapViaInduction failed to prove NUW the first time, it is unlikely to succeed on subsequent tries and the cost doesn't seem to be justified. Besides drastically improving compile-time in some excessive cases, this also has a slightly positive compile-time impact on CTMark: NewPM-O3: -0.07% NewPM-ReleaseThinLTO: -0.08% NewPM-ReleaseLTO-g: -0.06 https://llvm-compile-time-tracker.com/compare.php?from=b435da027d7774c24cdb8c88d09f6b771e07fb14&to=f2729e33e8284b502f6c35a43345272252f35d12&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D130648	2022-07-28 10:02:19 +01:00
Max Kazantsev	2d1c6e0b44	[LAA] Remove block order sensitivity in LAA algorithm. PR56672 As test in PR56672 shows, LAA produces different results which lead to either positive or negative vectorization decisions depending on the order of blocks in loop. The exact reason of this is not clear to me, however this makes investigation of related bugs extremely complex. Current order of blocks in the loop is arbitrary. It may change, for example, if loop info analysis is dropped and recomputed. Seems that it interferes with LAA's logic. This patch chooses fixed traversal order of blocks in loops, making it RPOT. Note: this is not a fix for bug with incorrect analysis result. It just makes the answer more robust to make the investigation easier. Differential Revision: https://reviews.llvm.org/D130482 Reviewed By: aeubanks, fhahn	2022-07-28 13:36:56 +07:00
Paul Kirth	6e9bab71b6	Revert "[llvm][NFC] Refactor code to use ProfDataUtils" This reverts commit `300c9a7881`. We will reland once these issues are ironed out.	2022-07-27 21:38:11 +00:00
Paul Kirth	300c9a7881	[llvm][NFC] Refactor code to use ProfDataUtils In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860	2022-07-27 21:13:54 +00:00
Stanislav Mekhanoshin	0562cf442f	Allow data prefetch into non-default address space While experimenting with the LoopDataPrefetch pass, I found that it bails out on pointers in a non-zero address space. This patch adds the target callback to check if an address space is to be considered for prefetching. Default implementation still only allows address space 0, so this is NFCI. This does not currently affect any known targets, but seems to be generally useful for the future. Differential Revision: https://reviews.llvm.org/D129795	2022-07-27 10:01:26 -07:00
Martin Sebor	4447603616	[InstCombine] Fold strtoul and strtoull and avoid PR #56293 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D129224	2022-07-26 14:11:40 -06:00
Sanjay Patel	dcd09467b0	[InstSimplify] remove redundant calls to 'isImplied'; NFCI We already call the more general isImpliedCondition() (which calls isImpliedTrueByMatchingCmp() internally) from simplifyAndInst() and simplifyOrInst(). There was a difference visible with this change on a vector test before `a925bef70c`, but I can't find any gaps now.	2022-07-26 14:47:21 -04:00
Arthur Eubanks	2eade1dba4	[WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`. Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`. To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D128955	2022-07-26 08:01:08 -07:00
Sinan Lin	7a2f5dca09	[CodeMetrics] use hasOneLiveUse instead of hasOneUse while analyzing inlinable callsites It would be better for CodeMetrics to use hasOneLiveUse while analyzing static and called once callsites, since inline cost now uses hasOneLiveUse instead of hasOneUse to avoid overpessimization on dead constant cases (since this patch https://reviews.llvm.org/D109294). This change has no noticeable influence now, but it helps improve the accuracy of cost models of passes that use CodeMetrics. Reviewed By: fhahn, nikic Differential Revision: https://reviews.llvm.org/D130461	2022-07-26 13:46:19 +08:00
Augie Fackler	85063090e9	MemoryBuiltins: remove malloc-family funcs from list We no longer need specialized knowledge of these allocator functions in this file since we have the correct attributes available now. As far as I can tell the changes in the attributor tests are due to things getting more consistent on alloc-family once we remove the static list entries. The two test changes in NewGVN merit extra scrutiny: NewGVN appears to be _extremely_ sensitive to the inaccessiblememonly for reasons that are beyond me. As a result, I hand-enumerated all the attributes on allocation functions in those two tests instead of using -inferattrs. I assumed that the two -disable-simplify-libcalls tests there no longer are sensible since the function declaration now includes all the relevant attributes. Differential Revision: https://reviews.llvm.org/D130107	2022-07-25 17:29:01 -04:00
Benjamin Kramer	5fde785186	[ValueTracking] Fix unused variable warning in release builds. NFC	2022-07-25 13:28:32 +02:00
Peter Waller	f8919d2f7e	[NFC][GVN] Put phi-translation of 'add' behind a switch The code in this `#if 0` block appears to be a net benefit. Put it behind a switch defaulting to off to support experimentation and as a request for comment. The codegen impact of enabling this that I'm currently pursuing is that it allows PRE to take place more frequently, particularly in loops with second order recurrences. Preliminary experimental data: Across LNT on AArch64, 54 benchmarks are sped up by >1%, and 42 are regressed by >1%, the geomean (exec_time_enabled / exec_time_disabled) of these 96 "1% or greater significance" benchmarks is 0.991. For the full set of 770 benchmarks it's 0.998. There are two benchmarks which experience a >30% speedup, and the worst slowdown is ~12%, and for every benchmark with a slowdown there is a benchmark which is sped up by a greater factor. Differential Revision: https://reviews.llvm.org/D130241	2022-07-25 07:59:47 +00:00
Max Kazantsev	a053f35990	[SCEV][NFC][CT] Cheaper handling of guards in isBasicBlockEntryGuardedByCond Handle guards uniformly with assumes, rather than iterating through all block instructions in attempt to find them. Differential Revision: https://reviews.llvm.org/D129874 Reviewed By: nikic	2022-07-25 13:38:59 +07:00
Kazu Hirata	b5188591a0	[llvm] Remove redundaunt virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-24 21:50:35 -07:00
Kazu Hirata	acf648b5e9	Use llvm::less_first and llvm::less_second (NFC)	2022-07-24 16:21:29 -07:00
Sanjay Patel	a925bef70c	[ValueTracking] allow vector types in isImpliedCondition() The matching of constants assumed integers, but we can handle splat vector constants seamlessly with m_APInt.	2022-07-24 17:46:48 -04:00
Kazu Hirata	97718180d7	[Analysis] Remove a redundant return statement (NFC) Identified with readability-redundant-control-flow.	2022-07-23 11:35:19 -07:00
Malhar Jajoo	41958f76d8	[Costmodel] Add "type-based-intrinsic-cost" cli option This patch adds a command line flag to be able to test the type based cost-model analysis for Intrinsics. Differential Revision: https://reviews.llvm.org/D129109	2022-07-22 15:50:57 +01:00
Teresa Johnson	1dad6247d2	[MemProf] Add memprof metadata related analysis utilities Adds a number of utilities that are used to help create and update memprof related metadata. These will be used during profile matching and annotation, as well as by the inliner when updating the metadata. Also adds unit tests for the utilities. See also related RFCs: RFC: Sanitizer-based Heap Profiler [1] RFC: A binary serialization format for MemProf [2] RFC: IR metadata format for MemProf [3] (Note that the IR metadata format has changed from the RFC during implementation, as described in the preceding patch adding the basic metadata and verification support.) Depends on D128141. Differential Revision: https://reviews.llvm.org/D128854	2022-07-21 13:46:01 -07:00
Augie Fackler	62f48cadfd	MemoryBuiltins: accept non-TLI funcs with attribs as allocator funcs This allows us to accept annotations from out-of-tree languages (the example test is derived from Rust) so they can enjoy the benefits of LLVM's optimizations without requiring LLVM to have language-specific knowledge. Differential Revision: https://reviews.llvm.org/D123091	2022-07-21 15:31:16 -04:00
Augie Fackler	5a3e3675f6	MemoryBuiltins: start using properties of functions Prior to this change, we relied on the hard-coded list for all of the information performed by MemoryBuiltins. With this change, we're able to start relying on properties of functions described in attributes, which opens the door to out-of-tree compilers being able to describe their allocator functions to LLVM's optimizer logic without having to register their implementation details with LLVM. Differential Revision: https://reviews.llvm.org/D123090	2022-07-21 15:31:15 -04:00
Arthur Eubanks	04d398db46	[LoopAccessAnalysis] Simplify D119047 No need to add checks for every type per pointer that we couldn't create a check for the first time around, just the types that weren't successful. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D119376	2022-07-21 12:16:02 -07:00
David Sherwood	f15b6b2907	[AArch64] Add target hook for preferPredicateOverEpilogue This patch adds the AArch64 hook for preferPredicateOverEpilogue, which currently returns true if SVE is enabled and one of the following conditions (non-exhaustive) is met: 1. The "sve-tail-folding" option is set to "all", or 2. The "sve-tail-folding" option is set to "all+noreductions" and the loop does not contain reductions, 3. The "sve-tail-folding" option is set to "all+norecurrences" and the loop has no first-order recurrences. Currently the default option is "disabled", but this will be changed in a later patch. I've added new tests to show the options behave as expected here: Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll Differential Revision: https://reviews.llvm.org/D129560	2022-07-21 17:20:06 +01:00
Nikita Popov	1f69503107	[MemoryBuiltins] Add getReallocatedOperand() function (NFC) Replace the value-accepting isReallocLikeFn() overload with a getReallocatedOperand() function, which returns which operand is the one being reallocated. Currently, this is always the first one, but once allockind(realloc) is respected, the reallocated operand will be determined by the allocptr parameter attribute.	2022-07-21 14:54:16 +02:00
Nikita Popov	46e6dd84b7	[MemoryBuiltins] Remove isFreeCall() function (NFC) Remove isFreeCall() in favor of getFreedOperand(). Replace the two remaining uses with a getFreedOperand() != nullptr check, as they only care that something is getting freed. (The usage in DSE is correct as such. The allocator-related checks in CFLGraph look rather questionable in general.)	2022-07-21 14:44:23 +02:00
Nikita Popov	c81dff3c30	[MemoryBuiltins] Add getFreedOperand() function (NFCI) We currently assume in a number of places that free-like functions free their first argument. This is true for all hardcoded free-like functions, but with the new attribute-based design, the freed argument is supposed to be indicated by the allocptr attribute. To make sure we handle this correctly once allockind(free) is respected, add a getFreedOperand() helper which returns the freed argument, rather than just indicating whether the call frees some argument. This migrates most but not all users of isFreeCall() to the new API. The remaining users are a bit more tricky.	2022-07-21 12:39:35 +02:00
Nikita Popov	d144ae6e1b	[MemoryBuiltins] Default to trivial mapper in getAllocSize() (NFC) Default getAllocSize() to use the trivial mapper. Also switch from using std::function to function_ref. Furthermore, update the doc comment to point out a subtle difference between getAllocSize() and getObjectSize(): The latter may also return something for calls that return their argument (via "returned" attribute or special intrinsics like invariant groups).	2022-07-21 11:43:48 +02:00
Nikita Popov	235fb602ed	[MemoryBuiltins] Don't query TLI for non-pointer functions (NFC) Fetching allocation data for calls is a rather hot operation, and TLI lookups are slow. We can greatly reduce the number of calls for which TLI is queried by checking that they return a pointer value first, as this is a requirement for allocation functions anyway.	2022-07-21 11:28:36 +02:00
Nikita Popov	f45ab43332	[MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc Allow directly checking whether a given call is a removable allocation, instead of first checking whether it is an allocation.	2022-07-21 09:39:19 +02:00
Congzhe Cao	05ccde8023	[LoopCacheAnalysis] Fix a type mismatch problem in cost calculation There is a problem in loop cache analysis that the types of SCEV variables `Coeff` and `ElemSize` in function `isConsecutive()` may not match. The mismatch would cause SCEV failures when `Coeff` is multiplied with `ElemSize`. The fix in this patch is to extend the type of both `Coeff` and `ElemSize` to whichever is wider in those two variables. As a clean-up, duplicate calculations of `Stride` in `computeRefCost()` are then removed. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D128877	2022-07-21 01:57:05 -04:00
Schrodinger ZHU Yifan	304027206c	[ThinLTO] Support aliased GlobalIFunc Fixes https://github.com/llvm/llvm-project/issues/56290: when an ifunc is aliased in LTO, clang will attempt to create an alias summary; however, as ifunc is not included in the module summary, doing so will lead to crash. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129009	2022-07-20 15:30:38 -07:00
Philip Reames	f494f89b2a	[LAA] Fix latent missing check bug when mixing scalable and non-scalabe strides Noticed via inspection; to my knowledge, impossible to hit today. In theory, we could have a fixed stride check be analyzed, then a scalable one. With the old code, the scalable one would be silently dropped, and the runtime guard would go ahead with only the fixed one. This would be a miscompile.	2022-07-20 11:56:45 -07:00
Max Kazantsev	e0ccd190ae	[SCEV][NFC][CT] Do not waste time proving contextual facts for unreached loops and blocks In fact, in unreached code we can say that every fact is true. So do not waste time trying to do something smarter. Formally it's not an NFC because it may change query results in unreached code, but they won't have any impact on execution. Hypothetical CT boost expected but not measured in practice. Differential Revision: https://reviews.llvm.org/D129878	2022-07-20 19:02:28 +07:00
Chuanqi Xu	645d2dd3a9	Revert "Don't treat readnone call in presplit coroutine as not access memory" This reverts commit `57224ff4a6`. This commit may trigger crashes on some workloads. Revert it for clearness.	2022-07-20 17:00:58 +08:00
Chuanqi Xu	57224ff4a6	Don't treat readnone call in presplit coroutine as not access memory To solve the readnone problems in coroutines. See https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015 for details. According to the discussion, we decide to fix the problem by inserting isPresplitCoroutine() checks in different passes instead of wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes. In this direction, we might not be able to cover every case at first. Let's take a "find and fix" strategy. Reviewed By: nikic, nhaehnle, jyknight Differential Revision: https://reviews.llvm.org/D127383	2022-07-20 10:37:23 +08:00
Nikita Popov	534b9246a2	[LoopInfo] Allow cloning of callbr After D129288, callbr is safe to clone without special handling. This permits optimizations like loop unroll and loop unswitch on loops containing callbrs. Fixes https://github.com/llvm/llvm-project/issues/41834. Differential Revision: https://reviews.llvm.org/D129993	2022-07-19 09:57:28 +02:00
Max Kazantsev	51f837a680	[NFC] Introduce API to detect tokens penetrating LCSSA form Following discussion in PR56243, we need to somehow detect the situation when token values penetrate LCSSA form for transforms that require that it is maintained by all values (for example, to sustain use-def dominance invariants). This patch introduces a parameter to LCSSA checkers to control their ignorance about tokens. Differential Revision: https://reviews.llvm.org/D129983 Reviewed By: efriedma	2022-07-19 13:52:30 +07:00
Benjamin Kramer	4bd072c56b	[LAA] Fix the build with older versions of Clang llvm/lib/Analysis/LoopAccessAnalysis.cpp:916:12: error: no viable conversion from returned value of type 'SmallVector<[...], 2>' to function return type 'SmallVector<[...], (default) CalculateSmallVectorDefaultInlinedElements<T>::value aka 3>' return Scevs; ^~~~~	2022-07-18 14:01:47 +02:00
Graham Hunter	db8fcb2c25	[LAA] Add recursive IR walker for forked pointers This builds on the previous forked pointers patch, which only accepted a single select as the pointer to check. A recursive function to walk through IR has been added, which searches for either a loop-invariant or addrec SCEV. This will only handle a single fork at present, so selects of selects or a GEP with a select for both the base and offset will be rejected. There is also a recursion limit with a cli option to change it. Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D108699	2022-07-18 12:06:17 +01:00
Kazu Hirata	601b3a13de	[Analysis] Qualify auto variables in for loops (NFC)	2022-07-16 23:26:34 -07:00
Kazu Hirata	92a1b2afc8	[Analysis] Remove isArithmeticRecurrenceKind The last use was removed on Jul 30, 2021 in commit `9d35594993`.	2022-07-16 13:23:32 -07:00
Max Kazantsev	883e83d5fe	[NFC][SCEV] Rename variable to correspond its current meaning	2022-07-15 22:33:57 +07:00
Nikita Popov	2659e1bf4b	[SCEV] List all binops in getOperandsToCreate() Explicitly list all binops rather than having a default case. There were two bugs here: 1. U->getOpcode() was used instead of BO->Opcode, which means we used the logic for the wrong opcode in some cases. 2. SCEV construction does not support LShr. We should return unknown for it rather than recursing into the operands.	2022-07-15 17:08:48 +02:00
Florian Hahn	e7ec1746a6	[SCEV] Avoid creating unnecessary SCEVs for SelectInsts. After `675080a453`, we always create SCEVs for all operands of a SelectInst. This can cause notable compile-time regressions compared to the recursive algorithm, which only evaluates the operands if the select is in a form we can create a usable expression. This approach adds additional logic to getOperandsToCreate to only queue operands for selects if we will later be able to construct a usable SCEV. Unfortunately this introduces a bit of coupling between actual SCEV construction for selects and getOperandsToCreate, but I am not sure if there are better alternatives to address the regression mentioned for `675080a453`. This doesn't have any notable compile-time impact on CTMark. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D129731	2022-07-14 09:23:47 -07:00
Philip Reames	3bc09c7da5	[SCEVExpander] Allow udiv with isKnownNonZero(RHS) + add vscale case Motivation here is to unblock LSR's ability to use ICmpZero uses - the major effect of which is to enable count down IVs. The test changes reflect this goal, but the potential impact is much broader since this isn't a change in LSR at all. SCEVExpander needs(*) to prove that expanding the expression is safe anywhere the SCEV expression is valid. In general, we can't expand any node which might fault (or exhibit UB) unless we can either a) prove it won't fault, or b) guard the faulting case. We'd been allowing non-zero constants here; this change extends it to non-zero values. vscale is never zero. This is already implemented in ValueTracking, and this change just adds the same logic in SCEV's range computation (which in turn drives isKnownNonZero). We should common up some logic here, but let's do that in separate changes. (*) As an aside, "needs" is such an interesting word here. First, we don't actually need to guard this at all; we could choose to emit a select for the RHS of every udiv and remove this code entirely. Secondly, the property being checked here is way too strong. What the client actually needs is to expand the SCEV at some particular point in some particular loop. In the examples, the original urem dominates that loop and yet we completely ignore that information when analyzing legality. I don't plan to actively pursue either direction, just noting it for future reference. Differential Revision: https://reviews.llvm.org/D129710	2022-07-14 08:56:58 -07:00
Dawid Jurczak	d71128d97d	[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand> This patch is a follow-up to https://reviews.llvm.org/D129468 and addresses one of the comments from that review: https://reviews.llvm.org/D129468#3643295 Differential Revision: https://reviews.llvm.org/D129565	2022-07-14 17:22:32 +02:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Kazu Hirata	30d3f56e33	[Analysis] clang-format InlineAdvisor.cpp (NFC)	2022-07-13 13:38:50 -07:00
Max Kazantsev	30e33b4b81	[SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional	2022-07-13 18:54:25 +07:00
Peter Waller	8acf74fd56	[InstCombine][SVE] Bail out of isSafeToLoadUnconditionally for scalable types `isSafeToLoadUnconditionally` currently assumes sized types. Bail out for now. This fixes a TypeSize warning reachable from instcombine via (load (select cond, ptr, ptr)). Differential Revision: https://reviews.llvm.org/D129477	2022-07-13 10:07:36 +00:00
Dawid Jurczak	165240fe38	[NFC] Fix compile time regression seen on some benchmarks after `a630ea3003` commit The goal of this change is to fix most of the compile-time slowdown seen after commit `a630ea3003` on the lencod and sqlite3 benchmarks. There are 3 improvements included in this patch: 1. In getNumOperands, get the value directly from SmallNumOps when possible. 2. Inline getLargePtr by moving its definition to the header. 3. In TBAAStructTypeNode::getField, get all operands once instead of taking operands in a loop one by one. Differential Revision: https://reviews.llvm.org/D129468	2022-07-12 15:00:27 +02:00
Aiden Grossman	f3939dc509	[mlgo] Simplify autogenerated regalloc model Currently the autogenerated regalloc model will sometimes output an incorrect LR index to evict instead of the first LR with the mask set to 1. This trips an assertion within the MLRegallocAdvisor that the evicted LR has a mask of 1. This patch, made possible by https://reviews.llvm.org/D124565, simplifies the autogenerated model by taking away all unnecessary features and getting rid of the functions that were previously used to mix in all the necessary inputs so they wouldn't get pruned by the Tensorflow XLA AOT compiler. This is no longer necessary after the previously mentioned patch. This also fixes the nondeterministic behavior that is sometimes observed where the autogenerated model will simply output 0 instead of the correct index. Reviewed By: yundiqian Differential Revision: https://reviews.llvm.org/D129254	2022-07-11 13:23:31 -07:00
Mircea Trofin	24c6c35270	[mlgo] Don't provide default model URLs Pointed out in Issue #56432: the current reference models may not be quite friendly to open source projects. Their purpose is only illustrative - the expectation is that projects would train their own. To avoid unintentionally pulling such a model, made the URL cmake setting require explicit user setting. Differential Revision: https://reviews.llvm.org/D129342	2022-07-11 07:37:14 -07:00
David Sherwood	03fee6712a	[LoopVectorize] Add option to use active lane mask for loop control flow Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operations, such as masked loads and stores, etc. The loop itself is still controlled by comparing the canonical induction variable with the trip count. However, for some targets this is inefficient when it's cheap to use the mask itself to control the loop. This patch adds support for using the active lane mask for control flow by: 1. Generating the active lane mask for the next iteration of the vector loop, rather than the current one. If there are still any remaining iterations then at least the first bit of the mask will be set. 2. Extract the first bit of this mask and use this bit for the conditional branch. I did this by creating a new VPActiveLaneMaskPHIRecipe that sets up the initial PHI values in the vector loop pre-header. I've also made use of the new BranchOnCond VPInstruction for the final instruction in the loop region. Differential Revision: https://reviews.llvm.org/D125301	2022-07-11 13:46:55 +01:00
Nicolai Hähnle	ede600377c	ManagedStatic: remove many straightforward uses in llvm (Reapply after revert in `e9ce1a5880` due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.) Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 10:29:15 +02:00
Nicolai Hähnle	e9ce1a5880	Revert "ManagedStatic: remove many straightforward uses in llvm" This reverts commit `e6f1f06245`. Reverting due to a failure on the fuchsia-x86_64-linux buildbot.	2022-07-10 09:54:30 +02:00
Nicolai Hähnle	e6f1f06245	ManagedStatic: remove many straightforward uses in llvm Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 09:15:08 +02:00
Wenlei He	a78f436c3f	[Inliner] Make recusive inlinee stack size limit tunable For recursive callers, we want to be conservative when inlining callees with large stack size. We currently have a limit `InlineConstants::TotalAllocaSizeRecursiveCaller`, but that is hard coded. We found the current limit insufficient to suppress problematic inlining that bloats stack size for deep recursion. This change adds a switch to make the limit tunable as a mitigation. Differential Revision: https://reviews.llvm.org/D129411	2022-07-08 21:32:39 -07:00
Nikita Popov	d686ea32b1	[ConstantFolding] Guard against unfolded FP binop Check that the operation actually folded before trying to flush denormals. A minor variation of the pr33453 test exposed this with the FP binops marked as undesirable.	2022-07-08 17:45:33 +02:00
Nikita Popov	4a579abd9f	[GlobalsModRef] Don't override getModRefBehavior() for CallBase BasicAA will already call getModRefBehavior() on the Function of the CallBase if there are no operand bundles. This happens through getBestAAResults(), i.e. it is a recursive call that will query other AA providers, not just the BasicAA implementation. As such, there is no need to reimplement the same functionality in GlobalsModRef, a combination of BasicAA and GlobalsModRef already handles it. This does mean that this no longer works under -disable-basic-aa, but that's a testing only option.	2022-07-07 10:35:44 +02:00
Nikita Popov	f96cb66d19	[ValueTracking] Accept Instruction in isSafeToSpeculativelyExecute() (NFC) As constant expressions can no longer trap, it only makes sense to call isSafeToSpeculativelyExecute on Instructions, so limit the API to accept only them, rather than general Operators or Values.	2022-07-06 11:12:49 +02:00
Nikita Popov	8ee913d83b	[IR] Remove Constant::canTrap() (NFC) As integer div/rem constant expressions are no longer supported, constants can no longer trap and are always safe to speculate. Remove the Constant::canTrap() method and its usages.	2022-07-06 10:36:47 +02:00
Nikita Popov	935570b2ad	[ConstExpr] Don't create div/rem expressions This removes creation of udiv/sdiv/urem/srem constant expressions, in preparation for their removal. I've added a ConstantExpr::isDesirableBinOp() predicate to determine whether an expression should be created for a certain operator. With this patch, div/rem expressions can still be created through explicit IR/bitcode, forbidding them entirely will be the next step. Differential Revision: https://reviews.llvm.org/D128820	2022-07-05 15:54:53 +02:00
Nikita Popov	e4d1d0cc2c	[SCEV] Fix isImpliedViaMerge() with values from previous iteration (PR56242) When trying to prove an implied condition on a phi by proving it for all incoming values, we need to be careful about values coming from a backedge, as these may refer to a previous loop iteration. A variant of this issue was fixed in D101829, but the dominance condition used there isn't quite right: It checks that the value dominates the incoming block, which doesn't exclude backedges (values defined in a loop will usually dominate the loop latch, which is the incoming block of the backedge). Instead, we should be checking for domination of the phi block. Any values defined inside the loop will not dominate the loop header phi. Fixes https://github.com/llvm/llvm-project/issues/56242. Differential Revision: https://reviews.llvm.org/D128640	2022-07-05 15:31:23 +02:00
Nikita Popov	f93cd56262	[BPI] Avoid ConstantExpr::get() Use ConstantFoldBinaryOpOperands() instead, to prepare for the case where not all binary operators have a constant expression form. I believe this code actually intended to set OnlyIfReduced=true, however ConstantExpr::get() actually accepts a Flags argument at that position (and OnlyIfReducedTy as the next argument), so this ended up creating a constant expression with some random flag (probably exact or nuw depending on which).	2022-07-04 16:04:26 +02:00
Nikita Popov	4905bcac00	[ConstantFolding] Check return value of ConstantFoldInstOperandsImpl() This operation is fallible, but ConstantFoldConstantImpl() is not. If we fail to fold, we should simply return the original expression. I don't think this can cause any issues right now, but it becomes a problem once we make ConstantFoldInstOperandsImpl() not create a constant expression for everything it possibly could.	2022-07-04 14:19:59 +02:00
Chen Zheng	2c3784cff8	[SCEV] recognize llvm.annotation intrinsic Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D127835	2022-07-03 21:02:50 -04:00
Nuno Lopes	53dc0f1078	[NFC] Switch a few uses of undef to poison as placeholders for unreachble code	2022-07-03 14:34:03 +01:00
Nikita Popov	560e694d48	[AST] Don't assert instruction reads/writes memory (PR51333) This function is well-defined for an instruction that doesn't access memory (and thus trivially doesn't alias anything in the AST), so drop the assert. We can end up with a readnone call here if we originally created a MemoryDef for an indirect call, which was later replaced with a direct readnone call. Fixes https://github.com/llvm/llvm-project/issues/51333. Differential Revision: https://reviews.llvm.org/D127947	2022-07-01 17:04:48 +02:00
Nikita Popov	c8bd3e7825	[SCEV] Remove unnecessary pointer handling in BuildConstantFromSCEV (NFCI) Nowadays, we do not allow pointers in multiplies, and adds can only have a single pointer, which is also guaranteed to be last by complexity sorting. As such, we can somewhat simplify the treatment of pointer types.	2022-07-01 16:28:56 +02:00
Chen Zheng	758de0e931	[InstructionSimplify] handle denormal input for fcmp Handle denormal constant input for fcmp instructions based on the denormal handling mode. Reviewed By: spatel, dcandler Differential Revision: https://reviews.llvm.org/D128647	2022-07-01 03:51:28 -04:00
Nikita Popov	9ac386495d	[ConstExpr] Don't create insertvalue expressions In preparation for the removal in D128719, this stops creating insertvalue constant expressions (well, unless they are directly used in LLVM IR). Differential Revision: https://reviews.llvm.org/D128792	2022-07-01 09:23:28 +02:00
Fangrui Song	27abff670b	Remove unneeded cl::ZeroOrMore. NFC	2022-06-30 19:11:27 -07:00
Nikita Popov	0445c340ff	[ConstantFold] Support loads in ConstantFoldInstOperands() This allows all constant folding to happen through a single function, without requiring special handling for loads at each call-site. This may not be NFC because some callers currently don't do that special handling.	2022-06-30 12:18:15 +02:00

11816 Commits