llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	6ac32872ee	[Attributes] Replace doesAttrKindHaveArgument() (NFC) This is now the same as isIntAttrKind(), so use that instead, as it does not require manual maintenance. The naming is also more accurate in that both int and type attributes have an argument, but this method was only targeting int attributes. I initially wanted to tighten the AttrBuilder assertion, but we have some in-tree uses that would violate it.	2021-07-12 21:57:26 +02:00
Kazu Hirata	4f94121cce	[Analysis] Remove changeCondBranchToUnconditionalTo (NFC) The last use was removed on Jan 21, 2021 in commit `0895b836d7`.	2021-07-10 17:31:43 -07:00
Eli Friedman	882ee7fbd6	Fix buildbot regression from `9c4baf5`. Apparently ScalarEvolution::isImpliedCond tries to truncate a pointer in some obscure cases. Guard the code with a check for pointers.	2021-07-09 17:54:09 -07:00
Eli Friedman	9c4baf5101	[ScalarEvolution] Strictly enforce pointer/int type rules. Rules: 1. SCEVUnknown is a pointer if and only if the LLVM IR value is a pointer. 2. SCEVPtrToInt is never a pointer. 3. If any other SCEV expression has no pointer operands, the result is an integer. 4. If a SCEVAddExpr has exactly one pointer operand, the result is a pointer. 5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other pointer operands, the result is a pointer. 6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a pointer. 7. Otherwise, the SCEV expression is invalid. I'm not sure how useful rule 6 is in practice. If we exclude it, we can guarantee that ScalarEvolution::getPointerBase always returns a SCEVUnknown, which might be a helpful property. Anyway, I'll leave that for a followup. This is basically mop-up at this point; all the changes with significant functional effects have landed. Some of the remaining changes could be split off, but I don't see much point. Differential Revision: https://reviews.llvm.org/D105510	2021-07-09 17:29:26 -07:00
Nikita Popov	2e3f4694d6	[IR] Add GEPOperator::indices() (NFC) In order to mirror the GetElementPtrInst::indices() API. Wanted to use this in the IRForTarget code, and was surprised to find that it didn't exist yet.	2021-07-09 21:41:20 +02:00
Kevin P. Neal	52900486a1	[FPEnv][InstSimplify] Constrained FP support for NaN Currently InstructionSimplify.cpp knows how to simplify floating point instructions that have a NaN operand. It does not know how to handle the matching constrained FP intrinsic. This patch teaches it how to simplify so long as the exception handling is not "fpexcept.strict". Differential Revision: https://reviews.llvm.org/D103169	2021-07-09 11:26:28 -04:00
Martin Storsjö	e479777d3c	Revert "[ScalarEvolution] Fix overflow in computeBECount." This reverts commit `5b350183cd` (and also "[NFC][ScalarEvolution] Cleanup howManyLessThans.", `009436e9c1`, to make it apply). See https://reviews.llvm.org/D105216 for discussion on various miscompilations caused by that commit.	2021-07-09 14:26:48 +03:00
David Green	38c9a4068d	[TTI] Remove IsPairwiseForm from getArithmeticReductionCost This patch removes the IsPairwiseForm flag from the Reduction Cost TTI hooks, along with some accompanying code for pattern matching reductions from trees starting at extract elements. IsPairWise is now assumed to be false, which was the predominant way that the value was used from both the Loop and SLP vectorizers. Since the adjustments such as D93860, the SLP vectorizer has not relied upon this distinction between pairwise and non-pairwise reductions. This also removes some code that was detecting reduction trees starting from extract elements inside the costmodel. This case was double-counting costs though, adding the individual costs on the individual instruction _and_ the total cost of the reduction. Removing it changes the costs in llvm/test/Analysis/CostModel/X86/reduction.ll to not double count. The cost of reduction intrinsics is still tested through the various tests in llvm/test/Analysis/CostModel/X86/reduce-xyz.ll. Differential Revision: https://reviews.llvm.org/D105484	2021-07-09 11:51:16 +01:00
Bjorn Pettersson	472462c472	[NewPM] Consistently use 'simplifycfg' rather than 'simplify-cfg' There was an alias between 'simplifycfg' and 'simplify-cfg' in the PassRegistry. That was the original reason for this patch, which effectively removes the alias. This patch also replaces all occurrences of 'simplify-cfg' by 'simplifycfg'. Reason for choosing that form for the name is that it matches the DEBUG_TYPE for the pass, and the legacy PM name and also how it is spelled out in other passes such as 'loop-simplifycfg', and in other options such as 'simplifycfg-merge-cond-stores'. If for some reason the name should be changed to 'simplify-cfg' in the future, then I think such a renaming should be more widely done and not only impacting the PassRegistry. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D105627	2021-07-09 09:47:03 +02:00
Eli Friedman	009436e9c1	[NFC][ScalarEvolution] Cleanup howManyLessThans. In preparation for D104075. Some NFC cleanup, and some test coverage for planned changes.	2021-07-08 17:56:26 -07:00
Michael Liao	8c7ff9da90	[Metadata] Decorate methods with 'const'. NFC. - Minor coding style fix.	2021-07-08 14:11:14 -04:00
Eli Friedman	5b350183cd	[ScalarEvolution] Fix overflow in computeBECount. There are two issues with the current implementation of computeBECount: 1. It doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. 2. It doesn't account for the possibility that Stride is zero. If Delta is zero, the backedge is never taken; the value of Stride isn't relevant. To handle this, we have to make sure that the expression returned by computeBECount evaluates to zero. To deal with this, add two new checks: 1. Use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. 2. Use umax(Stride, 1) to handle the possibility that Stride is zero. Differential Revision: https://reviews.llvm.org/D105216	2021-07-08 10:09:55 -07:00
Eli Friedman	f5603aa050	[ScalarEvolution] Make sure getMinusSCEV doesn't negate pointers. Add a function removePointerBase that returns, essentially, S - getPointerBase(S). Use it in getMinusSCEV instead of actually subtracting pointers. Differential Revision: https://reviews.llvm.org/D105503	2021-07-07 10:27:10 -07:00
Eli Friedman	7ac1c7bead	Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Recommitting with fix to MemoryDepChecker::isDependent. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 12:16:05 -07:00
Eli Friedman	a6d081b2cb	Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers." This reverts commit `74d6ce5d5f`. Seeing crashes on buildbots in MemoryDepChecker::isDependent.	2021-07-06 11:17:13 -07:00
Sanjay Patel	4ec7c02197	[InstSimplify] fix bug in poison propagation for FP ops If any operand of a math op is poison, that takes precedence over general undef/NaN. This should not be visible with binary ops because it requires 2 constant operands to trigger (and if both operands of a binop are constant, that should get handled first in ConstantFolding).	2021-07-06 14:06:50 -04:00
Eli Friedman	74d6ce5d5f	[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 10:54:41 -07:00
Kerry McLaughlin	a7512401e5	[LV] Prevent vectorization with unsupported element types. This patch adds a TTI function, isElementTypeLegalForScalableVector, to query whether it is possible to vectorize a given element type. This is called by isLegalToVectorizeInstTypesForScalable to reject scalable vectorization if any of the instruction types in the loop are unsupported, e.g: int foo(__int128_t* ptr, int N) #pragma clang loop vectorize_width(4, scalable) for (int i=0; i<N; ++i) ptr[i] = ptr[i] + 42; This example currently crashes if we attempt to vectorize since i128 is not a supported type for scalable vectorization. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D102253	2021-07-06 13:06:21 +01:00
Sanjay Patel	3d3c0ed932	[InstSimplify] fold extractelement of splat with variable extract index We already have a fold for variable index with constant vector, but if we can determine a scalar splat value, then it does not matter whether that value is constant or not. We overlooked this fold in D102404 and earlier patches, but the fixed vector variant is shown in: https://llvm.org/PR50817 Alive2 agrees on that: https://alive2.llvm.org/ce/z/HpijPC The same logic applies to scalable vectors. Differential Revision: https://reviews.llvm.org/D104867	2021-07-05 08:19:40 -04:00
Paul Walker	287d39dd5a	[NFC] Fix a few whitespace issues and typos.	2021-07-04 11:49:58 +01:00
Roman Lebedev	fc150cecd7	[SimplifyCFG] simplifyUnreachable(): erase instructions iff they are guaranteed to transfer execution to unreachable This replaces the current ad-hoc implementation, by syncing the code from InstCombine's implementation in `InstCombinerImpl::visitUnreachableInst()`, with one exception that here in SimplifyCFG we are allowed to remove EH instructions. Effectively, this now allows SimplifyCFG to remove calls (iff they won't throw and will return), arithmetic/logic operations, etc. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D105374	2021-07-03 10:45:44 +03:00
Jacob Hegna	8cc8caa1b1	[MLGO] Update Oz model url.	2021-07-02 17:29:15 +00:00
Jacob Hegna	99f00635d7	Unpack the CostEstimate feature in ML inlining models. This change yields an additional 2% size reduction on an internal search binary, and an additional 0.5% size reduction on fuchsia. Differential Revision: https://reviews.llvm.org/D104751	2021-07-02 16:57:16 +00:00
Sanjay Patel	9eb613b2de	[InstSimplify] do not propagate poison from select arm to icmp user This is the cause of the miscompile in: https://llvm.org/PR50944 The problem has likely existed for some time, but it was made visible with: `5af8bacc94` ( D104661 ) handleOtherCmpSelSimplifications() assumed it can convert select of constants to bool logic ops, but that does not work with poison. We had a very similar construct in InstCombine, so the fix here mimics the fix there. The bug is in instsimplify, but I'm not sure how to reproduce it outside of instcombine. The reason this is visible in instcombine is because we have a hack (FIXME) to bypass simplification of a select when it has an icmp user: `955f125899/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (L2632)` So we get to an unusual case where we are trying to simplify an instruction that has an operand that would have already simplified if we had processed it in normal order. Differential Revision: https://reviews.llvm.org/D105298	2021-07-01 17:40:07 -04:00
Florian Hahn	dc4299a7f3	[BasicAA] Fix typo ScaleForGDC -> ScaleForGCD.	2021-07-01 09:58:38 +01:00
Florian Hahn	e6d22d0174	[BasicAA] Use separate scale variable for GCD. Use separate variable for adjusted scale used for GCD computations. This fixes an issue where we incorrectly determined that all indices are non-negative and returned noalias because of that. Follow up to `91fa3565da`.	2021-06-30 20:04:39 +01:00
Philip Reames	14d8f1546a	[SCEV] Fold (0 udiv %x) to 0 We have analogous rules in instsimplify, etc., but were missing the same in SCEV. The fold is near trivial, but came up in the context of a larger change.	2021-06-30 08:31:13 -07:00
Jacob Hegna	7b639f5095	[NFC] clang-format on InlineCost.cpp and InlineAdvisor.h.	2021-06-29 18:15:27 +00:00
Florian Hahn	91fa3565da	[BasicAA] Be more careful with modulo ops on VariableGEPIndex. (V * Scale) % X may not produce the same result for any possible value of V, e.g. if the multiplication overflows. This means we currently incorrectly determine NoAlias in some cases. This patch updates LinearExpression to track whether the expression has NSW and uses that to adjust the scale used for alias checks. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99424	2021-06-29 09:22:36 +01:00
Sanjay Patel	7414bbebc2	[Analysis] improve function signature checking for calloc This would crash later if we thought the parameters were valid for the standard library call as shown in: https://llvm.org/PR50846	2021-06-27 08:19:00 -04:00
Eli Friedman	8d5bf0709d	[NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion The implementation is identical, but it makes the semantics a bit more obvious.	2021-06-25 14:43:13 -07:00
Sanjay Patel	1076b6c4f0	[Analysis] use better version of getLibFunc to check for alloc/free calls There's no reason to use the weaker name-only analysis when we have a function prototype to check (in fact, we probably should not even have that name-only function exposed for general use, but removing it requires auditing all of the callers). The version of getLibFunc that takes a Function argument also does some prototype checking to make sure the arguments/return type match the expected signature of a real library call. This is NFC-intended because the code in MemoryBuiltins does its own function signature checking. For now, that means there may be some redundancy in the checking, but that should not be above the noise for compile-time. Ideally, we can move the checks to a single location. There's still a hole in the logic that allows the example in https://llvm.org/PR50846 to cause a compiler crash.	2021-06-25 12:14:07 -04:00
Florian Hahn	6478f3fb78	[SCEV] Support single-cond range check idiom in applyLoopGuards. This patch extends applyLoopGuards to detect a single-cond range check idiom that InstCombine generates. It extends applyLoopGuards to detect conditions of the form (-C1 + X < C2). InstCombine will create this form when combining two checks of the form (X u< C2 + C1) and (X >=u C1). In practice, this enables us to correctly compute tight trip count bounds for code as in the function below. InstCombine will fold the minimum iteration check created by LoopRotate with the user check (< 8). void unsigned_check(short *pred, unsigned width) { if (width < 8) { for (int x = 0; x < width; x++) pred[x] = pred[x] * pred[x]; } } As a consequence, LLVM creates dead vector loops for the code above, e.g. see https://godbolt.org/z/cb8eTcqET https://alive2.llvm.org/ce/z/SHHW4d Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104741	2021-06-25 10:24:40 +01:00
Sanjay Patel	50db987d59	[InstSimplify] move extract with undef index fold; NFC This puts it closer to the other undef query check and will avoid a potential ordering problem if we allow folding non-constant-int indexes.	2021-06-24 13:22:10 -04:00
Florian Hahn	121ecb05e7	[SCEV] Generalize MatchBinaryAddToConst to support non-add expressions. This patch generalizes MatchBinaryAddToConst to support matching (A + C1), (A + C2), instead of just matching (A + C1), A. The existing cases can be handled by treating non-add expressions A as A + 0. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104634	2021-06-24 12:16:15 +01:00
Carl Ritson	ae266e743c	[LVI] Remove recursion from getValueForCondition (NFCI) Convert getValueForCondition to a worklist model instead of using recursion. In pathological cases getValueForCondition recurses heavily. Stack frames are quite expensive on x86-64, and some operating systems (e.g. Windows) have relatively low stack size limits. Using a worklist avoids potential failures from stack overflow. Differential Revision: https://reviews.llvm.org/D104191	2021-06-24 09:58:22 +09:00
Eli Friedman	b12192f7cd	[ScalarEvolution] Clarify implementation of getPointerBase(). getPointerBase should only be looking through Add and AddRec expressions; other expressions either aren't pointers, or can't be looked through. Technically, this is a functional change. For a multiply or min/max expression, if they have exactly one pointer operand, and that operand is the first operand, the behavior here changes. Similarly, if an AddRec has a pointer-type step, the behavior changes. But that shouldn't be happening in practice, and we plan to make such expressions illegal.	2021-06-23 12:55:59 -07:00
Eli Friedman	fdaf304e0d	[NFC][ScalarEvolution] Fix SCEVNAryExpr::getType(). SCEVNAryExpr::getType() could return the wrong type for a SCEVAddExpr. Remove it, and add getType() methods to the relevant subclasses. NFC because nothing uses it directly, as far as I know; this is just future-proofing.	2021-06-23 12:55:59 -07:00
Nikita Popov	00d3f7cc3c	[LAA] Make getPointersDiff() API compatible with opaque pointers Make getPointersDiff() and sortPtrAccesses() compatible with opaque pointers by explicitly passing in the element type instead of determining it from the pointer element type. The SLPVectorizer result is slightly non-optimal in that unnecessary pointer bitcasts are added. Differential Revision: https://reviews.llvm.org/D104784	2021-06-23 18:44:34 +02:00
Sanjay Patel	656001e7b2	[ValueTracking] look through bitcast of vector in computeKnownBits This borrows as much as possible from the SDAG version of the code (originally added with D27129 and since updated with big endian support). In IR, we can test more easily for correctness than we did in the original patch. I'm using the simplest cases that I could find for InstSimplify: we computeKnownBits on variable shift amounts to see if they are zero or in range. So shuffle constant elements into a vector, cast it, and shift it. The motivating x86 example from https://llvm.org/PR50123 is also here. We computeKnownBits in the caller code, but we only check if the shift amount is in range. That could be enhanced to catch the 2nd x86 test - if the shift amount is known too big, the result is 0. Alive2 understands the datalayout and agrees that the tests here are correct - example: https://alive2.llvm.org/ce/z/KZJFMZ Differential Revision: https://reviews.llvm.org/D104472	2021-06-23 11:46:46 -04:00
Juneyoung Lee	5af8bacc94	[InstSimplify] Add more poison folding optimizations This adds more poison folding optimizations to InstSimplify. Since all binary operators propagate poison, these are fine. Also, the precondition of `select cond, undef, x` -> `x` is relaxed to allow the case when `x` is undef. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104661	2021-06-23 20:25:24 +09:00
Florian Hahn	adee485adf	[SCEV] Support signed predicates in applyLoopGuards. This adds handling for signed predicates, similar to how unsigned predicates are already handled. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104732	2021-06-23 10:21:05 +01:00
Joseph Huber	2662351e3b	[OpenMP] Add new OpenMP globalization functions to library info Summary: The changes to globalization introduced in D97680 created two new functions to push / pop shareable memory on the GPU, __kmpc_alloc_shared and __kmpc_free_shared. This patch adds these new runtime functions to the library info so they can be used by the HeapToStack attributor interface. This optimization replaces malloc / free pairs with stack memory if legal. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D102087	2021-06-22 13:23:05 -04:00
Florian Hahn	6c782e6eb0	[SCEV] Reduce code to handle predicates in applyLoopGuards (NFC). Hoist out common recurrence check and sink updating the map, to reduce the code required to support additional predicates.	2021-06-22 15:56:45 +01:00
Nikita Popov	e638a290f7	[ConstantFold] Delay fetching pointer element type Don't do this while stripping pointer casts, instead fetch it at the end. This improves compatibility with opaque pointers for the case where the base object is not opaque.	2021-06-22 15:51:00 +02:00
Florian Hahn	d17798823c	[SCEV] Retain AddExpr flags when subtracting a foldable constant. Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2. But we can retain flags under certain conditions: * Adding a smaller constant is NUW if the original AddExpr was NUW. * Adding a constant with the same sign and small magnitude is NSW, if the original AddExpr was NSW. This can improve results after using `SimplifyICmpOperands`, which may subtract one in order to use stricter predicates, as is the case for `isKnownPredicate`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104319	2021-06-22 11:27:51 +01:00
Nikita Popov	04395fd6cb	[ConstantFolding] Separate conditions in GEP evaluation (NFC) Handle the gep p, 0-v case separately, and not as part of the loop that ensures all indices are constant integers. Those two things are not really related.	2021-06-22 11:14:47 +02:00
Eli Friedman	8f3d16905d	[ScalarEvolution] Ensure backedge-taken counts are not pointers. A backedge-taken count doesn't refer to memory; returning a pointer type is nonsense. So make sure we always return an integer. The obvious way to do this would be to just convert the operands of the icmp to integers, but that doesn't quite work out at the moment: isLoopEntryGuardedByCond currently gets confused by ptrtoint operations. So we perform the ptrtoint conversion late for lt/gt operations. The test changes are mostly innocuous. The most interesting changes are more complex SCEV expressions of the form "((-1 * (ptrtoint i8* %ptr to i64)) + %ptr)". This is expected: we can't fold this to zero because we need to preserve the pointer base. The call to isLoopEntryGuardedByCond in howFarToZero is less precise because of ptrtoint operations; this shows up in the function pr46786_c26_char in ptrtoint.ll. Fixing it here would require more complex refactoring. It should eventually be fixed by future improvements to isImpliedCond. See https://bugs.llvm.org/show_bug.cgi?id=46786 for context. Differential Revision: https://reviews.llvm.org/D103656	2021-06-21 16:24:16 -07:00
Jacob Hegna	f86d1f99b3	Remove ML inlining model artifacts. They are not conducive to being stored in git. Instead, we autogenerate mock model artifacts for use in tests. Production models can be specified with the cmake flag LLVM_INLINER_MODEL_PATH. LLVM_INLINER_MODEL_PATH has two sentinel values: - download, which will download the most recent compatible model. - autogenerate, which will autogenerate a "fake" model for testing the model uptake infrastructure. Differential Revision: https://reviews.llvm.org/D104251	2021-06-21 17:38:09 +00:00
Eli Friedman	62ed024c74	[NFC][ScalarEvolution] Clean up ExitLimit constructors. Make all the constructors forward to one constructor. Remove redundant assertions.	2021-06-20 17:40:30 -07:00
Juneyoung Lee	09e8c0d5aa	[InstSimplify] icmp poison, X -> poison This adds a simple transformation from icmp with poison constant to poison. Comparing poison with something else is poison, so this is okay. https://alive2.llvm.org/ce/z/e8iReb https://alive2.llvm.org/ce/z/q4MurY	2021-06-20 15:39:07 +09:00
Tomas Matheson	1bcfa84ae9	Allow building for release with EXPENSIVE_CHECKS D97225 moved LazyCallGraph verify() calls behind EXPENSIVE_CHECKS, but verify() is defined for debug builds only so this had the unintended effect of breaking release builds with EXPENSIVE_CHECKS. Fix by enabling verify() for both debug and EXPENSIVE_CHECKS. Differential Revision: https://reviews.llvm.org/D104514	2021-06-19 17:02:11 +01:00
Eli Friedman	8a567e5f22	[ScalarEvolution] Fix pointer/int type handling converting select/phi to min/max. The old version of this code would blindly perform arithmetic without paying attention to whether the types involved were pointers or integers. This could lead to weird expressions like negating a pointer. Explicitly handle simple cases involving pointers, like "x < y ? x : y". In all other cases, coerce the operands of the comparison to integer types. This avoids the weird cases, while handling most of the interesting cases. Differential Revision: https://reviews.llvm.org/D103660	2021-06-17 14:05:12 -07:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seems to have been missing in that patch was to make sure that the rest of LLVM also handles that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions (typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Joachim Meyer	053dbb939d	Use `-cfg-func-name` value as filter for `-view-cfg`, etc. Currently the value is only used when calling `F->viewCFG()` which is missing out on its potential and usefulness. So I added the check to the printer passes as well. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D102011	2021-06-16 23:54:51 +02:00
Eli Friedman	27963ccf07	[NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI In preparation for D103660.	2021-06-16 12:32:32 -07:00
Sanjay Patel	ce95200b79	[InstSimplify] propagate poison through FP ops We already have this fold: fadd float poison, 1.0 --> poison ...via ConstantFolding, so this makes the behavior consistent if the other operand(s) are non-constant. The fold for undef was added before poison existed as a value/type in IR. This came up in D102673 / D103169 because we're trying to sort out the more complicated handling for constrained math ops. We should have the handling for the regular instructions done first, so we can build on that (or diverge as needed). Differential Revision: https://reviews.llvm.org/D104383	2021-06-16 11:31:58 -04:00
Roman Lebedev	a3113df219	[SCEV] PtrToInt on non-integral pointers is allowed As per (committed without review) @reames's rGac81cb7e6dde9b0890ee1780eae94ab96743569b change, we are now allowed to produce `ptrtoint` for non-integral pointers. This will unblock further unbreaking of SCEV regarding int-vs-pointer type confusion. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104322	2021-06-16 10:24:25 +03:00
Arthur Eubanks	9aa1428174	[InstSimplify] Treat invariant group insts as bitcasts for load operands We can look through invariant group intrinsics for the purposes of simplifying the result of a load. Since intrinsics can't be constants, but we also don't want to completely rewrite load constant folding, we convert the load operand to a constant. For GEPs and bitcasts we just treat them as constants. For invariant group intrinsics, we treat them as a bitcast. Relanding with a check for self-referential values. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D101103	2021-06-15 12:59:43 -07:00
spupyrev	0a0800c4d1	A post-processing for BFI inference The current implementation for computing relative block frequencies does not correctly handle control-flow graphs containing irreducible loops. This results in suboptimally generated binaries, whose perf can be up to 5% worse than optimal. To resolve the problem, we apply a post-processing step, which iteratively updates block frequencies based on the frequencies of their predecessors. This corresponds to finding the stationary point of the Markov chain by an iterative method aka "PageRank computation". The algorithm takes at most O(|E| * IterativeBFIMaxIterations) steps but typically converges faster. It is turned on by passing option `use-iterative-bfi-inference` and applied only for functions containing profile data and irreducible loops. Tested on SPEC06/17, where it is helping to get correct profile counts for one of the binaries (403.gcc). In prod binaries, we've seen a speedup of up to 2%-5% for binaries containing functions with hot irreducible loops. Reviewed By: hoy, wenlei, davidxl Differential Revision: https://reviews.llvm.org/D103289	2021-06-11 21:46:04 -07:00
Andrew Litteken	f6dea2e732	[IRSim] Strip out the findSimilarity call from the constructor Both doInitialize and runOnModule were running the entire analysis due to the actual work being done in the constructor. Strip it out here and only get the similarity during runOnModule. Author: lanza Reviewers: AndrewLitteken, paquette, plofti Differential Revision: https://reviews.llvm.org/D92524	2021-06-11 18:41:28 -05:00
Andrew Litteken	64720f57be	[IRSim] Don't copy the Mapper for createCandidatesFromSuffixTree Every invocation this was copying the Mapper for no reason. Take a const ref instead. Author: lanza Reviewers: AndrewLitteken, plofti, paquette, Differential Review: https://reviews.llvm.org/D92532	2021-06-11 16:36:23 -05:00
Simon Pilgrim	5e6bfb661e	[Analysis] Pass RecurrenceDescriptor as const reference. NFCI. We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.). Differential Revision: https://reviews.llvm.org/D104029	2021-06-11 10:24:14 +01:00
Philip Reames	7629b2a09c	[LI] Add a cover function for checking if a loop is mustprogress [nfc] Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.	2021-06-10 13:37:32 -07:00
Philip Reames	aaaeb4b160	[SCEV] Use mustprogress flag on loops (in addition to function attribute) This addresses a performance regression reported against `3c6e4191`. That change (correctly) limited a transform based on assumed finiteness to mustprogress loops, but the previous change (`38540d7`) which introduced the mustprogress check utility only handled function attributes, not the loop metadata form. It turns out that clang uses the function attribute form for C++, and the loop metadata form for C. As a result, `3c6e4191` ended up being a large regression in practice for C code as loops weren't being considered mustprogress despite the language semantics.	2021-06-10 13:20:28 -07:00
Philip Reames	b6ee5f2b1d	Move code for checking loop metadata into Analysis [nfc] I need the mustprogress loop metadata in ScalarEvolution and it makes sense to keep all the accessors for querying loop metadata together.	2021-06-10 13:01:22 -07:00
Serge Pavlov	8ff36aab69	[ConstantFolding] Enable folding of min/max/copysign for all floats Previously such folding was enabled for half, float and double values only. With this change it is allowed for other floating point values also. Differential Revision: https://reviews.llvm.org/D103956	2021-06-10 11:57:51 +07:00
Philip Reames	b65f30d6fb	[SCEV] Minor code motion to simplify a later patch [nfc]	2021-06-09 14:17:06 -07:00
Arthur Eubanks	222cce3828	Revert "[InstSimplify] Treat invariant group insts as bitcasts for load operands" This reverts commit `26044c6a54`. Breaks on invalid IR (see D101103).	2021-06-09 11:46:10 -07:00
Florian Hahn	b76f1f1202	[SCEV] Keep common NUW flags when inlining Add operands. Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr operands. As a consequence, we always drop flags when building expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`. We should be able to retain NUW flags common among all inlined SCEVAddExpr and the original flags. Reviewed By: nikic, mkazantsev Differential Revision: https://reviews.llvm.org/D103877	2021-06-09 17:13:21 +01:00
Artur Pilipenko	9197bac297	Add an option to hide "cold" blocks from CFG graph Introduce a new cl::opt to hide "cold" blocks from CFG DOT graphs. Use BFI to get block relative frequency. Hide the block if the frequency is below the threshold set by the command line option value. Reviewed By: davidxl, hoy Differential Revision: https://reviews.llvm.org/D103640	2021-06-08 11:29:27 -07:00
Caroline Concatto	6fd1604d14	[InstCombine] Add instcombine fold for extractelement + splat for scalable vectors This patch allows that scalable vector can also use the fold that already exists for fixed vector, only when the lane index is lower than the minimum number of elements of the vector. Differential Revision: https://reviews.llvm.org/D102404	2021-06-08 10:43:38 +01:00
Philip Reames	3c6e419198	[SCEV] Properly guard reasoning about infinite loops being UB on mustprogress Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location. Differential Revision: https://reviews.llvm.org/D103834	2021-06-07 14:47:36 -07:00
Daniil Suchkov	d32cc150fe	[BasicAA] Handle PHIs without incoming values gracefully Fix a bug introduced by `f6f6f6375d`. Now for empty PHIs, instead of crashing on assert(hasVal()) in Optional's internals, we'll return NoAlias, as we did before that patch. Differential Revision: https://reviews.llvm.org/D103831	2021-06-07 21:39:01 +00:00
Philip Reames	38540d71c7	[SCEV] Compute exit counts for unsigned IVs using mustprogress semantics The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be: for (unsigned i = 0; i < N; i += 2) { body; } Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body". The basic intuition behind this patch is as follows: * A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential corner cases where we exit after the n-th wrap through uint32_max. * Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this. In LLVM, this is tied to the mustprogress attribute. Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well-defined infinite loop. A couple notes for those reading along: * I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules. * We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path. Differential Revision: https://reviews.llvm.org/D103118	2021-06-07 11:24:00 -07:00
Simon Pilgrim	76a1be05fa	AssumeBundleQueries.cpp - don't dereference a dyn_cast<> result. NFCI. Use cast<> instead which will assert that the cast is correct and not just return null - the match() should have already failed if the cast isn't valid anyhow. Fixes static analysis warning.	2021-06-06 15:25:03 +01:00
Roman Lebedev	e350494fb0	[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper We might want to use it when creating SCEV proper in createSCEV(), now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`, which might have caused us to lose some optimization potential.	2021-06-05 12:17:51 +03:00
Fangrui Song	06e7de795b	Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build	2021-06-04 23:34:43 -07:00
Artur Pilipenko	a06e63fa52	NFC. Refactor DOTGraphTraits::isNodeHidden Restructure handling of cfg-hide-unreachable-paths and cfg-hide-deoptimize-paths options so as to make it easier to introduce new types of hidden blocks.	2021-06-03 11:27:06 -07:00
Qunyan Mangus	cbde248736	Add getDemandedBits for uses. Add getDemandedBits method for uses so we can query demanded bits for each use. This can help getting better use information. For example, for the code below define i32 @test_use(i32 %a) { %1 = and i32 %a, -256 %2 = or i32 %1, 1 %3 = trunc i32 %2 to i8 (didn't optimize this to 1 for illustration purpose) ... some use of %3 ret %2 } if we look at the demanded bit of %2 (which is all 32 bits because of the return), we would conclude that %a is used regardless of how its return is used. However, if we look at each use separately, we will see that the demanded bit of %2 in trunc only uses the lower 8 bits of %a which is redefined, therefore %a's usage depends on how the function return is used. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D97074	2021-06-02 10:07:40 -04:00
Daniil Fukalov	0195e594fe	[TTI] NFC: Change getIntImmCodeSizeCost to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102915	2021-06-02 16:04:11 +03:00
Bjorn Pettersson	9c54ee4378	[SimplifyLibCalls] Take size of int into consideration when emitting ldexp/ldexpf When rewriting powf(2.0, itofp(x)) -> ldexpf(1.0, x) exp2(sitofp(x)) -> ldexp(1.0, sext(x)) exp2(uitofp(x)) -> ldexp(1.0, zext(x)) the wrong type was used for the second argument in the ldexp/ldexpf libc call, for target architectures with 16 bit "int" type. The transform incorrectly used a bitcasted function pointer with a 32-bit argument when emitting the ldexp/ldexpf call for such targets. The fault is solved by using the correct function prototype in the call, by asking TargetLibraryInfo about the size of "int". TargetLibraryInfo by default derives the size of the int type by assuming that it is 16 bits for 16-bit architectures, and 32 bits otherwise. If this isn't true for a target it should be possible to override that default in the TargetLibraryInfo initializer. Differential Revision: https://reviews.llvm.org/D99438	2021-06-02 11:40:34 +02:00
Arthur Eubanks	8961293851	[OpaquePtr] Create API to make a copy of a PointerType with some address space Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103429	2021-06-01 16:52:32 -07:00
Arthur Eubanks	26044c6a54	[InstSimplify] Treat invariant group insts as bitcasts for load operands We can look through invariant group intrinsics for the purposes of simplifying the result of a load. Since intrinsics can't be constants, but we also don't want to completely rewrite load constant folding, we convert the load operand to a constant. For GEPs and bitcasts we just treat them as constants. For invariant group intrinsics, we treat them as a bitcast. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D101103	2021-06-01 16:33:06 -07:00
Eli Friedman	fd229caa01	[polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs. When we're remapping an AddRec, the AddRec constructed by a partial rewrite might not make sense. This triggers an assertion complaining it's not loop-invariant. Instead of constructing the partially rewritten AddRec, just skip straight to calling evaluateAtIteration. Testcase was automatically reduced using llvm-reduce, so it's a little messy, but hopefully makes sense. Differential Revision: https://reviews.llvm.org/D102959	2021-06-01 09:51:05 -07:00
Florian Hahn	aa00b1d763	[LV] Try to sink users recursively for first-order recurrences. Update isFirstOrderRecurrence to explore all uses of a recurrence phi and check if we can sink them. If there are multiple users to sink, they are all mapped to the previous instruction. Fixes PR44286 (and another PR or two). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D84951	2021-05-31 19:55:33 +01:00
Daniil Fukalov	e853d3b274	[NFC] MemoryDependenceAnalysis cleanup. 1. Removed redundant includes, 2. Removed never defined and used `releaseMemory()`. 3. Fixed member functions names first letter case. 4. Renamed duplicate (in nested struct `NonLocalPointerInfo`) name `NonLocalDeps` to `NonLocalDepsMap`. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102358	2021-05-31 18:07:55 +03:00
Roman Lebedev	f7c95c3322	[NFC] ScalarEvolution: apply SSO to the ExprValueMap value ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pair, and while the map itself will likely be big-ish (have many keys), it is a reasonable assumption that each key will refer to a small-ish number of pairs. In particular looking at n=512 case from https://bugs.llvm.org/show_bug.cgi?id=50384, the small-size of 4 appears to be the sweet spot, it results in the least allocations while minimizing memory footprint. ``` $ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i \| tail -n 6; echo ""; done heaptrack.opt.0-orig.gz total runtime: 14.32s. calls to allocation functions: 8222442 (574192/s) temporary memory allocations: 2419000 (168924/s) peak heap memory consumption: 190.98MB peak RSS (including heaptrack overhead): 239.65MB total memory leaked: 67.58KB heaptrack.opt.1-n1.gz total runtime: 13.72s. calls to allocation functions: 7184188 (523705/s) temporary memory allocations: 2419017 (176338/s) peak heap memory consumption: 191.38MB peak RSS (including heaptrack overhead): 239.64MB total memory leaked: 67.58KB heaptrack.opt.2-n2.gz total runtime: 12.24s. calls to allocation functions: 6146827 (502355/s) temporary memory allocations: 2418997 (197695/s) peak heap memory consumption: 163.31MB peak RSS (including heaptrack overhead): 211.01MB total memory leaked: 67.58KB heaptrack.opt.3-n4.gz total runtime: 12.28s. calls to allocation functions: 6068532 (494260/s) temporary memory allocations: 2418985 (197017/s) peak heap memory consumption: 155.43MB peak RSS (including heaptrack overhead): 201.77MB total memory leaked: 67.58KB heaptrack.opt.4-n8.gz total runtime: 12.06s. calls to allocation functions: 6068042 (503321/s) temporary memory allocations: 2418992 (200646/s) peak heap memory consumption: 166.03MB peak RSS (including heaptrack overhead): 213.55MB total memory leaked: 67.58KB heaptrack.opt.5-n16.gz total runtime: 12.14s. 
calls to allocation functions: 6067993 (499958/s) temporary memory allocations: 2418999 (199307/s) peak heap memory consumption: 187.24MB peak RSS (including heaptrack overhead): 233.69MB total memory leaked: 67.58KB ``` While that test may be an edge worst-case scenario, https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions agrees that this also results in improvements in the usual situations.	2021-05-31 15:34:03 +03:00
Sanjay Patel	7bb8bfa062	[InstCombine] fix miscompile from vector select substitution This is similar to the fix in `c590a9880d` ( PR49832 ), but we missed handling the pattern for select of bools (no compare inst). We can't substitute a vector value because the equality condition replacement that we are attempting requires that the condition is true/false for the entire value. Vector select can be partly true/false. I added an assert for vector types, so we shouldn't hit this again. Fixed formatting while auditing the callers. https://llvm.org/PR50500	2021-05-30 07:11:58 -04:00
Mindong Chen	71acce68da	[NFCI] Move DEBUG_TYPE definition below #includes When you define a new DEBUG_TYPE in a header file, a DEBUG_TYPE definition placed among the #includes in files that include it could result in redefinition warnings or even compile errors. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D102594	2021-05-30 17:31:01 +08:00
Florian Hahn	ec1f6f7e3f	Revert "[LAA] Support pointer phis in loop by analyzing each incoming pointer." This reverts commit `1ed7f8ede5`. This change can cause loop-distribute to crash in some cases. Revert until I have more time to wrap up a fix. See PR50296, PR5028 and D102266.	2021-05-28 10:33:52 +01:00
Yang Fan	f2264ebb08	[ConstantFolding] Fix -Wunused-variable warning (NFC) GCC warning: ``` /llvm-project/llvm/lib/Analysis/ConstantFolding.cpp: In function ‘llvm::Constant* llvm::ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*, const llvm::DataLayout&)’: /llvm-project/llvm/lib/Analysis/ConstantFolding.cpp:713:19: warning: unused variable ‘SimplifiedGEP’ [-Wunused-variable] 713 \| if (auto *SimplifiedGEP = dyn_cast<GEPOperator>(Simplified)) { \| ^~~~~~~~~~~~~ ```	2021-05-28 16:17:12 +08:00
Arthur Eubanks	8086f9d87e	[ConstFold] Simplify a load's GEP operand through local aliases MSVC-style RTTI produces loads through a GEP of a local alias which itself is a GEP. Currently we aren't able to devirtualize any virtual calls when MSVC RTTI is enabled. This patch attempts to simplify a load's GEP operand by calling SymbolicallyEvaluateGEP() with an option to look through local aliases. Differential Revision: https://reviews.llvm.org/D101100	2021-05-27 16:04:19 -07:00
Philip Reames	ff08c3468f	[SCEV] Compute trip multiple for multiple exit loops This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case. Differential Revision: https://reviews.llvm.org/D103189	2021-05-26 11:52:25 -07:00
Philip Reames	9306bb638f	[SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context. I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns. There are two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable. If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired. Differential Revision: https://reviews.llvm.org/D103182	2021-05-26 11:18:25 -07:00
Philip Reames	921d3f7af0	[SCEV] Add a utility for converting from "exit count" to "trip count" (Mostly as a logical place to put a comment since this is a recurring confusion.)	2021-05-26 10:41:49 -07:00
Philip Reames	fb14577d0c	[SCEV] Extract out a helper for computing trip multiples	2021-05-26 10:15:03 -07:00
Philip Reames	9cc2181ec3	[unroll] Use value domain for symbolic execution based cost model The current full unroll cost model does a symbolic evaluation of the loop up to a fixed limit. That symbolic evaluation currently simplifies to constants, but we can generalize to arbitrary Values using the InstructionSimplify infrastructure at very low cost. By itself, this enables some simplifications, but it's mainly useful when combined with the branch simplification over in D102928. Differential Revision: https://reviews.llvm.org/D102934	2021-05-26 08:41:25 -07:00
Vitaly Buka	f44f2e0afc	[NFC] Fix 'unused' warning	2021-05-25 12:23:57 -07:00
Nikita Popov	6300c37a46	[SCEV] Cache operands used in BEInfo (NFC) When memoized values for a SCEV expressions are dropped, we also drop all BECounts that make use of the SCEV expression. This is done by iterating over all the ExitNotTaken counts and (recursively) checking whether they use the SCEV expression. If there are many exits, this will take a lot of time. This patch improves the situation by pre-computing a set of all used operands, so that we can determine whether a certain BEInfo needs to be invalidated using a simple set lookup. Will still need to loop over all BEInfos though. This makes for a mild improvement on non-degenerate cases: https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384, for n=128 I'm seeing run time drop from 1.6s to 1.1s. Differential Revision: https://reviews.llvm.org/D102796	2021-05-25 21:03:33 +02:00
Sanjay Patel	ca7eaa0a54	[InstSimplify] allow undef element match in vector select condition value The semantics of select with undefined/poison condition are not explicitly stated in the LangRef, but this matches comments in the code and Alive2 appears to concur: https://alive2.llvm.org/ce/z/KXytmd We can find this pattern after demanded elements transforms. As noted in D101191, fuzzers are finding infinite loops because we may not account for this pattern in other passes.	2021-05-25 14:25:34 -04:00
Philip Reames	aabca2d1da	[SCEV] Cleanup doesIVOverflowOnX checks [NFC] Stylistic changes only. 1) Don't pass a parameter just to do an early exit. 2) Use a name which matches actual behavior.	2021-05-25 10:12:24 -07:00
Philip Reames	a47b2d4567	[SCEV] Remove unused parameter from computeBECount [NFC] All callers pass "false" for the Equality parameter. Kill the dead code, and update the function block comment.	2021-05-25 09:58:56 -07:00
David Goldblatt	8607a02357	[InstSimplify] Transform X * Y % Y --> 0 simplifyDiv already handles the case X * Y / Y --> X (barring overflow). This adds the equivalent handling to simplifyRem. Correctness: https://alive2.llvm.org/ce/z/J2cUbS https://alive2.llvm.org/ce/z/us9NUM https://alive2.llvm.org/ce/z/AvaDGJ https://alive2.llvm.org/ce/z/kq9ige Extending the situations in which we apply this transform would not be correct: https://alive2.llvm.org/ce/z/Lf9V63 https://alive2.llvm.org/ce/z/6RPQK3 https://alive2.llvm.org/ce/z/p9UdxC https://alive2.llvm.org/ce/z/A2zlhE https://alive2.llvm.org/ce/z/vHTtLw https://alive2.llvm.org/ce/z/lvpH42 Differential Revision: https://reviews.llvm.org/D102864	2021-05-25 10:16:04 -04:00
Sanjay Patel	a0e71f1832	[ConstProp] propagate poison from vector reduction element(s) to result This follows from the underlying logic for binops and min/max. Although it does not appear that we handle this for min/max intrinsics currently. https://alive2.llvm.org/ce/z/Kq9Xnh	2021-05-24 10:34:40 -04:00
Martin Storsjö	c5638a71d8	[MinGW] Mark a number of library functions unavailable for mingw targets These functions were marked unavailable for MSVC targets before, within an "T.isOSWindows() && !T.isOSCygMing()" block, but these ones are unavailable on MinGW targets too. This avoids generating calls to stpcpy for MinGW targets, which has been happening since `6dbf0cfcf7` (in some cases). This fixes https://github.com/mstorsjo/llvm-mingw/issues/201. Differential Revision: https://reviews.llvm.org/D102946	2021-05-22 23:40:19 +03:00
Serge Pavlov	c9c05a91c4	[ConstantFolding] Use APFloat for constant folding. NFC Replace use of host floating types with operations on APFloat when it is possible. Use of APFloat makes analysis more convenient and facilitates constant folding in the case of non-default FP environment. Differential Revision: https://reviews.llvm.org/D102672	2021-05-22 13:00:20 +07:00
Arthur Eubanks	f7788e1bff	Revert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes" This reverts commit `d14d84af2f`. Causes unacceptable memory regressions.	2021-05-21 16:38:03 -07:00
Arthur Eubanks	a52530dd6a	Revert "[NPM] Do not run function simplification pipeline unnecessarily" This reverts commit `97ab068034`. Depends on D100917, which is to be reverted.	2021-05-21 16:38:02 -07:00
Philip Reames	cc5f6ae4b4	Move a definition into cpp from header in advance of other changes [nfc]	2021-05-21 09:18:04 -07:00
Daniil Fukalov	e1cb98be2d	[TTI] NFC: Change getCostOfKeepingLiveOverCall to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102831	2021-05-21 15:18:12 +03:00
Daniil Fukalov	e8e88c3353	[TTI] NFC: Change getRegUsageForType to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102541	2021-05-21 15:17:23 +03:00
Joe Ellis	5a476987f7	[InstSimplify] Properly constrain {insert,extract}_subvector intrinsic fold The previous rule: (insert_vector _, (extract_vector X, 0), 0) -> X is not quite correct. The correct fold should be: (insert_vector Y, (extract_vector X, 0), 0) -> X where: Y is X, or Y is undef This commit updates the pattern. Reviewed By: peterwaller-arm, paulwalker-arm Differential Revision: https://reviews.llvm.org/D102699	2021-05-21 10:05:03 +00:00
Serge Pavlov	c162f086ba	[APFloat] convertToDouble/Float can work on shorter types Previously APFloat::convertToDouble could be called only for APFloats that were built using double semantics. Other semantics like single precision were not allowed although corresponding numbers could be converted to double without loss of precision. A similar restriction applied to APFloat::convertToFloat. With this change any APFloat that can be precisely represented by double can be handled with convertToDouble. Behavior of convertToFloat was updated similarly. It makes the conversion operations more convenient and adds support for formats like half and bfloat. Differential Revision: https://reviews.llvm.org/D102671	2021-05-21 11:02:51 +07:00
Nikita Popov	b661a55a25	[ScalarEvolution] Remove unused ExitLimit::hasOperand() method (NFC) We only use BackedgeTakenInfo::hasOperand().	2021-05-19 18:42:14 +02:00
Arthur Eubanks	6b9524a05b	[NewPM] Don't mark AA analyses as preserved Currently all AA analyses marked as preserved are stateless, not taking into account their dependent analyses. So there's no need to mark them as preserved, they won't be invalidated unless their analyses are. SCEVAAResults was the one exception to this, it was treated like a typical analysis result. Make it like the others and don't invalidate unless SCEV is invalidated. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D102032	2021-05-18 13:49:03 -07:00
Arthur Eubanks	cc64ece77d	[NFC][OpaquePtr] Avoid using PointerType::getElementType() in VectorUtils.cpp Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102533	2021-05-17 18:35:44 -07:00
Nikita Popov	7243120198	[CaptureTracking] Simplify reachability check (NFCI) This code was re-implementing the same-BB case of isPotentiallyReachable(). Historically, this was done because CaptureTracking used additional caching for local dominance queries. Now that it is no longer needed, the code is effectively the same as isPotentiallyReachable(). The only differences are extra checks for invoke/phis. These are misleading checks related to dominance in the value availability sense that are not relevant for control reachability. The invoke check was correct but redundant in that invokes are always terminators, so `I` could never come before the invoke. The phi check is a matter of interpretation (should an earlier phi node be considered reachable from a later phi node in the same block?) but ultimately doesn't matter because phis don't capture anyway.	2021-05-16 16:04:10 +02:00
Nikita Popov	656296b1c2	Reapply [CaptureTracking] Do not check domination Reapply after adjusting the synchronized.m test case, where the TODO is now resolved. The pointer is only captured on the exception handling path. ----- For the CapturesBefore tracker, it is sufficient to check that I can not reach BeforeHere. This does not necessarily require that BeforeHere dominates I, it can also occur if the capture happens on an entirely disjoint path. This change was previously accepted in D90688, but had to be reverted due to large compile-time impact in some cases: It increases the number of reachability queries that are performed. After recent changes, the compile-time impact is largely mitigated, so I'm reapplying this patch. The remaining compile-time impact is largely proportional to changes in code-size.	2021-05-16 15:46:31 +02:00
Nikita Popov	541c2845de	Revert "[CaptureTracking] Do not check domination" This reverts commit `6b8b43e7af`. This causes clang test to fail (CodeGenObjC/synchronized.m). Revert until I can figure out whether that's an expected change.	2021-05-16 11:04:45 +02:00
Nikita Popov	6b8b43e7af	[CaptureTracking] Do not check domination For the CapturesBefore tracker, it is sufficient to check that I can not reach BeforeHere. This does not necessarily require that BeforeHere dominates I, it can also occur if the capture happens on an entirely disjoint path. This change was previously accepted in D90688, but had to be reverted due to large compile-time impact in some cases: It increases the number of reachability queries that are performed. After recent changes, the compile-time impact is largely mitigated, so I'm reapplying this patch. The remaining compile-time impact is largely proportional to changes in code-size.	2021-05-16 10:49:36 +02:00
Nikita Popov	6e9363c942	[CaptureTracking] Only check reachability for capture candidates Reachability queries are very expensive, and currently performed for each instruction we look at, even though most of them will not lead to a capture and are thus ultimately irrelevant. It is more efficient to walk a few unnecessary instructions than to perform unnecessary reachability queries. Theoretically, this may produce worse results, because the additional instructions considered may cause us to hit the use count limit earlier. In practice, this does not appear to be a problem, e.g. on test-suite O3 we report only one more captured-before with this change, with no resulting codegen differences. This makes PointerMayBeCapturedBefore() significantly cheaper in practice, hopefully allowing it to be used in more places.	2021-05-15 22:57:56 +02:00
Nikita Popov	f9e9b0cdb4	[CFG] Move reachable from entry checks into basic block variant These checks are not specific to the instruction based variant of isPotentiallyReachable(), they are equally valid for the basic block based variant. Move them there, to make sure that switching between the instruction and basic block variants cannot introduce regressions.	2021-05-15 15:42:02 +02:00
Nikita Popov	fb9ed1979a	[IR] Add BasicBlock::isEntryBlock() (NFC) This is a recurring and somewhat awkward pattern. Add a helper method for it.	2021-05-15 12:41:58 +02:00
Nikita Popov	6418bab6f8	[CFG] Use comesBefore() (NFC) Use comesBefore() instead of performing an instruction walk. In line with the previous implementation, instructions are considered to reach themselves.	2021-05-15 12:14:30 +02:00
Nikita Popov	f765e54db2	[CaptureTracking] Clean up same instruction check (NFC) Check the BeforeHere == I case once in shouldExplore, instead of handling it in four different places.	2021-05-15 11:58:55 +02:00
Nick Desaulniers	8c72749bd9	[LowerConstantIntrinsics] reuse isManifestLogic from ConstantFolding GlobalVariables are Constants, yet should not unconditionally be considered true for __builtin_constant_p. Via the LangRef https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic: This intrinsic generates no code. If its argument is known to be a manifest compile-time constant value, then the intrinsic will be converted to a constant true value. Otherwise, it will be converted to a constant false value. In particular, note that if the argument is a constant expression which refers to a global (the address of which _is_ a constant, but not manifest during the compile), then the intrinsic evaluates to false. Move isManifestConstant from ConstantFolding to be a method of Constant so that we can reuse the same logic in LowerConstantIntrinsics. pr/41459 Reviewed By: rsmith, george.burgess.iv Differential Revision: https://reviews.llvm.org/D102367	2021-05-14 15:35:21 -07:00
Nikita Popov	c4fb2a1fc2	[MemDep] Use BatchAA in more places (NFCI) Previously, we already used BatchAA for individual simple pointer dependency queries. This extends BatchAA usage for the non-local case, so that only one BatchAA instance is used for all blocks, instead of one instance per block. Use of BatchAA is safe as IR cannot be modified during a MemDep query.	2021-05-14 22:54:40 +02:00
Nikita Popov	5e289cc597	[AA] Support callCapturesBefore() on BatchAA (NFCI) This is not expected to have any practical compile-time effect, as the alias() calls inside callCapturesBefore() are rare. This should still be supported for API completeness, and might be useful for reachability caching.	2021-05-14 21:48:08 +02:00
Philip Reames	23c93c2555	Discount invariant instructions in full unrolling This patch updates the cost model for full unrolling to discount the cost of a loop invariant expression on all but one iteration. The reasoning here is that such an expression (as determined by SCEV) will be CSEd or DSEd once the loop is unrolled. Note that SCEV's reasoning will find things which could be invariant, not simply those outside the loop. Differential Revision: https://reviews.llvm.org/D102506	2021-05-14 11:07:19 -07:00
dfukalov	fdae3fc8b3	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-05-14 11:17:14 +03:00
Nikita Popov	425781bce0	[CaptureTracking] Use isIdentifiedFunctionLocal() (NFC) These conditions together exactly match isIdentifiedFunctionLocal(), and this is also what we logically want to check for here.	2021-05-13 23:06:42 +02:00
Nikita Popov	dce158c58d	[AA] Use isIdentifiedFunctionLocal() (NFC) This condition is equivalent to isIdentifiedFunctionLocal(), and this is also what we semantically want to check here.	2021-05-13 23:06:42 +02:00
Joe Ellis	2ed7db0d20	[InstSimplify] Remove redundant {insert,extract}_vector intrinsic chains This commit removes some redundant {insert,extract}_vector intrinsic chains by implementing the following patterns as instsimplifies: (insert_vector _, (extract_vector X, 0), 0) -> X (extract_vector (insert_vector _, X, 0), 0) -> X Reviewed By: peterwaller-arm Differential Revision: https://reviews.llvm.org/D101986	2021-05-13 16:09:50 +00:00
Florian Hahn	e2759f110b	[SCEV] Apply guards to max with non-unitary steps. We already apply loop-guards when computing the maximum with unitary steps. This extends the code to also do so when dealing with non-unitary steps. This allows us to infer a tighter maximum in some cases. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102267	2021-05-13 09:47:29 +01:00
Jordan Rupprecht	fec2945998	Revert "[GVN] Clobber partially aliased loads." This reverts commit `6c57044231`. It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543: ``` $ cat repro.ll ; ModuleID = 'repro.ll' source_filename = "repro.ll" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.widget = type { i32 } %struct.baz = type { i32, %struct.snork } %struct.snork = type { %struct.spam } %struct.spam = type { i32, i32 } @global = external local_unnamed_addr global %struct.widget, align 4 @global.1 = external local_unnamed_addr global i8, align 1 @global.2 = external local_unnamed_addr global i32, align 4 define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 { bb: %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1 %tmp1 = bitcast %struct.snork* %tmp to i64* %tmp2 = load i64, i64* %tmp1, align 4 %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1 %tmp4 = icmp ugt i64 %tmp2, 4294967295 br label %bb5 bb5: ; preds = %bb14, %bb %tmp6 = load i32, i32* %tmp3, align 4 %tmp7 = icmp ne i32 %tmp6, 0 %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false %tmp9 = zext i1 %tmp8 to i8 store i8 %tmp9, i8* @global.1, align 1 %tmp10 = load i32, i32* @global.2, align 4 switch i32 %tmp10, label %bb11 [ i32 1, label %bb12 i32 2, label %bb12 ] bb11: ; preds = %bb5 br label %bb14 bb12: ; preds = %bb5, %bb5 %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4 br label %bb14 bb14: ; preds = %bb12, %bb11 br label %bb5 } $ opt -O2 repro.ll -disable-output opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst , unsigned int, llvm::Type , llvm::Instruction , const llvm::DataLayout &): Assertion 
`SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output ... ```	2021-05-11 16:08:53 -07:00
Stanislav Mekhanoshin	22d295f695	[AMDGPU] Constant fold Intrinsic::amdgcn_perm Differential Revision: https://reviews.llvm.org/D102203	2021-05-10 16:23:11 -07:00
Florian Hahn	93a9a8a8d9	[VecLib] Add support for vector fns from Darwin's libsystem. This patch adds support for Darwin's libsystem math vector functions to TLI. Darwin's libsystem provides a range of vector functions for libm functions. This initial patch only adds the 2 x double and 4 x float versions, which are available on both X86 and ARM64. On X86, wider vector versions are supported as well. Reviewed By: jroelofs Differential Revision: https://reviews.llvm.org/D101856	2021-05-10 21:19:58 +01:00
Andy Kaylor	7086025d65	[Dependence Analysis] Enable delinearization of fixed sized arrays Patch by Artem Radzikhovskyy! Allow delinearization of fixed sized arrays if we can prove that the GEP indices do not overflow the array dimensions. The checks applied are similar to the ones that are used for delinearization of parametric size arrays. Make sure that the GEP indices are non-negative and that they are smaller than the range of that dimension. Changes Summary: - Updated the LIT tests with more exact values, as we are able to delinearize and apply more exact tests - profitability.ll - now able to delinearize in all cases, no need to use -da-disable-delinearization-checks flag and run the test twice - loop-interchange-optimization-remarks.ll - in one of the cases we are able to delinearize without using -da-disable-delinearization-checks - SimpleSIVNoValidityCheckFixedSize.ll - removed unnecessary "-da-disable-delinearization-checks" flag. Now can get the exact answer without it. - SimpleSIVNoValidityCheckFixedSize.ll and PreliminaryNoValidityCheckFixedSize.ll - made negative tests more explicit, in order to demonstrate the need for "-da-disable-delinearization-checks" flag Differential Revision: https://reviews.llvm.org/D101486	2021-05-10 10:30:15 -07:00
Nikita Popov	d26ca78c18	[SCEV] Handle and/or in applyLoopGuards() applyLoopGuards() already combines conditions from multiple nested guards. However, it cannot use multiple conditions on the same guard, combined using and/or. Add support for this by recursing into either `and` or `or`, depending on the direction of the branch. Differential Revision: https://reviews.llvm.org/D101692	2021-05-09 21:34:28 +02:00
Arthur Eubanks	34a8a437bf	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797	2021-05-07 21:51:47 -07:00
Florian Hahn	6c99e63120	[SCEV] Be more careful when traversing phis in isImpliedViaMerge. I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of Consider the case in exit_cond_depends_on_inner_loop. At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the previous iteration. Hence we incorrectly determine that the previous value was <= -1, which may not be true. I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works for irreducible control flow). So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if this will catch all cases and I appreciate a thorough second look in that regard. Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829	2021-05-07 19:52:29 +01:00
Krzysztof Parzyszek	50cf0a1d1a	Allow empty value list in propagateMetadata(Inst, ArrayOf...) This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.	2021-05-07 13:20:50 -05:00
Fangrui Song	d8aba75a76	Internalize some cl::opt global variables or move them under namespace llvm	2021-05-07 11:15:43 -07:00
Whitney Tsang	1006ac3963	[LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect This patch allows more conditional branches to be considered as loop guards, and so more loop nests can be considered perfect. Reviewed By: bmahjour, sidbav Differential Revision: https://reviews.llvm.org/D94717	2021-05-07 16:04:18 +00:00
Joseph Tremoulet	bc302bfbef	BasicAA: Recognize inttoptr as isEscapeSource Pointers escape when converted to integers, so a pointer produced by converting an integer to a pointer must not be a local non-escaping object. Reviewed By: nikic, nlopes, aqjune Differential Revision: https://reviews.llvm.org/D101541	2021-05-07 07:48:50 -07:00
Peilin Guo	911a541620	[LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion getValueFromCondition() uses a Visited set to record the intermediate value. However, it computes the value first, in postorder, and updates the Visited set later. Thus it will be trapped in an infinite recursion if there exists IR whose use is not dominated by its def, as in this example: %tmp3 = or i1 undef, %tmp4 %tmp4 = or i1 undef, %tmp3 To prevent this, we can insert an Overdefined placeholder into the set before computing the actual value. Reviewed by: nikic Differential Revision: https://reviews.llvm.org/D101273	2021-05-07 16:05:50 +08:00
Mircea Trofin	97ab068034	[NPM] Do not run function simplification pipeline unnecessarily The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1]. This patch allows the function simplification pipeline to be skipped if it was already run and the function was not modified since. The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios. [1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions \| compile time tracker ]] run with the option enabled. Differential Revision: https://reviews.llvm.org/D98103	2021-05-06 12:24:33 -07:00
Bjorn Pettersson	3ee826594a	Make dependency between certain analysis passes transitive (reapply) LazyBlockFrequencyInfoPass, LazyBranchProbabilityInfoPass and LoopAccessLegacyAnalysis all cache pointers to their nested required analysis passes. One needs to use addRequiredTransitive to describe that the nested passes can't be freed until those analysis passes are no longer used themselves. There is still a bit of a mess considering the getLazyBPIAnalysisUsage and getLazyBFIAnalysisUsage functions. Those functions are used from Transform, CodeGen and Analysis passes. I figure it is OK to use addRequiredTransitive also when being used from Transform and CodeGen passes. On the other hand, I figure we must do it when used from other Analysis passes. So using addRequiredTransitive should be more correct here. An alternative solution would be to add a bool option in those functions to let the user tell if it is an analysis pass or not. Since those lazy passes will be obsolete when new PM has conquered the world I figure we can leave it like this right now. Intention with the patch is to fix PR49950. It at least solves the problem for the reproducer in PR49950. However, that reproducer needs five passes in a specific order, so there are lots of various "solutions" that could avoid the crash without actually fixing the root cause. This is a reapply of commit `3655f0757f`, that was reverted in `33ff3c2049` due to problems with assertions in the polly lit tests. That problem is supposed to be solved by also adjusting ScopPass to explicitly preserve LazyBlockFrequencyInfo and LazyBranchProbabilityInfo (it already preserved OptimizationRemarkEmitter which depends on those lazy passes). Differential Revision: https://reviews.llvm.org/D100958	2021-05-05 15:17:55 +02:00
Bjorn Pettersson	33ff3c2049	Revert "Make dependency between certain analysis passes transitive" This reverts commit `3655f0757f`. It caused assertion failures related to setLastUser in polly builds.	2021-05-04 19:08:41 +02:00
Bjorn Pettersson	3655f0757f	Make dependency between certain analysis passes transitive LazyBlockFrequencyInfoPass, LazyBranchProbabilityInfoPass and LoopAccessLegacyAnalysis all cache pointers to their nested required analysis passes. One needs to use addRequiredTransitive to describe that the nested passes can't be freed until those analysis passes are no longer used themselves. There is still a bit of a mess considering the getLazyBPIAnalysisUsage and getLazyBFIAnalysisUsage functions. Those functions are used from Transform, CodeGen and Analysis passes. I figure it is OK to use addRequiredTransitive also when being used from Transform and CodeGen passes. On the other hand, I figure we must do it when used from other Analysis passes. So using addRequiredTransitive should be more correct here. An alternative solution would be to add a bool option in those functions to let the user tell if it is an analysis pass or not. Since those lazy passes will be obsolete when new PM has conquered the world I figure we can leave it like this right now. Intention with the patch is to fix PR49950. It at least solves the problem for the reproducer in PR49950. However, that reproducer needs five passes in a specific order, so there are lots of various "solutions" that could avoid the crash without actually fixing the root cause. Differential Revision: https://reviews.llvm.org/D100958	2021-05-04 11:50:08 +02:00
Simon Moll	1db4dbba24	Recommit "[VP,Integer,#2] ExpandVectorPredication pass" This reverts the revert `02c5ba8679` Fix: Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULES=1 builds. Original commit: This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-05-04 11:47:52 +02:00
Arthur Eubanks	d14d84af2f	[NewPM] Only invalidate modified functions' analyses in CGSCC passes Previously, any change in any function in an SCC would cause all analyses for all functions in the SCC to be invalidated. With this change, we now manually invalidate analyses for functions we modify, then let the pass manager know that all function analyses should be preserved. So far this only touches the inliner, argpromotion, funcattrs, and updateCGAndAnalysisManager(), since they are the most used. Slight compile time improvements: http://llvm-compile-time-tracker.com/compare.php?from=326da4adcb8def2abdd530299d87ce951c0edec9&to=8942c7669f330082ef159f3c6c57c3c28484f4be&stat=instructions Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D100917	2021-05-03 17:21:44 -07:00
Philip Reames	e38ccb729b	Recommit "Generalize getInvertibleOperand recurrence handling slightly" This was reverted because of a reported problem. It turned out this patch didn't introduce said problem, it just exposed it more widely. `15a4233` fixes the root issue, so this simply a) rebases over that, and b) adds a much more extensive comment explaining why that weakened assert is correct. Original commit message follows: Follow up to D99912, specifically the revert, fix, and reapply thereof. This generalizes the invertible recurrence logic in two ways: * By allowing mismatching operand numbers of the phi, we can recurse through a pair of phi recurrences whose operand orders have not been canonicalized. * By allowing recurrences through operand 1, we can invert these odd (but legal) recurrences. Differential Revision: https://reviews.llvm.org/D100884	2021-05-03 16:40:56 -07:00
Sanjay Patel	15a42339fe	[ValueTracking] soften assert for invertible recurrence matching There's a TODO comment in the code and discussion in D99912 about generalizing this, but I wasn't sure how to implement that, so just going with a potential minimal fix to avoid crashing. The test is a reduction beyond useful code (there's no user of %user...), but it is based on https://llvm.org/PR50191, so this is asserting on real code. Differential Revision: https://reviews.llvm.org/D101772	2021-05-03 15:57:40 -04:00
Juneyoung Lee	d4d1caafc8	Fix MSan crash after `1977c53b`	2021-05-02 13:44:43 +09:00
Arthur Eubanks	07a9df5993	[NFC] Use getParamByValType instead of pointee type To reduce dependence on pointee types for opaque pointers. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101706	2021-05-01 21:22:41 -07:00
Juneyoung Lee	7257e6a68a	[ValueTracking] ctpop propagates poison This is a patch that adds ctpop intrinsics to propagatesPoison. Split from D101191	2021-05-02 13:04:37 +09:00
Juneyoung Lee	64e768e816	[ValueTracking] Improve impliesPoison to look into overflow intrinsics This update supports the following transformation: ``` select(extract(mul_with_overflow(a, _), _), (a == 0), false) => and(extract(mul_with_overflow(a, _), _), (a == 0)) ``` which is correct because if `a` was poison the select's condition was also poison. This update is split from D101423.	2021-05-02 12:03:55 +09:00
Juneyoung Lee	1977c53b2a	[InstCombine] Fold overflow bit of [u\|s]mul.with.overflow in a poison-safe way As discussed in D101191, this patch adds a poison-safe folding of overflow bit check: ``` %Op0 = icmp ne i4 %X, 0 %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = select i1 %Op0, i1 %Op1, i1 false => %Y.fr = freeze %Y %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y.fr) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = %Op1 ``` https://alive2.llvm.org/ce/z/zgPUGT https://alive2.llvm.org/ce/z/h2gZ_6 Note that there are cases where inserting freeze is not necessary: e.g. %Y is `noundef`. In this case, LLVM is already good because `%ret` is already successfully folded into `and`, triggering the pre-existing optimization in InstSimplify: https://godbolt.org/z/v6qena15K Differential Revision: https://reviews.llvm.org/D101423	2021-05-02 11:54:12 +09:00
Nikita Popov	db9d00c5e7	[LVI] Handle mask not equal zero conditions If V & Mask != 0, we know that at least one of the bits in Mask must be set, so the value must be >= the lowest bit in Mask.	2021-05-01 23:08:49 +02:00
Nikita Popov	cc58e8918b	[SCEV] Simplify backedge count clearing (NFC) This seems to be a leftover from when the BackedgeTakenInfo stored multiple exit counts with manual memory management. At some point this was switched to a simple vector, and there should be no need to micro-manage the clearing anymore. We can simply drop the loop from the map and let the destructor do its job.	2021-05-01 17:50:01 +02:00
Adrian Prantl	02c5ba8679	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit `43bc584dc0`. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00
Nikita Popov	fe230dc197	[ValueTracking] Slightly clean up programUndefinedIfUndefOrPoison() (NFC) Use contains() to check set membership, and adjust an oddly structured loop.	2021-04-30 23:05:41 +02:00
Nikita Popov	2cd7868605	[ValueTracking] Limit scan when checking poison UB (PR50155) The current code can scan an unlimited number of instructions, if the containing basic block is very large. The test case from PR50155 contains a basic block with approximately 100k instructions. To avoid this, limit the number of instructions we inspect. At the same time, drop the limit on the number of basic blocks, as this will be implicitly limited by the number of instructions as well.	2021-04-30 23:04:49 +02:00
Duncan P. N. Exon Smith	518d955f9d	Support: Stop using F_{None,Text,Append} compatibility synonyms, NFC Stop using the compatibility spellings of `OF_{None,Text,Append}` left behind by `1f67a3cba9`. A follow-up will remove them. Differential Revision: https://reviews.llvm.org/D101650	2021-04-30 11:00:03 -07:00
Simon Moll	43bc584dc0	[VP,Integer,#2] ExpandVectorPredication pass This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-04-30 15:47:28 +02:00
Roman Lebedev	ba5b015b0d	[InlineCost] CallAnalyzer: use TTI info for extractvalue - they are free (PR50099) It seems incorrect to use TTI data in some places, and override it in others. In this case, TTI says that `extractvalue` are free, yet we bill them. While this doesn't address https://bugs.llvm.org/show_bug.cgi?id=50099 yet, it reduces the cost from 55 to 50 while the threshold is 45. Differential Revision: https://reviews.llvm.org/D101228	2021-04-30 13:55:11 +03:00
Arthur Eubanks	a3a798d49d	[InlineCost] Remove visitUnaryInstruction() The simplifyInstruction() in visitUnaryInstruction() does not trigger for all of check-llvm. Looking at all delegates to UnaryInstruction in InstVisitor, the only instructions that either don't have a visitor in CallAnalyzer, or redirect to UnaryInstruction, are VAArgInst and Alloca. VAArgInst will never get simplified, and visitUnaryInstruction(Alloca) would always return false anyway. Reviewed By: mtrofin, lebedev.ri Differential Revision: https://reviews.llvm.org/D101577	2021-04-29 20:33:30 -07:00
jasonliu	7049fbf960	[XCOFF] Handle the case when personality routine is an alias Summary: Personality routine could be an alias to another personality routine. Fix the situation when we compile the file that contains the personality routine and the file also has functions that need to refer to the personality routine. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D101401	2021-04-29 22:03:30 +00:00
Philip Reames	a047837b90	Revert "Generalize getInvertibleOperand recurrence handling slightly" This reverts commit `0c01b37eeb` while a problem reported is investigated.	2021-04-29 13:06:26 -07:00
Sanjay Patel	1089158c5a	[ConstantFolding] propagate poison through vector reduction intrinsics	2021-04-29 12:54:20 -04:00
Sanjay Patel	71597d40e8	[ConstantFolding] refactor helper for vector reductions; NFC We should handle other cases (undef/poison), so reduce the duplication of repeated switches.	2021-04-29 12:09:22 -04:00
Craig Topper	25391cec3a	[RISCV] Teach computeKnownBits that vsetvli returns number less than 2^31. This seems like a reasonable upper bound on VL. WG discussions for the V spec would probably allow us to use 2^16 as an upper bound on VLEN, but this is good enough for now. This allows us to remove sext and zext if user happens to assign the size_t result into an int and then uses it as a VL intrinsic argument which is size_t. Reviewed By: frasercrmck, rogfer01, arcbbb Differential Revision: https://reviews.llvm.org/D101472	2021-04-29 08:07:59 -07:00
Philip Reames	0c01b37eeb	Generalize getInvertibleOperand recurrence handling slightly Follow up to D99912, specifically the revert, fix, and reapply thereof. This generalizes the invertible recurrence logic in two ways: * By allowing mismatching operand numbers of the phi, we can recurse through a pair of phi recurrences whose operand orders have not been canonicalized. * By allowing recurrences through operand 1, we can invert these odd (but legal) recurrences. Differential Revision: https://reviews.llvm.org/D100884	2021-04-28 14:38:07 -07:00
Philip Reames	0cc3e10f5e	[SCEV] Avoid range intersection idiom in getRangeForUnkownRecurrence [NFC] Addresses a review comment from D101181	2021-04-28 12:48:17 -07:00
Philip Reames	a836de0bde	[SCEV] Compute ranges for ashr recurrences Straight forward extension to the recently added infrastructure which was pioneered with shl. This was originally posted as part of D99687, but split off for ease of review. (I also decided to exclude the unknown start sign case explicitly for simplicity of understanding.) Differential Revision: https://reviews.llvm.org/D101181	2021-04-28 12:36:20 -07:00
Florian Hahn	1ed7f8ede5	[LAA] Support pointer phis in loop by analyzing each incoming pointer. SCEV does not look through non-header PHIs inside the loop. Such phis can be analyzed by adding separate accesses for each incoming pointer value. This results in 2 more loops vectorized in SPEC2000/186.crafty and avoids regressions when sinking instructions before vectorizing. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101286	2021-04-28 20:19:40 +01:00
Arthur Eubanks	cbce28f07e	[ConstFold] Use const-folded operands in more places Previously we were const folding operands but not passing them. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101394	2021-04-27 14:30:19 -07:00
Nikita Popov	e45168c4fa	[SCEV] Handle uge/ugt predicates in applyLoopGuards() These can be handled the same way as ule/ult, just using umax instead of umin. This is useful in cases where the umax prevents the upper bound from overflowing. Differential Revision: https://reviews.llvm.org/D101196	2021-04-27 22:41:05 +02:00
Andy Kaylor	0a82d885a4	[Dependence Analysis] Fix ExactSIV producing wrong analysis Patch by Artem Radzikhovskyy! Symptom: ExactSIV test produced incorrect analysis of dependencies; see LIT tests. Bug: At the end of the algorithm, when determining dependence direction, the original author forgot to divide intermediate results by gcd and round the result toward zero. Although this bug can be fixed with significantly fewer changes, I opted to write the code in such a way that reflects the original algorithm that Banerjee proposed, for easier reference in the future. This surprisingly results in shorter code, and fewer quotient and max/min calculations. Changes Summary: - Fixed findGCD to return valid x and y so that they match the function description where: ax - by = gcd(a,b) - Fixed ExactSIV test, to produce proper results - Documented the extension of Banerjee's algorithm that the original code author introduced. Banerjee's original algorithm only tested whether Dst depends on Src, the extension also allows us to test whether Src depends on Dst, in one pass. - ExactRDIV test worked fine. Since it uses findGCD(), it needed to be updated. Since ExactRDIV test has very few changes from the core algorithm of ExactSIV, I modified the test to have the same format as ExactSIV. - Updated the LIT tests to be testing for correct values. Differential Revision: https://reviews.llvm.org/D100331	2021-04-27 12:24:00 -07:00
dfukalov	e4c606acaf	[TTI] NFC: Change getScalarizationOverhead and getOperandsScalarizationOverhead to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101283	2021-04-27 08:51:48 +03:00
Hongtao Yu	30bb5be389	[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2. As a follow-up to D95982, this patch continues unblocking optimizations that are blocked by pseudo probe instrumentation. The optimizations unblocked are: - In-block load propagation. - In-block dead store elimination - Memory copy optimization that turns stores to consecutive memories into a memset. These optimizations are local to a block, so they shouldn't affect the profile quality. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D100075	2021-04-26 16:52:33 -07:00
Vineet Kumar	84d16e2055	Implementation for TargetTransformInfo::hasActiveVectorLength() This patch adds the missing implementation for TargetTransformInfo::hasActiveVectorLength() without which using hasActiveVectorLength() causes linker error. Patch by Vineet Kumar! Differential Revision: https://reviews.llvm.org/D100941	2021-04-26 21:20:05 +00:00
Nikita Popov	a5051f2fa2	[SCEV] Fix applyLoopGuards() chaining for ne predicates ICMP_NE predicates directly overwrote the rewritten result, instead of chaining it with previous rewrites, as was done for ICMP_ULT and ICMP_ULE. This means that some guards were effectively discarded, depending on their order.	2021-04-24 21:43:46 +02:00
dfukalov	6c57044231	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-04-24 14:14:20 +03:00
Sander de Smalen	f9a50f04ba	[TTI] NFC: Change getIntImmCost[Inst\|Intrin] to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100565	2021-04-23 16:06:36 +01:00
Sander de Smalen	43ace8b5ce	[TTI] NFC: Change getScalingFactorCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100564	2021-04-23 16:06:36 +01:00
Sander de Smalen	008a072ded	[TTI] NFC: Change getMemcpyCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100563	2021-04-23 16:06:35 +01:00
Sander de Smalen	9ba07f37f8	[TTI] NFC: Change getGEPCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100562	2021-04-23 16:06:35 +01:00
Sander de Smalen	e0edfa052f	[TTI] NFC: Change getAddressComputationCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100561	2021-04-23 16:06:35 +01:00
dfukalov	9ab17a60eb	[TTI] NFC: Use InstructionCost to store ScalarizationCost in IntrinsicCostAttributes. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D101151	2021-04-23 18:02:00 +03:00
Philip Reames	424d6cb902	[SCEV] Compute ranges for lshr recurrences Straightforward extension to the recently added infrastructure which was pioneered with shl. Differential Revision: https://reviews.llvm.org/D99687	2021-04-22 11:06:31 -07:00
Wenlei He	dff8315892	[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles The change adds support for trimming and merging cold context when merging CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful. Differential Revision: https://reviews.llvm.org/D100528	2021-04-22 00:42:37 -07:00
Sanjay Patel	5e6dc5e404	[InstSimplify] generalize ctlz-of-shifted-constant https://alive2.llvm.org/ce/z/zWL_VQ	2021-04-21 14:23:55 -04:00
Nico Weber	ba7a92c01e	[Support] Don't include VirtualFileSystem.h in CommandLine.h CommandLine.h is indirectly included in ~50% of TUs when building clang, and VirtualFileSystem.h is large. (Already remarked by jhenderson on D70769.) No behavior change. Differential Revision: https://reviews.llvm.org/D100957	2021-04-21 10:19:01 -04:00
Yang Fan	4307446e9f	[SCEV] Fix -Wunused-variable warning (NFC) GCC warning: ``` /llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp: In member function ‘const llvm::SCEV* llvm::ScalarEvolution::getLosslessPtrToIntExpr(const llvm::SCEV, unsigned int)::SCEVPtrToIntSinkingRewriter::visitUnknown(const llvm::SCEVUnknown)’: /llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:1152:13: warning: unused variable ‘ExprPtrTy’ [-Wunused-variable] 1152 | Type *ExprPtrTy = Expr->getType(); | ^~~~~~~~~ ```	2021-04-21 16:01:46 +08:00
Nikita Popov	de18fa9e52	Revert "[InstSimplify] Bypass no-op `and`-mask, using known bits (PR49543)" This reverts commit `ea1a0d7c9a`. While this is strictly more powerful, it is also strictly slower. InstSimplify intentionally does not perform many folds that it is allowed to perform, if doing so requires a KnownBits calculation that will be repeated in InstCombine. Maybe it's worthwhile to do this here, but that needs a more explicitly stated motivation, evaluated in a review.	2021-04-21 09:55:25 +02:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst a IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
Roman Lebedev	ea1a0d7c9a	[InstSimplify] Bypass no-op `and`-mask, using known bits (PR49543) We already special-cased a few interesting patterns, but that is strictly less powerful than using KnownBits. So instead get the known bits for the operand of `and`, and iff all the unset bits of the `and`-mask are known to be zeros in the operand, we can omit said `and`.	2021-04-21 00:31:46 +03:00
Philip Reames	6792e26c0d	Reapply "Look through invertible recurrences in isKnownNonEqual" I'd reverted this in commit `3b6acb1797` due to buildbot failures. This patch contains the fix for said issue. I'd forgotten to handle the case where two phis in the same block have different operand order. We canonicalize away from this, but it's still valid IR. The tests included in this change (as opposed to simply having test output changed), crashed without the fix. Original commit message follows... This extends the phi handling in isKnownNonEqual with a special case based on invertible recurrences. If we can prove the recurrence is invertible (which many common ones are), we can recurse through the start operands of the recurrence skipping the phi cycle. (Side note: Instcombine currently does not push back through these cases. I will implement that in a follow up change w/separate review.) Differential Revision: https://reviews.llvm.org/D99912	2021-04-20 12:47:59 -07:00
Philip Reames	3b6acb1797	Revert "Look through invertible recurrences in isKnownNonEqual" This reverts commit `be20eae25f`. It appears to have caused a crash on a buildbot (https://lab.llvm.org/buildbot#builders/77/builds/5653). Reverting while investigating.	2021-04-20 11:47:10 -07:00
Philip Reames	9c1a145aeb	Rearrange code to reduce diff for D99687 [nfc] Adding the switches to reduce diffs. I'm about to split that into an lshr part and an ashr part, doing the NFC part first makes it easier to maintain both diffs.	2021-04-20 11:40:15 -07:00
Roman Lebedev	7186764884	[NFC][SCEV] Split getLosslessPtrToIntExpr out of getPtrToIntExpr()	2021-04-20 21:29:21 +03:00
Philip Reames	be20eae25f	Look through invertible recurrences in isKnownNonEqual This extends the phi handling in isKnownNonEqual with a special case based on invertible recurrences. If we can prove the recurrence is invertible (which many common ones are), we can recurse through the start operands of the recurrence skipping the phi cycle. (Side note: Instcombine currently does not push back through these cases. I will implement that in a follow up change w/separate review.) Differential Revision: https://reviews.llvm.org/D99912	2021-04-20 10:52:22 -07:00
Dávid Bolvanský	319c9f6e58	[MemoryBuiltins] Added support for memalign memalign is older aligned_alloc.	2021-04-20 12:39:54 +02:00
Joe Ellis	effacc1599	[AArch64] Constant fold sve_convert_from_svbool(zero) to zero Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D100463	2021-04-20 10:02:49 +00:00
Arthur Eubanks	5e71b9fa93	Explicitly pass type to cast load constant folding result Previously we would use the type of the pointee to determine what to cast the result of constant folding a load. To aid with opaque pointer types, we should explicitly pass the type of the load rather than looking at pointee types. ConstantFoldLoadThroughBitcast() converts the const prop'd value to the proper load type (e.g. [1 x i32] -> i32). Instead of calling this in every intermediate step like bitcasts, we only call this when we actually see the global initializer value. In some existing uses of this API, we don't know the exact type we're loading from immediately (e.g. first we visit a bitcast, then we visit the load using the bitcast). In those cases we have to manually call ConstantFoldLoadThroughBitcast() when simplifying the load to make sure that we cast to the proper type. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100718	2021-04-20 00:53:21 -07:00
Sanjay Patel	9d43f6d7ce	[LowerConstantIntrinsics] avoid crashing on alloca with unexpected operand type The test here is reduced from the fuzzer-generated crasher in: https://llvm.org/PR50023 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=33395 I don't know if this is the best or complete solution, but the zext of the i42 type appears to match the behavior if we run a weird type example like this through the IR optimizer with -O1. Differential Revision: https://reviews.llvm.org/D100766	2021-04-19 13:06:29 -04:00
Roman Lebedev	41c22acc22	[NFC][SCEV] Assert that we don't try to create SCEVPtrToIntExpr of a non-integral pointer ptr<->int casts are only valid for integral pointers, defensively assert that we don't try to break that here.	2021-04-19 18:38:38 +03:00
Simon Pilgrim	ddcdeae358	[Analysis] ImportedFunctionsInliningStatistics.h - add <memory> and remove unused <string> include. NFCI. Move <string> include to ImportedFunctionsInliningStatistics.cpp and add missing <memory> include as we have explicit uses of std::unique_ptr in the header.	2021-04-19 16:20:56 +01:00
Cullen Rhodes	f0bc2782f2	[TTI] NFC: Remove unused 'OptSize' parameter from shouldMaximizeVectorBandwidth Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D100377	2021-04-19 11:01:34 +00:00
Roman Lebedev	d480f968ad	Revert "[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)`" As being discussed in https://reviews.llvm.org/D100721, this modelling is lossy, we can't reconstruct `ashr`/`ashr exact` from it, which means that whenever we actually expand the IR, we've just pessimized the code. It would be good to model this pattern, after all it comes up every time you want to compute a distance between two pointers, but not at this cost. This reverts commit `ec54867df5`.	2021-04-18 16:26:45 +03:00
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). Throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalizes the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
Thomas Lively	5c729750a6	[WebAssembly] Remove saturating fp-to-int target intrinsics Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead. This includes removing the intrinsics for i32x4.trunc_sat_zero_f64x2_{s,u}, which are now represented in IR as a saturating truncation to a v2i32 followed by a concatenation with a zero vector. Differential Revision: https://reviews.llvm.org/D100596	2021-04-16 12:11:20 -07:00
Sanjay Patel	bb907b26e2	[ValueTracking] don't recursively compute known bits using multiple llvm.assumes This is an alternative to D99759 to avoid the compile-time explosion seen in: https://llvm.org/PR49785 Another potential solution would make the exclusion logic stronger to avoid blowing up, but note that we reduced the complexity of the exclusion mechanism in D16204 because it was too costly. So I'm questioning the need for recursion/exclusion entirely - what is the optimization value vs. cost of recursively computing known bits based on assumptions? This was built into the implementation from the start with `60db058`, and we have kept adding code/cost to deal with that capability. By clearing the query's AssumptionCache inside computeKnownBitsFromAssume(), this patch retains all existing assume functionality except refining known bits based on even more assumptions. We have 1 regression test that shows a difference in optimization power. Differential Revision: https://reviews.llvm.org/D100573	2021-04-16 08:43:35 -04:00
Mircea Trofin	0d06b14f59	[MLGO] Fix use of AM.invalidate post D100519 The ML inline advisors more aggressively invalidate certain analyses after each call site inlining, to more accurately capture the problem state.	2021-04-15 18:45:39 -07:00
Arthur Eubanks	c8f0a7c215	[NewPM] Cleanup IR printing instrumentation Being lazy with printing the banner seems hard to reason with, we should print it unconditionally first (it could also lead to duplicate banners if we have multiple functions in -filter-print-funcs). The printIR() functions were doing too many things. I separated out the call from PrintPassInstrumentation since we were essentially doing two completely separate things in printIR() from different callers. There were multiple ways to generate the name of some IR. That's all been moved to getIRName(). The printing of the IR name was also inconsistent, now it's always "IR Dump on $foo" where "$foo" is the name. For a function, it's the function name. For a loop, it's what's printed by Loop::print(), which is more detailed. For an SCC, it's the list of functions in parentheses. For a module it's "[module]", to differentiate between a possible SCC with a function called "module". To preserve D74814, we have to check if we're going to print anything at all first. This is unfortunate, but I would consider this a special case that shouldn't be handled in the core logic. Reviewed By: jamieschmeiser Differential Revision: https://reviews.llvm.org/D100231	2021-04-15 09:50:55 -07:00
dfukalov	ce1626f34a	[AA] Updates for D95543. Addressing latter comments in D95543: - `AliasResult::Result` renamed to `AliasResult::Kind` - Offset printing added for `PartialAlias` case in `-aa-eval` - Removed VisitedPhiBBs check from BasicAA Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D100454	2021-04-15 12:22:03 +03:00
Nikita Popov	a1ed025d0e	Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting" This reverts commit `faf9f11589`. Issues with this patch have been reported in https://reviews.llvm.org/D100264#2689917 and https://bugs.llvm.org/show_bug.cgi?id=49967.	2021-04-15 09:43:52 +02:00
William S. Moses	d3e2b4c0a2	[SROA][TBAA] Handle shift of regular TBAA nodes SROA shifts TBAA nodes in a way that may present a problem for !tbaa but not !tbaa.struct nodes. Differential Revision: https://reviews.llvm.org/D99851	2021-04-14 14:35:20 -04:00
Nikita Popov	0d91075f77	[ValueTracking] Don't require strictly positive for mul nsw recurrence Just like in the mul nuw case, it's sufficient that the step is non-zero. If the step is negative, then the values will jump between positive and negative, "crossing" zero, but the value of the recurrence is never actually zero.	2021-04-14 19:39:59 +02:00
Nikita Popov	5c0fb026c9	[ValueTracking] Don't require non-zero step for add nuw It's okay if the step is zero, we'll just stay at the same non-zero value in that case. The valuable part of this is that the step doesn't even need to be a constant anymore.	2021-04-14 19:06:18 +02:00
Sander de Smalen	4f42d873c2	[TTI] NFC: Change getArithmeticInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100317	2021-04-14 17:20:36 +01:00
Sander de Smalen	d84bd951a8	[TTI] NFC: Change getFPOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D100316	2021-04-14 17:20:36 +01:00
Sander de Smalen	1af35e77f4	[TTI] NFC: Change getVectorInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100315	2021-04-14 17:20:35 +01:00
Sander de Smalen	174e8f6c5e	[TTI] NFC: Change getShuffleCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100314	2021-04-14 17:20:35 +01:00
Sander de Smalen	14b934f8a6	[TTI] NFC: Change getCFInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D100313	2021-04-14 17:20:34 +01:00
Sander de Smalen	596f669cfb	[TTI] NFC: Change getCallInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D100312	2021-04-14 17:20:34 +01:00
Sanjay Patel	7ef2c68a3d	[InstSimplify] improve efficiency for detecting non-zero value Stepping through callstacks in the example from D99759 reveals this potential compile-time improvement. The savings come from avoiding ValueTracking's computing known bits if we have already dealt with special-case patterns. Further improvements in this direction seem possible. This makes a degenerate test based on PR49785 about 40x faster (25 sec -> 0.6 sec), but it does not address the larger question of how to limit computeKnownBitsFromAssume(). I.e., the original test there is still infinite-time for all practical purposes. Differential Revision: https://reviews.llvm.org/D100408	2021-04-14 09:04:15 -04:00
Sanjay Patel	5ae5d25e38	[ValueTracking] match negative-stepping non-zero recurrence This is pulled out of D100408. This avoids a regression that would be exposed by making the calling code from InstSimplify more efficient.	2021-04-14 08:57:53 -04:00
Sanjay Patel	4919365397	[ValueTracking] reduce code duplication; NFC The start value can't be null for something to be a non-zero recurrence, so hoist that common check out of the switch. Subsequent checks may be incomplete or over-specified as noted in: D100408	2021-04-14 08:32:42 -04:00
Philip Reames	00c8be3f93	fix whitespace type	2021-04-13 19:02:41 -07:00
Nikita Popov	faf9f11589	[SCEV] Don't walk uses of phis without SCEV expression when forgetting I've run into some cases where a large fraction of compile-time is spent invalidating SCEV. One of the causes is forgetLoop(), which walks all values that are def-use reachable from the loop header phis. When invalidating a topmost loop, that might be close to all values in a function. Additionally, it's fairly common for there to not actually be anything to invalidate, but we'll still be performing this walk again and again. My first thought was that we don't need to continue walking the uses if the current value doesn't have a SCEV expression. However, this isn't quite right, because SCEV construction can skip over values (e.g. for a chain of adds, we might only create a SCEV expression for the final value). What this patch does instead is to only walk the (full) def-use chain of loop phis that have a SCEV expression. If there's no expression for a phi, then we also don't have any dependent expressions to invalidate. Differential Revision: https://reviews.llvm.org/D100264	2021-04-13 20:28:17 +02:00
Sander de Smalen	03f47bdcb1	[TTI] NFC: Change get[Interleaved]MemoryOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100205	2021-04-13 14:21:02 +01:00
Sander de Smalen	d676b5749d	[TTI] NFC: Change getMaskedMemoryOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100204	2021-04-13 14:21:01 +01:00
Sander de Smalen	db134e2428	[TTI] NFC: Change getCmpSelInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100203	2021-04-13 14:21:01 +01:00
Sander de Smalen	2285dfb73f	[TTI] NFC: Change getMinMaxReductionCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100202	2021-04-13 14:21:00 +01:00
Sander de Smalen	bd86824d98	[TTI] NFC: Change getArithmeticReductionCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html This patch is practically NFC, with the exception of an AArch64 SVE related cost-model change, where we can now return an Invalid cost instead of some bogus number. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100201	2021-04-13 14:20:59 +01:00
Sander de Smalen	fd1f8a5462	[TTI] NFC: Change getGatherScatterOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100200	2021-04-13 14:20:59 +01:00
Sander de Smalen	92d8421f49	[TTI] NFC: Change getCastInstrCost and getExtractWithExtendCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100199	2021-04-13 14:20:58 +01:00
Gulfem Savrun Yeniceri	e96df3e531	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-04-13 01:29:41 +00:00
Yuanfang Chen	c5fda0e662	Reland "Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom"" This reverts commit `a3fabc79ae` (relands `f4d682d6ce` with fix for the compile-time regression issue).	2021-04-12 14:50:54 -07:00
Nikita Popov	a3fabc79ae	Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom" This reverts commit `f4d682d6ce`. This caused a significant compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=4b7bad9eaea2233521a94f6b096aaa88dc584e23&to=f4d682d6ce6c5b3a41a0acf297507c82f5c21eef&stat=instructions Possibly this is due to overeager parsing of target triples.	2021-04-12 22:55:59 +02:00
Arthur Eubanks	269b335bd7	[Inliner] Propagate SROA analysis through invariant group intrinsics SROA can handle invariant group intrinsics, let the inliner know that for better heuristics when the intrinsics are present. This fixes size issues in a couple files when turning on -fstrict-vtable-pointers in Chrome. Reviewed By: rnk, mtrofin Differential Revision: https://reviews.llvm.org/D100249	2021-04-12 10:54:22 -07:00
Hamza Sood	0a92aff721	Replace uses of std::iterator with explicit using This patch removes all uses of `std::iterator`, which was deprecated in C++17. While this isn't currently an issue while compiling LLVM, it's useful for those using LLVM as a library. For some reason there're a few places that were seemingly able to use `std` functions unqualified, which no longer works after this patch. I've updated those places, but I'm not really sure why it worked in the first place. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D67586	2021-04-12 10:47:14 -07:00
Yuanfang Chen	f4d682d6ce	[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom D24453 enabled libcalls simplification for ARM PCS. This may cause caller/callee calling conventions mismatch in some situations such as LTO. This patch makes instcombine aware that the compatible calling conventions differences are benign (not emitting undef idiom). Differential Revision: https://reviews.llvm.org/D99773	2021-04-12 09:32:23 -07:00
Roman Lebedev	6d44b3c56d	[NFCI][DomTreeUpdater] applyUpdates(): reserve space for updates first While, indeed, we may end up pushing fewer updates than we'd reserve space for, self-dominating updates aren't often enough for that to matter. But this should matter for normal updates.	2021-04-11 23:56:22 +03:00
Roman Lebedev	9829f5e6b1	[CVP] @llvm.[us]{min,max}() intrinsics handling If we can tell that either one of the arguments is taken, bypass the intrinsic. Notably, we are indeed fine with non-strict predicate: * UL: https://alive2.llvm.org/ce/z/69qVW9 https://alive2.llvm.org/ce/z/kNFTKf https://alive2.llvm.org/ce/z/AvaPw2 https://alive2.llvm.org/ce/z/oxo53i * UG: https://alive2.llvm.org/ce/z/wxHeGH https://alive2.llvm.org/ce/z/Lf76qx * SL: https://alive2.llvm.org/ce/z/hkeTGS https://alive2.llvm.org/ce/z/eR_b-W * SG: https://alive2.llvm.org/ce/z/wEqRm7 https://alive2.llvm.org/ce/z/FpAsVr Much like with all other comparison handling in CVP, while we could sort-of handle two Value's, at least for plain ICmpInst it does not appear to be worthwhile. This only fires 78 times on test-suite + dt + rs, but we don't canonicalize to these yet. (only SCEV produces them)	2021-04-11 00:33:47 +03:00

10854 Commits