llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	7414bbebc2	[Analysis] improve function signature checking for calloc This would crash later if we thought the parameters were valid for the standard library call as shown in: https://llvm.org/PR50846	2021-06-27 08:19:00 -04:00
Eli Friedman	8d5bf0709d	[NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion The implementation is identical, but it makes the semantics a bit more obvious.	2021-06-25 14:43:13 -07:00
Sanjay Patel	1076b6c4f0	[Analysis] use better version of getLibFunc to check for alloc/free calls There's no reason to use the weaker name-only analysis when we have a function prototype to check (in fact, we probably should not even have that name-only function exposed for general use, but removing it requires auditing all of the callers). The version of getLibFunc that takes a Function argument also does some prototype checking to make sure the arguments/return type match the expected signature of a real library call. This is NFC-intended because the code in MemoryBuiltins does its own function signature checking. For now, that means there may be some redundancy in the checking, but that should not be above the noise for compile-time. Ideally, we can move the checks to a single location. There's still a hole in the logic that allows the example in https://llvm.org/PR50846 to cause a compiler crash.	2021-06-25 12:14:07 -04:00
Florian Hahn	6478f3fb78	[SCEV] Support single-cond range check idiom in applyLoopGuards. This patch extends applyLoopGuards to detect a single-cond range check idiom that InstCombine generates. It extends applyLoopGuards to detect conditions of the form (-C1 + X < C2). InstCombine will create this form when combining two checks of the form (X u< C2 + C1) and (X >=u C1). In practice, this enables us to correctly compute a tight trip count bounds for code as in the function below. InstCombine will fold the minimum iteration check created by LoopRotate with the user check (< 8). void unsigned_check(short *pred, unsigned width) { if (width < 8) { for (int x = 0; x < width; x++) pred[x] = pred[x] * pred[x]; } } As a consequence, LLVM creates dead vector loops for the code above, e.g. see https://godbolt.org/z/cb8eTcqET https://alive2.llvm.org/ce/z/SHHW4d Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104741	2021-06-25 10:24:40 +01:00
Sanjay Patel	50db987d59	[InstSimplify] move extract with undef index fold; NFC This puts it closer to the other undef query check and will avoid a potential ordering problem if we allow folding non-constant-int indexes.	2021-06-24 13:22:10 -04:00
Florian Hahn	121ecb05e7	[SCEV] Generalize MatchBinaryAddToConst to support non-add expressions. This patch generalizes MatchBinaryAddToConst to support matching (A + C1), (A + C2), instead of just matching (A + C1), A. The existing cases can be handled by treating non-add expressions A as A + 0. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104634	2021-06-24 12:16:15 +01:00
Carl Ritson	ae266e743c	[LVI] Remove recursion from getValueForCondition (NFCI) Convert getValueForCondition to a worklist model instead of using recursion. In pathological cases getValueForCondition recurses heavily. Stack frames are quite expensive on x86-64, and some operating systems (e.g. Windows) have relatively low stack size limits. Using a worklist avoids potential failures from stack overflow. Differential Revision: https://reviews.llvm.org/D104191	2021-06-24 09:58:22 +09:00
Eli Friedman	b12192f7cd	[ScalarEvolution] Clarify implementation of getPointerBase(). getPointerBase should only be looking through Add and AddRec expressions; other expressions either aren't pointers, or can't be looked through. Technically, this is a functional change. For a multiply or min/max expression, if they have exactly one pointer operand, and that operand is the first operand, the behavior here changes. Similarly, if an AddRec has a pointer-type step, the behavior changes. But that shouldn't be happening in practice, and we plan to make such expressions illegal.	2021-06-23 12:55:59 -07:00
Eli Friedman	fdaf304e0d	[NFC][ScalarEvolution] Fix SCEVNAryExpr::getType(). SCEVNAryExpr::getType() could return the wrong type for a SCEVAddExpr. Remove it, and add getType() methods to the relevant subclasses. NFC because nothing uses it directly, as far as I know; this is just future-proofing.	2021-06-23 12:55:59 -07:00
Nikita Popov	00d3f7cc3c	[LAA] Make getPointersDiff() API compatible with opaque pointers Make getPointersDiff() and sortPtrAccesses() compatible with opaque pointers by explicitly passing in the element type instead of determining it from the pointer element type. The SLPVectorizer result is slightly non-optimal in that unnecessary pointer bitcasts are added. Differential Revision: https://reviews.llvm.org/D104784	2021-06-23 18:44:34 +02:00
Sanjay Patel	656001e7b2	[ValueTracking] look through bitcast of vector in computeKnownBits This borrows as much as possible from the SDAG version of the code (originally added with D27129 and since updated with big endian support). In IR, we can test more easily for correctness than we did in the original patch. I'm using the simplest cases that I could find for InstSimplify: we computeKnownBits on variable shift amounts to see if they are zero or in range. So shuffle constant elements into a vector, cast it, and shift it. The motivating x86 example from https://llvm.org/PR50123 is also here. We computeKnownBits in the caller code, but we only check if the shift amount is in range. That could be enhanced to catch the 2nd x86 test - if the shift amount is known too big, the result is 0. Alive2 understands the datalayout and agrees that the tests here are correct - example: https://alive2.llvm.org/ce/z/KZJFMZ Differential Revision: https://reviews.llvm.org/D104472	2021-06-23 11:46:46 -04:00
Juneyoung Lee	5af8bacc94	[InstSimplify] Add more poison folding optimizations This adds more poison folding optimizations to InstSimplify. Since all binary operators propagate poison, these are fine. Also, the precondition of `select cond, undef, x` -> `x` is relaxed to allow the case when `x` is undef. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104661	2021-06-23 20:25:24 +09:00
Florian Hahn	adee485adf	[SCEV] Support signed predicates in applyLoopGuards. This adds handling for signed predicates, similar to how unsigned predicates are already handled. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104732	2021-06-23 10:21:05 +01:00
Joseph Huber	2662351e3b	[OpenMP] Add new OpenMP globalization functions to library info Summary: The changes to globalization introduced in D97680 created two new functions to push / pop shareable memory on the GPU, __kmpc_alloc_shared and __kmpc_free_shared. This patch adds these new runtime functions to the library info so they can be used by the HeapToStack attributor interface. This optimization replaces malloc / free pairs with stack memory if legal. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D102087	2021-06-22 13:23:05 -04:00
Florian Hahn	6c782e6eb0	[SCEV] Reduce code to handle predicates in applyLoopGuards (NFC). Hoist out common recurrence check and sink updating the map, to reduce the code required to support additional predicates.	2021-06-22 15:56:45 +01:00
Nikita Popov	e638a290f7	[ConstantFold] Delay fetching pointer element type Don't do this while stripping pointer casts, instead fetch it at the end. This improves compatibility with opaque pointers for the case where the base object is not opaque.	2021-06-22 15:51:00 +02:00
Florian Hahn	d17798823c	[SCEV] Retain AddExpr flags when subtracting a foldable constant. Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2. But we can retain flags under certain conditions: * Adding a smaller constant is NUW if the original AddExpr was NUW. * Adding a constant with the same sign and small magnitude is NSW, if the original AddExpr was NSW. This can improve results after using `SimplifyICmpOperands`, which may subtract one in order to use stricter predicates, as is the case for `isKnownPredicate`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104319	2021-06-22 11:27:51 +01:00
Nikita Popov	04395fd6cb	[ConstantFolding] Separate conditions in GEP evaluation (NFC) Handle the gep p, 0-v case separately, and not as part of the loop that ensures all indices are constant integers. Those two things are not really related.	2021-06-22 11:14:47 +02:00
Eli Friedman	8f3d16905d	[ScalarEvolution] Ensure backedge-taken counts are not pointers. A backedge-taken count doesn't refer to memory; returning a pointer type is nonsense. So make sure we always return an integer. The obvious way to do this would be to just convert the operands of the icmp to integers, but that doesn't quite work out at the moment: isLoopEntryGuardedByCond currently gets confused by ptrtoint operations. So we perform the ptrtoint conversion late for lt/gt operations. The test changes are mostly innocuous. The most interesting changes are more complex SCEV expressions of the form "(-1 * (ptrtoint i8* %ptr to i64)) + %ptr)". This is expected: we can't fold this to zero because we need to preserve the pointer base. The call to isLoopEntryGuardedByCond in howFarToZero is less precise because of ptrtoint operations; this shows up in the function pr46786_c26_char in ptrtoint.ll. Fixing it here would require more complex refactoring. It should eventually be fixed by future improvements to isImpliedCond. See https://bugs.llvm.org/show_bug.cgi?id=46786 for context. Differential Revision: https://reviews.llvm.org/D103656	2021-06-21 16:24:16 -07:00
Jacob Hegna	f86d1f99b3	Remove ML inlining model artifacts. They are not conducive to being stored in git. Instead, we autogenerate mock model artifacts for use in tests. Production models can be specified with the cmake flag LLVM_INLINER_MODEL_PATH. LLVM_INLINER_MODEL_PATH has two sentinel values: - download, which will download the most recent compatible model. - autogenerate, which will autogenerate a "fake" model for testing the model uptake infrastructure. Differential Revision: https://reviews.llvm.org/D104251	2021-06-21 17:38:09 +00:00
Eli Friedman	62ed024c74	[NFC][ScalarEvolution] Clean up ExitLimit constructors. Make all the constructors forward to one constructor. Remove redundant assertions.	2021-06-20 17:40:30 -07:00
Juneyoung Lee	09e8c0d5aa	[InstSimplify] icmp poison, X -> poison This adds a simple transformation from icmp with poison constant to poison. Comparing poison with something else is poison, so this is okay. https://alive2.llvm.org/ce/z/e8iReb https://alive2.llvm.org/ce/z/q4MurY	2021-06-20 15:39:07 +09:00
Tomas Matheson	1bcfa84ae9	Allow building for release with EXPENSIVE_CHECKS D97225 moved LazyCallGraph verify() calls behind EXPENSIVE_CHECKS, but verify() is defined for debug builds only so this had the unintended effect of breaking release builds with EXPENSIVE_CHECKS. Fix by enabling verify() for both debug and EXPENSIVE_CHECKS. Differential Revision: https://reviews.llvm.org/D104514	2021-06-19 17:02:11 +01:00
Eli Friedman	8a567e5f22	[ScalarEvolution] Fix pointer/int type handling converting select/phi to min/max. The old version of this code would blindly perform arithmetic without paying attention to whether the types involved were pointers or integers. This could lead to weird expressions like negating a pointer. Explicitly handle simple cases involving pointers, like "x < y ? x : y". In all other cases, coerce the operands of the comparison to integer types. This avoids the weird cases, while handling most of the interesting cases. Differential Revision: https://reviews.llvm.org/D103660	2021-06-17 14:05:12 -07:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seems to have been missing in that patch was to make sure that the rest of LLVM also handles the fact that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions (typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Joachim Meyer	053dbb939d	Use `-cfg-func-name` value as filter for `-view-cfg`, etc. Currently the value is only used when calling `F->viewCFG()` which is missing out on its potential and usefulness. So I added the check to the printer passes as well. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D102011	2021-06-16 23:54:51 +02:00
Eli Friedman	27963ccf07	[NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI In preparation for D103660.	2021-06-16 12:32:32 -07:00
Sanjay Patel	ce95200b79	[InstSimplify] propagate poison through FP ops We already have this fold: fadd float poison, 1.0 --> poison ...via ConstantFolding, so this makes the behavior consistent if the other operand(s) are non-constant. The fold for undef was added before poison existed as a value/type in IR. This came up in D102673 / D103169 because we're trying to sort out the more complicated handling for constrained math ops. We should have the handling for the regular instructions done first, so we can build on that (or diverge as needed). Differential Revision: https://reviews.llvm.org/D104383	2021-06-16 11:31:58 -04:00
Roman Lebedev	a3113df219	[SCEV] PtrToInt on non-integral pointers is allowed As per (committed without review) @reames's rGac81cb7e6dde9b0890ee1780eae94ab96743569b change, we are now allowed to produce `ptrtoint` for non-integral pointers. This will unblock further unbreaking of SCEV regarding int-vs-pointer type confusion. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104322	2021-06-16 10:24:25 +03:00
Arthur Eubanks	9aa1428174	[InstSimplify] Treat invariant group insts as bitcasts for load operands We can look through invariant group intrinsics for the purposes of simplifying the result of a load. Since intrinsics can't be constants, but we also don't want to completely rewrite load constant folding, we convert the load operand to a constant. For GEPs and bitcasts we just treat them as constants. For invariant group intrinsics, we treat them as a bitcast. Relanding with a check for self-referential values. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D101103	2021-06-15 12:59:43 -07:00
spupyrev	0a0800c4d1	A post-processing for BFI inference The current implementation for computing relative block frequencies does not handle correctly control-flow graphs containing irreducible loops. This results in suboptimally generated binaries, whose perf can be up to 5% worse than optimal. To resolve the problem, we apply a post-processing step, which iteratively updates block frequencies based on the frequencies of their predecessors. This corresponds to finding the stationary point of the Markov chain by an iterative method aka "PageRank computation". The algorithm takes at most O(|E| * IterativeBFIMaxIterations) steps but typically converges faster. It is turned on by passing option `use-iterative-bfi-inference` and applied only for functions containing profile data and irreducible loops. Tested on SPEC06/17, where it is helping to get correct profile counts for one of the binaries (403.gcc). In prod binaries, we've seen a speedup of up to 2%-5% for binaries containing functions with hot irreducible loops. Reviewed By: hoy, wenlei, davidxl Differential Revision: https://reviews.llvm.org/D103289	2021-06-11 21:46:04 -07:00
Andrew Litteken	f6dea2e732	[IRSim] Strip out the findSimilarity call from the constructor Both doInitialize and runOnModule were running the entire analysis due to the actual work being done in the constructor. Strip it out here and only get the similarity during runOnModule. Author: lanza Reviewers: AndrewLitteken, paquette, plofti Differential Revision: https://reviews.llvm.org/D92524	2021-06-11 18:41:28 -05:00
Andrew Litteken	64720f57be	[IRSim] Don't copy the Mapper for createCandidatesFromSuffixTree Every invocation this was copying the Mapper for no reason. Take a const ref instead. Author: lanza Reviewers: AndrewLitteken, plofti, paquette, Differential Review: https://reviews.llvm.org/D92532	2021-06-11 16:36:23 -05:00
Simon Pilgrim	5e6bfb661e	[Analysis] Pass RecurrenceDescriptor as const reference. NFCI. We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.). Differential Revision: https://reviews.llvm.org/D104029	2021-06-11 10:24:14 +01:00
Philip Reames	7629b2a09c	[LI] Add a cover function for checking if a loop is mustprogress [nfc] Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.	2021-06-10 13:37:32 -07:00
Philip Reames	aaaeb4b160	[SCEV] Use mustprogress flag on loops (in addition to function attribute) This addresses a performance regression reported against `3c6e4191`. That change (correctly) limited a transform based on assumed finiteness to mustprogress loops, but the previous change (`38540d7`) which introduced the mustprogress check utility only handled function attributes, not the loop metadata form. It turns out that clang uses the function attribute form for C++, and the loop metadata form for C. As a result, `3c6e4191` ended up being a large regression in practice for C code as loops weren't being considered mustprogress despite the language semantics.	2021-06-10 13:20:28 -07:00
Philip Reames	b6ee5f2b1d	Move code for checking loop metadata into Analysis [nfc] I need the mustprogress loop metadata in ScalarEvolution and it makes sense to keep all the accessors for querying loop metadata together.	2021-06-10 13:01:22 -07:00
Serge Pavlov	8ff36aab69	[ConstantFolding] Enable folding of min/max/copysign for all floats Previously such folding was enabled for half, float and double values only. With this change it is allowed for other floating point values also. Differential Revision: https://reviews.llvm.org/D103956	2021-06-10 11:57:51 +07:00
Philip Reames	b65f30d6fb	[SCEV] Minor code motion to simplify a later patch [nfc]	2021-06-09 14:17:06 -07:00
Arthur Eubanks	222cce3828	Revert "[InstSimplify] Treat invariant group insts as bitcasts for load operands" This reverts commit `26044c6a54`. Breaks on invalid IR (see D101103).	2021-06-09 11:46:10 -07:00
Florian Hahn	b76f1f1202	[SCEV] Keep common NUW flags when inlining Add operands. Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr operands. As a consequence, we always drop flags when building expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`. We should be able to retain NUW flags common among all inlined SCEVAddExpr and the original flags. Reviewed By: nikic, mkazantsev Differential Revision: https://reviews.llvm.org/D103877	2021-06-09 17:13:21 +01:00
Artur Pilipenko	9197bac297	Add an option to hide "cold" blocks from CFG graph Introduce a new cl::opt to hide "cold" blocks from CFG DOT graphs. Use BFI to get block relative frequency. Hide the block if the frequency is below the threshold set by the command line option value. Reviewed By: davidxl, hoy Differential Revision: https://reviews.llvm.org/D103640	2021-06-08 11:29:27 -07:00
Caroline Concatto	6fd1604d14	[InstCombine] Add instcombine fold for extractelement + splat for scalable vectors This patch allows that scalable vector can also use the fold that already exists for fixed vector, only when the lane index is lower than the minimum number of elements of the vector. Differential Revision: https://reviews.llvm.org/D102404	2021-06-08 10:43:38 +01:00
Philip Reames	3c6e419198	[SCEV] Properly guard reasoning about infinite loops being UB on mustprogress Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location. Differential Revision: https://reviews.llvm.org/D103834	2021-06-07 14:47:36 -07:00
Daniil Suchkov	d32cc150fe	[BasicAA] Handle PHIs without incoming values gracefully Fix a bug introduced by `f6f6f6375d`. Now for empty PHIs, instead of crashing on assert(hasVal()) in Optional's internals, we'll return NoAlias, as we did before that patch. Differential Revision: https://reviews.llvm.org/D103831	2021-06-07 21:39:01 +00:00
Philip Reames	38540d71c7	[SCEV] Compute exit counts for unsigned IVs using mustprogress semantics The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be: for (unsigned i = 0; i < N; i += 2) { body; } Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body". The basic intuition behind this patch is as follows: * A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max. * Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this. In LLVM, this is tied to the mustprogress attribute. Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop. A couple notes for those reading along: * I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules. * We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path. Differential Revision: https://reviews.llvm.org/D103118	2021-06-07 11:24:00 -07:00
Simon Pilgrim	76a1be05fa	AssumeBundleQueries.cpp - don't dereference a dyn_cast<> result. NFCI. Use cast<> instead which will assert that the cast is correct and not just return null - the match() should have already failed if the cast isn't valid anyhow. Fixes static analysis warning.	2021-06-06 15:25:03 +01:00
Roman Lebedev	e350494fb0	[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper We might want to use it when creating SCEV proper in createSCEV(), now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`, which might have caused us to lose some optimization potential.	2021-06-05 12:17:51 +03:00
Fangrui Song	06e7de795b	Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build	2021-06-04 23:34:43 -07:00
Artur Pilipenko	a06e63fa52	NFC. Refactor DOTGraphTraits::isNodeHidden Restructure handling of cfg-hide-unreachable-paths and cfg-hide-deoptimize-paths options so as to make it easier to introduce new types of hidden blocks.	2021-06-03 11:27:06 -07:00
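
Commit ae266e743c above replaces recursion with a worklist to avoid stack overflow on deeply nested conditions. The following is only a minimal sketch of that general technique, not the code from LazyValueInfo: the `Cond` type, the `Kind` enum, and the `evaluate` function are hypothetical names invented for illustration. Nodes are pushed onto an explicit stack and revisited once the results of both operands are cached.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical condition tree; these are NOT LLVM's data structures.
enum class Kind { Leaf, And, Or };

struct Cond {
  Kind kind;
  bool value;       // only meaningful for Leaf nodes
  const Cond *lhs;  // null for Leaf nodes
  const Cond *rhs;  // null for Leaf nodes
};

// Evaluates the tree rooted at `root` without recursion. Each node stays
// on the worklist until both of its operands have a cached result.
bool evaluate(const Cond *root) {
  assert(root && "expected a non-null condition");
  std::unordered_map<const Cond *, bool> done;
  std::vector<const Cond *> worklist{root};

  while (!worklist.empty()) {
    const Cond *cur = worklist.back();

    // Already computed (e.g. a shared operand pushed twice).
    if (done.count(cur)) {
      worklist.pop_back();
      continue;
    }

    if (cur->kind == Kind::Leaf) {
      done[cur] = cur->value;
      worklist.pop_back();
      continue;
    }

    auto l = done.find(cur->lhs);
    auto r = done.find(cur->rhs);
    if (l == done.end() || r == done.end()) {
      // Operands not ready yet: schedule them and revisit `cur` later.
      if (l == done.end())
        worklist.push_back(cur->lhs);
      if (r == done.end())
        worklist.push_back(cur->rhs);
      continue;
    }

    // Copy the operand results before inserting into the map.
    bool lv = l->second;
    bool rv = r->second;
    done[cur] = (cur->kind == Kind::And) ? (lv && rv) : (lv || rv);
    worklist.pop_back();
  }
  return done.at(root);
}
```

Because the pending nodes live in a heap-allocated vector rather than in call frames, the nesting depth is limited by available memory instead of the thread's stack size.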

10625 Commits