llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	337948ac6e	[InstCombine] add folds for binop with sexted bool and constant operands This is a generalization/extension of the existing and/or folds noted with TODO comments. Those have a one-use constraint that is not necessary. Potential follow-ups are noted by the TODO comments in the new function. We can also call this function from other binop visit* functions, but we need to add tests first. This solves: https://llvm.org/PR52543 https://alive2.llvm.org/ce/z/NWuCR5	2021-11-20 12:33:00 -05:00
Stanislav Mekhanoshin	c74f2e5b27	[InstCombine] Use SpecificBinaryOp_match in two more places Differential Revision: https://reviews.llvm.org/D114038	2021-11-17 01:16:06 -08:00
Nikita Popov	9f0194be45	[ConstantRange] Add getEquivalentICmp() variant with offset (NFCI) Add a variant of getEquivalentICmp() that produces an optional offset. This allows us to create an equivalent icmp for all ranges. Use this in the with.overflow folding code, which was doing this adjustment separately -- this clarifies that the fold will indeed always apply.	2021-11-06 21:59:45 +01:00
Shoaib Meenai	6404f4b5af	[InstCombine] Remove attributes after hoisting free above null check If the parameter had been annotated as nonnull because of the null check, we want to remove the attribute, since it may no longer apply and could result in miscompiles if left. Similarly, we also want to remove undef-implying attributes, since they may not apply anymore either. Fixes PR52110. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D111515	2021-10-13 15:34:56 -07:00
Philip Reames	47d10b25f8	[instcombine] PRE freeze to only potentially poison/undef operand of phi This extends the foldOpIntoPhi code used when visiting a freeze user of a phi to allow any non-undef/poison operand as opposed to only non-undef/poison constants. This lets us hoist a freeze in the increment of an IV into the preheader in many cases. Differential Revision: https://reviews.llvm.org/D111744	2021-10-13 13:55:54 -07:00
Philip Reames	6f34839407	[instcombine] propagate freeze through single use poison producing flag instruction If we have an instruction which produces poison only when flags are specified on the instruction, then we know that freezing the operands and dropping flags is equivalent to freezing the result. If we know those flags don't result in any undefined behavior being executed, then there's no point in preserving the flags as we gain no knowledge by having them. This patch extends the existing propagation logic which sinks freeze to single potential non-poison operands to allow dropping of flags when we know the freeze is the sole use of the instruction with poison flags. The main value is that we tend to sink freezes towards the phi in IV cycles where the incoming value to the phi is the freeze of an IV increment. This will in turn (in a future patch), let us fold the freeze through the phi into the loop preheader. Motivated by eliminating need for CanonicalizeFreezeInLoops for the clearly profitable cases from onephi.ll test case in the test directory. Differential Revision: https://reviews.llvm.org/D111675	2021-10-12 13:52:41 -07:00
Hongtao Yu	098a0d8fbc	[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3. This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation. Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are: - Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes - Unblocked CSE by avoiding pseudo probe from clobbering memory SSA - Unblocked induction variable simplification - Allow empty loop deletion by treating probe intrinsic isDroppable - Some refactoring. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110847	2021-10-12 09:44:12 -07:00
Sanjay Patel	6a2a84c253	[InstCombine] add helper for "is desirable int type"; NFC This splits out the logic from shouldChangeType() that currently allows 8/16/32-bit transforms even if those types are not listed as legal in the data layout. This could be useful as a predicate for vector insert/extract transforms. Note that this leaves the subsequent checks in shouldChangeType() unchanged. We may want to merge the checks for i1 and/or "ToLegal" into "isDesirable", but that may alter existing transforms.	2021-10-04 14:30:18 -04:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Kazu Hirata	4f0225f6d2	[Transforms] Migrate from getNumArgOperands to arg_size (NFC) Note that getNumArgOperands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-10-01 09:57:40 -07:00
Alex Richardson	05663dc146	[InstSimplify] Don't lose inbounds when simplifying a GEP I noticed this while working on a (ptrtoint (gep null, x)) -> x fold. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D110168	2021-09-23 09:25:06 +01:00
hyeongyu kim	ec8311444a	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (2/3) This patch is for fixing potential shufflevector-related bugs like D93818. As in D93818, this patch changes shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineCompares and InstructionCombining. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110227	2021-09-23 00:14:50 +09:00
Florian Hahn	e08a5dc86f	[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC). InstCombine's worklist can be re-used by other passes like VectorCombine. Move it to llvm/Transform/Utils and rename it to InstructionWorklist. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D110181	2021-09-22 08:47:21 +01:00
Owen Anderson	b5fbbdd202	Teach InstCombine to eliminate malloc-realloc-free triplets. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D109988	2021-09-21 18:07:49 +00:00
Anna Thomas	69921f6f45	[InstCombine] Improve TryToSinkInstruction with multiple uses This patch allows sinking an instruction which can have multiple uses in a single user. We were previously over-restrictive by looking for exactly one use, rather than one user. Also added an API for retrieving a unique undroppable user. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D109700	2021-09-21 10:04:04 -04:00
Nikita Popov	dd0226561e	[IR] Add helper to convert offset to GEP indices We implement logic to convert a byte offset into a sequence of GEP indices for that offset in a number of places. This patch adds a DataLayout::getGEPIndicesForOffset() method, which implements the core logic. I've updated SROA, ConstantFolding and InstCombine to use it, and there's a few more places where it looks relevant. Differential Revision: https://reviews.llvm.org/D110043	2021-09-20 20:18:16 +02:00
Nikita Popov	0fc624f029	[IR] Return AAMDNodes from Instruction::getMetadata() (NFC) getMetadata() currently uses a weird API where it populates a structure passed to it, and optionally merges into it. Instead, we can return the AAMDNodes and provide a separate merge() API. This makes usages more compact. Differential Revision: https://reviews.llvm.org/D109852	2021-09-16 21:06:57 +02:00
Kazu Hirata	24c8eaec94	[Transforms] Use make_early_inc_range (NFC)	2021-09-15 19:55:24 -07:00
Anna Thomas	f9e4aebe4a	Revert "[InstCombine] Improve TryToSinkInstruction with multiple uses" This reverts commit `4ac4e52189`. There are a couple of test failures, which need updates to the test cases. Doing a clean revert and will recommit the change along with fixed testcases.	2021-09-15 18:03:11 -04:00
Anna Thomas	4ac4e52189	[InstCombine] Improve TryToSinkInstruction with multiple uses This patch allows sinking an instruction which can have multiple uses in a single user. We were previously over-restrictive by looking for exactly one use, rather than one user. Also, the API for retrieving undroppable user has been updated accordingly since in both usecases (Attributor and InstCombine), we seem to care about the user, rather than the use. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D109700	2021-09-15 20:39:38 +00:00
Anna Thomas	b4e787d8f4	[InstCombining] Refactor checks for TryToSinkInstruction. NFC Moved out the checks for profitability of TryToSinkInstructions into a lambda function. This will also allow us to easily add checks for bailing out if the transform is not profitable. Tests-Run: instCombine tests.	2021-09-13 09:04:34 -04:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Simon Pilgrim	10c982e0b3	Revert rG1c9bec727ab5c53fa060560dc8d346a911142170 : [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) Reverted (manually due to merge conflicts) while regressions reported on PR51540 are investigated As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450	2021-08-23 21:09:26 +01:00
Chang-Sun Lin, Jr	9cae598f8b	[InstCombine] Avoid folding GEPs across loop boundaries Folding a GEP from outside to inside a loop will materialize an add where there wasn't an equivalent operation before. Check the containing loops before making this fold. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D107935	2021-08-19 20:03:44 +03:00
Sanjay Patel	de285eacb0	[InstCombine] allow for constant-folding in GEP transform This would crash the reduced test or as described in https://llvm.org/PR51485 ...because we can't mark a constant (-expression) with 'inbounds'.	2021-08-16 10:36:56 -04:00
Krishna	d99260641b	[InstCombine] Fold phi ( inttoptr/ptrtoint x ) to phi (x) The inttoptr/ptrtoint roundtrip optimization is not always correct. We are working towards removing this optimization and adding support to specific cases where this optimization works. In this patch, we focus on phi-node operands with inttoptr casts. We know that ptrtoint( inttoptr( ptrtoint x) ) is same as ptrtoint (x). So, we want to remove this roundtrip cast which goes through phi-node. Reviewed By: aqjune Differential Revision: https://reviews.llvm.org/D106289	2021-08-03 17:52:59 +05:30
Eli Friedman	5c486ce04d	[LLVM IR] Allow volatile stores to trap. Proposed alternative to D105338. This is ugly, but short-term I think it's the best way forward: first, let's formalize the hacks into a coherent model. Then we can consider extensions of that model (we could have different flavors of volatile with different rules). Differential Revision: https://reviews.llvm.org/D106309	2021-07-26 10:51:00 -07:00
hyeongyu kim	aca5aeb752	[InstCombine] Add freezeAllUsesOfArgument to visitFreeze In D106041, a freeze was added before the branch condition to solve the miscompilation problem of SimpleLoopUnswitch. However, I found that the added freeze disturbed other optimizations in the following situations. ``` arg.fr = freeze(arg) use(arg.fr) ... use(arg) ``` It is a problem that occurred when arg and arg.fr were recognized as different values. Therefore, changing to use arg.fr instead of arg throughout the function eliminates the above problem. Thus, I add a function that changes all uses of arg to freeze(arg) to visitFreeze of InstCombine. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D106233	2021-07-24 18:08:58 +09:00
Simon Pilgrim	1c9bec727a	[InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450	2021-07-22 10:58:51 +01:00
Krishna Kariya	da92e86263	[InstCombine] Fold IntToPtr/PtrToInt to bitcast The inttoptr/ptrtoint roundtrip optimization is not always correct. We are working towards removing this optimization and adding support to specific cases where this optimization works. This patch is the first one on this line. Consider the example: %i = ptrtoint i8* %X to i64 %p = inttoptr i64 %i to i16* %cmp = icmp eq i8* %load, %p In this specific case, the inttoptr/ptrtoint optimization is correct as it only compares the pointer values. In this patch, we fold inttoptr/ptrtoint to a bitcast (if src and dest types are different). Differential Revision: https://reviews.llvm.org/D105088	2021-07-18 23:13:25 +02:00
Arthur Eubanks	04b75c05b0	[InstCombine] Look through invariant group intrinsics when removing malloc Fixes some regressions with -fstrict-vtable-pointers in llvm-test-suite. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D106017	2021-07-15 09:02:40 -07:00
hyeongyu kim	1a5f4cbe1b	[InstCombine] Add optimization to prevent poison from being propagated. In D104569, Freeze was inserted just before br to solve the `branching on undef` miscompilation problem. But value analysis was being disturbed by the added freeze. ``` v = load ptr cond = freeze(icmp (and v, const), const') br cond, ... ``` The case in which value analysis is disturbed is shown above. By moving the freeze to immediately after the load, value analysis will be successful again. ``` v = load ptr freeze(icmp (and v, const), const') => v = load ptr v' = freeze v icmp (and v', const), const' ``` In this patch, I propose the above optimization. With this patch, the poison will not spread as the freeze is performed early. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D105392	2021-07-11 12:40:43 +09:00
Nico Weber	97c675d3d4	Revert "Revert "Temporarily do not drop volatile stores before unreachable"" This reverts commit `52aeacfbf5`. There isn't full agreement on a path forward yet, but there is agreement that this shouldn't land as-is. See discussion on https://reviews.llvm.org/D105338 Also reverts unreviewed "[clang] Improve `-Wnull-dereference` diag to be more in-line with reality" This reverts commit `f4877c78c0`. And all the related changes to tests: This reverts commit `9a0152799f`. This reverts commit `3f7c9cc274`. This reverts commit `329f8197ef`. This reverts commit `aa9f58cc2c`. This reverts commit `2df37d5ddd`. This reverts commit `a72a441812`.	2021-07-09 11:44:34 -04:00
Roman Lebedev	52aeacfbf5	Revert "Temporarily do not drop volatile stores before unreachable" This reverts commit `4e413e1621`, which landed almost 10 months ago under premise that the original behavior didn't match reality and was breaking users, even though it was correct as per the LangRef. But the LangRef change still hasn't appeared, which might suggest that the affected parties aren't really worried about this problem. Please refer to discussion in: * https://reviews.llvm.org/D87399 (`Revert "[InstCombine] erase instructions leading up to unreachable"`) * https://reviews.llvm.org/D53184 (`[LangRef] Clarify semantics of volatile operations.`) * https://reviews.llvm.org/D87149 (`[InstCombine] erase instructions leading up to unreachable`) clang has `-Wnull-dereference` which will diagnose the obvious cases of null dereference, it was adjusted in `f4877c78c0`, but it will only catch the cases where the pointer is a null literal, it will not catch the cases where an arbitrary store is expected to trap. Differential Revision: https://reviews.llvm.org/D105338	2021-07-09 14:16:54 +03:00
Roman Lebedev	fc150cecd7	[SimplifyCFG] simplifyUnreachable(): erase instructions iff they are guaranteed to transfer execution to unreachable This replaces the current ad-hoc implementation, by syncing the code from InstCombine's implementation in `InstCombinerImpl::visitUnreachableInst()`, with one exception that here in SimplifyCFG we are allowed to remove EH instructions. Effectively, this now allows SimplifyCFG to remove calls (iff they won't throw and will return), arithmetic/logic operations, etc. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D105374	2021-07-03 10:45:44 +03:00
Roman Lebedev	13e35ac124	[NFC][InstCombine] visitUnreachableInst(): enhance comments somewhat	2021-07-02 17:30:01 +03:00
Roman Lebedev	dadedc99e9	[InstCombine] visitUnreachableInst(): iteratively erase instructions leading to unreachable In the original review D87149 it was mentioned that this approach was tried, and it led to infinite combine loops, but i'm not seeing anything like that now, neither in the `check-llvm`, nor on some codebases i tried. This is a recommit of `d9d65527c2`, which i immediately reverted because i have messed up something during branch switch, and `597ccc92ce` accidentally ended up being pushed, which was very much not the intention. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D105339	2021-07-02 17:20:21 +03:00
Roman Lebedev	93a1642763	Revert "[NFCI][InstCombine] visitUnreachableInst(): iteratively erase instructions leading to unreachable" This reverts commit `d9d65527c2`.	2021-07-02 17:17:47 +03:00
Roman Lebedev	d9d65527c2	[NFCI][InstCombine] visitUnreachableInst(): iteratively erase instructions leading to unreachable In the original review D87149 it was mentioned that this approach was tried, and it led to infinite combine loops, but i'm not seeing anything like that now, neither in the `check-llvm`, nor on some codebases i tried. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D105339	2021-07-02 17:17:03 +03:00
Philip Reames	955f125899	[instcombine] Fold overflow check using overflow intrinsic to comparison This follows up to D104665 (which added umulo handling alongside the existing uaddo case), and generalizes for the remaining overflow intrinsics. I went to add analogous handling to LVI, and discovered that LVI already had a more general implementation. Instead, we can port what LVI does to instcombine. (For context, LVI uses makeExactNoWrapRegion to constrain the value 'x' in blocks reached after a branch on the condition `op.with.overflow(x, C).overflow`.) Differential Revision: https://reviews.llvm.org/D104932	2021-07-01 09:41:55 -07:00
Sanjay Patel	153da08a6c	[InstCombine] hoist min/max intrinsics above select with constant op This is an extension of the handling for unary intrinsics and follows the logic that we use for binary ops. We don't canonicalize to min/max intrinsics yet, but this might help unlock other folds seen in D98152.	2021-06-27 10:02:23 -04:00
Philip Reames	2cd23eb243	[instcombine] Fold overflow check using umulo to comparison If we have a umul.with.overflow where the multiply result is not used and one of the operands is a constant, we can perform the overflow check more cheaply with a comparison than by performing the multiply and extracting the overflow flag. (Noticed when looking at the conditions SCEV emits for overflow checks.) Differential Revision: https://reviews.llvm.org/D104665	2021-06-25 10:25:45 -07:00
Sanjay Patel	64b2676ca8	[InstCombine] fold ctlz/cttz-of-select with 1 or more constant arms Building on: `4c44b02d87` ...and adding handling for the extra operand in these intrinsics. This pattern is discussed in: https://llvm.org/PR50140	2021-06-21 11:04:12 -04:00
Juneyoung Lee	ce192ced2b	[InstCombine] Use poison constant to represent the result of unreachable instrs This patch updates InstCombine to use poison constant to represent the resulting value of (either semantically or syntactically) unreachable instrs, or a don't-care value of an unreachable store instruction. This allows more aggressive folding of unused results, as shown in llvm/test/Transforms/InstCombine/getelementptr.ll . Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104602	2021-06-21 09:58:44 +09:00
Sanjay Patel	4c44b02d87	[InstCombine] fold ctpop-of-select with 1 or more constant arms The general pattern is mentioned in: https://llvm.org/PR50140 ...but we need to do a bit more to handle intrinsics with extra operands like ctlz/cttz.	2021-06-20 11:28:45 -04:00
Fraser Cormack	ae3f6de3a8	[InstCombine] Support negation of scalable-vector splats This patch is an extension of D103421. It allows the InstCombiner to generate the negated form of integer scalable-vector splats. It can technically handle fixed-length vectors too but those are completely covered by the preceding logic. This enables extra combining opportunities for scalable vector types. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D103801	2021-06-07 15:14:00 +01:00
Arthur Eubanks	6b9524a05b	[NewPM] Don't mark AA analyses as preserved Currently all AA analyses marked as preserved are stateless, not taking into account their dependent analyses. So there's no need to mark them as preserved, they won't be invalidated unless their analyses are. SCEVAAResults was the one exception to this, it was treated like a typical analysis result. Make it like the others and don't invalidate unless SCEV is invalidated. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D102032	2021-05-18 13:49:03 -07:00
Nikita Popov	f9e9b0cdb4	[CFG] Move reachable from entry checks into basic block variant These checks are not specific to the instruction based variant of isPotentiallyReachable(), they are equally valid for the basic block based variant. Move them there, to make sure that switching between the instruction and basic block variants cannot introduce regressions.	2021-05-15 15:42:02 +02:00
Philip Reames	15e19a2599	Revert "[instcombine] Exploit UB implied by nofree attributes" This change effectively reverts `86664638`, but since there have been some changes on top and I wanted to leave the tests in, it's not a mechanical revert. Why revert this now? Two main reasons: 1) There is continuing discussion around the semantics of nofree. I am getting increasingly uncomfortable with the seeming possibility we might redefine nofree in a way incompatible with these changes. 2) There was a reported miscompile triggered by this change (https://github.com/emscripten-core/emscripten/issues/9443). At first, I was making good progress on tracking down the issues exposed and those issues appeared to be unrelated latent bugs. Now that we've found at least one bug in the original change, and the investigation has stalled, I'm no longer comfortable leaving this in tree. In retrospect, I probably should have reverted this earlier and investigated the issues once the triggering change was out of tree.	2021-04-22 10:53:17 -07:00
Philip Reames	3b1474cab2	free(nullptr) does not violate the nofree specification This fixes a subtle and nasty bug in my `86664638`. The problem is that free(nullptr) is well defined (and common). The specification for the nofree attributes talks about memory objects, and doesn't explicitly address null, but I think it's reasonable to assume that nofree doesn't disallow a call to free(nullptr). If it did, we'd have to prove nonnull on an argument to ever infer nofree which doesn't seem to be the intent. This was found by Nuno and Alive2 over in https://reviews.llvm.org/D100141#2697374. Differential Revision: https://reviews.llvm.org/D100779	2021-04-20 09:08:05 -07:00
Juneyoung Lee	1c10201d96	Update InstCombine to use undef matcher instead This is a patch to use m_Undef() matcher instead of isa<UndefValue>(). As suggested in D100122, this update is separately committed.	2021-04-18 11:05:36 +09:00
Philip Reames	ff55d01a8e	[nofree] Restrict semantics to memory visible to caller This patch clarifies the semantics of the nofree function attribute to make clear that it provides an "as if" semantic. That is, a nofree function is guaranteed not to free memory which existed before the call, but might allocate and then deallocate that same memory within the lifetime of the callee. This is the result of the discussion on llvm-dev under the thread "Ambiguity in the nofree function attribute". The most important part of this change is the LangRef wording. The rest is minor comment changes to emphasize the new semantics where code was accidentally consistent, and fix one place which wasn't consistent. That one place is currently narrowly used as it is primarily part of the ongoing (and not yet enabled) deref-at-point semantics work. Differential Revision: https://reviews.llvm.org/D100141	2021-04-16 11:38:55 -07:00
Philip Reames	908215b346	Use AssumeInst in a few more places [nfc] Follow up to `a6d2a8d6f5`. These were found by simply grepping for "::assume", and are the subset of that result which looked cleaner to me using the isa/dyn_cast patterns.	2021-04-06 13:18:53 -07:00
Philip Reames	a6d2a8d6f5	Add a subclass of IntrinsicInst for llvm.assume [nfc] Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class. A follow up change will do the same for the newer assumption query/bundle mechanisms.	2021-04-06 11:16:22 -07:00
Stephen Tozer	3bfddc2593	Reapply "[DebugInfo] Handle multiple variable location operands in IR" Fixed section of code that iterated through a SmallDenseMap and added instructions in each iteration, causing non-deterministic code; replaced SmallDenseMap with MapVector to prevent non-determinism. This reverts commit `01ac6d1587`.	2021-03-17 16:45:25 +00:00
Hans Wennborg	01ac6d1587	Revert "[DebugInfo] Handle multiple variable location operands in IR" This caused non-deterministic compiler output; see comment on the code review. > This patch updates the various IR passes to correctly handle dbg.values with a > DIArgList location. This patch does not actually allow DIArgLists to be produced > by salvageDebugInfo, and it does not affect any pass after codegen-prepare. > Other than that, it should cover every IR pass. > > Most of the changes simply extend code that operated on a single debug value to > operate on the list of debug values in the style of any_of, all_of, for_each, > etc. Instances of setOperand(0, ...) have been replaced with > replaceVariableLocationOp, which takes the value that is being replaced as an > additional argument. In places where this value isn't readily available, we have > to track the old value through to the point where it gets replaced. > > Differential Revision: https://reviews.llvm.org/D88232 This reverts commit `df69c69427`.	2021-03-17 13:36:48 +01:00
Mohammad Hadi Jooybar	302b80abf0	[InstCombine] Avoid Bitcast-GEP fusion for pointers directly from allocation functions Elimination of bitcasts with void pointer arguments results in GEPs with pure byte indexes. These GEPs do not preserve struct/array information and interrupt phi address translation in later pipeline stages. Here is the original motivation for this patch: ``` #include<stdio.h> #include<malloc.h> typedef struct __Node{ double f; struct __Node *next; } Node; void foo () { Node *a = (Node*) malloc (sizeof(Node)); a->next = NULL; a->f = 11.5f; Node *ptr = a; double sum = 0.0f; while (ptr) { sum += ptr->f; ptr = ptr->next; } printf("%f\n", sum); } ``` By explicit assignment `a->next = NULL`, we can infer the length of the linked list is `1`. In this case we can eliminate while loop traversal entirely. This elimination is supposed to be performed by GVN/MemoryDependencyAnalysis/PhiTranslation . The final IR before this patch: ``` define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 { entry: %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2 %next = getelementptr inbounds i8, i8* %call, i64 8 %0 = bitcast i8* %next to %struct.__Node** store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2 %f = bitcast i8* %call to double* store double 1.150000e+01, double* %f, align 8, !tbaa !8 %tobool12 = icmp eq i8* %call, null br i1 %tobool12, label %while.end, label %while.body.lr.ph while.body.lr.ph: ; preds = %entry %1 = bitcast i8* %call to %struct.__Node* br label %while.body while.body: ; preds = %while.body.lr.ph, %while.body %sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ] %ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ] %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0 %2 = load double, double* %f1, align 8, !tbaa !8 %add = fadd contract double %sum.014, %2 %next2 = getelementptr inbounds %struct.__Node, 
%struct.__Node* %ptr.013, i64 0, i32 1 %3 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2 %tobool = icmp eq %struct.__Node* %3, null br i1 %tobool, label %while.end, label %while.body while.end: ; preds = %while.body, %entry %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add, %while.body ] %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa) ret void } ``` Final IR after this patch: ``` ; Function Attrs: nofree nounwind define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 { while.end: %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double 1.150000e+01) ret void } ``` IR before GVN before this patch: ``` define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 { entry: %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2 %next = getelementptr inbounds i8, i8* %call, i64 8 %0 = bitcast i8* %next to %struct.__Node** store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2 %f = bitcast i8* %call to double* store double 1.150000e+01, double* %f, align 8, !tbaa !8 %tobool12 = icmp eq i8* %call, null br i1 %tobool12, label %while.end, label %while.body.lr.ph while.body.lr.ph: ; preds = %entry %1 = bitcast i8* %call to %struct.__Node* br label %while.body while.body: ; preds = %while.body.lr.ph, %while.body %sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ] %ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ] %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0 %2 = load double, double* %f1, align 8, !tbaa !8 %add = fadd contract double %sum.014, %2 %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1 %3 = load %struct.__Node*, %struct.__Node** %next2, align 8, 
!tbaa !2 %tobool = icmp eq %struct.__Node* %3, null br i1 %tobool, label %while.end.loopexit, label %while.body while.end.loopexit: ; preds = %while.body %add.lcssa = phi double [ %add, %while.body ] br label %while.end while.end: ; preds = %while.end.loopexit, %entry %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ] %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa) ret void } ``` IR before GVN after this patch: ``` define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 { entry: %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2 %0 = bitcast i8* %call to %struct.__Node* %next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1 store %struct.__Node* null, %struct.__Node** %next, align 8, !tbaa !2 %f = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 0 store double 1.150000e+01, double* %f, align 8, !tbaa !8 %tobool12 = icmp eq i8* %call, null br i1 %tobool12, label %while.end, label %while.body.preheader while.body.preheader: ; preds = %entry br label %while.body while.body: ; preds = %while.body.preheader, %while.body %sum.014 = phi double [ %add, %while.body ], [ 0.000000e+00, %while.body.preheader ] %ptr.013 = phi %struct.__Node* [ %2, %while.body ], [ %0, %while.body.preheader ] %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0 %1 = load double, double* %f1, align 8, !tbaa !8 %add = fadd contract double %sum.014, %1 %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1 %2 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2 %tobool = icmp eq %struct.__Node* %2, null br i1 %tobool, label %while.end.loopexit, label %while.body while.end.loopexit: ; preds = %while.body %add.lcssa = phi double [ %add, %while.body ] br label %while.end while.end: ; preds = 
%while.end.loopexit, %entry %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ] %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa) ret void } ``` The phi translation fails before this patch, which prevents GVN from removing the loop. The reason for this failure is in InstCombine. The instruction combining pass decides to convert: ``` %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) %0 = bitcast i8* %call to %struct.__Node* %next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1 store %struct.__Node* null, %struct.__Node** %next ``` to ``` %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) %next = getelementptr inbounds i8, i8* %call, i64 8 %0 = bitcast i8* %next to %struct.__Node** store %struct.__Node* null, %struct.__Node** %0 ``` GEP instructions with pure byte indexes (e.g. `getelementptr inbounds i8, i8* %call, i64 8`) are obstacles for address translation. Address translation looks for structural similarity between GEPs, and these GEPs usually do not match since they have a different structure. This change will cause a couple of failures in LLVM tests. However, in all cases we only need to change the result expected by the test. I will update those tests as soon as I get the green light on this patch. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D96881	2021-03-16 17:05:44 -04:00
Simonas Kazlauskas	7d7001b2cb	[InstCombine] Restrict a GEP transform to avoid changing provenance This is an alternative to D98120. Herein, instead of deleting the transformation entirely, we check that the underlying objects are both the same and therefore this transformation wouldn't incur a provenance change, if applied. https://alive2.llvm.org/ce/z/SYF_yv Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98588	2021-03-14 16:32:04 +02:00
Nikita Popov	42eb658f65	[OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC) This removes some (but not all) uses of type-less CreateGEP() and CreateInBoundsGEP() APIs, which are incompatible with opaque pointers. There are still a number of tricky uses left, as well as many more variation APIs for CreateGEP.	2021-03-12 21:01:16 +01:00
gbtozers	df69c69427	[DebugInfo] Handle multiple variable location operands in IR This patch updates the various IR passes to correctly handle dbg.values with a DIArgList location. This patch does not actually allow DIArgLists to be produced by salvageDebugInfo, and it does not affect any pass after codegen-prepare. Other than that, it should cover every IR pass. Most of the changes simply extend code that operated on a single debug value to operate on the list of debug values in the style of any_of, all_of, for_each, etc. Instances of setOperand(0, ...) have been replaced with replaceVariableLocationOp, which takes the value that is being replaced as an additional argument. In places where this value isn't readily available, we have to track the old value through to the point where it gets replaced. Differential Revision: https://reviews.llvm.org/D88232	2021-03-09 16:44:38 +00:00
Stephen Tozer	4343c68fa3	Fix: [DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch removed the only use of a lambda capture, triggering an error on `-Werror -Wunused-lambda-capture` builds.	2021-03-08 14:57:11 +00:00
gbtozers	e5d958c456	[DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch updates DbgVariableIntrinsics to support use of a DIArgList for the location operand, resulting in a significant change to its interface. This patch does not update all IR passes to support multiple location operands in a dbg.value; the only change is to update the DbgVariableIntrinsic interface and its uses. All code outside of the intrinsic classes assumes that an intrinsic will always have exactly one location operand; they will still support DIArgLists, but only if they contain exactly one Value. Among other changes, the setOperand and setArgOperand functions in DbgVariableIntrinsic have been made private. This is to prevent code from setting the operands of these intrinsics directly, which could easily result in incorrect/invalid operands being set. This does not prevent these functions from being called on a debug intrinsic at all, as they can still be called on any CallInst pointer; it is assumed that any code directly setting the operands on a generic call instruction is doing so safely. The intention for making these functions private is to prevent DIArgLists from being overwritten by code that's naively trying to replace one of the Values it points to, and also to fail fast if a DbgVariableIntrinsic is updated to use a DIArgList without a valid corresponding DIExpression.	2021-03-08 14:36:13 +00:00
Roman Lebedev	2ad1f5eb1a	[InstCombine] Don't canonicalize (gep i8* X, -(ptrtoint Y)) as (inttoptr (sub (ptrtoint X), (ptrtoint Y))) It's just a wrong thing to do. We introduce inttoptr where there were none, which results in losing all provenance information because we no longer have a GEP{i,}, and pessimizes all future optimizations, because we are basically not allowed to look past `inttoptr`. (gep i8* X, -(ptrtoint Y)) is the canonical form. So just drop this fold. Noticed while reviewing D98120.	2021-03-06 23:00:25 +03:00
Stephen Tozer	ec7b9b0c18	[InstCombine] Avoid redundant or out-of-order debug value sinking This patch modifies TryToSinkInstruction in the InstCombine pass, to prevent redundant debug intrinsics from being produced, and also prevent the intrinsics from being emitted in an incorrect order. It does this by ensuring that when this pass sinks an instruction and creates clones of the debug intrinsics that use that instruction, it inserts those debug intrinsics in their original order, and only inserts the last debug intrinsic for each variable in the Instruction's block. Differential revision: https://reviews.llvm.org/D95463	2021-02-26 13:04:33 +00:00
Philip Reames	8666463889	[instcombine] Exploit UB implied by nofree attributes This patch simply implements the documented UB of the current nofree attributes as specified. It doesn't try to be fancy about inference (yet), it just implements the cases already specified and inferred. Note: When this lands, it may expose miscompiles. If so, please revert and provide a test case. It's likely the bug is in the existing inference code and without a relatively complete test case, it will be hard to debug. Differential Revision: https://reviews.llvm.org/D96349	2021-02-18 08:34:22 -08:00
Hongtao Yu	1cb47a063e	[CSSPGO] Unblock optimizations with pseudo probe instrumentation. The IR/MIR pseudo probe intrinsics don't get materialized into real machine instructions and therefore they don't incur runtime cost directly. However, they come with indirect cost by blocking certain optimizations. Some of the blocking is intentional (such as blocking code merge) for better counts quality, while the rest is accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include: 1. IR InstCombine, sinking load operation to shorten lifetimes. 2. MIR LiveRangeShrink, similar to #1 3. MIR TwoAddressInstructionPass, i.e., opeq transform 4. MIR function argument copy elision 5. IR stack protection (not perf-critical, but nice to have). Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D95982	2021-02-10 12:43:17 -08:00
Kazu Hirata	302313a264	[Transforms] Use range-based for loops (NFC)	2021-02-08 22:33:53 -08:00
Jeroen Dobbelaere	dcc7706fcf	[InstCombine] Remove unused llvm.experimental.noalias.scope.decl A @llvm.experimental.noalias.scope.decl is only useful if there is !alias.scope and !noalias metadata that uses the declared scope. When that is not the case for at least one of the two, the intrinsic call can as well be removed. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95141	2021-01-24 13:55:50 +01:00
Roman Lebedev	4ed0d8f2f0	[NFC][InstCombine] Extract freelyInvertAllUsersOf() out of canonicalizeICmpPredicate() I'd like to use it in an upcoming fold.	2021-01-22 17:23:53 +03:00
Kazu Hirata	e53472de68	[Transforms] Use llvm::append_range (NFC)	2021-01-20 21:35:54 -08:00
Kazu Hirata	8f5da41c4d	[llvm] Construct SmallVector with iterator ranges (NFC)	2021-01-20 21:35:52 -08:00
Florian Hahn	c701f85c45	[STLExtras] Use return type from operator* of the wrapped iter. Currently make_early_inc_range cannot be used with iterators with operator* implementations that do not return a reference. Most notably in the LLVM codebase, this means the User iterator ranges cannot be used with make_early_inc_range, which slightly simplifies iterating over ranges while elements are removed. Instead of directly using BaseT::reference as return type of operator*, this patch uses decltype to get the actual return type of the operator* implementation in WrappedIteratorT. This patch also updates a few places to make use of make_early_inc_range. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D93992	2021-01-10 14:41:13 +00:00
Kazu Hirata	530c5af6a4	[Transforms] Construct SmallVector with iterator ranges (NFC)	2021-01-02 09:24:17 -08:00
Roman Lebedev	b3021a72a6	[IR][InstCombine] Add m_ImmConstant(), that matches on non-ConstantExpr constants, and use it A pattern to ignore ConstantExpr's is quite common, since they frequently lead into infinite combine loops, so let's make writing it easier.	2020-12-24 21:20:47 +03:00
Nikita Popov	90177912a4	Revert "[InstCombine] Fold gep inbounds of null to null" This reverts commit `eb79fd3c92`. This causes stage2 crashes, possibly due to StringMap being miscompiled. Reverting for now.	2020-12-24 10:20:31 +01:00
Nikita Popov	759b8c11c3	[InstCombine] Handle different pointer types when folding gep of null The source pointer type is not necessarily the same as the result pointer type, so we can't simply return the original null pointer, it might be a different one.	2020-12-23 21:58:26 +01:00
Nikita Popov	eb79fd3c92	[InstCombine] Fold gep inbounds of null to null Effectively, this is what we were previously already doing when the GEP was used in conjunction with a load or store, but this fold can also be applied more generally: > The only in bounds address for a null pointer in the default > address-space is the null pointer itself.	2020-12-23 21:41:53 +01:00
Florian Hahn	01089c876b	[InstCombine] Preserve !annotation on newly created instructions. If the source instruction has !annotation metadata, all instructions created during combining should also have it. Tell the builder to add it. The !annotation system was discussed on llvm-dev as part of 'RFC: Combining Annotation Metadata and Remarks' (http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html) This patch is based on an earlier patch by Francis Visoiu Mistrih. Reviewed By: thegameg, lebedev.ri Differential Revision: https://reviews.llvm.org/D91444	2020-12-17 15:20:23 +00:00
Florian Hahn	29077ae860	[IRBuilder] Generalize debug loc handling for arbitrary metadata. This patch extends IRBuilder to allow adding/preserving arbitrary metadata on created instructions. Instead of using references to specific metadata nodes (like DebugLoc), IRbuilder now keeps a vector of (metadata kind, MDNode *) pairs, which are added to each created instruction. The patch itself is a NFC and only moves the existing debug location handling over to the new system. In a follow-up patch it will be used to preserve !annotation metadata besides !dbg. The current approach requires iterating over MetadataToCopy to avoid adding duplicates, but given that the number of metadata kinds to copy/preserve is going to be very small initially (0, 1 (for !dbg) or 2 (!dbg and !annotation)) that should not matter. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D93400	2020-12-17 13:27:43 +00:00
Florian Hahn	eba09a2db9	[InstCombine] Preserve !annotation for newly created instructions. When replacing an instruction with !annotation with a newly created replacement, add the !annotation metadata to the replacement. This mostly covers cases where the new instructions are created using the ::Create helpers. Instructions created by IRBuilder will be handled by D91444. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D93399	2020-12-17 09:06:51 +00:00
Jun Ma	52a3267ffa	[InstCombine] Remove scalable vector restriction in foldVectorBinop Differential Revision: https://reviews.llvm.org/D93289	2020-12-15 21:14:59 +08:00
Sanjay Patel	4f051fe374	[InstCombine] avoid crash sinking to unreachable block The test is reduced from the example in D82005. Similar to `94f6d365e`, the test here would assert in the DomTree when we tried to convert a select to a phi with an unreachable block operand. We may want to add some kind of guard code in DomTree itself to avoid this sort of problem.	2020-12-10 13:10:26 -05:00
Sanjay Patel	94f6d365e4	[InstCombine] avoid crash on phi with unreachable incoming block (PR48369)	2020-12-06 09:31:47 -05:00
jasonliu	a65d8c5d72	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences would be 1. AIX does not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality routine, because the interoperability is not there yet on AIX. 3. AIX does not use eh_frame sections. Instead, it would use an eh_info section (compact unwind section) to store the information about personality routine and LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455	2020-12-02 18:42:44 +00:00
Luqman Aden	4c0a016927	Rename EHPersonality::MSVC_Win64SEH to EHPersonality::MSVC_TableSEH. NFC. The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining vs table-based exception handling. x86-32 is the only arch for which Windows uses the former. 32-bit ARM would use what is called Win64SEH today, which is a bit confusing so instead let's just rename it to be a bit more clear. Reviewed By: compnerd, rnk Differential Revision: https://reviews.llvm.org/D90117	2020-10-27 23:22:13 -07:00
Vedant Kumar	3419252a79	[InstCombine] Remove dbg.values describing contents of dead allocas When InstCombine removes an alloca, it erases the dbg.{addr,declare} instructions which refer to the alloca. It would be better to instead remove all debug intrinsics which describe the contents of the dead alloca, namely all dbg.value(<dead alloca>, ..., DW_OP_deref)'s. This effectively undoes work performed in an InstCombine run earlier in the pipeline by LowerDbgDeclare, which inserts DW_OP_deref dbg.values before CallInst users of an alloca. The motivating example looks like: ``` define void @foo(i32 %0) { %a = alloca i32 ; This alloca is erased. store i32 %0, i32* %a dbg.value(i32 %0, "arg0") ; This dbg.value survives. dbg.value(i32* %a, "arg0", DW_OP_deref) call void @trivially_inlinable_no_op(i32* %a) ret void } ``` If the DW_OP_deref dbg.value is not erased, it becomes dbg.value(undef) after inlining, making "arg0" unavailable. But we already have dbg.value descriptions of the alloca's value (from LowerDbgDeclare), so the DW_OP_deref dbg.value cannot serve its purpose of describing an initialization of the alloca by some callee. It invalidates other useful dbg.values, causing large gaps in location coverage, so we should delete it (even though doing so may cause stale dbg.values to appear, if there's a dead store to `%a` in @trivially_inlinable_no_op). OTOH, it wouldn't be correct to delete all dbg.value descriptions of an alloca. Note that it's possible to describe a variable that takes on different pointer values, e.g.: ``` void use(int *); void t(int a, int b) { int *local = &a; // dbg.value(i32* %a.addr, "local") local = &b; // dbg.value(i32* undef, "local") use(&a); // (note: %b.addr is optimized out) local = &a; // dbg.value(i32* %a.addr, "local") } ``` In this example, the alloca for "b" is erased, but we need to describe the value of "local" as <unavailable> before the call to "use". This prevents "local" from appearing to be equal to "&a" at the callsite. 
rdar://66592859 Differential Revision: https://reviews.llvm.org/D85555	2020-10-22 10:00:13 -07:00
Sanjay Patel	6bad3caeb0	[InstCombine] use unary shuffle creator to reduce code duplication; NFC	2020-09-21 15:34:24 -04:00
Sanjay Patel	cf75e83275	[InstCombine] replace zombie unreachable values with 'undef' before erasing The test (currently crashing) is reduced from the example provided in the post-commit discussion in D87149. Differential Revision: https://reviews.llvm.org/D87965	2020-09-20 12:25:08 -04:00
Nikita Popov	4e413e1621	[InstCombine] Temporarily do not drop volatile stores before unreachable See discussion in D87149. Dropping volatile stores here is legal per LLVM semantics, but causes issues for real code and may result in a change to LLVM volatile semantics. Temporarily treat volatile stores as "not guaranteed to transfer execution" in just this place, until this issue has been resolved.	2020-09-10 16:16:44 +02:00
Sanjay Patel	b22910daab	[InstCombine] erase instructions leading up to unreachable Normal dead code elimination ignores assume intrinsics, so we fail to delete assumes that are not meaningful (and potentially worse if they cause conflicts with other assumptions). The motivating example in https://llvm.org/PR47416 suggests that we might have problems upstream from here (difference between C and C++), but this should be a cheap way to make sure we remove more dead code. Differential Revision: https://reviews.llvm.org/D87149	2020-09-07 10:44:08 -04:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Roman Lebedev	49d223274f	[NFC][InstCombine] Add STATISTIC() for how many iterations we did As we've established, if it takes more than two iterations (one to perform folding and one to ensure that no folding opportunities remain) per function, then there are worklist management issues. So it may be interesting to keep track of it.	2020-08-29 15:10:13 +03:00
Roman Lebedev	3d76a133c7	Revert "[InstCombine] Lower infinite combine loop detection thresholds" And as being reported by Florian Hahn, there's a hit in MultiSource/Benchmarks/mafft from the test-suite on X86 with -O3 -flto, so reverting until addressed. This reverts commit `71e0b82c9f`.	2020-08-19 16:53:30 +03:00
Roman Lebedev	71e0b82c9f	[InstCombine] Lower infinite combine loop detection thresholds It's been a month since `2f3862eb9f`, and no new bug reports about the threshold were filed, so let's bump it again and wait again.	2020-08-19 14:37:57 +03:00
Roman Lebedev	ae7f08812e	[InstCombine] Aggregate reconstruction simplification (PR47060) This pattern happens in clang C++ exception lowering code, on unwind branch. We end up having a `landingpad` block after each `invoke`, where RAII cleanup is performed, and the elements of an aggregate `{i8, i32}` holding exception info are `extractvalue`'d, and we then branch to common block that takes extracted `i8` and `i32` elements (via `phi` nodes), form a new aggregate, and finally `resume`'s the exception. The problem is that, if the cleanup block is effectively empty, it shouldn't be there, there shouldn't be that `landingpad` and `resume`, said `invoke` should be a `call`. Indeed, we do that simplification in e.g. SimplifyCFG `SimplifyCFGOpt::simplifyResume()`. But the thing is, all this extra `extractvalue` + `phi` + `insertvalue` cruft, while it is pointless, does not look like "empty cleanup block". So the `SimplifyCFGOpt::simplifyResume()` fails, and the exception has a higher cost than it could have on unwind branch :S This doesn't happen that often, but it will basically happen once per C++ function with complex CFG that called more than one other function that isn't known to be `nounwind`. I think, this is a missing fold in InstCombine, so i've implemented it. I think, the algorithm/implementation is rather self-explanatory: 1. Find a chain of `insertvalue`'s that fully tell us the initializer of the aggregate. 2. For each element, try to find from which aggregate it was extracted. If it was extracted from the aggregate with identical type, from identical element index, great. 3. If all elements were found to have been extracted from the same aggregate, then we can just use said original source aggregate directly, instead of re-creating it. 4. If we fail to find said aggregate when looking only in the current block, we need to be PHI-aware - we might have different source aggregate when coming from each predecessor. 
I'm not sure if this already handles everything, and there are some FIXME's, i'll deal with all that later in followups. I'd be fine with going with post-commit review here code-wise, but just in case there are thoughts, i'm posting this. On RawSpeed, for example, this has the following effect: ``` \| statistic name \| baseline \| proposed \| Δ \| % \| abs(%) \| \|---------------------------------------------------\|---------:\|---------:\|------:\|--------:\|-------:\| \| instcombine.NumAggregateReconstructionsSimplified \| 0 \| 1253 \| 1253 \| 0.00% \| 0.00% \| \| simplifycfg.NumInvokes \| 948 \| 1355 \| 407 \| 42.93% \| 42.93% \| \| instcount.NumInsertValueInst \| 4382 \| 3210 \| -1172 \| -26.75% \| 26.75% \| \| simplifycfg.NumSinkCommonCode \| 574 \| 458 \| -116 \| -20.21% \| 20.21% \| \| simplifycfg.NumSinkCommonInstrs \| 1154 \| 921 \| -233 \| -20.19% \| 20.19% \| \| instcount.NumExtractValueInst \| 29017 \| 26397 \| -2620 \| -9.03% \| 9.03% \| \| instcombine.NumDeadInst \| 166618 \| 174705 \| 8087 \| 4.85% \| 4.85% \| \| instcount.NumPHIInst \| 51526 \| 50678 \| -848 \| -1.65% \| 1.65% \| \| instcount.NumLandingPadInst \| 20865 \| 20609 \| -256 \| -1.23% \| 1.23% \| \| instcount.NumInvokeInst \| 34023 \| 33675 \| -348 \| -1.02% \| 1.02% \| \| simplifycfg.NumSimpl \| 113634 \| 114708 \| 1074 \| 0.95% \| 0.95% \| \| instcombine.NumSunkInst \| 15030 \| 14930 \| -100 \| -0.67% \| 0.67% \| \| instcount.TotalBlocks \| 219544 \| 219024 \| -520 \| -0.24% \| 0.24% \| \| instcombine.NumCombined \| 644562 \| 645805 \| 1243 \| 0.19% \| 0.19% \| \| instcount.TotalInsts \| 2139506 \| 2135377 \| -4129 \| -0.19% \| 0.19% \| \| instcount.NumBrInst \| 156988 \| 156821 \| -167 \| -0.11% \| 0.11% \| \| instcount.NumCallInst \| 1206144 \| 1207076 \| 932 \| 0.08% \| 0.08% \| \| instcount.NumResumeInst \| 5193 \| 5190 \| -3 \| -0.06% \| 0.06% \| \| asm-printer.EmittedInsts \| 948580 \| 948299 \| -281 \| -0.03% \| 0.03% \| \| instcount.TotalFuncs \| 11509 \| 11507 \| -2 \| -0.02% 
\| 0.02% \| \| inline.NumDeleted \| 97595 \| 97597 \| 2 \| 0.00% \| 0.00% \| \| inline.NumInlined \| 210514 \| 210522 \| 8 \| 0.00% \| 0.00% \| ``` So we manage to increase the amount of `invoke` -> `call` conversions in SimplifyCFG by almost a half, and there is a very apparent decrease in instruction and basic block count. On vanilla llvm-test-suite: ``` \| statistic name \| baseline \| proposed \| Δ \| % \| abs(%) \| \|---------------------------------------------------\|---------:\|---------:\|------:\|--------:\|-------:\| \| instcombine.NumAggregateReconstructionsSimplified \| 0 \| 744 \| 744 \| 0.00% \| 0.00% \| \| instcount.NumInsertValueInst \| 2705 \| 2053 \| -652 \| -24.10% \| 24.10% \| \| simplifycfg.NumInvokes \| 1212 \| 1424 \| 212 \| 17.49% \| 17.49% \| \| instcount.NumExtractValueInst \| 21681 \| 20139 \| -1542 \| -7.11% \| 7.11% \| \| simplifycfg.NumSinkCommonInstrs \| 14575 \| 14361 \| -214 \| -1.47% \| 1.47% \| \| simplifycfg.NumSinkCommonCode \| 6815 \| 6743 \| -72 \| -1.06% \| 1.06% \| \| instcount.NumLandingPadInst \| 14851 \| 14712 \| -139 \| -0.94% \| 0.94% \| \| instcount.NumInvokeInst \| 27510 \| 27332 \| -178 \| -0.65% \| 0.65% \| \| instcombine.NumDeadInst \| 1438173 \| 1443371 \| 5198 \| 0.36% \| 0.36% \| \| instcount.NumResumeInst \| 2880 \| 2872 \| -8 \| -0.28% \| 0.28% \| \| instcombine.NumSunkInst \| 55187 \| 55076 \| -111 \| -0.20% \| 0.20% \| \| instcount.NumPHIInst \| 321366 \| 320916 \| -450 \| -0.14% \| 0.14% \| \| instcount.TotalBlocks \| 886816 \| 886493 \| -323 \| -0.04% \| 0.04% \| \| instcount.TotalInsts \| 7663845 \| 7661108 \| -2737 \| -0.04% \| 0.04% \| \| simplifycfg.NumSimpl \| 886791 \| 887171 \| 380 \| 0.04% \| 0.04% \| \| instcount.NumCallInst \| 553552 \| 553733 \| 181 \| 0.03% \| 0.03% \| \| instcombine.NumCombined \| 3200512 \| 3201202 \| 690 \| 0.02% \| 0.02% \| \| instcount.NumBrInst \| 741794 \| 741656 \| -138 \| -0.02% \| 0.02% \| \| simplifycfg.NumHoistCommonInstrs \| 14443 \| 14445 \| 2 \| 0.01% \| 0.01% 
\| \| asm-printer.EmittedInsts \| 7978085 \| 7977916 \| -169 \| 0.00% \| 0.00% \| \| inline.NumDeleted \| 73188 \| 73189 \| 1 \| 0.00% \| 0.00% \| \| inline.NumInlined \| 291959 \| 291968 \| 9 \| 0.00% \| 0.00% \| ``` Roughly similar effect, fewer instructions and blocks total. See also: rGe492f0e03b01a5e4ec4b6333abb02d303c3e479e. Compile-time wise, this appears to be roughly geomean-neutral: http://llvm-compile-time-tracker.com/compare.php?from=39617aaed95ac00957979bc1525598c1be80e85e&to=b59866cf30420da8f8e3ca239ed3bec577b23387&stat=instructions And this is a win size-wise in general: http://llvm-compile-time-tracker.com/compare.php?from=39617aaed95ac00957979bc1525598c1be80e85e&to=b59866cf30420da8f8e3ca239ed3bec577b23387&stat=size-text See https://bugs.llvm.org/show_bug.cgi?id=47060 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D85787	2020-08-16 23:27:56 +03:00
David Stenberg	e8ebebb0bd	[InstCombine] Fix incorrect Modified status When removing instructions from unreachable blocks, and only debug info intrinsics were removed, InstCombine could incorrectly return a false Modified status. This is fixed by making removeAllNonTerminatorAndEHPadInstructions() also return how many debug info intrinsics were removed, and taking that into account. This was caught using the check introduced by D80916. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D85839	2020-08-13 15:10:41 +02:00
Kazu Hirata	cfdc96714b	[Instcombine] Fix uses of undef (PR46940) Without this patch, we attempt to distribute And over Xor even in unsafe circumstances like so: undef & (true ^ true) ==> (undef & true) ^ (undef & true) and evaluate it to undef instead of false. Note that "true ^ true" may show up implicitly with one true being part of a PHI node. This patch fixes the problem by teaching SimplifyUsingDistributiveLaws to not use undef as part of simplifications. Reviewers: spatel, aqjune, nikic, lebedev.ri, fhahn, jdoerfert Differential Revision: https://reviews.llvm.org/D85687	2020-08-11 14:13:32 -07:00
Juneyoung Lee	c771087161	[InstCombine] Fold freeze(undef) into a proper constant This is a simple patch that folds freeze(undef) into a proper constant after inspecting its uses. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84948	2020-08-06 18:40:04 +09:00
Sanjay Patel	f75cf240d6	[InstCombine] avoid crashing on vector constant expression (PR46872)	2020-07-28 15:02:36 -04:00
Nathan James	d127112724	[llvm][NFC] Silence unused variable warning by using isa over dyn_cast	2020-07-27 13:37:21 +01:00
Juneyoung Lee	e1eacf27c6	[InstCombine] Fold freeze into phi if one operand is not undef This patch adds folding freeze into phi if it has only one operand to target. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84601	2020-07-27 17:07:27 +09:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows moving about 3000 lines out of InstCombine and into the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Roman Lebedev	2f3862eb9f	Reland "[InstCombine] Lower infinite combine loop detection thresholds" This reverts commit `4500db8c59`, which was reverted because lower thresholds exposed a new issue (PR46680). Now that it was resolved by `d12ec0f752`, we can reinstate lower limits and wait for a new bugreport before reverting this again...	2020-07-19 16:37:03 +03:00
Roman Lebedev	e2b75cafcb	[NFCI][InstCombine] Move store merging from `visitStoreInst()` into `visitUnconditionalBranchInst()` Summary: As @nikic is pointing out in https://bugs.llvm.org/show_bug.cgi?id=46680#c5, InstCombine should not have forward instruction scans, so let's move this transform into the proper place. This is pretty much NFCI. Reviewers: nikic, spatel Reviewed By: nikic Subscribers: hiraditya, llvm-commits, nikic Tags: #llvm Differential Revision: https://reviews.llvm.org/D83670	2020-07-14 10:41:51 +03:00
Vedant Kumar	3d52b1e81b	Revert "[InstCombine] Drop debug loc in TryToSinkInstruction (reland)" This reverts commit `9649c2095f`. See discussion on the llvm-commits thread: if it's OK to preserve the location when sinking a call, it's probably OK to always preserve the location.	2020-07-13 15:17:07 -07:00
Roman Lebedev	4500db8c59	Revert "Reland "[InstCombine] Lower infinite combine loop detection thresholds""" And there's a new hit: https://bugs.llvm.org/show_bug.cgi?id=46680 This reverts commit `7103c87596`.	2020-07-11 13:53:24 +03:00
Roman Lebedev	7103c87596	Reland "[InstCombine] Lower infinite combine loop detection thresholds"" This relands commit `cd7f8051ac` that was reverted since lower threshold have successfully found an issue. Now that the issue is fixed, let's wait until the next one is reported. This reverts commit `caa423eef0`.	2020-07-10 17:49:16 +03:00
Roman Lebedev	caa423eef0	Revert "[InstCombine] Lower infinite combine loop detection thresholds" And just after 3 days, we have a hit in `InstCombiner::mergeStoreIntoSuccessor()`: https://bugs.llvm.org/show_bug.cgi?id=46661 To be recommitted once that is addressed. This reverts commit `cd7f8051ac`.	2020-07-09 23:10:42 +03:00
Roman Lebedev	cd7f8051ac	[InstCombine] Lower infinite combine loop detection thresholds Summary: 1000 iterations is still kinda a lot. Would it make sense to iteratively lower it, until it becomes `2`, with some delay in between in order to let users actually potentially encounter it? Reviewers: spatel, nikic, kuhar Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83160	2020-07-06 13:19:31 +03:00
Roman Lebedev	c3b8bd1eea	[InstCombine] Always try to invert non-canonical predicate of an icmp Summary: The actual transform i was going after was: https://rise4fun.com/Alive/Tp9H ``` Name: zz Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0 %t0 = and i8 %x, C0 %r = icmp eq i8 %t0, C1 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 Name: zz Pre: isPowerOf2(C0) %t0 = and i8 %x, C0 %r = icmp ne i8 %t0, 0 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 ``` but as it can be seen from the current tests, we already canonicalize most of it, and we are only missing handling multi-use non-canonical icmp predicates. If we have both `!=0` and `==0`, even though we can CSE them, we end up being stuck with them. We should canonicalize to the `==0`. I believe this is one of the cleanup steps i'll need after `-scalarizer` if i end up proceeding with my WIP alloca promotion helper pass. Reviewers: spatel, jdoerfert, nikic Reviewed By: nikic Subscribers: zzheng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83139	2020-07-04 18:12:04 +03:00
Sanjay Patel	ef70cc9d1a	[InstCombine] improve debug value names; NFC The use of 'tmp' can trigger warnings from the update_test_checks.py script. That's evidence of a flaw in the script's logic, but we can always do better than naming variables 'tmp' in LLVM too. The phi test file should be updated with auto-generated regex CHECK lines, so it isn't affected by cosmetic diffs, but I don't have time to do that right now.	2020-07-04 11:06:30 -04:00
Hiroshi Yamauchi	6bd1db08e7	[InstCombine] Don't let an alignment assume prevent new/delete removals. Remove allocations with alignment assume. Differential Revision: https://reviews.llvm.org/D81854	2020-07-01 09:22:32 -07:00
Vedant Kumar	9649c2095f	[InstCombine] Drop debug loc in TryToSinkInstruction (reland) Summary: The advice in HowToUpdateDebugInfo.rst is to "... preserve the debug location of an instruction if the instruction either remains in its basic block, or if its basic block is folded into a predecessor that branches unconditionally". TryToSinkInstruction doesn't seem to satisfy the criteria as it's sinking an instruction to some successor block. Preserving the debug loc can make single-stepping appear to go backwards, or make a breakpoint hit on that location happen "too late" (since single-stepping from that breakpoint can cause the function to return unexpectedly). So, drop the debug location. This was reverted in `ee3620643d` because it removed source locations from inlinable calls, breaking a verifier rule. I've added an exception for calls because the alternative (setting a line 0 location) is not better. I tested the updated patch by completing a stage2 RelWithDebInfo build. Reviewers: aprantl, davide Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82487	2020-06-26 17:18:15 -07:00
Vedant Kumar	ee3620643d	Revert "[InstCombine] Drop debug loc in TryToSinkInstruction" This reverts commit `903cf140d0`. This might be causing verifier failures on the bots, such as: "inlinable function call in a function with debug info must have a !dbg location" -- http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/16976/steps/bootstrap%20clang/logs/stdio	2020-06-26 14:59:40 -07:00
Vedant Kumar	903cf140d0	[InstCombine] Drop debug loc in TryToSinkInstruction Summary: The advice in HowToUpdateDebugInfo.rst is to "... preserve the debug location of an instruction if the instruction either remains in its basic block, or if its basic block is folded into a predecessor that branches unconditionally". TryToSinkInstruction doesn't seem to satisfy the criteria as it's sinking an instruction to some successor block. Preserving the debug loc can make single-stepping appear to go backwards, or make a breakpoint hit on that location happen "too late" (since single-stepping from that breakpoint can cause the function to return unexpectedly). So, drop the debug location. Reviewers: aprantl, davide Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82487	2020-06-26 13:23:24 -07:00
Sanjay Patel	46a285ad9e	[IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC The predicate can always be used to distinguish between icmp and fcmp, so we don't need to keep repeating this check in the callers.	2020-06-18 15:47:06 -04:00
Max Kazantsev	60da4369a1	[NFC] Bail early simplifying unconditional branches	2020-06-15 13:59:53 +07:00
Sanjay Patel	aeb5044801	[InstCombine] allow undef elements when comparing vector constants for min/max bailout This is a hacky, but low-risk fix to avoid the infinite loop in PR46271: https://bugs.llvm.org/show_bug.cgi?id=46271 As discussed there, the problem is that FoldOpIntoSelect() can get into a conflict with a transform that wants to pull a 'not' op through min/max via SimplifyDemandedVectorElts(). We need to relax our matching of min/max to include undefined elements in vector constants to avoid that. Alternatively, we could improve or cripple the demanded elements analysis, but that could create even more problems. The likely better, safer alternative will be to create min/max intrinsics, so we can remove all of the hacks related to min/max matching in instcombine. Differential Revision: https://reviews.llvm.org/D81698	2020-06-14 09:02:47 -04:00
Chris Jackson	4707bc2177	[DebugInfo] Refactor SalvageDebugInfo and SalvageDebugInfoForDbgValues - Simplify the salvaging interface and the algorithm in InstCombine Reviewers: vsk, aprantl, Orlando, jmorse, TWeaver Reviewed by: Orlando Differential Revision: https://reviews.llvm.org/D79863	2020-06-11 11:13:46 +01:00
Craig Topper	94b1404587	[InstCombine] Remove some repeated calls to getOperand. NFCI We had already loaded operands 1 and 2 of the select as TV and FV using the more readable getTrueValue/getFalseValue.	2020-06-10 16:54:50 -07:00
Chris Jackson	c6c65164af	[DebugInfo] Reduce SalvageDebugInfo() functions - Now all SalvageDebugInfo() calls will mark undef if the salvage attempt fails. Reviewed by: vsk, Orlando Differential Revision: https://reviews.llvm.org/D78369	2020-06-08 19:28:18 +01:00
Richard Smith	f39e12a06b	PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz. Summary: This transformation is correct for a builtin call to 'free(p)', but not for 'operator delete(p)'. There is no guarantee that a user replacement 'operator delete' has no effect when called on a null pointer. However, the principle behind the transformation is correct, and can be applied more broadly: a 'delete p' expression is permitted to unconditionally call 'operator delete(p)'. So do that in Clang under -Oz where possible. We do this whether or not 'p' has trivial destruction, since the destruction might turn out to be trivial after inlining, and even for a class-specific (but non-virtual, non-destroying, non-array) 'operator delete'. Reviewers: davide, dnsampaio, rjmccall Reviewed By: dnsampaio Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79378	2020-06-05 17:13:43 -07:00
Sanjay Patel	7eed772a27	[PatternMatch] abbreviate vector inst matchers; NFC Readability is not reduced with these opcodes/match lines, so reduce odds of awkward wrapping from 80-col limit.	2020-05-24 09:19:47 -04:00
Max Kazantsev	403810557b	[InstCombine] Sink pure instructions down to return and unreachable blocks If the only user of `Instr` is in a return or unreachable block, we can sink `Instr` to the`User` safely (unless it reads/writes memory). Return or unreachable blocks are guaranteed to execute zero or one time, and `Instr` always dominates `User`, so they either will be executed together (execution of `User` always implies execution of `Instr`) or not executed at all. Differential Revision: https://reviews.llvm.org/D80120 Reviewed By: asbirlea, jdoerfert	2020-05-22 14:33:42 +07:00
Max Kazantsev	e47c101e35	[InstCombine][NFC] Simplify check in sinking We just need to check that the only predecessor of user parent is BB, we don't need to iterate through BB's successors for it.	2020-05-18 18:10:40 +07:00
Alina Sbirlea	bd541b217f	[NewPassManager] Add assertions when getting statefull cached analysis. Summary: Analyses that are statefull should not be retrieved through a proxy from an outer IR unit, as these analyses are only invalidated at the end of the inner IR unit manager. This patch disallows getting the outer manager and provides an API to get a cached analysis through the proxy. If the analysis is not stateless, the call to getCachedResult will assert. Reviewers: chandlerc Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72893	2020-05-13 12:38:38 -07:00
Christopher Tetreault	beeabe382d	[SVE] Fix invalid usage of VectorType::getNumElements() in InstCombine Summary: Make foldVectorBinop return null if the instruction type is a scalable vector. It is unclear what, if any, of this function works with scalable vectors. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, david-arm, fpetrogalli, spatel Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79196	2020-05-01 10:56:29 -07:00
Christopher Tetreault	7ca56c90bd	[SVE] Remove calls to isScalable from Transforms Reviewers: efriedma, chandlerc, reames, aprantl, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77756	2020-04-23 13:50:07 -07:00
Roman Lebedev	352fef3f11	[InstCombine] Negator - sink sinkable negations Summary: As we have discussed previously (e.g. in D63992 / D64090 / [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]]), `sub` instruction can almost be considered non-canonical. While we do convert `sub %x, C` -> `add %x, -C`, we sparsely do that for non-constants. But we should. Here, i propose to interpret `sub %x, %y` as `add (sub 0, %y), %x` IFF the negation can be sunk into the `%y`. This has some potential to cause endless combine loops (either around PHI's, or if there are some opposite transforms). For the former, there's the `-instcombine-negator-max-depth` option to mitigate it, should this expose any such issues. For the latter, if there are still any such opposing folds, we'd need to remove the colliding fold. In any case, reproducers welcomed! Reviewers: spatel, nikic, efriedma, xbolva00 Reviewed By: spatel Subscribers: xbolva00, mgorny, hiraditya, reames, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68408	2020-04-21 22:00:23 +03:00
Huihui Zhang	5c1d1a62e3	[InstCombine][SVE] Fix visitGetElementPtrInst for scalable type. Summary: This patch fix the following issues in InstCombiner::visitGetElementPtrInst 1. Skip for scalable type if transformation requires fixed size number of vector element. 2. Skip for scalable type if transformation relies on compile-time known type alloc size. 3. Use VectorType::getElementCount when scalable property is used to construct new VectorType. 4. Use TypeSize::getKnownMinSize when minimal size of a scalable type is valid to determine GEP 'inbounds'. 5. Explicitly call TypeSize::getFixedSize to avoid implicit type conversion to uint64_t. Reviewers: sdesmalen, efriedma, spatel, ctetreau Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78081	2020-04-14 12:38:32 -07:00
Vedant Kumar	4831f4b7bd	[InstCombine] Fix debug variance issue in tryToMoveFreeBeforeNullTest Fix an issue where the presence of debug info could disable an optimization in tryToMoveFreeBeforeNullTest.	2020-04-13 10:55:17 -07:00
Huihui Zhang	6e7eeb44b3	[GVN] Fix VNCoercion for Scalable Vector. Summary: For VNCoercion, skip scalable vectors when the analysis relies on fixed size, otherwise call TypeSize::getFixedSize() explicitly. Add unit tests to check functionality of GVN load elimination for scalable type. Reviewers: sdesmalen, efriedma, spatel, fhahn, reames, apazos, ctetreau Reviewed By: efriedma Subscribers: bjope, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76944	2020-04-10 17:49:07 -07:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Sanjay Patel	4036a0af24	[InstCombine] enhance freelyNegateValue() by handling 'not' This patch extends D77230. If we have a 'not' instruction inside a negated expression, we can ignore extra uses of that op because the negation has a one-to-one replacement: negate becomes increment. Alive2 examples of the test cases: http://volta.cs.utah.edu:8080/z/T5-u9P http://volta.cs.utah.edu:8080/z/eT89L6 Differential Revision: https://reviews.llvm.org/D77459	2020-04-05 09:16:19 -04:00
Sanjay Patel	3d90048791	[InstCombine] enhance freelyNegateValue() by handling xor Negation is equivalent to bitwise-not + 1, so try to convert more subtracts into adds using this relationship: 0 - (A ^ C) => ((A ^ C) ^ -1) + 1 => A ^ ~C + 1 I doubt this will recover the regression noted in rGf2fbdf76d8d0, but seems like we're going to need to improve here and/or revive D68408? Alive2 proofs: http://volta.cs.utah.edu:8080/z/Re5tMU http://volta.cs.utah.edu:8080/z/An-uns Differential Revision: https://reviews.llvm.org/D77230	2020-04-01 15:05:13 -04:00
Eli Friedman	1ee6ec2bf3	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Nikita Popov	c538c57d6d	[InstCombine] Use replaceOperand() in descaling To make sure the old operand gets DCEd. NFC apart from worklist order.	2020-03-31 22:05:53 +02:00
Nikita Popov	0c87140065	[InstCombine] Use replaceOperand() in assoc cast simplification To make sure the old operands are DCEd. NFC apart from worklist order.	2020-03-29 20:28:37 +02:00
Nikita Popov	2215dcf1d7	[InstCombine] Remove unreachable blocks before DCE Dropping unreachable code may reduce use counts on other instructions, so it's better to do this earlier rather than later. NFC-ish, may only impact worklist order.	2020-03-28 21:19:16 +01:00
Nikita Popov	97cc1275c7	[InstCombine] Merge two functions; NFC Merge AddReachableCodeToWorklist() into prepareICWorklistFromFunction(). It's one logical step, and this makes it easier to move code.	2020-03-28 21:19:16 +01:00
Nikita Popov	30d712103f	[InstCombine] Use replaceOperand() API in GEP transforms To make sure that replaced operands get DCEd. This drops one iteration from gepphigep.ll, which is still not optimal. This was the last test case performing more than 3 iterations. NFC-ish, only worklist order should change.	2020-03-28 19:07:25 +01:00
Nikita Popov	b1f78baeaa	[InstCombine] Reduce code duplication in GEP of PHI transform; NFC The `NewGEP->setOperand(DI, NewPN)` call was duplicated, and the insertion of NewGEP is the same in both if/else, so we can extract it.	2020-03-28 19:07:25 +01:00
Fangrui Song	4c52d51e78	[InstCombine] Fix a code-sinking bug after D73832/f1a9efabcb9b - UserParent = PN->getIncomingBlock(I->use_begin()); + UserParent = PN->getIncomingBlock(SingleUse); The first use of I may be droppable (llvm.assume). When compiling llvm/lib/IR/AutoUpgrade.cpp with a bootstrapped clang with ThinLTO with minimized bitcode files, I see such a case in the function _ZN4llvm20UpgradeIntrinsicCallEPNS_8CallInstEPNS_8FunctionE clang -c -fthinlto-index=AutoUpgrade.o.thinlto.bc AutoUpgrade.bc -O3 Unfortunately it is really difficult to get a minimized reproduce.	2020-03-25 22:50:53 -07:00
Tyker	f1a9efabcb	Ignore/Drop droppable uses for code-sinking in InstCombine Summary: This patch allows code-sinking in InstCombine to be performed when instruction have uses in llvm.assume. Use are considered droppable when it is preferable to modify the User such that the use disappears rather than to prevent a transformation because of the use. for now uses are considered droppable if they are in an llvm.assume. Reviewers: jdoerfert, nikic, spatel, lebedev.ri, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73832	2020-03-25 20:42:52 +01:00
Nikita Popov	dc81923659	[InstCombine] Remove ExpensiveCombines option D75801 removed the last and only user of this option, so we can drop it now. The original idea behind this was to only run expensive transforms under -O3, but apart from the one known bits transform, this has never really taken off. I believe nowadays the recommendation is to put expensive transforms in AggressiveInstCombine instead, though that isn't terribly popular either :) Differential Revision: https://reviews.llvm.org/D76540	2020-03-22 16:56:28 +01:00
Nikita Popov	2b52e4e629	[InstCombine] Remove known bits constant folding If ExpensiveCombines is enabled (which is the case with -O3 on the legacy PM and always on the new PM), InstCombine tries to compute the known bits of all instructions in the hope that all bits end up being known, which is fairly expensive. How effective is it? If we add some statistics on how often the constant folding succeeds and how many KnownBits calculations are performed and run test-suite we get: "instcombine.NumConstPropKnownBits": 642, "instcombine.NumConstPropKnownBitsComputed": 18744965, In other words, we get one fold for every 30000 KnownBits calculations. However, the truth is actually much worse: Currently, known bits are computed before performing other folds, so there is a high chance that cases that get folded by known bits would also have been handled by other folds. What happens if we compute known bits after all other folds (hacky implementation: https://gist.github.com/nikic/751f25b3b9d9e0860db5dde934f70f46)? "instcombine.NumConstPropKnownBits": 0, "instcombine.NumConstPropKnownBitsComputed": 18105547, So it turns out despite doing 18 million known bits calculations, the known bits fold does not do anything useful on test-suite. I was originally planning to move this into AggressiveInstCombine so it only runs once in the pipeline, but seeing this, I think we're better off removing it entirely. As this is the only use of the "expensive combines" mechanism, it may be removed afterwards, but I'll leave that to a separate patch. Differential Revision: https://reviews.llvm.org/D75801	2020-03-20 20:54:06 +01:00
Nikita Popov	5c10967157	[InstCombine] Don't replace musttail result based on known bits This is the same change as D75824, but for two cases where InstCombine performs the same optimization: Replacing an instruction whose bits are fully known with a constant. This is not (generally) legal for musttail calls. Differential Revision: https://reviews.llvm.org/D76457	2020-03-20 10:17:09 +01:00
Eli Friedman	e24e95fe90	Remove CompositeType class. The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660	2020-03-18 13:53:17 -07:00
Sanjay Patel	467eec0910	[InstCombine] fold gep-of-select-of-constants (PR45084) As shown in: https://bugs.llvm.org/show_bug.cgi?id=45084 ...we failed to combine a gep with constant indexes with a pointer operand that is a select of constants. Differential Revision: https://reviews.llvm.org/D75807	2020-03-10 09:25:13 -04:00
Nikita Popov	0e890cd4d4	[ConstantFolding] Always return something from ConstantFoldConstant Spin-off from D75407. As described there, ConstantFoldConstant() currently returns null for non-ConstantExpr/ConstantVector inputs, but otherwise always returns non-null, independently of whether any folding has happened or not. This is confusing and makes consumer code more complicated. I would expect either that ConstantFoldConstant() returns only if it actually folded something, or that it always returns non-null. I'm going to the latter possibility here, which appears to be more useful considering existing usage. Differential Revision: https://reviews.llvm.org/D75543	2020-03-04 18:24:47 +01:00

811 Commits