llvm-project

Commit Graph

Author	SHA1	Message	Date
Andy Kaylor	b4d945bacd	Fixing an infinite loop problem in InstCombine Patch by Mohammad Fawaz This issues started happening after `b373b5990d` Basically, if the memcpy is volatile, the collectUsers() function should return false, just like we do for volatile loads. Differential Revision: https://reviews.llvm.org/D106950	2021-07-29 12:57:17 -07:00
Andy Kaylor	b373b5990d	Enabling the copy-constant-to-alloca optimization in more instances Patch by Mohammad Fawaz This patch allows lifetime calls to be ignored (and later erased) if we know that the copy-constant-to-alloca optimization is going to happen. The case that is missed is when the global variable is in a different address space than the alloca (as shown in the example added to the lit test.) This used to work before `6da31fa4a6` Differential Revision: https://reviews.llvm.org/D106573	2021-07-27 10:11:43 -07:00
Johannes Doerfert	a33e128012	[InstCombine] Gracefully handle an alloca outside the alloca-AS While we might eventually want to disallow allocas that do not have the alloca-AS set, it seems undesirable to crash on them. Add a cast when required so that we can support such allocas (at least here). Differential Revision: https://reviews.llvm.org/D104866	2021-06-29 09:38:13 -05:00
Nikita Popov	7bb7fa12e7	[OpaquePtr] Support changing load type in InstCombine When the load type is changed to ptr, we need the load pointer type to also be ptr, because it's not allowed to create a pointer to an opaque pointer. This is achieved by adjusting the getPointerTo() API to return an opaque pointer for an opaque pointer base type. Differential Revision: https://reviews.llvm.org/D104718	2021-06-22 21:16:15 +02:00
Juneyoung Lee	ce192ced2b	[InstCombine] Use poison constant to represent the result of unreachable instrs This patch updates InstCombine to use poison constant to represent the resulting value of (either semantically or syntactically) unreachable instrs, or a don't-care value of an unreachable store instruction. This allows more aggressive folding of unused results, as shown in llvm/test/Transforms/InstCombine/getelementptr.ll . Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104602	2021-06-21 09:58:44 +09:00
Hongtao Yu	30bb5be389	[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2. As a follow-up to D95982, this patch continues unblocking optimizations that are blocked by pseudu probe instrumention. The optimizations unblocked are: - In-block load propagation. - In-block dead store elimination - Memory copy optimization that turns stores to consecutive memories into a memset. These optimizations are local to a block, so they shouldn't affect the profile quality. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D100075	2021-04-26 16:52:33 -07:00
Juneyoung Lee	1c10201d96	Update InstCombine to use undef matcher instead This is a patch to use m_Undef() matcher instead of isa<UndefValue>(). As suggested in D100122, this update is separately committed.	2021-04-18 11:05:36 +09:00
Benjamin Kramer	cf4161673c	[Instcombine] Disable memcpy of alloca bypass for instruction sources This transformation is fundamentally broken when it comes to dominance, it just happened to work when the source of the memcpy can be moved into the place of the alloca. The bug shows up a lot more often since `077bff39d4` allows the source to be a switch. It would be possible to check dominance of the source and all its operands, but that seems very heavy for instcombine.	2021-04-14 16:52:09 +02:00
Roman Lebedev	2fea5d5d4a	[InstCombine] tmp alloca bypass: ensure that the replacement dominates all alloca uses After `077bff39d4`, isDereferenceableForAllocaSize() can recurse into selects, which is causing a problem for the new test case, reduced from https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20210412/904154.html because the replacement (the select) is defined after the first use of an alloca, so we'd end up with a verifier error. Now, this new check is too restrictive. We likely can handle some cases, by trying to sink all uses of an alloca to after the the def.	2021-04-14 13:04:12 +03:00
Nikita Popov	e0615bcd39	[Loads] Add optimized FindAvailableLoadedValue() overload (NFCI) FindAvailableLoadedValue() accepts an iterator by reference. If no available value is found, then the iterator will either be left at a clobbering instruction or the beginning of the basic block. This allows using FindAvailableLoadedValue() across multiple blocks. If this functionality is not needed, as is the case in InstCombine, then we can use a much more efficient implementation: First try to find an available value, and only perform clobber checks if we actually found one. As this function only looks at a very small number of instructions (6 by default) and usually doesn't find an available value, this saves many expensive alias analysis queries.	2021-02-21 18:42:56 +01:00
Kazu Hirata	e53472de68	[Transforms] Use llvm::append_range (NFC)	2021-01-20 21:35:54 -08:00
Luo, Yuanke	055644cc45	[X86][AMX] Prohibit pointer cast on load. The load/store instruction will be transformed to amx intrinsics in the pass of AMX type lowering. Prohibiting the pointer cast make that pass happy. Differential Revision: https://reviews.llvm.org/D94372	2021-01-13 09:39:19 +08:00
Luo, Yuanke	981a0bd858	[X86] Add x86_amx type for intel AMX. The x86_amx is used for AMX intrisics. <256 x i32> is bitcast to x86_amx when it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it is used by load/store instruction. So amx intrinsics only operate on type x86_amx. It can help to separate amx intrinsics from llvm IR instructions (+-*/). Thank Craig for the idea. This patch depend on https://reviews.llvm.org/D87981. Differential Revision: https://reviews.llvm.org/D91927	2020-12-30 13:52:13 +08:00
Kazu Hirata	789d250613	[CodeGen, Transforms] Use *Map::lookup (NFC)	2020-12-27 09:57:27 -08:00
Nikita Popov	ef2f843347	Revert "[InstCombine] Check inbounds in load/store of gep null transform (PR48577)" This reverts commit `899faa50f2`. Upon further consideration, this does not fix the right issue. Doing this fold for non-inbounds GEPs is legal, because the resulting pointer is still based-on null, which has no associated address range, and as such and access to it is UB. https://bugs.llvm.org/show_bug.cgi?id=48577#c3	2020-12-24 12:36:56 +01:00
Nikita Popov	90177912a4	Revert "[InstCombine] Fold gep inbounds of null to null" This reverts commit `eb79fd3c92`. This causes stage2 crashes, possibly due to StringMap being miscompiled. Reverting for now.	2020-12-24 10:20:31 +01:00
Nikita Popov	eb79fd3c92	[InstCombine] Fold gep inbounds of null to null Effectively, this is what we were previously already doing when the GEP was used in conjunction with a load or store, but this fold can also be applied more generally: > The only in bounds address for a null pointer in the default > address-space is the null pointer itself.	2020-12-23 21:41:53 +01:00
Nikita Popov	899faa50f2	[InstCombine] Check inbounds in load/store of gep null transform (PR48577) If the GEP isn't inbounds, then accessing a GEP of null location is generally not UB. While this is a minimal fix, the GEP of null handling should probably be its own fold.	2020-12-23 21:03:22 +01:00
Joe Ellis	0f83505593	[SVE][InstCombine] Fix TypeSize warning in canReplaceGEPIdxWithZero The warning would fire when calling canReplaceGEPIdxWithZero on a GEP whose source element type is a scalable vector. The size of scalable vector types is not known, so this optimization cannot be performed. This patch fixes the issue by: - bailing out early in this routine if the GEP instruction's source element type is a scalable vector. - making use of getFixedSize -- this removes the dependency on the deprecated interface. Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D89968	2020-10-26 17:40:26 +00:00
Juneyoung Lee	62a0ec1612	Add support for !noundef metatdata on loads This patch adds metadata !noundef and makes load instructions can optionally have it. A load with !noundef always return a well-defined value (has no undef bit or isn't poison). If the loaded value isn't well defined, the behavior is undefined. This metadata can be used to encode the assumption from C/C++ that certain reads of variables should have well-defined values. It is helpful for optimizing freeze instructions away, because freeze can be removed when its operand has well-defined value, and showing that a load from arbitrary location is well-defined is usually hard otherwise. The same information can be encoded with llvm.assume with operand bundle; using metadata is chosen because I wasn't sure whether code motion can be freely done when llvm.assume is inserted from clang instead. The existing codebase already is stripping unknown metadata when doing code motion, so using metadata is UB-safe as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89050	2020-10-17 13:50:10 +09:00
Matt Arsenault	6a9484f4bf	InstCombine: Fix losing load properties in copy-constant-to-alloca Preserve the alignment and metadata. Atomic loads are skipped for this, but pass along the properties for consistency.	2020-10-14 12:55:25 -04:00
Matt Arsenault	6da31fa4a6	InstCombine: Fix infinite loop in copy-constant-to-alloca transform This was broken by `16295d521e`, when instructions started being handled and not just constant expressions. This was re-inserting an equivalent bitcast to the original memcpy operand, which made a non-functional IR change on every iteration. This also fixes a secondary problem where it was inserting addrspacecasts which may not have been legal (i.e. it changed the source address space). Start visiting all pointer users and fail out if we can't process them. Also start handling the relevant memory intrinsic users. These cases can be dealt with by running InferAddressSpaces separately.	2020-10-14 12:55:25 -04:00
Roman Lebedev	544a6aa267	[InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load And another step towards transforms not introducing inttoptr and/or ptrtoint casts that weren't there already. As we've been establishing (see D88788/D88789), if there is a int<->ptr cast, it basically must stay as-is, we can't do much with it. I've looked, and the most source of new such casts being introduces, as far as i can tell, is this transform, which, ironically, tries to reduce count of casts.. On vanilla llvm test-suite + RawSpeed, @ `-O3`, this results in -33.58% less `IntToPtr`s (19014 -> 12629) and +76.20% more `PtrToInt`s (18589 -> 32753), which is an increase of +20.69% in total. However just on RawSpeed, where i know there are basically none `IntToPtr` in the original source code, this results in -99.27% less `IntToPtr`s (2724 -> 20) and +82.92% more `PtrToInt`s (4513 -> 8255). which is again an increase of 14.34% in total. To me this does seem like the step in the right direction, we end up with strictly less `IntToPtr`, but strictly more `PtrToInt`, which seems like a reasonable trade-off. See https://reviews.llvm.org/D88860 / https://reviews.llvm.org/D88995 for some more discussion on the subject. (Eventually, `CastInst::isNoopCast()`/`CastInst::isEliminableCastPair` should be taught about this, yes) Reviewed By: nlopes, nikic Differential Revision: https://reviews.llvm.org/D88979	2020-10-11 20:24:28 +03:00
Roman Lebedev	e00f189d39	[InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592) (it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html) This canonicalization seems dubious. Most importantly, while it does not create `inttoptr` casts by itself, it may cause them to appear later, see e.g. D88788. I think it's pretty obvious that it is an undesirable outcome, by now we've established that seemingly no-op `inttoptr`/`ptrtoint` casts are not no-op, and are no longer eager to look past them. Which e.g. means that given ``` %a = load i32 %b = inttoptr %a %c = inttoptr %a ``` we likely won't be able to tell that `%b` and `%c` is the same thing. As we can see in D88789 / D88788 / D88806 / D75505, we can't really teach SCEV about this (not without the https://bugs.llvm.org/show_bug.cgi?id=47592 at least) And we can't recover the situation post-inlining in instcombine. So it really does look like this fold is actively breaking otherwise-good IR, in a way that is not recoverable. And that means, this fold isn't helpful in exposing the passes that are otherwise unaware of these patterns it produces. Thusly, i propose to simply not perform such a canonicalization. The original motivational RFC does not state what larger problem that canonicalization was trying to solve, so i'm not sure how this plays out in the larger picture. On vanilla llvm test-suite + RawSpeed, this results in increase of asm instructions and final object size by ~+0.05% decreases final count of bitcasts by -4.79% (-28990), ptrtoint casts by -15.41% (-3423), and of inttoptr casts by -25.59% (-6919, sic). Overall, there's -0.04% less IR blocks, -0.39% instructions. See https://bugs.llvm.org/show_bug.cgi?id=47592 Differential Revision: https://reviews.llvm.org/D88789	2020-10-06 00:00:30 +03:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows to move about 3000 lines out from InstCombine to the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Roman Lebedev	e2b75cafcb	[NFCI][InstCombine] Move store merging from `visitStoreInst()` into `visitUnconditionalBranchInst()` Summary: As @nikic is pointing out in https://bugs.llvm.org/show_bug.cgi?id=46680#c5, InstCombine should not have forward instruction scans, so let's move this transform into the proper place. This is pretty much NFCI. Reviewers: nikic, spatel Reviewed By: nikic Subscribers: hiraditya, llvm-commits, nikic Tags: #llvm Differential Revision: https://reviews.llvm.org/D83670	2020-07-14 10:41:51 +03:00
Roman Lebedev	2655a70a04	[InstCombine] After merging store into successor, queue prev. store to be visited (PR46661) We can happen to have a situation with many stores eligible for transform, but due to our visitation order (top to bottom), when we have processed the first eligible instruction, we would not try to reprocess the previous instructions that are now also eligible. So after we've successfully merged a store that was second-to-last instruction into successor, if the now-second-to-last instruction is also a such store that is eligible, add it to worklist to be revisited. Fixes https://bugs.llvm.org/show_bug.cgi?id=46661	2020-07-10 17:49:16 +03:00
Simon Pilgrim	6c6adde84f	InstCombineInternal.h - reduce AliasAnalysis.h include to forward declaration. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-24 19:27:38 +01:00
Nikita Popov	a4953db530	[InstCombine] Remove unnecessary MaybeAlign use (NFC) Alloca align is required now.	2020-06-06 11:44:01 +02:00
Eli Friedman	4f04db4b54	AllocaInst should store Align instead of MaybeAlign. Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine. Differential Revision: https://reviews.llvm.org/D80044	2020-05-16 14:53:16 -07:00
Nikita Popov	604f44977b	[InstCombine] Clean up alignment handling (NFC) Now that load/store alignment is required, we can simplify code in some places.	2020-05-16 18:47:29 +02:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Eli Friedman	accc6b5545	LoadInst should store Align, not MaybeAlign. The fact that loads and stores can have the alignment missing is a constant source of confusion: code that usually works can break down in rare cases. So fix the LoadInst API so the alignment is never missing. To reduce the number of changes required to make this work, IRBuilder and certain LoadInst constructors will grab the module's datalayout and compute the alignment automatically. This is the same alignment instcombine would eventually apply anyway; we're just doing it earlier. There's a minor risk that the way we're retrieving the datalayout could break out-of-tree code, but I don't think that's likely. This is the last in a series of patches, so most of the necessary changes have already been merged. Differential Revision: https://reviews.llvm.org/D77454	2020-05-14 13:19:21 -07:00
Matt Arsenault	16295d521e	InstCombine: Broaden copy-constant-to-alloca optimization Consider any constant memory type, not just global constants. AMDGPU kernel parameters are effectively global constants, but appear as either reads from an intrinsic derived pointer or function argument.	2020-05-09 16:00:27 -04:00
Matt Arsenault	59bc99a08a	InstCombine: Fix return after else	2020-05-06 11:53:26 -04:00
Christopher Tetreault	7ca56c90bd	[SVE] Remove calls to isScalable from Transforms Reviewers: efriedma, chandlerc, reames, aprantl, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77756	2020-04-23 13:50:07 -07:00
Craig Topper	68b2e507e4	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 21:31:44 -07:00
Craig Topper	fcc9d70260	Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign." This is breaking the clang build. This reverts commit `897409fb56`.	2020-04-20 13:25:06 -07:00
Craig Topper	897409fb56	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 13:08:05 -07:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Huihui Zhang	2ea5495759	[InstCombine][SVE] Fix InstCombiner::visitAllocaInst for scalable vector. Summary: DataLayout::getTypeAllocSize() return TypeSize. For cases where scalable property doesn't matter (check for zero-sized alloca), we should explicitly call getKnownMinSize() to avoid implicit type conversion to uint64_t, which is invalid for scalable vector type. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76386	2020-03-18 20:57:14 -07:00
Nikita Popov	23db9724d0	[InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835) Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform if it wouldn't actually do anything (apart from removing and reinserting the same instructions). Note that the test case doesn't loop on current master anymore, only on the LLVM 10 release branch. The issue is already mitigated on master due to worklist order fixes, but we should fix the root cause there as well. As a side note, we should probably assert in combineLoadToNewType() that it does not combine to the same type. Not doing this here, because this assertion would also be triggered in another place right now. Differential Revision: https://reviews.llvm.org/D74278	2020-02-08 16:55:22 +01:00
Nikita Popov	878cb38a5c	[InstCombine] Add replaceOperand() helper Adds a replaceOperand() helper, which is like Instruction.setOperand() but adds the old operand to the worklist. This reduces the amount of missing or incorrect worklist management. This only applies the helper to a relatively small subset of setOperand() calls in InstCombine, namely those of the pattern `I.setOperand(); return &I;`, where it is most obviously applicable. Differential Revision: https://reviews.llvm.org/D73803	2020-02-03 19:00:17 +01:00
Nikita Popov	e6c9ab4fb7	[InstCombine] Rename worklist methods; NFC This renames Worklist.AddDeferred() to Worklist.add() and Worklist.Add() to Worklist.push(). The intention here is that Worklist.add() should be the go-to method for explicit worklist management, while the raw Worklist.push() is mostly for InstCombine internals. I will then migrate uses of Worklist.push() to Worklist.add() in followup changes. As suggested by spatel on D73411 I'm also changing the remaining method names to lowercase first character, in line with current coding standards. Differential Revision: https://reviews.llvm.org/D73745	2020-02-03 18:56:51 +01:00
Guillaume Chatelet	59f95222d4	[Alignment][NFC] Use Align with CreateAlignedStore Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73274	2020-01-23 17:34:32 +01:00
Guillaume Chatelet	279fa8e006	[Alignement][NFC] Deprecate untyped CreateAlignedLoad Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260	2020-01-23 13:34:32 +01:00
Nikita Popov	522c030aa9	[InstCombine] Fix worklist management in DSE (PR44552) Fixes https://bugs.llvm.org/show_bug.cgi?id=44552. We need to make sure that the store is reprocessed, because performing DSE may expose more DSE opportunities. There is a slight caveat here though: We need to make sure that we add back the store the worklist first, because that means it will be processed after the operands of the removed store have been processed. This is a general bug in InstCombine worklist management that I hope to address at some point, but for now it means we need to do this manually rather than just returning the instruction as changed. Differential Revision: https://reviews.llvm.org/D72807	2020-01-17 18:10:56 +01:00
Nikita Popov	b4dd928ffb	[InstCombine] Make combineLoadToNewType a method; NFC So it can be reused as part of other combines. In particular for D71164.	2020-01-14 20:40:03 +01:00
Juneyoung Lee	3e32b7e127	[InstCombine] Let combineLoadToNewType preserve ABI alignment of the load (PR44543) Summary: If aligment on `LoadInst` isn't specified, load is assumed to be ABI-aligned. And said aligment may be different for different types. So if we change load type, but don't pay extra attention to the aligment (i.e. keep it unspecified), we may either overpromise (if the default aligment of the new type is higher), or underpromise (if the default aligment of the new type is smaller). Thus, if no alignment is specified, we need to manually preserve the implied ABI alignment. This addresses https://bugs.llvm.org/show_bug.cgi?id=44543 by making combineLoadToNewType preserve ABI alignment of the load. Reviewers: spatel, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72710	2020-01-15 03:20:53 +09:00

1 2 3 4 5 ...

287 Commits