llvm-project

Commit Graph

Author	SHA1	Message	Date
Benjamin Kramer	856f7937c7	Compress a few pairs using PointerIntPairs Use the uniform structured bindings interface where possible. NFCI.	2022-12-04 16:55:16 +01:00
Sanjay Patel	a00936484b	[InstCombine] improve readability of combineLoadToOperationType(); NFC	2022-11-28 16:00:06 -05:00
Matthias Gehre	5a1d92fa3e	[InstCombine] Update debug intrinsics when rewriting allocas	2022-11-25 08:20:54 +01:00
OCHyams	fcd5098a03	[Assignment Tracking][14/*] Account for assignment tracking in instcombine The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/ rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Most of the updates here are just to ensure DIAssignID attachments are maintained and propagated correctly. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D133307	2022-11-18 09:25:33 +00:00
Patrick Walton	01859da84b	[AliasAnalysis] Introduce getModRefInfoMask() as a generalization of pointsToConstantMemory(). The pointsToConstantMemory() method returns true only if the memory pointed to by the memory location is globally invariant. However, the LLVM memory model also has the semantic notion of locally-invariant: memory that is known to be invariant for the life of the SSA value representing that pointer. The most common example of this is a pointer argument that is marked readonly noalias, which the Rust compiler frequently emits. It'd be desirable for LLVM to treat locally-invariant memory the same way as globally-invariant memory when it's safe to do so. This patch implements that, by introducing the concept of a ModRefInfo mask. A ModRefInfo mask is a bound on the Mod/Ref behavior of an instruction that writes to a memory location, based on the knowledge that the memory is globally-constant memory (in which case the mask is NoModRef) or locally-constant memory (in which case the mask is Ref). ModRefInfo values for an instruction can be combined with the ModRefInfo mask by simply using the & operator. Where appropriate, this patch has modified uses of pointsToConstantMemory() to instead examine the mask. The most notable optimization change I noticed with this patch is that now redundant loads from readonly noalias pointers can be eliminated across calls, even when the pointer is captured. Internally, before this patch, AliasAnalysis was assigning Ref to reads from constant memory; now AA can assign NoModRef, which is a tighter bound. Differential Revision: https://reviews.llvm.org/D136659	2022-10-31 13:03:41 -07:00
Patrick Walton	36bbd68667	[InstCombine] Allow memcpys from constant memory to readonly nocapture parameters to be elided. Currently, InstCombine can elide a memcpy from a constant to a local alloca if that alloca is passed as a nocapture parameter to a function that's readnone or readonly, but it can't forward the memcpy if the argument is marked readonly nocapture, even though readonly guarantees that the callee won't mutate the pointee through that pointer. This patch adds support for detecting and handling such situations, which arise relatively frequently in Rust, a frontend that liberally emits readonly. A more general version of this optimization would use alias analysis to check the call's ModRef info for the pointee, but I was concerned about blowing up compile time, so for now I'm just checking for one of readnone on the function, readonly on the function, or readonly on the parameter. Differential Revision: https://reviews.llvm.org/D136822	2022-10-30 14:41:03 -07:00
Patrick Walton	cb2cb2d201	[InstCombine] Avoid deleting volatile memcpys. InstCombine can replace memcpy to an alloca with a pointer directly to the source in certain cases. Unfortunately, it also did so for volatile memcpys. This patch makes it stop doing that. This was discovered in D136822. Differential Revision: https://reviews.llvm.org/D137031	2022-10-30 02:40:16 -07:00
Craig Topper	1edc51b56a	[InstCombine] Explicitly check for scalable TypeSize. Instead of assuming it is a fixed size. Reviewed By: peterwaller-arm Differential Revision: https://reviews.llvm.org/D136517	2022-10-24 12:29:06 -07:00
Kazu Hirata	56ea4f9bd3	[Transforms] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-08-27 21:21:02 -07:00
Nuno Lopes	0299ebc1bd	InstCombine: use poison instead of undef as placeholder in insertvalue [NFC] These vectors are fully initialized so the placeholder value is irrelevant	2022-08-14 21:37:23 +01:00
Alex Bradbury	9bf2d8cbbe	[NFC] Use AllocaInst's getAddressSpace helper	2022-08-01 10:11:16 +01:00
Jay Foad	6bec3e9303	[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions. The OrSelf versions additionally have the strange behaviour of allowing extending to a smaller width, or truncating to a larger width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting. Differential Revision: https://reviews.llvm.org/D125557	2022-05-19 11:23:13 +01:00
Nikita Popov	1881711fbb	[InstCombine] Remove memset of undef value This removes memset with undef char. We already do this for stores of undef value. This comes with the caveat that this optimization is not, strictly speaking, legal for undef values, because we might be overwriting a poison value. However, our entire load/store model currently still operates on undef values, so we need to support undef here as well for internal consistency. Once https://github.com/llvm/llvm-project/issues/52930 is resolved, these and related folds can be limited to poison -- I've added FIXMEs to that effect. Differential Revision: https://reviews.llvm.org/D124173	2022-04-29 14:51:18 +02:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Arthur Eubanks	4a26abc0b9	[InstCombine][OpaquePtr] Check store type in DSE implementation	2022-02-17 10:01:14 -08:00
Nikita Popov	d839afe3f9	[InstCombine] Avoid pointer element type access in PointerReplacer This code replaces the address space of the pointers while keeping the element type. Use the appropriate helpers to make this work with opaque pointers.	2022-01-27 12:28:32 +01:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
Matt Arsenault	286237962a	InstCombine: Gracefully handle more allocas in the wrong address space Officially this is currently required to always use the datalayout's alloca address space. This may change in the future, and it's cleaner to propagate the existing alloca's addrspace anyway. This is a triple fix. Initially the change in simplifyAllocaArraySize would drop the address space, but produce output. Fixing this hit an assertion in the cast combine. This patch also makes the changes to handle this situation from `a33e128012` dead, so eliminate it. InstCombine should not take it upon itself to introduce addrspacecasts, and preserve the original address space instead.	2021-12-24 08:59:26 -05:00
Arthur Eubanks	5a81a60391	[NFC] Remove more calls to getAlignment() These are deprecated and should be replaced with getAlign(). Some of these asserts don't do anything because Load/Store/AllocaInst never have a 0 align value.	2021-12-15 14:40:57 -08:00
Arthur Eubanks	1172712f46	[NFC] Replace some deprecated getAlignment() calls with getAlign() Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D115370	2021-12-09 08:43:19 -08:00
Hongtao Yu	098a0d8fbc	[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3. This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation. Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are: - Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes - Unblocked CSE by avoiding pseudo probe from clobbering memory SSA - Unblocked induction variable simpliciation - Allow empty loop deletion by treating probe intrinsic isDroppable - Some refactoring. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110847	2021-10-12 09:44:12 -07:00
Nikita Popov	0fc624f029	[IR] Return AAMDNodes from Instruction::getMetadata() (NFC) getMetadata() currently uses a weird API where it populates a structure passed to it, and optionally merges into it. Instead, we can return the AAMDNodes and provide a separate merge() API. This makes usages more compact. Differential Revision: https://reviews.llvm.org/D109852	2021-09-16 21:06:57 +02:00
Andy Kaylor	b4d945bacd	Fixing an infinite loop problem in InstCombine Patch by Mohammad Fawaz This issues started happening after `b373b5990d` Basically, if the memcpy is volatile, the collectUsers() function should return false, just like we do for volatile loads. Differential Revision: https://reviews.llvm.org/D106950	2021-07-29 12:57:17 -07:00
Andy Kaylor	b373b5990d	Enabling the copy-constant-to-alloca optimization in more instances Patch by Mohammad Fawaz This patch allows lifetime calls to be ignored (and later erased) if we know that the copy-constant-to-alloca optimization is going to happen. The case that is missed is when the global variable is in a different address space than the alloca (as shown in the example added to the lit test.) This used to work before `6da31fa4a6` Differential Revision: https://reviews.llvm.org/D106573	2021-07-27 10:11:43 -07:00
Johannes Doerfert	a33e128012	[InstCombine] Gracefully handle an alloca outside the alloca-AS While we might eventually want to disallow allocas that do not have the alloca-AS set, it seems undesirable to crash on them. Add a cast when required so that we can support such allocas (at least here). Differential Revision: https://reviews.llvm.org/D104866	2021-06-29 09:38:13 -05:00
Nikita Popov	7bb7fa12e7	[OpaquePtr] Support changing load type in InstCombine When the load type is changed to ptr, we need the load pointer type to also be ptr, because it's not allowed to create a pointer to an opaque pointer. This is achieved by adjusting the getPointerTo() API to return an opaque pointer for an opaque pointer base type. Differential Revision: https://reviews.llvm.org/D104718	2021-06-22 21:16:15 +02:00
Juneyoung Lee	ce192ced2b	[InstCombine] Use poison constant to represent the result of unreachable instrs This patch updates InstCombine to use poison constant to represent the resulting value of (either semantically or syntactically) unreachable instrs, or a don't-care value of an unreachable store instruction. This allows more aggressive folding of unused results, as shown in llvm/test/Transforms/InstCombine/getelementptr.ll . Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104602	2021-06-21 09:58:44 +09:00
Hongtao Yu	30bb5be389	[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2. As a follow-up to D95982, this patch continues unblocking optimizations that are blocked by pseudu probe instrumention. The optimizations unblocked are: - In-block load propagation. - In-block dead store elimination - Memory copy optimization that turns stores to consecutive memories into a memset. These optimizations are local to a block, so they shouldn't affect the profile quality. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D100075	2021-04-26 16:52:33 -07:00
Juneyoung Lee	1c10201d96	Update InstCombine to use undef matcher instead This is a patch to use m_Undef() matcher instead of isa<UndefValue>(). As suggested in D100122, this update is separately committed.	2021-04-18 11:05:36 +09:00
Benjamin Kramer	cf4161673c	[Instcombine] Disable memcpy of alloca bypass for instruction sources This transformation is fundamentally broken when it comes to dominance, it just happened to work when the source of the memcpy can be moved into the place of the alloca. The bug shows up a lot more often since `077bff39d4` allows the source to be a switch. It would be possible to check dominance of the source and all its operands, but that seems very heavy for instcombine.	2021-04-14 16:52:09 +02:00
Roman Lebedev	2fea5d5d4a	[InstCombine] tmp alloca bypass: ensure that the replacement dominates all alloca uses After `077bff39d4`, isDereferenceableForAllocaSize() can recurse into selects, which is causing a problem for the new test case, reduced from https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20210412/904154.html because the replacement (the select) is defined after the first use of an alloca, so we'd end up with a verifier error. Now, this new check is too restrictive. We likely can handle some cases, by trying to sink all uses of an alloca to after the the def.	2021-04-14 13:04:12 +03:00
Nikita Popov	e0615bcd39	[Loads] Add optimized FindAvailableLoadedValue() overload (NFCI) FindAvailableLoadedValue() accepts an iterator by reference. If no available value is found, then the iterator will either be left at a clobbering instruction or the beginning of the basic block. This allows using FindAvailableLoadedValue() across multiple blocks. If this functionality is not needed, as is the case in InstCombine, then we can use a much more efficient implementation: First try to find an available value, and only perform clobber checks if we actually found one. As this function only looks at a very small number of instructions (6 by default) and usually doesn't find an available value, this saves many expensive alias analysis queries.	2021-02-21 18:42:56 +01:00
Kazu Hirata	e53472de68	[Transforms] Use llvm::append_range (NFC)	2021-01-20 21:35:54 -08:00
Luo, Yuanke	055644cc45	[X86][AMX] Prohibit pointer cast on load. The load/store instruction will be transformed to amx intrinsics in the pass of AMX type lowering. Prohibiting the pointer cast make that pass happy. Differential Revision: https://reviews.llvm.org/D94372	2021-01-13 09:39:19 +08:00
Luo, Yuanke	981a0bd858	[X86] Add x86_amx type for intel AMX. The x86_amx is used for AMX intrisics. <256 x i32> is bitcast to x86_amx when it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it is used by load/store instruction. So amx intrinsics only operate on type x86_amx. It can help to separate amx intrinsics from llvm IR instructions (+-*/). Thank Craig for the idea. This patch depend on https://reviews.llvm.org/D87981. Differential Revision: https://reviews.llvm.org/D91927	2020-12-30 13:52:13 +08:00
Kazu Hirata	789d250613	[CodeGen, Transforms] Use *Map::lookup (NFC)	2020-12-27 09:57:27 -08:00
Nikita Popov	ef2f843347	Revert "[InstCombine] Check inbounds in load/store of gep null transform (PR48577)" This reverts commit `899faa50f2`. Upon further consideration, this does not fix the right issue. Doing this fold for non-inbounds GEPs is legal, because the resulting pointer is still based-on null, which has no associated address range, and as such and access to it is UB. https://bugs.llvm.org/show_bug.cgi?id=48577#c3	2020-12-24 12:36:56 +01:00
Nikita Popov	90177912a4	Revert "[InstCombine] Fold gep inbounds of null to null" This reverts commit `eb79fd3c92`. This causes stage2 crashes, possibly due to StringMap being miscompiled. Reverting for now.	2020-12-24 10:20:31 +01:00
Nikita Popov	eb79fd3c92	[InstCombine] Fold gep inbounds of null to null Effectively, this is what we were previously already doing when the GEP was used in conjunction with a load or store, but this fold can also be applied more generally: > The only in bounds address for a null pointer in the default > address-space is the null pointer itself.	2020-12-23 21:41:53 +01:00
Nikita Popov	899faa50f2	[InstCombine] Check inbounds in load/store of gep null transform (PR48577) If the GEP isn't inbounds, then accessing a GEP of null location is generally not UB. While this is a minimal fix, the GEP of null handling should probably be its own fold.	2020-12-23 21:03:22 +01:00
Joe Ellis	0f83505593	[SVE][InstCombine] Fix TypeSize warning in canReplaceGEPIdxWithZero The warning would fire when calling canReplaceGEPIdxWithZero on a GEP whose source element type is a scalable vector. The size of scalable vector types is not known, so this optimization cannot be performed. This patch fixes the issue by: - bailing out early in this routine if the GEP instruction's source element type is a scalable vector. - making use of getFixedSize -- this removes the dependency on the deprecated interface. Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D89968	2020-10-26 17:40:26 +00:00
Juneyoung Lee	62a0ec1612	Add support for !noundef metatdata on loads This patch adds metadata !noundef and makes load instructions can optionally have it. A load with !noundef always return a well-defined value (has no undef bit or isn't poison). If the loaded value isn't well defined, the behavior is undefined. This metadata can be used to encode the assumption from C/C++ that certain reads of variables should have well-defined values. It is helpful for optimizing freeze instructions away, because freeze can be removed when its operand has well-defined value, and showing that a load from arbitrary location is well-defined is usually hard otherwise. The same information can be encoded with llvm.assume with operand bundle; using metadata is chosen because I wasn't sure whether code motion can be freely done when llvm.assume is inserted from clang instead. The existing codebase already is stripping unknown metadata when doing code motion, so using metadata is UB-safe as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89050	2020-10-17 13:50:10 +09:00
Matt Arsenault	6a9484f4bf	InstCombine: Fix losing load properties in copy-constant-to-alloca Preserve the alignment and metadata. Atomic loads are skipped for this, but pass along the properties for consistency.	2020-10-14 12:55:25 -04:00
Matt Arsenault	6da31fa4a6	InstCombine: Fix infinite loop in copy-constant-to-alloca transform This was broken by `16295d521e`, when instructions started being handled and not just constant expressions. This was re-inserting an equivalent bitcast to the original memcpy operand, which made a non-functional IR change on every iteration. This also fixes a secondary problem where it was inserting addrspacecasts which may not have been legal (i.e. it changed the source address space). Start visiting all pointer users and fail out if we can't process them. Also start handling the relevant memory intrinsic users. These cases can be dealt with by running InferAddressSpaces separately.	2020-10-14 12:55:25 -04:00
Roman Lebedev	544a6aa267	[InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load And another step towards transforms not introducing inttoptr and/or ptrtoint casts that weren't there already. As we've been establishing (see D88788/D88789), if there is a int<->ptr cast, it basically must stay as-is, we can't do much with it. I've looked, and the most source of new such casts being introduces, as far as i can tell, is this transform, which, ironically, tries to reduce count of casts.. On vanilla llvm test-suite + RawSpeed, @ `-O3`, this results in -33.58% less `IntToPtr`s (19014 -> 12629) and +76.20% more `PtrToInt`s (18589 -> 32753), which is an increase of +20.69% in total. However just on RawSpeed, where i know there are basically none `IntToPtr` in the original source code, this results in -99.27% less `IntToPtr`s (2724 -> 20) and +82.92% more `PtrToInt`s (4513 -> 8255). which is again an increase of 14.34% in total. To me this does seem like the step in the right direction, we end up with strictly less `IntToPtr`, but strictly more `PtrToInt`, which seems like a reasonable trade-off. See https://reviews.llvm.org/D88860 / https://reviews.llvm.org/D88995 for some more discussion on the subject. (Eventually, `CastInst::isNoopCast()`/`CastInst::isEliminableCastPair` should be taught about this, yes) Reviewed By: nlopes, nikic Differential Revision: https://reviews.llvm.org/D88979	2020-10-11 20:24:28 +03:00
Roman Lebedev	e00f189d39	[InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592) (it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html) This canonicalization seems dubious. Most importantly, while it does not create `inttoptr` casts by itself, it may cause them to appear later, see e.g. D88788. I think it's pretty obvious that it is an undesirable outcome, by now we've established that seemingly no-op `inttoptr`/`ptrtoint` casts are not no-op, and are no longer eager to look past them. Which e.g. means that given ``` %a = load i32 %b = inttoptr %a %c = inttoptr %a ``` we likely won't be able to tell that `%b` and `%c` is the same thing. As we can see in D88789 / D88788 / D88806 / D75505, we can't really teach SCEV about this (not without the https://bugs.llvm.org/show_bug.cgi?id=47592 at least) And we can't recover the situation post-inlining in instcombine. So it really does look like this fold is actively breaking otherwise-good IR, in a way that is not recoverable. And that means, this fold isn't helpful in exposing the passes that are otherwise unaware of these patterns it produces. Thusly, i propose to simply not perform such a canonicalization. The original motivational RFC does not state what larger problem that canonicalization was trying to solve, so i'm not sure how this plays out in the larger picture. On vanilla llvm test-suite + RawSpeed, this results in increase of asm instructions and final object size by ~+0.05% decreases final count of bitcasts by -4.79% (-28990), ptrtoint casts by -15.41% (-3423), and of inttoptr casts by -25.59% (-6919, sic). Overall, there's -0.04% less IR blocks, -0.39% instructions. See https://bugs.llvm.org/show_bug.cgi?id=47592 Differential Revision: https://reviews.llvm.org/D88789	2020-10-06 00:00:30 +03:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows to move about 3000 lines out from InstCombine to the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Roman Lebedev	e2b75cafcb	[NFCI][InstCombine] Move store merging from `visitStoreInst()` into `visitUnconditionalBranchInst()` Summary: As @nikic is pointing out in https://bugs.llvm.org/show_bug.cgi?id=46680#c5, InstCombine should not have forward instruction scans, so let's move this transform into the proper place. This is pretty much NFCI. Reviewers: nikic, spatel Reviewed By: nikic Subscribers: hiraditya, llvm-commits, nikic Tags: #llvm Differential Revision: https://reviews.llvm.org/D83670	2020-07-14 10:41:51 +03:00
Roman Lebedev	2655a70a04	[InstCombine] After merging store into successor, queue prev. store to be visited (PR46661) We can happen to have a situation with many stores eligible for transform, but due to our visitation order (top to bottom), when we have processed the first eligible instruction, we would not try to reprocess the previous instructions that are now also eligible. So after we've successfully merged a store that was second-to-last instruction into successor, if the now-second-to-last instruction is also a such store that is eligible, add it to worklist to be revisited. Fixes https://bugs.llvm.org/show_bug.cgi?id=46661	2020-07-10 17:49:16 +03:00

1 2 3 4 5 ...

309 Commits