llvm-project

Commit Graph

Author	SHA1	Message	Date
Max Kazantsev	62f4572e45	[IndVars][NFC] Make IVOperand parameter an instruction	2022-07-13 19:07:16 +07:00
Max Kazantsev	30e33b4b81	[SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional	2022-07-13 18:54:25 +07:00
Yuanfang Chen	fcb7d76d65	[coroutine] add nomerge function attribute to `llvm.coro.save` It is illegal to merge two `llvm.coro.save` calls unless their `llvm.coro.suspend` users are also merged. Marks it "nomerge" for the moment. This reverts D129025. Alternative to D129025, which affects other token type users like WinEH. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D129530	2022-07-12 10:39:38 -07:00
Nick Desaulniers	2240d72f15	[X86] initial -mfunction-return=thunk-extern support Adds support for: * `-mfunction-return=<value>` command line flag, and * `__attribute__((function_return("<value>")))` function attribute Where the supported <value>s are: * keep (disable) * thunk-extern (enable) thunk-extern enables clang to change ret instructions into jmps to an external symbol named __x86_return_thunk, implemented as a new MachineFunctionPass named "x86-return-thunks", keyed off the new IR attribute fn_ret_thunk_extern. The symbol __x86_return_thunk is expected to be provided by the runtime the compiled code is linked against and is not defined by the compiler. Enabling this option alone doesn't provide mitigations without corresponding definitions of __x86_return_thunk! This new MachineFunctionPass is very similar to "x86-lvi-ret". The <value>s "thunk" and "thunk-inline" are currently unsupported. It's not clear yet that they are necessary: whether the thunk pattern they would emit is beneficial or used anywhere. Should the <value>s "thunk" and "thunk-inline" become necessary, x86-return-thunks could probably be merged into x86-retpoline-thunks which has pre-existing machinery for emitting thunks (which could be used to implement the <value> "thunk"). Has been found to build+boot with corresponding Linux kernel patches. This helps the Linux kernel mitigate RETBLEED. * CVE-2022-23816 * CVE-2022-28693 * CVE-2022-29901 See also: * "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions." * AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion * TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0 2022-07-12 * Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702 SystemZ may eventually want to support "thunk-extern" and "thunk"; both options are used by the Linux kernel's CONFIG_EXPOLINE. 
This functionality has been available in GCC since the 8.1 release, and was backported to the 7.3 release. Many thanks for folks that provided discrete review off list due to the embargoed nature of this hardware vulnerability. Many Bothans died to bring us this information. Link: https://www.youtube.com/watch?v=IF6HbCKQHK8 Link: https://github.com/llvm/llvm-project/issues/54404 Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1 Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60 Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html Reviewed By: aaron.ballman, craig.topper Differential Revision: https://reviews.llvm.org/D129572	2022-07-12 09:17:54 -07:00
Nikita Popov	3d475dfeb9	[Mem2Reg] Consistently preserve nonnull assume for uninit load When performing a !nonnull load from uninitialized memory, we should preserve the nonnull assume just like in all other cases. We already do this correctly in the generic mem2reg code, but don't handle this case when using the optimized single-block implementation. Make sure that the optimized implementation exhibits the same behavior as the generic implementation.	2022-07-12 12:53:08 +02:00
Paul Osmialowski	b17754bcaa	[SimplifyLibCalls] refactor pow(x, n) expansion where n is a constant integer value Since the backend's codegen is capable of expanding powi into fmul's, it is no longer necessary to do so in the ::optimizePow() function of SimplifyLibCalls.cpp. It is sufficient to always turn pow(x, n) into powi(x, n) for the cases where n is a constant integer value. Dropping the current expansion code allowed relaxation of the folding conditions and now this can also happen at optimization levels below Ofast. The added CodeGen/AArch64/powi.ll test case ensures that powi is actually expanded into fmul's, confirming that this refactor did not cause any performance degradation. Following an idea proposed by David Sherwood <david.sherwood@arm.com>. Differential Revision: https://reviews.llvm.org/D128591	2022-07-09 12:00:22 -04:00
zhongyunde	716e1b856a	[IndVars] Eliminate redundant type cast between integer and float Recompute the range: match for fptosi of sitofp, and then query the range of the input to the sitofp according the comment on D129140. Fixes https://github.com/llvm/llvm-project/issues/55505. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D129191	2022-07-08 17:07:20 +08:00
Nikita Popov	34a5c2bcf2	[BasicBlockUtils] Allow critical edge splitting with callbr terminators After D129205, we support SplitBlockPredecessors() for predecessors with callbr terminators. This means that it is now also safe to invoke critical edge splitting for an edge coming from a callbr terminator. Remove checks in various passes that were protecting against that. Differential Revision: https://reviews.llvm.org/D129256	2022-07-08 09:20:44 +02:00
Martin Sebor	516915beb5	[InstCombine] Fold memchr and strchr equality with first argument Enhance memchr and strchr handling to simplify calls to the functions used in equality expressions with the first argument to at most two integer comparisons: - memchr(A, C, N) == A to N && *A == C for either a dereferenceable A or a nonzero N, - strchr(S, C) == S to *S == C for any S and C, and - strchr(S, '\0') == 0 to true for any S Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128939	2022-07-07 15:14:23 -06:00
Zaara Syeda	58b9666dc1	[LSR] Fix bug - check if loop has preheader before calling isInductionPHI Fix bug exposed by https://reviews.llvm.org/D125990 rewriteLoopExitValues calls InductionDescriptor::isInductionPHI which requires the PHI node to have an incoming edge from the loop preheader. This adds checks before calling InductionDescriptor::isInductionPHI to see that the loop has a preheader. Also did some refactoring. Differential Revision: https://reviews.llvm.org/D129297	2022-07-07 15:11:33 -04:00
Joseph Huber	41fba3c107	[Metadata] Add 'exclude' metadata to add the exclude flags on globals This patch adds a new metadata kind `exclude` which implies that the global variable should be given the necessary flags during code generation to not be included in the final executable. This is done using the ``SHF_EXCLUDE`` flag on ELF for example. This should make it easier to specify this flag on a variable without needing to explicitly check the section name in the target backend. Depends on D129053 D129052 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D129151	2022-07-07 12:20:40 -04:00
Joseph Huber	ed801ad5e5	[Clang] Use metadata to make identifying embedded objects easier Currently we use the `embedBufferInModule` function to store binary strings containing device offloading data inside the host object to create a fatbinary. In the case of LTO, we need to extract this object from the LLVM-IR. This patch adds a metadata node for the embedded objects containing the embedded pointers and the sections they were stored at. This should create a cleaner interface for identifying these values. In the future it may be worthwhile to also encode an `ID` in the metadata corresponding to the object's special section type if relevant. This would allow us to extract the data from an object file and LLVM-IR using the same ID. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D129033	2022-07-07 12:20:25 -04:00
Nikita Popov	40a4078e14	[BasicBlockUtils] Allow splitting predecessors with callbr terminators SplitBlockPredecessors currently asserts if one of the predecessor terminators is a callbr. This limitation was originally necessary, because just like with indirectbr, it was not possible to replace successors of a callbr. However, this is no longer the case since D67252. As the requirement nowadays is that callbr must reference all blockaddrs directly in the call arguments, and these get automatically updated when setSuccessor() is called, we no longer need this limitation. The only thing we need to do here is use replaceSuccessorWith() instead of replaceUsesOfWith(), because only the former does the necessary blockaddr updating magic. I believe there's other similar limitations that can be removed, e.g. related to critical edge splitting. Differential Revision: https://reviews.llvm.org/D129205	2022-07-07 09:13:25 +02:00
Nikola Tesic	b5b6d3a41b	[Debugify] Port verify-debuginfo-preserve to NewPM Debugify in OriginalDebugInfo mode, introduced with D82545, runs only with legacy PassManager. This patch enables this utility for the NewPM. Differential Revision: https://reviews.llvm.org/D115351	2022-07-06 17:07:20 +02:00
Shilei Tian	1023ddaf77	[LLVM] Add the support for fmax and fmin in atomicrmw instruction This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instruction will be expanded to CAS loop. There are already a couple of targets supporting the feature. I'll create another patch(es) to enable them accordingly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D127041	2022-07-06 10:57:53 -04:00
Nikita Popov	20962c1240	[SimplifyCFG] Don't split predecessors of callbr terminator This addresses the assertion failure reported in https://reviews.llvm.org/D124159#3631240. I believe that this limitation in SplitBlockPredecessors is not actually necessary (because unlike with indirectbr, callbr is restricted in a way that does allow updating successors), but for now fix the assertion failure the same way we do everywhere else, by also skipping callbr.	2022-07-06 15:38:53 +02:00
Nikita Popov	f96cb66d19	[ValueTracking] Accept Instruction in isSafeToSpeculativelyExecute() (NFC) As constant expressions can no longer trap, it only makes sense to call isSafeToSpeculativelyExecute on Instructions, so limit the API to accept only them, rather than general Operators or Values.	2022-07-06 11:12:49 +02:00
Nikita Popov	8ee913d83b	[IR] Remove Constant::canTrap() (NFC) As integer div/rem constant expressions are no longer supported, constants can no longer trap and are always safe to speculate. Remove the Constant::canTrap() method and its usages.	2022-07-06 10:36:47 +02:00
Yuanfang Chen	b170d856a3	[SimplifyCFG] Skip hoisting common instructions that return token type By LangRef, hoisting token-returning instructions obscures the origin so it should be skipped. Found this issue while investigating a CoroSplit pass crash. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D129025	2022-07-05 11:21:57 -07:00
Zaara Syeda	dbf6ab5ef9	[LSR] Fix bug for optimizing unused IVs to final values This is a fix for a crash reported for https://reviews.llvm.org/D118808 The fix is to only consider PHINodes which are induction phis. Fixes #55529 Differential Revision: https://reviews.llvm.org/D125990	2022-07-05 12:30:58 -04:00
Nikita Popov	a4772cbaf0	Revert "[SimplifyCFG] Thread branches on same condition in more cases (PR54980)" This reverts commit `4e545bdb35`. The newly added test is the third infinite combine loop caused by this change. In this case, it's a combination of the branch to common dest and jump threading folds that keeps peeling off loop iterations. The core problem here is that we ideally would not thread over loop backedges, both because it is potentially non-profitable (it may break canonical loop structure) and because it may result in these kinds of loops. Unfortunately, due to the lack of a dominator tree in SimplifyCFG, there is no good way to prevent this. While we have LoopHeaders, this is an optional structure and we don't do a good job of keeping it up to date. It would be fine for a profitability check, but is not suitable for a correctness check. So for now I'm just giving up here, as I don't see a good way to robustly prevent infinite combine loops. Fixes https://github.com/llvm/llvm-project/issues/56203.	2022-07-05 16:57:46 +02:00
Nikita Popov	dc969061c6	[SimplifyCFG] Thread all predecessors with same value at once If there are multiple predecessors that have the same condition value (and thus same "real destination"), these were previously handled by copying the threaded block for each predecessor. Instead, we can reuse one block for all of them. This makes the behavior of SimplifyCFG's jump threading match that of the actual JumpThreading pass. This also avoids the infinite combine loop reported in: https://reviews.llvm.org/D124159#3624387	2022-07-05 14:33:53 +02:00
Nikita Popov	32a76fc292	[SCEVExpander] Avoid ConstantExpr::get() (NFCI) Use ConstantFoldBinaryOpOperands() instead. This will be important when not all binops have constant expression variants.	2022-07-04 14:59:00 +02:00
Nikita Popov	9604601c93	[SimplifyCFG] Remove redundant checks for hoisting (NFCI) These conditions are later checked in the HoistTerminator code path. Checking them here is somewhat confusing, because this code only checks the first instruction in the block, which is not necessarily the terminator.	2022-07-04 10:53:54 +02:00
Martin Sebor	0d68ff87d2	[InstCombine] Transform strrchr to memrchr for constant strings Add an emitter for the memrchr common extension and simplify the strrchr call handler to use it. This enables transforming calls with the empty string to the test C ? S : 0. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128954	2022-07-01 11:10:00 -06:00
Nikita Popov	65d59b4265	[LoopDeletion] Fix deletion with unusual predecessor terminator (PR56266) LoopSimplify only requires that the loop predecessor has a single successor and is safe to hoist into -- it doesn't necessarily have to be an unconditional BranchInst. Adjust LoopDeletion to assert conditions closer to what it actually needs for correctness, namely a single successor and a side-effect-free terminator (as the terminator is getting dropped). Fixes https://github.com/llvm/llvm-project/issues/56266.	2022-07-01 16:13:35 +02:00
Nikita Popov	fabe915705	[SimplifyLibCalls] Use inbounds GEP When converting strchr(p, '\0') to p + strlen(p) we know that strlen() must return an offset that is inbounds of the allocated object (otherwise it would be UB), so we can use an inbounds GEP. An equivalent argument can be made for the other cases.	2022-07-01 14:31:44 +02:00
Nikita Popov	9b994593cc	[SCCP] Only handle unknown lattice values in resolvedUndefsIn() This is a minor refinement of resolvedUndefsIn(), mostly for clarity. If the value of an instruction is undef, then that's already a legal final result -- we can safely rauw such an instruction with undef. We only need to mark unknown values as overdefined, as that's the result we get for an instruction that has not been processed because it has an undef operand. Differential Revision: https://reviews.llvm.org/D128251	2022-07-01 09:14:37 +02:00
Chen Zheng	39fe49aa57	[Inline] don't add noalias metadata for unknown objects. The unidentified objects recognized in `getUnderlyingObjects` may still alias to the noalias parameter because `getUnderlyingObjects` may not check deep enough to get the underlying object because of `MaxLookup`. The real underlying object for the unidentified object may still be the noalias parameter. Originally Patched By: tingwang Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D127202	2022-07-01 02:16:55 -04:00
Martin Sebor	3a743a5892	[InstCombine] Fix memrchr logic error that prevents folding Correct a logic bug in the memrchr enhancement added in D123629 that makes it ineffective in a subset of cases. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128856	2022-06-30 11:35:26 -06:00
Nikita Popov	f34dcf2763	[IRBuilder] Migrate all binops to folding API Migrate all binops to use FoldXYZ rather than CreateXYZ APIs, which are compatible with InstSimplifyFolder and fallible constant folding. Rather than continuing to add one method for every single operator, add a generic FoldBinOp (plus variants for nowrap, exact and fmf operators), which we would need anyway for CreateBinaryOp. This change is not NFC because IRBuilder with InstSimplifyFolder may perform more folding. However, this patch changes SCEVExpander to not use the folder in InsertBinOp to minimize practical impact and keep this change as close to NFC as possible.	2022-06-30 16:41:17 +02:00
Nikita Popov	588e229bf9	[VNCoercion] Separate constant/non-constant mem intrinsic implementations (NFCI) This means we no longer need to have the same API between IRBuilder and IRBuilderFolder. The constant case is substantially simpler, so implementing it separately isn't an undue burden.	2022-06-30 15:26:06 +02:00
Nikita Popov	014c4bdb9d	[VNCoercion] Use ConstantFoldLoadFromConst API (NFCI) Nowadays we have a generic constant folding API to load a type from an offset. It should be able to do anything that VNCoercion can do. This avoids the weird templating between IRBuilder and ConstantFolder in one function, which will stop working as the IRBuilderFolder moves from CreateXYZ to FoldXYZ APIs. Unfortunately, this doesn't eliminate this pattern from VNCoercion entirely yet.	2022-06-30 14:52:27 +02:00
Nikita Popov	1579fc62fe	[Evaluator] Add missing LLVM_DEBUG() Missed these in `41f0b6a781`, resulting in unconditional debug output.	2022-06-30 11:54:47 +02:00
Chen Zheng	b05801de35	[InlineFunction] Only check pointer arguments for a call Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128529	2022-06-30 05:39:47 -04:00
Nikita Popov	41f0b6a781	[Evaluator] Use ConstantFoldInstOperands() For instructions that don't need any special handling, use ConstantFoldInstOperands(), rather than re-implementing individual cases. This is probably not NFC because it can handle cases the previous code missed (e.g. vector operations).	2022-06-30 11:10:17 +02:00
Nikita Popov	a6d4b4138f	[ConstantFold] Supports compares in ConstantFoldInstOperands() Support compares in ConstantFoldInstOperands(), instead of forcing the use of ConstantFoldCompareInstOperands(). Also handle insertvalue (extractvalue was already handled). This removes a footgun, where many uses of ConstantFoldInstOperands() need a separate check for compares beforehand. It's particularly insidious if called on a constant expression, because it doesn't fail in that case, but will just not do DL-dependent folding.	2022-06-30 11:05:24 +02:00
Florian Hahn	6d5f814357	[LoopUnrollRuntime] Invalidate SCEV for exit phi in ConnectProlog. ConnectProlog adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Fix is analogous to `cfc741bc0e`. Fixes #56286.	2022-06-29 20:28:43 +01:00
Florian Hahn	9a35f19e3e	[UnrollRuntime] Invalidate SCEVs for modified phis in ConnectEpilog. ConnectEpilog adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Fix is analogous to `cfc741bc0e`. Fixes #56282.	2022-06-29 18:26:00 +01:00
Martin Sebor	8827679826	[InstCombine] Fold strncmp of constant arrays and variable size Extend the solution accepted in D127766 to strncmp and simplify strncmp(A, B, N) calls with constant A and B and variable N to the equivalent of N <= Pos ? 0 : (A < B ? -1 : B < A ? +1 : 0) where Pos is the offset of either the first mismatch between A and B or the terminating null character if both A and B are equal strings. Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D128089	2022-06-28 15:59:14 -06:00
Martin Sebor	e263a7670e	[InstCombine] Look through more casts when folding memchr and memcmp Enhance getConstantDataArrayInfo to let the memchr and memcmp library call folders look through arbitrarily long sequences of bitcast and GEP instructions. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128364	2022-06-28 15:58:42 -06:00
Nikita Popov	5548e807b5	[IR] Remove support for extractvalue constant expression This removes the extractvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. extractvalue is already not supported in bitcode, so we do not need to worry about bitcode auto-upgrade. Uses of ConstantExpr::getExtractValue() should be replaced with IRBuilder::CreateExtractValue() (if the fact that the result is constant is not important) or ConstantFoldExtractValueInstruction() (if it is). Though for this particular case, it is also possible and usually preferable to use getAggregateElement() instead. The C API function LLVMConstExtractValue() is removed, as the underlying constant expression no longer exists. Instead, LLVMBuildExtractValue() should be used (which will constant fold or create an instruction). Depending on the use-case, LLVMGetAggregateElement() may also be used instead. Differential Revision: https://reviews.llvm.org/D125795	2022-06-28 10:40:17 +02:00
Nikita Popov	f65c88c42f	[GlobalOpt] Fix memset handling in global ctor evaluation (PR55859) The global ctor evaluator currently handles memset by checking whether the memset memory is already zero, and skips it in that case. However, it only actually checks the first byte of the memory being set. This patch extends the code to check all bytes being set. This is done byte-by-byte to avoid converting undef values to zeros in larger reads. However, the handling is still not completely correct, because there might still be padding bytes (though this probably doesn't matter much, as I'd expect global variable padding to be zero-initialized in practice). Mostly fixes https://github.com/llvm/llvm-project/issues/55859. Differential Revision: https://reviews.llvm.org/D128532	2022-06-27 16:50:49 +02:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit `aa8feeefd3`.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Arthur Eubanks	e422c0d3b2	[GlobalOpt] Perform store->dominated load forwarding for stored once globals The initial land incorrectly optimized forwarding non-Constants in non-nosync/norecurse functions. Bail on non-Constants since norecurse should cause global -> alloca promotion anyway. The initial land also incorrectly assumed that StoredOnceStore was the only store to the global, but it actually means that only one value other than the global initializer is stored. Add a check that there's only one store. Compile time tracker: https://llvm-compile-time-tracker.com/compare.php?from=c80b88ee29f34078d2149de94e27600093e6c7c0&to=ef2c2b7772424b6861a75e794f3c31b45167304a&stat=instructions Reviewed By: nikic, asbirlea, jdoerfert Differential Revision: https://reviews.llvm.org/D128128	2022-06-24 09:09:26 -07:00
Nikita Popov	e523baa664	[InlineFunction] Slightly clarify noalias scope calculation (NFC) Rename CanDeriveViaCapture -> RequiresNoCaptureBefore, drop unnecessary const cast, reformat some code to avoid an ugly super-indented comment.	2022-06-24 12:31:46 +02:00
Florian Mayer	9320a32bb9	[MTE] [HWASan] Use LoopInfo for reachability queries. The reachability queries default to "reachable" after exploring too many basic blocks. LoopInfo helps it skip over the whole loop. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D127917	2022-06-22 15:28:49 -07:00
Brendon Cahoon	e13248ab0e	[UnifyLoopExits] Reduce number of guard blocks UnifyLoopExits creates a single exit, a control flow hub, for loops with multiple exits. There is an input to the block for each loop exiting block and an output from the block for each loop exit block. Multiple checks, or guard blocks, are needed to branch to the correct exit block. For large loops with lots of exit blocks, all the extra guard blocks cause problems for StructurizeCFG and subsequent passes. This patch reduces the number of guard blocks needed when the exit blocks branch to a common block (e.g., an unreachable block). The guard blocks are reduced by changing the inputs and outputs of the control flow hub. The inputs are the exit blocks and the outputs are the common block. Reducing the guard blocks enables StructurizeCFG to reorder the basic blocks in the CFG to reduce the values that exit a loop with multiple exits. This reduces the compile-time of StructurizeCFG and also reduces register pressure. Differential Revision: https://reviews.llvm.org/D123230	2022-06-22 15:44:23 -05:00
Florian Mayer	476ced4b89	[MTE] [HWASan] Support diamond lifetimes. We were overly conservative and required a ret statement to be dominated completely by a single lifetime.end marker. This is quite restrictive and leads to two problems: * limits coverage of use-after-scope, as we degenerate to use-after-return; * increases stack usage in programs, as we have to remove all lifetime markers if we degenerate to use-after-return, which prevents reuse of stack slots by the stack coloring algorithm. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D127905	2022-06-22 11:16:34 -07:00
Nikita Popov	1f88d80408	[SCCP] Don't mark edges feasible when resolving undefs As branch on undef is immediate undefined behavior, there is no need to mark one of the edges as feasible. We can leave all the edges non-feasible. In IPSCCP, we can replace the branch with an unreachable terminator. Differential Revision: https://reviews.llvm.org/D126962	2022-06-22 10:28:27 +02:00
Martin Sebor	b19194c032	[InstCombine] handle subobjects of constant aggregates Remove the known limitation of the library function call folders to only work with top-level arrays of characters (as per the TODO comment in the code) and allow them to also fold calls involving subobjects of constant aggregates such as member arrays.	2022-06-21 11:55:14 -06:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
Florian Hahn	cfc741bc0e	[LoopPeel] Forget SCEV for updated exit phi values. LoopPeel adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Forget SCEVs for such phis. Fixes #56044. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128164	2022-06-20 13:19:27 +02:00
Guillaume Chatelet	f1255186c7	[NFC][Alignment] Remove max functions between Align and MaybeAlign `llvm::max(Align, MaybeAlign)` and `llvm::max(MaybeAlign, Align)` are not used often enough to be required. They also make the code more opaque. Differential Revision: https://reviews.llvm.org/D128121	2022-06-20 08:37:48 +00:00
Nikita Popov	2b089e9ae0	[SimplifyCFG] Try to merge edge block when threading (PR55765) When threading, we always create a new block for the threaded edge (even if the edge is not critical), which will later get folded back into the predecessor if possible. Depending on precise processing order, this separate block may break the detection of trivial cycles in the threading code, which normally avoids infinite threading of loops. Explicitly merge the created edge block into the predecessor to avoid this. Fixes https://github.com/llvm/llvm-project/issues/55765. Differential Revision: https://reviews.llvm.org/D127216	2022-06-20 10:29:33 +02:00
Kazu Hirata	129b531c9c	[llvm] Use value_or instead of getValueOr (NFC)	2022-06-18 23:07:11 -07:00
Florian Hahn	e9cced2739	Recommit "[LAA] Initial support for runtime checks with pointer selects." This reverts commit `7aa8a67882`. This version includes fixes to address issues uncovered after the commit landed and discussed at D11448. Those include: * Limit select-traversal to selects inside the loop. * Freeze pointers resulting from looking through selects to avoid branch-on-poison.	2022-06-17 21:06:26 +02:00
Martin Sebor	5fb67e32f8	[InstCombine] Fold memcmp of constant arrays and variable size The memcmp simplifier is limited to folding to constants calls with constant arrays and constant sizes. This change adds the ability to simplify memcmp(A, B, N) calls with constant A and B and variable N to the pseudocode equivalent of N <= Pos ? 0 : (A < B ? -1 : B < A ? +1 : 0) where Pos is the offset of the first mismatch between A and B. Differential Revision: https://reviews.llvm.org/D127766	2022-06-17 10:35:35 -06:00
Samuel Eubanks	bf02ed240d	Prevent crash when TurnSwitchRangeIntoICmp receives default unreachable destination TurnSwitchRangeIntoICmp crashes when given a switch with a default destination of unreachable Addresses issue #53208 https://github.com/llvm/llvm-project/issues/53208 Differential revision: https://reviews.llvm.org/D127712	2022-06-16 16:11:24 +02:00
Nikita Popov	2dac2c4f76	[SimplifyLibCalls] Drop duplicate check (NFC) The same condition already exists inside optimizeMemCmpConstantSize().	2022-06-15 09:37:09 +02:00
Serguei Katkov	d713f0eab8	Revert "[MachineSSAUpdater] compile time improvement in GetValueInMiddleOfBlock" It looks like it causes buildbot failures. As an example: https://lab.llvm.org/buildbot/#/builders/121/builds/20364 Revert to investigate... This reverts commit `6bf2791814`.	2022-06-14 20:27:21 +07:00
Serguei Katkov	6bf2791814	[MachineSSAUpdater] compile time improvement in GetValueInMiddleOfBlock GetValueInMiddleOfBlock uses the result of GetValueAtEndOfBlockInternal if there is no value defined for the current basic block. If there is already a value, it tries (in this order) to: find a single register coming from all predecessors; find an existing phi node which matches our incoming registers; build a new phi. The compile time improvement is to use the currently available value if it is defined outside the current BB or it is a PHI register, because such a value can be used in the middle of the basic block. Reviewed By: sameerds Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D126523	2022-06-14 18:00:34 +07:00
Chuanqi Xu	d029db9e8a	[NFC] Fix Wswitch warning triggered by 735e6c	2022-06-14 14:45:15 +08:00
Guillaume Chatelet	2887dd754e	[NFC][Alignment] Use getAlign in VNCoercion	2022-06-13 15:13:05 +00:00
Nikita Popov	571c713144	[SimplifyCFG] Handle trapping aggregates (PR49839) Handle the fact that not only constant expressions, but also constant aggregates containing expressions can trap. This still doesn't fix the original C reproducer, probably due to more issues remaining in other passes.	2022-06-13 14:56:49 +02:00
Hans Wennborg	3800b157d7	[SimplifyCFG] Share code to compute switch density between ShouldBuildLookupTable() and ReduceSwitchRange() They're computing the same thing. No functionality change. Differential revision: https://reviews.llvm.org/D127482	2022-06-10 15:29:36 +02:00
Nikita Popov	d77f944832	[LoopInfo] Add getOutermostLoop() (NFC) This is a recurring pattern, add an API function for it.	2022-06-10 11:48:21 +02:00
Philip Reames	f85c5079b8	Pipe potentially invalid InstructionCost through CodeMetrics Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred. On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost. I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change. Differential Revision: https://reviews.llvm.org/D127131	2022-06-09 15:17:24 -07:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Nikita Popov	56c9976d46	[IndVarSimplify] Don't assert that terminator is not SCEVable (PR55925) The IV widening code currently asserts that terminators aren't SCEVable -- however, this is not the case for invokes with a returned attribute. As far as I can tell, this assertion is not necessary -- even if we have a critical edge (the second test case), the trunc gets inserted in a legal position. Fixes https://github.com/llvm/llvm-project/issues/55925. Differential Revision: https://reviews.llvm.org/D127288	2022-06-09 10:12:13 +02:00
Chuanqi Xu	0e10f12844	[NFC] Remove commented cerr debugging loggings There are some unused cerr debugging loggings in the codes. It is weird to remain such commented debug helpers in the product.	2022-06-08 15:58:06 +08:00
Martin Sebor	dd2a6d78ee	[InstCombine] Fold memchr of sequences of same characters Enhance memchr libcall folder to handle constant arrays consisting of one or two sequences of consecutive equal characters. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126515	2022-06-07 13:45:10 -06:00
Martin Sebor	fb6627fa0c	[InstCombine] Add substr helper function (NFC). Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126515	2022-06-07 13:27:36 -06:00
Nikita Popov	7fa97b473c	[SCCP] Don't mark ranges from branch conditions as potentially undef Now that transforms introducing branch on poison have been removed, we can stop marking ranges that have been derived from branch conditions as containing undef. The existing comment explains why this is legal. I've checked that alive2 is happy with SCCP tests after this change. Differential Revision: https://reviews.llvm.org/D126647	2022-06-07 10:20:24 +02:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Kazu Hirata	2c4d52467a	[Transforms/Utils] Use predecessors (NFC)	2022-06-05 00:16:14 -07:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to `557efc9a8b`. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00
Augie Fackler	73f664601c	BuildLibCalls: infer allockind attributes on relevant functions Differential Revision: https://reviews.llvm.org/D123089	2022-05-31 10:01:17 -04:00
Augie Fackler	42861faa8e	attributes: introduce allockind attr for describing allocator fn behavior I chose to encode the allockind information in a string constant because otherwise we would get a bit of an explosion of keywords to deal with the possible permutations of allocation function types. I'm not sure that CodeGen.h is the correct place for this enum, but it seemed to kind of match the UWTableKind enum so I put it in the same place. Constructive suggestions on a better location most certainly encouraged. Differential Revision: https://reviews.llvm.org/D123088	2022-05-31 10:01:17 -04:00
Nikita Popov	2e101cca69	[Local] Don't remove invoke of non-willreturn function The code was only checking for memory side-effects, but not for divergence side-effects. Replace this with a generic check.	2022-05-30 15:37:46 +02:00
serge-sans-paille	fb67d683db	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `7030654296` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D126417	2022-05-26 08:12:34 +02:00
Alexey Bataev	10f41a2147	[SLP]Fix PR55688: Miscompile due to incorrect nuw/nsw handling. Need to use all ReductionOps when propagating flags for the reduction ops, otherwise transformation is not correct. Plus, need to drop nuw/nsw flags. Differential Revision: https://reviews.llvm.org/D126371	2022-05-25 13:59:06 -07:00
Martin Sebor	46c0ec9df4	[InstCombine] Fold memrchr calls with sequences of identical bytes. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123631	2022-05-24 17:00:11 -06:00
Nikita Popov	81c648a3d9	[LoopUnroll] Freeze tripcount rather than condition This is a followup to D125754. We introduce two branches, one before the unrolled loop and one before the epilogue (and similar for the prologue case). The previous patch only froze the condition on the first branch. Rather than independently freezing the second condition, this patch instead freezes TripCount and bases BECount on it. These are the two quantities involved in the conditions, and this ensures that both work on a consistent, non-poisonous trip count. Differential Revision: https://reviews.llvm.org/D125896	2022-05-24 09:42:39 +02:00
Hendrik Greving	4f93d5cc1d	[BasicBlockUtils] Do not move loop metadata if outer loop header. Fixes a bug preventing moving the loop's metadata to an outer loop's header, which happens if the loop's exit is also the header of an outer loop. Adjusts test for above. Fixes #55416. Differential Revision: https://reviews.llvm.org/D125574	2022-05-23 16:39:54 -07:00
NAKAMURA Takumi	6ca7eb2c6d	[SCEV] Part 1, Serialize function calls in function arguments. Evaluation ordering in function call arguments is implementation-dependent. In fact, gcc evaluates bottom-top and clang does top-bottom. Fixes #55283 partially. Part of https://reviews.llvm.org/D125627	2022-05-18 23:20:08 +09:00
Sun Ziping	242961f23b	[llvm][fix-irreducible] ensure that loop subtree under child is correctly reconnected to new loop The modified function was incorrectly (not unnecessarily) ignoring grandchild loops, and this change fixes the bug. In particular, this fixes the handling of the loop { inner, body }. The TODO in the same function is talking about the b1 self loop, which may be "unnecessarily" lost, but that is a different issue.	2022-05-18 10:45:52 +01:00
Nikita Popov	e9a1c82d69	[SCEVExpander] Expand umin_seq using freeze %x umin_seq %y is currently expanded to %x == 0 ? 0 : umin(%x, %y). This patch changes the expansion to umin(%x, freeze %y) instead (https://alive2.llvm.org/ce/z/wujUhp). The motivation for this change are the test cases affected by D124910, where the freeze expansion ultimately produces better optimization results. This is largely because `(%x umin_seq %y) == %x` is a common expansion pattern, which reliably optimizes in freeze representation, but only sometimes with the zero comparison (in particular, if %x == 0 can fold to something else, we generally won't be able to cover reasonable code from this.) Differential Revision: https://reviews.llvm.org/D125372	2022-05-18 09:53:07 +02:00
Nikita Popov	323514de58	[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits When performing runtime unrolling with multiple exits, one of the earlier (non-latch) exits may exit the loop on the first iteration, such that we never branch on the latch exit condition. As such, we need to freeze the condition of the new branch that is introduced before the loop, as it now executes unconditionally. Differential Revision: https://reviews.llvm.org/D125754	2022-05-18 09:51:22 +02:00
Sanjay Patel	be7f09f7b2	[IR] create and use helper functions that test the signbit; NFCI	2022-05-16 11:26:23 -04:00
Florian Hahn	b7315ffc3c	[LAA,LV] Add initial support for pointer-diff memory checks. This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable. The main idea is to only check if source and sink of a dependency are far enough apart so the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to the `VF * UF * AccessSize`. It is sufficient to check `(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is no dependence preventing vectorization, hence the overflow should not matter and using the ULT should be sufficient. Note that the initial version is restricted in multiple ways: 1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information) 2. Source and sink pointers must be add-recs, with matching steps 3. The step must be a constant. 4. abs(step) == AccessSize. Most of those restrictions can be relaxed in the future. See https://github.com/llvm/llvm-project/issues/53590. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D119078	2022-05-16 15:27:22 +01:00
Alexander Shaposhnikov	badd088c57	[GlobalOpt] Enable optimization of constructors with different priorities Adjust `optimizeGlobalCtorsList` to handle the case of different priorities. This addresses the issue https://github.com/llvm/llvm-project/issues/55083. Test plan: ninja check-all Differential revision: https://reviews.llvm.org/D125278	2022-05-13 22:19:29 +00:00
Nikita Popov	c1bb4a881e	[SCEVExpander] Deduplicate min/max expansion code (NFC)	2022-05-11 12:11:11 +02:00
Alexander Shaposhnikov	da823382d2	[Transform][Utils][NFC] Clean up CtorUtils.cpp	2022-05-11 01:07:54 +00:00
Nick Desaulniers	c167c0a4dc	[BuildLibCalls] infer inreg param attrs from NumRegisterParameters We're having a hard time booting the ARCH=i386 Linux kernel with clang after removing -ffreestanding because instcombine was dropping inreg from callers during libcall simplification, but not the callees defined in different translation units. This led the callers and callees to have wildly different calling conventions, which (predictably) blew up at runtime. Infer the inreg param attrs on function declarations from the module metadata "NumRegisterParameters." This allows us to boot the ARCH=i386 Linux kernel (w/ -ffreestanding removed). Fixes: https://github.com/llvm/llvm-project/issues/53645 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125285	2022-05-10 16:21:17 -07:00
Nikita Popov	0eafef1171	[SCEVExpander] Remove handling for mixed int/pointer min/max (NFCI) Mixed int/pointer min/max are no longer possible.	2022-05-10 15:11:39 +02:00
Hongtao Yu	9641b9be9d	[Inliner] Preserve !prof metadata when converting call to invoke. When a callee function is inlined via an invoke instruction, every function call inside the callee, if not an invoke, will be converted to an invoke after being cloned to the caller body. I found that during the conversion the !prof metadata was dropped. This in turn caused a cloned indirect call not to be properly promoted in subsequent passes. The particular scenario I was investigating was with AutoFDO and thinLTO. In prelink, no ICP was triggered (neither by the sample loader nor PGO ICP), no indirect call was promoted. This is because 1) the particular indirect call did not have inlined samples; and 2) PGO ICP was intentionally disabled. After inlining, the prof metadata was dropped. Then in postlink, PGO ICP jumped in but didn't do anything. Thus the opportunity was missed. I'm making a simple fix to preserve !prof metadata when converting call to invoke. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D125249	2022-05-09 15:08:09 -07:00
Augie Fackler	1deea714b3	BuildLibCalls: simplify switch statement slightly Per feedback on D123086 after submit. Also added a test for vec_malloc et al attribute inference to show it's doing the right thing. The new tests exposed a defect, corrected by adding vec_free to the list of free functions in MemoryBuiltins.cpp, which had been overlooked all the way back in D94710, over a year ago. Differential Revision: https://reviews.llvm.org/D124859	2022-05-03 13:17:33 -04:00
Jonas Paulsson	304378fd09	Reapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls." (was `0f8c626`). This reverts commit `14d9390`. The patch previously failed to recognize cases where user had defined a function alias with an identical name as that of the library function. Module::getFunction() would then return nullptr which is what the sanitizer discovered. In this updated version a new function isLibFuncEmittable() has as well been introduced which is now used instead of TLI->has() anytime a library function is to be emitted . It additionally also makes sure there is e.g. no function alias with the same name in the module. Reviewed By: Eli Friedman Differential Revision: https://reviews.llvm.org/D123198	2022-05-02 19:37:00 +02:00
Augie Fackler	c7ae423e39	BuildLibCalls: add alloc-family attribute to many allocator functions Differential Revision: https://reviews.llvm.org/D123086	2022-05-02 11:12:55 -04:00
Augie Fackler	e940456531	BuildLibCalls: infer allocptr attribute for free and realloc() family functions Differential Revision: https://reviews.llvm.org/D123084	2022-05-02 09:43:21 -04:00
Nikita Popov	aae5f8115a	[Local] Consider atomic loads from constant global as dead Per the guidance in https://llvm.org/docs/Atomics.html#atomics-and-ir-optimization, an atomic load from a constant global can be dropped, as there can be no stores to synchronize with. Any write to the constant global would be UB. IPSCCP will already drop such loads, but the main helper in Local doesn't recognize this currently. This is motivated by D118387. Differential Revision: https://reviews.llvm.org/D124241	2022-05-02 10:52:58 +02:00
Florian Hahn	a80081763c	[SimplifyCFG] Avoid shifting by a too large exponent. TI->getBitWidth can be > 64 and in those cases the shift will be UB due to the exponent being too large. To fix this, cap the shift at 63. I think this should work out fine, because TableSize is itself a 64 bit type and the maximum table size must fit in the type. Also, if we would underestimate the size here, at most we get an extra ZExt. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D124608	2022-04-29 15:19:06 +01:00
Nikita Popov	884e9a877b	[SimplifyCFG] Replace condition value when threading Replace the condition value with the known constant value on the threaded edge. This happens implicitly with phi threading because we replace with the incoming value, but not for non-phi threading.	2022-04-29 09:50:27 +02:00
Nikita Popov	4e545bdb35	[SimplifyCFG] Thread branches on same condition in more cases (PR54980) SimplifyCFG implements basic jump threading, if a branch is performed on a phi node with constant operands. However, InstCombine canonicalizes such phis to the condition value of a previous branch, if possible. SimplifyCFG does support this as well, but only in the very limited case where the same condition is used in a direct predecessor -- notably, this does not include the common diamond pattern (i.e. two consecutive if/elses on the same condition). This patch extends the code to look back a limited number of blocks to find a branch on the same value, rather than only looking at the direct predecessor. Fixes https://github.com/llvm/llvm-project/issues/54980. Differential Revision: https://reviews.llvm.org/D124159	2022-04-29 09:44:05 +02:00
Arthur Eubanks	4e65291837	[OpaquePtr][GlobalOpt] Don't attempt to evaluate global constructors with arguments Previously all entries in global_ctors had to have the void()* type and we'd skip evaluating bitcasted functions. With opaque pointers we may see the function directly. Fixes #55147. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D124553	2022-04-27 19:00:44 -07:00
Martin Sebor	efa0f12c0b	[InstCombine] Fold strnlen calls in equality to zero. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123818	2022-04-27 12:03:24 -06:00
Alexandros Lamprineas	a910337b5d	[FuncSpec] Conditional jump or move depends on uninitialised value(s). I found this bug when performing a two-stage build of clang with Function Specialization enabled and tuned aggressively. The crash appears only on release builds. Fixes https://github.com/llvm/llvm-project/issues/55000. Before accessing the contents of the ArgInfo iterator inside SCCPInstVisitor::markArgInFuncSpecialization, we should be checking that the iterator is valid. Differential Revision: https://reviews.llvm.org/D124114	2022-04-27 07:28:25 +01:00
Martin Sebor	ffed0cfcdb	[SimplifyLibCalls] avoid slicing 64-bit integers in an ILP32 build (PR #54739 ) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123472	2022-04-26 17:20:56 -06:00
Martin Sebor	449adafabe	[InstCombine] Fold strnlen of constant strings. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123817	2022-04-26 16:15:28 -06:00
Martin Sebor	ce8f42d4af	[InstCombine] Fold memrchr calls with a constant character. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123629	2022-04-26 14:02:50 -06:00
Martin Sebor	10c99ce67d	[InstCombine] Fold memrchr calls with constant size, bail on excessive. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123626 Differential Revision: https://reviews.llvm.org/D123628	2022-04-26 14:02:50 -06:00
Martin Sebor	25febbd155	[InstCombine] Fold strnlen with a bound of zero and one. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123816	2022-04-26 14:02:50 -06:00
Martin Sebor	2807c420cd	[InstCombine] add a strnlen handler stub. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123815	2022-04-26 14:02:49 -06:00
Augie Fackler	a907d36cfe	Attributes: add a new `allocptr` attribute This continues the push away from hard-coded knowledge about functions towards attributes. We'll use this to annotate free(), realloc() and cousins and obviate the hard-coded list of free functions. Differential Revision: https://reviews.llvm.org/D123083	2022-04-26 13:57:11 -04:00
Igor Kudrin	39ce68886b	[LoopPeel][NFCI] Simplify the code to calculate peel count for PGO This reorganizes the code as a preparation for D123865: * Use more descriptive names for variables * Simplify a condition by using an already calculated value for `MaxPeelCount` * Remove a duplicate log entry * Report basic values for loop costs Differential Revision: https://reviews.llvm.org/D124388	2022-04-26 18:44:24 +04:00
Igor Kudrin	c71890e158	[LoopPeel][NFC] Exit early if there is no room for peeling Differential Revision: https://reviews.llvm.org/D123864	2022-04-26 18:43:56 +04:00
David Green	9727c77d58	[NFC] Rename Instrinsic to Intrinsic	2022-04-25 18:13:23 +01:00
Paul Kirth	4683a2effa	[llvm][misexpect] Avoid division by 0 when using sample profiling MisExpect diagnostics should not prevent compilation from succeeding, and the assertion is insufficient to prevent division by zero in release builds. This patch addresses that by replacing the assert with an early return. Additionally, it disables MisExpect diagnostics when using sample profiling, since this is the only known case where this error has manifested. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D124302	2022-04-22 22:48:00 +00:00
Nikita Popov	993b166deb	Reapply [SimplifyCFG] Handle branch on same condition in pred more directly Reapplying without changes, after a fix to a dependent patch. ----- Rather than creating a PHI node and then using the PHI threading code, directly handle this case in FoldCondBranchOnValueKnownInPredecessor(). This change is supposed to be NFC-ish, but may cause changes due to different transform order.	2022-04-22 10:27:38 +02:00
Nikita Popov	df18e37541	Reapply [SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension (NFCI) Reapply with SmallMapVector instead of SmallDenseMap, which should address the non-determinism issue. ----- This general threading transform can be performed whenever we know a constant value for the condition in a predecessor, which would currently just be the case of a phi node with constant arguments.	2022-04-22 09:42:11 +02:00
Fangrui Song	35e350d5ba	Revert "[SimplifyCFG] Handle branch on same condition in pred more directly" and "[SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension" This reverts commit `3df86e799e`. This reverts commit `8988254667`. `[SimplifyCFG] Handle branch on same condition in pred more directly` caused non-determinism when compiling opt with a bootstrapped clang. I have to revert the dependent commit as well.	2022-04-21 12:58:58 -07:00
Nikola Tesic	c5600aef88	[Debugify] Limit number of processed functions for original mode Debugify in OriginalDebugInfo mode, does (DebugInfo) collect-before-pass & check-after-pass for each instruction, which is pretty expensive. When used to analyze DebugInfo losses in large projects (like LLVM), this raises the build time unacceptably. This patch introduces a limit for the number of processed functions per compile unit. By default, the limit is set to UINT_MAX (practically unlimited), and by using the introduced option -debugify-func-limit the limit could be set to any positive integer number. Differential revision: https://reviews.llvm.org/D115714	2022-04-21 13:58:17 +02:00
Nikita Popov	3df86e799e	[SimplifyCFG] Handle branch on same condition in pred more directly Rather than creating a PHI node and then using the PHI threading code, directly handle this case in FoldCondBranchOnValueKnownInPredecessor(). This change is supposed to be NFC-ish, but may cause changes due to different transform order.	2022-04-21 11:22:02 +02:00
Nikita Popov	8988254667	[SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension This general threading transform can be performed whenever we know a constant value for the condition in a predecessor, which would currently just be the case of a phi node with constant arguments.	2022-04-21 10:49:49 +02:00
Nikita Popov	d727505e40	[SimplifyCFG] Remove one-use limitation in FoldCondBranchOnPHI() BlockIsSimpleEnoughToThreadThrough() already checks that the phi (and all other instructions) are not used outside the block, so this one-use check is not necessary for legality. I also don't see any reason why it would be necessary for profitability (in fact, those extra uses will be replaced with constants, which should be generally profitable).	2022-04-20 15:56:20 +02:00
Fangrui Song	14d9390721	Revert D123198 "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls." test/Transforms/InstCombine/pr39177.ll failed in a -DLLVM_USE_SANITIZER=Undefined build. ``` lib/Transforms/Utils/BuildLibCalls.cpp:1217:17: runtime error: reference binding to null pointer of type 'llvm::Function' ``` `Function &F = *M->getFunction(Name);` This reverts commit `0f8c626723`.	2022-04-19 22:26:10 -07:00
Paul Kirth	bac6cd5bf8	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-04-19 21:23:48 +00:00
Jonas Paulsson	0f8c626723	[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls. A new set of overloaded functions named getOrInsertLibFunc() are now supposed to be used instead of getOrInsertFunction() when building a libcall from within an LLVM optimizer(). The idea is that this new function also makes sure that any mandatory argument attributes are added to the function prototype (after calling getOrInsertFunction()). inferLibFuncAttributes() is renamed to inferNonMandatoryLibFuncAttrs() as it only adds attributes that are not necessary for correctness but merely helping with later optimizations. Generally, the front end is responsible for building a correct function prototype with the needed argument attributes. If the middle end however is the one creating the call, e.g. when replacing one libcall with another, it then must take this responsibility. This continues the work of properly handling argument extension if required by the target ABI when building a lib call. getOrInsertLibFunc() now does this for all libcalls currently built by any LLVM optimizer. It is expected that when in the future a new optimization builds a new libcall with an integer argument it is to be added to getOrInsertLibFunc() with the proper handling. Note that not all targets have it in their ABI to sign/zero extend integer arguments to the full register width, but this will be done selectively as determined by getExtAttrForI32Param(). Review: Eli Friedman, Nikita Popov, Dávid Bolvanský Differential Revision: https://reviews.llvm.org/D123198	2022-04-19 21:22:07 +02:00
Joseph Huber	984a0dc386	[OpenMP] Use new offloading binary when embedding offloading images The previous patch introduced the offloading binary format so we can store some metadata along with the binary image. This patch introduces using this inside the linker wrapper and Clang instead of the previous method that embedded the metadata in the section name. Differential Revision: https://reviews.llvm.org/D122683	2022-04-15 20:35:26 -04:00
chenglin.bi	00871e2f4f	[SimplifyCFG] Try to fold switch with single result value and power-of-2 cases to mask+select When switch with 2^n cases go to one result, check if the 2^n cases can be covered by n bit masks. If yes we can use "and condition, ~mask" to simplify the switch case 0 2 4 6 -> and condition, -7 https://alive2.llvm.org/ce/z/jjH_0N case 0 2 8 10 -> and condition, -11 https://alive2.llvm.org/ce/z/K7E-2V case 2 4 8 12 -> and (sub condition, 2), -11 https://alive2.llvm.org/ce/z/CrxbYg Fix one case of https://github.com/llvm/llvm-project/issues/39957 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D122485	2022-04-15 00:10:00 +08:00
Ruiling Song	1e01f95057	LowerSwitch: Avoid inserting NewDefault block The NewDefault was used to simplify the updating of PHI nodes, but it causes some inefficiency for target that will run structurizer later. For example, for a simple two-case switch, the extra NewDefault is causing unstructured CFG like: O / \ O O / \ / \ C1 ND C2 \ \| / \ \| / D The change is to avoid the ND(NewDefault) block, that is we will get a structured CFG for above example like: O / \ / \ O O / \ / \ C1 \ / C2 \-> D <-/ The IR change introduced by this patch should be trivial to other targets, so I am doing this unconditionally. Fall-through among the cases will also cause unstructured CFG, but it need more work and will be addressed in a separate change. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D123607	2022-04-14 13:30:56 +08:00
Sanjay Patel	0ef46dc0f9	[SimplifyCFG] improve readability in switch-to-select; NFC	2022-04-13 17:14:45 -04:00
serge-sans-paille	262eba01b3	Revert "[ValueTracking] Make getStringLenth aware of strdup" This reverts commit `e810d55809`. The commit did not take into account the fact that a strduped string could be modified. Checking whether such a modification happens would make the function very costly; without a test case in mind it's not worth the effort.	2022-04-13 19:17:28 +02:00
Nikita Popov	8c74169990	[SimplifyLibCalls] Don't mark memchr() memory as fully dereferenceable C11 specifies memchr() as follows: > The memchr function locates the first occurrence of c (converted > to an unsigned char) in the initial n characters (each interpreted > as unsigned char) of the object pointed to by s. The implementation > shall behave as if it reads the characters sequentially and stops > as soon as a matching character is found. In particular, it is well-defined to specify a memchr size larger than the underlying object, as long as the character is found before the end of the object. Differential Revision: https://reviews.llvm.org/D123665	2022-04-13 16:46:18 +02:00
Sanjay Patel	cd0d0d633b	[SimplifyCFG] make a debug option for case max when converting switch to select This should be "NFC" as written, but it will make D122485 smaller and give us more flexibility to experiment with optimization level vs. compile-time. Differential Revision: https://reviews.llvm.org/D123625	2022-04-13 06:55:13 -04:00
Sanjay Patel	d9211be13d	[SimplifyCFG] cleanup code for converting switch to select (NFC) This renames functions for more general usage (and current capitalization style) before a proposed logic change in D122485. Differential Revision: https://reviews.llvm.org/D123614	2022-04-12 12:17:54 -04:00
serge-sans-paille	e810d55809	[ValueTracking] Make getStringLenth aware of strdup During strlen compile-time evaluation, make it possible to track size of strduped strings. Differential Revision: https://reviews.llvm.org/D123497	2022-04-12 14:47:29 +02:00
Nikita Popov	9af8cc8d17	[SimplifyLibCalls] Remove unnecessary inbounds check Even if the GEP is not inbounds, the GEP will have provenance of the global, and accessing past the extent of the global would be undefined behavior.	2022-04-11 16:51:09 +02:00
Matt Arsenault	9fdd25848a	Transforms: Fix code duplication between LowerAtomic and AtomicExpand	2022-04-08 19:06:36 -04:00
Evgeniy Brevnov	da41214d65	Add support for atomic memory copy lowering Currently, the utility supports lowering of non atomic memory transfer routines only. This patch adds support for atomic version of memcopy. This may be useful for targets not supporting atomic memcopy. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D118443	2022-04-08 10:41:31 +07:00
Augie Fackler	b916414096	BuildLibCalls: also set allocsize() attributes This is part of being able to get rid of two more columns in MemoryBuiltins.cpp's large table. We'll have two more changes before we can finish the job. Differential Revision: https://reviews.llvm.org/D119582	2022-04-07 12:38:44 -04:00
Benjamin Kramer	ff485d727f	Transforms: Remove unused include Utils can't depend on Scalar transforms.	2022-04-07 10:40:28 +02:00
Matt Arsenault	39f1568633	Transforms: Split LowerAtomics into separate Utils and pass This will allow code sharing from AtomicExpandPass. Not entirely sure why these exist as separate passes though.	2022-04-06 20:54:45 -04:00
Nikita Popov	1dc1d5a0d2	[SimplifyLibCalls] Use KnownBits helper APIs (NFC) Use helper APIs for isNonNegative() and getMaxValue() instead of flipping the zero value and having a long comment explaining why that is necessary.	2022-04-06 16:01:24 +02:00
Martin Storsjö	46776f7556	Fix warnings about variables that are set but only used in debug mode Add void casts to mark the variables used, next to the places where they are used in assert or `LLVM_DEBUG()` expressions. Differential Revision: https://reviews.llvm.org/D123117	2022-04-06 10:01:46 +03:00

6465 Commits