llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	bc96b36a41	[CodeGen] Use std::lcm (NFC)	2022-09-03 11:17:33 -07:00
Simon Pilgrim	62cdfdab4d	[DAG] canCreateUndefOrPoison - add freeze(insert_subvector(x,y,c)) -> insert_subvector(freeze(x),freeze(y),c) support We already have plenty of assertions in place to ensure that the insertion index is constant and in range	2022-09-03 13:41:33 +01:00
Simon Pilgrim	e2d140e9c3	[TTI] Add isExpensiveToSpeculativelyExecute wrapper CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if it's better to move an expensive instruction used in a select behind a branch instead. This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86 as we need to use TCK_SizeAndLatency as an uop count (so it's compatible with various target-specific buffer sizes - see D132288), but we can have instructions that have a low TCK_SizeAndLatency value but should still be treated as 'expensive' (FDIV for example) - by adding an isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.	2022-09-03 13:12:22 +01:00
Daniil Fukalov	b4e1b0e00d	[LiveIntervals] Split live intervals on any dead def Each dead def of the same virtual register is required to be split into multiple virtual registers with separate live intervals to avoid MachineVerifier error. Partially fixes https://github.com/llvm/llvm-project/issues/56050 and https://github.com/llvm/llvm-project/issues/56051 Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D130477	2022-09-02 20:00:22 +03:00
David Green	5073499b69	[TypePromotionPass] Rename variable to avoid name conflict. NFC	2022-09-02 12:35:15 +01:00
Fangrui Song	8d95fd7e56	[MachineFunctionPass] Support -filter-passes for -print-changed -filter-passes specifies a `PassID` (a lower-case dashed-separated pass name, also used by -print-after, -stop-after, etc) instead of a CamelCasePass. `-filter-passes=CamelCaseNewPMPass` seems like a workaround for new PM passes before we can use lower-case dashed-separated pass names (as used by `-passes=`). Example: ``` # getPassName() is "IRTranslator". PassID is "irtranslator" llc -mtriple=aarch64 -print-changed -filter-passes=irtranslator < print-changed-machine.ll ``` Close https://github.com/llvm/llvm-project/issues/57453 Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D133055	2022-09-01 11:06:06 -07:00
Nikita Popov	5134bd432f	[DwarfEhPrepare] Assign dummy debug location for inserted _Unwind_Resume calls (PR57469) DwarfEhPrepare inserts calls to _Unwind_Resume into landing pads. If _Unwind_Resume happens to be defined in the same module and debug info is used, then this leads to a verifier error: inlinable function call in a function with debug info must have a !dbg location call void @_Unwind_Resume(ptr %exn.obj) #0 Fix this by assigning a dummy location to the call. (As this happens in the backend, inlining is not actually relevant here.) Fixes https://github.com/llvm/llvm-project/issues/57469. Differential Revision: https://reviews.llvm.org/D133095	2022-09-01 16:35:49 +02:00
Nikita Popov	c635ea5c50	[CombinerHelper] Avoid deprecated method (NFC)	2022-09-01 16:09:05 +02:00
Stephen Tozer	211efaa1ce	Reapply "[DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops" Re-landing with an erroneous assert removed. This reverts commit `58d104b352`.	2022-09-01 14:20:24 +01:00
Amara Emerson	4cf3db41da	[GlobalISel] Add sdiv exact (X, constant) -> mul combine. This port of the SDAG optimization is only for exact sdiv case. Differential Revision: https://reviews.llvm.org/D130517	2022-09-01 13:34:00 +01:00
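The exact-sdiv combine above can be illustrated outside of LLVM. The sketch below assumes the usual lowering for exact division: strip the divisor's trailing zero bits, then multiply by the multiplicative inverse of the remaining odd factor modulo 2^32. The names `inverseMod2_32` and `exactSDiv` are hypothetical and are not part of GlobalISel; the result is only meaningful when the division leaves no remainder.

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd value modulo 2^32 via Newton iteration.
// For odd d, d*d == 1 (mod 8), so d itself is an inverse correct to 3 bits;
// each step doubles the number of correct bits (3 -> 6 -> 12 -> 24 -> 48).
constexpr uint32_t inverseMod2_32(uint32_t d) {
  uint32_t inv = d;
  for (int i = 0; i < 4; ++i)
    inv *= 2u - d * inv;
  return inv;
}

// Exact signed division x / c without a divide on the main path.
// Assumes c != 0 and x % c == 0 (and that the quotient fits in int32_t).
int32_t exactSDiv(int32_t x, int32_t c) {
  assert(c != 0 && "division by zero");
  assert(static_cast<int64_t>(x) % c == 0 && "only valid for exact division");

  // Count the trailing zero bits of the divisor.
  unsigned shift = 0;
  while (((static_cast<uint32_t>(c) >> shift) & 1u) == 0)
    ++shift;

  // Both values are divisible by 2^shift, so these divisions are exact.
  // Widening to 64 bits keeps 2^31 representable.
  const int64_t pow2 = int64_t{1} << shift;
  const int32_t xShifted = static_cast<int32_t>(x / pow2);
  const int32_t cOdd = static_cast<int32_t>(c / pow2);

  // Multiply by the inverse of the odd factor; arithmetic wraps modulo 2^32.
  const uint32_t q =
      static_cast<uint32_t>(xShifted) * inverseMod2_32(static_cast<uint32_t>(cOdd));
  // Two's-complement conversion (guaranteed since C++20).
  return static_cast<int32_t>(q);
}
```

For example, `exactSDiv(84, -12)` strips two trailing zero bits and multiplies 21 by the inverse of -3, which yields -7.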
Craig Topper	77dbc5200b	[MachineCSE] Use TargetInstrInfo::isAsCheapAsAMove in isPRECandidate. Some targets like RISC-V require operands to be inspected to determine if an instruction is similar to a move. Spotted while investigating code differences between using an ADDI vs an ADDIW. RISC-V has the isAsCheapAsAMove flag for ADDI, but the TII hook checks the immediate is 0 or the register is X0. ADDIW is never generated with X0 or with an immediate of 0 so it doesn't have the isAsCheapAsAMove flag. I don't know enough about the PRE code to write a test for this yet. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132981	2022-08-31 15:39:41 -07:00
Sam Clegg	c5c4ba37b1	[WebAssembly][MC] Avoid the need for .size directives for functions Warn if `.size` is specified for a function symbol. The size of a function symbol is determined solely by its content. I noticed this simplification was possible while debugging #57427, but this change doesn't fix that specific issue. Differential Revision: https://reviews.llvm.org/D132929	2022-08-31 14:28:56 -07:00
Nick Desaulniers	d7474bef77	[llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets This fixes a crash observed after https://reviews.llvm.org/D129997. Similar to D88823. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D130127	2022-08-31 13:07:10 -07:00
Simon Pilgrim	eaede4b5b7	[DAG] extractShiftForRotate - replace assertion for shift opcode with an early-out We feed the result from the first extractShiftForRotate call into the second, and that result might no longer be a shift op (usually due to constant folding). NOTE: We REALLY need to stop creating nodes on the fly inside extractShiftForRotate! Fixes Issue #57474	2022-08-31 15:50:48 +01:00
Simon Pilgrim	9d22800275	[DAG] visitFreeze - account for operand depth when calling isGuaranteedNotToBeUndefOrPoison (PR57402) We were calling isGuaranteedNotToBeUndefOrPoison on operands (with Depth = 0), but weren't accounting for the fact that a later isGuaranteedNotToBeUndefOrPoison assertion will call from the new node (with Depth = 0 as well) - which will then recursively call isGuaranteedNotToBeUndefOrPoison for its operands with Depth = 1 Fixes #57402	2022-08-31 12:20:30 +01:00
Kai Luo	ad2f7fd286	[AtomicExpand] Make floating point conversion happen before fence insertion IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations. This also fixes atomic load of floating point values which requires fence on PowerPC. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D127609	2022-08-31 09:54:58 +08:00
Markus Böck	2fdf963daf	[GlobalISel] Explicitly fail trying to translate `gc.statepoint` and related intrinsics The provided testcase would previously fail with an assertion due to later down below trying to allocate registers for `token` return types and arguments. This is especially problematic as the process would then exit instead of falling back to using FastIsel. This patch fixes that by simply explicitly failing translation if either of these intrinsics are encountered. Fixes https://github.com/llvm/llvm-project/issues/57349 Differential Revision: https://reviews.llvm.org/D132974	2022-08-31 00:47:17 +02:00
David Penry	9aca7b0217	[ModuloScheduler] Fix missing LLVM_DEBUG Guard a debug message with LLVM_DEBUG Differential Revision: https://reviews.llvm.org/D132895	2022-08-30 09:20:37 -07:00
Tomas Matheson	9a390d6692	[AArch64][GISel] fix G_ADD/G_SUB legalization widenScalarDst updates the insert point to after MI, so widenScalarSrc must be called before widenScalarDst. Otherwise the updated Src values will appear after MI and break SSA. e.g.: %14:_(s64), %15:_(s1) = G_UADDE %9:_, %11:_, %13:_ becomes %14:_(s64), %16:_(s32) = G_UADDE %9:_, %11:_, %17:_ %15:_(s1) = G_TRUNC %16:_(s32) %17:_(s32) = G_ZEXT %13:_(s1) Differential Revision: https://reviews.llvm.org/D132547 Change-Id: Ie3458747a6879433f4d5ab9939d2bd102dd0f2db	2022-08-30 10:59:32 +01:00
Xiang1 Zhang	a808ac2e42	[NFC] Clang-format for CodeGenPrepare.cpp	2022-08-30 13:42:36 +08:00
wanglian	e2bb9774b1	[LegalizeTypes] Support widen result for VECTOR_REVERSE. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132359	2022-08-30 10:01:26 +08:00
Craig Topper	2f811a6c7f	[VP][RISCV] Add vp.fabs intrinsic and RISC-V support. Mostly just modeled after vp.fneg except there is a "functional instruction" for fneg while fabs is always an intrinsic. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D132793	2022-08-29 09:32:06 -07:00
Kazu Hirata	267f21a21b	Use std::gcd (NFC) This patch replaces calls to greatestCommonDivisor with std::gcd where two arguments are of the same type. This means that std::common_type_t of the argument type is the same as the argument type. We could drop calls to std::abs in some cases, but that's left for another patch.	2022-08-28 10:41:51 -07:00
Kazu Hirata	d1688e9ddf	[llvm] Use std::gcd (NFC) This patch replaces calls to greatestCommonDivisor with std::gcd where both arguments are known to be of unsigned. This means that std::common_type_t of the two argument types should just be the wider one of the two.	2022-08-27 23:54:29 -07:00
Kazu Hirata	9d6ab7230b	[GlobalISel] Use std::lcm (NFC) This patch replaces getLCMSize with std::lcm, a C++17 feature. Note that all the arguments are of unsigned with no implicit type conversion as they are passed to getLCMSize.	2022-08-27 09:53:16 -07:00
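The three `std::gcd`/`std::lcm` commits above lean on how the C++17 `<numeric>` functions pick their result type. The following is a small sketch of that behaviour, not code from the patches; `lcmSizeInBits` and `gcdLcmExamples` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <type_traits>

// Same-type arguments: the result type is simply the argument type.
static_assert(
    std::is_same_v<decltype(std::gcd(uint64_t{}, uint64_t{})), uint64_t>);
// Mixed unsigned widths: std::common_type_t picks the wider of the two.
static_assert(
    std::is_same_v<decltype(std::gcd(uint32_t{}, uint64_t{})), uint64_t>);

// Hypothetical stand-in for a helper that computes a common size in bits.
unsigned lcmSizeInBits(unsigned a, unsigned b) { return std::lcm(a, b); }

// A few runtime checks of the values involved.
void gcdLcmExamples() {
  assert(std::gcd(12u, 18u) == 6u);
  assert(lcmSizeInBits(32, 48) == 96);
  // std::gcd works on absolute values, so a prior std::abs is redundant here.
  assert(std::gcd(-12, 18) == 6);
}
```

The last assertion shows why some `std::abs` calls could be dropped later: `std::gcd` already operates on the absolute values of signed arguments.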
Kazu Hirata	21de2888a4	Use llvm::is_contained (NFC)	2022-08-27 09:53:11 -07:00
Matthias Gehre	3e39b27101	[llvm/CodeGen] Add ExpandLargeDivRem pass Adds a pass ExpandLargeDivRem to expand div/rem instructions with more than 128 bits into a loop computing that value. As discussed on https://reviews.llvm.org/D120327, this approach has the advantage that it is independent of the runtime library. This also helps the clang driver, which otherwise would need to understand enough about the runtime library to know whether to allow _BitInts with more than 128 bits. Targets are still free to disable this pass and instead provide a faster implementation in a runtime library. Fixes https://github.com/llvm/llvm-project/issues/44994 Differential Revision: https://reviews.llvm.org/D126644	2022-08-26 11:55:15 +01:00
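To give a feel for what "a loop computing that value" can look like, here is a minimal shift-subtract division sketch. It is an assumption for illustration only: it works on 64-bit values rather than integers wider than 128 bits, `udivremLoop` is a hypothetical name, and the code the ExpandLargeDivRem pass actually emits is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Restoring (shift-subtract) unsigned division: one quotient bit per loop
// iteration, most significant bit first. Returns {quotient, remainder}.
std::pair<uint64_t, uint64_t> udivremLoop(uint64_t n, uint64_t d) {
  assert(d != 0 && "division by zero");
  uint64_t q = 0;
  uint64_t r = 0;
  for (int bit = 63; bit >= 0; --bit) {
    // Bring down the next dividend bit. Before this shift r holds at most
    // (63 - bit) significant bits, so the shift cannot overflow.
    r = (r << 1) | ((n >> bit) & 1u);
    if (r >= d) {
      r -= d;                    // the divisor fits: subtract it
      q |= uint64_t{1} << bit;   // and record a 1 in the quotient
    }
  }
  return {q, r};
}
```

A loop of this shape needs no runtime-library call, which matches the motivation given in the commit message; the same structure extends to wider integers by widening the bit counter and the operand type.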
Simon Pilgrim	88c7b16bed	[DAG] Strip poison generating flags in freeze(op()) -> op(freeze()) fold This patch follows the InstCombine approach of stripping poison generating flags (nsw/nuw from add/sub etc.) to allow us to push a freeze() through the op. Unlike InstCombine it doesn't retain any flags, but we have plenty of DAG folds that do the same thing already. We assert that the newly generated op isGuaranteedNotToBeUndefOrPoison. Similar to the ValueTracking approach, isGuaranteedNotToBeUndefOrPoison has been updated to confirm that if an op can't create undef/poison and its operands are guaranteed not to be undef/poison - then it's not undef/poison. This is just for the generic opcodes - target specific opcodes will need to do this manually just in case they have some special cases. Differential Revision: https://reviews.llvm.org/D132333	2022-08-26 11:47:51 +01:00
Matthias Gehre	6d13b80fcb	Revert "[SelectionDAG] Emit calls to __divei4 and friends for division/remainder of large integers" This reverts https://reviews.llvm.org/D120329. I abandoned the PR [0] to add __divei4 functions to compiler-rt in favor of adding a pass to transform div/rem [1]. This removes the backend code that was supposed to emit calls to the __divei4 functions. [0] https://reviews.llvm.org/D120327 [1] https://reviews.llvm.org/D130076 Differential Revision: https://reviews.llvm.org/D130079	2022-08-26 10:52:56 +01:00
Alex Richardson	0483b00875	Mark the $local function begin symbol as a function While this does not matter for most targets, when building for Arm Morello, we have to mark the symbol as a function and add size information, so that LLD can correctly evaluate relocations against the local symbol. Since Morello is an out-of-tree target, I tried to reproduce this with in-tree backends and with the previous reviews applied this results in a noticeable difference when targeting Thumb. Background: Morello uses a method similar to Thumb where the encoding mode is specified in the LSB of the symbol. If we don't mark the target as a function, the relocation will not have the LSB set and calls will end up using the wrong encoding mode (which will almost certainly crash). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D131429	2022-08-26 09:34:04 +00:00
wanglian	2887d7786f	[DAGCombiner] Use FoldConstantArithmetic instead of dyn_cast in visitFP_ROUND. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D132439	2022-08-25 11:29:05 +08:00
Matthias Braun	5364f49407	Fix CSR update check D132080 introduced a bug leading to `RegisterClassInfo` caches not getting invalidated when there was exactly one more CSR register added. Differential Revision: https://reviews.llvm.org/D132606	2022-08-24 18:09:49 -07:00
Sami Tolvanen	cff5bef948	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Relands `67504c9549` with a fix for 32-bit builds. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 22:41:38 +00:00
Sami Tolvanen	a79060e275	Revert "KCFI sanitizer" This reverts commit `67504c9549` as using PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.	2022-08-24 19:30:13 +00:00
Sami Tolvanen	67504c9549	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 18:52:42 +00:00
spupyrev	8d5b694da1	extending code layout alg The diff modifies ext-tsp code layout algorithm in the following ways: (i) fixes merging of cold block chains (this is a port of D129397); (ii) adjusts the cost model utilized for optimization; (iii) adjusts some APIs so that the implementation can be used in BOLT; this is a prerequisite for D129895. The only non-trivial change is (ii). Here we introduce different weights for conditional and unconditional branches in the cost model. Based on the new model it is slightly more important to increase the number of "fall-through unconditional" jumps, which makes sense, as placing two blocks with an unconditional jump next to each other reduces the number of jump instructions in the generated code. Experimentally, this makes a mild impact on the performance; I've seen up to 0.2%-0.3% perf win on some benchmarks. Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D129893	2022-08-24 09:40:25 -07:00
Simon Pilgrim	f9de13232f	[X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis. For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling. Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU. Differential Revision: https://reviews.llvm.org/D132520	2022-08-24 17:28:18 +01:00
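The i8/i16 promotion trick in the CTTZ commit above can be sketched in plain C++. This is a reading of the commit message, not the X86 lowering itself: setting the bit just above the narrow type's msb makes the promoted 32-bit operand non-zero, so a BSF-like primitive that is undefined for zero can be used without a branch or CMOV. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference count-trailing-zeros that, like BSF, requires a non-zero input.
unsigned cttz32NonZero(uint32_t x) {
  assert(x != 0 && "undefined for zero, as with BSF");
  unsigned n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// 16-bit cttz via the 32-bit primitive: bit 16 acts as a sentinel, so the
// promoted value is never zero and cttz16(0) naturally comes out as 16.
unsigned cttz16(uint16_t x) {
  return cttz32NonZero(static_cast<uint32_t>(x) | (1u << 16));
}

// Same idea for 8 bits, with the sentinel at bit 8.
unsigned cttz8(uint8_t x) {
  return cttz32NonZero(static_cast<uint32_t>(x) | (1u << 8));
}
```

Because the sentinel bit guarantees a non-zero source, the zero-input special case disappears, which is the property the commit uses to drop the CMOV handling.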
Stephen Tozer	58d104b352	Revert "[DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops" Reverting due to reported errors when running Linux kernel builds with KMSAN -gdwarf-4. This reverts commit `2cb9e1ac42`.	2022-08-24 15:24:32 +01:00
Simon Pilgrim	5377abcde2	[DAG] matchRotateHalf - constify SelectionDAG arg. NFC. Based off Issue #57283 - we need to try harder to ensure we're not creating nodes on-the-fly - so make sure we're just using SelectionDAG for analysis where possible	2022-08-24 10:57:38 +01:00
Simon Pilgrim	e624f8a3bb	[DAG] MatchRotate - bail if we fail to match a shl/srl pair extractShiftForRotate may fail to return canonicalized shifts due to constant folding or other simplification that can occur in getNode() Fixes Issue #57283	2022-08-24 03:05:07 +01:00
Sanjay Patel	f8dfbea324	[SDAG] expand more is-power-of-2 patterns that use popcount (ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0) Adjust the legality check to avoid the poor codegen on AArch64. We probably only want to use popcount on this pattern when it is a single instruction. fixes #57225 Differential Revision: https://reviews.llvm.org/D132237	2022-08-23 17:53:53 -04:00
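The rewrite `(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)` from the commit above is easy to check exhaustively on a small range. The sketch below uses a hand-written popcount so it does not depend on any compiler builtin; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count: clears the lowest set bit until none remain.
unsigned popcount32(uint32_t x) {
  unsigned n = 0;
  while (x != 0) {
    x &= x - 1;
    ++n;
  }
  return n;
}

// Original pattern: exactly one bit set.
bool isPow2ViaCtpop(uint32_t x) { return popcount32(x) == 1; }

// Expanded pattern: non-zero, and clearing the lowest set bit leaves zero.
bool isPow2Expanded(uint32_t x) { return x != 0 && (x & (x - 1)) == 0; }

// Returns true if both forms agree for every value in [0, limit).
bool expansionsAgree(uint32_t limit) {
  for (uint32_t x = 0; x < limit; ++x)
    if (isPow2ViaCtpop(x) != isPow2Expanded(x))
      return false;
  return true;
}
```

The `x != 0` term matters: without it, zero would pass the `(x & (x - 1)) == 0` test even though its population count is 0.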
Stephen Tozer	2cb9e1ac42	[DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops This patch builds on prior support patches to enable support for variadic debug values in InstrRefLDV, allowing DBG_VALUE_LISTs to have their ranges extended. Differential Revision: https://reviews.llvm.org/D128212	2022-08-23 20:17:09 +01:00
Arthur Eubanks	d6cc7a5b46	[FastISel] Respect musttail over "disable-tail-calls" musttail should be honored even in the presence of attributes like "disable-tail-calls". SelectionDAG properly handles this. Update LangRef to explicitly mention that this is the semantics of musttail. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D132193	2022-08-23 08:55:40 -07:00
Jakub Kuderski	6fa87ec10f	[ADT] Deprecate is_splat and replace all uses with all_equal See the discussion thread for more details: https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D132335	2022-08-23 11:36:27 -04:00
Stephen Tozer	89d0cc99ec	[DebugInfo][InstrRef] Handle transfers of variadic debug values in LDV This patch adds the last of the changes required to enable DBG_VALUE_LIST handling in InstrRefLDV, handling variadic debug values during the transfer tracking step. Most of the changes are fairly straightforward, and based around tracking multiple locations per variable in TransferTracker::VLocTracker. Differential Revision: https://reviews.llvm.org/D128211	2022-08-23 15:01:28 +01:00
Thomas Symalla	562accddaa	[NFC] Fix typo in dbg message in RegisterCoalescer. funcion => function	2022-08-23 15:14:45 +02:00
Stephen Tozer	b12e5c884f	[DebugInfo][InstrRef][NFC] Emit variadic debug values from InstrRefLDV In preparation for supporting DBG_VALUE_LIST in InstrRefLDV, this patch adds the logic for emitting DBG_VALUE_LIST instructions from InstrRefLDV. The logical changes here are fairly simple, with the main change being that instead of directly prepending offsets to the DIExpr, we use appendOpsToArg to modify the expression for individual debug operands in the expression. The function emitLoc is also changed to take a list of debug ops, with an empty list meaning an undef value. Differential Revision: https://reviews.llvm.org/D128209	2022-08-23 13:22:56 +01:00
Denis Antrushin	d1fd791e72	[TwoAddressInstruction] Handle pointer compare sunk past statepoint. CodeGenPrepare pass can sink pointer comparison across statepoint to the point of use (see comment in IR/SafepointIRVerifier.cpp) Due to specifics of statepoints, it is still legal to have tied def and use rewritten to the same register in TwoAddress pass. However, properly updating LiveIntervals and LiveVariables becomes complicated. For simplicity, let's fall back to generic handling of tied registers when we detect such case. TODO: This fixes functional (assertion) failure. Ideally we should try to recompute new live range/liveness in place. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D132255	2022-08-23 12:34:11 +03:00
Stephen Tozer	53fd5af689	[DebugInfo] Let InstrRefBasedLDV handle joins for lists of debug ops In preparation for adding support for DBG_VALUE_LIST instructions in InstrRefLDV, this patch updates the logic for joining variables at block joins to support joining variables that use multiple debug operands. This is one of the more meaty "logical" changes, although the line count isn't too high - this changes pickVPHILoc to find a valid joined location for every operand, with part of the function being split off into pickValuePHILoc which finds a location for a single operand. Differential Revision: https://reviews.llvm.org/D128180	2022-08-22 20:22:22 +01:00
David Penry	ced705c440	[ModuloSchedule] Add interface call to accept/reject SMS schedules This interface allows a target to reject a proposed SMS schedule. For Hexagon/PowerPC, all schedules are accepted, leaving behavior unchanged. For ARM, schedules which exceed register pressure limits are rejected. Also, two RegisterPressureTracker methods now need to be public so that register pressure can be computed by more callers. Reapplication of D128941/(reversion:D132037) with small fix. Differential Revision: https://reviews.llvm.org/D132170	2022-08-22 12:10:13 -07:00
Philip Reames	274f86e7a6	[TTI] Remove OperandValueKind/Properties from getArithmeticInstrCost interface [nfc] This completes the client side transition to the OperandValueInfo version of this routine. Backend TTI implementations still use the prior versions for now.	2022-08-22 11:06:32 -07:00
Kazu Hirata	36ec4deca5	[LiveDebugValues] Fix a warning This patch fixes: llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h:330:5: error: anonymous types declared in an anonymous union are an extension [-Werror,-Wnested-anon-types]	2022-08-22 10:57:16 -07:00
Stephen Tozer	b5ba5d2aab	[DebugInfo][NFC] Represent DbgValues with multiple ops in IRefLDV In preparation for allowing InstrRefBasedLDV to handle DBG_VALUE_LIST, this patch updates the internal representation that it uses to represent debug values to store a list of values. This is one of the more significant changes in terms of line count, but is fairly simple and should not affect the output of this pass. Differential Revision: https://reviews.llvm.org/D128177	2022-08-22 18:04:38 +01:00
Matthias Braun	b2542c40b9	RegisterClassInfo: Fix CSR cache invalidation `RegisterClassInfo` caches information like allocation orders and reuses it for multiple machine functions where possible. However the `MCPhysReg *CalleeSavedRegs` field used to test whether the set of callee saved registers changed did not work: After D28566 `MachineRegisterInfo::getCalleeSavedRegs()` can return dynamically computed CSR sets that are only valid while the `MachineRegisterInfo` object of the current function exists. This changes the code to make a copy of the CSR list instead of keeping a possibly invalid pointer around. Differential Revision: https://reviews.llvm.org/D132080	2022-08-22 09:28:26 -07:00
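The fix described above, keeping a copy of the callee-saved register list instead of a pointer that may dangle, can be sketched as follows. This is a toy model under stated assumptions: `PhysReg` stands in for `MCPhysReg`, the list is assumed to be zero-terminated, and `CsrCache` is a hypothetical class, not the real `RegisterClassInfo`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using PhysReg = uint16_t; // stand-in for MCPhysReg (assumption)

// Toy cache that decides whether cached data must be recomputed by
// comparing the current CSR list against an owned copy of the last one.
class CsrCache {
  std::vector<PhysReg> LastCSRs; // owned copy, terminator not stored
  bool Valid = false;

public:
  // CSRs points at a zero-terminated list that may be freed after the call.
  // Returns true if the cache had to be invalidated.
  bool update(const PhysReg *CSRs) {
    std::vector<PhysReg> Current;
    for (const PhysReg *R = CSRs; *R != 0; ++R)
      Current.push_back(*R);

    // Compare whole lists, so a list with one extra register also differs.
    if (Valid && Current == LastCSRs)
      return false;

    LastCSRs = std::move(Current);
    Valid = true;
    return true;
  }
};
```

Comparing the full vectors, lengths included, also covers the follow-up case mentioned in "Fix CSR update check", where exactly one more register is appended to an otherwise identical list.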
Stephen Tozer	11ce014a12	[DebugInfo][NFC] Update LDV to use generic DBG_VALUE* MI interface Currently, InstrRefLDV only handles DBG_VALUE instructions, not DBG_VALUE_LIST, and as a result of this it handles these instructions using functions that only work for that type of debug value, i.e. using getOperand(0) to get the debug operand. This patch changes this to use the generic debug value functions, such as getDebugOperand and isDebugOffsetImm, as well as adding an IsVariadic field to the DbgValueProperties class and a few other minor changes to acknowledge DBG_VALUE_LISTs. Note that this patch does not add support for DBG_VALUE_LIST here, but is a precursor to other patches that do add that support. Differential Revision: https://reviews.llvm.org/D128174	2022-08-22 16:28:12 +01:00
Stephen Tozer	53125e7d91	[DebugInfo] Handle joins PHI+Def values in InstrRef LiveDebugValues In the InstrRefBasedImpl for LiveDebugValues, we attempt to propagate debug values through basic blocks in part by checking to see whether all a variable's incoming debug values to a BB "agree", i.e. whether their properties match and they refer to the same underlying value. Prior to this patch, the check for agreement between incoming values relied on exact equality, which meant that a VPHI and a Def DbgValue that referred to the same underlying value would be seen as disagreeing. This patch changes this behaviour to treat them as referring to the same value, allowing the shared value to propagate into the BB. Differential Revision: https://reviews.llvm.org/D125953	2022-08-22 14:51:27 +01:00
Kazu Hirata	ec5eab7e87	Use range-based for loops (NFC)	2022-08-20 21:18:32 -07:00
Kazu Hirata	258531b7ac	Remove redundant initialization of Optional (NFC)	2022-08-20 21:18:28 -07:00
Luo, Yuanke	5159be3c9b	(Reland) [fastalloc] Support allocating specific register class in fastalloc This reverts commit `853bb192c4`.	2022-08-20 13:25:34 +08:00
Lorenzo Albano	98117fe208	[VP] Add splitting for VP_STRIDED_STORE and VP_STRIDED_LOAD Following the comment thread of D117235, I added checks for the widening + splitting case, which also causes a split with one of the resulting vectors to be empty. Due to the same issues described in that same thread, the `fixed-vectors-strided-store.ll` test is missing the widening + splitting case, while the same case in the `strided-vpload.ll` test requires manually splitting the loaded vector. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121784	2022-08-19 18:15:56 -07:00
Eli Friedman	8f826fe723	Fix reverse-iteration buildbot. A couple of instances of iterating over maps snuck in while the bot was down; fix them to use maps with deterministic iteration.	2022-08-19 14:21:05 -07:00
Nick Desaulniers	e412bac912	[MachineVerifier] add checks for INLINEASM_BR Test for a case we observed after the initial implementation of D129997 landed, in which case we observed a crash while building the ppc64le Linux kernel. In that case, we had one block with two exits, both to the same successor. Removing one of the exits corrupted the successor/predecessor lists. So when we have an INLINEASM_BR, check a few things for each indirect target: 1. that it exists. 2. that it is listed in our successors. 3. that its predecessor list contains the parent MBB of INLINEASM_BR. This would have caught the regression discovered after D129997 landed, after the pass that was problematic (early-tailduplication) rather than getting a stack trace in a later pass (regalloc) that doesn't understand the anomaly and crashes. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D130290	2022-08-19 12:52:26 -07:00
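The three checks listed in the INLINEASM_BR commit above can be modelled on a toy CFG. The sketch below is illustrative only: `ToyBlock` and `indirectTargetsAreConsistent` are hypothetical and do not mirror the MachineVerifier or `MachineBasicBlock` APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Minimal stand-in for a basic block with explicit CFG edge lists.
struct ToyBlock {
  std::vector<const ToyBlock *> Successors;
  std::vector<const ToyBlock *> Predecessors;
};

static bool contains(const std::vector<const ToyBlock *> &List,
                     const ToyBlock *B) {
  return std::find(List.begin(), List.end(), B) != List.end();
}

// Parent is the block holding the INLINEASM_BR-like instruction.
bool indirectTargetsAreConsistent(
    const ToyBlock &Parent,
    const std::vector<const ToyBlock *> &IndirectTargets) {
  for (const ToyBlock *Target : IndirectTargets) {
    if (Target == nullptr)
      return false; // 1. the target must exist
    if (!contains(Parent.Successors, Target))
      return false; // 2. it must be listed among the parent's successors
    if (!contains(Target->Predecessors, &Parent))
      return false; // 3. its predecessors must include the parent
  }
  return true;
}

// Small self-check: dropping one side of an edge is detected.
void toyVerifierExample() {
  ToyBlock Parent, Target;
  Parent.Successors.push_back(&Target);
  Target.Predecessors.push_back(&Parent);
  assert(indirectTargetsAreConsistent(Parent, {&Target}));
  Target.Predecessors.clear();
  assert(!indirectTargetsAreConsistent(Parent, {&Target}));
}
```

Running a check of this kind right after the transforming pass localizes the failure to that pass, rather than surfacing it as a crash in a later one.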
Bill Wendling	ac6a0cdc2e	[X86][AArch64][NFC] Simplify querying used argument registers Registers used for arguments are listed as "live-ins" into the starting basic block. This means we don't have to go through a potentially expensive search through all possible argument registers when we only care about used argument registers. Differential Revision: https://reviews.llvm.org/D132181	2022-08-19 11:39:05 -07:00
wanglian	fc2b4dfef2	[DAGCombiner] Add use check for VSCALE in visitSUB. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D132115	2022-08-19 09:46:18 +08:00
Eric Wang	ad8eb85545	[NFC][MLGO] ML Regalloc Priority Advisor This patch introduces the priority analysis and the priority advisor, the default implementation, and the scaffolding for introducing the other implementations of the advisor. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D131220	2022-08-18 15:33:48 -07:00
Paul Walker	96c8d615d6	[SVE] Extend findMoreOptimalIndexType so BUILD_VECTORs do not force 64bit indices. Extends findMoreOptimalIndexType to allow ISD::BUILD_VECTOR based indices to be truncated when such truncation is lossless. This can enable the use of 32bit gather/scatter indices thus making it less likely to have to split a gather/scatter in two. Depends on D125194 Differential Revision: https://reviews.llvm.org/D130533	2022-08-18 18:00:53 +01:00
Simon Pilgrim	fdec50182d	[CostModel] Replace getUserCost with getInstructionCost * Replace getUserCost with getInstructionCost, covering all cost kinds. * Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks. Original Patch by @samparker (Sam Parker) Differential Revision: https://reviews.llvm.org/D79483	2022-08-18 11:55:23 +01:00
wanglian	989ebc1783	[DAGCombiner][NFC] Tidy up unnecessary brackets in visitADD. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D132107	2022-08-18 15:48:22 +08:00
wanglian	230e277dfe	[DAGCombiner][NFC] Merge two if statement into one. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D131941	2022-08-18 10:12:35 +08:00
Daniil Fukalov	7ed3d81333	[NFCI] Move cost estimation from TargetLowering to TargetTransformInfo. TargetLowering had the last two InstructionCost-related members, `getTypeLegalizationCost()` and `getScalingFactorCost()`, but all other costs are processed in TTI. E.g. it is awkward to use other TTI members in these two functions when they are overridden in a target. Minor refactoring: `getTypeLegalizationCost()` no longer needs a DataLayout parameter - it was always passed from TTI. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D117723	2022-08-18 00:38:55 +03:00
Sanjay Patel	7f72a0f5bb	[SDAG] avoid generating libcall to function with same name This is a potentially better alternative to D131452 that also should avoid the infinite loop bug from: issue #56403 This is again a minimal fix to reduce merging pain for the release. But if this makes sense, then we might want to guard all of the RTLIB generation (and other libcalls?) with a similar name check. Differential Revision: https://reviews.llvm.org/D131521	2022-08-17 16:19:34 -04:00
Matthias Braun	19ce5e515f	RAGreedyStats: Ignore identity COPYs; count COPYs from/to physregs Improve copy statistics: - Count copies from or to physical registers: They are used to model function parameters and calling conventions and the register allocator optimizes for them. - Check physical registers assigned to virtual registers and stop counting "identity" `COPY`s where source and destination is the same physical registers; they will be removed in the `virtregmap` pass anyway. Differential Revision: https://reviews.llvm.org/D131932	2022-08-17 12:53:29 -07:00
Archit Saxena	e170d955fe	Split EH code by default The current machine function splitter is reliant on profile data to do profile summary analysis to split blocks into cold section. This may sometimes limit the usage of machine function splitter especially in cases where we could do some form of static analysis to split out cold blocks if profile data is absent or profile data which may be faulty (Consider Sample PGO). Of all code that could statically be marked cold Exception handling blocks are one of them (In fact BFI framework also tends to mark them as cold), and the most in size contribution. In my experiments I found out Exception handling pads and all code reachable from there account for up to 6-8% of the .text section on modern production binaries. This patch introduces a flag to split out all Exception handling blocks and blocks only reachable from Exceptional Handling pad to cold section. This flag has shown to give a performance win of up to 0.1% in terms of average cycles and instructions executed on internal facebook search service. Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D131824	2022-08-17 12:40:31 -07:00
Nick Desaulniers	6b0e2fa6f0	[SelectionDAG] make INLINEASM_BR use MachineBasicBlocks instead of BlockAddresses As part of re-architecting callbr to no longer use blockaddresses (https://reviews.llvm.org/D129288), we don't really need them in MIR. They make comparing MachineBasicBlocks of indirect targets during MachineVerifier a PITA. Suggested by @efriedma from the discussion: https://reviews.llvm.org/D130290#3669531 Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D130316	2022-08-17 09:34:31 -07:00
David Penry	1c9f0408bc	Revert "[ModuloSchedule] Add interface call to accept/reject SMS schedules" This reverts commit `8c4aea438c`. Needed because buildbot failures (warnings) gave a clue that there was a functional bug in the ARM rejection logic. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D132037	2022-08-17 09:32:43 -07:00
David Penry	8c4aea438c	[ModuloSchedule] Add interface call to accept/reject SMS schedules This interface allows a target to reject a proposed SMS schedule. For Hexagon/PowerPC, all schedules are accepted, leaving behavior unchanged. For ARM, schedules which exceed register pressure limits are rejected. Also, two RegisterPressureTracker methods now need to be public so that register pressure can be computed by more callers. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D128941	2022-08-17 08:13:26 -07:00
Andre Vieira	49223e0a2d	[TypePromotion] Don't promote PHI + ZExt if wider than RegisterBitWidth Differential Revision: https://reviews.llvm.org/D131966	2022-08-17 09:54:15 +01:00
Eli Friedman	cfd2c5ce58	Untangle the mess which is MachineBasicBlock::hasAddressTaken(). There are two different senses in which a block can be "address-taken". There can be a BlockAddress involved, which means we need to map the IR-level value to some specific block of machine code. Or there can be constructs inside a function which involve using the address of a basic block to implement certain kinds of control flow. Mixing these together causes a problem: if target-specific passes are marking random blocks "address-taken", if we have a BlockAddress, we can't actually tell which MachineBasicBlock corresponds to the BlockAddress. So split this into two separate bits: one for BlockAddress, and one for the machine-specific bits. Discovered while trying to sort out related stuff on D102817. Differential Revision: https://reviews.llvm.org/D124697	2022-08-16 16:15:44 -07:00
Nicolas Miller	ccfabfbb1f	Fix subrange liveness checking at rematerialization This patch fixes an issue where an instruction reading a whole register would be moved during register allocation into a spot where one of the subregisters was dead. The code to check whether an instruction can be rematerialized at a given point or not was already checking for subranges to ensure that subregisters are live, but only when the instruction being moved was using a subregister, this patch changes that so the subranges are checked even when the moved instruction uses the full register. This patch also adds a case to the original test for the subrange checking that trigger the issue described above. The original subrange checking code was introduced in this revision: https://reviews.llvm.org/D115278 And I've encountered this issue on AMDGPUs while working with DPC++: https://github.com/intel/llvm/issues/6209 Essentially the greedy register allocator attempts to move the following instruction: ``` %3961:vreg_64 = V_LSHLREV_B64_e64 3, %3078:vreg_64, implicit $exec ``` From `@3440` into the body of a loop `@16312`, but `%3078` has the following live ranges: ``` %3078 [2224r,2240r:0)[2240r,3488B:1)[16192B,38336B:1) 0@2224r 1@2240r L0000000000000003 [2224r,3440r:0) 0@2224r L000000000000000C [2240r,3488B:0)[16192B,38336B:0) 0@2240r ``` So `@16312e` `%3078.sub1` is alive but `%3078.sub0` is dead, so this instruction being moved there leads to invalid memory accesses as `3078.sub0` ends up being trashed and the result of this instruction is used as part of an address calculation for a load. On the original ticket this issue showed up on gfx906 and gfx90a but not on gfx908, this turned out to be because on gfx908 instead of moving the shift instruction into the loop, its value is spilled into an ACC register, gfx906 doesn't have ACC registers and for gfx90a ACC registers are used like regular vector registers and so aren't used for spilling. With this patch the original application from the DPC++ ticket works properly on gfx906, and the result of the shift instruction is correctly spilled instead of moving the instruction in the loop. Original Author: npmiller Reviewed by: rampitec Submitted by: rampitec Differential Revision: https://reviews.llvm.org/D131884	2022-08-16 10:50:09 -07:00
Arthur Eubanks	9181ce623f	[Windows] Put init_seg(compiler/lib) in llvm.global_ctors Currently we treat initializers with init_seg(compiler/lib) as similar to any other init_seg, they simply have a global variable in the proper section (".CRT$XCC" for compiler/".CRT$XCL" for lib) and are added to llvm.used. However, this doesn't match with how LLVM sees normal (or init_seg(user)) initializers via llvm.global_ctors. This causes issues like incorrect init_seg(compiler) vs init_seg(user) ordering due to GlobalOpt evaluating constructors, and the ability to remove init_seg(compiler/lib) initializers at all. Currently we use 'A' for priorities less than 200. Use 200 for init_seg(compiler) (".CRT$XCC") and 400 for init_seg(lib) (".CRT$XCL"), which do not append the priority to the section name. Priorities between 200 and 400 use ".CRT$XCC${Priority}". This allows for some wiggle room for people/future extensions that want to add initializers between compiler and lib. Fixes #56922 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D131910	2022-08-16 08:16:18 -07:00
Steve Merritt	ec60fca752	[CodeView] Use non-qualified names for static local variables Static variables declared within a routine or lexical block should be emitted with a non-qualified name. This allows the variables to be visible to the Visual Studio watch window. Differential Revision: https://reviews.llvm.org/D131400	2022-08-16 10:33:43 -04:00
Andre Vieira	c6b5a13b7a	[TypePromotion] Only search for PHI + ZExt promotion of Integers Differential Revision: https://reviews.llvm.org/D131948	2022-08-16 10:15:32 +01:00
wanglian	fbc4c26e9a	[SelectionDAG][NFC] Fix return type when used isConstantIntBuildVectorOrConstantInt and isConstantFPBuildVectorOrConstantFP Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D131870	2022-08-16 10:07:24 +08:00
Rahman Lavaee	df2213f345	[EHStreamer] Omit @LPStart when function has no landing pads When no landing pads exist for a function, `@LPStart` is undefined and must be omitted. EH table is generally not emitted for functions without landing pads, except when the personality function is unknown (`!isNoOpWithoutInvoke(classifyEHPersonality(Per))`). In that case, we must omit `@LPStart` even when machine function splitting is enabled. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D131626	2022-08-15 17:09:46 -07:00
Aiden Grossman	24cdf97d63	[mlgo] Add ability to create feature-gated development features in regalloc advisor Currently there is no way to add in development features to the ML regalloc evict advisor which is useful to have when working on feature engineering/improving the current model. This patch adds in the ability to add in development features to the ML regalloc evict advisor which are gated by a runtime flag and not added in at all if not compiled in LLVM development mode. This sets the stage for future work where we are planning on upstreaming some of the newer features that we are currently experimenting with. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D131209	2022-08-15 16:01:37 -07:00
David Green	dfc95bab07	[DAG] Ensure more Legal BUILD_VECTOR elements types in shuffle->And combine This is a followup to D131350, which caused another problem for i64 types being split into i32 on i32 targets. This patch tries to make sure that either Illegal types are OK, or that the element types of a buildvector are legal and bigger than or equal to the size of the original elements. Differential Revision: https://reviews.llvm.org/D131883	2022-08-15 14:41:45 +01:00
Luo, Yuanke	853bb192c4	Revert "(Reland) [fastalloc] Support allocating specific register class in fastalloc" This reverts commit `30f9e6ebd3`.	2022-08-15 20:33:15 +08:00
Ayke van Laethem	de48717fcf	[AVR] Support unaligned store This patch really just extends D39946 towards stores as well as loads. While the patch is in SelectionDAGBuilder, it only applies to AVR (the only target that supports unaligned atomic operations). Differential Revision: https://reviews.llvm.org/D128483	2022-08-15 14:29:37 +02:00
Simon Pilgrim	3a73133217	[DAG] canCreateUndefOrPoison - add freeze(sign_extend_inreg(x,vt)) -> sign_extend_inreg(freeze(x),vt) support Guaranteed not to create undef/poison	2022-08-15 12:18:59 +01:00
Peter Waller	6e85db7293	[DAGCombine] Combine signext_inreg of extract-extend The outer signext_inreg is redundant in the following: Fold (signext_inreg (extract_subvector (zext|anyext|sext iN_value to _) _) from iN) -> (extract_subvector (signext iN_value to iM)) Tests are precommitted and clone those by analogy from the AND case in the same file. Add a negative test to check extension width is handled correctly. This patch supersedes D130700. Differential Revision: https://reviews.llvm.org/D131503	2022-08-15 10:58:07 +00:00
Simon Pilgrim	7e294e676e	[DAG] canCreateUndefOrPoison - add freeze(assertsext/zext(x,bt)) -> assertsext/zext(freeze(x),vt) support These are guaranteed not to create undef/poison (although they may pass through) - the associated ISD::VALUETYPE node is also guaranteed never to generate poison	2022-08-15 11:13:43 +01:00
Kazu Hirata	f5a68feab3	Use llvm::none_of (NFC)	2022-08-14 16:25:39 -07:00
Simon Pilgrim	e2d13fd096	[DAG] canCreateUndefOrPoison - add freeze(shl(x,y)) -> shl(freeze(x),y) support These are guaranteed not to create undef/poison if the shift amount is known to be in range	2022-08-14 14:38:10 +01:00
Simon Pilgrim	a621d38bcb	[DAG] canCreateUndefOrPoison - add freeze(and/or/xor(x,y)) -> and/or/xor(freeze(x),y) support These are guaranteed not to create undef/poison	2022-08-14 13:14:53 +01:00
Simon Pilgrim	60534b8879	[DAG] canCreateUndefOrPoison - add freeze(add/sub/mul(x,y)) -> add/sub/mul(freeze(x),y,z) support These are guaranteed not to create undef/poison as long as there are no poison generating flags	2022-08-13 20:58:00 +01:00
Luo, Yuanke	30f9e6ebd3	(Reland) [fastalloc] Support allocating specific register class in fastalloc Reland commit `719658d078` The base RA supports infrastructure that allows only a specific register class to be allocated in an RA pass. Since greedy RA and basic RA derive from base RA, they both allow allocating a specific register class. Fast RA doesn't support allocating registers for a specific register class. This patch enables ShouldAllocateClass in fast RA so that it can support allocating registers for a specific register class. Differential Revision: https://reviews.llvm.org/D131825	2022-08-13 13:57:34 +08:00
Joe Loser	b12aa497cd	[DAGCombine] Replace std::monostate equivalent in DAGCombiner.cpp Remove the `UnitT` type and operators in favor of using `std::monostate` directly. Differential Revision: https://reviews.llvm.org/D131778	2022-08-12 21:42:09 -06:00
Simon Pilgrim	4de35f4bbf	[DAG] Add TODO to remove creation of INSERT_SUBVECTOR nodes from SimplifyMultipleUseDemandedBits SimplifyMultipleUseDemandedBits shouldn't be creating general nodes like this - although we allow bitcasts, even general constant folding is avoided. Removing it causes a number of regressions that need addressing first, but I've added a TODO for now.	2022-08-12 10:45:30 +01:00
Filipp Zhinkin	1626ee6a95	[DAGCombine] Hoist shifts out of a logic operations tree. Hoist and combine shift operations from logic operations tree: logic (logic (SH x0, s), y), (logic (SH x1, s), z) --> logic (SH (logic x0, x1), s), (logic y, z) The transformation improves code generated for some cases related to the issue https://github.com/llvm/llvm-project/issues/49541. Correctness: https://alive2.llvm.org/ce/z/pVqVgY https://alive2.llvm.org/ce/z/YVvT-q https://alive2.llvm.org/ce/z/W5zTBq https://alive2.llvm.org/ce/z/YfJsvJ https://alive2.llvm.org/ce/z/3YSyDM https://alive2.llvm.org/ce/z/Bs2kzk https://alive2.llvm.org/ce/z/EoQpzU https://alive2.llvm.org/ce/z/Jnc_5H https://alive2.llvm.org/ce/z/_LP6k_ https://alive2.llvm.org/ce/z/KvZNC9 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D131189	2022-08-12 12:42:16 +03:00
wanglian	061f7ec9fa	[LegalizeTypes][NFC] Use getConstantOperandVal instead of cast constant getvalue Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D131642	2022-08-12 14:35:10 +08:00
wanglian	1303057888	[LegalizeTypes][NFC] Use dyn_cast instead of isa and cast Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D131544	2022-08-12 14:18:49 +08:00
Chen Zheng	8d19cfb72e	[PowerPC] omit location attribute for TLS variable on AIX TLS debug on AIX is not ready for now. The location generated in no-integrated-as mode is wrong and in integrated-as mode causes AIX linker error. Reviewed By: Esme Differential Revision: https://reviews.llvm.org/D130245	2022-08-12 00:54:48 -04:00
wanglian	3b71f1d5ab	[LegalizeTypes][NFC] Use getConstantOperandAPInt instead of cast constant getAPInt Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D131653	2022-08-12 10:21:54 +08:00
Peter Waller	898699831b	[DAGCombine] Check zext legality in zext-extract-extend combine Discussed in D131503. Fix to D130782.	2022-08-11 14:30:42 +00:00
Andre Vieira	1640679187	[TypePromotion] Search from ZExt + PHI Expand TypePromotion pass to try to promote PHI-nodes in loops that are the operand of a ZExt, using the ZExt's result type to determine the Promote Width. Differential Revision: https://reviews.llvm.org/D111237	2022-08-11 09:50:10 +01:00
Andre Vieira	05fc5037cd	[TypePromotion] Hoist out Promote Width calculation Hoist out promote width calculation to simplify runOnFunction. Differential Revision: https://reviews.llvm.org/D131489	2022-08-11 09:50:10 +01:00
Andre Vieira	e524d61f35	[TypePromotion] Don't delete Insns when iterating Differential Revision: https://reviews.llvm.org/D131488	2022-08-11 09:50:10 +01:00
Andre Vieira	57de4e059d	[TypePromotion] Don't insert Truncate for a no-op ZExt Differential Revision: https://reviews.llvm.org/D131487	2022-08-11 09:50:10 +01:00
aqjune	02e56e2533	[CodeGen] Generate efficient assembly for freeze(poison) version of `mm_cast` intel intrinsics This patch makes the variants of `mm_cast` intel intrinsics that use `shufflevector(freeze(poison), ..)` emit efficient assembly. (These intrinsics are planned to use `shufflevector(freeze(poison), ..)` after shufflevector's semantics update; relevant thread: D103874) To do so, this patch 1. Updates `LowerAVXCONCAT_VECTORS` in X86ISelLowering.cpp to recognize `FREEZE(UNDEF)` operand of `CONCAT_VECTOR` in addition to `UNDEF` 2. Updates X86InstrVecCompiler.td to recognize `insert_subvector` of `FREEZE(UNDEF)` vector as its first operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D130339	2022-08-11 13:36:21 +09:00
Simon Pilgrim	8623da5f74	[DAG] visitFREEZE - generalize freeze(op()) -> op(freeze()) to any number of operands canCreateUndefOrPoison currently only handles unary ops, but we intend to change that soon - this more closely matches the pushFreezeToPreventPoisonFromPropagating behaviour where the freeze is pushed up to a single operand value, as long as all others are guaranteed not to be poison/undef. However, pushFreezeToPreventPoisonFromPropagating would freeze all uses of the value - whilst this variant requires the frozen value to be only used in the op - we can look at generalize multiple uses later if the need arises.	2022-08-10 13:12:46 +01:00
Simon Pilgrim	bbc27d0148	[DAG] canCreateUndefOrPoison - add freeze(truncate(x)) -> truncate(freeze(x)) support	2022-08-10 11:27:22 +01:00
David Truby	b1b9c39629	[AArch64][SVE] Use SVE for VLS fcopysign for wide vectors Currently fcopysign for VLS vectors lowers through NEON even when the vector width is wider than a NEON vector, causing bad codegen as the vectors are split. This patch causes SVE to be used for these vectors instead, giving much better codegen on wide VLS vectors. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D128642	2022-08-10 10:17:19 +00:00
Simon Pilgrim	df3ea7365e	[DAG] Use DAG.getFreeze() to create freeze node. NFC.	2022-08-10 10:26:26 +01:00
Adrian Prantl	68f97d2f78	LiveDebugValues: Fix another crash related to unreachable blocks This is a follow-up patch to D130999. In the test, the MIR contains an unreachable MBB but the code attempts to look it up in MLocs. This patch fixes this issue by checking for the default-constructed value. rdar://97226240 Differential Revision: https://reviews.llvm.org/D131453	2022-08-09 10:34:57 -07:00
Simon Pilgrim	ed162d455a	[DAG] Avoid hasOneUse() calls if the cheaper !AssumeSingleUse test has already failed. NFC. Very minor optimization, but every little helps..	2022-08-09 16:42:19 +01:00
Simon Pilgrim	d79e7dc939	[DAG] SimplifyDemandedVectorElts - and/mul(x,y) - if a demanded element of y is known zero then we don't need to demand it in x This fixes most of the remaining regressions from the fixes in rG293899c64b75	2022-08-09 16:24:08 +01:00
Simon Pilgrim	2724143551	[DAG] canCreateUndefOrPoison - add freeze(ctpop(x)) -> ctpop(freeze(x)) and freeze(parity(x)) -> parity(freeze(x)) support Both are guaranteed not to create undef/poison	2022-08-09 10:10:29 +01:00
Luo, Yuanke	aaf6c7b05c	[globalisel] Select register bank for DBG_VALUE The register operand of DBG_VALUE is not selected to a proper register bank in both AArch64 and X86. This would cause a getRegClass crash after global ISel. After discussion, we think the MIR should assume that every virtual register has a proper register class set after global ISel, so this patch fixes the gap for DBG_VALUE on AArch64 and X86. Differential Revision: https://reviews.llvm.org/D129037	2022-08-09 13:11:51 +08:00
Yuta Mukai	5357dd2f43	[MachinePipeliner] Fix Phi generation failure for large stages The previous code overwrites VRMap for prologue stages during Phi generation if a register spans many stages. As a result, the wrong register is used as the one coming from the prologue in Phis at later stages. (A process exists to correct this, but it does not work in all cases.) In addition, VRMap for prologue must be preserved until addBranches(). This patch fixes them by separating the map for Phis into a different variable (VRMapPhi). Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D127840	2022-08-09 13:14:26 +09:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
Simon Pilgrim	6f2bee667a	[DAG] canCreateUndefOrPoison - add freeze(bswap(x)) -> bswap(freeze(x)) and freeze(bitreverse(x)) -> bitreverse(freeze(x)) support Both are guaranteed not to create undef/poison	2022-08-08 17:27:17 +01:00
Simon Pilgrim	e4b2c52420	[DAG] canCreateUndefOrPoison - add freeze(sext(x)) -> sext(freeze(x)) and freeze(zext(x)) -> zext(freeze(x)) support Both are guaranteed not to create undef/poison	2022-08-08 16:43:40 +01:00
Krzysztof Parzyszek	0f5385b70e	Recommit [RDF] Remove explicit template arguments from Print The build breakages should be addressed by d4abdd2e3d: [CMake] Check CMAKE_CXX_STANDARD and error if it's to old Thanks to Tobias and Roy for addressing these issues.	2022-08-08 07:28:45 -07:00
Simon Pilgrim	9641a201a5	[DAG] Add initial SelectionDAG::canCreateUndefOrPoison support This patch adds basic support for a DAG variant of the canCreateUndefOrPoison call and updates DAGCombiner::visitFREEZE to use it, further Opcodes (including target specific Opcodes) can be handled when we have test coverage. So far, I've left visitFREEZE to just use this for unary nodes (which currently means the existing BITCAST/FREEZE cases) - later patches will add other unary opcodes (with test coverage) and we can also refactor visitFREEZE to support a general number of operands like we do in InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating. I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place. Similarly we will be able to handle poison generating SDNodeFlags as and when it becomes an issue. Part of the work for D106675 / PR50468 Differential Revision: https://reviews.llvm.org/D130646	2022-08-08 15:16:06 +01:00
Simon Pilgrim	b334709467	Remove superfluous ; outside of a function	2022-08-08 12:14:03 +01:00
Shubham Narlawar	ab4fc87a9d	[DAG] Emit table lookup from TargetLowering::expandCTTZ() This patch emits a table lookup in expandCTTZ. Context - https://reviews.llvm.org/D113291 transforms a set of IR instructions into the cttz intrinsic, but there are some targets which do not support CTTZ or CTLZ. Hence, I generate a table lookup in TargetLowering::expandCTTZ(). Differential Revision: https://reviews.llvm.org/D128911	2022-08-08 12:08:05 +01:00
Simon Pilgrim	e5e93b6130	[DAG] FoldConstantArithmetic - add initial support for undef elements in bitcasted binop constant folding FoldConstantArithmetic can fold constant vectors hidden behind bitcasts (e.g. vXi64 -> v2Xi32 on 32-bit platforms), but currently bails if either vector contains undef elements. These undefs can often occur due to SimplifyDemandedBits/VectorElts calls recognising that the upper bits are often unnecessary (e.g. funnel-shift/rotate implicit-modulo and AND masks). This patch adds a basic 'FoldValueWithUndef' handler that will attempt to constant fold if one or both of the ops are undef - so far this just handles the AND and MUL cases where we always fold to zero. The RISCV codegen increase is interesting - it looks like the BUILD_VECTOR lowering was loading a constant pool entry but now (with all elements defined constant) it can materialize the constant instead? Differential Revision: https://reviews.llvm.org/D130839	2022-08-08 11:53:56 +01:00
David Green	061e0189a3	[DAG] Ensure Legal BUILD_VECTOR elements types in shuffle->And combine D129150 added a combine from shuffles to And that creates a BUILD_VECTOR of constant elements. We need to ensure that the elements are of a legal type, to prevent asserts during lowering. Fixes #56970. Differential Revision: https://reviews.llvm.org/D131350	2022-08-08 09:47:55 +01:00
Aaron Ballman	32fd0b7fd5	Revert "[RDF] Remove explicit template arguments from Print" This reverts commit `ede96de751`. This breaks the build on Windows with Visual Studio: https://lab.llvm.org/buildbot/#/builders/123/builds/12134	2022-08-07 08:24:01 -04:00
Kazu Hirata	a2d4501718	[llvm] Fix comment typos (NFC)	2022-08-07 00:16:14 -07:00
Kazu Hirata	3b114087c3	[llvm] Drop unnecessary const from return types (NFC) Identified with readability-const-return-type.	2022-08-07 00:16:11 -07:00
Fangrui Song	fa66789d06	[llvm] LLVM_NODISCARD => [[nodiscard]]. NFC With C++17 there is no Clang pedantic warning.	2022-08-07 00:26:33 +00:00
Krzysztof Parzyszek	2bc390bdd6	[RDF] Use default TargetOperandInfo if not given in constructor All current in-tree users use the default implementation.	2022-08-06 14:32:52 -05:00
Krzysztof Parzyszek	ede96de751	[RDF] Remove explicit template arguments from Print CTAD takes care of it.	2022-08-06 13:29:15 -05:00
Filipp Zhinkin	c55899f763	[DAGCombiner] Hoist funnel shifts from logic operation Hoist funnel shift from logic op: logic_op (FSH x0, x1, s), (FSH y0, y1, s) --> FSH (logic_op x0, y0), (logic_op x1, y1), s The transformation improves code generated for some cases related to issue https://github.com/llvm/llvm-project/issues/49541. Reduced amount of funnel shifts can also improve throughput on x86 CPUs by utilizing more available ports: https://quick-bench.com/q/gC7AKkJJsDZzRrs_JWDzm9t_iDM Transformation correctness checks: https://alive2.llvm.org/ce/z/TKPULH https://alive2.llvm.org/ce/z/UvTd_9 https://alive2.llvm.org/ce/z/j8qW3_ https://alive2.llvm.org/ce/z/7Wq7gE https://alive2.llvm.org/ce/z/Xr5w8R https://alive2.llvm.org/ce/z/D5xe_E https://alive2.llvm.org/ce/z/2yBZiy Differential Revision: https://reviews.llvm.org/D130994	2022-08-05 17:02:22 -04:00
Dawid Jurczak	1bd31a6898	[NFC] Add SmallVector constructor to allow creation of SmallVector<T> from ArrayRef of items convertible to type T Extracted from https://reviews.llvm.org/D129781 and address comment: https://reviews.llvm.org/D129781#3655571 Differential Revision: https://reviews.llvm.org/D130268	2022-08-05 13:35:41 +02:00
Fangrui Song	7d6017fd31	[TTI] Change new getVectorInstrCost overload to use const reference after D131114 A const reference is preferred over a non-null const pointer. `Type *` is kept as is to match the other overload. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D131197	2022-08-04 15:16:51 -07:00
Mingming Liu	bc8f2f3649	[AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation. 1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method. 2) This patch also changes a few callsites (VectorCombine.cpp, SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method. 3) This is a split of D128302. Differential Revision: https://reviews.llvm.org/D131114	2022-08-04 12:58:25 -07:00
Lorenzo Albano	74940d2668	[VP] Add widening for VP_STRIDED_LOAD and VP_STRIDED_STORE Reviewed By: frasercrmck, craig.topper Differential Revision: https://reviews.llvm.org/D121114	2022-08-04 16:12:01 +02:00
wanglian	b6b0690355	[LegalizeTypes][VP] Add split operand support for VP float and integer casting Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D130685	2022-08-04 15:41:50 +08:00
Mircea Trofin	0cb9746a7d	[nfc][mlgo] Separate logger and training-mode model evaluator This just shuffles implementations and declarations around. Now the logger and the TF C API-based model evaluator are separate. Differential Revision: https://reviews.llvm.org/D131116	2022-08-03 16:20:28 -07:00
Adrian Prantl	905f2d1ecb	Fix LDV InstrRefBasedImpl to not crash when encountering unreachable MBBs. The testcase was delta-reduced from an LTO build with sanitizer coverage and the MIR tail duplication pass caused a machine basic block to become unreachable in MIR. This caused the MBB to be invisible to the reverse post-order traversal used to initialize the MBB <-> RPONumber lookup tables. rdar://97226240 Differential Revision: https://reviews.llvm.org/D130999	2022-08-03 13:05:05 -07:00
Felipe de Azevedo Piovezan	a5a8a05c78	[SelectionDAG] Handle IntToPtr constants in dbg.value The function `handleDebugValue` has custom logic to handle certain kinds of constants, namely integers, floats and null pointers. However, it does not handle constant pointers created from IntToPtr ConstantExpressions. This patch addresses the issue by replacing the Constant with its integer operand. A similar bug was addressed for GlobalISel in D130642. Reviewed By: aprantl, #debug-info Differential Revision: https://reviews.llvm.org/D130908	2022-08-03 14:10:05 -04:00
David Truby	9a976f3661	[llvm] Always use TargetConstant for FP_ROUND ISD Nodes This patch ensures consistency in the construction of FP_ROUND nodes such that they always use ISD::TargetConstant instead of ISD::Constant. This additionally fixes a bug in the AArch64 SVE backend where patterns were matching against TargetConstant nodes and sometimes failing when passed a Constant node. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D130370	2022-08-03 14:02:11 +01:00
Fraser Cormack	646e2f4803	[VP] Rename VP int<->float conversion ISD opcodes These should be named like the non-VP versions for consistency. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D130967	2022-08-03 10:04:38 +01:00
Paul Kirth	d434e40f39	[llvm][NFC] Refactor code to use ProfDataUtils In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860	2022-08-03 00:09:45 +00:00
Mircea Trofin	4146c1756d	[nfc] Remove unused parameter in TailDuplicator::duplicateSimpleBB Differential Revision: https://reviews.llvm.org/D131008	2022-08-02 13:39:34 -07:00
Kai Nacke	b38375378d	[GIsel] Add missing libcall for G_MUL to LegalizerHelper The LegalizerHelper misses the code to lower G_MUL to a library call, which this change adds. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D130987	2022-08-02 13:35:25 -04:00
Simon Pilgrim	b651fdff79	[DAG] matchRotateSub - ensure the (pre-extended) shift amount is wide enough for the amount mask (PR56859) matchRotateSub is given shift amounts that will already have stripped any/zero-extend nodes from - so make sure those values are wide enough to take a mask.	2022-08-02 11:38:52 +01:00
Tim Northover	b586dc21a7	Outliner: add "target-cpu" feature from source function to outlined The CPU is used to determine which inline asm instructions are allowed, so needs to be copied across in case the outlined function contains any.	2022-08-02 09:33:29 +01:00
Sotiris Apostolakis	995b61cdac	[SelectOpti] Auto-disable other cmov optis when the new select-opti pass is enabled Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D129817	2022-08-02 00:19:59 +00:00
Fangrui Song	2b70bebc6d	[MachineFunctionPass] Support -print-changed={,c}diff{,-quiet} Follow-up to D130434. Move doSystemDiff to PrintPasses.cpp and call it in MachineFunctionPass.cpp. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D130833	2022-08-01 12:56:15 -07:00
Marius Brehler	ddb6c28638	Avoid comparison of integers of different signs Otherwise a warning is emitted when compiling with `-Wsign-compare`.	2022-08-01 11:20:41 +00:00
Simon Pilgrim	b43d7aacf8	[DAG] visitINSERT_VECTOR_ELT - extend folding to BUILD_VECTOR if all missing elements from an insertion chain are known zero	2022-08-01 11:32:33 +01:00
David Sherwood	41119a0f52	[DAGCombiner] Extend visitAND to include EXTRACT_SUBVECTOR Eliminate an AND by redefining an anyext|sext|zext. (and (extract_subvector (anyext|sext|zext v) _) iN_mask) => (extract_subvector (zeroext_iN v)) Differential Revision: https://reviews.llvm.org/D130782	2022-08-01 10:32:32 +01:00
Vladislav Dzhidzhoev	facb3ac385	[GlobalISel][DebugInfo] salvageDebugInfo analogue for gMIR Salvage debug info of an instruction that is about to be deleted as dead in the Combiner pass. Currently supported instructions are COPY and G_TRUNC. This allows salvaging debug info of some dead function arguments by putting the DWARF expression corresponding to the instruction being deleted into the related DBG_VALUE instruction. Here is an example of missing variable locations: https://godbolt.org/z/K48osb9dK. We see that arguments x, y of function foo are not available in the debugger, and the corresponding DBG_VALUE instructions have an undefined register operand instead of the variable's location after the Aarch64PreLegalizerCombiner pass. The reason is that the registers where the variables are located are removed as dead (with instruction G_TRUNC). We can use a salvageDebugInfo analogue for gMIR to preserve debug locations of dead variables. Statistics of llvm object files built with vs without this commit on -O2 optimization level (CMAKE_BUILD_TYPE=RelWithDebInfo, -fglobal-isel) on Aarch64 (macOS): Number of variables with 100% of parent scope covered by DW_AT_location has been increased by 7,9%. Number of variables with 0% coverage of parent scope has been decreased by 1,2%. Number of variables processed by location statistics has been increased by 2,9%. Average PC ranges coverage has been increased by 1,8 percentage points. Coverage can be improved by supporting more instructions, or by calling salvageDebugInfo for instructions that are deleted during Combiner rules execution. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D129909	2022-08-01 11:14:53 +02:00
Chuanqi Xu	9701053517	Introduce @llvm.threadlocal.address intrinsic to access TLS variable This belongs to a series of patches which try to solve the thread identification problem in coroutines. See https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015 for a full background. The problem consists of two concrete problems: TLS variable and readnone functions. This patch tries to convert the TLS problem to readnone problem by converting the access of TLS variable to an intrinsic which is marked as readnone. The readnone problem would be addressed in following patches. Reviewed By: nikic, jyknight, nhaehnle, ychen Differential Revision: https://reviews.llvm.org/D125291	2022-08-01 10:51:30 +08:00
Luís Marques	260a641068	[RISCV] Pre-RA expand pseudos pass Expand load address pseudo-instructions earlier (pre-ra) to allow follow-up patches to fold the addi of PseudoLLA instructions into the immediate operand of load/store instructions. Differential Revision: https://reviews.llvm.org/D123264	2022-07-31 23:19:00 +02:00
Kazu Hirata	12b29900a1	Use any_of (NFC)	2022-07-30 10:35:56 -07:00
Dmitry Vassiliev	adc387460d	[CodeGen] Fixed undeclared MISchedCutoff in case of NDEBUG and LLVM_ENABLE_ABI_BREAKING_CHECKS This patch fixes the error llvm/lib/CodeGen/MachineScheduler.cpp(755): error C2065: 'MISchedCutoff': undeclared identifier in case of NDEBUG and LLVM_ENABLE_ABI_BREAKING_CHECKS. Note MISchedCutoff is declared under #ifndef NDEBUG. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130425	2022-07-30 18:24:50 +02:00
Simon Pilgrim	9ad082eb5a	[DAG] Pull out repeated getOperand() calls for shuffle ops. NFC.	2022-07-30 14:02:54 +01:00
Amaury Séchet	226086230c	[DAG] Use recursivelyDeleteUnusedNodes in CommitTargetLoweringOpt. It simplifies the logic and removes the need for manual bookkeeping. Differential Revision: https://reviews.llvm.org/D130445	2022-07-29 13:49:03 +00:00
Simon Pilgrim	af1b7ebcdf	[TargetLowering] Move a few hasOneUse() tests later to reduce unnecessary computations. NFC. In many of these cases, an early-out on the much cheaper getOpcode() check will avoid us needing to call hasOneUse() entirely.	2022-07-29 14:20:35 +01:00
Matt Arsenault	a4834ad068	RegisterCoalescer: Shrink main range after shrinking subranges If the subregister uses were dead, this would leave the main range segment pointing to a deleted instruction. Not sure if this should try to avoid shrinking if we know we don't have dead components.	2022-07-29 08:57:28 -04:00
Simon Pilgrim	641dba9e28	[DAG] Move a few hasOneUse() tests later to reduce unnecessary computations. NFC. In many of these cases, an early-out on the much cheaper getOpcode() check will avoid us needing to call hasOneUse() entirely.	2022-07-29 11:34:39 +01:00
Simon Pilgrim	9082c13106	[Support] Add KnownBits::concat method Add a method for the various cases where we need to concatenate 2 KnownBits together (BUILD_PAIR and SHIFT_PARTS in particular) - uses the existing APInt::concat 'HiBits.concat(LoBits)' convention Differential Revision: https://reviews.llvm.org/D130557	2022-07-29 11:06:39 +01:00
Felipe de Azevedo Piovezan	58526b2d2b	[GlobalISel] Handle nullptr constants in dbg.value Currently, the LLVM IR -> MIR translator fails to translate dbg.values whose first argument is a null pointer. However, in other portions of the code, such pointers are always lowered to the constant zero, for example see IRTranslator::Translate(Constant, Register). This patch addresses the limitation by following the same approach of lowering null pointers to zero. A prior test was checking that null pointers were always lowered to $noreg; this test is changed to check for zero, and the previous behavior is now checked by introducing a dbg.value whose first argument is the address of a global variable. Differential Revision: https://reviews.llvm.org/D130721	2022-07-28 14:58:14 -07:00
Felipe de Azevedo Piovezan	0ef6809c48	[GlobalISel][nfc] Remove unnecessary cast The getOperand method already returns a Constant when it is called on a ConstantExpression, as such the cast is not needed. To prevent a type mismatch between the different return statements of the lambda, the lambda return type is explicitly provided. Differential Revision: https://reviews.llvm.org/D130719	2022-07-28 14:55:07 -07:00
Simon Pilgrim	8c99cef1e7	[DAG] Remove SelectionDAG::GetDemandedBits and use SimplifyMultipleUseDemandedBits directly. GetDemandedBits is mainly a wrapper around SimplifyMultipleUseDemandedBits now, and is only used by DAGCombiner::visitSTORE so I've moved all remaining functionality there. visitSTORE was making use of this to 'simplify' constants for a trunc-store. Just removing this code led to a mixture of regressions and gains - it came down to whether a target preferred a sign or zero extended constant for materialization/truncation. I've just moved the code over for now, but a next step would be to move this to targetShrinkDemandedConstant, but some targets that override the method expect a basic binop, and might react badly to a store node.	2022-07-28 17:03:44 +01:00
Simon Pilgrim	be488ba7de	[DAG] DAGCombiner::visitTRUNCATE - remove GetDemandedBits call This should now all be handled by SimplifyDemandedBits.	2022-07-28 15:23:04 +01:00
Simon Pilgrim	ea7f14dad0	[DAG] SelectionDAG::GetDemandedBits - don't simplify opaque constants I'm actually trying to get rid of GetDemandedBits - but while dismantling it I noticed that we were altering opaque constants. Fixing that causes a FP_TO_INT_SAT regression that should be addressed separately - I'll raise a bug.	2022-07-28 14:46:59 +01:00
Simon Pilgrim	69d5a038b9	[DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts. This is another step towards removing SelectionDAG::GetDemandedBits and just using TargetLowering::SimplifyMultipleUseDemandedBits. There are a few cases where we end up with extra register moves which I think we can accept in exchange for the increased ILP. Differential Revision: https://reviews.llvm.org/D77804	2022-07-28 14:10:44 +01:00
Amaury Séchet	474a8ee03d	[DAG] Use recursivelyDeleteUnusedNodes in PromoteLoad It simplifies the code overall and removes the need for manual bookkeeping. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130447	2022-07-28 12:54:52 +00:00
Amaury Séchet	7920805b27	[DAG] Use recursivelyDeleteUnusedNodes in ReplaceLoadWithPromotedLoad It simplifies the code overall and removes the need for manual bookkeeping. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130444	2022-07-28 12:32:37 +00:00
David Spickett	a0ccba5e19	[llvm] Fix some test failures with EXPENSIVE_CHECKS and libstdc++ DebugLocEntry assumes that it either contains 1 item that has no fragment or many items that all have fragments (see the assert in addValues). When EXPENSIVE_CHECKS is enabled, _GLIBCXX_DEBUG is defined. On a few machines I've checked, this causes std::sort to call the comparator even if there is only 1 item to sort. Perhaps to check that it is implemented properly ordering wise, I didn't find out exactly why. operator< for a DbgValueLoc will crash if this happens because the optional Fragment is empty. Compiler/linker/optimisation level seems to make this happen or not. So I've seen this happen on x86 Ubuntu but the buildbot for release EXPENSIVE_CHECKS did not have this issue. Add an explicit check whether we have 1 item. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D130156	2022-07-28 08:53:38 +00:00
Matt Arsenault	bfdca1535c	RegAllocGreedy: Fix nondeterminism in tryLastChanceRecoloring tryLastChanceRecoloring iterates over the set of LiveInterval pointers and used that to seed the recoloring stack, which was nondeterministic. Fixes a future test failing about 20% of the time. This just takes the order the interfering vreg was encountered. Not sure if we should try to order this more intelligently.	2022-07-27 19:02:06 -04:00
Paul Kirth	6e9bab71b6	Revert "[llvm][NFC] Refactor code to use ProfDataUtils" This reverts commit `300c9a7881`. We will reland once these issues are ironed out.	2022-07-27 21:38:11 +00:00
Paul Kirth	300c9a7881	[llvm][NFC] Refactor code to use ProfDataUtils In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860	2022-07-27 21:13:54 +00:00
Adrian Prantl	719ab04acf	[GlobalISel] Handle IntToPtr constants in dbg.value Currently, the IR to MIR translator can only handle two kinds of constant inputs to dbg.values intrinsics: constant integers and constant floats. In particular, it cannot handle pointers created from IntToPtr ConstantExpression objects. This patch addresses the limitation above by replacing the IntToPtr with its input integer prior to converting the dbg.value input. Patch by Felipe Piovezan! Differential Revision: https://reviews.llvm.org/D130642	2022-07-27 13:42:07 -07:00
Amara Emerson	65246d3eb4	Use hasNItemsOrLess() in MRI::hasAtMostUserInstrs().	2022-07-27 11:42:14 -07:00
Amara Emerson	19cdd1908b	[AArch64][GlobalISel] Add heuristics for localizing G_CONSTANT. This adds similar heuristics to G_GLOBAL_VALUE, querying the cost of materializing a specific constant in code size. Doing so prevents us from sinking constants which require multiple instructions to generate into use blocks. Code size savings on CTMark -Os:
Program                              size.__text
                                     before      after       diff
ClamAV/clamscan                      381940.00   382052.00    0.0%
lencod/lencod                        428408.00   428428.00    0.0%
SPASS/SPASS                          411868.00   411876.00    0.0%
kimwitu++/kc                         449944.00   449944.00    0.0%
Bullet/bullet                        463588.00   463556.00   -0.0%
sqlite3/sqlite3                      284696.00   284668.00   -0.0%
consumer-typeset/consumer-typeset    414492.00   414424.00   -0.0%
7zip/7zip-benchmark                  595244.00   594972.00   -0.0%
mafft/pairlocalalign                 247512.00   247368.00   -0.1%
tramp3d-v4/tramp3d-v4                372884.00   372044.00   -0.2%
Geomean difference                                           -0.0%
Differential Revision: https://reviews.llvm.org/D130554	2022-07-27 10:51:16 -07:00
Simon Pilgrim	c0b3f7a50f	[DAG] SimplifyDemandedBits - ensure we clear known One bits that AssertZext asserts are really known Zero Matches ComputeKnownBits behaviour Thanks to @uabelho for the fuzz regression report on D129765	2022-07-27 13:57:47 +01:00
Simon Pilgrim	529bd4f352	[DAG] SimplifyDemandedBits - don't early-out for multiple use values SimplifyDemandedBits currently early-outs for multi-use values beyond the root node (just returning the knownbits), which is missing a number of optimizations as there are plenty of cases where we can still simplify when initially demanding all elements/bits. @lenary has confirmed that the test cases in aea-erratum-fix.ll need refactoring and the current increase in codegen is not a major concern. Differential Revision: https://reviews.llvm.org/D129765	2022-07-27 10:54:06 +01:00
Dmitry Vassiliev	e3e63f30a5	[CodeGen] Fixed ambiguous symbol ExtAddrMode in case of NDEBUG and LLVM_ENABLE_DUMP This patch fixes the following error with MSVC 16.9.2 in case of NDEBUG and LLVM_ENABLE_DUMP: llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2872: 'ExtAddrMode': ambiguous symbol llvm/include/llvm/CodeGen/TargetInstrInfo.h(86): note: could be 'llvm::ExtAddrMode' llvm/lib/CodeGen/CodeGenPrepare.cpp(2447): note: or '`anonymous-namespace'::ExtAddrMode' llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2039: 'print': is not a member of 'llvm::ExtAddrMode' Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D130426	2022-07-27 00:21:57 +02:00
Fangrui Song	f106525de2	[MachineFunctionPass] Support -print-changed and -print-changed=quiet -print-changed for new pass manager is handy beside -print-after-all. Port it to MachineFunctionPass. Note: lib/Passes/StandardInstrumentations.cpp implements a number of misc features. If we want to use them for codegen, we may need to lift some functionality to LLVMIR. Reviewed By: aeubanks, jamieschmeiser Differential Revision: https://reviews.llvm.org/D130434	2022-07-26 10:16:49 -07:00
Simon Pilgrim	1ea7b9c6ee	[DAG] matchRotateSub - set demanded bits to the shift amount type size, not the shift result size. This should fix a report on D130251 of an assert due to a bitwidth mismatch in APInt::isSubSetOf	2022-07-26 17:58:51 +01:00
Stefan Gränitz	1e30820483	[WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222 This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs. * Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand. * PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers. * LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites. Reviewed By: theraven Differential Revision: https://reviews.llvm.org/D128190	2022-07-26 17:52:43 +02:00
Paul Walker	e5c892dd85	[SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR. This patch starts small, only detecting sequences of the form <a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNodes. Differential Revision: https://reviews.llvm.org/D125194	2022-07-26 15:28:37 +00:00
wangpc	1a7078d106	[DAGCombine] Mask doesn't have to be (EltSize - 1) exactly when combining rotation I think what we need is the least Log2(EltSize) significant bits are known to be ones. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130251	2022-07-26 21:14:45 +08:00
Sven van Haastregt	c8d91b07bb	Reassoc FMF should not optimize FMA(a, 0, b) to (b) Optimizing (a * 0 + b) to (b) requires assuming that a is finite and not NaN. DAGCombiner will do this optimization when the reassoc fast math flag is set, which is not correct. Change DAGCombiner to only consider UnsafeMath for this optimization. Differential Revision: https://reviews.llvm.org/D130232 Co-authored-by: Andrea Faulds <andrea.faulds@arm.com>	2022-07-26 09:39:12 +01:00
Kazu Hirata	3f3930a451	Remove redundant virtual specifiers (NFC) Identified with tidy-modernize-use-override.	2022-07-25 23:00:59 -07:00
jacquesguan	cb370cf413	[DAGCombiner] Teach scalarizeExtractedBinop to support scalable splat. This patch supports the scalable splat part for scalarizeExtractedBinop. Differential Revision: https://reviews.llvm.org/D129725	2022-07-26 09:31:45 +08:00
Amara Emerson	5ae0472694	[GlobalISel] Fix miscompile of G_UREM + G_UDIV due to not checking for equality of the first operands of each. Fixes issue #55287 Differential Revision: https://reviews.llvm.org/D130525	2022-07-25 16:03:05 -07:00
Alexander Shaposhnikov	1e636f2676	[IRBuilder] Add assert for AtomicRMW ordering Add assert for AtomicRMW: Ordering != AtomicOrdering::Unordered (https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/Verifier.cpp#L3944) and adjust expandAtomicStore accordingly. Test plan: 1/ ninja check-llvm check-clang check-lld 2/ Bootstrapped LLVM/Clang pass tests Differential revision: https://reviews.llvm.org/D130457	2022-07-25 22:51:25 +00:00
Matt Arsenault	62531518f9	RegAllocGreedy: Add a command line flag for reverseLocalAssignment Introduce a flag like for some of the other target heuristic controls to help with experimentation.	2022-07-25 15:47:15 -04:00
Vladislav Dzhidzhoev	fc93ba061a	[GlobalISel][DebugInfo] Remove debug info with zero line from constants inserted at entry block Emission of constants having DebugLoc with line 0 causes significant increase of debug_line section size for some source files. To illustrate, we can compare section sizes of several files from llvm test-suite, built with SelectionDAG vs GlobalISel, on Aarch64 (macOS), using -O0 optimization level:
| Source path | SDAG text sz | GISel text sz | SDAG debug_line sz | GISel debug_line sz |
| -------------------------------------------------------------- | ------------ | ------------- | ------------------ | -------------------- |
| `SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c` | 15320 | 660 | 14872 | 6340 |
| `SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c` | 33640 | 26300 | 2812 | 6693 |
| `SingleSource/Benchmarks/Misc/flops-4.c` | 1428 | 1196 | 594 | 1008 |
| `MultiSource/Benchmarks/MiBench/consumer-typeset/z31.c` | 2716 | 964 | 809 | 903 |
| `MultiSource/Benchmarks/Prolangs-C/gnugo/showinst.c` | 2534 | 2502 | 189 | 573 |
For instance, here is a fragment of the `flops-4.c.o` debug line section dump:
```
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000000000    174      0      1   0             0  is_stmt
0x0000000000000010      0      0      1   0             0
0x0000000000000018    185      4      1   0             0  is_stmt prologue_end
0x000000000000001c      0      0      1   0             0
0x0000000000000024    186      4      1   0             0  is_stmt
0x000000000000002c    189     10      1   0             0  is_stmt
0x0000000000000030      0      0      1   0             0
0x0000000000000038    207     11      1   0             0  is_stmt
0x0000000000000044    208     11      1   0             0  is_stmt
0x0000000000000048      0      0      1   0             0
0x0000000000000058    210     10      1   0             0  is_stmt
0x000000000000005c      0      0      1   0             0
0x0000000000000060    211     10      1   0             0  is_stmt
0x0000000000000064      0      0      1   0             0
0x000000000000006c    212     10      1   0             0  is_stmt
0x0000000000000070      0      0      1   0             0
0x000000000000007c    213     10      1   0             0  is_stmt
0x0000000000000080      0      0      1   0             0
0x0000000000000088    214     10      1   0             0  is_stmt
0x000000000000008c      0      0      1   0             0
0x0000000000000094    215     10      1   0             0  is_stmt
```
A lot of zero lines are produced by constants (global values) having DebugLoc with line 0. It seems that they're not significant for the debugging experience. With the commit applied, total size of debug_line sections of llvm shared libraries has reduced by 2.5%. Change of debug line section size of files listed above:
| Source path | GISel debug_line sz | Patch debug_line sz |
| -------------------------------------------------------------- | ------------------- | -------------------- |
| `SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c` | 6340 | 1465 |
| `SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c` | 6693 | 3782 |
| `SingleSource/Benchmarks/Misc/flops-4.c` | 1008 | 609 |
| `MultiSource/Benchmarks/MiBench/consumer-typeset/z31.c` | 903 | 841 |
| `MultiSource/Benchmarks/Prolangs-C/gnugo/showinst.c` | 573 | 190 |
Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D127488	2022-07-25 17:19:01 +00:00
Nikita Popov	fb7caa3c7b	[AsmPrinter] Reject ptrtoint to larger size in lowerConstant() When using a ptrtoint to a size larger than the pointer width in a global initializer, we currently create a ptr & low_bit_mask style MCExpr, which will later result in a relocation error during object file emission. This patch rejects the constant expression already during lowerConstant(), which results in a much clearer error message that references the constant expression at fault. This fixes https://github.com/llvm/llvm-project/issues/56400, for certain definitions of "fix". Differential Revision: https://reviews.llvm.org/D130366	2022-07-25 10:18:27 +02:00
Kazu Hirata	b5188591a0	[llvm] Remove redundant virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-24 21:50:35 -07:00
Kazu Hirata	acf648b5e9	Use llvm::less_first and llvm::less_second (NFC)	2022-07-24 16:21:29 -07:00
Kazu Hirata	ea29810c9d	[CodeGen] Remove a redundant void (NFC) Identified with modernize-redundant-void-arg.	2022-07-24 12:27:14 -07:00
Matt Arsenault	40abb28f61	RegAllocGreedy: Fix subranges when rematerializing dead subreg defs This would create a new interval missing the subrange and hit this verifier error: * Bad machine code: Live interval for subreg operand has no subranges * - function: test_remat_subreg_def - basic block: %bb.0 (0xa568758) [0B;128B) - instruction: 32B dead undef %4.sub0:vreg_64 = V_MOV_B32_e32 2, implicit $exec	2022-07-24 11:51:59 -04:00
Simon Pilgrim	562ee7cc5f	[DAG] visitSMUL_LOHI/visitUMUL_LOHI - ensure we canonicalize constants to the RHS	2022-07-24 16:09:56 +01:00
Simon Pilgrim	428c0f2adc	[DAG] getNode - assert that SMUL_LOHI/UMUL_LOHI nodes have the correct ops + types	2022-07-24 15:30:57 +01:00
Simon Pilgrim	0708771cce	[DAG] MaskedVectorIsZero - don't bother with (-1).isSubsetOf mask check. NFC. Just use KnownBits::isZero() to ensure all the bits are known zero.	2022-07-24 13:12:21 +01:00
Simon Pilgrim	e82d49bfed	[DAG] SimplifyMultipleUseDemandedBits - early-out for any scalable vector types Noticed while working to remove SelectionDAG::GetDemandedBits - we were relying on the callers to have already bailed for scalable vectors	2022-07-24 12:59:43 +01:00
Simon Pilgrim	a3e38b4a20	[DAG] SimplifyDemandedVectorElts - if every and/mul element-pair has a zero/undef then just constant fold to zero	2022-07-24 12:00:31 +01:00
Kazu Hirata	7bfa06f6c0	[CodeGen] Use range-based for loops (NFC)	2022-07-23 16:10:46 -07:00
Simon Pilgrim	ac8be21365	[DAG] isSplatValue - don't attempt to merge any BITCAST sub elements if they contain UNDEFs We still haven't found a solution that correctly handles 'don't care' sub elements properly - given how close it is to the next release branch, I'm making this fail safe change and we can revisit this later if we can't find alternatives. NOTE: This isn't a reversion of D128570 - it's the removal of undef handling across bitcasts entirely Fixes #56520	2022-07-23 18:38:48 +01:00
Dmitri Gribenko	aba43035bd	Use llvm::sort instead of std::sort where possible llvm::sort is beneficial even when we use the iterator-based overload, since it can optionally shuffle the elements (to detect non-determinism). However llvm::sort is not usable everywhere, for example, in compiler-rt. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D130406	2022-07-23 15:19:05 +02:00
Simon Pilgrim	5f89d2bae9	[DAG] Move OR(AND(X,C1),AND(OR(X,Y),C2)) -> OR(AND(X,OR(C1,C2)),AND(Y,C2)) fold to SimplifyDemandedBits This will fix the SystemZ v3i31 memcpy regression in D77804 (with the help of D129765 as well....). It should also allow us to /bend/ the oneuse limitation for cases where we can use demanded bits to safely peek through multiple uses of the AND ops.	2022-07-23 13:17:24 +01:00
Simon Pilgrim	6aff1b7b3c	[DAG] SimplifyDemandedBits - pull out repeated getValueType() calls. NFC.	2022-07-23 12:01:54 +01:00
Simon Pilgrim	2421a5af72	[DAG] ExpandIntRes_ADDSUB - create UADDO/USUBO instead of ADDCARRY/SUBCARRY if overflow is known to be zero As noticed on D127115, when splitting ADD/SUB nodes we often end up with cases where overflow from the lower bits is impossible - in such cases we're better off breaking the carry chain dependency as soon as possible. This path is being exercised by llvm/test/CodeGen/ARM/dsp-mlal.ll, although I haven't been able to get any codegen diff without a topological worklist.	2022-07-23 11:13:44 +01:00
Simon Pilgrim	8937252465	[DAG] computeKnownBits - add basic shift-by-parts handling Concat KnownBits from ISD::SHL_PARTS / ISD::SRA_PARTS / ISD::SRL_PARTS lo/hi operands and perform the KnownBits calculation by the shift amount on the extended type, before splitting the KnownBits based on the requested lo/hi result.	2022-07-23 09:46:30 +01:00
ARCHIT SAXENA	3bb1ce2319	Add a nop instruction if a section starts with landing pad for function splitter This change adds a nop instruction if a section starts with a landing pad. This change is like [D73739](https://reviews.llvm.org/D73739) which avoids zero offset landing pad in basic block sections. Detailed description: The current machine functions splitter can create sections which start with a landing pad themselves. This places the landing pad at offset zero from LPStart.
```
.section .text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
    .cfi_startproc
    .cfi_personality 3, __gxx_personality_v0
    .cfi_lsda 3, .Lexception5
    .cfi_def_cfa %rsp, 16
.Ltmp11:    <--- This is a Landing pad and also LP Start as it is start of this section
    movq %rax, %rdi    <--- first instruction is at offset 0 from LPStart
    callq _Unwind_Resume@PLT
```
This will cause landing pad entries to become zero (.Ltmp11-foo10.cold)
```
.Lcst_begin4:
    .uleb128 .Ltmp9-.Lfunc_begin2     # >> Call Site 1 <<
    .uleb128 .Ltmp10-.Ltmp9           #   Call between .Ltmp9 and .Ltmp10
    .uleb128 .Ltmp11-foo10.cold    <---This is zero    #     jumps to .Ltmp11
    .byte 3                           #   On action: 2
    .uleb128 .Ltmp10-.Lfunc_begin2    # >> Call Site 2 <<
    .uleb128 .Lfunc_end9-.Ltmp10      #   Call between .Ltmp10 and .Lfunc_end9
    .byte 0                           #     has no landing pad
    .byte 0                           #   On action: cleanup
    .p2align 2
```
The C++ ABI somehow assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at the start of such sections so that such a case can be avoided. Output:
```
.section .text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
    .cfi_startproc
    .cfi_personality 3, __gxx_personality_v0
    .cfi_lsda 3, .Lexception5
    .cfi_def_cfa %rsp, 16
    nop    <--- new instruction that is added
.Ltmp11:
    movq %rax, %rdi
    callq _Unwind_Resume@PLT
```
Reviewed By: modimo, snehasish, rahmanl Differential Revision: https://reviews.llvm.org/D130133	2022-07-22 15:20:10 -07:00
Craig Topper	be208b40c1	[DAGCombiner] Simplify code around call to reduceLoadWidth in visitAND. NFC We were looking for loads or any_extend+load. reduceLoadWidth hasn't known how to look through such an any_extend to find the load since D40667 almost 5 years ago. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130333	2022-07-22 08:36:56 -07:00
Nikita Popov	c2be703c6c	[AsmPrinter] Move lowerConstant() error code out of switch (NFC) Move this out of the switch, so that different branches can indicate an error by breaking out of the switch. This becomes important if there are more than the two current error cases.	2022-07-22 16:08:28 +02:00
Cullen Rhodes	bf268a05cd	[AArch64] Emit vector FP cmp when LE is used with fast-math Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D130093	2022-07-22 07:53:55 +00:00
jacquesguan	e60eb7053d	recommit "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat." With fix for AArch64 and Hexagon test cases.	2022-07-21 17:34:34 +08:00
David Green	23d6186be0	[SelectionDAG] Fix fptoi.sat scalable vector lowering Vector fptosi_sat and fptoui_sat were being expanded by unrolling the vector operation. This doesn't work for scalable vector, so this patch adds a call to TLI.expandFP_TO_INT_SAT if the vector is scalable. Scalable tests are added for AArch64 and RISCV. Some of the AArch64 fptoi_sat operations should be legal, but that will be handled in another patch. Differential Revision: https://reviews.llvm.org/D130028	2022-07-21 08:00:22 +01:00
esmeyi	339392ecf2	[AIX] follow-up of D124654. Emitting the remaining aliases instead of reporting an error to avoid SPEC2017 PEAK failures. And mark this as a TODO.	2022-07-21 01:10:09 -04:00
Simon Pilgrim	029e83b401	[DAG] getNode - don't bother creating ADDO(X,0) or SUBO(X,0) nodes. Similar to what we already do in getNode for basic ADD/SUB nodes, return the X operand directly, but here we know that there will be no/zero overflow as well. As noted on D127115 - this path is being exercised by llvm/test/CodeGen/ARM/dsp-mlal.ll, although I haven't been able to get any codegen without a topological worklist.	2022-07-20 12:04:33 +01:00
Simon Pilgrim	766cd95481	[DAG] getNode - assert that ADDO/SUBO nodes have the correct ops + types	2022-07-20 11:23:58 +01:00
Simon Pilgrim	9fc347aa4e	[DAG] PromoteIntRes_BUILD_VECTOR - extend constant boolean vectors according to target BooleanContents PromoteIntRes_BUILD_VECTOR currently always ANY_EXTENDs build vector operands, but if this is a constant boolean vector we're losing the useful ability to keep the vector matching the BooleanContents mode used by the target. This patch extends constant boolean vectors according to target BooleanContents, allowing a number of additional all-bits folds (notable XOR -> NOT conversions) to occur. Differential Revision: https://reviews.llvm.org/D129641	2022-07-20 10:49:31 +01:00
Lorenzo Albano	07d69d9fc9	[VP] Legalize the stride operand for EXPERIMENTAL_VP_STRIDED SDNodes Add promotion and expansion of integer operands for experimental_vp_strided SelectionDAG nodes; the expansion is actually just a truncation of the stride operand. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D123112	2022-07-20 10:22:43 +02:00
Kazu Hirata	76e18cc4f6	[llvm] Use llvm::any_of and llvm::none_of (NFC)	2022-07-20 00:36:19 -07:00
Kazu Hirata	0387da6f4f	Use value instead of getValue (NFC)	2022-07-19 21:18:26 -07:00
Kazu Hirata	41ae78ea3a	Use has_value instead of hasValue (NFC)	2022-07-19 20:15:44 -07:00
Kazu Hirata	bbbb4393ee	[CodeGen] Use value_or instead of getValueOr (NFC)	2022-07-19 19:50:43 -07:00
David Truby	4c82f56d8f	[llvm][SVE] Remove redundant and when comparing against extending load When determining if an `and` should be merged into an extending load the constant argument to the `and` is currently not checked if the argument requires truncation. This prevents the combine happening when the vector width is half the normal available vector width for SVE VLA vectors. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D129281	2022-07-19 17:08:32 +01:00
Simon Pilgrim	71c502cbca	[DAG] Call SimplifyDemandedBits from ISD::MUL nodes Noticed while triaging D129765.	2022-07-19 14:11:04 +01:00
Benjamin Kramer	8aff88fd3a	[LegalizeDAG] Propagate alignment in ExpandExtractFromVectorThroughStack Unlike the name suggests this can reuse any store as a base for a memory-based vector extract. If that store is underaligned the loads created to extract will have an invalid alignment. Since most CPUs are forgiving wrt alignment this is almost never an issue, on x86 this is only reproducible by extracting a 128 bit vector out of a wider vector. I tried making a test case in the context of https://reviews.llvm.org/D127982 but it's really really fragile, as the output pretty much looks like a missed optimization.	2022-07-19 13:13:55 +02:00
Simon Pilgrim	0f6b0461b0	[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits. This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115 Alive2: https://alive2.llvm.org/ce/z/fl7T7K Differential Revision: https://reviews.llvm.org/D129933	2022-07-19 10:59:07 +01:00
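The relaxed fold in the entry above can be sanity-checked by brute force at a small bit width. The Python sketch below assumes a logical right shift and models "matches under the demanded bits" as: `XorC` agrees with the shifted all-ones mask on every demanded bit. The function name and the 4-bit default are illustrative choices. The Alive2 link in the commit message remains the authoritative proof.

```python
def fold_agrees_on_demanded_bits(bits: int = 4) -> bool:
    """Check xor(X >> ShiftC, XorC) == (not X) >> ShiftC on the demanded bits."""
    mask = (1 << bits) - 1
    for shift in range(bits):
        shifted_ones = mask >> shift  # all-ones value after a logical shift
        for demanded in range(mask + 1):
            for xor_c in range(mask + 1):
                # Precondition (assumed): XorC equals the shifted all-ones
                # mask on every demanded bit; other bits are unconstrained.
                if (xor_c ^ shifted_ones) & demanded:
                    continue
                for x in range(mask + 1):
                    before = ((x >> shift) ^ xor_c) & demanded
                    after = ((~x & mask) >> shift) & demanded
                    if before != after:
                        return False
    return True
```

With `demanded` equal to the full mask, the precondition collapses to the original, stricter requirement that `XorC` is exactly the shifted all-ones mask.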
Max Kazantsev	69b284aaf6	Revert "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat." This reverts commit `58dfaaaace`. Massive AARCH test failures in buildbot.	2022-07-19 13:41:52 +07:00
jacquesguan	58dfaaaace	[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat. This revision adds support for scalarizing a binary operation of two scalable splat vectors. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122791	2022-07-19 11:20:51 +08:00
Matt Arsenault	8d0383eb69	CodeGen: Remove AliasAnalysis from regalloc This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable. Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy. Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.	2022-07-18 17:23:41 -04:00
Jay Foad	dbed4326dd	[LiveIntervals] Find better anchoring end points when repairing ranges r175673 changed repairIntervalsInRange to find anchoring end points for ranges automatically, but the calculation of Begin included the first instruction found that already had an index. This patch changes it to exclude that instruction: 1. For symmetry, so that the half open range [Begin,End) only includes instructions that do not already have indexes. 2. As a possible performance improvement, since repairOldRegInRange will scan fewer instructions. 3. Because repairOldRegInRange hits assertion failures in some cases when it sees a def that already has a live interval. (3) fixes about ten tests in the CodeGen lit test suite when -early-live-intervals is forced on. Differential Revision: https://reviews.llvm.org/D110182	2022-07-18 19:34:43 +01:00
Itay Bookstein	2570f226d1	[SDAG] Remove single-result restriction on commutative CSE The DAG Combiner unnecessarily restricts commutative CSE to nodes with a single result value. This commit removes that restriction. Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D129666	2022-07-18 19:19:13 +03:00
Lorenzo Albano	c00a44fa68	[VP] IR expansion pass for VP gather and scatter Add vp_gather and vp_scatter expansion to unpredicated intrinsics. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D120664	2022-07-18 17:00:38 +02:00
Nikita Popov	56b4b6e81b	[SDAG] Fix release build This variable was only declared in debug builds, but is needed in release builds as well.	2022-07-18 14:10:31 +02:00
Max Kazantsev	d693fd29f1	[Verifier] Make Verifier recognize undef tokens as correct IR Undef tokens may appear in unreached code as result of RAUW of some optimization, and it should not be considered as bad IR. Patch by Dmitry Bakunevich! Differential Revision: https://reviews.llvm.org/D128904 Reviewed By: mkazantsev	2022-07-18 16:26:06 +07:00
Lorenzo Albano	f390781cec	[VP] Implementing expansion pass for VP load and store. Added function to the ExpandVectorPredication pass to handle VP loads and stores. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D109584	2022-07-18 08:47:54 +02:00
Craig Topper	7fa1c32634	[CodeGen] Remove unnecessary APInt copy. NFC	2022-07-17 23:41:53 -07:00
Craig Topper	a55ff6aadd	[Support][CodeGen] Fix spelling Divison->Division. NFC	2022-07-17 23:16:29 -07:00
Craig Topper	795602af0c	[CodeGen] Don't compare bool with integer 0. NFC The IsAdd field is a bool.	2022-07-17 23:16:14 -07:00
Kazu Hirata	3112987d5c	Remove unused forward declarations (NFC)	2022-07-17 15:37:48 -07:00
Simon Pilgrim	53b90dd372	[DAG] Fold (or (and X, C1), (and (or X, Y), C2)) -> (or (and X, C1|C2), (and Y, C2)) Pulled out of D77804 Alive2: https://alive2.llvm.org/ce/z/g61VRe	2022-07-17 18:51:41 +01:00
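The bitwise identity behind the fold in the entry above follows from distributing `and` over `or`: `(X & C1) | ((X | Y) & C2)` expands to `(X & C1) | (X & C2) | (Y & C2)`, which regroups as `(X & (C1 | C2)) | (Y & C2)`. A short Python sketch can confirm this exhaustively at a small width. The function name and the 4-bit default are illustrative assumptions, and the commit's own Alive2 link is the formal proof.

```python
from itertools import product


def or_and_fold_holds(bits: int = 4) -> bool:
    """Exhaustively compare both sides of the fold for all bits-wide inputs."""
    values = range(1 << bits)
    return all(
        ((x & c1) | ((x | y) & c2)) == ((x & (c1 | c2)) | (y & c2))
        for x, y, c1, c2 in product(values, repeat=4)
    )
```

The identity is purely bitwise, so each bit position is independent and a check at a small width exercises every per-bit combination of inputs.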
Simon Pilgrim	26ce33706f	[DAG] computeKnownBits - move UDIV handling to same place as UREM/SREM. NFC.	2022-07-17 11:59:42 +01:00
Simon Pilgrim	5ec47c6dc5	[DAG] Add MERGE_VALUE computeKnownBits/ComputeNumSignBits handling. Just forward the value tracking to the operand specified by the ResNo	2022-07-17 11:58:08 +01:00
Kazu Hirata	9e6d1f4b5d	[CodeGen] Qualify auto variables in for loops (NFC)	2022-07-17 01:33:28 -07:00
Kazu Hirata	c0fe37de04	[CodeGen] Remove redundant declaration createGreedyRegisterAllocator (NFC) The function is declared in llvm/include/llvm/CodeGen/Passes.h. Identified with readability-redundant-declaration.	2022-07-16 15:43:34 -07:00

33072 Commits